arthur_bench.scoring.summary_quality.SummaryQuality
- class arthur_bench.scoring.summary_quality.SummaryQuality(llm: BaseChatModel | None = None, context_window: int = 4096, tokenizer: Encoding | None = None)
Comprehensive measure of summarization quality compared to a reference summary.
- __init__(llm: BaseChatModel | None = None, context_window: int = 4096, tokenizer: Encoding | None = None)
Methods
__init__([llm, context_window, tokenizer])
arun(candidate_outputs[, reference_outputs, ...]): Async version of the run method.
arun_batch(candidate_batch[, ...]): Summary quality requires input_text_batch.
categories(): All possible values returned by the scorer if the output type is categorical.
from_dict(config): Load a scorer from a JSON configuration file.
is_categorical(): Whether the scorer is continuous or categorical.
name(): Get the name of this scorer.
requires_reference(): True if the scorer requires reference output to compute a score, False otherwise.
run(candidate_outputs[, reference_outputs, ...]): Score a set of test cases.
run_batch(candidate_batch[, ...]): Summary quality requires input_text_batch.
to_dict([warn]): Provides a JSON-serializable representation of the scorer.
to_metadata()
type(): Supplies whether a scorer is built-in or custom.
validate_batch(candidate_batch[, ...])
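The batch methods above note that summary quality requires input_text_batch. The sketch below illustrates the kind of check a caller might run before handing a batch to the scorer. It is not the library's implementation: check_summary_batch is a hypothetical stand-in, the parameter name reference_batch is an assumption (only candidate_batch and input_text_batch appear in the listing), and the equal-length rule is likewise assumed.

```python
from typing import List, Optional


def check_summary_batch(
    candidate_batch: List[str],
    reference_batch: Optional[List[str]] = None,
    input_text_batch: Optional[List[str]] = None,
) -> None:
    """Hypothetical pre-flight check; not part of arthur_bench."""
    # Stated in the method listing: summary quality requires input_text_batch.
    if input_text_batch is None:
        raise ValueError("Summary quality requires input_text_batch.")
    # Assumption: the scorer compares against a reference summary,
    # so a reference batch must be supplied as well.
    if reference_batch is None:
        raise ValueError("Summary quality requires a reference batch.")
    # Assumption: the three batches are aligned one-to-one by position.
    if not (len(candidate_batch) == len(reference_batch) == len(input_text_batch)):
        raise ValueError("All batches must have the same length.")


# Passes silently when every batch is present and aligned.
check_summary_batch(
    candidate_batch=["Candidate summary."],
    reference_batch=["Reference summary."],
    input_text_batch=["Full source text that was summarized."],
)
```

Running a check like this up front surfaces a missing source-text batch before any LLM call is made.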