Evaluation data of rule_qa for GPT4, GPT3.5, and Claude #36

alirezamshi · 2024-05-15T08:16:58Z

Hi,

Thanks for the cool resource. According to the publication, "rule_qa was also manually evaluated by a law-trained individual". Do you plan to release the annotations for this evaluation? Thanks.

neelguha · 2024-05-15T14:27:55Z

The answers are available here: https://huggingface.co/datasets/nguha/legalbench/viewer/rule_qa/test.

Is this what you're looking for?

alirezamshi · 2024-05-15T18:43:35Z

Thanks for your response. I meant the evaluation of rule-based application: "Rule-application tasks were evaluated manually by a law-trained individual, who analyzed LLM responses for both correctness and analysis"

neelguha · 2024-05-15T18:49:35Z

Ah, sorry, I misunderstood.

rule_qa is a "rule-application" task. Because it is an open-generation task, rule_qa was manually evaluated by a legally trained individual. That individual examined a model's generation and compared it to the answers in the column of the above-linked dataset.
The answers for the rule-application tasks that were used for evaluation can be found on this page: https://hazyresearch.stanford.edu/legalbench/getting-started/

alirezamshi · 2024-05-15T19:18:31Z

Thanks for the answer. Do you plan to release that human judgement for the evaluation?

alirezamshi · 2024-05-17T06:23:07Z

Following up on this...
