Verdict 0.2.1

# Paper Implementations

| Title | Colab Link |
| --- | --- |
| G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment (EMNLP 2023) | Open In Colab |
| Large Language Models are not Fair Evaluators (ACL 2024) | Open In Colab |
| LLM Evaluators Recognize and Favor Their Own Generations (NeurIPS 2024) | Open In Colab |
| Debating with More Persuasive LLMs Leads to More Truthful Answers (ICML 2024) | Open In Colab |
| On scalable oversight with weak LLMs judging strong LLMs (NeurIPS 2024) | Open In Colab |
| LMUnit: Fine-grained Evaluation with Natural Language Unit Tests (2024) | Open In Colab |

© Copyright 2025 — All rights reserved