#8: Who Validates the Validators? - A mechanism for continuously updating evaluations -
Update: 2024-11-04
Description
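We talked about the paper "Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences", which describes EvalGen, a mechanism for continuously updating the evaluation criteria and automated evaluations of LLM applications.
A transcript of this episode is available on the podcast transcription service "LISTEN".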
Shownotes:
https://arxiv.org/abs/2404.12272
https://www.sh-reya.com/blog/ai-engineering-flywheel/
https://github.com/wandb/evalForge/tree/main
https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/
Speakers:
seya(@sekikazu01)
kagaya(@ry0_kaga)