Salesforce Just KILLED ChatGPT
Description
The Salesforce AI Research team has developed a family of large language models (LLMs) called SFR-Judge. These "judge models" automatically evaluate the outputs of other LLMs and give feedback on the quality of their responses. SFR-Judge models are trained on a diverse range of evaluation tasks, including pairwise comparisons (which of two responses is better), single ratings (a score for one response), and binary classification (a yes/no judgment). The researchers report that SFR-Judge consistently outperforms other open-source and proprietary judge models, particularly in reward modeling and instruction following. Beyond evaluation, SFR-Judge can also help improve other LLMs through reinforcement learning from human feedback (RLHF): the team found that using explanations generated by SFR-Judge during RLHF training led to significant improvements in downstream model outputs.
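
To make the three evaluation tasks named above more concrete, here is a minimal Python sketch of how a judge model might be prompted and how its verdict might be read back. Everything in it is an assumption for illustration: the prompt templates, the "Verdict:" output convention, the 1-5 rating scale, and the function names build_prompt and parse_verdict are hypothetical and are not taken from SFR-Judge itself.

```python
import re

# Hypothetical prompt templates, one per evaluation task.
# These are illustrative only; the real SFR-Judge prompts are not shown here.
TEMPLATES = {
    "pairwise": (
        "Instruction:\n{instruction}\n\nResponse A:\n{a}\n\nResponse B:\n{b}\n\n"
        "Explain your reasoning, then end with 'Verdict: A' or 'Verdict: B'."
    ),
    "single_rating": (
        "Instruction:\n{instruction}\n\nResponse:\n{a}\n\n"
        "Explain your reasoning, then end with 'Verdict: <score from 1 to 5>'."
    ),
    "binary": (
        "Instruction:\n{instruction}\n\nResponse:\n{a}\n\n"
        "Explain your reasoning, then end with 'Verdict: yes' or 'Verdict: no'."
    ),
}


def build_prompt(task, instruction, a, b=None):
    """Fill in the template for one of the three evaluation tasks."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    if task == "pairwise" and b is None:
        raise ValueError("pairwise comparison needs two responses")
    # str.format ignores the unused 'b' argument for single-response tasks.
    return TEMPLATES[task].format(instruction=instruction, a=a, b=b)


def parse_verdict(task, judge_output):
    """Extract the final verdict from the judge's free-text output.

    Returns "A" or "B" for pairwise, an int from 1 to 5 for single_rating,
    and a bool for binary.
    """
    matches = re.findall(r"Verdict:\s*(\S+)", judge_output)
    if not matches:
        raise ValueError("no verdict found in judge output")
    # The explanation comes first, so the last match is taken as the verdict.
    token = matches[-1].strip(".").lower()
    if task == "pairwise" and token in ("a", "b"):
        return token.upper()
    if task == "single_rating" and token.isdigit() and 1 <= int(token) <= 5:
        return int(token)
    if task == "binary" and token in ("yes", "no"):
        return token == "yes"
    raise ValueError(f"unexpected verdict {token!r} for task {task!r}")
```

In this sketch the judge's explanation and verdict arrive as a single piece of text, so the explanation stays available for later use (for example as training feedback) while the parsed verdict supplies the comparison, score, or yes/no label.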