Spectrum tuning: Post-training for distributional coverage and in-context steerability
Description
This research paper studies conditional distributional modeling for large language models (LLMs). It introduces the SPECTRUM SUITE dataset and a post-training method called SPECTRUM TUNING. The paper sets out three objectives: in-context steerability (changing output probabilities in response to information supplied at inference time), valid output coverage (generating diverse responses that are all correct), and distributional alignment (matching a target probability distribution over outputs). The authors show empirically that current instruction-tuning methods often degrade performance on these objectives, particularly in-context steerability. SPECTRUM TUNING, which incorporates in-context examples and focuses on distributional data, improves coverage of the valid output space and the match to target distributions across various language models. The paper also includes detailed sections on methodology, experimental results comparing pretrained (PT), instruction-tuned (IT), and SPECTRUM TUNED (ST) models, and examples of tasks from the SPECTRUM SUITE.
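To make two of the objectives concrete, here is a minimal Python sketch of how valid output coverage and distributional alignment could be scored for a toy task. The function names, the coin-flip task, and the choice of total variation distance are illustrative assumptions; this description does not state which metrics the paper actually uses.

```python
from collections import Counter


def empirical_distribution(samples):
    """Turn a list of sampled outputs into relative frequencies."""
    counts = Counter(samples)
    total = len(samples)
    return {output: count / total for output, count in counts.items()}


def valid_coverage(samples, valid_outputs):
    """Fraction of the valid output set that appears at least once in samples.

    Hypothetical metric chosen for illustration only.
    """
    valid = set(valid_outputs)
    return len(set(samples) & valid) / len(valid)


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts).

    0.0 means the distributions match; 1.0 means they are disjoint.
    Assumed here as a stand-in for a distributional-alignment score.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Toy task: "flip a fair coin". The target distribution is uniform.
target = {"heads": 0.5, "tails": 0.5}

# Pretend these are ten samples from a model that strongly prefers "heads".
samples = ["heads"] * 9 + ["tails"] * 1

empirical = empirical_distribution(samples)
coverage = valid_coverage(samples, target.keys())
tv_distance = total_variation(empirical, target)

print(f"coverage of valid outputs: {coverage:.2f}")
print(f"total variation from target: {tv_distance:.2f}")
```

In this toy case the model produces both valid outputs, so coverage is complete, yet its output frequencies are far from the uniform target. That gap is the distinction between valid output coverage and distributional alignment.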







