Training AIs to help us align AIs
Update: 2023-03-27
Description
If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/
Comments
In Channel



