25 points | by mike_hearn 8 hours ago
2 comments
Funny how there were a lot of concerns back then about reward hacking, something I never hear anyone talk about with current AI.
I think it just got folded under the umbrella concept of model alignment. And it moved from theoretical discussions to practical daily struggles with LLMs deleting failing unit tests.