Measuring AI Ability to Complete Long Tasks – METR

(metr.org)

2 points | by diginova 14 hours ago

No comments yet.