Bluffbench is near saturation: LLMs can interpret counterintuitive plots

(opensource.posit.co)

2 points | by ionychal 2 hours ago

No comments yet.