Personal Assistant
Home Settings
Daily Digest Newsletters Papers Ruby Posts AI Posts Ruby: Blogs and News AI: Blogs and News Gem Updates Gem Discoveries Digest Tweets
Twitter Lists Bluesky Lists RSS Lists Tracked Gems
Sign in Explore
@emollick

Ethan Mollick

@emollick

What a great illustration of the central problem of AI benchmarking for real work All of the effort is going into benchmarking for coding, but that is a small part of the actual jobs people do, which leaves the true trajectory of AI progress less clear. https://arxiv.org/pdf/2603.01203

Post media Post media

arxiv.org

4:30 PM · Mar 3, 2026