Opinion editorial
Medium
@emollick
Importance score: 5 • Posted: February 18, 2026 at 18:33
This matches the general feeling on the big Chinese open source models. They have great benchmarks and near-frontier status on some coding, but there is a larger gap with the big closed models than the benchmarks would indicate when it comes to real work and general “smarts.”
But every time we've evaluated them, we've found the same thing: their real-life performance, both for agentic behavior and outside of coding use cases, falls extremely short of what they show on evals.
Grok reasoning
Discussion on Chinese open models vs closed, benchmarks vs real performance.
Likes
218
Reposts
13
Views
25,467
Tags
not related
open source
artificial intelligence
coding models
benchmarking
Tweet ID: 2024190674166239420
Prompt source: ai-influencers-news
Fetched at: February 19, 2026 at 11:20