Personal Assistant Web

Opinion editorial Medium

@emollick

Importance score: 5 • Posted: February 18, 2026 at 18:33

This matches the general feeling on the big Chinese open source models. They have great benchmarks and near-frontier status on some coding, but there is a larger gap with the big closed models than the benchmarks would indicate when it comes to real work and general “smarts.”

Flo Crivello

@Altimor

2026-02-18T16:58:09.000000Z

But every time we've evaluated them, we've found the same thing: their real-life performance, in agentic behavior and outside of coding use cases, falls extremely short of what they show on evals.

Grok reasoning

Discussion on Chinese open models vs closed, benchmarks vs real performance.

Likes

218

Views

25,467
