AI LLM March 25, 2026

LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale

Authors

Muhammed Saeed, Simon Razniewski

Abstract

Benchmarks such as MMLU suggest flagship language models approach factuality saturation, with scores above 90%. We show this picture is incomplete. LLMpedia generates encyclopedic articles entirely from parametric memory, producing ~1M articles across three model families without retrieval. For gpt-5-mini, the verifiable true rate on Wikipedia-covered subjects is only 74.7%, more than 15 percentage points below the benchmark-based picture, consistent with the availability bias of fixed-question evaluation. Beyond Wikipedia, frontier subjects verifiable only through curated web evidence fall further to 63.2% true rate. Wikipedia covers just 61% of surfaced subjects, and three model families overlap by only 7.3% in subject choice. In a capture-trap benchmark inspired by prior analysis of Grokipedia, LLMpedia achieves substantially higher factuality at roughly half the textual similarity to Wikipedia. Unlike Grokipedia, every prompt, artifact, and evaluation verdict is publicly released, making LLMpedia the first fully open parametric encyclopedia, bridging factuality evaluation and knowledge materialization. All data, code, and a browsable interface are at https://llmpedia.net.
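
The percentage-point gaps quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the benchmark value of 90.0 is an assumption, since the abstract says only "above 90%", so the computed gaps are lower bounds. The names `BENCHMARK_FLOOR` and `gap_pp` are illustrative and do not come from the paper or its released code.

```python
# Figures quoted in the abstract, in percent.
# ASSUMPTION: the abstract says benchmark scores are "above 90%",
# so 90.0 is used here as a lower bound, not an exact value.
BENCHMARK_FLOOR = 90.0
WIKIPEDIA_TRUE_RATE = 74.7   # gpt-5-mini, Wikipedia-covered subjects
FRONTIER_TRUE_RATE = 63.2    # frontier subjects, curated web evidence


def gap_pp(reference: float, observed: float) -> float:
    """Return the difference in percentage points, rounded to one decimal."""
    return round(reference - observed, 1)


wiki_gap = gap_pp(BENCHMARK_FLOOR, WIKIPEDIA_TRUE_RATE)
frontier_gap = gap_pp(BENCHMARK_FLOOR, FRONTIER_TRUE_RATE)
wiki_to_frontier = gap_pp(WIKIPEDIA_TRUE_RATE, FRONTIER_TRUE_RATE)

print(f"Benchmark floor vs. Wikipedia-covered: at least {wiki_gap} pp")
print(f"Benchmark floor vs. frontier subjects: at least {frontier_gap} pp")
print(f"Wikipedia-covered vs. frontier subjects: {wiki_to_frontier} pp")
```

With a 90% floor, the first gap exceeds 15 points, which matches the abstract's "more than 15 percentage points" claim.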

Metadata

arXiv ID: 2603.24080

Provider: ARXIV

Primary Category: cs.CL

Published: 2026-03-25

Fetched: 2026-03-26 06:02

Related papers

Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini

Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongy... • 2026-03-25

Comparing Developer and LLM Biases in Code Evaluation

Aditya Mittal, Ryan Shar, Zichu Wu, Shyam Agarwal, Tongshuang Wu, Chris Donah... • 2026-03-25

The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence

Biplab Pal, Santanu Bhattacharya • 2026-03-25

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, ... • 2026-03-25

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie... • 2026-03-25
