AI LLM March 12, 2026

Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese

Authors

Masataka Kawai, Singo Sakashita, Shumpei Ishikawa, Shogo Watanabe, Anna Matsuoka, Mikio Sakurai, Yasuto Fujimoto, Yoshiyuki Takahara, Atsushi Ohara, Hirohiko Miyake, Genichiro Ishii

Abstract

The performance of large language models (LLMs) for supporting pathology report writing in Japanese remains unexplored. We evaluated seven open-source LLMs from three perspectives: (A) generation and information extraction of pathology diagnosis text following predefined formats, (B) correction of typographical errors in Japanese pathology reports, and (C) subjective evaluation of model-generated explanatory text by pathologists and clinicians. Thinking models and medical-specialized models showed advantages in structured reporting tasks that required reasoning and in typo correction. In contrast, preferences for explanatory outputs varied substantially across raters. Although the utility of LLMs differed by task, our findings suggest that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.

Metadata

arXiv ID: 2603.11597
Provider: ARXIV
Primary Category: cs.CL
Published: 2026-03-12
Fetched: 2026-03-14 05:03
Secondary Category: cs.AI
Comment: 9 pages (including bibliography), 2 figures, 6 tables
Abstract URL: https://arxiv.org/abs/2603.11597v1
PDF URL: https://arxiv.org/pdf/2603.11597v1