February 27, 2026

Benchmarking BERT-based Models for Sentence-level Topic Classification in Nepali Language

Authors

Nischal Karki, Bipesh Subedi, Prakash Poudyal, Rupak Raj Ghimire, Bal Krishna Bal

Abstract

Transformer-based models such as BERT have significantly advanced Natural Language Processing (NLP) across many languages. However, Nepali, a low-resource language written in Devanagari script, remains relatively underexplored. This study benchmarks multilingual, Indic, Hindi, and Nepali BERT variants to evaluate their effectiveness in Nepali topic classification. Ten pre-trained models, including mBERT, XLM-R, MuRIL, DevBERT, HindiBERT, IndicBERT, and NepBERTa, were fine-tuned and tested on a balanced Nepali dataset containing 25,006 sentences across five conceptual domains, and performance was evaluated using accuracy, weighted precision, recall, F1-score, and AUROC. The results reveal that Indic models, particularly MuRIL-large, achieved the highest F1-score of 90.60%, outperforming both multilingual and monolingual models. NepBERTa also performed competitively, with an F1-score of 88.26%. Overall, these findings establish a robust baseline for future document-level classification and broader Nepali NLP applications.
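
The pipeline the abstract outlines is a conventional one: attach a 5-way classification head to each pre-trained checkpoint, fine-tune on the sentence corpus, and score with accuracy, weighted precision/recall/F1, and AUROC. Below is a minimal sketch of that setup using the Hugging Face transformers Trainer; it is not the authors' code, and the model ID, the dataset variables (raw_train, raw_eval), and the hyperparameters are illustrative assumptions.

# A minimal sketch (not the authors' code) of the setup described above:
# fine-tune one pre-trained BERT variant for 5-way Nepali sentence topic
# classification and report accuracy, weighted precision/recall/F1, and
# AUROC. Model ID, dataset variables, and hyperparameters are assumptions.
from scipy.special import softmax
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "google/muril-large-cased"  # assumed Hub ID for MuRIL-large
NUM_LABELS = 5                         # five conceptual domains

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=NUM_LABELS)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# raw_train / raw_eval are assumed: datasets.Dataset splits of the
# 25,006-sentence corpus with "text" and integer "label" columns.
train_ds = raw_train.map(tokenize, batched=True)
eval_ds = raw_eval.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    probs = softmax(logits, axis=-1)  # AUROC needs class probabilities
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision_score(labels, preds, average="weighted"),
        "recall": recall_score(labels, preds, average="weighted"),
        "f1": f1_score(labels, preds, average="weighted"),
        "auroc": roc_auc_score(labels, probs, multi_class="ovr",
                               average="weighted"),
    }

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())

Running this once per checkpoint (mBERT, XLM-R, MuRIL, NepBERTa, and so on) against a shared train/test split would reproduce the kind of model-by-model comparison the paper reports.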

Metadata

arXiv ID: 2602.23940
Provider: ARXIV
Primary Category: cs.CL
Secondary Category: cs.LG
Published: 2026-02-27
Fetched: 2026-03-02 06:04
Abstract page: https://arxiv.org/abs/2602.23940v1
PDF: https://arxiv.org/pdf/2602.23940v1
Comment: 5 pages, 2 figures. Accepted and presented at the Regional International Conference on Natural Language Processing (RegICON 2025), Gauhati University, Guwahati, India, November 27-29, 2025; to appear in the conference proceedings.
