
Qualitative Coding Analysis through Open-Source Large Language Models: A User Study and Design Recommendations

Authors

Tung T. Ngo, Dai Nguyen Van, Anh-Minh Nguyen, Phuong-Anh Do, Anh Nguyen-Quoc

Abstract

Qualitative data analysis is labor-intensive, yet the privacy risks associated with commercial Large Language Models (LLMs) often preclude their use in sensitive research. To address this, we introduce ChatQDA, an on-device framework powered by open-source LLMs designed for privacy-preserving open coding. Our mixed-methods user study reveals that while participants rated the system highly for usability and perceived efficiency, they exhibited "conditional trust", valuing the tool for surface-level extraction while questioning its interpretive nuance and consistency. Furthermore, despite the technical security of local deployment, participants reported epistemic uncertainty regarding data protection, suggesting that invisible security measures are insufficient to foster trust. We conclude with design recommendations for local-first analysis tools that prioritize verifiable privacy and methodological rigor.
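To make the local-first idea concrete, the sketch below builds a request payload for a locally hosted open-source LLM asked to propose open codes for an interview excerpt. This is purely illustrative and not from the paper: the payload shape follows the common Ollama `/api/generate` REST convention, and the model name `llama3` is a placeholder assumption; ChatQDA's actual prompts and serving stack are not described here.

```python
import json


def build_open_coding_prompt(excerpt: str) -> dict:
    """Build a request body for a local inference server.

    Assumptions (not from the paper): Ollama-style /api/generate
    payload shape, placeholder model name "llama3".
    """
    prompt = (
        "You are assisting with qualitative open coding.\n"
        "Suggest 3-5 short descriptive codes for this excerpt, "
        "returned as a JSON list of strings.\n\n"
        f"Excerpt: {excerpt}"
    )
    return {
        "model": "llama3",                # placeholder local model
        "prompt": prompt,
        "stream": False,                  # one JSON response, no token stream
        "options": {"temperature": 0.2},  # low temperature for code consistency
    }


payload = build_open_coding_prompt(
    "I never upload interview transcripts to cloud tools; "
    "I worry about where the data ends up."
)
# The payload stays on-device: it would be POSTed to a localhost
# endpoint, so the transcript never leaves the researcher's machine.
body = json.dumps(payload)
```

Keeping the request targeted at a localhost endpoint is what makes the privacy property hold technically; the paper's finding is that this invisible guarantee alone did not make participants feel secure.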

Metadata

arXiv ID: 2602.18352
Provider: ARXIV
Primary Category: cs.HC
Cross-listed Categories: cs.CR, cs.SE
Comments: 6 pages. Accepted as Poster to CHI'26
Published: 2026-02-20
Fetched: 2026-02-23 05:33
Links: https://arxiv.org/abs/2602.18352v1 (abstract), https://arxiv.org/pdf/2602.18352v1 (PDF)