Paper
Qualitative Coding Analysis through Open-Source Large Language Models: A User Study and Design Recommendations
Authors
Tung T. Ngo, Dai Nguyen Van, Anh-Minh Nguyen, Phuong-Anh Do, Anh Nguyen-Quoc
Abstract
Qualitative data analysis is labor-intensive, yet the privacy risks associated with commercial Large Language Models (LLMs) often preclude their use in sensitive research. To address this, we introduce ChatQDA, an on-device framework powered by open-source LLMs designed for privacy-preserving open coding. Our mixed-methods user study reveals that while participants rated the system highly for usability and perceived efficiency, they exhibited "conditional trust", valuing the tool for surface-level extraction while questioning its interpretive nuance and consistency. Furthermore, despite the technical security of local deployment, participants reported epistemic uncertainty regarding data protection, suggesting that invisible security measures are insufficient to foster trust. We conclude with design recommendations for local-first analysis tools that prioritize verifiable privacy and methodological rigor.
Metadata
Related papers
Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini
Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongy... • 2026-03-25
Comparing Developer and LLM Biases in Code Evaluation
Aditya Mittal, Ryan Shar, Zichu Wu, Shyam Agarwal, Tongshuang Wu, Chris Donah... • 2026-03-25
The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence
Biplab Pal, Santanu Bhattacharya • 2026-03-25
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, ... • 2026-03-25
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie... • 2026-03-25
Raw Data (Debug)
{
"raw_xml": "<entry>\n <id>http://arxiv.org/abs/2602.18352v1</id>\n <title>Qualitative Coding Analysis through Open-Source Large Language Models: A User Study and Design Recommendations</title>\n <updated>2026-02-20T17:04:02Z</updated>\n <link href='https://arxiv.org/abs/2602.18352v1' rel='alternate' type='text/html'/>\n <link href='https://arxiv.org/pdf/2602.18352v1' rel='related' title='pdf' type='application/pdf'/>\n <summary>Qualitative data analysis is labor-intensive, yet the privacy risks associated with commercial Large Language Models (LLMs) often preclude their use in sensitive research. To address this, we introduce ChatQDA, an on-device framework powered by open-source LLMs designed for privacy-preserving open coding. Our mixed-methods user study reveals that while participants rated the system highly for usability and perceived efficiency, they exhibited \"conditional trust\", valuing the tool for surface-level extraction while questioning its interpretive nuance and consistency. Furthermore, despite the technical security of local deployment, participants reported epistemic uncertainty regarding data protection, suggesting that invisible security measures are insufficient to foster trust. We conclude with design recommendations for local-first analysis tools that prioritize verifiable privacy and methodological rigor.</summary>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.HC'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.CR'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.SE'/>\n <published>2026-02-20T17:04:02Z</published>\n <arxiv:comment>6 pages. Accepted as Poster to CHI'26</arxiv:comment>\n <arxiv:primary_category term='cs.HC'/>\n <author>\n <name>Tung T. Ngo</name>\n </author>\n <author>\n <name>Dai Nguyen Van</name>\n </author>\n <author>\n <name>Anh-Minh Nguyen</name>\n </author>\n <author>\n <name>Phuong-Anh Do</name>\n </author>\n <author>\n <name>Anh Nguyen-Quoc</name>\n </author>\n </entry>"
}