
DeCEAT: Decoding Carbon Emissions for AI-driven Software Testing

Authors

Pragati Kumari, Novarun Deb

Abstract

The increasing use of language models in automated software testing raises concerns about their environmental impact, yet existing sustainability analyses focus almost exclusively on large language models. As a result, the energy and carbon characteristics of small language models (SLMs) during test generation remain largely unexplored. To address this gap, this work introduces the DeCEAT framework, which systematically evaluates the environmental and performance trade-offs of SLMs using the HumanEval benchmark and adaptive prompt variants (based on the Anthropic template). The framework quantifies emissions and execution-time behavior under controlled conditions, with CodeCarbon measuring energy consumption and carbon emissions, and unit test coverage assessing the quality of generated tests. Our results show that different SLMs exhibit distinct sustainability strengths: some prioritize lower energy use and faster execution, while others maintain higher stability or accuracy under carbon constraints. These findings demonstrate that sustainability in SLM-driven test generation is multidimensional and strongly shaped by prompt design. This work provides a focused sustainability evaluation framework specifically tailored to automated SLM-based test generation, clarifying how prompt structure and model choice jointly influence environmental and performance outcomes.
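The abstract does not detail DeCEAT's measurement pipeline, but CodeCarbon's basic accounting converts measured energy into CO2-equivalent emissions by multiplying it with the carbon intensity of the local electricity grid. A minimal sketch of that conversion, using hypothetical numbers rather than results from the paper:

```python
def carbon_emissions_kg(energy_kwh: float, carbon_intensity_kg_per_kwh: float) -> float:
    """Convert energy use to CO2-equivalent emissions.

    This mirrors the core accounting CodeCarbon performs:
    emissions = energy consumed x carbon intensity of the grid.
    The figures below are illustrative, not taken from the paper.
    """
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hypothetical run: a test-generation session drawing 0.05 kWh
# on a grid emitting 0.4 kg CO2e per kWh.
print(round(carbon_emissions_kg(0.05, 0.4), 4))  # 0.02
```

In practice CodeCarbon wraps this with an `EmissionsTracker` that samples hardware power draw and looks up regional carbon intensity automatically; the function above only illustrates the final conversion step.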

Metadata

arXiv ID: 2602.18012
Provider: ARXIV
Primary Category: cs.SE
Published: 2026-02-20
Fetched: 2026-02-23 05:33
