AI LLM February 23, 2026

Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering

Authors

Chih-Hong Cheng, Brian Hsuan-Cheng Liao, Adam Molin, Hasan Esen

Abstract

The adoption of large language models in safety-critical system engineering is constrained by trustworthiness, traceability, and alignment with established verification practices. We propose workflow-level design principles for trustworthy GenAI integration and demonstrate them in an end-to-end automotive pipeline, from requirement delta identification to SysML v2 architecture update and re-testing. First, we show that monolithic ("big-bang") prompting misses critical changes in large specifications, while section-wise decomposition with diversity sampling and lightweight NLP sanity checks improves completeness and correctness. Then, we propagate requirement deltas into SysML v2 models and validate updates via compilation and static analysis. Additionally, we ensure traceable regression testing by generating test cases through explicit mappings from specification variables to architectural ports and states, providing practical safeguards for GenAI used in safety-critical automotive engineering.
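The abstract does not say how the lightweight sanity checks are implemented. As one illustration of the idea, the sketch below splits two specification versions into numbered sections, finds every section whose text differs, and flags any such section that an LLM-reported delta list failed to mention. The heading pattern, the function names (`split_sections`, `textual_deltas`, `missed_sections`), and the plain string comparison are all assumptions made for this example, not the authors' method.

```python
import re

# Assumed heading style: a dotted section number followed by a title, e.g. "3.2 Braking".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def split_sections(spec_text):
    """Split a plain-text spec into {section_id: body} using numbered headings."""
    sections = {}
    current = None
    for line in spec_text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(stripped)
    return {sid: "\n".join(body) for sid, body in sections.items()}


def textual_deltas(old_spec, new_spec):
    """Return IDs of sections that were added, removed, or textually changed."""
    old, new = split_sections(old_spec), split_sections(new_spec)
    changed = set()
    for sid in old.keys() | new.keys():
        # A missing section on either side, or differing body text, is a delta.
        if old.get(sid) != new.get(sid):
            changed.add(sid)
    return changed


def missed_sections(reported_ids, old_spec, new_spec):
    """Sections that differ textually but are absent from the reported deltas."""
    return sorted(textual_deltas(old_spec, new_spec) - set(reported_ids))
```

A check like this only catches omissions at the section level. It says nothing about whether a reported delta is described correctly, so it would complement rather than replace review of the model output.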

Metadata

arXiv ID: 2602.19614
Provider: ARXIV
Primary Category: cs.SE
Published: 2026-02-23
Fetched: 2026-02-24 04:38
