Paper
His2Trans: A Skeleton First Framework for Self Evolving C to Rust Translation with Historical Retrieval
Authors
Shengbo Wang, Mingwei Liu, Guangsheng Ou, Yuwen Chen, Zike Li, Yanlin Wang, Zibin Zheng
Abstract
Automated C-to-Rust migration encounters systemic obstacles when scaling from code snippets to industrial projects, mainly because build context is often unavailable ("dependency hell") and domain-specific evolutionary knowledge is missing. As a result, current LLM-based methods frequently cannot reconstruct precise type definitions under complex build systems or infer idiomatic API correspondences, which in turn leads to hallucinated dependencies and unproductive repair loops. To tackle these issues, we introduce His2Trans, a framework that combines a deterministic, build-aware skeleton with self-evolving knowledge extraction to support stable, incremental migration. On the structural side, His2Trans performs build tracing to create a compilable Project-Level Skeleton Graph, providing a strictly typed environment that separates global verification from local logic generation. On the cognitive side, it derives fine-grained API and code-fragment rules from historical migration traces and uses a Retrieval-Augmented Generation (RAG) system to steer the LLM toward idiomatic interface reuse. Experiments on industrial OpenHarmony modules show that His2Trans reaches a 99.75% incremental compilation pass rate, effectively fixing build failures where baselines struggle. On general-purpose benchmarks, it lowers the unsafe code ratio by 23.6 percentage points compared to C2Rust while producing the fewest warnings. Finally, knowledge accumulation studies demonstrate the framework's evolutionary behavior: by continuously integrating verified patterns, His2Trans cuts repair overhead on unseen tasks by about 60%.
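The retrieval step described in the abstract (looking up rules mined from past migrations and handing them to the LLM) can be pictured with a small sketch. This is only an illustration under stated assumptions: the abstract does not specify the rule format or the retrieval method, so the `Rule` type, the `RULES` entries, and the Jaccard token-overlap ranking below are all hypothetical stand-ins, not the authors' implementation.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    c_pattern: str  # C-side API or fragment seen in a past migration
    rust_hint: str  # Rust-side replacement recorded for it


# Hypothetical rule store; the entries are illustrative, not taken from the paper.
RULES = [
    Rule("memcpy(dst, src, n)", "dst[..n].copy_from_slice(&src[..n])"),
    Rule("strlen(s)", "CStr::from_ptr(s).to_bytes().len()"),
    Rule("malloc(n * sizeof(T))", "Vec::<T>::with_capacity(n)"),
]


def tokens(code: str) -> set[str]:
    """Split a code string into its set of identifier tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code))


def retrieve(c_code: str, rules: list[Rule], k: int = 2) -> list[Rule]:
    """Return up to k rules ranked by Jaccard token overlap with the C snippet."""
    query = tokens(c_code)

    def score(rule: Rule) -> float:
        rule_tokens = tokens(rule.c_pattern)
        union = query | rule_tokens
        return len(query & rule_tokens) / len(union) if union else 0.0

    ranked = sorted(rules, key=score, reverse=True)
    # Drop rules that share no token with the query at all.
    return [r for r in ranked[:k] if score(r) > 0]


snippet = "size_t n = strlen(name); memcpy(buf, name, n);"
hits = retrieve(snippet, RULES)
for rule in hits:
    print(f"{rule.c_pattern}  ->  {rule.rust_hint}")
```

In a pipeline of this shape, the retrieved hints would be placed in the prompt next to the function being translated, so the model reuses an interface that has already been verified instead of inventing one. A real system would likely use a stronger similarity measure than token overlap.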
Metadata
arXiv: 2603.02617v1 (https://arxiv.org/abs/2603.02617v1)
PDF: https://arxiv.org/pdf/2603.02617v1
Primary category: cs.SE
Published: 2026-03-03T05:42:08Z