Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Vectorization of Verilog Designs and its Effects on Verification and Synthesis

Vectorization is a compiler optimization that replaces multiple operations on scalar values with a single operation on vector values. Although common in traditional compilers such as rustc, clang, ...

Maria Fernanda Oliveira Guimarães, Ulisses Rosa, Ian Trudel, João Victor Amorim Vieira, Augusto Am...

2603.17099 2026-03-17
TESTING

Topological inference on brain networks with application to lesion symptom mapping

Persistent homology (PH) characterizes the shape of brain networks through persistence features. Group comparison of persistence features from brain networks can be challenging as they are inherent...

Yuan Wang, Jian Yin, Nicholas Riccardi, Dirk-Bart Den Ouden, Julius Fridriksson, Rutvik H. Desai

2603.17086 2026-03-17
AI LLM

Efficient Reasoning on the Edge

Large language models (LLMs) with chain-of-thought reasoning achieve state-of-the-art performance across complex problem-solving tasks, but their verbose reasoning traces and large context requirem...

Yelysei Bondarenko, Thomas Hehn, Rob Hesselink, Romain Lepert, Fabio Valerio Massoli, Evgeny Miro...

2603.16867 2026-03-17
AI LLM

Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory

Recent advances in Large Language Models (LLMs) have enabled conversational AI agents to engage in extended multi-turn interactions spanning weeks or months. However, existing memory systems strugg...

Sahil Sen, Elias Lumer, Anmol Gulati, Vamse Kumar Subbiah

2603.16862 2026-03-17
TESTING

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Omni-modal large language models (OLMs) redefine human-machine interaction by natively integrating audio, vision, and text. However, existing OLM benchmarks remain anchored to static, accuracy-cent...

Tianyu Xie, Jinfa Huang, Yuexiao Ma, Rongfang Luo, Yan Yang, Wang Chen, Yuhui Zeng, Ruize Fang, Y...

2603.16859 2026-03-17
TESTING

Long-Horizon Traffic Forecasting via Incident-Aware Conformal Spatio-Temporal Transformers

Reliable multi-horizon traffic forecasting is challenging because network conditions are stochastic, incident disruptions are intermittent, and effective spatial dependencies vary across time-of-da...

Mayur Patil, Qadeer Ahmed, Shawn Midlam-Mohler, Stephanie Marik, Allen Sheldon, Rajeev Chhajer, N...

2603.16857 2026-03-17
TESTING

BrickSim: A Physics-Based Simulator for Manipulating Interlocking Brick Assemblies

Interlocking brick assemblies provide a standardized yet challenging testbed for contact-rich and long-horizon robotic manipulation, but existing rigid-body simulators do not faithfully capture sna...

Haowei Wen, Ruixuan Liu, Weiyi Piao, Siyu Li, Changliu Liu

2603.16853 2026-03-17
AI LLM

Mediocrity is the key for LLM as a Judge Anchor Selection

The "LLM-as-a-judge" paradigm has become a standard method for evaluating open-ended generation. To address the quadratic scalability costs of pairwise comparisons, popular benchmarks like Arena-...

Shachar Don-Yehiya, Asaf Yehudai, Leshem Choshen, Omri Abend

2603.16848 2026-03-17
AI LLM

Learning to Present: Inverse Specification Rewards for Agentic Slide Generation

Automated presentation generation remains a challenging task requiring coherent content creation, visual design, and audience-aware communication. This work proposes an OpenEnv-compatible reinforce...

Karthik Ragunath Ananda Kumar, Subrahmanyam Arunachalam

2603.16839 2026-03-17
TESTING

Complex Wannier centers and drifting Wannier functions in non-Hermitian Hamiltonians

The extension of topological band theory to non-Hermitian Hamiltonians with line energy gaps remains largely unexplored, despite early indications of rich underlying physics. In this setting, Wilso...

Pedro Fittipaldi de Castro, Wladimir A. Benalcazar

2603.16838 2026-03-17
TESTING

Conditional Distributional Treatment Effects: Doubly Robust Estimation and Testing

Beyond conditional average treatment effects, treatments may impact the entire outcome distribution in covariate-dependent ways, for example, by altering the variance or tail risks for specific sub...

Saksham Jain, Alex Luedtke

2603.16829 2026-03-17
AI LLM

Prompt Programming for Cultural Bias and Alignment of Large Language Models

Culture shapes reasoning, values, prioritization, and strategic decision-making, yet large language models (LLMs) often exhibit cultural biases that misalign with target populations. As LLMs are in...

Maksim Eren, Eric Michalak, Brian Cook, Johnny Seales

2603.16827 2026-03-17
AI LLM

Surg$Σ$: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence

Surgical intelligence has the potential to improve the safety and consistency of surgical care, yet most existing surgical AI frameworks remain task-specific and struggle to generalize across proce...

Zhitao Zeng, Mengya Xu, Jian Jiang, Pengfei Guo, Yunqiu Xu, Zhu Zhuo, Chang Han Low, Yufan He, Do...

2603.16822 2026-03-17
TESTING

Measurement-Based Estimation of Causal Conditional Variances and Its Application to Macroscopic Quantum Phenomenon

We analytically investigate a quantum estimation method for a mechanical oscillator in a detuned cavity system based solely on homodyne measurement records, building on the framework developed by C...

Kosei Hatakeyama, Ryotaro Fukuzumi, Akira Matsumura, Daisuke Miki, Kazuhiro Yamamoto

2603.16821 2026-03-17
AI LLM

Leveraging LLMs for Structured Information Extraction and Analysis from Cloud Incident Reports (Work In Progress Paper)

Incident management is essential to maintain the reliability and availability of cloud computing services. Cloud vendors typically disclose incident reports to the public, summarizing the failures ...

Xiaoyu Chu, Shashikant Ilager, Yizhen Zang, Sacheendra Talluri, Alexandru Iosup

2603.16818 2026-03-17
AI LLM

Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights

Large language models (LLMs) frequently hallucinate, limiting their reliability in knowledge-intensive applications. Retrieval-augmented generation (RAG) and conformal factuality have emerged as po...

Yi Chen, Daiwei Chen, Sukrut Madhav Chikodikar, Caitlyn Heqi Yin, Ramya Korlakai Vinayak

2603.16817 2026-03-17
AI LLM

ODIN-Based CPU-GPU Architecture with Replay-Driven Simulation and Emulation

Integration of CPU and GPU technologies is a key enabler for modern AI and graphics workloads, combining control-oriented processing with massive parallel compute capability. As systems evolve towa...

Nij Dorairaj, Debabrata Chatterjee, Hong Wang, Hong Jiang, Alankar Saxena, Altug Koker, Thiam Ern...

2603.16812 2026-03-17
TESTING

High-Dimensional Gaussian Mean Estimation under Realizable Contamination

We study mean estimation for a Gaussian distribution with identity covariance in $\mathbb{R}^d$ under a missing data scheme termed realizable $ε$-contamination model. In this model an adversary can...

Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas

2603.16798 2026-03-17
AI LLM

Improving Code Comprehension through Cognitive-Load Aware Automated Refactoring for Novice Programmers

Novice programmers often struggle to comprehend code due to vague naming, deep nesting, and poor structural organization. While explanations may offer partial support, they typically do not restruc...

Subarna Saha, Alif Al Hasan, Fariha Tanjim Shifat, Mia Mohammad Imran

2603.16791 2026-03-17
TESTING

InCoder-32B: Code Foundation Model for Industrial Scenarios

Recent code large language models have achieved remarkable progress on general programming tasks. Nevertheless, their performance degrades significantly in industrial scenarios that require reasoni...

Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Shawn Guo, Haowen Wang, Weicheng Gu, Yaxin Du, Jo...

2603.16790 2026-03-17