Papers
Research papers from arXiv and related sources
A Benchmarking Framework for Model Datasets
Empirical and LLM-based research in model-driven engineering increasingly relies on datasets of software models, for instance, to train or evaluate machine learning techniques for modeling support....
Philipp-Lorenz Glaser, Lola Burgueño, Dominik Bork
GCAgent: Enhancing Group Chat Communication through Dialogue Agents System
As a key form in online social platforms, group chat is a popular space for interest exchange or problem-solving, but its effectiveness is often hindered by inactivity and management challenges. Wh...
Zijie Meng, Zheyong Xie, Zheyu Ye, Chonggang Lu, Zuozhu Liu, Zihan Niu, Yao Hu, Shaosheng Cao
SlideSparse: Fast and Flexible (2N-2):2N Structured Sparsity
NVIDIA's 2:4 Sparse Tensor Cores deliver 2x throughput but demand strict 50% pruning -- a ratio that collapses LLM reasoning accuracy (Qwen3: 54% to 15%). Milder $(2N-2):2N$ patterns (e.g., 6:8, 25...
Hanyong Shao, Yingbo Hao, Ting Song, Yan Xia, Di Zhang, Shaohan Huang, Xun Wu, Songchen Xu, Le Xu...
Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards
Recently, Automatic Speech Recognition (ASR) systems (e.g., Whisper) have achieved remarkable accuracy improvements but remain highly sensitive to real-world unseen data (data with large distributi...
Linghan Fang, Tianxin Xie, Li Liu
Not All Trust is the Same: Effects of Decision Workflow and Explanations in Human-AI Decision Making
A central challenge in AI-assisted decision making is achieving warranted, well-calibrated trust. Both overtrust (accepting incorrect AI recommendations) and undertrust (rejecting correct advice) s...
Laura Spillner, Rachel Ringe, Robert Porzel, Rainer Malaka
AI+HW 2035: Shaping the Next Decade
Artificial intelligence (AI) and hardware (HW) are advancing at unprecedented rates, yet their trajectories have become inseparably intertwined. The global research community lacks a cohesive, long...
Deming Chen, Jason Cong, Azalia Mirhoseini, Christos Kozyrakis, Subhasish Mitra, Jinjun Xiong, Cl...
Scaling Real-Time Traffic Analytics on Edge-Cloud Fabrics for City-Scale Camera Networks
Real-time city-scale traffic analytics requires processing 100s-1000s of CCTV streams under strict latency, bandwidth, and compute limits. We present a scalable AI-driven Intelligent Transportation...
Akash Sharma, Pranjal Naman, Roopkatha Banerjee, Priyanshu Pansari, Sankalp Gawali, Mayank Arya, ...
Core-based Hierarchies for Efficient GraphRAG
Retrieval-Augmented Generation (RAG) enhances large language models by incorporating external knowledge. However, existing vector-based methods often fail on global sensemaking tasks that require r...
Jakir Hossain, Ahmet Erdem Sarıyüce
Diffusion LLMs can think EoS-by-EoS
Diffusion LLMs have been proposed as an alternative to autoregressive LLMs, excelling especially at complex reasoning tasks with interdependent sub-goals. Curiously, this is particularly true if th...
Sarah Breckner, Sebastian Schuster
Small Changes, Big Impact: Demographic Bias in LLM-Based Hiring Through Subtle Sociocultural Markers in Anonymised Resumes
Large Language Models (LLMs) are increasingly deployed in resume screening pipelines. Although explicit PII (e.g., names) is commonly redacted, resumes typically retain subtle sociocultural markers...
Bryan Chen Zhengyu Tan, Shaun Khoo, Bich Ngoc Doan, Zhengyuan Liu, Nancy F. Chen, Roy Ka-Wei Lee
Escaping the Hydrolysis Trap: An Agentic Workflow for Inverse Design of Durable Photocatalytic Covalent Organic Frameworks
Covalent organic frameworks (COFs) are promising photocatalysts for solar hydrogen production, yet the most electronically favorable linkages, imines, hydrolyze rapidly in water, creating a stabili...
Iman Peivaste, Nicolas D. Boscher, Ahmed Makradi, Salim Belouettar
Mario: Multimodal Graph Reasoning with Large Language Models
Recent advances in large language models (LLMs) have opened new avenues for multimodal reasoning. Yet, most existing methods still rely on pretrained vision-language models (VLMs) to encode image-t...
Yuanfu Sun, Kang Li, Pengkang Guo, Jiajin Liu, Qiaoyu Tan
SWARM-SLR AIssistant: A Unified Framework for Scalable Systematic Literature Review Automation
Despite a growing ecosystem of tools supporting Systematic Literature Reviews (SLRs), integrating them into user-friendly workflows remains challenging. The Streamlined Workflow for Automating Mach...
Tim Wittenborg, Allard Oelen, Manuel Prinz
Incentive Aware AI Regulations: A Credal Characterisation
While high-stakes ML applications demand strict regulations, strategic ML providers often evade them to lower development costs. To address this challenge, we cast AI regulation as a mechanism desi...
Anurag Singh, Julian Rodemann, Rajeev Verma, Siu Lun Chau, Krikamol Muandet
Guidelines for the Annotation and Visualization of Legal Argumentation Structures in Chinese Judicial Decisions
This guideline proposes a systematic and operational annotation framework for representing the structure of legal argumentation in judicial decisions. Grounded in theories of legal reasoning and ar...
Kun Chen, Xianglei Liao, Kaixue Fei, Yi Xing, Xinrui Li
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
Semi-structured N:M sparsity and low-bit quantization (e.g., 1.58-bit BitNet) are two promising approaches for improving the efficiency of large language models (LLMs), yet they have largely been s...
Di Zhang, Xun Wu, Shaohan Huang, Yudong Wang, Hanyong Shao, Yingbo Hao, Zewen Chi, Li Dong, Ting ...
C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning
Large language models (LLMs) are increasingly used as judges of chain-of-thought (CoT) reasoning, but it remains unclear whether they can reliably assess process faithfulness rather than just answe...
Avni Mittal, Rauno Arike
LBM: Hierarchical Large Auto-Bidding Model via Reasoning and Acting
The growing scale of ad auctions on online advertising platforms has intensified competition, making manual bidding impractical and necessitating auto-bidding to help advertisers achieve their econ...
Yewen Li, Zhiyi Lyu, Peng Jiang, Qingpeng Cai, Fei Pan, Bo An, Peng Jiang
MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus
Diagnosing hepatic diseases accurately and interpretably is critical, yet it remains challenging in real-world clinical settings. Existing AI approaches for clinical diagnosis often lack transparen...
Zheng Li, Jiayi Xu, Zhikai Hu, Hechang Chen, Lele Cong, Yunyun Wang, Shuchao Pang
Measuring the Redundancy of Decoder Layers in SpeechLLMs
Speech Large Language Models route speech encoder representations into an LLM decoder that typically accounts for over 90% of total parameters. We study how much of this decoder capacity is actuall...
Adel Moumen, Guangzhi Sun, Philip C Woodland