Papers
Research papers from arXiv and related sources
SafeSeek: Universal Attribution of Safety Circuits in Language Models
Mechanistic interpretability reveals that safety-critical behaviors (e.g., alignment, jailbreak, backdoor) in Large Language Models (LLMs) are grounded in specialized functional components. However...
Miao Yu, Siyuan Fu, Moayad Aloqaily, Zhenhong Zhou, Safa Otoum, Xing Fan, Kun Wang, Yufei Guo, Qi...
On the Vulnerability of FHE Computation to Silent Data Corruption
Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services, enabling computation directly on encrypted data. As FHE implementations matur...
Jianan Mu, Ge Yu, Zhaoxuan Kan, Song Bian, Liang Kong, Zizhen Liu, Cheng Liu, Jing Ye, Huawei Li
AI Lifecycle-Aware Feasibility Framework for Split-RIC Orchestration in NTN O-RAN
Integrating Artificial Intelligence (AI) into Non-Terrestrial Networks (NTN) is constrained by the joint limits of satellite SWaP and feeder-link capacity, which directly impact O-RAN closed-loop c...
Daniele Tarchi
Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguistic Style in English and Arabic with Six Large Language Models
The advancing fluency of LLMs raises important questions about their ability to emulate complex human traits, including emotional expression and personality, across diverse linguistic and cultural ...
Nasser A Alsadhan
Virtual materials testing of ASSB cathodes combining AI-based stochastic 3D modeling and numerical simulations
The performance of all-solid-state battery (ASSB) cathodes strongly depends on their microstructure. Optimizing the cathode morphology can therefore enhance effective macroscopic properties such as...
Anina Dufter, Sabrina Weber, Orkun Furat, Johannes Schubert, René Rekers, Maximilian Luczak, Erik...
MemCollab: Cross-Agent Memory Collaboration via Contrastive Trajectory Distillation
Large language model (LLM)-based agents rely on memory mechanisms to reuse knowledge from past problem-solving experiences. Existing approaches typically construct memory in a per-agent manner, tig...
Yurui Chang, Yiran Wu, Qingyun Wu, Lu Lin
I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes
Internet memes represent a popular form of multimodal online communication and often use figurative elements to convey layered meaning through the combination of text and images. However, it remain...
Shijia Zhou, Saif M. Mohammad, Barbara Plank, Diego Frassinelli
Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?
Amidst the rising capabilities of generative AI to mimic specific human styles, this study investigates the ability of state-of-the-art large language models (LLMs), including GPT-4o, Gemini 1.5 Pr...
Nasser A Alsadhan
Sparser, Faster, Lighter Transformer Language Models
Scaling autoregressive large language models (LLMs) has driven unprecedented progress but comes with vast computational costs. In this work, we tackle these costs by leveraging unstructured sparsit...
Edoardo Cetin, Stefano Peluchetti, Emilio Castillo, Akira Naruse, Mana Murakami, Llion Jones
Who Is in the Room? Stakeholder Perspectives on AI Recording in Pediatric Emergency Care
Artificial intelligence systems that record voice and video during pediatric emergencies are emerging as human-computer interaction (HCI) technologies with direct implications for clinical work, ...
Alexandre De Masi, Sergio Manzano, Johan N. Siebert, Frederic Ehrler
ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting
Recent advancements in Video Large Language Models (VideoLLMs) have enabled strong performance across diverse multimodal video tasks. To reduce the high computational cost of processing dense video...
Yeonkyung Lee, Dayun Ju, Youngmin Kim, Seil Kang, Seong Jae Hwang
ImplicitRM: Unbiased Reward Modeling from Implicit Preference Data for LLM Alignment
Reward modeling represents a long-standing challenge in reinforcement learning from human feedback (RLHF) for aligning language models. Current reward modeling is heavily contingent upon experiment...
Hao Wang, Haocheng Yang, Licheng Pan, Lei Shen, Xiaoxi Li, Yinuo Wang, Zhichao Chen, Yuan Lu, Hao...
Reasoning over Semantic IDs Enhances Generative Recommendation
Recent advances in generative recommendation have leveraged pretrained LLMs by formulating sequential recommendation as autoregressive generation over a unified token space comprising language toke...
Yingzhi He, Yan Sun, Junfei Tan, Yuxin Chen, Xiaoyu Kong, Chunxu Shen, Xiang Wang, An Zhang, Tat-...
From Synthetic to Native: Benchmarking Multilingual Intent Classification in Logistics Customer Service
Multilingual intent classification is central to customer-service systems on global logistics platforms, where models must process noisy user queries across languages and hierarchical label spaces....
Haoyu He, Jinyu Zhuang, Haoran Chu, Shuhang Yu, J&T AI Group, Hao Wang, Kunpeng Han
Robust Safety Monitoring of Language Models via Activation Watermarking
Large language models (LLMs) can be misused to reveal sensitive information, such as weapon-making instructions or writing malware. LLM providers rely on monitoring to detect and flag unsa...
Toluwani Aremu, Daniil Ognev, Samuele Poppi, Nils Lukas
UniDial-EvalKit: A Unified Toolkit for Evaluating Multi-Faceted Conversational Abilities
Benchmarking AI systems in multi-turn interactive scenarios is essential for understanding their practical capabilities in real-world applications. However, existing evaluation protocols are highly...
Qi Jia, Haodong Zhao, Dun Pei, Xiujie Song, Shibo Wang, Zijian Chen, Zicheng Zhang, Xiangyang Zhu...
Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy
The widespread adoption of Large Language Models (LLMs) has made the detection of AI-Generated text a pressing and complex challenge. Although many detection systems report high benchmark accuracy,...
Shushanta Pudasaini, Luis Miralles-Pechuán, David Lillis, Marisa Llorens Salvador
Can Language Models Pass Software Testing Certification Exams? A Case Study
Large Language Models (LLMs) play a pivotal role in both academic research and broader societal applications. LLMs are increasingly used in software testing activities such as test case generation,...
Fitash Ul Haq, Jordi Cabot
DAK-UCB: Diversity-Aware Prompt Routing for LLMs and Generative Models
The expansion of generative AI and LLM services underscores the growing need for adaptive mechanisms to select an appropriate available model to respond to a user's prompts. Recent works have propo...
Donya Jafari, Farzan Farnia
HGNet: Scalable Foundation Model for Automated Knowledge Graph Generation from Scientific Literature
Automated knowledge graph (KG) construction is essential for navigating the rapidly expanding body of scientific literature. However, existing approaches struggle to recognize long multi-word entit...
Devvrat Joshi, Islem Rekik