Papers
Research papers from arXiv and related sources
Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning
Multi-robot task planning requires decomposing natural-language instructions into executable actions for heterogeneous robot teams. Conventional Planning Domain Definition Language (PDDL) planners ...
Tomoya Kawabe, Rin Takano
DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation
Knowledge Distillation (KD) has emerged as a crucial technique for compressing Large Language Models (LLMs). Although existing cross-tokenizer KD methods have made notable progress, their effective...
Duc Trung Vu, Pham Khanh Chi, Dat Phi Van, Linh Ngo Van, Sang Dinh, Trung Le
Following the Diagnostic Trace: Visual Cognition-guided Cooperative Network for Chest X-Ray Diagnosis
Computer-aided diagnosis (CAD) has significantly advanced automated chest X-ray diagnosis but remains isolated from clinical workflows and lacks reliable decision support and interpretability. Huma...
Shaoxuan Wu, Jingkun Chen, Chong Ma, Cong Shen, Xiao Zhang, Jun Feng
Irresponsible Counselors: Large Language Models and the Loneliness of Modern Humans
Large language models (LLMs) have rapidly shifted from peripheral assistive tools to constant companions in everyday and even high-stakes human decision-making. Many users now consult these models ...
Abas Bertina, Sara Shakeri
PPCR-IM: A System for Multi-layer DAG-based Public Policy Consequence Reasoning and Social Indicator Mapping
Public policy decisions are typically justified using a narrow set of headline indicators, leaving many downstream social impacts unstructured and difficult to compare across policies. We propose P...
Zichen Song, Weijia Li
Multimodal Survival Modeling and Fairness-Aware Clinical Machine Learning for 5-Year Breast Cancer Risk Prediction
Clinical risk prediction models often underperform in real-world settings due to poor calibration, limited transportability, and subgroup disparities. These challenges are amplified in high-dimensi...
Toktam Khatibi
Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration
This paper presents and evaluates an optimized cascaded Nepali speech-to-English text translation (S2TT) system, focusing on mitigating structural noise introduced by Automatic Speech Recognition (...
Tangsang Chongbang, Pranesh Pyara Shrestha, Amrit Sarki, Anku Jaiswal
Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion
Multimodal Large Language Models (MLLMs) have achieved notable success in enhancing translation performance by integrating multimodal information. However, existing research primarily focuses on im...
Yexing Du, Youcheng Pan, Zekun Wang, Zheng Chu, Yichong Huang, Kaiyuan Liu, Bo Yang, Yang Xiang, ...
Multi-dimensional Assessment and Explainable Feedback for Counselor Responses to Client Resistance in Text-based Counseling with LLMs
Effectively addressing client resistance is a sophisticated clinical skill in psychological counseling, yet practitioners often lack timely and scalable supervisory feedback to refine their approac...
Anqi Li, Ruihan Wang, Zhaoming Chen, Yuqian Chen, Yu Lu, Yi Zhu, Yuan Xie, Zhenzhong Lan
AgentLTV: An Agent-Based Unified Search-and-Evolution Framework for Automated Lifetime Value Prediction
Lifetime Value (LTV) prediction is critical in advertising, recommender systems, and e-commerce. In practice, LTV data patterns vary across decision scenarios. As a result, practitioners often buil...
Chaowei Wu, Huazhu Chen, Congde Yuan, Qirui Yang, Guoqing Song, Yue Gao, Li Luo, Frank Youhua Che...
Permutation Polynomials Under Multiplicative-Additive Perturbations: Characterization via Difference Distribution Tables
We investigate permutation polynomials F over finite fields \mathbb{F}_{p^n} whose generalized derivative maps x -> F(x + a) - cF(x) are themselves permutations for all nonzero shifts a. This property, term...
Ranit Dutta, Pantelimon Stanica, Bimal Mandal
Multi-Layer Scheduling for MoE-Based LLM Reasoning
Large Language Models (LLMs) have achieved remarkable success across a wide range of tasks, but serving them efficiently at scale remains a critical challenge due to their substantial computational...
Yifan Sun, Gholamreza Haffar, Minxian Xu, Rajkumar Buyya, Adel N. Toosi
When More Is Less: A Systematic Analysis of Spatial and Commonsense Information for Visual Spatial Reasoning
Visual spatial reasoning (VSR) remains challenging for modern vision-language models (VLMs), despite advances in multimodal architectures. A common strategy is to inject additional information at i...
Muku Akasaka, Soyeon Caren Han
Holographic QCD equation of state constrained by lattice QCD: neural-ODE for probe-limit and a back-reaction test
We study the equation of state (EoS) of QCD matter in a bottom-up holographic setup that combines an Einstein-Maxwell-dilaton (EMD) sector with an improved Karch-Katz-Son-Stephanov (KKSS) flavor ac...
Yutian Deng, Mei Huang, Lin Zhang
GW070605: An Undisclosed Binary Neutron Star Hardware Injection in LIGO's Fifth Science Run
The authors wished to document the sensitivity improvement that has been contributed to the GW detection rate by detection algorithm research and development efforts, and set about re-analyzing S5 ...
Heather Fong, Kipp Cannon, Chi-Wai Chan, Richard N. George, Alvin K. Y. Li, Soichiro Kuwahara, Hi...
Structurally Aligned Subtask-Level Memory for Software Engineering Agents
Large Language Models (LLMs) have demonstrated significant potential as autonomous software engineering (SWE) agents. Recent work has further explored augmenting these agents with memory mechanisms...
Kangning Shen, Jingyuan Zhang, Chenxi Sun, Wencong Zeng, Yang Yue
WatchHand: Enabling Continuous Hand Pose Tracking On Off-the-Shelf Smartwatches
Tracking hand poses on wrist-wearables enables rich, expressive interactions, yet remains unavailable on commercial smartwatches, as prior implementations rely on external sensors or custom hardwar...
Jiwan Kim, Chi-Jung Lee, Hohurn Jung, Tianhong Catherine Yu, Ruidong Zhang, Ian Oakley, Cheng Zhang
MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification
Bangla-English code-mixing is widespread across South Asian social media, yet resources for implicit meaning identification in this setting remain scarce. Existing sentiment and sarcasm models larg...
Kazi Samin Yasar Alam, Md Tanbir Chowdhury, Tamim Ahmed, Ajwad Abrar, Md Rafid Haque
Inverse prediction of capacitor multiphysics dynamic parameters using deep generative model
Finite element simulations are run by package design engineers to model design structures. The process is irreversible, meaning every minute structural adjustment requires a fresh input parameter ru...
Kart-Leong Lim, Rahul Dutta, Mihai Rotaru
Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation
This paper argues that reliable end-to-end graph data analytics cannot be achieved by retrieval- or code-generation-centric LLM agents alone. Although large language models (LLMs) provide strong re...
Qiange Wang, Chaoyi Chen, Jingqi Gao, Zihan Wang, Yanfeng Zhang, Ge Yu