Papers
Research papers from arXiv and related sources
Chaotic motion and power spectral density in Schwarzschild Bertotti-Robinson black hole spacetime
In this paper, we show that in weak field limit Schwarzschild Bertotti-Robinson black hole (Schwarzschild-BR BH) turns into Schwarzschild black hole immersed in external uniform magnetic field whic...
Yunqiao Xu, Uktamjon Uktamov, Pierros Ntelis, Ahmadjon Abdujabbarov, Bobomurat Ahmedov, Chengxun ...
Text-Based Personas for Simulating User Privacy Decisions
The ability to simulate human privacy decisions has significant implications for aligning autonomous agents with individual intent and conducting cost-effective, large-scale privacy-centric user st...
Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal, Hamza Harkous, Nina Taft, Marco Gruteser
Embodied Science: Closing the Discovery Loop with Agentic Embodied AI
Artificial intelligence has demonstrated remarkable capability in predicting scientific properties, yet scientific discovery remains an inherently physical, long-horizon pursuit governed by experim...
Xiang Zhuang, Chenyi Zhou, Kehua Feng, Zhihui Zhu, Yunfan Gao, Yijie Zhong, Yichi Zhang, Junjie H...
Evaluating Image Editing with LLMs: A Comprehensive Benchmark and Intermediate-Layer Probing Approach
Evaluating text-guided image editing (TIE) methods remains a challenging problem, as reliable assessment should simultaneously consider perceptual quality, alignment with textual instructions, and ...
Shiqi Gao, Zitong Xu, Kang Fu, Huiyu Duan, Xiongkuo Min, Jia wang, Guangtao Zhai
Template-based Object Detection Using a Foundation Model
Most currently used object detection methods are learning-based, and can detect objects under varying appearances. Those models require training and a training dataset. We focus on use cases with l...
Valentin Braeutigam, Matthias Stock, Bernhard Egger
Extraction of tabulated statistical results with tableParser
Tabulated content is omnipresent in scientific literature. This work presents the R package *tableParser*, designed to extract and postprocess tables from NISO-JATS-encoded XML, HTML, DOCX, and, wi...
Ingmar Böschen
ConSearcher: Supporting Conversational Information Seeking in Online Communities with Member Personas
Many people browse online communities to learn from others' experiences and opinions, e.g., for constructing travel plans. Conversational search powered by large language models (LLMs) could ease t...
Shiwei Wu, Xinyue Chen, Yuheng Liu, Xingbo Wang, Qingyu Guo, Longfei Chen, Chuhan Shi, Zhenhui Peng
Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking
Human Label Variation (HLV), i.e. systematic differences among annotators' judgments, remains underexplored in benchmarks despite rapid progress in large language model (LLM) development. We addres...
Tomas Ruiz, Tanalp Agustoslu, Carsten Schwemmer
Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation
Understanding the internal mechanisms of transformer-based large language models (LLMs) is crucial for their reliable deployment and effective operation. While recent efforts have yielded a plethor...
Lasse Marten Jantsch, Dong-Jae Koh, Seonghyeon Lee, Young-Kyoon Suh
FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment
Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference...
Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu
PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction
While context compression can mitigate the growing inference costs of Large Language Models (LLMs) by shortening contexts, existing methods that specify a target compression ratio or length suffer ...
Runsong Zhao, Shilei Liu, Jiwei Tang, Langming Liu, Haibin Chen, Weidong Zhang, Yujin Yuan, Tong ...
LiteAtt: Secure and Seamless IoT Services Using TinyML-based Self-Attestation as a Primitive
As the Internet of Things (IoT) becomes an integral part of critical infrastructure, smart cities, and consumer networks, there has been an increase in the number of software attacks on the microco...
Varun Kohli, Biplab Sikdar
Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification
Formal verification via interactive theorem proving is increasingly used to ensure the correctness of critical systems, yet constructing large proof scripts remains highly manual and limits scalabi...
Baoding He, Zenan Li, Wei Sun, Yuan Yao, Taolue Chen, Xiaoxing Ma, Zhendong Su
TAB-AUDIT: Detecting AI-Fabricated Scientific Tables via Multi-View Likelihood Mismatch
AI-generated fabricated scientific manuscripts raise growing concerns with large-scale breaches of academic integrity. In this work, we present the first systematic study on detecting AI-generated ...
Shuo Huang, Yan Pen, Lizhen Qu
EvoTaxo: Building and Evolving Taxonomy from Social Media Streams
Constructing taxonomies from social media corpora is challenging because posts are short, noisy, semantically entangled, and temporally dynamic. Existing taxonomy induction methods are largely desi...
Yiyang Li, Tianyi Ma, Yanfang Ye
AIGQ: An End-to-End Hybrid Generative Architecture for E-commerce Query Recommendation
Pre-search query recommendation, widely known as HintQ on Taobao's homepage, plays a vital role in intent capture and demand discovery, yet traditional methods suffer from shallow semantics, poor c...
Jingcao Xu, Jianyun Zou, Renkai Yang, Zili Geng, Qiang Liu, Haihong Tang
Sensing Your Vocals: Exploring the Activity of Vocal Cord Muscles for Pitch Assessment Using Electromyography and Ultrasonography
Vocal training is difficult because the muscles that control pitch, resonance, and phonation are internal and invisible to learners. This paper investigates how Electromyography (EMG) and ultrasoni...
Kanyu Chen, Rebecca Panskus, Erwin Wu, Yichen Peng, Daichi Saito, Emiko Kamiyama, Ruiteng Li, Che...
Demographic-Aware Self-Supervised Anomaly Detection Pretraining for Equitable Rare Cardiac Diagnosis
Rare cardiac anomalies are difficult to detect from electrocardiograms (ECGs) due to their long-tailed distribution with extremely limited case counts and demographic disparities in diagnostic perf...
Chaoqin Huang, Zi Zeng, Aofan Jiang, Yuchen Xu, Qing Cao, Kang Chen, Chenfei Chi, Yanfeng Wang, Y...
From Token to Item: Enhancing Large Language Models for Recommendation via Item-aware Attention Mechanism
Large Language Models (LLMs) have recently gained increasing attention in the field of recommendation. Existing LLM-based methods typically represent items as token sequences, and apply attention l...
Xiaokun Zhang, Bowei He, Jiamin Chen, Ziqiang Cui, Chen Ma
DataProphet: Demystifying Supervision Data Generalization in Multimodal LLMs
Conventional wisdom for selecting supervision data for multimodal large language models (MLLMs) is to prioritize datasets that appear similar to the target benchmark, such as text-intensive or visi...
Xuan Qi, Luxi He, Dan Roth, Xingyu Fu