Personal Assistant Web

AI LLM

LiveCultureBench: a Multi-Agent, Multi-Cultural Benchmark for Large Language Models in Dynamic Social Simulations

Large language models (LLMs) are increasingly deployed as autonomous agents, yet evaluations focus primarily on task success rather than cultural appropriateness or evaluator reliability. We introd...

Viet-Thanh Pham, Lizhen Qu, Thuy-Trang Vu, Gholamreza Haffari, Dinh Phung

2603.01952 • 2026-03-02

TESTING

PreSight: Preoperative Outcome Prediction for Parkinson's Disease via Region-Prior Morphometry and Patient-Specific Weighting

Preoperative improvement rate prediction for Parkinson's disease surgery is clinically important yet difficult because imaging signals are subtle and patients are heterogeneous. We address this set...

Yand Wang, Chen Zhang, Lanyun Zhu, Yixin Chen, Qunbo Wang, Yutong Bai, Jurgen Germann, Yinghong W...

2603.01948 • 2026-03-02

TESTING

When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation

Topic models uncover latent thematic structures in text corpora, yet evaluating their quality remains challenging, particularly in specialized domains. Existing methods often rely on automated metr...

Thibault Prouteau, Francis Lareau, Nicolas Dugué, Jean-Charles Lamirel, Christophe Malaterre

2603.01945 • 2026-03-02

TESTING

A Simulation Study to Compare Inferential Properties when Modelling Ordinal Outcomes: The Case for the (Plain but Robust) Proportional Odds Model

Ordinal measurements are common outcomes in studies within psychology, as well as in the social and behavioral sciences. Choosing an appropriate regression model for analysing such data poses a dif...

Stefan Inerle, Markus Pauly, Moritz Berger

2603.01943 • 2026-03-02

AI LLM

Ignore All Previous Instructions: Jailbreaking as a de-escalatory peace building practise to resist LLM social media bots

Large Language Models have intensified the scale and strategic manipulation of political discourse on social media, leading to conflict escalation. The existing literature largely focuses on platfo...

Huw Day, Adrianna Jezierska, Jessica Woodgate

2603.01942 • 2026-03-02

TESTING

CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification

Developing multi-turn interactive tool-use agents is challenging because real-world user needs are often complex and ambiguous, yet agents must execute deterministic actions to satisfy them. To add...

Jinpeng Chen, Cheng Gong, Hanbo Li, Ziru Liu, Zichen Tian, Xinyu Fu, Shi Wu, Chenyang Zhang, Wu Z...

2603.01940 • 2026-03-02

AI LLM

Dream2Learn: Structured Generative Dreaming for Continual Learning

Continual learning requires balancing plasticity and stability while mitigating catastrophic forgetting. Inspired by human dreaming as a mechanism for internal simulation and knowledge restructurin...

Salvatore Calcagno, Matteo Pennisi, Federica Proietto Salanitri, Amelia Sorrenti, Simone Palazzo,...

2603.01935 • 2026-03-02

AI LLM

Real Money, Fake Models: Deceptive Model Claims in Shadow APIs

Access to frontier large language models (LLMs), such as GPT-5 and Gemini-2.5, is often hindered by high pricing, payment barriers, and regional restrictions. These limitations drive the proliferat...

Yage Zhang, Yukun Jiang, Zeyuan Chen, Michael Backes, Xinyue Shen, Yang Zhang

2603.01919 • 2026-03-02

AI LLM

Fast Entropy Decoding for Sparse MVM on GPUs

We present a novel, practical approach to speed up sparse matrix-vector multiplication (SpMVM) on GPUs. The novel key idea is to apply lossless entropy coding to further compress the sparse matrix ...

Emil Schätzle, Tommaso Pegolotti, Markus Püschel

2603.01915 • 2026-03-02

TESTING

AdaPonderLM: Gated Pondering Language Models with Token-Wise Adaptive Depth

Test-time scaling via recurrent/iterative Transformers enables large language models to spend more computation at inference, but most pretrained recurrent LMs run a fixed number of iterations, wast...

Shixiang Song, He Li, Zitong Wang, Boyi Zeng, Feichen Song, Yixuan Wang, Zhiqin John Xu, Ziwei He...

2603.01914 • 2026-03-02

AI LLM

Demonstrating ViviDoc: Generating Interactive Documents through Human-Agent Collaboration

Interactive articles help readers engage with complex ideas through exploration, yet creating them remains costly, requiring both domain expertise and web development skills. Recent LLM-based agent...

Yinghao Tang, Yupeng Xie, Yingchaojie Feng, Tingfeng Lan, Wei Chen

2603.01912 • 2026-03-02

AI LLM

FLANS at SemEval-2026 Task 7: RAG with Open-Sourced Smaller LLMs for Everyday Knowledge Across Diverse Languages and Cultures

This system paper describes our participation in the SemEval-2025 Task-7 "Everyday Knowledge Across Diverse Languages and Cultures". We attended two subtasks, i.e., Track 1: Short Answer Question...

Liliia Bogdanova, Shiran Sun, Lifeng Han, Natalia Amat Lefort, Flor Miriam Plaza-del-Arco

2603.01910 • 2026-03-02

TESTING

Exploring $\widetilde{R}_2$ Leptoquarks and Majorana Neutrinos via same-sign dimuons at the HL-LHC

We study the phenomenology of scalar leptoquark (sLQ) $\widetilde{R}_2$ coupled to right-handed neutrinos (RHNs) at the High-Luminosity Large Hadron Collider (HL-LHC), focusing on signatures that d...

Subham Saha, Arvind Bhaskar, Manimala Mitra

2603.01903 • 2026-03-02

TESTING

Dynamic Connectivity and Local Frequency Strength under Stochastic Variations

This paper introduces a novel metric, termed the Generalized Fiedler Vector (GFV), to evaluate the dynamic connectivity in power systems. The proposed metric leverages the network connecti...

Bruno Pinheiro, Daniel Dotta

2603.01902 • 2026-03-02

TESTING

B-fields And dust in interstelLar fiLAments using Dust POLarization (BALLAD-POL): VI. Grain alignment mechanisms in the massive quiescent filament G16.96+0.27 using dust polarization observations from JCMT/POL-2

Dust polarization induced by aligned non-spherical grains acts as an important tool to trace the magnetic field (B-field) morphologies and strengths in molecular clouds and constrain grain properti...

Saikhom Pravash, Thiem Hoang, Archana Soam, Qi-Lao Gu, Tie Liu, Pham Ngoc Diep, Le Ngoc Tram, Ngu...

2603.01899 • 2026-03-02

AI LLM

Agentic Code Reasoning

Can LLM agents explore codebases and reason about code semantics without executing the code? We study this capability, which we call agentic code reasoning, and introduce semi-formal reasoning: a s...

Shubham Ugare, Satish Chandra

2603.01896 • 2026-03-02

AI LLM

VietSuperSpeech: A Large-Scale Vietnamese Conversational Speech Dataset for ASR Fine-Tuning in Chatbot, Customer Support, and Call Center Applications

We introduce VietSuperSpeech, a large-scale Vietnamese automatic speech recognition (ASR) dataset of 52,023 audio-text pairs totaling 267.39 hours, with a distinctive focus on casual conversational...

Loan Do, Thanh Ngoc Nguyen, Thanh Pham, Vinh Do, Hien Nguyen, Charlotte Nguyen

2603.01894 • 2026-03-02

TESTING

Generative Visual Chain-of-Thought for Image Editing

Existing image editing methods struggle to perceive where to edit, especially under complex scenes and nuanced spatial instructions. To address this issue, we propose Generative Visual Chain-of-Tho...

Zijin Yin, Tiankai Hang, Yiji Cheng, Shiyi Zhang, Runze He, Yu Xu, Chunyu Wang, Bing Li, Zheng Ch...

2603.01893 • 2026-03-02

TESTING

Asymptotic Analysis of Shallow Water Moment Equations

The Shallow Water Moment Equations (SWME) are an extension of the Shallow Water Equations (SWE) for improved modelling of free-surface flows. In contrast to the SWE, the SWME incorporate vertical v...

Mieke Daemen, Julio Careaga, Zhenning Cai, Julian Koellermeier

2603.01886 • 2026-03-02

TESTING

Growth factor in teleparallel Gauss-Bonnet gravity

Teleparallel gravity offers a competing geometric framework on which to build cosmological models. The Gauss-Bonnet invariant captures key aspects of the underlying geometry that has been shown to ...

Shivam Kumar Mishra, Jackson Levi Said, B. Mishra

2603.01884 • 2026-03-02
