Paper
From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production
Authors
Yucheng Shi, Ying Li, Yu Wang, Yesu Feng, Arjun Rao, Rein Houthooft, Shradha Sehgal, Jin Wang, Hao Zhen, Ninghao Liu, Linas Baltrunas
Abstract
Large language models (LLMs) are promising backbones for generative recommender systems, yet a key challenge remains underexplored: verbalization, i.e., converting structured user interaction logs into effective natural language inputs. Existing methods rely on rigid templates that simply concatenate fields, yielding suboptimal representations for recommendation. We propose a data-centric framework that learns verbalization for LLM-based recommendation. Using reinforcement learning, a verbalization agent transforms raw interaction histories into optimized textual contexts, with recommendation accuracy as the training signal. This agent learns to filter noise, incorporate relevant metadata, and reorganize information to improve downstream predictions. Experiments on a large-scale industrial streaming dataset show that learned verbalization delivers up to 93% relative improvement in discovery item recommendation accuracy over template-based baselines. Further analysis reveals emergent strategies such as user interest summarization, noise removal, and syntax normalization, offering insights into effective context construction for LLM-based recommender systems.
Metadata
Related papers
Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini
Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongy... • 2026-03-25
Comparing Developer and LLM Biases in Code Evaluation
Aditya Mittal, Ryan Shar, Zichu Wu, Shyam Agarwal, Tongshuang Wu, Chris Donah... • 2026-03-25
The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence
Biplab Pal, Santanu Bhattacharya • 2026-03-25
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, ... • 2026-03-25
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie... • 2026-03-25
Raw Data (Debug)
{
"raw_xml": "<entry>\n <id>http://arxiv.org/abs/2602.20558v1</id>\n <title>From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production</title>\n <updated>2026-02-24T05:15:24Z</updated>\n <link href='https://arxiv.org/abs/2602.20558v1' rel='alternate' type='text/html'/>\n <link href='https://arxiv.org/pdf/2602.20558v1' rel='related' title='pdf' type='application/pdf'/>\n <summary>Large language models (LLMs) are promising backbones for generative recommender systems, yet a key challenge remains underexplored: verbalization, i.e., converting structured user interaction logs into effective natural language inputs. Existing methods rely on rigid templates that simply concatenate fields, yielding suboptimal representations for recommendation. We propose a data-centric framework that learns verbalization for LLM-based recommendation. Using reinforcement learning, a verbalization agent transforms raw interaction histories into optimized textual contexts, with recommendation accuracy as the training signal. This agent learns to filter noise, incorporate relevant metadata, and reorganize information to improve downstream predictions. Experiments on a large-scale industrial streaming dataset show that learned verbalization delivers up to 93% relative improvement in discovery item recommendation accuracy over template-based baselines. Further analysis reveals emergent strategies such as user interest summarization, noise removal, and syntax normalization, offering insights into effective context construction for LLM-based recommender systems.</summary>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.AI'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.IR'/>\n <published>2026-02-24T05:15:24Z</published>\n <arxiv:comment>Work in progress</arxiv:comment>\n <arxiv:primary_category term='cs.AI'/>\n <author>\n <name>Yucheng Shi</name>\n </author>\n <author>\n <name>Ying Li</name>\n </author>\n <author>\n <name>Yu Wang</name>\n </author>\n <author>\n <name>Yesu Feng</name>\n </author>\n <author>\n <name>Arjun Rao</name>\n </author>\n <author>\n <name>Rein Houthooft</name>\n </author>\n <author>\n <name>Shradha Sehgal</name>\n </author>\n <author>\n <name>Jin Wang</name>\n </author>\n <author>\n <name>Hao Zhen</name>\n </author>\n <author>\n <name>Ninghao Liu</name>\n </author>\n <author>\n <name>Linas Baltrunas</name>\n </author>\n </entry>"
}