AI LLM March 25, 2026

MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization

Authors

Xiangsen Chen, Ruilong Wu, Yanyan Lan, Ting Ma, Yang Liu

Abstract

Despite deep learning's success in chemistry, its impact is hindered by a lack of interpretability and an inability to resolve activity cliffs, where minor structural changes trigger drastic property shifts. Current representation learning, bound by the similarity principle, often fails to capture these structure-activity discontinuities. To address this, we introduce MolEvolve, an evolutionary framework that reformulates molecular discovery as an autonomous, look-ahead planning problem. Unlike traditional methods that depend on human-engineered features or rigid prior knowledge, MolEvolve leverages a Large Language Model (LLM) to actively explore and evolve a library of executable chemical symbolic operations. By using the LLM for a cold start and a Monte Carlo Tree Search (MCTS) engine for test-time planning with external tools (e.g., RDKit), the system discovers optimal trajectories autonomously. This process evolves transparent reasoning chains that translate complex structural transformations into actionable, human-readable chemical insights. Experimental results demonstrate that MolEvolve's autonomous search not only yields these interpretable insights but also outperforms baselines in both property prediction and molecule optimization tasks.
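
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of running MCTS over a library of named, executable operations and reading the best trajectory back as a list of steps. It is a generic UCT-style search, not the authors' method. Everything in it is an assumption made for illustration: the operation names, the `toy_score` oracle, and the use of plain strings in place of molecules. No LLM or RDKit call is involved, and the strings are not checked for chemical validity.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical operation library: each named operation rewrites a toy
# "molecule" string. A real system would apply validated chemical edits.
OPERATIONS: Dict[str, Callable[[str], str]] = {
    "add_C": lambda s: s + "C",
    "add_O": lambda s: s + "O",
    "add_N": lambda s: s + "N",
}


def toy_score(mol: str) -> float:
    """Stand-in property oracle (assumption): reward 'O', penalise length."""
    return mol.count("O") - 0.1 * len(mol)


@dataclass
class Node:
    mol: str
    parent: Optional["Node"] = None
    op: Optional[str] = None  # operation that produced this node
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total: float = 0.0

    def untried(self) -> List[str]:
        tried = {child.op for child in self.children}
        return [name for name in OPERATIONS if name not in tried]


def uct(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Mean reward plus an exploration bonus for rarely visited children."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def path_to(node: Node) -> List[str]:
    """Read the operation sequence from the root to this node."""
    ops: List[str] = []
    while node.parent is not None:
        ops.append(node.op)
        node = node.parent
    return ops[::-1]


def mcts(root_mol: str, iterations: int = 200, max_depth: int = 4,
         seed: int = 0) -> Tuple[str, float, List[str]]:
    rng = random.Random(seed)
    root = Node(root_mol)
    best = (root_mol, toy_score(root_mol), [])
    for _ in range(iterations):
        node, depth = root, 0
        # Selection: descend through fully expanded nodes by UCT.
        while depth < max_depth and node.children and not node.untried():
            node = max(node.children, key=lambda ch: uct(ch, node.visits))
            depth += 1
        # Expansion: apply one untried operation.
        if depth < max_depth:
            op = rng.choice(node.untried())
            child = Node(OPERATIONS[op](node.mol), parent=node, op=op)
            node.children.append(child)
            node, depth = child, depth + 1
            score = toy_score(node.mol)
            if score > best[1]:
                best = (node.mol, score, path_to(node))
        # Rollout: random operations up to the depth limit (look-ahead).
        mol = node.mol
        for _ in range(max_depth - depth):
            mol = OPERATIONS[rng.choice(list(OPERATIONS))](mol)
        reward = toy_score(mol)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best


if __name__ == "__main__":
    mol, score, ops = mcts("CC")
    print(mol, round(score, 2), ops)
```

The returned operation list is what makes the result inspectable: replaying it from the starting string reproduces the best candidate step by step, which is a rough analogue of the human-readable trajectories the abstract describes.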

Metadata

arXiv ID: 2603.24382
Provider: ARXIV
Primary Category: cs.LG
Cross-listed Categories: cs.AI, cs.CE
Published: 2026-03-25
Fetched: 2026-03-26 06:02
