Research

Paper

TESTING February 26, 2026

Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance

Authors

Weida Liang, Yiyou Sun, Shuyuan Nan, Chuang Li, Dawn Song, Kenji Kawaguchi

Abstract

Example-based guidance is widely used to improve mathematical reasoning at inference time, yet its effectiveness is highly unstable across problems and models-even when the guidance is correct and problem-relevant. We show that this instability arises from a previously underexplored gap between strategy usage-whether a reasoning strategy appears in successful solutions-and strategy executability-whether the strategy remains effective when instantiated as guidance for a target model. Through a controlled analysis of paired human-written and model-generated solutions, we identify a systematic dissociation between usage and executability: human- and model-derived strategies differ in structured, domain-dependent ways, leading to complementary strengths and consistent source-dependent reversals under guidance. Building on this diagnosis, we propose Selective Strategy Retrieval (SSR), a test-time framework that explicitly models executability by selectively retrieving and combining strategies using empirical, multi-route, source-aware signals. Across multiple mathematical reasoning benchmarks, SSR yields reliable and consistent improvements over direct solving, in-context learning, and single-source guidance, improving accuracy by up to $+13$ points on AIME25 and $+5$ points on Apex for compact reasoning models. Code and benchmark are publicly available at: https://github.com/lwd17/strategy-execute-pipeline.

Metadata

arXiv ID: 2602.22583
Provider: ARXIV
Primary Category: cs.AI
Published: 2026-02-26
Fetched: 2026-02-27 04:35

Related papers

Raw Data (Debug)
{
  "raw_xml": "<entry>\n    <id>http://arxiv.org/abs/2602.22583v1</id>\n    <title>Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance</title>\n    <updated>2026-02-26T03:34:23Z</updated>\n    <link href='https://arxiv.org/abs/2602.22583v1' rel='alternate' type='text/html'/>\n    <link href='https://arxiv.org/pdf/2602.22583v1' rel='related' title='pdf' type='application/pdf'/>\n    <summary>Example-based guidance is widely used to improve mathematical reasoning at inference time, yet its effectiveness is highly unstable across problems and models-even when the guidance is correct and problem-relevant. We show that this instability arises from a previously underexplored gap between strategy usage-whether a reasoning strategy appears in successful solutions-and strategy executability-whether the strategy remains effective when instantiated as guidance for a target model. Through a controlled analysis of paired human-written and model-generated solutions, we identify a systematic dissociation between usage and executability: human- and model-derived strategies differ in structured, domain-dependent ways, leading to complementary strengths and consistent source-dependent reversals under guidance. Building on this diagnosis, we propose Selective Strategy Retrieval (SSR), a test-time framework that explicitly models executability by selectively retrieving and combining strategies using empirical, multi-route, source-aware signals. Across multiple mathematical reasoning benchmarks, SSR yields reliable and consistent improvements over direct solving, in-context learning, and single-source guidance, improving accuracy by up to $+13$ points on AIME25 and $+5$ points on Apex for compact reasoning models. Code and benchmark are publicly available at: https://github.com/lwd17/strategy-execute-pipeline.</summary>\n    <category scheme='http://arxiv.org/schemas/atom' term='cs.AI'/>\n    <category scheme='http://arxiv.org/schemas/atom' term='cs.CL'/>\n    <published>2026-02-26T03:34:23Z</published>\n    <arxiv:primary_category term='cs.AI'/>\n    <author>\n      <name>Weida Liang</name>\n    </author>\n    <author>\n      <name>Yiyou Sun</name>\n    </author>\n    <author>\n      <name>Shuyuan Nan</name>\n    </author>\n    <author>\n      <name>Chuang Li</name>\n    </author>\n    <author>\n      <name>Dawn Song</name>\n    </author>\n    <author>\n      <name>Kenji Kawaguchi</name>\n    </author>\n  </entry>"
}