

Large Language Models and Stock Investing: Is the Human Factor Required?

Authors

Ricardo Crisostomo, Diana Mykhalyuk

Abstract

This paper investigates whether large language models (LLMs) can generate reliable stock market predictions. We evaluate four state-of-the-art models - ChatGPT, Gemini, DeepSeek, and Perplexity - across three prompting strategies: a naive query, a structured approach, and chain-of-thought reasoning. Our results show that LLM-generated recommendations are hindered by recurring reasoning failures, including financial misconceptions, carryover errors, and reliance on outdated or hallucinated information. When appropriately guided and supervised, LLMs demonstrate the capacity to outperform the market, but realizing LLMs' full potential requires substantial human oversight. We also find that grounding stock recommendations in official regulatory filings increases their forecasting accuracy. Overall, our findings underscore the need for robust safeguards and validation when deploying LLMs in financial markets.
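The three prompting strategies named in the abstract can be sketched with illustrative templates. The exact prompts used by the authors are not reproduced here; the wording below is a hypothetical example of each style for a given ticker:

```python
# Illustrative prompt templates for the three strategies the paper evaluates:
# a naive query, a structured approach, and chain-of-thought reasoning.
# The specific wording is an assumption, not the authors' actual prompts.

def build_prompts(ticker: str) -> dict[str, str]:
    """Return one example prompt per strategy for a given stock ticker."""
    naive = f"Should I buy {ticker} stock?"

    structured = (
        f"Analyze {ticker} and answer in this exact format:\n"
        "1. Recommendation: BUY / HOLD / SELL\n"
        "2. Key financial metrics considered\n"
        "3. Main risks\n"
    )

    chain_of_thought = (
        f"Think step by step about {ticker}: first summarize its recent "
        "fundamentals, then weigh risks against growth prospects, and only "
        "then state a BUY / HOLD / SELL recommendation."
    )

    return {
        "naive": naive,
        "structured": structured,
        "chain_of_thought": chain_of_thought,
    }
```

The naive query leaves all framing to the model, while the other two impose either an output format or an explicit reasoning order before the recommendation.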

Metadata

arXiv ID: 2603.19944
Provider: ARXIV
Primary Category: q-fin.TR
Secondary Category: q-fin.ST
Comment: 33 pages; 6 tables; 2 figures
Published: 2026-03-20
Fetched: 2026-03-23 16:54
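Metadata pages like this one are typically built by parsing the Atom entries returned by arXiv's export API. A minimal parsing sketch, assuming the standard Atom and arXiv XML namespaces; the embedded entry is abbreviated sample data, not a live API response:

```python
# Parse an arXiv Atom API entry into the metadata fields shown above.
# SAMPLE_ENTRY is an abbreviated, locally embedded example entry.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ARXIV = "http://arxiv.org/schemas/atom"

SAMPLE_ENTRY = f"""<entry xmlns="{ATOM}" xmlns:arxiv="{ARXIV}">
  <id>http://arxiv.org/abs/2603.19944v1</id>
  <title>Large Language Models and Stock Investing: Is the Human Factor Required?</title>
  <published>2026-03-20T13:47:13Z</published>
  <arxiv:primary_category term="q-fin.TR"/>
  <author><name>Ricardo Crisostomo</name></author>
  <author><name>Diana Mykhalyuk</name></author>
</entry>"""

def parse_entry(xml_text: str) -> dict:
    """Extract id, title, date, primary category, and authors from one entry."""
    root = ET.fromstring(xml_text)
    arxiv_id = root.findtext(f"{{{ATOM}}}id").rsplit("/", 1)[-1]
    return {
        "arxiv_id": arxiv_id,
        "title": root.findtext(f"{{{ATOM}}}title"),
        "published": root.findtext(f"{{{ATOM}}}published")[:10],
        "primary_category": root.find(f"{{{ARXIV}}}primary_category").get("term"),
        "authors": [a.findtext(f"{{{ATOM}}}name")
                    for a in root.findall(f"{{{ATOM}}}author")],
    }
```

Namespaced lookups (`{namespace}tag`) are required here because the Atom feed declares default and `arxiv:` namespaces; bare tag names would not match.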

