TESTING March 08, 2026

Machine Learning for Electrode Materials: Property Prediction via Composition

Authors

Hao Wu, Cameron Hargreaves, Arpit Mishra, Gian-Marco Rignanese

Abstract

In this work, we benchmark three leading machine learning (ML) frameworks (MODNet, CrabNet, and a random forest model based on Magpie features) for predicting properties of battery electrode materials using the Materials Project Battery Explorer dataset. We evaluate these models based on predictive accuracy, visualize numerical features using two-dimensional embeddings, and quantify performance using standard metrics. Our results demonstrate that CrabNet consistently outperforms the other models across all tests. To validate these findings, we employ robust statistical methods: bootstrap resampling and two cross-validation (CV) strategies (leave-one-cluster-out and stratified 5-fold CV), comparing each model against a control baseline. In addition, we apply unsupervised clustering to MODNet-derived features using t-SNE and DBSCAN, revealing coherent material groupings without prior labels. This analysis confirms the robustness of the evaluated models and underscores the potential of ML-driven approaches for accelerating the discovery of electrode materials. However, our study also identifies practical limitations and quantifies challenges associated with integrating ML models into materials science workflows. Despite these constraints, our findings suggest that ML models are highly effective for early-stage compositional screening in the battery industry. This work provides a foundation for future research on ML applications in materials discovery.
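
The validation scheme named in the abstract (leave-one-cluster-out CV, a control baseline, and bootstrap resampling) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which bootstrap variant is used, so a paired percentile bootstrap is assumed here. The function names `leave_one_cluster_out` and `bootstrap_mae_difference`, the cluster labels, the "model" predictions, and all numbers are hypothetical toy values, not results from the paper. Only the Python standard library is used.

```python
import random
from statistics import mean


def leave_one_cluster_out(cluster_labels):
    """Yield (held_out_cluster, train_idx, test_idx), holding out each cluster once."""
    for held_out in sorted(set(cluster_labels)):
        train_idx = [i for i, c in enumerate(cluster_labels) if c != held_out]
        test_idx = [i for i, c in enumerate(cluster_labels) if c == held_out]
        yield held_out, train_idx, test_idx


def bootstrap_mae_difference(y_true, pred_model, pred_baseline,
                             n_resamples=2000, alpha=0.05, seed=0):
    """Paired percentile bootstrap for MAE(baseline) - MAE(model).

    Returns (mean difference, lower bound, upper bound). A positive
    difference means the model has a lower error than the baseline.
    """
    rng = random.Random(seed)
    n = len(y_true)
    err_model = [abs(t - p) for t, p in zip(y_true, pred_model)]
    err_base = [abs(t - p) for t, p in zip(y_true, pred_baseline)]
    diffs = []
    for _ in range(n_resamples):
        # Resample sample indices with replacement; both error lists share
        # the same indices so the comparison stays paired.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(mean(err_base[i] for i in idx) - mean(err_model[i] for i in idx))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lower, upper


# Toy data (hypothetical numbers, not taken from the paper).
y_true = [3.1, 3.4, 2.9, 3.8, 4.1, 3.9, 1.2, 1.5, 1.1]
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2]          # assumed to come from a prior clustering step
pred_model = [3.0, 3.5, 3.0, 3.7, 4.0, 4.0, 1.3, 1.4, 1.2]  # stand-in for a trained model

# Control baseline (an assumption): predict the mean target of the
# training clusters for every sample in the held-out cluster.
pred_baseline = [0.0] * len(y_true)
for _, train_idx, test_idx in leave_one_cluster_out(clusters):
    train_mean = mean(y_true[i] for i in train_idx)
    for i in test_idx:
        pred_baseline[i] = train_mean

point, lower, upper = bootstrap_mae_difference(y_true, pred_model, pred_baseline)
print(f"MAE improvement over baseline: {point:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

If the lower bound of the interval is above zero, the resampled data support the claim that the model beats the baseline. In a real benchmark the stand-in predictions would be replaced by out-of-fold predictions from each model under the same splits.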

Metadata

arXiv ID: 2603.07805

Provider: ARXIV

Primary Category: cond-mat.mtrl-sci

Published: 2026-03-08

Fetched: 2026-03-10 05:43

Comment: 28 pages, 12 figures
