Paper
Machine Learning for Electrode Materials: Property Prediction via Composition
Authors
Hao Wu, Cameron Hargreaves, Arpit Mishra, Gian-Marco Rignanese
Abstract
In this work, we benchmark three leading machine learning (ML) frameworks (MODNet, CrabNet, and a random forest model based on Magpie features) for predicting properties of battery electrode materials using the Materials Project Battery Explorer dataset. We evaluate these models based on predictive accuracy, visualize numerical features using two-dimensional embeddings, and quantify performance using standard metrics. Our results demonstrate that CrabNet consistently outperforms the other models across all tests. To validate these findings, we employ robust statistical methods: bootstrap resampling and two cross-validation (CV) strategies (leave-one-cluster-out and stratified 5-fold CV), comparing each model against a control baseline. In addition, we apply unsupervised clustering to MODNet-derived features using t-SNE and DBSCAN, revealing coherent material groupings without prior labels. This analysis confirms the robustness of the evaluated models and underscores the potential of ML-driven approaches for accelerating electrode materials discovery. However, our study also identifies practical limitations and quantifies challenges associated with integrating ML models into materials science workflows. Despite these constraints, our findings suggest that ML models are highly effective for early-stage compositional screening in the battery industry. This work provides a foundation for future research on ML applications in materials discovery.
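The two validation ideas named in the abstract, bootstrap resampling and leave-one-cluster-out CV, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names (`mae`, `bootstrap_mae_ci`, `leave_one_cluster_out`), the percentile-interval choice, the use of mean absolute error as the metric, and all parameter defaults are assumptions made for illustration only.

```python
import random
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def bootstrap_mae_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the MAE (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample prediction/target pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(mae([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def leave_one_cluster_out(cluster_labels):
    """Yield (held_out_cluster, train_indices, test_indices) per cluster."""
    for held_out in sorted(set(cluster_labels)):
        test = [i for i, c in enumerate(cluster_labels) if c == held_out]
        train = [i for i, c in enumerate(cluster_labels) if c != held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Toy values, invented purely to exercise the functions.
    y_true = [3.1, 2.8, 4.0, 3.6, 2.5, 3.9]
    y_pred = [3.0, 3.0, 3.7, 3.5, 2.9, 4.1]
    clusters = [0, 0, 1, 1, 2, 2]  # e.g. labels from a clustering step

    print("MAE:", mae(y_true, y_pred))
    print("95% bootstrap interval:", bootstrap_mae_ci(y_true, y_pred))
    for held_out, train, test in leave_one_cluster_out(clusters):
        print("held-out cluster", held_out, "train", train, "test", test)
```

Holding out one whole cluster at a time checks how a model extrapolates to groups of materials it has never seen, whereas a stratified k-fold split mixes similar materials across training and test folds; the bootstrap interval then gives a sense of how much a reported error could vary with the sample.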
Metadata
arXiv: 2603.07805v1 (https://arxiv.org/abs/2603.07805v1)
PDF: https://arxiv.org/pdf/2603.07805v1
Published: 2026-03-08
Primary category: cond-mat.mtrl-sci
Comment: 28 pages, 12 figures