Paper
Sometimes nonparametrics beat parametrics, even when the model is right
Authors
Morten Byholt, Nils Lid Hjort
Abstract
A basic issue in both the teaching and the practice of statistics is the interplay between modelling assumptions and inference performance. The general message conveyed is that stronger assumptions lead to better statistical performance of the relevant estimators, tests and confidence intervals, provided these assumptions hold. Fewer assumptions, on the other hand, often lead to safer and more robust methods that remain good outside narrow conditions, but not quite as good as specialist methods that exploit such narrower conditions when these are fulfilled.

This interplay is nicely illustrated in the context of density estimation, where parametric and nonparametric methods can be contrasted. The parametric ones have mean squared errors of size $O(n^{-1})$ in terms of sample size $n$ if the parametric model is right, but are not even consistent outside the model. The nonparametric methods are everywhere consistent and have mean squared errors of size $O(n^{-4/5})$ for broad classes of estimands.

The point we make here is that this picture is not universally true: a simple kernel density estimator can perform better than a directly estimated parametric density on the latter's home turf, for small sample sizes, in the sense of mean integrated squared error. Our main example is that of estimating an unknown normal density. In developing and discussing this somewhat counter-intuitive and half-paradoxical example we touch on several tangential issues of interest pertaining to exact small-sample analysis of density estimators.
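The comparison the abstract describes can be explored with a small Monte Carlo experiment: draw small samples from a standard normal, fit both a plug-in normal density (estimated mean and standard deviation) and a Gaussian kernel density estimator, and compare their integrated squared errors against the true density. This is only a minimal sketch of the general setup, not the authors' exact analysis; the sample size, normal-reference bandwidth, and evaluation grid below are our own illustrative choices.

```python
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2), evaluated elementwise."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def ise(estimate_fn, true_pdf, grid):
    """Integrated squared error, approximated by the trapezoidal rule on a grid."""
    diff = estimate_fn(grid) - true_pdf(grid)
    return np.trapz(diff ** 2, grid)

def parametric_estimate(sample):
    """Plug-in normal density with estimated mean and standard deviation."""
    mu, sigma = sample.mean(), sample.std(ddof=1)
    return lambda x: normal_pdf(x, mu, sigma)

def kernel_estimate(sample, h):
    """Gaussian kernel density estimator with bandwidth h."""
    def f(x):
        u = (x[:, None] - sample[None, :]) / h
        return normal_pdf(u).mean(axis=1) / h
    return f

rng = np.random.default_rng(0)
n, reps = 10, 2000                      # small samples, many replications
grid = np.linspace(-5.0, 5.0, 501)
truth = lambda x: normal_pdf(x)

ise_par, ise_ker = [], []
for _ in range(reps):
    x = rng.standard_normal(n)
    h = 1.06 * x.std(ddof=1) * n ** (-1 / 5)   # normal-reference bandwidth
    ise_par.append(ise(parametric_estimate(x), truth, grid))
    ise_ker.append(ise(kernel_estimate(x, h), truth, grid))

print("MISE, parametric plug-in:", np.mean(ise_par))
print("MISE, kernel estimator:  ", np.mean(ise_ker))
```

Averaging the integrated squared errors over replications gives Monte Carlo estimates of the two mean integrated squared errors, which can then be compared across sample sizes $n$.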
Metadata
arXiv: 2603.18590v1 (math.ST)
Published: 2026-03-19
Links: https://arxiv.org/abs/2603.18590v1 (abstract), https://arxiv.org/pdf/2603.18590v1 (pdf)
Comment: 18 pages, 2 figures; Statistical Research Report, Department of Mathematics, University of Oslo, October 1996, but now arXiv'd March 2026