Paper
Minimax optimal adaptive structured transfer learning through semi-parametric domain-varying coefficient model
Authors
Hanxiao Chen, Debarghya Mukherjee
Abstract
Transfer learning aims to improve inference in a target domain by leveraging information from related source domains, but its effectiveness critically depends on how cross-domain heterogeneity is modeled and controlled. When the conditional mechanism linking covariates and responses varies across domains, indiscriminate information pooling can lead to negative transfer, degrading performance relative to target-only estimation. We study a multi-source, single-target transfer learning problem under conditional distributional drift and propose a semiparametric domain-varying coefficient model (DVCM), in which domain-relatedness is encoded through an observable domain identifier. This framework generalizes classical varying-coefficient models to structured transfer learning and interpolates between invariant and fully heterogeneous regimes. Building on this model, we develop an adaptive transfer learning estimator that selectively borrows strength from informative source domains while provably safeguarding against negative transfer. Our estimator is computationally efficient and easy to implement; we also show that it is minimax rate-optimal and derive its asymptotic distribution, enabling valid uncertainty quantification and hypothesis testing despite data-adaptive pooling and shrinkage. Our results precisely characterize the interplay among domain heterogeneity, the smoothness of the underlying mean function, and the number of source domains and are corroborated by comprehensive numerical experiments and two real-data applications.
Metadata
Related papers
Fractal universe and quantum gravity made simple
Fabio Briscese, Gianluca Calcagni • 2026-03-25
POLY-SIM: Polyglot Speaker Identification with Missing Modality Grand Challenge 2026 Evaluation Plan
Marta Moscati, Muhammad Saad Saeed, Marina Zanoni, Mubashir Noman, Rohan Kuma... • 2026-03-25
LensWalk: Agentic Video Understanding by Planning How You See in Videos
Keliang Li, Yansong Li, Hongze Shen, Mengdi Liu, Hong Chang, Shiguang Shan • 2026-03-25
Orientation Reconstruction of Proteins using Coulomb Explosions
Tomas André, Alfredo Bellisario, Nicusor Timneanu, Carl Caleman • 2026-03-25
The role of spatial context and multitask learning in the detection of organic and conventional farming systems based on Sentinel-2 time series
Jan Hemmerling, Marcel Schwieder, Philippe Rufin, Leon-Friedrich Thomas, Mire... • 2026-03-25
Raw Data (Debug)
{
"raw_xml": "<entry>\n <id>http://arxiv.org/abs/2602.17967v1</id>\n <title>Minimax optimal adaptive structured transfer learning through semi-parametric domain-varying coefficient model</title>\n <updated>2026-02-20T03:53:06Z</updated>\n <link href='https://arxiv.org/abs/2602.17967v1' rel='alternate' type='text/html'/>\n <link href='https://arxiv.org/pdf/2602.17967v1' rel='related' title='pdf' type='application/pdf'/>\n <summary>Transfer learning aims to improve inference in a target domain by leveraging information from related source domains, but its effectiveness critically depends on how cross-domain heterogeneity is modeled and controlled. When the conditional mechanism linking covariates and responses varies across domains, indiscriminate information pooling can lead to negative transfer, degrading performance relative to target-only estimation. We study a multi-source, single-target transfer learning problem under conditional distributional drift and propose a semiparametric domain-varying coefficient model (DVCM), in which domain-relatedness is encoded through an observable domain identifier. This framework generalizes classical varying-coefficient models to structured transfer learning and interpolates between invariant and fully heterogeneous regimes. Building on this model, we develop an adaptive transfer learning estimator that selectively borrows strength from informative source domains while provably safeguarding against negative transfer. Our estimator is computationally efficient and easy to implement; we also show that it is minimax rate-optimal and derive its asymptotic distribution, enabling valid uncertainty quantification and hypothesis testing despite data-adaptive pooling and shrinkage. Our results precisely characterize the interplay among domain heterogeneity, the smoothness of the underlying mean function, and the number of source domains and are corroborated by comprehensive numerical experiments and two real-data applications.</summary>\n <category scheme='http://arxiv.org/schemas/atom' term='math.ST'/>\n <category scheme='http://arxiv.org/schemas/atom' term='stat.ME'/>\n <category scheme='http://arxiv.org/schemas/atom' term='stat.ML'/>\n <published>2026-02-20T03:53:06Z</published>\n <arxiv:comment>86 pages, 8 figures</arxiv:comment>\n <arxiv:primary_category term='math.ST'/>\n <author>\n <name>Hanxiao Chen</name>\n </author>\n <author>\n <name>Debarghya Mukherjee</name>\n </author>\n </entry>"
}