March 24, 2026

A Clinically Anchored Radiomics Dictionary for Explainable TI-RADS-Based Thyroid Nodule Classification in Ultrasound; Dictionary Version TU1.0

Authors

Mohammad Salmanpour, Shahram Taeb, Ali Fathi Jouzdani, Mohammad Ayazi, Siavash Hosseinpour Saffarian, Mehdi Maghsudi, Ilker Hacihaliloglu, Arman Rahmim

Abstract

Artificial intelligence-based radiomics models for thyroid ultrasound (US) often achieve strong diagnostic performance but remain difficult to interpret, which limits clinical trust and adoption. We developed and validated an interpretable radiomic feature (RF) framework for thyroid nodule classification that links quantitative US features to the Thyroid Imaging Reporting and Data System (TI-RADS) semantic lexicon through a clinically grounded radiomics dictionary. The dictionary mapped TI-RADS categories, including composition, echogenicity, shape, margin, and echogenic foci, to Image Biomarker Standardization Initiative (IBSI)-compliant RFs extracted from two-dimensional US images. Relationships were defined through expert consensus and examined using Shapley Additive Explanations (SHAP). Three multicenter datasets were combined, yielding 5,542 nodules. A total of 107 RFs were extracted using PyRadiomics and normalized with min-max scaling. For benign versus malignant classification, 27 feature selection methods were paired with 25 classifiers and evaluated using stratified five-fold cross-validation on 70% of the data, followed by testing on the remaining 30%. Robust model selection used a stability-aware composite score combining mean performance and variability across balanced accuracy, precision, recall, F1-score, and ROC-AUC. The proposed dictionary enabled direct interpretation of radiomic signatures in TI-RADS terms. The best model, Select-From-Model feature selection based on logistic regression paired with an Extra-Trees classifier, achieved a test ROC-AUC of 0.941 ± 0.005. SHAP analysis showed that texture heterogeneity was the dominant malignancy signal, with gray level run length matrix non-uniformity, intensity dispersion, and kurtosis aligning with high-risk TI-RADS descriptors. These findings support transparent and clinically meaningful thyroid nodule risk stratification from US.
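
The evaluation protocol described in the abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation. The data are synthetic stand-ins for the 107 RFs, the hyperparameters are arbitrary, and `composite_score` (mean minus one standard deviation, averaged over the five metrics) is a hypothetical form of a stability-aware score, since the abstract does not give the exact formula.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for 107 radiomic features (ASSUMPTION: not the study data).
X, y = make_classification(
    n_samples=600, n_features=107, n_informative=15, random_state=0
)

# Stratified 70/30 split: cross-validate on 70%, hold out 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Min-max scaling, logistic-regression-based SelectFromModel, Extra-Trees classifier.
# Hyperparameters here are illustrative assumptions.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("select", SelectFromModel(LogisticRegression(max_iter=1000))),
    ("clf", ExtraTreesClassifier(n_estimators=100, random_state=0)),
])

# The five metrics named in the abstract, as scikit-learn scorer names.
METRICS = ["balanced_accuracy", "precision", "recall", "f1", "roc_auc"]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_results = cross_validate(pipeline, X_train, y_train, cv=cv, scoring=METRICS)


def composite_score(results, metrics, penalty=1.0):
    """Hypothetical stability-aware score: per-metric (mean - penalty * std)
    across folds, averaged over metrics. The paper's formula may differ."""
    per_metric = [
        results[f"test_{m}"].mean() - penalty * results[f"test_{m}"].std()
        for m in metrics
    ]
    return float(np.mean(per_metric))


score = composite_score(cv_results, METRICS)

# Refit on the full 70% training portion and evaluate once on the held-out 30%.
pipeline.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])
n_selected = int(pipeline.named_steps["select"].get_support().sum())

print(f"composite CV score: {score:.3f}")
print(f"held-out ROC-AUC:   {test_auc:.3f}")
print(f"features kept:      {n_selected} of {X.shape[1]}")
```

Placing the scaler and the selector inside the `Pipeline` means both are refit on each training fold, so no information from validation folds or the held-out 30% leaks into preprocessing. In a full search, the same scoring would be repeated for each feature-selector and classifier pairing, and the pairing with the highest composite score would be chosen.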

Metadata

arXiv ID: 2603.22692
Provider: ARXIV
Primary Category: physics.med-ph
Published: 2026-03-24
Fetched: 2026-03-25 06:02
