Is Your AI Radiomics Tool Based on Insufficient Data?
Doctors in India must carefully evaluate new diagnostic tools before clinical adoption. A recent major study highlighted significant radiomics sample size deficits in machine learning models published in high-impact journals. The findings suggest that many externally validated models lack the statistical power required for reliable performance: nearly 90% of the analyzed models failed to meet minimum training sample size benchmarks. This systemic issue often produces overfitting, which causes tools that look accurate in development to fail in real-world settings.
Addressing Radiomics Sample Size Deficits
Researchers investigated how model complexity relates to the quantity of available training data, applying conservative sample size benchmarks to externally validated radiomics models. The results showed a median deficit of roughly 200 training instances per study, and only a small fraction of the publications met basic heuristics for stable model development. Most models used far too many features relative to the number of patient events recorded, so the reported accuracy of these tools may be misleading.
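The exact benchmarks the study applied are not reproduced here, but a minimal sketch of a generic events-per-variable (EPV) check conveys the idea: the number of outcome events is compared against the number of candidate features, with 10 events per feature a commonly cited rule of thumb. All study counts below are hypothetical.

```python
# A minimal, illustrative events-per-variable (EPV) check. The
# 10-events-per-feature threshold is a widely cited rule of thumb,
# NOT the specific benchmark used in the study discussed above;
# the study counts below are hypothetical.

def events_per_variable(n_events: int, n_features: int) -> float:
    """Events in the smaller outcome class per candidate feature."""
    return n_events / n_features

n_events = 60      # hypothetical: patients with the outcome of interest
n_features = 100   # hypothetical: candidate radiomic features screened

epv = events_per_variable(n_events, n_features)
required = 10 * n_features           # events implied by the 10-EPV heuristic
print(f"EPV = {epv:.2f}")            # 0.60, far below the heuristic
print(f"Event deficit = {required - n_events}")  # 940 more events needed
```

A study in this hypothetical position would need an order of magnitude more outcome events, or a drastically reduced feature set, before the heuristic is satisfied; formal criteria such as those of Riley et al. (see References) give more precise targets.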
Clinical reproducibility remains a major challenge for modern radiology. An undertrained model cannot generalize across diverse patient populations in India, and insufficient data leaves it prone to fitting statistical noise. Radiologists should therefore demand rigorous sample size justifications in all AI-related research; this scrutiny ensures that clinical decisions rest on robust evidence rather than algorithmic artifacts.
Frequently Asked Questions
Q1: Why do radiomics sample size deficits occur in high-impact journals?
Many studies prioritize algorithmic complexity over data volume. Additionally, acquiring large, high-quality medical datasets is often difficult and expensive for researchers.
Q2: How does an insufficient sample size affect clinical outcomes?
Insufficient data leads to model instability and overfitting. Consequently, a tool might show high accuracy on its own development data but produce incorrect predictions when applied to new patients.
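To make this concrete, the short sketch below (using scikit-learn, with synthetic noise standing in for radiomic features) shows how a model with far more features than patients can achieve near-perfect apparent accuracy even when no real signal exists, while cross-validated accuracy stays near chance.

```python
# A minimal sketch of overfitting: many features, few patients, and
# data that are PURE NOISE, so any "signal" the model learns is an
# artifact. Requires numpy and scikit-learn; all numbers are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_patients, n_features = 40, 200               # far more features than patients
X = rng.normal(size=(n_patients, n_features))  # random "radiomic" features
y = rng.integers(0, 2, size=n_patients)        # random binary outcome

model = LogisticRegression(max_iter=1000)
model.fit(X, y)
print("Apparent (training) accuracy:", model.score(X, y))   # close to 1.0
print("Cross-validated accuracy:",
      cross_val_score(model, X, y, cv=5).mean())             # close to 0.5
```

The gap between the two printed numbers is exactly the failure mode described above: impressive apparent performance that evaporates on unseen patients.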
References
- Kocak B et al. Externally validated yet undertrained: sample size deficits in machine learning-based radiomics. Eur Radiol. 2026 Apr 23. doi: 10.1007/s00330-026-12543-2. PMID: 42020624.
- Riley RD et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441.
- Tsegaye B et al. Larger sample sizes are needed when developing a clinical prediction model using machine learning in oncology: methodological systematic review. J Clin Epidemiol. 2025.
