How AI Software Variations Impact Lung Nodule Management

Artificial intelligence now plays a central role in routine medical imaging, and lung nodule detection increasingly relies on commercial software platforms. A recent post-market evaluation (Jöbstl et al., Eur Radiol 2026) found substantial variation among these AI tools, and those discrepancies can change guideline-based patient management.

Evaluating Different Lung Nodule Detection Systems

The research team evaluated three commercial AI tools on 740 routine CT and PET-CT studies. The tools differed markedly in their total number of detections: the first flagged 1,336 nodules, the second 1,060, and the third 1,536. Clinicians should therefore be cautious about relying on a single AI platform.
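To put those totals in perspective, the short Python sketch below works out the average number of detections per study and the gap between the highest and lowest counts. The counts and the study total come from the figures above. The labels "Tool 1" to "Tool 3" and the function names are placeholders chosen for illustration, not names used in the study.

```python
# Figures quoted in the article: 740 studies, three detection totals.
STUDIES = 740
detections = {"Tool 1": 1336, "Tool 2": 1060, "Tool 3": 1536}


def per_study_rate(count: int, studies: int = STUDIES) -> float:
    """Average number of flagged nodules per imaging study."""
    return count / studies


def relative_excess(high: int, low: int) -> float:
    """How many percent more detections `high` is than `low`."""
    return (high - low) / low * 100


rates = {name: per_study_rate(n) for name, n in detections.items()}
spread = relative_excess(max(detections.values()), min(detections.values()))

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} detections per study")
print(f"Highest vs lowest total: {spread:.0f}% more detections")
```

These are averages over raw totals only. They do not show whether the tools flagged the same nodules, or how many detections were true positives.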

Impact on Fleischner and BTS Guidelines

Differences in nodule measurement can lead to conflicting clinical recommendations. Both the Fleischner Society and British Thoracic Society (BTS) guidelines tie follow-up to precise diameter and volume measurements. Because the three tools measured the same nodules differently, they suggested different follow-up intervals. Patients could therefore receive different advice depending on which software their hospital uses.
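A small example shows how this can happen. The Python sketch below sorts a diameter into the three size bands that the Fleischner Society 2017 guideline uses for a single solid nodule (under 6 mm, 6 to 8 mm, over 8 mm), each of which carries a different follow-up recommendation. It is a deliberately simplified illustration: it ignores nodule type, multiplicity, volume, and patient risk, all of which the full guideline considers. The three diameters are hypothetical values, not measurements from the study, and the function name is invented for this example.

```python
def fleischner_size_category(diameter_mm: float) -> str:
    """Simplified size band for a single solid nodule (Fleischner 2017).

    Assumption: only diameter is considered; the real guideline also
    weighs nodule type, number of nodules, and patient risk.
    """
    if diameter_mm < 6:
        return "<6 mm"
    if diameter_mm <= 8:
        return "6-8 mm"
    return ">8 mm"


# Hypothetical diameters reported by three tools for the same nodule.
measurements_mm = {"Tool A": 5.6, "Tool B": 6.3, "Tool C": 8.4}

categories = {
    tool: fleischner_size_category(d) for tool, d in measurements_mm.items()
}
for tool, category in categories.items():
    print(f"{tool}: {measurements_mm[tool]} mm -> {category}")

# True when the tools place the same nodule in different size bands.
disagree = len(set(categories.values())) > 1
print("Tools disagree on size band:", disagree)
```

In this made-up case a spread of under 3 mm puts the same nodule into three different bands, which is the kind of disagreement that can translate into different recommended follow-up.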

Clinical Implications for Radiologists

The study highlights the need for standardized software validation in clinical environments. Radiologists must remain the final decision-makers rather than accepting AI outputs uncritically. When different platforms yield different results for the same scans, clinical oversight becomes even more important. Clinicians should also weigh patient-specific risk factors when deciding on follow-up.

Frequently Asked Questions

Q1: Why do commercial AI tools show different results in lung nodule detection?

Each software tool uses its own algorithms, training datasets, and segmentation thresholds. These differences produce variation in both the number of nodules detected and their measured size.

Q2: How do these variations affect guideline-based management?

The Fleischner Society and BTS guidelines set follow-up frequency largely by nodule size. Measurement discrepancies among tools can therefore lead to different recommended monitoring intervals for the same patient.

Q3: Should clinicians rely entirely on AI for managing lung nodules?

No. Radiologists should always review AI findings and combine them with patient-specific risk factors before making the final clinical decision.

References

Jöbstl A et al. Lung nodule detection and potential impact on guideline-based management: a retrospective post-market evaluation of three commercial software systems. Eur Radiol. 2026 Jun 24. doi: 10.1007/s00330-026-12702-5. PMID: 42343062.
MacMahon H et al. Guidelines for Management of Incidental Pulmonary Nodules Detected on CT Images: From the Fleischner Society 2017. Radiology. 2017 Jul;284(1):228-243. doi: 10.1148/radiol.2017162894.
Callister MEJ et al. British Thoracic Society guidelines for the investigation and management of pulmonary nodules. Thorax. 2015 Aug;70 Suppl 2:ii1-ii54. doi: 10.1136/thoraxjnl-2015-207168.