AI Foundation Model for Chest X-rays
Source: nature.com
AI Model for Chest Radiography
Chest radiography is often the starting point for identifying lung diseases, and deep learning could help automate its interpretation. However, current deep learning models are limited in diagnostic scope, adaptability, and extensibility.
Ark+ is a foundation model designed to address these limitations. It is pre-trained by gathering and reusing knowledge from expert labels across different datasets. Ark+ excels in diagnosing thoracic diseases and can adjust to new diagnostic needs, including novel diseases. It can learn rare conditions from limited samples and adapt to new settings without extra training. It also handles data biases, supports federated learning for privacy, and can be extended to other modalities, which makes it a candidate foundation model for medical imaging.
How Ark+ Works
Ark+'s capabilities come from combining diverse datasets, which broadens patient populations and utilizes knowledge from experts, enhancing performance and lowering annotation costs. The creation of Ark+ shows that open models, trained by collecting and reusing knowledge from varied expert annotations with public datasets, can outperform proprietary models trained on large datasets.
The developers hope their findings encourage more researchers to share data and code or federate data to create open foundation models with global expertise and patient populations, to accelerate open science and democratize AI for medicine.
Ark+ Technical Details
Ark+ uses a teacher-student framework with multi-task heads for specific tasks and cyclic pretraining to reuse knowledge. The student model cycles through the datasets and learns from their expert annotations. The student's knowledge is accumulated in the teacher through an exponential moving average (EMA). A projector maps representations to the same feature space for consistency.
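The EMA step described above can be illustrated with a minimal sketch. It is not the Ark+ implementation: the function name `ema_update`, the weight layout (a dict of plain Python lists), and the momentum value are assumptions chosen for illustration, since the text does not specify them.

```python
def ema_update(teacher, student, momentum=0.9):
    """Blend student weights into the teacher using an exponential moving average.

    teacher, student: dicts mapping a parameter name to a list of floats
    (a hypothetical stand-in for real model tensors).
    momentum: fraction of the old teacher value that is kept (assumed value).
    """
    if not 0.0 <= momentum <= 1.0:
        raise ValueError("momentum must be in [0, 1]")
    for name, student_values in student.items():
        teacher_values = teacher[name]
        # new_teacher = momentum * old_teacher + (1 - momentum) * student
        teacher[name] = [
            momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_values, student_values)
        ]
    return teacher


# Toy example: one parameter group with two weights.
teacher_weights = {"encoder": [0.0, 0.0]}
student_weights = {"encoder": [1.0, 2.0]}
ema_update(teacher_weights, student_weights, momentum=0.9)
```

With a momentum close to 1, the teacher changes slowly, so it carries forward what the student learned on earlier datasets while the student moves on to the next one.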
Unlike previous designs, Ark+ feeds the teacher resized original images instead of random crops to ensure a consistent supervisory signal, speeding up training and improving performance. Ark+ can be federated by deploying a local version at each site to protect privacy. Sites train their Ark+ models and send student weights to a central server, where they are averaged into a master model. This model is then sent back to local sites for continuous improvement.
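The server-side averaging in the federated setup can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the name `average_site_weights` and the dict-of-lists weight format are hypothetical, and the sketch uses a plain unweighted mean because the text says only that the weights are averaged.

```python
def average_site_weights(site_weights):
    """Average student weights received from several sites into a master model.

    site_weights: list of dicts, one per site, each mapping a parameter name
    to a list of floats (a hypothetical stand-in for real model tensors).
    Returns a new dict holding the element-wise mean across sites.
    """
    if not site_weights:
        raise ValueError("at least one site is required")
    num_sites = len(site_weights)
    master = {}
    for name in site_weights[0]:
        # Collect this parameter from every site, then average position by position.
        per_site = [weights[name] for weights in site_weights]
        master[name] = [sum(values) / num_sites for values in zip(*per_site)]
    return master


# Toy example: two sites send their locally trained student weights.
site_a = {"encoder": [1.0, 2.0]}
site_b = {"encoder": [3.0, 4.0]}
master_weights = average_site_weights([site_a, site_b])
# The master model would then be sent back to each site for the next round.
```

Only model weights travel between the sites and the server in this scheme, so the images and labels never leave the site that holds them.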
An upgraded model, Ark++covid, is created by pretraining Ark+ with COVID-19 diagnostic tasks. t-SNE visualizations show how the embeddings for COVID-19, Pneumonia, and Normal cases evolve from the pretrained Ark+ through fine-tuning with increasing numbers of samples. The embeddings become distinct with 3,000 samples, showing Ark+'s ability to develop feature representations and enhance diagnostic accuracy.
Supplementary methods include ablation studies on the multi-task head design and cyclic training approach. Fine-tuning baselines from an ImageNet-pretrained model are also provided for comparison. Supplementary data includes source data and results for underperforming methods.