RSNA Breast Cancer Mammography Classification Dataset
Modality | Data Format | Publisher | Licence |
Mammography | DICOM | RSNA | Non-Commercial |
According to the World Health Organization (WHO), breast cancer is the most commonly diagnosed cancer worldwide. In 2020, 2.3 million new diagnoses and 685,000 deaths were reported. Even so, breast cancer mortality in high-income countries has fallen by 40% since the 1980s.
This improvement is largely attributed to the widespread adoption of regular mammography screening for at-risk age groups. Mammography has proved to be a cornerstone of early detection, which is generally associated with better outcomes, including a wider range of treatment options and reduced mortality.
About the Mammography Classification Dataset
The RSNA Breast Cancer Mammography Classification Dataset supports research on the early detection of breast cancer. It comprises mammograms in DICOM format from screening exams and is intended for training machine learning models that identify breast cancer cases accurately.
The hidden test set contains approximately 8,000 patients, which makes the dataset large enough to develop and validate sophisticated algorithms. Each patient usually has around 4 images, although this can vary. The images are often encoded in JPEG 2000 format, so special libraries may be needed to load and manipulate them. The dataset is a key resource for building automated tools that help radiologists diagnose breast cancer with high precision.
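Because JPEG 2000 decoding is the step most likely to fail when loading these files, it can help to check a file's transfer syntax before reading its pixels. The following is a minimal sketch, not the dataset publisher's loader: the helper names `needs_jpeg2000_decoder` and `load_mammogram` are hypothetical, and the loader assumes that pydicom and NumPy are installed together with a JPEG 2000 decoding plugin (for example pylibjpeg with pylibjpeg-openjpeg). The two UIDs are the JPEG 2000 transfer syntaxes defined by the DICOM standard.

```python
# Transfer syntax UIDs that the DICOM standard assigns to JPEG 2000 pixel data.
JPEG2000_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2.4.90",  # JPEG 2000 Image Compression (Lossless Only)
    "1.2.840.10008.1.2.4.91",  # JPEG 2000 Image Compression
}


def needs_jpeg2000_decoder(transfer_syntax_uid):
    """Return True if pixel data with this transfer syntax is JPEG 2000-compressed."""
    return str(transfer_syntax_uid) in JPEG2000_TRANSFER_SYNTAXES


def load_mammogram(path):
    """Read one DICOM file and return (pixel_array, is_jpeg2000).

    Hypothetical helper. Assumes pydicom and NumPy are installed; JPEG 2000
    files additionally need a decoding plugin such as pylibjpeg-openjpeg.
    """
    import pydicom  # imported lazily so the check above works without pydicom

    ds = pydicom.dcmread(path)
    is_jpeg2000 = needs_jpeg2000_decoder(ds.file_meta.TransferSyntaxUID)
    pixels = ds.pixel_array  # raises an error if no suitable decoder is available
    return pixels, is_jpeg2000
```

Calling `load_mammogram` on each of a patient's images (usually around 4) returns the decoded array along with a flag showing whether a JPEG 2000 decoder was required, which makes missing-plugin failures easier to diagnose.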