The Cancer Imaging Archive Explained

The Cancer Imaging Archive (TCIA) is an open-access database of medical images for cancer research. The site is funded by the National Cancer Institute's (NCI) Cancer Imaging Program, and the contract is operated by the University of Arkansas for Medical Sciences. Data within the archive are organized into collections, which typically share a common cancer type and/or anatomical site. Most of the data consist of CT, MRI, and nuclear medicine (e.g. PET) images stored in DICOM format, but many other types of supporting data are also provided or linked to in order to enhance research utility.[1] All data are de-identified to comply with the Health Insurance Portability and Accountability Act and National Institutes of Health data sharing policies.

TCIA resources are intended to support:

TCIA is recognized as a recommended repository by the journals Scientific Data, PLOS One,[3] and F1000Research.[4] It is also listed in the Registry of Research Data Repositories.[5]

History

Prior to the creation of TCIA, the NCI funded development of the National Biomedical Imaging Archive. NBIA is an open-source Web application which was designed to allow the storage and query of DICOM images. TCIA was subsequently initiated in December 2010 to expand data sharing activities by funding a service component which would help address the technical and policy challenges associated with medical imaging research. TCIA leverages open-source tools such as NBIA and Clinical Trials Processor in order to provide its services.[6] [7]

Organization of the archive

The site content is organized into five categories:[8]

Methods for accessing data

Most collections on the Cancer Imaging Archive can be accessed without an account, but a few are restricted to specific users and therefore require an account to access them.[2] TCIA has several ways to browse, filter, and download data. They include:

Browsing, bulk downloading and access to supporting data

The home page includes a list of all available collections. Basic information about each collection, such as the cancer type, cancer location, modalities, and number of subjects, is also provided. Clicking on a collection name presents a page which describes the data, including their original research purpose, how they were generated, and how they might be useful to other TCIA users. One example is the page for the NSCLC-Radiomics-Genomics collection. In the lower section of the page, the Data Access tab contains links to search or download the images and any available supporting data. Additional tabs provide information about data versions and how to cite the data if used in publications.

Many collections contain additional data types such as genomics, patient demographics, treatment details, and expert analyses of the images. These data can usually be found only by browsing the collection pages, not by searching in NBIA or using the API.

Filtering or searching with NBIA

On each Collection page and also in the main menu of the site there are links to "Search TCIA". This will load the NBIA application which allows simple, advanced and free text searches. Search results follow the conventional DICOM hierarchy of patient -> study -> series. TCIA provides comprehensive documentation on the various features of the NBIA software.[9]
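The patient -> study -> series hierarchy can be illustrated with a minimal Python sketch. The `build_hierarchy` helper and the records are hypothetical, and the identifiers are made-up placeholders rather than real TCIA data; only the attribute names PatientID, StudyInstanceUID, and SeriesInstanceUID are standard DICOM attributes.

```python
from collections import defaultdict


def build_hierarchy(records):
    """Group flat series records into {patient: {study: [series, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        tree[rec["PatientID"]][rec["StudyInstanceUID"]].append(
            rec["SeriesInstanceUID"]
        )
    # Convert the nested defaultdicts to plain dicts for predictable use.
    return {patient: dict(studies) for patient, studies in tree.items()}


# Hypothetical search results (placeholder identifiers, not real TCIA data).
records = [
    {"PatientID": "P001", "StudyInstanceUID": "1.2.3.1", "SeriesInstanceUID": "1.2.3.1.1"},
    {"PatientID": "P001", "StudyInstanceUID": "1.2.3.1", "SeriesInstanceUID": "1.2.3.1.2"},
    {"PatientID": "P001", "StudyInstanceUID": "1.2.3.2", "SeriesInstanceUID": "1.2.3.2.1"},
    {"PatientID": "P002", "StudyInstanceUID": "1.2.4.1", "SeriesInstanceUID": "1.2.4.1.1"},
]

hierarchy = build_hierarchy(records)
for patient, studies in hierarchy.items():
    for study, series in studies.items():
        print(patient, study, len(series))
```

Each patient maps to one or more studies, and each study to one or more series, which mirrors how search results are nested.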

RESTful API

A number of search and download commands are also available through the API. New iterations on the API are released as new versions, so that existing applications developed against older versions of the API continue to function.[10]
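The benefit of versioned releases can be sketched as a client that pins the version in the request URL. This is only an illustration under stated assumptions: the base URL, the "v4" version segment, the "getSeries" endpoint name, and the "Collection" and "format" parameters are assumptions not taken from this article and should be checked against the API usage guide before use. The `build_query_url` helper is hypothetical and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: base URL and path layout; verify against the API usage guide.
BASE_URL = "https://services.cancerimagingarchive.net/services"


def build_query_url(endpoint, version="v4", base_url=BASE_URL, **params):
    """Return a versioned query URL; parameters set to None are skipped."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url}/{version}/TCIA/query/{endpoint}"
    return f"{url}?{query}" if query else url


# Assumed endpoint and parameter names; the collection name is an example.
url = build_query_url("getSeries", Collection="TCGA-GBM", format="json")
print(url)
```

Because the version is part of the path, an application written against one version keeps requesting that version even after a newer one is released.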

Research activities

A list of known publications based on TCIA data is maintained as a convenience to researchers who might want to investigate how it has been used previously.[11] In addition to peer-reviewed publications there are also several major research initiatives described in the Research Activities section of the site.

The CIP TCGA Radiology Initiative for Radiogenomics Research

A large number of collections contain subjects which were analyzed as part of the NIH/NHGRI database known as The Cancer Genome Atlas (TCGA). Using shared unique identifiers, researchers can correlate the clinical images with the extensive genomic analyses, digital pathology slides, and individual demographic and clinical data available in TCGA, which can be downloaded in bulk. A multi-institutional network of investigators volunteering their time is using the data to develop methods to determine prognosis or predict the response to therapy.[12] TCGA collections are designated by nomenclature shared with the TCGA Data Portal[13] (e.g. TCGA-BRCA, TCGA-GBM). They are subject to a special publication policy which is distinct from that of the other public data on TCIA.[14]
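Correlating images with TCGA data through shared identifiers amounts to a join on the subject identifier. The following Python sketch illustrates the idea; the `join_on_subject` helper, the field names, and the records are hypothetical, and the identifiers are placeholders rather than real subjects.

```python
def join_on_subject(imaging, clinical):
    """Pair each imaging record with the clinical record sharing its subject ID."""
    clinical_by_id = {row["subject_id"]: row for row in clinical}
    joined = []
    for img in imaging:
        match = clinical_by_id.get(img["subject_id"])
        if match is not None:  # keep only subjects present in both sources
            joined.append({**img, **match})
    return joined


# Hypothetical records; identifiers are placeholders, not real subjects.
imaging = [
    {"subject_id": "TCGA-00-0001", "modality": "MR"},
    {"subject_id": "TCGA-00-0002", "modality": "CT"},
]
clinical = [
    {"subject_id": "TCGA-00-0001", "age_at_diagnosis": 61},
]

merged = join_on_subject(imaging, clinical)
print(merged)
```

Subjects that appear in only one source are dropped here; a real analysis might instead keep them and flag the missing data.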

Challenge competitions

TCIA also provides specific data sets used for "Challenge" competitions, such as those associated with international digital image-focused professional societies and meetings like MICCAI, SPIE, or ISBI. A directory of previous and upcoming challenges is maintained on the site.[15]

Digital object identifiers

To facilitate data sharing, many publications encourage authors to cite the data they used to produce the results described in their scholarly papers. In addition, new journals are now available for describing data collections outright (e.g., Nature Scientific Data). TCIA assigns digital object identifiers (DOIs) to all collections when they are submitted, and can also create persistent identifiers linked to subsets of data held within TCIA that authors may use for data citations in their scholarly papers.[16]
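A DOI becomes a citable link when it is appended to the doi.org resolver address. The sketch below shows that step in Python; the `doi_to_url` helper is hypothetical, and the example DOI is the one listed for the TCIA article in the references, not the DOI of a TCIA collection.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Turn a bare DOI (or one prefixed with 'doi:') into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Keep '/' unescaped; it separates the DOI prefix from the suffix.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.1007/s10278-013-9622-7"))
```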

References

  1. Clark, K.; Vendt, B.; et al. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging. June 2013. 26 (6): 1045–57. doi:10.1007/s10278-013-9622-7. PMID 23884657. PMC 3824915.
  2. News: About The Cancer Imaging Archive (TCIA) - The Cancer Imaging Archive (TCIA). The Cancer Imaging Archive (TCIA). en-US. 2016-03-24.
  3. Web site: PLOS ONE: accelerating the publication of peer-reviewed science. journals.plos.org. 2016-06-11.
  4. Web site: Data guidelines - F1000Research. f1000research.com. 2016-06-11.
  5. Web site: The Cancer Imaging Archive re3data.org. service.re3data.org. 2016-06-11.
  6. Clark, Kenneth; Vendt, Bruce; Smith, Kirk; Freymann, John; Kirby, Justin; Koppel, Paul; Moore, Stephen; Phillips, Stanley; Maffitt, David. 2013-07-25. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging. 26 (6): 1045–1057. doi:10.1007/s10278-013-9622-7. ISSN 0897-1889. PMC 3824915. PMID 23884657.
  7. Web site: The Cancer Imaging Archive. Cancer Imaging Program - National Cancer Institute. 2016-03-24.
  8. Web site: The Cancer Imaging Archive (TCIA) - A growing archive of medical images of cancer. The Cancer Imaging Archive (TCIA). en-US. 2016-03-24.
  9. Web site: Cancer Imaging Archive User's Guide - TCIA Online Help - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  10. Web site: TCIA Programmatic Interface (REST API) Usage Guide - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  11. Web site: Publications - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  12. Web site: CIP TCGA Radiology Initiative - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  13. Web site: TCGA Data Portal. https://tcga-data.nci.nih.gov
  14. Web site: Data Usage Policies and Restrictions - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  15. Web site: Challenge competitions - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.
  16. Web site: TCIA Digital Object Identifiers - TCIA DOIs - Cancer Imaging Archive Wiki. wiki.cancerimagingarchive.net. 2016-03-24.