
TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence

Carolyn Hutter, Ph.D.

Program Director of the Division of Genomic Medicine at the National Human Genome Research Institute (NHGRI)

In 2013, TCGA’s ‘Pan-Cancer’ analysis on over 5,000 cases from 12 tumor projects (see figure) was featured in Nature Genetics with a complementary focus website, which presented over 15 papers and 5 thematic threads. The threads highlight key findings for mutational drivers, network models, exposures and pathogens, data discovery and future directions.


TCGA is currently expanding efforts to characterize commonalities, differences, and emergent themes across cancer types in collaboration with the International Cancer Genome Consortium (ICGC) through the Pan-Cancer Analysis of Whole Genomes (PAWG) project. The goal is to analyze the genomes, including genome-wide sequence data, of approximately 2,000 pairs of tumor and normal samples, and to integrate those results with clinical and other molecular data on the same cases. The genomic sequence data will be available to the research community through the TCGA Data Portal, CGHub, and the ICGC Data Repository. Investigators around the globe will lead analyses in a number of scientific areas, including integration of transcriptome and genome analyses, patterns of structural variation, novel somatic mutation-calling methods, evolution and heterogeneity, and germline cancer genome variation.

Figure 1: Integrated data set for comparing and contrasting multiple tumor types. The Cancer Genome Atlas Research Network, Weinstein, J.N., Collisson, E.A., Mills, G.B., Shaw, K.M., Ozenberger, B.A., Ellrott, K., Shmulevich, I., Sander, C., and Stuart, J.M. (2013) The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. doi:10.1038/ng.2764. Read the full article.


The TCGA/ICGC PAWG will capitalize on existing TCGA data and infrastructure, and will incorporate information from other NIH-funded projects, such as the Encyclopedia of DNA Elements (ENCODE), the Genotype-Tissue Expression (GTEx) Program and the Roadmap Epigenomics Program. As with other TCGA Pan-Cancer efforts to date, this work represents a significant effort and underscores the importance of team science. Using integrative approaches, investigators will be better able to distinguish the signal from the noise and focus on functionally relevant genomic alterations, pathways and mechanisms. However, whole genome analysis also poses a number of key challenges and research needs, such as improved approaches for computing on petabytes of data, more robust standards for cross-project mutation calling, and more effective methods for analyzing and interpreting non-coding variation.


Overall, combining whole genome sequence analysis and comprehensive genomic characterization in this coordinated cross-cancer analysis will enhance our knowledge of cancer genomics and biology. Such work will move TCGA closer to its goal of improving our ability to diagnose, treat and prevent cancer. Furthermore, the advances in this project will extend beyond cancer research, as the improved capabilities in whole genome sequence analysis and interpretation will be applicable to studies of other diseases and of biology in general.



