
Cancers share DNA mutations that affect our genome

Researchers discovered regions of our DNA where mutations disrupt the structure of our genome and could promote growth in multiple different forms of cancer.


Image Credit: Image by GrumpyBeere from Pixabay

The human genome consists of roughly 3 billion DNA base pairs. If these base pairs were letters printed in a line, they would fill more than 6,000 novels. Stretched out end to end, that much DNA would be far too long to fit inside a cell. Instead, some proteins organize and fold the DNA into a more compact and functional 3D structure called chromatin. These proteins regulate how different parts of the genome interact and control which genes are active and which remain silent in each cell. One such protein is the CCCTC-binding factor, or CTCF.

For CTCF to function, it must first attach to specific spots on the DNA called binding sites. Scientists have reported that these CTCF-binding sites behave differently under different conditions. Some lose their binding ability due to chemical interactions within the DNA, while others remain stable. Scientists call the stable ones persistent CTCF-binding sites.

Scientists had previously reported that mutations in CTCF-binding sites are often found in cancer cells and disrupt the normal 3D structure of the genome. However, they were unsure whether these mutations were concentrated in persistent sites or what roles they played. Researchers from Australia sought to understand mutations at persistent CTCF-binding sites and how they impact various cancers.

To tackle these questions, the research team developed a computational tool based on machine learning models, called CTCF-INSITE. Their tool uses genetic data and information about chemical tags on the DNA, such as methyl groups, to predict which CTCF-binding sites are likely to remain persistent even when CTCF protein levels drop. The researchers used this tool to map out which persistent CTCF-binding sites across the genome might be particularly vulnerable to mutations and whether these mutations were linked to cancer growth.

The researchers trained their tool to distinguish between stable and unstable CTCF-binding sites using data from several human cell culture samples, including prostate, breast, and lung cancer cells. They used features like how strongly the protein binds at each site, the relative locations of the binding sites in the genome, and how distant regions of DNA interact with one another.
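For readers curious about what "training a tool to distinguish stable from unstable sites" can look like in practice, here is a minimal sketch. It is not the CTCF-INSITE method: the summary does not say which model the researchers used, so a simple logistic regression stands in for it, and the two features, their values, and the labels are all invented for illustration.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0..1 so it can be read as a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - y  # positive if the model overestimates persistence
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict_persistent(weights, bias, x):
    """Return True if the model considers the site more likely persistent than not."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5

# Hypothetical training data: [binding_strength, interaction_score], scaled 0..1.
# Label 1 = site stayed bound when CTCF levels dropped, 0 = site was lost.
sites = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7], [0.2, 0.3], [0.3, 0.1], [0.1, 0.2]]
is_persistent = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(sites, is_persistent)
print(predict_persistent(weights, bias, [0.95, 0.9]))   # a strongly bound site
print(predict_persistent(weights, bias, [0.15, 0.15]))  # a weakly bound site
```

The idea carries over to more sophisticated models: the tool learns from sites whose behavior is already known, then predicts the behavior of sites that have not been tested.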

Next, the researchers downloaded mutation data for 12 cancer types from the International Cancer Genome Consortium. They filtered out data entries with too few or too many mutations to avoid imbalance. Then, they applied their CTCF-INSITE tool to test whether persistent CTCF-binding sites were more prone to mutations in cancerous cells than other CTCF-binding sites. 

They found significantly more mutations at the persistent CTCF-binding sites in all cancer types they tested, meaning these sites had more mutations than would be expected by random chance. The researchers noted the mutations were specifically in the CTCF-binding sites, rather than in sections of DNA close to the sites. They also reported these mutations were more prominent in breast and prostate cancer cells than in other cancer types.
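To illustrate what "more mutations than would be expected by random chance" means, the sketch below runs a simple one-sided binomial test. This is an assumed stand-in, not the statistical method from the study, and every number in it is made up. It asks: if mutations landed randomly across the genome, how likely would it be to see at least this many inside the binding sites?

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Probability of seeing k or more successes in n trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def enrichment_p_value(mutations_in_sites, total_mutations, site_bases, genome_bases):
    """One-sided test for an excess of mutations inside a set of sites.

    Null assumption (a simplification): each mutation lands in the sites with
    probability equal to the fraction of the genome the sites cover.
    """
    p_null = site_bases / genome_bases
    return binomial_upper_tail(mutations_in_sites, total_mutations, p_null)

# Hypothetical numbers: the sites cover 1% of the sequence, so about 2 of
# 200 mutations would be expected there by chance, but 12 were observed.
p_value = enrichment_p_value(
    mutations_in_sites=12,
    total_mutations=200,
    site_bases=10_000,
    genome_bases=1_000_000,
)
print(f"p-value: {p_value:.2e}")
```

A very small p-value means the observed excess would rarely arise by chance alone. Real analyses also have to account for the fact that mutation rates vary across the genome, which this sketch ignores.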

The researchers further sought to understand whether these mutations altered the genome’s 3D structure. They used laboratory techniques, like fluorescence imaging, to test some of these cancer-specific mutations and found that many altered the genome structure to reduce the strength and effectiveness of CTCF binding. They explained this reduction can impact gene expression in a way that may promote cancer growth.

The researchers emphasized that their findings were not limited to one or two types of cancer, since they found similar results in stomach, lung, prostate, breast, and skin cancers. While the exact mutation patterns varied between cancers, they reported that persistent CTCF-binding sites consistently carried more mutations across the board.

The researchers concluded their findings could help other cancer researchers understand the similarities in how multiple cancer types develop and progress. They also proposed their machine learning tool could provide future researchers with relevant CTCF-binding site candidates for experiments investigating undocumented causes of cancer.

Study Information

Original study: Machine learning enables pan-cancer identification of mutational hotspots at persistent CTCF binding sites

Study was published on: August 12, 2024

Study author(s): Wenhan Chen, Yi C. Zeng, Joanna Achinger-Kawecka, Elyssa Campbell, Alicia K. Jones, Alastair G. Stewart, Amanda Khoury, Susan J. Clark

The study was done at: Garvan Institute of Medical Research (Australia), Victor Chang Cardiac Research Institute (Australia), University of New South Wales (Australia)

The study was funded by: National Health and Medical Research Council

Raw data availability: In supplementary info

Featured image credit: Image by GrumpyBeere from Pixabay

This summary was edited by: Aubrey Zerkle