Read Time: 7 minutes

Mistakes in DNA shape gene activity

Researchers used a new sequencing technology to explore the human genome. They found that DNA methylation is linked to gene expression through changes in underlying DNA sequences.


Image Credit: From kjpargeter on Freepik

In every cell, molecular components constantly read, interpret, and modify the DNA to extract instructions for biological processes. One way cells adjust how these instructions are read is through a chemical process called DNA methylation. This process is like a system that marks parts of the DNA as “do not read” or “start here.” Scientists have previously noticed that when certain spots in DNA, called CpG sites, are not methylated, the genes near them are often more active. This observation led researchers to assume that methylation directly turns genes on or off, affecting gene expression.

A team of researchers from Iceland suspected that the observed patterns between DNA methylation and gene activity might not be caused by methylation itself, but instead by the DNA sequences underneath. They asked: what if methylation and gene expression both respond to something more fundamental, like differences in the usual sequence of DNA letters (A, T, C, and G), referred to as DNA variants?

To explore this hypothesis, the researchers obtained blood samples from 7,179 people and sequenced their entire genomes using nanopore sequencing, a method that can detect both the DNA letters and whether they carry methylation marks. This analysis allowed them to track which versions of DNA people inherited from each parent and to see how methylation changed from one version to another.

They focused on regions where clusters of CpG sites had very low methylation in one version but not the other. These unmethylated zones often appear in places where genes are regulated, and they act like genetic dimmer switches. In total, they identified nearly 190,000 unmethylated zones.
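To make the idea of a zone that is unmethylated in one inherited version but not the other more concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' method: the `Zone` class, the example values, and the `LOW` and `HIGH` cutoffs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A cluster of CpG sites with an average methylation level per parental copy."""
    name: str
    copy1_methylation: float  # fraction of methylated CpGs on one inherited copy (0 to 1)
    copy2_methylation: float  # fraction of methylated CpGs on the other inherited copy


# Hypothetical cutoffs chosen for illustration; the study's criteria are not given here.
LOW = 0.2
HIGH = 0.6


def is_unmethylated_on_one_copy(zone: Zone, low: float = LOW, high: float = HIGH) -> bool:
    """Return True when one copy is mostly unmethylated and the other is mostly methylated."""
    lower, higher = sorted((zone.copy1_methylation, zone.copy2_methylation))
    return lower <= low and higher >= high


# Made-up example zones.
zones = [
    Zone("zone_a", 0.05, 0.85),  # differs between copies
    Zone("zone_b", 0.10, 0.12),  # unmethylated on both copies
    Zone("zone_c", 0.90, 0.88),  # methylated on both copies
]

flagged = [zone.name for zone in zones if is_unmethylated_on_one_copy(zone)]
print(flagged)
```

Only the zone whose two copies differ sharply is flagged; zones that look the same on both copies are skipped.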

Next, they compared DNA variations within 100,000 base pairs of the unmethylated zones and found that specific DNA sequence variants affected about 41% of these unmethylated zones. In other words, a change in the DNA, like a single letter swap, an insertion, or a deletion, could affect whether a CpG site was methylated or not. They called these sequence variants allele-specific methylation quantitative trait loci, or ASM-QTLs.
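The 100,000 base pair search window can be pictured with a short Python sketch. This is a simplified illustration under stated assumptions: the function name, the coordinates, and the choice to measure the window from the edges of the zone are hypothetical, and all positions are assumed to lie on the same chromosome.

```python
WINDOW_BP = 100_000  # search distance described in the summary


def variants_near_zone(zone_start: int, zone_end: int,
                       variant_positions: list[int],
                       window: int = WINDOW_BP) -> list[int]:
    """Return variant positions that fall within `window` base pairs of the zone.

    Assumes the zone and all variants are on the same chromosome and that the
    window is measured from the zone's edges (an assumption for illustration).
    """
    lower = zone_start - window
    upper = zone_end + window
    return [pos for pos in variant_positions if lower <= pos <= upper]


# Made-up coordinates for one zone and a handful of variants.
nearby = variants_near_zone(
    zone_start=500_000,
    zone_end=501_000,
    variant_positions=[350_000, 420_000, 500_500, 601_000, 700_000],
)
print(nearby)
```

In a real analysis, each nearby variant would then be tested statistically for an association with methylation in the zone; this sketch only shows the selection of candidates by distance.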

The researchers also looked at how methylation in these unmethylated zones changed how much a gene was expressed. They sequenced RNA from 896 of the same people and matched it to the DNA methylation data. They found that in every case where an unmethylated zone was linked to gene expression, an ASM-QTL was also involved. Their finding implied that the DNA variant was tweaking methylation and impacting how active a nearby gene was. 

In addition, the researchers used a statistical analysis to test whether methylation changes were likely to cause changes in gene expression or whether it was the other way around. Their results leaned toward the idea that methylation influences gene expression, rather than vice versa. They also used statistical models to show that sequence variants often explained more of the variation in gene expression than methylation alone.

They examined the location of these variants and found that many of them sat in places where proteins that help control gene expression bind to the DNA. Some of these proteins are known to avoid binding if the DNA is methylated. So, they reasoned that if a variant changes the protein’s ability to bind, it could change both methylation patterns and gene activity.

The researchers also examined how often these methylation-associated variants overlapped with places already linked to human diseases and traits, like type 2 diabetes and asthma. They found that ASM-QTLs were over 23 times more likely than random variants to occur near these disease-related sites, especially those related to blood traits, which made sense since all their data came from blood samples.

In conclusion, the researchers suggested caution when interpreting links between DNA methylation and gene activity, because two things being connected doesn’t always mean one is causing the other. They highlighted how long-read sequencing technologies like nanopore could help resolve these complexities in the genome. They also suggested that understanding the genetic code at this fine scale could provide clearer insights into several sections of the genome that researchers in the past have labelled as junk DNA.

Study Information

Original study: The correlation between CpG methylation and gene expression is driven by sequence variants

Study was published on: July 24, 2024

Study author(s): Olafur Andri Stefansson, Brynja Dogg Sigurpalsdottir, Solvi Rognvaldsson, Gisli Hreinn Halldorsson, Kristinn Juliusson, Gardar Sveinbjornsson, Bjarni Gunnarsson, Doruk Beyter, Hakon Jonsson, Sigurjon Axel Gudjonsson, Thorunn Asta Olafsdottir, Saedis Saevarsdottir, Magnus Karl Magnusson, Sigrun Helga Lund, Vinicius Tragante, Asmundur Oddsson, Marteinn Thor Hardarson, Hannes Petur Eggertsson, Reynir L. Gudmundsson, Sverrir Sverrisson, Michael L. Frigge, Florian Zink, Hilma Holm, Hreinn Stefansson, Thorunn Rafnar, Ingileif Jonsdottir, Patrick Sulem, Agnar Helgason, Daniel F. Gudbjartsson, Bjarni V. Halldorsson, Unnur Thorsteinsdottir, Kari Stefansson

The study was done at: deCODE genetics/Amgen (Iceland), Reykjavik University (Iceland), University of Iceland (Iceland)

The study was funded by: None acknowledged

Raw data availability: Not available

Featured image credit: From kjpargeter on Freepik

This summary was edited by: Aubrey Zerkle