In every cell, molecular components constantly read, interpret, and modify the DNA, the cell’s instruction manual, to guide biological processes. One way cells annotate this manual is through a chemical process called DNA methylation, which works like a system that marks parts of the DNA as “do not read” or “start here.” Scientists have long noticed that when certain spots in DNA, called CpG sites, are not methylated, the genes near them are often more active. This observation led researchers to assume that methylation directly turns genes on or off.
A team of researchers from Iceland suspected that the observed patterns between DNA methylation and gene activity might not be caused by methylation itself, but instead by the underlying DNA sequence. They asked: what if methylation and gene expression both respond to something more fundamental, namely differences between people in the sequence of A, T, C, and G letters, referred to as DNA variants?
To explore this hypothesis, the researchers obtained blood samples from 7,179 people and sequenced their entire genomes using nanopore sequencing, a method that can detect both the DNA letters and whether they carry methylation marks. This analysis allowed them to track which versions of DNA people inherited from each parent and to see how methylation changed from one version to another.
They focused on regions where clusters of CpG sites had very low methylation in one version but not the other. These unmethylated zones often appear in places where genes are regulated, and they act like genetic dimmer switches. In total, they identified nearly 190,000 unmethylated zones.
Next, they compared DNA variations within 100,000 base pairs of the unmethylated zones and found that specific DNA sequence variants affected about 41% of these unmethylated zones. In other words, a change in the DNA, like a single letter swap, an insertion, or a deletion, could affect whether a CpG site was methylated or not. They called these sequence variants allele-specific methylation quantitative trait loci, or ASM-QTLs.
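For readers curious about what the 100,000-base-pair comparison looks like in practice, here is a minimal sketch of the windowing step only. It is an illustration, not the researchers' pipeline: the function name, the example coordinates, and the choice to measure the distance from the edges of the zone are all assumptions that the article does not specify.

```python
from bisect import bisect_left, bisect_right

WINDOW_BP = 100_000  # distance mentioned in the article


def variants_near_zone(zone_start, zone_end, variant_positions, window=WINDOW_BP):
    """Return the variant positions within `window` base pairs of a zone.

    Assumptions (hypothetical, for illustration only):
    - all positions are on the same chromosome,
    - `variant_positions` is sorted in ascending order,
    - distance is measured from the zone's edges, not its center.
    """
    lo = bisect_left(variant_positions, zone_start - window)
    hi = bisect_right(variant_positions, zone_end + window)
    return variant_positions[lo:hi]


# Made-up coordinates: one unmethylated zone and five variants.
variants = [5_000, 310_000, 420_000, 515_000, 900_000]
nearby = variants_near_zone(400_000, 402_000, variants)
print(nearby)  # [310000, 420000]
```

Each variant found this way would then be a candidate to test against the zone's methylation status; that statistical test is outside the scope of this sketch.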
The researchers also looked at how methylation in these unmethylated zones related to how much a nearby gene was expressed. They sequenced RNA from 896 of the same people and matched it to the DNA methylation data. They found that in every case where an unmethylated zone was linked to gene expression, an ASM-QTL was also involved. This finding implied that the DNA variant was influencing both methylation and the activity of a nearby gene.
In addition, the researchers used a statistical analysis to test whether methylation changes were likely to cause changes in gene expression or whether it was the other way around. The results leaned toward the idea that methylation influences gene expression, rather than vice versa. They also used statistical models to show that sequence variants often explained more of the variation in gene expression than methylation alone.
They examined the location of these variants and found that many of them sat in places where proteins that help control gene expression bind to the DNA. Some of these proteins are known to avoid binding if the DNA is methylated. So, they reasoned that if a variant changes the protein’s ability to bind, it could change both methylation patterns and gene activity.
The researchers also examined how often these methylation-associated variants overlapped with places already linked to human diseases and traits, like type 2 diabetes and asthma. They found that ASM-QTLs were over 23 times more likely than random variants to occur near these disease-related sites, especially those related to blood traits, which made sense since all their data came from blood samples.
In conclusion, the researchers suggested caution when interpreting links between DNA methylation and gene activity, because two things being connected doesn’t always mean one is causing the other. They highlighted how long-read sequencing technologies like nanopore could help resolve these complexities in the genome. They also suggested that understanding the genetic code at this fine scale could provide clearer insights into several sections of the genome that researchers in the past have labelled as junk DNA.