6 Single-cell epigenetics
Transcription factor scoring, diagonal integration, gene regulatory networks
6.1 Primer on the genome, epigenetics, and enhancers
The genome is the complete set of DNA within an organism, encoding the instructions for life. While every cell in an organism typically contains the same genome, different cells exhibit distinct phenotypes and functions. This diversity arises not from changes in the underlying DNA sequence but from epigenetic regulation: heritable modifications that influence gene expression without altering the DNA sequence itself. Epigenetics includes mechanisms like DNA methylation, histone modification, and changes in chromatin accessibility, all of which contribute to the dynamic regulation of gene activity in response to developmental cues and environmental signals.
Enhancers play a critical role in this regulatory landscape. These are DNA sequences that, while not coding for proteins themselves, can dramatically increase the transcription of target genes. Enhancers act by binding specific transcription factors, which are proteins that recognize and attach to particular DNA sequences to regulate gene expression. Some transcription factors require assistance from chaperone proteins, which ensure their proper folding and functionality, or from pioneer factors, which can access and open tightly packed chromatin to allow other factors to bind. This interplay highlights the complexity of the regulatory machinery that governs cellular function. See Figure 6.1 and Figure 6.2 to appreciate how complex this machinery is.
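To make "a transcription factor recognizes a DNA sequence" a bit more concrete, here is a minimal sketch of scanning a sequence with a position weight matrix and scoring each window by log-odds against a background. Everything in it is an assumption made for illustration: the 4-bp motif probabilities are made up, the background is taken to be uniform, the example "enhancer" sequence is invented, and only the forward strand is scanned.

```python
import math

# Hypothetical motif (made-up numbers): one dict per motif position,
# giving the probability of each base at that position. Consensus: AGCT.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base frequencies


def log_odds_score(window):
    """Sum of log2(motif probability / background) over the window."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(MOTIF, window))


def scan(sequence):
    """Score every window of motif length along the forward strand."""
    k = len(MOTIF)
    return [(i, log_odds_score(sequence[i:i + k])) for i in range(len(sequence) - k + 1)]


def best_site(sequence):
    """Return (start position, score) of the highest-scoring window."""
    return max(scan(sequence), key=lambda hit: hit[1])


enhancer = "TTGCAGCTAACG"  # invented example sequence
position, score = best_site(enhancer)
print(f"best match starts at position {position} with score {score:.2f}")
```

A higher score means the window looks more like the motif than like background sequence. Real motif scanning also considers the reverse strand and non-uniform backgrounds, and a high score by itself does not show that the factor binds there in a given cell.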


6.2 The zoo of epigenetic modalities
Epigenetics encompasses a vast array of molecular mechanisms that regulate gene expression without altering the underlying DNA sequence. These mechanisms include modifications to DNA, RNA, chromatin, and the spatial organization of the genome, collectively forming a complex regulatory landscape. Below are some of the key modalities studied in epigenetics:
- DNA Accessibility: Techniques like ATAC-seq and DNase-seq measure how accessible DNA is to transcription factors and other regulatory proteins. Accessible regions often overlap with promoters and enhancers, providing critical insights into gene regulation. (This is what we’ll focus on in this chapter.)
- DNA Methylation [1]: This modification, typically at cytosines in CpG dinucleotides, is a key epigenetic mark associated with gene silencing. Tools like bisulfite sequencing are used to map methylation patterns across the genome, revealing their roles in development and disease. See Figure 6.3.


- Hi-C and Genome Organization [2]: Hi-C measures chromatin interactions to reveal the three-dimensional structure of the genome. It uncovers features like topologically associating domains (TADs) and enhancer-promoter loops, which are crucial for understanding how spatial organization influences gene regulation. See Figure 6.5.


- Histone Modifications [3]: Post-translational modifications, such as acetylation, methylation, and phosphorylation, occur on histone proteins and regulate chromatin structure. Techniques like ChIP-seq and the newer CUT&Tag method are used to map these modifications and their role in gene expression at the bulk level. See Figure 6.7.


Some of you might be interested: Personally, I think one of the most fascinating concepts built on histone modifications is bivalent chromatin (essentially, chromatin that carries activating and repressive histone marks at the same time, so that it is simultaneously primed for activation and silenced), see Figure 6.9. It’s a particularly curious phenomenon that entire labs dedicate themselves to studying. See (blanco2020bivalent?) for an overview of why this mechanism might be “beneficial.”


- RNA Modifications (m6A and Pseudouridine): Modifications like N6-methyladenosine (m6A) and pseudouridine occur on RNA molecules and are involved in processes like splicing, translation, and mRNA decay. See Figure 6.11 for what pseudouridine is, i.e., “a rotation of the uridine molecule.” These modifications add an epitranscriptomic layer to gene regulation.

- Untranslated Regions (UTRs): Before talking about UTRs, it’s probably good to review what an mRNA molecule “looks like” at the different stages of transcription and translation (see Figure 6.12). The 5′ and 3′ UTRs contain regulatory elements that influence mRNA stability, localization, and translation efficiency. The 3′ UTR, in particular, serves as a binding platform for RNA-binding proteins and microRNAs, providing an additional layer of post-transcriptional gene regulation.


- Alternative Splicing: This post-transcriptional process generates multiple mRNA isoforms from the same gene, expanding proteomic diversity (see Figure 6.15). Because exons can be included or skipped in many combinations, the number of possible isoforms grows combinatorially. While not traditionally classified as epigenetic, splicing often intersects with chromatin modifications and RNA-binding proteins, blurring the lines between transcriptional and post-transcriptional regulation.

- Alternative Polyadenylation (APA): APA generates transcript isoforms with different 3′ ends, affecting mRNA stability, localization, and translation (see Figure 6.16). By selecting distinct cleavage sites, APA alters the 3′ UTR length without affecting the poly-A tail itself, thereby modulating interactions with regulatory factors such as microRNAs and RNA-binding proteins.
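Returning to the DNA methylation item in the list above: the logic behind bisulfite sequencing is that bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. Comparing reads to the reference at CpG sites then gives a methylation level per site. The sketch below is a toy under strong assumptions: the reference sequence and methylation patterns are invented, every read covers the whole reference on the forward strand, conversion is complete, and there are no sequencing errors.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def cpg_positions(reference):
    """Positions of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Fraction of reads showing C (methylated) at each reference CpG."""
    levels = {}
    for pos in cpg_positions(reference):
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


reference = "ACGTTCGACCGA"  # invented sequence with CpGs at positions 1, 5, 9
# Hypothetical methylation patterns of three DNA molecules (made up).
patterns = [{1, 5}, {1}, {1, 9}]
reads = [bisulfite_convert(reference, p) for p in patterns]
levels = methylation_levels(reference, reads)
print(levels)
```

In this toy, the CpG at position 1 is methylated in every molecule, while the CpGs at positions 5 and 9 are each methylated in one of the three. Real pipelines additionally have to align converted reads, handle both strands, and account for incomplete conversion.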

Efforts like the ENCODE project (https://pmc.ncbi.nlm.nih.gov/articles/PMC7061942/) have systematically mapped these modalities across cell types and tissues, creating a comprehensive resource for understanding genome function. ENCODE has provided invaluable datasets on DNA accessibility, histone modifications, and RNA-binding proteins, enabling researchers to uncover how epigenetic and transcriptomic layers work together to drive cellular processes.
Finally, tools like MPRA (Massively Parallel Reporter Assays) are revolutionizing how we study enhancers and regulatory sequences. MPRA allows researchers to test thousands of DNA fragments for their regulatory activity, providing functional validation for epigenetic marks. This growing zoo of modalities continues to expand our understanding of how the genome is dynamically regulated in health and disease.
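As a rough illustration of what "regulatory activity" can mean in an MPRA, one common summary is the ratio of RNA reads (how much reporter transcript a fragment produced) to DNA reads (how much of that fragment's construct was present), on a log scale and relative to a negative control. The sketch below assumes this simple ratio; the fragment names and counts are hypothetical, and the pseudocount and the choice of a single scrambled control are illustrative simplifications.

```python
import math

# Hypothetical read counts per tested fragment (made-up numbers).
dna_counts = {"enhancer_A": 200, "enhancer_B": 180, "scrambled_ctrl": 220}
rna_counts = {"enhancer_A": 1600, "enhancer_B": 190, "scrambled_ctrl": 210}


def log_ratio(rna, dna, pseudocount=1.0):
    """log2 of RNA over DNA counts; the pseudocount avoids division by zero."""
    return math.log2((rna + pseudocount) / (dna + pseudocount))


def mpra_activity(dna, rna, control="scrambled_ctrl"):
    """Activity of each fragment relative to the negative control."""
    baseline = log_ratio(rna[control], dna[control])
    return {frag: log_ratio(rna[frag], dna[frag]) - baseline for frag in dna}


activity = mpra_activity(dna_counts, rna_counts)
for fragment, value in sorted(activity.items(), key=lambda kv: -kv[1]):
    print(f"{fragment}: {value:+.2f}")
```

Here enhancer_A drives far more reporter RNA per unit of DNA than the control, while enhancer_B is close to the control. Actual MPRA analyses typically aggregate many barcodes and replicates per fragment and use a statistical model instead of a single ratio.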
See Figure 6.17 for some courageous figures that try to display multiple types of epigenetic modifications all at once. See (lim2024advances?) for a broad overview of the technologies used to sequence different omics and layers of epigenetics, and of how these layers are being computationally integrated.


[1] The technology to measure DNA methylation at single-cell resolution is still being developed (see snmC-seq2 (liu2021dna?)), and is particularly interesting due to methylation’s relation to cellular memory (kim2017dna?) and molecular clocks (hernando2019ageing?); (trapp2021profiling?); (gabbutt2022fluctuating?).
[2] See Droplet-HiC (chang2024droplet?) for an example of single-cell Hi-C.
[3] See Paired-tag (zhu2021joint?) for an example of single-cell histone modification profiling.