5 ‘Single-cell’ proteomics
Foray into deep learning and multi-omic integration
5.1 Review of the central dogma
The central dogma of molecular biology outlines the flow of genetic information within a cell: DNA is transcribed into RNA, and RNA is then translated into protein (see Figure 5.1). Specifically, coding genes in DNA are transcribed into messenger RNA (mRNA), which serves as a template for protein synthesis. During translation, the mRNA sequence is read three nucleotides at a time; each triplet, known as a codon, specifies one amino acid (or a stop signal). The amino acids are linked together to form proteins, which carry out a vast array of functions within the cell. While this picture provides a foundational framework, it is a dramatic over-simplification. As we’ll explore later in the course, the correlation between a gene’s mRNA abundance and the abundance of its corresponding protein is often surprisingly low, highlighting the complexity of gene expression regulation.
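To make the codon-reading step concrete, here is a minimal Python sketch of transcription and translation using the standard genetic code. The function names `transcribe` and `translate` are illustrative choices, not part of any library, and the sketch makes simplifying assumptions: the input DNA is the coding (sense) strand, translation begins at the first AUG, and real-world details such as splicing, alternative start sites, and non-standard genetic codes are ignored.

```python
from itertools import product

# Standard genetic code, built compactly. Codons are enumerated in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...), and "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a DNA coding (sense) strand: T becomes U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    # Step through the sequence three nucleotides (one codon) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTTTTAA")
print(mrna)             # AUGGCCUUUUAA
print(translate(mrna))  # MAF (Met-Ala-Phe, then the stop codon UAA)
```

The lookup table is all there is to the "code" in the genetic code: 64 codons map onto 20 amino acids plus stop, so several codons encode the same amino acid.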

Understanding proteins is crucial because they are the primary effectors of cellular function. Most cellular activities, whether structural, enzymatic, or signaling, are mediated by proteins. While RNA intermediates such as mRNA play important roles in carrying genetic information, most RNA molecules never leave the cell, with a few exceptions such as extracellular RNAs involved in cell-to-cell communication. Proteins, however, directly influence both intracellular processes and extracellular interactions. Given the weak correlation between mRNA and protein levels and the central role proteins play in biological function, studying proteins arguably provides more direct and meaningful insight into cellular and organismal behavior. This dual perspective on the central dogma will frame much of our exploration in this course.¹
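As a sketch of what "correlation between mRNA and protein levels" means operationally, the snippet below computes a per-gene Pearson correlation across cells for two matched matrices. Everything here is an assumption for illustration: the helper name `per_gene_correlation` is hypothetical, the layout (rows are cells, columns are genes, in the same order in both matrices) is one convenient choice, and the numbers are made-up toy values rather than real measurements.

```python
import numpy as np


def per_gene_correlation(rna, protein):
    """Pearson correlation between mRNA and protein abundance, one value per gene.

    Both inputs have shape (n_cells, n_genes), with rows and columns matched.
    Genes that are constant across cells would give a division by zero; this
    sketch does not handle that case.
    """
    rna = np.asarray(rna, dtype=float)
    protein = np.asarray(protein, dtype=float)
    # Center each gene (column) around its mean across cells.
    rna_centered = rna - rna.mean(axis=0)
    protein_centered = protein - protein.mean(axis=0)
    covariance = (rna_centered * protein_centered).sum(axis=0)
    scale = np.sqrt((rna_centered ** 2).sum(axis=0) * (protein_centered ** 2).sum(axis=0))
    return covariance / scale


# Toy values (NOT real data): 4 cells x 2 genes.
rna = [[1, 1], [2, 2], [3, 3], [4, 4]]
protein = [[2, 4], [4, 1], [6, 3], [8, 2]]

# Gene 0: protein tracks mRNA exactly. Gene 1: protein does not follow mRNA.
print(per_gene_correlation(rna, protein))  # [ 1.  -0.4]
```

In practice, rank-based measures such as Spearman correlation are a common alternative, since mRNA and protein abundances are measured on very different scales.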


5.2 Other ways to study proteins that we’re not going to discuss here
5.2.1 So You Heard About AlphaFold…
AlphaFold (see Figure 5.4) represents a revolutionary advancement in computational biology: it predicts the three-dimensional structure of a protein from its amino acid sequence.² Historically, determining protein structures required experimental techniques such as X-ray crystallography, cryo-electron microscopy (cryo-EM) (see Figure 5.5), or nuclear magnetic resonance (NMR), all of which are resource-intensive and time-consuming. AlphaFold instead combines deep learning with insights from structural biology to predict structures computationally, in many cases with accuracy approaching that of experimental methods.
However, significant challenges remain. There are still many open questions about how specific genetic modifications affect protein folding, how proteins dynamically change their conformation, and how they interact with other molecules such as DNA or other proteins. Additionally, ongoing work with large language models shows promise for predicting not only a protein’s structure but also its potential functions directly from its amino acid sequence.


5.2.2 Other Methods: Flow Cytometry, Spatial Proteomics, and FISH
While AlphaFold focuses on protein structure, methods like flow cytometry and spatial proteomics explore proteins in their functional and cellular contexts. Flow cytometry, sometimes considered the “original” single-cell method, measures the expression of surface and intracellular proteins in thousands of individual cells, providing rich insight into cellular heterogeneity. Spatial proteomics and fluorescence in situ hybridization (FISH) take this further by localizing proteins and RNA, respectively, within intact tissue, enabling researchers to map molecular interactions in their native environments. These approaches highlight the breadth of protein studies, from understanding structure to dissecting function and distribution in complex systems. While not the focus of this course, these methods are invaluable for expanding our understanding of proteins and their roles in biology.
¹ Proteins typically degrade much more slowly than mRNAs; see https://book.bionumbers.org/how-fast-do-rnas-and-proteins-degrade. For this reason, you might hypothesize that “cellular memory” is stored in proteins rather than in mRNA.
² For more on this, see the YouTube video at https://www.youtube.com/watch?v=P_fHJIYENdI.