Scientists have developed TopicVelo, a powerful new approach for analyzing the static snapshots from scRNA-seq to observe the changes in cells and genes over time. This breakthrough will greatly enhance the study of embryo development, cell differentiation, cancer formation, and immune system responses.
Imagine being able to predict the exact finishing order of the Kentucky Derby from a still photograph taken 10 seconds into the race.
This challenge is nothing compared to the complexities researchers encounter when using single-cell RNA-sequencing (scRNA-seq) to study how embryos develop, cells differentiate, cancers form, and the immune system responds.
In a paper published today in Proceedings of the National Academy of Sciences, a team of researchers from the UChicago Pritzker School of Molecular Engineering and the Chemistry Department introduces TopicVelo, a new technique for analyzing how cells and genes evolve over time using the static snapshots from scRNA-seq.
The interdisciplinary team integrated ideas from classical machine learning, computational biology, and chemistry to create TopicVelo, which combines unsupervised machine learning with a transcriptional model.
The components are each “a very simple, old idea,” said PME Assistant Professor of Molecular Engineering and Medicine Samantha Riesenfeld, who wrote the paper with Chemistry Department Prof. Suriyanarayanan Vaikuntanathan and their joint student, UChicago Chemistry PhD candidate Cheng Frank Gao. “But when you put them together, they do something more powerful than you might expect.”
The issue with pseudotime
Scientists use scRNA-seq to obtain powerful and detailed measurements, but those measurements are static by nature.
“We developed TopicVelo to infer cell-state transitions from scRNA-seq data,” Riesenfeld explained. “It’s challenging to achieve that from this type of data due to the destructive nature of scRNA-seq. When you measure the cell in this manner, you are essentially destroying it.”
As a result, researchers are left with a frozen moment in time when the cell was measured/destroyed. While scRNA-seq provides the most comprehensive transcriptome-wide snapshot available, what many researchers really need is the ability to track how cells change over time. They are interested in understanding the process of a cell transforming into a cancerous state or how a specific gene program behaves during an immune response.
To address the challenge of understanding dynamic processes from a static snapshot, researchers typically rely on a technique known as “pseudotime.” Observing the change and growth of an individual cell or gene’s expression in a still image is impossible. However, the captured image also includes other cells and genes of the same type that may be further along in the same process. By correctly connecting the dots, scientists can gain valuable insights into the process over time.
Connecting these dots involves difficult guesswork, because it rests on the assumption that similar-looking cells are at different points along the same path. Real biology is much more complex than that assumption allows, with false starts, stops, bursts, and multiple chemical forces affecting each gene.
Instead of traditional
Unlike traditional pseudotime approaches, which focus on expression similarity among cell transcriptional profiles, RNA velocity approaches analyze the dynamics of transcription, splicing, and mRNA degradation within the cells.
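As a rough illustration of what analyzing those dynamics means, the widely used deterministic formulation of RNA velocity links unspliced and spliced mRNA counts through constant rates. The sketch below assumes that textbook model. The function name and the rate values are hypothetical and are not taken from the TopicVelo paper.

```python
def rna_velocity(unspliced, spliced, beta, gamma):
    """Rate of change of spliced mRNA for one gene in one cell.

    Assumed model (standard deterministic formulation, not TopicVelo itself):
        du/dt = alpha - beta * u      (transcription, then splicing)
        ds/dt = beta * u - gamma * s  (splicing, then degradation)
    """
    return beta * unspliced - gamma * spliced


# Hypothetical rates and counts, chosen only for illustration.
v_up = rna_velocity(unspliced=8.0, spliced=10.0, beta=1.0, gamma=0.5)
v_down = rna_velocity(unspliced=2.0, spliced=10.0, beta=1.0, gamma=0.5)
print(v_up, v_down)  # 3.0 -3.0
```

A positive value suggests the gene's spliced mRNA is still accumulating in that cell, and a negative value suggests it is declining, even though both cells show the same spliced count in the snapshot.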
While promising, this technology is still in its early stages.
“The gap between the potential and practical application of RNA velocity has limited its use,” stated the authors in their paper.
To overcome this challenge, TopicVelo moves away from deterministic models and instead embraces a more complex stochastic model that reflects biology’s inherent randomness, while also drawing insights from it.
“Cells, when you think about them, are intrinsically random,” stated Gao, the paper’s first author. “You can have twins or genetically identical cells that will grow up to be very different. TopicVelo introduces the use of a stochastic model to better capture the underlying biophysics in the transcription processes that are important for mRNA transcription.”
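To make the contrast with a deterministic model concrete, here is a minimal sketch of a stochastic simulation (the Gillespie algorithm) of transcription, splicing, and degradation for a single gene. It is a generic illustration under assumed rates, not the model or code from the paper. The function name `simulate_cell` and all parameter values are hypothetical.

```python
import random


def simulate_cell(alpha, beta, gamma, t_end, rng):
    """Gillespie simulation of one gene in one cell.

    Reactions (assumed, generic model):
        transcription: nothing -> unspliced   at rate alpha
        splicing:      unspliced -> spliced   at rate beta * u
        degradation:   spliced -> nothing     at rate gamma * s
    Returns the (unspliced, spliced) counts at time t_end.
    """
    t, u, s = 0.0, 0, 0
    while True:
        rates = (alpha, beta * u, gamma * s)
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return u, s
        pick = rng.random() * total  # choose a reaction in proportion to its rate
        if pick < rates[0]:
            u += 1
        elif pick < rates[0] + rates[1]:
            u -= 1
            s += 1
        else:
            s -= 1


rng = random.Random(0)
alpha, beta, gamma = 10.0, 1.0, 0.5  # hypothetical rates
# 300 "genetically identical" cells: same rates, different random histories.
cells = [simulate_cell(alpha, beta, gamma, t_end=20.0, rng=rng) for _ in range(300)]

mean_spliced = sum(s for _, s in cells) / len(cells)
distinct_states = len(set(cells))
print("deterministic steady state (u, s):", (alpha / beta, alpha / gamma))
print("mean spliced count across simulated cells:", mean_spliced)
print("distinct (u, s) states observed:", distinct_states)
```

Every simulated cell shares the same rates, yet the cells end up with different counts. A deterministic model would assign all of them the single steady-state value, which only the population average approaches.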
The team also discovered that standard RNA velocity is limited by another assumption. “Most methods assume that all cells are essentially expressing the same major gene program, but you can imagine that cells can have different expression patterns,” stated the team. Riesenfeld pointed to the need to handle multiple processes at the same time, each active to a different extent. Untangling these processes presents a challenge.
To do so, the UChicago team used probabilistic topic modeling, a machine learning tool commonly used for identifying themes in written documents. TopicVelo organizes scRNA-seq data based on the processes in which the cells and genes are involved, rather than categorizing them by cell or gene types. These processes are determined from the data itself, rather than being based on external knowledge.
A science magazine, by comparison, is typically organized by topics such as “physics,” “chemistry,” and “astrophysics.” Gao said the team applied an organizing principle of this kind to single-cell RNA-sequencing data, grouping the data by topics such as “ribosomal synthesis,” “differentiation,” “immune response,” and “cell cycle.” They can then apply stochastic transcriptional models that are specific to each process.
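One way to picture the topic idea is to treat each cell as a mixture of gene programs and ask what share of its transcripts each program explains. The sketch below assumes the topics are already known and only estimates one cell's mixture weights with a simple expectation-maximization loop. A real topic model would also learn the topics from the data. The function name, gene lists, probabilities, and counts are all hypothetical and serve only as an illustration.

```python
def cell_topic_weights(counts, topics, n_iter=200):
    """Estimate the fraction of a cell's transcripts attributable to each topic.

    counts: dict mapping gene -> transcript count for one cell
    topics: dict mapping topic name -> dict of gene -> probability (sums to 1)
    The topics are held fixed; only the cell's mixture weights are fitted.
    """
    names = list(topics)
    weights = {k: 1.0 / len(names) for k in names}
    for _ in range(n_iter):
        new = {k: 0.0 for k in names}
        for gene, n in counts.items():
            # E-step: split this gene's transcripts among topics.
            joint = {k: weights[k] * topics[k].get(gene, 0.0) for k in names}
            norm = sum(joint.values())
            if norm == 0.0:
                continue  # gene is not explained by any topic
            for k in names:
                new[k] += n * joint[k] / norm
        # M-step: renormalize the assigned transcripts into weights.
        assigned = sum(new.values())
        weights = {k: new[k] / assigned for k in names}
    return weights


# Hypothetical topics; ACTB is shared between both programs.
topics = {
    "cell cycle": {"CCNB1": 0.4, "MKI67": 0.4, "ACTB": 0.2},
    "immune response": {"IFNG": 0.4, "GZMB": 0.4, "ACTB": 0.2},
}
cell = {"CCNB1": 30, "MKI67": 30, "IFNG": 20, "GZMB": 20, "ACTB": 25}
weights = cell_topic_weights(cell, topics)
print(weights)
```

For this toy cell the weights settle at roughly 0.6 for "cell cycle" and 0.4 for "immune response", with the shared ACTB transcripts divided between the two programs in that same proportion.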
TopicVelo then organizes these processes by topic and applies weights back onto the cells to account for the percentage of each cell’s transcriptional profile involved in each activity. This approach helps to analyze the dynamics of different processes, according to Riesenfeld.
“That’s how our project came about,” Gao said. “People from all different backgrounds helped make it happen.” Gao believes this interdisciplinary collaboration is key to advancing research in the field of cell biology and understanding the complex behaviors of cells. By combining different models and techniques, researchers can gain a more comprehensive understanding of cellular processes and ultimately make breakthroughs in various scientific fields.
“We are still working on it,” he said. “It’s not just about chemistry.”