Editor’s note: This is the first in a three-part series that examines interdisciplinary research in computer science, with a focus on applications of artificial intelligence.
Cutting-edge computer science techniques and biology meet in professor of genomics William Noble’s lab, where researchers use machine learning to identify patterns and find answers about DNA structure, protein function, gene expression, and other cellular processes.
“Machine learning tends to focus on very large data sets and very heterogeneous data sets,” Noble said. “A lot of what we’re interested in is applying existing methods to new data.”
The goal is to build an understanding of cells. Noble’s lab combines researchers in genomics and proteomics, the study of DNA and of proteins, respectively. Both are critical to developing a complete understanding of cell function.
“One of the things happening in genomics is that people are developing methods to study single cells at a time,” Noble said. “The challenge with this kind of measurement is that it’s destructive.”
Noble’s lab uses machine learning techniques to paint a more complete picture of multiple characteristics of a single cell. Noble gives the example of measuring gene expression in a single cell. Using the lab’s techniques and data from other, similar cells, researchers can draw conclusions about characteristics of that cell that were not measured directly. This expands what researchers can understand about the overall state of the cell.
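As a rough illustration of the general idea, and not the lab’s actual method, the short Python sketch below estimates an unmeasured property of one cell by averaging that property across the most similar reference cells. The function name, the toy expression values, and the choice of a nearest-neighbor average are all assumptions made for this example.

```python
import math


def impute_from_neighbors(query_expr, reference_cells, k=2):
    """Estimate an unmeasured property of one cell (hypothetical helper).

    query_expr: gene expression values measured in the cell of interest.
    reference_cells: list of (expression_values, measured_property) pairs
        from other cells in which both were measured.
    k: number of most similar reference cells to average over.
    """
    if not reference_cells or k < 1:
        raise ValueError("need at least one reference cell and k >= 1")

    # Rank reference cells by Euclidean distance between expression profiles.
    by_distance = sorted(
        reference_cells,
        key=lambda cell: math.dist(query_expr, cell[0]),
    )

    # Average the measured property over the k closest cells.
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Toy data, invented for illustration only.
reference_cells = [
    ([1.0, 0.0, 2.0], 0.9),
    ([1.1, 0.1, 1.9], 0.7),
    ([5.0, 4.0, 0.0], 0.1),
]

estimate = impute_from_neighbors([1.0, 0.1, 2.0], reference_cells, k=2)
print(estimate)
```

Here the first two reference cells have expression profiles close to the query cell, so the estimate is the average of their measured values, and the dissimilar third cell is ignored.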
Noble’s lab has used these techniques to infer the three-dimensional structure of chromosomes. This is important because the structure is related to function and helps researchers understand processes like gene regulation and replication.
The lab has also used machine learning to study the three-dimensional organization of the genome during cardiogenesis. Cardiogenesis is the formation of the heart in an embryo, when stem cells differentiate into the specific types of cells that make up the organ. This work expands what researchers know about cell differentiation and may hold clues to better understanding congenital heart defects.
Noble also leads the UW Center for Nuclear Organization and Function, a collaboration among a large group of researchers working to understand the nucleus of the cell and gene expression.
The nucleus of the cell is essentially where the shots are called. It houses the genetic material of the cell as chromosomes and directs the cell’s activities including growth, protein synthesis, and reproduction. Gene expression is how instructions in DNA become products such as proteins that carry out the functions of the cell.
“This is a collaboration between many labs with many different focuses … but all are addressed with the same technology,” Noble said.
The common thread among all of these projects, and what brings computer scientists and biologists together, is the application of machine learning to process large-scale data.
The field of computational biology has changed rapidly since Noble first entered it in graduate school. According to Noble, most of the change is technology-driven. Over the past couple of decades, computing speeds have increased dramatically, expanding the scope of the calculations that machine learning methods can readily handle and shortening the time they take. Additionally, new technology is being developed rapidly, giving researchers new tools with which to probe data for meaningful information.
Currently, Noble’s lab includes graduate and undergraduate students from genome sciences, computer science, and engineering. There are multiple projects in motion. Looking to the future, Noble says there are a lot of new possibilities he’s excited to pursue when it comes to applying machine learning methods to understand more about molecular biology.
“At any time, we have basically as many projects as we can possibly keep running,” Noble said.
Reach reporter Rhea John at firstname.lastname@example.org. Twitter: @rheamjo