FCS News


Finding Your Way Through Genes

Posted by Faculty of Computer Science on June 13, 2016 in Graduate, Students, Research, Bioinformatics & Algorithms, Faculty, News
Students Emma Sylvester (master's in Computational Biology and Bioinformatics) and Praveen Nadukkalam Ravindran (PhD in Computer Science) use a combination of network, machine learning, and visualization approaches to identify genetically distinct groups of Atlantic salmon that may occupy different habitats.

This story is the cover story of the Spring 2016 Computer Science Magazine.

When you think of all the possibilities for new discoveries that have opened up with the ability to decode the gene sequences of living organisms, you might envision some scientist in a white lab coat, surrounded by an array of test tubes.

The process of decoding DNA sequences produces an avalanche of data – and finding the meaning and knowledge hidden in that data is a challenge being tackled today by computer scientists. They’re the researchers who work with algorithms and focus on interpreting genetic data instead of the messy business of samples and test tubes.


Robert Beiko is one such researcher. Specializing in Bioinformatics, Dr. Beiko leads a team of graduate students and postdoctoral fellows in a range of projects with direct applications to real-world problems.

Biological data are noisy, confusing, and incomplete, offering up different challenges in each project this team takes on.

The genetic material of many organisms is unknown or known only in fragmentary form, and the crucial patterns sought by bioinformaticians are buried in billions of letters of DNA that have changed through billions of years of evolution. Trying to correlate a specific gene sequence with a particular trait is also complicated by how genes function: the story is rarely as simple as a one-to-one connection between a gene and any traits it might influence.



Dr. Beiko is collaborating with scientists at the Department of Fisheries and Oceans to try to understand the North Atlantic salmon; specifically, how many distinct populations exist. Although all Atlantic salmon belong to the same species, there are distinct breeding groups that rarely interbreed.

The project is a vital component of conservation strategies. Managing all salmon as a single unit could lead to the overfishing and destruction of local populations. The differences between populations may be invisible to us, but genetics can lead us to a better understanding of who breeds with whom, and where.

Policy makers have to take account of this information in order to produce effective conservation strategies. Linking genes to traits like temperature preference can also provide clues about how salmon will survive and migrate as climate change alters the environment of the North Atlantic.

Comparing the genomes of individual salmon can reveal the information needed to identify distinct breeding groups, but teasing the informative patterns out of this data is no easy task. PhD student Praveen Nadukkalam Ravindran is currently working on this challenge. He is building graph-based algorithms to identify the key similarities and differences between salmon populations. After applying his new algorithms to hundreds or thousands of salmon, Praveen and the research team use statistical methods to draw lines where one population ends and another begins, and to identify regions shared by two or more populations.
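To give a feel for what a graph-based approach to grouping individuals can look like, here is a minimal sketch. It is not the algorithm developed in Dr. Beiko's lab, which the article does not describe in detail. It assumes each fish is represented by a short list of genotype codes (0, 1, or 2 at each marker), links two fish when a simple similarity score passes a threshold, and treats each connected cluster as a candidate group. The fish names, genotype values, and the 0.8 threshold are all hypothetical.

```python
from collections import deque
from itertools import combinations


def similarity(a, b):
    """Fraction of marker positions at which two genotypes agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def build_graph(genotypes, threshold):
    """Link two fish when their genotype similarity meets the threshold."""
    graph = {name: set() for name in genotypes}
    for u, v in combinations(genotypes, 2):
        if similarity(genotypes[u], genotypes[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_groups(graph):
    """Return the connected components of the graph as sorted lists."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(members))
    return sorted(groups)


# Hypothetical toy data: four fish, six markers each.
genotypes = {
    "fish_a": [0, 0, 1, 2, 2, 0],
    "fish_b": [0, 0, 1, 2, 1, 0],
    "fish_c": [2, 1, 0, 0, 0, 2],
    "fish_d": [2, 1, 0, 0, 1, 2],
}
groups = connected_groups(build_graph(genotypes, threshold=0.8))
print(groups)
```

In this toy example the two pairs of near-identical fish end up in separate groups. Real data involve far more markers and individuals, and a hard threshold is only one of many ways to decide where one population ends and another begins.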


For generations, farmers have pursued selective breeding to improve the quality of their dairy stock. Over the years, it has been discovered that selecting for high volumes of milk production has led to a diminished reproductive longevity, which means that cattle produce for a shorter period in their lifespan. Dr. Beiko’s group is working on a collaboration with Performance Genomics Inc. (PGI) to identify gene sequences which are correlated with reproductive longevity in dairy cattle.

“We are working on a genomic selection approach, a new era in genetic improvement, which allows us to do a faster job of improving productivity as well as health, reproduction and longevity,” says Jyoti Joshi, postdoctoral fellow in Dr. Beiko’s lab. “This project is extremely promising for animal breeding industries. To improve the reproductive life of Holstein cattle and increase the overall genetic profile of the herd would increase the overall benefit of the project to the agriculture sector.”

PGI was founded from a remarkable experiment in which populations of mice were selectively bred for reproductive longevity over dozens of generations. While it might seem strange to use mice to learn about important genes in cattle, many of their genes (and our genes!) are remarkably similar, and what is true for mice is often true for other mammals as well.

The bred mice did indeed achieve a longer reproductive lifespan, and genetic analysis showed important genetic differences when these mice were compared to “normal” mice. But the effects were subtle, and a deeper genetic study with new sequencing technologies was required. Dr. Beiko’s team carried out parallel studies with genomic data from hundreds of mice and cattle, and compared the results to find the most promising leads.

A subtler understanding of the genes involved should allow for the selection of a desired trait without also having to accept a disadvantageous trait as a side effect. The team’s discoveries are currently being tested on thousands of cows, to see how well the top statistically-performing genes in their analysis line up with reality. If successful, the genes will be integrated into a test panel that farmers and companies use to decide which animals should produce the next generation of dairy cattle.


More complex still is the task of understanding the human microbiome, the ecosystem of single-celled organisms that live on and inside humans.

Bacterial DNA extracted from human gut and oral samples may contain fragments from hundreds or thousands of species, most of which have not been studied in a lab. And yet, it may be that correlations can be found with such general problems as periodontal disease and health problems associated with ageing. Such correlations might suggest specific organisms to focus on more closely.

“It may sound like needles and haystacks,” says Dr. Beiko, “but with the right tools and especially the right people, we can find crucial patterns that point toward accurate diagnostics and targeted interventions.”

He’s helping to develop a method to predict the role different microbes might be playing in the gut, based on comparisons with close genetic relatives. He’s produced a paper on the issue, which has already been cited over 300 times.

Dr. Beiko is also pursuing practical applications of the microbiome, co-leading (with Dr. Ken Rockwood, a professor of Geriatric Medicine and physician at the QEII Health Sciences Centre) a pilot survey of microbiome variability in an assisted-care facility. The research team is currently examining the links between age, frailty, and the microbiome in a population of 45 individuals who were sampled weekly for a month. Among the hundreds of microbial species the research team has detected, there are several that show correlations with patient age. Analysis is ongoing, and many more patterns remain to be discovered in this rich dataset.
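As a rough illustration of what looking for a correlation between a microbial species and patient age can mean in practice, the sketch below computes a Spearman rank correlation between age and the relative abundance of each species. The article does not say which statistical methods the team uses, so this is an assumed stand-in. The ages, species names, and abundance values are invented for the example, and the sketch ignores the fact that the real study sampled each individual repeatedly.

```python
from math import sqrt


def ranks(values):
    """Assign 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: one age and one abundance value per person.
ages = [68, 72, 75, 81, 86, 90]
abundance = {
    "species_1": [0.01, 0.02, 0.04, 0.05, 0.09, 0.12],
    "species_2": [0.30, 0.10, 0.25, 0.05, 0.20, 0.15],
}
rho = {name: spearman(ages, values) for name, values in abundance.items()}
for name, value in rho.items():
    print(f"{name}: rho = {value:.3f}")
```

A value near 1 or -1 suggests that abundance rises or falls steadily with age, while a value near 0 suggests no monotonic trend. With hundreds of species tested at once, a real analysis would also need to correct for multiple comparisons.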

The project spans a range of disciplines. Postdoc Akhilesh Dhanani, an expert in probiotics such as Lactobacillus, is leading the sample collection and DNA sequencing efforts. Michael Hall, a Master of Science (Computational Biology and Bioinformatics) student who will be starting his PhD in the fall, has developed new methods to probe the stability – or instability – of the microbiome over time.

“Genetic technologies are advancing at a bewildering rate; the rate at which DNA can be sequenced quickly and cheaply is dramatically outpacing Moore’s Law,” Dr. Beiko says. “New applications are emerging all the time in human health, biodiversity, and industry. And as more data and new data are produced at an unprecedented pace, advances in bioinformatics will be vital in making sense of the patterns buried in biological systems.”