Associate Professor John Barton has published a new article in Molecular Biology and Evolution titled “Correlated Allele Frequency Changes Reveal Clonal Structure and Selection in Temporal Genetic Data.”
Barton’s lab seeks to understand signals of natural selection in evolving populations by looking at sequence data over time. In this study, Barton and his team developed a computational model to process large amounts of biological data.
“What we’d like to be able to do is have a good idea of what a population of individuals looks like at different moments in time,” Barton said. “For viruses, this is often possible because they have very small genomes. For bacteria, this can be more difficult.”
Barton wanted to gain better insight into microbial populations using existing data. To do so, he focused on developing a computational method that infers clonal structure from changes in allele frequencies over time.
Clonal structure refers to the genetically distinct subpopulations within a population and the mutations that make each subpopulation unique. An allele frequency is the fraction of individuals in the population that carry a specific genetic variant.
“This paper was an effort to read in this frequency data and then try to figure out how the different sequences are related and group them into clades,” he said. “Clades are groups of sequences that are genetically related to each other.”
Barton and his collaborator, Yunxiao Li, developed a mathematical model to support their research process. The basis of their work was the Wright-Fisher model, a foundational model in population genetics.
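For readers unfamiliar with the Wright-Fisher model, the sketch below simulates its simplest textbook form: a fixed-size population in which each individual of the next generation inherits a variant with probability equal to that variant's current frequency. It is an illustration only, not the model from the paper. The function name, parameters, and the neutral, two-variant setup are assumptions made for this example.

```python
import random


def wright_fisher(initial_freq, population_size, generations, seed=0):
    """Simulate a neutral, two-variant Wright-Fisher population.

    Hypothetical helper for illustration; returns the variant's
    frequency at generation 0 and after each later generation.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    freqs = [initial_freq]
    for _ in range(generations):
        p = freqs[-1]
        # Each of the N offspring carries the variant with probability p
        # (binomial sampling from the previous generation).
        carriers = sum(1 for _ in range(population_size) if rng.random() < p)
        freqs.append(carriers / population_size)
    return freqs


# Example run: the frequency drifts at random around its starting value.
trajectory = wright_fisher(initial_freq=0.5, population_size=200,
                           generations=20, seed=1)
```

Because this version has no selection, any change in frequency comes from random sampling alone, which is the baseline against which signals of selection are judged.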
“What we came up with is something very simple,” Barton said. “Basically, if you look at the product of the allele frequency changes, if those alleles are present in the same sequences, they tend to be correlated.”
He found that the products of these changes tend to be positive. In other words, when the frequency of one allele increases, the frequency of the other tends to increase as well, and the same holds for decreases.
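The idea can be illustrated with a few lines of code. The sketch below sums the products of time-step frequency changes for a pair of alleles. The trajectories are invented for the example, and the scoring function is a simplification assumed here, not the statistic used in the paper. Treating the third allele as belonging to a competing subpopulation is likewise an assumption of this toy setup.

```python
def frequency_changes(trajectory):
    """Change in frequency between consecutive time points."""
    return [later - earlier for earlier, later in zip(trajectory, trajectory[1:])]


def change_product_score(traj_a, traj_b):
    """Sum of products of paired frequency changes (hypothetical score)."""
    changes_a = frequency_changes(traj_a)
    changes_b = frequency_changes(traj_b)
    return sum(x * y for x, y in zip(changes_a, changes_b))


# Invented trajectories, for illustration only.
allele_a = [0.10, 0.20, 0.35, 0.30, 0.50]
allele_b = [0.12, 0.21, 0.33, 0.29, 0.52]  # assumed to share sequences with allele_a
allele_c = [0.60, 0.50, 0.40, 0.45, 0.25]  # assumed to sit in a competing subpopulation

same_clade_score = change_product_score(allele_a, allele_b)   # positive
competing_score = change_product_score(allele_a, allele_c)    # negative
```

In this toy data, alleles a and b rise and fall together, so their score is positive, while allele c moves in the opposite direction and scores negative against a. Real data are noisier, so a single product says little on its own; the signal comes from the tendency across many time points.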
The computational method developed in this study will provide researchers with a powerful tool to infer clonal structures from data sets where only allele frequencies are available.
“Even if you don’t have any information in your data about which alleles are on the same sequence, with these correlated changes in time, you can learn about the structure,” Barton said.
This approach will also empower researchers to improve estimates of natural selection from such data.
“We’re able to analyze large data sets that wouldn’t have been possible with other methods,” Barton said. “By showing that we can apply this to microbial data, we can potentially get good results in terms of understanding particular alleles. That’s something that can give us confidence in future studies.”