Visualizing evolution as a network of shared mutations

Here, I’m showing the outcome of a basic demographic / mutational process, in which each circle represents a different genetic variant or haplotype within an evolving population. Each time step, individuals from each haplotype are born or die according to a Moran process, keeping the total population size constant. Additionally, mutations enter the population, creating new haplotypes. This simulation assumes haploid, rather than diploid, genetic structure; each individual possesses a single genotype.
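As a rough illustration of the process described above, here is a minimal Python sketch of a Moran model with mutation. It is not the code behind the visualization: the function names (`moran_step`, `simulate`), the per-birth mutation probability `mu`, and the choice to give every mutant a brand-new haplotype id (an infinite-alleles assumption) are all mine.

```python
import random
from collections import Counter

def moran_step(population, mu, next_id, rng):
    """Apply one Moran event: one birth, one death, possible mutation.

    population: list of haplotype ids, one entry per haploid individual.
    mu: probability that the newborn mutates (assumed parameter).
    next_id: id that the next new haplotype will receive.
    Returns the updated next_id.
    """
    parent = rng.choice(population)          # individual chosen to reproduce
    dying = rng.randrange(len(population))   # individual chosen to die
    child = parent
    if rng.random() < mu:                    # mutation creates a new haplotype
        child = next_id
        next_id += 1
    population[dying] = child                # replacement keeps size constant
    return next_id

def simulate(n_individuals=100, mu=0.01, steps=10_000, seed=1):
    """Run the process and return a Counter of haplotype id -> copy number."""
    rng = random.Random(seed)
    population = [0] * n_individuals         # start from a single haplotype
    next_id = 1
    for _ in range(steps):
        next_id = moran_step(population, mu, next_id, rng)
    return Counter(population)

counts = simulate()
```

Because every event pairs one birth with one death, the counts always sum to the starting population size. Recording which haplotype each mutant descended from would give the parent-child links that the network layout needs.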

Haplotypes are positioned according to a force-directed algorithm, where haplotypes act as charged particles that repel all other haplotypes, and parent-and-child haplotypes are connected by idealized springs that keep them a certain distance apart. In the resulting dynamics, there are usually a small number of high-frequency variants surrounded by their low-frequency mutational progeny. This behavior fits with quasispecies models.
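One iteration of a layout like this can be sketched in a few lines of Python. This is only an assumption about how such a scheme might look, using inverse-square repulsion between every pair of nodes and Hooke's-law springs along parent-child edges. The function name `layout_step` and the constants `repulsion`, `stiffness`, `rest_length` and `dt` are hypothetical and not taken from the visualization.

```python
import math

def layout_step(pos, edges, repulsion=1.0, stiffness=0.1,
                rest_length=1.0, dt=0.05):
    """Move every node one small step along its net force.

    pos: dict mapping node -> (x, y).
    edges: list of (parent, child) pairs joined by springs.
    Returns a new dict of positions.
    """
    force = {node: [0.0, 0.0] for node in pos}
    nodes = list(pos)

    # Charged-particle repulsion between every pair of nodes.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9   # avoid division by zero
            f = repulsion / dist ** 2
            fx, fy = f * dx / dist, f * dy / dist
            force[a][0] += fx
            force[a][1] += fy
            force[b][0] -= fx
            force[b][1] -= fy

    # Springs pull parent and child toward the rest length.
    for parent, child in edges:
        dx = pos[child][0] - pos[parent][0]
        dy = pos[child][1] - pos[parent][1]
        dist = math.hypot(dx, dy) or 1e-9
        f = stiffness * (dist - rest_length)    # positive when stretched
        fx, fy = f * dx / dist, f * dy / dist
        force[parent][0] += fx
        force[parent][1] += fy
        force[child][0] -= fx
        force[child][1] -= fy

    return {n: (pos[n][0] + dt * force[n][0],
                pos[n][1] + dt * force[n][1]) for n in pos}
```

Calling `layout_step` repeatedly lets the network relax: unconnected haplotypes drift apart, while a parent and child settle near the distance at which spring and repulsion balance.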

By chance, some haplotypes bear more offspring than others, resulting in genetic drift. Without mutation replenishing diversity, the population would eventually arrive at a single haplotype. With both genetic drift and mutation, the population reaches an equilibrium level of diversity. This diversity is often measured by the level of heterozygosity H, equal to the chance that two randomly selected individuals in the population carry different haplotypes. The expected level of heterozygosity is equal to \( \dfrac{\theta}{\theta+1} \), where θ is equal to the population-scaled mutation rate 2 N μ for haploid populations. This is two times the number of new mutations entering the population each generation.
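To make the comparison between observed and expected diversity concrete, here is a small Python sketch. It takes observed heterozygosity to be one minus the sum of squared haplotype frequencies, which is the probability that two individuals drawn with replacement carry different haplotypes. The function names are my own, and the example numbers are made up for illustration.

```python
def observed_heterozygosity(counts):
    """Probability that two individuals drawn with replacement differ.

    counts: iterable of copy numbers, one per haplotype.
    """
    counts = list(counts)
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def expected_heterozygosity(n_individuals, mu):
    """Equilibrium expectation theta / (theta + 1), with theta = 2 N mu."""
    theta = 2 * n_individuals * mu
    return theta / (theta + 1)

# Two equally common haplotypes: half of all random pairs differ.
h_obs = observed_heterozygosity([50, 50])

# N = 100 and mu = 0.005 give theta = 1.
h_exp = expected_heterozygosity(100, 0.005)
```

In a single run the observed value fluctuates around the expectation rather than matching it, since drift keeps reshuffling haplotype frequencies.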

We can also measure diversity in terms of the total number of distinct haplotypes k in the population. The expected number of haplotypes can also be calculated from θ and is equal to \( \sum_{i=1}^n \dfrac{\theta}{\theta+i-1} \), where n is the number of individuals sampled. The visualization shows expected and observed heterozygosities and variant counts as the simulation proceeds. It’s also possible to calculate the full probability distribution of observing \( \eta_i \) individuals possessing haplotype i. This is known as the Ewens sampling formula.
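Both quantities are easy to compute directly. The sketch below evaluates the sum for the expected number of haplotypes, and the Ewens sampling formula in its usual partition form, \( \dfrac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_j \dfrac{\theta^{a_j}}{j^{a_j}\, a_j!} \), where \( a_j \) is the number of haplotypes carried by exactly j individuals. The function names are my own, and the partition notation \( a_j \) is introduced here for convenience rather than taken from the text above.

```python
import math
from collections import Counter

def expected_haplotype_count(theta, n):
    """Sum over i = 1..n of theta / (theta + i - 1)."""
    return sum(theta / (theta + i - 1) for i in range(1, n + 1))

def ewens_probability(haplotype_counts, theta):
    """Probability of an unlabeled configuration under the Ewens formula.

    haplotype_counts: copy numbers per haplotype, e.g. [3, 1, 1].
    """
    n = sum(haplotype_counts)
    # a[j] = number of haplotypes present in exactly j individuals
    a = Counter(haplotype_counts)
    rising = math.prod(theta + i for i in range(n))  # theta (theta+1) ...
    prob = math.factorial(n) / rising
    for j, a_j in a.items():
        prob *= theta ** a_j / (j ** a_j * math.factorial(a_j))
    return prob

# With theta = 1 and n = 3 the possible configurations are
# [1, 1, 1], [2, 1] and [3]; their probabilities sum to one.
configs = [[1, 1, 1], [2, 1], [3]]
probs = [ewens_probability(c, 1.0) for c in configs]
k_from_formula = sum(len(c) * p for c, p in zip(configs, probs))
k_from_sum = expected_haplotype_count(1.0, 3)
```

Averaging the number of haplotypes over the full distribution gives the same value as the closed-form sum, which is a handy sanity check on both functions.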

These results represent the mutational corollary to the genealogical process represented by the Kingman coalescent.

Press H to bring up a listing of commands.