Gene conversion as a source of nucleotide diversity in Plasmodium falciparum

Nielsen KM, Kasper J, Choi M, Bedford T, Kristiansen K, Wirth DF, Volkman SK, Lozovsky ER, Hartl DL. 2003. Mol Biol Evol 20: 726-734.

Abstract

Examination of polymorphisms in the Plasmodium falciparum gene for falcipain 2 revealed that this gene is one of two paralogs separated by 10.8 kb in chromosome 11. We designate the annotated gene chr11.gen424 as encoding falcipain 2A and the annotated gene chr11.gen427 as encoding falcipain 2B. The paralogs are 96% identical at the nucleotide level and 93% identical at the amino acid level. The consensus sequences differ in 31/309 synonymous sites and 45/1140 nonsynonymous sites, including three amino acid replacements (V393I, A400P, and Q414E) that are near the catalytic site and that may affect substrate affinity or specificity. In six reference isolates, among 36 synonymous sites and 46 nonsynonymous sites that are polymorphic in the gene for falcipain 2A, falcipain 2B, or both, significant spatial clustering is observed. All but one of the polymorphisms appear to result from gene conversion between the paralogs. The estimated rate of gene conversion between the paralogs may be as much as 1,400 to 1,700 times greater than the rate of mutation. Owing to gene conversion, one of the falcipain 2A alleles is more similar to the falcipain 2B alleles than it is to other falcipain 2A alleles. Divergence among the synonymous sites suggests that the paralogous genes last shared a common ancestor 15.2 MYA, with a range of 8.8 to 20.6 MYA. During this period, the paralogs have acquired 0.10 synonymous substitutions per synonymous site in the coding region. The 5′ and 3′ flanking regions differ in 47.7% and 39.8% of the nucleotide sites, respectively. Hence synonymous sites and flanking regions are not conserved in sequence in spite of their high AT content and T skew.
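
The arithmetic behind the divergence figures can be checked with a short sketch. It recomputes the per-site proportions from the counts quoted above and then, assuming a simple molecular clock in which divergence accumulates along both lineages (d = 2 x rate x T), backs out the per-lineage synonymous rate implied by the stated dates. That clock model, the function names, and the implied rate are illustrative assumptions; the abstract does not state the rate or the estimation method the authors used.

```python
# Counts quoted in the abstract: differences between the paralog consensus sequences.
SYN_DIFF, SYN_SITES = 31, 309
NONSYN_DIFF, NONSYN_SITES = 45, 1140

# Synonymous divergence and divergence times quoted in the abstract.
D_SYN = 0.10                          # substitutions per synonymous site
T_POINT, T_MIN, T_MAX = 15.2e6, 8.8e6, 20.6e6   # years


def proportion(diff: int, sites: int) -> float:
    """Uncorrected proportion of differing sites."""
    return diff / sites


def implied_rate(d: float, t_years: float) -> float:
    """Per-lineage rate under the ASSUMED clock d = 2 * rate * t (hypothetical helper)."""
    return d / (2.0 * t_years)


p_syn = proportion(SYN_DIFF, SYN_SITES)
p_nonsyn = proportion(NONSYN_DIFF, NONSYN_SITES)

rate_point = implied_rate(D_SYN, T_POINT)
rate_high = implied_rate(D_SYN, T_MIN)   # younger split implies a faster rate
rate_low = implied_rate(D_SYN, T_MAX)    # older split implies a slower rate

print(f"synonymous proportion:    {p_syn:.4f}")
print(f"nonsynonymous proportion: {p_nonsyn:.4f}")
print(f"implied rate (per site per year): {rate_point:.2e} "
      f"(range {rate_low:.2e} to {rate_high:.2e})")
```

The synonymous proportion comes out close to the 0.10 substitutions per synonymous site reported in the abstract, and the nonsynonymous proportion is well under half of it. The implied rate is only a consistency check under the stated clock assumption, not a value reported by the authors.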