Fig. 1

Retroviral genome invasion and the fate of endogenous retrovirus (ERV) loci in the germline. The three panels show schematic diagrams illustrating how the distribution of ERVs is influenced by: a host phylogeny; b activity of ERV lineages within the gene pool; c patterns of ERV locus inheritance within populations of host species. Panel a shows how ERV lineages originate when infection of an ancestral species by an ancient retrovirus causes a ‘germline colonisation’ event, in which a retroviral provirus is integrated into the nuclear genome of a germline cell that then goes on to develop into a viable organism. This ‘founder’ ERV provirus can subsequently generate further copies within the germline (panel b). The fate of individual ERV loci is determined by selective forces at the level of the host population. Most ERV loci are quickly eliminated from the germline via selection or drift. However, some may increase in frequency from one host generation to the next, to the point where they become genetically ‘fixed’ (i.e. they occur in all members of the species). The schematic in panel c illustrates this in a simplified way, showing an ERV locus (copy x) becoming fixed over several host generations. As shown in panel a, fixed ERV loci persist in the host germline as ‘footprints’ of ERV activity, and the identification of orthologous ERV loci in multiple species indicates that the ERV was inserted before those species diverged. Thus, when host divergence dates have been estimated, they can be used to infer minimum ages for orthologous ERV loci. Importantly, extinction of host lineages eliminates swathes of ERV loci. In rare cases, however, their sequences may still be recoverable (e.g. see [79]). Abbreviations: ERV endogenous retrovirus, NWM New World monkeys, OWM Old World monkeys