Evolutionary biology studies usually concentrate on a limited subset of species within a clade, overlooking numerous lineages because they are extinct, unknown, or simply not included in the analysis. Given that most species are extinct or unknown, these overlooked lineages (or “ghosts”) are thought to represent the vast majority of lineages.
When studying horizontal evolutionary processes, such as endosymbiosis, horizontal gene transfer or introgression, overlooking these ghost lineages can have a substantial impact, because gene flow is likely to involve them as donors or recipients. Interestingly, this possible confounding effect of ghosts has so far been neglected, or at least minimized.
Using simulations, we characterized and quantified the impact of ghost lineages on popular methods used to detect horizontal gene flow at various evolutionary scales [1,2]. We show that, under the mild assumption that unsampled taxa far outnumber sampled ones, both the donor and the recipient of gene flow can be misidentified in many cases, and the conclusions of various studies can become unfounded or even reversed.
To end on a more positive note, we will present recent results showing that ghost lineages do not simply deceive gene flow detection methods: they may leave a signature in phylogenetic trees that can be detected and used to predict, along the branches of a species tree, the number of these unseen (ghost) lineages.
[1] Théo Tricou, Eric Tannier, Damien M. de Vienne. 2022. Ghost lineages can invalidate or even reverse findings regarding gene flow. PLoS Biology. 20(9):e3001776.
[2] Théo Tricou, Eric Tannier, Damien M. de Vienne. 2022. Ghost lineages highly influence the interpretation of introgression tests. Systematic Biology. 71(5):1147–1158.