A third approach attempted to minimize the number of regions removed while also minimizing signals of mosaicism and homoplasy. Calibration of priors can be performed using other coronaviruses (SARS-CoV, MERS-CoV and HCoV-OC43), but estimated rates vary with the timescale of sample collection. Influenza viruses reassort (ref. 17) but they do not undergo homologous recombination within RNA segments (refs. 18,19), meaning that origins questions for influenza outbreaks can always be reduced to origins questions for each of influenza's eight RNA segments. A counting renaissance: combining stochastic mapping and empirical Bayes to quickly detect amino acid sites under positive selection. However, on closer inspection, the relative divergences in the phylogenetic tree (Fig. & Li, X. Cross-species transmission of the newly identified coronavirus 2019-nCoV. A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence. The red and blue boxplots represent the divergence time estimates for SARS-CoV-2 (red) and the 2002-2003 SARS-CoV (blue) from their most closely related bat virus, with the light- and dark-colored versions based on the HCoV-OC43 and MERS-CoV centered priors, respectively. Here, we analyse the evolutionary history of SARS-CoV-2 using available genomic data on sarbecoviruses. Region C showed no PI signals within it. Conservatively, we combined the three BFRs >2 kb identified above into non-recombining region 1 (NRR1). A novel bat coronavirus closely related to SARS-CoV-2 contains natural insertions at the S1/S2 cleavage site of the Spike protein.
While there is involvement of other mammalian species (specifically pangolins for SARS-CoV-2) as a plausible conduit for transmission to humans, there is no evidence that pangolins are facilitating adaptation to humans. This long divergence period suggests there are unsampled virus lineages circulating in horseshoe bats that have zoonotic potential due to the ancestral position of the human-adapted contact residues in the SARS-CoV-2 RBD. Forni, D., Cagliani, R., Clerici, M. & Sironi, M. Molecular evolution of human coronavirus genomes. Discovery and genetic analysis of novel coronaviruses in least horseshoe bats in southwestern China. Eight other BFRs <500 nt were identified, and the regions were named BFR A–J in order of length. Divergence time estimates based on the three regions/alignments where the effects of recombination have been removed. Because there is no single accepted method of inferring breakpoints and identifying clean subregions with high certainty, we implemented several approaches to identifying three classic statistical signals of recombination: mosaicism, phylogenetic incongruence and excessive homoplasy (ref. 51). Posterior means (horizontal bars) of patristic distances between SARS-CoV-2 and its closest bat and pangolin sequences, for the spike protein's variable loop region and CTD region excluding the variable loop. Epidemiology, genetic recombination, and pathogenesis of coronaviruses. This new approach classifies the newly sequenced genome against all the diverse lineages present instead of a select set of representative sequences. Are pangolins the intermediate host of the 2019 novel coronavirus (SARS-CoV-2)?
While such models have recently been made available, we lack the information to calibrate the rate decline over time (for example, through internal node calibrations; ref. 44). is funded by The National Natural Science Foundation of China Excellent Young Scientists Fund (Hong Kong and Macau; no. 2 Lack of root-to-tip temporal signal in SARS-CoV-2. Among the 68 sequences in the aligned sarbecovirus sequence set, 67 show evidence of mosaicism (all Dunn–Šidák-corrected P < 4 × 10⁻⁴; 3SEQ, ref. 14), indicating involvement in homologous recombination either directly with identifiable parentals or in their deeper shared evolutionary history, that is, due to shared ancestral recombination events. The pangolin coronaviruses show lower similarity to SARS-CoV-2 than bat coronavirus RaTG13 across the whole genome, but higher similarity in the spike receptor-binding domain. To gauge the length of time this lineage has circulated in bats, we estimate the time to the most recent common ancestor (TMRCA) of SARS-CoV-2 and RaTG13. The S1 protein of Pangolin-CoV is much more closely related to SARS-CoV-2 than to RaTG13. Because 3SEQ identified ten BFRs >500 nt, we used GARD's (v.2.5.0) inference on 10, 11 and 12 breakpoints. 4), that region and shorter BFRs were not included in combined putative non-recombinant regions. SARS-CoV-2 genetic lineages in the United States are routinely monitored through epidemiological investigations, virus genetic sequence-based surveillance, and laboratory studies.
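The Dunn–Šidák correction mentioned for the mosaicism P values can be illustrated with a short sketch. It assumes m independent tests and uses made-up raw P values. The function names are hypothetical, and this is not the 3SEQ implementation, which handles its own multiple-comparison bookkeeping.

```python
def dunn_sidak(p_values):
    """Return Dunn-Sidak-adjusted p-values, assuming m independent tests.

    Adjusted p = 1 - (1 - p)^m, where m is the number of tests.
    """
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


def significant(p_values, alpha=0.05):
    """Flag which tests remain significant after the adjustment."""
    return [p_adj < alpha for p_adj in dunn_sidak(p_values)]


# Hypothetical raw p-values for illustration only (not from the paper).
raw = [1e-6, 0.001, 0.04]
adjusted = dunn_sidak(raw)
flags = significant(raw)
```

With three tests, a raw P value of 0.04 no longer passes a 0.05 threshold after adjustment, while very small P values are barely affected.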
It is RaTG13 that is more divergent in the variable-loop region (Extended Data Fig. For the HCoV-OC43, MERS-CoV and SARS datasets we specified flexible skygrid coalescent tree priors. Region B is 5,525 nt long. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins. However, the coronavirus isolated from pangolin is 99% similar in a specific region of the S protein. wrote the first draft of the manuscript, and all authors contributed to manuscript editing. We find that the sarbecoviruses (the viral subgenus containing SARS-CoV and SARS-CoV-2) undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. The consistency of the posterior rates for the different prior means also implies that the data do contribute to the evolutionary rate estimate, despite the fact that a temporal signal was visually not apparent (Extended Data Fig. A hypothesis of snakes as intermediate hosts of SARS-CoV-2 was posited during the early epidemic phase (ref. 54), but we found no evidence of this (refs. 55,56); see Extended Data Fig. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus. Nat Microbiol 5, 1408–1417 (2020). All three approaches to removal of recombinant genomic segments point to a single ancestral lineage for SARS-CoV-2 and RaTG13.
However, inconsistency in the nomenclature limits uniformity in its epidemiological understanding. From this perspective, it may be useful to perform surveillance for more closely related viruses to SARS-CoV-2 along the gradient from Yunnan to Hubei. Temporal signal was tested using a recently developed marginal likelihood estimation procedure (ref. 41; Supplementary Table 1). Boni, M. F., Posada, D. & Feldman, M. W. An exact nonparametric method for inferring mosaic structure in sequence triplets. The most parsimonious explanation for these shared ACE2-specific residues is that they were present in the common ancestors of SARS-CoV-2, RaTG13 and Pangolin Guangdong 2019, and were lost through recombination in the lineage leading to RaTG13. Host ecology determines the dispersal patterns of a plant virus. The DRAGEN COVID Lineage App performs k-mer based detection, mapping/alignment, variant calling, consensus sequence generation, and lineage/clade analysis using Pangolin and NextClade. 4), but also by markedly different evolutionary rates. (Yes, Pango is a tongue-in-cheek reference to pangolins, which were briefly suspected to have had a role in the coronavirus's origin.) Viral metagenomics revealed Sendai virus and coronavirus infection of Malayan pangolins (Manis javanica). We used TreeAnnotator to summarize posterior tree distributions and annotated the estimated values to a maximum clade credibility tree, which was visualized using FigTree.
b, Similarity plot between SARS-CoV-2 and several selected sequences including RaTG13 (black), SARS-CoV (pink) and two pangolin sequences (orange). This underscores the need for a global network of real-time human disease surveillance systems, such as that which identified the unusual cluster of pneumonia in Wuhan in December 2019, with the capacity to rapidly deploy genomic tools and functional studies for pathogen identification and characterization. The genetic distances between SARS-CoV-2 and RaTG13 (bottom) demonstrate that their relationship is consistent across all regions except for the variable loop. Specifically, using a formal Bayesian approach (ref. 42; see Methods), we estimate a fast evolutionary rate (0.00169 substitutions per site yr⁻¹, 95% highest posterior density (HPD) interval (0.00131,0.00205)) for SARS viruses sampled over a limited timescale (1 year), a slower rate (0.00078 (0.00063,0.00092) substitutions per site yr⁻¹) for MERS-CoV on a timescale of about 4 years and the slowest rate (0.00024 (0.00019,0.00029) substitutions per site yr⁻¹) for HCoV-OC43 over almost five decades. Next, we (1) collected all breakpoints into a single set, (2) complemented this set to generate a set of non-breakpoints, (3) grouped non-breakpoints into contiguous BFRs and (4) sorted these regions by length. Identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in China. It compares the new genome against the large, diverse population of sequenced strains. Open reading frames are shown above the breakpoint plot, with the variable-loop region indicated in the S protein. Region A has been shortened to A′ (5,017 nt) based on potential recombination signals within the region. SARS-CoV-2 and RaTG13 are also exceptions because they were sampled from Hubei and Yunnan, respectively.
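The four-step derivation of breakpoint-free regions (collect breakpoints, complement, group into contiguous runs, sort by length) can be sketched as follows. This is a minimal illustration, assuming breakpoints are single 1-based alignment positions. The function name and the toy coordinates are hypothetical and are not taken from the authors' pipeline.

```python
def breakpoint_free_regions(breakpoints, alignment_length):
    """Return contiguous breakpoint-free regions as (start, end), longest first.

    Positions are 1-based and inclusive; each breakpoint is treated as a
    single alignment position (an assumption made for this sketch).
    """
    bp = set(breakpoints)                      # (1) all breakpoints in one set
    regions = []
    start = None
    for pos in range(1, alignment_length + 1):
        if pos in bp:                          # a breakpoint closes the open region
            if start is not None:
                regions.append((start, pos - 1))
                start = None
        elif start is None:                    # (2)+(3) non-breakpoint opens a region
            start = pos
    if start is not None:                      # close the final region
        regions.append((start, alignment_length))
    # (4) sort by region length, longest first
    regions.sort(key=lambda r: r[1] - r[0] + 1, reverse=True)
    return regions


# Toy example with made-up positions.
toy_regions = breakpoint_free_regions([5, 6, 12], alignment_length=20)
```

In the toy example the three regions come back ordered by length, which mirrors how the longest regions are the first candidates for phylogenetic analysis.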
pango-designation: repository for suggesting new lineages that should be added to the current scheme. pangolin: software package for assigning SARS-CoV-2 genome sequences to global lineages. Humans' selfish, speciesist treatment of these animals could be the very reason why the novel coronavirus exists. Furthermore, the other key feature thought to be instrumental in the ability of SARS-CoV-2 to infect humans, a polybasic cleavage site insertion in the S protein, has not yet been seen in another close bat relative of the SARS-CoV-2 virus. GARD identified eight breakpoints that were also within 50 nt of those identified by 3SEQ. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. Relevant bootstrap values are shown on branches, and grey-shaded regions show sequences exhibiting phylogenetic incongruence along the genome. & Bedford, T. MERS-CoV spillover at the camel–human interface. Concurrent evidence also proposed pangolins as a potential intermediate species for SARS-CoV-2 emergence and suggested them as a potential reservoir species (refs. 11–13). There is an approximately 90% genome sequence match between SARS-CoV-2 and a coronavirus found in pangolins.
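The comparison of GARD and 3SEQ breakpoints within a 50 nt window can be sketched as a simple tolerance match. The function name and all positions below are hypothetical and chosen only to show the idea. The real breakpoint coordinates are not reproduced here.

```python
def concordant_breakpoints(set_a, set_b, tolerance=50):
    """Return breakpoints from set_a lying within `tolerance` nt of any in set_b."""
    return [
        a for a in sorted(set_a)
        if any(abs(a - b) <= tolerance for b in set_b)
    ]


# Made-up positions standing in for the output of two detection methods.
method_one = [1000, 5000, 9000]
method_two = [1030, 5100, 8960]
shared = concordant_breakpoints(method_one, method_two)
```

Here the first and third positions are matched (30 nt and 40 nt apart), while the second is not (100 nt apart).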
Bayesian evaluation of temporal signal in measurably evolving populations. & Andersen, K. G. The evolution of Ebola virus: insights from the 2013–2016 epidemic. Several of the recombinant sequences in these trees show that recombination events do occur across geographically divergent clades. To avoid artefacts due to recombination, we focused on NRR1 and NRR2 and the recombination-masked alignment NRA3 to infer time-measured evolutionary histories. We compiled a set of 69 SARS-CoV genomes including 58 sampled from humans and 11 sampled from civets and raccoon dogs. Individual sequences such as RpShaanxi2011, Guangxi GX2013 and two sequences from Zhejiang Province (CoVZXC21/CoVZC45), as previously shown (refs. 22,25), have strong phylogenetic recombination signals because they fall on different evolutionary lineages (with bootstrap support >80%) depending on what region of the genome is being examined. This produced non-recombining alignment NRA3, which included 63 of the 68 genomes. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic.
Gray inset shows majority rule consensus trees with mean posterior branch lengths for the two regions, with posterior probabilities on the key nodes showing the relationships among SARS-CoV-2, RaTG13, and Pangolin 2019. Boxes show 95% HPD credible intervals. Uncertainty measures are shown in Extended Data Fig. Our most conservative approach attempted to ensure that putative NRRs had no mosaic or phylogenetic incongruence signals. The extent of sarbecovirus recombination history can be illustrated by five phylogenetic trees inferred from BFRs or concatenated adjacent BFRs (Fig. Further information on research design is available in the Nature Research Reporting Summary linked to this article. = 0.00075 and one with a mean of 0.00024 and s.d. Schierup, M. H. & Hein, J. Recombination and the molecular clock. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. In our second stage, we wanted to construct non-recombinant regions where our approach to breakpoint identification was as conservative as possible. When the first genome sequence of SARS-CoV-2, Wuhan-Hu-1, was released on 10 January 2020 (GMT) on Virological.org by a consortium led by Zhang (ref. 6), it enabled immediate analyses of its ancestry. 3 Priors and posteriors for evolutionary rate of SARS-CoV-2. Although the human ACE2-compatible RBD was very likely to have been present in a bat sarbecovirus lineage that ultimately led to SARS-CoV-2, this RBD sequence has hitherto been found in only a few pangolin viruses. DOI: https://doi.org/10.1038/s41564-020-0771-4.
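The 95% HPD intervals referred to in the figure captions can be approximated from posterior samples as the shortest interval that contains the requested fraction of draws. The sketch below is a generic illustration of that idea with made-up sample values. The function name is hypothetical, and it is not a description of how BEAST or its companion tools compute the interval.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    # Number of samples the interval must cover; the small epsilon guards
    # against floating-point overshoot in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Made-up posterior draws with a tight cluster and a few outliers.
draws = [0, 10, 11, 12, 13, 50, 90, 100]
interval = hpd_interval(draws, mass=0.5)
```

Unlike an equal-tailed interval, the window settles on the densest part of the sample, which matters for skewed posteriors such as divergence dates.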
Removal of five sequences that appear to be recombinants and two small subregions of BFR A was necessary to ensure that there were no phylogenetic incongruence signals among or within the three BFRs. Visual exploration using TempEst (ref. 39) indicates that there is no evidence for temporal signal in these datasets (Extended Data Fig. Note that breakpoints can be shared between sequences if they are descendants of the same recombination events. Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628). By mid-January 2020, the virus was spreading widely within Hubei province and by early March SARS-CoV-2 was declared a pandemic (ref. 8). Due to the absence of temporal signal in the sarbecovirus datasets, we used informative prior distributions on the evolutionary rate to estimate divergence dates. c, Maximum likelihood phylogenetic trees rooted on a 2007 virus sampled in Kenya (BtKy72; root truncated from images), shown for five BFRs of the sarbecovirus alignment. Novel Coronavirus (2019-nCoV) Situation Report 1, 21 January 2020 (World Health Organization, 2020). These means are based on the mean rates estimated for MERS-CoV and HCoV-OC43, respectively, while the standard deviations are set ten times higher than empirical values to allow greater prior uncertainty and avoid strong bias (Extended Data Fig. In region A, we removed subregion A1 (nt positions 3,872–4,716 within region A) and subregion A4 (nt 1,642–2,113) because both showed PI signals with other subregions of region A. The shaded region corresponds to the S protein.
Effect of closure of live poultry markets on poultry-to-person transmission of avian influenza A H7N9 virus: an ecological study. The canine viral genome was excluded from the Bayesian phylogenetic analyses because temporal signal analyses (see below) indicated that it was an outlier. Of the nine breakpoints defining these ten BFRs, four showed phylogenetic incongruence (PI) signals with bootstrap support >80%, adopting previously published criteria on using a combination of mosaic and PI signals to show evidence of past recombination events (ref. 19). The idea is that pangolins carrying the virus, SARS-CoV-2, came into contact with humans. & Boni, M. F. Improved algorithmic complexity for the 3SEQ recombination detection algorithm. Holmes, E. C. The Evolution and Emergence of RNA Viruses. & Muhire, B. RDP4: Detection and analysis of recombination patterns in virus genomes. Accurate estimation of ages for deeper nodes would require adequate accommodation of time-dependent rate variation. The estimated divergence times for the pangolin virus most closely related to the SARS-CoV-2/RaTG13 lineage range from 1851 (1730–1958) to 1877 (1746–1986). There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. As a proxy, it would be possible to model the long-term purifying selection dynamics as a major source of time-dependent rates (refs. 43,44,52), but this is beyond the scope of the current study. At present, we analyzed the diversity of SARS-CoV-2 viral genomes in India to understand the evolutionary patterns of viruses in the country through their Pangolin lineage and GISAID clade.
Except for specifying that sequences are linear, all settings were kept to their defaults. The authors declare no competing interests. While there is evidence of positive selection in the sarbecovirus lineage leading to RaTG13/SARS-CoV-2 (ref. Consistent with this, we estimate a concomitantly decreasing non-synonymous-to-synonymous substitution rate ratio over longer evolutionary timescales: 1.41 (1.20,1.68), 0.35 (0.30,0.41) and 0.133 (0.129,0.136) for SARS, MERS-CoV and HCoV-OC43, respectively. These are in general agreement with estimates using NRR2 and NRA3, which result in divergence times of 1982 (1948–2009) and 1948 (1879–1999), respectively, for SARS-CoV-2, and estimates of 1952 (1906–1989) and 1970 (1932–1996), respectively, for the divergence time of SARS-CoV from its closest known bat relative. 36) (RDP, GENECONV, MaxChi, Bootscan, SisScan and 3SEQ) and considered recombination signals detected by more than two methods for breakpoint identification. Researchers have found that SARS-CoV-2 in humans shares about 90.3% of its genome sequence with a coronavirus found in pangolins (Cyranoski, 2020). Divergence time estimates based on the HCoV-OC43-centred rate prior for the separate BFRs (Supplementary Table 3) show consistency in TMRCA estimates across the genome. To begin characterizing any ancestral relationships for SARS-CoV-2, NRRs of the genome must be identified so that reliable phylogenetic reconstruction and dating can be performed. Pangolin allows a user to assign the most likely lineage (Pango lineage) to SARS-CoV-2 query sequences.
With horseshoe bats currently the most plausible origin of SARS-CoV-2, it is important to consider that sarbecoviruses circulate in a variety of horseshoe bat species with widely overlapping species ranges (ref. 57). One geographic clade includes viruses from provinces in southern China (Guangxi, Yunnan, Guizhou and Guangdong), with its major sister clade consisting of viruses from provinces in northern China (Shanxi, Henan, Hebei and Jilin) as well as Hubei Province in central China and Shaanxi Province in northwestern China. The relatively fast evolutionary rate means that it is most appropriate to estimate shallow nodes in the sarbecovirus evolutionary history. & Holmes, E. C. A genomic perspective on the origin and emergence of SARS-CoV-2. Wang, H., Pipes, L. & Nielsen, R. Synonymous mutations and the molecular evolution of SARS-CoV-2 origins. Given that these pangolin viruses are ancestral to the progenitor of the RaTG13/SARS-CoV-2 lineage, it is more likely that they are also acquiring viruses from bats. Evolutionary rate estimation can be profoundly affected by the presence of recombination (ref. 50). Using the most conservative approach (NRR1), the divergence time estimate for SARS-CoV-2 and RaTG13 is 1969 (95% HPD: 1930–2000), while that between SARS-CoV and its most closely related bat sequence is 1962 (95% HPD: 1932–1988); see Fig. Zhou et al. (ref. 2) concluded from the genetic proximity of SARS-CoV-2 to RaTG13 that a bat origin for the current COVID-19 outbreak is probable. MERS-CoV data were subsampled to match sample sizes with SARS-CoV and HCoV-OC43.
We considered (1) the possibility that BFRs could be combined into larger non-recombinant regions and (2) the possibility of further recombination within each BFR. This boundary appears to be rarely crossed. Author affiliations: Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA, USA; Department of Microbiology, Immunology and Transplantation, KU Leuven, Rega Institute, Leuven, Belgium; Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China; State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, China; Department of Biology, University of Texas Arlington, Arlington, TX, USA; Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK; MRC-University of Glasgow Centre for Virus Research, Glasgow, UK. To evaluate the performance of the procedure, we confirmed that the recombination masking resulted in (1) a markedly different outcome of the PHI test (ref. 64), (2) removal of well-supported (bootstrap value >95%) incompatible splits in Neighbor-Net (ref. 65) and (3) a near-complete reduction of mosaic signal as identified by 3SEQ. All sequence data analysed in this manuscript are available at https://github.com/plemey/SARSCoV2origins. B., Weaver, S. & Sergei, L. Evidence of significant natural selection in the evolution of SARS-CoV-2 in bats, not humans.
Two exceptions can be seen in the relatively close relationship of Hong Kong viruses to those from Zhejiang Province (with two of the latter, CoVZC45 and CoVZXC21, identified as recombinants) and a recombinant virus from Sichuan for which part of the genome (region B of SC2018 in Fig. The assumption of long-term purifying selection would imply that coronaviruses are in endemic equilibrium with their natural host species, horseshoe bats, to which they are presumably well adapted. BEAST inferences made use of the BEAGLE v.3 library (ref. 68) for efficient likelihood computations. In such cases, even moderate rate variation among long, deep phylogenetic branches will substantially impact expected root-to-tip divergences over a sampling time range that represents only a small fraction of the evolutionary history (ref. 40). The origins we present in Fig. 5 (NRR1) are conservative in the sense that NRR1 is more likely to be non-recombinant than NRR2 or NRA3. While it is possible that pangolins, or another hitherto undiscovered species, may have acted as an intermediate host facilitating transmission to humans, current evidence is consistent with the virus having evolved in bats resulting in bat sarbecoviruses that can replicate in the upper respiratory tract of both humans and pangolins (refs. 25,32).