Investigate a reasonable root for a global build #26

Closed
j23414 opened this issue Oct 11, 2024 · 2 comments
@j23414
Collaborator

j23414 commented Oct 11, 2024

Context

Proximately related to #20

Example of a global tree using midpoint rooting (needs to be fixed): https://next.nextstrain.org/staging/WNV/global

@j23414 j23414 added the enhancement New feature or request label Oct 11, 2024
@DOH-LMT2303
Collaborator

Rooting of US builds
The WNV Nextstrain build for "Twenty years of WNV in the Americas" is rooted on the 1998 Israel sequence AF481864. This is most likely because the virus first detected in the Americas (the 1999 New York outbreak) was most closely related to the Israel 1998 isolate. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822705/

Thus, the Israel sequence could serve as the root for a WA or other US build. However, it might not be the most appropriate root for a global build: WNV is believed to have originated in Africa, where it was first discovered in Uganda in 1937 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10772404/).

The earliest sequence available in GenBank is dated 1931 from Illinois, which might be a date error, since WNV wasn't detected in the US until 1999 (the date also precedes the 1937 discovery in Uganda). The next earliest available sequence is from Israel in 1953.
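A minimal sketch of how such date anomalies could be flagged automatically, using only the years mentioned in this comment. The record layout, the function name `flag_suspect_dates`, and the `EXAMPLE_*` accessions are placeholders made up for illustration; they are not real GenBank entries or part of the build.

```python
# Hypothetical sanity check for collection years (not part of the actual build).
FIRST_DISCOVERY_YEAR = 1937  # WNV first discovered in Uganda
FIRST_DETECTION_BY_COUNTRY = {"USA": 1999}  # first US detection: New York, 1999


def flag_suspect_dates(records):
    """Return (accession, reason) pairs for records dated implausibly early."""
    suspects = []
    for rec in records:
        year = rec["year"]
        # Fall back to the global discovery year when a country has no entry.
        country_first = FIRST_DETECTION_BY_COUNTRY.get(rec["country"], FIRST_DISCOVERY_YEAR)
        if year < FIRST_DISCOVERY_YEAR:
            suspects.append((rec["accession"], f"{year} predates first discovery in {FIRST_DISCOVERY_YEAR}"))
        elif year < country_first:
            suspects.append((rec["accession"], f"{year} predates first detection in {rec['country']} ({country_first})"))
    return suspects


# Placeholder records mirroring the two dates discussed above.
records = [
    {"accession": "EXAMPLE_1", "country": "USA", "year": 1931},
    {"accession": "EXAMPLE_2", "country": "Israel", "year": 1953},
]

for accession, reason in flag_suspect_dates(records):
    print(accession, reason)
```

Flagged records would still need a manual check against the GenBank entry before being dropped or corrected.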

Rooting of a global tree
The most informative paper I found on the historical genomics of WNV is "Spatial and temporal dynamics of West Nile virus between Africa and Europe" (https://www.nature.com/articles/s41467-023-42185-7). According to it, the first WNV L1 (cluster 1) strain recovered in Africa is from Egypt in 1951 (GenBank AF260968). The authors also note that "clusters 2, 3, 4, 6, and 7 are rooted by the two ancient sequences from Nigeria and Senegal". Looking at Figure 1 of the paper, I think those are GQ851607.1 (Nigeria 1965) and GQ851606.1 (Senegal 1979). They add: "It is also shown that all strains within cluster 2 are rooted by the 1989 Senegalese strain (Genbank OP846971)".

@j23414
Collaborator Author

j23414 commented Oct 15, 2024

I can get behind AF260968 as the global root. I built a quick WNV tree from sequences longer than 9000 nt using MAFFT (auto) and FastTree (nt), then opened the complete tree in FigTree. In FigTree, I first midpoint rooted, then tried setting different tips as the root. AF260968 seems reasonable compared to the other reference sequences in red.

Screenshot 2024-10-15 at 1 31 14 PM

But let me know if you see anything that looks concerning in the tree. I'll start adjusting the phylogenetic build accordingly.
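For anyone wanting to reproduce the quick tree, here is a rough sketch of the steps described above: filter by length, align with MAFFT, and build a tree with FastTree. The file names and helper functions are hypothetical, the length filter counts every character in the sequence (an assumption, since the comment doesn't say how ambiguous bases were treated), and the FastTree executable may be named differently depending on the install. The rerooting itself was done interactively in FigTree and is not covered here.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_by_length(records, min_length=9000):
    """Keep sequences strictly longer than min_length nucleotides."""
    return [(h, s) for h, s in records if len(s) > min_length]


def pipeline_commands(filtered="filtered.fasta", aligned="aligned.fasta", tree="tree.nwk"):
    """Shell commands for the alignment and tree steps (file names are placeholders)."""
    return [
        f"mafft --auto {filtered} > {aligned}",  # MAFFT picks a strategy automatically
        f"FastTree -nt {aligned} > {tree}",      # -nt: nucleotide alignment
    ]


# Toy input with a small threshold, just to show the filter behaviour.
toy = ">long_seq\nACGTACGT\n>short_seq\nAC\n"
kept = filter_by_length(read_fasta(toy), min_length=5)
print([header for header, _ in kept])
for cmd in pipeline_commands():
    print(cmd)
```

The resulting Newick file can then be opened in FigTree to try different tips as the root.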
