Add Nextclade dataset(s) #48

Open
j23414 opened this issue Nov 29, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@j23414
Contributor

j23414 commented Nov 29, 2024

Context

Add a Nextclade dataset (GPC) or datasets (L, S segments).

TKTK - Write up how a nextclade dataset benefits people here...

Description

Examples

Possible solution

@j23414 j23414 added the enhancement New feature or request label Nov 29, 2024
@JoiRichi
Collaborator

Description

Lassa fever claims approximately 5,000 lives annually in West Africa and affects around 500,000 people each year. The disease has also resulted in fatal imported cases globally, emphasising the need for international health security. Unfortunately, there are no approved vaccines.

The Lassa virus (LASV), the causative agent of Lassa fever (LF), is currently categorized into seven distinct lineages circulating in specific geographic regions (Garry, 2023). Lineages 1, 2, and 3 are primarily found in Nigeria, while lineages 4 and 5 are prevalent in Sierra Leone and Mali (Garry, 2023). These lineages not only circulate in different regions but also exhibit significant variations in immune response (Buck et al., 2022) and disease outcomes (Anderson et al., 2015). For instance, Anderson et al. demonstrated that the Sierra Leonean strain tends to be more fatal than the Nigerian strains.

With increasing global travel, the risk of cross-regional transmission of these lineages is rising. Understanding the lineage responsible for a specific outbreak or patient case is critical for effective disease management. Despite the recurrent nature of Lassa fever outbreaks, specialized tools for rapid response and containment remain scarce. LF epidemics usually occur between August and March every year, so these tools are needed now more than ever.

In an effort to address this gap, we developed a tool for fast lineage assignment (Daodu et al., 2024). However, the tool is still limited in its capabilities, including the lack of a user-friendly interface. The success of the Nextclade dataset and rapid lineage assignment in managing the SARS-CoV-2 pandemic highlights the potential value of such resources for LASV control.

A dedicated LASV Nextclade dataset (GPC, L, S segments) would enable rapid lineage assignment and real-time mutation tracking, and would support vaccine and diagnostic test development, enhancing the global capacity to respond to Lassa fever outbreaks effectively.

Example(s)

https://nextstrain.org/nextclade/sars-cov-2

Possible solution

Development of a LASV Nextclade dataset (GPC, L, S segments) and rapid lineage assignment.

@j23414
Contributor Author

j23414 commented Dec 4, 2024

> we developed a tool for fast lineage assignment (Daodu et al., 2024).

To clarify, this paper seems to be focused on lineage assignment based on genomic signal in the GPC region. Supporting an L segment Nextclade dataset may require developing a novel lineage assignment method based solely on genomic information from the L sequences.

For example, the isolate LASV/H.sapiens-wt/NGA/2018/IRR_013 has GenBank accession MK117961 for the S segment and GenBank accession MK117879 for the L segment.

Thus, we classify the isolate based on the GPC genomic region (MK117961, on the S segment), which means we only have a Nextclade GPC tree. To overlay that information on the L segment tree, we would need to somehow link the S GenBank accessions to the L GenBank accessions. There are no rules in the ingest or phylogenetic workflows to link segments at the moment, as I imagine many accessions may be missing the /isolate annotation needed to link the segments reliably.

Hopefully, there's sufficient clade-defining genetic signal in the L segment region to support an L segment Nextclade tree and bypass the need for linking S GenBank accessions to L GenBank accessions, although this is complicated if there's a lot of reassortment between segments.
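
If linking does turn out to be needed, a minimal sketch of pairing S and L accessions by shared isolate name might look like the following. This is only an illustration, not an existing rule in the ingest or phylogenetic workflows: the record layout and the field names (`accession`, `isolate`, `segment`) are assumptions, `link_segments` is a hypothetical helper, and `PLACEHOLDER_1` is a made-up accession standing in for a record with no isolate annotation.

```python
from collections import defaultdict

# Hypothetical metadata records; the field names are assumptions,
# not the actual columns produced by the ingest workflow.
records = [
    {"accession": "MK117961", "isolate": "LASV/H.sapiens-wt/NGA/2018/IRR_013", "segment": "S"},
    {"accession": "MK117879", "isolate": "LASV/H.sapiens-wt/NGA/2018/IRR_013", "segment": "L"},
    # Made-up placeholder accession lacking an isolate annotation.
    {"accession": "PLACEHOLDER_1", "isolate": "", "segment": "S"},
]


def link_segments(records):
    """Pair S and L accessions that share an isolate name.

    Returns (s_to_l, unlinked): a dict mapping S accession -> L accession,
    and a list of accessions that could not be linked unambiguously.
    """
    by_isolate = defaultdict(dict)
    unlinked = []
    for rec in records:
        isolate = (rec.get("isolate") or "").strip()
        if not isolate:
            # No isolate annotation, so there is nothing to link on.
            unlinked.append(rec["accession"])
            continue
        by_isolate[isolate].setdefault(rec["segment"], []).append(rec["accession"])

    s_to_l = {}
    for segments in by_isolate.values():
        s_accs = segments.get("S", [])
        l_accs = segments.get("L", [])
        # Only link when exactly one S and one L record share the isolate name.
        if len(s_accs) == 1 and len(l_accs) == 1:
            s_to_l[s_accs[0]] = l_accs[0]
        else:
            unlinked.extend(acc for accs in segments.values() for acc in accs)
    return s_to_l, unlinked


s_to_l, unlinked = link_segments(records)
```

Records without an isolate name, or isolates with more than one candidate per segment, are reported as unlinked rather than guessed at, which gives a rough measure of how many accessions this approach would miss.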
