
SRLV Project Data

Robert J. Gifford edited this page Nov 26, 2024 · 15 revisions

Genome-Length Reference Sequences

The SRLV extension layer includes a set of genome-length SRLV reference sequences from NCBI Nucleotide, held in a source named "ncbi-refseqs-srlv".

Details for each isolate and sequence can be found in the accompanying data file.

A reference phylogeny was reconstructed by maximum likelihood using RAxML. An annotated PDF of the phylogeny is provided in this repository.

The reference phylogeny defines the following genotypes and subtypes:

  • Genotype A: A0, A1, A3, A4, A8, A18
  • Genotype B: B1, B2, B3
  • Genotype C
  • Genotype E: E1, E2
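As a rough illustration, the genotype and subtype list above can be written as a simple lookup table. This is only a sketch: the names `SRLV_SUBTYPES` and `genotype_of` are hypothetical and are not part of GLUE or of this extension.

```python
# Genotypes and subtypes as listed above.
# Genotype C has no named subtypes in the reference phylogeny.
SRLV_SUBTYPES = {
    "A": ["A0", "A1", "A3", "A4", "A8", "A18"],
    "B": ["B1", "B2", "B3"],
    "C": [],
    "E": ["E1", "E2"],
}


def genotype_of(subtype):
    """Return the genotype letter for a reference subtype, or None if it is not listed."""
    for genotype, subtypes in SRLV_SUBTYPES.items():
        if subtype in subtypes:
            return genotype
    return None
```

A lookup such as `genotype_of("A18")` resolves a subtype label to its parent genotype, while a label that is absent from the reference phylogeny yields `None`.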

Sequence Metadata

The SRLV extension layer also links metadata to each SRLV sequence. Metadata categories include:

  • Sequence information (length, publication date)
  • Taxonomic data (genotype, subtype)
  • Isolate data (host species, sampling date, location, isolation source)

Nuccore Sequences

The SRLV extension layer contains a regularly updated set of SRLV sequences downloaded from NCBI Nucleotide (GenBank), held in a source named "ncbi-nuccore-srlv". This set excludes the genome-length references listed above.

These sequences are linked to standardized metadata extracted from GenBank XML using GLUE's genbankXmlPopulator module. Where isolate data are missing from GenBank entries, we have supplemented these fields with values obtained through our SRLV origins investigation. The supplementary values are added to the Lentivirus-GLUE database using GLUE's textFilePopulator module, configured in this XML file.

Genotype and subtype assignments for all nuccore sequences are also imported from a tabular file. These assignments were calculated beforehand using the maximum likelihood-based genotyping tool provided with this extension.
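For readers who want to inspect such a table outside GLUE, the following is a minimal sketch of reading tab-delimited genotype assignments in Python. The column names (`sequenceID`, `genotype`, `subtype`) and the example rows are assumptions made for illustration. They are not taken from the actual file, and the import into GLUE itself is handled by the project's own modules rather than by code like this.

```python
import csv
import io


def load_assignments(handle):
    """Map sequence ID -> (genotype, subtype or None) from a tab-delimited table.

    The column names used here are hypothetical.
    """
    assignments = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # An empty subtype field means no subtype was assigned.
        subtype = row["subtype"].strip() or None
        assignments[row["sequenceID"].strip()] = (row["genotype"].strip(), subtype)
    return assignments


# Placeholder rows only: these are not real accessions or real results.
example = io.StringIO(
    "sequenceID\tgenotype\tsubtype\n"
    "SEQ_0001\tA\tA1\n"
    "SEQ_0002\tB\t\n"
)
table = load_assignments(example)
```

Keeping an unassigned subtype as `None` rather than an empty string makes it easy to separate sequences that have a genotype but no subtype.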


Alignment Tree

This extension layer establishes a dedicated alignment tree, rooted on bovine viruses, complementing the genus-level alignment tree created during the Lentivirus-GLUE build.

The alignment tree defines three clade categories:

  1. Subgenus: Ovine-Caprine
  2. Genotype: Follows established nomenclature, with Genotype C grouped as a subtype within the broader B-like clade. This grouping aligns with the rooted SRLV phylogeny.
  3. Subtype: Defined only for subtypes established by the complete genome phylogeny. Sequences without subtype assignments may represent unique lineages for which genome-length sequences are unavailable.
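The three clade categories can be pictured as a small nested tree. The sketch below is one reading of the description above, with Genotype C placed at subtype level under a B-like genotype-level clade. The node labels, the `Clade` class, and the `lineage` helper are all hypothetical and do not reflect the actual alignment names used in the GLUE project.

```python
from dataclasses import dataclass, field


@dataclass
class Clade:
    name: str
    category: str  # "subgenus", "genotype", or "subtype"
    children: list = field(default_factory=list)


def subtypes(names):
    """Build subtype-level leaf clades from a list of labels."""
    return [Clade(n, "subtype") for n in names]


# Illustrative labels only; the real alignment names may differ.
tree = Clade("Ovine-Caprine", "subgenus", [
    Clade("A", "genotype", subtypes(["A0", "A1", "A3", "A4", "A8", "A18"])),
    # Genotype C is grouped at subtype level within the B-like clade.
    Clade("B-like", "genotype", subtypes(["B1", "B2", "B3", "C"])),
    Clade("E", "genotype", subtypes(["E1", "E2"])),
])


def lineage(root, target):
    """Return the path of clade names from root to target, or [] if not found."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        path = lineage(child, target)
        if path:
            return [root.name] + path
    return []
```

Under these assumptions, `lineage(tree, "C")` walks from the subgenus through the B-like clade to C, which mirrors how a sequence assigned to a subtype inherits its genotype and subgenus from the tree.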
