ukhsa-collaboration/variant_definitions

Standardised Variant Definitions

This repository contains the up-to-date lineage definitions for variants of concern (VOC) and other variants (V) as curated by the UK Health Security Agency (UKHSA). They are provided to facilitate standardised VOC and V calling across sequencing sites and bioinformatics pipelines, and are the same definitions used internally at Public Health England. The mutations have been chosen to aid rapid and sensitive identification from sequence data; they are typically only a subset of the total set of mutations found in a variant.

Variant definitions are subject to change at any time. The latest release can be downloaded under the releases tab. A CHANGELOG is available. For email notifications when this repository is updated, please use the GitHub "Watch" functionality at the top right of this page.

Variant List

| Label | Lineages | Description |
| --- | --- | --- |
| VOC-20DEC-01 | PANGO: B.1.1.7, nextstrain: N501Y.V1 | This variant became widespread in the UK in the Winter of 2021 and is characterised by increased transmissibility. |
| VOC-20DEC-02 | PANGO: B.1.351, nextstrain: N501Y.V2 | This variant became widespread in countries in Southern Africa at the end of 2020 and has now been exported to a number of other countries including the UK. |
| VOC-21FEB-02 | PANGO: B.1.1.7 | This variant is a cluster of B.1.1.7 (VOC202012/01) that contains E484K and is associated with the Bristol area. |
| VOC-21JAN-02 | PANGO: P.1 | This variant was first identified in Japan in travellers from Brazil and is associated with Manaus in the Amazonas region, itself associated with a severe second wave of COVID-19. |
| V-21FEB-01 | PANGO: A.23.1 | This variant is a cluster within clade A.23.1 containing E484K observed in Liverpool. |
| V-21FEB-03 | PANGO: B.1.525 | This variant is a cluster of E484K-containing genomes. |
| V-21FEB-04 | PANGO: B.1.1.318 | This variant is a cluster of E484K-containing genomes. |
| V-21JAN-01 | PANGO: P.2 | This variant became widespread in Rio de Janeiro, Brazil and imported cases have been reported in a number of other countries including the UK. |
| V-21MAR-01 | PANGO: B.1.324.1 | First detected in the UK in a traveller from Antigua. |
| V-21MAR-02 | PANGO: P.3 | This variant appeared to be closely associated with the Philippines and widely divergent from anything else upon discovery. |
| V-21APR-01 | PANGO: B.1.617.1 | This variant is reported to be circulating in India and has been exported to other countries. |
| VOC-21APR-02 | PANGO: B.1.617.2 | This variant is reported to be circulating in India and has been exported to other countries. |
| V-21APR-03 | PANGO: B.1.617.3 | This variant is reported to be circulating in India and has been exported to other countries. |
| E484K | PANGO: Multiple | Catch-all definition to identify sequences with the E484K spike variant. |
| V-21MAY-01 | PANGO: AV.1 | This variant has been observed in a growing cluster in the UK. |
| V-21MAY-02 | PANGO: C.36.3 | This variant has been observed in a growing number of imported cases in the UK. |
| V-21JUN-01 | PANGO: C.37 | This variant is a clade first associated with South America but now observed in USA and Europe. |
| V-21JUL-01 | PANGO: B.1.621 | This variant is a clade first associated with Colombia but now seen across the Americas and Europe. |
| V-21OCT-01 | PANGO: AY.4.2 | This variant is a sublineage of Delta with spike A222V and Y145H. |
| VOC-21NOV-01 | WHO: Omicron, PANGO: B.1.1.529 | This variant is a lineage first identified in Southern Africa and has been exported to several other countries. |
| VOC-22JAN-01 | PANGO: BA.2 | This variant is a sub-lineage of B.1.1.529. |
| V-22APR-01 | PANGO: XD | This variant is a recombinant of Delta and BA.1. |
| V-22APR-02 | PANGO: XE | This variant is a recombinant of BA.1/BA.2. |
| V-22APR-03 | PANGO: BA.4 | This variant is a sub-lineage of B.1.1.529. |
| V-22APR-04 | PANGO: BA.5 | This variant is a sub-lineage of B.1.1.529. |
| V-22JUL-01 | PANGO: BA.2.75 | This variant is a sub-lineage of B.1.1.529.2 (BA.2). |
| V-22SEP-01 | PANGO: BA.4.6 | This variant is a sub-lineage of B.1.1.529.4 (BA.4). |
| V-22OCT-01 | PANGO: BQ.1 | This variant is a sub-lineage of B.1.1.529.5 (BA.5). |
| V-22OCT-02 | PANGO: XBB | This variant is a recombinant lineage of BJ.1 and BM.1.1.1. |
| V-22DEC-01 | PANGO: CH.1.1 | This variant is a sub-lineage of BA.2.75. |
| V-23JAN-01 | PANGO: XBB.1.5 | This variant is a sub-lineage of XBB. |
| V-23APR-01 | PANGO: XBB.1.16 | This variant is a sub-lineage of XBB. |
| V-23JUL-01 | PANGO: EG.5.1 | This variant is a sub-lineage of XBB (alias of XBB.1.9.2.5.1). |
| V-23AUG-01 | PANGO: BA.2.86 | This variant is a sub-lineage of BA.2 with a large number of mutations observed in several countries. |
| V-23DEC-01 | PANGO: JN.1 | This variant is a sub-lineage of BA.2.86. |
| SIM-BA3 | PANGO: BA.3 | This variant is a sub-lineage of B.1.1.529 and has not been declared as a variant by UKHSA but is defined for monitoring purposes. |
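
Most labels in the list above appear to follow a `STATUS-YYMMM-NN` pattern, for example `VOC-21APR-02`. The sketch below splits such a label into its parts. The reading of the fields (status, two-digit year, month abbreviation, running number) is an assumption inferred from the labels themselves and is not documented in this repository; `parse_label` and `LABEL_PATTERN` are hypothetical names.

```python
import re

# Assumed label shape: VOC or V, two-digit year, three-letter month, two-digit number.
LABEL_PATTERN = re.compile(r"^(VOC|V)-(\d{2})([A-Z]{3})-(\d{2})$")


def parse_label(label):
    """Split a label such as "VOC-21APR-02" into parts, or return None."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        # Labels such as "E484K" or "SIM-BA3" do not follow the pattern.
        return None
    status, year, month, index = match.groups()
    return {
        "status": status,
        "year": 2000 + int(year),
        "month": month,
        "index": int(index),
    }


print(parse_label("VOC-21APR-02"))
print(parse_label("SIM-BA3"))
```

Returning `None` for non-matching labels keeps the catch-all and monitoring entries from being misread as dated variants.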

File format definition

Each variant is stored one-per-file in the variant_yaml directory. Each file should be syntactically correct YAML and should be named after its unique-id (unique identifier) with the suffix .yml.

You can check that files are correct using the yaml-validator.py script.
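
To illustrate the kind of check involved, here is a minimal sketch that tests an already-parsed definition against the naming rule and the required top-level keys from the table below. It is not the logic of yaml-validator.py, which is not shown here. `check_definition` is a hypothetical helper, the key name `variants` is assumed from the "variants block" row, and `example-variant` is a made-up identifier. Parsing the YAML into a dictionary (for instance with a YAML library) is assumed to have happened already.

```python
import os

# Top-level keys marked "Required: yes" in the file format table.
# "variants" is an assumed key name for the "variants block" row.
REQUIRED_TOP_LEVEL = ("unique-id", "phe-label", "description", "variants")


def check_definition(filename, definition):
    """Return a list of problems found in one parsed definition."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in definition:
            problems.append(f"missing required key: {key}")
    stem, suffix = os.path.splitext(os.path.basename(filename))
    if suffix != ".yml":
        problems.append(f"unexpected suffix: {suffix!r}")
    if "unique-id" in definition and definition["unique-id"] != stem:
        problems.append("file name does not match unique-id")
    return problems


# Made-up definition used purely for illustration.
example = {
    "unique-id": "example-variant",
    "phe-label": "V-00XXX-00",
    "description": "Illustrative entry, not a real variant",
    "variants": [],
}
print(check_definition("variant_yaml/example-variant.yml", example))
```

An empty list means that none of these particular checks failed; it says nothing about the rest of the format.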

| Top-level block | Required | Type | Description |
| --- | --- | --- | --- |
| unique-id | yes | text | A unique identifier for this definition file, should never change. |
| phe-label | yes | text | The official Public Health England description for this variant, may change over time (e.g. upgrade VUI to VOC) |
| who-label | no | text | The official World Health Organisation name for this variant, if any |
| alternate-names | no | list of text | Synonyms for this variant for ease of referencing |
| belongs-to-lineage | no | list of dict | Lineage descriptions for commonly used lineage naming schemes e.g. PANGO and nextstrain |
| description | yes | text | Description of the variant |
| information-sources | no | list of URLs | Useful references (not exhaustive) to official information sources about the discovery and monitoring of this reference |
| requires | no | text | Name of YAML definition required for variant definition (i.e. new variant must be Confirmed or Probable for existing variant) |
| acknowledgements | no | list of text | Acknowledgements to people or institutions involved in creating this definition |
| curators | no | list of text | List of individuals responsible for maintaining this definition file |
| variants block | yes | list of dict | List of mutations (SNPs, insertions and deletions) defining this variant |
| amino-acid-change | no | text | Amino acid change, relative to coordinates of gene, e.g. N501Y |
| codon-change | yes | text | Codon change encoded as reference codon - alternate codon, e.g. "AAT-TAT" |
| protein-codon-position | yes | text | Codon position for codon-change according to protein annotation (not gene) |
| gene | no | text | The gene corresponding to annotations in amino-acid-change and codon-change |
| one-based-reference-position | yes | integer | The start position of the mutation, 1-based; all mutations are encoded relative to SARS-CoV-2 reference genome NC_045512.2 |
| reference-base | yes | text | The base or bases (for deletions) in reference corresponding with one-based-reference-position |
| predicted-effect | no | text | synonymous, non-synonymous, no-effect (e.g. upstream stop) |
| protein | no | text | The mature protein product (used for amino-acid-change, codon-change and protein-codon-position) |
| type | yes | text | SNP, MNP, insertion, deletion |
| variant-base | yes | text | The mutated base or bases (for insertions) in the variant at one-based-reference-position, encoded as per VCF. N bases in MNP notation to be ignored. |
| calling-definition block | no | dict | A set of dictionary labels defining 1 or more calling definitions |
| mutations-required | yes | integer | Number of mutations (SNPs or MNPs) required to call the variant |
| indels-required | yes | integer | Number of insertions or deletions from the variant definition required |
| allowed-wildtype | yes | integer | How many wild-type (reference) calls are permitted to satisfy this calling definition |
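
Two of the field groups above lend themselves to a short worked example. The sketch below first checks that an amino-acid-change agrees with its codon-change under the standard genetic code, using the table's own examples N501Y and "AAT-TAT". It then evaluates one calling definition against observed counts. The comparison rule in `satisfies` (at least the required mutations and indels, at most the allowed wild-type calls) is an assumed reading of the three fields, not a documented algorithm. Both function names and the thresholds are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def codon_change_matches(amino_acid_change, codon_change):
    """Check that e.g. "N501Y" agrees with the codon pair "AAT-TAT"."""
    ref_codon, alt_codon = codon_change.split("-")
    ref_aa, alt_aa = amino_acid_change[0], amino_acid_change[-1]
    return CODON_TABLE[ref_codon] == ref_aa and CODON_TABLE[alt_codon] == alt_aa


def satisfies(calling_definition, snp_hits, indel_hits, wildtype_calls):
    """Assumed rule: enough mutations and indels, not too many wild-type calls."""
    return (
        snp_hits >= calling_definition["mutations-required"]
        and indel_hits >= calling_definition["indels-required"]
        and wildtype_calls <= calling_definition["allowed-wildtype"]
    )


# Made-up thresholds, for illustration only.
example_definition = {
    "mutations-required": 5,
    "indels-required": 1,
    "allowed-wildtype": 1,
}

print(codon_change_matches("N501Y", "AAT-TAT"))
print(satisfies(example_definition, snp_hits=6, indel_hits=1, wildtype_calls=0))
```

Because the calling-definition block may hold more than one definition, a caller would presumably evaluate each in turn; how the results are combined or ranked is not specified in this README.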

Contact

For further information, help or assistance contact UKHSA Genomics Public Health Analysis at genomicspublichealthanalysis@ukhsa.gov.uk

Contributors

  • Matt Bull (PHW)
  • Meera Chand (UKHSA)
  • Tom Connor (PHW)
  • Nick Ellaby (UKHSA)
  • Natalie Groves (UKHSA)
  • Katri Jalava (UKHSA)
  • Nick Loman (University of Birmingham/UKHSA)
  • Richard Myers (UKHSA)
  • Sam Nicholls (University of Birmingham/UKHSA)
  • Ulf Schaefer (UKHSA)