Candida glabrata MLST nomenclature changes

It has been necessary to change some of the ST numbers in the C. glabrata MLST scheme. This came about because of an unforeseen issue: STs had been assigned, but not entered into the database, before the scheme was hosted on PubMLST. Some of these were subsequently included in the published literature.

Since this came to light, we have tried to minimise the impact and have retrospectively added the profiles that have been formally published. Where these had been assigned STs that have since been re-issued, we have had to change the ST on the later-assigned profiles.

The following timeline indicates the changes that have been made:

Transfer of alleles, 68 STs, and isolates from
STs 5 and 9 were not part of the data received, although they are mentioned in the literature. ST-5 and ST-9 have been blocked from future use.
ST numbers issued by within this timeframe were not documented and were not passed to during the data migration process.
Novel alleles and sequence types from Lott et al. 2010 (PMID 20190071) and Lott et al. 2012 (PMID 21838617) were added to the sequence database.
ST-83 in Lott et al. 2010 (PMID 20190071) is identical to ST-75 in Lott et al. 2012 (PMID 21838617). ST-83 was given preference due to the earlier publication date; consequently there is no ST-75 in the database, and the number is blocked from future use.
Novel sequence types from Amanloo et al. 2018 (PMID 28482076) were added. Because the ST numbers used in this study partially overlap with those used in Lott et al. 2010 and Lott et al. 2012, ST-71 to ST-79 (designation in Amanloo et al. 2018) are included as ST-101 to ST-109 in the database. This is noted in the respective ST records.
A total of 368 non-redundant isolates from Lott et al. 2010 (PMID 20190071, n=229) and Lott et al. 2012 (PMID 21838617, n=265) were added to the isolates database. Isolates included in these two studies partially overlap.
A total of 50 isolates from Amanloo et al. 2018 (PMID 28482076) were added to the isolates database, with ST numbering as explained above.
Added novel alleles, STs, and isolates from Bordallo-Cardona et al. 2019 (PMID 30397068) upon publication.
Added novel alleles, STs, and isolates from Achmad et al. 2019 (PMID 30455247) in parallel to publication process.
Added novel alleles, STs, and isolate information from Biswas et al. (PMID 30559734), Mushi et al. (PMID 30597052), and Bordallo-Cardona et al. (PMID 30397068) in parallel to publication process.
Retrospectively added information from Sasso et al. (PMID 29580647). Novel FKS allele "X" added to database as FKS29, ST "X" is now ST166.
Retrospectively added alleles, STs, and isolates from Healey et al. 2016 (PMID 27020939).
Added NCBI_BioProject field to isolates table.
Added novel alleles, STs, and isolates from five whole-genome-sequencing studies: Xu et al. 2016 (PMID 27713500, BioP PRJNA218162), Håvelsrud and Gaustad 2017 (PMID 28280017, BioP PRJNA297263), Vale-Silva et al. 2017 (PMID 28663342, BioP PRJNA374542), Carrete et al. 2018 (PMID 29249661, BioP PRJNA361477), and Barber et al. 2019 (PMID 30478162, BioP PRJNA483064). Novel STs for isolates “Norway 5 and 6” from Håvelsrud and Gaustad 2017 (PMID 28280017) are now ST137; novel STs for isolates “P35_2” and “P35_3” from Carrete et al. 2018 (PMID 29249661) are now ST136.
Finalized isolate data from Biswas et al. 2018 (PMID 30559734) in parallel to publication process. Labels that were misplaced in the original Figure 1 have subsequently been corrected by the authors (PMID 31608038).
Retrospectively added novel alleles, STs, and isolates from Biswas et al. 2017 (PMID 28344162, BioP PRJNA310957). Isolates CMRL-06, -07, and -08 were omitted due to ambiguous sites in our mapping obtained from data deposited at SRA.
Added isolate data from Rivero-Menendez et al. 2019 (PMID 31285229) upon publication. Consecutive isolates are indicated by patient numbers.
Reconstructed ST5 (5-7-8-1-3-6) and ST9 (1-2-2-7-2-1) from original publication (PMID 14662965) and mended records for isolates CE-02 (ST8→ST9) and CE-03 (ST3→ST5).
Added novel alleles, STs, and isolates deduced from raw data deposited in SRA by Guo et al. 2019 (PMID 31059831, BioP PRJEB20459). Isolate Y1644 “from ATCC archive, isolated from Iowa, USA” was found to be ST10, and is therefore presumed to be ATCC90030 (= database isolate 1).
Added 3 isolates from Carrete et al. 2019 (PMID 30809200; BioP PRJNA506893).
Added 3 isolates deposited in SRA from Porto (Portugal) under BioP PRJNA525402 (2019).
Added novel alleles, STs, and isolates of the CDC (USA) deduced from raw data deposited in SRA under BioP PRJNA329124 (2016) and PRJNA524686 (2019). The isolates were partially redundant, both among themselves and with those already present in the database from Lott et al. 2010, 2012 (PMID 20190071; PMID 21838617) and Healey et al. 2016 (PMID 28018323). The following modifications were made to join the datasets:
  1. Three records were omitted: SRR8697269 (CAS11-3129), which displayed a frameshift in FKS2 due to an 8 bp insertion in our assembly, SRR8697391 (CAS08-0631), which did not have sufficient sequencing depth to determine the ST, and SRR8697473 (CAS08-0629), which had no matches to Cg MLST loci (isolate might be C. parapsilosis).
  2. One novel ST derived from BioP PRJNA329124 (ST169) and 11 derived from BioP PRJNA524686 (STs 179-189) were added.
  3. In BioP PRJNA329124, isolate names are given with underscores; these were replaced by dashes to allow matching with other datasets.
  4. For three isolates, duplicate SRA entries were found: CAS08-0209, CAS08-0439, and CAS11-2978. Since the deduced STs were identical, each pair of entries was merged into a single record.
  5. Thirty-eight isolates were already present in the database by isolate name and could be traced back to the same original. Since the deduced STs were identical to those previously recorded, the SRA information was added to the pre-existing records.
  6. Six isolates (CAS08-0069, CAS08-0094, CAS08-0525, CAS08-0569, CAS08-0725, and CAS09-0869) were already present in the database by isolate name as above, but the genome sequencing-derived STs did not match those previously recorded. These datasets were introduced with the postfix "_GS" appended to the isolate name to flag the versions with genome sequencing-derived STs.
  7. In total, 26 novel isolates from BioP PRJNA329124 and 219 from BioP PRJNA524686 were added.
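
The joining rules in steps 3 to 6 above can be illustrated with a minimal Python sketch. This is not the procedure or code actually used for the database: the Record class, the function names, and all isolate names, run identifiers, and ST values in the example are hypothetical and chosen only to show the logic (normalise names, merge duplicate entries with identical STs, extend matching records, and flag mismatches with "_GS").

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified isolate record (not the PubMLST schema)."""
    isolate: str
    st: int
    sra_runs: list[str] = field(default_factory=list)


def normalise_name(name: str) -> str:
    """Step 3: replace underscores with dashes so names match other datasets."""
    return name.replace("_", "-")


def join_datasets(database: dict[str, Record],
                  incoming: list[Record]) -> dict[str, Record]:
    """Apply steps 4 to 6 and return a new name -> record mapping."""
    # Work on copies so the caller's database is left untouched.
    result = {name: Record(r.isolate, r.st, list(r.sra_runs))
              for name, r in database.items()}

    # Step 4: collapse duplicate entries for the same isolate.
    merged: dict[str, Record] = {}
    for rec in incoming:
        name = normalise_name(rec.isolate)
        if name in merged:
            if merged[name].st != rec.st:
                raise ValueError(f"conflicting STs for duplicates of {name}")
            merged[name].sra_runs.extend(rec.sra_runs)
        else:
            merged[name] = Record(name, rec.st, list(rec.sra_runs))

    for name, rec in merged.items():
        existing = result.get(name)
        if existing is None:
            # Isolate not yet in the database: add it as a new record.
            result[name] = rec
        elif existing.st == rec.st:
            # Step 5: same ST, so attach the run information to the old record.
            existing.sra_runs.extend(rec.sra_runs)
        else:
            # Step 6: ST mismatch, so keep both and flag the new one with "_GS".
            flagged = name + "_GS"
            result[flagged] = Record(flagged, rec.st, rec.sra_runs)
    return result


# Made-up example data; names, STs, and run labels are placeholders.
database = {
    "DEMO-01": Record("DEMO-01", 3),
    "DEMO-02": Record("DEMO-02", 7),
}
incoming = [
    Record("DEMO_01", 3, ["RUN_A"]),   # same ST as the database record
    Record("DEMO_02", 10, ["RUN_B"]),  # ST differs from the database record
    Record("DEMO_03", 8, ["RUN_C"]),   # duplicate entries with identical ST
    Record("DEMO_03", 8, ["RUN_D"]),
]
joined = join_datasets(database, incoming)
```

In this sketch, duplicates with conflicting STs raise an error rather than being resolved silently; how such a case would be handled in practice is not stated above and would be a curation decision.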