mlstdbNet release history
Version 2.0.0
Released: 2008-03-26
Written by Keith Jolley, © 2003-2008 University of Oxford
Jolley et al. 2004, BMC Bioinformatics, 5:86.
Features new to version 2.0
Version 2.0 has undergone a code review. Due to the increasing size of the code base, it has been reorganized into a modular structure with only the core database search functionality remaining within the main script. Other functionality has been moved into plugins - these are optional elements that can be removed completely.
- Removed need to generate stats page every day - these are now generated on-the-fly using the SimpleBreakdown plugin.
- Query speed has been improved dramatically by removing the necessity to generate a datafile containing the entire dataset for subsequent statistics display. The statistics are now generated by the SimpleBreakdown plugin using the original query when required.
- Removed support for pre-computed data files since they are no longer required.
- BURST plugin. This has been adapted from the original BURST code by Man-Suen Chan.
- Tree plugin. UPGMA or NJ trees can be generated from search results of 200 records or fewer.
- Searches can now be filtered by publication, allowing complex searches of isolates described in a paper.
- The page bar can be positioned at the top or bottom of the display (or both).
Features new to version 1.6
- ST:Allele (Jolley) plots can be generated after a search, provided ChartDirector software is installed. These charts were first used in Jolley et al. 2000, J Clin Microbiol 38: 4492-4498.
- Links to ST information from allele information pages.
- Sample sequence trace display on allele information pages (where configured).
- Searching of multiple client isolate databases from within the profiles database.
- Similarity search of alleles.
- Queries can be configured via the options page to add drop-down list boxes for searching any field.
- Support for restricted view based on the result of a SQL query that can be set for a particular database.
- Support for usergroups. The curation software can automatically set the value of a particular field in an isolate database depending on the usergroup a curator belongs to, e.g. to add country information for users belonging to a particular organisation.
- PubMed reference support has changed so that references are linked to isolates in a separate table. This allows any isolate to be associated with an unlimited number of publications. The structure of the reference database has also changed. If you upgrade from a previous version, the reference database will need to be regenerated and repopulated.
- A page bar at the bottom of the page following a query allows quick navigation of further results.
Features new to version 1.5
- Browse database - peruse all records, sorted by any field.
- Pre-computed data files (used for breakdown statistics etc.) for the entire dataset - any query that returns the whole dataset (e.g. browse functionality) can be sped up dramatically (< 1 second for 10000 isolates). Use browsePreCompute.pl <xml file> to create a file called browsePreCompute.txt and place this in the database directory within the webroot.
- Export data in extended multi-FASTA format for use in Xavier Didelot and Daniel Falush's ClonalFrame program.
- Allow composite fields to be displayed. These are fields that are made up of any number of other fields, formatted as required. This was developed to display a genotype made up of ST, clonal complex, group, and other antigen data.
- Allow the user to select which fields are displayed in the main results table following a query. This option is stored as a browser cookie so is remembered between sessions and is set from the options page.
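
As an illustration of the composite fields described in the list above, the following is a minimal Python sketch, not mlstdbNet's own code. The field names, the `$field` template syntax and the `composite_field` helper are all assumptions made for this example; the actual configuration syntax is not described here.

```python
from string import Template


def composite_field(template: str, record: dict, missing: str = "-") -> str:
    """Build a composite display value by substituting record fields into a template."""
    # Fields that are present but empty are shown as a placeholder.
    values = {
        name: (str(value) if value not in (None, "") else missing)
        for name, value in record.items()
    }
    # safe_substitute leaves any unknown $name in place instead of raising.
    return Template(template).safe_substitute(values)


# Hypothetical isolate record and genotype template (illustrative values only).
record = {"ST": 11, "clonal_complex": "ST-11 complex", "serogroup": "C", "porA": "P1.5,2"}
genotype = composite_field("$serogroup: $porA: ST-$ST ($clonal_complex)", record)
print(genotype)
```

Keeping the format in a template string means the displayed genotype can be changed without touching the underlying fields.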
Features new to version 1.4
- New curator's website script - no need to use WDBI now.
- Export data in format suitable for use in Jonathan Pritchard's Structure program.
- Search using 'not contain' clause.
- Locus explorer.
- Compare any two alleles.
Features new to version 1.3
- Database breakdown page can now utilise ChartDirector software, if installed, to improve the graphical breakdown display.
- Grouped field queries - fields can be grouped and searched together - useful, for example, where you may have multiple identifier fields.
- The number of fields that can be combined in a query can now be selected by a user and stored as a cookie.
- The order of fields within the database no longer has to be in sync with the XML description of the database. This means that new fields can be readily added to a database and be conveniently ordered by the position specified in the XML file.
- Separate stylesheets, headers and footers can be used for each database on the system. Place files named stylesheet.css, header.html and footer.html into the database directory within the web site (i.e. within /WEBROOT/DBASE NAME/).
Features new to version 1.2
- Paged results for faster display - page size is user-selectable and a default can be set and saved using a browser cookie.
- Multiple views - different database views can be set for different users depending on their log-on details. This can restrict entries to certain individuals.
- Tree-drawing can be disabled by a user with a browser cookie or for individual databases where appropriate.
- Trace files can be made available for download or display (you will need to install BCM Trace Viewer).
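
The paged results introduced in this version can be pictured with a short Python sketch. This is a sketch under stated assumptions rather than the mlstdbNet implementation: the `page_bounds` helper and its parameters are hypothetical, and it only shows how a user-selected page size maps to a slice of the result set.

```python
import math


def page_bounds(total: int, page_size: int, page: int) -> tuple:
    """Return (offset, limit) for a 1-based page number over `total` records."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    # At least one page exists, even for an empty result set.
    pages = max(1, math.ceil(total / page_size))
    # Clamp out-of-range page numbers to the first or last page.
    page = min(max(page, 1), pages)
    offset = (page - 1) * page_size
    limit = min(page_size, max(total - offset, 0))
    return offset, limit


# Example: 53 matching isolates shown 25 per page; the third page holds the last 3.
print(page_bounds(53, 25, 3))
```

Only the records inside the returned slice need to be formatted for display, which is why paging makes large result sets faster to show.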
Features introduced with version 1.1
- Compatible with mod_perl for enhanced speed and reduced system overhead.
- Loci can have variable length sequences for MLST schemes that contain indels. When performing allele queries on such loci, a BLAST search will be done if an exact match is not found and the query sequence length is different from the standard for that locus (otherwise the standard query will be done).
- Profile and allele sequences in multiple formats are generated on-the-fly for download. This removes the need for some auxiliary scripts.
- Fully configurable by a single settings file. Certain functionality can be switched off if some third-party modules/programs are not installed or desired, e.g. BioPerl, EMBOSS, BLAST, E-mail export.
- Allele concatenation for both profile and isolate databases.
- Drop-down boxes for querying individual fields with option lists can be enabled in the XML database description.
- Debug mode.
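
The allele query behaviour for variable-length loci described in the list above can be summarised as a small decision function. This Python sketch is illustrative only: the function name, the return values and the in-memory dictionary of alleles are assumptions, and the BLAST search itself is not performed here.

```python
def choose_allele_search(query: str, known_alleles: dict, standard_length: int) -> tuple:
    """Decide how an allele query would be handled.

    Returns ("exact", allele_id) when the sequence is already known,
    ("blast", None) when there is no exact match and the query length differs
    from the standard length for the locus, and ("standard", None) otherwise.
    """
    query = query.strip().upper()
    # An exact match always wins.
    for allele_id, sequence in known_alleles.items():
        if sequence.upper() == query:
            return ("exact", allele_id)
    # No exact match: a length difference suggests an indel, so use BLAST.
    if len(query) != standard_length:
        return ("blast", None)
    return ("standard", None)


# Hypothetical locus with a standard length of 8 bases.
alleles = {1: "ACGTACGT", 2: "ACGTTCGT"}
print(choose_allele_search("ACGTACGT", alleles, 8))
print(choose_allele_search("ACGTAACGT", alleles, 8))
print(choose_allele_search("ACGTACGA", alleles, 8))
```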
Features introduced with version 1.0
- Separation of profile/sequence type information from isolate data.
- The profiles database contains the ST, allelic profile and sometimes clonal complex information. There is no replication of profile data within isolate databases.
- Any number of isolate databases can connect to the profiles database to retrieve ST/profile information. Isolate databases can be:
- general public repositories, e.g. the PubMLST databases.
- public, project specific databases, e.g. those that include the data from a publication.
- private, project/population specific databases.
- Separation of database and web servers possible.
- The databases can now run on separate machines from the web server, improving speed and scalability.
- Linking to external data sources.
- The database structure is not limited to MLST, since the isolate databases may also link to other data sources. Examples of these include the Neisseria PorA and FetA variable region databases, where a variant type is entered in the isolate database, with the web scripts automatically retrieving a peptide sequence and hyperlinking this to an appropriate web page on the antigen web site. Provided these databases are compatible, automatic retrieval of data can be made by a one-line addition to the isolate database configuration script.
- Retrieval of isolate data by cited publication.
- mlstdbNet automatically downloads citation information from PubMed for any publication cited in a reference field of an isolate database. This enables groups of isolates to be easily retrieved by searching against citation or author.
- Choice of whether ST or profile data is stored in an isolate database.
- Either an ST number (in which case allelic profile information is retrieved from the profiles database) or a full allelic profile (where the ST number is retrieved from the profiles database) can be stored. The latter choice enables information to be added to the system as sequencing results are generated, rather than only when a full profile is obtained. This is more appropriate for project databases.
- Batch profile queries.
- Improved search queries, including case-insensitive partial searches of any field.
- Improved allele comparison: a list of nucleotide differences between a query sequence and the nearest known allele is generated to aid sequence assembly.
- Ability to export the full isolate information from a search in a format suitable for loading into a spreadsheet.
- Breakdown statistics of searches, including allele and polymorphism frequencies.
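
To make the allele comparison feature above concrete, here is a minimal Python sketch that finds the nearest known allele and lists the nucleotide differences. It is not the mlstdbNet algorithm: the `nearest_allele` helper is hypothetical, and it assumes that only alleles of the same length as the query are compared, position by position.

```python
def nearest_allele(query: str, alleles: dict) -> tuple:
    """Return (allele_id, differences) for the known allele with the fewest mismatches.

    `differences` is a list of (position, allele_base, query_base) tuples with
    1-based positions. Returns (None, None) if no allele has the query's length.
    """
    best_id, best_diffs = None, None
    for allele_id, sequence in alleles.items():
        if len(sequence) != len(query):
            continue  # this sketch only compares equal-length sequences
        diffs = [
            (i + 1, a, q)
            for i, (a, q) in enumerate(zip(sequence, query))
            if a != q
        ]
        # Keep the first allele seen with the smallest number of differences.
        if best_diffs is None or len(diffs) < len(best_diffs):
            best_id, best_diffs = allele_id, diffs
    return best_id, best_diffs


# Hypothetical alleles and query sequence.
known = {1: "ACGTACGT", 2: "TTGTACGA"}
allele_id, differences = nearest_allele("ACGTACGA", known)
print(allele_id, differences)
```

A short list of differing positions shows which bases in a trace to recheck before a new allele is assigned.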