Differences in bioinformatics pipelines may contribute to substantial variability across labs, in terms of variant annotation, interpretation, and reporting. The lack of standardization is an emerging concern, especially given the growing availability of commercial bioinformatics software options that reduce the barrier for new labs to adopt next-generation sequencing (NGS). To demonstrate how differences in commercial software can influence analysis, organizers of the Two-Day Symposium for Molecular Biologists in Pathology at the European Congress of Pathology (ECP) 2017 set up an NGS Bioinformatics Challenge where both Illumina and QIAGEN were invited to participate.
The concept of the challenge was simple: 3 institutions in Germany (Universitätsklinikum Köln, Erlangen, and Charité in Berlin) contributed FASTQ files for a total of 12 tumor samples that were known to harbor pathogenic variants. These data were sent to Illumina and QIAGEN 2 months before the event and subjected to variant calling and interpretation using their commercially available offerings. Both Illumina and QIAGEN were blinded to the identity of the known variants and reported their findings at a round-table session at ECP, where the organizers also revealed what the expected variants were and how they had been interpreted by each institution. To add an interesting twist, each contributing institution had used a different library prep, and the sequencing platform also varied: The Berlin samples used the AmpliSeq Colon and Lung v2 hotspot panel and were sequenced on the Ion Torrent PGM; both Erlangen and Köln samples were sequenced on an Illumina MiSeq™ System, but the Erlangen samples used the Illumina TruSight® Tumor 15 prep and the Köln samples used a custom QIAGEN amplicon panel.
Illumina was represented by the BaseSpace Variant Interpreter team in this challenge. To account for the differences in library prep across samples, variant calling was performed using custom pipelines hosted on BaseSpace Sequence Hub, and the resulting variant calls were then loaded into BaseSpace Variant Interpreter for custom filtering, annotation, and interpretation.
Results of the analysis agreed with 22 of the 23 expected findings, including a challenging KIT exon 11 deletion that was discovered only by Illumina. The missing variant, a MET deletion spanning the exon-intron boundary of exon 14 in one of the Köln samples, could be recovered by changing alignment parameters to improve amplicon coverage in the region of interest.
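As a rough illustration of how a concordance figure such as 22 of 23 (about 95.7%) can be computed, the sketch below compares a set of expected findings with a set of reported findings. The identifiers are hypothetical placeholders, not the actual challenge variants, and the function name `concordance` is an assumption made for this example.

```python
def concordance(expected, reported):
    """Summarize how many expected findings appear among the reported ones."""
    expected_set = set(expected)
    reported_set = set(reported)
    if not expected_set:
        raise ValueError("at least one expected finding is required")
    found = expected_set & reported_set
    return {
        "found": len(found),
        "total": len(expected_set),
        "rate": len(found) / len(expected_set),
        "missed": sorted(expected_set - reported_set),
    }


# Hypothetical placeholder identifiers: 23 expected findings, one not reported.
expected_ids = [f"expected_{i:02d}" for i in range(1, 24)]
reported_ids = expected_ids[:-1]

summary = concordance(expected_ids, reported_ids)
print(f"{summary['found']} of {summary['total']} "
      f"({summary['rate']:.1%}), missed: {summary['missed']}")
```

Listing the missed identifiers alongside the rate makes it easy to see which finding needs follow-up, as with the MET exon 14 deletion described above.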
Importantly, during the challenge BaseSpace Variant Interpreter demonstrated:
- Awareness of different variant effect associations based on literature evidence beyond drug sensitivity/resistance, ie, tumor subtype classification and prognosis. For example, an atypical KIT exon 11 internal tandem duplication (ITD) was associated with prognosis, whereas a typical exon 11 deletion would have been associated with drug response.
- Intelligent matching of variants based on positional information beyond an exact nucleotide match, ie, at the codon, exon, or gene level. For example, BaseSpace Variant Interpreter was able to identify literature evidence for an atypical EGFR exon 19 deletion based on matching at the exonic level.
- Integrative Genomics Viewer (IGV) visualization features, which were used to point out interesting examples of tumor heterogeneity (TP53 subclones) and complex variants (dinucleotide mutations, and insertions/deletions in cis, ie, located on the same strand/allele).
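The idea of matching beyond an exact nucleotide change can be sketched as a tiered lookup. This is a minimal illustration under stated assumptions, not a description of how BaseSpace Variant Interpreter is implemented: the `Variant` and `Evidence` classes, the field names, and the placeholder change strings are all hypothetical. Each evidence entry specifies only the fields it cares about, unspecified fields act as wildcards, and the match level is the most specific field on which the entry and the query agree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    gene: str
    kind: Optional[str] = None    # e.g. "deletion"; hypothetical vocabulary
    exon: Optional[int] = None
    codon: Optional[int] = None
    change: Optional[str] = None  # exact nucleotide-level description


@dataclass(frozen=True)
class Evidence:
    label: str
    pattern: Variant  # fields left as None act as wildcards


# Match levels ordered from least to most specific.
SPECIFICITY = ["gene", "exon", "codon", "nucleotide"]


def match_level(query: Variant, pattern: Variant) -> Optional[str]:
    """Return the most specific level at which pattern matches query, or None."""
    if query.gene != pattern.gene:
        return None
    if pattern.kind is not None and pattern.kind != query.kind:
        return None
    level = "gene"
    for name, q, p in (
        ("exon", query.exon, pattern.exon),
        ("codon", query.codon, pattern.codon),
        ("nucleotide", query.change, pattern.change),
    ):
        if p is None:
            continue      # evidence does not constrain this field
        if q != p:
            return None   # evidence is about a different position
        level = name
    return level


def ranked_matches(query: Variant, evidence: List[Evidence]) -> List[Tuple[str, str]]:
    """Return (level, label) pairs, most specific match first."""
    hits = []
    for entry in evidence:
        level = match_level(query, entry.pattern)
        if level is not None:
            hits.append((level, entry.label))
    hits.sort(key=lambda hit: SPECIFICITY.index(hit[0]), reverse=True)
    return hits


# Hypothetical knowledge base; change strings are placeholders, not real variants.
knowledge_base = [
    Evidence("EGFR exon 19 deletions", Variant("EGFR", kind="deletion", exon=19)),
    Evidence("EGFR specific change", Variant("EGFR", change="c.PLACEHOLDER_KNOWN")),
    Evidence("EGFR alterations (gene level)", Variant("EGFR")),
    Evidence("KIT exon 11 deletions", Variant("KIT", kind="deletion", exon=11)),
]

# An atypical deletion with no exact-match entry in the knowledge base.
query = Variant("EGFR", kind="deletion", exon=19, change="c.PLACEHOLDER_ATYPICAL")

for level, label in ranked_matches(query, knowledge_base):
    print(f"{level}: {label}")
```

In this sketch the atypical deletion has no nucleotide-level hit, but it still surfaces the exon-level entry first and the gene-level entry second, which mirrors the EGFR exon 19 example in the list above.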
“The NGS challenge was well attended and the lively discussion showed that this session format is received well by the audience. Definitively this is something to repeat!”
– Dr. Sabine Merkelbach-Bruse, head of Diagnostic Molecular Pathology at Universitätsklinikum Köln.
“It was very interesting to see how both companies handled the challenging task of analysing the data from different samples, library prep methods, sequencing platforms and institutions by just receiving FASTQ files. I think that the NGS challenge was very educational for the attendees of the 2 Day Symposium.”
– Dr. Carina Heydt, Universitätsklinikum Köln.
BaseSpace Variant Interpreter was launched for general availability in August 2017 and is available at https://variantinterpreter.informatics.illumina.com. For more information about the European Congress of Pathology, please visit https://www.esp-congress.org.
A European regional instance of BaseSpace Variant Interpreter will also be available via a German-hosted Amazon Web Services (AWS) data center in Frankfurt, later in 2017.
Please contact your Illumina sales representative for more details on BaseSpace Variant Interpreter, and how it can accelerate interpretation of variants produced by next-generation sequencing.
For Research Use Only. Not for use in diagnostic procedures.