BWA+GATK now in BaseSpace

As of about an hour ago all MiSeq resequencing applications in BaseSpace can run the BWA+GATK secondary analysis pipeline. The pipeline can be initiated directly on the instrument when setting up a MiSeq run, and following real-time data transfer the analysis is performed natively in BaseSpace.

According to the Broad Institute, GATK best practices boil down to the following:
1. Duplicate removal
2. Base quality (BQ) recalibration
3. INDEL cleaning
4. Variant calling (UnifiedGenotyper)
5. Manual variant curation / variant recalibration / variant filtration

The current version of BWA+GATK implemented in BaseSpace comprises:
1. Duplicate removal/marking (modified samtools)
2. Variant calling (UnifiedGenotyper)
3. Variant filtering (vcf annotator)
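
For readers who want to try something similar outside BaseSpace, here is a rough sketch of the three steps above using the stock open-source tools. It is not the BaseSpace implementation: BaseSpace uses a modified samtools and its own vcf annotator, whereas this sketch assumes plain `samtools rmdup` and GATK's `VariantFiltration` as stand-ins. The function name, file names, jar path, and the `QD < 2.0` filter threshold are all hypothetical, and the input BAM is assumed to be already aligned with BWA, coordinate-sorted, and tagged with read groups. The code only builds the command lines and does not run anything.

```python
import shlex


def build_pipeline_commands(reference, bam, prefix, gatk_jar="GenomeAnalysisTK.jar"):
    """Return the post-alignment steps as argv lists; nothing is executed.

    All file names are derived from the hypothetical `prefix` argument.
    """
    dedup_bam = f"{prefix}.dedup.bam"
    raw_vcf = f"{prefix}.raw.vcf"
    filtered_vcf = f"{prefix}.filtered.vcf"
    gatk = ["java", "-jar", gatk_jar]
    return [
        # 1. Duplicate removal (stock samtools, not the modified BaseSpace build)
        ["samtools", "rmdup", bam, dedup_bam],
        # GATK expects an index next to the BAM it reads
        ["samtools", "index", dedup_bam],
        # 2. Variant calling with UnifiedGenotyper
        gatk + ["-T", "UnifiedGenotyper", "-R", reference,
                "-I", dedup_bam, "-o", raw_vcf],
        # 3. Variant filtering (stand-in for the vcf annotator; threshold is illustrative)
        gatk + ["-T", "VariantFiltration", "-R", reference,
                "--variant", raw_vcf,
                "--filterExpression", "QD < 2.0", "--filterName", "LowQD",
                "-o", filtered_vcf],
    ]


# Print the commands so they can be reviewed before running them by hand.
for argv in build_pipeline_commands("ref.fa", "sample.bam", "sample"):
    print(shlex.join(argv))
```

Returning argument lists rather than shell strings keeps the quoting of the filter expression unambiguous if you later pass them to `subprocess.run`.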

Various experiments over the past year have indicated that base quality recalibration is not necessary with the current base quality tables in RTA (Real-Time Analysis).

INDEL cleaning and variant recalibration are inherently good ideas, but impractical for broad deployment on BaseSpace given the significant compute resources they require. We will keep working on this and hope to include both steps in the pipeline down the road.

Importantly, BWA+GATK will not be available in the on-instrument MiSeq software for a few months. This reflects the fact that we can quickly deploy features, fix bugs and incorporate user feedback to tweak our BaseSpace implementation before rolling it out to the instrument installed base. So, going forward, expect to see new things in BaseSpace well before they are deployed on the instrument.

While we are working on an app store that delivers all sorts of commercial downstream analysis tools, we understand that select open source academic tools are used by most of our customers and want to make those as accessible as possible.



  1. The GATK team at the Broad Institute is planning a workshop for users this Fall. If you’re interested in attending the workshop, you can vote on the topics and activities that you’d like the workshop to include by filling in this survey:

    • Currently, BWA+GATK in BaseSpace can be invoked directly from the MiSeq control software during run setup. In the near future, it will also be possible to launch BWA directly within BaseSpace.

  2. Hello BaseSpace team,
    After running BWA Enrichment v2.1 and viewing the VCF file in the VariantStudio app, the ‘Allele Frequency Global Minor’ displayed in VariantStudio is different from the GMAF displayed in dbSNP. Here’s what I got:
    The ‘Allele Frequency Global Minor’ in VariantStudio for rs3833412, rs35159794, rs76796411 is 0.00, 0.00, 3.98 respectively, whereas the GMAF in dbSNP for the same rsIDs is 47.81, 24.68, 5.17 respectively.
    Why is there a difference in the Global Minor Allele Frequencies for the same rsID?

    • Anand,

      Could you please contact our technical support with the question? Thank you.
