Reduce amplification bias and improve sequencing coverage.
In NGS, high-fidelity PCR is used to selectively enrich library fragments carrying the appropriate adaptor sequences and to increase the amount of DNA prior to sequencing. During PCR enrichment, existing high-fidelity DNA polymerases do not amplify all library fragments with equal efficiency. This amplification bias exacerbates uneven sequence coverage.
KAPA HiFi Library Amplification Kits contain a novel DNA polymerase engineered for:
- Improved amplification of GC- and AT-rich genomic regions
- Reduced enzyme bias resulting in improved sequencing coverage
- Industry-leading fidelity
Effect of high-GC content on coverage depth for libraries amplified using common proof-reading (B-family) polymerases
Indexed Illumina TruSeq™ libraries prepared from identical sheared M. tuberculosis (65% GC) gDNA were amplified using the indicated PCR reagents, and compared to an equivalent unamplified library by paired-end sequencing (2 x 75 bp). After filtering and aligning read pairs to reference sequences, 250 000 read pairs were randomly sampled for each genome, and scatter plots of mean sequence coverage depth vs. GC content were generated by analyzing 250 bp windows. GC-rich M. tuberculosis sequences were under-represented following library amplification using either Phusion® HF Master Mix or Illumina TruSeq™ PCR Master Mix. In contrast, library amplification with KAPA HiFi HotStart Master Mix resulted in a coverage distribution across the range of GC content that was almost indistinguishable from that of the unamplified control.
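The windowed analysis described above can be illustrated with a minimal sketch. It assumes the reference is available as a plain string and that per-base coverage depth has already been extracted from the alignment; the function name and inputs are hypothetical and are not part of any KAPA or Illumina software.

```python
def windowed_gc_and_depth(reference, depth, window=250):
    """Return (gc_percent, mean_depth) for each full, non-overlapping window.

    reference: reference sequence as a string (assumed input)
    depth:     per-base coverage depth, same length as reference (assumed input)
    window:    window size in bp (250 bp, as in the analysis above)
    """
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")
    points = []
    # Step through the reference one window at a time; a trailing partial
    # window is ignored so every point is based on the same number of bases.
    for start in range(0, len(reference) - window + 1, window):
        seq = reference[start:start + window].upper()
        gc_count = seq.count("G") + seq.count("C")
        mean_depth = sum(depth[start:start + window]) / window
        points.append((100.0 * gc_count / window, mean_depth))
    return points
```

Each returned pair corresponds to one point on a scatter plot of mean coverage depth vs. GC content.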
Percentage of the M. tuberculosis genome not represented in sequence data when using different PCR reagents for library amplification
Indexed libraries were prepared from identical sheared M. tuberculosis (65% GC) gDNA using the Illumina TruSeq™ DNA Sample Prep Kit and then amplified using the indicated polymerases before paired-end sequencing (2 x 75 bp). Libraries were quantified before and after amplification using the KAPA Library Quantification Kit to determine the number of doublings in each case. After filtering and aligning read pairs to reference sequences, 250 000 read pairs (~8.5x coverage) were randomly sampled for each genome.
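The quantities in this caption follow from simple arithmetic, sketched below. The function names are hypothetical, and the genome size of about 4.41 Mb for M. tuberculosis is an assumption that does not come from this document. The doublings calculation also assumes that both quantification values are in the same units and refer to the same volume.

```python
import math

def doublings(amount_before, amount_after):
    """Number of doublings implied by library amounts before and after PCR."""
    return math.log2(amount_after / amount_before)

def mean_coverage(read_pairs, read_length, genome_size):
    """Approximate mean coverage depth for paired-end reads."""
    return read_pairs * 2 * read_length / genome_size

def percent_uncovered(depth):
    """Percentage of positions with zero coverage in a per-base depth list."""
    return 100.0 * sum(1 for d in depth if d == 0) / len(depth)

# Assumed genome size (about 4.41 Mb); not stated in this document.
ASSUMED_MTB_GENOME_SIZE = 4_411_532

coverage = mean_coverage(250_000, 75, ASSUMED_MTB_GENOME_SIZE)
print(f"{coverage:.1f}x")
```

Under that assumed genome size, 250 000 read pairs of 2 x 75 bp give a result consistent with the ~8.5x coverage quoted above.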
Low GC-content libraries result in variable bias depending on the polymerase used for amplification
Libraries prepared from identical sheared P. falciparum (19% GC) gDNA were amplified using the indicated PCR reagents, and compared to an equivalent unamplified library. Observed frequencies of GC-content for reads are plotted for each condition tested (black = unamplified; green = KAPA HiFi HotStart Master Mix; blue = Phusion® HF Master Mix). The expected frequency distribution of reads is indicated by the grey shaded area. The unamplified library tracked the expected frequency distribution. Amplification with KAPA HiFi showed minimal bias while amplification with Phusion® resulted in a dramatic bias against reads with low GC-content. Average coverage depth for each library was 16.0x (unamplified control); 16.5x (KAPA HiFi); 18.8x (Phusion®). Data courtesy of Dr. Michael A. Quail, The Wellcome Trust Sanger Institute.
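As an illustration of how a per-read GC-content frequency distribution of this kind might be tabulated, here is a minimal sketch. It assumes reads are available as plain strings; the function name and the 5% bin width are hypothetical choices and not the method used to generate the figure.

```python
from collections import Counter

def gc_frequency(reads, bin_width=5):
    """Return {bin_start_percent: fraction_of_reads} for per-read GC content."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        if not seq:
            continue  # skip empty reads to avoid division by zero
        gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
        # Reads with 100% GC are placed in the last bin.
        bin_start = min(int(gc_percent // bin_width) * bin_width, 100 - bin_width)
        counts[bin_start] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: counts[b] / total for b in sorted(counts)}
```

Plotting the resulting fractions for each library on the same axes gives curves that can be compared with the expected distribution.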
Percentage of the P. falciparum genome not represented in sequence data when using different PCR reagents for library amplification
Equivalent indexed libraries were amplified (14 PCR cycles) using either KAPA HiFi HotStart ReadyMix or Phusion® HF Master Mix, and then sequenced and compared to an unamplified control library. Average coverage depth for each library was 16.0x (unamplified control); 16.5x (KAPA HiFi); 18.8x (Phusion®).
Library amplification can dramatically affect coverage uniformity
The following Artemis screen captures depict examples of coverage bias in libraries amplified with either KAPA HiFi HotStart Master Mix or Phusion® HF Master Mix, compared to an unamplified control library. In short stretches of either high- or low-GC content, the degree of coverage bias varies according to the reagent used to amplify the library.
Coverage depth and GC content across a ~7 kb region of the P. falciparum genome. Within this region of the genome there are 3 locations where high-AT sequences (>80%) lead to coverage bias (grey bars). In all three regions coverage depth drops significantly after amplification with Phusion® (blue), while the library amplified using KAPA HiFi (green) shows more uniform coverage depth which tracks that of the unamplified control library (black).
Coverage depth and GC content across a ~7 kb region of the B. pertussis genome. Within this region of the genome there are 4 distinct locations of high-GC sequence (>75%) that lead to sequence coverage bias (grey bars). In these regions the library amplified using Phusion® (blue) exhibits lower depth of coverage compared to the unamplified control. In contrast, the library amplified with KAPA HiFi (green) exhibits more even coverage depth, similar to the control library (black).
Superior accuracy for high fidelity library amplification
Error rates of DNA polymerases and polymerase blends. The error rate of KAPA HiFi is calculated at 1 error in 3.54 x 10⁶ bases covered (2.82 x 10⁻⁷). The error rate of KAPA HiFi is 100X lower than Taq polymerase, 40X lower than polymerase blends and 2X lower than Phusion®.
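The per-base error rate quoted above is the reciprocal of the number of bases covered per error. The short sketch below reproduces that arithmetic and, purely as an illustration, shows the approximate error rates implied by the stated fold differences. The implied values are derived from the ratios in the text and are not measured data; the variable names are hypothetical.

```python
# One error per 3.54 x 10^6 bases covered, as stated in the text.
bases_per_error = 3.54e6
kapa_error_rate = 1 / bases_per_error

# Fold differences relative to KAPA HiFi, as stated in the text.
fold_higher = {"Taq polymerase": 100, "polymerase blends": 40, "Phusion": 2}

# Implied (not measured) error rates obtained by scaling the KAPA HiFi rate.
implied_rates = {name: kapa_error_rate * fold for name, fold in fold_higher.items()}

print(f"KAPA HiFi: {kapa_error_rate:.2e} errors per base")
for name, rate in implied_rates.items():
    print(f"{name}: ~{rate:.1e} errors per base (implied)")
```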
Phusion® is a registered trademark of Thermo Fisher Scientific Inc. TruSeq™ is a trademark of Illumina Inc.
Licensed under U.S. Patent nos. 5,338,671 and 5,587,287 and corresponding patents in other countries.