Software

QUILT

QUILT is a program for rapid genotype imputation from low-coverage sequence using a large reference panel. QUILT works by modelling sequencing reads in diploid individuals as having read labels that reflect whether they come from the maternally or paternally inherited chromosome. Once the read labels have been estimated, imputation can be performed per chromosome, which has linear computational complexity and can accommodate very large reference panels. Alongside QUILT, we introduce QUILT-HLA, a program for rapid HLA imputation from low-coverage sequence using a labelled reference panel.

The method is available at https://github.com/rwdavies/QUILT. The paper is available here

STITCH

STITCH is a method for reference-panel-free imputation from low-coverage whole-genome sequence. STITCH runs on a set of samples with sequencing reads in BAM format, together with a list of positions to genotype, and outputs imputed genotypes in VCF format. STITCH works by modelling each chromosome in the set of samples as a mosaic of K unknown founder or ancestral haplotypes. STITCH employs a hidden Markov model whose parameters are sequentially updated using expectation maximization. Both steps are read-aware and require no external reference haplotype sets.

The method is available at https://github.com/rwdavies/STITCH. The paper is available here
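
To make the mosaic idea concrete, the following is a minimal sketch of a forward-backward pass over K founder haplotypes. It is not STITCH's implementation: it is haploid, works on already-called alleles rather than on reads, and keeps the founder parameters fixed instead of updating them by expectation maximization. The function name `founder_posteriors`, the `switch` parameter and the toy data are all hypothetical.

```python
def founder_posteriors(obs, theta, switch=0.05):
    """Posterior probability of each founder at each site (toy haploid model).

    obs:    observed allele per site (0, 1, or None if unobserved).
    theta:  K lists; theta[k][t] is founder k's probability of allele 1 at site t.
            Values strictly between 0 and 1 avoid zero-probability paths.
    switch: assumed per-site probability of re-drawing the founder uniformly.
    """
    K, T = len(theta), len(obs)

    def emit(k, t):
        # Probability of the observed allele if the chromosome copies founder k
        if obs[t] is None:
            return 1.0
        return theta[k][t] if obs[t] == 1 else 1.0 - theta[k][t]

    def step(vec):
        # Apply the (symmetric) transition matrix: stay, or jump to a uniform founder
        total = sum(vec)
        return [(1.0 - switch) * vec[k] + switch * total / K for k in range(K)]

    # Forward pass, normalised at every site to avoid underflow
    fwd = []
    for t in range(T):
        prior = [1.0 / K] * K if t == 0 else step(fwd[-1])
        cur = [prior[k] * emit(k, t) for k in range(K)]
        z = sum(cur)
        fwd.append([c / z for c in cur])

    # Backward pass, also normalised
    bwd = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        raw = step([bwd[t + 1][k] * emit(k, t + 1) for k in range(K)])
        z = sum(raw)
        bwd[t] = [r / z for r in raw]

    # Combine into per-site posteriors over founders
    post = []
    for t in range(T):
        raw = [fwd[t][k] * bwd[t][k] for k in range(K)]
        z = sum(raw)
        post.append([r / z for r in raw])
    return post


# Toy example: two founders, and a chromosome that switches founder halfway along
theta = [[0.05] * 6, [0.95] * 6]
obs = [0, 0, 0, 1, 1, 1]
post = founder_posteriors(obs, theta)
```

In this toy example, `post[0]` should favour the first founder and `post[-1]` the second, which is the mosaic structure the model is designed to recover.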

SEW

SEW is a method for reference-panel-free phasing using long sequencing reads. SEW runs on data from a single sample using sequencing reads in BAM format, together with a list of positions to phase, and outputs phased genotypes, as well as metrics that are useful for subsequent variant filtration. SEW uses an EM algorithm, starting from an initial guess of the two underlying haplotypes. In each iteration, SEW first calculates the probability that each sequencing read comes from each of the two haplotypes, and then updates the sequence of the underlying haplotypes using the sequences of the reads and those probabilities.

The method is available at https://github.com/Genomicsplc/SEW. The paper is available here
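
The two alternating steps described above can be sketched as follows. This is an illustration of the general EM idea, not SEW's code: reads are reduced to biallelic calls at the listed positions, a single fixed error rate `err` is assumed, and the function name `em_phase`, the read representation and the initial guess are all hypothetical.

```python
import math


def em_phase(reads, init_haps, n_iter=20, err=0.01):
    """Toy EM phasing of one sample from reads.

    reads:     list of dicts mapping site index -> observed allele (0 or 1).
    init_haps: two lists of per-site probabilities of allele 1 (the initial guess).
    err:       assumed per-base sequencing error rate.
    Returns (haps, resp), where resp[r] is the probability read r is from haplotype 0.
    """
    haps = [list(init_haps[0]), list(init_haps[1])]
    n_sites = len(haps[0])
    resp = [0.5] * len(reads)

    for _ in range(n_iter):
        # E-step: probability that each read comes from haplotype 0 versus 1
        resp = []
        for read in reads:
            loglik = [0.0, 0.0]
            for h in (0, 1):
                for site, allele in read.items():
                    # Chance of reading allele 1, allowing for sequencing error
                    p_alt = haps[h][site] * (1 - err) + (1 - haps[h][site]) * err
                    loglik[h] += math.log(p_alt if allele == 1 else 1 - p_alt)
            m = max(loglik)
            w0, w1 = math.exp(loglik[0] - m), math.exp(loglik[1] - m)
            resp.append(w0 / (w0 + w1))

        # M-step: update each haplotype as a weighted average of the reads' alleles
        for h in (0, 1):
            for s in range(n_sites):
                num = den = 0.0
                for read, r0 in zip(reads, resp):
                    if s in read:
                        w = r0 if h == 0 else 1.0 - r0
                        num += w * read[s]
                        den += w
                if den > 0:
                    haps[h][s] = num / den
    return haps, resp


# Toy example: overlapping reads drawn from haplotypes 0101 and 1010
reads = [
    {0: 0, 1: 1}, {1: 1, 2: 0}, {2: 0, 3: 1},
    {0: 1, 1: 0}, {1: 0, 2: 1}, {2: 1, 3: 0},
]
# Asymmetric initial guess; identical starting haplotypes would leave EM stuck
init = [[0.1, 0.9, 0.5, 0.5], [0.9, 0.1, 0.5, 0.5]]
haps, resp = em_phase(reads, init)
```

The initial guess matters here: if both starting haplotypes are the same, every read is equally likely to come from either one and the updates never separate them, so the sketch starts from a deliberately asymmetric guess.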