Qian Liu, Peng Zhang, Depeng Wang, Weihong Gu and Kai Wang; Genome Medicine 2017, 9:65; DOI: 10.1186/s13073-017-0456-7
Microsatellite expansion, such as trinucleotide repeat expansion (TRE), is known to cause a number of genetic diseases. Sanger sequencing and next-generation short-read sequencing are unable to interrogate TRE reliably.
In this study, we developed RepeatHMM to detect repeat counts of microsatellites from long-read sequencing data. We evaluated RepeatHMM on both simulated and real data, and the results suggest that it quantifies repeat counts effectively and efficiently. RepeatHMM is flexible enough to handle repeat patterns of any length, not only trinucleotide repeats, and can incorporate different error profiles. As long-read sequencing becomes more widely used in research and clinical settings, RepeatHMM is expected to contribute to the quantification of repeat counts and to facilitate the analysis of genotype-phenotype relationships for disease-related microsatellites.
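To make "repeat count" concrete, the following is a minimal sketch that counts the longest run of exact, back-to-back copies of a repeat unit in a single read. It is not RepeatHMM's method: the function name `longest_repeat_run` and the toy read are assumptions made for illustration, and exact matching like this breaks down on error-prone long reads, which is the gap a model that incorporates error profiles is meant to fill.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive exact copies of `unit`.

    Naive illustration only: a single insertion, deletion, or
    substitution inside the repeat tract splits the run, so this
    undercounts on noisy long reads.
    """
    if not unit:
        raise ValueError("unit must be non-empty")
    unit = unit.upper()
    # One or more adjacent copies of the unit, matched greedily.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        best = max(best, len(match.group(0)) // len(unit))
    return best


# Hypothetical error-free read: flanking sequence around 12 CAG copies.
read = "ACGT" + "CAG" * 12 + "TTGA"
print(longest_repeat_run(read, "CAG"))  # prints 12

# A single substitution (CAG -> CAT) splits the tract into runs of 2 and 3.
print(longest_repeat_run("CAGCAGCATCAGCAGCAG", "CAG"))  # prints 3
```

The second call shows why exact matching is insufficient for long reads: five of the six units are intact, yet the reported count is 3.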
Interrogating the “unsequenceable” genomic trinucleotide repeat disorders by long-read sequencing