Advances in haplotype phasing and genotype imputation

September 24, 2025

Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).

Fournier, R., Tsangalidou, Z., Reich, D. & Palamara, P. F. Haplotype-based inference of recent effective population size in modern and ancient DNA samples. Nat. Commun. 14, 7945 (2023).

Palamara, P. F. & Pe’er, I. Inference of historical migration rates via haplotype sharing. Bioinformatics 29, i180–i188 (2013).

Al-Asadi, H., Petkova, D., Stephens, M. & Novembre, J. Estimating recent migration and population-size surfaces. PLoS Genet. 15, e1007908 (2019).

Tian, X., Cai, R. & Browning, S. R. Estimating the genome-wide mutation rate from thousands of unrelated individuals. Am. J. Hum. Genet. 109, 2178–2184 (2022).

Porubsky, D. et al. Human de novo mutation rates from a four-generation pedigree reference. Nature 643, 427–436 (2025).

Lassen, F. H. et al. Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank. Cell Genom. 4, 100602 (2024).

Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).

Browning, S. R., Waples, R. K. & Browning, B. L. Fast, accurate local ancestry inference with FLARE. Am. J. Hum. Genet. 110, 326–335 (2023).

Sun, Q. et al. Improving polygenic risk prediction in admixed populations by explicitly modeling ancestral-differential effects via GAUDI. Nat. Commun. 15, 1016 (2024).

Horimoto, A. R. V. R. et al. Admixture mapping of chronic kidney disease and risk factors in Hispanic/Latino individuals from Central America country of origin. Circ. Genom. Precis. Med. 17, e004314 (2024).

Sun, Q. et al. Opportunities and challenges of local ancestry in genetic association analyses. Am. J. Hum. Genet. 112, 727–740 (2025).

Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: faster genotype imputation. Bioinformatics 31, 782–784 (2015). This paper introduces the minimac2 imputation method (minimac3 and minimac4 are not associated with any publications).

Browning, B. L., Zhou, Y. & Browning, S. R. A one-penny imputed genome from next-generation reference panels. Am. J. Hum. Genet. 103, 338–348 (2018). This paper introduces the Beagle5 imputation method.

Rubinacci, S., Delaneau, O. & Marchini, J. Genotype imputation using the positional Burrows Wheeler transform. PLoS Genet. 16, e1009049 (2020). This paper introduces the IMPUTE5 imputation method.

Sun, Q. et al. Analyses of biomarker traits in diverse UK Biobank participants identify associations missed by European-centric analysis strategies. J. Hum. Genet. 67, 87–93 (2022).

Huerta-Chagoya, A. et al. The power of TOPMed imputation for the discovery of Latino-enriched rare variants associated with type 2 diabetes. Diabetologia 66, 1273–1288 (2023).

Qin, Z. S., Niu, T. & Liu, J. S. Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am. J. Hum. Genet. 71, 1242–1247 (2002).

Stephens, M. & Scheet, P. Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am. J. Hum. Genet. 76, 449–462 (2005).

Scheet, P. & Stephens, M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629–644 (2006).

Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).

Li, Y., Willer, C. J., Ding, J., Scheet, P. & Abecasis, G. R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, 816–834 (2010).

Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

Carlson, C. S. et al. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am. J. Hum. Genet. 74, 106–120 (2004).

Long, J. C., Williams, R. C. & Urbanek, M. An E-M algorithm and testing strategy for multiple-locus haplotypes. Am. J. Hum. Genet. 56, 799–810 (1995).

Kong, A. et al. Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075 (2008).

Halperin, E. & Karp, R. M. Perfect phylogeny and haplotype assignment. In Proc. 8th Annual International Conference on Computational Molecular Biology 10–19 (ACM, 2004).

Halperin, E. & Eskin, E. Haplotype reconstruction from genotype data using imperfect phylogeny. Bioinformatics 20, 1842–1849 (2004).

Li, N. & Stephens, M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165, 2213–2233 (2003).

Fearnhead, P. & Donnelly, P. Estimating recombination rates from population genetic data. Genetics 159, 1299–1318 (2001).

Loh, P.-R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016). This paper introduces the Eagle2 phasing method, where PBWT is applied to improve computational efficiency.

Hofmeister, R. J., Ribeiro, D. M., Rubinacci, S. & Delaneau, O. Accurate rare variant phasing of whole-genome and whole-exome sequencing data in the UK Biobank. Nat. Genet. 55, 1243–1249 (2023). This paper introduces the SHAPEIT5 phasing method, where singletons are explicitly considered.

Browning, B. L., Tian, X., Zhou, Y. & Browning, S. R. Fast two-stage phasing of large-scale sequence data. Am. J. Hum. Genet. 108, 1880–1890 (2021). This paper introduces the Beagle5 phasing method (which is different from the Beagle5 imputation publication), where a two-stage phasing strategy is proposed separately for common and rare variants.

Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021). This paper introduces the TOPMed reference panel containing haplotypes from diverse populations, which is more suitable for imputation of global populations compared with previous reference panels, including 1000 Genomes and HRC.

Feng, Y.-C. A. et al. Taiwan Biobank: a rich biomedical research database of the Taiwanese population. Cell Genom. 2, 100197 (2022).

Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016). This paper introduces the Michigan imputation server, exemplary in promoting broader usage of reference panels and public servers without accessing individual genotypes contributing to the panels.

Al Bkhetan, Z., Zobel, J., Kowalczyk, A., Verspoor, K. & Goudey, B. Exploring effective approaches for haplotype block phasing. BMC Bioinform. 20, 540 (2019).

Al Bkhetan, Z., Chana, G., Ramamohanarao, K., Verspoor, K. & Goudey, B. Evaluation of consensus strategies for haplotype phasing. Brief. Bioinform. 22, bbaa280 (2021).

Wertenbroek, R., Hofmeister, R. J., Xenarios, I., Thoma, Y. & Delaneau, O. Improving population scale statistical phasing with whole-genome sequencing data. PLoS Genet. 20, e1011092 (2024). This paper introduces a method to correct phasing errors leveraging raw sequencing.

Sun, Q. et al. MagicalRsq: machine-learning-based genotype imputation quality calibration. Am. J. Hum. Genet. 109, 1986–1997 (2022). This paper introduces a framework to recalculate the imputation quality metric for post-imputation quality control, especially for low-frequency and rare variants, where the state-of-the-art imputation quality metric (for example, Rsq) performs less well.

Sun, Q. et al. MagicalRsq-X: a cross-cohort transferable genotype imputation quality metric. Am. J. Hum. Genet. 111, 990–995 (2024).

Aleknonytė-Resch, M., Szymczak, S., Freitag-Wolf, S., Dempfle, A. & Krawczak, M. Genotype imputation in case-only studies of gene-environment interaction: validity and power. Hum. Genet. 140, 1217–1228 (2021).

Sun, Q. et al. Leveraging TOPMed imputation server and constructing a cohort-specific imputation reference panel to enhance genotype imputation among cystic fibrosis patients. HGG Adv. 3, 100090 (2022).

Lau, W. et al. The hazards of genotype imputation when mapping disease susceptibility variants. Genome Biol. 25, 7 (2024).

Liu, E. Y. et al. Genotype imputation of Metabochip SNPs using a study-specific reference panel of ~4,000 haplotypes in African Americans from the Women’s Health Initiative. Genet. Epidemiol. 36, 107–117 (2012).

Xu, Z. M. et al. Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations. PLoS Comput. Biol. 18, e1009628 (2022).

Sengupta, D. et al. Performance and accuracy evaluation of reference panels for genotype imputation in sub-Saharan African populations. Cell Genom. 3, 100332 (2023).

Cahoon, J. L. et al. Imputation accuracy across global human populations. Am. J. Hum. Genet. 111, 979–989 (2024).

Handsaker, R. E. et al. Large multiallelic copy number variations in humans. Nat. Genet. 47, 296–303 (2015).

Saini, S., Mitra, I., Mousavi, N., Fotsing, S. F. & Gymrek, M. A reference haplotype panel for genome-wide imputation of short tandem repeats. Nat. Commun. 9, 4397 (2018).

Ziaei Jam, H. et al. A deep population reference panel of tandem repeat variation. Nat. Commun. 14, 6711 (2023).

Noyvert, B. et al. Imputation of structural variants using a multi-ancestry long-read sequencing panel enables identification of disease associations. eLife 14, RP106115 (2025). This work performs imputation of SVs using a reference panel based on long-read sequencing data, demonstrating the practical utility of long-read sequencing in the context of imputation, particularly for SVs.

Browning, S. R. & Browning, B. L. Haplotype phasing: existing methods and new developments. Nat. Rev. Genet. 12, 703–714 (2011).

Sakaue, S. et al. Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease. Nat. Protoc. 18, 2625–2641 (2023).

Tregouet, D. A., Escolano, S., Tiret, L., Mallet, A. & Golmard, J. L. A new algorithm for haplotype-based association analysis: the Stochastic-EM algorithm. Ann. Hum. Genet. 68, 165–177 (2004).

Browning, B. L. & Browning, S. R. Statistical phasing of 150,119 sequenced genomes in the UK Biobank. Am. J. Hum. Genet. 110, 161–165 (2023).

Sohail, M. et al. Mexican Biobank advances population and medical genomics of diverse ancestries. Nature 622, 775–783 (2023).

Durbin, R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics 30, 1266–1272 (2014). This paper proposes a series of algorithms for haplotype data compression and efficient haplotype matching, reducing the computational complexity from quadratic to linear in terms of the number of reference haplotypes. It represents a milestone of recent computational development of phasing and imputation methods.

Delaneau, O., Zagury, J.-F., Robinson, M. R., Marchini, J. L. & Dermitzakis, E. T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 10, 5436 (2019).

Browning, B. L. & Browning, S. R. Genotype imputation with millions of reference samples. Am. J. Hum. Genet. 98, 116–126 (2016).

Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).

O’Connell, J. et al. Haplotype estimation for biobank-scale data sets. Nat. Genet. 48, 817–820 (2016).

Palin, K., Campbell, H., Wright, A. F., Wilson, J. F. & Durbin, R. Identity-by-descent-based phasing and imputation in founder populations using graphical models. Genet. Epidemiol. 35, 853–860 (2011).

Hickey, J. M. et al. A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes. Genet. Sel. Evol. 43, 12 (2011).

Herzig, A. F. et al. Strategies for phasing and imputation in a population isolate. Genet. Epidemiol. 42, 201–213 (2018).

Williams, A. L., Patterson, N., Glessner, J., Hakonarson, H. & Reich, D. Phasing of many thousands of genotyped samples. Am. J. Hum. Genet. 91, 238–251 (2012).

O’Connell, J. et al. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet. 10, e1004234 (2014).

Oget-Ebrad, C. et al. Benchmarking phasing software with a whole-genome sequenced cattle pedigree. BMC Genom. 23, 130 (2022).

Choi, Y., Chan, A. P., Kirkness, E., Telenti, A. & Schork, N. J. Comparison of phasing strategies for whole human genomes. PLoS Genet. 14, e1007308 (2018).

Lajugie, J. et al. Complete genome phasing of family quartet by combination of genetic, physical and population-based phasing analysis. PLoS ONE 8, e64571 (2013).

Chen, G. K., Wang, K., Stram, A. H., Sobel, E. M. & Lange, K. Mendel-GPU: haplotyping and genotype imputation on graphics processing units. Bioinformatics 28, 2979–2980 (2012).

Na, J. C., Lee, I., Rhee, J.-K. & Shin, S.-Y. Fast single individual haplotyping method using GPGPU. Comput. Biol. Med. 113, 103421 (2019).

Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat. Genet. 53, 1504–1516 (2021).

1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

Li, Y., Willer, C., Sanna, S. & Abecasis, G. Genotype imputation. Annu. Rev. Genom. Hum. Genet. 10, 387–406 (2009).

Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499–511 (2010).

Das, S., Abecasis, G. R. & Browning, B. L. Genotype imputation from large reference panels. Annu. Rev. Genom. Hum. Genet. 19, 73–96 (2018).

Howie, B., Fuchsberger, C., Stephens, M., Marchini, J. & Abecasis, G. R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).

Kojima, K., Tadaka, S., Okamura, Y. & Kinoshita, K. Two-stage strategy using denoising autoencoders for robust reference-free genotype imputation with missing input genotypes. J. Hum. Genet. 69, 511–518 (2024).

Pasaniuc, B. et al. Extremely low-coverage sequencing and imputation increases power for genome-wide association studies. Nat. Genet. 44, 631–635 (2012).

Hui, R., D’Atanasio, E., Cassidy, L. M., Scheib, C. L. & Kivisild, T. Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes. Sci. Rep. 10, 18542 (2020).

Sousa da Mota, B. et al. Imputation of ancient human genomes. Nat. Commun. 14, 3660 (2023).

Rubinacci, S., Ribeiro, D. M., Hofmeister, R. J. & Delaneau, O. Efficient phasing and imputation of low-coverage sequencing data using large reference panels. Nat. Genet. 53, 120–126 (2021).

Spiliopoulou, A., Colombo, M., Orchard, P., Agakov, F. & McKeigue, P. GeneImp: fast imputation to large reference panels using genotype likelihoods from ultralow coverage sequencing. Genetics 206, 91–104 (2017).

Davies, R. W. et al. Rapid genotype imputation from sequence with reference panels. Nat. Genet. 53, 1104–1111 (2021).

Jagirdar, K. et al. Molecular analysis of common polymorphisms within the human tyrosinase locus and genetic association with pigmentation traits. Pigment. Cell Melanoma Res. 27, 552–564 (2014).

VanRaden, P. M., Sun, C. & O’Connell, J. R. Fast imputation using medium or low-coverage sequence data. BMC Genet. 16, 82 (2015).

Davies, R. W., Flint, J., Myers, S. & Mott, R. Rapid genotype imputation from sequence without reference panels. Nat. Genet. 48, 965–969 (2016).

Zheng, C., Boer, M. P. & van Eeuwijk, F. A. Accurate genotype imputation in multiparental populations from low-coverage sequence. Genetics 210, 71–82 (2018).

Geman, S. & Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984).

Rubinacci, S., Hofmeister, R. J., Sousa da Mota, B. & Delaneau, O. Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes. Nat. Genet. 55, 1088–1090 (2023). This paper introduces GLIMPSE2, an imputation method specifically designed for ulcWGS data.

Martiniano, R. et al. The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods. PLoS Genet. 13, e1006852 (2017).

Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257 (2014).

Royo, J. L. Hardy Weinberg equilibrium disturbances in case–control studies lead to non-conclusive results. Cell J. 22, 572–574 (2021).

Wigginton, J. E., Cutler, D. J. & Abecasis, G. R. A note on exact tests of Hardy–Weinberg equilibrium. Am. J. Hum. Genet. 76, 887–893 (2005).

Yu, K.-D., Di, G.-H., Fan, L. & Shao, Z.-M. Test of Hardy–Weinberg equilibrium in breast cancer case-control studies: an issue may influence the conclusions. Breast Cancer Res. Treat. 117, 675–677 (2009).

Hachiya, T. et al. The NBDC-DDBJ imputation server facilitates the use of controlled access reference panel datasets in Japan. Hum. Genome Var. 9, 48 (2022).

Gürsoy, G., Chielle, E., Brannon, C. M., Maniatakos, M. & Gerstein, M. Privacy-preserving genotype imputation with fully homomorphic encryption. Cell Syst. 13, 173–182.e3 (2022).

Mosca, M. J. & Cho, H. Reconstruction of private genomes through reference-based genotype imputation. Genome Biol. 24, 271 (2023).

Cavinato, T., Rubinacci, S., Malaspinas, A.-S. & Delaneau, O. A resampling-based approach to share reference panels. Nat. Comput. Sci. 4, 360–366 (2024).

Rayner, N. W., Park, Y.-C., Fuchsberger, C., Barysenka, A. & Zeggini, E. Toward GDPR compliance with the Helmholtz Munich genotype imputation server. Nat. Genet. 56, 2580–2581 (2024).

Zhu, W. et al. IMMerge: merging imputation data at scale. Bioinformatics 39, btac750 (2023).

McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).

Jostins, L., Morley, K. I. & Barrett, J. C. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets. Eur. J. Hum. Genet. 19, 662–666 (2011).

Bai, W.-Y. et al. Genotype imputation and reference panel: a systematic evaluation on haplotype size and diversity. Brief. Bioinform. 21, 1806–1817 (2019).

Kowalski, M. H. et al. Use of >100,000 NHLBI trans-omics for precision medicine (TOPMed) consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet. 15, e1008500 (2019).

Yoo, S.-K. et al. NARD: whole-genome reference panel of 1779 Northeast Asians improves imputation accuracy of rare and low-frequency variants. Genome Med. 11, 64 (2019).

Yu, C. et al. A high-resolution haplotype-resolved reference panel constructed from the China Kadoorie Biobank study. Nucleic Acids Res. 51, 11770–11782 (2023).

Cengnata, A. et al. A genotype imputation reference panel specific for native Southeast Asian populations. NPJ Genom. Med. 9, 47 (2024).

O’Connell, J. et al. A population-specific reference panel for improved genotype imputation in African Americans. Commun. Biol. 4, 1269 (2021).

Panjwani, N. et al. Improving imputation in disease-relevant regions: lessons from cystic fibrosis. NPJ Genom. Med. 3, 8 (2018).

Yu, K. et al. Meta-imputation: an efficient method to combine genotype data after imputation with multiple reference panels. Am. J. Hum. Genet. 109, 1007–1015 (2022). This paper introduces meta-imputation to combine imputed results from multiple reference panels. It is helpful in scenarios where multiple references are suitable, for example, where a small population-specific (or disease cohort) reference panel is available in addition to a large reference panel from general or mismatched populations.

Hwang, M. Y., Choi, N.-H., Won, H. H., Kim, B.-J. & Kim, Y. J. Analyzing the Korean reference genome with meta-imputation increased the imputation accuracy and spectrum of rare variants in the Korean population. Front. Genet. 13, 1008646 (2022).

Xu, J. et al. Evaluation of imputation performance of multiple reference panels in a Pakistani population. HGG Adv. 6, 100395 (2025).

Quick, C. et al. Sequencing and imputation in GWAS: cost-effective strategies to increase power and genomic coverage across diverse populations. Genet. Epidemiol. 44, 537–549 (2020).

Roberts, G. H. L., Santorico, S. A. & Spritz, R. A. Deep genotype imputation captures virtually all heritability of autoimmune vitiligo. Hum. Mol. Genet. 29, 859–863 (2020).

Yu, W.-Y. et al. Efficient identification of trait-associated loss-of-function variants in the UK Biobank cohort by exome-sequencing based genotype imputation. Genet. Epidemiol. 47, 121–134 (2023).

Si, Y., Vanderwerff, B. & Zöllner, S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 217, iyab011 (2021).

Chen, S.-F. et al. Genotype imputation and variability in polygenic risk score estimation. Genome Med. 12, 100 (2020).

Zhang, Z., Xiao, X., Zhou, W., Zhu, D. & Amos, C. I. False positive findings during genome-wide association studies with imputation: influence of allele frequency and imputation accuracy. Hum. Mol. Genet. 31, 146–155 (2021).

Appadurai, V. et al. Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. Commun. Biol. 6, 101 (2023).

Scarano, C. et al. The third-generation sequencing challenge: novel insights for the omic sciences. Biomolecules 14, 568 (2024).

Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).

Deamer, D., Akeson, M. & Branton, D. Three decades of nanopore sequencing. Nat. Biotechnol. 34, 518–524 (2016).

Xu, Y., Luo, H., Wang, Z., Lam, H.-M. & Huang, C. Oxford Nanopore Technology: revolutionizing genomics research in plants. Trends Plant. Sci. 27, 510–511 (2022).

Snyder, M. W., Adey, A., Kitzman, J. O. & Shendure, J. Haplotype-resolved genome sequencing: experimental methods and applications. Nat. Rev. Genet. 16, 344–358 (2015).

Garg, S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol. 22, 101 (2021).

Zhang, T. et al. Complex genome assembly based on long-read sequencing. Brief. Bioinform. 23, bbac305 (2022).

Maestri, S. et al. A long-read sequencing approach for direct haplotype phasing in clinical settings. Int. J. Mol. Sci. 21, 9177 (2020).

Kronenberg, Z. N. et al. Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C. Nat. Commun. 12, 1935 (2021).

Sakamoto, Y. et al. Phasing analysis of lung cancer genomes using a long read sequencer. Nat. Commun. 13, 3464 (2022).

Bansal, V. & Bafna, V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 24, i153–i159 (2008).

Bansal, V. HapCUT2: a method for phasing genomes using experimental sequence data. Methods Mol. Biol. 2590, 139–147 (2023).

Patterson, M. et al. WhatsHap: weighted haplotype assembly for future-generation sequencing reads. J. Comput. Biol. 22, 498–509 (2015).

Bracciali, A. et al. PWHATSHAP: efficient haplotyping for future generation sequencing. BMC Bioinform. 17, 342 (2016).

Garg, S. et al. Chromosome-scale, haplotype-resolved assembly of human genomes. Nat. Biotechnol. 39, 309–312 (2021).

Feng, Z., Clemente, J. C., Wong, B. & Schadt, E. E. Detecting and phasing minor single-nucleotide variants from long-read sequencing data. Nat. Commun. 12, 3032 (2021).

Yu, Y., Chen, L., Miao, X. & Li, S. C. SpecHap: a diploid phasing algorithm based on spectral graph theory. Nucleic Acids Res. 49, e114 (2021).

Fruzangohar, M., Timmins, W. A., Kravchuk, O. & Taylor, J. HaploMaker: an improved algorithm for rapid haplotype assembly of genomic sequences. Gigascience 11, giac038 (2022).

Lin, J.-H., Chen, L.-C., Yu, S.-C. & Huang, Y.-T. LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants. Bioinformatics 38, 1816–1822 (2022).

Holt, J. M. et al. HiPhase: jointly phasing small, structural, and tandem repeat variants from HiFi sequencing. Bioinformatics 40, btae042 (2024).

Edsgärd, D., Reinius, B. & Sandberg, R. scphaser: haplotype inference using single-cell RNA-seq data. Bioinformatics 32, 3038–3040 (2016).

Castel, S. E., Mohammadi, P., Chung, W. K., Shen, Y. & Lappalainen, T. Rare variant phasing and haplotypic expression from RNA sequencing with phASER. Nat. Commun. 7, 12817 (2016).

Akbari, V. & Jones, S. J. M. Phasing DNA methylation. Methods Mol. Biol. 2590, 219–235 (2023).

Fu, Y. et al. MethPhaser: methylation-based long-read haplotype phasing of human genomes. Nat. Commun. 15, 5327 (2024).

Ouchi, S., Kajitani, R. & Itoh, T. GreenHill: a de novo chromosome-level scaffolding and phasing tool using Hi-C. Genome Biol. 24, 162 (2023).

Henglin, M. et al. Graphasing: phasing diploid genome assembly graphs with single-cell strand sequencing. Genome Biol. 25, 265 (2024).

Yang, W.-Y. et al. Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data. Bioinformatics 29, 2245–2252 (2013).

Bansal, V. Integrating read-based and population-based phasing for dense and accurate haplotyping of individual genomes. Bioinformatics 35, i242–i248 (2019).

Schloissnig, S. et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature 644, 442–452 (2025).

Liao, W.-W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).

Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).

Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).

Consens, M. E. et al. Transformers and genome language models. Nat. Mach. Intell. 7, 346–362 (2025).

Durante, Z. et al. Agent AI: surveying the horizons of multimodal interaction. Preprint at https://doi.org/10.48550/arXiv.2401.03568 (2024).

Kapoor, S., Stroebl, B., Siegel, Z. S., Nadgir, N. & Narayanan, A. AI agents that matter. Preprint at https://doi.org/10.48550/arXiv.2407.01502 (2024).

Choudhury, O., Chakrabarty, A. & Emrich, S. J. Highly accurate and efficient data-driven methods for genotype imputation. IEEE/ACM Trans. Comput. Biol. Bioinform. 16, 1107–1116 (2019).

Chen, J. & Shi, X. Sparse convolutional denoising autoencoders for genotype imputation. Genes 10, 652 (2019).

Kojima, K. et al. A genotype imputation method for de-identified haplotype reference information by using recurrent neural network. PLoS Comput. Biol. 16, e1008207 (2020).

Chi Duong, V. et al. A rapid and reference-free imputation method for low-cost genotyping platforms. Sci. Rep. 13, 23083 (2023).

Mowlaei, M. E. et al. STICI: split-transformer with integrated convolutions for genotype imputation. Nat. Commun. 16, 1218 (2025).

Sun, Q. et al. Polygenic scores of cardiometabolic risk factors in American Indian adults. JAMA Netw. Open 8, e250535 (2025).

Li, Y., Sidore, C., Kang, H. M., Boehnke, M. & Abecasis, G. R. Low-coverage sequencing: implications for design of complex trait association studies. Genome Res. 21, 940–951 (2011).

Zöllner, S. Sampling strategies for rare variant tests in case-control studies. Eur. J. Hum. Genet. 20, 1085–1091 (2012).

Kang, J. et al. AbCD: arbitrary coverage design for sequencing-based genetic studies. Bioinformatics 29, 799–801 (2013).

Duan, Q., Liu, E. Y., Croteau-Chonka, D. C., Mohlke, K. L. & Li, Y. A comprehensive SNP and indel imputability database. Bioinformatics 29, 528–531 (2013).

Browning, B. L. & Browning, S. R. Efficient multilocus association testing for whole genome association studies using localized haplotype clustering. Genet. Epidemiol. 31, 365–375 (2007).

Loh, P.-R., Palamara, P. F. & Price, A. L. Fast and accurate long-range phasing in a UK Biobank cohort. Nat. Genet. 48, 811–816 (2016).

Platt, A., Pivirotto, A., Knoblauch, J. & Hey, J. An estimator of first coalescent time reveals selection on young variants and large heterogeneity in rare allele ages among human populations. PLoS Genet. 15, e1008340 (2019).

Banday, A. R. et al. Genetic regulation of OAS1 nonsense-mediated decay underlies association with COVID-19 hospitalization in patients of European and African ancestries. Nat. Genet. 54, 1103–1116 (2022).

Michalek, D. A. et al. A multi-ancestry genome-wide association study in type 1 diabetes. Hum. Mol. Genet. 33, 958–968 (2024).

Lucas, E. R. et al. Genome-wide association studies reveal novel loci associated with pyrethroid and organophosphate resistance in Anopheles gambiae and Anopheles coluzzii. Nat. Commun. 14, 4946 (2023).

Bråten, L. S., Ingelman-Sundberg, M., Jukic, M. M., Molden, E. & Kringen, M. K. Impact of the novel CYP2C:TG haplotype and CYP2B6 variants on sertraline exposure in a large patient population. Clin. Transl. Sci. 15, 2135–2145 (2022).

Aksit, M. A. et al. Pleiotropic modifiers of age-related diabetes and neonatal intestinal obstruction in cystic fibrosis. Am. J. Hum. Genet. 109, 1894–1908 (2022).

Loftus, S. K. et al. Haplotype-based analysis resolves missing heritability in oculocutaneous albinism type 1B. Am. J. Hum. Genet. 110, 1123–1137 (2023). This paper provides an example of how phasing or haplotype-level analyses can improve understanding of disease-causing alleles, elucidate the genetic mechanisms underlying diseases and aid genetic diagnosis.

Khankhanian, P., Gourraud, P.-A., Lizee, A. & Goodin, D. S. Haplotype-based approach to known MS-associated regions increases the amount of explained risk. J. Med. Genet. 52, 587–594 (2015).

Albiñana, C. et al. Genetic correlates of vitamin D-binding protein and 25-hydroxyvitamin D in neonatal dried blood spots. Nat. Commun. 14, 852 (2023).

Sollis, E. et al. The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023).

Lin, S. et al. Evidence that the Ser192Tyr/Arg402Gln in cis tyrosinase gene haplotype is a disease-causing allele in oculocutaneous albinism type 1B (OCA1B). NPJ Genom. Med. 7, 2 (2022).

Shriner, D. Overview of admixture mapping. Curr. Protoc. 3, e677 (2023).

Duan, Q. et al. A robust and powerful two-step testing procedure for local ancestry adjusted allelic association analysis in admixed populations. Genet. Epidemiol. 42, 288–302 (2018).

Atkinson, E. G. et al. Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power. Nat. Genet. 53, 195–204 (2021).

Hou, K. et al. Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. Bioinformatics 40, btae148 (2024).

Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55, 549–558 (2023).

Meisner, J., Benros, M. E. & Rasmussen, S. Leveraging haplotype information in heritability estimation and polygenic prediction. Nat. Commun. 16, 126 (2025).
