Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
Fournier, R., Tsangalidou, Z., Reich, D. & Palamara, P. F. Haplotype-based inference of recent effective population size in modern and ancient DNA samples. Nat. Commun. 14, 7945 (2023).
Palamara, P. F. & Pe’er, I. Inference of historical migration rates via haplotype sharing. Bioinformatics 29, i180–i188 (2013).
Al-Asadi, H., Petkova, D., Stephens, M. & Novembre, J. Estimating recent migration and population-size surfaces. PLoS Genet. 15, e1007908 (2019).
Tian, X., Cai, R. & Browning, S. R. Estimating the genome-wide mutation rate from thousands of unrelated individuals. Am. J. Hum. Genet. 109, 2178–2184 (2022).
Porubsky, D. et al. Human de novo mutation rates from a four-generation pedigree reference. Nature 643, 427–436 (2025).
Lassen, F. H. et al. Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank. Cell Genom. 4, 100602 (2024).
Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).
Browning, S. R., Waples, R. K. & Browning, B. L. Fast, accurate local ancestry inference with FLARE. Am. J. Hum. Genet. 110, 326–335 (2023).
Sun, Q. et al. Improving polygenic risk prediction in admixed populations by explicitly modeling ancestral-differential effects via GAUDI. Nat. Commun. 15, 1016 (2024).
Horimoto, A. R. V. R. et al. Admixture mapping of chronic kidney disease and risk factors in Hispanic/Latino individuals from Central America country of origin. Circ. Genom. Precis. Med. 17, e004314 (2024).
Sun, Q. et al. Opportunities and challenges of local ancestry in genetic association analyses. Am. J. Hum. Genet. 112, 727–740 (2025).
Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: faster genotype imputation. Bioinformatics 31, 782–784 (2015). This paper introduces the minimac2 imputation method (minimac3 and minimac4 are not associated with any publications).
Browning, B. L., Zhou, Y. & Browning, S. R. A one-penny imputed genome from next-generation reference panels. Am. J. Hum. Genet. 103, 338–348 (2018). This paper introduces the Beagle5 imputation method.
Rubinacci, S., Delaneau, O. & Marchini, J. Genotype imputation using the positional Burrows Wheeler transform. PLoS Genet. 16, e1009049 (2020). This paper introduces the IMPUTE5 imputation method.
Sun, Q. et al. Analyses of biomarker traits in diverse UK Biobank participants identify associations missed by European-centric analysis strategies. J. Hum. Genet. 67, 87–93 (2022).
Huerta-Chagoya, A. et al. The power of TOPMed imputation for the discovery of Latino-enriched rare variants associated with type 2 diabetes. Diabetologia 66, 1273–1288 (2023).
Qin, Z. S., Niu, T. & Liu, J. S. Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am. J. Hum. Genet. 71, 1242–1247 (2002).
Stephens, M. & Scheet, P. Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am. J. Hum. Genet. 76, 449–462 (2005).
Scheet, P. & Stephens, M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629–644 (2006).
Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
Li, Y., Willer, C. J., Ding, J., Scheet, P. & Abecasis, G. R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, 816–834 (2010).
Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
Carlson, C. S. et al. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am. J. Hum. Genet. 74, 106–120 (2004).
Long, J. C., Williams, R. C. & Urbanek, M. An E-M algorithm and testing strategy for multiple-locus haplotypes. Am. J. Hum. Genet. 56, 799–810 (1995).
Kong, A. et al. Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075 (2008).
Halperin, E. & Karp, R. M. Perfect phylogeny and haplotype assignment. In Proc. 8th Annual International Conference on Computational Molecular Biology 10–19 (ACM, 2004).
Halperin, E. & Eskin, E. Haplotype reconstruction from genotype data using imperfect phylogeny. Bioinformatics 20, 1842–1849 (2004).
Li, N. & Stephens, M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165, 2213–2233 (2003).
Fearnhead, P. & Donnelly, P. Estimating recombination rates from population genetic data. Genetics 159, 1299–1318 (2001).
Loh, P.-R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016). This paper introduces the Eagle2 phasing method, where PBWT is applied to improve computational efficiency.
Hofmeister, R. J., Ribeiro, D. M., Rubinacci, S. & Delaneau, O. Accurate rare variant phasing of whole-genome and whole-exome sequencing data in the UK Biobank. Nat. Genet. 55, 1243–1249 (2023). This paper introduces the SHAPEIT5 phasing method, where singletons are explicitly considered.
Browning, B. L., Tian, X., Zhou, Y. & Browning, S. R. Fast two-stage phasing of large-scale sequence data. Am. J. Hum. Genet. 108, 1880–1890 (2021). This paper introduces the Beagle5 phasing method (which is different from the Beagle5 imputation publication), where a two-stage phasing strategy is proposed separately for common and rare variants.
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021). This paper introduces the TOPMed reference panel containing haplotypes from diverse populations, which is more suitable for imputation of global populations compared with previous reference panels, including 1000 Genomes and HRC.
Feng, Y.-C. A. et al. Taiwan Biobank: a rich biomedical research database of the Taiwanese population. Cell Genom. 2, 100197 (2022).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016). This paper introduces the Michigan Imputation Server, exemplary in promoting broader usage of reference panels and public servers without accessing individual genotypes contributing to the panels.
Al Bkhetan, Z., Zobel, J., Kowalczyk, A., Verspoor, K. & Goudey, B. Exploring effective approaches for haplotype block phasing. BMC Bioinform. 20, 540 (2019).
Al Bkhetan, Z., Chana, G., Ramamohanarao, K., Verspoor, K. & Goudey, B. Evaluation of consensus strategies for haplotype phasing. Brief. Bioinform. 22, bbaa280 (2021).
Wertenbroek, R., Hofmeister, R. J., Xenarios, I., Thoma, Y. & Delaneau, O. Improving population scale statistical phasing with whole-genome sequencing data. PLoS Genet. 20, e1011092 (2024). This paper introduces a method to correct phasing errors by leveraging raw sequencing data.
Sun, Q. et al. MagicalRsq: machine-learning-based genotype imputation quality calibration. Am. J. Hum. Genet. 109, 1986–1997 (2022). This paper introduces a framework to recalculate the imputation quality metric for post-imputation quality control, especially for low-frequency and rare variants where the state-of-the-art imputation quality metric (for example, Rsq) performs less well.
Sun, Q. et al. MagicalRsq-X: a cross-cohort transferable genotype imputation quality metric. Am. J. Hum. Genet. 111, 990–995 (2024).
Aleknonytė-Resch, M., Szymczak, S., Freitag-Wolf, S., Dempfle, A. & Krawczak, M. Genotype imputation in case-only studies of gene-environment interaction: validity and power. Hum. Genet. 140, 1217–1228 (2021).
Sun, Q. et al. Leveraging TOPMed imputation server and constructing a cohort-specific imputation reference panel to enhance genotype imputation among cystic fibrosis patients. HGG Adv. 3, 100090 (2022).
Lau, W. et al. The hazards of genotype imputation when mapping disease susceptibility variants. Genome Biol. 25, 7 (2024).
Liu, E. Y. et al. Genotype imputation of Metabochip SNPs using a study-specific reference panel of ~4,000 haplotypes in African Americans from the Women’s Health Initiative. Genet. Epidemiol. 36, 107–117 (2012).
Xu, Z. M. et al. Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations. PLoS Comput. Biol. 18, e1009628 (2022).
Sengupta, D. et al. Performance and accuracy evaluation of reference panels for genotype imputation in sub-Saharan African populations. Cell Genom. 3, 100332 (2023).
Cahoon, J. L. et al. Imputation accuracy across global human populations. Am. J. Hum. Genet. 111, 979–989 (2024).
Handsaker, R. E. et al. Large multiallelic copy number variations in humans. Nat. Genet. 47, 296–303 (2015).
Saini, S., Mitra, I., Mousavi, N., Fotsing, S. F. & Gymrek, M. A reference haplotype panel for genome-wide imputation of short tandem repeats. Nat. Commun. 9, 4397 (2018).
Ziaei Jam, H. et al. A deep population reference panel of tandem repeat variation. Nat. Commun. 14, 6711 (2023).
Noyvert, B. et al. Imputation of structural variants using a multi-ancestry long-read sequencing panel enables identification of disease associations. eLife 14, RP106115 (2025). This work performs imputation of SVs using a reference panel based on long-read sequencing data, demonstrating the practical utility of long-read sequencing in the context of imputation, particularly for SVs.
Browning, S. R. & Browning, B. L. Haplotype phasing: existing methods and new developments. Nat. Rev. Genet. 12, 703–714 (2011).
Sakaue, S. et al. Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease. Nat. Protoc. 18, 2625–2641 (2023).
Tregouet, D. A., Escolano, S., Tiret, L., Mallet, A. & Golmard, J. L. A new algorithm for haplotype-based association analysis: the Stochastic-EM algorithm. Ann. Hum. Genet. 68, 165–177 (2004).
Browning, B. L. & Browning, S. R. Statistical phasing of 150,119 sequenced genomes in the UK Biobank. Am. J. Hum. Genet. 110, 161–165 (2023).
Sohail, M. et al. Mexican Biobank advances population and medical genomics of diverse ancestries. Nature 622, 775–783 (2023).
Durbin, R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics 30, 1266–1272 (2014). This paper proposes a series of algorithms for haplotype data compression and efficient haplotype matching, reducing the computational complexity from quadratic to linear in terms of the number of reference haplotypes. It represents a milestone in the recent computational development of phasing and imputation methods.
Delaneau, O., Zagury, J.-F., Robinson, M. R., Marchini, J. L. & Dermitzakis, E. T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 10, 5436 (2019).
Browning, B. L. & Browning, S. R. Genotype imputation with millions of reference samples. Am. J. Hum. Genet. 98, 116–126 (2016).
Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
O’Connell, J. et al. Haplotype estimation for biobank-scale data sets. Nat. Genet. 48, 817–820 (2016).
Palin, K., Campbell, H., Wright, A. F., Wilson, J. F. & Durbin, R. Identity-by-descent-based phasing and imputation in founder populations using graphical models. Genet. Epidemiol. 35, 853–860 (2011).
Hickey, J. M. et al. A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes. Genet. Sel. Evol. 43, 12 (2011).
Herzig, A. F. et al. Strategies for phasing and imputation in a population isolate. Genet. Epidemiol. 42, 201–213 (2018).
Williams, A. L., Patterson, N., Glessner, J., Hakonarson, H. & Reich, D. Phasing of many thousands of genotyped samples. Am. J. Hum. Genet. 91, 238–251 (2012).
O’Connell, J. et al. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet. 10, e1004234 (2014).
Oget-Ebrad, C. et al. Benchmarking phasing software with a whole-genome sequenced cattle pedigree. BMC Genom. 23, 130 (2022).
Choi, Y., Chan, A. P., Kirkness, E., Telenti, A. & Schork, N. J. Comparison of phasing strategies for whole human genomes. PLoS Genet. 14, e1007308 (2018).
Lajugie, J. et al. Complete genome phasing of family quartet by combination of genetic, physical and population-based phasing analysis. PLoS ONE 8, e64571 (2013).
Chen, G. K., Wang, K., Stram, A. H., Sobel, E. M. & Lange, K. Mendel-GPU: haplotyping and genotype imputation on graphics processing units. Bioinformatics 28, 2979–2980 (2012).
Na, J. C., Lee, I., Rhee, J.-K. & Shin, S.-Y. Fast single individual haplotyping method using GPGPU. Comput. Biol. Med. 113, 103421 (2019).
Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat. Genet. 53, 1504–1516 (2021).
1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
Li, Y., Willer, C., Sanna, S. & Abecasis, G. Genotype imputation. Annu. Rev. Genom. Hum. Genet. 10, 387–406 (2009).
Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499–511 (2010).
Das, S., Abecasis, G. R. & Browning, B. L. Genotype imputation from large reference panels. Annu. Rev. Genom. Hum. Genet. 19, 73–96 (2018).
Howie, B., Fuchsberger, C., Stephens, M., Marchini, J. & Abecasis, G. R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).
Kojima, K., Tadaka, S., Okamura, Y. & Kinoshita, K. Two-stage strategy using denoising autoencoders for robust reference-free genotype imputation with missing input genotypes. J. Hum. Genet. 69, 511–518 (2024).
Pasaniuc, B. et al. Extremely low-coverage sequencing and imputation increases power for genome-wide association studies. Nat. Genet. 44, 631–635 (2012).
Hui, R., D’Atanasio, E., Cassidy, L. M., Scheib, C. L. & Kivisild, T. Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes. Sci. Rep. 10, 18542 (2020).
Sousa da Mota, B. et al. Imputation of ancient human genomes. Nat. Commun. 14, 3660 (2023).
Rubinacci, S., Ribeiro, D. M., Hofmeister, R. J. & Delaneau, O. Efficient phasing and imputation of low-coverage sequencing data using large reference panels. Nat. Genet. 53, 120–126 (2021).
Spiliopoulou, A., Colombo, M., Orchard, P., Agakov, F. & McKeigue, P. GeneImp: fast imputation to large reference panels using genotype likelihoods from ultralow coverage sequencing. Genetics 206, 91–104 (2017).
Davies, R. W. et al. Rapid genotype imputation from sequence with reference panels. Nat. Genet. 53, 1104–1111 (2021).
Jagirdar, K. et al. Molecular analysis of common polymorphisms within the human tyrosinase locus and genetic association with pigmentation traits. Pigment. Cell Melanoma Res. 27, 552–564 (2014).
VanRaden, P. M., Sun, C. & O’Connell, J. R. Fast imputation using medium or low-coverage sequence data. BMC Genet. 16, 82 (2015).
Davies, R. W., Flint, J., Myers, S. & Mott, R. Rapid genotype imputation from sequence without reference panels. Nat. Genet. 48, 965–969 (2016).
Zheng, C., Boer, M. P. & van Eeuwijk, F. A. Accurate genotype imputation in multiparental populations from low-coverage sequence. Genetics 210, 71–82 (2018).
Geman, S. & Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984).
Rubinacci, S., Hofmeister, R. J., Sousa da Mota, B. & Delaneau, O. Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes. Nat. Genet. 55, 1088–1090 (2023). This paper introduces GLIMPSE2, an imputation method specifically designed for ulcWGS data.
Martiniano, R. et al. The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods. PLoS Genet. 13, e1006852 (2017).
Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257 (2014).
Royo, J. L. Hardy Weinberg equilibrium disturbances in case–control studies lead to non-conclusive results. Cell J. 22, 572–574 (2021).
Wigginton, J. E., Cutler, D. J. & Abecasis, G. R. A note on exact tests of Hardy–Weinberg equilibrium. Am. J. Hum. Genet. 76, 887–893 (2005).
Yu, K.-D., Di, G.-H., Fan, L. & Shao, Z.-M. Test of Hardy–Weinberg equilibrium in breast cancer case-control studies: an issue may influence the conclusions. Breast Cancer Res. Treat. 117, 675–677 (2009).
Hachiya, T. et al. The NBDC-DDBJ imputation server facilitates the use of controlled access reference panel datasets in Japan. Hum. Genome Var. 9, 48 (2022).
Gürsoy, G., Chielle, E., Brannon, C. M., Maniatakos, M. & Gerstein, M. Privacy-preserving genotype imputation with fully homomorphic encryption. Cell Syst. 13, 173–182.e3 (2022).
Mosca, M. J. & Cho, H. Reconstruction of private genomes through reference-based genotype imputation. Genome Biol. 24, 271 (2023).
Cavinato, T., Rubinacci, S., Malaspinas, A.-S. & Delaneau, O. A resampling-based approach to share reference panels. Nat. Comput. Sci. 4, 360–366 (2024).
Rayner, N. W., Park, Y.-C., Fuchsberger, C., Barysenka, A. & Zeggini, E. Toward GDPR compliance with the Helmholtz Munich genotype imputation server. Nat. Genet. 56, 2580–2581 (2024).
Zhu, W. et al. IMMerge: merging imputation data at scale. Bioinformatics 39, btac750 (2023).
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
Jostins, L., Morley, K. I. & Barrett, J. C. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets. Eur. J. Hum. Genet. 19, 662–666 (2011).
Bai, W.-Y. et al. Genotype imputation and reference panel: a systematic evaluation on haplotype size and diversity. Brief. Bioinform. 21, 1806–1817 (2019).
Kowalski, M. H. et al. Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet. 15, e1008500 (2019).
Yoo, S.-K. et al. NARD: whole-genome reference panel of 1779 Northeast Asians improves imputation accuracy of rare and low-frequency variants. Genome Med. 11, 64 (2019).
Yu, C. et al. A high-resolution haplotype-resolved reference panel constructed from the China Kadoorie Biobank study. Nucleic Acids Res. 51, 11770–11782 (2023).
Cengnata, A. et al. A genotype imputation reference panel specific for native Southeast Asian populations. NPJ Genom. Med. 9, 47 (2024).
O’Connell, J. et al. A population-specific reference panel for improved genotype imputation in African Americans. Commun. Biol. 4, 1269 (2021).
Panjwani, N. et al. Improving imputation in disease-relevant regions: lessons from cystic fibrosis. NPJ Genom. Med. 3, 8 (2018).
Yu, K. et al. Meta-imputation: an efficient method to combine genotype data after imputation with multiple reference panels. Am. J. Hum. Genet. 109, 1007–1015 (2022). This paper introduces meta-imputation to combine imputed results from multiple reference panels. It is helpful in scenarios where multiple references are suitable, for example, where a small population-specific (or disease cohort) reference panel is available in addition to a large reference panel from general or mismatched populations.
Hwang, M. Y., Choi, N.-H., Won, H. H., Kim, B.-J. & Kim, Y. J. Analyzing the Korean reference genome with meta-imputation increased the imputation accuracy and spectrum of rare variants in the Korean population. Front. Genet. 13, 1008646 (2022).
Xu, J. et al. Evaluation of imputation performance of multiple reference panels in a Pakistani population. HGG Adv. 6, 100395 (2025).
Quick, C. et al. Sequencing and imputation in GWAS: cost-effective strategies to increase power and genomic coverage across diverse populations. Genet. Epidemiol. 44, 537–549 (2020).
Roberts, G. H. L., Santorico, S. A. & Spritz, R. A. Deep genotype imputation captures virtually all heritability of autoimmune vitiligo. Hum. Mol. Genet. 29, 859–863 (2020).
Yu, W.-Y. et al. Efficient identification of trait-associated loss-of-function variants in the UK Biobank cohort by exome-sequencing based genotype imputation. Genet. Epidemiol. 47, 121–134 (2023).
Si, Y., Vanderwerff, B. & Zöllner, S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 217, iyab011 (2021).
Chen, S.-F. et al. Genotype imputation and variability in polygenic risk score estimation. Genome Med. 12, 100 (2020).
Zhang, Z., Xiao, X., Zhou, W., Zhu, D. & Amos, C. I. False positive findings during genome-wide association studies with imputation: influence of allele frequency and imputation accuracy. Hum. Mol. Genet. 31, 146–155 (2021).
Appadurai, V. et al. Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. Commun. Biol. 6, 101 (2023).
Scarano, C. et al. The third-generation sequencing challenge: novel insights for the omic sciences. Biomolecules 14, 568 (2024).
Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).
Deamer, D., Akeson, M. & Branton, D. Three decades of nanopore sequencing. Nat. Biotechnol. 34, 518–524 (2016).
Xu, Y., Luo, H., Wang, Z., Lam, H.-M. & Huang, C. Oxford Nanopore Technology: revolutionizing genomics research in plants. Trends Plant. Sci. 27, 510–511 (2022).
Snyder, M. W., Adey, A., Kitzman, J. O. & Shendure, J. Haplotype-resolved genome sequencing: experimental methods and applications. Nat. Rev. Genet. 16, 344–358 (2015).
Garg, S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol. 22, 101 (2021).
Zhang, T. et al. Complex genome assembly based on long-read sequencing. Brief. Bioinform. 23, bbac305 (2022).
Maestri, S. et al. A long-read sequencing approach for direct haplotype phasing in clinical settings. Int. J. Mol. Sci. 21, 9177 (2020).
Kronenberg, Z. N. et al. Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C. Nat. Commun. 12, 1935 (2021).
Sakamoto, Y. et al. Phasing analysis of lung cancer genomes using a long read sequencer. Nat. Commun. 13, 3464 (2022).
Bansal, V. & Bafna, V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 24, i153–i159 (2008).
Bansal, V. HapCUT2: a method for phasing genomes using experimental sequence data. Methods Mol. Biol. 2590, 139–147 (2023).
Patterson, M. et al. WhatsHap: weighted haplotype assembly for future-generation sequencing reads. J. Comput. Biol. 22, 498–509 (2015).
Bracciali, A. et al. PWHATSHAP: efficient haplotyping for future generation sequencing. BMC Bioinform. 17, 342 (2016).
Garg, S. et al. Chromosome-scale, haplotype-resolved assembly of human genomes. Nat. Biotechnol. 39, 309–312 (2021).
Feng, Z., Clemente, J. C., Wong, B. & Schadt, E. E. Detecting and phasing minor single-nucleotide variants from long-read sequencing data. Nat. Commun. 12, 3032 (2021).
Yu, Y., Chen, L., Miao, X. & Li, S. C. SpecHap: a diploid phasing algorithm based on spectral graph theory. Nucleic Acids Res. 49, e114 (2021).
Fruzangohar, M., Timmins, W. A., Kravchuk, O. & Taylor, J. HaploMaker: an improved algorithm for rapid haplotype assembly of genomic sequences. Gigascience 11, giac038 (2022).
Lin, J.-H., Chen, L.-C., Yu, S.-C. & Huang, Y.-T. LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants. Bioinformatics 38, 1816–1822 (2022).
Holt, J. M. et al. HiPhase: jointly phasing small, structural, and tandem repeat variants from HiFi sequencing. Bioinformatics 40, btae042 (2024).
Edsgärd, D., Reinius, B. & Sandberg, R. scphaser: haplotype inference using single-cell RNA-seq data. Bioinformatics 32, 3038–3040 (2016).
Castel, S. E., Mohammadi, P., Chung, W. K., Shen, Y. & Lappalainen, T. Rare variant phasing and haplotypic expression from RNA sequencing with phASER. Nat. Commun. 7, 12817 (2016).
Akbari, V. & Jones, S. J. M. Phasing DNA methylation. Methods Mol. Biol. 2590, 219–235 (2023).
Fu, Y. et al. MethPhaser: methylation-based long-read haplotype phasing of human genomes. Nat. Commun. 15, 5327 (2024).
Ouchi, S., Kajitani, R. & Itoh, T. GreenHill: a de novo chromosome-level scaffolding and phasing tool using Hi-C. Genome Biol. 24, 162 (2023).
Henglin, M. et al. Graphasing: phasing diploid genome assembly graphs with single-cell strand sequencing. Genome Biol. 25, 265 (2024).
Yang, W.-Y. et al. Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data. Bioinformatics 29, 2245–2252 (2013).
Bansal, V. Integrating read-based and population-based phasing for dense and accurate haplotyping of individual genomes. Bioinformatics 35, i242–i248 (2019).
Schloissnig, S. et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature 644, 442–452 (2025).
Liao, W.-W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).
Consens, M. E. et al. Transformers and genome language models. Nat. Mach. Intell. 7, 346–362 (2025).
Durante, Z. et al. Agent AI: surveying the horizons of multimodal interaction. Preprint at https://doi.org/10.48550/arXiv.2401.03568 (2024).
Kapoor, S., Stroebl, B., Siegel, Z. S., Nadgir, N. & Narayanan, A. AI agents that matter. Preprint at https://doi.org/10.48550/arXiv.2407.01502 (2024).
Choudhury, O., Chakrabarty, A. & Emrich, S. J. Highly accurate and efficient data-driven methods for genotype imputation. IEEE/ACM Trans. Comput. Biol. Bioinform. 16, 1107–1116 (2019).
Chen, J. & Shi, X. Sparse convolutional denoising autoencoders for genotype imputation. Genes 10, 652 (2019).
Kojima, K. et al. A genotype imputation method for de-identified haplotype reference information by using recurrent neural network. PLoS Comput. Biol. 16, e1008207 (2020).
Chi Duong, V. et al. A rapid and reference-free imputation method for low-cost genotyping platforms. Sci. Rep. 13, 23083 (2023).
Mowlaei, M. E. et al. STICI: split-transformer with integrated convolutions for genotype imputation. Nat. Commun. 16, 1218 (2025).
Sun, Q. et al. Polygenic scores of cardiometabolic risk factors in American Indian adults. JAMA Netw. Open 8, e250535 (2025).
Li, Y., Sidore, C., Kang, H. M., Boehnke, M. & Abecasis, G. R. Low-coverage sequencing: implications for design of complex trait association studies. Genome Res. 21, 940–951 (2011).
Zöllner, S. Sampling strategies for rare variant tests in case-control studies. Eur. J. Hum. Genet. 20, 1085–1091 (2012).
Kang, J. et al. AbCD: arbitrary coverage design for sequencing-based genetic studies. Bioinformatics 29, 799–801 (2013).
Duan, Q., Liu, E. Y., Croteau-Chonka, D. C., Mohlke, K. L. & Li, Y. A comprehensive SNP and indel imputability database. Bioinformatics 29, 528–531 (2013).
Browning, B. L. & Browning, S. R. Efficient multilocus association testing for whole genome association studies using localized haplotype clustering. Genet. Epidemiol. 31, 365–375 (2007).
Loh, P.-R., Palamara, P. F. & Price, A. L. Fast and accurate long-range phasing in a UK Biobank cohort. Nat. Genet. 48, 811–816 (2016).
Platt, A., Pivirotto, A., Knoblauch, J. & Hey, J. An estimator of first coalescent time reveals selection on young variants and large heterogeneity in rare allele ages among human populations. PLoS Genet. 15, e1008340 (2019).
Banday, A. R. et al. Genetic regulation of OAS1 nonsense-mediated decay underlies association with COVID-19 hospitalization in patients of European and African ancestries. Nat. Genet. 54, 1103–1116 (2022).
Michalek, D. A. et al. A multi-ancestry genome-wide association study in type 1 diabetes. Hum. Mol. Genet. 33, 958–968 (2024).
Lucas, E. R. et al. Genome-wide association studies reveal novel loci associated with pyrethroid and organophosphate resistance in Anopheles gambiae and Anopheles coluzzii. Nat. Commun. 14, 4946 (2023).
Bråten, L. S., Ingelman-Sundberg, M., Jukic, M. M., Molden, E. & Kringen, M. K. Impact of the novel CYP2C:TG haplotype and CYP2B6 variants on sertraline exposure in a large patient population. Clin. Transl. Sci. 15, 2135–2145 (2022).
Aksit, M. A. et al. Pleiotropic modifiers of age-related diabetes and neonatal intestinal obstruction in cystic fibrosis. Am. J. Hum. Genet. 109, 1894–1908 (2022).
Loftus, S. K. et al. Haplotype-based analysis resolves missing heritability in oculocutaneous albinism type 1B. Am. J. Hum. Genet. 110, 1123–1137 (2023). This paper sets up an example of how phasing or haplotype-level analyses can help better understand disease-causing alleles, elucidate genetic mechanisms underlying diseases and aid genetic diagnosis.
Khankhanian, P., Gourraud, P.-A., Lizee, A. & Goodin, D. S. Haplotype-based approach to known MS-associated regions increases the amount of explained risk. J. Med. Genet. 52, 587–594 (2015).
Albiñana, C. et al. Genetic correlates of vitamin D-binding protein and 25-hydroxyvitamin D in neonatal dried blood spots. Nat. Commun. 14, 852 (2023).
Sollis, E. et al. The NHGRI-EBI GWAS catalog: knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023).
Lin, S. et al. Evidence that the Ser192Tyr/Arg402Gln in cis tyrosinase gene haplotype is a disease-causing allele in oculocutaneous albinism type 1B (OCA1B). NPJ Genom. Med. 7, 2 (2022).
Shriner, D. Overview of admixture mapping. Curr. Protoc. 3, e677 (2023).
Duan, Q. et al. A robust and powerful two-step testing procedure for local ancestry adjusted allelic association analysis in admixed populations. Genet. Epidemiol. 42, 288–302 (2018).
Atkinson, E. G. et al. Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power. Nat. Genet. 53, 195–204 (2021).
Hou, K. et al. Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. Bioinformatics 40, btae148 (2024).
Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55, 549–558 (2023).
Meisner, J., Benros, M. E. & Rasmussen, S. Leveraging haplotype information in heritability estimation and polygenic prediction. Nat. Commun. 16, 126 (2025).