{"id":18663,"date":"2025-04-14T07:40:22","date_gmt":"2025-04-14T07:40:22","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/18663\/"},"modified":"2025-04-14T07:40:22","modified_gmt":"2025-04-14T07:40:22","slug":"solanum-pan-genetics-reveals-paralogues-as-contingencies-in-crop-engineering","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/18663\/","title":{"rendered":"Solanum pan-genetics reveals paralogues as contingencies in crop engineering"},"content":{"rendered":"<p>Plant material, phenotypic analyses and imaging<\/p>\n<p>Details on all plant material used in this study, including the passport identification numbers of acquisitions from seed stock centres, are available in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">1<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">10<\/a>. All phenotypic assessments were performed on plants grown in greenhouses or fields. All of the images presented in all of the figures were taken by the authors and are our own. All illustrations (such as fruit representations) in all of the figures were prepared by the authors and are our own. Quantitative phenotypic data were collected manually in fields and greenhouses and recorded in Microsoft Excel. 
Source data are provided in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">8<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">12<\/a>\u2013<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">14<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">16<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">18<\/a>. Seven herbarium vouchers were collected from field-grown Solanum accessions. Vouchers were deposited to the Steere Herbarium at the New York Botanical Garden (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">1<\/a>).<\/p>\n<p>Tissue collection and high-molecular-mass DNA extraction<\/p>\n<p>For extraction of high-molecular-mass DNA, young leaves were collected from 21-day-old light-grown seedlings. Before tissue collection, seedlings were etiolated in complete darkness for 48\u2009h. 
Flash-frozen plant tissue was ground using a mortar and pestle and extracted in four volumes of ice-cold extraction buffer 1 (0.4\u2009M sucrose, 10\u2009mM Tris-HCl pH\u20098, 10\u2009mM MgCl2 and 5\u2009mM 2-mercaptoethanol). Extracts were briefly vortexed, incubated on ice for 15\u2009min and filtered twice through a single layer of Miracloth (Millipore Sigma). Filtrates were centrifuged at 4,000\u2009rpm for 20\u2009min at 4\u2009\u00b0C, and pellets were gently resuspended in 1\u2009ml of extraction buffer 2 (0.25\u2009M sucrose, 10\u2009mM Tris-HCl pH\u20098, 10\u2009mM MgCl2, 1% Triton X-100, and 5\u2009mM 2-mercaptoethanol). Crude nuclear pellets were collected by centrifugation at 12,000g for 10\u2009min at 4\u2009\u00b0C and washed by resuspension in 1\u2009ml of extraction buffer 2 followed by centrifugation at 12,000g for 10\u2009min at 4\u2009\u00b0C. Nuclear pellets were resuspended in 500\u2009\u03bcl of extraction buffer 3 (1.7\u2009M sucrose, 10\u2009mM Tris-HCl pH\u20098, 0.15% Triton X-100, 2\u2009mM MgCl2 and 5\u2009mM 2-mercaptoethanol), layered over 500\u2009\u03bcl of extraction buffer 3 and centrifuged for 30\u2009min at 16,000g at 4\u2009\u00b0C. The nuclei were resuspended in 2.5\u2009ml of nuclei lysis buffer (0.2\u2009M Tris pH\u20097.5, 2\u2009M NaCl, 50\u2009mM EDTA and 55\u2009mM CTAB) and 1\u2009ml of 5% Sarkosyl solution and incubated at 60\u2009\u00b0C for 30\u2009min.<\/p>\n<p>To extract DNA, nuclear extracts were gently mixed with 8.5\u2009ml of chloroform:isoamyl alcohol solution (24:1) and slowly rotated for 15\u2009min. After centrifugation at 4,000\u2009rpm for 20\u2009min, 3\u2009ml of aqueous phase was transferred to new tubes and mixed with 300\u2009\u03bcl of 3\u2009M NaOAc and 6.6\u2009ml of ice-cold ethanol. Precipitated DNA strands were transferred to new 1.5\u2009ml tubes and washed twice with ice-cold 80% ethanol. 
Dried DNA strands were dissolved in 100\u2009\u03bcl of elution buffer (10\u2009mM Tris-HCl, pH\u20098.5) overnight at 4\u2009\u00b0C. The quality, quantity and molecular mass of DNA samples were assessed using Nanodrop (Thermo Fisher Scientific), Qubit (Thermo Fisher Scientific) and pulsed-field gel electrophoresis (CHEF Mapper XA System, Bio-Rad) according to the manufacturer\u2019s instructions.<\/p>\n<p>Genome assembly<\/p>\n<p>Reference-quality genome assemblies for each of the 22 species, with two reference-quality genomes for S. muricatum (accession information is provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>), were generated using a combination of long-read sequencing (Pacific Biosciences) for contigging and optical mapping (Bionano Genomics) for scaffolding. Between 1 and 4 PacBio Sequel IIe flow cells (Pacific Biosciences) were used for the sequencing of each sample in the Solanum-wide pan-genome (average read N50\u2009=\u200929,067\u2009bp, average coverage\u2009=\u200963\u00d7). The exact number of flow cells and sequencing technology for each sample are provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>. For the additional nine S. aethiopicum samples, a combination of PacBio Sequel IIe, PacBio Revio sequencing and Oxford Nanopore sequencing was used to assemble the genomes (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">11<\/a>). 
Before assembly, we counted k-mers from raw reads using KMC3<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 62\" title=\"Kokot, M., Dlugosz, M. &amp; Deorowicz, S. KMC 3: counting and manipulating k-mer statistics. Bioinformatics 33, 2759&#x2013;2761 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR62\" id=\"ref-link-section-d278053768e3079\" target=\"_blank\" rel=\"noopener\">62<\/a> (v.3.2.1) and estimated the genome size, sequencing coverage and heterozygosity using GenomeScope (v.2.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 63\" title=\"Ranallo-Benavidez, T. R., Jaron, K. S. &amp; Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 11, 1432 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR63\" id=\"ref-link-section-d278053768e3083\" target=\"_blank\" rel=\"noopener\">63<\/a>. For five samples (details are provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>), low-quality reads were filtered out with a custom script (<a href=\"https:\/\/github.com\/pan-sol\/pan-sol-pipelines\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/pan-sol\/pan-sol-pipelines<\/a>). Sequencing reads from each sample were assembled using hifiasm<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 64\" title=\"Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. &amp; Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. 
Methods 18, 170&#x2013;175 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR64\" id=\"ref-link-section-d278053768e3098\" target=\"_blank\" rel=\"noopener\">64<\/a>; the exact parameters and software version varied between samples according to the estimated level of heterozygosity and are reported in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>. After assembly, the draft contigs were screened for possible microbial contamination as previously described<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 26\" title=\"Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145&#x2013;161 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR26\" id=\"ref-link-section-d278053768e3105\" target=\"_blank\" rel=\"noopener\">26<\/a>. The Nchart plot was generated with ggplot2 (<a href=\"https:\/\/ggplot2.tidyverse.org\/\" target=\"_blank\" rel=\"noopener\">https:\/\/ggplot2.tidyverse.org\/<\/a>) using an adaptation of Nchart (<a href=\"https:\/\/github.com\/MariaNattestad\/Nchart\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/MariaNattestad\/Nchart<\/a>).<\/p>\n<p>Genome assembly scaffolding<\/p>\n<p>Optical mapping (Bionano Genomics) was performed for 17 samples to facilitate scaffolding. Scaffolding with optical maps was performed using the Bionano Solve Hybrid Scaffold pipeline with the recommended default parameters (<a href=\"https:\/\/bionano.com\/software-downloads\/\" target=\"_blank\" rel=\"noopener\">https:\/\/bionano.com\/software-downloads\/<\/a>). 
Hybrid scaffold N50s ranged from 33,254,022\u2009bp to 219,385,699\u2009bp (further details, including Bionano molecules per sample, are provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>). High-throughput chromosome conformation capture (Hi-C) from Arima Genomics was performed for eight samples to finalize scaffolding. For samples with Hi-C data, reads were integrated using the Juicer (v.0.7.17-r1198-dirty) pipeline. Next, misjoins and chromosomal boundaries were manually curated in the Juicebox (v.1.11.08) application. Chromosomes were named based on sequence homology, determined using the RagTag<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 65\" title=\"Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR65\" id=\"ref-link-section-d278053768e3141\" target=\"_blank\" rel=\"noopener\">65<\/a> scaffold (v.2.1.0, default parameters), with the phylogenetically closest finished genome (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a>); 12 of these samples (including nine S. aethiopicum samples) were scaffolded with RagTag. Finally, small contigs with more than 95% of the sequence mapping to a named chromosome were removed. Moreover, small contigs with more than 80% of the sequence mapping to a named chromosome and containing one or more duplicated BUSCO genes, but no single-copy BUSCO genes, were also removed using a Python script. 
Using merqury<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Rhie, A., Walenz, B. P., Koren, S. &amp; Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR61\" id=\"ref-link-section-d278053768e3152\" target=\"_blank\" rel=\"noopener\">61<\/a> with the HiFi data, the final consensus quality of the assemblies averaged QV\u2009=\u200953, with an average completeness of 99.2741%.<\/p>\n<p>Tissue collection, RNA extraction and quantification<\/p>\n<p>All tissues were collected in 3\u20134 biological replicates from different greenhouse-grown plants at approximately 09:00\u201310:00 and flash-frozen in liquid nitrogen in 1.5\u2009ml microfuge tubes containing a 5\/32 inch (about\u00a03.97\u2009mm) 440 stainless steel ball bearing (BC Precision). Tubes containing tissue were placed in a \u221280\u2009\u00b0C stainless steel tube rack and ground using a SPEX SamplePrep 2010 Geno\/Grinder (Cole-Parmer) for 1\u2009min at 1,440\u2009rpm. For shoot apices, total RNA was extracted using TRIzol (Invitrogen) according to the manufacturer\u2019s instructions for ground tissue. For all other tissues (cotyledons, hypocotyls, leaves, flower buds and flowers), total RNA was extracted using the Quick-RNA MicroPrep Kit (Zymo Research). RNA was treated with DNase I (Zymo Research) according to the manufacturer\u2019s instructions. The purity and concentration of the resulting total RNA were assessed using the NanoDrop One spectrophotometer (Thermo Fisher Scientific). Libraries for RNA-seq were prepared using the KAPA mRNA HyperPrep Kit (Roche). Paired-end 100\u2009base sequencing was conducted on the NextSeq 2000 P3 sequencing platform (Illumina). 
Reads were trimmed using trimmomatic (v.0.39)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 66\" title=\"Bolger, A. M., Lohse, M. &amp; Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114&#x2013;2120 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR66\" id=\"ref-link-section-d278053768e3165\" target=\"_blank\" rel=\"noopener\">66<\/a> and then mapped to their respective genome using STAR (v.2.7.5c)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 67\" title=\"Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15&#x2013;21 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR67\" id=\"ref-link-section-d278053768e3169\" target=\"_blank\" rel=\"noopener\">67<\/a> and expression was computed in TPM.<\/p>\n<p>Gene annotation<\/p>\n<p>The gene-annotation pipeline (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM1\" target=\"_blank\" rel=\"noopener\">2c<\/a>) involved several crucial steps, beginning with lift over of gene models using the Liftoff algorithm on community-established references of tomato (Heinz reference genome) and eggplant (Brinjal reference genome). We augmented the annotation using RNA-seq data from 15 species and multiple tissues for de novo annotation. 
First, the quality of the raw RNA-seq reads from each sample (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">6<\/a>) was assessed using FastQC v.0.11.9 (<a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/\" target=\"_blank\" rel=\"noopener\">http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/<\/a>). Reference-based transcripts were then generated using the STAR (v.2.7.5c)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 67\" title=\"Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15&#x2013;21 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR67\" id=\"ref-link-section-d278053768e3194\" target=\"_blank\" rel=\"noopener\">67<\/a> and StringTie2 (v.2.1.2)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 68\" title=\"Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR68\" id=\"ref-link-section-d278053768e3198\" target=\"_blank\" rel=\"noopener\">68<\/a> workflows. To refine the data, invalid splice junctions from the STAR aligner were filtered out using Portcullis (v.1.2.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 69\" title=\"Mapleson, D., Venturini, L., Kaithakottil, G. &amp; Swarbreck, D. Efficient and accurate detection of splice junctions from RNA-seq with Portcullis. 
Gigascience 7, giy131 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR69\" id=\"ref-link-section-d278053768e3203\" target=\"_blank\" rel=\"noopener\">69<\/a>. Orthologues with coverage above 50% and 75% identity were lifted from the tomato reference genome Heinz (v.4.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 70\" title=\"Hosmani, P. S. et al. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. Preprint at bioRxiv &#010;                  https:\/\/doi.org\/10.1101\/767764&#010;                  &#010;                 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR70\" id=\"ref-link-section-d278053768e3207\" target=\"_blank\" rel=\"noopener\">70<\/a> and the eggplant reference genome Eggplant (v.4.1)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 71\" title=\"Li, D. et al. A high-quality genome assembly of the eggplant provides insights into the molecular basis of disease resistance and chlorogenic acid synthesis. Mol. Ecol. Resour. 21, 1274&#x2013;1286 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR71\" id=\"ref-link-section-d278053768e3211\" target=\"_blank\" rel=\"noopener\">71<\/a> using Liftoff (v.1.6.3)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 72\" title=\"Shumate, A. &amp; Salzberg, S. L. Liftoff: accurate mapping of gene annotations. 
Bioinformatics 37, 1639&#x2013;1643 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR72\" id=\"ref-link-section-d278053768e3215\" target=\"_blank\" rel=\"noopener\">72<\/a> with the parameters --copies and --exclude_partial, using both the GMAP (v.2020-10-14)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 73\" title=\"Wu, T. D., Reeder, J., Lawrence, M., Becker, G. &amp; Brauer, M. J. GMAP and GSNAP for genomic sequence alignment: enhancements to speed, accuracy, and functionality. Methods Mol. Biol. 1418, 283&#x2013;334 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR73\" id=\"ref-link-section-d278053768e3219\" target=\"_blank\" rel=\"noopener\">73<\/a> and Minimap2 (v.2.17-r941)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 74\" title=\"Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094&#x2013;3100 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR74\" id=\"ref-link-section-d278053768e3223\" target=\"_blank\" rel=\"noopener\">74<\/a> aligners. Furthermore, protein evidence from several published Solanaceae genomes<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 70\" title=\"Hosmani, P. S. et al. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. 
Preprint at bioRxiv &#010;                  https:\/\/doi.org\/10.1101\/767764&#010;                  &#010;                 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR70\" id=\"ref-link-section-d278053768e3228\" target=\"_blank\" rel=\"noopener\">70<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 71\" title=\"Li, D. et al. A high-quality genome assembly of the eggplant provides insights into the molecular basis of disease resistance and chlorogenic acid synthesis. Mol. Ecol. Resour. 21, 1274&#x2013;1286 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR71\" id=\"ref-link-section-d278053768e3231\" target=\"_blank\" rel=\"noopener\">71<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 75\" title=\"Potato Genome Sequencing Consortium. Genome sequence and analysis of the tuber crop potato. Nature 475, 189&#x2013;195 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR75\" id=\"ref-link-section-d278053768e3234\" target=\"_blank\" rel=\"noopener\">75<\/a>, and the UniProt\/SwissProt database were used to support gene annotation. Structural gene annotations were generated using the Mikado (v.2.0rc2)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 76\" title=\"Venturini, L., Caim, S., Kaithakottil, G. G., Mapleson, D. L. &amp; Swarbreck, D. Leveraging multiple transcriptome assembly methods for improved gene structure annotation. Gigascience 7, giy093 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR76\" id=\"ref-link-section-d278053768e3238\" target=\"_blank\" rel=\"noopener\">76<\/a> framework, leveraging evidence from the Daijin pipeline. 
Moreover, microsynteny and shared orthology to Heinz v.4.0 and Eggplant v.4.0 were assessed using Microsynteny and Orthofinder (v.2.5.2)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 77\" title=\"Emms, D. M. &amp; Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR77\" id=\"ref-link-section-d278053768e3242\" target=\"_blank\" rel=\"noopener\">77<\/a>. Correction of gene models with inframe stop codons was performed using Miniprot2<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 78\" title=\"Li, H. Protein-to-genome alignment with miniprot. Bioinformatics 39, btad014 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR78\" id=\"ref-link-section-d278053768e3246\" target=\"_blank\" rel=\"noopener\">78<\/a> protein alignments to incorporate protein data from Heinz v.4.0 and Eggplant v.4.1. Furthermore, gene models lacking start or stop codons were adjusted by placing them within 300\u2009bp of the nearest codon location using a custom Python script (<a href=\"https:\/\/github.com\/pan-sol\/pan-sol-pipelines\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/pan-sol\/pan-sol-pipelines<\/a>). Overall gene synteny was visualized using GENESPACE (v.1.3.1)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 79\" title=\"Lovell, J. T. et al. GENESPACE tracks regions of interest and gene copy number variation across multiple genomes. 
eLife 11, e78526 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR79\" id=\"ref-link-section-d278053768e3257\" target=\"_blank\" rel=\"noopener\">79<\/a>.<\/p>\n<p>For functional annotation, ENTAP (v.0.10.8)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 80\" title=\"Hart, A. J. et al. EnTAP: bringing faster and smarter functional annotation to non-model eukaryotic transcriptomes. Mol. Ecol. Resour. 20, 591&#x2013;604 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR80\" id=\"ref-link-section-d278053768e3264\" target=\"_blank\" rel=\"noopener\">80<\/a> integrated data from diverse databases such as PLAZA dicots (v.5.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 81\" title=\"Van Bel, M. et al. PLAZA 5.0: extending the scope and power of comparative and functional genomics in plants. Nucleic Acids Res. 50, D1468&#x2013;D1474 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR81\" id=\"ref-link-section-d278053768e3268\" target=\"_blank\" rel=\"noopener\">81<\/a>, UniProt\/Swissprot<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 82\" title=\"Apweiler, R. et al. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 32, D115&#x2013;D119 (2004).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR82\" id=\"ref-link-section-d278053768e3272\" target=\"_blank\" rel=\"noopener\">82<\/a>, TREMBL, RefSeq, Solanaceae proteins and InterProScan5<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 83\" title=\"Jones, P. et al. InterProScan 5: genome-scale protein function classification. 
Bioinformatics 30, 1236&#x2013;1240 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR83\" id=\"ref-link-section-d278053768e3276\" target=\"_blank\" rel=\"noopener\">83<\/a> with Pfam, TIGRFAM, Gene Ontology and TRAPID<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 84\" title=\"Van Bel, M. et al. TRAPID: an efficient online tool for the functional and comparative analysis of de novo RNA-seq transcriptomes. Genome Biol. 14, R134 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR84\" id=\"ref-link-section-d278053768e3280\" target=\"_blank\" rel=\"noopener\">84<\/a> annotations. Finally, the annotated data underwent a series of filtering steps, excluding proteins shorter than 20 amino acids, those exceeding 20 times the length of functional orthologues and transposable element genes, which were removed using the TEsorter<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 85\" title=\"Zhang, R.-G. et al. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hortic. Res. 9, uhac017 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR85\" id=\"ref-link-section-d278053768e3285\" target=\"_blank\" rel=\"noopener\">85<\/a> pipeline.<\/p>\n<p>We assessed the completeness of the gene models by assessing single-copy orthologues through BUSCO<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 86\" title=\"Manni, M., Berkeley, M. R., Seppey, M. &amp; Zdobnov, E. M. BUSCO: assessing genomic data quality and beyond. Curr. Protoc. 
1, e323 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR86\" id=\"ref-link-section-d278053768e3292\" target=\"_blank\" rel=\"noopener\">86<\/a> in protein mode, comparing them against the solanales_odb10 database (Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">2<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">3<\/a>). Moreover, we examined the presence or absence of a curated set of 150 candidate genes known to be relevant in plant development and QTL studies (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">7<\/a>).<\/p>\n<p>Transposable element annotation<\/p>\n<p>The S. lycopersicum chloroplast and mitochondrion sequences were collected from NCBI reference sequences <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_007898.3\" target=\"_blank\" rel=\"noopener\">NC_007898.3<\/a> and <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_035963.1\" target=\"_blank\" rel=\"noopener\">NC_035963.1<\/a>, respectively. 
Non-transposable-element repeat sequences, including 18S rDNA (OK073663.1), 5S rDNA (<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/X55697.1\" target=\"_blank\" rel=\"noopener\">X55697.1<\/a>), 5.8S rDNA (<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/X52265.1\" target=\"_blank\" rel=\"noopener\">X52265.1<\/a>), 25S rDNA (OK073662.1), DNA spacer (<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/AY366528.1\" target=\"_blank\" rel=\"noopener\">AY366528.1<\/a>), centromeric repeat (JA176199.1) and telomere sequences (TTTAGGG), were collected from the NCBI and further curated. Transposable element sequences curated in the SUN locus study<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 87\" title=\"Jiang, N., Gao, D., Xiao, H. &amp; van der Knaap, E. Genome organization of the tomato sun locus and characterization of the unusual retrotransposon Rider. Plant J. 60, 181&#x2013;193 (2009).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR87\" id=\"ref-link-section-d278053768e3352\" target=\"_blank\" rel=\"noopener\">87<\/a> as well as several other transposable element sequences from NCBI were also collected. These sequences were combined as the curated set of tomato repeats.<\/p>\n<p>De novo transposable element annotation was first performed on each genome using EDTA (v.2.1.5)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 88\" title=\"Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 
20, 275 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR88\" id=\"ref-link-section-d278053768e3359\" target=\"_blank\" rel=\"noopener\">88<\/a>, with coding sequences from the ITAG4.0 Eggplant V4 annotation<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 89\" title=\"Barchi, L. et al. Improved genome assembly and pan-genome provide key insights into eggplant domestication and breeding. Plant J. 107, 579&#x2013;596 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR89\" id=\"ref-link-section-d278053768e3363\" target=\"_blank\" rel=\"noopener\">89<\/a> provided (--cds) to purge gene-coding sequences from the transposable element annotation, and with the parameters --anno 1 --sensitive 1 for sensitive detection and annotation of repeat sequences. Curated tomato repeats were supplied to EDTA (--curatedlib) for de novo annotation. Transposable element annotations of individual genomes were then processed together by panEDTA<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 90\" title=\"Ou, S. et al. Differences in activity and stability drive transposable element variation in tropical and temperate maize. Genome Res. 34, 1140&#x2013;1153 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR90\" id=\"ref-link-section-d278053768e3367\" target=\"_blank\" rel=\"noopener\">90<\/a> to create a consistent pan-genome transposable element annotation. 
The summary of whole-genome repeat annotations was derived from .sum files generated by panEDTA (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">4<\/a>).<\/p>\n<p>Repeat assembly quality was evaluated using LAI (b3.2)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 91\" title=\"Ou, S., Chen, J. &amp; Jiang, N. Assessing genome assembly quality using the LTR assembly index (LAI). Nucleic Acids Res. 46, e126 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR91\" id=\"ref-link-section-d278053768e3377\" target=\"_blank\" rel=\"noopener\">91<\/a> with inputs generated by EDTA and the parameters -t 48 -unlock. LAI values of the S. aethiopicum genomes were standardized against the HiFi-based reference assembly, with the parameters -iden 95.71 -totLTR 49.22 -genome_size 1102623763 -t 48 -unlock.<\/p>\n<p>Generation of CRISPR\u2013Cas9-induced mutants<\/p>\n<p>CRISPR guide RNAs to target CLV3 and SCPL25 across Solanum species were designed using Geneious (listed in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">20<\/a>). The Golden Gate cloning approach was used to create multiplexed gRNA constructs. Plant regeneration and Agrobacterium tumefaciens-mediated transformation of S. prinophyllum were performed according to our previously published protocol<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 92\" title=\"Van Eck, J., Keen, P. &amp; Tjahjadi, M. in Transgenic Plants: Methods and Protocols (eds Kumar, S. et al.) 
225&#x2013;234 (Springer, 2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR92\" id=\"ref-link-section-d278053768e3414\" target=\"_blank\" rel=\"noopener\">92<\/a>. For S. cleistogamum plant regeneration, the medium was supplemented with 0.5\u2009mg\u2009l\u22121 zeatin instead of 2\u2009mg\u2009l\u22121 and, for the selection medium, 75\u2009mg\u2009l\u22121 kanamycin was used instead of 200\u2009mg\u2009l\u22121. For S. aethiopicum, the protocol was the same as for S. cleistogamum, except the fourth transfer of transformed plantlets was done onto medium supplemented with 50\u2009mg\u2009l\u22121 kanamycin. The seed germination time in culture can vary between species and batches of harvested seeds. Typically, S. prinophyllum germination took 8\u201310\u2009days, S. cleistogamum germinated in 6\u20138\u2009days and S. aethiopicum in 7\u201310\u2009days.<\/p>\n<p>Distribution maps and species status<\/p>\n<p>Species were categorized into wild, domesticated, locally important consumed or ornamental based on taxonomic literature and expert opinion<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 17\" title=\"Hilgenhof, R. et al. Morphological trait evolution in Solanum (Solanaceae): evolutionary lability of key taxonomic characters. Taxon 72, 811&#x2013;847 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR17\" id=\"ref-link-section-d278053768e3462\" target=\"_blank\" rel=\"noopener\">17<\/a> (PBI Solanum Project (2024), Solanaceae Source; <a href=\"http:\/\/www.solanaceaesource.org\/\" target=\"_blank\" rel=\"noopener\">http:\/\/www.solanaceaesource.org\/<\/a>). The distribution maps were generated using the open source osm-liberty package (<a href=\"https:\/\/github.com\/maputnik\/osm-liberty\" target=\"_blank\" rel=\"noopener\">http:\/\/github.com\/maputnik\/osm-liberty\/<\/a>). 
Native ranges were derived from the same taxonomic literature and approximate centroids of the ranges were used for the mapping. The map is from osm-liberty, designed for open source maps.<\/p>\n<p>Phylogenomic analyses<\/p>\n<p>Jaltomata sinuosa<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 93\" title=\"Wu, M., Kostyun, J. L. &amp; Moyle, L. C. Genome sequence of Jaltomata addresses rapid reproductive trait evolution and enhances comparative genomics in the hyper-diverse Solanaceae. Genome Biol. Evol. 11, 335&#x2013;349 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR93\" id=\"ref-link-section-d278053768e3492\" target=\"_blank\" rel=\"noopener\">93<\/a> was used as an outgroup for the Solanum pan-genome tree, whereas the closely related S. anguivi, S. insanum and S. melongena were used as an outgroup for the S. aethiopicum dataset. Orthofinder<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 77\" title=\"Emms, D. M. &amp; Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR77\" id=\"ref-link-section-d278053768e3512\" target=\"_blank\" rel=\"noopener\">77<\/a> was used to identify single-copy orthologues across all species. This resulted in 7,825 loci for the Solanum pan-genome dataset, and 19,769 loci for the S. aethiopicum dataset. To reduce the computing time, we randomly subsampled 5,000 loci for the S. aethiopicum dataset. 
This strategy was validated by topology, bootstrap support and gene tree concordance factors that are nearly identical to results obtained from a smaller 353-locus dataset described previously<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Gagnon, E. et al. Phylogenomic discordance suggests polytomies along the backbone of the large genus Solanum. Am. J. Bot. 109, 580&#x2013;601 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR35\" id=\"ref-link-section-d278053768e3525\" target=\"_blank\" rel=\"noopener\">35<\/a>. To reduce the effect of missing data and long-branch attraction, sequences shorter than 25% of the average length for each locus were eliminated as described previously<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Gagnon, E. et al. Phylogenomic discordance suggests polytomies along the backbone of the large genus Solanum. Am. J. Bot. 109, 580&#x2013;601 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR35\" id=\"ref-link-section-d278053768e3530\" target=\"_blank\" rel=\"noopener\">35<\/a>. MAFFT<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 94\" title=\"Katoh, K., Misawa, K., Kuma, K.-I. &amp; Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059&#x2013;3066 (2002).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR94\" id=\"ref-link-section-d278053768e3534\" target=\"_blank\" rel=\"noopener\">94<\/a> was used to align each locus individually. Only loci that had all species in the alignment were retained. trimAl was then used to remove alignment columns that had more than 75% gaps. 
IQ\u2010TREE2\u00a0(ref.\u00a0<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 95\" title=\"Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530&#x2013;1534 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR95\" id=\"ref-link-section-d278053768e3538\" target=\"_blank\" rel=\"noopener\">95<\/a>) was used to generate individual ML trees for each locus. The resulting phylogenies were used for coalescent analyses with ASTRAL\u2010III (v.5.7.3)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 96\" title=\"Zhang, C., Rabiee, M., Sayyari, E. &amp; Mirarab, S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinform. 19, 153 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR96\" id=\"ref-link-section-d278053768e3542\" target=\"_blank\" rel=\"noopener\">96<\/a>, where tree nodes with low support were collapsed using Newick Utilities<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 97\" title=\"Junier, T. &amp; Zdobnov, E. M. The Newick utilities: high-throughput phylogenetic tree processing in the UNIX shell. Bioinformatics 26, 1669&#x2013;1670 (2010).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR97\" id=\"ref-link-section-d278053768e3546\" target=\"_blank\" rel=\"noopener\">97<\/a>. Branch support was assessed using localPP support<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 98\" title=\"Sayyari, E. &amp; Mirarab, S. Fast coalescent-based computation of local branch support from quartet frequencies. Mol. Biol. Evol. 
33, 1654&#x2013;1668 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR98\" id=\"ref-link-section-d278053768e3550\" target=\"_blank\" rel=\"noopener\">98<\/a>, where PP values\u2009&gt;\u20090.95 were considered strong, 0.75 to 0.94 weak to moderate, and \u22640.74 as unsupported. Trees were visualized with R using the packages ggtree<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 99\" title=\"Yu, G., Smith, D. K., Zhu, H., Guan, Y. &amp; Lam, T. T.-Y. Ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28&#x2013;36 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR99\" id=\"ref-link-section-d278053768e3555\" target=\"_blank\" rel=\"noopener\">99<\/a> and treeio<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 100\" title=\"Wang, L.-G. et al. Treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Mol. Biol. Evol. 37, 599&#x2013;603 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR100\" id=\"ref-link-section-d278053768e3559\" target=\"_blank\" rel=\"noopener\">100<\/a>.<\/p>\n<p>The 22 Solanum species were distributed into two major clades, grade I and clade II, along an orthologue-based phylogenetic tree. The terms grade I and clade II are established clade names in Solanum, originating from reference phylogenetic publications<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Gagnon, E. et al. Phylogenomic discordance suggests polytomies along the backbone of the large genus Solanum. Am. J. Bot. 
109, 580&#x2013;601 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR35\" id=\"ref-link-section-d278053768e3572\" target=\"_blank\" rel=\"noopener\">35<\/a>. These were formerly referred to as clade I and clade II, but clade I was shown to consist of a set of lineages that do not form a monophyletic group. Thus, this group is now referred to as grade I to reflect its evolutionary origin.<\/p>\n<p>Gene expansion contraction analysis<\/p>\n<p>To analyse gene expansions and contractions, we processed the ultrametric species tree and gene family counts from OrthoFinder using CAFE5\u00a0(ref.\u00a0<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 101\" title=\"Mendes, F. K., Vanderpool, D., Fulton, B. &amp; Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516&#x2013;5518 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR101\" id=\"ref-link-section-d278053768e3585\" target=\"_blank\" rel=\"noopener\">101<\/a>). CAFE5 was run with the gamma model and parameter \u2018k\u2009=\u20093\u2019 to identify changes in gene family size along the species tree while accounting for rate variation among gene families.<\/p>\n<p>GO enrichment analysis<\/p>\n<p>Gene Ontology (GO) enrichment analysis was performed using the GOATOOLS package<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 102\" title=\"Klopfenstein, D. V. et al. GOATOOLS: a Python library for Gene Ontology analyses. Sci. Rep. 
8, 10872 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR102\" id=\"ref-link-section-d278053768e3597\" target=\"_blank\" rel=\"noopener\">102<\/a> to investigate the functional implications of genes associated with various duplication types including whole-genome (WGD), tandem (TD), proximal (PD), transposed (TSD) and dispersed (DSD) duplications. Genes were classified into these different duplication categories by DupGen_finder<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Qiao, X. et al. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 20, 38 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR38\" id=\"ref-link-section-d278053768e3601\" target=\"_blank\" rel=\"noopener\">38<\/a>. Moreover, we conducted GO enrichment on gene expansions (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">21<\/a>) and contractions (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">22<\/a>) identified across all lineages as reported by CAFE5, to examine functional trends related to these gene copy-number changes across the pangenome.<\/p>\n<p>Synteny analysis<\/p>\n<p>The genomic neighbourhood around CLV3 for selected species was manually inspected to detect and annotate intact and pseudogenized CLV3 copies using pairwise sequence comparison with Exonerate (<a href=\"http:\/\/www.ebi.ac.uk\/about\/vertebrate-genomics\/software\/exonerate\" target=\"_blank\" 
rel=\"noopener\">www.ebi.ac.uk\/about\/vertebrate-genomics\/software\/exonerate<\/a>). Synteny plots were generated from a reciprocal BLASTP table obtained by running Clinker (v.0.0.29, <a href=\"http:\/\/github.com\/gamcil\/clinker\" target=\"_blank\" rel=\"noopener\">github.com\/gamcil\/clinker<\/a>). Pseudomolecule visualizations were generated using a custom script (<a href=\"https:\/\/github.com\/pan-sol\/pan-sol-pipelines\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/pan-sol\/pan-sol-pipelines<\/a>). Transposable element and resistance gene annotations were overlaid as needed using custom scripts (<a href=\"https:\/\/github.com\/pan-sol\/pan-sol-pipelines\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/pan-sol\/pan-sol-pipelines<\/a>).<\/p>\n<p>Gene expression analysis<\/p>\n<p>Reads from each tissue sample were aligned to the corresponding species-specific genome using STAR (v.2.7.2b)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 67\" title=\"Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15&#x2013;21 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR67\" id=\"ref-link-section-d278053768e3662\" target=\"_blank\" rel=\"noopener\">67<\/a>, and only samples with more than 50% uniquely mapped reads were retained for subsequent analysis. For each species with two or more biological replicates per tissue, we calculated the Spearman correlation between tissue replicates, and removed samples with low correlation (0.75 or below). This yielded gene expression estimates for 240 samples across 22 species: 7 species had expression data exclusively from the apex tissue, whereas 15 species had expression data from two or more tissues. 
As expression diversification groups are defined based on the coexpression and expression fold change of paralogue pairs across two or more tissues, the analyses focused on these 15 species. Expression data were TPM-normalized and genes with zero expression across all of the samples were excluded from further analysis. To reveal the global relationships among samples, PCA was performed on the tissue-specific expression profiles of 5,146 singleton genes that were selected based on OrthoFinder results and shared across all 22 species. This confirmed the expectation that samples cluster largely by tissue type. Plotting was performed using ggplot2 (<a href=\"https:\/\/ggplot2.tidyverse.org\/\" target=\"_blank\" rel=\"noopener\">https:\/\/ggplot2.tidyverse.org\/<\/a>).<\/p>\n<p>Analysis of whether the total dosage of duplicate gene pairs is conserved across Solanum<\/p>\n<p>Survival of a gene after duplication depends on the competition between preservation to maintain partial or total dosage and mutational degradation that leaves one copy with reduced or no function. Consequently, the functional fates of duplicate genes are often characterized by the extent of selective pressure on total dosage. To assess the relative importance of dosage balance (copies evolving under strong purifying selection to maintain total dosage) and neutral drift (no selection on total dosage) in maintaining duplicate genes, we compared the total expression of paralogue pairs within each tissue for each pair of species. Note that the prickle tissue from S. prinophyllum is not included in this analysis as it is absent in the other 21 species.<\/p>\n<p>In each tissue, gene expression was averaged over the biological replicates for each species. For each pair of species with expression data in a shared tissue, orthogroups with exactly two copies in each species, both with non-zero average expression in the tissue, were retained for further analysis. 
For each tissue and species pair, we calculated the summed expression of paralogue pairs in each retained orthogroup, and observed that the total orthogroup-level expression was highly correlated across species, suggesting a prominent role of dosage balance in shaping the expression evolution of paralogues. We computed the ratio of the orthogroup-level expression between the species pair and transformed these ratios into z scores. For each orthogroup in a species expressed in the tissue of interest, we averaged the P values from all pairwise species comparisons, adjusted the average P values using Benjamini\u2013Hochberg correction and classified orthogroups with an adjusted average P value below the significance threshold as evolving under drift.<\/p>\n<p>All other orthogroups were assumed to evolve under selective constraint on total dosage. Note that the high z-score threshold provides a conservative estimate of the number of paralogue pairs evolving under drift. Sequence evolution rates for paralogue pairs (Ka\/Ks) were calculated using KaKs_Calculator (v.2.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 103\" title=\"Wang, D., Zhang, Y., Zhang, Z., Zhu, J. &amp; Yu, J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genom. Proteom. Bioinform. 8, 77&#x2013;80 (2010).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR103\" id=\"ref-link-section-d278053768e3717\" target=\"_blank\" rel=\"noopener\">103<\/a>.<\/p>\n<p>Different modes of paralogue functional evolution<\/p>\n<p>For each of the 15 species in which expression data were collected for two or more tissues, the expression data were first subset to genes with greater-than-median expression in at least one sample. 
The coexpression network for each species was constructed by calculating the Pearson correlation between all pairs of genes, ranking the correlation coefficients for each gene (with NAs assigned the median rank) and then standardizing the network by the maximum ranked correlation coefficient. From OrthoFinder, we obtained 763,492 paralogue pairs across the 15 species, representing all combinations of gene pairs within orthogroups. Of these pairs, 71% had low or no expression, and another 15% were filtered out due to insufficient expression for reliable analysis. This left 14% of pairs for further classification, of which 8% (57% of the 14% available for further classification) fit into one of the four expression diversification groups described below, while the remaining 6% did not meet our thresholds. Coexpression for each pair of paralogues in each orthogroup was obtained from this rank-standardized network. For each paralogue pair with non-zero expression in two or more samples, we also computed the fold change in expression across samples and used the absolute values of mean and s.d. of log2-transformed fold change across samples to summarize the degree of expression divergence between the two copies.<\/p>\n<p>We classified the paralogue pairs within each species into different retention categories based on their variation in expression levels and correlated expression across samples. We selected these two axes of variation as they intuitively represent average expression difference (fold\u00a0change) and specific pattern of difference (coexpression) between gene pairs. We classified paralogue pairs into four broad groups as follows:<\/p>\n<ol class=\"u-list-style-none\">\n<li>\n                    (I)<\/p>\n<p>Dosage balanced: coexpression\u2009&gt;\u20090.9; mean log2[fold change]\u2009&lt;\u20091; s.d. of log2[fold change]\u2009&lt;\u20091.<\/p>\n<\/li>\n<li>\n                    (II)<\/p>\n<p>Paralogue dominance: coexpression\u2009&gt;\u20090.9; mean log2[fold change]\u2009\u2265\u00a01; s.d. 
of log2[fold change]\u2009&lt;\u20091.<\/p>\n<\/li>\n<li>\n                    (III)<\/p>\n<p>Specialized: coexpression\u2009&gt;\u20090.9; mean log2[fold change]\u2009\u2265\u20091; s.d. of log2[fold\u00a0change]\u2009\u2265\u20091.<\/p>\n<\/li>\n<li>\n                    (IV)<\/p>\n<p>Diverged: low coexpression; mean log2[fold change]\u2009\u2265\u20091; s.d. of log2[fold change]\u2009\u2265\u20091.<\/p>\n<\/li>\n<\/ol>\n<p>Paralogues originating from whole-genome, tandem and proximal duplications were obtained using the DupGen_finder pipeline<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Qiao, X. et al. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 20, 38 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR38\" id=\"ref-link-section-d278053768e3799\" target=\"_blank\" rel=\"noopener\">38<\/a>. WGD pairs with Ks ranging from 0.2 to 2.5, and tandem and proximal duplicates with Ks ranging from 0.05 to 2.5, were used to generate the stacked bar plots corresponding to WGDs and SSDs, respectively, in Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#Fig2\" target=\"_blank\" rel=\"noopener\">2i<\/a>.<\/p>\n<p>The gene family size for each classified paralogue pair within a species corresponds to the number of genes in its orthogroup. The expression breadth of a gene corresponds to the number of tissues (among apices, cotyledons, hypocotyls, inflorescences and leaves) in which the gene has an average expression greater than 3 TPM. The number of shared tissues expressing a paralogue pair is computed by intersecting the expression breadths of both copies, and ranges from 0 to 5. A gene was considered non-functional if it was annotated as a pseudogene or had an average expression below 3 TPM. 
Tissue-specific genes for each tissue were identified as genes with the highest expression in the tissue of interest, a tissue-specificity score<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 104\" title=\"Yanai, I. et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 21, 650&#x2013;659 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR104\" id=\"ref-link-section-d278053768e3817\" target=\"_blank\" rel=\"noopener\">104<\/a> greater than 0.7 and expression greater than 5 TPM in the relevant tissue. Both tissue specificity and pseudogene calling are sensitive to the breadth of tissue sampling, and the collection and incorporation of additional data into this framework would improve the comprehensiveness of the calling of modes of paralogue evolution.<\/p>\n<p>Mapping of loci controlling the S. aethiopicum locule number<\/p>\n<p>The high-locule-count parent and reference accession PI 424860, and low- and higher-locule-count parents 804750187 and 804750136, respectively, were selected as founding parents to map QTLs and their causative variants affecting fruit locule number. Resulting F1 progeny were selfed to generate F2 mapping populations, which were sown in the greenhouse and then transplanted to a field site at Lloyd Harbor, New York, USA, during the summer of 2022. Six F3 populations derived from genotyped (see below) F2 individuals were sown and transplanted at the same location during the summer of 2024. Approximately ten fruits were collected from each F2 individual, and locules were counted after slicing each fruit transversely to expose them. 
In the F2 populations derived from 804750187 \u00d7 PI 424860 and 804750136 \u00d7 PI 424860, 144 and 135 individuals were phenotyped, respectively (Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">13<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">14<\/a>). For each population, DNA from 30 random individuals at the low and high ends of the phenotypic distribution for locule number was pooled for bulk-segregant QTL-seq analysis. The DNA from eight individuals of the common parental accession PI 424860 was also pooled to capture parental polymorphisms.<\/p>\n<p>DNA from 15 of the most extreme low- and high-locule-count individuals was extracted from young leaf tissue using the DNeasy Plant Pro Kit (Qiagen) according to the manufacturer\u2019s instructions for high-polysaccharide-content plant tissue. Tissue used for extraction was ground using a SPEX SamplePrep 2010 Geno\/Grinder (Cole-Parmer) for 2\u2009min at 1,440\u2009rpm. The sample DNA (1\u2009\u00b5l assay volume) concentrations were assayed using Qubit 1\u00d7 dsDNA HS buffer (Thermo Fisher Scientific) on the Qubit 4 fluorometer (Thermo Fisher Scientific) according to the manufacturer\u2019s instructions. Separate pools were made for the parents, the bulked high-locule-count F2 individuals and the bulked low-locule-count F2 individuals, with an equivalent mass of DNA pooled from each individual to yield a final pooled mass of 3\u2009\u00b5g in each bulk. 
DNA pools were purified using 1.8\u00d7 volume of AMPure XP beads (Beckman Coulter) and the DNA concentration and purity were assayed using Qubit and the NanoDrop One spectrophotometer (Thermo Fisher Scientific), respectively.<\/p>\n<p>Paired-end sequencing libraries for QTL-seq analysis were prepared with &gt;1\u2009\u00b5g of DNA using the KAPA HyperPrep PCR-free kit (Roche) according to the manufacturer\u2019s instructions. Indexed libraries were pooled for sequencing on a NextSeq 2000 P3 chip (Illumina). Mapping was performed using the end-to-end pipeline implemented in the QTL-seq software package<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 105\" title=\"Takagi, H. et al. QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations. Plant J. 74, 174&#x2013;183 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR105\" id=\"ref-link-section-d278053768e3863\" target=\"_blank\" rel=\"noopener\">105<\/a> (v.2.2.4, <a href=\"https:\/\/github.com\/YuSugihara\/QTL-seq\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/YuSugihara\/QTL-seq<\/a>) with reads aligned against the S. aethiopicum (Saet3, PI 424860) genome assembly.<\/p>\n<p>To determine the effects of the two identified QTL on locule number in the populations derived from 804750136 \u00d7 PI 424860, co-segregation analysis was performed on the full F2 populations by genotyping SaetCLV3 and the minor-effect locus on chromosome 5. For SaetCLV3, a cleaved amplified polymorphic sequence (CAPS) assay was used to genotype a variant in the promoter region of SaetCLV3 linked to the identified CLV3 SV haplotypes. 
A 1,258\u2009bp region surrounding an AseI restriction fragment length polymorphism in the SaetCLV3 promoter was amplified using the KOD One PCR Master Mix (Toyobo) on template DNA extracted using the cetyltrimethylammonium bromide method<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 106\" title=\"Doyle, J. J. &amp; Doyle, J. L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 19, 11&#x2013;15 (1987).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR106\" id=\"ref-link-section-d278053768e3898\" target=\"_blank\" rel=\"noopener\">106<\/a> (primers 5431 and 4681 are shown in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">20<\/a>). For digestion, 5\u2009\u00b5l of the resulting PCR product was incubated in a 10\u2009\u00b5l reaction containing 0.2\u2009\u00b5l AseI (New England Biolabs) and 1\u2009\u00b5l CutSmart r3.1 buffer (New England Biolabs) for 2\u2009h at 37\u2009\u00b0C. The reactions were then loaded onto a 1% agarose gel and electrophoresed in an Owl D3-14 electrophoresis box (Thermo Fisher Scientific) containing 1\u00d7 TBE buffer for 30\u2009min at 180\u2009V delivered from an Owl EC 300 XL power supply (Thermo Fisher Scientific). The electrophoresis results were visualized under UV light using the Bio-Rad ChemiDoc XRS+ (Bio-Rad) imaging platform and ImageLab (Bio-Rad) software. The resulting banding patterns were then used to assign genotypes. 
For the chromosome 5 QTL, primers 5883 and 5884 (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">20<\/a>) were used to amplify a 425\u2009bp region containing a 1\u2009bp deletion near the summit of the QTL peak using the KOD One PCR Master Mix. The resulting PCR products were purified using a 1.8\u00d7 volume of AMPure beads and were used as a template for Sanger sequencing (Azenta Genewiz). The sequencing results were then used to assign genotype calls at the chromosome 5 locus. Presented data are from individuals that were successfully genotyped at both loci.<\/p>\n<p>Conservatory analysis<\/p>\n<p>The Conservatory algorithm (v.2.0)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 107\" title=\"Hendelman, A. et al. Conserved pleiotropy of an ancient plant homeobox gene uncovered by cis-regulatory dissection. Cell 184, 1724&#x2013;1739 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR107\" id=\"ref-link-section-d278053768e3916\" target=\"_blank\" rel=\"noopener\">107<\/a> was used to identify conserved non-coding sequences (CNSs) within the Solanaceae family (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM1\" target=\"_blank\" rel=\"noopener\">4b<\/a>) (<a href=\"https:\/\/conservatorycns.com\/dist\/pages\/conservatory\/about.php\" target=\"_blank\" rel=\"noopener\">https:\/\/conservatorycns.com\/dist\/pages\/conservatory\/about.php<\/a>). A total of 26 genomes, including 23 Solanum genomes, two tomato genomes (Heinz and M82) and one groundcherry (P. 
grisea), were used as references to enable the identification of CNSs irrespective of structural variations among references. Protein similarity was scored using Bitscore<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 108\" title=\"Altschul, S. F., Gish, W., Miller, W., Myers, E. W. &amp; Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403&#x2013;410 (1990).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR108\" id=\"ref-link-section-d278053768e3937\" target=\"_blank\" rel=\"noopener\">108<\/a>, while cis-regulatory similarity was assessed using LastZ<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 109\" title=\"Harris, R. S. Improved Pairwise Alignment of Genomic DNA (Pennsylvania State Univ., 2007).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR109\" id=\"ref-link-section-d278053768e3944\" target=\"_blank\" rel=\"noopener\">109<\/a> score. Homologous gene pairs were required to share at least one CNS. For orthogroup calling, all orthologous genes shared at least one CNS with the reference gene. Gene pairs with a conservation score exceeding 90% of the highest score were classified as paralogues (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM1\" target=\"_blank\" rel=\"noopener\">4b<\/a>). A total of 844,525 paralogues was identified across the Solanum pan-genome. Sequence evolution pressure rates (Ka\/Ks) for paralogue pairs were calculated using the R seqinR package (v.4.2-36)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 110\" title=\"Charif, D. &amp; Lobry, J. R. 
in Structural Approaches to Sequence Evolution: Molecules, Networks, Populations (eds Bastolla, U. et al.) 207&#x2013;232 (Springer, 2007).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR110\" id=\"ref-link-section-d278053768e3965\" target=\"_blank\" rel=\"noopener\">110<\/a>. Gene duplication events were classified using DupGen_finder<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Qiao, X. et al. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 20, 38 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#ref-CR38\" id=\"ref-link-section-d278053768e3969\" target=\"_blank\" rel=\"noopener\">38<\/a>, identifying whole-genome and transposed duplications for gene pairs recognized by both the Conservatory and DupGen_finder tools. Tandem and proximal duplications were defined based on gene positioning: adjacent genes were considered to be tandem duplications, and genes up to ten genes apart were defined as proximal duplications. All other duplicated gene pairs were categorized as dispersed duplications (Supplementary Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM1\" target=\"_blank\" rel=\"noopener\">4c<\/a>). Of the identified paralogues, 23,730 were associated with expression groups and were used to compare relationships between sequence evolution pressure rates and protein and cis-regulatory divergence across different expression groups. 
Homologues, orthogroups and paragroups were identified, and the relationships between protein and cis-regulatory elements were visualized using custom scripts, which are available at GitHub (<a href=\"https:\/\/github.com\/pan-sol\/pan-sol-pipelines\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/pan-sol\/pan-sol-pipelines<\/a>). See Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">5<\/a> for statistical analysis.<\/p>\n<p>Statistical analysis<\/p>\n<p>All statistical tests were performed in R. For the quantitative analysis of fruit locule numbers in Figs. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#Fig3\" target=\"_blank\" rel=\"noopener\">3f<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#Fig5\" target=\"_blank\" rel=\"noopener\">5c,d<\/a> and Extended Data Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#Fig10\" target=\"_blank\" rel=\"noopener\">5b,c<\/a>, n represents the number of fruits quantified. Pairwise comparisons were conducted using Dunnett\u2019s T3 test (R package PMCMRplus v.1.9.10) for multiple comparisons with unequal variances, with the default parameters. 
Statistical tests and the resulting P values are presented in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">5<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">9<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">15<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">17<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM3\" target=\"_blank\" rel=\"noopener\">19<\/a>.<\/p>\n<p>Reporting summary<\/p>\n<p>Further information on research design is available in the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41586-025-08619-6#MOESM2\" target=\"_blank\" rel=\"noopener\">Nature Portfolio Reporting Summary<\/a> linked to this article.<\/p>\n","protected":false},"excerpt":{"rendered":"Plant material, phenotypic analyses and imaging Details on all plant material used in this study, including the 
passport&hellip;\n","protected":false},"author":2,"featured_media":18664,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3846],"tags":[12369,12370,267,12371,3965,3966,12372,70,16,15],"class_list":{"0":"post-18663","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-genetics","8":"tag-agricultural-genetics","9":"tag-evolutionary-genetics","10":"tag-genetics","11":"tag-genome-informatics","12":"tag-humanities-and-social-sciences","13":"tag-multidisciplinary","14":"tag-plant-domestication","15":"tag-science","16":"tag-uk","17":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114335190496976480","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/18663","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=18663"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/18663\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/18664"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=18663"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=18663"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=18663"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}