The increasing demand for computational power to train large language models threatens to widen the technological gap between wealthy nations and the Global South. Sandra Malagon, Monica A. Ulloa Ruiz, Tatiana Elizabeth Sandoval Plaza, and colleagues from Carreras con Impacto investigate the practical and financial viability of training such models within Brazil and Mexico, considering limitations in hardware, energy, and funding. Their research assesses the compute needs, energy use, costs, and regulatory considerations for training a substantial 10-trillion-token model, varying both the type of accelerator used and the training timeframe. The team demonstrates that while all configurations remain within manageable infrastructure limits, fiscal sustainability hinges on hardware efficiency: newer accelerators significantly reduce costs, and extending training timelines offers a viable policy option for countries seeking to build locally relevant AI capabilities without competing at the very forefront of global innovation.

Sovereign Language Models, Infrastructure and Compute Budgets

This study pioneers a methodology for assessing the feasibility of sovereign-scale language model training in Brazil and Mexico, addressing the computational disparities between high-capacity and resource-constrained nations. Researchers established a fixed computational budget aligned with the DeepSeek-V3 model, a substantial 671-billion-parameter model trained on 14.8 trillion tokens, to define a strategically sufficient model capable of supporting regional deployments without competing at the global AI frontier. This scale was deliberately chosen to reflect a credible, yet not cutting-edge, model capable of supporting applications in science, government, and education.
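As a rough illustration of how such a fixed budget can be derived, the sketch below applies the standard C ≈ 6·N·D approximation. For a mixture-of-experts model like DeepSeek-V3, N is the roughly 37 billion parameters activated per token rather than the full 671 billion; the exact budget the authors used is not stated in this summary, so the figure below is an estimate, not the paper's number.

```python
# Back-of-envelope training budget via the standard C ~ 6 * N * D rule.
# For a mixture-of-experts model like DeepSeek-V3, N is the number of
# parameters activated per token (roughly 37B of the 671B total),
# not the full parameter count.
ACTIVATED_PARAMS = 37e9   # DeepSeek-V3 activated parameters per token
TOKENS = 14.8e12          # training tokens, matching the reference model

flop_budget = 6 * ACTIVATED_PARAMS * TOKENS
print(f"Approximate training budget: {flop_budget:.2e} FLOPs")  # ~3.3e24
```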

The study then evaluated four infrastructure configurations, combining newer and older generation accelerators with training schedules of 90 and 150 days, to capture trade-offs between hardware efficiency, energy consumption, and deployment feasibility. Newer generation accelerators were assigned a peak throughput of 2,000 TFLOPS using advanced precision techniques, while older generation accelerators were assigned 312 TFLOPS under standard precision. A factor accounting for real-world training inefficiencies was applied to all scenarios, based on empirical estimates from prior work. The total number of accelerators required was then calculated, ranging from approximately 350 of the newer generation (in the 150-day scenario) to over 2,200 of the older generation (in the 90-day scenario), demonstrating the impact of both hardware efficiency and extended training schedules on infrastructure needs.
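A minimal sketch of this accelerator-count arithmetic follows; the utilization factor here is an assumed placeholder, since the summary does not report the paper's empirical value.

```python
import math

def accelerators_needed(flop_budget, peak_tflops, utilization, days):
    """Minimum accelerator count to complete `flop_budget` FLOPs in `days`,
    given per-device peak throughput (TFLOPS) and a real-world utilization
    factor that discounts peak performance."""
    sustained = peak_tflops * 1e12 * utilization  # sustained FLOP/s per device
    return math.ceil(flop_budget / (sustained * days * 24 * 3600))

BUDGET = 3.3e24  # illustrative budget from the 6*N*D estimate above
MFU = 0.30       # assumed utilization; the paper's empirical factor may differ

print(accelerators_needed(BUDGET, 2000, MFU, 150))  # newer generation, 150 days
print(accelerators_needed(BUDGET, 312, MFU, 90))    # older generation, 90 days
```

Under these assumed values the counts come out in the same ballpark as the paper's figures; matching them exactly would require the authors' actual budget and utilization factor.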

Researchers meticulously estimated resource requirements for each scenario, calculating energy consumption based on training duration, accelerator power draw, and total accelerator count, adjusted by a factor to reflect datacenter overhead. They estimated a power draw of 700 W per newer generation accelerator and 400 W per older generation accelerator, yielding energy requirements between 0.3 and 3.3 GWh depending on hardware type and training schedule. Peak electrical load was estimated by assuming full simultaneous accelerator usage, ranging from 0.41 MW (newer generation, 150 days) to 1.49 MW (older generation, 90 days), remaining within the capacity of typical medium-voltage distribution infrastructure. Capital expenditures were calculated using hardware prices and integration overhead, with country-specific import duties applied to Brazil. This detailed methodology allows for a nuanced assessment of the fiscal and logistical challenges of establishing sovereign AI capabilities in middle-income countries.
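These resource estimates reduce to a few simple formulas. A minimal sketch follows, with assumed coefficients (a PUE-style overhead factor of 1.3 and 15% integration overhead) standing in for values the summary does not report.

```python
def training_energy_gwh(n_acc, watts_per_acc, days, overhead=1.3):
    """Total site energy: device draw x count x duration, scaled by a
    PUE-style factor for datacenter overhead (assumed value)."""
    device_kwh = n_acc * (watts_per_acc / 1000) * days * 24
    return device_kwh * overhead / 1e6  # kWh -> GWh

def peak_load_mw(n_acc, watts_per_acc, overhead=1.3):
    """Peak electrical load assuming all accelerators draw full power
    simultaneously, including datacenter overhead."""
    return n_acc * watts_per_acc * overhead / 1e6

def capex_usd(n_acc, unit_price_usd, integration=0.15, import_duty=0.0):
    """Hardware capital expenditure with integration overhead; import_duty
    models country-specific tariffs such as Brazil's."""
    return n_acc * unit_price_usd * (1 + integration) * (1 + import_duty)
```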

Sovereign AI Training Feasible for Brazil, Mexico

This research demonstrates that sovereign-scale language model training is technically and fiscally achievable for countries like Brazil and Mexico, even with constrained resources. By modelling various hardware and training duration scenarios, the study establishes that all configurations examined remain within existing export control limits and do not overwhelm typical electrical infrastructure. The primary determinant of financial viability, however, is hardware efficiency, with newer generation accelerators significantly reducing overall costs. Specifically, the team found that training a substantial model using newer generation hardware requires an investment of 8 to 14 million USD, while deployments relying on older generation processors range from 19 to 32 million USD due to increased energy consumption and hardware demands.
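Putting the pieces together gives a rough scenario comparison. The unit prices and electricity tariff below are illustrative assumptions, not figures from the paper; under them, the totals happen to fall inside the reported cost ranges.

```python
ELECTRICITY_USD_PER_KWH = 0.10  # assumed tariff, not from the paper

scenarios = [
    # label, accelerator count, watts each, assumed unit price (USD), days
    ("newer generation, 150 days", 350, 700, 30_000, 150),
    ("older generation, 90 days", 2_200, 400, 8_000, 90),
]
for label, n, watts, price, days in scenarios:
    energy_kwh = n * (watts / 1000) * days * 24 * 1.3  # 1.3 = overhead factor
    capex = n * price * 1.15                           # 15% integration overhead
    total = capex + energy_kwh * ELECTRICITY_USD_PER_KWH
    print(f"{label}: {energy_kwh / 1e6:.2f} GWh, ~USD {total / 1e6:.1f}M")
# Under these assumptions: ~1.15 GWh and ~USD 12.2M for the newer scenario,
# ~2.47 GWh and ~USD 20.5M for the older one, inside the reported ranges.
```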

Extending training timelines emerges as a viable policy lever to mitigate hardware constraints and enable the creation of locally aligned language models without necessarily competing at the global forefront of AI development. The authors acknowledge that while all modelled scenarios are formally feasible, practical deployment in urban environments may require additional permitting and infrastructure upgrades for configurations approaching higher power thresholds. Future work could explore the implications of different model sizes and data requirements, as well as the potential for distributed training approaches to further reduce costs and improve accessibility.

👉 More information
🗞 The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico
🧠 ArXiv: https://arxiv.org/abs/2510.19801