{"id":352604,"date":"2025-08-17T21:32:12","date_gmt":"2025-08-17T21:32:12","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/352604\/"},"modified":"2025-08-17T21:32:12","modified_gmt":"2025-08-17T21:32:12","slug":"ranking-the-chinese-open-model-builders","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/352604\/","title":{"rendered":"Ranking the Chinese Open Model Builders"},"content":{"rendered":"<p>The Chinese AI ecosystem has taken the AI world by storm this summer with an unrelenting pace of stellar open model releases. The flagship releases that got the most Western media coverage are the likes of <a href=\"https:\/\/www.interconnects.ai\/p\/qwen-3-the-new-open-standard?utm_source=publication-search\" rel=\"noopener\" target=\"_blank\">Qwen 3<\/a>, <a href=\"https:\/\/www.interconnects.ai\/p\/kimi-k2-and-when-deepseek-moments?utm_source=publication-search\" rel=\"noopener\" target=\"_blank\">Kimi K2<\/a>, or <a href=\"https:\/\/arxiv.org\/abs\/2508.06471\" rel=\"noopener\" target=\"_blank\">Zhipu GLM 4.5<\/a>, but there is a long tail of providers close behind in both quality and cadence of releases.<\/p>\n<p>In this post we rank the top 19 Chinese labs by the <strong>quality and quantity of contributions to the open AI ecosystem <\/strong>\u2014 this is a ranking of outputs, not raw ability \u2014 all the way from DeepSeek at the top down to the emerging open research labs. For more detailed coverage of all the specific models, we recommend studying our <a href=\"https:\/\/www.interconnects.ai\/t\/artifacts-log\" rel=\"noopener\" target=\"_blank\">Artifacts Log<\/a> series, which chronicles all of the major open model releases every month. 
We plan to revisit this ranking and make note of major new players, so make sure to subscribe.<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/substackcdn.com\/image\/fetch\/$s_!y86L!,f_auto,q_auto:good,fl_progressive:steep\/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aff76bf-bd04-46b7-89e0-ddbb0e198aa7_1326x812.png\" data-component-name=\"Image2ToDOM\" rel=\"noopener\" class=\"image-link image2 is-viewable-img\"><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/08\/https:\/\/substack-post-media.s3.amazonaws.com\/public\/images\/2aff76bf-bd04-46b7-89e0-ddbb0e198aa7_1326.jpeg\" width=\"1326\" height=\"812\" data-attrs=\"{&quot;src&quot;:&quot;https:\/\/substack-post-media.s3.amazonaws.com\/public\/images\/2aff76bf-bd04-46b7-89e0-ddbb0e198aa7_1326x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1326,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:433312,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image\/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https:\/\/www.interconnects.ai\/i\/171165224?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aff76bf-bd04-46b7-89e0-ddbb0e198aa7_1326x812.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}\" alt=\"\"   fetchpriority=\"high\" class=\"sizing-normal\"\/><\/a><\/p>\n<p>These companies rival Western counterparts with the quality and frequency of their models.<\/p>\n<p><a href=\"https:\/\/www.deepseek.com\/\" rel=\"noopener\" target=\"_blank\">deepseek.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/deepseek-ai\" rel=\"noopener\" target=\"_blank\">deepseek-ai<\/a> | X <a href=\"https:\/\/x.com\/DeepSeek_AI\" rel=\"\">@DeepSeek_AI<\/a><\/p>\n<p>DeepSeek needs little introduction. 
Their <a href=\"https:\/\/www.interconnects.ai\/p\/deepseek-v3-and-the-actual-cost-of\" rel=\"noopener\" target=\"_blank\">V3<\/a> and <a href=\"https:\/\/www.interconnects.ai\/p\/deepseek-r1-recipe-for-o1\" rel=\"noopener\" target=\"_blank\">R1<\/a> models, and their impact, are still likely the biggest AI stories of 2025 \u2014 open, Chinese models at the frontier of performance with permissive licenses and the exposed model chains of thought that enamored users around the world.<\/p>\n<p>With all the attention following the breakthrough releases, a bit more has been said about DeepSeek in terms of operations, <a href=\"https:\/\/www.chinatalk.media\/p\/deepseek-ceo-interview-with-chinas\" rel=\"noopener\" target=\"_blank\">ideology<\/a>, and <a href=\"https:\/\/semianalysis.com\/2025\/07\/03\/deepseek-debrief-128-days-later\/#a-boom-and-bust\" rel=\"noopener\" target=\"_blank\">business model<\/a> relative to the other labs. They are technically very innovative and have not devoted extensive resources to their consumer chatbot or API hosting (as judged by performance degradation above industry standards).<\/p>\n<p>Over the last 18 months, DeepSeek was known for making \u201cabout one major release a month.\u201d Since the updated releases of V3-0324 and R1-0528, many close observers have been surprised by their lack of contributions. This has let other players in the ecosystem close the gap, but in terms of impact and actual commercial usage, DeepSeek is still king.<\/p>\n<p>An important aspect of DeepSeek\u2019s strategy is their focus on improving their core models at the frontier of performance. To complement this, they run experiments that use the current model generation to make fundamental research innovations, such as <a href=\"https:\/\/arxiv.org\/abs\/2504.21801\" rel=\"noopener\" target=\"_blank\">theorem proving<\/a> or math models, which ultimately get used for the next iteration of models. This is similar to how Western labs operate. 
First, you test a new idea as an experiment internally, then you fold it into the \u201cmain product\u201d that most of your users see.<\/p>\n<p><a href=\"https:\/\/arxiv.org\/abs\/2402.03300\" rel=\"noopener\" target=\"_blank\">DeepSeekMath<\/a>, for example, used DeepSeek-Coder-Base-v1.5 7B and introduced the now famous reinforcement learning algorithm <a href=\"https:\/\/rlhfbook.com\/c\/11-policy-gradients.html#group-relative-policy-optimization\" rel=\"noopener\" target=\"_blank\">Group Relative Policy Optimization<\/a> (GRPO), which is one of the main drivers of R1. The exception to this (at least today) is <a href=\"https:\/\/arxiv.org\/abs\/2410.13848\" rel=\"noopener\" target=\"_blank\">Janus<\/a>, their omni-modal series, which has not been used in their main line.<\/p>\n<p><a href=\"https:\/\/qwenlm.ai\/\" rel=\"noopener\" target=\"_blank\">qwenlm.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/Qwen\" rel=\"noopener\" target=\"_blank\">Qwen<\/a> | X <a href=\"https:\/\/x.com\/Alibaba_Qwen\/highlights\" rel=\"\">@Alibaba_Qwen<\/a><\/p>\n<p>Tongyi Qianwen, the primary AI lab within Alibaba\u2019s cloud division, is far and away best known for their open language model series. They have been releasing many models across a range of sizes (quite similar to Llama 1 through 3) for years. 
Recently, their models from Qwen 2.5 and Qwen 3 have seen <a href=\"https:\/\/www.atomproject.ai\/\" rel=\"noopener\" target=\"_blank\">accelerating market share among AI research and startup development<\/a>.<\/p>\n<p>Qwen is closer to American Big Tech companies than to other Chinese AI labs in terms of releases: they cover the entire stack, from <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen2.5-VL-7B-Instruct\" rel=\"noopener\" target=\"_blank\">VLMs<\/a> to <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3-Embedding-8B\" rel=\"noopener\" target=\"_blank\">embedding models<\/a>, <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3-Coder-480B-A35B-Instruct\" rel=\"noopener\" target=\"_blank\">coding models<\/a>, <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen-Image\" rel=\"noopener\" target=\"_blank\">image<\/a> and <a href=\"https:\/\/huggingface.co\/Wan-AI\/Wan2.2-TI2V-5B\" rel=\"noopener\" target=\"_blank\">video generation<\/a>, and so on.<\/p>\n<p>They also cater to all possible customers (or rather every part of the open community) by releasing capable models of all sizes. 
Small dense models are important for academia to run experiments and for small\/medium businesses to power their applications, so it comes as no surprise that Qwen-based models are exploding in popularity.<\/p>\n<p>On top of model releases for everyone, they have also focused on supporting the (Western) community, releasing <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3-8B-MLX-8bit\" rel=\"noopener\" target=\"_blank\">MLX<\/a> and <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3-8B-GGUF\" rel=\"noopener\" target=\"_blank\">GGUF<\/a> versions of their models for local usage, as well as a <a href=\"https:\/\/github.com\/QwenLM\/qwen-code\" rel=\"noopener\" target=\"_blank\">CLI<\/a> for their coding models, which includes a generous amount of free requests.<\/p>\n<p>Unlike some American companies, the core team seems to have stayed relatively small in terms of headcount, in line with other Chinese AI labs: <a href=\"https:\/\/arxiv.org\/abs\/2505.09388\" rel=\"noopener\" target=\"_blank\">Qwen3<\/a> has 177 contributors, whereas Llama 3 lists three times as many and Gemini 2.5 credits over 3,000 people. 
<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/substackcdn.com\/image\/fetch\/$s_!QxA6!,f_auto,q_auto:good,fl_progressive:steep\/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88e4d2b4-e5e5-49b0-855c-c12e4027986d_1152x666.png\" data-component-name=\"Image2ToDOM\" rel=\"noopener\" class=\"image-link image2 is-viewable-img\"><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/08\/https:\/\/substack-post-media.s3.amazonaws.com\/public\/images\/88e4d2b4-e5e5-49b0-855c-c12e4027986d_1152.png\" width=\"1152\" height=\"666\" data-attrs=\"{&quot;src&quot;:&quot;https:\/\/substack-post-media.s3.amazonaws.com\/public\/images\/88e4d2b4-e5e5-49b0-855c-c12e4027986d_1152x666.png&quot;,&quot;srcNoWatermark&quot;:&quot;https:\/\/substack-post-media.s3.amazonaws.com\/public\/images\/ca9f294b-7af4-40cc-b9d4-bc3fde7848e0_1152x666.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:666,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37876,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image\/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https:\/\/www.interconnects.ai\/i\/171165224?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca9f294b-7af4-40cc-b9d4-bc3fde7848e0_1152x666.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}\" alt=\"\"   loading=\"lazy\" class=\"sizing-normal\"\/><\/a><\/p>\n<p>These companies have recently arrived at the frontier of performance and we will see if they have the capability to consistently release great models at a pace matching Qwen or DeepSeek.<\/p>\n<p><a href=\"https:\/\/moonshot.cn\/en\" rel=\"noopener\" target=\"_blank\">moonshot.cn<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/moonshotai\" rel=\"noopener\" target=\"_blank\">moonshotai<\/a> | X <a 
href=\"https:\/\/x.com\/Kimi_Moonshot\/highlights\" rel=\"\">@Kimi_Moonshot<\/a><\/p>\n<p>Moonshot AI is one of the so-called \u201cAI tigers\u201d, a group of prominent Chinese AI startups so named by Chinese media and investors. This group consists of Baichuan, Zhipu AI, Moonshot AI, MiniMax, StepFun, and 01.AI \u2014 most of which have attracted investments from tech funds and grants. For example, Alibaba is seen as a big winner in the AI space by having their own models and by <a href=\"https:\/\/www.bloomberg.com\/news\/articles\/2024-02-27\/alibaba-leads-record-deal-to-create-2-5-billion-china-ai-player\" rel=\"noopener\" target=\"_blank\">being a lead investor in Moonshot<\/a>, sort of like how big tech companies in the U.S. are investing in fundraising rounds for newer AI labs.<\/p>\n<p>While their first models, K1 and K1.5, were closed and available <a href=\"https:\/\/platform.moonshot.ai\/docs\/guide\/choose-an-appropriate-kimi-model\" rel=\"noopener\" target=\"_blank\">on their API<\/a>, they started releasing open models after the R1 release with <a href=\"https:\/\/huggingface.co\/moonshotai\/Moonlight-16B-A3B\" rel=\"noopener\" target=\"_blank\">experimental models<\/a> using the Muon optimizer. Similar to DeepSeek, they focus on a single model line, with small experiments eventually feeding back into the main model. K2 is their \u201cmoonshot run,\u201d a.k.a. 
<a href=\"https:\/\/x.com\/_jasonwei\/status\/1757486124082303073?lang=en\" rel=\"\">yolo run<\/a>, and quickly became a hit similar to R1 (see <a href=\"https:\/\/www.interconnects.ai\/p\/kimi-k2-and-when-deepseek-moments\" rel=\"noopener\" target=\"_blank\">our report<\/a> from the release).<\/p>\n<p><a href=\"https:\/\/www.chinatalk.media\/p\/kimi-k2-the-open-source-way\" rel=\"noopener\" target=\"_blank\">Further<\/a> <a href=\"https:\/\/www.chinatalk.media\/p\/moonshot-ais-agi-vision\" rel=\"noopener\" target=\"_blank\">reading<\/a> on Kimi can be found on ChinaTalk.<\/p>\n<p><a href=\"https:\/\/z.ai\/\" rel=\"\">z.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/zai-org\" rel=\"noopener\" target=\"_blank\">zai-org<\/a> | X <a href=\"https:\/\/x.com\/Zai_org\/highlights\" rel=\"\">@Zai_org<\/a><\/p>\n<p>Zhipu, known in the West as Z.ai, is a startup spinoff of Tsinghua University with <a href=\"https:\/\/www.scmp.com\/tech\/big-tech\/article\/3321314\/unicorn-zai-adapts-models-huawei-chips-drive-broaden-chinas-ai-ecosystem?utm_source=chatgpt.com\" rel=\"noopener\" target=\"_blank\">considerable investments<\/a> by Chinese companies and VCs. Currently, they are <a href=\"https:\/\/www.reuters.com\/world\/china\/zhipu-ai-ramps-up-overseas-expansion-strategy-ahead-ipo-2025-04-23\/\" rel=\"noopener\" target=\"_blank\">even considering an IPO<\/a>, which would make them the first AI tiger to do so.<\/p>\n<p>In terms of models, they are mostly known for their recent release of <a href=\"https:\/\/huggingface.co\/zai-org\/GLM-4.5\" rel=\"noopener\" target=\"_blank\">GLM-4.5<\/a> and <a href=\"https:\/\/huggingface.co\/zai-org\/GLM-4.5V\" rel=\"noopener\" target=\"_blank\">GLM-4.5V<\/a>, both of which are very capable for their sizes (and both fairly large mixture-of-experts models). 
However, they are not just releasing LLMs, but also <a href=\"https:\/\/huggingface.co\/zai-org\/CogView4-6B\" rel=\"noopener\" target=\"_blank\">image<\/a> and <a href=\"https:\/\/huggingface.co\/zai-org\/CogVideoX1.5-5B\" rel=\"noopener\" target=\"_blank\">video generation<\/a> models, setting them apart from pure-LLM companies and labs.<\/p>\n<p data-attrs=\"{&quot;url&quot;:&quot;https:\/\/www.interconnects.ai\/p\/chinas-top-19-open-model-labs?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}\" data-component-name=\"ButtonCreateButton\" class=\"button-wrapper\"><a href=\"https:\/\/www.interconnects.ai\/p\/chinas-top-19-open-model-labs?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share\" rel=\"noopener\" class=\"button primary\" target=\"_blank\">Share<\/a><\/p>\n<p>These companies are transitioning to open releases, have open models with inferior capabilities, or slightly different foci than the text-centric labs pushing the frontiers of intelligence.<\/p>\n<p><a href=\"https:\/\/stepfun.ai\/\" rel=\"noopener\" target=\"_blank\">stepfun.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/stepfun-ai\" rel=\"noopener\" target=\"_blank\">stepfun-ai<\/a> | X <a href=\"https:\/\/x.com\/StepFun_ai\" rel=\"\">@StepFun_ai<\/a><\/p>\n<p>StepFun first started as a closed model provider, but pivoted to open model releases after DeepSeek R1 shook up the industry. They are mostly focusing on multi-modal model releases, with <a href=\"https:\/\/huggingface.co\/stepfun-ai\/step3\" rel=\"noopener\" target=\"_blank\">Step3<\/a> being their flagship VLM. 
They also have <a href=\"https:\/\/huggingface.co\/stepfun-ai\/NextStep-1-Large\" rel=\"noopener\" target=\"_blank\">image<\/a>, <a href=\"https:\/\/huggingface.co\/stepfun-ai\/Step-Audio-AQAA\" rel=\"noopener\" target=\"_blank\">audio<\/a> and <a href=\"https:\/\/huggingface.co\/stepfun-ai\/stepvideo-ti2v\" rel=\"noopener\" target=\"_blank\">video generation models<\/a>.<\/p>\n<p><a href=\"https:\/\/hunyuan.tencent.com\/en\" rel=\"noopener\" target=\"_blank\">hunyuan.tencent.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/Tencent\" rel=\"noopener\" target=\"_blank\">Tencent<\/a> | X <a href=\"https:\/\/x.com\/TencentHunyuan\/highlights\" rel=\"\">@TencentHunyuan<\/a><\/p>\n<p>Hunyuan is mostly known for <a href=\"https:\/\/huggingface.co\/tencent\/HunyuanVideo\" rel=\"noopener\" target=\"_blank\">HunyuanVideo<\/a> and <a href=\"https:\/\/huggingface.co\/tencent\/Hunyuan3D-2.1\" rel=\"noopener\" target=\"_blank\">Hunyuan3D<\/a>. While they have released <a href=\"https:\/\/huggingface.co\/tencent\/Tencent-Hunyuan-Large\" rel=\"noopener\" target=\"_blank\">three<\/a> <a href=\"https:\/\/huggingface.co\/collections\/tencent\/hunyuan-a13b-685ec38e5b46321e3ea7c4be\" rel=\"noopener\" target=\"_blank\">series<\/a> of <a href=\"https:\/\/huggingface.co\/collections\/tencent\/hunyuan-dense-model-6890632cda26b19119c9c5e7\" rel=\"noopener\" target=\"_blank\">different<\/a> LLMs, their releases come with very strict licenses, which is unusual for Chinese companies and dampens excitement when combined with performance levels that can be found elsewhere.<\/p>\n<p><a href=\"https:\/\/www.xiaohongshu.com\/\" rel=\"noopener\" target=\"_blank\">xiaohongshu.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/rednote-hilab\" rel=\"noopener\" target=\"_blank\">rednote-hilab<\/a><\/p>\n<p>The Chinese version of Instagram, RedNote, recently joined the ranks of Chinese companies releasing open models. 
Their capable character recognition \/ <a href=\"https:\/\/huggingface.co\/rednote-hilab\/dots.ocr\" rel=\"noopener\" target=\"_blank\">OCR model<\/a> in particular surprised many (see <a href=\"https:\/\/www.interconnects.ai\/p\/latest-open-artifacts-13-the-abundance\" rel=\"noopener\" target=\"_blank\">our coverage<\/a>). As with Xiaomi and Baidu, it remains to be seen what their overall open strategy will be; so far, they have not competed in the large, frontier model space.<\/p>\n<p><a href=\"https:\/\/www.minimaxi.com\/\" rel=\"noopener\" target=\"_blank\">minimaxi.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/MiniMaxAI\" rel=\"noopener\" target=\"_blank\">MiniMaxAI<\/a> | X <a href=\"https:\/\/x.com\/MiniMax__AI\" rel=\"\">@MiniMax__AI<\/a><\/p>\n<p>MiniMax is another of the AI tigers and also started as a closed company. After the release of R1, they changed their strategy and released the weights of <a href=\"https:\/\/huggingface.co\/MiniMaxAI\/MiniMax-Text-01\" rel=\"noopener\" target=\"_blank\">Minimax-Text-01<\/a>, following up with <a href=\"https:\/\/huggingface.co\/MiniMaxAI\/MiniMax-M1-80k\" rel=\"noopener\" target=\"_blank\">reasoning models<\/a> building upon it. The unique selling point of these models is the 1M-token context window, achieved with hybrid attention.<\/p>\n<p>These text models are not the only thing they are focusing on \u2014 they also have <a href=\"https:\/\/hailuoai.video\" rel=\"noopener\" target=\"_blank\">image and video generation models<\/a>, but those remain closed and only available on their API. 
They are also promoting <a href=\"https:\/\/agent.minimax.io\" rel=\"noopener\" target=\"_blank\">their consumer platform<\/a> heavily as they <a href=\"https:\/\/www.reuters.com\/world\/asia-pacific\/chinese-ai-firm-minimax-files-confidentially-hong-kong-ipo-sources-say-2025-07-16\/\" rel=\"noopener\" target=\"_blank\">eye an IPO<\/a>.<\/p>\n<p><a href=\"https:\/\/internlm.intern-ai.org.cn\/\" rel=\"noopener\" target=\"_blank\">internlm.intern-ai.org.cn<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/InternLM\" rel=\"noopener\" target=\"_blank\">InternLM<\/a> | X <a href=\"https:\/\/x.com\/opengvlab\" rel=\"\">@opengvlab<\/a><\/p>\n<p>InternLM &amp; OpenGVLab have deep ties to the Shanghai AI Laboratory, with InternLM focusing on language models, while OpenGVLab releases vision models. While they release a range of models such as <a href=\"https:\/\/huggingface.co\/collections\/internlm\/intern-s1-6882e325e8ac1c58ba108aa5\" rel=\"noopener\" target=\"_blank\">S1<\/a> or <a href=\"https:\/\/huggingface.co\/collections\/internlm\/internlm2-math-65b0ce88bf7d3327d0a5ad9f\" rel=\"noopener\" target=\"_blank\">InternLM-Math<\/a>, the orgs are mostly known for the strong <a href=\"https:\/\/huggingface.co\/collections\/OpenGVLab\/internvl3-67f7f690be79c2fe9d74fe9d\" rel=\"noopener\" target=\"_blank\">InternVL<\/a> series. Although the first versions mostly used their own InternLM pretrained models, later releases (such as InternVL3) rely on Qwen as the language backend. 
<\/p>\n<p><a href=\"https:\/\/skywork.ai\/\" rel=\"noopener\" target=\"_blank\">skywork.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/Skywork\" rel=\"noopener\" target=\"_blank\">Skywork<\/a> | X <a href=\"https:\/\/x.com\/Skywork_AI\" rel=\"\">@Skywork_AI<\/a><\/p>\n<p>The Singaporean Skywork started out as an online karaoke company (yes, <a href=\"https:\/\/play.google.com\/store\/apps\/details\/?hl=en-US&amp;id=com.starmakerinteractive.starmaker\" rel=\"noopener\" target=\"_blank\">really<\/a>) before pivoting to AI and becoming a competitor to <a href=\"https:\/\/manus.im\" rel=\"noopener\" target=\"_blank\">Manus<\/a>, with their platform focusing on <a href=\"https:\/\/skywork.ai\/\" rel=\"noopener\" target=\"_blank\">agents for work-related tasks<\/a>, such as slide generation.<\/p>\n<p>Their LLM journey started with releasing their own pretrained <a href=\"https:\/\/huggingface.co\/Skywork\/Skywork-13B-base\" rel=\"noopener\" target=\"_blank\">dense<\/a> and <a href=\"https:\/\/huggingface.co\/Skywork\/Skywork-MoE-Base\" rel=\"noopener\" target=\"_blank\">MoE<\/a> models. 
However, they stopped pre-training their own models and instead started to fine-tune existing models: Their <a href=\"https:\/\/huggingface.co\/Skywork\/Skywork-OR1-32B\" rel=\"noopener\" target=\"_blank\">OR1 reasoning model<\/a> builds on top of DeepSeek-R1-Distill-Qwen-32B, while <a href=\"https:\/\/huggingface.co\/Skywork\/Skywork-R1V3-38B\" rel=\"noopener\" target=\"_blank\">R1V3<\/a> uses InternVL3 (which itself uses Qwen2.5 as its LLM backend).<\/p>\n<p>Aside from LLMs, they have a wide range of other models, including <a href=\"https:\/\/huggingface.co\/Skywork\/Matrix-3D\" rel=\"noopener\" target=\"_blank\">world models<\/a>, <a href=\"https:\/\/huggingface.co\/Skywork\/UniPic2-Metaquery-9B\" rel=\"noopener\" target=\"_blank\">image<\/a> and <a href=\"https:\/\/huggingface.co\/Skywork\/SkyReels-V1-Hunyuan-T2V\" rel=\"noopener\" target=\"_blank\">video generation models<\/a>, and <a href=\"https:\/\/huggingface.co\/Skywork\/Skywork-Reward-V2-Qwen3-8B\" rel=\"noopener\" target=\"_blank\">reward models<\/a>. Similar to their LLMs, they mostly build on top of other models. 
Unlike many labs, Skywork has released some datasets with their models, such as <a href=\"https:\/\/huggingface.co\/datasets\/Skywork\/Skywork-Reward-Preference-80K-v0.2\" rel=\"noopener\" target=\"_blank\">preference<\/a> and <a href=\"https:\/\/huggingface.co\/datasets\/Skywork\/Skywork-OR1-RL-Data\" rel=\"noopener\" target=\"_blank\">reasoning<\/a> training data.<\/p>\n<p>These companies are either just getting their toes wet with open models or operating more as academic research organizations than labs pushing the performance of models.<\/p>\n<p><a href=\"https:\/\/seed.bytedance.com\/en\" rel=\"noopener\" target=\"_blank\">seed.bytedance.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/ByteDance-Seed\" rel=\"noopener\" target=\"_blank\">ByteDance-Seed<\/a><\/p>\n<p>Seed is the R&amp;D arm of ByteDance and eerily similar to Meta\u2019s FAIR division: diverse models and interesting research, with papers garnering a ton of attention in the community. However, it remains to be seen whether they shoot for a Llama-style model release or continue to release research artifacts.<\/p>\n<p><a href=\"https:\/\/openbmb.ai\/\" rel=\"noopener\" target=\"_blank\">openbmb.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/openbmb\" rel=\"noopener\" target=\"_blank\">openbmb<\/a> | X <a href=\"https:\/\/x.com\/OpenBMB\" rel=\"\">@OpenBMB<\/a><\/p>\n<p>OpenBMB is an open-source community (comparable to BigScience) from the Tsinghua University NLP Lab (the same university that Zhipu was spun off from) with support from the Beijing Academy of Artificial Intelligence (BAAI) and ModelBest.<\/p>\n<p>They are mostly focusing on small multi-modal models for the edge, such as <a href=\"https:\/\/huggingface.co\/collections\/openbmb\/minicpm-o-and-minicpm-v-65d48fa84e358ce02a92d004\" rel=\"noopener\" target=\"_blank\">MiniCPM-V-4<\/a>. 
However, the license is rather restrictive, which is surprising given the community-driven origins of the group. Aside from model releases, they also release frameworks and specialized <a href=\"https:\/\/github.com\/OpenBMB\/CPM.cu\" rel=\"noopener\" target=\"_blank\">kernels<\/a> to make sure their models run on low-end hardware.<\/p>\n<p><a href=\"https:\/\/www.mi.com\/global\/brand\/ai\/xiaomi-hyperai\" rel=\"noopener\" target=\"_blank\">mi.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/XiaomiMiMo\" rel=\"noopener\" target=\"_blank\">XiaomiMiMo<\/a><\/p>\n<p>Xiaomi started releasing a bunch of small, capable models, ranging from <a href=\"https:\/\/huggingface.co\/collections\/XiaomiMiMo\/mimo-6811688ee20ba7d0682f5cb9\" rel=\"noopener\" target=\"_blank\">LLMs<\/a> to <a href=\"https:\/\/huggingface.co\/collections\/XiaomiMiMo\/mimo-vl-68382ccacc7c2875500cd212\" rel=\"noopener\" target=\"_blank\">VLMs<\/a> and <a href=\"https:\/\/huggingface.co\/mispeech\/midashenglm-7b\" rel=\"noopener\" target=\"_blank\">audio models<\/a>. Xiaomi\u2019s quick updates after the initial launch and its release of multiple model variants show that this is not a one-off foray into open models. However, it remains to be seen whether those are mostly research artifacts or whether they are serious about pushing the frontier or competing for adoption.<\/p>\n<p><a href=\"https:\/\/yiyan.baidu.com\/\" rel=\"noopener\" target=\"_blank\">yiyan.baidu.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/baidu\" rel=\"noopener\" target=\"_blank\">baidu<\/a> | X <a href=\"https:\/\/x.com\/Baidu_Inc\" rel=\"\">@Baidu_Inc<\/a><\/p>\n<p>Baidu, one of the original names in the Chinese AI space, has only released the weights of <a href=\"https:\/\/huggingface.co\/collections\/baidu\/ernie-45-6861cd4c9be84540645f35c9\" rel=\"noopener\" target=\"_blank\">ERNIE 4.5<\/a>. 
It remains to be seen whether they continue to release weights of newer releases as well.<\/p>\n<p>The rest of the labs that we are watching.<\/p>\n<p><a href=\"https:\/\/m-a-p.ai\/\" rel=\"noopener\" target=\"_blank\">m-a-p.ai<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/m-a-p\" rel=\"noopener\" target=\"_blank\">m-a-p<\/a><\/p>\n<p>An open research community, releasing all kinds of models (including a <a href=\"https:\/\/huggingface.co\/m-a-p\/neo_7b\" rel=\"noopener\" target=\"_blank\">truly open 7B language model<\/a> with data, etc.). Now, they\u2019re mostly known for the music generation model <a href=\"https:\/\/huggingface.co\/collections\/m-a-p\/yue-6797d55e22990ae89b90a3d6\" rel=\"noopener\" target=\"_blank\">YuE<\/a>.<\/p>\n<p><a href=\"https:\/\/aidc-ai.com\/\" rel=\"noopener\" target=\"_blank\">aidc-ai.com<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/AIDC-AI\" rel=\"noopener\" target=\"_blank\">AIDC-AI<\/a><\/p>\n<p>Another R&amp;D arm of Alibaba, mostly releasing niche models building upon Qwen.<\/p>\n<p><a href=\"https:\/\/www.baai.ac.cn\/en\/\" rel=\"noopener\" target=\"_blank\">baai.ac.cn<\/a> | \ud83e\udd17 <a href=\"https:\/\/huggingface.co\/BAAI\" rel=\"noopener\" target=\"_blank\">BAAI<\/a> | X <a href=\"https:\/\/x.com\/BAAIBeijing\" rel=\"\">@BAAIBeijing<\/a><\/p>\n<p>As a university, the Beijing Academy of Artificial Intelligence has a high diversity of projects. 
They are mostly known for <a href=\"https:\/\/huggingface.co\/collections\/BAAI\/bge-66797a74476eb1f085c7446d\" rel=\"noopener\" target=\"_blank\">BGE<\/a>, a series of capable embedding models.<\/p>\n<p>\ud83e\udd17 <a href=\"https:\/\/huggingface.co\/inclusionAI\" rel=\"noopener\" target=\"_blank\">inclusionAI<\/a> | X <a href=\"https:\/\/x.com\/InclusionAI666\" rel=\"\">@InclusionAI666<\/a><\/p>\n<p>The open weight arm of the Ant Group (an affiliate of Alibaba handling mobile payments and other financial services), responsible for <a href=\"https:\/\/huggingface.co\/collections\/inclusionAI\/ling-67c51c85b34a7ea0aba94c32\" rel=\"noopener\" target=\"_blank\">Ling Lite<\/a>, a series of LLMs.<\/p>\n<p><a href=\"https:\/\/www.huaweicloud.com\/\" rel=\"noopener\" target=\"_blank\">huaweicloud.com<\/a> | X <a href=\"https:\/\/x.com\/HuaweiCloud1\" rel=\"\">@HuaweiCloud1<\/a><\/p>\n<p>Huawei is building AI accelerators to challenge the market share of Nvidia GPUs, which are frequent targets of both US and Chinese regulations. Their model releases mostly serve to show what\u2019s possible with their chips, though not <a href=\"https:\/\/github.com\/HW-whistleblower\/True-Story-of-Pangu\" rel=\"noopener\" target=\"_blank\">without drama<\/a>, including accusations that they upcycled Qwen models without disclosing it. 
We would expect them to continue to release more models in the near future.<\/p>\n","protected":false},"excerpt":{"rendered":"The Chinese AI ecosystem has taken the AI world by storm this summer with an unrelenting pace of&hellip;\n","protected":false},"author":2,"featured_media":352605,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,1942,53,16,15],"class_list":{"0":"post-352604","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-technology","11":"tag-uk","12":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/115046248896164374","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/352604","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=352604"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/352604\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/352605"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=352604"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=352604"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=352604"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}