
Browsing Tag

AI benchmarks

9 posts
Google
Markus Kasanmascheff

Google DeepMind AI Co-Clinician Tops GPT-5.4 in 98-Query Test But Still Trails Physicians

  • May 2, 2026
TL;DR Blind Doctor Test: Doctors preferred Google DeepMind’s AI co-clinician to GPT-5.4-thinking-with-search by 63 to 30 across 98…
OpenAI

GPT-5.5 lands as OpenAI accelerates its model release cadence to near-monthly – Startup Fortune

  • April 25, 2026
OpenAI released GPT-5.5 on April 23, 2026, just six weeks after GPT-5.4, and the model’s performance on independent…
Agentic AI

OpenAI Launches GPT-5.5: Smarter Agentic AI Model That Codes, Operates Software and Automates Complex Workflows

  • April 24, 2026
The company said the biggest leap is in agentic coding and computer use. On Terminal-Bench 2.0, which tests complex…
Google

A leaked open-source model called Yahu may have just broken the logic ceiling that has defined AI for years – Startup Fortune

  • April 24, 2026
A consortium of former OpenAI and Google DeepMind researchers quietly dropped Yahu onto X overnight, triggering global trending,…
AGI

Alphabet-X just released Astra and the internet is calling it the moment AGI arrived – Startup Fortune

  • April 23, 2026
Alphabet-X’s Astra model passed the Lovelace Test 2.0 today, scoring 96.8% on the ARC-AGI benchmark and solving a…
OpenAI

OpenAI’s ChatGPT Images 2.0 tops every major benchmark just days after launch and that changes the competitive map – Startup Fortune

  • April 23, 2026
ChatGPT Images 2.0 debuted on April 20 and has already claimed the top spot on the GenEval and…
Google

Google’s Gemma 4 just outscored ChatGPT and Gemini Chat and you can run it yourself – Startup Fortune

  • April 22, 2026
Google DeepMind’s Gemma 4 has landed benchmark scores above both ChatGPT and Gemini Chat, and because it’s open-weight,…
AI

Stanford’s 2026 Report: AI Safety Benchmarks Are Falling Behind

  • April 15, 2026
The assumption that the US holds a durable lead in AI model performance is not well-supported by the…
AI

Did Meta Sacrifice Its Open-Source Identity for a Competitive AI Model?

  • April 10, 2026
The open-source AI movement has never lacked for options. Mistral, Falcon, and a growing field of open-weight models…
www.europesays.com