Why you can’t trust Google to pick the best films to watch

With a stellar cast led by Fiona Shaw and Emma Mackey, Hot Milk, the film adaptation of Deborah Levy’s 2016 novel about a mother who travels with her daughter to a Spanish clinic seeking a cure for her own paralysis, was supposed to be one of the cultural highlights of the summer.

In the event, the film flopped, with several scathing assessments from professional critics and a 37 per cent rating on the reviews website Rotten Tomatoes. The Times review called it an “unfortunate directorial debut” for Rebecca Lenkiewicz.

But anyone who looked up the film on Google might have been persuaded otherwise. When we asked the search engine over the summer what the general consensus was on Hot Milk, the authoritative-looking Google AI overview tool at the top of the results told us the film’s reviews had been “generally positive, though somewhat mixed”. Hmm.

Vicky Krieps and Emma Mackey in Hot Milk

ALAMY

Launched in May last year, Google’s AI Overview uses artificial intelligence to provide a summary of information at the top of a search results page. The information is pulled from a range of online sources so you don’t need to spend ages trawling through websites; it’s all distilled there on Google.

If you’re looking for inspiration for which of the thousands of films on streaming platforms to watch, which book to read or exhibition to go to, this is designed to be a time-saving tool — aggregating the critics. In fact, given that Google has about 1.5 billion users worldwide, Google AI Overview is also the most popular critic around. But can you trust it?

I asked Google how The Times had rated Hot Milk, as someone might if they didn’t have a subscription to the newspaper and so couldn’t read it. It was, AI Overview confidently asserted, a “mixed review”. Really? Asked the exact same question again, just a few minutes later, Google, tail between its legs, admitted the review had been “negative”.

It went further, telling us that the critic, Kevin Maher, had given the film a one-star rating and derided it as a “fumbled, pretentious and utterly charmless adaptation” of Levy’s novel. The problem? Maher had given the film two stars, not one. And, while he was far from complimentary, the critic had not described the film in any of the terms Google attributed to him. Google’s AI Overview appears to have invented the quote and attributed it to him.

It’s easy to poke fun at Google’s AI Overview tool and its propensity to “hallucinate”, a term adopted by the tech industry to describe the phenomenon of a chatbot spitting out false, misleading or nonsensical information.

But, in an era when we have such a huge choice of films across cinema, terrestrial channels and streamers, Google — the world’s biggest website, a trusted companion to billions — has immense power to influence our Saturday night schedules. You would hope, therefore, that it would at least be accurate and fair.

Beeban Kidron, a film-maker whose credits include directing Bridget Jones: The Edge of Reason, was disturbed by these findings. Kidron, a crossbench peer who has become an expert in AI while campaigning for chatbots to be forced to respect copyright laws, says: “If the AI summary is actually sort of replacing Google as a tool, then the fact that it is not only inaccurate but making things up is very problematic for the world.”

So, how does Google’s AI Overview work? And how can a tool developed by one of the world’s biggest tech superpowers get things so clumsily wrong?

The Silicon Valley company introduced AI Overviews as part of an effort to keep up with competition from chatbots. The fear for Google is that ChatGPT and its rivals will soon replace the traditional search engine by offering users instant answers, recommendations and a more personal touch. While Google has developed its own chatbot, Gemini, it also increasingly places AI Overview within search results.

“AI is going to completely replace search,” says Laurence O’Toole, the chief executive of Authoritas, which advises companies on how they can appear in AI searches. “Google thinks AI Overviews are a good thing for users and a way for retaining its monopoly in search. So I think the days of traditional search are numbered and that it is likely that we will get an AI-generated result for pretty much everything in the weeks and months ahead.”

AI Overview claimed the live action Snow White was “captivating” despite being widely panned

WALT DISNEY PICTURES/ALAMY

For example, you might ask: is the new Snow White (another big flop of 2025) really that bad? The answer comes: “The live-action Snow White film has polarised critics, with some calling it ‘captivating’ and ‘mostly captivating’ while others find it ‘exhaustingly awful’ and a ‘missed opportunity’.” It then provides a list of sources: a YouTube video, a Reddit feed, user reviews on IMDb and three BBC news articles reporting on the critical response to Snow White. Here it’s worth noting that, while the trade magazine the Hollywood Reporter did describe the movie as “mostly captivating”, none of the largely negative sources for Google’s answer described Snow White as entirely “captivating”, as AI Overview claimed.

For Google, hallucinations, which occur when there are gaps in the data its AI tools have been trained on, are not new. In one infamous case, a user was advised that adding non-toxic glue to a pizza sauce could help the cheese to stick better. In another case, it tried to claim that the phrase “you can’t lick a badger twice” was a common idiom.

It’s not just Google: ChatGPT and other AI tools are regularly caught confidently asserting falsehoods as fact. But for film buffs, Google, having established itself over years as a reliable source of reviews and information, is for now the troublemaker to watch.

Over a series of Google searches, we identified numerous oddities and issues with AI Overview. “Critics are divided” on Bridget Jones: Mad About the Boy, which generally attracted rave reviews. A consensus on the Oscar-winning Oppenheimer? “Mixed.”

O’Toole said that from his company’s recent experiments with Google, it seemed that AI Overview had been “tuned to be neutral”. For reviews, that may sound dull and lacking in conviction, but at least it also implies fairness and balance.

However, we found that AI Overview does not appear to give every film the benefit of the doubt. The makers of Disney’s recent live-action remake of Lilo & Stitch — “heartfelt but uneven”, according to AI Overview — might feel hard done by. In this case, Google’s acerbic take was that, while young people might enjoy the recent remake (which scores 92 per cent on Rotten Tomatoes’ Popcornmeter measure of viewer ratings), for many it “falls short of matching the quality and charm of the 2002 animated film” (which had a Popcornmeter score of 78 per cent).

One of the most glaring problems with Google’s tool is the lack of consistency. Ask the search engine whether Mission: Impossible – The Final Reckoning is critically acclaimed, and AI Overview will tell you it had indeed “received a generally positive reception from critics”. Ask again, seconds later, and you are informed the movie “is not widely considered critically acclaimed”.

While it is often correct, we also found numerous instances of the tool misrepresenting or misquoting professional critics. When asked about the Sunday Times review of Superman, Google reported that Tom Shone had praised a “compassionate character”, when he had done no such thing in his five-star review. It also claimed he had given Freakier Friday four stars when in fact he gave it three, and that 28 Years Later, to which he gave three stars, was a five-star film.

Jamie Lee Curtis and Lindsay Lohan in Freakier Friday

ALAMY

Asked how The Times had rated Die, My Love, AI Overview confidently reported that it had received a zero-star rating. Wow … but is that true? Asked the same question seconds later, Google said in fact The Times had not given it a rating. Sure? On the third search, AI Overview plumped for two stars. Then minus one. Only on the fifth search did Google identify — or, presumably, guess — the right answer: one star.

A spokesperson for Google suggested that AI Overview is improving all the time. They said: “We aim to surface relevant, high quality information in all our Search features and we continue to raise the bar for quality with ongoing updates and improvements. When issues arise — like if our features misinterpret web content or miss some context — we use those examples to improve and take appropriate action under our policies.”

But Kidron, who is supportive of useful AI technology, says there are several issues here. One is that Google is effectively “nicking [copyrighted media] property and then it’s reproducing something that’s rubbish”. Another is that, as she’s found in her work around AI technology, people are coming to rely on chatbots for the sort of advice they might have once sought out from friends and family.

“Any film-maker will tell you that word of mouth is a key aspect of whether a film succeeds or not,” she says. “So you can have a film that’s reviewed badly but does really well through word of mouth, and you can have a film that reviews brilliantly but people say, ‘Meh, don’t bother.’ What’s interesting and crucial to understand is that the way that people are responding to AI is very much as if they are ‘people’s’ recommendations.” It’s her theory that AI sometimes tells you “what they think you want to hear”, so really it should be difficult to trust its film recommendations.

For the Times critic Maher there are even more problems at play. He is concerned about the danger of bland or misleading Google AI Overviews undermining informed criticism in an era when it has already become difficult for people to find objective, honest reviews that have not been influenced by Hollywood’s PR machine. “If everything is filtered through this really useless programme, that’s it,” he said. “That’s the end of the conversation. Everything is marketing.”

I put the question to AI Overview, asking if it can be trusted for reviews. It doesn’t even have confidence in itself, saying: “No, you cannot always trust Google’s AI Overviews for film reviews, as they are known to be unreliable, can make significant mistakes (‘hallucinations’), and may present incorrect or misleading information.” So by all means ask Google, but know that for now it is not infallible.
