

Data Sources:
IMDb
https://datasets.imdbws.com/
My CSV file
https://drive.google.com/file/d/14vCY8NwXAUPGhKZhvx1H8OyENw1dOpWa/view?usp=sharing
Tools used:
Julius AI
https://julius.ai/
Canva
https://www.canva.com/
Posted by wehavethedata_


9 comments
What’s with all the cutouts of random people on here? They make it hard to actually read the chart.
I assume the 10k vote threshold is to limit it to “how many sci fi films (that matter) were released every year.”
But with IMDb being younger than most of this timeline, there’s a bias present just in the votes. The older a movie is, the more popular it has to be just to reach the vote threshold.
If we’re going to count sci-fi movies released, the number only really has meaning if we compare it to total movies released. Perhaps sci-fi as a percentage of all movies?
Then for rating, consider that IMDb as a website is much younger than many of the movies it rates. Perhaps there’s some survivorship bias, in that the old sci-fi films people still watch would be the good ones that friends recommend instead of just what the movie theaters are promoting.
It’s good that you mentioned you used “AI” in this, but now I question the validity of all the data. Maybe _some_ of this data is accurate, but who knows? Better to not read it and fill my head with dubious information, because Future Me won’t know which datapoints were solid and which were LLM slop.
What about controlling for the number of movies made per year? I assume every genre would have a similar graph because the number of movies per year has increased.
This should be as a percentage of films released
1968 was a great year to log on to IMDb and rate a movie.
The more that’s released the more the decent ideas are drowned out by the slop.
Not starting your graph at 0, only including movies with 10,000 votes, etc. This data is not beautiful.
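Several comments above ask for sci-fi as a percentage of all films released per year. Below is a minimal sketch of that calculation, not the original poster's method. It assumes rows shaped like IMDb's `title.basics.tsv` file (columns `titleType`, `startYear`, `genres`, with `\N` for missing values); the function names and the file path are hypothetical. It does not apply the 10,000-vote threshold, which would need a join against `title.ratings.tsv`.

```python
import csv
from collections import Counter


def scifi_share_by_year(rows):
    """Return {year: percent of movies tagged Sci-Fi} for title.basics-style rows."""
    total = Counter()
    scifi = Counter()
    for row in rows:
        # Keep feature films only; skip TV episodes, shorts, etc.
        if row["titleType"] != "movie":
            continue
        year = row["startYear"]
        # IMDb writes missing values as \N, so skip anything non-numeric.
        if not year.isdigit():
            continue
        total[int(year)] += 1
        # genres is a comma-separated list such as "Adventure,Sci-Fi".
        if "Sci-Fi" in row["genres"].split(","):
            scifi[int(year)] += 1
    return {y: 100.0 * scifi[y] / total[y] for y in sorted(total)}


def load_basics(path):
    """Yield rows from an uncompressed title.basics.tsv (path is an assumption)."""
    with open(path, encoding="utf-8", newline="") as f:
        yield from csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
```

With the file downloaded and unpacked, `scifi_share_by_year(load_basics("title.basics.tsv"))` would give one percentage per release year, which can be plotted in place of the raw counts.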