Sources: Pushshift dump of all posts on r/AmItheAsshole from the subreddit's creation through the end of 2024, totalling 7.53 GB (2,503,443 posts, approximately 700k of which are flaired with a verdict: YTA/ESH/INFO/NAH/NTA)

Tools: Golang code for data cleaning & parsing, Python code & matplotlib for data visualization
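
As a rough illustration of the kind of aggregation behind a chart like this, here is a minimal Python sketch. It is not the author's actual code: the input shape (pairs of month string and flair), the function names `monthly_shares` and `stacked_bounds`, and the assumption that the chart is a 100% stacked chart are all hypothetical. It computes each verdict's share of flaired posts per month, then the cumulative upper edge of each band as it would appear when the shares are stacked.

```python
from collections import Counter, defaultdict

# The five verdict flairs named in the post, in an assumed stacking order.
VERDICTS = ("YTA", "ESH", "INFO", "NAH", "NTA")


def monthly_shares(posts):
    """Return {month: {verdict: percent}} for flaired posts.

    `posts` is an iterable of (month, flair) pairs, e.g. ("2019-03", "NTA").
    This input shape is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for month, flair in posts:
        if flair in VERDICTS:  # skip unflaired posts and any other flair
            counts[month][flair] += 1

    shares = {}
    for month in sorted(counts):
        total = sum(counts[month].values())
        shares[month] = {v: 100.0 * counts[month][v] / total for v in VERDICTS}
    return shares


def stacked_bounds(share_row):
    """Return the cumulative upper edge of each band in a stacked chart."""
    running = 0.0
    bounds = {}
    for verdict in VERDICTS:
        running += share_row[verdict]
        bounds[verdict] = running
    return bounds


if __name__ == "__main__":
    sample = [
        ("2019-01", "NTA"), ("2019-01", "NTA"), ("2019-01", "YTA"),
        ("2019-01", None),  # unflaired post, ignored
        ("2019-02", "NAH"),
    ]
    for month, row in monthly_shares(sample).items():
        print(month, row, stacked_bounds(row))
```

In a stacked chart a verdict's share is the height of its band, not the value at its upper edge, so the shares in any month sum to 100% and the topmost band always ends at 100%. Months with few flaired posts produce jumpy shares, since a handful of posts can move a percentage a long way.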

Posted by GeorgeDaGreat123

25 comments
  1. happy to answer anyone’s questions about methodology

  2. Funny that when people are allowed to tell only their side of a story, they come across better!

  3. Any theories on why the world had fewer assholes during 2019?

  4. Huh I’m not on that sub all that frequently but I am surprised that YTA is not the majority here overall. Maybe those are just the ones that get the most traction?

  5. What’s with the graph getting smoother, is that due to increased traffic?

  6. I must not know how to read this type of graph… It looks to me that 100% of posts = YTA and no matter where you look the YTA,ESH,INFO,NAH,NTA adds up to over 100% which doesn’t make sense… help me understand? thx

  7. So looking at timelines, some serious volatility before 2018-2019ish. Does anyone with memory of those days know if that was before maybe some stronger guidelines started being enforced, or some change in handling of the sub?

    Then 2019, I would call a COVID bubble. Folks dealing with COVID related disruptions to their lives, and they piled into that sub for validation/advice/karma because they were isolated and bored.

  8. Great data analysis, thanks for sharing, that plot is cleannnnnnn.

  9. I’m wondering why the distribution is so rocky for ESH, INFO, and NAH posts earlier in the decade.

  10. It looks interesting how it “evens out” to an average in the last years. Is that due to bot posting? Or is the amount of data just so little in the earlier years?

  11. This doesn’t surprise me at all!

    I get a ton of AITA stories in my feed that are like “my vegan sister is demanding that I let her bring her 6 misbehaving young children (6 different dads btw, she’s on welfare btw, her only job is onlyfans modeling btw) to my strictly childfree wedding, and also she wants me to surrender my elderly dog to the animal shelter because she’s allergic. I replied to her with a sharp, confident rebuttal that really put her in her place. My mom says I was too harsh and now I have family members blowing up my phone. AITA? (ps – my boyfriend threw me off of a 10 story parking garage roof because i said hello to a male cashier at the store, if that matters)” and then for a few hours thousands of nearly identical replies pour in, righteously declaring NTA! NTA!

    Even before LLM posts became common, this sort of obvious bait took up a huge amount of space in the sub. I honestly don’t even mind, because even though a post is fake it can spur real conversation in the comments. 

    But ya, I’ve always suspected “OP is the innocent one” posts get more traction than the “OP is clearly the wrong one so let’s pile on” posts and this chart proves it, very cool.

  12. That sub is so annoying. People be like “my MIL drowned my puppy and I got upset, AITA?”

  13. Love seeing the surge in discourse/YTA judgements following the 2016 election, and the swell of empathy and subjectivity leading up to the 2020 election

  14. What percent are posted by bots? Looks like about 85% just eyeballing it.

  15. The reality is most of these stories should be INFO. They are almost always one-sided accounts and the other side is often very important.

    Now, to be fair, if the poster talks about someone beating their spouse or children or something, the beater is definitely the asshole. There are some one-sided accounts where you can feel comfortable stating that at least one side is the asshole regardless of the situation.

    However, what is not clear is whether the recipient of said beating is not also some form of asshole.

    You never *deserve* a beating, but that doesn’t mean that you can’t be an asshole yourself. There is even a term for that concept: “asshole victim”.

  16. re: beautiful: this is the gold standard for these plots IMO. They’re usually done so badly.
