Most similar language to each European language, based purely on letter distribution

2022-06-08

Tags:
Europe

16 comments

Udzu says:

2022-06-08 at 11:58

Thought it was mildly interesting that most European languages can be grouped sensibly merely by looking at their letter distributions.

The similarity measure used here is just the sum of the absolute differences in character prevalences (so a lower score means more similar): e.g. if language A has distribution `{A: 0.5, B: 0.3, C: 0.2}` and language B has distribution `{A: 0.8, B: 0.2}` then their similarity is `|0.5-0.8|+|0.3-0.2|+|0.2-0.0|=0.6`.
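For anyone who wants to try it, here is a minimal Python sketch of that measure. The function names are illustrative and not from the original post. Case-folding and counting only alphabetic characters are assumptions too, since the post does not say how the text was preprocessed.

```python
from collections import Counter


def letter_distribution(text):
    """Return the relative frequency of each letter in text (assumed: case-folded, letters only)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def distribution_distance(p, q):
    """Sum of absolute differences; a letter missing from one side counts as 0."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def most_similar(name, distributions):
    """Return the other language whose distribution has the smallest distance to `name`."""
    others = (lang for lang in distributions if lang != name)
    return min(
        others,
        key=lambda lang: distribution_distance(distributions[name], distributions[lang]),
    )


# The worked example from the paragraph above.
lang_a = {"A": 0.5, "B": 0.3, "C": 0.2}
lang_b = {"A": 0.8, "B": 0.2}
print(round(distribution_distance(lang_a, lang_b), 3))  # 0.6
```

A lower value means the two distributions are closer, and 0 means they are identical. Running `most_similar` over a dictionary of per-language distributions picks each language's nearest neighbour.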

**Notice to all Bosnian-Croatian-Montenegrin-Serbian speakers: this is about *written* similarity, not *spoken* similarity!**
Fkappa says:

2022-06-08 at 12:02

I was wondering why there isn’t a connection between Portuguese and Romanian, but then I saw what it is based on and calmed down.
potatolulz says:

2022-06-08 at 12:03

yeah letter distribution 😀 I was wondering how the heck Hungarian got connected to Swedish 😀
radenkosalapuratetak says:

2022-06-08 at 12:06

As someone from Serbia I can’t understand any of the languages from the group that is close to Serbian in this graph.

When I go to Croatia, I speak Serbian and they speak Croatian and we all understand each other completely.

TLDR: this graph is not very meaningful. I guess the only criterion was the alphabet, but that doesn’t tell you how similar languages really are.
Unlikely-Elk-8316 says:

2022-06-08 at 12:09

Laughs in Greek.
Mladenetsa says:

2022-06-08 at 12:09

For Serbia I’m pretty sure this is wrong; Croatian is by far the closest to Serbian.
BrightCharlie says:

2022-06-08 at 12:10

It’s funny how this map gets so much stuff right and so much stuff so spectacularly wrong.
odium34 says:

2022-06-08 at 12:11

Luxembourgish is just a German dialect
Eupowa says:

2022-06-08 at 12:18

One of these high-effort OG posts that will get downvoted to hell by people not understanding it…
WolFlow2021 says:

2022-06-08 at 12:19

This is misleading.
Doktor_musmatta says:

2022-06-08 at 12:29

r/badlinguistics
Self-Bitter says:

2022-06-08 at 12:31

Greece is not Europe!
falquiboy says:

2022-06-08 at 12:47

I was raised in three countries and I don’t understand this. Regards
Cariad73 says:

2022-06-08 at 12:58

The Welsh-to-English link is wrong. Welsh has more letters (29) and more vowels (7), and on the language tree it’s more closely related to Irish than to English. English is Germanic in origin.
lehorselessman says:

2022-06-08 at 13:08

This is terrible. How can you connect languages via alphabet? I would say, among those, Hungarian would be most similar to Turkish regarding vocabulary, grammar, etc. And then Russian and Bulgarian maybe due to having Turkic influence in their daily words.
MrJinkins says:

2022-06-08 at 13:08

Some quality content here, thank you.
