Thought it was mildly interesting that most European languages can be grouped sensibly merely by looking at their letter distributions.
The similarity measure used here is just the sum of the absolute differences in character prevalences (strictly speaking a distance, so a lower score means more similar): e.g. if language A has distribution `{A: 0.5, B: 0.3, C: 0.2}` and language B has distribution `{A: 0.8, B: 0.2}` then their similarity is `|0.5-0.8|+|0.3-0.2|+|0.2-0.0|=0.6`.
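For anyone who wants to try the measure, here is a rough Python sketch of it. The function names `letter_distribution` and `distribution_distance` are made up for illustration, and the choice to lowercase the text and keep only alphabetic characters is an assumption; the exact preprocessing behind the graph isn't stated.

```python
from collections import Counter


def letter_distribution(text):
    """Relative frequency of each alphabetic character in `text`.

    Assumption: case is folded and non-letters are ignored.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


def distribution_distance(p, q):
    """Sum of absolute differences in prevalence (lower = more similar).

    A character missing from one distribution counts as 0.0 there.
    """
    return sum(abs(p.get(ch, 0.0) - q.get(ch, 0.0)) for ch in set(p) | set(q))


# The worked example from the post.
lang_a = {"A": 0.5, "B": 0.3, "C": 0.2}
lang_b = {"A": 0.8, "B": 0.2}
print(round(distribution_distance(lang_a, lang_b), 3))  # prints 0.6
```

Since each distribution sums to 1, the score ranges from 0 (identical distributions) to 2 (no characters in common), which is roughly what you would get when comparing a Latin-script language with Greek.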
**Notice to all Bosnian-Croatian-Montenegrin-Serbian speakers: this is about *written* similarity, not *spoken* similarity!**
I was wondering why there isn’t a connection between Portuguese and Romanian, but then I saw what it is based on and calmed myself down.
yeah letter distribution 😀 I was wondering how the heck did Hungarian get connected to Swedish 😀
As someone from Serbia I can’t understand any of the languages from the group that is close to Serbian in this graph.
When I go to Croatia, I speak Serbian and they speak Croatian and we all understand each other completely.
TLDR – this graph is not quite meaningful. I guess the only criterion was the alphabet, but that doesn’t hold for the languages themselves.
Laughs in Greek.
For Serbia I’m pretty sure this is wrong, Croatian is by far the closest to Serbian
It’s funny how this map gets so much stuff right and so much stuff so spectacularly wrong.
Luxembourgish is just a German dialect
One of these high-effort OG posts that will get downvoted to hell by people not understanding it…
This is misleading.
r/badlinguistics
Greece is not Europe!
I was raised in three countries and I don’t understand this. Regards
The Welsh-to-English link is wrong. Welsh has more letters (29) and more vowels (7), and it’s more closely related to Irish than to English on the language tree. English is Germanic in origin.
This is terrible. How can you connect languages via alphabet? I would say, among those, Hungarian would be most similar to Turkish regarding vocabulary, grammar, etc. And then Russian and Bulgarian maybe due to having Turkic influence in their daily words.
Some quality content here, thank you.