Numbers calculated in Python using the recently published [Unicode 17.0 draft](https://www.unicode.org/Public/draft/ucd/) (specifically UnicodeData.txt, Scripts.txt, Blocks.txt and emoji/emoji-data.txt). Visualised using Google Sheets and GIMP.
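For anyone who wants to try a similar count, here is a minimal sketch of tallying code points per script from lines in the Scripts.txt layout. It is not necessarily how the numbers above were produced: the function name `count_by_script` and the three sample lines are illustrative assumptions, and a real run would read the full file from the draft directory instead.

```python
from collections import Counter


def count_by_script(lines):
    """Count code points per script from lines in the Scripts.txt format.

    Hypothetical helper for illustration; not the original poster's code.
    """
    counts = Counter()
    for line in lines:
        # Drop trailing comments and skip blank or comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code_range, script = (field.strip() for field in data.split(";"))
        # A range is written "XXXX..YYYY"; a single code point has no "..".
        start, _, end = code_range.partition("..")
        first = int(start, 16)
        last = int(end, 16) if end else first
        counts[script] += last - first + 1
    return counts


# Sample lines in the Scripts.txt layout (a tiny excerpt, not the full file).
sample = [
    "# Scripts sample",
    "0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
    "00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR",
    "4E00..9FFF    ; Han # Lo [20992] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFF",
]
counts = count_by_script(sample)
```

On the full file you would pass an open file handle instead of `sample`, then call `counts.most_common()` to rank the scripts by size.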
China has more population even in computer codes.
Great, now the fascists will find out about this and ban Unicode. /s
Great visualization and data. I had no idea that Latin script was such a tiny part of Unicode.
What about Cyrillic and related alphabets? Hebrew, Arabic, and other related abjads? Are they not in Unicode?
Honestly we should just get rid of logographic writing systems entirely, they’re just inefficient and hard to learn and use for no reason at all. Hangul has the right idea, giving you information on how a word is pronounced should be how a writing system works.
The proportions by how frequently the characters are actually used would probably be roughly the opposite.
emoji is the biggest Japanese contribution to human communication
Who wrote that text? There's so many typos and spelling errors in the first few sentences already.
That’s just the 汉字 they’ve imported to Unicode. There are several times that many officially recognized as valid and an unknowable number of characters lost to the ages.
“… but words can also be constructed using the rebus principle (e.g. writing belief as bee+leaf).”
Absolutely diabolical. People say English is convoluted, but at least the word play we use for fun isn’t a requirement of the writing system. I get that it’s a thing with a lot of ancient pictographic languages as they transitioned into a more complex system, but still…