[OC] Where Common English Words Come From

Posted by cavedave

8 comments
  1. New graph of this submission [https://www.reddit.com/r/dataisbeautiful/comments/1hlayul/oc_english_words_where_do_the_come_from/](https://www.reddit.com/r/dataisbeautiful/comments/1hlayul/oc_english_words_where_do_the_come_from/) based on suggested improvements.

    I remember hearing that the 1000 most used English words are mostly of Germanic origin and that after that French words dominate, and I wanted to see if it is true. Is English really a French creole?

    Wordlist: first, let's get the 2000 most common words from contemporary fiction. There are lots of possible word frequency lists.

    Data from Wiktionary, both the frequencies and most of the etymologies: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/Contemporary_fiction

    The Python matplotlib code and the analysis code are up at

    https://colab.research.google.com/drive/1QUnmjgOD76TpPO3IGB3Oz3SymL7pGEbQ?usp=sharing

    The full classified word list is up at https://github.com/cavedave/EnglishWords and I will fix errors as we find them. With 2000 words some will be wrong, and some will not be possible to get right. There are words whose origins academics are still arguing about.

  2. Cool, soon we'll be able to talk normally on this damn site full of wankers…

  3. I like how French is seen as different from Latin even though it is a Neo-Latin language

  4. This comment section is now property of the Federal Republic of Germany 🇩🇪

  5. The word creole is not well defined, but it is much, much easier to make a normal-sounding sentence or paragraph with Latin-only words (besides the grammar particles) than with Germanic-only words

    A lot of the Germanic share in these 2000 most common words comes from non-nouns, such as “the” “in” “to”…

    If you discount these, even the top 2000, which is an extremely limited vocabulary, is majority Latin

    Formal documents such as the United States Declaration of Independence or the United Nations Charter have barely any non-grammar Germanic words

    Meanwhile the opposite is so difficult that Anglish is counted as a hard exercise / conlang

    What counts as a creole we cannot determine with any degree of objectivity, but it’s certain that, while the grammar of English is Germanic, the non-grammar vocabulary is absolutely dominated by Latin

    English is Germanic hardware with mostly Latin software
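
The comparison argued over in this thread (origin shares across the frequency list, and what happens once words like “the” “in” “to” are discounted) can be sketched in a few lines of Python. This is a minimal sketch, not the code from the linked Colab notebook: the ten-word sample, the function names `origin_shares` and `shares_by_band`, and the function-word flags are all assumptions made for illustration. It also counts each word once rather than weighting by how often it occurs, which may differ from the original analysis.

```python
from collections import Counter

# Toy sample, NOT the poster's data: (word, origin family, is_function_word),
# listed in a made-up frequency order purely for illustration.
SAMPLE = [
    ("the", "Germanic", True),
    ("and", "Germanic", True),
    ("to", "Germanic", True),
    ("in", "Germanic", True),
    ("house", "Germanic", False),
    ("water", "Germanic", False),
    ("people", "French", False),
    ("very", "French", False),
    ("question", "French", False),
    ("animal", "Latin", False),
]


def origin_shares(rows, skip_function_words=False):
    """Return {origin: fraction of words}, optionally ignoring function words."""
    kept = [origin for _, origin, is_function in rows
            if not (skip_function_words and is_function)]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {origin: n / total for origin, n in counts.items()}


def shares_by_band(rows, band_size):
    """Split a frequency-ordered list into rank bands and tally each band."""
    return [origin_shares(rows[i:i + band_size])
            for i in range(0, len(rows), band_size)]


all_words = origin_shares(SAMPLE)
content_only = origin_shares(SAMPLE, skip_function_words=True)
bands = shares_by_band(SAMPLE, band_size=5)

print("all words:   ", all_words)
print("content only:", content_only)
print("by rank band:", bands)
```

With the real 2000-word classified list in place of `SAMPLE` and a band size of, say, 500, the same two tallies would show whether the Germanic share really falls off with rank and how much of it depends on grammar words.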
