A fascinating subject, and there's an extension of it about which I've been curious for years, it's why English doesn't use diacritics or accents on alphabetical characters as do many European languages—French, German, etc. For instance, in French the letter 'e' can take four forms—without accent, grave è, acute/aigu é, circumflex ê. This makes the letter much more flexible and greatly assists with pronunciation.
So why doesn't English use them? It's clear the language needs them with words such as through, thorough and thought which are confusing enough for native speakers let alone those learning English as a second language. And how about inconsistencies such as the verb to lead and the metal Pb lead, or other strange 'gh' words such as cough?
Similarly, proper nouns such as Wycombe, Warwick, etc. defy logic when it comes to pronunciation and would greatly benefit from diacritical marks.
I've never considered myself a particularly good speller and I reckon I would have benefited from diacritics had English used them.
English spelling becomes a lot more rational [1] when you understand that you are spelling words largely according to Middle English. Sound changes that occur after that are largely not reflected in English spelling.
As for why English doesn't use diacritics, well, my hypothesis is that Middle English just didn't need them. Per the Wikipedia article on Middle English phonology, there's 5 short vowels + unstressed vowel + 7 long vowels + diphthonging with /j/ or /w/. With just 5 short vowels, it's possible to write every one with the 5 basic vowels of Latin script, and the long vowels and diphthongs can be written with paired vowels, using different vowels for the extra long vowel sounds (cf. ee versus ea)--no need for diacritics, even if it is a little clumsy.
The Great Vowel Shift came along and fucked up all the long vowels, and separate changes caused "a" to break into 2-3 sounds (crowding into the "o" sound, as well), leaving us with clearly multiple sounds for the same short "a" and "long" vowel spellings that bear little to no resemblance to their corresponding pronunciations. Some of the sounds (in particular, ee and ea) merged into one sound as part of the Great Vowel Shift, too.
[1] The other thing that ruins English spelling is a proclivity to borrow foreign words with foreign pronunciation and foreign spellings (even if it's not written in Latin script!), so to some extent, you have to play a guessing game as to etymology to work out spelling. But ignoring those cases, you can actually pretty reliably work out the pronunciation of most English words by effectively applying the (mostly regular) phonological changes from Middle English.
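To make the "apply the sound changes" idea concrete, here is a rough sketch in Python. The table is a simplified textbook summary of the Great Vowel Shift, not a full account of the history, and the Middle English transcriptions are illustrative assumptions (final schwas and later changes are ignored). The names `GREAT_VOWEL_SHIFT` and `shift` are made up for this example.

```python
import re

# Simplified textbook mapping: Middle English long vowel -> rough modern reflex.
# Assumption: one outcome per vowel; real developments vary by word and dialect.
GREAT_VOWEL_SHIFT = {
    "iː": "aɪ",  # bite
    "eː": "iː",  # meet (spelled "ee")
    "ɛː": "iː",  # meat (spelled "ea"), merged with the vowel above
    "aː": "eɪ",  # name
    "uː": "aʊ",  # house
    "oː": "uː",  # boot
    "ɔː": "əʊ",  # boat
}

_PATTERN = re.compile("|".join(re.escape(v) for v in GREAT_VOWEL_SHIFT))


def shift(middle_english_ipa: str) -> str:
    """Apply the table in a single pass so that outputs are not shifted again."""
    return _PATTERN.sub(lambda m: GREAT_VOWEL_SHIFT[m.group(0)], middle_english_ipa)


for spelling, old in [("meet", "meːt"), ("meat", "mɛːt"), ("bite", "biːt"), ("house", "huːs")]:
    print(spelling, old, "->", shift(old))
```

The single-pass substitution matters: replacing one vowel at a time would turn eː into iː and then shift that result again into aɪ.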
very interesting!
can you share a couple examples of words borrowed with foreign, non-Latin script? I'm having trouble imagining what this could look like
Pyjamas, guru, khaki, avatar, bandana, jodhpurs, and shampoo are but a few of the many words drawn from Hindi and Sanskrit.
Urushiol, kudzu, futon, and karate from Japanese.
Orangutan, bamboo, cassowary, paddy, ramie, rattan, gong, and camphor from Indonesian (and related languages).
Ammonia, banana, bongo, cola, dengue, ebony, and gnu are of African origin (numerous languages).
Algebra, algorithm, alchemy, and alcohol all come from Arabic ('al' is the definite article in Arabic). So too do numerous terms for textiles: chiffon, gabardine, satin, taffeta, and wadding. Tahini, tuna, tamarind, talc, tangerine, and talisman as well.
All these use a non-Latin script, in some cases no script at all.
> khaki
From Urdu from Persian.
Persian and Sanskrit are closely related and thought to have originated from a common Proto-Indo-Iranian language from about 4000 years ago.
Fair point as I'd chosen to use Hindi (culture) rather than India (geographic region), though of course India itself derives from Hindi (the word).
A fuller set of candidate languages, from Wikipedia: Hindi, Urdu, Kannada, Malayalam, Sanskrit, Tamil, Telugu, Bengali, Assamese, and Marathi.
<https://en.wikipedia.org/wiki/List_of_English_words_of_India...>
Words like 'manga' or 'anime', which come from Japanese (written in kanji), or Greek words like onomatopoeia (Greek uses a different alphabet from Latin), are ones that come to mind.
This hypothesis seems to ignore the vast number of English words that come from French, even despite the vowel shift
Divorcee (we normally say divorsay not divorsee so it should be divorcée) and especially fiance (fiancé) are good examples (employee has changed pronunciation but technically should be employée).
Many of those French words in English, though, come from Norman French, which is very different to the modern language. French has also undergone a major vowel shift, as well as dropping the pronunciation of some final consonants, so the “English” pronunciation of certain “French” words is arguably more correct.
Your examples are not consistent across dialects: in my dialect and many related dialects (Northeastern Irish, Scottish and Northern English) “divorsay” is anglicised and pronounced “divorsee”.
That said, wealth, and proclivity to pretentiousness, not just dialect are also factors in the stylings of one’s speech, including which pronunciation to use.
It’s a fascinating subject.
I think a lot of the English peculiarities with spelling come from it adopting other vocabulary without enforcing a change to English orthography onto it, so we end up with a crazy mix of Germanic, French, Latin, Greek and other language spelling conventions. Also English speakers were perhaps more willing to coin Latin/Greek terms for new objects than other languages - contrast ‘television’ from Greek and Latin in English, with Fernseher (literally far-looker) in German.
With respect to dialectical marks, I’m not sure why English in general doesn’t use them (although some might argue we do if we spell words café or naïve), but they don’t seem to be ‘fashionable’ in related languages - Dutch doesn’t seem to use them either, and in German they feel somewhat optional - ä, ö and ü can be replaced with ae, oe and ue respectively (e.g. where the accented forms aren’t available) and ß can be replaced with ss. Indeed correct alphabetical sorting of German demands this.
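The sorting point can be sketched in a few lines of Python. This assumes the "phone book" style convention, where umlauts are expanded to vowel + e before comparing; other German sorting conventions treat ä like plain a instead. `phonebook_key` is a made-up name for illustration.

```python
import unicodedata

# Assumption: phone-book style ordering (umlaut -> vowel + e, ß -> ss).
EXPANSIONS = str.maketrans({
    "\u00e4": "ae",  # ä
    "\u00f6": "oe",  # ö
    "\u00fc": "ue",  # ü
    "\u00df": "ss",  # ß
})


def phonebook_key(word: str) -> str:
    """Sort key that compares a word as if its umlauts were written out."""
    composed = unicodedata.normalize("NFC", word)  # make sure ä is one code point
    return composed.lower().translate(EXPANSIONS)


words = ["Zebra", "Äpfel", "Apfel", "Affe"]
print(sorted(words))                     # code-point order pushes "Äpfel" to the end
print(sorted(words, key=phonebook_key))  # "Äpfel" sorts as "aepfel", ahead of "Affe"
```

A plain sort compares code points, so Ä lands after Z; the key function is what moves it back among the A words.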
> naïve
There are actually a good number of words that historically used a diaeresis in English, especially up until about a century ago: noël, reëlect, reënter, Chloë, Zoë, noöne, preëmpt, daïs, coöperate, coöpt, coördinate, zoölogy...
The idea was to ensure that the second vowel would be seen as its own syllable.
Dutch is not as stable as other languages over time. We've had a few spelling reforms and trying to read 17th century Dutch is actually not that easy even for native speakers. Old Dutch is more similar to German than modern Dutch.
I live in Germany and I always joke that Dutch is basically simplified German. The two languages are obviously related but we got a bit pragmatic about some of their more convoluted grammar choices and they gradually disappeared from the language. For example, we have male and female words but we only use two instead of three articles: de for male/female, het for the neutral form instead of die/der/das. Which is similar to the English the and it. Several articles still used in German (e.g. des and den) completely disappeared from the language but used to be common. A nice example where you still see this are place names like Den Haag or 's-Gravenhage, which is short for des Graven hage and refers to the same city (The Hague).
As for accents, those would be one of the things that got removed from the language or have become optional. E.g. you don't see a lot of ë used in modern Dutch anymore but that is a recent change (in the nineties). The use of computers has accelerated this because we use keyboards without support for this (English layout basically). In high school, I used things like WordPerfect and memorized some of the key codes for accents so I could type this letter. Oddly, an e with an umlaut is not a thing in German or in French.
Also as you mention we use letter combinations for things that have e.g. an umlaut in German. For example ö is written "eu" in Dutch. The ü becomes "uu" and we use oe for the German u. French accents used to be common but have not survived language reforms. We even have some 3 and 4 letter combinations that tend to confuse the hell out of foreigners. E.g. the Dutch translation of lion would be leeuw. Löwe in German would be close in pronunciation.
I lived in Sweden for a while. The Scandinavians did a similar modernization and simplification of their grammar last century. This actually makes it relatively easy to learn. It's a pretty regular language with a fairly straightforward grammar. Also, the grammars of Norwegian, Danish, and Swedish are basically the same. The Swedish use a few different letters e.g. ö in Swedish vs. ø in Norwegian and Danish. But otherwise the languages are very similar in written form. Icelandic is a bit weirder but basically a Scandinavian language as well.
> contrast ‘television’ from Greek and Latin in English
I recall reading somewhere that back in the day, there were complaints that this new word coinage "television" would never catch on specifically because it mixed Greek and Latin.
> dialectical marks
Of course dialectical Marx is something else entirely.
I wonder how personalized are autocompletions on Android and iPhone. How strongly could we infer, if at all, that someone had used the term dialectical before? Are autocompletions personalized enough that they might suggest dialectical over diacritic because someone had used (or received!) words like Hegel or capitalism? (Or maybe we could merely only infer someone's phone is using a European English word probability database as in European discourse dialectics might be even more common than diacritics. ;)
I don't have a definite answer on why some writing systems decide to use diacritics while others don't. I'm not sure there is a single reason for English.
But I think a bit of historical context would help. Many European languages derive from Latin, and most others were influenced by it and borrowed its alphabet. Latin didn't have diacritics, despite having pronunciation accents. E.g. "mania" could mean "mănia" (madness) or "mānia" (a kind of spirit). With time, many short vowels disappeared or changed, e.g. "orphănus" became "orphenin" then "orphelin" in French. I suppose that, for languages that are mostly an evolution of Latin, introducing diacritics was a way to mark words that were different from Latin words. It was even more useful because Latin was the written standard (a moving standard across centuries).
By the way, French and English are languages where you can't be sure of the pronunciation of a word you've never seen. Some studies have shown that other languages are much more efficient to read and write. For instance, Spanish readers barely slow down when reading a text that contains rare words, while English readers stumble. I remember stumbling when I first encountered "antienne" in my native language, or "recipe" and "gaol" in English.
> French and English are languages were you can't be sure of the pronunciation of a word you've never seen.
English yes of course, but French pronunciation is very regular and only a few loanwords don't follow the rules. "Antienne" is regular and pronounced the same way all other -tienne words are as far as I know (like, say, Étienne). The most difficult rules to master though are those that take into account the origin of a word (Latin, Greek or a Germanic language), especially for how to pronounce "ch", but these are essentially loanwords.
The reverse is not true though, there are many different potential ways to write a word that produce the same pronunciation.
I always found it weird how English seems to basically consist of only exceptions and yet it is so much easier to learn and master than French. At least from my experience as a non-native speaker of either. And I say that as a native speaker of a very very regular language.
I always felt like in French you never know how to write something. And seeing it written you sort of leave out the last three or four or maybe none of the letters when speaking but you never really know from just seeing it written. Depends on if you are speaking or singing too. An Edith Piaf will sing French like it's German and pronounce every letter. But you're not supposed to.
And don't get me started on Monsieur. I mean come on, it basically is "my sir" i.e. "mon sieur". But you say "miss" (speak this as an English word) "yeux" (speak this as the French word for "eyes") and if you tell this to a French person they look at you like you are trying to kill them.
And in the end it does all sort of make sense and they all are intertwined.
Like "have" and "habe" and "a". (English, German and French). If you take a French person trying to pronounce either the English or German word you basically are left with just the "a" if you try to write the word down from someone saying it. "h" is muet. And then you leave out the last few letters. French people have a hard time saying "have" or "habe" properly.
> The reverse is not true though, there are many different potential ways to write a word that produce the same pronunciation.
For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
French tries really hard to embed everything required for pronunciation in the spelling, but because of dialects and phonetic drift that leads to some pronunciations having multiple viable spellings.
German seems more concerned with the other direction: for each pronunciation (in high German) there is one obvious way to spell it (barring some baggage around c/k/ck and s/ss/ß, but even that has rules that can be applied to mostly clear it up). The other direction also mostly works, the spelling-pronunciation mapping is fairly obvious even if you don't know the word, but it is more ambiguous than French (no way to differentiate e è é or ê; and ë is mostly inferred).
English somehow seems to neither try to maintain a consistent spelling-pronunciation mapping nor a pronunciation-spelling mapping, nor a tradeoff between the two. At least unless you know the origin of the word and are aware of pertinent linguistic history of the last ~200 years, or develop a good heuristic understanding for those. Probably because of the amount of coordination required to maintain a decent mapping in either direction. Both France and Germany have influential central control on spelling.
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
Ukrainian is pretty close. It helps that it doesn't have a lot of reduced vowels that sound indistinct, and that double consonants are very unusual. The only ambiguous part is stress, technically stress marks exist, but they are used only in dictionaries.
In fact, most Slavic languages are _reasonably_ close to 1-1 mapping. Some are closer, some are further. For example, Russian has a lot of reduced vowels that you have to remember how to spell.
It's also that French and German have somewhat purer linguistic roots; English started as a West Germanic language, got admixtures of Old Norse, followed by massive injections of French and Latin (roughly a quarter of the modern vocabulary, each), and uncontrolled pronunciation shifts.
Written Old English was quite regular and phonetically sound. It also had a fair number of diacritics (after latinisation).
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
I understand Korean is rather good in that respect, but also that it has a relatively simple phonology and the Korean script is constructed: it was designed from scratch for Korean.
Interestingly, and not dissimilar to TFA's assertions, for a very long time it was only used by the common class and disdained by the aristocracy; 19th century nationalism and separation from China led to its revival.
It's a bit more complicated than that: https://brill.com/view/journals/ldc/6/1/article-p1_1.xml?lan...
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
As far as I know, Finnish is very very close to 100% phonetic: one letter, one phoneme, with some slight exceptions and some wiggle room when it comes to dialectal variation and loanwords, especially those with nonnative phonemes like /b/ or /g/. The velar nasal ŋ is the biggest exception, featuring in "nk" and "ng" (like in many other languages) and the rules aren't entirely straightforward. Another exception is that in spoken language a glottal stop or gemination can appear at some morpheme boundaries but isn't reflected in written text.
Haitian Creole is like this. There are a few phonemes that are represented by two letters, but the distinction is there both in writing and in speech.
The rules of pronunciation in French are... hard to describe. Maybe it's because French is a native language for me. I mean, how do you explain oeuf vs. oeufs? No, French has lots of particularities to it, just not as many as English. Spelling in French is definitely a lot more regular than in English, but the rules of spelling in French have lots of exceptions anyways (e.g., all words that start with 'af' have a double 'f' except [long list of words like afrique]).
I have always loved the Jean Shepherd line:
It's as if the French don’t even know how to spell their own language.
This is from an English perspective. I also want to add that English developed as part of a frontier colony.
Where I’m from, the f in plural œufs is silent. Also, note the correct spelling of œuf
> I mean, how do you explain oeuf vs. oeufs?
Just like 'hour' and 'our' in English I assume. They just happened to sound exactly alike for different(?) reasons.
EDIT: Probably similar reasons, actually? The "h" here participates in a vowel sound (effectively silencing it) and the 's' suffix is often (but not always!) silent in French?
Spelling is a tension between going purely for sounding things out (e.g IPA) and being able to find commonality with other pronunciations (dialects spoken by neighboring peoples), etc. It's going to get ad hoc because people.
"The New Yorker" magazine uses umlauts to mark the different pronunciations of doubled vowels. They write "cöoperate" to distinguish the way you pronounce its double o from the sound of the double o in "chicken coop".
It should always be the second vowel in English I think.
There are 3 accents that see use in native English:
* the diaeresis (äëïöüÿ) or trema (in French influence) indicates that a vowel should start its own syllable (that is, turning 1 syllable into 2, thus the name) and not be part of a diphthong. Nowadays usually a hyphen is used instead, or (especially in American English) the mark is removed entirely for words that have become common. Most commonly this is 'oö' or 'eë', but others exist e.g. "naïve". This is not to be confused with the German umlaut, which does something completely different and is not productive in English.
* the acute (áéíóúý) indicates an alternate pronunciation (or occasionally stress) of the vowel compared to the usual rules (especially if there's a homonym); there is no consistency in what that alternate is (since many English vowels have 3 or more pronunciations). Sometimes this can create a syllable where there was none, but normally it leaves the syllable count unchanged. The usual example is "résumé"; most other examples are for words derived from French where French had it, but it gets used for other languages that didn't have one too ("saké" and "Pokémon" from Japanese, "maté" via Spanish). I can't think of examples in "native" words (barring trademarks, but I don't think any of those have genericized), but if you go back far enough, aren't they all loanwords? Certainly many of the words that this applies to aren't very foreign anymore.
* the grave (àèìòùỳ) is used in poetry to force an extra syllable that would not be used in prose. By far this is most commonly for a past participle like "blessèd", but the rule is productive and can be used on any vowel. "Learnèd" is a weird example where the pronunciation changes even in prose (for the adjective sense), but the accent is usually only in poetry still.
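For what it's worth, the "removed entirely" outcome from the list above is easy to reproduce mechanically. A minimal sketch in Python using the standard `unicodedata` module; `strip_accents` is a made-up helper name, and the word list is just the examples from this thread.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Drop combining marks (diaeresis, acute, grave, ...) and keep the base letters."""
    decomposed = unicodedata.normalize("NFD", word)  # "ö" becomes "o" + combining mark
    kept = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return unicodedata.normalize("NFC", "".join(kept))


for word in ["coöperate", "naïve", "résumé", "blessèd"]:
    print(word, "->", strip_accents(word))
```

Note what happens to "résumé": it collapses into the verb "resume", which is the homonym case where the acute tends to survive in practice.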
IIRC Benjamin Franklin tried to make use of diacritics in his (U.S.) newspaper, and it didn't fly.
English could use a spelling simplification, but it's not going to happen -- it's too late and there's too much inertia in the current way of doing things.
Also, FYI, the circumflex in ê in French isn't there to change how e is pronounced but to denote that after the e there used to be an s that has been dropped, and that's how it always is for the circumflex accent in French. Sure, être is not pronounced the same as it would be were it written as etre, but I think that's accidental and not really the essence of the circumflex accent in French.
> English could use a spelling simplification, but it's not going to happen
Sure it is, continuously and gradually.
> Also, FYI, the circumflex in ê in French isn't there to change how e is pronounced but to denote that after the e there used to be an s that has been dropped, and that's how it always is for the circumflex accent in French. Sure, être is not pronounced the same as it would be were it written as etre, but I think that's accidental and not really the essence of the circumflex accent in French.
IIRC, and it's been a long time since I studied French or made use of anything but the most basic bits, for each vowel there is a consistent (in the general case, but there may be exceptions) pronunciation change associated with the now-elided “s”, so the circumflex serves both historical and phonetic purposes.
> Sure it is, continuously and gradually.
There has been no success to any attempts to regularize the spelling of English in... what, over a century now? Is the New Yorker still the only major publication insisting on using umlauts? :)
I'm not saying that English won't evolve, mind you, only that English spelling will evolve [a lot] more slowly than the rest of the language.
> IIRC, and its been a long time since I studied French or made use of anything but the the most basic bits, for each vowel there is a consistent (in the general case, but there may be exceptions) pronunciation change associated with the now-elided “s”, so the circumflex serves both historical and phonetic purposes.
Wikipedia says it alters the sound of a, e, and o. But I would pronounce château and chateau substantially the same. I would pronounce fantôme slightly differently from fantome. Être and etre, on the other hand, would have substantially different pronunciations.
The New Yorker’s eccentric diacritic is the diaeresis rather than the umlaut.
https://www.merriam-webster.com/grammar/mary-norris-diaeresi...
For those wondering, they are written differently in handwriting, but are typically displayed the same in fonts.
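Computers don't distinguish them either: Unicode gives the German umlaut and the English diaeresis the same code point, so the difference lives only in the language, not in the text. A small sketch in Python (the helper name `marked_letters` is made up for this example):

```python
import unicodedata

umlaut_word = "sch\u00f6n"         # German "schön": the ö is an umlaut
diaeresis_word = "co\u00f6perate"  # "coöperate": the ö is a diaeresis


def marked_letters(text: str) -> list:
    """Return (character, Unicode name) for every letter that has a decomposition."""
    return [(ch, unicodedata.name(ch)) for ch in text if unicodedata.decomposition(ch)]


print(marked_letters(umlaut_word))
print(marked_letters(diaeresis_word))
```

Both lines print the same entry, LATIN SMALL LETTER O WITH DIAERESIS, whichever of the two marks the writer had in mind.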
My personal pet peeve with the English language is the placement of symbols at the end of a sentence which change the inflection of the sentence? Oh that was a question, I guess you didn't know until you got to the very end of the sentence! Yes I was shouting that whole time, I guess you didn't know.
I find it oh so irritating when reading books and not knowing the inflection of the character's speech until the END of the sentence. I much prefer how other languages put a symbol at the start and end of the sentence. Grumble grumble.
Don't quote me on this, but I vaguely recall an explanation years ago, that the one-two punch of Norman French and the Great Vowel Shift arriving were enough for spelling and pronunciation to permanently decouple in English speakers' minds.
But people don't read letter by letter to recognize a word. The vast majority of readers are fluent and are seeing a word for the 1000th time. Optimizing for foreign learners instead of native adults seems wrong.
Words vary in pronunciation across time and across regions and dialects. I would assume those words "used to" (which doesn't sound like it has a 'd') sound like they were written. So what would you do, rewrite the dictionary every couple centuries for zero gain?
Maybe one of the reasons the pronunciation varies across time and regions is exactly because the pronunciation rules aren't really standardized, so people can get creative.
I wonder if the languages with strict pronunciation rules tend to change less. In my native language we tend to get new words of course, but if I read a text from 100 years ago I will be able to pronounce every word correctly, even if the word is now archaic and fallen out of use. I might get the accent wrong, and indeed accents do vary across regions, and sometimes even between neighboring towns.
That reason doesn't make sense, because pronunciation has always varied across time and most people were illiterate in the past. The written word is not the normative form of a language. Words aren't made of letters, and letters aren't made of sounds.
It's hard to know for certain, but a lot of it might be because diacritics would just make English even harder to read/write. By the invention of printing, English was a very confusing mess of the Germanic and Romance languages and there was no absolute agreement on pronunciation. Plus, we were in the middle of the Great Vowel Shift, so slapping diacritics on letters would have been a fool's errand since they wouldn't sound like that diacritic says they do. English's lax pronunciation, linguistic changes during the Middle Ages, and strong Romance-language influence would make it even more confusing to create solid rules on diacritic usage, especially as many people would still be illiterate, and English was an "easier" language than the very complex Latin.
If you think of English spelling as highly conservative (it is) and only casually connected to pronunciation (an overstatement, but true) the spelling isn't so bad. The conservatism preserves meaning which is lost in languages like German where spelling is periodically "reformed", severing spelling from a word's historical semantics.
You could just as well consider words like "debt", "lead" or "Wycombe" to be kanjis, and nobody complains about their disconnect from pronunciation.
> You could just as well consider words like "debt", "lead" or "Wycombe" to be kanjis, and nobody complains about their disconnect from pronunciation.
Back in the day, I spent a couple of years teaching in a high school, which exposed me to a fairly large group of Australians of Japanese descent who were learning Japanese as a second language. They did complain about it, and quite fiercely, especially the ones learning it due to family pressure, and not innate interest.
If most people don't complain, I guess it's because a.) they have no exposure to kanji in the first place, b.) they grew up with it, or c.) because their environment regards it as culturally insensitive to complain about foreign writing systems. If none of these apply, you do seem to get plenty of complaints.
None of these factors hold for English, so it makes sense that people complain about the spelling.
English's lack of them greatly assisted alphabets like RADIX-50 for earlier computers. Typewriters, too.
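For anyone who hasn't met RADIX-50: the trick is that a 40-character alphabet lets you pack three characters into one 16-bit word, since 40 × 40 × 40 = 64000 fits under 65536. There is no room in such a table for accented letters. A rough sketch in Python, assuming the PDP-11 ordering of the character table (other DEC machines ordered it differently); `pack3` and `unpack3` are made-up names.

```python
# Assumption: PDP-11 style table. 40 characters = 50 in octal, hence the name.
RADIX50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"
assert len(RADIX50_CHARS) == 40


def pack3(text: str) -> int:
    """Pack up to three characters into one integer below 40**3 (fits in 16 bits)."""
    padded = text.upper().ljust(3)[:3]  # pad short input with spaces
    value = 0
    for ch in padded:
        # .index raises ValueError for anything outside the table, e.g. "É"
        value = value * 40 + RADIX50_CHARS.index(ch)
    return value


def unpack3(word: int) -> str:
    """Inverse of pack3: recover the three characters from the packed value."""
    chars = []
    for _ in range(3):
        word, index = divmod(word, 40)
        chars.append(RADIX50_CHARS[index])
    return "".join(reversed(chars))


print(pack3("ABC"), unpack3(pack3("ABC")))
```

With plain 8-bit characters the same 16 bits would hold only two letters, which is why a small unaccented alphabet was worth having on those machines.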
Apparently this was one of the major barriers to achieving widespread adoption of computers in China. The sheer complexity of the Chinese written language meant that it was much harder to display Chinese characters on an old 80s monitor. (I assume that Japanese, Korean, etc had the same problem, but the article I remember reading was about Chinese.)
I read once that the Japanese invented the fax machine as a workaround for difficulties transmitting the character set.
It would be nice if word processors had a feature to translate documents to International Phonetic Alphabet. Your second paragraph would be:
səʊ waɪ dʌznt ˈɪŋɡlɪʃ juːz ðɛm? ɪts klɪə ðə ˈlæŋɡwɪʤ niːdz ðɛm wɪð wɜːdz sʌʧ æz θruː, ˈθʌrə ænd θɔːt wɪʧ ɑː kənˈfjuːzɪŋ ɪˈnʌf fɔː ˈneɪtɪv ˈspiːkəz lɛt əˈləʊn ðəʊz ˈlɜːnɪŋ ˈɪŋɡlɪʃ æz ə ˈsɛkənd ˈlæŋɡwɪʤ. ænd haʊ əˈbaʊt ˌɪnkənˈsɪstᵊnsiz sʌʧ æz ðə vɜːb tuː liːd ænd ðə ˈmɛtᵊl Pb liːd, ɔːr ˈʌðə streɪnʤ 'gh' wɜːdz sʌʧ æz kɒf?
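A naive version of such a feature is just a dictionary lookup. The sketch below is in Python; the tiny `LEXICON` is hypothetical and simply reuses a few transcriptions from the passage above (so it encodes that one accent), and `to_ipa` is a made-up name. Words it doesn't know are left as they are.

```python
import re

# Hypothetical mini-lexicon; transcriptions copied from the passage above.
LEXICON = {
    "so": "səʊ",
    "why": "waɪ",
    "through": "θruː",
    "thorough": "ˈθʌrə",
    "thought": "θɔːt",
    "cough": "kɒf",
}


def to_ipa(text: str) -> str:
    """Replace each known word with its transcription; leave everything else alone."""
    def lookup(match):
        word = match.group(0)
        return LEXICON.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", lookup, text)


print(to_ipa("through, thorough and thought"))
```

A word-by-word lookup cannot choose between the verb "lead" and the metal "lead", since that needs context, and every accent would need its own lexicon.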
English has so many accents that spelling would still have to be memorized to be used as a universal writing system.
Non-rhotic accent, eh? Estuary English?
"ˈθʌrə" was interesting to me, I'd render "thorough" in my idiolect (Southern-inflected General American) "θʌrow". Is your pronunciation "standard" in British English (or whatever the prestige variant where you're from is)?
Also, I think you meant "ðə ˈmɛtᵊl Pb lɛd"
I think I can speak for many many people who casually read Wikipedia articles by asking: how does one go about practically learning the IPA, so as to be able to read (not even write) something like that? Every time I come across a Wikipedia article that uses IPA to explain pronunciation, it's wholly inscrutable gibberish, completely useless if there isn't a listen-to-someone-saying-it button.
In my case, we learnt it in high school. Our teachers insisted on it being very useful to look up the pronunciation of unknown words, and we even had exams where we had to transcribe from IPA to the standard spelling.
This was in Spain though, where the national language has a very consistent spelling, and English is taught as a second language. I don't know how it is in other countries.
Just like you learn anything else. Flash cards, pure repeated exposure, reading about the symbols long enough that you have a framework to fit them into, etc.
Usually those pronunciations are also links to the IPA helper page. I've used it to figure out pronunciations in the past.
German is frustrating for its underuse of diacritics - e.g. tetragraph "tsch" representing a single sound, similar for "sch". In Czech, these are č and š respectively. Polish has similar issues, resulting in the infamous "spilled letters" orthography.
A little nitpick: there are 5 "forms" for 'e' in French. You forgot ë, which is used in words like Noël and België.
> For instance, in French the letter 'e' can take four forms—without accent, grave è, acute/aigu é, circumflex ê. This makes the letter much more flexible and greatly assists with pronunciation.
The circumflex is not related to pronunciation; it tells you that in the historical form of the word an S followed the vowel. The other three are just three different French vowels. It's an odd conflation of concerns.
> Similarly, proper nouns such as Wycombe, Warwick, etc. defy logic when it comes to pronunciation and would greatly benefit from diacritical marks.
Same problem; diacritical marks aren't what you want there. What's happening is that the spelling has changed more slowly than the pronunciation. You don't want to disambiguate the pronunciation of Warwick (between what and what?), what you want is to realize that the spelling isn't trying to tell you how it's pronounced. It's telling you how it used to be pronounced. (Though, in that case, it's not at all difficult to predict the modern pronunciation.)
Sometimes there are circumflexed vowels that change pronunciation.
My assumption is that these vowels were accented, but you could only have one diacritic.
Economy of printing press blocks, and diacritics wouldn't be enough to cover the wide variety of pronunciations, which is inconsistent in English, as highlighted by Wycombe and Warwick. Diacritics wouldn't save those names.
My theory is that one needs a good reason to cross the ocean. Life has to be better on the other side. It is a selection process. You get people whose hands and feet tend to follow their thoughts as opposed to writing them down.
European alphabets descend from the short-lived alphabet invented in Ugarit. It was spread by the Phoenicians to Mycenaean Greece and Carthage and then later to the Latins who would found Rome. It should be noted the Minoans already had an older writing system at that time, now referred to as Linear A, but it was syllabic, not alphabetic.
Correct, the Phoenicians gave it to the Greeks who used vowel letters for the first time... though the Romans mostly inherited theirs from the Etruscans (who got it from the Phoenicians too). That led to some oddities when the Romans conquered the Greeks and started merging the diverged alphabets.
Nevertheless... some letters like aleph/alpha/A survive with nearly identical orthography. The Phoenicians wrote it sideways compared to Latin - IIRC it was the horns of a bull and represented the first sound of the Phoenician word for cattle.
I find it quite interesting that a writing system having that higher level of abstraction was invented so early. Most early systems were pictograph/ideograph or syllabic AFAIK. Taking the leap to breaking words into syllables then syllables into sounds reduces the orthography to a compact set of symbols with relatively simple rules that is still capable of expressing an entire language.
Alphabetic scripts are arguably not better than syllabic or ideographic scripts. They take less time to learn and are better able to capture the sounds of foreign words, but they are less efficient to mentally process. An already literate scribe working on one language only gets no advantage from an alphabet.
My own theory (as an amateur interested in both linguistics and the history of this time period) is that the late Bronze Age collapse resulted in the decline of Akkadian and the accompanying cuneiform writing system as a diplomatic language, meaning that (1) records started being kept in the local language only, and (2) what trade and diplomacy did happen had to be multilingual. In this context, the alphabet does win out. And this is precisely when we saw the alphabet spread across the ancient Near East and Mediterranean.
The article draws a distinction between "learned scribes" and "creative people at the margins", forgetting a third possibility: learned scribes working not for the religious or secular bureaucracies, but for the merchants.
A parallelism: programmers have different styles working for an established corporation and for a startup. The bureaucracy tends to stand for the old practices, resisting change. The dynamic environment favours starting from scratch and simplicity.
>>Remarkably, two recent discoveries from around 1500 BCE do show scribes using the alphabet. But these exceptions prove the rule, because these scribes used alphabetic writing just as sloppily or playfully as its other users did. In an obscure ostrakon from Thebes and a handful of looted cuneiform tablets we find surprising confirmation that even professional writers used it unprofessionally.
Perhaps they were early physicians?
It's interesting how people in this space study twins, and I wonder sometimes if that isn't more on the nose than we give it credit for.
Twins likely created the first languages - who does the first person genetically capable of speech talk to, except someone genetically and environmentally identical? Likely had the first verbal families, and then verbal tribes. Written secret codes might have started the same way, and ended up either being tribal or trade secrets. Success leads to imitation. Partial success leads to theft, or acquisition.
If you're interested in fascinating deep dives into the history of a few odd letters, the jan Misali channel on Youtube has a video on the letter W (which, along the way, covers F and Y) and another one on the letter C.
w: https://www.youtube.com/watch?v=sg2j7mZ9-2Y
c: https://www.youtube.com/watch?v=chpT0TzietQ
Adding to other resources shared here, archaeologist Denise Schmandt-Besserat has written about the evolution of writing (not strictly the alphabet), and much is available online: https://sites.utexas.edu/dsb/tokens/the-evolution-of-writing... The roots of writing seem to be in counting/tallying marks, i.e. accounting. Another great book, "Against the Grain" by James Scott, describes how both tallying and writing developed hand-in-hand with the state.
Another semi-related thing I've thought about is how the invention of words comes about. I know most words today have an origin that can be traced back through text over hundreds of years, but what about the time way, way back, in the Indo-European era? Was there anything to trace back to? Was it just one or a few people who realized there was no grunt sound that meant "cold" or "elbow", so decided one day that that was the grunt sound they were gonna use, and it spread naturally?
There's a theory that the original words arose via a mechanism essentially analogous to onomatopoeia. But unlike literal onomatopoeia as in "cuckoo", it's a lot harder (and required a genius according to some philosophers like Otto Weininger [1]) to come up with just the sound that when uttered in the presence of other cavemen will correctly evoke such abstract ideas as bigness, smallness, heaviness, lightness, etc.
The idea that there's something about (say) the sound "big" that makes it especially suitable for expressing the idea of bigness seems pretty plausible to me. I tried this experiment on my friend. I spoke out loud the Chinese words for big and small in a random order and asked him to guess which means big and which means small. He immediately guessed correctly. I'm pretty sure that if you can find a Chinese person who's never learnt English and try a similar experiment on them with (the sounds of the English words) "big" and "small", the result will be similar.
[1] "Cannot the whole of human history (naturally in the sense of the history of the mind and not, for example, the history of wars) best be understood through the appearance of a genius, the inspirations emanating from him, and the imitation of what a genius has done by more pithecoid creatures? Take house-building, agriculture, and above all language! Every word was first created by one individual, by an individual above the average, and the same is still the case today (with the sole exception of the names for new technical inventions, which must be ignored in this context). How else should it have been created? The primal words were “onomatopoeic” and they incorporated without the will of the speaker, through the sheer intensity of the specific excitement, something similar to the cause of the excitement, while all the other words were originally tropes, as it were, second-order onomatopoeias, metaphors, similes: all prose was once poetry. Thus most geniuses have remained unknown."
In Gaelic (the Scottish language) beag (pronounced pretty much as "big" but with a slightly longer I) means small and mòr means big.
I agree though that it’s interesting how people can guess.
With your Chinese experiment it would only be accurate if it were double blind, i.e. the speaker themself did not know the meaning of the words. There is a lot more communication going on with speech than just the words themselves.
Regarding word generation, I grew up in poor urban areas where slang was a huge part of language, and new words were invented and spread almost daily. The sound of a word definitely played a part, but it was mostly about context - when it was used, the tone of the speech, the pitch, what older words it might have been derived from, etc. Like math operations and concepts, there might have only needed to be a few words at the start to build a foundation, a seed from which language growth evolved.
... it is clear that a theory of the alphabet as a casual and playful mode of knowledge explained all of our evidence when I first tackled this back in 2004, and still (encouragingly) explains all of the new evidence discovered in the 20 years since. What we lack is a theory of play as a mode of creativity and knowledge production in ancient writing, which I suggest as a new frontier for research on the early history of writing.
That's the most interesting part of the essay, to me (in an otherwise interesting essay).
It's an interesting theory for innovation, reborn. How much creativity comes from play? How many get their start in games - within games, doing things within the gaming world, either their play-fort or an online world or their imagination about their book or their role-playing game.
Remember when SV companies would encourage play, with rooms designed for it? (Do they still?)
Who Put the Alphabet in Alphabetical Order?
https://www.youtube.com/watch?v=VRDY30tiD98
A song for children by They Might Be Giants
The short answer: the equivalent of today's text slang (esp for languages that use roman letters when using the phone).
It's a nice idea.
The Greeks. They're the first known group to explicitly represent vowels, unlike the older Egyptian derived systems which only represented consonants and thus were abjads rather than alphabets.
The Greeks made the most significant improvement to the writing system since some Semitic people simplified the Egyptian writing system by eliminating all the multi-consonant signs. Still, any discussion of the Greek alphabet cannot omit the fact that they did not invent the alphabet; they only improved the Phoenician alphabet, by reusing the signs for consonants not used in Greek to write the Greek vowels.
This was a huge advance, but it cannot be named as "inventing the alphabet".
It depends what you mean by alphabet. In the narrow sense (consonants AND vowels) Greek was the first language to have one - Phoenician had an abjad (consonants only), probably because most words had 3 consonant roots, with vowels varying with their grammatical role.
To elaborate on this a bit more: When discussing the history of writing systems, the term "alphabet" can be used in two different senses. In a broader sense the term refers to the set of symbols, typically in a specific order, that are based on representing phonetic elements with signs (in contrast to logographic systems). In a narrower sense, the term refers to the Greek alphabet and its derivatives, to distinguish them from their syllabic or consonant-only predecessors.
It should be noted in this context that Semitic languages can be written with comparatively little ambiguity using a consonant-only set of symbols, as the correct vowels can be inferred with relatively good accuracy from the consonants and the context. For Indo-European languages, such as Greek, this is not the case. However, even for Semitic languages a consonant-only writing system was far from ideal. Thus, various strategies have been developed to reduce ambiguity, such as the use of plene scriptum[1] or vowel pointing[2] in Hebrew.
The Japanese. They're the first known group to explicitly represent Emoji, unlike the older Latin derived systems which could only represent emotion through character combinations and thus were lame rather than complete.
Not sure if /s, but I recently read about emoji history. It apparently originated from pagers and then "graduated" to phones.
https://one-from-nippon.ghost.io/story-of-the-emoji/
Doesn't talk about kaomoji though, which I think are supercool. Like this table flip: (╯°□°)╯︵ ┻━┻
The Rosetta Stone is from c. 200 BCE, issued by the Ptolemies, who ruled Egypt after Alexander had conquered it. The first examples of writing in an ancient West Semitic ancestor of our alphabet are from c. 1500 BCE, and their writers used Egyptian hieroglyphs as a model. By the time the Rosetta Stone was carved, people had been writing in descendants of that alphabet for more than a thousand years: the Greek script on the stone derived from Phoenician, which evolved from West Semitic, while the hieroglyphs on the same stone were the model for West Semitic writing a thousand years before.
I was about to write that I was under the impression it came from ancient Egypt, via the Near East and into Greece.
OP’s author is a prof. of religious studies and has a book on Hebrews so possibly his point of view requires the preeminence of Hebrew alphabet.
I also found some of the reasoning questionable. The reason Latin teaching was the job of Greek slaves was precisely because they were Greeks and the Roman nouveau riche were adorning the education of their children. Who teaches the children of the elite today? Millionaires or smart poor people? The second questionable idea of his is that “sex” and stuff like that are not of interest to the “elite”. This confused thinking, which disregards content in favor of medium, was also a rather weak argument. Maybe the elite were using writing as a private, very exclusive chat app and sending textual selfies.
Hebrew must be first if you are a religious person who believes in God speaking Hebrew letters and creating the world. It just doesn’t work if it turns out the Egyptians created the alphabet.
oh definitely - that must explain why I saw a guy with a white shawl and a black box filled with written prayer, tied to his forehead today.. counting grains, no doubt!
to be very clear - the ways of sacred writing are very old, and not the same as counting grains.
If you're trying to make a point about sacred writings being the first texts, you may want to consider Linear B and cuneiform, some of the oldest texts of the Mediterranean and which are almost exclusively inventory lists. While we have things like the epic of Gilgamesh preserved in baked tablets, this is the exception to the rule. For the vast majority of these most ancient texts, tabulation was the main use of writing: how many animals were sacrificed, how many sheaves of wheat were in storage, how much fruit a plot of land could produce, etc. As for sacred writings: Many religions were hesitant to commit their wisdom to writing - one reason why so much of Greco-Roman religion is unknown to us. The Oral Torah was supposedly passed on for centuries until the destruction of the temple and fragmentation of the Jews necessitated the writing down of this knowledge. Heck, Homeric poetry (the hymns as well as the epics) was not written down until centuries of oral development had gone on; not because writing had not been invented, but because it was not used for literary material.
Something important to remember is that the transient documents in the Ancient Mediterranean and Mesopotamian days were scratched into clay (a medium which allows it to be easily erased and adjusted if necessary, and is plentifully available in quantity). One consequence is that if you have these things in a storage building that catches on fire, the clay is baked into pottery and essentially permanently preserved for archaeologists to uncover. Texts written on organic parchment or papyrus are far less durable, as they tend to decompose unless properly stored.
This means we probably have an exaggerated abundance of economic documents due to survivorship bias of the things they wrote economic data on.
Please don't take HN threads into religious flamewar hell. That's the last thing we need here.
We detached this subthread from https://news.ycombinator.com/item?id=37708558.
Edit: we've had to warn you about this specifically once before, as well as several other past warnings about breaking the site guidelines. Would you please review https://news.ycombinator.com/newsguidelines.html and stick to the rules from now on? We have to ban accounts that won't, and I don't want to ban you.
what ? this is not flamewar? I am being completely misunderstood here.. I am not guilty .. dang - honestly, I meant to be fully supportive of prayer and I am deeply wronged in this sequence.. I regularly support religious topics if you read my writing
note: I will re-read the guidelines in an abundance of caution, but I repeat.. I am being misunderstood deeply .. this is not at all meant as some kind of problem thing to say
I'm sorry. I obviously misread you.
Unfortunately, the comment is still a flamewar starter even if you didn't intend it that way, because it didn't make your intent clear enough. I wasn't the only person who took it the wrong way: https://news.ycombinator.com/item?id=37709050. If we had left it in its original position, there would likely have been others.
The burden is on the commenter to disambiguate intent in such cases (I was just writing about this elsewhere - perhaps it will help explain: https://news.ycombinator.com/item?id=37709303). And as the site guidelines say, "Comments should get more thoughtful and substantive, not less, as a topic gets more divisive."
odd thing to be snarky about. organized religious practices coevolved with agricultural societies. I don't think this is that scandalous?
[flagged]
If you're unaware of what "snarky" means, I suggest consulting https://en.wiktionary.org/wiki/snarky.
Came here looking for “Sergey and Larry”. Guess I’m the only one feeling a little silly on a Friday afternoon.
I know it is not a popular opinion, but I think Hebrew came before Phoenician. As far as I can tell, the data could point in either direction.
That seems completely impossible, regardless of the ages of any surviving inscriptions.
The reason is that the initial North-West Semitic alphabet had 27 consonants, whose order is known from the Ugaritic alphabet derived from it.
The Phoenicians merged 5 pairs of consonants (KHA with HOTA, SHIN with THANNA, DHAL with ZETA, ZU with SADE and AIN with GHAIN) and kept only one letter from each pair, the result being a simplified alphabet with only 22 consonants.
There is no doubt that all the other later North-West Semitic alphabets were derived from the Phoenician alphabet and not from any earlier Semitic alphabet, because all of them started with only the restricted set of 22 letters, even if their languages had more than 22 consonants, so the Phoenician letters were too few for writing all the sounds of those languages.
Because of this mismatch between the Phoenician alphabet and the sound inventory of the languages, Hebrew, Aramaic and Arabic were initially forced to use a single letter for multiple sounds. This was corrected later by inventing various diacritic signs to distinguish the multiple readings of a letter, as in the Hebrew SHIN and SIN (which are distinguished by adding a dot to the letter, in different positions).
If the Hebrew alphabet had been older than the Phoenician, it would have included more than 22 letters, e.g. by having distinct letters for SHIN and SIN (whose pronunciations were different from the modern pronunciations, which have merged SIN with SAMEKH).
I am struggling to find information on this initial northwest Semitic alphabet that you mention.
Your logic is sound, but I’m just not finding any info that backs up what you’re saying.
my first google result was a wikipedia article https://en.wikipedia.org/wiki/Proto-Sinaitic_script that led to: https://www.unicode.org/L2/L2019/19299-revisiting-proto-sina... (see pp8-9 for letter inventory) fwiw
The article discussed here shows examples from older versions of the Semitic alphabet, many hundreds of years before the appearance of the Ugaritic, Phoenician or Hebrew alphabets.
The reconstructed Proto-Semitic language had 29 consonants, so it is likely that the oldest Semitic alphabet also had 29 letters.
However, this cannot be known for sure, because the very few preserved inscriptions do not contain all the signs of the alphabet. Ugaritic proves that there were at least 27 letters.
At some point in time, the Semitic alphabet split into two variants, a North-West variant and a South-West variant, the latter being used for writing various South-Arabic languages.
While the Northern and the Southern variants diverged in their graphic forms, the most significant difference is that they have completely different orders of the letters in the alphabet. The reason for the two orders is unknown. Perhaps they used some mnemonic technique, like reciting a poem for remembering all the letters, and the North and the South chose different poems.
The North-West Semitic alphabet is the one having the order alpha-beta-gamma ..., which has been inherited by many later Semitic alphabets and by the Greek, Latin and Cyrillic alphabets, in all their many variants, including the English alphabet.
The oldest Semitic alphabet for which all the letters are known, together with their alphabetic order, is the Ugaritic alphabet. In Ugaritic, two pairs of Proto-Semitic consonants have merged, so it has only 27 consonants of the original 29. Moreover, Ugaritic does not provide any information about the graphic forms of the older Semitic alphabets, because in it all the letter glyphs have been replaced with forms that can be written on cuneiform tablets.
Even so, the Ugaritic alphabet remains the most complete source of information about the Semitic alphabets that have preceded the Phoenician alphabet.
You can see the 27 letters of the Ugaritic alphabet in Unicode, from "U+10380;UGARITIC LETTER ALPA" to "U+1039A;UGARITIC LETTER TO" (besides these 27 letters inherited from the older North-West Semitic alphabet, Ugaritic created 3 additional special-purpose letters, appended at the end of the alphabet).
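For anyone who wants to check that range, here is a minimal Python sketch. It assumes a Python 3 build whose `unicodedata` tables include the Ugaritic block; the variable names are mine, and the second range (U+1039B to U+1039D) is my assumption about where the 3 appended letters sit.

```python
import unicodedata

# Code points quoted above: the 27 letters inherited from the older alphabet.
FIRST, LAST = 0x10380, 0x1039A

inherited = [chr(cp) for cp in range(FIRST, LAST + 1)]
inherited_names = [unicodedata.name(ch) for ch in inherited]

# Assumption: the 3 appended letters directly follow the inherited ones.
appended = [chr(cp) for cp in range(LAST + 1, LAST + 4)]
appended_names = [unicodedata.name(ch) for ch in appended]

print(len(inherited))       # count of inherited letters
print(inherited_names[0])   # first letter in alphabetic order
print(inherited_names[-1])  # last inherited letter

# List every letter in alphabetic order with its code point.
for ch, name in zip(inherited + appended, inherited_names + appended_names):
    print(f"U+{ord(ch):04X} {name}")
```

The first three lines printed should be 27, UGARITIC LETTER ALPA and UGARITIC LETTER TO, matching the names quoted above.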
All this information can be found in the literature about the older Semitic languages from the second millennium BC, including Ugaritic, and about Proto-Semitic and comparative Afro-Asiatic linguistics.
There is abundant data demonstrating that Hebrew, Aramaic and Arabic had more than 22 consonants at the time when they adopted the Phoenician alphabet, which, with only 22 consonants, was inadequate for them. Arabic has retained 28 consonants until today, so, like Hebrew, it has multiplied the original 22 letters by combining them with diacritic signs.
If any of these languages had adopted the older alphabet that was the source of the Ugaritic alphabet, instead of adopting the simplified Phoenician alphabet, they would have had distinct letters for their consonants from the beginning, with no need to invent new diacritic signs later.
Hebrew SIN was a lateral fricative, which is a sound that did not exist in Phoenician. When the Hebrews adopted the Phoenician alphabet, they did not have any letter for writing SIN, so they were forced to write it with the letter SHIN, which was somewhat close in pronunciation. At that time SAMEKH was pronounced in a different way, so it would have been a worse choice.
If the Hebrews had invented an alphabet of their own, or if they had adopted another Semitic alphabet variant rather than the Phoenician alphabet, they would not have needed to use a single letter for multiple sounds. This was clearly not a satisfactory solution, because later they invented the SHIN and SIN dots to disambiguate the letter with multiple readings.
Given that neither of them “came first” when it comes to inventing the alphabet, what does it matter whether Hebrew or Phoenician preceded the other?
An authoritative source on this is Inventing the Alphabet by Johanna Drucker. She covers not only the modern evidence but also attempts to classify alphabets throughout history, with particular focus on the Middle Ages. The first half is a bit dry -- how much do we really care what various scholars in the 16th Century made up about the history of the alphabet? (a lot was made up) -- but the second half looks at the modern archaeological contribution to the study of alphabetic origins and is very interesting.
There are also lots of scans of really interesting Medieval manuscripts cataloging alphabets in the book.
Thanks!
https://press.uchicago.edu/ucp/books/book/chicago/I/bo141943...
From the book's "Chapter 7. Modern Archaeology -- Putting the Evidence of the Alphabet in Place"
"We can now describe the origin of the alphabet chronologically and geographically with some degree of reliability. The basic outlines are these: The alphabet was formed in the context of cultural exchanges between Semitic-speaking people from the Levant and communities in Egypt after or around 1800 BCE. The earliest evidence is dated to Wadi el-Hol, a site in Egypt just west of the Nile, north of Luxor. Later inscriptions in the Sinai and throughout the Fertile Crescent show the gradual distribution and evolution of alphabetic writing, with most evidence dating from the fourteenth century BCE and after."
Regarding "how much do we really care what various scholars in the 16th Century made up about the history of the alphabet?" we should, if we're interested in knowing the context in which various claims, which pop up even today, appeared for the first time. The context often explains the motivation which resulted in specific "made up" narrations, some still (often unfortunately) influencing our lives.