Tagged: google ngram

  • Sandeep Prasanna

    Sandeep Prasanna 2:10 am on March 1, 2012
    Tags: american english, ap stylebook, differences between british and american english, google ngram, style guide, toward, towards

    Toward(s?) a better understanding 

    Hi all, sorry about the delay in getting new posts out to you. Let’s get to it:

    There are many well-documented differences between British and American English. Even those unacquainted with linguistics can point out some of the more obvious ones: color/colour, apartment/flat, spilled/spilt, and plenty more. Lynne Murphy, an American linguist abroad in the UK, maintains the wonderful blog Separated by a Common Language and writes about how language differs across the pond.

    But some American-versus-British rules are less readily apparent. For example, for years, I struggled with whether to write “toward” or “towards.” A few years ago, Grammar Girl taught me that the rule was simple: “toward” is used in the US and “towards” is used in the UK.

    The British newspaper The Guardian writes in its Style Guide:

    -ward, wards. Contemporary usage … suggests that when it is an adjective a word like upward, downward, backward or forward should not end in s, but when it is an adverb it should.

    I checked The Economist’s Style Guide and found that it was silent on the issue, but it did write “forward” rather than “forwards” twice within the Style Guide itself. The Economist is published out of London and two-thirds of its journalists are based there, so I wonder whether there is or isn’t internal consistency on the use of the -ward(s) suffix.

    According to a commenter on the Grammar Girl website, “toward” is correct AP style. (The AP Stylebook doesn’t have free access, so I can’t confirm.)

    I wondered why we had that difference and whether it had always been that way. So I checked out the Google Ngram data for both American and British corpora. The data ended up raising more questions than it answered, so I’m hoping for more well-informed readers to suggest explanations for the patterns below.

    Here is the frequency of “toward” versus “towards” in British English from 1800 to 2000.

    It’s clear that “towards” has always been favored over “toward” in Britain during this period. There does seem to be a slight shift after 1980, with “toward” gaining some ground on “towards.”

    Here is the American data from the same period, which is more interesting:

    It appears that “toward” supplanted “towards” as the preferred spelling around 1900. The data show a steady decline in the frequency of “towards” starting around 1840. This trend is strange: why did the spelling preference change at all?

    First, a little background: the Oxford English Dictionary regards “toward” and “towards” as variants of the same word. Their etymology is closely related. Similarly, the OED considers other -ward(s) words as variants of each other as well: e.g., forward(s), backward(s), onward(s). It also notes that while there is no difference in definition between -ward and -wards, there may be a slight semantic difference that ascribes more of a sense of “movement” to -wards. This slight difference is disputed, even by the OED authors.

    The OED says:

    In English the history of -wards as an [adverbial] suffix is identical with that of -ward … ; beside every adv. in -ward there has always existed (at least potentially) a parallel formation in -wards, and vice versa. The two forms are so nearly synonymous … that the choice between them is mostly determined by some notion of euphony in the particular context; some persons, apparently, have a fixed preference for the one or the other form.

    It then goes on to observe the preference of Americans for -ward and Brits for -wards.

    Two possible explanations for the American switch from “towards” to “toward” popped into my head at first.

    The first was that Noah Webster’s dictionary, which set out determinedly American spellings for the nascent United States, expressed a preference for “toward.” His dictionary was first published in 1828. I couldn’t find a reliable online source for his original text, so maybe a reader with access to the text can clarify whether this is true. I’m still skeptical whether this is what drove the change. More famous changes like “colour” to “color” happened quicker, according to Google Ngram.

    Another possibility depends on the OED’s observation that “the choice between [toward and towards] is mostly determined by some notion of euphony.”

    According to The Cambridge History of the English Language: English in North America, rhotic accents (accents that pronounce the R in, e.g., “father”) became prestigious in the United States around the 1870s. It may have simply been more euphonic (more pleasing to the ear) for rhotic speakers to pronounce “toward” rather than “towards” — the former has just two consonants in a cluster, whereas the latter would have a three-consonant cluster, making it more difficult to pronounce. This, too, seems tenuous, because written language changes slower than spoken language and Google Ngram depends on data culled from written texts.

    I can’t seem to think of any other explanations, but I encourage readers to share their thoughts below.

     
  • The Diacritics

    The Diacritics 6:00 am on November 10, 2011
    Tags: bushel, customary system, english system, gallons, google ngram, imperial system, kilometers, leagues, liters, metric system, miles

    Stop! Don’t move a centimeter! 

    (Posted by Sandeep)

    “You know I’d walk 1609.3 kilometers if I could just see you tonight.” – Vanessa Carlton’s famous ballad A Thousand Six Hundred and Nine and Three Tenths Kilometers

    A few reasons why we need to keep the customary system of measurement in America:

    • That’s a nice 37.9-liter hat!
    • It hit me like 907.2 kilograms of bricks.
    • He’s buried 1.83 meters under.
    • Give a man 25.4 millimeters, and he’ll take 1.61 kilometers.
    • I’ve got 907.2 kilograms of work to do tonight.
    • He didn’t feel 28.3 grams of regret for his actions.
    • He went the whole 8.23 meters.
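    For anyone who wants to check my arithmetic, here is a minimal sketch of the conversions behind the list above. The idiom labels are my own shorthand, and I am assuming US units (a US gallon, a short ton of 2,000 pounds); the function name metric_idioms is just an illustrative choice.

```python
# Standard conversion factors. Assumption: US customary units
# (US liquid gallon, short ton of 2,000 lb), not British imperial ones.
LITERS_PER_US_GALLON = 3.785411784
KILOGRAMS_PER_POUND = 0.45359237
METERS_PER_FOOT = 0.3048
MILLIMETERS_PER_INCH = 25.4
KILOMETERS_PER_MILE = 1.609344
GRAMS_PER_OUNCE = 28.349523125
METERS_PER_YARD = 0.9144


def metric_idioms():
    """Convert the quantity in each idiom to metric, rounded as in the post."""
    return {
        "ten-gallon hat (liters)": round(10 * LITERS_PER_US_GALLON, 1),
        "ton of bricks (kilograms)": round(2000 * KILOGRAMS_PER_POUND, 1),
        "six feet under (meters)": round(6 * METERS_PER_FOOT, 2),
        "give an inch (millimeters)": round(1 * MILLIMETERS_PER_INCH, 1),
        "take a mile (kilometers)": round(1 * KILOMETERS_PER_MILE, 2),
        "ounce of regret (grams)": round(1 * GRAMS_PER_OUNCE, 1),
        "whole nine yards (meters)": round(9 * METERS_PER_YARD, 2),
        "a thousand miles (kilometers)": round(1000 * KILOMETERS_PER_MILE, 1),
    }


if __name__ == "__main__":
    # Print one converted idiom per line.
    for idiom, value in metric_idioms().items():
        print(f"{idiom}: {value}")
```

    Swapping in the imperial gallon or the long ton would change the hat and the bricks, which is one more reason these phrases don't translate cleanly.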

    Okay, okay, to be fair, I should use nice, round numbers in these phrases. But does “You know I’d walk a thousand kilometers if I could just see you tonight” sound any better? “Stop! Don’t move a centimeter!”

    There’s something about the customary system that lends itself better to flowing rhetoric. What is it? Maybe it’s that the metric system is so closely tied to science, a decidedly unpoetic field. Maybe it’s similar to the general Germanic-Latinate perception distinction in English (although the metric system is mostly ultimately derived from Greek), where Germanic words are perceived as simpler and earthier, whereas Latinate words are perceived as haughty and highfalutin. Maybe it’s something else altogether.

    There are plenty of reasons to adopt the metric system in the US. But will we lose these expressions if/when the US finally switches over? The United Kingdom partially adopted the metric system in 1965. However, the imperial (customary) system remains widespread. Today, official signs use the imperial and metric systems side by side.

    Does full metrication mean the eventual loss of these great, useful English phrases? If our children and grandchildren only learn the metric system, would a phrase like “Don’t move an inch!” even carry any meaning?

    Would we even be aware of units like “peck” (“a peck of pickled peppers”) or “league” (“20,000 leagues under the sea”) if they weren’t used in common phrases?

    In the UK, where the customary system is supposed to exist side-by-side with the metric system, more obscure customary units are well on their way out (via Google Ngram):

    leagues

    bushels

    But more common customary units seem to be hanging on pretty robustly:

    miles vs. kilometres

    gallons vs. litres

    Google Ngram gets its results from the Google Books collection, a corpus that doesn’t include scientific journals (which would be bound to use the metric system, at least for the last hundred years). So despite partial metrication in the UK, customary units like miles and gallons are still widely used in non-scientific written works. Still, you can see a sharp downtick in the use of “miles” and “gallons” (and a sharp uptick in the use of “km”) around 1965, when the UK officially adopted partial metrication.

    It’s conceivable that units like “miles” and “gallons” could be considered obscure, generations from now, after full metrication in the US and the UK. Maybe then we’d substitute in “kilometers” and “liters” in our figurative language. But I’m more inclined to think that they have more staying power than “bushel” or “peck” or “league,” if only for the volume of common phrases and ideas that they’re used in. Maybe that’s just wishful thinking, a premature nostalgia.

    Questions I don’t have the answer to, but hope that somebody does:

    • Can anyone think of common English phrases in which metric units are used?
    • Is there a similar distinction in other languages? Do French poets and writers prefer to use miles instead of kilometres? I know they both exist in the language, but France is a fully metricated country.
     
  • The Diacritics

    The Diacritics 4:00 am on October 8, 2011
    Tags: congo, country, definite articles, geography, google ngram, iraq, nation, popular usage, rule, the congo, the ukraine, ukraine, usage

    Indefinite definite articles: the Ukraine or Ukraine? 

    (Posted by Sandeep)

    In 2007, Miss Teen South Carolina embarrassed herself in the Miss Teen USA pageant by giving a famously terrible answer to a simple question. Buried somewhere in the maze of her response were two references to Iraq, except in both instances she referred to the country as “the Iraq.” There are plenty of things wrong with what she said, but calling it “the Iraq” was especially (and laughably) jarring to me. We just don’t call Iraq “the Iraq.”

    But why? Is it really so simple, that we just don’t add the definite article “the” to Iraq? There are innumerable other examples of countries that don’t take a definite article, of course, all of which would sound ridiculous with one: “the France,” “the Greece,” “the India.”

    But there are a handful of countries which do take definite articles. There are two main patterns.

    The Gambia.


    (1) It seems that many countries whose names derive from important geographical features, such as “the Philippines” (islands) or “the Gambia” (river) or “the Netherlands” (lowlands) take a definite article. (Consider similar formations in the names of solely geographical features, such as “the Amazon” or “the Sahara.”)

    (2) Then there’s the United States of America and the United Kingdom, which take a definite article because the countries’ names describe their political organization. (This becomes clearer when you consider similar formations in many countries’ official names, such as “the Republic of China” [Taiwan] or “the Russian Federation” or “the United Mexican States.”)

    Mexico map.

    The United Mexican States.

    For most countries’ names in English, the presence or lack of a definite article is settled. But there are still other conflicts about whether to use “the.”

    (The) Ukraine

    Consider (the) Ukraine. Both “the Ukraine” and “Ukraine” are used in English. Personally, I’ve always used “the Ukraine,” but we’ll see below that my usage is likely misguided.

    A commonly accepted etymology of the word Україна (Ukrayina) is “borderland.” Based on this etymology, the “geographical feature” rule described above could explain the presence of the definite article in “the Ukraine.” But there’s still some level of uncertainty about Ukraine’s etymology (some believe it to be an ancient ethnonym of the Ukrainian people, among other etymologies), so that rule doesn’t seem very persuasive here. The geographical rule for definite articles only seems to be useful when the country’s name is obviously referring to a geographical feature. We don’t use definite articles with countries whose names now have tenuous connections to geographical features, like India (the Indus River) or Indonesia (“Indian archipelago”).

    The use of “the Ukraine” stirs up intense passion among Ukrainians, in fact. Some argue that English-language authors and journalists used “the Ukraine” systematically, especially before its independence from the U.S.S.R., to subjugate the people and nation of Ukraine by demoting it to a mere region, a mere feature of the larger U.S.S.R.

    A similar issue has raised hackles in the Ukrainian language itself. The use of the preposition na (“on”) before “Ukraine” has been scrapped for v (“in”) within Ukraine. According to this site, the Ukrainian government requested the change in 1993. Russian prescriptivists, quoted on the same site, continue to demand na, based on “tradition”:

    Литературная норма не может измениться в одночасье из-за каких-либо политических процессов.

    “Literary norms cannot change overnight because of any political process.”

    Some have pointed out that the style guides of many newspapers and magazines, including The Washington Post and The Economist, have explicitly required the use of “Ukraine” rather than “the Ukraine” after its independence. (I don’t have a copy of these style guides, so I can’t confirm, but there are secondary sources online which mention the shift.)

    Ukraine map.

    Ukraine or The Ukraine?

    I did a Google Ngram search to see the frequency of the phrases “in Ukraine” and “in the Ukraine” over the last 50 years in books. There’s a definite shift around 1993, soon after Ukrainian independence (and the same year that the Ukrainian government requested the preposition shift from “on” to “in”) from “the Ukraine” (red) to “Ukraine” (blue).

    Similar data for the phrases “from the Ukraine” (red) and “from Ukraine” (blue).

    As someone who has been using “the Ukraine” for the past decade, I guess I’ll have to make a shift to the apparently more acceptable “Ukraine.”

    (The) Congo

    But what about the Democratic Republic of the Congo? (The) Congo’s name refers to the Congo River, which itself refers to the pre-colonial Kongo Kingdom. Some sources use “the Congo” whereas others use “Congo.” The official name of (the) Congo uses a definite article: “the Democratic Republic of the Congo,” similar to other definite-articled nations like “the Republic of the Gambia” (the Gambia) and unlike nations such as “the Republic of South Africa” (merely South Africa).

    People I know who have traveled often to (the) Congo, including my undergraduate advisor Brian Hare, call it “Congo.” News outlets, such as CNN, also use “Congo.” But check out these Google Ngram graphs.

    “From Congo” versus “from the Congo” usage from 1800-2000. “From the Congo” (red) is significantly more popular.

    Similar data for “in Congo” (blue) versus “in the Congo” (red).

    Perhaps the continued popularity of the phrase “the Congo” is due to the recurrence of the imagery of the Congo rainforest (a geographical feature) over references to the actual nation. My advisor Brian Hare’s globetrotting author wife Vanessa Woods wrote a book about bonobos (who live almost exclusively within [the] Congo) and the subtitle of the book uses the phrase “the Congo.” But was that usage referring to the country or to the rainforest? It’s debatable.

    So while Miss Teen South Carolina was clearly veering from popular usage when she called Iraq “the Iraq,” other cases aren’t so clear. It’s worth noting that some languages draw a bright line — French, for example, tacks on a definite article to all non-neuter-gender countries: even though “the France,” “the Greece,” and “the India” might sound strange to us, “la France,” “la Grèce,” and “l’Inde” are par for the course in France.

     