How do tech terms become legit?

Podcast is in. PL/1 is out. Mashup is on the short list. But pity the poor 8-track: It came and went without ever making it into The Merriam-Webster English Dictionary.

The ephemeral nature of high-tech terminology poses a challenge to dictionary editors, who tend to adopt new words over a period of years -- or decades. If technology moves quickly, the adoption of high-tech terminology into popular mainstream dictionaries -- those most commonly used by the general public and scholars -- moves so slowly that one might think the editors are out of step.

Not so, says Peter Sokolowski, editor at large at Merriam-Webster Inc., which offers both a print edition of its dictionary and free online access to its contents. Editors track new words from the point of first citation. Most simply don't make the cut. "We don't want to have words that come and go," he says.

The word adopted fastest by Merriam-Webster was Google (as a verb), which entered the dictionary in 2006, five years after the first citation was noted. "In lexicographical terms that's light speed," Sokolowski says. In contrast, the editors had been monitoring malware since 1990, but it entered the dictionary only this year.

There's good reason to hold back, says Steve Kleinedler, supervising editor for The American Heritage Dictionary of the English Language at Houghton Mifflin Harcourt. "It is not the editors' intent to capture every word that comes out of any given industry," he says. In the high-tech field, where terms such as those found on Wired magazine's Jargon Watch may rise and fall in a matter of weeks or months, time separates the wheat from the chaff. "Leaving out the whimsy and the flash in the pan leaves room for essential vocabulary," says Kleinedler.

Jesse Sheidlower, editor at large for The Oxford English Dictionary, says the OED is less conservative than Merriam-Webster in how quickly it includes new words, adding 200 to 250 every quarter. Blog, for example, entered the dictionary in 2003, just four years after the first occurrence was noted. (Perhaps the word added to the OED fastest was not a technology term but a medical one. Norovirus entered the dictionary in December 2003, less than one year after the first quotation.)

Where is it used?

Ironically, while the earliest citations for many new high-tech words come from the blogosphere and other online postings, widespread use of those words in blogs and other Web-based sources carries less weight than use in print publications when it comes to deciding which words make it into the dictionary.

Words are accepted when the editors observe a critical mass of citations in "serious" publications, Sokolowski says. That means (with a few exceptions) the major print publications. Online-only publications with "carefully edited prose," such as Slate, are also influential. But not blogs. "Most blogs are just not carefully edited," he says. "The gold standard has always been the printed page."

Because of the reliance on print citations, the world's biggest search engine also plays a limited role in research at Merriam-Webster. Sokolowski says he uses Google and other online search engines only informally, to begin investigating new words. "Most of the citations we take come from print sources or Nexis [the searchable archive of print and online publications] because that really represents the press," he says.

OED's Sheidlower says citations in blogs and other online sources do influence his decision-making process, and he acknowledges that some terms appear chiefly in online sources. But he, too, wants to see words spill over into print publications before adopting them. "We would be less likely [to include words] that are only used in online sources," he says. "But it could happen."

Getting in

While widespread usage is a requirement for including new technology terms, editorial judgment also plays a role in determining what goes in and what stays out. Some decisions seem counterintuitive. For example, the term beta, as in beta software, never made the cut at Merriam-Webster. "[Beta] was never entered because there wasn't enough evidence that it was used generically," explains Anne Bello, assistant editor.

What's in and what's not also varies by publisher. For example, Bluetooth has been in the OED for five years, but is only now slated for the next editions of American Heritage and Merriam-Webster.

Then there's HD-DVD and Blu-ray, the competing DVD formats. The OED has passed on both. "You want to make sure you're not putting in something that's going to go away in two months when the technology goes one way or another," says Sheidlower. While Blu-ray won the next-generation DVD format battle, it still won't make it into the OED. "Neither of those products is that widespread," he says. But both technologies are apparently popular enough for Merriam-Webster's editors to include them in the next edition of their dictionary.

Seemingly outdated terms also may be added in some cases. For example, gopher was added to the OED several years ago, but is just now on the short list for inclusion in an upcoming version of the Merriam-Webster dictionary. Sheidlower says he wouldn't add the term if it were under consideration today, but says the fact that a technical term is obsolete isn't sufficient reason not to include a word -- or to remove it. "For us to put something in is a statement of faith that it is important," Sheidlower says.

The more technical the term, the less likely it is to make it into the dictionary. While Wi-Fi is in, the protocols it uses, such as 802.11b and 802.11g, are not. AJAX, another technical term, has yet to prove its permanence. "It's a common technology right now, but it is by no means the only technology for dynamically updating Web sites," says Sheidlower. He also sees it as more of a technical term. "The man on the street probably hasn't heard of AJAX, even if they've used Google Maps [which uses the technology]," he adds.

Another sticky area is known as the "Xerox problem": trademarked names that become synonymous with a function, such as photocopying. In the high-tech realm, Google has come into common use as a verb. That's a problem for Kleinedler. "We cannot legally add Google [to the dictionary] as a verb... trademark law is one area where lexicographers' hands are tied," he says, adding that publishers are legally obligated to define trademarks as trademarks (rather than, say, as verbs). As a result, American Heritage uses roundabout language in its definition of Google: "A trademark used for an Internet search engine. This trademark often occurs in print as a verb, sometimes in lowercase."

But not all publishers are so cautious. Merriam-Webster defines google as "to use the Google search engine to obtain information about (as a person) on the World Wide Web," giving the etymology as "Google, trademark for a search engine."

Falling out

For a word, getting into the dictionary is the academic equivalent of achieving tenure. Once a term gets in, editors are loath to remove it. While editors at Merriam-Webster and American Heritage do occasionally retire words, those that make it into the OED stay forever, and become part of the historic record of the English language, says Sheidlower. That's one reason why he's particularly selective when it comes to high-tech terms. "Other dictionaries can put something in, make a marketing splash and take it out two years later. We can't do that."

But competitors are also very conservative when it comes to removing words. Merriam-Webster drops words only during major revisions, which occur once a decade, and those that do get dropped tend to be real antiques. Some of the more recent terms to get the axe included microreader (first cited in 1949) and record changer (first cited in 1931). A relatively modern term by that measure, PL/1, the programming language, first cited in 1973, was recently removed from both Merriam-Webster and American Heritage dictionaries.

But taking out words carries real risks, says Kleinedler. In 1998, during a review of obsolete computer science terminology, the word chad faced the chopping block. Then Kleinedler recalled voting with a punch card machine in the previous election. "We decided to hold onto it [for the dictionary's 4th edition, published in 2000]. And lo and behold, the election of 2000 comes along and suddenly chad is everywhere."

American Heritage has removed a handful of technology terms, including data diddling. "Anything you'd need to know about that term can be adequately defined by diddle itself," says Kleinedler. (The definition for diddle attributed to American Heritage, however, is less clear about fraudulent data manipulation than the Encarta Dictionary definition.)

While dictionaries consistently add new high-tech terminology as it goes mainstream, the editors don't seem too worried about keeping up with the fast-changing computer technology landscape or its high-tech readers. "They're not following us, we're following them. It is they who inform us what needs to go in," says Kleinedler.

For the typical technology enthusiast, the dictionary serves a more important function than providing definitions of high-tech terms that they already know. "[They] are much more likely to use a dictionary to find out how to spell mischievous or figure out what the word olivine or slatternly means, rather than looking up some term that they themselves are on the front lines of establishing in the language."

This story, "How do tech terms become legit?" was originally published by Computerworld.


Copyright © 2008 IDG Communications, Inc.