2010 relaunch explained

Before the 2010 relaunch, users saw unrevised entries in their static OED2 (1989) version and revised entries in their updated OED3 version.

During the course of revision, a very large number of improvements were made (often manually) to entries not yet revised. These changes relate to a number of different areas: content, bibliography, display, tagging, and many others. These improvements could not previously be displayed in unrevised entries.

For the relaunch, we decided to take an unprecedented step: the version of each entry shown on the OED Online would be the most up-to-date version of that entry held on the OED editors’ working database (for both revised and unrevised entries). Entries currently undergoing revision would be displayed in the most up-to-date version held before they were ‘closed’ for revision.

The database available on the OED Online continues to be updated every three months. Each new release now contains not only further new and revised entries, but also any further textual changes made anywhere across the database. As a result, the OED Online database becomes considerably more dynamic, responsive, and accurate; the standardization of data also makes it easier to search.

Users should note that there are risks attached to this approach. However, our view is that the degree of risk is considerably outweighed by the benefits gained. The static, printed versions of entries published in OED2 are still available on the site for reference.

The following section documents some of the improvements now available on the site outside revised entries. Most of these features have been available in revised entries for some time.

Bibliography and illustrative quotations

  • 389,000 quotations checked or converted to better editions
  • 700,000 short titles regularized, making searching simpler: this includes 24,500 quotations from various texts of the Bible, now searchable by individual title (e.g. Wycliffite, Tyndale, Coverdale, A.V., etc.)
  • Some 53,000 occurrences of Ibid. replaced, typically by the work title attached to the previous quotation (to facilitate searching)
  • Another 16,000 occurrences in which a double-em dash is replaced by the name of the author cited in the previous quotation (to facilitate searching)
  • The 19,590 quotations represented in OED1/2 by a cross-reference to another entry have been replaced by the full quotations
  • 920 quotations cited indirectly from Johnson’s dictionary, without a precise reference, have been traced and re-cited from their original publication
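
The Ibid. and double-em dash replacements described above can be pictured as a single pass over an entry's quotations, carrying forward the author and title of the preceding quotation. The following is only a minimal sketch under stated assumptions: the `Quotation` record, its field names, and the `resolve_back_references` function are hypothetical and are not taken from the OED's own systems, and the sample data is invented. The real process also involved exceptions (Ibid. is "typically", not always, replaced by the previous title), which this sketch does not model.

```python
from dataclasses import dataclass, replace
from typing import List

IBID = "Ibid."
DASH = "\u2014\u2014"  # double-em dash standing in for a repeated author


@dataclass(frozen=True)
class Quotation:
    """Hypothetical record; field names are assumptions for illustration."""
    author: str
    title: str
    text: str


def resolve_back_references(quotations: List[Quotation]) -> List[Quotation]:
    """Return new quotations with Ibid. and double-em dash placeholders
    filled in from the previous (already resolved) quotation."""
    resolved: List[Quotation] = []
    for q in quotations:
        if resolved:
            prev = resolved[-1]
            if q.author == DASH:
                q = replace(q, author=prev.author)
            if q.title == IBID:
                q = replace(q, title=prev.title)
        # A placeholder on the very first quotation has nothing to refer
        # back to, so it is left untouched.
        resolved.append(q)
    return resolved


# Invented sample data, for illustration only.
sample = [
    Quotation("Author A", "Title One", "first quotation"),
    Quotation(DASH, IBID, "second quotation"),
    Quotation(DASH, "Title Two", "third quotation"),
]
for q in resolve_back_references(sample):
    print(q.author, "|", q.title)
```

Because each placeholder is resolved against the previous resolved quotation, a run of several Ibid. or dash entries in a row all receive the same explicit author and title, which is what makes them findable by a title or author search.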

Display

  • Compounds and derivatives now displayed as separate units (as in the revised text) rather than in OED1’s compressed format
  • 91,000 blind or broken cross-references fixed (arising mainly from imprecise or faulty OED1 data)

Content

  • Interim review and update of factual statements (geographical, historical, biographical, etc.) in 1,500 unrevised entries
  • 11,000 cross-reference quotations have been turned into full quotations (for ease of reference)
  • A number of abbreviations have been expanded in the etymologies, to aid readability and searchability. Abbreviated language names have been expanded, so that e.g. ‘French’ is now found in place of ‘F.’ and ‘Old French’ in place of ‘OF.’ Other abbreviations commonly occurring in etymologies have also been expanded, such as ‘perh.’ > ‘perhaps’, ‘prob.’ > ‘probably’, and ‘app.’ > ‘apparently’. Ambiguous cases have been resolved individually, such as ‘freq.’ > ‘frequent’, ‘frequently’, or ‘frequentative’. ‘Cf.’ has been replaced with ‘compare’.
  • Typographical errors fixed
  • Some standardization of labelling (e.g. Physical/Cultural Anthropology)
  • 60,000 hyphenated lemmas (in lists) have had their head form added (to enhance searching)
  • 54,000 ‘missing’ parts of speech have been added (these occur mainly with nouns, to which OED1 did not allocate an explicit part of speech)
  • A large number of irregular OED1 parts of speech have been standardized
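
The expansion of etymological abbreviations listed above can be sketched as a lookup-and-replace that expands unambiguous abbreviations automatically and sets ambiguous ones aside for an editor to resolve. This is a hedged illustration only: the function `expand_etymology`, the two tables, and the sample etymology string are assumptions made for the example, and the abbreviation pairs are limited to those mentioned in this note. It is not a description of how the OED's editors actually carried out the work.

```python
import re
from typing import List, Tuple

# Unambiguous expansions mentioned in the note above.
UNAMBIGUOUS = {
    "OF.": "Old French",
    "F.": "French",
    "perh.": "perhaps",
    "prob.": "probably",
    "app.": "apparently",
    "Cf.": "compare",
}

# Abbreviations with several possible readings; these need a human decision.
AMBIGUOUS = {
    "freq.": ("frequent", "frequently", "frequentative"),
}

_tokens = sorted(list(UNAMBIGUOUS) + list(AMBIGUOUS), key=len, reverse=True)
# The lookbehind stops 'F.' from matching inside 'OF.'.
PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(t) for t in _tokens) + r")(?!\w)"
)


def expand_etymology(text: str) -> Tuple[str, List[str]]:
    """Expand unambiguous abbreviations; return the new text together with
    the ambiguous abbreviations that were left in place for review."""
    flagged: List[str] = []

    def substitute(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token in AMBIGUOUS:
            flagged.append(token)
            return token  # leave as-is for manual resolution
        return UNAMBIGUOUS[token]

    return PATTERN.sub(substitute, text), flagged


# Invented etymology text, for illustration only.
expanded, needs_review = expand_etymology(
    "perh. from OF. forme (F. forme); freq. in verse"
)
print(expanded)
print(needs_review)
```

Keeping the ambiguous table separate reflects the distinction drawn in the note: ‘perh.’ has one reading and can be expanded mechanically, whereas ‘freq.’ has three and depends on context.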

Tagging

  • Enhanced etymological tagging, in which 135,000 etymologies were tagged to identify the immediate etyma of a word (to facilitate more accurate etymological searching)
  • 13,000 phrases and other structures showing variant forms have been retagged so that each variant is searchable

John Simpson
