Introduction

Who needs old dictionaries anyway?

Dictionaries lie at the core of the human ability to conceptualize, systematize and convey meaning. But they are hardly positivistic, objective repositories of knowledge or truth about language, let alone the world.

A dictionary is a strange, polyfunctional beast: it is a text, a tool, a model of language, and a cultural object deeply embedded in the historical moment of its production. This means that dictionaries, like any other texts, will inevitably reflect the cultural, political and ideological values of their times, no matter when and where they were written.

Think of the way Thomas Blount defined coffee (coffa) in his Glossographia; or, a dictionary interpreting the hard words of whatsoever language, now used in our refined English tongue, a very popular dictionary of its time, first published in 1656:

Was 17th-century coffee more potent than the one we drink today? Did it really rid people of debilitating depression and misanthropic cynicism? Why does Blount's definition feel and read differently from those provided by modern lexicographic works such as The Chambers Dictionary or The New Oxford American Dictionary?

The Chambers Dictionary
The New Oxford American Dictionary

Exercise. Spend some time analyzing the three entries above. What would you say are the main stylistic features of lexicographic prose? How would you describe the tone of Blount's entry when compared to The Chambers or the NOAD? What conclusions can you draw from the fact that modern entries for coffee contain more lexical information? And yet Blount's definition addresses something that the other definitions don't: coffee as a stimulant. What do you make of that?

Historical, legacy dictionaries remain valuable to humanists precisely because they provide culturally shaded insights into the lexical knowledge of a particular epoch — sometimes in contrast to contemporary experiences, attitudes or values. We study legacy dictionaries not because we need them for linguistic survival in a world of fauxhawks, twerking and jeggings, but because they have something important to teach us about language, about the people who wrote them and about the time in which they were written.

When you analyzed the three entries above, you probably also noticed that they not only sound different but also look different.

In terms of typography and layout, there are important differences not only between 17th- and 21st-century dictionaries, but also between the two modern dictionaries. The Chambers, for instance, is more compact: it doesn't number separate senses, and it uses abbreviations (n for noun, Turk for Turkish, Ar for Arabic etc.). The NOAD, on the other hand, doesn't seem to be a fan of abbreviations; it uses much more whitespace, starts each example or subsense on a new line, and has a separate section for etymology ("Origin").

These differences can be partly ascribed to the fact that The Chambers is a print dictionary, whereas the entry from the NOAD comes from its electronic edition. Print dictionaries are like prime real estate: space in them is very expensive. To limit the costs of printing and make the end product manageable, easy to hold and browse through even for those of us who have not descended from giants, lexicographers have over time developed a set of conventions that they use to structure, abbreviate and lay out dictionary content in the most compact way possible. Some of those conventions, as we can see, are no longer strictly necessary when a dictionary is published electronically.

Why should we digitize?

The Library of Alexandria, one of the wonders of the Ancient World, was founded in the fourth century B.C. by Ptolemy I Soter. Its first librarian, Demetrius of Phalerum, had a utopian dream of collecting everything that had ever been written. It is estimated that between 400,000 and 700,000 scrolls were housed at the library during its peak days. The library obtained new holdings both through royal gifts and by copying originals. According to Galen, a prominent Greek physician and philosopher, ships anchored at the port of Alexandria were obliged to surrender their books for immediate copying. The Library of Alexandria was the largest and most important library of its time. Until it burned down.

CC-BY-SA-3.0 E.Herzel

While the destruction of the Library of Alexandria is to this day seen as a symbol of the tragic loss of knowledge and culture, it is by no means a singular incident in history. Wars, civil unrest and natural disasters, but sometimes also mere accidents, continue to destroy and damage the human record. A 2004 fire in the Duchess Anna Amalia Library in Weimar destroyed some 50,000 volumes, of which 12,500 are considered irreplaceable.

Digitization is not a cure for all ills and should not be embraced all too hastily as a replacement for proper preservation strategies. Yet there is no doubt that digital objects contribute to both the preservation and accessibility of cultural heritage. From an institutional point of view, the opportunities created by digital resources "for learning, teaching, research, scholarship, documentation, and public accountability" (Kenney and Rieger, 2000, 1) are immense.

The following list (based on Deegan and Tanner, 2002) succinctly summarizes some of the advantages of digitization:

  • immediate access to high-demand and frequently used items
  • easier access to individual components within items (e.g. articles within journals, or individual entries within a dictionary)
  • ability to recirculate out-of-print items
  • potential to display items that are in inaccessible formats (large volumes, maps etc.)
  • 'virtual reunification' - allowing dispersed collections to be brought together
  • ability to enhance digital images in terms of size, sharpness, color contrast, noise reduction etc.
  • potential to conserve fragile objects while presenting surrogates in more accessible forms
  • potential for integration into teaching materials
  • enhanced searchability, including full text
  • integration of multimedia content

There are, however, also reasons that speak against digitization: lack of adequate funding or institutional support, unresolved copyright issues, the risk of damaging precious objects, and the unpredictable costs of long-term storage of the digital files themselves (see Hughes, 2004, 50-2). Institutions and projects usually cannot digitize everything. It is therefore very important to plan carefully and to select the material that is to be digitized.

The digitization boom that began in the mid-1990s dramatically altered the way we engage with immaterial cultural heritage; lexicographic heritage has been no exception. It is by now well established that digitization can increase the use value of a historical dictionary, especially in global, networked environments. Many projects have been initiated to create electronic editions of printed lexicons, yet practitioners in the field of digitizing dictionaries still face numerous technical, methodological and interpretative challenges.

The goal of this course is to equip you with the knowledge to deal with those challenges and to help you make good use of available tools and methods in your own projects.

In the rest of this unit, we'll look at various steps involved in planning and implementing a dictionary digitization project.

Last modified: Wednesday, 15 March 2017, 8:25 AM