

The Intellectual Foundation of Information Organization


According to the second edition of the Anglo-American Cataloguing Rules, the answer is no. A search for books written by Lewis Carroll would return Alice in Wonderland, but not Euclid and His Modern Rivals. For the latter, you would have to search for books written by Dodgson. Whether or not you agree with this rule, we can all agree that organizing information can be confusing. And the mountain of bits steadily accumulating on the Internet and in our email and Web browsers is only making the task harder.

Fortunately, a small segment of our population, librarians, has been dealing with the problem of information organization since 2000 B.C. Who better to turn to in our time of need than people with thousands of years of accumulated expertise and experience?

The Intellectual Foundation of Information Organization
By Elaine Svenonius
The MIT Press, 2000, 255pp.
mitpress.mit.edu
$37

Applying our conceptual understanding of information organization to the Internet is a necessary and promising endeavor, as it enables initiatives such as the Semantic Web. However, the Internet also brings unprecedented complexity to an already complex problem, forcing us to re-examine its intellectual foundation.

Svenonius, a Professor Emeritus of Library Information Science at UCLA, offers a primer on the past, present, and future directions of the information sciences in her book, The Intellectual Foundation of Information Organization. The book is dense and deeply technical at times, but the content she presents is invaluable. Svenonius provides a framework for understanding and thinking about the problem of information organization by describing its conceptual basis.

Philosophy of Information Organization

The first half of the book delves into the philosophical underpinnings of information organization, and the first chapter is devoted almost entirely to definitions. Svenonius defines a document as the embodiment of a work or expressed thought. The distinction between document and work is a crucial one. This becomes clear when you think in the context of library catalogs. Do the first and second folios of Shakespeare's "Hamlet" constitute the same work? Most would say yes. Are they different documents? Again, most would answer affirmatively.

How about a French translation of "Hamlet" versus the original text? Would those be considered the same work? Probably yes. What about Laurence Olivier's 1948 rendition of "Hamlet"? Probably no, but if that's the case, then what's the relationship between the movie and textual versions of "Hamlet"?

Those who catalog information face these issues constantly. We need a rigorous understanding of the philosophical and linguistic elements of information organization so that we can apply them to more automated solutions.

Much of the book deals with bibliographies, which act as both catalogs and representations of information. Svenonius explains that a bibliography's purpose is twofold: to locate a book or other information entity, and to locate sets of entities based on certain criteria. She then describes bibliographic theory, and in the latter half of the book she examines several real bibliographic systems, like the Dewey Decimal System.
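
The two purposes described above can be pictured as a pair of lookups over a small catalog. What follows is a minimal Python sketch, not anything taken from the book: the names Work, Document, Catalog, find, and collocate are hypothetical, and the sample records are only illustrative. It also keeps the work/document distinction from earlier, so that two folios can be separate documents of one work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """An abstract work, such as Hamlet (hypothetical model)."""
    title: str
    author: str


@dataclass(frozen=True)
class Document:
    """One particular embodiment of a work."""
    work: Work
    description: str


class Catalog:
    """A toy bibliography supporting both purposes (assumed design)."""

    def __init__(self):
        self._documents = []

    def add(self, document):
        self._documents.append(document)

    def find(self, title, description):
        """First purpose: locate one known entity, or return None."""
        for doc in self._documents:
            if doc.work.title == title and doc.description == description:
                return doc
        return None

    def collocate(self, **criteria):
        """Second purpose: every document whose work matches all criteria."""
        return [
            doc for doc in self._documents
            if all(getattr(doc.work, key) == value
                   for key, value in criteria.items())
        ]


hamlet = Work("Hamlet", "William Shakespeare")
catalog = Catalog()
catalog.add(Document(hamlet, "First Folio"))
catalog.add(Document(hamlet, "Second Folio"))
catalog.add(Document(Work("Alice in Wonderland", "Lewis Carroll"),
                     "illustrative edition"))
```

In this sketch, catalog.find("Hamlet", "First Folio") retrieves a single document, while catalog.collocate(author="William Shakespeare") gathers both folios. A search under a name the catalog never recorded, such as Charles Dodgson, returns nothing, which is the Carroll/Dodgson problem in miniature.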

Implications for the Internet

Svenonius makes it evident that completely fulfilling the objectives of bibliographic systems is a complex and expensive proposition. At the same time, systems that only partially fulfill these requirements are far from useless.

Part of the motivation behind the Semantic Web initiative is to create better searching capabilities. Searches for Celtics on the Web will likely turn up pages on Irish culture, when you may be more interested in reading about Bill Russell and Larry Bird. It would be wonderful if you could make the Web understand that you're looking for information on basketball dynasties and not the Celtic harp.

Although we haven't yet achieved this ideal, we can make do with what we have. Most of us know that narrowing the search term to Boston Celtics will yield the information we want. While keyword searching doesn't satisfy all of the requirements for bibliographic systems, it does serve its purpose well and is relatively cheap to implement.
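
To see why keyword searching is cheap, and why adding a word narrows the results, consider a tiny inverted index. This is only a sketch under stated assumptions: the page ids and text in PAGES are made up, build_index and search are hypothetical helper names, and real search engines do far more (stemming, ranking, phrase matching).

```python
from collections import defaultdict

# Made-up pages for illustration only.
PAGES = {
    "irish-culture": "Celtics and the Celtic harp in Irish culture",
    "basketball": "Boston Celtics dynasty with Bill Russell and Larry Bird",
}


def build_index(pages):
    """Map each lowercased word to the set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start with the pages for the first term, then intersect with the rest.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


index = build_index(PAGES)
```

Here search(index, "Celtics") matches both pages, while search(index, "Boston Celtics") matches only the basketball page. The index knows nothing about what either page means, which is exactly the limitation the Semantic Web aims to address.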

Svenonius writes, "An important question today is whether the bibliographic universe can be organized both intelligently (that is, to meet the traditional bibliographic objectives) and automatically." This is the crux of the problem we face with the Internet.

Svenonius also admits that the Internet has already succeeded in organizing itself to an extent: "A self-organizing bibliographical universe nevertheless succeeds in meeting the bibliographic objectives in part, occasionally, and somewhat randomly. And for many documents and many users, that is all that is needed."

Nevertheless, the Internet's current state of self-organization is primitive, and it doesn't always meet users' needs. To improve the situation, we'll need a strong understanding of the principles Svenonius describes. While The Intellectual Foundation of Information Organization is a challenging read, it's also a worthwhile one.


Eugene writes, programs, and consults on a freelance basis. He is currently writing a book on the history of free software, entitled Software, Money, and Liberty: How Source Code Became Free. Reach him at [email protected].

