


Bottoms Up


Content and Metadata Management

This brings us to the unfulfilled promise of content management systems (CMS). In the late 1990s, many of us in the information architecture community saw CMS as an exciting opportunity to really make use of the controlled vocabulary metadata produced by the bottom-up approach. In short, where others saw software for controlling digital assets, we saw metadata management systems.

Unfortunately, the others are winning. Instead of enabling distributed solutions for capturing and utilizing controlled vocabulary metadata, many CMS installations have been focused purely on controlling the publishing process and allowing the repurposing of a limited suite of digital assets.

Many companies have continued to use the same top-down process for building Web sites despite their ownership of CMS software. This conjures images of a modern pyramid construction project, with thousands of sweaty laborers lugging blocks of stone while a hydraulic crane with a 16,000-ton lifting capacity sits idle.

Companies that are able to marginalize the "you watch your assets and I'll watch mine" mentality can really tap the power of content and metadata management systems to strike an intelligent balance between centralization and decentralization. Centralized teams can focus on designing broad, shallow enterprise-wide controlled vocabularies. Departmental teams can use the enterprise infrastructure as a starting point for deep dives into particular subject or product vocabularies. A unified metadata registry can provide global rules while allowing for local extensions. Once again, we can use the bottom-up approach to define digestible chunks without losing a sense of the whole.
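One way to picture "global rules with local extensions" is the minimal sketch below. It is an illustration only, not a description of any particular CMS: the class names (`Vocabulary`, `MetadataRegistry`), the sample terms, and the rule that a department may add terms but not redefine an enterprise label are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """A controlled vocabulary: preferred terms plus their variant labels."""
    name: str
    # Maps each preferred term to the set of variants that resolve to it.
    terms: dict[str, set[str]] = field(default_factory=dict)

    def add_term(self, preferred: str, *variants: str) -> None:
        self.terms.setdefault(preferred, set()).update(variants)

    def resolve(self, label: str) -> str | None:
        """Return the preferred term for a label, or None if it is unknown."""
        needle = label.strip().lower()
        for preferred, variants in self.terms.items():
            if needle == preferred.lower() or needle in {v.lower() for v in variants}:
                return preferred
        return None


class MetadataRegistry:
    """Hypothetical registry: one broad enterprise vocabulary, many local ones."""

    def __init__(self, enterprise: Vocabulary) -> None:
        self.enterprise = enterprise
        self.local: dict[str, Vocabulary] = {}

    def register_extension(self, department: str, vocab: Vocabulary) -> None:
        # Global rule (an assumption of this sketch): a local extension may add
        # terms, but may not reuse a label the enterprise vocabulary already owns.
        for preferred, variants in vocab.terms.items():
            for label in (preferred, *variants):
                if self.enterprise.resolve(label) is not None:
                    raise ValueError(f"{label!r} is already defined enterprise-wide")
        self.local[department] = vocab

    def resolve(self, label: str, department: str | None = None) -> str | None:
        # Enterprise terms win; otherwise fall back to the department's vocabulary.
        hit = self.enterprise.resolve(label)
        if hit is None and department in self.local:
            hit = self.local[department].resolve(label)
        return hit


# Broad, shallow enterprise vocabulary maintained by the central team.
enterprise = Vocabulary("enterprise")
enterprise.add_term("Products", "product", "offerings")
enterprise.add_term("Support", "help", "customer support")

# Deep, narrow vocabulary maintained by one department.
engineering = Vocabulary("engineering")
engineering.add_term("Firmware", "embedded software")

registry = MetadataRegistry(enterprise)
registry.register_extension("engineering", engineering)
```

In this sketch the central team owns the shared labels, while each department can go as deep as it likes within its own namespace. A lookup tries the enterprise vocabulary first and only then the local one.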

Searching and Browsing Systems

None of this bottom-up work will have value if the systemic perspective isn't carried through into the interface design process. After all, a solid foundation is only as good as the house it supports. And the design of good houses requires an understanding of both the construction materials and the behavior of real humans.

Figure 1



Because search quality depends on all of a site's systems working together, you might think of search as a window into the site's soul. If search works, you've probably got a healthy site.

We know, from decades of research in the fields of library science and information retrieval, that the information-seeking behavior of humans is iterative and interactive. People often don't know exactly what they're looking for and their experience with the information system can change their very goals and expectations repeatedly.

Consequently, an understanding of construction materials and human behavior leads us to the conceptual model of the search system pictured in Figure 1.

A successful user experience requires harmony between these components. It's not sufficient to choose a great search engine. And it's not enough to pursue a content-centric or business-centric or user-centric process. We must take a systemwide view if we are to tap the strength of the bottom-up approach to support powerful, flexible searching and browsing, while simultaneously supporting an efficient, distributed model for designing and managing those complex adaptive systems known as Web sites.

Sound difficult? It is. But if we shy away from the challenge and ignore the forest for the trees, we will all be diminished by our lack of vision.

Gathering Content

This excerpt from Information Architecture for the World Wide Web, Second Edition (O'Reilly & Associates), by Louis Rosenfeld and Peter Morville, details key issues involved in analyzing content.

Format: Aim for a broad mix of formats, such as textual documents, software applications, video and audio files, and archived email messages. Try to include offline resources such as books, people, facilities, and organizations that are represented by surrogate records within the site.

Document Type: Capturing a diverse set of document types should be a top priority. Examples include product catalog records, marketing brochures, press releases, news articles, annual reports, technical reports, white papers, forms, online calculators, presentations, spreadsheets, and the list goes on.

Source: Your sample should reflect the diverse sources of content. In a corporate Web site or intranet, this will mirror the organization chart. You'll want to make sure you've got samples from engineering, marketing, customer support, finance, human resources, sales, research, and so on. This is not just useful, it's also politically astute. If your site includes third-party content such as electronic journals or ASP services, grab those too.

Subject: This is a tricky one, since you may not have a topical taxonomy for your site. You might look for a publicly available classification scheme or thesaurus for your industry. It's a good exercise to represent a broad range of subjects or topics in your content sample, but don't force it.

Existing Architecture: Used together with these other dimensions, the existing structure of the site can be a great guide to diverse content types. Simply by following each of the major category links on the main page or in the global navigation bar, you can often reach a wide sample of content. However, keep in mind that you don't want your analysis to be overly influenced by the old architecture.
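If you already have a rough content inventory, the dimensions above can double as a coverage checklist. The sketch below is one possible way to do that, not a method from the book: it assumes each inventory item is a record with hypothetical `format`, `doc_type`, `source`, and `subject` fields, and it uses a simple greedy pass to pick items until every value seen in the inventory is represented at least once. The existing architecture is left out because it describes how you find items rather than a property of each item.

```python
from __future__ import annotations

# Field names are assumptions for this example, not a standard schema.
DIMENSIONS = ("format", "doc_type", "source", "subject")


def pairs(item: dict[str, str]) -> set[tuple[str, str]]:
    """Return the (dimension, value) pairs an item would represent in a sample."""
    return {(d, item[d]) for d in DIMENSIONS if item.get(d)}


def pick_sample(inventory: list[dict[str, str]]) -> list[dict[str, str]]:
    """Greedily choose items until every (dimension, value) pair is covered."""
    uncovered = set().union(*(pairs(i) for i in inventory))
    remaining = list(inventory)
    sample = []
    while uncovered:
        # Take the item that adds the most values not yet represented.
        best = max(remaining, key=lambda i: len(pairs(i) & uncovered))
        sample.append(best)
        remaining.remove(best)
        uncovered -= pairs(best)
    return sample


inventory = [
    {"title": "Press release", "format": "text", "doc_type": "press release",
     "source": "marketing", "subject": "products"},
    {"title": "Annual report", "format": "text", "doc_type": "annual report",
     "source": "finance", "subject": "company"},
    {"title": "Product demo", "format": "video", "doc_type": "presentation",
     "source": "marketing", "subject": "products"},
    {"title": "Launch brochure", "format": "text", "doc_type": "press release",
     "source": "marketing", "subject": "products"},
]

sample = pick_sample(inventory)
for item in sample:
    print(item["title"])
```

A greedy pass like this only guarantees breadth. It says nothing about which items are most important, so treat the result as a starting list to adjust by hand, especially for the subject dimension, where the advice above is not to force it.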



Peter is president of Semantic Studios (www.semanticstudios.com), a strategy and information architecture consultancy, and coauthor of Information Architecture for the World Wide Web (O'Reilly & Associates).







