CHIPS 'N DIPS

Musings on Broadbus and Multicore.

by Mike Swaine

December 08, 2007

Tapping into Unstructured Data

Bill Inmon coined the term 'data warehousing,' wrote the first book on the subject, and held the first conference on data warehousing. Lately he's turned his attention to the broader challenge of managing unstructured textual data.

His new book Tapping into Unstructured Data [with Anthony Nesavich, Prentice Hall, 2008] tackles the challenge of integrating such messy data into business intelligence.

If you think like a database (and when you're compiling, processing, and using corporate business intelligence, you pretty much have to), unstructured data doesn't even exist. There's information there, of course, but it is not expressed in a form that can connect meaningfully with databases. It is as though the databases are blind to it.

How you cure that blindness is the subject of the book, and for me the most enlightening part was the case studies. Here the authors paint a clear picture of the iterative process of extracting meaning from unstructured data, using the meaning extracted at each step to inform the next.

Vast amounts of information are tied up in unstructured documents. This blog is adding to the pile. It is really encouraging that efforts like this book are underway to extract from that pile the kind of meaning that is needed for business, medical, and other decisions.

Posted by Mike Swaine at 05:41 PM



