June 30, 2006
Discovering Relationships in Context

Inductive tools for forensic computing

(Page 1 of 4)
Joe Celko
Cogito is a graph-based relationship analytics tool for pattern matching and relationship identification, making it an ideal tool for computer forensics.
Joe is a database consultant and author of Trees & Hierarchies in SQL. He can be contacted at jcelko212@earthlink.net.


Relational databases hold models of a world that assumes you know the basic relationships among the entities in your problem space. Their purpose is to maintain business rules and access data in a known format. This is a good assumption in a production environment. Actually, it is a necessary assumption. You want to produce known and well-defined transactions and reports in online transaction processing (OLTP) environments. A constantly changing schema would be a model of a world where elephants can drop out of the sky. Even online analytical processing (OLAP) deals with known relationships, usually summaries taken from transactional systems and outside data sources.

A relational database management system (RDBMS) is a deductive system that can find sets of entities from the data in this model. If you find a new relationship, you have to add tables and new constraints to your schema to model it. Likewise, if you drop a table from your schema, you have to see how that changes the whole model. This is the nature of a deductive approach to data.

An RDBMS is very useful, but there are problems in the world that cannot be solved with deductive methods. We need inductive tools that let us add relationships, rather than data, to a model. What kind of data requires inductive reasoning? Imagine you are a cop on the "CSI" television show. All you have is a collection of odd facts that do not fall into nice neat relational tables. These facts tie various data elements together in various ways. You have 60 minutes to find a network of associations that connects the bad guys to the crime in some as-yet-unknown manner. And a new fact introduced after the last commercial break can change the outcome in the last five minutes.

In any of these shows, the detectives go to a marker board and start drawing "fishbone" diagrams (also known as Ishikawa diagrams) and other general directed graphs. Events are tied together until there is a path from the perpetrators to the crime. The whole point of the show is finding that path by hard work, some shooting, and clever thinking. It's good drama, but in the real world, it can be slow and we don't have scriptwriters to guarantee outcomes.

What we need is a tool to manipulate a general directed graph with various relationships between simple entities. Ideally, we would like this tool to be declarative and self-optimizing, like SQL engines.

Graph theory, the branch of mathematics that deals with nodes and edges, is one of the most powerful mathematical tools we have because it is so general. A road map's nodes are cities and the edges are roads. A family tree's nodes are people and the edges are blood relations. A circuit diagram's nodes are electronic components and the edges are the wires between them. Graphs are so general that you have used them all your life without thinking of them in a formal sense.
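
To make the nodes-and-edges idea concrete, here is a minimal Python sketch of a directed graph whose edges carry a relationship label, plus a breadth-first search for a path between two nodes. It is only an illustration, not how Cogito works: the facts, the relationship names, and the helpers `build_graph` and `find_path` are all invented for this example.

```python
from collections import deque

# Hypothetical facts as (source, relationship, target) triples.
# Every name here is invented for illustration.
FACTS = [
    ("suspect", "owns", "car"),
    ("car", "seen_at", "warehouse"),
    ("warehouse", "scene_of", "crime"),
    ("suspect", "called", "witness"),
]


def build_graph(facts):
    """Return an adjacency list: node -> list of (relationship, neighbor)."""
    graph = {}
    for source, relationship, target in facts:
        graph.setdefault(source, []).append((relationship, target))
        graph.setdefault(target, [])  # every node gets an entry, even sinks
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return a fewest-edges path as a node list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relationship, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(FACTS)
path = find_path(graph, "suspect", "crime")
print(path)  # ['suspect', 'car', 'warehouse', 'crime']
```

Adding a new fact is just one more triple in the list; nothing like a schema change is needed, which is the flexibility the detectives' marker board provides.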

Because I'm a big fan of SQL, my first approach to any problem is to write a schema and some queries. I have tried to do general graphs in SQL, and my conclusion is that it is possible, but not practical, for any data set of a realistic size.
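
As a rough sketch of what "general graphs in SQL" can look like, the Python snippet below stores edges in an in-memory SQLite table and uses a recursive common table expression to list every node reachable from a starting node. This is not the author's schema or query: the `edges` table, its columns, and the sample data are assumptions made for illustration, and it presumes an engine that supports `WITH RECURSIVE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical edge table: one row per directed edge.
conn.execute(
    "CREATE TABLE edges ("
    " src TEXT NOT NULL,"
    " dst TEXT NOT NULL,"
    " PRIMARY KEY (src, dst))"
)
conn.executemany(
    "INSERT INTO edges (src, dst) VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")],  # note the cycle a -> b -> c -> a
)

# UNION (rather than UNION ALL) discards rows already produced,
# so the recursion stops even though the graph contains a cycle.
REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    SELECT ?
    UNION
    SELECT e.dst
    FROM edges AS e
    JOIN reachable AS r ON e.src = r.node
)
SELECT node FROM reachable ORDER BY node
"""

reachable = [row[0] for row in conn.execute(REACHABLE_SQL, ("a",))]
print(reachable)  # ['a', 'b', 'c', 'd']
```

This query answers only "which nodes can be reached?" Returning the actual paths, weighting edges, or mixing several relationship types makes the SQL noticeably more involved.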
