Dr. Dobb's is part of the Informa Tech Division of Informa PLC


True Desktop Search


Programming Desktop Search

You may have no interest in customizing an open-source desktop search tool, yet still want to extend one of the more popular tools, such as Google Desktop. To support this, most desktop search providers include an API for extending their tools. At a minimum, most tools let you add new content-type filters, so you can extend them to index and search content for your own custom applications.

However, some of the tools, such as Google Desktop and Windows Desktop Search, go further. Google, for example, lets you hook various indexing and search events. Both Google Desktop and Windows Desktop Search let you integrate their index and search engines directly into your own applications. Let's look at the APIs available from the desktop search tool providers and examine the capabilities of each.

Beagle API

The Beagle API defines two main types of components you can develop to extend Beagle's indexing and query capabilities: Beagle Filters and Beagle Backend components.

Beagle Filters are components that extract pertinent information from an item to be indexed (such as an e-mail or an OpenOffice document). Filter components rely on Beagle Backend components to retrieve data items and stream the indexable content to them.
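To make this division of labor concrete, here is a minimal Python sketch of a filter. Beagle itself is written in C#, and its real filter classes look different; the names `TextFilter`, `PlainTextFilter`, `extract`, and `pick_filter` are hypothetical, chosen only to illustrate a filter receiving a stream (normally supplied by a backend) and returning the indexable text.

```python
import io
from abc import ABC, abstractmethod


class TextFilter(ABC):
    """Hypothetical filter interface: turns a raw item stream into indexable text."""

    # MIME types this filter claims to understand (illustrative only).
    mime_types = ()

    @abstractmethod
    def extract(self, stream):
        """Return the searchable text found in the binary stream."""


class PlainTextFilter(TextFilter):
    mime_types = ("text/plain",)

    def extract(self, stream):
        # Decode bytes leniently and collapse runs of whitespace.
        text = stream.read().decode("utf-8", errors="replace")
        return " ".join(text.split())


def pick_filter(filters, mime_type):
    """Return the first filter that claims the given MIME type, or None."""
    for candidate in filters:
        if mime_type in candidate.mime_types:
            return candidate
    return None


# A backend would normally supply the stream; here we fake one in memory.
item = io.BytesIO(b"Quarterly   report\nfor the search team")
chosen = pick_filter([PlainTextFilter()], "text/plain")
extracted = chosen.extract(item)
print(extracted)
```

Keeping the filter ignorant of where the stream came from is the point of the split: the same filter can serve any backend that produces items of that content type.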

Beagle Backend components can be divided into two subtypes. Indexable components know how to locate specific data items and inform the Beagle engine that those items are available to be indexed. Queryable components know how to query, at search time, data sources that are not feasible to index. For instance, there is a Google backend component that queries Google and returns the results to the Beagle engine.
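The difference between the two backend subtypes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Beagle's actual C# API: every class and method name below (`IndexableBackend`, `QueryableBackend`, `Engine`, and so on) is hypothetical, and the "remote" backend uses canned data rather than a real network call.

```python
from abc import ABC, abstractmethod


class IndexableBackend(ABC):
    """Hypothetical: knows where items live and announces them for indexing."""

    @abstractmethod
    def items_to_index(self):
        """Yield (uri, text) pairs the engine should add to its index."""


class QueryableBackend(ABC):
    """Hypothetical: answers queries live instead of feeding the index."""

    @abstractmethod
    def query(self, terms):
        """Return a list of URIs matching the terms."""


class NotesBackend(IndexableBackend):
    def __init__(self, notes):
        self.notes = notes  # dict of uri -> text, standing in for files on disk

    def items_to_index(self):
        yield from self.notes.items()


class RemoteBackend(QueryableBackend):
    """Stands in for a web search backend; uses canned data, no network."""

    def __init__(self, canned):
        self.canned = canned  # dict of term -> list of URIs

    def query(self, terms):
        hits = []
        for term in terms:
            hits.extend(self.canned.get(term.lower(), []))
        return hits


class Engine:
    def __init__(self, indexables, queryables):
        self.queryables = queryables
        self.index = {}  # word -> set of URIs
        for backend in indexables:
            for uri, text in backend.items_to_index():
                for word in text.lower().split():
                    self.index.setdefault(word, set()).add(uri)

    def search(self, terms):
        # Local hits come from the index; live hits come from queryable backends.
        local = set()
        for term in terms:
            local |= self.index.get(term.lower(), set())
        results = sorted(local)
        for backend in self.queryables:
            results.extend(backend.query(terms))
        return results


engine = Engine(
    [NotesBackend({"note://1": "Beagle indexing notes", "note://2": "Grocery list"})],
    [RemoteBackend({"beagle": ["http://example.com/beagle"]})],
)
results = engine.search(["beagle"])
print(results)
```

In this sketch the indexable backend is consulted once, when the index is built, while the queryable backend is consulted on every search.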

Windows Desktop Search Interfaces and SDK

Microsoft lets you enhance its desktop search tool to add support for new file types to be indexed, add new data sources to locate content to be indexed or searched, and integrate desktop search results into other applications. For instance, a human resources application can be extended to display employee information from a corporate database, as well as information about that employee from a representative's local hard drive or some other private store.

Windows Desktop Search components are built as COM objects that implement specific interfaces. For instance, to add support for a new file type, your component must implement the IFilter interface. The result is a component that knows how to extract the searchable information from the file type in question, can be invoked by the desktop search engine, and can be integrated with Windows Explorer.
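A real IFilter is a COM object, typically written in C++, and the indexer pulls content from it chunk by chunk. The Python sketch below models only that pull loop so the control flow is easy to see. It is not the COM interface: the `KeyValueFilter` class, the `get_chunk` and `get_text` method names, the `EndOfChunks` exception, and the "key=value" file format are all hypothetical.

```python
class EndOfChunks(Exception):
    """Raised when the filter has no more chunks (stands in for a COM status code)."""


class KeyValueFilter:
    """Hypothetical filter for a made-up 'key=value' file format."""

    def __init__(self, content):
        # Keep only well-formed lines; each one becomes a chunk.
        self._pairs = [
            line.split("=", 1) for line in content.splitlines() if "=" in line
        ]
        self._position = -1

    def get_chunk(self):
        """Advance to the next chunk and return its property name."""
        self._position += 1
        if self._position >= len(self._pairs):
            raise EndOfChunks
        return self._pairs[self._position][0].strip()

    def get_text(self):
        """Return the text of the current chunk."""
        return self._pairs[self._position][1].strip()


def index_document(doc_filter):
    """Pull every chunk from the filter, as an indexing engine might."""
    properties = {}
    while True:
        try:
            name = doc_filter.get_chunk()
        except EndOfChunks:
            break
        properties[name] = doc_filter.get_text()
    return properties


sample = "title = Expense report\nauthor = J. Smith\nthis line is ignored\n"
indexed = index_document(KeyValueFilter(sample))
print(indexed)
```

The engine drives the loop and the filter only answers, which is why a filter written once can be reused by any host that knows the interface.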

Here's a complete list of the COM interfaces and their functions:

  • IFilter extracts the searchable information from a file type.
  • IUrlAccessor provides information about a resource and its URL.
  • ISearchProtocol maps a resource to an IUrlAccessor and uses a specific protocol to access the data.
  • IContextMenu defines how the URL is to be displayed in a context menu.
  • IExtractIcon defines an icon to associate with a URL.
  • IShellFolder provides references to the implementation of the IContextMenu and IExtractIcon interfaces.
  • IPersistFolder initializes shell folder objects.
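The relationship between a protocol handler and an accessor, as described for ISearchProtocol and IUrlAccessor in the list above, can be sketched as a registry keyed by URL scheme. This Python model is an assumption-laden illustration rather than the COM API: the `mem` scheme, the class names, and the `create_accessor`, `size`, and `read` methods are hypothetical.

```python
from urllib.parse import urlparse


class UrlAccessor:
    """Hypothetical accessor: describes one resource identified by a URL."""

    def __init__(self, url, content):
        self.url = url
        self._content = content

    def size(self):
        return len(self._content)

    def read(self):
        return self._content


class MemoryProtocol:
    """Hypothetical protocol handler for a made-up 'mem' URL scheme."""

    scheme = "mem"

    def __init__(self, store):
        self._store = store  # dict of key -> bytes, standing in for a data source

    def create_accessor(self, url):
        parsed = urlparse(url)
        key = parsed.netloc + parsed.path  # e.g. "hr/employees/42"
        return UrlAccessor(url, self._store[key])


class ProtocolRegistry:
    """Maps a URL scheme to the handler that knows how to reach the data."""

    def __init__(self):
        self._handlers = {}

    def register(self, handler):
        self._handlers[handler.scheme] = handler

    def open(self, url):
        scheme = urlparse(url).scheme
        if scheme not in self._handlers:
            raise LookupError(f"no protocol handler for scheme {scheme!r}")
        return self._handlers[scheme].create_accessor(url)


registry = ProtocolRegistry()
registry.register(MemoryProtocol({"hr/employees/42": b"Jane Doe, Engineering"}))
accessor = registry.open("mem://hr/employees/42")
print(accessor.url, accessor.size())
```

Because the indexer sees only accessors, adding a new data source in this model means registering one more handler, with no change to the code that reads and indexes content.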

