Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

An Incremental String Search in C


Searching and sorting routines are an essential part of any programmer's toolkit. A number of sorting routines compete for the programmer's favor, distinguished from on another by speed, the format of the data they expect, and complexity. Most programmers choose a sequential or binary search algorithm. This article discusses a variant technique that is sometimes very useful.

Sequential and binary searches take a key and a sorted list as input, returning information about the location of the list entry that matches a given key. Sequential and binary searches, as they're usually implemented, expect to be given full information about the key to be searched for; neither attempts to find partial matches.

It's often more convenient to seek entries in a sorted list that are compatible with a given search string as characters in the search string are made available. Instead of waiting for the user to enter the full key to search on, we can examine the underlying list dynamically to determine which entries in the list are candidates for the item we're searching for. This method is usually referred to as incremental search. A few design decisions should be made up front when implementing an incremental search routine. How should the routine behave if a search string with no matches is entered? Should it refuse to accept characters in the search string with no match in the search list, or accept them and indicate that no match is possible? What if the data contains duplicate entries? What should be done when one entry in the string array being searched is a prefix of another? This situation would arise if two entries in the array were ZIMMER and ZIMMERMAN. If you typed ZIMMER as your search string, should the incremental search immediately report that it's found an exact match or await further instructions?

Regardless of the answers to these questions, any incremental search routine should certainly return some indicator of success or failure. If any partial matches are found, the range of indices that yield a match should also be made available to the calling routine.

Later, we'll look at a C function, inc_search, which performs incremental search on certain types of string arrays. Without going into too much detail now, suffice it to say that inc_search returns a 1 if it finds a partial match, a 0 if it doesn't, and a -1 if the search is aborted by the user.

To illustrate the basic idea behind our implementation of incremental search, suppose we define variables arr and base by:

typedef char STR[6]; 
STR arr[] = {"ABOVE","BASTE", 
  "BLUE","CANE"}; 
STR *base = arr; 

The first line defines the user type STR, which consists of strings of five or fewer characters (recall that strings in C are terminated by an ASCII zero). The second declares arr to be a four-element array of strings and define the entries in the array. The third line declares base to be a pointer to a variable of type STR and initializes base to point to the origin of the array arr. The effect of these declarations is shown in Figure 1.

Figure 1: Initial configuration base and ar.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.