Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

Accelerated Search For the Nearest Line


Stefan is a software developer in Berlin specializing in cross-platform development, user interfaces, and reengineering. He can be contacted at [email protected].


Finding the nearest position within a set of lines is not a difficult task--it's done every time you click the mouse in a drawing program or GIS (geographical information system). But when large amounts of data that have to be batch processed are involved, the situation changes. I recently was faced with the problem of about 1,000,000 lines that were exported from a GIS. The lines were more or less touching each other, thus representing a pipe network, but there were no logical links between them. Furthermore, some lines ended not near the end of another line, but instead their end touched another line (call it "parent") far away from its endpoints. It was necessary to split the parent line and create a new link at this position.

I had already a module implementing a "find nearest point in all lines" in a straightforward linear manner, but it turned out that the task of finding the nearest neighbor for all lines took several days on a high-end desktop machine. In contrast to point-to-point search where algorithms with logarithmic search times exist (see A Template for the Nearest Neighbor Problem by Larry Andrews), I don't know of such implementations for point-to-line search. This may have its cause in the fact that--in contrast to point-to-point search--on each single line there are infinite possible matching points. So what to do?

In this article, I present an optimized algorithm that reduces the search time to a fraction, when searching the nearest position within a set of lines. The algorithm is based on dividing the whole possible spatial area in quadrants and then building indices for each quadrant. All searches are based on using a threshold for the maximum search distance to the line found (the "maximum search width", which is 10 meters in my examples). If there is no line within this distance, nothing will be found. The solution consists of the classes described in Table 1. For comparison, the source code of the different versions of linear search discussed here is also provided. The sample code was tested under Windows, Linux and Solaris using VC++ 6.0 and gcc 3.3.3. The measurements were made using Windows XP

Table 1: List of Classes

Sample Data

The sample main() function creates 1,000,000 lines of 100 meter length for sample data, as in Figure 1. Note, that some line ends are near to the end of another line, whereas other line ends are near the center of another line. Compared to the sample data, in typical GIS-applications lines vary much in their length. Also in typical data, lines (i.e. pipes, cables, streets etc.) mostly are represented by polygons with many points. But this is only a specialization of the problem where all polygons will have to be divided into their straight lines. By my experience though, straight lines going thorough the whole area are very rare.

[Click image to view at full size]
Figure 1: Lines in sample data


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.