Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

An Embedded Memory-Based SQL Class for C++


March 2000/An Embedded Memory-Based SQL Class for C++

SQL and C++ make a powerful combination, particularly if you don't have to do the combining yourself.


Introduction

When writing C++ programs, how many times have you stored data in arrays, linked lists, or collections, and then needed to retrieve data using some selection criteria? Solving this problem usually requires writing your own selection logic. If you plan on supporting logical expressions, get ready for some heavy-duty coding. What you really need is an object class that organizes your data and provides a method to retrieve data using something more powerful than simple iterators.

This article presents a DLL (dynamic linked library), called EMBSQL.DLL, that implements a class providing both data storage and a method for manipulating the object's member data using SQL (structured query language) — the virtually universal database language. Running on Windows 95/98/NT and fully self contained in memory, EMBSQL requires no commercial database package. Because the library uses shared memory, multiple programs easily share the same data through the SQL objects.

To demonstrate the power of an SQL-enabled object, this article presents a sample stock quote server application. The server program reads stock quotes off the Internet and stores them in the embedded SQL object. A concurrent program acts as a client for users; it implements powerful query capabilities on the server's stock market data in less than 30 lines of C++ code. Both programs use the EMBSQL capabilities. Along the way, this article shows how to programmatically retrieve stock quotes from the Internet, how to share data in a dynamic link library, how to create and use virtual memory maintained in a memory-mapped file, how to use some basic SQL concepts.

Overview of EMBSQL

Figure 1, EMBSQL Architecture, shows the relationship of the EMBSQL library to its client programs. After creating an instance of the EMBSQL class, called CESql (Class, Embedded SQL), an application stores data into the object for later update and retrieval through the object's Sql method. Using shared memory to store class data allows multiple processes to share data maintained by the CESql objects.

However, sharing class data between processes complicates the EMBSQL design, because C++'s normal new and delete operators do not allocate shareable memory within the DLL. These methods allocate memory from the running process's private heap storage, regardless of whether the methods are called from within the main process executable or from within the DLL. Since memory allocated from a process's heap cannot be shared with other processes, the EMBSQL library employs a private memory manager.

To set up shared memory for use inside the DLL, the variables managing the shared memory within the library are declared as being in a shared data segment. These variables are declared in the file Malloc.cpp, between the #pragma data_seg() directives. Figure 2 shows the relevant portion of Malloc.cpp.

Note that each variable shared in the DLL is initialized within the #pragma statements as well as declared. Initializing the variables guarantees that the linker will place them in the proper data segment. When linking the DLL, the linker must be told about the shared region [1]. To do this in Microsoft's Visual C++ Integrated Development Environment, simply select Project, then Settings. In the Project Options dialog at the bottom of the window, enter the following:

/SECTION:shared,rws

Now all processes using EMBSQL share the same pointers for memory management within the library. However, the standard C++ new and delete operators cannot be used to allocate the actual memory, since these methods allocate memory from the calling process's private heap. One way to overcome this would be to allocate memory from a large character array declared in the SHARED segment of the library. This scheme unfortunately causes the DLL to grow at least as physically large as the data block declared.

EMBSQL overcomes these memory problems by implementing a private memory management scheme, and by allocating memory from a memory-mapped page file managed by the operating system's virtual memory manager [2]. When a process loads or unloads the EMBSQL DLL (or any DLL for that matter), it calls the library's DllMain function first. The EMBSQL DLL's DllMain function, located in Malloc.cpp, creates the memory-mapped file when the shared memory memCounter variable is zero. The library increments and decrements this counter as processes load and unload the library. Figure 3 shows the portion of Malloc.cpp that implements the shared memory-mapped file and subsequent mappings to this file.

The library's DllMain function calls the CreateMappedFile(int size) to obtain the shared memory file and mappings to it. The size parameter determines how much shared memory to allocate; it is set to the maximum file size by the first process loading and initializing the DLL. The name of the memory-mapped file is embsqlmemory.tmp, located in the system's TEMP directory.

Subsequent processes loading the DLL also call the CreateMappedFile function in file Malloc.cpp. However, as the code fragment in Figure 3 shows, subsequent entries into DllMain cause CreateMappedFile to be called with a size argument of zero, by virtue of memCounter being greater than one. In this case, CreateMappedFile merely maps the memory pointer to the shared memory file. Note the call to MapViewOfFileEx within CreateMappedFile, that happens when CreateMappedFile is called with a size argument of zero. The last parameter to MapViewOfFileEx is the address the virtual memory manager must use for the shared memory base. This variable, sharedMemoryFile, was initialized by the first process loading the DLL. If the address returned by MapViewOfFileEx does not match sharedMemoryFile, the DLL notifies the user of the memory alignment error through the CESql method ReturnError. Unless these shared memory file addresses match, the system will crash.

To enable the EMBSQL library to create objects in shared memory, these objects' new and delete operators are overridden to invoke private shared_malloc and shared_free memory allocators instead of the customary malloc and free. Figure 4 shows the definition for a class SCString (a shared CString), which defines its own shared-memory versions of operator new and delete. These operators serve as forwarding functions for shared_malloc and shared_free respectively. Objects of this class will be allocated entirely from within the shared memory managed by the EMBSQL library. Two other classes that client programs will use are allocated from shared memory as well: the SCCollection class and the SCColumn class.

The class SCCollection, implemented in Collection.cpp (not shown), supports arrays of pointers into shared memory. SCCollection implements the tables and rows within the CESql object. These arrays are similar to C arrays, except that they dynamically shrink and grow as necessary. Array indexes always start at zero. Memory is allocated contiguously to the upper bound, even if some elements are null. The CESql object's ability to store elements into the array is limited only by the availability of shared memory. As with a C array, the access time for an SCCollection's indexed element is constant and independent of the array size. The SCCollection mimics the operation of the MFC CObArray class [4].

The class SCColumn, implemented in Column.cpp (not shown), describes the columns within a table stored in shared memory. This class provides methods to access the name, type of column, and storage for the individual data items in the tables.

Instantiating a CESql Object

To manage your data with an embedded SQL object, you first create an object of the class CESql. Example:

embsql = new CESql();

The first instance of a CESql object causes the EMBSQL library to create the shared memory file used to store all embedded SQL class data. Subsequent creations of CESql objects cause the library to map onto the previously created shared memory file. After obtaining the CESql object, you access it using the CESql's Sql function. This function provides the power of SQL to manipulate the data within the object.

Function Sql takes a CString object and parses the input text within it as SQL. As the parser recognizes the SQL language
elements, it constructs a simple execution stack. The stack is merely an array of CString objects representing operators and their operands, pushed onto the stack in the order they are recognized. After successfully parsing the input, the function Sql executes the stack. Stack execution is quite simple. An operator is popped from the top of the stack. Then, depending on the operator, one or more operands are popped, and the indicated operation performed. The operation result is pushed back onto the stack. After all operators are popped off the stack, the result of the operations remains. An easy way to view this stack is to enable stack tracing with the CESql::PrintStack(BOOL enable) method, setting enable to true.

The file parser.cpp (not shown) implements the class CESql and its recursive descent parser [3]. File Executestack.cpp (not shown) implements the stack execution part of the recursive descent SQL parser.

Defining Tables

CESql's Sql method provides the mechanism for creating tables and table structures via the CREATE TABLE SQL command. An example provides the easiest way to describe the various commands that function Sql can interpret. Consider a collection of stock quotes. Each stock quote contains the following attributes: name of the stock, the time of the last quote, the last quoted price, the highest quoted price, the lowest quoted price, and the volume. A table, with each row representing a stock symbol, and each row containing several columns for the attributes of each stock, nicely organizes the data.

Figure 5 shows how such a "stocks" table is created within the CESql object — by passing a CREATE TABLE SQL command to the Sql method.

An alphanumeric name followed by its type and size, where applicable, describes each row within the table. In Figure 5 the 'float' and 'char' types are shown. Notice the 'char' type also requires a size designation. Other types supported by function Sql are 'time', 'int', and 'unsigned'. Once a table has been defined to the CESql object, the same Sql method performs record selection, insertion, modification, and deletion.

Adding Records

You can add records to the CESql object by passing an SQL INSERT command to the Sql method. The form of the INSERT command is:

INSERT INTO <table-name>
   (<column-name 1>, <column-name-2>,...
    <column-name-N>)
   VALUES(<value-1>,<value-2>,...
          <value-N>)

The uppercase words are keywords of the SQL language. <table-name> denotes the name of the table within the CESql object. <column-name-n> identifies alphanumeric names of the columns within <table-name>. The <value-n> fields describe values for the associated columns. A typical entry of a stock quote into the "stocks" table of the CESql object looks like this:

embsql->Sql("INSERT INTO stocks  \
     (NAME, DATE, TIME, LAST, HIGHEST,
      LOWEST, VOLUME) \
     VALUES('IBM', '10/23/99',
            '10:00AM', 123.23,125.12,
            121.4,349000");

Modifying Records

After populating the CESql object with data, one or more rows may require updating. The SQL UPDATE provides the ability to update data in the collection:

UPDATE <table-name> SET
   (<column-name-1>=<value-1>,<column-name-2>=<value-2>...)
      [WHERE <expression>]

The real power of the CESql object lies in its ability to use the SQL WHERE clause to constrain the operations on its member data. The WHERE <expression> construct in SQL supports operators for evaluating complex conditionals, with operator precedence. Supported operations include +, -, /, *, =, <, <>, >, >=, LIKE, NOT, AND, and OR of column names and numeric and string literals. As an example, if IBM's lowest quote fell to $100.00, we could easily update its value in the CESql object with:

embsql->Sql("UPDATE stocks SET(LOW=100) WHERE NAME='IBM'");

Deleting Records

The SQL DELETE command removes records from the CESql object when they are no longer required. The DELETE form looks like this:

DELETE FROM <table-name> [WHERE <expression>]

Here again, the real power of the CESql object lies in its ability to understand a WHERE clause. To delete all the data from the "stocks" table in the CESql object, use the command:

embsql->Sql("DELETE FROM stocks");

To delete all the rows from the example stock market table where the name is not IBM and the last price was less than $100.00 and volume was greater than 10,000 shares, use:

embsql->Sql("DELETE FROM stocks WHERE
   NAME <> 'IBM' AND LAST <100 AND
   VOLUME >10000");

Selecting Data

The SELECT SQL command retrieves data from a table. Using the power of the WHERE clause, complex queries of your data are now a snap. No hard-coded logic required! The SELECT form of the Sql method is:

SELECT <columns> FROM <table-name>
   [WHERE <expression>]

<columns> specifies a comma-separated list of columns to select out of the table. If all columns are desired, use the '*' short-hand designator. In the stock quote program, the user could select out a list of all the names by using simply:

CCollection* results =
embsql->Sql("SELECT NAME FROM stocks");

Or, to demonstrate a more complex operation, to retrieve the stocks whose high stock prices are 20% higher than the lowest price, the form would be:

CColumns* results =
   embsql->Sql("SELECT * from stocks
               where high > 1.2 * low")

The results collection contains an array of pointers to rows in the table matching the Sql query.

Miscellaneous Helper Methods

The CESql object provides several helper methods that allow you to print records from the object, obtain a catalog of tables within the object, and query column definitions of specific tables. Figure 6 shows sample usage of these methods.

Lines 4-6 illustrate how the catalog of tables within the CESql object is printed.

Line 7 shows how to retrieve the column specifications for a table. The method sets the number of columns in nC and returns the columns as an array in cols.

Line 8 uses the Sql method to make the query.

Lines 7-14 retrieve and print all rows of data from the CESql object.

Line 16 prints any syntax error detected.

Example Use of Embedded SQL Objects

The following sample server and client applications demonstrate the power of using SQL-enabled objects in C++ programs. The server, called Loadstocks.exe, queries quote.yahoo.com for stock quotes. After retrieving the stock quotes from Yahoo, the server loads a CESql object with the information.

A separate client application provides a user with the capability for SQL queries on the stock information maintained by the server, all without using hard-coded logic, or a commercial database. The client is implemented in 30 lines of C++ code.

Server Application

The file loadstocks.cpp, defined in Listing 1, implements a stock quote server program. This program is implemented as an EMBSQL/Win32 console application. The compiled program, Loadstocks.exe, runs continuously. Every five minutes it reads stock quotes from Yahoo on the Internet. The program makes HTML queries to Yahoo's web server for the stock information. When the Yahoo web server returns the quote information, the program parses the quotes from the HTML and stores the information in a CESql table called STOCKS.

The program takes a list of stock symbols as its arguments. To track IBM, Microsoft, UPS, and 3M, use the command line:

C>loadstocks IBM MSFT UPS MMM

Client Application

The file quotes.cpp in Listing 2 obtains an instance of a shared embedded SQL object, and then reads SQL commands from the keyboard. The CESql object executes these SQL commands as typed by the user then and displays the results on the console.

After starting the server, loadstocks.exe, run quotes.exe. Then type some sample queries, for example:

C>query
embsql>select * from stocks
embsql>select * from stocks where symbol='IBM' 
embsql>select * from stocks where (high > 1.02 * low)

Source Code

The full source code for the EMBSQL library, stock quote server, and client program are provided on the CUJ ftp site (see p. 3 for downloading instructions).

Notes and References

[1] For a good discussion on using shared memory in dynamic link libraries, see Paul Carlson's article, "An Automated Shutdown DLL," C/C++ Users Journal, pp. 59-61, August 1999.

[2] For more information on the implementation and use of virtual memory-mapped shared files, see "Managing Memory-Mapped Files in Win32," by Randy Kath, Microsoft Technical Note, February 9, 1993.

[3] The design of the recursive descent parser was based on the work done by Alfred Aho and Jeffrey D. Ullman in their book on compiler design, Principles of Compiler Design, (Addison-Wesley, 1977), pp. 180-184.

[4] See the MSDN (Microsoft Software Developer Network) documentation for more information on the class CObArray.

Ben M. Faul is a senior engineer for the PlanetCalling division of eFusion. He has a bachelor's and a master's degree in computer science from the University of Southern Mississippi and Loyola-Marymount University. He has over ten years experience developing real-time stock-market, network, and systems management software in C and C++ on both PC and Unix platforms. In addition, he has a Jurisdoctorate degree from the University of West Los Angeles Law School and is a member of the California Bar Association. He may be reached at [email protected].


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.