Dr. Dobb's is part of the Informa Tech Division of Informa PLC


A Stream Class for Calling Perl from C++


January 2002/A Stream Class for Calling Perl from C++


Robert Y. Seward

Mixing Perl and C++ just got a whole lot easier.


Introduction

One nice aspect of Perl is that you can mix it with C/C++, either by creating Perl modules written as C++ shared libraries or by embedding Perl into a C++ application. This allows you to take advantage of the strengths of both C++ and Perl in one application. One difficulty that I have encountered in doing this is calling Perl from C++. Calling Perl from C++ can be useful to access data that resides in Perl space, to use existing Perl libraries, to create code on the fly, or simply to run some functions that are more suitable for Perl. While calling Perl from C++ is useful, it is not easy. I have seen several wrappers around the Perl API that attempt to make it easier to use, but most of these either lacked functionality I needed or weren’t much easier to use than the original API. To solve this problem, I have implemented two C++ wrappers: one that uses the stream interface to send commands to Perl and one that converts the results into standard C++ types. The interface allows you to do just about anything you can do in a normal Perl script, while being as easy as writing to a file and reading from an array. Listing 1 shows some example code using the PerlStream and PerlValue classes.

Calling Perl from C the Hard Way

When you call Perl from C, you have at your disposal a number of C functions that Perl provides to do this kind of thing. What follows is a very condensed version of what can be found in the Perl documentation (man perlguts and man perlcall on Unix).

A Perl value is represented in C with a pointer to an SV type. This SV* can point to any type of Perl value: scalar, array, hash, reference, etc. There are a bunch of macros and functions that Perl provides to query an SV* and to convert back and forth between SVs and standard C types. If an SV* points to an array, you can cast it to an AV* and do all the normal Perl array operations with it. Similarly, if an SV* points to a hash, you can cast it to an HV* and do hash operations with it.

One tricky thing about SVs is reference counting and mortality. In normal Perl code, Perl cleans up variables that are no longer in use by doing reference counting on each value. When programming in C, the cleanup is not so automatic. You must take care to “free” a value either by decrementing the reference count or by telling Perl that the value is “mortal.” Mortality means that Perl will wait to decrement the reference count until the end of the current scope and will free the value if the reference count goes to zero. Think of this in terms of the C++ behavior of calling the destructor on a local variable at the end of the current scope, with the twist that Perl’s “destructor” is not called if the local variable is still in use elsewhere.
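
The scope analogy can be made concrete in plain C++. The sketch below is only an analogy and does not use the Perl API: std::shared_ptr's use count stands in for an SV's reference count, and the names Value, keeper, and make_value are hypothetical. The local handle plays the role of a mortal value. Its reference is dropped at the end of the scope, but the underlying string is freed only if nothing else still refers to it.

```cpp
#include <memory>
#include <string>

// Analogy only: shared_ptr's use count plays the role of an SV's reference count.
using Value = std::shared_ptr<std::string>;

// Hypothetical long-lived holder, like a Perl variable that still refers to the value.
Value& keeper() {
    static Value held;
    return held;
}

// Creates a value in a local scope. If keep is true, a second reference is
// stored in keeper() before the scope ends. Returns a weak observer so the
// caller can check whether the value survived the end of the scope.
std::weak_ptr<std::string> make_value(bool keep) {
    Value local = std::make_shared<std::string>("hello");  // count is 1
    if (keep) {
        keeper() = local;                                   // count is 2
    }
    std::weak_ptr<std::string> observer = local;            // does not add to the count
    return observer;  // 'local' gives up its reference here, like a mortal SV
}
```

Calling make_value(false) leaves the observer expired, because the only reference went away with the scope. Calling make_value(true) leaves the value alive until keeper() is reset.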

You can run Perl code from C by either calling a Perl function or running the C equivalent of Perl’s eval function. In both cases, you have to manipulate the Perl stack by pushing SVs onto the stack before the call and popping SVs off the stack after the call. Also, you must explicitly tell Perl if you want to run the Perl code in a scalar, array, or void context and if you want Perl to keep the error (the $@ Perl variable). Lastly, after popping the results off the stack, you have to use Perl functions and macros to determine what the SV is (int, double, reference, etc.) and convert the SV to a C type. Dealing with array references and hash references adds to the complexity.

To get a taste of what this can look like, take a look at the PerlStream execute function on line 29 of Listing 2. I do not want to explain all the details of each funny Perl macro; for that, read the Perl documentation. I just want to point out that, while this may be second nature for Larry Wall, this is not easy for me to understand. I always have the gnawing feeling that I may be corrupting the Perl stack, leaking memory, or doing some other horrible thing. In fact, I have done all of these at various times when using the Perl API, so the fear is justified.

An Easier Way, Part I: The PerlStream

The ostream that sends an evaluation string to Perl is actually two classes: PerlStream and PerlStreamBuf. All concrete streams, such as stringstreams and fstreams, have this same underlying two-class design. The stream part handles all formatting operations, such as converting floating-point numbers to a character string. The result of the formatting is a sequence of characters, which is sent to the stream buffer. The stream buffer takes the characters and sends them to the appropriate device, possibly buffering the characters before sending them on.

When you want to create a new stream class, most often you really want the stream buffer part to be different, but want the formatting operations to be just like a normal stream. As a result, the new stream class is usually a very simple class, just inheriting from stream, whereas the stream buffer class does most of the work.

Things are no different for the PerlStream class. The PerlStream class definition is shown in Listing 3, starting at line 96. The class gets all the formatting functionality by just inheriting from ostream. There is only one data member: the PerlStreamBuf that is associated with this PerlStream object. The constructor passes this PerlStreamBuf to the underlying ostream via ostream’s constructor so that it knows where to send the formatting results. The PerlStream class’s member functions are all just wrappers around the real functions in PerlStreamBuf.

PerlStreamBuf needs to accumulate all characters sent to it in some buffer, waiting for the user to tell it to send it on to Perl. At that point, PerlStreamBuf evaluates the accumulated code and clears the buffer, getting ready for the next set of code.

To implement this, I chose to not give PerlStreamBuf’s streambuf base class a buffer at all. If the base class has no buffer, it calls the overflow virtual function with every character. So, I defined overflow to append the given character to the string mBuffer, as shown in line 23 of Listing 2. Although this is not the intended use of overflow, it makes the design of PerlStreamBuf trivial. Note that I also redefine the xsputn virtual member function (Listing 2, line 26) as a performance optimization, since the base class makes one call to xsputn for a string instead of calling overflow for each character in the string.
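
The unbuffered approach can be sketched without any Perl at all. The following is a minimal illustration, not the code from Listing 2: the class name AccumBuf and the member take are hypothetical, and take only returns and clears the accumulated text where the real class would hand it to Perl. The standard streambuf virtual function for writing a run of characters is xsputn, and that is the one overridden here alongside overflow.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Unbuffered streambuf sketch: no put area is ever set up, so the base class
// calls overflow() for single characters and xsputn() for runs of characters.
class AccumBuf : public std::streambuf {
public:
    // Hypothetical stand-in for "evaluate and clear": returns the collected
    // text and leaves the internal string empty for the next batch.
    std::string take() {
        std::string out;
        out.swap(mBuffer);
        return out;
    }

protected:
    // Called once per character because there is no buffer to fill.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            mBuffer.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);  // any value other than eof means success
    }

    // Optimization: append a whole run of characters in one call.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        mBuffer.append(s, static_cast<std::size_t>(n));
        return n;
    }

private:
    std::string mBuffer;
};
```

An ordinary std::ostream constructed with a pointer to an AccumBuf does all the formatting, and everything it writes ends up in the string until take is called.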

The perl_execute that initiates the Perl evaluation is called an I/O manipulator. For a complete discussion of manipulators, see [1]. Briefly, perl_execute is a function defined to take an ostream and return an ostream (see Listing 3, line 107). The ostream class defines an output operator that accepts such a function and, when the function is inserted into a stream with <<, calls it with the ostream as its argument. perl_execute then calls the PerlStream’s execute member function.
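
The manipulator mechanism is easy to try in isolation. The sketch below is an assumption-laden stand-in, not the article's listing: CommandStream, executed, and run_now are hypothetical names, std::stringbuf replaces PerlStreamBuf, and execute just records the pending text instead of evaluating it. What it does show accurately is how a function taking and returning an ostream gets called by the << operator.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stream that records each "executed" chunk instead of calling Perl.
class CommandStream : public std::ostream {
public:
    // The base class only stores the buffer pointer here; nothing is written yet.
    CommandStream() : std::ostream(&mBuf) {}

    // Stand-in for PerlStream::execute: save the pending text, then clear the buffer.
    void execute() {
        mExecuted.push_back(mBuf.str());
        mBuf.str("");
    }

    const std::vector<std::string>& executed() const { return mExecuted; }

private:
    std::stringbuf mBuf;
    std::vector<std::string> mExecuted;
};

// Manipulator: it takes and returns an ostream, so ostream's
// operator<<(ostream& (*)(ostream&)) calls it when it is inserted.
std::ostream& run_now(std::ostream& os) {
    // Act only when the stream really is a CommandStream; otherwise do nothing.
    if (CommandStream* cs = dynamic_cast<CommandStream*>(&os)) {
        cs->execute();
    }
    return os;
}
```

With this in place, cs << "$x = " << 2 << run_now; formats the text, then triggers execute at the point where the manipulator appears in the chain.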

The code for execute is shown in Listing 2 starting at line 29. Most of the code is best understood by reading the Perl documentation, but a couple of points are worth noting. First, the code passes two flags to perl_eval_sv: one to tell Perl the execution context and one to tell it to keep the error. By default, PerlStream will tell Perl to execute the code in a scalar context (meaning the code will return at most one value). Alternatively, you can have Perl execute the code in a void context (nothing is returned) or an array context (any number of values are returned) by calling the array and voidc PerlStream member functions. These functions set the mContext member data to the appropriate value, which is then passed on to Perl. The G_KEEPERR flag tells Perl to set ERRSV (the C equivalent of $@) if eval failed. The code checks for an error by calling SvTRUE on ERRSV.

Second, execute retrieves the return values from the Perl stack. All the return values, in the form of SV*s, are put into the mRtnVals vector. So that Perl will not destroy these values at the end of the current scope (the FREETMPS/LEAVE statements), the code increments the reference count on all of them. As mentioned before, this prevents Perl from destroying the SVs and all underlying objects (arrays, hashes, etc.). The code eventually decrements the reference counts by calling freePerlValues, both at the beginning of execute and in the PerlStreamBuf destructor, which allows Perl to free the memory if nothing else is using it.

An Easier Way, Part II: The PerlValue Class

The PerlStream class has the [] operator overloaded to access the SV*s returned by Perl, but converts them into a PerlValue object. The PerlValue class, defined in Listing 3 starting at line 12, is just a thin wrapper around all the Perl macros and functions that query SV* about its type and convert an SV* to C types. The conversions are done with C++ conversion operators, which are automatically called when the compiler is faced with the need to convert from a PerlValue to some other type, as in an assignment.

For example, on lines 9 and 10 of Listing 1, ps[0] and ps[-2] return a PerlValue object. C++ calls the int conversion operator on the assignment to two and calls the double conversion operator on the assignment to five. The code that implements the conversions and type queries is a set of simple wrappers around the corresponding Perl functions. For example, isInt (line 1 of Listing 2) calls the Perl function SvIOK, and the conversion to an int (member function Int on line 4) calls the Perl function SvIV. Like Perl, a PerlValue will happily convert from one Perl scalar (string, int, or double) to another on demand. A side effect of this is that the return values of isInt, isNumber, isDouble, and isString can actually change after some of these conversions are done.
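
Conversion operators can be shown with a small stand-in that needs no Perl. In the sketch below, ScalarSketch is a hypothetical class that keeps its value as text and converts with the C library, which only loosely imitates what SvIV and its relatives do. The point is the mechanism: the compiler picks the conversion operator that matches the type being initialized.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for PerlValue: holds a scalar as text and converts on demand.
class ScalarSketch {
public:
    explicit ScalarSketch(const std::string& text) : mText(text) {}

    // Conversion operators: the compiler calls the matching one when a
    // ScalarSketch initializes a variable of that type, e.g. int n = value;
    operator int() const {
        return static_cast<int>(std::strtol(mText.c_str(), nullptr, 10));
    }
    operator double() const { return std::strtod(mText.c_str(), nullptr); }
    operator std::string() const { return mText; }

private:
    std::string mText;
};
```

Given ScalarSketch v("5.75"), the initialization int i = v; uses the int operator and double d = v; uses the double operator, with no cast written at the call site.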

If the SV is an array or hash reference, you can access the values in the array or hash by using the PerlValue’s [] operator. If the argument passed in to operator[] is an int, then PerlValue will try to use its SV as an array reference and use the argument as the array index. Like Perl, a negative array index will index from the end of the array. If the argument to operator[] is a char* or string, then PerlValue will try to use its SV as a hash reference and use the argument as the hash key. These can be chained together as in lines 11 and 16 of Listing 1. By default, PerlValue will silently tolerate an out-of-range index and values that are not really array or hash references. Alternatively, you can tell PerlValue to throw an exception in such cases.
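
The indexing rules above can be sketched with overloaded operator[] on a small tree type. Everything here is hypothetical (NodeSketch, its members, and the strict flag are illustrative names, and std::out_of_range is a stand-in for the article's own exception classes), but it follows the described behavior: an int argument indexes an array with negative values counting from the end, a string argument looks up a hash key, and a bad lookup is tolerated unless strict checking is turned on.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tree value. An array node uses 'items' only; a hash node
// pairs keys[i] with items[i]. An empty 'scalar' means undefined here.
struct NodeSketch {
    std::string scalar;
    std::vector<std::string> keys;
    std::vector<NodeSketch> items;
    bool strict = false;  // when true, bad lookups throw instead of being tolerated

    // Integer argument: treat the node as an array; negative counts from the end.
    const NodeSketch& operator[](int index) const {
        const int size = static_cast<int>(items.size());
        const int pos = index < 0 ? size + index : index;
        if (pos < 0 || pos >= size) return missing("array index out of range");
        return items[static_cast<std::size_t>(pos)];
    }

    // String argument: treat the node as a hash and search for the key.
    const NodeSketch& operator[](const char* key) const {
        for (std::size_t i = 0; i < keys.size() && i < items.size(); ++i) {
            if (keys[i] == key) return items[i];
        }
        return missing("no such hash key");
    }

    // Tolerant mode returns a shared undefined node; strict mode throws.
    const NodeSketch& missing(const char* what) const {
        if (strict) throw std::out_of_range(what);
        static const NodeSketch undefined = NodeSketch();
        return undefined;
    }
};
```

Because each operator[] returns another NodeSketch, lookups chain naturally, as in root["list"][-1]. In tolerant mode a failed lookup yields an undefined node, so a chain through a missing key quietly produces an undefined result.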

You can also iterate over all values of an array or hash the same way you would iterate over an STL vector or map. Examples of this are on lines 21 and 26 of Listing 1. The array iterator is just like an STL vector iterator: you can move forwards and backwards, and the motion can be in any size steps. The hash iterator is more restrictive than an STL map iterator because Perl’s interface is more restrictive. Namely, you can only iterate forward, and you can only have one iterator per hash. A second iterator will set the first iterator back to the beginning of the hash.

The code to implement the iterators is mostly straightforward wrappers around Perl functions. The array iterator, implemented with the PerlValue::array_iter class, keeps an index into the array. The hash iterator, implemented with the PerlValue::hash_iter class, lets Perl handle the hash iteration. The one tricky part of both iterators is implementing the operator-> function. For the array iterator, the function needs to return a pointer to a PerlValue object, but a PerlValue has to be constructed to wrap around the real SV* that is in the array. If the function constructs a local object, it can’t return a pointer to that because the object will cease to exist as soon as the function returns. If the function constructs an object with new, the caller has to remember to free it. The solution to this quandary is to return a PerlValue::ptr object. This object contains a PerlValue and will continue to exist past the function return. The PerlValue::ptr class has its operator-> overloaded to access the encapsulated PerlValue. So, on line 26 of Listing 1, ai->isInt() gets expanded to ai.operator->().operator->()->isInt(): two calls to operator->! The hash iterator has something similar, but operator-> returns a PerlValue::hash_return_ptr, which contains a pair<const char*,PerlValue>.
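
The double operator-> trick works for any iterator that has to build its element on the fly. Here is a minimal sketch of the idea with hypothetical names (WrappedInt, WrappedPtr, WrappingIter, countEven) and plain ints in place of SV*s. It is not the PerlValue::array_iter code, only the same technique.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper built on demand around a stored element,
// in the way PerlValue is built around an SV*.
class WrappedInt {
public:
    explicit WrappedInt(int raw) : mRaw(raw) {}
    bool isEven() const { return mRaw % 2 == 0; }
    int get() const { return mRaw; }
private:
    int mRaw;
};

// Holder returned by value from the iterator's operator->. It owns the
// wrapper, so the wrapper outlives the call that created it.
class WrappedPtr {
public:
    explicit WrappedPtr(const WrappedInt& v) : mValue(v) {}
    const WrappedInt* operator->() const { return &mValue; }
private:
    WrappedInt mValue;
};

// Iterator over raw ints that exposes each element as a WrappedInt.
class WrappingIter {
public:
    WrappingIter(const std::vector<int>& data, std::size_t pos)
        : mData(&data), mPos(pos) {}
    WrappingIter& operator++() { ++mPos; return *this; }
    bool operator!=(const WrappingIter& other) const { return mPos != other.mPos; }

    // Returns a class type, so the compiler applies operator-> again to the result.
    WrappedPtr operator->() const { return WrappedPtr(WrappedInt((*mData)[mPos])); }
private:
    const std::vector<int>* mData;
    std::size_t mPos;
};

// Counts even elements. it->isEven() expands to
// it.operator->().operator->()->isEven().
int countEven(const std::vector<int>& data) {
    int count = 0;
    for (WrappingIter it(data, 0), end(data, data.size()); it != end; ++it) {
        if (it->isEven()) ++count;
    }
    return count;
}
```

The temporary WrappedPtr lives until the end of the full expression it->isEven(), so the pointer it hands out stays valid for exactly as long as it is needed and nothing has to be freed by the caller.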

Error Handling

When I designed PerlStream/PerlValue, I wanted to preserve some of Perl’s flexibility in handling errors. By default, Perl will tolerate all sorts of abuse: array out-of-range errors, dereferencing undefined values, etc. However, you can add use strict to a script and run it with the -w option to get some amount of error checking. PerlStream/PerlValue does the same. By default, it will silently handle any error condition and do a somewhat sensible thing.

Alternatively, you can tell PerlStream what errors you want to know about. The possible classes of errors are evaluation errors (problems with the code sent to Perl), array out-of-bounds conditions, invalid hash lookups, and casting errors. Casting errors cover a number of problems (e.g., getting a scalar from a reference, getting a hash from a scalar, and getting an array from a hash). If an error is enabled, PerlStream and PerlValue will throw an exception when the error is encountered.

One thing that is not flagged as an error is converting a text string to an integer or double or converting a double to an integer. Since these are all Perl scalar values, Perl does these conversions somewhat automatically, and often these conversions are legitimate.

All exception classes inherit from the standard exception class, so you can catch all of them by catching the exception type. An example of catching an exception is shown on line 33 of Listing 1. Each exception has an error message that can be retrieved with the what member function. An example of an exception class, PerlEvalError, is shown on line 1 of Listing 3.
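
A rough sketch of opt-in error reporting along these lines follows. The names SketchError, RangeErrorSketch, ErrorMask, and lookup are all hypothetical and are not the classes in Listing 3. The sketch only illustrates the two ideas in this section: errors are silently tolerated unless their class is enabled, and every exception derives from std::exception so a single handler can catch it and read the message through what.

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical base class: stores a message and reports it through what().
class SketchError : public std::exception {
public:
    explicit SketchError(const std::string& msg) : mMsg(msg) {}
    const char* what() const noexcept override { return mMsg.c_str(); }
private:
    std::string mMsg;
};

// One derived class per error category; only the range category is shown.
class RangeErrorSketch : public SketchError {
public:
    RangeErrorSketch() : SketchError("array index out of range") {}
};

// Bit flags naming the error classes the caller wants reported (assumed names).
enum ErrorMask : unsigned { kNoErrors = 0u, kRangeErrors = 1u };

// Returns the element, or 0 for a bad index. Throws only if range errors are enabled.
int lookup(const std::vector<int>& data, std::size_t index, unsigned enabled) {
    if (index < data.size()) return data[index];
    if (enabled & kRangeErrors) throw RangeErrorSketch();
    return 0;  // tolerant default: do something somewhat sensible
}
```

A caller that passes kRangeErrors can wrap the call in try and catch (const std::exception& e), then print e.what(). A caller that passes kNoErrors gets 0 back for a bad index and never sees an exception.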

Conclusion

I presented two classes, PerlStream and PerlValue, that make dealing with the Perl API much easier without sacrificing any flexibility. PerlStream is conceptually simple to grasp and relatively easy to write. Such a paradigm could be easily adapted to APIs of other languages such as Tcl or Python.

Note

[1] Bjarne Stroustrup. The C++ Programming Language, Chapter 21, “Streams” (Addison-Wesley, 1997).

Robert Y. Seward graduated from Texas A&M University in 1982 with a BS in Electrical Engineering. He has worked at General Dynamics, Varo, and Convex Computer, and is currently employed with Hewlett-Packard in Ft. Collins, CO. He is currently a team leader for timing CAD tools in the microprocessor design labs. He has been programming in C for 15 years, C++ for 7 years, and Perl for 5 years.

