I use the OLE2T() and T2OLE() macros a lot in my code to convert between BSTRs and char* strings. Is there anywhere this could cause me problems?
If you've spent any time working with COM interfaces while writing C or C++ code, you've likely had to deal with BSTRs. A BSTR is a string data type that was developed for Visual Basic and differs in its storage characteristics from a standard, null-terminated char* string in C/C++. The primary difference is that a BSTR is preceded by a 4-byte integer that records the length of the data that follows (stored as a byte count that excludes the terminator), whereas a standard char* string has no length prefix. A BSTR also holds wide (OLECHAR) characters rather than single bytes. However, both a BSTR and a char* string are null terminated, which can make translating between the two rather quick as long as there are no embedded nulls in the string value itself. If there are, the leading 4-byte integer in a BSTR defines the actual length rather than the length to the first null.
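The layout difference is easier to see in code. What follows is a minimal, portable sketch and not the Windows API: make_prefixed, prefixed_length, and scanned_length are hypothetical names, and a byte vector stands in for the memory that SysAllocStringLen() would really allocate. It assumes 16-bit characters and a 4-byte prefix holding a byte count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a BSTR's memory block:
// [4-byte byte count][16-bit characters][16-bit null terminator].
std::vector<unsigned char> make_prefixed(const std::u16string& s) {
    const std::uint32_t byte_len =
        static_cast<std::uint32_t>(s.size() * sizeof(char16_t));
    // The vector is zero-filled, so the trailing terminator is already in place.
    std::vector<unsigned char> block(sizeof(byte_len) + byte_len + sizeof(char16_t), 0);
    std::memcpy(block.data(), &byte_len, sizeof(byte_len));
    if (byte_len != 0) {
        std::memcpy(block.data() + sizeof(byte_len), s.data(), byte_len);
    }
    return block;
}

// Length in characters, read from the prefix (embedded nulls are counted).
std::size_t prefixed_length(const std::vector<unsigned char>& block) {
    assert(block.size() >= sizeof(std::uint32_t));
    std::uint32_t byte_len = 0;
    std::memcpy(&byte_len, block.data(), sizeof(byte_len));
    return byte_len / sizeof(char16_t);
}

// Length in characters, found the C way: scan until the first null.
std::size_t scanned_length(const std::vector<unsigned char>& block) {
    assert(block.size() >= sizeof(std::uint32_t) + sizeof(char16_t));
    const unsigned char* data = block.data() + sizeof(std::uint32_t);
    std::size_t n = 0;
    for (;;) {
        char16_t c = 0;
        std::memcpy(&c, data + n * sizeof(char16_t), sizeof(c));
        if (c == 0) {
            return n;
        }
        ++n;
    }
}
```

For a five-character string with a null in the middle, the prefix reports all five characters while the scan stops at the embedded null. That gap is exactly what gets lost when a BSTR is treated as an ordinary null-terminated string.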
Since many COM interfaces are developed to work with both Visual Basic and C/C++, a developer will often define any string parameters in a COM interface to be a BSTR. In fact, if you are creating an OLE Automation (or COM Automation) interface (that is, one based on IDispatch), then you must use BSTRs to define strings. I found that using BSTRs in my COM interfaces became a habit long ago, and it's rare that I bother to define a string as a character array passed in two parameters, one for the data and one for the length of the data. I find BSTRs easier to use in this case. The downside is the constant need to convert between a BSTR and a more useful char* string. If you work with the STL string class in your code, you'll probably find yourself converting between them very often.
Fortunately, Microsoft developed some easy-to-use macros for MFC, which are also available for the Active Template Library (ATL), that deal with BSTRs. The most common are OLE2T() and T2OLE(). The first macro converts a BSTR to a Unicode or ANSI string (depending on whether the build defines UNICODE), and the second macro converts a Unicode or ANSI string to an OLE string (strictly speaking, a wide LPOLESTR rather than a length-prefixed BSTR). There are also OLE2A() and A2OLE() variants that can be used if you know you are working with ANSI strings. A companion macro named USES_CONVERSION must appear once in each function that uses OLE2T() or T2OLE(); it declares the local (stack) variables that the macros use during conversion.
One known place where these macros cannot be used is within a C++ catch() handler. This is due to the way the OLE2T() and T2OLE() (or OLE2A() and A2OLE()) macros are implemented. They all make use of the C run-time function _alloca to allocate space on the stack for the string conversion. Doing so has the advantage of automatic cleanup when exiting the method or function call. Using a heap-based allocator such as malloc would have required a cleanup function to be called when exiting the method, whereas using _alloca does not. This also makes it handy when unhandled exceptions force a sudden unwinding of the stack: a heap-based cleanup function wouldn't execute and you'd end up with orphaned memory blocks. However, the _alloca function has a limitation in that it cannot be used in any kind of exception handler (either Windows NT structured exception handlers or C++ catch statements). Here is an excerpt from the MSDN documentation on why:
"There are restrictions to explicitly calling _alloca in an exception handler (EH). EH routines that run on x86-class processors operate in their own memory frame: They perform their tasks in memory space that is not based on the current location of the stack pointer of the enclosing function."
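To make the stack-allocation technique concrete, here is a rough sketch of a conversion macro built on _alloca (or alloca on non-Microsoft compilers). It is not the ATL source: SKETCH_ALLOCA, SKETCH_W2A, sketch_narrow, and describe are hypothetical names, the statement-style macro is a simplification, and the narrowing step handles ASCII text only where the real macros call a Win32 conversion function.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

#if defined(_MSC_VER)
#include <malloc.h>
#define SKETCH_ALLOCA _alloca
#else
#include <alloca.h>
#define SKETCH_ALLOCA alloca
#endif

// Hypothetical helper: copies n wide characters (terminator included) into dst.
// Assumes ASCII-only input; this is a placeholder for a real code-page conversion.
inline char* sketch_narrow(char* dst, const wchar_t* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<char>(src[i]);
    }
    return dst;
}

// This has to be a macro rather than a function: the buffer must be carved out
// of the *caller's* stack frame so that it survives until the caller returns.
#define SKETCH_W2A(dst, src)                                                \
    do {                                                                    \
        const std::size_t sketch_n_ = std::wcslen(src) + 1;                 \
        char* sketch_buf_ = static_cast<char*>(SKETCH_ALLOCA(sketch_n_));   \
        (dst) = sketch_narrow(sketch_buf_, (src), sketch_n_);               \
    } while (0)

// Normal (non-handler) use: convert, then copy the result into an owning
// std::string before the frame that holds the buffer goes away.
std::string describe(const wchar_t* wide) {
    char* narrow = nullptr;
    SKETCH_W2A(narrow, wide);
    return std::string(narrow);
}
```

Nothing frees the buffer explicitly; it disappears when describe() returns. That is the convenience, and it is also why the pointer must never outlive the function and why the same allocation is unsafe inside an exception handler.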
The unfortunate thing is that the minimal documentation describing use of OLE2T() or T2OLE() does not mention this restriction. I ran into this problem myself when I attempted to call a COM interface inside a catch() handler that required the conversion of a BSTR to a char* string. When the code executed, it immediately crashed. At first, I thought I had a memory corruption problem in the heap or stack. After a lot of research, including poking around in the implementation of the macros, I stumbled across the warnings in the documentation for _alloca.
Another place where you might have trouble is if you attempt to pass the result of an OLE2T() conversion to a method/function that takes a char* pointer, and then find that the pointer is later used in a catch() handler in the callee.
In my own code, I found this set of rules to be a potential cause of needless bugs, so I extended the ATL CComBSTR class and added support for moving between BSTRs and char* strings within methods in the derived class. I also made sure that the return value of a BSTR to char* conversion was an STL string to ensure I didn't accidentally try to pass the original char* to a method that might use it improperly.
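The idea of returning an owning string can be sketched without any Windows headers. The class below is not the CComBSTR-derived class described above; WideText, ToStdString, and report_failure are hypothetical names, a std::wstring stands in for the wrapped BSTR, and the conversion assumes ASCII text (anything else becomes '?').

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a BSTR wrapper that owns its wide text.
class WideText {
public:
    explicit WideText(std::wstring text) : text_(std::move(text)) {}

    // Returns an owning std::string. The storage lives on the heap and is
    // managed by the string itself, so no _alloca buffer is involved.
    std::string ToStdString() const {
        std::string out;
        out.reserve(text_.size());
        for (wchar_t ch : text_) {
            const bool ascii = static_cast<unsigned long>(ch) < 0x80UL;
            out.push_back(ascii ? static_cast<char>(ch) : '?');
        }
        return out;
    }

private:
    std::wstring text_;
};

// The conversion happens inside a catch block, where a stack-allocating
// macro would not be safe to use.
std::string report_failure(const WideText& context) {
    try {
        throw std::runtime_error("simulated failure");
    } catch (const std::exception& e) {
        return context.ToStdString() + ": " + e.what();
    }
}
```

Because the caller only ever receives a std::string, there is no raw pointer into a stack frame to hand to another function by mistake. The cost is a heap allocation per conversion.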
Of course, another option would be to roll your own BSTR to char* conversion code that doesn't rely on _alloca to create memory space for the converted data. But for most developers, OLE2T() and T2OLE() are good enough even with the restrictions.
Mark M. Baker is the Chief of Research & Development at BNA Software located in Washington, D.C. He can be contacted at [email protected].