Tuesday, May 03, 2005

Generic programming in C++: An example with CComSafeArray and standard containers

For those of you developing in C++ with COM you’ll be familiar with the all-too-common SAFEARRAY. You’re probably also familiar with CComSafeArray which is a useful class to wrap up the rather raw SAFEARRAY and aid in memory management.

This blog entry isn’t meant as a primer for SAFEARRAY and CComSafeArray usage; there are plenty of good sites for that.

What I would like to talk about is using generic programming for day-to-day tasks.

What is generic programming? Good question, tough to answer. In C++, generic programming is writing code that works for many types by coding solutions based on concepts. A concept defines what a supplied abstract datatype is required to provide in order to get the job done.

Confusing? Initially it may well be. How about we look at a real (albeit trivial) example:



template <typename T>
T
max (T a, T b)
{
    return (b < a) ? a : b;
}


In the code above, max is a function that takes in two items (of the same type) and returns the larger of the two. The above code can be used for any T provided it supports the less-than operator (and can be copied, since the arguments and the result are passed by value). [In standard-library lingo T (the abstract datatype) must satisfy the concept of “Less Than Comparable”.]

This code is maximally reusable and, as such, is A Good Thing. It’s an example of generic programming because, in and of itself, max doesn’t know about T and thus works generically for any type (that supports the less-than operator). We can write max once and it’ll work for all types. Beautiful.
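To make that concrete, here’s a small sketch of the same template being used with several unrelated types. The names generic_max, Version and generic_max_examples are hypothetical, made up for this illustration; the function is renamed only so that unqualified calls can’t collide with std::max.

```cpp
#include <cassert>
#include <string>

// Same body as max above, renamed (hypothetically) to avoid clashing
// with std::max when standard headers are included.
template <typename T>
T generic_max(T a, T b)
{
    return (b < a) ? a : b;
}

// A hypothetical user-defined type. Supplying operator< is all it takes
// for generic_max to accept it.
struct Version
{
    int major_number;
    int minor_number;
};

bool operator<(const Version& lhs, const Version& rhs)
{
    if (lhs.major_number != rhs.major_number)
        return lhs.major_number < rhs.major_number;
    return lhs.minor_number < rhs.minor_number;
}

// One template, four unrelated types.
void generic_max_examples()
{
    assert(generic_max(3, 7) == 7);
    assert(generic_max(2.5, 1.5) == 2.5);
    assert(generic_max(std::string("apple"), std::string("pear")) == "pear");

    Version older = {1, 2};
    Version newer = {1, 10};
    assert(generic_max(older, newer).minor_number == 10);
}
```

Note that Version needed no changes to generic_max at all; it only had to meet the requirement the template places on T.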

In my experience, occasions to create generic code like max occur more frequently than most developers realize. I believe it’s because it takes a while to become accustomed to “seeing” generic solutions. For example, suppose our generic version of max didn’t exist but we needed a function to compare two integers. It would be all too easy to write:



int
max (int a, int b)
{
    return (b < a) ? a : b;
}


Then, when we have longs to compare, or floats, we could copy and paste the code and change “int” to “long” or “float”. Avoid such code! Duplicating code is evil: every copy has to be found and fixed whenever the logic changes. If you notice yourself copying and pasting code regularly it may be an indicator that you could benefit from generic coding techniques.

Here’s a real world example I came across the other day at work.

We use SAFEARRAYs and CComSafeArrays often. We also need to manipulate them in various ways, so we often convert them to standard containers, which have a wide variety of algorithms that we can employ. Usually we have to convert them back (to pass back over a COM API). Someone had noticed this and written a conversion function:



// The caller owns the returned SAFEARRAY and must free it with
// SafeArrayDestroy (not delete)
SAFEARRAY *
CreateSafeArrayFromVector(std::vector<long> vec)
{
    CComSafeArray<long> sa;
    for (std::vector<long>::iterator it = vec.begin(); it != vec.end(); ++it)
    {
        sa.Add(*it);
    }

    return sa.Detach();
}


In fact there was a whole bevy of these conversion functions. CreateSafeArrayFromVector that worked for floats. For doubles. For unsigned longs. We also had to work with std::lists so CreateSafeArrayFromList was born, duplicating all the vector functionality. Remember that much of the code is practically identical; we’re just adding things to a safe array! What a mess.

A single generic function could replace (and has replaced) it all:



template <typename FwdIterator>
CComSafeArray<typename std::iterator_traits<FwdIterator>::value_type>
CreateCComSafeArray(FwdIterator first, FwdIterator last)
{
    CComSafeArray<typename std::iterator_traits<FwdIterator>::value_type> sa;
    while (first != last)
    {
        sa.Add(*first++);
    }

    return sa;
}


Much cleaner! How is it used?



std::vector<long> handles;   // long here, but any element type CComSafeArray supports will do
//...
CComSafeArray<long> saHandles = CreateCComSafeArray(handles.begin(), handles.end());


You’ll notice a few changes. I chose to return a CComSafeArray rather than a raw SAFEARRAY because there’s no obvious reason not to: it increases type safety and decreases the chance of memory leaks. You’ll see that we pass in iterators to the function now. That allows abstraction from the specific container type. If you’re familiar with using the standard library you’ll notice that this is a fundamental tenet of its algorithms: all of them do the same thing. There’s an important point here: if you’re passing containers in your function declarations you’re not programming generically. Instead, create a templated function and expect iterators to be passed in. Tricky to grasp at first but so clean when you bend your brain around it (repeat after me, “algorithms operate on iterators!”).
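Here’s a minimal sketch of why iterators buy you that abstraction. CopyRangeToVector is a hypothetical stand-in with the same shape as CreateCComSafeArray; it fills a std::vector instead of a CComSafeArray purely so that it builds without ATL. The example function name is made up as well.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for CreateCComSafeArray: identical structure,
// but the result is a std::vector so no ATL headers are needed.
template <typename InputIterator>
std::vector<typename std::iterator_traits<InputIterator>::value_type>
CopyRangeToVector(InputIterator first, InputIterator last)
{
    std::vector<typename std::iterator_traits<InputIterator>::value_type> result;
    while (first != last)
    {
        result.push_back(*first++);
    }
    return result;
}

// The same function serves a vector, a list and a plain C array.
void copy_range_examples()
{
    std::vector<long> fromVector(3, 42L);   // three elements, each 42
    std::list<double> fromList;
    fromList.push_back(1.5);
    fromList.push_back(2.5);
    const int fromArray[] = {7, 8, 9, 10};

    std::vector<long> a = CopyRangeToVector(fromVector.begin(), fromVector.end());
    std::vector<double> b = CopyRangeToVector(fromList.begin(), fromList.end());
    // Plain pointers are iterators too.
    std::vector<int> c = CopyRangeToVector(fromArray, fromArray + 4);

    assert(a.size() == 3 && a[0] == 42L);
    assert(b.size() == 2 && b.back() == 2.5);
    assert(c.size() == 4 && c[3] == 10);
}
```

Had the function taken a std::vector<long> parameter instead, the list and the array would each have needed their own copy of it.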

You’ll also see something about iterator_traits. What the? Without going into too much detail (again, it’ll have to be another article), iterator_traits is supplied by the standard library to provide compile-time information about the supplied iterator. For an iterator type T, std::iterator_traits<T>::value_type resolves to the type of element that T points to. [There’s a whole bunch of other things that iterator_traits provides, such as difference_type and iterator_category.]
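As a small illustration of value_type at work, here’s a sketch of a summing function that never names its element type. SumRange and sum_range_examples are hypothetical names invented for this example; only std::iterator_traits comes from the standard library.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Sums a range. The accumulator's type is never spelled out; it is looked
// up at compile time from the iterator via iterator_traits.
template <typename InputIterator>
typename std::iterator_traits<InputIterator>::value_type
SumRange(InputIterator first, InputIterator last)
{
    typedef typename std::iterator_traits<InputIterator>::value_type value_type;
    value_type total = value_type();   // value-initialised: zero for arithmetic types
    while (first != last)
    {
        total += *first++;
    }
    return total;
}

void sum_range_examples()
{
    std::vector<long> longs;
    longs.push_back(1);
    longs.push_back(2);
    longs.push_back(3);

    std::list<double> doubles;
    doubles.push_back(0.5);
    doubles.push_back(0.25);

    // value_type is long for the first call and double for the second.
    assert(SumRange(longs.begin(), longs.end()) == 6L);
    assert(SumRange(doubles.begin(), doubles.end()) == 0.75);
}
```

This is the same trick CreateCComSafeArray uses to work out which CComSafeArray to build.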

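I’ve called the type “FwdIterator” because I require the iterator to support the post-increment operator, which the “Forward Iterator” concept guarantees. (Strictly speaking, a single pass like this needs only the weaker “Input Iterator” concept, which Forward Iterator refines.)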

But don’t get too wrapped up in the lingo. Read the code like this:

“Give me two iterators pointing to the start and end of a range”

“Create a CComSafeArray containing the same type of objects as the iterators point to”

“Add elements to the CComSafeArray from the iterators and return the CComSafeArray when done”

If you can grasp the high-level ideas then the syntax comes relatively easily.

In the future I’ll talk about the companion function, FillFromCComSafeArray. See if you can write it yourself!
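If you’d like a starting point, here’s one possible shape for the reverse direction. This is only a sketch, not the version promised above. It assumes the array type offers GetLowerBound(), GetUpperBound() and GetAt(index), which is how I understand ATL’s CComSafeArray to work for a one-dimensional array. FillFromSafeArray and FakeSafeArray are hypothetical names; the fake class exists only so the sketch builds without ATL.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Copies every element of a one-dimensional safe array to an output iterator.
// Assumption: SafeArrayT provides GetLowerBound(), GetUpperBound() and
// GetAt(index), and is already attached to a valid array.
template <typename SafeArrayT, typename OutputIterator>
OutputIterator FillFromSafeArray(const SafeArrayT& sa, OutputIterator out)
{
    // Safe arrays need not start at index 0, so honour the lower bound.
    for (long i = sa.GetLowerBound(); i <= sa.GetUpperBound(); ++i)
    {
        *out++ = sa.GetAt(i);
    }
    return out;
}

// Hypothetical stand-in exposing the three members the sketch relies on.
class FakeSafeArray
{
public:
    FakeSafeArray(long lowerBound, const std::vector<long>& items)
        : lowerBound_(lowerBound), items_(items) {}

    long GetLowerBound() const { return lowerBound_; }
    long GetUpperBound() const
    {
        return lowerBound_ + static_cast<long>(items_.size()) - 1;
    }
    const long& GetAt(long index) const
    {
        return items_[static_cast<std::size_t>(index - lowerBound_)];
    }

private:
    long lowerBound_;
    std::vector<long> items_;
};
```

Taking an output iterator keeps the destination generic as well: pass std::back_inserter(someVector) to append to a vector, or the same for a std::list.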

Relevant links:

SGI’s standard library documentation

Boost’s Generic Programming Techniques

David Musser on Generic Programming

2 comments:

Anonymous said...

Good article, Matty. Even me, who was told yesterday that I'm not a Software Engineer can understand it.

Anonymous said...

Matt, you just saved my life with this article my brother :D

Thanks a lot!