| September 6, 2010 12:00 AM PDT | |
by Neil Gower
neilg@vertexblast.com
To help you become familiar with the STL, we will take a look under the hood of several core STL components by extending them to solve game-specific problems. This article assumes you are familiar with C++ and have at least a basic knowledge of STL. [Josuttis99] is an excellent introduction and reference for readers who want brush up on the STL before diving into customization.
We'll be focusing on four of the core components of the STL: function objects, containers, iterators, and algorithms.
These components build on each other in layers. Containers aggregate objects, and are typically selected for their ability to manage storage (e.g. vector's contiguous memory guarantee) or to index data (e.g. map's efficient random access). Alongside containers are function objects, and layered on top are iterators. Function objects (also called functors) are classes used to represent individual functions. They are the object-oriented equivalent to function pointers. Similarly, iterators can be thought of as object-oriented pointers, providing a generic means of accessing the contents of containers. An object-oriented design book like [Gamma94] can provide more details about iterators in general.
Iterators and functors are combined to build generic algorithms, which are template functions that implement common programming patterns, such as "apply this function to each element in this range". Briefly, generic algorithms operate on a collection of objects using iterators, and are usually customized using one or more functors.
The Generic Approach
In the process of learning C++, generic techniques usually follow after object-oriented techniques, which can give the impression that the generic programming is just advanced object-oriented programming. Generic programming is actually quite a different paradigm from the standard object-oriented approach. The generic approach involves separating data from the processes that operate on it, rather than bundling them together as classes do. The advantage of this separation is it allows us to reuse processes that are independent of the structure of the data.
The tradeoff for this reusability is that some algorithms can be implemented more efficiently by exploiting specific knowledge of the data. For this very reason, you will find some STL containers have member functions that duplicate generic algorithms, such as remove_if()in std::list. When you need efficiency more than flexibility, the member function versions of these algorithms are the best choice. In practice, a good strategy is to start out using the generic options, and then replace generic components with more specialized ones as the program solidifies and the real bottlenecks are revealed.
Generic and object-oriented programming styles are not mutually exclusive. A hybrid approach will often yield the best results, applying each style where it provides the most benefit. For example, template classes and algorithms are often more efficient alternatives to inheritance, because the static polymorphism of templates is resolved at compile-time, rather than at runtime. On the other hand, implementing matrix operations in a class with tightly coupled math functions will typically deliver better performance than the generic equivalent.
For more information about the generic components that come standard in the STL, [Josuttis99] is a great reference. With a high-level understanding of generic programming, we can now take a look at how to implement our own generic components.
To illustrate how STL containers can be applied to game programming, we're going to take a common game data-structure, the octree, and build it STL-style. An octree is a spatial database. It allows you to do spatial queries like "find the objects which are close to this point" without testing against every object in the game world (see [Ginsburg00] for more about octrees).
Contained Elements
One of the most important properties of STL containers is that they are designed around value semantics. This means that they store the values of the things you put in them, as opposed to, for example, a pointer or reference to the original element. This is less significant for primitive types, including pointers, which we're used to passing around by value. Pass-by-value has the nice effect of making resource ownership unambiguous: whatever is inside the container belongs to the container. The downside to this approach is that it can result in a lot of object copies and assignments. A good habit to develop when working with STL containers is judicious use of references to avoid unnecessary copying.
Containers have a small set of requirements they impose on any class that they contain. First, to support value semantics, objects in a container must be efficiently copyable and assignable. [Meyers97] has many useful tips about implementing these operations correctly in C++. Container elements must also be destroyable, so private destructors are not allowed.
A few additional properties might be required, depending on the type of container in question. For searching and some other operations, the element class needs to have an equality test operator. Sorted containers need a function for determining the order of the elements. By default, operator< (less than) for this. Some container functions also require the elements to have a default constructor, like the vector(n) form of the std::vector constructor, which initializes a vector with n default constructed elements.
Container Specifications
Now that we know how the elements in our container work, we can design the container itself. We'll start by deciding on the container specifications it will implement. Whether we are using the standard containers or implementing our own, container specifications are an important concept for writing good STL code.
Container specifications are more flexible than an interface class in a header file. They specify the syntax and semantics of the interface, but for the most part leave concrete types up to the implementer. The 2003 C++ Standard [ISO14882] includes requirements for abstract containers: container, sequence, and associative container. The standard also defines concrete containers, such as vector and map. By writing to the more abstract specifications when possible, we can produce more flexible code.
Our octree will start by implementing the basic container requirements. This essentially means providing some typedefs, begin and end iterators, and size accessors. Of course, we need more functionality than this for an octree. An octree takes arbitrary 3D locations and maps each one to a set of objects at or near that location, which is similar to taking a key and hashing it to a storage bucket. In this sense, an octree is more like a hash table than it is like an STL sequence or associative container, because the elements are part of an unordered collection. Although the C++ Standard does not include a hash table container, the standards committee has since written up an extension that does include hash tables, in Technical Report 1 [ISO19768]. We'll use the TR1 "Unordered Associative Container" requirements as the basis for our spatial database interface to the octree.
Template Parameters
Declaring the template parameters is a good place to start implementing our new container. Since we are building an associative container, we will need template parameters for the type of object to store (T) and for the key type (Key). Although we can assume that keys in an octree will always be 3D locations, different clients may use different classes for representing those locations.
Another notable aspect of the template declaration is the scalar parameter MaxObjectsPerNode. Making this a client-specified compile time constant is helpful for optimizing the octree's internal storage design.
The next two template parameters are functors for customizing the octree implementation. A LocationFunctor is used to get a Key from a container element. The default implementation assumes the object has a location member. If we were storing pointers to objects or objects that exposed the location through an accessor function, the client would simply provide an appropriate functor for that type of object.
The octree also uses a functor for the test of whether an object is contained within a given bounding box. The default functor uses the bounding box classes' own simple point test, but this can be easily replaced with more complicated tests based on the game's collision system.
Finally, we have a template parameter for a custom allocator. This is particularly useful in containers used for games, since careful memory management can often make or break game performance. See [Isensee02] for a selection of custom C++ allocators and a guide to making your own.
Octree Specification
Most of the TR1 hash table requirements map nicely onto the octree interface, although a few deviations turn out to be necessary. The primary functions for accessing the octree are:
Storage
Next, we need to decide on how to store the octree's contents. Like most tree-based data structures, an octree is naturally represented by a hierarchy of nodes. Each octree node contains:
Recall that the octree is also responsible for storing and providing access to the elements' values. Using a vector has some nice properties like guaranteed contiguous storage, and gives us random access iterators over the full set of elements in the container. Based on the decision to use a vector, container indices are the best option for referencing the elements of the vector from the octree nodes. Pointers and iterators are unsuitable because both are invalidated by internal vector reallocations, such as when the vector expands to hold more elements.
To recap, the child nodes are stored in-place, and elements are referenced by index. The easiest container to use for both of these collections is std::vector, because it is compact and can efficiently append new elements.
Octree Implementation
As you look through the full container specifications in the C++ Standard, you will notice a lot of similar functions need to be implemented. For example, there are five variants of the erase() function in the octree. Fortunately, most of these functions are just different ways of accessing the same underlying functionality. The core octree implementation is really three private functions: _octreeInsert(), _octreeFindLeaf(), and _octreeRemove(). These functions are reused throughout the public interface to deliver the majority of the required STL functionality.
The FixedArray container in the sample code on the CD-ROM is one solution to this problem. This class takes advantage of the fact that we know the maximum number of child nodes and maximum number of elements that can be stored in an octree node, and that one of these containers will always be empty in each node. FixedArray minimizes the size of the container when it is empty to that of a single pointer. To eliminate any extra bookkeeping data in the container, unused elements are marked with a sentinel value whose value can be specified using a template parameter to avoid collisions. This means a little more work to skip over the unused elements while iterating through the container, but the cost is marginal when dealing with small containers, and zero when the container is full.
To make this solution work, we need some code for skipping over sentinel values in the container. However, this is an implementation detail that should not concern the user of the container. This is a perfect problem for iterators.
This iterator is tightly coupled to the container, so it is declared as an inner class of FixedArray. One ‘gotcha' for implementing STL iterator classes is that we end up doing it twice – once for the const iterator, and once for the non-const. However, the differences between the two classes are usually quite minor, they mostly have to do with the types of the parameters and return values. Austern presents a nice solution that avoids redundant code for the two iterator variants in [Austern01]. The essence of this method is the ChooseType template, which selects one of two types based on the value of a constant bool. So even though we might vary half a dozen different types between the const and non-const iterator classes, we'll only need to expose a single template parameter. For example, in the listing below we use this technique to declare a type named reference that alternates between a reference to a const and a non-const element in the container, depending on the isConst template parameter:
These functions are pretty straightforward to implement for FixedArrayIterator. The most interesting one is operator++. Whenever we advance the iterator, we skip over any sentinel values to the next real element, or to the end of the container. The end iterator is simply the one-past-the-end position.
The last thing we need for our iterator class is a set of standard typedefs that are used by STL's generic algorithms to query the capabilities of the iterator class. For the FixedArrayIterator>, they look like this:
Iterators have helped us to implement a container that is frugal with memory consumption for use in our octree nodes. At the same time, the octree code itself is oblivious to the underlying complexity, and we can swap between our FixedArray and std::vector by changing two typedefs in the node declaration.
In the STL, functors are intended to be passed by value, which we need to keep in mind when designing our own. In particular, this means functors should avoid keeping internal state data, which can be enforced in code by declaring operator() to be const. The reason for this restriction is we can't make many assumptions about how our functor will be used by other components (e.g. when it will be copied, how many times it will be applied), which makes it very difficult to design stateful functors that behave correctly.
The general form of the algorithm is to take a collection of objects in the world and test each of them for collisions against a list of possible colliding objects. Each colliding pair goes into the results list. In the worst case, the candidates for each object are all of the other objects in the world, resulting in an
algorithm which is not very scalable. Fortunately, this is just the kind of thing octrees are good at optimizing.
Our generic collision algorithm takes two types of iterators as input: a pair of iterators from the source container and one iterator for the destination container. The other two template parameters are functors that allow us to customize the non-generic parts of this algorithm: how the list of collision candidates is generated, and how the collision test itself is performed.
Although the template parameters and typedefs take up a lot of space, the actual code is relatively simple. The outer loop uses the source iterators to step through the input objects and generate a list of possible colliders for each one, using the CandidatesFunctor. This functor takes an object and returns a begin/end pair of iterators representing the set of collision candidates. The inner loop then tests each candidate using the CollisionTestFunctor, and when a collision is detected, adds the two objects as a CollisionPair to the output container.
In this code, we can see the typedefs from the iterator classes coming into play. For example, the type of object we are processing is nested inside the source iterator in SrcIterator::value_type.
To make the discussion more concrete, let's take a look at how we actually apply this generic algorithm. A typical call to generate_collisions() would look like:
In the code listing above, we used a naive CandidatesFunctor that just returns the source container's begin() and end() iterators. This means that every object will be checked against every other object for collisions. We can reduce the complexity of this algorithm to
by replacing this functor with one that uses the octree to efficiently determine the objects in the nearby region. Specifically, we use the find_range() function, which returns a pair of local_iterators over the range of elements that are in octants within a given radius. As long as the radius is fairly small relative to the size of the octants, find_range() will only visit a small subset of the octree while computing this range, which is the source of the efficiency gain.
This small change dramatically improves the scalability of this collision function. We can continue to optimize the algorithm by trying different containers and candidate functors, which use additional techniques to improve efficiency. The algorithm can also be easily extended to handle new collision models by supplying different functors. All of this is possible without virtual functions or inheritance, and without touching the generic algorithm code.
The payoff for the effort to be STL compatible is that we get flexible, efficient code that is consistent with the standard library. This enables new programmers who know their way around C++ to quickly get up to speed with in-house code. Mixing-and-matching STL, in-house, and third-party custom code becomes relatively easy. By extension, we can use generic coding style to gain flexibility and reuse, improving coding productivity without resorting to scripting and interpreted code. The flexibility inherent in generic code also provides us with a wide array of options for optimization, which have minimal impact on the surrounding code. In these ways, STL used well is a valuable part of the game programmer's toolbox.
[Boer00] Boer, J., "Using the STL in Game Programming," Game Programming Gems, Charles River Media, 2000: pp. 41-55.
[Boost09] "Boost C++ Libraries," available online at http://www.boost.org, September 1, 2009.
[Gamma94] Gamma, E., Helm, R., Johnson, R., Vlissides, J., "Design Patterns: Elements of Reusable Object-Oriented Software," Addison-Wesley 1994.
[Ginsburg00] "Octree Construction," Game Programming Gems, Charles River Media, 2000: pp. 439-443.
[Isensee02] Isensee, P., "Custom STL Allocators," Game Programming Gems 3, Charles River Media, 2002: pp. 49-58.
[ISO14882] ISO/IEC 14882:2003 "Programming languages - C++".
[ISO19768] ISO/IEC TR 19768:2007 "Information technology - Programming languages - Technical Report on C++ Library Extensions".
[Josuttis99] Josuttis, N. M., "The C++ Standard Library: A Tutorial and Reference," Addison-Wesley 1999.
[Meyers98] Meyers, S., "Effective C++," Addison-Wesley 1998.
neilg@vertexblast.com
Download Article
Download Extending STL for Games [PDF 405KB]Introduction
Despite widespread usage of the Standard Template Library (STL) in the C++ development community, some game programmers still shy away from it. The STL's use of templates and generic programming style can be off-putting to developers used to working with low-level code and within the limits of previous generations of hardware. The state of current compilers and STL implementations has advanced considerably, as have the capabilities of modern game platforms. Nonetheless, to fully experience the benefits of the STL, it is necessary to know not only how to use it, but also how to customize and extend it to meet the particular needs of game development projects.To help you become familiar with the STL, we will take a look under the hood of several core STL components by extending them to solve game-specific problems. This article assumes you are familiar with C++ and have at least a basic knowledge of STL. [Josuttis99] is an excellent introduction and reference for readers who want brush up on the STL before diving into customization.
Background
STL OverviewWe'll be focusing on four of the core components of the STL: function objects, containers, iterators, and algorithms.
Figure 1. A layered view of the STL.
These components build on each other in layers. Containers aggregate objects, and are typically selected for their ability to manage storage (e.g. vector's contiguous memory guarantee) or to index data (e.g. map's efficient random access). Alongside containers are function objects, and layered on top are iterators. Function objects (also called functors) are classes used to represent individual functions. They are the object-oriented equivalent to function pointers. Similarly, iterators can be thought of as object-oriented pointers, providing a generic means of accessing the contents of containers. An object-oriented design book like [Gamma94] can provide more details about iterators in general.
Iterators and functors are combined to build generic algorithms, which are template functions that implement common programming patterns, such as "apply this function to each element in this range". Briefly, generic algorithms operate on a collection of objects using iterators, and are usually customized using one or more functors.
The Generic Approach
In the process of learning C++, generic techniques usually follow after object-oriented techniques, which can give the impression that the generic programming is just advanced object-oriented programming. Generic programming is actually quite a different paradigm from the standard object-oriented approach. The generic approach involves separating data from the processes that operate on it, rather than bundling them together as classes do. The advantage of this separation is it allows us to reuse processes that are independent of the structure of the data.
The tradeoff for this reusability is that some algorithms can be implemented more efficiently by exploiting specific knowledge of the data. For this very reason, you will find some STL containers have member functions that duplicate generic algorithms, such as remove_if()in std::list. When you need efficiency more than flexibility, the member function versions of these algorithms are the best choice. In practice, a good strategy is to start out using the generic options, and then replace generic components with more specialized ones as the program solidifies and the real bottlenecks are revealed.
Generic and object-oriented programming styles are not mutually exclusive. A hybrid approach will often yield the best results, applying each style where it provides the most benefit. For example, template classes and algorithms are often more efficient alternatives to inheritance, because the static polymorphism of templates is resolved at compile-time, rather than at runtime. On the other hand, implementing matrix operations in a class with tightly coupled math functions will typically deliver better performance than the generic equivalent.
For more information about the generic components that come standard in the STL, [Josuttis99] is a great reference. With a high-level understanding of generic programming, we can now take a look at how to implement our own generic components.
Custom STL Containers
Creating our own containers that are STL compatible allows us to combine higher level STL components like algorithms and iterator adaptors with our custom components. Due to the nature of template classes, implementing STL specifications can be done gradually. Unlike classes that partially implement an interface (which won't compile), partially compatible containers can still be useful, because the compiler will only compile functions that the code actually uses. For the same reason, it's a good idea to have some unit testing in place to ensure that every function gets compiled and exercised while developing template code.To illustrate how STL containers can be applied to game programming, we're going to take a common game data-structure, the octree, and build it STL-style. An octree is a spatial database. It allows you to do spatial queries like "find the objects which are close to this point" without testing against every object in the game world (see [Ginsburg00] for more about octrees).
Figure 2. An octree partitions space for efficient searching.
Contained Elements
One of the most important properties of STL containers is that they are designed around value semantics. This means that they store the values of the things you put in them, as opposed to, for example, a pointer or reference to the original element. This is less significant for primitive types, including pointers, which we're used to passing around by value. Pass-by-value has the nice effect of making resource ownership unambiguous: whatever is inside the container belongs to the container. The downside to this approach is that it can result in a lot of object copies and assignments. A good habit to develop when working with STL containers is judicious use of references to avoid unnecessary copying.
Containers have a small set of requirements they impose on any class that they contain. First, to support value semantics, objects in a container must be efficiently copyable and assignable. [Meyers97] has many useful tips about implementing these operations correctly in C++. Container elements must also be destroyable, so private destructors are not allowed.
A few additional properties might be required, depending on the type of container in question. For searching and some other operations, the element class needs to have an equality test operator. Sorted containers need a function for determining the order of the elements. By default, operator< (less than) for this. Some container functions also require the elements to have a default constructor, like the vector(n) form of the std::vector constructor, which initializes a vector with n default constructed elements.
Container Specifications
Now that we know how the elements in our container work, we can design the container itself. We'll start by deciding on the container specifications it will implement. Whether we are using the standard containers or implementing our own, container specifications are an important concept for writing good STL code.
Container specifications are more flexible than an interface class in a header file. They specify the syntax and semantics of the interface, but for the most part leave concrete types up to the implementer. The 2003 C++ Standard [ISO14882] includes requirements for abstract containers: container, sequence, and associative container. The standard also defines concrete containers, such as vector and map. By writing to the more abstract specifications when possible, we can produce more flexible code.
Our octree will start by implementing the basic container requirements. This essentially means providing some typedefs, begin and end iterators, and size accessors. Of course, we need more functionality than this for an octree. An octree takes arbitrary 3D locations and maps each one to a set of objects at or near that location, which is similar to taking a key and hashing it to a storage bucket. In this sense, an octree is more like a hash table than it is like an STL sequence or associative container, because the elements are part of an unordered collection. Although the C++ Standard does not include a hash table container, the standards committee has since written up an extension that does include hash tables, in Technical Report 1 [ISO19768]. We'll use the TR1 "Unordered Associative Container" requirements as the basis for our spatial database interface to the octree.
Template Parameters
Declaring the template parameters is a good place to start implementing our new container. Since we are building an associative container, we will need template parameters for the type of object to store (T) and for the key type (Key). Although we can assume that keys in an octree will always be 3D locations, different clients may use different classes for representing those locations.
template < typename T,The class representing location vectors can be anything with x,y,z member variables that provides a constructor that takes x,y,z parameters. An inheritance-based solution to support different location classes would involve defining an interface class with virtual functions to access the coordinates. Our template version is more efficient and does not require the client class to be modified to inherit from the interface class.
typename Key,
int MaxObjectsPerNode = 8,
typename LocationFunctor
= DefaultLocationFunctor<T, Key>,
typename ContainmentFunctor
= ObjectContainedIn<T, Key, LocationFunctor>
typename Allocator = std::allocator<T> >
class Octree { ... };
Another notable aspect of the template declaration is the scalar parameter MaxObjectsPerNode. Making this a client-specified compile time constant is helpful for optimizing the octree's internal storage design.
The next two template parameters are functors for customizing the octree implementation. A LocationFunctor is used to get a Key from a container element. The default implementation assumes the object has a location member. If we were storing pointers to objects or objects that exposed the location through an accessor function, the client would simply provide an appropriate functor for that type of object.
The octree also uses a functor for the test of whether an object is contained within a given bounding box. The default functor uses the bounding box classes' own simple point test, but this can be easily replaced with more complicated tests based on the game's collision system.
Finally, we have a template parameter for a custom allocator. This is particularly useful in containers used for games, since careful memory management can often make or break game performance. See [Isensee02] for a selection of custom C++ allocators and a guide to making your own.
Octree Specification
Most of the TR1 hash table requirements map nicely onto the octree interface, although a few deviations turn out to be necessary. The primary functions for accessing the octree are:
iterator insert( const T& t );Our deviations from the hash table requirements are all related to bucket management. The user of a hash table is expected to be fairly involved in the process of balancing the distribution of elements across the buckets. In contrast, an octree automatically subdivides its nodes as more elements are inserted. Although most of the bucket-related interface is omitted, we use the local_iterator concept from the hash table to expose octant contents to clients. This is done by adapting the overloaded local_iterator begin()/end() functions to take a key_type parameter instead of a bucket index.
iterator erase( iterator pos );
iterator find( key_type k );
iterator begin();
iterator end();
local_iterator begin( const key_type& k );
local_iterator end( const key_type& k );
Storage
Next, we need to decide on how to store the octree's contents. Like most tree-based data structures, an octree is naturally represented by a hierarchy of nodes. Each octree node contains:
- a bounding box representing the volume the node covers
- a collection of child nodes
- a collection of the elements located within the bounding box
Recall that the octree is also responsible for storing and providing access to the elements' values. Using a vector has some nice properties like guaranteed contiguous storage, and gives us random access iterators over the full set of elements in the container. Based on the decision to use a vector, container indices are the best option for referencing the elements of the vector from the octree nodes. Pointers and iterators are unsuitable because both are invalidated by internal vector reallocations, such as when the vector expands to hold more elements.
To recap, the child nodes are stored in-place, and elements are referenced by index. The easiest container to use for both of these collections is std::vector, because it is compact and can efficiently append new elements.
Octree Implementation
As you look through the full container specifications in the C++ Standard, you will notice a lot of similar functions need to be implemented. For example, there are five variants of the erase() function in the octree. Fortunately, most of these functions are just different ways of accessing the same underlying functionality. The core octree implementation is really three private functions: _octreeInsert(), _octreeFindLeaf(), and _octreeRemove(). These functions are reused throughout the public interface to deliver the majority of the required STL functionality.
Custom Iterators
Vector is a good default container for many uses, because it is essentially a C array wrapped in an STL interface, bundled with some size management code. This means that unlike most other STL containers, a vector doesn't incur any per-object storage overhead. The downside to using vector in the octree nodes is that there is some storage overhead for each vector itself. For example, compiled with Xcode* on an Intel® processor-based Mac*, the size of an empty vector was 12 bytes; on Windows* using Microsoft Visual C++* 9.0, it was 24 bytes. For large octrees, this per node storage overhead can become a significant concern.The FixedArray container in the sample code on the CD-ROM is one solution to this problem. This class takes advantage of the fact that we know the maximum number of child nodes and maximum number of elements that can be stored in an octree node, and that one of these containers will always be empty in each node. FixedArray minimizes the size of the container when it is empty to that of a single pointer. To eliminate any extra bookkeeping data in the container, unused elements are marked with a sentinel value whose value can be specified using a template parameter to avoid collisions. This means a little more work to skip over the unused elements while iterating through the container, but the cost is marginal when dealing with small containers, and zero when the container is full.
To make this solution work, we need some code for skipping over sentinel values in the container. However, this is an implementation detail that should not concern the user of the container. This is a perfect problem for iterators.
This iterator is tightly coupled to the container, so it is declared as an inner class of FixedArray. One ‘gotcha' for implementing STL iterator classes is that we end up doing it twice – once for the const iterator, and once for the non-const. However, the differences between the two classes are usually quite minor, they mostly have to do with the types of the parameters and return values. Austern presents a nice solution that avoids redundant code for the two iterator variants in [Austern01]. The essence of this method is the ChooseType template, which selects one of two types based on the value of a constant bool. So even though we might vary half a dozen different types between the const and non-const iterator classes, we'll only need to expose a single template parameter. For example, in the listing below we use this technique to declare a type named reference that alternates between a reference to a const and a non-const element in the container, depending on the isConst template parameter:
template <bool isConst = false>Like containers, iterators have specifications in the C++ Standard that define how they should behave. The most basic but still useful iterators implement the forward iterator specification. The main functions a forward iterator provides are operator*(), operator->() and operator++().
class FixedArrayIterator {
typedef typename Choose<isConst, const T&, T&>::type reference;
// reference is a reference to a const object when isConst is true,
// and a reference to a non-const object with isConst is false...
refererence operator*() { ... }
// iterator code continues...
These functions are pretty straightforward to implement for FixedArrayIterator. The most interesting one is operator++. Whenever we advance the iterator, we skip over any sentinel values to the next real element, or to the end of the container. The end iterator is simply the one-past-the-end position.
The last thing we need for our iterator class is a set of standard typedefs that are used by STL's generic algorithms to query the capabilities of the iterator class. For the FixedArrayIterator>, they look like this:
typedef typename FixedArray::value_type value_type;The only typedef that's a little unusual is iterator_category. The standard library provides a set of these iterator tags which allow template authors to parameterize their code in terms of the capabilities of the iterators they are given. As iterator authors, we just need to pick the tag that best matches the iterator specifications we are implementing.
typedef std::ptrdiff_t difference_type;
typedef std::forward_iterator_tag iterator_category;
typedef typename Choose&;lt;isConst, const T*, T*>::type pointer;
typedef typename Choose;lt;isConst, const T&, T&>::type reference;
Iterators have helped us to implement a container that is frugal with memory consumption for use in our octree nodes. At the same time, the octree code itself is oblivious to the underlying complexity, and we can swap between our FixedArray and std::vector by changing two typedefs in the node declaration.
Custom Function Objects
Functors in C++ are created from classes that override the operator() member function. For maximum compatibility with other STL components, functors should inherit from unary_function (one parameter) or binary_function (two parameters) when applicable. For example, a unary functor named MyFunctor, which takes an int parameter and returns a value of ReturnType is shown in the listing below:// define MyFunctor:Functors have some neat properties that make them more powerful than function pointers. Because they are classes, different functors have different types, which is useful for instantiating other templates. Another useful property is that functors can have two kinds of parameters, those bound when the functor is instantiated (i.e. passed to the constructor), and those bound when the functor is invoked (i.e. the parameters to operator()). This provides a kind of partial function application or currying. Functors are also often faster than function pointers because they can be inlined by the compiler.
class MyFunctor: std::unary_function<int,ReturnType> {
public:
ReturnType operator() ( int sampleParam ) const { ... }
};
// use it like this:
MyFunctor f;
ReturnType result = f(42);
In the STL, functors are intended to be passed by value, which we need to keep in mind when designing our own. In particular, this means functors should avoid keeping internal state data, which can be enforced in code by declaring operator() to be const. The reason for this restriction is we can't make many assumptions about how our functor will be used by other components (e.g. when it will be copied, how many times it will be applied), which makes it very difficult to design stateful functors that behave correctly.
Custom Algorithms
STL algorithms operate over a range of objects specified using iterators, and output their results by overwriting the elements of a container using an output iterator. Typical STL algorithms handle operations like sorting and partitioning, searching, and re-ordering. For something more specific to games, we'll look at generating a list of colliding objects from a container of game objects.The general form of the algorithm is to take a collection of objects in the world and test each of them for collisions against a list of possible colliding objects. Each colliding pair goes into the results list. In the worst case, the candidates for each object are all of the other objects in the world, resulting in an
Our generic collision algorithm takes two types of iterators as input: a pair of iterators from the source container and one iterator for the destination container. The other two template parameters are functors that allow us to customize the non-generic parts of this algorithm: how the list of collision candidates is generated, and how the collision test itself is performed.
template <typename SrcIterator,One nice thing about working with template functions, as opposed to template classes, is that the types are deduced automatically by the compiler when the function is called. This means we don't have to explicitly instantiate the template parameters on a template function.
typename DestIterator,
typename CandidatesFunctor,
typename CollisionTestFunctor>
void generate_collisions( SrcIterator srcBegin,
SrcIterator srcEnd,
DestIterator destBegin,
CandidatesFunctor getCandidates,
CollisionTestFunctor testCollision ) {
typedef typename SrcIterator::value_type CollisionObject;
typedef typename CandidatesFunctor::ResultsIterator
ResultsIterator;
typename CandidatesFunctor::Results candidates;
for ( SrcIterator i = srcBegin; i != srcEnd; ++i ) {
const CollisionObject& collidee = *i;
candidates = getCandidates( collidee );
for ( ResultsIterator j = candidates.first;
j != candidates.second;
++j ) {
const CollisionObject& collider = *j;
if ( collidee != collider
&& &collidee < &collider
&& testCollision(collidee, collider) ) {
// collision detected!
*destBegin++ = std::make_pair(collidee, collider);
}
}
}
}
Although the template parameters and typedefs take up a lot of space, the actual code is relatively simple. The outer loop uses the source iterators to step through the input objects and generate a list of possible colliders for each one, using the CandidatesFunctor. This functor takes an object and returns a begin/end pair of iterators representing the set of collision candidates. The inner loop then tests each candidate using the CollisionTestFunctor, and when a collision is detected, adds the two objects as a CollisionPair to the output container.
In this code, we can see the typedefs from the iterator classes coming into play. For example, the type of object we are processing is nested inside the source iterator in SrcIterator::value_type.
To make the discussion more concrete, let's take a look at how we actually apply this generic algorithm. A typical call to generate_collisions() would look like:
// given a container "myGameObjects" of GameObject instances and anThe back_inserter is something we haven't looked at yet; it is an iterator adapter. It takes a standard container and returns an iterator that uses the container's push_back() function to insert new items into the container each time the iterator is used to write an element. The back_inserter lets the output container manage the storage on demand, rather than requiring the caller to over-allocate a container for the worst case.
// empty container "results" of pairs of GameObjects...
generate_collisions( myGameObjects.begin(), myGameObjects.end(),
back_inserter(results),
StdContainerCandidatesFunctor<GameObjContainer>
(myGameObjects),
SphereCollisionFunctor<GameObject>(5) );
In the code listing above, we used a naive CandidatesFunctor that just returns the source container's begin() and end() iterators. This means that every object will be checked against every other object for collisions. We can reduce the complexity of this algorithm to
This small change dramatically improves the scalability of this collision function. We can continue to optimize the algorithm by trying different containers and candidate functors, which use additional techniques to improve efficiency. The algorithm can also be easily extended to handle new collision models by supplying different functors. All of this is possible without virtual functions or inheritance, and without touching the generic algorithm code.
Conclusion
In this article we looked at how to implement custom functors, containers, iterators, and algorithms that are compatible with the C++ Standard Library. The finished code may look daunting at first glance, but in fact a lot of custom STL work is just boilerplate code. Over time you will build up a library of examples that save implementation time and effort. Another way to reduce the amount of boilerplate code is to use the Boost library [Boost09], which contains frameworks and base classes that implement much of the common functionality of STL components, distilling the code we have to write down to a smaller set of essential functions.The payoff for the effort to be STL compatible is that we get flexible, efficient code that is consistent with the standard library. This enables new programmers who know their way around C++ to quickly get up to speed with in-house code. Mixing-and-matching STL, in-house, and third-party custom code becomes relatively easy. By extension, we can use generic coding style to gain flexibility and reuse, improving coding productivity without resorting to scripting and interpreted code. The flexibility inherent in generic code also provides us with a wide array of options for optimization, which have minimal impact on the surrounding code. In these ways, STL used well is a valuable part of the game programmer's toolbox.
References
[Austern01] Austern, M., "The Standard Librarian: Defining Iterators and Const Iterators," C/C++ Users Journal, Vol. 19 (January 2001): pp. 74-79.[Boer00] Boer, J., "Using the STL in Game Programming," Game Programming Gems, Charles River Media, 2000: pp. 41-55.
[Boost09] "Boost C++ Libraries," available online at http://www.boost.org, September 1, 2009.
[Gamma94] Gamma, E., Helm, R., Johnson, R., Vlissides, J., "Design Patterns: Elements of Reusable Object-Oriented Software," Addison-Wesley 1994.
[Ginsburg00] "Octree Construction," Game Programming Gems, Charles River Media, 2000: pp. 439-443.
[Isensee02] Isensee, P., "Custom STL Allocators," Game Programming Gems 3, Charles River Media, 2002: pp. 49-58.
[ISO14882] ISO/IEC 14882:2003 "Programming languages - C++".
[ISO19768] ISO/IEC TR 19768:2007 "Information technology - Programming languages - Technical Report on C++ Library Extensions".
[Josuttis99] Josuttis, N. M., "The C++ Standard Library: A Tutorial and Reference," Addison-Wesley 1999.
[Meyers98] Meyers, S., "Effective C++," Addison-Wesley 1998.
About the Author
Neil has worked on games and related entertainment technologies since 1998. After completing his degree in Computer Science and AI at the University of Toronto, the challenges of real world programming focused his attention on software engineering, while his interests in code optimization and eye candy led him to computer graphics. He continues to stay in touch with his academic roots as an ACM Professional and SIGGRAPH member. Today, Neil runs Vertex Blast, an independent software company supplying programming skills to the entertainment industry.For more complete information about compiler optimizations, see our Optimization Notice.
Comments (1) 
Trackbacks (1)
-
Twitter Trackbacks for
Extending STL for Games - Intel® Software Network
[intel.com]
on Topsy.com
October 1, 2010 2:11 AM PDT
Leave a comment 
To obtain technical support, please go to Software Support.

hardcoder
5
Implementing trees in C++ is a tricky business. Is there any way to see the full source code?