by Kim Pallister
Developing for the current-generation PC platform rewards software developers working in 3D with both blessings and curses. On the positive side, PCs are delivering rocket-fueled levels of performance, allowing developers to accomplish amazing feats on inexpensive consumer machines. The downside is that developers have to code their applications to scale across hardware with an increasingly wide range of performance levels and significantly different feature sets.
The Microsoft DirectX7 API offers programmers the opportunity to tap into some fairly remarkable graphics capabilities, including the ability to render primitives to texture surfaces. This article explores some of the possibilities and explains the techniques that can be used to take full advantage of this capability. To do this successfully in a real application, you need to exploit the latest advances in today's graphics chips for effects processing, while gracefully handling systems that lack these advanced features by either scaling down the effect or performing it at a slower speed.
To work around hardware that can't render to texture surfaces, an application first has to detect whether that support is present. To this end, this article introduces an approach to detecting support for rendering to texture surfaces, along with ways to achieve similar functionality on hardware that doesn't directly support it. Finally, a number of examples of special effects that can be achieved using this capability are presented.
Rendering to Texture Surfaces
Rendering to texture surfaces with today's 3D hardware provides the opportunity for some dazzling effects, including reflection and refraction, soft-edged shadows, mip map generation, motion blur, TV-style transitions, and much more.
There are several ways to render to textures on current generation 3D hardware. Applications aiming to exploit this capability will need to detect which method can be used on the end user's system, and provide codepaths to handle the different methods. (Not all paths need be handled. An application could choose to only support methods A and D, as I have done in my example. The executable and source for this can be found in the Render to texture demos zip).
The different methods are summed up in Table One:

|Method|Description|Pros|Cons|
|Method A|Use the hardware to render to a video memory surface that is both a render target and a texture surface|Fastest method, least memory used|Only supported by some hardware|
|Method B|Render to a surface that is a render target but not a texture, then blit the surface to a texture surface of the same size and format|Can be fast, no system memory copy|Not supported by all hardware; uses extra texture memory|
|Method C|Render to the back buffer, and blit a subsection of the back buffer to a texture surface|Can be fast, no system memory copy|Not supported by all hardware; can break concurrency|
|Method D|Render to the back buffer, blit a subsection of the back buffer to a system memory surface, and then blit the system memory surface to a video memory texture|Works on nearly all hardware|Slow|
|Method E|Render to a surface in system memory using software, and then blit that to a texture the graphics hardware can use|Works on all hardware|Slowest method; limits the application to low resolution textures|
The first method is the fastest and most efficient approach when it is supported, but it is not supported by all hardware. The last method will work on all systems, but it places the rendering burden on the CPU, creating processor overhead when we would prefer to have the graphics hardware do the work. Additionally, Methods B and D require extra surfaces to be allocated.
Render-to-Texture and the DirectX7 Interfaces
Under DirectX6, some graphics hardware could render to texture surfaces, but application developers were unable to take advantage of the feature. The way the surface creation flags were implemented by certain hardware vendors and interpreted by some application developers created a Catch-22: even though the driver writers had implemented the flags incorrectly, fixing them would have risked breaking functional applications already on the market.
The solution was to make a fresh start with DirectX7 and, this time, interpret the flags correctly when used with the new interfaces. Consequently, in order to make use of this capability, and to detect its presence, applications must use the DirectX7 interfaces (IDirectDraw7, IDirect3DDevice7, etc.).
When using the new interfaces, applications can test for the render-to-texture capability in two steps. The first step is to create a DirectDraw surface that is flagged as being both a render target surface and a texture surface. The second is to attempt to set this surface as the render target for the Direct3D device. If either step fails, the application must fall back to one of the alternate solutions discussed in Table One.
At the time of this writing, there were still issues when attempting to render to texture surfaces with legacy DirectX6 display drivers running on the DX7 runtime DLLs. If an application wants to render to texture surfaces and will ship while a significant number of DX6 display drivers are still on end users' systems, it will have to resort to cruder methods of checking for the capability. Examples of such crude methods include rendering to a texture and then locking it and comparing pixels to the expected result, or, worst case, checking device names. In my TestRTT sample, I create a black texture, set it as the render target, clear it to white, and then render some primitives to it. If, after doing so, it contains only black pixels, I know the render-to-texture attempt has failed, and I resort to an alternative method. I do this test at start-up and whenever the rendering device is changed.
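The crude pixel-comparison check can be sketched as follows. This is a DX7-style C++ fragment, not a complete program: pTexture is an assumed IDirectDrawSurface7 pointer, a 32-bit pixel format is assumed, and FallBackToAlternateMethod is a hypothetical routine.

```cpp
// After clearing the black texture to white and rendering primitives
// to it, lock it and see whether any pixel actually changed.
DDSURFACEDESC2 desc;
ZeroMemory(&desc, sizeof(desc));
desc.dwSize = sizeof(desc);
pTexture->Lock(NULL, &desc, DDLOCK_READONLY | DDLOCK_WAIT, NULL);
bool anyChanged = false;
for (DWORD y = 0; y < desc.dwHeight && !anyChanged; y++) {
    DWORD* row = (DWORD*)((BYTE*)desc.lpSurface + y * desc.lPitch);
    for (DWORD x = 0; x < desc.dwWidth; x++)
        if (row[x] != 0) { anyChanged = true; break; }
}
pTexture->Unlock(NULL);
if (!anyChanged)            // still all black: render-to-texture failed
    FallBackToAlternateMethod();
```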
The next few sections explain how each of these methods is implemented; after that, this article examines some of the effects they make possible.
Rendering Directly to Texture Surfaces
A number of the popular consumer 3D graphics accelerators released in the past year or two support rendering directly to texture surfaces. When available, this is generally the fastest method for implementing this technique. No extra copying is necessary and redundant surfaces aren't needed.
The basic procedure is as follows:
- Create a DirectDraw surface with these properties: the surface should support rendering and be usable as a texture. To do this, specify two flags, DDSCAPS_3DDEVICE and DDSCAPS_TEXTURE, when calling DirectDraw's CreateSurface function. The 3DDEVICE flag tells DirectDraw the application would like the surface to be one the device can render to, and the TEXTURE flag tells DirectDraw the application would also like to use it as a texture.1
- If the backbuffer has a Z buffer attached to it, the surface for rendering to texture must also have a Z buffer associated with it.
- If the creation of the surface fails, it could be because the hardware does not support rendering to textures. First, ensure that the failure is not due to any of the usual suspects (lack of video memory, unsupported size or format, etc). If you confirm that the failure is due to the hardware not supporting rendering to texture surfaces, the application must fall back to one of the other mechanisms.
- If the creation of the surface succeeds, the next step is to render to the texture surface. This is done by calling the Direct3DDevice7->SetRenderTarget() method, to point the rendering device to the texture surface. If the SetRenderTarget call fails, indicating that the device doesn't support rendering to texture surfaces, the application will need to fall back to one of the other methods.
- At this point, rendering triangles is performed as usual, ensuring that there is one BeginScene/EndScene pair per render target, per frame.
1Note that if you are using Direct3D's texture manager, textures that are going to be rendered to cannot use the texture manager. They must be created specifying the memory type (usually local video memory, as few cards can render to AGP surfaces). It is best to do this before invoking the texture manager, so that the texture manager knows how much memory it has left to work with.
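As a concrete illustration of the steps above, creating and selecting such a surface looks roughly like this. This is a DX7-style C++ fragment with error handling abbreviated; pDD and pDevice are assumed to be valid IDirectDraw7 and IDirect3DDevice7 pointers, and the 256x256 size is arbitrary.

```cpp
DDSURFACEDESC2 ddsd;
ZeroMemory(&ddsd, sizeof(ddsd));
ddsd.dwSize   = sizeof(ddsd);
ddsd.dwFlags  = DDSD_CAPS | DDSD_WIDTH | DDSD_HEIGHT;
ddsd.dwWidth  = 256;
ddsd.dwHeight = 256;
// Both flags: a surface the device can render to AND use as a texture.
// Specify the memory type explicitly, bypassing the texture manager.
ddsd.ddsCaps.dwCaps = DDSCAPS_3DDEVICE | DDSCAPS_TEXTURE |
                      DDSCAPS_VIDEOMEMORY | DDSCAPS_LOCALVIDMEM;
LPDIRECTDRAWSURFACE7 pRenderTexture = NULL;
if (FAILED(pDD->CreateSurface(&ddsd, &pRenderTexture, NULL)) ||
    FAILED(pDevice->SetRenderTarget(pRenderTexture, 0)))
{
    // The hardware can't render to textures; use a fallback method.
}
```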
Rendering to a Second, Non-Texture Surface
This approach creates a DirectDraw surface identical in size to the texture surface, but created with the DDSCAPS_3DDEVICE flag and without the DDSCAPS_TEXTURE flag. After that, the steps are similar, except that the SetRenderTarget() method is used to point the device to the intermediate surface. Then, following the EndScene(), a blit must be done to copy the rendered scene to the texture surface.
This will work on some hardware that doesn't support the previous method because some chips, in order to increase performance, rearrange the data in texture surfaces into a format that is friendlier to the chip's texel cache. This is often referred to as a swizzled texture. The rendering device cannot render triangles to a swizzled surface, but it can blit from one surface type to the other.
Rendering to the Back Buffer and Then Blitting a Subsection to a Texture Surface
This method uses less memory than the last method, but it can require an extra clearing of the back buffer. All rendering is done to the back buffer, but there are two BeginScene/EndScene pairs per frame, one for the scene to be rendered to the texture and one for the scene to be rendered to the back buffer.
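A sketch of this back-buffer approach follows. This is a DX7-style C++ fragment: pBackBuffer and pTexture are assumed surface pointers, RenderSceneForTexture and RenderMainScene are hypothetical helpers, and the 256x256 subsection is arbitrary.

```cpp
// First pair: render the texture's scene into a corner of the back
// buffer, then blit that subsection into the texture surface.
pDevice->SetRenderTarget(pBackBuffer, 0);
pDevice->BeginScene();
RenderSceneForTexture();
pDevice->EndScene();
RECT src = { 0, 0, 256, 256 };   // subsection that was rendered
pTexture->Blt(NULL, pBackBuffer, &src, DDBLT_WAIT, NULL);

// The back buffer may now need clearing before the main scene.
pDevice->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER, 0, 1.0f, 0);
pDevice->BeginScene();
RenderMainScene();
pDevice->EndScene();
```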
Software Rendering to a System Memory Surface and Blitting That to the Texture Surface
This approach, while fairly straightforward, should be avoided for obvious reasons (in case it's not obvious: software rendering is slow, and we'd prefer to have the 3D hardware do the work). If you must rely on this method, consider ways to scale back the quality of the effect, such as reducing the texture surface's resolution.
An example of handling some of these fallback methods can be found in the TESTRTT sample code.
Now that we've seen how to render to texture surfaces and how to use less desirable methods to gracefully handle systems that cannot, let's examine a few of the effects that we can produce using this capability.
Mirror Reflections
One of the first uses that springs to mind is mirror reflections, where objects are texture-mapped with a reflection of the scene in front of them. This effect requires rendering the scene's geometry, from the point of view of a reflected camera, onto a rectangle around the mirror (which could be the mirror itself, if it is rectangular). The new mirror view frustum is sheared based on the angle between the mirror's normal vector and the vector from the viewer to the mirror's position (see Figure 2). The shearing lets the reflection point in the right direction, while letting the mirror plane act as the front clipping plane of the mirror view frustum.
Of course, mirrors can also be done by simply projecting geometry against a plane. However, if the mirror is a complex shape, that involves a lot more clipping work. The render-to-texture approach also has advantages in the area of scalability, which we will discuss later in this article.
The executable and source code used to generate the above example are provided in the FLATMIRROR directory of the sample code.
Figure 2: Mirror done by rendering to a texture surface.
Dynamic Environment Maps
The reason the polar-to-rectangular mapping is a problem is that while we are adequately (if not completely correctly) calculating the UV coordinates for each vertex, the UV coordinates of the intermediate pixels are incorrect. As we move across the surface of the sphere, the reflected view vectors generate UV coordinates that fall off nonlinearly, but the graphics hardware performs only a linear interpolation of the UV coordinates between vertices. The extent to which this problem shows up depends on how highly tessellated the model is. A model with a vertex per pixel will appear perfect, but the texture will begin to 'jiggle' slightly as the triangles get larger. One way around this may be to do another render-to-texture step that approximates the 'sphereize' filter found in many photo editing packages, using a highly tessellated mesh.
Figure 3: Sphereized bitmap for use in an environment map.
Figure 4: Dynamically rendered environment map.
Soft Edged Shadows
In his March 1999 Game Developer Magazine article, "Real-time Shadow Casting," Hubert Nguyen presents an approach that renders shadows into the frame buffer and then copies them to a texture surface. While this technique is a fitting example of rendering to texture, it uses one of the fallback methods mentioned earlier in this article (Nguyen implemented his method on a 3Dfx Voodoo card, which can't render to texture surfaces).
If you haven't read the article, I'll summarize the approach:
- A 'shadow texture' is created as a surface that can be rendered to, and it is cleared to white.
- The object(s) that will cast the shadow are rendered from the light's point of view, using flat shading and the scene's ambient light color.
- The objects that are to be shadowed are transformed into the light's view space. Their 'screen coordinates' in this space are then used as their texture coordinates in the next step.
- The scene is rendered from the point of view of the regular camera, in two passes (or in one pass using two texture stages on multitexture hardware). The first pass (or stage) uses the object's regular material and color. The second uses the newly created shadow texture with the calculated texture coordinates, and is applied only to those objects that receive the shadow.
- The blend mode for the shadow pass (or stage) is modulate. This leaves unshadowed areas alone, while shadowed areas are modulated with the scene's ambient light color. The shadow texture stage must also be set to clamp, to ensure that objects far outside the range of the shadow do not get shadowed.
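On multitexture hardware, the single-pass version of this blend setup might look like the following. This is a DX7-style C++ fragment; pDevice, pBaseTexture, and pShadowTexture are assumed pointers.

```cpp
// Stage 0: the object's base texture modulated with its diffuse color.
pDevice->SetTexture(0, pBaseTexture);
pDevice->SetTextureStageState(0, D3DTSS_COLOROP,   D3DTOP_MODULATE);
pDevice->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE);
pDevice->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_DIFFUSE);

// Stage 1: modulate the result with the shadow texture.
pDevice->SetTexture(1, pShadowTexture);
pDevice->SetTextureStageState(1, D3DTSS_COLOROP,   D3DTOP_MODULATE);
pDevice->SetTextureStageState(1, D3DTSS_COLORARG1, D3DTA_TEXTURE);
pDevice->SetTextureStageState(1, D3DTSS_COLORARG2, D3DTA_CURRENT);
// Clamp so geometry outside the shadow map's range stays unshadowed.
pDevice->SetTextureStageState(1, D3DTSS_ADDRESS, D3DTADDRESS_CLAMP);
```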
Figure 5: Real-time shadow texture using render-to-texture.
Figure 5 is a screenshot of this technique in action. The image in the upper left corner is the shadow texture (i.e. the view of the object casting the shadow, from the point of view of the light). The source code and executable are available in the SHADOWTEX directory of the sample code.
Mip Map Generation
One use for rendering to textures is creating mip map chains. To accomplish this, set up a chain of surfaces, copy the source texture to the first, and then loop down to the smallest surface in the chain. At each iteration of the loop, the render target is the next smallest surface in the chain; a rectangle is rendered over it using the previous surface in the chain as the texture, and bilinear filtering creates the averaged-down mip level. While this approach offers no great advantage over storing mip maps on the hard drive and loading them at start time, it is useful for creating mip maps of textures created with one of the previously mentioned techniques, or of other procedural textures.
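The chain bookkeeping in that loop can be sketched independently of the DirectX calls; `mipChainSizes` is just an illustrative helper, not part of any API.

```cpp
#include <cassert>
#include <vector>

// Sizes of the surfaces in a square mip chain, halving each level
// until we reach 1x1 (e.g. 256, 128, 64, ..., 1). At each step,
// level sizes[i+1] would be the render target, textured with a
// bilinear-filtered rectangle drawn from level sizes[i].
std::vector<int> mipChainSizes(int topSize) {
    std::vector<int> sizes;
    for (int s = topSize; s >= 1; s /= 2)
        sizes.push_back(s);
    return sizes;
}
```

For a 256x256 source this yields nine levels, so eight render-to-texture passes produce the full chain.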
TV-Style Scene Transitions
When transitioning from one scene to the next, you can keep the last frame of the outgoing scene by rendering it to a texture, and then use that texture during the transition to the next scene, in a style similar to those seen on TV or in video editing applications. Typical transitions include barn door, vertical blind, page turn, and so on.
Figure 6: Feedback effects using render-to-texture.
What else is possible?
I am certain many other techniques exist. For example, in the screenshot in Figure 6, I tried some feedback buffer effects by rendering to one texture, then using that as the background texture while rendering to a second texture, and repeating the process, swapping the pointers to the two textures each frame. By drawing some random pixels along the bottom of the texture, I tried creating a 'fire effect', and by drawing the objects in my scene with a noise texture, I created some 'smoke trails'. The effect was propagated upwards by slightly offsetting the UV coordinates of the triangles used to draw the background on each texture. The code and executable for this demo can be found in the FEEDBACK directory of the sample code.
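The pointer swapping in that feedback loop reduces to simple ping-pong bookkeeping; `FeedbackBuffers` and `sourceAfterFrames` are illustrative names, not DirectX API.

```cpp
#include <cassert>
#include <utility>

// Ping-pong between two render-target textures: each frame, render
// into 'dest' while sampling last frame's result from 'src', then
// swap the roles. Indices stand in for the two texture surfaces.
struct FeedbackBuffers {
    int src = 0, dest = 1;
    void endFrame() { std::swap(src, dest); }
};

// Which buffer holds the latest result after n completed frames.
int sourceAfterFrames(int n) {
    FeedbackBuffers fb;
    for (int i = 0; i < n; i++)
        fb.endFrame();
    return fb.src;
}
```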
Potential Areas for Scalability
One of the key problems facing PC game developers is scalability: how do you enable a particular feature on systems with the performance to handle it, while scaling back the content or the effect on lower performance systems? There are a few ways in which the render-to-texture techniques can be scaled.
One technique is to use a lower resolution texture for systems where the amount of free video memory or the fill rate of the accelerator is a concern. The resulting difference in quality is typically within the acceptable range for many of the effects. The ShadowTex and FlatMirror demos allow you to do this to see the results.
In some cases, the dynamically rendered texture can be updated less frequently. For example, if a camera is panning across a room and the application is running at 60fps, it may not be too obvious if the scene in the mirror at the back of the room is only updating at 30fps. In other cases, where there are very large differences in the scenery from frame to frame, the artifacts may be more glaringly obvious. Both the FlatMirror and SphereMap demos allow you to see the results of doing this.
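The frame-skipping decision is trivial to factor out; `shouldUpdateTexture` is an illustrative helper, with a divisor of 2 giving the 30fps-mirror-in-a-60fps-scene case above.

```cpp
#include <cassert>

// True if the dynamic texture should be re-rendered this frame.
// divisor = 1 updates every frame, 2 every other frame, and so on.
bool shouldUpdateTexture(unsigned frame, unsigned divisor) {
    return (frame % divisor) == 0;
}
```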
You can also use a lower LOD version of the model to scale the effect down on lower end systems. Often applications have multiple LODs resident in memory, allowing the application to switch between them as needed. Generating an environment map or shadow with one of these may still produce a reasonably good quality effect, while reducing the amount of geometry work required to render the texture.
While a number of techniques have been discussed here, along with a number of areas for applying scalability, some areas for potential enhancement haven't yet been covered.
Figure 7: Combining render-to-texture and bump mapping to do a water reflection.
Some of the newer rendering capabilities exposed in DirectX7, combined with the render-to-texture technique, offer some of the most exciting areas for exploration. One example is DX7's support for EnvBump bump mapping, which can be used on dynamically rendered textures to do dynamic bump mapping effects (see Figure 7, which first appeared in my June 1999 Gamasutra article on bump mapping; the code and executable for this demo can be found in the WATER directory of the sample code). This could also be used for heat shimmer or refraction effects like those seen in movies such as Predator or The Matrix. Another promising example is the cubic environment mapping feature that will be supported by some upcoming graphics hardware. Used with dynamically rendered textures, this feature could be used to perform pseudo-ray-tracing techniques. (The cubic environment mapping example included with DirectX7 demonstrates this.)
Other areas that offer good potential include applying procedural effects to the textures after rendering them, alpha blending the rendered textures over multiple frames to achieve effects such as motion blur and depth of field, and potentially using DirectX texture transforms to do procedural effects.
Wrapping It Up
Being able to render to textures adds one more valuable tool to the developer's arsenal, offering many exciting possibilities for 3D applications and extending the range of achievable effects. Now that this capability is supported by a substantial installed hardware base and exposed through the DirectX7 API, developers can start actively using it in their applications. We hope the techniques presented in this article will help developers take best advantage of this approach to rendering.
About the Author
Kim Pallister is a Technical Marketing Engineer and Processor Evangelist with Intel's Developer Relations Group. He is currently focused on realtime 3D graphics technologies and game development.