Real-Time Shadow Maps on Mainstream Graphics Hardware


Abstract

This white paper describes techniques for generating shadows in real time on mainstream graphics hardware. Shadows make a 3D scene more realistic, and using floating-point textures for shadow maps increases that realism. This paper describes how to generate shadow maps both on graphics hardware that supports floating-point textures and on hardware that does not.


Introduction

Shadows are an important part of our perception of the world. We interpret them to gain more information about the objects we see. A shadow’s position and shape tell us about the surface on which it is cast, the shape of the occluding object, and the relative position of the light (Figure 1-1). In 3D graphics, shadows also give scenes a more realistic feel; a viewer can easily identify a computer-generated image by its incorrect shadows. This paper discusses:

  • A detailed implementation of shadow maps on mainstream graphics hardware, based on the Intel® G965 Express Chipset, which supports floating-point textures.
  • Threading the algorithm.
  • Ways to work around precision issues on hardware that does not support floating-point textures, such as the Intel® 915G Express Chipset.
  • An implementation (code sample) of shadow maps based on the EmptyProject example from the Microsoft DirectX* 9.0 SDK.


Figure 1-1. Scene Rendered Without and With Shadows

The shadow provides the viewer with information about the position of the light source and the distance from the torus to the back plane.


Overview

Shadows are created when light from a light source does not reach an object (the receiver) because it is blocked by another object (the occluder). We will consider a simplified lighting model in which only point light sources cast shadows. With point light sources, either all of the light from a source reaches a point on the receiver or all of it is blocked by an occluder. This model creates hard shadows, which are unrealistic because they produce a distinct, or hard, edge between the lit area of a receiver and the area in shadow. Light sources that occupy a volume in space create soft shadows, where the receiver is partially lit by the light source (Figure 2-1).

Figure 2-1. Representation of Shadow Optics

In Figure 2-1, a light source sampled from only one point creates a hard shadow. Sampling from the entire source creates the penumbra and an umbra that is smaller than the hard shadow.

Lance Williams introduced shadow maps in 1978. The basic idea is that a light source illuminates only the objects visible from that light source. To calculate lighting, first determine which points in the scene are visible from the light. The shadow map stores the distances of the points in the scene that are closest to the light. When calculating the lighting for a point, test the distance from that point to the light against the corresponding value in the shadow map. If the distance equals the value stored in the shadow map, the point is lit. If the point is farther away, there must be an object closer to the light, so the point is in shadow.

To create the shadow map, we render the scene from the light’s point of view and store the depth value at each pixel. Normally, the map is stored as a texture. The standard rendering of the scene provides a very fast way to sort the objects in the scene and lets us ignore how the scene geometry is represented. When we then render the scene from the viewer’s position, we must transform each point into the light’s perspective space and calculate the distance from the point to the light. We then compare this distance to the value stored in the shadow map. If the distance from the point to the light is greater than the shadow map value, there must be an occluder between the light and the point, so the point is in shadow.

Shadow maps are especially useful because of their simplicity. We do not need to place any restrictions on the geometry in the scene; if we can render the objects in the scene, we can generate the shadow map. The number of shadow maps, and associated storage requirements, is independent of the scene complexity and is linear with the number of lights. Generating the shadow maps is fast because we use highly optimized hardware. Also, because shadows are independent of the viewer’s position in the scene, we only need to recalculate the shadow maps when lights or objects in the scene move.

Shadow maps are limited by precision. Because points in the scene must be transformed multiple times, floating-point error accumulates, which may cause programs to draw shadowed areas incorrectly. Low-resolution models also create problems because a model can appear to cast a shadow on itself. In addition, the aliasing inherent in per-pixel evaluation is apparent. The main aliasing problem comes from the samples used to generate the shadow map: the map is created in the light source’s perspective space but is used to determine lighting from a different angle, the viewer’s perspective space. When these two spaces do not match well, many pixels in view space correspond to the same sampled point in the shadow map.

An overview of the shadow map technique:

Create the shadow map:

  • Render the scene from the light’s point of view.
  • For each pixel:
    • Store the depth value.

Render the scene:

  • Render the scene as normal.
  • For each pixel:
    • Transform that point from the view space to the light space.
    • Determine distance from point to light.
    • Get value from shadow map that matches point's x, y values.
    • If the distance from the point is greater than the value in the shadow map
      • The pixel is in shadow.
    • Else
      • The pixel is not in shadow; use standard lighting calculations.



Implementation

The shadow map is stored as a texture, so create a texture for the shadow map that can be used as a render target. Intel® Graphics Technology allows several different texture formats as render targets. Depending on the Intel Graphics hardware present in the system, floating-point textures may or may not be supported at this time. As a first step, identify the hardware present in the system. The following code can be used to identify the Mobile Intel® 965 Express Chipset adapter:

D3DADAPTER_IDENTIFIER9 adapterID;    // Used to store device info

// Gather the primary adapter's information.
// Note: GetAdapterIdentifier is a method of the IDirect3D9 interface
// (g_pD3D here), not of the device.
if( g_pD3D->GetAdapterIdentifier( 0, 0, &adapterID ) != D3D_OK )
    exit(-1);

if ( ( adapterID.VendorId == 0x8086 ) &&
     ( adapterID.DeviceId == 0x2A02 ) )
{
    // Mobile Intel® 965 Express Chipset is present
    ...
}


Depending on the type of hardware, we create the render target texture using the appropriate format. The Intel G965 Express Chipset supports floating-point textures, but older parts like the Intel 915G Express Chipset do not. For the Intel G965 Express Chipset, the most appropriate choice is a 32-bit floating-point texture (D3DFMT_R32F); for the Intel 915G Express Chipset, we can use a 32-bit integer texture format (D3DFMT_A8R8G8B8).

// Use D3DFMT_A8R8G8B8 on hardware that doesn't support floating-point textures
D3DXCreateTexture(pd3dDevice, SHADOWMAP_SIZE, SHADOWMAP_SIZE, 1,
                  D3DUSAGE_RENDERTARGET, D3DFMT_R32F, D3DPOOL_DEFAULT,
                  &g_pShadowMap);

IDirect3DSurface9* pShadowSurface;
IDirect3DSurface9* pOriginalBackBuffer;

g_pShadowMap->GetSurfaceLevel(0, &pShadowSurface);
pd3dDevice->GetRenderTarget(0, &pOriginalBackBuffer);
pd3dDevice->SetRenderTarget(0, pShadowSurface);
SAFE_RELEASE(pShadowSurface);
// Render the scene, then restore the original render target
pd3dDevice->SetRenderTarget(0, pOriginalBackBuffer);
SAFE_RELEASE(pOriginalBackBuffer);


The shader to generate the shadow map is extremely simple. For the vertex shader, transform the vertex position to the light’s view space. Pass the new position to the pixel shader as a texture coordinate so that the position will be interpolated for each pixel. The pixel shader simply outputs the projected depth value for each pixel as the color.

// Vertex Shader
void GenerateShadowMapVS(float4 iPosition : POSITION,
                         out float2 oDepth : TEXCOORD0,
                         out float4 oPosition : POSITION)
{
    oPosition = mul(iPosition, g_mWorldViewProjection);
    oDepth.xy = oPosition.zw;
}

// Pixel Shader
void GenerateShadowMapPS(float2 Depth : TEXCOORD0,
                         out float4 oColor : COLOR)
{
    oColor = Depth.x / Depth.y;
}


Figure 3-1. Shadow Map Rendering

A rendering of the shadow map. Depth values are scaled from 0.0 to 1.0 where 0.0 is the closest to the light. Here, points closer to the light are darker.

Using the shadow map for lighting requires the vertex shader to transform the vertex position into the light’s projection space and pass that value to the pixel shader as a texture coordinate, so it is interpolated per pixel. Lighting is then calculated per pixel. At each pixel, use the x and y values of the position in the light’s projection space as texture coordinates to sample the shadow map. The x and y values range from -1.0 to 1.0, so scale them to 0.0 to 1.0 for the texture sample. For the pixel to be lit, its depth must not exceed the value stored in the shadow map. To account for floating-point errors from the multiple transformations of the position, include an additional bias.

void VertexShadowMap(float4 iPosition : POSITION,
                     float3 iNormal : NORMAL,
                     out float4 oPosition : POSITION,
                     out float4 oViewSpacePosition : TEXCOORD0,
                     out float3 oNormal : TEXCOORD1,
                     out float4 oLightSpacePosition : TEXCOORD2)
{
    oViewSpacePosition = mul(iPosition, g_mWorldView);
    oPosition = mul(iPosition, g_mWorldViewProjection);
    oNormal = mul(iNormal, (float3x3)g_mWorldView);
    oLightSpacePosition = mul(oViewSpacePosition,
                              g_mViewToLightProjectionSpace);
}

void PixelShadowMap(float4 vViewSpacePosition : TEXCOORD0,
                    float3 vNormal : TEXCOORD1,
                    float4 vLightSpacePosition : TEXCOORD2,
                    out float4 oColor : COLOR)
{
    // Determine the depth value stored in the shadow map
    float2 ShadowMapCoord = 0.5 * vLightSpacePosition.xy /
                            vLightSpacePosition.w + float2(0.5, 0.5);
    ShadowMapCoord.y = 1.0f - ShadowMapCoord.y;
    float fShadowMapDistance = tex2D(ShadowMapSampler, ShadowMapCoord);
    float fDistanceToLight = vLightSpacePosition.z / vLightSpacePosition.w;

    // Compare the distance to the light with the shadow map value;
    // the bias factor reduces the impact of floating-point errors
    float fShadow = ((fDistanceToLight - fShadowMapDistance) >
                     g_fShadowMapBias) ? 0.0f : 1.0f;
    float3 vLight = (float3)normalize(vViewSpacePosition -
                                      g_vPositionOfLight);
    float4 vDiffuse = fShadow * g_vLightDiffuse * dot(-vLight,
                                                      normalize(vNormal));
    oColor = saturate((vDiffuse + g_vLightAmbient) * g_vMaterialColor);
}


Figure 3-2. Scene Rendered with Shadows


Threading

Create the threads and set up mutual exclusion using a mutex:

DWORD dwThreadid;
g_hRenderMutex = CreateMutex(NULL, FALSE, NULL);
g_hShadowThread = (HANDLE)_beginthreadex(NULL, 0, GenerateShadowMap,
                                         (void*)pd3dDevice, 0,
                                         (unsigned int*)&dwThreadid);


This function generates the shadow map:

unsigned __stdcall GenerateShadowMap(void* pd3dDevice)
{
    while (pd3dDevice)
    {
        // Acquire the lock before attempting to create the shadow map
        DWORD dwResult = WaitForSingleObject(g_hRenderMutex, INFINITE);
        // Create the shadow map here
        // Release the lock after the shadow map is complete
        ReleaseMutex(g_hRenderMutex);
        // Make the thread sleep until the shadow map should be updated
        // The sleep time is specified in milliseconds
        Sleep(g_iShadowMapUpdateMS);
    }
    // Threads started with _beginthreadex should end with _endthreadex
    _endthreadex(0);
    return 0;
}


The main rendering function requires similar locking. Informal tests show a slight increase in frame rate when using a separate thread to generate the shadow map while maintaining a similar update frequency. The threaded version also has a higher frame rate under simulated CPU loads.


Precision

The implementation detailed above stores the shadow map’s depth information in one color channel of a 32-bit texture. Because of the limited amount of information that can be stored in 8 bits on hardware that does not support floating-point textures, it is important to set the near and far clip planes appropriately when generating the shadow map. The depth values from the near clip plane to the far clip plane are scaled from 0.0 to 1.0, so the closer together the clip planes are, the more accurate the depth values in the shadow map will be. Increase the bias value when determining lighting so that the shader errs on the side of lighting rather than shadow. This helps avoid the self-shadowing problems that arise when the depth values are inaccurate or the model representation is not detailed enough. In those cases the shader calculates that an object casts a shadow on itself (Figure 5-1).

Figure 5-1. Incorrect Shadow Maps due to Inaccurate Depth Value Comparisons

Figure 5-1 depicts incorrect shadowing because of inaccurate depth value comparisons. This is caused by a combination of limited depth buffer precision, model resolution, and shadow map resolution.

Because the shadow map stores values using only 8 of a possible 32 bits, use the extra bits (the other color channels) to store more precise values.

float4 pack(float f, int iRange)
{
    float4 fReturn = 0.0f;
    float fRange = iRange * 1.0f;
    int i = (int)(f * iRange);
    fReturn.r = i / fRange;
    f = (f - fReturn.r) * fRange;
    i = (int)(f * iRange);
    fReturn.g = i / fRange;
    f = (f - fReturn.g) * fRange;
    i = (int)(f * iRange);
    fReturn.b = i / fRange;
    fReturn.a = 0.0f;
    return fReturn;
}

float unpack(float4 f, int iRange)
{
    float fRange = iRange * 1.0f;
    return f.r + f.g / fRange + f.b / (fRange * fRange);
}


Figure 5-2. Output of Shadow Maps with RGB Color Channels for Depth Values

Figure 5-2 depicts the output of the shadow map with the depth values encoded in the red, green, and blue color channels. The red channel holds the most significant 8 bits, the green holds the next 8 bits, and the blue holds the least significant 8 bits.


Aliasing

Because shadow maps are an image-based technique, they suffer from aliasing. The aliasing is magnified because it takes two samples to render the shadow, first determining the value at the rendered pixel, then comparing that to a previously sampled point in the shadow map. One method to smooth the aliasing is to simply sample the shadow map at multiple points and average the results. However, a single texel of the shadow map could correspond to many pixels in the rendered scene. Sampling neighboring texels is not accurate because the sample stored in the shadow map may be very far away in screen space.

One of the advantages of shadow maps is that they are generated independently of the viewer’s position, but if the view space and light space are very different, the samples stored in the shadow map may not be useful. One technique, perspective shadow maps, generates the shadow maps in the view’s post-perspective space. This reduces aliasing because the distorted space devotes more shadow-map detail to objects closer to the viewer, so the shadow map sample size corresponds more closely to the pixel sample size. Using perspective shadow maps increases the complexity of shadow calculations because the view space to light space transformation must be possible.


Conclusion

Simple shadow maps are an effective, fast way to generate shadows. While some Intel Graphics Technology chipsets have limitations, these are easily avoided through scene management and simple floating-point manipulation. The latest Intel Graphics Technology, the Intel G965 Express Chipset, provides full floating-point texture support and therefore supports shadow maps more readily. Shadow maps increase the realism of 3D applications and relay important information about the scene to the viewer. Included with this paper is an implementation of shadow maps based on the EmptyProject example from the Microsoft DirectX 9.0 SDK.


Download Code Samples


Download IntegratedGFX_FP_Shadow_Map.zip


