Ocean Fog using Direct3D 10

Introduction

The purpose of this project was to investigate how to render a realistic ocean scene effectively on differing graphics solutions, while providing a good, current, working set of sample code to the graphics community.

Given the complexities involved in rendering an ocean along with fog effects, we chose the projected grid concept as our baseline, as its results are very realistic.

We then ported the original Direct3D 9 code to Direct3D 10, converting all of the additional effects we needed to Shader Model 4.0. In doing so we took advantage of a great opportunity to learn about a very interesting subject (the projected grid) while adding many more nuances to it.

The main goals of this project were to determine what we would need to do to this complex system under a DirectX 10 scenario, and what would be required to achieve reasonable frame rates on both low- and high-cost graphics solutions.

During this endeavor we learned a great deal about how to offload certain computations from the GPU to the CPU, and when and where those compute cycles would be most beneficial, both on high-end and low-end graphics solutions.

On the CPU front, using the Intel compiler (version 10.1), we were able to gain an easy 10+ fps on our CPU-side computations (generating fog and approximating wave movement).

Projected Grid Ocean

The basic concept behind the projected grid ocean is a regular grid that is created in post-projection (screen) space and projected onto the xz-plane in world space, which keeps vertex density roughly even from the viewer's point of view. The vertices of this grid are then displaced using a height field: a function of the horizontal position and time that returns a height value for each vertex.
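
A minimal sketch of these two steps, with assumed helper names: each screen-space grid point defines a camera ray, which is intersected with the y = 0 base plane and then displaced vertically by the height field.

    struct Vec3 { float x, y, z; };

    // Assumed to sum the animated noise octaves described below.
    float heightField(float x, float z, float time);

    // Intersect the camera ray through one grid point with the y = 0 plane.
    Vec3 projectGridVertex(const Vec3& rayOrigin, const Vec3& rayDir)
    {
        float t = -rayOrigin.y / rayDir.y;   // parametric distance to the plane
        return { rayOrigin.x + t * rayDir.x, 0.0f, rayOrigin.z + t * rayDir.z };
    }

    // Displace the projected vertex vertically to form the wave surface.
    Vec3 displace(Vec3 v, float time)
    {
        v.y = heightField(v.x, v.z, time);
        return v;
    }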

This method proves very useful for generating a virtually limitless body of water. The Perlin noise computation for generating the wave motion uses four textures of varying granularity, called "octaves," to animate the grid in three dimensions. This method was chosen over other approaches (such as solving the Navier-Stokes equations) for generating the wave noise because it is less compute-intensive on the CPU. A GPU-side Navier-Stokes implementation was not used, but is worth further investigation. For reflections and refractions the algorithm uses derivations of Snell's law.
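
A minimal sketch of the octave sum, assuming a noise2D() lookup into the pre-generated Perlin textures; amplitude halves and frequency doubles with each octave, and the time offset scrolls the noise to animate the waves.

    float noise2D(float x, float y);   // assumed noise lookup (see Perlin Fog below)

    float heightField(float x, float z, float time)
    {
        float height = 0.0f, amplitude = 1.0f, frequency = 1.0f;
        for (int octave = 0; octave < 4; ++octave)
        {
            height    += amplitude * noise2D(x * frequency + time, z * frequency + time);
            amplitude *= 0.5f;   // finer octaves contribute less height
            frequency *= 2.0f;   // but vary more quickly
        }
        return height;
    }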

To further add realism to the scene, we restricted the camera's height on the y-axis so that the illusion of the ocean's vastness could be maintained.

For a highly detailed description of this method, refer to Claes Johanson's master's thesis, "Real-time water rendering - Introducing the projected grid concept" [1], formerly available at http://graphics.cs.lth.se/theses/projects/projgrid/.

Perlin Fog

For the Perlin fog we decided to implement the processing on the CPU. We do this by sampling points in the 3D texture space, separated by a stride associated with each octave; the longer the stride, the more heavily the octave is weighted in the final texture. Each of these sample points is mapped to a pseudo-randomly chosen gradient, using a permutation table of normalized gradients and a hash function [2]. The value of each pixel is then determined by the weighted contributions of its surrounding gradient samples. All of the octaves are then summed together to produce a result with smooth, organic noise at both near and far perspectives.
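
The 2D case below is a minimal sketch of one such lookup (our build used 3D); perm[] and grads[] are assumed to be an already-initialized permutation table and gradient set.

    #include <cmath>

    extern unsigned char perm[512];   // shuffled 0..255, duplicated to avoid index wrapping
    extern float grads[8][2];         // small set of normalized gradient directions

    static float fade(float t) { return t * t * t * (t * (t * 6 - 15) + 10); }
    static float mix(float a, float b, float t) { return a + t * (b - a); }

    // Hash the lattice point to a gradient, then dot it with the offset
    // from that lattice point to the sample position.
    static float gradDot(int ix, int iy, float dx, float dy)
    {
        int h = perm[(ix + perm[iy & 255]) & 255] & 7;
        return grads[h][0] * dx + grads[h][1] * dy;
    }

    float noise2D(float x, float y)
    {
        int   ix = (int)std::floor(x), iy = (int)std::floor(y);
        float fx = x - ix, fy = y - iy;
        float u = fade(fx), v = fade(fy);
        // Blend the contributions of the four surrounding gradients.
        float n00 = gradDot(ix,     iy,     fx,        fy);
        float n10 = gradDot(ix + 1, iy,     fx - 1.0f, fy);
        float n01 = gradDot(ix,     iy + 1, fx,        fy - 1.0f);
        float n11 = gradDot(ix + 1, iy + 1, fx - 1.0f, fy - 1.0f);
        return mix(mix(n00, n10, u), mix(n01, n11, u), v);
    }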

This result was successful; however, we wanted an even smoother effect with only subtle noise visible, so we applied a simple Gaussian blur (also during preprocessing). Our implementation blurred the pixels using rows of Pascal's triangle, i.e. {(1), (1,1), (1,2,1), (1,3,3,1), ...}, as weights and averaging these weighted sums (a type of convolution filter). We also improved efficiency by blurring along each axis independently, since the Gaussian kernel is separable [3]. At this point the result was much closer to what we wanted; however, seams were visible at the edges of the texture because the blur was sampling points beyond the texture's bounds, so we used a mod operator to wrap the sampling space.
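
A minimal sketch of one such pass, assuming a square single-channel texture stored as a flat float array; the weights here are the normalized 1-4-6-4-1 row of Pascal's triangle, and the mod wrap keeps every sample inside the texture so no seams form.

    #include <vector>

    void blurAxis(std::vector<float>& tex, int size, bool horizontal)
    {
        static const float w[5] = { 1/16.f, 4/16.f, 6/16.f, 4/16.f, 1/16.f };
        std::vector<float> out(tex.size());
        for (int y = 0; y < size; ++y)
            for (int x = 0; x < size; ++x)
            {
                float sum = 0.0f;
                for (int k = -2; k <= 2; ++k)
                {
                    // mod wrap: samples past an edge come from the opposite side
                    int sx = horizontal ? (x + k + size) % size : x;
                    int sy = horizontal ? y : (y + k + size) % size;
                    sum += w[k + 2] * tex[sy * size + sx];
                }
                out[y * size + x] = sum;
            }
        tex.swap(out);
    }

Running the pass once per axis (blurAxis(tex, size, true), then blurAxis(tex, size, false)) produces the full 2D blur at a fraction of the cost of a single 2D kernel.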

In the shader, we first calculate the fog coefficient f, a factor for the amount of light absorbed and scattered along a ray through the fog volume [4]. We calculate this value using the equation:

f = e^(-(ρ · d · n))    where ρ = fog density, d = distance to the camera, n = noise

We then use this coefficient to interpolate between the surface color C_original at any point and the fog color C_fog, using this equation:

C_final = C_original · f + C_fog · (1 - f)

This interpolation approximates the light absorption along a ray from any point to the camera [4] at low computational cost.
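
The sketch below shows the combined fog math in C++ form (the real code lives in the Shader Model 4.0 pixel shader, where exp() and the interpolation are intrinsics); all names are illustrative.

    #include <cmath>

    struct Color { float r, g, b; };

    Color applyFog(Color original, Color fog, float density, float dist, float noise)
    {
        // f = 1 right at the camera (no fog) and falls toward 0 as
        // distance, density, or the local noise value grows.
        float f = std::exp(-(density * dist * noise));
        return { original.r * f + fog.r * (1 - f),
                 original.g * f + fog.g * (1 - f),
                 original.b * f + fog.b * (1 - f) };
    }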

Finally, we animate the fog by sampling the fog texture according to a linear function that progresses with time: a simple ray function, with the slope set as a constant vector.

This method was successful, but the fog appeared glued to the geometry surfaces rather than moving through the air. For this reason we switched to a 3D texture for the blurred noise: when this texture is animated along a ray, the fog moves through world space rather than crawling along the surfaces' 2D texture coordinates. This was convincing from a bird's-eye perspective, but not from other viewpoints. To adjust for this, we applied a quadratic falloff to the noise, dependent on the height of the fog; that is, we made the fog clearest at the height of the viewer to give the impression that clouds appear above and below, rather than simply on all surfaces, using the equation:

n = n · ΔY² / 2 + 0.001    where n = noise, ΔY = camera Y position - vertex Y position
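
A minimal sketch of the animated lookup combined with this falloff, with sample3D() standing in for the HLSL Texture3D sample and the slope vector an assumed wind direction:

    float sample3D(float x, float y, float z);   // assumed 3D noise-texture fetch

    float fogNoise(float px, float py, float pz, float cameraY, float time)
    {
        // Scroll the 3D texture along a constant-slope ray so the fog
        // drifts through world space instead of crawling along UVs.
        const float slope[3] = { 0.01f, 0.0f, 0.02f };
        float n = sample3D(px + slope[0] * time,
                           py + slope[1] * time,
                           pz + slope[2] * time);
        // Quadratic height falloff: clearest at the viewer's height,
        // denser above and below.
        float dy = cameraY - py;
        return n * (dy * dy) / 2 + 0.001f;
    }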

As a result, we mimic volumetric fog quite convincingly, although all fog is in fact projected onto the scene surfaces.

Light Implementation

The scene is lit entirely by two lights: one infinite (directional) light and one spotlight cast from the lighthouse. The infinite light is calculated simply by dot lighting between the light's direction and the vertex normals. The spotlight also uses dot lighting, but additionally takes into account a light frustum and falloff.

The stored information for the spotlight includes position, direction, and frustum angle. For surface lighting, we first determine whether or not a point lies within the spotlight frustum, then we calculate falloff, and finally apply the same dot lighting as used for the infinite light. For the first step, we find the vector from the spotlight to the vertex, and then calculate the angle between that vector and the spotlight's direction vector, using the dot product rule and solving for the angle:

V1 · V2 = |V1| |V2| cos θ    where V1, V2 = vectors, θ = the angle between them

If the angle between the two vectors is within the frustum angle, we know that the point is illuminated by the spotlight. We then use this same angle to apply a gradual falloff for the spotlight. The ratio between the calculated angle and the frustum angle indicates how far away from the center of the frustum the point lies, so the falloff can be determined by the expression:

|θ_frustum / θ| - 1    where θ = the angle between the vectors

As you can see, the expression is undefined at θ = 0 and less than or equal to zero for θ ≥ θ_frustum. The undefined value computes to an infinitely large number at runtime, so we have exactly what we want: maximum intensity at the center of the frustum and no intensity at the edge. We then use the HLSL saturate function to clamp these values between 0 and 1, and multiply the spotlight intensity by this final number.
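
In C++ form, the whole test-and-falloff reduces to a few lines (a sketch mirroring the shader math; saturate() behaves like the HLSL intrinsic, and both direction vectors are assumed normalized so their dot product is cos θ):

    #include <cmath>

    static float saturate(float v) { return v < 0 ? 0 : (v > 1 ? 1 : v); }

    float spotFalloff(const float toVertex[3], const float spotDir[3], float frustumAngle)
    {
        float cosTheta = toVertex[0] * spotDir[0]
                       + toVertex[1] * spotDir[1]
                       + toVertex[2] * spotDir[2];
        float theta = std::acos(cosTheta);   // angle off the spotlight axis
        // +inf on the axis, 0 at the frustum edge, negative outside; the
        // clamp turns that into a 0..1 intensity multiplier.
        return saturate(std::fabs(frustumAngle / theta) - 1.0f);
    }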

For the volumetric spotlight effect, we used the same frustum angle as before, but this time to construct a series of cones that amplify the fog within the spotlight frustum. The cone vertices and indices are created during preprocessing and stored in the appropriate buffers for the lifetime of the application, with each cone's apex at its local origin. Because of this, we can translate the cones to the spotlight position, rotate them to match the spotlight direction, and ensure that they cover the frustum completely. The shader code for the cones simply calculates the appropriate world-space fog coefficients as explained earlier, blends them with a surface color of zero alpha, and amplifies that value by the spotlight intensity times the spotlight color. Because we use zero alpha to indicate no amplification, we use an alpha blend state with an additive blending function. The volumetric frustum falloff is approximated by using multiple cones, so the additive blending creates greater amplification toward the center of the frustum.
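
A minimal sketch of such a blend state in Direct3D 10 (the exact blend factors are an assumption; any additive alpha setup achieves the accumulation described above):

    #include <d3d10.h>

    ID3D10BlendState* createAdditiveBlend(ID3D10Device* device)
    {
        D3D10_BLEND_DESC bd = {};
        bd.BlendEnable[0]           = TRUE;
        bd.SrcBlend                 = D3D10_BLEND_SRC_ALPHA;  // zero alpha -> no amplification
        bd.DestBlend                = D3D10_BLEND_ONE;        // add onto the scene already drawn
        bd.BlendOp                  = D3D10_BLEND_OP_ADD;     // overlapping cones accumulate
        bd.SrcBlendAlpha            = D3D10_BLEND_ONE;
        bd.DestBlendAlpha           = D3D10_BLEND_ZERO;
        bd.BlendOpAlpha             = D3D10_BLEND_OP_ADD;
        bd.RenderTargetWriteMask[0] = D3D10_COLOR_WRITE_ENABLE_ALL;

        ID3D10BlendState* state = nullptr;
        device->CreateBlendState(&bd, &state);
        return state;
    }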

Conclusion

The projected grid port provided an excellent chance to test performance on both high- and low-cost graphics systems, while providing a good opportunity to determine how to scale content for both.

Modern low-cost graphics solutions have come quite a way in recent years, and the project also provided an excellent opportunity to contribute back to the graphics community.

The two areas of performance improvement with the greatest impact on the low-cost graphics target were the Perlin fog computation and the ocean grid computation. The latter renders only what lies in the camera's view frustum and was easy to control given the original algorithm, so we could readily reduce mesh complexity and, in doing so, reduce the scene overhead. Also, by combining the terrain and building meshes we gained even more performance on both integrated and discrete graphics.

By pre-computing the Perlin textures on the CPU and using the GPU only for blending and animating the texture, we came close to doubling our frame rates. Tuning down the ocean grid complexity and performing only the necessary reflection and refraction computations brought additional performance gains.

Lastly, the Intel Compiler was instrumental in auto-vectorizing our code, which boosted our performance even further (~10%).

Watch the Video

Chuck DeSylva describes and demonstrates a better way to generate fog and waves by off-loading calculations from the GPU to the CPU when the GPU was overloaded, resulting in up to a 3x improvement in frame rate. Click to watch the video: A Better Approach to Visualizing Ocean Fog and Waves.

References

1. Johanson, Claes. "Real-time water rendering - Introducing the projected grid concept." Master of Science thesis in computer graphics, Lund University, March 2004.

2. Gustavson, Stefan. "Simplex Noise Demystified." Linköping University, Sweden, 22 Mar. 2005. Accessed 15 Jul. 2008. /sites/default/files/m/0/c/9/simplexnoise.pdf.

3. Waltz, Frederick M., and Miller, John W. V. "An efficient algorithm for Gaussian blur using finite-state machines." SPIE Conf. on Machine Vision Systems for Inspection and Metrology VII, ECE Department, Univ. of Michigan-Dearborn, 1 Nov. 1998. Accessed 5 Aug. 2008. /sites/default/files/m/1/c/7/21_GBlur.pdf.

4. Zdrojewska, Dorota. "Real time rendering of heterogeneous fog based on the graphics hardware acceleration." Central European Seminar on Computer Graphics for Students, Technical University of Szczecin, 3 Mar. 2004. Accessed 10 Jul. 2008. http://www.cescg.org/CESCG-2004/web/Zdrojewska-Dorota/.

About the Authors

Chuck DeSylva
Chuck is a 15-year Intel veteran who was involved in the first USB/AGP (PCI-E) drivers and in developing early Intel graphics drivers. Since the turn of the 21st century he has worked on ISV application enabling in Intel's Software Solutions Group, promoting the acceleration and optimization of a wide array of software on Intel systems.

Alfredo Gimenez
Alfredo is a full-time computer science student at UC Davis. His internship has provided him an opportunity to learn DirectX 10 and begin applying both 3D artistry and computer science knowledge in a real-world setting. In his spare time, Alfredo is an avid flamenco guitar player and freestyle skier.

Jeff Andrews
Jeff Andrews is an Application Engineer with Intel working on optimizing code for software developers, currently focused on PC gaming. Jeff was the lead architect for Intel's Smoke demo framework; he provided many key performance enhancements to this effort and was invaluable to its success.

Please refer to the Optimization Notice page for more information regarding performance and optimization in Intel software products.