by Beverly Hanly
Independent game audio developers can finally play in the same league as the major studios, thanks to falling hardware costs and skyrocketing motherboard and processor capabilities.
The latest hardware advances (die shrinks, the Intel® NetBurst™ microarchitecture, and Hyper-Threading Technology, for example) deliver multi-GHz speeds and enormous bandwidth, letting developers push far more work through their PCs.
With Intel integrating audio circuitry onto motherboards, it's possible to move the whole sound arena toward standardization. Developers can worry less about the capabilities of the machines their games are played on, leaving them time to concentrate more on creating the game itself. For gamers, this translates into even more compelling games, and a consistent level of play.
"The manufacturers are on board with the computer: everything that's coming out is geared for the PC," says audio composer Chris Rickwood. "Now there's the power for those times when you have massive amounts of digital audio that has to run on eight tracks with effects processing at the same time."
"I try to keep things inside the computer as much as possible," says composer Mike Falcone of Falcone Studios. "Today's processors are so powerful that I can create whole compositions without having to use any external hardware at all."
Advances in hardware functionality can only be good news for composers.
Ganging Up on Standards
Integrated audio (audio circuitry placed directly on Intel motherboards) helps developers in a number of ways.
It brings the overall cost of a PC down because consumers don't need to purchase a separate sound card to hear the high-quality sound they've come to expect in their games.
An even bigger plus from the development standpoint is that developers don't have to code for cards of varying quality and different driver dependencies. Integrated audio helps the efforts to standardize and deliver consistent high quality audio. This makes music and audio effects creation easier for the developer, and allows the full impact of their work to reach the gamer, whether in stereo or full Dolby* 5.1 surround.
Game music developers have created their own organization: The Game Audio Network Guild (GANG), founded by major game audio composers.
GANG is working on a new standard for Musical Instrument Digital Interface (MIDI) that should allow greater instrumental range and feel, while at the same time ensuring that MIDI orchestrations will sound the same from PC to PC (sound card to sound card).
"A potential partnership with the Writers Guild may help strengthen the young guild's position as the recognized governing body in its movement toward standards," says Steve Pitzel, former Disney animator whose work at Intel as a senior technical marketing engineer brings him into contact with GANG members often. "GANG already offers a forum for all the best and brightest in PC audio to trade best-known methods."
Rickwood pointed to a GANG forum to reinforce his contention that composers are doing lots more with current CPUs. "Just today on the boards of GANG, a guy posted his equipment list; resources never go over 30 percent for him."
Many members of GANG also participate in a yearly audio developers think tank with the unlikely name of Project BarBQ. Hosted by legendary game musician and composer, George Sanger, known as "The Fat Man," this conference brings together audio hardware and software engineers, game composers and musicians, as well as engineers from the major computer chip manufacturers-including members of Intel's software and hardware design teams.
"The name 'Project BarBQ' might lead people to believe the conference is just a friendly get-together," says Steve Pitzel, who was among the Intel engineers in attendance this past October. "Since many of the folks involved in audio engineering are musicians themselves, the nightly jam sessions are a great way to bond and break down corporate barriers. But the days consist of 12- to 14-hour brainstorming sessions with everyone focusing on common issues with audio, and really working toward solutions."
So What Are the Issues?
Well, latency for one.
Latency is the time between when you give a command-strike a key on your MIDI keyboard for instance-and when you actually hear the result.
"Latency can be a killer in audio effects," said Pitzel.
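A back-of-the-envelope way to see where latency comes from: the audio driver collects samples into a buffer before handing them on, so each buffer adds a delay proportional to its size. Here is a minimal sketch (the function name and the specific buffer sizes are illustrative, not from the article):

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Delay added by one audio buffer: frames divided by frames per second."""
    return 1000.0 * buffer_frames / sample_rate_hz

# A common default buffer of 1024 frames at 44.1 kHz adds roughly 23 ms
# per buffer; shrinking it to 128 frames cuts that to under 3 ms, which
# is exactly what low-latency driver models are designed to allow.
print(buffer_latency_ms(1024, 44100))  # ~23.2
print(buffer_latency_ms(128, 44100))   # ~2.9
```

Real round-trip latency is higher, since input and output buffers (and any effects processing in between) each add their own delay.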
WDM (Windows Driver Model) and ASIO (Audio Stream Input/Output) are driver specifications designed to reduce latency. The problem is that drivers of varying quality are built from these specifications, and developers of audio hardware and software tend to favor one or the other.
"Some hardware manufacturers are married to WDM, others concentrate on ASIO," said Pitzel. "Implementations of both are inconsistent. If you write music software, or compose on the PC, you have to be sure that the sound card and your chosen software use the same driver model. For instance, if you're running Cakewalk software for music creation, you really should have a card with solid WDM drivers to get the best results and the lowest latency. If you're running Steinberg's line of software, you should use a card that runs well with ASIO drivers. It's rare that you find a software package or hardware that runs just as well with either.
"It's a matter of programming time-how many hours the companies can put into it. Companies that make audio software and hardware are usually small. It's difficult to drive your engineering hours in two directions and still get great quality. Some companies make solid drivers for both models, others don't."
One of the great advantages of high-end add-on sound cards has been the variety of inputs and outputs they provide. But because much of the actual processing of audio can be done directly by the CPU and integrated audio circuitry, some developers are getting around the input/output issue by directing audio and MIDI information through a USB port, or a 1394 port, such as FireWire.
Many audio developers like 1394 connections. A very fast external bus standard, 1394 supports data transfer rates of up to 400 Mbps. A single 1394 port can connect up to 63 external devices and supports isochronous data transfer, delivering data at a guaranteed rate.
Intel® Pentium® 4 processor-based systems include USB 2.0, which supports a transfer rate of up to 480 Mbps and is backward compatible with earlier USB devices. Adoption of USB 2.0 by peripheral manufacturers is still in its early stages, but it is growing quickly.
The preference between the two connection types has been hotly debated and has yet to be resolved. To sidestep the debate, many system manufacturers include both USB 2.0 and 1394 connections and leave it to users and software developers to choose between them, or to use both.
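Those headline transfer rates map directly onto audio channel counts. As a rough sketch, ignoring bus protocol overhead (which reduces the usable rate in practice; the function names here are illustrative):

```python
def stream_mbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Raw bandwidth of an uncompressed PCM stream, in megabits per second."""
    return sample_rate_hz * bit_depth * channels / 1_000_000

def streams_per_bus(bus_mbps: float, per_stream_mbps: float) -> int:
    """How many such streams fit on a bus, ignoring protocol overhead."""
    return int(bus_mbps // per_stream_mbps)

# A 24-bit, 96-kHz stereo stream needs about 4.6 Mbps raw, so even a
# 400-Mbps 1394 bus could in principle carry dozens of such streams.
stereo = stream_mbps(96_000, 24, 2)   # 4.608 Mbps
print(streams_per_bus(400, stereo))   # 86 (1394)
print(streams_per_bus(480, stereo))   # 104 (USB 2.0)
```

Either bus has bandwidth to spare for multichannel audio; the debate is more about device support and driver maturity than raw speed.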
Memory is another concern for audio development and delivery: sound files are notoriously large and can take up quite a bit of memory. Many developers grappling with memory issues prefer to write an all-MIDI score, which obviates the need for large .wav files.
"This drastically reduces the amount of memory needed for music," said Pitzel, "but it can come at the expense of overall sound quality and consistency, due to the wide range of MIDI playback schemes. A piano sound on one system, for instance, might have the richness and depth of an expensive studio grand, but sound as tinny as a honky-tonk upright on another."
"A better option is disk streaming of pre-recorded audio tracks," says Falcone, who composed the music for the game Law & Order Interactive*. "Even if the audio uses a lossy compression format like MP3, the quality blows MIDI away, and if implemented correctly it can remain entirely interactive."
Falcone also likes Tascam's GigaStudio*. "As opposed to traditional RAM-based samplers, GigaStudio is hard-disk-based," he says. "It uses a proprietary audio driver that works at the kernel level of the OS to trigger multiple samples off the hard drive with minimal latency."
More Power to Ya
Let's consider bit depth and sampling rate. Bit depth is the resolution of each audio sample; sampling rate is how many samples are taken per second. Together they determine how much information it takes to digitally describe a sound. CD quality uses 16 bits at a sampling rate of 44.1 kHz, and those settings are fine for most applications.
As audiences (and ears) become more sophisticated, greater bit depths and higher sampling rates may be needed, and it may be more common to see 24-bit depth and 96-kHz sampling rates in gaming.
This means not only that software and hardware will need more power, speed, bus bandwidth, and sophistication to describe and process the sound, but also that the files describing those sounds will be larger, so more memory and drive space will be needed as well.
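The memory cost is easy to quantify: uncompressed PCM size is just sampling rate times bytes per sample times channels times duration. A quick sketch (the function name is illustrative; sizes use decimal megabytes):

```python
def pcm_megabytes(sample_rate_hz: int, bit_depth: int, channels: int,
                  seconds: float) -> float:
    """Size of uncompressed PCM audio, in megabytes (10^6 bytes)."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds / 1_000_000

# One minute of CD-quality stereo (16-bit/44.1 kHz) is ~10.6 MB;
# the same minute at 24-bit/96 kHz grows to ~34.6 MB, over 3x larger.
print(pcm_megabytes(44_100, 16, 2, 60))  # 10.584
print(pcm_megabytes(96_000, 24, 2, 60))  # 34.56
```

That better-than-threefold growth per track, multiplied across many simultaneous tracks, is what drives the demand for more memory, drive space, and bus bandwidth.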
The Pentium 4 processors coming out now are built on a process that packs many more transistors into a much smaller space than before, giving you more overall processing real estate. That extra real estate can hold more cache, and faster cache, leading to better overall performance: more tracks with more effects and more virtual instruments.
The smaller the transistor, the shorter the distance electricity has to travel, which also increases speed. Add to that the Pentium 4's NetBurst microarchitecture with a Rapid Execution Engine that allows some calculations to happen at twice the speed of the processor, and you have an extremely fast machine.
Today's Pentium 4 processor-based machines also have a faster, wider front-side bus, which means a bigger pipeline. Just as more water can move through a big hose than a small one, more information can move between memory and the CPU more quickly.
"The bottom line is: you can do more, faster," said Pitzel. "With audio, that's very important. These features will be part of the Pentium 4 processor architecture from this point on."
Developers working with multiple tracks sweetened with multiple effects really tax their CPUs when they attempt to make it all play together.
"Suppose you're working with several instrumental tracks," says Pitzel, "and you have reverb, delay, flange, or chorus-different effects on all of them. All of those pull CPU resources. If you don't have a powerful CPU, you'll find yourself mixing down a few parts at a time, doing your arrangement piecemeal. It's a little like painting a portrait while only seeing one facial feature at a time-getting it all to work together could take as much luck as skill."
The Pentium 4 processor with NetBurst microarchitecture allows a composer to add more tracks and effects and hear what he's doing in real time.
Hyper-Threading Technology-Will It Help Developers?
Hyper-Threading Technology can facilitate the development process in several ways.
If you're designing video along with audio, the CPU normally has to slice its time between the two separate processes. To the user, it may look as if both functions are happening at once, albeit slowly. What's great about Hyper-Threading Technology is that both applications really do run at the same time, giving you greater overall speed. You could get this with multithreading alone, but you would need two physical processors.
At any given time, a processor is usually only about 60 percent utilized; the remaining 30 to 40 percent of its resources are standing by, waiting. Hyper-Threading Technology makes one processor act like two by delivering work to those normally idle resources.
"Audio developers are typically working with lots of effects-vocals, instruments, and a variety of sounds at the same time," says Pitzel. "Hyper-Threading Technology was built to handle situations like that-where several things need to happen at once. You're not really getting twice the processing-not what you'd get with two processors. Part of the CPU is shared, but Hyper-Threading Technology can give you an extra 35 to 50 percent. Better CPU utilization allows you maybe 1.5 times better performance. Combine this with the multi-gigahertz processor speeds we now have, and you really have something. You not only have high speed-you can do more with it."
"This really helps in multi-track recording. To really get your tracks to play well together, you need the ability to hear them all at once and make adjustments accordingly. With more processors that are bigger and beefier you can do that," says Pitzel. "You can do the same work on a lesser machine, but you'd have to do it in pieces. For instance, you might not be able to process effects like reverb, compression, or delay and listen to all of your tracks at the same time. You'd have to commit to the type and amount of effect up front, render your tracks with those effects, and just hope they all sound good together in the end. Not the best way to get the sound you want."
Multi-tracking with Hyper-Threading Technology means better control in the creation process, and can ultimately lead to a better experience for the gamer.
"Most game developers hate the term 'filmatic' when it's applied to games, and rightly so since games are really a different animal altogether. But many of us still regard the look and sound experience available in a quality theater setting to represent the pinnacle. If you look at it that way, we're getting closer to having a real filmatic experience in realtime on your PC," says Pitzel. "Complete with surround sound and the visual quality of a feature film. It's heading that way and it will be happening sooner than we think."
About the Author
Beverly Hanly is a freelance writer and editorial consultant based in San Francisco. She has written and edited for Wired News, PC World, CNet, and CMP Media, and teaches writing and editing for the Web. Her favorite part of the writing process is interviewing sources. Black and white analog photography, beach walks, and writing about arts and culture balance her focus on technology.