English Pages 532 [447] Year 2003
Introduction by Todd M. Fay DirectMusic is software that allows composers, sound designers, and audio engineers to create digital musical content with interactive, variable, and adaptive playback properties. What does that mean? Moreover, who cares? Chances are that Joe Musician is not terribly familiar with the terms "interactive," "variable," and "adaptive" as they relate to audio production — and for good reason; musical pieces featuring these properties have yet to enter the musical mainstream. Well, we are here to change all of that…
DirectMusic in Interactive Media Development (aka Computer/Video Games)
While not familiar terms in mainstream music production or motion picture post, interactive, variable, and adaptive audio playback are mainstay terms in the world of game audio production. Games, like other visual media, use audio as a means of communicating information and manipulating audience emotion. However, there are some pitfalls inherent to the medium that audio producers need to deal with, namely nonlinearity, variable environments, repetition, storage space, and processing power limitations. These potential roadblocks require unique creative and technical solutions. DirectMusic exists, in part, to assist audio producers in dealing with these problems.
DirectMusic is part of DirectX Audio, the audio portion of a suite of software tools (or "library") collectively referred to as DirectX. Microsoft created DirectX to give programmers powerful tools to create games for Windows PCs. And they succeeded; DirectX made Windows the premier PC platform for games. DirectX Audio consists of two components, DirectMusic and DirectSound. At first, DirectX had only the DirectSound library. Programmers used DirectSound for basic playback of a game's audio assets. DirectMusic was introduced later to give composers and sound designers more control over their contribution to the game. Today, the two are referred to collectively as DirectX Audio. We focus mainly on DirectMusic in this text because it is easier to work with from the perspective of a composer and sound designer and is in effect more powerful. Just understand that DirectSound came into existence before DirectMusic, and DirectMusic was built, in part, on top of DirectSound. This is already more than you need to know, so let's avoid entering into any more details here, as there is a book on the subject (the one you're holding).
Greater Control for Audio Producers In the early days of game development, audio producers were largely dependent upon audio programmers (software engineers) to get their sounds and music into their game, a process known as integration. DirectMusic relieves much of this dependency. Using DirectMusic Producer, an audio producer can place all of his or her waves, sequence data, sample sets, and even instructions on how audio should be played interactively, adaptively, and with variation. The sum of the audio producer's work is saved in a file called a Segment (.sgt files). A Segment is to an adaptive, interactive, and variable score as a wave and/or MIDI file is to a linear score. Because of the completeness of the information and data stored in a DirectMusic Segment, the role of the programmer in integrating is reduced significantly. Gone are the days of a project's audio programmer screwing with volume levels or triggering sounds to the wrong events. The programmer doesn't need to know how the audio gets played back; he only tells the game to play the piece of music, and whatever audio data and parameters were specified by the composer are used at playback.
This is also useful for audio producers creating stand-alone music with DirectMusic. They can save all of their samples, waves, sequence data, and instructions for adaptation, variation, and interactivity (if any) in the Segment. The listener then loads the file into a DirectMusic player program, and voilà: beautiful music!
There will, of course, always be some amount of information that programmers and composers must communicate to each other. The programmer must let the composer know how much memory he can use for sample, wave, and sequence data and whether he can stream wave data off of the hard disk or CD/DVD. The programmer generally needs to know how to transition from one piece of music to the next (on measure, beat, or marker boundaries, and so on). The programmer also typically implements things like tempo changes based on game state. However, a composer (or programmer) can instead use the DirectX Audio Scripting language to move control over these aspects of playback from the programmer to the audio producer. A scripting language lets the audio producer create simple, easy-to-read "programs" (known as routines) that a programmer can then run in response to programmatic events. The language is designed to be simple to learn while still offering audio producers a lot of flexibility and power. We discuss scripting in more detail in Chapter 7.
DirectMusic for Music Production The applications for interactive, variable, and adaptive audio, particularly music, extend beyond computer and video games. The potential for DirectMusic to alter the way we listen to digital music in our homes, cars, and anywhere else where we can smuggle a personal musical player is very exciting. Music written with interactivity, adaptability, and dynamism allows audiences to take a more active role in their listening experiences. A recording artist's next single could be a DirectMusic module that listeners can intuitively remix with, introduce new musical parts and effects to, or set to react to changes in their computing environment. It could even be made to never play back the same way twice, if listeners so chose! Admittedly, not everyone is interested in having such an active role in music listening. Passive listeners can still enjoy enhanced musical experiences by setting their DirectMusic songs to play back according to packaged templates. "Factory settings" could provide alternate instrumentation, allowing a jazz piece to take on more of a world, rock, or hip-hop feel for instance, even switching back and forth between styles and instrument sets. Other templates could range from intelligent "concert hall" or "arena" reverberation settings using built-in effects processing to completely self-generated performances based on a loose set of criteria created from scratch with every play. There are plenty of tools on the market that allow just about anyone with a few hundred bucks to begin creating music on his computer. The differences between your average audio editing or multitracking program and DirectMusic are the interactive, adaptive, and variable abilities that you will not find anywhere else (at least not to this degree). Remember, these kinds of soundtracks, songs, and soundscapes are for the most part unexplored territory. 
We assure you, the terrain in this territory is unlike any you've experienced before, either as a composer, a performer, an engineer, or a listener. At the time of this book's publication, Microsoft wasn't promoting DirectMusic as a mainstream music production tool. DirectMusic was virtually unknown to the millions of music-producing hobbyists and professionals looking for new ways to distinguish themselves among their peers and audiences. Only a few members of a small community were using DirectMusic for audio production outside of games. Our solution to bringing interactive, adaptive, and variable music to the masses? Educate audio producers on how to use DirectMusic to create new sonic and musical experiences. The more DirectMusic songs created, the more that people can begin to experiment and appreciate these new musical experiences.
DirectMusic is the chosen platform for this neo-musical production crusade for three reasons. First, its depth. There is not, to our knowledge, any other interactive/adaptive audio solution as comprehensive as DirectMusic. Secondly, its reach. Anyone with a Windows machine and DirectX 6.1 or higher can enjoy music created for DirectMusic. Oh, and number three: It's free! We wish we could add "ease and convenience of use" to our list of reasons to love DirectMusic, but if that were the case, you would not have much use for this fine text! DirectMusic Producer (the program Microsoft gives you to create DirectMusic content) is not user-friendly production software. In all fairness to the designers at Microsoft, DirectMusic Producer (DMP) isn't commercial software, and it was never meant to be. It's a bit convoluted as an interface, and it certainly isn't pretty to look at. Add a slew of alien concepts in audio production, and you have a significant barrier to entry. Up until now, there wasn't even a book on how to get around in DMP, let alone DirectMusic. The good news is that since DMP is simply an interface for producers to work with DirectMusic, someone somewhere could build a more user-friendly program to replace DMP, without sacrificing any of DirectMusic's power. Of course, it's pure speculation as to if that may ever happen… In the meantime, DMP gets the job done. Heck, we often rely on another sequencer/multitrack and editing software package for the bulk of our production work — only using DMP when we want to apply some of DirectMusic's unique functionality to our productions (dynamism, interactivity, and/or adaptability). Without a doubt, the single biggest drawback to producing music using DirectMusic is the lack of a customizable, standardized, and widely available player program. See, DirectMusic files are special in that they cannot just be opened in Windows Media Player, Real Player, Winamp, or whatever other music player people use day in and day out. 
They have to be played back using a program specially written to play DirectMusic files. Truth be told, WMP, Winamp, Real Player, and their peers could support DirectMusic files, but they just don't. Maybe someday they will, but for now DirectMusic playback options are limited to launching DirectMusic Producer or writing your own player. Luckily for all of us, Mr. Todor Fay (not to be confused with Todd M. Fay!) decided to create Jones. Jones is a DirectMusic player. You can load a DirectMusic file into it and play it back. Jones is free, and you can distribute it as you see fit! So get it out to your audience so they can listen to your new musical creations! There is more on using Jones later… Excited? Good! Read this book and dig into what DirectMusic has to offer you! The wisdom in these pages and the software offer you a grand opportunity for you to distinguish your art from that of your strictly linear audio-producing peers.
DirectX Audio for Software Engineers This is short and sweet. DirectX 9 Audio Exposed: Interactive Audio Development contains a comprehensive tutorial-style look at the inner workings of the DirectX Audio libraries. Everything you'd like to know about the DirectMusic API is within these pages. Todor Fay, the originator of the cornerstone DirectMusic technology, has laid out much of his wisdom for all of us to learn from. So take a look at the examples and discussions and get some ideas about how you can build your own software applications upon the DirectX Audio backbone. We look forward to seeing/hearing/using what you come up with!
Unit I: Producing Audio with DirectMusic
Chapter List
Chapter 1: DirectMusic Concepts
Chapter 2: Linear Playback
Chapter 3: Variation
Chapter 4: Interactive and Adaptive Audio
Chapter 5: DirectMusic Producer
Chapter 6: Working with Chord Tracks and ChordMaps
Chapter 7: DirectX Audio Scripting
Chapter 1: DirectMusic Concepts Overview Scott Selfon DirectMusic and DirectMusic Producer (the tool used to author music and soundscapes for DirectMusic) introduce some new concepts in audio production. Learn to appreciate what follows, and you will see that there is a new musical world waiting to be conquered. But first, a quick explanation of terms…. We use the term "audio producer" throughout this book. An audio producer is anyone who creates audio to be used as part of a DirectMusic project. This can be a sound designer, producer, recording engineer, composer; we don't care. If your focus is creating audio to be placed in a project, you are an audio producer. If your job is focused on integrating sounds into a project using code, building tools/extensions to DirectX Audio, or developing playback software, or you are a game programmer, then you should perk up when you see the term "programmer."
Interactivity "Interactivity" is both the most overused and misused term in game audio. Often, when someone speaks of interactive audio, he is referring either to adaptive audio (audio that changes characteristics, such as pitch, tempo, instrumentation, etc., in reaction to changes in game conditions) or to the art and science of game audio in general. A better definition for interactive audio is any sound event resulting from action taken by the audience (or a player in the case of a game or a surfer in the case of a web page, etc.). When a game player presses the "fire" button, his weapon sounds off. The sound of the gun firing is interactive. If a player rings a doorbell in a game world, the ringing of the bell is also an interactive sound. If someone rolls the mouse over an object on a web site and triggers a sound, that sound is interactive. Basically, any sound event, whether it be a one-off, a musical event, or even an ambient pad change, if it comes into play because of something the audience does (directly or indirectly), it is classified as interactive audio.
Variability An interesting side effect of the performance of recorded music production is the absence of variation in playback. Songs on a CD play back the same every single time. Music written to take advantage of DirectMusic's variability properties can be different every time it plays. Variation is particularly useful in producing music for games, since there is often a little bit of music that needs to stretch over many hours of gameplay. Chapter 21 discusses applications of variation in music production outside the realm of games. The imperfection of the living, breathing musician creates the human element of live musical performances. Humans are incapable of reproducing a musical performance with 100 percent fidelity. Therefore, every time you hear a band play your favorite song, no matter how much they practice, it differs from the last time they played it live, however subtle the differences. Repetition is perceived as being something unnatural (not necessarily undesirable, but unnatural nonetheless) and is easily detectable by the human ear. Variability also plays a role in song form and improvisation. Again, when a song is committed to a recording, it has the same form and solos every time someone plays that recording. However, when performed by musicians at a live venue, they may choose to alter the form or improvise in a way that is different from that used in the recording. DirectMusic allows a composer to inject variability into a prepared piece of digital audio, whether a violin concerto or an audio design modeled to mimic the sounds of a South American rain forest. Composers and sound designers can introduce variability on different levels, from the instrument level (altering characteristics such as velocity, timbre, pitch, etc.) to the song level (manipulating overall instrumentation choices, musical style choices, song form, etc.). 
Using DirectMusic's power of variability, composers can create stand-alone pieces of music that reinvent themselves every time the listener plays them, creating a very different listening experience when juxtaposed to a mixed/mastered version of the same music. Composers alter the replay value of their compositions as well by allowing their music to reinvent itself upon every listening session. Avoidance of audio content repetition in games is often important. When asked about music for games, someone once said, "At no time in history have so few notes been heard so many times." Repetition is arguably the single biggest deterrent to the enjoyment of audio (both sound effects and music) in games. Unlike traditional linear media like film, there are typically no set times or durations for specific game events. A "scene" might take five minutes in one instance and hours in another. Furthermore, there is no guarantee that particular events will occur in a specific order, will not repeat, or will not be skipped entirely. Coupled with the modest storage space (also known as the "footprint") budgeted for audio on the media and in memory, this leaves the audio producer in a bit of a quandary. For the issue of underscore, a game title with hours of gameplay might only be budgeted for a few minutes worth of linear music. The audio director must develop creative ways to keep this music fresh — alternative versions, version branching, and so on. Audio programmers can investigate and implement these solutions using DirectMusic. As to events triggering ambience, dialog, and specific sound effects, these may repeat numerous times, adding the challenge of avoiding the kind of obvious repetition that can spoil a game's realism for the player. As we discuss in more detail, DirectMusic provides numerous methods for helping to avoid repetition. On the most basic level, audio programmers can specify variations in pitch and multiple versions of wave, note, and controller data. 
Even game content (specific scripted events in the game for instance) can specify orderings for playback (such as shuffling, no repeats, and so on) that DirectMusic tracks as the game progresses. Using advanced features, chord progressions can maintain numerous potential progression paths, allowing a limited amount of source material to remain fresh even longer.
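The playback orderings mentioned above (shuffling, no repeats) can be illustrated with a small model. The following is a minimal Python sketch and not DirectMusic code; the class name `VariationPicker` and the mode names `"shuffle"` and `"no_repeat"` are hypothetical, chosen only for this illustration.

```python
import random


class VariationPicker:
    """Toy model of choosing among variations of one sound.

    Hypothetical illustration only; this is not the DirectMusic API.
    """

    def __init__(self, count, mode="shuffle", seed=None):
        if count < 1:
            raise ValueError("need at least one variation")
        if mode not in ("shuffle", "no_repeat"):
            raise ValueError("unknown mode: " + mode)
        self.count = count
        self.mode = mode
        self.rng = random.Random(seed)
        self.last = None   # index played most recently
        self.queue = []    # indices left in the current shuffle pass

    def next(self):
        if self.mode == "shuffle":
            # Play every variation once, in random order, before reusing any.
            if not self.queue:
                self.queue = list(range(self.count))
                self.rng.shuffle(self.queue)
                # Avoid a back-to-back repeat across two passes.
                if self.count > 1 and self.queue[0] == self.last:
                    self.queue[0], self.queue[-1] = self.queue[-1], self.queue[0]
            choice = self.queue.pop(0)
        else:
            # "no_repeat": any variation except the one just played.
            candidates = [i for i in range(self.count) if i != self.last]
            choice = self.rng.choice(candidates or [0])
        self.last = choice
        return choice
```

With 20 gunshot variations, for example, `VariationPicker(20, "shuffle")` would cycle through all 20 before any is heard again, while `"no_repeat"` only guarantees that the same variation is never heard twice in a row.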
Adaptability Adaptive audio is audio that changes according to the state of its playback environment. In many ways, it is like interactive audio in that it responds to a particular event. The difference is that instead of responding to feedback from the listener/player, the audio changes according to changes occurring within the game or playback environment. Say for instance that a game's musical score shifts keys with the rising and setting of the sun within the game world. The player isn't causing the sun to set; the game is. Therefore, the score "adapts" to changes happening within the game's world. A famous example of adaptive audio in games occurs in Tetris. The music plays at a specified tempo, but that tempo will double if the player is in danger of losing the game. Avoiding repetition is an excellent first step in a strong audio implementation for games, as well as reintroducing music listeners to some potentially intriguing performance characteristics lost when listening to linear music. Continuing to focus on audio for games for a moment, audio content triggered out of context to game events is in many ways less desirable than simple repetition. For instance, using data compression and/or high-capacity media, a composer might be able to create hours and hours of underscore, but if this linear music is played with no regard to the state of the game or the current events in the game's plot, it could be misleading or distracting to the user. Adaptive audio (audio that changes according to changes in certain conditions) provides an important first step for creating interactivity in underscore. Do not confuse adaptive audio with variability — if, for instance, 20 variations of a gunshot are created and a single one is randomly chosen when the player fires a gun, that is variable but not necessarily adaptive. If the gunshot's reverberation characteristics modulate as the player navigates various and differing environments, then we have adaptive sound design. 
Likewise, a game's score could have infinite variability, never playing the same way twice, but it becomes adaptive when the sultry slow jazz tune fades in over the rock theme during the player's introduction to the game's film noir femme fatale antagonist. Now contrast variability and adaptation with interactivity; a character that stops talking when shot by the player is an example of interactive audio. Adaptation does not always need to mean "just-in-time" audio generation. While an underscore that instantly and obviously responds to game events might be appropriate for a children's animated title, this kind of music is often inappropriate for a first-person shooter, for instance. In such a style of game, where the breakneck pace of the action constantly modulates the game state, the user can begin to notice the interactivity of the score. This is dangerous, as the subtlety of the score is lost on the player, potentially damaging the designer's carefully crafted game experience. In a game like Halo, you do not want the player performing music via an exploit in the interactive music engine. For this reason, musical changes are often most satisfying when "immediate" changes are reserved for important game events and more subtle interactivity is used for other events and situations.
Groove Levels One of the more powerful ways that DirectMusic exposes adaptive audio is with the groove level. Groove level is a musical or emotional intensity setting that adjusts in real time according to rules you define as part of your playback environment. You can set up different moods within a DirectMusic piece and assign those moods to different groove levels. For instance, someone could produce a progressive trance piece with sparse musical activity on groove level 1 and increase the intensity of the music through various groove levels. This can be achieved by adding more notes and instruments, or by setting higher volumes. You can then assign the different intensity (groove) levels to trigger upon changes in the playback environment. Say you create a DirectMusic piece for stand-alone music listening; you could set a rule that switches groove levels according to the time of day or even the number of windows the listener has open on the desktop. The possibilities are truly endless. We discuss groove levels in much more detail in the section on Style-based Segments in Chapter 4.
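To make the idea of assigning musical material to groove levels more concrete, here is a minimal Python sketch. It is a toy model rather than DirectMusic code: the 1 to 100 range, the layer names, and the range boundaries are all assumptions made for illustration, and the two function names are hypothetical.

```python
# Hypothetical layers of a piece, each active over a range of groove levels.
LAYERS = [
    ("pad", 1, 100),
    ("bass", 20, 100),
    ("drums", 40, 100),
    ("lead", 70, 100),
]


def groove_from_intensity(intensity, lowest=1, highest=100):
    """Map an intensity in [0.0, 1.0] to an integer groove level."""
    clamped = min(1.0, max(0.0, intensity))
    return lowest + round(clamped * (highest - lowest))


def active_layers(groove):
    """Return the names of the layers whose range contains the groove level."""
    return [name for name, low, high in LAYERS if low <= groove <= high]
```

The intensity value could come from anything the playback environment can measure, such as the time of day or the number of open windows mentioned above; the environment only sets a number, and the authored content decides what that number sounds like.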
Content-Driven Audio Adaptive audio becomes really interesting when the audio gets behind the wheel and drives the playback environment to behave and/or change in a particular manner. This is commonly referred to as content-driven audio. DirectMusic's sequencer provides notifications that can be used to manipulate the playback environment, allowing for the interesting possibility of music driving gameplay rather than the opposite. For instance, a monster around the corner could wait to jump out at you until an appropriate moment in the score. Content-driven audio is a particularly unexplored area of DirectMusic, one that could be put to some interesting uses.
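As a rough illustration of the monster example, the sketch below computes the next measure boundary so that a game event can be deferred to it. It is a simplified Python model that assumes a constant tempo and a fixed meter; it does not use DirectMusic's notification mechanism, and the function name and parameters are hypothetical.

```python
import math


def next_measure_boundary_ms(now_ms, tempo_bpm, beats_per_measure=4):
    """Return the time (ms from the start of the music) of the next measure
    boundary at or after now_ms, assuming a constant tempo and meter."""
    if tempo_bpm <= 0 or beats_per_measure <= 0:
        raise ValueError("tempo and meter must be positive")
    measure_ms = beats_per_measure * 60000.0 / tempo_bpm
    return math.ceil(now_ms / measure_ms) * measure_ms


# The game decides at 4.5 seconds that the monster should attack.
# At 120 beats per minute in 4/4, a measure lasts 2000 ms,
# so the attack is held until the boundary at 6000 ms.
attack_at = next_measure_boundary_ms(4500, 120)
```

In a real title the music engine would report such moments to the game rather than the game predicting them, which also keeps the timing correct when the tempo changes.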
Playback Standardization
DirectMusic can play wave files. The great thing about wave files is that they can be CD quality, and (speaker quality notwithstanding) they sound the same when played back on any computer. The problem with wave files is that they are big when compared to MIDI files, and they often limit adaptability and variation. A partial solution is to create a custom sample bank that is distributed with the MIDI sequence data in your DirectMusic file. While in most cases you are still forced to use minimal file sizes (restricting quality), you don't have to worry about different listeners hearing your sequenced music through different synthesizers/samplers; you've given them the sample bank that you created as part of the DirectMusic file. Before this standardization took place, you might have listened to a MIDI file of Beethoven's Fifth through your SoundBlaster, while we listened to it through the sample set on our Turtle Beach sound card. The result: We both would have heard very different renditions. This is an audio producer's nightmare, since you have no control over the instruments on which your sequence plays. Luckily, this is no longer a problem, thanks to DLS-2.
DLS-2 (the Downloadable Sounds format, level 2) is a sound format used for creating and storing samples, much like the popular Akai sample format. DirectMusic can use DLS-2 samples. The great thing about DLS-2 support in DirectMusic is that audio producers can create their own custom sample bank and include it as part of the DirectMusic file. This means that no matter who plays the DirectMusic file, they will hear the same sounds that everyone else does (as opposed to relying on their sound card or soft synth to render the sequence data). DLS also specifies basic synthesis abilities: which waves comprise an Instrument and how to modify the source wave data according to MIDI parameters like pitch bend, modulation, and so on.
DirectMusic, or more specifically the Microsoft Software Synthesizer that DirectMusic uses, supports DLS-2, which adds numerous additional features, including six-stage envelopes for volume and pitch, two low-frequency oscillators, and MIDI-controllable filtering per voice. When using DirectMusic, these sample-based Instruments are stored in DLS Collections, which can be self-contained files (generally using the .dls file extension) or embedded directly within pieces of music. This is similar to traditional MOD files and the newer .rmid extension to standard MIDI files, where sampled Instruments can be embedded within the same file as the note information.
3D Spatialized Audio Our world is one of three dimensions. Most audio that you've heard played over the radio or television exists in one or two dimensions. Even surround sound lacks a vertical component (sound coming from above or below). Audio engineers have developed a technique that synthesizes the way we hear sound in 3D space. DirectX Audio has this functionality. This means that a sound can be mixed in space not only to sound closer, farther away, or to the left or right but also from above, below, and even behind us, all using just two speakers! We do not go into detail here on how or why DX Audio is able to do this. Just know that you have the ability to mix sound in 3D using DirectX Audio.
DirectMusic Rules and Regulations DirectMusic imposes several rules and limitations that you should keep in mind. We cover the more significant ones here and discuss more specific restrictions as they arise.
Each DirectMusic Performance Has a Single Tempo at Any One Time
A DirectMusic Performance is somewhat analogous to a conductor and an orchestra; the conductor is only able to conduct at a single speed at any time (unless he is conducting some avant-garde piece of contemporary music!), and the orchestra as a whole needs to understand where those beats are. All pieces of music playing on a Performance must play at the same tempo, and a tempo change imposed by one piece of music will affect all other playing pieces of music. This becomes most interesting when using DirectMusic for playing sound effects and ambient sound in addition to music, as these sounds care nothing for tempo. SFX and ambience may inadvertently cut off because of musical tempo changes. There are several options that you may consider:
• Author all content at the same tempo. This of course defeats the ability to change speeds based on various events.
• Use clock-time Segments for sound effects. Every Segment file can specify whether it is meant to use music-time or clock-time. A music-time Segment bases its timing information on the tempo of the music. Notes will play for a specific number of measures, beats, ticks, etc. If the tempo changes, the note will be played for less or more time, which is desirable for music but not so much for other sounds. A clock-time Segment, by contrast, only pays attention to the system clock and absolute time. Generally used for Segments that only contain non-note-based wave information, a clock-time Segment uses millisecond accuracy for playback. A wave told to play for 5.12 seconds will play for exactly that amount of time, regardless of tempo changes that occur while it is playing.
• Use multiple DirectMusic Performances. You can always run more than one DirectMusic Performance simultaneously. The amount of processing power used (i.e., the CPU overhead) for a second DirectMusic Performance is small, though of course there are now additional assets to manage and track. Consider playing sound effects, ambience, and other audio cues that do not respond to tempo on one Performance (with constant tempo), while music-oriented sounds play on another Performance (with variable tempo).
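The difference between music-time and clock-time durations comes down to simple arithmetic, worked through in the Python sketch below. The function names are hypothetical and the code only illustrates the calculation; it is not part of DirectMusic.

```python
def music_time_ms(beats, tempo_bpm):
    """Length in milliseconds of a duration given in beats at a fixed tempo."""
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    return beats * 60000.0 / tempo_bpm


def clock_time_ms(seconds):
    """A clock-time duration ignores tempo entirely."""
    return seconds * 1000.0
```

A four-beat note lasts `music_time_ms(4, 120)`, or 2000 ms, at 120 beats per minute, but 4000 ms if the tempo drops to 60. The 5.12-second wave from the example above has no tempo term at all, so its length stays at about 5120 ms whatever the music does.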
Primary and Secondary Segments
The next restriction to keep in mind is the difference between primary and secondary Segments. A Segment can be either primary or secondary. Only one primary Segment may play at a time; starting a new primary Segment will implicitly stop and replace any previously playing primary Segment. Primary Segments typically dictate tempo, groove level, chord progression, and other big picture (aka "global") performance-level events. By contrast, many secondary Segments may be playing at the same time. Secondary Segments typically layer on top of the primary Segment, picking up and using the primary Segment's tempo and other settings. However, if a secondary Segment plays as a controlling secondary Segment, rather than layer on top of the primary Segment, it will actually replace corresponding Tracks from the primary Segment. Controlling secondary Segments are typically reserved for more advanced usage (for instance, changing the chord progression of the primary Segment, say, when an antagonist enters the same room as the hero in the game).
Primary Segments are generally the main background music or ambience. Secondary Segments are often "one-shot" sounds or "stingers" that play over the primary, perhaps sound effects or musical motifs. Secondary Segments follow the primary's chord progression, so musically motivated secondary Segments can play with appropriate transposition and voicing.
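The one-primary, many-secondaries rule can be summarized in a few lines of Python. This is a toy model for illustration only: the `Performance` class below and its method names are hypothetical and are not the DirectMusic programming interface, and controlling secondary Segments are left out.

```python
class Performance:
    """Toy model of the primary/secondary Segment rule.

    Hypothetical illustration only; this is not the DirectMusic API.
    """

    def __init__(self):
        self.primary = None      # at most one primary Segment
        self.secondaries = []    # any number of layered Segments

    def play_primary(self, name, tempo_bpm):
        """Start a primary Segment, replacing any previous one."""
        replaced = self.primary
        self.primary = (name, tempo_bpm)
        return replaced[0] if replaced else None

    def play_secondary(self, name):
        """Layer a secondary Segment on top of the current primary."""
        self.secondaries.append(name)

    def tempo(self):
        """Tempo heard by every Segment: the primary's, if one is playing."""
        return self.primary[1] if self.primary else None
```

Starting a second primary returns the name of the one it displaced, while the list of secondaries is left alone and simply follows the new primary's tempo.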
Crossfades Programmers and audio producers should note that crossfades are not a built-in DirectMusic function for transitioning between Segments. For now, it is a fact, and so you are going to have to fudge them. It's not all bad, as you do have options. These options include AudioPath volume manipulation, MIDI volume controller use, or authoring volume crossfades directly into content. Remember that only one primary Segment can play at a time, which precludes the possibility of using primary Segments exclusively for crossfades. Unless multiple DirectMusic Performances are used, at least one of the Segments we are crossfading between will need to be a secondary Segment. For ease of use, we suggest that both be secondary Segments. The one-tempo limitation we discussed above makes crossfades between two pieces of music with different tempos difficult. One of the pieces will be played faster or slower (unless multiple DirectMusic Performances are used). Of course, this limitation really only applies for sequenced music; prerendered wave files using different tempos play fine.
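One of the options above, driving a crossfade with MIDI volume controller values, can be sketched as follows. This Python example only generates the pairs of values; it assumes the usual 0 to 127 controller range and a simple linear ramp, and the function name is hypothetical. How those values translate into perceived loudness depends on the synthesizer, so a linear ramp is a starting point rather than a recommendation.

```python
def crossfade_steps(steps, max_value=127):
    """Return (outgoing, incoming) volume controller values for a linear fade.

    steps is the number of intervals; the result has steps + 1 pairs.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    pairs = []
    for i in range(steps + 1):
        incoming = round(max_value * i / steps)
        pairs.append((max_value - incoming, incoming))
    return pairs
```

Each pair would be applied to the two Segments at the same moment, the first value to the Segment fading out and the second to the Segment fading in.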
Pause/Resume "Pause" is another feature not implicitly built into DirectMusic. However, a programmer can track the time when a Segment starts, as well as its offset when he stops that Segment. Using this information (and keeping in mind that the Segment may have been looped), the programmer can typically restart a Segment from the same location to create "resume" behavior. However, this only works for Segments that use music time. Note that content that includes numerous tempo changes may be less accurate in restart position than content with a single tempo. While this works for waves, MIDI and DLS limitations do not allow this for sequenced note data. The MIDI standard for note information consists of a "note on" and corresponding "note off" event. Starting playback between the two events will not trigger the note. Even if it did, remember that DLS Instruments cannot be started from an offset; they can only be played from the beginning. This won't really be an issue for short notes in a dense musical passage. However, if you have long sustained notes (for instance, a string pad), keep this in mind if you want to be able to pause/resume your music. In that particular case, you might want pause/resume functionality to restart the piece of music at a position before the pad began or even the beginning of the Segment.
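The bookkeeping behind "resume" can be sketched as a small calculation. The Python function below is a simplified model, not DirectMusic code: it assumes the Segment started at offset 0, that it repeats a single loop region indefinitely, and that all times share one unit (for example, milliseconds at a single tempo). The function name and parameters are hypothetical.

```python
def resume_offset(start_time, stop_time, loop_start, loop_end):
    """Offset into a Segment at the moment it was stopped.

    Assumes the Segment played from offset 0 and then repeated the region
    [loop_start, loop_end) indefinitely. All values share one time unit.
    """
    if not 0 <= loop_start < loop_end:
        raise ValueError("need 0 <= loop_start < loop_end")
    elapsed = stop_time - start_time
    if elapsed < 0:
        raise ValueError("stop_time precedes start_time")
    if elapsed < loop_end:
        return elapsed  # still in the first pass
    return loop_start + (elapsed - loop_end) % (loop_end - loop_start)
```

For content with long sustained notes, the returned offset could then be moved back to a point before the note began, as suggested above.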
Memory Footprint Memory is typically the most precious resource for games, whether on the PC or a console. The amount of memory taken up by the resources of a program or a particular component of a program is called the footprint. Making use of streaming wave data can help keep the memory footprint small; only a small amount of any playing wave is actually in memory at a time. DirectMusic supports wave streaming within Wave Tracks, with content-specified readahead settings that determine how much of the wave is in memory. However, the DLS format does not support streaming, so any DLS Collection instruments play entirely from
memory. There are several optimizations available that we will discuss in Chapter 2 when we cover Bands (a DirectMusic file that stores DLS Instruments), but keep in mind that DLS Instruments will occupy a footprint as long as they are downloaded. Again, this isn't a big deal if you are creating music for stand-alone listening. You'll have the free range of the system memory to work with for samples (normally 128MB of memory — more memory than any pro hardware synthesizer on the 2003 market) on top of all the streaming that the audience's PC can handle. But in games, you are very limited in the amount of memory you can use. Just keep that in mind. Note for programmers: Just using DirectMusic incurs a memory footprint, albeit a small one. The size does depend on which DirectMusic functions you choose to use in your application. For instance, if you use DirectX Audio Scripting with the Visual Basic Scripting Edition language (VBScript), that language requires several libraries of code to be loaded to the tune of almost 1MB of DLLs (dynamically linked libraries). Granted, other aspects of your program might depend on the same libraries, but to help keep the memory footprint more manageable, DirectMusic offers a tiny (~76KB) optimized DirectX Audio Scripting language (called audiovbscript) as an alternative to the fully featured VBScript.
DirectMusic and MIDI
We've already alluded several times to DirectMusic's support for the MIDI format. This can be a strength or a weakness. Interoperability with this standard sequencing format allows for easy authoring from more sequencing-focused tools in the composer's palette (like Sonar or Cubase SX). But using DirectMusic for sound effects can work against this support: a sound effect is often meant to be played once or looped indefinitely, so the concept of a finite "note on" and "duration" is somewhat foreign in this instance. As we've discussed already, the use of clock-time Segments can mitigate this somewhat for sound effects; they are given a fixed duration in milliseconds that is independent of tempo.
While DirectMusic supports MIDI, it has several significant aspects that allow it to overcome some of the basic limitations of MIDI devices. For instance, typical devices are limited to 16 MIDI channels. The Microsoft Software Synthesizer, the synthesis engine used by DirectMusic, allows you to reference up to 32 million discrete channels. Using DirectMusic Producer, you will be able to use 1000 of these performance channels, or pchannels, per Segment.
DirectMusic also adds some basic optimizations to MIDI to allow for easier manipulation of controller data. Traditionally, sweeping a controller from one value to another consisted of many rapid MIDI controller events. As a file format (and authoring convenience) optimization, DirectMusic allows the audio producer to instead use continuous controller curves. The start time, end time, initial value, end value, and curve type can all be specified. When the content plays back, the intermediate values are dynamically generated to remain compatible with traditional MIDI.
A common question raised when a game is layering 20 pieces of music is whether the composer needs to worry about authoring each piece of audio onto its own set of pchannels.
With traditional MIDI, you cannot play two separate Instruments on the same channel at the same time; the patch change from one overrides the other. Similarly, if two Tracks on the same channel use the same MIDI controllers, they will override each other. Because DirectMusic allows the composer to play multiple pieces of content simultaneously, DirectMusic provides solutions to these basic MIDI limitations.
From the point of view of MIDI, DirectMusic AudioPaths effectively act as unique synthesizers with their own unique pchannels. An AudioPath is pretty much what it sounds like: a route that audio data follows through hardware or software before it is output. PC AudioPaths can include DirectX Media Objects (DMOs), which are software audio processing effects. AudioPaths also define the behavior of content being played onto them, such as whether they can be positioned in 3D space (kind of like surround sound but different; more on this later), whether they should be played as stereo or mono, and so on.
As far as MIDI is concerned, each AudioPath gets its own unique set of "physical" pchannels (granted, this is all in software, so we're not talking about tangible channels). For instance, if two sequences are authored, both using pchannels 1 through 16, playing them on separate AudioPaths will keep the two from interacting. If one Segment on AudioPath A has a piano on pchannel 1 and another Segment on AudioPath B has a French horn on pchannel 1, they will both play happily. If we tried to play these two Segments onto a single AudioPath, one of the patch changes would win out, and one Segment would be playing the wrong instruments.
That said, sometimes playing multiple pieces of content onto the same AudioPath is desirable. For instance, a Segment could be authored simply to alter the mix of music played by another Segment (using volume controller curves).
Alternatively, in the example of 3D audio above, we probably would want to play everything that will emanate from this single position onto the same AudioPath.
Therefore, we have a solution for multiple simultaneous pieces of music. But what about MIDI controller curve interaction? Of course, if the pchannels are kept unique (either at authoring time or by playing the content onto separate AudioPaths), the MIDI controllers won't conflict. But what about the above example of a Segment where we just want to alter the mix? If the Segment it's altering has its own volume controller curves, the two will conflict, and we might get the volume jumping rapidly between the two. The classic solution is to use volume curves in one Segment and expression curves in the other Segment. This is a common approach in sequenced music, as both affect perceived volume and apply additively. This way the audio producer sets Volume (MIDI Continuous Controller #7) in the primary piece of music, and uses Expression (MIDI Continuous Controller #11) to reduce this volume per individual performance channel in the second, layered piece of music. DirectMusic provides a more generalized solution for this problem — not only for volume but also for pitch bend, chorus send, reverb send, and numerous other MIDI continuous controllers. The audio producer specifies an additive curve ID for each MIDI continuous controller curve created. Curves that have the same ID override each other, as with traditional MIDI. Curves that have different IDs are applied additively (or more accurately for the case of volume, the process becomes subtractive).
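To see why the classic Volume-plus-Expression approach combines cleanly, the sketch below applies the decibel mapping commonly recommended for General MIDI and DLS synthesizers, in which each controller contributes 40 times the base-10 logarithm of its value divided by 127. Whether the DirectMusic synthesizer follows this mapping exactly is an assumption here, and the function name is hypothetical.

```python
import math


def combined_gain_db(volume_cc7, expression_cc11):
    """Gain in decibels for a Volume (CC#7) and Expression (CC#11) pair,
    using the widely recommended 40 * log10(value / 127) mapping for each
    controller. A value of 0 on either controller means silence."""
    for value in (volume_cc7, expression_cc11):
        if not 0 <= value <= 127:
            raise ValueError("MIDI controller values must be in 0..127")
    if volume_cc7 == 0 or expression_cc11 == 0:
        return float("-inf")
    return 40.0 * math.log10((volume_cc7 * expression_cc11) / (127.0 * 127.0))
```

Because the two controller values multiply inside the logarithm, their decibel contributions simply add: lowering Expression in the layered Segment attenuates whatever level Volume established in the primary one, which matches the "subtractive" behavior noted above.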
Chapter 2: Linear Playback Overview
Scott Selfon
Okay, let's get cracking. The first thing we want to do is be able to play sound. We add variability, dynamism, and the rest later. Let's focus on playing back a wave file.
Note
At this point, if you have not already done so, you must install DirectX 9.0 as well as DirectMusic Producer, both of which are available on the companion CD.
Waves in Wave Tracks
To play back a wave file in DirectMusic, you'll create a Segment out of it. As already mentioned, the DirectMusic Segment file (.sgt) is the basic unit of DirectMusic production. Segments are built from one or more of DirectMusic's various track types, which can make sound (via stand-alone waves and/or DLS instrument triggering) or modify performances (tempo, chord progression, intensity level, etc.). You can create a Segment out of a wave file in DirectMusic Producer. Run DirectMusic Producer and go to File>Import File into Project>Wave File as Segment…. Open a wave file into the program. Building a Segment from a wave file, we get our first look at one of the basic track types in DirectMusic: the Wave Track. Beyond the 32 variation buttons (which we cover in Chapter 3), Wave Tracks play along the sequencer timeline with other Wave Tracks and also with MIDI sequences.
If you look at the size of the Segment file, it is much smaller than the wave file. Here is an important early lesson in content delivery: a Segment knows where to find the wave and sample files that it needs to play but does not store those files as part of itself. There are several reasons to do this. If several Segments use the same wave data, you do not have to worry about having two copies of it in memory. In addition, if you later want to go back and edit that wave data, you do not have to worry about copying it into several different places. That said, there are various reasons that you might instead want to embed the wave data within the Segment itself, such as the convenience of only having to deliver a single file, or file load time considerations.
Files with the extension ".**p" are design-time files. Design-time files are used in DirectMusic Producer for editing purposes and contain information not necessary for run-time use. For instance, a Segment used in a game does not need to include information on what size and position to open editing windows.
In addition, design-time files always reference the content they use, even if you specify that content should be embedded. For these reasons, when content is meant to be integrated into a game or a special player, you should save the Segment as a run-time file (either via the per-file right-click menu Runtime Save As… option or the global Runtime Save All Files option from the File menu). When Segments are runtime saved, you will see the .sgt extension, and wave files will similarly have the more expected .wav extension. So to summarize, be sure to save your Segments as design-time (.sgp) files while you are working on them and as run-time files (.sgt) when they are finished and ready for distribution. For more details on content delivery, there is a white paper available on the Microsoft Developer Network web site (msdn.microsoft.com) called "Delivering the Goods: Microsoft DirectMusic File Management Tips."
Streaming Versus In-Memory Playback A minute-long 44.1 kHz 16-bit stereo wave file is 10MB. 10MB isn't a big deal when you run a stand-alone DirectMusic file on a contemporary computer, but it is a very large file in the world of game audio. Do not forget that the rest of a game's resources need to reside in memory as well. If the game is written to run on a system with as little as 64MB of RAM, you are already in way over your head. Console games are even more unforgiving. Consider yourself lucky if you get 4MB for your entire sound budget! You can use audio streaming to alleviate these restrictions. Streaming is a technique that works a lot like a cassette player. In a cassette player, audio data is moved across the play head a little bit at a time. The play head reads the data as it comes and plays the appropriate sound. With streaming, a small area in memory called a buffer is created. Wave files are moved, bit by bit, through the buffer and read by DirectMusic. The CD player in your PC uses streaming for playback. If it weren't for streaming, you'd have to load the entire music file into your computer's RAM, which in most cases simply isn't an option. DirectMusic uses the following rule for streaming: The wave streams with 500 msec readahead if it is more than five seconds long. If it is shorter than five seconds, it resides and plays from memory (RAM). Readahead determines the size of the buffer. For 500 msec, our 44.1 kHz 16-bit stereo wave file will use 88.2KB of memory (.5 sec x 44100 samples/sec x 2 bytes/sample x 2 channels), a big difference when compared to 10MB! Memory usage is reduced by a factor of more than 100! You can override this behavior, choosing to load all wave data to memory, or specify a different readahead value in the Compression/Streaming tab of a wave's property page in DirectMusic Producer. To get to any Track's property page, simply right-click on it in the Track window.
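The arithmetic above is easy to check. The helper names below are hypothetical, and the five-second threshold simply restates the default rule described in the text; this is a quick sanity check, not how DirectMusic itself sizes its buffers.

```python
def pcm_bytes(seconds, sample_rate=44100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio of the given length."""
    return round(seconds * sample_rate * (bits_per_sample // 8) * channels)


def streams_by_default(wave_seconds, threshold_seconds=5.0):
    """Default rule from the text: waves longer than five seconds stream."""
    return wave_seconds > threshold_seconds


if __name__ == "__main__":
    whole_wave = pcm_bytes(60)      # the minute-long stereo wave
    readahead = pcm_bytes(0.5)      # a 500 msec readahead buffer
    print(whole_wave, readahead, whole_wave / readahead)
```

Doubling the readahead doubles the buffer, so a longer readahead trades memory for more protection against slow disk reads.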
Tracks and Parts and Layers, Oh My
Before we move any further, the concept of Tracks and parts can use a bit of sorting out. When we created our Segment from a wave file, it consisted of a single Track, a Wave Track, with a single part (by default on pchannel 1). Each part specifies its own variation behavior, what pchannel it is played on, and, in the case of Wave Tracks, volume for that part as well. By comparison, Tracks specify big picture (or "global") behaviors such as clocktime, as well as more exotic settings like multiple, concurrent Track groups. If you open the property page for a Track or part, you'll notice the separate tabs with their own settings; the Track properties actually include both the Track Group Assignments and Flags tabs in addition to the actual Track properties tab.
Remember that waves in Wave Track parts never conflict with DLS Collections on a pchannel; you can play as many waves as you want simultaneously and still play a DLS Collection instrument on the same pchannel without any problem. The reason a Wave Track part can be assigned to a pchannel is that MIDI controllers can still be used to drive wave playback (for instance, pan on a mono wave, pitch bend, volume, etc.). The pchannel that a Wave Track is assigned to can be altered in the wave part's property page.
This brings us to the concept of layers, or the various lettered rows (a, b, c, etc.) that you see displayed in DirectMusic Producer for a single Wave Track part. As mentioned, you can play as many waves at one time as you wish. Layers are therefore purely an aid for placing waves against the Segment timeline, so that many waves do not overlap on a tiny area of the screen where they would be hard to edit (or where you could not tell which wave started and finished when). Waves can be placed on different layers within a part for easier legibility. All layers are equal, are played simultaneously, and do not take any additional resources (processor power or memory) when played.
Figure 2-1: Multiple waves on Wave Tracks are split into "layers" so you can see their beginnings and endings more easily. As a last bit of terminology for now, certain parts are actually subdivided even further into strips. In particular, the parts for MIDI-supporting Tracks (Pattern Tracks and Sequence Tracks) have separate strips for note information and continuous controller information. Pattern Tracks also have an additional variation switch point strip, which we cover in Chapter 3 when we discuss variations.
MIDI and DLS Collections
Creating a DirectMusic Segment from a piece of MIDI music is simple. In DirectMusic Producer, follow File>Import File into Project>MIDI File as Segment…. This creates a DirectMusic Segment from a MIDI file, importing note, MIDI controller, and other pertinent data along the way. Let's examine some new Track types related to MIDI:
§ Tempo Track: Specifies the current tempo for the piece of music. You can override this by playing a primary or controlling secondary Segment with its own tempo settings.
§ Time Signature Track: Sets the time signature for the piece. Use this to track where measures fall as well as how many subdivisions (grids) to give each beat.
§ Chord Track: Specifies the key as well as specific chord progressions for a piece of music. Typically, for an imported MIDI file, this will just indicate a C major chord. DirectMusic Producer does not attempt to analyze the imported MIDI file for chordal information.
§ Sequence Track: A sequence is what its name implies. Sequence Tracks house MIDI sequences. This is where the majority of MIDI information is imported. Notice that as with Wave Tracks, Sequence Tracks can (and typically do) consist of multiple parts. By default, each MIDI channel is brought in as a separate part on the corresponding pchannel. In addition, each part can contain its own continuous controller information. Unlike Pattern Tracks (more on these in Chapter 3), Sequence Tracks are linear, nonvariable sequences; they play the same every time, and they do not respond to chord changes (though they do respond to tempo changes).
Figure 2-2: A Segment with a Sequence Track consisting of multiple instruments.
§ Band Track: Bands are how Performance channels refer to DLS instrument collections. The Band Track is an area where initial pan, volume, and patch change settings for Tracks are stored. This is often an area of confusion, as you can also have volume and pan controller data within Sequence Track (or Pattern Track) parts. The Band Track typically just has a single Band with the initial settings. Subsequent volume changes are typically created as continuous MIDI controller events in Sequence or Pattern Tracks. Continuous controller events, unlike Band settings, allow you to easily sweep values over time (for example, to fade a track in or out). If any patch changes occur during performance, another Band can be placed in the Band Track, though the elimination of MIDI channel limitations means there is often little reason not to just place each instrument on its own unique and unchanging channel.
Building DLS Collections
Our Segment is now ready for playback. Of course, we're assuming that the piece of music will be played using a preauthored DLS Collection, such as the gm.dls collection that comes with Windows machines. Otherwise, it will be trying to play instruments that don't exist on the end user's machine, and these instruments would be played silently. If we wanted to use our own instruments, we would want to build one or more DLS Collections. DirectMusic Producer provides this authoring ability in DLS Designer. Alternatively, the DLS-2 format is a widespread standard, and there are several tools out there for converting from other common wavetable synthesizer formats to the DLS format. Creating DLS Collections is often one of the more challenging tasks when you decide to create real-time rendered (versus streamed prerendered) audio. Remember that unlike streamed audio, your DLS Collection will occupy system memory (RAM), which is typically one of the most precious commodities for an application. For this reason, you'll want to create collections that get you the most bang for your buck in terms of memory versus instrument range and quality. Let's create our first DLS Collection. From the File menu, select New, and from the dialog that appears, choose DLS Collection and hit OK. We're presented with a project tree entry for our DLS Collection (using the design-time .dlp file extension), which has two subfolders, Instruments and Waves.
Figure 2-3: An empty DLS Collection in the project tree. DLS Collections are composed of waves (effectively, ordinary wave files) and instruments, which consist of instructions for how waves should map to MIDI notes, along with a fairly sophisticated set of audio-processing features. Let's add a few waves for our first instrument, a piano. We can drag our waves right into the Waves folder from a Windows Explorer window or right-click on the Waves folder and choose Insert Wave… from the menu that appears.
Figure 2-4: We've inserted four wave files into our DLS Collection. We can now adjust the properties for these waves by right-clicking on them and choosing Properties to bring up their property window. The most important settings to note are going to be Root Note, loop points (both in the Wave tab), and compression settings (in the Compression/Streaming tab). We'll return to compression a bit later. Root Note specifies the base note that this wave corresponds to, also known in other wave authoring programs as the "unity pitch" or "source note." For our above piano sounds, determining the root note was
made easier by including the root note information right in the wave file names. DirectMusic Producer will also automatically use any root note information stored with the wave file, which some wave authoring tools will provide. Otherwise, we can adjust the root note manually from the property page, using either the scroll arrows or by playing the appropriate note on a MIDI keyboard.
Figure 2-5: Setting the proper root note for our BritePiano_C4 wave. This particular piano wave doesn't loop, so we don't need to worry about the loop settings. Again, if loop settings had been set on the source wave file, DirectMusic Producer would automatically use that information to set loop points. DirectMusic Producer provides some basic editing features on waves in the Wave Editor, which can be opened by double-clicking any wave in the Waves folder (or indeed, any separate wave file you've inserted into your DirectMusic project).
Figure 2-6: The Wave Editor. You can specify whether waveform selections (made by clicking and dragging) should snap to zero crossings via Snap To Zero, and you can specify a loop point by selecting Set Loop From Selection. The Wave Editor window supports clipboard functions, so you can copy to and paste from other wave editing applications if you wish. You can also try out any loop points you've set (or just hear the wave you've created) by playing the wave. As with Segment playback, you can use the Play button from the Transport Controls toolbar, or use the Spacebar as a shortcut to audition your wave. Note that you can only start playing your waves from the beginning, regardless of where you position the play cursor, as DLS waves cannot be started from an offset.
Now that we've set up our waves, let's add them to a DLS instrument so we can play them via MIDI notes. To create an instrument, right-click on the Instruments folder and choose Insert Instrument.
Figure 2-7: Creating an instrument.
You'll notice that each instrument is assigned its own instrument ID, a unique combination of three values (MSB, LSB, and Patch) that allows for more than 2 million instruments, which is a bit more freeing than the traditional 128 MIDI patch changes. DirectMusic Producer will make sure that all loaded DLS Collection instruments use unique instrument IDs, but you should take care if you author your DLS Collections in different projects to make sure that they don't conflict. Otherwise, if two instruments with identical IDs are loaded at the same time, DirectMusic will have no way of knowing which one you want to use. The General MIDI DLS Collection (gm.dls) that is found on all Windows machines uses 0,0,0 through 0,0,127, so our new instrument probably defaulted to instrument ID (0,1,0). Let's open up the Instrument editor by double-clicking on the instrument in the project tree.
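The "more than 2 million" figure follows from three 7-bit values, and a check for clashing IDs across separately authored collections takes only a few lines. The sketch below is illustrative only: the collection names and ID lists are made up, and the helper functions are hypothetical rather than anything DirectMusic Producer provides.

```python
from collections import defaultdict


def instrument_id_count(bits_per_field=7):
    """Number of distinct (MSB, LSB, Patch) combinations."""
    return (2 ** bits_per_field) ** 3


def find_id_conflicts(collections):
    """Map each instrument ID used by more than one collection to the
    names of the collections that use it.

    collections: dict of collection name -> iterable of (msb, lsb, patch).
    """
    owners = defaultdict(list)
    for name, ids in collections.items():
        for instrument_id in ids:
            owners[instrument_id].append(name)
    return {iid: names for iid, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    print(instrument_id_count())
    # Hypothetical collections authored in two different projects.
    print(find_id_conflicts({
        "piano.dls": [(0, 1, 0), (0, 1, 1)],
        "strings.dls": [(0, 1, 1), (0, 2, 0)],
    }))
```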
Figure 2-8: The Instrument editor. There are lots of options here that really demonstrate the power of a DLS-2 synthesizer, but for now let's just start with our basic instrument. The area above the piano keyboard graphic is where we define which wave or waves should be played for a given note. Each wave assigned to a range of notes is called a DLS region. As you can see, our new instrument defaults to a single region, which plays the wave BritePiano_C5, over the entire range of
MIDI notes. The piano keyboard note with a black dot on it (C5) indicates the root note for the region. Remember that the root note will play the wave as it was authored, notes above the root note will pitch the wave up, and notes below the root note will pitch the wave down.
Looking along the left edge of our region, you can see the labels DLS1, 1, 2, and so on. These labels identify the various layers of this instrument. Layers allow you to create overlapping regions, one of the nice features of the DLS-2 specification. One of the more common uses is for adding multiple velocity versions of a wave, where, as the note is struck harder, the instrument will switch to a different wave file. This is particularly effective for percussion instruments. Notice that each region has a velocity range that can be set in the Region frame. Multiple layers also mean that a single MIDI note could trigger a much more complex instrument composed of several simultaneously played wave files. Remember that regions on a single layer cannot overlap. The DLS1 layer is the single layer that was supported in the original DLS-1 specification, and is generally still used to author any single-layer DLS instruments. For this simple instrument, we won't worry about using multiple layers for the moment.
Our first step is to create regions for the rest of our piano waves. We want to resize that existing region so there is room on the layer for the other waves (that, and we probably don't want to try to play this wave over the entire range of the MIDI keyboard!). If you move the mouse over either edge of the region, you'll see that it turns into a "resize" icon, and you can then drag in the outer boundaries of the region.
Figure 2-9: Resizing a region with the mouse. Alternatively, we could change the Note Range input boxes in the Region frame. By the way, this is similar to the functionality for note resizing (changing start time and duration) in the MIDI-style Tracks (Pattern Tracks and Sequence Tracks). There are other similarities between region editing and note editing — the selected region turns red, just as notes do. The layer we are currently working with turns yellow, just as the current Track does in the Segment Designer window. And just as you can insert notes by holding the Ctrl key while clicking, you can create a new region in the same way.
You can resize the region as long as the mouse is held down and dragged (or, as before, resize the region by grabbing a side edge of it). As an alternative to drawing a region with the mouse, you can right-click and choose Insert Region from the menu that appears. Every region that we create defaults to use the same wave that the first region did. So we'll want to choose a more appropriate wave from the drop-down aptly labeled Wave. Let's assign BritePiano_E5 to this new region we've created.
We repeat the process for the rest of our waves to build up our instrument. Notice as you click on each region that you can view its particular properties for wave, range, root note, etc., in the Region frame.
Figure 2-10: The completed regions for our piano instrument.
There are several potential issues to discuss here. First, should all instruments span the entire keyboard? On the plus side, the instrument would make a sound no matter what note was played. On the minus side, when the source wave is extremely repitched (as BritePiano_C4 is in the downward direction and BritePiano_E5 is in the upward direction), it can become almost unrecognizable, dramatically changing timbre and quality. Most composers opt not to span the entire keyboard, if only to know whether their music ever goes significantly out of an acceptable range for their instruments. If it does, the music can be adjusted (or the DLS instrument expanded to include additional regions).
A second question is how far above and/or below the root note a region should span. That is, does the wave maintain the quality of the source instrument more as it is pitched up or down? This can vary significantly from wave to wave (and the aesthetic opinions of composers also differ quite a bit). For the above example, we did a bit of both: the bottom three regions cover notes both above and below the root note, while the highest piano region (our BritePiano_E5 wave) extends upward from the root note. How far you can "stretch" a note can vary quite significantly based on the kind of instrument and the source waves.
Once you've created your instrument, you can try auditioning it in several ways. An attached MIDI keyboard will allow you to trigger notes (and indeed, small red dots will pop up on the keyboard graphic as notes are played). You can also click on the "notes" of the keyboard graphic to trigger the note. The Audition Options menu allows you to set the velocity of this mouse click and choose whether to audition all layers or only the selected one. (In the case of our example, we only have one layer, so the two options don't matter.)
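The way a region maps a MIDI note to a repitched wave can be modeled in a few lines. The sketch below is only an illustration: the note ranges and the MIDI note numbers chosen for the root notes are made up (the numbering assumes C4 is note 60, which may not match DirectMusic Producer's octave labels), and the ratio uses standard equal-tempered repitching, which is an assumption about how far each wave is stretched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    wave: str
    low: int    # lowest MIDI note the region covers
    high: int   # highest MIDI note the region covers
    root: int   # note at which the wave plays back unshifted


# Hypothetical single-layer layout for the four piano waves.
PIANO_LAYER = [
    Region("BritePiano_C4", 48, 63, 60),
    Region("BritePiano_G4", 64, 69, 67),
    Region("BritePiano_C5", 70, 75, 72),
    Region("BritePiano_E5", 76, 96, 76),  # extends upward from its root
]


def check_no_overlap(regions):
    """Regions on a single layer cannot overlap; raise if any do."""
    ordered = sorted(regions, key=lambda r: r.low)
    for left, right in zip(ordered, ordered[1:]):
        if right.low <= left.high:
            raise ValueError(f"{left.wave} overlaps {right.wave}")


def find_region(regions, note):
    """Return the region covering the note, or None if no region does."""
    for region in regions:
        if region.low <= note <= region.high:
            return region
    return None


def pitch_ratio(note, root):
    """Playback-rate ratio needed to repitch a wave from root to note."""
    return 2.0 ** ((note - root) / 12.0)
```

A note outside every region returns None here, which corresponds to the instrument staying silent when the music strays beyond its authored range.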
Try out the piano instrument, paying particular attention to the transitions between regions. These are often difficult to match smoothly, as the pitch shifting of the waves begins to alter their characteristics and the instruments can begin to develop musical "breaks" where the tone quality changes significantly. If you are writing a purely tonal piece of music, it is often useful to author your DLS Collections such that the tonic and dominant tones (the scale root and the 5th degree) are the root notes of your regions. That way, the "best" version of each
wave (not repitched) will be played for what are typically the most commonly heard tones in the music. If you do need to tweak the source waves somewhat, you can either reauthor them and paste over the previous version in the wave bank, or you can make manual edits to how the regions play particular waves. The latter is useful when a wave is just slightly mispitched when played against another wave or when the levels of two recorded waves differ slightly. To override the wave's properties for this region, open the region's property window (by right-clicking and choosing Properties or, as with other property windows, by clicking on the region when the property window is already open).
Figure 2-11: A DLS region's property window. Here we've slightly adjusted the fine-tuning for BritePiano_G4 only as this region plays it back. If any other regions or instruments used the wave, their pitch would be unaffected by this change.
Note
One common question is whether using overlapping regions on multiple layers could be used to more smoothly crossfade between waves. Unfortunately, all regions are triggered at the same velocity, so there is no easy way to make one region play more quietly than another for a given MIDI note. With some effort, you could potentially create several smaller regions over the transition that attenuate the source waves, gradually bringing their volume up (by overriding attenuation in the region's property window) to complete the crossfade.
And that's enough to create our basic instrument. If we wanted to, we could adjust any of a number of additional properties on the instrument's articulation, which is where much of the power of a DLS-2 synthesizer lies. The articulation allows you to specify volume and pitch envelopes, control a pair of low-frequency oscillators (LFOs), and set up low-pass filtering parameters for the instrument. You can even create per-region articulations, where a region has its own behavior that differs from other regions of the instrument. We'll just set up the instrument-wide articulation for this piano. Since our original sample already has most of the volume aspects of the envelope built in, we'll just add a release envelope. This means that when a MIDI note ends, the instrument will fade out over a period of time rather than cutting off immediately, much like an actual piano. By dragging the blue envelope point to the left in the Instrument Articulation frame, we have set the release to be .450 seconds, so this instrument's regions will fade out over roughly a half-second when a MIDI note ends.
Figure 2-12: To add any per-region articulations (presumably different from the global instrument articulation), right-click on the region and choose Insert Articulation List. Our basic piano instrument is now complete. We could open up the instrument's property window (as always, by right-clicking on it and choosing Properties) and give it a better name than Instrument, such as MyPiano. We then repeat the process for other instruments in our collection. Again, remember that DLS Collections will be loaded into memory when a Segment is using them, so you'll want to budget their size according to available memory on the system on which they will be played.
Stereo Waves in DLS Collections One unfortunate omission from the DLS-2 specification is that stereo waves are not supported for regions. DirectMusic Producer works around this by using DLS-2's aforementioned support for multiple layers. When a stereo wave is used in a DLS region, DirectMusic Producer creates a hidden second region, separates the two source channels, and plays the resulting pair of mono channels in sync on the two regions.
Figure 2-13: The top single stereo region is actually stored by DirectMusic Producer as something closer to the bottom pair of mono regions. All of this is transparent to the composer (and to any content that uses the DLS instrument): they can use stereo waves the same as mono waves without having to do anything differently. But it does have several impacts on content authoring. The primary implication of DLS-2 not supporting stereo waves is that stereo waves cannot be compressed in DirectMusic Producer. Otherwise, DirectMusic would have to somehow figure out how to separate the left and the right channels to place them in separate mono regions. If you do intend to use stereo waves and they must be compressed, you must author them as pairs of mono waves, create regions on two layers, and then set those regions' multichannel properties to specify the channels and phase locking of the waves (so they are guaranteed to always play in sync).
Figure 2-14: Making our piano's right channel wave play in sync with the left channel and on the right speaker. Both channels would be set to the same Phase Group ID, and one of them should have the Master check box checked.
Using DLS Collections in Segments Now that we've assembled our DLS Collection, our Segment needs to use that collection's patch changes in order to play its instruments. Remember that patch change information is stored in a Band in the Band Track. If we created a MIDI-based Segment from scratch, we would insert a Band Track (right-click in the Segment Designer window and choose Add Tracks…), and then insert a Band into the Track (by clicking in the first measure and hitting the Insert key). However, let's assume for our very first piece of music that we just imported a MIDI file (from the File menu, choose Import File into Project…, then MIDI File as Segment…). In this case, we already have a Band created for us containing the patch changes from the original MIDI file. We now want to edit this Band to use our new DLS instruments (rather than the General MIDI ones, or worse, instruments that don't exist).
Figure 2-15: Double-click the Band in the Band Track to open the Band Editor.
Figure 2-16: The Band Editor window. Remember that in addition to patch change information, you can set up the initial volume and pan for each channel. The grid at the right allows you to do this graphically (vertical corresponding to volume, horizontal corresponding to left-right pan) by grabbing a channel's square and dragging it. Alternatively, you can set up each Performance channel by double-clicking on it (or right-clicking and choosing Properties). The properties window is where you set the instrument properties for the channel.
Figure 2-17: The Band Properties dialog box. An instrument's priority allows DirectMusic to know which voices are the most important to keep playing in the event you run out of voices. Volume and Pan are once again the same functionality seen in the Band Editor's right grid. Transpose lets you transpose all notes played onto the channel ("Oct" is for octaves and "Int" is for intervals within an octave). PB Range lets you control how far the wave will be bent by Pitch Bend MIDI controllers. Range is a somewhat interesting option. It instructs DirectMusic that we will only play notes within a certain range, and therefore that we only need the regions from this instrument that are in that range. While this can cut down on the number of DLS instrument regions that are in memory (and thus possibly the size of wave data in memory), it does mean that notes played outside of this range will fall silent. Because transposition, chord usage, or other interactive music features might cause our content to play in a wider-than-authored range, Range is generally not used (and thus left unchecked). Getting back to the task at hand, this channel is currently using General MIDI patch 89 (remember that the other two aspects of an instrument ID, the MSB and LSB in the DLS Collection, are typically zero for General MIDI collections). We want it to instead play our new instrument, so we click on the button displaying the name of the current instrument, choose Other DLS… from the drop-down that appears (the other options are all of the instruments from gm.dls), and select our DLS Collection and instrument to use.
Figure 2-18: The Choose DLS Instrument dialog box. If we now play our Segment, Performance channel one will use our newly created piano instrument.
Authoring Flexible Content for Dynamic Playback Playing back long MIDI sequences isn't going to be particularly interactive, so we might want to take this opportunity to consider alternative solutions to strictly linear scores. For instance, you might want to "chop up" your score into smaller pieces of music that can smoothly flow into each other. Authoring music in this kind of segmented (no pun intended) manner can be as simple as creating a bunch of short looping clips, each of which follows the same progression, perhaps deviating in terms of instrumentation and ornamentation. Alternatively, the various sections could be much more distinct, and you could author transitional pieces of audio content that help "bridge" different areas of the music. In every case, you need to consider the balance between truly "interactive" audio (that is, music that can jump to a different emotional state quickly) and musical concepts such as form, melodic line, and so on. That is, how can a piece of music quickly and constantly switch emotional levels and still sound musically convincing? Finding a balance here is perhaps one of the most significant challenges in creating dynamic audio content, and this challenge is addressed by specific composers in Unit III.
Transitions Now that we've got basic linear pieces of music, let's discuss moving from one piece of linear music to another. How do we move from one piece of music (the "source") to the next (the "destination")? DirectMusic provides an abundance of choices in how such transitions occur. You can specify a Segment's default behavior in its property page (right-click on the Segment) on the Boundary tab. On the positive side, using these settings means you don't need to tell a programmer how your musical components should transition between each other. On the negative side, this locks a Segment into one specific transition behavior, which is often not desired; it's quite common to use different transition types based on both the Segment that we're coming from and the Segment that we're going to. Overriding can be a chore, particularly when using DirectX Audio Scripting. More often than not, audio producers end up having to communicate their ideas for transitions directly to the project's programmer for implementation. This is of course if you're working on a game. If you are creating standalone music for a DirectMusic player, stick to what you've got in terms of transition types! Looking over the boundary options, there are quite a few that are fairly straightforward — Immediate, Grid (regarding the tempo grid; remember that the size of a grid can be specified in the Time Signature Track), Beat, and Measure. End of Segment and End of Segment Queue are also fairly easily understood — the former starting this Segment when the previous one ends and the latter starting this Segment after any other Segments that were queued up to play when the current Segment ended. Marker transitions introduce us to another Track type that we can add to an existing Segment, aptly named the Marker Track. 
After adding a Marker Track to a Segment, an audio producer can add markers (also known as exit switch points) wherever it is deemed appropriate in the Segment, which will postpone any requested transition until a specific "legal" point is reached. Just to clarify, the transition is specified on the Segment that we're going to, even though the boundary used will be based on the currently playing Segment that we're coming from. Transitions do not have to occur directly from one piece of music to another. The content author can create specific (typically short) pieces of music that make the transition smoother, often "ramping down" or "ramping up" the emotional intensity, altering the tempo, or blending elements of the two Segments. A programmer or scripter can then specify that a transition Segment be used in conjunction with any of the boundaries discussed above. As a more advanced behavior, when using Style-based playback, the programmer or scripter can specify particular patterns from the source and/or destination Segments (for instance, End and Intro, which plays an ending pattern from the source Segment's Style and an intro pattern from the destination Segment's Style) to similarly make the transition more convincing. We cover these embellishment patterns in more depth when we get into Style-based playback in Chapter 4. By default, transitions go from the next legal boundary in the source Segment to the beginning of the destination Segment. There are, of course, cases where we might want the destination Segment to start from a position other than the beginning. Most commonly, we could intend for the destination Segment to start playing from the same relative position, picking up at the same measure and beat as the source Segment that it is replacing. For this kind of transition, alignment can be used, again in conjunction with the various boundaries and transition Segment options above. Align to Segment does exactly what we outlined above.
As an example, if the source Segment is at bar ten beat two, the destination Segment will start playing at its own bar ten beat two. Align to Barline looks only at the beat in the source Segment and can be used to help maintain downbeat relationships. Using the same example as above, the destination Segment would start in its first bar but at beat two to align to barline with the source Segment. As another technique for starting a Segment at a position other than its beginning, enter switch points (specified in the "lower half" of Marker Tracks) can be used. In this case, whenever a transition is intended to occur, the destination Segment will actually force the source Segment to keep playing until an appropriate enter switch point is reached in the destination Segment. Alignment generally provides sufficiently satisfactory transitions with less confusion, so enter switch points are rarely used. Remember in both cases that the limitations outlined earlier for pause/resume apply here; starting a Segment from a position other than the beginning will not pick up any already sustained MIDI notes, though it will pick up waves in Wave Tracks, as well as subsequent MIDI notes. Transition types can be auditioned within DirectMusic Producer by using the Transition (A|B) button.
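The two alignment options described above can be illustrated with a small model. This is only a sketch of the bar/beat arithmetic, not DirectMusic code: the `Position` type, the `destination_start` function, and the mode names are hypothetical, and the sketch assumes both Segments share the same time signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    bar: int   # 1-based measure number
    beat: int  # 1-based beat within the measure


def destination_start(source: Position, alignment: str) -> Position:
    """Return where the destination Segment begins (hypothetical model)."""
    if alignment == "none":
        # Default behavior: start from the beginning of the destination.
        return Position(bar=1, beat=1)
    if alignment == "align_to_segment":
        # Same bar and beat as the source Segment's current position.
        return Position(bar=source.bar, beat=source.beat)
    if alignment == "align_to_barline":
        # First bar of the destination, but the same beat as the source.
        return Position(bar=1, beat=source.beat)
    raise ValueError(f"unknown alignment mode: {alignment!r}")


source = Position(bar=10, beat=2)
print(destination_start(source, "align_to_segment"))  # Position(bar=10, beat=2)
print(destination_start(source, "align_to_barline"))  # Position(bar=1, beat=2)
```

The example reproduces the bar ten, beat two case from the text: Align to Segment keeps both values, while Align to Barline keeps only the beat.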
Figure 2-19: The Transition, or A|B button. By right-clicking on the button, you can set all of the above options (and then some!).
Figure 2-20: The Transition Options window grants access to various transition settings. Then while playing back a Segment, click the Segment you want to be your destination, and then click the Transition button (or Alt+", its keyboard shortcut). The transition will occur as it would in an application using DirectMusic.
Figure 2-21: Play the Segment.
Figure 2-22: Highlight the Segment to which you want to transition, and press the Transition button.
Chapter 3: Variation
Scott Selfon
Okay, we understand linear pieces of music and getting from one piece of music to another. A primary challenge to overcome is repetition. No matter how smooth our transitions, if we hear the same pieces of audio repeatedly, they are bound to get stale. Here is where DirectMusic provides significant strength!
Variable Playback/Variations Content variability can be achieved on several different levels. On a "microscopic" level, content can specify ranges for volume and pitch on individual waves played in Wave Tracks, as well as randomized volume, start time, and duration on notes in Pattern Tracks.
Figure 3-1: The Wave Properties page allows you to specify ranges for attenuation and fine-tuning. This particular example will randomly pitch-shift the wave up or down by two semitones.
Figure 3-2: The Pattern Note Properties page allows you to specify variability for velocity, start time, and duration. This particular note will play with a randomized start time and duration and with a random velocity between 75 and 125. On the "macroscopic" level, you could of course author multiple Segments and write a script to choose between them randomly. Then we have variations, which provide the best of both worlds. Variations are those 32 intimidating-looking buttons present on Wave Track parts and Pattern Track parts (in Segment files), as well as in pattern parts (in Style files, a format which we cover in a bit).
Figure 3-3: The Variation Button window. Each of those buttons represents one version of the part; for instance, you could have a drum part where variation 1 includes the full drum part, variation 2 removes the snare, variation 3 mixes up the rhythms a bit, and so on. The primary thing to remember about variations is that at any given time, a single variation plays per part. This does not mean that every single part is playing variation 12 at the same time, but it does mean that if you author 12 variations for your drum Track, only one is being played at any given time. Each part chooses a new variation to play whenever its Segment is started or loops. Before we get into the details of variation behavior and functionality, let's talk a bit about the 32 buttons, which can be among the trickier aspects of DirectMusic Producer to understand. When you first create an empty part that supports variations, each of the 32 button labels is white. This indicates that these variations are empty. As soon as you add any note, continuous controller, or wave information to a variation, that variation's label becomes black, which indicates that it has some information in it.
Figure 3-4: Variation 1 has data in it, while variation 2 is empty. One potentially confusing aspect of variations is that empty variations are just as acceptable to be "picked" to be played as variations with data. For instance, a composer might create three alternate versions of a drum part on variations 1 through 3. When he then tries to play the piece of music, he finds that most of the time it's silent. Why? Because variations 4 through 32 were enabled and empty (silent), meaning 29 out of 32 times, an empty variation was chosen. This brings us to disabling variations. When a variation is disabled, DirectMusic will never "pick" it to be played. By default, all variations are enabled. This is indicated in DirectMusic Producer by a solid gray background for the button.
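The "29 out of 32" figure above follows directly from counting enabled-but-empty variations. The sketch below works through that arithmetic; it assumes each enabled variation is equally likely to be picked, and the function name `chance_of_empty_pick` is made up for illustration.

```python
from fractions import Fraction


def chance_of_empty_pick(enabled, with_data):
    """Fraction of picks that land on an enabled variation with no data."""
    enabled = set(enabled)
    if not enabled:
        raise ValueError("at least one variation must be enabled")
    empty_but_enabled = enabled - set(with_data)
    return Fraction(len(empty_but_enabled), len(enabled))


all_32 = range(1, 33)   # every variation left enabled (the default)
authored = {1, 2, 3}    # the three drum variations that contain data

print(chance_of_empty_pick(all_32, authored))    # 29/32
print(chance_of_empty_pick(authored, authored))  # 0
```

Disabling variations 4 through 32 shrinks the pool to the three authored variations, so a silent pick can no longer occur.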
Figure 3-5: Variations 3 and 4 are disabled. Variation 3 has data in it, which typically should be deleted (since the variation will never be chosen). To disable one or more variations, make sure the appropriate variation is pressed by clicking on it; then right-click anywhere over the 32 variation buttons and choose Disable from the right-click menu that appears. Disabled variations have a checkerboard look to them. Note that it is possible to have a non-empty variation that is disabled. In most typical scenarios, this means that you are saving data that you will never use, so it is generally a good idea to go ahead and erase any disabled variations when your project is completed.
Auditioning Variations in DirectMusic Producer The final aspect of the user interface for variations is the buttons being selected ("pressed down") or not selected ("up").
Figure 3-6: Variation 1 is selected; 2 is not. The main thing to remember about this aspect of the buttons is that it is solely for editing and auditioning purposes in DirectMusic Producer; whether a button is pressed or not has no bearing on how the variations will behave in a DirectMusic player or in a game at run time. Pressing buttons serves two purposes:
§ Editing one or more variations simultaneously. Pressing a single variation button means that you are now editing only that variation. Paste from the clipboard, add a note, or drag a wave into the part, and that data is assigned solely to that variation. If you select multiple variations, notes and waves added will be identically added to each of the selected variations.
§ Auditioning variations within DirectMusic Producer. Let's say you created a four-part piece of music, and each part had ten enabled variations. You listen to it a few times and notice that there's an incorrect note in one variation of the bass part. Rather than forcing you to keep repeatedly playing the Segment until that variation gets picked again, DirectMusic Producer has an audition mode, where the same variation (whichever one is "pressed") will be chosen every time the Segment is played. Only one part can be in audition mode at a time. To select a part for audition mode, just click anywhere on it. The part that is in audition mode is highlighted in a slight yellow shade.
Figure 3-7: Part three (piano) is in audition mode. Within DirectMusic Producer, all other parts will randomly choose which variation to play from the enabled variation buttons. Part three will always play whatever variation button is pressed (until it is no longer in audition mode). Remember that audition mode only applies in DirectMusic Producer; when a Segment is saved and played in a DirectMusic application, none of its parts will be in audition mode, and all parts will choose from their enabled variations. Audition mode provides for some unique behaviors not seen outside of DirectMusic Producer. First, as multiple variation buttons can be selected ("pressed") at once, you have the opportunity to hear a single part playing multiple variations at once. As mentioned previously, only a single variation will be chosen at a time in an actual DirectMusic application. You can also select disabled variations and audition them; a DirectMusic application will never play a disabled variation. Lastly, you can have all of the variations on your auditioned part deselected, in which case no variation will be chosen, causing this part to be silent. Again, this behavior is solely an audition feature of DirectMusic Producer.
Variation Shortcuts There are a few shortcuts provided for quick selection of which particular variations you are editing. The vertical bar to the left of the 32 variation buttons selects (and with a second click deselects) all enabled variations. Double-clicking on a variation will "solo" it (only select that variation). Double-clicking again will "unsolo" it, reverting to the previously pressed buttons. If you are using linear wave files, a shortcut is provided to avoid needing to work directly with the 32 variation buttons at all. From the File menu, choose Import>Wave File(s) as Segment Variation(s)…. From the dialog that appears, choose all of the wave files that you want to be variations in your Segment. A clock-time Segment is automatically generated, with a length of the longest wave file you chose and using one variation per wave file. You'll notice that in this case, all of the empty variations (unless of course no variations are empty) are automatically disabled.
Specifying Variation Behavior Settings Once you have created a Segment that utilizes variations, the next logical step is to be able to more finely control how variations are chosen. For instance, if we had 20 variations of a dialog fragment, we would really like the person hearing this Segment not to hear the same variation played twice in a row. Actually, it would often be ideal to play every possible variation once before we ever get a repeat. This kind of functionality can be specified by the audio producer from each part's properties page. As usual, right-click a part and choose Properties (or if the Properties page was already open, click the part to make it the active property display).
Figure 3-8: Selecting per-part variation behavior for a Pattern Track part.
Figure 3-9: Selecting per-part variation behavior for a Wave Track part. Though presented in different locations on the Pattern Track and Wave Track property pages, the available options and functionality are the same:
§ In random order: The default behavior. A new variation is randomly selected from the enabled variations every time the piece of music plays or loops. Repeats can occur.
§ In order from first: The first time the Segment is played, variation 1 (or the first enabled variation) is played. On each repeat or loop, the next enabled variation in order is played.
§ In order from random: The first time the Segment is played, a random enabled variation is chosen. On each repeat or loop, the next enabled variation in order is played.
§ With no repeats: Random enabled variations are chosen on each repeat or loop, but the same variation will not be chosen twice in a row.
§ By shuffling: On each repeat or loop, a random enabled variation is chosen, but every enabled variation will be played once before any are repeated.
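The five behaviors listed above can be modeled in a few lines. The following is a rough sketch rather than DirectMusic's actual selection code: the `VariationChooser` class and its mode names are hypothetical, the in-order modes are assumed to wrap around after the last enabled variation, and the shuffle mode is assumed to reshuffle once every enabled variation has played.

```python
import random


class VariationChooser:
    """Hypothetical model of one part's variation-selection behavior."""

    def __init__(self, enabled, mode, rng=None):
        self.enabled = sorted(enabled)
        if not self.enabled:
            raise ValueError("at least one variation must be enabled")
        self.mode = mode
        self.rng = rng if rng is not None else random.Random()
        self.index = None  # position in self.enabled for the in-order modes
        self.last = None   # most recent choice, used by "no_repeats"
        self.bag = []      # variations not yet played in this shuffle pass

    def next(self):
        """Pick the variation for the next play or loop of the Segment."""
        if self.mode == "random":
            choice = self.rng.choice(self.enabled)
        elif self.mode in ("order_from_first", "order_from_random"):
            if self.index is None:
                if self.mode == "order_from_first":
                    self.index = 0
                else:
                    self.index = self.rng.randrange(len(self.enabled))
            else:
                # Assumption: wrap to the first enabled variation at the end.
                self.index = (self.index + 1) % len(self.enabled)
            choice = self.enabled[self.index]
        elif self.mode == "no_repeats":
            candidates = [v for v in self.enabled if v != self.last]
            # With a single enabled variation a repeat is unavoidable.
            choice = self.rng.choice(candidates or self.enabled)
        elif self.mode == "shuffle":
            if not self.bag:
                self.bag = list(self.enabled)
                self.rng.shuffle(self.bag)
            choice = self.bag.pop()
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")
        self.last = choice
        return choice


chooser = VariationChooser({1, 2, 5}, "order_from_first")
print([chooser.next() for _ in range(4)])  # [1, 2, 5, 1]
```

In this model, "no_repeats" only guards against back-to-back repeats, while "shuffle" guarantees that all enabled variations are heard once per pass, which matches the dialog example above where every fragment should play before any is repeated.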
Recall that a new variation is chosen each time a Segment is played or it loops. However, there are situations where we might want to "reset" the variations. For instance, let's say that each variation is the next step up in an ascending scale, and we allow the Segment to loop as the character picks up tokens. If they haven't picked up any tokens in the last few minutes, we want to stop the Segment and start the scale over from the lowest note the next time a new token is picked up. This is where the Reset Variation Order on Play option (available in the Wave Track and Pattern Track property pages) can be used to control whether DirectMusic "remembers" which variations were previously played the next time the Segment starts. If, for instance, a Segment was set at In Order from First and its Pattern Track had Reset Variation Order on Play checked, every time the Segment was played, it would start from variation 1. If it looped, it would then play variation 2, 3, and so on.
Figure 3-10: Reset Variation Order on Play functionality for the Pattern Track. There are, of course, instances where the audio producer would prefer for two or more parts to always pick their variations identically. For instance, in a DirectMusic arrangement of a jazz trio, there might be several "solos" for each of the instruments (parts). We do not want two instruments to "compete" by playing their solos at the same time. For these kinds of situations, Lock ID functionality provides a way to control how different parts choose their variations. All parts that share a Lock ID choose the same variation number. Therefore, for the above scenario, if each part has three solos to choose from, we could place the drum part's solos on variations 1-3, the guitar's solos on variations 4-6, and the piano's solos on variations 7-9, for example. For the variations where the parts do not have solos, they would either "lay out" or just keep time. When the Segment was played, each of the parts would choose its variations in step with the other parts, so only one solo would be played at a time.
Figure 3-11: This Wave Track part's Lock ID is set to 1. Any other Wave Track parts in the Segment that are set to Lock ID 1 will choose the same variation as this part. Note that Lock ID does respond to the other settings that we've discussed that control how variations are chosen; if one locked part has variation 10 disabled, that variation will never be chosen for that entire Lock ID group. Similarly, if one part is set to play its variations in order, all of the parts will follow that part's lead. Mixing and matching between various part variation settings (for instance, one part set to shuffle, another part set to play in order) within a Lock ID group can lead to somewhat unpredictable variation choices, so it's generally recommended that all parts in a Lock ID group use the same settings. Another important concept to remember is that Lock ID is unique per Track, not per Segment. That is, several parts in a Pattern Track that use the same Lock ID will choose the same variation. However, a Pattern Track part and a Wave Track part using the same Lock ID will not lock to each other. A limited subset of this kind of functionality is available in the Wave Track's property page with the Lock Variations to Pattern Track option. Checking this box means that each pchannel of the Wave Track is now locked to the corresponding pchannel of the Pattern Track; a Wave Track part on pchannel 1 will always choose the same variation as the Pattern Track part on pchannel 1 if one is present.
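The Lock ID rules above (one shared choice per group, disabled variations excluded for the whole group, and groups scoped to a single Track) can be sketched as follows. This is an illustrative model only: the `choose_variations` function and the dictionary layout are hypothetical, a `lock_id` of `None` stands in for an unlocked part, part names are assumed to be unique, and every group simply picks at random.

```python
import random
from collections import defaultdict


def choose_variations(parts, rng):
    """Return {part name: variation}, honoring per-Track Lock ID groups."""
    groups = defaultdict(list)
    for part in parts:
        if part["lock_id"] is None:
            key = ("unlocked", part["name"])        # a group of one
        else:
            key = (part["track"], part["lock_id"])  # Lock IDs are per Track
        groups[key].append(part)

    result = {}
    for members in groups.values():
        # A variation disabled in any member is never chosen for the group.
        legal = set.intersection(*(set(p["enabled"]) for p in members))
        if not legal:
            raise ValueError("no variation is enabled in every locked part")
        pick = rng.choice(sorted(legal))
        for p in members:
            result[p["name"]] = pick
    return result


trio = [
    {"name": "drums", "track": "pattern", "lock_id": 1, "enabled": range(1, 10)},
    {"name": "guitar", "track": "pattern", "lock_id": 1, "enabled": range(1, 9)},
    {"name": "piano", "track": "pattern", "lock_id": 1, "enabled": range(1, 10)},
    {"name": "crowd", "track": "wave", "lock_id": 1, "enabled": range(1, 4)},
]
print(choose_variations(trio, random.Random()))
```

Here the three Pattern Track parts always agree with each other and never pick variation 9 (disabled on the guitar), while the Wave Track part chooses independently even though it also uses Lock ID 1.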
Figure 3-12: Locking a Wave Track's variation to a Pattern Track. Locking a Wave Track's variations to a Pattern Track is most useful when you wish to use MIDI continuous controllers to manipulate waves — panning, MIDI volume control (fade in, fade out, etc.), and so on. By authoring corresponding variations in the Pattern Track, you can ensure that a particular wave plays using particular MIDI controller information. Alternatively, you could keep the Wave Track unlocked from the Pattern Track, in which case any of your controller variations might be used over any Wave Track variation.
Variation Switch Points As we've already discussed, a new variation is chosen for each part whenever that part's Segment starts playing or loops. But what if we want variations to be chosen more frequently? In particular, the audio producer might create small fragments of melodic or rhythmic lines that he wants to be able to switch between in the midst of a Segment. Variation switch points provide the audio producer with a solution to this — namely, the ability to determine if and when a part can change variations in the middle of playback. Variation switch points are displayed in DirectMusic Producer in a separate strip for each Pattern Track part (they are not supported by Wave Track parts). The two types of switch points are enter switch points and exit switch points. Enter switch points tell DirectMusic where it is legal to "jump into" this particular variation. Exit switch points tell DirectMusic where it is legal to try to find another variation to change over to. If no variation has an enter switch point corresponding to this exit switch point, the currently playing variation will continue to play.
Figure 3-13: A variation using variation switch points. This variation can jump to any other variation at the position of the exit switch point (red blocks), shown as the dark blocks in the Var Switch Points line. Similarly, this variation can start playing mid-Segment if any other variation has an exit switch point corresponding to this variation's enter switch point (green blocks), the lighter blocks in the Var Switch Points line. While opening many possibilities for potential variation mixing and matching combinations, variation switch points can make composition much more challenging. Rather than only having to worry about how different variations meld at Segment loop points, the composer must now account for playback jumping between variations right in the middle of the Segment, at whatever boundaries the switch points allow. Often it is easiest to limit variation switch points to measure boundaries, although they do support down to grid resolution if you need it.
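The enter/exit matching rule described above can be sketched in code. This is a simplified model under stated assumptions, not DirectMusic's implementation: the `variation_after_exit` function and the data layout are hypothetical, switch point positions are plain numbers, and when several variations have a matching enter switch point the sketch picks one of them at random.

```python
import random


def variation_after_exit(current, position, switch_points, rng):
    """Return the variation that plays after `position` is reached.

    switch_points maps each variation number to a dict with "enter" and
    "exit" sets of positions (for example, measure numbers).
    """
    if position not in switch_points[current]["exit"]:
        return current  # not a legal place to leave this variation
    candidates = sorted(
        v for v, points in switch_points.items()
        if v != current and position in points["enter"]
    )
    if not candidates:
        # No variation can be entered here, so the current one keeps playing.
        return current
    return rng.choice(candidates)


points = {
    1: {"enter": {3}, "exit": {2, 4}},
    2: {"enter": {2}, "exit": {3}},
    3: {"enter": set(), "exit": set()},
}
rng = random.Random()
print(variation_after_exit(1, 2, points, rng))  # 2
print(variation_after_exit(1, 4, points, rng))  # 1
```

In the example, variation 1 can hand off to variation 2 at position 2, but its exit at position 4 has no matching enter switch point, so variation 1 simply continues.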
Influencing Variation Choices DirectMusic provides even finer-grained functionality for controlling which variations play when, particularly when the audio producer is using chord progressions. We cover the various aspects of chord progressions (chords, Chord Tracks, and ChordMaps) in more detail in Chapter 6, but for now just know that a composer can define a static chord progression by placing chords in a Chord Track or a dynamically generated progression by using a ChordMap and a Signpost Track. Regardless of how the chords are generated, a composer can elect to have variation choice influenced by the function of the currently requested chord in the current key. This functionality is in the Variation Choices window, more informally known as "The Mother of All Windows." The window opens from the button to the right of the 32 variation buttons:
Figure 3-14: The Variation Choices window. In this window, each row corresponds to one of the 32 variations. Each column allows the audio producer to specify exactly what kinds of chords a variation will or will not be chosen for. For instance, you might have a melodic line in a variation that sounds great played over a IV chord (e.g., an F major chord in the key of C), but you have another variation that provides a stronger option to play on a ii chord (e.g., a D minor chord in the key of C). By selecting and unselecting the various buttons in the Variation Choices window, you can add these kinds of very specific restrictions. An unselected button means that given a matching set of chord conditions, this variation will not be chosen. Unlike with the 32 variation buttons, the button pressed/unpressed state is actually used here to determine in-application behavior. However, as with the variation buttons themselves, a disabled variation's row will be grayed out. The Functions area allows you to specify specific chords that a variation should or should not be played over. The capitalized version of the chord (e.g., I) indicates the major version, the lowercase version (e.g., i) is minor, and italicized (e.g., i) is for all other chord types (augmented, diminished, etc.). The Root column allows you to specify whether a variation should play over notes that are based within the scale (S), or based on a non-scalar tone that is flatted (b) or sharped (#). Type allows you to determine whether the variation should play over a triadic chord (tri), a chord with a sixth or seventh (6, 7), or more complex chords (Com). Lastly, Dest, or destination, actually looks ahead to what the next chord in the progression is going to be. If the current chord is leading to the tonic (->I), a dominant chord (->V), or any other chord (->Oth), you can instruct a variation to never play in one or more of these situations. Look at Figure 3-15.
Assuming we're in the key of C, this variation will only play if the chord is C major (I), D minor (ii), F major (IV), G major (V), or A minor (vi). It will also only play over scalar roots, so, for instance, it would not play over Ab minor. The variation will play over any chord "type" (triadic, 6th/7th, or complex). Lastly, this variation will only be chosen if the next chord encountered in the Segment is a C major (I) or G major (V) chord.
Figure 3-15: One possible configuration for a variation. Audio producers and programmers often ask if there is a way to force a specific variation choice. The Variation Choices window provides one option for doing this. Without using any of the standard chord functionality (notes being revoiced and/or transposed to fit the chord), you could make every variation play over a unique chord and then choose the particular chord to force that single "legal" variation to be chosen.
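The Figure 3-15 configuration described above amounts to four filters that must all pass. The sketch below models that as a simple predicate; the `ChordContext` and `VariationChoices` types, the field names, and the string labels are hypothetical stand-ins for the buttons in the window, not part of any DirectMusic API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChordContext:
    function: str    # e.g. "I", "ii", "IV" (case encodes major/minor)
    root: str        # "S" (scalar), "b" (flatted), or "#" (sharped)
    chord_type: str  # "tri", "6,7", or "Com"
    dest: str        # "->I", "->V", or "->Oth"


@dataclass(frozen=True)
class VariationChoices:
    functions: frozenset
    roots: frozenset
    types: frozenset
    dests: frozenset

    def allows(self, chord: ChordContext) -> bool:
        """True only if every column has the matching button selected."""
        return (
            chord.function in self.functions
            and chord.root in self.roots
            and chord.chord_type in self.types
            and chord.dest in self.dests
        )


# The configuration described for Figure 3-15.
figure_3_15 = VariationChoices(
    functions=frozenset({"I", "ii", "IV", "V", "vi"}),
    roots=frozenset({"S"}),
    types=frozenset({"tri", "6,7", "Com"}),
    dests=frozenset({"->I", "->V"}),
)

f_major_to_g = ChordContext("IV", "S", "tri", "->V")
f_major_to_d = ChordContext("IV", "S", "tri", "->Oth")
print(figure_3_15.allows(f_major_to_g))  # True
print(figure_3_15.allows(f_major_to_d))  # False
```

An F major triad leading to G major passes every column, while the same chord leading anywhere other than I or V is rejected by the Dest filter alone.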
Tips and Further Options for Variability We've now covered several ways to create nonrepetitive content. There are numerous other ways to add variability to audio content, including (but not limited to):
§ ChordMaps (see Chapter 6)
§ Scripting (see Chapter 7)
§ Style-based playback (see Chapter 4)
But the most common and powerful way to create nonrepetitive content is in the areas that we've already discussed, particularly using pitch, volume, and duration variability, along with the 32 variation buttons available per Wave Track and Pattern Track part. Avoiding repetition for sound effects and ambience can be accomplished by using pitch and attenuation randomization in combination with the use of multiple variations (typically each using a unique authored version of a sound's wave or waves, one for each variation). If a sound effect is composed of several different aspects, try varying the timing between them somewhat as well for even more alternatives. Let's take the classic example of a footstep in a game. Traditionally, you might record five or six footsteps and randomly choose between them every time your character moves forward on the screen. But using DirectMusic, you could break the footstep into its component elements: the sound of a heel, the sound of a toe, and perhaps a separate scuff for the sole of the foot/shoe/whatever. Even if you only record three variations of each, by placing each in a separate Wave Track part, you now have at least 3 x 3 x 3 = 27 unique footsteps. If you add appropriate random pitch range to each element, you have even more footsteps. Now if you create added variations where the actual start time of the various elements is adjusted slightly (the scuff sound comes slightly earlier or later), you've got even more potential "performances" of this footstep, with no programmer assistance required. For musical variability, consider occasionally adding or removing single instrument lines via silent variations.
Changes in orchestration from variation to variation are also a quick way to keep music changing with each performance. Melodic lines, particularly those that are closely associated with a chord progression, can be more difficult to structure convincingly in multiple variations — especially when you stop to consider that every possible variation of a part will have to play properly against every possible variation of every other part. As nonmelodic instruments, drum parts tend to be more straightforward for adding variations to quickly.
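The footstep arithmetic from the tips above can be checked with a short Python sketch. This is only an illustration of the combinatorics, not anything DirectMusic runs: the wave names, the pitch range, and the timing jitter are all hypothetical values chosen for the example.

```python
import itertools
import random

# Hypothetical wave names: three recorded takes of each footstep element.
HEELS = ["heel_1", "heel_2", "heel_3"]
TOES = ["toe_1", "toe_2", "toe_3"]
SCUFFS = ["scuff_1", "scuff_2", "scuff_3"]


def all_footsteps():
    """Every distinct (heel, toe, scuff) combination."""
    return list(itertools.product(HEELS, TOES, SCUFFS))


def random_footstep(rng, pitch_range_semitones=1.0, scuff_jitter_ms=15.0):
    """Pick one take per element, then add pitch and timing randomization.

    The pitch range and jitter defaults are illustrative values only.
    """
    step = {}
    for name, takes in (("heel", HEELS), ("toe", TOES), ("scuff", SCUFFS)):
        step[name] = {
            "wave": rng.choice(takes),
            "pitch_offset": rng.uniform(-pitch_range_semitones,
                                        pitch_range_semitones),
        }
    # Let the scuff land slightly earlier or later than authored.
    step["scuff"]["start_offset_ms"] = rng.uniform(-scuff_jitter_ms,
                                                   scuff_jitter_ms)
    return step


print(len(all_footsteps()))  # 27
```

The 27 combinations come purely from choosing one take per element; the continuous pitch and timing offsets are what push the number of audibly distinct performances well beyond that.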
Chapter 4: Interactive and Adaptive Audio Scott Selfon So far, we have covered playing back basic linear audio and adding aspects of variability to that music. The next step is to make the audio dynamic — that is, allow it to respond to events that the user or program creates (i.e., interactive and adaptive audio). For instance, computer games often want to convey a certain emotional level based on what the current game state is. If the player's health is low and many enemies are around, the composer would often like the music to be more frantic. This general concept of an "intensity level" is often difficult to communicate to the programmer and even more challenging to implement convincingly in an application.
Variable Intensity Music and Ambience DirectMusic provides several potential solutions for creating dynamic music. In this chapter, we cover two in particular: the use of DirectMusic Style files and the use of Segment-based branching.
DirectMusic Style Files Perhaps the easiest way to implement intensity level is to use DirectMusic Style (.sty) files, which contain numerous musical fragments, each with an assigned intensity level. Audio producers compose one or more patterns (musical loops) within a Style. A pattern is composed of a Pattern Track, those same MIDI-supporting Tracks we've already used in Segments. Each pattern is assigned a groove level range, DirectMusic's term for musical intensity. When the programmer alters the groove level based on application events, a new pattern is chosen automatically based on the new intensity requested. When a new Style file is created (File>New…, or the Ctrl+N shortcut), it appears in the project tree with several folders.
Figure 4-1: A newly created style file. Remember that design-time file extensions end with p, so this is a .stp file. The run-time version of the file will be a .sty file. The Bands folder contains one or more Bands (information about per-channel volume, pan, and patch change settings) that can be dragged into a Segment's Band Track. Typically, you will just use one Band per Style, and that Band is automatically inserted when you drop the Style into a Segment. More information on that in a bit. Motifs provide a subset of the functionality available to secondary Segments (and for history buffs out there, they are actually the predecessor to secondary Segments). Recall that one or more secondary Segments can be played at a time and are typically layered on top of a
primary Segment, whereas only a single primary Segment can be playing at a time. While motifs may appear somewhat convenient (all living in a single file), there are several significant drawbacks to motifs: § They cannot be accessed directly using DirectX Audio Scripting. DirectX Audio scripts can only play and stop Segments, not motifs (which reside in Styles). § While they can be played directly by a programmer, the composer needs to manually embed a Band within the motif to ensure that its expected instrument patch changes are applied before it is played. § While motifs support variations, it's up to the game programmer to keep using the same instance of the motif in order to avoid resetting the variation order. In short, for most cases, secondary Segments are much more useful and convenient than motifs. So we're typically going to have a Style with a single Band, no motifs, and several patterns. Our default Style has a single pattern. If we double-click it, the Pattern Editor (looking quite similar to the Segment Designer window) opens. MIDI controller editing is identical to the behavior in Pattern Tracks in Segments.
Figure 4-2: The Pattern Editor window. There is a new Track at the top of the part called the Chords for Composition Track. This Track, covered in more detail in Chapter 6, tells DirectMusic what the original function of the music was so that DirectMusic can make an educated choice when revoicing the music to work over a new chord or key. For now, we leave it at the C major default. Note DirectMusic Style files only support Pattern Tracks and the Chords for Composition Track and therefore DLS Instruments. In particular, Styles cannot be used in conjunction with Wave Tracks, so consider Segment-based branching if you are using prerendered and/or streamed music. Opening the properties page for a pattern gives us access to groove range and other pattern-related functions. This is also where you can adjust the length of this specific pattern.
Figure 4-3: The properties page for a pattern within a Style.
We've discussed groove range earlier. An audio producer can assign a pattern to a single groove level if they wish or a range of appropriate groove levels. Typically, a range of values is used. This allows for two things:
§ Pattern variability. If two patterns have overlapping ranges, a requested groove level within the overlap will randomly choose between the two patterns (all other aspects of the patterns being equal; see below for more details on how a pattern is chosen).
§ Easy addition of patterns later. Let's say the composer creates three patterns: a low-intensity pattern (assigned to groove range 0–25), a medium-intensity pattern (assigned to groove range 26–75), and a high-intensity pattern (assigned to groove range 76–100). After trying the application, the composer might decide another pattern with intensity between "medium" and "high" is appropriate. He can create a new pattern without the programmer needing to make any changes to his code. By contrast, using a single groove level, rather than a range, for each pattern would probably require the programmer to change how he determines the groove level.
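As a rough illustration of why ranges make later additions painless, the Python sketch below models groove ranges as plain tuples. The pattern names and the 60–85 range for the added pattern are hypothetical, and the lookup is a simplification rather than DirectMusic's actual selection logic.

```python
import random

# (name, lowest groove level, highest groove level); names are hypothetical.
PATTERNS = [
    ("low", 0, 25),
    ("medium", 26, 75),
    ("high", 76, 100),
]


def candidates(patterns, groove_level):
    """Names of all patterns whose range contains the requested level."""
    return [name for name, lo, hi in patterns if lo <= groove_level <= hi]


def choose(patterns, groove_level, rng=random):
    """Pick randomly among matching patterns (None if nothing matches)."""
    matches = candidates(patterns, groove_level)
    return rng.choice(matches) if matches else None


# Later, the composer adds a pattern between "medium" and "high".
# The calling code, which only ever supplies a groove level, is unchanged.
EXTENDED = PATTERNS + [("medium_high", 60, 85)]

print(candidates(PATTERNS, 70))  # ['medium']
print(candidates(EXTENDED, 70))  # ['medium', 'medium_high']
```

The caller's contract stays "pass a number between 0 and 100"; only the authored content changes.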
The Destination Range option can only be used when you know what groove level your music is heading toward — that is, generally only when groove levels are authored directly into a Segment via a Groove Track (see the "Using Styles with Segments" section later in this chapter). With destination range, DirectMusic will actually "look ahead" to what groove level is going to be needed next before choosing which pattern should be played. While this could be somewhat useful when "ramping" from one groove level to another, the idea of authoring groove levels into content is typically contrary to the desire for dynamic manipulation of groove levels. For this reason, destination range is rarely used. The Embellishment area allows you to add even further control to how a pattern is chosen. Embellishments are generally specific patterns that you can call for without changing groove levels. For instance, you might be playing a piece of music at groove level 50 and you want to transition to a new piece of music, also at groove level 50. Rather than temporarily changing the groove level to pick a convincing ending pattern, you could create a pattern still assigned to groove level 50 that is an end embellishment. To further strengthen the flow of the transition, the new Style could have an intro embellishment pattern. Then when the programmer or scripter wants to start playing the second piece of music, he can say to play ending and introductory embellishments, and these specially assigned patterns will be played between the two pieces of music. The naming conventions for embellishments (intro, end, fill, break, and custom) are actually somewhat arbitrary — from a programmer's or scripter's point of view, there's no difference between most of them. The exceptions are that an intro embellishment is always picked from the Style you are going to, and the end embellishment is always picked from the Style that you are coming from. 
Of course, if you're sticking with the same Style and just wanted a brief pattern change, even these have no difference in function. Note that custom embellishment types (with a range from 100–199 to
help keep them from becoming confused with groove ranges) can only be requested programmatically or within content; DirectX Audio Scripting does not support them. The Chord Rhythm option adds yet another layer of control over how DirectMusic chooses which pattern to play, assuming the Segment uses chords (in Chord Tracks or via ChordMaps). The audio producer might have one pattern that is more ideal to a rapidly changing progression and another that works better under a static chord. DirectMusic will look for a pattern with a chord change rhythm that maps closest to the currently playing Segment's chord rhythm.
Choosing a Pattern from a Style
So after all of these settings, how exactly does DirectMusic choose which pattern to play at a given time? In descending order of priority:
1. Look for the embellishment requested.
2. Look for the groove level requested.
3. Look for a pattern whose destination groove level matches the next groove change. Again, as the destination groove level is rarely known ahead of time, this particular rule is rarely used.
4. Choose the longest pattern that fits the available space (between now and the end of the Segment or the next groove change).
5. Look for a pattern with an appropriate chord rhythm.
6. In case of a match in all other settings, randomly choose one of these matching patterns to play.
The fourth rule is often the most significant to keep in mind. The argument here is that DirectMusic should pick the musically "strongest" progression, which would mean preferring to play a single eight-bar pattern rather than a one-bar pattern eight times. But the rule also applies when you have multiple longer patterns to choose between. If you had been anticipating using variable-length patterns, this restriction is generally not easy to get around, but it does allow for a simpler composition and musical transition process.
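The priority order above can be sketched as a series of narrowing steps. The Python below is a simplified model under stated assumptions, not DirectMusic's implementation: the `Pattern` fields, the single-number stand-in for chord rhythm, and the fallback of ignoring a rule when no pattern satisfies it are all inventions for illustration, and the pattern list is assumed to be non-empty.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pattern:
    name: str
    groove_lo: int
    groove_hi: int
    bars: int
    embellishment: Optional[str] = None  # e.g. "intro", "end", "fill"
    dest_lo: Optional[int] = None         # destination range, if authored
    dest_hi: Optional[int] = None
    chord_changes_per_bar: int = 1        # crude stand-in for chord rhythm


def narrow(pool, keep):
    """Keep only patterns passing `keep`, unless that would empty the pool."""
    kept = [p for p in pool if keep(p)]
    return kept or pool


def choose_pattern(patterns, groove, bars_available, embellishment=None,
                   next_groove=None, chord_changes_per_bar=1, rng=random):
    pool = list(patterns)
    # 1. Embellishment requested (None means an ordinary pattern).
    pool = narrow(pool, lambda p: p.embellishment == embellishment)
    # 2. Groove level requested.
    pool = narrow(pool, lambda p: p.groove_lo <= groove <= p.groove_hi)
    # 3. Destination range, only when the next groove level is known.
    if next_groove is not None:
        pool = narrow(pool, lambda p: p.dest_lo is not None
                      and p.dest_hi is not None
                      and p.dest_lo <= next_groove <= p.dest_hi)
    # 4. Longest pattern that fits the available space.
    fitting = [p for p in pool if p.bars <= bars_available]
    if fitting:
        longest = max(p.bars for p in fitting)
        pool = [p for p in fitting if p.bars == longest]
    # 5. Closest chord rhythm.
    def distance(p):
        return abs(p.chord_changes_per_bar - chord_changes_per_bar)
    best = min(distance(p) for p in pool)
    pool = [p for p in pool if distance(p) == best]
    # 6. Random choice among whatever is still tied.
    return rng.choice(pool)


patterns = [
    Pattern("one_bar", 1, 25, bars=1),
    Pattern("eight_bar", 1, 25, bars=8),
    Pattern("ending", 1, 25, bars=2, embellishment="end"),
]
print(choose_pattern(patterns, groove=10, bars_available=8).name)  # eight_bar
print(choose_pattern(patterns, groove=10, bars_available=4).name)  # one_bar
```

The second call shows the practical consequence of rule 4: with only four bars left before the Segment ends, the eight-bar pattern no longer fits, so a different pattern is chosen even though the groove level has not changed.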
Building Up Styles The most basic use for Styles is what we like to call additive instrumentation. Each pattern is authored identically to the previous (lower groove level) one but adds another instrument to the mix. This can work quite effectively and is relatively easy to implement. The audio producer creates a low groove level pattern (which could be silence, basic rhythm, or perhaps just subtle ambience). They then create additional patterns, each at a higher groove level and each adding a new instrumental line or additional rhythmic complexity. Primarily due to the pattern-choosing rules discussed above, it is often easiest to create every basic pattern with the same length. Embellishment patterns of differing lengths can help break up the rhythm somewhat, though all patterns must use the Style's global time signature for their playback length. Individual parts in a pattern can be varying lengths and time signatures to help distract from the feeling of constant time signature. By the way, groove level and pattern length can be viewed in a somewhat easier-to-read interface via the Style Designer window. The window is opened by double-clicking on the Style itself in the project tree rather than any of its patterns and displays information on all of the patterns (their length, groove ranges, embellishments they are assigned to, and so on).
Figure 4-4: The Style Designer window for a Style with five patterns for varying groove levels. The highlighted pattern is an ending embellishment (as indicated by the E in the Embellishment column).
Using Styles with Segments
Remember that the basic unit of playback (as far as the programmer is concerned) is typically the DirectMusic Segment, not a Style file. So we're going to want to create a Segment that actually "plays" our style. Several Track types come into play when instructing a Segment on how to use one or more Styles:
§ Style Track: This Track is the basis for using Styles with Segments. Typically, the Style Track will consist of a single Style placed in the first bar. However, there's nothing wrong with changing Styles as a Segment proceeds.
§ Groove Track: This is the Track where the audio producer can set the current groove level for the Segment. This Track also allows for a Segment to manually request an embellishment pattern (for instance, play an intro embellishment at the beginning of the Segment).
Inserting a Style into a Style Track automatically makes changes to several other Tracks in a Segment, inserting new Tracks if necessary.
§ Time Signature Track: This Track becomes locked to the time signature of the current Style, as you can't override a Style's time signature in a Segment. If additional Styles are inserted later in the Segment, similarly locked time signature events (which appear "grayed out") will appear.
§ Tempo Track: The default tempo for the Style (specified in the Style's property page and the Style Designer window) is inserted on the first beat of the bar where the Style was entered.
§ Band Track: The Style's default Band is inserted one tick before the bar where the Style begins playback. This ensures that all of the instrument changes and volume/pan settings are applied just before the first notes are heard (assuming no notes begin before beat one; if any notes do occupy the pickup bar, you should manually move the Band even earlier). Note that the Band inserted into the Band Track is a copy of the Style's Band, not a "reference" to it. That is, if you now go back and edit the Style's Band, those changes are not applied to the Band in the Segment. Similarly, if you edit the Band in the Segment, those changes won't be reflected in the original Style's Band. We typically manually drag the Style's Band into the Segment just to make sure we are not using an old version of the Band.
So all of these Tracks are automatically filled in for us. But what about the Groove Track? Because the groove level is something we want to be dynamically changed by the programmer while the application is running, composers don't typically author groove
changes into the Groove Track beyond an initial setting (typically the lowest groove level) in the first bar.
Figure 4-5: A typical Style-based Segment. Groove level 1 is specified in the first bar. When the Style was inserted, the other tracks were created and/or modified automatically. Before we completely dismiss the Groove Track, it is still quite useful for auditioning groove changes within DirectMusic Producer. After composing a bunch of patterns of varying groove levels, you might create a long Segment, drop in the Style and a few groove changes, and listen to how the performance sounds. You could even use variable groove changes (the +/– option in a groove change's property page) or experiment with the embellishment types. As far as the length of the Segment, you have several options. If you are using patterns of varying lengths, you'll generally want to make the Segment the least common multiple of those lengths. For instance, if one pattern is four bars and another is three, you'll want to make the Segment 12 bars long (and infinitely looping). This just makes sure that, assuming a single groove level (and often a single pattern) is played for a long time, the pattern won't be interrupted or cut short by several bars due to the Segment ending. Worse, remember that the pattern chosen is also influenced by how much space is left before the end of the Segment — so you might get a completely different pattern chosen right at the end if the pattern length did not match the Segment length. For the specialized (and easier to manage) case where all patterns have exactly the same length, creating an infinitely looping Segment of that same length is generally preferable. This will also give us some added flexibility in how transitions between the groove levels occur, particularly with the ability to successfully use alignment transitions.
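The Segment-length rule of thumb above is just a least common multiple. A minimal Python sketch follows; the function name is made up for this example.

```python
from functools import reduce
from math import gcd


def segment_length_in_bars(pattern_lengths):
    """Least common multiple of the pattern lengths, in bars."""
    return reduce(lambda a, b: a * b // gcd(a, b), pattern_lengths, 1)


print(segment_length_in_bars([4, 3]))     # 12
print(segment_length_in_bars([4, 5]))     # 20
print(segment_length_in_bars([8, 8, 8]))  # 8
```

The last call covers the easier case described in the text: when every pattern has the same length, the looping Segment is simply that length.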
Changing the Playing Patterns If you've experimented with groove changes to audition your patterns in a Segment, you might notice that the new groove level's pattern didn't always take over immediately. The reason behind this is that a groove change tells DirectMusic to choose a more appropriate pattern when the currently playing pattern finishes. Certainly, the application might change groove levels quite a bit, and you wouldn't want new patterns abruptly starting and restarting frequently. However, if you've created a lot of 20-bar patterns, the change to the new groove level could definitely take quite a while and more time than you would hope. There are several options for making pattern changes occur more frequently. The easiest will involve the programmer or scripter. Every time he adjusts the groove level, he can also restart the currently playing Segment using any of the transition types we've already discussed. For instance, the audio producer could instruct the programmer to restart the Segment at the next bar. This means that a groove change will "take effect" (a new pattern will begin playing) within the next bar. Or, in the above example, the audio producer could structure his 20-bar patterns such that they're actually composed of five four-bar-long sections. The audio producer could then put a Marker Track in the Segment (note that the markers are not in the Style file, as Style patterns don't support Marker Tracks) at the natural
break points for the music, and the programmer could restart the Segment on a marker boundary. One difficulty with this solution is that we are restarting the Segment (and thus whatever the appropriate pattern is) every time we touch the groove level. This can be problematic when smaller groove level changes shouldn't really affect the pattern we're playing. For instance, if our "low level" pattern spans groove levels one through 25, it doesn't really matter to us if the groove level changes from one to two. The same piece of music should just keep playing. But in the simplest implementation of the above solution, we're retriggering the Segment — and thus restarting the pattern — every time the groove level changes. Depending on how frequently these changes occur, we could get very intimately acquainted with the first few bars of the pattern and never hear the 20th onward. One way to combat this is for the content creator to instruct the programmer as to when the Segment should really be restarted versus allowing it to continue with the newly adjusted groove level. Based on the above issue with potentially restarting the playing pattern from the beginning constantly, another commonly used technique is to use alignment to Segment transitions. Remember, in this kind of transition, we jump to the same relative location in the destination Segment as we were at in the previous Segment. So if we were at bar four beat two, we jump to bar four beat two of the new piece of music. In the particular case of adjusting groove levels, both the source and destination Segment are the same piece of music. The retriggering is enough to bump us over to a new pattern more appropriate for the current groove level. But with the use of alignment, we transition over in mid-pattern rather than restarting the Segment. This technique is a bit trickier to manage than the previous one. 
Remember that if we're starting in the middle of a pattern, any notes that would have started playing prior to that point would not be retriggered — so for instance, if our destination had some nice string pads that had started on the downbeat, we wouldn't hear them if our transition occurred on beat two. An added complexity occurs if our patterns had different lengths, as they now may transition to musically awkward positions in each other. This is another reason why it's often easiest to keep all patterns at the same length.
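Returning to the earlier point about restarting the Segment only when a groove change actually matters, one way the content creator might express that instruction to the programmer is sketched below in Python. The range boundaries are illustrative, the ranges are assumed not to overlap, and the function names are hypothetical; no DirectMusic calls are involved.

```python
# Groove ranges authored in the Style; values here are illustrative.
RANGES = [(1, 25), (26, 75), (76, 100)]


def range_index(groove_level, ranges=RANGES):
    """Index of the range containing groove_level, or None."""
    for i, (lo, hi) in enumerate(ranges):
        if lo <= groove_level <= hi:
            return i
    return None


def should_restart(old_level, new_level, ranges=RANGES):
    """Restart the Segment only if the change crosses into another range."""
    return range_index(old_level, ranges) != range_index(new_level, ranges)


print(should_restart(1, 2))    # False
print(should_restart(25, 26))  # True
```

With a check like this, a drift from groove level one to two leaves the current pattern playing, while a move from 25 to 26 triggers the restart (or aligned transition) that brings in the new pattern.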
Figure 4-6: Attempting to transition using Segment alignment can be difficult for patterns of differing lengths. In the above diagram, Pattern 1 is four bars long, and Pattern 2 is five bars long. Below them is a display of the 20 bars of a looping Segment, so we can see how they line up over time. If we transition from Pattern 1 to Pattern 2 at, say, the end of bar eight, we're transitioning at the end of Pattern 1 but into the start of the fourth bar of Pattern 2. If both patterns had been the same length, we would be making a potentially more natural transition from the end of Pattern 1 to the beginning of Pattern 2.
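The bar positions in this example follow from simple modular arithmetic, as the Python sketch below shows. It assumes each pattern repeats back-to-back from the start of the looping Segment; the function name is hypothetical.

```python
def position_in_pattern(bars_elapsed, pattern_length):
    """Return the 1-based bar of the pattern that is about to play
    after bars_elapsed whole bars of a looping Segment."""
    return bars_elapsed % pattern_length + 1


# Transition at the end of bar eight, as in the figure.
print(position_in_pattern(8, 4))  # 1
print(position_in_pattern(8, 5))  # 4
```

After eight bars, the four-bar pattern is back at its first bar, while the five-bar pattern is about to play its fourth, which is why the aligned transition lands mid-pattern.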
Segment-Based Branching and Layering An alternative to DirectMusic Style files for creating variable intensity music and ambience is to use multiple Segments. Remember that with strictly Style-based playback, you are restricted to DLS instruments. With Segments, you have the ability to mix and match between DLS instruments and prerendered streaming wave files. There are two primary ways that Segments can be used to shift intensity: branching and layering.
Branching Playback
In the case of branching Segments, we change the Segment we are playing at a musically appropriate boundary, based on some event that changes the emotional level we want to convey. In the simplest implementation, branching just consists of authoring complete pieces of linear music for each emotional intensity, and then picking logical places where one piece could jump to the beginning of the next. Often measures are appropriate boundaries; if not every measure is, the composer can use markers to dictate which are. Sometimes the above can result in abrupt and unsatisfying transitions, particularly if the two pieces of music were not written with each other in mind. The jolt from playing a slow-moving piece of music to a pulse-poundingly fast one might be appropriate in some circumstances, but in others it can distract more than intended. A potential solution here is to use transition Segments. As we previously discussed, when you tell a new Segment to start playing, you can specify a transition Segment that should be played before this new Segment starts. A common solution to the abruptness issue is to create quick little "bridges" that ramp up or down the tempo, smooth out any instrument entrances and exits, and so on. Of course, this can lead to a bit of extra work on the composer's part: you might find yourself writing transition Segments for every possible configuration (slow to fast, fast to slow, slow piece 1 to slow piece 2, slow piece 2 to slow piece 1, and so on) unless you're clear as to what pieces could possibly jump to other pieces. Building up a roadmap beforehand can aid both in the planning and authoring of transition Segments.
Figure 4-7: A sample diagram for five pieces of music. Each arrow indicates that a piece of music will need to be able to transition to another piece of music. For this scenario, we assume that you have to somehow "engage" enemies in combat, so you'll never jump right to the "fighting" music from the "no enemies" music. Similarly, if you decide to flee the room, you'll disengage from the enemies first, so you don't have to worry about the "fighting" music jumping directly to the hallway "ambient" music. Yet another alternative use of branching is Segment alignment. Remember that by default, a new Segment will start playing from the beginning. Using alignment, we can make it start at the same position where the last Segment left off. With this technique, we can achieve results similar to the Style-based playback above, but with the benefit of streamed waves. Using this technique, you could author several different versions of the same audio track, with different mixes and increased instrumentation in each version. Again, using markers (in
marker tracks) to control what are appropriate boundaries, the music can seamlessly ramp up or down to another wave based on the scenario.
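The roadmap idea from the branching discussion above can also be written down as data, which makes it easy to count how many transition Segments might be needed. The Python sketch below is hypothetical throughout: the five piece names and the arrows between them are invented for illustration (they are not read from Figure 4-7), though they respect the two exclusions the text describes.

```python
# Hypothetical roadmap: piece -> pieces it may transition to.
ROADMAP = {
    "hallway_ambient": {"room_no_enemies"},
    "room_no_enemies": {"hallway_ambient", "enemies_engaged"},
    "enemies_engaged": {"room_no_enemies", "fighting"},
    "fighting": {"enemies_engaged", "victory"},
    "victory": {"room_no_enemies"},
}


def required_transitions(roadmap):
    """Sorted (source, destination) pairs that may need a transition Segment."""
    return sorted((src, dst) for src, dsts in roadmap.items() for dst in dsts)


def can_jump(roadmap, src, dst):
    """True if the roadmap allows src to transition directly to dst."""
    return dst in roadmap.get(src, set())


pairs = required_transitions(ROADMAP)
print(len(pairs))  # 8, versus 5 * 4 = 20 if every piece could reach every other
print(can_jump(ROADMAP, "room_no_enemies", "fighting"))  # False
```

A table like this can double as the composer's note to the programmer: any jump not listed is one the game logic should never request.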
Layered Segments The other effective Segment-based tool for creating dynamic music is layering. Here we keep playing the same basic piece of music, but we layer other pieces of music over the top. This involves the use of secondary Segments. Remember that only one primary Segment can be playing at a time, and playing a new primary Segment automatically stops any currently playing ones. But you can layer as many secondary Segments on top of a primary Segment as you like. Don't forget that whether a Segment is played as primary or secondary is in the hands of the programmer or scripter, so the composer needs to communicate this information beforehand. A common use of layered secondary Segments is as motific elements. Indeed, these secondary Segments replace the somewhat outdated Style-based motifs, which were much less flexible and required extra effort from the programmer or scripter in order to use them. As an example, let's say our character's love interest enters the room, and the love theme plays on top of the basic music track. The use of Chord Tracks (described in more detail in Chapter 6) can make this all the more effective. The love theme can be played in the appropriate key, and even repitch itself based on the chord progression of the underlying music. As a brief example, let's take any old Segment we've already authored and add a Chord Track to it if it doesn't already have one. (Right-click on Segment > Add Track(s)…, and choose Chord Track from the list.) Now let's right-click on the Chord Track (the leftmost part of the track that says "Chords") and bring up its properties. Odds are the track thinks our piece of music is in the key of C because we haven't told it otherwise. Mine happens to be in Eb, so I update the property page like so:
Figure 4-8: Chord Track Properties dialog. We'd probably also want to adjust that C major 7 chord that appears in measure 1 by default. Let's make it an Eb major 7 instead. (I'm not going to bother entering the full chord progression, but we'll use this particular chord in a moment.) We right-click on the CM7, choose Properties, and modify the property page like so:
Figure 4-9: Our Eb major seventh chord. Adjusting layer 1 (the bottommost chord layer) defaults to automatically editing all four layers, so you would only need to adjust the chord (four-octave keyboard on the left) and the scale (one-octave keyboard on the right) once. Right-click and uncheck Auto-Sync Level 1 to All to change this behavior. Now let's create our romantic theme. We create a new Segment with a Pattern Track and a Band Track (making sure that the Band and pattern pchannels don't interfere with any from our original piece of music). As far as the actual music, I'll just borrow a recognizable theme from Tchaikovsky's Romeo and Juliet:
Figure 4-10: Our (well, Tchaikovsky's) love theme. Note that we've authored it in the key of C (not having a Chord Track implies this) and named the Segment "LoveTheme." The Pattern Track part uses performance channel 17 (which can be adjusted from the property page), and we disabled all variations except our theme on variation 1. Our band similarly just has an entry for performance channel 17 telling it what instrument to play. If we played our theme as a primary Segment, it would play in C major. But if we play it as a secondary Segment against our piece of music in the key of Eb, it will transpose to this key. To try it out, open up the Secondary Segment Toolbar (from the View menu, choose Secondary Segment Toolbar), and choose our Segment from one of the drop-downs.
Figure 4-11: The Secondary Segment toolbar. You can specify the boundary it will audition on by right-clicking the Play button. Now play the primary Segment from the main transport controls (click on the Segment to give it "focus," which will make its name appear next to the primary segment's Play button). As it plays, play our love theme from the secondary Segment toolbar's Play button. The secondary Segment transposes to fit the key of our primary Segment. Similarly, this theme could be played against any of our linear compositions in the proper key just by authoring a Chord Track into those Segments and specifying the proper key and scale in a chord in the track. Transposition is one thing, but as a more advanced use, we might want our theme to actually follow the chord progression of our primary Segment. Let's add another chord to our primary Segment and start off with a dominant-tonic (V-I) progression. We'll move that Eb chord to bar two (by dragging or copying and pasting) and insert a Bb chord in bar 1.
Figure 4-12: Our new chord.
Figure 4-13: Our primary Segment. Now when we play our secondary Segment as this primary Segment plays, the notes will conform to the current chord as well as the scale. We can adjust the specifics of how our secondary Segment adjusts itself from its property page, and can even override these settings on a per-note basis if we wish.
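To make the key-transposition step concrete, here is a deliberately simplified Python sketch that shifts MIDI note numbers by the interval between two keys. It is plain chromatic transposition only; DirectMusic's chord-and-scale mapping is more involved and is not modeled here. The placeholder notes are a C major arpeggio, not the Tchaikovsky theme, and folding the shift to the smaller interval is an assumption of this example.

```python
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def transpose(notes, authored_key, target_key):
    """Shift MIDI note numbers by the interval between two keys.

    The shift is folded into the range -5..+6 semitones so the line
    moves by the smaller interval; that choice is an assumption.
    """
    shift = (PITCH_CLASS[target_key] - PITCH_CLASS[authored_key]) % 12
    if shift > 6:
        shift -= 12
    return [n + shift for n in notes]


theme_in_c = [60, 64, 67, 72]  # C4 E4 G4 C5, placeholder notes
print(transpose(theme_in_c, "C", "Eb"))  # [63, 67, 70, 75]
print(transpose(theme_in_c, "C", "Bb"))  # [58, 62, 65, 70]
```

Chord following goes a step further than this: each note would also be fitted to the chord and scale in effect at that moment, which is what the play mode settings control.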
Figure 4-14: The Default Play Mode area determines how our secondary Segment responds to chord changes.
In the default behavior, chord and scale are used to determine appropriate notes, but there are also settings to only transpose according to scale, to remain fixed to the authored notes (useful for drum tracks), and so on. Alternatively, secondary Segments can be played as controlling secondary Segments. In this case, rather than taking on the characteristics of the primary Segment, they force the primary Segment to take on their characteristics of tempo, chord progression, and so on: whichever tracks are present in the secondary Segment replace the primary Segment's. Going back to our previous example, we can add a Chord Track to our secondary Segment telling DirectMusic that our theme was indeed authored in the key of C, over a C major chord. If we then played it as a controlling secondary Segment over the primary Segment, the primary Segment would now take on the chord progression of the secondary Segment.
Figure 4-15: Adding a new Chord Track and C major chord to our secondary Segment.
Figure 4-16: Setting up a Play button in the secondary Segment toolbar (by right-clicking on it) to play a controlling secondary Segment. Don't forget that this is an
audition setting only; the composer would want to tell the programmer or scripter that the Segment should be played as a controlling secondary Segment. Remember that Sequence Tracks do not respond to chord changes — they are static, nonpitch-varying tracks, analogous to traditional sequenced MIDI tracks. So the fact that our controlling secondary Segment has switched out the chord progression won't be heard in the primary Segment unless any of its notes are played by Pattern Tracks. But if our music was placed in a Pattern Track, and we had traced out the chord progression for the music in our Chord Track, the primary Segment would sound quite different over this new chord progression. To repeat a useful scenario for this, background music authored in a major key could switch to a minor key when the villain enters the room, just by playing a controlling secondary Segment composed of minor chords. Indeed, there is nothing that says a secondary Segment has to be composed of any notes at all — it is perfectly acceptable to use them for Tempo, Chord, Style, Wave, and any other tracks. In this chapter, we have attempted to supply an overview and a basic understanding of some of the power and flexibility DirectMusic can bring to an interactive application. It can be used simply to provide a MIDI-capable synthesizer, with the ability to deliver instrument collections that ensure uniform content playback across multiple sound cards. It can be used to synchronously play back real-time-rendered (MIDI) and prerendered (wave) sounds in synch with each other. DirectMusic provides the ability to seamlessly transition from one piece of music to another at appropriate (and content-defined) boundaries. And it gives a rich set of tools to composers and sound designers for creating variation and dynamically adjustable content. Being aware of some of the rules and restrictions for how it can best be used, I encourage you to experiment with the concepts presented here. 
It's often easiest to gain familiarity with a few tools at a time rather than being overwhelmed by the options; this chapter has merely touched the surface of the features DirectMusic supports.
Chapter 5: DirectMusic Producer
Jason Booth
Wow! That was a lot of theory to take in. Now we're ready to dig into using the program that does it all: DirectMusic Producer. DirectMusic Producer is a complex program with a steep learning curve. While this section is not meant to be artistic, it demonstrates DirectMusic's basics, as well as the fundamentals required to use the program in a production environment. We also revisit some basic ideas key to using DirectMusic, including variation, styles, and chord changes. Our goal is to create a simple blues that uses DirectMusic's recomposition features to follow chord changes.
Getting Started Insert the CD included with this book into your CD-ROM drive, as we refer to MIDI files on this CD for the note data in this chapter. Open DirectMusic Producer and create a new project by selecting File>New from the menu and selecting a project in the dialog box. Name your project, and choose a storage location for its data on your hard drive. Select File>Import File into Project>MIDI File as Segment from the menu. Navigate to the Unit I\Chapter 5\Tutorial Part One directory on the CD and select the file Tutorial_Bass_Melody.mid. The Segment opens automatically in the main screen of DirectMusic Producer.
Figure 5-1: A DirectMusic Segment file, created from a MIDI sequence. This Segment has several tracks of data in it. The first is a Tempo Track, which controls the speed of playback. The second track is a Chord Track, which controls how DirectMusic transposes other Segments over this Segment. The next two tracks are Sequence Tracks containing note data. The final track is a Band Track, which controls the instrument patching. Often, when importing data from sequencing programs, the data requires some amount of cleanup. In this case, there are several problems; first, the Segment is two bars long, yet the music in the Sequence Tracks is only one bar long. To fix this problem, right-click on the title bar of the window and select Properties.
Figure 5-2: The properties dialog box. Note
Almost everything in DirectMusic has a properties dialog; our Segment itself has one, as well as each track and each piece of data in each track.
Click on the button next to Length, and set the number of measures to one.
Patterns The music we imported from the MIDI file is incredibly simplistic. It is a single bar of a bass line and melody and hardly interesting at that. However, music often exists as a collection of small patterns like this one, which play over various chord changes. Our next step is to make this small piece of music play over a series of chord changes, using DirectMusic's transposition functionality to adjust the notes to match the new chords. Currently, our music is stored as a Sequence Track in our Segment. Sequence Tracks do not transpose to follow chords, and play exactly as the author wrote them. To get our melody and bass line to transpose over chord changes, we need to use a Pattern Track instead. To create a new Pattern Track:
1. Select File>New.
2. Set the number of measures to one.
3. Select Pattern Track from the menu.
4. Select the Segment1.sgp (which was created) from the menu on the left-hand side.
5. Right-click and select Rename.
6. Rename the Segment to BassSegment.sgp.
We need to copy the data from our old Segment into the new Segment's Pattern Track. Select the Tutorial_Bass_Melody.sgt Segment and maximize track one's window. This opens the roll editor, where you can view and edit note data.
Figure 5-3: The roll editor displays note data, which you can edit right in DirectMusic Producer. Select the first note in the window and press Ctrl+A to select all the notes. Press Ctrl+C to copy this data. Open Pattern Track one in our new Bass pattern. Click on the first beat in the measure, and press Ctrl+V to paste the data into the track. You may have to scroll the track up or down to see the note data.
Create a new Segment named Melody.sgt with a Pattern Track, and repeat this process with track two. At the end of the process, you need to point the Pattern Track to channel two by right-clicking on the Pattern Track to bring up the properties dialog box and set the pchannel to two.
Setting the Chord Reference For DirectMusic to transpose a melody to a new chord, it needs to understand the relationship of the melody to its chord and scale. In our case, we base the bass line and melody off a C7 chord. Add a Chord Track to the bass and melody Segments by right-clicking in the Segment window and selecting Add Track and Chord Track from the menus. Right-click on the first beat in the measure of the Chord Track and select Insert to insert a chord.
Figure 5-4: We define the chord and scale reference in the Chord Properties page. Here we use a C7 chord with a C Mixolydian scale. On the bottom piano roll, set the chord to C7. Do this by clicking on the appropriate notes on the piano roll (C2, E2, G2, and Bb2). Now adjust the scale by changing the B natural to a Bb. Repeat this process for the second part as well. Note You can copy the chord from the Chord Track and paste it in the other Segment's Chord Track, rather than resetting the notes manually.
Creating a Primary Segment Now that both of our parts reference chords and scales, we must create a chord progression for them to play over. Create a new Segment called MasterSegment.sgt, set its length to four bars, and add a Chord Track. Place a series of chords in this Chord Track for the bass and melody parts to follow by selecting bar one and pressing Insert to insert a new chord. Use the following chord progression: C7 > F7 > G7 > C7. The notes and scales for these chords are shown here:

Chord   Chord Notes   Scale
C7      C E G Bb      C D E F G A Bb
F7      F A C Eb      C D Eb F G A Bb
G7      G B D F       C D E F G A B
Figure 5-5: Adding your melody and bass line. Add two Segment Trigger Tracks to MasterSegment.sgt. A Segment Trigger Track allows you to include another Segment inside of your Segment, which is extremely useful for organization. In your Segment Trigger Track, right-click on the first beat of the first bar and select Insert from the menu. Select your bass Segment from the drop-down box in the properties page. Now copy and paste this Segment into all four measures on the Segment. Repeat this process on the second track to include the melody as well. Play this Segment. Notice that your instrument settings vanish, and both tracks sound like a piano. To fix this, add a Band Track to the Segment, and copy the Band from your original, imported Segment file into this track. When you are done, MasterSegment.sgt should look like this:
Figure 5-6: Your completed master Segment. Play the Segment and notice how the notes now transpose to follow the chord progression.
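To make the idea of a pattern following a chord progression concrete, here is a minimal Python sketch. It is not DirectMusic's actual algorithm; it assumes every chord uses a Mixolydian scale built on its own root (as in this tutorial), and the names MIXOLYDIAN, CHORD_ROOTS, and play_over are hypothetical. Each note is stored as a function (octave and scale degree above the reference chord's root) and then rendered again over each chord in the progression.

```python
# Intervals (in semitones) of the Mixolydian scale above its root.
MIXOLYDIAN = (0, 2, 4, 5, 7, 9, 10)

# Root pitch classes of the chords in the tutorial progression (C = 0).
CHORD_ROOTS = {"C7": 0, "F7": 5, "G7": 7}


def to_function(midi_note, root):
    """Describe a note as (octave, scale degree) above the chord root.

    Raises ValueError if the note is not in the Mixolydian scale on root.
    """
    octave, interval = divmod(midi_note - root, 12)
    return octave, MIXOLYDIAN.index(interval)


def from_function(function, root):
    """Turn an (octave, scale degree) pair back into a MIDI note number."""
    octave, degree = function
    return root + 12 * octave + MIXOLYDIAN[degree]


def play_over(pattern, reference, progression):
    """Render a one-bar pattern once per chord in the progression."""
    ref_root = CHORD_ROOTS[reference]
    functions = [to_function(note, ref_root) for note in pattern]
    return [
        [from_function(f, CHORD_ROOTS[chord]) for f in functions]
        for chord in progression
    ]


# A bass arpeggio on C, E, G, Bb written against the C7 reference chord.
bass = [36, 40, 43, 46]
bars = play_over(bass, "C7", ["C7", "F7", "G7", "C7"])
for chord, notes in zip(["C7", "F7", "G7", "C7"], bars):
    print(chord, notes)
```

In this simplified model the whole line shifts by the distance between chord roots, which is the strictly parallel motion you hear at this point in the tutorial.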
Making Our Music Cooler Play the Segment and notice that both the bass and melody move in a linear fashion. That is, when the chord changes from C7 to F7, both lines move up a fourth in parallel. If two musicians were to play this, it is unlikely that they would move through these changes in such a strict parallel fashion. DirectMusic allows us a number of ways to specify how a given part should move through the chord changes. Open the properties page for the F7 chord. Notice that there are four different chord and scale levels. Use these different levels to specify alternate ways in which DirectMusic should interpret the chord. On the second level of the chord, change the chord mapping so that the top two notes of the chord are on the bottom of the chord. It should now be composed of C, Eb, F, and A, which is the second inversion of the chord. For the G7 chord, set the second chord to the second inversion as well (in this case, D, F, G, and B). Your chord should look like this:
Figure 5-7: Setting up alternate ways for DirectMusic to interpret the G7 chord. Now go back to your melody Segment file, right-click on an area of the pattern where no notes are found, and bring up the properties dialog. In the bottom, right-hand corner under Default Play Mode, set the chord level to Chord Level 2 so that this pattern follows the chord mappings specified in the second layer of the chord.
Figure 5-8: Adjusting chord levels. Play MasterSegment.sgt again. Notice that while the bass is moving linearly through the chord change, the melody now uses the inversion to determine the root note of the melody. Save your project.
In case you are lost, there is a completed version of this section of the tutorial on the companion CD located in Unit I\Chapter 5\Tutorial Part One.
Variations Now that we have a basic piece of music playing over some chord changes, we want to add some variation to the playback so that each time our bass line plays, it plays a little bit differently. Open the BassSegment.sgt that we created earlier, and open the Pattern Track. Notice the variation bar. By default, DirectMusic Producer fills in all of these variations with the same data.
Figure 5-9: The variation bar. Select only the variation to work on. Notice in the above figure that we selected all variations because every number in the bar is depressed. You can quickly select or unselect all variations by clicking the vertical bar on the leftmost edge of the variation bar. Click this bar so that every variation number is not depressed, then select variation 2. Your variation bar should look like this:
Figure 5-10: Only variation 2 is selected. In the pattern, raise the note on the second beat to an E, and lower the last note to an E. Your pattern should look like this:
Figure 5-11: Our first bass line variation. If we played the project now, we would only hear this variation one out of 32 times, since DirectMusic fills out all 32 variations for us by default. To have the variation play more often, disable all variations except for variations 1 and 2 by deselecting variation 2 and selecting variations 3 and higher. Right-click on the Variation tab, and select Disable. Your Variation tab should now look like this:
Figure 5-12: Variations 1 and 2 are enabled, while all others are disabled.
Play the master Segment, and notice how both our original bass line and our variation play. Again, we saved a completed version of this section of the tutorial on the companion CD in the folder Unit I\Chapter 5\Tutorial Part Two.
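The arithmetic behind disabling variations can be sketched in a few lines of Python. This assumes the engine picks uniformly at random among the enabled variations, which matches the "one out of 32" figure above but ignores any other selection modes; pick_variation and chance_of are hypothetical helper names.

```python
import random


def pick_variation(enabled, rng=random):
    """Pick one enabled variation number uniformly at random (assumed behavior)."""
    return rng.choice(sorted(enabled))


def chance_of(variation, enabled):
    """Probability of hearing a given variation on any one play-through."""
    return 1 / len(enabled) if variation in enabled else 0.0


all_enabled = set(range(1, 33))   # default: variations 1 through 32
only_two = {1, 2}                 # after disabling variations 3 and higher

print(chance_of(2, all_enabled))  # 1 in 32
print(chance_of(2, only_two))     # 1 in 2
print(pick_variation(only_two))
```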
Styles In the previous two sections of this chapter, we stored our musical data in Pattern Tracks. Pattern Tracks have a lot of power, as they can follow chord changes in multiple ways and contain up to 32 variations. However, even more powerful than patterns are Styles. As defined in Chapter 4, a Style is a collection of patterns, which DirectMusic chooses based on various parameters. Each pattern within a Style acts just like a Pattern Track, containing chord mappings and variations. Moreover, because you can place multiple patterns within a Style, they allow you an unlimited number of variations. More importantly, they respond to the groove level. While groove level can be used in many ways, it is easiest to think of it as an intensity filter. Each pattern within a Style has a groove range in which it is valid. Our melodic line might be simpler at groove levels one through ten, while a more active one is used from 11 to 20. When the Style is choosing the pattern to play, it looks at the current groove level and determines which patterns are valid within the Style. A typical game implementation involves tying groove level to the intensity of the current game situation. If there is a lot of action going on, the groove level rises and the music intensifies, and when the action dies down, the groove level falls and the music calms down.
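The "intensity filter" idea can be illustrated with a short Python sketch. It is only a model of the behavior described above, not DirectMusic code: Pattern, valid_patterns, and choose_pattern are hypothetical names, and returning None when no pattern matches is an assumption made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    groove_low: int
    groove_high: int


def valid_patterns(patterns, groove_level):
    """Return the patterns whose groove range contains the current level."""
    return [p for p in patterns if p.groove_low <= groove_level <= p.groove_high]


def choose_pattern(patterns, groove_level, rng=random):
    """Pick one valid pattern at random, or None if none is valid (assumption)."""
    candidates = valid_patterns(patterns, groove_level)
    return rng.choice(candidates) if candidates else None


style = [
    Pattern("simple melody", 1, 10),
    Pattern("active melody", 11, 20),
]

for level in (5, 15):
    print(level, [p.name for p in valid_patterns(style, level)])
```

A game would only need to supply the groove level (for example, derived from how much action is on screen); which patterns respond to it stays entirely in the composer's hands.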
Creating Our First Style Create a new Style by clicking File>New and selecting Style from the menu. In the left-hand menu, open the folder in the Style named Patterns. Double-click on the pattern to open it. Using the same techniques we used in the first tutorial in this chapter, copy the data from our melody pattern to this Pattern Track. You will need to set the pchannel for this part to two, as well. Now select the pattern in the left-hand menu, and press Ctrl+C to copy it into memory and Ctrl+V to paste it. Your Style should now have two patterns called pattern one and pattern two. Open pattern two, and adjust the notes to make the melody into more of an accompaniment Style line. You can use the right-click menu on the mouse to add new notes if you need to. Our accompaniment looks like this:
Figure 5-13: Our accompaniment pattern. Rename the Style Piano. Open our MasterSegment file, and use the right-click menu to delete our current melody Segment Trigger Track. Use the right-click menu to add a Style Track. Right-click on the first bar in the Style Track, and select Insert. Select our Piano Style from the menu.
Press Play on the master Segment and notice that both patterns are chosen from our Style. Also of note is that we do not have to place references to our Style in every measure of the master Segment. This is because a Style Track is assumed to repeat, unlike our Segments, which have a fixed length.
Adding Groove Level Go back to our Style, and bring up the properties on pattern two. Notice that there is a Low and High setting under Groove Range. These set the groove levels at which this pattern is valid. Set these numbers to 1 and 10, respectively. This makes the pattern valid only from groove levels one to ten. Now select pattern one, and set its groove range from five to 100. Go back to our master Segment, and use the right-click menu to add a new track. This time, select Groove Track from the menu. Select the first bar in the Groove Track and press Insert. Play the master Segment, and notice that only pattern two is used. This is because our groove level is set to 1, which is not a valid groove level for pattern one. Now set the groove level to 6 in our Groove Track, and play the master Segment. You will notice that both patterns now play. This is because both patterns are valid, so DirectMusic chooses between them every measure. Set the groove level to 11 in our Groove Track. You will notice that only pattern one plays, because pattern two's groove range tops out at ten. Save your project. The wonderful thing about using groove levels to control your music is that they are very easy to work with and test on the musical end and provide a simple mapping for a game to use that doesn't require your programmers to understand how your music is composed. Often, groove level can be tied to something very simple for a game engine to understand, such as the number of monsters in the area, and the composer can decide what the music should do in these cases. Once again, we saved a completed version of this section of the tutorial on the companion CD in the folder Unit I\Chapter 5\Tutorial Part Three. This chapter offered some insight into the basic mechanics of using DirectMusic Producer. While DirectMusic Producer can seem daunting at first, once you become familiar with the interface, you will find it efficient and well designed.
Remember to start small and be prepared to build several projects as a learning experience.
Chapter 6: Working with Chord Tracks and ChordMaps
Overview Scott Morgan One of the most powerful and revolutionary features of DirectMusic is its ability to handle chord progressions and scales. We first discussed the power of chords in DirectMusic in Chapter 4 and took a brief look at how to harness some of that power in Chapter 5. As you know, audio producers can apply chord changes to their music by entering chords into a Chord Track. Beyond this, they can create ChordMaps that define different paths that chord progressions can follow. Since chords, ChordMaps, and chord progressions are undoubtedly some of the most complex and daunting features, we have devoted an entire chapter to the topic. Be prepared; it is going to be a bumpy but satisfying ride. In the end, you will be able to do some incredible musical feats with chords in DirectMusic! This chapter offers a thorough demonstration of DirectMusic's chordal and scalar abilities. For another example of connecting chords, open the file ChordMapTutorial.pro located in the Unit I\Chapter 6 folder on the companion CD. Oh, and it is assumed the reader has an intermediate understanding of chord and scale theory.
Chord Tracks This section demonstrates designing music that utilizes nonvariable Chord Tracks. Two Segments are created using the same pattern, but each will have their own chords and scales. First, create a pattern: 1. Create a Style called SimpleStyle. 2. Enter the following notes into the Style's only pattern:
Figure 6-1: This is a C major Alberti Bass figure (arpeggio) followed by a sixteenth note scale run.
3. Create an eight-bar Segment called MajorChordTrack.
4. Add a Chord Track (not a ChordMap Track) and a Style Track to it.
5. Select SimpleStyle in the tree view and drag it into bar one beat one of the Style Track.
If you play the Segment, you will hear the pattern that you just entered play eight times. Since there are no chords in the Chord Track, the pattern repeatedly plays over a C chord. Make things more interesting by adding some additional harmonies.
1. Select bar one beat one in the Chord Track and hit the Insert key (or right-click and select Insert). A new C chord appears.
2. Do the same thing to bar two beat one.
3. Click the up arrow next to 2C five times to make the chord an F chord.
Next, we want a ii chord. Add a C major chord to bar three and transpose it to D. To make it minor, we need to edit the chord. Start by changing the label from M to min. The chord type labels have no effect on playback; they are there for you to customize. If you prefer another way of labeling chords, feel free to use that. When we edit the notes on the keyboard display in the Chord Properties window, we only want to edit the notes in the row labeled 1 on the left-hand side. Later we will go into what the four chord levels are for. We can edit the chord tones and the underlying scale. To edit the third of the chord, click the F# in the chord section. This deletes that chord tone. Click the F key to give us a D minor chord. It is okay to leave the chord with only two tones, but in this case, we want D minor. Here is what it should look like when you are finished:
Figure 6-2: Chord Properties dialog box. You can generally create a duplicate of anything in DirectMusic Producer by Ctrl+dragging it. Ctrl+drag the F chord to bar four. Now transpose the F chord to G. The Chord Editor page is simply the properties page, so hit F11 to bring it up again or right-click the chord and select Properties. You can copy multiple chords simultaneously as well. Click the C chord in bar one and Shift+click the G major chord. Ctrl+drag those four chords to bar five. Standard cut, copy, and paste works also. Lastly, we want an A minor chord in bar five instead of C major, so select that chord and delete it. We could have edited the C major chord to make it A minor, but it takes fewer steps to copy and transpose another already existing minor chord. Ctrl+drag the D minor chord into bar five and transpose it down to A minor. The finished Segment should look like this:
Figure 6-3: Play the Segment to hear the pattern cycle through all the chords. Next, we create a minor key version of the same chord progression. This is going to require more work, since we need to adjust the scales in addition to the chords.
1. Create a new eight-bar Segment and name it MinorChordTrack.
2. Add a Style and Chord Track to MinorChordTrack.
3. Drag SimpleStyle into the Style Track.
Before we add any chords, we must adjust the key of the Segment. Right now, any chord that we create has the underlying scale of C major because our key is set to C major, and we do not want to have to edit the whole scale on every chord. If you pull out the dividing line between the tracks and their labels, you can see that the Chord Track is labeled like this: Chord (Key=C/0#s). If you bring up the properties window for the Chord Track, you can change the key. Do this by left-clicking the Chord Track label and hitting F11 if the property page is not already open. Select the Key tab. We are going to use C minor, which has three flats, so the first thing we need to do is click the Flats radio button. Next, raise the key signature value to three flats. Now we can start entering the chords. Insert a chord into bar one beat one. This chord will be a perfect C minor chord with the wrong label. Change the label to min. Notice the underlying scale is C minor instead of C major. Copy the C minor chord to bar two and transpose it to F minor. For bar three we want a ii7b5 (half-diminished) chord. Start with a D minor chord and change the A in the chord to an Ab. You can add a C above the Ab to make it truly a min7b5 chord, but for our purposes, it will not matter. Label that one Dmin7b5 or whatever else you like. Bar four will be G major. Use your keyboard's Insert key to add the chord this time. Transpose it up to G, and change the third to B natural.
We are still not done with the G chord. The chord is correct, but the scale is still not. Notice when you changed the Bb to B natural that the B natural in the scale side turned blue. This is DirectMusic's way of warning you that there is a chord tone that does not have a matching scale degree. You should always include the chord tones in your scale, or DirectMusic may get confused about how the scale tones relate to the chord tones. Click the blue B natural to add it to the scale. It should turn red like the other tones. Lastly, click the Bb to remove it. The scale is now C harmonic minor. Copy the first four bars into the last four, like we did with the major Segment. Transpose the C minor chord in bar five down to Ab, and make it major instead of minor. This is the final chord progression: | Cmin | Fmin | Dmin7b5 | GM | Abmaj | Fmin | Dmin7b5 | GM | In this section, we used DirectMusic chords to take a simple, one-measure pattern and created two different chord progressions with it. This can be a quick way to write music if all that is required is to move the music through different chord progressions. This section serves as the foundation for the next section on variable chord progressions.
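The blue-note warning described above amounts to checking that every chord tone also appears in the underlying scale. The following Python sketch shows that check for the G major chord over C natural minor and C harmonic minor; the function name chord_tones_missing_from_scale is hypothetical, and this is an illustration of the rule, not of how DirectMusic Producer implements it.

```python
# Pitch classes for the note names used in this section.
NOTE_TO_PC = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}


def chord_tones_missing_from_scale(chord, scale):
    """Return the chord tones that have no matching scale degree."""
    scale_pcs = {NOTE_TO_PC[name] for name in scale}
    return [name for name in chord if NOTE_TO_PC[name] not in scale_pcs]


c_natural_minor = ["C", "D", "Eb", "F", "G", "Ab", "Bb"]
c_harmonic_minor = ["C", "D", "Eb", "F", "G", "Ab", "B"]
g_major = ["G", "B", "D"]

print(chord_tones_missing_from_scale(g_major, c_natural_minor))   # B is flagged
print(chord_tones_missing_from_scale(g_major, c_harmonic_minor))  # nothing flagged
```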
ChordMaps ChordMaps are sets of potential chords or chord paths for chord progressions to follow. Introducing a ChordMap into a Segment allows DirectMusic to generate chord progressions on the fly, adding a new element of variability to DirectMusic songs. DirectMusic pieces the chord progression together within the boundaries provided by the content creator's ChordMap parameters. This section includes examples of variable chord progressions and instructions on how to use them. Create a ChordMap called SignPostsOnly by using Ctrl+N and selecting ChordMap. If you have seen ChordMaps before, you may have noticed the charts with lots of little connecting lines going everywhere. The truth is, you do not need to put anything into that area of the ChordMap to get a variable chord progression. The most vital thing in a ChordMap is the signpost. Signposts are groups of chords from which DirectMusic chooses while generating a chord progression from the ChordMap. The signpost list is located on the right side of the ChordMap editor (see Figure 6-4). Under SP (SignPost), you'll see the word . Click on that, and a new chord will be created. You can edit the chord from there. If you check a box under the number 1 and next to your chord, your signpost chord will become a member of signpost group one. When you start working in your Segment, the signposts are the main way that you control how the ChordMap is used within the Segment. You will be able to specify which signpost group you want the engine to choose from at that given time.
Figure 6-4: The signpost list. There is an easier method to employ in the creation of chords here. The chord on the left side of the screen can be dragged into the signpost list. The chords on the left have nothing to do with the way the ChordMap sounds. That is just a palette for you to grab chords from. Just like a painter's palette, you can change and customize the palette. All the chords on the palette are editable. To start with, they are all major chords, but if you look up next to DirectMusic Producer's main View drop-down menu, you can see a ChordMap drop-down menu. From there, you can change the palette to all minor, major, major seventh, dominant seventh, minor seventh, or chords based on the underlying scale of the ChordMap. To show flats instead of sharps in any of the ChordMap's various components (in this case, the palette) click on the ChordMap in the tree view and go to the Chord Properties window. There is a check box that lets you specify flats along with the key of the ChordMap. It is a good idea to set this first, since changing the key will transpose all chords in your ChordMap. For this example, add these chords to the signpost list: C, F, Dmin, G7, Ab7, and Db7. Once you've got the chords entered, check group one for C, group two for Dmin, F, and Ab7, and group three for G7 and Db7. If you are a theory buff, group one is tonic, group two is subdominant (or a substitution), and group three is dominant (or a substitution). Figure 6-4 shows what the signpost list should look like.
Now we can make a Segment to use the ChordMap. Create an eight-measure Segment and name it SignPostOnly. You will need a ChordMap Track, a Signpost Track, and a Style Track. To create the ChordMap and Signpost Tracks, use Ctrl+Ins or right-click and select Add Tracks. Then select the appropriate track type to create. Insert SimpleStyle into the Style Track and your new ChordMap into the ChordMap Track (bar one beat one). Click bar one beat one of your Signpost Track to insert a signpost. Choose group one. Fill out the rest of the Segment so the signposts follow this pattern: | 1 | 2 | 3 | 1 | 2 | | 3 | | (also shown in Figure 6-5)
Figure 6-5: Signpost demonstration. Now we can generate the chord progression from the ChordMap. This is done by clicking the dice icon in the Signpost Track. When you do so, a ChordMap Track appears and is filled in with chords. Notice that for all bars where groups two and three are entered, the same chords are used. If you click Compose again, it selects new chords for groups two and three again and uses only those chords instead of the previously generated chords. Group one only has one chord; that's C major. If you do not want to have to click the dice all the time, you can call up the signpost track's properties page (select the blue track label) and go to the Flags tab. Check both Recompose on Play and Recompose on Loop. The Recompose on Play flag automatically selects new chords every time the piece plays. The Recompose on Loop flag automatically selects new chords every time the section loops. Here is what the Segment should look like. Note that the chords will most likely be different: You are probably itching to play with all the neat graphs, so let's do that next! You can define different paths for the chords to take when moving from one signpost to another. Do this by mapping out chords between the different signpost chords and drawing lines from chord to chord. You can experiment with the map that we just created by dragging all the chords from group two to bar one and all the chords from group three to bar three. Right-click each of the chords in measure one and make them beginning signpost chords by selecting Toggle beginning signpost from the pop-up menu that appears. Beginning signposts are signified by a green arrow. Similarly, right-click all the chords in measure three and make them all ending signpost chords. Ending signposts are signified by a red circle. Put whatever chords you want into bar two, and start drawing lines from one chord to another. The lines are drawn from the lowest gray box on the end of the chord. 
An empty box is always on the bottom since a new one is added every time you create a new connection path.
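The signpost-only composition described earlier (one chord chosen per signpost group, then reused wherever that group appears) can be modeled with a short Python sketch. The data mirrors the SignPostsOnly example; the function compose and the use of None for empty bars are assumptions made for illustration, not the engine's real interface.

```python
import random

# Signpost groups from the SignPostsOnly ChordMap.
SIGNPOST_GROUPS = {
    1: ["C"],
    2: ["Dmin", "F", "Ab7"],
    3: ["G7", "Db7"],
}

# Signpost Track for the eight-bar Segment: | 1 | 2 | 3 | 1 | 2 |   | 3 |   |
SIGNPOST_TRACK = [1, 2, 3, 1, 2, None, 3, None]


def compose(track, groups, rng=random):
    """Pick one chord per group, then place it at every bar using that group."""
    chosen = {group: rng.choice(chords) for group, chords in groups.items()}
    return [chosen[group] if group is not None else None for group in track]


# Each call plays the role of one click on the dice icon.
for _ in range(3):
    print(compose(SIGNPOST_TRACK, SIGNPOST_GROUPS))
```

Bars left as None are where connecting chords drawn in the ChordMap would be placed if there is room.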
Figure 6-6: Connecting chords. If you go back and recompose (i.e., regenerate) the Segment above, bar six should have one of the connecting chords that you just added. This happens because once connections between two signposts are drawn, DirectMusic will always use the connection lines if there is room in the Segment. The area between bars five and seven is the only place where there is a move from signpost two to signpost three, and there is a measure open for the connecting chord. If you want the chord progression generated by the ChordMap to be able to choose randomly between going directly from signpost to signpost and using the connecting chords, draw a line that connects the two signposts directly. Otherwise, DirectMusic will always use the connecting chords if there is room. To delete a connection, click on the line and press the Delete key. For another example of connecting chords, open ChordMapTutorial.pro on the companion CD. Check out the bookmark Connecting Chords. The ChordMap only uses one signpost chord that is placed into groups one and two. The map (see Figure 6-7) is meant for a Segment that is five measures in length with signpost one on bar one and signpost two on bar five. The connecting chords play on bars two, three, and four. The strategy here is that bar five is never actually played. The Segment is set up so that it loops after bar four. Bar five is only there to hold the target signpost so the connecting chords can be composed. This is a good strategy for looping Segments that need to transition back to bar one. You can't compose connecting chords back to bar one, so you have to make a duplicate that never plays.
Figure 6-7: Connecting chords example 2. The ChordMap in the example above has to be played in a five-measure Segment that is in 4/4 time to get the connecting chords to play properly. If you select the variable editing option in the ChordMap Properties dialog, the chords can now occur at different time intervals than they are laid out in the ChordMap. To change the time intervals that a chord can play, bring up the properties page for the connection line. Here you can adjust minimum and maximum lengths and even the weight — the probability that that connection will be selected.
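As a rough illustration of connection weights, the sketch below treats each outgoing connection's weight as a relative probability. How DirectMusic actually scales weights is not stated here, so the normalization, the chord names, and the helpers selection_probabilities and choose_connection are all assumptions.

```python
import random


def selection_probabilities(connections):
    """Convert (target chord, weight) pairs into probabilities that sum to 1."""
    total = sum(weight for _, weight in connections)
    return {target: weight / total for target, weight in connections}


def choose_connection(connections, rng=random):
    """Pick the next chord, favoring connections with larger weights."""
    targets = [target for target, _ in connections]
    weights = [weight for _, weight in connections]
    return rng.choices(targets, weights=weights, k=1)[0]


# Hypothetical connections leaving one chord in a ChordMap.
outgoing = [("Dmin7", 3), ("G7", 1)]

print(selection_probabilities(outgoing))
print(choose_connection(outgoing))
```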
One other ChordMap feature is the cadence. This could be useful in a jazz situation where you just want to put in the keys and have the ii-V-Is come in automatically. Cadences are good for that, but the chords do not have to be standard classical cadences. Your cadence could be bVI, bV, i. Any signpost can have an associated cadence. The two slots to the left of the signpost in the signpost list labeled C1 and C2 are for cadence chords (see Figure 6-8). Drag the chord or chords you want to use into those slots. Then when you want to cadence to a signpost, check the Precede Signpost with Cadence check box in the properties page for signposts in your Segment's signpost track. There is a Cadences bookmark in the accompanying project that gives an example of this.
Figure 6-8: Cadences.
Chords for Composition One thing has probably been bothering you: "Do I have to write everything in C major?" The answer is no, you do not. There is a feature called chords for composition that defines the context of your music. This is important since DirectMusic usually remembers notes as a combination of chord and scale degrees and not as actual MIDI notes the way a traditional sequencer would. If you look at any pattern in a Style, you will see a Chord Track along the top. This defines the context of the pattern's notes. All the notes in the pattern are compared to the notes in the chord above them. If you change the chord to Gb minor, the notes will be interpreted in the context of Gb minor. You could even write in Lydian b7 and have that interpreted correctly by the DirectMusic engine. Look at the bookmark ChordForComposition. The chord for composition for the pattern is E7#11, and the music uses the notes in that scale and chord. In this context, an A# is a natural four because A# is the fourth scale degree of the underlying scale of the chord for composition. If this pattern is played over a C major chord, the A# will be played as an F, since F is the natural four in C major. Play the Segment above the pattern to hear the unusual E Lydian b7 scale be converted perfectly to C major, C minor, and C Lydian b7. The last chord is the same as the chord for composition, E7#11. Naturally, the output is the same as the notes that were originally entered into the pattern with that chord. You are not required to stick with one chord for composition. You can treat the Chords for Composition Track as a regular Chord Track and enter as many chords as you want. Whatever notes are in the pattern below the Chord Track are interpreted in the context of their corresponding chords. Check out the MultipleChordForComposition bookmark for an example. The pattern has a C major arpeggio followed by a G major arpeggio.
The chords for composition are C major and G major, respectively, but the Segment that plays the pattern has only one C major chord. As a result, the G chord is played as a C major chord also.
Keep in mind that when you run-time save your files, all chords for composition information is lost, since it is only for the author's benefit. DirectMusic remembers notes in terms of functionality, not MIDI notes (in most cases; we go over the other cases later). We have had to recover lost files from the run-time version before and restore our chords for composition information. It can be scary to open your project and hear all the work you did with lots of strange harmonies played in a C major context.
The problem above is easy enough to address. If you right-click the Chords for Composition Track and change the Keep Constant option from MIDI Notes (M) to Functionality (F), you can enter your original chord for composition and everything will be restored immediately. Make sure you change the setting back to MIDI Notes, though. Leaving the setting on Functionality can ruin a project if you accidentally change the Chords for Composition Track or pattern key, because all your music gets moved around inappropriately.
If you keep the MIDI Notes setting on, the notes in the pattern will never change visibly. If you change a chord for composition, DirectMusic reevaluates the notes in the new context. If you have Keep Functionality selected and you change a chord for composition, all the notes will move to keep the same functionality with the new chord that they had with the old chord. For instance, if you have a C major chord for composition with an E in the pattern, the note is remembered as the third. If you change the chord for composition to G major with Keep Functionality selected, the E will transpose to a B, since B is the third of G major. If you have MIDI Notes selected, the note will stay E, but it will now be remembered as a sixth, since E is a sixth in G.
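The two Keep Constant settings can be sketched in a few lines of Python. This is only an illustrative model, not DirectMusic's implementation: the scale tables and the helper names degree_of and note_at are assumptions made for the example, and notes are reduced to plain scale degrees with no chord position or chromatic offset.

```python
# Illustrative model only: the helper names and scale tables are hypothetical,
# not part of DirectMusic or DirectMusic Producer.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]
G_MAJOR = ["G", "A", "B", "C", "D", "E", "F#"]
E_LYDIAN_B7 = ["E", "F#", "G#", "A#", "B", "C#", "D"]  # scale under E7#11


def degree_of(note, scale):
    """Return the 1-based scale degree of a note within a scale."""
    return scale.index(note) + 1


def note_at(degree, scale):
    """Return the note sitting on a 1-based scale degree."""
    return scale[degree - 1]


# A# written over E7#11 is a natural four, so over C major it plays as F.
print(note_at(degree_of("A#", E_LYDIAN_B7), C_MAJOR))

# Keep Constant = Functionality: E is the third of C major, so changing the
# chord for composition to G major moves the note to the third of G major.
print(note_at(degree_of("E", C_MAJOR), G_MAJOR))

# Keep Constant = MIDI Notes: the note stays E, but its function is
# reevaluated as a degree of G major.
print("E is now degree", degree_of("E", G_MAJOR))
```

The same lookup explains why a run-time file opened without its chords for composition sounds wrong: the stored degrees are still there, but they are read against the wrong scale.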
There is a problem using Pattern Tracks, as opposed to patterns in a Style. The problem comes from the fact that the Segment's Chord Track functions as a Chords for Composition Track for the Pattern Track. This can be very confusing, since the notes will be shown on screen according to whatever chord is in the track at the time you open the Segment. If you hit Play and the Segment recomposes, the notes already displayed do not appear to change. When you begin editing notes, remember to look at your Chord Track first. Make sure that you are thinking about that chord. If you recompose, the context will change again. It is probably better to turn off recomposing while editing a Pattern Track. Even better, put the Pattern Track in a separate secondary Segment; that way, you always know that the chords for composition will stay the same.
PlayModes
The chord for composition sets the context of our pattern, but there are other features that we can use to direct the DirectMusic engine on how our patterns should respond to chords. Some of these features are playmodes, note properties, chord levels, pattern chord rhythms, and variation choices. We go through them one by one, learning what each does and how to use them.
Let's start with using playmodes. Open up the pattern in the SimpleStyle Style. Open the properties page for the part. On the lower right is a section labeled Default Play Mode. The first drop-down box is set to Chord/Scale. When the pattern is played with Chord/Scale selected, both the chord position and scale degrees are considered when deciding what note is played over a certain chord. The chord position is which chord tone we are talking about. For instance, in a C major chord, C is chord position one, E is chord position two, and G is chord position three. If it were C7, Bb would be chord position four.
In the sample project, visit the Playmode bookmark for a demonstration of the Chord/Scale playmode. Notice that the Segment has a chord that is not a traditional triad. It is D2, A2, and D2. When SimpleStyle is played with this chord, the notes are stretched out to fit the chord because the Chord/Scale playmode specifies that we take the chord into account as well as the scale. The Sequence Track's purpose in this Segment is to show you the notes that play when the Style's pattern is played through the playmode Segment. The notes are a doubling of the Style's output an octave lower. Sequence Tracks ignore chords and always play the same thing.
Back to the topic; let's look at how the engine plays the scale section of the pattern (starting on beat three) when played through the playmode Segment. If you do not have the project handy, refer back to Figure 6-1 for SimpleStyle's notes. The first two notes are what one would expect, but from there it looks confusing.
You probably expect to hear D, E, F, G, A, B, C, D since the pattern has a straight scale, but because DirectMusic has been told to take chord position and scale into account, it does things differently. It plays D, E, A, B, D, E, C, D. The A is played as the third note because A is the second chord tone in the chord D-A-D. In C major, E was the second chord tone. So when you are in C major and you write an E, DirectMusic remembers that as the second chord tone, not as the third scale degree. Thus, you hear A because with D-A-D, the second chord tone is A. In C major, F is one scale degree above the second chord tone, so over D-A-D, B is one degree above the second chord tone. This also explains the D and E after that. The D in the next octave is the third chord tone, like G was in C major, and E is one above that, just like A was in C major. So why are the last two notes lower? We have run out of chord tones and are going to the next octave. These two notes are actually what we expected; they just seem strange since they are lower than the two before them.
The second measure plays, ignoring the chord positions because the default playmode of the Scale Style is set to the scale playmode instead of the chord/scale playmode. The scale mode obviously only takes scale position into account. The pattern starts on the root of the chord and plays the same scale degrees as the original pattern.
Here is a summary of the way pitches are retained when the chord/scale playmode is specified in a pattern. Pitches have three components in the chord/scale playmode: chord position + scale offset + chromatic offset. So a D# in C major is what in our D-A-D chord? Well, D# is the second chord position (E) lowered by one chromatic tone in C major. In D-A-D, A is the second chord tone. G# is one half-step lower than that, so the answer is G#. In this case, there was no scale offset.
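The three-component bookkeeping described above can be worked through in a short Python sketch. It is a simplified model rather than the engine's real algorithm: the function names, the MIDI note numbers, and the choice of the white-key scale under the D-A-D chord are assumptions made for illustration (the third chord tone is taken to be the D an octave above the root, as the walkthrough implies). It covers only the first six notes of the run and upward scale offsets, leaving out the wrap into the next octave.

```python
# Hypothetical sketch of the chord/scale bookkeeping; not DirectMusic code.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Pitch classes of the white-key scale (C major, or D Dorian over D-A-D).
WHITE_KEYS = {0, 2, 4, 5, 7, 9, 11}

# Chord tones as MIDI note numbers, lowest first (values chosen for illustration).
C_MAJOR_CHORD = [48, 52, 55]   # C, E, G
D_A_D_CHORD = [38, 45, 50]     # D, A, and the D an octave higher


def realize(chord, scale, chord_pos, scale_offset=0, chromatic_offset=0):
    """Turn (chord position, scale offset, chromatic offset) into a MIDI pitch."""
    pitch = chord[chord_pos - 1]          # start on the chord tone
    for _ in range(scale_offset):         # climb the scale one degree at a time
        pitch += 1
        while pitch % 12 not in scale:
            pitch += 1
    return pitch + chromatic_offset       # finally apply the half-step tweak


def name(pitch):
    """Return the note name of a MIDI pitch, ignoring the octave."""
    return NOTE_NAMES[pitch % 12]


# First six notes of the scale run as remembered over C major: C D E F G A.
RUN = [(1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0), (3, 0, 0), (3, 1, 0)]

print([name(realize(C_MAJOR_CHORD, WHITE_KEYS, *n)) for n in RUN])
print([name(realize(D_A_D_CHORD, WHITE_KEYS, *n)) for n in RUN])

# D# in C major is chord position two (E) lowered by one half step.
print(name(realize(D_A_D_CHORD, WHITE_KEYS, 2, 0, -1)))
```

Played against the C major chord, the stored triples give back the original notes; played against D-A-D, the same triples land on the stretched-out pitches.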
The difference between the chord playmode and the chord/scale playmode is in how these playmodes cause the engine to handle sevenths, or the fourth chord tone. If chord playmode is specified and the chord given in the Chord Track does not contain a seventh or fourth chord tone, that tone will be omitted during playback. This could be a good thing if you are trying to avoid sevenths when they are not specified. On the other hand, the chord playmode could have a negative impact if you have melodic lines that depend on sevenths.
PedalPoint is another playmode that will ignore the chord's root and round to the nearest scale note. Unless you really want a true pedal point in the sense of one note, this may not work harmonically with what you are doing.
PedalPointChord is a really good option for melodic lines, since it shifts notes to the nearest chord tone. That doesn't mean that every note will be a chord tone, though. It means the generated note will have the same offset from a chord tone as the original note in the pattern. For instance, a C in a C major pattern (chord pos 1) shifts up to a B in a G major chord (chord pos 2), and a D in a C major pattern (chord pos 1+1 scale degree) will become a C in a G major pattern (chord pos 2+1 scale degree). In English that means if you write a chord tone, it will always play a chord tone. If you write a non-chord tone, it will play another non-chord tone. Listen to the flute part in the PedalPointChord Segment in the ChordMap tutorial for an example.
All of this information about chord position and scale-offset information can be quite confusing at first, but it is worth working through in order to take full advantage of DirectMusic.
Note Tweaking
Another way to use the pedal point chord playmode is with secondary Segments or motifs. If you play a really long note with the Don't Cut Off and Regenerate on Chord Change flags specified in the note's properties page, some cool harmonies can result. In the tutorial project, visit the Regeneration bookmark. Play the Segment featuring the Chord Track using the main Play button, and then play the other Segment with the secondary Segment toolbar. The notes continue past the end of the Segment, adjusting to every new chord that comes along! In this case, the pedal point chord playmode setting is not specified in the part but on the notes themselves.
There are other things that you can do to individual notes using their properties pages. You can tell the engine to harmonize a note to a chord that has not arrived by tweaking the map to chord value. This is great for pickups. You can avoid having one note randomly transpose an octave because it hit a range limit. Do this by setting groups of notes to transpose together with the Override Inv. Group feature. You can tweak the scale and chromatic offsets that we discussed earlier in the Note Properties box. You can tell a note when it is appropriate for it to cut off. For example, you can tell DirectMusic to cut off the note if it is not in the new chord's scale or chord tone list.
ChordLevels and Inversions
You can specify what kinds of chord tones pattern parts are supposed to play. This is done with ChordLevels or Inversions. Earlier you were told to always edit chords in the bottom row labeled 1. The bottom level consists of chords that the various instrument parts within a pattern respond to, but you can define deeper chord levels. This is especially useful for more complex harmonies, such as those in jazz. If you have a G13b9 chord, you probably don't want the extensions to be played by the lower instruments. The strategy to use is to place the basic chord tones in the first level and the upper extensions into the upper chord levels.
Open the ChordLevel bookmark to see an example. The first three chords are simple and have the same note for each level. When the more complex chords start in bar four, only the trumpets play the dissonant tones. This is because in the pattern the trumpet part is set to chord level three via its properties page under Default Play Mode. All the extension notes are placed in level three in the chords (see Figure 6-9). That way, the trombones and bass play the fundamental chord level one tones, and the trumpets play the extended level three tones.
Figure 6-9: Chord levels.
You could also define for DirectMusic whether specific inversions are allowed through the chords. The Inversions bookmark illustrates this. The first set of chords has different inversions defined for the second chord level, but the second set does not. The Style's part is set to play on level two. You hear the first four bars inverted, but the last four are not, since all the chord levels are in root position in the second half of the Segment.
One last thing that could be done with this feature is to implement a polychordal texture, as bookmark 21 of the Demo8 project that comes with Producer demonstrates. The Demo8 project is a demo that, in our opinion, is the best way to learn the basics of DirectMusic. The project has a series of bookmarks that lead you through an excellent tutorial. Find it on the web at: http://msdn.microsoft.com/downloads/default.asp?url=/downloads/topic.asp?url=/msdnfiles/028/000/114/topic.xml
It is also on the DirectX SDK CD. Run this file to install it: E:\DXF\DXSDK\essentls\DMusProd\DemoContent\DMPDemoContent.exe.
Pattern Length and Chord Rhythm
Now let's discuss chord rhythm, pattern length, and how DirectMusic chooses which pattern is appropriate to play a given set of chords. So far, every Style that we have dealt with had only one pattern, but it is often better to have several patterns to add variety. One chord change every measure with the same musical motifs repeatedly can get tiresome. You can compose separate patterns for each level of chord activity.
By default, DirectMusic picks the longest pattern that fits the chord rhythm in the Segment. This default behavior only holds true if you have a Groove Track with a groove level specified. Adding a Groove Track will not hurt your content if you do not plan to use grooves. Just put any number in the first measure.
For an example, look at the bookmark AdvancedTechniques. The Style has two patterns. One is one measure long, and the other is four. What really matters is the chord rhythm. Go to the properties page of the pattern 4Bar. At the bottom is the chord rhythm section. You can click the Set button to change the setting. After adjusting the setting, you can then click the boxes to put an x in every place where there should be a chord change. The four-measure pattern is chosen every time that there is no chord change for four bars because the pattern follows only one chord for four measures. If you put an x at the beginning of every measure, the 4Bar pattern always plays because it is the longest pattern with the correct chord rhythm in all cases.
There are two variations in 4Bar. One variation is a single chord, but the other variation moves every two beats. Patterns with only one chord change are a good way to take some liberty with the chords in your patterns. You can have different chord rhythms in every variation. Faster, syncopated, and less regular chord rhythms are sometimes better written out exactly without being subject to the Chord Track's control.
Chord rhythm variability is very easy with this pattern, since there is only one part and there is no need to do variation locking to keep the harmonies together. You can still build inherent chord progressions into different variations if you have multiple parts. All you need to do is take advantage of locking variations. To get this setup to work, you need to make sure that the ChordMap can play for a longer period of time with only one chord. In this example's ChordMap (shown in Figure 6-10), you will notice a connection that has been drawn from the first signpost directly to the second target signpost. If there were no connecting chords, this would not be necessary, but since there are connection chords, DirectMusic will only take paths that are specified when moving between these two signposts.
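The "longest pattern that fits the chord rhythm" rule can be pictured with a small Python model. This is a guess at the logic for illustration only, not the engine's actual selection code: it assumes that a pattern fits when every chord change the Segment makes during the pattern's span lands on a position marked with an x, it works at whole-measure resolution, and it ignores groove levels. The Pattern class, the function names, and the pattern name OneBar are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical stand-in for a Style pattern and its chord rhythm."""
    name: str
    length: int              # length in measures
    chord_marks: frozenset   # measure offsets marked with an x


def fits(pattern, chord_changes, start):
    """True if every Segment chord change inside the span hits a marked offset."""
    span = range(start, start + pattern.length)
    window = {m - start for m in chord_changes if m in span}
    return window <= pattern.chord_marks


def choose_pattern(patterns, chord_changes, start):
    """Pick the longest pattern whose chord rhythm fits at this measure."""
    candidates = [p for p in patterns if fits(p, chord_changes, start)]
    return max(candidates, key=lambda p: p.length)


one_bar = Pattern("OneBar", 1, frozenset({0}))
four_bar = Pattern("4Bar", 4, frozenset({0}))

# Measures where the Segment's Chord Track changes chord.
changes = {0, 4, 5, 6}

print(choose_pattern([one_bar, four_bar], changes, 0).name)  # one chord for four bars
print(choose_pattern([one_bar, four_bar], changes, 4).name)  # chords change every bar

# With an x at the start of every measure, 4Bar fits everywhere.
busy_four_bar = Pattern("4Bar", 4, frozenset({0, 1, 2, 3}))
print(choose_pattern([one_bar, busy_four_bar], changes, 4).name)
```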
Figure 6-10: Randomly bypassing connecting chords.
More on Variation Choices: The Mother of All Windows
To get to the Variation Choices screen, press the ? button next to the variation numbers in your pattern. Figure 6-11 shows only half of the window. The full window displays a row for all 32 variations in a part. Each row tells what types of chords that variation can play on. Pushed buttons are situations that are legal and unpushed buttons are situations that are forbidden by the content creator. Note that something will always play, so if a chord is selected and there are no variations that are allowed to play over that chord, an illegal variation will be selected at random.
Figure 6-11: Variation Choices window.
Figure 6-11 is the string part's variation choices in the AdvancedTechniques example. All variations besides 1 through 6 are disabled, so they appear almost in black. Variation 1 hangs on the fourth scale degree, so it does not sound very good over major chords because of the half-step difference with the third. As a result, all of the major triads are disabled for that variation. The iv chord is disabled since variation 1 didn't sound pleasing over that chord to the author. Variation 2 hangs on the second scale degree, so it does not sound very good over the diminished chords represented in italics. There are only two diminished chords in the ChordMap, so those are the only two disabled. Variations 3 and 4 sounded good to the author with any chord, so every button is pushed in their rows. Variations 5 and 6 are reserved for the I chord, since they are whole notes bringing resolution.
The other buttons are not used in this example, but they can be very useful for creating more interesting harmonies. Before these other buttons are explained, it must be made clear that the scale degrees represented by the Roman numerals depend on the Chord Track in the Segment. This, however, does not mean that you are limited to standard major and minor scales. A key of F with five flats would result in F Phrygian or F, Gb, Ab, Bb, C, Db, Eb, F. In that case, ii means Gb minor and VII means Eb major.
The Root section tells whether or not variations can be played on chords built off scale degrees that have been raised, lowered, or unchanged. In the F Phrygian example, unchecking S would mean that the variation could not play on a chord based on F, Gb, Ab, Bb, C, Db, or Eb. If # was unchecked, chords based on F#, G, A, B, C#, D, and E would become illegal. In the Type section, "tri" means triads, "6,7" means chords with sixths or sevenths added, and "Com" means complex chords. The Dest area specifies a chord's destination.
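The way the pushed buttons filter variations, including the rule that something always plays, can be modeled roughly as follows. This Python sketch is an assumption-laden illustration and not how the engine stores its choices: the is_legal rules simply restate the string part's settings described above (with "major" standing in for the major triads), and the function names and the fallback over the supplied variation list are hypothetical.

```python
import random


def is_legal(variation, numeral, quality):
    """Restate the string part's Variation Choices as simple rules."""
    if variation == 1:
        # Hangs on the fourth: disabled on major chords and on the iv chord.
        return quality != "major" and numeral != "iv"
    if variation == 2:
        # Hangs on the second: disabled on diminished chords.
        return quality != "diminished"
    if variation in (3, 4):
        return True                      # allowed on every chord
    if variation in (5, 6):
        return numeral == "I"            # reserved for the resolving I chord
    return False                         # variations 7 through 32 are disabled


def choose_variation(variations, numeral, quality, rng):
    """Pick a legal variation; if none is legal, fall back to any of them."""
    legal = [v for v in variations if is_legal(v, numeral, quality)]
    return rng.choice(legal or list(variations))


rng = random.Random(0)
enabled = [1, 2, 3, 4, 5, 6]

print(choose_variation(enabled, "I", "major", rng))        # one of 2 through 6
print(choose_variation(enabled, "vii", "diminished", rng)) # one of 1, 3, 4

# No legal choice exists here, yet a variation is still returned.
print(choose_variation([5, 6], "V", "major", rng))
```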
Before we wrap it up, there is one more trick that you should know about when dealing with ChordMaps. If you ever need to transpose a set of chords, there is a trick to doing it without having to do it chord by chord. Create a dummy ChordMap that you do not plan to use. Copy your chords into that ChordMap, and then change the key of the clipboard ChordMap,
transposing all of the chords that you just pasted in. Select those chords, paste them into the ChordMap that you are actually using, and you have a set of transposed chords. Congratulations! You have just completed an in-depth tutorial on one of the most intimidating and powerful parts of DirectMusic. We wish you luck in becoming comfortable and at ease with these revolutionary concepts. There are so many things to explore in the world of DirectMusic harmony, and there were so few pioneers at the time of publication. We hope that this chapter has given you the knowledge and understanding necessary to dig in and write some great interactive music.
Chapter 7: DirectX Audio Scripting
Scott Selfon
DirectX Audio Scripting is another tool provided to composers and programmers working with DirectMusic. Rather than needing to write low-level code in order to control such things as Segment playback, the composer can build sets of instructions using a more basic language. While this language does not cover the full gamut of behaviors available to the programmer using the C/C++ DirectMusic programming interfaces, it provides the most common behaviors in an easier-to-understand format. The application then triggers the composer's sets of instructions at the appropriate time. This allows more of the music playback behavior to remain in the hands of the composer and in a form where the behavior can be adjusted more easily without having to recompile the application. The script can be updated in DirectMusic Producer, run-time saved, and dropped back into the application, and the new behavior will be used without any additional changes.
Defining the Audio Implementation Problem
While DirectMusic allows a composer or sound designer to create flexible, nonrepetitive audio content, the audio producer must still deal with the process of integrating that audio into the interactive application. Such integration is generally placed in the hands of the programmer. While the programmer no longer needs to be concerned with what exact content a piece of music (a DirectMusic Segment) plays, he must still manage quite a bit of information:
• Dynamic volume and groove level changes for Segments
• Managing AudioPaths that Segments are played onto (unless each Segment contains its own embedded AudioPath)
• Playing Segments at musically appropriate boundaries (generally decided on by the audio producer)
• Playing Segments as primary or secondary Segments
• DLS Collection download/unload management
This can set up a significant workflow challenge: if the composer wants a piece of music to transition at a different boundary, he needs to communicate this to the programmer, wait for the application to be changed and its code rebuilt, and then he can proceed. This often leads to the same kind of slow turnaround process that we were trying to avoid, where very few changes will be able to be made simply because of the time involved with each iteration of the application.
Enter DirectX Audio Scripting. Using a basic, human-readable language, an audio producer can now author scripts, instructions for what behaviors should occur when a certain trigger is reached. Each "trigger" is a script routine, a list of instructions that this trigger should cause to occur: play a piece of music, stop another piece of music, adjust the musical intensity, change the volume, and so on. The programmer, rather than needing to implement these instructions, simply calls these script routines at appropriate points while the application runs.
DirectX Audio Scripting provides a subset of the features of the lower-level C/C++ DirectMusic programming interfaces. That is, anything you can do with scripting can be done in application code if desired. Scripting allows the composer to abstract more of the behavior away from the programmer. For instance, rather than the programmer needing to know that when a player enters a room, the ambience for this room should play, the intensity should be adjusted based on the number of enemies, and the previous ambience should fade out, he
can just send a trigger called EnteredRoom, and the composer's script can give the instructions for what that means in terms of the audio.
The other powerful feature that DirectX Audio Scripting supplies is the ability to pass information back and forth between the application and the music. For instance, the programmer could tell the script what the current score is, and when the programmer triggers the routine called AnnounceScore, the sound designer could stitch together the proper dialog necessary to report the score. Or perhaps the crowd would react with more energy if the script was made aware of the time remaining and how far apart the two teams' scores are. All of this is accomplished by using scripting variables, where data can be passed in both directions; the script can tell the application when an event has occurred in the music, and the application can tell the script what the user is doing and the current state of the application.
But wait: isn't all of this scripting a form of programming? Are we turning the composer into a programmer? In some respects we are, but hopefully it is with a straightforward language that allows the composer to do more of what he wants without having to depend on the programmer. As you'll see below, many of the most powerful functions (namely, playing or stopping a piece of music) are very straightforward. Feel free to cut and paste from these examples, from those in the demo project supplied with DirectMusic Producer, and from the sample scripts found in the DirectX Software Development Kit (SDK).
Creating a Script
The first step to building our script is to create a new one in DirectMusic Producer. As with other file types, you can create a new script via the File menu by selecting New… and then choosing Script in the dialog that opens.
Figure 7-1: An empty script as displayed in the project tree.
The first thing we notice is that a script looks an awful lot like a DirectMusic container file, with separate embed and reference folders. In fact, script files are a specialized kind of container file. A script is able to act on any other file that it knows about, as well as manipulate parameters for the global DirectMusic engine (tempo, groove level, and so on). With files placed in the Embed Runtime folder, the content will actually be saved within the Script file when it is run-time saved, just as it would be in a container's Embed Runtime folder. Meanwhile, any files placed in the Reference Runtime folder are files that the composer intends to save out and deliver for application integration separately but that the script is still aware of and can use. Files can be placed in the two folders by dragging them onto the folders, or right-clicking on one and choosing Add/Remove Files….
Figure 7-2: The Add/Remove Files dialog provides a quick way to add many files to a script's container at once.
We discuss more implications of embedding versus referencing later in the chapter. For now, let's use the Reference Runtime folder, remembering that when it comes time to integrate our content into the application, we need to separately deliver all of the files that the script uses.
One of the nice things about DirectX Audio Scripting is that waves and Segments can be mixed together, just as they can by a programmer. Of course, wave files don't give you much in the way of variability nor do you get any tempo or measure information without building a Segment first. But for some scenarios (for instance, non-varying dialog), basic wave files are quite useful in scripts, as we can see in our first set of examples. We're going to reference a few wave files, named for the dialog they contain: "Hello," "My Name Is," and "Bob" (if you've authored these wave files separately, don't forget to insert them into the project via the File menu's Insert Files into Project option). We also have some background music handy in a pair of Segments called RoomAmbience and ActionMusic.
Figure 7-3: The setup for our upcoming examples. Our two Segments happen to refer to their own DLS Collections, but they could also (or instead) use General MIDI instruments.
Note first off that the wave My Name Is is referenced by the script as MyNameIs, without the spaces. The name a script uses to reference an object is not allowed to contain any spaces, so these were removed. Other restrictions to be aware of are that a script object name cannot begin with a number, and you can't have two waves/Segments with the same name (if you do, DirectMusic Producer will automatically add a number to the end of the name of the second one inserted).
Now let's take a look at the Script Designer window. If it isn't already open, double-click on the script in the project tree.
Figure 7-4: An empty script.
There are three frames in the Script Designer. The Source frame is where we type in the actual routines for this script. When we want to try out these routines, we double-click them over in the Routines frame. This simulates the exact function that the programmer will perform when they "trigger" a script routine. The Variables frame will show any variables that our script uses along with their current values. Again, we can click on these to update them just as the programmer would in the actual application.
Scripting Languages
DirectMusic Producer supports any ActiveX scripting language for creating scripts, but the two most commonly used languages are Visual Basic Scripting Edition and AudioVBScript. AudioVBScript is a DirectMusic-optimized language, specially geared toward music content playback and occupying a very small memory footprint. Alternatively, a composer could use the full Visual Basic Scripting Edition language ("VBScript" for short). While this language occupies more memory and potentially could take more processing power, it is a fully featured language, providing added flexibility and additional features, such as string concatenation (the ability to merge several pieces of text for display or script usage).
Generally, start with AudioVBScript and move up to VBScript if you find that you need additional power. All scripts written in AudioVBScript will work in VBScript, but the opposite is not necessarily true. Of course, don't forget that the programmer can also step in for more complex scripting usage to perform some operations in lower-level DirectMusic code. The language that a specific script uses can be selected from that script's properties page.
Figure 7-5: The Script Properties window.
A full list of differences between the two languages can be found in the DirectMusic Producer help documentation under the topic AudioVBScript Limitations.
Basic Scripting Usage
Let's create our first scripting routine. Click in the Source frame of the Script Designer window. A cursor appears, and we can type away. Routines follow this format:
    Sub [name of routine]
        [things to do when this routine is triggered]
    End Sub
For instance, we're going to create a routine that introduces a new character.
    Sub IntroduceCharacter
    End Sub
If you click anywhere else in the Script Designer window (or in the project tree), notice that IntroduceCharacter pops up in the Routines frame at the top right of the window.
Figure 7-6: IntroduceCharacter in the Routines window.
You could double-click on this routine to trigger it (though of course, it doesn't do anything yet). By the way, if our routines ever disappear from that Routines list, it means that our script has some kind of typographical error in it. We talk about that more when we get to debugging in a bit.
Playing a Piece of Audio
Okay, now let's make our routine do something. The simplest function is probably to play a wave or a DirectMusic Segment. Let's play our Hello wave.
    Sub IntroduceCharacter
        Hello.play
    End Sub
The period (.) is how DirectX Audio Scripting separates the object that we're using (in this case, a wave) from the function that we want to call on it (in this case, playing it). Remember, the script only knows about objects that were dragged into its embed or reference folders; if we tried to play a Segment or wave that wasn't in those folders, we would get a syntax error.
The above routine will play Hello using all default properties; it will play as a primary Segment, and it will play at the default authored boundary (most typically, immediately, but this could be set in the Boundary tab of the Segment's properties page). Again, we could try this routine out by double-clicking on it over in the Routines frame. Now it makes noise!
Making scripts as readable as possible by placing comments helps when you might later need to return to the script to make edits. Any line of a script that begins with an apostrophe (') is effectively ignored when that script is played back and is typically used for, among other things, describing what the routine does.
    Sub IntroduceCharacter
        'We trigger this routine whenever our character meets
        'other characters.
        Hello.play
    End Sub
Let's make this routine do something a bit more complex; a routine that only does a single thing doesn't really save us much over having the programmer implement it. We've got these other two waves that we wanted to string together into a more complete introduction, so let's add them to the script. We wouldn't want to just sequentially call Hello.Play, MyNameIs.Play, and Bob.Play, since that would fire off all three waves at the same time. Worse yet, they would each be played as the primary Segment, which means they would cut each other off.
Instead, we want to use one of the transition boundaries that DirectMusic provides, the ability to queue Segments to play one after the other.
    Sub IntroduceCharacter
        'We trigger this routine whenever our character meets
        'other characters.
        Hello.play
        MyNameIs.play (AtFinish)
        Bob.play (AtFinish)
    End Sub
Our code now introduces one of the several parameters available when you tell a Segment to play. Parameters can be optionally enclosed in parentheses, which often helps for readability. They are required to be in parentheses when they're on the right side of an equation, as we see a bit later on, so it's often easiest to simply always use parentheses.
The first parameter that can be used to modify our play call consists of flags that we can use. The AtFinish flag says that this Segment should wait to play until the previous primary Segment has finished playing. So our routine is now firing off "Hello" and queuing up "my name is" and "Bob" to play afterward.
Note: Be aware that this kind of dialog "stitching" (piecing together dialog fragments to reassemble whole sentences) involves many challenges beyond implementing queued playback in an application. Writing dialog, coaching, and recording voice talent to get appropriate and believable inflections present significant challenges that are beyond the scope of this book. These difficulties are shared by many applications, sports titles that use stitching for their commentators in particular.
You can see the list of parameters for playing a Segment by looking in the DirectMusic Scripting Reference guide, under the topic Segment.Play. When the Script Designer window is open, a shortcut to the guide can be found in the Help menu and is also available via the Ctrl+F2 shortcut. Parameters are separated by commas, though if you don't use all of the parameters, their default settings will be used. Therefore, for these early examples, we will just stick with the first parameter.
For the flags parameter of Segment.Play, multiple flags can be combined simply by "adding" them (using the + sign between them). For instance, we can create a new routine that starts our room ambience when we enter the room. If any music is playing, we want the ambience to sneak in at a measure boundary.
We play it as a secondary Segment so that the background score continues to play. (Recall that only one primary Segment can be playing at a time.) Sub EnterRoom RoomAmbience.Play (IsSecondary + AtMeasure) End Sub Granted, our dialog example has the same issue; we probably don't want our dialog to become the sole primary Segment. Since it becomes a bit more complex to solve this problem for dialog stitching, we come back to that in a moment.
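To see why "adding" flags behaves like combining them, it helps to picture each flag as occupying its own bit. The following C++ sketch is only a model of that idea, not DirectMusic code: the flag names echo the script keywords, but the numeric values are invented for illustration and are not the real constants.

```cpp
#include <cstdint>

// Hypothetical flag values, one bit each. The real DirectMusic constants
// are not reproduced here; only the one-bit-per-flag idea matters.
constexpr std::uint32_t kIsSecondary = 1u << 0;
constexpr std::uint32_t kAtMeasure   = 1u << 1;
constexpr std::uint32_t kAtFinish    = 1u << 2;

// "Adding" two flags, as the script does with the + sign.
constexpr std::uint32_t Combine(std::uint32_t a, std::uint32_t b) {
    return a + b;
}

// True if `flag` is set within the combined `flags` value.
constexpr bool HasFlag(std::uint32_t flags, std::uint32_t flag) {
    return (flags & flag) != 0;
}
```

Under this assumption, adding two different flags gives the same value as a bitwise OR, so both can be recovered from the sum. Adding the same flag twice would carry into a neighboring bit and change the meaning, which is a good reason to list each flag only once.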
Using Transition Segments

The play event also supports three other parameters: an AudioPath on which to play this Segment, a transition Segment to play prior to playing this Segment, and a PlayingSegment to play relative to (that is, the playing Segment whose measure, beat, or marker boundaries determine when this one starts). We come back to the last one in the next section. But for now, let's take a quick look at transition Segments. Remember that each parameter in a call to a DirectX Audio Scripting function is separated by a comma. We just want to use the default AudioPath, so we can leave the second parameter blank or use the special keyword "nothing" to indicate that it should be ignored and use the default.

    Sub StartActionMusic
        ActionMusic.Play (AtMeasure, nothing, RampUpAction)
    End Sub

…which is equivalent to:

    Sub StartActionMusic
        ActionMusic.Play (AtMeasure, , RampUpAction)
    End Sub

Both versions of the routine would do the same thing: at the next measure of the primary Segment, play the RampUpAction Segment that we've authored as a transition Segment; when it finishes, play ActionMusic.
Stopping a Piece of Audio

Now our music is chugging along, but we need to be able to stop it eventually (assuming it was authored to loop for a while, or we just need to get back to authoring for a bit!). The basic stop call is similar to our basic play call:

    Sub ExitRoom
        RoomAmbience.Stop
    End Sub

Again, by default, the stop occurs on the Segment's default boundary (which was probably immediate). You can use the single parameter of Segment.Stop to specify a more specific boundary or (if you are using Style-based Segments) to tack an end embellishment pattern onto the ending of the Segment. Again, see the DirectMusic Scripting Reference portion of the DirectMusic Producer help file for a complete list of options. As an example, when we exit the room, we want to stop our music at the next beat. We could create the following routine:

    Sub ExitRoom
        RoomAmbience.Stop (AtBeat)
    End Sub
Global DirectMusic Parameters

Scripting also exposes a number of DirectMusic's global parameters. A script can modify the master tempo at which all content is playing, for instance. Or the script can change the master groove level (handy for dynamic music that uses Style-based playback).

    Sub DoubleTime
        SetMasterTempo (200)
    End Sub

    Sub IncreaseIntensityLevel
        SetMasterGrooveLevel (GetMasterGrooveLevel + 1)
    End Sub

This second example actually uses two global scripting functions, SetMasterGrooveLevel and GetMasterGrooveLevel. We can find out the current groove level, increment it by one, and then set the groove level to the resulting value.

All of these "global" functions are technically operating on a DirectMusic Performance object, so in the documentation, they can be found under Performance.[Function]. But since most implementations only involve a single Performance object, you don't typically need to use one in your scripting calls. Remember that multiple performances can be created by a programmer if you need to have multiple tempos in use at the same time or multiple primary Segments; each performance is effectively its own instance of the DirectMusic playback engine.

One note with using the global functions is that you probably want to create a function that resets all of the global DirectMusic settings just so that you know you're starting off from a known state. Otherwise, after running the above script routines, we'd be playing at twice the authored tempo and at a higher groove level without having any way to get back to a regular tempo and groove level!

    Sub ResetSettings
        SetMasterTempo (100)
        SetMasterGrooveLevel (0)
    End Sub
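The arithmetic behind these routines can be pictured with a small C++ model. It assumes, as the DoubleTime and ResetSettings routines imply, that the master tempo is a percentage of the authored tempo (100 means "as authored") and that the neutral master groove level is 0. The struct and function names are hypothetical and are not part of the DirectMusic API.

```cpp
// Hypothetical model of the global settings that the script routines adjust.
struct GlobalSettings {
    int masterTempoPercent = 100;  // 100 = play at the authored tempo
    int masterGrooveLevel = 0;     // 0 = authored groove levels are unchanged

    void DoubleTime() { masterTempoPercent = 200; }
    void IncreaseIntensityLevel() { masterGrooveLevel += 1; }

    // Return to a known state, mirroring the ResetSettings routine.
    void Reset() {
        masterTempoPercent = 100;
        masterGrooveLevel = 0;
    }
};

// Tempo actually heard for content authored at `authoredBpm`.
double EffectiveTempo(double authoredBpm, const GlobalSettings& settings) {
    return authoredBpm * settings.masterTempoPercent / 100.0;
}
```

In this model, a Segment authored at 120 beats per minute plays at 240 after DoubleTime and returns to 120 after Reset, which is the "known state" that a reset routine is meant to guarantee.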
Editing Content while Using Scripts

You might sometimes notice that a script "loses track" of a Segment you were using. You try to fire off a routine to play that Segment and you get silence or an older version of the Segment plays. This is most frequently caused by editing a Segment after setting up your script. In order to allow for script routine auditioning within DirectMusic Producer, the script is actually "running" even as you edit it. But when you edit a piece of music that the script was using, DirectMusic Producer has a tough choice: the script could update with each edit to the Segments (which might cause a significant slowdown to edits if the Segment and/or script were large), or the script can just accept that it might get out of sync, at which point the content creator can force a refresh. DirectMusic Producer opts for the latter. You can force the script to resynchronize with all of its source content by hitting the Refresh button above the list of routines. If that is not successful, you can save and reload your script by right-clicking on it, choosing Save, and then again right-clicking and choosing Revert to Saved.
Debugging Scripted Content

Let's take a quick break from scripting to look at how you can debug (find and correct errors in) scripts. As we discussed before, you know your script has some kind of syntax error when the routines disappear from the Routines list in the top-right frame of the Script Designer window. But where was that error? Rather than having to hunt through every line of code, DirectMusic Producer can report errors along with their locations via the Message Window. You can open it from the Add-Ins menu. This window reports syntax errors (errors that occur because the script was typed wrong), as well as run-time errors (errors that occur when the script tries to run a routine). Let's set up one of each and see how the window responds.

    SubExitRoom
        'Notice - no space between 'Sub' and 'ExitRoom'
        RoomAmbience.Stop (AtBeat)
    End Sub
If the Message Window is open and we click outside of the Source box in the Script Designer (or click the Refresh button above the Routines frame), the Message Window will display this:
Figure 7-7: The Message Window displays errors in your script. (The line might be different for you depending on whether you started a new script or worked from one of the previous examples.) You can see exactly what line that is by clicking back into your script Source window and looking at the status bar down at the bottom of the DirectMusic Producer window. (If the status bar is hidden, display it by selecting it in the View menu.)
Figure 7-8: The status bar displays the exact line and column number that your cursor is currently on.

For an example of a run-time error, let's use that same routine but try to play a Segment that we haven't added to either of the script's folders (for instance, a typo in the name of the Segment that we intended to play).

    Sub ExitRoom
        'Try to play a Segment the script doesn't know about
        SomeOtherSegment.Play (AtBeat)
    End Sub

Nothing happens when we click away from the Source window, as this is a perfectly legal script routine. But when we try to run it (by double-clicking it in the Routines frame), we get what is shown in Figure 7-9:
Figure 7-9: The Message Window displays an error regarding an unlocatable Segment.

This lets us know that the script didn't know where to find the Segment, so we should add it to the script's Reference Runtime or Embed Runtime folder.

The Message Window has one additional use; you can place statements in your routines that provide information similar to comment statements but let you know when a routine (or more particularly, a specific portion of a routine) has been run. These commands are known as trace statements. A programmer can actually "listen" for these trace statements in a game if you want to provide information (for instance, for karaoke, song title information, and so on). Going back to our queued-up dialog, we could add a trace statement that lets us know the script routine completed.

    Sub IntroduceCharacter
        'We trigger this routine whenever our character meets
        'other characters.
        Hello.play
        MyNameIs.play (AtFinish)
        Bob.play (AtFinish)
        Trace "Queued up Bob's introduction."
    End Sub

When we double-click the routine, we hear all three waves play in order and we also immediately see the following in the Message Window:
Figure 7-10: The routine works!
Scripting with Variables

So we now know how to start and stop pieces of audio. For many scenarios, the next step is to be able to use application information to dictate what specific audio is triggered. This involves the use of variables, which are scripting objects that the programmer and the scripter "share" and can adjust or check. In the most basic examples, the programmer will be the only one adjusting a variable's value, and the script will read that value to decide what to do.

Let's return to the character introductions. What if we had 20 character names that our player could choose at the beginning of our application? We wouldn't want to write 20 routines that the programmer would have to choose between when it came time to introduce ourselves. We could instead use a variable, say, one called CharacterNumber (remember that the names of variables, like those of other scripting objects, cannot contain spaces or start with a digit). The programmer sets this variable when we choose our name. Then whenever we introduce ourselves to another character, the programmer can call our single script routine, which checks the value and uses it to decide the appropriate wave to play.

Variable use consists of two parts. First, you must declare the variable so the script knows how to use it (similar to dragging a wave or Segment into one of the script's two folders). The format for this is simply:

    Dim [name of variable]

One quick restriction relating to AudioVBScript is that you must declare any variables that you intend to use above any routines. So this would be illegal:

    Sub EnterRoom
        RoomAmbience.Play (IsSecondary + AtMeasure)
    End Sub

    'illegal - all "dim"s must come before all "sub"s
    dim CharacterNumber

Also make sure that you don't give a variable the same name as one of your other scripting objects (Segments, waves, and so on). Again, the Message Window will let you know if this problem occurs.
Once a variable has been declared, if you click away from the Sources tab, you'll see the variable appear in the lower-right frame of the Script Designer window, as in Figure 7-11.
Figure 7-11: Variables list in the lower-right frame of the Script Designer window.
You can now click on the value field to set the value, just as the programmer would do in the application. Clicking the Reinitialize button resets all values to Empty.
Conditional Statements

Now that we've created a variable, we use conditional statements to act based on its value. With conditional statements, the script performs some comparison on the variable to decide how to act (for instance, what piece of music to play and when to play it). The simplest conditional statements are if…then statements, where if a condition is true, you will perform one or more actions. The syntax for this is:

    If (expression) then
        [Action1]
        [Action2]
        [...]
    End if

Incidentally, indenting the actions that occur within the If/End If statements is for readability only; there's no rule that these lines of code must be spaced a certain way (in fact, tabs and extra spaces are universally ignored in DirectX Audio Scripting). As an example of a simple If/End If statement, we could have a script like this:

    dim PlayerHealth

    Sub PlayDeathMusic
        if (PlayerHealth = 0) then
            DeathMusic.Play (AtImmediate)
        end if
    end sub
If the programmer tried to trigger the death music when the player still had health, nothing would happen; only if health was zero would we hear the Segment called DeathMusic play.

More complex statements will tend to use the if…then/elseif…then/else format: If a condition is true, do some action; otherwise, check to see if another condition is true (elseif) and perform its actions; otherwise, if all of these conditions are false (else), do some final action. For our character introduction example above, our code might look like this (let's simplify to three characters instead of 20):

    dim CharacterNumber
    'We have three possible options for our character
    '1=Bob
    '2=Joe
    '3=Bill

    Sub IntroduceCharacter
        'We trigger this routine whenever our character meets
        'other characters.
        Hello.play
        MyNameIs.play (AtFinish)
        if (CharacterNumber=1) then
            Bob.play (AtFinish)
        elseif (CharacterNumber=2) then
            Joe.play (AtFinish)
        elseif (CharacterNumber=3) then
            Bill.play (AtFinish)
        else
            Trace "Illegal character number."
        end if
    End Sub

We can test out the above routine by setting our variable in the lower-right frame of the Script Designer (just as the programmer would in application code) and then double-clicking a routine to trigger it, again, just as the programmer would. Notice the final else statement in the above example. Using trace statements here often comes in handy for debugging; you are alerted that something caused your variable's value to go outside of the range that you expected it to fall into.
Local Variables

Any time a variable is declared with a dim statement, it is a global variable that the programmer can see and manipulate. There are often situations where global variables are not necessary. For instance, the scripter might do some basic math on several global variables to decide what piece of music to play. Rather than having to create another global variable for this value, scripters can use local variables. Local variables have two primary differences from global variables: they cannot be seen by the programmer, and they do not have to be declared. They are implicitly created when they are first used.

A good example of local variable use is adding randomization to scripted playback. AudioVBScript has a rand function that can be used to generate a random number from 1 to the value you specify. For instance, if we had three pieces of music that would be appropriate for the current scene, we could construct a script routine like the following:

    Sub PlaySceneMusic
        x = rand(3)
        'x is a local variable - we did not have a
        '"dim x" statement.
        'The programmer cannot see it, and its value is
        'not stored after the routine finishes running.
        If (x=1) then
            LowLevelMusicThematic.Play
        ElseIf (x=2) then
            LowLevelMusicRhythmic.Play
        ElseIf (x=3) then
            LowLevelMusic.Play
        End if
    End Sub

Again, see the DirectMusic Producer documentation page (for Performance.Rand) for more details on this function. Of course, this provides a small subset of the functionality that you could create within a single Segment with variations, but it does allow your script to randomly choose between several fully composed Segments without needing to merge them into a giant Segment or needing the programmer to get involved.
A Two-Way Street: Variables Set by the Scripter

Variables can also be set by the scripter and used by the programmer. For instance, a sound designer could track how many times a routine has been called or how many instances of a sound have been started. Note that tracking when they've stopped is often more easily handled by the programmer.

    dim NumberOfTorchesPlayed

    Sub StartTorch
        Torch.Play (IsSecondary)
        NumberOfTorchesPlayed = NumberOfTorchesPlayed + 1
    end sub
Tracking Individual Instances of a Playing Segment

Now that we have a basic understanding of variables, let's return to the stop function on Segments. One thing to note with the Segment.Stop call is that we're stopping the Segment rather than a specific playing instance of that Segment. For ambience and music, where there is generally only one instance of any particular piece of music playing at a time, this isn't an issue. But let's say for example that we have a Segment that plays a torch sound, and the character in our application is walking through an environment with 50 torches that we had started playing. Calling Torch.stop would stop all 50 of them. If we have just extinguished one, we want the ability to stop just that single torch.

To do this, we can keep track of PlayingSegment objects. Whenever a Segment is played, we can choose to store this particular playing instance in a variable. We can later stop this particular PlayingSegment without stopping every other instance of the original Segment.

    dim MyPlayingTorch

    Sub StartMyTorch
        'We only carry one torch, so there's only one
        'PlayingSegment we need to keep track of
        Set MyPlayingTorch = Torch.Play (IsSecondary)
    end sub

    Sub StartRoomTorch
        Torch.Play IsSecondary
    End sub

    Sub StopMyTorch
        'Will only stop our own torch; room torches will
        'continue to play.
        MyPlayingTorch.Stop
    End Sub

    Sub StopAllTorches
        Torch.Stop
    End Sub

Note that the syntax for using objects in variables is slightly modified from numeric variables: a Set statement is used when you are assigning something to an object, rather than a numeric variable. In this example, we now have a personal torch that we track the instance of, and if our character extinguishes it, that single instance can be stopped without affecting other torches in the room. Meanwhile, the room might be filled with other torches that we don't need to be able to stop individually, so the script doesn't keep track of their individual PlayingSegment objects. The StopAllTorches method would stop every torch, including our own.
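The difference between stopping a Segment and stopping one playing instance can be modeled outside of DirectMusic. The C++ sketch below is a simplified illustration with hypothetical class names (Segment and PlayingInstance here are stand-ins, not the DirectMusic objects): playing returns a handle to one instance, and the Segment remembers every instance it started.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// One playing instance of a Segment (a stand-in for a PlayingSegment).
struct PlayingInstance {
    bool playing = true;
    void Stop() { playing = false; }
};

// A stand-in for a Segment that remembers every instance it has started.
class Segment {
public:
    // Start a new instance and hand back a handle the caller may keep.
    std::shared_ptr<PlayingInstance> Play() {
        auto instance = std::make_shared<PlayingInstance>();
        instances_.push_back(instance);
        return instance;
    }

    // Stopping the Segment stops every instance, tracked or not.
    void Stop() {
        for (auto& instance : instances_) instance->Stop();
    }

    std::size_t PlayingCount() const {
        std::size_t count = 0;
        for (const auto& instance : instances_) {
            if (instance->playing) ++count;
        }
        return count;
    }

private:
    std::vector<std::shared_ptr<PlayingInstance>> instances_;
};
```

Keeping the handle returned by Play corresponds to the Set statement in StartMyTorch; ignoring it corresponds to StartRoomTorch. Calling Stop on the kept handle silences only that one torch, while calling Stop on the Segment silences all of them.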
Triggering Segment Playback Relative to Playing Segments

Now that we can use PlayingSegment objects, we can return to our original Character Introduction script routine (stitching together "Hello my name is Bob") and make it use secondary Segments. By default, when you play a Segment, it uses the primary Segment as the reference point to know when to start playing. But we don't want that here, or the "my name is" and "Bob" waves aren't going to play until the primary Segment finishes (and at that point, they'll both play at the same time!). We instead want "my name is" to queue up relative to the "Hello" Segment (which is going to play as a secondary Segment) and then for "Bob" to queue up relative to "my name is."

To get the proper stitching behavior, we are going to use the fourth parameter on the Segment.Play function, oFromPlayingSegment, which is the PlayingSegment object that this play event should play relative to. Don't forget that we still need to fill in the second and third parameters to the play function, for which we could use the keyword "nothing" or just leave blank (but still using commas to separate them).

    Dim PlayingSegHello
    Dim PlayingSegMyNameIs

    Sub IntroduceCharacter
        Set PlayingSegHello = Hello.play (IsSecondary)
        Set PlayingSegMyNameIs = MyNameIs.play (AtFinish + IsSecondary, nothing, nothing, PlayingSegHello)
        Bob.play (AtFinish + IsSecondary, nothing, nothing, PlayingSegMyNameIs)
    End Sub
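The timing consequences of choosing a reference Segment can be checked with a little arithmetic. The C++ sketch below is a toy timeline, not DirectMusic code: the Playing struct, the helper functions, and all durations are made up for illustration, and "at finish" is modeled simply as "start when the reference ends."

```cpp
// A playing item on a toy timeline, with times in seconds.
struct Playing {
    double start;
    double end;
};

// Start immediately at time `now`.
Playing PlayNow(double now, double duration) {
    return Playing{now, now + duration};
}

// Model of the AtFinish flag: start when `reference` ends.
Playing PlayAtFinish(const Playing& reference, double duration) {
    return Playing{reference.end, reference.end + duration};
}
```

With a 60-second primary Segment, queuing both "my name is" and "Bob" relative to the primary makes both start at 60 seconds, on top of each other. Chaining instead (a 0.5-second "Hello" starting at 0, "my name is" at the finish of "Hello", "Bob" at the finish of "my name is") yields start times of 0, 0.5, and 1.5 seconds when "my name is" lasts one second, which is the stitched sentence we want.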
Script Tracks and Self-Modifying Music

One of the more interesting aspects of DirectX Audio Scripting is that script routines do not solely have to be triggered by the programmer. DirectMusic Segments themselves can fire off script routines at appropriate times, which allows for the concept of self-modifying music. A piece of music might play and check the state of a variable every four bars. If that variable has changed to a certain value, the music might stop of its own accord or fire off another piece of music, change tempo, or any of a number of actions. Segments can trigger script routines by using a Script Track. When a Segment has a Script Track, you can place specific routine calls at any point in the Segment, and that routine will be called when the Segment reaches that position. For instance, for our above example, a Segment might look like this:
Figure 7-12: Every four bars, the script routine CheckNumEnemies is called. If this Segment was our primary Segment, and the routine CheckNumEnemies fired off another primary Segment, this Segment would of course stop playing. This can lead to all sorts of interesting implementations. Every piece of music in your application could fire off a script routine call right as it ends that figures out what piece of music to play next. So you could effectively have a self-running jukebox without the programmer having to code anything. Or the music can poll a variable every so often and respond accordingly without needing the programmer to let the script explicitly know that the variable has been updated. Such a technique is used in the baseball example included with the DirectX Software Development Kit — an infinitely looped secondary Segment consisting solely of a script track checks to see if the score has changed and whether the crowd has responded appropriately with cheers or boos (without needing the programmer's involvement in the least). See the baseball.spt file (which can be opened in DirectMusic Producer and re-expanded into its source files) for more details on how this was implemented.
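One practical consequence of polling from a Script Track is that the music only notices a change at the next scheduled routine call. The C++ sketch below models that with a hypothetical helper (it is not part of any DirectMusic API): a routine is invoked at bar 1 and then every `interval` bars, and the function reports the first bar at which the routine decides to react.

```cpp
#include <functional>

// Calls `routine` at bar 1, then every `interval` bars up to `totalBars`.
// Returns the first bar at which the routine reports that it reacted,
// or -1 if it never did. Bars are numbered from 1.
int FirstBarRoutineReacts(int totalBars, int interval,
                          const std::function<bool(int)>& routine) {
    for (int bar = 1; bar <= totalBars; bar += interval) {
        if (routine(bar)) {
            return bar;
        }
    }
    return -1;
}
```

If the watched variable changes during bar 6 and the routine is called every four bars (bars 1, 5, 9, 13), the music responds at bar 9. Placing routine calls more densely in the Script Track shortens that delay at the cost of more frequent script execution.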
Using Scripting for Resource Management

DirectX Audio Scripting can also be used to allow the composer to control certain aspects of resource management, particularly when Segments are loaded, downloaded, and unloaded. Loading is the process of reading a Segment in from the disk. Downloading is the process of determining the waves (streaming or in memory) and/or DLS instruments the Segment uses and loading their wave data into memory (or, where available, transferring their wave data onto a DLS-capable hardware synthesizer, the original concept behind the term "download"). Unloading is the opposite of downloading, where DLS instruments and waves no longer in use can be freed.

By default, all of these are managed automatically; when a script loads, all of the Segments that it references or embeds are automatically loaded and downloaded. Of course, if you have one script for an entire application, that could be a fairly large resource hit. The scripter could create one script for every level or one for every group of sounds that can be inclusively loaded and unloaded. Alternatively, the scripter can opt to control when these processes occur.
Figure 7-13: You can control whether loading and downloading automatically occur via the script's properties page.

With increased control comes increased responsibility. If the script attempts to play a Segment that it has not yet loaded, an error will occur. If it attempts to play a Segment that has not been downloaded, that Segment will probably play silently. Similarly, it will be up to the scripter to free (unload) content that is no longer needed, or system resources might be unnecessarily drained. As one extra note of precaution, loading and downloading are synchronous functions, which means that the script will wait to continue executing content until the necessary disk transfers are complete.

    Sub LoadAndPlayMusic
        MySegment.Load
        MySegment.DownloadSoundData
        'Downloads all DLS instruments and in-memory waves.
        MySegment.Play
    End Sub

    Sub FreeWaveData
        MySegment.UnloadSoundData
        'Make sure you unload as many times as you download
        'for a Segment, as wave data won't be freed until then.
    End Sub

While downloaded instruments are freed via UnloadSoundData, loaded Segments are not freed until the script is freed. This is generally not much of an issue, as it is the wave data (DLS instruments and non-streamed waves) that takes significant memory and needs to be unloaded as soon as playback is completed.

Once a script has been completed, don't forget to run-time save it (right-click and choose Runtime Save As… or Runtime Save All Files from the File menu); the .spt file that is generated will embed any content that was placed in the Embed Runtime folder and knows how to use any separately delivered content that had been dragged into the Reference Runtime folder.

When scripts are used in a DirectMusic application, encourage your programmer to give you the ability to manually fire off every script routine in some manner while running the application. For instance, if the bad guy music only gets triggered at the very end of a game, you probably don't want to have to go all the way through a level just to hear if the music triggered at an appropriate boundary.

DirectX Audio Scripting provides another tool to the composer or sound designer for controlling audio behavior without needing to overly involve the programmer and, more importantly, the ability to make changes without having to rebuild the entire application. While the functionality provided is a subset of the functionality made available programmatically, the functions allow for the vast majority of basic and intermediate scenarios to be handled solely by the content author.
Unit II: Programming with DirectX Audio Chapter List Chapter 8: DirectX Audio Programming 101 Chapter 9: The Loader Chapter 10: Segments Chapter 11: AudioPaths Chapter 12: Scripting Chapter 13: Sound Effects Chapter 14: Notifications
Chapter 8: DirectX Audio Programming 101

Overview

Todor Fay

We begin our journey into DirectX Audio programming with a short overview of the technology and a simple application that makes a sound.

Note: Understand that a certain level of programming experience is required to work with DirectX Audio. You should be familiar with C++ and object-oriented programming concepts. Experience in COM is a plus but is not required, since we walk through the parts you need to know. Experience in MFC and Windows programming is valuable as well but, again, not required. Of course, you need a compiler. All programming examples were written and tested in Microsoft Visual C++ 6.0 and Microsoft Visual Studio .NET.
DirectX Audio = DirectSound + DirectMusic Before we start, let's look at the DirectX Audio building blocks. To do that, it helps to have a little history. Microsoft created DirectSound back in 1995. DirectSound allowed programmers to access and manipulate audio hardware quickly and efficiently. It provided a low-level mechanism for directly managing the hardware audio channels, hence the "Direct" in its name. Microsoft introduced DirectMusic several years later, providing a low-level, "direct" programming model, where you also talk directly to a MIDI device, as well as a higher-level content-driven mechanism for loading and playing complete pieces of music. Microsoft named DirectMusic's lower-level model the Core Layer and the content-driven model the Performance Layer, which provides relatively sophisticated mechanisms for loading and playback of music. The Performance Layer also manages the Core Layer. In DirectX 7.0, the host application uses DirectMusic and DirectSound separately (see Figure 8-1). MIDI and style-based music are loaded via the DirectMusic Performance Layer and then fed to the DirectMusic Core, which manages MIDI hardware as well as its own software synthesizer. Sound effects in the form of wave files are loaded into memory, prepared directly by the host application, and fed to DirectSound, which manages the wave hardware as well as software emulation. It became clear after the release of DirectMusic that the content-driven model appealed to many application developers and audio designers since the techniques used by DirectMusic to deliver rich, dynamic music were equally appropriate for sound effects. Microsoft decided to integrate the two technologies so that DirectMusic's Performance Layer would take on both music and sound effect delivery.
Figure 8-1: In DirectX 7.0, DirectMusic and DirectSound are separate components. Microsoft added sound effect functionality to the Performance Layer, including the ability to play waves in Segments and optionally use clock time instead of music time. The Core Layer assumes wave mixing via the synthesizer, which is really just a sophisticated mix engine that happens to be more efficient than the low-level DirectSound mechanisms. This way, the Core Layer premixes and feeds sound effects into one DirectSound 3D Buffer, dramatically increasing efficiency. Real-time audio processing significantly enhances the DirectMusic/DirectSound audio pipeline. Perhaps the most significant new feature is the introduction of the AudioPath concept. An AudioPath manages a virtual audio channel, or "path," from start to end. This starts with the generation and processing of MIDI and wave control data, continues through the
synthesizer/mix engine in the Core Layer, on through the audio effects processing channels, and finally to the final mix, including 3D spatialization. With DirectX 8.0 and later, the Performance Layer controls one or more AudioPaths, each of which manages the flow of MIDI and wave data into the DirectMusic software synthesizer, on through multiple audio channels into DirectSound, where they are processed through effects, and through their final mix stage into the audio device (see Figure 8-2).
Figure 8-2: In DirectX 8.0, DirectSound and DirectMusic merge, managed by AudioPaths. This integration of DirectSound and DirectMusic, delivering sounds and music through the Performance Layer via AudioPaths, is what we call DirectX Audio. Note When Microsoft delivered DirectX Audio in DirectX 8.0, there was one unfortunate limitation: The latency through the DirectMusic Core Layer and DirectSound was roughly 85ms, unacceptably high for triggered sound effects. Therefore, although a host application could use the AudioPath mechanism to create rich content-driven sounds and music and channel them through sophisticated real-time processing, it had to bypass the whole chain and write directly to DirectSound for low latency sound effects (in particular, "twitch" triggered sounds like gun blasts) in order to achieve acceptable latency. Fortunately, Microsoft improved latency in the release of DirectX 9.0. With this release, games and other applications can finally use the same technology for all sounds and music. Sounds like a lot of stuff to learn to use, eh? There is no denying that all this power comes with plenty of new concepts and programming interfaces to grok. The good news is you only need to know what you need to know. Therefore, it is quite easy to make sounds (in fact, a lot easier than doing the same with the original DirectMusic or DirectSound APIs) and then, if you want to learn to do more, just dig into the following chapters! In that spirit, we shall write a simple application that makes a sound using DirectX Audio. First, let's acquaint ourselves with the basic DirectX Audio building blocks.
DirectX Audio Building Blocks In order to accomplish the simplest level of DirectX Audio programming, we need to work with three objects — Segment, Performance, and Loader. It is important to understand AudioPaths and COM programming as well.
Segment A Segment represents any playable audio. It might be a MIDI or wave file, or it might be a DirectMusic Segment file authored in DirectMusic Producer. You can load any number of Segments from disk, and you can play any number of them at the same time. You can even play the same Segment multiple times overlapping itself.
AudioPath An AudioPath represents the journey from Segment to synthesizer/mix engine to audio channel to final mix. A DirectSound Buffer manages the audio channel, since a Buffer really represents one hardware (or software) audio channel. There can be any number of AudioPaths active at any time and any number of Segments can play on the same AudioPath at once.
Performance The Performance manages the scheduling of Segments. Think of it as the heart of the DirectX Audio system. It allocates the AudioPaths and connects them, and when a Segment plays, it connects the Segment to an AudioPath and schedules its playback.
Loader Finally, the Loader manages all file input/output. This is intentionally separate from the Performance, so applications can override the loading functionality, should they desire.
COM COM is short for Component Object Model. Like all DirectX objects, the Segment, AudioPath, Performance, and Loader are COM objects. DirectX Audio's usage of COM is at the most elementary level. Let's discuss COM now:
§ Individual function interfaces represent each object as a table of predefined functions that you can call. There is no direct access to the internal data and functions of the object.
§ Interfaces and objects are not the same. An interface provides an understood way to communicate with an object. Importantly, it is not limited to one object. For example, there is an interface, IPersistStream, that provides methods for reading and writing data to a file. Any COM object that includes an IPersistStream interface can receive commands to read and write data from any component that understands IPersistStream.
§ A unique identifier called an IID (interface ID) identifies each COM interface. An IID is really a GUID (globally unique ID), a 16-byte number guaranteed to be unique. For example, IID_IDirectMusicPerformance represents the IDirectMusicPerformance interface.
§ Every COM object also has a unique identifier called a CLSID (class ID), which is also a GUID. Continuing our example, CLSID_DirectMusicPerformance identifies the DirectMusic Performance object.
§ COM has a special function called CoCreateInstance() that you can use to create most COM objects. CoCreateInstance() finds the DLL (dynamic-link library) that contains the object's code, loads it, and calls a standard entry function to create the object and then return it to your program. The caller provides the CLSID of the object and the IID of the desired interface, and CoCreateInstance() returns the matching object and interface.
§ With few exceptions, all COM methods return error codes in a standardized format called HRESULT. When a method succeeds, it normally returns S_OK. There are failure codes for everything from out of memory to errors specific to the particular object. The SDK documents all possible return codes for each interface.
§ Each object supports the base COM interface, IUnknown, with its three methods: AddRef(), Release(), and QueryInterface(). These methods provide a standardized way to manage the existence of the object and access its interfaces.
A program might reference an object with more than one pointer in more than one place. This easily happens when objects indirectly reference other objects (for example, a program references several Segments that all reference a particular wave file). It is important for the object to know how many pointers are currently referencing it. Every time something new points to the object, it calls AddRef() to let the object know. The object's AddRef() implementation increments an internal reference counter and ensures that the object stays valid until the last external reference is released via Release(). AddRef() is automatically invoked when an object is created, so you rarely call it directly.
When a pointer to an object is no longer needed, the referencing owner calls Release() to tell the object that a reference has gone away. When the reference count drops to zero, indicating that nothing is using the object anymore, the object must go away for good. Typically, it cleans up any variables of its own and then frees its memory.
A COM object can support one or more interfaces. Because all interfaces are based on IUnknown and must implement the QueryInterface() method, it is possible to use one interface to hopscotch to the next by calling QueryInterface() and naming the desired interface (using the IID). For example, the Segment object supports both the IDirectMusicSegment interface, which you use to set Segment parameters, and the IPersistStream interface, which the Loader uses to read data into the Segment.
Therefore, from your perspective, COM programming in DirectX Audio means you use CoCreateInstance() to create a few objects (the rest are created automatically by other DirectX Audio objects), Release() to get rid of them, and on the rare occasion QueryInterface() to access additional features. That's enough explanation about COM. Let's get on with the program.
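The reference-counting and QueryInterface() rules described above can be tried out without Windows at all. The following is a minimal sketch in portable C++, not real COM: HResult, InterfaceId, IUnknownLike, ISegmentLike, IPersistStreamLike, and MockSegment are all hypothetical stand-ins invented for this illustration, and a small enum takes the place of 16-byte IIDs.

```cpp
#include <cassert>

// Portable stand-ins for COM concepts. None of these are the real Windows definitions.
using HResult = long;
constexpr HResult kOk = 0;            // plays the role of S_OK
constexpr HResult kNoInterface = -1;  // plays the role of a failure code (negative = failed)
inline bool Succeeded(HResult hr) { return hr >= 0; }

// Hypothetical interface IDs. Real COM identifies interfaces with GUIDs.
enum class InterfaceId { Unknown, Segment, PersistStream, AudioPath };

struct IUnknownLike {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual HResult QueryInterface(InterfaceId iid, void** ppv) = 0;
    virtual ~IUnknownLike() = default;
};
struct ISegmentLike : IUnknownLike { virtual void SetRepeats(int count) = 0; };
struct IPersistStreamLike : IUnknownLike { virtual HResult Load(const char* data) = 0; };

// One object, two interfaces, one shared reference count.
class MockSegment final : public ISegmentLike, public IPersistStreamLike {
public:
    static int liveObjects;
    MockSegment() { ++liveObjects; }
    ~MockSegment() override { --liveObjects; }

    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;  // last reference gone: the object removes itself
        return remaining;
    }
    HResult QueryInterface(InterfaceId iid, void** ppv) override {
        switch (iid) {
        case InterfaceId::Unknown:
            *ppv = static_cast<IUnknownLike*>(static_cast<ISegmentLike*>(this));
            break;
        case InterfaceId::Segment:
            *ppv = static_cast<ISegmentLike*>(this);
            break;
        case InterfaceId::PersistStream:
            *ppv = static_cast<IPersistStreamLike*>(this);
            break;
        default:
            *ppv = nullptr;
            return kNoInterface;  // this object does not support the requested interface
        }
        AddRef();  // every interface pointer handed out counts as a reference
        return kOk;
    }
    void SetRepeats(int count) override { repeats_ = count; }
    HResult Load(const char*) override { return kOk; }

private:
    unsigned long refs_ = 1;  // creation hands the caller the first reference
    int repeats_ = 0;
};
int MockSegment::liveObjects = 0;

// Creates an object, hops to a second interface, releases both pointers,
// and returns how many objects are still alive afterward.
int DemoLifeCycle() {
    ISegmentLike* segment = new MockSegment();  // count is 1, as after CoCreateInstance()
    void* raw = nullptr;
    const HResult hr = segment->QueryInterface(InterfaceId::PersistStream, &raw);  // count is 2
    assert(Succeeded(hr));
    auto* stream = static_cast<IPersistStreamLike*>(raw);
    stream->Load("placeholder data");
    stream->Release();   // count is 1
    segment->Release();  // count is 0, so the object deletes itself
    return MockSegment::liveObjects;
}
```

Calling DemoLifeCycle() leaves no live objects behind, which is the whole point of the scheme: the object, not the caller, decides when it is safe to free itself.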
HelloWorld! No introduction to any API is complete without that ancient but useful ritual known as the "Hello World" application. Our first application loads a wave file and plays it. Before we do anything, we need to hook up the correct include files. DirectX Audio only requires dmusici.h, as it has all the interfaces, data structures, and GUIDs for the Performance Layer. In turn, it pulls in the dsound.h and dmusic.h lower-level header files, so we get everything we need. Here is what we need to put at the top of the application:
    #include <dmusici.h>
Note
You need to link with the GUID library that comes with DirectX. Be sure to include dxguid.lib in your project's linker settings or you will get a ton of errors for each IID or CLSID used in your code but not found by the linker. In VC 6, open the Project Settings… dialog, select the Link tab, and add dxguid.lib to the Object/Library Modules edit box. Of course, this has already been set up properly with the projects included on the companion CD.
Now we can get to the program. Start by declaring the objects we use. Notice that the objects are all interface pointers. They do not yet represent real objects.
    // The performance manages all audio playback.
    IDirectMusicPerformance8* pPerformance = NULL;
    // The loader manages all file I/O.
    IDirectMusicLoader8* pLoader = NULL;
    // Each audio file is represented by a Segment.
    IDirectMusicSegment8* pSegment = NULL;
Before we start, initialize the COM system by calling CoInitialize() so that CoCreateInstance() can function.
    // Initialize COM
    CoInitialize(NULL);
Note
If you are working with DirectPlay and using the same thread, you should instead call CoInitializeEx(NULL,COINIT_MULTITHREADED), which is what DirectPlay requires. DirectMusic is completely thread-safe, so it works well with either.
Now, use COM to create the loader object. CoCreateInstance() is passed the class ID of the loader as well as the interface ID for the IDirectMusicLoader8 interface. CoCreateInstance() uses this information to query the system registry, find the corresponding dynamic-link library (DLL), which in this case is dmloader.dll, load it, and tell the DLL to create one Loader object with the IDirectMusicLoader8 interface.
    // Create the loader. This is used to load any DirectX
    // Audio object that is read from a file. We use
    // it to read a wave file as a Segment.
    // The three parameters that really matter are the
    // class ID, interface ID, and returned pointer. The
    // two other parameters assume more sophisticated COM
    // programming, which is not required for DirectX Audio.
    CoCreateInstance(
        CLSID_DirectMusicLoader,   // Class ID of the DirectMusic loader.
        NULL,                      // Ignore COM aggregation.
        CLSCTX_INPROC,             // Must run within this application.
        IID_IDirectMusicLoader8,   // Request the IDirectMusicLoader8 interface.
        (void**)&pLoader);         // Return pointer to interface here.
Likewise, create a Performance.
    // Create the Performance. This manages the playback of sound and music.
    CoCreateInstance(
        CLSID_DirectMusicPerformance,  // Class ID of the DirectMusic Performance.
        NULL,                          // Ignore.
        CLSCTX_INPROC,                 // Yada yada fancy COM stuff.
        IID_IDirectMusicPerformance8,  // Interface we want to get back.
        (void**)&pPerformance);        // Returned IDirectMusicPerformance8 pointer.
You must initialize the Performance for playback. A call to InitAudio() achieves this. At minimum, you must specify which AudioPath type you want to use by default and how many channels you need on that default path. An AudioPath represents a signal path through which the sounds (Segments) play. AudioPaths can include real-time effects, such as reverb or compression. You usually create AudioPaths from configuration files that define their layout, but you can also create predefined AudioPaths directly. In our example, we create one stereo AudioPath with no effects, using the DMUS_APATH_DYNAMIC_STEREO standard AudioPath identifier.
    // Initialize the Performance.
    pPerformance->InitAudio(NULL,NULL,NULL,
        DMUS_APATH_DYNAMIC_STEREO,  // Default AudioPath type.
        2,                          // Only two pchannels needed for this wave.
        DMUS_AUDIOF_ALL,NULL);
Now it is time to use the Loader to read the wave file from disk. This creates a Segment. Note that the Loader's LoadObjectFromFile() method has a lot in common with the system
CoCreateInstance() command. Again, you provide a class ID for the type of object you wish to load, and you provide an interface ID to indicate which interface you expect to use. Not surprisingly, the Loader turns around and calls CoCreateInstance() to create the object you requested. Provide the name of the file to the Loader. In our HelloWorld application, it is, naturally, a wave file titled Hello.wav.
    // Load the demo wave. We are making the assumption that it is in the
    // working directory.
    // In VC6, this is set in the Project Settings/Debug/General/Working directory: field.
    if (SUCCEEDED(pLoader->LoadObjectFromFile(
        CLSID_DirectMusicSegment,   // Class identifier.
        IID_IDirectMusicSegment8,   // ID of desired interface.
        L"Hello.wav",               // Filename.
        (LPVOID*) &pSegment         // Pointer that receives interface.
        )))
    {
Once the Loader reads the wave into a Segment, we still need to prep it for play. The Download() command moves any audio data (waves and DLS instruments) required by the Segment into the synthesizer, priming it so it can play instantly.
        // Install the wave data in the synth
        pSegment->Download(pPerformance);
Now we are ready to rock. The Performance has a method, PlaySegmentEx(), which can play a Segment in all kinds of neat ways. We just use it to play our Segment in the simplest way for now. I cover the neat bells and whistles in upcoming chapters.
        // Play the wave.
        pPerformance->PlaySegmentEx(
            pSegment,   // Segment to play.
            NULL,NULL,0,0,NULL,NULL,NULL);
Wow! Sound! PlaySegmentEx() returns immediately as the sound starts playing. Normally, this would be no problem, since the application would continue running. However, HelloWorld does not do anything after playing the sound, so it needs to wait patiently for eight seconds (roughly the duration of the sound) before cleaning up and closing down.
        // Wait eight seconds to let it play.
        Sleep(8000);
Okay, it should be done by now, so unload the Segment and release all objects.
        // Unload wave data from the synth.
        pSegment->Unload(pPerformance);
        // Release the Segment from memory.
        pSegment->Release();
    }
    // Done with the performance. Close it down, and then release it.
    pPerformance->CloseDown();
    pPerformance->Release();
    // Same with the loader.
    pLoader->Release();
    // Bye bye COM.
    CoUninitialize();
If you compile this and have trouble getting it to make any sound, the problem is almost certainly that it cannot find the wave file. Make sure that your project's working directory is set to the Media directory, where Hello.wav resides. Refer to your compiler's user manual for information on how to set your working directory. If you double-click on the version in the Bin directory, it plays successfully because I placed a copy of Hello.wav there.
It's a 3D World Now that we have loaded and played a sound using DirectX Audio, let's see how easy it is to add a little code to move that sound in 3D. To do this, we need to start working with AudioPaths. We create a 3D AudioPath that mixes all audio into a DirectSound 3D Buffer. We then use the IDirectSound3DBuffer interface on the buffer to move the sound as it plays. Go back to the InitAudio() call, and change it to create a 3D AudioPath as the default path.
    // Initialize the performance with a default 3D AudioPath.
    pPerformance->InitAudio(NULL,NULL,NULL,
        DMUS_APATH_DYNAMIC_3D,  // Default AudioPath Type.
        2,                      // Only two pchannels needed for this wave.
        DMUS_AUDIOF_ALL,NULL);
This tells the performance to create a 3D AudioPath as its default path. Later, after the wave has been loaded and is ready to play, we access that path.
    // Before we play the wave, get the 3D AudioPath so we can set
    // the 3D position before and during playback.
    IDirectMusicAudioPath *p3DPath = NULL;
    pPerformance->GetDefaultAudioPath(&p3DPath);
The AudioPath method GetObjectInPath() grants direct access to any component within the span of the path, and you will become quite familiar with it as you work with DirectMusic. In this case, use GetObjectInPath() to directly access the 3D Buffer interface.
    // Extract the 3D buffer interface from the 3D path.
    IDirectSound3DBuffer8 *p3DBuffer = NULL;
    p3DPath->GetObjectInPath(0, DMUS_PATH_BUFFER,0,GUID_NULL,0,
        IID_IDirectSound3DBuffer, (void **) &p3DBuffer);
    // Don't need this anymore.
    p3DPath->Release();
Now that we have the 3D interface, we can change the spatial position of any Segments played on this path. Before we play our sound, set its initial position in 3D space.
    // Initialize the 3D position so the sound does not jump
    // the first time we move it.
    p3DBuffer->SetPosition(-2,0.2,0.2,DS3D_IMMEDIATE);
Then, go ahead and play the wave.
    pPerformance->PlaySegmentEx(
        pSegment,   // Segment to play.
        NULL,NULL,0,0,NULL,NULL,NULL);
Now that the sound is playing, we can progressively move it through space.
    // Loop and slowly move the sound.
    float flXPos = (float) -1.99;
    for (; flXPos < 2.0; flXPos += (float) 0.01)
    {
        Sleep(20);   // Equivalent to updating at 50fps
        p3DBuffer->SetPosition(flXPos, flXPos, (float) 0.2, DS3D_IMMEDIATE);
    }
Ooh, isn't that cool? We are finished now, so remember to release the buffer while shutting down.
    // Okay, done. Release the buffer interface.
    p3DBuffer->Release();
That concludes our whirlwind tour of basic DirectX Audio programming. Believe it or not, we covered most of what you need to know to get sound and music playing in your application. It can get so much more fun if you are willing to dig in deeper and really explore, and that is what we do in the upcoming chapters.
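It can help to see how the movement loop relates to the length of the sound. The following portable sketch, which does not touch DirectSound, computes the positions such a sweep visits and how long it takes at one update every 20 milliseconds. Position, BuildSweep(), and SweepDurationMs() are hypothetical names made up for this illustration. Unlike the loop above, it drives the sweep with an integer counter, which is one way to keep floating-point rounding from changing the number of steps.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Position { float x, y, z; };

// Builds the list of positions visited by a sweep from startX up to (but not
// including) endX in fixed steps. An integer counter drives the loop so that
// rounding error does not accumulate the way repeated "x += step" can.
std::vector<Position> BuildSweep(float startX, float endX, float step) {
    std::vector<Position> path;
    const int count = static_cast<int>(std::floor((endX - startX) / step + 0.5f));
    for (int i = 0; i < count; ++i) {
        const float x = startX + static_cast<float>(i) * step;
        path.push_back({x, x, 0.2f});  // x and y move together, z stays fixed
    }
    return path;
}

// Total time the sweep takes if each position is held for intervalMs milliseconds.
int SweepDurationMs(const std::vector<Position>& path, int intervalMs) {
    return static_cast<int>(path.size()) * intervalMs;
}
```

With the values used in this chapter, BuildSweep(-1.99f, 2.0f, 0.01f) yields 399 positions, and SweepDurationMs() at 20 milliseconds per step gives 7980 milliseconds, so the sweep lasts about as long as the eight-second wave.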
Using the CD This is a good time to introduce the Unit II material on the companion CD. Take a look in the Unit II folder on the CD. It includes source directories for each project discussed in these chapters. Each directory is labeled with the chapter number and project name. So, for example, this chapter's programming projects are 8_Hello and 8_Hello3D. The Bin directory contains the final executables of each project, the SourceMedia folder has the DirectMusic Producer source files to create the content, and the Media directory contains the final run-time media files for the sample projects. You can either copy these files by hand or you can run the Setup program, also in the Unit II folder, to automatically install these files on your hard drive. It includes an uninstaller, so you can install, play with the materials, and when done wipe them out to make room for more on your hard drive.
Chapter 9: The Loader Todor Fay Although the Loader is a decidedly unglamorous feature of DirectX Audio, it is a good idea to become familiar with it. Not only does the Loader manage the files you load, it also manages how they are cached in memory. So, even if you skip past this chapter for now, be sure to come back and give it a read later.
Why the Loader? The Loader serves three purposes: managing linkages between DirectMusic files, providing convenience functions for easy file loading and manipulation, and being replaceable so an application can write its own customized Loader that can deliver resources from a packaged file format of its own devising (more on why this is useful later).
Manage File Relationships The Loader manages the relatively complex linkages that exist between DirectMusic files. For example, a Style playback Segment file might reference both a Style file and a DLS instrument file. Because multiple Segments might use the same Style and DLS files, it is extremely inefficient to bundle up all the Style and DLS data into the Segment. You end up with multiple files bundling redundant Style and DLS data. Worse, all this data would redundantly load into memory with the Segment. The solution is to place the shared Styles and DLS instruments in their own files that the Style playback Segment references. At the time the Segment is loaded, the appropriate Style and DLS files must also be loaded and correctly linked. The Loader provides a centralized mechanism to manage this for you. In Figure 9-1, three Segment files reference different combinations of DLS, Style, and wave files. The Loader manages internal pointers to the DLS, Style, and wave objects, so only one instance of each is necessary.
Figure 9-1: File referencing in DirectMusic.
Note
In some circumstances it is reasonable to bundle all of the files required by a Segment into the Segment itself. This can include waves, DLS, Styles, and more. You can create Segments with embedded files with the Segment Designer in DirectMusic Producer. This has the advantage of packaging the Segment as just one file with no dependencies. It is great for music file players, since there is no need to ship a set of interconnected files.
Provide a Convenient File Load API The Loader serves as a convenience function for easy file loading. All DirectMusic objects that can be loaded from files support the standard COM file I/O interface IPersistStream. By supporting this, any DirectMusic object can be loaded in a few steps by:
1. Calling CoCreateInstance() to create an instance of the object.
2. Calling the object's QueryInterface() method to retrieve its IPersistStream interface.
3. Creating a file stream object represented by the standard IStream interface.
4. Presenting the IStream to the IPersistStream Load() method, which reads the data from the stream and creates the instance of a loaded object.
Clearly, this is a lot of work, and yet it is something that every application must do. Therefore, it makes sense to provide a standard helper so the application does not have to do this redundant work.
Support Customization Applications like games often keep resources bundled up in one large file. Although DirectMusic's container format provides an easy way to accomplish that, it still does not provide encryption or compression features that a game might require. Because the Loader uses a publicly defined interface, you can create your own Loader and have it retrieve the data in any way that it pleases.
Loading a File So, how do you use the Loader to load files? As we saw in the HelloWorld example, you must first use the COM CoCreateInstance() call to create a Loader.
    CoCreateInstance(CLSID_DirectMusicLoader, NULL, CLSCTX_INPROC,
        IID_IDirectMusicLoader8, (void**)&pLoader);
This creates the Loader and gets it ready for business. Remember that you will need to call Release() on the Loader when you are done using it, which should be after all file I/O is completed for your application.
Note
Because the Loader caches files for efficiency, it is very important that you keep the same Loader around. I have seen situations where a clever programmer got rid of one pesky variable by creating a Loader every time a file needed to be loaded and then released it. This solution worked, but memory usage skyrocketed as each Loader reloaded the same styles, waves, and DLS Collections instead of sharing them. In addition, file I/O and download times took longer for the same reason.
Once you have an instance of a Loader, you need to tell it where to look for your file. Call the SetSearchDirectory() method to tell the Loader where you want it to search for referenced files. You might also use SetObject() if there are specific referenced files that you want it to know about. Then, load the file, using either the LoadObjectFromFile() or GetObject() methods. Let's look at each of these methods in detail.
SetSearchDirectory() SetSearchDirectory() tells the Loader which directory to go to when searching for a file that is referenced by the file currently being loaded.
    HRESULT SetSearchDirectory(
        REFGUID rguidClass,
        WCHAR* pwszPath,
        BOOL fClear
    );
§ REFGUID rguidClass: This is the class ID for the type of file (a REFGUID is a reference to a GUID in C++ and a pointer to one in C; in this case, the GUID is a CLSID). You can set the Loader to look in different directories for different types of files. For example, to set the search directory for DLS files, this is set to CLSID_DirectMusicCollection. To set the search directory for all types, use GUID_DirectMusicAllTypes.
§ WCHAR *pwszPath: The Loader will look in this directory path for files. DirectMusic uses Unicode characters for all naming and filenames, hence the 16-bit WCHAR instead of ASCII char.
§ BOOL fClear: This parameter indicates whether SetSearchDirectory() should flush the Loader's references to files in the old directory. The fClear option guards against files with the same name in different directories. However, if an object is already loaded, the fClear option does not remove it from the cache.
LoadObjectFromFile() You can use the Loader to read a file in two ways: GetObject() and LoadObjectFromFile(). LoadObjectFromFile() is the more convenient approach. GetObject() has all kinds of clever options for loading from files, streams, or memory. However, LoadObjectFromFile() provides a very straightforward way to read DirectX Audio objects directly from files.
    HRESULT LoadObjectFromFile(
        REFGUID rguidClassID,
        REFIID iidInterfaceID,
        WCHAR *pwzFilePath,
        void ** ppObject
    );
§ REFGUID rguidClassID: This is the class ID for the type of object to load. The Loader does not know anything about how to load a particular type of file. Instead, it calls COM to create an instance of the object, which then does the actual file reading and, when finished, returns the object to the caller. For example, to load a Segment, this parameter is CLSID_DirectMusicSegment. Note that GUID_DirectMusicAllTypes is not valid for LoadObjectFromFile() because it does not define a specific object type.
§ REFIID iidInterfaceID: You must also provide an interface identifier. An interface of the requested type returns in ppObject. For example, when loading a Segment, you can request the standard Segment interface IDirectMusicSegment8 with IID_IDirectMusicSegment8.
§ WCHAR *pwzFilePath: The file path provides the file name and can be a complete path or one relative to the search directory as set with SetSearchDirectory(). The Loader first assumes that the file name is a complete path and tries to load from that absolute address. If that fails, it concatenates the file name onto the current search directory and tries to look there.
§ void ** ppObject: The loaded object returns in ppObject, which is a pointer to the variable that receives the returned object interface.
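The two-step lookup described for pwzFilePath is easy to model. The sketch below is a guess at the logic based only on the description above, not the Loader's actual code. ResolveFilePath() and FakeDisk() are hypothetical helpers, and the paths are invented for the example. The file-system check is passed in as a function so the sketch runs anywhere.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper, not part of DirectMusic: returns the path that would be
// opened, or an empty string if neither candidate exists.
std::wstring ResolveFilePath(const std::wstring& fileName,
                             const std::wstring& searchDirectory,
                             const std::function<bool(const std::wstring&)>& fileExists) {
    // First attempt: treat the name as a complete path.
    if (fileExists(fileName)) return fileName;

    // Second attempt: append the name to the current search directory.
    std::wstring candidate = searchDirectory;
    if (!candidate.empty() && candidate.back() != L'\\' && candidate.back() != L'/')
        candidate += L'\\';
    candidate += fileName;
    if (fileExists(candidate)) return candidate;

    return std::wstring();  // not found in either place
}

// Stand-in for a real file-system check: pretends exactly one file exists.
bool FakeDisk(const std::wstring& path) {
    return path == L"C:\\Media\\Hello.wav";
}
```

For example, ResolveFilePath(L"Hello.wav", L"C:\\Media", FakeDisk) falls through to the search directory and returns the full path, while a name that is already a complete path is returned unchanged regardless of the search directory.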
GetObject() While LoadObjectFromFile() does the job 90 percent of the time, there are scenarios where a little more flexibility is needed.
§ The Loader can identify every file with alternate information, including a name, version, date stamp, unique GUID, and, of course, file name. The application may need to use one or more of these other variables to search for the file, especially if it is already cached. LoadObjectFromFile() can only use the file name.
§ The data might be stored in a memory buffer or stream instead of a file. LoadObjectFromFile() can only load from a file.
GetObject() provides a way to read from alternate sources. It does so by using the DMUS_OBJECTDESC structure, which allows for a very thorough description of the object.
    typedef struct _DMUS_OBJECTDESC {
        DWORD dwSize;            // Size of this, should it grow in later versions.
        DWORD dwValidData;       // Flags indicating which fields have valid data.
        GUID guidObject;         // Every object can have a unique ID.
                                 // This is saved in the file.
        GUID guidClass;          // Every object must belong to a predefined class.
        FILETIME ftDate;         // The date when the object was created.
        DMUS_VERSION vVersion;   // Major and minor version info.
        WCHAR wszName[DMUS_MAX_NAME];           // Friendly name of object.
        WCHAR wszCategory[DMUS_MAX_CATEGORY];   // Optional category name.
        WCHAR wszFileName[DMUS_MAX_FILENAME];   // The file name, optionally with path.
        LONGLONG llMemLength;    // If memory is used instead of a file, sets length.
        LPBYTE pbMemData;        // Pointer to memory data.
        IStream *pStream;        // Or, stream to read data from.
    } DMUS_OBJECTDESC, *LPDMUS_OBJECTDESC;
Flag bits in dwValidData indicate which fields are valid. For example, the flag DMUS_OBJ_NAME means that the string in wszName is a valid name for the object. Conversely, a cleared bit in dwValidData indicates that the corresponding data field is empty. The fields that are most important for tracking resources are wszFileName and guidObject. These help identify and validate the correct file. The GUID is guaranteed to be unique but must be authored into the files, typically via DirectMusic Producer. In addition to identification fields, the caller can use pbMemData and pStream to present data in a memory or stream format. We cover an example of loading a Segment from memory later in this chapter (see the "Loading from a Resource" section).
    HRESULT GetObject(
        LPDMUS_OBJECTDESC pDesc,
        REFIID riid,
        LPVOID FAR * ppObject
    );
§ LPDMUS_OBJECTDESC pDesc: This is a pointer to a DMUS_OBJECTDESC structure that describes the object.
§ REFIID riid: Like LoadObjectFromFile() or even CoCreateInstance(), you must provide the interface ID for the interface that you intend to get back. For example, to get back a Segment interface, pass in IID_IDirectMusicSegment8.
§ void **ppObject: The loaded object returns in ppObject, which is a pointer to the variable that receives the returned object interface.
Note
The DMUS_OBJECTDESC structure is quite large because of all the Unicode strings embedded within it. If you are managing many structures that have DMUS_OBJECTDESC embedded within them, be aware that it eats up 848 bytes for each instance.
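The 848-byte figure can be checked by hand. The sketch below mirrors the layout of DMUS_OBJECTDESC with fixed-width types so that it compiles on any platform. It rests on assumptions the text does not state: 32-bit pointers, a 16-bit WCHAR, and array sizes of 64, 64, and 260 for DMUS_MAX_NAME, DMUS_MAX_CATEGORY, and DMUS_MAX_FILENAME. The flag constants kObjName and kObjFileName are illustrative stand-ins for DMUS_OBJ_NAME and DMUS_OBJ_FILENAME, and real code should take the values from the SDK header. DescribeFile() and HasField() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fixed-width mirrors of the Windows types (assumed sizes, see the text above).
struct GuidMirror     { std::uint32_t d1; std::uint16_t d2, d3; std::uint8_t d4[8]; };  // 16 bytes
struct FileTimeMirror { std::uint32_t low, high; };                                     //  8 bytes
struct VersionMirror  { std::uint32_t ms, ls; };                                        //  8 bytes

struct ObjectDescMirror {
    std::uint32_t  dwSize;            //   4
    std::uint32_t  dwValidData;       //   4
    GuidMirror     guidObject;        //  16
    GuidMirror     guidClass;         //  16
    FileTimeMirror ftDate;            //   8
    VersionMirror  vVersion;          //   8
    char16_t       wszName[64];       // 128
    char16_t       wszCategory[64];   // 128
    char16_t       wszFileName[260];  // 520
    std::int64_t   llMemLength;       //   8
    std::uint32_t  pbMemData;         //   4 (stand-in for a 32-bit pointer)
    std::uint32_t  pStream;           //   4 (stand-in for a 32-bit pointer)
};
static_assert(sizeof(ObjectDescMirror) == 848, "layout assumptions do not add up to 848");

// Illustrative flag values only; use the DMUS_OBJ_* constants from the SDK.
constexpr std::uint32_t kObjName     = 1u << 2;
constexpr std::uint32_t kObjFileName = 1u << 4;

// Fills in a description that identifies an object by file name alone.
ObjectDescMirror DescribeFile(const char16_t* fileName) {
    ObjectDescMirror desc{};  // zero everything, so all flag bits start cleared
    desc.dwSize = static_cast<std::uint32_t>(sizeof(desc));
    std::size_t i = 0;
    for (; fileName[i] != 0 && i + 1 < 260; ++i) desc.wszFileName[i] = fileName[i];
    desc.wszFileName[i] = 0;
    desc.dwValidData = kObjFileName;  // only the file name field holds valid data
    return desc;
}

// True if the given flag bit is set in dwValidData.
bool HasField(const ObjectDescMirror& desc, std::uint32_t flag) {
    return (desc.dwValidData & flag) != 0;
}
```

Most of the size comes from the three string arrays, which account for 776 of the 848 bytes under these assumptions.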
SetObject() Sometimes, you need to point the Loader at a file source before another object references it. This could be because the file path is broken or, more likely, the referenced object loads from memory or an IStream. Therefore, you need to point the Loader directly at it using SetObject(). This is almost identical to GetObject(), except it does not actually load anything. It just tells the Loader where the object is, and the Loader pulls it in later when needed.
    HRESULT SetObject(
        LPDMUS_OBJECTDESC pDesc
    );
§ LPDMUS_OBJECTDESC pDesc: This is a pointer to a DMUS_OBJECTDESC structure that describes the object. In particular, the location of the object, be it a file path (wszFileName), memory pointer (pbMemData), or IStream (pStream), must be filled in and the appropriate flags in dwValidData must be set. SetObject() will use that to parse the file header to retrieve additional information, including the name and GUID.
Managing the File List The Loader maintains a list of all files that it knows about. This list serves two purposes:
§ To keep track of where various files are located so that when the caller requests a resource, the Loader can immediately find and load that resource.
§ To hold on to loaded file objects so that multiple requesters can share them. We refer to this as the cache. When the caller requests a cached object, the Loader immediately returns the object with its reference count incremented.
There are several methods of the IDirectMusicLoader interface that give you control over the file list.
ScanDirectory() ScanDirectory() searches a directory for a specific type of file, as defined by its class ID and file extension (the * wildcard is supported). For example, you might use it to search for all Style files within a particular directory. This does not load any of the files. Instead, it adds all their location information to the file list. To do this, it parses each file header, looking for all pertinent information to fill the DMUS_OBJECTDESC record for each file. Therefore, internal information, including object name and GUID, is retrieved and stored in the list. ScanDirectory() is useful primarily in conjunction with EnumObject(), which lists the objects one by one. This is useful for applications that need to display all available files of a particular type. This is great for media playback and authoring applications but not needed for games that usually have dedicated soundtracks. As long as file names are unchanged and SetSearchDirectory() is used appropriately to establish where to find them, ScanDirectory() only adds overhead, so avoid it when you can.
EnumObject() EnumObject() scans through the file list and retrieves information about all files of a specific file type, identified by its class ID. For example, an application might use this to present the user with a list of all playable media files. It would do so by calling ScanDirectory() to scan for all Segment files (which can include wave and MIDI files as well) and then calling EnumObject() to list them.
EnableCache() EnableCache() and ClearCache() are very important for effective memory management. When the Loader opens a file, it keeps an internal reference to the loaded object in its file list. The Loader effectively caches the file, so a subsequent request simply receives a pointer to the original object. This is all fine and well, but it has its cost. Some file types, like Segments, are rarely, if ever, referenced by anything else. Therefore, it is usually just a memory waste to cache them. EnableCache() lets the application decide which file types the Loader should cache. File types that typically are referenced by others, like Styles and DLS Collections, are great candidates for Loader caching since it ensures that just one instance of a Style (for example) is created and then referenced by all objects that use it. Segments are not.
Note
It is not always the case that there are no objects referencing Segments. Both Script objects and Segment Trigger Tracks can reference Segments. In those situations, enabling caching may have merit. Even then, it is rare to have multiple Scripts or Segments referencing the same Segment.
ClearCache() ClearCache() provides a quick mechanism for clearing out all objects of a particular type. For example, you might call ClearCache() for some or even all file types when moving from one scene to another in a game.
Note
ClearCache() does not remove an object from memory unless the Loader is the only entity referencing it. For example, if you call ClearCache() for all DLS Collections, it will remove all references from the Loader, but any DLS Collections currently in use by loaded Segments will continue to stay in memory until the last Segment is released, which in turn does the final release on the DLS Collection. This means that music and sound effects will continue to play without a glitch. Ah, the beauty of COM….
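The interplay between the cache and reference counting can be modeled in a few lines. The sketch below is a toy model, not how DirectMusic is implemented: ToyLoader and Asset are hypothetical, std::shared_ptr stands in for COM reference counting, and the file names are invented. It also assumes, as the LoaderView walkthrough later in this chapter describes, that disabling the cache clears it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Asset { std::string fileName; };

// Toy model of the Loader's cache. std::shared_ptr plays the role of COM
// reference counting.
class ToyLoader {
public:
    int diskReads = 0;  // how many times a file had to be "read"

    std::shared_ptr<Asset> Load(const std::string& fileName) {
        auto it = cache_.find(fileName);
        if (it != cache_.end()) return it->second;  // cache hit: share the one instance
        ++diskReads;                                // cache miss: read the file
        auto asset = std::make_shared<Asset>(Asset{fileName});
        if (cachingEnabled_) cache_[fileName] = asset;
        return asset;
    }
    void EnableCache(bool enable) {
        cachingEnabled_ = enable;
        if (!enable) cache_.clear();  // disabling the cache also clears it
    }
    void ClearCache() { cache_.clear(); }  // drops only the Loader's own references

private:
    bool cachingEnabled_ = true;
    std::map<std::string, std::shared_ptr<Asset>> cache_;
};

// Returns true if an asset still held by a caller survives ClearCache() and
// disappears once that caller lets go of it.
bool AssetSurvivesClearCache() {
    ToyLoader loader;
    std::shared_ptr<Asset> dls = loader.Load("Drums.dls");  // held by a "Segment"
    std::weak_ptr<Asset> watcher = dls;                     // observes without owning
    loader.ClearCache();                                    // Loader drops its reference
    const bool aliveWhileInUse = !watcher.expired();
    dls.reset();                                            // last user releases it
    return aliveWhileInUse && watcher.expired();
}

// Counts the reads needed when the same file is requested twice.
int ReadsForTwoRequests(bool cachingEnabled) {
    ToyLoader loader;
    loader.EnableCache(cachingEnabled);
    auto first = loader.Load("Rock.sty");
    auto second = loader.Load("Rock.sty");
    return loader.diskReads;
}
```

In this model, two requests for the same Style cost one read with caching on and two with caching off, and an asset that is still in use outlives ClearCache(), which matches the behavior described in the Note above.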
ReleaseObject()/ ReleaseObjectByUnknown() ReleaseObject() and ReleaseObjectByUnknown() can be used in lieu of ClearCache() to release a specific object from the cache. Use these when you need to remove one but not all objects from the cache. These methods are identical in functionality. ReleaseObject() requires the IDirectMusicObject interface, which all loadable objects must support. However, it is obnoxious to have to call QueryInterface() to get that interface in order to then call ReleaseObject(). So, with DirectX 8.0, ReleaseObjectByUnknown() was added to the API. It lets you use the regular interface for the object, and it internally queries for the IDirectMusicObject interface.
Garbage Collection In DX8, DirectMusic introduced three features that could easily cause reference loops. These are Scripting, Script Tracks, and Segment Trigger Tracks. For example, one script might reference another script that calls back into the first. Likewise, a Segment Trigger Track might trigger the same Segment or another that triggers the original Segment again. These are all perfectly reasonable scenarios, and they do load and run appropriately. However, when it comes time to clean up, a self-referencing loop keeps the reference counts of all objects in the loop incremented by at least one, and so they never go away, even though nothing actually references them outside the loop. The CollectGarbage() method solves this problem. CollectGarbage() scans for all objects that have been completely released by the Loader as well as the application and calls an internal method on the object, forcing it to release its references to other objects and thus breaking any cyclic references and removing unneeded objects from memory.
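A reference loop is easy to reproduce with any reference-counted pointer. The sketch below uses std::shared_ptr in place of COM and is only an analogy for what CollectGarbage() is described as doing. ScriptNode, CycleReport, and DemonstrateCycle() are hypothetical names, and the weak-pointer list stands in for whatever bookkeeping the real Loader keeps.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A stand-in for a script that holds counted references to other scripts.
struct ScriptNode {
    static int alive;
    std::vector<std::shared_ptr<ScriptNode>> references;
    ScriptNode() { ++alive; }
    ~ScriptNode() { --alive; }
    // Plays the role of the internal method that makes an object let go of
    // everything it references.
    void ReleaseReferences() { references.clear(); }
};
int ScriptNode::alive = 0;

struct CycleReport {
    int aliveBeforeCollect;  // objects kept alive only by the loop
    int aliveAfterCollect;   // objects left once the loop is broken
};

CycleReport DemonstrateCycle() {
    const int baseline = ScriptNode::alive;
    std::vector<std::weak_ptr<ScriptNode>> tracked;  // the collector's bookkeeping
    {
        auto a = std::make_shared<ScriptNode>();
        auto b = std::make_shared<ScriptNode>();
        a->references.push_back(b);  // a calls into b
        b->references.push_back(a);  // b calls back into a
        tracked.push_back(a);
        tracked.push_back(b);
    }  // the application's own references go away here

    CycleReport report{};
    report.aliveBeforeCollect = ScriptNode::alive - baseline;

    // "Collect garbage": ask each surviving object to drop its references.
    for (auto& weak : tracked) {
        if (auto node = weak.lock()) node->ReleaseReferences();
    }
    report.aliveAfterCollect = ScriptNode::alive - baseline;
    return report;
}
```

Before the collection pass, both objects are still alive even though nothing outside the loop can reach them. After it, both are gone.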
LoaderView Sample Application The LoaderView sample demonstrates both how to load files with the Loader and how to manage the cache. In truth, it goes a little overboard demonstrating all of the caching features, but by playing with this, you can get a good understanding of how the Loader manages its file list and how you can consequently control it for your application. LoaderView is a windowed application that lets you open a Segment, play it, view the Loader's internal lists of all object types that were loaded as a result of loading the Segment, and then play with the object lists using the standard Loader methods and observe the behavior. You can run LoaderView by either going to the Unit II\9_Loader directory and building LoaderView or running it directly from the Bin directory. LoaderView opens a simple dialog window with commands for loading a Segment and managing the Loader (see Figure 9-2).
Figure 9-2: LoaderView. Let's first explore LoaderView, discuss how it uses the various Loader features, and then look at the code. Click on the Open… button to open a File dialog.
Search for a Segment, MIDI, or wave file to load. Files that reference other files are particularly interesting. For example, EvilDonuts.sgt has references to Style, DLS, and wave files. When you click on the Open… button, the Loader's GetObjectFromFile() method is called to load the Segment. Once the file is loaded, its name is listed in the box underneath the Open… button.
Click on the Play and Stop buttons to audition the Segment. We talk a lot more about playing Segments in the next chapter, but at least we can audition our Segment for now. With the Loader being as dull a subject as it is, it helps to be able to make noise. Now, let's look at what is in the Loader's cache.
The Type drop-down lets you choose which media type to enumerate and display. Click on the list to choose from Segments, Waves, DLS Collections, Styles, and quite a few others.
The list box below the Type drop-down displays all files of the media type that you requested that the Loader is currently managing. Choose a list that displays one or more files. For example, "Evil Donut" uses two DLS Collections, so you might try that. Note that they all have "-cached" after their name. This indicates that the Loader currently has a reference to an instance of the object loaded in memory. You can turn off caching. Uncheck the Enable Cache option. This calls the EnableCache() method with a false flag to disable it. Disabling a cache also automatically clears it, so notice now that the display lists these files no longer as cached. Remember: Once the Loader does not have the file cached, it must reload the file when the application subsequently requests the file. This incurs file and CPU overhead and extra memory overhead if the object is still in use since there are two copies of the object in memory once the second instance loads. Go back and press the Play button. Notice that the music still plays. Although the Loader no longer has its references to the objects in memory (Style, DLS, etc.), the Segment still does, so it plays flawlessly. Later, when the Loader and application finally release the Segment, it in turn releases all objects that it references. Ah, such is the power and beauty of COM. Likewise, you can release an individual item. Go to another list, such as DLS Collections, that has two or more objects. Click on one to select it, and then click on the Release Item button. Note that just the one object loses its "cached" designation. Finally, click on the Scan Files button. This calls ScanDirectory() for the selected file type in the current search directory. The list updates with all additional files found in the directory. Notice that ScanDirectory() only retrieves the file information and does not load and cache the file objects themselves.
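The caching behavior described above can be modeled in a few lines of portable C++. This is only a toy sketch, not the real Loader: ToyLoader, Media, and the file name are hypothetical, and std::shared_ptr stands in for COM reference counting. It shows the two effects worth remembering: an object stays alive as long as someone else still holds it, and a request made after the cache is cleared produces a second instance.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a loaded object such as a Style or DLS Collection.
struct Media {
    std::string file;
};

// Toy cache that mirrors the behavior described in the text.
class ToyLoader {
public:
    // Returns the cached instance if there is one; otherwise "loads" a new one.
    std::shared_ptr<Media> Get(const std::string& file) {
        auto it = m_cache.find(file);
        if (it != m_cache.end()) return it->second;  // Cache hit: same instance.
        auto obj = std::make_shared<Media>();        // Pretend to read the file.
        obj->file = file;
        ++m_loads;
        if (m_enabled) m_cache[file] = obj;
        return obj;
    }

    // Disabling the cache also clears it.
    void EnableCache(bool enable) {
        m_enabled = enable;
        if (!enable) m_cache.clear();
    }

    bool IsCached(const std::string& file) const { return m_cache.count(file) != 0; }
    int LoadCount() const { return m_loads; }

private:
    std::map<std::string, std::shared_ptr<Media>> m_cache;
    bool m_enabled = true;
    int m_loads = 0;
};
```

If a caller keeps the pointer returned by Get() and then calls EnableCache(false), the object remains usable, but the next Get() for the same file returns a different instance and LoadCount() goes up by one. That second load is the extra file, CPU, and memory overhead the text warns about.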
LoaderView Source
The CAudio class manages all audio and Loader functionality. The LoaderView UI (CLoaderDlg) makes calls into CAudio to load and play Segments. I wrote the code using MFC (Microsoft Foundation Classes). We do not discuss the UI code here. I do my best to keep the UI code and the audio code separated in all programming examples, so we can focus on just the audio portions. Of course, I include the UI source in the LoaderView source code project in the Unit II\9_Loader directory on the companion CD. CAudio also tracks a table of object descriptors (DMUS_OBJECTDESC) for the currently selected object type and provides calls to manipulate the list. This is not enormously useful for regular applications, but it is useful for exploring the Loader's caching behavior, which is what LoaderView is all about. We walk through the layout of CAudio in just a second.
Predefined Object Types
Although the Loader is completely format agnostic, for convenience we concentrate on nine familiar DirectX Audio formats. To make communication between the UI and CAudio as simple as possible, we predefine these.

    // Define the nine categories of DirectX Audio objects we will track
    // in the loader's file list.
    enum OBJECT_TYPE
    {
        OT_AUDIOPATH = 0,   // AudioPath configuration file.
        OT_BAND = 1,        // Band file.
        OT_CHORDMAP = 2,    // ChordMap file.
        OT_CONTAINER = 3,   // Container - carries a collection of other objects.
        OT_DLS = 4,         // DLS Instrument Collection file.
        OT_SEGMENT = 5,     // Segment file.
        OT_STYLE = 6,       // Style file.
        OT_WAVE = 7,        // Wave file.
        OT_GRAPH = 8        // ToolGraph file.
    };

    #define OBJECT_TYPES 9
CAudio Class The CAudio class manages the Loader, the Performance, and one Segment. It has routines for initialization and shutdown. It has routines to load and play Segments. It has a bunch of routines dedicated to exploring and managing the Loader's file lists. Since we cover initialization and playback in other chapters, here we focus just on the code for managing the Loader.
Here is the full definition of the CAudio class:

    class CAudio
    {
    public:
        // Initialization and shutdown methods.
        CAudio();
        ~CAudio();
        HRESULT Init();
        void Close();

        // Convenience methods for accessing DirectX Audio interfaces.
        IDirectMusicPerformance8 * GetPerformance() { return m_pPerformance; };
        IDirectMusicLoader8 * GetLoader() { return m_pLoader; };

        // Methods for loading, playing, stopping, and removing Segments.
        IDirectMusicSegment8 * LoadSegment(WCHAR *pwzFileName);
        IDirectMusicSegment8 * GetSegment() { return m_pSegment; };
        void PlaySegment();
        void StopSegment();
        void ClearSegment();

        // Methods for managing the loader's file lists and retrieving data from it.
        DWORD SetObjectType(OBJECT_TYPE otSelected);
        void GetObjectInfo(DWORD dwIndex, char *pszName, BOOL *pfCached);
        BOOL GetCacheStatus() { return m_afCacheEnabled[m_otSelected]; };
        void ChangeCacheStatus();
        void ReleaseAll();
        void ReleaseItem(DWORD dwItem);
        void ScanDirectory();

    private:
        // DirectX Audio interfaces for performance, loader, and Segment.
        IDirectMusicPerformance8 * m_pPerformance;  // The Performance.
        IDirectMusicLoader8 * m_pLoader;            // The Loader.
        IDirectMusicSegment8 * m_pSegment;          // Currently loaded Segment.

        // Only one object type is active at a time. This reflects the
        // current choice in LoaderView's type drop-down list.
        OBJECT_TYPE m_otSelected;                   // Currently selected object type.

        // We maintain a table of enumerated objects of the current selected type.
        DMUS_OBJECTDESC * m_aObjectTable;           // Table of object descriptors.
        DWORD m_dwObjectCount;                      // Size of table.

        // We use a static array of class IDs to identify class type in calls to loader.
        static const GUID * m_saClassIDs[OBJECT_TYPES];

        // We need a flag for each object type to indicate whether caching is enabled.
        BOOL m_afCacheEnabled[OBJECT_TYPES];

        // Current search directory.
        WCHAR m_wzSearchDirectory[DMUS_MAX_FILENAME];
    };

The following routines manage the Loader:
• SetObjectType() tells CAudio which object type to view. When the user selects a type from the drop-down list, LoaderView calls this. It builds a table of DMUS_OBJECTDESC structures in m_aObjectTable.
• GetObjectInfo() retrieves name and caching information from a specific entry in m_aObjectTable.
• ReleaseItem() tells the Loader to release its cache reference for just one item.
• ReleaseAll() tells the Loader to release all its cached references.
• ChangeCacheStatus() tells the Loader to change whether or not it is caching the current object type.
• ScanDirectory() tells the Loader to scan the current working directory to build a list of available files.

Before we discuss each of these routines, we must initialize the object type to class ID relationship.
Initialization
CAudio's Init() method creates the Loader and Performance and initializes both. It also disables caching for Segments to conserve memory.

    HRESULT CAudio::Init()
    {
        // Init COM.
        CoInitialize(NULL);

        // Create loader.
        HRESULT hr = CoCreateInstance(
            CLSID_DirectMusicLoader, NULL, CLSCTX_INPROC,
            IID_IDirectMusicLoader8, (void**)&m_pLoader);
        if (SUCCEEDED(hr))
        {
            // Turn off caching of Segments.
            m_pLoader->EnableCache(CLSID_DirectMusicSegment,false);

            // Create performance.
            hr = CoCreateInstance(
                CLSID_DirectMusicPerformance, NULL, CLSCTX_INPROC,
                IID_IDirectMusicPerformance8, (void**)&m_pPerformance);
        }
        if (SUCCEEDED(hr))
        {
            // Once the performance is created, initialize it.
            // We'll use the standard music reverb audio path and give it 128 pchannels.
            hr = m_pPerformance->InitAudio(NULL,NULL,NULL,
                DMUS_APATH_SHARED_STEREOPLUSREVERB, // Default AudioPath type.
                128,DMUS_AUDIOF_ALL,NULL);
        }
        if (FAILED(hr))
        {
            // No luck, give up.
            Close();
        }
        return hr;
    }

We also have a static array of class IDs that translates from predefined integers to GUIDs. It is much more convenient to work with predefined integers because they directly correlate with positions in the UI's drop-down list of object types. The entries must appear in the same order as the OBJECT_TYPE values.

    const GUID *CAudio::m_saClassIDs[OBJECT_TYPES] =
    {
        &CLSID_DirectMusicAudioPathConfig,  // OT_AUDIOPATH
        &CLSID_DirectMusicBand,             // OT_BAND
        &CLSID_DirectMusicChordMap,         // OT_CHORDMAP
        &CLSID_DirectMusicContainer,        // OT_CONTAINER
        &CLSID_DirectMusicCollection,       // OT_DLS
        &CLSID_DirectMusicSegment,          // OT_SEGMENT
        &CLSID_DirectMusicStyle,            // OT_STYLE
        &CLSID_DirectSoundWave,             // OT_WAVE
        &CLSID_DirectMusicGraph             // OT_GRAPH
    };
SetObjectType()
SetObjectType() is called when a new file class is requested for display. It sets the variable m_otSelected to the new object type and then builds the m_aObjectTable table of DMUS_OBJECTDESC structures by using the Loader's EnumObject() method.

    DWORD CAudio::SetObjectType(OBJECT_TYPE otSelected)
    {
        // We track the objects in an array, so delete the current array.
        delete [] m_aObjectTable;
        m_aObjectTable = NULL;
        m_dwObjectCount = 0;
        m_otSelected = otSelected;
        HRESULT hr = S_OK;

We need to know how many objects the Loader has of the current type before we can allocate the memory for the array. However, the Loader does not provide a routine for directly returning this information. Therefore, we use the enumeration method and scan until it returns S_FALSE, indicating that we passed the range. The loop increments the count once more after the final call, so we subtract one afterward.

        // First, find out how many objects there are
        // so we can allocate the right size table.
        for (;hr == S_OK;m_dwObjectCount++)
        {
            DMUS_OBJECTDESC Dummy;
            Dummy.dwSize = sizeof(DMUS_OBJECTDESC);
            hr = m_pLoader->EnumObject(*m_saClassIDs[m_otSelected],m_dwObjectCount,&Dummy);
        }
        m_dwObjectCount--;
        // Now we know how many, so allocate the table.
        if (m_dwObjectCount)
        {
            m_aObjectTable = new DMUS_OBJECTDESC[m_dwObjectCount];
            if (m_aObjectTable)
            {

Once we have the array, filling it is simply a matter of calling EnumObject() a second time, this time iterating through the array.

                // Then fill the table with the object descriptors.
                DWORD dwIndex = 0;
                for (dwIndex = 0;dwIndex < m_dwObjectCount;dwIndex++)
                {
                    // Set the .dwSize field so the loader knows how big it really is and
                    // therefore which fields can be written to.
                    m_aObjectTable[dwIndex].dwSize = sizeof(DMUS_OBJECTDESC);
                    m_pLoader->EnumObject(*m_saClassIDs[m_otSelected],
                        dwIndex,&m_aObjectTable[dwIndex]);
                }
            }
        }

Finally, return the number of objects counted. The UI can use this information when displaying the list of objects.

        return m_dwObjectCount;
    }
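As a variation on the count-then-fill approach above, the same enumeration can be done in a single pass by appending to a growable container. The sketch below is not the LoaderView code and does not call DirectMusic. FakeLoader, Desc, and the Result values are hypothetical stand-ins that only mimic the convention described in the text: the enumerator succeeds while the index is in range and returns a "false" code once it has passed the end.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins so the sketch compiles without the DirectX SDK.
enum Result { kOk = 0, kFalse = 1 };

struct Desc {
    std::string name;
};

class FakeLoader {
public:
    explicit FakeLoader(std::vector<std::string> names) : m_names(std::move(names)) {}

    // Fills *out and returns kOk while index is in range, kFalse afterward.
    Result EnumObject(std::size_t index, Desc* out) const {
        if (index >= m_names.size()) return kFalse;
        out->name = m_names[index];
        return kOk;
    }

private:
    std::vector<std::string> m_names;
};

// Single pass: keep asking for the next descriptor until the enumerator
// reports that the index is out of range.
std::vector<Desc> EnumerateAll(const FakeLoader& loader) {
    std::vector<Desc> table;
    Desc d;
    for (std::size_t i = 0; loader.EnumObject(i, &d) == kOk; ++i) {
        table.push_back(d);
    }
    return table;
}
```

The trade-off is that the container may reallocate as it grows, but the enumerator is called only once per object and there is no manual delete [] to remember.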
GetObjectInfo()
Once the UI has called SetObjectType() to create the table of object descriptors, it needs to populate its list view of those same objects. It calls GetObjectInfo() to iterate through the table and retrieve the name and cache status of each object.

    void CAudio::GetObjectInfo(DWORD dwIndex, char *pszName, BOOL *pfCached)
    {
        if (m_aObjectTable && (dwIndex < m_dwObjectCount))
        {
            // If the wszName field is valid, use it.
            if (m_aObjectTable[dwIndex].dwValidData & DMUS_OBJ_NAME)
            {
                // Convert from Unicode to ASCII.
                wcstombs(pszName,m_aObjectTable[dwIndex].wszName,DMUS_MAX_NAME);
            }
            // Else if the file name is available, use it.
            else if (m_aObjectTable[dwIndex].dwValidData & DMUS_OBJ_FILENAME)
            {
                // First, get the Unicode name.
                WCHAR wzTemp[DMUS_MAX_FILENAME];
                wcscpy(wzTemp,m_aObjectTable[dwIndex].wszFileName);
                // Then, remove the path.
                WCHAR *pwzStart = wcsrchr(wzTemp,'\\');
                if (pwzStart) pwzStart++;
                else pwzStart = wzTemp;
                // Convert to ASCII.
                wcstombs(pszName,pwzStart,DMUS_MAX_NAME);
            }
            // Uh oh. No name.
            else
            {
                strcpy(pszName,"");
            }
            // The DMUS_OBJ_LOADED flag indicates that the resource is currently cached.
            *pfCached = (m_aObjectTable[dwIndex].dwValidData & DMUS_OBJ_LOADED) && TRUE;
        }
    }
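The path-stripping and conversion step can be isolated into a small helper that works with standard C++ only. This is a sketch rather than the LoaderView code: BaseNameNarrow is a hypothetical name, and it returns a std::string instead of writing into a caller-supplied buffer, which sidesteps the question of whether the destination is large enough.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>

// Returns the part of a wide-character path after the last backslash,
// converted to a narrow string. Returns an empty string if the conversion
// fails (for example, a character the current locale cannot represent).
std::string BaseNameNarrow(const wchar_t* path) {
    const wchar_t* start = std::wcsrchr(path, L'\\');
    start = start ? start + 1 : path;  // Skip the backslash if there was one.

    // Reserve the worst-case byte count plus room for the terminator.
    std::string out(std::wcslen(start) * MB_CUR_MAX + 1, '\0');
    std::size_t written = std::wcstombs(&out[0], start, out.size());
    if (written == static_cast<std::size_t>(-1)) return std::string();
    out.resize(written);  // Trim to the bytes actually produced.
    return out;
}
```

For plain ASCII file names, a path such as C:\Media\EvilDonuts.sgt yields EvilDonuts.sgt, and a name with no backslash is returned unchanged.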
ReleaseItem()
ReleaseItem() tells the Loader to release the selected item from its cache. The item will continue to be listed, but the Loader will no longer point to it and have it AddRef()'d. Should the item be requested a second time, the Loader will create a new instance and load it.

    void CAudio::ReleaseItem(DWORD dwItem)
    {
        IDirectMusicObject *pObject;
        if (SUCCEEDED(m_pLoader->GetObject(&m_aObjectTable[dwItem],
            IID_IDirectMusicObject,
            (void **) &pObject)))
        {
            m_pLoader->ReleaseObject(pObject);
            pObject->Release();
        }
    }
ReleaseAll()
ReleaseAll() tells the Loader to clear all objects of the currently selected type from its cache. All objects not currently AddRef()'d by the application will go away. This is the same as calling ReleaseItem() for every item in the list.

    void CAudio::ReleaseAll()
    {
        m_pLoader->ClearCache(*m_saClassIDs[m_otSelected]);
    }
ChangeCacheStatus()
You can tell the Loader to enable or disable caching on a class-by-class basis. ChangeCacheStatus() tells the Loader to flip the caching for the currently selected type. It does so by tracking the cache state with an array of Booleans, one for each object type. It flips the state and calls the Loader's EnableCache() method to enable or disable the caching.

    void CAudio::ChangeCacheStatus()
    {
        // Flip the state in the m_afCacheEnabled table.
        m_afCacheEnabled[m_otSelected] = !m_afCacheEnabled[m_otSelected];
        // Call EnableCache with the new state.
        m_pLoader->EnableCache(*m_saClassIDs[m_otSelected],m_afCacheEnabled[m_otSelected]);
    }
ScanDirectory()
ScanDirectory() tells the Loader to look in the current working directory and read all files of the currently selected type. To do so, it calls the Loader's ScanDirectory() method and passes it the wildcard * extension. This tells the Loader to open every single file in the directory and attempt to parse it with the object defined by the class ID. If the object cannot parse the file, the Loader does not include it in the list. For example, the Segment object knows how to parse Segment, MIDI, and wave files. It fails with all other files, and so the Loader does not include the failed files in the resulting list. There is one case where multiple types can parse the same file. Both the CLSID_DirectSoundWave and CLSID_DirectMusicSegment objects know how to parse a wave file.

    void CAudio::ScanDirectory()
    {
        m_pLoader->ScanDirectory(*m_saClassIDs[m_otSelected],L"*",NULL);
    }

That concludes our tour of the LoaderView application. One last thing before we go…
Loading from a Resource
You can embed audio objects directly in the executable. To do so, compile them into the executable as resources and then use the Loader's ability to read objects from memory pointers. As you probably know, all buttons, menus, dialogs, and other user interface elements are stored in the program's executable as resources. You create resources in a resource editor that provides tools for visually laying out the program's user interface. The resource editor is part of the development environment. The final stage of program compilation binds the resource data into the executable. The application makes standard calls to access and use the resource data.

The mechanism for storing and retrieving resource elements is not limited to menus, icons, and their ilk. You can store just about any data you want in a resource and have your application access it when needed. This provides the convenience of binding all data directly into the executable so there are no extra files to distribute with the program.

To demonstrate this, let's use our familiar HelloWorld example from the previous chapter and embed the wave as a binary resource. Once it uses resources, our HelloWorld application can no longer be a console app (a console application is a simple command-line program that does not make any calls into the Windows libraries). Calls to retrieve resources require the Windows APIs. The biggest change in that regard is that main() is replaced with WinMain(). Then, to insert the wave, use the resource editor's option to import a binary file as a resource (it actually has a specific option for wave files but not for other DirectMusic media types). To access the wave resource, use the system calls FindResource() to locate it, LoadResource() to load it, and LockResource() to return a pointer to its memory. Here is the source code:

    #include <windows.h>    // We need the Windows headers for Windows calls and structures.
    #include "resource.h"   // Resource IDs.
    #include <dmusici.h>

    int APIENTRY WinMain(HINSTANCE hInstance,
                         HINSTANCE hPrevInstance,
                         LPSTR     lpCmdLine,
                         int       nCmdShow)
    {
        // The performance manages all audio playback.
        IDirectMusicPerformance8* pPerformance = NULL;
        // The loader manages all file I/O.
        IDirectMusicLoader8* pLoader = NULL;
        // Each audio file is represented by a Segment.
        IDirectMusicSegment8* pSegment = NULL;

        // Initialize COM
        CoInitialize(NULL);

        // Create the loader. This is used to load audio objects from files.
        CoCreateInstance(CLSID_DirectMusicLoader, NULL,
            CLSCTX_INPROC, IID_IDirectMusicLoader8,
            (void**)&pLoader);

        // Create the performance. This manages the playback of sound and music.
        CoCreateInstance(CLSID_DirectMusicPerformance, NULL,
            CLSCTX_INPROC, IID_IDirectMusicPerformance8,
            (void**)&pPerformance);

        // Initialize the performance.
        pPerformance->InitAudio(NULL,NULL,NULL,
            DMUS_APATH_DYNAMIC_STEREO,  // Default AudioPath type.
            2,                          // Only two pchannels needed for this wave.
            DMUS_AUDIOF_ALL,NULL);

        // Okay, let's read the wave from an embedded resource...
        DMUS_OBJECTDESC ObjDesc;
        // Find the wave resource in the executable.
        HRSRC hFound = FindResource(NULL, "IDR_WAVE", RT_RCDATA);
        // Force it to load into global memory.
        HGLOBAL hRes = LoadResource(NULL, hFound);
        ObjDesc.dwSize = sizeof(DMUS_OBJECTDESC);
        // Must let the loader know what kind of object this is.
        ObjDesc.guidClass = CLSID_DirectMusicSegment;
        // The only valid fields are the class ID and memory pointer.
        ObjDesc.dwValidData = DMUS_OBJ_CLASS | DMUS_OBJ_MEMORY;
        // Get the memory pointer from the resource.
        ObjDesc.pbMemData = (BYTE *) LockResource(hRes);
        // Get the memory length from the resource.
        ObjDesc.llMemLength = SizeofResource(NULL, hFound);

        // Now, read the Segment from the resource.
        if (SUCCEEDED(pLoader->GetObject(
            &ObjDesc, IID_IDirectMusicSegment8,
            (void**) &pSegment)))
        {
            // Install the wave data in the synth.
            pSegment->Download(pPerformance);
            // Play the wave.
            pPerformance->PlaySegmentEx(
                pSegment,   // Segment to play.
                NULL,NULL,0,0,NULL,NULL,NULL);
            // Wait eight seconds to let it play.
            Sleep(8000);
            // Unload wave data from the synth.
            pSegment->Unload(pPerformance);
            // Release the Segment from memory.
            pSegment->Release();
        }

        // Done with the performance. Close it down, and then release it.
        pPerformance->CloseDown();
        pPerformance->Release();
        // Same with the loader.
        pLoader->Release();
        // Bye bye COM.
        CoUninitialize();
        return 0;
    }

This all works fine and dandy if the resource you are loading is completely self-contained, which is the case with our example wave file. However, if the Segment references other files, they too need to be prepared as resources and handed to the Loader using SetObject() prior to the call to GetObject(). Follow exactly the same steps as before to locate each referenced resource and initialize its descriptor, but call SetObject() instead of GetObject(). SetObject() just tells the Loader where the file object is placed in memory so that the Loader can pull it in later when it needs to.

Loading from a resource is not the only way to load from memory. You can read anything you want into a chunk of memory and then tell the Loader to read it. However, be very
careful; after the GetObject() call has completed, the Loader may continue to need access to the memory. This is particularly true with waves and DLS instruments that DirectMusic does not actually read until the application instructs it to download a Segment that uses a wave or DLS instrument from these files. If you call GetObject() with a memory pointer and then release the memory, you could get a nasty crash later when you call download on a Segment that references it. To be safe, wait until all download calls have occurred and you are 100 percent confident that there will be no more. Then, clear the Loader's reference with a call to SetObject() with the same DMUS_OBJECTDESC descriptor but with NULL in pbMemData.

Here is some sample code that plays a wave file that has been placed in memory (*pbData). Notice how it calls SetObject() after downloading the wave data to ensure that the data cannot be pulled from memory later.

    IDirectMusicPerformance8 *g_pPerformance;
    IDirectMusicLoader *g_pLoader;

    IDirectMusicSegment8 *PlayMemWave(BYTE *pbData,DWORD dwLength)
    {
        IDirectMusicSegment8 *pSegment = NULL;
        DMUS_OBJECTDESC ObjDesc;
        ObjDesc.dwSize = sizeof(DMUS_OBJECTDESC);
        // Must let the loader know what kind of object this is.
        ObjDesc.guidClass = CLSID_DirectMusicSegment;
        // The only valid fields are the class ID and memory pointer.
        ObjDesc.dwValidData = DMUS_OBJ_CLASS | DMUS_OBJ_MEMORY;
        // Assign the memory pointer.
        ObjDesc.pbMemData = pbData;
        // Assign the memory length.
        ObjDesc.llMemLength = dwLength;

        // Now, read the Segment from memory.
        if (SUCCEEDED(g_pLoader->GetObject(
            &ObjDesc, IID_IDirectMusicSegment8,
            (void**) &pSegment)))
        {
            // At this point, the Segment data has been read, but any wave
            // data is still pending reading from the file via the download
            // command. If we free the memory before this next call, it
            // will cause a crash.
            pSegment->Download(g_pPerformance);
            // Both DLS and wave downloads cause a copy of the memory to
            // be made, so at this point we can free the resource memory.
            // However, this does not apply to streamed waves, which always
            // read from the source every time they play.
            // To be safe, make sure the loader can't find the memory any more.
            // SetObject() takes only the descriptor.
            ObjDesc.pbMemData = NULL;
            g_pLoader->SetObject(&ObjDesc);
            // Play the wave.
            g_pPerformance->PlaySegmentEx(
                pSegment,   // Segment to play.
                NULL,NULL,0,0,NULL,NULL,NULL);
        }
        // Return the Segment. Be sure to unload and release it later.
        return pSegment;
    }

With a better understanding of how the Loader manages files, you are better equipped to avoid some of the pitfalls and downright enigmatic behavior that might test your wits. Although we covered a great deal in this chapter, we still have not touched on the reverse story: how you can replace the Loader with one of your own. For some applications (in particular games), it can be useful to do this because you can then completely control the delivery of game audio resources and even compress or encrypt them in the process. Although we do not cover it in this book, there is a great article and programming example on MSDN. Go to http://msdn.microsoft.com and search for "Custom Loading in DirectMusic."
Chapter 10: Segments
Overview
Todor Fay
In the previous chapters, we explored the rudiments of playing Segments because without playing Segments, you don't have sound! So, we've already skimmed the essentials: loading, downloading, playing, and stopping a Segment. Now it's time to dig in and really understand what makes a Segment tick because the Segment is at the core of DirectMusic's power. Close your eyes, dear student, take a deep breath, and listen very closely to what I am about to say: Use this knowledge judiciously, and you will unlock the secrets of the musical universe. Use it frivolously, and you shall unlock a plague of discordant cacophony! Okay, maybe that's overstating it a bit. Let's just say Segments are fun. Then we'll have some real fun writing a Segment-centric application that explores some of the Segment's more interesting features.
The Tao of Segment Every sound effect or phrase of music that DirectX Audio plays starts as a Segment. So far, we've only played Segments one at a time. But that won't do in a complex application environment. In interactive media, things happen spontaneously, so you can't have a prerecorded (read: canned) audio or music score. The soundtrack must be assembled in real time as the experience takes place. The obvious first step toward such a solution is to have many little sound and music snippets and trigger them as the user goes along. Indeed, that is why Segments are called "Segments"; the intention is each represents just a snippet, or Segment, of the entire sound effect or music score.
Figure 10-1: Multiple overlapping Segments combine to create the complete score. But, that's not enough. In any interactive sound scenario, there are requirements for reactivity, context, variability, and structure for both music and sound effects. Reactivity means that the sounds or music must be capable of responding to immediate actions and changes in the story. Context means that new sounds or musical phrases take into consideration the rest of the audio score, so they reinforce each other, rather than clash. Variability means that sounds and music don't sound exactly the same every time, which would quickly become unnatural. Structure means that sounds and especially music have different levels of inherent timelines. Let's look at sound effects first. By definition, sound effects are reactive, driven by the unfolding story. When something happens, be it a car crashing or a monster belching, the game immediately plays the appropriate sound. But even then, greater sophistication is highly desirable. The human ear is remarkably quick to identify a perfectly repeated sound, and after the fourth belch it starts to sound as if the monster swallowed a tape recorder instead of a deliciously seasoned and sautéed princess. So, variation in the sound is important. Context is important too. If the monster is sad, it would help to reinforce the emotion in the sound, perhaps by bending the belch down in pitch. Music is significantly more challenging. In order to be effective at reinforcing the user's experience, a nonlinear score should track the flow of the media that it supports. Yet, music has multiple layers of structure, from song arrangement at the highest level to phrasing, harmony, and melodic form. Randomly playing notes in response to events in the media doesn't cut it. 
But, neither does it work to play a lengthy prerecorded score that seems to absentmindedly wander outside of an interactive application with a storyline of its own, often running at cross-purposes to the story. It's as if the score was composed by Mr. Magoo. Remember the cartoon character that could barely see and would "blindly" misinterpret situations and forge ahead on the wrong assumption to comic effect? Do you really want your score composed by that guy?
Unfortunately, that is what a large percentage of scores for games and other interactive applications seem like. Not surprisingly, gamers typically turn the music off after a number of runs. Apparently, the comedy of the ill-fitting score is lost on them! (Pardon me for editorializing a bit here…) What does this all mean? The Segment architecture must be relatively sophisticated in order to support not just playing sounds and music snippets in reaction to events in the application (i.e., gameplay) but also to play them in such a way that they still seem to have structure, with variation and respect for context. While we are at it, let's add another requirement. The Segment mechanism must support different sound- and music-producing technologies, from MIDI and DLS to wave playback, from style-based playback to straight sequencing. The DirectX Audio Segment architecture was designed to solve all of these requirements. The Segment, which is manipulated via the IDirectMusicSegment8 interface, is capable of handling a wide range of sound-producing technologies, from individual waves to Style playback. Multiple Segments can be scheduled for playback in ways that allow their sounds and music to merge, overlap, align, and even control each other.
Anatomy of a Segment
A Segment is little more than a simple structure that defines a set of playback parameters, such as duration and loop points for the music or sounds that it will play. But a Segment can contain any number of Tracks, each representing a different technology. By technology, I mean Tracks that play Styles, Tracks that set the tempo, and much more. It is the choice of Track types and the data placed within them that defines a Segment. If you are familiar with creating Segments in DirectMusic Producer, you know that a Segment is constructed by choosing from a wide palette of Tracks and adding them to the Segment. You know that some Track types generate sound (Sequence, Style, Wave, etc.), while others silently provide control information (Chord, Parameter, Time Signature, etc.), and yet others actively control playback (Segment Trigger, Script). There are currently about a dozen Track types included with DirectMusic.

I like to think of a Segment as a little box that manages a list of Tracks, each represented by a long box indicating its list of actions over time. The Segment in Figure 10-2 has a Wave Track that triggers one or more waves to play across the duration of the Segment. It also has a Time Signature Track, which provides timing information. It has a Band Track, which determines which DLS instruments to play and their initial volume and pan, and it has a Sequence Track, which plays a sequence of MIDI notes. So, this Segment plays a sequence of waves as well as MIDI notes, and it provides time signature information that the Performance can use to control playback of this Segment. In a while, we learn that this can also control the playback of other Segments.
Figure 10-2: Segment with four Tracks. You can see how a Segment is ultimately defined by the collection of Tracks that it contains, since they completely define its behavior. Each Track in a Segment is actually a COM object, represented by the IDirectMusicTrack8 and IPersistStream interfaces. The IPersistStream interface is used for loading the Track, while the IDirectMusicTrack8 interface is used for performing it. In a later chapter, we discuss how to create your own Track type.
How a Segment Loads Tracks are created at the time a Segment is loaded from file. The Segment file format stores each Track as a class ID for the Track type followed by the data. When the Segment is loaded, it reads its small set of data and then reads the list of Tracks, each represented by a class ID and associated data chunk. For each Track in the file, it calls CoCreateInstance() to create the Track object. It then calls the Track's IPersistStream::Load() method and passes it the Track's data chunk in the form of the IStream at the current file position (see Chapter 9 to better understand file I/O and the Loader). The Track reads its data and passes control back to the Segment, which continues to the next Track, assembling a list of created and loaded Tracks as it goes. Note The Track types do not show up as cached object types in the Loader. Because they are always embedded within the Segment, there is no need for the Loader's linking and caching mechanisms.
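The loading sequence just described (read a class ID, create the matching Track, hand it its data chunk, move on to the next one) can be sketched in portable C++. Nothing here is DirectMusic code. Every name is hypothetical: a string plays the role of the class ID, Registry() plays the role of CoCreateInstance(), and Track::Load() plays the role of IPersistStream::Load(). Skipping an unknown class ID is a choice made for this sketch, not a statement about what DirectMusic does.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common interface that every Track type implements.
struct Track {
    virtual ~Track() = default;
    virtual void Load(const std::string& chunk) = 0;
    virtual std::string Describe() const = 0;
};

struct TempoTrack : Track {
    std::string data;
    void Load(const std::string& chunk) override { data = chunk; }
    std::string Describe() const override { return "Tempo(" + data + ")"; }
};

struct WaveTrack : Track {
    std::string data;
    void Load(const std::string& chunk) override { data = chunk; }
    std::string Describe() const override { return "Wave(" + data + ")"; }
};

// One (class ID, data chunk) pair as it would appear in the file.
struct TrackRecord {
    std::string classId;
    std::string chunk;
};

using Factory = std::function<std::unique_ptr<Track>()>;

// Maps a class ID to a function that creates the matching Track object.
const std::map<std::string, Factory>& Registry() {
    static const std::map<std::string, Factory> reg = {
        {"Tempo", [] { return std::unique_ptr<Track>(new TempoTrack); }},
        {"Wave", [] { return std::unique_ptr<Track>(new WaveTrack); }},
    };
    return reg;
}

// Walks the records in file order, creating and loading one Track per record.
std::vector<std::unique_ptr<Track>> LoadTracks(const std::vector<TrackRecord>& records) {
    std::vector<std::unique_ptr<Track>> tracks;
    for (const TrackRecord& rec : records) {
        auto it = Registry().find(rec.classId);
        if (it == Registry().end()) continue;  // Unknown type: skipped in this sketch.
        std::unique_ptr<Track> track = it->second();
        track->Load(rec.chunk);  // The Track reads its own data.
        tracks.push_back(std::move(track));
    }
    return tracks;
}
```

The point of the design is that the Segment never needs to understand a Track's data. It only needs the class ID to create the right object, and the object does the parsing. That is what lets new Track types be added without changing the Segment itself.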
How a Segment Plays Since playback of a Segment is really playback of the Tracks within the Segment, we look to the Tracks themselves to understand how a Segment performs. There are two ways that Tracks perform: by generating playback events (mostly MIDI messages) for the DirectMusic synth and by delivering control parameters that are read by other Tracks. Examples of message-generating Tracks include Sequence, Style, Band, Wave, and Pattern. Examples of control-generating tracks include Groove Level, Chord/Key, Mute, Tempo, and Time Signature.
Performance Channels
At this point, it's probably a good idea to introduce the concept of Performance channels, or pchannels. Since everything that the Performance plays goes through a MIDI-style synthesizer, a lot of MIDI conventions are used. One is the concept of a MIDI channel. In the MIDI world, each synthesizer can have up to 16 channels. Each channel represents a musical instrument. Up to 16 different instruments can be played at the same time by assigning each to one of the 16 channels. Every performance MIDI message includes the address of the channel. So, the program change MIDI message tells the channel which instrument to play. The MIDI Note On and Note Off messages trigger playing notes using that instrument. More than one note can play at a time on one channel. Additional MIDI messages adjust the volume, pan, pitch bend, and many other parameters for all notes playing on the addressed channel.
There are some powerful things that you can do with this. Because each message carries its channel address, you can inject messages into the stream that modify the performance. For example, you can adjust the volume or reverb of a MIDI performance by sending down the appropriate MIDI control change messages without touching the notes themselves.
This is all marvelous, but 16 MIDI channels is not enough. There's a historical reason that MIDI is limited to 16 channels. When the MIDI specification was created back around 1982, it was purely a hardware protocol designed to connect electronic musical instruments together via a standard cable. To maximize bandwidth, the channel address was limited to four bits, which allows 16 channels. At the time, it was rare to find a musical device that could play more than one instrument at a time, let alone 16, so it was reasonable. Twenty years later, a 16-instrument limit is horribly inadequate. DirectMusic's Core Layer addresses this by introducing the concept of channel groups. Each channel group is equivalent to one set of 16 MIDI channels.
In turn, the Performance supports unlimited channels, called pchannels, and manages the mapping of these from an AudioPath pchannel (each AudioPath has its own virtual channel space) to a unique channel group + MIDI channel on the synth. What this means is every Track in every Segment that plays on the same AudioPath is sending commands to the synth on the same shared set of pchannels. So, if a Band Track in a Segment sets piano on pchannel 2, a separate Sequence Track with notes on pchannel 2 will play piano. Likewise, one Segment with a Band Track can control the instruments of a completely separate Segment, as long as they both play on the same AudioPath.
In DirectMusic, everything that can make a sound is addressed via pchannels. This means waves as well as the standard MIDI notes. In all cases, the sound-generating Tracks create messages (called pmsgs), which are time stamped and routed via pchannels, ultimately to the synth where they make sound. The good news is that all this is really set up in the content by the audio designer working with DirectMusic Producer. There's really nothing the programmer needs to do, but it's good to understand what is going on.
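To make the pchannel idea concrete, here is a minimal sketch of one way a pchannel number could be translated into a channel group plus MIDI channel. The layout is an assumption made for illustration (pchannels fill one group of 16 before spilling into the next, and each AudioPath starts at its own first group); SynthAddress and MapPChannel are hypothetical names, not DirectMusic API, and the real mapping is managed internally by the Performance.

```cpp
#include <cassert>
#include <cstdint>

// Where a message finally lands on the synth.
struct SynthAddress {
    std::uint32_t channelGroup;  // 1-based; each group holds 16 MIDI channels.
    std::uint32_t midiChannel;   // 0..15 within the group.
};

const std::uint32_t kChannelsPerGroup = 16;

// Assumed layout: pchannels 0..15 map to the first group, 16..31 to the next,
// and so on. firstGroup gives each AudioPath its own starting group so that
// pchannel 2 on one AudioPath never collides with pchannel 2 on another.
SynthAddress MapPChannel(std::uint32_t pchannel, std::uint32_t firstGroup = 1) {
    assert(firstGroup >= 1);
    SynthAddress address;
    address.channelGroup = firstGroup + pchannel / kChannelsPerGroup;
    address.midiChannel  = pchannel % kChannelsPerGroup;
    return address;
}
```

Under this layout, pchannel 2 lands on group 1, channel 2, while pchannel 16 is the first channel of group 2.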
Inter-Track Communication
While some Track types generate messages on pchannels to send to the synth, others provide control information. Examples of control information are time signatures, which are used to identify the next beat and measure. Groove level is another example, which controls the intensity level of Style playback Tracks. As it turns out, control information is always provided by one Track to be used by another. Groove Level, Chord, and Mute Tracks, for example, are used to determine the playback behavior of a Style Track. At first glance, it might make sense to just put the groove, chord, and mute information directly in the Style Track to keep things simple. But, it turns out that by sharing this information with multiple customer Tracks, some really powerful things can be accomplished. For example, multiple Segments can play at the same time and follow the same chord progression. More on how this works in a bit…
So, we have a mechanism for cross-track communication. It's done via an API for requesting control information, and it's called GetParam(). GetParam() methods are provided on the IDirectMusicPerformance, IDirectMusicSegment, and IDirectMusicTrack interfaces. When a Track needs to know a particular parameter, it calls GetParam() on the Performance, which in turn identifies the appropriate Segment and calls GetParam() on it. That, in turn, calls GetParam() on the appropriate Track. Each command request is identified by a unique GUID and associated data structure in which the control data is returned.
The beauty of the GetParam() mechanism is that it works not only across Tracks within one Segment but also across Tracks in multiple Segments. This is very important because it allows control to be shared in sophisticated ways that make the playing of multiple Segments at once seem more like a carefully orchestrated performance. For example, a Segment with a Pattern Track that is designed to transpose on top of the current chord and key is played as a musical embellishment. It uses chord and key information generated by a Chord Track in a separate Style playback Segment. So, the embellishment seems to lock into the harmonic structure of the Style Segment, and it all sounds as if it were carefully written to work together.
Primary, Secondary, and Controlling Segments
So, how does the performance determine which Segment to read the control information from? As it turns out, there are three levels of playback priority for a Segment, set at the time the Segment is played. They are primary, secondary, and controlling.
Primary Segment
A primary Segment is intended primarily for music. Only one primary Segment may be playing at a time, and it is considered the core of the music performance. In addition to the music that it plays, the primary Segment provides the underlying Tempo, Time Signature, and, if applicable, Chord and Groove Level Tracks. By doing so, it establishes these parameters for all music in all Segments. So, any Playback Tracks within that same Segment retrieve their control information from the neighboring Control Tracks as they play. No two primary Segments can play at the same time because that would cause the Control Tracks to conflict. If a primary Segment is playing and a new one is scheduled to play, the currently playing Segment stops exactly at the moment that the new Segment starts.
Controlling Segment
A controlling Segment has the unique ability to override the primary Segment's Control Tracks without interrupting its playback. For example, a controlling Segment with an alternate chord progression can be played over a primary Segment, and all music shifts to the replacement key and chord until the controlling Segment is stopped. So, suddenly the tonal character of the music can make a shift in an emotionally different direction. Figure 10-3 demonstrates how a controlling Segment with an alternate chord progression overrides the chords in two subsequent primary Segments.
Figure 10-3: Underlying chord progression assembled from primary and controlling Segments.
Likewise, Tempo, Muting, and Groove Level can be overridden. This is a very powerful way to alter the music in real time, perhaps in response to stimuli from the host application (i.e., in response to gameplay). Like the primary Segment, the controlling Segment is primarily for use with music, not sound effects.
Secondary Segment
A secondary Segment also has sound-producing Tracks, but it is devoid of Control Tracks. Any number of secondary Segments can be played at the same time, which matters for playing sound effects as well as music. With sound effects, the usage is simple. Each secondary Segment contributes additional sounds on top of the others playing in the same AudioPath. With music, it can be a little more sophisticated. Because the secondary Segments retrieve their control information from the primary and controlling Segments, they automatically track any of the musical parameters that are defined by the controlling Tracks. This is very powerful. In particular, the fact that they can track the chord and time signature means that musical embellishments can play spontaneously and yet blend in rhythmically and harmonically.
There is the special case of a self-controlling Segment. It is possible to create a Segment in DirectMusic Producer that will only listen to its own Control Tracks, regardless of whether it is played as a primary, secondary, or controlling Segment. This is useful if you want to create a Segment that genuinely plays on its own and ignores the Control Tracks of other Segments.
Whew! Enough talk. Let's do something!
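The priority rules above can be summarized in a few lines of code. The following is a simplified model, not the DirectMusic implementation: Role, PlayingSegment, and ResolveChord are hypothetical names, a chord is reduced to a string, and the rule that the most recently started controlling Segment wins when several are playing is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Role { Primary, Secondary, Controlling };

// One playing Segment, reduced to what matters for chord lookup.
struct PlayingSegment {
    Role        role;
    bool        hasChordTrack;    // True if the Segment carries a Chord Track.
    std::string chord;            // Current chord from that Track, if any.
    bool        selfControlling;  // Listens only to its own Control Tracks.
};

// Finds the chord that `requester` should follow, in the spirit of a
// GetParam() call on the Performance. `playing` is in start order.
// Returns false if no Segment supplies a chord.
bool ResolveChord(const std::vector<PlayingSegment>& playing,
                  const PlayingSegment& requester,
                  std::string* pChord) {
    assert(pChord != nullptr);
    // A self-controlling Segment ignores everyone else's Control Tracks.
    if (requester.selfControlling) {
        if (!requester.hasChordTrack) return false;
        *pChord = requester.chord;
        return true;
    }
    // Controlling Segments override the primary Segment (newest first).
    for (std::size_t i = playing.size(); i-- > 0;) {
        if (playing[i].role == Role::Controlling && playing[i].hasChordTrack) {
            *pChord = playing[i].chord;
            return true;
        }
    }
    // Otherwise, fall back to the primary Segment.
    for (std::size_t i = 0; i < playing.size(); ++i) {
        if (playing[i].role == Role::Primary && playing[i].hasChordTrack) {
            *pChord = playing[i].chord;
            return true;
        }
    }
    return false;
}
```

A secondary motif that asks for the chord therefore follows the primary Segment until a controlling Segment starts, at which point it follows the controlling Segment instead.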
Playing Segments
We already learned everything that we needed to know about loading and downloading a Segment from the previous two chapters. But there are a ton of options to explore in how we actually play, stop, and control a running instance of a Segment. For that, we look at the SegmentState object and the PlaySegmentEx() and StopEx() calls.
SegmentState
One very important feature of DirectMusic is the ability to play the same Segment multiple times concurrently. This is useful for sound effects. For example, you might want to use the same car engine sound on several cars that are driving at the same time. This is also very important with musical fragments when the score is built from multiple Segments. If the only way to adjust a parameter of the playing Segment or stop its playback involved applying the operation to the original Segment object, things would be very bad since that would have to represent all instances of the playing Segment. We need something to represent each instance of the playing Segment. DirectMusic provides such a beast, called the SegmentState. You can request one of these when you play the Segment and then use it to specifically stop the Segment later, find out if it is still playing, and even access some of its control parameters. It is represented by the IDirectMusicSegmentState interface. We use this to stop a playing Segment in our programming example.
PlaySegmentEx()
Let's take a close look at PlaySegmentEx(), the method you call to actually play a Segment. We skip PlaySegment(), which is an older version from DirectX 7.0, as it is pretty much the same as PlaySegmentEx() with fewer options.

HRESULT hr = pPerformance->PlaySegmentEx(
    IUnknown * pSource,                         // Segment to play.
    WCHAR * pwzSegmentName,                     // Unused feature.
    IUnknown * pTransition,                     // Optional transition Segment.
    DWORD dwFlags,                              // Control flags.
    __int64 i64StartTime,                       // Optional start time.
    IDirectMusicSegmentState** ppSegmentState,  // Optional Segment state to track
                                                // playing Segment.
    IUnknown * pFrom,                           // Optional SegmentState or AudioPath
                                                // that stops when this starts.
    IUnknown * pAudioPath                       // AudioPath to play this Segment on.
);

Typically, the three most important parameters are pSource, dwFlags, and pAudioPath. Pass the Segment you want to play to pSource, specify how you want it played with dwFlags, and determine which AudioPath it plays on with pAudioPath. (That is optional, too. You can pass NULL for the default AudioPath.)
Let's look at all the parameters in detail.
IUnknown * pSource: This is always an IDirectMusicSegment or IDirectMusicSegment8. Since IUnknown is a base class of IDirectMusicSegment, it automatically works without any casting or calling QueryInterface(). You can just pass the IDirectMusicSegment interface, and it will work. So, why isn't this just IDirectMusicSegment? The API is designed to support both IDirectMusicSegment and IDirectMusicSong. However, IDirectMusicSong is not currently exposed. When Microsoft eventually enables it, you will be in for a treat.
WCHAR * pwzSegmentName: This is only an option when using IDirectMusicSong, which is currently not enabled, so ignore it for now.
IUnknown * pTransition: The IUnknown interface pointer of a Segment to use to transition to this Segment. The transition Segment will play first, followed by the new Segment. The DMUS_SEGF_AUTOTRANSITION flag must be set in dwFlags. Also, if the transition Segment is using style technology and it includes the proper ChordMap Track, the chord progression for the transition is automatically composed to best fit between the currently playing Segment and the one we are transitioning to. These intelligent transitions make it much smoother when moving from one piece of music to another, resulting in a score that magically stays cohesive even when the story changes dramatically in unexpected ways.
DWORD dwFlags: This carries the control flags for the new Segment. These control how the Segment should be scheduled in terms of timing and behavior. It's worth taking some time to explore these. To make things easier, our sample application, Jones, lets you visually set these and hear what happens. All of these flags are explained in detail later when we walk through the source code.
__int64 i64StartTime: This is the time to begin playing the Segment. If 0, the Segment will play at the first opportunity, as defined by dwFlags (there are options to align with the time signature in different ways, for example). By default, this is assumed to be MUSIC_TIME units, which track the music time of the performance. However, if clocktime units are required, the DMUS_SEGF_REFTIME flag can be set and REFERENCE_TIME units used. This allows scheduling the start of the Segment's playback against a hard clock time, unaffected by the current musical tempo. This is particularly useful for sound effects that need to be scheduled.
IDirectMusicSegmentState** ppSegmentState: If you would like to track the playback of the Segment, pass the address of an IDirectMusicSegmentState interface pointer here. It is then filled in with an instance of an IDirectMusicSegmentState, which can then be used to operate on the specific instance of the playing Segment. This parameter can be NULL if no SegmentState object is wanted.
IUnknown * pFrom: Sometimes, you might want the new Segment to override an existing Segment, automatically stopping that Segment's playback at the exact time that the new one starts. Although this is automatically done with primary Segments, it isn't with controlling or secondary Segments, since they can overlap. This also lets you replace a series of Segments that are all playing on the same AudioPath with the new one. Therefore, pFrom is the IUnknown interface pointer of the SegmentState or AudioPath to stop when the new Segment begins playing. If it is an AudioPath, all SegmentStates playing on that AudioPath are stopped. This value can be NULL.
IUnknown * pAudioPath: Every Segment must play on an AudioPath. You can either provide a specific AudioPath object with this parameter or pass NULL, in which case the default AudioPath is used.
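The two time bases accepted by i64StartTime are easy to mix up, so here is a small sketch of the arithmetic. It relies on two facts about the units: music time counts 768 ticks per quarter note (the value of DMUS_PPQ), and REFERENCE_TIME counts 100-nanosecond intervals. The type aliases, constants, and helper functions are local stand-ins invented so the sketch compiles without the DirectX headers; they produce offsets, which you would typically add to a time read from the Performance.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins so the sketch compiles without the DirectX headers.
typedef long         MusicTime;      // Plays the role of MUSIC_TIME.
typedef std::int64_t ReferenceTime;  // Plays the role of REFERENCE_TIME.

const MusicTime     kTicksPerQuarter  = 768;    // Mirrors DMUS_PPQ.
const ReferenceTime kRefUnitsPerMilli = 10000;  // 1 ms = 10,000 x 100 ns.

// Offset in music time of a zero-based bar and beat in a fixed meter.
// beatsPerBar / beatUnit is the time signature, e.g. 4 / 4 or 6 / 8.
MusicTime BarBeatToMusicTime(int bar, int beat, int beatsPerBar, int beatUnit) {
    assert(beatsPerBar > 0 && beatUnit > 0);
    const MusicTime ticksPerBeat = kTicksPerQuarter * 4 / beatUnit;
    return (static_cast<MusicTime>(bar) * beatsPerBar + beat) * ticksPerBeat;
}

// Offset in clock time, independent of tempo. This is the unit to use when
// DMUS_SEGF_REFTIME is set.
ReferenceTime MillisecondsToReferenceTime(std::int64_t milliseconds) {
    return milliseconds * kRefUnitsPerMilli;
}
```

The music-time offset stretches and shrinks with tempo, while the reference-time offset does not, which is why the latter suits sound effects that must fire at a fixed moment.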
StopEx()
Often, you need to stop the Segment at some point. Again, there are two methods: the older Stop() and the newer StopEx(). Since StopEx() has all the functionality of Stop(), we only look at StopEx(). You have four choices for what you stop:
• An individual instance of the playing Segment (the SegmentState).
• All instances of the playing Segment.
• All Segments playing on a specific AudioPath.
• All Segments playing on the Performance.

HRESULT hr = m_pPerformance->StopEx(
    IUnknown *pObjectToStop,  // Segment, SegState, or AudioPath to stop.
    __int64 i64StopTime,      // Optional stop time.
    DWORD dwFlags             // Control flags.
);

IUnknown * pObjectToStop: Pointer to the interface of the object you'd like to stop. Since all COM objects are based on IUnknown, the IDirectMusicSegmentState, IDirectMusicSegment, or IDirectMusicAudioPath interfaces all work.
  o IDirectMusicSegmentState will stop the playing instance of the Segment.
  o IDirectMusicSegment will stop all instances of the Segment.
  o IDirectMusicAudioPath will stop all Segments playing on the AudioPath.
  o NULL will stop all Segments playing.
__int64 i64StopTime: This defines the time to stop. If 0, the stop will occur at the first opportunity, as defined by dwFlags (there are options to align with the time signature in different ways, for example). By default, this is assumed to be MUSIC_TIME units, which track the music time of the performance. However, if clock-time units are required, the DMUS_SEGF_REFTIME flag can be set and REFERENCE_TIME units used. This allows scheduling the stop against a hard clock time, unaffected by the current musical tempo.
DWORD dwFlags: Control flags indicating when the stop should occur. These are a subset of the play flags, primarily focused on setting the timing grid. So, only DMUS_SEGF_BEAT, DMUS_SEGF_DEFAULT, DMUS_SEGF_GRID, DMUS_SEGF_MEASURE, DMUS_SEGF_REFTIME, DMUS_SEGF_MARKER, and DMUS_SEGF_SEGMENTEND are supported (more on these flags in a little bit).
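The four stop scopes are easier to see in a toy model than in prose. The sketch below is not DirectMusic code: Instance, StopScope, and Playing are hypothetical names, integer IDs stand in for interface pointers, and stopping is immediate rather than scheduled. It only illustrates why a per-instance object such as the SegmentState is needed alongside the Segment itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One playing instance: the role a SegmentState plays in the text.
struct Instance {
    int instanceId;   // Unique per Play() call.
    int segmentId;    // Which Segment it was started from.
    int audioPathId;  // Which AudioPath it plays on.
};

enum class StopScope { Instance, Segment, AudioPath, Everything };

class Playing {
public:
    // Starts a new instance and returns its ID (IDs are positive).
    int Play(int segmentId, int audioPathId) {
        Instance instance = { m_nextId++, segmentId, audioPathId };
        m_instances.push_back(instance);
        return instance.instanceId;
    }

    // Stops everything matching the scope; returns how many were stopped.
    std::size_t Stop(StopScope scope, int id = 0) {
        // An ID is required unless everything is being stopped.
        assert(scope == StopScope::Everything || id != 0);
        const std::size_t before = m_instances.size();
        m_instances.erase(
            std::remove_if(m_instances.begin(), m_instances.end(),
                [scope, id](const Instance& i) -> bool {
                    switch (scope) {
                    case StopScope::Instance:   return i.instanceId == id;
                    case StopScope::Segment:    return i.segmentId == id;
                    case StopScope::AudioPath:  return i.audioPathId == id;
                    case StopScope::Everything: return true;
                    }
                    return false;
                }),
            m_instances.end());
        return before - m_instances.size();
    }

    std::size_t Count() const { return m_instances.size(); }

private:
    std::vector<Instance> m_instances;
    int m_nextId = 1;
};
```

With two cars sharing one engine Segment, stopping by instance silences one car, while stopping by Segment silences both.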
Introducing Jones
This chapter's programming example is the first installment of a program named Jones. Jones starts by loading Segments and exploring all the options and permutations for playing them (see Figure 10-4). As we progress through the book, Jones grows to add functionality, including scripting and AudioPaths, as we dig deeper into DirectX Audio features.
Figure 10-4: Jones demonstrates loading, configuring, and playing Segments.
Let's do a quick walk-through of Jones to get acquainted and then return to look at each operation in detail along with the underlying code.
Starting Jones
You can run Jones by either going to the Unit II\10_Segments directory and building Jones or running SegmentJones.exe directly from the Bin directory.
Loading a Segment
Use the Open… button to read a Segment into Jones. This opens a File dialog, where you can choose a Segment, MIDI, or wave file to load. For the purposes of this walk-through, try StyleSegment.sgt from the Media directory.
The tree view displays the loaded Segment. To the left of the Segment is a plus sign, indicating that it has children Segments. Go ahead and click on it. These are also Segments that you can play, and indeed they are designed to be played at the same time as the main Segment. Where did they come from? If the Segment references any Styles (which StyleSegment.sgt does), then the Style's Bands and motifs are converted into Segments and displayed as additional Segments under the Segment you loaded from a file. You can play any of these Segments. This feature in Jones quickly gets you closer to the inherent interactivity of DirectMusic. Styles typically include a series of Bands and motifs, all designed to be played in conjunction with the main Style Segment but with very different intentions. The Bands change the timbre set. In other words, each provides a different set of instruments with a different mix so that by changing Bands you can quickly make strong changes to the music as it plays. The motifs are melodic fragments that are designed with the Style in mind and layer effectively on top of it. By playing the motifs, you add musical elements to the performance. So, by displaying all of the Bands and motifs that come packaged with the Style, Jones immediately gives you a quick way to start interactively jamming and appreciating the layered and interactive approach to delivering music.
Setting Segment Control Parameters
Click on a Segment and all of its control information is displayed on the right, along with Play and Stop buttons.
Use these controls to set up how you want the currently selected Segment to play. Click on the Play button to hear the Segment. You can also double-click on the Segment's name in the tree view to play it.
Viewing the Timeline
Once the Segment is playing, Jones displays it as a moving rectangle in the timeline view underneath.
The portion of the Segment that has played or is in the process of playing is painted blue, while the part that is still waiting to play is red. The four vertical lines, from left to right, represent the current time, the latency time, the queue time, and the prepare time. Each represents a different stage in the process of playing a Segment. We start at the current time and work backward through the sound production process:
Current time (black): This is the time at which you hear the sound. Notice that at the very moment that a Segment rectangle hits this, you start hearing it play. It would be simpler if we could use just this for all our timing and ignore latency, queue, and prepare times. But DirectMusic spreads out the different operations both to increase efficiency (some operations can be done in large chunks to reduce overhead) and to provide control opportunities (for example, tools that can adjust the timing of notes in either direction).
Latency time (red): This is the time at which the DirectMusic synth, which is implemented in software, actually does its work. Depending on the hardware, this is anywhere from a few milliseconds to 70ms after the current time. The synth wakes up at regular intervals, typically every ten or so milliseconds, and processes another buffer of data to feed out the AudioPath into the effects filters. However, because the MIDI events and waves are all time stamped, the synth places the sounds appropriately in the buffer at sample accurate resolution, even though it processes the audio in relatively large lumps. Latency time is particularly important to understand because it represents the earliest a sound can play. By default, this is used to calculate when to play a new Segment.
Queue time (purple): For increased efficiency, the Performance Layer sends notes down in large chunks rather than individually. This is particularly important when hardware acceleration is involved because kernel transitions (i.e., calls down to the core Windows OS) are expensive. So, the Performance queues up MIDI and wave events and sends them down to the synth roughly 50ms ahead of latency time. The starting notes are sent down immediately, bypassing the queue time wait when a new Segment plays. Just as latency time indicates the earliest a Segment can play, queue time indicates the earliest a Segment can be stopped. Pmsgs in the Performance Layer can be yanked by stop invalidations, but once they are sent down to the synth, they are committed for good. The queue time can be adjusted with a call to IDirectMusicPerformance::SetBumperLength().
Prepare time (blue): This is the time at which the Segment Tracks dump pmsgs into the queue. This is usually a full half-second ahead of current time. This relatively large delay allows the Tracks to be called infrequently, increasing efficiency. It also gives Tool processing room to alter the time stamping of pmsgs pretty dramatically. Because it doesn't restrict the start or stop time of a Segment, the delay is relatively harmless despite its size. The delay duration can be adjusted with a call to IDirectMusicPerformance::SetPrepareTime().
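The relationship between the four timeline markers can be written down as simple arithmetic. This sketch uses the figures quoted above (a bumper of roughly 50ms and a prepare time of about half a second) as default arguments; the Timeline struct and ComputeTimeline() are hypothetical names for illustration, and the synth latency is an input because it depends on the hardware.

```cpp
#include <cassert>
#include <cstdint>

// The four vertical lines in the timeline view, in milliseconds.
struct Timeline {
    std::int64_t currentMs;  // What you hear right now.
    std::int64_t latencyMs;  // Earliest time a new sound can start.
    std::int64_t queueMs;    // Earliest time a playing Segment can be stopped.
    std::int64_t prepareMs;  // How far ahead the Tracks have generated pmsgs.
};

// Derives the markers from the current time. The default values follow the
// approximate figures given in the text, not measured values.
Timeline ComputeTimeline(std::int64_t currentMs,
                         std::int64_t synthLatencyMs,
                         std::int64_t bumperMs = 50,
                         std::int64_t prepareAheadMs = 500) {
    assert(synthLatencyMs >= 0 && bumperMs >= 0 && prepareAheadMs >= 0);
    Timeline t;
    t.currentMs = currentMs;
    t.latencyMs = currentMs + synthLatencyMs;  // The synth works this far ahead.
    t.queueMs   = t.latencyMs + bumperMs;      // Events are queued ahead of latency time.
    t.prepareMs = currentMs + prepareAheadMs;  // Tracks run this far ahead of current time.
    return t;
}
```

For a current time of 1000ms and a synth latency of 20ms, a new Segment could start no earlier than 1020ms, and a stop could take effect no earlier than 1070ms.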
Viewing Control Parameters
Below the timeline, the Tempo, Time Sig, Groove, and Chord parameters are displayed.
These four parameters are called control parameters and are retrieved from the currently playing primary and controlling Segments. The Tracks in all playing Segments use these parameters to determine their behavior (for example, transpose a note to follow the current chord). When you want to stop the Segment, make sure it is selected in the tree view and then click on the Stop button or just let it play until it is finished.
Jones Code
The source code for Jones can be split into two categories: UI and DirectX Audio management. Since this book is about programming DirectX Audio (not programming Windows with MFC), the source code for Jones' UI is not discussed. Extra effort has been made to separate functionality so that just about everything we care about is covered by the audio management code (starting with the CAudio class), and there's no need to discuss the UI. Of course, the UI code is included with the projects on the companion CD and it is documented, so you are welcome to use it in any way you'd like.
Audio Data Structures
Jones uses the same classes that we created for the Loader in the previous chapter, CAudio and CSegment, though with some significant changes and additions, which we will discuss. Jones also uses an additional class: CSegState. CSegState keeps track of one playing instance of a Segment via an IDirectMusicSegmentState8 interface pointer. This can be used to stop the specific instance of the playing Segment, as well as perform other operations on that Segment in real time. CSegState also tracks CSegment for convenience. Since we can have more than one instance of a Segment playing at the same time, we manage the CSegStates with CSegStateList. To facilitate this, CSegState and CSegStateList are based on the list management classes CMyNode and CMyList, respectively.

class CSegState : public CMyNode
{
public:
    // Constructor.
    CSegState(CSegment *pSegment, IDirectMusicSegmentState8 *pSegState)
    {
        m_pSegState = pSegState;
        pSegState->AddRef();
        m_pSegment = pSegment;
    }
    ~CSegState()
    {
        m_pSegState->Release();
    }
    // We keep a linked list of SegmentStates.
    CSegState *GetNext() { return (CSegState *) CMyNode::GetNext(); };
    // Access methods.
    CSegment *GetSegment() { return m_pSegment; };
    IDirectMusicSegmentState8 *GetSegState() { return m_pSegState; };
private:
    CSegment *                  m_pSegment;   // The Segment that this is playing.
    IDirectMusicSegmentState8 * m_pSegState;  // The DirectMusic SegmentState object.
};

/*  CSegStateList
    Based on CMyList, this manages a linked list of CSegStates.
*/
class CSegStateList : public CMyList
{
public:
    // Overrides for CMyList methods.
    CSegState * GetHead() { return (CSegState *) CMyList::GetHead(); };
    CSegState *RemoveHead() { return (CSegState *) CMyList::RemoveHead(); };
    // Clear list and release all references.
    void Clear();
};

The CSegment class manages a loaded instance of a Segment. Like CSegState, it is based on CMyNode so that it can be kept in a linked list of CSegments. Each CSegment can have a list of children CSegments, managed by the CSegmentList class.

class CSegmentList : public CMyList
{
public:
    // Overrides for CMyList methods.
    CSegment *RemoveHead() { return (CSegment *) CMyList::RemoveHead(); };
    // Clear list and release all references.
    void Clear(IDirectMusicPerformance8 *pPerf);
};

// We create Segments from file as well as extract them from embedded Bands and motifs.
// We'll keep track of how a Segment was created for display purposes.
typedef enum _SEGMENT_SOURCE
{
    SEGMENT_BAND  = 1,
    SEGMENT_MOTIF = 2,
    SEGMENT_FILE  = 3
} SEGMENT_SOURCE;

/*  CSegment
    This manages one Segment.
    Segments can also carry a list of children Segments.
*/
class CSegment : public CMyNode
{
public:
    CSegment();
    ~CSegment();
    CSegment *GetNext() { return (CSegment *) CMyNode::GetNext(); };
    void Init(IDirectMusicSegment8 *pSegment,
              IDirectMusicPerformance8 *pPerf, SEGMENT_SOURCE ssType);
    void Clear(IDirectMusicPerformance8 *pPerf);
    void SetFlags(DWORD dwFlags) { m_dwPlayFlags = dwFlags; };
    DWORD GetFlags() { return m_dwPlayFlags; };
    void GetName(char *pszBuffer);
    SEGMENT_SOURCE GetType() { return m_ssType; };
    CSegment * EnumChildren(DWORD dwIndex)
        { return (CSegment *) m_ChildSegments.GetAtIndex(dwIndex); };
    bool HasChildren() { return !m_ChildSegments.IsEmpty(); };
    CSegment *GetTransition() { return m_pTransition; };
    void SetTransition(CSegment * pTransition) { m_pTransition = pTransition; };
    bool HasEmbeddedAudioPath();
    IDirectMusicSegment8 * GetSegment() { return m_pSegment; };
private:
    IDirectMusicSegment8 * m_pSegment;       // The DirectMusic Segment that
                                             // this manages.
    DWORD                  m_dwPlayFlags;    // DMUS_SEGF_ Playback flags.
    DMUS_OBJECTDESC        m_Desc;           // DirectMusic object descriptor.
                                             // Used by Loader.
    CSegmentList           m_ChildSegments;  // Children Segments of this one.
    CSegment *             m_pTransition;    // Transition Segment to
                                             // autotransition to this one.
    SEGMENT_SOURCE         m_ssType;         // Type of Segment (how it was
                                             // created.)
};

CAudio is the top-level class. It initializes and manages the DirectMusic Loader and Performance objects. It manages the loading and playback of Segments (CSegment), also keeping track of their SegmentStates (CSegState) as they play. CAudio provides methods that track the current time signature, groove level, tempo, and chord by calling GetParam() and returning the data in string format for display. We take a look at these tracking methods a little later on. For now, let's take a look at CAudio.

class CAudio
{
public:
    CAudio();
    ~CAudio();
    HRESULT Init();   // Open the Performance, Loader, etc.
    void Close();     // Shut down.
    IDirectMusicPerformance8 * GetPerformance() { return m_pPerformance; };
    IDirectMusicLoader8 * GetLoader() { return m_pLoader; };
    CSegment *LoadSegment(WCHAR *pwzFileName);  // Load a Segment from file.
    void PlaySegment(CSegment *pSegment);       // Play a Segment.
    void StopSegment(CSegment *pSegment);       // Stop a Segment.
    void StopSegState(CSegState *pSegState);    // Stop an individual instance of a
                                                // playing Segment.
    CSegment * EnumSegments(DWORD dwIndex)      // Enumerate through all loaded
                                                // Segments.
        { return (CSegment *) m_SegmentList.GetAtIndex(dwIndex); };
    // Convenience functions that use GetParam to read current parameters and
    // convert into strings for display.
    bool GetTimeSig(char *pszText);  // Current time signature.
    bool GetGroove(char *pszText);   // Current groove level (no
                                     // embellishments, though.)
    bool GetTempo(char *pszText);    // Current tempo.
    bool GetChord(char *pszText);    // Current chord (doesn't bother
                                     // with key.)
private:
    IDirectMusicPerformance8 * m_pPerformance;  // The DirectMusic Performance object.
    IDirectMusicLoader8 *      m_pLoader;       // The DirectMusic Loader object.
    CSegmentList               m_SegmentList;   // List of loaded Segments.
    CSegStateList              m_SegStateList;  // List of playing Segments.
};

Now that we've covered the data structures, let's walk through the different parts of the program and examine the code.
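The list base classes CMyNode and CMyList are used throughout these declarations but are not shown. As a rough idea of what such a pair might look like, here is a minimal intrusive singly linked list that supports the calls seen above (GetNext, GetHead, RemoveHead, AddTail, GetAtIndex, IsEmpty). Everything in it is an assumption: MyNode, MyList, NamedNode, and NamedList are hypothetical names, and the versions shipped with Jones may differ in both interface and behavior.

```cpp
#include <cassert>
#include <cstddef>

// Base class for anything that can live in a MyList.
class MyNode {
public:
    MyNode() : m_pNext(nullptr) {}
    virtual ~MyNode() {}
    MyNode *GetNext() const { return m_pNext; }
private:
    friend class MyList;
    MyNode *m_pNext;
};

// Singly linked list of MyNodes. The list does not own its nodes.
class MyList {
public:
    MyList() : m_pHead(nullptr) {}
    bool IsEmpty() const { return m_pHead == nullptr; }
    MyNode *GetHead() const { return m_pHead; }

    void AddTail(MyNode *pNode) {
        assert(pNode != nullptr && pNode->m_pNext == nullptr);
        MyNode **ppLink = &m_pHead;
        while (*ppLink) ppLink = &(*ppLink)->m_pNext;  // Walk to the last link.
        *ppLink = pNode;
    }

    MyNode *RemoveHead() {
        MyNode *pNode = m_pHead;
        if (pNode) {
            m_pHead = pNode->m_pNext;
            pNode->m_pNext = nullptr;
        }
        return pNode;
    }

    // Returns the node at a zero-based index, or nullptr past the end.
    MyNode *GetAtIndex(std::size_t index) const {
        MyNode *pNode = m_pHead;
        while (pNode && index--) pNode = pNode->m_pNext;
        return pNode;
    }

private:
    MyNode *m_pHead;
};

// A typed node and list that add casting wrappers, in the same style as
// CSegState and CSegStateList.
struct NamedNode : MyNode {
    explicit NamedNode(int id) : m_id(id) {}
    NamedNode *GetNext() const { return static_cast<NamedNode *>(MyNode::GetNext()); }
    int m_id;
};

struct NamedList : MyList {
    NamedNode *GetHead() const { return static_cast<NamedNode *>(MyList::GetHead()); }
    NamedNode *RemoveHead() { return static_cast<NamedNode *>(MyList::RemoveHead()); }
};
```

The casting wrappers are what let calling code walk a list of derived objects without repeating the cast at every call site.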
Loading a Segment into Jones
For Jones, our CAudio and CSegment classes pick up more sophisticated Segment loading than in the Loader implementation. In particular, they now check for a Style playback Segment, and if so, expose any children Band and Style motif Segments embedded within the Style.
Note: A Style motif, by the way, is a little Segment that is included within a Style and is designed to play effectively on top of any Segment that uses the Style. It uses the time signature and chord information to lock on to the current rhythm and harmony, so it always sounds right, no matter when it is triggered. It even uses instruments that are configured as part of the Style design and placed in the Band Track of the Style Segment.
Loading a Segment is handled by two methods: CAudio::LoadSegment() and CSegment::Init(). CAudio::LoadSegment() takes a file path as a parameter, loads the Segment, and returns it wrapped in a CSegment class. It then calls CSegment::Init(), which prepares for playback and visualization. Finally, it checks if the Segment has a Style in it, and if so, it spelunks for any embedded Band and motif Segments, which it inserts as children of the main Segment. These show up as children nodes in the tree view and can be played independently as secondary Segments to layer melodies on top (motifs) and change the instruments (Bands).
Let's look at the CAudio::LoadSegment() routine. CAudio::LoadSegment() first takes the file name to be loaded and extracts the path, which it hands to the Loader via a call to IDirectMusicLoader::SetSearchDirectory(). This ensures that referenced files, such as Styles and DLS instrument collections, will be easily found and connected to the Segment. CAudio::LoadSegment() then loads the Segment, calls Init() on the Segment, and places it in CAudio's Segment list.

CSegment *CAudio::LoadSegment(WCHAR *pwzFileName)
{
    wcscpy(m_wzSearchDirectory,pwzFileName);
    WCHAR *pwzEnd = wcsrchr(m_wzSearchDirectory,L'\\');
    if (pwzEnd)
    {
        // If pwzFileName includes a directory path, use it to set up the search
        // directory in the Loader.
        // The Loader will look here for linked files, including Styles and DLS
        // instruments.
        *pwzEnd = 0;
        m_pLoader->SetSearchDirectory(GUID_DirectMusicAllTypes,m_wzSearchDirectory,FALSE);
    }
    CSegment *pSegment = NULL;
    IDirectMusicSegment8 *pISegment;
    // Now, load the Segment.
    if (SUCCEEDED(m_pLoader->LoadObjectFromFile(CLSID_DirectMusicSegment,
                                                IID_IDirectMusicSegment8,
                                                pwzFileName,
                                                (void **) &pISegment)))
    {
        // Create a CSegment object to manage playback.
        // Init() also recursively searches for embedded Band and Style motif
        // Segments which can be played as secondary Segments.
        pSegment = new CSegment;
        if (pSegment)
        {
            pSegment->Init(pISegment,m_pPerformance,SEGMENT_FILE);
            m_SegmentList.AddTail(pSegment);
        }
        pISegment->Release();
    }
    return pSegment;
}

Next, CSegment::Init() prepares the Segment for playback and display. CSegment::Init() inserts a DisplayTrack Track in the Segment. This will be used for visualization. It's really an advanced topic (creating your own Track types), so it would be a bit of an unwelcome tangent right now. If you are interested in how it's done, take a look at the source code, which is well documented. Init() then downloads the Segment to the Performance. This ensures that all waves and DLS instruments are primed in the synth for playback.
Note: Downloading of instruments is done with Bands, which are pchannel specific. If the pchannel referred to by a Band isn't in the default AudioPath, the download will fail.
Next, Init() retrieves the Segment's name via the IDirectMusicObject interface, so there is a friendly name for display. Finally, if this is a Style playback Segment, it scans the Style for Bands and motifs and installs them as children Segments in its CSegmentList field and recursively calls this same Init() routine for each of them. That last part is relatively complex, and it's usually something you'd never need to do. But, it's a good learning exercise and, besides, we needed it for Jones.

void CSegment::Init(IDirectMusicSegment8 *pISegment,
                    IDirectMusicPerformance8 *pPerf,
                    SEGMENT_SOURCE ssType)
{
    // Create the DisplayTrack and insert it in the Segment.
    CDisplayTrack *pDisplayTrack = new CDisplayTrack(this);
    if (pDisplayTrack)
    {
        IDirectMusicTrack *pITrack = NULL;
        pDisplayTrack->QueryInterface(IID_IDirectMusicTrack,(void **)&pITrack);
        if (pITrack)
        {
            pISegment->InsertTrack(pITrack,-1);
        }
    }
    // Assign the Segment and download it to the synth.
    m_pSegment = pISegment;
    pISegment->AddRef();
    pISegment->Download(pPerf);
    // Get the default playback flags.
    pISegment->GetDefaultResolution(&m_dwPlayFlags);
    // If this is a motif, assume secondary Segment and allow it to trigger
    // to the time signature even when there is no primary Segment playing.
    if (ssType == SEGMENT_MOTIF)
    {
        m_dwPlayFlags |= DMUS_SEGF_SECONDARY | DMUS_SEGF_TIMESIG_ALWAYS;
    }
    // If this is a band, just assume secondary.
    else if (ssType == SEGMENT_BAND)
    {
        m_dwPlayFlags |= DMUS_SEGF_SECONDARY;
    }
    m_dwPlayFlags &= ~DMUS_SEGF_AUTOTRANSITION;
    // Get the object descriptor from the Segment. This includes the name.
    IDirectMusicObject *pIObject = NULL;
    pISegment->QueryInterface(IID_IDirectMusicObject,(void **)&pIObject);
    if (pIObject)
    {
        pIObject->GetDescriptor(&m_Desc);
        pIObject->Release();
    }
    m_ssType = ssType;
    // Now, the really fun part. We're going to see if this has any child Segments
    // that would come with the Style, if this is a Style playback Segment.
    // If so, we can get the motifs and Bands from the Styles and install them as
    // Segments.
    IDirectMusicStyle8 *pStyle = NULL;
    if (SUCCEEDED(m_pSegment->GetParam(
        GUID_IDirectMusicStyle,-1,0,0,NULL,(void *)&pStyle)))
    {
        // Paydirt! There's a Style in this Segment.
        // First, enumerate through its Band lists.
        WCHAR wszName[DMUS_MAX_NAME];
        DWORD dwEnum;
        for (dwEnum = 0;;dwEnum++)
        {
            // S_FALSE indicates end of the list.
            if (pStyle->EnumBand(dwEnum,wszName) == S_OK)
            {
                // Use the name to retrieve the Band.
                IDirectMusicBand *pBand;
                if (SUCCEEDED(pStyle->GetBand(wszName,&pBand)))
                {
                    // The Band itself is not a Segment, but it has a method for
                    // creating a Segment.
                    IDirectMusicSegment8 *pIBandSeg;
                    if (SUCCEEDED(pBand->CreateSegment((IDirectMusicSegment **)&pIBandSeg)))
                    {
                        // For visualization, give the Band Segments a measure of
                        // duration.
                        pIBandSeg->SetLength(768*4);
                        CSegment *pSegment = new CSegment;
                        if (pSegment)
                        {
                            pSegment->Init(pIBandSeg,pPerf,SEGMENT_BAND);
                            wcscpy(pSegment->m_Desc.wszName,wszName);
                            pSegment->m_Desc.dwValidData = DMUS_OBJ_NAME;
                            m_ChildSegments.AddHead(pSegment);
                        }
                        pIBandSeg->Release();
                    }
                    pBand->Release();
                }
            }
            else
            {
                break;
            }
        }
        // Now, enumerate through the Style's motifs.
        for (dwEnum = 0;;dwEnum++)
        {
            if (pStyle->EnumMotif(dwEnum,wszName) == S_OK)
            {
                IDirectMusicSegment8 *pIMotif;
                if (SUCCEEDED(pStyle->GetMotif(wszName,(IDirectMusicSegment **)&pIMotif)))
                {
                    CSegment *pSegment = new CSegment;
                    if (pSegment)
                    {
                        pSegment->Init(pIMotif,pPerf,SEGMENT_MOTIF);
                        wcscpy(pSegment->m_Desc.wszName,wszName);
                        pSegment->m_Desc.dwValidData = DMUS_OBJ_NAME;
                        m_ChildSegments.AddHead(pSegment);
                    }
                    pIMotif->Release();
                }
            }
            else
            {
                break;
            }
        }
        pStyle->Release();
    }
}
Note
If this seems like a lot to do just to get a Segment ready to play, rest assured that your perception is correct and, no, you wouldn't normally need to do all these things. As we've already seen in previous chapters, all you really need to do is load and then download your Segment. So, don't feel like memorizing these steps is a prerequisite to DirectMusic guruhood. However, walking through all this extra fancy stuff, especially extracting the Bands and motifs, helps to better grasp the flexibility and inherent opportunities presented by the API.
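The search-directory step at the top of CAudio::LoadSegment() can be tried in isolation. The following is a minimal sketch, not the Jones source: SplitSearchDirectory is a hypothetical helper, the sample path in the usage note is made up, and std::wstring stands in for the fixed m_wzSearchDirectory buffer so that an overlong path cannot overrun it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Jones or DirectMusic): returns the
// directory portion of a file path, or an empty string when the path has
// no backslash, mirroring the wcsrchr() test in LoadSegment().
std::wstring SplitSearchDirectory(const std::wstring &filePath)
{
    const std::wstring::size_type pos = filePath.rfind(L'\\');
    if (pos == std::wstring::npos)
    {
        return std::wstring();          // No directory: leave the Loader alone.
    }
    return filePath.substr(0, pos);     // Everything before the last backslash.
}
```

Given a path such as C:\Media\StyleSegment.sgt, the helper yields C:\Media, which is the string LoadSegment() would hand to SetSearchDirectory(). A bare file name yields an empty string, matching the case where the original code skips the call.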
Once a Segment is loaded, we can display it along with the other Segments in the tree view. To facilitate this, CAudio provides a method for enumerating through the Segments.
CSegment * EnumSegments(DWORD dwIndex)
// Enumerate through all loaded Segments.
{
    return (CSegment *) m_SegmentList.GetAtIndex(dwIndex);
};
CAudio::EnumSegments() is based on the CMyList class GetAtIndex() method, which takes a zero-based integer position in the list and returns the item at that position or NULL if the end of the list has been reached. To display the tree, Jones calls EnumSegments() for the loaded Segments and then, if there are children Segments to display indented (motifs and Bands), calls EnumChildren() on each Segment. EnumChildren() is also based on CMyList::GetAtIndex().
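The enumerate-by-index pattern is easy to picture with a toy list. This sketch assumes nothing about the real CMyList beyond the behavior described above: Node, GetAtIndex, and CountItems are hypothetical stand-ins written for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CMyList item: a singly linked node.
struct Node
{
    int   value;
    Node *next;
};

// Walk `index` links from the head; return NULL when the list runs out.
Node *GetAtIndex(Node *head, unsigned long index)
{
    Node *scan = head;
    while (scan && index > 0)
    {
        scan = scan->next;
        --index;
    }
    return scan;
}

// Typical caller: keep asking for the next index until NULL comes back.
int CountItems(Node *head)
{
    int count = 0;
    for (unsigned long i = 0; GetAtIndex(head, i) != NULL; ++i)
    {
        ++count;
    }
    return count;
}
```

Each lookup restarts from the head, so a full enumeration costs quadratic time. For the handful of Segments in a tree view that is a reasonable trade for a very simple interface.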
Segment Options
When you click on a Segment in the tree view, Jones displays everything it knows about the Segment in the box to the right. Almost all of this is the data from CSegment's m_dwPlayFlags field, which is used in the call to PlaySegmentEx(). The display breaks the flags into mutually exclusive groupings. For example, a Segment can play as a primary, secondary, or controlling Segment, so the two flags that set this are assigned as determined by the choice in the Mode drop-down list. Notice that StyleSegment.sgt has been set to play as a primary, whereas the motif and Band Segments underneath it have all been set to secondary.
Let's walk through the Segment options and discuss each selection and how it controls the flags that are passed to PlaySegmentEx(), starting with the Mode control.
Mode
§ Primary: Primary is actually the default behavior. This mode doesn't set any flags in order to play a Segment as a primary Segment. Select StyleSegment.sgt and play it. Once it is playing, click again. Notice that a new Segment appears in the timeline view, and the first Segment eventually goes away, indicating that the first ended abruptly so that the second could start. Also, notice that where the second Segment starts, the first Segment stays red, since it no longer plays beyond that point.
§ Secondary: This sets the DMUS_SEGF_SECONDARY flag to enable secondary mode. With StyleSegment.sgt playing, double-click on one of the motif or Band Segments to play it. Notice that an additional Segment starts playing, displayed below the primary Segment. Double-click several times, and you will hear several instances of the same motif. Also, listen to how the motifs track the chord changes in the primary Segment so that the motifs always sound musically correct.
§ Controlling: This sets both the DMUS_SEGF_CONTROL and DMUS_SEGF_SECONDARY flags to indicate that a Segment should be controlling. The DMUS_SEGF_SECONDARY flag enables multiple overlapping Segments, which is required to allow controlling Segments to play at the same time as the primary and other secondary Segments. DMUS_SEGF_CONTROL ensures that the Segment's Control Tracks override the primary Segment's Control Tracks. Go to the Open… button and load ChordChange.sgt. Note that this Segment loads as a primary Segment. Change its mode to Controlling. Start StyleSegment.sgt playing, and then play ChordChange.sgt on top of it. Notice that the music changes its harmony because the chord progression in ChordChange.sgt overrides the chord progression in StyleSegment.sgt. Click on the Stop button to kill ChordChange.sgt, and notice that the music reverts to the original chord progression.
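One way to keep the three modes mutually exclusive is to clear both mode bits before setting the ones the chosen mode needs. The sketch below illustrates that idea only. It is not taken from Jones, and the constants are placeholders with made-up bit values; real code would use DMUS_SEGF_SECONDARY and DMUS_SEGF_CONTROL from the DirectMusic headers.

```cpp
#include <cassert>

// Placeholder bit values for illustration only; real code takes
// DMUS_SEGF_SECONDARY and DMUS_SEGF_CONTROL from the DirectMusic headers.
const unsigned long kSegfSecondary = 1UL << 0;
const unsigned long kSegfControl   = 1UL << 1;
const unsigned long kModeMask      = kSegfSecondary | kSegfControl;

enum PlayMode { MODE_PRIMARY, MODE_SECONDARY, MODE_CONTROLLING };

// Replace the mode bits in `flags`, leaving every other bit untouched.
unsigned long SetMode(unsigned long flags, PlayMode mode)
{
    flags &= ~kModeMask;                    // Primary: no mode bits at all.
    if (mode == MODE_SECONDARY)
    {
        flags |= kSegfSecondary;
    }
    else if (mode == MODE_CONTROLLING)
    {
        flags |= kSegfControl | kSegfSecondary;
    }
    return flags;
}

// Recover the mode from the flags, e.g. to initialize a drop-down list.
PlayMode GetMode(unsigned long flags)
{
    if (flags & kSegfControl)   return MODE_CONTROLLING;
    if (flags & kSegfSecondary) return MODE_SECONDARY;
    return MODE_PRIMARY;
}
```

Because SetMode() masks first, switching a Segment from Controlling back to Primary cannot leave a stray secondary bit behind.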
Resolution
There is a whole series of options available for when you'd like the new Segment to begin playing. For musical applications, it's very important that the new Segment line up in an appropriate manner. Typically, you want the Segment to play as soon as possible, yet each Segment has different criteria as to when it should rhythmically line up. While one Segment might sound great synchronized on a beat boundary, another might need to align with a measure in order to work well. Select this with the Resolution control:
As we walk through each option, experiment with playing both primary and secondary Segments with each resolution.
§ Default: The Segment has a default timing resolution authored by the content creator directly into it. This menu option sets the DMUS_SEGF_DEFAULT flag, indicating that whatever was authored into the Segment should be used. Jones calls the IDirectMusicSegment::GetDefaultResolution() method to find out what this is, so you can see it in the display and change it to something else.
§ None: Don't set any of the resolution flags if you want the Segment to play as soon as possible. This is useful for sound effects. Note that if there is no primary or controlling Segment currently playing, a Segment will play immediately anyway, so you don't need to write a special case for starting the Segment instantly under these circumstances (no primary or controlling Segment). However, if you do care to synchronize to a "silent" time signature, set the DMUS_SEGF_TIMESIG_ALWAYS flag.
§ Grid: The smallest music resolution is the grid. DirectMusic's time signature is made up of three components: BPM (beats per measure), beat size, and grid. The first two should be familiar to you. For example, a time signature of 3/4 means there are three beats to the measure and the beat is a quarter note. This works fine, but it doesn't allow for a sense of sub-beat resolution, which can be pretty important. If you want a piece to have a "triplet feel," you want to subdivide the beat into threes, rather than twos or fours. So, the grid resolution, which is authored into the time signature in DirectMusic Producer, indicates the lowest-level timing resolution. When you choose Grid, Jones sets the DMUS_SEGF_GRID flag and the Segment plays at the next available grid time. The Segment is still subtly tied into the feel of the music. Be mindful to keep the Segment simple, or it could conflict with the rhythm of the other Segments when it is out of phase with their beat timing.
§ Beat: This sets the start time to line up with the next beat. This is typically the most useful for layering secondary Segments. It can also come in handy when a sudden change is required and it's too long to wait for the end of the measure to start a change in the music. Set this with the DMUS_SEGF_BEAT flag.
§ Measure: This sets the start time to line up with the next measure, or bar. This is typically the most useful for transitioning between Segments because it keeps the musical meter intact. Some secondary Segments, which have relatively long and complicated phrasing, might require alignment to the measure to sound right. Set this with the DMUS_SEGF_MEASURE flag.
§ Marker: This sets the start time to line up with a specific marker placed in a Marker Track in the currently playing primary Segment. This is useful in situations where transitions on measure boundaries are not acceptable because they might break up the musical phrases. Since markers can be placed anywhere, regardless of the time signature, they can be used in different ways. For example, markers can be used to set trigger points for secondary Segments on specific beats. Set this with the DMUS_SEGF_MARKER flag.
§ Segment End: You can set a Segment to start playback when the current primary Segment finishes. This is useful in two ways: Segment End can be used to schedule a Segment to play when the current one finishes or it can be used in conjunction with the alignment flags (see the section titled "Aligned Cut In") to switch between two Segments that start at the same time. Set this with the DMUS_SEGF_SEGMENTEND flag.
§ End of Queue: We have one final option that looks remarkably similar to Segment End but is useful in a different way. If you'd like to queue up a series of Segments to play one after the other, play them with the DMUS_SEGF_QUEUE flag. This only applies to primary Segments. This is very useful if you have a series of Segments that make up the music for a scene and you'd like to just queue them all up at the beginning and then forget about it. You can even set up a song arrangement with repeats of the same Segments in different places. DirectMusic will play the Segments back in the order in which you queued them with the PlaySegmentEx() call. However, you can break out of the queue as soon as you need to. Play a primary Segment that does not have the DMUS_SEGF_QUEUE flag set, and DirectMusic immediately flushes all Segments queued in the future and replaces them with the new Segment.
Note
It is also possible to specify exactly when a Segment should play. Usually, this isn't necessary because DirectMusic's automated mechanisms for finding the best time to start a Segment work so well. However, you can specify an exact time in music- or clock-time units via the i64StartTime parameter to PlaySegmentEx(). By default, the value placed in i64StartTime is in MUSIC_TIME units, indicating that it tracks the musical tempo. However, you can put an absolute REFERENCE_TIME value in i64StartTime and set the DMUS_SEGF_REFTIME flag.
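The arithmetic behind grid, beat, and measure resolution can be worked through with a little integer math. The sketch below is a simplification, not DirectMusic's scheduler: it assumes 768 ticks per quarter note (consistent with the 768*4 measure length given to Band Segments earlier), a time signature that has been constant since tick 0, and no latency. TimeSig, Resolution, TicksPerUnit, and NextBoundary are hypothetical names.

```cpp
#include <cassert>

// Assumption: 768 ticks per quarter note, matching the 768*4 measure
// length used for Band Segments earlier in this chapter.
const long kTicksPerQuarter = 768;

struct TimeSig
{
    int beatsPerMeasure;  // e.g. 3 in 3/4
    int beatSize;         // e.g. 4 in 3/4 (quarter-note beat)
    int gridsPerBeat;     // sub-beat resolution
};

enum Resolution { RES_NONE, RES_GRID, RES_BEAT, RES_MEASURE };

// Length in ticks of one unit of the requested resolution.
long TicksPerUnit(const TimeSig &ts, Resolution res)
{
    const long beat = kTicksPerQuarter * 4 / ts.beatSize;
    switch (res)
    {
    case RES_GRID:    return beat / ts.gridsPerBeat;
    case RES_BEAT:    return beat;
    case RES_MEASURE: return beat * ts.beatsPerMeasure;
    default:          return 1;   // RES_NONE: any tick will do.
    }
}

// Earliest boundary at or after `now` (now >= 0), assuming the time
// signature has been constant since tick 0.
long NextBoundary(long now, const TimeSig &ts, Resolution res)
{
    const long unit = TicksPerUnit(ts, res);
    return ((now + unit - 1) / unit) * unit;
}
```

In 4/4 with four grids per beat, a request made at tick 1000 lands on 1152 for Grid, 1536 for Beat, and 3072 for Measure. In 3/4 with three grids per beat the grid is 256 ticks, so the same request lands on 1024, which is the "triplet feel" subdivision in action.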
Delay
You can specify the earliest time to use to calculate the timing resolution for when a Segment should begin playing. Choose a Segment and, as you read about each delay option, try it out. Watch the timeline display to see the different delays.
§ Optimal: By default, DirectMusic looks at the playback mode and determines which latency to apply. If the Segment is primary or controlling, and there currently are Segments playing, it sets the latency to Queue Time. Otherwise, it opts for the fastest choice, which is Synth Latency.
§ Synth Latency: This is the default start time for secondary Segments, since it is the earliest possible time the Segment can play. Since secondary Segments just add new notes but don't affect any currently playing Segments in any way, it's best to play as soon as possible. This sets the DMUS_SEGF_AFTERLATENCYTIME flag. Additionally, if a primary Segment starts when nothing is playing, it plays at Synth Latency.
§ Queue Time: This sets the DMUS_SEGF_AFTERQUEUETIME flag, indicating that the Segment must start playback at some point after the Queue Time. Queue Time is important because it is the last chance to invalidate (turn off) pmsgs just before they go down to the synth. This is useful if the start of the new Segment causes invalidation of another Segment, either because as a primary Segment it overrides the previously playing primary Segment or as a controlling Segment it requires a regeneration of notes in currently playing Segments. This is the default timing for primary and controlling Segments.
§ Prepare Time: This option sets the DMUS_SEGF_AFTERPREPARETIME flag, indicating that the new Segment must not start playing until after Prepare Time. Although setting Prepare Time incurs greater delay, it can be useful in several situations:
o If the invalidation caused by starting after Queue Time is unsatisfactory. Sometimes, invalidations caused by controlling Segments or new primary Segments chop off and restart notes in ways that don't work.
o When a Segment has a streaming wave, it needs more time to preload the start of the wave if downloading is disabled for that wave (which dramatically reduces memory overhead) or if it picks up partway through the Segment.
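The Optimal rule reduces to a small decision function. This sketch encodes only the rule as stated in the list above; it is not how DirectMusic implements it internally, and the enum and function names are hypothetical.

```cpp
#include <cassert>

enum PlayMode { MODE_PRIMARY, MODE_SECONDARY, MODE_CONTROLLING };
enum Delay    { DELAY_SYNTH_LATENCY, DELAY_QUEUE_TIME, DELAY_PREPARE_TIME };

// Pick the delay that the "Optimal" setting is described as choosing.
// segmentsPlaying: true if any Segment is currently playing.
Delay OptimalDelay(PlayMode mode, bool segmentsPlaying)
{
    // Only primary and controlling Segments can invalidate other Segments,
    // and only when there is something playing to invalidate.
    const bool mayInvalidate =
        (mode == MODE_PRIMARY || mode == MODE_CONTROLLING);
    if (mayInvalidate && segmentsPlaying)
    {
        return DELAY_QUEUE_TIME;
    }
    return DELAY_SYNTH_LATENCY;   // Fastest choice for everything else.
}
```

Prepare Time never comes out of this function. As described above, it is something you opt into explicitly when Queue Time invalidations sound wrong or a streaming wave needs the extra lead time.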
Time Signature
This sets the DMUS_SEGF_TIMESIG_ALWAYS flag, indicating that even if there is no primary or controlling Segment currently playing, the time signature from the last played primary Segment holds, and the start time should synchronize with it, depending, of course, on the resolution flags.
The easiest way to understand the Time Signature (DMUS_SEGF_TIMESIG_ALWAYS) option is to play with it a bit. If there are any Segments currently playing, turn them off. Select a motif Segment from under StyleSegment.sgt. With Time Signature turned off, click on the Play button several times to get multiple instances of the Segment playing. Notice that they play immediately but completely out of sync with each other. Now, turn on the check box and try again. This time, the Segments start a little later, as they line up with a silent time signature, but they play in lockstep.
Transition
Sometimes, when there is a dramatic change in the state of the host application (say, in the story of an interactive game), the music should be able to respond by changing in a dramatic way. If you are going to suddenly play something significantly different, it helps if there is a short transition Segment that can bridge between the previous and new musical parts. Using DirectMusic Producer, it is possible to create a transition Segment that can bridge between a random point in the predecessor Segment and the start of a new Segment, often picking up elements from both of them on the fly. Set the DMUS_SEGF_AUTOTRANSITION flag and place the transition Segment in the pTransition parameter of PlaySegmentEx() to use the transition Segment. Jones accomplishes this by letting you choose a template Segment from the list of already loaded Segments, which it displays in the Transition drop-down list.
Aligned Cut In
Sometimes it would be nice to start playback of a Segment immediately, yet keep it rhythmically aligned with a larger resolution. For example, you might have a motif that adds a musical layer in response to some user action that needs to align to the measure but needs to be heard immediately. Or you'd like to transition to a new primary Segment immediately, yet keep the meter aligned. In order to use this feature, you need to do two things:
1. Set the timing resolution for the logical start of the new Segment. In other words, set the timing resolution as if the new Segment played from the start. To do this, use one of the resolution flags that we have already discussed. This ensures that the new Segment will still line up rhythmically, even though it may cut in anywhere in the Segment.
2. Set the resolution at which the cut-in may occur. This should be a smaller size than the resolution flag. Why? Resolution ensures that the two Segments line up rhythmically, allowing a cut-in to occur at a finer resolution (in other words sooner), which is the whole purpose of this feature.
The Aligned Cut In menu enables this feature and provides choices for the timing of the cut.
A typical example is to set Resolution to Measure and Aligned Cut In to Beat. Instead of waiting for the next measure, the Segment looks back in time and aligns its start with the last measure boundary and initiates playback by cutting over on the next beat. So you get the desired rhythmic effect of measure resolution without waiting for the next measure. The Aligned Cut In option provides a way to determine what's a safe place to make that cut, so it's not just at an arbitrary place. Let's look at each option:
§ Disabled: This option simply doesn't activate Aligned Cut In, so none of the alignment flags are set. In order to enable Aligned Cut In, DMUS_SEGF_ALIGN must be set along with an optional cut-in resolution flag. The remaining options select each of these.
§ Only on Marker: This is the default behavior because it provides the most sophisticated control of the cut-in. Cut-in markers can be placed in the Segment when authored in DirectMusic Producer. These markers identify ideal points to switch to the new Segment. Just the DMUS_SEGF_ALIGN flag is set to enable this behavior. However, there can be circumstances where there are no markers and a fallback resolution is desired. These options follow. Understand, though, that if there is a marker placed in the Segment, it will always override any of these. Often, though, these options work well, and there is no need for specific markers.
§ Anywhere: This option enables the fallback cut to occur at music-time resolution (in other words, as soon as possible). It sets the DMUS_SEGF_VALID_START_TICK flag along with DMUS_SEGF_ALIGN.
§ On Grid: This option enables the fallback cut to occur on a grid boundary. It sets the DMUS_SEGF_VALID_START_GRID flag along with DMUS_SEGF_ALIGN.
§ On Beat: This option enables the cut to occur on a beat boundary. It sets the DMUS_SEGF_VALID_START_BEAT flag along with DMUS_SEGF_ALIGN.
§ On Measure: Enables the cut to occur on a measure boundary. It sets the DMUS_SEGF_VALID_START_MEASURE flag along with DMUS_SEGF_ALIGN.
To try out the different Aligned Cut In options, once again use StyleSegment.sgt. This has some markers embedded within it. Play it as a primary Segment, wait a little, and then click on Play again. Notice that it picks up on a measure boundary (assuming you didn't change the resolution) but starts at the beginning. Now, change Resolution to Segment End and set Aligned Cut In to Only on Marker. Again play the Segment and then click on Play a second time. This time, notice that the new Segment starts playing but it picks up at a predefined point partway into it. You know it starts out partway because the start of the Segment is green, indicating it was never played. Also note that the start of the new Segment is all the way off the left side of the screen because it has aligned with the start of the first instance of it playing.
Next, experiment by playing motifs on top of a primary Segment. For the motifs, set Resolution to Measure and Aligned Cut In to On Grid. The motifs rhythmically align with the measure but start playing partway through.
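The "look back to align, look ahead to cut" idea can be checked with integer arithmetic. The sketch below is an illustration under simplifying assumptions (times in ticks counted from 0, fixed unit lengths, no markers, no latency) and is not DirectMusic's algorithm. AlignedStart and ComputeAlignedCutIn are hypothetical names.

```cpp
#include <cassert>

struct AlignedStart
{
    long logicalStart;  // Where the Segment would have started.
    long cutTime;       // Where it actually becomes audible.
    long offset;        // How far into the Segment playback begins.
};

// alignUnit: ticks in the Resolution unit (e.g. one measure).
// cutUnit:   ticks in the Aligned Cut In unit (e.g. one beat).
// Assumes now >= 0 and cutUnit <= alignUnit.
AlignedStart ComputeAlignedCutIn(long now, long alignUnit, long cutUnit)
{
    AlignedStart r;
    r.logicalStart = (now / alignUnit) * alignUnit;              // Look back.
    r.cutTime      = ((now + cutUnit - 1) / cutUnit) * cutUnit;  // Look ahead.
    r.offset       = r.cutTime - r.logicalStart;
    return r;
}
```

With a 3072-tick measure and a 768-tick beat, a request at tick 4000 aligns the Segment's logical start to 3072 and cuts in at 4608, so playback begins 1536 ticks (two beats) into the Segment. That matches what the timeline shows: the unplayed start of the new Segment sits to the left of the point where it becomes audible.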
Invalidation
By default, when a primary or controlling Segment plays, it causes an invalidation of all currently playing Segments. This means that all notes that were generated at prepare time but not yet committed to the synth at queue time are flushed, and the Segment is told to regenerate the notes. Why? Because a change in primary or controlling Segment usually means that something changed, and new notes need to be generated to reflect that. For example, if the groove level or chord changes, then previously generated notes are probably wrong, causing cacophony.
There are times when you don't want invalidation. Invalidation can cause an audible glitch when there are sustained sounds, be they DLS instruments or straight waves. Because of this, the introduction of a controlling or primary Segment can sometimes have an unfortunate effect on other aspects of the sound environment, sound effects in particular, which couldn't care less what the underlying chord change or groove level is.
There are several ways to deal with the various problems caused by invalidation:
§ Set the new Segment to play after prepare time, in which case there are no invalidations. If the delay is painful, you can adjust it downward with a call to SetPrepareTime(), but keep an ear out for timing glitches. Numbers in the 200 to 300ms range, which effectively cut it in half, should be safe. If this solution works for the music, it is usually the best because it is simple and avoids the overhead of invalidations anyway.
§ Create a separate Performance for all sound effects so the two worlds simply don't stomp on each other. This is a perfectly reasonable solution and does not incur much additional overhead, especially since the AudioPaths used are typically different anyway.
§ Use one of two flags that reduce the scope of invalidations. These are represented in the Invalidation menu.
o All Segments: This is the default behavior. It leaves all invalidation intact.
o Only Primary: This sets the DMUS_SEGF_INVALIDATE_PRI flag. Only the notes in the currently playing primary Segment are invalidated. All other Segments are left alone.
o No Segments: This sets the DMUS_SEGF_NOINVALIDATE flag, turning off all invalidation. For some circumstances, this is a perfectly reasonable solution.
Use Path
A Segment can be authored with an embedded AudioPath. However, that AudioPath is not automatically created and invoked each time the Segment plays. The reason is simple: Creating multiple AudioPaths can be very CPU intensive, especially if they invoke lots of audio effects. If you are going to play the same Segment over and over again or if you intend to share that AudioPath with multiple Segments, then it makes more sense to directly create the AudioPath and manage its usage explicitly.
But there are situations where you would like an embedded AudioPath to be automatically used, so there's a flag to enable that. When a Segment has its AudioPath embedded within itself, it carries everything necessary to perform. There is no need to externally match up the Segment with an appropriate AudioPath that may happen to have filters, etc., set specifically for the Tracks in the Segment. A DirectMusic Media Player application would benefit from embedded Segments, for example.
When a Segment has an embedded AudioPath configuration, this option is enabled. It sets the DMUS_SEGF_USE_AUDIOPATH flag.
Setting Segment Options
The CSegment class stores and retrieves all the information that you select in the display, with methods for setting and getting data. All of the Segment flags are stored in one variable, m_dwPlayFlags, accessed via calls to SetFlags() and GetFlags(). Note that this is initially filled with the default flags that were placed in the Segment file at author time.
The DirectMusic Segment file format does not have a way to define transition Segments, but the PlaySegmentEx() call does take such a parameter. So, CSegment has a field, m_pTransition, which is a pointer to a CSegment to be played as a transition into the parent CSegment. m_pTransition is managed via the SetTransition() and GetTransition() calls, which simply access the variable.
To display the name of the Segment, we have a slightly more complex routine that retrieves the name from the DMUS_OBJECTDESC descriptor that was retrieved at the time the file was loaded and converts from Unicode to ASCII. In some cases, the Segment may not have a name because none was authored into it (typically the case with MIDI and wave files). If so, it uses the file name. If all else fails, it returns a descriptive error message based on the type of Segment.
void CSegment::GetName(char *pszBuffer)
{
    pszBuffer[0] = 0;
    if (m_Desc.dwValidData & DMUS_OBJ_NAME)
    {
        // Convert from Unicode to ASCII.
        wcstombs(pszBuffer, m_Desc.wszName, DMUS_MAX_NAME);
    }
    else if (m_Desc.dwValidData & DMUS_OBJ_FILENAME)
    {
        wcstombs(pszBuffer, m_Desc.wszFileName, DMUS_MAX_NAME);
        // Get rid of any file path.
        char *pszName = strrchr(pszBuffer,'\\');
        // Source and destination overlap, so use memmove() rather than strcpy().
        if (pszName) memmove(pszBuffer,pszName + 1,strlen(pszName + 1) + 1);
    }
    else
    {
        static char * s_pszNames[3] = { "Band", "Motif", "File" };
        wsprintf(pszBuffer,"%s (no name)",s_pszNames[m_ssType - 1]);
    }
}
In order to enable the Use Path check box, we have to find out whether the Segment actually has an AudioPath configuration embedded within it. The only way to do this is to actually call the DirectMusic API to retrieve the configuration and return true if one exists.
bool CSegment::HasEmbeddedAudioPath()
{
    IUnknown *pConfig;
    if (SUCCEEDED(m_pSegment->GetAudioPathConfig(&pConfig)))
    {
        pConfig->Release();
        return true;
    }
    return false;
}
That covers all the routines for managing the Segment's display data. Now, let's actually play the dang thing. Segment playback is handled by the CAudio::PlaySegment() method. This is relatively simple because we're just retrieving the parameters we need, calling DirectMusic's PlaySegmentEx() routine, and then storing the returned IDirectMusicSegmentState. First, check to see if there is a transition Segment. If so, get the IDirectMusicSegment interface and set the DMUS_SEGF_AUTOTRANSITION flag. Then, call PlaySegmentEx() and pass it the transition Segment as well as an empty SegmentState interface
(IDirectMusicSegmentState). If the call succeeded, create a CSegState object to track both the SegmentState and Segment that spawned it. This will be used for tracking and eventually stopping it later.
void CAudio::PlaySegment(CSegment *pSegment)
{
    if (pSegment)
    {
        // Is there a transition Segment?
        IDirectMusicSegment8 *pTransition = NULL;
        if (pSegment->GetTransition())
        {
            pTransition = pSegment->GetTransition()->GetSegment();
        }
        DWORD dwFlags = pSegment->GetFlags();
        if (pTransition)
        {
            dwFlags |= DMUS_SEGF_AUTOTRANSITION;
        }
        IDirectMusicSegmentState8 *pISegState;
        if (SUCCEEDED(m_pPerformance->PlaySegmentEx(
            pSegment->GetSegment(),   // Returns IDirectMusicSegment
            NULL,pTransition,         // Use the transition, if it exists.
            dwFlags,                  // Playback flags.
            0,                        // No time stamp.
            (IDirectMusicSegmentState **) &pISegState, // Returned SegState.
            NULL,                     // No prior Segment to stop.
            NULL)))                   // No AudioPath, just use default.
        {
            CSegState *pSegState = new CSegState(pSegment,pISegState);
            if (pSegState)
            {
                m_SegStateList.AddHead(pSegState);
            }
            pISegState->Release();
        }
    }
}
To stop the Segment, CAudio has two methods. You can stop an individual instance of a playing Segment with CAudio::StopSegState(), or you can stop all playing instances of the Segment with a call to CAudio::StopSegment().
StopSegState() stops just the individual instance of a playing Segment as managed by the CSegState object. It uses the same timing resolution flags that were used to play the Segment. For example, if a Segment has been set to play on a measure boundary, it will also stop on a measure boundary. Only some of the play flags are legal for Stop. These are defined by STOP_FLAGS.
#define STOP_FLAGS (DMUS_SEGF_BEAT | DMUS_SEGF_DEFAULT | DMUS_SEGF_GRID | \
                    DMUS_SEGF_MEASURE | DMUS_SEGF_REFTIME | DMUS_SEGF_MARKER)
Note that once the Segment is stopped, CSegState is no longer useful, so the caller can and should delete it, which then releases the IDirectMusicSegmentState. This is automatically done for you if you use CAudio::StopSegment().
void CAudio::StopSegState(CSegState *pSegState)
{
    if (pSegState)
    {
        m_pPerformance->StopEx(pSegState->GetSegState(),0,
            pSegState->GetSegment()->GetFlags() & STOP_FLAGS);
    }
}
CAudio::StopSegment() stops all instances of a currently playing Segment. This could be accomplished by passing the IDirectMusicSegment interface directly to StopEx(). However, since we are tracking all of the SegmentStates via the list of CSegStates, we should use them so that we can release their memory at the same time. Scan through the list of all CSegStates and for each that matches the passed CSegment, call StopSegState(). Then, remove from the list and delete. The destructor for CSegState releases the IDirectMusicSegmentState interface, allowing the object to be reclaimed by DirectMusic.
void CAudio::StopSegment(CSegment *pSegment)
{
    if (pSegment)
    {
        CSegState *pNext;
        CSegState *pSegState = m_SegStateList.GetHead();
        for (;pSegState;pSegState = pNext)
        {
            // Get the next pointer now, since pSegState
            // may be pulled from the list if it matches.
            pNext = pSegState->GetNext();
            // Is this from the requested Segment?
            if (pSegState->GetSegment() == pSegment)
            {
                // Stop it.
                StopSegState(pSegState);
                // Free the SegState.
                m_SegStateList.Remove(pSegState);
                delete pSegState;
            }
        }
    }
}
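The "grab the next pointer before removing" idiom in StopSegment() has a direct counterpart in the standard library. As a point of comparison only, here is a sketch using std::list, whose erase() returns the element after the one removed. SegState here is a hypothetical stand-in that holds an integer id rather than the real CSegState, and no DirectMusic calls are made.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for CSegState: remembers which Segment spawned it.
struct SegState
{
    int segmentId;
};

// Remove every entry spawned by segmentId; return how many were removed.
int RemoveStatesFor(std::list<SegState> &states, int segmentId)
{
    int removed = 0;
    for (std::list<SegState>::iterator it = states.begin();
         it != states.end(); )
    {
        if (it->segmentId == segmentId)
        {
            // A real implementation would stop playback here first.
            it = states.erase(it);   // erase() hands back the next element.
            ++removed;
        }
        else
        {
            ++it;
        }
    }
    return removed;
}
```

Either way, the rule is the same: never advance through an element that has just been removed from the list.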
Tracking Control Parameters
As we discussed earlier, a big piece of DirectMusic's musical power comes from its ability to share musical control information across multiple playing Segments. You've heard what happens when you experiment with playing different combinations of controlling, primary, and secondary Segments. You can also see it. Across the bottom of Jones is a display of four control parameters: Tempo, Time Sig, Groove, and Chord.
At any point in time, each control parameter is generated by the current primary Segment or a controlling Segment that is overriding it. To help visualize what is going on with the control parameters as well as demonstrate code to access these, CAudio() includes four methods for retrieving each of the parameters via IDirectMusicPerformance::GetParam() and converting the data into text strings for display. CAudio::GetTimeSig() finds out what the current time signature is and writes it in a string. It calls IDirectMusicPerformance::GetParam with the GUID_TimeSignature command and DMUS_TIMESIGNATURE structure to retrieve the time signature from whatever Segment is generating it. Note Even when no Segments are playing, there is still a time signature. Along with tempo, DirectMusic keeps track of the last played time signature as a special case because it is necessary for scheduling new Segments. bool CAudio::GetTimeSig(char *pszText) { if (m_pPerformance) { // Since we can get parameters across a broad range of time, // we need to get the current play time in music format for our request. MUSIC_TIME mtTime; m_pPerformance->GetTime(NULL,&mtTime);
// Use GUID_TimeSignature command and DMUS_TIMESIGNATURE structure DMUS_TIMESIGNATURE TimeSig; if (SUCCEEDED(m_pPerformance->GetParam(GUID_TimeSignature, -1,0,mtTime,NULL,(void *) &TimeSig))) { wsprintf(pszText,"%ld/%ld:%ld",(long)TimeSig.bBeatsPerMeasure, (long)TimeSig.bBeat,(long)TimeSig.wGridsPerBeat); } else { strcpy(pszText,"None"); } return true; } return false; } CAudio::GetGroove() follows a similar format. It finds out what the current groove level is and writes that groove level into a string. CAudio::GetGroove() calls GetParam() with the GUID_Command-Param and DMUS_COMMAND_PARAM structure to retrieve the groove level information from whatever Segment is generating it. Note that the DMUS_COMMAND_PARAM also includes embellishment information, such as DMUS_COMMANDT_FILL and DMUS_COMMANDT_BREAK, but GetGroove() ignores these (left as an exercise for the reader!). bool CAudio::GetGroove(char *pszText) { if (m_pPerformance) { // Since we can get parameters across a broad range of time, // we need to get the current play time in music format for our request. MUSIC_TIME mtTime; m_pPerformance->GetTime(NULL,&mtTime); // Use GUID_CommandParam command and DMUS_COMMAND_PARAM structure DMUS_COMMAND_PARAM Groove; if (SUCCEEDED(m_pPerformance->GetParam(GUID_CommandParam, -1,0,mtTime,NULL,(void *) &Groove))) { // Groove level is a number between 1 and 100.
wsprintf(pszText,"%ld",(long)Groove.bGrooveLevel); } else { strcpy(pszText,"None"); } return true; } return false; }
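To see why the time signature matters for scheduling, it can help to work out where the next measure boundary falls. The following is only a rough sketch under stated assumptions: it assumes music time runs at 768 ticks per quarter note (the value I understand DMUS_PPQ to have in the DirectX headers) and that the time signature took effect at music time 0 and has not changed since. The TimeSig struct and both helper functions are hypothetical; in a real application the Performance does this arithmetic for you.

```cpp
#include <cstdint>

// Assumption: 768 music-time ticks per quarter note (believed to match DMUS_PPQ).
const std::int32_t kTicksPerQuarter = 768;

// Hypothetical stand-in for the two DMUS_TIMESIGNATURE fields used here.
struct TimeSig {
    std::uint8_t beatsPerMeasure;  // Top number, e.g. 6 in 6/8
    std::uint8_t beat;             // Bottom number, e.g. 8 in 6/8 (must not be 0 here)
};

// Length of one measure in music-time ticks.
inline std::int32_t TicksPerMeasure(TimeSig ts)
{
    return ts.beatsPerMeasure * (kTicksPerQuarter * 4 / ts.beat);
}

// First measure boundary strictly after 'now', assuming the time
// signature started at music time 0 and never changed.
inline std::int32_t NextMeasureBoundary(std::int32_t now, TimeSig ts)
{
    const std::int32_t length = TicksPerMeasure(ts);
    return (now / length + 1) * length;
}
```

Under these assumptions a 4/4 measure is 3072 ticks long and a 6/8 measure is 2304 ticks long, so a Segment queued for a measure boundary at tick 3071 in 4/4 would start one tick later.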
CAudio::GetTempo() retrieves the current tempo and writes it in a string. It calls GetParam() with the GUID_TempoParam command and the DMUS_TEMPO_PARAM structure to retrieve the tempo from whatever Segment is generating it. Note that even when no Segments are playing, there still is a tempo. As with time signature, DirectMusic treats this as a special case because it is necessary for subsequent Segment scheduling.

bool CAudio::GetTempo(char *pszText)
{
    if (m_pPerformance)
    {
        // Since we can get parameters across a broad range of time,
        // we need to get the current play time in music format for our request.
        MUSIC_TIME mtTime;
        m_pPerformance->GetTime(NULL,&mtTime);
        // Use GUID_TempoParam command and DMUS_TEMPO_PARAM structure
        DMUS_TEMPO_PARAM Tempo;
        if (SUCCEEDED(m_pPerformance->GetParam(GUID_TempoParam,
            -1,0,mtTime,NULL,(void *) &Tempo)))
        {
            // Tempo is a floating point number.
            sprintf(pszText,"%3.2f",Tempo.dblTempo);
        }
        else
        {
            strcpy(pszText,"None");
        }
        return true;
    }
    return false;
}

Finally, CAudio::GetChord() finds the current chord and writes it in a string. It calls GetParam() with GUID_ChordParam and the DMUS_CHORD_KEY structure. This actually returns both chord and key information; if you take a look at DMUS_CHORD_PARAM, it's quite complex. There's a lot of rich harmonic information stored in DMUS_CHORD_KEY and the array of DMUS_SUBCHORD structures, allowing multiple parallel chords and much more.

typedef struct _DMUS_SUBCHORD
{
    DWORD dwChordPattern;     /* Notes in the subchord */
    DWORD dwScalePattern;     /* Notes in the scale */
    DWORD dwInversionPoints;  /* Where inversions can occur */
    DWORD dwLevels;           /* Which levels are supported by this subchord */
    BYTE  bChordRoot;         /* Root of the subchord */
    BYTE  bScaleRoot;         /* Root of the scale */
} DMUS_SUBCHORD;

typedef struct _DMUS_CHORD_KEY
{
    WCHAR         wszName[16];                     /* Name of the chord */
    WORD          wMeasure;                        /* Measure this falls on */
    BYTE          bBeat;                           /* Beat this falls on */
    BYTE          bSubChordCount;                  /* Number of chords in the list of subchords */
    DMUS_SUBCHORD SubChordList[DMUS_MAXSUBCHORD];  /* List of subchords */
    DWORD         dwScale;                         /* Scale underlying the entire chord */
    BYTE          bKey;                            /* Key underlying the entire chord */
    BYTE          bFlags;                          /* Miscellaneous flags */
} DMUS_CHORD_KEY;
Indeed, this is the source of the rather overwhelming Chord Properties page in DirectMusic Producer.
Figure 10-5: The Chord Properties window

However, for the purposes of this exercise, we will happily only display the chord name with the root of the first chord, and leave it at that.

bool CAudio::GetChord(char *pszText)
{
    if (m_pPerformance)
    {
        // Since we can get parameters across a broad range of time,
        // we need to get the current play time, in music format, for our request.
        MUSIC_TIME mtTime;
        m_pPerformance->GetTime(NULL,&mtTime);
        // Use GUID_ChordParam command and DMUS_CHORD_KEY structure
        DMUS_CHORD_KEY Chord;
        if (SUCCEEDED(m_pPerformance->GetParam(GUID_ChordParam,
            -1,0,mtTime,NULL,(void *) &Chord)))
        {
            // Display the root of the chord as well as its name. For the root, use a
            // lookup table of note names.
            static char szRoots[][12] =
                { "C", "C#", "D", "Eb", "E", "F", "F#", "G", "G#", "A", "Bb", "B" };
            DWORD dwRoot = Chord.SubChordList[0].bChordRoot + 12;
            // wszName is a wide (WCHAR) string, so format it with %ls.
            wsprintf(pszText,"%s%d%ls",szRoots[dwRoot%12],dwRoot/12,Chord.wszName);
        }
        else
        {
            strcpy(pszText,"None");
        }
        return true;
    }
    return false;
}

We've learned a lot in this chapter. Segment playback is understandably at the heart of DirectX Audio. When used for creating music, it can be quite sophisticated. We've covered the scheduling and control of Segment playback, in particular from a music perspective. If you are using DirectMusic's deeper music features, I encourage you to play with Jones and refer back to this chapter as you proceed. A good understanding of how these work will save you from a lot of head scratching and trial-and-error programming later on.
Chapter 11: AudioPaths

Overview

Todor Fay

AudioPaths bring a whole new level of flexibility and control to your music and sound effects. So far, everything we have done plays through one AudioPath. This has worked well for us in our programming examples up to this point; however, it is limited with respect to adding sound effects and increased musical complexity. Let's take a look:

§ No way to address 3D: For sound effects work, it is critical that one or more sound-producing Segments can be routed to a specific 3D position in space. It is also required that the 3D position be directly accessible to the host application so it can be controlled during gameplay.

§ Pchannel collision: Segments authored with the same pchannels conflict when played at the same time. For example, instrument choices from one Segment are applied to another, and a piano part is played by a bassoon. It should be possible to play two Segments in their own "virtual" pchannel spaces, so they cannot overlap.

§ No independent audio processing: It should be possible to set up different audio processors to affect different sound effects paths in different ways, such as applying echo to sounds going under a tunnel or distortion to an engine. Likewise, different musical instruments as well as sections can benefit from individualized audio processing (reverb, chorus, compression, etc.).

§ No independent MIDI processing: In the same vein, it should be possible to set up different MIDI, or pmsg, processors (Tools) to manipulate individual musical parts and sound effects.
In fact, this was the status quo with DirectX 7.0. AudioPaths, introduced in DirectX 8.0, set out to solve these issues.
Anatomy of an AudioPath

So what is an AudioPath? An AudioPath is a DirectMusic object that manages the route that a musical element or sound effect takes. It defines everything that is needed to get from the point when a Segment Track emits a pmsg (Performance message) to the point when the rendered audio finally leaves DirectSound as PCM wave data. This includes everything from how many pchannels to use to the placement of audio effects and their configuration.

An AudioPath can be as simple or sophisticated as you want it to be. It can be a 3D path with nothing more than a DirectSound3D module at the end for positioning, or it can be a complex path with different pchannels assigned to different audio channels with real-time audio effects processing.

Let's dissect the AudioPath from top to bottom and learn about all the possible steps. Because there's so much that it can do, you might find this a bit overwhelming. It's not important that you memorize all the steps. Just read this and get a feel for the possibilities. Later, if you want to zero in on a particular feature, you can read through it again.

The AudioPath can be broken down into two phases, before and after the synth (see Figure 11-1). Before the synth, the play data is managed in pmsg form inside DirectMusic's Performance Layer. Each pmsg identifies a particular sound to start or stop or perhaps a control, such as which instrument to play or its volume at any point in time. The synth converts the pmsg into wave data, at which point we are in the second phase of the AudioPath, where raw wave data is fed through multiple streaming DirectSound buffers to be processed via effects, optionally positioned in 3D, and finally mixed to generate the output.
Figure 11-1: AudioPath phases.
AudioPath Performance Phase

Figure 11-2 shows the Performance phase in greater detail. This is the journey a pmsg takes as it travels through the AudioPath from Track to synth:
Figure 11-2: AudioPath Performance phase.

Wow! Does that look complicated. Is there danger of the pmsg making it through in time for lunch, let alone alive? Have no fear; it's not as crazy as it looks. If you are worried about extra latency or CPU overhead, don't be. The internal implementation is lightweight and efficient. Also, most of the objects in the path are optional, so their steps are simply skipped when they don't exist. Let's walk through the steps:

1. Track emits a pmsg. The pmsg is filled with appropriate data and time stamped. There is a wide range of available pmsg types, from MIDI note to wave.

2. The pmsg enters the Segment's Tool Graph. This is optional. The Graph holds a set of one or more Tools, which are pmsg processors. These can alter the pmsg in real time. Simple examples of Tools would be echo (make multiple copies at later timestamps) and transpose (shift the pitch up or down). Every Tool is represented by an IDirectMusicTool interface. Tools are very easy to write and use. In fact, there's a wonderful Tool wizard that ships with DX9, so if you have wild ideas for things you'd like to do in real time to your music or sounds as they play, I highly recommend giving it a try. Unfortunately, there isn't enough room in this book to walk through the process of creating full-featured Tools. Tools in the Segment are typically authored directly into the Segment and only process pmsgs in the Segment itself.

3. The pmsg leaves the Segment and enters the AudioPath's Tool Graph. Tools in this Graph process all pmsgs coming from all Segments playing in the AudioPath. Think of these as partially global Tools, in that they process all pmsgs that flow through a specific AudioPath, regardless of the Segment. These Tools are also optional and typically authored via the AudioPath configuration file.

4. The AudioPath maps the pmsg pchannel from the local AudioPath-defined pchannel range to a unique pchannel in the Performance. This ensures that Segments played on different AudioPaths cannot collide on the same pchannels.

5. The Performance also has an optional Tool Graph. Tools in this Graph process pmsgs from all Segments on all AudioPaths. These are truly global Tools.

6. Then, the Performance maps the pmsg's pchannel to a real MIDI channel and channel group number because the synthesizer operates completely in the MIDI domain. Additionally, the pmsg is converted into MIDI format data. For example, a note pmsg is converted to a MIDI note on and a MIDI note off. Or, a volume curve is converted into a series of discrete MIDI volume control change commands. The resulting MIDI events are shipped down to the synth as a block of MIDI commands.

7. Finally, the synth receives the MIDI data and renders it into one or more channels of wave data. Note that even the type of synthesizer and which pchannels it connects to is managed by the AudioPath.

The cool thing is the AudioPath has access and control over almost all of the steps in this process. This means that via the AudioPath API, you can access most of these as well. Let's continue with the second phase of the AudioPath.
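The two channel mappings in steps 4 and 6 can be pictured with a small model. This is only an illustrative sketch, not how DirectMusic is implemented: the PChannelModel class, its method names, and the choice to hand each path a consecutive block of Performance pchannels (with 16 channels per channel group, groups numbered from 1) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical result type: where a pchannel ends up in the MIDI domain.
struct MidiAddress {
    std::uint32_t channelGroup;  // 1-based group of 16 MIDI channels
    std::uint32_t channel;       // 0..15 within the group
};

// Hypothetical model of the Performance's pchannel bookkeeping.
class PChannelModel {
public:
    // Reserves 'count' consecutive Performance pchannels for a new
    // AudioPath and returns an index that identifies the path.
    std::size_t CreatePath(std::uint32_t count)
    {
        bases_.push_back(next_);
        counts_.push_back(count);
        next_ += count;
        return bases_.size() - 1;
    }

    // Step 4: path-local pchannel -> unique Performance pchannel.
    std::uint32_t ToPerformanceChannel(std::size_t path, std::uint32_t local) const
    {
        if (local >= counts_.at(path))
            throw std::out_of_range("pchannel outside the path's range");
        return bases_.at(path) + local;
    }

    // Step 6: Performance pchannel -> MIDI channel group and channel.
    static MidiAddress ToMidi(std::uint32_t performanceChannel)
    {
        return MidiAddress{ performanceChannel / 16 + 1, performanceChannel % 16 };
    }

private:
    std::vector<std::uint32_t> bases_;   // First Performance pchannel of each path
    std::vector<std::uint32_t> counts_;  // Number of pchannels in each path
    std::uint32_t next_ = 0;             // Next unassigned Performance pchannel
};
```

With two 16-channel paths in this model, local pchannel 0 on the first path lands on Performance pchannel 0 and local pchannel 0 on the second lands on 16, so two Segments authored on the same pchannels no longer share an instrument.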
AudioPath DirectSound Phase

The synth, which is managed via DirectMusic's IDirectMusicPort interface, generates multiple streams of audio data, which are fed into one or more DirectSound buffers (see Figure 11-3). Again, the AudioPath retains access to and control over each of the steps on the way.
Figure 11-3: AudioPath DirectSound phase.

Let's walk through these steps:

1. The synth emits one or more streams of audio data. The AudioPath defines for each pchannel how many audio outputs should be created and where they should connect. These are called "buses." This means that you can literally have every single pchannel routed to a different set of effects if you so desire.

2. Each bus is fed into a DirectSound Buffer. These Buffers are called "synth-in Buffers" because they receive their input from the synth.

3. Each Buffer can hold any number of DMOs. The audio data is fed through each DMO in order. A DMO is an audio effects processor. Like Tools, you can create and insert these into your AudioPath configuration. DirectMusic ships with a great set of DMOs, from Reverb to Compression, which you can use. But be careful; each DMO uses CPU to continuously process the audio. So the more you add, the more CPU gets gobbled.

4. If the Buffer is set up as a 3D Buffer, the audio data next enters the 3D engine, where it is placed in 3D space and sent to the final mix. Otherwise, the data goes directly to the final mix.

5. A send can be placed in the Buffer to route audio data to another Buffer. The destination Buffer takes its input from other Buffers, so it is called a "Mix-in Buffer."

6. The Mix-in Buffer also has one or more DMOs, which can process the data before sending to the final mix. These are often called global Buffers because they can take input from any number of regular Buffers. This is a great way to economize on CPU. When creating the AudioPath configuration in Producer, place Send Effects in your regular Buffers to send to a shared Mix-in Buffer, which has some CPU-expensive processing to do, and just that one instance can do the work for all.

Pretty amazing, eh? With AudioPaths, you can create sophisticated routing and processing of your audio all the way from the Segment to the final mix. There are many cool things you can stick in it and many ways to configure it. You can create any number of instances of any AudioPath and each has its own virtual pchannel space, so your Segments won't collide.

Note: One word about terminology. Although we use the term "Buffer," it is anything but that. Historically, a Buffer in DirectSound represented a genuine memory buffer. You'd create it, copy your audio data into it, and then play it. As we discussed earlier, this is very inefficient because it binds the data to the playback device. But, by DX8, buffers had many new features, including 3D and a kludge to get around the inefficiency via the DuplicateSoundBuffer command. So, we continue with the name "Buffer," but think of it really as "AudioStream" or something like that.
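Two ideas from the steps above are easy to try out in miniature: effects in a Buffer run in order (step 3), and a shared Mix-in Buffer lets one expensive effect serve many sources (step 6). The sketch below is a toy model only. Effect, BufferModel, and MixThroughShared() are hypothetical names, each "DMO" is reduced to a function on a single sample, and the sharing trick is only equivalent to per-source processing for linear effects.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a DMO: a function applied to one sample.
using Effect = std::function<float(float)>;

// Hypothetical stand-in for a DirectSound Buffer and its effects chain.
struct BufferModel {
    std::vector<Effect> effects;  // Applied in insertion order

    float Process(float sample) const
    {
        for (const Effect &effect : effects)
            sample = effect(sample);
        return sample;
    }
};

// Sums the sends from several source Buffers and runs the Mix-in Buffer's
// effects once on the sum, rather than once per source.
inline float MixThroughShared(const std::vector<float> &sends,
                              const BufferModel &mixIn)
{
    float sum = 0.0f;
    for (float send : sends)
        sum += send;
    return mixIn.Process(sum);
}
```

Swapping the order of two effects in BufferModel::effects generally changes the output, which is why the order of DMOs in a Buffer matters when you lay out a configuration.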
AudioPath Configurations

Now that we've seen all the wonder of what an AudioPath can do, we need some way to define them. There are two ways to do this:

§ Create from a configuration file loaded from disk

§ Create from a predefined set of useful AudioPath definitions
AudioPath File Configuration

You can create your own definition of an AudioPath and store it in a file called an AudioPath configuration file. DirectMusic Producer has a sophisticated AudioPath editor that lets you create, audition, and save AudioPath configuration files. With it, you can address all the features that we've discussed above, from pchannel assignments to Tools, DMOs, and Buffers.

The AudioPath configuration file does not load directly as an AudioPath. This would be unfortunate because it would mean that each file could only generate one instance of an AudioPath. Let's think of an example: Suppose you've designed an AudioPath for playing race car sounds in 3D. You create an AudioPath configuration with distortion and compression DMOs along with 3D positioning capability. You want to use the same AudioPath design interchangeably for each car that appears on the racetrack, and there could be a lot of them. You also don't want the overhead of loading another AudioPath from disk every time a new car pops into view. Clearly, that would be ridiculous given that all of the AudioPaths are defined the same, but if you simply use the same AudioPath for all cars, it doesn't work because:

§ They run in the same pchannel space, so pchannel-based commands like volume and pitch for one car are applied to all.

§ Adjustments to the compression, distortion, and 3D position in the Buffer are all directed to the same object, so, for example, all cars are heard at the same point in space.

Therefore, you need both an AudioPath configuration that defines the AudioPath and a way to create as many AudioPaths from that configuration as needed. The creation should be very quick and efficient, definitely not involving any file I/O.

In many ways, this is analogous to Segments. You load the Segment from disk. It represents the definition of a Segment but not the running instance. When you play the Segment, you get back a SegmentState. The SegmentState is the running instance of the Segment. Likewise, the AudioPath is the running instance of the AudioPath configuration.

So, like a Segment, you load the AudioPath configuration from disk. However, unlike a Segment, the AudioPath configuration doesn't have any methods. There's nothing you can do with it directly. So, it doesn't have a unique COM interface of its own. Instead, it is represented by the base-level IUnknown interface.

With that understood, loading an AudioPath configuration is about the same as loading a Segment. Use the Loader's LoadObjectFromFile() method and pass it the file path and the class ID for an AudioPath configuration, CLSID_DirectMusicAudioPathConfig.

IUnknown *pIConfig = NULL;
pLoader->LoadObjectFromFile(
    CLSID_DirectMusicAudioPathConfig,  // AudioPath config class ID.
    IID_IUnknown,                      // No special interface.
    pwzFileName,                       // File path.
    (void **) &pIConfig);              // Config returned in pIConfig.
Once you have the AudioPath configuration loaded, create as many AudioPaths from it as you like with the Performance's CreateAudioPath() method.

// Now, use this to create a live AudioPath.
IDirectMusicAudioPath *pIPath = NULL;
m_pPerformance->CreateAudioPath(pIConfig,true,&pIPath);
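The split between a configuration and its running instances can be modeled in a few lines. This is a conceptual sketch, not DirectMusic code: PathConfig, PathInstance, and CreatePath() are hypothetical names, and the per-instance state (a 3D position and a volume per pchannel) is chosen only to mirror the race car example.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a loaded configuration: the shared, read-only design.
struct PathConfig {
    std::vector<std::string> effects;  // e.g. "distortion", "compression"
    unsigned pchannels;                // Size of each instance's pchannel space
};

// Hypothetical model of a running AudioPath with its own private state.
struct PathInstance {
    std::shared_ptr<const PathConfig> config;  // Design shared by all instances
    float x = 0.0f, y = 0.0f, z = 0.0f;        // Per-instance 3D position
    std::vector<int> volumes;                  // Per-instance pchannel volumes
};

// Creates a new instance from an already loaded configuration.
// No file I/O happens here; only per-instance state is allocated.
inline PathInstance CreatePath(std::shared_ptr<const PathConfig> config)
{
    PathInstance path;
    path.volumes.assign(config->pchannels, 100);
    path.config = std::move(config);
    return path;
}
```

Moving one car or turning down one of its pchannels in this model leaves every other car untouched, while all of them still point at the single configuration that was loaded once.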
Predefined Paths

Because there are some really useful standard AudioPath designs that people usually need, DirectMusic comes with a handful of predefined AudioPaths that you can just ask for without first loading a configuration file. These include:

§ DMUS_APATH_DYNAMIC_3D: One bus that feeds into a mono 3D Buffer. Every instance has its own unique Buffer. Although you can play stereo waves and DLS instruments into this path, they will always be rendered in mono, since DirectSound 3D needs to place a mono sound in 3D space (so there is no way to play stereo sounds in 3D).

§ DMUS_APATH_DYNAMIC_MONO: One bus feeding into a mono Buffer without 3D. Again, each instance has a unique Buffer. Since there is only one bus out, MIDI pan control does nothing. But you can use the Buffer Pan command to pan the entire Buffer in the final mix.

§ DMUS_APATH_DYNAMIC_STEREO: Two buses to a stereo Buffer. The Buffer is not shared, so each instance is unique. MIDI pan commands work here.

§ DMUS_APATH_SHARED_STEREOPLUSREVERB: This is a standard music AudioPath. It has two Buffers, both of which are defined as shared, which means that no matter how many AudioPaths you create, they all share the same audio processing Buffers. The two Buffers are Dry Stereo, which receives the regular panned left and right signals, and Reverb, which holds a reverb DMO configured for music and receives the MIDI-controlled reverb send as a mono input, with stereo out to the mixer. (The reverb algorithm takes a mono signal and spreads it across the stereo spectrum, among other things.)
To create a predefined AudioPath, call the Performance's CreateStandardAudioPath() method and pass it the identifier for the predefined type that you'd like. Unlike AudioPath configurations, which have the number of pchannels built in, the standard types do not. One music AudioPath might need only 16 pchannels, while another might require 100. So, CreateStandardAudioPath() also requires a requested number of pchannels as a parameter.

IDirectMusicAudioPath *pIPath = NULL;
pPerformance->CreateStandardAudioPath(
    DMUS_APATH_DYNAMIC_3D,  // Make this a 3D path.
    12,                     // Allocate 12 pchannels.
    true,                   // Activate immediately.
    &pIPath);               // Returned AudioPath.
Working with AudioPaths

Once you've created an AudioPath, there's a lot you can do with it, from starting and stopping Segments to accessing and adjusting various objects in the AudioPath. Let's look at all of these. We can start with the one you probably won't need: downloading instruments and waves directly to the AudioPath.
Downloading and Unloading with AudioPaths

As we discussed in previous chapters, before a Segment can be played, all of its DLS instruments and waves must be downloaded to the synth. As long as there is only one synthesizer, this is a relatively simple proposition; simply download everything directly to the one synth, regardless of what AudioPath it will play on. Indeed, that is what we have been doing so far by downloading to the Performance.

AudioPaths introduce an extra, though optional, level of complexity. Along with the DMOs, Buffers, and Tools, the AudioPath configuration also defines what synth (or even multiple synths) to use. That's right, even synths are plug-in components in DirectMusic! This makes things a little more interesting because only the AudioPath knows which instruments on which pchannels go to which synths. So, any downloading and unloading of Segments must include the AudioPath to manage the routing of data to the correct synths. To do so, pass the IDirectMusicAudioPath interface to the Segment's Download() and Unload() commands:

// Download the Segment's instruments and waves
// to the synth(s) managed by the AudioPath.
pSegment->Download(pAudioPath);

// Later, when done, unload the instruments and waves,
// again via the AudioPath.
pSegment->Unload(pAudioPath);
Note that if there are multiple copies of the same AudioPath, you don't need to download to each one. Downloading to one will get the instruments and waves in place in the synth for use by all. It is a common misconception that every copy needs its own download. Also, it's actually very rare to be using more than one synth. So, in the 99 percent of cases where you are only using the standard Microsoft DLS2 synth, the easiest solution is to download to the Performance and not worry about the AudioPaths. In fact, that is exactly what Jones does.
Playing on an AudioPath

You can play Segments directly on an AudioPath. Or, you can make it the default AudioPath for the Performance, in which case all Segments will default to playing on it. To play a Segment directly on the AudioPath, pass it as a parameter to PlaySegmentEx():

pPerformance->PlaySegmentEx(
    pSegment,NULL,NULL,0,0,NULL,NULL,pAudioPath);

To stop a Segment playing on an AudioPath, you don't need to do anything special. Just pass the Segment or SegmentState to the Performance's StopEx() method, and it will stop it. However, there's a cool trick that you can play if you want to stop everything that is playing on a particular AudioPath. Try passing the IDirectMusicAudioPath interface to StopEx() instead of a Segment or SegmentState (IDirectMusicSegment or IDirectMusicSegmentState), and StopEx() will stop all Segments that are currently playing on just that one AudioPath. This is very useful, especially when playing complex sounds in 3D and you need to simply shut off all sounds from a particular source.

pPerformance->StopEx(pAudioPath,0,0);

Likewise, you can take advantage of PlaySegmentEx()'s ability to stop one or more Segments at the very moment the new Segment starts. If you pass the IDirectMusicAudioPath pointer in PlaySegmentEx()'s pFrom parameter, it will stop all Segments playing on the AudioPath at the very moment it starts the new Segment. For example, you might use this to shut off all engine and tire sounds for a race car when it hits a wall and an explosion Segment is played. In the following example, pCarAudioPath is the AudioPath used for all sounds coming from the car, and pExplosionSegment is the Segment with the explosion sound.

pPerformance->PlaySegmentEx(
    pExplosionSegment,NULL,NULL,0,0,NULL,pCarAudioPath,pCarAudioPath);

This tells DirectMusic to play the explosion sound on the car's AudioPath while shutting down all Segments currently playing on the same AudioPath.
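To make the bookkeeping behind these two tricks concrete, here is a small model of "stop everything on this path" and "stop, then play on this path." It is a sketch of the observable behavior only; PlaybackModel and its methods are hypothetical, paths are identified by plain integers, and the real Performance also handles timing, which this model ignores.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of one playing Segment instance.
struct PlayingSegment {
    std::string name;  // Which Segment is playing
    int pathId;        // Which AudioPath it is playing on
};

// Hypothetical model of the Performance's list of playing Segments.
class PlaybackModel {
public:
    void Play(const std::string &name, int pathId)
    {
        playing_.push_back(PlayingSegment{name, pathId});
    }

    // Models StopEx() called with an AudioPath: stop all Segments on it.
    void StopPath(int pathId)
    {
        playing_.erase(
            std::remove_if(playing_.begin(), playing_.end(),
                [pathId](const PlayingSegment &s) { return s.pathId == pathId; }),
            playing_.end());
    }

    // Models PlaySegmentEx() with the same AudioPath passed as both
    // pFrom and pAudioPath: clear the path, then start the new Segment.
    void PlayReplacing(const std::string &name, int pathId)
    {
        StopPath(pathId);
        Play(name, pathId);
    }

    std::size_t CountOnPath(int pathId) const
    {
        return static_cast<std::size_t>(
            std::count_if(playing_.begin(), playing_.end(),
                [pathId](const PlayingSegment &s) { return s.pathId == pathId; }));
    }

    bool IsPlaying(const std::string &name) const
    {
        return std::any_of(playing_.begin(), playing_.end(),
            [&name](const PlayingSegment &s) { return s.name == name; });
    }

private:
    std::vector<PlayingSegment> playing_;
};
```

In the race car scenario, playing "engine" and "tires" on the car's path and then calling PlayReplacing("explosion", carPath) leaves only the explosion on that path, while music on another path keeps playing.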
Embedded AudioPaths

Using DirectMusic Producer, it is possible to embed an AudioPath in a Segment. This is handy because it lets you attach everything necessary to play a Segment to the Segment itself. For example, if the Segment were designed to play with a particular reverb on some of the MIDI channels and a combination of compression and parametric EQ on some other channels, you could create an AudioPath configuration with the required reverb, compression, and EQ settings and place it within the Segment. Then, on playback, you'd get exactly the configuration of effects that was designed with the Segment in mind.

Once an AudioPath has been embedded in a Segment, there are two ways to use it. You can either request the AudioPath configuration directly by calling the Segment's GetAudioPathConfig() method or tell PlaySegmentEx() to automatically use the AudioPath. Here's how you use the configuration in the Segment:

// Get the AudioPath configuration from the Segment.
pSegment->GetAudioPathConfig(&pAudioPathConfig);
// Create an AudioPath from the configuration.
pPerformance->CreateAudioPath(pAudioPathConfig,true,&pAudioPath);
// Done with the configuration.
pAudioPathConfig->Release();
// Play the Segment on the AudioPath.
pPerformance->PlaySegmentEx(pSegment,NULL,NULL,0,0,0,0,pAudioPath);
// Release the AudioPath. It will go away when the Segment finishes.
pAudioPath->Release();

Or, just tell PlaySegmentEx() to use the embedded AudioPath:

pPerformance->PlaySegmentEx(pSegment,NULL,NULL,
    DMUS_SEGF_USE_AUDIOPATH,  // Use the embedded path, if it exists.
    0,NULL,NULL,
    NULL);                    // No explicit AudioPath.

At first glance, it would seem like the latter is always the preferable solution. Certainly, it is a lot more convenient. However, it is not as flexible and brings with it a performance cost. Creating and using an AudioPath, especially if it has its own dynamic Buffers and DMO effects, takes CPU cycles. So, if you intend to use the AudioPath for the playback of more than one Segment, it quickly becomes smarter to manage it directly.
Accessing Objects in an AudioPath

With all the wonderful things you can place in an AudioPath, it's important that you be able to access them. For example, if you create an AudioPath with a 3D Buffer, you will need to access the 3D interface on a regular basis to update the 3D position as the object moves in space. So, there's a special method, GetObjectInPath(), that you use to access the objects in an AudioPath, both before Segments play on it and once it is actively processing playing Segments.

Since there are so many different types of objects that can reside in an AudioPath and each has its own interface, GetObjectInPath() approaches this in a generic way, using a series of parameters to identify the instance and position of the object and an IID to request the desired interface.

HRESULT GetObjectInPath(
    DWORD dwPChannel,      // Pchannel to search.
    DWORD dwStage,         // Position in AudioPath.
    DWORD dwBuffer,        // Which buffer, if in DSound.
    REFGUID guidObject,    // Class ID of object.
    DWORD dwIndex,         // Nth item.
    REFGUID iidInterface,  // Requested interface IID.
    void ** ppObject       // Returned interface.
);

Gadzooks! That's a lot of parameters! Have no fear, it will all make sense, and you typically don't need most of these. Let's look at them in order:
§ DWORD dwPChannel: This is the Performance channel to search. dwPChannel is necessary because a Tool or DMO could be set to play only on specific channels. If that level of precision is not needed, DMUS_PCHANNEL_ALL will search all channels.

§ DWORD dwStage: The AudioPath is broken down into a series of "stages," each representing a sequential step in the AudioPath's route. These are represented by a series of hard-coded identifiers, starting with DMUS_PATH_AUDIOPATH_GRAPH, which represents the Tool Graph embedded in the AudioPath, all the way to DMUS_PATH_PRIMARY_BUFFER, which identifies the primary Buffer at the end of DirectSound. Frequently used stages include DMUS_PATH_AUDIOPATH_TOOL and DMUS_PATH_BUFFER_DMO.

§ DWORD dwBuffer: When accessing a DirectSound Buffer or DMO embedded within a Buffer, providing the stage and pchannel is not enough because there could be more than one Buffer in parallel. Therefore, dwBuffer identifies which Buffer to search, starting with 0 for the first Buffer and iterating up. Otherwise, this should be 0.

§ REFGUID guidObject: For DMOs, Tools, and synth ports, a class ID is needed to identify which type of object to look for. Optionally, GUID_All_Objects will search for objects of all classes.

§ DWORD dwIndex: It is entirely possible that more than one of a particular object type exists at the same stage in the AudioPath. For example, if there are two DMOs of the same type in a Buffer, use dwIndex to identify each. Alternately, when guidObject is set to GUID_All_Objects, dwIndex can be used to iterate through all objects within one particular stage.

§ REFGUID iidInterface: You must provide the IID of the interface that you are requesting. For example, to get the 3D interface on the Buffer, pass IID_IDirectSound3DBuffer.

§ void ** ppObject: This is the address of a variable that receives a pointer to the requested interface. The interface will be AddRef()'d by one, so be sure to remember to release it when done.
Here's an example of using GetObjectInPath() to update the 3D position of a 3D AudioPath:

void Set3DPosition(IDirectMusicAudioPath *pPath,
                   D3DVALUE x, D3DVALUE y, D3DVALUE z)
{
    IDirectSound3DBuffer *p3DBuffer = NULL;
    // Use GetObjectInPath to access the 3D interface.
    if (SUCCEEDED(pPath->GetObjectInPath(
        DMUS_PCHANNEL_ALL,         // Ignore pchannels.
        DMUS_PATH_BUFFER,          // Buffer stage.
        0,                         // First buffer.
        GUID_All_Objects,          // Ignore object type.
        0,                         // Ignore index.
        IID_IDirectSound3DBuffer,  // 3D buffer interface.
        (void **)&p3DBuffer)))
    {
        // Okay, we got it. Set the new coordinates.
        p3DBuffer->SetPosition(x,y,z,DS3D_IMMEDIATE);
        // Now release the interface.
        p3DBuffer->Release();
    }
}
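Because every interface that GetObjectInPath() hands back arrives already AddRef()'d, forgetting the matching Release() is an easy leak to introduce once a function has several exit points. One way to guard against that is a small scope-bound holder. The sketch below is a hypothetical helper written for this discussion, not part of DirectMusic; it works with any type that exposes a COM-style Release() method. Production code would more likely reach for an existing smart pointer such as ATL's CComPtr.

```cpp
// Hypothetical helper: owns one reference to a COM-style interface and
// releases it automatically when the holder goes out of scope.
template <typename T>
class ScopedInterface {
public:
    ScopedInterface() = default;
    ~ScopedInterface() { Reset(); }

    // Non-copyable: the holder owns exactly one reference.
    ScopedInterface(const ScopedInterface &) = delete;
    ScopedInterface &operator=(const ScopedInterface &) = delete;

    // Address to pass as an out-parameter (such as ppObject).
    // Any interface already held is released first.
    T **Receive()
    {
        Reset();
        return &ptr_;
    }

    T *operator->() const { return ptr_; }
    explicit operator bool() const { return ptr_ != nullptr; }

    // Releases the held interface, if any.
    void Reset()
    {
        if (ptr_) {
            ptr_->Release();
            ptr_ = nullptr;
        }
    }

private:
    T *ptr_ = nullptr;
};
```

With a holder like this, Set3DPosition() could declare a ScopedInterface<IDirectSound3DBuffer>, pass reinterpret_cast<void **>(holder.Receive()) as the last argument, and drop the explicit Release() call, since the destructor takes care of it on every path out of the function.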
Setting the AudioPath Volume
The AudioPath interface has a very useful method, SetVolume(), that changes the volume of everything playing on it. SetVolume() takes as parameters the requested volume setting in hundredths of a decibel (never a positive number, because it is really attenuation) and the amount of time, in milliseconds, to fade or rise to the requested level.
// Drop the volume to -6 dB and take 200 ms to do so.
pAudioPath->SetVolume(-600,200);
Note
SetVolume() can be very CPU expensive if you aren't careful. It works by sending Volume MIDI messages down to the synth for every pchannel in the AudioPath, and if you specify a duration other than 0 (immediate), it sends a series of these to create a smooth fade. If you have created an AudioPath with many more pchannels than you are actually using, you will pay for it with lots of wasted volume messages, which add up. Create only as many pchannels as you need, and if you don't need a fade (i.e., a sudden change in volume is acceptable, perhaps because nothing is playing), set the fade duration to 0.
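Game code often tracks volume as a linear gain between 0.0 and 1.0 rather than as attenuation. Here is a minimal sketch of the conversion into the hundredths-of-a-decibel units that the example above uses. GainToVolume() and the two constants are hypothetical names, and the -9600 to 0 range is an assumption about the limits SetVolume() accepts; check it against the SDK documentation before relying on it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed limits for the AudioPath volume, in hundredths of a decibel.
const long kMinVolume = -9600;
const long kMaxVolume = 0;

// Converts a linear gain (0.0 = silent, 1.0 = full volume) into an
// attenuation in hundredths of a decibel, clamped to the assumed range.
long GainToVolume(double gain) {
    assert(std::isfinite(gain));
    if (gain <= 0.0) return kMinVolume;
    // 20 * log10(gain) is decibels; multiply by 100 for hundredths.
    long volume = std::lround(2000.0 * std::log10(gain));
    return std::max(kMinVolume, std::min(kMaxVolume, volume));
}
```

A call such as pAudioPath->SetVolume(GainToVolume(0.5),0) would then request roughly -6 dB with no fade.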
Shutting Down an AudioPath
Running AudioPaths consumes memory and CPU resources, especially if they have DMOs and dynamic buffers. It's always a good idea to get rid of an AudioPath when you are done using it. To do this, simply call the AudioPath's Release() method.
// Bye bye...
pPath->Release();
Note
If an AudioPath is currently playing Segments, it will not go away when you call Release(). Instead, it will wait until the last of the Segments playing on it finishes.
In some cases, you might want to keep the AudioPath around for a short while between uses, but keep it deactivated so CPU is not wasted. You can do so by deactivating and then reactivating the AudioPath. Call the AudioPath's Activate() method, which takes a Boolean parameter to activate or deactivate.
// Don't need the path for a while...
pPath->Activate(false);
// Okay, fire it back up, we need it again...
pPath->Activate(true);
It's important to note that the operations of creating, releasing, activating, and deactivating an AudioPath are all glitch free. In other words, you can add and remove AudioPaths to a playing performance without fear of an audio hiccup.
Adding AudioPaths to Jones
Okay, let's have some fun and add AudioPath support to Jones. We can add some serious features that make it easy to create and explore AudioPaths. These include:
§ Creating AudioPaths, both from configuration files and from the predefined standard AudioPaths
§ Retrieving AudioPath statistics for display
§ Scanning and then displaying for editing many of the objects (such as Tools, DMOs, and 3D) within the AudioPath
§ Playing Segments on AudioPaths
Figure 11-4 shows what Jones with AudioPath support looks like. Notice how Jones has been rearranged to make room for the new AudioPaths panel to the right of the Segments panel.
Figure 11-4: Jones with AudioPaths.
Let's walk through this and learn a bit about the code.
Using AudioPaths in Jones Start by creating a new AudioPath. First, select which type of AudioPath you'd like to create from the drop-down menu below AudioPaths.
The choices are:
§
Configuration: Creates an AudioPath from an AudioPath configuration file. Clicking on Create opens a file dialog, from which you can choose the configuration file (extension .aud).
§
Music: Creates a predefined music AudioPath. This path has two buffers: Dry Stereo and Reverb. Note that multiple instances all share the same two Buffers. All other predefined types have dynamic buffers, where each instance gets a new buffer.
§
3D: Creates a predefined 3D AudioPath. This path has one mono buffer with DirectSound 3D capabilities.
§
Mono: Creates a predefined Mono AudioPath. This path has one mono buffer.
§
Stereo: Creates a predefined Stereo AudioPath. This path has one stereo buffer.
Make your choice from the menu, and then click on Create. The new AudioPath appears in the list just under the Create button. This list shows all created AudioPaths. If you'd like, experiment by creating a few paths of different types. You might try the AudioPath configuration file BigPath.aud, which comes with many Buffers and DMOs. It is designed with the Segment AudioPath.sgt in mind. Click on an AudioPath in the list. This selects it for display as well as playback. Load a Segment and play it via the AudioPath simply by selecting both in their respective lists. Experiment by playing the same Segment in different AudioPaths. Be sure to try AudioPath.sgt. The box to the right displays the name of the current path at the top along with some statistics about it. The statistics itemize the PChannels, Tools, Buffers, and DMOs that are in the AudioPath.
Below the statistics are lists of components that you can access and edit within the AudioPath. These include the AudioPath volume as well as any Tools, DMOs, and Buffers that exist in the path. Double-click on any item in this list. Most items will open a property page that you can then edit. For example, double-click on Volume to edit the volume in the Volume dialog.
If you have a Segment already playing on the AudioPath, you can adjust any of these parameters in real time and hear the change to the music immediately. For example, create a Music path. Select it and start music playing on it. Go to the list and double-click on the Waves Reverb item. This is the reverb DMO used for music playback.
While the music plays, you can adjust the reverb parameters and hear the result immediately. Press the Apply button to activate the changes.
Jones AudioPath Structures We add a new class, CAudioPath, to Jones. Each manages a running instance of an AudioPath. We won't bother with wrapping the AudioPath configuration. For that, we'll load one every time we need it to create an AudioPath. Since the Loader caches file objects, this isn't as expensive as it sounds. CAudioPath keeps track of one running instance of an AudioPath via an IDirectMusicAudioPath interface pointer. This can be used to choose an AudioPath for playback and control its parameters in real time. CAudioPath also provides a series of methods for directly getting and setting useful parameters, such as Buffer frequency and 3D position. Two methods, GetObjectInfo() and GetObject(), were designed with Jones in mind. They provide a way to iterate through all objects and then access them individually for display. This is not something you would do in a regular application, but it's great for spelunking an AudioPath. CAudioPath inherits from the CMyNode base class so that an unlimited list of CAudioPath objects can be easily managed. class CAudioPath : public CMyNode { public: // Constructor. CAudioPath(IDirectMusicAudioPath *pAudioPath,WCHAR *pzwName); ~CAudioPath(); // We keep a linked list of CAudioPaths. CAudioPath *GetNext() { return (CAudioPath *) CMyNode::GetNext(); };
// Access methods. IDirectMusicAudioPath *GetAudioPath() { return m_pAudioPath; }; char *GetName() { return m_szName; }; // Methods to report stats on the composition of the AudioPath. DWORD GetPChannelCount(); DWORD GetToolCount(); DWORD GetBufferCount(bool fMixin); DWORD GetDMOCount(); // Methods to scan and access objects in the path. bool GetObjectInfo(DWORD dwIndex, char *pszName, AudioPathItemInfo *pItem); bool GetObject(AudioPathItemInfo *pItem,REFGUID iidInterface,void **ppObject); // Methods to get and set useful parameters in the AudioPath. long GetVolume() { return m_lVolume; }; void SetVolume(long lVolume); bool GetBufferParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, long *plVolume, long *plPan, DWORD *pdwFrequency); bool SetBufferParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, long lVolume, long lPan, DWORD dwFrequency); bool Get3DParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, DS3DBUFFER *p3DParams); bool Set3DParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, DS3DBUFFER *p3DParams); private: bool ClassIDToName(REFGUID rguidClassID,char *pszName); char m_szName[20];
// Name, for display.
IDirectMusicAudioPath * m_pAudioPath; // The DirectMusic AudioPath object.
long m_lVolume; // Keep track of current volume.
}; CAudioPathList is based on CMyList and manages a linked list of CAudioPaths. class CAudioPathList : public CMyList { public: // Overrides for CMyList methods. CAudioPath *GetHead() { return (CAudioPath *) CMyList::GetHead(); };
    CAudioPath *RemoveHead() { return (CAudioPath *) CMyList::RemoveHead(); };
    // Clear list and release all references.
    void Clear();
};
CAudio introduces a new field to keep track of the list of AudioPaths.
CAudioPathList m_AudioPathList; // List of created AudioPaths.
It adds two methods for creating AudioPaths. CAudioPath *CreateAudioPathFromConfig(WCHAR *pwzFileName); CAudioPath *CreateStandardAudioPath(DWORD dwType, DWORD dwChannels); It also adds a CAudioPath parameter to CAudio::PlaySegment(), so you can now play a Segment on a specific AudioPath. void PlaySegment(CSegment *pSegment,CAudioPath *pPath); // Play a Segment. As before, we look only at the code implemented in the audio classes. We do not pay attention to the MFC Windows code. You should be able to take these classes and use them equally effectively with any UI framework, including WTL or direct Windows programming.
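The CMyNode and CMyList pattern that CAudioPath and CAudioPathList build on can be sketched as follows. This is only an illustration of the idea, an intrusive linked list with typed wrappers that cast on the way out; Node, List, Path, and PathList are hypothetical stand-ins, not the actual Jones classes, and the real base classes may differ in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for CMyNode: each node carries its own link.
class Node {
public:
    Node() : m_pNext(nullptr) {}
    virtual ~Node() {}
    Node *GetNext() const { return m_pNext; }
    void SetNext(Node *pNext) { m_pNext = pNext; }
private:
    Node *m_pNext;
};

// Hypothetical stand-in for CMyList: a singly linked list of Nodes.
class List {
public:
    List() : m_pHead(nullptr) {}
    Node *GetHead() const { return m_pHead; }
    void AddTail(Node *pNode) {
        assert(pNode != nullptr);
        pNode->SetNext(nullptr);
        if (!m_pHead) { m_pHead = pNode; return; }
        Node *pScan = m_pHead;
        while (pScan->GetNext()) pScan = pScan->GetNext();
        pScan->SetNext(pNode);
    }
    Node *RemoveHead() {
        Node *pNode = m_pHead;
        if (pNode) { m_pHead = pNode->GetNext(); pNode->SetNext(nullptr); }
        return pNode;
    }
private:
    Node *m_pHead;
};

// Typed wrappers, in the style of CAudioPath and CAudioPathList.
class Path : public Node {
public:
    explicit Path(int id) : m_id(id) {}
    Path *GetNext() const { return static_cast<Path *>(Node::GetNext()); }
    int GetId() const { return m_id; }
private:
    int m_id;
};

class PathList : public List {
public:
    Path *GetHead() const { return static_cast<Path *>(List::GetHead()); }
    Path *RemoveHead() { return static_cast<Path *>(List::RemoveHead()); }
};
```

The casts are safe only as long as nothing but Path objects is ever added to a PathList, which is the same convention the typed overrides in CAudioPathList rely on.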
Creating an AudioPath CAudio::CreateAudioPathFromConfig() loads the AudioPath configuration from the passed file path and uses it to create an AudioPath. First, it loads the configuration from the file. Then, it retrieves the name of the configuration by QI'ing for the IDirectMusicObject interface and using that to get the object descriptor. This normally wouldn't be required in an application, but we want to display the name of the AudioPath in Jones. Next, CreateAudioPathFromConfig() calls IDirectMusicPerformance8::CreateAudioPath() to create the AudioPath and, assuming the call succeeds, creates a CAudioPath class to wrap the AudioPath. It places the new CAudioPath in its list of AudioPaths, m_AudioPathList. Note that CreateAudioPathFromConfig() doesn't keep the AudioPath configuration around. Doing so would be redundant, since the Loader keeps a copy in its own cache. The next time we try to create an AudioPath from this configuration, the Loader avoids a file load, so we get the efficiency we need. CAudioPath *CAudio::CreateAudioPathFromConfig(WCHAR *pwzFileName) { CAudioPath *pPath = NULL; IUnknown *pIConfig; // First, load the configuration. if (SUCCEEDED(m_pLoader->LoadObjectFromFile(
    CLSID_DirectMusicAudioPathConfig, // AudioPath config class ID.
    IID_IUnknown,                     // No special interface.
    pwzFileName,                      // File path.
    (void **) &pIConfig)))            // Config returned in pIConfig.
{ // Get the config's name by retrieving the object // descriptor from the configuration. DMUS_OBJECTDESC Desc; Desc.dwSize = sizeof(Desc); wcscpy(Desc.wszName,L"No Name"); IDirectMusicObject *pIObject = NULL; pIConfig->QueryInterface(IID_IDirectMusicObject,(void **)&pIObject); if (pIObject) { pIObject->GetDescriptor(&Desc); pIObject->Release(); } // Now, use this to create a live AudioPath. IDirectMusicAudioPath *pIPath = NULL; m_pPerformance->CreateAudioPath(pIConfig,true,&pIPath); if (pIPath) { // Create a CAudioPath object to manage it. pPath = new CAudioPath(pIPath,Desc.wszName); if (pPath) { m_AudioPathList.AddTail(pPath); } pIPath->Release(); } pIConfig->Release(); } return pPath; }
The constructor for CAudioPath simply stores the AudioPath and its name. The constructor keeps the name in ASCII format because that's all we need for the UI.
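CAudioPath::CAudioPath(IDirectMusicAudioPath *pAudioPath,WCHAR *pzwName)
{
    m_pAudioPath = pAudioPath;
    pAudioPath->AddRef();
    // Leave room for the terminator; wcstombs() omits it when the
    // name fills the buffer.
    wcstombs(m_szName,pzwName,sizeof(m_szName) - 1);
    m_szName[sizeof(m_szName) - 1] = 0;
    m_lVolume = 0;
}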
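Descriptor names can be longer than the 20-byte display buffer, and wcstombs() can also fail outright on a character that the current locale cannot represent. A minimal sketch of a helper that copes with both cases might look like this. CopyNameToNarrow() is a hypothetical name, not something Jones defines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Copies a wide string into a fixed-size narrow buffer and always
// terminates the result. Returns false if the conversion failed.
bool CopyNameToNarrow(char *pszDest, std::size_t cbDest, const wchar_t *pwzSrc) {
    assert(pszDest != nullptr && cbDest > 0 && pwzSrc != nullptr);
    // Convert at most cbDest - 1 bytes so there is room for the terminator.
    std::size_t cbWritten = std::wcstombs(pszDest, pwzSrc, cbDest - 1);
    if (cbWritten == static_cast<std::size_t>(-1)) {
        // A character could not be converted in the current locale.
        pszDest[0] = '\0';
        return false;
    }
    // Harmless when the name was short; essential when it was truncated.
    pszDest[cbDest - 1] = '\0';
    return true;
}
```

A long name is simply cut off at the buffer size, which is acceptable here because the string is only used for display.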
Getting AudioPath Stats Now that we have an AudioPath loaded, let's see what we can learn from it. Unfortunately, there are no straightforward DirectMusic methods for reading an AudioPath's capabilities because the assumption is that every AudioPath is designed specifically for the application that uses it, so there would be no reason to interrogate it. However, for an application like Jones, where we can load any arbitrary AudioPath and look at it, we need such a method. No problem. Ve haff meanss for makink ze AudioPath talk!
PChannel Count
First, we create a method (CAudioPath::GetPChannelCount()) that figures out how many pchannels belong to the AudioPath in a somewhat hokey way. IDirectMusicAudioPath has a method, ConvertPChannel(), that is used to convert any pchannel from the AudioPath virtual pchannel space to the absolute pchannel value in the Performance. We call ConvertPChannel() with virtual pchannel values starting at 0 and counting up until it fails. This assumes that the pchannels are 0 based. I've never run into a situation where they aren't, so it's relatively safe, and since this is only used for display, it's not the end of the world if we miss some esoteric non-zero-based pchannel assignments.
DWORD CAudioPath::GetPChannelCount()
{
    DWORD dwCount = 0;
    if (m_pAudioPath)
    {
        HRESULT hr = S_OK;
        for (;hr == S_OK;dwCount++)
        {
            DWORD dwResult;
            // Normally, we'd use this to convert from the AudioPath's
            // pchannel to its equivalent value in the Performance.
            hr = m_pAudioPath->ConvertPChannel(dwCount,&dwResult);
        }
        // The loop counted the failed call too, so back up by one.
        dwCount--;
    }
    return dwCount;
}
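The same "keep asking until it fails" idea drives all of the counting methods in this section, so it can be factored into one small routine. The sketch below is an illustration rather than Jones code: CountByProbing() is a hypothetical helper, the probe stands in for a call such as ConvertPChannel() or GetObjectInPath(), and the limit parameter is an added safety net in case a probe never fails.

```cpp
#include <cassert>
#include <functional>

// Counts how many consecutive indices, starting at 0, a probe accepts.
// The probe returns true while the index is valid and false once the
// index runs past the end.
unsigned long CountByProbing(const std::function<bool(unsigned long)> &probe,
                             unsigned long limit = 10000) {
    assert(static_cast<bool>(probe));
    unsigned long count = 0;
    while (count < limit && probe(count)) {
        ++count;
    }
    return count;
}
```

Written this way, the count is only incremented after a successful probe, so no correction is needed once the loop ends.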
Tool Count
To find out how many Tools are embedded in the AudioPath, CAudioPath::GetToolCount() calls GetObjectInPath(), iterating through Tools in the DMUS_PATH_AUDIOPATH_TOOL stage until it fails. This works because the generic class ID, GUID_All_Objects, is used instead of a specific Tool class ID.
DWORD CAudioPath::GetToolCount()
{
    DWORD dwCount = 0;
    for (dwCount = 0; ;dwCount++)
    {
        IUnknown *pUnknown;
        if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
            DMUS_PCHANNEL_ALL,           // Search all pchannels.
            DMUS_PATH_AUDIOPATH_TOOL,    // Look in the Tool stage.
            0,                           // No buffer!
            GUID_All_Objects,            // All Tools.
            dwCount,                     // Nth Tool.
            IID_IUnknown,                // Generic interface.
            (void **)&pUnknown)))
        {
            // We found another Tool, so continue.
            pUnknown->Release();
        }
        else
        {
            // No more Tools. Quit.
            break;
        }
    }
    return dwCount;
}
Buffer Count
To figure out how many DirectSound buffers are included in the AudioPath, we use a similar technique. CAudioPath::GetBufferCount() uses GetObjectInPath() to scan for Buffers. However, it is complicated by the fact that there are two sets of Buffers: Sink-in, which take their input from the synth, and Mix-in, which take their input from other Buffers. To handle this and avoid redundant code, GetBufferCount() receives one parameter, fMixin, which determines whether to look in the DMUS_PATH_MIXIN_BUFFER or DMUS_PATH_BUFFER stage. It iterates through the Buffers by calling GetObjectInPath() until it fails.
DWORD CAudioPath::GetBufferCount(bool fMixin)
{
    DWORD dwPChannel;
    DWORD dwStage;
    // If we are searching Mix-in Buffers...
    if (fMixin)
    {
        // Pchannel must be 0.
        dwPChannel = 0;
        // Set stage to Mix-in Buffers.
        dwStage = DMUS_PATH_MIXIN_BUFFER;
    }
    else
    {
        // Pchannel must be all channels.
        dwPChannel = DMUS_PCHANNEL_ALL;
        // Set stage to Sink-in Buffers.
        dwStage = DMUS_PATH_BUFFER;
    }
    DWORD dwCount = 0;
    for (dwCount = 0; ;dwCount++)
    {
        IUnknown *pUnknown;
        if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
            dwPChannel,          // Pchannel chosen above for this Buffer type.
            dwStage,             // Searching the appropriate stage.
            dwCount,             // Nth buffer.
            GUID_All_Objects,    // Ignore class ID (can only get Buffers).
            0,                   // No index.
            IID_IUnknown,        // Generic interface.
            (void **)&pUnknown)))
        {
            // Hah! We found another Buffer!
            pUnknown->Release();
        }
        else
        {
            // No more Buffers. Quit.
            break;
        }
    }
    return dwCount;
}
DMO Count
CAudioPath::GetDMOCount() figures out how many DMOs exist in the AudioPath. It does so by using GetObjectInPath() to scan for DMOs in each of the Sink-in and Mix-in Buffers.
DWORD CAudioPath::GetDMOCount()
{
    DWORD dwCount = 0;
    DWORD dwBufferCount = GetBufferCount(false);
    // There will be two passes. First the Sink-in Buffers (which
    // read from the synth). Then, the Mix-in Buffers, which receive
    // from other Buffers.
    DWORD dwDMOStage[2] = { DMUS_PATH_BUFFER_DMO, DMUS_PATH_MIXIN_BUFFER_DMO };
    DWORD dwPChannel[2] = { DMUS_PCHANNEL_ALL, 0 };
    for (DWORD dwBufferType = 0; dwBufferType < 2; dwBufferType++)
    {
        DWORD dwBuffer;
        for (dwBuffer = 0; dwBuffer < dwBufferCount;dwBuffer++)
        {
            // Now, for each Buffer, iterate through the DMOs.
            for (DWORD dwDMO = 0; ;dwDMO++)
            {
                IUnknown *pUnknown;
                if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
                    dwPChannel[dwBufferType],    // Search all pchannels.
                    dwDMOStage[dwBufferType],    // Buffer DMO stage.
                    dwBuffer,                    // Which Buffer to search.
                    GUID_All_Objects,            // Search for all object types.
                    dwDMO,                       // Index of DMO.
                    IID_IUnknown,                // Look for base interface.
                    (void **)&pUnknown)))
                {
                    // DMO was found! Increment.
                    dwCount++;
                    pUnknown->Release();
                }
                else
                {
                    // No DMO, move on to the next Buffer.
                    break;
                }
            }
        }
        // Prepare Buffer count for second pass.
        dwBufferCount = GetBufferCount(true);
    }
    return dwCount;
}
Scanning an AudioPath
Jones needs a way to iterate through all of the objects that were authored into an AudioPath so they can be displayed and double-clicked for editing. These include any DMOs and Tools, as well as the Buffer parameters, including 3D. Normally, this is not particularly useful because an application should know which objects it needs to access and request them directly. Following that philosophy, DirectMusic's AudioPath API was not designed with this sort of inspection in mind. In other words, there is no direct API that you can call and query for the nth object in the AudioPath. You can sort of do it with GetObjectInPath(), but you must iterate through stages, buffers, and then objects within the buffers.
CAudioPath::GetObjectInfo() does just that, but it gets all the extra work out of the way for you. It iterates through all objects within an AudioPath for which we can subsequently throw up a UI of some sort. It uses a data structure, called AudioPathItemInfo, to return all of the parameters (stage, buffer, index) that were needed to access the nth object in the AudioPath, and it stores an identifier for the type of object (Tool, DMO, Buffer, or 3D) in AudioPathItemInfo. A subsequent call to GetObject() with the AudioPathItemInfo structure can quickly access the specified object.
GetObjectInfo() also returns a name for the object. Unfortunately, none of the objects have a way to return a friendly name. So, GetObjectInfo() improvises as best it can. For DMOs, it looks at the class ID to see if it is one of the familiar DMOs that ships with DirectX and, if so, sticks in the name. Otherwise, it just gives generic names, like "DMO 3," etc.
Here's the code. First, we define the types of objects that we are looking for. These really reflect UI operations that we know we can do in Jones with these particular items.
typedef enum _AP_ITEM_TYPE
{
    AUDIOPATH_VOLUME = 1,   // Set the AudioPath volume.
    AUDIOPATH_TOOL = 2,     // Open a Tool's property page.
    AUDIOPATH_DMO = 3,      // Open a DMO's property page.
    AUDIOPATH_BUFFER = 4,   // Set Buffer parameters.
    AUDIOPATH_3D = 5        // Set 3D parameters.
} AP_ITEM_TYPE;
The AudioPathItemInfo structure records everything needed to find the nth item in the AudioPath:
typedef struct _AudioPathItemInfo
{
    AP_ITEM_TYPE ItemType;    // Which type of object.
    DWORD        dwStage;     // Which stage to look in.
    DWORD        dwBuffer;    // Which Buffer, if applicable.
    DWORD        dwIndex;     // Index into nth object in stage.
    DWORD        dwPChannel;  // Pchannel, if applicable.
} AudioPathItemInfo;
Finally, here's the big kahuna, GetObjectInfo(). This scans for the nth item (within the set of things it is looking for) in the AudioPath. To do so, it starts at the top and keeps decrementing the passed iterator, dwIndex, until it hits zero, at which point it returns the current item.
bool CAudioPath::GetObjectInfo(
    DWORD dwIndex,               // Nth item to look for.
    char *pszName,               // Returned name.
    AudioPathItemInfo *pItem)    // Returned parameters needed to retrieve item.
{
    // Initialize the AudioPathItemInfo.
    pItem->dwBuffer = 0;
    pItem->dwIndex = 0;
    pItem->dwStage = 0;
    // If this is the very start, return the volume parameter.
    if (dwIndex == 0)
    {
        strcpy(pszName,"Volume");
        pItem->ItemType = AUDIOPATH_VOLUME;
        return true;
    }
    dwIndex--;
    // Okay, now see if this is a tool.
    DWORD dwToolCount = GetToolCount();
    if (dwIndex < dwToolCount)
    {
        // Since Tools don't have accessible names, just
        // identify which Tool it is.
        sprintf(pszName,"Tool %d",dwIndex+1);
        pItem->ItemType = AUDIOPATH_TOOL;
        pItem->dwIndex = dwIndex;
        pItem->dwStage = DMUS_PATH_AUDIOPATH_TOOL;
        return true;
    }
    dwIndex -= dwToolCount;
    // Now, we look at all the items in the Buffers.
    // There will be two passes. First the Sink-in Buffers (which
    // read from the synth). Then, the Mix-in Buffers, which receive
    // from other Buffers.
    // Set up the variables that change for the two passes.
    // There are two DMO stages in the AudioPath, one for each Buffer type.
    DWORD dwDMOStage[2] = { DMUS_PATH_BUFFER_DMO, DMUS_PATH_MIXIN_BUFFER_DMO };
    // And, there are two Buffer stages.
    DWORD dwBufferStage[2] = { DMUS_PATH_BUFFER, DMUS_PATH_MIXIN_BUFFER };
    // Pchannels are handled differently for Sink-in vs Mix-in Buffers.
    DWORD dwPChannel[2] = { DMUS_PCHANNEL_ALL, 0 };
    // Variable to track the two passes.
    DWORD dwBufferType;
    // Okay, let's do it.
    for (dwBufferType = 0; dwBufferType < 2; dwBufferType++)
    {
        // How many Buffers of this type?
        DWORD dwBufferCount = GetBufferCount(dwBufferType == 1);
        // Get the pchannel to use for this Buffer type.
        pItem->dwPChannel = dwPChannel[dwBufferType];
        // For each Buffer (there can easily be more than one...)
        pItem->dwBuffer = 0;
        for (;pItem->dwBuffer < dwBufferCount;pItem->dwBuffer++)
        {
            // First, check if we are at the Buffer itself. If so,
            // just return success.
            if (dwIndex == 0)
            {
                // Buffers don't have names, so just use a counter.
                if (dwBufferType)
                {
                    sprintf(pszName,"MixBuffer %d",pItem->dwBuffer);
                }
                else
                {
                    sprintf(pszName,"SinkBuffer %d",pItem->dwBuffer);
                }
                pItem->dwStage = dwBufferStage[dwBufferType];
                pItem->ItemType = AUDIOPATH_BUFFER;
                return true;
            }
            dwIndex--;
            IUnknown *pUnknown;
            // Okay, it's not the Buffer. Now, iterate through the DMOs.
            for (pItem->dwIndex = 0;;pItem->dwIndex++)
            {
                if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
                    pItem->dwPChannel,           // Pchannels.
                    dwDMOStage[dwBufferType],    // Which Buffer DMO stage.
                    pItem->dwBuffer,             // Which Buffer are we looking at?
                    GUID_All_Objects,            // Scan all DMOs.
                    pItem->dwIndex,              // Look for the Nth DMO.
                    IID_IUnknown,                // Just the base interface.
                    (void **)&pUnknown)))
                {
                    // If index is 0, we've found our DMO.
                    if (dwIndex == 0)
                    {
                        // We'll use IPersist to get the class ID.
                        IPersist *pPersist;
                        // Put in a default name.
                        sprintf(pszName," DMO %d",pItem->dwIndex+1);
                        // See if it has an IPersist and, if so, get the class ID.
                        if (SUCCEEDED(pUnknown->QueryInterface(IID_IPersist,
                            (void **) &pPersist)))
                        {
                            CLSID clsid;
                            pPersist->GetClassID(&clsid);
                            // With the class ID, we might recognize one
                            // of our standard DMOs.
                            ClassIDToName(clsid,pszName);
                            pPersist->Release();
                        }
                        // Fill in the rest.
                        pItem->dwStage = dwDMOStage[dwBufferType];
                        pItem->ItemType = AUDIOPATH_DMO;
                        pUnknown->Release();
                        return true;
                    }
                    pUnknown->Release();
                    dwIndex--;
                }
                else
                {
                    // Ran out of DMOs, so go to next Buffer.
                    break;
                }
            }
            pItem->dwIndex = 0;
            // Check to see if this Buffer supports 3D.
            IDirectSound3DBuffer *p3DBuffer;
            if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
                pItem->dwPChannel,               // Pchannel.
                dwBufferStage[dwBufferType],     // Which Buffer stage.
                pItem->dwBuffer,                 // Which Buffer.
                GUID_All_Objects,0,              // Don't care about object class.
                IID_IDirectSound3DBuffer,        // Does it have the 3D interface?
                (void **)&p3DBuffer)))
            {
                // If there's a 3D interface, this must be a 3D Buffer.
                p3DBuffer->Release();
                if (dwIndex == 0)
                {
                    // Ooh, goody, we got it.
                    strcpy(pszName," 3D");
                    pItem->dwStage = dwBufferStage[dwBufferType];
                    pItem->ItemType = AUDIOPATH_3D;
                    return true;
                }
                dwIndex--;
            }
        }
    }
    return false;
}
To use GetObjectInfo(), the application calls it in a for loop, incrementing the iterator dwIndex and retrieving the name and AudioPathItemInfo for each item found until GetObjectInfo() eventually runs out and returns false. That's exactly what Jones does by building a list of all the items, which it displays in the list on the right. Then, when the user double-clicks on an item in the list, Jones has what it needs to retrieve the item for editing.
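The index-decrementing walk is easier to follow when it is stripped of the COM calls. The following is a minimal sketch of the same technique over a made-up description of an AudioPath. FakeBuffer, FakePath, GetItemName(), and ListItems() are hypothetical names used only for illustration, and the sketch collapses the Sink-in and Mix-in passes into a single list of Buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one Buffer: how many DMOs, and whether 3D.
struct FakeBuffer {
    unsigned dmoCount;
    bool has3D;
};

// Hypothetical description of an AudioPath's contents.
struct FakePath {
    unsigned toolCount;
    std::vector<FakeBuffer> buffers;
};

// Returns the name of the nth editable item, or false when n is past the end.
// Order: Volume, Tools, then for each Buffer: the Buffer, its DMOs, its 3D.
bool GetItemName(const FakePath &path, unsigned n, std::string *pName) {
    assert(pName != nullptr);
    if (n == 0) { *pName = "Volume"; return true; }
    n--;
    if (n < path.toolCount) {
        *pName = "Tool " + std::to_string(n + 1);
        return true;
    }
    n -= path.toolCount;
    for (std::size_t b = 0; b < path.buffers.size(); b++) {
        const FakeBuffer &buffer = path.buffers[b];
        if (n == 0) { *pName = "Buffer " + std::to_string(b); return true; }
        n--;
        if (n < buffer.dmoCount) {
            *pName = "  DMO " + std::to_string(n + 1);
            return true;
        }
        n -= buffer.dmoCount;
        if (buffer.has3D) {
            if (n == 0) { *pName = "  3D"; return true; }
            n--;
        }
    }
    return false;
}

// Builds the full display list the way the calling loop would.
std::vector<std::string> ListItems(const FakePath &path) {
    std::vector<std::string> names;
    std::string name;
    for (unsigned n = 0; GetItemName(path, n, &name); n++) {
        names.push_back(name);
    }
    return names;
}
```

Each call restarts the walk from the top, so building the whole list costs more than a single pass would. For a display list that is built once per AudioPath, that trade-off keeps the interface simple.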
Editing AudioPath Object Parameters Now that we've got this nice list with all the items and their names, retrieving an object is straightforward because we have everything we need to call the AudioPath's GetObjectInPath() method.
Accessing with AudioPathItemInfo
To directly access any generic object using the AudioPathItemInfo structure, we have a convenience function, GetObject(), which is very simple. GetObject() uses the AudioPathItemInfo structure that was filled in by GetObjectInfo(). The caller also provides the ID of the interface that it is expecting. Together, these are used to call GetObjectInPath() to retrieve the desired interface.
bool CAudioPath::GetObject(AudioPathItemInfo *pItem,
                           REFGUID iidInterface, void **ppObject)
{
    return (SUCCEEDED(m_pAudioPath->GetObjectInPath(
        pItem->dwPChannel,   // Use the pchannel that GetObjectInfo returned.
        pItem->dwStage,      // And, the stage that GetObjectInfo returned.
        pItem->dwBuffer,     // Likewise, use the Buffer.
        GUID_All_Objects,    // Get any object type.
        pItem->dwIndex,      // Get the Nth item.
        iidInterface,        // Caller requested an interface.
        ppObject)));         // Returned interface.
}
Accessing Buffer Parameters
CAudioPath has two methods (GetBufferParams() and SetBufferParams()) for reading and writing the Buffer volume, pan, and frequency. When you double-click on a Buffer in the list, Jones opens a dialog to edit these three parameters and uses these two methods to retrieve and set the data. Note that these two methods do not use the AudioPathItemInfo structure. GetBufferParams() and SetBufferParams() can be used much more broadly, since typically the application already knows specifically what Buffer it needs to access. Therefore, these routines are great examples of using GetObjectInPath() to access and manipulate a specific object in an AudioPath.
/* CAudioPath::GetBufferParams
   GetBufferParams is a convenience function that retrieves the current
   volume, pan, and frequency from the Buffer. The Buffer is defined by
   Buffer index, pchannel, and stage, since this could be a Sink-in or
   Mix-in Buffer. */
bool CAudioPath::GetBufferParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel,
                                 long *plVolume, long *plPan, DWORD *pdwFrequency)
{
    bool fSuccess = false;
    // We'll be retrieving an IDirectSoundBuffer interface.
IDirectSoundBuffer *pBuffer = NULL; // Use GetObjectInPath to retrieve the Buffer. if (SUCCEEDED(m_pAudioPath->GetObjectInPath( dwPChannel,
// The pchannel.
dwStage,
// Mix-in or Sink-in stage.
dwBuffer,
// Which Buffer.
GUID_All_Objects,0, // No need for class ID. IID_IDirectSoundBuffer, (void **)&pBuffer))) { // We got it. Go ahead and read the three parameters. pBuffer->GetFrequency(pdwFrequency); pBuffer->GetPan(plPan); pBuffer->GetVolume(plVolume); pBuffer->Release(); fSuccess = true; } return fSuccess; } /* CAudioPath::SetBufferParams SetBufferParams is a convenience function that sets the volume, pan, and frequency for the Buffer. The Buffer is defined by buffer index, pchannel, and stage, since this could be a Sink-in or Mix-in Buffer. */ bool CAudioPath::SetBufferParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, long lVolume, long lPan, DWORD dwFrequency) { bool fSuccess = false; // We'll be using an IDirectSoundBuffer interface. IDirectSoundBuffer *pBuffer = NULL; // Use GetObjectInPath to retrieve the Buffer.
if (SUCCEEDED(m_pAudioPath->GetObjectInPath( dwPChannel,
// The pchannel.
dwStage,
// Mix-in or Sink-in stage.
dwBuffer,
// Which Buffer.
GUID_All_Objects,0, // No need for class ID. IID_IDirectSoundBuffer, (void **)&pBuffer))) { // We got it. Go ahead and set the three parameters. pBuffer->SetFrequency(dwFrequency); pBuffer->SetPan(lPan); pBuffer->SetVolume(lVolume); pBuffer->Release(); fSuccess = true; } return fSuccess; }
Accessing 3D Parameters CAudioPath has two methods (Get3DParams() and Set3DParams()) for reading and writing the 3D position of a 3D DirectSound Buffer. Use Set3DParams() to continuously update the position of a 3D object that is being tracked in space by a 3D AudioPath. /* CAudioPath::Get3DParams Get3DParams is a convenience function that retrieves the current 3D parameters from a 3D Buffer. */ bool CAudioPath::Get3DParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, DS3DBUFFER *p3DParams) { bool fSuccess = false; // We'll be using an IDirectSound3DBuffer interface. IDirectSound3DBuffer *p3DBuffer = NULL; // Use GetObjectInPath to retrieve the Buffer. if (SUCCEEDED(m_pAudioPath->GetObjectInPath( dwPChannel, dwStage,
dwBuffer, GUID_All_Objects,0, IID_IDirectSound3DBuffer, (void **)&p3DBuffer))) { // Okay, we got the Buffer. Read it. fSuccess = true; p3DBuffer->GetAllParameters(p3DParams); p3DBuffer->Release(); } return fSuccess; } /* CAudioPath::Set3DParams Set3DParams is a convenience function that writes 3D parameters to a 3D Buffer. */ bool CAudioPath::Set3DParams(DWORD dwBuffer, DWORD dwStage, DWORD dwPChannel, DS3DBUFFER *p3DParams) { bool fSuccess = false; // We'll be using an IDirectSound3DBuffer interface. IDirectSound3DBuffer *p3DBuffer = NULL; // Use GetObjectInPath to retrieve the Buffer. if (SUCCEEDED(m_pAudioPath->GetObjectInPath( dwPChannel, dwStage, dwBuffer, GUID_All_Objects,0, IID_IDirectSound3DBuffer, (void **)&p3DBuffer))) { // Okay, we got the Buffer. Write to it. fSuccess = true; p3DBuffer->SetAllParameters(p3DParams,DS3D_IMMEDIATE); p3DBuffer->Release();
} return fSuccess; }
Accessing Tools and DMOs CAudioPath doesn't have any special methods for accessing Tools or DMOs because they would be nothing more than very thin wrappers over GetObjectInPath(). Tools and DMOs have four different ways of providing control, and GetObjectInPath() provides access to all. They are: §
A specific programming interface: Each object provides its own COM interface that allows direct program control over its parameters. For example, the IDirectSoundFXEcho8 interface provides direct access to read and write the Echo parameters.
§
The IMediaParams interface: This interface provides timeline control over each parameter in a standardized way. It includes methods, used in editors, for querying the names and ranges of each parameter, as well as methods for setting parameters with continuously changing curves. DirectMusic supports this via the Parameter Control Track.
§
The IPersistStream interface: This interface provides a standard way to read and write data from a file stream. All DMOs and Tools must support this if they have any data they must read from disk. This can also be used to write data directly into an object, though given the other options, it's typically not the best approach.
§
A property page: Each DMO and Tool has the option of providing a property page, which can be used to edit its parameters in an authoring tool. To do so, the DMO or Tool must provide an ISpecifyPropertyPages interface, which, in turn, leads to the IPropertyPage interface that manages the property page.
Jones takes the last approach, using GetObjectInPath() to access the DMO or Tool's ISpecifyPropertyPages interface to open the property dialog. The code to do this is remarkably simple:
ISpecifyPropertyPages *pPages;
if (SUCCEEDED(pAudioPath->GetObjectInPath(
    DMUS_PCHANNEL_ALL,            // Pchannel.
    DMUS_PATH_BUFFER_DMO,         // Look at the DMO stage.
    0,GUID_All_Objects,           // Any DMO.
    dwIndex,                      // Nth DMO.
    IID_ISpecifyPropertyPages,    // Get the property page interface.
    (void **)&pPages)))
{
    // Success! It has a property page!
    CAUUID PageID;
    if (SUCCEEDED(pPages->GetPages(&PageID)))
    {
        // Use the Windows Property page mechanism to view it.
        OleCreatePropertyFrame(m_hWnd,0,0,
            L"DMO Editor",1,(IUnknown **) &pPages,1,
            PageID.pElems,0,0,0);
        // GetPages() allocated the page list; the caller frees it.
        CoTaskMemFree(PageID.pElems);
    }
    // Release only after the property frame is done with the object.
    pPages->Release();
}
Accessing Volume Volume is a special case because the command to adjust the volume is built directly into the IDirectMusicAudioPath interface. For completeness, CAudioPath provides methods for getting and setting the volume. void CAudioPath::SetVolume(long lVolume) { m_pAudioPath->SetVolume(lVolume,0); m_lVolume = lVolume; } GetVolume() simply returns the last volume that was set. long GetVolume() { return m_lVolume; };
Performing with an AudioPath
Playing a Segment with an AudioPath is simple; just pass the AudioPath as a parameter to PlaySegmentEx(). So, we update CAudio::PlaySegment() to include a CAudioPath as a parameter:

    void CAudio::PlaySegment(CSegment *pSegment,CAudioPath *pPath)
    {
        if (pSegment)
        {
            IDirectMusicAudioPath *pIPath = NULL;
            if (pPath)
            {
                pIPath = pPath->GetAudioPath();
            }
            // Is there a transition Segment?
            IDirectMusicSegment8 *pTransition = NULL;
            if (pSegment->GetTransition())
            {
                pTransition = pSegment->GetTransition()->GetSegment();
            }
            DWORD dwFlags = pSegment->GetFlags();
            if (pTransition)
            {
                dwFlags |= DMUS_SEGF_AUTOTRANSITION;
            }
            IDirectMusicSegmentState8 *pISegState;
            if (SUCCEEDED(m_pPerformance->PlaySegmentEx(
                pSegment->GetSegment(),  // Returns IDirectMusicSegment
                NULL,pTransition,        // Use the transition, if it exists.
                dwFlags,                 // Playback flags.
                0,                       // No time stamp.
                (IDirectMusicSegmentState **) &pISegState, // Returned SegState.
                NULL,                    // No prior Segment to stop.
                pIPath)))                // Use AudioPath, if supplied.
            {
                CSegState *pSegState = new CSegState(pSegment,pISegState);
                if (pSegState)
                {
                    m_SegStateList.AddHead(pSegState);
                }
                pISegState->Release();
            }
        }
    }

Now that we have a decent understanding of what AudioPaths are and how to use them, we've got the primary pieces of the DirectMusic audio architecture in place. We know how to create an AudioPath, we know how to play music and sound effects on it, and we know how to access components within the AudioPath, so we can alter them in real time in response to events in the application.
Chapter 12: Scripting Overview
Todor Fay
After DirectX 6.1 shipped with the original DirectMusic, it quickly became clear from developer feedback that the content-driven approach was well-liked. Content creators and audio leads loved the fact that they could create complex sounds and wrap them up in Segment files. However, a critical piece was still missing. There were still too many scenarios where it was necessary to write down instructions for the programmer, and some of these instructions could be quite complicated. For example: "When the cuddly monster dies, play a fill transition Segment and drop the groove level to 23 while playing byebye.sgt as a secondary Segment aligned to a beat boundary." Ouch! This has issues:
§ That's a lot of information for the programmer to get right. Oh, and how many monsters did you say there were?
§ If the content creator wants to change anything (for example, play a break transition), he or she can't just change the content and try again. No, the whole program needs to be recompiled with the change request.
§ The above items slow down the development process and keep programmers and content creators perpetually irritated with each other.
Wouldn't it be nice if the content creator could say instead, "When the cuddly monster dies, please call the script routine CuddlyMonsterBuysFarm, and I'll take it from there." Thus, scripting was born…
Anatomy of a Script
DirectMusic's scripting support is all built around the Script object. The CLSID_DirectMusicScript object with its IDirectMusicScript interface represents the Script object. Like most DirectX Audio objects, the Loader reads the Script object from a file. Once it has loaded a script, the application can make direct calls into the script to run its routines, as well as pass variables back and forth between the script and the application. So think of a script as a discrete object that is made up of:
§ Routines to call
§ Variables that store data as well as provide a means to pass the data between the application and the script
§ Embedded and/or linked content (Segments, waves, and any other DirectX Audio media) to be referenced by variables and manipulated by routines
Figure 12-1 shows an example of a script with its internal routines, variables, and content. The application manages the script and can make calls to trigger the routines and access the variables. In turn, the routines and variables access the embedded and linked Segments, AudioPaths, and other content.
Figure 12-1: A script with its internal routines, variables, and content.
Routines
The heart of scripting is, of course, in the code. The code in a DirectMusic script is all stored in routines. There is no primary routine that is called first, unlike the required main() in a C program. Instead, any routine can be called in any order. Each routine exposes some specific functionality of the Script object. For example, two routines might be named EnterLevel and MonsterDies. The first would be called upon entering a level in a game. The second would be called when a monster dies. The beauty is that the programmer and content creator only need to agree on the names EnterLevel and MonsterDies and call them at the appropriate times. Each routine has a unique name, and that name is used to invoke the routine via the IDirectMusicScript::CallRoutine() method.

    HRESULT CallRoutine(
        WCHAR *pwszRoutineName,
        DMUS_SCRIPT_ERRORINFO *pErrInfo
    );

Pass only two parameters: the name of the routine and an optional structure that is used to return errors. If the routine does indeed exist (which should always be the case in a debugged project), CallRoutine() invokes it immediately and does not return until the routine has completed. Therefore, the code to call a routine is typically very simple:

    // Invoke the script handler when the princess eats the frog by accident
    pScript->CallRoutine(L"EatFrog",NULL);  // Yum yum.
Notice that no parameters can be passed to a routine. If the caller would like to set some parameters for the routine, it must set them via global variables first. Likewise, if the routine needs to return something, that something needs to be stored in a variable and then retrieved by the caller. For example, a routine that creates an AudioPath from an AudioPath configuration might store the AudioPath in a variable, which the caller can then read immediately after calling the AudioPath creation routine. So, let's look at variables.
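The calling convention described above (set a variable, call the routine, read a variable back) can be sketched without the SDK. The FakeScript class below is a hypothetical stand-in that only imitates the shape of SetVariableNumber(), CallRoutine(), and GetVariableNumber(); it is not the DirectMusic API, and the routine and variable names are invented for illustration.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded script. NOT the DirectMusic API.
class FakeScript
{
public:
    typedef std::function<void(FakeScript&)> Routine;

    void AddRoutine(const std::wstring& name, Routine body)
    {
        m_routines[name] = std::move(body);
    }
    void SetVariableNumber(const std::wstring& name, long value)
    {
        m_variables[name] = value;
    }
    bool GetVariableNumber(const std::wstring& name, long* value) const
    {
        std::map<std::wstring, long>::const_iterator it = m_variables.find(name);
        if (it == m_variables.end()) return false;
        *value = it->second;
        return true;
    }
    // Routines take no arguments; they see only the script's variables.
    bool CallRoutine(const std::wstring& name)
    {
        std::map<std::wstring, Routine>::iterator it = m_routines.find(name);
        if (it == m_routines.end()) return false;
        it->second(*this);
        return true;
    }
private:
    std::map<std::wstring, long> m_variables;
    std::map<std::wstring, Routine> m_routines;
};

// Build a script with one routine that turns HoppingObjects into Intensity.
FakeScript MakeDemoScript()
{
    FakeScript script;
    script.AddRoutine(L"UpdateIntensity", [](FakeScript& self) {
        long hops = 0;
        self.GetVariableNumber(L"HoppingObjects", &hops);
        self.SetVariableNumber(L"Intensity", hops * 10);
    });
    return script;
}

// The caller's side: the "parameter" goes in and the "result" comes out
// through variables, never through the routine call itself.
long IntensityFor(FakeScript& script, long hoppingObjects)
{
    script.SetVariableNumber(L"HoppingObjects", hoppingObjects);
    if (!script.CallRoutine(L"UpdateIntensity")) return -1;
    long intensity = 0;
    script.GetVariableNumber(L"Intensity", &intensity);
    return intensity;
}
```

With the real interface the three steps are the same; only the error handling and the Unicode string types differ.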
Variables There really are three uses for variables in a script: §
Parameter passing variables: Declare a variable in the script and use it to pass information back and forth between the application and the script. An example might be a variable used to track the number of hopping objects in a scene, which in turn influences the musical intensity.
§
Internal variables: Declare a variable in the script and use it purely internal to the script. An example might be an integer that keeps track of how many times the princess steps on frogs before the fairy godmother really gets pissed.
§
Content referencing variables: Link or embed content (Segments, AudioPaths, etc.) in the script, and the script automatically represents each item as a variable. For example, if the Segment FrogCrunch.sgt is embedded in the script, the variable FrogCrunch is automatically allocated to provide access to the Segment.
Variables used in any of these three ways are equally visible to the application calling into the script. Within the scripting language's internal implementation, all variables are handled by the variant data type. A variant is a shape-shifting type that can masquerade as anything from a byte to a pointer. To do so, it stores both the data and a tag that indicates which data type it holds. This is why you can create a variable via the "dim" keyword in a scripting or Basic language and then use it without first specifying its type. That is all fine and good in scripting, but it is not a natural way to work in C++. Therefore, IDirectMusicScript gives you three ways to work with variables and translates from variant appropriately:
§
Number: A 32-bit long, used for tracking numeric values like intensity, quantity, level, etc.
§
Object: An interface pointer to an object. Typically, this is used to pass objects back and forth with the application. In addition, all embedded and referenced content is handled as an object variable.
§
Variant: You still have the option to work with variants if you need to. This is useful for passing character strings and other types that cannot be translated into interfaces or longs. Fortunately, this is rare.
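To make the "data plus tag" idea concrete, here is a minimal sketch of a tagged value and a typed accessor that fails when the tag does not match. The ScriptValue type and its helpers are hypothetical illustrations; the real scripting engine uses the COM VARIANT type, which this does not reproduce.

```cpp
#include <string>

// Hypothetical tag naming the kind of data currently stored.
enum class ValueTag { Number, Object, Text };

// A toy variant: one tag plus a slot for each kind of data.
struct ScriptValue
{
    ValueTag     tag;
    long         number;
    void*        object;  // Stand-in for an interface pointer.
    std::wstring text;
};

ScriptValue MakeNumber(long n)
{
    return ScriptValue{ ValueTag::Number, n, nullptr, L"" };
}

ScriptValue MakeText(const std::wstring& s)
{
    return ScriptValue{ ValueTag::Text, 0, nullptr, s };
}

// Typed access: succeeds only if the tag says the value is a number.
bool TryGetNumber(const ScriptValue& value, long* result)
{
    if (value.tag != ValueTag::Number) return false;
    *result = value.number;
    return true;
}
```

A numeric accessor that refuses non-numeric values is also what makes it possible to probe a variable's type by simply trying to read it as a number.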
To accommodate all three types, IDirectMusicScript has Get and Set methods for each. They are GetVariableNumber(), GetVariableObject(), and GetVariableVariant() to retrieve a variable from the script and SetVariableNumber(), SetVariableObject(), and SetVariableVariant() to assign a value to a variable in the script. Each Get or Set call passes the name of the variable along with the data itself. For example, the parameters for GetVariableNumber(), which retrieves a numeric value, are: the Unicode name, a pointer to a long to fill, and an optional error structure in case there's a failure.

    HRESULT GetVariableNumber(
        WCHAR *pwszVariableName,
        LONG *plValue,
        DMUS_SCRIPT_ERRORINFO *pErrInfo
    );

The code to retrieve a variable is simple:

    // Find out how many frog chances the princess has left
    long lFrogs;
    pScript->GetVariableNumber(L"FrogsLeft",&lFrogs,NULL);

Likewise, setting a variable is straightforward. Here is the prototype for SetVariableObject() as an example:

    HRESULT SetVariableObject(
        WCHAR* pwszVariableName,
        IUnknown* punkValue,
        DMUS_SCRIPT_ERRORINFO* pErrInfo
    );

Again, we pass the name of the variable. Since it is treated as an object, we pass its interface pointer (of which IUnknown is always the base). We can also pass that optional error structure (more on that in a bit).

    // Pass the script a 3D AudioPath for tracking a hopping frog
    IDirectMusicAudioPath *pPath = NULL;
    pPerformance->CreateStandardAudioPath(DMUS_APATH_DYNAMIC_3D,3,TRUE,&pPath);
    if (pPath)
    {
        pScript->SetVariableObject(L"FrogPath",pPath,NULL);
    }
    // Remember to release the path when done with it. The script will
    // release its pointer to the path on its own.

Retrieving a variable object is a little more involved in that you need to identify what interface you are expecting. GetVariableObject() takes an additional parameter: the interface ID of the interface that it is expecting.

    // Retrieve a Segment that is stored in the script
    IDirectMusicSegment8 *pSegment;
    pScript->GetVariableObject(
        L"FrogScream",             // Name of the variable.
        IID_IDirectMusicSegment8,  // IID of the interface.
        (void **)&pSegment,        // Address of the interface.
        NULL);                     // Optional error info.
Content Obviously, it is important that scripts be able to directly manipulate the DirectMusic objects that they control. To deal with this, key objects that you would want to manipulate from within a script all have scripting extensions. These are the AudioPath, AudioPath Configuration, Performance, Segment, and SegmentState (called a "playing Segment"). They all exhibit methods that can be called directly from within the script (internally, this is done via support for the IDispatch interface). Documentation for these scripting extensions can be found in the DirectMusic Scripting Reference section of the DirectMusic Producer help file, not the regular DirectX programming SDK. It is also very important that a script be able to introduce its own content. It's not enough to load files externally and present them to the script with SetVariableObject() because that implies that the application knows about all of the objects needed to run the script, which gets us back to the content creator writing down a long list of instructions for the programmer, etc. Scripting should directly control which files are needed and when. The Script object also supports linking and embedding objects at authoring time. A linked object simply references objects that are stored in files outside the script file. This is necessary to avoid redundancy if an object is shared by multiple scripts. If the script is the sole owner of an object, it can embed it directly, which results in a cleaner package because there are fewer files. Setting up linked or embedded content is very simple in DirectMusic Producer. In the project tree, just drag the objects you want to use in the script into its Embed Runtime or Reference Runtime folder. This also automatically makes sure the object loads with the script, and if the object can be directly manipulated, it appears as a variable in the script.
You can have everything you need for a particular section in your application all wrapped up in one script file, which is wonderfully clean. Be careful, though. By default, script files automatically download all their DLS and wave instruments to the synthesizer when the script is initialized. If you have more stuff than you want downloaded at any one time, you need to manage this directly in your scripting code and avoid the automatic downloading. If you ever wondered why you could not simply open a script file in a text editor, the embedded and linked content is the reason. Although the code portion of the script is indeed text, the linked and embedded data objects are all binary in RIFF format, which cannot be altered in a text editor.
Finding the Routines and Variables
The IDirectMusicScript interface includes two methods, EnumRoutine() and EnumVariable(), which can be used to find out what is in the script. These are very useful for writing small applications that can scan through the script and display all routines and variables and then directly call them. Jones does that, as you will see in a bit.

    HRESULT EnumRoutine(
        DWORD dwIndex,    // nth routine in the script.
        WCHAR *pwszName   // Returned Unicode name of the routine.
    );

EnumRoutine() simply iterates through all routines, returning their names.

    HRESULT EnumVariable(
        DWORD dwIndex,    // nth variable in the script.
        WCHAR *pwszName   // Returned Unicode name of the variable.
    );

EnumVariable() does the same for each variable, but it's a little more involved because it enumerates all declared variables as well as all variables that were automatically created for linked and embedded objects. There is a little trick that you can use to figure out which type a variable is. Make a call to GetVariableNumber(), and if it succeeds, the variable must be a declared variable; otherwise, it must be a linked or embedded object. Jones uses this technique.
You can also use GetVariableObject() to search for specific object types, since it requires a specific interface ID. Here is a routine that will scan a script looking for all objects that support IDirectMusicObject and display their name and type. This should display everything that is linked or embedded content and can be manipulated as a variable.

Note: Some content (for example, styles and DLS Collections) cannot be directly manipulated by scripting, so they do not have variables assigned to them.

    // Search for all content variables within a script
    // and display their name and type.
    void ScanForObjects(IDirectMusicScript *pScript)
    {
        DWORD dwIndex;
        HRESULT hr = S_OK;
        // Enumerate through all variables
        for (dwIndex = 0; hr == S_OK; dwIndex++)
        {
            WCHAR wzName[DMUS_MAX_NAME];
            hr = pScript->EnumVariable(dwIndex,wzName);
            // hr == S_FALSE when the list is finished.
            if (hr == S_OK)
            {
                IDirectMusicObject *pObject;
                // Only objects that can be loaded from file have the
                // IDirectMusicObject interface.
                HRESULT hrTest = pScript->GetVariableObject(wzName,
                    IID_IDirectMusicObject,
                    (void **) &pObject,
                    NULL);
                if (SUCCEEDED(hrTest))
                {
                    // Success. Get the name and type and display them.
                    DMUS_OBJECTDESC Desc;
                    Desc.dwSize = sizeof (Desc);
                    if (SUCCEEDED(pObject->GetDescriptor(&Desc)))
                    {
                        char szName[DMUS_MAX_NAME + 50];
                        if (Desc.dwValidData & DMUS_OBJ_NAME)
                        {
                            wcstombs(szName,Desc.wszName,DMUS_MAX_NAME);
                        }
                        else
                        {
                            strcpy(szName,"");
                        }
                        // This should be a Segment, AudioPath configuration,
                        // or another script because only these support IDispatch
                        // and can be loaded from file.
                        if (Desc.guidClass == CLSID_DirectMusicSegment)
                        {
                            strcat(szName,": Segment");
                        }
                        else if (Desc.guidClass == CLSID_DirectMusicAudioPathConfig)
                        {
                            strcat(szName,": AudioPath Config");
                        }
                        else if (Desc.guidClass == CLSID_DirectMusicScript)
                        {
                            strcat(szName,": Script");
                        }
                        strcat(szName,"\n");
                        OutputDebugString(szName);
                    }
                    pObject->Release();
                }
            }
        }
    }
Error Handling
Sometimes tracking errors with scripting can be frustrating. Variable or routine names may be wrong, in which case calls to them fail. There can be errors in the routine code, but there is no way to step into the routines when debugging. Therefore, it helps to have a mechanism for figuring out what went wrong. Enter the DMUS_SCRIPT_ERRORINFO structure. As we have seen already, CallRoutine() and all of the methods for manipulating variables can pass this as an option.

    typedef struct _DMUS_SCRIPT_ERRORINFO {
        DWORD   dwSize;
        HRESULT hr;
        ULONG   ulLineNumber;
        LONG    ichCharPosition;
        WCHAR   wszSourceFile[DMUS_MAX_FILENAME];
        WCHAR   wszSourceComponent[DMUS_MAX_FILENAME];
        WCHAR   wszDescription[DMUS_MAX_FILENAME];
        WCHAR   wszSourceLineText[DMUS_MAX_FILENAME];
    } DMUS_SCRIPT_ERRORINFO;

As you can see, this structure can provide a wealth of information to find out where something went wrong, especially if the error was in the script code. The SDK covers each of these fields quite well, so let's skip that. If you are curious, look in DirectMusic>DirectMusic C/C++ Reference>DirectMusic Structures at Microsoft's web site.
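One way to put the error structure to work is to turn it into a single compiler-style message for a debug log. The sketch below is an assumption-laden illustration: ScriptErrorInfo is a cut-down stand-in that borrows three field names from DMUS_SCRIPT_ERRORINFO so that the example compiles without the SDK headers, and FormatScriptError is a hypothetical helper, not something Jones provides.

```cpp
#include <string>

// Cut-down stand-in for DMUS_SCRIPT_ERRORINFO (assumed field subset).
struct ScriptErrorInfo
{
    unsigned long ulLineNumber;
    wchar_t       wszSourceFile[64];
    wchar_t       wszDescription[64];
};

// Hypothetical helper: build "file(line): description".
std::wstring FormatScriptError(const ScriptErrorInfo& info)
{
    std::wstring message = info.wszSourceFile;
    message += L"(" + std::to_wstring(info.ulLineNumber) + L"): ";
    message += info.wszDescription;
    return message;
}
```

In a real application you would fill in the dwSize member of the genuine structure, pass its address as the last argument of CallRoutine() or one of the variable methods, and format it only when the call fails.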
Script Tracks You can also trigger the calling of script routines by using a Script Track in a Segment. This is very powerful because it gives the opportunity to have time-stamped scripting. You can even use scripting to seamlessly control the flow of music playback by calling routines at decision points in the Segments to decide what to play next based on current state variables. Script Tracks are very straightforward to author and use. In DirectMusic Producer, open a Segment and use the Add Tracks command to add a Script Track to it. At the point in the timeline where you want the script routine called, insert a script event. Then choose from a pull-down menu in the Properties window which routine from which script to call. Each script event also has options for timing. The script routine can be called shortly ahead of the time stamp or exactly at it. The former is useful if you want to make decisions about what to play once the time stamp is reached. For example, you would not want to call the routine to queue a new Segment at exactly the time it should play, since latency would make that impossible. The Script Track is quite flexible. It supports unlimited script calls. It allows calls to more than one script, and no two calls have to use the same timing options.
Script Language
DirectMusic's scripting lets you choose between two language implementations. It does this by internally supporting a standard COM interface for managing scripting languages, called IActiveScript. Theoretically, that means that you can use any language supported by IActiveScript, which includes a wide range of options from JavaScript to Perl. However, DirectMusic Producer offers content creators exactly two options, VBScript and AudioVBScript. This is just fine because they really are the best choices. VBScript is a standard scripting implementation that Microsoft provides as part of the Windows operating system. VBScript is a very full-featured language. However, it is also big, requiring close to a megabyte to load.
AudioVBScript is a very light and nimble Basic scripting language that was developed specifically for DirectMusic scripting. It is fast and very small and ships as part of the API. However, it does not have many of the more sophisticated features found in VBScript. Typically, scripters should use AudioVBScript. If the scripter runs up against a brick wall because AudioVBScript does not have a feature they need (like arrays, for example), then the scripter can simply switch to VBScript and continue. Since AudioVBScript is a subset of VBScript, this should be effortless.

Note: Documentation for the AudioVBScript language and all scripting commands is located in the DirectMusic Producer help files under Creating Content>Script Designer>AudioVBScript Language, not the regular DirectX help.
Adding Scripting to Jones
As it turns out, adding scripting support to Jones is not hard at all. We would like to be able to:
§ Load a script
§ Display all script routines
§ Call routines
§ Display all script variables
§ Set variable values
Jones' scripting solution adds a series of list boxes on the left side of the Jones window, as seen in Figure 12-2. The top box lists all currently loaded scripts. Underneath it, two list boxes display all routines and variables for the currently selected script. To call a routine, simply double-click on it or, alternately, select it and press the Call button. To view a variable, click on it once and view it in the text box. To change it, edit its value in the text box.
Figure 12-2: Jones with scripting support.
Jones Data Structures
We add a new class, CScript, to our audio library. CScript keeps track of one instance of a loaded script via its IDirectMusicScript interface. I considered creating data structures to manage the lists of routines and variables, but the enumeration routines provided by IDirectMusicScript are so straightforward, it seemed silly to do so. We start with the relatively simple CScript class. It tracks the name and script interface and provides access methods to both. The biggest item is the initialization method.

    class CScript : public CMyNode
    {
    public:
        CScript();
        ~CScript();
        CScript* GetNext() { return (CScript*)CMyNode::GetNext();}
        IDirectMusicScript *GetScript() {return m_pScript; };
        void Init(IDirectMusicScript *pScript,
                  IDirectMusicPerformance8 *pPerformance);
        char *GetName() { return m_szName; };
    private:
        IDirectMusicScript * m_pScript;
        char m_szName[DMUS_MAX_NAME];  // Name, for display
    };

CScript is managed in a linked list by CScriptList. We add one instance of CScriptList to CAudio to manage all of the scripts, and we add a method for loading scripts:

    CScript *LoadScript(WCHAR *pwzFileName);  // Load a script from file.
    CScriptList m_ScriptList;                 // List of loaded scripts.

That is it.
Loading the Script
To load a script, click on the Open… button. This opens the standard file dialog and prompts you to load a script file with the extension .spt. Select a script file (I would recommend SimpleScript.spt in the Media directory) and click on Open…. The script name is placed in the script list, and its routines and variables appear in the two lists below. Let's look at LoadScript(). It uses the Loader to read a script from the passed file path. Once LoadScript() successfully reads the script file, it calls the CScript's Init() method, which prepares the script for running.

    CScript *CAudio::LoadScript(WCHAR *pwzFileName)
    {
        WCHAR wzSearchDirectory[DMUS_MAX_FILENAME];
        wcscpy(wzSearchDirectory,pwzFileName);
        WCHAR *pwzEnd = wcsrchr(wzSearchDirectory,'\\');
        if (pwzEnd)
        {
            // If we have a directory path, use it to set
            // up the search directory in the Loader.
            // The Loader will look here for linked files,
            // including Segments, Styles, and DLS instruments.
            *pwzEnd = 0;
            m_pLoader->SetSearchDirectory(GUID_DirectMusicAllTypes,
                wzSearchDirectory,FALSE);
        }
        CScript *pScript = NULL;
        IDirectMusicScript8 *pIScript;
        // Now, load the script.
        if (SUCCEEDED(m_pLoader->LoadObjectFromFile(
            CLSID_DirectMusicScript,  // Class ID for script.
            IID_IDirectMusicScript8,  // Interface ID for script.
            pwzFileName,              // File path.
            (void **) &pIScript)))    // Returned IDirectMusicScript.
        {
            // Create a CScript object to manage it.
            pScript = new CScript;
            if (pScript)
            {
                // Initialize and add to list.
                pScript->Init(pIScript,m_pPerformance);
                m_ScriptList.AddTail(pScript);
            }
            pIScript->Release();
        }
        return pScript;
    }
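The search-directory step in LoadScript() boils down to cutting the path at its last backslash. Here is a minimal, SDK-free sketch of that one step using std::wstring instead of a fixed-size buffer; the function name DirectoryOf is hypothetical.

```cpp
#include <string>

// Hypothetical helper: return the directory part of a Windows-style path,
// without the trailing backslash, or an empty string if there is none.
std::wstring DirectoryOf(const std::wstring& path)
{
    std::wstring::size_type pos = path.rfind(L'\\');
    if (pos == std::wstring::npos)
    {
        return std::wstring();  // Bare file name: no directory to search.
    }
    return path.substr(0, pos);
}
```

Working on a std::wstring avoids the risk of copying an over-long path into a DMUS_MAX_FILENAME buffer; the result's c_str() could be copied or cast as needed for a call such as SetSearchDirectory().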
CScript::Init() is similarly simple. It gets the script ready to run by calling the script's IDirectMusicScript::Init() routine and then retrieving the script's name so it can be displayed.

    void CScript::Init(IDirectMusicScript *pScript,
                       IDirectMusicPerformance8 *pPerformance)
    {
        m_pScript = pScript;
        pScript->AddRef();
        // Initialize the script for this performance.
        // In addition to providing a global performance pointer
        // for the scripting, this automatically downloads
        // all waves and DLS instruments, if this option was set
        // in the script at authoring time (which is usually the case).
        pScript->Init(pPerformance,NULL);
        // Get the object descriptor from the script. This includes the name.
        IDirectMusicObject *pIObject = NULL;
        pScript->QueryInterface(IID_IDirectMusicObject,(void **)&pIObject);
        if (pIObject)
        {
            DMUS_OBJECTDESC Desc;
            Desc.dwSize = sizeof(Desc);
            pIObject->GetDescriptor(&Desc);
            pIObject->Release();
            m_szName[0] = 0;
            if (Desc.dwValidData & DMUS_OBJ_NAME)
            {
                wcstombs(m_szName, Desc.wszName, DMUS_MAX_NAME);
            }
            else if (Desc.dwValidData & DMUS_OBJ_FILENAME)
            {
                wcstombs(m_szName, Desc.wszFileName, DMUS_MAX_NAME);
                // Get rid of any file path. Source and destination
                // overlap, so use memmove() rather than strcpy().
                char *pszName = strrchr(m_szName,'\\');
                if (pszName)
                {
                    pszName++;
                    memmove(m_szName,pszName,strlen(pszName) + 1);
                }
            }
            else
            {
                strcpy(m_szName,"Script (no name)");
            }
        }
    }
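One detail worth knowing about wcstombs(): when the converted name fills the destination completely, it does not write a terminating zero. A small wrapper can guarantee termination. This is a sketch under the assumption that the names are convertible in the current C locale; the helper name NarrowName is hypothetical and not part of Jones.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: convert a wide name to a narrow one and always
// leave the destination zero-terminated. Returns false if conversion fails.
bool NarrowName(const wchar_t* source, char* dest, size_t destSize)
{
    if (destSize == 0)
    {
        return false;
    }
    // Leave room for the terminator that wcstombs() may not write.
    size_t written = std::wcstombs(dest, source, destSize - 1);
    if (written == static_cast<size_t>(-1))
    {
        dest[0] = 0;  // A character could not be converted.
        return false;
    }
    dest[written] = 0;
    return true;
}
```

Names longer than the buffer are truncated rather than left unterminated, which is usually the safer failure mode for a display string.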
Enumerating and Calling Routines
Once the script is loaded, Jones displays the list of available script routines in the Routines list box. To run a routine, double-click on its name in the Routines list.
To create that list of routines, Jones calls the IDirectMusicScript::EnumRoutine() method. I thought of wrapping this somehow in CScript for completeness, but that seemed silly. So, here's the code inside Jones' UI that enumerates the routines in the process of creating the list box. Notice that EnumRoutine() signals the end of list with S_FALSE, not an error condition.

    // Enumerate through the script's routines.
    DWORD dwIndex;
    HRESULT hr = S_OK;
    for (dwIndex = 0; hr == S_OK; dwIndex++)
    {
        // Names are returned in Unicode.
        WCHAR wzName[DMUS_MAX_NAME];
        hr = pScript->GetScript()->EnumRoutine(dwIndex,wzName);
        if (hr == S_OK)
        {
            // Convert from Unicode to ASCII and add to list box.
            char szName[DMUS_MAX_NAME];
            wcstombs(szName, wzName, DMUS_MAX_NAME);
            m_ctScriptRoutineList.AddString(szName);
        }
    }

To play the routine, Jones retrieves the routine name from the routine list and uses it to call CallRoutine().

    // Convert routine name to Unicode.
    WCHAR wszName[MAX_PATH];
    mbstowcs(wszName,szName,MAX_PATH);
    // Use the Unicode name to call the routine.
    pScript->GetScript()->CallRoutine(wszName,NULL);
Enumerating and Accessing Variables
The list of variables is displayed below the routines.
Click on a variable to show its value in the edit box. Type over the variable value with a new value and Jones automatically sends the new value to the script. Scripting supports three ways to access variables: objects, variants, and numbers. However, to keep things simple in Jones, we are only supporting numbers. So, when we enumerate through the variables using IDirectMusicScript::EnumVariable(), we need to make sure that the variable is valid as a number and we want to retrieve the current value, which we can store in our list. We do both with a call to IDirectMusicScript::GetVariableNumber(). If the call succeeds, we can display the variable.

    DWORD dwIndex;
    HRESULT hr = S_OK;
    for (dwIndex = 0; hr == S_OK; dwIndex++)
    {
        WCHAR wzName[DMUS_MAX_NAME];
        // Enumerate through all variables
        hr = pScript->GetScript()->EnumVariable(dwIndex,wzName);
        if (hr == S_OK)
        {
            long lData = 0;
            // But verify that each variable is capable of providing a number.
            // If not, ignore it. Keep this result separate from hr so that a
            // non-numeric variable does not end the enumeration early.
            HRESULT hrTest =
                pScript->GetScript()->GetVariableNumber(wzName,&lData,NULL);
            if (SUCCEEDED(hrTest))
            {
                // Success. Convert to ASCII and place in the list box.
                char szName[DMUS_MAX_NAME] = "";
                wcstombs(szName, wzName, DMUS_MAX_NAME);
                int nIndex = m_ctScriptVariableList.AddString(szName);
                m_ctScriptVariableList.SetItemData(nIndex,(DWORD)lData);
            }
        }
    }

Next, we need to be able to edit the variable. When a variable is selected in the list, Jones pulls the variable data from the list (previously retrieved with GetVariableNumber()) and places it in the edit box. Then, in response to user edit, it sets the new value.

    // Get the name of the variable.
    char szName[MAX_PATH];
    m_ctScriptVariableList.GetText(nSelect,szName);
    // Convert the name to Unicode.
    WCHAR wszName[MAX_PATH];
    mbstowcs(wszName,szName,MAX_PATH);
    // Use the name to set the variable to the new value.
    pScript->GetScript()->SetVariableNumber(wszName,m_lScriptVariable,NULL);
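The enumeration pattern has two separate outcomes to track: whether the list has ended, and whether a given variable can be read as a number. The sketch below imitates that pattern with a hypothetical StubScript class and made-up result codes (kOk, kFalse, and kFail stand in for S_OK, S_FALSE, and a failure HRESULT). It is not the DirectMusic API. It shows why the probe result should live in its own variable: a content variable in the middle of the list must not stop the loop.

```cpp
#include <string>
#include <vector>

// Made-up result codes standing in for S_OK, S_FALSE, and a failure code.
typedef long Result;
const Result kOk = 0;
const Result kFalse = 1;
const Result kFail = -1;

struct StubVariable
{
    std::wstring name;
    bool         isNumber;  // false for linked or embedded content.
    long         value;
};

// Hypothetical stand-in for a script object.
struct StubScript
{
    std::vector<StubVariable> vars;

    Result EnumVariable(unsigned long index, std::wstring* name) const
    {
        if (index >= vars.size()) return kFalse;  // End of list, not an error.
        *name = vars[index].name;
        return kOk;
    }
    Result GetVariableNumber(const std::wstring& name, long* value) const
    {
        for (size_t i = 0; i < vars.size(); ++i)
        {
            if (vars[i].name == name)
            {
                if (!vars[i].isNumber) return kFail;
                *value = vars[i].value;
                return kOk;
            }
        }
        return kFail;
    }
};

// Collect the names of all variables that can be read as numbers.
std::vector<std::wstring> ListNumericVariables(const StubScript& script)
{
    std::vector<std::wstring> names;
    Result hr = kOk;
    for (unsigned long index = 0; hr == kOk; ++index)
    {
        std::wstring name;
        hr = script.EnumVariable(index, &name);
        if (hr == kOk)
        {
            long value = 0;
            // Separate result: failure here only skips this variable.
            Result hrProbe = script.GetVariableNumber(name, &value);
            if (hrProbe >= 0) names.push_back(name);
        }
    }
    return names;
}
```

If the probe result were stored in hr instead, the loop condition would see the failure and the variables after the first content variable would never be listed.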
A Scripting Sample To really understand programming with scripting, it helps to mess around with the scripting side of things too. To that end, I created a wacky little script called SimpleScript.spt.
Bad Dream
SimpleScript endeavors to provide a soundtrack to a very strange scene. Imagine a mechanical world filled with moving metal parts, and large train-like objects fly by at random intervals with a spray of steam and grinding metal. You are being hunted by a very annoying translucent wasp made of cellophane. As the story progresses, there are more and more mechanical objects to dodge, and as you find yourself increasingly cornered, your level of hysteria increases. Occasionally, you have a good idea that causes time to stand still, but then chaos falls back around your ears and the chase ensues. Eventually, you wake up in a cold sweat, and it's over. To build this soundscape, we need the following:
§ Theme music that plays forever and responds to both changes in "activity" and "hysteria"
§ A routine to start the music
§ Variables and routines for updating the activity and hysteria
§ A routine to play when the wasp flies by
§ A routine to play when a train is encountered
§ A routine to play when there is a "really good idea"
§ A routine to finish the music when the scene is over
The SimpleScript example incorporates solutions for all of these. Load it into Jones, so you can call each routine as we discuss it. If you are feeling adventurous, open it directly in DirectMusic Producer and study it in greater depth as we go along. Before declaring any routines, SimpleScript declares two variables that will be used to set the Activity and Hysteria levels as the scene plays.

    ' Variables set by the application:
    dim Activity    ' Game supplied Activity level
    dim Hysteria    ' Game supplied Hysteria level
Theme Music
The Segment SkaNSeven.sgt supplies the theme music. It uses a Style for playback (thanks to Microsoft's Kelly Craven for creating a great Style), which has the advantage that Style Segments can respond to global groove level changes. Because Style Segments are chord based, other musical effects can track them harmonically. SkaNSeven.sgt is 42 measures long with measures 10 through 42 looping infinitely, so once it starts playing, it just loops and loops. In truth, you'd want more variety, but hey, this is called "SimpleScript" after all. The first routine, StartLevel, is called by the application when we first enter the scene. StartLevel establishes the initial Activity and Hysteria levels by setting these values to zero and then calling the routine Update (more on that in a second). Finally, StartLevel plays SkaNSeven.sgt to get the music looping.

    ' StartLevel is called when we enter the scene.
    sub StartLevel
        Activity = 0       ' Init the Activity
        OldActivity = 0    ' Init the previous as well
        Hysteria = 0       ' Init the Hysteria
        Update             ' Now use these to set tempo and groove
        ' Finally, play the music
        SkanSeven.Play AtMeasure
    end sub

Go ahead and click on the StartLevel command to hear the music start.
Setting Activity and Hysteria
Two variables, Activity and Hysteria, control the overall groove level and tempo of the theme music as it plays. To change one of these, the application (or Jones) calls IDirectMusicScript::SetVariableNumber() with the name of the variable (Activity or Hysteria) and that variable's new value. But changing the variable is not enough. The script can't act on the new information unless one of its routines is called. So, the script has a routine, Update, that should be called whenever either variable has changed. Update first makes sure that Activity and Hysteria are within reasonable ranges. Then, if it sees that the Activity increased, it plays a short secondary Segment, called Change.sgt, on a beat boundary. Finally, Update applies formulas to translate the two variables into appropriate groove level and tempo modifiers.
Why does it play the Segment? When the groove level is changed, the change doesn't take effect until the next pattern boundary within the Style, which is usually every measure. Since something just happened that is supposed to increase the musical activity, it would seem odd to wait up to a whole measure to hear the music react. Yet, cutting immediately would ruin the timing. So, we play the change Segment to add a flurry of activity while we wait for the groove level to hit at the next measure boundary.

    ' Every time the Activity or Hysteria variables are updated by
    ' the application, it needs to call Update so the script can act
    ' accordingly.
    sub Update
        ' First, make sure everything is in range.
        if (Hysteria < 0) then
            Hysteria = 0
        end if
        if (Hysteria > 10) then
            Hysteria = 10
        end if
        if (Activity < 0) then
            Activity = 0
        end if
        if (Activity > 6) then
            Activity = 6
        end if
        ' If the activity is increasing, play a motif.
        if (OldActivity < Activity) then
            Change.play IsSecondary+AtBeat
        end if
        ' Store the activity for next time around.
        OldActivity = Activity
        ' Then, set the levels
        SetMasterGrooveLevel Activity*25 - 50
        SetMasterTempo Hysteria*2 + 98
    end sub

To test this, first change the Activity or Hysteria variable, and then run the Update routine. In Jones, this means clicking on the variable name, editing its value, and then double-clicking on Update.
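To see what ranges the Update formulas actually produce, it can help to mirror the arithmetic outside the script. The following is only a sketch in standalone C++: MasterLevels, Clamp, and ComputeLevels are hypothetical names invented for this illustration and are not part of DirectMusic or of the sample script.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical holder for the two values Update computes; not a DirectMusic type.
struct MasterLevels {
    int grooveLevel;  // Value the script hands to SetMasterGrooveLevel.
    int tempo;        // Value the script hands to SetMasterTempo.
};

// Keep v inside [lo, hi], as the script's if/end if blocks do.
static int Clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return std::max(lo, std::min(v, hi));
}

// Mirrors the clamping and the two formulas of the script's Update routine.
MasterLevels ComputeLevels(int activity, int hysteria)
{
    activity = Clamp(activity, 0, 6);
    hysteria = Clamp(hysteria, 0, 10);
    return { activity * 25 - 50, hysteria * 2 + 98 };
}
```

With the clamping in place, Activity 0 through 6 maps to groove values from -50 to 100 in steps of 25, and Hysteria 0 through 10 maps to tempo values from 98 to 118 in steps of 2.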
Wasp Flies By
When the wasp flies by, the routine Wasp can be called. Wasp plays a secondary Segment written with a Pattern Track, so it can transpose on top of the current chord and key. The Segment has many different variations so that the melody it plays is always a little different. This wasp has personality! On the other hand, when Activity is high, it seems intuitive that the wasp should be a little louder and a little angrier. So, the Wasp routine checks the current Activity level before deciding which of two Segments to play.

    ' When a Wasp flies by in the game, call the Wasp routine which plays a secondary
    ' Segment. If the Activity is above 2, play a slightly louder and more intense Wasp.
    ' In either case, align with the beat.
    sub Wasp
        if (Activity > 2) then
            BigWasp.play IsSecondary+AtBeat
        else
            LittleWasp.play IsSecondary+AtBeat
        end if
    end sub
Train
Now let's script the behaviors for the train. When a train lumbers by and the user gets close to the tracks, the routine NearTracks is called to play something inspiring. This is a simple sequence with some weird train-like sound effects. Since this isn't musical at all, it can play immediately and not sound odd.

    ' When close to train tracks, call NearTracks.
    ' This isn't rhythmically attached to the music, so
    ' play it immediately.
    sub NearTracks
        Train.play IsSecondary+AtImmediate
    end sub
Thinking
I don't know about you, but when I have an original thought, the world stops and takes notice. Time stands still, multicolored flashing lights flutter around my head, and fog comes out of my nose. The Thinking routine attempts to recreate this otherworldly experience. It does so in an interesting way. It plays a Segment, Pause.sgt, that is a controlling Segment with a low groove level accompanied by break and fill embellishments. These alter the theme music by causing it to track to the low groove level and play the break and fill embellishments, while continuing with its own chord progression. Pause also plays a few drone-like notes and eerie sounds to telegraph the experience of having a head bursting full of profound, karmic thoughts. But, again, with a groove change, we have a situation where the effect is delayed until the next measure boundary. We need to hear something sooner! Profound thoughts like mine can't wait. The Idea.sgt secondary Segment fills the breach with a brief chorus of angelic voices.

    ' When it's time to stop and think, call Thinking.
    sub Thinking
        ' Play a small motif to carry us over since
        ' the controlling Segment is aligned to a measure.
        Idea.play IsSecondary+AtBeat
        ' Now, the controlling Segment, which plays some
        ' embellishments and drone sounds while altering
        ' the groove level.
        Pause.play IsControl+AtMeasure
    end sub
Note
There are actually ways to force a groove level or embellishment to occur sooner. You can cause an immediate invalidation (see IDirectMusicPerformance::Invalidate), or you can transition to a new Segment on a beat boundary while making the groove change.
Enough Already!
When it's clear that the music itself is causing the hysteria, you can opt out by calling the EndLevel routine. This takes advantage of DirectMusic composition technology, which dynamically writes a transition Segment and plays it to lead into the final Segment, SkanSevenEnd.sgt. It's worth listening to this a few times to hear how it is able to pick chord progressions that wrap up the music quite handily, regardless of where the theme was at the moment you decided to leave.

    ' EndLevel is called when we are done.
    sub EndLevel
        ' Play the finish Segment but precede it with a dynamically
        ' composed transition with a fill and chords to modulate into it.
        SkanSevenEnd.Play AtMeasure+PlayFill+PlayModulate
    end sub

Note
If you are previewing in Jones, you will notice that the visual display temporarily shows nothing when the composed transition Segment plays. Magic? I like to think everything that comes out of the composition technology is magic, but there's a more grounded reason. Jones inserts a special display track in each Segment when it loads it. This special Track gives Jones the ability to follow the Segments as they play. The transition Segment is created on the fly within DirectMusic, so there is no opportunity for Jones to intercept it. But, if you want to believe it's magic…
Enough of my foray into sound design. If nothing else, this exercise furthers the point that I should stick to programming and leave sound design to the pros. (And dream interpretation, too.) If your project involves music and audio design that are done by somebody other than the programmer (read: YOU!), it pays to use scripting. It frees the participants to work faster and more efficiently, and there's no question that you will end up with a product that is significantly more sophisticated and well honed. Not to mention it will save you, as programmer, additional headaches! On the rare occasion that you are doing it all yourself, it might be a toss-up, but only because scripting is harder to debug than straight C++ in that you can't single-step through the code. That's a minor point.
Chapter 13: Sound Effects Overview
Todor Fay
We have talked a bit about how wonderful DirectX Audio is for sound effects work, but we have yet to really focus on anything outside of music making. Although the Loader, Segment, AudioPath, and scripting are all important prerequisites for creating immersive audio environments with DirectX Audio, our examples have primarily focused on musical applications. Now we will spend some time focusing specifically on using DirectX Audio for sound effects. There are some big reasons why you can benefit dramatically by playing sound effects through AudioPaths instead of the traditional approach of loading and playing sounds directly in DirectSound Buffers:
§ There is a performance advantage. 3D spatialization, whether in software or hardware, is significantly more expensive than software mixing of individual sounds on today's CPUs. Even with hardware acceleration, the overhead of shipping the audio data over the bus can be more expensive than a simple CPU mix. The AudioPath mechanism mixes one or more sounds in one or more Segments and sends them all together to one 3D channel on the sound card. In contrast, every single sound in the Buffer approach has its own 3D channel. This approach ultimately costs quite a bit more when sounds are layered, and to add insult to injury, it is more complex to program.
§ Using AudioPaths allows for more densely layered sound environments because the limiting factor is not the number of sounds that can play at once, but rather the number of 3D positions that the sounds can play through.
§ The AudioPath approach is content driven. The authoring format, with scripting, variation, and more, is much richer than wave files. Therefore, the sound designer can work on complex sound schemes without tying up the programmer.
Naturally, working with AudioPaths is not without its unique challenges. I hope that we will address all of them in this chapter.
Jones is not the greatest place to demonstrate working with 3D sound effects in a game-like environment, so our programming treat this time, called Mingle, is a reproduction of a lively cocktail party with some truly fascinating people milling around. We will start by walking through the essentials of programming sound effects with DirectX Audio and then spend some time with critical issues that can make or break your audio design. Lastly, we will dig into Mingle.
Sound Effects Programming 101
In general, sound effects programming with DirectX Audio is very straightforward. We already touched on it a bit in Chapter 8 when we added 3D to our Hello World application, and the chapters on Segments and AudioPaths covered just about all the basics for loading and playing sounds on AudioPaths. Here is a quick walk through the specific things you must do to get sounds playing effectively with real-time control of their 3D position. We start with the one assumption that you created the Loader and Performance and have initialized the Performance.
Allocating a Sound Effects AudioPath
Although any possible AudioPath configuration can be defined in an AudioPath configuration file, for most sound effects work, three predefined types do the job admirably.
§ DMUS_APATH_DYNAMIC_3D: Use this for sounds that need to be positioned in 3D space. All sounds played on the AudioPath are mixed and then streamed through one DirectSound 3D Buffer for 3D spatialization. The 3D controls of the DirectSound Buffer can be directly manipulated by the application to move the mix in space.
§ DMUS_APATH_DYNAMIC_MONO: Use this for mono sounds that do not need 3D positioning. This mixes everything played on the AudioPath into a Mono DirectSound Buffer. Note that these can still be panned within the stereo spectrum.
§ DMUS_APATH_DYNAMIC_STEREO: Use this for stereo sounds. Again, everything is mixed down into one Buffer, this time in stereo.
To create one of these AudioPaths, call IDirectMusicPerformance::CreateStandardAudioPath() and pass the type of AudioPath you want. Give it the number of pchannels you will need on the AudioPath. The number of pchannels is really determined by how many pchannels are used in the Segments that will play on the path. For sound effects work, this is usually a small number. For example, if you are just playing wave files, only one pchannel is needed. Here is the code to create a 3D AudioPath with four pchannels.

    IDirectMusicAudioPath *pPath = NULL;
    m_pPerformance->CreateStandardAudioPath(
        DMUS_APATH_DYNAMIC_3D,  // Create a 3D path.
        4,                      // Give it four pchannels.
        true,                   // Activate it for immediate use.
        &pPath);                // Returned path.

Later, when done with the AudioPath, just use Release() to get rid of it:

    pPath->Release();

Working with AudioPaths is quite different from programming directly with DirectSound Buffers. Importantly, you can play any number of sounds on the same AudioPath. This means that instead of creating a 3D buffer for each individual sound, you can create one AudioPath and play any number of Segments on it, and each Segment can have one or more sounds of its own that play at once. This paradigm shift is important to remember as you design your sound system. Forgive me if I sound a little redundant with this fact, but it is a good one to hammer home.
Playing Sounds on the AudioPath
First, load the sound into a Segment. The easiest way to do this is to use the Loader's LoadObjectFromFile() method, which takes a file name and returns the loaded object (in this case a Segment):

    IDirectMusicSegment8 *pSegment = NULL;
    // Now, load the Segment.
    m_pLoader->LoadObjectFromFile(
        CLSID_DirectMusicSegment,   // Class ID of Segment.
        IID_IDirectMusicSegment8,   // Segment interface.
        pwzFileName,                // File path.
        (void **) &pSegment);       // Returned Segment.

To play the Segment on the AudioPath, pass both the Segment and AudioPath to the Performance's PlaySegmentEx() method:

    // PlaySegmentEx() hands back the base Segment State interface.
    IDirectMusicSegmentState *pSegState = NULL;
    HRESULT hr = m_pPerformance->PlaySegmentEx(
        pSegment,               // The Segment.
        NULL, NULL,             // Ignore these.
        DMUS_SEGF_SECONDARY,    // Play as a secondary Segment.
        0,                      // No time stamp.
        &pSegState,             // Optionally, get a SegState.
        NULL,                   // No prior Segment to stop.
        pPath);                 // Use AudioPath, if supplied.
Remember that there is no limit to how many Segments can play on how many AudioPaths. One Segment can play on multiple AudioPaths, and multiple Segments can play on one AudioPath. If the Segment needs to be stopped for some reason, there are three ways to do it:
§ Stopping the Segment State: This stops just the one instance of the playing Segment, so all other Segments playing on the AudioPath continue. Use this to terminate one sound on an object without affecting others.
§ Stopping the Segment: This stops all instances of the Segment, no matter which AudioPaths they are playing on. This is the simplest as well as the least useful option.
§ Stopping everything in the AudioPath: This stops all Segments currently playing on the AudioPath. This is very useful for sound effects work.
In all three cases, call the Performance's StopEx() method and pass the Segment, Segment State, or AudioPath as the first parameter. StopEx() figures out which parameter it has received and terminates the appropriate sounds. In these two examples, we first pass the Segment State and then the AudioPath:

    // Stop just the one instance of a playing Segment.
    m_pPerformance->StopEx(pSegState, NULL, 0);
    // Stop everything on the AudioPath.
    m_pPerformance->StopEx(pPath, NULL, 0);
Activating and Deactivating the AudioPath
Even when all Segments on an AudioPath have stopped playing, the AudioPath continues to use system resources. This primarily involves the overhead of streaming empty data down to the hardware buffer. When there are many hardware buffers, this can be a substantial cost. It is a good idea to deactivate AudioPaths that are currently not in use. Deactivation does not release the AudioPath resources. It keeps them around so the AudioPath is ready to go the next time you need it, but it does eliminate the CPU and I/O overhead. Given that 3D buffers can be somewhat expensive on some hardware configurations, deactivation of the AudioPath is a good feature to use. One method on IDirectMusicAudioPath, Activate(), handles both activation and deactivation. Here is example code for first deactivating and then activating an AudioPath:

    // Deactivate the AudioPath
    pPath->Activate(false);
    // Activate the AudioPath
    pPath->Activate(true);
Controlling the 3D Parameters
Once sounds are playing on an AudioPath, it really gets fun because now you can move them around in space by manipulating the 3D interface of the DirectSound Buffer at the end of the AudioPath. To do this, directly control the 3D Buffer itself via the IDirectSound3DBuffer8 interface, which can be retrieved via a call to the AudioPath's GetObjectInPath() method. The IDirectSound3DBuffer8 interface has many powerful options that you can use to manage the sound image. Of obvious interest are the commands to set the position and velocity. You should be acquainted with all of the features of the 3D Buffer, however. These include:
§ Position: Position is the single most important feature. Without it, you cannot possibly claim to have 3D sound in your application. Get and set the position of the object in 3D space.
§ Velocity: Every object can have a velocity, which is used to calculate Doppler shift. DirectSound does not calculate velocity for you. In theory it could, by measuring the change in position over time, but that would delay its response to any change in velocity. Therefore, you must calculate velocity yourself and set it for each Buffer if you want to hear the Doppler effect.
§ Max distance and min distance: These set the range of distance from the listener at which sounds are progressively attenuated. Sounds closer than min distance cease to increase in volume as they get closer. Sounds farther than max distance cease to get quieter as they get farther away.
§ Cone angle, orientation, and outside volume: A very sophisticated feature is the ability to define how an object projects its sound. You can specify a cone of sound that emits from an object. Cone orientation sets the direction in 3D space. Angle sets the width of an inner and outer cone. The inner cone wraps the space that plays at full volume. The outer cone marks the boundary where sound plays at outside volume. The area between the inner and outer cones gradually attenuates from full volume to outside volume.
§ Mode: You can also specify whether the sound position is relative to the listener (more on that in a second), in absolute space, or centered inside the listener's head. I'm not particularly fond of voices inside my head, so I avoid that last choice.
The DirectX SDK has excellent documentation on working with the DirectSound 3D Buffer parameters, so I am not going to spend a lot of time on this subject beyond the important position and velocity. Here is some example code that sets the 3D position of an AudioPath (more on setting velocity in our application example later in this chapter):

    // Fill in this D3D vector with the sound's position before using it.
    D3DVECTOR VPosition;
    // We'll be using an IDirectSound3DBuffer interface.
    IDirectSound3DBuffer *p3DBuffer = NULL;
    // Use GetObjectInPath to retrieve the Buffer.
    if (SUCCEEDED(m_pAudioPath->GetObjectInPath(
            0, DMUS_PATH_BUFFER, 0, GUID_All_Objects, 0,
            IID_IDirectSound3DBuffer, (void **)&p3DBuffer)))
    {
        // Okay, we got the 3D Buffer. Position it.
        p3DBuffer->SetPosition(VPosition.x, VPosition.y, VPosition.z, DS3D_IMMEDIATE);
        // Then let go of it.
        p3DBuffer->Release();
    }

Likewise, you can control any of the other 3D parameters on the Buffer by using the IDirectSound3DBuffer8 interface. Remember that you can adjust any of the regular DirectSound Buffer parameters, like frequency, pan, and volume, in a similar way with the IDirectSoundBuffer8 interface.
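Since the application has to supply velocity itself, one common fallback when the game has no velocity on hand is to estimate it from two successive positions. The sketch below is standalone C++ under that assumption: Vec3 is a hypothetical stand-in for D3DVECTOR so that the code compiles without the DirectX headers, and EstimateVelocity is an invented helper, not something taken from the SDK or from Mingle.

```cpp
#include <cassert>

// Hypothetical stand-in for D3DVECTOR (three floats: x, y, z).
struct Vec3 { float x, y, z; };

// Estimate velocity, in distance units per second, from two positions
// sampled dtSeconds apart (for example, on consecutive frames).
Vec3 EstimateVelocity(const Vec3 &prev, const Vec3 &curr, float dtSeconds)
{
    assert(dtSeconds > 0.0f);
    const float inv = 1.0f / dtSeconds;
    return { (curr.x - prev.x) * inv,
             (curr.y - prev.y) * inv,
             (curr.z - prev.z) * inv };
}
```

The three components would then be handed to the 3D Buffer's SetVelocity() method in the same way that SetPosition() is used above. If the game's own movement code already knows each object's velocity, passing that value directly avoids the one-frame lag of a difference estimate.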
Controlling the Listener
In order for the 3D positioning of the sound objects to make any sense, they need to be relative to an imaginary person whose viewpoint corresponds with the picture we see on the monitor. In other words, the ears of the viewer should be positioned and oriented in much the same way as the eyes. This is called the "listener," as opposed to the visual "viewer." Once the listener information is provided, DirectSound is able to correctly calculate the placement of each 3D sound relative to the listener and render it appropriately. The listener provides a series of parameters that you should become acquainted with as you develop your 3D sound chops:
§ Position: This determines the current position of the listener in 3D space.
§ Orientation: This determines the direction the listener is facing. Orientation is described by two 3D vectors. One sets the direction faced, and the second sets the rotation around that vector. Orientation addresses the question of which direction the listener is facing and whether the listener is upside down, looking sideways, or right side up.
§ Velocity: This determines the speed at which the listener is moving through space. This is combined with the velocities of the 3D objects to determine their Doppler shift.
§ Distance Factor: By default, positions are measured in meters. The Distance Factor is simply a number multiplied against all coordinates to translate into a larger or smaller number system. For example, you might want to work in feet instead of meters. Or, your game takes place in outer space and you are measuring in AUs instead of meters. Hmmm, for calculating Doppler, how long does it take a sound to travel one light-year in outer space? (I sense that this is a trick question…)
§ Doppler Factor: By default, Doppler is calculated technically correctly based on the velocities provided for each object and the listener. For effect, it might be desirable to exaggerate the Doppler. Doppler Factor multiplies the Doppler effect.
§ Rolloff Factor: The Rolloff Factor controls the rate at which sounds attenuate depending on their distance from the listener. Use this to ignore the sound, roll it off, exaggerate it, or give it the same effect as in the real world.
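As a small worked example of the Distance Factor described in the list above, here is a sketch of the feet-to-meters case in standalone C++. ToMeters and kMetersPerFoot are hypothetical names chosen for this illustration; the only outside fact relied on is that one international foot is 0.3048 meters.

```cpp
#include <cassert>
#include <cmath>

// Meters per foot (exact for the international foot).
constexpr double kMetersPerFoot = 0.3048;

// Translate one coordinate from application units into meters.
// distanceFactor is the number of meters in one application unit.
double ToMeters(double coordinate, double distanceFactor)
{
    assert(distanceFactor > 0.0);
    return coordinate * distanceFactor;
}
```

A game that keeps its world coordinates in feet would therefore use a Distance Factor of 0.3048, so that an object 10 feet away is treated as roughly 3.05 meters away when attenuation and Doppler are calculated.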
The listener provides methods for getting and setting each of these parameters independently as well as two methods, SetAllParameters() and GetAllParameters(), for accessing them all at once via a structure, DS3DLISTENER.

    typedef struct {
        DWORD      dwSize;
        D3DVECTOR  vPosition;
        D3DVECTOR  vVelocity;
        D3DVECTOR  vOrientFront;
        D3DVECTOR  vOrientTop;
        D3DVALUE   flDistanceFactor;
        D3DVALUE   flRolloffFactor;
        D3DVALUE   flDopplerFactor;
    } DS3DLISTENER, *LPDS3DLISTENER;
To access the Listener, use the AudioPath's GetObjectInPath() method. Since the Listener belongs to the primary Buffer, find it at the DMUS_PATH_PRIMARY_BUFFER stage. The following example gets the Listener via an AudioPath and, for grins, adjusts the Doppler Factor.

    IDirectSound3DListener8 *pListener = NULL;
    pPath->GetObjectInPath(
        0, DMUS_PATH_PRIMARY_BUFFER,   // Retrieve from primary Buffer.
        0, GUID_All_Objects, 0,        // Ignore object type.
        IID_IDirectSound3DListener8,   // Request the listener interface.
        (void **)&pListener);
    if (pListener)
    {
        DS3DLISTENER Data;
        Data.dwSize = sizeof(Data);
        // Read all of the listener parameters.
        pListener->GetAllParameters(&Data);
        // Now change something for the sake of this example.
        Data.flDopplerFactor = 10;     // Really exaggerate the Doppler.
        pListener->SetAllParameters(&Data, DS3D_IMMEDIATE);
        pListener->Release();
    }
Getting Serious
Okay, we have covered everything you need to know to get DirectX Audio making sound. As you start to work with this, though, you will find that there are some big picture issues that need to be sorted out for optimal performance.
§ Minimize latency: By default, the time between when you start playing a sound and when you hear it can be too long for sudden sound effects. Clearly, that needs fixing.
§ Manage dynamic 3D resources: What do you do when you have 100 objects flying in space and ten hardware AudioPaths to share among them all?
§ Keep music and sound effects separate: How to avoid clashing volume, tempo, groove level, and more.
Minimize Latency
Latency is the number one concern for sound effects. For music, it has not been as critical, since the responsiveness of a musical score can be measured in beats and sometimes measures. Although the intention was to provide as low latency as possible for both music and sound effects, DX8 shipped with an average latency of 85 milliseconds. That's fine for ambient sounds, but it simply doesn't work for sounds that are played in response to sudden actions from the user. Obvious examples of these "twitch" sound effects would be gunfire, car horns, and other sounds that are triggered by user actions and so cannot be scheduled ahead of time in any way. These need to respond at the frame rate, which is usually between 30 and 60 times a second, or between 33 and 16 milliseconds, respectively.
Fortunately, DX9 introduces dramatically improved latency. With this release, the latency has dropped as low as 5ms, depending on the sound card and driver. Suddenly, even the hair-trigger sound effects work very well. But there still are a few things you need to do to get optimal performance. Although the latency is capable of dropping insanely low, it is by default still kept pretty high, between 55 and 85ms, depending on the sound card. Why? In order to guarantee 100 percent compatibility for all applications on all sound cards on all systems, Microsoft decided that the very worst-case setting must be used to reliably produce glitch-free sound. We're talking a five-year-old Pentium I system with a buggy sound card running DX8 applications.
Fortunately, if you know your application is running on a half-decent machine, you can override the settings and drive the latency way, way down. There are two commands for doing this. Because the audio is streamed from the synthesizer and through the effects chains, the mechanism that does this needs to wake up at a regular interval to process another batch of sound.
You can determine both how close to the output the write cursor sits and how frequently it wakes up. The lower these numbers, the lower the overall latency. But there's a cost: If the write cursor is too close to the output time, you can get glitches in the sound when it simply doesn't wake up in time. If the write period is too frequent, the CPU overhead goes up. Indeed, these commands actually always existed with DirectMusic from the very start, but they could not reliably deliver significant results because it was very easy to drive the latency down too far and get horrible glitching. But DX9 comes with a rewrite of the audio sink system that borders on sheer genius. (I mean it; I am extremely impressed with what the development team has done.) With that, we can dial the numbers down as low as we want and still not glitch. Keep in mind, however, that there is the caveat that there's an esoteric bad driver out there that could prove the exception.
The two commands set the write period and write latency and are implemented via the IKsControl mechanism. IKsControl is a general-purpose mechanism for talking to kernel modules and low-level drivers. The two commands are represented by GUIDs:
§ GUID_DMUS_PROP_WritePeriod: This sets how frequently (in ms) the rendering process should wake up. By default, this is 10 milliseconds. The only cost in lowering this is increased CPU overhead. Dropping to 5ms, though, is still very reasonable. The latency contribution of the write period is, on average, half the write period. So, a 5ms write period contributes 2.5ms of latency on average and 5ms in the worst case.
§ GUID_DMUS_PROP_WriteLatency: This sets the latency, in milliseconds, to add to the sound card driver's latency. For example, if the sound card has a latency of 5ms and this is set to 5ms, the real write latency ends up being 10ms.
So, total latency ends up being WritePeriod/2 + WriteLatency + DriverLatency. Here's the code to use this. This example sets the write latency to 5ms and the write period to 5ms.

    IDirectMusicAudioPath *pPath;
    if (SUCCEEDED(m_pPerformance->GetDefaultAudioPath(&pPath)))
    {
        IKsControl *pControl = NULL;
        pPath->GetObjectInPath(
            0, DMUS_PATH_PORT, 0,   // Talk to the synth.
            GUID_All_Objects, 0,    // Any type of synth.
            IID_IKsControl,         // IKsControl interface.
            (void **)&pControl);
        if (pControl)
        {
            KSPROPERTY ksp;
            DWORD dwData;
            ULONG cb;
            dwData = 5;             // Set the write period to 5ms.
            ksp.Set = GUID_DMUS_PROP_WritePeriod;   // The command.
            ksp.Id = 0;
            ksp.Flags = KSPROPERTY_TYPE_SET;
            pControl->KsProperty(&ksp, sizeof(ksp), &dwData, sizeof(dwData), &cb);
            dwData = 5;             // Now set the latency to 5ms.
            ksp.Set = GUID_DMUS_PROP_WriteLatency;
            ksp.Id = 0;
            ksp.Flags = KSPROPERTY_TYPE_SET;
            pControl->KsProperty(&ksp, sizeof(ksp), &dwData, sizeof(dwData), &cb);
            pControl->Release();
        }
        pPath->Release();
    }

Before you put these calls into your code, remember that your application needs to be running on DX9. If not, latency requests this low will definitely cause glitches in the sound. Since you can ship your application with a DX9 installer, this should be a moot point. But if your app needs to be distributed in a lightweight way (i.e., via the web), then you might not include the install. If so, you need to first verify on which version you are running. Unfortunately, there is no simple way to find out which version of DirectX is installed. There was some religious reason for not exposing such an obvious API call, but I can't remember for the life of me what it was. Fortunately, there is a sample piece of code, GetDXVersion, that ships with the SDK. GetDXVersion sniffs around making calls into the various DirectX APIs, looking for specific features that would indicate the version. Mingle includes the GetDXVersion code, so it will run properly on DX8 as well as DX9. Beware: this code won't compile under DX8, so I've included a separate project file, MingleDX8.dsp, that you should compile with if you still have the DX8 SDK.
Note
The low latency takes advantage of hardware acceleration in a big way. As long as all of the buffers are allocated from hardware, the worst-case driver latency stays low. Once hardware buffers are used up and software emulation comes into play, additional latency is typically added as the software emulation kicks in. However, today's cards usually have at least 64 3D buffers available, which, as we discuss, is typically more than you'll want. Nevertheless, the best low latency strategy makes sure that the number of buffers allocated never exceeds the hardware capabilities.
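To check whether a given pair of settings leaves room inside one frame, the latency formula from this section (WritePeriod/2 + WriteLatency + DriverLatency) can be evaluated directly. This is a sketch in standalone C++; AverageLatencyMs and FrameTimeMs are hypothetical helper names, and the 5ms driver latency in the usage note is simply the figure used as an example earlier in this section, not a measured value.

```cpp
#include <cassert>

// Average latency in milliseconds: half the write period, plus the
// write latency, plus whatever the sound card driver adds.
double AverageLatencyMs(double writePeriodMs, double writeLatencyMs,
                        double driverLatencyMs)
{
    assert(writePeriodMs >= 0.0 && writeLatencyMs >= 0.0 && driverLatencyMs >= 0.0);
    return writePeriodMs / 2.0 + writeLatencyMs + driverLatencyMs;
}

// Duration of one frame in milliseconds at the given frame rate.
double FrameTimeMs(double framesPerSecond)
{
    assert(framesPerSecond > 0.0);
    return 1000.0 / framesPerSecond;
}
```

With a 5ms write period, a 5ms write latency, and a 5ms driver, AverageLatencyMs(5, 5, 5) comes to 12.5ms. That is under the roughly 16.7ms frame time of a 60 frames-per-second game, so a twitch sound started in one frame can be heard before the next frame is drawn.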
Manage Dynamic 3D Resources
So we get to the wedding scene, and there are well over 100 pigs flying overhead, each emitting a reliable stream of oinks, snorts, and squeals over the continuous din of wing fluttering. Oops, we only have 32 3D hardware buffers to work with. How do we manage the voices? How do we ensure that the pig buzzing the camera is always heard at the expense of silencing faraway pigs? And that trio of pigs with guitars: we want to make sure we hear them regardless of how near or far they may be.
DirectSound does have a dynamic voice management system whereby each playing sound can be assigned a priority and voices can swap out as defined by any combination of priority, time playing, and distance from the listener. It's not perfect, though. It only terminates sounds. So, if a pig that is flying away from the listener is terminated in order to make way for the hungry sow knocking over the dessert table, it remains silent once it flies back toward the listener because there's no mechanism to restart its sounds.
For better or worse, the AudioPath mechanism doesn't even try to handle this. You must roll your own. In some ways, this is a win because you can completely define the algorithm to suit your needs, and you have the extra flexibility of being able to restart with Segments or call into script routines or whatever is most appropriate for your application. This does mean quite a bit of work. With that in mind, the major thrust of the Mingle application deals with this exact issue. Mingle includes an AudioPath management library that you can rip out, revise, and replace in your app as you see fit.
How does it work? Let's start with an overview of the design here, and then let's investigate in depth later when we look at the Mingle code. There are two sets of items that we need to track:
§ Things that emit sound: Each "thing" represents an independently movable object in the world that makes a noise and so needs to be playing via one instance of an AudioPath. Each thing is assigned a priority, which is the criterion for deciding which things can be heard, should there not be enough AudioPaths to go around. There is no limit to how large the set of things can be. There is no restriction on how priority should be calculated, though distance from the listener tends to be a good basis.
§ 3D AudioPaths: This set is limited to the largest number of AudioPaths that can be running at one time. Each AudioPath is assigned to one thing that plays through the AudioPath.
Since we have a finite set of AudioPaths, we need to match these up with the things that have the highest priorities. But priorities can change over time, especially as things move, so we need to scan through the things every frame and make sure that the highest priority things are still the ones being heard. Figure 13-1 shows the relationship of AudioPaths to things.
Figure 13-1: Three AudioPaths manage sounds for four things, sorted by priority.
Once a frame, we do the following:
1. Scan through the list of things and recalculate their positions and priorities. However, do not assign their positions yet.
2. Look for things with newly high priorities that outrank things currently assigned to AudioPaths. Stop the older things from playing and assign their AudioPaths to the new, higher priority things.
3. Get the new 3D positions from the things assigned to AudioPaths and set the AudioPath positions.
4. Start the new things, which have just been assigned AudioPaths, to make sound.
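The reassignment in step 2 can be sketched in a few lines of standalone C++. This is not the Mingle library, which is covered later; Thing, FrameChanges, and AssignPaths are hypothetical names made up for this illustration, AudioPaths are represented only by an index, and the sketch assumes that priorities were already recalculated in step 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one sound-emitting object.
struct Thing {
    float priority;  // Higher means more important (for example, closer).
    int   path;      // Index of the assigned AudioPath, or -1 if silent.
};

// Hypothetical result of one frame's pass: indices into the things vector.
struct FrameChanges {
    std::vector<std::size_t> stopped;  // Things that lost their AudioPath.
    std::vector<std::size_t> started;  // Things that just received one.
};

// Give the pathCount AudioPaths to the highest-priority things.
FrameChanges AssignPaths(std::vector<Thing> &things, int pathCount)
{
    assert(pathCount >= 0);
    FrameChanges changes;

    // Rank things by priority, highest first. On a tie, a thing that
    // already holds a path ranks first, so equal priorities never swap.
    std::vector<std::size_t> order(things.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::stable_sort(order.begin(), order.end(),
        [&things](std::size_t a, std::size_t b) {
            if (things[a].priority != things[b].priority)
                return things[a].priority > things[b].priority;
            return things[a].path >= 0 && things[b].path < 0;
        });

    const std::size_t winners =
        std::min(order.size(), static_cast<std::size_t>(pathCount));

    // Take paths away from things that no longer rank high enough.
    std::vector<bool> inUse(static_cast<std::size_t>(pathCount), false);
    for (std::size_t rank = 0; rank < order.size(); ++rank) {
        Thing &t = things[order[rank]];
        if (t.path < 0) continue;
        assert(t.path < pathCount);
        if (rank < winners) {
            inUse[static_cast<std::size_t>(t.path)] = true;
        } else {
            t.path = -1;
            changes.stopped.push_back(order[rank]);
        }
    }

    // Hand the free paths to the winners that do not have one yet.
    std::size_t nextFree = 0;
    for (std::size_t rank = 0; rank < winners; ++rank) {
        Thing &t = things[order[rank]];
        if (t.path >= 0) continue;
        while (inUse[nextFree]) ++nextFree;
        t.path = static_cast<int>(nextFree);
        inUse[nextFree] = true;
        changes.started.push_back(order[rank]);
    }
    return changes;
}
```

A caller would stop the Segments of every thing listed in stopped, set the 3D position of each assigned AudioPath (step 3), and then start the Segments of every thing listed in started (step 4).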
Keep Music and Sound Effects Separate
If you think about it, sound effects and music follow very different purposes as well as paradigms in entertainment, be it a movie, game, or even a web site. Sound effects focus on recreating the reality of the scene (albeit with plenty of artistic license). Music, on the other hand, clearly has nothing to do with reality and is added on top of the story to manipulate our perception of it. This is all fine and good, but if you have the same system generating both the music and the sound effects, you can easily fall into situations where they are at cross-purposes. For example, adjusting the tempo or intensity of the music should not unintentionally alter the same for sound effects. In a system as sophisticated as DirectX Audio, this conflict can happen quite frequently and result in serious head scratching. Typical problems include:
§ Groove level interference: Groove level is a great way to set intensity for both music and sound effects, but you might want to have separate intensity controls for music and sound effects. You can accomplish this by having the groove levels on separate group IDs. But there has to be an easier way…
§ Invalidation interference: When a primary or controlling Segment starts, it can automatically cause an invalidation of all playing Segments. This is necessary for music because some Segments may be relying on the control information and need to regenerate with the new parameters. But sound effects couldn't care less if the underlying music chords changed. Worse, if a wave is invalidated, it simply stops playing. You can avoid this problem by playing controlling or primary Segments with the DMUS_SEGF_INVALIDATE_PRI or DMUS_SEGF_AFTERPREPARETIME flags set. But there has to be an easier way…
§ Volume interference: Games often offer separate music and sound effects volume controls to the user. Global control of the volume is most easily managed using the global volume control on the Performance (GUID_PerfMasterVolume). That affects everything, so it can't be used. One solution is to call SetVolume() on every single AudioPath. Again, there has to be an easier way…
Okay, I get the hint. Indeed, there is an easier way. It's really quite simple: Create two separate Performance objects, one for music and one for sound effects. Suddenly, all the interference issues go out the window. The only downside is that you end up with a little extra work if the two worlds need to interact with each other, but even scripting supports the concept of more than one Performance, so that can be made to work if there is a need. There are other bonuses as well:
§ Each Performance has its own synth and effects architecture. This means that they can run at different sample rates. The downside is that any sounds or instruments shared by both are downloaded twice, once to each synth.
§ You can do some sound effects in music time, which has better authoring support in some areas of DirectMusic Producer. Establish a tempo and write everything at that rate. This is how the sound effects for Mingle were authored. Of course, Segments authored in clock time continue to work well.
Mingle takes this approach. It creates two CAudio objects, each with its own Performance. The sound effects Performance runs at the default tempo of 120BPM without ever changing. 120BPM is convenient because it works out to exactly two beats per second, which keeps the conversion between music time and clock time simple. The music runs at a sample rate of 48 kHz, while the sound effects run at 32 kHz. There are separate volume controls for each. For efficiency, the two CAudio instances share the same Loader.
Mingle Okay, let's have some programming fun. Imagine a cocktail party with a decent complement of, uh, interesting people wandering around: the snob with the running commentary, the hungry boor noshing on every hors d'oeuvre in sight, the very irritating mosquito person, and the guy who just can't stop laughing. Gentle music plays in the background, barely above the crowd noise. To experience all this and more without inhaling any secondhand smoke, run Mingle by selecting it from the Start menu (if you installed the CD). Otherwise, double-click on its icon in the Unit II\Bin directory or compile and run it from the 13_SoundEffects source directory. See Figure 13-2.
Figure 13-2: Mingle.
Taking up most of the Mingle window is a large square box with a plus sign in the middle. The box is a large room, and you are the plus marker in the middle. When Mingle first starts, there is general party ambience, and some music starts playing quietly. Click in the box. Ooops, you bumped into a partygoer, and he is letting you know how he feels about that. Try again, but be more careful. Ouch!
Note Regarding the munchkins, my two little boys walked into my office when I was recording.
This is demonstrating low latency sound effects and 3D positioning. Click around the box and hear how the sounds are positioned in space.
Note If you are running under DX8, you should hear significant buzzing or breakup of the sound. Mingle is really only meant to work under DX9. Either install DX9 or rebuild Mingle with the low latency code disabled (more on that in a bit).
Now let's have some more serious fun. There are four boxes down the left side, each representing a different person.
Click on the Talk button. This creates an instance of the person and plops it in the party. You should immediately hear the person gabbing and wandering around the room. Click the Talk button several more times and additional instances of the person enter the room. The little number box on the bottom right shows how many instances of the particular person are yacking it up at the party. Click on the Shut Up! button to remove a partygoer. Click on other participants, and note that they are displayed with different colors. Below the four boxes is an edit field that shows how many 3D AudioPaths are allocated:
This displays how many AudioPaths are currently allocated for use by all of the sound effects. This includes one AudioPath allocated for the sounds that happen when you click the mouse. Mingle dynamically reroutes the 3D AudioPaths to the people who are closest to the center of the box (where the listener is). Continue to click on the Talk buttons and create more people than there are AudioPaths. Notice how the colored boxes turn white as they get farther from the center. This indicates that a person is no longer being played through an AudioPath.
To change the number of available AudioPaths, just edit the number in the 3D Paths box. See what happens when you drop it down to just two or three. Change the number of people by clicking on the Talk and Shut Up! buttons. How many AudioPaths does it take before you can no longer track them individually by ear? How many before it's not noticeable when the dynamic swapping occurs? Keep experimenting, and try not to let the mosquito person irritate you too much. Across the bottom is a set of three sliders. These control three of the listener parameters: Distance Factor, Doppler Factor, and Rolloff Factor.
Drop the number of sounds to a reasonable number, so it's easier to track an individual sound, and then experiment with these sliders. If you can handle it, Mosquito is particularly good for testing because it emits a steady drone. Drag Distance Factor to the right. This increases the distances that the people are moving. Notice that the Doppler becomes more pronounced. That's because the velocities are much higher, since the distance traveled is greater. Drag Doppler to the right. This increases the Doppler effect. Drag Rolloff to the right. Notice how people get much quieter as they leave the center area. Finally, take a look at the two volume sliders:
These control the volumes for sound effects and music. Drag them up or down to change the balance. Ah, much better. Drag them all the way down and you no longer want to mangle Mingle. That completes the tour. Mingle demonstrates several useful things:
§ Separate sound effects and music environments
§ Volume control for sound effects and music
§ Dynamic 3D resource management
§ Low latency sound effects
§ Background ambience
§ Avoiding truncated waves
§ Manipulating the listener properties
§ Scripted control of sound effects
§ Creating music with the composition engine
Now let's see how they work.
Separate Sound Effects and Music Environments Although Mingle doesn't have the same UI as Jones, it does borrow and continue to build on the CAudio class library for managing DirectX Audio. In the previous projects, we created one instance of CAudio, which managed the Performance, Loader, and all Segments, scripts, and AudioPaths. For Mingle, we use two instances of CAudio, which keeps the worlds of music and sound effects very separate. However, we'd like to share the Loader. Also, we'd like to set a different sample rate and default AudioPath for each, since music and sound effects have different requirements. So, CAudio::Init() takes additional parameters for default AudioPath type and sample rate.
HRESULT CAudio::Init(
    IDirectMusicLoader8 *pLoader,   // Optionally provided Loader
    DWORD dwSampleRate,             // Sample rate for the Performance.
    DWORD dwDefaultPath)            // Optional default AudioPath.
{
    // If a Loader was provided by the caller, use it.
    HRESULT hr = S_OK;
    if (pLoader)
    {
        m_pLoader = pLoader;
        pLoader->AddRef();
    }
    // If not, call COM to create a new one.
    else
    {
        hr = CoCreateInstance(
            CLSID_DirectMusicLoader,
            NULL,
            CLSCTX_INPROC,
            IID_IDirectMusicLoader8,
            (void**)&m_pLoader);
    }
    // Then, create the Performance.
    if (SUCCEEDED(hr))
    {
        hr = CoCreateInstance(
            CLSID_DirectMusicPerformance,
            NULL,
            CLSCTX_INPROC,
            IID_IDirectMusicPerformance8,
            (void**)&m_pPerformance);
    }
    if (SUCCEEDED(hr))
    {
        // Once the Performance is created, initialize it.
        // Optionally, create a default AudioPath, as defined by
        // dwDefaultPath, and give it 128 pchannels.
        // Set the sample rate to the value passed in dwSampleRate.
        // Also, get back the IDirectSound interface and store
        // that in m_pDirectSound. This may come in handy later.
        DMUS_AUDIOPARAMS Params;
        Params.dwValidData = DMUS_AUDIOPARAMS_VOICES |
                             DMUS_AUDIOPARAMS_SAMPLERATE;
        Params.dwSize = sizeof(Params);
        Params.fInitNow = true;
        Params.dwVoices = 100;
        Params.dwSampleRate = dwSampleRate;
        hr = m_pPerformance->InitAudio(NULL,&m_pDirectSound,NULL,
            dwDefaultPath,          // Default AudioPath type.
            128,DMUS_AUDIOF_ALL,&Params);
    }
    return hr;
}
CMingleApp has two instances of CAudio, m_Effects and m_Music. It sets these up in its initialization. If both succeed, it opens the dialog window (which is the application) and closes down after that is finished.
BOOL CMingleApp::InitInstance()
{
    // Initialize COM.
    CoInitialize(NULL);
    // Create the sound effects CAudio. Give it a sample rate
    // of 32K and have it create a default path with a stereo
    // Buffer. We'll use that for the background ambience.
    if (SUCCEEDED(m_Effects.Init(
        NULL,                           // Create the Loader.
        32000,                          // 32K sample rate.
        DMUS_APATH_DYNAMIC_STEREO)))    // Default AudioPath is stereo.
    {
        if (SUCCEEDED(m_Music.Init(
            m_Effects.GetLoader(),      // Use the Loader from FX.
            48000,                      // Higher sample rate for music.
            DMUS_APATH_SHARED_STEREOPLUSREVERB))) // Standard music path.
        {
            // Succeeded initializing both CAudio's, so run the window.
            CMingleDlg dlg;
            m_pMainWnd = &dlg;
            dlg.DoModal();
            // Done, time to close down music.
            m_Music.Close();
        }
        // Close down effects.
        m_Effects.Close();
    }
    // Done with COM.
    CoUninitialize();
    return FALSE;
}
Notice that the calls to CoInitialize() and CoUninitialize() were yanked from CAudio and placed in CMingleApp::InitInstance(). As convenient as it was to put the code in CAudio, it was inappropriate, since these should be called only once and there are now two CAudio instances.
Volume Control for Sound Effects and Music
Once we have two Performances for sound effects and music, it is very easy to apply global volume control independently for each. Just calculate the volume in hundredths of a decibel and use the Performance's SetGlobalParam() method to set it. We add a method to CAudio to set the volume.
void CAudio::SetVolume(long lVolume)
{
    m_pPerformance->SetGlobalParam(
        GUID_PerfMasterVolume,  // Command GUID for master volume.
        &lVolume,               // Volume parameter.
        sizeof(lVolume));       // Size of the parameter.
}
Then, it's trivial to add the sliders to the Mingle UI and connect them to this method on the two instances of CAudio.
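Connecting a slider to a method that expects hundredths of a decibel requires a conversion somewhere. Here is one possible sketch, assuming the slider position is normalized to the range 0 to 1 and treated as a linear gain. The helper name and the -9600 floor used for "silent" are assumptions made for illustration, not values taken from Mingle.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed floor for "silent"; choose whatever your UI treats as mute.
const long kSilenceHundredthsDb = -9600;

// Convert a slider position in [0, 1], treated as a linear gain,
// into hundredths of a decibel (0 = full volume, negative = quieter).
inline long SliderToHundredthsDb(double slider)
{
    if (slider <= 0.0) {
        return kSilenceHundredthsDb;    // log10(0) is undefined, so clamp.
    }
    slider = std::min(slider, 1.0);     // Never boost above full volume.
    const double db = 20.0 * std::log10(slider);
    const long value = std::lround(db * 100.0);
    return std::max(value, kSilenceHundredthsDb);
}
```

With this mapping, a slider at its midpoint comes out at about -6 dB, and the result could be passed straight to a method such as CAudio::SetVolume().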
Dynamic 3D Resource Management This is the big one. We walked through the overall design earlier, so now we can focus on the implementation in code. We add new fields and functionality to the CAudioPath and CAudio classes and introduce two new classes, CThing and CThingManager, which are used to manage the sound-generating objects in Mingle.
CAudioPath
First, we need to add some fields to CAudioPath so it can track the specific object (or "thing") that it is rendering in 3D. These include a pointer to the object as well as the object's priority and whether a change of object is pending. We also add a method for directly setting the 3D position. Fields and flags are added for managing the priority and status of a pending swap of objects, and an IDirectSound3DBuffer interface is added to provide a direct connection to the 3D controls.
// Set the 3D position of the AudioPath. Optionally, the velocity.
bool Set3DPosition(D3DVECTOR *pvPosition,
                   D3DVECTOR *pvVelocity,DWORD dwApply);
// Get and set the 3D object attached to this AudioPath.
void * Get3DObject() { return m_p3DObject; };
void Set3DObject(void *p3DObject) { m_p3DObject = p3DObject; };
// Get and set the priority of the 3D object.
float GetPriority() { return m_flPriority; };
void SetPriority(float flPriority) { m_flPriority = flPriority; };
// Variables added for 3D object management.
void *      m_p3DObject;        // 3D tracking of an object in space.
float       m_flPriority;       // Priority of 3D object in space.
D3DVECTOR   m_vLastPosition;    // Store the last position set.
D3DVECTOR   m_vLastVelocity;    // And the last velocity.
IDirectSound3DBuffer *m_p3DBuffer; // Pointer to 3D Buffer interface.
m_p3DObject is the object that the AudioPath renders, and m_flPriority is the object's priority. The two D3D vectors are used to cache the current position and velocity, so calls to set these can be a little more efficient. Likewise, the m_p3DBuffer field provides a quick way to access the 3D interface without calling GetObjectInPath() every time.
Note The object stays intentionally unspecific to CAudioPath. It is just a void pointer, so it could point to anything. (In Mingle, it points to a CThing.) Why? We'd like to keep CAudioPath from having to know a specific object design, which would tie it much more closely to the application design and make this code a little less portable.
CAudioPath::CAudioPath
The constructor has grown significantly to support all these new fields. In particular, it calls GetObjectInPath() to create a pointer shortcut to the 3D Buffer interface at the end of the AudioPath. This will be used to directly change the coordinates every time the AudioPath moves.
CAudioPath::CAudioPath(IDirectMusicAudioPath *pAudioPath,WCHAR *pzwName)
{
    m_pAudioPath = pAudioPath;
    pAudioPath->AddRef();
    wcstombs(m_szName,pzwName,sizeof(m_szName));
    m_lVolume = 0;
    // 3D AudioPath fields follow...
    m_vLastPosition.x = 0;
    m_vLastPosition.y = 0;
    m_vLastPosition.z = 0;
    m_vLastVelocity.x = 0;
    m_vLastVelocity.y = 0;
    m_vLastVelocity.z = 0;
    m_flPriority = FLT_MIN;
    m_p3DObject = NULL;
    m_p3DBuffer = NULL;
    // Try to get a 3D Buffer interface, if it exists.
    pAudioPath->GetObjectInPath(
        0,DMUS_PATH_BUFFER,     // The DirectSound Buffer.
        0,GUID_All_Objects,0,   // Any Buffer (should only be one).
        IID_IDirectSound3DBuffer,
        (void **)&m_p3DBuffer);
}
CAudioPath::Set3DPosition()
Set3DPosition() is intended primarily for 3D AudioPaths in the 3D pool. It is called on a regular basis (typically once per frame) to update the 3D position of the AudioPath. Set3DPosition() sports two optimizations. First, it stores the previous position and velocity of the 3D object in the m_vLastPosition and m_vLastVelocity fields and compares against them on each call. If neither the position nor the velocity has changed, Set3DPosition() returns without doing anything. Second, Set3DPosition() keeps a pointer directly to the IDirectSound3DBuffer interface so it doesn't have to call GetObjectInPath() every time. If the position or velocity did change, Set3DPosition() calls the appropriate IDirectSound3DBuffer methods and then updates the m_vLastPosition and m_vLastVelocity fields. Set3DPosition() returns true for success and false for failure.
bool CAudioPath::Set3DPosition(
    D3DVECTOR *pvPosition,
    D3DVECTOR *pvVelocity,
    DWORD dwApply)
{
    // First, verify that this has changed since last time. If not, just
    // return success.
    if (!memcmp(&m_vLastPosition,pvPosition,sizeof(D3DVECTOR)))
    {
        // Position hasn't changed. What about velocity?
        if (pvVelocity)
        {
            if (!memcmp(&m_vLastVelocity,pvVelocity,sizeof(D3DVECTOR)))
            {
                // No change to velocity. No need to do anything.
                return true;
            }
        }
        else return true;
    }
    // We'll be using the IDirectSound3DBuffer interface that
    // we created in the constructor.
    if (m_p3DBuffer)
    {
        // Okay, we have the 3D Buffer. Control it.
        m_p3DBuffer->SetPosition(pvPosition->x,pvPosition->y,
                                 pvPosition->z,dwApply);
        m_vLastPosition = *pvPosition;
        // Velocity is optional.
        if (pvVelocity)
        {
            m_p3DBuffer->SetVelocity(pvVelocity->x,pvVelocity->y,
                                     pvVelocity->z,dwApply);
            m_vLastVelocity = *pvVelocity;
        }
        return true;
    }
    return false;
}
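The caching optimization in Set3DPosition() is easy to try out in isolation. The sketch below replaces the DirectSound buffer with a hypothetical `MockBuffer` that only counts calls, and it compares fields directly instead of using memcmp(). All of the names here are invented for illustration.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };     // Stand-in for D3DVECTOR.

// Hypothetical mock of the 3D buffer; it just counts how often it is called.
struct MockBuffer {
    int positionCalls;
    MockBuffer() : positionCalls(0) {}
    void SetPosition(float, float, float) { ++positionCalls; }
};

class CachedPositioner {
public:
    explicit CachedPositioner(MockBuffer* buffer)
        : m_buffer(buffer), m_last{0.0f, 0.0f, 0.0f} {}

    // Forward the position only when it differs from the cached one.
    bool SetPosition(const Vec3& p)
    {
        if (p.x == m_last.x && p.y == m_last.y && p.z == m_last.z) {
            return true;            // Nothing changed, so skip the call.
        }
        if (!m_buffer) {
            return false;           // No buffer to control.
        }
        m_buffer->SetPosition(p.x, p.y, p.z);
        m_last = p;                 // Remember what was last sent.
        return true;
    }

private:
    MockBuffer* m_buffer;
    Vec3 m_last;
};
```

Calling SetPosition() twice with the same vector reaches the mock only once, which is the saving the cache is meant to provide when many things stand still.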
CAudio
CAudio needs to maintain a pool of AudioPaths. CAudio already has a general-purpose list of AudioPaths, which we explored in full in Chapter 11. However, the 3D pool needs to be separate since its usage is significantly different. It carries a set of identical 3D AudioPaths, intended specifically for swapping back and forth as we render. So, we create a second list.
CAudioPathList m_3DAudioPathList;   // Pool of 3D AudioPaths.
DWORD m_dw3DPoolSize;               // Size of 3D pool.
We provide a routine for setting the size of the pool, a routine for allocating a 3D AudioPath, a routine for releasing one when done with it, and a routine for finding the AudioPath with the lowest priority.
// Methods for managing a pool of 3D AudioPaths.
DWORD Set3DPoolSize(DWORD dwPoolSize);  // Set size of pool.
CAudioPath *Alloc3DPath();              // Allocate a 3D AudioPath.
void Release3DPath(CAudioPath *pPath);  // Return a 3D AudioPath.
CAudioPath *GetLowestPriorityPath();    // Get lowest priority 3D AudioPath.
Let's look at each one.
CAudio::Set3DPoolSize()
CAudio::Set3DPoolSize() sets the maximum size that the 3D pool is allowed to grow to. Notice that it doesn't actually allocate any AudioPaths because they should still only be created when needed. Optionally, the caller can pass POOLSIZE_USE_ALL_HARDWARE instead of a pool size. This sets the pool size to the total number of available hardware 3D buffers. This option allows the application to automatically use the optimal number of 3D buffers. Set3DPoolSize() accomplishes this by calling DirectSound's GetCaps() method and using the value stored in Caps.dwFreeHw3DAllBuffers.
DWORD CAudio::Set3DPoolSize(DWORD dwPoolSize)
{
    // If the constant POOLSIZE_USE_ALL_HARDWARE was passed,
    // call DirectSound's GetCaps method and get the total
    // number of currently free 3D Buffers. Then, set that as
    // the maximum.
    if (dwPoolSize == POOLSIZE_USE_ALL_HARDWARE)
    {
        if (m_pDirectSound)
        {
            DSCAPS DSCaps;
            DSCaps.dwSize = sizeof(DSCAPS);
            m_pDirectSound->GetCaps(&DSCaps);
            m_dw3DPoolSize = DSCaps.dwFreeHw3DAllBuffers;
        }
    }
    // Otherwise, use the passed value.
    else
    {
        m_dw3DPoolSize = dwPoolSize;
    }
    // Return the PoolSize so the caller can know how many were
    // allocated in the case of POOLSIZE_USE_ALL_HARDWARE.
    return m_dw3DPoolSize;
}
CAudio::Alloc3DPath()
When the application does need a 3D AudioPath, it calls CAudio::Alloc3DPath(). Alloc3DPath() first scans the list of 3D AudioPaths already in the pool. Alloc3DPath() cannot take any paths that are currently being used, which it tests by checking to see if the AudioPath's Get3DObject() method returns anything. If there are no free AudioPaths in the pool, Alloc3DPath() creates a new AudioPath.
CAudioPath *CAudio::Alloc3DPath()
{
    DWORD dwCount = 0;
    CAudioPath *pPath = NULL;
    for (pPath = m_3DAudioPathList.GetHead();pPath;pPath = pPath->GetNext())
    {
        dwCount++;
        // Get3DObject() returns whatever object this path is currently
        // rendering. If NULL, the path is inactive, so take it.
        if (!pPath->Get3DObject())
        {
            // Start the path running again and return it.
            pPath->GetAudioPath()->Activate(true);
            return pPath;
        }
    }
    // Okay, no luck. Have we reached the pool size limit?
    if (dwCount < m_dw3DPoolSize)
    {
        // No, so create a new AudioPath.
        IDirectMusicAudioPath *pIPath = NULL;
        m_pPerformance->CreateStandardAudioPath(
            DMUS_APATH_DYNAMIC_3D,  // Standard 3D AudioPath.
            16,                     // 16 pchannels should be enough.
            true,                   // Activate immediately.
            &pIPath);
        if (pIPath)
        {
            // Create a CAudioPath object to manage it.
            pPath = new CAudioPath(pIPath,L"Dynamic 3D");
            if (pPath)
            {
                // And stick in the pool.
                m_3DAudioPathList.AddHead(pPath);
            }
            pIPath->Release();
        }
    }
    return pPath;
}
CAudio::Release3DPath()
Conversely, CAudio::Release3DPath() takes an AudioPath that is currently being used to render something and stops it, freeing it up to be used again by a different object (or thing).
void CAudio::Release3DPath(CAudioPath *pPath)
{
    // Stop everything that is currently playing on this AudioPath.
    m_pPerformance->StopEx(pPath->GetAudioPath(),0,0);
    // Clear its object pointer.
    pPath->Set3DObject(NULL);
    // If we had more than we should (pool size was reduced), remove and delete.
    if (m_3DAudioPathList.GetCount() > m_dw3DPoolSize)
    {
        m_3DAudioPathList.Remove(pPath);
        // The CAudioPath destructor will take care of releasing
        // the IDirectMusicAudioPath.
        delete pPath;
    }
    // Otherwise, just deactivate so it won't eat resources.
    else
    {
        pPath->GetAudioPath()->Activate(false);
    }
}
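The allocate, release, and shrink behavior of the pool can be modeled in a few lines of standard C++. This is a sketch under assumptions rather than the CAudio code: `Path` and `PathPool` are hypothetical names, and Alloc() records the owner itself, whereas in Mingle the caller does that through Set3DObject().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a pooled 3D path.
struct Path {
    const void* owner;   // Object being rendered, or nullptr when free.
    bool active;         // Mirrors activating or deactivating the path.
    Path() : owner(nullptr), active(false) {}
};

class PathPool {
public:
    explicit PathPool(std::size_t limit) : m_limit(limit) {}
    void SetLimit(std::size_t limit) { m_limit = limit; }
    std::size_t Count() const { return m_paths.size(); }

    // Reuse a free path if one exists; otherwise create one, up to the limit.
    Path* Alloc(const void* owner)
    {
        for (auto& p : m_paths) {
            if (!p->owner) {
                p->owner = owner;
                p->active = true;
                return p.get();
            }
        }
        if (m_paths.size() >= m_limit) {
            return nullptr;         // Pool is full and everything is in use.
        }
        m_paths.push_back(std::unique_ptr<Path>(new Path()));
        Path* p = m_paths.back().get();
        p->owner = owner;
        p->active = true;
        return p;
    }

    // Free the path. If the limit was lowered in the meantime, destroy it.
    void Release(Path* path)
    {
        path->owner = nullptr;
        path->active = false;
        if (m_paths.size() > m_limit) {
            for (auto it = m_paths.begin(); it != m_paths.end(); ++it) {
                if (it->get() == path) { m_paths.erase(it); break; }
            }
        }
    }

private:
    std::size_t m_limit;
    std::vector<std::unique_ptr<Path>> m_paths;
};
```

Lowering the limit does not destroy paths immediately. As in Release3DPath(), the pool shrinks lazily, one release at a time.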
CAudio::GetLowestPriorityPath()
GetLowestPriorityPath() scans through the list of AudioPaths and finds the one with the lowest priority. This is typically done to find the AudioPath that would be the best candidate for swapping with an object that has come into view and might be a higher priority. In Mingle, GetLowestPriorityPath() is called by CThingManager when it is reprioritizing AudioPaths.
Note Keep in mind that the highest priority possible is the lowest number, or zero, not the other way around.
CAudioPath * CAudio::GetLowestPriorityPath()
{
    float flPriority = 0.0;     // Start with highest possible priority.
    CAudioPath *pBest = NULL;   // Haven't found anything yet.
    CAudioPath *pPath = m_3DAudioPathList.GetHead();
    for (;pPath;pPath = pPath->GetNext())
    {
        // Does this have a priority that is lower than best so far?
        if (pPath->GetPriority() > flPriority)
        {
            // Yes, so stick with it from now on.
            flPriority = pPath->GetPriority();
            pBest = pPath;
        }
    }
    return pBest;
}
CThing
CThing manages a sound-emitting object in Mingle. It stores its current position and velocity and updates these every frame. It also maintains a pointer to the 3D AudioPath that it renders through and uses that pointer to directly reposition the 3D coordinates of the AudioPath. For dynamic routing, CThing also stores a priority number. The priority algorithm is simply the distance from the listener. The shorter the distance, the higher the priority (and the lower the number). For mathematical simplicity, the priority is calculated as the sum of the squares of the x and y coordinates, which is the squared distance.
The reassignment algorithm works in two stages. First, it marks the things that need to be reassigned and forces them to stop playing. Once all AudioPath reassignments have been made, it runs through the list and starts the new things running. So, there is state information that needs to be placed in CThing. The variables m_fAssigned and m_fWasRunning are used to track the state of CThing.
CThing uses script routines to start playback. There are five different types of things, as defined by the THING constants, and these help determine which script variables to set and which script routines to call.
// All five types of thing:
#define THING_PERSON1 1     // Wandering snob
#define THING_PERSON2 2     // Wandering food moocher
#define THING_PERSON3 3     // Wandering mosquito person
#define THING_PERSON4 4     // Wandering laughing fool
#define THING_SHOUT   5     // Sudden shouts on mouse clicks
class CThing : public CMyNode
{
public:
    CThing(CAudio *pAudio, DWORD dwType, CScript *pScript);
    ~CThing();
    CThing *GetNext() { return (CThing *) CMyNode::GetNext(); };
    bool Start();
    bool Move();
    bool Stop();
    void CalcNewPosition(DWORD dwMils);
    D3DVECTOR *GetPosition() { return &m_vPosition; };
    void SetPosition (D3DVECTOR *pVector) { m_vPosition = *pVector; };
    void SetAudioPath(CAudioPath *pPath) { m_pPath = pPath; };
    CAudioPath *GetAudioPath() { return m_pPath; };
    void SetType (DWORD dwType) { m_dwType = dwType; };
    DWORD GetType() { return m_dwType; };
    float GetPriority() { return m_flPriority; };
    void MarkAssigned(CAudioPath *pPath);
    bool IsAssigned() { return m_fAssigned; };
    void ClearAssigned() { m_fAssigned = false; };
    bool WasRunning() { return m_fWasRunning; };
    bool NeedsPath() { return !m_pPath; };
    void StopRunning() { m_pPath = NULL; m_fWasRunning = true; };
private:
    CAudio *    m_pAudio;       // Keep pointer to CAudio for convenience.
    DWORD       m_dwType;       // Which THING_ type this is.
    bool        m_fAssigned;    // Was just assigned an AudioPath.
    bool        m_fWasRunning;  // Was unassigned at some point in past.
    float       m_flPriority;   // Distance from listener squared.
    D3DVECTOR   m_vPosition;    // Current position.
    D3DVECTOR   m_vVelocity;    // Direction it is currently going in.
    CAudioPath *m_pPath;        // AudioPath that manages the playback of this.
    CScript *   m_pScript;      // Script to invoke for starting sound.
};
CThing::Start()
CThing::Start() is called when a thing needs to start making sound. This could occur when the thing is first invoked, or it could occur when it has regained access to an AudioPath. First, Start() makes sure that it indeed has an AudioPath. Once it does, it uses its 3D position to set the 3D position on the AudioPath. It then sets the AudioPath as the default AudioPath in the Performance. Next, Start() calls a script routine to start playback. Although it is possible to hand the AudioPath directly to the script and let it use it explicitly for playback, it's a little easier to just set it as the default, and then there's less work for the script to do.
Note There's a more cynical reason for not passing the AudioPath as a parameter. When an AudioPath is stored as a variable in the script, it seems to cause an extra reference on the script, so the script never completely goes away. This is a bug in DX9 that hopefully will get fixed in the future.
bool CThing::Start()
{
    if (m_pAudio)
    {
        if (!m_pPath)
        {
            m_pPath = m_pAudio->Alloc3DPath();
        }
        if (m_pPath)
        {
            m_pPath->Set3DObject(this);
            m_pPath->Set3DPosition(&m_vPosition,&m_vVelocity,DS3D_IMMEDIATE);
            if (m_pScript)
            {
                // Just set the AudioPath as the default path. Then,
                // anything that gets played by
                // the script will automatically play on this path.
                m_pAudio->GetPerformance()->SetDefaultAudioPath(
                    m_pPath->GetAudioPath());
                // Tell the script which of the four people
                // (or the shouts) it should play.
                m_pScript->GetScript()->SetVariableNumber(
                    L"PersonType",m_dwType,NULL);
                // Then, call the StartTalking Routine, which
                // will start something playing on the AudioPath.
                m_pScript->GetScript()->CallRoutine(
                    L"StartTalking",NULL);
                // No longer not running.
                m_fWasRunning = false;
            }
            return true;
        }
    }
    return false;
}
CThing::Stop()
CThing::Stop() is called when a thing should stop making sound. Since it's going to be quiet, there's no need to hang on to an AudioPath. So, Stop() calls CAudio::Release3DPath(), which kills the sound and marks the AudioPath as free for the next taker.
bool CThing::Stop()
{
    if (m_pAudio && m_pPath)
    {
        // Release3DPath() stops all audio on the path.
        m_pAudio->Release3DPath(m_pPath);
        // Don't point to the path any more because it has moved on.
        m_pPath = NULL;
        return true;
    }
    return false;
}
CThing::CalcNewPosition()
CThing::CalcNewPosition() is called every frame to update the position of the Thing. In order to update the position, it's necessary to know how much time has elapsed, because velocity is the rate of change of position over time. Since CThing doesn't have its own internal clock, it receives a time-elapsed parameter, dwMils, which indicates how many milliseconds have elapsed since the last call. It can then calculate a new position by using the velocity and time elapsed. CalcNewPosition() also checks to see if the Thing bumped into the edge of the box, in which case it reverses velocity, causing the Thing to bounce back. When done setting the position and velocity, CalcNewPosition() uses the new position to generate a fresh new priority.
void CThing::CalcNewPosition(DWORD dwMils)
{
    if (m_dwType < THING_SHOUT)
    {
        // Velocity is in meters per second, so calculate the distance
        // traveled in dwMils milliseconds.
        // Bounce off the walls of the room. The lower-edge test is
        // reconstructed on the assumption that the room spans -10 to +10.
        m_vPosition.x += ((m_vVelocity.x * dwMils) / 1000);
        if ((m_vPosition.x <= (float) -10.0) && (m_vVelocity.x < 0) ||
            (m_vPosition.x >= (float) 10.0) && (m_vVelocity.x > 0))
        {
            m_vVelocity.x = -m_vVelocity.x;
        }
        m_vPosition.y += ((m_vVelocity.y * dwMils) / 1000);
        if ((m_vPosition.y <= (float) -10.0) && (m_vVelocity.y < 0) ||
            (m_vPosition.y >= (float) 10.0) && (m_vVelocity.y > 0))
        {
            m_vVelocity.y = -m_vVelocity.y;
        }
        // We actually track the square of the distance,
        // since that's all we need for priority.
        m_flPriority = m_vPosition.x * m_vPosition.x +
                       m_vPosition.y * m_vPosition.y;
    }
    if (m_pPath)
    {
        m_pPath->SetPriority(m_flPriority);
    }
}
CThing::Move()
Move() is called every frame. It simply transfers its own 3D position to the 3D AudioPath. For efficiency, it uses the DS3D_DEFERRED flag, indicating that the new 3D position command should be batched up with all the other requests. Since CThingManager moves all of the things at one time, it can make a call to the listener to commit all the changes once it has moved all of the things.
bool CThing::Move()
{
    // Are we currently active for sound?
    if (m_pPath && m_pAudio)
    {
        m_pPath->Set3DPosition(&m_vPosition,&m_vVelocity,DS3D_DEFERRED);
        return true;
    }
    return false;
}
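The position update in CalcNewPosition() reduces to a one-dimensional step that is easy to check by hand. The following sketch assumes a room spanning -10 to +10 on each axis. `Mover`, `Step`, `SquaredDistance`, and `kHalfWidth` are hypothetical names, not part of Mingle.

```cpp
#include <cassert>

// One axis of a wandering object: position in meters, velocity in m/s.
struct Mover { float pos; float vel; };

const float kHalfWidth = 10.0f;     // Assumed: the room spans -10..+10.

// Advance by elapsedMs milliseconds and bounce off either wall.
inline void Step(Mover& m, unsigned long elapsedMs)
{
    m.pos += (m.vel * elapsedMs) / 1000.0f;
    const bool pastLow  = (m.pos <= -kHalfWidth) && (m.vel < 0);
    const bool pastHigh = (m.pos >=  kHalfWidth) && (m.vel > 0);
    if (pastLow || pastHigh) {
        m.vel = -m.vel;             // Reverse only if still heading outward.
    }
}

// Squared distance from a listener at the origin; smaller means closer,
// which in this scheme means higher priority.
inline float SquaredDistance(float x, float y) { return x * x + y * y; }
```

Checking the sign of the velocity along with the position keeps an object that has overshot the wall from flipping direction again on the next frame while it is still outside the boundary.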
CThingManager CThingManager maintains the set of CThings that are milling around the room. In addition to managing the list of CThings, it has routines to create CThings, position them, and keep the closest CThings mapped to active AudioPaths. class CThingManager : public CMyList { public: CThingManager() { m_pAudio = NULL; m_pScript = NULL; }; void Init(CAudio *pAudio,CScript *pScript,DWORD dwLimit); CThing * GetHead() { return (CThing *) CMyList::GetHead(); }; CThing *RemoveHead() { return (CThing *) CMyList::RemoveHead(); }; void Clear(); CThing * CreateThing(DWORD dwType); CThing * GetTypeThing(DWORD dwType); requested type.
// Access first CThing of
DWORD GetTypeCount(DWORD dwType); requested type?
// How many CThings of
void CalcNewPositions(DWORD dwMils); positions for all.
// Calculate new
void MoveThings(); void ReassignAudioPaths(); AudioPath pairings. void SetAudioPathLimit(DWORD dwLimit); void EnforceAudioPathLimit(); private:
// Move CThings. // Maintain optimal
void StartAssignedThings(); CAudio * m_pAudio; for convenience. CScript * DWORD AudioPaths.
// Keep pointer to CAudio
m_pScript;
// Script
m_dwAudioPathLimit;
// Max number of
}; CThingManager::CreateThing() CreateThing() allocates a CThing class of the requested type and installs the current CAudio and CScript in it. CreateThing() doesn't actually start the CThing playing, nor does it connect the CThing to an AudioPath. CThing *CThingManager::CreateThing(DWORD dwType) { CThing *pThing = new CThing(m_pAudio,dwType,m_pScript); if (pThing) { AddHead(pThing); } return pThing; } CThingManager::ReassignAudioPaths() ReassignAudioPaths() is the heart of the dynamic resource system. It makes sure that the highest priority things are matched up with AudioPaths. To do so, it scans through the list of things, looking for each thing that has no AudioPath but has attained a higher priority than the lowest priority thing currently assigned to an AudioPath. When it finds such a match, it steals the AudioPath from the lower priority thing and reassigns the AudioPath to the higher priority thing. At this point, it stops the old thing from making any noise by killing all sounds on the AudioPath, but it doesn't immediately start the new sound playing. Instead, it sets a flag on the new thing, indicating that it has just been assigned an AudioPath, and waits until the next time around to start it. This is done to give the currently playing sound a little time to clear out before the new one starts. Note Under some circumstances, the approach used here of delaying the sound's start until the next frame is unacceptably slow. In those cases, there is an alternate approach. Deactivate the AudioPath by calling IDirectMusicAudioPath::Activate(false), and then reactivate it by passing true to the same call. This causes the AudioPath to be flushed of all sound. When this is done, you can start the new sound immediately. However, there's a greater chance of getting a click in the audio because the old sound was stopped so suddenly. void CThingManager::ReassignAudioPaths() {
    // First, start the things that were set up by the
    // previous call to ReassignAudioPaths().
    StartAssignedThings();
    // Now, do a new round of swapping.
    // Start by getting the lowest priority AudioPath.
    // Use it to do the first swap.
    CAudioPath *pLowestPath = m_pAudio->GetLowestPriorityPath();
    CThing *pThing = GetHead();
    for (; pThing && pLowestPath; pThing = pThing->GetNext()) {
        if (pThing->NeedsPath() &&
            (pThing->GetPriority() < pLowestPath->GetPriority())) {
            CThing *pOtherThing = (CThing *) pLowestPath->Get3DObject();
            if (pOtherThing) {
                pOtherThing->StopRunning();
            }
            // Stop the playback of all Segments on the path
            // in case anything is still making sound.
            m_pAudio->GetPerformance()->StopEx(pLowestPath->GetAudioPath(),0,0);
            // Now, set the path to point to this thing.
            pLowestPath->Set3DObject(pThing);
            // Give it the new priority.
            pLowestPath->SetPriority(pThing->GetPriority());
            // Mark this thing to start playing on the path next time around.
            // This gives a slight delay to allow any sounds in
            // the old path to clear out.
            pThing->MarkAssigned(pLowestPath);
            // Okay, done. Now, get the new lowest priority path
            // and use that to continue swapping...
            pLowestPath = m_pAudio->GetLowestPriorityPath();
        }
    }
}
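The stealing logic does not really depend on DirectMusic, so it can be tried out in isolation. What follows is only a minimal sketch under stated assumptions: the Thing and Path structs and the ReassignPaths() function are hypothetical stand-ins, not the Mingle classes. As in ReassignAudioPaths(), a smaller priority value means a more important thing, and a free path is assumed to carry a very large priority value so that it is always the first to be taken.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for CThing and CAudioPath.
struct Thing {
    float priority;  // Smaller value = more important.
    int   path;      // Index of the assigned path, or -1 if none.
};

struct Path {
    float priority;  // Priority of the thing that owns the path.
    int   owner;     // Index of the owning thing, or -1 if free.
};

// Find the path whose owner matters least (largest priority value).
// Returns -1 if there are no paths at all.
inline int LowestPriorityPath(const std::vector<Path>& paths) {
    int lowest = -1;
    for (std::size_t i = 0; i < paths.size(); ++i) {
        if (lowest < 0 ||
            paths[i].priority > paths[static_cast<std::size_t>(lowest)].priority) {
            lowest = static_cast<int>(i);
        }
    }
    return lowest;
}

// One pass of stealing: every thing without a path that outranks the
// least important path owner takes that path. Returns the swap count.
inline int ReassignPaths(std::vector<Thing>& things, std::vector<Path>& paths) {
    int swaps = 0;
    int lowest = LowestPriorityPath(paths);
    for (std::size_t t = 0; t < things.size() && lowest >= 0; ++t) {
        Path& victim = paths[static_cast<std::size_t>(lowest)];
        if (things[t].path < 0 && things[t].priority < victim.priority) {
            if (victim.owner >= 0) {
                // Evict the previous owner.
                things[static_cast<std::size_t>(victim.owner)].path = -1;
            }
            victim.owner = static_cast<int>(t);
            victim.priority = things[t].priority;
            things[t].path = lowest;
            ++swaps;
            // The ranking changed, so look up the new lowest path.
            lowest = LowestPriorityPath(paths);
        }
    }
    return swaps;
}
```

The sketch leaves out the one-frame delay that MarkAssigned() and StartAssignedThings() provide; it only shows who ends up owning which path.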
CThingManager::StartAssignedThings()
After calling ReassignAudioPaths(), there may be one or more things that have been assigned a new AudioPath but have not started making any noise. This is intentional; it gives the AudioPath a chance to drain whatever sound was in it before the thing newly assigned to it starts making noise. So, StartAssignedThings() is called on the next pass, and its job is simply to cause each newly assigned thing to start making noise.
void CThingManager::StartAssignedThings() {
    // Scan through all of the CThings...
    CThing *pThing = GetHead();
    for (; pThing; pThing = pThing->GetNext()) {
        // If this thing was assigned a new AudioPath, then start it.
        if (pThing->IsAssigned()) {
            // Get the newly assigned AudioPath.
            CAudioPath *pPath;
            if (pPath = pThing->GetAudioPath()) {
                // Was this previously making sound?
                // It could have been an AudioPath that was assigned
                // but was not played yet, in which case we do nothing.
                if (pThing->WasRunning()) {
                    pThing->Start();
                }
            }
            // Clear flag so we don't start it again.
            pThing->ClearAssigned();
        }
    }
}
CThingManager::EnforceAudioPathLimit()
EnforceAudioPathLimit() ensures that the number of AudioPaths matches the number set in SetAudioPathLimit(). If the number went down, it kills off low-priority AudioPaths to get back to where we should be. If the number went up, it makes sure that any available things that could be playing are given AudioPaths and turned on.
void CThingManager::EnforceAudioPathLimit() {
    // First, count how many AudioPaths are currently assigned
    // to things. This lets us know how many are currently in use.
    DWORD dwCount = 0;
    CThing *pThing = GetHead();
    for (; pThing; pThing = pThing->GetNext()) {
        if (pThing->GetAudioPath()) {
            dwCount++;
        }
    }
    // Do we have more AudioPaths in use than the limit? If so,
    // we need to remove some of them.
    if (dwCount > m_dwAudioPathLimit) {
        // Make sure we don't have any things or AudioPaths
        // in a halfway state.
        StartAssignedThings();
        dwCount -= m_dwAudioPathLimit;
        // dwCount now holds the number of AudioPaths that
        // we can no longer use.
        while (dwCount) {
            // Always knock off the lowest priority AudioPaths.
            CAudioPath *pLowestPath = m_pAudio->GetLowestPriorityPath();
            if (pLowestPath) {
                // Find the thing associated with the AudioPath.
                pThing = (CThing *) pLowestPath->Get3DObject();
                if (pThing) {
                    // Disconnect the thing.
                    pThing->SetAudioPath(NULL);
                }
                // Release the AudioPath. This stops all playback
                // on the AudioPath. Then, because the number of
                // AudioPaths is above the limit, it removes the AudioPath
                // from the pool.
                m_pAudio->Release3DPath(pLowestPath);
            }
            dwCount--;
        }
    }
    // On the other hand, if we have fewer AudioPaths in use than
    // we are allowed, see if we can turn some more on.
    else if (dwCount < m_dwAudioPathLimit) {
        // Do we have some turned off that could be turned on?
        if (GetCount() > dwCount) {
            // If the total is under the limit,
            // just allocate enough for all things.
            if (GetCount() < m_dwAudioPathLimit) {
                dwCount = GetCount() - dwCount;
            }
            // Otherwise, allocate enough to meet the limit.
            else {
                dwCount = m_dwAudioPathLimit - dwCount;
            }
            // Now, dwCount holds the number of AudioPaths to turn on.
            while (dwCount) {
                // Get the highest priority thing that is currently off.
                pThing = GetHead();
                float flPriority = FLT_MAX;
                CThing *pHighest = NULL;
                for (; pThing; pThing = pThing->GetNext()) {
                    // No AudioPath and higher priority?
                    if (!pThing->GetAudioPath() &&
                        (pThing->GetPriority() < flPriority)) {
                        flPriority = pThing->GetPriority();
                        pHighest = pThing;
                    }
                }
                if (pHighest) {
                    // We found a high priority thing.
                    // Start it playing. That will cause it to
                    // allocate an AudioPath from the pool.
                    pHighest->Start();
                }
                dwCount--;
            }
        }
    }
}
CThingManager::CalcNewPositions()
CalcNewPositions() scans through the list of things and has each calculate its position. This should be called prior to MoveThings(), which sets the new 3D positions on the AudioPaths, and ReassignAudioPaths(), which uses the new positions to ensure that the closest things are assigned to AudioPaths. CalcNewPositions() receives the time elapsed since the last call as its only parameter. This is used to calculate the velocity for each thing.
void CThingManager::CalcNewPositions(DWORD dwMils) {
    CThing *pThing = GetHead();
    for (; pThing; pThing = pThing->GetNext()) {
        pThing->CalcNewPosition(dwMils);
    }
}
CThingManager::SetAudioPathLimit()
SetAudioPathLimit() is called whenever the maximum number of 3D AudioPaths changes. In Mingle, this occurs when the user changes the number of 3D AudioPaths with the edit box. First, SetAudioPathLimit() sets the internal parameter, m_dwAudioPathLimit, to the new value. Then, it tells CAudio to do the same. Finally, it calls EnforceAudioPathLimit(), which does the hard work of culling or adding and activating 3D AudioPaths.
void CThingManager::SetAudioPathLimit(DWORD dwLimit) {
    m_dwAudioPathLimit = dwLimit;
    m_pAudio->Set3DPoolSize(dwLimit);
    EnforceAudioPathLimit();
}
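The counting in EnforceAudioPathLimit() boils down to one small piece of arithmetic: how many AudioPaths to take away or hand out. As a rough sketch only, with a hypothetical helper named PathAdjustment() that is not part of Mingle, the same decision can be written as a single function that returns a negative number when paths must be released and a positive number when more can be turned on:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not in Mingle): returns how many AudioPaths to
// turn on (positive) or release (negative) so that the number in use
// matches the limit without exceeding the number of things.
inline long PathAdjustment(unsigned long inUse,
                           unsigned long limit,
                           unsigned long totalThings) {
    // A thing holds at most one path, so this must hold.
    assert(inUse <= totalThings);
    if (inUse > limit) {
        // Over the limit: release the excess.
        return -static_cast<long>(inUse - limit);
    }
    // Under or at the limit: grow toward whichever is smaller,
    // the limit or the total number of things.
    unsigned long target = std::min(totalThings, limit);
    return static_cast<long>(target - inUse);
}
```

For example, with 10 paths in use, a limit of 8, and 12 things, the helper returns -2; with 3 in use, a limit of 8, and 20 things, it returns 5.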
CThingManager::MoveThings()
MoveThings() scans through the list of things and has each one assign its new position to the AudioPath. Once all things have been moved, MoveThings() calls the Listener's CommitDeferredSettings() method to cause the new 3D positions to take hold.
void CThingManager::MoveThings() {
    CThing *pThing = GetHead();
    for (; pThing; pThing = pThing->GetNext()) {
        pThing->Move();
    }
    // Get3DListener() returns an AddRef()'d interface (or NULL),
    // so check it and release it when done.
    IDirectSound3DListener8 *pListener = m_pAudio->Get3DListener();
    if (pListener) {
        pListener->CommitDeferredSettings();
        pListener->Release();
    }
}
Mingle Implementation The Mingle implementation of CThingManager is quite simple. You can look at the source code in MingleDlg.cpp. Initialization occurs in CMingleDlg::OnInitDialog(). In it, we load a script, which has routines that will be called by things when they start and stop playback. m_pEffects points to the CAudio instance that manages sound effects. m_pScript = m_pEffects->LoadScript(L"..\\Media\\Mingle.spt"); Next, set the maximum number of 3D AudioPaths to eight. This is intentionally a bit low so it can be easier to experiment with and hear the results. // Set the max number of 3D AudioPaths to 8. m_dwMax3DPaths = m_pEffects->Set3DPoolSize(8); // Initialize the thing manager with the CAudio for effects, // the script for playing sounds, and the 3D AudioPath limit. m_ThingManager.Init(m_pEffects,m_pScript,m_dwMax3DPaths); That's all the initialization needed. When one of the Talk buttons is clicked, a CThing object is created and told to start making noise. The code is simple: // Create a new CThing of the requested type. CThing *pThing = m_ThingManager.CreateThing(dwType); if (pThing) { // Successful, so start making sound. pThing->Start();
}
Conversely, when one of the Shut Up! buttons is clicked, a CThing object must be removed. This involves stopping the thing from playing, removing it, deleting it, and then checking to see if there is a lower priority thing waiting in the wings to play.
// Find an instance of the requested type thing.
CThing *pThing = m_ThingManager.GetTypeThing(dwType);
if (pThing) {
    // Stop it. This will stop all sound on the
    // AudioPath and release it.
    pThing->Stop();
    // Kill the thing!
    m_ThingManager.Remove(pThing);
    delete pThing;
    // There might be another thing that now has the
    // priority to be heard, so check and see.
    m_ThingManager.EnforceAudioPathLimit();
    UpdateCounts();
}
The dynamic scheduling occurs at regular intervals. Because Mingle is a windowed app, it uses the Windows timer. In a non-windowed application (like a game), this would more likely be once per frame. The timer code executes a series of operations:
1. Calculate new positions for all CThings. This adjustment of positions assigns new x, y, and z coordinates and assigns new priorities.
2. Draw the things.
3. Use the priorities to make sure that the highest priority things are being heard.
4. Update the 3D positions of the AudioPaths that have things assigned to them.
void CMingleDlg::OnTimer(UINT nIDEvent) {
    // First, update the positions of all things.
    // Indicate that this timer wakes up every 30 milliseconds.
    m_ThingManager.CalcNewPositions(30);
    // Draw them in their new positions.
    CClientDC dc(this);
    DrawThings(&dc);
    // Now, make sure that all the highest priority
    // things are being heard.
    m_ThingManager.ReassignAudioPaths();
    // Finally, update their 3D positions.
    m_ThingManager.MoveThings();
CDialog::OnTimer(nIDEvent); } If at any point during the play the 3D AudioPath limit changes, one call to SetAudioPathLimit() straightens things out. m_ThingManager.SetAudioPathLimit(m_dwMax3DPaths); That concludes dynamic 3D resource management!
Low-Latency Sound Effects We already talked in some depth about how to drop the latency earlier in this chapter, and the code sample for doing so was taken from Mingle. Mingle demonstrates low-latency sound effects with the various complaints heard when you click on the party with your mouse. First, create a CThing that is dedicated just to making these noises whenever the mouse is clicked. // Create the thing that makes a sound every time the mouse clicks. m_pMouseThing = m_ThingManager.CreateThing(THING_SHOUT); if (m_pMouseThing) { // Allocate an AudioPath up front and hang on to it because // we always want it ready to run the moment the mouse click occurs. m_pMouseThing->SetAudioPath(m_pEffects->Alloc3DPath()); m_pMouseThing->GetAudioPath()->Set3DObject((void *)m_pMouseThing); } When there is a mouse click, translate the mouse coordinates into 3D coordinates and set m_pMouseThing to these coordinates. Then, immediately have it play a sound. // Set the position of the AudioPath to the new position // of the mouse. m_pMouseThing->SetPosition(&Vector); // Start a sound playing immediately. m_pMouseThing->Start();
Background Ambience It's not enough to have a handful of individuals wandering around us mumbling, whining, munching, or laughing. To fill in the gaps and make the party feel a lot larger, we'd like to play some background ambience. The ambience track is a short Segment with a set of stereo wave recordings set up to play as random variations. The Segment is actually played by the script's StartParty routine, so there's no special code in Mingle itself for playing background ambience other than the creation of a stereo dynamic AudioPath for the initial
default AudioPath, which StartParty uses to play the Segment on. The Segment is set to loop forever, so nothing more needs to be done.
Avoiding Truncated Waves When a wave is played on an AudioPath that is pitch shifted down, the duration of the wave can be prematurely truncated. This occurs because the wave in the Wave Track has a specific duration assigned to it. The intention is honorable; you can specify just a portion of a wave to play by identifying the start time and duration. To work well with the scheduling of other sounds in music or clock time, the duration is also defined in music or clock time. Therein lies a problem. If you really want the wave to play through in its entirety, there's no way of specifying the duration in samples. This normally isn't a problem, but if the wave gets pitch shifted down during playback (which happens, for example, with Doppler shift), the wave sample duration becomes longer than the duration imposed on it and it stops prematurely. The simplest solution involves writing some code to force the wave durations to be long enough that they wrap the most dramatic pitch bend you would throw at it. This is actually easy to do, and this is what Mingle does. Through DirectMusic's Tool mechanism, you can write a filter that captures any event type on playback and alters it as needed. So, we write a Tool that intercepts the DMUS_WAVE_PMSG PMsg and multiplies the DMUS_WAVE_PMSG rtDuration field by a reasonable number (say, 10). We place the Tool in the Performance, so it intercepts everything played on all Segments. Although Tools can be written as COM objects that are created via CoCreateInstance() and authored into Segments or AudioPaths, that level of sophistication isn't necessary here. Instead, we declare a class, CWaveTool, that supports the IDirectMusicTool interface, which is actually very simple to write to. The heart of IDirectMusicTool is one method, ProcessPMsg(), which takes a pmsg and works its magic on it. In the case of CWaveTool, that magic is simply increasing the duration field in the wave pmsg. 
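How large does the multiplier need to be? Assuming the pitch shift is done by slowing down sample playback (so that dropping an octave doubles the playing time), the arithmetic is easy to sketch. The two helper functions below are hypothetical and exist only to illustrate the numbers; they are not part of Mingle or DirectMusic.

```cpp
#include <cassert>
#include <cmath>

// Factor by which a wave's playing time grows when it is pitch
// shifted down by the given number of equal-tempered semitones.
inline double StretchFactor(double semitonesDown) {
    return std::pow(2.0, semitonesDown / 12.0);
}

// Largest downward shift, in semitones, that a given duration
// multiplier still covers without truncating the wave.
inline double MaxSemitonesCovered(double multiplier) {
    assert(multiplier >= 1.0);
    return 12.0 * std::log2(multiplier);
}
```

Under that assumption, a shift of one octave down needs a multiplier of 2, and a multiplier of 10 covers a little under 40 semitones (more than three octaves), which is far beyond any Doppler shift you are likely to hear at a party.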
There are a handful of other methods in IDirectMusicTool that set up which pmsg types the Tool handles and when each pmsg should be delivered to the Tool; the choices are immediately, at the time stamp on the pmsg, or a little ahead of that. Here are all the IDirectMusicTool methods, as used by CWaveTool:
STDMETHODIMP CWaveTool::Init(IDirectMusicGraph* pGraph) {
    // Don't need to do anything at initialization.
    return S_OK;
}
STDMETHODIMP CWaveTool::GetMsgDeliveryType(DWORD* pdwDeliveryType) {
    // This tool should process immediately and not wait for the time stamp.
    *pdwDeliveryType = DMUS_PMSGF_TOOL_IMMEDIATE;
    return S_OK;
}
STDMETHODIMP CWaveTool::GetMediaTypeArraySize(DWORD* pdwNumElements) { // We have exactly one media type to process: Waves. *pdwNumElements = 1; return S_OK; } STDMETHODIMP CWaveTool::GetMediaTypes(DWORD** padwMediaTypes, DWORD dwNumElements) { // Return the one type we handle. **padwMediaTypes = DMUS_PMSGT_WAVE; return S_OK; } STDMETHODIMP CWaveTool::ProcessPMsg(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG) { // Just checking to be safe... if (NULL == pPMSG->pGraph) { return DMUS_S_FREE; } // Point to the next Tool after this one. // Otherwise, it will just loop back here forever // and lock up. if (FAILED(pPMSG->pGraph->StampPMsg(pPMSG))) { return DMUS_S_FREE; } // This should always be a wave since we only allow // that media type, but check anyway to be certain. if( pPMSG->dwType == DMUS_PMSGT_WAVE) { // Okay, now the miracle code. // Multiply the duration by ten to reach beyond the // wildest pitch bend. DMUS_WAVE_PMSG *pWave = (DMUS_WAVE_PMSG *) pPMSG; pWave->rtDuration *= 10; }
return DMUS_S_REQUEUE; } STDMETHODIMP CWaveTool::Flush(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG, REFERENCE_TIME rtTime) { // Nothing to do here. return S_OK; } Once the CWaveTool is compiling nicely, we still need to stick it in the Performance. Doing so is straightforward. We only need to create one CWaveTool and place it directly in the Performance at initialization time. The following code does so. The code uses an AudioPath to access the Performance Tool Graph. Using an AudioPath is actually simpler than calling the GetGraph() method on the Performance because GetGraph() returns NULL if there is no Graph; DirectMusic doesn't create one unless needed, for efficiency reasons. GetObjectInPath() goes the extra step of creating the Graph if it does not exist already. // Use the default AudioPath to access the tool graph. IDirectMusicAudioPath *pPath; if (SUCCEEDED(m_pPerformance->GetDefaultAudioPath(&pPath))) { IDirectMusicGraph *pGraph = NULL; pPath->GetObjectInPath(0, DMUS_PATH_PERFORMANCE_GRAPH,
        // The Performance Tool Graph.
        0, GUID_All_Objects, 0, // Only one object type.
        IID_IDirectMusicGraph, // The Graph interface.
        (void **)&pGraph);
    if (pGraph) {
        // Create a CWaveTool.
        CWaveTool *pWaveTool = new CWaveTool;
        if (pWaveTool) {
            // Insert it in the Graph. It will process all wave pmsgs that
            // are played on this Performance.
            pGraph->InsertTool(static_cast<IDirectMusicTool*>(pWaveTool), NULL, 0, 0);
        }
        pGraph->Release();
    }
    pPath->Release();
}
That's it. The Tool will stay in the Performance until the Performance is shut down and released. At that point, the references on the Tool will drop to zero, and its destructor will free the memory.
Note This technique has only one downside: If you actually do want to set a duration shorter than the full length of the wave, it undoes your work. In that scenario, the best thing to do is be more specific about where you do this. You can control which pchannels the Tool modifies, or you can place it in the Segments or AudioPaths where you want it to work, rather than the entire Performance.
Manipulating the Listener Properties
Working with the Listener is critical for any 3D sound effects work, so CAudio has a method, Get3DListener(), that retrieves the DirectSound IDirectSound3DListener8 interface. This is used for setting the position of the Listener as well as adjusting global 3D sound parameters.
Note Get3DListener() returns an AddRef()'d instance of the Listener, so it's important to remember to Release() it when done.
IDirectSound3DListener8 *CAudio::Get3DListener() {
    IDirectSound3DListener8 *pListener = NULL;
    IDirectMusicAudioPath *pPath;
    // Any AudioPath will do because the listener hangs off the primary buffer.
    if (m_pPerformance &&
        SUCCEEDED(m_pPerformance->GetDefaultAudioPath(&pPath))) {
        pPath->GetObjectInPath(0,
            DMUS_PATH_PRIMARY_BUFFER, 0, // Access via the primary Buffer.
            GUID_All_Objects, 0,
            IID_IDirectSound3DListener8, // Query for listener interface.
            (void **)&pListener);
        pPath->Release();
    }
    return pListener;
}
To facilitate the use of the Listener, the Mingle dialog code keeps a copy of the Listener interface and the DS3DLISTENER data structure that can be filled with all of the Listener properties and set at once.
DS3DLISTENER m_ListenerData; // Store current Listener params.
IDirectSound3DListener8 *m_pListener; // Use to update the Listener.
In the initialization of the Mingle window, the code gets access to the Listener and then reads the initial parameters into m_ListenerData. From then on, the window code can tweak any parameter and then bulk-update the Listener with the changes.
m_pListener = m_pEffects->Get3DListener();
m_ListenerData.dwSize = sizeof(m_ListenerData);
if (m_pListener) {
    m_pListener->GetAllParameters(&m_ListenerData);
}
Later, when one or more Listener parameters need updating, just the changed parameters can be altered and then the whole structure is submitted to the Listener. For example, the following code updates the Rolloff Factor in response to dragging a slider.
m_ListenerData.flRolloffFactor = (float) m_ctRollOff.GetPos() / 10;
m_pListener->SetAllParameters(&m_ListenerData,DS3D_IMMEDIATE);
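If the slider range is ever changed, it is worth keeping the value inside the range the Listener accepts before submitting the structure. The sketch below is illustrative only: ListenerData is a hypothetical one-field stand-in for DS3DLISTENER, ApplyRolloffSlider() is a made-up helper, and the 0.0 to 10.0 range is an assumption based on the rolloff limits documented for DirectSound, so check it against your SDK headers.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the one DS3DLISTENER field used here.
struct ListenerData {
    float flRolloffFactor;
};

// Assumed valid range for the rolloff factor.
const float kMinRolloff = 0.0f;
const float kMaxRolloff = 10.0f;

// Convert a slider position (tenths) to a rolloff factor and clamp it
// to the assumed range before storing it in the cached structure.
inline void ApplyRolloffSlider(ListenerData& data, int sliderPos) {
    float value = static_cast<float>(sliderPos) / 10.0f;
    data.flRolloffFactor = std::min(kMaxRolloff, std::max(kMinRolloff, value));
}
```

In the real dialog code, the clamped value would then be sent along with the rest of m_ListenerData in the SetAllParameters() call.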
Scripted Control of Sound Effects
Mingle uses scripting as much as possible. As we've already seen, all sound effects are played via scripts. There are only three routines in the script file mingle.spt:
§ StartParty: This is called at the beginning of Mingle. It is intended for any initialization that might be needed. It assigns names for the four people, and it starts the background ambience.
§ StartTalking: This is called whenever anything starts making sound. A variable, PersonType, is passed first, indicating which thing type should make the sound.
§ EndParty: This is called when the party is over and it's time to shut down. This isn't entirely necessary, but it's a little cleaner to go ahead and turn off all the sounds before shutting down.
Here's the script:
'Mingle Demo
'©2002 NewBlue Inc.

' Variable set by the application prior to calling StartTalking
dim PersonType ' Which object is making the sounds

' Variables retrieved by the application to find out the names
' of the people talking
dim Person1Name
dim Person2Name
dim Person3Name
dim Person4Name

sub StartTalking
    if (PersonType = 1) then
        Snob.play IsSecondary
    elseif (PersonType = 2) then
        Munch.play IsSecondary
    elseif (PersonType = 3) then
        Mosquito.play IsSecondary
    elseif (PersonType = 4) then
        HaHa.play IsSecondary
    elseif (PersonType = 5) then
        Shout.play IsSecondary
    end if
end sub

sub StartParty
    Crowd.play IsSecondary
    PersonType = 1
    Person1Name = "Snob"
    Person2Name = "HorsDeOoovers"
    Person3Name = "Mosquito Person"
    Person4Name = "Laughing Fool"
end sub

sub EndParty
    Crowd.Stop
    HaHa.Stop
    Mosquito.Stop
    Munch.Stop
    Shout.Stop
    Snob.Stop
end sub
Here's how the script is called at startup. Right after loading the script, call StartParty to initialize the Person name variables and allow the script to do anything else it might have to. Then, read the names. Since these are strings, the only way we can read them is via variants. Fortunately, there are some convenience routines that take the pain out of working with variants. You must call VariantInit() to initialize the variant. Then, use the variant to retrieve the string from the script. Since the memory for a variant string is allocated dynamically, it must be freed. So, call VariantClear(), and that cleans it up appropriately.
// Call the initialization routine. This should set the Person
// name variables as well as start any background ambience.
if (SUCCEEDED(m_pScript->GetScript()->CallRoutine(L"StartParty",NULL))) {
    // We will use a variant to retrieve the name of each Person.
    VARIANT Variant;
    char szName[40];
    VariantInit(&Variant);
    if (SUCCEEDED(m_pScript->GetScript()->GetVariableVariant(
        L"Person1Name",&Variant,NULL))) {
        // Copy the name to ASCII.
        wcstombs(szName,Variant.bstrVal,40);
        // Use it to set the box title.
        SetDlgItemText(IDC_PERSON1_NAME,szName);
        // Clear the variant to release the string memory.
        VariantClear(&Variant);
    }
    // Do the same for the other three names...
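One caution about the wcstombs() call: the C library does not write a terminating null when the converted string fills the entire buffer, so a name of 40 characters or more would leave szName unterminated. The names in mingle.spt are short, so Mingle is safe, but script content can change without a recompile. The following is a sketch of a more defensive copy, using a hypothetical helper, CopyWideToNarrow(), that is not part of the Mingle source:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copy a wide string into a fixed-size narrow
// buffer, truncating if needed, and always leave it null-terminated.
// Returns false if the arguments are unusable or conversion fails.
inline bool CopyWideToNarrow(char* dest, std::size_t destSize,
                             const wchar_t* src) {
    if (!dest || destSize == 0) {
        return false;
    }
    dest[0] = '\0';
    if (!src) {
        // An empty BSTR can be a NULL pointer; treat it as no text.
        return false;
    }
    // Reserve one byte so there is always room for the terminator.
    std::size_t written = std::wcstombs(dest, src, destSize - 1);
    if (written == static_cast<std::size_t>(-1)) {
        // A character could not be converted in the current locale.
        dest[0] = '\0';
        return false;
    }
    dest[written] = '\0';
    return true;
}
```

In the listing above, the call would become CopyWideToNarrow(szName, sizeof(szName), Variant.bstrVal), with the dialog title set only when it returns true.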
We've already seen how StartTalking is called when we looked at dynamic AudioPaths. EndParty is called when the app window is being closed. If there's anything that the script needs to do on shutdown, it can do it there. if (m_pScript) { m_pScript->GetScript()->CallRoutine(L"EndParty",NULL); }
Creating Music with the Composition Engine Although this chapter is all about sound effects, it's important to demonstrate how music and sound effects work together, so I needed to put some music in this but not spend much time on it. Enter DirectMusic's Style and composition technologies.
The Composition Engine can write a piece of music using harmonic (chord progression) guidelines authored in the form of a ChordMap. For the overall shape of the music, the Composition Engine can use a template Segment authored in Producer or actually write a template on the fly using its own algorithm, which is defined by a handful of predefined shapes. The shapes describe different ideas for how the music should progress over time. Examples include rising (increasing in intensity) and falling (decreasing in intensity). A particularly useful shape is Song, which builds a structure with subsections for verse, chorus, etc., that it swaps back and forth. For playback, the chords and phrasing created by the Composition Engine are interpreted by a Style. As it happens, DX9 ships with almost 200 different Styles and about two dozen ChordMaps. Not only are they a great set of examples for how to write style-based content, but they can be used as a jumping-off point for writing your own Styles. Or, for really low-budget apps like Mingle, you can always use them off the shelf. They are also great for dropping in an application to sample different musical genres as well as making a great placeholder while waiting for the real music to be authored. It is really easy to use the Composition Engine to throw together a piece of music to play in Mingle. We add ComposeSegment(), which creates a CSegment as an alternative to LoadSegment(), which reads an authored one from disk. CSegment *CAudio::ComposeSegment( WCHAR *pwzStyleName,
// File name of the Style.
WCHAR *pwzChordMapName,
// File name of the ChordMap.
WORD wNumMeasures,
// How many measures long?
WORD wShape,
// Which shape?
WORD wActivity,
// How busy are the chord changes?
BOOL fIntro, BOOL fEnd) // Do we want beginning and/or ending?
{ // Create a Composer object to build the Segment. CSegment *pSegment = NULL; if (!m_pComposer) { CoCreateInstance( CLSID_DirectMusicComposer, NULL, CLSCTX_INPROC, IID_IDirectMusicComposer, (void**)&m_pComposer); } if (m_pComposer) { // First, load the Style. IDirectMusicStyle *pStyle = NULL; IDirectMusicChordMap *pChordMap = NULL;
        if (SUCCEEDED(m_pLoader->LoadObjectFromFile(
            CLSID_DirectMusicStyle,
            IID_IDirectMusicStyle,
            pwzStyleName,
            (void **) &pStyle))) {
            // We have the Style, so load the ChordMap.
            if (SUCCEEDED(m_pLoader->LoadObjectFromFile(
                CLSID_DirectMusicChordMap,
                IID_IDirectMusicChordMap,
                pwzChordMapName,
                (void **) &pChordMap))) {
                // Hooray, we have what we need. Call the Composition
                // Engine and have it write a Segment.
                IDirectMusicSegment8 *pISegment;
                if (SUCCEEDED(m_pComposer->ComposeSegmentFromShape(
                    pStyle, wNumMeasures, wShape,
                    wActivity, fIntro, fEnd,
                    pChordMap,
                    (IDirectMusicSegment **) &pISegment))) {
                    // Create a CSegment object to manage playback.
                    pSegment = new CSegment;
                    if (pSegment) {
                        // Initialize
                        pSegment->Init(pISegment,m_pPerformance,SEGMENT_FILE);
                        m_SegmentList.AddTail(pSegment);
                    }
                    pISegment->Release();
                }
                pChordMap->Release();
            }
            pStyle->Release();
        }
    }
    return pSegment;
}
Using this is simple. When Mingle first opens up, call ComposeSegment() and have it write something long. Set it to repeat forever. Play it and forget about it. CSegment *pSegment = m_pMusic->ComposeSegment( L"Layback.sty",
// Layback style.
L"PDPTDIA.cdm",
// Pop diatonic ChordMap.
100,
// 100 measures.
DMUS_SHAPET_SONG,
// Song shape.
2,
// Relatively high chord activity,
true,true);
// Need an intro and an ending.
if (pSegment) {
    // Play it over and over again.
    pSegment->GetSegment()->SetRepeats(DMUS_SEG_REPEAT_INFINITE);
    m_pMusic->PlaySegment(pSegment,NULL,NULL);
}
You might have noticed that this is the only sound-producing code in Mingle that is not managed by the script. Unfortunately, scripting does not support ComposeSegmentFromShape(), so we have to call it directly in the program. However, one could have the script manage the selection of Style and ChordMap so that the music choice could be iteratively changed without a recompile of the program.
DirectX Audio has the potential to create truly awesome sound environments. Many of the techniques that apply well to music, like scripting, variations, and even groove level, lend themselves to creating audio environments that are equally rich. The beauty is the vast majority of the work is done on the authoring side with a minimum amount of programming. The key is in understanding the technologies and mapping out a good strategy. Hopefully this chapter points you in a good direction. You are welcome to use the sample code in any way you'd like.
Chapter 14: Notifications
Overview
Todor Fay
Sometimes, an application would really like to know when a specific sound or musical event occurs. This is useful for many reasons. For example, if the audio system can report when a Segment is about to finish, the application can select and queue the appropriate Segment to follow. There are plenty of creative paths to follow once you have a robust, easy-to-use system that "notifies" the application when specific audio events occur. Examples of things you can do with a good notification system include:
§ Placing signals in a Segment to indicate when specific sounds occur to trigger visual elements
§ Synchronizing animation to the musical beat so characters walk and move with rhythm
Think about the corollary between film and games. In the film editing process, sounds often synchronize to visual happenings, and sometimes visual elements cut to the sound or music. You might think of an MTV video for the extreme example, but even action sequences where the music is not center stage can be subtly improved by letting sound drive some of the timing. Rhythm and pacing are far more critical in the audible domain than visual. Think about speech. It works very well to animate lips to words, but it is invariably clumsy the other way around.
So far, with game sound, it's been pretty much a one-way street; sounds are triggered by visual events. But, with the proper tools and techniques, you can change all that and make more compelling environments. DirectX Audio includes a rich set of options for synchronizing visuals and actions to the sound track. Depending on the notification requirements, DirectX Audio provides different approaches to accomplish synchronization.
§ Performance notifications: The Performance Engine provides a standard system for handling notifications. Standard notifications range from Segment start and stop to groove level change.
§ Lyrics: A Lyric Track in a Segment provides an easy way to provide notifications that are defined as text names. The beauty of this approach is any text word can be placed in a Segment timeline and then delivered to the application at the specific point in time.
§ Script track: A Script Track calls script routines at specific points in the Segment playback.
In this chapter, we talk about all three and add features to Jones to visualize both performance notifications and lyrics.
Performance Notifications The standard mechanism for managing notifications is pretty comprehensive. DirectMusic's Performance Engine provides different categories of notifications, which you can turn on or off as needed. These range from notifications that tell the application when different sounds and music clips start and stop to notifications that indicate when musical events, such as changes in time signature or groove level, occur. The notifications pass through the Performance in pmsg form, much like note or wave messages. However, instead of sending these messages to the synthesizer, DirectMusic delivers them to the application. The application can even choose how to receive the notifications; it can set up an event and be signaled when one arrives, or it can simply poll for the notification messages.
Notification Categories
The notification pmsgs are broken into categories, defined via GUIDs. Many of the categories have subtypes, which are in turn defined by a constant DWORD. The categories include:
§ GUID_NOTIFICATION_CHORD: This notification indicates that a chord has changed.
§ GUID_NOTIFICATION_COMMAND: This notification indicates whenever a groove level track broadcasts a new groove level or embellishment command.
§ GUID_NOTIFICATION_MEASUREANDBEAT: One of these notifications is sent every beat. It carries the measure and beat number with it.
§ GUID_NOTIFICATION_PERFORMANCE: This notification indicates a change in the status of all Segments in the Performance. For example, it signals when sound or music first starts as well as when the last Segment finishes playing.
§ GUID_NOTIFICATION_RECOMPOSE: This notification indicates when a Track is recomposed, which can automatically occur on the fly. This is primarily used by authoring tools (in other words, DirectMusic Producer).
§ GUID_NOTIFICATION_SEGMENT: This notification indicates a change in the playback status of a Segment, including starting, looping, and stopping.
Of all of these, the Segment notifications are by far the most useful. These indicate when a Segment starts, stops, or loops and even when it is stopped prematurely. The notification mechanism also hands the application a pointer to the IDirectMusicSegmentState interface associated with the playing Segment.
The Notification Pmsg The DMUS_NOTIFICATION_PMSG starts with the DMUS_PMSG_PART core, which provides the necessary infrastructure for pmsg delivery, and follows with several fields used to identify the type of the notification:
    typedef struct _DMUS_NOTIFICATION_PMSG
    {
        /* begin DMUS_PMSG_PART */
        DMUS_PMSG_PART
        /* end DMUS_PMSG_PART */
        GUID  guidNotificationType;   // GUID indicating notification category.
        DWORD dwNotificationOption;   // Subtype of the notification.
        DWORD dwField1;               // Optional data field.
        DWORD dwField2;               // Second data field.
    } DMUS_NOTIFICATION_PMSG;
guidNotificationType indicates the category of notification. These are the notification categories listed earlier. A GUID is used to ensure that new notification categories can be introduced by different parties without any danger of clashing. Each category can optionally supply more than one subtype, identified in dwNotificationOption. The following two fields, dwField1 and dwField2, carry notification-specific data. When used, their functionality is defined by the notification category and option. Additional relevant information is stored in the DMUS_PMSG_PART portion. Of particular interest are:
    REFERENCE_TIME rtTime;     // Clock time this pmsg occurs.
    MUSIC_TIME     mtTime;     // Music time of this pmsg.
    IUnknown*      punkUser;   // Optional COM pointer to referenced object.
rtTime and mtTime carry the specific time that the notification occurs. punkUser carries a pointer to a COM interface, allowing the notification to drag along a pointer to an associated object, if appropriate. For example, the GUID_NOTIFICATION_SEGMENT notification uses punkUser to carry a pointer to the IDirectMusicSegmentState associated with the playing Segment.
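Because rtTime is a REFERENCE_TIME, which counts in units of 100 nanoseconds, a little arithmetic tells the application how far ahead of (or behind) its time stamp a notification was picked up. The following is a minimal sketch rather than code from Jones: the helper name MillisecondsUntil() and the local REFERENCE_TIME typedef are assumptions added so the fragment compiles on its own. In a real application the current clock time would typically come from the Performance's GetTime() method.

```cpp
#include <cstdint>

// Stand-in for the DirectMusic typedef (a signed 64-bit count of
// 100-nanosecond units), declared here so the sketch compiles alone.
typedef std::int64_t REFERENCE_TIME;

// Hypothetical helper: milliseconds from rtNow until rtEvent.
// A positive result means the event is still in the future;
// a negative result means its time stamp has already passed.
inline std::int64_t MillisecondsUntil(REFERENCE_TIME rtEvent, REFERENCE_TIME rtNow)
{
    const std::int64_t kUnitsPerMillisecond = 10000; // 10,000 x 100ns = 1ms
    return (rtEvent - rtNow) / kUnitsPerMillisecond;
}
```

An application could pass pNotify->rtTime as rtEvent to decide, for instance, whether a visual reaction should be delayed by a frame or fired immediately.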
Enabling a Notification By default, DirectMusic does not generate any notifications. To activate a notification, you must call the Performance's AddNotificationType() method:
    // Enable Segment notifications.
    pPerformance->AddNotificationType(GUID_NOTIFICATION_SEGMENT);
In this example, setting GUID_NOTIFICATION_SEGMENT tells DirectMusic to generate a notification every time a Segment starts, loops, or stops. To disable a notification, call the inverse method:
    // Disable time signature notifications.
    pPerformance->RemoveNotificationType(GUID_NOTIFICATION_MEASUREANDBEAT);
Receiving a Notification There are several ways to receive the notification pmsg. You can poll for it. You can set up an event and wait to be signaled. Or you can stick a Tool in the Performance and use the Tool to intercept it.
Polling for Notifications Polling is the easiest way to receive a notification and typically the most useful. If the application has a regular loop of things that it does once per frame, then it can check for notifications as part of that loop. The Performance provides a method, GetNotificationPMsg(), that retrieves any due notification pmsgs.
    DMUS_NOTIFICATION_PMSG *pPmsg;
    while (pPerformance->GetNotificationPMsg(&pPmsg) == S_OK)
    {
        // Do something here with the data from the notification, and then free it.
        pPerformance->FreePMsg((DMUS_PMSG*)pPmsg);
    }
GetNotificationPMsg() returns S_OK for every notification it has in its queue; then it returns S_FALSE when there are none left. Be careful: S_FALSE and S_OK are both considered success codes, so don't use the SUCCEEDED(hr) macro because it will always succeed. Instead, check specifically for S_OK.
Note: Notifications don't wait around forever. If the application does not call GetNotificationPMsg() within a reasonable time (by default, two seconds), DirectMusic discards the notification.
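The S_OK versus S_FALSE pitfall is easy to demonstrate without DirectMusic at all. The sketch below is illustrative only: FakeNotificationQueue, Succeeded(), and DrainQueue() are hypothetical stand-ins, and the HRESULT constants are redeclared locally so the fragment compiles on any platform. It mimics the return-code behavior described above and shows why the loop must compare against S_OK.

```cpp
#include <deque>

// Stand-ins for the Windows definitions so the sketch compiles alone.
typedef long HRESULT;
const HRESULT S_OK = 0;
const HRESULT S_FALSE = 1;

// Mirrors what the SUCCEEDED() macro tests: any non-negative HRESULT.
inline bool Succeeded(HRESULT hr) { return hr >= 0; }

// Hypothetical queue that mimics GetNotificationPMsg()'s return codes.
struct FakeNotificationQueue
{
    std::deque<int> pending;

    HRESULT GetNext(int *pValue)
    {
        if (pending.empty())
        {
            return S_FALSE;   // A success code, but nothing was returned.
        }
        *pValue = pending.front();
        pending.pop_front();
        return S_OK;
    }
};

// Drain the queue the way the polling loop does: stop on anything but S_OK.
// Looping on Succeeded() instead would never terminate, because S_FALSE
// also counts as success.
inline int DrainQueue(FakeNotificationQueue &queue)
{
    int count = 0;
    int value = 0;
    while (queue.GetNext(&value) == S_OK)
    {
        ++count;   // A real loop would act on the notification here.
    }
    return count;
}
```

Once the queue is empty, GetNext() keeps returning S_FALSE, which Succeeded() still reports as success; that is exactly the condition that would trap a SUCCEEDED()-based loop.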
Waiting on an Event If you would prefer, DirectMusic can signal your application exactly when a notification is ready. This option uses the standard event signaling Windows APIs. First, create an event handle with a call to the Windows CreateEvent() function.
    HANDLE hNotify = CreateEvent(NULL, FALSE, FALSE, NULL);
After creating the event, hand it to the Performance via its SetNotificationHandle() method. This tells the Performance to signal the event whenever a notification is due.
    pPerformance->SetNotificationHandle(hNotify, 4000);
The second parameter specifies how long, in milliseconds, the Performance should hold onto the notification if it is not retrieved. This keeps old notifications from piling up if the application is temporarily not listening to them. A value of 0 in this parameter indicates that the default time of two seconds should be used. Once the event is prepared, the application calls the Windows API WaitForSingleObject() to wait for a notification. It's important to understand that this call stops (or "blocks") the calling thread until a notification arrives or the timeout expires. Because of this, it doesn't work to place this call in a thread that needs to be doing anything while waiting, so be sure to place it in a thread that can afford to wait. For example, it would not work to call WaitForSingleObject() in the middle of a graphics/UI loop.
    WaitForSingleObject(hNotify, 1000);
In this code, the caller waits for a notification for up to one second. Once the notification arrives, the caller retrieves the notification via a call to GetNotificationPMsg().
Capturing Notifications with a Tool There's a third way to receive notification pmsgs. Create a Tool with an IDirectMusicTool interface, set it up to handle the DMUS_PMSGT_NOTIFICATION message type, and place it in the Performance. When a notification comes down the pike, DirectMusic calls the Tool's ProcessPMsg() method, and the Tool can then notify the application in whatever way it wants to. This approach requires a little more work, but it buys extra flexibility:
§ You have three options for delivery timing. Choose DMUS_PMSGF_TOOL_IMMEDIATE, and the Tool gets called the moment the notification is generated, which is typically 500ms ahead of the time stamp. Choose DMUS_PMSGF_TOOL_QUEUE, and the Tool is called a little early, typically about 50ms. Choose DMUS_PMSGF_TOOL_ATTIME, and the Tool is called as close to the exact time stamp as possible, which is what the regular notification system does. As it turns out, DMUS_PMSGF_TOOL_QUEUE is particularly useful because being slightly early (but still with the correct time stamp) results in a response that might feel more accurate.
§ You control where the notifications are sent by placing the Tool in the appropriate object. You can insert Tools in Segments, AudioPaths, and the Performance. A notification Tool placed in a Segment only receives notifications generated by that Segment, which makes it very easy to understand the context of the notification.
§ If you plan to also use lyric notifications (we discuss that next), you need to write a Tool anyway, and so you might as well use the same mechanism for both and avoid the extra code.
For the last reason, we are using a Tool to capture notifications in Jones. Setting up a Tool is not as hard as it may seem. We learned this in the last chapter when we created a Tool to lengthen wave pmsgs. Define a C++ class based on the IDirectMusicTool interface. Set it up to only receive DMUS_PMSGT_NOTIFICATION messages and receive them at DMUS_PMSGF_TOOL_QUEUE time (which means they arrive about 50ms or roughly a frame early). This chapter's Jones implementation covers installing such a Tool in detail, so we can skip the code for defining and installing the Tool for now. The active part of the Tool is the method ProcessPMsg(), which receives the message. For processing notifications, it verifies that the pmsg is of type DMUS_PMSGT_NOTIFICATION, reads the parameters from the notification pmsg, does whatever it needs to do, and then returns.
    HRESULT CNotifyTool::ProcessPMsg(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG)
    {
        if (pPMSG->dwType == DMUS_PMSGT_NOTIFICATION)
        {
            // Read the notification and do something.
        }
        // No need to send this on down the line.
        return DMUS_S_FREE;
    }
As you can see, the code to receive a notification via a Tool is remarkably simple.
Note: Be sure not to linger long inside the Tool's ProcessPMsg() method. It is called on a high-priority thread and should not be used for UI or file I/O work, which could destroy the timing. Instead, set a flag or place a message in a queue and then butt out immediately.
Interpreting the Notification Once we have the notification in hand, we need to make sense of it. This involves reading the dwType, guidNotificationType, and dwNotificationOption fields to glean the specific notification. When working with both lyrics and notifications, it's important to identify the pmsg type by checking the dwType field of the pmsg.
    // Is this a notification? (Only applies when reading from a Tool.)
    if (pMsg->dwType == DMUS_PMSGT_NOTIFICATION)
    {
        DMUS_NOTIFICATION_PMSG *pNotify = (DMUS_NOTIFICATION_PMSG *) pMsg;
        // It's a notification!
    }
If it is a notification, check both guidNotificationType for the category of notification and dwNotificationOption for the subtype. In this example, the application wants to know when the music ends, so it can fade out the scene.
    DMUS_NOTIFICATION_PMSG *pNotify = (DMUS_NOTIFICATION_PMSG *) pMsg;
    DWORD dwNotificationOption = pNotify->dwNotificationOption;
    if (pNotify->guidNotificationType == GUID_NOTIFICATION_PERFORMANCE)
    {
        if (dwNotificationOption == DMUS_NOTIFICATION_MUSICSTOPPED)
        {
            // Music's over. Fade to black.
        }
    }
Lyric Notifications Lyrics are a little simpler to talk about for several reasons:
§ There's only one way you can receive lyrics, and that is via a Tool.
§ There's no enabling or disabling of lyrics. Regular notifications need this feature because there are all kinds of messages that you may not care about. With lyrics, the messages are always exactly what you want because they've been intentionally authored into the Segments.
§ The lyric pmsg, DMUS_LYRIC_PMSG, is simpler too. It has exactly one field, the lyric string:
    typedef struct _DMUS_LYRIC_PMSG
    {
        /* begin DMUS_PMSG_PART */
        DMUS_PMSG_PART
        /* end DMUS_PMSG_PART */
        WCHAR wszString[1];   /* NULL-terminated Unicode lyric */
    } DMUS_LYRIC_PMSG;
Receiving a Lyric There is exactly one way to receive a lyric: intercept it with a Tool. Going this route does mean a little more work, but in truth we're talking an extra hour of programming that can pay off easily in the long term. Create a Tool with an IDirectMusicTool interface, set it up to handle the DMUS_PMSGT_LYRIC message type, and place it in the Performance. (We will provide example code later in this chapter when we add notifications to Jones.) When a lyric comes down the pike, DirectMusic calls the Tool's ProcessPMsg() method, which can then notify the application in whatever way it wants to.
    HRESULT CLyricTool::ProcessPMsg(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG)
    {
        if (pPMSG->dwType == DMUS_PMSGT_LYRIC)
        {
            // Read the lyric and do something based on what it says.
        }
        // No need to send this on down the line.
        return DMUS_S_FREE;
    }
Interpreting the Lyric
Once we have the lyric in hand, read it. First, if working with both lyrics and notifications, check the dwType field of the pmsg.
    // Is this a lyric? (Only applies when reading from a Tool.)
    if (pMsg->dwType == DMUS_PMSGT_LYRIC)
    {
        DMUS_LYRIC_PMSG *pLyric = (DMUS_LYRIC_PMSG *) pMsg;
        // It's a lyric!
    }
If it is a lyric, use a series of string comparisons to see what it is. Remember that these are Unicode strings, so use the "L" prefix to make the comparison string a wide-character literal.
    if (!wcscmp(pLyric->wszString, L"Boink!"))
    {
        // It's a Boink! Flatten Gruber's head momentarily!
    }
That's it. Since this code is probably called from within the Tool's ProcessPMsg() method, you do not even need to free the pmsg. Just return DMUS_S_FREE and the Performance does it for you.
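When a project accumulates more than a handful of lyrics, the chain of wcscmp() calls can be folded into a lookup table. The sketch below is one possible arrangement, not the approach used in Jones: the LyricAction enumeration, the table contents, and LookupLyric() are all hypothetical names chosen for illustration. Note that the comparison remains case sensitive, just like wcscmp() in the fragment above.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical actions the game might take in response to a lyric.
enum LyricAction
{
    ACTION_NONE,    // Unrecognized lyric; ignore it.
    ACTION_BOINK,
    ACTION_ZOW
};

// One table row: the authored lyric text and the action it triggers.
struct LyricEntry
{
    const wchar_t *text;
    LyricAction    action;
};

// Illustrative table; a real project would list its own authored lyrics.
static const LyricEntry g_lyricTable[] =
{
    { L"Boink!", ACTION_BOINK },
    { L"Zow!",   ACTION_ZOW   },
};

// Walk the table and return the action for an exact (case-sensitive) match.
inline LyricAction LookupLyric(const wchar_t *wszLyric)
{
    const std::size_t count = sizeof(g_lyricTable) / sizeof(g_lyricTable[0]);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (std::wcscmp(wszLyric, g_lyricTable[i].text) == 0)
        {
            return g_lyricTable[i].action;
        }
    }
    return ACTION_NONE;
}
```

Inside a Tool, the call would look like LookupLyric(pLyric->wszString), with the result stored in a state variable for the main loop to pick up. Keeping the strings in one table also makes it easier to compare them against what the composer actually authored.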
Case Study: Begonia's Amazing Dance Moves Although Jones has its own notification implementation, it doesn't actively do anything in response to the notifications. So, let's write something simpler that shows how to use them. You are working on the sound design for the genre-shattering new title, "Begonia's Amazing Dance Moves." This exciting new adventure places Begonia's writhing torso center stage. Your job is to make it work. You decide to use notifications to communicate state information back to the game engine. First, define state variables for 1) Begonia's mood, 2) whether music is playing, and 3) her current dance move.
    // Begonia's emotional state, placed in g_dwBegoniaMood.
    #define BM_PLACID     1
    #define BM_ANGUISHED  2
    #define BM_WHO_ME     3
    #define BM_UPBEAT     4
    #define BM_DEAD       5
    static DWORD g_dwBegoniaMood = BM_PLACID;
    // Music status, tracked in g_dwMusicState.
    #define MS_RUNNING    1
    #define MS_STOPPED    2
    #define MS_EXPIRING   3
    static DWORD g_dwMusicState = MS_STOPPED;
    // Begonia's dance moves, stored in g_dwBegoniaMove.
    #define BD_OFF        1   // Not dancing yet.
    #define BD_START      2   // Dance command to start.
    #define BD_JUMP       3   // Dance command to jump.
    #define BD_WIGGLE     4   // Dance command to wiggle.
    #define BD_NEXT       5   // Waiting for next dance command.
    static DWORD g_dwBegoniaMove = BD_OFF;
    // Also, track the music Segment.
    static IDirectMusicSegmentState *g_pDanceMusic = NULL;
Now, write a routine that takes a pmsg, be it either lyric or notification, and sets the state variables appropriately. We assume there is additional code (most likely a Tool, since this supports lyrics) to capture the pmsg and deliver it to the routine ProcessNotification():
    void ProcessNotification(DMUS_PMSG *pMsg)
    {
        // Is this a lyric? (Only applies when reading from a Tool.)
        if (pMsg->dwType == DMUS_PMSGT_LYRIC)
        {
            DMUS_LYRIC_PMSG *pLyric = (DMUS_LYRIC_PMSG *) pMsg;
            // Read the lyric and use it to change game state, which will be
            // picked up asynchronously by the main engine.
            if (!wcscmp(pLyric->wszString, L"Agony Over"))
            {
                // Soul-rending theme is finished. Begonia can
                // stop her tears now and be a little more upbeat.
                g_dwBegoniaMood = BM_UPBEAT;
            }
            else if (!wcscmp(pLyric->wszString, L"Shout"))
            {
                // Someone shouts her name, Begonia reacts.
                g_dwBegoniaMood = BM_WHO_ME;
            }
        }
        else if (pMsg->dwType == DMUS_PMSGT_NOTIFICATION)
        {
            DMUS_NOTIFICATION_PMSG *pNotify = (DMUS_NOTIFICATION_PMSG *) pMsg;
            DWORD dwNotificationOption = pNotify->dwNotificationOption;
            if (pNotify->guidNotificationType == GUID_NOTIFICATION_PERFORMANCE)
            {
                if (dwNotificationOption == DMUS_NOTIFICATION_MUSICALMOSTEND)
                {
                    // Uh oh, primary music Segment is about to end.
                    // Let main loop know to go ahead and schedule a new Segment.
                    g_dwMusicState = MS_EXPIRING;
                }
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_SEGMENT)
            {
                if (dwNotificationOption == DMUS_NOTIFICATION_SEGSTART)
                {
                    // Is this the start of the dance music?
                    // It was scheduled to start on a measure, so we
                    // needed to wait and start once it was ready.
                    if (pMsg->punkUser == g_pDanceMusic)
                    {
                        // Yes! Begonia can jump up and start dancing.
                        g_dwBegoniaMove = BD_START;
                    }
                }
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_MEASUREANDBEAT)
            {
                // Begonia has rhythm. She moves in time with the music.
                // Read the beat field to decide what to do.
                // Each time g_dwBegoniaMove changes, the game engine
                // sets her on a new path and returns g_dwBegoniaMove to BD_NEXT.
                if (pNotify->dwField1 == 2)
                {
                    // Begonia jumps on beat 3 (the beat field counts from zero).
                    g_dwBegoniaMove = BD_JUMP;
                }
                else
                {
                    // On any other beat, Begonia wiggles.
                    g_dwBegoniaMove = BD_WIGGLE;
                }
            }
        }
        // Return the pmsg to the Performance.
    }
Depending on how you acquired the pmsg, you may need to free it now:
    // Return the pmsg to the Performance.
    pPerf->FreePMsg(pMsg);
Note: If this exercise has inspired you to name your first daughter Begonia, please reconsider.
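The case study leaves the consuming side to the imagination: the game engine reads g_dwBegoniaMove, acts on it, and returns it to BD_NEXT. The sketch below shows one way that hand-off could look. It is an assumption-laden illustration, not the book's code: it swaps the plain DWORD global for std::atomic (a C++11 facility that postdates the original sample) so the reset cannot overwrite a newer command written by the notification thread, and the names g_move, PostMove(), and TakeMove() are hypothetical.

```cpp
#include <atomic>

// Same values as the BD_ constants in the case study, redeclared
// as typed constants so this sketch compiles on its own.
const unsigned long BD_OFF    = 1;  // Not dancing yet.
const unsigned long BD_START  = 2;  // Dance command to start.
const unsigned long BD_JUMP   = 3;  // Dance command to jump.
const unsigned long BD_WIGGLE = 4;  // Dance command to wiggle.
const unsigned long BD_NEXT   = 5;  // Waiting for next dance command.

// Hypothetical replacement for g_dwBegoniaMove.
static std::atomic<unsigned long> g_move(BD_OFF);

// Called from the notification side (for example, from a Tool).
inline void PostMove(unsigned long move)
{
    g_move.store(move);
}

// Called once per frame by the game loop. If a dance command is pending,
// it is returned and the variable is reset to BD_NEXT. Otherwise the
// current idle value (BD_OFF or BD_NEXT) is returned unchanged.
inline unsigned long TakeMove()
{
    unsigned long move = g_move.load();
    while (move == BD_START || move == BD_JUMP || move == BD_WIGGLE)
    {
        // Swap in BD_NEXT only if no newer command has replaced 'move'.
        if (g_move.compare_exchange_weak(move, BD_NEXT))
        {
            return move;
        }
        // On failure, 'move' holds the latest value; the loop re-checks it.
    }
    return move;
}
```

With this arrangement, ProcessNotification() would call PostMove(BD_JUMP) instead of assigning the global directly, and the game loop would switch on the result of TakeMove() each frame.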
Script Routines Using scripting for notification is very different from performance notifications or lyrics. Instead of writing C++ code to catch notification or lyric messages, you write script routines that actively do something at scheduled times in Segments. Depending on what you intend to do with the notification, there are different approaches: § If the intention of the notification is to turn around and call a script routine, then it clearly makes more sense to just call the script in the first place! Likewise, if the intention is to play a Segment, then it's easy to write a script routine to do just that. The beauty of these solutions is they never touch the C++ code. § If the intention of the notification is to update a state variable, then it's just slightly more indirect. The script routine updates a global variable in the script. Then, the main loop polls the variable to read its value. This is actually very similar to the mechanism we explored while choreographing Begonia's performance.
Notifications in Jones Let's try the notification implementation in Jones and then see how it's done. If you installed the CD, run NotifyJones by selecting it from the Start menu. Otherwise, double-click on its icon in the Unit II\Bin directory or compile and run it from the 14_Notification source directory. Notice that a new display appears at the bottom-right corner.
The Notifications… button lets you select which notifications you'd like to enable, and the box below it displays the notifications as the Segments play. By default, all notifications are disabled, so start by clicking on the button. This opens the Notifications dialog.
Enable Performance, Segment, and Lyric and then click on OK. Go to the Scripts section and click on the Open… button to read in a script. Load SimpleScript.spt from the Media directory. It has Segments that have been prepared with lyrics. Double-click on StartLevel in the routine list to initiate playback of the primary Segment. Immediately, two notifications appear in the display.
Seg Start indicates that a Segment started playing. Perf Start indicates that it is also the very first Segment. Double-click on Wasp. This triggers a call to Wasp() in the script. Wasp() plays a secondary Segment, which also has a Lyric Track.
Wasp() kicks off a flurry of notification activity. First, Seg Start indicates that the new Segment has started. Then, Bzzzzz, Piff!, and finally Zow! appear, as the Segment delivers each of these lyrics. After a while, Seg Near End indicates that the Segment is almost finished, and finally Seg End appears as the Segment completes (if you look to the graphic display, you should see its rectangle moved completely to the left at this very moment). Experiment a little more by turning on and off different notifications and playing the different script routines.
Jones Implementation Jones doesn't really use notifications in the same way that you would for a game or other interactive application. Normally, you would want to use notifications to trigger the application to run some code or set a state variable. Instead, Jones just displays the text. But it's important that the notification system built into Jones as part of CAudio be something that you can easily rip out and put to good use in a more useful way. So, the design focuses on making it easy to control, capture, and retrieve notifications. What you do with the notification information is up to you. We introduce the CNotificationManager class to control notifications. It handles all the busywork so that the tasks of enabling and processing notifications are as simple as possible. CNotificationManager can:
§ Enable and disable specific classes of notifications, including lyrics
§ Capture the notifications and store their information in an easy-to-read-and-use format
§ Provide easy retrieval of the notifications so the application can act on them with a minimum of additional code
Let's walk through the design.
Enabling Notifications DirectMusic's mechanism for classifying the notification categories relies on GUIDs, which are a bit unwieldy. In order to avoid that complexity throughout our code and also integrate lyric support in a clean way, we represent each notification category with a bit flag. We add one last bit flag to indicate a lyric. This allows us to have one DWORD represent the on/off state of all notification categories, including lyrics.
    #define NOTIFYENABLE_PERFORMANCE  1     // GUID_NOTIFICATION_PERFORMANCE
    #define NOTIFYENABLE_SEGMENT      2     // GUID_NOTIFICATION_SEGMENT
    #define NOTIFYENABLE_TIMESIG      4     // GUID_NOTIFICATION_MEASUREANDBEAT
    #define NOTIFYENABLE_CHORD        8     // GUID_NOTIFICATION_CHORD
    #define NOTIFYENABLE_COMMAND      0x10  // GUID_NOTIFICATION_COMMAND
    #define NOTIFYENABLE_RECOMPOSE    0x20  // GUID_NOTIFICATION_RECOMPOSE
    #define NOTIFYENABLE_LYRIC        0x40  // Lyric pmsg
CNotificationManager has two routines that take these flags to enable and disable notifications. They are called, appropriately, EnableNotifications() and DisableNotifications(). EnableNotifications() takes one parameter, DWORD dwTypeFlags, which indicates which notification categories to turn on. For each bit set, it calls the Performance's AddNotificationType() method with the appropriate GUID. Since lyrics are always on (there is no Performance command to enable or disable lyric support), EnableNotifications() does not have to call the Performance for lyrics; it only records the flag so that captured lyrics can be filtered later.
    void CNotificationManager::EnableNotifications(DWORD dwTypeFlags)
    {
        if (dwTypeFlags & NOTIFYENABLE_PERFORMANCE)
        {
            m_pAudio->GetPerformance()->AddNotificationType(
                GUID_NOTIFICATION_PERFORMANCE);
        }
        if (dwTypeFlags & NOTIFYENABLE_SEGMENT)
        {
            m_pAudio->GetPerformance()->AddNotificationType(
                GUID_NOTIFICATION_SEGMENT);
        }
        // ...and so on for all notification categories, with the
        // exception of lyrics, which are always on...

        // Now, set the flags so we remember what we enabled.
        m_dwEnabledTypes |= dwTypeFlags;
    }
Likewise, CNotificationManager::DisableNotifications() takes a DWORD with bits for all the notifications to turn off, and it calls RemoveNotificationType() with the appropriate GUID for each bit set. Along the same lines, we need a simpler system for tracking each of the individual notification subtypes. We track the different notifications with our own constants simply because this requires a lot less work than the GUID + subfield organization of the DirectMusic notifications. And, we can add lyric support under the same umbrella.
    typedef enum _NOTIFICATION_TYPE
    {
        NOTIFICATION_NONE          = 0,   // Empty.
        NOTIFICATION_SEGSTART      = 1,   // Segment started playback.
        NOTIFICATION_SEGEND        = 2,   // Segment finished.
        NOTIFICATION_SEGALMOSTEND  = 3,   // Segment about to finish.
        NOTIFICATION_SEGLOOP       = 4,   // Segment looped.
        NOTIFICATION_SEGABORT      = 5,   // Segment stopped prematurely.
        NOTIFICATION_PERFSTARTED   = 6,   // First Segment started.
        NOTIFICATION_PERFSTOPPED   = 7,   // Last Segment ended.
        NOTIFICATION_PERFALMOSTEND = 8,   // Current primary Segment almost ended.
        NOTIFICATION_MEASUREBEAT   = 9,   // Beat marker.
        NOTIFICATION_CHORD         = 10,  // Chord changed.
        NOTIFICATION_GROOVE        = 11,  // Groove level changed.
        NOTIFICATION_EMBELLISHMENT = 12,  // Embellishment played.
        NOTIFICATION_RECOMPOSE     = 13,  // Track recomposed.
        NOTIFICATION_LYRIC         = 14   // Lyric.
    } NOTIFICATION_TYPE;
CNotificationManager has a method, TranslateType(), that reads a pmsg and returns a NOTIFICATION_TYPE constant, which can then be used to track the notification in a more friendly way.
    NOTIFICATION_TYPE CNotificationManager::TranslateType(DMUS_PMSG *pMsg)
    {
        if (pMsg->dwType == DMUS_PMSGT_LYRIC)
        {
            return NOTIFICATION_LYRIC;
        }
        else if (pMsg->dwType == DMUS_PMSGT_NOTIFICATION)
        {
            DMUS_NOTIFICATION_PMSG *pNotify = (DMUS_NOTIFICATION_PMSG *) pMsg;
            DWORD dwNotificationOption = pNotify->dwNotificationOption;
            if (pNotify->guidNotificationType == GUID_NOTIFICATION_PERFORMANCE)
            {
                if (dwNotificationOption == DMUS_NOTIFICATION_MUSICSTARTED)
                {
                    return NOTIFICATION_PERFSTARTED;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_MUSICALMOSTEND)
                {
                    return NOTIFICATION_PERFALMOSTEND;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_MUSICSTOPPED)
                {
                    return NOTIFICATION_PERFSTOPPED;
                }
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_SEGMENT)
            {
                if (dwNotificationOption == DMUS_NOTIFICATION_SEGSTART)
                {
                    return NOTIFICATION_SEGSTART;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_SEGALMOSTEND)
                {
                    return NOTIFICATION_SEGALMOSTEND;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_SEGEND)
                {
                    return NOTIFICATION_SEGEND;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_SEGABORT)
                {
                    return NOTIFICATION_SEGABORT;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_SEGLOOP)
                {
                    return NOTIFICATION_SEGLOOP;
                }
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_MEASUREANDBEAT)
            {
                return NOTIFICATION_MEASUREBEAT;
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_CHORD)
            {
                return NOTIFICATION_CHORD;
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_COMMAND)
            {
                if (dwNotificationOption == DMUS_NOTIFICATION_GROOVE)
                {
                    return NOTIFICATION_GROOVE;
                }
                else if (dwNotificationOption == DMUS_NOTIFICATION_EMBELLISHMENT)
                {
                    return NOTIFICATION_EMBELLISHMENT;
                }
            }
            else if (pNotify->guidNotificationType == GUID_NOTIFICATION_RECOMPOSE)
            {
                return NOTIFICATION_RECOMPOSE;
            }
        }
        return NOTIFICATION_NONE;
    }
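Returning briefly to EnableNotifications() and DisableNotifications(): because each flag maps to exactly one category, the repeated if blocks can also be driven by a small table, and one loop can then serve both directions. The sketch below illustrates that idea under stated assumptions and is not the Jones source. FakePerformance is a hypothetical stand-in that merely records calls, category names are plain strings instead of GUIDs, and UpdateNotifications() is an invented helper that returns the new enabled mask (the role m_dwEnabledTypes plays in CNotificationManager).

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned long DWORD;   // Stand-in for the Windows typedef.

const DWORD NOTIFYENABLE_PERFORMANCE = 1;
const DWORD NOTIFYENABLE_SEGMENT     = 2;
const DWORD NOTIFYENABLE_TIMESIG     = 4;
const DWORD NOTIFYENABLE_LYRIC       = 0x40;

// Hypothetical stand-in for the Performance: it only records the calls.
struct FakePerformance
{
    std::vector<std::string> added;
    std::vector<std::string> removed;
    void AddNotificationType(const char *category)    { added.push_back(category); }
    void RemoveNotificationType(const char *category) { removed.push_back(category); }
};

// One row per flag that has a matching DirectMusic category.
// Lyrics have no row because the Performance cannot switch them on or off.
struct FlagEntry
{
    DWORD       flag;
    const char *category;   // A GUID in real code; a string here.
};

static const FlagEntry g_flagTable[] =
{
    { NOTIFYENABLE_PERFORMANCE, "GUID_NOTIFICATION_PERFORMANCE"    },
    { NOTIFYENABLE_SEGMENT,     "GUID_NOTIFICATION_SEGMENT"        },
    { NOTIFYENABLE_TIMESIG,     "GUID_NOTIFICATION_MEASUREANDBEAT" },
};

// Apply every bit in dwTypeFlags and return the updated enabled mask.
inline DWORD UpdateNotifications(FakePerformance &perf, DWORD dwEnabled,
                                 DWORD dwTypeFlags, bool fEnable)
{
    const std::size_t count = sizeof(g_flagTable) / sizeof(g_flagTable[0]);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (dwTypeFlags & g_flagTable[i].flag)
        {
            if (fEnable) perf.AddNotificationType(g_flagTable[i].category);
            else         perf.RemoveNotificationType(g_flagTable[i].category);
        }
    }
    // Remember what is on, including the lyric bit, which has no table row.
    return fEnable ? (dwEnabled | dwTypeFlags) : (dwEnabled & ~dwTypeFlags);
}
```

The table form trades a little indirection for the guarantee that enabling and disabling always cover the same set of categories.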
Capturing Notifications To capture both notifications and lyrics, CNotificationManager implements an IDirectMusicTool interface so it can insert itself directly in the Performance and intercept DMUS_NOTIFICATION_PMSG and DMUS_LYRIC_PMSG messages. To make this possible, we base the CNotificationManager class on the IDirectMusicTool interface. CNotificationManager then provides its own implementation of all the IDirectMusicTool methods.
    class CNotificationManager : public CMyList, public IDirectMusicTool
    {
    public:
        // IUnknown methods.
        STDMETHODIMP QueryInterface(const IID &iid, void **ppv);
        STDMETHODIMP_(ULONG) AddRef();
        STDMETHODIMP_(ULONG) Release();
        // IDirectMusicTool methods.
        STDMETHODIMP Init(IDirectMusicGraph* pGraph);
        STDMETHODIMP GetMsgDeliveryType(DWORD* pdwDeliveryType);
        STDMETHODIMP GetMediaTypeArraySize(DWORD* pdwNumElements);
        STDMETHODIMP GetMediaTypes(DWORD** padwMediaTypes, DWORD dwNumElements);
        STDMETHODIMP ProcessPMsg(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG);
        STDMETHODIMP Flush(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG,
                           REFERENCE_TIME rtTime);
Most of the methods handle the process of setting up the Tool. When it is installing the Tool, DirectMusic calls Init(), GetMsgDeliveryType(), GetMediaTypeArraySize(), and GetMediaTypes(). Depending on what the Tool needs to do, it can implement these as needed. Init() allows the Tool to initialize its internal structures at the time DirectMusic places the Tool in a Graph. For our purposes, there is nothing to do here, so the method simply returns S_OK.
    HRESULT CNotificationManager::Init(IDirectMusicGraph* pGraph)
    {
        // Don't need to do anything at initialization.
        return S_OK;
    }
GetMsgDeliveryType() tells DirectMusic when to deliver the messages. In this case, the Tool wants the messages delivered at queue time.
    HRESULT CNotificationManager::GetMsgDeliveryType(DWORD* pdwDeliveryType)
    {
        // This Tool should process just a tad before the time stamp.
        *pdwDeliveryType = DMUS_PMSGF_TOOL_QUEUE;
        return S_OK;
    }
DirectMusic also needs to know which message types to send through this Tool. Although a Tool can elect to simply ignore this and receive all pmsgs, doing so is a bad idea for two reasons. First, receiving all pmsgs incurs extra overhead (admittedly minimal, but still…) to process everything and not just the messages we care about. Second, it causes all messages to be held up by this delivery time before they can proceed to the next stage. With a delivery of DMUS_PMSGF_TOOL_QUEUE, the timing is right up against the edge. If the delivery option becomes DMUS_PMSGF_TOOL_ATTIME, we are in trouble because note and wave messages end up delivered to the DirectMusic Synth too late, resulting in horribly stuttered timing. Therefore, it is best to implement the pair of methods GetMediaTypeArraySize(), which specifies the number of message types, and GetMediaTypes(), which then copies over the list of message types.
    HRESULT CNotificationManager::GetMediaTypeArraySize(DWORD* pdwNumElements)
    {
        // We have two media types to process: lyrics and notifications.
        *pdwNumElements = 2;
        return S_OK;
    }
    HRESULT CNotificationManager::GetMediaTypes(DWORD** padwMediaTypes, DWORD dwNumElements)
    {
        // Return the two types we handle.
        DWORD *pdwArray = *padwMediaTypes;
        pdwArray[0] = DMUS_PMSGT_LYRIC;
        pdwArray[1] = DMUS_PMSGT_NOTIFICATION;
        return S_OK;
    }
As we have seen several times, the heart of a Tool is really its ProcessPMsg() method. This is the code that receives the pmsg and does something with it. In this case, it doesn't do too much. Our code turns around and calls an internal method, InsertNotification(), and then tells DirectMusic to discard the pmsg. We discuss InsertNotification() next.
    HRESULT CNotificationManager::ProcessPMsg(IDirectMusicPerformance* pPerf, DMUS_PMSG* pPMSG)
    {
        // This should always be a lyric or notification,
        // but check just to be safe.
        if ((pPMSG->dwType == DMUS_PMSGT_LYRIC) ||
            (pPMSG->dwType == DMUS_PMSGT_NOTIFICATION))
        {
            InsertNotification(pPMSG);
        }
        // No need to send this on down the line.
        return DMUS_S_FREE;
    }
InsertNotification() provides the glue to convert the message into a notification event that can be retrieved asynchronously by the main loop. It takes the pmsg and uses it to create a CNotification object, which it places in its queue. Since the queue itself can be accessed either from the Tool processing thread or the main loop thread, this code must be protected with a critical section.
    void CNotificationManager::InsertNotification(DMUS_PMSG *pMsg)
    {
        // TranslateType reads the pmsg and figures out the
        // appropriate NOTIFICATION_TYPE for it.
        NOTIFICATION_TYPE ntType = TranslateType(pMsg);
        // If this is a lyric, it still might not be enabled
        // for capture, so check first.
        if ((ntType != NOTIFICATION_LYRIC) ||
            (m_dwEnabledTypes & NOTIFYENABLE_LYRIC))
        {
            // Allocate a CNotification object to store the
            // notification information.
            CNotification *pNotification = new CNotification;
            if (pNotification)
            {
                // The Init() routine fills the CNotification with
                // name, type, and optional object pointer.
                pNotification->Init(pMsg, ntType);
                // Since this can be called from a different thread,
                // we need to be safe with a critical section.
                EnterCriticalSection(&m_CriticalSection);
                AddTail(pNotification);
                LeaveCriticalSection(&m_CriticalSection);
            }
        }
    }
The CNotification class tracks one instance of a notification that has been captured on playback. CNotification is based on the class CMyNode, so it can be managed in a linked list. CNotification stores several parameters that it fills when it translates the original pmsg:
§ A name, m_szName: If a lyric, this stores the actual lyric text. If a performance notification, it stores a description of the notification.
§ The notification type, m_ntType: This is an identifier from the enumerated set of NOTIFICATION_TYPEs.
§ An optional pointer to an object, m_pObject: Some performance notifications include an IUnknown pointer to an object. The CNotification maintains its own internal reference to the object and releases it when deleted.
§ An optional data field, m_dwData: If a performance notification uses one of the data fields in the message, it is copied here.
§ The time stamp of the notification, m_rtTimeStamp: This is the exact time that the action that the notification represents will occur. Since the notifications are delivered slightly ahead of time, retaining the exact intended time is important.
    class CNotification : public CMyNode
    {
    public:
        CNotification();
        ~CNotification();
        CNotification* GetNext() { return (CNotification*)CMyNode::GetNext(); }
        void Init(DMUS_PMSG *pMsg, NOTIFICATION_TYPE ntType);
        char *GetName() { return m_szName; }
        void GetTimeStamp(REFERENCE_TIME *prtTime) { *prtTime = m_rtTimeStamp; }
        NOTIFICATION_TYPE GetType() { return m_ntType; }
        IUnknown * GetObject() { return m_pObject; }   // Note: This doesn't AddRef!
        DWORD GetData() { return m_dwData; }
    private:
        char              m_szName[40];    // Name for notification.
        NOTIFICATION_TYPE m_ntType;        // Notification type.
        REFERENCE_TIME    m_rtTimeStamp;   // Time the notification occurred.
        IUnknown *        m_pObject;       // Optional object pointer.
        DWORD             m_dwData;        // Data field, borrowed from pmsg.
    };
The only substantial method, Init(), takes a DMUS_PMSG that is either a lyric or notification and uses it to initialize its fields. All of the other methods return data from the fields.
Retrieving Notifications

As the CNotificationManager captures notifications, it stores them in its internal list. The application can then call CNotificationManager::GetNextNotification() to retrieve the oldest CNotification in the queue. When done with a retrieved CNotification, the application must delete it.

    CNotification *pNotification = pManager->GetNextNotification();
    if (pNotification)
    {
        // Do something.
        delete pNotification;
    }

Now, let's look and see how CAudio integrates CNotificationManager.
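The handoff between the capture thread and the application thread can be tried out away from DirectMusic. The following is a minimal sketch, not the CNotificationManager source: Notification and NotificationQueue are hypothetical stand-ins, std::mutex replaces the Win32 critical section, std::deque replaces the linked list, and std::unique_ptr replaces the manual delete. It also assumes that retrieval takes the same lock as capture.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for CNotification: just a name and a time stamp.
struct Notification {
    std::string name;
    long long   timeStamp;
};

// Hypothetical stand-in for the queue inside CNotificationManager.
class NotificationQueue {
public:
    // Called from the capture side, possibly on another thread.
    void Add(std::unique_ptr<Notification> n) {
        assert(n != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push_back(std::move(n));
    }

    // Returns the oldest notification, or nullptr if the queue is empty.
    // Ownership passes to the caller, mirroring "the application must delete it."
    std::unique_ptr<Notification> GetNext() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty()) {
            return nullptr;
        }
        std::unique_ptr<Notification> n = std::move(m_items.front());
        m_items.pop_front();
        return n;
    }

private:
    std::mutex m_mutex;
    std::deque<std::unique_ptr<Notification>> m_items;
};
```

Returning a smart pointer makes the ownership transfer explicit, which is the part of the original contract that is easiest to forget.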
CAudio Integration

For notification support, CAudio adds a pointer to a CNotificationManager and two methods to access it:

    public:
        // Methods for managing notifications.
        CNotification *GetNextNotification();   // Returns next notification in queue.
        void EnableNotifications(DWORD dwType,bool fEnable);
    private:
        // CNotificationManager captures and queues notifications.
        CNotificationManager * m_pNotifications;
CAudio does not create the CNotificationManager by default. Since it is only needed in the event that the application wants to track notifications, CAudio creates its CNotificationManager when the application calls EnableNotifications() the first time. At that point, it creates an instance of CNotificationManager and places it as a Tool in the Performance's Tool Graph. The CNotificationManager object can be manipulated as a Tool because it exposes the IDirectMusicTool interface. Once CAudio has installed CNotificationManager, calls to CAudio::EnableNotifications() simply pass through to CNotificationManager::EnableNotifications() and CNotificationManager::DisableNotifications().

Note that since CNotificationManager is managed as a COM object, the last object to Release() it causes it to go away. Even CAudio's pointer to CNotificationManager is treated as a reference.

    void CAudio::EnableNotifications(DWORD dwType, bool fEnable)
    {
        if (!m_pNotifications)
        {
            IDirectMusicAudioPath *pPath;
            if (SUCCEEDED(m_pPerformance->GetDefaultAudioPath(&pPath)))
            {
                IDirectMusicGraph *pGraph = NULL;
                pPath->GetObjectInPath(0,
                    DMUS_PATH_PERFORMANCE_GRAPH,  // The Performance Tool Graph.
                    0,
                    GUID_All_Objects,0,           // Only one object type.
                    IID_IDirectMusicGraph,        // The Graph interface.
                    (void **)&pGraph);
                if (pGraph)
                {
                    m_pNotifications = new CNotificationManager(this);
                    if (m_pNotifications)
                    {
                        // Insert it in the Graph. It will process all lyric and
                        // notification pmsgs that are played on this Performance.
                        pGraph->InsertTool(
                            static_cast<IDirectMusicTool*>(m_pNotifications),
                            NULL,0,0);
                    }
                    pGraph->Release();
                }
                pPath->Release();
            }
        }
        // Okay, hopefully we have a CNotificationManager.
        if (m_pNotifications)
        {
            // If enabling or disabling, choose the appropriate routine.
            if (fEnable)
            {
                m_pNotifications->EnableNotifications(dwType);
            }
            else
            {
                m_pNotifications->DisableNotifications(dwType);
            }
        }
    }

CAudio::GetNextNotification() simply passes through to CNotificationManager::GetNextNotification().

    CNotification *CAudio::GetNextNotification()
    {
        if (m_pNotifications)
        {
            return m_pNotifications->GetNextNotification();
        }
        return NULL;
    }
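The create-on-first-use pattern and the enable/disable bit mask can be seen in isolation in the sketch below. Everything in it is a hypothetical stand-in: Audio and Manager only imitate the roles of CAudio and CNotificationManager, the flag values are invented for illustration (the real NOTIFYENABLE_* values are not shown here), and the COM reference counting and Tool Graph insertion are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical flag values, for illustration only.
const std::uint32_t kEnableLyric   = 1u << 0;
const std::uint32_t kEnableSegment = 1u << 1;

// Stand-in for CNotificationManager: tracks which categories are enabled.
class Manager {
public:
    void Enable(std::uint32_t types)  { m_enabled |= types; }
    void Disable(std::uint32_t types) { m_enabled &= ~types; }
    bool IsEnabled(std::uint32_t type) const { return (m_enabled & type) != 0; }

private:
    std::uint32_t m_enabled = 0;
};

// Stand-in for CAudio: creates the manager only on the first request.
class Audio {
public:
    void EnableNotifications(std::uint32_t types, bool enable) {
        assert(types != 0);
        if (!m_manager) {
            m_manager.reset(new Manager);   // First call: create lazily.
        }
        if (enable) {
            m_manager->Enable(types);
        } else {
            m_manager->Disable(types);
        }
    }

    bool HasManager() const { return m_manager != nullptr; }

    bool IsEnabled(std::uint32_t type) const {
        return m_manager && m_manager->IsEnabled(type);
    }

private:
    std::unique_ptr<Manager> m_manager;
};
```

An application that never asks for notifications never pays for the manager, which is the point of deferring its creation.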
Finally, CAudio::Close(), which the application calls when it is finished with CAudio, releases the CNotificationManager.

    if (m_pNotifications)
    {
        m_pNotifications->Release();
        m_pNotifications = NULL;
    }

To use the notifications, Jones works a little differently than a typical application might. Jones displays notifications instead of using notifications to trigger changes in behavior. Jones keeps the most recent eight CNotifications in a queue so that it can continuously display them in its notification display. The following code is called every frame (in Jones, this is every 30 milliseconds). It reads any new notifications from the Performance and places them in its own queue. It then culls the queue to keep it limited to eight. If the queue changed, it redraws the display.

    bool fChanged = false;
    if (m_pAudio)
    {
        CNotification *pNotif;
        // Pull in new notifications and add them to the list.
        while (pNotif = m_pAudio->GetNextNotification())
        {
            m_NotificationList.AddTail(pNotif);
            fChanged = true;
        }
        // We want to keep a list of the last eight notifications for drawing.
        // So, if the list is longer than eight, remove the items at the top.
        while (m_NotificationList.GetCount() > 8)
        {
            pNotif = m_NotificationList.RemoveHead();
            delete pNotif; // Just delete the CNotification when done with it.
            fChanged = true;
        }
    }
    if (fChanged)
    {
        // We can draw the new list in the display.
    }
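The "append, then cull to the newest eight" step is easy to check on its own. This is a small sketch under stated assumptions: UpdateHistory is a hypothetical helper, plain strings stand in for CNotification objects, and std::deque stands in for the Jones list class.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Appends newly arrived items to history and trims it to the newest maxCount.
// Returns true if history changed, so the caller knows to redraw.
bool UpdateHistory(std::deque<std::string>& history,
                   const std::vector<std::string>& arrivals,
                   std::size_t maxCount) {
    assert(maxCount > 0);
    bool changed = false;
    for (const std::string& item : arrivals) {
        history.push_back(item);
        changed = true;
    }
    while (history.size() > maxCount) {
        history.pop_front();   // Oldest entries sit at the front.
        changed = true;
    }
    return changed;
}
```

Calling it once per frame with whatever arrived since the last frame gives the same behavior as the loop above: a frame with no arrivals returns false and skips the redraw.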
Case Study: Return to Begonia's Amazing Dance Moves

Okay, since Jones' use of CNotificationManager is a little nonstandard, let's take a look at how the Begonia dev team might use it. This is closer to how you might use it in your application. First, determine which notifications to allow with a call to EnableNotifications().

    m_pAudio->EnableNotifications(
        NOTIFYENABLE_LYRIC |          // Enable lyrics
        NOTIFYENABLE_PERFORMANCE |    // Enable Performance
        NOTIFYENABLE_SEGMENT |        // Enable Segment
        NOTIFYENABLE_TIMESIG,true);   // Enable time signature

This activates capture of just the notification categories that we care about. Then, we need code to read and process notifications. This code should be called once per frame. It reads the current notifications and responds to them as appropriate by setting global state variables. The code is pretty self-explanatory:

    CNotification *pNotif;
    while (pNotif = m_pAudio->GetNextNotification())
    {
        switch (pNotif->GetType())
        {
        case NOTIFICATION_LYRIC :
            if (!strcmp(pNotif->GetName(),"Agony Over"))
            {
                g_dwBegoniaMood = BM_UPBEAT;
            }
            else if (!strcmp(pNotif->GetName(),"Shout"))
            {
                g_dwBegoniaMood = BM_WHO_ME;
            }
            break;
        case NOTIFICATION_PERFALMOSTEND :
            g_dwMusicState = MS_EXPIRING;
            break;
        case NOTIFICATION_SEGSTART :
            if (pNotif->GetObject() == g_pDanceMusic)
            {
                g_dwBegoniaMove = BD_START;
            }
            break;
        case NOTIFICATION_MEASUREBEAT :
            if (pNotif->GetData() == 2)
            {
                g_dwBegoniaMove = BD_JUMP;
            }
            else
            {
                g_dwBegoniaMove = BD_WIGGLE;
            }
            break;
        }
        // Done with the notification.
        delete pNotif;
    }

Until now, we've focused on DirectX Audio techniques for building a sound and music environment that responds effectively and realistically to user and game stimuli. However, for the best possible experience, the control should go both ways, and in some cases audio should drive gameplay and visuals. With notification support, we have the tools to do so.
Unit III: DirectX Audio Case Studies

Chapter List
Chapter 15: A DirectMusic Case Study for Russian Squares for Windows XP Plus Pack
Chapter 16: A DirectMusic Case Study for No One Lives Forever
Chapter 17: A DirectSound Case Study for Halo
Chapter 18: A DirectMusic Case Study for Interactive Music on the Web
Chapter 19: A DirectMusic Case Study for Worms Blast
Chapter 20: A DirectMusic Case Study for Asheron's Call 2: The Fallen Kings
Chapter 21: Beyond Games: Bringing DirectMusic into the Living Room
Chapter 15: A DirectMusic Case Study for Russian Squares for Windows XP Plus Pack

Guy Whitmore
Introduction and Overview

Russian Squares is a puzzle game involving rows and columns of squares. The player eliminates rows of squares by matching the square's color or shape. The game's difficulty increases as the player eliminates the rows of squares; the time given decreases and "blockers" are introduced to get in the player's way. It sounds simple enough, but the game is both challenging and extremely addictive! The adaptive audio design of Russian Squares offers a wonderful opportunity to demonstrate how an adaptive score works seamlessly with core game design elements.

When creating a musical score for a game, start by determining what the game calls for. I approach each game without any particular adaptive technique in mind and brainstorm in the abstract. This way, creative ideas lead to technical decisions. Once I have an abstract idea of how the score will work, the engineering team and I figure out how to execute the adaptive music system with the game engine. I believe that an adaptive audio design document, which should complement or even be a part of the game design document, is an important step in creating a highly adaptive score.
Adaptive Elements

The adaptive elements of a game determine the form of an adaptive score. An important first step is identifying potential adaptive elements that tie gameplay to the music. Start by asking questions about the gameplay. What is the core gameplay element? How does it function? What are secondary gameplay elements? In what ways can I link the music to this gameplay?
Music Cells

The core gameplay of Russian Squares is the elimination and addition of rows of squares. We made this game design element the core of the music functionality. As the game adds rows or the player eliminates them, the music responds with a subtle change; the adaptive music system adds or subtracts an instrument, the harmony or rhythm changes, etc. The music follows the overall pace of the player. To accomplish this, there are about 50 music cells per composition, which correlate to different groove levels. As the player completes rows, the music incrementally transitions to the next groove level. Logical transition boundaries make those transitions musical and seamless. I used mainly measure boundaries for transitions in Russian Squares.

I used music intros, ends, and breaks to kick off and end gameplay. The music transitions to an ambient break section when the player pauses the game. The adaptive music system plays the appropriate intros and ends based on the current music cell.
Variation
Russian Squares uses instrument-level variation to keep the individual cells from getting monotonous. Each groove level is anywhere from two to eight measures in length and repeats as the player works on a given row. Within DirectMusic, each instrument can have up to 32 variations. When combined with other instruments, these variations multiply, so the number of possible combinations grows exponentially with the number of instruments that use variation. Most often, one to three instruments per cell use variation, and that gives the music an organic, spontaneous feel and prevents that all-too-familiar loopy feeling. Too much variation, however, can unglue the music's cohesion.
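To put numbers on how quickly combinations pile up, here is a small worked sketch. CountCombinations is a hypothetical helper, and it assumes each varying instrument picks its variation independently of the others, which is a simplification of how variation choices can be configured.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of distinct combinations when each varying instrument
// independently picks one of its variations.
std::uint64_t CountCombinations(const std::vector<unsigned>& variationsPerPart) {
    std::uint64_t total = 1;
    for (unsigned v : variationsPerPart) {
        assert(v >= 1 && v <= 32);   // Up to 32 variations per instrument.
        total *= v;
    }
    return total;
}
```

With one instrument using all 32 variations there are 32 possible renderings of a cell; with three such instruments there are 32 x 32 x 32 = 32,768. That is why one to three varying instruments per cell is already plenty.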
Music Layers

Layering short musical gestures over the currently playing groove level accents many gameplay elements, such as the game clock, the clearing and adding of rows, and blockers. I use DirectMusic motifs and secondary Segments to accomplish this. These elements line up rhythmically with the music (for example, the tempo of the clocks synchronizes with quarter note values of the music).

Sounds

DLS-2 banks allowed for maximum flexibility and variation in this project. Some of the instrument variations control instrument filters (for example, low pass filter sweeps) in addition to notes; these filter effects are crucial for the genre of music. The sounds used are a combination of Sonic Implants (www.sonicimplants.com) and custom banks.
Cooperative Composing

Three other composers (Erik Aho, Mike Min, and Bryan Robbins) assisted in creating the three arrangements. Working with nonlinear cells made it easy to work simultaneously. Generally, we assigned a range of cells to each composer. Often, someone would start an idea and the next composer would elaborate on it or add variations. This method of working made sharing musical ideas easy and kept everything musically coherent.
Creating the Adaptive Audio Design

Russian Squares: How the Game Plays

Russian Squares is a fast-paced puzzle game that grows in intensity. The object is to eliminate rows of blocks by matching colors or shapes faster than the clock. Each level starts with a square made up of rows of blocks. The player controls an active square with a mouse or arrow keys. The active block can move around the perimeter of the square or push into the square, causing the square directly across from it to pop out and become the new active square. The first level's square starts with blocks set up six by six. If time runs out before the player matches a row, the game adds a row to the square. As the player eliminates rows, the square decreases in size (six by five, five by five, five by four, etc.) until the square disappears. This constitutes the completion of the first level. Each subsequent level begins with a larger square with more rows to eliminate against a faster clock.

There is another obstacle thrown in as the game progresses. Blockers are blocks within the square that cannot be moved. Blockers appear in the square for a few seconds, get in the way of the player's actions, and then disappear or switch to another location in the square. They definitely add a challenge and twist to the game.

There are nine levels in Russian Squares. A new game starts you on level three; this is because there is the potential for the player to move back a few levels! The progression of levels plays an important part in how I created the music and integrated it into the game.
Aural and Visual Themes

There are three visual themes for Russian Squares: Neon, Candy, and Shapes. The gameplay is identical in each, except that with Shapes the player matches block shapes instead of color. Each theme has a distinct aesthetic. Neon has a glowing green, high-tech feel. Candy has a light-hearted, silver look. Shapes portrays a darker red tone. My team wrote three pieces of adaptive music that correlate with each visual theme. Spin Cycle sounds edgy and techy to complement Neon, Bubble Matter floats on light synthesizer pads over drum and bass riffs for Candy, and the down-tempo groove of Gravity Ride matches the darkness of its visual counterpart, Shapes.
The Musical Form

Form, in a nonlinear piece of music, abides by the same basic principles as linear music. Dynamics, harmony, orchestration, and the arc of the form are critical elements for any composition. Linear music, however, follows a singular path created by the composer, while nonlinear or adaptive music chooses one of many potential paths. This does not mean that adaptive music is formless. Its form lies within the boundaries set for the music. Creating these boundaries presents a new set of challenges for game composers. For example, each potential outcome must make sense musically and adhere to the overall consistency of the piece.

It is important for an adaptive piece of music to adhere to an overarching form, despite its many paths. The composer builds a musical structure in which the adaptive elements are contained. For example, a snowflake has a form that describes its basic elements: size, temperature, weight, and its crystalline form. Every snowflake differs slightly, yet they are all snowflakes. Adaptive music functions in this way.

The overarching form of Russian Squares is one of increasing intensity. Playing all of the music cells (DirectMusic patterns) sequentially results in a gradual building of musical elements, yet the performances that the gameplay produces vary greatly. The adaptive element of the gameplay provides the dynamic ebb and flow of music, while the instrument-level variation adds further nonlinearity. The form takes into account the fact that the music cells will not occur sequentially but will slide up and down the many groove levels. Each groove level/pattern usually has one or two musical differences from its closest neighbor. That difference is an instrument addition/subtraction, an instrument change, a chord change, or a rhythmic shift. We arranged them in such a way that the music gradually moves from one large musical idea to another over the course of several groove levels. This gave the music a sense of form.
The tempo of each composition remains constant. We scrapped the whole idea of tempo changes for a couple of reasons. First, electronica does not usually feature gradual accelerando as a part of the form, so it sounded ridiculous. Second, it added some technical hurdles given our DirectMusic approach. The game would have to call tempo changes directly (with "increase tempo" or "decrease tempo" calls) or via controlling secondary Segments, which adds another adaptive design layer. Each style/pattern would need to function well within a range of tempos, and some of the DLS banks (for example, drum grooves) were built around a specific tempo. We kept this aspect simple by having only one tempo per composition.

To prevent it from sounding static, we used tricks such as rhythmic modulation to increase the music's sense of pulse. For example, the rhythm changes dramatically over five groove levels in Spin Cycle. The score achieves this change by introducing the new rhythm in one groove level and then gradually emphasizing it (while de-emphasizing the old rhythm) over the course of the next several groove levels. Each player experiences the shift at a different rate, depending on how well they play!
The Adaptive Goal and Core Objectives

The arc and progression of Russian Squares is one of growing intensity. However, it is not a straight linear growth of intensity through the levels. Each level begins calm and ends frantic, relatively speaking, but each successive level begins and ends slightly more intense than the last.
Figure 15-1: Intensity Range

Our first objective for the adaptive score was to mirror the progression of game levels. This approach involved creating a score that increased in intensity from the start to the end of each level (micro-adaptability), while each successive level began at a higher intensity than the last (macro-adaptability).

Within the game levels, we focus on the primary gameplay element of clearing rows. A musical change occurs each time the player clears a row, a reward that lets the player know he is on the right track. In addition, the change indicates the growing intensity as the rows clear. Running out of time and having a row added is the negative counterpart to clearing a row. The score reflects this state as well.

Russian Squares uses an additive-subtractive approach to the music, in which the music engine can add or peel away instrument layers. Given that the time it takes to clear a row can be anywhere from a few seconds to a few minutes, this is a very practical solution. It allows for musical development while maintaining direction and momentum. Implementing this technique has some complications, however. In terms of aesthetics, simply moving along a linear curve from low intensity to high intensity does not work, as adding layer after layer and increasing the tempo is not very musical. The music requires larger sections, instrumentation changes, rhythmic shifts, and tonal changes to stay interesting. In this way, the music progresses in a multidimensional manner. This concept is the backbone for the adaptive audio design of Russian Squares.

We mirror many other gameplay elements with musical gestures. Most of these are musical accents indicating important events as they occur in the game. DirectMusic motifs layer these accents over the main body of music. These motifs include row clear, row add, blocker, time running low, high score, and active square. We also use musical gestures for start game, end game, and pause game.
Another objective is creating three compositions that share the same core functionality. Three differing pieces of music react in the same manner to the gameplay, and their common adaptive framework allows them to be interchangeable. Having multiple compositions underscores the need to establish a solid adaptive framework prior to composing. Using quality instrument samples goes beyond aesthetics. Sounds also need to be flexible in an adaptive score. DLS instruments offer this flexibility because single note samples can react more quickly than prerecorded phrases.
Inherent Obstacles
The biggest trick was creating a score that could gradually increase or decrease in intensity at any time while creating a satisfying musical experience. How could we put such a score together? How would the layers function? How would transitions operate? The overall objective seemed simple enough, but just thinking about execution for a few moments brought up all kinds of tough questions. Moreover, digging into the production of the score unearthed many more! The music in Russian Squares needs to change quickly. Consider that it may take the player a minute or two to clear a row at the start of the game, yet toward the end of any level, the player can clear a row or the game can add one every few seconds. This poses a curious problem. If music changes every minute or so, you should use longer sections to better hold the listener's interest and avoid repetition. However, when the music changes every few seconds, it must not disrupt the aesthetic flow of the phrases. A successful adaptive score strikes a balance between the length of phrases, their transition boundaries, and the music's responsiveness to game changes. In the modern electronica genre, nothing can make the music cheesier than wimpy sounds. Electronica lives and dies by its grooves, fat synthesizer patches, and filter sweeps. It depends heavily on timbres. This demand for great sounds creates a challenge for the audio producer facing severe memory constraints. The last challenge we needed to overcome was repetition. Any game composer understands the challenge of avoiding undue repetition in game scores. Russian Squares was no exception. A player can easily spend a lot of time solving a given level. As the difficulty rises, rows are cleared, added, cleared, and added again, triggering the same music cells. Luckily, DirectMusic offered us many ways to deal with this and the rest of the challenges we faced during production of our adaptive score.
The DirectMusic Solution

After defining the music design goals for Russian Squares, I turned to DirectMusic. I thought of the music in terms of small "cells," or phrases, that correlated to rows in a level, with one cell per row. Given my experience with DirectMusic, I saw three possible solutions to the core design objective. We could use DirectMusic primary Segments as the main musical unit, so when a player cleared a row, a transition to a new Segment occurred. We could build the composition out of secondary Segments that layered onto or peeled away from the mix sequentially. Alternately, we could use Styles and their groove levels as the core musical unit; this way, the game transitioned up and down the groove levels as the player cleared rows or the game added them.

I experimented with the primary Segment solution. I thought Segments allowed for more flexibility with the form, such as tempo changes and longer, less repetitive phrases. However, our first few tests showed some glaring problems with this approach. It worked well when the music changed every minute or so, but when the music changed more often, it sounded a bit stilted or unnatural. In addition, the programmers told us that primary Segments would not respond as quickly or efficiently in the game as groove levels.

Next, I tried layering secondary Segments but quickly learned that this method was a logistical nightmare, both technically and musically. Getting more than three or four secondary Segments to line up musically and respond quickly to game calls was not practical. I also experimented with a muting/unmuting system using controlling secondary Segments to get the same effect, but again it did not turn out to be practical for this design.

We use Styles and groove levels in this application, as they respond quickly. Each groove level associates with a row. If the player clears a row, the game increases the groove level by one, and if the game adds a row, it decreases the groove level by one.
Therefore, the game engine slides the set of groove levels up and down as the game progresses. Each groove level consists of a musical phrase that is two to 12 measures long and transitions easily to the next groove level. We use measure boundaries as transition points between groove levels, as they allow for relatively quick transitions while remaining musical. There are 45 groove levels for each composition. Using the Style/groove level system, we take full advantage of DirectMusic's intros, ends, and breaks. Anytime a new game or new level begins, DirectMusic plays an appropriate intro leading into the appropriate groove level. When the game ends or the player completes a level, the music seamlessly plays an ending. When the game pauses, a DirectMusic break plays until the player resumes the game. All of this gives the game score a polished and professional feel. We use secondary Segments and layer them over the groove levels for the musical accents that emphasize specific game events: row cleared, row added, blockers, and the clock running out of time. Even though the groove level changes when the player clears a row or the game adds one, we want a consistent underscore that occurs simultaneously with the row change. Another secondary Segment plays every time a blocker appears in the game. These secondary Segments play on the grid boundary (in this case, the next sixteenth note) of the underlying groove level so that they sound virtually instantaneous, yet rhythmically line up with the music. As time is running out on the clock (ten seconds to go), a looping secondary Segment plays until time runs out or the player clears the row. This Segment begins on the beat boundary so that the "tick tock" sound lines up with the beat of the groove level. Each composition uses a different sound set for these secondary Segments. This way, all the sounds are integral to each composition and add coherence. 
For example, the clock for Spin Cycle is a buzzer sound; for Bubble Matter, it's a wooden "tick tock"; Gravity Ride uses the clank of metal spikes to mark time.
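The difference between grid, beat, and measure boundaries comes down to how far ahead the next legal start point can be. In real DirectMusic code this is normally requested with the boundary flags on PlaySegmentEx() (DMUS_SEGF_GRID, DMUS_SEGF_BEAT, DMUS_SEGF_MEASURE) rather than computed by hand. The sketch below only illustrates the arithmetic: the names are hypothetical, and it assumes 4/4 time with four grids per beat, counted in music-time ticks at 768 per quarter note (the value of DMUS_PPQ).

```cpp
#include <cassert>

// Music-time resolution: 768 ticks per quarter note (the value of DMUS_PPQ).
const long kTicksPerQuarter = 768;

enum Boundary { kGrid, kBeat, kMeasure };

// Length of one boundary unit in ticks, assuming 4/4 with four grids per beat.
long BoundaryLength(Boundary b) {
    switch (b) {
        case kGrid:    return kTicksPerQuarter / 4;   // Sixteenth note.
        case kBeat:    return kTicksPerQuarter;       // Quarter note.
        case kMeasure: return kTicksPerQuarter * 4;   // One bar of 4/4.
    }
    return kTicksPerQuarter;
}

// First boundary at or after 'now' (ticks since the groove level started).
long NextBoundary(long now, Boundary b) {
    assert(now >= 0);
    const long unit = BoundaryLength(b);
    return ((now + unit - 1) / unit) * unit;   // Round up to a multiple of unit.
}
```

A request arriving at tick 1000 would start at tick 1152 on the grid, 1536 on the beat, and 3072 on the measure. The grid wait is always under a sixteenth note, which is why the accents sound nearly instantaneous, while groove level changes on the measure trade some responsiveness for musical phrasing.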
Harmonic Considerations

For this game, I chose not to take advantage of DirectMusic's adaptive harmonic abilities, such as ChordMaps or Chord Tracks. The simple harmonic structure of the music did not merit the extra production time that it would have taken to make it function well within the design. Frequent or complex harmonic and chordal shifts did not fit the aesthetic of electronica music. I created the music's harmonic or chordal changes directly in the pattern editor. Harmonic changes occur when moving from one groove level to the next, which correlates to a row being cleared or added.
The Production Process

I kicked off the production process by gathering the initial DLS sounds for each composition. Then, setting the tone and main themes for the pieces, I divided the rest of the composing between my team and myself. Rather than having one person assigned to each composition, we all worked on every song. Each composer was responsible for a range of groove levels in each piece. I periodically merged all of our work so that we were aware of each other's progress. Toward the end, I polished and mixed the pieces, ensuring coherence.
Gathering, Creating, and Assembling DLS Banks

I find working in DirectMusic's DLS creation environment to be straightforward. Its interface, while not perfect, is similar to other sample-bank creation software. The process consists of importing wave samples, creating an instrument, and assigning samples to regions across the keyboard. The DLS Level 2 standard allows for the layering of samples and adds basic low pass filtering.

The trick to good sound is not in the tool but in the experimentation process. Many composers like to get right to the notes and use whatever sounds are available. This can lead to a generic-sounding piece of music. Creating sounds with a specific piece of music in mind gives the sounds immediate relevance. There is a lot of back-and-forth between tweaking a sound and trying it in the context of the music. The process boils down to a few key points:

§ Quality samples
§ Instrument layers
§ Real-time filters
§ Good use of the ADSR amplitude envelope
§ Efficient use of samples
§ Real-time effects, such as reverb, delay, etc.

We created an individual sound set for each piece of music. We honed each sound set to a specific composition, giving each piece a unique sound. The development team set aside 8MB for each song's sample set, for a total of 24MB of instruments. In the end, we kept the sample rate at 44.1 kHz for pristine sound. The game loads all 24MB of instrument banks at the start so that the player can switch between songs without pauses.

We designed the sounds from scratch or used licensed sounds from Sonic Network (the instruments are called Sonic Implants) at www.sonicnetworkinc.com. Sonic Network has an excellent electronica collection, which served as our point of sonic departure. Each composition has its own rhythmic feel and flavor that the sound sets portray.
There are live drum loops diced into quarter note sections; these loops live in a DLS instrument, so they can play back in their original form, or we can mix and match them in a variety of rhythms. This loop-slicing technique is common to the electronica genre and allows a live drum feel with added flexibility and variation. For percussion we use standard drum machine sounds, such as Roland 808, Yamaha R8, and other obscure beat boxes. Hollywood Edge has a great collection of classic sounds called Beat Machine. I got specific permission to use these and other Hollywood Edge sounds in our DLS banks. I also used a Roland Jupiter synthesizer (a little white noise and a filter gave an industrial flavor to Gravity Ride) and some samples generated from percussive metal spike sounds.

We used samples of synthesizers as an integral element of the sound palette. We spent a lot of time building and layering synthetic instrument sounds to sound thick, rich, and mean when they needed to. Many of these instruments have three or four layers, which are panned and slightly detuned to add density to the sounds. I also took advantage of real-time filters as the primary musical gesture on both the lead synthesizer and the rhythmic synthesizer of Spin Cycle.
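To get a feel for what the 8MB-per-song budget mentioned earlier buys at 44.1 kHz, here is a rough worked sketch. The bit depth and channel count of the Russian Squares samples are not stated, so 16-bit mono uncompressed PCM is purely an assumption for illustration, and SecondsOfAudio is a hypothetical helper.

```cpp
#include <cassert>

// Seconds of uncompressed PCM audio that fit in a memory budget.
double SecondsOfAudio(double budgetBytes, double sampleRateHz,
                      int bytesPerSample, int channels) {
    assert(sampleRateHz > 0 && bytesPerSample > 0 && channels > 0);
    return budgetBytes / (sampleRateHz * bytesPerSample * channels);
}
```

Under those assumptions, 8MB (8 x 1024 x 1024 bytes) holds roughly 95 seconds of mono sample data, or about half that in stereo. That is not much for a whole sound set, which is why efficient use of samples appears on the list of key points.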
Setting Up Styles

We set up each of the three compositions identically in DirectMusic Producer. One style held all of the groove levels, the motifs, and the Band for a given piece. We set up the Band (within the Style) with the pchannels assigned to the custom DLS instrument patches after we created the Style. We then set the instrument's levels and panning for a rough mix, tweaking them during the composition process, and created our patterns.
The Band and Mixing

Each composition uses only one DirectMusic Band. I find constructing one Band, with all of the instruments assigned to different pchannels, easier than using multiple Bands and switching between them to change instrument sets. The unlimited number of available pchannels makes this practical and avoids confusion over pattern to Band assignments. It also simplifies the mixing process because it is accomplished within one Band. I maintain a rough mix via the Band while composing, making adjustments as I go.

One of the last stages of the production process is to finalize the mix by fine-tuning the Bands' parameters and setting the AudioPath appropriately. This is far from mixing on a hardware console. The process consists of listening through all the Style/patterns and setting the Band's pchannels when needed. This accomplishes what I call the "general" or "relative" mix. Continuous controller curves within pattern parts provide real-time dynamic instrument motion.
AudioPaths and Effects

DirectMusic AudioPaths offer real-time effect processing and routing. Reverb, echo, and compression add a sense of polish to the music of Russian Squares. Real-time reverb (as opposed to reverb mixed with instrument samples) brings all of the instruments into the same sonic space, creating a consistent-sounding mix. In addition, reverb helps smooth transitions from pattern to pattern. This is because the reverb tails from one pattern carry over into the next pattern, blurring the transition point. Echo enhances the feel of the drum and percussion tracks. A delay time equivalent to a dotted eighth note adds syncopation and helps the rhythms feel solid. Compression punches up certain instruments, such as the synthesized bass in Spin Cycle, making them sound more dense.

Each composition calls for different effect settings and therefore uses its own custom AudioPath. The number of mix groups in an AudioPath equals the number of effect settings needed. Using too many mix groups eats up processor power and takes longer to load. AudioPaths in Russian Squares use three or four mix groups. All of the pchannels are assigned to one of those mix groups, depending on the needed processing. One of the mix groups often remains dry (no effects) for instruments, such as bass parts, that do not call for processing.

AudioPath settings for Spin Cycle:

§ Mix group 1: no effects
§ Mix group 2: Waves reverb 1 (mix –17dB, reverb time 780ms)
§ Mix group 3: Waves reverb 2 (mix –5.7dB, reverb time 2640ms)
§ Mix group 4: Compression/Echo 1 (wet/dry mix 28, feedback 12, left delay 510, right delay 525)
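Matching an echo to a dotted eighth note is simple arithmetic once the tempo is known. The tempos of the Russian Squares compositions are not given here, so the sketch below takes the tempo as a parameter; NoteLengthMs and DottedEighthMs are hypothetical helper names.

```cpp
#include <cassert>

// Length in milliseconds of a note lasting 'quarterNotes' quarter notes.
double NoteLengthMs(double bpm, double quarterNotes) {
    assert(bpm > 0.0);
    return 60000.0 / bpm * quarterNotes;
}

// A dotted eighth is an eighth (0.5) plus half again (0.25): 0.75 quarter notes.
double DottedEighthMs(double bpm) {
    return NoteLengthMs(bpm, 0.75);
}
```

At a hypothetical 120 beats per minute a quarter note lasts 500 milliseconds, so a dotted eighth echo would be set to 375 milliseconds. Offsetting the left and right delays by a few milliseconds, as in the Spin Cycle settings above, widens the echo without disturbing the rhythm.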
Pattern Creation and Groove Level Assignments We assigned each pattern a single groove level number. Each composition has 35 patterns assigned to groove levels 10 through 45. We assigned a range of groove levels to each level of the game. The groove level ranges overlap from level to level. For example, we assigned a range of 20 to 27 to the third level of the game (which is where a "new game" starts), while we assigned a range of 24 to 32 to game level four. The player does not hear the highest groove level (i.e., 45) until the last rows of game level nine are being completed.
Patterns contained all the note and continuous controller information of the music. We assigned each pattern part a pchannel via the Band editor. We wrote, implemented, and arranged all of the music for Russian Squares within DirectMusic Producer's Pattern editor. Composing outside the realm of DirectMusic Producer was not practical in this case because we created the sounds specifically for the DLS Level 2 format and Microsoft software synthesizer. We did not have a good way to duplicate the sound set in another format. Despite the 60- to 80-millisecond delay of the software synthesizer, I performed some of the parts in real time using a MIDI keyboard that triggered its own piano sound as reference during recording. I composed much of the music via note-by-note entry using the piano roll interface of the Pattern editor. Copying and pasting similar pattern parts saved some time, but it was a slow process.
Continuous Controllers Motion within instrument parts adds depth of emotion to an otherwise static composition. Controllers 10 (pan), 11 (expression), and 74 (filter cutoff) breathe dynamic life into the Russian Squares music. We manually entered control curves or points into Pattern Part Continuous Controller tracks. Performing CC data in real time using a MIDI device was impractical and messy in DirectX 8. Performance latency rendered the data inaccurate. In addition, thinning the CC data after a Performance required tedious selecting and deleting. Continuous controller curves, while painstaking to insert and adjust, were much easier to edit than a multitude of individual CC data points (which is the result of any CC Performance). Continuous controller 11 (expression) handles any real-time movement of an instrument's volume. This allows the basic mix to remain solid, while instrument volume moves dynamically relative to its Band volume, which uses CC 7. In this way, CC 7 acts as a pchannel's master volume and CC 11 acts as its relative volume. CC 74 controls an instrument's filter cutoff, allowing for low pass filter sweeps. DLS instrument filter parameters respond if set up appropriately; you must check the Filter Enable box and turn the resonance slider. Spin Cycle's bass synthesizer settings are 18dB for filter resonance, and its initial cutoff frequency is 8,728 cents. Useful filter settings vary depending on the characteristics of the DLS instrument. To ensure good results, audition while testing various settings.
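The master/relative relationship between CC 7 and CC 11 can be sketched numerically. The example below assumes the response curve commonly recommended for MIDI and DLS synthesizers, in which each controller contributes 40·log10(value/127) dB; the chapter does not state which curve the Microsoft synthesizer applies, so treat the formula and the function name as assumptions.

```python
import math


def channel_gain_db(cc7: int, cc11: int) -> float:
    """Combined attenuation in dB for a channel volume (CC 7) and expression (CC 11).

    Assumes the commonly recommended 40*log10(value/127) curve per controller.
    """
    if cc7 == 0 or cc11 == 0:
        return float("-inf")          # either controller at zero silences the channel
    return 40.0 * math.log10((cc7 * cc11) / (127.0 * 127.0))


# The Band sets CC 7 once (the "master" level); CC 11 curves then move beneath it.
print(round(channel_gain_db(100, 127), 2))   # Band level only
print(round(channel_gain_db(100, 64), 2))    # same Band level, expression halved
```

Because the two terms simply add in dB, an expression curve never disturbs the relative mix set in the Band.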
Creating DirectMusic Embellishments: Intros, Ends, and Breaks DirectMusic intros occur when the player starts a game, at the beginning of each level, and when resuming a paused game. Ends occur when the player completes a level, the player quits, or the player wins a game. We use breaks whenever the player pauses the game. DirectMusic embellishments are created just as other patterns in a Style. They are simply tagged as an embellishment in the Pattern Properties window. The groove range indicates when an embellishment pattern is called. For instance, an intro pattern with a groove range of 1 to 17 plays if the music is to start from any of the first 17 groove levels. Likewise, an end pattern with a groove range of 10 to 20 will only play if the game calls an end while any groove from 10 to 20 is playing. This allows you to tailor multiple intros and ends for different patterns of music. For example, Spin Cycle utilizes six intros and five ends. Their groove ranges span all of the composition's groove levels from 10 to 45.
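The groove-range rule for embellishments can be sketched as a simple lookup. This is a minimal illustration, not DirectMusic's actual selection code: the pattern names are hypothetical, and returning the first match is an assumption made to keep the example short.

```python
from typing import List, NamedTuple, Optional


class Embellishment(NamedTuple):
    kind: str    # "intro", "end", or "break"
    low: int     # lowest groove level the pattern answers to
    high: int    # highest groove level the pattern answers to
    name: str    # hypothetical pattern name


def pick_embellishment(patterns: List[Embellishment], kind: str,
                       groove: int) -> Optional[Embellishment]:
    """Return the first embellishment of this kind whose range covers the groove level."""
    for pattern in patterns:
        if pattern.kind == kind and pattern.low <= groove <= pattern.high:
            return pattern
    return None   # no embellishment is tagged for this groove level


patterns = [
    Embellishment("intro", 1, 17, "IntroA"),
    Embellishment("intro", 18, 45, "IntroB"),
    Embellishment("end", 10, 20, "EndA"),
]
print(pick_embellishment(patterns, "intro", 12))
print(pick_embellishment(patterns, "end", 25))
```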
Segments Each piece of music utilizes only one primary Segment. That Segment contains five tracks: Tempo, Time Signature, Style, Groove, and Band. The Tempo Track is fixed, as are the Time Signature, Style, and Band Tracks. The Style is inserted at beat one measure one of the Style Track. When you are done with this, DirectMusic automatically inserts that Style's Band into the Band Track. The game controls the groove level of the Segment Groove Track by incrementing or decrementing the groove level when "clear row" or "add row" occurs. The primary Segment plays continually as its groove level moves up or down, thereby changing the Style/pattern currently playing. For example, as the game starts, the Segment plays groove level "Intro 20" and then moves to groove level 20. The groove level increases to 21 as the player clears a row and moves to 22 when the player clears the next row. Then groove level "Break 22" plays if the player pauses the game. The player ends the game, triggering groove level "End 22."
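The game-side logic described above amounts to a small counter. The sketch below is hypothetical: the class and method names are invented for illustration, and clamping the level to a per-level range (20 to 27 here) is an assumption, not a documented behavior of the game.

```python
class GrooveController:
    """Tracks the groove level that the game sends to the primary Segment."""

    def __init__(self, low: int, high: int) -> None:
        self.low, self.high = low, high
        self.level = low

    def start(self) -> str:
        return f"Intro {self.level}"                      # intro embellishment first

    def clear_row(self) -> int:
        self.level = min(self.level + 1, self.high)      # progress raises intensity
        return self.level

    def add_row(self) -> int:
        self.level = max(self.level - 1, self.low)       # regression lowers it
        return self.level

    def pause(self) -> str:
        return f"Break {self.level}"

    def end(self) -> str:
        return f"End {self.level}"


game = GrooveController(20, 27)
print(game.start(), game.clear_row(), game.clear_row(), game.pause(), game.end())
```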
Motifs and Secondary Segments Motifs in Russian Squares layer short musical gestures over each composition's primary Segment. Five motifs accent relevant game events: add row, clear row, blocker, clock, and high score. You create motif patterns just as other Styles/patterns are created, but they are stored within a Style's Motif folder. Each motif in Russian Squares uses only one or two pattern parts. This simple orchestration functions well over the main body of music, which is denser. Creating appropriate DLS instruments for each motif blends these gestures coherently into the overall arrangement. The game triggers these motifs frequently, so it is crucial that the timbres be convincing without sounding weak or obnoxious. A Performance boundary of grid, for clear row, add row, and blocker, ensures quick response while synchronizing up rhythmically (to the nearest sixteenth note) with the underlying music. Clock and high score motifs synchronize to the beat of their primary Segment (Performance boundary set to beat). The clear row and add row motifs act as instant feedback to the player. Each gesture's distinctiveness becomes a keynote sound to the player. The clear row motif rewards the player's progress, while the add row gesture indicates the player's regression. Each composition enlists a different set of sounds for these motifs. Synthesized pitch bend gestures indicate clear row (upward pitch bend) and add row (downward pitch bend) in Spin Cycle. Harp glissandos signal clear row and add row in Bubble Matter, while Gravity Ride uses a shaker and claves for those motifs. The blocker motif accents each theme aggressively with single staccato gestures. As with the other motifs, the instrument timbres mesh with the respective music themes. The clock motifs are the only looping motifs. They begin as the game clock is running out of time (it turns red) and end when a row is cleared or the clock runs out of time. 
This means that the clock motif always ends as a clear row or add row motif plays, which creates a nice bookend. Each composition uses a unique instrument for the sound of the clock: buzzer sound, wooden tick tock, and metal clanks, respectively. The high score motif rewards the player when a high score game is completed. These phrases are two or three measures long and overlap a composition's end pattern. High score motifs are simple one-instrument arpeggiated patterns that highlight the player's victory. Motifs are inserted into secondary Segments for playback by the game. (Motifs can be called independently of secondary Segments, but the programmer requested they be called via secondary Segments due to the calls available in DirectX Audio Scripting.) Each of these secondary Segments contains a Time Signature Track and a Segment Trigger Track. An insertion is made in the Segment Trigger Track, and the properties window allows for selection of the Style/motif. The game calls the secondary Segments, and they in turn play the motifs.
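The grid and beat boundaries used for the motifs can be pictured as rounding the request time up to the next allowed start point. The following sketch assumes a sixteenth-note grid (four grid steps per quarter-note beat) and 4/4 time; the function is illustrative and is not a DirectMusic API.

```python
import math

GRID_STEPS_PER_BEAT = 4   # assumption: sixteenth-note grid in a quarter-note beat


def next_boundary(now_beats: float, boundary: str, beats_per_measure: int = 4) -> float:
    """Earliest musical time (in beats) at or after now_beats that lies on the boundary."""
    step = {
        "grid": 1.0 / GRID_STEPS_PER_BEAT,
        "beat": 1.0,
        "measure": float(beats_per_measure),
    }[boundary]
    return math.ceil(now_beats / step) * step


# A motif requested at beat 2.3 starts sooner on a grid boundary than on a beat boundary.
print(next_boundary(2.3, "grid"), next_boundary(2.3, "beat"), next_boundary(2.3, "measure"))
```

The finer the boundary, the quicker the response, which is why the feedback motifs use grid and the less urgent ones use beat.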
Testing, Delivery, and Integration We tested the DirectMusic components satisfactorily within Producer. Transitions from pattern to pattern worked well using the A/B Transition button. We mixed by listening to all of the patterns back to back and making adjustments to individual pattern parts or the Band as needed. We delivered the files to the DirectMusic programmer in the run-time format, and no container files were used. All of the DirectMusic code was written using DirectX Audio Scripting. The game engine passed relevant cues to DirectMusic, and the script handled the DirectMusic calls. We evaluated the initial game build, made adjustments to the script and the DirectMusic files, and created another build of the game. At this stage we discovered that a motif tied to the active block was being triggered too frequently, so we removed it. We also honed the clear row function so that the last three rows of a level did not increment the groove level. We did this because the player can eliminate the last three rows in a matter of seconds, undesirably triggering a quick succession of patterns. So, with three rows to go, the current groove level continues until the level is completed. I adjusted some instrument levels after hearing the music within the context of the game. Overall, the music functioned as designed and even exceeded expectations. The pattern transitions sounded musical, and the embellishments were timely and seamless. Puzzle games offer unique opportunities to composers. Game states change more often and more quickly than in action/adventure or role-playing games, thus the adaptive scoring must be more tightly integrated with its game elements. 
An apt analogy: Scoring a puzzle game is like scoring a Saturday morning cartoon, in which every action on screen has a musical gesture associated with it, whereas scoring an action game is more akin to scoring an epic movie where broad musical gestures paint the soundscape and follow the overall mood of the story. The score for Russian Squares demonstrates DirectMusic's ability to realize the adaptive music goals of this highly active puzzle game. For those interested in learning adaptive music techniques, I highly recommend scoring a puzzle game. Puzzle games allow the composer to focus on a few key gameplay elements while utilizing a wealth of potential adaptive solutions.
Chapter 16: A DirectMusic Case Study for No One Lives Forever Guy Whitmore
An Overview of the Score No One Lives Forever (NOLF) is a first-person action adventure game. It is a spy story set in the 1960s, complete with gadgets, gizmos, and guns. The player assumes the role of Cate Archer, British spy extraordinaire! As shooters go, it is refreshingly lighthearted and sprinkled with kitsch and humor. Monolith Productions developed and Fox Interactive published No One Lives Forever for the PC in 2000. I composed and produced the adaptive score, and Rich Ragsdale contributed the title theme. Eric Aho, Nathan Grigg, and Tobin Buttram created the DirectMusic (DirectX 7) arrangements and composed additional music. Bryan Bouwman programmed and integrated the game's DirectMusic code, and Sonic Network, Inc. provided many of the DLS instruments (www.sonicnetworkinc.com). For the soundtrack, I was asked to capture the flavor of the '50s/'60s spy genre, without infringing on any existing copyrights. Believe it or not, at first I was told to limit my use of brass instruments (this directive came to me through the grapevine via the Bond franchise). That is like being asked to produce a blues album without guitars! The powers that be quickly got over the legal paranoia, however. I did have one theme refused because of a subtle P5, m6, M6 melodic progression (made famous by composer John Barry), even though I thought it was the least "Bond-ish" of my themes. Actually, I drew more influence from German composer Peter Thomas, whose film scores have more of the lighthearted feel that we were after. The Barbarella soundtrack was also required listening. I began the pre-production process by writing five or six themes and prototyping them. These themes became the backbone of the adaptive score. The adaptive scoring techniques for NOLF came out of the concepts and technology implemented for three previous Monolith games — Shogo: Mobile Armor Division, Blood II: The Chosen, and Sanity. 
Shogo: Mobile Armor Division was the first game I scored that broke the music down into separate music states, which matched the game action. Blood II: The Chosen and Sanity each added to those concepts, creatively and technically. NOLF built upon my adaptive scoring foundation, improving on many aspects of my technique. This white paper describes the adaptive scoring concepts, the production process, and implementation process used to create this game score. In this case study I describe my intentions and the actual outcome, what worked and what didn't. I also describe what I'd like to achieve with future action scores.
The Adaptive Concept and Music States NOLF gameplay has high points of action and ambient points: times when the pace is furious, and times filled with suspense. In many scenarios, the player may direct Cate in with guns blazing or sneak her through the situation at hand. Obviously, the same music cue wouldn't be appropriate for both approaches. Also, there's no way of predetermining how long a firefight might last and what might come immediately after it. These are the reasons that led me to break the music down into flexible music states. After writing a thematic idea, I arrange it in a variety of music states using subjective naming conventions that reflect their functionality (or intensity) in the overall score. Some of the tags I've employed are "ambient," "suspenseful," "action," etc. Each of these music states can play for an indeterminate amount of time. Typically, each music state has about one and a half to three minutes of music composed for it. It is sometimes difficult to calculate the exact amount of time when considering variations. As a general rule, I think in terms of how long a particular music state can hold a listener's interest (more on this later). A music state can repeat as necessary until another music state is called. I could also define the number of repeats. The music engine supported automatic transitions from one music state to another or even to silence. This prevents the music from repeating ad nauseam during moments when player interaction is limited.
Music Sets Each musical theme in NOLF is arranged using six basic music states that make up a single music set. At the start of a level, one music set is loaded along with the rest of the level assets. The six standard music states are:
§ Silence
§ Super ambient
§ Ambient
§ Suspense/sneak
§ Action 1
§ Action 2
The key to composing music for any given music state is to give the music enough musical ebb and flow to keep things interesting while staying within the intensity range prescribed by the music state. For example, the "ambient" music state may rise and fall a bit in musical density but should not feel like it has moved up or down too dramatically. The goal is to maintain the musical state while holding interest. One way that the music sets in NOLF achieve this level of sustained interest is through instrument-level variation. Using variation on just a few instrument tracks of a given music state was very effective and didn't cut too deeply into the production schedule. Instrument-level variation is used in the lower intensity music states quite often. These music states start differently every time they're called up, giving the illusion of more music. In some cases, a four- to eight-measure repeating music state feels like three to five minutes of fresh music.
Transitions: Getting from A to B and Back Again The ability to transition between the various music states provides the flexibility needed for the soundtrack to adapt to the game state. Seamless musical transitions facilitate this adaptability in a way that sounds intentional and musically satisfying. In NOLF, any of the six music states may be called at any time. This means that any given music state must be able to modulate to any of the other five states. This required transitions between states that made sense musically and did not interrupt the flow of the score. Sometimes simply starting the next music state on a logical musical boundary was all that was needed. Often, quickly ending one state and starting the next was enough. However, the most satisfying transitions were the ones that built up to a more intense music state or resolved downward to a less intense music state (without missing a beat, so to speak).
The Matrix First conceived for the Shogo score, a transition matrix filled the need to keep track of the myriad of possible movements between music states. By defining the matrix in a simple script (more on the script later), I was able to assign short sections of music to act as specific transitions between states. When the game calls for a change of music state, it knows which music state is currently playing and which one it needs to move to and plays the appropriate transition between them. The transition matrix acts as a look-up chart for the music/game engine. With six music states, there are 30 possible transitions. Needless to say, I didn't labor over 30 individual sections of music for each theme. Many transitions did not need transition Segments, as they sounded good cutting in on an appropriate boundary. Also, I found that some transition Segments could be used for multiple transitions between music states. Transitions were generally divided into two types to help clarify my thinking: transitions that move to a higher or more intense music state and transitions that move to a lower or less intense music state. Categorizing transitions in this way made reusing transition material easier (i.e., transitioning from music state three to music state two may be similar to "3 to 1," while "3 to 4" may be similar to "3 to 5" — but not always!).
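The transition matrix described above behaves like a sparse look-up table keyed on (current state, target state). The sketch below is an illustration only: the state identifiers and transition Segment names are hypothetical, and the actual LithTech script format is not shown in this chapter.

```python
from itertools import permutations
from typing import Optional

STATES = ["silence", "super_ambient", "ambient", "suspense", "action1", "action2"]

# Only pairs that need a dedicated transition Segment are listed; every other
# pair simply cuts to the new state on a suitable boundary.
TRANSITIONS = {
    ("ambient", "action1"): "build_up",
    ("ambient", "action2"): "build_up",     # one Segment reused for two upward moves
    ("action1", "ambient"): "wind_down",
}


def transition_for(current: str, target: str) -> Optional[str]:
    """Name of the transition Segment to play, or None for a direct cut."""
    if current == target:
        raise ValueError("no transition needed within the same music state")
    return TRANSITIONS.get((current, target))


print(len(list(permutations(STATES, 2))))    # ordered pairs of six states
print(transition_for("ambient", "action2"))
print(transition_for("silence", "ambient"))
```

Keeping the table sparse mirrors the observation that many of the possible moves need no dedicated transition material.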
Performance Boundaries Key to making the transitions work musically were performance boundaries. Performance boundaries defined the points along a music state where transitions could take place. Boundary types included Immediate, Grid, Beat, Measure, and Segment. Each of these boundary types proved useful for different situations in NOLF. When a music state was rhythmically ambiguous, Immediate or Grid worked fine, allowing for the quickest transitions. Beat and Measure boundaries came in handy when the rhythmic pulse needed to stay constant, and Segment boundaries allowed the currently playing melodic phrase or harmony to resolve before transitioning. Maintaining a balance between coherent musical transitions and the need to move quickly between states challenged us as arrangers. As you can hear if you play the game, some transitions work better than others. When a new state is called, there is an acceptable window of time to move to the new state. We used a window of zero to six seconds (eight seconds tops). This meant that at a tempo of 120 BPM, a four-measure phrase (in 4/4) was the absolute maximum that a current music state could finish prior to transitioning to the new state. One typical solution was to use two-measure boundaries (for quicker transitions) for most of a music state and four-measure boundaries in spots that called for them aesthetically. Composing and arranging convincing musical transitions in an adaptive score twists your brain and forces you to think nonlinearly. The interactive score used in NOLF only scratches the surface in this regard; there is plenty of room for future innovation. I can say for certain that having written a number of nonlinear scores, I'll never think about music the same way again. In a way, it's freed my thinking about how music is put together. 
Even when listening to linear music, I sometimes think "Hmmm… this piece could just as easily start with this section instead of that one" or "I bet they could've started that transition two measures earlier!," etc. Music is malleable and only frozen when we record it.
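The transition window quoted in this section (six seconds, eight at most, at 120 BPM in 4/4) can be checked with a little arithmetic. The helper names below are illustrative.

```python
def measure_seconds(bpm: float, beats_per_measure: int = 4) -> float:
    """Duration of one measure in seconds."""
    return beats_per_measure * 60.0 / bpm


def max_measures_in_window(bpm: float, window_s: float, beats_per_measure: int = 4) -> int:
    """Largest whole number of measures that fits inside the transition window."""
    return int(window_s // measure_seconds(bpm, beats_per_measure))


print(measure_seconds(120))               # one 4/4 measure at 120 BPM
print(max_measures_in_window(120, 8.0))   # longest phrase for the eight-second ceiling
print(max_measures_in_window(120, 6.0))   # longest phrase for the six-second window
```

At slower tempos the same window holds fewer measures, so boundary lengths have to be reconsidered for each theme.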
Stinger Motifs Two or three of the music sets used in NOLF employed motifs. The motifs were applied as short musical accents, or stingers, that played over the top of the currently playing music state. Performance boundaries were set so that the motifs would be in sync with the underlying music, and Chord Tracks were used so that motifs would play along with a functional harmony (this was the only use of DirectMusic's difficult-to-navigate harmonic features). These motifs were composed of brass riffs, quick guitar licks, and things that would easily fit over the music state Segments. More flexibility would have been nice so that different motifs could be assigned to specific music states (possible using DirectX 8 Audio scripting). Five or six motifs were written for each music set. The engine called a motif randomly when the player hit an enemy AI square in the head (ouch!). A silent motif was employed to prevent a motif from playing every single time.
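The silent-motif trick can be sketched as a random choice from a list that includes one "do nothing" entry. The motif names are hypothetical, and giving the silent entry the same weight as each real motif is an assumption; the chapter does not say how often silence was chosen.

```python
import random
from typing import List, Optional

SILENT = None   # stands in for the empty motif that plays nothing


def pick_stinger(motifs: List[str], rng: random.Random) -> Optional[str]:
    """Choose a stinger at random; the silent entry means no accent is heard."""
    return rng.choice(motifs + [SILENT])


rng = random.Random(0)   # seeded only so the example is repeatable
motifs = ["brass_riff", "guitar_lick"]
draws = [pick_stinger(motifs, rng) for _ in range(200)]
print(draws.count(SILENT), "of", len(draws), "head shots played no stinger")
```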
Sounds and DLS Banks All of the music in NOLF uses Microsoft's software synthesizer in conjunction with DLS. DLS banks are loaded into the software synthesizer (using RAM) and played via the DirectMusic engine. Each music set uses up to 8MB of DLS instruments (as 22 kHz samples), which are loaded as each game level is loaded. These DLS banks are selected, created, and optimized for each individual music set. This gives each music set its own timbral character that coincides with the aesthetic needs of each theme. DirectX 7 has no Wave Tracks (they arrived with DirectX 8), so premixed tracks weren't an option. The DLS+MIDI approach provided the flexibility and practicality needed for features such as instrument-level variation and motifs that respond to harmonic information. My current projects use a combination of Wave/Streaming Tracks and DLS+MIDI Tracks. This provides a balance of premixed CD-quality waves with the flexibility of MIDI. However, as processor speeds continue to increase and better real-time DSP comes about, professional production values will be easier to attain via MIDI+DLS. The two approaches will likely merge and simply be two tools in the same toolbox.
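To get a feel for the 8MB budget, the sketch below estimates how many seconds of sample material fit in it. The bit depth and channel count are not given in the text, so 16-bit mono samples at 22,050 Hz and a megabyte of 1,048,576 bytes are assumptions.

```python
def sample_seconds(budget_bytes: int, sample_rate_hz: int,
                   bytes_per_sample: int = 2, channels: int = 1) -> float:
    """Seconds of uncompressed PCM audio that fit in the given memory budget."""
    return budget_bytes / (sample_rate_hz * bytes_per_sample * channels)


budget = 8 * 1024 * 1024                          # assumed meaning of "8MB"
print(round(sample_seconds(budget, 22_050), 1))   # 16-bit mono at 22.05 kHz (assumed)
```

A few minutes of raw audio has to cover every instrument in a theme, which is why short looped samples and resampled layers matter so much.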
Integration and Implementation Even though we went with an off-the-shelf solution, namely DirectMusic, there was still a good amount of programming needed to successfully integrate the music engine with the LithTech engine (Monolith's game engine). Thankfully, much of the work was done on previous games, and we simply needed to update the code for NOLF. Perhaps the biggest leap in this area was in how the music states were tied to the game. In Shogo, Blood II, and Sanity, music states were called via location triggers placed strategically throughout the levels. This was a laborious task (done in the LithTech level editor) and a pain when enemy/NPC placement inevitably changed as the ship date neared. Necessity was the mother of invention for NOLF, as we didn't have the production schedule to individually place music triggers. Bryan Bouwman and the fabulous programmers at Monolith came up with the bright idea of tying global game states and NPC AI directly to music states. This approach made perfect sense, as the game knows when there is action on the screen; it knows when Cate is sneaking around in "stealth mode," and it knows when the player is simply exploring a level. To add flexibility, the game state to music state assignments are done individually for each game level, so different levels could have different assignments. In addition, more than one music state could be assigned to a game state. For example, music states 5 and 6 were often both assigned to the "combat" game state and music states 1 and 2 to "exploration," etc. The game chooses randomly between them at run time. This means that you could play through the same level twice and have a somewhat different score each time, yet the music would be appropriate to the action in both cases.
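The per-level mapping from game states to music states might look like the sketch below. The combat and exploration assignments follow the example in the text; the "stealth" entry, the dictionary layout, and the function name are assumptions made for illustration.

```python
import random
from typing import Dict, List

# Hypothetical assignments for one game level; another level could differ.
LEVEL_ASSIGNMENTS: Dict[str, List[int]] = {
    "exploration": [1, 2],
    "stealth": [4],        # assumed mapping to the suspense/sneak state
    "combat": [5, 6],
}


def music_state_for(game_state: str, assignments: Dict[str, List[int]],
                    rng: random.Random) -> int:
    """Pick one of the music states assigned to the current game state."""
    return rng.choice(assignments[game_state])


rng = random.Random()
print(music_state_for("combat", LEVEL_ASSIGNMENTS, rng))   # 5 or 6, chosen at run time
```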
Scripting Monolith created a simple scripting method, which provided me with some control over the music asset management and implementation. Because NOLF was a DirectX 7 game, we didn't have the DirectX 8 Audio Scripting that now comes with DirectMusic. In NOLF there is a script for each music set that is called when a game level is loaded. The script's basic functions are:
§ Load the music assets for the music set
  o DLS instruments
  o DirectMusic Styles, Bands, motifs, and Segments
§ Assign DirectMusic Segments to the six music states
§ Set up the transition matrix
  o Assign transition Segments
  o Assign transition boundaries
§ Set the basic reverb settings
§ Set up the motifs and their boundaries
The Test Player Monolith also built a handy little LithTech DirectMusic player that contained the adaptive functionality used in the game. It loads a script and its music set and then plays the various music states using the correct transitions between them (a simple selector allows you to choose music states). Motifs can even be tested over the music states. This player was a lifesaver, as it allowed me to debug the music content in a game-like setting before implementing the content.
DirectMusic and the Production Process The Prototype and Pre-production The first step in the whole production process was to zero in on the musical direction and thematic material of the score. This began with discussions about the style of the music with game designer Craig Hubbard. Next, I was to bring these ideas to realization in the studio in the form of a prototype. Each thematic prototype was created in my MIDI studio using all appropriate synthesizer/sampler modules and sounds available. The idea was to ignore the nonlinear aspects that the music would take on and ignore the technical limitations that the game machine would place on the music. In this way the focus of the prototype was the thematic material itself, the musical style, and an ideal set of production values. Each prototype was mixed to a stereo wave file and presented to the designer and producer. Some themes were accepted on the first take, some were sent back to the drawing board, and others were rejected outright. By the end of the process, we had five or six themes that we were all happy with, and DirectMusic production could begin.
Composing and Sequencing The sequences created for the prototypes (using Digital Performer on a Mac) served as a starting point for production in DirectMusic Producer. Some of the sequences were fairly complete, while others required extensive work and additional sections once brought into Producer. From Digital Performer I exported each prototype sequence in the form of a standard MIDI file (.mid). This allowed Producer to import the sequences for editing and arranging. Due to its latency, interface issues, and some nagging bugs, DirectMusic Producer can be a difficult program to use, clogging the creative flow. This is especially true in terms of creating music sequences. It's slow going, and there's no way around it (for now). That being the case, I did as much sequencing as practical in Digital Performer in conjunction with my samplers. One key piece of advice that I can offer when using this approach is to have the instrument samples from your studio match, as closely as possible, the instrument samples of the DLS banks to be used in Producer. This is becoming easier to do, as the DLS-2 format used by DirectMusic can be easily translated to GigaStudio (.gig) and SoundFont formats. For the game score of Die Hard: Nakatomi Plaza, I kept an exactly mirrored instrument set, with DLS as the target format and GigaStudio as the production format.
DLS Creation The DLS Level 1 instruments for No One Lives Forever came from two sources. First, we licensed many instrument collections from Sonic Network (the sounds are called Sonic Implants; see www.sonicimplants.com). The other samples were homemade, including some solo cello samples. Sounds are a composer's palette, and having a rich palette, despite memory constraints of the game, was key to making the interactive score convincing. There are some tricks to creating a rich instrument collection within tough memory requirements.
§ Layering sounds and resampling: When creating music in a traditional MIDI studio, layering and stacking instrument patches to create a thick-sounding timbre is commonplace. The drawback to this technique in a game is memory usage and limited polyphony. The cure is to layer your patches and resample them into a single set of samples to be assembled into DLS instruments. For example, I created a brass staccato instrument by stacking about five or six brass patches in unison (including French horns, trombones, and trumpets). One sample from this instrument using one voice sounded like the entire orchestral brass section playing triple forte!
§ The use of unique or interesting sounds: One interesting timbre in a piece of music can carry the piece and make it memorable. The low cello glissando in the H.A.R.M. theme is one such example in NOLF. A generic cello patch and the pitch bend wheel would never have cut it. Instead, I brought in cellist Lori Goldston and had her record some short figures and motifs. The ponticello glissando figures were then sampled and pitched down about a perfect fourth. This became the central figure around which the rest of the piece was composed. One "live"-sounding instrument can trick our perception into hearing other parts as performed live.
§ Each instrument must sound convincing when soloed: If an instrument sounds weak on its own, it most likely will not add anything to your music. I resampled many instruments with a bit of room or hall reverb on each sample. The real-time reverb in DirectMusic AudioPaths is very useful, but it certainly isn't your Lexicon-quality algorithm. Adding a bit of high-quality processing to the individual samples (be it reverb, compression, or EQ) can go a long way to get that "professional" sound back into your interactive score.
§ The samples should match the individual composition: Even within different orchestral arrangements, different sets of samples are called for, depending on the pacing and mood of each piece. The "one size fits all" mentality of General MIDI will fail to give your score anything but a generic quality. Each theme of the NOLF score had some instrument sets that were built specifically for that theme: the vocal "BaDeDum" sample for its theme and the horn "blatts" for the Ambush theme, in addition to the cello samples already mentioned for the H.A.R.M. theme.
Creating looping samples posed a big challenge given the short length of the samples. For many samples, a Mac program called Infinity was used to create internal loops. The program has tools, such as crossfade looping, that help smooth out harmonic thumps and clicks common to short loops. That said, short loops are never perfect, and compromises are always made for the sake of memory constraints.
The DLS Editor After the individual samples were created and edited, they were brought into DirectMusic Producer for DLS creation. Each NOLF theme was arranged within its own DirectMusic Producer project. Each project has an Instruments folder to help organize the DLS Collections. My tendency is to create several DLS Collections, each containing just a few instruments and waves, rather than making one massive DLS Collection containing all the instruments. This makes it more convenient to share instruments between projects. When creating a DLS Collection from scratch, we added a collection to the project and added the edited waves to the collection's wave folder. Next, we created an instrument and assigned regions across it. The range of a region determines how far a given sample will have to stretch up or down. For most melodic instruments in NOLF, we found that two regions per octave were adequate. This means that each sample is stretched up to about a third in either direction. In crucial or exposed ranges of certain instruments, more regions per octave were necessary for a natural sound. Many instruments had various versions, depending on the needs and constraints of a given theme. For some instruments, extreme region ranges were used to get an unnaturally low, yet purposeful effect. I always experiment with a sample's useful range and often find pleasant surprises, which make their way into my music's arrangements. This is a way to get extra mileage out of your chosen samples. Good use of instrument articulation can extend the variety of your samples and add dynamic expression to your instruments. Copying an existing instrument, pasting it into the same collection (so as not to duplicate your wave data), and then altering the ADSR articulation is the easiest way to add to your palette. Most common in NOLF is one instrument with a short "attack" (0 to .1 seconds) and its partner instrument with a longer swell attack. Smart use of the release articulation can help smooth transitions in an interactive score in a way that straight wave files could never approach. When a transition occurs, there are often sustaining notes that must stop or change to accommodate the new Segment. Instrument release acts as a natural fade-out or decay, and each instrument can have its own release time. Because it was a DirectX 7 game, NOLF used DLS Level 1 rather than the DLS Level 2 that arrived with DirectX 8, meaning instruments didn't use layering or filters.
Styles, Patterns, Bands, and Segments: How They Work Together
Conceptualizing how DirectMusic and its components will work together in your project should be your first task. Having read about DirectMusic's basic functionality, you may have a good understanding of the various DirectMusic components, but different games have different needs, and this is where DirectMusic's open-ended approach is both powerful and confusing in its complexity. The solutions to these different needs are not spoon fed; they must be thought out conceptually first and then applied using DirectMusic's toolset.
For my purposes in NOLF, Styles act as project bins for patterns, Bands, and motifs. Patterns are anywhere from one to eight measures in length, corresponding to the length of logical musical phrases. These patterns are assigned to Segments using groove level assignments. Each pattern is assigned a unique groove level, which a Segment can trigger by calling that number from its own groove track. These Segments are then in turn assigned to LithTech music states. These layers of abstraction, patterns to Segments to music states, while complex on the surface, give NOLF its nonlinear and adaptive functionality. Picture the approach as such:
§ Patterns are generally short musical phrases.
§ Segments use multiple or repeating patterns (triggered via groove levels) each commonly one and a half to three minutes in length (exceptions to be discussed further).
§ The game's AI ties into music states 1 through 6. One or more Segments are assigned to each music state.
Figure 16-1: The Producer setup for the Ambush theme, showing project structure and Style/pattern associations.
Style and Band Setup Each project file contained at least one Style. A Style can only have one time signature, so if a theme had changing time signatures, such as the H.A.R.M. and Ambush themes, then a Style was needed for each time signature! If this had been a DirectX 8 game, I may not have used Styles at all but used Pattern Tracks within Segments (which weren't available in DirectX 7) instead, although I do like using styles as a pattern bin, which makes mixing and matching patterns across multiple Segments easier. Each Style contains a default Band (called band1). This Band is referenced while auditioning patterns and used during most of the creative arranging process. I call this the core Band of the project. For simplicity, I do my best to have one Band per theme and project. This approach works well, as I find it easier to expand pchannels within the Band than to create another Band for the sake of a patch change. The volume and pan info in the core Band served as the relative starting point for most patterns with any dynamic changes done using continuous controllers 7 (volume), 11 (expression), and 10 (pan). There was a continual adjusting of the core Band as an arrangement evolved.
Figure 16-2: The Band setup for the Ambush theme.
Pattern Creation and Organization
When exporting the standard MIDI files from Digital Performer, I did my best to save them out in condensed pattern length files. Duplicate MIDI tracks were deleted and tracks of the same instrument were combined in Digital Performer. Rather than save out one long SMF file, individual SMF files of one to eight measures in length were exported, so they would import more easily as patterns. Once imported, we did have to adjust many of the parts' pchannel assignments. This is because DirectMusic assigns pchannels based on the ordering of the imported SMF tracks — from top to bottom. Other patterns were created by copying material from existing patterns and then altering it somewhat. Some patterns were created by entering note data manually. With respect to groove levels, NOLF patterns use a groove range of one, meaning the "lo" and "hi" number are the same. This one-to-one correlation between groove numbers and patterns makes it easy to assign specific patterns to Segments. There are cases where NOLF employs broader groove ranges to create pattern-level variation; if two patterns have overlapping groove ranges or the groove range is the same, DirectMusic randomly chooses one of the patterns to play.
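The groove-level lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea (a groove range per pattern, with a random pick when ranges overlap), not DirectMusic's actual selection code; the pattern names and groove numbers are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    groove_lo: int
    groove_hi: int


def candidates(patterns, groove_level):
    """Return every pattern whose groove range contains the requested level."""
    return [p for p in patterns if p.groove_lo <= groove_level <= p.groove_hi]


def choose_pattern(patterns, groove_level, rng=random):
    """Pick one matching pattern at random, or None if nothing matches."""
    matches = candidates(patterns, groove_level)
    return rng.choice(matches) if matches else None


# Hypothetical pattern names and groove numbers, for illustration only.
STYLE_PATTERNS = [
    Pattern("intro_phrase", 10, 10),  # groove range of one: lo == hi
    Pattern("main_phrase", 11, 11),   # groove range of one: lo == hi
    Pattern("ambient_a", 20, 25),     # broader ranges that overlap at 23-25
    Pattern("ambient_b", 23, 28),
]

if __name__ == "__main__":
    print(choose_pattern(STYLE_PATTERNS, 10).name)  # only one candidate
    print(choose_pattern(STYLE_PATTERNS, 24).name)  # either ambient pattern
```

With a groove range of one, the Segment's groove number names exactly one pattern; widening the ranges turns the same lookup into pattern-level variation.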
Instrument-level Variation
As you know, each pattern part can have up to 32 variations. Each time a pattern is played, one variation per part is randomly chosen. Multiple variations do not have to be utilized; in fact, many pattern parts in NOLF had one "hard-wired" variation to play. We created this "hard-wired" variation by disabling all other possible variations for that part. I found that four or five variations on two or three parts was enough variety for most patterns. The ambient and sub-ambient patterns tended to get a deeper variation treatment, with most parts having variations. It may be that the moody, atmospheric, and arrhythmic nature of these music states lent itself to a truly nonlinear treatment. Variations in the suspense or action music states entailed two or three pattern parts with variation. This allowed the foundation of that music to remain consistent while providing some variety. There are many instances where instrument-level variation is not used. This was mainly due to a limited production schedule.
A simple technique for creating subtle variation is to copy the original track (variation 1), paste it into another variation, and then slightly alter it. The new variation could have the same contour with different embellishments. This way, it remains consistent with the intent of the music, yet it adds interest. Another technique for using variations effectively is composing unique melodies for one part's variations while the other parts have only subtle variation. This technique creates a "lead" instrument that riffs over a consistent, yet changing bed of music. I ensure that each part has its own rhythmic space to play in. For instance, all of the vibes' variations may occur during measures three and five, while the flute variations are given measures two and six. This ensures that the parts won't step on each other, despite the nonlinear nature of the playback.
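As a rough model of the behavior described in this section, the sketch below picks one enabled variation per part each time a pattern plays. The part names and the layout of enabled variations are hypothetical; the only figures taken from the text are the limit of 32 variations per part and the idea of a "hard-wired" part with a single enabled variation.

```python
import random

MAX_VARIATIONS = 32  # per pattern part, as stated in the text


def pick_variations(parts, rng=random):
    """Choose one enabled variation per part for a single playback of a pattern.

    `parts` maps a part name to the list of enabled variation numbers (1-32).
    """
    chosen = {}
    for name, enabled in parts.items():
        if not enabled:
            raise ValueError(f"part {name!r} has no enabled variation")
        if any(v < 1 or v > MAX_VARIATIONS for v in enabled):
            raise ValueError(f"part {name!r} has a variation outside 1-{MAX_VARIATIONS}")
        chosen[name] = rng.choice(enabled)
    return chosen


# Hypothetical part names for illustration.
SUSPENSE_PATTERN = {
    "bass": [1],               # "hard-wired": every other variation disabled
    "strings": [1],            # "hard-wired"
    "vibes": [1, 2, 3, 4],     # four variations
    "flute": [1, 2, 3, 4, 5],  # five variations
}

if __name__ == "__main__":
    for _ in range(3):
        print(pick_variations(SUSPENSE_PATTERN))
```

Keeping the foundation parts hard-wired while two or three parts vary matches the approach described for the suspense and action music states.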
Composing with the Pattern Editor
You have two choices for composing with the Pattern editor: hand editing the MIDI data in DirectMusic Producer or recording with a MIDI controller. The truth of the matter is that there is no easy way to compose music using the Pattern editor. We did the best we could with the tools at our disposal, however.
The first method is laborious but precise: inserting notes one at a time using the mouse or the Enter and arrow keys. After a phrase is inserted, each note's duration has to be dragged or edited to the desired length. Multiple notes can be selected, copied, and pasted when a set of notes repeats. The most awkward aspect of this approach is changing the velocities of individual notes by grabbing and dragging the top edge of the selected notes. More precise changes can be made in the note's properties window. This method is completely unglamorous and slow, but it gets the job done. If you want to utilize DirectMusic's variation abilities, you'll certainly find yourself doing this sort of editing.
Method two for composing content in the Pattern editor involves interfacing a MIDI controller to DirectMusic Producer. This is done using the MIDI/Performance Options tab. Once set up, the keyboard is used to perform part variations in real time. Due to the latency of the software synthesizer (80ms on average), I suggest monitoring each performance with a piano sound from the keyboard controller. Unfortunately, the MIDI timing and sync during real-time performance are not ideal (in DirectX 7 and DirectX 8); therefore most performances need hand editing in the pattern window. At least this method allows for a certain amount of performance spontaneity.
Continuous Controllers
The use of continuous controllers is essential to breathing life into any MIDI score, and a DirectMusic score is no exception. In DirectMusic, each pattern part can have CC tracks inserted beneath its piano roll. In NOLF, controller 7 (volume) is used to change a part's general volume, while controller 11 (expression) creates the dynamic ebb and flow. Parts without dynamic expression can sound flat and monotonous, no matter how well the part is written. Inserting expression curves (CC 11) helps convey the dramatic intent of a musical phrase. Expression curves are used in the NOLF score as crescendos, as quick swells, as fade-ins/outs, and to emphasize certain phrases of a part.
The interface in DirectMusic Producer for adding and editing CC curves is easy to use conceptually, but actually moving and manipulating curves may quickly tire your mouse hand. The CC curve concept in DirectMusic Producer is much easier to use than placing individual CC data points. I did experiment with "performing" curves using an external controller (the EMU launch pad), but DirectMusic Producer's timing and latency problems prevented the data from being recorded accurately. This being said, there are two methods for applying CC data to parts: the slow but accurate method of inserting curves by hand (and mouse) or creating the CC data using another sequencer prior to saving out the .mid files. The key in this case is to thin out the CC data. This limits the use of processing power by the DirectMusic engine and makes the data easier to edit once it is in DirectMusic Producer.
Note: Curves that do not reset during a transition can cause CC info to "stick" at an unwanted level. This can occur when the music moves from one Segment to the next during a CC curve. To prevent this, we place CC data points at the beginning of every pattern part for those pchannels that use CC data. This technique initializes the CC data for each pattern, avoiding unwanted CC levels.
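The two housekeeping steps mentioned here, thinning CC data before export and placing a reset point at the start of every pattern part, can be sketched as follows. This is a minimal illustration that treats a controller lane as a list of (tick, value) pairs; the function names and the `min_delta` threshold are assumptions, not features of DirectMusic Producer or of any particular sequencer.

```python
def thin_cc(events, min_delta=2):
    """Drop controller points that differ from the last kept value by less than min_delta.

    `events` is a time-ordered list of (tick, value) pairs for one controller on
    one channel. The first and last points are always kept so the curve still
    starts and ends at the intended levels.
    """
    if not events:
        return []
    kept = [events[0]]
    for tick, value in events[1:-1]:
        if abs(value - kept[-1][1]) >= min_delta:
            kept.append((tick, value))
    if len(events) > 1:
        kept.append(events[-1])
    return kept


def with_initial_point(events, default_value):
    """Ensure a point exists at tick 0 so the controller is reset when the pattern starts."""
    if events and events[0][0] == 0:
        return list(events)
    return [(0, default_value)] + list(events)


if __name__ == "__main__":
    # A dense CC 11 (expression) crescendo: one point per tick, value rising by 1.
    dense = [(tick, 64 + tick) for tick in range(33)]
    thinned = thin_cc(dense, min_delta=8)
    print(len(dense), "->", len(thinned), "points")
    # A part whose first CC point arrives late gets an explicit starting value.
    print(with_initial_point([(96, 100)], default_value=127))
```

A larger `min_delta` removes more points at the cost of a coarser curve, so the threshold is a judgment call per part.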
Segment Creation Segments bring together all of the disparate patterns in a project into unified arrangements. The patterns act as building blocks for the Segments. Sometimes the order and arrangements of the Segments were planned prior to their creation. At other times, we would experiment with different orderings of patterns within Segments. This flexibility resulted in some interesting outcomes, and we often got more mileage out of our patterns. Creating short phrase-length patterns gave us this flexibility to work with patterns as if they were Lego blocks.
All NOLF Segments contain the following tracks: Style Track, Tempo Track, Groove Track, Band Track, and Time Signature Track. A few Segments also used Mute Tracks. Segments trigger patterns via the Style and Groove Tracks. The Style is inserted into beat one of the Style Track. This automatically creates a Band Track with that Style's Band inserted. From there, groove numbers are placed along the Groove Track to create the Segment's arrangement. Mute Tracks are used to reveal a pattern gradually within a Segment. For example, a Segment begins with groove level 10 with the Mute Track muting most of the groove/pattern's pchannels. Over the course of 16 measures, the pchannels are individually unmuted, slowly revealing the entire pattern.
Figure 16-3: A typical Segment from NOLF.
Setting up performance boundaries longer than one measure was tricky for NOLF because it was a DirectX 7 game. In DirectX 8 and beyond, a Marker Track can be used to set up transition points within a Segment, but DirectX 7 did not have that feature. In fact, our work with NOLF may have influenced the addition of the Marker Track by the DirectX team.
There were two possible workarounds for longer performance boundaries in NOLF. In the first, a music state plays multiple Segments sequentially using the LithTech music script, with the Segment performance boundaries set to "end of Segment." This allowed performance boundaries of varying length within a music state. The second solution to the lack of a Marker Track utilized the Time Signature Track. We would set the performance boundary of a music state's Segment to "measure" and place various time signatures along the Segment where transitions were to occur. In other words, the Time Signature Track had no direct correlation to the underlying meter of the music. For example, if we wanted performance boundaries every four measures of music written in 4/4, the time signature would be 16/4. Workarounds have often inspired the creation of features, and features (such as the Marker Track) are not created until there is a known need for them.
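The arithmetic behind the Time Signature Track workaround is simple enough to sketch. The helper below multiplies the real beats per measure by the number of measures wanted between transition points; the function names are hypothetical, and the example assumes the music is actually in 4/4, which is what the 16/4 figure in the text implies.

```python
def boundary_time_signature(music_beats, music_unit, measures_per_boundary):
    """Return the (numerator, denominator) to enter in the Time Signature Track.

    music_beats/music_unit is the real meter of the music (4/4 -> 4, 4).
    measures_per_boundary is how many real measures should pass between
    transition points when the performance boundary is set to "measure".
    """
    if measures_per_boundary < 1:
        raise ValueError("measures_per_boundary must be at least 1")
    return (music_beats * measures_per_boundary, music_unit)


def plan_time_signature_track(section_lengths, music_beats=4, music_unit=4):
    """List (real_start_measure, signature) entries, one per transition window."""
    entries = []
    start = 1
    for length in section_lengths:
        signature = boundary_time_signature(music_beats, music_unit, length)
        entries.append((start, signature))
        start += length
    return entries


if __name__ == "__main__":
    # Boundaries every four measures of 4/4 music.
    print(boundary_time_signature(4, 4, 4))
    # Windows of 4, 2, and 4 real measures along one Segment.
    print(plan_time_signature_track([4, 2, 4]))
```

Because each "measure" in the track is really a block of several musical measures, the start positions are given in real measures so they can be matched against the arrangement.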
Motif Creation
Motifs are created within Styles. There is a folder within each Style for motifs. Motifs are created and edited in the same manner as patterns. The only difference is in functionality; motifs layer over Segments at a Segment's tempo, while patterns are triggered via groove levels from within Segments. Motifs follow the harmonic structure of a Segment (if set up to do so). To keep things simple, NOLF motifs are short, simple phrases that punctuate and accent the underlying music. As mentioned earlier, NOLF motifs play when a direct hit is made on an enemy AI. Often, one or two staccato brass chords accent the music, or a suspenseful string tremolo rings over the primary Segment. In all cases, NOLF motifs are one measure or less in duration.
The instrument of a motif works best when it is different from the instruments of the primary Segment. This prevents the motifs from stepping on a part that is already playing. Using a different or specific range also helps in this regard. For example, if the primary Segment contains low- and mid-range brass instruments, using a trumpet motif in the upper registers will fit well in the mix rather than make it muddy. Even if the primary Segment contains a midrange trumpet, a solo trumpet motif will be heard because of its distinct timbre.
The performance boundary for motifs was always Grid or Beat, so they would play in sync with Segments, yet respond quickly to the game. This means that a motif plays no more than a half second after a direct hit and its gunshot sound. The timing coordinates perfectly with the sound of the gun, responding to a perfect shot. A performance boundary of Immediate might cause the motif to step on the gun sound and would not play in sync with the primary Segment. Performance boundaries of Measure or Segment would cause the motif to respond too late, losing the desired timing.
NOLF motifs utilized DirectMusic's Chord Tracks so they perform in harmony with the underlying music. Motifs are created in a neutral key (C), and the primary Segment's Chord Track transposes the motifs according to its chord symbols. The chord symbols inserted in the Chord Track reflect the key or harmony of the primary Segment's music. In most cases, NOLF Chord Tracks contain one chord at the beginning of a Segment setting its key center, and its patterns are transposed to that key. The motifs, in turn, also perform in the key set by the Chord Track. This use of Chord Tracks differs from the more common technique of placing chords at every harmonic or chordal change within a Segment. The difference is that NOLF uses Chord Tracks to set the overall harmony and key center rather than defining each chord change. This simplified approach works because the motifs function well over any chord within the key.
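To see why Grid and Beat boundaries respond quickly while Measure does not, the worst-case wait for each boundary can be estimated from the tempo. The sketch below is a back-of-the-envelope model: the 120 bpm tempo, the 4/4 meter, and the grid resolution of four per beat are assumptions for illustration rather than values from the NOLF score, and synthesizer latency is ignored.

```python
def worst_case_delay(boundary, bpm, beats_per_measure=4, grids_per_beat=4):
    """Longest wait in seconds before a motif queued on `boundary` can start."""
    beat = 60.0 / bpm  # length of one beat in seconds
    if boundary == "IMMEDIATE":
        return 0.0
    if boundary == "GRID":
        return beat / grids_per_beat
    if boundary == "BEAT":
        return beat
    if boundary == "MEASURE":
        return beat * beats_per_measure
    raise ValueError(f"unsupported boundary: {boundary!r}")


if __name__ == "__main__":
    for name in ("IMMEDIATE", "GRID", "BEAT", "MEASURE"):
        print(name, worst_case_delay(name, bpm=120))
```

Under these assumptions a Beat boundary never waits longer than half a second, which is consistent with the figure quoted above, while a Measure boundary can wait up to two seconds.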
Creating Transition Segments Transitions link the six music states to one another musically. It takes a puzzle-like logic to figure out how the transitions should operate. NOLF uses specifically created Segment files for transitions instead of DirectMusic embellishments. This is because DirectMusic embellishment logic is only aware of one Segment/pattern at a time. Its logic cannot take into account both the current music state and the music state to which the music is transitioning. The transition matrix architecture, set up in the script, allows specific Segments to be assigned to each possible transition (between six music states, there are 30 transitions). These transition Segments are one to four measures in length. This duration allows enough time for convincing transitions while being short enough to keep up with the game action. I ask myself two basic questions when composing a transition Segment: Which music state am I moving from? And to which music state am I transitioning? I write down all the possible transitions and methodically check them off as they are created. I begin composing transition Segments to and from silence — in other words, the intros and ends of each music state (again, these are not DirectMusic embellishments). The music composed for these intros and ends provides the foundation for other transitions. This is because the material written for a music state's intro may become the basis for transitioning to that music state from another music state. Also, a music state's end may also function well when transitioning from that music state to other music states. (This is the puzzle logic that I mentioned!) Sometimes I'll use the end transition of one music state and the intro of another music state as the transition between them. The end material brings the music out of the current music state, and the intro brings the music into the next state. 
More often, this end/intro transition is the basis for that transition, and further editing and composing is done to make it work well. Many transition Segments function well in multiple transitions, and this saves production time. For example, the transition Segment built for music states 2 to 3 may also work between music states 2 and 4, etc. I compose one music state's transitions at a time. Thus, when I'm working on music state 2, I sequentially create transitions between states 2 to 1, 2 to 3, 2 to 4, 2 to 5, and 2 to 6. Again, this is because there are bound to be similarities among these transitions that I can reuse. Simple transitions are often the most effective. If the music needs to stop or transition quickly, a large percussive accent brings the music to a halt. It's as if the music hits a wall, blunt and jarring, and that frequently works well within the game. There are also cases when no transition Segment is needed between music states. In these cases, the music flows directly from one music state to the next, and the release times of the DLS instruments create a natural blending or crossfade between music states.
Sometimes, one transition Segment is not adequate for a particular transition. If a music state contains a variety of musical sections, more than one transition Segment may be needed. For instance, music state 4 has an A section that is 16 measures in length and a B section also 16 measures long. If the instruments used in each section are different or the harmony and tempo vary between them, then a single end Segment may not work from both section A and section B. In a case such as this, two transition Segments are created, one for transitioning from the A section and a second for transitioning from the B section. During gameplay, the transition matrix calls the appropriate transition Segment, depending on the current playback point of the music state (more on this in the section titled "The Transition Matrix"). When creating music states and their transitions, keep their harmonic content in mind. Music states with disparate key centers cause difficult-to-execute transitions because the drastic harmonic modulation of the transition may sound unnatural or awkward. I recommend using key centers that are closely related (i.e., C major to G major) to create convincing transitions. I also use chromatic modulation in NOLF (i.e., B major to C major), and many transitions simply stay in the same key. It is easy to back yourself into a corner harmonically when creating an adaptive score. Be aware of the tonal centers of the music states as you create them, and try to think ahead about how they will transition harmonically.
The LithTech Player/Tester
Preparing DirectMusic Files for the Game
Run-time files are saved into the project's Runtime folder after the DirectMusic files are created and tested in Producer. Next, I downsample the run-time DLS instrument waves to 22 kHz using Awave Studio (www.fmjsoft.com). Awave is a fantastic utility for sample format conversion and batch processing. There is no other tool that I'm aware of that batch converts and downsamples DLS files. From here, the files are ready for testing within the LithTech player.
Figure 16-4: The LithTech DirectMusic test player.
Using the Script
Monolith programmers and I developed a simple scripting system to put more control into the hands of the composer. DirectX 7 did not have the DirectX Audio Scripting now available in DirectX 8 and after. We wanted an easy way for composers to set the basic DirectMusic parameters for a given music set. Scripts are created within a text document. A script template simplifies the task, and the necessary fields are filled in for a given theme.
The script fields begin with the basic setup parameters. NUMINTENSITIES indicates the number of music states to be used. INITIALINTENSITY is the music state that begins playing when the music set is loaded. PCHANNELS sets the number of pchannels used, and VOICES programs the maximum number of synthesizer voices. SYNTHSAMPLERATE sets the sample rate for the DLS synthesizer. REVERB on/off, REVERBGAIN, REVERBMIX, REVERBTIME, and REVERBHIGHFREQRATIO allow the composer to set the reverb parameters.
The next section of the script describes the specific DirectMusic files to load into RAM. DLS banks, Styles, and Bands are listed here, while Segments listed in the music state and transition sections of the script are also loaded into RAM. The script loads all necessary files for a music set at the beginning of a game level. The appropriate script is set within the LithTech level editor for each game level and is called as a level is loaded. Dynamic loading of DirectMusic files, while possible, is more complex to implement and slows the game down during the unload/load cycles, which is why we chose to load everything with the game levels.
Setting Up Music States in the Script
Setting up the music states entails listing the Segment or Segments used for each music state. The format and its variables are INTENSITY "music state number," "times to loop" (-1 for infinite), "music state to switch to when finished looping," and then "Segment(s) to be played." A typical script lists them as such (note that LithTech refers to music states as intensities):
INTENSITY 1    -1 (loop)    0 (go to)    silence.sgt
INTENSITY 2    8            1            subambient.sgt
INTENSITY 3    -1           0            ambienttension.sgt
INTENSITY 4    -1           0            sneak1.sgt
INTENSITY 5    0            6            action1a.sgt
INTENSITY 6    0            5            action1b.sgt
INTENSITY 7    -1           0            action2.sgt
In the example above, Intensity 2 repeats eight times and then transitions to silence. This prevents that music state from repeating too much. Also notice that Intensities 5 and 6 are actually one music state that plays Intensity 5 and then 6 and repeats. It is split into two parts so that different transition Segments can be assigned to each part of that music state, as described earlier in the section titled "Creating Transition Segments."
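The loop and go-to fields can be modeled in a few lines to check how a music state behaves when the game leaves it alone. The sketch below copies the values from the example script; the class and function names are hypothetical, and it assumes that a loop count of 0 still plays the Segment once and that a count of 8 plays it eight times, which is one reading of the description above rather than documented LithTech behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intensity:
    loops: int    # -1 means loop forever
    go_to: int    # next music state once looping ends (unused when loops == -1)
    segment: str


# Values copied from the example script.
INTENSITIES = {
    1: Intensity(-1, 0, "silence.sgt"),
    2: Intensity(8, 1, "subambient.sgt"),
    3: Intensity(-1, 0, "ambienttension.sgt"),
    4: Intensity(-1, 0, "sneak1.sgt"),
    5: Intensity(0, 6, "action1a.sgt"),
    6: Intensity(0, 5, "action1b.sgt"),
    7: Intensity(-1, 0, "action2.sgt"),
}


def playback_order(table, start, max_plays):
    """Return the first `max_plays` Segments heard if the game never changes state."""
    played = []
    state = start
    while len(played) < max_plays:
        entry = table[state]
        if entry.loops == -1:
            # Infinite loop: the same Segment fills the rest of the timeline.
            played.extend([entry.segment] * (max_plays - len(played)))
            break
        # Assumption: a count of 0 still plays the Segment once.
        plays = max(entry.loops, 1)
        played.extend([entry.segment] * plays)
        state = entry.go_to
    return played[:max_plays]


if __name__ == "__main__":
    print(playback_order(INTENSITIES, start=5, max_plays=4))
    print(playback_order(INTENSITIES, start=2, max_plays=10))
```

Starting from Intensity 5, the output alternates between the two action Segments, which is the split music state described above.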
The Transition Matrix
Each desired transition uses the following variables: "from music state," "to music state," performance boundary, and transition Segment. A typical set of transitions is listed in the script as such:
TRANSITION 2 (from)    1 (to)    BEAT       Sub2Silence.sgt
TRANSITION 2           3         MEASURE    Sub2Amb.sgt
TRANSITION 2           4         MEASURE    Sub2Sneak.sgt
TRANSITION 2           5         MEASURE    Sub2Action1.sgt
TRANSITION 2           6         SEGMENT    Sub2Action2.sgt
Performance boundary can be set to Immediate, Grid, Beat, Measure, Segment, or Default, which uses the Segment's internal performance boundary.
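One way to picture the transition matrix is as a lookup table keyed by the pair (from state, to state). The sketch below uses the five rows from the example script and adds a checklist helper that lists every pair still without a transition Segment; the function names are hypothetical, and treating a missing entry as "move directly to the new state" is an assumption based on the earlier remark that some state changes need no transition Segment.

```python
from itertools import permutations

# (from_state, to_state) -> (performance boundary, transition Segment)
# Rows copied from the example script; a full script would cover more pairs.
TRANSITIONS = {
    (2, 1): ("BEAT", "Sub2Silence.sgt"),
    (2, 3): ("MEASURE", "Sub2Amb.sgt"),
    (2, 4): ("MEASURE", "Sub2Sneak.sgt"),
    (2, 5): ("MEASURE", "Sub2Action1.sgt"),
    (2, 6): ("SEGMENT", "Sub2Action2.sgt"),
}


def find_transition(matrix, from_state, to_state):
    """Return (boundary, segment), or None when no transition Segment is scripted."""
    return matrix.get((from_state, to_state))


def missing_transitions(matrix, num_states):
    """List every ordered pair of distinct states that has no scripted transition."""
    all_pairs = permutations(range(1, num_states + 1), 2)
    return [pair for pair in all_pairs if pair not in matrix]


if __name__ == "__main__":
    print(find_transition(TRANSITIONS, 2, 6))
    print(find_transition(TRANSITIONS, 3, 2))  # nothing scripted for this pair
    todo = missing_transitions(TRANSITIONS, num_states=6)
    print(len(todo), "of", 6 * 5, "possible transitions still unassigned")
```

Six states give 6 x 5 = 30 ordered pairs, the same count quoted earlier, so the checklist is a quick way to see which transitions have not been written yet.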
Motifs and Secondary Segments
The final part of the script lists motifs and secondary Segments. The motif format is style name, motif name, and performance boundary, while secondary Segments simply list Segment name and performance boundary.
MOTIF               style1.sty    DirectHit1 (Motif)    MEASURE
MOTIF               style1.sty    DirectHit2            BEAT
SECONDARYSEGMENT    fly.sgt       MEASURE
Using the LithTech Test Player The LithTech DirectMusic test player allows composers to audition their content with the functionality that it will have in the game. The Base Directory field lists the path to the content to be tested. The Rez File refers to the LithTech resource file if the game and music content have been compiled. I avoid this and test the content directly, as run-time files, before checking it into the game. Control File Name names the script created for the content. Once those fields are filled, the Init Level button loads the DirectMusic content. The Intensity scroller plays the scripted music states with the appropriate transitions between them. There are windows for playing and stopping secondary Segments and motifs over the music states. The debug window shows any errors that may arise due to missing content or a typo in the script, etc. This player is indispensable. I test all the transitions with the LithTech player, make adjustments in Producer as needed, and test them again. This process allows quick iteration between content creation and game environment testing. The player, the script, and the Producer can be open at the same time, making the back and forth process of creating and testing easy. The LithTech player is also a fantastic demo tool for illustrating the content and its functionality to the game team before actual implementation. An adaptive music test environment saves mountains of production time because the composer can debug his/her content thoroughly, giving music creators confidence that their content will work well in the game before the big handoff.
Integration and Implementation
Integration of DirectMusic Technology
DirectMusic functionality had already been a part of the LithTech engine, so integration of the technology was already complete. We innovated in the area of AI, however. NOLF already integrated an advanced state machine, which calculates the player's state and enemy AI. The programmers simply made it possible to trigger music states via game states. These game state to music state associations were made in the LithTech level editor so that different game levels could have unique setups if desired. The level editor also allowed two or more music states to be assigned to one game state, one of which would be randomly chosen during run time. The most intense music states, 5 and 6, were both assigned to the combat game state. Also, music states 1 (silence) and 2 (sub-ambient) often shared the quiet "investigate" game state. Assigning multiple music states to a single game state cut down the repetition and predictability of music within a given level by adding variety to game scenarios.
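The game state to music state association can be pictured as a small table with a random pick wherever one game state lists several music states. The sketch below is illustrative only: the dictionary layout and function name are hypothetical and do not reflect how the LithTech level editor stores these associations, though the two assignments shown follow the text.

```python
import random

# Game state -> candidate music states. The two assignments follow the text;
# the game-state keys themselves are hypothetical labels.
GAME_TO_MUSIC = {
    "investigate": [1, 2],  # silence or sub-ambient
    "combat": [5, 6],       # the two most intense music states
}


def music_state_for(game_state, mapping, rng=random):
    """Pick one of the music states assigned to the given game state."""
    return rng.choice(mapping[game_state])


if __name__ == "__main__":
    for _ in range(4):
        print("combat ->", music_state_for("combat", GAME_TO_MUSIC))
```

Because the pick happens each time the game state is entered, the same scenario can be scored differently from one encounter to the next.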
Implementation of the Music Content The adaptive music state machine described above makes implementing the DirectMusic content easy. The first steps include checking the DirectMusic files into the game and properly setting up each game level. The LithTech level editor selects the music theme and script for a given game level. The music themes are thus assigned to the various levels, each theme being used across an average of three or four levels. Ninety percent of the music's adaptability is handled by the state machine. Location-based triggers account for the other ten percent and override the state machine when triggered. Location-based triggers come into play when a specific theme or music state is desired, regardless of the game state. NOLF cinematics apply the same themes and music sets as the adaptive game score. Music triggers are placed at key points in a cinematic, where music is needed, and transitions between music states automatically occur. Triggering music states from cinematics works surprisingly well but certainly does not sound as good as custom cinematic scores. Music scored specifically to a scene matches the events more precisely than music composed out of context. The adaptive music sets and triggers are used because the NOLF cinematics were not complete in time to score them individually. The lesson is to reserve production time to custom score game cinematics and demand that the developer finalize the timing of them before the scoring begins.
Conclusion
The score for No One Lives Forever was a challenge to produce but was also very rewarding. The first challenge was convincing the producers at Fox Interactive that an adaptive score could have high standards of production quality. The Monolith team helped make the case by presenting demos of previous scores, such as Sanity. Also, my prototype themes helped convince them of my abilities as a composer. Putting together convincing DLS banks within tight memory constraints also posed a big challenge. The optimization process was time consuming and tedious but key to the sound of the game. Final mixing and editing called for great attention to detail by going through all the patterns (all their variations), motifs, and Segments, making sure volume levels, panning, and instrumentation meshed well together.
NOLF's game state/music state integration gave me the greatest reward. It was fantastic to simply drop music into the game and hear the interactivity immediately. It was also gratifying to collaborate with a strong team of arrangers/composers. Having the help of three other musicians produced more content for the game, and sharing compositional ideas and techniques made us all better musicians. Finally, spy music was just plain fun to compose. The game's sense of humor made it a delight to create its music.
Overall, the adaptive design functioned as planned or better. The transitions reacted quickly and smoothly to the game calls, and the mood of each music state matched the on-screen action very well. The instrument variation, music state variation, and use of silence alleviated the repetitiveness common to many games, and the motifs made direct hits more satisfying to the player. At its best, the adaptive score draws the player deeper into the game experience. My biggest criticism is that sometimes the game states change faster than the music was intended to react. This makes the music seesaw between music states unnaturally.
Many of these instances are only noticeable to me, but some are more obvious. I will be thinking of solutions to this type of dilemma for my next adaptive score. Also, the sonic quality of the music is limited due to the 22 kHz DLS banks. A combination of Wave Tracks and DLS banks would have allowed for longer samples, phrases, and premixed sections, which can increase the overall fidelity of a score while maintaining adaptability. Each adaptive game score that I produce gives me ideas and concepts for the next. The biggest lesson learned from NOLF is that global music integration is hugely important to a successful score. Good integration creates the logical lines of communication between the music system and the game engine. If these lines are weak or nonexistent, the music will not respond well to gameplay, no matter how well the music functions out of context. Context is everything.
Addendum: A Guide to the NOLF Media Files
NOLF QuickTime Videos
• EarthOrbit: The Ambush theme starts in music state 5 (combat 1), transitions to music state 2 (ambient), and then transitions to music state 6 (combat 2) with motifs.
• Hamburg Club: The BaDeDum theme starts with music state 3 (main theme) and transitions to music state 6 (combat 2); the main theme returns and then transitions to music state 5 (combat 1) and ends with the dialogue.
• MorroccoAmbushA and B: Each of these movies runs the same scenario but with differing scores. AmbushA transitions to the combat 1 music state, and AmbushB moves to the combat 2 music state.
• SniperB2: This scene exhibits the motifs of the Ambush theme well, as the music transitions from music state 2 (sub-ambient) to music state 3 (suspense).
Ambush Music States Audio File
This music clip moves through all six of the music states for the Ambush theme.

Time    Music State
0:00    6 (combat 2)
1:22    transition
1:30    2 (sub-ambient)
2:14    transition
2:16    4 (suspense)
3:10    transition
3:12    3 (ambient)
4:08    transition
4:12    5 (combat 1)
5:33    transition
4:12    1 (silence)
Chapter 17: A DirectSound Case Study for Halo Marty O'Donnell and Jay Weinland
Introduction Audio the Bungie Way The evolution of the audio in Halo began with the first two installments of the Myth series of games developed by Bungie in 1997 and 1998. A lot of time was spent laying the groundwork for the audio engine that we developed further during the production of Oni and utilized in Halo. Functionality was developed to allow for multiple permutations, nonrepetitive loops, detail sounds, dynamic dialog, and impacts reactive to velocity and distance, as well as cascading sound effects. While Halo was rewritten from the ground up for the Xbox (after Bungie was acquired by Microsoft), it is important to note that many of the high-level sound design concepts and their implementation into the Halo audio engine are the result of years of development and refinement with Jason Jones and the Bungie programming team, in particular Halo's audio programmer Matt Segur. We share a philosophy here at Bungie that we like to call "Audio the Bungie Way," which details fundamental audio design for any of our games. At the core of this philosophy is the understanding that repetitive audio is not only annoying but also unrealistic. This is why we allow for so many permutations in every sound call, why music does not play constantly throughout a gaming session, and why we insist on the many customizations to our audio engines. This core idea means that we spend a huge amount of our efforts in the implementation stage. There are many games released every year that have audio assets of the highest quality, but due to lackluster implementation, they fall short of having the maximum impact. We believe firmly that half of what makes a game sound good is in the implementation. This process begins with the technical design and continues through pre-production, production, and post-production. The process requires the full effort and support of the production and programming teams.
History of Halo Audio
In July of 1999, our cinematic director, Joseph Staten, approached Totalaudio (at that time Marty O'Donnell and Mike Salvatori) about writing a soundtrack for a live game demo of Halo at the upcoming MacWorld Expo to be shown by Bungie's Jason Jones during Steve Jobs' keynote address. Bungie had a scripted demo running in real time through OpenGL on the Macintosh but with no audio code. The music composed for the demo was played in sync by hitting Play on a CD player when the demo began. Marty describes his approach to the music of Halo and to establishing a mood and feel for this ancient, mysterious ring artifact found by humans 500 years hence in some unexplored corner of the galaxy:
I felt that I could evoke an ancient and mysterious feeling by starting with some Gregorian monk-style chanting over a bed of strange ambient sounds and then give the action sequences that followed an epic and important feel by getting orchestral strings from the Chicago Symphony to play over a somewhat rock 'n' roll rhythm section. I added an improvised Qawwali chant voice over the top to help reinforce the "alien" nature of the environment. Whether these decisions were the right ones or not doesn't matter. I had two days to write and produce this piece and there simply was no time to ponder or experiment, which is sometimes a good thing. Since this was also a venue that would feature a big screen, a large auditorium, and a gigantic stereo sound system, I wanted to not only capture the mood but also hook the audience. Anything that sounded like "game music" was going to be a disappointment. Plus, the track needed to be interesting enough in its own right so that the audience wouldn't notice that they weren't hearing any sound effects. It seemed to work out pretty well.
For E3 2000, Bungie had ambitious plans. The team would show the game being played and then follow it up with a ten-minute trailer that hinted at the storyline of Halo and showed the technological prowess of the Halo team. The hands-on demo had merely a rudimentary sound engine, but the trailer was produced in full 5.1 surround sound and played from a DVD in an enclosed theater at E3. Halo was one of the hits of the show, but at that point it was not known that discussions had been ongoing for Microsoft to buy Bungie and bring Halo to the Xbox. Given that the audio capabilities of the Xbox would surpass any previous console/computer platform, this was an exciting turning point in the audio production of Halo, not only because the audio could be carefully directed toward one set of audio hardware, but also because it greatly expanded the scope of what could be accomplished.
Coming to the Xbox All of the work that had been done on Halo leading up to E3 was cast aside when Bungie arrived in Redmond, Washington, in July of 2000. The team decided to rewrite all parts of the game to maximize the power of the Xbox. This certainly included the audio engine. The audio capabilities were such that we would have been foolish to not take advantage of them. We focused the audio engine on the features that we most wanted to take advantage of, which included 256 voices, real-time 5.1 Dolby Digital encoding, 3D positioning, DSP effects such as reverb, use of the hard drive for audio streaming, and the Xbox ADPCM compression through which the audio chip can play back without any CPU hit. Before we break down the various areas of Halo's audio as they relate to DirectSound and the Xbox, here's a word on our terminology. The most basic building block of our audio engine is a soundfile, which in our case is an AIFF file, stereo or mono, and 22.05 kHz or 44.1 kHz depending on usage. All soundfiles are grouped as permutations into a soundtag (after ADPCM compression). A soundtag is a file that contains not only the ADPCM audio data but also all relevant information about the playback of the enclosed soundfile permutations such as variable pitch/volume, skip fractions (percentage chance each permutation will play), and soundclass (e.g., "weapon fire" or "unit footstep"). The audio engine does not recognize soundfiles, only soundtags. Soundtags are attached to other types of tags in the Halo engine, such as animation, effect, or particle tags. The most complex building block in the Halo audio engine is the soundlooping tag. Every element in a soundlooping tag is a soundtag, which in turn contains a soundfile. These tags contain instructions for how to loop the soundfiles including assignments, beginnings and endings (which we call the "in" and "out" respectively), multiple tracks, and detail sounds. 
These three types of files are all we need to permeate a Bungie game with audio. Although there are many more complex examples, if we want to assign a gunshot to a weapon, we merely open the weapon tag, go to the slot for the firing action (where we also see the smoke and flame particle effects attached), and attach our soundtag (which has multiple soundfile permutations of a gun firing with randomized pitch/gain), and we are done. Another example would be creating sounds for a character, such as the Hunter running. We make sound effects based on an animation (AVI or MOV file) and then attach a soundtag directly in the animation tag for the Hunter (which contains the dozens of animations the Hunter might use) in the slot for the appropriate animation. Lastly, here's a word on soundclasses; we can assign each soundtag to a soundclass that groups similar types of soundfiles, such as a weapon firing, music, unit dialog, or any of a dozen others at our disposal. In essence, our soundclasses function in the same way that sub-masters function on a mixing console. This is an important division because it gives us control over each area of the audio in several key respects. The soundclass attributes (delineated in the audio code) set such things as the maximum number of soundfiles from that soundclass that can play simultaneously, whether the soundclass is important enough that it cannot be late (such as reloading a weapon, which needs to be synced to animation playback) or could be delayed by a few CPU cycles (such as unit dialog), and also what rolloff values should be used for that soundclass. This is important for us because every soundclass needs to have a different rolloff. For instance, footsteps needed to roll off rather quickly. Distance rolloff was considered as well. Marine dialog should be heard from a short distance away, while explosions such as grenades should be heard from a great distance. The delay factor was important to our audio programmer, as he was the one juggling our 12,300 soundfiles to play back as necessary within 3MB of RAM. With constant loading and purging of audio RAM, it's crucial for the audio engine to know whether it needs to play a sound immediately or if a 30- to 100-millisecond delay is okay.
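The relationship between soundfiles, soundtags, and soundclasses described above can be pictured as a small data model. The sketch below is only an illustration, not the Halo engine's code: the field names, the example values, and the rule that a sound is simply dropped when its soundclass is at its cap are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SoundClass:
    name: str
    max_simultaneous: int    # most sounds of this class allowed to play at once
    can_be_delayed: bool     # False when playback must stay in sync (e.g. a reload)
    rolloff_distance: float  # hypothetical: distance at which the class is inaudible


@dataclass
class Permutation:
    soundfile: str
    skip_fraction: float = 0.0  # chance (0..1) that this permutation is skipped


@dataclass
class SoundTag:
    name: str
    sound_class: SoundClass
    permutations: List[Permutation] = field(default_factory=list)
    pitch_range: Tuple[float, float] = (1.0, 1.0)  # random pitch multiplier bounds
    gain_range: Tuple[float, float] = (1.0, 1.0)   # random gain multiplier bounds


class VoiceLimiter:
    """Tracks how many sounds of each soundclass are currently playing."""

    def __init__(self) -> None:
        self.active: Dict[str, int] = {}

    def try_start(self, tag: SoundTag) -> bool:
        cls = tag.sound_class
        playing = self.active.get(cls.name, 0)
        if playing >= cls.max_simultaneous:
            return False  # assumption: at the cap, the new sound is dropped
        self.active[cls.name] = playing + 1
        return True

    def stop(self, tag: SoundTag) -> None:
        name = tag.sound_class.name
        self.active[name] = max(0, self.active.get(name, 0) - 1)


# Hypothetical example values, chosen only to exercise the structures.
weapon_fire = SoundClass("weapon fire", max_simultaneous=2,
                         can_be_delayed=False, rolloff_distance=50.0)
pistol = SoundTag("pistol fire", weapon_fire,
                  [Permutation("pistol_fire1.aif"), Permutation("pistol_fire2.aif")],
                  pitch_range=(0.95, 1.05))
limiter = VoiceLimiter()
```

Keeping the cap, the delay tolerance, and the rolloff on the soundclass rather than on each soundtag mirrors the sub-master analogy: one setting governs every sound in the group.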
Production Music We utilized DirectSound for all of the music in Halo to provide a dynamic musical score that is both constantly changing and reactive to gameplay elements. Each piece of music during gameplay was started and stopped in the scripting at various moments within each level. When the gameplay design for a level was nearly completed, we'd sit down with the designer and "spot" how music would play during that level. When a piece was started, it began with the "in" soundtag and continued into the main looping soundtag. This main looping soundtag contained multiple permutations, which play back randomly to create a constantly changing musical experience. If you sit in one place in Halo listening to a piece of music, you will notice that it never plays back exactly the same way twice due to the randomization of the permutations. The music was edited so that the "in" soundfile plays seamlessly into any of the main loop permutations, and the main loop permutations can play seamlessly into the "out" soundfile when the piece is stopped by the script. These main loop permutations contain unique elements, so as they are randomly played back, the piece takes on a new flavor each time you hear it. The script can call upon an alternate track, which is used in reaction to something going on during gameplay. This alternate track (which can have its own unique "out") might have a more or less intense score, depending on the needs of that section of gameplay. For instance, if the player is about to enter an encounter with a number of enemies, the script starts a piece of music beginning with the "in" and proceeds to the "main" looping section. If during the course of battle, the designer throws another onslaught of enemies at the player, the script could trigger an alternate track that, in this case, might be a more intense mix of the same piece of music. 
Then when all enemies have been vanquished, the script could either stop the piece of music (playing the "out" soundtag) or return to the "main" looping section, indicating to the player that the battle was over. This system means that the player gets a constantly changing musical score that reacts to gameplay elements as they appear. Another example of this would be scripting an ambient piece of music and then scripting a more rhythmic "alt" track that crossfades when the player runs into enemies. For instance, we might begin a piece of music when the player enters a new area. This music is ambient in nature, without a clearly identifiable rhythm. When the level designer decides to ambush the player with enemies, the alternate track is triggered, which could either begin playing at the end of the current loop or crossfade at the moment it is triggered. In this case, the alternate track could be a more rhythmic version of the ambient piece or depart from the ambient piece entirely. As when the enemies have been vanquished, we can instruct the level designer to script the tag to play the "alt out" or return to the "main" loop to continue playing the ambient music. Other elements that are useful, especially during ambient pieces of music, are details and multiple tracks. Details are soundtags full of soundfiles that could be triggered anytime during the playback of a piece of music. We can set which group of files to use and a time range in which to randomly trigger them. Additionally, we could layer a second track on top of the first one, and having both "main" loops playing back with multiple permutations of varying lengths would provide an additional layer of variability. These techniques allow a lot of flexibility and give the player variation and dynamic playback while utilizing a traditional form of playback (streaming prerecorded audio files). 
All music soundfiles were ADPCM compressed 44.1 kHz stereo AIFF files and loaded in 128K chunks from the hard drive (sort of a faux streaming).
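The "in", "main", "alt", and "out" behavior described above amounts to a small state machine that decides, at each loop boundary, which soundfile to play next. The following sketch illustrates that idea under simplifying assumptions: the class and method names are hypothetical, transitions happen only at loop boundaries (crossfading is left out), and the file names are placeholders.

```python
import random


class MusicLoop:
    """Chooses the next soundfile at each loop boundary (illustrative only)."""

    def __init__(self, in_file, main_perms, out_file,
                 alt_perms=None, alt_out=None, rng=None):
        self.in_file = in_file
        self.main_perms = list(main_perms)
        self.out_file = out_file
        self.alt_perms = list(alt_perms or [])
        self.alt_out = alt_out
        self.rng = rng or random.Random()
        self.state = "stopped"
        self.use_alt = False
        self.stop_requested = False

    def start(self):
        self.state = "in"
        self.use_alt = False
        self.stop_requested = False

    def trigger_alt(self, on=True):
        # The script switches to (or back from) the alternate track.
        self.use_alt = on

    def stop(self):
        # The script asks the piece to end; the "out" plays at the next boundary.
        self.stop_requested = True

    def next_soundfile(self):
        if self.state == "stopped":
            return None
        if self.state == "in":
            self.state = "loop"
            return self.in_file
        if self.stop_requested:
            self.state = "stopped"
            self.stop_requested = False
            if self.use_alt and self.alt_out:
                return self.alt_out
            return self.out_file
        pool = self.alt_perms if (self.use_alt and self.alt_perms) else self.main_perms
        return self.rng.choice(pool)  # random permutation, so no two plays match


piece = MusicLoop("in.aif", ["main1.aif", "main2.aif", "main3.aif"], "out.aif",
                  alt_perms=["alt1.aif", "alt2.aif"], alt_out="alt_out.aif",
                  rng=random.Random(7))
```

A script would call `start()` when an encounter begins, `trigger_alt()` when the second wave arrives, and either `trigger_alt(False)` or `stop()` once the enemies are gone.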
SFX The background stereo ambiences in Halo were created in a similar manner to the music. All ambient soundlooping tags were assembled identically to the music files described above. We had multiple tracks playing in many cases (such as an outdoor ambience plus a wind track), each with multiple permutations in the "main" loop tag, as well as detail sounds that could be randomly triggered and were placed randomly in 3D space. One technique to highlight (which was used extensively in the music tags as well) is permutation weighting. Permutation weighting is the ability to assign different probabilities to each permutation to control how often a given permutation is played. For example, there are six permutations of a "main" loop in an outdoor ambience with lengths varying from two to eight seconds and 27 seconds of material. In one of those loops, there is a distinctive owl hoot, which would be repeated every 27 seconds if the material was played back in the same order each time. Given the randomness of our permutation playback, you actually might hear it on average every 27 seconds but sometimes a little less and sometimes a little more frequently. If that distinctive hoot is still heard too frequently, we can use permutation weighting to assign a high skip fraction to that one permutation that tells the audio engine to skip this permutation x-percentage of the time. We adjust this number until we hear the distinctive hoot at a frequency that seems natural. We use this same technique extensively in the music (for unique musical flourishes) and in SFX like the example above (or for an annoying rattle in a piece of machinery we only want to hear every so often), as well as in dialog permutations that should be heard rarely ("I'd have been your daddy but that dog beat me over the fence!" for example). As with the music, all stereo ambiences were 44.1 kHz ADPCM compressed soundfiles. 
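Permutation weighting with skip fractions can be illustrated in a few lines. This is a sketch under an assumption the text does not spell out: when a chosen permutation is skipped, another one is drawn at random. The function name, file names, and the 0.75 skip fraction are hypothetical.

```python
import random


def pick_permutation(perms, rng, max_tries=16):
    """perms is a list of (soundfile, skip_fraction) pairs.

    Draw a permutation at random; with probability skip_fraction it is
    skipped and another draw is made (assumed behavior, for illustration).
    """
    name = None
    for _ in range(max_tries):
        name, skip_fraction = rng.choice(perms)
        if rng.random() >= skip_fraction:
            return name
    return name  # fallback so the loop always terminates


# Six ambience loops; the one with the distinctive owl hoot is skipped 75% of the time.
ambience = [("wind1.aif", 0.0), ("wind2.aif", 0.0), ("wind3.aif", 0.0),
            ("wind4.aif", 0.0), ("wind5.aif", 0.0), ("owl_hoot.aif", 0.75)]

rng = random.Random(1)
picks = [pick_permutation(ambience, rng) for _ in range(10000)]
hoot_share = picks.count("owl_hoot.aif") / len(picks)
```

Under this redraw rule, the hoot's share of plays falls from 1/6 to (1/6 × 0.25) / (1 − 1/6 × 0.75), or roughly 1 in 21, which is the kind of adjustment made by ear until the hoot sounds natural.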
All other sound effects in the game were 22.05 kHz mono ADPCM soundfiles and played back through the best parts of DirectSound's programming, such as 3D positioning, DSP, including reverb, occlusion, Doppler effect, HRTF, and of course real-time 5.1 Dolby Digital surround sound. We could attach audio to anything in the game, and because of that, the 3D positioning function was critically important to enable the player to sort out where audio elements originate. Knowing that a Marine "had your back" just over your right shoulder brought a sense of security, just as hearing a Hunter's metal armor behind you would bring a sense of impending doom. There were many audio elements in the game that received 3D positioning, such as weapon sound effects, bullet impacts, speech, mechanical objects, particles such as flying dirt or sparks, and outdoor detail ambiences, such as wildlife, rivers, and waterfalls, to name a few. In essence, everything you hear in Halo that is not music or an ambient bed is 3D positioned. With the Xbox audio card, you can play back a maximum of 64 3D positioned sound effects simultaneously, leaving 192 2D voices free. We also utilize the DSP power that is built into the Xbox audio chip. The onboard Motorola chip is identical to the chips that are used in some Pro Tools DSP farm cards that give the Xbox DSP processing power unlike any other platform. We can use as many reverbs per level as we wish and can switch between them on the fly without generating a lot of CPU overhead. It doesn't matter if we use a small room reverb or a huge airplane hangar reverb; the cost to the engine is the same. We use, on average, about six reverbs per level, embedding information about reverb parameters at a given location directly into the level geometry (right alongside the ambience soundtag that should be played). There is not a single area in Halo where the 3D audio is not being played with reverb, and it certainly adds a level of realism to the environment. 
Standing inside the shaft on The Silent Cartographer level and listening to the echo of reloading your pistol helps underline the sheer size of the geometry that the artists created for Halo. Although we do not utilize other DSP functions extensively for Halo, we do use a filter for those times in the game when the Master Chief descends underwater or becomes immersed in pools of liquid coolant. The filter simulates hearing sound through water very well. We're looking forward to using this and other functions of the DSP in the future. All 3D audio is subject to occlusion, obstruction, Doppler, and HRTF. Anytime a solid piece of geometry gets between the player and a sound source, the sound is occluded and obstructed. A good example of this is in the hangar bay of the alien ship where there is a dropship with the engine running. As the Master Chief steps behind a column, the sound of the engine is occluded, rolling off both the gain and the high end of the sound. Other geometry that occludes sound in Halo includes walls and cliffs. A grenade explosion sounds muted when you are around a corner as opposed to having it in an unobstructed field of view. Sound sources emanating from objects that travel through space in Halo, such as the Sentinels or vehicles, were subject to a Doppler shift. A moment during Halo's development that stands out is when we were first testing our looping sounds as they related to Doppler and hooked up the sound of a hot rod engine to the rocket launcher.
Figure 17-1: A rocket launcher in action.
We hooked up the hot rod loop to the tag for the launcher's projectile, walked the Master Chief across a big field away from the camera, and fired it back toward the camera. What ensued was a source of great amusement as the rocket flew past, generating the sound of a hot rod going about 100 miles an hour and exploding in the cliff behind us. It worked like a charm, and we left it hooked up that way for a week or so for everyone to enjoy. The Doppler shift in DirectSound worked perfectly and added a lot of realism to sound sources that move around. HRTF (Head-Related Transfer Function) also adds to the audio experience by filtering sounds, depending on which direction the character's head is facing. You can hear it affect the dialog in the cinematics as well as in the sounds that play during combat. Probably the best way to hear this effect, however, is listening to your own footsteps. As you move the Master Chief around in Halo, listen to the sound of his footsteps on various surfaces. Then run through the same areas looking down toward his feet rather than straight ahead and listen to those same footsteps; the difference is stunning. Occlusion, obstruction, Doppler, and HRTF are all aspects of DirectSound that highlight its ability to take the same source and filter it appropriately, adding a sense of realism without our having to author content specifically to achieve the same effect.
Surround Sound
One built-in feature to the Xbox chip that is absolutely amazing is its ability to generate realtime 5.1 Dolby Digital output. We encode and decode Dolby Surround on the fly from dynamic game audio so that a player gets a full surround audio experience throughout the game. This is a first in console gaming — and a great first at that! There are many advantages to implementing 5.1 audio, from alerting the player to the presence of enemies to surrounding the player with ambience and music. While we did not premix the music or ambience in 5.1, we were able to surround the player with sound. Stereo music and ambience were played 100 percent out of the front speakers and 50 percent out of the rear speakers and also sent to the LFE. Non-positional dialog was sent 100 percent to the center speaker and 50 percent to each of the front speakers. All 3D positioned sound was played back in its appropriate position among the four surround speakers and also sent to the LFE. The Halo audio engine queries the Xbox at startup to see whether it is set to stereo or Dolby Digital and plays back the audio appropriately to avoid doubling up audio signals in the stereo fold down.
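The speaker routing described above can be written down as per-channel gains. The sketch below only illustrates those percentages; the channel labels, the function names, the LFE send level (the text gives none), and the stereo-mode fallback for dialog are assumptions.

```python
def route_stereo_bed(left, right, surround=True, lfe_send=1.0):
    """Route one stereo music/ambience sample pair to speaker channels.

    lfe_send is a hypothetical parameter; the mono sum fed to the LFE
    is likewise an assumption.
    """
    if not surround:
        # Stereo output: play the bed as-is so nothing is doubled in a fold-down.
        return {"FL": left, "FR": right}
    return {
        "FL": left,            # 100 percent to the front speakers
        "FR": right,
        "RL": 0.5 * left,      # 50 percent to the rear speakers
        "RR": 0.5 * right,
        "LFE": lfe_send * 0.5 * (left + right),
    }


def route_dialog(sample, surround=True):
    """Route one non-positional dialog sample."""
    if not surround:
        return {"FL": sample, "FR": sample}  # assumed stereo fallback
    return {"C": sample, "FL": 0.5 * sample, "FR": 0.5 * sample}


bed = route_stereo_bed(1.0, 0.5)
line = route_dialog(0.8)
```

Checking the output mode once at startup and passing it in as `surround` is what keeps the rear and center sends from being summed back into the front pair on a stereo system.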
Attaching Sounds to Animations In Halo, there were thousands of animations created for several dozen characters. The animator provided us with movies of a group of animations that needed sound effects, and we would then sound design in Pro Tools, making multiple permutations for each one as necessary. For animations such as characters getting in and out of vehicles, characters moving around, or the sound of the Flood Infection Forms skittering about, we generated an appropriate sound effect and attached the soundtag directly to the animation tag for that character. We could also trigger it at a specific frame of the animation in order to ensure the tightest sync possible.
Figure 17-2: An Infection Form closes in on the Master Chief.
Real-time Tracking of Velocity and Pitch The engine can also trigger sound effects based on velocity. Sounds such as vehicle or shell casing impacts will play louder or softer, depending on the object's velocity at the moment of impact. If you watch a shell casing hit the ground, it will bounce several times with each repetition softer than the one before. For a variety of different sounds, including vehicle engines, we had the ability to vary pitch according to a scale. In the case of a vehicle engine, the scale would be the RPM of the engine at any given moment. Like most driving games, we used multi-layered looping samples at various areas throughout the RPM spectrum, so the samples were never pitch bent very far.
Figure 17-3: The jeep wins a game of chicken. We also use the scale for sounds other than engines, such as the Banshee contrail or the human dropship hovering. For the contrail, we raise the pitch on the loop as it thickens, lower the pitch as it thins, and fade it out when you fly straight. For the human dropship, we actually use the scale to modify the gain instead of the pitch so that when the dropship hovers, the player hears a swirling, windy sound. When the downdraft jets are activated, the gain ramps up on the wind and then ramps down as the downdraft jets deactivate. The scale can modify pitch, gain, and skip fraction, depending on the requirements for the sound effect.
Figure 17-4: Dropships on the beach.
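Velocity-scaled impacts and the RPM "scale" for engines can both be reduced to simple mappings. The following is a sketch with hypothetical names and numbers, not the engine's actual code: it picks the engine layer recorded nearest the current RPM so the pitch ratio stays close to 1.0, and it scales impact gain linearly with speed. Crossfading between adjacent layers is omitted.

```python
def engine_layer(rpm, layers):
    """layers is a list of (recorded_rpm, soundfile) pairs.

    Returns the soundfile recorded closest to the current RPM and the
    pitch ratio needed to reach it, so no layer is bent very far.
    """
    recorded_rpm, soundfile = min(layers, key=lambda layer: abs(layer[0] - rpm))
    return soundfile, rpm / recorded_rpm


def impact_gain(speed, full_gain_speed):
    """Linear gain from impact speed, clamped to the range 0..1."""
    return max(0.0, min(1.0, speed / full_gain_speed))


# Hypothetical jeep engine layers recorded at three RPM values.
jeep_layers = [(1000.0, "jeep_low.aif"), (3000.0, "jeep_mid.aif"), (5000.0, "jeep_high.aif")]
soundfile, pitch = engine_layer(3300.0, jeep_layers)

# A shell casing losing speed on each bounce plays more quietly each time.
bounce_gains = [impact_gain(speed, 4.0) for speed in (4.0, 2.0, 1.0)]
```

The same scale value could just as easily drive gain or skip fraction instead of pitch, as with the dropship's downdraft wind.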
Cascading Sound Effects
Cascading sound effects include sounds such as breaking glass or rocks and gravel that explode off cliffs or are kicked up from vehicles. Glass breaking is the easiest example to picture. We attach an initial breaking sound to the moment when a glass surface is broken. After that point, however, several factors affect how the remainder of the glass breaking should sound. Depending on the size of the original piece of glass (how much glass will fall) and the distance of the glass from a solid surface, we can trigger glass breaking sounds in "cascades" attached to the impact of each glass particle. Keeping in mind that there are often hundreds of particles involved, we trigger soundtags in the following manner. When the first particle impacts, we call "glass small," a soundtag of a very small glass particle hitting a surface. We set a number (eight, for example) of repeated calls within a specific time period, after which the audio code triggers "glass medium," a soundtag of multiple glass particles breaking. On that tag, we set another number (five, for example), after which the code triggers "glass large," which is a soundtag of the mother of all glass smashes. After the final soundtag is triggered, we can start the entire cascading sequence over again if there is still more glass breaking. Each of the three levels has, of course, many permutations so that each time glass breaks, you do not hear the same series of sound effects over and over. This gives us the ability to sync to individual particles in combination with the "chorus" effect achieved by recording a larger, more complex event.
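The cascade logic can be sketched as a set of counters that escalate from one soundtag to the next. This is one possible reading of the description, offered as an illustration: the class name is hypothetical, the "specific time period" is interpreted as a maximum gap between successive impacts, and an escalation replaces the smaller sound on that impact.

```python
class Cascade:
    """Escalates from small to larger soundtags as particle impacts pile up."""

    def __init__(self, levels, window):
        # levels: list of (soundtag_name, calls_allowed_before_escalating);
        # the count on the final level is unused.
        self.levels = levels
        self.window = window  # assumed: longest gap (seconds) that keeps a cascade alive
        self.counts = [0] * len(levels)
        self.last_time = None

    def on_impact(self, now):
        """Return the soundtag to trigger for a particle impact at time `now`."""
        if self.last_time is not None and now - self.last_time > self.window:
            self.counts = [0] * len(self.levels)  # too long a pause: start over
        self.last_time = now
        level = 0
        while level < len(self.levels) - 1:
            self.counts[level] += 1
            if self.counts[level] <= self.levels[level][1]:
                return self.levels[level][0]
            self.counts[level] = 0  # this level is used up; escalate
            level += 1
        self.counts = [0] * len(self.levels)  # final tag played; restart the cascade
        return self.levels[-1][0]


glass = Cascade([("glass small", 8), ("glass medium", 5), ("glass large", 0)], window=0.5)
triggered = [glass.on_impact(i * 0.01) for i in range(55)]
```

With the example counts from the text, the first eight impacts play "glass small," the ninth plays "glass medium," and after five mediums the next escalation plays "glass large" before the sequence begins again.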
Dialog The dialog in Halo was one of the areas that helped to give Halo a unique flavor. There are two types of dialog: cinematic dialog that is a traditional linear script and dynamic dialog. As you play through Halo, your Marines will dynamically alert you to what is going on around you, warn each other and you of danger, chastise, apologize for allies getting shot or killed by friendly fire, and generally amuse you. This aspect of Halo's audio could not have succeeded without the hard work of AI (artificial intelligence) programmer Chris Butcher who gave us many possible AI states in which a character might say something and controlled how much was said to make it all seem natural.
Figure 17-5: Shotgun-toting Marines are great to have around. In addition to the seven distinct Marine voices, we also created four unique voices for the alien races. To make the voices even more colorful, we used a different actor for each character and utilized various dialects (that really is an Australian in there). We went out of our way to have as many permutations as possible for each AI state (sometimes over 20) so that you can play the game for a long time and rarely hear a repeated line. We also used the same AI framework for all of the alien speech, some of which was intelligible (grunts in particular) and some that was not. With such a good speech system, we merely had to apply our production values to the recording sessions. We used professional (AFTRA/SAG) talent and produced, recorded, and edited the sessions ourselves to ensure we got exactly what we wanted. Actors were cast out of both Chicago and Seattle and came to the studio multiple times to record initial takes and pickups later on.
Figure 17-6: A Jackal has an unpleasant meeting with the front of the jeep.
The cinematic dialog also works well due to both a great storyline by the Bungie team and excellent writing by Joseph Staten. Scripts for the cut scenes and other interior scenes during gameplay were written and storyboarded so that we could clearly deliver the plot to the player. There is about one hour and 15 minutes of cut scenes in Halo, and that puts this part of Halo's production on par with a short feature film.
The Final Mix In the commercial, film, and theater industries, the final mix is one of the most important steps in post-production. Those of you who are in the game industry are likely laughing right now, as it is rare that we have the time necessary at the end of a project to do a final mix. In Halo, we got a chance to do some broad stroke mixing but were never able to get down to the minutiae that make up an ideal final mix. It's almost as if Halo is an interactive movie that changes every time you play it, and while we need to be cognizant of that fact, we can still do more to make sure that the mix "sits" nicely in all cases. There are still elements of the mix that annoy us (as there will likely be in every game we do), and all we can do is seek to minimize them as we approach the finish line for each game.
What Worked, What Didn't, and the Future of Both The Good Some of the most memorable audio experiences playing Halo came about because of a lot of hard work and a little luck. The Marine dialog, for instance, was a huge outpouring of effort in terms of scripting, recording, editing, and implementing. We did not, however, have any idea that it would work as well as it did until we got very close to shipping. The combination of great AI coding by Chris Butcher, an extensive script (153 AI categories), some top-notch improvisation in the studio by our cadre of actors, and a basic premise that each category should have as many unique permutations as possible (5,600 permutations in all) led to a tapestry of Marine dialog that is constantly changing and a great source of enjoyment. There were also a lot of other things that worked really well, including Dolby 5.1, the performance of the reverbs, Doppler, 3D positioning, and our dynamic music system. We were developing content for the game at the same time that the hardware was being developed for the Xbox. Because of this, some of these features were not a sure thing until late in the summer of 2001, mere months before shipping. The Xbox audio team did a great job delivering what they promised, making it a great platform on which to develop.
The Not So Good
There were some things that did not work so well that we will fix for future games. Reverb morphing is something we had to hack in the Bungie audio engine to prevent hearing the borders of reverb areas pop, but it does not work perfectly in all cases. The HRTF function and some 3D positioning were too heavy-handed during cinematics, and we are looking at ways to disable or limit them in future titles. In a film, character voices do not swing around to the rear speakers because the camera angle changes. In Halo, there are actually a few spots (the bridge in the first level, for instance) where the camera position is so close to a character's face that the audio engine thinks the sound source is behind the player. Looking at a close-up of the Captain's eyes but hearing his voice coming from behind you is disconcerting. In Halo, everything but the music and ambiences was stored as 22.05 kHz ADPCM compressed audio, which results in quite a bit of artifacting, depending on the source. It is most noticeable in the speech. In future titles, we will be moving to 44.1 kHz for all of our content, which will alleviate most artifacting. The initial music we produced included some fine playing by members of the Chicago Symphony and Chicago Lyric Opera Orchestra. Due to time constraints, we were forced to use sampled instruments for the bulk of the production done during 2001. While there are quite a few good samples out there, they are not a replacement for live musicians. For future titles, we will build time into our schedule to record live musicians where appropriate. Another time constraint issue was facial animation. We only had the ability to open or close a character's mouth in reaction to the amplitude of the speech sample. For future games, we will be adopting a more robust facial animation system, which will make the characters appear much more realistic when they are speaking.
There were several content areas in which we were constrained to mono sound effects, such as first-person animations and all 3D positional audio. Some 3D positional audio, such as the jeep engine, should be stereo. There is work being done both at Bungie and throughout Microsoft that will allow for stereo sound effects to be 3D positioned and will bring more life to sounds such as vehicles and weapons.
Occlusion is another area we would like to refine. Our engine did not allow for movable objects, such as doors, to occlude. For upcoming games, we will be working to make sure that anything that should occlude will occlude. Our use of DSP was limited and is an area that has a lot of potential. We will be working with our programming team to allow us full use of real-time EQ, LFO, and plug-ins such as limiters. We will also be looking at the use of better data compression for high-quality, low-memory audio in certain areas of the game.
Creating content for a game is merely half the battle. Implementing that content in collaboration with a strong programming team is absolutely essential, and in closing we would like to thank Jason Jones and the Bungie programming team for giving us the tools and support necessary to help make Halo a great-sounding game.
Chapter 18: A DirectMusic Case Study for Interactive Music on the Web
Ciaran Walsh
Overview
Background
Web content today consists of sophisticated graphics, animation, and interactive visual content. By contrast, web sites use sound only sporadically, often as the occasional button sound or introductory flourish. Web sites use music even less frequently, and this music is often repetitive and of poor sound quality. The difference that sound and music can make in a web presentation is enormous, however, just as it is for other audiovisual media, such as film or television. Music has the power to set a mood appropriate to the content and enhance the overall impact of a web site. Appropriately, for an interactive medium like the Internet, interactive music can enhance a web site further still. It can contribute to the usability of the site, assist navigation, and add emphasis to key content.
Limited bandwidth is the main reason for both the poor quality and the scarcity of audio content on the web. Standard uncompressed sound files (i.e., .wav and .aiff) are large and therefore take a long time to download over slow connections. Compression reduces the size of sound files but detracts from their quality. Poor sound quality is inconsistent with the high production values expected of quality web content. While short sound effects are usable, short, repeating music loops become tedious to the listener very quickly.
The General MIDI (GM) standard is one solution to the bandwidth problem, playing MIDI files using standard instrument mappings to ensure that the sequence performs using the correct instruments. The benefit of using General MIDI is that small MIDI files are the only sound-related files that need to be downloaded. The trouble with General MIDI, on the other hand, is that the instruments are limited to a defined set of 128, mostly variants of standard instruments such as piano, guitar, and basic orchestral instruments, and they tend to be of poor quality.
General MIDI instruments can vary dramatically in timbre and quality from one sound card to another, despite the intended standardization. Any digital musician knows, for example, that piano sounds on two different brands of sound cards can be drastically different from one another. This difference can wreak havoc on the balance and feel of a piece of music. Streaming music is one alternative to General MIDI synthesis. Streaming consists of downloading sound files bit by bit as they play, allowing playback to start relatively quickly, thus decreasing the perceived loading time of the site. Fast streaming requires a significant amount of compression, even with broadband access becoming more commonplace. Also, streaming ties up valuable bandwidth for the duration of the music file. This can slow down the loading of other pages or cause the music to stop as other pages use bandwidth to load. Looking beyond the bandwidth problem, it is worth considering that web content is interactive by definition. Standard music delivery methods are designed to play a sound or MIDI file from start to finish and do not allow for user interaction. As web content is inherently interactive, linear music (regardless of how it is delivered) can never truly do justice to the unpredictable nature of the user experience. For this, we need a system that is not only
bandwidth efficient and provides high sound quality but is also capable of responding to user actions flexibly and in real time.
DirectMusic is ideal for delivering music over the web. Its use of the DLS standard for synthesized instruments makes it bandwidth efficient. For instance, a drumbeat using a DLS instrument may consist of a kick drum sample, snare drum sample, and two hi-hat samples. All of these samples together are likely to amount to less data than one recorded measure of the whole beat, yet they allow for endless permutations and fills, while treatments such as panning and filtering further add to the possibilities. The same amount of sample data required for a single prerecorded drum pattern as part of a stream is therefore able to produce countless variations. A sound designer can manipulate and reuse samples in new instruments without adding additional material to the download. Furthermore, the sound designer can compress the samples on a case-by-case basis, allowing for the best compromise between size and quality. In addition, since the user downloads the actual DLS instrument as web content, the music sounds consistent across the full range of playback hardware. The ability to use real-time effects such as reverb, chorus, delay, and distortion further enhances the potential to make the most out of a few samples, as well as adding depth to the final mix.
DirectMusic scripting adds the vital ingredient of interactivity. DirectMusic content creators feed button clicks, mouse movements, and other triggers into the script and use them to shape the music in accordance with the user's interactions with the site. It is also possible to give the user direct control over aspects of the music. This control can range from something as simple as a volume control to much more abstract control, say, over subtle harmonic changes; the possibilities are limitless.
DirectMusic Styles and patterns allow for a great deal of control over variation, making it perfect for background music that needs to remain interesting over a long period of time (for instance, while the user is reading the contents of the site). Within a pattern, you can create up to 32 variations of each pattern part and control the playback of these in a variety of ways. Parts, and even individual notes, have parameters that can apply a degree of randomness to playback. It is possible, with these features, to create self-varying musical accompaniments guaranteed not to repeat in a predictable way. DirectMusic's efficiency, variability, and interactivity make it uniquely suited to use on the web. When I first investigated the possibility of using it in this way, however, there was no available means of communicating user interactions to a DirectMusic script. Therefore, it became necessary to create an ActiveX control to do this.
The Wide Sounds DMX Control
Wide Sounds identified the potential of DirectMusic for music on the web. We created the DMX control, an ActiveX control that accepts various control messages, converts them into routine calls or set-variable calls, and then passes these calls to DirectMusic. The DMX control can also extract information from DirectMusic and pass it back to the caller. The data for the control is stored in a .cab file on the host's web site and is downloaded and extracted when the control loads. Thus, a web page can use the DMX control as a generic interface to DirectMusic's capabilities.
Figure 18-1: The DMX control in action.
Prior to the development of the DMX control, using HTML+TIME (Timed Interactive Multimedia Extensions, a set of extensions to HTML that adds timing and media synchronization support) was the only way to play DirectMusic content from a web page. This approach did not provide the vitally important scripting functionality. In any case, Internet Explorer 6 and later no longer support HTML+TIME. Creating the DMX control enabled us to have much greater interactive control over the playback of DirectMusic content on the web than was previously possible. By allowing a web designer to call routines and set variables in a DirectMusic script from user interactions, a vast amount of interactive functionality becomes available.
Functionality
The potential uses of scripted music for a web site are almost limitless, but here are a few of the key techniques and components that I find useful when designing interactive music for a web site:
§ Play Segment: Play a specific piece of music for a content area, or start a new piece when clicking through to a different part of the site.
§ Transitions: Script changes to the primary Segment in such a way that they provide a seamless transition in order to maintain musical coherence and flow.
§ Motifs: Experiment with using motifs over the background music.
§ Chord tracks: Map out the harmony of a Style, allowing you to fit secondary Segments such as motifs harmonically over the background music.
§ ChordMaps: Vary Styles harmonically using the branching functionality of ChordMaps. Anything from simple key changes to random chord progressions is possible.
§ Mix effects: Use specially designed secondary Segments to manipulate a wide range of mix parameters. You can adjust the volume of different instruments, real-time effect values such as delay feedback, DLS instrument filter cut-off, and many more useful parameters for livening up a mix by using the continuous controllers, the Pattern Tracks, or Parameter Control Tracks.
§ Randomness: Use script routines to assign a random number to a variable. This enables interactions to have an unpredictable effect. You can control the probability of outcomes to keep the randomness within reasonable limits. You can also give variations within a Style a degree of randomness, further avoiding repetition.
§ Timers: Use timers to lock and unlock features during transitions, control queues of events to avoid clashes, or force variation after periods of inactivity. You can create timers by placing script routine calls at a certain point in a Script Track within a secondary Segment. Timers are a very useful under-the-hood element, especially in complex projects.
§ Automatic variation: Embed routine calls in Script Tracks utilizing any of the above-mentioned features. This is a useful way of avoiding too much repetition.
Consider these techniques as a "toolbox" of features with which to enhance the content of a web site.
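The Randomness item in the list above, a random number whose outcomes have controlled probabilities, can be illustrated outside DirectMusic. The following is a minimal Python sketch of the logic only, not DirectMusic script. The outcome names and weights are hypothetical, chosen purely for illustration.

```python
import random

# Hypothetical outcomes and weights (assumptions for illustration only).
# A larger weight makes the outcome proportionally more likely.
OUTCOMES = [("no_change", 6), ("play_motif", 3), ("change_key", 1)]


def pick_outcome(roll, outcomes=OUTCOMES):
    """Map an integer roll in [0, total weight) to an outcome name."""
    total = sum(weight for _, weight in outcomes)
    if not 0 <= roll < total:
        raise ValueError("roll out of range")
    upper = 0
    for name, weight in outcomes:
        upper += weight  # cumulative upper bound for this outcome
        if roll < upper:
            return name


def random_outcome(rng=random):
    """Draw a random roll and translate it into a weighted outcome."""
    total = sum(weight for _, weight in OUTCOMES)
    return pick_outcome(rng.randrange(total))
```

With these example weights, six rolls in ten leave the music unchanged, which is one way of keeping the randomness within reasonable limits.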
The Web Site We use the DirectMusic content from the Wide Sounds web site but in a simplified HTML version for the purpose of this case study. We designed the full site using Flash, and while you can control the DMX control from Flash, it is easier to demonstrate the basics using HTML. I removed some functionality and content in order to make the HTML as clear and simple as possible. The site provides information about Wide Sounds' services, our client roster, press releases, and contact info. The intended users of the site are potential clients who have heard of the company and want more information.
Figure 18-2: The front page of our site.
The structure of the site is very simple; it consists of a front page that links to six sub-pages. I removed our technology demo from this version as well as the link to its page. The Wide Sounds logo remains present throughout and provides a link back to the front page from each sub-page.
The Adaptive Audio Design
We have looked at the interactive functionality available with DirectMusic and the DMX control, and we are familiar with the structure of the web site. The next step is to plan our music treatment. We must make decisions about the style of the music that we create and how it integrates interactively with the web site.
Style and Aesthetics — What Is Right for the Site? When considering the style and functionality of the web site's music, avoid thinking in terms of your own particular taste. Similarly, do not attempt to cram in as many tricks and clever interactions as you can think of. Over-embellishment can annoy the user and obstruct or confuse the purpose of the site. Consider the user's point of view: Who are they? What do they want from the site? Visitors to the Wide Sounds site are most likely from the games industry. Perhaps they are producers searching for an overview of the company. They might want to read a bit about the services we provide and find out about our background. As we are a music company, it is natural that we want our site to feature music, and as we provide interactive music, it makes sense to reflect that in our site music as well. However, the company information is the focus, so the music should merely serve to enhance the user's experience and give a sense of quality. For the purposes of the Wide Sounds web site, we use a gentle, unobtrusive musical style that simply aims to provide a pleasant backdrop. We have no idea what kind of musical style our visitors want for their game projects, so it seems sensible to avoid an obvious game style. We do not want to pigeonhole ourselves as electronica artists, classical composers, or some other genre. Having a very general style allows us to appeal to visitors from outside the games industry, in our other markets of advertising, or on the web.
Form and Function Finding the Hooks Mouse clicks and rollovers drive the music script behind the Wide Sounds web site. These interactions are very simple to implement and give us the control we need in most circumstances. With this in mind, we can identify which interactions should control our script. We refer to these interactions as "hooks," and it is important to identify them before designing the music so that the musical content matches the functionality of the site. You should look at the site that you are creating music for in detail, become familiar with its structure, and think about how your music can complement and enhance the content. Consider the amount of time users are likely to spend in each area, which buttons will be used most frequently, and which areas or features of the site should be highlighted musically. We have six lozenge-shaped buttons on the front page used to navigate the site. Navigating between content pages is clearly the main interaction with the site, and therefore we utilize changes in the primary Segment to reflect this priority. As a further embellishment, when clicking through to a different page, an "opening" or "closing" motif plays. This technique stems from the Flash site where the new page visually expands or collapses from the button, although the musical idea remains valid without the expanding/collapsing animation. We also use button rollovers to play various notes while forcing those notes to conform to the
background harmony in a variety of ways. This breaks down to a different note on each button, all of which relate to the chord playing during the rollover. This turns a row of buttons into a basic musical instrument that plays notes in harmony with the background music! The constantly present Wide Sounds logo links back to the front page from each of the content pages. Because it is a logo, an "ident" style motif, as commonly used in corporate branding, seems an appropriate embellishment. Finally, we have added some simple controls specifically for the music:
§ Volume +
§ Volume -
§ Music Off
§ Music On
Our web site is very simple, so there are only a small number of hooks. Sites that are more complex may require more complex music implementations. DirectMusic can certainly accommodate the most demanding scripting requirements. Once you have determined the hooks you will be using to drive the music, you can begin to think about creating musical content to match the site structure and interactive behavior.
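One way to keep track of the hooks identified so far is a simple table that maps each page event to the script routine it should call. The Python sketch below only models that bookkeeping. The element names and routine names are hypothetical, not taken from the actual Wide Sounds script.

```python
def build_hooks(button_count=6):
    """Return a table mapping (event, element) pairs to routine names.

    All element and routine names are hypothetical placeholders.
    """
    hooks = {}
    for n in range(1, button_count + 1):
        # Each lozenge button plays its own note on rollover...
        hooks[("rollover", f"button{n}")] = f"RollLoz{n}"
        # ...and clicking through to a page plays the "opening" motif.
        hooks[("click", f"button{n}")] = "OpenPage"
    # The logo has its own ident motif and links back to the front page.
    hooks[("rollover", "logo")] = "RollLogo"
    hooks[("click", "logo")] = "ClosePage"
    # The four music controls.
    hooks[("click", "volume_up")] = "VolumeUp"
    hooks[("click", "volume_down")] = "VolumeDown"
    hooks[("click", "music_off")] = "MusicOff"
    hooks[("click", "music_on")] = "MusicOn"
    return hooks


def routine_for(event, element, hooks=None):
    """Look up the routine for a page event, or None if it is not a hook."""
    table = build_hooks() if hooks is None else hooks
    return table.get((event, element))
```

Listing the hooks like this before composing makes it easier to check that every interaction has matching musical content.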
Matching Music to Site Content As our site is purely informational and has no distinctly themed content areas, such as pages dedicated to the different game titles that we have worked on, there is no obvious requirement for distinctly themed musical content. Therefore, the musical backdrop focuses on atmosphere and variety. If the site had themed pages (say, historical or geographical for instance), we could incorporate elements of the respective themes in the musical backdrop. The backdrop could even transition between different pieces of music as the user navigates to different pages within the site.
Thinking as the User
Web site music should enhance a visitor's experience. For the purposes of the Wide Sounds site, the music should also encourage them to retain our services. It is important to consider the user's expectations, needs, and experience when designing music for web sites. What is the average length of time that a visitor views any given page on the site? Are there elements of the site that the user interacts with quite often? How can the musical content aid navigation and usability? Which recurring elements require consistency throughout the site for clarity? Understanding how the site functions helps in designing appropriate interactive audio.
Content Creation While a detailed discussion of content creation techniques in DirectMusic Producer is beyond the scope of this case study, there are some techniques worth considering here. Download size and the unpredictable nature of user interactions are particularly important issues when designing interactive music for a web site. Unlike most games, web sites can potentially be viewed for long periods with very little interaction. Alternatively, the interactions may come thick and fast, and the music must be able to respond without becoming incoherent. Consider these issues carefully when creating content. Make sure that DLS instruments make the most of a small amount of sample data. Build variety into your patterns to maintain interest when no script routines are being triggered. Create Segments that can transition smoothly at any point.
DLS Instrumentation Long loading times are a big turnoff for web site users, and although our DirectMusic content loads separately from the rest of the site, it is still important to keep the size of the download to a minimum. You should pay particular attention to the DLS instrument sample data, as this is the bulk of downloaded material. It is important to minimize sample data, while maintaining the highest possible quality. The target download size for this site is less than 100KB. To meet this requirement, we choose instruments that do not require large samples. Long multi-sampled orchestral instruments require many more samples than anyone has the patience to download. Use short looping source samples that spread across the keyboard without suffering undesirable effects. For example, use samples of analog style waveforms, such as sawtooth waves, as the basis for DLS instruments. Only a few cycles are required and a wide variety of sounds can be created from the same source. For drums and percussion, create a small number of source hits instead of long loops. Also, reuse samples at various pitches and with various envelopes as the basis for different instruments to get the most mileage out of them. Be creative and generate many rich, multi-layered sounds all from the same source samples.
Sample Format and Compression
DirectMusic can use DLS instruments created from wave files across a range of sample rates and bit depths. I find that 22.05 kHz, 16-bit samples usually strike the best balance between size and quality and often respond well to compression. It is possible to combine different sample rates within an instrument or DLS collection though; this is sometimes necessary to get the best results. DirectMusic in DirectX 8 has incomplete support for MP3 and WMA compression, so Microsoft ADPCM is the best compression type available in most cases. ADPCM compression reduces the bit depth to 4 bit, so do not reduce a 16-bit sample to 8 bit if you plan to compress the sample. The compression ratio achieved is typically just short of four to one, so in some circumstances it is worth using a compressed 44.1 kHz sample, if it compresses well, rather than an uncompressed 22.05 kHz version. It is important to consider how different timbres respond to compression. For instance, when using ADPCM compression on 22.05 kHz drum sounds, you may find that a snare drum hardly suffers at all, while a cymbal can become gritty and lose its shine or a rounded bass drum may gain clearly audible compression artifacts. In some cases, simply replacing one sample with another of the same type of sound can be enough to get the quality you want without drastically increasing the file size. Alternatively, you may need to use a higher
sample rate version of the source sample in order to achieve an acceptable compressed version. For this reason, you should always keep a backup of the highest quality version of every sample in your project.
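The size trade-off between a compressed 44.1 kHz sample and an uncompressed 22.05 kHz sample can be estimated with a little arithmetic. The Python sketch below is a rough estimate under stated assumptions: mono audio, a 512-byte ADPCM block with a 7-byte block header (values I am assuming as typical for mono Microsoft ADPCM), and no allowance for file or DLS headers.

```python
def pcm_bytes(seconds, sample_rate, bits=16, channels=1):
    """Size of uncompressed PCM sample data in bytes."""
    return int(seconds * sample_rate) * (bits // 8) * channels


def adpcm_bytes(seconds, sample_rate, block_align=512, header_bytes=7):
    """Estimated size of mono 4-bit ADPCM sample data in bytes.

    Assumption: each block holds a small header plus two 4-bit samples
    per remaining byte, and the header itself carries two samples.
    """
    samples = int(seconds * sample_rate)
    samples_per_block = (block_align - header_bytes) * 2 + 2
    blocks = -(-samples // samples_per_block)  # ceiling division
    return blocks * block_align


def compression_ratio(seconds, sample_rate):
    """Ratio of 16-bit PCM size to estimated ADPCM size."""
    return pcm_bytes(seconds, sample_rate) / adpcm_bytes(seconds, sample_rate)
```

Under these assumptions, one second of 44.1 kHz ADPCM data comes to roughly half the size of one second of uncompressed 22.05 kHz 16-bit data, which is why the higher sample rate can be worth trying when the sample compresses well.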
Instrument Choices Considering what we know at this point, we decided that our instrumentation should lean toward analog style pads, basses and synths based on a small number of samples, and programmed rhythm parts using a small palette of drum sounds. Of course, instrument choices can change during the creative process. Our site uses two instruments (Guitar1 and Marimba) that do not conform to these instrument choices. We felt that the trade-off between variety and download size was justified. You should consider the eventual cost in download time of each instrument that you use from the very beginning. In our final DLS Collection, four samples make up the six non-percussion instruments. The sample Pad1 is the source for three of these instruments — Arpeggio1, Pad1, and Wibble.
Figure 18-3: The DLS Collection. Look in the Waves folder in Flash.dlp, and observe that the final version uses nine samples amounting to 207KB of storage space, uncompressed. Of these, seven responded well to ADPCM compression, and the final run-time DLS file weighs in at 89KB — within our target range even before we create the final downloadable archive, at which point we can expect some further file compression. By making the right instrument choices, reusing samples, and selectively using compression, it is possible to create a good quality sound palette and keep the download to an acceptable size.
The Styles and their Patterns After identifying the required musical content and establishing the resource limits that constrain it, the next step is composition and creating the DirectMusic assets. The source material played back by Segments is contained in Styles. A Style is like a collection of MIDI tracks combined with many settings unique to DirectMusic that influence the way they play back. You can import MIDI files created in another sequencer or create content in DirectMusic Producer. The MIDI content in a Style is organized into patterns. We must create a pattern for each musical passage and define the conditions under which they will play back.
Styles for Each Content Zone
The music for our site is contained in one Style (see Figure 18-4), since the music design avoids drastic stylistic variations within the site. If the site had distinctly themed zones as discussed above, we would almost certainly want to use a separate Style for each zone, although the organization of DirectMusic assets is a matter of personal preference.
Figure 18-4: The Style, expanded to show all the patterns.
Pattern Structure All the patterns in the Style share more or less the same structure. They are all eight bars long, and they all share the same chord sequence (as defined in the Chords for Composition track) and draw on a main pool of pattern parts contained in "All" and "All2." These patterns serve as a source of linked Pattern Parts for the other patterns and have a groove level of 100 so that they themselves never actually play. Linking pattern parts reduces duplication of content and further reduces the size of the final run-time files. Our patterns fall into two groups. Patterns 101 to 106 are derived from All, while 201 to 204 are derived from All2. Patterns 101 to 104 provide a progression and some variation for the main page of the site, and Patterns 105 and 106 add a guitar motif with varying accompaniment for half of the content pages. Patterns 201 to 204 provide the music for the remaining content pages, based on a sparser arrangement of the same basic elements.
Figure 18-5: A Pattern Part using Variation Switch Points. The percussion parts use variations and variation switch points to avoid repetition and add interest to the rhythms. By adding variation switch points at frequent intervals in every variation, the path through the Pattern Part changes every time. This works particularly well with percussion parts, where some amount of unpredictability can greatly enhance the sense of a live performance.
Groove Levels
See the groove levels used by all the Patterns in Main.stp below.
Figure 18-6: The Style Designer window showing the Groove Range settings for each Pattern.
Patterns 101 to 106 use groove levels one to seven roughly in order of intensity. Groove level one is the sparsest, used only by the Segment Intro.sgp when the site loads. The Segment Front.sgp, featured on the main page of the site, uses groove range two to four, thus selecting from Patterns 102 to 104. Guitar.sgp uses groove range five to seven, that is, Patterns 105 and 106. We assigned Patterns 201 to 204 to the groove range 11 to 16; Minimal.sgp uses this groove range. The assignment of groove levels to patterns allows us to choose specific patterns or groups of patterns for playback from a Segment. The Repeat Pattern menu in the Segment's Groove Track Properties dialog box adds a further dimension to these playback choices. You can use the drop-down menu shown below to select the method for choosing from multiple patterns of the same groove level.
Figure 18-7: The Repeat Pattern menu. Weighting groove ranges can influence the choice of patterns for playback at run time. As the list of groove level assignments in Figure 18-6 shows, we assigned Patterns 201 to 203 to a single groove level, while we assigned Pattern 204 to the range 14 to 16, or three groove levels. If the Segment uses groove range 11 to 15 (as the final version does), Pattern 204 will be selected 40 percent of the time, while each other pattern appears 20 percent of the time. In a more complex project with distinctly separate content zones and Styles or where embellishments such as fills and breaks are required, it is necessary to create custom patterns for those events. You can assign these patterns to one of the various embellishment types for playback from custom Segments. Our web site did not need such patterns.
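The 40 percent and 20 percent figures above follow from counting groove levels. The Python sketch below reproduces that arithmetic under two assumptions that the text does not state outright: each groove level in the Segment's range is equally likely to be chosen, and Patterns 201, 202, and 203 occupy levels 11, 12, and 13 respectively (the text only says that each has a single level).

```python
from fractions import Fraction

# Assumed groove assignments. Pattern 204's range (14 to 16) is from the
# text; the single levels for Patterns 201 to 203 are an assumption.
PATTERN_GROOVES = {
    201: range(11, 12),
    202: range(12, 13),
    203: range(13, 14),
    204: range(14, 17),
}


def selection_odds(groove_low, groove_high, patterns=PATTERN_GROOVES):
    """Share of the Segment's groove range covered by each pattern."""
    levels = range(groove_low, groove_high + 1)
    counts = {
        name: sum(1 for level in levels if level in grooves)
        for name, grooves in patterns.items()
    }
    total = sum(counts.values())
    return {name: Fraction(c, total) for name, c in counts.items() if c}


if __name__ == "__main__":
    for name, odds in selection_odds(11, 15).items():
        print(name, f"{float(odds):.0%}")
```

Widening the Segment's range to 11 to 16 would, under the same assumptions, raise Pattern 204's share to one half, so the groove range acts as a simple weighting control.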
The Primary Segments As you know, primary Segments play back music from a Style, and secondary Segments play motifs and controller data or trigger timed script routines. Our site uses four primary Segments. The Intro plays when the web site loads using the lowest groove level. Intro's expression controller data fades in the Arpeggio and Wibble instruments. Intro also has a Segment Trigger Track that triggers a secondary Segment containing filter cutoff controller data and then triggers the next primary Segment. Main, shown below, plays after Intro and on every subsequent visit to the main page. Guitar triggers when the user clicks through to the first, third, and fifth content pages. Minimal triggers when the user clicks through to the second, fourth, and sixth content pages. Please be aware that I disabled the second content page in the HTML version of the site for this project. Figure 18-8 shows Main.sgp as seen in the Segment Designer window. All of our primary Segments are constructed in the same way. Each Track influences the selection and manipulation of Patterns in different ways, as we can see by looking at Chord Tracks.
Figure 18-8: Main.sgp.
Chord Tracks and ChordMaps
There are many reasons to use Chord Tracks in a DirectMusic project. Having a Chord Track in a primary Segment allows any secondary Segments to match the changes in harmony. This ensures coherent playback, regardless of when the secondary Segments trigger. DirectMusic can manipulate the harmony of properly mapped Patterns in real time, enabling key changes and variable chord sequences. The eight-bar chord sequence in our Patterns is expanded into a 16-bar sequence in the Segments, simply by extending the Chord Track. No new note data is required. The most important thing when working with chords in DirectMusic is to map the Chords for Composition accurately. If done correctly, harmonic changes work properly.
Figure 18-9: The Chord Properties dialog. After assigning chords to the relevant measures, set the playmode for the different tracks so that, for instance, percussion parts do not transpose. Inversion boundaries are also useful to restrict the range across which a part can transpose. The seven different playmodes provide much control over Pattern Part responses to chords in a Chord Track. You can also override the Pattern Part playmode setting on a note-by-note basis. The rollover events on the main page of our site illustrate the different playback modes. For instance, the lozenge-shaped buttons each play a single note using the Chord/Scale playmode. These notes are therefore transposed to the current chord and scale, maintaining their harmonic functions. The Wide Sounds logo plays a motif that uses the Pedal Point playmode, playing the same notes every time unless those notes don't appear in the current scale. If the notes fall outside the scale, they are transposed to the nearest appropriate note. It is possible to create variable branching chord sequences using ChordMaps and Signpost Tracks. By using weighted connection lines between the chords in a ChordMap and a set of signposts to determine the boundaries of the chord sequence, you can create anything from simple alternating key changes to rambling, unpredictable freeform chord patterns. This kind of sophistication is not necessary for our simple site, but it can add a huge amount of variability while barely increasing the size of the download.
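The Pedal Point behavior described above, where a note is kept unless it falls outside the current scale and is otherwise moved to the nearest suitable note, can be modeled very roughly as follows. This Python sketch is a simplified illustration, not DirectMusic's actual algorithm. The example scale and the tie-breaking rule (preferring the lower neighbor) are my assumptions.

```python
# Pitch classes of an example scale (C major), used only for illustration.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}


def snap_to_scale(midi_note, scale=C_MAJOR):
    """Return midi_note unchanged if it is in the scale, otherwise the
    nearest note that is. Ties go to the lower note (an assumption)."""
    if midi_note % 12 in scale:
        return midi_note
    # Searching up to six semitones either way covers every pitch class.
    for distance in range(1, 7):
        for candidate in (midi_note - distance, midi_note + distance):
            if candidate % 12 in scale:
                return candidate
    raise ValueError("scale is empty")
```

Running a motif's notes through a function like this shows why a pedal-point motif sounds the same over most chords and only shifts when the scale changes underneath it.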
The Secondary Segments Now that we have our primary Segments, we can create the secondary Segments containing motifs and controller data that will be triggered on top. In the planning stages of the Wide Sounds site, we established that we needed a number of motifs. Loz1.sgp through Loz6.sgp are individual notes used as rollover sound effects for each of the six content buttons. WideLogo.sgp is a rollover sound effect for the Wide Sounds logo. WipeUp.sgp is an opening motif for clicking through to a content page. WipeDown.sgp is a closing
motif for returning to the main page. Lastly, there is an extra motif, Scr.sgp, that is unused in the HTML version of the site; it is a tuned clicking loop used while scrolling text in the Flash version.
Note: My preference is to create motifs as Pattern Tracks in Segments rather than use the motif object type within a Style, because I find Segments easier to control and more flexible than motif objects.
It is very important to test motifs thoroughly in context, bearing in mind that you can call the triggering routines at any time. You can use the Secondary Segment toolbar in DirectMusic Producer, shown below, to audition motifs while a primary Segment is playing. Play every combination you can think of, and observe what happens when multiple instances of the same motif trigger in quick succession. This may cause overlapping notes to cut off, polyphony to be exceeded, or the motif to sound extremely loud. If undesirable effects occur, you will need to add functionality to your script to prevent the playback of multiple instances.
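The logic needed to prevent multiple instances is a simple lock: remember when each motif will finish and ignore triggers that arrive before then. The Python sketch below shows only that logic. In a DirectMusic project it would be built from script variables and timed routine calls, and the durations used here are hypothetical.

```python
class MotifGuard:
    """Ignore triggers for a motif that is still playing."""

    def __init__(self):
        # Maps motif name to the time at which it stops being "busy".
        self._busy_until = {}

    def try_trigger(self, motif, now, duration):
        """Return True if the motif may play at time `now` (in seconds).

        A successful trigger locks the motif for `duration` seconds.
        """
        if now < self._busy_until.get(motif, 0.0):
            return False  # still playing: drop this trigger
        self._busy_until[motif] = now + duration
        return True
```

For example, a rollover on the logo at 0.0 seconds with a two-second motif would block a second rollover at 1.0 seconds but allow one at 2.0 seconds, while other motifs remain free to play.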
Figure 18-10: The transport controls and Secondary Segment toolbar. Once you create your script, you can test the functionality of motifs (and the script routines that drive them) further, but in the early stages the Secondary Segment toolbar is a great help. There is one final secondary Segment in the project, triggered upon loading and left to loop continuously from then on. ClickFilt1.sgp contains a Pattern Track with filter cutoff controller curves affecting the instrument Arpeggio.
Figure 18-11: Controller data in a CC Track, simulating an LFO. The purpose of this Segment is to act as an LFO, constantly tweaking the instrument's filter cutoff to add interest to the sound.
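To make the LFO idea concrete, the following Python sketch generates a list of controller values that trace a sine wave, which is the kind of shape you would draw as a curve in the CC Track. It is only an illustration: the point count, the value range, and the function name are assumptions, not settings taken from ClickFilt1.sgp.

```python
import math

def lfo_controller_values(points_per_cycle, cycles, low, high):
    """Return MIDI controller values (0-127) tracing a sine wave
    between low and high, one value per point."""
    centre = (low + high) / 2
    depth = (high - low) / 2
    values = []
    for i in range(points_per_cycle * cycles):
        phase = 2 * math.pi * i / points_per_cycle
        value = round(centre + depth * math.sin(phase))
        # Keep every value inside the legal MIDI controller range.
        values.append(max(0, min(127, value)))
    return values

# Two cycles of a slow sweep between controller values 40 and 100.
curve = lfo_controller_values(points_per_cycle=16, cycles=2, low=40, high=100)
print(curve[:16])
```

Because the Segment loops, one or two cycles of the curve are enough; the loop supplies the continuous motion.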
AudioPaths and Effects Real-time effects can give depth to sounds made from small source samples, turn mono into stereo, transform a sound into something completely different, and enhance the depth of the mix. However, when using many effects, it is important to consider the type of computer the end user has. It is a good idea to test using a minimum spec PC to make sure that you have glitch-free playback. Look at FlashPath.aup and observe our AudioPath; it consists of four mix groups. One is dry, two use different combinations of Reverb and Echo, and the other uses the I3DL2 Reverb. A configuration like this should not cause any playback problems. Using the reverbs and stereo delays adds significant depth and atmosphere to the overall sound, which is, after all, created entirely from tiny mono samples.
Figure 18-12: FlashPath.aup in the AudioPath Designer window. To ensure that any Segments you audition in DirectMusic Producer play back on the correct AudioPath, select FlashPath.aup from the default AudioPath drop-down menu located above the Project folder.
Testing the Functionality Just as the Secondary Segment toolbar allows you to test the playback of motifs and other secondary Segments, the Transport List and Controls allow you to select primary Segments for playback. Right-clicking on the Transition (A/B) button brings up the Transition Options dialog (see Figure 18-13), allowing you to audition the various transition methods between primary Segments. The available settings correspond to those in the Boundary tab of the Segment Properties dialog, many of which can also be set in the script. This is a good way to test how the transitions sound and fine-tune your boundary settings before you create the script.
Figure 18-13: The Transition Options dialog. At this point, we have created and tested all of the required content, as defined in the planning stage. In summary, the project consists of one Style, one DLS Collection, four primary Segments, seven secondary Segments, and one AudioPath. We are now ready to script the project.
Scripting Scripting is the key to creating interactivity with DirectMusic. The script drives the music in response to calls to its routines from the DMX control. You have to create script routines for all of the hooks defined previously. You may find "under-the-hood" routines useful in controlling automated functions and in testing conditions for other routines. Getting the script right is vital for properly functioning music. The slightest error can cause a routine, or even the whole script, to fail. Having said this, AudioVBScript is a very simple scripting language. Microsoft designed an easy-to-learn system for people without a programming background. I learned how to use AudioVBScript quickly, with no programming experience. This section provides some useful pointers in relation to scripting a web site project.
Creating the Script Consider in advance what routines your script requires. For our site, we need routines to trigger the motifs Loz1.sgp to Loz6.sgp, WideLogo.sgp, WipeUp.sgp, and WipeDown.sgp, the transitions between the primary Segments, and the volume controls. Additionally, the DMX control must trigger the "start" routine automatically once the content loads. Name these routines clearly early on; this way, you can easily identify which routines the script uses in performing various tasks when we add the calls into our HTML. I like to go through the site, writing down the names of the routines I will use for each interaction as I come across it. You must make the content files available to the script before it can use them. Do this by dragging the files from the project tree into either the Embed Runtime or Reference Runtime containers in the Script object. Each approach has advantages and disadvantages. Dragging a Segment into the Embed Runtime container embeds the Segment into the run-time version of the script file. In other words, you do not need to include the run-time Segment file when you deliver the final content files. This method is attractive, as it reduces the number of files required for delivery. On the other hand, embedding run-time files often involves complicated dependencies of embedded files, especially where you use multiple scripts. The Reference Runtime container does exactly what it says. You must include the run-time Segment file along with the script file. The script file references the Segment file at run time. The appeal of this approach is that you do not need to worry about dependencies, as long as all files are present. It also allows you to update a single component file without changing any of the other files, which is often useful in the final stages of testing. Drag the content files into the Reference Runtime container; then create the routines required to play them. 
A sensible starting point is the start routine that the DMX control calls once everything loads. This script creates a basic start routine:

sub Start
    Intro.play
end sub

Although it is not required for the site, it is always useful to have a "stop" routine in the script while testing. The following stop routine halts any looping Segments. You do not need to include the short, non-looping motifs, as they never continue more than a second or so after the primary and secondary Segments stop.

sub Stop
    Front.stop AtImmediate
    Intro.stop AtImmediate
    Minimal.stop AtImmediate
    Guitar.stop AtImmediate
    ClickFilt1.stop AtImmediate
    Scr.stop AtImmediate
end sub

The AtImmediate flag ensures that the Segments stop right away, regardless of their boundary settings. Any routines in the script appear in the list to the right of the script (see Figure 18-14). Double-clicking on a routine in this list executes that routine. By entering one routine at a time, you build the script up gradually until all the required functionality is present.
Figure 18-14: Routines and variables displayed in the Script Designer window.
Creating an AudioPath Object In DirectMusic Producer, Segments play back on the AudioPath selected in the AudioPath list, shown below. If you want to use a custom AudioPath at run time, you must specify the AudioPath in your script.
Figure 18-15: The AudioPath list.
Create an AudioPath object when the script starts and specify that object each time a Segment plays. This ensures that the run-time version uses the correct custom AudioPath.

sub CreateAudioPath
    if AudioPathLoaded <> 1 then
        AudioPathLoaded = 1
        FlashPath.load
        set AudioPath1 = FlashPath.create
    end if
end sub

The routine above, when run for the first time, loads the AudioPath FlashPath and sets the variable AudioPath1 to the object returned by FlashPath.create. On later calls, AudioPathLoaded is already 1, so nothing is loaded again. Adding the parameter AudioPath1 to a play command now ensures that the Segment plays back on the correct AudioPath.

sub Start
    CreateAudioPath
    Intro.play 0, AudioPath1
end sub

This version of the start routine initializes the AudioPath by running CreateAudioPath and then plays Intro back on the correct AudioPath.
Working with Variables Scripts often require variables for retrieving, setting, and storing integer values such as Master Volume or for conditional statements. Variables exist within an individual routine, or they can be global and thus available to any routine. Define global variables at the top of the script by typing dim VariableName. Once you define the global variables, they appear in the variable list (below the routine list; see Figure 18-14). This list displays the current value of global variables and allows you to edit them manually for testing. The Script Volume.spp, which is dedicated to the site's volume controls, uses the global variable Vol to track the master volume of the Performance. Calling Down first uses GetMasterVolume to retrieve the current master volume and then sets the variable Vol accordingly. Down uses Vol in a conditional statement to determine whether the master volume is above its minimum value. If it is, Down reduces Vol by 500 (an attenuation of 5 dB). Down then uses SetMasterVolume to set the master volume to the value of Vol.

sub Down
    Vol = GetMasterVolume
    if Vol > -9600 then
        Vol = Vol - 500
    end if
    SetMasterVolume Vol
end sub

Volume exists as a separate script because I like to keep mechanisms such as volume controls as easily accessible chunks of script. This is a habit that I acquired from working on much more complex game projects, and it is arguably unnecessary in such a small project. Communication between the scripts is straightforward. Simply add the name of the target script to the routine call in the parent script, as shown in the following example.

sub VolDown
    Volume.Down
end sub

Variables can also be set and retrieved between scripts in the same way. Achieving desired functionality often requires complex interactions between multiple routines and variables, especially in situations where the result of a particular routine differs depending on the currently playing Segments. For example, when a transition occurs, you may want to lock out certain motifs or prevent another transition from occurring until the first one finishes. In these situations, you may need to use several variables to track exactly what happens at any given moment. You may also need to use conditional statements to determine whether to start the new transition or motif or initiate a queuing or locking mechanism. Our little web site happily lacks such complexities, but DirectMusic scripting can cope with most things that you care to throw at it if necessary! Even with a simple project, keep scripts clear and tidy. There are likely to be logical groupings of routines, and maintaining clarity makes editing, testing, and bug fixing considerably quicker and easier. Any text that comes after an apostrophe is a comment (see below). Comments are useful for explanatory notes or disabling bits of script while testing.
sub Lozenge1
    Loz1.play IsSecondary, AudioPath1 ' This is a comment
end sub

' sub Lozenge2
'     Loz2.play IsSecondary, AudioPath1    This whole routine is commented out
' end sub

As with any interactive system, it is impossible to overemphasize the value of thorough testing. Test constantly throughout the scripting process, and thoroughly check the script after any changes, no matter how isolated they are.
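One detail of the Down routine discussed earlier is worth testing for: stepping down from 0 in units of 500 passes through -9500, which is still above -9600, so the next step requests -10000. The text does not say how SetMasterVolume treats a value below the floor, so the Python sketch below models the same arithmetic with an explicit clamp. It is an illustration only, not AudioVBScript; the step_up counterpart and its maximum of 0 are assumptions.

```python
MIN_VOLUME = -9600  # floor tested by the Down routine, in hundredths of a dB
STEP = 500          # 500 hundredths of a dB = 5 dB

def step_down(volume):
    """Lower the volume by one step without going below MIN_VOLUME."""
    return max(MIN_VOLUME, volume - STEP)

def step_up(volume, maximum=0):
    """Hypothetical counterpart for a volume-up control."""
    return min(maximum, volume + STEP)

# Walk from full volume down to the floor, recording each setting.
volume = 0
settings = [volume]
while volume > MIN_VOLUME:
    volume = step_down(volume)
    settings.append(volume)
print(settings[-3:])
```

The same clamp could be written in the script itself with one extra conditional after the subtraction.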
Delivery Once everything checks out in DirectMusic Producer, test the system as part of the web site. First, create the run-time versions of the DirectMusic files.
You can save all the run-time files at once from the File menu (see Figure 18-16). The default destination is a folder called RuntimeFiles created in the Project folder. If you embed the content files rather than reference them, it may not be necessary to save run-time versions of every file. Avoid duplicating files or including obsolete files that increase the total size of the download.
Figure 18-16: Saving run-time files from DirectMusic Producer.
Packaging of Run-time Files The DMX control downloads a compressed .cab archive file and expands it to temporary files. This means only a single file needs to download, and a final stage of compression applies to the content files. Using a compression utility such as WinAce (see Figure 18-17), add the run-time files to an archive with "Normal" compression. Our archive is WSMain1.cab and resides in the site structure in /music/.
Figure 18-17: Compression settings in WinAce.
Implementation The DMX control works with HTML, Flash, and other web technologies. You can also use it in other applications that support ActiveX controls, such as MS PowerPoint and Word. To use the DMX control with HTML, you need a few lines of JavaScript to direct the routine calls from the HTML to the DMX control, which is located in a different frame (refer to the discussion on framesets in the "Putting It All Together" section) and identified as "Ctrl." The JavaScript audio.js is located in the Includes directory in the root of the site. Reference audio.js in the head of each HTML document containing routine calls as follows:
Vol + | Vol - | Music Off | Music On
Each anchor tag defines the text (e.g., Vol +) as a dummy link (href="#") with the appropriate routine call added using the OnClick trigger. Multiple triggers are usable in the same tag, as in the following example:
This navigation button has one action for a rollover (the tuned single note) and another for a click ('Section1'), which triggers two routines: one starting a new primary Segment at the next measure and the other playing the motif WipeUp.sgt. Going through the HTML and adding the relevant triggers where necessary completes the music implementation. This is simple once the routine calls pass successfully into the DMX control, provided that you test each piece of functionality as you add it.
Test, Test, Test! The final step, as always, is testing the music in the context of the site. As the permutations of interaction are vast and highly unpredictable, it is sensible to spend some time using and abusing the site to see how the music performs. Although users are likely to move at a sensible pace around the site, stopping to read some sections, perhaps skipping over others after a cursory glance, we need to be sure that it stands up to anything that the user may do. Some time spent trying as hard as possible to break it with frenetic button clicking is very worthwhile.
Chapter 19: A DirectMusic Case Study for Worms Blast Bjorn Lynne
Game Overview Worms Blast was developed by Team17 Software Ltd. and published by Ubi Soft Entertainment. I (Bjorn Lynne) composed and arranged all of the music. That being said, let's take a look at how the music was created for this unique interactive game title.
Figure 19-1: Worms Blast. Worms Blast looks simple at first but reveals its depths and surprising longevity as you play it, much in the same way as Tetris or the predecessor of Worms Blast, Worms. The game features two players, each placed in a boat floating on water. The water has waves and is difficult to move around in. The water level rises if objects fall in the water. Overhead is a slowly descending landscape of colored blocks, and between the two players is a wall. Occasionally, a window opens in the wall between the two players, allowing the players to fire directly at each other with a selection of wacky comedic weapons. The landscape overhead moves slowly down toward the player. By using a selection of weapons, the player can shoot color combinations of the blocks that make up the overhead landscape, causing them to disappear. The game has several different modes, each with various goals, such as outlasting your opponent, collecting certain combinations of crates and colors, hitting certain targets, and so on. Worms Blast has a large number of hidden weapons and comedy elements. The game features characters already established in the long-running and highly successful series of Worms games, and it has a fun, bright, and colorful appearance. All in all, it is a game that looks very simple at first, but the longer you play it, the more you discover its depth and lasting playability.
Figure 19-2: Worms Blast.
Music Preparation In the early stages of trying out music content for this game, I used a custom-made music system featuring a fairly straightforward background music track along with short sampled phrases of various instruments, each belonging to one of the different characters in the game. So if the Calvin character scored points or picked up a weapon, his instrument would play a motif at the next possible entry point, marked in the background music file with markers. The system worked well enough, but because all phrases had to fit anywhere in the background music, the music had to be very uniform, and it just didn't sound very inspiring or heartfelt. The music turned out rather bland in comparison to what I had hoped for. I scratched my head for a while and decided to give DirectMusic a go. I had witnessed demonstrations of DirectMusic at a couple of game development seminars. I was impressed with the technology but less so with the resulting music. The concept of having a piece of music written to a set of chords and then overlaying a new set of chords to that music scared me. This was probably because the examples I was hearing, while interesting from a technical point of view, just didn't sound very good. Based on this impression, I decided early on that I would not use any of DirectMusic's chord features in this game. I was adamant that if I wrote a melody, I wanted that melody played back exactly how I wrote it. I had never used DirectMusic Producer before, and as I loaded up the application for the first time, I admit I was a little intimidated by the sheer number of screens, options, and features. Much of the terminology was new to me: Segments, Bands, Styles, Groove Levels, and ChordMaps. I've written music for games for the last ten years and used just about every type of music and audio program available, but this was the first time I encountered these terms. 
Much of the first week was spent studying the supplied demo files and reading the online help documentation. Things began falling into place, and pretty soon I had my first sets of patterns and motifs up and playing.
Making the Motifs Fit the Background Music As the first bars of real game music came together, it occurred to me that my decision to steer clear of all chord features could land me with the same problem as with my earlier non-DirectMusic attempts. Along with the background music, I was writing a number of small motifs to be played on top of the background music. In order for the motifs to work together with the background music at all times, the background music needed to be uniform. After all, if the motif was written in G7 and I refused to let DirectMusic alter it in any way, then when the background music played in, say, B, we would have some less-than-appealing note combinations. I went for the middle ground, writing the music using one scale and a wide variety of chords based on that scale. I found that my motifs would work equally well on a C minor as a G minor or an Fsus4, so long as I stuck to my scale. Even if the background music was currently playing an Fsus4 (which doesn't contain a G), if my motif had a G in it, that was fine, so long as it didn't include a note that was outside of the scale. I picked a scale to work with and made highly melodic music using plenty of chords. My motifs played successfully on top of this music without ever having DirectMusic change or tweak a single note of music that I wrote. Now I could really get down to work and write the real meat of the music content for Worms Blast.
Samples and Instruments I created samples by recording my synthesizers, guitars, and other instruments using SoundForge. I saved these recordings out as wave files and imported them into DirectMusic Producer, where I sat for hours and built my DLS instrument banks. I used DirectMusic Producer to work out loop points, rather than SoundForge. I was always very impressed with the "find best loop end" feature in DirectMusic, which seems to be better at finding functional loop points than anything else I've tried. In fact, I still use DirectMusic Producer to find loop points in files when creating samples. I decided early on not to use the preconfigured General MIDI instruments. In my experience, music made with General MIDI instruments tends to get that "General MIDI" sound that gives off a whiff of low budgets and shareware distribution. So I custom made every sample in every instrument used in the game. One of the most time-consuming tasks involved in this project was the breaking down of premade drum loops. I split up each loop into several very short loops, which I then rebuilt in DirectMusic Producer. I would take a one-bar drum loop and slice it up into sixteenths. This resulted in 16 stereo samples. Then I would import these 16 stereo samples into DirectMusic Producer and design an instrument with 32 regions — 16 left and 16 right. At the time I was using DirectX 8.0, and DirectMusic Producer treated stereo samples as two separate samples, so two separate regions had to be created in the instrument designer. Luckily in version 8.1, this can be done faster and easier, since the instrument designer now recognizes stereo samples and automatically creates the two separate regions. A useful tool in the above process is Square Circle Software's WaveSurgeon (www.wavesurgeon.com), which slices loops and saves the individual short phrases as separate wave files and also exports a MIDI "timing template." 
This small MIDI file can be imported into DirectMusic Producer as a pattern and, when played with the instrument created from the individual loop slices, gives a reproduction of the drum loop, which can be played back in different tempos. So if the music increases or decreases in tempo, or the game experiences a momentary system halt or slowdown, the drum loop will still be locked to the rest of the music.
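The reason a sliced loop stays locked to the music is simple arithmetic: each slice is triggered by a note at a musical position, so its start time in seconds follows the tempo. The Python sketch below works this out for a one-bar loop cut into sixteenths. It assumes 4/4 time; the function name and the example tempos are made up for illustration.

```python
def slice_onsets(tempo_bpm, slices_per_bar=16, beats_per_bar=4):
    """Return the start time in seconds of each slice of a one-bar loop
    when its trigger notes are played back at tempo_bpm."""
    bar_seconds = beats_per_bar * 60.0 / tempo_bpm
    slice_seconds = bar_seconds / slices_per_bar
    return [i * slice_seconds for i in range(slices_per_bar)]

original = slice_onsets(120)  # tempo the loop was recorded at (assumed)
faster = slice_onsets(144)    # same pattern after a tempo increase

# Slice 8 marks beat 3 of the bar at either tempo.
print(original[8], round(faster[8], 4))
```

The slices themselves are not stretched; only their trigger times move, which is why very large tempo changes can leave audible gaps or overlaps between slices.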
The Project Structure Worms Blast was my first project using DirectMusic, so I was still a little hung up on old ways of thinking about music as something that has a beginning, a middle, and an end. I set out to create four different in-game "music tracks," each with a dedicated sample collection and its own project folder. The game would then pick a random project at the start of each round, load it, and start to play music from it. Halfway through the project, I had some regrets about not creating all of the music in the entire game as a single project. Had I done so, it would have been easier to share instruments, patterns, and motifs. Looking back at it now, I've come to realize that perhaps it was for the best after all. By separating the in-game music into four separate projects, plus a fifth project for the front-end menu system, I had less of a temptation to reuse the same instruments in too many of the patterns. Since I was creating separate DLS instrument collections for each of the five music sets, I felt that I might as well make all the instruments unique to that set. This gave the music and instruments plenty of variety, an important issue for those wanting to play the game for hours. Creating five separate projects kept the number of Styles, Segments, and Patterns to a manageable level. Each of my five projects contained about 60 different Patterns, 30 Segments, ten motifs, but only one Band. This helped reduce the level of confusion that can sometimes arise from the fact that, in a DirectMusic project, there are different Bands all over the place that can affect the instruments in unexpected ways. This isn't helped by the fact that Bands sometimes have to be copied from a Style to a Segment or to a Script. If you go and update one of those Bands, the other places where the same Band exists do not get updated, leaving you with several copies of a single Band with the same name, which you thought were all the same but are in fact different from each other.
The fact that I had only one Band for each of my five different projects reduced the chances of getting into a mess with the Bands, even though that Band had to be copied to about 30 different Segments, leaving the copies independent of each other.
Using Groove Levels to Set the Mood In Worms Blast, the player must keep his character from being squashed between the water under his boat and the blocks overhead. If the water level rises too high, the space above the character's head will narrow, and he will soon find himself in trouble. I wanted to use this in the music, to let the distance between the water and the overhead blocks determine a "stress factor" or "danger level," which would be reflected in the music. For each of my four different in-game music projects, I wrote about six different main musical parts. I then created five different versions of each of these parts with different intensities. For example, I had an eight-bar pattern called "main melodic part A - level 1." It was calm, laid-back, almost cozy. I copied it to a new pattern and called it "main melodic part A - level 2." In this version I introduced a few more drum sounds, but it was still pretty laid-back. I kept doing this until I had five different "mixes" of the same pattern. The fifth one was very intense, with drums playing double-time and some distorted instruments playing busy melodies along with the main melody. I then assigned groove level ranges to these patterns: 11 through 20, 21 through 40, 41 through 60, and 61 through 80. I asked the game programmer to use the distance between the water and the overhead landscape to generate a number between 11 and 80. This number fed into the DirectMusic engine as a groove level. With this model in place, whenever the water gradually sank or rose or the landscape overhead gradually caved in or withdrew, representative patterns would play. The end result was that the stress level of the music followed how much danger the player's character was in. I composed a few more patterns and assigned them to groove levels one through ten. These patterns were very low intensity, basically just a subdued rhythm with a few occasional notes.
I called this "pause music" and asked the game programmer to set a groove level of one when the game was in Pause mode. Pausing the game causes the music to change to "pause music," while keeping a steady rhythm going. This worked really well; it sounded natural, slick, and cool. In addition to the different pattern intensities, we also implemented slow, subtle changes to the master tempo, which was tied to the groove level. As the distance between the water and the blocks decreased, the groove level increased and so did the tempo. We created a tempo scale from about 100 percent to about 120 percent of the original tempo, as set in the Segments, and had the tempo change by one percent at a time. This combined with the different patterns with varying intensity levels to further enhance the connection between the stress level in the game and the stress level in the music.
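The game-side mapping described above can be sketched in a few lines of Python. This is a guess at the shape of the logic, not the code Team17 used: the linear mapping, the function names, and the way distance is normalized are all assumptions. Only the ranges (groove level 1 for pause, 11 to 80 in play, tempo from about 100 to about 120 percent in one-percent steps) come from the text.

```python
PAUSE_GROOVE = 1
MIN_GROOVE, MAX_GROOVE = 11, 80

def groove_level(distance, max_distance, paused=False):
    """Map the gap between water and blocks to a groove level.
    A smaller gap means more danger and a higher level."""
    if paused:
        return PAUSE_GROOVE
    ratio = min(max(distance / max_distance, 0.0), 1.0)
    danger = 1.0 - ratio
    return MIN_GROOVE + round(danger * (MAX_GROOVE - MIN_GROOVE))

def target_tempo_percent(level):
    """Scale tempo from 100 percent at the calmest in-play level to
    120 percent at the most intense."""
    danger = (level - MIN_GROOVE) / (MAX_GROOVE - MIN_GROOVE)
    return 100 + round(max(0.0, danger) * 20)

def step_tempo(current, target):
    """Move the tempo one percent toward the target per update."""
    if current < target:
        return current + 1
    if current > target:
        return current - 1
    return current

level = groove_level(distance=25, max_distance=100)
print(level, target_tempo_percent(level))
```

Stepping the tempo one percent per update, rather than jumping to the target, is what keeps the change subtle.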
Motifs for Minor Game Events I decided to use "motifs" in two different ways in Worms Blast: one for major game events, such as losing a life, triggering a firestorm, triggering double-damage mode, etc., and another one for minor game events, such as picking up a crate, gaining some extra health, hitting a target, etc. I used motifs to react to minor game events. We had nine different motifs, each containing just a few notes with bright, sparkly instruments so that they would cut through the mix and be heard even if there was a lot of stuff going on in the background music. I set the motifs' boundary settings to Beat so that they would play in time with the music, starting from the next quarter note. This ensured that the motifs would play almost instantly but still not fall out of the rhythm set in the background music.
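To see why a Beat boundary feels "almost instant," it helps to work out the worst-case delay. The Python sketch below computes when a motif requested at an arbitrary moment would actually start. It is a simplified model that assumes a constant tempo and 4/4 time; DirectMusic handles this scheduling itself, and the function name is invented for illustration.

```python
import math

def next_boundary(now_seconds, tempo_bpm, beats_per_boundary=1):
    """Return the time of the next boundary at or after now_seconds.
    beats_per_boundary=1 models a Beat boundary; 4 models a bar in 4/4."""
    interval = beats_per_boundary * 60.0 / tempo_bpm
    return math.ceil(now_seconds / interval) * interval

request = 1.3  # seconds into the music when the game event fires
motif_start = next_boundary(request, tempo_bpm=120)
bar_start = next_boundary(request, tempo_bpm=120, beats_per_boundary=4)
print(motif_start - request, bar_start - request)
```

At 120 beats per minute the wait for the next beat is never more than half a second, whereas waiting for the next bar line can take up to two seconds.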
Motifs for Major Game Events Some important game events in Worms Blast could not be adequately represented by simple motifs. These events included losing a life, triggering a very big and powerful weapon, or entering some special game mode. There were 18 of these major game events that were important enough not only to play a motif on top of the background music but to actually come in and replace the background music completely for a little while. The way I solved this was to write 18 different patterns with very distinct music, using dramatic effects and instruments that stood out, such as gongs, sweeps, electro effects, etc. I had earlier reserved groove levels 81 through 100, so I assigned these 18 patterns to individual groove levels from 81 up to 98. For example, the two-bar pattern for "Double Damage Mode" contained a Chinese gong and a little Chinese-style koto melody, and it was assigned groove level 87. When "Double Damage" occurred in the game, the game would input a groove level value of 87. This resulted in my two-bar "Double Damage" tune being played from the next bar line, replacing the background music instead of just playing over the top of it. The tempo remained unchanged, which helped make this two-bar break fit into the soundtrack as a whole and sound like it was meant to be there. At the end of each two-bar "special event" pattern, I put a drum fill-in/build-up to create a natural-sounding transition back to the regular background music.
Things I Would Have Done Differently I can really think of only one major mistake that I made in the development of the Worms Blast music, and that was to sample all my instruments at 44.1 kHz, like I've always done. Through my years of writing and producing music, I've gotten used to always starting out with files of the highest possible fidelity and then downgrading as and when necessary to save resources. I assumed that the music would play back at 44.1 kHz in-game. My plan was to keep the most "fidelity-demanding" instruments, such as sparkly keyboards, hihats, shakers, transparent and open pads, etc., in 44.1 kHz but downgrade other samples, such as bass sounds, etc., to a lower sample rate to save system resources. However, several months later I was told that the audio engine in the game would only run at 22.05 kHz. So, having some samples in 44.1 kHz would be a waste of resources since it would, in effect, be resampled to 22.05 kHz at the point of delivery. It turned out that I might as well have done all of my samples in 22.05 kHz from the beginning, as I had to downgrade everything to 22.05 kHz in order not to waste system resources with 44.1 kHz samples. With DirectMusic Producer version 8.0, there was no easy way to resample samples. I had to go back to my original sample sources, use SoundForge to resample to 22.05 kHz, save each sample to a new filename, and then go back to DirectMusic Producer and highlight the sample, choose Replace Sample, and then find the newly saved 22.05 kHz sample on my hard drive. This may not sound so bad, but many of my instruments had many regions with separate samples — not to mention my sliced loops, which could have up to 20 stereo samples (which DirectMusic Producer at the time treated as two separate samples), making it necessary to perform this time-consuming procedure to 40 separate samples, just for a single instrument. 
Actually, it was after this process that I managed to convince the DirectMusic developers to put a "resample" feature in DirectMusic Producer, something that was available from version 8.1. That was too late for me, but at least I feel I can take some credit for getting that feature included for others to benefit from. I guess the moral of the story is that, when working with DirectMusic, you should forget your old method of always working on a copy of your music in the best possible fidelity and then downgrading as and when necessary. Instead, find out the sample rate that the end product is going to be delivered at before you do any sampling at all, and use that as your highest sample rate.
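The cost of the extra sample rate is easy to estimate. The Python sketch below computes the size of uncompressed sample data at the two rates discussed. It assumes 16-bit stereo samples and a two-second sample length purely for illustration; the text does not state the bit depth or lengths used in the game.

```python
def sample_bytes(seconds, sample_rate, channels=2, bytes_per_sample=2):
    """Size of uncompressed PCM sample data in bytes."""
    return int(seconds * sample_rate * channels * bytes_per_sample)

full = sample_bytes(2.0, 44100)  # as originally recorded
half = sample_bytes(2.0, 22050)  # at the engine's playback rate
print(full, half, full - half)
```

Halving the sample rate halves the memory for every sample, so across a bank of roughly 150 multi-region instruments the saving adds up quickly.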
Conclusion To sum up the Worms Blast music design, the game features the following:
§ Four different "sets" of background music
§ About eight different musical parts in each set
§ Five different intensity levels in each part
§ Eighteen different two-bar patterns for major game events
§ Nine different small motifs for minor game events
§ Tempo changes tied to intensity
§ Separate DirectMusic content for the front-end menu system, the credits screen, and the world map screen
All in all, the game contains 338 musical patterns, most of which are eight bars long (some are two bars long), and 36 motifs. It also has approximately 150 custom-made instruments, most of which contain multiple regions. Perhaps equally interesting is what this implementation doesn't have. First of all, it has no chords. Of course, the music has chords, but as far as the DirectMusic engine is concerned, everything is in the default chord of C major. It also doesn't have ChordMaps, a concept I admit I still haven't really looked into. The game doesn't use any DirectMusic scripts. Instead, most of the interactive features are implemented by the use of groove levels.
Chapter 20: A DirectMusic Case Study for Asheron's Call 2: The Fallen Kings
Jason Booth
Introduction Turbine Entertainment developed Asheron's Call 2: The Fallen Kings. Geoff Scott composed the music for the game. Dan Ogles handled the DirectMusic programming. I designed the music system and integration.
Music for Massively Multiplayer Games Understanding the dynamics of the game is the first step to designing a music system. It is important to ask yourself many questions about the player's experience that go beyond simple stylistic concerns. Asheron's Call 2 is an "online massively multiplayer roleplaying game," or MMRPG for short. A massively multiplayer game involves thousands of users playing together in one large fantasy game world, hosted online. A role-playing game (RPG) is one in which the player assumes the life of a fictional character that he or she creates. We call this character an avatar. An avatar in Asheron's Call 2 is a three-dimensional character that can be one of a number of fantasy races, be either gender, and have various customizable physical traits. In an MMRPG, gameplay is not clearly defined; players can roam the game world as a loner, or they can choose to work with other users to solve various quests and puzzles. Massively multiplayer games (MMPs), in their current form, have some unique challenges worth considering when designing a music system. For instance, the average MMP game session is much longer than most single-player game sessions. In addition, people tend to play MMP games for many months or years before growing tired of the game. Regardless of your development budget, chances are you're going to have a hard time making the music interesting for an extremely long period. The sheer size of an MMP is also daunting. Many of the techniques used in a traditional interactive scoring approach may not be viable simply because you cannot create music for every place or place cues for every location. In our case, we are dealing with a seamless world, which means we cannot rely on convenient divisions to start and stop music or load in new data. You also do not have control over in-game situations. 
In a tightly scripted single-player game sequence, you only have to worry about one person interacting with the game, and therefore it is possible to set up musical changes based on what the listener expects to happen. With an MMP, an area that is supposed to be a hotbed of excitement might be completely swarmed with players who cleared out all the monsters, thus making it seem more like a safe haven than a battleground. Finally, when making an MMP, there is a strong desire to be able to change data after the game's release. In the case of Asheron's Call, we built the franchise around stories that propagate to the user over small monthly patches. This has greatly influenced our data structure choices, as we want to be able to make huge changes to the game in very small
amounts of data so our users don't have to suffer through large downloads. For us, being able to add new music in a small amount of data was also important. However, while there are a lot of new restrictions imposed by the current MMP game designs, there are some unique opportunities as well. First, MMPs are primarily a social experience. While you might think that you are building a certain type of game, what you are really building is an experience for people with common interests to socialize around. At any given time, a very large percentage of the user base is simply socializing around a meeting point or common interest within the game. Many of the systems in these games are designed to encourage social interaction, as in the long run it will be the social aspects of the game that keep many of your users subscribed to your service. Given this, it is necessary to look at the activities that friends enjoy together, as well as those popular and universal throughout the world's cultures. It is also important to look at the tools people use to identify themselves as individuals while identifying themselves with various groups within society. An MMP that successfully provides these activities and tools will be one with a compelling social base and therefore have a longer subscriber life. In my experience, music has the potential to provide all of this and more. Music is fundamentally a social experience. Whether you are going to a concert or playing in a band, it is a social experience at heart. Outside of downloading pornography, chat and music downloading are the two largest activities on the web. This makes some type of musical activity the perfect fit for an MMP, which is why we decided to add some form of player-created music into the game.
Project Goals Before we move into the details of the music system, let's define a common set of terms, review the problems, and discuss potential solutions to those problems:
Repetition This is probably the number one reason I turn the music off on so many games. I simply get sick of hearing the same thing repeatedly. It is important that the game's music vary each time that it plays. The wider the variance, the longer the ear can withstand the same basic song playing.
Adaptive When many people talk about "interactive" or "dynamic" music systems, they are often talking about the music system being able to adapt to changes in gameplay. Much like a good film score, the music should highlight key moments in the gameplay, making things more exciting and supporting the mood.
Interactive
By my definition, an interactive music system is one that the user can directly control in some way. To accomplish this goal, AC2 allows players to play music with each other using a simplified musical interface. Additionally, the Tumeroks, a type of playable character (or race, to be more accurate) in the game, use drums to cast magic spells, which also work with the music system.
Informative
My final goal was that the music system be informative. It is my belief that a passive music system does not hold the listener as long as a system that is providing information interesting to the listener. We spend an amazing amount of time making the visuals of a game clue in players about their surroundings, and so should the music. Many of these clues can be subtle, leaving the player with subconscious information about situations and surroundings. For AC2, the scenery around the player determines the background music, while monsters bring in individual melodies that warn the player of their presence.
Code Budget We knew from the beginning of the project that our programming time and support would be limited. The music system was very much my pet project for AC2. Others treated it with less priority than it deserved. The team passed it off from one programmer to another, and it accrued bugs along the way. Given this, we tried to design things to "just work" with as little custom code as possible. We also knew that eventually we would work on other projects, so others might not follow any rules that we established about how to place music in the world in the future. Our mechanism for triggering music was the same as triggering sound. A music hook looked just like a sound hook to the system, so anywhere that we could call a sound sample we could call a Segment. This was useful because it meant we could attach music hooks to character animations or spell effects through our "fx system" or environments through our ambient sound system. Once triggered, DirectMusic handles everything else.
Project Overview I hope that this section gives you a good idea of how we structured the entire project, from both a technical and musical perspective.
Style and Feel
The feeling that we wanted in our music originally stemmed from the style of playable races in Asheron's Call 2. The Tumeroks, which pull reference from tribal cultures, would play drums to channel magic, rather than the traditional wand or scepter. Lugians, another playable race in the game, pulled their reference from places like Tibet and brought to mind the sounds of Tuvan throat singers and Tibetan monks. The Humans would represent what people expect, sonically, from an RPG: the traditional classical element, orchestral scores, and church choirs. Our first step was to create a piece of music for each race that encompassed these elements and set the mood for each race. We later used these pieces in the character creation screens to enhance the mood of each race. Once we were happy with these pieces, we created a score that combined the sounds and melodies of all three pieces to set the style for the game itself. With this settled, we began to work on the design for the in-game music. The system evolved over the course of many months until it arrived at a point with which we were satisfied. A brief overview of what we finally settled on is as follows:
§ Location in the game world determines the current ambient score. This sets basic things like tempo, key signature, ChordMaps, and a background musical score. This "master Segment" is attached to a specific combination of a terrain and scene (such as grass with willow trees) or to the physical geometry of a dungeon area.
§ Each monster in the scene adds a musical element to the score. Much like a subtler version of Sergey Prokofiev's "Peter and the Wolf," each of these parts can have many variations.
§ Players may hold a musical instrument in hand, such as a lute or drum, and play various emotes, which emit a 3D-located melody or rhythm from their instrument. If they do not have an instrument, they can beat box.
§ Each monster and avatar may add or subtract from the intensity level of the music. Each monster or potentially aggressive avatar in the scene adds two to the groove level, while each nonaggressive avatar subtracts one. The default groove level with no monsters or avatars in the scene is 30. This allows us to define a relative intensity for the music based on the rough state of the area. If the area contains mostly avatars, it is unlikely that anyone is in any danger, and the music becomes calmer. Conversely, if the area has a reasonable number of monsters in it, the groove level will rise, and the music will become more intense.
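The groove arithmetic described above is simple enough to sketch in a few lines. The following is only an illustration of the rule as stated (start at 30, add two per hostile, subtract one per friendly avatar); the function name and the clamp to DirectMusic's 1 to 100 groove range are assumptions, not the game's actual code.

```python
DEFAULT_GROOVE = 30   # groove level of an empty scene, per the text
HOSTILE_WEIGHT = 2    # each monster or potentially aggressive avatar adds two
FRIENDLY_WEIGHT = 1   # each nonaggressive avatar subtracts one


def groove_level(hostiles: int, friendlies: int) -> int:
    """Return the scene's groove level (hypothetical helper).

    Clamping to 1..100 is an assumption made here so the result always
    stays inside the range DirectMusic accepts for groove levels.
    """
    level = DEFAULT_GROOVE + HOSTILE_WEIGHT * hostiles - FRIENDLY_WEIGHT * friendlies
    return max(1, min(100, level))


if __name__ == "__main__":
    print(groove_level(0, 0))   # empty scene
    print(groove_level(3, 1))   # three monsters, one friendly avatar
```

Because hostiles weigh twice as much as friendly avatars, a crowd has to outnumber the monsters two to one before the music drops below its default intensity.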
Time Feel Like many other projects, what we set out to do was quite a bit smaller than what we ended up creating. Our original thinking centered on getting the Tumeroks to be able to cast spells using drum rhythms and allowing them to play together in an interactive drum circle over the background score. Our first step in this was to get a series of bar-long rhythms working together with one-beat fills that would sound pleasing while played during any beat of the measure. Each part would have to be satisfactory by itself, yet add to the feeling of the group.
One of the experiences that we really wanted to emulate was the push and pull of time using polyrhythms. By incorporating the traditions of certain world music, such as African and Latin, we were able to transition and modulate between the feel of multiple time signatures while remaining in a single time signature. For instance, playing in 6/8 using a rolling six feel, then playing four dotted eighth notes against that to create the polyrhythm of "four over six," and then using those dotted eighth notes as the primary rhythmic pulse creates the feeling of playing in 4/4 while remaining in 6/8. Using polyrhythm like this can obscure the bar line and make the placement of beat one elude the ear. This allowed our Tumerok drummers to fire off spells at any beat without disrupting the flow of the music.
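The "four over six" arithmetic can be checked with a small sketch. It measures time in eighth notes from the bar line and lists where each layer's notes fall within one 6/8 bar; the helper name is hypothetical and the code is only a worked example, not part of the game's tooling.

```python
from fractions import Fraction

BAR_LENGTH = Fraction(6)  # one 6/8 bar, measured in eighth notes


def onsets(note_length: Fraction) -> list[Fraction]:
    """Onsets (in eighth notes from the bar line) of evenly spaced notes filling one bar."""
    count = BAR_LENGTH / note_length
    if count.denominator != 1:
        raise ValueError("note length does not divide the bar evenly")
    return [i * note_length for i in range(int(count))]


six_feel = onsets(Fraction(1))       # six plain eighth notes
four_feel = onsets(Fraction(3, 2))   # four dotted eighth notes
shared = sorted(set(six_feel) & set(four_feel))

if __name__ == "__main__":
    print("six feel: ", [str(t) for t in six_feel])
    print("four feel:", [str(t) for t in four_feel])
    print("shared:   ", [str(t) for t in shared])
```

Four dotted eighths (4 x 1.5) fill exactly the same six eighth notes as the rolling six feel, and the two layers only coincide on the downbeat and at the halfway point of the bar, which is why the ear can hear either pulse as primary.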
Figure 20-1: Tumeroks are deeply in tune with the spirits of the world around them. They channel magical energy through drums to attack their foe.
We chose a master tempo of 80 in the time signature of 6/8 because it is slow and leaves plenty of space. Using 6/8, we incorporated multiple time feels, including 3/4, 4/4, 6/8, and 12/8. They can work by themselves or against each other. Doubling the tempo to 160 beats per minute is still a comfortable pace and gives us an entirely new palette of time feels to work with. We built even more complex feel changes and polyrhythms from these fundamentals. A 9/8 feel constructed of triplets off of the quarter notes in the 3/4 rhythm is difficult to make work as a fundamental, but it can be a nice polyrhythm if subtly used.
The Stack Using these fundamentals as a base to work from, Geoff Scott, the composer on the project, went into Fruity Loops and began to compose the initial drum patterns. Using the sounds of a Djembe, Dun Dun, and Todotzi, each drum would incorporate a different time feel. As we created each pattern, we referenced it against the previous patterns, creating a large "stack" of rhythms. Writing this way ensured that everything worked against everything else and provided insight into the music's architecture as a whole. It was important to balance the timbres of the drums, spreading the thin and thick or high-and low-pitched notes across the measure so that no one beat became overly accented. We then reviewed each rhythm to make sure it was satisfactory on its own as well as in the context of the entire score. Once we were happy with this, we moved the data into DirectMusic. We then composed a series of small fills on each drum that would represent the different types of spells in the game. We grouped these spells into similar categories, such as attacks, enchantments, heals, summons, buffs, and debuffs. We gave each category its own rhythmic figure. When the user casts a spell using a drum, the rhythmic phrase creates an accent. A call and response event occurs when several players cast spells.
Background Score
Our goal for the background score was for it to sound almost like part of the environment's sound design itself. We wanted to keep things relatively simple and ambient and avoid ear-catching resolutions and stabs. We also knew there would potentially be a very large number of melodic parts playing over a given background score and that those melodies would be the primary force adding tension and resolution to the composition. We chose modal pads with ambiguous harmonic movement: sweeps and swells of color that provide mood over traditional melody. Like our rhythmic scheme, the lack of harmonic resolution would not tire the ear. Similar to the concept used in Indian music, we stated the given scale as a raga (the tonic droned and then played upon for an entire afternoon), creating a meditative space without the repetition and destination of a cycling chord progression. To give the music a sense of global awareness of the player's current predicament, we used various modes to color the music accordingly. A player standing alone in an area hears the background score in the Lydian mode, but as creatures fill the area, the music transitions to Phrygian, giving a darker tonality. If monsters continue to outnumber the player, the music might transition to a diminished mode while a quickening drum groove begins to pulse.
Monster Melodies After we had some basic background music working, we wrote a few melodies, which we tied to individual monsters in the game. When one of these monsters is in the area, the melody warns the player of its presence. We wrote these melodies as if they were part of the background score, but designed them so that they could play on any measure in the score. We tried to capture the feel of the creature in the sound of its DLS patch and used information about where different creatures would appear to better mix the parts with each other. While we could not rely on this information as being correct, it gave us a good point to test and design from, as in most cases we knew what creatures were likely to appear together.
Player Melodies With a basic background score and our initial stack of rhythms established, we started to create the player melodies for the game. These would be two-bar melodies, which the player could trigger anywhere in the game and would start playing on the next marker placed every two bars in the music. When multiple users play multiple melodies, they get the feeling of playing music together. We started by composing against our rhythmic stack, using a bass and lute sound. We created ten melodies for each instrument and pulled from multiple time feels contained in the existing drum rhythms. Creating this stack was probably the most difficult aspect of this project. Geoff and I went home with intense headaches on more than one occasion, as wrapping our heads around all of the data was quite a task. Geoff's approach to the initial instrument, the lute, was to write using counterpoint and harmony, while keeping the melodic range as tight as possible to leave room for the other instruments. The initial lute parts used two scalar melodies with three-part harmony countered against each other, while pulling rhythmically from the four and six feels. We based the harmony on one chord structure and concentrated on the color of that chord's
chord tones and tensions, while being aware that the melodies would transpose to new chords and through chord progressions. Like the drum rhythms, each part needed to be sustainable by itself, as well as in a group. We maintained a careful balance between making the parts interesting and not making it so interesting that it became repetitive or dominated the mix.
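The rule that a player melody starts on the next marker, with markers placed every two bars, amounts to rounding the current position up to a multiple of two. The sketch below assumes zero-based bar numbering and assumes that a request landing exactly on a marker starts immediately; both are illustrative choices, and in the game DirectMusic itself resolves the start time.

```python
import math

MARKER_SPACING_BARS = 2  # markers sit every two bars, per the text


def next_marker_bar(current_bar: float) -> int:
    """Bar (zero-based) at which a melody requested at current_bar begins.

    Hypothetical helper: rounds the playback position up to the next
    marker boundary.
    """
    return math.ceil(current_bar / MARKER_SPACING_BARS) * MARKER_SPACING_BARS


if __name__ == "__main__":
    for position in (0.0, 0.5, 2.0, 3.75):
        print(position, "->", next_marker_bar(position))
```

Since every player's request is quantized to the same grid, two melodies triggered a beat apart still begin together, which is what produces the feeling of playing as a group.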
Dimensions of Musical Space One of the key concepts that allow this project to work compositionally is the defining of musical space. We designed the rhythmic feel for the game to allow as many avenues for musical space within time as possible, providing multiple time feels within a single time signature and tempo. We applied the same concepts to the melodic space, as we needed to account for passing tones and tensions across each beat of the music. While it's natural to think about the melodic content as following the same time space as the rhythmic content does, melodic instruments have a quality that many percussive instruments do not — the length of the note. In many ways, this is yet another avenue of musical space to explore. The final area of space is that of timbres. When the sounds of the instruments are different enough, the ear has a much easier time distinguishing them from other notes of different timbres. These concepts of musical space became very important as we began to add more instruments to the stack. As the stack became larger, it was too unwieldy to manage as a single entity, and it was necessary for us to write new parts with only subsets of the stack available. Our ears could not truly hear 30 lines at once, let alone over 200. However, once you understand how we wrote the original melodies and the concepts of musical space, it no longer becomes necessary to have the entire stack available for listening. You still need to test against subsections of the stack, but eventually most of it becomes something that you do not think about on a theoretical level but just hear and do. In most cases, we tested each stack of ten melodies against a selection of about 20 melodies from other instruments.
Expanding the Stack
Our first alternate instruments were simple repatchings of what we created previously. The lute melodies were now available in a marimba or harp sound, and we even added several new drums as well. Simple repatchings gave us a wealth of variety with little work, but we wanted to add new musical feelings to the mix. All the instruments so far were of the same characteristic: a quick attack with a fast decay. Next, we added three instruments that filled a very different role, long tone ambient sounds. We created the Ice Lute, Virindi Lute, and Barun Lute to add long tones to the player music. In the case of the Virindi Lute, each melody is three bars long and stresses upper structure arpeggios and intervallic play. The Ice Lute uses a cold, breathy pad sound and uses the tensions of the chord and a forced ambiguity of harmonic structure while maintaining the color of the mode. This allows the instrument to float and weave through the harmony, rather than define it. The Barun Lute follows this same formula but with a very different timbre, more hollow and dry sounding than the Ice Lute. We also added flutes, which have a very different sound characteristic than any of the instruments mentioned previously. I wrote unique melodies for them, which stress the use of long held notes and harmonies. We designed each melody to provide a contrast with the original instruments. Each melody has a harmonized version.
Modes and Chords
Once again, I must stress the evolution of this project as a key motivator in how we did things. Once every basic component was working, we embarked on a long series of tweaks to make the overall project more interesting and discovered ways in which we could have done things better the first time. However, each of the restrictions that we placed on the project eventually became a motivator in some new way to expand the project. One such restriction was writing the background melodic score in a single mode. While this made it much easier to write over and provided the feel that we wanted, the player music suffered from the lack of active chord changes. Individual parts would sound repetitive if the players did not actively change the part that they were playing. The solution was to write modal chord changes and map individual parts to each of these changes. DirectMusic allows you to define up to four versions of each chord, allowing us to have multiple layers of chord changes and map each instrument to a different layer. The bottom chord layer would never change, which meant that all of our modal background score and monster melodies would play the same as they always had. The second and third chord layers, however, would be for the player music to follow. There are, however, distinctly different ways for musicians to follow chord changes. If we consider a basic blues as our example, a bassist playing through the changes will often move with root motion. That is, the bassist plays a line on the I chord and then plays that same line on the IV chord. Meanwhile, a soloist might play through those same changes without root motion by creating melodies that adapt to the current chord scale but do not move with the changing root.
Figure 20-2: Here is an example of one of our chords. Chord track 1 is set to CMin7, the first chord in the mode of Dorian. This is the mode for the background music. Chord tracks 2 and 3 are also in the mode of C Dorian, but their chords are set to an EbMaj7 chord. Look at Chord track three. Notice an EbMaj7 as it would be played above C. Chord track two, however, contains the inversion of the chord, which is closest to our root chord of C minor — in this case, the third inversion of the chord. What we define here is how a bass part and a melodic part move through the chord progressions. We map bass parts to chord track three and follow root motion up to the Eb. Melodic parts, however, choose the closest acceptable notes to what they originally play, much like a guitar player going through changes on a solo. This technique opened up a wealth of emergent behavior in the player music. The player music follows chord changes that are not present in the background score, yet work perfectly with the background score behind them. A user busy chatting while they play no longer sounds repetitive but instead actively follows chord changes. As we take the liberty of hiding interesting changes in parts of the world that may not even have a background score, there is an invisible landscape of musical changes waiting for the players to discover.
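The two ways of following a chord change can be contrasted with a toy model. This is a deliberate simplification and not how DirectMusic performs its chord and scale mapping: the bass line is simply transposed by the distance between the roots, and each melody note is snapped to the nearest EbMaj7 chord tone. All names are hypothetical, and notes are MIDI numbers.

```python
C_ROOT, EB_ROOT = 0, 3                 # pitch classes of C and Eb
EB_MAJ7 = {3, 7, 10, 2}                # Eb, G, Bb, D as pitch classes


def follow_root_motion(line: list[int], old_root: int, new_root: int) -> list[int]:
    """Bass-style following: move the whole line with the root."""
    shift = new_root - old_root
    return [note + shift for note in line]


def nearest_chord_tone(note: int, chord: set[int]) -> int:
    """Closest MIDI note belonging to the chord (ties resolve downward)."""
    candidates = [n for n in range(note - 6, note + 7) if n % 12 in chord]
    return min(candidates, key=lambda n: (abs(n - note), n))


def follow_nearest(line: list[int], chord: set[int]) -> list[int]:
    """Soloist-style following: adjust each note as little as possible."""
    return [nearest_chord_tone(note, chord) for note in line]


if __name__ == "__main__":
    bass = [36, 43]          # C2, G2 over the C minor chord
    melody = [60, 63, 67]    # C4, Eb4, G4
    print(follow_root_motion(bass, C_ROOT, EB_ROOT))
    print(follow_nearest(melody, EB_MAJ7))
```

In this model the bass line jumps up a minor third with the root, while the melody barely moves: Eb and G are already chord tones, and only the C shifts to a neighboring Bb.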
Groove Level In our first pass, we went through each background score and set various styles to respond to different groove levels. Our default groove level is 30; nearby monsters add two to the groove level, while players subtract one. As monsters outnumber the player, the groove level rises, and the system introduces new parts into the mix to intensify the music. This only provided a scope of density and did not create the change in color and mood we desired. For this, we added additional code to change the primary Segment based on the current groove level. As the player goes up in groove levels, the chords chosen for the music become darker. For instance, an area in a base mode of C Dorian will move to C Phrygian, and finally to a C whole-half tone scale, providing a much darker sound. Like all of these Segments, we wrote secondary chord changes for the player music to follow.
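Choosing a primary Segment from the groove level can be sketched as a threshold lookup. The Segment names and the offset bands (0-4, 5-9, 10+) come from the script variables listed later in this chapter; measuring the offset relative to the default groove level of 30, and treating anything below the default as the calmest band, are assumptions made for this illustration.

```python
DEFAULT_GROOVE = 30

# (minimum groove offset, Segment name), darkest mode first
SEGMENT_BANDS = [
    (10, "MI_Hum_C_WholeHalf"),
    (5, "MI_Hum_C_Phrygian"),
    (0, "MI_Hum_C_Dorian"),
]


def primary_segment_for(groove: int) -> str:
    """Pick the primary Segment for a groove level (hypothetical helper)."""
    offset = max(0, groove - DEFAULT_GROOVE)  # assumption: calm scenes use the base mode
    for minimum, name in SEGMENT_BANDS:
        if offset >= minimum:
            return name
    return SEGMENT_BANDS[-1][1]


if __name__ == "__main__":
    for groove in (25, 30, 36, 40):
        print(groove, "->", primary_segment_for(groove))
```

With monsters worth two groove points each, three unopposed monsters are enough to push an area from Dorian into Phrygian under these assumptions.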
Technical Considerations
Project Organization
Project organization, while not an exciting topic, is important nonetheless. Throughout the creation of our project, I took the time to reorganize and rename any new data into standard naming conventions, while cleaning up any old data that did not fit our paradigm. In the end, solid organizational techniques will save you time and make finding problems later easier. Segment Trigger Tracks become a very useful organizational tool, as I created a hierarchy of inherited data. A background score in my project looks something like this:
§ Master Segment
   o Tempo/time
   o Chords or ChordMap
   o Markers
   o Default Groove Level
   o Band Segment
      • Band
   o Arrangement Segment
      • Drum Segment
         • Style Track for drums
      • Strings Segment
         • Style Track for strings
      • Choir Segment
         • Style Track for choir
One of the main reasons that I organize data in this fashion is to speed up the process of creating variations, while keeping everything neat and consistent. My master Segment contains all the high-level controls, such as chords and tempo, while secondary Segments within the master Segment call Band data or group the arrangement of multiple styles together. Making new arrangements, mapping arrangements to new chord structures, or changing the Bands for an entire area are all very easy to do because I have significantly less data to track in each Segment. Also of note is that I always call Bands by creating a Segment for them and using a Segment Trigger Track. This saves a ton of time and frustration, as you only have to update your Band in a single place when you want to change it, instead of having to find all the Segments associated with that instrument and edit their Band settings. As a general practice, we created all note data in Styles, encapsulated in a Segment, and called from a Segment Trigger Track or as a secondary Segment.
I never place note data directly into Segments because I often find myself in the position of eventually needing some function of styles and having to convert many Segments into styles.
Optimizing Patch Changes
Because of the sheer number of potential Segments playing in our music system, it was important to pay attention to optimization techniques. In fact, at some point in the project, suboptimal data arrangements led to some strange behaviors. I found out the hard way that setting the Band is potentially a very expensive operation, which if not optimized can cause small pauses, sync errors, and other undesirable behaviors. Every time the engine called for a Band change, DirectMusic was not filtering out the redundant Band changes, which led to hiccups since too many Bands were being set at once. My original Band Segment was mistakenly set to loop every bar, and because it was called as part of a looping master Segment, it accrued one extra Band change for every time the master Segment looped. This essentially means that DirectMusic would upload five instruments the first time through, ten the next, 15, then 20, and then 25. At around 50 patch changes, the pauses and sync problems start to become very noticeable until the music begins losing time. Additionally, because the player music Segments must initialize a Band with the 3D AudioPath, the player music Segment calls another Band for each new melody that a user plays. Note that even when resetting the same Band (such as on a loop), DirectMusic still repatches that Band setting. My first pass was to optimize the background score. I broke my master Band into a series of Band Segments that contained the minimum amount of Band data that each score needed. I gave each player or monster melody its own Band Segment as well. However, while this greatly optimized the background score, my two-bar looping player music Segments were causing a lot of individual patch changes, as each Segment would trigger a Band change on loop, and it was possible to have upwards of 60 people playing music in an area. My solution was to make each secondary Segment 999 bars long, with the Style looping inside of it. This meant that a player music Segment only called its Band when the Segment played initially, or again once the full 999 bars of music had passed by. While it is still possible that 60 people could trigger a new Segment at once, causing 60 patch changes to happen, it is highly unlikely, especially given Internet latencies.
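A rough count shows why the longer Segments help. The model below assumes one Band change each time a Segment starts or restarts on loop, which is how the text describes the behavior; the function names, the 100-bar listening window, and the five-instrument Band are illustrative figures only.

```python
import math


def band_changes(total_bars: int, segment_bars: int, players: int) -> int:
    """Band changes issued while `players` each loop one Segment for total_bars.

    Assumes every Segment start (initial play or loop restart) sets its Band once.
    """
    return players * math.ceil(total_bars / segment_bars)


def uploads_on_pass(pass_number: int, instruments: int = 5) -> int:
    """Instrument uploads on a given pass of the buggy looping Band Segment,
    which accrued one extra Band change per master Segment loop."""
    return instruments * pass_number


if __name__ == "__main__":
    print(band_changes(100, 2, 60))     # two-bar looping Segments
    print(band_changes(100, 999, 60))   # 999-bar Segments with the Style looping inside
    print([uploads_on_pass(n) for n in range(1, 6)])
```

Under these assumptions, 60 players looping two-bar Segments for 100 bars generate 3,000 Band changes, while the 999-bar version generates 60, one per player.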
Optimizing Voices and DLS Banks It is also very important to optimize your voice counts and DLS banks. It's very easy to have a bloated voice count if you're not careful, thus slowing the performance of your entire game. The most common cause of high voice counts is DLS banks, which have very long release times. You want to trim the release times to be as short as possible, and if you use a long release time, make sure you do not create short patterns that repeat before that release time has passed. For instance, if your Segment lasts for about four seconds and you set your release time on the DLS bank to ten seconds, you end up with notes from the original loop playing two loops later. You should also be aware of the memory that you use, as it directly accounts for much of the cost of changing bands. I do not go into general audio data optimization techniques here. However, there are things that you can do to make better-sounding DLS banks using less data.
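The cost of a long release time on a short loop can be estimated with a little arithmetic. The sketch assumes a note that retriggers at the same position in every loop and is released immediately, so its tail lasts for the full release time; the helper name is hypothetical.

```python
import math


def stacked_voices(loop_seconds: float, release_seconds: float) -> int:
    """Copies of one repeating note sounding at once in steady state.

    A copy from k loops ago is still audible while k * loop_seconds is
    shorter than the release tail, so the count is ceil(release / loop),
    with a minimum of the one voice just triggered.
    """
    return max(1, math.ceil(release_seconds / loop_seconds))


if __name__ == "__main__":
    print(stacked_voices(4.0, 10.0))   # the example from the text
    print(stacked_voices(4.0, 3.0))    # release shorter than the loop
```

For the four-second Segment with a ten-second release, each repeating note holds three voices at once: the new one plus the tails from the previous two loops.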
Cheap Stereo Separation
When creating synthesizer or orchestral patches, it is often desirable to have a stereo effect. Ideally, your original samples are in stereo, but this obviously chews up more data. A quick-and-dirty way to achieve a similar effect is to create two layers with different settings applied to the same wave and pan those layers left and right.
Note Scattering On many percussion instruments, such as a thumb piano, I use a technique I call note scattering to give them a more realistic sound. Thumb pianos have individual metal bars for each note, and each of these tend to have unique overtones, rattles, and other "signature noises" associated with them. Instead of using six samples over a three-octave range, spreading each sample over half an octave, I use six samples and randomly alternate them on notes as I go up the scale. This greatly reduces the chances of someone picking out the repeating signature noises as a run plays up the scale.
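Note scattering can be pictured as two different key maps over the same six samples. The sketch below builds a conventional zoned map (each sample covers half an octave) and a scattered map that picks a sample at random for each note while never repeating the previous note's sample. The no-repeat rule and the fixed seed are assumptions added for the illustration; in practice the regions are laid out by hand in the DLS editor.

```python
import random

NOTES = 36     # three octaves
SAMPLES = 6


def zoned_map(low_note: int) -> dict[int, int]:
    """Conventional layout: each sample covers a contiguous half octave."""
    zone = NOTES // SAMPLES
    return {low_note + i: i // zone for i in range(NOTES)}


def scattered_map(low_note: int, seed: int = 0) -> dict[int, int]:
    """Scattered layout: random sample per note, never the same as the note below."""
    rng = random.Random(seed)
    mapping: dict[int, int] = {}
    previous = None
    for i in range(NOTES):
        choice = rng.choice([s for s in range(SAMPLES) if s != previous])
        mapping[low_note + i] = choice
        previous = choice
    return mapping


if __name__ == "__main__":
    print(list(zoned_map(48).values()))
    print(list(scattered_map(48).values()))
```

In the zoned map a run up the scale plays the same sample six times in a row, which is exactly where a repeating rattle becomes audible; in the scattered map no two neighboring notes share a sample.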
Evolving Patches Many of today's synthesizers have very long patches that evolve and change as they play. Such a patch would produce a very large sample file, often unsuitable for the memory restrictions of game development. However, it is quite possible to get this same effect using several smaller, looping samples using the various parameters available in DirectMusic. To make a patch evolve, place the first sample (say, a violin section) on layer one and set attack and delay times as you normally would. Place another sample, perhaps a choir of voices, on the second layer, and set the sample to ramp in very slowly over time. When you trigger a note, you get the attack of the violins slowly ramping into the voices. You can even use this trick with a single sample and play with its filter and modulation settings instead or use its pan settings to move the sample from left to right.
Custom Code While we did not have support for full scripting, we did do some custom code in our interface to DirectMusic. The primary thing we added was the concept of exclusivity, which we used to prevent monster melodies from playing over themselves. This concept would be very useful to add to the core functionality of DirectMusic, as it would allow users to weed out redundant Band changes by making the Band Segment exclusive.
DirectMusic Script We did the rest of our custom code in DirectMusic script and triggered it through script calls embedded in Segments. We used this code primarily to handle the transition and selection of the primary Segment. The script intercepts calls to play new primary Segments, places them in a queue, and triggers the new primary Segment when the old primary Segment is about to end. The script also selects which version of the primary Segment is going to play based on the current groove level. We use this to change chord mappings based on the groove level; higher groove levels get a more dangerous-sounding mode, while lower ones get a more serene-sounding mode. The entire script would be much simpler if DirectMusic had an AtLoop flag for transitioning Segments, but currently that choice is not available. If you set the transition to happen at the end of the Segment and the Segment is set to loop indefinitely, you never reach the end of the Segment and your transition never happens. Thus, we needed custom code to handle this. A simplified version of this code is listed below with an explanation of what each routine does. First, we need to initialize the variables used in the script.
Dim finish        ' used for determining flags to send with play command
Dim nextPlaying   ' number of Segments in the queue
Dim nextSeg       ' object of Segment in the queue
Dim groovereport  ' current groove level
Dim segA          ' used for storing temporary Segment object
Dim segB          ' used for storing temporary Segment object
Dim segC          ' used for storing temporary Segment object
Dim seg1          ' Primary Segment object name for groove offset 0-4
Dim seg1a         ' Primary Segment object name for groove offset 5-9
Dim seg1b         ' Primary Segment object name for groove offset 10+
The Init routine initializes the Segment variables to the Segments in the DirectMusic project. I do this to make things easier to edit, as now the Segment name only needs to be stored in one place, and the variable can be used instead of hard coding the Segment names everywhere they are referenced.
Sub Init
    Set seg1 = MI_Hum_C_Dorian
    Set seg1a = MI_Hum_C_Phrygian
    Set seg1b = MI_Hum_C_WholeHalf
End Sub
For each area in the game, there is a queue Segment that calls the appropriate queue routine. This routine calls Init to initialize the variables and then checks to see if the nextPlaying variable is set. By default, DirectMusic initializes all variables to zero, so if nextPlaying is zero, we know the music has not started. Once the script sets the nextPlaying and finish flags, it calls the routine ChooseHuman1.
Sub Que_01
    Init
    If nextPlaying = 0 Then
        finish = 0
    Else
        finish = 1
    End If
    nextPlaying = 1
    ChooseHuman1
End Sub
This routine sets my temporary variables (segA, segB, and segC) to be the primary Segments that I wish to choose between based on groove level. It then calls the ChooseGroove3 routine.
Sub ChooseHuman1
    Set segA = seg1
    Set segB = seg1a
    Set segC = seg1b
    ChooseGroove3
End Sub
This routine checks the master groove level offset and sets the nextSeg object to be the correct primary Segment based on that offset. It finally calls the TriggerPlay routine to play the Segment.
Sub ChooseGroove3
    groovereport = GetMasterGrooveLevel()
    If groovereport < 5 Then
        Set nextSeg = segA
    End If
    If groovereport > 4 Then
        Set nextSeg = segB
    End If
    If groovereport > 9 Then
        Set nextSeg = segC
    End If
    TriggerPlay
End Sub
This routine checks the finish variable and triggers the Segment to play as either AtMeasure or AtFinish. Playing the Segment AtFinish essentially throws out the play command: if the primary Segment is currently playing and set to loop indefinitely, it never reaches its end. The call is still necessary in case the primary Segment does not start or stops for some unexplained reason. If this happens, the new primary Segment will start playing immediately because no primary Segment is currently playing. If the primary Segment is already playing, then nothing happens until its Script Track calls the next routine.
Sub TriggerPlay
    If finish = 0 Then
        nextSeg.play(AtMeasure)
    Else
        finish = 0
        nextSeg.play(AtFinish)
    End If
End Sub
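The threshold logic of ChooseGroove3 is easy to check in isolation. The following is a minimal Python sketch of that selection, not AudioVBScript; the function name choose_by_groove is hypothetical, and plain strings stand in for the Segment objects. It assumes whole-number groove levels, for which it gives the same results as the three separate If blocks in the script.

```python
def choose_by_groove(groove, seg_a, seg_b, seg_c):
    """Pick a Segment by master groove level offset.

    Offsets below 5 select seg_a, 5 through 9 select seg_b,
    and 10 or above select seg_c.
    """
    if groove < 5:
        return seg_a
    if groove < 10:
        return seg_b
    return seg_c

# Strings stand in for the three primary Segment objects.
DORIAN, PHRYGIAN, WHOLE_HALF = "Dorian", "Phrygian", "WholeHalf"
picks = [choose_by_groove(g, DORIAN, PHRYGIAN, WHOLE_HALF) for g in (0, 4, 5, 9, 10)]
```

Returning early from each branch makes the three ranges visibly exclusive, whereas the script relies on later If blocks overwriting earlier assignments.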
Figure 20-3: Our master Segment with its script call in the Script Track.
All primary Segments call the final routine, PlayNext, just before they end. A script call is placed in a Script Track on the second beat of the last bar in the Segment. Once called, it checks to see which Segment number nextPlaying is set to and calls the selection routine for that primary Segment. In this example, there is only one choice, so it calls the ChooseHuman1 routine explained above. By placing this on the second beat of the last bar in the Segment, the routine triggers the next primary Segment to play on the next measure, and the primary Segment transition happens at the end of the primary Segment.
Sub PlayNext
    Init
    If nextPlaying = 1 Then
        ChooseHuman1
    End If
    TriggerPlay
End Sub
The result is that the primary Segments only transition at their ends while still being set to loop. Additionally, the control code allows us to select new primary Segments based on the groove level, which we use to recompose the music in this example from Dorian to Phrygian to a whole half-tone scale based on the current groove level. This gives us a more reactive score, where darker tonalities are associated with being outnumbered, and lighter tonalities are associated with outnumbering the creatures around you.
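To see how the routines interact, the whole flow can be simulated outside of DirectMusic. The Python sketch below is a model under stated assumptions, not the script itself: FakePerformance is a hypothetical stand-in for the DirectMusic performance, and it assumes that an AtFinish request is ignored while a looping primary Segment is playing, as described above. It also omits the trailing TriggerPlay call in PlayNext, which would only request the same Segment a second time.

```python
class FakePerformance:
    """Hypothetical stand-in for the performance; records play requests."""

    def __init__(self):
        self.groove = 0
        self.current = None  # primary Segment now playing, if any
        self.log = []

    def play(self, segment, boundary):
        # Assumption: a looping primary Segment never reaches its end,
        # so an AtFinish request has no effect while one is playing.
        if boundary == "AtFinish" and self.current is not None:
            self.log.append(("ignored", segment))
            return
        self.current = segment
        self.log.append((boundary, segment))


class Script:
    SEGMENTS = ("MI_Hum_C_Dorian", "MI_Hum_C_Phrygian", "MI_Hum_C_WholeHalf")

    def __init__(self, perf):
        self.perf = perf
        self.next_playing = 0
        self.finish = 0
        self.next_seg = None

    def choose_groove(self):
        g = self.perf.groove
        self.next_seg = self.SEGMENTS[0 if g < 5 else 1 if g < 10 else 2]
        self.trigger_play()

    def trigger_play(self):
        if self.finish == 0:
            self.perf.play(self.next_seg, "AtMeasure")
        else:
            self.finish = 0
            self.perf.play(self.next_seg, "AtFinish")

    def que_01(self):
        # The first call starts the music; later calls only queue a change.
        self.finish = 0 if self.next_playing == 0 else 1
        self.next_playing = 1
        self.choose_groove()

    def play_next(self):
        # Called from the Script Track near the end of the primary Segment.
        if self.next_playing == 1:
            self.choose_groove()


perf = FakePerformance()
script = Script(perf)
script.que_01()      # music has not started: plays Dorian at the next measure
perf.groove = 6      # the game raises the groove level
script.que_01()      # queued mid-loop: the AtFinish request is thrown out
script.play_next()   # last bar of the Segment: Phrygian starts at the next measure
```

In this model the groove change takes effect only when play_next runs, which matches the goal of transitioning at the end of the Segment rather than in the middle of a loop.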
Postmortem Although we plan to expand the system that we created here as the game evolves, it is always a good idea to go back and look at the various things that went right and wrong in a project.
What Worked Player Music as a Purely Social Game If someone asked me what I am most proud of working on in Asheron's Call 2, it would be the addition of music as a purely social game. No one has ever created a dynamic music system quite like it, and we have already witnessed people getting to know each other because of the music system. These types of systems are what build friendships long after the core gameplay has become stale and provide people with new outlets for their creativity. I can only hope that the music played in-game inspires someone to pick up an instrument in real life, as music is something that fundamentally makes people's lives better.
Figure 20-4: During the last four hours of the beta test, users gathered on top of the Deru Tree to play music together. Part of this jam session is included as an MP3 track on the CD.
Layered Approach I cannot express enough the importance of starting with something simple and expanding it. Revision and the willingness to throw things away is part of what makes a project work well in the end. At any given stage in the project, we had something interesting working. As we added new layers to the mix, things just kept getting more interesting. Many of the solutions and concepts presented above were not part of the initial project and were continually added or refined through the entire process.
Music as Sound Design While style is not something you have, but rather something you cannot get rid of, the choices we made within our style have worked out very well. The combination of polyrhythmic structures and modal melodic structures not only allows our system to work in interesting ways but also does not tire the ear as much as other stylistic choices would have.
We also paid particular attention to blending these stylistic choices with our background environments, treating them as much as sound design as music.
Music Is Informative The monster melodies and use of groove levels in the game provide a kind of musical information that I believe positively enhances the user's experience of the game. They layer another kind of musical tension into the mix based on the user's situation and give arrangements of monsters a unique musical sound. One of our designers even realized that the game engine was not creating monsters in the right locations just from hearing the music.
DirectMusic DirectMusic was truly the only music system that could accomplish this type of holistic musical vision, and while the system is far from perfect, it does some things amazingly well. The ability to adapt a section of music to multiple key signatures allowed us to get a range of moods from a single piece of music, while the synchronization and small data rates of the file format allowed us to work within the network-based parameters that an online game such as ours requires.
What Didn't Work Not What People Expect from an "Interactive" Music System In some ways, we did not do what people expected from an "interactive" music system. When people hear this term, they usually think that the music is going to kick in when they go into combat, play a victory melody when they win, and go back to its normal state again. Instead, our system is ambient in nature and does not provide the painfully obvious kick at the right moments. While what we did was more advanced (and I am very glad we did it), sometimes the simple things are what people want. I do believe these types of situations can assimilate into the current system, but this type of easily quantifiable change is often what tires the ear of the music in the first place. The trick is to achieve a balance between immediately noticeable effects and the more subtle effects that do not tire the ear. The color and density changes controlled by the groove level provide some of this functionality, and I suspect that we will work more in that direction in the future.
Learning DirectMusic while Creating the Project Not fully understanding the functionality of DirectMusic made the initial creation process much more painful than it needed to be. However, I do not know of a better way to learn anything than to do it, and many of our "mistakes" led to interesting benefits in the end.
Lack of Support Most sound designers fight with a lack of support in their endeavors. In many companies, it is hard to get the proper resources for sound, as the industry tends to focus heavily on graphics. In addition, many people have no idea how we create sound and music, let alone how they affect the experience. The original coder assigned to the music system also worked on the user interface and localization of the game. The music and the UI are both huge systems. In addition, she did not know much about sound and music, so explaining things often took an exorbitant amount of time. Moreover, once someone found a bug, verifying bug fixes always required Geoff or me to be present. Additionally, the code proved very unstable at times, so much so that during the last few weeks of development, we had to rewrite the entire system from scratch.
Dan Ogles, one of our graphics programmers, rose to this monumental task and did a fantastic job with it. However, I cannot imagine how much easier it would have been had the focus and support been there in the first place.
Future Work In the future, I am interested in prototyping the two main concepts of the system separate from each other before integrating them together again. If we created a system of pure player music, we could implement a host of new controls for users, allowing them to control things like tempo, chord and color choices, effect properties, and more. We are also very interested in pursuing a system that allows users to create their own melodies through a very simple interface, which would give them a much greater degree of musical control, while still protecting them from bad note choices. I would like to see how far we could push this system by itself, as there is potentially a lot more room for control within the mix. I would also like to spend some time prototyping a purely film score-type approach, where the speed at which the music adapts to the current situation is stressed. I would also like to try integrating traditional Redbook audio tracks into the mix to increase the sonic quality while still maintaining the interactivity. Once we have gained experience at pushing the boundaries of these systems separately, I would like to take that knowledge and see what parts of the two systems can be combined without limiting either system. Can we create a truly adaptive score, one that adapts to any situation, with the sound quality of Redbook audio, while still maintaining an interesting player music game? How much more control can we give to the users before they strangle their ears with it? I also think that we need to spend some time learning other areas of DirectMusic and exploring what other features lie hidden under the hood. For instance, we did not make use of ChordMaps for variation in the project, and we barely touched on the things that scripting can provide. We also did not use chord changes to write our background score and instead used them for player music and mood changes.
How would the system differ if our background score used the more traditional style of writing in DirectMusic? Could we solve less with composition and instead use DirectMusic's functionality better in the end, providing a higher degree of recomposition?
On the CD On the CD included with this book, you will find a folder with a greatly simplified, but quite functional, version of my project. Due to licensing restrictions, I had to rewrite the music and use new DLS banks for the music in this version, but the fundamental principles used to create this project are identical to the ones used to create the original score and should serve as a great aid to understanding the techniques presented in this chapter. In the project, you will find three folders containing the background score elements, monster melody elements, and player music elements of the project. If you load the Demo bookmark in the project, the Secondary Segment window will contain the necessary Segments to preview all of the music in the project as well. At the bottom of the Secondary Segment window, you will find two queue Segments that start the background music through scripting or transition from one background score to the next. Above them are three utility Segments used to raise, lower, or reset the groove level of the project. You may see the current groove offset value by looking at the groovereport variable in the Script window. The remaining Segments in the Secondary Segment window are monster melodies and player music, which you should trigger once the primary Segment starts to simulate what might be happening in the game.
Figure 20-5: Secondary Segments in the example project. Use the first secondary Segment to start the project, the second and fourth to change the groove level, and the rest to sample various instrument and monster melodies.
1. Start the primary Segment by playing the Que_Hum_C_Dorian.sgt Segment in the Secondary Segment window. The music may take a few seconds to begin.
2. Play several of the monster melodies (mn_drudge1.sgt, for instance) to simulate the presence of monsters in the area.
3. Play several of the player music Segments, and notice how they follow chord changes not originally heard in the background score, yet they still work with the background score.
4. Raise the groove level by playing the GrooveUp.sgt utility Segment several times until you hear the sound of the drums come in. After a few seconds, the key should change, and the player music should adjust accordingly.
5. Raise the groove level several more times. The drums will become louder, and the key will change again to become even darker.
Chapter 21: Beyond Games: Bringing DirectMusic into the Living Room
Overview Tobin Buttram What people are going to be selling more of in the future is not pieces of music but systems by which people can customize listening experiences for themselves… In that sense, musicians would be offering unfinished pieces of music — pieces of raw material, but highly evolved raw material, that has a strong flavor to it already. I can also feel something evolving on the cusp between "music," "game," and "demonstration".... Such an experience falls in a nice new place — between art and science and playing. This is where I expect artists to be working more and more in the future. Brian Eno in WIRED issue 3.05 — http://www.wired.com/wired/3.05/eno.html?pg=4&topic= Although DirectMusic provides many solutions to the challenge of crafting a cinematic underscore that responds in sync with the unpredictable unfolding of computer game events (as is evident in the excellent work of composers like Guy Whitmore, Nathan Grigg, and others), the greatest untapped potential of DirectMusic resides in the very nature of its nonlinear playback architecture and its widespread install base as part of the DirectX APIs. Nonlinear music is currently at the far frontier of music production technology. As such, it requires composers and designers to develop new compositional, recording, and playback techniques, even new ways of thinking about music. In return, it offers the possibility of discovering and delivering qualitatively new musical experiences to the worldwide community of tech-enabled music listeners. It's beyond the scope of this article to exhaustively document why and how a nonlinear music format represents an important "next step" in the evolution of music production. I only hope to convey the challenges and the rewards that the current "state of the nonlinear art" offers to both composers and listeners. 
In order to demonstrate what nonlinearity offers in terms of musical experiences I first recount a little of the history of early recording technology and how the introduction of linear playback brought with it an often overlooked paradigm shift that still affects how music is both perceived and conceived today. We look at DirectMusic as the basis for stand-alone nonlinear music content. We also look at the example Segment included on the companion CD: a full bandwidth, nonlinear music file made in collaboration with ProjeKct X (the "R&D division" of the seminal rock group King Crimson: Pat Mastelotto, Trey Gunn, Adrian Belew, and Robert Fripp). From there, I present my view of the challenges unique to nonlinear composition that a composer must take into account when working in this still very new and uncharted territory and present and explain some "conceptual keys" that I've found useful in my work.
From "Sound Capture" to "Music Production…" In a March 2002 article for The NY Times Magazine, author Kevin Kelly recounts an anecdote of Indonesian gamelan musicians reacting to early recording technology. He sums up the nearly immediate effect recording technology had on the development of musical forms: There is no music made today that has not been shaped by the fact of recording and duplication. In fact, the ability to copy music has been deeply disruptive ever since the invention of the gramophone. When John D. Smoot, an engineer for the European company Odeon, carted primitive recording equipment to the Indonesian archipelago in 1904 to record the gamelan orchestras, local musicians were perplexed. Why copy a performance? The popular local tunes that circulated in their villages had a half-life of a few weeks. Why would anyone want to listen to a stale rendition of an obsolete piece when it was so easy to get fresh music? As phonographs spread throughout the world, they had a surprising effect; folk tunes, which had always been malleable, changing with each performer and in each performance, were transformed by the advent of recording into fixed songs that could be endlessly and exactly repeated. Music became shorter, more melodic, and more precise. Early equipment could make recordings that contained no more than four and a half minutes, so musicians truncated old works to fit and created new music abbreviated to adapt to the phonograph. Because the first sound recordings were of unamplified music, recording emphasized the loud sounds of singers and de-emphasized quiet instrumentals. The musicologist Timothy Day notes that once pianists began recording they tried, for the first time, to "distinguish carefully between every quaver and semiquaver — eighth note and sixteenth note — throughout the piece." Musicians played the way technology listened. 
When the legendary recordist Frederick Gaisberg arrived in Calcutta in 1902, only two decades after the phonograph was invented, he found that Indian musicians were already learning to imitate recorded music and lamented that there was "no traditional music left to record." Kevin Kelly in The New York Times Magazine — http://www.nytimes.com/2002/03/17/magazine/17ONLINE.html For most of us, the very idea of music is indistinguishable from the experience of listening to the radio, LPs, CDs, MTV, MP3s, and the like; so much so that the notion of even questioning the 100 percent equivalency between recordings of music and the performance of music itself can be hard to grasp. In the early days of recording, it was inherently obvious that technology served only as a means of preserving live sonic events, whether musical, oratorical, or otherwise. For the first time in history, fleeting and transitory air vibrations were frozen in wax to be reproduced and distributed according to demand. As time passed and an industry grew around the technology, desire for increased fidelity drove technological development. As technology progressed exponentially, the pendulum swung from more and more accurate sound reproduction over to increasingly sophisticated sound production. This has led to the current situation of a complete reversal of recording technology's original role. Today, many big-budget "live" musical acts painstakingly stage and choreograph their performances to replicate as closely as possible the sound of the studio recording. This
studio recording itself may consist primarily of only careful simulations and samples of some long-ago live performance by a previous generation of performers. We are at the point now where live concerts are often little more than a lip-synced ballet set to the commercial release of a hit single playing through a PA system, only barely supplemented by the sound of the live musician's actual in-the-moment stage performance. In such cases, it is safe to say that sound production technology has caused recording to replace live performance as the aim of music making, even in a so-called live performance. This is not to remotely imply that recordings are necessarily inferior to live music or to devalue the education, enjoyment, and entertainment that recorded music provides. In fact, the whole point of this chapter is that technology can now allow even more musical entertainment and enjoyment. The role of modern recording technology in serving the creative musical process is indisputable. It has empowered some of the greatest musical artists of our time, as is evident in the creation of iconic, never-performed-live masterworks, like Sgt. Pepper's Lonely Hearts Club Band and Bitches Brew, and it's worth acknowledging here that the recording studio with all its tools may legitimately be viewed as a complicated sort of musical instrument, which can be played skillfully by someone who has mastered its various techniques. The point is simply that recorded music is so ubiquitous and so much a part of our cultural experience of music that we often forget that before the advent of recording, the only way to hear music was to participate in some sort of musical performance, either as a musician, an audience member, or, as in many cultures where the dividing line between performer and audience is not so clear cut, a participant in a socio-musical event like a religious service, the gamelan, African call and response, and many, many more such traditions. 
Before recording, all music was nonlinear, in that every performance was free to respond to the unique necessities of the individual time, place, and people involved. While there is an incredible amount of technological genius and artistry evident in the current tools for producing and recording music, including nonlinear editing, all the results of this technology are still presented to the end user through the inherent limitations of linear playback that were dictated by early recording methods and materials. All linear formats are by nature a closed system, in that once a record is "in the can," the creative process is done. The listener's appreciation and familiarity may grow over time, but nothing in the recording itself and the way it plays back is ever going to change. Of course, this will always be a great and useful thing for capturing the "perfect performance," but linearity in music playback no longer needs to be as irrevocable as a thermodynamic law; it can instead be optional. It may always be the norm, but it needn't be the rule. For the first time, the technology is widely available to overcome the determining condition of linear playback and bring us back full circle to the paraphrased question raised by the perplexed Indonesian musicians in the article quoted above: "Why would anyone want to listen to a stale rendition of an obsolete piece when it's so easy to get fresh music?" As one answer to that question, I propose that composers and music producers take advantage of existing alternatives to the standard linear playback paradigm. DirectMusic is such an alternative and already has a large install base. It provides a format based on nonlinear music architecture from its earliest conception and, critically, it allows delivery of discrete nonlinear music files that can play on most any modern multimedia PC or any other platform that supports the DirectX APIs, like the Xbox.
"…to Nonlinear Music Format" This exploration of DirectMusic as an architectural basis for a consumer-friendly nonlinear music format is based on more than just bright ideas and speculation. Over the past few years, Guy Whitmore and I composed and designed many examples of just such a format using DirectMusic. These compositions proved concepts and explored design methods blending various musical sources for demo and instruction purposes. We used familiar artists' material to demonstrate the commercial possibilities of nonlinear playback to music industry A&R people interested in new technology but understandably new to the concept of nonlinear music. Some examples: I used vocal phrase samples of Bjork in an original nonlinear composition to show how a club DJ remix artist might approach nonlinear composition. Guy arranged different and variable instrumental backing tracks with appropriate production value for a commercially released a cappella version of a Sarah McLachlan song to demonstrate a more traditional pop song approach within a nonlinear framework. He also recreated tracks from Moby's Play using DLS instruments and the same public domain vocal samples he used from the Lomax/Smithsonian collection. We've also been fortunate to work closely with bands keen to explore the potential of nonlinear formats. One of them, ProjeKct X (Pat Mastelotto, Trey Gunn, Adrian Belew, and Robert Fripp), provided multitracks of an improvised session to generate musical material specifically for use in a nonlinear playback format. Not only that, but they have also generously granted permission for me to provide a small portion of that material here for demonstration and educational purposes. All of the aforementioned demos consisted of discrete, individual files and a small accompanying player application designed to run on most any up-to-date multimedia-capable Windows PC.
Though usually larger than the "average" MP3 file, a nonlinear Segment file is by definition not tied to the standard run-time/footprint ratio of a linear file. 10MB equals roughly ten minutes of run time for MP3, whereas a 10MB DirectMusic Segment may be designed to play anywhere from a minute or two up to an hour or even indefinitely until stopped by the user, if that's what the designer wishes. If you haven't yet listened to the ProjeKct X Segment that accompanies this book, copy the ProjeKct X directory on the companion CD in its entirety to your C:\ drive. Open the directory that contains both the ProjeKct.exe and ProjeKctX.sgt files, and run ProjeKct.exe (they must be in the same directory for the player to run). The player will automatically load the ProjeKct X Segment on startup (allow some time for the files to decompress, depending on your CPU speed). There is also a WMA file provided as reference of the live linear capture of the piece from the studio. It's worth noting here that the material presented is only a fragment of what was a more than ten-minute linear piece with three distinct movements. As playing ProjeKctX.sgt demonstrates, the sound of DirectMusic can be as full and clear as Redbook audio or better if the composer wishes to use higher bandwidth source material. From the user's point of view, a stand-alone nonlinear Segment appears to be just another sound file that, with a registered player app installed, only needs to be double-clicked to start. On first listen, it sounds like most any other music file. However, on a second, third, fourth, and even hundredth listen, the real difference between the DirectMusic Segment and a linear file becomes more and more apparent, not in sound quality of course, but in the actual experience of listening. In fact, one could say that there never really is more than "one listen" of a robust nonlinear music file because it can render a unique rendition with every play if the composer so desires.
This is the essential difference between linear and nonlinear playback: A robust nonlinear music file continually rewards the engaged and active listener with every listen.
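The fixed run-time/footprint ratio of a linear file mentioned above is simple arithmetic. The Python sketch below assumes a constant 128 kbps bit rate and decimal megabytes, neither of which is stated in the text, and the function name mp3_minutes is hypothetical.

```python
def mp3_minutes(size_bytes, bitrate_kbps=128):
    """Run time in minutes of a constant-bit-rate file of the given size."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return size_bytes / bytes_per_second / 60

# A 10MB file at the assumed 128 kbps runs a little over ten minutes.
minutes = mp3_minutes(10 * 1000 * 1000)
```

No comparable formula exists for a nonlinear Segment, since its run time is a design decision rather than a function of file size.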
Some people on hearing or reading about nonlinear music for the first time form unrealistic assumptions and expect behavior along the lines of "Mary Had a Little Lamb" somehow magically playing itself as "Twinkle, Twinkle, Little Star." A little listening experience easily dispels such notions as most unlikely (not to mention undesirable). What generally happens with nonlinear playback is more like what happens when listening to different performances of the same piece of music. The nonlinear performance can even sound like different groups of musicians playing the same piece with genre-appropriate stylistic variations if the composer provides enough material. Even if the composer designs a huge piece with wildly varying elements, the identity of any real musical composition stays recognizable in the same way that we can recognize Coltrane's rendition of "My Favorite Things" as the "same tune" that Julie Andrews sings in The Sound of Music. Simply put, a composition that has no identifiable elements that can translate across different settings and renditions is likely not a composition in the traditional sense. Conversely, if a composition is recognized by an intelligent listener regardless of arrangement or rendition, then the identity of the piece remains intact as some sort of metadata, whether the format is linear or nonlinear. For instance, it will not take more than one or two listenings of the ProjeKct X file or its original linear performance for the listener to recognize it within a short moment. This is true even though it was a group improvisation to start with and even though the melodic lines, words, and textures play back in a different arrangement every time in the DirectMusic version.
Nonlinear Music Design

What follows is a necessarily cursory, high-level look at the framework behind some of the approaches that I've used in recent years for designing stand-alone nonlinear music files using DirectMusic. The ideas presented here are subjective and by no means exhaustive. I hope they serve to spark other ideas or at least give interested composers and designers a starting point and/or general direction from which to proceed.

Unfortunately, there is no way to create a template or formulate a simple list of instructions for making nonlinear music files. If there were, it would hardly be a creative endeavor, and designing compelling nonlinear compositions requires creativity or, at the very least, ingenuity to be done well. One factor that makes a cookie-cutter approach impossible is that the nature of the musical material used often dictates the most desirable, or even feasible, design approach; even similar-sounding pieces from the same genre can sometimes require different approaches. Additionally, the DirectMusic architecture often offers several very different ways to reach the same sounding outcome, which can be somewhat overwhelming at first. Each approach may be more or less appropriate, depending on the final application you design for and the strengths and weaknesses of your current skill set. Those with advanced MIDI chops who are comfortable building DLS sample sets may like the flexibility that comes with that approach if the material allows. (The ProjeKct X file was constructed with that approach, for instance.) More traditional audio engineers may gravitate toward DirectMusic Wave Tracks as their primary tool because of their similarity to nonlinear audio editing tools. Ultimately, each piece needs an approach unique to it and to its designer.

However, I've found it useful to keep three high-level concepts in mind when approaching a new DirectMusic project. They may seem obvious and simple to grasp at first, but I've found that they become more useful the more experienced I become. First, in brief:

Your role in the project is the first key factor to consider. How much freedom do you have to alter the material? If you are the sole, original composer, the answer is perhaps easy. If you are working as remixer, obviously a different approach is required. One role that is unique to nonlinear music production is something like that of a translator: bringing linear material written by another composer (or yourself) into the nonlinear world while conveying the spirit of the original material, which of course brings its own set of requirements and restrictions.

Next, what functionality do you need to design for? That is, what functionality does your player app expose to the end user? Will your material play in a "start and stop" player like the ProjeKct X player? Or are you required to expose more parameters, such as the ability to change the mix by loading different AudioPaths or different Band files to mute certain tracks, or to allow control of purely musical parameters, such as the tempo or harmonic character of the music? Material based heavily on large linear audio tracks simply won't allow purely musical parameter changes, since these parameters are, in effect, "baked in." This leads naturally to the next concept.

The most critical concept is what I call the granularity of the material that you are working with. It is also the hardest concept to convey in the absence of nonlinear composition experience, since it is unique to the field. Is the piece a Bach fugue for a keyboard instrument, a jazz standard with vocals, an ultra-polished mega-hit pop song, or a musically ambient sound bed for an art installation? Each of these will have a different granularity, which will dictate your approach and how much DirectMusic functionality you can use or expose. Now we can look at these concepts a little more in depth.
Role

Your role in the process is hopefully self-evident but still worth articulating to identify the boundaries within which you must work. Are you the sole composer of the piece, working entirely from scratch with the freedom to follow the piece wherever it leads? Are you working with music from the classical canon in the public domain, with a jazz standard, or with a traditional score composer and taking his or her linear ideas and converting them to DirectMusic files (as often happens in game music)? Or are you working with someone else's finished audio multitracks, using as much or as little as you like in combination with your own material the way a remix artist might? Or are you limiting yourself to recreating a variable playback version of the original piece ("translating" from linear to nonlinear)? The answers to these questions will help define your role.

Since the roles are not discrete stations, separate and isolated from each other, we can arrange the various options on an axis or continuum. For instance, remixers nearly always add their own material to a remix, often using only a vocal track and nothing else from the original. If working as translator, one might choose between adhering to the exact form of the material, allowing variations within the different parts, or one might make editorial decisions and attempt to capture the spirit of the original while taking much liberty with the original form. On a number line from 1 to 12, the continuum might look something like this:
My role in the creation of the ProjeKct X file would be somewhere around 10, closer to translator than remixer. I didn't have any new material but still made some critical choices that affected the listening experience. My aim was to create a composition that did not just sound like that particular piece of music playing but that sounded more like that particular piece of music being played live, and to accomplish this I made certain editorial/translation choices. For instance, I intentionally have the drum tracks introduce the piece the same way with every play, without variation. I also place the tacet drum breaks on a timeline similar to that of the original live performance and have them progress in intensity the same way they do in the original. In choosing the drum track to function as a sort of formal template for the Segment, which provides cues and signposts to the attentive listener, I gained more freedom to add variability to the other instruments. I wasn't forced to do it this way by technical or musical constraints; these were merely editorial decisions based on my reading of what would work best for the material in this format. I also changed the mix quite a bit from the original and brought in musical material from markedly different sections of the same improvisational suite so that this small Segment of material could stand a little better on its own as a demo. It is just one arrangement of several that I did with the material, the most formally literal with regard to the original. Had my role been further to the left of the axis above, I might have added quite a bit of my own material to act as setting for the ProjeKct X material or even just used samples here and there in my own composition.
Functionality

Of course, game audio is currently the most common application to compose nonlinear music for, but as the title of this article indicates, the application for consideration here is a stand-alone player application. Currently, the few stand-alone player apps available function primarily as simple start and stop devices, but the APIs are there to add specific DirectMusic functionality to allow a custom player to expose values for different parameters to user input. The following list of parameters is in the order of increasing granularity, as well as increasing restriction on the type of source material that is compatible with such functionality:

§ Standard playlist functionality to play multiple Segment files in a desired order
§ Choice of different AudioPaths and different Band files within a Segment to allow for different mixes
§ Allow user to mute or unmute certain sub-tracks
§ Allow user to choose approximate length of run time for each Segment
§ Allow user to choose level of "musical activity" or density
§ Allow user to alter tempo
§ Allow user to alter harmonic and melodic characteristics and/or mode
The ProjeKct X player is simple in functionality, so the Segment I provided here has a corresponding architecture; it sits squarely at 1 on the axis above. Another player I've worked with that was specifically designed to showcase this material allowed the user to choose between various mixes and signal processing via alternate AudioPaths and Band files, as well as choose differing levels of musical density and intensity via controlling Segments. Such a player would have required me to ship corresponding assets and would also have led to different design choices in the Segment's construction. It's important to note that, due to the nature of the ProjeKct X source material, it wouldn't be possible to allow the user to alter the tempo or harmonic/melodic content. This is due to the granularity of the material, as explained below.

In the middle of the axis is the game music engine: It requires much more complex design and assets than a simple Segment player, but it's still a more "controlled environment" than a player with musical parameter variables exposed to the user. There is likely no direct user access to musical and sonic parameters in a game. The composer can predict the playback parameters more reliably, so they are easier to plan for and work around as required.
Granularity

Anyone who has worked much with DirectMusic probably already has an intuitive understanding of what I mean by "granularity." It is useful to think of it in the literal sense; sand is quite granular and conforms naturally to the shape of most any vessel. Gravel is less granular but still conforms to a mold, within rough limits. Bricks or blocks are much less malleable, but in return maintain their own shape and stability, which means we can use them to build a stable vessel or container into which we can pour gravel and sand if we wish. The analogy is simple: MIDI tracks in combination with DLS instruments are like sand, phrase samples are like gravel, and Wave Tracks are like bricks.
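As a rough illustration only, the analogy can be combined with the earlier point that musical parameters are "baked in" to recorded audio. The sketch below is plain Python, not DirectMusic code; the names `Material`, `MUSICAL_PARAMETERS`, `AUDIO_PARAMETERS`, and `exposable_parameters` are hypothetical, and the parameter sets are a simplified reading of this article rather than a rule of the architecture.

```python
from enum import Enum


class Material(Enum):
    """Three kinds of source material, named after the analogy (hypothetical labels)."""
    MIDI_WITH_DLS = "sand"
    PHRASE_SAMPLES = "gravel"
    WAVE_TRACKS = "bricks"


# Simplified, assumed groupings of what a player might let the listener change.
MUSICAL_PARAMETERS = {"tempo", "key", "harmony"}
AUDIO_PARAMETERS = {"mix", "dsp", "mute_tracks"}


def exposable_parameters(material: Material) -> set:
    """Return the controls a player could plausibly expose for this material.

    Assumption drawn from the text: only note-level material leaves tempo,
    key, and harmony open; recorded phrases and long audio tracks have
    those values baked in, so only audio-level controls remain.
    """
    if material is Material.MIDI_WITH_DLS:
        return AUDIO_PARAMETERS | MUSICAL_PARAMETERS
    return set(AUDIO_PARAMETERS)


for kind in Material:
    print(kind.value, sorted(exposable_parameters(kind)))
```

A real project would draw these boundaries piece by piece; the point is only that the coarser the material, the fewer purely musical controls a player can offer.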
The ProjeKct X file belongs at around 4 or 5 on this continuum; phrase samples are built into DLS instruments. The left side allows much more malleability with regard to musical parameters, such as tempo, key, harmonic sequence, and phrasing. The right side is less concerned with musical parameters and more concerned with audio parameters. It is less musically malleable (in terms of key, tempo, etc.) but as such retains much more of the critical performance characteristics so crucial to popular music (i.e., the singer's voice or an instrumentalist's tone). As such, its stability (or invariability) will likely also determine the overall form of the Segment. It is critical that a composer become familiar with how the basic architectural units in DirectMusic relate to one another in order to understand the concept of granularity. Roughly, from smallest to largest, these units are notes, variations, patterns, Styles, and Segments. These units tend to fit one inside the other, sort of like a series of nested Russian boxes.
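The nesting described above can be pictured with a small data-structure sketch. This is plain Python written for illustration, not the DirectMusic object model: every class and field name here is hypothetical, and the real architecture has more layers than this simplified picture of notes inside variations inside patterns inside Styles inside Segments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    pitch: int          # MIDI note number
    start: float        # position in beats
    duration: float     # length in beats
    velocity: int = 100


@dataclass
class Variation:
    notes: List[Note] = field(default_factory=list)


@dataclass
class Pattern:
    name: str
    variations: List[Variation] = field(default_factory=list)


@dataclass
class Style:
    name: str
    patterns: List[Pattern] = field(default_factory=list)


@dataclass
class Segment:
    name: str
    styles: List[Style] = field(default_factory=list)


def count_notes(segment: Segment) -> int:
    """Walk the nesting from the Segment down to the individual notes."""
    return sum(
        len(variation.notes)
        for style in segment.styles
        for pattern in style.patterns
        for variation in pattern.variations
    )


# A tiny example: one Style, one pattern, two alternative variations.
segment = Segment("demo", [Style("groove", [Pattern("verse", [
    Variation([Note(60, 0.0, 1.0), Note(64, 1.0, 1.0)]),
    Variation([Note(60, 0.0, 0.5)]),
])])])
print(count_notes(segment))
```

Choosing a granularity amounts to deciding at which level of this nesting the material can still be varied without losing its character.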
Once you grasp the internal relationships of the DirectMusic elements, the most critical step in designing a nonlinear Segment becomes somewhat easier — defining the smallest possible unit you can reduce the source material to without losing the essential quality you wish to convey, almost like discovering a "musical molecule." This analogy, of course, cannot be taken too literally since your definition of the granularity will likely be determined somewhat by your role and not by some objective musical quality. For example, if you "translate" a hit song by a well-known artist and you want your Segment to maintain a traditional song form, the critical element is almost certainly the singer's vocal performance and the lyrics. The granularity in such a case is possibly as large as an entire verse and certainly no smaller than a full vocal/lyrical phrase. This would tend to dictate a design based around DirectMusic Wave Tracks, and the variations would likely be more in the post-production elements, such as altering the mix via track levels and differing DSP treatments, using alternate instrumental tracks, or altering the song's form. If you're working closely with the artist, you might be able to include alternate vocal takes. If, however, you design for a dance remix of the same song and are not concerned with maintaining pop song form, you can make the vocal line as granular as you want to take it. In such a case, cutting the vocal line into phrase samples and making a DLS instrument with them might be the best option, depending on the rest of your material. In the case of instrumental music, the level of granularity may also be as large as a verse or a chorus, especially if the critical element you wish to keep is the player's individual tone. If the instrument is especially MIDI friendly though, high-quality DLS instruments may be an option, in which case the level of granularity may drop all the way down to the individual note level, again depending on your role. 
For example, one could theoretically ship a well-captured keyboard performance in the form of MIDI-based DirectMusic Segments along with a Gigasampler-quality DLS instrument. This would be the digital equivalent of shipping a player piano, the piano rolls, and the player. Rather than play the linear MIDI sequence the same way every time, the Segments allow for subtle musical variation in tempo, velocity, and duration, even within specifications derived from analyzing the way a particular player varies his or her performances. Imagine something like Keith Jarrett's Sun Bear Concerts in a format that sounded like he was playing your favorite Gigapiano in your listening room and had the option for subtle or not-so-subtle "Jarrett-like" variations that would differ with every playback. ("Hmm, shall I have Mr. Jarrett perform on my Bösendorfer or the Steinway today?") In such a case, the DirectMusic engineer (i.e., the person who converted the MIDI sequence to DirectMusic Segments and edited and added the musically appropriate variation parameters) would be working around 12 on the role axis above, but the material would be near 1 on granularity. The most compatible player would be near 12 on the player functionality axis above.
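To make the idea of bounded variation in tempo, velocity, and duration concrete, here is a minimal sketch in plain Python. It does not use DirectMusic, and it is not how the authoring tools implement variation; `NoteEvent`, `vary_performance`, and the spread values are hypothetical names and numbers chosen for illustration. In the scenario described above, the spreads would come from analyzing a particular player's performances.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NoteEvent:
    pitch: int             # MIDI note number, never altered here
    start_beats: float     # position in beats, never altered here
    duration_beats: float
    velocity: int          # MIDI velocity, 1..127


def vary_performance(
    notes: List[NoteEvent],
    tempo_bpm: float,
    tempo_spread: float = 0.03,      # assumed: at most +/-3% tempo change
    velocity_spread: int = 8,        # assumed: at most +/-8 velocity steps
    duration_spread: float = 0.05,   # assumed: at most +/-5% duration change
    rng: Optional[random.Random] = None,
) -> Tuple[float, List[NoteEvent]]:
    """Return a new tempo and a subtly varied copy of the note list."""
    if rng is None:
        rng = random.Random()
    new_tempo = tempo_bpm * (1.0 + rng.uniform(-tempo_spread, tempo_spread))
    varied = []
    for note in notes:
        velocity = note.velocity + rng.randint(-velocity_spread, velocity_spread)
        velocity = max(1, min(127, velocity))  # stay inside the MIDI range
        duration = note.duration_beats * (
            1.0 + rng.uniform(-duration_spread, duration_spread)
        )
        varied.append(replace(note, velocity=velocity, duration_beats=duration))
    return new_tempo, varied


phrase = [NoteEvent(60, 0.0, 1.0, 90), NoteEvent(67, 1.0, 0.5, 120)]
tempo, performance = vary_performance(phrase, 72.0, rng=random.Random(1))
print(round(tempo, 2), [(n.velocity, round(n.duration_beats, 3)) for n in performance])
```

Pitches and start positions are left alone, so the piece stays recognizable; only the performance details move, and only within the stated limits.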
Conclusion

I want to make clear that I view DirectMusic as the basis for a nonlinear musical format, not as a finished and definitive format. DirectMusic has the advantage of being part of the DirectX APIs, which means there are many potential DirectMusic players (in hardware terms) already in listeners' homes right now, today. With the resources in this book, anyone with enough interest could make their own DirectMusic Segments and a simple player, make them available via download or file sharing, and have millions of potential listeners. Its disadvantage is that the authoring tools have never benefited from the rigorous design/dev/user-test feedback cycle required of commercial software in the marketplace. They have always shipped free as part of the DirectX SDK, and it shows. However, dedicated users have proven that they can master the challenges of the tools and turn the incredible complexity of the architecture to great musical advantage in award-winning games. Nonlinear music can become the next step in music production technology if music lovers and the listening public have access to quality examples of nonlinear music, not only in games but also on their home multimedia stations. If that happens and the demand for "more" occurs, better tools and improvements in the architecture will not be far behind. Then, who knows, by the turn of the next century our current linear formats may seem as quaint and outdated as one of Edison's wax cylinders does now.