VOLUME 26

THE COMPUTER MUSIC AND DIGITAL AUDIO SERIES

Peter Elsea

Electroacoustic music is now in the mainstream of music, pervading all styles from the avant-garde to pop. Even classical works are routinely scored on a computer, and a synthesized demo is a powerful tool for previewing a piece. The fundamental skills of electroacoustic composition are now as essential to a music student as ear training and counterpoint. The Art and Technique of Electroacoustic Music provides a detailed approach to those fundamental skills. In this book Peter Elsea explores the topic from the fundamentals of acoustics through the basics of recording, composition with the tools of musique concrète, and music production with MIDI instruments, softsynths, and digital audio workstations. Later sections cover synthesis in depth and introduce high-powered computer composition languages including Csound, ChucK, and Max/MSP. A final section presents the challenges and techniques of live performance. This book can be used as a text for undergraduate courses and also as a guide for self-learning.

A-R Editions, Inc.

The Art and Technique of Electroacoustic Music

Peter Elsea

Peter Elsea served as principal instructor and director of the Electronic Music Program at the University of California, Santa Cruz from 1980 until retirement in 2013. During his tenure, the program gained an international reputation for breadth and rigor. Alumni of the program have won or been nominated for Oscars and Clios for scores and sound designs or become chart-topping performers. Many others have found a place in the music software industry, designing tools that constantly advance the state of the art. Elsea's career began at the University of Iowa Center for New Music in 1972, where he studied under Peter Todd Lewis and Lowell Cross and worked closely with many prominent electroacoustic composers of the day. He is widely known as an internet author, with his posted articles referenced by thousands of sites around the web.


8551 Research Way, Suite 180 Middleton, WI 53562 800-736-0070 608-836-9000 http://www.areditions.com



THE ART AND TECHNIQUE OF ELECTROACOUSTIC MUSIC


THE COMPUTER MUSIC AND DIGITAL AUDIO SERIES
John Strawn, Founding Editor
James Zychowicz, Series Editor

Recent titles include:
New Digital Musical Instruments: Control and Interaction Beyond the Keyboard
  Eduardo R. Miranda and Marcelo M. Wanderley, with a Foreword by Ross Kirk
Fundamentals of Digital Audio, New Edition
  Alan P. Kefauver and David Patschke
Hidden Structure: Music Analysis Using Computers
  David Cope
A-Life for Music: Music and Computer Models of Living Systems
  Eduardo Reck Miranda
Designing Audio Objects for Max/MSP and Pd
  Eric Lyon
The Art and Technique of Electroacoustic Music
  Peter Elsea

Full information about the entire series is available online: http://www.areditions.com/cmdas


Volume 26 • THE COMPUTER MUSIC AND DIGITAL AUDIO SERIES

THE ART AND TECHNIQUE OF ELECTROACOUSTIC MUSIC
Peter Elsea




A-R Editions, Inc. Middleton, Wisconsin


Library of Congress Cataloging-in-Publication Data

ISBN 978-0-89579-741-4
A-R Editions, Inc., Middleton, Wisconsin 53562
© 2013 All rights reserved
Printed in the United States of America
10 8 6 4 2 1 3 5 7 9


Contents

Preface  xi
Acknowledgments  xiv
List of Figures  xv

Part 1  Building the Studio  1
Chapter One  Find a Quiet Spot . . .  3
    Audio Equipment  4
    The Signal Path  9
    The Computer  10
    Software  12
    Putting It All Together  18
    A Note about Institutional Studios  21
    Resources for Further Study  23

Part 2  Fundamental Concepts and Techniques  25
Chapter Two  Sound  27
    Listening to Sound  27
    Looking at Sound  35
    Comparing Sound  45
    Paying Attention to Sound  48
    Exercises  48
    Resources for Further Study  48
Chapter Three  Recording Sounds  51
    The Recording Signal Chain  51
    Catching a Sound  67
    Microphone Technique  68
    Recording Quality  69
    Exercise  72
    Resources for Further Study  72
Chapter Four  Sound after Sound  73
    Audio Editing Software  73
    Splicing Exercises  82
    The Species I Exercise  82
    Advanced Techniques  85
    A Case Study  89
    Resources for Further Study  92
Chapter Five  Processing Sounds  93
    Modifying Frequency Response  94
    Modifying Amplitude  104
    Playing With Time  111
    Distortion  117
    Exercises  122
    Resources for Further Study  122
Chapter Six  Layering Sounds  123
    Digital Audio Workstations  123
    Composing With Layers of Sound  137
    Case Study  144
    Exercises  145
    Resources for Further Study  147

Part 3  Music Store Electroacoustic Music  149
Chapter Seven  MIDI  151
    History  151
    The Hardware  153
    The Messages  155
    Standard MIDI Files  161
    General MIDI  162
    MIDI and Notation  162
    Future MIDI  163
    Exercise  164
    Resources for Further Study  164
Chapter Eight  Sequencing Programs  165
    Getting Set Up  166
    Recording MIDI Data  168
    Editing MIDI Notes  171
    Composing in MIDI Editors  181
    Exercises  184
    Resources for Further Study  185
Chapter Nine  Samplers  187
    A Bit of History  187
    The Sampling Paradigm  188
    Recording Samples  189
    Programming Voices  195
    Rhythmic Loops  201
    The Sampling and Copyright Lecture  202
    Exercises  202
    Resources for Further Study  203

Part 4  Synthesis  205
Chapter Ten  Fundamentals of Synthesis  209
    The Modular Legacy  209
    The Patch  210
    Modules  211
    Performance Interfaces  222
    Fundamental Patches  224
    Modulation  228
    Note Generation  232
    Exercises  234
    Resources for Further Study  235
Chapter Eleven  Voicing Synthesizers  237
    Patching Paradigms  237
    A Simple Subtractive Synthesizer  238
    Subtractive Synthesis on Steroids  246
    Exercises  266
    Resources for Further Study  266
Chapter Twelve  FM Synthesis  267
    The Math of FM Synthesis  268
    The Sound of FM Synthesis  270
    Further Exploration of FM Synthesis  279
    Exercise  305
    Resources for Further Study  305
Chapter Thirteen  New Approaches to Synthesis  307
    A Closer Look at the Makeup of Sound  307
    Analysis  310
    Additive Synthesis  313
    Spectral Synthesis  320
    Granular Synthesis  324
    Modeling Synthesis  329
    Advanced Synthesis and the Composer  338
    Exercises  338
    Resources for Further Study  338

Part 5  Research-style Synthesis  341
Chapter Fourteen  Composing on a QWERTY Keyboard  343
    Overview of Csound  344
    Learning to Build Complex Instruments  353
    Resources for Further Study  361
Chapter Fifteen  Coding for Real-time Performance  363
    Overview of ChucK  364
    Programming Lessons  371
    Resources for Further Study  392
Chapter Sixteen  Programming with Boxes and Lines  393
    Basic Max  394
    MSP Audio and Synthesis  406
    A Hint of Jitter  414
    PD: Simple, Effective, and Free  418
    Resources for Further Study  419
Chapter Seventeen  Synthesis in Hardware  421
    Classic Instruments  422
    Hardware Accelerators  432
    Kyma  434
    Hardware Forever?  443
    Resources for Further Study  443

Part 6  Live Electroacoustic Music  445
Chapter Eighteen  The Electroacoustic Performer  447
    Sources and Controllers  448
    Putting Electronics on Stage  455
    Putting Your Show on Stage  460
    The Last Word on Performance  463
    Resources for Further Study  464
Chapter Nineteen  Composing for Electronic Performance  465
    The Classic Approach: Live Instrument(s) with Prerecorded Accompaniment  465
    Looping as an Art Form  468
    Transformations and Processing  471
    Triggering Sounds and Processes  475
    Algorithmic Performance  481
    Composing the Performance  486
    Exercises  487
    Resources for Further Study  487

Select Bibliography  489
Appendix: Contents of the Accompanying DVD  493
Index  501

Preface

Electroacoustic music is one of the fastest-growing music genres today, encompassing academic studies as well as popular styles such as trance, techno, and so on. Electroacoustic music is heard in diverse markets such as television and film soundtracks, commercials, pop songs, and underground radio. Electroacoustic compositions are some of the most common postings on Internet sites such as YouTube and Facebook. New recording companies are springing up to publish compact discs and even vinyl albums of these fresh sounds. The common thread in all of this activity is that the material is produced by (mostly) solitary artists using microphones and computers in private studios. This is a radical change from traditional modes of music production and requires a set of skills and knowledge quite different from that available in the past.

Most college music departments and many community colleges include some study of electronic music in the curriculum, and it is beginning to turn up at the secondary school level as a low-cost alternative to traditional ensembles. I have even seen electronic music in middle school and in after-school youth programs. With all of this interest, there are naturally many books published each year about electronic music and audio production. What makes this one different?

Most of the books—and, I am sorry to say, many of the courses—focus exclusively on science and technology. A good understanding of underlying principles of the technology is helpful, maybe even essential, but a focus on technical matters can easily obscure the central point—what does it sound like? It is one thing to make a student memorize the formula that describes signal phase; it is another to point out the effect phase may have on a microphone recording or a dual oscillator patch. It is better yet to teach the student how to hear the effects of phase and the skills to deal with it when necessary.

This book has a threefold purpose. Composers who understand the underlying principles and paradigms of sound manipulation will find it easy to learn an unfamiliar interface, so this book covers those principles at a practical level of detail. Composers whose experience consists of pasting together precooked loops and samples will be liberated by acquiring the skills to create their own materials, and this book teaches those skills by examples and exercises. It also introduces advanced techniques that allow composers to think outside of the loop and take their work to new levels of subtlety and expression. Finally, a composer's success depends on the ability to hear both the technical and musical details of his or her own work. This book provides exercises to develop that ability.

This book is designed to serve as a text for a one-year course in electroacoustic composition or as a study guide for composers setting up on their own. It is not a guide to the use of Pro Tools, the Evolver, or any specific program or piece of equipment. This book presents the musical and auditory skills required to compose music using programs like Pro Tools and the hardware found in any electronic music studio. The book is divided into six parts, which may loosely fit two per academic quarter or three per semester.

Part 1, "Building the Studio," is an overview of the equipment and software used for electroacoustic music. It provides an orientation to an institutional studio or guidelines for composers setting up their own. The old-fashioned model of an expensive studio with locked-down equipment and controlled access is still prevalent in music schools, but it is supplemented and will eventually be supplanted by students setting up modest but powerful private studios. This section provides guidelines for doing that.

Part 2, "Fundamental Concepts and Techniques," defines the fundamental skills necessary for electroacoustic work. These include techniques for recording, processing, and organizing sound and, more important, the listening skills necessary to execute these techniques with professional competence. These are taught within a framework of musique concrète études. This style of composing will be unfamiliar to most students and so focuses their attention on the sonic detail of the materials.

Part 3, "Music Store Electroacoustic Music," introduces commercial production techniques. The tools of MIDI, sequencing workstations, and sample-based synthesis are used for the bulk of electroacoustic music produced today, including work by composers who get paid to do it. Commercial composition is well covered by manual supplements and books of the "Composing for Underachievers" type, so this section presents the difficult parts: the details of MIDI messages, the underlying structure of sequencers, and how to create your own sample libraries.

Part 4, "Synthesis," is an exploration of methods for creating artificial sounds. Again, the emphasis is on universal principles and on hearing sonic detail. Some of the lessons seem specific to certain programs, but they are easily transferred to other applications of the same type.

Part 5, "Research-style Synthesis," is different in organization. These four chapters each serve as an introduction to a major research platform that in itself could be the basis for a career. It is suggested that teachers choose one of the four based on their own experience and preference. Self-guided students should skim over all of them, then choose the platform they want to investigate more fully.

Part 6, "Live Electroacoustic Music," is also different in organization. Live performance is concurrent with all the topics covered, as it is simply an alternative way to present music. These chapters may be assigned whenever they are appropriate to coincide with performance opportunities and student interest.

Several important topics are not covered in this book. It has no historical material and very little reference to the canon of significant compositions. There are several excellent books on the history of electronic music, any one of which could be used as a second text or the basis of an independent course. The most important historical works are available on compact disc, primarily through the offices of the Electronic Music Foundation (emf.org). I leave it up to the teacher to integrate these materials into the course. Students must be made aware of the composers who have come before and of the subtle interplay of technology and art, and they should be encouraged to build their own libraries of electroacoustic masterpieces.

This book does not contain any stylistic or musical advice other than universally applicable concepts. My own style will be apparent from some of the examples, but the techniques presented are suitable to any genre. I expect the teacher to provide any stylistic guidance needed in the process of evaluating student work and teaching the skills of self-evaluation. I am always a bit pleased when I notice a reference to one of my compositions in something a student has done, but I consider that a case of teaching by example. My courses are about how to compose, never what to compose.


Acknowledgments

Every teacher’s work is built on a foundation established by his own teachers. Credit for my musical and technical foundation is owed to Peter Todd Lewis, Lowell Cross, and Gordon Mumma, whose tutelage and collaboration formed the basis of my own career. Any success I have had as a teacher is owed to my music education mentor, Steven Hedden. The existence of this book is owed to David Cope, whose prodding and encouragement kept me writing and whose criticism and advice kept the writing readable. I also want to thank those who read and advised me on particular parts of the book—Paul Nauert, Avi Tchamni, and Dr. Richard Boulanger, as well as the many software authors who were kind enough to answer my impertinent questions. Ron Alford broadened my knowledge of real-world teaching and provided motivation during the rough patches. My wife Veronica deserves a special award for patience in putting up with book-related absentmindedness and ill humor. But the greatest debt is owed to the 900 students who sat in my classroom over the years. It is their questions, their eagerness to explore possibilities, and their sheer joy in the act of creation that have inspired me to keep experimenting, keep thinking, and keep teaching.


List of Figures

Chapter 1
Figure 1.1   Typical frequency response graph
Figure 1.2   Typical signal path
Figure 1.3   Typical equipment layout

Chapter 2
Figure 2.1   Intervals of the harmonic series
Figure 2.2   Fletcher-Munson curves
Figure 2.3   Exponential and log curves
Figure 2.4   Sine waveform
Figure 2.5   The effect of phase on combining waveforms
Figure 2.6   Violin waveform
Figure 2.7   Violin waveform at higher pitch
Figure 2.8   Noise waveform
Figure 2.9   Snare drum envelopes
Figure 2.10  Spectrum of sine tone (slightly distorted)
Figure 2.11  Spectrum of violin tone
Figure 2.12  Time spectrum of an open G on the violin
Figure 2.13  Spectra of four sounds in DVD example 2.23
Figure 2.14  Example of a metering plug-in
Figure 2.15  Masking curves

Chapter 3
Figure 3.1   Dynamic microphone
Figure 3.2   Condenser microphone
Figure 3.3   Bidirectional microphone
Figure 3.4   Omnidirectional microphone
Figure 3.5   Cardioid microphone
Figure 3.6   Cardioid frequency response
Figure 3.7   Effects of aliasing on waveform
Figure 3.8   Effect of word size on waveform
Figure 3.9   The record window from WavePad

Chapter 4
Figure 4.1   Transport controls from Peak LE 6
Figure 4.2   Editing window from Peak LE 6
Figure 4.3   Silence between sounds in the editing window
Figure 4.4   A guitar note in the editing window
Figure 4.5   A bad splice
Figure 4.6   A good splice
Figure 4.7   The beat point in a complex sound

Chapter 5
Figure 5.1   Finding plug-ins in Peak LE 6
Figure 5.2   Low-pass and high-pass filters
Figure 5.3   Band-pass and notch filters
Figure 5.4   The effect of Q on band-pass filters
Figure 5.5   Shelving equalizer
Figure 5.6   Spectrum of pink noise
Figure 5.7   Graphic EQ plug-in
Figure 5.8   Graphic EQ settings for DVD example 5.2
Figure 5.9   Parametric EQ plug-in
Figure 5.10  Parametric EQ setting for DVD example 5.4
Figure 5.11  The effect of limiting on waveform
Figure 5.12  The action of a compressor
Figure 5.13  The word look
Figure 5.14  The word look compressed
Figure 5.15  Components of reverberation
Figure 5.16  The effect of transfer function on a sine wave
Figure 5.17  Sine tones with symmetrical and asymmetrical distortion
Figure 5.18  Spectra of distorted sine tones
Figure 5.19  Sine wave shaped by a triangle wave
Figure 5.20  Rectified sine wave
Figure 5.21  The effect of overdrive on sine waves

Chapter 6
Figure 6.1   Pro Tools edit window
Figure 6.2   Pro Tools track controls
Figure 6.3   The trim operation
Figure 6.4   Architecture of a mixer
Figure 6.5   Mix window in Pro Tools

Chapter 7
Figure 7.1   MIDI connections

Chapter 8
Figure 8.1   Main window of Logic
Figure 8.2   Event list editing window
Figure 8.3   Piano roll editing window
Figure 8.4   Percussion grid editing window
Figure 8.5   Notation editing window

Chapter 9
Figure 9.1   Sample editor
Figure 9.2   A nicely placed loop
Figure 9.3   Crossfade loop
Figure 9.4   Typical sample mapping
Figure 9.5   Playback engine
Figure 9.6   Control envelopes
Figure 9.7   LFO shapes
Figure 9.8   Interaction of control envelope and sample envelope

Chapter 10
Figure 10.1  The Tassman patch builder window
Figure 10.2  The Tassman player window
Figure 10.3  Oscillator waveforms
Figure 10.4  Sine waves from analog synthesizers
Figure 10.5  Pulse width or duty cycle
Figure 10.6  The spectrum of a 5:1 pulse wave
Figure 10.7  The spectrum of a triangle wave
Figure 10.8  Wavetable lookup
Figure 10.9  Filter response curves
Figure 10.10 Gates and envelopes
Figure 10.11 Basic beep
Figure 10.12 Beep with filter
Figure 10.13 Doubled oscillators
Figure 10.14 Vibrato with LFO
Figure 10.15 The basic FM spectrum; carrier is at 8 kHz, modulator at 500 Hz
Figure 10.16 The basic FM patch
Figure 10.17 Polyphonic voices
Figure 10.18 Sample and hold waveforms

Chapter 11
Figure 11.1  Remedy plug-in synthesizer
Figure 11.2  Remedy signal path
Figure 11.3  The spectrum of Remedy's square wave
Figure 11.4  Time spectrogram of a changing duty cycle
Figure 11.5  Time spectrogram of Remedy FM
Figure 11.6  Time spectrogram of the Remedy filter
Figure 11.7  The effect of drive in Remedy
Figure 11.8  Absynth patch window
Figure 11.9  Absynth envelopes
Figure 11.10 Absynth envelope with exponential segments
Figure 11.11 Absynth envelope with LFO
Figure 11.12 Absynth oscillator
Figure 11.13 Waveshaping in Absynth
Figure 11.14 Transfer functions in analog electronics
Figure 11.15 Effect of modeled analog and pure digital low-pass filters on a sawtooth spectrum
Figure 11.16 Effect of a high-pass filter on a sawtooth spectrum
Figure 11.17 Effect of Q on a band-pass filter
Figure 11.18 Effect of an all-pass filter on a sawtooth spectrum
Figure 11.19 Effect of a notch filter on a sawtooth spectrum
Figure 11.20 Effect of a comb filter on a sawtooth spectrum
Figure 11.21 Mixed waves with changing phase
Figure 11.22 Crossfade envelopes in Absynth

Chapter 12
Figure 12.1  Basic FM spectra
Figure 12.2  Sideband strength as a function of modulation index
Figure 12.3  Logic EFM1 FM synthesizer plug-in
Figure 12.4  EFM1 architecture
Figure 12.5  Spectrum of 1:1 frequency modulation
Figure 12.6  Spectrum of 1:10 frequency modulation
Figure 12.7  Spectrum of 9:1 frequency modulation
Figure 12.8  Expert mode of the FM8 software synthesizer
Figure 12.9  Patching in FM8
Figure 12.10 Ops pane in FM8
Figure 12.11 Operator view in FM8
Figure 12.12 Spectra of modulator waveforms
Figure 12.13 Modulation with a complex waveform
Figure 12.14 Modulation with two operators
Figure 12.15 Waveforms produced by feedback modulation
Figure 12.16 Feedback modulation
Figure 12.17 Spectrum of stacked modulators
Figure 12.18 Producing noise with frequency modulation
Figure 12.19 Spectrum of a real trumpet
Figure 12.20 Spectra of various FM trumpet presets
Figure 12.21 FM trumpet patch showing modulation levels
Figure 12.22 Carrier and modulator envelopes for brassy sounds
Figure 12.23 Time spectrogram of FM trumpet
Figure 12.24 Pitch envelope for FM trumpet
Figure 12.25 Operator envelopes for FM trumpet
Figure 12.26 Patch and operator tunings for the Choir preset
Figure 12.27 Spectrum of the Choir preset
Figure 12.28 Envelopes for the Choir preset
Figure 12.29 Formant filter settings for the Choir preset
Figure 12.30 Filter response for the Choir formant
Figure 12.31 Spectra of filtered Choir
Figure 12.32 Spectrum and time spectrogram of a violin
Figure 12.33 Foundation of the Soft Strings patch
Figure 12.34 The Soft Strings patch with modulators
Figure 12.35 The Soft Strings patch with feedback
Figure 12.36 The Soft Strings preset modulation settings
Figure 12.37 The complete Soft Strings patch
Figure 12.38 Envelopes for the Soft Strings preset
Figure 12.39 Patch and operator tunings for the Silver Chordz preset
Figure 12.40 Key scaling in the Silver Chordz preset
Figure 12.41 Envelopes for evolution
Figure 12.42 Envelopes for the Is a Twin preset

Chapter 13
Figure 13.1  Time spectrum in waterfall mode
Figure 13.2  Windows in the FFT algorithm
Figure 13.3  The SPEAR analysis window
Figure 13.4  Vector synthesis
Figure 13.5  The Alchemy additive editor
Figure 13.6  Trumpet spectrum in Alchemy
Figure 13.7  Key follow with transition between 50 percent and 59 percent
Figure 13.8  Spectrum with best time settings
Figure 13.9  Spectrum with best frequency settings
Figure 13.10 Fish photo used for DVD example 13.8
Figure 13.11 MetaSynth Image Synth
Figure 13.12 Granular synthesis
Figure 13.13 RTGS-X main window
Figure 13.14 TimewARP 2600 synthesizer
Figure 13.15 Modelonia acoustic modeling synthesizer
Figure 13.16 Plucked string model
Figure 13.17 Modelonia pick module
Figure 13.18 Modelonia string module
Figure 13.19 Pipe model
Figure 13.20 Modelonia horn module
Figure 13.21 Modelonia lips module
Figure 13.22 String Studio page B

Chapter 14
Figure 14.1  The Karplus-Strong algorithm
Figure 14.2  Flowchart for filtered pluck instrument
Figure 14.3  The Csound window produced by the code in Listing 14.12

Chapter 15
Figure 15.1  MiniAudicle windows for ChucK
Figure 15.2  Flowchart of MIDI parser
Figure 15.3  A wheel of pitches
Figure 15.4  Biquad filter

Chapter 16
Figure 16.1  Basic elements of Max (Version 5)
Figure 16.2  Data processing
Figure 16.3  Integer and floating point math
Figure 16.4  Chords in C major
Figure 16.5  Simple keyboard
Figure 16.6  Keyboard patch in presentation view
Figure 16.7  Random note generator
Figure 16.8  Syncing metro objects
Figure 16.9  Transport control of tempo
Figure 16.10 A simple rhythm engine
Figure 16.11 Chaotic Player
Figure 16.12 Encapsulation
Figure 16.13 Managing message order with trigger
Figure 16.14 MSP in action
Figure 16.15 Flanging a recording
Figure 16.16 Playback from buffer~
Figure 16.17 Basic Beep in Max
Figure 16.18 A beeping abstraction
Figure 16.19 Six-voice polyphony
Figure 16.20 FFT vocoder
Figure 16.21 Importing an image into Jitter
Figure 16.22 Playing movies
Figure 16.23 Visualizing audio
Figure 16.24 Basic beep in PD

Chapter 17
Figure 17.1  Typical tone module (Korg NS5R)
Figure 17.2  Typical synthesizer voice structure
Figure 17.3  Four-part performance patch
Figure 17.4  Editing synthesizer parameters on the computer screen
Figure 17.5  Evolver (Dave Smith Instruments)
Figure 17.6  System exclusive control of Evolver
Figure 17.7  Kyma editing window
Figure 17.8  Kyma virtual control surface
Figure 17.9  Basic beep in Kyma
Figure 17.10 MIDI instrument in Kyma
Figure 17.11 Two-channel MIDI in Kyma
Figure 17.12 Group additive synthesis in Kyma
Figure 17.13 Kyma vocoder

Chapter 18
Figure 18.1  Piezo transducers
Figure 18.2  Orange crate board with vibrators
Figure 18.3  AudioMulch
Figure 18.4  Ableton Live performance mode

Chapter 19
Figure 19.1  Score for instrument and recorded sounds
Figure 19.2  Max beat clock patch
Figure 19.3  Excerpt from Requiem for Guitar
Figure 19.4  Signal path with A-B switch
Figure 19.5  Adjustable control numbers in Max
Figure 19.6  Preset stepping in Max
Figure 19.7  Automation in AudioMulch
Figure 19.8  Gesture detection in Max
Figure 19.9  Tap tempo detector in Max
Figure 19.10 Follow tempo in Max
Figure 19.11 Rhythm pattern detection in Max
Figure 19.12 Working with probabilities in Max
Figure 19.13 A probabilistic note player in Max
Figure 19.14 Probability-driven harmony in Max
Figure 19.15 Contents of colls in Figure 19.14


Part 1  Building the Studio

Every art requires a suitable studio. A studio is a place where tools can be stored, where large projects can be safely left half-finished, and where the mess is not annoying to other people. Further, a studio is a refuge, a place where the artist can concentrate without distractions, make mistakes in private, and occasionally let off a bit of steam. It's not difficult to set up a studio for electroacoustic music, but it involves potentially expensive choices, so some technical understanding of the task will result in a better work environment.


ONE Find a Quiet Spot . . .

Sound quality is the top priority for an electroacoustic studio. The acoustical requirements are not as stringent as those for a recording studio, but they cover the same ground. There should be no intrusive noise, sounds made in the studio should not bother others, and the ambience of the space should be natural and comfortable. These features require solid construction of the room, well-fitted doors and windows, some consideration of the shape of the room, and thoughtful treatment of surfaces.

Few composers have the luxury of working in acoustically designed spaces. Some may have access to such places at a school or commercial studio, but most of us work in a house or an apartment. Luckily, a spare bedroom usually provides all we need. In warmer climates a garage might be a workable location, but most are so flimsy you are essentially outdoors, subject to traffic noise and close to the neighbors. Basements often offer excellent isolation, but low ceilings produce an odd ambience and any dampness will play hob with the electronics.

When considering any space, pay attention to sound coming from the adjoining rooms. The main noisemakers in a house are refrigerators, laundry facilities, air conditioners, and plumbing, all of which produce low-frequency sounds that easily pass through walls. No surface treatment will have a significant effect on this problem. You must also consider human activities. Sound intrusions from TVs, stereos, and games may be controlled by negotiation, but it's less tiresome to avoid the possibility altogether.

Of course, the studio must possess the amenities any room needs: adequate ventilation, decent light, and a comfortable temperature. The room need not be particularly large as long as there is space for the planned equipment. I've worked for years in a room about 8 feet by 10 feet, which I consider a practical minimum. This is big enough for a basic setup, but I can't bring anyone else in to hear work in progress. You will need a bigger space if you play an acoustic instrument or if others are going to join you. The equipment layout will be about the same in any size room, but the ideal situation allows for placement in the middle of the room with comfortable access to the back of the gear.

The precise shape of the room isn't likely to be much of a problem. Most houses are designed with some attention to the proportions of rooms. Rooms should not be round or perfectly square, but you seldom encounter these. The main exception is old houses where large rooms have been divided. There you might find some peculiar spaces, generally long and narrow. You can test for shape problems by clapping your hands in the empty room. A badly proportioned room will respond with a definite pitch, or perhaps a flutter echo, which is a series of rapid pulses of sound. Mild cases of these symptoms are treatable, but if the effect is severe, choose another room.
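The definite pitch a bad room sings back comes from its standing waves. As a rough, hedged illustration (this calculation is not part of the text, just ordinary acoustics), the lowest axial mode between two parallel walls is about f = c / 2L, where c is the speed of sound and L the wall spacing. A minimal Python sketch, assuming the 8-by-10-foot room mentioned above:

```python
# Rough axial room-mode estimate, f_n = n * c / (2 * L); illustrative sketch only.
# Real rooms also have tangential and oblique modes, so treat these as ballpark figures.
SPEED_OF_SOUND = 343.0   # meters per second, near room temperature
FEET_TO_METERS = 0.3048

def axial_modes(length_m, count=3):
    """First few axial-mode frequencies (Hz) for one room dimension."""
    return [n * SPEED_OF_SOUND / (2.0 * length_m) for n in range(1, count + 1)]

for label, feet in [("8 ft wall spacing", 8), ("10 ft wall spacing", 10)]:
    print(label, [round(f, 1) for f in axial_modes(feet * FEET_TO_METERS)])

# 8 ft wall spacing [70.3, 140.7, 211.0]
# 10 ft wall spacing [56.3, 112.5, 168.8]
# In a square room the two lists would coincide, so the same few low frequencies
# are reinforced twice over -- one reason a single pitch can ring out when you clap.
```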

AUDIO EQUIPMENT

The heart of any studio is a high-quality sound system. This typically consists of mixing console, amplifier, and speakers and can be augmented by synthesis and processing gear. The size and complexity of these are related to the types of music and processes you are interested in. Most electroacoustic studios are modest in size compared to commercial recording facilities, but sonic quality is another story. There is no reason a composition studio cannot match or even exceed the best professional operation in quality. All the composer needs is some understanding of audio specifications and the skill to hear flaws when evaluating products.

Signal Quality

The electrical representation of sound is called the signal. Electroacoustic music is produced by modifying signals. We amplify a signal with the understanding that this will make the sound louder, we equalize a signal knowing that the result will affect the tonal balance, and so on. The concept of signal quality comes down to the sound that will result. Are there any changes that we did not intend? This can be tested by recording the signal with no deliberate changes. A perfect system would reproduce a sound identical to what went into the microphone in the first place, but that is impossible. Microphones do not do a perfect job of converting sound pressure to electricity, and speakers are even worse in converting it back. Even so, the human hearing mechanism is quite forgiving, and listeners are often pleased with the results, denying any difference between the original and the reproduction. This was true even in the early days of recording, when sound was reproduced by a cactus needle following the groove of sound waves scratched in shellac. What we are actually considering when we talk about signal quality are changes that will not be acceptable to listeners. The most prominent offenders affect frequency response, distortion level, and noise.

Frequency Response

Humans can hear sounds that range in frequency from 20 to 20,000 Hz. If an audio system does not reproduce this entire range, listeners will be aware they are missing something. For an example of restricted frequency range, listen to a telephone, which covers only about 20 percent of the audible spectrum. Further, if an audio system favors some frequency bands over others, the quality of the sound changes—this is called coloration. Some people have a taste for certain types of coloration, such as turning up the bass on their stereos, but this should be a deliberate choice. Frequency response is usually displayed as a graph of amplitude across the entire audible range, as shown in Figure 1.1. The graph shows change in amplitude at each frequency. If there is no preference for one frequency over another, the graph is flat. Good equipment has flat frequency response.

FIGURE 1.1 Typical frequency response graph (level in dB plotted against frequency from 15 Hz to 16 kHz; image not reproduced here).

Distortion

Audio systems usually adjust the amplitude of the signal at some stage. This process is called amplification or attenuation and has the effect of multiplying the currents of the signal by a constant factor called gain. Volume knobs are an obvious example of amplitude adjustment, but electronic systems constantly manipulate amplitude for reasons best known to the design engineers. We expect such adjustments to be perfectly linear, the same whether the signal is loud or soft, but this does not always work out. Many electronic parts are actually linear only over a small range of current and will distort the signal when the current gets beyond that range. Some circuit designs do not react well to sudden changes in amplitude and smear the attacks in the signal. Problems like these can be solved by careful design and precise manufacturing techniques, but these techniques add to the cost of the circuit.

Distortion adds extra components to the signal. It may be specified as a total percentage or divided into harmonic and inharmonic portions. The components of harmonic distortion occur at simple multiples of the frequency of the original material. Since most musical sounds already contain multiples of a fundamental frequency, a bit of harmonic distortion may be acceptable. The components of inharmonic distortion follow other patterns, such as the sum and difference of the frequencies of the signal. Harmonic distortion becomes audible when the amount exceeds 1 percent, but even 0.1 percent inharmonic distortion may be objectionable. Digital systems are not immune to distortion. In fact, the digital revolution has introduced several new types of distortion. The sound of distortion is difficult to describe, but it's usually grating or rough. Some types of distortion give the impression that the sound is louder than it actually is; other types make it hard to identify the sound. Distortion is not always unpleasant. After years of exposure, many people have developed a taste for distortion, at least for certain types such as that associated with tube amplifiers.

Noise

Audio noise is anything added to the signal that didn't go into the microphone. Noise can sound like a lot of things: a low-pitched hum, a hissing sound, random pops, even music if it is somewhere it's not supposed to be. Noise is measured by comparing the amplitude of the loudest possible signal to what the system produces when there is no input signal at all. This ratio is converted to decibels, a measurement discussed in chapter 2. Signal-to-noise ratios of 60 decibels were once considered satisfactory, but in the digital age they should be in the 80s or even 90s. The most common source of noise in the modern studio is the computer that makes so much of our work possible. The multitude of high-frequency signals associated with bit crunching and running displays often leaks into the audio signal, producing strange buzzes and whistles. The physical noise of the computer is a double-barreled problem. The whirr of fans and disc drives is easily picked up by microphones and will prevent you from hearing similar problems in the music you are trying to create.
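The conversion to decibels mentioned above is simple arithmetic: a signal-to-noise ratio in dB is 20 times the base-10 logarithm of the amplitude ratio (the full treatment of decibels waits for chapter 2). A minimal Python sketch, using the figures quoted above:

```python
import math

def snr_db(signal_amplitude, noise_amplitude):
    """Signal-to-noise ratio in decibels, from two amplitude measurements."""
    return 20.0 * math.log10(signal_amplitude / noise_amplitude)

# A noise floor 1/1,000 of the loudest possible signal is the once-acceptable 60 dB:
print(round(snr_db(1.0, 0.001)))     # 60
# The 80s and 90s expected in the digital age imply noise floors of roughly
# 1/10,000 to 1/30,000 of full scale:
print(round(snr_db(1.0, 0.0001)))    # 80
print(round(snr_db(1.0, 0.00003)))   # 90
```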

Speakers and Amplifiers

Accurate playback is as important to the composer as northern light is to a painter, and for much the same reason. The speakers will color the sound of everything produced in the studio, so if they are not accurate the music will sound different in other locations. The playback system may consist of an amplifier and speakers or the amplifiers may be built into the speakers, but the system must have balanced frequency response, an extended low end, and low noise.

Headphones

A set of quality headphones is an essential adjunct to the speaker system. You can't make up for poor speakers with good headphones, but there are times when you need to keep the sound in your head. Look for headphones that fit comfortably. Some are designed to sit lightly on the earlobe (supra-aural), while others surround the ear with seals resting on the side of the head (circumaural). The latter are better for studio work because there is less sound leakage. In either case they must work with the unique shape of your ears. Headphones have the same quality criteria as speakers. Wireless headphones are not appropriate for studio use. They employ signal processing that will obscure the details you are listening for.

The Mixer

Mixers range from pocket-sized portables to room-filling behemoths, but the needs of the electroacoustic musician are relatively modest. The mixer will be used primarily for listening to audio sources, with the occasional recording project thrown in. A model with twelve to sixteen inputs will usually be sufficient. The following considerations are important.

Digital or Analog?

The best high-end mixers are digital these days, but the low-cost digital models are no better than simple analog machines. There's a lot of analog circuitry in any mixer, so the cost and performance benefits of digital systems do not appear until the board is fairly complex. The cut-rate effects found in the cheaper digital mixers can even limit the quality of the signal. For now the best choice is analog, but in the near future this is likely to change. (If you do decide to go digital, look for a desk with plenty of digital inputs. Surprisingly, most of the inputs on small digital mixers are analog. There are usually only two or three digital connections.)

Preamplifiers

Unless you make complex live recordings, the mixer does not need fancy microphone preamplifiers. The basic preamps in a low-cost mixer or field recorder will be adequate for recording source material, since you are going to modify the sounds anyway. If you need high-quality recording, buy a separate premium preamp. Such a preamp costs about the same as a basic sixteen-input mixer. Imagine the price of a mixing board with high-performance preamps in every channel.

Monitoring Controls

The most important feature on a mixer is a control-room monitor section with a master volume control and switches that include an external input. (This is often marked "tape," a vestige of that technology in the modern studio.) This will become the central control of the studio. The input switches allow you to choose to listen to studio hardware directly or by way of the mixer. A master volume control will help you keep signal levels consistent because you will always grab the same control when surprised by a loud sound.


Recording Gear

At one time, this chapter would have had a long discussion on tape recorders, covering formats, tape widths, and speeds. That is all history now because most recording happens in the computer. You can't take your computer everywhere, however, and a portable recorder is a useful device to have. Compact machines that use solid-state recording media are available in a variety of styles and price ranges. Eventually you want to have at least two portable recorders. One will be a cheap model with built-in mics that you carry most of the time to grab sounds that seem promising. These are all but disposable, and if one gets lost or damaged, you will regret the loss of the sounds it contains more than the gadget itself. For serious location recordings, you will eventually need a high-end recorder with quality preamps and long recording time.

Keyboards and Synthesizers

It's only been a couple of years since electroacoustic music and hardware synthesizers were synonymous. With the continuing development of computer-based synthesis the heyday of MIDI is over, but there is still room for some of the old beasts in any studio. A keyboard of some sort is still required to control synthesis, whether it be hardware- or software-based. If the keyboard incorporates a basic sound generator, it will be playable when the computer is turned off or doing other tasks. The type and size of keyboard depends on your own experience with keyboard performance. A musician trained on acoustic piano will probably prefer the feel of a weighted key action and the full eighty-eight keys. Others may find a light action and a five-octave range sufficient. I can't recommend micro-keyboards as anything other than experimental controllers. There's a reason the size of keys was standardized a century ago.

The MIDI tone generators that used to fill the racks of a studio are pretty much history at this point, but they will soon be back in a new form. Computers have real limitations as synthesis devices, so we are beginning to see audio accelerators that will combine the convenience and response of dedicated hardware with the flexibility of general-purpose computers. The Kyma system discussed in chapter 17 is an example of such a system. Of course, if you come across a classic synthesizer at a yard sale, consider giving it room in the studio. It may not be up to modern standards for noise level or ease of use, but it will produce unique sounds.

Rack Gear

There are many pieces of equipment that aren't absolutely required in a computer-based electroacoustic music studio but which will likely be added as the interests and needs of the composer grow. Items such as CD players and premium preamps are available either in odd-sized boxes or in standardized rack-mount cases. The plain box is fine for one or two items that can be stacked on a shelf, but the rack system eventually becomes a necessity. For that reason, a studio plan should include at least one rack. Equipment in a rack is bolted down, which makes it easy to use and improves reliability of the wiring.

Cables

In a professional studio, the equipment is usually bolted down and permanently wired together. The wiring runs through a central patch bay where temporary changes in the signal path can be made by plugging short patch cables into jacks. Only the most elaborate electroacoustic studios need patch bays. Most personal setups are simple enough that the equipment can rest on tables and shelves with connections made directly from item to item. An assortment of cables should be on hand in varied lengths and with connectors appropriate to the jacks on the gear.

THE SIGNAL PATH

The cables connected between pieces of gear determine the signal path. This is a simple concept but is often a cause of grief for someone new to audio. In most cases, a signal is routed through one piece of gear after another, although several signals may be combined in a mixer, and occasionally a signal is sent to two devices at once. The basic connection is nothing more than a cable from the output of the first device to the input of the second. This may be complicated by equipment that has more inputs and outputs than seem strictly necessary, such as separate jacks for high-level and low-level signals. If this is the case, consult the manual to determine which inputs and outputs are appropriate, and then put your own labels by the connectors you actually use.

If the studio wiring runs through a patch bay, the signal path may not be obvious. Many patch bays have default connections that are broken when patch cords are inserted. These define the "normal" signal path, and patch cords come into use only for exceptions. This should all be legibly labeled, with an explanatory wall chart if necessary. However the connections are made, it is important to be aware of the signal path in use, and paths used for various procedures will be diagrammed throughout this book. As an example, Figure 1.2 is the path you might use to record with a computer program using a microphone and external preamp. This is not an empty intellectual exercise. When things don't work, the first step in correcting the problem is to trace the signal path in order to check the connections and settings. When we discuss a new piece of gear, we will first learn the signal path within the device, because that is usually key to understanding how it works.


FIGURE 1.2 Typical signal path: microphone → preamp → computer → monitors.
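One informal way to think about the path in Figure 1.2 is as an ordered list of connections that you check one hop at a time when something goes silent. A minimal, purely illustrative Python sketch (the device and jack names are hypothetical, not from any particular rig):

```python
# Hypothetical model of the Figure 1.2 path: each entry is a device and the
# connection leaving it; tracing the list mirrors checking cables in order.
signal_path = [
    ("microphone", "mic cable"),
    ("preamp", "line output"),
    ("computer", "interface monitor output"),
    ("monitors", None),  # end of the chain
]

def trace(path):
    """Walk the chain in order, the same order used to check connections and settings."""
    for (device, link), (next_device, _) in zip(path, path[1:]):
        print(f"{device} --[{link}]--> {next_device}")

trace(signal_path)
# microphone --[mic cable]--> preamp
# preamp --[line output]--> computer
# computer --[interface monitor output]--> monitors
```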

THE COMPUTER

Electroacoustic composition requires a computer. There was a time when the work was done on other types of machine, but I don't think any composer would like to go back to those days. If possible, you should dedicate a computer entirely to music. Make it the center of your studio, with your keyboard and mixer within reach and your speakers placed for comfortable listening as you face the display.

Basic Requirements

Audio is pretty taxing for a computer system. Any recent model will handle basic recording and editing operations, but we always seem to be pushing for more tracks, more complex processing, more elaborate synthesis, and just generally more. For that reason, advice to buy the most powerful machine you can afford will always be appropriate. The operating system is not particularly important. Excellent audio software is available for Windows, Macintosh, and Linux, so you should stay with the system you are comfortable with. That said, Macs are much more common in the audio and music community than other machines. Whatever model you choose, make sure it is a quiet one.

Laptop or Desktop?

Laptops are certainly powerful enough to run audio software, and their quiet operation is greatly appreciated. Their portability makes them popular for onstage use as well as for remote recording. On the other hand, many audio programs are easier to use if a full keyboard and two big monitors are available. Desktop models usually beat laptops in the price/performance equation, so if the machine is to stay in the studio, a desktop is the best choice. Connecting a full-sized keyboard and monitor to a laptop may be an acceptable compromise. Most composers eventually wind up with both.

Dealing with Computer Noise

Many desktop computers are mechanically very noisy, and even laptops make a bit of sound when they run hot or have a CD in the disc drive. Generally, the noise a computer makes is related to the cost. Cheap cases rattle and magnify the sounds of fans and drives. This is one distinct advantage of Macintosh computers over Windows machines. In fact, many musicians run Windows on Macintosh hardware to take advantage of the quieter construction. Some audio equipment stores sell studio-optimized systems with premium cases, such as those from Lian Li. There are also soundproof computer cabinets, but they tend to be very expensive.

The computer's noise may be mitigated by careful placement. If the machine is to the side—or better yet, behind you—you will unconsciously discriminate by direction between computer noise and noise in the audio system. I find it sufficient to place the computer on the floor and raise the speakers above eye level. In extreme situations, you may have to relocate the computer to a closet or adjoining room. This is really inconvenient, and I would only consider it if I were doing a lot of live recording in my workspace.

The Audio Interface

A stock computer will not have audio input and output capabilities (I/O) that match the quality a serious composer needs. High-fidelity I/O is achieved with an audio interface, which can be a plug-in card or a USB or FireWire device. External interfaces range from budget models with two or four inputs to multi-thousand-dollar systems suitable for high-rent recording studios. The electroacoustic composer's needs fall at the modest end of this spectrum, but the cheapest models are prone to annoying noises picked up from the computer's internal operations. As with the mixer, the best strategy is to aim for good quality but relatively few inputs. Some interfaces incorporate cheap to adequate microphone preamps, but a separate preamplifier will provide better recordings. To facilitate this, the interface should have balanced connections. Other features to look for include monitor output with controls similar to those described for the mixer and input and output oversampling.

The MIDI Interface

Performance information will come to the computer in the form of MIDI messages, which you will read about in chapter 7. Computers do not come with MIDI connectors—these must be added. Some audio interfaces also include MIDI, and some keyboards can send MIDI directly to the computer via USB, but the most common way of adding MIDI to a computer is with a USB adaptor. This is a simple device with up to eight inputs and outputs. They are not expensive, but make sure you get one rated USB 2.0 and connect it directly to a port on the computer, not through the keyboard or a hub. MIDI work does not usually generate a lot of data, but it is essential that the messages get processed as quickly as possible. Hubs tend to interfere with this timing.

Control Interfaces

Most software is designed to be usable with a QWERTY keyboard and mouse, but so much audio work depends on making adjustments in time to music that the process is greatly enhanced by some old-fashioned knobs and sliders. These are available on boxes called "control surfaces" that do nothing but provide interface gadgets that can be read by the computer. Control surfaces come in an astounding range of types and sizes. There are boxes with a single knob and desks as large and complex as full-featured mixing consoles. The best have motorized controls so the program can update knob positions, and digital readouts that display labels to match the current use.

Control interfaces also lend a more traditional musical look to the studio's equipment. There are organ-style keyboards, MIDI wind instruments, drum sets, and guitars. For the nontraditional approach there are devices like the Monome, which is an array of lighted buttons on a featureless box, and the Lemur, with no physical controls at all but a lighted panel that responds to touch like the set dressings on Star Trek. Tablet computers like the iPad can be used to provide wireless control. We'll look at such things in more detail in chapter 18. Exactly what you buy to start out with depends on your training and inclinations, but you will continually add to your collection.

Device Drivers

Both audio and MIDI interfaces require extensions to the computer's operating system to function properly. These extensions are called drivers and are supplied on compact discs packaged with the gadget, but these always seem to be out of date. You should expect to have to go online to the manufacturer's website and download updates every time the operating system (OS) is tweaked. Luckily, installation is usually simple and trouble free.

SOFTWARE

The software on the computer is a major theme in the rest of this book. It is the nature of software and the software industry that the choice of programs is constantly changing, and the look and feel of established programs is constantly undergoing revision. For this reason, I will focus on processes at the heart of applications rather than specific interface features. A student who wishes to become a proficient electroacoustic composer must learn to make the distinction between interface and process. The key to that knowledge is to try out as many different programs as you can. This will reveal the central purpose of the software, which is often hidden by cosmetic factors. This does not mean you should expect to be in a constant state of bankruptcy keeping up with each new offering and update. I've found that upgrading to every other version is generally satisfactory.

Most software is developed in one of three types of enterprise. Commercial products are created by teams of professional engineers who work under the supervision of managers and a company president. The company probably has teams of designers and marketing experts who have as much to say about the final product as the engineers writing the code do. They also have teams of testers who try to ensure that the programs run trouble free. Because the commercial enterprise supports a number of people working for a living, the programs are sold at comparatively high prices and there must be regularly scheduled new releases, typically every year.

In academic/open-source development, no one is getting paid, and typically the programs are free. The code is written by volunteers who are generally quite enthusiastic for a few years, then become distracted by other projects or by earning a living. Luckily, there are usually more volunteers to take their place. The enterprise works best when there is a long-term supervisor to act as mentor and adjudicator, like Linus Torvalds of Linux fame. There are some remarkable music projects done in this mold. There is also a heartbreaking number of potentially great products that show no recent activity (you can track open-source software projects on SourceForge.net). Usually there are two versions of such programs available—a well-tested and stable version, and the latest beta release, with new concepts to try out but no guarantee that the program won't crash.

Shareware programs are typically the product of an individual or small cooperative. They aren't always free, but the price may be flexible down to $0. These programs tend to be modest utilities or plug-ins which were relatively simple to put together. These programs also tend to be the most innovative, since there's no capital at risk and no committees to approve designs. They can also be amazingly bad, with impenetrable interfaces and spectacular crashes. You can try them with little risk, but always update your virus scan before downloading. Perhaps the most frustrating feature of shareware is that programs tend to disappear with no warning. There are several websites that list and review shareware music programs.

In general the electroacoustic composer will be concerned with four kinds of applications: programs for audio editing, multitrack organization, MIDI-based composition, and sound synthesis. There are programs that combine all of these functions, but the best examples are focused on a single task, so I will treat these as separate applications.


Audio Recording and Editing Programs Audio recording and editing programs are the fundamental tools of the trade. These provide a means for getting raw material into the computer and precise control over what the end results sound like. There must be over a hundred such programs, ranging from simple shareware applications to multi-thousand dollar dedicated hardware workstations. Surprisingly, these all offer pretty much the same basic features. Just as all word processors are based on typing, the audio editing paradigm shows some representation of the audio envelope or waveform displayed across the screen and allows graphical cutting and pasting of sections of sound. The design is typically meant to give the impression of a tape recorder, including icons depicting the traditional buttons. Audio programs differ primarily in accuracy of editing and implementation of advanced features. It’s not practical to make specific recommendations here, but here are some suggestions for what to look for. Large and accurate meters. Meters are essential to getting a good recording. It’s difficult to judge the accuracy of meters, but a well-designed meter will crest quickly and fall back a bit slowly, giving a sense of both peak and average signal level. If the peak segment stays lit a bit longer than the others, it will be easy to predict the results with nearly any kind of source. There is an emerging standard called K-metering, which is offered in the best programs. Accurate positioning of the cursor. Surprisingly, many programs are sloppy about the relationship between where the cursor appears on the screen and the actual editing location in the sound file. This can become critical when performing tight edits. Nondestructive editing. An audio file is a long list of numbers. If each change you make is immediately applied to the file, the program response will be slow, because huge amounts of data have to be moved around. It will also be difficult to back up to an earlier state, because it’s hard to put the cursor exactly where it was before. When the editing is nondestructive, changes are displayed and played, but the file is not affected until you save the document. This type of program usually features an edit list, with the capability of rolling back to a previous state. Support for plug-in processors. Every program offers some processing of selected regions of the signal. This may be limited to gain changes and fades, or there may be an extensive list of options, but no program will provide all possibilities. With plug-in processing you can add extra functions specific to your needs. As a bonus, such functions become available to all of your audio software, as long as the format is supported. Look for Steinberg’s Virtual Studio Technology (VST) support on all platforms and Apple’s Audio Units (AU) on Macintosh.


Easy navigation within long files. Once a recording is longer than two or three minutes, scrolling from point to point becomes tedious. The ability to add location markers is essential, as is high-speed cueing. The illustrations in this book use an editor created by BIAS called Peak LE. This is no longer available, but its screen layout clearly presents the principles we will discuss. Steinberg’s WaveLab Elements is a comparable program. Identical functions are provided in free editors such as WavePad or Audacity.

Multitrack Production Programs Multitrack production programs are designed for the commercial recording business, but they provide an invaluable function for the composer—the ability to combine sounds in layers. They not only take the paradigm of the tape recorder beyond stereo tracks, they allow tracks to be moved relative to one another, a feature seldom found in hardware. These are complex programs, so there are fewer to choose from than is the case with stereo editors. They are also expensive (the free applications are really rudimentary), so choice should be based on informed reviews, with some consideration of popularity, or at least compatibility with the popular applications. Multitrack production programs are often sold as part of a hardware package called a digital audio workstation (DAW). Some DAWs are completely closed systems, including computer, hard drives, and conversion hardware, but most run on general-purpose computers with your choice of interfaces. As of this writing, the market is dominated by Pro Tools, which is an excellent program (and quite affordable in some versions) but there are other options. When purchasing any multitrack program, the following are important issues. Hardware compatibility. Multitrack operations make heavy demands on the host computer and associated equipment such as disc drives. Make sure the software will run on your system. The manufacturer’s website often has a list of compatible equipment. The ability to record and play back simultaneously. Surprisingly, this cannot be taken for granted. An associated issue is latency, any delay heard in the material being recorded. A certain amount of latency is unavoidable, but the best programs keep it to a minimum. Track count. This will also depend on the power of your computer, but some programs deliberately limit the number of tracks in less-expensive versions. Multiple takes per track. Sometimes called a playlist, this feature allows recordings to be “stacked” in a single track, allowing quick comparison of different versions of a performance. The items listed above for editing programs are also important for multitrack production programs. And although all multitrackers offer some editing, you really
should have both. A dedicated stereo editor will always offer better accuracy and a deeper feature list. Standard recording production practice is to perform original recording and mixing in the multitrack package, then move to an editor for final mastering. For examples and exercises in this book, we will use Pro Tools. Even if you prefer another multitrack environment, you will encounter many Pro Tools projects in your career and you should be familiar with its interface and procedures.

MIDI Sequencing Programs MIDI sequencing programs follow a different paradigm: they record the MIDI data from keyboards and controllers, allowing the correction of mistakes and enhancements to performance before they are committed to sound. Sequencers allow the composer a powerful alternative to writing scores. Direct data entry can convincingly simulate ensemble performances and realize elaborate synthetic compositions. In recent years many MIDI sequencing programs have acquired audio tracking capability while many multitrack programs now have MIDI features. However, the original focus of each application is apparent in the user interface, and it is the interface that inspires loyalty to one program over another. A MIDI sequencer should have the following features. Flexible assignment of internal and external instruments. MIDI-based production implies interchangeable instruments. Even though hardware synthesizers are disappearing from the stores, some models have become classics with cherished sounds only roughly emulated by software counterparts. It should be simple to apply a performance to any instrument available. Audible editing of MIDI data. You should be able to hear the effect of changes as you make them. A score view of MIDI data. This shows MIDI events as notes and rests. Even though traditional notation cannot transcribe a performance with the rhythmic accuracy of MIDI, many composers prefer to enter notes that way. Data entry in the score should be simple and efficient, with keyboard selection of duration and pitch. An event list view of MIDI data. This shows the contents of all messages as they will be transmitted. A list of numbers may seem hopelessly retro and geeky in appearance, but the event list is the only view where all MIDI data can be manipulated with precision. Other forms of editor use separate views for each type of message, and it can be hard to determine the actual order of events. Flexible tempo and meter control. The concept of fluid tempo and unusual meters has only lately begun to appear in sequencers, and many programs are still limited to the quarter-note beat and a single meter throughout. Metric
rigidity excludes many styles of music and makes film scoring difficult. I am looking forward to sequencers that provide a metronome with beat subdivision. Video scoring support. This feature is becoming increasingly important as composers embrace multimedia opportunities. Many programs that offer video scoring do so by playing the video in the computer itself. This adds to the computation load, so the more professional approach is to synchronize the program to external video using SMPTE timecode, a set of operating standards devised by the Society of Motion Picture and Television Engineers. The examples in this book will be handled with the Logic Pro sequencer for Mac.
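To make the event list view described above more concrete, here is a rough sketch in Python (not one of the music languages covered later in this book) of how a handful of MIDI messages might be stored and printed in transmission order. The field names and layout are invented for illustration; every sequencer has its own internal format.

```python
# A toy event list: each MIDI message represented as a dictionary.
# Times are in beats; note numbers follow the MIDI standard (60 = middle C).
events = [
    {"time": 0.0, "type": "note_on",  "channel": 1, "note": 60, "velocity": 96},
    {"time": 0.5, "type": "control_change", "channel": 1, "controller": 7, "value": 100},
    {"time": 1.0, "type": "note_off", "channel": 1, "note": 60, "velocity": 0},
    {"time": 1.0, "type": "note_on",  "channel": 1, "note": 64, "velocity": 80},
    {"time": 2.0, "type": "note_off", "channel": 1, "note": 64, "velocity": 0},
]

# An event list view simply presents every message in the order it will be sent.
for event in sorted(events, key=lambda e: e["time"]):
    print(event)
```

The value of this view is exactly what the list of numbers suggests: nothing is hidden, and every message can be examined and edited with precision.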

Sound Synthesis Programs Sound synthesis programs allow the composer to invent new kinds of sound. These can take the form of specialized programming languages or on-screen emulations of hardware synthesizers with virtual knobs and patch cords. The current landscape is similar to the world of hardware synthesizers in the late 1970s, when the choice was between large patchable modular systems and simple prepatched instruments. Now the choice is between infinitely configurable but intimidating modular software systems and simple instrument plug-ins with one patch and a few virtual knobs to tweak the sound. There’s no need for a list of criteria here—eventually you will own dozens of synthesis applications. Most are available as plug-ins for MIDI sequencers.

Utilities Utility programs may be included with the basic applications or purchased separately for better performance or specific features. Many of the following will prove useful to the electroacoustic composer. CD authoring programs. These programs allow the creation of audio CDs with special features such as adjustable space between tracks or embedded text. Audio analysis programs. These programs display various aspects of the signal, including phase separation, bit utilization, and other quite specialized but occasionally invaluable features. The top of the line is the SpectraFoo suite of analysis programs from Metric Halo, with Spectre from Audiofile Engineering a close (and much cheaper) second. Individual functions can be found as inexpensive plug-ins. The K-Meter plug-in from MeterPlugs was invaluable in preparing the audio examples for this book. Music transcription. Transcription applications differ from audio analysis programs in that they provide features useful in looking at musical performances. Transcribe! from Seventh String Software includes slow speed playback and several pitch detection tools.


Digital signal processing (DSP) programs. These are designed to modify audio files in various ways. One example is SoundHack, a powerful free package by Tom Erbe. Composer’s Desktop Project, headed by Trevor Wishart, provides every possible transformation for a reasonable price. AudioMulch by Ross Bencina, and MainStage, a utility that comes with Logic Studio for Mac, both allow real-time audio processing for performance. Noise removal. Noise removal packages are primarily designed for cleaning up vinyl recordings, but they can prove invaluable in preparing sound clips for samplers or concrete compositions. Many of the examples in this book were processed with SoundSoap from BIAS. Plug-ins. Separately or in packages, plug-ins contain audio processes that can be used within audio recording and editing software. These include standard items like equalization (EQ) and compression, as well as esoteric processes like pitch correctors and harmonic synthesizers. Applications or hardware that will accept plug-ins are known as hosts. Before purchasing a plug-in, check that it is compatible with the host programs and hardware you own.

PUTTING IT ALL TOGETHER Few studios are designed at one sitting. In fact, most grow haphazardly and are occasionally reorganized in fits, like cleaning out the garage. Actually, this is not a bad approach. I have seen elegant studios where each item has its custom fitted place, and I have heard some lovely work produced in them. But the most efficient composers I know work in near chaos. (To a casual observer I certainly do.) Time spent tweaking the studio is not spent composing, but I suspect there is a deeper reason for the tendency toward chaos. The modern electroacoustic studio must be flexible, easily accommodating extra pieces of gear and visiting musicians on short notice. Fancy built-ins can quickly become obstacles, and wiring that is cut for a custom match in one installation will invariably be too short for a desired relocation. The best approach seems to be to put everything on tables and shelves and expect to move it within a year. Keep assorted cables on hand and use Velcro tabs to manage excess lengths. As you work in the space everything will naturally find its best place.

Layout Figure 1.3 shows a typical starting layout for a studio for someone who works alone most of the time. The basic configuration is a U-shape with an arm’s reach between the side tables. The center of the U is the sweet spot for the monitor speakers. The sweet spot is the point where the speakers have the best frequency response and

FIGURE 1.3 Typical equipment layout: a U-shaped arrangement with speakers, an equipment rack, the music keyboard, the mixer (with shelves above), and the computer below the desk.

stereo image. It is usually at the apex of an equilateral triangle, with a width that varies somewhat depending on the type of speaker. There are three major items in the basic studio—the computer keyboard and monitors (with the computer itself moved as far away as possible to keep noise down), the music keyboard, and the mixer. Minor pieces of gear include mic preamps, synthesis modules, and control surfaces. In my studio, I put the computer monitors at the center of the U, the mixer to my right, and keyboard to the left. Some of my minor gear is in a tabletop rack—this includes the computer audio interface. Other minor gear is on shelves above the mixer. This setup reflects my work habits. Since I compose mostly at the computer, the computer gets the central spot. When I compose, mixing is only an occasional level change; I sometimes need to find the level control fast, so the mixer must be close, but it can be somewhat behind me. In a traditional recording studio, the mixer gets the place of honor, while a composer who works mostly at the music keyboard will want that front and center, perhaps with the computer keyboard above. (If you go for this setup, don’t try to write a book on the computer—the ergonomics of typing on a raised keyboard will kill your wrists.) The most important features in my layout are the empty spaces. I often bring in pieces of gear for temporary use, and I leave space for them to sit left of the monitors or by the mixer. You can see hundreds of variations on layouts by searching Google images for “my music studio.” Look at some of these and imagine yourself at the controls.


Wiring It is tempting to buy a reel of cable and a box of loose connectors to build your own cables, and I urge you to learn how to do this eventually, but start with store-bought cables. Studio setup and maintenance cut into composing time, and building reliable cables takes longer than you might think. Set up your equipment, take some measurements, make a list, and buy what you need. Once you have settled on a setup, label the cables. (I use a handwritten sticky label with clear tape wrapped around it.)

Your shopping list of cables must include the type of connector on each end. There is a surprising variety of types, but the most common will be the tulip-shaped RCA (or phono) plug, microphone-style (XLR) connectors, which come in male (with pins) and female varieties, and quarter-inch phone plugs. The important feature of these connections is whether they are balanced or unbalanced. (Balancing is an electronic scheme that uses a third wire in the cable to provide improved noise rejection.) RCA plugs are always unbalanced; XLR connectors are always balanced. Phone plugs can be either way—the balanced type has an extra stripe behind the tip like the plug on headphones. This is called a TRS plug. There will be a label by the jack to indicate balanced phone connections. You should use balanced cables when the connection is from balanced output to balanced input; otherwise unbalanced is the only option. Unbalanced cables should be kept as short as practical. See The Sound Reinforcement Handbook by Gary Davis for excellent information about cables and wiring.

It is hard to gauge exactly how large a studio must be to require patch bays, but they are often useful when the back of the equipment is hard to reach. Patch bays are assembled from panels of open connectors that are wired to the inputs and outputs of the studio equipment. You can buy patch panels that have connectors on the back as well as the front to allow the use of premade cables. With this type of panel, normalled connections are set up with switches. Note that patch bays and patch cords also come in balanced or unbalanced forms. You can freely mix balanced and unbalanced connections on balanced patch bays; simply choose the proper cord for patching.

You should pay close attention to the AC wiring in the studio. The perennial problem with audio equipment is capacity, not in the sense of amperes available, but in the sheer number of things to plug in. Computers and power amps take moderate amounts of current, but most audio equipment needs less power than a nightlight. The magic number to look for on the back of the equipment is VA—that stands for volts times amperes used. A typical home electrical circuit provides 15 amps at 110 volts, so a total of 1650 VA are available. (Of course, if a refrigerator shares the circuit with your studio, there won't be enough amps to go around. Don't allow that—your audio will thump when the refrigerator kicks on.) However, every piece of equipment needs to be plugged in, and many use plug transformers that block off two or more receptacles. Most people cope by daisy chaining several
inexpensive power strips, which is mildly dangerous and really messy with all of the extra cords. Electrical supply stores and large hardware stores sell power strips with up to 24 taps spaced at four inch intervals. These are a bit more expensive but solidly built and will last a very long time. Two or three of these plugged into a surge suppressor will provide convenient and reliable power distribution. Surge suppressors are necessary for all equipment, but the computer should be on a separate UPS (uninterruptible power supply). That’s because computers often feed high-frequency interference back to the power line, and the UPS does a good job of blocking this. There are heavy-duty isolation units especially designed for audio systems, and you should consider this option if there is a problem with some other source of interference. This situation is not subtle, everything will hum.

Setting Levels The most important task in setting up a studio is calibrating the listening level. This requires a sound-level meter, which is available for about $50, and a reference signal, which can be generated by K-Meter or any of the synthesis programs described in chapters 10 through 17 (a sample is provided on the accompanying disc as DVD example 1.1). The procedure is simple. Position the meter (set to the “C weighted” curve) at the sweet spot and play the reference signal (which is pink noise). Adjust the volume controls so that each speaker produces a sound pressure level (SPL) of 83 dB at the sweet spot. Mark the positions of all knobs involved so you can return to this setting any time you want.

A NOTE ABOUT INSTITUTIONAL STUDIOS Studios in schools and other shared facilities must meet the same basic needs and follow the same basic guidelines as for the home studio, but there are additional concerns to address. Security. Electroacoustic studios are stuffed with gear that is attractive and easy to fence. The level of security needed will vary with locale, but at the very minimum access must be controlled by combination locks or a key checkout system. The studio door must be set up in such a way that it is not possible to leave it unlocked. Video surveillance systems are affordable and simple to install. These should not infringe on the privacy of the composers, but must keep track of comings and goings in the studio. Accessibility. The Americans with Disabilities Act applies to all educational or publicly shared studios. Wheelchair access guidelines will limit the
placement of equipment. This is best addressed when the studio is designed rather than waiting until a student requiring accommodation shows up. Laying out a studio to meet accessibility requirements actually makes it easier for everybody to use. Wheelchair access guidelines are available online or through your school’s disability accommodation office. The needs of blind composers can usually be handled best in consultation with the student. There are several systems that add speech to computers, and it is preferable to install one with which the student is already familiar. Not all composition applications work well with screen readers. Consult with local and online blind musicians for recommendations about which to install. You may find it necessary to provide a dedicated computer for blind user access. Sighted users can too easily modify settings that are required for screen reader operation. Reliability. It is vital that students arriving for a session find a working environment. Beginning students have enough to think about without having to reconnect studio wiring or chase down vital gear. Thus the studio equipment should be permanently mounted and the wiring routed through a central patch bay. The wiring should be installed to professional standards. There should be a report sheet where students can note equipment problems, and such problems should be investigated promptly. The power distribution must meet fire codes and institutional requirements. That probably means no extension cords or daisy chains of plug strips. A heavy-duty rack-mounted power center with long bar-type plug strips should be acceptable. The power switches on active monitor systems are unlikely to be within cable reach of the central module—these can be switched with remote relay units. Flexibility. The central patch bay is the best way to allow students to connect brought-in equipment. (Be sure there is a handy AC receptacle.) I also find that the patch bay encourages students to come to grips with the concept of signal flow. Do not use normalled connections; require the students to patch everything they use. Provide equipment and software that supports as wide a range of musical activities as possible. Let the students explore a variety of approaches and decide which is best for their interests. Documentation. The manuals for all equipment and software must be available in the studio. There should also be a master document of the studio itself, describing the basic layout, where the power switches are, and stating any rules and policies.


RESOURCES FOR FURTHER STUDY There are dozens of books about studio building and setting up a home studio. Books about acoustics are mostly derived from the works of F. Alton Everest, and it's best to get the information from the source. Yamaha's Sound Reinforcement Handbook has more practical information per page than any other text.

Anderton, Craig. 1996. Home Recording for Musicians. New York: Amsco Publications.
Davis, Gary. 1989. The Sound Reinforcement Handbook. Milwaukee: Hal Leonard Corporation.
Everest, F. Alton. 1997. Sound Studio Construction on a Budget. New York: McGraw-Hill.
Everest, F. Alton. 2009. The Master Handbook of Acoustics. New York: McGraw-Hill.
Touzeau, Jeff. 2009. Home Studio Essentials. Boston: Course Technology [Cengage Learning].


Part 2 Fundamental Concepts and Techniques All electroacoustic music production relies on a single skill. This is the ability to listen to a sound and to hear and understand all of its aspects. There are other required skills, skills with software and skills with music, but they are useless without the ability to listen to sounds and judge them. Chapters 2 through 6 are designed to develop your listening skills. The subject matter may seem to be something else, such as operating various types of software or machinery, but the underlying point is learning to hear the results of the operations. After all, the software will change. The chances that you will use the programs or program versions illustrated in this book are slim, but the functions the programs provide will not change. Your success as a composer or producer depends on the ability to hear the need for these functions and to judge the results accurately.


TWO Sound

LISTENING TO SOUND Sound is the raw material of composition. If you want to be an electroacoustic composer, you have to know sound as well as a carpenter knows wood. Of course, most of us have been hearing sounds since infancy and think we listen well, but I doubt you hear a tenth of what is going on. The point of this chapter is to make you aware of the complexity of sounds, to let you begin to hear the details, and to develop a common language for talking about sound. The DVD accompanying this book contains audio and video examples for every chapter. These are in AIFF or QuickTime (.mov) format and should play on Windows and Macintosh systems. (If Windows Media Player has trouble with the QuickTime files, you should download QuickTime for Windows from the Apple website.) I have used the full-quality audio format to make it possible to hear what I am talking about and to allow you to open the files in an audio editor and take a look at them. There is a folder for each chapter with the pertinent files.

Purity The first aspect of sound to explore is timbre, the quality that sets one sound apart from another. We’ll start in the kitchen, a gold mine of sounds. Take a wineglass. (I’m sure you have done this before.) Put a little water in it, wet your finger, and rub in a circular motion around the rim. With a bit of practice you will find the pressure and speed that produce a steady tone. Most would describe this tone as “pure.” To hear an impure tone, rub your wet finger on the surface of a metal appliance. I’ll bet you hear a kind of squeak. What is the difference between the tone and the squeak? The squeak is a complex sound—it has several components, each of a different frequency. (Because components of sound are associated with frequency in a manner similar to colors of light, these components are often called spectral components.) The glass tone is predominantly one component, although you will hear others when your rubbing is not quite right. Experiment with other items in your kitchen and decide whether the tone is pure or complex. On the DVD 27
example 2.1 was made by playing with a wineglass and example 2.2 has various squeaks. Frequency is the number of events in a second. When the wineglass is sounding, the bowl of the glass is rapidly flexing in and out, at say 350 complete cycles per second. The technical term for cycles per second is hertz (abbreviated Hz). We say the wineglass is sounding at 350 hertz. Humans can hear frequencies that range from 20 Hz to 20,000 Hz. If you measure the time between events instead of counting them, you are determining the period. Period is the inverse of frequency (think of events per second versus seconds per event). As you scrape and rub your way around the kitchen, you will hear sounds of various complexities. For some the spectral components will be distinct enough to be heard independently. Example 2.3 on the DVD is a pan lid rubbed with a wooden dowel. It has two primary tones, and either can be brought out with the right rubbing technique. A cookie sheet “bowed” with a knife sharpener is even richer, although it is difficult to sustain the sound for long. Bowing different places around the rim will emphasize certain components but they all are usually present in the sound (DVD example 2.4). With some practice you can learn to hear individual components in many sounds. Some objects may produce tones purer than that of the wineglass. The purest tone possible is called a sine tone. It could be produced by a tuning fork, or possibly an old AM radio tuned between stations. One of the central axioms of acoustics states that all sounds are combinations of sine tones. Why is this? Everything that vibrates moves in a manner similar to a pendulum—quickly past the center and slowing to turn around at the extremes of motion. If you chart the motion of the pendulum, you will get the sine of an angle that rotates through 360 degrees in the time it takes the pendulum to complete one swing. DVD example 2.5 demonstrates sine tones at several frequencies. Example 2.6 is the sound of a plastic cup rolled across a table. This particular cup has ridges which add a distinct rhythmic element to the sound. The quality of the sound is related as much to the nature of the surface it is rolled on as to any intrinsic tone of the plastic itself. (In fact, this cup is almost inert.) Example 2.7 is the cup rolled across a washing machine. Notice the definite metallic resonance that results. You could say that in some sense a sound-making object plays the surface it is in contact with.
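If you want to make a pure tone of your own to compare with the wineglass, a few lines of code will do it. This is a minimal sketch in Python—not one of the synthesis environments covered later in this book—that writes one second of a 350 Hz sine wave, roughly the pitch of the glass, to a WAV file you can open in any audio editor.

```python
import math, struct, wave

RATE = 44100          # samples per second
FREQ = 350.0          # frequency in hertz (period = 1/350 of a second)
AMP  = 0.5            # amplitude, relative to full scale

with wave.open("sine350.wav", "w") as f:
    f.setnchannels(1)
    f.setsampwidth(2)          # 16-bit samples
    f.setframerate(RATE)
    for n in range(RATE):      # one second of audio
        sample = AMP * math.sin(2 * math.pi * FREQ * n / RATE)
        f.writeframes(struct.pack("<h", int(sample * 32767)))
```

Change FREQ and listen again; the relationship between the number you type and the pitch you hear is the subject of the next few sections.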

Noise Next take a box of salt; the round cardboard kind works best. Shake it and you will hear something pretty much the opposite of a pure tone, a sort of hiss that has the rhythm of your shaking. If the box is not full, you can get a continuous sound by turning it in your hands. This hiss is a type of noise often called white noise. There are many varieties of noise. You can hear others by shaking an oatmeal carton or turning on the faucet. You can’t assign a specific pitch to these sounds, but there is
a general impression of low or high. The word noise has several meanings in the audio business, but to a composer it is a kind of sound. DVD example 2.8 demonstrates noise made with a salt box, an oatmeal box, and a faucet. Where a pure tone has a definite frequency, noise has no frequency, or more specifically, if you attempt to measure the frequency any result is possible. The individual grains of salt hit the box at random times. We can assign a statistical curve to the possible periods, which gives us terms to describe noise. If the curve is flat, the noise is white. If the curve shows an even distribution of periods in each octave, the noise is pink. We hear white noise as a high hiss, because the ear groups pitches in octaves, and with a flat distribution of periods, half of the events will be in the top octave. Most sounds have a bit of noise in them, often associated with the action that excites the resonant tone. Likewise, most noises have some distinct tones hidden in the texture. It’s important to become sensitive to both.
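The statistical description of noise can be tested directly. Below is a small sketch in Python with NumPy—offered only as an illustration, not as one of the tools used in this book—that generates a second of white noise and measures how much of its energy lies in the top audible octave. Because the spectrum is flat, the band from 10 kHz to 20 kHz holds roughly half of the energy between 20 Hz and 20 kHz.

```python
import numpy as np

rate = 44100
noise = np.random.uniform(-1.0, 1.0, rate)       # one second of white noise

spectrum = np.abs(np.fft.rfft(noise)) ** 2        # energy at each frequency bin
freqs = np.fft.rfftfreq(len(noise), 1.0 / rate)

audible = (freqs >= 20) & (freqs <= 20000)
top_octave = (freqs >= 10000) & (freqs <= 20000)

fraction = spectrum[top_octave].sum() / spectrum[audible].sum()
print(f"Energy in the top octave: {fraction:.0%}")   # roughly 50%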

Pitch If you sing along with the wineglass, the quality you are matching is the pitch. Pitch is related to the frequency of the most prominent components of the tone. We rank pitches from low to high, following frequency in an exponential manner. The glass in example 2.1 has a frequency of 350 Hz, which corresponds to the pitch of F. We also give the name F to frequencies of 175 and 700 Hz because we hear something similar in the sound—they differ only in their octave. The interval of an octave represents a doubling (or a halving) of the frequency. Musical intervals within the octave are also based on ratios of frequencies and can vary according to personal preference. Various schemes for tuning scales (called temperaments) are in use, and each has its proponents. The most common tuning is equal temperament, in which the interval of a fifth has a ratio of 1.498:1, although many composers and performers prefer just intonation, in which a fifth has a ratio of precisely 1.5:1. Most of the music we hear is a mix of equal and just temperament. Fixed-pitch instruments like pianos and xylophones are equal tempered, but performers on flexibly tuned instruments like violins and voices will adjust pitch to produce intervals or chords with a purer sonority.

If there is more than one component in the tone of the wineglass, chances are they have frequencies that are simple multiples of the lowest. The glass on example 2.1 has components close to 350, 700, 1050, and 1400 Hz. A group of numbers that are multiples of the lowest is called a harmonic series, as outlined in Figure 2.1. The first (lowest) value in a harmonic series is the fundamental. The other components are the harmonics and are numbered starting with 2. When the components of a given sound fall close to a harmonic series, we hear a definite pitch in the tone. The harmonic series has some relationship with musical harmony, but not directly. The intervals of the harmonic series are an octave from fundamental to 2nd harmonic, a fifth from 2nd to 3rd, the 4th harmonic is two octaves above the fundamental, and the 5th harmonic is about a third above that. The 6th harmonic lands on a fifth again and the 7th harmonic is close to a minor seventh in the same octave. Harmonics higher than the 7th are not very close to musical intervals except where they are octave doublings of lower harmonics. Most objects that produce pitched sounds only approximate a harmonic series. The more the components of a sound differ from a harmonic series, the less sense of pitch there is. When the components of a tone are inharmonic (not harmonically related), it becomes clangorous. Think of a gong or cymbal. There seem to be pitches there, but they are difficult to nail down.

Harmonic    Frequency    Nearest Pitch
    12        784.8        784.0
    11        719.4        (698-774)
    10        654.0        659.3
     9        588.6        587.3
     8        523.2        523.2
     7        457.8        466.1
     6        392.4        392.4
     5        327.0        329.6
     4        261.6        261.6
     3        196.2        196.2
     2        130.8        130.8
     1         65.4         65.4

FIGURE 2.1 Intervals of the harmonic series.

Pitch often changes during a sound. Example 2.9 was made by tipping the glass as I rubbed the rim. It's important to be aware of this detail when you decide what pitch to assign. Most noises have at least a vague sense of pitch, often just low or high. The words rumble and hiss describe these well. Noise in the middle of the acoustic spectrum is band-limited noise and is usually defined by the center of the band of frequencies present. Example 2.10 has several band-limited noises: a running faucet, the hiss of a gas burner, and steam from a kettle. As you continue to explore the sounds in your kitchen, notice the varied degrees of pitch definition. Compare the pitches you hear with notes played on a keyboard and assign a note name if possible. Perfect pitch is not necessary to produce electroacoustic music, but you must be able to judge how well tones match.
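If you want to check the numbers in Figure 2.1 for yourself, the harmonic series and its nearest equal-tempered pitches can be computed in a few lines. This is a rough sketch in Python; the 65.4 Hz fundamental comes from the figure, while the note-naming helper is a simplified assumption for illustration rather than anything taken from this book.

```python
import math

FUND = 65.4   # fundamental frequency in Hz, as in Figure 2.1
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_pitch(freq, a4=440.0):
    """Return the nearest equal-tempered note name and its frequency."""
    semitones = round(12 * math.log2(freq / a4))   # distance from A4 in semitones
    note_freq = a4 * 2 ** (semitones / 12)
    name = NAMES[(semitones + 9) % 12]             # A is 9 steps above C
    return name, note_freq

for h in range(1, 13):
    f = FUND * h                                   # harmonics are simple multiples
    name, nf = nearest_pitch(f)
    print(f"harmonic {h:2d}: {f:7.1f} Hz   nearest {name} = {nf:.1f} Hz")
```

Running this makes the drift audible in numbers: the low harmonics land almost exactly on familiar pitches, while the 7th and 11th fall noticeably between the keys of the piano.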


Resonance If you find an object that produces more than one pitch, the sound it produces probably has a characteristic that remains the same regardless of the pitch. I can change the basic pitch of a wineglass by changing the amount of water in it, but there is a high-pitched ringing that always stays the same. DVD example 2.11 demonstrates this. The underlying principle here is resonance, the tendency of an object to vibrate when it is disturbed. You hear resonance in the wineglass when you flick it with your finger. You can also get the wineglass ringing by exposing it to a strong tone at the resonant frequency. Some singers can do this with their voice, even to the extent of breaking the glass. (This is true, but glasses that fragile won’t last long on your dinner table.) Most musical instruments are better at resonating than most kitchen implements, since they are deliberately designed to produce a range of pitches. A violin has two main parts: strings to produce the tone and a body to amplify and color the tone. The violin body resonates in a complex way, which will reinforce some components of the string tone. But the body resonances are fixed, which means the same range of frequency will be reinforced regardless of the pitch played. A body resonance centered on 1200 Hz reinforces the 4th harmonic of the D string and the 3rd harmonic of the A string. This steady region of reinforcement is called a formant. The voice has formants, too, but they are used the other way around. A person speaking at a fairly constant pitch reshapes the back of the mouth, which alters the formants produced in the head and changes the quality of the sound. If you gradually shift a vowel through a-e-o-u, keeping to the same pitch, you will recognize the effect of changing formants. This phenomenon is so closely associated with speech that we call any sound with moving formants a vocal tone.

Loudness We all understand loudness. Hitting a saucepan is clearly louder than hitting a measuring spoon. Yet the physical aspects that determine loudness are more complex than you might think. Loudness is strongly related to the intensity of the sound, a quantity that can be measured as a variation in air pressure, but the perception of loudness is also determined by other factors.

The duration of a sound affects loudness. A short sound does not seem as loud as a sustained one, other things being equal. Thus a drum hit, which is as intense as a jet engine, seems only slightly louder than a guitar playing through a modest amplifier. This is illustrated in DVD example 2.12 with noise bursts of various lengths.

The frequency of a sound affects loudness. While it is true we hear across an enormous range of frequency, we don't hear soft sounds at the top or bottom of the range as well. Thus we need a bigger amplifier for the bass in the band, and if we turn down the volume on the stereo, the bass and high ends disappear. This is illustrated in example 2.13 by a descending series of tones. The ear is most sensitive to frequencies between about 200 Hz and 4,000 Hz, which also happens to be the range from fundamental to third harmonic of human voices. Figure 2.2 is a graph of this effect. It was made by comparing sounds at various frequencies with reference tones at 1,000 Hz and marking the intensity that gives the same impression of loudness. If you follow a line to the left and right from the 1,000 Hz point, you can see how the sensitivity varies. A line going up means the sound is harder to hear. The result is that tones below 500 Hz and above 4,000 Hz fade away more quickly than tones in the middle of the range. These curves were originally determined by Harvey Fletcher and W. A. Munson at Bell Labs and have been refined over the years by other researchers.

FIGURE 2.2 Fletcher-Munson curves: lines of equally perceived loudness plotted from 15 Hz to 16 kHz; material below the lowest curve is inaudible.

The quality of a sound affects loudness. The signal from a distorted guitar is no stronger than the clean version, but the extra harmonics added by a fuzzbox stimulate the ear more. The tones in DVD example 2.14 are exactly the same level.

Decibels The range of intensity of sound we can hear is enormous. The ratio of the power of the faintest sound we can hear to the level that causes pain is about 1,000,000,000,000 to 1. This range is so wide that when we want to measure sound intensity, we use a logarithmic scale, effectively counting the zeros in the ratio. We encounter this measurement often in audio work, so we need to get a grip on the math. Sound pressure level is specified by comparing the power of a measured sound with a standard, the power of the faintest audible sound. The unit of this measurement is the bel (named for Alexander Graham Bell) and is defined as the logarithm of the measured level over the standard. We usually want a slightly finer measurement, so we work in tenths of a bel, or decibels (abbreviated dB).


In practical situations it is easier to measure pressure (or voltage) than power, and the formula for this looks a bit odd:

dB = 20 log (Pm / Ps)

Sound pressure level in decibels is twenty times the logarithm of the measured pressure over the standard. If you are measuring the voltage of a signal, you can replace pressure with voltage, but the standard is usually greater than the signal voltage, which makes the decibel reading negative. (The logarithms of numbers between 0 and 1 are negative.) The upshot of this odd math is this:

A 3 dB difference is just barely noticeable.
A 6 dB difference is twice the voltage or pressure.
A 10 dB difference seems twice as loud.
A -10 dB difference is half as loud.

DVD example 2.15 plays a series of tones that increases by 6 dB per step. We seldom need to calculate decibel levels. Instead, we use meters that read decibels directly. The term is often used to specify changes in signal level—a producer might tell an engineer to push a track 3 dB or a preamplifier may provide 60 dB of gain. Since the decibel is a relative term, it is essential to know the standard level used. (This is as important as knowing whether a distance is measured in yards or inches.) If the context does not make it clear, suffixes are added to the abbreviation to indicate the type of measurement:

dBspl is an acoustic measurement relative to the threshold of hearing.
dBA is also acoustic and includes equalization to match the frequency sensitivity of the ear.
dBm is an electrical measurement relative to 1 milliwatt.
dBu is relative to standard telephone line level.
dBfs means "dB full scale" and is used in digital systems; the reference is the maximum signal that can be handled by the audio convertors.

There is often confusion between the terms exponential and logarithmic. Both curves describe the relationship of an effect to its cause. Cause is plotted across the graph (X axis) and the effect is the height (Y axis). A linear relationship is shown by a straight line at any angle. An exponential curve starts out gradually and increases in slope, so it would lie below a straight line from end to end. A logarithmic (or just log) curve begins quickly and tapers off, always above the straight line. Figure 2.3 shows these curves. Sometimes the two terms are used interchangeably, and that may be appropriate, as one is the inverse of the other. If we start with an amplitude and find the dB level using the above formula, it is clear the dB reading is logarithmically related to power. However, to go the other way, we raise 10 to the power of dB/20 to get the ratio, which is an exponential calculation. To confuse the issue further, graphs of amplitude response are often shown with scales of decibels up the side and octave-spaced frequency across the bottom. These are called log-log graphs.

FIGURE 2.3 Exponential and log curves (effect plotted against cause, with a straight linear line for comparison).

The basic issue is this: perception is logarithmic. Mechanisms that seem to produce steady change in pitch or loudness are exponential.
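The decibel arithmetic is easy to try out in code. Here is a small sketch in Python showing both directions of the calculation described above—a ratio converted to decibels, and decibels converted back to a ratio.

```python
import math

def ratio_to_db(measured, reference):
    """Pressure or voltage ratio expressed in decibels."""
    return 20 * math.log10(measured / reference)

def db_to_ratio(db):
    """Invert the formula: raise 10 to the power of dB/20."""
    return 10 ** (db / 20)

print(ratio_to_db(2.0, 1.0))   # doubling the voltage is about +6 dB
print(ratio_to_db(0.5, 1.0))   # halving it is about -6 dB
print(db_to_ratio(-10))        # -10 dB is a ratio of about 0.316
```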

Envelopes We use the word envelope to describe how the loudness of sound changes over time. When you rub the wineglass, the sound will be faint at first, grow stronger as you rub, and then fade away after you lift your finger. A tap on the wineglass will produce a dramatic change in the way the sound starts. It begins suddenly, almost explosively. The beginning of a sound is called the attack. When we start synthesizing sounds we will discover that minor changes in the attack strongly affect the quality of the sound. You can verify this by tapping with various kinds of strikers. Try the tip of a pencil and the eraser, for instance.

The fading of the sound after the tap is called the decay. As you listen closely, you will discover that the quality of the sound changes during the decay—generally it becomes purer as it fades away. Any sound with a quick attack and a slower decay is called percussive. You will note that some sounds decay so quickly there isn't time to get much sense of quality and only a rough idea of pitch, just a clink, pop, or thump.

When you rub the glass, the period during which the sound is more or less steady is called the sustain. The key words here are more and less. Only raw electronic sounds are unchanging during the sustain. Usually the volume and quality of a sustained tone vary, often in a cyclical way. For most sounds, the overall envelope tells the general story, but a close listening reveals more complexity. Each component of a sound has its own envelope. In many cases, differing decays give the impression of the sound becoming pure at the end. The attack of each component also flavors the sound. The characteristic "ta" of a trumpet comes from the high-frequency components lagging behind the low, but if you reverse this you get the squeak of a thumb across a balloon. Many sounds have some components that are very brief—they don't flavor the sound much, but they add character to the attack. Think of hitting a xylophone with a hard or soft mallet. DVD example 2.16 demonstrates various sound envelopes.
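Envelopes are also easy to experiment with numerically. The following sketch in Python with NumPy (an illustration only, with arbitrary attack and decay times) imposes a fast attack and a slower exponential decay on a sine tone, producing a percussive shape of the kind described above. The resulting array could be written to a file with code like the sine tone sketch shown earlier.

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate                # one second of time values

tone = np.sin(2 * np.pi * 440 * t)        # steady 440 Hz sine

attack = np.minimum(t / 0.01, 1.0)        # 10 ms linear attack ramp
decay = np.exp(-t / 0.3)                  # exponential decay, 300 ms time constant
envelope = attack * decay

percussive = tone * envelope              # apply the envelope to the tone
print("peak level:", percussive.max())
```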

Rhythm When we listen to sounds for a composition, we need to be aware of any rhythmic implications in the sound. Most sounds have some internal tempo. It may be caused by the speed the finger circles the rim of the wineglass, how a cookie sheet rotates after it is struck, or the interaction of resonances in the sound of a gong. In example 2.6 we heard the sound of a plastic cup rolled across a table. In addition to exciting the resonances of the table, the ridges on the cup produce a distinct rhythmic element in the sound. These internal rhythms can limit the compositional usefulness of some sounds, but they can also be exploited in the right situation. You can hear a more elaborate example of rhythm in a sound by standing a saucepan lid on its edge. Give it a twirl and listen to the sound it makes as it spins around and finally falls over. The results will vary but are quite familiar to clumsy dishwashers. First you get a slow “roin-roin-roin” as the lid spins on its edge, followed by a “wubba-wubba-wubba” as it begins to fall. This evolves into an accelerating “wapity-wapity” as it bounces on the floor and finally ends with a solid thwack. Each distinct phase of the sound has its own tempo, which makes this event a mini-composition on its own. You can get similar results with other round objects such as quarters and manhole covers, with distinctly different timbres (DVD example 2.17).

LOOKING AT SOUND In the early days of electroacoustic music, composers relied almost exclusively on their ears. You had to listen for the beginnings and ends of sounds as the tape rolled, and you had to listen to the balance of sounds as you adjusted the faders on a mixing board. With the digital approach to composition, the paradigm has changed. Computer music programs are paradoxically visual—we are expected to operate on images of sound presented on the screen. You would think you could


FIGURE 2.4 Sine waveform. manipulate sound with your eyes closed, but it isn’t true. Blind composers have found most music programs impossible to use.

Waveform The most common display format is a graph of the air pressure changes that make up the sound. This is called the waveform and is a more or less regular swing around a center line. Figure 2.4 is a pure tone, an electronically generated sine wave, which can be heard in DVD example 2.18. The dotted center line represents the normal or ambient air pressure. The solid line shows how the pressure changes over time, which runs from left to right. It is important to note the time scale of such graphs. The marks in Figure 2.4 are spaced 1 millisecond apart, so the tone has a period of approximately 1 millisecond, or a frequency of 1,000 hertz. There are three important aspects of the waveform that appear here. The height of the curve is the amplitude and is a good indicator of the loudness of the tone. The width of a complete cycle shows the period. The best place to note the period is from the points where the curve crosses the zero line. Of course, this curve crosses the zero line twice, so what we actually need to note is where it is going up. The third aspect is phase: the point where the curve starts is an indication of the phase. Phase is described as an angle from 0 to 360 degrees. We don’t hear phase in an absolute sense, but we do hear the relative phase between two tones. When two sine tones of the same frequency but different phase are combined, the result is a sine tone of the same frequency but different amplitude. The resulting combination can vary from the sum of both amplitudes to no signal at all. If two
tones are slightly different in frequency, they will go in and out of phase. This will produce an audible pulsing in the amplitude. These pulses are called beats. Figure 2.5 and DVD example 2.19 illustrate this. Figure 2.6 shows a more complex tone. This is a recording of a violin, and you can see that although it has regularity, there are peaks and valleys of various sizes within the largest cycle. If we assume the large downward peaks demark the longest cycle, we can judge the period to be about 3 ms, which means a frequency around 333 Hz. The small peaks are associated with harmonic components. Looking at these, we might hazard a guess that spectral components up to the 10th harmonic are present, but we can be easily fooled. Figure 2.7 shows the same violin on a different pitch. The waveform is not much like Figure 2.6 but the sounds are quite similar. A component by component analysis of the sounds would show the same harmonics at approximately the same amplitudes, but the phase relationship would be different. In fact, the relative phases of the various components are constantly changing, because the components of the tone are slightly detuned from a perfect harmonic series. If you look carefully at Figure 2.6, you can see slight differences in each cycle, such as the height of the last peak before the big downswing. DVD example 2.20 contains the tones shown in Figures 2.6 and 2.7. Figure 2.8 is even more complex. There is nothing obviously periodic in this. This is noise, and if we discern any pattern at all, we are fooling ourselves or there is something wrong with the noise generator. Most recorded waves have a certain amount of noise mixed in, which shows up as fuzz on the principal waveform. DVD example 2.21 is a short recording of noise. What can we learn from looking at waveforms? Well, certain defects in recording show up clearly, and we can judge if two signals are approximately the same amplitude. After a lot of practice, you will be able to get a sense of the timbre and whether a sound is pitched or noisy. The most useful feature of the waveform view is found when we look at a different time scale. Figure 2.9 shows nearly 4 seconds of music. The individual loops of the wave are not visible but we can see the envelopes. This is a snare drum. The first blob is a quick roll which continues into “rata-tat-tat-tat” with the “tats” easily visible as the first three big peaks. You can probably work out the rest of the rhythm and tell that it gets louder toward the end of the 4-second phrase. This view makes it easy to spot the onsets of notes for editing purposes. We will still need our ears to verify what is what (the program lets us select a lump and play it) and we will want to go in to the close view for a precise cut. DVD example 2.22 contains this drum riff.
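The trick of reading the period from upward zero crossings can be automated. The sketch below (Python with NumPy, using a synthetic 333 Hz sine as a stand-in for the violin recording) finds the upward crossings and estimates frequency from their average spacing; on a real tone with strong harmonics a simple test like this can be fooled, so treat it as a rough estimate.

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * 333 * t)              # stand-in for the violin tone

# Indices where the signal passes from negative to positive (upward crossings).
up = np.where((signal[:-1] < 0) & (signal[1:] >= 0))[0]

period_samples = np.diff(up).mean()               # average spacing in samples
print(rate / period_samples)                      # roughly 333 Hz
```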

Spectrum We can get more information about the quality of a sound by looking at a spectral graph. A spectral graph (or spectrograph) only looks at a moment in time, but it shows the strength and frequency of the sound components.

FIGURE 2.5 The effect of phase on combining waveforms: two sine waves in phase add to double the amplitude, two sine waves out of phase cancel out, and a sine wave at 100 Hz combined with a sine wave at 110 Hz produces beats at 10 Hz.
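The beating illustrated in Figure 2.5 is easy to reproduce. This sketch in Python with NumPy (an illustration only) writes three seconds of a 100 Hz sine added to a 110 Hz sine to a WAV file; play it and you will hear the level pulse ten times a second.

```python
import numpy as np
import wave

rate = 44100
t = np.arange(rate * 3) / rate                       # three seconds
mix = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 110 * t)

samples = np.int16(mix / 2 * 32767)                  # scale the sum into 16-bit range
with wave.open("beats.wav", "w") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(rate)
    f.writeframes(samples.tobytes())
```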


FIGURE 2.6 Violin waveform.


FIGURE 2.7 Violin waveform at higher pitch.

Figure 2.10 is a spectrograph of a nearly pure tone, a sine wave played through a speaker which adds a bit of distortion. The fundamental of the tone is represented by the peak at 1,000 Hz (indicated by 1k for one kilohertz). You can clearly see the harmonics at 2 kHz and 3 kHz. They are multiples of the fundamental frequency but seem to squeeze together because the scale across the bottom of the chart is peculiar. Look at the space between 500 Hz and 1 kHz. It’s the same as the distance between 500 Hz and 250 Hz below. In other words, the markings from left to right represent octaves. Figure 2.11 shows a spectrograph of a violin tone. There are many more harmonics present. In fact, the sound has more energy in harmonics than in the fundamental. Sounds of that character are often described as “bright.” A more precise


FIGURE 2.8 Noise waveform.


FIGURE 2.9 Snare drum envelopes.

analysis would show the slight detuning of the harmonics that is typical of most acoustic instruments. There is another type of spectral display. A spectrogram (or time spectrum) shows how tones evolve. Figure 2.12 is a spectrogram of a violin playing open G. Each component of the tone is indicated by a horizontal line at a height matching its frequency. The strength of the component is indicated by color (in Figure 2.12, black is loud). Time spectra require a lot of computation, so real-time displays tend to be somewhat coarse, especially since slight level differences produce abrupt


FIGURE 2.10 Spectrum of sine tone (slightly distorted).


FIGURE 2.11 Spectrum of violin tone.

color changes. Nonetheless, they have a lot to tell us about the sound, especially how the upper components attack and release. The spectral views are not used as often as the waveform views (in fact few programs provide them), but they are instructive when you are curious about the makeup of a sound. Figure 2.13 shows the spectra of the four sounds on DVD


FIGURE 2.12 Time spectrum of an open G on the violin.

example 2.23. As you listen and look, try to match the traces with what you hear. Spectral views are created by a process known as Fourier Transforms. That technique is good for a lot more than making instructive pictures, and is covered in some detail in chapter 13.
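The Fourier transform mentioned above is built into most numerical libraries, so you can make a crude spectral view of your own. This sketch in Python with NumPy (an illustration, not the treatment given in chapter 13) analyzes a synthetic tone built from a 196 Hz fundamental and two harmonics and prints the strongest components it finds.

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate
# A fake "violin-ish" tone: a 196 Hz fundamental with two harmonics.
tone = (1.0 * np.sin(2 * np.pi * 196 * t) +
        0.8 * np.sin(2 * np.pi * 392 * t) +
        0.6 * np.sin(2 * np.pi * 588 * t))

spectrum = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), 1.0 / rate)

# Print the three strongest spectral components.
for i in spectrum.argsort()[-3:][::-1]:
    print(f"{freqs[i]:7.1f} Hz   strength {spectrum[i]:.0f}")
```

Swap the synthetic tone for samples read from one of your own recordings and the same few lines become a rudimentary spectrograph.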

Level Meters Audio level meters are the most common display, and the display most taken for granted. They show the electrical power of the signal and give a sense of how loud it is. Level meters are sometimes called VU meters, because an old term for decibel was volume unit. Modern meters are usually bar graphs that stretch up or to the right. They are modeled on the meters found on audio hardware and still emulate the old mechanical action. A meter measures amplitude over some period of time. (Instantaneous amplitude would not be useful because it changes too fast for the eye to see.) The response of most meters is averaged over 300 milliseconds. That response time was determined as much by the ballistics of mechanical meters as by psychoacoustics. Many engineers consider this response a bit slow, but it provides a practical indication of the signal power.


FIGURE 2.13 Spectra of four sounds in DVD example 2.23: whale, balloon, gong, and bottle. (Frequency axes: 15 Hz to 16 kHz.)


Meters are marked with a scale of decibels, and the reading is derived from the average amplitude by a root-mean-square or RMS calculation. This approximates the perception of loudness of music waveforms. Decibels need to be related to a reference value, and that value is usually the strongest signal that will not distort. Since the reference is marked “0 dB,” most of the readings will be negative numbers. Furthermore, the markings may not be evenly spaced. Most meters are designed to show values near the top in higher resolution, since we usually care more about the difference between -3 and -9 than the difference between -23 and -29. The relationship between the meter reading and the perceived loudness depends to some extent on the nature of the sounds that are measured. Speech does not sound as loud as music that hits the same mark, and some percussion sounds exceed system limits without moving the meter at all. For recording purposes we also need to know about the highest amplitudes present in the signal, no matter how brief they may be. These are shown by a peak meter, which responds quickly and falls back slowly. The peak indicator was originally just a warning light, but now it is a usually a different colored segment on a level meter. Some meters also have a numerical reading of the difference between 0 dB and the strongest measured level. This is called margin or headroom. The smallest margin value will show until it is reset. If the margin hits zero, some meters will show a warning that says “over.” This implies the recording has been clipped, but this may not be the case because many programs are conservative in the placement of the zero point. Meters are included in almost all recording and editing programs (not to mention all recording hardware), but the same sound material may read differently from one program to the next. Metering plug-ins (Figure 2.14) are available that will consistently measure level in all of your applications. The particular meter shown has separate indicators for peak and average levels, with a top segment that stays lit longer than the others. It also shows the numerical values of the meter. Note that the zero point is well below the end of the meter. This follows the old school practice of using zero as the reference for the average of the loudest parts of the music. On this meter, the zero full scale point is actually labeled 14. The top segments will light up if the shape of the peak waveforms is such that the analog parts of the system will produce levels beyond the digital limits. This can happen if the audio interface features oversampled outputs. Level meters mean very little if the setting of the monitor speakers is not taken into consideration. This is a perennial topic of discussion among audio engineers. There are some standards: in the movie business, a reading of -20 dB is set to produce 83 dBspl (86 dB from stereo speakers). That -20 dB is in reference to full scale, the clipping point of digital audio. This is then used as the “loud” signal level, the point that was marked 0 VU on analog systems. Many music engineers (who don’t deal with exploding helicopters) prefer slightly lower monitor settings, about 77 dBspl. A lower monitor setting means the music will be mixed hotter, with the average closer to the full scale mark. This is why music CDs sound so loud on DVD players—the mix is actually targeted at -14 dB or higher. There is an emerging stan-


FIGURE 2.14 Example of a metering plug-in.

dard that defines three levels for different genres of music. These are called K levels, named after mastering engineer Bob Katz, who proposed the standard. K-20 is the movie standard, which is also used in classical music. K-14 is used in pop music, and K-12 is used for broadcast audio. The level setup I described in chapter 1 is for K-20. If you prefer to work at K-14, you can calibrate your speakers using DVD example 1.1 to set the speakers for 77 dBspl. The metering plug-in shown in Figure 2.14 will generate reference tones at any of these levels for speaker calibration.
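
The averaging and decibel conversion a meter performs are easy to imitate. Below is a minimal Python sketch of an RMS reading taken over a 300-millisecond window and expressed in dB relative to full scale. It illustrates the principle only; it is not the metering code of any actual program, and meters calibrated so that a full-scale sine wave reads 0 dB will show numbers about 3 dB higher than this one.

    # Sketch: an RMS "meter" reading over a 300 ms window, in dB relative to full scale.
    import numpy as np

    def rms_dbfs(samples, sample_rate, window_seconds=0.3):
        """Average the most recent window of samples; 0 dB corresponds to full scale."""
        n = int(window_seconds * sample_rate)
        window = samples[-n:]                      # the last 300 ms of signal
        rms = np.sqrt(np.mean(window ** 2))
        return 20 * np.log10(max(rms, 1e-10))

    # A sine wave that just touches full scale reads about -3 dB, since its RMS value is 0.707.
    sr = 48000
    t = np.arange(0, 1.0, 1.0 / sr)
    print(round(rms_dbfs(np.sin(2 * np.pi * 440 * t), sr), 1))    # about -3.0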

COMPARING SOUND

Matching by Character

The day-to-day work of electroacoustic composition consists of placing one sound after another or combining them for new effects. This process is made easier by thinking ahead of time about how sounds might fit together. One way to think about your collection of sounds is to categorize them, as a painter places colors on his palette or an arranger lists the instruments in an orchestra. The painter has blues and reds in various shades and may keep the cadmium near the ochre. In the traditional orchestra, we have strings, winds, and percussion, with the winds further divided into woodwind and brass. Likewise, the sounds of the kitchen may be grouped by material (metal and glass), technique (rubbing, banging, or twirling), or results (pure tones, complex tones, or noise). There's no perfect organization. In fact there's no bad organization; the categories simply provide common points for comparisons. Similarity of sound is important in structuring a composition. Think of how you follow one voice at a dinner party even though the room is filled with conversation


and a lot of background noise. We can do this because we have learned to detect subtle similarities of timbre. An oboe melody stands out against the strings, not because the oboe is louder than the strings but because the tone of an oboe is recognizably different. A melody of rubbed glasses and bowls will stand out against the grind of an icemaker in the same way (DVD example 2.24). Contrast of sounds is equally important. One primary distinction is between foreground and background. A solo violinist plays with a style different from the members of the violin section. This translates into a tone that is brighter, has stronger attacks, and may be tuned just a bit sharp. The section player is careful to match intonation exactly and to play the rhythms precisely in order to blend in. Many electroacoustic sounds will seem suited for foreground from the first, with a clear pitch and striking tone. Backgrounds may be steadier in volume and less complex in quality. In later chapters, we will learn some tricks for modifying sounds so that they can be used in foreground or background.

Blends When sounds are combined, some will blend together so perfectly that they lose individual existence and something new appears. DVD example 2.25 is two wine glasses rubbed at the same time. They are well tuned so they meld into a single sound more complex than either glass alone. Why? Even though the fundamentals are tuned to the same pitch, the frequencies of the harmonics do not quite match. This adds edge to the tone. A similar effect can occur when a high-pitched sound is added to a low-pitched sound. If the high-pitched sound fits the harmonic structure of the low one, the high sound will seem to disappear, resulting in a richer tone with the lower pitch. Of course, this depends on careful adjusting of the loudness of the two. DVD example 2.26 demonstrates a blend of a wine glass and bowl. I noted earlier that many sounds contain some degree of noise. When two slightly noisy sounds are mixed, the noises may add up in a surprising way. Combined sounds of differing registers will not sound louder than each one heard separately. However, the noise will add up and be perceptually louder, as you can hear in DVD example 2.27. To be aware of this possibility, you should listen to your sounds at a high level at least once.
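
The edge comes down to arithmetic: a small mismatch between corresponding partials produces beating at the difference frequency, and the mismatch grows with the partial number. The toy calculation below uses made-up numbers (two tones 1.5 Hz apart), not measurements of the glasses on the DVD.

    # Sketch: beat rates between corresponding harmonics of two slightly mismatched tones.
    f1, f2 = 440.0, 441.5          # hypothetical fundamentals, 1.5 Hz apart
    for n in range(1, 7):
        beat = abs(n * f2 - n * f1)
        print(f"harmonic {n}: {n * f1:7.1f} Hz vs {n * f2:7.1f} Hz -> beating at {beat:4.1f} Hz")

The fundamentals beat slowly, but by the fifth harmonic the beating has reached 7.5 Hz, which is felt as a flutter rather than a slow swell.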

Dissonance The notion of dissonance is an old one in music, with debates on what is or is not dissonant on record from over two millennia ago. In the music theory sense, dissonance is a matter of taste, with certain pitch intervals privileged in some musical cultures. In electroacoustic work, pitch constraints are traditionally loose, but timbral dissonance is an important concept. This occurs when two sounds seem to be competing for foreground status. They are too dissimilar to blend, but neither has


FIGURE 2.15 Masking curves. (A masking tone raises the threshold of hearing for nearby frequencies, creating a masking zone; level axis from -10 to -130 dB, frequency axis from 15 Hz to 16 kHz.)

enough distinction for the listener to place them consistently in the stream of events. The result is that the listener’s attention skips back and forth between the two, an effect that can be interesting or disconcerting. DVD example 2.28 is the unlikely combination of a kitchen mixer and handheld vacuum.

Masking When two sounds are presented at the same time, one may make the other inaudible. The action of one sound blocking the perception of another sound is called auditory masking. This is not just a matter of relative loudness, although that is involved. A loud sound will mask a soft one of a nearby frequency; the difference in frequency required for both to be audible is dependent on the difference of amplitudes. The graph in Figure 2.15 shows this. A narrow band of noise is more effective at masking than a pure tone. There is also a phenomenon called temporal masking. A loud sound will mask soft sounds that occur up to a few milliseconds later. Under some conditions, a loud sound will mask a soft sound that occurs slightly earlier! What is the practical effect of masking? Masking effects apply to all components of a sound, not just the fundamentals. Thus some combinations of sound are quite different from what we expect. We run afoul of masking most often when we are dealing with lyrics or spoken text. Human ears are really good at picking words out of a conflicting background, but if consonants and other key elements of the voice are masked, the meaning can be obscured. See if you can tell the t’s from the p’s in DVD example 2.29.


Streaming Every string or wind player has practiced exercises that involve rapid alternation of high and low notes. Done properly, this gives the impression of two independent parts. Why? The brain always tries to put things in groups and makes associations by register. Two high notes will be paired against two low notes, even if one of the low notes occurred between the high notes. This principle is called streaming and can be based on nearly any aspect of sound—loudness, timbre, location, anything that can distinguish two sounds.

PAYING ATTENTION TO SOUND These are just some basic observations about sound. We will discover more as we go about our work, and you should take every opportunity to learn. The best way to learn is to listen to every sound you encounter. Take notes if that focuses your attention. I often send my students on listening walks, across town, over the meadow—the location doesn’t matter because sound is everywhere. Once you start paying attention, you will make amazing discoveries.

EXERCISES

1. Open a drawer somewhere in your home and systematically note the sounds that can be made with the contents. Categorize them according to timbre, pitch and envelope.

2. Take a walk outdoors and list everything you hear.

3. Listen to a single sound and write a description of the sound that does not mention the actual source of the sound. Read your description to someone else to see if they can identify the sound.

RESOURCES FOR FURTHER STUDY There are two kinds of books about sound. Acoustics texts talk about the physics of sound, which is good to know if you like to understand why certain things happen. Books about the artistic side of sound tend to be philosophical. For acoustics, Arthur Benade’s works can’t be beat. There are some newer books, but acoustic science hasn’t changed a lot in the last


thirty years. For the art of listening, I find Pauline Oliveros's work beautiful and perceptive, and F. Alton Everest has produced an exceptional course for training sound engineers. Krause speaks of sound in the wild and how civilization is changing it.

Benade, Arthur H. 1992. Fundamentals of Musical Acoustics, 2nd rev. ed. New York: Dover Publications, Inc.

Everest, F. Alton. 2006. Critical Listening Skills for Audio Professionals. Boston: Artistpro.

Krause, Bernie. 2002. Wild Soundscapes: Discovering the Voice of the Natural World. Berkeley, CA: Wilderness Press.

Oliveros, Pauline. 2005. Deep Listening: A Composer's Sound Practice. Lincoln, NE: iUniverse.


THREE Recording Sounds

Electroacoustic music begins with sound, but it becomes “electro” at the microphone. Capturing a precise electrical analog of the varying pressure of the air is the cornerstone of the art. This chapter describes the equipment and procedures for recording sound.

THE RECORDING SIGNAL CHAIN A recording system consists of a microphone to convert sound into electrical signal, a preamplifier to boost and possibly modify the signal, and a recording device of some kind. These may be separate pieces of equipment or combined into a single unit. We needn’t concern ourselves with the details of design and engineering involved, but some understanding of what’s inside the equipment will help us get the best recording possible.

Microphones

The heart of a microphone is the diaphragm, a bit of thin foil or plastic that can flex easily in response to changes in air pressure. The motion of the diaphragm can be translated into current in a variety of ways. The most popular mechanisms are dynamic (short for electrodynamic) and condenser. The diaphragm of a dynamic microphone has a lightweight coil of wire attached to it. This is suspended within the field of a powerful magnet. When the diaphragm assembly moves, current is generated in the coil in much the same way that the electric company produces the current that lights our homes (Figure 3.1). The motion of the diaphragm and coil is about the width of a toothpick, so the amount of current generated is tiny. This current may be boosted by a transformer, a miniature version of the devices that sit on top of power poles. A ribbon microphone is a type of dynamic mic that has a thin strip of metal for the diaphragm. This strip is corrugated, which makes it extra flexible—quite like


FIGURE 3.1 Dynamic microphone. (Labeled parts: diaphragm, coil, magnet, transformer.)

the tinsel sometimes used to decorate Christmas trees. The ribbon is suspended between the poles of a strong magnet, and motion generates current in the ribbon itself. The ribbon design was used in the first high-quality microphones, but these were so delicate they could not be used outside the studio. In fact, they could be ruined by a sneeze. Ribbons went out of favor for several decades but are making a comeback due to materials such as nanotubes, which make a ribbon diaphragm as sturdy as any other style. The condenser microphone takes advantage of the fact that a tiny amount of electricity can be stored in the air space between two metal plates (Figure 3.2). The device that does this is called a capacitor (condenser is an old term for it) and the stored electricity is called a charge. The amount of charge that can be stored (the capacitance) depends on the size of the parallel plates and the space between them. In the condenser microphone, the diaphragm is one plate of a capacitor. The other plate is a fixed screen. The charge in the assembly is maintained by a battery or external power supply. As the diaphragm moves, the capacitance changes, and current flows into or out of the plates to keep the charge balanced. This current flow is boosted by a miniature amplifier circuit contained in the mic. There is a variation of the condenser microphone known as the electret. This design uses a permanently charged plastic diaphragm instead of a charging system. (It’s charged at the factory by the same process that makes the shrinkwrap from a CD cling to your hand.) Electrets still need power for the internal amplifier, but this draws so little current that a battery will last nearly a year. Electrets can be nice mics, but eventually the charge leaks off the diaphragm, and the mic stops working. In many cases we won’t need to worry about which type of mic we are using, but there are some considerations we should be aware of. Dynamic microphones are usually cheaper than condenser mics, and they are generally tougher. If there is any risk to the microphone from extremely loud sounds or


FIGURE 3.2 Condenser microphone. (Labeled parts: diaphragm, backplate, amplifier, battery.)

a harsh environment, a dynamic mic is the better choice. You need to take care handling condenser mics, especially the large ones. Condenser microphones are usually better microphones. The recordings made with condenser mics have a more natural sound than those made with dynamic mics. Recordings made with dynamic mics can be quite good, but when every bit of quality counts, use a condenser mic. The electronics inside a condenser microphone are its Achilles heel. The internal amplifier can be overloaded and distort with extremely loud sounds, and it generates a bit of electrical noise, which precludes recording faint sounds. The electronics also require power, which means there either is an internal battery or an external power supply. Sometimes the power originates in the preamplifier or mixer and is passed to the mic over the cable. This technique is called phantom power. Condenser and dynamic microphones produce subtly different signals from the same acoustic input. The current flowing from a condenser microphone is 90 degrees out of phase with the current from a dynamic microphone. This usually doesn’t matter, but when you need to place two mics at the same spot (for stereo), they should be the same type.

Directionality of Microphones If the diaphragm of the microphone is suspended in a frame that leaves it completely exposed to the air, it will respond to sound coming from the front or back, but not from the edges. This is because the diaphragm moves in response to a difference in pressure on each side. If the wave fronts approach at a right angle, the pressure will be the same on either side of the diaphragm, producing no motion. Figure 3.3 illustrates this. The sensitivity of a microphone to sound from different


FIGURE 3.3 Bidirectional microphone. (Pressure at the front moves the diaphragm back; pressure at the rear moves it forward; pressure at the edge does not move it. The response pattern is a figure eight with + and - lobes.)

directions is called the pickup pattern. It is charted by moving a steady sound source in a circle around the diaphragm and graphing the amplitude of the signal at each angle. This graph is also called a polar response curve. For a bidirectional mic the shape of the polar response is a figure eight. Sometimes a plus and minus sign are shown to remind us that sounds coming from the front and back push the diaphragm in different directions at the equivalent phase in the wave. If the back of the diaphragm is enclosed in a case, the microphone will work like a barometer. The pressure within the case will stay steady (a tiny hole allows gradual pressure changes) and the pressure outside will vary with the waveform. This microphone will respond equally to sound approaching from any direction. The polar response for an ideal omnidirectional mic is circular, although there is often some frequency-dependent variation caused by the supporting structures of the mic (Figure 3.4). The word omnidirectional is usually shortened to omni. The ideal pickup pattern for a microphone would be like that of a camera lens, responding only to sounds directly in front. Unfortunately this is acoustically impossible, at least if you want a wide frequency response. A usable compromise can be obtained by combining an omni and a bidirectional pattern, as shown in Figure 3.5. The resulting polar response is heart-shaped, for which we use the Latinized term cardioid. Signal from the back is reduced because the bipolar response to rear signals is out of phase with the omni response. Cardioid patterns can be generated


FIGURE 3.4 Omnidirectional microphone. (The enclosure behind the diaphragm means pressure from any direction moves the diaphragm back; the response pattern is circular.)

FIGURE 3.5 Cardioid microphone. (With a partial enclosure, the response of the diaphragm is omni where the back is "shaded" and bidirectional where it is open; the result is a mix of the two patterns.)


in various ways; these include partially obstructing the back of the diaphragm, adding complex delay paths to the back, or, in the more expensive designs, by combining the signals from two diaphragms. These designs vary in the details of the cardioid shape, in some cases allowing a bit of rear pickup to achieve better rejection at about 130 degrees, a pattern marketed as “hypercardioid.” It turns out to be rather difficult to produce a cardioid pattern that is consistent at all frequencies (Figure 3.6). In fact, in the sub-bass frequencies, all cardioid mics become omni. Closely matched patterns in the high range are one of the differences between a $100 mic and a $1,000 model. You may think that odd frequency response from the sides and back (so-called off-axis response) would be unimportant, but it can have a significant effect on such things as reverberation. DVD example 3.1 explores the directional and frequency characteristics of omni, bidirectional, and cardioid microphones. These were produced by moving a music box in a circle around the microphone. You should perform similar experiments on every mic you own. It’s important to realize the pickup pattern is really three-dimensional. The choice of pickup pattern depends on what is most important in the situation. If a flat frequency response is the goal in a well-controlled situation like a classical concert, omni mics will be best. If there are sounds present which must be excluded, a cardioid mic is usually chosen. A bidirectional mic actually rejects more sound than the cardioid mic, but the rear pickup restricts its use. Bidirectional mics are mostly used in studios where they can be suspended above an instrument with the rear lobe pointed at the ceiling. The pressure difference that drives the bidirectional or cardioid mic is mostly caused by the fact that the front and back are in different parts of the wave, a fixed distance determined by the size of the diaphragm and its supporting hardware. This makes the output amplitude vary with frequency, as short wavelengths will have more pressure variation over a given distance. The amplitude rises with frequency up to a point, then falls as the pressure on front and back approach 180 degrees out of phase. Electronic components in the microphone are used to correct the response, although many cardioid mics remain a bit bright. There is another source of pressure difference, however. The front of the diaphragm is closer to the sound than the back, and the pressure amplitude falls with the square of the distance. This is insignificant unless the sound source is quite close (within inches) of the mic, but in cases where this effect does come into play, the frequency response correction causes a dramatic boost in low frequency. This is called proximity effect, and many mics have a switch to adjust the frequency response to correct for close operation. DVD example 3.2 demonstrates the proximity effect with a bidirectional microphone.
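
The omni-plus-bidirectional recipe can also be written as a formula. In the usual first-order model the sensitivity at an angle θ is a + (1 − a)·cos θ, where a = 1 gives an omni, a = 0 a figure eight, and a = 0.5 a cardioid. The Python sketch below is only that textbook formula; it says nothing about how any particular microphone achieves its pattern.

    # Sketch: first-order polar patterns as a mix of omni and figure-eight responses.
    import numpy as np

    def pattern(theta_degrees, omni_amount):
        """Relative sensitivity at an angle for the textbook first-order model."""
        theta = np.radians(theta_degrees)
        return omni_amount + (1.0 - omni_amount) * np.cos(theta)

    for name, a in [("omni", 1.0), ("cardioid", 0.5), ("bidirectional", 0.0)]:
        front, side, rear = (pattern(angle, a) for angle in (0, 90, 180))
        print(f"{name:13s} front={front:+.2f}  side={side:+.2f}  rear={rear:+.2f}")

The negative rear value for the figure eight corresponds to the plus and minus lobes mentioned earlier.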

Microphone Controls You wouldn’t think anything as simple as a microphone would need many controls, and most mics have none. Some do, however, especially the high-end models. Here are some common features:


FIGURE 3.6 Cardioid frequency response. (Pickup patterns shown at 60 Hz, 120 Hz, 1 kHz, and 10 kHz.)

On/Off switch. If a microphone has a battery, it will have a switch to keep from running the battery down when the mic is not in use. You will occasionally see off switches on cheap microphones intended for use in public address (PA) systems. The off switch is probably the most annoying feature a microphone can have, since it is invariably left off when the mic is mounted in some inaccessible location.

EQ switch. Many cardioid and bidirectional mics have equalization switches to reduce low-frequency output. This should be set when the mic is close enough to the source to produce the proximity effect. Some mics also have EQ adjustments for the high range, although this is better done at the preamplifier or mixer.

Attenuator. Since a condenser microphone contains an amplifier, it is possible for the diaphragm to produce a signal strong enough to cause distortion. Controls in the preamplifier or later in the signal chain can do nothing to prevent this, so the only cure is to move the microphone away from the source. Some condenser mics include a switch to prevent this distortion by reducing the signal from the diaphragm to the internal amplifier.

Directionality controls. Condenser mics that use the two-diaphragm method of producing a cardioid response can produce any polar pattern depending on how the diaphragms are hooked up. The switch offers a choice of omni, bidirectional, and cardioid patterns, which is very handy indeed. Some top-of-the-line mics even have this feature on a remote control.


Impedance Microphones produce quite small currents, but there is a rather wide range in the amount produced by various models. The concept of output current is described rather confusingly in terms of the type of device the microphone can connect to. An input that works with little current is called a high-impedance input, because the quality described by impedance is inversely proportional to the current that will flow at a given voltage. On the other hand, a low-impedance input requires a lot of current, relatively speaking. Studio microphones are capable of generating enough current for any input, and so are called low-impedance. High-impedance microphones are the less-expensive models designed to work with things like camcorders. Some high-impedance mics are decent quality, but most are junk. You can usually tell a microphone’s impedance by the connector. Low-impedance mics have three pin connectors, while high-impedance mics have a short cable with a phone plug, often the 3.5 mm size. High-impedance mics will pick up electrical noise if the cable is longer than about eight feet.

Mounts An important feature of any microphone is the means to mount it on a stand. Nearly every mic has a uniquely shaped body and comes with a matching clip that holds the body in a spring grip. (Some very expensive mics have threaded sections on the body and finely machined fittings to hold them.) The clip then screws onto a standard thread, 5/8 in. in the U.S. or 9 mm in Europe. It is a truism that the more expensive the mic, the more expensive the clip, so care should be taken to prevent their loss. There are more elaborate mounts that suspend the mic from a web of springs or elastic bands. These are designed to prevent the transmission of sound up the stand to the mic, which can be a problem if the stand is on a wooden floor. If needed, these shock mounts should be obtained from the microphone manufacturer; the generic version that relies on rubber bands to grip the mic is quite likely to drop it instead.

Preamplifiers The signal coming out of a microphone is weak, so weak that the wire coming from the mic has to be protected from stray radio signals by shielded and balanced cables. The signal must be amplified quite a bit to match the levels used by most of the studio equipment. That level, called line level, was originally specified by engineers in the telephone industry who needed signals that could reach from town to town on the telephone lines with few problems. The word amplifier (as a distinct piece of equipment) has come to mean the device that boosts line-level signal to


the high power needed for loudspeakers. Hence the word preamplifier was coined for devices that boost weak signals to the line level. Studio habitués generally shorten the word to preamp. The large amount of amplification (up to 60 dB) applied to microphone signals requires carefully designed and precisely constructed circuits. These can be approximated cheaply by integrated circuits or standard designs copied directly from textbooks or part suppliers’ “typical applications.” However, standard designs seldom produce good preamps, where such subtle details as the location of the screws that connect the circuit board to the case can affect the purity of the output signal. A quality preamp will have a sophisticated design based on many cycles of trial and testing. Electronic parts are surprisingly variable in performance, even if they have the same part number stamped on them. Parts are tested by the manufacturer to ensure that they meet published specifications, but tolerances of 20 percent are common. The finest preamps are built from components that are hand selected for best performance. None of this comes cheap, so a mic preamp can be the most expensive piece of gear in a studio. Preamplifiers generally incorporate operations beyond amplification. There is usually a signal strength indicator, with lights or a meter to confirm presence of signal and warn of overload conditions. There will be adjustment to the amplification, possibly including a passive attenuator (called a pad) to reduce strong signals without distortion. Phantom power for condenser microphones is usually available, and some preamps even have equalization and dynamic compression. Those last two features are expensive and not really necessary for electroacoustic work. Preamplifiers are often part of other types of equipment. For instance, a recorder with a built-in mic will have to include a preamp, which is probably a single integrated circuit. Most mixing consoles have a preamplifier on every channel. If these are high quality, the console will be quite expensive. Conversely, the preamps in a budget mixer are unlikely to be very special, no matter what the advertising claims. It may seem odd to spend serious money on something that is often already included, but I have found that a decent preamp can improve sound quality more than any other investment.

Recorders

We can use either dedicated devices or a general-purpose computer to record audio. There is really little difference in operation from one type of system to another. Aside from the few remaining analog tape machines, all recorders rely on digital technology and computer software, either embedded in a device or loaded into memory. Digital recording programs use analog recorders as a paradigm, so the real differences between the media are hidden from view. There are several technical factors that can limit the recording. These are most obvious in a computer-based system, but the following components are important in any device.


Input and Output Conversion In any digital system, the signal from the preamplifier must be converted to digital form. In this procedure the constantly varying signal is measured at short time intervals and becomes a stream of numbers that reflect the voltage. The device that does this is known as an analog-to-digital converter, or ADC. This device can be a multi-thousand dollar piece of gear or a single integrated circuit. There are many subtle factors that can affect the accuracy of this process, but the two overriding ones are the rate and precision of measurement. The number of measurements taken in a second is the sample rate (in this context, a sample is a single number). As a general rule, the more samples per second the better the fidelity. High sample rates can produce a stunning reproduction of the signal, but they increase the cost of the system by using more memory and storage space per minute of recording. Low sample rates have a distinct effect on fidelity. It is mathematically impossible to digitally represent a signal with a frequency higher than half the sample rate. Figure 3.7 illustrates why this is so. The upper waveform is a high-frequency input signal. This is measured at the times indicated by the sample clock shown at the bottom. The result is a low-frequency wave that bears no obvious relationship to the original. This production of spurious tones is called aliasing. If the sampling rate is twice the frequency of the input, the output will at least have the right number of high to low transitions. The frequency that is half the sampling rate is often called the Nyquist frequency, after a mathematician who studied these problems in the early twentieth century. The analog-to-digital converters in a recording system must be protected from inappropriate frequencies by so-called anti-aliasing filters, which remove any signal above the Nyquist frequency. These filters produce audible effects, but the higher the sample rate the less intrusive the filter. The best ADCs sample at an extremely high rate, then reduce the number of samples actually stored. This technique is called oversampling and allows the use of a digital filter to prevent aliasing. Sample rates in telephone systems range from 10 kHz to 22 kHz, and you can easily tell that something is missing. The lowest sample rate that is considered high fidelity is 44.1 kHz, which was adopted for the CD format. This number was partially determined by the requirements of the videotape systems originally used to capture digital audio. It is high enough to match the high-frequency capabilities of the LP record and low enough to allow a complete Beethoven symphony to fit on one disc. Many producers prefer to make original recordings at a higher rate so that they are sonically ready for better-quality distribution and can tolerate various processes with less-audible effect. Recording at 48 kHz is common practice, and 88.2 kHz or 96 kHz are used regularly. Experiments with even higher rates have not shown any audible improvement. The accuracy of these measurements is just as important as the rate. The limiting factor in accuracy is the number of bits used to hold the digital result of the conversion. This is called the word size. The number of bits available determines

FIGURE 3.7 Effects of aliasing on waveform. (The high-frequency input, the sample clock, and the resulting low-frequency aliased output are shown.)

the signal measurement resolution in the same way the number of markings on a ruler determines the precision of a length measurement. To understand the effect on audio signals, we must follow the process through and see what results when the waveform is reconstructed from the recorded data through a digital-to-analog converter or DAC. Figure 3.8 shows how various word sizes capture the nuances of a complex waveform. One way to interpret the difference between the input and output signal is as a noise consisting of the "missing" part of the original signal. This noise has an amplitude equal to the voltage of one bit. The maximum signal is the voltage of all bits, so we can conclude that the word size determines the signal-to-noise ratio of the conversion process. Since each bit doubles the possible voltage, the signal-to-noise ratio is roughly 6 dB per bit. Communications systems use 8- and 10-bit encoding but musically acceptable audio requires 16 bits, the standard for CDs. The highest-quality audio encoders use 24 bits. With a resulting signal-to-noise ratio of 144 dB, there is little reason to push for better. These values are often converted to 32-bit floating point numbers for the purposes of signal processing. This takes full advantage of the computer's power, reducing rounding errors and other accidental artifacts to insignificance. Increasing the word size of a signal after it has been converted has no effect, but subsequent processing (such as digital mixing) would then be more accurate. When the word size is decreased (such as when 24-bit files are burned to audio CDs), it may be desirable to apply the process known as dither. Dithering uses the information in the bits that will be discarded to create an impression of more bits than are actually remaining. Dither can also shape the error noise of the 16-bit


FIGURE 3.8 Effect of word size on waveform. (The same waveform captured at resolutions of 8, 6, and 4 bits.)

format so it is less audible. Dither is a careful form of distortion, so it should be applied only one time to a recording. Ultimately, the digital signal must be converted back to analog form before it is amplified and applied to loudspeakers. The issues of conversion accuracy of the digital-to-analog converters are the same as those of the analog-to-digital converters. It is not necessary for the sample rate and word size of the output to be the same as the input, but changing them adds complexity. If a recording is played at a rate different from the original recording rate, the pitch and length of the recording will be affected. In fact, the effect is the same as playing a record or tape at the wrong speed. If the rates are different coming and going, a mathematical sample rate conversion must be performed. DACs require smoothing filters similar to those that prevent aliasing at the input, and many converters use oversampling (or upsampling) to improve the sound of these. When signals are passed in and out of digital systems without being recorded, the time lag between the input and output becomes important. This lag is known as latency and is caused by the fact that the A-to-D and D-to-A processes take time. The delay is small, measured in microseconds, but is quite audible if the delayed signal is recombined with a direct analog version. In general-purpose computers, latency is aggravated by the practice of accumulating the samples into a buffer and processing them in batches of 64 or 128. This is done for computational efficiency,


but the delays involved are distinctly noticeable and quite disconcerting to performers who are listening to the processed signal. In addition to the basic concerns of sample rate and word size, there are several subtle factors that can affect the quality of conversion either way. It is not enough to provide sufficient bits in the digital measurement of an analog signal; each one-bit change in value must represent exactly the same change in voltage. This means the measurement must be accurate to one part in over 16 million. It should not be a surprise that converters (either way) vary in their ability to meet this goal and that the accuracy is reflected in the cost. The accuracy of the sample rate must approach the same order of magnitude. Any unevenness or jitter in the sample rate is particularly troublesome. In addition, if signals are passed from one device to another in digital form, the sample rates must be perfectly matched. This is often achieved with a synchronizing signal provided by a device called a word clock.
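
Two numbers in this section can be worked out with one line of arithmetic each: the apparent frequency of a tone that violates the Nyquist limit, and the delay contributed by a processing buffer. The Python sketch below illustrates both rules of thumb; it is not a description of any converter's internals.

    # Sketch: where an out-of-range tone folds to, and how long a buffer delays the signal.
    def alias_frequency(f, sample_rate):
        """Apparent frequency after sampling: fold f back into the 0-to-Nyquist range."""
        nyquist = sample_rate / 2.0
        f = f % sample_rate                  # the spectrum repeats every sample rate
        return sample_rate - f if f > nyquist else f

    print(alias_frequency(9000, 10000))      # 9 kHz sampled at 10 kHz appears at 1000 Hz
    print(alias_frequency(30000, 44100))     # a 30 kHz tone sampled at 44.1 kHz appears at 14100 Hz

    # Buffer latency: a 128-sample buffer at 44.1 kHz holds about 2.9 ms of audio.
    print(round(128 / 44100 * 1000, 1), "ms")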

Storage Media Once it has been converted, the resulting data must be stored. If a computer does the recording, the storage will be on a hard drive, ideally an external drive dedicated to audio. At one time high-speed drives had to be specified, but contemporary drives are almost always fast enough (with the exception of “solid-state drives”). Miniature portable recorders typically use memory cards designed for cameras. These come in an astonishing variety of shapes and types, and speed definitely must be considered when they are purchased. Adapters are available to connect these to a computer for transferring files to the hard drive. Size is another issue that has nearly gone away with hard drives but is important in solid-state media. At 5 megabytes per minute, audio is somewhat greedy for disc space, but a gigabyte will hold more than 100 minutes of stereo, so a 500 gigabyte drive can store an impressive library of recordings. Audio files may be stored in .wav or .aiff formats. There is an association between .wav and the Windows operating system, but most programs can deal with either format. Both formats can keep all of the recorded data, unlike MP3, AAC, or other data-compressed formats. The latter are suitable for transmitting low-fidelity versions of audio over the Internet but should not be used for original recordings or when gathering materials for composition. Data-compressed formats extend recording time significantly but limit editing and processing of the material. You never know where a particular recording will end up, so keep uncompressed versions of everything. Audio files can be (and should be!) backed up to removable media for archiving and interchange. There is nothing special about audio files in this respect, although some editing programs create a bundle of files in a project folder that should be kept intact. CD-ROM and DVD-ROM are commonly used for backup, and USB flash drives are convenient for smaller files.
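
Storage figures like these are easy to recompute for other formats: multiply the sample rate by the number of channels and the bytes per sample. The sketch below is only that arithmetic; note that a figure of about 5 megabytes per minute corresponds to a single 16-bit channel at 44.1 kHz, and CD-quality stereo takes roughly twice that.

    # Sketch: uncompressed audio storage arithmetic.
    def mb_per_minute(sample_rate, channels, bits):
        bytes_per_second = sample_rate * channels * (bits // 8)
        return bytes_per_second * 60 / 1_000_000

    print(round(mb_per_minute(44100, 1, 16), 1))    # one CD-quality channel: about 5.3 MB per minute
    print(round(mb_per_minute(44100, 2, 16), 1))    # CD-quality stereo: about 10.6 MB per minute
    print(round(mb_per_minute(96000, 2, 24), 1))    # 96 kHz / 24-bit stereo: about 34.6 MB per minute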


The finished product is usually an audio CD. Many media programs will burn an audio CD, but they are best created by specialized applications that allow variable intertrack intervals and other niceties. Computer-burned CDs should be playable by any standard CD player, but it is unwise to take this for granted. Blank CDs vary tremendously in quality. Manufacturers can get away with this because computers use a continuous verification process when writing to the disc. If a block of data is not properly written, that block is marked as bad in a table of contents and the data written again. Thus a poor CD will hold less data than it should, but the customer is unlikely to notice. Audio CDs use a different layout that requires continuous data. An occasional bad block can be tolerated, but too many in a row produce audible problems. The only defense against this is to use trusted brands of media, avoid the cut-rate bargains, and always listen to the results.

Audio CDs should not be used for backup. Use .aiff or .wav files on a data CD. The audio CD data format is designed to allow convincing playback of partially damaged disks, but this process hides any loss of data. Audio copied from CD to CD to CD steadily loses quality, whereas CD-ROMs either work perfectly or not at all.

A recordable CD is a delicate sandwich of materials. The drives work by shining a laser through the transparent bottom. The beam is reflected by a mirror backing on the upper side and returns through the CD to a pickup lens. This path is interrupted by dark spots burned into a photosensitive dye by a slightly more powerful laser. The interruptions provide the data, so a CD has three major vulnerabilities. Scratches or dirt on the transparent bottom, damage to the reflective layer, or discoloration of the dye layer will all ruin the CD. These are easily caused by rough handling. The following is some advice for keeping CDs fresh.

Keep CDs in a case or paper sleeve. I prefer sleeves because they take up less space and don't break as easily.

Handle CDs by the edge and the center hole. If you need to set a CD down temporarily, set it on the sleeve or in the case.

Keep CDs away from heat. This includes cars closed up in warm weather.

Keep CDs out of sunlight.

Don't use adhesive labels on CDs. These are likely to damage the CD drive when they come loose. If you try to remove a partially loose label, a large chunk of the mirrored backing will come along.

Label CDs using CD pens. Many marking inks contain acetone, which will eat through the mirror layer. Ballpoint pens will scratch the backing. Use only soft-tip pens specifically labeled as CD markers.

Print on CDs with an inkjet printer. This requires discs with an inkjet-printable backing and a special printer. Note that these may take 24 hours to dry properly. Other printing schemes have not proved satisfactory.


Recording Software and Controls

In the heyday of mechanical recorders, the controls were physical knobs, levers, and buttons that adjusted levels and activated motors. There was an amazing variety from brand to brand and model to model. But even before tape disappeared, controls became actuators for microprocessors that could manage the reels gently and cue up to stored locations. In the process, the controls became buttons marked with a standard set of icons for play, stop, record, and other functions. Digital systems try to emulate that look and feel to present a comfortable and familiar interface. For computer-based systems, these are clickable icons, but you can add external boxes with the familiar physical buttons. The standard list of functions is fairly short.

Play begins playback of a recording. Many computer programs attach this to the space bar, which is easy to find in a hurry.

Record begins the recording process. On tape systems, recording always replaced whatever was on the tape before, but in digital systems this is not necessary. Recording usually begins in a new file, but there may be options to replace or append to an open file. Because recording on tape risked erasing a previous take, the hardware always had some safety feature, such as requiring a press of play and record at the same time. When more than two tracks were involved, switches were provided to indicate which tracks were to record. Other tracks would play, so new material could be synchronized with the old. Recording programs keep all of these features, even though they are not always so necessary.

Stop ends playback and recording. There is some variation as to whether playback will resume from the point of stopping or from the beginning.

Pause is a temporary stop, with the implication that play would resume from the point of pausing. Tape machines would start more quickly from pause than from stop. Since the play icon is not functional if the recording is already playing, this (and the space bar) are often converted to pause during playback.

Record pause is a state where recording is about to begin. In this mode, the inputs are turned on so that proper recording levels can be set. This is often indicated by a flashing record button.

Rewind and fast forward were used on the old machines to move tape quickly. They still appear in software, although their meaning is not consistently defined. They generally move the start point to defined locations, but some applications implement an animated continuous motion.

Return to zero (RTZ) is not standard but this function is found under some name in any recording program. It simply cues playback to the beginning of the recording.


There are two basic controls that were actuated by knobs on hardware for which some equivalent is needed in a digital system. Playback level adjusts the amplitude of the recordings as they are played. This is not strictly necessary in a stereo recorder, but in a multitrack system it adjusts the relative balance between tracks. Record level adjusts the amplitude of the input. This was and still is the most critical control in the system. It is used in conjunction with a meter that shows the strength of the input. The input signal varies tremendously as it is affected by the placement of the microphones, the microphone type, and settings of the preamplifier. In addition, all but the simplest of sound sources will vary quite a bit in loudness. Record level must be optimized before the signal hits the ADC, so the record level control must come at an earlier point in the system. This will be on the input interface or mic preamp. There are two ways improper record level can mess up a digital recording. If the input value is too strong, many samples will be clipped to the largest number the system can handle. An occasional isolated clip will have no audible effect, but it doesn’t take many for the sound to become noticeably distorted. On the other hand, if the signal is weak, it can be boosted later, but the resolution will be no better than the number of bits that were actually used to encode the signal. Since each bit provides 6 dB of resolution, the meter can be used to estimate the resolution of soft recordings. If full-scale resolution is 16 bits (as on a CD recorder), a signal that registers -24 dB is a 12-bit recording, about the same as good telephone service. Since music often has a dynamic range of 30 dB, you can see the advantage gained by 24-bit recording. In practice, music and speech will give a rapidly fluctuating reading on the meter. The low segments will be lit most of the time, but there will be a set of segments that will flicker on and off. An old-fashioned needle would be vibrating within this area, which represents the average level of the signal. Occasionally the reading shoots toward the top, and the highest segment lit represents the peak level. The difference in dB between the average and peak levels is called the crest factor. The optimum recording level is with the peaks just below 0 dB full scale. The optimum level also has the average around -20 dB for classical material or -12 dB for pop music. What do you do if the source material has a crest factor higher than this? If the peaks are short, as often is the case with percussive material, occasional clips are acceptable. If the peaks represent real climaxes in the music, as sometimes happens in orchestral recordings, a lower level must be used. If the result is perceived as too quiet, steps can be taken to make it seem louder. We will discuss that in a later chapter. There is a device called a limiter that reduces the amplitude of strong peaks while leaving most of the material alone. This is a piece of analog equipment that is connected between the mic preamp and the computer audio input. Many handheld recorders include such a circuit. The effect the limiter has on the signal is gentler


than the results of overloading the ADC, but it is still a distortion of the sound. I suggest the limiter be reserved for situations where the performance volume is unpredictable. The details of using a recording device or program are going to vary from machine to machine. You will have to consult the accompanying manual to learn the names used for the controls and their location.
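
The 6-dB-per-bit rule gives a quick way to estimate how much resolution a quiet take actually used, and the crest factor comes from the same pair of meter readings. The Python sketch below is only that arithmetic, using the rule of thumb described above.

    # Sketch: effective resolution of a quiet take and the crest factor of a signal.
    def effective_bits(peak_dbfs, converter_bits=16):
        """Rule of thumb: every 6 dB of unused headroom costs about one bit of resolution."""
        return converter_bits - abs(peak_dbfs) / 6.0

    print(round(effective_bits(-24, 16), 1))    # peaks at -24 dB on a 16-bit system: about 12 bits
    print(round(effective_bits(-24, 24), 1))    # the same take on a 24-bit system: about 20 bits

    # Crest factor is the difference between the peak and average meter readings.
    peak_db, average_db = -3.0, -20.0
    print(peak_db - average_db, "dB crest factor")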

CATCHING A SOUND The process of recording is pretty simple. You can do the exercises in this book with a simple microphone plugged directly into the computer sound card. In a more-complex setup a microphone is connected to the preamplifier with a microphone cable, the preamplifier is connected to the computer audio interface with a line cable, and the output of the interface is connected to an amplifier that lets you switch between headphones and speakers. You should use headphones when making the recordings and switch the speakers on when you play the results back. You never want the microphone and speakers active at the same time. Doing so invites feedback, a loud sound caused by the mic picking up the speaker output. (Eventually, you are going to make a mistake and produce feedback. This will do no harm if it is stopped immediately. You can stop it by hitting any one of several controls: the speaker switch, the preamp level, the microphone switch if there is one, or, as a last resort, unplugging the mic. The burst of feedback is so likely to induce panic in beginners that I suggest you make it happen deliberately a few times so you can practice turning it off.) When the recording is made on a computer, the data must be saved somewhere on the hard drive. Some programs prompt the user for a filename and location before you can start recording; some record into a temporary file and prompt for a name after the fact. In either case, use a name that is descriptive of what you are recording. When you make additional recordings of a source, add a take or version number. Keep a log of all of your recordings. Figure 3.9 shows a simple recording application called WavePad. To make a fresh recording, choose New File from the menu or tool bar and a recording window will open. The WavePad window prompts for recording format and includes controls for selecting the audio device used for input and output. All recording software must have these features, although they are usually hidden away in a Preferences menu. The file type is chosen when the file is saved. To adjust the level in most programs, you put the recording system into record pause and make some trial sounds. Watch the meter display and adjust the preamp level until the loudest sound just reaches the -3 dB mark. WavePad does not have a record pause mode, so you have to be actually recording while setting levels. However, you can record over these trial runs by stopping and moving the scroll bar in the record window to the left. When the level seems right, click the Record icon


FIGURE 3.9 The record window from WavePad.

and make several tries at the sound. Then click the Stop icon. Play the recording. If you like it, close the record window and save the file.

MICROPHONE TECHNIQUE When you listen to your recordings, you will soon discover that a lot depends on the position of the microphone relative to the sound source. Whole books have been written about microphone placement. It’s not really that complicated a subject, but experienced engineers know where to place mics when recording a wide variety of instruments and how to adjust if the sound is not quite right. The books try to convey that experience, but they really can’t, except in a general way. There is no substitute for hearing the effects mic placement can have, so be prepared to do a lot of experimentation. The books can show you where to start but don’t accept what they say if your ears tell you otherwise. There are really only two principles at work here. The first is that as a mic is moved closer to the source, the signal gets stronger (and with some mics the proximity effect kicks in). You have to place mics close to soft sounds and a bit further away from loud ones.


The second principle is the tricky one. Once you get fairly close to a sound source, tiny changes in position can have tremendous effects on the sound. This is because the sound does not emanate from a single point on the instrument. The whole object is vibrating, with various tones and partials coming from different points. It is the blend of these tones and partials that make the sound of the instrument, but the blend is not heard until you are some distance away. You don’t need a microphone to prove this—just get your head close and listen as you move around. As an example, consider the piano. There are eighty-eight sets of strings spaced along a distance of about 5 feet. Simple math will tell you that if you place a mic 6 inches from the top string, the sound of the lowest string will be down at least 12 dB in that mic. Further, the strings are up to 6 feet long, with different modes spaced at divisions of the length. An additional source is the soundboard, which resonates with the strings, but with its own complex pattern. The keys and pedal mechanisms make a lot of sounds we would rather not hear. Barely noticeable at any distance, these can be accentuated by mics close to the keyboard. If you want to record a piano as it sounds to an audience, place the mics a few feet away. If you want to record what the player hears, place mics over his head. Other positions let you bring out the highs for crisp attacks or warm up the low end, but these spots can only be found by experimenting with a particular piano. Other instruments are equally complex, and the effects are not always obvious. For instance, most of the sound of a trumpet is produced at the bell, and a microphone there will pick up a consistent tone. The saxophone is quite different; little sound comes out of the bell for most notes, but the two lowest notes will really blast. The way to discover all of this is to make a series of recordings while systematically changing mic placement. Think of it as exploration of an unknown territory. Keep track of everything you do; the sound source, the mic position, the level at the preamp. You can keep a notebook or just talk as you record, as long as you wait until your voice stops reverberating before you make the sound you want to record and don’t talk over the end of the sound. DVD example 3.3 features a sonic tour of a piano.
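
The "simple math" for the piano is the distance rule: in a free field the level falls about 6 dB for each doubling of distance, or 20·log10 of the distance ratio in general. The sketch below applies that idealized rule to a mic 6 inches from the top string with the lowest string roughly 60 inches farther away; real instruments in real rooms will not behave this neatly, which is consistent with the cautious "at least 12 dB" figure above.

    # Sketch: free-field level difference between a near string and a far string.
    import math

    def level_drop_db(near_distance, far_distance):
        """Idealized free-field rule: 20 * log10 of the distance ratio."""
        return 20 * math.log10(far_distance / near_distance)

    # Mic 6 inches from the top string; the lowest string roughly 60 inches farther away.
    print(round(level_drop_db(6, 66), 1), "dB down")                      # about 20.8 dB, idealized
    print(round(level_drop_db(1, 2), 1), "dB per doubling of distance")   # about 6 dB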

RECORDING QUALITY Even if we are not worried about getting the perfect tone from a sound source, you will quickly discover that some recordings will be unusable for various reasons. With experience, you will learn to avoid these common mistakes. Recording signal too hot. A “hot” signal is a strong signal and is in danger of overloading the ADC or analog recording components. This results in distortion of the signal and is usually caused by the microphone too close to the source or preamplifier gain turned too high. If the latter, the recording may be both noisy and distorted. The level meter will usually warn of impending overload, but it is limited


to reading the signal at one point. Distortion may occur earlier in the signal chain even when the meter is reading proper levels. The solution to the problem is to either turn the gain down or pull the mic away from the source. Recording signal too weak. A recording that is too quiet has problems beyond the inconvenience of turning the monitors up to hear it. Remember, you waste a recording bit for every 6 dB of empty headroom. A 24-bit recording system dramatically improves this situation, but unless the converters are very high quality, there is still some distortion associated with the quietest sounds. The first step to boosting a weak recording is to move the mic closer to the source. This may give a proximity boost to the bass, so use a low cut switch if necessary. The signal may also be boosted by turning up the recording gain, but this is likely to increase noise, either from the electronics or the recording environment. Some sounds are, alas, too quiet for practical recording. Even in the soundproof studio where I teach—a room where it is possible to record a heartbeat—students are finding sounds that are just too weak to register, like the flare of a match. You should try, though. Tiny sounds can be fascinating.
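
A quick way to see what the 6 dB rule of thumb costs is to count bits. The few lines of Python below are only an illustration of that arithmetic; the numbers are invented examples, not measurements.

def wasted_bits(peak_dbfs):
    # Approximate converter bits left unused when the loudest peak of a
    # recording sits peak_dbfs below full scale (0 dBFS). One bit of
    # resolution corresponds to roughly 6 dB of level.
    return abs(peak_dbfs) / 6.02

for peak in (-1, -12, -24, -36):
    print(f"peak at {peak:>4} dBFS: about {wasted_bits(peak):.1f} bits unused")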

Background Noise Our minds are adept at ignoring sounds that are a constant part of our environment. If I stop writing and listen, I realize my quiet study is serenaded by crows and passing cars. There is also a distant refrigerator, a heating system, and the whirr of a disc drive attached to the television set. These are largely unnoticed (I’m aware of the cars), but if I were to make a recording here and play it elsewhere, all of the noises would be immediately apparent. The effect is rather like those vacation photographs that seem to show a tree growing out of your brother’s head. No recording space is totally without background sound, so the best we can get is an acceptable signal-to-noise ratio (S/N). For the purposes of composition, 40 dB will do. Few recording meters go this low, so you will just have to listen before you record. A good signal-to-noise ratio is most easily achieved by putting the mic close to the sound source. In some cases a directional mic will help, and in a few situations bass or high cut from an equalizer will minimize the background. The background is most noticeable in the spaces between sounds, and in composition, we can cut that out, but the noise will still be a component of the recordings and may be even more objectionable if they come and go with one piece of source material.
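
Since few meters will read down to the noise floor, you can estimate the signal-to-noise ratio from the recording itself by comparing the RMS level of a sound with the RMS level of the room tone. The Python/NumPy sketch below is only an illustration; the test signals are synthetic stand-ins for a real recording and a real noise floor.

import numpy as np

def rms_db(samples):
    # RMS level of a block of samples, in dB relative to full scale (1.0).
    rms = np.sqrt(np.mean(np.square(samples.astype(np.float64))))
    return 20 * np.log10(max(rms, 1e-12))

def signal_to_noise_db(signal_block, noise_block):
    # Difference between the level of a sound and the level of the room tone.
    return rms_db(signal_block) - rms_db(noise_block)

t = np.linspace(0, 1, 44100, endpoint=False)
signal_block = 0.3 * np.sin(2 * np.pi * 440 * t)    # stand-in for a recorded sound
noise_block = 0.002 * np.random.randn(44100)        # stand-in for room tone

print(f"S/N is roughly {signal_to_noise_db(signal_block, noise_block):.0f} dB")
# Around 40 dB here, which is the working minimum suggested above.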

Coloration No recording is going to sound precisely like the original sound. The goal is to produce a recording that gives the desired impression. Let me put it this way: if you were to record your mother, you should not only be able to recognize her voice and understand the words, you should be able to tell what kind of mood she is in.


When you feel the urge to speak back to the recorded voice, you have it just right. Coloration is determined by choice and placement of the microphone. In addition to finding a spot that picks up all parts of the sound, you must avoid placement near reflective surfaces. If the mic is too close to a tabletop or hard wall, the reflection can combine with the direct sound to produce phase interference. This can affect the sound in a manner we will explore in later chapters. Avoiding coloration does not so much depend on the price of the microphone as it does on choosing a mic that matches the sound you are trying to get and on patient experimentation with microphone placement.

Ambience Any sound recording is inevitably affected by the space where the recording is done. The sound is not sucked up by the microphone; it continues to reverberate in the space according to the characteristics of the room. Unless the space is totally dead, some reverberation will be included in the recording. Most listeners prefer a bit of reverberation. In fact some genres of music call for quite a lot. However, reverberation can color the sound in awkward ways, and too much will make individual sounds muddy and indistinct. The amount of reverberation in a recording is determined by the distance from the microphone to the source. A mic placed close in will not pick up much reverberation at all, whereas a distant mic will take in more reverberation than direct sound. In every location there is a critical distance where reverberation equals direct sound. This is easily found by experiment, because as the mic is moved beyond this critical distance, the recording will not become quieter. Since reverberation can be added to a recording, but never taken out, we record most source material as “dry” as possible, that is, lacking reverberation. That implies a nonreverberant room or a close microphone. It’s a good idea to make a recording of the space when nothing is going on. This is called room tone and is just the ambience including any background noise. We use this to add pauses to a recording or as background to added material such as narration. Room tone will sound more natural than pure silence. DVD example 3.4 demonstrates several recording problems. See if you can identify them. (Answers can be found at the end of this chapter. The first tone is correctly recorded for reference.)

Stereo Recording Stereophonic sound is such a ubiquitous format that the word stereo has become synonymous with any sound playback system. Stereo recording is actually an invention of the mid-twentieth century, adopted because it helps capture some of the ambience of a concert hall experience. True stereo requires the use of two matched microphones and duplication of the entire recording apparatus. The two


recordings that result are kept in strict synchronization (originally with dual-track tape recorders, now with software) and eventually played back through matched loudspeakers. The stereo effect actually comes from minute differences in the two tracks, similar to the point of view of two eyes shown in the red and green images of a 3D movie. Artificial stereo reverberation can be added to a monophonic recording. In fact, most recordings made today are a mix of mono tracks with added electronic reverberation. Electroacoustic music uses the same methods, so most source material can be mono. A common beginner’s mistake is to select stereo recording mode when only one mic is used. This results in an empty right channel, which has to be removed before the recording is edited.
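
If you catch the one-mic-in-stereo mistake after the fact, the empty channel is easy to discard. This Python sketch assumes the third-party soundfile library and is just one possible approach, not a feature of any particular editor.

import soundfile as sf   # assumed third-party library for reading and writing audio files

def keep_left_channel(in_path, out_path):
    # Salvage a "stereo" recording made with a single mic: keep the left
    # channel and write a mono file.
    data, sr = sf.read(in_path)      # stereo data arrives as (frames, channels)
    if data.ndim == 2:
        data = data[:, 0]            # discard the empty right channel
    sf.write(out_path, data, sr)

# keep_left_channel("one_mic_take.wav", "one_mic_take_mono.wav")   # hypothetical file names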

EXERCISE Choose a drawer of kitchen implements or a box of tools and make short recordings of each item. Use various excitation techniques and experiment with mic placement. Avoid the urge to play tunes—the goal is to produce a set of simple sounds that will be used in basic composition exercises.

RESOURCES FOR FURTHER STUDY

There are dozens of books about recording, but the best resources are magazines. Most magazines now have supplements online and will probably eventually be found only in electronic formats. These are more up-to-date and reflect more points of view than textbooks. The best are:
Mix (published by Penton Media), mixonline.com
Tape Op, tapeop.com
The standard texts on recording include:
Huber, David. 2010. Modern Recording Techniques. Focal Press.
Owsinski, Bobby. 2009. The Recording Engineer's Handbook, 2nd ed. Boston: Course Technology [Cengage Learning].
Recording problems illustrated in DVD example 3.4:
1. Correct tone
2. Too loud
3. Too soft
4. Background noise (listen to the tail end of the sound)
5. Reflections off nearby wall coloring the sound
6. Bad microphone


FOUR Sound after Sound

Placing sounds one after another is the fundamental activity of electroacoustic composition. In its simplest form, this is the musique concrète procedure of the 1950s, as heard in classic works such as Pierre Schaeffer’s Symphonie pour un homme seul. This is occasionally described as “species I electroacoustic music” or audio montage. Most modern pieces are far more complex with multiple layers of heavily processed sounds, but the fundamental skill is still simply choosing what sound to use next. This skill has two parts. The first is competence with audio editing software; the second (and more difficult to acquire) is the judgment to make musically interesting choices.

AUDIO EDITING SOFTWARE There must be a hundred audio editing applications available, and by the time this book is published there will be twenty more. As noted in chapter 1, they are all similar in features and use, but you do tend to get what you pay for. You are going to spend a lot of time with the editor, and it is worthwhile to invest in one you can be comfortable using. However, many of the features in professional-grade mastering applications are not essential to composition. When pressed for a recommendation, I usually suggest the light version of mature products. I do urge you to use an independent audio editor instead of the editing features found in digital audio workstations (to be explored in chapter 6). Those are geared toward making the occasional edit between takes rather than the dozens of splices per minute of music typical of composition. In this chapter I will cover basic procedures that should be available in any audio editor, demonstrating with Bias’s Peak Studio LE for Macintosh.

Looking at Sound Files

When you record a sound or load a file into any editing program, a document window opens containing an image of the sound.


FIGURE 4.1 Transport controls from Peak LE 6.

The sound is manipulated with a variety of controls, which may be located in the same window or scattered in separate palettes. In addition to the waveform display window, there is a palette of editing buttons and a palette with the so-called transport controls, which instigate recording and playback operations. These are modeled after the buttons found on hardware recording devices (Figure 4.1). When Play is clicked, the file begins playing, and Stop has the obvious effect. There also may be a Pause, which differs from Stop in that playback will resume from the pause point. These actions are also controlled by the space bar, a keyboard mapping that seems to be universal in audio editors. The inclusion of buttons labeled Rewind and Fast Forward seems silly in a tapeless system, but many editors provide them, with no consensus as to what they should do. The implementations that skip through the file are useful for finding a place in a long recording. Another approach is to skip to the next marker, which is fine if you place markers. The Record button opens a window which allows choice of inputs, type of recording, and more buttons to start and pause the recording process itself. Figure 4.2 shows the editing window of Peak LE. The sound is shown as an oscillogram, a graph of the sample values in the file. This type of display does not indicate what the actual sounds or pitches are, but it does show amplitude, and with practice you can learn to discern the beginnings and ends of notes or words. The top portion of the window shows an overview of the entire file and is used for navigating from section to section. The lower portion shows the area in the box on the overview. It can be zoomed in or out to show the envelopes of individual notes or the details of the waveform. The current edit position is shown as a dotted vertical line. Clicking in the window moves the edit position, and hitting the space bar lets you hear the sound from there. There is also a display that shows precise statistics of the cursor position and selection points. This is useful for precision editing. You can type in a position to go to or a selection range in a dialog reached from the edit menu. Premium editors feature the ability to “scrub” the sound file by dragging a cursor across the display. This makes it easy to find edit points quickly. In the Peak LE window, time within the recording is indicated by a horizontal scale and amplitude is marked on a vertical scale. Units in the time scale may be based on milliseconds, sample numbers, or musical beats. Units in the amplitude scale may be dB or percentage of full value. The scales are adjustable, and the oscillogram will expand or contract accordingly.


FIGURE 4.2 Editing window from Peak LE 6.

Since you generally want to examine a particular moment in the file, a magnifying-glass cursor is provided. In this mode, clicking on a point in the display will expand the graph horizontally with the target centered. When you play the file, the play-head cursor travels across the display indicating the current location. Most editors scroll the window to keep the play head centered. Some allow you to defeat scrolling so you can listen to a long section without losing your place. Edit operations happen at the edit position, which also may be known as the insertion point. The insertion point may be expanded to a selected range of samples, which are then all affected by editing operations. Typical methods of selection include a simple mouse drag or marking the beginning and end of the selection with a menu action. When there is a selected region, the space bar will play it and stop.

Finding the Spot The essential skill in editing is setting an edit point. This is a two-part process: navigate to the point of interest by listening to the file, then use the visual display for precise location. Start with the display fully zoomed out. Playback will begin at the insertion point, so place it just before the location you are looking for. Play the file, stopping when you hear your target. The insertion point will move to where the play-head cursor stopped. Now select a section before that point. Playback now will be limited to the selected area. If it includes the target, zoom in until the selection fills the window and select a shorter region. As you repeat this process, you will eventually


be able to spot the bump that indicates the start of the sound you are looking for. Place the insertion point just before that bump. If you are selecting a region, adjust the end of the selection until only the target range is played. The means for adjusting selections vary from editor to editor. In WavePad, the cursor changes to horizontal arrows when the mouse is over the end of a selection, and a simple drag will modify the selection size. In Peak LE, a combination of shift and click will extend the selection. Many editors have key combinations that nudge the selection edges. To finish up the selection, zoom in until the silent spaces between sounds are visible. If the silence were true silence, you could place the edit point anywhere in this space. However, recorded silence is never just a row of zeros. Figure 4.3 shows the apparent silence between two sounds at two vertical zoom settings. The close-up shows that the first sound extends nearly to the second with a tail that is probably room reverberation. You can see from this that the quietest point is just before the start of the new sound. This is nearly always the magic spot for edits. Both the beginning and the end of a selection should be placed just before the start of a sound. Cutting at the start of a sound insures that you get only the sound from one event and maintains rhythmic accuracy when you are editing music. Finding the place to cut between two sounds that are not separated by silence is a bit trickier. Figure 4.4 shows a short guitar passage with a single note selected. There is little to indicate visually where the note starts—it must be found by listening. The procedure is a process of refining the approximate locations. Set the selection in the middle of the notes before and after the target. Play the selection and listen to the way the desired note begins. At first you will hear a sort of pickup to the note, which could be described as “pa-ping.” As you move the beginning of the selection to the right, the pickup will get shorter and become nothing more than a tick at the beginning of the target note. As you zoom in closer, the adjustments to the selection become finer. (You will want to disable auto scrolling once the selection no longer fits in the window.) Continue to move the beginning of the selection until the note starts cleanly. If you overshoot, you will begin to hear a pop at the start of the note. Adjusting the end of the selection is a similar process, except here you are removing any vestige of the following note. It’s hard to hear when the target note is cut too short, so I usually make two or three tries, noting where each attempt falls on the timeline. DVD example 4.1 has several short passages. Open this in your editor and practice finding the second or third note. There is one final step before the selection is ready for action. You must match the wave phase at the cut points. Figure 4.5 shows why this is necessary. A short section of sound has been selected and removed. The cuts were made at different points on the waveform, leaving a kink or glitch. The audibility of such a thing is unpredictable, but it will usually be a click or pop. The solution is to make the selection at zero crossings where the waveform is going the same direction at each end of the selection. The result will be only a slight bend in the waveform and completely


FIGURE 4.3 Silence between sounds in the editing window.

FIGURE 4.4 A guitar note in the editing window.



FIGURE 4.5 A bad splice.

inaudible (illustrated in Figure 4.6). Most audio editors help out in this process by showing the zero line or with explicit “find zero” commands. You should develop the habit of always selecting at zero going up or zero going down. The direction is not important; the advantage is in having all of your chunks of sound cut the same way. Some audio editors approach this problem by overlapping and blending the ends of clips (crossfading) whenever you perform an operation. This harkens back to tape editing, where we always made the cut on an angle to produce a short crossfade. This is often effective but doesn’t work in all circumstances. Crossfading in audio editors can usually be switched on or off.
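
The “find zero” commands mentioned above amount to a simple search through the samples. The Python/NumPy sketch below shows one way such a search could work, looking for the nearest crossing that moves in a chosen direction; it is an illustration, not the algorithm of any specific editor.

import numpy as np

def nearest_zero_crossing(samples, index, direction=1):
    # Return the sample index of the zero crossing closest to `index` where
    # the waveform moves in `direction` (+1 upward, -1 downward). Cutting
    # every region at crossings of the same direction avoids the kink shown
    # in Figure 4.5.
    s = np.sign(samples)
    crossings = np.where((s[:-1] * direction <= 0) & (s[1:] * direction > 0))[0] + 1
    if len(crossings) == 0:
        return index
    return int(crossings[np.argmin(np.abs(crossings - index))])

# Example: a 440 Hz test tone; find the upward crossing nearest sample 1000.
tone = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
print(nearest_zero_crossing(tone, 1000, direction=+1))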

Editing Actions

Most editing consists of removing unwanted material, inserting material from some other location, and similar cut-and-paste operations. The actions available vary according to whether the target area is an insertion point or a selection. Some of the operations remove or add material, which affects both the length of the file and the rhythmic placement of the material remaining after the operation.
Delete removes the material in a selection and discards it. This is sometimes called Clear and shortens the file.
Cut removes the material in a selection and keeps it in a buffer usually called a clipboard. This also shortens the file.
Copy duplicates the material in a selection, placing it in the clipboard.
Silence replaces the material in a selection with values of 0. This leaves the file length unchanged.
Insert Silence adds silence for a specified amount of time at the insertion point. This lengthens the file.
Trim removes all material that is not selected. This is sometimes called Crop.
Paste will insert material from the clipboard at the insertion point.



FIGURE 4.6 A good splice.

When the target is a selection, there is a choice as to what can happen. Usually all of the clipboard replaces all of the selection. This will change the length of the file if the clipboard contents and the target are not the same length. A variant on this replaces enough of the target to accommodate the contents of the clipboard. This will leave the length of the file unchanged. This is sometimes called Replace in the edit menu, or it might be a pasting mode selected as a preference. You will occasionally see an explicit Insert command that inserts the clipboard at the beginning of a selection or, alternatively, inserts whatever clipboard material will fit in a selection without modifying the file length. With all these versions of the Paste command around, you should always consult the manual or do some simple experiments when trying a new editor.
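
For readers who think in code, these operations map onto very simple array manipulations. The Python/NumPy sketch below is purely illustrative; real editors keep instruction lists rather than copying sample data around, but the audible results are the same.

import numpy as np

def cut(samples, start, end):
    # Remove start:end and return (shortened file, clipboard).
    clipboard = samples[start:end].copy()
    return np.concatenate([samples[:start], samples[end:]]), clipboard

def silence(samples, start, end):
    # Replace start:end with zeros; the file length is unchanged.
    out = samples.copy()
    out[start:end] = 0.0
    return out

def trim(samples, start, end):
    # Keep only the selection (sometimes called Crop).
    return samples[start:end].copy()

def paste_insert(samples, clipboard, at):
    # Insert the clipboard at a point, lengthening the file.
    return np.concatenate([samples[:at], clipboard, samples[at:]])

file = np.arange(10, dtype=np.float64)    # stand-in for audio samples
shorter, clip = cut(file, 2, 5)
print(paste_insert(shorter, clip, 0))     # the cut material moved to the front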

Edit Lists Every audio editing application lets you undo an editing operation. This is essential, because it is impossible to predict all of the effects of an edit. Most recordings are complex, with interleaving and overlapping threads of sound, and it is common to carefully select and execute an action while listening to a foreground sound, only to discover the effects on some background sound are not acceptable at all. The better editors maintain a list of editing actions and let you roll back to some previous state. This is desirable because many operations require two or three steps, and you can’t judge the result until the whole set is complete. A second skill in the art of editing is to recognize edits that have to be redone. It’s not hard to let something slide by that becomes an obvious flaw after repeated listening. Unfortunately, it is quite difficult to repair a sloppy edit once you have


moved on to another task. I suggest you develop the habit of listening to each edit at least twice before working on the next. Many professional editors (people who edit for a living) insist on working with the speakers at high volume to reveal tiny flaws. I don’t recommend that (protect your ears), but it is good to work at the top of your comfortable range and keep background noises to a minimum. DVD example 4.2 demonstrates some problematic edits:
1. Good foreground match, but poor background match
2. Extraneous material before insert (insert too late)
3. Extraneous material following insert
4. Truncated attack of insert
5. Attack after insert truncated
6. Pop at end of insert (both inserted and following material truncated)

DSP

Most audio editing applications include some built-in processing tools that go beyond cut-and-paste. The term DSP is a holdover from the days when processing had been available only in analog hardware and digital signal processing was a new thing. These processes are applied to the entire file or a selection, and they change the data in the file. Although they can be undone, they are best thought of as permanent data alterations. This is in contrast to plug-ins, which affect the audio as it is played without actually changing the data in the file. Plug-ins are discussed in chapter 5. The DSP processes available vary with the application. Typical examples include:
Gain change or amplify uniformly modifies the signal level of the selection, as if the original recording had been made with different input levels. There are limits to the process; any clipping that occurred in the initial recording cannot be removed, and amplification of weak signals results in loud low-quality audio. In addition, this process can clip the selection (most applications will issue a warning if this is about to occur). Gain change does not usually produce noticeable distortion, but any effects are cumulative, so if a gain change does not work out, it is better to undo it and make the change again with a new value than to change the section twice.
Normalize increases the level of a selection to the maximum possible without clipping. This is more a mathematical process than a musical one and is useful only for tasks like preparing short sounds for use in a sample bank.
Fade in and fade out produce gradual transitions in level. In most programs the fade is to zero. A few allow the user to choose both beginning and ending levels. This effect is often used in electroacoustic composition.


Pitch change and duration change duplicate a technique that was a major standby of the tape era—changing the playback speed of the tape. This action increased or decreased pitch while shortening or lengthening the duration of the recording. It also produced profound changes in timbre. In a digital environment, the two aspects can be changed independently, and effects on timbre can be minimized. There are a variety of algorithms for these operations, and most produce audible distortion. The different algorithms distort in different ways that affect some types of material more than others. Simple audio editing applications give you only one algorithm; the better ones let you choose and tweak parameters.
Invert changes the sign on all of the samples, which turns the waveform upside down. This has no audible effect on an isolated sound but may affect the way it mixes with another.
Reverse turns a selected section around, another standby from the tape era.
Bass boost or EQ affect the frequency characteristics of the selection as discussed in chapter 5. The DSP versions of these features are usually rudimentary compared to what is available as plug-ins.
Mix or combine adds the contents of the clipboard to the selected region of the file. This is not as flexible as the mixing available in a multitrack production application, but it is often a convenient way to build complex sounds.
Many editing applications are developing multitrack capabilities according to the principle of feature creep. Multitrack production is discussed in chapter 6. There are many more DSP effects available in various audio editing applications. Some are interesting and useful, some are rather bizarre. You should play with whatever is available in your software and draw your own conclusions as to their utility.
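
To make the simplest of these concrete, here is a hedged Python/NumPy sketch of gain change, normalize, and fades as plain sample arithmetic. These are idealized textbook versions, not the algorithms of any particular application.

import numpy as np

def gain_change(samples, db):
    # Uniform gain change; positive db amplifies, negative attenuates.
    return samples * (10 ** (db / 20))

def normalize(samples, peak=1.0):
    # Scale so the largest peak just reaches `peak` (no clipping).
    top = np.max(np.abs(samples))
    return samples if top == 0 else samples * (peak / top)

def fade(samples, fade_in, fade_out):
    # Linear fade-in over the first fade_in samples and fade-out over the last fade_out.
    out = samples.astype(np.float64).copy()
    out[:fade_in] *= np.linspace(0.0, 1.0, fade_in)
    out[-fade_out:] *= np.linspace(1.0, 0.0, fade_out)
    return out

tone = 0.25 * np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
louder = gain_change(tone, 6)                  # roughly double the amplitude
faded = fade(normalize(tone), 1000, 1000)      # full level with gentle edges
print(np.max(np.abs(louder)), np.max(np.abs(faded)))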

Playlists

Most modern audio editors perform nondestructive editing. That means the samples in a file are not actually changed with each operation. That would slow the application down substantially and make undo operations difficult to program. Instead, the application keeps a list of instructions that determines what happens when you hit play. If a three-minute recording had the middle 60 seconds cut out, the instructions might read “play from start to sample 2,646,000, then jump to 5,292,000 and play to end.” Of course, after a lot of editing, the instruction set can get quite complex and may even become too much for the computer to execute gracefully. That’s hard to imagine with modern computers, but if it does happen to you, save the file. The save executes the instruction set, rearranging the audio samples and making all of the operations permanent. Since this convoluted play mechanism exists already, it should be possible for the user to specify an arbitrary order.


Many applications provide this feature, which is called a playlist. The first step to creating a playlist is to identify the regions of the file you want to work with, usually by a select-and-name operation. Once regions are established, a playlist can be constructed that puts them in any order, including repeating a section as many times as desired. The great advantage of playlists is that you can have several for the same file and set up different versions of a composition.
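
The instruction-list idea is easy to model. In the Python/NumPy sketch below (an illustration only; the sample numbers echo the three-minute example above), an edit list and a playlist are both just lists of regions rendered in order.

import numpy as np

def render(samples, regions):
    # Concatenate the listed (start, end) regions of the source file. This is
    # roughly what happens when a nondestructive editor finally saves.
    return np.concatenate([samples[start:end] for start, end in regions])

sr = 44100
source = np.zeros(3 * 60 * sr, dtype=np.float32)    # stand-in three-minute file

# "Cut out the middle 60 seconds" as an instruction list:
edit_list = [(0, 2_646_000), (5_292_000, len(source))]
print(len(render(source, edit_list)) / sr, "seconds")             # 120.0

# A playlist reuses named regions in any order, including repeats:
regions = {"intro": (0, 10 * sr), "chorus": (60 * sr, 75 * sr)}
playlist = ["intro", "chorus", "chorus"]
print(len(render(source, [regions[n] for n in playlist])) / sr)   # 40.0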

SPLICING EXERCISES

There’s some advantage to spending a little time perfecting your editing technique before you start working with original compositions. That way you will not be distracted from musical considerations by having to figure out technical procedures. Here are some short assignments that will hone splicing skills.
1. Record a scene from a soap opera that has two characters. Remove all of the lines of one character.
2. Find two country songs that are in the same key and tempo. Exchange the choruses of the two songs.
3. Record a newscast from radio or television. Pick a word that occurs fairly frequently and replace that word with a beep.
4. Record a friend telling a story. Remove all unnecessary words, such as “like” or “umm.”

THE SPECIES I EXERCISE The traditional first exercise in an electroacoustic composition class is to assemble a short linear piece using a pool of sounds like those recorded at the end of chapter 3. The étude should be no longer than 30 to 45 seconds in duration and will be accomplished by pasting sounds one after another with minimal modifications. It’s a good way to develop editing skills and begin to develop the subtle art of matching sounds.

Sound Associations This project starts with a survey of the available sounds. First note the obvious characteristic of each sound: Is it pitched? What is the pitch? What register is it in? Percussive or sustained? Any evolution of characteristics? Internal rhythm? Try to find a one-word description of the timbre: woody, brassy, dark, etc. Write these


notes down. Most composers eventually accumulate thousands of sounds, and a good catalog established from the beginning can save days of searching. On the second listening, assign the sounds to groups of similarity based on these characteristics as well as associations such as original source or the material the source was made of. Pay attention to interesting contrasts also. The classification of sounds is somewhat akin to the type of question found in intelligence tests that ask which item does not belong in a list. The difference is that instead of using the kind of abstract concepts that distinguish a pig, dog, and giraffe from a tree, we are looking for qualities within sounds that are alike or contrasting. DVD example 4.3 has some pairs of sounds with some similarity. See if you can spot why they are associated. Finally, look for musical attributes within each sound: some may contain a pitch trajectory that suggests a phrase or a rhythm that can be used to set tempo. Even though compositions of this type are not within the traditions of Western tonality, we must be aware of and deal with any expectations the sounds may evoke. DVD example 4.4 has four recordings that set up musical expectations. See if you can identify the features that do this.

Planning Structure The composition process is begun by choosing a few promising sounds and sketching some possible arrangements. New composers may find it useful to think in terms of beginning, middle, and end, with the following functions. The beginning introduces the materials and sets the context or ground rules for the piece. The first five seconds should establish the tempo and tonality (or lack thereof) and give some indication of the mood. The middle is play time. Use materials similar to those of the beginning, but reordered in a way that sets them off or creates interesting gestures. Pay close attention to rhythmic integrity. Consider the foot tapping test: it should be easy to tap your foot along with the piece. Contrasting material can be introduced, but there should still be points of similarity. If a piece is mostly soft wooden sounds, a loud wooden sound or a soft metallic sound might add a nice contrast, but a loud metallic sound might be jarring. The ending ties everything together. Exactly how this is achieved depends entirely on what came before and cannot possibly be prescribed. If everything is building to a climax, the end is that climax. If the middle is a set of variations on the opening material, a restatement of the opening may work at the end. If tonality is in play, there may be a cadence or a complicated harmonic pattern that just fades away.

Assembly The actual process of assembling the sounds is simple. Create a new mono file and fill it with silence the length of the finished piece. If your editor features bar and


beat time markings, set the tempo. Next, systematically copy individual sounds from the files in the pool and place them where they need to go. Use the time markings or cursor readout to guide placement of the insertion point, then paste the sound in. After pasting, check the loudness against the previous sound and adjust the level if necessary. After each action, listen to the whole piece. If the new sound does not have a desirable effect, undo and try something else.

Critique In my classes, these exercises are played for critique by the teacher and students. The following typical comments often point out common issues and problems. Random or unknowable associations of the sounds. John Cage produced “Williams Mix” by using random operations to pick and place sounds. This was interesting at the time, but I doubt many people have the piece on their iPods. Part of the pleasure of listening to music is forming associations, and if associations can’t be found, the audience will not be sympathetic. This is not to say that such associations cannot be complex or obscure, but in these beginning études it’s best to work with simple relationships. Rhythmic inaccuracies. Later we will discover that mechanistic rhythm is the bane of electroacoustic music, but in these first études mistakes are more likely to go the other way. The foot tapping test is one good criterion. If the rhythm is flexible, following the inner flow of the sources, singing along may be more appropriate. The important thing is that if rhythmic expectations are set up, they should be met. Tuning inaccuracies. There’s more leeway here. Electroacoustic music has never been closely associated with tonality, but satisfactory intonation is also about expectations. If the étude begins with obviously pitched materials in a melodic gesture, listeners will assume that tonality is in play and judge the piece accordingly. You must be aware of the pitch relationships in the material and avoid embarrassing juxtapositions. Often this arises from ignoring my earlier advice and creating melodies when recording your source sounds. The intervals found on a set of spoons will probably not match those on a set of pan lids, so some notes would have to be retuned. Level mismatches. The DSP effect that should be used the most is gain change. Level is one of your most powerful tools for expression. In traditional music, phrases are often marked by a subtle (or not so subtle) arc in volume. The top of the arc is given to the most important note. Within this arc, beats may be slightly emphasized without changing the overall form. Extreme changes between phrases adds drama, but be careful not to drop the volume too low for the likely playback situation. A piece destined for the concert hall can have more dynamic range than one played over radio.


Awkward use of silence. The étude does not have to be a continuous wash of sound. In fact, many composers argue that silence is the most important material. But one quirk that is common in beginner’s work is a pause near the end of a piece where half the audience thinks the show is over. It’s hard to pinpoint exactly why this happens, but it often occurs when there has been no silence up to that point. This suggests that if silence is used in the piece, it should be introduced early on. Another detail about silence that should be kept in mind concerns the nature of the silence. When composing on tape we used blank magnetic tape for silence, rather than paper leader tape. The blank tape produced a subliminal hiss that matched the background of all the other sounds. If paper leader was used, the transition from sound to silence and back made the hiss really obvious. In the modern studio, we are working with much quieter recordings most of the time, but there is still an audible background. When you gather sounds, record a bit of room tone—the sound when nothing is going on. If you use this for silent moments, they will be perceived as part of the music, not an interruption. Bad splices. Digital editing is a lot easier than old-fashioned tape editing, but mistakes still happen. The most common goof is too much material included in the pasted section, that is, some parts of previous and following sounds tagging along. The most common result is pops or ticks. Repair is easy; simply select the offending material and silence it. Don’t delete it because that will affect rhythm. The complementary mistake—loss of part of the desired sound— is more serious. If the original material is still available, the copy-and-paste operation needs to be redone, with adjustments to time before and after if necessary. If the original is no longer available, pops can be ameliorated with short fades. Bad source recording. Problems in a recording are really highlighted in a new context such as a composition. In particular, background noise will pop up like a rude word in church. Beginning composers often miss noise when listening to the source material because it is there throughout the recording. Such material may be usable in a thicker texture, but when spliced into silence the faintest extraneous component will be obvious.

ADVANCED TECHNIQUES The development of any composition follows the above procedure: planning, assembly and critique. You should produce as many études as you can, gradually working with more material and in longer time frames. Sometimes you will readdress an idea with a slightly different point of view, sometimes you will tackle something brand new. The following are some tips to help with common problems.


Some Methods to Achieve Rhythmic Accuracy It is difficult to assemble a composition out of varied sounds and obtain the kind of rhythmic feel we are used to from human performers. This seems strange, given the precision available from any editing application, but that precision can be hard to realize. The position of the cursor can usually be read to the thousandth of a second (a millisecond or ms), but that is not a musical measure. A bit of math is required to arrive at more traditional expression in beats. The first step is to determine how many milliseconds there are in a beat. This is found by dividing the tempo in beats per minute into 60,000 (the number of milliseconds in a minute). As an easy example, a metronome setting of 120 yields 500 ms per beat; to set out a string of eighth notes at M.M. = 120, sounds must be inserted every 250 milliseconds. Some editing software simplifies the process by displaying time in bars and beats. But this display may be deceiving. Unless you are at a high resolution, the image of the waveform is only an approximation of the sound. In particular, the span of a pixel (which limits the accuracy of a selection) may be 30 ms or so. That’s a sixty-fourth note. Therefore, when inserting at specific times it is important to be zoomed in on the selection. Often a newly inserted sound will seem to be late. That’s because the perceived beat of a sound can occur well after the beginning of the clip. Consider the sound illustrated in Figure 4.7 (DVD example 4.5). This is an extreme case, a rubbed balloon, but subtler versions of such sounds are common. To sound rhythmically tight, the spike in the middle of the sound should be lined up on the beat. This is easy to fix: select a chunk of the silence just before the sound and delete it to move the sound forward. At first it will take several tries to get it right, but you will soon learn to judge the appropriate amount to delete. It’s important to do this as you go; once the following sounds are placed this trick won’t work. We are likely to be disappointed when perfect precision is achieved. The word for a rigid performance is mechanical, and it is aptly named. Machines know only two ways to divide time: on exact ticks or unpredictably. Neither of these is the way humans play music. Accented notes are held a bit longer, pickups or other anticipations are shortened, and the tempo ebbs and flows throughout a phrase. It takes a good deal of study to convert that into milliseconds for insertion points, which probably explains the popularity of systems that let you play the notes into the computer. We’ll explore such systems in later chapters, but here is a way to achieve natural rhythms in musique concrète: Record the rhythm of the piece into an empty track using a woodblock or similar short sound and beating out either a metronome track or the actual rhythms intended. When you look at the resulting recording you will see a framework of spikes that clearly mark the beats. Replace the spikes with the desired sounds and you will have a fluid, humane performance.
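
The beat arithmetic is worth keeping at hand. Here is a tiny Python sketch of the calculation described above; the tempo and subdivision values are just examples.

def ms_per_beat(bpm):
    # Milliseconds per beat: 60,000 ms in a minute divided by the tempo.
    return 60000 / bpm

def beat_grid_ms(bpm, beats, subdivisions=1):
    # Insertion times in milliseconds for `beats` beats, optionally subdivided.
    step = ms_per_beat(bpm) / subdivisions
    return [round(i * step, 3) for i in range(beats * subdivisions)]

print(ms_per_beat(120))           # 500.0 ms per beat at M.M. = 120
print(beat_grid_ms(120, 2, 2))    # eighth notes every 250 ms: [0.0, 250.0, 500.0, 750.0]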



FIGURE 4.7 The beat point in a complex sound.

An Exercise to Illuminate the Effects of Rigidity Music of all styles is full of repeated phrases. With the editing power at our fingertips it is tempting to use copy and paste to clone any repeats we may need. In fact, I often hear pieces which are nothing more than four measures of material looped for three or four minutes. Here’s an exercise to demonstrate what is lost by taking this easy way out. Create an interesting but simple two-bar pattern using three or four sources. Copy and paste it fifteen times to make a 32-bar section. Put the editor into loop mode and listen to it four or more times. Now record the rhythm of all 32 bars using the woodblock trick. Replace the ticks with the same sounds used before. This seems like a lot of work, but it will go quickly if you do it sound by sound, placing each sound in every measure before picking up the next one. Listen to this a few times. Which keeps your attention better? Probably the second version. The slight variation in timing will be enough to keep each measure fresh. When we clone patterns we are working against a powerful perception effect in the brain. If we hear a continuously repetitive sound, it fades from our consciousness. I’ve discussed how we can become unaware of constant sound, but it can also happen with predictable sounds, even loud ones like the ticking of a clock. Features of music that repeat exactly tend to do the same thing—they disappear from our active awareness. There may be times when you want to do this on purpose, but it should be an effect that is under your deliberate control.


You can also add interest to repetitive patterns by modifying sounds here and there. Once the loop is built, select a single sound in each measure and change its level by a few dB. Make loud sounds louder and soft sounds softer. Go through again and apply equalization to the occasional beat, changing the high or low end just a bit. Don’t change anything too much, you are just mimicking what happens when (for instance) a drum hit is a few inches off. DVD example 4.6 demonstrates this technique: it starts with eight identical repetitions, then has eight modified ones.

New Sounds from Splicing In chapter 11 we will study synthesis, and you will discover how much a sound is influenced by the attack. This effect is not just found in synthesis however. We can explore it by splicing to create hybrid sounds. DVD example 4.7 shows how. First we hear the sound of water shaken in a bottle, followed by the sound of a fingernail drawn along a piano string. Then a composite sound is made by splicing one of the sloshes to the ring of the string. A second example combines the piano scratch with the sound of a squeaky door. To make this sound, I placed the clips one after another in the same file, then made a big cut from just after the attack in the first sound to somewhere in the second sound. This is guaranteed not to work the first time. Success calls for a fairly tedious process of removing sections of varying length in the region where the sounds overlap. Often gain change will have to be applied to one or the other sample to get a smooth transition. In the end, little of the first sound will be left, just enough to provide the attack for the composite sound. We can also make composite sounds with the mix process. Copy a sound to the clipboard, and it will be mixed into a selection in the file. This can be done much better with a multitrack workstation, but some interesting sounds can be produced by this quick and dirty technique. Many creature sounds in science fiction movies are made by just this method, combining such things as the bellow of an alligator with the roar of a lion. It is also interesting to reverse a sound and mix it with the original, or to combine a mix with a pitch change.

Speed Change

That piano scratch from DVD example 4.7 is an excellent one for demonstrating another standby from the tape era—variable speed playback. Most tape recorders worked at two or three speeds, trading fidelity for economy of tape. (Available speeds ranged from 15/16 inches per second on slow cassettes to 30 inches per second on professional grade machines.) Simply recording at the higher speed and playing back at the lower gave a one octave drop in pitch at half the tempo.


Another octave could be produced by copying the tape to another machine. The effect is demonstrated in DVD example 4.8, where the piano scratch is first shifted up and then down. Editing programs usually give the option of changing the pitch and tempo independently, something not really possible mechanically. These are all options to explore, but I still find the combined pitch and speed change most interesting. You should listen to all of your sound library at various speeds (just process the whole file, then undo). One thing you should notice right away is that shifting pitch up has a peculiar effect on timbre, especially on vocal sounds. You will remember a discussion of formants from chapter 2. Most sounds can be thought of as a tone applied to a resonator. The tone may vary in pitch, but the resonator always affects components of a specific frequency range. A change in playback pitch changes not only the driving tone, it changes the frequency of the formants. Any extreme change will render the sound unrecognizable. When applied to voice, this is called the chipmunk effect (after a popular album of the ’50s) or Munchkinization (after its use in the 1939 movie The Wizard of Oz). Lowering the pitch not only produces deep dramatic sounds, it can reveal components that go unnoticed in the original. Listen to DVD example 4.9: we first hear a dog bark at normal speed, then progressively slower versions transform the tinkle of the dog’s license tags into the tolling of bells. Tape decks found in electronic music studios often were modified to provide constantly variable speed, which can produce glissandos and glides. This is hard to find in the digital age, but SoundHack by Tom Erbe works well on Macintosh. It will change speed according to a complex curve as illustrated in DVD example 4.10. The same functions are available for a modest price as part of the Composers Desktop Project, a set of processing programs by Trevor Wishart.
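
The tape-style relationship between speed, pitch, and length is simple to model: reading the samples out at a different rate changes all three together. The Python/NumPy sketch below uses crude linear-interpolation resampling, purely to illustrate the idea; real programs such as those mentioned above use far better algorithms.

import numpy as np

def tape_speed_change(samples, ratio):
    # Play a recording back at `ratio` times its original speed by resampling
    # with linear interpolation. Like a tape machine, this shifts pitch and
    # tempo together: 2.0 is up an octave at half the length, 0.5 is down an
    # octave at twice the length.
    old_positions = np.arange(len(samples))
    new_positions = np.arange(0, len(samples) - 1, ratio)
    return np.interp(new_positions, old_positions, samples)

tone = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)    # one second of A440
octave_down = tape_speed_change(tone, 0.5)                   # about two seconds of A220
print(len(tone), len(octave_down))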

A CASE STUDY

Planning

The chapter 4 folder on the DVD has the source files and final version of a species I étude. This was composed using the following constraints: a duration of 30 to 60 seconds using a few simple sounds modified by pitch and gain changes. The source sounds were recorded from some implements from my kitchen drawer. In the order heard, they are:
A cheese spreader with a stainless steel blade. This has a flat wooden handle. If I clamp the handle to the edge of the table and snap the blade with my thumb, there’s an intense twanging sound. If I reverse the operation and hold the blade on the table, the twang becomes a long rhythmic pulsing familiar to


every fourth grader with a ruler. By sliding the vibrating instrument back from the edge I can produce a gentle glissando (DVD example 4.11). A tomato slicer with ten taut sawlike blades. The blades can be picked independently to produce pitches (approximately) from A to F. The sound is fairly loud if you hold the end of the handle firmly against a table. There are several ways to get sounds out of this: picking with fingernails, strumming the back of the blades with the thumb, or scraping a wooden spoon along the business side (DVD example 4.12). An aluminum roast holder. This is constructed like scissors with nine curved fingers on each half to hold the roast and guide a knife blade for even slices. The tines can be strummed or played with a mallet. Snapping the handles closed produces a long ring. There are a lot of harmonics on each tine, but the consensus pitch is a D (DVD example 4.13). A china bowl, about seven inches across. This sounds a clear B with overtones of G when struck with a soft rubber mallet (DVD example 4.14). I recorded each instrument a few dozen times, varying the playing technique and orientation to the microphone. These sounds are quite faint, so I had to shut off every noisy device in the room and position the microphone within two or three inches of the objects. I used a mic with switchable polar patterns, choosing the omni response because the cardioid produced too much proximity effect when placed so close to the objects. The result was two or three files per source, each containing two to three minutes of sound. I spent a good hour just listening and experimenting with the sounds before even thinking about how the piece should be organized. Many of my colleagues recommend composing the ending of a piece first. I agree, but I usually find that the beginning and ending occur to me simultaneously. I like to use my strongest material last, but commercial wisdom has it that you should put everything into the opening (on the grounds that record producers only listen to the beginnings of songs). Similarity in the beginning and ending satisfies everyone. The middle is not so cut and dried, but my usual approach is to explore contrasting material that is somehow related to the beginning. For the kitchen étude, I chose the cheese spreader for the beginning and ending, with the bowl for contrast.

Assembly

DVD example 4.11 shows the processing used on the cheese spreader. The original recording is first, followed by a gain change, then a pitch drop of one octave. I unchecked the “preserve duration” option on the pitch change because I wanted the rhythm to slow down. (Similar processes were used on the other sounds, given in DVD examples 4.12, 4.13 and 4.14.) The stretched sound was four seconds in


duration, and I can almost hear four beats in it, so the tempo became 60 bpm. (Since many editors don’t display time in beats, I’ll use seconds to refer to events in this discussion.) I set the editor for 120 to give myself more markers but didn’t pay too much attention to the bar and beat numbers. I let the sounds determine the phrasing of the piece, starting each sound close to a beat marker when the previous sound had an indefinite end. The opening cheese spreader clip has a rising glissando. This sets up an expectation for something, which I fulfill with a note from the bowl. I tried several pitches for this note, finally settling on a D to match the material coming up. The bowl is followed by one of the better scrapes on the tomato slicer. The second phrase (at 6.5 seconds) reprises the opening with the cheese spreader and bowl, this time on C. The return of the slicer is a different sample with a serendipitous rhythm made by strumming across the blades (DVD example 4.12). I spliced the scrape at the final note of this pattern, then interrupted the scrape with a reversal of the spreader. It took five or six tries to find a smooth join between the two sounds. I usually do this kind of fiddly construction in a second window and paste the entire result into the main window. This gives me two windows with simpler edit lists, and if I decide later to change the inserted part, I won’t have to sacrifice any later edits in the main window. The only trick is to put off saving the second window, since that closes the edit list. The composite pattern leads to a low G from the bowl, which is the beginning of the middle section (at 13 seconds). I played around with several ideas, including more rhythmic editing and variations on the cheese spreader. Eventually I wound up throwing out about three hours’ worth of work. The final version is based on a sound from the roast holder (DVD example 4.13). This was just a quick swipe across the tines, but shifted down three or four octaves it becomes a majestic D chord. This material evolves slowly, so a few events fill the allotted time. I used pitch change and duration change independently to build the simple melodic pattern that takes the piece to the 33-second mark. A reprise of the tomato slicer lick (if something turns out that well, why not use it twice?) leads to the grandest version of the roast holder sound, which lasts more than 12 seconds, nearly a quarter of the piece. The ending, starting at second 48, is the cheese spreader with the volume reduced to match the tail of the roast holder. Any splice here sounded awkward, so I used the “mix from clipboard” feature to layer the spreader on top of the tail. The same technique finishes the piece with a dainty version of the bowl. The entire piece is heard on DVD example 4.15.

Critique My critique is ongoing. As I write, I have listened to the piece several times and made minor tweaks, fixing those pitches I mentioned and trying (and discarding)


changes to the rhythm. There are some things that could be better: the sound of the bowl is a bit thin compared to everything else, and the glissando of the cheese spreader is not as smooth as it could be. But the piece works musically, so these are just things to watch for as I move on. I spent about six hours on this project, and I expect students to do it in about eight hours. The majority of the time is spent listening to sounds that won’t be used, and no amount of practice or experience can speed that up.

RESOURCES FOR FURTHER STUDY

There’s not much to read about editing or musique concrète composition. There were several textbooks published in the 1970s, so it’s worth a trip to your library to see what is buried in the stacks. The most important reading you can do is the manual for your software. Here are a few old titles with practical advice:
Appleton, Jon, and Ronald Perera, eds. 1975. The Development and Practice of Electronic Music. Englewood Cliffs: Prentice Hall.
Dwyer, Terence. 1971. Composing with Tape Recorders: Musique Concrète for Beginners. New York: Oxford University Press.
Ernst, David. 1972. Musique Concrète. Boston: Crescendo Publications.
Judd, F. C. 1961. Electronic Music and Musique Concrète. London: N. Spearman.
Keane, David. 1980. Tape Music Composition. New York: Oxford University Press.
This is a good point to do some listening, however. Now that you know how to edit, listen to some classic electronic music and see if you can spot the techniques in use. The best resource for finding classic musique concrète is the online store of the Electronic Music Foundation (www.emf.org) at www.CDeMusic.org. A good place to start is the collection Ohm: The Early Gurus of Electronic Music, 1948–1980 (Ellipsis Arts, 2000). Here are some other interesting recordings:
Arel, Bülent, et al., Columbia-Princeton Electronic Music Center: 1961–1973 (New World Records, 1998).
Henry, Pierre, Mix Pierre Henry 03.0 (Philips, 2001).
Lansky, Paul, Homebrew (Bridge, 1992).
Lucier, Alvin, Vespers and Other Works (New World Records, 2002).
Mumma, Gordon, and David Tudor, Gordon Mumma and David Tudor (New World Records, 2006).
Oliveros, Pauline, Electronic Works (Paradigm, 1997).
Stockhausen, Karlheinz, Kontakte (Wergo, 1960).
Westerkamp, Hildegard, Transformations (Empreintes Digitales, 1996).


FIVE Processing Sounds

After mastering the art of accurately capturing and editing sound, we are ready to explore techniques for modifying sound in interesting ways. Deliberately changing the frequency or dynamic makeup of a sound (or signal) is called signal processing. In the analog days signal processing was done by specialized devices placed in the signal path. The digital world still features stand-alone equipment, but most processing is done in the computer, either by specialized applications or by chunks of code that plug into the editor or digital audio workstation (DAW). Processing sounds or sound files is commonly referred to as adding effects or, for marketing hype, "FX." It is an interesting quirk of the English language that audio effects are processes that affect sound while sound effects are something entirely different.

There are two reasons we reach into the box of signal processing tools. One is to fix problems with the recorded source material. It is not always possible to capture the perfect recording, so some takes are going to have background noise, some will have frequency response problems, others will have tempo problems or be out of tune. There is an impressive selection of tools designed to address this kind of problem. Most of these are aimed at the recording business and come as accessories to popular production packages such as Pro Tools. The second reason to modify recordings is to create sounds unlike those heard before. We can score songs for a dragon's voice, give a melody line to the bass piccolo, or play with sonic texture for the fun of it. Some of these extreme operations can be done with common recording plug-ins, but this quest will eventually introduce us to more exotic and experimental software.

The method for interfacing plug-ins to host applications is established in several standards and allows some level of interchangeability. For instance, the interface known as Virtual Studio Technology (VST) is used to make plug-ins for Steinberg programs such as Cubase and WaveLab. Many other editors are capable of hosting VST style plug-ins, so your plug-in library becomes available in many places. VST and Real-Time AudioSuite (RTAS) plug-ins are available on both Windows and Macintosh platforms. DirectX plug-ins are specific to Windows and Audio Units (AU) are exclusively for Apple machines. The route to loading a plug-in may be a bit complex. For instance, in Peak LE, they must be chosen per "insert," a term that derives from patching to mixing consoles.

FIGURE 5.1 Finding plug-ins in Peak LE 6.

Figure 5.1 shows how these are accessed in Peak LE version 6. The plug-in menu contains three inserts, which determine the order in which processes are applied. For each insert, a submenu groups plug-ins by format, and they may be further grouped by manufacturer’s package before you get to a list of actual effects. Programs that give you a virtual mixing window, such as Pro Tools or Logic Audio, have similar menus in each track. In any case, the plug-in, once chosen, is loaded into computer memory and begins to run as a part of the application. The plug-in may be heard on all audio played, applied to audio as it is recorded, or applied as a process to selections of a displayed file. I prefer to use plug-ins by applying them to selections. This allows me to try processes on a temporary basis and undo them if things don’t work out. In Peak LE, this is performed by choosing the Bounce option at the bottom of the plug-in menu.

MODIFYING FREQUENCY RESPONSE

Devices and software that modify the relative frequency content of audio are known as equalizers. This is often shortened to the acronym EQ, which can be used as a verb or a noun. The most common types of EQ are the bass and treble controls found on most stereos and television sets. These allow you to give a gentle boost or cut to the highest or lowest regions of the audio spectrum. To speak authoritatively about EQ you must specify two parameters—what frequency region is affected and how it is changed. Thus, for example, turning up the bass on a stereo increases the amplitude of sound components on or below the bass clef.


The electronic circuit used for EQ is a filter, which has an associated frequency known as the cutoff frequency. This arises from the specific components used to build the filter and the settings of its controls. A filter may be designed to reduce signal below the cutoff frequency, in which case it is known as a high-pass filter. Alternatively, a filter may be low-pass, reducing signal above the cutoff. The term cutoff is a bit optimistic. What actually happens is signal at the cutoff is slightly affected, reduced by 3 dB. An octave beyond the cutoff (in the appropriate direction) the signal may be reduced 6, 12, or 24 dB, depending on the design of the filter. At two octaves, the reduction is another 6, 12, or 24 dB, and so on, with the process continued through all of the audible octaves. The rate of reduction is called slope. The 6 dB step comes from the physics of filter circuits. Addition of a filtering element to the circuit cuts the signal amplitude, doubling the slope. You might hear these filter elements referred to as poles, so you'd expect a two-pole design to have a 12 dB per octave slope. The number of poles is often called the order of the filter. In analog circuits high-order filters require more parts, which makes them more expensive. Figure 5.2 shows the frequency response of two typical filters. Plots like these are produced by sweeping a sine tone across the entire audible range and graphing the results.

FIGURE 5.2 Low-pass and high-pass filters.

High-pass and low-pass filter elements can be combined to make a band-pass or band-reject (often called notch) filter (Figure 5.3). The effect depends on the relative cutoff frequencies. A high-pass element with a cutoff lower than a low-pass element results in a band-pass device, and the reverse gives a notch. In this case, the filter is described in terms of a center frequency halfway between the cutoff points, and a bandwidth, which is the difference between the cutoff points. A third description, called the quality factor or simply Q, describes the cutoff slopes in terms of the ratio between the center frequency and bandwidth (see Figure 5.4). High Q values produce slopes that are sharp near the cutoff frequency, but revert to the normal rate for the order after an octave or so. Since a band-pass filter requires separate elements for the high and low sides, a second-order band-pass will have a slope of 6 dB per octave.

FIGURE 5.3 Band-pass and notch filters.

FIGURE 5.4 The effect of Q on band-pass filters.

Another combination of high-pass and low-pass filters produces a shelving equalizer, in which all signals above or below the specified frequency are boosted or reduced by approximately the same amount. This is the circuit found in stereo tone controls. The frequency specified for a shelving equalizer is the point where a cut is 3 dB less than the maximum. This is called the turnover or corner frequency. Figure 5.5 illustrates shelving.

FIGURE 5.5 Shelving equalizer.

Strange as it may seem, there is such a thing as an all-pass filter. An all-pass filter does not affect amplitude, but it does change the phase of low-frequency signals. The point at which the phase is changed by 90 degrees is also called the corner frequency. The change is gradual above and below the corner frequency and is often shown as a phase response graph. The effect of an all-pass filter is rather subtle: transients with wideband content will be smeared in time. This softens percussive hits somewhat.

The terms low-pass, high-pass, and so on, are just general descriptions. Other aspects of the design have effects on the details of the frequency response curves. For instance, it is possible to design a filter that has a sharper than expected drop in the first and second octave in exchange for a bit of irregularity in the pass band or emphasis at the cutoff frequency. Another factor that turns up in filter design is
phase change above and below the cutoff. It takes a textbook to describe all of this, but the upshot is that two filters with the same parameters of frequency and slope may sound quite different. Hardware equalizers that control frequency response in a complex way are traditionally built on one of two models. The first is a group of band-pass filters that are spaced to cover the entire audible spectrum. These may be tuned in octaves or have two or three sections per octave. For each filter there is a simple gain control and when all gains are set at the center of the range, the signal will pass through unmodified. The most popular models use sliders to set gain. These are laid out in a row so that you can set the sliders to outline the frequency response you want. This design is called a graphic equalizer. The second model is based on the idea that the most common use of EQ is for minor modification to two or three frequency regions. This requires only two or three filters if the center frequency and bandwidth are adjustable. This more streamlined (and cheaper) design is called parametric. Plug-in companies follow the hardware paradigms closely, often right down to providing a shiny panel with the company’s logo. A third model is available only in software. A technique called the fast Fourier transform (described in chapter 13) allows the signal to be decomposed into a thousand or more frequency bands and reassembled as you like. With this technique, you merely draw the desired frequency response curve. It’s often called an FFT filter after the acronym of the algorithm.
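If you would like to see these slopes as numbers rather than curves, here is a small Python sketch of the idea. It is my own illustration, not part of the DVD materials; it assumes NumPy and a recent SciPy are installed, and the 48 kHz sample rate and 1 kHz cutoff are arbitrary choices.

# Sketch: relate filter order ("poles") to slope in dB per octave.
import numpy as np
from scipy import signal

fs = 48000          # sample rate in Hz (arbitrary)
cutoff = 1000       # low-pass cutoff in Hz (arbitrary)

for order in (1, 2, 4):
    b, a = signal.butter(order, cutoff, btype="low", fs=fs)
    # Evaluate the response at the cutoff and one and two octaves above it.
    freqs = [cutoff, 2 * cutoff, 4 * cutoff]
    _, h = signal.freqz(b, a, worN=freqs, fs=fs)
    gains_db = 20 * np.log10(np.abs(h))
    print(f"{order}-pole: " + ", ".join(f"{f} Hz = {g:.1f} dB" for f, g in zip(freqs, gains_db)))

# Expected pattern: roughly -3 dB at the cutoff, then about -6 dB per octave
# for each pole, so the 4-pole filter falls about 24 dB per octave past the cutoff.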

Graphic EQ

To practice equalization, use DVD example 5.1 (pink noise) as the source. This has a wide spectrum, as shown in Figure 5.6, so response changes should be readily apparent. Let's try a graphic style EQ first. Figure 5.7 shows a typical graphic EQ. It has thirty-one bands spaced three to the octave. The center frequency of each band is shown below each slider. The level of the slider is shown by the numbers on the right.

FIGURE 5.6 Spectrum of pink noise.

FIGURE 5.7 Graphic EQ plug-in.

DVD example 5.2 has a sound of a running faucet modified in various ways, finishing up with the settings shown in Figure 5.8, which also shows the final spectrum. The animated DVD example 5.3 shows this EQ in action. These are drastic modifications that pretty much isolate different tones within the original sound. Note that the biggest effect is achieved by suppressing the unwanted bands and raising the wanted ones. This gives a hint as to the action of this equalizer. Try this yourself, using pink noise (DVD example 5.1) as an input. Note how isolated bands produce fairly distinct pitches. Many early electronic music studios had keyboards hooked up to graphic equalizers so that these pitches could be performed.

FIGURE 5.8 Graphic EQ settings for DVD example 5.2.

Parametric EQ

Now let's see what the parametric style EQ can do. Figure 5.9 shows a five-stage model. For the middle sections, you have three parameters: gain, frequency, and
bandwidth (which is sometimes labeled Q). Operation is simple: Set the frequency to the value of interest, and adjust the gain for desired boost or cut. Adjust the bandwidth according to the task at hand. Wide bandwidth provides subtle coloration, while narrow bandwidth is used to grab a tight frequency range. You can tighten the frequency range further by setting two stages to the same frequency. The first and last filters do not have a bandwidth control; they affect everything below or above their frequency, respectively. The end filters have a mode control,

which chooses between cut or shelf mode. With all of these options, parametric EQs can generate quite complex curves. Figure 5.10 shows the settings heard at the end of DVD example 5.4. Many plug-in EQs are even more elaborate than this one. You will see a wide variety of skins and control types. Some show the frequency response curve and let you adjust the curve directly. The animated DVD example 5.5 shows a particularly colorful EQ that displays the curve produced by the interactions of each individual stage.

FIGURE 5.9 Parametric EQ plug-in.

FIGURE 5.10 Parametric EQ settings for DVD example 5.4.

FFT EQ

The FFT equalizer has no mechanical counterpart. It can be viewed as a graphic EQ with thousands of bands or simply as equalization by drawing curves. In any case, it provides a detailed control of frequency response. There is one important difference between FFT EQ and more traditional filters: FFT curves are often shown with equal frequency spacing across the window, whereas graphic EQ and parametric curves have equal octave spacing. You will remember from the discussion in chapter 2 that the octave spacing (also known as logarithmic) gives equal width to each octave and mimics the way our ears work. If the FFT display is linear, everything to the right of the center is a single octave, the highest. This means most adjustments are found in the extreme left region and are often difficult to refine.

Some FFT EQ plug-ins perform a "spectrum matching" function. Using a model signal you determine the frequency characteristics you want to match, and the plug-in performs an overall analysis. You then introduce the signal to be modified, and the plug-in derives a compensating curve that will give the second signal the overall shape of the
first. A more useful version is designed for cleaning up a noisy recording. It analyzes the noise in a quiet spot and calculates a curve that will reduce those noise components from the entire track.
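The heart of an FFT equalizer can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular plug-in works: real FFT EQs process short, overlapping windows rather than the whole file at once. It assumes NumPy, a mono array called samples at rate fs (however you load it), and the example curve values are made up.

# Sketch of the idea behind an FFT filter: transform, scale each bin by a
# drawn gain curve, and transform back.
import numpy as np

def fft_eq(samples, fs, curve_hz, curve_db):
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / fs)
    # Interpolate the hand-drawn curve (frequency/dB pairs) onto every bin.
    gain_db = np.interp(freqs, curve_hz, curve_db)
    spectrum *= 10 ** (gain_db / 20)           # convert dB to linear gain
    return np.fft.irfft(spectrum, n=len(samples))

# Example curve: cut everything below 100 Hz and add a gentle 3 dB bump at 2 kHz.
# processed = fft_eq(samples, fs, curve_hz=[0, 100, 1000, 2000, 8000, 20000],
#                    curve_db=[-40, -40, 0, 3, 0, 0])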

Using EQ

EQ is such a fundamental tool that you will be applying it almost everywhere. You should master it by experimentation. Simply play with EQ settings on all manner of
sounds and try to predict what will be heard before you apply the effect. As you do this, you will probably confirm the following observations. The same EQ settings applied to a sound by different brands of EQ often sound different. There are differences in algorithms and circuits that affect the results, not to mention unavoidable computational errors and mistakes in design. You can sometimes EQ out undesirable sounds. For example, if you record a voice near an idling truck, some low cut will remove much of the truck while leaving the voice alone. A high-pitched whine may be erased with a tight notch filter. Hint: to zero in on an unwanted sound component, boost the gain on a parametric section and adjust the frequency until that pitch is most prominent. Then reduce the gain all the way. More often, EQ will bring out undesirable parts of sounds. For instance, boosting bass will reveal any hum in the recording. You can’t EQ hum away, because most hum is actually several harmonics on 60 Hz (50 Hz in many countries). Wind noise, caused by wind blowing directly on the microphone, is another unwanted sound you can’t touch with EQ. Cutting highs is usually uncontroversial. Most people prefer less high frequency than the world supplies, and many just can’t hear it. Cutting highs often reduces hissy noises such as air conditioning vents. The high end does matter on certain rich instruments such as cymbals, and too much high cut will affect the impact of attacks. High cut is really sensitive to the quality of the EQ, so experiment with several. You can use EQ to compensate for the quirks of many microphones. This is what it was originally designed to do. My test for adjusting microphone EQ is simple. I talk to the talent with the recorder rolling and capture their normal conversation, then adjust the EQ so their speaking voice sounds normal. That’s probably the best setting for the instrument also. EQ can be used to increase distinction between sounds. For instance, to bring the vocal track out in a dense mix, I will often boost the voice about 3 dB somewhere between 1 and 3 kHz, then cut competing sounds 3 dB at the same frequency. This is not an obvious change on any track, but it gives 6 dB of separation in the band critical to understanding lyrics.
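As a concrete example of that hint, here is a small Python sketch that applies a tight notch at a frequency you have already found by sweeping a parametric boost. It assumes SciPy and a mono float array called samples at rate fs; the 1780 Hz value in the comment is a made-up example, and hum would need one such notch per harmonic (120 Hz, 180 Hz, and so on), which is why EQ rarely removes it cleanly.

# Sketch: a tight notch for a steady whine. The Q value sets how narrow it is.
from scipy import signal

def notch(samples, fs, freq_hz, q=30):
    b, a = signal.iirnotch(freq_hz, Q=q, fs=fs)
    return signal.lfilter(b, a, samples)

# cleaned = notch(samples, fs, 1780)   # hypothetical whine frequency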

Vocoder

The most complex filter we will encounter is the vocoder. This is a multiband filter, much like a graphic EQ, but the gain of each band is controlled electronically. The signal that controls each band comes from a matched set of filters that are measuring the content of a control signal. The control signal is never heard. Instead, a second signal, called a carrier, is applied to the first multiband filter. The result is the
frequency response of the control signal imposed on the carrier. If the control is speech, the output will sound like the carrier is speaking or singing. The original analog vocoders were limited to eight bands, but digital vocoders can be as precise as you like. DVD example 5.6 demonstrates a simple vocoder. First you hear the carrier, then the control signal, then the carrier processed by the control.
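For readers who like to see the signal flow spelled out, here is a rough Python sketch of a channel vocoder along the lines just described. It is only a sketch, not a polished effect: it assumes NumPy and SciPy, two equal-length mono float arrays called control and carrier at sample rate fs, and the band count, band edges, and envelope smoothing are arbitrary choices.

# Sketch of a channel vocoder: a filter bank analyzes the control signal, and
# the measured band envelopes set the gains of a matching bank applied to the carrier.
import numpy as np
from scipy import signal

def vocode(control, carrier, fs, bands=16, lo=80, hi=8000):
    edges = np.geomspace(lo, hi, bands + 1)     # logarithmically spaced band edges
    out = np.zeros_like(carrier)
    for low, high in zip(edges[:-1], edges[1:]):
        b, a = signal.butter(2, [low, high], btype="bandpass", fs=fs)
        ctl_band = signal.lfilter(b, a, control)
        car_band = signal.lfilter(b, a, carrier)
        # Envelope follower: rectify, then smooth with a 50 Hz low-pass.
        be, ae = signal.butter(1, 50, btype="low", fs=fs)
        envelope = signal.lfilter(be, ae, np.abs(ctl_band))
        out += car_band * envelope
    return out / np.max(np.abs(out))             # normalize to avoid clipping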

MODIFYING AMPLITUDE

Adjusting the amplitude of a signal is so common that it's hardly worth considering as a special process. However, when we make a level change, either with a volume knob or an editing tool, we often discover the relationship between amplitude and loudness is not as simple as it seems. We saw in chapter 2 that the concept of loudness is based on the average level of sounds, but that applies principally to sounds that are fairly steady and have some duration. Short sounds, including percussion hits and spoken words, seem to follow different rules.

We have seen there is a limit to the amplitude the recorder can accept, so if we record something with a wide dynamic range, quieter sounds may be lost in order to prevent distortion on strong peaks. The final problem we encounter setting amplitude is the difficulty of predicting the future. Every recording engineer learns that a performance is going to be 6 dB higher than the rehearsal, because musicians play louder when the adrenalin flows. To handle all of this, we have a family of devices and software known as dynamics processors.

Limiting

Limiters are circuits that automatically turn the gain down if the signal rises above an adjustable threshold. This is like hiring someone to watch the VU meter and turn the recording level down when the needle exceeds a certain mark. It's not a tricky circuit to build, but there are a couple of details that trip us up. The first is what exactly to measure. It can't be voltage, because this swings rapidly between positive and negative in the waveform. Even the average voltage won't work, because that is nearly always zero. What needs to be measured is the average absolute value of the voltage, or alternatively, the average distance between the positive and negative peaks (this is what the meters show). The difference may seem minor, but it is audible on some materials, so many limiters have a switch for absolute or peak-to-peak detection.

The second problem is the time it takes to measure the signal. It can't be instantaneous, because the measurement must include a complete cycle. But at what frequency? A 20 kHz signal has a period of 0.05 milliseconds while it takes 50 milliseconds for a 20 Hz signal to complete a cycle. You might expect to go for the low pitch, but a lot can happen in 50 milliseconds, especially with percussive sounds,
which might be finished by then. So a short measurement period is used. But as a compromise, the circuit is designed to make a smooth transition to and from the limiting state. This transition is adjustable with a control that is usually labeled attack time. It must be fine tuned for the type of material to minimize audible effects.

The effect of limiting a signal is shown in Figure 5.11. Unlike clipping, where the tops of the wave are lopped off, the entire signal is reduced until the input falls to acceptable levels. A careful analysis would show there may be some distortion during the attack time. This can be minimized by setting a conservative threshold and a fast attack time, but the distortion is not such a terrible thing. The psychoacoustic perception of distortion is similar to the impact of the amplitude we are removing. The downside of too fast an attack time turns up when the material is a combination of percussive sounds and steady ones. You may hear those steady components fade back in after the percussion hits, an effect called pumping or breathing.

Limiting is not really possible in software, because strong signals will be clipped at the A-to-D converter. The software enters the chain of events too late. But Sony is now selling a recorder that captures each track at two different levels. Both recordings are stored in a buffer of a thousand samples or so before anything is written to permanent memory. The stronger signal is measured, and if it rises to the 0 dB point, the softer recording is kept instead. The transition between the two is managed so elegantly that the effect is inaudible. As this technique becomes more widespread, limiters will become less necessary.

Compression

As you listen to a limited track, it probably sounds a bit louder than the same recording would have without the limiter. That's because the limiter allows the whole performance to be recorded at a good strong level, and the soft parts are boosted with the rest. For more obvious effects on loudness we turn to a device called a compressor. Some audio purists avoid using them, but compressors can create interesting electroacoustic material, and artful use of compression has become an important part of contemporary recording practice. (It's important not to confuse this type of compression, called dynamics compression, with the data compression used to create file formats like MP3. There is no connection between the two.)

A compressor is built around the same gain change circuit that is found in a limiter. For limiting, the gain reduction has to be extreme to prevent any excursion of the signal above the threshold. In compression, the gain change is adjustable. A few models simply have knobs labeled less and more, but the best compressors let you specify the gain change as a ratio. A compression ratio of 2:1 means that a signal that would be 6 dB above the threshold comes out 3 dB above. The effect of this is that a passage that originally registered between -18 dB and -3 dB on the meter would swing from -18 dB to -9 dB if the threshold is set at -15 dB. You can then boost the track 6 dB or more. An output level control is provided for just this purpose.
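That ratio arithmetic is easy to check with a few lines of Python. This is only the static math from the paragraph above (attack and release behavior is ignored), and the default threshold and ratio mirror that example.

# The static compressor math: levels above the threshold are scaled by the ratio.
def compressed_level(input_db, threshold_db=-15.0, ratio=2.0):
    if input_db <= threshold_db:
        return input_db                      # below threshold: unchanged
    return threshold_db + (input_db - threshold_db) / ratio

# A passage that ran from -18 dB to -3 dB with a -15 dB threshold and 2:1 ratio:
print(compressed_level(-18.0))   # -18.0  (untouched)
print(compressed_level(-3.0))    # -9.0   (12 dB over threshold becomes 6 dB over)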

FIGURE 5.11 The effect of limiting on waveform.

Another difference between compressors and limiters is a separate control for release time. This means that once the circuit begins reducing the signal, that reduction is continued for up to a second after the input falls back to the threshold. The entire duration of a sound will be processed, not just the loud parts. The best way to learn the effects of each control is to experiment with a variety of sounds. For future reference, the following is a summary of compressor controls. Threshold. This is the trigger level. Signals stronger than this are affected, lesser signals are unchanged. Gain change is calculated from the difference between the measured signal and the threshold. The lower the threshold, the quieter the output will be. However, if the threshold is too low, the compressor will simply stay in the low gain state, and there will be little change to the dynamic range. It is the changing of gain that processes the sound. A meter that shows the amount of gain reduction is the key—when this meter is dancing, the signal is being changed. Attack time. Once the threshold is reached, compression turns on gradually over this period. If attack time is extremely short, the sudden change in gain might produce pops. This is especially pronounced on sounds that are nearly steady—the waveform may trigger repeated attacks. If the attack time is too long, the first part of the sound will slip through unprocessed. Ratio. Ratio is the amount of compression. A ratio of 1:1 is no compression at all, a ratio of 20:1 completely flattens the signal. Release time. When the input drops below the threshold, compression is turned off over this period. This amounts to a fade up as the gain gradually

increases to normal. This is normally adjusted to match the tail off of the sound. If release is too short, compression does not really happen unless the threshold is quite low. If release is too long, you may hear pumping. This can be especially noticeable on noisy sources such as analog tape or buzzy guitar amps.

Output level (or gain). Since compression reduces signal level, this provides the opportunity to get it back. The "wall of pain" sound is produced by setting a low threshold with a high ratio and turning the gain up all the way.

Figure 5.12 shows compression in action. This is a composite showing an artificial tone before and after compression. The original is a 440 Hz sine wave at -18 dB, with a 500 ms burst at 0 dB. The right side has 20:1 compression applied. The threshold is set at -16 dB, so the compression kicks in at the amplitude step. You can see the fade down of the signal over the attack time. Likewise, you can see the fade up of the signal after the input drops below the threshold.

FIGURE 5.12 The action of a compressor.

There are some before and after comparisons on the accompanying disc. DVD example 5.7 has original and compressed versions of three sounds, including the tone used for Figure 5.12. DVD example 5.8 is a glissando on a koto (a traditional Japanese string instrument). This produces a noticeably loud sound, even though it peaks below -7 dB. The second time there is heavy compression with a ratio of 20:1 and a threshold of -35 dB. This reduces everything to the level of the reverberant tail. Attack is 12 ms and release is 64 ms. The third time, the attack time has been reduced to 3 ms. You can hear that it is quieter as the quicker attack does a better job of reducing the peaks.

Some distortion is inevitable during compression. The amount and type depend on details of the circuit or software used and to some extent on the nature of the program material. Some vintage compressors are prized because of the nature of this distortion, and the most desired programs emulate classic hardware on a
component by component basis. If you find it difficult to get the gain control you want without abusing the character of the music, try another application.

Compression results in louder tracks, which can keep vocals understandable in a thick arrangement or make TV commercials annoying, but the uses go far beyond that. The following are some areas where compression is common.

Vocals

Vocals are compressed so often in contemporary recording that many people are forgetting what the uncompressed voice sounds like. There is always a wide variation in loudness. A single phrase, even a single word, can cover a dynamic range of 20 dB. Figure 5.13 shows the waveform of the word look. This has two main parts, "luh" and "k." You can see where the "k" starts with a new attack after a quiet spot. The difference between "luh" and "k" measures 18 dB. Many English words are like this, with an explosive beginning and a quiet consonant bringing up the end. In addition to this, the words of a phrase vary in basic strength, with emphasis on important words and a trailing off at the end. That's one of the things that makes music interesting. Of course, it also makes it easy to lose the vocals in a mix. This is a consequence of masking (see chapter 2), wherein a strong signal component will prevent a weaker component of nearly the same frequency from being heard. Compression of the voice track can prevent masking by reducing the overall dynamic range. Figure 5.14 shows the same word look after compression of 3:1. With the threshold and attack times used, the ratio between "luh" and "k" is 12 dB. DVD example 5.9 has a scrap of poetry with and without compression.

FIGURE 5.13 The word look.

FIGURE 5.14 The word look compressed.

Bass

As we also learned in chapter 2, the ear loses sensitivity to quiet sounds at low frequencies. The practical result of this is that variations in level are exaggerated on low notes. A bass will disappear as it is turned down and note-to-note loudness variation will be exaggerated. To compensate for this, some light compression is often applied to bass. A ratio of 3:1 is adequate to make up for the psychoacoustic effect and smooth out some of the difference between open and stopped strings. DVD example 5.10 demonstrates this.

Drums

It seems natural to record loud percussion through a limiter to reduce distortion. Since the drum is such a brief sound, you would expect a short attack time would be used. However, many engineers use a long attack time, perhaps 60 ms. This lets the initial pop of the stick through before the compressor acts. Only the ring of the drum head is reduced in gain. This gives the drum extra punch that will cut through a thick ensemble while reducing the ring which may conflict with other parts. DVD example 5.11 has a snare drum unprocessed, with light compression, and with heavy compression and a slow attack.


Sustain

If you heavily compress a guitar with a low threshold, you can adjust the release time to closely match the natural decay of the strings. This will give a sustain effect, making the notes last nearly forever. Of course you need a quiet room and amp, or the noise will be brought out with the last gasp of the string. This process can be applied to any string and many percussion instruments. DVD example 5.12 has the effect applied to a gong. The trick is to tweak the release time so there is no audible increase during the latter part of the sustain.

Sidechain Compression

Some compressors can be set up so one signal controls the process on another one. The separate input for a control signal is often called the side chain. The control may be an equalized version of the program material, which will make the compression sensitive to a defined frequency range, or it may be a different signal entirely. In that case, the control will turn the program down, a process known as ducking. This is commonly used on radio talk shows, where the moderator's voice reduces the level of the caller.


Stereo Compressors

Compression of a stereo signal must be controlled by a mix of both channels of the program. If one side were to go into compression before the other, the stereo image would suddenly shift toward the uncompressed side. Hardware compressors have a switch to link the two channels for stereo. Plug-ins are available in mono and stereo versions.

Multiband Compression

The compression discussed so far affects the entire signal, just like a volume control. This can lead to some unfortunate effects, such as a powerful bass line turning down a light piccolo obbligato. However, it is possible to combine compression and EQ to adjust the loudness of a signal selectively. In multiband compression, the compressor replaces the gain control of a parametric EQ. Each frequency band is compressed independently, which will affect the spectral balance of the sound. With heavy compression settings the effect is a drastic increase in loudness producing a solid wall of sound. This was first applied at radio stations to make all songs sound as loud as possible and was soon applied to rock albums for the same
reason. The electroacoustic composer is free to use the device in more subtle ways. For instance, a bit of limiting at high frequency can reduce the sibilance heard on ess sounds without affecting the rest of the speech. DVD example 5.13 shows a multiband compressor in action. I’ve applied it to a bit of gamelan music to make the effects obvious.

Expansion and Gating

It's simple to modify the compressor circuit so that signals weaker than the threshold are reduced. This results in an expansion of the dynamic range, something that has found little application in commercial music but can be quite interesting to the electroacoustic composer. When the expansion ratio is extreme, quiet parts of the signal are eliminated entirely and the effect is called gating. Gating is excellent at removing hum and other junk from between the notes. See DVD example 5.14.
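A gate is simple enough to sketch in a few lines of Python. This is my own illustration, not a production design: it assumes NumPy and SciPy, a mono float array called samples at rate fs, and the threshold and smoothing values are arbitrary starting points.

# Sketch of a gate: an envelope follower opens the output only while the
# signal is above a threshold; everything quieter is silenced.
import numpy as np
from scipy import signal

def gate(samples, fs, threshold_db=-50.0, smooth_ms=10.0):
    # Smooth the rectified signal to get a rough amplitude envelope.
    b, a = signal.butter(1, 1000.0 / smooth_ms, btype="low", fs=fs)
    envelope = signal.lfilter(b, a, np.abs(samples))
    threshold = 10 ** (threshold_db / 20)
    return np.where(envelope > threshold, samples, 0.0)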

PLAYING WITH TIME

Delay is probably the most popular effect after EQ and dynamics. Modern delays are purely digital devices. There were some experimental analog delay systems, but they were expensive and not particularly satisfactory. The only real delays available in the classic era used tape. The Echoplex was a tape recorder with a loop of tape running past record and play heads with adjustable spacing. The spacing determined the delay time. You could mix the delayed sound back into the original source and get repeating echoes and a kind of reverberation. Some beautiful pieces were produced this way, but the technique has obvious limitations. For one thing, tape noise tended to build up, and the distortion inherent in tape recording quickly made the echoes unrecognizable.

In a digital delay the sound is recorded into memory and read out a little later. The maximum delay time is determined by the amount of memory available. Modern units are set up to allow all of the classic tape delay tricks plus a few more.

Basic echo. This uses delay times up to a few seconds, so the sound is heard as a distinct repeat. If the output is mixed back into the input, we get feedback, somewhat like the effect we try to avoid with microphone and speakers, but easier to control because the sound takes a long time to build up (DVD example 5.15).

Flanging. This technique uses short delays so the echoes come practically on top of the original sound. The name comes from the original tape technique: two copies of the same recording were played at the same time, and the engineer put his thumb on the flanges of the tape reels to sync them up by briefly slowing one or the other down. The Beatles made this trick famous, using it to thicken the texture on many of their productions (DVD example 5.16).


Comb filtering. If you perform flanging with very short delay times, you get an effect similar to phase interference. This happens because the delayed sound will be exactly in phase with itself at any frequency that is harmonically related to the delay period. This will boost components at those frequencies. With a delay of 2 ms, you will get a boost at 500 Hz, 1,000 Hz, 1,500 Hz, and so on. You will also get nulls at 250 Hz, 750 Hz, 1,250 Hz, and on up. When you draw a graph of this frequency response, it looks like a comb, hence the name. If you add feedback to this, the delay will resonate any input so that it almost seems to be producing signals on its own (DVD example 5.17). Phasing. Flanging with feedback is a kind of all-pass filter. (Filters are built out of delay units—even analog filters are based on components that slightly delay signals.) A phaser is a network of all-pass filters set up so the response peaks are not harmonically related. The output sounds more like a graininess instead of a clear pitch. The effect is magnified if the program material has sweeping frequency or if the all-pass corner frequencies are changed. Most flangers and phasers have a low-frequency oscillator to automate this effect (DVD example 5.18). Chorus. With the two tape deck delay trick, it was possible to change the speed on the playback deck if there was some arrangement to keep the tape taut. The speeds had to be matched again eventually, but while they were different, there would be a pitch change in the feedback. This is caused because the audio was recorded at one speed and played back at another. If the playback were faster than the recorder, the pitch would shift up, at least until the tape was tight. You could then slow playback and get a pitch shift down for a while. This can be done in digital delays by gradually changing the delay time. Often a special control called modulation is provided to automate the process. There are actually two controls, one for modulation rate and another for the amount of modulation. If you only listen to the processed signal, you hear an added vibrato. If you listen to a mix of the original and the effect, it sounds like two voices. Adding some feedback brings in more layers. This can be done with subtlety, using short delay and slight modulation at a slow rate, or by increasing the settings it can generate some seriously insane sounds. Devices or programs specifically sold as chorus effects generally have two or more delays in them (DVD example 5.19). Resonance. When a delay has enough feedback, it will begin to oscillate. If the feedback is set just below the resonance point, a simple pulse at the input will give an effect much like a plucked string. This is the basis for string models we will explore in chapter 13 (DVD example 5.20). There is a certain amount of overlap in these examples. The delay time where phasing changes to flanging and flanging becomes echo is a matter of individual perception. This is probably related to the frequency at which people hear a transi-

tion from individual pulses to a steady tone. Therefore you will find different definitions of these effects and some inconsistency in the markings of effects devices. When you apply delay effects to music, you often want the delay to be related to the tempo. Some devices make it easy, with delay set in beats and tempo derived from the MIDI clock (or host settings in a plug-in). Many other delays specify time in milliseconds, and you have to do a bit of math. The formula is easy—60,000 milliseconds per minute divided by the beats per minute—so a tempo of 120 bpm has a delay time of 500 ms per beat.
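Here is a small Python sketch that combines that tempo math with a basic repeating echo. It assumes NumPy and a mono float array called samples at sample rate fs; the feedback amount and number of repeats are arbitrary choices.

# Sketch of a tempo-synced echo: 60,000 ms per minute divided by the bpm gives
# the delay per beat, and each repeat is attenuated by the feedback amount.
import numpy as np

def echo(samples, fs, bpm=120.0, beats=1.0, feedback=0.4, repeats=6):
    delay_ms = 60000.0 / bpm * beats              # 120 bpm -> 500 ms per beat
    delay_samples = int(fs * delay_ms / 1000.0)
    out = np.concatenate([samples, np.zeros(delay_samples * repeats)])
    for n in range(1, repeats + 1):
        start = n * delay_samples
        out[start:start + len(samples)] += samples * (feedback ** n)
    return out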

Pitch and Duration Change

The pitch change you get by modifying sample rate, as discussed above and in chapter 4, is technically known as resampling. That is because the sample rate can't be changed within the system, so a waveform is computed by the same math used to change sample rates while the rate is actually held constant. This process changes the duration as well as the pitch and is pretty much like old-time tape manipulation, including the chipmunk effect. A more modern approach, called granular resynthesis, allows for independent modification of duration and when combined with resampling can produce pitch change with constant duration.

In granular resynthesis, the file is broken up into short chunks of waveform. You might think of these chunks as frames, because they are analogous to the frames of a movie. (They are also often called windows.) Just as the motion of a movie can be slowed down by reducing the frame rate, playing the chunks of waveform slower (while the audio within the chunks is played correctly) will stretch the duration of a sound. The space between the chunks is filled in by playing each chunk twice. There's a complicated system of crossfading between overlapping chunks to keep the amplitude even. Durations are shortened by playing the frames faster than they originated, overlapping the chunks of waveform. If this seems abusive to the fidelity of the waveform, it can be, but it's not noticeable on many materials. Pure tones seem to fare the worst, and the problems are proportional to the amount of change. There are parameters that can be adjusted to improve performance with various sounds, but these are seldom available. More often you get a generic choice such as speech or music. Whatever is available, you should experiment with a variety of settings for each of your sounds.

Duration change is not really the kind of thing that can go in a real-time plug-in, since, well, it changes the duration. Pitch change by resampling with a compensating duration change (usually simply called pitch change) can be used in a plug-in. There are actually systems that analyze the apparent pitch of a tone and adjust the pitch change on the fly to keep the sounds "in tune." These have saved the careers of some pop divas, but the results are not as good as singing in tune in the first place. Other artists have embraced the sound of auto tuning, deliberately setting the controls to give flange-like effects.
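Resampling itself is easy to sketch. The following Python fragment (assuming NumPy and a mono float array called samples) reads the samples out at a new rate by interpolation, so pitch and duration change together, exactly the chipmunk effect described above; it makes no attempt at the granular tricks that preserve duration.

# Sketch of resampling: shifting up by N semitones also shortens the sound.
import numpy as np

def resample_pitch(samples, semitones):
    factor = 2 ** (semitones / 12.0)          # frequency ratio for the shift
    old_index = np.arange(len(samples))
    new_index = np.arange(0, len(samples), factor)
    return np.interp(new_index, old_index, samples)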


Ring Modulation and Frequency Shift

The analog version of pitch shift used a technique known as amplitude modulation. This will be discussed in detail in the chapters on synthesis. Amplitude modulation is related to radio broadcasting, where the program material is modulated onto a carrier signal at the station's broadcast frequency. This produces sidebands that are the carrier frequency added to all components of the program, and lower sidebands that are the program components subtracted from the carrier. Radios demodulate this so all you hear is the program, but if you modulate an audio frequency carrier, you can listen to this directly. A ring modulator produces the upper and lower sidebands ("ring" refers to the electrical circuit used in the first models) but removes the carrier from the output. A frequency shifter also removes the lower sidebands. Both sound quite odd, as the components are all shifted by the same amount, totally destroying harmonic relationships.

These are illustrated by DVD example 5.21. If you listen with a spectrum analyzer plugged in, you will be able to follow the evolution of the sidebands. The different sounds are:

Unmodified. A wineglass tone with a fundamental at 360 Hz and harmonics at 720 Hz and 1,080 Hz.

Ring modulation by 440 Hz. The upper sideband is 800 Hz and the lower sideband is -80 Hz. This shows as simply 80 Hz, as the lower sideband wraps around the zero frequency point. The harmonics also have equivalent sidebands.

Ring modulation by changing tone. You can hear sidebands slide around in complex ways, including dropping to the lower limit of hearing and rising again.

Ring modulation by 920 Hz. This results in a high pitch as both sidebands are above the original frequency.

Unmodified tone repeated.

Frequency shift of 600 Hz up. Since all three partials are shifted the same, the timbre is different, somewhat shrill.

Frequency shift of 430 Hz down. This is low enough that the bottom sideband has wrapped around zero to 70 Hz.
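Ring modulation is nothing more than multiplying the signal by the carrier, so a sketch is tiny. This assumes NumPy and a mono float array called samples at rate fs; the 440 Hz default matches the first modulated example above.

# Ring modulation: multiply the program by a sine carrier. A 360 Hz input
# modulated by 440 Hz comes out as 800 Hz and 80 Hz sidebands.
import numpy as np

def ring_modulate(samples, fs, carrier_hz=440.0):
    t = np.arange(len(samples)) / fs
    return samples * np.sin(2 * np.pi * carrier_hz * t)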

Reverberation

Material recorded using a close microphone has a disembodied sound with no sense of location. It's rather like a photograph of a face with no background. Artificial reverberation is used to add a sense of space to dry recordings. The original method for adding reverb was to play a recording in a reverberant space and record the result. At one time, no studio was complete without an echo chamber reserved for this purpose. Of course, echo chambers are expensive and add noise and distortion, so a cheaper and cleaner alternative was needed. Mechanical reverbs work by applying a signal to a spring or steel plate. The effect is
picked up with a device similar to a phonograph cartridge. It really takes an act of imagination to call the resulting sound from a spring reverberation, but it served when nothing better was available. (Spring reverbs are still seen in some guitar amps.) Plates are more realistic and produce a sound attractive enough that some modern reverbs emulate them. Digital reverbs are built with delays. After all, reverberation is the sum of thousands of echoes, so why not build a reverberator out of thousands of delays? It’s a good idea but even today no computer is powerful enough to model all of the delay paths in a concert hall. However, it is possible to make some shortcuts to build a convincing reverb out of a reasonable number of delays. Such devices have been available since the late ’70s. The price of these units was originally astronomical, but it has come down enough that we now find reverbs in toys. At the same time professionally priced units have become more and more sophisticated. In a good unit, you can design your own space. To understand the controls that do this we need to look closely at what actually is heard in a reverberant space. (See Figure 5.15.) Direct sound. The direct sound comes on a straight path from the source to the ears. Since it is the first thing we hear, our brains use this sound to determine the direction of the source. We also use the relative intensity between the direct sound and reverberation to judge our distance from the source. Early reflections. This is the sound reflected from the surfaces nearest the source. That would include stage walls, floor, and ceiling. These reflections support the direct sound, making it seem louder, at least if they arrive within 10 to 15 milliseconds of the direct sound. Strong reflections that arrive after 20 ms or so actually blur the sound and can make it difficult to understand speech. Early reflections are important to performers, making it possible to hear their own sound as well as what other players are doing. Diffuse reverberation. All other reflections make up the diffuse reverberation. This generally builds up smoothly but quickly and decays over a relatively long time. Our brains use the delay between the direct sound and the onset of the reverberation to judge the size of a room. The decay time of the reverberation is dependent on both the size and the sound absorption characteristics of the room. Decay time is measured from the time of the initial sound until reverberation has died down by 60 dB and is usually called RT60. Reverberation will have changing frequency characteristics. Air absorbs high frequency sound, so there is invariably some high cut in large spaces. This makes the high frequencies fade away faster than the low. Some small spaces may have more low frequency absorption than high and give a bright reverb, but it’s not really an enjoyable experience. Most digital reverbs emulate all of this by running a program of discrete delays linked with various feedback paths. This program is called the algorithm. The complexity of the algorithms a real-time device can provide is determined by processor speed and memory size. A cheap reverb will sound sparse and rough with obvious

FIGURE 5.15 Components of reverberation.

looping as the sound fades out. Reverbs are available in plug-in form, but they are notorious CPU hogs and can really reduce track count. This makes a hardware reverb a good investment. The newest models can be directly integrated with a computer via FireWire, so they are as easy to use as plug-ins. All digital reverbs give a choice of algorithms. These have names like plate, medium room, large hall, gymnasium, and so on, to give a sense of what they are emulating. The best reverbs allow us to design our own variations of these algorithms by adjusting key parameters. There is a lot of variation from brand to brand and model to model as to what parameters are available and what they are called, but the following are some common ones. Mix or balance. This sets the ratio of direct (dry) signal to reverberation. As noted above, this gives a sense of distance. In many situations there will be multiple sources that require individual balance settings. In that case, the reverb is set 100 percent wet and the mix of direct-to-reverb signal is managed in a mixer. Decay time. This corresponds to the RT60 of the reverberation. There are specific decay times associated with various types of program material. Spoken word needs little reverb, if any. The RT60 for nightclubs and small spaces associated with jazz is under a second. Halls for classical music generally have times from 1 to 2.5 seconds, whereas churches and other venues for choral music can have an RT60 up to 4 seconds. Most pop music uses different reverb settings for each track with little attempt at realism.


Hi EQ. This adjusts the high-frequency response of the diffuse reverberation. This may be a parametric EQ or a simple high cut. A few reverbs handle EQ by dividing the signal into adjustable bands and giving each an independent decay setting.

Width. Most reverbs produce a stereo output, using different delay paths for left and right channels. This control mixes the left and right outputs to produce a narrow or unnaturally wide effect.

Diffusion or density. This is the spacing of the individual reflections that make up the reverberation. At low settings, the delay times cluster, almost producing definite echoes. With high diffusion settings, the reverberation is a smooth wash of sound. Some engineers recommend lower diffusion for vocal reverb to provide a bit more sense of pitch.

Pre delay. This sets the initial delay between the input sound and the first reflection. This gives a sense of room size. There may be an additional delay that adjusts the time between the first reflections and the diffuse reverberation.

Early reflections. These controls range from a simple level to detailed choice of room type and source placement. At the least, early reflections should be removable, because recordings made in moderately live rooms will already have some.

Animated DVD example 5.22 demonstrates the effects of tweaking reverb parameters.

Impulse Response Reverb

There is a fairly new technique for generating reverberation that is based on sampling the acoustic response of real spaces. More specifically, the impulse response of a space is determined and a process called convolution is used to apply the sound of that space to a recorded signal. Convolution is a powerful technique and is discussed in more detail in the chapters on synthesis. The sound of impulse reverb is quite impressive. It is identical to the sound that would have been captured in the space, at least with the source and microphone positions used in sampling the room. It is usually an offline process. Real-time impulse reverb is currently covered by patents, so it's limited to some high-end equipment and programs.
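Because convolution is built into most math libraries, the idea behind impulse response reverb can be sketched in a few lines. This assumes NumPy and SciPy and two mono float arrays, dry and impulse_response, recorded at the same sample rate; it runs offline, as the text notes, and the mix value is an arbitrary starting point.

# Sketch of convolution reverb: convolving the dry sound with a recorded
# impulse response applies that room's reflections to the recording.
import numpy as np
from scipy import signal

def convolution_reverb(dry, impulse_response, mix=0.3):
    wet = signal.fftconvolve(dry, impulse_response)    # offline, not real time
    wet /= np.max(np.abs(wet))                         # normalize the wet signal
    padded = np.concatenate([dry, np.zeros(len(impulse_response) - 1)])
    return (1 - mix) * padded + mix * wet              # keep the reverb tail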

DISTORTION

We spent a good portion of chapter 3 learning how to make accurate recordings and how to preserve that accuracy throughout the production process. Now we look at ways to creatively mess up the sounds. Distortion is a rich source of material for electroacoustic composition, but it is important that it be used in careful
and controlled ways. Unintentional distortion results from sloppy work, and the audience can always tell. There are many distortion plug-ins and processors available, but the makers seldom tell us exactly what techniques are used to create the distortion. These things have colorful names like Distressor or harmonic rotation. These names are designed to help sales more than give information. Usually it takes a bit of detective work to discover what is actually in use. The DVD examples of these effects start with a recording of a music box (DVD example 5.23) used as the source material. You can use this file as a source for further experiments. The following are some of the processes that are available.

Decimation. We have studied the effect of word size on recording fidelity. Decimation is the deliberate discarding of accuracy. When it is required, as in converting 24-bit recordings to 16 bits for audio CDs, techniques like dither are used to keep the signal as clean as possible. When used as a process, dither is left out. Sound processed this way has a lot of background noise, and the foreground develops a bit of extra edge. As you reduce the number of bits, the softer areas drop out entirely. This can leave nothing but the occasional pop. DVD example 5.24 has a lower word size on each pass through the loop.

Resampling. The reduction of sample rate of a file that is already recorded can have strange aliasing effects. The signals that are higher than half the new sampling rate are folded around that frequency. A 6 kHz tone resampled at 10 kHz will produce a 4 kHz tone—6 kHz folded around 5 kHz. Listen to DVD example 5.25 to hear how the high tones of the music box become deep chimes.

Nonlinearity. Linearity is an engineering term that describes the accuracy of the gain change associated with amplification. In a digital system, a gain of 2 means every sample value is multiplied by two. No analog device is perfectly linear. A transistor circuit with a gain of 2 may double low voltage parts of the waveform, but the peaks might only be amplified by 1.95. A transistor circuit may also treat the negative portion of a signal differently from the positive portion, producing asymmetrical amplification. This is described by the transfer function, a graph of output for a given input (Figure 5.16). Not surprisingly, a perfectly linear transfer function is a straight line. A transfer function with a bend in it deforms the signal, adding harmonic distortion. Analog distortion units use deliberately poor circuits to inject this nonlinearity; plug-ins compute gain based on the transfer function. DVD example 5.26 plays the music box with 20 percent distortion, first of a symmetrical type, then asymmetrical. Figure 5.17 shows the effect of these on a sine tone. Figure 5.18 shows the spectra of the two types. Note that the symmetrical distortion only adds even harmonics to the tone, whereas the asymmetrical distortion adds all harmonics.

FIGURE 5.16 The effect of transfer function on a sine wave.

FIGURE 5.17 Sine tones with symmetrical and asymmetrical distortion.

FIGURE 5.18 Spectra of distorted sine tones.

Waveshaping. Another approach to adding nonlinearity takes the transfer function to the extreme—it is not just a bent line, it is another waveform. Fig-
ure 5.19 shows the result of shaping a sine wave with a symmetrical triangle wave. The result, heard in DVD example 5.27, is a lot like the frequency modulation effects used in synthesis. Rectification. Diodes are electronic devices that only allow current to flow one way. When a bipolar signal is applied to one, only half of the waveform survives. This is illustrated in Figure 5.20. The typical guitar fuzz pedal adds


FIGURE 5.18 Spectra of distorted sine tones (symmetrical and asymmetrical distortion, frequency axis from 15 Hz to 16 kHz).

FIGURE 5.19 Sine wave shaped by a triangle wave (waveform and spectrum).


FIGURE 5.20 Rectified sine wave (original and rectified).

FIGURE 5.21 The effect of overdrive on sine waves (soft and hard clipping).

Overdrive. Every circuit has a limit on the voltage level it can handle. Try to apply more, and you have an overload. There is a distinct difference between analog overload and digital overload. Digital overload will clip the tops off the waveforms as shown in Figure 5.21, producing the scary sound in DVD example 5.29. The spectrum shows added harmonics right up to the limit of the system. Analog overload is a gentler event known as soft clipping, where the tops of the waves are rounded off and fewer harmonics are added.
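Rectification and overdrive are just as easy to sketch. Again this is a rough illustration in Python with NumPy, assuming a signal x scaled to ±1; it is not a model of any particular pedal or plug-in.

    import numpy as np

    def fuzz(x, blend=0.5):
        # Mix half-wave rectified signal with the dry signal.
        rectified = np.maximum(x, 0.0)
        return (1.0 - blend) * x + blend * rectified

    def hard_clip(x, limit=0.3):
        # Digital-style overload: flat tops, harmonics up to the limit of the system.
        return np.clip(x, -limit, limit) / limit

    def soft_clip(x, drive=4.0):
        # Analog-style overload: rounded tops, fewer added harmonics.
        return np.tanh(drive * x) / np.tanh(drive)

Sweeping the blend or drive values while listening is the quickest way to learn what each control contributes.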


EXERCISES

The only way to master audio processing is to practice processing audio. A systematic approach is to take a single recording and run that through every device and plug-in while adjusting all of the available controls. Considering the number of presets on most hardware devices and the number of plug-ins available for download, completing this task is unlikely. A more reasonable approach is to explore examples of each type to develop a set of general expectations; then when you are ready to use a particular effect, try out the versions available to learn the subtle differences.

1. The traditional composition exercise for exploring effects is to take a single sound and process it to produce as many variations as possible, then use those to compose a short piece. DVD example 5.30 is such a piece. See if you can identify the processes used.

2. Pick a sound that is kind of ugly and apply processing to make it more beautiful.

3. Pick a beautiful sound and apply processing to make it ugly.

4. Take a recording that contains two different sounds and use processing to remove all traces of one sound.

5. Start with a fairly long sound and develop six variations with processing. Compose a 45-second étude with these sounds.

RESOURCES FOR FURTHER STUDY

The best way to find out more about processing audio is to browse the Web. Remember, the Web is not edited, so you will find a lot of dubious information, but seriously consider what you read and confirm anything that looks useful with some experiments. My favorite site on processing is hosted by the Universal Audio Corporation (www.uaudio.com/blog/cat/ask-thedoctors). This company has products for sale, but they are based on thorough research into the workings of classic gear, and the developers share what they discover in the “Ask the Doctors” feature.

This book on processing, written from the studio recording point of view, is very detailed:

Case, Alexander U. 2007. Sound FX: Unlocking the Creative Potential of Recording Studio Effects. Boston: Focal Press.


SIX

Layering Sounds

So far, we have been working in one dimension, exploring how one sound can follow another to make musical sense. It’s time to move on to the electroacoustic equivalent of counterpoint: layers of sound.

DIGITAL AUDIO WORKSTATIONS

The only satisfactory tool for layering sound is a digital audio workstation (DAW). There seems to be some lingering nostalgia for editing stereo tape, but I know of no one who misses the old multitrack recorders. The DAW has brought electroacoustic composition to a level of sophistication and control that was undreamed of a few years ago. DAWs range in complexity and quality from simple shareware applications to dedicated hardware systems costing tens of thousands of dollars. I will use Pro Tools for the examples in this chapter because it is powerful, inexpensive, and probably the most widely used DAW at this time. The illustrations for this chapter were made with a fairly old version, but I am covering basic functions here, and those will not change.

Hardware

The term digital audio workstation refers to a computer system equipped with a high-quality audio interface and a multitrack recording and mixing application. Dedicated hardware control surfaces make a nice addition but are not essential. There are various levels of package integration. At its simplest, a DAW is an application that runs on a general-purpose computer with whatever peripherals the user chooses to connect. The middle-range software is sold with specific interfaces and accelerators, and high-end systems are complete packages that look more like traditional studio equipment than computing gear. Systems differ primarily in the number of inputs and outputs and the number of simultaneous tracks they will support.


The number of inputs and outputs directly affects the price of the interface. Most electroacoustic work can be done with two inputs and as few as four outputs, so a modest interface will usually do fine. The number of tracks available is a direct consequence of the speed and memory of the computer, so a serious investment there will immediately pay off. The highest-powered computers available today will support over a hundred simple tracks, but the complexity of a project affects the track count actually achieved. Getting a system up to ninety-six processed tracks requires hardware accelerators. That investment can bring the cost to well over $10,000. Pure composition seldom requires such power, and I will point out several techniques for efficient use of the system. The rational formula for the composer is a good computer, good software, and a modest interface.

Digital audio workstations are similar in many respects to MIDI sequencers, which will be discussed in chapter 8. In fact, most DAWs and sequencers look exactly the same in the advertising. I'm sure that in the near future some single program will meet all needs, at which point this chapter and chapter 8 will apply to two aspects of the same program, but right now we have DAWs with MIDI features and sequencers with audio features. Your choice will depend on your composing style. If you prefer musique concrète sounds and techniques or include a lot of live recording, a full-featured DAW will be required. Producing quality synthesis requires the advanced MIDI capabilities only found in sequencers. In any case you need to learn to use both.

Latency

One major difference between types of DAW is their latency. Latency is the time it takes to convert a signal from analog to digital form, transfer it to the computer, process it, transfer it back to the interface, and convert it back into an analog signal. This should not take long, but it might take up to 10 milliseconds. That's enough delay to be really annoying to a performer listening to the computer output during recording. To prevent this, the performer should listen to his or her own signal from some point before it enters the computer. (Look for an interface that has a panel control to switch the headphones between input and computer, or a software control, which is an adequate substitute.)

The latency of a computer system can be adjusted by setting the hardware buffer size, which determines how much data is accumulated by the interface before passing it to the program. For various reasons computers run more efficiently if data is processed in batches. There are several layers of buffering coming in and going out, so this setting has a greater effect than you might think. The buffer size needs to be increased if the computer's operation begins to suffer, which will be indicated by stuttering audio or a warning message. The best of the integrated systems (digital consoles are an example) have latency of less than 2 ms, which is hardly noticeable to a performer, but even with these there is a cardinal rule: the output of a digital system must never be combined
with the original input. This would produce comb filtering, exactly as discussed under delays in the last chapter. It’s a pretty effect in its place, but you don’t want it in the monitoring chain.
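The buffer's contribution to latency is simple arithmetic. Here is a small worked example in Python; the fixed overhead figure is only a placeholder, since converter and driver delays vary from interface to interface.

    def io_latency_ms(buffer_size, sample_rate, overhead_ms=1.5):
        # One buffer of delay on the way in, one on the way out,
        # plus converter and driver overhead (the fixed figure is a placeholder).
        return 2 * buffer_size / sample_rate * 1000 + overhead_ms

    print(io_latency_ms(64, 44100))    # about 4.4 ms: comfortable for a performer
    print(io_latency_ms(1024, 44100))  # about 48 ms: safe for mixing, useless for monitoring

Doubling the buffer size roughly doubles the round trip, which is why the setting is always a compromise between stability and playability.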

Files and Playlists

We learned in chapter 4 that most editing programs do nondestructive editing. They maintain a playlist that controls the order in which sections of the file are played. DAW software takes the concept of playlists much further. There is a playlist for each track, and these playlists can call up different files as well as sections of a file. In many DAWs a track can have alternate playlists, so you can record different takes of a part and build a definitive composite version. The master document that contains the playlists is called a project in Pro Tools, with similar names in other applications. A project may have several dozen associated files. These are not limited to files recorded and assembled for the composition. The application sometimes generates new files to carry out complex operations such as crossfades or pitch changes.

Most DAWs require the designation of a drive as the target for recording. If you plan on extensive recording, a dedicated drive is an excellent idea. Many suppliers list certified drives, which means they have tried them out and they seem to work. Multitrack work puts heavy demand on drives, so pay attention to speed ratings when you shop. Also pay attention to noise. Large, high-speed drives tend to be noisy, a flaw that can be improved by the enclosure design. In any case, make sure there is plenty of room, sort out the read/write permissions, and defragment the drive from time to time.
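The playlist idea can be pictured as a small data structure: each region is just a pointer into an audio file, which is why trimming and rearranging never touch the recording itself. The sketch below is purely conceptual and is not the format any particular DAW uses.

    from dataclasses import dataclass

    @dataclass
    class Region:
        file: str           # which audio file to read
        file_offset: float  # where to start reading, in seconds
        duration: float     # how much of the file to play
        track_time: float   # where the region sits on the track's timeline

    # One playlist per track; alternate takes are simply alternate lists.
    lead_take_1 = [Region("horn.wav", 0.0, 8.0, 4.0)]
    lead_take_2 = [Region("horn_retake.wav", 1.2, 8.0, 4.0)]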

Projects Composition in a DAW begins with the creation of a new project file. Many DAWs only allow one project to be open at a time, but this is not really a limitation. You can create versions of a piece (or completely different pieces using the same material) by muting tracks and adding playlists. Creating a new project usually creates a directory and some subdirectories on the hard drive. In Pro Tools the directory will have the project name. It will contain the project file and directories for audio files, fade files, and assorted other files. If you wish to copy the project to a backup device or another computer, the entire directory must be copied. When you add an existing file to a project, you may have the option of copying the file into the project folder or working with the original while leaving it in its original location. It is probably best to copy short files, but duplicating very long files will impact disc space. If you do the latter, be sure to back it up with the rest of the project. The project opens with a project window. The main features of a typical project window include the following.


A central editing area. Here files or fragments of files are placed in horizontal tracks. These chunks of audio are called regions in Pro Tools. They can be moved in time with varied types of constraint and arbitrarily moved or copied from track to track. In some ways this arrangement reminds me of a teaching aid I remember from kindergarten: A storyteller would have an easel covered with green flannel. As she told the story, she'd pick up pieces of brightly colored flannel cut into the shapes of characters or scenery and place them on the easel. They would stick where placed, but could be easily moved around as the story developed. The DAW editing paradigm moves chunks of sound around in a similar way. Time is marked with one or two rulers across the top of the editing pane. You can choose to display absolute time or bars and beats. To make the latter accurate, you also need to set meter and tempo. Some DAWs use a graphical representation for tempo; some do this in a dialog box. The graphic format makes it easy to visualize the tempo, but it is easier to enter changes with a dialog box.

A list of available content. To the right side of the Pro Tools project window is a list of the audio regions available to the composer. This is hierarchical, showing fragments below the files from which they are derived. Regions may be dragged directly from this list to a track in the editing area. In Pro Tools the region list may be hidden to expand the editing area. In other DAWs you may have to open a window to see the content list, but drag and drop is usually available.

A set of controls for each track. These are universally shown at the left of the track display. The controls available vary, and can be customized in many programs, but there will at least be a name for the track, an input selector, and buttons for record arm, solo, and mute. Since this area is small, additional information about the selected track may be shown elsewhere.

A list of tracks. Pro Tools makes it easy to hide and reveal individual tracks, as well as change their display order. This is done in a track list to the far left of the window. The left area also shows track groupings, a method of associating tracks so that certain operations can be performed on several tracks at once. Some applications do not allow track hiding, but uninteresting tracks may be moved to the bottom of the list.

Tools and controls. These are stretched across the top and are quite similar to the controls in a stereo editor. In some cases, the record and playback buttons are placed in a separate transport control window (see Figure 6.1).

File Management

Once a project has been created, there are two ways to add an audio file. An existing file can be imported, or a new file can be recorded directly onto a track.

FIGURE 6.1 Pro Tools edit window (labeled elements include the mode, view, and cursor controls, the track list, track controls, time display, play position, time ruler, tracks and regions, automation line, and region list).

Recording is like recording in a stereo editor with two differences: the track to be recorded must be armed by setting a switch, and all other tracks will play as recording progresses. This allows the new material to be synchronized with what is already there. Any track that should not be heard can be muted. Whether the track that is being recorded is heard is a matter of option settings. It is best not to monitor the recording track because the fresh audio will come back delayed. Monitoring through hardware is preferred and is usually quick and simple to set up. Newly recorded files are named automatically. The name will probably be something unimaginative, such as T003-002, but you can change it to something more descriptive. That's a bit of extra work that really pays off in the long run. It's handy to be able to identify files in the region list or even from the operating system.

Existing audio can be imported with a file menu operation. Generally there is a choice to work directly with a file or to make a copy in the project audio files directory. Even with huge drives it's not a good idea to have multiple copies of big files, but if you decline the copy option, files can be lost when the project is backed up. In addition, any modification of an uncopied file by another application will cause problems in the project. Files will automatically be copied if they must be modified to match the preferred type for the project. The file import process is the most useful option for electroacoustic composers.


My typical workflow is to record and modify sounds in an editing program, import them to the DAW for layering and mixing, and return the result to the stereo editor for final mastering.

Tracks

Before going any further, I should clarify the concept of a track. This was originally a physical stripe of magnetism on audiotape that could hold a recording independent of any other tracks. Tape decks had from one to forty-eight tracks. It was possible to synchronize tape decks to get even more tracks running; up to 128 tracks were possible in the Alesis digital audiotape (ADAT) format. One inescapable fact about tracks on tape was that they were locked together. You could selectively record, play, or erase them, but you could not alter their time relationship, and you certainly could not edit one independently of the others. The best you could do was copy the track to a new tape, modify it, and record over the original, a technique called flying tracks.

The audio channel is a different concept. This is a path for electrical signals that might run in parallel with similar circuits. A stereo system has two audio channels; a 5.1 surround system has six. A 24-track tape deck could be connected to a 24-channel mixer, which would combine the signals into a stereo mix. I mention this because many engineers and authors use the terms track and channel interchangeably. In computer code there are no physical tracks or channels; both concepts are used as a metaphor for streams of audio data. The essential meaning of tracks is that they represent audio files that will play at the same time. A semantic quirk has led to the concept of a stereo track: a stream that will play stereo files, which themselves consist of a pair of permanently linked recordings. Even though there is a limit to the number of tracks a system can play, most programs will allow you to construct and display as many tracks as you please. Some applications may warn you when you have gone too far, but in others your first clue is audio dropping out.

In a DAW, a track is a canvas for organizing audio. It resembles the window of an editor, with a graphical representation of the sound that can be enlarged for detailed examination, but there is one extra element—the track controls. This is where you find functions that must be independently set on each track. As always, different programs have different options, but there are a few that are pretty much required.

Input connection. It is assumed that a DAW will be able to record several tracks at a time, but it is uncommon to record all of them simultaneously. The interface may provide twenty-four or more inputs, and there needs to be a simple way to route these to tracks. This usually takes the form of a list of potential sources.

Audio level meter. Each track has a meter that shows input level during recording and the output during playback. Sometimes the size of these meters
is affected by the view or track size options. When recording, watch the biggest version available. Arm for recording. A checkbox or flashing red indicator determines if the track will record when the record icon on the transport control is clicked. In most cases, the input to an armed track is heard through the monitor system if the track is stopped or recording. Mute. Normally all tracks play back together, but sometimes you don’t want to hear all of them. For instance, it is common to record a track over if the first version did not seem perfect. It would be silly to throw away a good if not perfect take on the hopes of getting a better one, so the retake is recorded to another track. The mute button prevents a track from playing so you can contrast the two versions. Solo. Sometimes you need to hear a track by itself. The solo button switches all other tracks off. A few programs do this by engaging the mute buttons on all other tracks, but this is not a good approach because it disturbs mute setups you may have made between takes. Paradoxically, it should be possible to solo several tracks at the same time and hear a mix of them. Recording audio onto a track is simplicity itself. Once the track is connected to an input and armed, a click on the record (and possibly play) icon in the transport control will start the process. Setting levels is exactly as we learned to do in chapter 3. The only extra concern is that since existing tracks play as the new ones are recorded, it is easy to pick up some bleed-through. It is essential to monitor with headphones during recording. It is almost impossible to play a note exactly as the track begins recording, so the recording should begin before the new material comes in. This can be made automatic by setting a pre-roll of something like two measures. Then when the cursor is placed where the new part should begin, the system will jump back two measures to begin recording. If there’s material already there, a punch-in point can be set. Playback will start early to cue the performer, but actual recording will occur on the mark. The metronome provides a beat reference during recording. The sound can be a tick from the computer, but these are often difficult to hear. I prefer to set up the metronome to play percussion sounds from a synthesizer. (I keep an obsolete hardware synthesizer in the studio for just this purpose.) Sometimes more elaborate control of sounds and volume is possible in order to set up accented beats. Ticks on subdivisions would be a nice feature, but I haven’t come across that yet. The best approach is to record a human performance of a click using claves or a similar instrument. This track can even include spoken cues or counting through tempo changes. When it comes time for mixing, this track is simply muted. When a project has more tracks than the computer processor supports, some will not play. The choice of which to skip will be set by some sort of priority scheme. In Pro Tools this is determined by the placement of the track in the editing
window (Figure 6.2). The playback routines (called voices) are dynamically allocated to tracks as they need to play—when a track goes silent, its voice is freed up to serve another track. In some versions, you can assign a particular voice to a track permanently. This ensures that the track will always play, but when it is silent other tracks cannot use the voice.

FIGURE 6.2 Pro Tools track controls (labeled elements include options, name, record arm, solo, automation, take select, level meter, height, mute, display, and automation mode).

Operations on Regions

The audio in a track may be broken up into sections or regions. These are equivalent to splices on audiotape, although they are more likely to be phrase- or verse-length sections than the note-by-note insertions we have done so far. In most cases, a region will correspond to a distinct audio file. The composition process depends heavily on operations on these regions.

Movement in Time

Most DAWs have several methods for changing a region's start time, such as a mouse drag or a nudge with the arrow keys. In addition, there may be a choice of constraints to the move. Pro Tools is typical: In shuffle mode regions must be adjacent so there will be no empty space in a track; movement is accomplished by trimming regions. In spot mode a click with the movement tool opens a dialog where a precise time point may be entered. Slip mode allows free movement. In grid mode positions are constrained to beats or a subdivision, but alignment is not by the edge of the region.

FIGURE 6.3 The trim operation (region name, sync point, waveform, trim tool).

There's bound to be some silence before the first note occurs,
and even if the region is trimmed to the first sound, the perceived beat may come after the sound starts. The composer can specify an anchor time or sync point for the region, and alignments will be based on that. If there is an option to lock a region’s position, it is a good idea to do so once it is placed. I’ve found it embarrassingly easy to shift a region a few pixels accidentally when using the mouse to select it for other operations. Trimming A region is defined by recording a section, importing a file, or splitting an existing region. With any of these operations, it is likely that the region will contain more sound than is needed. If so, the excess is discarded by trimming the region at either end. This is a simple adjustment of the playlist and does not affect the actual length of the file. All of the movement options apply to trimming, including the ability to nudge with keys. If a region has been trimmed, it can be extended again, but it can’t be made longer than the original material. The trim operation can also produce a fade-in or fade-out for the region. In Pro Tools this is handled with a multifunction mouse tool that is sensitive to how the pointer is moved over the region. (See Figure 6.3.) Editing Editing includes copying and pasting entire regions as well as changing what’s in the region. Most of the editing operations presented in chapter 4 can be performed on a region. Cut, copy, and paste are all commonly available. Selecting is affected by the movement modes and can have some subtle complications. For instance, selecting from beat to beat in grid mode will be numerically accurate but may cut
into the middle of a note. The Pro Tools manual recommends using the spot dialog box to select areas for repeat pasting, because some mouse selections are rounded off. This will cause a loop to gradually get out of sync. Sometimes a track is edited to the point where there are so many regions that it will not play accurately. When this occurs in Pro Tools, an operation known as Consolidate Track can combine the regions into a new one. Other DAWs have the same option under different names, such as Glue. Processing Regions can also be modified by signal processing operations. As in stereo editors, this can take the form of a DSP menu item applied to a single region or selection or as a plug-in processor that affects the entire track. DSP operations will generate a new file, leaving the unprocessed version available in the region list. Some processes like delay and reverb extend beyond the end of the sound, so you must ensure that there is sufficient trailing silence in the region for the effect to tail off completely. If necessary, manually add some silence to accommodate this. Plug-in processors do not generate new files, as the process is applied in real time during playback. Plug-ins add latency to a track, which can create synchronization problems or audio artifacts if the track is eventually mixed back with the original material. It is often necessary to adjust the start time of the track to compensate for plug-in latency. Some DAWs do this automatically, but the compensation may be incorrect. Trust your ears. Bouncing From time to time you will want to convert some tracks to a single audio file, a process known as bouncing. Some programs will execute a bounce as fast as possible, others simply play and record in real time. Generally a bounce includes everything that is heard, so if you want to bounce a single track you should solo it.
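Under the hood, a bounce is nothing more than summing the tracks that are allowed to sound, sample by sample. A minimal sketch in Python with NumPy, assuming equal-length tracks and per-track gain and mute settings:

    import numpy as np

    def bounce(tracks, gains, mutes):
        # Sum every unmuted track into one buffer, then guard against clipping.
        mix = np.zeros(len(tracks[0]))
        for audio, gain, muted in zip(tracks, gains, mutes):
            if not muted:
                mix = mix + gain * audio
        peak = np.max(np.abs(mix))
        return mix / peak if peak > 1.0 else mix

Soloing a track before a bounce amounts to muting everything else, which is exactly how the real programs behave.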

Mixer

The placement of sounds is only half of the DAW story. The real elegance comes from combining the sounds in an expressive way. DAWs use the studio mixing console as a paradigm for this. To understand the paradigm, it is necessary to understand a few things about hardware mixers.

Analog Architecture

Complex audio devices are often built as a collection of physically separate modules. A mixer is actually a large collection of a few types of modules. Input modules are the most numerous. These accept signals and amplify them to match a standard level. Input modules include microphone preamplifiers and equalizers; the quality of these circuits is often the basis for a mixer's reputation.

FIGURE 6.4 Architecture of a mixer (input modules with mic gain, EQ, direct insert, sends, pan, mute/solo, and fader; a bus module with bus assign, pan, mute/solo, and fader; numbered busses, left and right mix busses, send outputs, and master).

Input modules also include faders and pan controls to determine the level of the signal in the main mix, as well as knobs that send the signal to various auxiliary mixes. The input modules send their signals to a common set of wires known as a bus (Figure 6.4). One key to using an analog mixer properly is to set all of the gain controls so that neither noise nor distortion is introduced. This starts with the preamplifier
gain. The optimum level puts the input fader where the entire length of the fader is usable. If the preamp gain is too high, the signal seems to jump on when the fader is barely touched and too little gain will leave the signal weak in the mix. This principle carries over to digital mixers, but DAWs usually default to appropriate gains. Hardware mixers always have a master output module, which usually has a single fader to control both channels of a stereo mix from the main bus. These are not always shown in software mixers. There may be other bus modules, which were originally intended to combine various inputs into submixes. Switches on the input modules route signals to the main or submix buses. In many cases, the submix can be routed to the main mix to provide convenient control of a group of inputs. Processing in an analog studio is provided by extra pieces of equipment known collectively as outboard gear. There are two ways of connecting these devices. Each input module on the console has insert connections where the signal is sent from and returned to the module before it gets to the fader (if nothing is connected to these jacks, the signal is automatically returned). If a processor is connected to a channel insert, only that channel is affected. In the other style of connection, a mix from auxiliary (aux) send knobs on each module is connected to the processor, which is then returned to the output module via an extra input. When this system is used, several inputs can be combined in the process, and the dry signals are still available for mixing. The aux send is especially useful for reverb, which requires independent settings per input and a balance of dry and processed signal. (Reverbs used this way should have their internal mix parameter set at 100 percent wet.) Digital Mixer Emulation The digital model of a mixer is usually pretty faithful to the analog architecture. The main difference is that the number of input modules is determined by the number of active tracks. There is no input gain control as that must be handled before the A-to-D converters. There is seldom a default EQ—that is treated like any other plug-in. Aux sends are also optional. There is a fader and pan, along with some output routing. The real power of digital mixing becomes apparent with the inserts. Any available plug-in can be added to the input module by clicking on a button and choosing from a list. The control window for the plug-in will appear as you add the process and can be dismissed and recalled at will. The plug-in looks exactly as it does in editor applications. That is the beauty of the plug-in paradigm—once you have the plug-in on your system, it is available to any compatible application. Most DAWs provide an output module with a master fader and plug-in slots. Any plug-in loaded in the output module affects the overall mix. Output modules also have metering for the total mix and routing to hardware outputs. Send and return style processing is provided by adding aux or bus modules. These are similar to input modules, but use an aux bus as input. This is fed by appropriate send controls on the input modules. In Pro Tools, enabling a send opens a window with send controls for that input module, as illustrated in Figure 6.5.
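The signal flow just described, with inserts ahead of the fader, a post-fader send feeding a shared effect, and everything summed on a master bus, can be sketched in a few lines. This is a schematic illustration of the architecture rather than the routing of any specific DAW; the plug-ins and the reverb stand in for any processing functions.

    import numpy as np

    def channel(audio, inserts, fader, pan, aux_send):
        # One input module: insert chain, then fader, pan, and a post-fader send.
        for plugin in inserts:
            audio = plugin(audio)
        post_fader = fader * audio
        left, right = post_fader * (1.0 - pan), post_fader * pan  # simple linear pan
        return left, right, aux_send * post_fader

    def mix_bus(channels, reverb, master_fader=1.0, return_level=0.5):
        # Sum the channel outputs, run the combined send through one shared effect,
        # and add the wet return to the stereo bus under the master fader.
        left = sum(c[0] for c in channels)
        right = sum(c[1] for c in channels)
        wet = reverb(sum(c[2] for c in channels))
        return (master_fader * (left + return_level * wet),
                master_fader * (right + return_level * wet))

Because the shared effect receives only the send mix, it should be set fully wet, just as the text recommends for hardware reverbs.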

FIGURE 6.5 Mix window in Pro Tools (input modules with sends and send faders, plug-in inserts, an aux module, and the master fader).

Control Surfaces

The principal drawback of a software DAW is that the faders are not really faders; they are just pictures you drag around with a mouse. This is ergonomically awkward and imprecise, but the biggest limitation is that you cannot perform the most elementary action available on a real mixer—move two faders at the same time.


This is mitigated to some extent by automating the mix, but that is a poor substitute for the real thing. Most engineers prefer an external control surface that provides faders and the most useful buttons. There are many models available, with various degrees of faithfulness to a traditional console. The simplest has a single fader, with cursor keys to select which channel it controls at the moment. This helps with the ergonomics and precision, but there’s still only one. More popular control surfaces have a bank of eight faders. That’s enough to keep both hands busy during a mix. The best control surfaces have motorized faders that can follow the on-screen actions. They are expensive but worth the cost if you do a lot of hand mixing. Automation If flexibility is the core attraction of a digital audio workstation, automated mixing is the icing on the cake. Mixdown has always been the most tedious operation in the sound studio. A traditional mixdown was really a performance. It required studying the multitrack recording, preparing a chart, and innumerable practice runs before you would risk committing to tape. If you listen closely to the mixes of the ’60s and ’70s, you will hear minor flubs such as a part brought in too early or a balance that is just a bit off the mark. Automated mixes (first provided by adding some digital control to analog consoles) have changed the whole process. Preplanning is certainly still needed, but a rough mix can be set up quickly and can be fine tuned channel by channel. Adding automation has three phases: recording, touch-up, and read. In the first pass automation recording is armed for the desired channels. Any motion of the faders, real or virtual, is recorded with the mix. The motion is displayed as a graph in the track window. On a touch-up pass, the faders will move on their own, but if they are moved by the user, new values will be substituted for the original recording. Some DAWs have a second style of touch-up: in trim mode the faders go to a center position and any moves are added or subtracted from the existing automation. Once automation is set, the channel will read the data from then on, and the engineer can focus his attention elsewhere. Automation can record much more than the fader levels. Automated pan and processor parameters provide effects that were never practical when operators were limited to two hands each.
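Automation data is essentially a list of breakpoints that the program interpolates between during playback. A minimal sketch in Python with NumPy; the breakpoint values are invented for illustration.

    import numpy as np

    # (time in seconds, fader gain) pairs recorded or drawn by the user
    breakpoints = [(0.0, 0.8), (4.0, 0.8), (6.0, 0.2), (10.0, 0.5)]

    def automated_gain(times, points):
        # Linear interpolation between breakpoints, held flat past either end.
        xs, ys = zip(*points)
        return np.interp(times, xs, ys)

    t = np.arange(0.0, 12.0, 0.5)
    gains = automated_gain(t, breakpoints)

Touch-up passes simply replace some of the breakpoints, and trim mode adds an offset to them instead.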

Troubleshooting

Eventually you may have more tracks, plug-ins, and automation than the computer can handle, and the project won't play. There are some ways to escape from this situation. For instance, muted tracks take up processor time, so unneeded tracks should be hidden. Likewise, there may be some plug-ins that are active but with all parameters set to neutral. Any unnecessary plug-ins should be removed. In some cases, the action of a plug-in can be applied as a DSP action to a track, creating a region that can simply play back. Finally, consider combining a few related tracks
with a preliminary bounce. Then import the resulting file and hide the tracks that were combined.

COMPOSING WITH LAYERS OF SOUND Composition with a digital audio workstation includes several phases: the composer imports or records the sounds, places them in various tracks, applies processing, builds a mix, then makes the final bounce to disc. It seems complex to describe, but it is actually quite efficient, given that all of the materials for the piece are at your fingertips and can be easily combined in trial arrangements. Composition is like any other major undertaking: a bit of preparation can make an impressive difference in the amount of work required later. I prefer to start a project by cleaning up and isolating all the sounds in a stereo editor. One feature of this phase is level adjustment—I don’t normalize the sounds, but I do bring them to the same level as shown by the meters. The result is dozens of short files. This is a good opportunity to create names that show the quality and relationships of the various materials. Once the sound palette is ready, I import it into the DAW. If everything is in the same directory, it can be brought in as a batch. The next step is to create some blank tracks. The number depends on the complexity of the piece, but it is not necessary to know exactly how many will eventually be needed. In general, I devote one or two tracks to each part of the composition. This is analogous to a pop music mix with bass, drums, rhythm, and lead tracks, but I expect to include a variety of sounds in each track. My categories are more likely to be rhythm, bass, backgrounds, and foreground. With a few tracks to fill, the actual assembly can start. There are two basic approaches: either fill in one track all the way and synchronize the other sounds to it, or complete each measure before going on to the next one. My own working style is a combination of the two—I’ll put down about sixteen bars of material to outline a section, then fill it out before going on. Whatever approach you take, be sure to stop often and listen to the entire piece as you put it together. Building a composition by layering sounds is really a kind of orchestration. I’ve found it useful to think in terms of melody and accompaniment or the concepts of rhythm, harmony, and lead. Both approaches admit that not all sounds are equal or that they at least have different levels of importance at a given moment. Some sounds seem by their nature to be destined for the foreground, others not, but most materials are capable of fulfilling either role. The essential quality is contrast— foreground sounds are noticeably different from the background in some aspect. In this view there is one foreground with its unique characteristics and several background sounds that are fairly similar. As the piece progresses, the various materials change roles. The opening gesture may be quite striking, then recede beneath an even more dramatic solo line. This manipulation of the listener’s attention is the greatest challenge of layering.


Background and Foreground

The whole issue of foreground depends on perception, the ability of our minds to pay attention to one part of a complex event. The foreground is where the attention is. Foreground sounds possess qualities that make them stand out, and the science of auditory perception offers some clues to what these qualities are. The most important quality is change. The human perceptive mechanism has evolved to focus on the new, on the parts of the environment that are changing. Since there is no such thing as silence, our brains have developed the trick of ignoring continual or familiar sounds but becoming alert when there is something new. This implies that any aspect of a composition will become background if it is repeated enough. If the sound is relatively static (even if quite busy) this process can happen quickly. A newly introduced sound will have the spotlight simply because it is new, but to stay in the foreground, it must possess some qualities that differentiate it from the rest of what is going on.

The most obvious feature that calls attention to a sound is loudness. We generally attend to the loudest sound going, and it is common musical tradition to keep accompaniments in the background by playing them more softly. In acoustic music, this is often accomplished by having the performer who is playing the lead stand up or step forward. In electroacoustic work, the same thing is accomplished by moving a fader up, but it can be tricky to judge how far. Inexperienced mixers tend to be too timid in this operation. To really claim the foreground, a sound must be significantly more powerful than the rest. It's as if there must be some space between the foreground and the accompaniment. If all volume niches are filled, the effect is more like a solid block of sound. (The “wall of sound” has been a popular mixing style for some years, but it should not be produced accidentally.)

Just as changing loudness brings a sound forward, a changing pitch demands attention. The interplay of melody and harmony is the most obvious vehicle for this, but there are some subtle tricks as well. In classical music, a vocalist may add vibrato to break out of the chorus, and an instrumentalist may play just a bit sharp. A guitar hero will work the whammy bar. In electroacoustic pieces the concern is usually more about getting pitches to match in the first place, but the same tools that solve this problem can provide the variation needed to make a line stand out. Here the neophyte mistake is to do too much—this is one of those processes that should never be obvious. Detuning by about two beats a second is usually plenty.

Another feature that can put sounds in perspective is timbre. There is no simple rule here, just that a sound with a texture different from the others will tend to stand out. A group of sine waves will be hard to differentiate, but a single sine will stand out against a wash of noise. The familiar instrumental examples are a saxophonist biting the reed or a guitarist hitting the distortion pedal. Indeed, distortion definitely brings things out in a mix, but it too requires a delicate touch.

The attack of a sound's envelope is a strong determinant of perception. Those with gradual attack will not be noticed if much else is going on. Percussive sounds will jump out unless they are in a repeating loop. Then they may begin as foreground,
but will recede to the background as the pattern becomes familiar. Attacks can be modified by compression, as discussed in chapter 5. The effect of compression depends on how it is combined with level adjustment. A lightly compressed line that is brought up will become more clearly audible, while heavy compression flattens all dynamics and pushes sounds down. Sounds in high or low registers have difficulty becoming foreground. According to Fletcher and Munson (see chapter 2), they are not as perceptually loud as the middle range, but there seems to be more to it than that. Even a tightly compressed bass line will recede when mid-range material comes in. On the other hand, any time several sounds cluster together in pitch and are otherwise matched, the outer pitched sounds will be most prominent. DVD example 6.1 demonstrates several ways to differentiate foreground and background. The foreground and background exercises at the end of this chapter will help develop skill in evaluating this aspect of sound.

Spectral Balance Sound engineer Bernie Krause has made an interesting discovery while capturing sounds for PBS documentaries: animals adjust the frequency of their cries to fit into empty bands in the local spectrum. If there are six species of frog in an area, each will sing at a different pitch. This presumably makes their mating calls audible over a longer distance. It also assures that the spectrum will be filled from top to bottom. A similar principle applies to composition with multiple layers of sound. You should choose materials that fill out the spectrum without too much overlap. This is established music practice—soprano, alto, tenor, bass; or drums, bass, harmony, and lead. If two types of material fill the same niche, they should probably alternate rather than compete. We must resist the temptation to overuse the extreme ends of the sonic range. Very high and very low frequencies can be used to enhance the sounds in the mid ranges, but these have their dangers. Not every sound system can reproduce, and not everyone can hear, the top octave or even two octaves of the official 20 Hz to 20 kHz range, so material above 8 kHz may be missed by a substantial portion of the audience. The low end is certainly audible, even tangible under some circumstances, but unless you have complete control of the playback system, there is a distinct risk. Many speaker systems have a peaky response in the mid bass. If you trigger this response with sub-bass sound, the speaker resonance will introduce pitches you never intended. The practical low end for widely distributed music is 60 Hz. When sounds are combined in the same frequency band, you usually get a rich complexity, but in some circumstances one sound will completely obscure the other. This is the masking effect discussed in chapter 2. The details of masking are the subject of intense perceptual studies these days, as the whole field of audio data compression depends on it. In general, the rules are simple. If two sounds are
nearly the same pitch and one is significantly louder than the other, only the loud sound will be heard. It's as if the loud sound casts a shadow over the other. The shadow is a bit deeper on the high side of the loud pitch, and the shading effect actually lasts for a brief time. Masking is at its most insidious when speech is combined with other sounds. It is often possible for the words to be heard but not understood. This happens when the quieter phonemes are masked, leaving holes in the text. Consonants are particularly susceptible to this. They are relatively weak in the overall vocal spectrum, and the subtle components that distinguish (for instance) da from ta are weaker still. The vocal equalization described in chapter 5 reduces the masking with little effect on the quality of the sounds. (To review, that's a 3 dB boost somewhere between 1 and 3 kHz on the voice and a matching cut on competing parts. The actual frequency boosted should be tuned to the voice.)

Ultimately, the judgment about masking has to be made by listening, and it can be tricky. If you know a sound is there, you are practically guaranteed to hear it, but others may not. You have to learn how to hear your work with fresh ears, as if for the first time. One strategy is to leave the piece alone for a day or two and make adjustments when you come back to it. Another is to learn to recognize situations where masking is likely and make conservative decisions about level. DVD example 6.2 demonstrates some masking effects. The masking exercise at the end of the chapter will help develop skill in balancing conflicting sounds.

Synchronization During the masking exercise you may have discovered that synchronizing sounds can be tricky. The best tools for exact placement of single sounds are the grid and metronome. The process is fairly straightforward, if tedious: Start by placing the clip containing the sound on a downbeat. Now turn on the metronome and listen to how the sound matches the tick. Chances are it will be late. Adjust the sync or anchor point of the clip until the sound is tight with the metronome. You may have to slide it away from the beat and back to hear the change take effect. Once it sounds like it is exactly on the downbeat, move it to the gridline that will be its final destination. Any copies you make will have the same anchor, so you only have to do this once per sound. As the texture gets complex you will rely less on the metronome and more on how the parts sound with each other. Then it’s time to forget the tick, because the slightly irregular rhythms that arise from matching odd sounds by ear are much more interesting than machined perfection. You will also discover that it is easier to synchronize long sounds to short ones than the other way around. This is the reason most recording engineers lay down the drum tracks first. Even if your piece does not have an actual drum track, there is probably some material that serves that function, marking time and giving other events a reference for expressive deviation. Create this part first, then add other
parts in reverse order of their freedom. The best way to confront synchronization is to produce some danceable music, as proposed in the exercises.
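The anchor-point procedure comes down to a little arithmetic: the region must start wherever it has to so that the anchor lands on the chosen beat. A small sketch follows; the names are my own rather than those of any particular DAW.

    def snap_to_beat(anchor_offset, target_beat, tempo_bpm=120.0):
        # Place a region so that its sync point, anchor_offset seconds past the
        # region edge, lands exactly on target_beat. Returns the region start time.
        beat_length = 60.0 / tempo_bpm
        return target_beat * beat_length - anchor_offset

    # A clip whose perceived attack is 0.12 s after the region edge, placed on beat 16:
    print(snap_to_beat(0.12, 16))  # 7.88 seconds at 120 BPM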

Tuning We were able to slide by the issue of intonation in the exercises in chapters 1 to 5, where sounds were more or less isolated, but once materials begin overlapping, tuning cannot be ignored. If we are generating source material by recording things found in a kitchen drawer, it is pretty unlikely that any two sounds will be in tune with each other. In fact, even with musical instruments, a recording will reveal many inconsistencies in intonation. A certain amount of variation is acceptable, but serious deficiencies will need to be fixed. The necessary skill is being able to tell if two tones are the same pitch. This does not require perfect pitch (useful but difficult to learn), just the ability to hear when something is off. A DAW is an excellent tool for learning this. Place a simple, strongly pitched sound on a track and repeat the sound several times. Now copy the track and apply slight pitch change to every other occurrence of the sound in the new track (make notes of how the sound is changed). Play them simultaneously. When the pitch difference is small, you will hear a pulsation in the mix, known as beating. The beat frequency is equal to the difference in frequencies of the two tones. Thus, the further apart the pitches are, the faster the beats. We generally accept anything slower than two or three beats a second as being in tune. Of course, this is pretty artificial, tuning the same waveform to a unison, but the technique will work in most situations. Sounds with a strong impression of pitch will produce beats when they are close. You can also expect beating when tones are almost but not quite an octave or a fifth apart or even in thirds. The procedure for tuning up sounds in a mix is pretty easy to describe if tedious to achieve. Change the pitch of one tone a bit and listen to the beats. Slow beats sound better than fast beats, but if the beats are eliminated entirely, the tones will fuse together and lose their individuality. It takes practice to become efficient at tuning, but it will come in time. DVD example 6.3 demonstrates beating with some synthesized tones. Tuning a sound to match another is only part of the problem. You also need to keep the tuning consistent throughout the piece. For this, an external pitch reference is needed. There are software solutions, but most studios generally have a keyboard synthesizer handy. It’s then simple to compare any sound to a known pitch. If you prefer a software approach, the best way is to do some analysis of the sounds before beginning the actual composition. There are many programs that will tell you the frequency of a sound—a good one is Transcribe! by Seventh String Software. There are also some plug-ins that do frequency analysis. Once you have determined the pitch, the best place to keep track of it is in the filename. One of the toughest tuning problems occurs between sounds that change in pitch. If you are matching a cat sliding down a half step with a dog sliding up, what
parts of the pair do you match? Like most musical questions, the answer depends on the context, but I generally try matching the ends of the sounds first. That way the gestures slide into tune, a relatively common musical event. Tuning is often overlooked as a tool for expression. It is a cliché of horror films that discordant sounds are nerve-wracking and threatening, but this effect can be used subtly to reinforce passages that are intended to be energetic or calming. One of the most powerful features of the electronic toolkit is the ability to explore alternate tuning schemes that can easily surpass the standard temperament in richness or purity. The tuning exercise will develop your skill in matching tones and adjusting intonation for various effects.
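The beating rule is easy to verify with numbers. A short sketch in Python with NumPy, using arbitrary example frequencies:

    import numpy as np

    def beats_per_second(f1, f2):
        # The beat rate is simply the difference between the two frequencies.
        return abs(f1 - f2)

    sr = 44100
    t = np.arange(sr * 3) / sr
    pair = np.sin(2 * np.pi * 440.0 * t) + np.sin(2 * np.pi * 442.5 * t)
    print(beats_per_second(440.0, 442.5))  # 2.5 beats per second, on the edge of "in tune"

Playing the summed pair makes the swell and fade of the beats obvious, and slowing them down by retuning one tone is exactly the procedure described above.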

Repetition It’s easy for a teacher to be dismissive of repetition in compositions. With copy and paste so readily available, many lazy composers produce pieces that are only a few measures of work padded with endless looping of the same material. Such pieces are generally pretty boring by the second listening. The proliferation of “anyone can be a composer” software packages is not helping the situation. However, it is futile to deny the legitimate use of repetition in contemporary music. Composers as diverse as Autechre and Philip Glass prove that repetition can be a powerful element in music. No one who has played bassoon or viola in a Mozart symphony can deny that repetition has always been an important element in music. However, there are gradations of repetition. These range from the blatant looping of GarageBand through the subtle evolutions of Glass and Meredith Monk to the subliminal underpinnings of a Bach passacaglia. The electroacoustic composer needs to be capable of repetition at all of these levels and will probably combine them freely. Of the gradations mentioned, the most difficult is subtle evolution. There are several strategies for producing this effect. A good loop is usually made up of varied materials in some sort of complex counterpoint. If each sound pattern is given its own track, the balance of materials can be varied, with some elements even taking brief vacations. DVD example 6.4 shows how this can work. It only takes a few new sounds to freshen a loop. If a loop is created by copy and paste (paste, paste, paste), it’s simple to replace a few sounds in each copy with similar sounds. This is another reason to keep many takes when you are recording source material (DVD example 6.5). Copy and paste is not the only way to create a loop. Obviously, you must build the first iteration by editing and layering. If you do this again for each repetition with snap to grid turned off (or with a fine grid), there will be subtle rhythmic variations. Adding some dynamic modifications to each element will give the same sort of variation you get when human beings play a passage re-
peatedly. You can even add excitement by making progressively greater variations and a bit of crescendo. This is admittedly the labor-intensive approach, but the work pays off in the end (DVD example 6.6).
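Part of that labor can be automated: build one iteration, then give each copy a slight nudge in timing and level. A rough sketch of the idea; the event format is invented for illustration.

    import random

    def vary_loop(events, repeats, loop_length, time_jitter=0.02, gain_jitter=0.1):
        # events: (start_seconds, gain) pairs for one iteration of the loop.
        # Each repeat is nudged slightly in time and level, like a human performance.
        out = []
        for i in range(repeats):
            for start, gain in events:
                out.append((i * loop_length + start + random.uniform(-time_jitter, time_jitter),
                            gain * (1.0 + random.uniform(-gain_jitter, gain_jitter))))
        return out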

Complexity New composers often ask, How many layers of sound are enough? The answer is probably four. Four seems to be the human limit for following separate parts over a long period of time. Bach knew this, Mozart knew this, and the Beatles knew this. It is being borne out in modern psychoacoustic experiments. You can prove it for yourself by counting the parts in the next five pieces of music you hear. If you have more than four parts, you should link some so that they are heard as a stream (folks in the sound effect business call this marrying sounds). The classical equivalent is octave doubling or parallel thirds. There can be subtle interplay within a group as long as the impression of togetherness is maintained.

Reaching a Conclusion Most composition teachers agree that the ending is the most difficult part of a composition. I am inclined to agree, if only because of the number of pieces I hear that have no ending; they just fade away while continuing to loop. But I also believe that difficulty in composing an ending is a symptom of a structural problem with the piece. A good ending is anchored in a good beginning and a good middle. Songwriters will tell you that a good song tells a story, and the ending comes out of that story. Abstract music does not necessarily tell a story, but it can still have narrative elements. The sound materials are your cast of characters, and something happens to those characters. Generally, there is interaction between the characters that often provides the main interest. The ending finishes the story of the main character.

Mix and Balance

The final mix should not be complex. Once all materials are in their place with any processes and plug-ins applied, the only issues to address are balance and panning. These will be managed by mix automation, which can be drawn into the tracks.

Panning is usually simple. I find electroacoustic works are most convincing if the concept of stage image is borrowed from pop mixes. In that tradition, the most important instruments are kept near the center, with auxiliary parts panned somewhat to the left and right. This means that pans are set at the start and seldom moved again. It is actually useful to draw a map of where the sounds should appear
to be. Place some further upstage than others—you create this effect by giving those a bit more reverb. For any sounds that do wander, plot out their motion. It’s more likely that such sounds will simply appear at different locations than actually move while playing. Many of the sounds may have reverb already, but it’s good practice to apply an overall reverberation. This is less out of a desire to create a realistic space than to give the sounds some extra commonality, a bit of glue to hold everything together if you will. This trick is accomplished by using an extra bus with the master reverb as a plug-in. Each channel then contributes to the reverb signal via the aux send. This control also will probably not move during the mix. The controls that do move are the track faders. The balance between materials will change constantly. Much of this change is already part of the sounds, as determined by the expression of each phrase and whatever dynamics were inherent in the source material, but more adjustments will probably be needed. In the discussion of foreground and background we covered a lot of reasons why one sound may be more prominent than another, implying that shifting of focus during a piece is a good thing. Level adjustments reinforce the factors that attract the listener’s attention. If one sound steps into the spotlight, the others may step back by way of a slight reduction in volume. The goal is to have a steady overall level (barring dramatic effects) with particular voices most prominent in the mix when it’s their time to shine. Contemporary style is to balance the mix in spectrum as well as in loudness, so the fader work may take frequency into account. The bass line will not fade back to accommodate a mid-range solo, but other mid-range parts will.

CASE STUDY

DVD example 6.7 is a simple composition made using these procedures. The source material came from another visit to my kitchen. The sources include a cheese shredder, the lid from a saucepan (which produced either a C or B depending on where it was hit), and a pan of popcorn popped over a gas flame.

Nearly half the work was done in the stereo editor prior to importing the files into Pro Tools. The popcorn was edited a bit, leaving out the first two minutes of heating and the last bit where the kernels started to burn. I also applied some low-cut filtering to reduce background sound. Nothing can get rid of the hiss of the gas flame, so that is an intrinsic part of the recording. The popper has a crank to stir while popping, and that is occasionally heard. The track was then used as is to make the backbone of the piece (since it is popcorn, I guess you would call it an arrhythm track).

The pan lid was played with a rubber mallet. There is no attempt to sync it. I just played at a reasonably steady rate for several minutes and pulled a section out that has some interesting variations in tone. I then processed this three ways, with a cho-
rus plug-in, with chorus and a wah-wah filter, and again with heavy compression and distortion. The cheese shredder was scraped with a wooden stick. I made two excerpts from this. The first is a pattern of three scrapes dropped and slowed by three octaves, then processed with a ring modulator. That produced a timbre reminiscent of the pan lid. I also took a quick scrape and looped it into a two-bar clip. The layout of the parts is visible in Figure 6.1. Track 1 is the popcorn, track 2 is the unprocessed lid, starting at 17 seconds. Tracks 3 and 4 are the processed lid in a canon with the original. The first sound of track 3 lines up with the second of track 2, and the first sound of track 4 lines up with the second of track 3. Since the lid playing was not perfectly steady the tracks slowly spread out in time. Track 5 is the heavily processed version with a nearly steady raw tone. This comes in a bit later, matched with the third note of track 4. The tone fades in so it does not really participate in the rhythm. It fades away after the other lid tracks have dropped out. Track 6 is the most composed track. This used the sounds of the shredder. I started assembly with the final section, the lowered and ring-modulated version. This was placed so the last sound would have an interesting relationship to the tapering away of the popcorn. I had to try eight or nine placements to find the right spot. Next I began adding the shredder loop. The idea was to just barely hint at a loop. The first two occurrences are shortened, then the sections get progressively longer. Since the clip has two circuits, it is easy to extend it by dropping two clips together. Pro Tools’s shuffle mode ensures that the clips will abut cleanly. As the clips get close to the processed version, I lower the pitch by one octave, then by two octaves to approach the lower register gradually. The last loop is truncated so the deep crash happens on the beat. The mix was quite simple—the pan lid with chorus tracks are panned left and right, everything else is centered. There was some relative level adjustment, but track 5 is the only one with a moving fader.

EXERCISES

Foreground and Background

1. Listen to your sound bank (including processed sounds you have developed), choose three or four sounds to be foreground, and pick a compatible background for each. Use the techniques from chapter 4 to create 20-second passages with each, then import your pairs to the DAW and listen to the combined effect.

2. Pick some sounds that can be either foreground or background and make a short composition where the foreground material becomes background, and vice versa. Finish the piece by returning the sounds to their original perspective.

Masking

Pick two sounds that are short, possibly percussive. Use the DAW to create a simple rhythmic pattern with one, then exactly match the rhythm in a new track with the second sound. Set the DAW to loop play, and experiment with the levels (using the mix faders), noting how much difference is necessary before one sound overwhelms the other. Now slip one track back about 10 milliseconds and repeat the experiment.

Dance

Use the materials in your sound bank to create some rhythmic dances. Start with an up-tempo version first, then try for that same feeling of motion with a leisurely waltz.

Tuning

Record several similarly pitched sounds, such as the wineglasses in chapter 2. First bring them all into tune, then construct a piece that uses gradual shifts in intonation to establish a flow from tension to relaxation.

Repetition

Make a composition based on three interlocking loops. Vary each loop by a different technique as described above, and return to the original by the end.

Mix and Balance

Record a voice reading a text about two minutes in duration. Add sounds that accompany and intensify the meaning of the words.

Final Exercise

It’s time to do some serious work. Plot out and construct a three-minute piece that uses most of the techniques that you have studied so far.


RESOURCES FOR FURTHER STUDY

The most important reading at this point is the manual for your DAW. If you still have questions, look for third-party books about your DAW. Unless you have chosen something really obscure, there will be several to choose from. Helpful hint: look for books where you understand the first page but not the last.

There are many books about composition in general, and these are interesting, although most are really quite personal. The major exception, which covers several styles, is:

Cope, David. 1997. Techniques of the Contemporary Composer. New York: Schirmer.

I prefer to try to understand listeners by studying psychoacoustics. There is a lot of new research about what listeners can hear and how sound elicits emotions. One particularly accessible book on the subject is:

Levitin, Daniel J. 2006. This Is Your Brain on Music: The Science of a Human Obsession. New York: Dutton.


Part 3
Music Store Electroacoustic Music

Up to this point, we have dealt with techniques for working more or less directly with sound. Audio montage has been used to create a lot of beautiful music, and more is yet to come, but it does represent a total break with the traditions of music making. In Western music, these traditions are centered around playing musical instruments or, for the composer, telling other musicians what to play. A visit to any music store will reveal how performance and composition have been profoundly transformed by advancing technology. The next few chapters discuss these changes and how to make the best artistic use of the high-tech versions of traditional tools.

Media scholars of the 1960s and 1970s pointed out the effects that a communications medium has on the content of the communication. For example, telephone conversations encourage spontaneity and rapid give-and-take, whereas e-mail exchanges offer the opportunity for careful consideration of the issues and well-reasoned response. Text messaging requires short statements and creative spelling. This principle was famously summed up by Marshall McLuhan as “the medium is the message.” Nowhere is this more apparent than in the effect the medium of MIDI has on the music it is used to produce. A detailed study of the MIDI protocol may seem as pointless as asking a painter to learn quantum electrodynamics, but it has real benefits. The design of the MIDI protocol imposes many limitations on what may be easily specified, and the way to escape those limitations can be found in some subtle aspects that are seldom discussed.


SEVEN

MIDI

HISTORY

The history of electroacoustic music can be neatly divided into two periods: pre-MIDI and post-MIDI. The introduction of the MIDI standard, which not coincidentally paralleled the development of the personal computer, marked the entry of major instrument makers into the business, along with a transformation of the discipline from a fringe art to a mainstream musical genre. This happened in 1982.

There was plenty of electroacoustic music being produced at the beginning of the 1980s. Many university basements were full of tape recorders and synthesizers, and a second generation of electronic music and computer music teachers were training students eager to put these skills to use in serious careers. Most bands included a keyboard player with a literal stack of instruments, instruments now legendary, such as the Sequential Circuits Prophet-5, the Roland Juno, and the Minimoog.

The original instigators of MIDI were trying to address a modest problem with these instruments. It was well understood that large modular synthesizers were dinosaurs due to be supplanted by nimble little beasts with fewer capabilities but more flexibility in live performance. These were referred to generically as “keyboards,” and keyboards they were; the area of circuitry required to generate signals was nothing compared to the three square feet required for a standard key mechanism. Since each instrument had a unique sound, most performers soon found themselves with a stage full of ivory-colored plastic and a chronically sore back. The manufacturers were not happy with the situation either. They did not actually build the keyboard assembly; these were purchased from organ companies and basically added cost to the instrument with no sales benefit. The fairly obvious solution was a scheme that would allow a central keyboard to control a number of simpler synthesis devices. A group of manufacturers came up with just such a scheme and named it Musical Instrument Digital Interface.

The manufacturers’ association did an excellent job. (Most of the credit belongs to Dave Smith of Sequential Circuits, but he did not work alone.) Not only was the standard adopted immediately by every company in the industry (many even sold retrofit kits for existing models), it has now been in use with no significant changes for thirty years. MIDI has even spread beyond the music world, finding applications
in theater production and movie special effects. The adoption of the controller and tone module concept had a tremendous impact on the would-be electronic musician. For one thing, the price of a typical instrument dropped from $2,000 to about $500. A keyboard was still expensive, but only one was needed and it would not become obsolete the way the rapidly evolving synthesis engines did. In fact, the division encouraged the development of better keyboards with companies like Robert Moog’s Big Briar specializing in well-crafted and expressive models. Controllers modeled after other instruments began to appear, although they made few inroads on the ubiquitous black and whites. (Alternative controllers are discussed in chapter 18.) The economies of MIDI made electronic synthesis affordable for the working musician and facilitated electroacoustic music’s escape from the confines of academe. That’s not to say there was ever a plot to restrict electronic music technology to the cultural elite, but musicians who stayed close to music departments to get access to the equipment were also subjected to influences that shaped academic music of the period and somewhat isolated from popular trends. The MIDI diaspora brought electroacoustic influences to every aspect of musical performance from Hollywood to country and western. This period saw rapid model turnover as new electronics made last year’s models obsolete, so even the proverbial starving musicians could afford used instruments. Some of today’s most prominent artists started their career with instruments discarded in Dumpsters. The musical effects of MIDI are more pervasive than stylistic democratization. The requirements of the standard actually marked a return to a more traditional approach to composition. As you may have gathered from the procedures covered so far, classical electroacoustic techniques are not based on notes as the primary unit of sound. We have been working with sound as a fabric, which may be cut up into notes if desired, but can also be manipulated in a variety of other ways. The MIDI data stream is definitely defined as notes, and as keyboard notes at that. Such basic non-keyboard events as crescendos are awkward to specify and seldom convincing. MIDI has also brought precise equal temperament back to standard practice. It is possible to specify other temperaments and the pitch bend wheel provides expressive deviation, but when you get down to it, pitch in MIDI is a single number per note. The flexible intonation heard in acoustic string and vocal ensembles is just not available. One development that followed the introduction of MIDI proved especially disappointing to early advocates of electroacoustic technique. The marketing of instruments to traditionally trained musicians led to a sort of arms race to see who had the best imitation piano. This carried over to imitations of other instruments, even those that sound preposterous when played by a keyboard. This trend ultimately produced a standardized set of sounds known as general MIDI. There is nothing inherently evil in imitation, but it seemed to signal the end of one of the original goals of the electroacoustic approach—new, unimagined sounds. The takeover of design by marketing departments led to another development that was disappointing to experimental musicians—the loss of the knobs. The first
instruments of the MIDI era were fully programmable, with flexible architectures that provided a rich variety of sounds and textures. But pretty soon the legend began to grow that instruments returned for service never had any but the original sounds stored in memory, that few if any musicians were using the programming features. This begs the question that electrical faults would likely erase the instrument’s memory or that most returns occur during the first six months of use, but it was probably true. Developing usable new sounds on a synthesizer is a fairly difficult art, especially for someone with absolutely no training. This was exacerbated by the amazingly awkward programming interfaces on these machines. Eventually, programming features disappeared from most lines, leaving a legacy of glorified electric pianos. A third effect of the mass marketing of electronic instruments was quite promising at first. This was the development of the sampler, which will be discussed in detail in chapter 9. In brief, a sampler is a recording device capable of storing a lot of short clips in digital memory. The feature that makes it an instrument is the ability to play the clips back instantly as commanded by MIDI messages. If desired, the clips can be changed in pitch according to the key pressed. This allows a sound to be spread over an entire keyboard—we have already explored the wonders to be discovered in shifting the pitch of various sounds. Unfortunately, these instruments also eventually fell prey to marketing forces and the demand for imitation pianos. As you can imagine, you can’t just change the pitch of the low A to get all of the notes on a piano. A convincing imitation requires an individual recording of every single string, a task too daunting for most customers. So the sampler companies began selling prerecorded sound banks with pianos and other instruments on them. Eventually the marketing department began to question the need for the sampling feature at all and started pushing instruments that were just sound banks packaged in steel. These sold briskly and full-featured samplers eventually disappeared. Samplers are now a species of software, and only one or two of the programs available allow the users to make samples for them. These detractions aside, MIDI has been a good thing. The accessibility of MIDI systems has brought many more people into the field than the universities could have ever handled, and the large market created has inspired the invention and manufacture of thousands of clever and unique products. MIDI by and large brought electroacoustic music out of the lab and into the real world of performance and production.

THE HARDWARE

You can identify a piece of MIDI-capable equipment by a characteristic connector, usually found on the back of the unit. This connector, technically known as a 5-pin DIN socket, was chosen because it is reliable and cheap. You will find DIN connectors in other situations, particularly in European consumer audio gear, but on
musical instruments these are for MIDI. A socket will be labeled MIDI out if the device is capable of generating MIDI commands or MIDI in if the device can respond to MIDI. A third type, labeled MIDI thru, provides a direct copy of the MIDI in data. This is convenient for connecting multiple devices to a common source. The interconnecting cables have matching plugs, with male connectors on each end. A single cable carries data only one direction, from MIDI out (or thru) to MIDI in. The original concept of a MIDI network was pretty simple: one master controller and many slave tone generators. The MIDI cables are connected from the master MIDI out to the MIDI in of the first slave, then from MIDI thru to MIDI in between the rest of the slaves in a configuration known as a daisy chain. In order for the master to address a particular slave in the chain, messages are tagged as belonging to one of sixteen MIDI channels. A slave can be set to listen to one channel just as a TV set can be tuned to a single channel on the cable (slaves may also be set to Omni mode to respond to everything). With channels available, a keyboard can be split, sending channel 1 from the left-hand keys and channel 2 from the right-hand keys. This gives the performer independent bass and lead sounds. Since many synthesizers of the era could only produce one tone at a time this system makes a lot of sense. It was also common to add synthesizers in layers, each contributing to a fat, rich mix of sound. MIDI networks soon grew complicated. The first variation was the addition of a second controller. Since MIDI messages are batches of 1s and 0s sent along a single cable, it takes a fairly complex device to connect two sources to a single destination. If one source is in the middle of a transmission, data from the second must be stored until it can be injected into a pause in the traffic. These devices are called MIDI mergers. A second complication arose because some manufacturers did not provide a MIDI thru connection or when the MIDI thru function had a noticeable delay. In that case a dedicated MIDI splitter box with multiple outputs was required. The final step in the evolution of MIDI systems was the addition of a computer. A computer is connected between the controller and the rest of the network to record the MIDI data for later playback. The program that does this is called a sequencer, which we will cover in great detail in chapter 8. Some early personal computers had MIDI connections, but now an adapter is required to match MIDI to the generic serial ports or USB connectors found on contemporary computers. Such adapters can be simple or complex. The basic models look like a simple cable, although some sophisticated electronics are hidden in the plugs. Advanced MIDI interfaces have multiple inputs and outputs, which are collectively called MIDI ports. Since each MIDI port can manage sixteen MIDI channels, multiport adapters have vastly expanded the possible complexity of a studio. (See Figure 7.1.) Most MIDI interfaces require installation of driver software. This comes with the interface, but it’s better to visit the manufacturer’s website and get the most recent update. The drivers often include an application that sets options on the interface itself, options such as MIDI routing from controller to synthesizers when the computer is not turned on. On Macintosh systems a lot of this functionality is in the
MIDI page of the audio/MIDI setup utility. This window allows naming of ports and even the connected instruments.

FIGURE 7.1 MIDI connections.

The use of a computer as MIDI recorder poses a puzzle when the master controller is a keyboard synthesizer. These instruments include both a controller and a slave, because the keyboard and internal tone generator are really independent devices. The internal tone generator can be set to play everything that comes along or follow one zone on the keyboard. But if there’s a computer at the hub, it should have total control over what is played, including the keyboard’s internal generator. This is supplied with a switch on the keyboard called local control. When the computer is in use, local control is turned off and a MIDI line is brought back from the computer to the synthesizer. The keyboard will no longer play its own tone generator unless the computer calls for it.

The latest trend is for controllers and some synthesizers to send their MIDI data directly over USB or network cabling, skipping the interface adapter. This has extra advantages; since USB is a two-way system, a cable is eliminated, and USB can even supply power to the controller. Often a USB controller will include MIDI ports, eliminating the need for an interface for the classic gear.

THE MESSAGES

No matter what kind of cable is used for MIDI transmission, the data is the same. MIDI is organized as short bursts of data known as messages. These messages are
sent one after another along the wire (there are no simultaneous events in the MIDI world). A message begins with a byte called “status” that identifies the type of message and the channel it applies to. This is followed by one or two data bytes that are interpreted according to the most recently received status. Additional sets of data can be sent without repeating the status byte. This trick, called running status, can significantly improve the data rate. MIDI data is transferred at 31,250 bits per second, and it takes 10 bits to transmit a byte of data (there must be 2 bits of space between the data bytes). The most common messages have two data bytes, so skipping the status byte can raise the number of messages per second from 1,000 to 1,500. You might think that a millisecond delay between messages is not much, but consider a chord with ten notes. The serial nature of MIDI means that the chord is really an arpeggio. It has been argued that a 10 ms spread is perceivable, even if it’s not overtly obvious. Of course it has also been argued that human musicians seldom approach this degree of accuracy.
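
To make that arithmetic concrete, here is a short sketch in Python (the language is incidental and not part of the MIDI standard; the byte values are what matter). It packs a note on message and shows the savings of running status. The channel, key, and velocity numbers are arbitrary examples.

# Build a three-byte note on message: a status byte (0x9n, where n is the channel),
# then two data bytes for key number and velocity.
def note_on(channel, key, velocity):
    return bytes([0x90 | (channel & 0x0F), key & 0x7F, velocity & 0x7F])

# A three-note chord sent as three full messages...
full = b"".join(note_on(0, key, 100) for key in (60, 64, 67))    # 9 bytes
# ...and the same chord under running status: one status byte, then data pairs.
running = bytes([0x90, 60, 100, 64, 100, 67, 100])               # 7 bytes

# At 31,250 bits per second and 10 bits per byte, MIDI moves 3,125 bytes per second:
# roughly 1,000 full three-byte messages or 1,500 two-byte running-status messages.
print(len(full), len(running))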

Channel Messages

There are many more types of message than most musicians realize. Ninety percent of the music can be represented with four kinds of status, but if you limit yourself to these four, your music will be stiff and constrained. Concepts like polyphonic aftertouch may seem arcane, but they can be the solutions to apparently intractable problems. So-called channel messages carry information pertinent to performance. These are distinct from system messages, which carry general information.

Note On and Off

Note on and note off are the workhorse messages, delineating the beginning and end of a sound. These are two distinct messages—in MIDI there is no concept of duration. These should be more accurately termed key down and key up, because that is the action that instigates them. What is actually heard will depend entirely on the synthesizer that is reacting to these messages. For instance, the end of a note will probably not occur immediately upon releasing the key. Most sounds have a decay phase that is begun by the note off message; the decay phase may continue for quite a while. Percussive sounds usually have a fixed duration that is triggered by note on, and note off is only required for bookkeeping reasons.

An instrument that receives a note on will continue playing the note until it receives a matching note off on the same channel. Unplugging a MIDI cable while a pitch is sounding will probably result in a stuck note, as will changing channels on a keyboard. Usually you can deal with a stuck note by replaying the note, although you may have to resort to turning the synthesizer off. There is an all-notes-off message, sometimes available under the rubric “panic.”

Both note on and note off include two bytes of data: key number and key velocity. The key number identifies which key is moved. The data, like most of the numbers in MIDI, has 7-bit resolution, with possible values of zero to 127. While the size
of the keyboard is not assumed, key number 60 is defined as middle C. For some reason this key is identified as C3 in MIDI literature rather than the traditional pianist’s label of C4. The lowest key, which corresponds to a zero in the data, is labeled C-2 (that is, C minus two). This pitch is about 8 Hz, barely perceivable and not rendered well by many synthesizers. The highest key, number 127, has a frequency of 12.5 kHz. This is clearly audible to young ears, but also is not rendered well. The practical range of usable notes is about number 24 to number 96 when the keys are mapped to pitch. Of course, keys need not determine pitch. In percussion synthesizers, for example, the keys are usually mapped to various sounds, and lighting controllers map the keys to individual lights. Since there is no duration encoded in the messages, timing is entirely up to the performer or controlling software.

This concept of note on, note off reveals the keyboard bias of MIDI. It is impossible for a keyboard to play a unison, but many other instruments can. A few synthesizers allow unisons, but most will ignore note on for a note already playing and will stop entirely on the first note off. If you want to handle this effect predictably, you need to use two MIDI channels.

Key velocity for note on and note off indicates how hard the note was hit, another keyboard concept. The interpretation of note on velocity is up to the synthesizer program, but it most commonly controls loudness. Velocity also has a range of zero to 127, with a default of 64 transmitted by keyboards that don’t have the feature. Zero velocity is actually interpreted as a note off, which allows efficient performance under running status. Velocity information for note off is not often used, but it occasionally controls decay rate.

Control Change

Control change messages also have two bytes of data, one for control number and one for the value. A control is assumed to be anything you can manipulate that is not a key, like a knob or slider. These are numbered, but since there is no standard arrangement of knobs, the numbers refer to common functions. Controller number 1 is the modulation wheel that made the Minimoog so expressive. Controller number 64 is the sustain pedal, number 10 is pan, and number 7 is channel volume (not to be confused with the master volume knob, which is seldom controllable). Beyond these the definitions begin to loosen up because most synthesizers have a unique set of controls. In fact, the knobs on modern synthesizers are generally multifunction, sending different messages under different circumstances. The situation is further muddied by the fact that many classic MIDI instruments were designed before control designations were standardized and have nonconforming mappings. The current designations can be found at the MIDI Manufacturers Association website, but you will want to confirm what is assigned where on your own instruments. This is usually well documented in instrument manuals, but you may have to discover it by experiments.

Control interfaces require even more detective work. A common type of control surface has a row of sliders with a grid of knobs above. The messages sent by these are not immediately apparent because the control surface may be designed for use
with any of three or four incompatible interface schemes. In fact, the controller is probably configurable to work with any of them, meaning the control numbers will change depending on what program was most recently used. In a perfect world, control number information would be readily available in the device manual, but my experience with this has been generally disappointing. The best way to discover this information is a program that displays numbers for incoming MIDI messages, and several freeware versions are available. Many sequencers and related programs also display the MIDI data.

The value data for a control change message indicates where the controller ended up. The motion of a controller is actually transmitted as a series of steps. All of the knobs on a device are checked about 100 times a second, and if any have been moved, a message is sent. The resolution of the data is 7 bits, so a control effectively has 128 positions, with values of zero to 127. This is actually too coarse for many functions (you would hear distinct steps) so a high-resolution option is provided. The control numbers 32 to 63 may be used as fine settings (least significant bits) for the corresponding controls zero to 31. The resulting resolution is 14 bits with 16,384 possible values.

Since turning a MIDI control knob generates a series of messages, simultaneously turning several knobs may produce enough data to overwhelm the instrument. In the worst cases, the instrument will lock up and need to be shut down. To prevent this, many interfaces and programs have data thinning features. These remove every second message, leaving the shape of a gesture intact, but reducing the number of steps transmitted.

Surprisingly, 128 controls are not enough. One of the first additions to the MIDI standard included a scheme for using two-byte parameter numbers. This requires sending four messages, a total of eight bytes under running status. Control number 99 sends the high byte of the parameter number, followed by control number 98 with the low byte. The data is sent as control numbers 6 and 38, or alternatively, control number 96 to increment the parameter or control number 97 to decrement. This routine is called nonregistered parameter number, or nrpn (pronounced “nurpin”) for short. The name implies the existence of registered parameter numbers, and there are some but not many.

Control numbers 121 to 127 are assigned to more general actions. Control number 121 is a controller reset, number 122 switches local control, and number 123 turns all sounding notes off. Control numbers 124 and 125 turn omni mode off or on, defeating channel selectivity. Control numbers 126 and 127 switch monophonic and polyphonic mode. In some very old synthesizers, mono mode (with omni off) assigns each voice to its own channel. This made sense when synthesizers only had four to six oscillators. Today’s standard of full multichannel polyphony is never mentioned in the MIDI specification.

Program Change

When MIDI was first proposed, the great advantage of digitally controlled synthesizers was the ability to store all of the knob settings and recall a setup at the touch of
a button. All of the data for a setup is stored in a section of memory called a program or preset. Program change messages call up one of the saved programs by number. Memory was expensive in 1982, so 128 program numbers was considered plenty. Hence the message only has one data byte. Fortunately, memory did not stay expensive for long, and synthesizers were soon keeping several banks of 128 programs each around. The bank had to be chosen by hand, but you could set up a program change map to determine which presets to load. This was eventually sorted out by assigning control change number zero along with number 32 to choose the bank number. This has to be followed immediately by a program change. Two-byte resolution is needed for the bank number because some manufacturers have a unique number for every bank in their entire inventory.

The program change value has turned out to be one of the most confusing numbers in the MIDI world. This is because although the value has a range of zero to 127, some manufacturers preferred to count programs starting with 1. Others were content to have a program zero. The confusion comes when programs are specified by sequencers. Should a sequencer display zero or 1 as the first program number, considering it has no knowledge of what is connected? Some applications went one way, some the other, a few allowed the user to choose. So now we have a situation where a sequencer that counts programs starting with zero may be sending data to an instrument that starts on 1, and vice versa. Either way, the numbers will not match half the time.

Pitch Bend

Everyone’s favorite control on the Minimoog was the pitch wheel, a spring-loaded fine frequency control at the left of the keyboard. It was provided because the instrument did not really play in tune and constant correction was needed. Musicians soon discovered it to be a wonderful tool for expression, so a pitch wheel has become standard on every keyboard since. The data sent by the pitch wheel gets its own status—pitch bend. It’s a two-byte number with a range of zero to 16,383. The middle value (8,192) is defined as zero bend. That’s a pretty wide range, so in truth most instruments only use 7 to 10 bits of resolution. The width of the bend is adjustable on the instrument, which means the distance from the middle to the top of the wheel may be a semitone or an octave of bend.

Aftertouch

Aftertouch sends additional data if the player puts extra pressure on a key being held down. There are two varieties of aftertouch: Channel pressure sends a value that applies to the entire keyboard; it is determined by the hardest press of any finger. Polyphonic aftertouch sends a separate value for each key held. It is not implemented on many synthesizers, but it’s brilliant when available—it is the only message that can affect an individual note while it is playing. Polyphonic aftertouch can be used to make notes crescendo or add vibrato. With some clever programming, it can even be used for microtonality. Polyphonic aftertouch generates a lot of MIDI data, so many interfaces filter it out by default.
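
The two-byte values mentioned above, pitch bend and the coarse/fine controller pairs, are assembled from two 7-bit data bytes. A short sketch (again illustrative Python, with arbitrary example values) shows the packing:

# Pitch bend: status 0xEn (n is the channel), then the low seven bits and the high
# seven bits of a 14-bit value. 8,192 is the resting, no-bend position.
def pitch_bend(channel, value):
    return bytes([0xE0 | (channel & 0x0F), value & 0x7F, (value >> 7) & 0x7F])

# A 14-bit control value sent as a coarse/fine pair: controller n carries the most
# significant bits, controller n + 32 the least significant bits.
def coarse_fine(channel, controller, value):
    status = 0xB0 | (channel & 0x0F)
    return bytes([status, controller, (value >> 7) & 0x7F,
                  status, controller + 32, value & 0x7F])

print(pitch_bend(0, 8192).hex())       # e00040 -- wheel at rest
print(coarse_fine(0, 7, 12000).hex())  # channel volume with 14-bit resolution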


System Messages

System messages carry information that is available to everything connected. The most useful are commands that make devices operate in synchronization. These were originally intended for hardware sequencers and drum machines but are now used between programs on a computer. They are particularly useful if you are working on a MIDI workstation—a high-powered keyboard synthesizer with a built-in sequencer.

Real-time Messages

There are actually two schemes for synchronization, one for keeping musical time and another for working with fixed-rate media such as video. Musical time is kept by clock messages, which are generated by a master timekeeper at the rate of 24 per quarter note. Slave devices follow this clock so that everything matches in tempo. There are start and stop messages, of course, along with a continue message, which will restart sequences where they left off. These are complemented by a song select message (which refers to songs by number) and a song position pointer, which cues to a specified sixteenth note within the song.

MIDI Time Code

Film or video time is specified by Society of Motion Picture and Television Engineers (SMPTE) time code, which is formatted as hour, minute, second, and frame number. There are twenty-four to thirty frames a second depending on the type and country of origin of the media. (The standards set by the Society of Motion Picture and Television Engineers make it possible to use a camera in New Zealand, edit the film in England, and present it in the United States.) MIDI time code (MTC) is a translation of SMPTE time code into MIDI messages. There is a full frame message for setting initial alignment, while quarter frame messages keep things together when the system is in motion. There is also a whole suite of messages called MIDI machine control that provides remote control of recording systems. When you see start and stop buttons on control interfaces for Pro Tools and the like, they are providing MIDI machine control.

Active Sensing

There is one more type of real-time message, one I put in the category of “must have seemed like a good idea at the time.” This is active sensing, which is supposed to warn if a cable is unplugged. The idea is that if a controller is idle for a while, a simple message is transmitted that says “I’m still here.” A receiver is supposed to notice if active sensing messages stop coming in and take some action, presumably shutting up if notes are sounding. There are plenty of keyboards that send active sensing, but I’ve only seen one instrument that responds, and it does so by putting up an annoying message.
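
Since clock messages run at 24 per quarter note (as described under Real-time Messages above), their spacing in real time depends only on the tempo. A quick calculation, offered purely as a worked example:

# Clock messages arrive 24 times per quarter note, so the interval between them
# shrinks as the tempo rises.
def clock_interval_ms(bpm):
    quarter_note_ms = 60000.0 / bpm
    return quarter_note_ms / 24.0

for bpm in (60, 120, 180):
    print(bpm, round(clock_interval_ms(bpm), 2))   # 41.67, 20.83, and 13.89 ms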


System Exclusive

The authors of the MIDI specification realized they could not possibly anticipate everything that would ever need to be communicated. They thoughtfully included several open-ended types of message such as the nonregistered parameter change. The most open format is the class of messages called system exclusive or sysex.

The system exclusive format is simple. First a message is sent that means the MIDI protocol is temporarily suspended—any data following this message has a new meaning as defined for some special purpose. An identification (ID) code is sent immediately after the sysex status (there is a registry of ID codes for all manufacturers who implement sysex). After that, the data format in the remainder of the message is up to the manufacturer. Eventually this state of affairs is cancelled by an end of exclusive (EOX) message. Most sysex messages also include a model identification and even a device number so that the message can be targeted to a specific instrument.

Sysex has been used for an amazing range of functions. Some companies made all of the internal parameters of their instruments accessible—one form of message would query for the current value of the parameter, which the instrument would return by a sysex message of its own (that’s why tone generators with no user controls have a MIDI out jack). Another message would change the value of the parameter. A knowledgeable composer can use these features for unique operations such as gradual transitions of timbre or retuning on the fly (there is an example of this in chapter 17, p. 430). Almost all instruments can use sysex for a program dump. This is the transmission of all of a program’s settings and can be used to copy a program to another instrument or to a computer for storage. Computer applications called patch librarians use sysex messages for editing and management of programs on the more popular synthesizers. Other system exclusive operations include sample dumps and device identification. However, in the transition of synthesis from hardware to software, it seems that sysex is being left out in favor of more direct types of control.
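
The framing itself is easy to see in code. In this sketch the payload bytes and the parameter they supposedly set are invented for illustration; only the start byte, the ID byte, and the end of exclusive byte are fixed by the standard.

SYSEX_START, EOX = 0xF0, 0xF7

def sysex(manufacturer_id, payload):
    # Everything between the ID and EOX is up to the manufacturer, but data bytes
    # must keep the top bit clear (values 0-127).
    assert all(b < 0x80 for b in payload)
    return bytes([SYSEX_START, manufacturer_id]) + bytes(payload) + bytes([EOX])

# A made-up "set parameter 5 to 99" message under the non-commercial ID 0x7D.
print(sysex(0x7D, [0x05, 0x63]).hex())   # f07d0563f7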

STANDARD MIDI FILES

Once MIDI data is recorded on a computer, it needs to be saved to disc. Most sequencer programs include two ways to save a recording. One is a proprietary format that includes everything the authors see fit to save. This will include additional information like editing history, track markers, and user comments. The second option is as a standard MIDI file (SMF). This format is limited to what can be transmitted over MIDI, which is a bit more than you may think, but it is still essentially the music as it might be played, including all MIDI events plus certain types of data known as meta-events. These include track names, lyrics, tempos, and time signatures. The advantage of this format is that any MIDI application can open it. The
standard MIDI file is so well established that both Macintosh and Windows operating systems can play the file with no special software. There are some quirks in MIDI files you should be aware of. There are actually three types of MIDI file: 0, 1, and 2. A type 0 file is the simplest—it is just a list of all of the MIDI events in a performance. A type 1 file is designed for use by multitrack sequencers. It is organized in tracks, with one for tempo and others for the events. All modern applications can open both kinds, although most only save type 1. Type 2 MIDI files are sets of patterns suitable for drum machines.
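
The file type is easy to check for yourself: it sits in the file’s fourteen-byte header chunk along with the track count and the timing resolution. A minimal sketch using only Python’s standard library (the file name is a placeholder):

import struct

def read_smf_header(path):
    with open(path, "rb") as f:
        chunk_id, length, file_type, track_count, division = struct.unpack(">4sIHHH", f.read(14))
    if chunk_id != b"MThd":
        raise ValueError("not a standard MIDI file")
    return file_type, track_count, division

# print(read_smf_header("mysong.mid"))   # e.g. (1, 5, 480): type 1, five tracks, 480 ticks per quarter note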

GENERAL MIDI

One early use of the standard MIDI file was the exchange of music. Composers could use the forerunner of the Internet to send their latest work to colleagues across the country, who could play it exactly as intended—if they had the same model synthesizer playing the same presets. If the proper instrument was not available, the results may have been entertaining, but probably not what the composer had in mind. Remember that the program change message only sends a number, and the sound actually stored under that number is anyone’s guess. It was easy to wind up with a bass drum playing a harpsichord part.

The general MIDI voice set was proposed to avoid such problems. It began as nothing more ambitious than a preferred instrument type in each program on the general MIDI (or GM) bank. The sounds chosen were taken from some of the popular synthesizers of the day and were mostly imitative of acoustic instruments. Perhaps the most influential aspect was the dedication of channel 10 to percussion sounds, with a kit that started with the bass drum on note 35. Since then, nearly all instruments and sequencers assume channel 10 is for percussion whether general MIDI is in use or not. General MIDI still had some missing features, so several manufacturers developed their own implementations which appeared in various models. Eventually these were enfolded into GM level 2, which is the current standard.

Many composers lamented the arrival of general MIDI, afraid it marked the end of innovative sounds. In fact, GM did mark the beginning of a fairly conservative period in electroacoustic music, but that is coming to an end as a new generation of composers has found GM just as boring as the iconoclasts of the 1960s found the symphony orchestra.
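
Because the drum kit layout is fixed, a GM percussion part is nothing more than key numbers on channel 10. A small sketch; only the bass drum on note 35 is mentioned above, while the snare and hi-hat numbers are standard GM assignments supplied here for illustration:

# Channel 10 is channel 9 when counting from zero, so its note on status byte is 0x99.
GM_DRUMS = {"kick": 35, "snare": 38, "closed_hat": 42}

def drum_hit(name, velocity=100):
    return bytes([0x99, GM_DRUMS[name], velocity])

# A bare-bones backbeat: kick, hat, snare, hat.
pattern = [drum_hit(n) for n in ("kick", "closed_hat", "snare", "closed_hat")]
print([msg.hex() for msg in pattern])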

MIDI AND NOTATION

There is, of course, a large body of composers who are not particularly interested in electroacoustic methods. For those who prefer to write scores for human musicians, computers have proven to be an invaluable tool, removing most of the agony and drudgery of producing fair copy manuscripts. The arrangement of notes and
lines is essentially a graphics problem, and early software did not provide the nuances that musicians expect, but there are now several fine programs to produce publishable music. Many even include specialized features such as automatic transposition and part extraction. The early versions of notation programs required learning an arcane system of text-based markup, but they eventually developed the means to transcribe the notes played on a MIDI keyboard. Transcription is difficult, as the computer is literal minded about such things as the lengths of notes, so parts that are played in may require a fair amount of editing. With MIDI capability came the notion of exporting the score as a MIDI file or even playing the score as it was composed. This feature turned the composition business on its head. We can now hear a fair representation of what is written down and immediately correct things if they go awry. Thus a lot of bad music has disappeared under the delete key rather than hitting the wastebasket after a deadly debut. Few composers want to replace human performers and traditional instruments, but it is nice to avoid wasting their time. Music notation and MIDI composition actually have little in common other than the use of pitches. For instance, rhythm is notated in a fairly coarse way that performers interpret according to the style of the music they are playing, while MIDI rhythms are precise timings. On the other hand, notation can communicate elaborate gestures in a single symbol. Consider a decrescendo—this simple hairpin translates to a whole series of gradually changing velocity values. These can be cooked up by a computer application, but they are not likely to match what a performer would do. Some notation programs allow the composer to specify velocity and control messages, but the interface to that feature tends to be quite awkward. My answer to the problem is simple: use each program for what it does best. I might start a composition in a notation program to get it into rough form, then export it to a MIDI file and open it in a true MIDI editor for final refinement. Another piece might be captured as keyboard noodles in a sequencer, then exported to a notation program to be arranged and published for performers. The MusicXML file format enables the transfer of notation between programs.
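
To see what a notation program has to generate for that hairpin, here is a sketch of the simplest possible approach, a straight-line ramp of velocities; the start and end values are arbitrary.

def decrescendo(start=100, end=40, notes=8):
    # One velocity per note, stepping evenly from loud to soft.
    step = (end - start) / (notes - 1)
    return [round(start + i * step) for i in range(notes)]

print(decrescendo())   # [100, 91, 83, 74, 66, 57, 49, 40]

A performer’s decrescendo would not be this even, which is exactly the point made above.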

FUTURE MIDI

The MIDI standard was attacked the moment it was released. Critics thought it was too slow, too limited, too dedicated to traditional music. These things are all true, but MIDI has proven fast enough, extensive enough, and flexible enough to create a lot of fantastic music. There have been many proposals to extend the standard, but other than a few items related to general MIDI and file formats, these have not gone far. MIDI is well established, and any company that attempts to introduce another communications system is taking a big risk. A major manufacturer would probably feel the need to support both MIDI and the new system, an extra cost for little return. The hardware is being gradually upgraded. Most new products now have USB
connections next to MIDI ports, and we are beginning to see high-speed network transmission of MIDI from computer to computer. If there is to be a replacement for MIDI, it is likely to be Open Sound Control (or OSC). This was developed at the University of California, Berkeley, and is designed to use network technology to address musical needs directly. It is inexpensive as it uses existing Ethernet hardware. The protocol is compatible with the Internet, so controllers and instruments can be simply plugged into a local area network. It is possible to embed MIDI within OSC, although there’s little reason to do so. OSC addresses the shortcomings of MIDI in a variety of ways. All numbers are 32-bit precision, messages can be sent to devices by name rather than the broadcast MIDI channel, and there is even a system of sending messages to be executed in the future (this can produce latency-free performance, at least when a computer is in control). For performance interfaces, OSC features an unlimited number of high-precision control channels which are proving as responsive and flexible as acoustic instruments. The current interest in interface experimentation and development is helping popularize OSC because traditional MIDI does not provide the response inventors are looking for. OSC is beginning to turn up in commercial products like cell phone apps, and those products are catching the eye of serious musicians. Electroacoustic musicians have an excellent track record for getting technology out of the lab and into the music store, and there’s no reason OSC can’t make the trip.
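
Sending an OSC message from a laptop takes only a few lines. This sketch uses the third-party python-osc package (an assumption on my part; any OSC library will do), and the address names and values are invented. Note the named destination and the full-resolution floating point argument, neither of which MIDI offers.

from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)          # address and port of the receiving synth
client.send_message("/synth/filter/cutoff", 0.73)    # one high-precision float to a named control
client.send_message("/synth/note", [60, 0.8])        # several arguments can travel in one message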

EXERCISE

Find an application such as MIDI-Ox or MIDI Monitor that displays MIDI data as it comes in to the computer. Send messages from all of your controllers and see what they look like. You may discover features you didn’t know you had.
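
If you would rather build the monitor yourself, a few lines of Python with the third-party mido package (not required by the exercise, just one convenient option) will print every incoming message:

import mido

print(mido.get_input_names())       # list the available MIDI inputs
with mido.open_input() as port:     # opens the default port; pass a name to choose another
    for message in port:
        print(message)              # e.g. note_on channel=0 note=60 velocity=64 time=0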

RESOURCES FOR FURTHER STUDY

The details of the MIDI specification are posted on the MIDI Manufacturers Association website, www.midi.org. This site offers several tutorials in the fundamentals of MIDI as well as documents dealing with the practical aspects of implementing MIDI systems that are invaluable to software programmers. David Huber’s MIDI manual is the go-to guide for all MIDI applications and problems:

Huber, David. 2007. The MIDI Manual: A Practical Guide to MIDI in the Project Studio, 3rd ed. Boston: Focal Press.


EIGHT

Sequencing Programs

Sequencers are the mainstay of a MIDI studio. Sequencers were originally pieces of hardware included with analog synthesizers, but those bear the same relationship to the applications of today that a bicycle has to the space shuttle. The function of a modern sequencer is to preprogram an automatic performance. The composer enters data in various ways, then the sequencer feeds that data to MIDI synthesizers for real-time playback. Thus the composer takes over many of the functions of a performer, deciding exactly how each note should be played. As we shall see, this is not a job to be taken on lightly. Most musicians see a sequencer one of two ways: a canvas for entering notes and MIDI controls or a kind of tape recorder that enables you to fix up a performance. Both of these ideas are true, and these functions nicely complement each other. Many composers also see a sequencer as complicated and difficult to use. Everyone finds it so at the beginning, but it’s really one of the easiest studio programs to master. The difficulty comes from unfamiliarity with the abstract way the music is presented and a too literal acceptance of the recorder paradigm. What it boils down to is that musicians have one way of looking at the world and computer programmers another. I will point out typical points of dissention as they come up. Sequencers may be stand-alone programs or part of an integrated digital audio workstation (DAW). As I pointed out in chapter 6, the integrated programs developed gradually out of audio programs that added MIDI features or MIDI sequencers that added audio functions. As of this writing the applications that handle MIDI best are the ones that started out doing only that. It doesn’t make a great deal of difference—all of the major programs perform MIDI operations efficiently and accurately, and the interfaces are so similar that it’s easy to forget which program you are using. Rather than use one of the three or four most popular programs for the examples in this chapter, I’ll describe the basic features in a generic way and point out some alternatives you will encounter.


GETTING SET UP

Hardware

There are some chores to do before a sequencer is ready for use. The most obvious is the wiring—the keyboard or other controller has to be connected to the computer, either via USB or a MIDI input. Any external synthesizers will be connected to the computer’s MIDI outputs. Audio connections will go to a mixer—these include the computer audio if any software synthesizers will be used, the audio output of each hardware instrument, and the keyboard’s audio output if it includes a built-in synthesizer. In the latter case, you should also connect a MIDI out from the computer to the keyboard and turn local control off for the keyboard.

The hardware settings must be entered into the audio and MIDI preferences of the sequencer. Audio settings will include the audio port and drivers to use along with sample depth, buffer size, and a variety of mysterious options. It is safe to leave these on the default settings until you get a chance to consult the manual.

Some sequencers have elaborate MIDI setups. These are useful in a complex studio but require a lot of information, including the names of your instruments and the MIDI ports they are attached to. This chore ranges from simple to tedious, but at least it only has to be done once, unless you make changes to your studio. If you change setups frequently, I suggest using the port names or some generic title like “keyboard.” You also have the opportunity to enter the names of the patches in the synthesizers, but these lists will be huge and difficult to keep up to date. I doubt the ability to pick “clinker bells” from a list instead of program 14 saves enough time to make up for the initial data entry. It’s better to keep a file or piece of paper handy with the program names. You would need such a list for the data entry anyway, and it’s not hard to copy from the instrument documentation.

If you are only using software synthesizers, the setup process is a lot simpler. You merely have to make sure the plug-ins are in the proper directories (consult the manual). If the sequencer is in place before the plug-ins are installed, this is usually automatic. If a sequencer is added later, you may have to copy some files by hand or run the plug-in installer again.

Another setup chore involves the metronome. Most sequencers use the computer speaker for the default click. I find these difficult to hear when I am playing and prefer to use a MIDI instrument for the metronome. In fact, my favorite approach is to use a retired synthesizer module connected to a cheap speaker sitting right on top of the MIDI keyboard. The program will have a metronome preferences dialogue that sets the port and sounds used. I find claves to be the best sound for cutting through a dense mix.

If you mix plug-in synthesizers and MIDI hardware, there may be a synchronization problem due to the audio latency of the computer. Most sequencers have a MIDI delay setting to compensate. It’s not absolutely necessary, unless you do all of your composing by playing live, because you will be adjusting note start times anyway.
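
The size of that delay is easy to estimate from the audio buffer setting. The buffer sizes and sample rate below are typical values chosen for illustration, not recommendations.

def buffer_latency_ms(buffer_size, sample_rate=44100):
    # Time the audio engine needs to fill one buffer, the minimum added delay.
    return 1000.0 * buffer_size / sample_rate

for size in (128, 256, 512, 1024):
    print(size, round(buffer_latency_ms(size), 1))   # 2.9, 5.8, 11.6, and 23.2 ms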

FIGURE 8.1 Main window of Logic.

Sequencer Overview The main window of most sequencers is indistinguishable from the main window of a DAW (see Figure 8.1). There is a central window or pane with horizontal tracks, track controls to the left, and a time indicator across the top. The tracks window offers coarse editing of MIDI clips or regions and often some automation of MIDI controls. Since sequencers now offer audio functions and DAWs offer MIDI, the only difference to point out is between the two types of track, where the essential difference is the treatment of time. Time in MIDI is flexible, measured in units relative to the tempo: bars, beats, and other durations. The initial time stamps for events are relative to the tempo in effect at the time of recording. If the tempo is changed for playback, everything slows down smoothly. Time in audio is absolute, measured against the sample rate. This may be displayed in bars and beats, but if the tempo is changed, the bars move relative to the recorded sounds. If the audio no longer matches the tempo, it must be time-warped. Some programs do this more or less automatically, but it is still a change in the audio file, with all of the typical side effects of resynthesis.

Decoding Synthesizers Sequencing is intimately tied to synthesis. The commands in the sequence must be tailored to the needs of the particular patch used, and the patch must meet the needs of the composition. We will study synthesizer programming in detail in the following chapters. For now we will use the default sounds of any available synthesizer. The first task in using a synthesizer is to find out what it can do. Publicity


about “latest modeling techniques” or “best emulation of the classic ZXY5” tells us nothing. The only way to find out what a synthesizer is capable of is to try things. To assess a new synthesizer, the first thing to do is hook up a keyboard and play with the presets. Take the sounds in order, playing a few high notes, a few low notes, then noodling in the middle. Pay attention to the response time—preset sounds seem to be optimized for either notes or sustained effects. The names may reflect this—sustainers are often called pads or atmospheres. Noodling may lead to a fit of improvisation; if I catch myself playing a tune, it is probably a good preset. This process will take an hour or two. Having winnowed the presets down to a few interesting ones, revisit those and see how the sound can be changed as it plays. At this point, start taking notes. I prefer dynamic sounds, so MIDI control during a note is important to me. If there are knobs, turn them as you hold a note. Once you discover something interesting, figure out how MIDI can do the same thing. Finding MIDI controls invariably takes some detective work. Physical controls may send their MIDI equivalent which can be read from the sequencer’s input display. Plug-in synthesizers sometimes have a controller learn feature. When these easy solutions fail, it’s time to look in the manual.

RECORDING MIDI DATA The best way to get the feel of a sequencer is to record a few notes, then look at the way these are displayed. In the main window, you will see track controls at the left and a vertical cursor that you can position by clicking in the timeline stretched across the top. To make a recording, choose or create a MIDI track and route it to the MIDI port and channel that the desired instrument is connected to. Usually this is done with a control in the track display. Some programs make a distinction between tracks for software instruments and tracks for external instruments. Other programs only have one style, but include “external” as an option for the instrument you choose. In that case, the MIDI routing is in the external plug-in. I want to make it clear that while it’s the controller data that gets recorded, it is the connected synthesizer that will be heard. Even if you want to use the keyboard’s internal sounds, it is still chosen as a MIDI destination. When a track is selected, you should be able to hear the correct instrument as you play the keyboard. If you don’t, review your MIDI and audio connections. (Don’t forget to turn local control off on your keyboard.) Clicking the record button will begin recording at the cursor position. There are usually two start options available. One is “wait for input,” which will place the first note played at the cursor position and move on from there. The other is “pre-roll,” which will begin playing the other tracks and metronome a measure or two before recording is to start. In that case, notes played before the downbeat may not be recorded. There are three modes of recording. In overwrite mode existing notes will


be deleted, the behavior you’d expect from a tape recorder. In merge mode, new notes are simply added in. The third option, loop record, is merge mode while the program cycles repeatedly over a chosen section. When you record over existing material in overwrite mode, you come across one of the differences between recording MIDI and recording audio. If the punch-in happens to occur in the middle of an existing note, an audio recorder would just cut it short. If a MIDI recorder worked the same way, the note on would be left alone and the note off would be removed, leaving the note to play forever. In some sequencers, the entire note is removed; other sequencers move the note off to give the same effect found in audio recording. In any case, you will have some cleanup to do when you start a recording in the middle of existing data.

Playback Recording can usually be stopped with the space bar. There should now be something in the track area to indicate the presence of recorded data. The grouped data may be called a region or clip and can be moved around or trimmed just like an audio file. If you cue the cursor to the beginning and play, you should hear what you just did. Now is an opportunity to experience the unique features of MIDI recording. Change the patch on the synthesizer and play again. You will hear the recorded notes with new sounds. The effect you get will depend on the difference between the original sound and the new one. If the recorded notes are short and you change to a slowly evolving texture, you may not hear much of anything. To hear a largo version, just slow the tempo. You can also imitate virtuosity by cranking the tempo up. If you record a piano line and switch to a percussion kit, the results will be even more surprising. If you intend to use the sequencer primarily as a recorder, you should practice laying down tracks and working with the metronome. Playing with a metronome is a special skill. The common experience of practicing scales and études with a metronome is not the same as performing an interesting, expressive take. The best advice I have heard is to “relax into the tick,” playing without concentrating on the metronome but feeling the beat just as you would with another musician. A good learning exercise is to play just the metronome part for a while (you don’t actually have to record this) and gradually start adding fills and flourishes. When that’s comfortable, move on to more substantial music. One approach is to start with the rhythm parts and turn off the tick once there’s enough material to follow. Drums are easy to record if you split them into two or three tracks and record kick and snare individually. Many parts other than rhythm can be built out of several tracks sending data to the same instrument. The voices of a contrapuntal passage are an obvious case, and sometimes left and right hands can be separate tracks. Putting complicated controller manipulation such as pitch bends on their own track allows you to take


several tries without risking a set of well-played notes. MIDI tracks can be muted, so you can keep as many versions of a line as you want. Later, you can make a composite of the best parts of each take.

Step Entry It’s not necessary to be a keyboard virtuoso to record notes. Many sequencers have a step entry recording mode that lets you hit a key to enter the next pitch or chord without worrying about rhythm. The rhythm is specified by buttons in a dialogue and can be changed by key shortcuts. You can comfortably create a lot of notes with one hand on the MIDI keys and the other on the computer. Variations on step entry allow typing pitches on the computer keyboard or using an on-screen cartoon of a music keyboard. Chords require some contortions in these alternate modes, but they can be handy when you are working away from the studio.

Modifying MIDI Regions Most sequencers provide the same sort of operations on MIDI regions that we learned for audio clips. Trimming and movement in time are just the beginning. Since this is MIDI data, transposition is trivial and nondestructive, so you can play in a comfortable key and transpose later. There are also nondestructive velocity options: you can boost all velocity data to make a line stand out or scale the data to expand or reduce the dynamic range. There will also be a control change feature to adjust the channel volume. There is an important difference between changing velocity and changing volume on MIDI data. Most synthesizers link velocity to several parameters, so the nature of the sound will change as the velocity increases. Turning up the channel volume just makes it louder. It’s also important to remember that channel volume affects everything sent to the channel. If two different tracks address the channel, the latest volume sent from either will be in effect. Sometimes there is a track feature that sends initial volume when playback begins. I prefer to turn this feature off and insert volume control messages in one track as needed. Some DAWs send volume controls as part of mixdown automation. If that’s the case and you have two or more tracks controlling the same channel, choose one to be the master and use the automation on that track only. There is also a track setting for initial instrument program. I avoid this because it is unnecessary and often gets the bank wrong. If the program must change during a piece, I insert program changes into the controller data, including one at the beginning to return to the initial setting. Be careful with program changes. They might cut off sounding notes and take some time to complete, so it’s best to allow lots of time before and after the change message. Once you have changing programs and controllers, the sequencer should set everything to the proper state when you play from the middle of a piece. This means looking at the early part of each track and restoring the most recent value for each control, a process called chasing. Most


programs brag about this feature in their publicity, and you shouldn’t waste your time with an application that doesn’t include it. When you copy a MIDI region and paste it to another location, there are several variations on how the paste will work. Merging with existing data is useful. For instance, if you made a rhythm track in several passes, it makes sense to merge them once you have achieved the sound you want. You may have to do some cleanup after a merge; if the merge results in overlapping notes on the same pitch, the sequencer might not handle it gracefully. Many synthesizers can’t play a unison, and the sequencer’s designer may have made assumptions you don’t agree with. Conflicts can be resolved by cutting the early note short or by combining the notes into a single longer one. Another paste option generates multiple copies one after another. This is a time-saving feature and keeps a repeating passage on the beat. There is a related technique called cloning or aliasing. The copies remain associated with the original, so if you edit the source after cloning, the copies are changed too. There is usually a way to convert a clone into a normal copy, but once that’s done, you can’t go back.
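To make the overlap problem concrete, here is a minimal sketch in Python of one possible conflict policy: it merges two note lists and cuts an earlier note short when a later note on the same pitch would overlap it. The tuple format is purely illustrative, not any sequencer’s actual data model.

```python
# Minimal sketch: merge two MIDI note lists and resolve same-pitch overlaps
# by cutting the earlier note short. Notes are (start_tick, duration, pitch,
# velocity) tuples; this is an illustration, not any program's real format.

def merge_notes(track_a, track_b):
    merged = sorted(track_a + track_b)          # sort by start time
    resolved = []
    last_by_pitch = {}                          # index of most recent note per pitch
    for start, dur, pitch, vel in merged:
        if pitch in last_by_pitch:
            i = last_by_pitch[pitch]
            p_start, p_dur, p_pitch, p_vel = resolved[i]
            if p_start + p_dur > start:         # overlap: truncate the earlier note
                resolved[i] = (p_start, start - p_start, p_pitch, p_vel)
        resolved.append((start, dur, pitch, vel))
        last_by_pitch[pitch] = len(resolved) - 1
    return resolved

kick  = [(0, 240, 36, 100), (960, 240, 36, 100)]
snare = [(480, 960, 38, 90), (960, 240, 38, 90)]   # second snare overlaps the first
print(merge_notes(kick, snare))
```

The other policy mentioned above, combining the two notes into one longer note, would simply extend the earlier note’s duration instead of truncating it.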

MIDI Effects MIDI effects are processes applied to MIDI data. In sequencers, these effects are usually applied during playback, although there’s often a way to capture the results. The most common effects are echo and arpeggio. Echo repeats notes at specified time intervals. You can use echo to fake the sound of delay with feedback by setting a number of repeats and a fading velocity. Arpeggiators can be quite elaborate. The basic setting cycles through held notes, but you can choose from a variety of patterns. More elaborate effects include note filtering, inversion, channel mapping by velocity, and simple harmonization. In the sequencer Logic, MIDI effects are in the environment, a sort of subbasement to the application where all of the pipes and connections are kept. The program is actually named for its ability to do logical operations on MIDI, operations such as note and control filtering, transposition, velocity scaling, keyboard splits, and the like. The environment is considered difficult to use, but it has a reputation as the most powerful set of processors available. In more recent versions of Logic, many features of the environment have been duplicated in other parts of the program so you can explore them at your leisure or ignore them completely.
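The echo effect is easy to picture as a calculation. The sketch below is a rough Python illustration, again using made-up note tuples rather than any program’s real format: each repeat is delayed by a fixed interval and its velocity scaled down, which is how a fading delay with feedback is imitated.

```python
# Minimal sketch of a MIDI echo: each repeat is delayed by `interval` ticks
# and its velocity reduced by `feedback`, imitating a fading delay line.
# Notes are (start_tick, duration, pitch, velocity) tuples (hypothetical format).

def midi_echo(notes, interval=480, repeats=4, feedback=0.6):
    echoes = []
    for start, dur, pitch, vel in notes:
        for n in range(1, repeats + 1):
            new_vel = int(vel * feedback ** n)
            if new_vel < 1:
                break                      # the echoes have died away
            echoes.append((start + n * interval, dur, pitch, new_vel))
    return sorted(notes + echoes)

lick = [(0, 240, 60, 100), (480, 240, 64, 90)]
for note in midi_echo(lick, interval=480, repeats=3):
    print(note)
```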

EDITING MIDI NOTES The most liberating aspect of a MIDI sequencer is the ability to modify the recorded data. We can fix wrong notes, improve phrasing, clean up rhythms, and write in things that are beyond our capacity to play. This kind of data massage happens in


an editing window. There are three editing modes available, each with a unique view of the data and each with special advantages: event list, piano roll, and notation.
FIGURE 8.2 Event list editing window.

The Event List Window The event list is the most intimidating view. This window shows a text listing of all of the MIDI events in the region, one line per event, with the data in columns like a spreadsheet. The only concession to musical sensibilities is that note on and note off messages are combined into a note event of some duration. A sample event list is shown in Figure 8.2. Notice the format in the time column. Time in sequences is measured in bars, beats, and ticks. A bar has the number of beats appropriate to the time signature and a tick is a subdivision of the beat. The number of ticks in a


beat is not standardized and has been increasing over the years. The MIDI definition of 24 ticks to a quarter note was the starting point. That number is useful because it easily provides for duple or triplet subdivision: 12 ticks is an eighth note, whereas 8 ticks is a triplet eighth note. Contemporary sequencers use some multiple of 24 as the tick granularity: 480 ticks per quarter was popular for some years because at a tempo of 120 the resolution is close to 1 millisecond, nearly the top speed of MIDI messages. More recent software uses 960 ticks per quarter, which makes sense when CPU speeds exceed 2 GHz. Time is counted from measure 1, so the piece begins at 1 : 1 : 1. The duration is shown as the number of measures, beats, and ticks, so 0 : 1 : 121 is a quarter tied to a thirty-second note, held a tiny bit too long. You will never see tidy numbers in the tick column for music that was actually played. Large tick numbers are difficult to work with, and at least one application shows ticks as subdivisions of something called a division, which is smaller than a beat. This adds another unit to the duration, 1 : 1 : 1 : 1. This can be a bit hard to read, as the meaning of the third unit varies with the division setting, but it saves reaching for the calculator for large tick numbers. The event list has toggles to determine what kind of messages to show. This lets you focus on the notes, or maybe on the notes and the pedaling. Geeky as it is, the event list is invaluable. The sound produced by a synthesizer is the product of several messages—notes, bends, and controllers—and this window is the only one that shows their exact relationships. If you hear something puzzling, the answer will be found in the event list. Events are edited with the usual text modification tools, but take heed: if you need to edit the start time, be careful with your typing. The list will be resorted as soon as you confirm the edit, and a mistaken entry will hide the event somewhere off the window. Luckily, undo will bring it back. You can develop some familiarity with the event list by looking at notes you have recorded.
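The arithmetic behind those displays is easy to check. The following Python sketch assumes 960 ticks per quarter and a beat equal to a quarter note (simple meters only); it converts a bar : beat : tick position into absolute ticks and into seconds at a given tempo.

```python
# Convert a bar:beat:tick position to absolute ticks and seconds.
# Assumes 960 ticks per quarter note and that the beat equals a quarter
# (simple meters such as 4/4 or 3/4). Counting starts at 1:1:1.

PPQ = 960  # ticks per quarter note

def position_to_ticks(bar, beat, tick, beats_per_bar=4):
    return ((bar - 1) * beats_per_bar + (beat - 1)) * PPQ + (tick - 1)

def ticks_to_seconds(ticks, bpm=120.0):
    return ticks * 60.0 / (bpm * PPQ)

pos = (3, 2, 481)                     # bar 3, beat 2, tick 481
abs_ticks = position_to_ticks(*pos)
print(abs_ticks)                      # 9120 ticks
print(ticks_to_seconds(abs_ticks))    # 4.75 seconds at 120 bpm
```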

The Piano Roll Window The piano roll display (shown in Figure 8.3) is the window most commonly associated with sequencers. In this view, notes are represented by rectangles on a grid of pitches and time. Pitch is often indicated by a keyboard graphic at the left, with a time scale in measures and beats across the top. The grid spacing is adjustable in the time dimension, usually down to something like ninety-sixth notes. Tools are provided for manipulation of pitch, time, duration, and velocity. The operations can be performed on individual notes or a selected combination. Selection usually follows the customs of the platform with some additions such as a click on the keyboard graphic to select all notes of a pitch or range. The usual approach to editing assigns functions to each area of the rectangle representing the note. Typically a click and drag near either end extends the duration, with the left end starting the note earlier. Click and drag in the middle area will change pitch or move the note in time without affecting its duration. If notes


FIGURE 8.3 Piano roll editing window.

are short, the magic spots for click and drag can be tricky to find, so you usually need to zoom in a bit. The time movements may be free or constrained to the grid spacing. A selected note can also be nudged by arrow key combinations. The arrows alone will step through the notes, an efficient way of getting around. Because there is no obvious way to display velocity on a piano roll, there is a lot of variety between applications. Velocity might be shown as a color, as a line in or near the note rectangle, or as a pop-up information window. To edit velocity, you may have to click on another magic spot in the note rectangle or choose a velocity tool from a palette. In many applications, a separate graphical velocity display lets you edit velocity as a continuous shape so you can easily adjust the curve to give the expression you want to a phrase. Control changes are usually shown as a series of lines in the piano roll, either laid over the notes or in a separate pane at the bottom of the window. A single message shows up as a vertical line, but control changes usually come in bursts that have a distinct shape. The program usually displays one control at a time, with the type chosen in a menu. If more than one are displayed, they may be distinguished by color or by a different vertical position. Note that these displays are designed to edit the values for particular controls, not to edit which controls are sending values. If you want to move a gesture from one controller to another, you need to cut it from the current control, change the display, then paste the data. Editing control changes is generally done with the ubiquitous pencil and eraser tools. If you draw over existing messages, they are either modified or replaced with new ones. The selection arrow moves events in time; usually you do this to a group


of changes to keep a gesture intact. Pay particular attention to the relationship between controls and notes. The note makes the control audible; there are some controls that must be set before the note, and others that must move during the note. Remember that the sound of a note often lasts beyond the time shown by the note rectangle. Pitch bend and the various after-touches are generally included as controllers, and NRPNs are usually combined into a single entity, even though it takes four messages to send one.
FIGURE 8.4 Percussion grid editing window.
Percussion Display Many synthesizers feature percussion programs that consist of drum sounds mapped to different notes. For these, sequencers provide a variation of the piano roll display that replaces note names with General MIDI drum names (see Figure 8.4). If your percussion synthesizer has a different mapping, the names can be rearranged in a custom setup. Since durations don’t matter to drum samples, the rectangles may be replaced with diamonds or other compact symbols. Recording percussion is easy for a skilled drummer using a MIDI drum set, but difficult for everyone else. There is


nothing to be done on a piano keyboard that gives the effect of a drumstick bouncing on a head. A MIDI drum pad may be the best solution—tapping one of these is more responsive than a regular key. Concentrate on getting even velocities—the timing will take care of itself. As I have suggested before, the best recording will be made by playing one drum part at a time. Quantization Quantization refers to moving notes in time to fit a predefined spacing. This is usually the vertical grid lines shown in the piano roll. Quantization has a bad reputation left over from the earliest sequencers, and some applications have done little to restore respect. The first sequencers moved notes to eighth or sixteenth note boundaries, which immediately imposed a rigid feel on the performance. The modern versions still line up the beginnings of notes, but you can leave a certain amount of variation. If you perform a 25 percent quantize operation, a note that was 100 ticks after the division mark would wind up 75 ticks after. A 100 percent quantization moves it all the way to the division mark. You can also set capture ranges so that only notes that are relatively close or really off are affected. In current applications quantize is nondestructive (or just a playback function), so you can always return to the original timings. When you apply quantization, use the finest grid that fixes the problem. As you choose the quantization value you may notice strange options like sixth or twelfth notes. These are an engineer’s conception of triplets. When trying to figure them out, remember the notation rule that triplets squeeze extra notes into a beat. Thus twelfth notes are three to a quarter note, or the equivalent of a triplet eighth note. Don’t quantize an entire track unless you are trying for a special effect. Instead, work phrase by phrase to find the level that’s right for those notes. Often you will find the playing gradually moves out of sync with the metronome and then snaps back after a rest. If you fix the worst area, the sudden contrast with the preceding section will stand out, so you should create a transition zone where the correction is only 75 percent. A nice variation on quantizing allows the definition of arbitrary patterns for the quantize targets. This is generally known as “groove quantize,” and can use patterns supplied with the program, your own performances, or even audio files. Tracks that follow a groove can match the most idiosyncratic playing. A less-interesting variation on quantization is known as swing. This setting delays or accelerates offbeats in a simplistic imitation of some types of jazz rhythm. The antiquantize function is often called humanize. This modifies start times by small random amounts. Few humans play this way, but it might be better than rigid rhythm. Transforms The piano roll window is usually the home for complex operations known as transforms. Transforms are typically applied to groups of notes and can modify practically any feature. Transforms usually include a method of selecting notes by specific criteria. Thus you can make every G-sharp louder or copy all notes below


middle C. Many musicians are put off by the manual’s descriptions of transforms in mathematical terms. The following is a translation of math to music, assuming the transform is set to work on note velocity. Notice you often have to specify some value for the operation.
Add: all notes get louder by the specified amount
Subtract: all notes get softer
Multiply: all notes get louder in proportion to how loud they start out
Divide: all notes get proportionally softer
Exponential: loudness is fit to a curve; the specified value shapes the curve
Max: limits the loudness
Min: all notes will be at least this loud
Fix (or constant): all notes become the same loudness
Flip: loud notes become soft and vice versa; the flip point is specified
Scale: another name for multiplication, but may be by a fraction
Offset: another name for add; often used with scale
Range: min and max in one operation
Random: values randomly changed within range
Ramp (or crescendo): loudness steadily moves from start value to end value; note that a start value higher than the end will result in a decrescendo, no matter what the operation is called
There are other actions available in various programs, but they will be mostly combinations of the above. You will also discover useful combinations of your own. For instance, applying crescendo followed by exponent will give a nice curve to the dynamic change. This is admittedly geeky, but learning to use transforms can save you a lot of tedious work.
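If the math-speak still seems abstract, a small sketch may help. The Python below is a generic illustration of a few of these operations applied to note velocities, with results clipped to the legal MIDI range of 1 to 127; it is not any sequencer’s actual transform engine.

```python
# Generic velocity transforms on a list of MIDI velocities, clipped to 1-127.
# Illustrative only; real sequencers wrap these in selection criteria and GUIs.

def clip(v):
    return max(1, min(127, int(round(v))))

def add(vels, amount):            # every note louder by a fixed amount
    return [clip(v + amount) for v in vels]

def scale(vels, factor):          # proportionally louder or softer (multiply)
    return [clip(v * factor) for v in vels]

def fix(vels, value):             # every note the same loudness (constant)
    return [clip(value) for _ in vels]

def ramp(vels, start, end):       # crescendo, or decrescendo if start > end
    n = len(vels)
    return [clip(start + (end - start) * i / max(1, n - 1)) for i in range(n)]

phrase = [64, 70, 58, 90, 75]
print(add(phrase, 10))            # [74, 80, 68, 100, 85]
print(scale(phrase, 0.8))         # [51, 56, 46, 72, 60]
print(ramp(phrase, 40, 120))      # [40, 60, 80, 100, 120]
```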

The Notation Window Notation and MIDI don’t really get along. Notation is a shorthand indication of intentions, while MIDI is a precise record of what is actually played. The conflict is clearer when you think of the difference between a note’s value and its duration. The value is quarter, eighth, and so on, and is actually an expression of the time between the starts of notes. Duration is the time the key is down, usually somewhat less than the full value. When you make a transcription of a MIDI performance and simply convert the timings into note equivalents, you usually get a parody of musical notation. (See Figure 8.5.) There are improbable ties of long and short notes, random shifts from duple to triple subdivisions, and blinding flurries of rests. With a skilled performer, some pretty intelligent software, and judicious tweaking by the


FIGURE 8.5 Notation editing window.

user, a transcription can accurately reflect the underlying music, but the process usually takes twice the time it would take to write the notes down in the first place. The opposite process is only slightly more successful. When notation is played by the computer, you get a machinelike performance with every note precisely on time and exactly five levels of velocity. This is good for study purposes, but it doesn’t have the life we have learned to expect from music made by real performers. With all of these limitations, why bother to put a notation-type editor in a sequencer anyway? Aside from feature creep, there are some good reasons. For one thing, transcriptions are improving. Dedicated notation programs are now pretty good at producing readable output, and software in research labs is much better. The current level of notation in sequencers is good enough that musicians can figure out the rhythms and the pitches are accurately rendered. That’s all we really need. A literate musician can look at sloppy notation and understand what is represented, but I don’t know anyone who can do that with piano rolls. (There probably are such people, I just haven’t met them yet.) Notation may be the best format for correcting a performance. The method for finding an error in the piano roll is pretty tedious: hunt around with the arrow keys and listen to the notes, stop when the offender is heard and look at the labels to identify the sour one. In notation, the goof just shows. Notation is a compact display format. A piano roll requires about an eighth inch of height per rectangle and a similar vertical space between rows. Note heads come ten to the inch and actually overlap. You can show five octaves in less than two inches of screen, which means there can easily be up to eight parts in a window. This makes notation the best way to display several tracks at once. You can see exactly how the parts fit together and what the harmonic relationships are.


There are some tricks that will let you tidy up the rhythmic notation you get from a transcription. The notation window features yet another quantization control: display quantize. This determines the smallest note or rest value you will see but doesn’t affect playback. The catch is sometimes you want to see thirty-second notes, sometimes you don’t. The trick (at least in Logic) is to split the region when you need different display settings. You can also (again in Logic) make individual notes exempt from the display settings and fix their form. Taken to extremes this means making every note independent, but it’s seldom necessary to go so far—the objective is to get a workable display, not a printable page. Once the display is legible, pitch and rhythm corrections will go much faster than in the other editors. The only odd thing to get used to is that you still change duration by dragging on the note head, which is not an intuitive approach. When you change duration, keep an eye on the length in beats and ticks. The values at which the note shape changes are a bit mysterious and you want to know what will actually be played. I often use the notation and piano roll editing windows together. With two views of the data open I make note-level corrections in the notation window and add precision interpretation in the piano roll.

Tempo Tracks Good sequencers have flexible control of tempo. The heart of this is found in the tempo track. This is usually a graph that indicates current tempo. You can edit the line to produce sudden or gradual changes with as much flexibility as you have the patience to draw in. If your project combines MIDI tracks and recorded audio, you must pay attention to the tempo in the recordings. If the audio was produced first, edit the tempo track so that the MIDI metronome matches the beat in audio. Many programs allow you to change the tempo in the audio to match the MIDI tempo, but there may be audible side effects.

Synchronizing Movies One important function of the tempo track is synchronization. Most professional sequencers allow you to display video that is synced to the play engine or lock playback to an external time source such as a video deck. Time in video is in the SMPTE format of hours, minutes, seconds, and frames, with a frame rate depending on the source. The video plays at a constant speed, so if you want a certain visual event to happen on a downbeat, it is the music that has to adjust. If you know the frame number of a visual event and the measure number of the target downbeat, calculating the tempo that will match them up is pretty simple. First figure out how many seconds there are from the start to the event, multiply that by the frame rate, and add the frames of the last partial second. If the SMPTE time is 00:01:17:12 and


the frame rate is 30, that works out to 2,322 frames. Next figure out how many beats to the target downbeat, say 128. Divide frames by beats: 2,322 ÷ 128 = 18.14 frames per beat. Now divide that into the frames in a minute: 1,800 ÷ 18.14 = 99 bpm. Actually the number is 99.22, but we round down because we want the image to be slightly ahead of the music. There are SMPTE calculators that make this easy. Some sequencers make it even easier: you mark a beat and enter the desired SMPTE time and the sequencer automatically adjusts the tempo to make them match. There’s a lot more to creating scores for video than matching hit points, but the sync process is basically that simple. One word of advice: plan all of your hit points in advance and do them in order. Inserting a hit point in the middle and keeping the following hits synchronized can get very complicated. There are a couple of quirks of SMPTE time format to be aware of if you work much with video. The format is frame based, and the number of frames in a second depends on whether we are looking at film or videotape and where it originated. The following are the rates in common use.
American film: 24 fps
British film or video: 25 fps
American broadcast TV: 29.97 fps
American video and digital TV: 30 fps
Video for computer viewing will be in QuickTime or Windows Media formats, so the frame rate will probably be 30 fps. QuickTime video is often at 15 fps, which is more Internet friendly. In that case, the sequencer will work at 30 fps and show each frame twice. You will encounter the other rates if you work in a production studio where the images are on tape or film. The slightly slow rate for broadcast TV leads to some bookkeeping problems—an hour of broadcast video has 108 fewer frames than an hour of 30 fps video. To fix this, the broadcast format skips some frame numbers: the first frame in a minute starts with the number 2 unless the minute is a multiple of 10, a system called drop frame. SMPTE time runs from 00:00:00:00 to 23:59:59:29. After that it rolls back to all zeros. This is awkward on many types of equipment, so most films actually start at 01:00:00:00 or anywhere besides zero. The video menu in the sequencer includes a value for SMPTE offset to specify the start time for the video. You will also occasionally see a subframe number expressed as 00:00:00:00.00. There are technically 80 subframes to a frame, but a lot of programs divide the frame by 100 instead. There is nothing musically interesting about subframes, as hitting the right frame is all that is necessary. If you poke around in movie scoring textbooks, you will see references to tempo in clicks. A click tempo is the number of frames per beat (per metronome click). This is left over from the days when the metronome was connected to the sprocket holes in film. 100 bpm is equivalent to a click of 14 1/2 at a frame rate of 24. There are reference books and calculators that show the relationship of clicks, frames, and beats in various tempi.
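The hit-point arithmetic is simple enough to script. The Python sketch below reproduces the worked example above; it assumes a non-drop frame rate and rounds the tempo down so the picture stays slightly ahead of the music.

```python
# Tempo needed so that a given beat lands on a given SMPTE frame.
# Assumes a non-drop frame rate; rounding down keeps the picture
# slightly ahead of the music, as recommended above.
import math

def smpte_to_frames(hh, mm, ss, ff, fps=30):
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def hit_point_tempo(hh, mm, ss, ff, beats, fps=30):
    frames = smpte_to_frames(hh, mm, ss, ff, fps)
    frames_per_beat = frames / beats
    bpm = (fps * 60) / frames_per_beat
    return math.floor(bpm)

# Worked example from the text: hit at 00:01:17:12, downbeat of beat 128, 30 fps.
print(smpte_to_frames(0, 1, 17, 12))        # 2322 frames
print(hit_point_tempo(0, 1, 17, 12, 128))   # 99 bpm (99.22 before rounding down)
```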


Playing video is processor intensive, so you will usually want to work with a lowresolution image. The difference between a 4-inch and an 8-inch picture can equal half a dozen plug-ins.

COMPOSING IN MIDI EDITORS Now it’s time to begin composing directly in the sequencer. The process is similar to the traditional approach of composition followed by performance. However, in sequenced music the composer is also the performer. You are not only responsible for inventing the notes, you need to provide the interpretation you might get from a highly skilled performer. The best way to get this knowledge is to be a highly skilled performer, but it’s the rare one of us who can excel on many instruments and in many styles. The practical alternative is to listen and study.

The Composition The work begins with getting the notes down. One approach to a fairly traditional piece is to start out with notation. Usually, this means working in a traditional notation program like Finale or Sibelius, but a sequencer notation window will do for less ambitious work. If a different program for notation is used, the score has to be exported as a MIDI file, then imported into the sequencer. Once this is done, it’s awkward if not impossible to get back to the notation program. For free-wheeling electroacoustic pieces, working directly in the piano roll editor, drawing or step-recording notes is a good approach. Another approach is direct drawing. This starts in the main window: create a track with a long empty MIDI region (the pencil tool, or something like it, will make empty regions, or hit record and do nothing). Opening the piano roll editor, you will see a vast space for potential notes. Draw a note of the right duration for each sound you want, then modify the velocity and add controls as needed. In the early stages, play the entire track after every note is entered to judge the effect. After a while, you will learn the preset well enough to begin working a phrase at a time. My personal style features a lot of variations on a few phrases, so I often copy sections and change the copies. The percussion display makes it easy to set up a drum part: I build up a basic pattern with just a few mouse clicks, repeat paste as many bars as I need, then go back and add fills and flourishes. Step entry is more useful when working from a rough paper sketch. You might find it peculiar, in the midst of all this technology, to use analog tools, but I find paper and pencil, not to mention an eraser, fast and flexible for music. Maybe it’s because paper and pencil allow for sloppiness, which can encourage creativity. The notation programs are not nearly as awkward to use as they once were, but they produce what appears to be a finished product, and I think they promote fussing


over details rather than going for the grand sweep. Another advantage with paper and pencil is that you can keep alternate versions of things and use whatever symbols and graphics make sense to you. There’s no better approach than directly recording a performance, at least if you have the necessary keyboard skill. Many composers lay down backing tracks with the tools discussed above, then perform the lead parts. You may find performing easier if you play from a pencil score. It can also help to record at a reduced tempo, and you should plan on combining several takes. Focus on rhythmic accuracy as you perform. Fixing wrong notes is trivial, but cleaning up sloppy rhythms can take a lot of time. There’s not a lot to say about the composition itself as you, the composer, will be making choices about which sounds to use and what order to put them in. The advice given in earlier chapters about foreground, balance, and masking still apply. Since MIDI instruments are definitely pitch based, you need to have your harmony chops in order.

The Performance Once a section is composed, go over it again to add performance details. One aspect to consider is editing velocity to give an arch for most phrases. The high point of the arch is not necessarily the highest note, but the one you consider most important. It may be the longest note, the note on the beat, or the note at the end. A nonharmonic note may be pushed a little to give it more bite. The arch is not strict: notes between beats may be somewhat lighter than the accented notes of the phrase. If I write dee-da DEE-da DEE-da, you can probably imagine what I mean. Similarly, velocity contrast brings out the independent lines in contrapuntal writing. You will also need to edit the starts of notes, since the pencil tool will put everything precisely on the grid. This is especially necessary with chords, which lack flavor if they are too exact. Some notes in a chord are more important than others; they may be part of the melodic line, the moving note in a chord progression, or the blue note in a jazz chord. Important notes can be emphasized by being a tiny bit early—1/48th or less depending on tempo. Otherwise, adjust chords to build from the bottom up. Note starts in melodic lines should match the style of the music. For instance, swing in jazz implies that the second of a pair of eighth notes will be played a bit late. But this variation isn’t always the same, which is why the auto swing feature of the sequencer usually sounds lame. Sometimes both notes will be late, sometimes they will be straight. Six eighth notes in a row can be given several different interpretations. Furthermore, unequal divisions weren’t invented for jazz. In the Hardanger tradition in Norway, for example, the third beat of a dance in 3/4 is extended to match the lifting motion of the dancers; at least it is in some towns. Neighbors in the next valley do the lift on the second beat, and the musicians oblige. Your ear has to be the judge in all of this—all I can say is precision is dull.
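One way to experiment with this kind of phrase shaping away from the keyboard is to rough it out numerically. The sketch below is a speculative Python illustration, using made-up note tuples rather than any program’s data, that bends a phrase’s velocities into an arch peaking at whichever note you decide is most important.

```python
# Shape a phrase's velocities into an arch peaking at a chosen note.
# Notes are (start_tick, duration, pitch, velocity) tuples (hypothetical format).

def velocity_arch(notes, peak_index, floor=60, peak=110):
    shaped = []
    last = len(notes) - 1
    span = max(peak_index, last - peak_index, 1)
    for i, (start, dur, pitch, _) in enumerate(notes):
        dist = abs(i - peak_index) / span          # distance from the peak, 0..1
        vel = int(round(peak - (peak - floor) * dist))
        shaped.append((start, dur, pitch, vel))
    return shaped

phrase = [(i * 240, 200, p, 80) for i, p in enumerate([60, 62, 64, 67, 65, 64, 62])]
for note in velocity_arch(phrase, peak_index=3):
    print(note)
```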


The duration of notes determines the articulation. The pencil tool creates notes that stretch all the way to the grid, producing a legato line. Some synthesizers have a setting that will play notes that overlap as if on one breath, so stretch them to make use of the feature. Shorter notes produce a progressively staccato effect. Very short notes are sensitive to synthesizer settings. At some point you won’t hear anything because the sound does not get a chance to evolve. The whole issue of duration is clouded by the long release times typical of electronic instruments, so don’t be surprised if you have to re-edit durations after changing presets. Placement of control changes is absolutely dependent on how the synthesizer works. Some instruments will only respond to controls that are set before the key goes down, others can be changed while the note is sounding. The control message just before the note establishes a starting value. A change during the note is probably a gesture sweeping up or down. The quantize settings will determine the event spacing of a sweep of controls, and this dramatically affects what you hear. A series of controls that step on sixteenth notes is probably going to create a noise like a zipper makes. On the other hand, there is no sense in going faster than 200 steps per second. Such detail is not perceivable and some synthesizers can’t keep up with so much data. At a tempo of 120, the fastest you need go is a ninety-sixth note step. Pitch bend is treated like a control, but the off position is the middle of the range. This is displayed as zero, and bends down are shown as negative values. Pitch bend stays where you put it, so bends must be restored to zero in order to start the next note on pitch. If you are bending notes to adjust tuning, as you might when working with sampled sounds, the best place to work is in the event list. The control values entered with a pencil tool tend to be approximate, but the event list allows you to specify exactly what you want.
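A bend gesture is a good example of control spacing in practice. The Python sketch below is a generic illustration (it assumes the common 14-bit bend range of -8192 to +8191 with zero at center, and 960 ticks per quarter): it steps a sweep at ninety-sixth-note intervals during the note and restores the bend to zero afterward.

```python
# Generate a pitch-bend sweep during a note and reset it to zero afterwards.
# Values use the common 14-bit MIDI bend range, -8192..+8191, center = 0.
# A step of one ninety-sixth note keeps the sweep smooth without flooding the port.

PPQ = 960
STEP = PPQ // 24          # one ninety-sixth note in ticks (40 ticks at 960 PPQ)

def bend_sweep(start_tick, length_ticks, target=4096):
    events = []
    steps = max(1, length_ticks // STEP)
    for n in range(steps + 1):
        value = int(round(target * n / steps))            # ramp from 0 to target
        events.append((start_tick + n * STEP, value))
    events.append((start_tick + length_ticks + STEP, 0))  # restore to center
    return events

for tick, value in bend_sweep(start_tick=1920, length_ticks=480):
    print(tick, value)
```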

Combining Concrète and Synthesis There’s no technical reason not to freely mix MIDI tracks, edited sounds, and audio overdubs in one project. The order in which you do things depends on the nature of the material. Putting rhythmic material down first provides something to guide the other parts. On the other hand, if the piece includes a score from a notation program or a complex edited bit of musique concrète, those will be the foundation. It’s a good idea to start with scratch versions of tracks, knowing that you will redo them later. This is especially likely with vocals. A common working method is to put in enough rhythm and harmony to sing along with and then rough in the vocal track. Then it’s easy to add other materials that respond to the lyrics and expand the orchestration without masking the voice. A really fine vocal retake (several, more likely) finishes the piece off. Some composers build a basic quantized drum track to work with, then bring in a real set of drums when the piece is finished. This fixes a classic “chicken or egg” problem with this kind of production: the best drum track is a live one, but drummers need other parts to listen to when they play, so how can you put the drums down first?


Capturing Output If you have a giant studio, you will be able to lay out the entire piece on synthesizers before you record a thing. But our ambition is often beyond our pocketbook, and we run out of boxes (or computing power) before half the tracks are finished. This is where the combination DAW and sequencer excels. Audio playback is much more efficient than synthesis, so rendering an internal instrument track to disc will free up processor power to continue the work. Likewise, you can capture an external instrument by recording its output. Input latency will create a sync problem, but this is easily fixed by moving the new audio track forward a bit. After the track is converted, mute the MIDI version but don’t delete it. You are bound to find something you need to change. There is a question of whether all synthesized tracks need to be converted to audio tracks before the final mix. It’s not necessary if the mix is simple, but most of the time you will find mixing easier if all tracks are in the box. Besides, if you capture everything, the entire project can be archived and you won’t need extra hardware to get it back. The actual mix process is no different from what we learned for acoustic projects. Get the rhythm and bass (or their equivalents) balanced, set up fader moves with the automation, and bounce. You will find little need for processing synthesizer tracks unless you have a vintage unit with a lot of hiss. In that case, the noise gate is your friend.

EXERCISES It’s best to start with simple projects that address one section of the program before trying something more complex.
1. Write a short étude for each of the following approaches:
• playing notes in
• entering notes directly
• importing a score from a notation program
2. If you have the necessary performing skills, use MIDI recording to capture a piece you already know. Edit it to make the performance better. If you aren’t a performer, download a poor MIDI rendition of a piece you like (there are thousands on the Internet). Edit it to make the performance better.
3. Compose a dance (you might come up with anything from a minuet to techno).
4. Find a comic strip, such as a classic Calvin and Hobbes or Peanuts. Treat the panels like a movie storyboard and compose a score for it. (When you play the resulting score, you can show the cartoon on a projector.)


5. Copy a scene from a movie, discard the audio, and compose a new score for it. This is the best way to get scoring experience.

RESOURCES FOR FURTHER STUDY The manual for your sequencer should command your attention here. In fact, most of the following chapters will introduce new software with manuals to study, so you can just assume I am urging you to keep up. Many of the most complex sequencers and DAWs have inspired authors to write “missing manuals” and tip books. You may find some of these as useful as the official documentation.


NINE Samplers

A BIT OF HISTORY The sampling synthesizer is the direct descendant of the Chamberlin Music Master and the Mellotron, instruments that were ubiquitous in the 1960s (the Mellotron was the British version, made under license from inventor Harry Chamberlin). Both had keyboards connected to an apparatus that would play a short tape recording when a key was hit. The sounds were flutes (as in the Beatles’ “Strawberry Fields Forever”), violins, and other traditional instruments. The tapes were in an odd format that only the factories could make, but a few enterprising engineers like Don Buchla at the San Francisco Tape Music Center managed to load their own sounds. These instruments were less common in the United States than in Europe because the American Federation of Musicians banned them from recording sessions. They fell by the wayside during the keyboard explosion of the 1980s, but you can now buy a new Mellotron with the original sounds. The torch of instrument imitation was picked up by several companies, including Fairlight of Australia, AKAI in Japan, and Ensoniq and E-mu Systems in the United States. Their products replaced tape with digital recording and a pitch-changing algorithm. Since they use sample notes from an instrument to play a wide range of pitches, the name “sampling synthesizer” was given to them and quickly shortened to sampler. (The fact that a recording in a sampler and a single value from the audio file are both called samples is endlessly confusing.) E-mu Systems was probably the most successful sampler manufacturer. Their product was named the Emulator, and the final hardware model could record and store several minutes of sound. It dominated the market, especially after adoption by many famous artists. Emulators were made for twenty years, and the software version has carried the name on to personal computers. (Disclaimer: I worked for E-mu during the heyday of their hardware samplers, and I gathered much of the material for this chapter during that time, but I won’t be comparing or recommending any one brand over another.) One thing E-mu and other companies quickly discovered was that sampling synthesizers did not actually have to have the ability to make samples. As we shall see, sampling can be a tedious process, and many musicians don’t want to be bothered. All they want are some strings to sweeten the mix or a portable keyboard that


sounds like a Steinway. Manufacturers did not make much money on the instruments themselves—most of the profit was in the sample banks. Several third-party entrepreneurs set up shop to create banks of sounds, recording instruments from lounge bands and marching bands. E-mu’s best-selling products were the Proteus line of tone modules. The first Proteus machine was a simple playback device with traditional instrument sounds stored in read-only memory. When users tired of the sounds, they could buy the Proteus II, then the Proteus World, and so on. Nearly all synthesizer manufacturers eventually wound up following a similar approach. There are few hardware samplers being made today. In almost all cases sampling has been reduced to an option in another kind of synthesizer. If you really want to create your own samples, look to your computer. Software samplers like Native Instruments’ Kontakt and E-Mu’s Emulator X for Windows and the EXS24 supplied with Logic on the Macintosh are economical and powerful alternatives to hardware. The more elaborate studios have one or more computers dedicated to sampler duty, so the hardware concept isn’t entirely dead, it has just become a do-it-yourself project.

THE SAMPLING PARADIGM Samplers imitate an acoustic instrument by playing recorded waveforms. The theory is that if you capture each note of an instrument, you can play the notes back in any order you please. A lot of memory is required, because there are different sounds for each velocity level on all pitches and nuances such as different bowings on strings and articulations on winds. These massive sample banks are now possible as software (with libraries of 100 gigabytes costing about $1,000) but were never practical on a hardware instrument. The secret of the hardware samplers was clever shortcuts, and even with massive drives available for sample storage, these tricks are still important. Shortcut number one is pitch shifting. You can’t change a note’s pitch much without running into the chipmunk effect, but for most instruments, you can move the pitch a third or so. Using shifts up to a major second saves four-fifths of the memory space. The second shortcut was for velocity. No one ever recorded 127 different velocity levels—how exactly would you play them on a piano anyway? Options for piano, mezzo-forte, and fortissimo are considered generous, with the gaps filled in by a velocity-controlled amplifier. The fundamental memory savings is in looping to extend the duration of the note. Of the three distinct phases of a note (attack, sustain, and decay) the attack is the most characteristic. If you splice a violin attack onto a saxophone note the result still sounds like a violin. The sustain is mostly a steady waveform, and the decay is usually that waveform fading away. Samplers loop the steady waveform. Looping can cut the storage requirement for a note of any length down to about a third of a second.
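The pitch-shifting shortcut boils down to one ratio: moving a recording by n semitones means resampling it by 2 to the power n/12. The sketch below is a generic Python illustration, not any sampler’s engine, showing the playback rates for a sample rooted at C4 and stretched across the five-key zone that yields the four-fifths memory saving.

```python
# Playback-rate table for one sample stretched across a five-semitone zone.
# Shifting by n semitones means resampling by 2**(n / 12); a root of C4
# (MIDI note 60) covering notes 58-62 lets one recording serve five keys.

ROOT = 60  # MIDI note of the recording's own pitch (C4)

def playback_rate(midi_note, root=ROOT):
    return 2.0 ** ((midi_note - root) / 12.0)

for note in range(58, 63):
    print(note, round(playback_rate(note), 4))
# 58 0.8909, 59 0.9439, 60 1.0, 61 1.0595, 62 1.1225
```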


The flavor lost by all of this pruning is potentially replaced by processing and special effects. The original waves may be from acoustic imitations, but the entire machinery of a synthesizer is available, so the sounds can be dynamically enhanced by filters, compressors, and even reverb. In fact the list of samples usually includes basic waveforms, so you can do most of the synthesis described in chapter 12 and still fake a piano sound. Add in the ability to make your own samples out of recordings of dogs or kitchen implements, and you have a very powerful instrument indeed. If I were stranded on a desert island and could only have one instrument, I’d probably choose a sampler.

RECORDING SAMPLES I strongly recommend learning to make samples before accessing commercial sample libraries. This is because I consider a sampler to be a logical and powerful extension of musique concrète, not merely a decent fake piano. So far in this book, we have learned how to compose with raw sound at a digital audio workstation. With a sampler, the same sounds are at your fingertips. An instrument or laptop can be loaded with any manner of samples and used for live performance, art installations, or theater sound effects. In this section I’ll describe the process of collecting samples and laying them out for performance. This requires two applications. The first is a recording environment, since few software samplers actually include the ability to record samples. Samples have to be created elsewhere and imported. (The editor we have been using all along will do fine.) The second program is a specialized sample editor. A few samplers have decent editing built in, but many do not. The examples will be illustrated with a program for Macintosh called Keymap Pro, which incorporates all of the tools needed to make sampling efficient and flexible. Keymap produces output in formats which can be imported by almost all sampler programs. If you are working in Windows, you will probably find the tools included in Kontakt or Emulator X adequate. You have already learned how to record. As always, the goal is to get as strong a signal as possible without any overloads. For use as a sample the recording must be absolutely free of background noise. I generally record several tries at the sound or a batch of sounds in one take, then isolate the best version and create a new file for each sound. Choose names for the files that identify the source and base pitch. If you don’t know the base pitch, a simple analysis program like Transcribe! can help you figure it out. The ideal length for the sound sample depends on the nature of the sound and the requirements of the composition. If the sound is a distinct event, you want all of it. If it is a rhythmic pattern, you need two or three complete repetitions. If the sound is a sustained tone, you need only enough for the pitch and volume to stabilize, usually a half note at moderate tempo. Always record some of the silence after the sound.


It takes a while to capture a complete instrument. The performer needs to play each note on the instrument at least three times. Rather than starting with the lowest note and working up, have the performer play major scales in the middle range to start. This produces a more consistent tone and better intonation. Play in enough keys to get all of the notes several times. Cover the low and high registers by playing intervals, maybe triads. The idea is to get the performer to lose self-consciousness and hit the notes in a relaxed manner. This will produce a lot of material to wade through, but this is to be expected. The producers of the Vienna Instrument Library claim to record 40,000 samples per instrument. Once a sound has been isolated in its own file, it will need tweaking before it is ready to use as a sample. The following chores almost always need to be done for each sample.
Truncation: The files should be cropped right up to the beginning of the sound. Any silence at the beginning of the file will slow the response time when the sound is ultimately assigned to a key on the keyboard. Likewise, truncate any excess space after the sound ceases. This is not as important, but it’s always good practice to conserve memory. If the sample is a beat pattern, the end should be right at the beginning of the first note of the next cycle.
Fade-in: If the recording is a segment of a continuing sound, it probably starts with a sudden pop. You can soften this with a short fade-in at the beginning. A fade time of 8 milliseconds will produce a crisp attack without the pop. A fade at the end is less important, since the sampler will probably fade by default, but it’s a good idea to apply one, since you never know if the key will be held longer than the sample. A 100 millisecond fade-out will usually sound natural.
Level adjustment: The volume of the samples will be affected by velocity when they are played, but the keyboard will not feel natural unless all of the sounds are the same loudness. Don’t normalize the samples; instead, adjust gain for any samples that aren’t in the -12 to -6 dB range. Don’t fight the Fletcher-Munson curve. Low-frequency samples are going to sound a bit quiet, but at this stage they should. There will be plenty of opportunity to boost the bass later in production.
Pitch correction: Compare the recording to a reference note from a keyboard. If the sound is out of tune, tweak it with the pitch change process. Pitch can be corrected in the sampler, but those algorithms are optimized for speed and probably can’t match the digital signal processing quality of your editor. You will get the best results if you don’t preserve duration.
Noise removal: Check carefully for noises in the recording. Listen to the file at a reasonably loud volume, and if necessary, process it with some EQ or a noise removal program. I cannot emphasize enough how troublesome noise will be in sample files. Hum or background motor noise are the most problematic. You are likely to miss them if they are part of your normal environment, but they


Compression: I don't suggest routine compression, but the possibility exists, depending on the intended use for the sound. If it is just going to be triggered as an event, leave the dynamics alone. However, wide dynamics can get in the way if a multisound pattern is to be looped. As always, compress only as much as is necessary. Don't compress until you have tried the original sound in the sampler, and always keep the uncompressed version.

DC offset removal: DC offset means the zero point of a recording is a little above or below true digital zero. It's not a recording mistake; it comes from the design of the recording hardware. Offset can cause a thump when the key is played. There's no harm done by removing offset from a file that doesn't have any. If your editor does not have offset removal, extending the fade-in at the beginning might do well enough.

Looping: If your audio editor has looping tools, give them a try and compare the ease of use with the looping tools in the sample editor. The latter are usually superior.

Why do all of this work in the audio editor when the sampler programs have the same tools? Because you are likely to use a sample many times. Anything you do at this stage will only need to be done once, but if you wait until after you have imported the sample, you will tweak the same item again and again. Besides, even the best samplers have limited editing. Most audio editors have a better interface and more powerful features.
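When a whole batch of files needs the same treatment, these chores can also be scripted. The following is a minimal sketch of that housekeeping, not a feature of Keymap or any particular editor; it assumes the Python libraries numpy and soundfile, and the file names are hypothetical. It trims silence, applies the 8 millisecond fade-in and 100 millisecond fade-out, removes DC offset, and nudges the peak level toward -9 dB, the middle of the -12 to -6 dB range.

# Batch sample cleanup sketch: truncation, fades, DC offset, gain.
# Assumes the numpy and soundfile libraries; adapt to your own tools.
import numpy as np
import soundfile as sf

def clean_sample(path_in, path_out, fade_in_ms=8, fade_out_ms=100,
                 silence_db=-60.0, target_db=-9.0):
    data, sr = sf.read(path_in)
    if data.ndim > 1:                          # mix to mono for simplicity
        data = data.mean(axis=1)
    data = data - data.mean()                  # remove DC offset
    threshold = 10 ** (silence_db / 20)
    loud = np.where(np.abs(data) > threshold)[0]
    if loud.size:                              # truncate leading/trailing silence
        data = data[loud[0]:loud[-1] + 1]
    fade_in = min(len(data), int(sr * fade_in_ms / 1000))
    fade_out = min(len(data) - fade_in, int(sr * fade_out_ms / 1000))
    if fade_in:
        data[:fade_in] *= np.linspace(0.0, 1.0, fade_in)
    if fade_out:
        data[-fade_out:] *= np.linspace(1.0, 0.0, fade_out)
    peak_db = 20 * np.log10(np.max(np.abs(data)))
    data *= 10 ** ((target_db - peak_db) / 20)  # nudge the peak toward -9 dB
    sf.write(path_out, data, sr)

clean_sample("flute_A4_raw.wav", "flute_A4.wav")  # hypothetical file names

The point is not the particular library but that every chore listed above reduces to a simple, repeatable operation on the raw sample data.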

Looping

Once the master sample has been tweaked and truncated, it is ready to be imported into Keymap or the sampling program. In Keymap, this is a two-step process. First the imported file is opened in Keymap's own sample editor, as shown in Figure 9.1. The second step is to map the sample to a zone, or key range, in the instrument editor. There's still work to do before the final assignment, but it is instructive to map the sample across the entire keyboard to start with. The extreme ends of the range can produce interesting and unexpected results, and they tell a lot about the sample. For instance, if there is any delay at the start of the sample, low notes will bring it out. The most important parameter in mapping is root pitch—this is the key that will play the sample as is (root is called origin or key note in some programs).

The Keymap editor is a conventional editor, with playback buttons and two views of the file. The upper view of the waveform allows general editing and displays markers for the loop points. These are commonly labeled LS and LE for loop start and loop end.


FIGURE 9.1 Sample editor. (Closeup of the loop point: the dark curve is the wave within the loop; the light curve is the wave outside the loop.)

If looping is enabled, the region between the markers will be repeated as long as the MIDI key is held down. The lower window shows the loop splice point. The waveform to the left of the center line is the end of the looped section, and the waveform to the right is the loop beginning; thus this window shows the transition that happens as the loop repeats. In this version of Keymap, the waveform just outside the loop (before the start and after the end) is shown as a grey line, which is helpful in showing possible connections. Figure 9.1 deliberately shows a bad loop point to emphasize how the window works. To get a clean loop, the wave must cross the line smoothly, as shown in Figure 9.2.

FIGURE 9.2 A nicely placed loop.

You define a loop by placing the start and end points approximately where they need to be, then nudging them one sample at a time. As we discovered in general editing, you have better luck if you look for zero crossings, but what you really need to match is the slope of the curve. This becomes easier with practice. Unfortunately, just getting a good-looking line is not enough to ensure that the loop will sound good. DVD example 9.1 demonstrates the process of fine tuning the loop. The end of this video features the sample at various pitches—notice that a loop setting may work better at some pitches than others.

To begin with, you need to pick a region without artifacts that will stand out when repeated. Any background click or pop will become a rhythm or buzz that screams out "Here's a loop!" The wave amplitude at the loop end must match the start. This means good regions for loops will usually appear relatively flat in the display. If there is too much level change, you may have better success with a compressed version of the sample. In addition, any component of the sound that is making a gradual transition will become audible unless its loop points also match. If the sound has any vibrato, you need to set a loop that works with the vibrato rate. This doesn't necessarily restrict the loop length—the best loop may include two or three vibrato cycles. If there is no vibrato, any variation of pitch or volume within the loop will produce vibrato at the loop rate. You should keep such vibrato slower than about two per second. Absolutely inaudible loops are rare, especially once we start listening specifically for the looped sound. A usable loop has just the slightest hint of repetition at an attractive rate. DVD example 9.2 has several loops with audible artifacts. Open the file in your editor and see if you can find the problems.

Some sample editors have an autoloop feature. This performs an analysis of the sample and moves the loop points to a possible sweet spot. Generally, the closer you have come to setting the best loop by hand, the more successful autolooping is. For loops where a pop just won't go away there is loop crossfade. Crossfade is illustrated in Figure 9.3. This function will blend in material from just before the start and just after the end to smooth the transition. A crossfade of 5,000 samples or so will iron out most pops; shorter fades may produce their own gentle thump.

FIGURE 9.3 Crossfade loop. (The crossfade blends in material from just before the loop start and just after the loop end.)

The fade shape can be set to linear or logarithmic. Linear works most of the time, but you should experiment to see which gives the smoothest-sounding transition.

If you have a steady tone, you may be able to loop a single cycle. This is tricky, because the loop can only start and stop at sample points. If the frequency of the wave doesn't mesh perfectly with the sample rate, the loop will be out of tune. This problem is easier to understand if you think of period rather than frequency. The individual samples are spaced at 22.675 microsecond intervals (at 44.1 kHz), and A440 has a period of 2272.727 microseconds. A single cycle of 440 Hz will take 100.23 sample periods (divide the tone period by the sample period). You can't have a fraction of a sample period, so it will be 100 (4 cents sharp) or 101 (13 cents flat). The error increases as you go up in pitch. Some samplers let you retune the loop to compensate, but in most programs you need to include several cycles in order to achieve good intonation. For the midrange, a loop of four cycles should sound OK. Single-cycle looping also won't work on sounds with a noise component, like a breathy flute. Remember, noise is random. Once you lock it into the repeating cycle of the loop, it's not random any more and will produce a buzz.
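The intonation arithmetic above is easy to check before you commit to a single-cycle loop. Here is a worked sketch in Python, assuming a 44.1 kHz file; the function name is made up for illustration.

# How far out of tune is a single-cycle loop of N samples?
import math

def loop_cents_error(target_hz, sample_rate=44100):
    exact = sample_rate / target_hz          # ideal loop length in samples
    for n in (math.floor(exact), math.ceil(exact)):
        looped_hz = sample_rate / n          # pitch you actually get
        cents = 1200 * math.log2(looped_hz / target_hz)
        print(f"{n} samples -> {looped_hz:.2f} Hz ({cents:+.1f} cents)")

loop_cents_error(440.0)
# 100 samples -> 441.00 Hz (+3.9 cents)
# 101 samples -> 436.63 Hz (-13.3 cents)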

Sample Envelope

You may wish to modify the attack and decay of the sample. The sampler has attack and decay settings, so you seldom need to change these in the editor, but it's occasionally useful.


You can apply a fade-in to the sample, which usually has the effect of softening the attack. A fade-out will give a graceful end to sounds chopped short in recording. If looping is activated, the fade-out will probably not be heard. Most samplers end notes by continuing the loop as the playback envelope fades the sound. You occasionally see an option to finish the sample during the release, but it's hard to hear the difference on most sounds. If you lengthen an attack by editing the sound, no setting in the sampler can shorten it again.

Pitch and Samples

As I mentioned, the base pitch of the sample should probably be set before importing the file. Samplers and sample editors also have pitch correction features, and you should experiment to discover which application has the superior algorithms.

Keymap has a resynthesis feature, a powerful tool for many problem samples. In resynthesis, the sample is deconstructed by Fourier analysis and reconstituted with changes in frequency and duration. This addresses one of the knottiest problems with sampled sounds, which is how to deal with samples that have varying pitch. Consider animal vocalizations—these almost always have sliding pitch. If you use them as is, you generally tune them so that the final part of the sound is in tune. What happens if you play a short note? The pitch never reaches the tuned point, so the effect is flat or sharp. In addition, sliding notes will seem to tune differently in different chords. With resynthesis, you can reduce the pitch excursion. Keymap has a nicely automated approach, showing the pitch envelope as a red line and letting you specify the percentage of adjustment you want. That will seldom be 100 percent, as the pitch variation is probably part of the charm of the sound.

PROGRAMMING VOICES

Mapping

Once the sample produces a convincing note, it's time to place it on the keyboard (Figure 9.4). To understand this process, we need to look at the architecture of samplers. The sound recordings (samples) available for playback are all loaded into a section of memory. This enables instant response when a key is pressed. The sampler also maintains an instrument definition, which contains instructions for each key—which sample to play and exactly how to play it. This way one sample can appear in several instrument definitions. Each brand of sampler has a somewhat different terminology for how these definitions are managed. The most common notion is the "zone," which refers to a range of keys that share a sample. The zones are contained within an "instrument" (Logic) or "voice" (E-mu) that applies signal processing to the sample as it plays. Zones may overlap, so it is possible for a single key to trigger several samples.


It's also possible to define zones by velocity range. If you set up two zones across the same keys but with different velocity ranges, a hard hit will produce one sound and a soft hit another. This is called velocity switching, or velocity crossfade if the transition is gradual. Another interaction can be described as the "high-hat trick": two zones are linked so only one sample can play at a time. Striking one key stops another sample, much as closing a high-hat pedal chokes off any sound the cymbal is already making. In EXS24 this is done at the group level. Groups are an intermediate level of organization that lets you define common settings for arbitrary combinations of zones. For instance, you set up the high-hat by assigning the open hit and the closed hit to separate zones in the same group. You then reduce the number of voices (polyphony) for the group to one.

Zones have detailed control of the way the sample plays, with several parameters affecting pitch and volume. Tuning and transposition offer effects beyond simply fixing intonation. If you create two zones with the same sample and tune one 3 cents high and the other 3 cents low, you get a chorus sound. Transposing will probably affect harmonies, but in some cases a fifth or octave combination will fatten the sound nicely. Pitch changes in playback do not correct duration, so unlooped samples in transposition will not finish at the same time. One of the most important tuning options defeats pitch change, a setting essential for percussion. Some programs offer one-shot or percussion modes—the entire sample is played regardless of note length. Volume control in the zone can include panning and scaling by key number to make the sample louder in the high or low octaves. These produce gradual transitions between overlapping zones.

Mapping can be tedious, so most programs have special features to speed up the process. This can be as simple as assigning a sample to the next available key or as sophisticated as assigning samples to keys by pitch detection. The fastest way to distribute a lot of samples is batch import. This may use auto assignment or file names to define the mapping.
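Stripped of its user interface, an instrument definition is little more than a list of zones, each with a key range, a velocity range, a root pitch, and a group. The sketch below is a plain-Python illustration of that idea only; the field names are invented and do not correspond to EXS24, Kontakt, or any real file format. It includes a velocity switch and the one-voice high-hat group described above.

# A toy instrument definition: zones with key ranges, velocity ranges,
# a root pitch, and group polyphony. All names are invented for illustration.
from dataclasses import dataclass

@dataclass
class Zone:
    sample: str        # file loaded into memory
    lo_key: int        # MIDI key range
    hi_key: int
    root: int          # key that plays the sample untransposed
    lo_vel: int = 1    # velocity range, for velocity switching
    hi_vel: int = 127
    group: str = "main"

instrument = [
    Zone("cello_C3.wav", 48, 59, 48),
    Zone("cello_C4.wav", 60, 71, 60),
    # velocity switch: same key, different samples
    Zone("snare_soft.wav", 38, 38, 38, lo_vel=1,  hi_vel=90,  group="drums"),
    Zone("snare_hard.wav", 38, 38, 38, lo_vel=91, hi_vel=127, group="drums"),
    # "high-hat trick": both hits share a one-voice group
    Zone("hat_closed.wav", 42, 42, 42, group="hat"),
    Zone("hat_open.wav",   46, 46, 46, group="hat"),
]
group_polyphony = {"hat": 1}   # a new hat note chokes the previous one

def zones_for(key, velocity):
    return [z for z in instrument
            if z.lo_key <= key <= z.hi_key and z.lo_vel <= velocity <= z.hi_vel]

print([z.sample for z in zones_for(38, 100)])   # ['snare_hard.wav']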

Processing Samples

The software routine that plays a sample is known as the playback engine (Figure 9.5). This consists of the sample player, a level control, and possibly a filter. These have several parameters that can be modified during the note by various control sources. We have already adjusted each sample in many ways, but the dynamic processing that takes place in the playback engine makes the sounds livelier. Most programs define processing for the entire range of the instrument, but a few allow settings specific to zones. The choice is a tradeoff between the convenience of setting everything at once versus detailed control for each sample. This is actually a good argument for having more than one type of sampler. The changes can be controlled by control envelopes, repeating functions, or external MIDI messages. In sampler terminology these are known as control sources.


FIGURE 9.4 Typical sample mapping. (Zones 1 through 14 laid out across the keyboard; some zones overlap.)

FIGURE 9.5 Playback engine: sample player, filter, and amplifier, with an envelope as a control source.

Control Sources

Up to now, I have used the word envelope to describe the history of a sound's amplitude over time, with attack referring to the beginning, and so forth. A control envelope is a stream of numbers that changes in synchronization with the note messages. The simplest envelope shape is illustrated in Figure 9.6. At note on, the control value starts at zero and rises to maximum over a specified period labeled attack time. The value stays at maximum until note off, then falls back to zero in the release time. A more elaborate envelope is shown in the right half of Figure 9.6. After the maximum is reached, the value falls to an intermediate level. This fall time is called decay, and the level is called sustain. This type of envelope is named for the acronym of its parameters: ADSR.


FIGURE 9.6 Control envelopes: an AR envelope (attack, release) and an ADSR envelope (attack, decay, sustain, release), shown between note on and note off.

Many samplers offer more complex envelope shapes: E-mu products have a six-segment envelope, and in some programs you can draw your own envelopes freehand. No matter how complex the shape, the essence of a control envelope is that it runs its course once per note.

Repeating functions are named LFOs for the low-frequency oscillators that were used to generate them on analog synthesizers. LFO controls operate at a steady rate and come in a variety of function shapes. These are exactly analogous to audible waveforms but are too low in frequency to hear. The most common shapes are sine, triangle, and an on-off function called pulse or rectangle (Figure 9.7). A pulse that is on exactly half the time is labeled square. Occasionally you see a function with random step levels. This shape is often named "sample and hold" after another analog module. The parameters for an LFO are rate and level. Rate may be specified in Hz, seconds, or musical note durations. Since an LFO controls sound indirectly (modifying the parameters of some process that modifies the sound), changing the level of the LFO changes the amount of the effect you get. The most familiar application of this is the modulation wheel found on most keyboard synthesizers. As you turn the wheel, the modulation effect increases. Control can be quite complex because the parameters of control sources can themselves be controlled. An LFO's output may be synchronized with the note start or it may be free running. Free-running LFOs will make each note sound subtly different.

Processes

Volume control is the most common process. The standard connection is the application of velocity to volume, yielding the expected and usually appropriate effect. It's also pretty standard to dedicate an envelope to level control. This determines how quickly the sample turns on, the amplitude at which it plays, and how it turns off. Since the samples usually have an inherent attack, the default envelope turns on immediately. If the control envelope has a slow attack, the sample will fade in, regardless of its initial shape.

FIGURE 9.7 LFO shapes: sine, triangle, and pulse.

The release portion of the envelope determines how the note will fade away. If the note is looped, the release time will be the only factor that controls this. If the note is not looped, or if the sample is allowed to finish when the key is released, the effect will be a combination of the sample's fade and the envelope release. Figure 9.8 shows how the sample and envelope shown in the upper half interact to produce the result shown in the lower half. In all cases, if the envelope release time is zero, the note will end abruptly when a finger comes up. It's important to be clear on the difference in effect of defining a playback envelope and applying an envelope in the sample editor. The latter is fixed—once a sample is modified with a fade-in, it will always fade in. Furthermore, the envelope times that are inherent in the sample will vary with pitch, just as sample duration does. The playback envelope is consistent for all keys but variable by other factors.

Some experimentation will demonstrate just how much effect the amplitude envelope can have on sounds. Release time is probably the most obvious parameter. A short release time chops the note in an organ-like manner. A long release gives the impression of reverberation. Sustain controls the loudness of the held section of the note in an obvious way. The decay has no effect unless the sustain is set to less than full. If the attack is short, a small value for sustain will put a bit of punch on the sound. Turn sustain off to hear the punches by themselves—then extend the decay for a completely percussive effect. When I want this kind of sound, I set the decay and release to the same time. Otherwise, there is an odd effect caused by the fact that release time starts with the note off. The release is just a fade to zero from wherever the envelope is. With a long release, if the key is let up quickly the note may be quite long, but if the key is held longer than the decay time, the note will be short. DVD example 9.3 demonstrates the effects of the envelope parameters. Open it in your editor to see how shape and sound are related.

The sampler gives us our first opportunity to play with dynamic filtering. This can produce some lovely transformations. The most basic filter we encounter has a low-pass characteristic, which will produce an effect often encountered in natural sounds. By connecting velocity to cutoff frequency, hitting the key hard will produce bright sounds while a gentle touch will give muted ones. A more complex effect will be produced by applying an envelope to the filter.


FIGURE 9.8 Interaction of control envelope and sample envelope.

Usually, the filter envelope opens immediately to let the attack of the sample through with no modification. On release the cutoff frequency is lowered, which will fade the high-frequency components, leaving a pure sound at the end of the note. A slower filter attack will let high partials in gradually, a sound often associated with brass instruments. We generally set the filter resonance low for these tricks. High resonance (or "Q") will produce a gliding formant or tone, which is interesting in synthesis but may be a bit odd in naturalistic sampling. DVD example 9.4 demonstrates some filter effects.

We also use filters to simulate the formants typical of acoustic instruments. This requires equalizing the sample in the audio editor to eliminate any natural formant in the sound. It is the pitch shifting of this formant that produces the famous chipmunk effect. Formants can't be entirely removed, but they can be reduced with a graphic or FFT EQ. This process is greatly aided if the editor has a spectrum analysis feature: try to flatten the spectrum as much as possible. With a flat spectrum sample, a band-pass filter provides a formant boost that gives the sound body and character. There's no hard rule for the cutoff frequency; just experiment until you find a good setting. Most samplers only provide simple filters, so we can't construct elaborate multipeak formants in the instrument definition, but we can add this with plug-ins in the DAW.

The following are some other common parameters that can be controlled.

Pan, to spread an instrument between the speakers.

Pitch, for vibrato or modulation; if used with layered sounds, small pitch changes can give a chorus effect.

LFO level; if the LFO controls pitch, this produces classic modulation wheel action.

Portamento, an effect applied to the keyboard—when a melody is played, the pitch glides from note to note at an adjustable rate.


Sample start, which can have some interesting effects on the sounds; for instance, changing the start point on a sample of speech could cause it to skip words.

Sample delay, which holds the note back a bit; this is useful in layered sounds.

Envelope parameters, which can produce enough note-to-note variation to disguise the common sample on which the sounds are based.

Connections

Every sampler has an assortment of control sources to connect to available parameters. The connection schemes vary from brand to brand. The most popular control connections, such as velocity to level, are usually permanently in place with knobs or sliders to turn them on. The variety of control knobs is amazing. Some just turn an effect on; others specify a percentage from -100 to 100 percent. This type of control is called a reversible attenuator—the negative percentage indicates a reversal of the expected effect. If applied to LFO rate, a reversed control would slow the LFO down. Additional source and destination connections are often listed in a table or matrix. This implies a limited number of connections, if for no other reason than to conserve screen space. These connections are often called patch cords, nomenclature from the analog synthesizer era. The destination list will not include every parameter on the instrument. Some connections would not make sense, and others are left off because the designers do not feel they would be useful. In addition to the source and destination, there will be some way to determine the maximum effect of the control. This may be indicated by a text box or an image of a knob.

RHYTHMIC LOOPS

Some samples are more complicated than a single sound. We occasionally record an interesting riff or pattern that we want to use intact. However, the clip is likely to be at the wrong tempo. The process for dealing with this is simple to explain but tedious to carry out. You make a separate sample from each sound in the clip, along with a MIDI score that plays the samples in the original pattern.

Find the original tempo by measuring the duration from the first note of the loop to the first note of the next cycle. (This is why you record two or three times around, three if the first time is a bit rough.) Select the entire cycle, downbeat to downbeat. The selection time can be read from the editor. Divide this time by the number of beats to get milliseconds per beat, then divide the milliseconds per beat into 60,000 to get the tempo. (A worked example of this arithmetic follows at the end of this section.)


Map the notes. Using the milliseconds-per-beat value, list the start times for each note. You may have to convert this into your sequencer's beat time format. Set your sequencer to match the loop tempo and enter a scale of notes in that rhythm.

Slice up the clip. Isolate each note and save it in a file. Use the key number from the map in the file name.

Create a sampler instrument. Now import the files into the proper key mapping in the sampler. Once you've done this, the pattern can be played at any tempo you like.

There are programs that can automate this for you. ReCycle, for one, automatically detects the note starts (if they are distinct percussion-type hits), slices the file into samples, and creates the sequence. This can be saved in a proprietary format or a standard sample format.
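As the promised worked example of the tempo arithmetic, assume a measured cycle of 2.105 seconds containing four beats. The numbers are made up for illustration, and the Python below is just the calculation, not a slicing tool.

# Tempo of a sliced loop: duration of one full cycle, downbeat to downbeat.
loop_seconds = 2.105        # measured selection, one 4-beat cycle
beats = 4

ms_per_beat = loop_seconds * 1000 / beats
tempo_bpm = 60000 / ms_per_beat
print(f"{ms_per_beat:.1f} ms per beat, {tempo_bpm:.1f} BPM")
# 526.2 ms per beat, 114.0 BPM

# Start time of each measured hit, converted to beats for the sequencer map
note_starts_ms = [0, 263, 526, 789, 1052, 1320, 1578]
for ms in note_starts_ms:
    print(f"{ms:5d} ms -> beat {ms / ms_per_beat:.2f}")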

THE SAMPLING AND COPYRIGHT LECTURE

At some point in your sampling explorations, you may come up with the idea of sampling a published recording. We hear great things all the time, and occasionally something turns up that seems to fit perfectly into a current project. Unfortunately, anything from a record store or the Internet probably belongs to someone. Copyright law is pretty clear on the question—you must get permission from the copyright owner and pay a fee to use the material. The fee may be something like nine cents per distributed copy or a good deal more. You must pay the fee whether or not you make any money from your composition. My recommendation is that you do something original that other people will want to sample and let them pay fees to you.

EXERCISES

This chapter is mostly about instrument design. We study samplers with sequencers because they bridge the areas of musique concrète and synthesis. As our editing of sound clips gets finer and finer, the sampler becomes the most efficient tool for organizing them. The following exercises pose problems that samplers can solve.

1. Record a short speech from the television news. Assign each word to a MIDI key (with no pitch change) and produce a piece in the sequencer that changes the meaning of the speech in various ways.


2. Record someone singing a haiku on a constant pitch. Make each syllable a sample with velocity mapped to pitch and produce a tune. Velocity to pitch is usually restricted to an octave, so you will need three or four transposed zones per syllable. Hint: Use the edit list to set velocities.

3. Create a set of samples from the kitchen items recorded for chapter 3. Use them to compose a rhythmic dance piece.

4. Record the sound of power tools at work, convert these to samples, and create some holiday music. The album Toolbox Christmas by Woody Phillips (Gourd Music 1996) has excellent examples of this.

RESOURCES FOR FURTHER STUDY

Russ, Martin. 2008. Sound Synthesis and Sampling, 3rd ed. New York: Taylor & Francis.


Part 4

Synthesis

Many devices and programs generate audio signals. These offer the possibility of producing sounds never heard in nature and of the composer discovering something new. The creation of sounds with electronic circuitry is now called synthesis, and an instrument that uses electronic means to generate musical sound is called a synthesizer. (In colloquial usage, synthesizer is often shortened to "synth," which may eventually be the preferred term.) Many composers consider synthesis to be the essence of electroacoustic music. When we can create sounds from scratch we break free of the boundaries of traditional sounds and expectations. Admittedly, the sounds of synthesized music seem to become traditional and even trite as soon as a new instrument hits the marketplace, but there is still plenty of room for innovation, not just at the frontiers of novel sound but in the spaces between the standards. We may relish the opportunity to be outrageous, but the real quest is for the subtle—sounds that are sensitive, eloquent, and expressive. This requires a special combination of technical and artistic skills—the ability to use the arsenal of synthesis instruments available and the taste to use them in the service of our music.

METHODOLOGIES

Musicians and engineers have explored many approaches to synthesis, and I daresay more will be invented soon. The most common approach is subtractive synthesis, in which filters are used to select an interesting spectrum out of a complex waveform. This technique got a head start on the others because it is relatively easy to implement in analog electronics. Chapters 10 and 11 explore subtractive synthesis in depth. Frequency modulation or FM was the first form of digital synthesis to become popular. This technique relies on the carefully controlled interaction of two oscillators. FM is explored in detail in chapter 12.

Chapter 13 will explore programs based on techniques that are just moving from the research lab to composers' desks. These include additive synthesis, granular synthesis, and modeling. Additive synthesis provides independent control of each partial of a sound. This requires an oscillator and amplifier for each partial, as well as detailed envelopes to control pitch and amplitude.


In most applications, this information is derived from analysis of actual sound and stored in a wideband format. When this technique is used, the process is often called resynthesis. Most resynthesis systems provide a method for interpolating between two or more analyses to produce spectral morphing, resulting in timbres intermediate between both sources.

Granular synthesis involves a different type of analysis. A wave is broken down into a series of very short snapshots or grains, similar to the way a motion picture captures action as a series of stills. The grains can be played back with no noticeable loss of fidelity. They can also be played back with different pitches, different durations, or even in a different order. This is the technique at the heart of many audio time-stretch features.

As computers grow in power, the notion of modeling becomes more attractive. Modeling involves using the computer to calculate the moment-by-moment effects of the sound-producing system, solving the relevant equations on the fly. A system is modeled by analyzing its parts and their interaction. For instance, a flute model includes the lips, embouchure plate, and tube. The program computes the resonance of the tube with length specified by the input pitch and breath specified by velocity. Voicing the model involves specifying physical parameters such as tube diameter and sharpness of the embouchure plate. Such models recreate artifacts of articulation and breath variability that are missed by sampling techniques.

DESIGNS

Synthesizers can be physical instruments or virtual software applications. Dedicated hardware has historically performed best on stage. The latency inherent in computer systems is barely noticeable in the best situations, but it is still there, and few musicians feel their computers are dependable enough to take out on the road. The main exceptions to this attitude, laptop musicians, are not usually playing their instruments directly in the sense of push a key, get a note. A live performance of traditional electroacoustic music will generally be built around a keyboard synthesizer, with possibly a couple of extra tone generators in the rack. There are not as many big hardware manufacturers as there once were, but several boutique companies are producing exciting products for the discerning musician. Some of these represent the reappearance of famous names such as Dave Smith and Robert Moog. Hardware synthesizers are covered in chapter 17.

Computer-based synthesis reigns in studio production. High-end desktop computers work splendidly in a sequencing environment, and their tantrums can be tolerated when no curtain time is looming. Programs are much less expensive than dedicated boxes, so any composer can afford at least some version of every technology available. A wide choice of USB controllers makes the studio situation very comfortable.


I no longer balance my computer keyboard on top of the performance keyboard—I have a mini keyboard with lots of sliders sitting on the desk to use when tweaking sounds and a big keyboard off to the side for laying down tracks in the sequencer.

There are two forms of computer-based synthesis. Most production work is done with synthesis applications, programs that are designed to present composers with layouts and paradigms in familiar musical forms. These are packaged instruments with a limited set of functions and abilities. They may be difficult to learn, but they do not require knowledge of underlying computer mechanisms, acoustic principles, or mathematics. This knowledge may be helpful, but the programs are perfectly usable without it. These applications come supplied with enough presets and default settings to be put to work right out of the box. Research synthesis, on the other hand, is usually carried out with tools that look a lot like programming languages. There are few if any defaults, and the only things resembling presets are examples and files produced by other users. There are no guarantees that a particular batch of code will make sound or will not produce a speaker-shattering blast of noise. There are also few limits on what can be done, so this is where the synthesis techniques of the future are born. We will look at such tools starting in chapter 14.


TEN

Fundamentals of Synthesis

The modern synthesizer is the product of three different lineages, an amalgam of analog circuit design, computer programming, and traditional musical instruments. The circuitry and programming provide the underlying machinery, and musical traditions define the forms and capabilities of the instruments. The evolution of synthesizers has been an unsteady process that is still going on. One day there will probably be a well-defined instrument that musicians will study the way some students study piano, but that day is still in the future. Right now, the term synthesizer not only includes many different physical devices, but even more virtual instruments that exist only as applications on a computer. All of these are being constantly revised and replaced by their manufacturers, with radically new designs appearing every month. In the midst of such turmoil, it is not practical to focus on the nuances of a single synth for a course or a book, because the instrument will certainly be obsolete by the time the course ends or the book comes out. For these reasons, this chapter will explain synthesis using modular types of synthesizers even though relatively few musicians actually use them. Because it is not rigidly defined, a modular synthesizer makes it easy to isolate and explore fundamental principles. Once these principles are mastered, a composer can quickly learn new machines and programs as they appear. (Besides, modular analog synthesizers are still available and surprisingly affordable.)

THE MODULAR LEGACY

A modular synthesizer is a collection of independent units that generate or modify audio signals. Usually, these are mounted together in a large case and share some amenities such as a power supply but are not otherwise interconnected. Several modules must be hooked together before any sound is produced. The short cables that make the connections are called patch cords, the process of making the connections is called patching, and the configuration of modules used for a particular sound is called a patch.


Modular synthesizers were first invented in the 1950s and 1960s, reached their peak of power and popularity in the 1970s, and were supplanted in the 1980s by more streamlined digital instruments. As with all things electronic, the most obvious aspect of this evolution was size. The first experimental machines filled a room, the early commercial modular synthesizers covered a large section of wall, and the latest instruments fit into a suitcase. There continues to be interest in the big beasts, with a few cottage manufacturers offering new versions and classic systems selling for startling prices. The romance endures, but old-timers remember modular synthesizers as being limited and undependable in actual use. The feature most often mentioned by nostalgic composers is flexibility. Whatever modules were available could be patched in an astonishing variety of ways. Of course, many configurations didn't actually do much, but it was comforting to know the possibilities weren't limited by a manufacturer's decision.

Three decades after they faded from the scene, the legacy of the first-generation modular synthesizers lives on. Nearly every synthesizer available is based on the concepts established by the modular design and has an architecture derived from the most useful patches. The manuals and panels of synthesizers are labeled in terms of oscillators, LFOs, and amplifiers, using terminology invented for the first generation of synthesizers by Robert Moog, Don Buchla, and Alan R. Pearlman. Furthermore, modular synthesis is again available in virtual form. There is a wide range of programs that model classic machines and others that expand on the basic concepts in new and exciting ways. The examples in this chapter are taken from Tassman (Applied Acoustic Systems), an inexpensive and powerful modular synthesis program. Tassman can be used as a stand-alone program with MIDI input, or it can be accessed as a plug-in instrument in a MIDI sequencer.

THE PATCH

Synthesis begins with the design of a patch. This is a particular set of modules connected in a specific way. The connections provide pathways for audio, control, and timing signals. Once a patch is set up, its capabilities can be explored by adjusting parameters and playing sample passages. The details for building patches in virtual synthesizers vary from program to program, but most provide two modes or views: one for assembling the components of a patch and one for performance. In Tassman, the patch assembly view is called the builder and is illustrated in Figure 10.1. In the builder, modules are dragged from a library on the left and connected by clicking output and input points on each module. This displays the patch in a flowchart similar to those found in synthesis textbooks. There's not a lot of detailed information in this display, but you can open an inspector that shows initial settings for some parameters and display features. The patch makes sound when you shift to the player, a view that provides virtual knobs for all changeable parameters (Figure 10.2).

FIGURE 10.1 The Tassman patch builder window.

FIGURE 10.2 The Tassman player window.

These knobs are a bit awkward to manipulate with a mouse, but they are easily matched with external MIDI controls.

Schemes for interconnecting hardware modules are varied and ingenious. The most popular system uses short audio cables with a standard or miniature phone plug on each end. There is also some use of a type of connector taken from test instruments known as a banana plug; these have the advantage that they can be stacked up in the same jack, so one connection can easily lead to several inputs. Phone plug systems require a liberal supply of multiples to allow splitting of signals and controls. (Modular synthesizers share the common electronics rule against connecting outputs together.) Other interconnection schemes range from complex switch banks to matrix pin boards.

MODULES

Every synthesizer, virtual or physical, provides certain basic modules plus some that are unique to the manufacturer. Details and specific features will vary, sometimes quite a lot, but modules can be classified by fundamental function into one of three types:


Generators produce audio frequency signals that ultimately will be heard.

Processors modify audio signals in some way.

Controllers produce relatively slowly changing voltages that control the operation of other modules. The concept of a voltage controlling the parameters of the sound is at the heart of analog synthesis. Controllers may also provide timing signals that trigger the operation of other modules.

Oscillators

Oscillators are the fundamental signal generators. The oscillator circuit produces a constant signal at a specified frequency, as determined by knob settings and external control voltages. In the most common design, a change of one volt applied to an oscillator would raise the pitch by one octave. Even in virtual form, where all data is represented as numbers, an oscillator is usually called a VCO, an acronym for voltage-controlled oscillator.

FIGURE 10.3 Oscillator waveforms: sawtooth, triangle, sine, and pulse.

In the early models, voltage control of pitch was not accurate over more than three or four octaves. To improve intonation, VCOs had range switches that included a low-frequency setting for control applications. An oscillator at subaudible frequency will provide a cyclic voltage change that is suitable for producing vibrato effects. This has enough uses that dedicated low-frequency oscillators (LFOs) are included in most instruments.

The standard oscillator circuit produces a sawtooth waveform, which is a ramp from the lowest voltage to the highest voltage available. Most oscillators include components that convert the sawtooth into triangle, sine, and pulse shapes. These basic waves are illustrated in Figure 10.3 and heard on DVD example 10.1. They are mathematical abstractions, easily produced by a computer, but for various reasons not quite attainable by analog circuitry. DVD example 10.2 shows the oscilloscope display and spectrum of these waveforms. The sawtooth is a very rich sound, with plenty of meat for filtering. The triangle is a simpler, more pleasant tone which is often used as is. The sine is the pure tone and is usually used in combinations. The sine tone in this example was digitally produced. The sine output of hardware synthesizers varies significantly from the ideal, and modules on the same machine will differ from each other. The sound of the oscillators from each manufacturer is unique enough that experienced electroacoustic musicians can identify the synthesizer brand and model just by listening. Figure 10.4 shows the sine wave output found on some classic machines. Composers of the day complained about the lack of purity, but once digital instruments made nearly perfect waves available, there were as many complaints about the loss of interest and warmth.

The pulse wave is the only output with any adjustment of timbre. The waveform has a pulse width or duty cycle which is defined as the ratio of the time the voltage is high to the overall period. Figure 10.5 shows pulse waves of various duty cycles. The spectrum of this wave is quite rich, with one interesting feature: certain harmonics are missing according to the duty cycle. If the duty cycle is 1:2 (a square wave), harmonic numbers divisible by two are missing, with only the odd harmonics heard. That is characteristic of any symmetrical waveform and also of certain stopped pipe instruments such as the clarinet, so many describe this timbre as clarinet-like. If the duty cycle of a pulse wave is 1:3, harmonic numbers divisible by three are missing; 1:5 eliminates every fifth, and so on.

FIGURE 10.4 Sine waves from analog synthesizers, defects exaggerated for visibility: Moog 901, clipped triangle (ARP), diode ladder, and asymmetric (Putney).

FIGURE 10.5 Pulse width or duty cycle: 1:2, 1:3, and 1:5.

The in-between duty cycles are going to have almost all harmonics; for example, 2:5 will only lose every fifth. When the duty cycle is continuously varied, we hear a rich sound that clarifies when the simpler duty cycles are reached, as demonstrated on DVD example 10.3. DVD example 10.4 shows how the spectrum changes with the duty cycle, and Figure 10.6 is a snapshot of the duty cycle 5:1.

Like the square wave, a triangle wave is limited to odd-numbered harmonics, but they are much weaker. The strength of each harmonic is the inverse square of its number. The 3rd harmonic is one-ninth the strength of the fundamental, the 5th harmonic is one-twenty-fifth, and so on. Harmonic 11 is basically inaudible at -42 dB. The sawtooth has harmonics at all integers, with the strength of each harmonic equal to the inverse of its number. As we work with a variety of artificial waveforms, we will observe that asymmetry produces even harmonics and corners produce high ones. The spectrum of a triangle wave is shown in Figure 10.7.
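The missing-harmonic rule and the strengths quoted above follow from the Fourier series of the ideal waveforms. The sketch below is a small calculation of those relative levels, assuming perfect geometric waves; real analog oscillators will deviate from these figures.

# Relative harmonic strength, in dB below the fundamental, for ideal waves.
import math

def pulse_harmonic(n, duty):                 # duty = high time / period, e.g. 0.5
    s = math.sin(math.pi * n * duty)
    return abs(s) / n if abs(s) > 1e-9 else 0.0   # zero when n * duty is an integer

def saw_harmonic(n):
    return 1.0 / n

def triangle_harmonic(n):
    return 1.0 / n**2 if n % 2 else 0.0      # odd harmonics only

def dB(x, ref):
    return 20 * math.log10(x / ref) if x else float("-inf")

duty = 0.5                                   # 1:2 square wave
for n in range(1, 9):
    print(n, round(dB(pulse_harmonic(n, duty), pulse_harmonic(1, duty)), 1))
# even harmonics print -inf: they are missing from the square wave

print(round(dB(saw_harmonic(15), saw_harmonic(1)), 1))        # -23.5, sawtooth 15th harmonic
print(round(dB(triangle_harmonic(3), triangle_harmonic(1)), 1))   # -19.1, one-ninth the amplitude
print(round(dB(triangle_harmonic(11), triangle_harmonic(1)), 1))  # -41.7, the "inaudible" 11th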

FIGURE 10.6 The spectrum of a 5:1 pulse wave, showing the missing components.

FIGURE 10.7 The spectrum of a triangle wave.

The anatomy of an analog oscillator is well beyond the scope of this book or the composer's need to know, but it is a good idea to explore the basic technique of digital synthesis. Doing so will prepare us for some surprises that turn up in actual practice. Most programs generate tones using wavetable lookup, an efficient way to create a digital version of any geometric waveform. The wavetable is a block of memory that holds a record of one cycle of the wave. The size of the table is some power of two; 1,024 or 4,096 are typical dimensions (the bigger the table, the better the fidelity, up to a point). Figure 10.8 shows a wavetable representing a sine function. Each location in the table contains a number that is the amplitude of the wave at that point. On every tick of the sample clock the routine fetches a number out of the table. If the number is taken from successive locations, the table will be read through in a time equal to the sample period times the size of the table.


FIGURE 10.8 Wavetable lookup. (The height of each line represents a value in the table; the values marked at each sampling increment are sent to the output.)

When the end of the table is reached, the routine starts over, so the wave is read at a frequency equal to the sample rate divided by the table size. With a 1,024-point wavetable and a sample rate of 44.1 kHz, the frequency is a bit over 43 Hz. (Notice how similar this is to looping in a sampler.) The location to be read next is kept in a value called the phase accumulator. To double the frequency, 2.0 is added to the accumulator after each read. Then the routine reads every other point on the wavetable. If a number between 1.0 and 2.0 is added to the accumulator each time, the pitch change is something less than an octave. Any frequency can be produced this way, although the accumulator will be pointing between locations most of the time. This can be resolved in several ways. A quick and dirty system will simply read values twice or skip some. A better approach is to interpolate between the values in the table using the fractional part of the phase accumulator. An even better system does a spline-based interpolation, a trick you see when you draw a fancy curve in a graphics program.

A major difference between analog and digital oscillators is that the latter are subject to aliasing. Aliasing, as you will remember, happens when the system attempts to record a signal higher in frequency than half the sample rate. It turns out that aliasing also occurs if an oscillator attempts to calculate a signal higher than half the sample rate. Such pitches are well above anything that can be notated, but they turn up as harmonics in the richer waveforms. A sawtooth wave has significant power up to harmonic 99 or more. If the fundamental is 2 kHz, the 15th harmonic is 30 kHz, well above the Nyquist limit for a 44.1 kHz system.


Aliased harmonics become components at the sampling rate minus the harmonic frequency, producing a spectrum folded around the Nyquist frequency. That 15th harmonic is going to produce an alias at 14.1 kHz with a strength of -23 dB. This will simply add some edge to a rich sound, but consider how these aliases will interact in a melody. High C has a fundamental of 1,046 Hz. The frequency of the aliased 41st harmonic will be 1,214 Hz, a flat E-flat. The 42nd harmonic will produce a tone at 168 Hz, and the 43rd will reflect again off 0 Hz and yield a tone at 878 Hz. A whole step higher (1,174 Hz) will produce 662, 512, and 1,686 Hz. These partials are lower than the fundamental and jump unpredictably from pitch to pitch as you can hear in DVD example 10.5.
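A bare-bones version of the wavetable lookup just described, with a phase accumulator and linear interpolation, might look like the following Python sketch. The table size and names are illustrative, and no attempt is made to suppress the aliasing discussed above.

# Minimal wavetable oscillator: phase accumulator plus linear interpolation.
import math

SR = 44100
TABLE_SIZE = 1024
sine_table = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def oscillator(table, freq_hz, n_samples, sample_rate=SR):
    out = []
    phase = 0.0
    increment = freq_hz * len(table) / sample_rate   # table steps per sample tick
    for _ in range(n_samples):
        i = int(phase)                     # integer part: table index
        frac = phase - i                   # fractional part: interpolate
        a = table[i]
        b = table[(i + 1) % len(table)]
        out.append(a + (b - a) * frac)
        phase = (phase + increment) % len(table)     # wrap at the end of the table
    return out

samples = oscillator(sine_table, 440.0, SR)   # one second of A440

With an increment of exactly 1.0 the table is read straight through, giving the roughly 43 Hz figure mentioned earlier (44,100 / 1,024 is about 43.07 Hz).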

Noise Generators

Noise is an important constituent of many sounds. Noise generators usually produce evenly distributed or white noise and often have a second output producing pink noise. Occasionally you will see a low-frequency noise suitable for providing random control voltage variations. Noise generators need no controls, and since the distinguishing aspects of a sound come from processing the noise, only one noise source is needed.

Noise generation is one function that analog circuits do better than computers. Analog noise is a physical phenomenon inherent in all circuits. It's caused by subatomic processes, and a lot of effort is expended to reduce it in most devices. An analog noise generator is nothing more complicated than an amplifier with some deliberate mistakes built in. Since computers deal only with numbers, noise must be computed, which is something of an oxymoron. The best a program can do is use a pseudo-random number generator, which generates a stream of values that is statistically random, but may have audibly repeating cycles. This was painfully obvious with early digital instruments, but is handled better with 32-bit math processors.

Amplifiers

The voltage-controlled amplifier or VCA has the critical job of turning the sound off. More accurately, connecting the constant signal from the oscillator to a VCA will block any sound until a control voltage is applied to the amplifier. Carefully shaped control voltages will apply an amplitude envelope to the signal, establishing many of the sound's primary characteristics. Most amplifier modules have a knob to set initial gain.

A second control often found on amplifiers is a switch to choose exponential or linear response. In the linear setting any control input increases the amplitude in direct proportion to the control. That is, if the maximum control is 10 volts and produces unity gain, 5 volts will produce a gain of 0.5. In exponential mode, the gain change can be expressed in dB per volt. The two settings have a subtly different sound and are appropriate to certain uses.


For instance, exponential mode is usually appropriate when turning sounds off, but linear is required when panning from left to right. The digital equivalent of an amplifier is simply multiplication of the signal values by the gain. This produces perfect fidelity unless the clipping limit is exceeded. That is easy to do, but clipping is easy to hear and to fix. Analog amplifiers are not as perfect because the changing control voltage interacts with the signal and produces distortion. This is another subtle effect that gives hardware instruments a distinct sound.
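The difference between the two laws is easy to see in numbers. A minimal sketch, assuming a 0-10 volt control, unity gain at full scale, and an arbitrary 6 dB per volt for the exponential curve:

# Linear versus exponential (dB-per-volt) amplifier response.
def gain_linear(volts, full_scale=10.0):
    return volts / full_scale                 # 5 V -> gain 0.5

def gain_exponential(volts, db_per_volt=6.0, full_scale=10.0):
    db_below_max = (full_scale - volts) * db_per_volt
    return 10 ** (-db_below_max / 20)         # 5 V -> 30 dB down, gain about 0.03

for v in (10.0, 7.5, 5.0, 2.5, 0.0):
    print(v, round(gain_linear(v), 3), round(gain_exponential(v), 4))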

Filters

The waveforms produced by the oscillator are static and unlikely to be interesting for long. Varied and dynamically changing tones are usually created by passing the oscillator signal through one or more filter modules. The filters in a synthesizer are somewhat different from those discussed in chapter 5. Filter modules are usually a single circuit with sharp cutoff characteristics. Whereas equalizers generally have a slope of 6 dB or 12 dB per octave, a slope of 24 dB per octave or even steeper is required for synthesis. These filters are used to reduce or eliminate entirely some components of the oscillator output, the defining technique of subtractive synthesis. The main parameter of a voltage-controlled filter (VCF) is the cutoff frequency, which is preset with a knob and changed by a control input. Many designs also feature variable resonance or Q, which emphasizes the signal near the cutoff frequency and makes the cutoff slope somewhat steeper. Some filters will begin to oscillate if the resonance is turned all the way up, producing a pure sine wave. Other designs have no oscillation at high resonance settings but will ring at the cutoff frequency when a pulse is applied to the audio input. This produces a pop like a woodblock or clave.

The most common synthesizer filter is low-pass. These are used to reduce the higher partials of a rich waveform produced by the oscillator. We will see later how the cutoff frequency is manipulated to modify or even define notes. The high-pass design is less common but can produce interesting sounds by reducing the fundamental and low harmonics of the signal. This results in squeaks and similar sounds. Low-pass and high-pass responses are shown in Figure 10.9. Low-pass and high-pass circuits are often combined in a single module. In hardware designs, the two sections may be used independently or together for band-pass or notch functions. A popular circuit known as a state-variable filter produces all four functions simultaneously.

Filter design is a complex art. A wide range of adjustability and steep cutoff are required, which means some characteristics such as phase response and flatness of the passband will be less than ideal. This is a major factor in the distinctive sound of certain brands of synthesizer. In fact, some classic instruments included several filters of different designs to expand the range of sounds available. Digital filters are also complex, but here designs can be precise and can include response curves like the one called brick wall that are impossible with analog circuits.

FIGURE 10.9 Filter response curves: low-pass and high-pass, each marked with its cutoff frequency (frequency axis 15 Hz to 16 kHz).

This shape requires a circuit with the ability to predict the future, which a digital circuit can do by slightly delaying the signal. Digital filters can be designed using purely mathematical means, or they can model the operation of analog filters on a component-by-component basis.
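Digital filter design is a large subject, but the state-variable idea is compact enough to sketch. The following is the well-known Chamberlin digital form, which produces low-pass, band-pass, high-pass, and notch outputs at once; it is a rough illustration, only stable when the cutoff is well below the Nyquist frequency, and it is not the circuit of any particular synthesizer.

# Digital state-variable filter (Chamberlin form): four outputs at once.
import math
import random

class StateVariableFilter:
    def __init__(self, cutoff_hz, q, sample_rate=44100):
        self.f = 2 * math.sin(math.pi * cutoff_hz / sample_rate)  # tuning coefficient
        self.damp = 1.0 / q          # low damping means high resonance
        self.low = 0.0
        self.band = 0.0

    def process(self, x):
        self.low += self.f * self.band
        high = x - self.low - self.damp * self.band
        self.band += self.f * high
        notch = high + self.low
        return self.low, self.band, high, notch

svf = StateVariableFilter(cutoff_hz=1000.0, q=2.0)
noise = [random.uniform(-1.0, 1.0) for _ in range(44100)]
lowpass = [svf.process(x)[0] for x in noise]   # low-pass output of filtered noise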

Function Generators

A function generator (also known as a transient generator) produces a changing control voltage when it is triggered by a timing signal. After the operation is complete, the function generator stops until it is triggered again. The trigger input may be a momentary pulse or a longer one (a gate) which determines the duration of the function. The simplest function is a ramp from zero volts up to a preset level followed by a ramp down. The time it takes to ramp up is set by a control that is usually labeled attack. The falling ramp is likewise set by a release control. These names are taken from the most common use of the device, which is to control an amplifier to impose an amplitude envelope on a signal.


This is the attack-release envelope generator discussed in chapter 9. A more elaborate device provides a four-segment function. This is the ADSR envelope generator, with the phases attack, decay, sustain, and release. The ADSR is initiated by a gate signal going positive. The attack phase rises to the peak voltage in the prescribed time, then the voltage falls to an adjustable sustain level in the time defined by the decay control. When the gate falls to zero, the release ramp is initiated. Even more complex function generators include delays and peak hold times. The shape of the function may be controlled as well as preset, often with the sustain level linked to key velocity and release linked to pitch.

ADSR envelopes may be controlled by a trigger signal as well as a gate. A trigger without a gate would produce a percussive envelope defined by the attack and release times. A simultaneous trigger and gate will produce the complete function. If a second trigger is received while the envelope is in the sustain phase, the attack and decay would happen again. Digital envelopes are usually controlled by the MIDI messages note on and note off, which behave as gates. Figure 10.10 shows these functions and their relationships to gate and trigger signals.

A historical note: Moog transient generators were not originally labeled ADSR. The parameters on the Moog 911 envelope generator were called T-1, T-2, E-sus, and T-3. The more descriptive term ADSR was first used on ARP synthesizers and caught on quickly. Some texts and manuals refer to transient generators as "ADSRs," even if they produce more complex envelopes.
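When the gate length is known in advance, the ADSR shape reduces to a few straight-line segments. A minimal sketch follows (times in seconds, output from 0.0 to 1.0); a real implementation runs sample by sample and responds to the gate as it arrives, but the release-from-wherever-the-envelope-is behavior is the same.

# Straight-line ADSR envelope for a note whose gate length is known.
def adsr(attack, decay, sustain, release, gate_time, sample_rate=44100):
    out = []
    a, d, r = (max(1, int(t * sample_rate)) for t in (attack, decay, release))
    gate = int(gate_time * sample_rate)
    level = 0.0
    release_from = None
    for i in range(gate + r):
        if i >= gate:                              # release: fade from wherever we were
            if release_from is None:
                release_from = level
            level = release_from * (1.0 - (i - gate) / r)
        elif i < a:                                # attack: ramp 0 -> 1
            level = i / a
        elif i < a + d:                            # decay: ramp 1 -> sustain
            level = 1.0 - (1.0 - sustain) * (i - a) / d
        else:                                      # sustain: hold until note off
            level = sustain
        out.append(level)
    return out

env = adsr(attack=0.01, decay=0.1, sustain=0.6, release=0.3, gate_time=1.0)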

Low-frequency Oscillator

The low-frequency oscillator (LFO) is similar to the signal oscillator but usually built with a more economical design. The distinguishing feature is that it runs at subaudible frequency—in fact we usually speak of period, the number of seconds per cycle, rather than Hz. LFOs typically have fewer waveforms than the audio oscillators and only occasionally have voltage control. The traditional application of an LFO is to slowly modulate some parameter (such as pitch) for vibrato effects. A second important use is to trigger a repeating action with a pulse output.

Sample and Hold

The sample and hold module is an analog memory. It has a voltage input and a trigger input. When the device receives a trigger, it captures the current input voltage and holds that value at the output. This simple paradigm can produce a fascinating variety of control patterns. For instance, if the input is a noise waveform, the output will be random steps. This particular use was common enough to become a stereotype, and many modern synthesizers use the term sample and hold to mean random steps.


FIGURE 10.10 Gates and envelopes (the AR and ADSR envelopes, with their attack, decay, sustain, and release phases aligned against the gate and trigger signals).

Mixers

A signal or control can be taken from one source to many destinations. This is necessary for parallel processing of audio or synchronized actions of controls. In hardware systems this requires a multiple, which is nothing more complicated than three or four jacks wired together. In software such as Tassman, several patch cords are simply drawn from the same output. The opposite always requires special treatment. Multiple signals can only be combined in a mixer, a circuit or code routine that adds signals together. In many cases this is included in a module providing multiple inputs, but extra mixers are always handy in complex patches.

Control Processors

The output of function generators and LFOs often must be modified for tasteful control. The circuits that provide these modifications are usually built into the destination modules, but sometimes a few special functions are grouped in a separate panel. These circuits are simple, and their functions are easily described.

Attenuator. An attenuator reduces the amount of control voltage. These are usually built into oscillators and filters on extra (mixed) control inputs. The familiar modulation wheel is an attenuator on the control between an LFO and the pitch of a VCO. A VCA can be used as a voltage-controlled attenuator to produce control effects that vary during a note.

Inverter. This flips the sense of the control function. A ramp that changes from 0 to 5 volts becomes a ramp from 0 to -5 volts. A common use for this sort of thing is panning: the same signal is applied to two VCAs, one of which is initially turned on. The unprocessed control turns one amplifier on while the inverted control turns the other off.


Reversible attenuator. A few manufacturers combined inversion with level control to provide a reversible attenuator. This is a knob in which the center position is off: turning to the right increases the output, and turning to the left provides an inverted output.

Slew limiter. This limits the rate at which a control can change, which converts steps into smooth transitions. The most common use is on a keyboard output to provide a portamento effect (a sketch of the idea follows this list).

Constant voltage source or constant signal. Some patches require a constant voltage source to set the initial value of a parameter that does not have a dedicated control. In hardware systems this is provided by an attenuator connected to the power supply. The digital equivalent is the constant signal, in which all values are the same.
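Here is a minimal sketch of the slew limiter, assuming a simple per-sample rate clamp in Python with NumPy; the rate and the note voltages are arbitrary examples.

# Minimal sketch of a slew limiter: the output may rise or fall by at most
# max_rate units per second, so stepped keyboard voltages become glides.
import numpy as np

def slew_limit(control, max_rate, fs=44100):
    out = np.empty_like(control)
    step = max_rate / fs                  # largest allowed change per sample
    value = control[0]
    for n, target in enumerate(control):
        value += np.clip(target - value, -step, step)
        out[n] = value
    return out

# Example: a keyboard "voltage" that jumps between two notes
fs = 44100
keyboard = np.concatenate([np.full(fs // 2, 2.0), np.full(fs // 2, 2.5)])
glide = slew_limit(keyboard, max_rate=1.0, fs=fs)   # half a second to glide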

PERFORMANCE INTERFACES

The devices used to interact with a synthesizer are nearly as important as the modules that produce the sounds. The interface determines how a performance can go, much as a handlebar design influences a bicycle. The list of possible interfaces is expanding rapidly as many inventors and researchers explore new modes of performance (some of these are covered in chapter 18). The traditional controllers associated with synthesizers are keyboards, ribbon controllers, joysticks, and step sequencers.

Keyboard

The most common control device is an organ-style keyboard with the traditional white and black keys. In the analog machines the keyboard produces a voltage determined by the last key pressed. It also produces trigger pulses when a key is pressed and a gate signal that lasts for as long as a key is held down. With these three outputs the keyboard controls one voice, determining pitch and rhythm. Modern MIDI-style keyboards are able to provide full polyphony. A hardware synthesizer has to be fairly large to support more than two or three voices, but this is simple for computer emulations.

The analog keyboard usually provided one volt per octave to match the response of the oscillators, but this tuning was not mandatory (in fact, it was rather difficult to achieve with any accuracy). Most keyboards were provided with a control that adjusted the total range of the keys, enabling microtonal scales. A few keyboards had individually adjustable keys to provide custom temperaments.


There was often a transpose control, although with the base frequency of the oscillators adjustable via knobs, there was no guaranteed key-to-pitch relationship in any case. Many performers became adept at manipulating this knob to provide pitch expression as they played solo lines. Eventually this developed into the spring-loaded control now called pitch bend. A third control on the keyboard enabled a portamento or glissando using a slew rate limiter to slow the rate of voltage change.

In some brands of synthesizer, notably Buchla, mechanical keys were replaced by touch plates. These were often laid out in patterns different from the organ template, which encouraged alternate approaches to performance. Touch plates were a bit finicky to play and did not catch on with many musicians. We are now starting to see touch screens as performance interfaces, and it will be interesting to see what that leads to.

Ribbon Controller

One of the first keyboard controllers, found on the ondes Martenot (1928), was based on a resistive ribbon contacted by the keys to determine the pitch. The performer could also contact the ribbon directly with a sliding ring and produce glides across the entire range of the instrument. Touching the ribbon carefully would give string-like flexibility and vibrato. Ribbon controllers were available with Moog and other machines, but the MIDI versions currently offered do not afford the same flamboyance and delicacy.

Joystick

A joystick is another exotic interface often seen on modular machines. This was named after the control stick used to steer most aircraft prior to the 1950s. For the synthesizer performer, it gave one-handed control of two parameters, sometimes with a button to trigger sounds. Joysticks have made a big comeback as an accessory to video games and many musicians have adapted them (and other game controllers) for performance.

Step Sequencers

The most elaborate modular synthesizers had one or two control voltage sequencers. This was a large panel with a grid of knobs, typically three or four rows by eight to sixteen columns. At the top of each column was a light. When the light was lit, the knobs in the column would set a voltage available at an associated jack. The columns could be successively activated by a trigger or a voltage ramp, so a low-frequency oscillator could set the device stepping along, playing a short tune. When the end column was reached, the process could stop or continue in an endless cycle.


This device encouraged a rather simple kind of repetition and a lot of bad synthesizer music, but when used in combination with other sequencers or processes it could generate some fascinating and complex patterns. This kind of evolving pattern is probably the most recognizable aspect of analog-style synthesized music. Some manufacturers combined sequencers with keyboards. This allowed the set of keys to control three or four parameters, and the sequencing and keys could interact in interesting ways.
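A rough sketch of a single sequencer row stepping in a loop under a fixed clock might look like the following Python/NumPy fragment; the row values, step length, and exponential pitch scaling are illustrative assumptions, not a description of any particular hardware.

# Minimal sketch of an eight-step control-voltage sequencer.
import numpy as np

fs = 44100
row = [0.0, 3.0, 7.0, 12.0, 7.0, 3.0, 0.0, -5.0]   # "knob" values in semitones
step_len = int(0.2 * fs)                           # clock period: 0.2 s per step

# Four passes through the row, one held value per step, then cycle again
control = np.repeat(np.tile(row, 4), step_len)

# Drive an oscillator with the stepped control (exponential pitch scaling)
freq = 220.0 * 2.0 ** (control / 12.0)
phase = np.cumsum(2 * np.pi * freq / fs)
tune = np.sin(phase)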

FUNDAMENTAL PATCHES

Basic Beep

The set of modules available seems to present a bewildering range of possibilities, especially since most hardware synthesizers have several copies of each. However, a patch that makes sound must necessarily contain a few basic elements, and after forty years of patching, 95 percent of patches, including nearly all instruments with fixed architecture, use a variant of the same patch. I call this patch basic beep, diagrammed in Figure 10.11. There are a few things to point out in this diagram: the shapes represent modules and the lines are connections made with patch cords. Simple lines indicate audio connections, lines with arrows indicate control voltages, and dotted lines indicate timing signals. Signals flow left to right.

The basic beep has three modules plus a keyboard controller. The signal originates in the oscillator and is shut off by the amplifier. When a key is depressed, the control to the oscillator determines the pitch and the keyboard gate starts the function generator, which turns the amplifier on.

Even though it is simple, basic beep offers a wide range of sounds. The oscillator waveforms have distinct timbres and the pulse can be changed for subtle variations, but the envelope generator is the primary source of variety. Adjustment of the transition times will provide many flavors of sound. The attack parameter is probably the most sensitive. The shortest attacks have the intensity of a percussion hit, whereas long attacks give crescendos of sound unlike any acoustic effect. Settings in between produce articulations reminiscent of winds, brass, and strings. In our exploration of musique concrète, we discovered how much the attack influences sound when we modify it. The release phase of the envelope is next in importance. The length of release relates to the quality of the sound—fairly short is like a highly damped woodblock, longer gives a resonance like a metal bar (DVD example 10.6 illustrates the variety available by changing the attack and release settings of a beep). A gate makes the type of envelope apparent. The AR type will produce organlike tones, while the ADSR will give the more accented attacks associated with piano. With four controls, the ADSR can produce many effects that can't really be described in words. DVD example 10.7 has a sampling.


FIGURE 10.11 Basic beep (the keyboard sets the VCO frequency, and the keyboard gate drives an ADSR that controls the VCA amplitude).

Here's a silly exercise that will illustrate a problem you will encounter again and again. Set up a beep patch and then add a second VCA to it. Control it with a second envelope generator connected to the same keyboard. Play with the attack and release times of the two ADSRs. Note that whatever you do, the longest attack, shortest release, and lowest sustain level determine what you hear. This effect is going to turn up frequently, because many synthesizers have a master envelope applied to all notes. It also happens when the sound source is a recorded signal with an attack and release built in. Most of these master envelopes default to instant on, which is acceptable, but they often have instant off and will squelch the decay of sounds.

Filter Patches

The main criticism of basic beep is that it is pretty static. Amplitude changes but the timbre is exactly the same throughout. If we add a filter the sound loses much of its predictability and comes to life. There are two approaches to filtering: fixed and envelope controlled.

A filter of fixed frequency has the same effect as the resonator of an acoustic instrument. It serves to accentuate certain components and reduce others. Naturally, a rich waveform must be used for the filter to have much effect. Filtering a sine wave can only reduce its amplitude.


FIGURE 10.12 Beep with filter (the VCO feeds a VCF and then a VCA, with separate ADSR envelopes on the filter cutoff and the amplitude).

The filter can be set at a fixed frequency, which adds a formant to the tone, or the filter can be controlled to some degree by the keyboard output. The resulting sound is the same on each pitch, but with a more interesting waveform than the bare oscillator. Filters of any type can be used this way—even audio equalizers and fixed filter banks will produce unique sounds.

An envelope can control the filter cutoff frequency to produce an especially organic effect. A low-pass filter is initially tuned lower than the oscillator. When the note is triggered, the filter cutoff sweeps upward revealing partials one at a time. This kind of attack is associated with many acoustic instruments, notably brass; the initial sound is the fundamental, with the high partials establishing themselves a bit later. If the controlling envelope is an ADSR type, the initial peak of full spectrum followed by a drop to intermediate complexity is akin to the rich jangle of a plucked string. The final release lets the filter close down and shut off the sound. If the release is fairly long, the lingering sound is mostly a pure fundamental, another effect reminiscent of acoustic sound production. Usually, the filter is placed before an amplifier, as in Figure 10.12, but if the filter shuts the sound off entirely the amplifier is not necessary. Independent envelope generators for filter and amplifier will give the widest set of possibilities.

If the resonance of the filter is turned up, envelope control can result in an unnatural, almost comical sound. This effect has been immortalized in countless wah-wah pedals. The vocal associations of this effect are related to the action of the soft palate in speech, the position of which controls the frequency of the formants that determine vowels. A filter applied to a sawtooth wave can almost produce speech, but few synthesizers provide the detailed control needed. DVD example 10.8 demonstrates some filter effects. First you will hear notes in ascending scales with a different filter frequency and resonance per scale, then the filter with envelope control.
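A minimal sketch of the envelope-controlled cutoff, assuming a simple one-pole low-pass filter rather than the steeper designs discussed earlier, could look like this in Python with NumPy; all of the values are arbitrary examples.

# Minimal sketch of an envelope-controlled low-pass filter sweep: a simple
# attack-decay envelope sweeps the cutoff upward, revealing the partials of
# a sawtooth one at a time.
import numpy as np

fs = 44100
dur = 1.5
t = np.arange(int(dur * fs)) / fs
saw = 2.0 * (110.0 * t % 1.0) - 1.0                    # rich source waveform

# Attack-decay envelope for the cutoff: sweep up fast, fall back slowly
attack, decay = 0.05, 1.0
env = np.where(t < attack, t / attack, np.exp(-(t - attack) / decay))
cutoff = 100.0 + env * 4000.0                          # cutoff in Hz over time

# One-pole low-pass with a time-varying coefficient
out = np.zeros_like(saw)
y = 0.0
for n in range(len(saw)):
    a = 1.0 - np.exp(-2.0 * np.pi * cutoff[n] / fs)    # smoothing coefficient
    y += a * (saw[n] - y)
    out[n] = y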


FIGURE 10.13 Doubled oscillators (two VCOs tracking the same keyboard are mixed before the VCF and VCA).

Fat Patches

The basic waveform of a patch will be livelier if it is derived from two oscillators. They should be controlled by the same source but not tuned to the same frequency. The interval can be octaves or fifths, but the richest sound comes from a pair of oscillators at nearly the same frequency (see Figure 10.13). This results in beating, a variation in amplitude at a rate equal to the frequency difference. The difference should be a fraction of a hertz. Any larger interval is heard as a vibrato that changes in proportion to the pitch and gets pretty sour if you jump more than an octave.

This technique is one area in which analog systems can be more interesting than digital. It is really impossible to tune two analog oscillators precisely together. It is possible to match one pitch (by listening to the aforementioned beats), but when a different pitch is played the oscillators drift apart. With the original Moog, this was a constant source of aggravation, but the designs of a decade later tracked well enough together to be musical while providing a rich variation. Digital oscillators are perfectly tuned, so if you want note-to-note variation, an extra control source must be added to the patch. The sound of doubled oscillators is demonstrated in DVD example 10.9. The word fat seems an apt description. Since doubling requires twice the signal generation, some applications will have a trade-off between voices doubled and the number of voices that can play at one time. (That is the rule in hardware instruments, but in software it primarily affects the plug-in load.)
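The beating effect itself is easy to verify numerically. Here is a minimal Python/NumPy sketch, assuming two sine oscillators half a hertz apart; the frequencies are arbitrary examples.

# Minimal sketch of the "fat" doubled-oscillator effect: two oscillators a
# fraction of a hertz apart beat at a rate equal to the frequency difference
# (here 0.5 Hz, one swell every two seconds).
import numpy as np

fs = 44100
t = np.arange(4 * fs) / fs
osc1 = np.sin(2 * np.pi * 220.0 * t)
osc2 = np.sin(2 * np.pi * 220.5 * t)       # detuned by 0.5 Hz
fat = 0.5 * (osc1 + osc2)                  # mixed; the amplitude swells and fades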


Vibrato

If you add a low-frequency oscillator to the pitch input of the beep patch, you get an undulating tone right out of 1950s science fiction movies. The LFO waveforms produce strikingly different effects: the sine and triangle with up and down sweeps, the ramp climbing up over and over, and the square sounding like a French police siren. We can tame the effect by attenuating the LFO signal (most VCOs have attenuators on some frequency inputs). If we set the LFO to triangle wave and play with the rate, we find that at very low frequency the detuning is subtle. If we mix two or three oscillators, each with an LFO at slightly different rates, the result is a dense, ever-changing texture.

LFO rates on the order of two or three per second produce vibrato. String players and vocalists practice long hours to develop a tasteful vibrato that is neither too wide nor too fast. There is also a shape to vibrato: the note is attacked in tune, then vibrato is added to warm up the sound. This last trick is synthesized by running the LFO through a VCA with envelope, illustrated in Figure 10.14. If the LFO is connected to an amplifier instead of the VCO, the vibrato is a change of intensity rather than pitch, more like that of wind instruments. DVD example 10.10 presents pitch and intensity vibratos at various rates, followed by pitch and intensity vibrato with envelope controls.
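A minimal sketch of delayed vibrato in Python with NumPy, assuming a sine LFO whose depth is faded in by a simple ramp; the rate, depth, and fade times are arbitrary examples.

# Minimal sketch of delayed pitch vibrato: a 5 Hz sine LFO modulates the
# oscillator frequency, and the LFO depth is faded in so the note starts
# in tune and then warms up.
import numpy as np

fs = 44100
t = np.arange(2 * fs) / fs
lfo = np.sin(2 * np.pi * 5.0 * t)                    # 5 Hz vibrato LFO
fade = np.clip((t - 0.4) / 0.6, 0.0, 1.0)            # vibrato enters after 0.4 s
depth = 0.5                                          # depth in semitones
freq = 440.0 * 2.0 ** (depth * fade * lfo / 12.0)    # pitch modulation
note = np.sin(np.cumsum(2 * np.pi * freq / fs))      # oscillator with vibrato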

MODULATION

The word modulation was first used in electronics to describe the process of imposing an audio signal on a broadcast carrier. The carrier corresponds to the frequency of the radio station, such as 810 kHz; it's a high-voltage signal applied to the broadcast antenna. The audio can be added in by changing the carrier amplitude (for amplitude modulation, or AM) or carrier frequency (for frequency modulation, or FM). The audio information is extracted by detection circuits in radio receivers, but the encoded signal itself can be an interesting sound. Modulation is available to electroacoustic composers through three techniques.

Amplitude Modulation

We have seen that amplitude is controlled by VCAs. If a steady tone is applied to the audio input of the VCA, and a second tone is applied to the control input, additional tones are produced with frequencies equal to the sum and difference of the two original tones. So if a tone of 1,000 Hz is amplitude modulated by a tone of 100 Hz, the output also includes tones of 900 Hz and 1,100 Hz. These additional tones are called sidebands, another term borrowed from the broadcasting industry.


FIGURE 10.14 Vibrato with LFO (a 2 Hz LFO passes through an envelope-controlled VCA before modulating the VCO frequency; separate ADSRs drive the VCF and the output VCA).

When the modulating frequency is increased, you can hear a characteristic simultaneous upward and downward sweep of the sidebands as in DVD example 10.11. An interesting thing happens when the modulator is higher in frequency than the carrier. The lower sideband winds up at a negative frequency. We can't hear that a frequency is negative (it is just inverted in phase), but we do hear the lower sideband reflecting at 0 Hz and continuing its sweep upward.

This discussion so far assumes that we are dealing with sine tones. The details hold true for other waves, but the results contain the sum and difference of all components of the carrier and modulator. That can be a complex sound, and the sweep is particularly interesting (DVD example 10.12). You can also get marvelous sounds by modulating a low-frequency tone with voice (DVD example 10.13).

This process is quite different in analog and digital systems. On an analog synthesizer, amplitude modulation will only work right if the amplifier is partially turned on by constant control voltage (usually the initial gain). That's because the modulator (like all audio signals in the synthesizer) swings positive and negative, but the negative sections of the waveform can only turn the amplifier off. The result would be only the top half of the expected modulated waveform. Removing the negative sections of a waveform is called rectification. Adding a constant control voltage means the amplifier never quite goes off, which is why the original tones are heard in the output.


In a digital system, amplification is performed by multiplying the sample values of the signal (rapidly changing positive and negative numbers) with the control (slowly changing positive numbers). In most situations, this has exactly the same effect as an analog amplifier. However, a digital system will cheerfully multiply a signal by a negative control value. The result is an inverted signal (all samples changed in sign). When two signals are digitally multiplied, the result contains only the sum and difference sidebands. To simulate old-fashioned amplitude modulation in a digital system, you need to add a constant value to the modulator.
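A minimal sketch of the difference, assuming sine tones in Python with NumPy: multiplying the two signals directly gives only the sum and difference sidebands, while adding a constant to the modulator first keeps the carrier in the output.

# Minimal sketch contrasting plain digital multiplication with classic
# amplitude modulation.
import numpy as np

fs = 44100
t = np.arange(fs) / fs
carrier = np.sin(2 * np.pi * 1000.0 * t)        # 1,000 Hz carrier
modulator = np.sin(2 * np.pi * 100.0 * t)       # 100 Hz modulator

multiplied = carrier * modulator                # sidebands only: 900 and 1,100 Hz
classic_am = carrier * (1.0 + 0.5 * modulator)  # carrier plus both sidebands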

Balanced Modulation

A balanced modulator is an analog circuit that works in the same manner as digital multiplication. Because early designs included a set of diodes arranged in a circular configuration, the process is also known as ring modulation. Ring modulation, like digital multiplication, produces sum and difference tones from a carrier and modulator. These devices usually let some of the carrier bleed through, and there would often be a control labeled "carrier null" to minimize this. Some devices also included noise gates ("squelch") to shut the carrier off when there was no modulator. DVD example 10.14 compares amplitude modulation with balanced modulation. Balanced modulation is generally more interesting than amplitude modulation, which is one reason that digital systems generally stay with simple multiplication.

Frequency Modulation

An audio signal modulating the frequency of an oscillator produces an even wider range of interesting timbres. Since this works best in the digital form, I'll explore that implementation first and then come back to analog applications. The musical potentialities of frequency modulation were described by John Chowning in 1973 in a widely influential paper. His algorithms and implementation were used by Yamaha in one of their most successful synthesizers, the DX7, introduced in 1983.

Frequency modulation produces sidebands in a manner similar to amplitude modulation, but the actual results depend on the strength of the modulator in a complex way. With a small amount of modulation signal, the result contains the carrier plus two sidebands. The upper sideband is equal to the carrier frequency plus the modulator frequency and the lower sideband is the carrier minus the modulator frequency. Using the variables Fcar for carrier frequency and Fmod for modulator frequency, with a small amount of modulation, the result has three components: Fcar, Fcar – Fmod, and Fcar + Fmod. If the modulation is increased, two new components appear: Fcar – 2Fmod and Fcar + 2Fmod. In addition, the amount of carrier is reduced somewhat. As the modulation is increased further, more sidebands appear at spacing equal to the modulation frequency as shown in Figure 10.15.


FIGURE 10.15 The basic FM spectrum; carrier is at 8 kHz, modulator at 500 Hz (upper and lower sidebands spaced at the modulator frequency, plotted on a linear frequency scale to 16 kHz).
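A minimal sketch of simple FM in Python with NumPy, implemented in the usual digital way as phase modulation of a sine carrier; the frequencies and modulation index are arbitrary examples.

# Minimal sketch of simple frequency modulation: a modulator at Fmod varies
# the phase of a carrier at Fcar; raising the index brings in more sidebands
# spaced at the modulator frequency.
import numpy as np

fs = 44100
t = np.arange(fs) / fs
f_car, f_mod = 1000.0, 200.0
index = 3.0                                   # modulation depth (index)

modulator = np.sin(2 * np.pi * f_mod * t)
fm_tone = np.sin(2 * np.pi * f_car * t + index * modulator)
# Sidebands appear at f_car +/- n * f_mod; their strengths follow Bessel
# functions of the index, so increasing the index adds sidebands and
# redistributes energy away from the carrier.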

The timbre resulting from frequency modulation is determined by the relative frequencies of the carrier and modulator. If they are the same, the result with small modulation signals sounds like the carrier and its second harmonic, because Fcar + Fmod is the same as 2Fcar. The lower sideband has a frequency of 0 Hz, which is inaudible. Additional upper sidebands will appear at higher harmonics of the carrier, but additional lower sidebands will be "reflected" at zero frequency.

It is here that digital oscillators and analog oscillators part company (unless the digital version is a faithful model of a real analog module; frequency modulation is an excellent test for so-called classic emulators). The reflected components of digital frequency modulation are mathematically perfect and differ only in phase, but analog modules do not respond well to negative modulation voltages. The reflected components are at best seriously detuned, resulting in a clangorous tone. In many analog designs the carrier pitch will rise as modulation increases. DVD example 10.15 demonstrates frequency modulation on an analog synthesizer. Figure 10.16 shows the basic FM patch.

If the modulation frequency is a rational fraction of the carrier, the pitch that is heard is the difference between the carrier and modulator. The various sidebands fill in harmonics in a complex way. If Fmod and Fcar are not related by any simple ratio, the sound is not strongly pitched but is more of a metallic clang. By varying the modulation amount with an envelope generator and VCA, marvelous bell sounds can be produced.

All of this is based on linear modulation of the oscillator. Only a few analog oscillators have linear control inputs. Modulation through an exponential (1 volt per octave) control will produce interesting sounds, but nothing like the tightly pitched timbres associated with digital frequency modulation (explored in chapter 12).


FIGURE 10.16 The basic FM patch (a modulator VCO, scaled by an envelope-controlled VCA, drives the frequency input of the carrier VCO, which is then shaped by the output VCA).

NOTE GENERATION

MIDI Keyboards

Basic beep is usually controlled by a keyboard. The connection points are traditional—keyboard control voltage to the pitch input of the oscillator(s) and keyboard gate to the transient generator(s). If there is a filter in the patch, control to the frequency input will produce a different effect than if the frequency is left unchanged. The original analog keyboards played only one note at a time, but modern MIDI controllers have no such limitation. In a computer-based synthesis program, the keyboard object actually represents a MIDI port and a selector for note events. Most programs make you specify the polyphony—how many notes may be played at a time. They also require an indication of how much of the patch to duplicate for different notes. Figure 10.17 illustrates this concept. The patch defines four oscillators which each have their own amplifier and envelope generator. These are mixed together for output. Most programs simplify this graphic to avoid showing duplicates. Instead, a module called a MIDI mixer indicates the point in the patch where the signals for each voice are combined.

Keyboard objects also include a velocity value. Velocity usually controls a second amplifier or the sustain level on an ADSR envelope, but controlling parameters like attack make the patch more responsive to touch.

If the polyphony of an instrument is limited by a keyboard or other restriction, what happens when you ask for one more note?


FIGURE 10.17 Polyphonic voices (a patch layout with duplicated VCO/VCA/ADSR voices mixed to a single output, and the equivalent patch from Tassman: poly keys, vco, vca, adsr, poly mixer, mono out).

In a few cases a new note will not be played until a key is let up. More commonly, especially on software synthesizers, a sounding note will be turned off, a process called voice stealing. The note that has been on the longest is the usual choice to turn off, but a less intrusive approach is to sacrifice the oldest but not lowest note. In any case, the note will be cut off abruptly, which may be quite audible in thin textures.
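A minimal sketch of a voice allocator that steals the oldest note, written in Python purely as an illustration; the class, names, and polyphony value are assumptions, not any particular synthesizer's scheme.

# Minimal sketch of voice stealing: when all voices are busy, the note that
# has been sounding the longest is turned off and its voice reused. A
# refinement (not shown) would skip the lowest sounding note.
class VoiceAllocator:
    def __init__(self, polyphony=4):
        self.polyphony = polyphony
        self.active = []                 # sounding notes, oldest first

    def note_on(self, note):
        if len(self.active) >= self.polyphony:
            stolen = self.active.pop(0)  # steal the oldest sounding note
            print(f"stealing voice from note {stolen}")
        self.active.append(note)

    def note_off(self, note):
        if note in self.active:
            self.active.remove(note)

alloc = VoiceAllocator(polyphony=2)
for n in (60, 64, 67):                   # the third note forces a steal
    alloc.note_on(n)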

Sample and Hold Patterns

Basic beep will repeat at a steady rate if the pulse output of an LFO is connected to the gate input of the envelope generator.


FIGURE 10.18 Sample and hold waveforms (the input, sample points, and stepped output for three LFO rates, labeled A, B, and C).

Adjusting the duty cycle of the pulse changes the articulation of the note: a cycle mostly high gives a legato effect as the release overlaps the next attack. A short cycle produces staccato notes. Connecting a second LFO to the pitch input of the beep oscillator gives changing pitches, but they will be sliding with the LFO waveform, not sustained notes. This is the situation the sample and hold module was invented for.

The sample and hold operation is illustrated in Figure 10.18. The module captures the voltage at its input when the trigger goes high and holds that voltage until the next trigger. This changes a continuous waveform, such as a ramp from an LFO, into a series of steps. If the rate of this LFO is increased, an interesting phenomenon occurs: while the LFO rate is slower than the trigger pulses, the pattern output by the sample and hold will outline the waveform of the LFO (A in Figure 10.18). As the LFO speeds up, there will be fewer steps per cycle until both the LFO and the pulse generator are at exactly the same speed. This produces a steady repeated pitch. As the LFO continues to speed up the output shows alias effects, producing patterns that change in complex ways. B and C in Figure 10.18 illustrate some of the possibilities, and you can hear more on DVD example 10.16.
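A minimal sketch of the sample and hold operation in Python with NumPy, assuming a regular trigger every eighth of a second; the rates are arbitrary examples.

# Minimal sketch of a sample-and-hold module: at each trigger the current
# input value is captured and held. Sampling a slow ramp LFO gives a
# staircase that outlines the ramp; sampling noise gives random steps.
import numpy as np

def sample_and_hold(signal, trigger_period_samples):
    out = np.empty_like(signal)
    held = signal[0]
    for n, x in enumerate(signal):
        if n % trigger_period_samples == 0:   # trigger: capture the input
            held = x
        out[n] = held                         # otherwise hold the last value
    return out

fs = 44100
t = np.arange(2 * fs) / fs
ramp_lfo = (0.5 * t) % 1.0                    # slow rising ramp, 0.5 Hz
steps = sample_and_hold(ramp_lfo, fs // 8)    # 8 triggers per second
random_steps = sample_and_hold(np.random.uniform(-1, 1, len(t)), fs // 8)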

EXERCISES

1. Experiment with the patches for basic synthesis techniques given in the chapter and expand on them; then compose some études to find out how they work in practice.

2. Download MIDI scores to some Bach inventions from the Internet; design patches that respond in interesting and musical ways.

3. Produce a piece that uses concrète sounds for background and synthesized sounds for foreground.


4. Produce an ambient piece using synthesized textures that evolve at a barely perceptible pace.

RESOURCES FOR FURTHER STUDY

Several current books on synthesizer programming focus on the more limited features of keyboard synthesizers.

Aikin, Jim. 2004. Power Tools for Synthesizer Programming: The Ultimate Reference for Sound Design. San Francisco: Backbeat Books.

Cann, Simon. 2010. Becoming a Synthesizer Wizard. Boston: Course Technology [Cengage Learning].

Jenkins, Mark. 2007. Analog Synthesizers: Understanding, Performing, Buying. Boston: Focal Press.

Russ, Martin. 2008. Sound Synthesis and Sampling, 3rd ed. Boston: Focal Press.

The best books on synthesis are the classics, which are out of print but available in libraries or from used book sellers.

Strange, Allen. 1983. Electronic Music: Systems, Techniques, and Controls, 2nd ed. Dubuque: William C. Brown.

Wells, Thomas, and Eric Vogel. 1981. The Technique of Electronic Music. New York: Schirmer Books.


ELEVEN

Voicing Synthesizers

"How do I get a synthesizer to produce the sound I hear in my head?" This is the most common question I hear from students learning synthesis. Unfortunately, there is no easy answer to this question. If a sound only exists in your head, it can't be sampled and we can't do a spectral analysis or look at the envelope. What we can do is learn to synthesize a wide range of sounds, and maybe that specific one will eventually turn up.

Learning advanced synthesis is an intimidating task. The sounds we encountered in the last chapter have a certain simple charm, but they pale in comparison to the presets in any commercial instrument. After all, synthesizers were invented in the 1960s, and a lot has been discovered in the ensuing forty years. We can only catch up by a systematic study of a wide range of instruments. The following chapters will show how to undertake this study. We will encounter several interesting programs and instruments, but their inclusion does not mean these particular synthesizers are the best or that you should get them—they just happen to be available and clearly demonstrate important techniques.

Adjusting the sound of an instrument is known as voicing, a term that goes back to the development of pipe organs. You can voice synthesizers with no tools beyond your ears, but I find spectral displays and oscilloscope displays quite useful, especially when learning or explaining new techniques. In the process of preparing this book, I have used Inspector XL from RNDigital (no longer available), WaveWindow by Laidman and Katsura, schOPE from Schwa, and Reveal from Berkley Integrated Audio Software. These are Mac programs, but PC users will find many excellent options.

PATCHING PARADIGMS

Software synthesizers come in two styles: fixed architecture and patchable. There is a perception that the patchable ones are more difficult to use, but that's not really true. Most synthesizers come with a library of pick-and-play preset patches.


Some have so many presets that they include a search engine to help find sounds to fit particular situations. Patchable synthesizers vary widely in the amount of flexibility they offer. Sometimes there's nothing more than a choice of effects to apply to the overall sound. At the other extreme, you get a library of modules and a blank canvas on which to lay them out. There seem to be three major paradigms for patching, which I call graphics, schematics, and slots.

Graphic patching has developed from programs that emulate classic modular machines like the Moog 5 or ARP 2600. These show a picture of the modules complete with jacks. A click or two of the mouse connects patch cords, sometimes with exciting animation. These provide a high dose of nostalgia, but frankly, the patch cords were the most annoying feature of those instruments, and in the digital version they really obscure the labels and controls. An option to hide the cords is a definite plus.

We have already seen schematic patching in Tassman. In this approach, symbols representing modules are placed on a canvas and connected by lines that represent the signal and control flow. When the patch is constructed, a different view shows operating controls. Schematic patching is the most flexible approach, but it can be hard to learn. As you will see, one of the most fruitful ways to learn synthesis is by deconstructing patches, and this can be quite difficult when there are many tangled lines.

In a synthesizer that uses the slot paradigm, the modules are listed in columns (or occasionally rows). The oscillator occupies the top slot and signal flows down. A click on a space below allows selection of a module to add to the column. If a module is selected, its parameters appear in another pane of the window. Since this design does not have a simple way to display control connections, envelope and LFO parameters appear elsewhere. In a variation of the slot paradigm, each module has both a compact representation and an expanded form in which the parameters are shown. This configuration is sometimes called a rack.

A SIMPLE SUBTRACTIVE SYNTHESIZER

You can think of a new synthesizer as a puzzle to be solved. The pieces of the solution are architecture, modules, and parameters. Figure 11.1 shows a simple synthesizer named Remedy (freeware, but no longer available). Remedy is an excellent example of subtractive synthesis. It has an oscillator section that is capable of generating rich sounds and a filter that tames them. It is not fully patchable, but many useful connections can be set up by turning knobs. DVD examples 11.1 to 11.8 systematically explore Remedy to find out how its pieces fit together.


FIGURE 11.1 Remedy plug-in synthesizer.

Architecture

The architecture of a synthesizer refers to the way signals and controls flow in the instrument. If an instrument has a fixed architecture, chances are excellent the structure is a version of basic beep. The first steps in learning the architecture are to list the modules and draw out the signal path. Hopefully, this has already been done in the manual. Sometimes the parameter list in the host application will offer clues. Remedy has two oscillators, one filter, one amplifier, and an LFO that can be routed to several parameters. The filter and amplifier have their own envelope controls, so the concept of a separate envelope generator is lost here.

The Remedy signal path is shown in Figure 11.2. This was discovered by assuming it is basic beep and making some experiments. The two VCOs are balanced by the mix knob and fed into the filter. The FM knob in the oscillator section routes VCO 2 to the frequency of VCO 1. It can be tricky to determine if the amplifier follows the filter. Usually this makes no difference, but if the filter is resonant enough to make sound on its own, the amplifier envelope will cut it off on short notes. You can test this by turning the resonance up full and setting short envelope times. If you press a key and hold it, many filters will ring a bit, producing a drum-like sound. If you tap the key, a chopped-off filter ring tells you the amp is after the filter (which is almost always the case).

Control pathways in Remedy are determined by knobs instead of patching. In this instrument the knobs that modify control connections are at the source.


FIGURE 11.2 Remedy signal path (Osc 1 and Osc 2 are mixed into the filter and amp; the LFO, ADSR envelopes, velocity, and mod wheel are routed to pitch, pulse width, FM amount, and filter cutoff).

(Many similar instruments have controls at the destination.) The LFO has four sends that go to oscillator pitch, filter cutoff, pulse width (PW), and frequency modulation amount (FM). The amplitude envelope can also be sent to PW and FM. MIDI velocity can affect the filter or amplifier, and the mod wheel can control the filter. Keyboard pitch connections are always assumed to go to the oscillators, but pitch-to-filter connections cannot be taken for granted.

Modules and Parameters

To decipher the parameters for each module, we start with the simplest sound the instrument will make and experiment with the controls individually. This usually requires a bit of preliminary tweaking. First I zero out all of the LFO and envelope sends, then I set the filter cutoff and sustain as high as they will go with a short attack and long release. I also set the amp sustain to maximum with medium attack and release.

This process reveals a lot, starting with the action of the mouse on the knobs. Some applications expect a circular drag to turn the knob, but most work with a straight up-and-down motion, as Remedy does. My preferred action is a drag up and down with a click on the perimeter of the knob producing an instant change, but really there is no comfortable way a mouse can interact with a picture of a knob.


Programs that value ergonomics over appearance use sliders or number boxes for parameter settings. Another essential discovery is how to connect MIDI controls to knobs on the synth. MIDI control makes it simple to explore parameters systematically. Generally, connections are made with a button called learn or MIDI (in Remedy the button is CC). Some combination of clicking this button, clicking or moving the graphic control, and moving the MIDI control will link them up. In some of these videos, I have linked the on-screen knobs to an external MIDI control unit.

Oscillators

DVD example 11.1 works through the available waveforms on oscillator 1, then oscillator 2. Don't be surprised to find waves with the same name actually sounding different. One surprise in Remedy is that oscillator 2 is an octave lower than 1. Another interesting discovery is that the left and right channels are slightly out of tune. Most synths have a fine tuning or detuning control to produce beating. Here the only way to turn detuning off is to switch to mono. This is an example of the way an instrument designer's choices determine the final results. It's not possible to include a control for everything without filling the screen, so some presumptions are necessary. Here, the designer has chosen one detuning to take or leave, which is going to give all patches played in Remedy a characteristic sound.

Listening to steady waveforms reveals a lot. In Remedy the sine wave is very clean, which is to be expected in a digital system. Some composers prefer to have a sine wave with a little character, but in an instrument that features FM a pure sine is essential. The triangle is also a pure tone, at least up to the top of the treble staff. Above that some aliasing sets in, but it is hardly audible. In fact the channel detuning adds more edge to the sound than the aliasing does. Aliasing is really noticeable on the sawtooth waveform. It turns up just above middle C where it is audible as a subtone, pitched lower than the fundamental. You might think filtering would remove aliasing, but it does not. The final waveform is band-limited noise. This is an unusual thing to find on an oscillator, but is a useful addition to the waveform list (noise is usually treated as an independent source).

Oscillator 2 has three pulse waves with different duty cycles: 1:2, 1:3, and 1:10. Figure 11.3 shows the spectrum of the square wave (1:2) with the characteristic missing even harmonics. There must be some slight error in the waveform, because the even harmonics aren't entirely missing, just reduced in amplitude. Figure 11.3 also shows significant aliasing. Some of this is rooted in the digital synthesis technique and some is just the effect of everything between the oscillator and the measuring device used. The pulse wave on oscillator 1 has a variable duty cycle with a range of 1:10 to 1:2. Connecting the triangle LFO to this parameter demonstrates the effects of varying the duty cycle as heard in DVD example 11.2. (Some instruments call this process pulse width modulation or PWM.) Figure 11.4 shows the spectrogram of the changing duty cycle, a striking display. The dark areas show the harmonics and the white curves are the nulls in the harmonics. You may wonder why the nulls in the harmonics follow a curving path when the triangle shape of the LFO has straight sides. The answer lies on the frequency scale of the spectrogram: it's a logarithmic scale, giving equal spacing to each octave.
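The relationship between duty cycle and spectrum can be checked with a minimal Python/NumPy sketch; the frequencies and duty cycles are arbitrary examples, and the naive pulse generated here will also show the aliasing discussed above.

# Minimal sketch of pulse waves with different duty cycles: a 1:2 (square)
# pulse has very weak even harmonics, while other duty cycles fill them in
# and move the spectral nulls that pulse width modulation sweeps around.
import numpy as np

fs = 44100
t = np.arange(fs) / fs

def pulse(freq, duty, t):
    return np.where((freq * t) % 1.0 < duty, 1.0, -1.0)

square = pulse(220.0, 0.5, t)        # 1:2 duty cycle
narrow = pulse(220.0, 0.1, t)        # 1:10 duty cycle

spectrum = np.abs(np.fft.rfft(square)) / len(square)
# With a 1-second frame, bin k corresponds to k Hz; the even-harmonic bins
# (440, 880, ...) of the square are much weaker than the odd harmonics.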


FIGURE 11.3 The spectrum of Remedy's square wave (missing even harmonics and alias components, plotted from 15 Hz to 16 kHz).
FIGURE 11.4 Time spectrogram of a changing duty cycle (harmonics 1 through 6 as the duty cycle moves between square, 1:3, and 1:4).


The pulse modulation feature is a good way to investigate the parameters of the LFO. This is shown in the video in DVD example 11.3. The effect of the rate control should be obvious, but it always takes some experimentation to determine the range of frequency available. You will notice that at the highest settings, the LFO generates a tone of its own. The frequency where this happens is around 15 or 20 Hz. The LFO has several shapes, including ramp up and down. VCOs only need one or the other, since they sound exactly the same, but an LFO must have both for differing effects. The LFO square wave is useful for chopping sounds into short bursts. The final LFO waveform in DVD example 11.3 is a random value generator. This gives the same pattern each time you play a note, because the random number generator restarts on each note. The other LFO shapes are also synced to the beginning of the note. This is more clearly heard in DVD example 11.4, where the LFO controls pitch. As I play arpeggiated chords, you will hear each note go its own way, but they will all swing together when I turn the global switch on. The global switch reconnects the patch to use a single LFO that runs continuously. I find this effect a bit sci-fi, but it may be appropriate in some situations.

The Remedy oscillators feature limited frequency modulation. In some synthesizers, FM is the core of synthesis, and we will look at some of those in the next chapter, but many simple synthesizers provide basic FM to create rich sounds for the filter to work with. DVD example 11.5 explores the version of FM found on Remedy, with the amount of modulation controlled by the LFO. As we have seen in earlier chapters, frequency modulation produces sidebands at Fcar – nFmod and Fcar + nFmod where the multiplier n equals the sideband number. Fcar is the frequency of the oscillator we hear (oscillator 1) and Fmod is the frequency of the modulator (oscillator 2). Since oscillator 2 is half the frequency of oscillator 1, the fundamental pitch will drop an octave as soon as any modulation is applied (Fcar – Fmod = Fmod). The modulation amount determines the strength of the sidebands in a complex way. Increasing the modulation adds sidebands above the carrier at intervals based on the modulator frequency, so the effect is like opening a low-pass filter. At high levels of modulation, some of the lower sidebands drop out, which causes the timbre to acquire a high-pass flavor. The composite effect is unique and quite attractive.

Envelopes

One of the classic FM applications is to put an envelope on the modulation amount. In Remedy, the amplitude envelope can be sent to the FM control. Figure 11.5 shows how the envelope affects the spectrum; varied envelopes are demonstrated on DVD example 11.6. You can hear that extra harmonics are present during the attack-hold-decay phase of the envelope. This works smoothly because the envelope control of FM is proportional to the initial setting of the modulation. It's adjusted so that the peak of the envelope produces the spectrum that would result without the envelope. The envelope itself is an AHDSR, an ADSR with the added ability to hold for a brief time at the initial peak. It is permanently connected to the amplifier, so the FM changes are linked to loudness.


FIGURE 11.5 Time spectrogram of Remedy FM (the attack, decay, sustain, and release phases of the envelope are visible in the changing spectrum).

This is another designer choice that limits available sounds, but it also eliminates a lot of tedious work required to balance the modulation with the amplitude envelope.

DVD example 11.7 shows a video demonstration of systematically testing the amplitude envelope functions. This is a necessary task for any synth you encounter, since there is no standard for the range of envelope times. Note especially where the attack loses its pop and the way decay and release interact when sustain is off. The following are some things to listen for.

Attack: Very short attacks pop. The exact point of transition to a softer sound depends on the quality of the waveform. Rich sounds retain the pop on slower attacks. Very slow attacks prevent short notes from sounding.

Release: This is what happens after the note is over. Short releases pop. Longer releases overlap following notes and add reverberation. Too much release can really thicken the texture of a passage.

Sustain: Sets the loudness of notes. When the sustain is zero we get percussive notes of fixed length. On some synthesizers velocity is tied to sustain; on others, velocity affects the entire envelope.

Decay: With sustain at some middle value, decay affects the punch of the attack. With sustain at maximum, decay is not heard. With sustain at zero the decay becomes the release of most notes, although a short note may jump to the release phase before the decay finishes. In this case all notes are the same length with a plucked or percussive sound.


Hold: With higher sustains hold affects the punch of the attack phase, but too long a hold gives a distinct two-part feeling to the sound. With sustain at zero, the hold and decay determine the length of the note.

Filters

Filters are the modules that give synthesizers individual character. Waveforms and amplifiers are defined by physics, and envelopes really only vary by the time settings available, but there are hundreds of filter designs, and they all sound different. The most common design is a low-pass filter with resonance control, and there are a dozen or more options to this type. Since filters are the heart of subtractive synthesis, we need to learn exactly how the filter works: what is its range, what is the slope, and how peaky the resonance is. This is best evaluated with a sawtooth wave. The amplitude envelope should have quick attack, high sustain, and medium release so the filter has the principal effect on what is heard.

Filters can either be fixed or track the keyboard. This should be a user's choice, but on a simple synthesizer it may be permanently set one way or the other. You can easily hear if a filter is fixed—the sound will become softer and clearer on higher pitches. The time spectra in Figure 11.6 show that the Remedy filter does not track. Low notes are rich, but as we go up the keyboard fewer partials get through. A spectral display is also the best way to find out the filter range. Set the resonance and filter sustain to max and play a rich sound while sweeping cutoff all the way up. The peak corresponding to the cutoff should show up clearly.

The slope of a filter is some multiple of 6 dB per octave. You can determine this pretty easily with a level meter. Play a sine wave and go up the scale until the amplitude drops a bit and make note of the meter reading. Now go up another octave. The difference in meter readings is the filter slope. The filter slope in Remedy turns out to be 12 dB per octave. The resonance of a filter can be evaluated with noise. Set the frequency to the middle of the noise band and turn the resonance up. The purer the tone you hear, the higher the resonance goes. Some filters will oscillate at high resonance. These can be used for a sine tone, but the most interesting use is to set them just below oscillation and apply a low-frequency pulse wave. They will ring with the pulses at the filter cutoff frequency.

The kind of filter found in Remedy can be useful in two ways. First, static filtering will give a formant or resonance to the sound. Set the resonance up a bit, but not to the point where it starts to sound like a wah-wah pedal. Play the sort of melody you have in mind and tune the filter cutoff for best effects. A filter is easy to tune if you turn the resonance up and listen for harmonics. A slight peak on the third or fourth harmonic gives good results. Turn the resonance back down until the effect is not quite heard.


FIGURE 11.6 Time spectrogram of the Remedy filter (notes from C2 to C5; because the cutoff does not track the keyboard, higher fundamentals pass fewer partials).

The second way to use a static filter is to apply an envelope to the cutoff frequency. As the envelope opens, the low harmonics speak first. With medium attack this produces an effect associated with brass instruments. Keep the resonance fairly low unless you want the talking duck sound. The amplitude and filter envelopes interact, so you need to keep tweaking both. DVD example 11.8 demonstrates the Remedy filter controls and envelope.

The filter on Remedy has one more mysterious parameter: drive, from overdrive, which refers to the practice of deliberately applying enough signal to distort an analog circuit. This produces soft clipping and turns sine tones into rounded squares as illustrated in Figure 11.7. The drive effect has nothing to do with filters. It's actually produced with waveshaping, described below. The result of this effect is the addition of odd harmonics, so it doesn't do much to sawtooth or other rich waves.

SUBTRACTIVE SYNTHESIS ON STEROIDS

The sounds generated with Remedy are varied and expressive, but we have only begun to explore the capabilities of subtractive synthesis. For deeper education, we will turn to a well-established commercial product, Absynth by Native Instruments. This is a modestly priced application with a rich feature set.


FIGURE 11.7 The effect of drive in Remedy.

None of the features discussed below are unique to Absynth, but we would probably have to cover three competing instruments to otherwise include them all. (There is a demo version of Absynth available online that includes enough functionality to experiment with the concepts in this section.)

Architecture

Absynth has a slot-based organization. There are three channels, each with room for three modules. There are an additional three slots that can provide processing to the mix of the channels. The simplest patch is one oscillator as shown in Figure 11.8. This is the New Sound preset, which is found under the File menu. The word oscillator is a bit of an understatement. This is a multifunction generator module that can produce signals a half dozen different ways, including sample playback and direct audio input.


FIGURE 11.8 Absynth patch window.

Envelopes

DVD example 11.9 is a video featuring the default patch in Absynth. When we play this patch, we realize there is a hidden envelope somewhere. The envelopes view reveals a set of graphic envelopes (Figure 11.9). This view contains all the envelopes and has a list of available envelopes that controls which are displayed. Keeping the envelopes in a view remote from the oscillator may seem awkward, but it is not much of a problem. The ability to compare the different envelopes easily is essential to programming finesse. The +New button above the list creates a new envelope. The process includes choosing a default curve to use and selection of the parameter to control.

This approach represents a substantial change in the concept of envelope. The usual paradigm is that an envelope generator is a device that creates a data stream to be routed to destinations.


FIGURE 11.9 Absynth envelopes.

In Absynth, an envelope is a description of how a parameter should behave when a note is triggered. This is a bit like the track automation scheme found in DAW software. There can be more than one envelope for a parameter, and envelopes for different parameters may be linked, effectively giving one envelope control of several features.

So far our envelope options have been limited to the four segments of ADSR, but a graphic envelope can be as complicated as you like. The envelope drawing feature in Absynth is fairly typical. Each corner of the envelope is a break point that can be selected and moved by clicking on a box. A new break point can be created by right-clicking (or control-clicking) on the line. Numeric details of a selected break point are shown in an information window. The time and value can be precisely set by editing these numbers. Note that the time display may be absolute or relative to the previous point. The envelope value is displayed in terms of the controlled parameter.


FIGURE 11.10 Absynth envelope with exponential segments.

One oddity of Absynth is that the envelopes can only reduce the value of a parameter. The envelope actually has a range of zero to 100 percent. This means if you want to generate a pitch slide up, you have to set the transposition of the oscillator at the top of the slide range. The actual transposition will be shown in the envelope break point values.

The envelope line segments are curved, and the curve may be adjusted by dragging a small box in the center of the segment or by editing the slope parameter. The effect of curved envelopes is subtle and depends on what is controlled. Since we experience loudness in a logarithmic way, amplitude control sounds most natural when the envelope has an exponential shape. Absynth envelopes default to a logarithmic curve on attacks, probably because analog envelope generators did so. This results in an attack that is a bit crisper than a linear ramp with the same end point. Changing the attack curve will produce subtle variations. The release segments are exponential, which produces a smooth fade. The envelope shown in Figure 11.10 has break points placed at 3 dB intervals with even spacing in time. If you look closely at the vertical spacing of each step, you will note that the -6 dB point is centered in the space. Since pitch is controlled by the transposition parameter, the exponential frequency-to-pitch relationship is already accounted for and straight envelopes are appropriate.

There are several modes of operation for the envelope. They are similar to those found in many programs and include the following.

Release mode, also known as one-shot, simply runs through the entire envelope on each note. This is useful for percussive sounds where the duration of the note should have no effect.


though it were in release mode; others, including Absynth, skip to the next point after sustain. The transition time is the relative time of the target break point. Loop mode repeats the range between the sustain point and a previous point for as long as the key is held. The transition time from the sustain point to the loop point is the relative time of the loop point. Absynth has a cursor that shows how the envelope is scanned, but it is somewhat misleading in loop mode. It will traverse the segment before the loop point, but the values produced will be a smooth transition from sustain to loop. Retrigger mode is not as common as the others. In this mode, the envelope will be restarted after the duration of a measure. In most implementations of retrigger, the envelope plays through and is repeated after the final release. In some programs retriggered envelopes are provided instead of LFOs. Control driven is a final option that is really a mapping function, where the value of an external control (not time) determines the position of the envelope cursor. Some features of the envelope can be controlled by external MIDI data. MIDI control is independent for each break point. Knobs could be set up to control each break point, but if you want one knob for overall control, the entire envelope should be selected when MIDI is assigned. The source of control can be velocity, pan, or volume as well as arbitrary control change messages using a concept called macro controls. Macro controls are an intermediate step between the MIDI data and the parameters to be controlled. When you set up a patch, you might assign Macro B to control several parameters. Then you would assign MIDI control 16 to control Macro B. The use of macro controls limits the number of MIDI inputs possible but makes it easy to move patches from one environment to another. A single control might affect a dozen parameters in a complex patch. If these connections were directly assigned to MIDI, it would be a bit of work to change from, say, an Oxygen controller (which provides controls 10 to 17) to a Korg micro (controls 16 to 31). Using the macro system means each connection source only needs to be changed once; in fact, many applications consider such assignments global data and save them as preferences or library files. Macro controls are used in a lot of synthesizers under a variety of names such as patch cords, master controllers, and control sources. It is so common to use an envelope to control the effect of an LFO that the designers of Absynth have integrated an LFO into the envelope mechanism. If the LFO option for an envelope is turned on, the result is like that suggested in Figure 11.11. The envelope controls an LFO waveform that affects the target parameter. The parameters for the LFO itself are specific to each break point, so the LFO can speed up and vary in depth. You will remember from chapter 10 that this mechanism is equivalent to an LFO and amplifier with envelope control.
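
A break-point envelope of this kind is easy to model outside of any particular synthesizer. The sketch below (Python with NumPy) renders a list of break points into a control signal, with an exponent standing in for the adjustable curve of each segment, and then uses the result to scale an LFO in the manner of Figure 11.11. The break-point values, control rate, and curve convention are invented for illustration and are not Absynth's internal format.

    import numpy as np

    RATE = 1000  # control points per second (an assumption, not Absynth's rate)

    def render_envelope(points, rate=RATE):
        # points: list of (time_seconds, value, curve) break points.
        # curve > 1 bends a segment one way, curve < 1 the other; 1.0 is linear.
        out = []
        for (t0, v0, _), (t1, v1, curve) in zip(points[:-1], points[1:]):
            n = max(1, int((t1 - t0) * rate))
            x = np.linspace(0.0, 1.0, n, endpoint=False) ** curve
            out.append(v0 + (v1 - v0) * x)
        out.append(np.array([points[-1][1]]))
        return np.concatenate(out)

    # Four break points: fast attack, decay to a sustain level, release.
    env = render_envelope([(0.0, 0.0, 1.0), (0.01, 1.0, 0.5),
                           (0.3, 0.4, 2.0), (1.0, 0.0, 2.0)])

    # An LFO whose depth follows the envelope, equivalent to an LFO plus
    # an amplifier with the envelope on the amplifier.
    t = np.arange(len(env)) / RATE
    lfo = np.sin(2 * np.pi * 5.0 * t)      # 5 Hz vibrato
    control = env * lfo                    # envelope-scaled LFO output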


FIGURE 11.11 Absynth envelope with LFO.

Oscillator Parameters Figure 11.12 shows a close-up of the oscillator. You’ll notice there is no pretense of analog controls here, this is unabashedly a computer program. There is even an Edit menu with familiar copy and paste possibilities. Seldom changed parameters are hidden under tabs. The Uni tab reveals controls for doubling the oscillator. This sound can range from slow beating to sour clusters depending on the number of voices in the double and the amount of detuning. Absynth includes a randomization of detuning, which is unpredictable but interesting. Waveforms The main tab in the oscillator shows the mode of operation and the essential parameters, including a small snapshot of the waveform. In most modes the waveform is the center of interest. The waveform menu contains more than forty simple waves, including sine, various filtered sawtooth waves, buzzes, vocals, even something that is called noise but is not particularly convincing (surprisingly, true noise is the one option left off this module). You can play a waveform simply by highlighting it in the list. This is where the spectrum analyzer and oscilloscope displays are most valuable. When you get down to the waves labeled “inharmonic,” you will see wobbles in the scope waveform that indicate this is not a precisely repeated wave. DVD example 11.9 includes a brief tour of the Absynth waveforms. A second column of options are so-called morph waves. These are pairs of wave shapes, and the oscillator can make a controlled interpolated transition from one to the other. To test this, you need to define a morph envelope—assign it to Osc A main morph. If you make that a looping envelope, the sound transitions will be clear. Again, visual display of the waveform and spectrum is more informative than anything else. One thing you will discover is how difficult it is to find names for waves that are descriptive of what they actually sound like. The most fascinating option in the waves display is the ability to create a new waveform. We learned in chapter 10 that digital synthesizers generate sound by fetching values from a wavetable, a list of sample values for one cycle. Wavetables


FIGURE 11.12 Absynth oscillator.

in Absynth can be designed by drawing the waveform directly or by manipulating a spectrum. The waveform drawing approach is entertaining, but you will soon find it is difficult to get predictable results. (If you want unpredictable results, go for it!) There is a straight line tool, a curve tool, and a wave stretch tool. These work in conjunction with markers in the window. With the line tool selected, a click in the window will produce a straight line from the existing value at the marker to the mouse. As you move the mouse, the line replaces some portion of the wave. The curve tool is similar, and the stretch tool modifies the wave in the range between two markers. DVD example 11.10 shows spectrum manipulation in action. With this approach you can both see and hear the effects of adjusting harmonics. Absynth even lets you change the phase of the partials, which is quite a rare feature. What is happening in the spectrum window? The line heights are the data for an inverse Fourier transform which is used to fill the wavetable. When working directly on the waveform, it is useful to have spectrum analyzer and oscilloscope plug-ins open, so that you can see the changes in waveshape as you manipulate the spectrum or viceversa. No one learns wave design to the point where they can hit precisely the right sound on the first try—the results always need refining. And remember, the wavetable is a static view of something that will be constantly changing. The important result is to have a good idea of what sounds are available as source material. An excellent exercise is to load in a sine wave, then add a few partials at a time. Another practice technique is to capture a spectrum of recorded sound in the analysis plug-in and see how close you can come to replicating it. Oscillator Modes The oscillator can do more than generate signals from wavetables. There are a total of eight modes available, including sample playback and audio in. DVD example 11.11 shows a tour of the modes. The others are as follows.


Double mode adds a second oscillator. Unlike the Uni control, the extra oscillator (with parameters under the Mod tab) can have independent waveform, envelopes, and LFO. You can make a transition from one wave to the other by applying an envelope to the balance parameter. FM is frequency modulation in a more flexible implementation than in Remedy. The functions here are basic but provide a rich set of dynamic sounds, especially when envelopes are brought in. Ringmod is amplitude modulation. The effect is similar to FM, at least as far as the implementation goes. It is useful for adding a burst of inharmonic material at the very beginning of the note. Set a fixed frequency on the Mod tab and use a brief envelope. This is one of those processes where you reduce the envelope time until you can barely hear the effect. Fractalize is unique to Absynth. It modifies the waveform to be self-similar by adding in smaller versions of the shape. This is more or less equivalent to adding harmonics, except the harmonics are not sine waves. Modifying the amount parameter produces some nice spectral crossfades. The effect on the organ waveforms is exactly like playing with the drawbars on an old Hammond. Sync granular is a form of granular synthesis (explored in chapter 13). The waveform is cut into sections (called grains) about 10 milliseconds long. These are then replayed, which normally recreates the waveform, much as the still images of a movie recreate the action of the characters. The grains are actually overlapped as set by the density control (otherwise, you would hear a buzz at the grain rate). The pitch of the grains can be modified, but the overlapping preserves the original duration. Changing the grain rate modifies duration without affecting pitch. A common effect is to randomize (scatter) the time of the grains, which makes a noisy, bubbly sound. The granular oscillator mode applies these effects to samples (you have to be in sample mode to load them), with added parameters to change duration and randomize pitch or amplitude.
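
The wavetable-from-spectrum idea mentioned above (the drawn line heights feed an inverse Fourier transform that fills the table) can be sketched in a few lines. The harmonic amplitudes, the table size of 2,048 samples, and the normalization here are assumptions made for the example; Absynth does not document its internals.

    import numpy as np

    TABLE_SIZE = 2048          # assumed table length
    amps   = [1.0, 0.0, 0.5, 0.0, 0.33, 0.0, 0.2]   # harmonics 1-7, "drawn" by hand
    phases = [0.0] * len(amps)                       # Absynth also lets you set phase

    # Build a half spectrum: bin k holds harmonic k.
    spectrum = np.zeros(TABLE_SIZE // 2 + 1, dtype=complex)
    for k, (a, p) in enumerate(zip(amps, phases), start=1):
        spectrum[k] = a * np.exp(1j * p)

    # The inverse FFT fills one cycle of the wavetable.
    wavetable = np.fft.irfft(spectrum, n=TABLE_SIZE)
    wavetable /= np.abs(wavetable).max()   # normalize to full scale

Playing the table back at different pitches is then just a matter of stepping through it with a phase increment proportional to the desired frequency.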

Modifying the Waveform

Waveshaping

The second slot in an Absynth channel can contain a signal processor chosen from several types. One option is a waveshaper. Waveshaping is controlled distortion. The operation is similar to the wavetable function that creates the signal. In that process, time is used as an index to a table that contains the wave to be synthesized. Waveshaping takes the resulting signal value as the index to a second wavetable. Finally, the value from the second table is output. Figure 11.13 shows the process. There are three times shown for the input. The value at time A is used to find an index in the shape table—since the value at time A is fairly high, it looks


FIGURE 11.13 Waveshaping in Absynth.

well into the shape table to the index marked Va. The value from the shape is copied to the output. You can imagine the value from the input hunting around the corner of the triangle. Since the input value at time B is zero, a zero is returned from the shape table. This happens at the beginning and end of the input cycle as well as the middle. The input value at time C is low, so a value from the left side of the shape table is returned. The spectrum of the output is quite rich, as you may guess from the sharp corners. However, if the input amplitude is reduced to less than half of full scale, only the center of the shape table would be used, resulting in a pure waveform. We can hear this as the oscillator envelope reduces the level at the tail of the note. The first half of DVD example 11.12 demonstrates the waveshaper. Waveshaping has been a constant, if inadvertent, feature of analog synthesis from the beginning. No real transistor or integrated circuit has perfect amplification. The specification for such devices always includes a transfer function, which plots voltage out for voltage in. Figure 11.14 shows an example. A perfect transfer function would be a straight diagonal line. (Quality circuits are designed by keeping the voltages within the center of the range, which has little curve.) These curves may remind you of compression curves from chapter 5, but remember that compression is based on average signal and changes slowly. A compressor that changed instantaneously would produce results similar to waveshaping. In Absynth, all of the oscillator waveforms are available as shapers, and it is useful to design some specifically for waveshaping with the new wave feature.

Ring Modulation and Frequency Shifting

When the second slot module is in the Mod type, two versions of ring modulation are available. The standard version is exactly like the ring modulation available in

FIGURE 11.14 Transfer functions in analog electronics.

the oscillator, but here the whole range of oscillator modes is available as a carrier. Experiment with all of them. (I find ring modulation on audio input particularly interesting.) The frequency shift version can suppress either upper or lower sidebands. The result will be either the oscillator minus the modulator frequency or the oscillator plus the modulator frequency. The choice is made with + or – buttons. When the modulation frequency is set in ratio mode, the shift will be a constant interval. When the modulation frequency is in fixed mode, the pitches will be strange—in fact, the lower part of the keyboard may be reversed. The harmonic structure of the sound changes from note to note, as you can hear in the second part of DVD example 11.12. The frequency shifter also includes a feedback feature. This is especially effective when used with shift up. Assume a shift of 100 Hz— running the signal through the shifter again and again will add 100 Hz each time, so the spectrum will sport several components at 100 Hz intervals. This effect is particularly tasty with envelope control. The sideband suppression is done with a 48 dB per octave filter, so combinations of high carrier frequency and low modulator frequency will produce both sets of sidebands.
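
Returning to the waveshaper for a moment, the table-lookup process of Figure 11.13 is compact enough to state as code. The shape table below is invented (linear through the middle, folding back at the corners) so that a quiet input passes nearly unchanged while a loud one picks up harmonics, as described above; it illustrates the indexing, not Absynth's actual shaper.

    import numpy as np

    N = 1024
    idx = np.linspace(-1.0, 1.0, N)
    # An invented shape table: straight through zero in the middle (so quiet
    # signals pass cleanly) and folding back toward zero at the corners.
    shape = np.where(np.abs(idx) <= 0.5, 2.0 * idx,
                     np.sign(idx) * 2.0 * (1.0 - np.abs(idx)))

    def waveshape(signal, table):
        # Each input sample (-1..1) becomes an index into the shape table.
        pos = (signal + 1.0) * 0.5 * (len(table) - 1)
        return table[np.round(pos).astype(int)]

    t = np.linspace(0.0, 1.0, 48000, endpoint=False)
    quiet = waveshape(0.4 * np.sin(2 * np.pi * 220 * t), shape)  # nearly pure tone
    loud  = waveshape(0.9 * np.sin(2 * np.pi * 220 * t), shape)  # rich in harmonics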


Filters

Absynth is a good platform for exploring filters. There are quite a few available, and it's easy to switch between them to compare effects. DVD example 11.13 shows how filters are selected and set. Numerical values are adjusted by clicking the diamond-shaped buttons above and below the value. This is a tedious method of data entry but is necessary because some hosts restrict the use of the number keys by plug-ins. Envelopes are applied in the envelope window.

Low-pass Filters

There are several choices of low-pass filters. The filters named 2 pole, 4 pole, and 8 pole are models of analog designs; the filters named -6 dB, -12 dB, and -24 dB are purely digital designs. Comparing the 4 pole to the -24 dB version (remember, a pole provides 6 dB of slope per octave), we find the analog model is slightly brighter. Also, the analog model will oscillate if the resonance is turned up all the way. You will certainly notice other differences that can't be put into words. Figure 11.15 shows the response of the analog-style 4 pole, followed by the digital -24 dB filter with the resonance at maximum. The 4 pole and -24 dB low-pass filters work nicely with complex envelopes when used to gate the signal. Some audio examples are given in DVD example 11.14.

High-pass Filters

The high-pass filters are primarily useful for flavoring the sound; only -6 dB and -12 dB slopes are provided. Figure 11.16 shows a high-pass spectrum compared to the source. The high-pass filter is also effective when swept by an envelope, but in this case the envelope must be upside down. That brings the high partials in first, giving an accordion-like or vocal sound (DVD example 11.15).

Band-pass Filters

The band-pass filters in Absynth include a Q control where all of the other filters have resonance. The difference between the two is subtle but significant. Resonance, which emphasizes the level at the cutoff frequency, is achieved by feedback. If this reaches 1.0, the filter will be unstable. An analog filter oscillates; a digital filter does something much more spectacular (the resulting sound has been known to blow out speakers, so you aren't going to see it in a commercial product). A band-pass filter has two cutoff points, and the frequency difference between the two is the bandwidth. An old measure for filters is something called the quality factor, which is the ratio of the center frequency to bandwidth. This is now called Q and is the preferred way of adjusting filters for music. If the Q is 1, the bandwidth is an octave. A Q of 2 gives a half-octave bandwidth, and so on. The Q can be very, very high (up to 1,000, in fact), which reduces the output to a single sine tone. A high Q can produce a loud signal, so there is a dB control to tame it. One interesting trick is to use a rich waveform fixed at a low frequency, then use a high Q filter

FIGURE 11.15 Effect of modeled analog and pure digital low-pass filters on a sawtooth spectrum (upper panel: analog-style 4 pole; lower panel: digital 24 dB per octave).

to emphasize the harmonics. This produces notes that pop out of the drone (DVD example 11.16). Many electroacoustic composers use a band-pass filter pretty much the same way as a low-pass filter, gating with an envelope, but band-pass filters are best used for emphasizing part of the spectrum. Using the transposition value to pick the harmonic to accentuate allows subtle tuning of the timbre. Do this with the harmonic series in mind, which in semitones goes 12, 19, 24, 27, and so on. The band-pass filter is also excellent for creating formants—regions of fixed frequency that are emphasized no matter what pitch is played. Formant filters will usually have a low Q, typically around 2.

All-pass Filter

An all-pass filter changes phase, with a 90° change at the center frequency, but does not change amplitude. An all-pass filter (Figure 11.17) by itself has little or no audible effect on a signal, but combinations are a different matter. If the phase-changed signal is combined with the original, there will be some low- or high-pass effect; and if the outputs of two all-pass filters of different frequency are mixed, there will be a

FIGURE 11.16 Effect of a high-pass filter on a sawtooth spectrum (upper panel: input signal; lower panel: high-pass filter).

notch. With most synthesizers the user assembles all-pass networks from separate modules, but the all-pass filters in Absynth combine several elements to provide one, two, or three notches. You can hear the notches best if you sweep the filter frequency. The three-notch spectrum is illustrated in Figure 11.18. Increasing resonance on Absynth's all-pass filters changes the frequency of the notch and increases the amplitude at the peaks. This can produce some surprisingly loud notes. Since the frequency of an all-pass filter is the point of 90° phase change, and the notches in the spectrum are the result of undocumented filters interacting, the frequency shown in the Absynth module is not clearly related to any notch or peak. An all-pass filter is applied in DVD example 11.17.

Notch Filters

The notch filter produces an effect similar to an all-pass network, that is, a missing section of spectrum (Figure 11.19). However, the notch produced can be much broader, as set by a bandwidth control. Since this is actually a low-pass filter combined with a high-pass filter, the depth of the notch increases with the bandwidth. A bandwidth of two octaves should produce a notch of 12 dB. This type of notch is typical of old modular machines that used independent filters for notching or, to

FIGURE 11.17 Effect of Q on a band-pass filter (upper panel: Q = 0; lower panel: Q = 12).

FIGURE 11.18 Effect of an all-pass filter on a sawtooth spectrum.


FIGURE 11.19 Effect of a notch filter on a sawtooth spectrum.

use a better term, band rejection. The notch produced by a state-variable or multimode filter is deeper and narrower, controlled by the resonance. Resonance on a band-reject filter pair emphasizes the signal at the cutoff points and has no effect on the depth of the notch. In fact, a two-filter notch spectrum can display two peaks with a deep valley between them. DVD example 11.18 demonstrates the sound of a notch filter swept across a rich waveform, followed by a wave swept past a fixed notch.

Comb

Absynth includes a comb filter in the filter list (Figure 11.20). Recall from chapter 5 that comb filtering is an effect of delay—when you combine a delayed signal with the original, the result emphasizes the harmonics of the delay frequency. (The delay frequency is one over the delay time.) Feedback intensifies the effect, and enough feedback will cause oscillation. When delay is treated like a filter, the results depend on the ratio between the oscillator and comb frequency. If they are the same, the filter only changes the amplitude of the signal as feedback is added. If the comb filter is twice the oscillator frequency, feedback emphasizes even harmonics and reduces odd ones, including the fundamental. Negative feedback emphasizes odd harmonics. Either way, feedback exceeding 0.9 will tend to explode in amplitude, so a gain adjustment is provided. With other ratios a variety of harmonics can be amplified or suppressed. When the comb filter is fixed in frequency, some notes will be altered drastically, others not at all. This is another situation where it can be interesting to fix the oscillator frequency and play the filter. Envelopes and LFOs produce wild sonorities, especially as the comb frequency approaches 0 Hz. DVD example 11.19 demonstrates comb filtering in Absynth.
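
The delay description of a comb filter translates directly into code. The sketch below is a generic feed-forward plus feedback comb, not Absynth's; the 60 Hz sawtooth and 240 Hz comb frequency mirror the situation in Figure 11.20, and the feedback amount is an arbitrary choice.

    import numpy as np

    def comb(signal, rate, comb_freq, feedback=0.7, mix=1.0):
        # Feed-forward plus feedback comb: the delay time is 1 / comb_freq.
        delay = max(1, int(round(rate / comb_freq)))
        out = np.zeros_like(signal)
        buf = np.zeros(delay)
        for i, x in enumerate(signal):
            delayed = buf[i % delay]
            out[i] = x + mix * delayed
            buf[i % delay] = x + feedback * delayed   # feedback sharpens the peaks
        return out

    rate = 48000
    t = np.arange(rate) / rate
    saw = 2.0 * ((60.0 * t) % 1.0) - 1.0        # 60 Hz sawtooth, as in Figure 11.20
    combed = comb(saw, rate, comb_freq=240.0)   # emphasizes harmonics of 240 Hz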

FIGURE 11.20 Effect of a comb filter on a sawtooth spectrum (harmonics of the 60 Hz input; harmonics emphasized by a comb filter at 240 Hz).

Multiple Sources

Coalescence

Many instruments exploit the rich possibilities of multiple sound sources. We only have to look at the doubled and tripled strings of a piano or the ranks of a pipe organ to understand the principle. Two sound sources are often better than one, if they are balanced well enough to produce a unified sound. With acoustic instruments, the problem is most often getting two sound producers well enough matched that they complement each other. With electronic synthesis, the problem is getting just the right difference. There is an effect, which I call coalescence, where the separate components of a patch become a single tone. If you experiment with two oscillators, you can discover this principle. If they are tuned to different pitches, two distinct tones are heard. If the pitches are slowly brought together, there is a magic point where only one tone is heard. Conversely, if you start with the two oscillators in unison, there is a point where they separate, and this is not the same point. Between these two points is a narrow range where one sound exists that is somehow richer than a pair of oscillators at precisely the same pitch. The secret to voicing is to find the places where the sound coalesces without becoming dull. DVD example 11.20 demonstrates coalescence.

Phase

There are several parameters that can produce rich, cohesive sounds. Phase is only perceivable when two oscillators are combined. This is illustrated in DVD example 11.21. Starting with a patch of a sawtooth wave into a low-pass filter, adding an identical oscillator and filter does not change anything except the amplitude of

FIGURE 11.21 Mixed waves with changing phase (A: phase 0°; B: phase 90°).

the sound. However, a single change—turning the phase of one oscillator to free mode—produces the results heard 20 seconds into the example. Not only is the sound changed, it is somewhat different on each note. When oscillator phase is synced, each note begins at the same point in the waveform. This is usually the beginning, but it is adjustable with the phase control. When the phase of an oscillator is free-running, the starting point will be different on each note. When that is combined with another oscillator, the phase interference will produce slight variations in the levels of the harmonics. The waveforms in Figure 11.21 show two different notes of the same pitch. Waveform A shows the two oscillators almost in phase, whereas B shows a more complicated form. Waveform B has noticeably more bite in the sound. If you enable phase control on the second oscillator and play with it, you will discover just how much variation is possible with this often neglected parameter. Adding an envelope to control phase and making the envelope respond to velocity will make the sound change subtly from note to note. This is not obvious when listening to or even playing the patch, but it brings extra life to the music.

Detuning

Detuning is the traditional approach to building richness and works well. We have already seen detuning in the oscillator's double mode, but if two sources are used, a slow envelope on the second will bring the effect in gradually. A frequency difference on the order of 3 cents (0.03 in the transposition field) gives some bite to the tone without going sour. If you try this with a purer tone such as a triangle, you will discover why pianos usually have three strings per note. The phase interference will cause the fundamental to noticeably fade in and out. If you add a third oscillator and transpose it down, the effect will be mostly up in the harmonics. The fundamental will vary, but only slightly. This trick can be improved by transposing both secondary oscillators up an octave (or two if you want to keep a triangle sound). In either case, reduce the secondary envelopes until the sound coalesces—you are now blending with the second harmonic of the main oscillator. DVD example 11.22 demonstrates the effects of detuning at the unison and octave.
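
A little arithmetic makes detuning amounts easier to judge. A cent is 1/100 of a semitone, so a detune of c cents multiplies the frequency by 2 to the power c/1200; the snippet below works out the 3-cent case and the beat rate it produces at A440. This is plain math rather than anything specific to Absynth.

    import numpy as np

    def detune(freq, cents):
        return freq * 2.0 ** (cents / 1200.0)

    f0 = 440.0
    f1 = detune(f0, 3.0)            # about 440.76 Hz
    beat = f1 - f0                  # roughly 0.76 beats per second at A440

    # Two slightly detuned oscillators summed: the fundamental fades in and
    # out at the beat rate, which is why pure waves show the effect so strongly.
    rate = 48000
    t = np.arange(2 * rate) / rate
    pair = 0.5 * (np.sin(2 * np.pi * f0 * t) + np.sin(2 * np.pi * f1 * t))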


Transformation

Fading a second oscillator in slowly suggests a way to make an evolving waveform. Set one oscillator at filtered saw and another at sine. The saw oscillator should have a quick attack followed by a fade down of about 12 dB. The second oscillator fades in slowly, matching the drop in the first. It will take a bit of fiddling to get the envelopes right. Figure 11.22 shows how the envelopes fit together (most of the envelope window displays the top 18 dB of the amplitude range). The exponential fade curves produce a steady change in sound. A third oscillator with a slow attack can bring a second transformation. Crossfading oscillators produce tones that sound different on chords and quick notes, an effect that is popular with voice and string sounds. The natural progression seems to be from rich to simple, such as from square to triangle (transitions that go the other way sound backward). The range of possible transformations can be expanded by applying crossfade envelopes to various parameters of parallel paths. DVD example 11.23 demonstrates some crossfaded timbres.

Transients

Many sounds are created by a violent action such as striking or plucking. One characteristic of these sounds is that the initial waveform has little resemblance to the ultimate tone. That initial burst of sound is very short and is never heard well enough to be identified as something distinct. Here is another experiment: start a new sound with only one oscillator with a filtered sawtooth wave. Set the amplitude envelope to only two segments in release mode. Now experiment with the attack, starting as short as possible and increasing it until the click goes away. You will discover that there are distinct timbres of click, with a 0.1 ms attack being noticeably different from a 0.2 ms attack. At some point, probably around 0.5 ms, the click goes away. This is still a very short attack; increasing the time further reveals the time at which the sound just starts, with neither a pop nor a fade. It's probably 5 ms or less. Now set a similar time for the release. This 10 ms window is the maximum time span for a transient. Now add a second oscillator with a triangle tone and default envelope. The two sounds will be distinct, as at the beginning of DVD example 11.24. Reducing the level of the transient envelope should reveal another kind of coalescence. At this point, you might swear the transient is inaudible, but fading the first oscillator out demonstrates the transient's role in this sound, adding crispness to the attack. The latter part of the example explores changes in the length of the transient. If it is too short relative to the attack of the sustained tone, two distinct parts are heard. The next experiment would be to explore other waveforms for the transient. As a general rule, the more complex the sustained tone, the more complex the transient should be. Of course, once you have done this exercise, you will be sensitized to transients and will find you can hear them in all manner of sounds.

Formants

Most sounds have some characteristic that changes with register. It's not enough just to play the same waveform at a faster rate; there should be changes in the timbre across the keyboard.

FIGURE 11.22 Crossfade envelopes in Absynth (amplitude envelopes for oscillators A, B, and C).

One easy way to get this in a subtractive synthesizer is to establish formants. The resonances of real instruments are generally fairly complicated, with several peaks that are not harmonically related. The filter is the tool for achieving this, but few synthesizers provide the right type of filter, a graphic equalizer. Band-pass filter modules can provide a boost at the desired frequencies, but the reduction in gain in the outer octaves is usually too much. This is mitigated by building a patch that is essentially three identical branches, each with a differently tuned band-pass filter. When deciding which frequencies to emphasize, keep the frequency of each octave in mind. Bass instruments understandably have formants in the bass region, while a lead voice may have one mid keyboard. A bump between 2 and 3 kHz always helps articulation, and higher formants bring out tonal richness. DVD example 11.25 demonstrates low- and high-frequency formants.
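
A three-branch formant patch of the kind just described can be approximated with ordinary resonant band-pass filters. The sketch below uses SciPy's peaking filter; the three formant frequencies, the Q of 2, and the branch gains are plausible values chosen for illustration, not settings taken from Absynth.

    import numpy as np
    from scipy.signal import iirpeak, lfilter

    rate = 48000
    t = np.arange(rate) / rate
    saw = 2.0 * ((110.0 * t) % 1.0) - 1.0          # a rich source to filter

    def formant(signal, freq, q=2.0, gain=1.0, rate=48000):
        b, a = iirpeak(freq, q, fs=rate)           # resonant band-pass around freq
        return gain * lfilter(b, a, signal)

    # Three fixed resonances; the 2.5 kHz bump helps articulation.
    voiced = (formant(saw, 500.0) +
              formant(saw, 1500.0, gain=0.7) +
              formant(saw, 2500.0, gain=0.5))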

Effects

Most modern synthesizers include a module or two that applies effects to the overall sound. These typically include reverbs and delays, things that make a good


impression in the music store regardless of the quality of the actual sounds. I usually ignore these, preferring to make such effects part of a composition rather than a feature of an instrument. Absynth lets you place any modifier or filter in the master position, which can help refine the sound you are trying to build. For instance, a bit of bite can always be added with some subtle waveshaping. Use this with care, however. It’s easy to overwhelm several hours of tweaking and come out with nothing but the sound of the processor.

EXERCISE

Your next challenge is to create a suite of voices of your own, using Absynth or any similar synthesizer. Start with an interesting preset and make small changes in each parameter to get a sense of how the synthesizer works. Then make a pair of presets that are variants, one suitable for percussion, one suitable for long sounds. Test these in the sequencer with some classic MIDI files, then write an original composition with them.

RESOURCES FOR FURTHER STUDY

You may find these useful if you have the specific instruments addressed:

Cann, Simon. 2007. How to Make a Noise: A Comprehensive Guide to Synthesizer Programming. New Malden: Coombe Hill Publishing.

Millward, Simon. 2002. Sound Synthesis with VST Instruments. Tonbridge: PC Publishing.


TWELVE
FM Synthesis

Frequency modulation is such a powerful process that many synthesizers rely on it as the central method of controlling timbre. There are quite a few FM-based synthesizers available, ranging from the simple to the amazingly complex. These include new approaches to the technique as well as software versions of the classic Yamaha X instruments, which are still in use and consistently hot sellers on eBay. The history of FM synthesis is well documented; John Chowning developed it when he was a graduate student at Stanford in 1967, and eventually a patent was licensed to Yamaha, which introduced the DX7 in 1983. That instrument brought synthesizers to a new level of sophistication. Practically every show or recording of the 1980s featured DX sounds. Frequency modulation went out of style during the 1990s, eclipsed by cheap sample players (many containing samples of FM), but it has made a comeback with the recent explosion in software synthesizers. I want to make one thing clear—it was not just the sound of frequency modulation that was exciting. There were plenty of other synthesizers that sounded as fresh and interesting. It wasn't realism either—some of David Bristow's imitative presets were pretty convincing, but they never fooled anybody for long. The feature that made the DX7 everybody's first choice was its expressiveness. The process of frequency modulation is quite sensitive to input parameters, and it is sensitive in a nuanced way. Tiny changes have tiny effects that graduate smoothly to extreme effects if desired. Playing the competition was like playing a bunch of light switches. The notes started, the notes stopped. A particular key always sounded the same unless the performer had mastered the trick of working controls with one hand and playing keys with the other. The DX7 rewarded subtlety; delicate control of touch provided control of timbre that matched that of the violin. You could practice on it and learn to play it better. The main drawback of frequency modulation is the difficulty of programming sounds. The early instruments were programmed by setting lists of numbers. It was difficult to get a grasp on what each number meant, and the sound itself contains no clues that lead back to the parameters. For example, with enough experience in subtractive synthesis you know what to do when the sound is too bright: adjust the filter cutoff. When you hear an ugly FM sound (and believe me, FM can sound ugly) the cure is not so obvious. But now there are tools to help. Most FM


applications feature graphic representations of envelopes and modulation routings, and they respond to parameter changes while notes are sounding. If you have a spectrum display open while you fiddle with a sound, you can see the effect of tweaking parameters. These features enable the user to learn to imagine sounds and program them.

THE MATH OF FM SYNTHESIS

We have already encountered the concept of sidebands in both amplitude modulation and frequency modulation. If you have a carrier of one frequency (Fcar) and a modulator of the same or another frequency (Fmod), new frequency components called sidebands will appear. These sidebands have frequencies that are the sum (Fcar + Fmod) and difference (Fcar – Fmod) of the original frequencies. The difference between amplitude and frequency modulation is that increasing the amount of modulation will only increase the amplitude of the sidebands in AM, whereas in FM, more modulation means more sidebands. These sidebands are at frequencies expressed as Fcar + nFmod and Fcar – nFmod, where n is the sideband number. Don't get this confused with the nomenclature of harmonics—the fundamental is also the first harmonic, but the first sideband is distinct from the carrier. There are in fact two first sidebands, upper and lower. When we put some numbers into the math, we see that if a carrier frequency of 1,000 Hz is modulated by a frequency of 100 Hz, something will happen at 900 Hz and 1,100 Hz. More modulation will produce 800 Hz and 1,200 Hz, and so on. The first question that comes to mind is, What happens when a lower sideband is produced at 0 Hz? A frequency of zero is a direct current, or a constant value that is added to all sample values in the waveform. It's not particularly important, except that we may hear a thump if that wave is turned on too fast. More interesting is that frequency components close to zero are heard as beats or vibrato. The next question is, What exactly does negative frequency mean? The simplified answer is that a signal at -100 Hz sounds the same as 100 Hz, but it's 180 degrees out of phase. You can think of it as a mirror image. If a reflected sideband matches the frequency of another component, it may subtract from the total amplitude. The tricky aspect of FM sidebands lies in their amplitude. The theoretical strength of any sideband depends on the amount of modulation, which is stated as a modulation index. It's the ratio of the deviation of carrier frequency to the modulator frequency. If a modulator of 100 Hz is strong enough to change the carrier by 100 Hz, the index is 1. We will never have to figure this out. In fact, we will never see an index value. All the applications give us is an arbitrary number that is related to the index, but that relation is never specified. There's a lot of guesswork involved when we look at books and try things on real instruments, and I'll explain how to make an educated guess shortly. The effect of changing modulation index is shown in Figure 12.1. Image A is the spectrum of an unmodulated carrier. Image B shows that a small amount of modulation has brought in the first two sidebands. Image C

FIGURE 12.1 Basic FM spectra (panel A: unmodulated carrier; panels B through F: modulation = 1, 1.5, 2, 2.2, and 3.8; linear frequency scale).

shows two more sidebands, and image D has six. But notice the carrier on image D. It is beginning to shorten; in fact, by image E it has disappeared just as the fifth pair of sidebands turns up. In image F, the carrier is back, but the first order sidebands are gone. These transitions are all smooth, so components come and go with no sudden changes in timbre. Another way to think of it is the spectrum gets steadily richer while a pattern of notches sweeps along the sidebands. The expected amplitude for the carrier and each sideband follow a type of curve called the Bessel function of the first kind. In addition to kinds, Bessel curves have orders, and the carrier amplitude follows the function of zero order. The first sideband pair follows the first-order curve, the second pair follows the second-order curve, and so forth. Figure 12.2 shows the first three orders up to an index of 13. Notice that the graphs all go negative from time to time. That indicates a phase inversion (-180°) which will affect how partials combine. Another factor that affects combinations (but doesn’t show on these graphs) is that lower odd-order sidebands are also inverted. I’ll explain how some of this applies as we look at real examples. If you need to decipher how a particular application specifies modulation index you can use Figure 12.2 and a spectrum analyzer. The trick is to look for the nulls of


FIGURE 12.2 Sideband strength as a function of modulation index (Bessel curves of orders 0, 1, and 2, with the carrier null and the first and second sideband nulls marked).

the carrier and first few sidebands. You can compare the value that produces a null with the curves in Figure 12.2 and read the index directly. For instance, the carrier disappears for the first time at modulation index 2.2. That corresponds to image E in Figure 12.1. The application I was using to get these spectra showed the modulator amplitude as 43. You should map out all of the nulls shown in Figure 12.2. A cautionary note: If you do this, use a high-frequency carrier and a modulator about 1/20th of that. Otherwise the reflected sidebands will make the spectrum difficult to interpret. (Figure 12.1 was done with a carrier of 8 kHz and a modulator of 500 Hz and displayed on a linear scale.)
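
The Bessel bookkeeping is easy to check numerically. The sketch below uses SciPy's Bessel function of the first kind to predict the signed level of the carrier and the first several sideband pairs for a given index, folding reflected negative frequencies back with inverted phase as described above. It reproduces the general behavior plotted in Figure 12.2; the amplitude scale of any particular synthesizer will differ.

    from scipy.special import jv    # Bessel function of the first kind

    def fm_spectrum(f_car, f_mod, index, orders=8):
        # Predicted component frequencies and signed amplitudes for simple FM.
        components = {}
        def add(freq, amp):
            if freq < 0:                       # reflected sideband comes back inverted
                freq, amp = -freq, -amp
            components[freq] = components.get(freq, 0.0) + amp
        add(f_car, jv(0, index))               # carrier follows the zero-order curve
        for n in range(1, orders + 1):
            add(f_car + n * f_mod, jv(n, index))
            add(f_car - n * f_mod, (-1) ** n * jv(n, index))  # lower odd orders inverted
        return components

    # Settings matching Figure 12.1: 8 kHz carrier, 500 Hz modulator.
    for freq, amp in sorted(fm_spectrum(8000.0, 500.0, 2.2).items()):
        print(f"{freq:7.0f} Hz  {amp:+.3f}")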

THE SOUND OF FM SYNTHESIS

There is no characteristic FM sound, except perhaps for a set of DX7 presets that were heavily overused in the early days of that instrument. Frequency modulation can be used to produce blatantly electronic tones, terrifying clangs, uncannily realistic plucked strings, and even somewhat human voices. We can begin to explore frequency modulation using simple FM synthesis plug-ins. Many of these are available as inexpensive or free downloads. You may also find FM provided as a feature


of more complex synthesizers such as Absynth, which we explored in chapter 11. Look for a plug-in that clearly shows frequency ratio and modulation index and includes envelope control of modulation. The early examples in this chapter were prepared using EFM1, a basic FM instrument that is part of Logic Studio. Elegant voicing requires better control of the envelopes than graphic sliders allow, so look for an application that allows numeric entry or at least shows time in milliseconds. (In Logic, this is always possible via the controls view.) Then the settings presented in Tables 12.1 to 12.7 should be easy to adapt.

The Basic Sound Set The best way to become familiar with the entire range of FM sounds is to explore the effects produced by various modulation ratios systematically. As always, we will begin to explore the instrument by neutralizing as many effects as we can. Load the default setting and change it to provide a plain sine wave with no modulation. On the EFM1, this requires setting modulation depth to zero and unchecking fixed modulator frequency. The rather arty layout of EFM1 (Figure 12.3) tells you nothing about the architecture (Figure 12.4), but it can be worked out from the documentation. The classic FM design is based on the concept of an operator, which is oscillator, amplifier, and envelope generator. In other words, beep with no filter. It takes two operators to do frequency modulation. One operator is the carrier, which we hear, and the other is the modulator, which we don’t hear. The modulator output is connected to the frequency control of the carrier. The modulator signal changes the pitch of the carrier, but it does this so quickly it’s really just shaping the carrier waveform. Ratio 1:1 The simplest frequency modulation effect can be heard with the carrier and modulator tuned to the same frequency. As the modulation is turned on and increased, harmonics will come and go in a complex way. This is demonstrated in DVD example 12.1. The important issue is that the sidebands are all perfect harmonics of the carrier, so we are going to hear a clear pitch. Figure 12.5 shows the harmonics produced by EFM1 with the modulation knob set halfway. This is a bright clear tone. Most FM voicing involves adjusting the carrier and modulation envelopes. The carrier envelope turns the sound on and off in the manner we are used to and the modulation envelope does the job of a filter envelope, adding harmonics at the right time. A short modulation envelope will put a tiny blat at the beginning of the note. DVD example 12.2 demonstrates. With longer envelopes, we will hear an odd bouncing effect as on DVD example 12.3. This comes from the carrier and first sidebands dropping out as the index sweeps up and back; this sort of thing can be obscured by a bit of vibrato. One of the classic FM sounds with a 1:1 ratio is something like a trumpet. DVD example 12.4 was made with the settings shown in Table 12.1.
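
The two-operator arrangement is small enough to render directly. The sketch below implements the signal flow just described, with the modulator's output added to the carrier's phase; the linear envelopes, the index range, and the 220 Hz pitch are made up for illustration and are not EFM1's actual scaling.

    import numpy as np

    RATE = 48000

    def operator(freq, env, phase_mod=0.0):
        # An operator is an oscillator with an amplitude envelope.
        t = np.arange(len(env)) / RATE
        return env * np.sin(2 * np.pi * freq * t + phase_mod)

    def ramp_env(length, attack, release, sustain=1.0):
        env = np.full(length, sustain)
        a, r = int(attack * RATE), int(release * RATE)
        env[:a] = np.linspace(0.0, sustain, a)
        env[-r:] = np.linspace(sustain, 0.0, r)
        return env

    dur, f0 = 1.0, 220.0
    n = int(dur * RATE)
    mod_env = ramp_env(n, 0.005, 0.8, sustain=3.0)   # index rises fast, falls over the note
    car_env = ramp_env(n, 0.01, 0.3)

    modulator = operator(f0, mod_env)                # 1:1 ratio
    carrier   = operator(f0, car_env, phase_mod=modulator)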

FIGURE 12.3 Logic EFM1 FM synthesizer plug-in.

FIGURE 12.4 EFM1 architecture (modulator and carrier operators, each an oscillator, amplifier, and ADSR envelope; the FM depth control sets the amount of modulation).

FIGURE 12.5 Spectrum of 1:1 frequency modulation.

Parameter       Carrier   Modulator   Units
Harmonic        1         1
Fine            0         0
FM/FM amount    0.3       0.6
Attack          18.0      0.05        ms
Decay           240       0.24        ms
Sustain         1.0       0
Release         46        260         ms
LFO rate/Amt    5.1       -0.01       Hz

TABLE 12.1 Parameter settings for a brassy sound (DVD example 12.4).

This is certainly a bright sound, and with some tweaking it will be trumpet-like on a few notes. Of course we aren’t here to put trumpet players out of work— assigning a label like trumpet to this class of sounds just helps us remember roughly what they sound like.


Parameter       Carrier   Modulator   Units
Harmonic        10        1
Fine            0         0
FM/FM amount    0.0       1.0
Attack          9.2       0.0         ms
Decay           650       0.0         ms
Sustain         0         0.632
Release         650       10000       ms
LFO rate/Amt    1.6       0.15        Hz

TABLE 12.2 Parameter settings for a clavichord-like sound (DVD example 12.6).

Ratio 1:10 When the carrier is several times the frequency of the modulator, the sound has a very rich spectrum. A ratio of 1:10, heard in DVD example 12.5, demonstrates the real essence of FM. As shown in Figure 12.6, sidebands appear above and below the carrier on a spacing equal to the modulator frequency. The spacing appears unequal on this chart because it has an equal octave (logarithmic) scale, but if you read the numbers you will see the spacing is about 250 Hz. This is the modulator frequency and is heard as the fundamental pitch, but the very high carrier frequency produces an effect similar to a high-pass filter. This is pretty raw, but it can be beautiful in controlled doses. The settings in Table 12.2 produce the sound in DVD example 12.6. The magic comes from the modulation envelope. The nearly instantaneous attack and slower decay make the note speak with a full set of harmonics that are quickly muted. The overall decay of the volume envelope produces a note that finishes with a bright, clavichord-like timbre. A tiny bit of stereo detune puts the icing on the cake. Experimentation with these settings (making tiny changes) will produce all manner of plucked string sounds. Bringing the ratio down will produce a more mellow sound like a nylon string. Ratio 6:1 When the modulator is higher in frequency than the carrier, each integer ratio produces a distinct sound. DVD example 12.7 runs through the set from 1 to 15. At modulator harmonic 2 the sound is familiar, a lot like a square wave. You should be able to recognize it as the even harmonics only. A ratio of 3:1 is still familiar, with every third harmonic missing—really missing, not just suppressed a bit. Modulator harmonic 4 produces some odd harmonics again. At 5:1 we get harmonics 1, 4, 6, 9,

FIGURE 12.6 Spectrum of 1:10 frequency modulation.

Parameter       Carrier   Modulator   Units
Harmonic        1         6
Fine            0         0
FM/FM amount    0.0       0.5
Attack          36.0      5.2         ms
Decay           2800      6.3         ms
Sustain         0.0       0.427
Release         920       8800        ms
LFO rate/Amt    0.01      0.14        Hz

TABLE 12.3 Parameter settings for bell-like sounds (DVD example 12.8).

11, 14, 16, 19, and 21. The component pairs spaced at twice the fundamental make the pitch ambiguous; in fact, it begins to sound like two distinct pitches. Higher modulator tunings become even more ambiguous. Figure 12.7 shows the spectrum produced by a ratio of 9:1. The addition of a modulator envelope produces dynamic versions of these sounds. Again the strategy is to let more harmonics through at the attack. The parameters in Table 12.3 will produce the sound of bells as heard in DVD example

FIGURE 12.7 Spectrum of 9:1 frequency modulation.

12.8. These percussive sounds can be easily turned into sustained sounds by raising the volume sustain and reducing the decay and release. These can be refined by adjusting the amount of steady and envelope controlled modulation. Inharmonic Ratios Really outrageous sounds can be produced by detuning both carrier and modulator. This produces inharmonic (noninteger) frequency ratios. Maximum modulation with the modulator harmonic a little below 2 and the carrier a bit below 1 is practically guaranteed to produce chime sounds, as in DVD example 12.9. Table 12.4 gives the settings. Even though there is no LFO on the sound, there are beats everywhere. This example has bell-like envelopes—a slow carrier attack with medium sustain produces tones like bowed glass. You won’t be able to play tunes with this kind of sound (the pitches won’t be the right places on the keyboard), but there will be awesome chords. Fixed Carrier Clicking the fixed button on the EFM1 carrier produces the nastiest sounds available in FM. This disconnects the carrier from the keyboard, leaving it pitched around 30 Hz. The harmonic and fine-tuning controls still work, so a range of base pitches from 0 Hz to 1,200 Hz is available. Modulating a fixed carrier makes a different sound on each note. With a modulator on harmonic 1, the sidebands will be spaced above and below the expected harmonics by the carrier frequency. For instance, a carrier at 100 Hz and a modulator at A440 will produce 100, 340, 540, 780, and 980 Hz. The key a step lower will produce 100, 292, 492, 684, and 884 Hz. Fixed carrier modulation sounds like a chord in high octaves or ripping cloth in low


Parameter       Carrier   Modulator   Units
Harmonic        1         2
Fine            -0.460    -0.130
FM/FM amount    0.413     1.0
Attack          52.0      0.0         ms
Decay           1700      5.2         ms
Sustain         0.0       0
Release         1100      5200        ms
LFO rate/Amt    5.1       0           Hz

TABLE 12.4 Parameter settings for inharmonic sounds (DVD example 12.9).

notes, as you can hear in DVD examples 12.10 and 12.11. The constant pitch of the fixed carrier is sometimes oppressive, so some FM synthesizers have a high-pass filter to take it out. The settings in Table 12.5 modify the sound with an inverted modulation envelope. The tone starts out rich with heavy modulation and softens up as the envelope slowly attacks. On decay, the harmonics return (DVD example 12.12). Instruments with more elaborate envelopes than this can generate a whole piece on one note. If there is no modulation to start with, an inverted envelope sounds essentially the same as a normal one. Percussion envelopes on fixed carrier tones can, with a bit of fussing, sound much like drums (see Table 12.6 and DVD example 12.13).

Carrier at Zero

Harmonic zero is worth exploring in detail. It means the pitch of the carrier is zero and will not change with the keyboard input. Modulating a frequency of zero is waveshaping. A sine waveform as the shape will produce odd harmonics of the modulator. If the carrier frequency is raised a tiny bit, the missing harmonics will suddenly appear with a vibrato at the rate of the carrier. The vibrato is an alternation of sidebands, evens versus odds, as demonstrated in DVD example 12.14. This effect is a bit heavy, but stereo detuning will mask it some. Modulation envelopes produce many variations on this sound without losing its essential character. DVD example 12.15 demonstrates the effect of an inverted modulation envelope with the carrier at zero. The tone begins bright, quickly mutes, then becomes bright again. If the carrier frequency is increased to a few Hz, the sound becomes rougher. A vibrato appears and becomes dual sidebands closely spaced with the modulator and its harmonics. If the carrier tracks the keys, various bell-like sounds will occur.
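
The note-to-note variety of a fixed carrier follows from the sum-and-difference arithmetic alone. This short calculation repeats the 100 Hz carrier example from the fixed-carrier discussion for two adjacent keys; the function is mine, not part of any synthesizer.

    def fixed_carrier_partials(f_car, f_key, sidebands=2):
        parts = {f_car}
        for n in range(1, sidebands + 1):
            parts.add(f_car + n * f_key)
            parts.add(abs(f_car - n * f_key))    # reflected if negative
        return sorted(parts)

    print(fixed_carrier_partials(100.0, 440.0))   # 100, 340, 540, 780, 980 Hz
    print(fixed_carrier_partials(100.0, 392.0))   # 100, 292, 492, 684, 884 Hz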


Parameter       Carrier   Modulator   Units
Harmonic        3 (fix)   1
Fine            0.45      0
FM/FM amount    0.513     -0.23
Attack          410       3000        ms
Decay           1300      4500        ms
Sustain         0.849     0.330
Release         1100      9100        ms
LFO rate/Amt    5.1       -0.01       Hz

TABLE 12.5 Parameter settings for a negative modulation envelope (DVD example 12.12).

Parameter       Carrier   Modulator   Units
Harmonic        1 (fix)   1
Fine            0.175     0
FM/FM amount    0.125     0.32
Attack          0.11      0.00        ms
Decay           1300      380         ms
Sustain         0         0
Release         330       9400        ms
LFO rate/Amt    5.1       0           Hz

TABLE 12.6 Parameter settings for drum-like sounds (DVD example 12.13).

There is a wide variety of effects in the fractional ratio range from 0 to 0.33 (Table 12.7). That last value, with appropriate percussive envelopes gives a decent guitar with nylon strings. We’ve heard quite a variety of sounds out of this instrument, but we’ve only begun to explore the possibilities of frequency modulation. One thing we haven’t


Parameter       Carrier   Modulator   Units
Harmonic        0         1
Fine            0.330     0
FM/FM amount    0.120     0.32
Attack          0.11      0.0         ms
Decay           1300      380         ms
Sustain         0.0       0
Release         330       9400        ms
LFO rate/Amt    5.1       0           Hz

TABLE 12.7 Parameter settings for carrier at zero (DVD example 12.15).

touched on is the atmospheric possibilities. Such effects are possible with EFM1, but it’s time to introduce a more elaborate instrument.

FURTHER EXPLORATION OF FM SYNTHESIS

The Native Instruments FM8 plug-in has become a classic among software synthesizers, probably because it was modeled on the Yamaha instruments that made FM famous in the first place (Figure 12.8). Unlike some DX instrument emulators, the FM8 application actually improves on the original in many respects. (It also comes with a better instruction manual.) FM8 is not the only advanced FM instrument available. There are several shareware versions around, and many of the more complex applications include extensive FM capability. FM8 is available in a demo version that will let you experiment with the principles discussed here. FM8 includes many features designed to appeal to musicians who do not program sounds, such as a giant sound library, an "easy tweaks" page, and a lot of outboard effects. Since we're here to learn programming, we will work in Expert Mode. But first, look on the effects page and make sure all effects are off. Figure 12.8 shows the operator programming page of expert mode. The heart of this instrument is the patching matrix on the right. Each of the boxes labeled A to F represents a Yamaha-style operator, basic beep with some extras. Box X is a noise module, and box Z is a filter. The faint grey boxes represent potential connections. Patching is done by clicking on a grey box and dragging up. Clicking below E and to the left of F connects E to F. A number appears that indicates an attenuator on the


FIGURE 12.8 Expert mode of the FM8 software synthesizer.

connection—100 is full on. Clicking above F and to the right of E would connect F to E. Clicking just above a box lets you connect it to itself for feedback. (Feedback is often useful in frequency modulation.) To break a connection, double-click on the number. The numbers in the patching matrix determine the character of the patch. In the patch shown in Figure 12.9, operator C is connected to B, which is connected to A. In addition, A is fed back to itself and both A and B are connected to Z. Finally, Z is connected to X, and X is connected to the left output. There are not too many FM synthesizers that offer this amount of flexibility in patching. A more common approach (the one used in the Yamaha hardware) is to provide a list of preconstructed patches called algorithms. For instance, the Yamaha TX81Z gives you a choice of eight algorithms ranging from four independent carriers to three modulators on a single carrier. With six operators, there are too many parameters to show on the screen at once, so FM8 provides navigation buttons to bring up one of a dozen or more useful views. There's a master view for each module, and several that compare selected settings from all modules. FM is all about operators interacting, so we need to be able to do a lot of comparisons. Frequency, waveform, and several other settings for all operators are shown in the Ops pane illustrated in Figure 12.10. Details of each operator can be displayed by clicking the letter of the operator. This view is shown in Figure 12.11. Each operator has many features that should be familiar. There's a frequency ratio value, which combines the harmonic and fine-tuning functions of EFM1. The frequency produced is the ratio times the keyboard

13_Chap12_pp267-306 8/29/13 2:47 PM Page 281

FM SYNTHESIS

281

FIGURE 12.9 Patching in FM8. pitch. The offset is a value that is added to the operator frequency specified by the ratio; this goes up to 9,999 Hz. If the ratio is zero, the offset will set the same frequency on all notes. If the ratio is 1.0 and the offset is 10 Hz, playing an A440 will produce 450 Hz and playing an octave down will produce 230 Hz. The waveform control chooses from a large assortment of waves tailored to provide interesting modulation spectra. The amplitude controls set the output level of the operator. That number is the highest value that appears in the patching matrix. The middle section of the operator view shows amplitude modulation values that are pertinent to this operator. This is one row of a modulation matrix (the full modulation window is shown in Figure 12.36). The possible modulation sources are shown in boxes. Connections are made by a click and drag on the dash below a source. The envelope at the bottom also controls amplitude and can be as complex as needed. Dragging a corner of the envelope moves it while the numerical values are shown at the bottom. Dragging the dot in the middle of a segment reshapes the line segment. New segments can be added by double-clicking on the line. The area between the two vertical markers is the sustain region. This part of the envelope is repeated as the key is held down.
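If you want to check that arithmetic in code, here is a minimal Python sketch of the ratio-and-offset rule just described. The function names are ours, not FM8's:

```python
# Sketch of the operator frequency rule described above:
# operator frequency = (keyboard pitch in Hz) * ratio + offset.
def midi_to_hz(note: int) -> float:
    """Equal-tempered pitch, A440 reference."""
    return 440.0 * 2 ** ((note - 69) / 12)

def operator_freq(note: int, ratio: float, offset_hz: float) -> float:
    return midi_to_hz(note) * ratio + offset_hz

# Ratio 1.0, offset 10 Hz: A440 gives 450 Hz, the octave below gives 230 Hz.
print(operator_freq(69, 1.0, 10.0))   # 450.0
print(operator_freq(57, 1.0, 10.0))   # 230.0
# Ratio 0 ignores the keyboard; every note gets the fixed offset.
print(operator_freq(60, 0.0, 100.0))  # 100.0
```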


FIGURE 12.10 Ops pane in FM8.

Frequency Modulation with Multiple Operators

Considering the rich range of sounds we have discovered in basic FM synthesis, what is the point of so many operators? There are essentially three benefits: we can make more-complex sounds, we can make more-refined sounds, and we can make sounds that evolve.

Rich Waveforms

Choosing waveforms other than sine is the quickest approach to rich tones. The choices in the operator menu represent gradations of harmonic complexity. Parabolic, triangle, and square waves have increasing numbers of odd harmonics, as do the soft square and soft tristate, waveforms familiar from waveshaping synthesis. The sawtooth, short tristate, and ramp mod waveforms have all harmonics. The shapes with numbers are additive waveforms featuring the harmonics listed, and those labeled formant are filtered saws (they are not true formants, just emphasized harmonics). The TX shapes are an assortment from the Yamaha TX81Z, one of the classic FM instruments. These waves are not anti-aliased (the original Yamaha instruments had rather low sampling rates and significant aliasing). The spectrum of the chosen waveform is displayed at the top of the editing window. A few sample spectra are shown in Figure 12.12 and heard as the first four tones in DVD example 12.16.
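To get a feel for what the numbered additive shapes contain, here is a small NumPy sketch that builds a wave from a named list of harmonics and confirms which partials are present. The equal amplitudes are an assumption; FM8's exact waveshapes are not documented here.

```python
import numpy as np

# Approximation of an "additive" waveform such as the 1+3 or 1+4 shapes:
# a sum of the named harmonics at equal amplitude over one cycle.
def additive_wave(harmonics, length=1024):
    t = np.arange(length) / length              # exactly one cycle
    wave = sum(np.sin(2 * np.pi * h * t) for h in harmonics)
    return wave / np.max(np.abs(wave))          # normalize to +/-1

one_plus_three = additive_wave([1, 3])
spectrum = np.abs(np.fft.rfft(one_plus_three)) / len(one_plus_three)
peaks = np.nonzero(spectrum > 0.01)[0]
print("harmonics present:", peaks)              # -> [1 3]
```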


FIGURE 12.11 Operator view in FM8.

FIGURE 12.12 Spectra of modulator waveforms (Parabolic, ShortTriState, 1+4, and 5th Formant).


FIGURE 12.13 Modulation with a complex waveform (A: 1+3 wave modulated by a 500 Hz sine; B: 2 kHz sine modulated by a 1+3 wave).

Figure 12.13 and the last two tones in DVD example 12.16 demonstrate what happens when a complex wave is modulated. Spectrum A in Figure 12.13 is the 1+3 waveform at 2 kHz modulated by a 500 Hz sine (it's shown with a linear scale to clarify the sidebands). The carrier has components at 2 kHz and 6 kHz. Each of these is modulated by the 500 Hz sine and produces the expected sidebands. Notice that the amount of modulation seems slightly different, with the upper component getting more. This is a result of the math that converts the operator level to modulation index, which is based on the carrier frequency. Compare this with spectrum B, modulating a sine tone with a complex one. In that case, each partial of the modulating tone produces a set of sidebands on the carrier. The difference is most audible with low-level modulation.

Multiple Modulators

When you use two modulators on the same carrier, the results are more complex. In Figure 12.14, spectrum A shows a 10 kHz tone modulated by a 500 Hz tone and B shows 10 kHz modulated by 1,500 Hz. Nothing is remarkable here. C shows the two spectra mixed and is hardly surprising. D shows a single carrier at 10 kHz modulated by both 500 and 1,500 Hz operators. This is not the simple mix you might expect. In addition to sidebands at Fcar ± Fmod1 and Fcar ± Fmod2, some appear at Fcar ± (Fmod1 – Fmod2) as well as Fcar ± (Fmod1 + Fmod2). You will also see Fcar ± (2Fmod1 + Fmod2), Fcar ± (Fmod1 + 2Fmod2), and Fcar ± (2Fmod1 + 2Fmod2) as modulation increases. In other words, the interaction is triangular, as if the modulators were modulating each other. These are heard in DVD example 12.17, at lower pitch with changing modulation level.

Feedback

What do you get when an operator modulates itself? Feedback modulation results in a wave somewhere between a sine and a sawtooth, shown in A in Figure 12.15. The spectrum of this is lovely, with harmonics smoothly fading up in order. There's no reduction of any sidebands once they appear.


FIGURE 12.14 Modulation with two operators (A: 10 kHz modulated by 500 Hz; B: 10 kHz modulated by 1,500 Hz; C: mix of A and B; D: 10 kHz modulated by both 500 and 1,500 Hz).

FIGURE 12.15 Waveforms produced by feedback modulation (A: sine modulated by feedback; B: excessive feedback modulation).

But there is one caution when using feedback: if the modulation index exceeds 1.0, the waveform will become unstable and "explode," as shown in B in Figure 12.15. In FM8, this means you must keep the feedback level below 45. When feedback is applied to a modulator, a well-behaved set of sidebands appears at a fairly low modulation index. This avoids the hollow sound of carrier notching. The first tone of DVD example 12.18 demonstrates the sound of an operator with feedback. The second tone is a sine modulated by an operator with maximum feedback. The spectra of these are shown in Figure 12.16.
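The behavior is easy to reproduce in a sketch. The loop below implements one common form of feedback phase modulation (the previous output is added to the phase); it illustrates the instability but is not FM8's internal code, and FM8's 0-100 feedback scale is not modeled.

```python
import numpy as np

# Single-operator feedback FM: each new sample's phase is pushed by the
# previous output scaled by a feedback index. With a small index the wave
# leans toward a sawtooth; well above 1 it turns chaotic.
def feedback_osc(freq, index, sr=44100, dur=0.05):
    n = int(sr * dur)
    out = np.zeros(n)
    phase, prev = 0.0, 0.0
    for i in range(n):
        out[i] = np.sin(phase + index * prev)
        prev = out[i]
        phase += 2 * np.pi * freq / sr
    return out

tame = feedback_osc(440.0, 0.9)
wild = feedback_osc(440.0, 3.0)
# Chaotic feedback shows up as large jumps between successive samples.
print("avg sample-to-sample jump, index 0.9:", np.abs(np.diff(tame)).mean())
print("avg sample-to-sample jump, index 3.0:", np.abs(np.diff(wild)).mean())
```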


FIGURE 12.16 Feedback modulation (A: carrier with feedback modulation; B: carrier modulated by an operator with feedback modulation).

Modulating Modulators

Feedback on a modulator produces modulation with a complex wave. We can easily generate other complex waves with a second modulator modulating the first, a technique called cascading or stacking. We now have two ratios to consider as well as two modulation indexes. Two modulators at the same frequency produce additional sidebands at the modulator harmonics. Carrier notching does not occur until the level hits 30, and plenty of sidebands appear before that. With other ratios, the sidebands produced by the first carrier develop sidebands of their own, as in Figure 12.17. This is demonstrated by the first tone in DVD example 12.19. First we hear the carrier, then a modulator with ratio 0.2 is brought in. Next the modulator is modulated with a ratio of 0.025 and fades out.

Two special cases of stacking are demonstrated on the second and third tones of DVD example 12.19. On tone two the carrier modulator is a fixed subaudible frequency, less than about 3 Hz. If the second modulator is the same frequency as the carrier, the resulting tone will have a vibrato that consists of alternating odd and even sidebands. In the third tone, all ratios are close to 1, but the first modulator is slightly above and the top modulator is slightly below. If the modulators are slightly detuned, there will be a vibrato that affects some sideband amplitudes more than others. It's common to detune pairs of modulators so one is slightly above the integer and the other the same below. This keeps the resultant pitch in tune. A fixed-pitch detuning will always produce a vibrato at the same rate, which will sound odd on chords. A ratio detuning will change the vibrato with note pitch, and it easily becomes too fast (wind players call this a nanny goat vibrato). Combinations of the two seem to work best.

It's also possible to apply feedback from a carrier to its own modulator. This adds high harmonics similar to those resulting from self-feedback, but they are modified by detuning of the modulator, producing an especially rich vibrato. Both modulation levels will control the number of added sidebands, making this system a little less stable than plain feedback. This is demonstrated in the last tone of DVD example 12.19.
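A cascade is simple to sketch in code. The ratios and indexes below are arbitrary illustrations of the detuned-pair idea, using phase modulation the way most "FM" synthesizers actually do:

```python
import numpy as np

# Two-modulator stack (cascade): the top operator modulates the middle one,
# whose output modulates the carrier. Values here are illustrative only.
def stacked_fm(f_carrier, ratio1, index1, ratio2, index2, sr=44100, dur=1.0):
    t = np.arange(int(sr * dur)) / sr
    top = np.sin(2 * np.pi * f_carrier * ratio2 * t)            # top modulator
    mid = np.sin(2 * np.pi * f_carrier * ratio1 * t + index2 * top)
    return np.sin(2 * np.pi * f_carrier * t + index1 * mid)     # carrier

# Detuned pair: one modulator just above the integer, one just below,
# keeping the perceived pitch centered while adding a slow beat.
tone = stacked_fm(261.63, 1.007, 2.0, 0.993, 1.5)
print(tone[:5])
```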


FIGURE 12.17 Spectrum of stacked modulators.

Multiple Carriers

We saw earlier that mixing two modulator-carrier pairs gives simpler spectra than combined modulation. Sometimes simple is all we need. With several carriers it is easy to balance different regions of a tone. One carrier might provide the fundamental, a second the middle harmonics, and a third the high harmonics or the attack spectrum. Once divided up, the sections can be controlled by envelopes or key number. This approach is really additive synthesis with complex waveforms. Most practical patches have two or three carriers.

Inharmonic Modulation

We've already found some rich sounds made by modulating at nonharmonic ratios. It may seem that more operators at other unrelated ratios would increase the effect. This is possible but surprisingly difficult. Adding operators either emphasizes one pitch component or pushes the tone into noise. The best results seem to happen when the extra components have short envelopes, giving a noisy attack that decays into an inharmonic tone.

Noise

Here's an inharmonic party trick. Set up the simple routing given in Figure 12.18 and set the frequency ratios as D = 0.58, E = 0.72, F = 53.63. The result, when you play just below middle C, is a pretty decent white noise. If you hit only one key, the noise cycles, but a cluster of three or four notes holds steady. This is the basis of "helicopter" and other special effects sounds most FM synthesizers seem to have. DVD example 12.20 demonstrates this, sweeping in the modulators one after the other. (This was adapted from a patch in Chowning and Bristow, FM Theory and Applications.)
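A rough offline version of the trick, assuming the routing is a simple series chain (F modulating E, which modulates D as the carrier) and guessing at the modulation depths, looks like this:

```python
import numpy as np

# Sketch of the noise routing described above: three operators in series with
# ratios 53.63, 0.72, and 0.58. The depths are guesses; the figure's exact
# levels are not reproduced. The mutually inharmonic ratios are what push
# the result toward noise.
def noise_patch(pitch_hz, sr=44100, dur=1.0, depth1=5.0, depth2=5.0):
    t = np.arange(int(sr * dur)) / sr
    op_f = np.sin(2 * np.pi * pitch_hz * 53.63 * t)
    op_e = np.sin(2 * np.pi * pitch_hz * 0.72 * t + depth1 * op_f)
    return np.sin(2 * np.pi * pitch_hz * 0.58 * t + depth2 * op_e)

out = noise_patch(246.94)   # B3, just below middle C
print(out[:5])
```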


FIGURE 12.18 Producing noise with frequency modulation (noise patch and its spectrum).

Refining FM Sounds

Brass

The sounds presented up to now can be described as energetic, raw, or rowdy, but probably not as refined. In synthesis, refinement is taken to mean emphasizing the musically effective aspects of a timbre and reducing distracting artifacts. To study refinement, let's revisit the dullest sound from the EFM1, the feeble trumpet tone. There are several problems with this tone: it doesn't have a good attack, it's not really bright enough, and the spectrum should evolve with high partials appearing gradually.

The spectrum is the foundation of the tone. A comparison of the spectrum of an actual trumpet (Figure 12.19) and the spectrum of the 1:1 FM patch (A in Figure 12.20) shows the cause of the lack of brightness—there are many more harmonics in the real trumpet. Note that the second and third harmonics are stronger than the fundamental. If we increase modulation to add harmonics to the FM tone, the fundamental goes down and resurges, and the notch moves up the series (B in Figure 12.20). Modulation amounts that approximate the trumpet spectrum will always have one weak partial, giving a hint of a wah-wah sound. This becomes especially annoying when an envelope is applied to modulation. We have seen several ways to add high harmonics to a tone, and experiments will show which is most effective. The quickest way to a rich tone is to use a complex waveform for the modulator. In Figure 12.20, C shows the result of using a triangle wave. There are more harmonics than we need, but the even harmonics above the 7th are missing. The other modulator waves are too rich for a trumpet sound, and it would be difficult to sweep the harmonics in. The next approach to try is a two-modulator stack. The result is not bad, but the harmonics still show some notching and the peak is too high up in the series. One modulator with feedback avoids this, creating a spectrum quite close to the real trumpet (D in Figure 12.20). The patch that produces this is shown in Figure 12.21.
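The dip-and-resurge behavior of the fundamental comes from the Bessel functions that govern sideband amplitudes in plain sine-carrier, sine-modulator FM: the carrier component follows J0 of the modulation index and the nth sideband pair follows Jn. A quick table makes the notching visible (this is the textbook FM result, not an FM8-specific calculation):

```python
from scipy.special import jv   # Bessel function of the first kind

# In simple two-operator FM the carrier component follows J0(index) and the
# nth sideband pair follows Jn(index), so the carrier "notches" out near
# index 2.4, where J0 crosses zero.
for index in (0.5, 1.0, 2.0, 2.4, 3.0):
    amps = [abs(jv(n, index)) for n in range(5)]
    row = "  ".join(f"J{n}={a:.2f}" for n, a in enumerate(amps))
    print(f"index {index}: {row}")
```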


FIGURE 12.19 Spectrum of a real trumpet.


FIGURE 12.20 Spectra of various FM trumpet presets (A–D).

The next step is to adjust the envelopes for the characteristic brass attack we discovered in our analog experiments. This was achieved with a low-pass filter envelope that revealed the high partials of the tone a bit later than the fundamental. Carrier and modulator envelopes that produce this effect with FM are shown in Figure 12.22. The carrier is envelope F and the modulator is envelope E. The envelope attacks are log curves, which give a quick initial response with a slower approach to the final amplitude.


FIGURE 12.21 FM trumpet patch showing modulation levels.

FIGURE 12.22 Carrier and modulator envelopes for brassy sounds (E: modulator envelope; F: carrier envelope).

The carrier and modulator envelopes are similar, except the modulator attack is a little slower and the modulator release is stretched longer. This causes the high harmonics to slide in late, but the tone stays rich during the release. Key scaling can be effectively used here. Key scaling shortens the envelope times on high notes, a common trait of many sounds. Velocity scaling does the same for loud notes. The quicker attack on high keys or loud notes will shorten the time it takes for the high partials to fade in. If both operator levels are also tied to velocity, loud notes will be brighter as well as crisper. Brass instruments actually have a more complex attack than we achieved on the analog synthesizer. As anybody who has played trumpet knows, the lips don't start vibrating in pitch right away.


They do their own thing until the trumpet speaks, producing an effect that can be described as blat. We can simulate that with a bit of low-frequency modulation on a short envelope. Operator D provides this with a ratio of 0.188; the burst of modulation smears the pitch of the sidebands. This is too brief to be heard as a separate event but makes a convincing trumpet articulation. A time spectrogram is a useful tool for adjusting these effects. Figure 12.23 shows this smearing at the beginning of each partial. Brass attacks are out of tune, starting flat and sliding up. FM8 has a single pitch envelope that can be connected to any operator. Pitch envelopes can be positive or negative, while a neutral envelope is centered in the window, as shown in Figure 12.24. A preliminary scoop up is easy to add, but there is one caveat: an envelope has to end where it began (a quirk from the DX synthesizers that has been kept to make patches compatible). If the note starts with a rising pitch, it will end with a falling one. This can be hidden by extending the pitch envelope beyond the other envelopes. Detuning can easily be overdone—the correct pitch change is not heard as such; it just adds a slight flavor to the tone. Figure 12.25 shows how the elements combine during the note. The carrier envelope is on operator F, and the modulators are D and C. Brass players work hard to develop a consistent tone, but synthesizers sound artificial when every note is exactly the same. This consistency is mitigated by the application of key and velocity scaling to envelope times and the judicious use of LFO. (Brass vibrato should be applied to amplitude rather than frequency.)

FIGURE 12.23 Time spectrogram of FM trumpet.


FIGURE 12.24 Pitch envelope for FM trumpet.

The result of all of this is heard in DVD example 12.21. This will never be mistaken for a real trumpet. In fact it has some definite accordion qualities, but it is a useful sound, especially on simple chords.

Choir: Additive FM

Multioperator FM can produce effective additive synthesis. With only six operators this is going to be pretty basic, but some lovely sounds are available. For example, look at the classic Choir patch in the FM8 presets shown in Figure 12.26. Each operator, tuned as shown, is heard at a fairly low level (the sum is loud enough), and there is no modulation. The third formant waveform used on operators A and F has about eight harmonics with an emphasis on harmonic 3. The use of two operators at not quite the same ratio will produce beating proportional to the pitch. Since the ratio is about 4, the spectral peak of this part of the tone will be at harmonic 12. The other operators are paired to set up vibrato on the fundamental and 2nd harmonic. Vocal sounds are characterized by a thin spectrum with a formant in the 2 to 4 kHz range, and this combination of operators produces a good simulation (Figure 12.27). The timbre doesn't have a real formant (that requires a filter), but it's fairly convincing and certainly pleasant. Experiment with the level of each component. You will find a point at which the tone comes apart into two distinct notes. Once that happens, your ears will have become sensitive to that component and you will still hear two notes when you return to the original levels. You need to take this phenomenon into account when you are balancing a spectrum. A good approach is to reduce the level until only one sound is heard, then increase the value slightly. Double-check the sound after you have worked on something else. The envelopes of Choir are simple, with staggered attacks and a funny dip in the pitch envelope—this is before any attack, so it's only heard if notes are reattacked while sounding. The dip produces a tiny hiccup that articulates the notes (Figure 12.28). This sound is demonstrated in DVD example 12.22. Choir sounds are obvious candidates for effects processing, such as voice doubling and chorus. DVD example 12.23 has these effects added.


FIGURE 12.25 Operator envelopes for FM trumpet (D: attack envelope; E: modulator envelope; F: carrier envelope; plus the pitch envelope).

This is a pleasant sound, although it is easy to overdo the delay effects. Voice doubling has to be used with care. It increases the number of voices used per MIDI note, so it is easy to exceed the polyphony count, especially on sounds with long releases, which may be curtailed for new notes. This can be heard as pops in DVD example 12.23, where the polyphony was 8. The pops were repaired in the next example by increasing the polyphony to 16.


FIGURE 12.26 Patch and operator tunings for the Choir preset.

Ratio    Offset [Hz]    Waveform
3.9998   -0.07          3rd Formant
2.0006    0.49          Sine
0.9994   -0.49          Sine
1.0096    0.12          Sine
2.0197    0.74          Sine
4.0025    0.18          3rd Formant

FIGURE 12.27 Spectrum of the Choir preset.


FIGURE 12.28 Envelopes for the Choir preset.

Filtering

We can give Choir a true formant by running the high partials through the filter, which is labeled operator Z. First modify the high partials on operators A and F by dropping the ratio from close to 4 to close to 2 and changing the waves to TX wave 7. That will put some energy in the formant area whatever the pitch. Now we set up the filter. The first step is to flatten the key scaling for operator Z. This will prevent the filter from tracking the keyboard. The filter is not calibrated in hertz, so it's necessary to use a subterfuge to tune it. Turn all other operators off and connect the noise generator (operator X) to the output. Watch the spectrum as you adjust the noise parameters to get a tone that is as flat as possible. It won't completely flatten; it's not that good a noise generator. Now connect X to Z instead of the output. Again, watching the spectrum display, adjust the filter controls to get a smooth peak from 2 to 4 kHz. The settings will be something like Figure 12.29, with the two sections in series and on the high-pass side of band-pass mode. The best way to set frequency is to turn the resonance all the way up so there's a sharp peak to play with. Once the frequency is dialed in, drop the resonance to the level that gives the smooth curve shown in Figure 12.30.
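The same tuning subterfuge can be tried offline. The sketch below stands in for FM8's operator-Z filter with a SciPy Butterworth band-pass, runs white noise through it, and checks where the energy peaks:

```python
import numpy as np
from scipy.signal import butter, sosfilt

# Feed white noise through a band-pass centered on the 2-4 kHz formant
# region and inspect the result; a Butterworth band-pass substitutes for
# the synthesizer's filter here.
sr = 44100
noise = np.random.default_rng(0).normal(size=sr)           # one second of noise
sos = butter(4, [2000, 4000], btype="bandpass", fs=sr, output="sos")
formant = sosfilt(sos, noise)

spectrum = np.abs(np.fft.rfft(formant))
freqs = np.fft.rfftfreq(len(formant), 1 / sr)
print(f"energy peaks near {freqs[np.argmax(spectrum)]:.0f} Hz")  # inside 2-4 kHz
```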


FIGURE 12.29 Formant filter settings for the Choir preset.

FIGURE 12.30 Filter response for the Choir formant.

Now turn off the noise and connect the operators A and F to Z. Playing high or low notes will produce partials in the range emphasized by the filter. There are many classic instruments where FM programmers have gone to great lengths to simulate formants. The addition of a real filter really simplifies that process. It's even simpler to use a plug-in filter in the DAW signal chain. The filtered version of Choir (with doubling but no chorus) is shown in Figure 12.31 and heard in DVD example 12.24.

Strings

One of the best approaches for learning how a synthesizer works is to look at the stock presets, especially those that attempt to produce a familiar sound. This section examines the FM8 patch called Soft Strings. Every synthesizer has a sound called strings. None of these have the liveliness of real strings (even samples are limited), but the label is a good tag for a type of sound, and the sounds are useful for many musical purposes.


FIGURE 12.31 Spectra of filtered Choir (A: low note; B: high note).

The spectrum of a real violin is shown in Figure 12.32. It’s a rich sound with many harmonics, often emulated with sawtooth waves. What the static spectrum does not show is how dynamic this sound is. There are several rates of vibrato going on at the same time, and since many instruments are involved there are plenty of beating and delay effects. Some of this shows up in the time spectrogram as variations in the darkness of the harmonic bands. The width of the bands is also an indication of beating and vibrato. You can see a soft attack on all of the partials, but no cascading—all harmonics are present from the beginning. The basic spectrum will need more parts than usual in order to add multiple beats. The foundation patch shown in Figure 12.33 is a simple 1:1 ratio modulation using operator C to modulate D. Even though the level is 100, the harmonic count is reduced to four by key scaling. This is heard in the first tone in DVD example 12.25. The foundation is enriched by two more modulators in the stack (Figure 12.34). Operator B has a fixed frequency of 1.78 and contributes some beating to the spectrum. The third modulator has a frequency ratio of 1.0098 and an offset of -0.13. This will modulate operator B and add more harmonics with a beat that is around 3 Hz on middle C. The spectrum is about right, but the combined beating is rather annoying. This combination is heard in the second tone in DVD example 12.25. The periodicity is easily broken up by some feedback modulation, which also brings in the last group of harmonics as heard in the third tone in DVD example 12.25 and shown in Figure 12.35. The traditional sources of vibrato are also available in FM applications. FM8 includes two LFOs, which are patched on the modulation page. The modulation page has fairly extensive routing—sources include pitch bend, the modulation wheel, aftertouch, breath controller, two assignable MIDI controls, and an envelope follower on a signal input. Most of these can control the LFOs. For the string patch the LFOs are slightly detuned and routed to pitch. LFO 1 is a constant amount, and LFO 2 is added by the modulation wheel or aftertouch. The fourth tone in DVD example 12.25 has the added LFOs, shown in Figure 12.36. The result is a pleasant sound, but it is still not rich enough and it is noticeably cyclical. Two final operators will be added to the right channel (Figure 12.37). This is similar to the simple brass patch, but in this case F has a fixed tuning of 1.51 Hz,


FIGURE 12.32 Spectrum and time spectrogram of a violin.

FIGURE 12.33 Foundation of the Soft Strings patch (patch and spectrum).

FIGURE 12.34 The Soft Strings patch with modulators (patch and spectrum).


FIGURE 12.35 The Soft Strings patch with feedback (patch and spectrum).

FIGURE 12.36 The Soft Strings preset modulation settings.

which produces yet another rate of vibrato. E would seem to have too much feedback, but it is restrained by an envelope. Finally, the right channel is used to modulate the left. The result is in the final tone in DVD example 12.25. Several more tricks for ensemble string envelopes are shown in Figure 12.38. The carrier envelopes D (expanded here) and F have slow and slightly delayed attacks that more or less match the effect of several strings starting up. The initial delay on D is about 50 milliseconds, then it takes nearly a second to get to full strength. F is similar, but a bit smoother. The modulation envelopes A, B, and C start instantly at maximum value. They drop only slightly on the release. Modulator E has a chirp at the beginning. It isn’t heard because the carrier has barely begun, but it helps articulate repeated notes, especially in monophonic mode. The complete preset is heard in DVD example 12.26.
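The beat rates in this patch come straight from the ratio-and-offset arithmetic. Here is a quick check using the numbers quoted above; which operators actually beat against each other depends on the routing, so treat this as a rough sanity check rather than an exact reconstruction:

```python
# Beat-rate arithmetic for the Soft Strings modulators, using values quoted
# in the text. Middle C is taken as 261.63 Hz.
middle_c = 261.63

detuned = middle_c * 1.0098 - 0.13        # ratio 1.0098, offset -0.13 Hz
print("beat against a 1:1 partner:", abs(detuned - middle_c), "Hz")  # roughly the few-hertz beat described

# The fixed-frequency operators (1.78 Hz and 1.51 Hz) add slow wobbles whose
# rates stay the same regardless of the note played.
print("fixed vibrato rates:", 1.78, 1.51, "Hz")
```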


FIGURE 12.37 The complete Soft Strings patch.

Velocity Response

One feature that makes FM attractive is that it can respond elegantly to touch. I have pointed out a few opportunities for this already, but an analysis of the preset Silver Chordz shows how much velocity response can change a sound. This is heard in DVD example 12.27, where the velocities of the familiar tune have been modified to run from low to high. The patch is simple, as shown in Figure 12.39. Operator C is in the left channel and operator F is in the right. F and C are detuned just enough to get channel-to-channel beating, but their modulators are exactly the same and the envelopes match. The envelopes are percussive, with the top modulator stretched to keep the tone rich during the decay.


FIGURE 12.38 Envelopes for the Soft Strings preset.

FIGURE 12.39 Patch and operator tunings for the Silver Chordz preset.

Ratio    Offset [Hz]    Waveform
3.0024    1.95          Sine
0.5002    0.13          Sine
0.5030    0.27          Sine
3.0024    1.95          Sine
0.5002    0.13          Sine
0.4985   -0.24          Sine


The touch sensitivity comes from the velocity settings, which determine the key velocity necessary to get maximum amplitude of the operator. A velocity setting of zero means the operator is not affected by velocity. In this patch A is 57, B is 71, and C is zero. If the key is barely touched, there is little modulation. The higher the velocity, the more snap and buzz there is in the sound. Operator A is also controlled by the key scaling shown in Figure 12.40. This reduces the effect of A at the top end of the keyboard and prevents aliasing on high notes.
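As a rough model of what such a setting does, the sketch below maps MIDI velocity to an operator level, with zero meaning no velocity effect. The linear curve is a guess; FM8's actual response curve is not given in this chapter.

```python
# Hypothetical velocity scaling: a setting of zero ignores velocity, and
# higher settings make the operator level depend more strongly on how hard
# the key is struck.
def scaled_level(base_level, velocity, vel_setting):
    """base_level 0-100, velocity 0-127, vel_setting 0-100."""
    if vel_setting == 0:
        return base_level
    depth = vel_setting / 100.0
    return base_level * ((1.0 - depth) + depth * velocity / 127.0)

for vel in (20, 64, 127):
    print(vel,
          round(scaled_level(100, vel, 57), 1),   # like operator A, setting 57
          round(scaled_level(100, vel, 71), 1),   # like operator B, setting 71
          round(scaled_level(100, vel, 0), 1))    # like operator C, unaffected
```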

Evolution

The sounds covered so far are instrumental in nature. These are sounds that can be used to play melodies, bass lines, even parts of percussion tracks in traditional genres of all kinds. In electroacoustic composition, we also have a need for sounds to use as background textures or events that are striking on their own. One thing we look for in such sounds is evolution, a gradual change over a relatively long time. Frequency modulation is the king of evolution. Since the sound is so sensitive to balance between modulators, it isn't hard to make a sound that is practically a composition in itself.

We have already explored evolving sounds on a short timescale. We need only stretch the attacks on the trumpet sound developed above to create something that transitions from a pure to a brassy tone. Stretch the release in the same manner and the tones will devolve back to a pure sine after the fingers come up. With some adjustments to modulator ratios, the sound can become quite rich at the peak intensity. This sort of thing is particularly effective with slowly arpeggiated chords.

The basic recipe for an evolving sound is a single carrier with several modulators. Each modulator is brought in at a different time during the sound. A set of envelopes like the ones in Figure 12.41 will ensure that the sound changes quite a lot. The carrier is operator F; all the others are modulating it. Envelope F is the only one with sustain on. The main difficulty with this approach is in avoiding abrupt transitions during the sound. Slow envelopes with a bit of overlap will usually do the trick. The second approach is to do the same sort of thing with envelopes that come in at staggered times and build to a climax before fading away. If there are two carriers in the mix, the final sound can be quite different from the initial sound, even a different pitch. DVD example 12.28 has some examples of these.

Here are a few other multiple envelope tricks:

If the carrier and first modulator have quick attacks while the others are delayed more than a second, short notes will have a different timbre than long ones. If you hold chords in one hand while playing quickly with the other, it will sound like two different instruments.

If an envelope is set to sustain, but the attack and sustain levels are zero, the associated modulator will not kick in until the key is let up. This can lead to the characteristic clunk at the end of a clavichord sound or an engaging echo effect on a melody.


FIGURE 12.40 Key scaling in the Silver Chordz preset.

If some modulator ratios are built on fifths (0.666, 1.5, and 3.0) and they come in about one beat late, you can get some very rich harmonies playing slow progressions. Other intervals can produce equally surprising effects.

If two modulators are identical except for an LFO controlling level, the result will be a vibrato that comes in and departs smoothly. Key scaling and velocity can produce further variety in the transition effect, especially on chords and octaves.

Don't forget the pitch envelope. A slight gradual change of pitch from a modulator will produce beats that come and go in a complex way.

Envelope techniques can be used to create simple loops. The trick is to put several segments within the sustain portion—as long as the key is held down those segments will repeat. If the shape follows a pattern of attack, release, delay (with a flat segment at zero), the result will be repeating notes. The timing of the pattern determines the tempo. The length of the sustain loop should match the duration of one or more measures. This can be ensured if the synthesizer (like FM8) has a tempo sync feature. When that is engaged, envelope times are displayed as beat durations and the actual rate of repeat is set by a tempo adjustment. The master tempo can be set locally for stand-alone applications or may come from the host of a plug-in. Syncing the pattern to the beat duration does not guarantee the pattern will fit into the arrangement—the notes still must be played at the right time. The envelopes shown in Figure 12.42 are for an FM8 patch called Is a Twin2. This patch is essentially six independent instruments, each triggered by a complex looping envelope. Most of these operators serve as carriers with a complex patch of cross modulation.

The most flexible method of sound evolution is external control. The control can come from either MIDI or DAW automation, but this breaks the evolution out of the limits of a single note. The actual parameters that can be controlled by external forces vary dramatically from instrument to instrument. FM8 is fairly rare in that everything is controllable from DAW automation. More commonly certain key parameters or a limited number (set by MIDI connections) are available.


FIGURE 12.41 Envelopes for evolution (operators A–F).

FIGURE 12.42 Envelopes for the Is a Twin preset.

DVD example 12.29 illustrates external control. The piece is essentially one note played on the Is a Twin2 patch and held for 32 measures. Track automation is used to gradually turn each operator on and then off again. At this point the distinction between sound design and composition begins to blur. A single preset becomes a palette of sounds that can be exactly tailored for the expressive needs of the music.
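The single-note recipe itself (one carrier with several modulators on staggered envelopes) is easy to sketch offline. Everything numeric below is an arbitrary illustration:

```python
import numpy as np

# One carrier, several modulators, each faded in by its own slow envelope so
# the spectrum keeps changing over the note.
def ramp_env(n, start_frac, end_frac):
    """Zero, fade in between start and end (fractions of the note), then hold."""
    x = (np.linspace(0, 1, n) - start_frac) / (end_frac - start_frac)
    return np.clip(x, 0, 1)

sr, dur, f0 = 44100, 8.0, 110.0
n = int(sr * dur)
t = np.arange(n) / sr

mods = [(2.0, 1.5, 0.0, 0.25),    # (ratio, max index, fade-in start, fade-in end)
        (3.01, 2.0, 0.25, 0.5),
        (0.5, 1.0, 0.5, 0.9)]
phase_mod = np.zeros(n)
for ratio, index, start, end in mods:
    phase_mod += index * ramp_env(n, start, end) * np.sin(2 * np.pi * f0 * ratio * t)

tone = np.sin(2 * np.pi * f0 * t + phase_mod)
print("rendered", dur, "seconds; the spectrum grows richer as each modulator fades in")
```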


EXERCISE

DVD example 12.30 is a series of two-operator FM sounds. Descriptions of them are found in the list below, out of order. Reorder the list to match the sounds in the DVD example. (Answer key found on p. 499.)

a) Ratio 1:1 square envelopes
b) Ratio 2:1 square envelopes
c) Ratio 1.98:1 square envelopes
d) Ratio 1:3 Hz square envelopes
e) Ratio 1:10 square envelopes
f) Ratio 1.5:1 square envelopes
g) Ratio 1:1 ADSR modulator envelope
h) Ratio 0.88:1 ADSR modulator envelope
i) Ratio 1:10 AR carrier envelope

RESOURCES FOR FURTHER STUDY

By now you should understand why frequency modulation had such impact on the world of synthesis in the 1970s and 1980s. The sonorities available before Chowning's discovery can be charitably described as bleeps and bloops (and they were often described uncharitably). This tutorial has only touched the surface—you will discover even more interesting and flexible sounds on your first day of experimenting. (Hint: Save everything!) There are no current books specifically on frequency modulation, but this classic may be available in your library or from a used-book dealer:

Chowning, John, and David Bristow. FM Theory and Applications: By Musicians for Musicians. Milwaukee: Hal Leonard Corporation.


THIRTEEN New Approaches to Synthesis

The techniques of the previous two chapters are probably responsible for 90 percent of the electronic sounds heard today. However, there are plenty of other ways to synthesize material. Until recently, these advanced techniques were restricted to research labs and required the high-powered applications we will look at in chapter 14, but they are now showing up in shareware and commercial instruments. Indeed, some of the programs already covered include features based on techniques described here. These techniques have not been hidden away because they are too complicated for the typical musician. Although it is true that they require a bit of study, the real obstacle to including advanced techniques in applications is designing interfaces that present the pertinent parameters in a musically sensible way. A lot of applications were surveyed in preparation for this chapter, and I can’t say that any have found the magical combination of simplicity and depth.

A CLOSER LOOK AT THE MAKEUP OF SOUND

The Waveform

The software addressed in this chapter is derived from applications for acoustic research and mathematics. If we are to get the maximum benefit from these advanced tools, we need to gain a scientific understanding of the material we are using, that is, sound. I stated back in chapter 2 that it is an axiom that the waveform of a sound is equivalent to some combination of sine waves, and we have certainly heard enough filtering and synthesis to believe this is true. If the waveform is a steady tone, the sine components form a Fourier series, multiples of a fundamental frequency. The formula that states this is shown as Equation 13.1.

f(t) = A + B cos(ωt) + C sin(ωt) + D cos(2ωt) + E sin(2ωt) + F cos(3ωt) + G sin(3ωt) + ...

Equation 13.1


The symbol f(t) indicates that this is a function (a drawing if you will) of time. Omega (ω) in Equation 13.1 represents angular frequency, a conversion of frequency (in Hz) to an angle that varies at 2π times the frequency radians per second. The letters B, C, D, and so on, representing the amplitude of each term, are called the sine and cosine coefficients. The constant A at the start of the formula is an offset at 0 Hz, which is inaudible but may be necessary to make the drawing turn out right. To draw one cycle of the wave, t is varied so the angle ωt in each term sweeps from 0 to 2π. Each component of the sound is specified with two terms, cosine and sine, each with a coefficient to determine how that element contributes to the result. This is a mathematically convenient way to specify amplitude and phase of the component. If you add a sine wave to a cosine wave, the result is a sine wave at a different phase. This is not a musically convenient approach, because the amplitude and phase of each component are determined by the interaction of B and C, not by either alone. However, it's simple to convert Equation 13.1 into familiar terms of amplitude and phase. This version of the formula is shown as Equation 13.2, with the aₙ values representing the amplitude and the φ (phi) values representing phase. These coefficients are often represented by sliders in applications that let you design waveforms. A graph of the amplitudes becomes the familiar spectral analysis used throughout this book.
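The conversion behind Equation 13.2 (shown just below) is a two-line calculation, sketched here in Python:

```python
import math

# Converting one sine/cosine coefficient pair (B, C) from Equation 13.1 into
# the amplitude/phase form of Equation 13.2:
#   B cos(wt) + C sin(wt) = a cos(wt + phi)
# with a = sqrt(B^2 + C^2) and phi = atan2(-C, B).
def to_amp_phase(b, c):
    return math.hypot(b, c), math.atan2(-c, b)

amp, phase = to_amp_phase(0.5, 0.5)
print(amp, math.degrees(phase))   # 0.707..., -45 degrees
```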

f(t) = a₀ + a₁ cos(ωt + φ₁) + a₂ cos(2ωt + φ₂) + a₃ cos(3ωt + φ₃) + ...

Equation 13.2

You may wonder about the fuss over phase, since I have been insisting all along that phase is not audible. This is true for the waveform as a whole, but the relative phase of the components of a wave is definitely audible. In addition, a constantly varying phase moves the frequency of a component away from the expected exact multiple of the fundamental. Our experiments with frequency modulation demonstrated how detuned partials add life to a sound. If you want to try your hand at manipulating sine and cosine coefficients directly, there are numerous websites such as www.falstad.com/fourier with Java applets that show the Fourier series in action. Most music applications that provide Fourier-type waveform design use amplitude and phase for direct manipulation. If you have not explored the waveform design tool in Absynth, this is a good time to do so.

I stated earlier that the Fourier formulas apply if the waveform is a steady tone, but that's a big if. Few tones are perfectly steady, and most people describe such tones as sounding "unnatural." Natural sounds have constantly changing balance and frequency of partials. This means that the equation has to change over time. In fact the best way to represent the amplitude of each component is with an envelope. When we do that, the spectral plot becomes something like Figure 13.1, a stack of shapes often called a waterfall.


FIGURE 13.1 Time spectrum in waterfall mode.

Notice that the graph is shown at an angle, with the attack in the distant corner. This is because the higher-frequency components are usually lower in amplitude than the fundamental and only show from some angles. Some displays allow you to rotate the image for the best view. A separate graph of the changing phase of the partials may be necessary to complete the picture.

Other Components of Sound

Most sounds have noise components. Noise is a random fluctuation in the signal. If you were to analyze noise as a combination of sine waves, you would need an infinite number of waves that are constantly changing.


Instead we represent noise with a probability distribution, the same kind of graph you get if you chart the results of throwing a pair of dice a hundred times. For noise, the chart shows the likelihood of any frequency being detected in the signal. If the noise spectrum is flat, all frequencies are equally present and the signal is white noise. If the noise spectrum is strong for low frequencies and falls 3 dB per octave, the signal is pink noise. There are other colorful terms for particular distributions, but what we usually encounter in sound is band-limited noise, which has a fairly restricted range and produces some sense of pitch. Noise is impossible to avoid in the real world but is surprisingly difficult to synthesize in a computer. There are equations that produce noiselike waveforms, but the pattern inevitably repeats, producing a result like a looped sample. Luckily, noise is usually such a minor part of musical sounds that oddities go unnoticed. When it is prominent, as in flute-like sounds, noise must be treated with extreme care.
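For experimentation, the common noise colors are easy to approximate in a few lines of NumPy; the pink-noise shaping here is done in the frequency domain, and the band-limited example is simply filtered white noise:

```python
import numpy as np
from scipy.signal import butter, sosfilt

sr = 44100
rng = np.random.default_rng(1)
white = rng.normal(size=sr)                  # flat spectrum

# Pink noise: shape a white spectrum so power falls off as 1/f
# (about 3 dB per octave). Amplitude is scaled by 1/sqrt(f).
spectrum = np.fft.rfft(white)
freqs = np.fft.rfftfreq(len(white), 1 / sr)
scale = np.ones_like(freqs)
scale[1:] = 1 / np.sqrt(freqs[1:])
pink = np.fft.irfft(spectrum * scale, n=len(white))

# Band-limited noise: white noise through a band-pass, giving a loose pitch.
sos = butter(4, [400, 800], btype="bandpass", fs=sr, output="sos")
band_limited = sosfilt(sos, white)
print(white.std(), pink.std(), band_limited.std())
```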

Transients

Most sounds also have transients, components that happen too quickly for the ear to categorize or instruments to analyze. These often occur at the beginning of a sound but can turn up at any time. Transients may have a sense of pitch, but there's just not enough to them for proper analysis. This makes transients difficult to synthesize accurately, but perfection is not really needed. We have already seen how a convincing transient can be produced with a short peak on an ADSR-style envelope. The best transients are produced by modeling techniques, which are explored later in this chapter.

ANALYSIS

The FFT

Determining the frequency, amplitude, and phase for the components of a given tone is an extended exercise in mathematics. I am going to cut to the chase and describe some features of the most common algorithm for doing this: the fast Fourier transform, or FFT. As you might assume for something called fast, the FFT includes some assumptions and shortcuts.

The FFT is used on sampled audio data. Therefore, it is limited to a range of partials that extends to only half of the sample rate (the full-blown Fourier transform requires an infinite number of partials). The FFT works on a short section of waveform—a list of samples. The length of this list is called the FFT size. For computational reasons this should be a power of two. Common FFT sizes are 1,024 or 2,048 samples.


FIGURE 13.2 Windows in the FFT algorithm (window size and hop size).

The resolution of the FFT is determined by its size. If the size is 1,024, the result will be 1,024 data pairs representing the sine and cosine coefficients of components spaced at 1/1,024th of the sample rate. Thus the resolution of a 1,024-point FFT at 44.1 kHz is 43 Hz. Larger FFT sizes improve the resolution but take longer to compute. The data pairs are called bins, and the entire set that represents the spectrum is called a frame. The sine and cosine coefficients are often called the real and imaginary parts. This comes from a common derivation of the formula using complex numbers. The fact that the FFT works on an arbitrary chunk taken out of a stream of data introduces errors into the process. The errors are related to effects that occur at each end of the chunk. If the samples just start and stop, the single line you would expect for the spectrum of a sine tone develops a skirt. This is called spectral leakage and is caused by the mismatch between the FFT size and the period of the tone. To avoid this, a window function is applied to the data before the FFT (Figure 13.2). A window is like an envelope, in this case providing a fade-in and fade-out to the data. Spectral leakage is reduced by this, but if the fade is too long, any noise in the signal is exaggerated. There have been many window shapes proposed, and these shapes are often named after their inventor. The most popular windows are Hamming and Hann, and both offer a good compromise between a narrow skirt and low noise. High-end spectrum displays allow you to adjust the FFT size and window shape to get the best results from a particular signal. Most implementations of FFT use overlapping windows to reduce errors. That way any section of the wave is actually analyzed two to eight times with the results combined. Overlap is specified by hop size, which is the number of samples between starting points. The smaller the hop size, the more layers of analysis are performed. Multiple overlaps improve the time response of the analysis at the cost of increased computation load.
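A bare-bones version of this windowed, overlapping analysis can be written in a few lines of NumPy. The Hann window, 1,024-point size, and 256-sample hop below are typical values, not requirements:

```python
import numpy as np

# Minimal short-time FFT: a Hann window, 1,024-point frames, 256-sample hop
# (4x overlap). Each frame gives one set of bins from 0 Hz to half the
# sample rate.
sr, fft_size, hop = 44100, 1024, 256
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)          # test tone

window = np.hanning(fft_size)
frames = []
for start in range(0, len(signal) - fft_size, hop):
    chunk = signal[start:start + fft_size] * window
    frames.append(np.abs(np.fft.rfft(chunk)))
frames = np.array(frames)

bin_hz = sr / fft_size                        # about 43 Hz per bin
peak_bin = frames.mean(axis=0).argmax()
print(f"resolution {bin_hz:.1f} Hz per bin; peak near {peak_bin * bin_hz:.0f} Hz")
```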


The fast Fourier transform of a signal contains all of the information of the original signal. In fact, you can apply the inverse fast Fourier transform and get the signal back. If you manipulate the data of the FFT before reconversion, some interesting effects can be produced. For instance, the FFT filters demonstrated in chapter 5 simply adjust the amplitude values for each bin. If you multiply the FFT data of two signals, you get a strange result called the convolution of the two. The effect is similar to ring modulation, but with carefully selected sources convolution can produce filtering and reverb. When you look at an FFT spectrum display, you see a series of peaks that rise out of a base of noise and artifacts. It is the frequency and amplitude of these peaks that have most effect on the sound. Further analysis of the FFT data can provide a set of “partials” that describe the peaks and the way they change with frequency and amplitude envelopes. Partial extraction has its tricky points but can be done well if the original recording is free of background noise and the nature of the sound matches the Fourier model.
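The analyze-modify-resynthesize loop can be sketched on a single chunk of audio. Real tools do this frame by frame with windows and overlap, but the round trip is the same idea:

```python
import numpy as np

# Take the FFT of a chunk, scale the amplitude of selected bins (a crude
# spectral filter), and return to the time domain with the inverse FFT.
sr = 44100
t = np.arange(4096) / sr
chunk = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

bins = np.fft.rfft(chunk)
freqs = np.fft.rfftfreq(len(chunk), 1 / sr)
bins[freqs > 1000] *= 0.1                     # attenuate everything above 1 kHz
filtered = np.fft.irfft(bins, n=len(chunk))

check = np.abs(np.fft.rfft(filtered))
print("high component reduced:", check[freqs > 2500].max())
```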

Filter-based Analysis

The FFT is excellent for analysis of pitched tones, but is less useful for noise or clangorous sounds. The preferred tool for this type of measurement is an old friend, the fixed filter bank (aka graphic equalizer). Again, I am going to skip a lot of math and just assert that we can code filter banks with as fine-grained a response as we want. Unfortunately, there are two small problems. One, there is no filtering algorithm that is anywhere close to the FFT in efficiency. Thus the filter bank approach is used either for coarse analysis or for offline analysis. The second and more serious problem is that a filter system can have either fine frequency resolution or quick response to changing input, but not both. This makes sense if you think about it. Discriminating between a tone of 440 Hz and a tone of 441 Hz will probably take a full second, but just getting to within 10 Hz would be ten times faster. The advantage of filter analysis over FFT is that the bands may be spaced arbitrarily rather than evenly across the entire spectrum. Filter banks often have a frequency resolution of one-third octave, a logarithmic spacing that matches the ear well. Only the low-frequency bands need to be narrow enough to slow response down, so a filter-based display can refresh quickly. Third-octave filters are often used for noise measurements and frequency response tests, especially since they are similar to the analog devices on which many engineers were trained. Filters for demanding applications may have 8,000 or more bands. Analysis by multiband filter is as simple as applying a signal and measuring the output amplitude of each band. This does not have the frequency precision of the FFT—all we learn is that energy of some sort is in the band, but if there are enough bands, the accuracy approaches FFT analysis. A multiband filter can be used for resynthesis by applying an excitation signal and manipulating the output from each band according to the analysis data.


This mechanism is known as a vocoder, because it was first developed to transform speech into data for communications. Musicians soon discovered it was interesting to resynthesize vocoded material with pitched excitation, as Wendy Carlos did in A Clockwork Orange. Surprisingly, this only works well with coarse filtering. If the filter bands are too narrow, the analysis and excitation partials are unlikely to match, and nothing will be heard. On the other hand, resynthesis with noise as the input begins to approach the original sound as the number of bands is increased. If the resolution is high enough, using the analysis to control sine tones or noise will nearly reproduce the original. The high-resolution version is sometimes called a phase vocoder.
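A very coarse channel vocoder can be sketched with a handful of band-pass filters; the band edges, filter orders, and test signals below are arbitrary choices for illustration:

```python
import numpy as np
from scipy.signal import butter, sosfilt

# Coarse channel vocoder: analyze a "voice" signal with a small band-pass
# filter bank, follow the envelope in each band, and impose those envelopes
# on the same bands of a pitched excitation signal.
sr = 44100
t = np.arange(sr) / sr
voice = np.sin(2 * np.pi * 220 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))  # stand-in for speech
excite = np.sign(np.sin(2 * np.pi * 110 * t))                                   # square-wave excitation

edges = [100, 300, 700, 1500, 3000, 6000]
env_lp = butter(2, 30, btype="low", fs=sr, output="sos")

out = np.zeros_like(t)
for lo, hi in zip(edges[:-1], edges[1:]):
    band = butter(2, [lo, hi], btype="bandpass", fs=sr, output="sos")
    env = sosfilt(env_lp, np.abs(sosfilt(band, voice)))   # band envelope
    out += env * sosfilt(band, excite)                     # impose on excitation
print("vocoded signal peak:", np.abs(out).max())
```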

ADDITIVE SYNTHESIS

Additive synthesis is one of the oldest approaches to computer-based music. The fundamentals were worked out in the 1960s, and several of the classic compositions of mainframe computer music used additive techniques. The attraction of additive synthesis is the potential for total control of the sound—the frequency, phase, and amplitude of every partial can be exactly specified for every sample of the tone. The downside is that the frequency, phase, and amplitude of every partial must be exactly specified for every sample of the tone. This is a lot of work for a computer, not to mention a lot of responsibility for a composer. Computers of the 1960s and 1970s were slow and expensive to use, so composers were not encouraged to simply goof around and see what experiments sounded like. They were expected to work slowly and carefully with at least some idea of the eventual results. Research programs were established to determine the makeup of common sounds (principally orchestral instruments), and a composer was expected to work directly with the numbers in an environment where a misplaced semicolon could wipe out an entire day's work. In those days a prolific composer could produce two pieces a year.

Modern additive synthesis allows a composer to do what was once a month's work in less than a minute. Analysis of a sound requires less time than it would take to listen to it, and the composer can manipulate the analysis with a variety of graphical tools. It is even possible to interpolate between analyses and create sounds that make a gradual transition ("morph") of timbre from one instrument to another. Many of these capabilities are still found only in research software, but elements of the process are turning up in consumer applications. To demonstrate the basic principles, we'll use a free application by Michael Klingbeil called SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis, www.klingbeil.com/spear/). When you open a sound file, SPEAR performs an analysis of the sound and shows the results in a window (Figure 13.3). You specify a desired frequency resolution and the program works out appropriate FFT size and window characteristics. The display looks like a fine-grained spectrogram.


FIGURE 13.3 The SPEAR analysis window.

Time is shown left to right, frequency is up, and relative amplitude is indicated by shades of gray. There are zoom controls that let you focus right down to the smallest unit of analysis, and a floating window shows the exact frequency, pitch, and time at the cursor. There is no detailed amplitude information, but the program is under development and that may soon be added. Even without this feature, there are plenty of interesting things to do with a sound. The space bar initiates the sound, and what you hear is not the original file but a resynthesis of the data on the screen. Change the image and the sound will change. The pitch and rate of playback can be independently manipulated, a technique that is finding its way into some sampling synthesizers. Selecting a partial and copying it to a new window enables you to hear what it contributes to the sound. This is the quickest way to disassemble a sound, and it is easily undone to try a different variation. There are several ways to select partials, including by amplitude or length.


If the sound is fairly simple and harmonic in nature, the effects should be reasonably predictable from our experiences with synthesis so far. The lack of amplitude envelopes is an annoyance, but the relative strengths of partials are easily manipulated. You can also apply fade-in and fade-out to selected partials, creating envelopes of a sort. You can drag selected partials up or down in frequency, and if you zoom in all the way, you can reshape the frequency curves at the smallest unit of analysis. DVD example 13.1 is a video demonstration of some of these operations.

You can draw new partials with a pencil tool. Since there are no constraints to the drawing (and only rough amplitude control) most of the results will sound like free-running oscillators. Here is a technique to produce a harmonic structure: First draw a horizontal line at the fundamental frequency. Copy the line to the clipboard, then apply a frequency shift equal to the fundamental to the original. Paste the copy back in to produce a perfectly tuned harmonic pair. Do this again (copy, shift, paste) with the second harmonic, using the same frequency shift. Once you have created a harmonic grid, save it as a basis for future experiments. DVD example 13.2 demonstrates this process.

Many sounds will include things that don't seem particularly spectral, that is, they don't match the rule of partials at harmonic multiples of a fundamental frequency. These are the result of the program's attempts to deal with transients and noise, which often show up in unexpected ways. Figure 13.3 shows the SPEAR analysis of the sound of blowing across a bottle, a sound with a lot of broadband noise. When you play the resynthesized sound from SPEAR, you hear some, but not all of the noise. SPEAR has grabbed the stronger components, which show up as faint wiggly lines. Remember, the FFT process is a series of snapshots. What SPEAR has done is latched on to some prominent data points and connected the dots. If you repeat the analysis with higher resolution the wiggly partials straighten out.
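Resynthesis from partial data of this kind reduces to interpolating each partial's frequency and amplitude and integrating the phase. The three break-point partials below are invented for illustration; SPEAR's own file format is not modeled:

```python
import numpy as np

# Additive resynthesis from break-point partials: each partial is a list of
# (time, frequency, amplitude) points. Frequency and amplitude are
# interpolated per sample; the running phase is the integral of frequency.
sr, dur = 44100, 1.0
n = int(sr * dur)
t = np.linspace(0, dur, n)

partials = [
    [(0.0, 220.0, 0.0), (0.05, 220.0, 1.0), (1.0, 218.0, 0.0)],
    [(0.0, 440.0, 0.0), (0.10, 441.0, 0.5), (1.0, 440.0, 0.0)],
    [(0.0, 663.0, 0.0), (0.02, 660.0, 0.3), (0.6, 660.0, 0.0)],
]

out = np.zeros(n)
for points in partials:
    times, freqs, amps = (np.array(col) for col in zip(*points))
    freq = np.interp(t, times, freqs)
    amp = np.interp(t, times, amps)
    phase = 2 * np.pi * np.cumsum(freq) / sr      # integrate frequency
    out += amp * np.sin(phase)
print("resynthesized", len(out), "samples from", len(partials), "partials")
```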

Additive Synthesis in Alchemy Alchemy by Camel Audio is a comprehensive synthesizer. The architecture of Alchemy is based on vector synthesis, a technique introduced by Dave Smith in the Prophet VS and expanded by Yamaha and Korg (Figure 13.4). A typical vector synthesis instrument contains four sound generators with a mixer to blend the sounds. The paradigm is that each generator is at the corner of a square, and moving a marker (or joystick) changes the balance of sounds. Mixing gestures can be preprogrammed or recorded from live performance. This provides such things as a note that starts out like a bass drum and continues with an organ tone. Alchemy is equipped with a full complement of filters, envelopes, and effects, but its most advanced features are in the generators, starting with a nice implementation of additive synthesis. Additive synthesis is set up by importing a sound file while the additive option is selected in the file dialog box. This brings in an analysis of the file, which is used for resynthesis in a manner similar to SPEAR. You can view the data of the analysis


FIGURE 13.4 Vector synthesis.

in the additive editor, which is shown in Figure 13.5. This screen provides a simplified view of the immense amount of information contained in the analysis data. The lower pane shows either the overall envelope of the sound (“all partials”) or an envelope for a selected partial. The information shown in the envelope may be the amplitude, frequency, or pan for each partial. The envelope has break points indicated by small squares. The upper pane shows the relative values for each partial at a selected break point. Break points are initially determined by the program, but you can increase or decrease their number with the detail control. With the envelope display in all partials mode it’s easy to select a break point and adjust the spectral balance in the upper pane. To adjust the envelope for a single partial, use the selected partials mode and drag the break points on the envelope. Click a partial in the upper pane and the envelope will switch to show it. There is an overall mode, in which the bars of the upper pane control the relative strength of the partials over the entire sound. Overall mode is the way to make big changes in the sound. DVD example 13.3 shows how to build a sound from scratch. The best starting point is to import a sine wave with the additive option chosen. The display shows a single partial and a square envelope. Set the envelope to all partials, and modify it to give some attack and release time. Add a break point (right click on the line) before the loop start and raise it to make an ADSR shape (you will have to add some release time to the master envelope in the main window in order to hear the changes made). Now add some amplitude to partial 2. Different levels will provide a range of timbres from pure to something bassoon-like when the 2nd partial is twice as strong as the first. Note that the pitch heard is still the fundamental unless you


FIGURE 13.5 The Alchemy additive editor.

remove it entirely. Now add a few more partials with an exponential drop from the 2nd. Switch overall mode off, select the second break point (the start of the loop), and rebalance the partials so that the first is stronger and the others reduced. This will make the loop pop, so copy the loop start (copy break point from the file menu next to the overall button), select the loop end, and choose the paste break point function. Now change to the final break point and adjust the partials so that only the fundamental is strong. The result is a sound with a bright attack and mellow sustain. Switch to pitch display, select the 2nd partial, and change the envelope mode to selected partial. Now add a break point between the loop start and loop end and adjust the pitch for a slight change (anything over 0.2 semitones is too much). This will add a small detuning in the loop, producing a slight vibrato. Now change to the 4th partial and add a similar detuning downward. The sense of liveliness will still be there, but the note stays in tune. When building sounds in this way, it is good to think in terms of three distinct parts of the sound—early, loop, and release. The early section does the most to define the sound. You will find small changes in the spectrum have a powerful effect. The trick in editing the loop is to keep the loop from overpowering the sound. The start and end points of the loop must match in all respects or there will be recurring pops. The deviations in the loop must be slight. The trick of offsetting a variation of one partial with a contrary motion in another will often make a sound less


FIGURE 13.6 Trumpet spectrum in Alchemy.

artificial. The timbre of most sounds becomes simple in the release. This gives sweetness to the decay and helps the sound make connections in complex harmony. The next exercise is to modify an existing sound. This starts with the trumpet we used as a basis for some of the FM explorations in the last chapter. As Figure 13.6 shows, the spectrum has the overall rounded shape typical of brass. In overall mode, reduce the amplitude of the fundamental and second partial. Now you know what mutes do. This is similar to a high-pass filter but has a much more specific effect, suppressing particular partials by modifying the physics of the instrument. You can play with the partial levels and create an infinite variety of derivative sounds. Increasing the third partial, for example, moves the tone toward a cornet timbre. Resynthesized sounds are realistic, but they suffer from the same problem all sampled sounds have. Left alone, each instance of a pitch is exactly the same, and transposed notes are subject to the munchkin effect. The solution provided by resynthesis applications is morphing. In Alchemy, morphing is set up by importing different sounds to the A and B engines. For example, let's take two samples of the same trumpet made at different pitches, C3 and C4. The two samples can be heard at the beginning of DVD example 13.4. The trumpet tone changes quite a bit in the higher octave, so with these loaded in, two distinct instruments are heard on each pitch. This is shown on the first scale in DVD example 13.4. The morph control has


FIGURE 13.7 Key follow with transition between 50 percent and 59 percent.

a long list of modes, but the important ones are X-fade and morph. In X-fade mode, the morph control simply fades from the A sound to the B sound. This should be connected to the note played—in Alchemy, this is done by selecting the Morph X knob and setting its modulation link to key follow. To make the transition complete in just one octave add a modulation map as shown in Figure 13.7. Now MIDI note C3 will only play source A and MIDI note C4 will only play source B. This X-fade is used on the second scale in DVD example 13.4. What do you hear in the middle? Two trumpets. The final scale in the example was made in morph mode. In morph mode, only one trumpet is heard. The settings for partial amplitude and so on are calculated by interpolating between the spectral analyses of A and B. Thus the sound on G3 is somewhere between the sounds on C3 and C4. Now is the time to start exploring morphs of wild combinations of sound. Some will work, some will not. It’s hard to predict ahead of time, so go ahead and try anything. Here are a few general observations: A combination of a pure sound and a rich sound, such as a flute and a saxophone, will probably have a sudden transition. The rich partials of the sax will color the composite sound most of the time. If you combine sounds with different pitches, there will be a frequency sweep as the morph value is changed. There is a pitch select field in the import dialog which determines the root pitch. Alchemy attempts to set this automatically, but occasional mistakes arise. You can’t change this after the analysis. Some files just will not sound right when resynthesized. This will almost certainly be true of noisy and extremely rich sounds—130 partials is not enough to do justice to sounds such as a bowed guitar. Occasionally, the analysis produces spurious tones as several nearly equal partials are combined into one. Changing the analysis setting to “best frequency” may help. Morphing considers envelope times as well as amplitudes. Alchemy marks five points during import and analysis. These are the start, end of attack, loop start, loop end, and sound end. Morphing between a short and long sound will include adjustment of the envelopes relative to these points. This makes it possible to combine sounds like music box and Hammond organ. The time morphing will have


peculiar effects on combinations of short and long sounds such as a music box and a harp glissando. The harp will be sped up, of course, but the music box will be stretched, making some brief partials sound like sustained tones (DVD example 13.5). Alchemy allows the use of alternate waveforms for the additive synthesis engine, although this is not as useful as you might think. Substituting saw waveforms for the sine tone essentially produces a noisy version of the original.
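The difference between X-fade and morph comes down to where the mixing happens: after resynthesis or before it. The sketch below illustrates only the idea—the partial amplitudes and pitches are invented, and Alchemy's interpolation is certainly more sophisticated—but it shows why a crossfade lets you hear two instruments at once while a morph produces a single intermediate one.

import numpy as np

SR = 44100
t = np.arange(SR) / SR                          # one second of samples

def harmonic_tone(f0, partial_amps):
    """Resynthesize a tone from a list of harmonic partial amplitudes."""
    out = np.zeros_like(t)
    for h, amp in enumerate(partial_amps, start=1):
        out += amp * np.sin(2 * np.pi * h * f0 * t)
    return out

spectrum_a = np.array([1.0, 0.5, 0.3, 0.1, 0.05])   # invented mellow C3 spectrum
spectrum_b = np.array([0.3, 0.9, 0.8, 0.6, 0.5])    # invented brighter C4 spectrum
f0_a, f0_b = 130.81, 261.63                         # the two root pitches
x = 0.5                                             # morph control position

# X-fade: mix the two finished sounds; halfway, both pitches are audible.
xfade = (1 - x) * harmonic_tone(f0_a, spectrum_a) + x * harmonic_tone(f0_b, spectrum_b)

# Morph: interpolate the analysis data first, then resynthesize one sound.
morph = harmonic_tone(f0_a * (f0_b / f0_a) ** x,    # pitch interpolated geometrically
                      (1 - x) * spectrum_a + x * spectrum_b)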

SPECTRAL SYNTHESIS
Alchemy also provides phase vocoder analysis, which can then be used for spectral synthesis. If the spectral option is chosen when a sound is imported, a fixed filter analysis is performed. Once this is loaded you have a choice of resynthesis techniques. In the mode marked resynth, sine tones at the frequency of the filter bands are mixed to reconstruct the sound. The result is a bit noisier than the additive method. This is especially clear in attacks, which tend to thump. In the noise resynth mode, noise is used as the input to the filter, giving a raw but somewhat reverberant sound. DVD example 13.6 has three versions of resynthesis: additive, sine-based spectral, and noise-based spectral. The example begins with the original bowed bass note. The results of spectral synthesis are dependent on the nature of the initial sound. A sound with a slow attack and rough texture will sound similar in additive and spectral versions. A pure sound with a sharp attack will be noisy and have a slow response in spectral—that slow response will exaggerate any transients, especially if the import option is set to best frequency. A sound that is essentially noise, such as rain or a cymbal, will sound reasonably natural in spectral but quite peculiar in additive synthesis. For many sounds, there is some benefit to each style of resynthesis, and Alchemy provides a hybrid mode. When add+spec is chosen, both types of analysis are loaded, but the spectral section is restricted by a high-pass filter. Thus the lower frequency partials will be contributed by additive synthesis and the high section will be provided by spectral synthesis. This is especially effective on sounds like flute. The spectrum analysis can be displayed in a window as shown in Figure 13.8. The horizontal stripes represent filter bands with the lowest frequency at the bottom. Figure 13.8 shows the analysis set for best time, while Figure 13.9 shows an analysis of the same sound at the best frequency setting. Both images have an inset with a magnified image of the lower left corner. You can see in the insets that the lines of Figure 13.9 are actually composed of longish blocks while the lines in Figure 13.8 are thicker. This makes the sound of the best time version less definite in pitch. The best frequency version has thinner horizontal lines, producing better accuracy in harmonic structure. However, the attacks of the high partials clearly start later, which produces a mushy effect. DVD example 13.7 demonstrates both versions. The original sound comes first, then a time-stretched version using the best time


FIGURE 13.8 Spectrum with best time settings.
FIGURE 13.9 Spectrum with best frequency settings.


analysis. Finally, we hear the same stretch with the best frequency analysis. In most situations the compromise settings will produce better results than either best frequency or best time. It is possible to edit the spectral display. You can define a brush shape and use it to draw in or erase spectral lines. I find the latter more useful, editing out unwanted bits of less than perfect recordings. The variable opacity of the brush provides a gentle filtering effect. Drawing sounds in from scratch requires a lot of practice, and the tools supplied don’t really have much precision. What you can produce are some broad effects with swooping or warbling pitches. Importing images can be a lot of fun. Images with black backgrounds and a general left to right flow work best. Once you understand how the conversion process works, you can experiment with an image editing program like Photoshop to produce subtle effects. Figure 13.10 is a picture of an odd fish, and in DVD example 13.8 you can hear the equally odd results of importing this image into the spectral editor. Imported graphics usually require some image processing to get a useable size and resolution.
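The noise-driven resynthesis described above can be approximated with a short-time Fourier transform: measure the band magnitudes of the source frame by frame, then impose them on frames of white noise. The sketch below is a rough stand-in for the idea, not Camel Audio's code, and the frame size plays the role of the best time/best frequency trade-off—longer frames pin down pitch but smear attacks.

import numpy as np

def noise_resynth(source, frame=1024):
    """Impose the source's frame-by-frame band magnitudes on white noise."""
    hop = frame // 2
    window = np.hanning(frame)
    rng = np.random.default_rng(0)
    out = np.zeros(len(source) + frame)
    for start in range(0, len(source) - frame, hop):
        seg = source[start:start + frame] * window
        mags = np.abs(np.fft.rfft(seg))               # analysis: band magnitudes
        spec = np.fft.rfft(rng.standard_normal(frame) * window)
        spec *= mags / (np.abs(spec) + 1e-12)         # keep noise phase, source magnitudes
        out[start:start + frame] += np.fft.irfft(spec) * window   # overlap-add
    return out[:len(source)]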

MetaSynth
If you are intrigued by hand-drawn spectra, I suggest you explore the program MetaSynth by U&I Software. This program is only available for the Macintosh, but many devoted PC users keep a Mac around just for this program. The basic paradigm of MetaSynth is the conversion of image to sound. This can be done at the level of spectrum design or as a method of compositional organization. The array of graphic tools is prodigious, with many processes similar to those found in Photoshop to provide filtering, reverb, and other effects. The best place to start is with Image Synth, the main composing canvas (Figure 13.11). Choose a modest canvas to start, perhaps 512 by 128 pixels. Each pixel on the vertical axis will represent a pitch and the brightness of the pixel controls the amplitude. Draw in some curves and blocks of color (red and green indicate stereo panning and blue is silence). Now render the drawing. What you hear depends on the pitch mapping, which can range from whole tones to fifty notes per octave. The sound also depends on the waveform or instrument chosen for rendering. Basic effects, like reverb and echo, modify the image. Reverb, for example, is a blur to the right. For a final surprise, try a displacement map. This stretches and twists the image, producing swoops and fades in the sound. The gestures produced in the Image Synth can be saved in two ways: as a recording or as a preset that will allow further modification of the sound. As a longtime MetaSynth user, I have become used to exporting the sounds to a DAW for assembly, but the latest version includes a montage editor that provides similar features. DVD example 13.9 is a video demonstration of some MetaSynth features. The Spectrum Synth takes a similar graphic approach but on a true spectral scale. The drawing tools and constraints are well designed for working at the overtone level of detail. The canvas is divided into a grid of up to sixty-four "events,"


FIGURE 13.10 Fish photo used for DVD example 13.8.
FIGURE 13.11 MetaSynth Image Synth.


each of which is a slice of spectrum. The duration of individual slices can be adjusted as desired. There are ways to enter harmonic or free partials and modify their amplitude and offset. When the sound is played the events are linked with options for crossfading and interpolation amounts. Depending on the time scale used, the Spectrum Synth can produce a single note or a sonic gesture. The spectrum designed can be transferred to the Image Synth and used to play the score. Recordings can be imported into either the Image Synth or the Spectrum Synth. The latter does a precise Fourier analysis, while the former is more of a vocoder effect.
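The Image Synth paradigm—vertical position sets pitch, brightness sets amplitude, time runs left to right—reduces to a small mapping. The sketch below renders a tiny hand-made "canvas" with an invented pitch mapping and sine-wave "instruments"; it is a toy illustration of the mapping, not MetaSynth's rendering engine, and it ignores the red/green panning convention.

import numpy as np

SR = 22050
COLUMN_DUR = 0.05                     # seconds of sound per pixel column (assumed)

def render_image(image, low_pitch=110.0, semitones_per_row=1.0):
    """image: 2-D array of brightness 0..1; image[0] is the top row."""
    rows, cols = image.shape
    samples_per_col = int(SR * COLUMN_DUR)
    n = cols * samples_per_col
    t = np.arange(n) / SR
    out = np.zeros(n)
    for r in range(rows):
        # bottom row is the lowest pitch, rising by a fixed interval per row
        pitch = low_pitch * 2 ** ((rows - 1 - r) * semitones_per_row / 12)
        env = np.repeat(image[r], samples_per_col)    # brightness becomes amplitude
        out += env * np.sin(2 * np.pi * pitch * t)
    return out / rows

canvas = np.zeros((24, 40))                           # a black background
for c in range(40):                                   # draw a rising diagonal line
    canvas[23 - (c * 24 // 40), c] = 1.0
audio = render_image(canvas)                          # an upward sweep of sine tones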

GRANULAR SYNTHESIS
Granular synthesis is based on an alternate view of sound. Most of the time, and certainly throughout this book, sound is described in terms of the waveform, the continually changing pressure at the ear. This agrees with the physics of sound, but the perception of sound is a different thing. The inner workings of the ear are not completely understood, but the best model we have so far is a spectrum analyzer, with individual nerves to detect a specific frequency. The physiology of the nervous system is beyond the scope of this book, but one fact is pertinent here—nerves are apparently more or less binary, sending a simple impulse when triggered by the mechanism to which they are attached (in the case of the ear, tiny hairs). They must rest briefly before firing again. Any impression we get of a continuous tone is caused by repeated signals from a small group of nerves, thus the tone may in fact be briefly interrupted with no audible effect. The phenomenon is similar to film, where a series of still images gives the impression of a moving picture. This is the basis of audio data compression formats such as MP3. Granular synthesis starts with an ordinary recording of sound. This is cut up into short sections or grains from 1 ms to 100 ms long. The slicing must be carefully done, as any abrupt transitions will cause easily detectable artifacts. The usual approach is to turn the sound on and off in a smooth manner as illustrated in Figure 13.12. Note that there is a small overlap as one grain fades out and the next fades in. Granular synthesis employs the same window system used for the FFT. When the grains are played back at the rate they were created, there is little difference in the sound. However, when the playback rate is sped up (the grains will overlap a bit more) the sound will be completed sooner. If the playback rate is slowed (some grains may be played twice), the duration of a sound will be extended. None of this affects the pitch, which comes from the chunks of waveform within the grains. Of course there is a bit more to high-quality time stretching than this. Simple repetition of the raw grains usually sounds terrible: if the grains overlap, there are comb filtering effects where the waveform in successive grains does not match in phase. The window function itself is often audible, adding a low-frequency buzz to the tone. To fight these artifacts, the grain size and window shape must be optimized for the desired change, with grain size adjusted to match


FIGURE 13.12 Granular synthesis.

the frequency of the input and with some scheme for synchronizing phase. The best systems analyze the signal and apply grain repetition only to sustained parts of the sound, passing transients through with no change. This cleans up drum hits and vocal consonants. The basic mechanism of granulation is pretty simple to implement, but automatic optimization is tricky and probably accounts for much of the quality difference between granular applications. DVD example 13.10 is a bell followed by two examples of the same bell stretched to four times the original length. A double attack is clearly audible in the first long version, which was done with settings appropriate for voice. Changing the settings to the music (percussive) option produced the effect in the second long version where the double attack has been eliminated in favor of extension of the clang that immediately follows the attack in the original. Pitch changing is done as a two-step operation. A granular time stretch is applied first, then the wave is resampled to change the pitch. The pitch change produces a corresponding time change that cancels out the original stretch process. The result is pitch change with constant time. With the amount of analysis and processing involved, the highest quality pitch changing can only be applied to recorded material. Real-time pitch changes usually have some degree of roughness to the


sounds, especially if the shift is extreme. DVD example 13.11 has the original bell followed by several versions of pitch shifting. The first modification is a shift up one octave without duration correction. The sound is clean with no artifacts other than a bit of noise. The second version is one octave up with duration correction. This is also pretty clean. The third is a two-octave shift with duration correction, still faithful to the original. There is a pronounced low-frequency component, but spectral analysis shows that it was there all the time—the two-octave shift brought it above the Fletcher-Munson threshold. The last version is a two-step process; the pitch change followed by a duration change. In this version, the attack is somewhat more pronounced, and there is a surprising artifact at the end, a burst of the looped bell waveform. This shows the importance of the order in which these operations are performed. The composer’s interest in granular processes does not stop with pitch and time manipulation. In fact, the use of granulation in these tricks is only of passing interest—it’s a tool for modifying sounds in a particular way, and if something comes out tomorrow that works better, fine. The fascinating aspect of granular synthesis lies in systems that are not so artful. When granular synthesis “fails,” it fails in interesting ways. Granular synthesis is a sound source with unique properties that can be enjoyed on its own or used as an input to subtractive systems. Many of the instruments we have already studied include at least some granular capability.
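The basic time-stretching mechanism—read windowed grains from the source at one rate, write them to the output at another—is short enough to sketch directly. This is the crude version the text warns about: there is no phase alignment or transient protection, so comb filtering and doubled attacks are to be expected. The pitch is untouched because the waveform inside each grain is unchanged.

import numpy as np

def granular_stretch(source, stretch=4.0, grain=2048, overlap=4):
    """Stretch a sound to stretch times its length with overlapped grains."""
    window = np.hanning(grain)
    out_hop = grain // overlap                 # grain spacing in the output
    in_hop = out_hop / stretch                 # the source pointer moves more slowly
    n_grains = int((len(source) - grain) / in_hop)
    out = np.zeros(int(len(source) * stretch) + grain)
    norm = np.zeros_like(out)                  # track window overlap for unity gain
    for g in range(n_grains):
        src = int(g * in_hop)                  # where this grain is read
        dst = g * out_hop                      # where it is written
        out[dst:dst + grain] += source[src:src + grain] * window
        norm[dst:dst + grain] += window
    return out / np.maximum(norm, 1e-9)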

RTGS-X
There are several shareware systems available for initial studies of granular synthesis. The examples here are prepared with Real-time Granular Synthesizer X (RTGS-X) by Marcel Wierckx. This is downloadable software with a usable demo version. The nomenclature of granular synthesis varies from one application to another, but the same parameters are always involved.
Buffer. A section of memory holding a short sound file. Most granular synthesis is based on the manipulation of one sound event, rather like sampling.
Index pointer. The buffer is scanned and grains played starting from a movable index pointer. The pointer does not necessarily move in normal time; in fact, it is often stationary or moved interactively, even backward. This produces a scrubbing effect or slowly mutating sounds.
Window shape. Determines the details of how the grain is produced. A triangular shape usually gives the smoothest sound, but many prefer the richer spectrum produced by curves such as the Hanning window. The envelope need not be symmetrical, especially if the intent is to hear individual rhythmic patterns.
Grain frequency or grain density. The number of grains produced per second. Low grain frequencies result in pulsing patterns, whereas a frequency above 20 will produce some sort of tone. Grain delay is sometimes specified instead of frequency—this is the time between grain starts.


Grain length. The duration of a grain. This interacts with grain delay to produce pulsing or continuous sounds. If the length is shorter than the delay, there will be space between grains and the waveform of the window function produces either a tone or series of pulses. Once the length matches the delay, grains begin to overlap, and the tone of the grains will be predominant in the sound. When the length becomes twice the grain delay, comb filtering effects begin to appear. As the overlap increases a characteristic "cloud" sound is generated.
Transposition. Applied to the grains before the window function, this changes the pitch of the intrinsic tone. When the granulation is part of a MIDI-style synthesizer, the note number usually controls transposition. When the pitch of the transposed tone goes too low, the volume starts to drop because the grains are not as long as a complete cycle.
These parameters might be preset or controlled by envelopes. They are often modified by random processes to produce shimmering textures. As with any random process, the range of change should be carefully considered. DVD example 13.12 demonstrates the effects of changing grain delay in RTGS-X (Figure 13.13). The delay starts out at 1 ms, with a 100 ms length. There are multiple overlaps and noticeable comb filtering. As the delay is lengthened, the pitched filtering fades away and we begin to hear a reasonable reproduction of the original sound. Beyond that, the grains are heard as separate pulses. In the remainder of the file grain delay is randomized. DVD example 13.13 demonstrates the effect of grain length. The delay is held at 60 ms while the length is extended. At first the only thing heard is a series of pops as the length is too short for a complete cycle of the source sound. A sense of pitch develops when the length allows about 6 cycles. Above that, the pulses of sound gradually blend into a tone. DVD example 13.14 shows the effect of the buffer index. At the beginning of the file, the index is held at the start of a gong sound. This gives a somewhat wobbly tone—the delay and length could be adjusted to improve it. The index is now slowly moved through the gong sound, and timbres associated with parts of the gong recording are heard. In the next section, the index plays through the gong sound at normal speed, reproducing the sound well enough. This is followed by the index running double speed, half speed, and quarter speed with the expected change in overall length in each case. Then the index is reversed and finally randomized. Many commercial synthesizers include some form of granular synthesis, but they seldom implement the full feature set. For instance, the Malström instrument supplied with Propellerhead Software's Reason is based on two granular oscillators. These do not have adjustments for grain density or length—these seem to be preset for each wave source. The main granular action is moving the index through the sample, either at a preset rate or controlled by an LFO.
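Tying the parameters described above together, here is a sketch of a simple grain cloud: a buffer, an index pointer with a little randomization, a grain delay, a windowed grain length, and a transposition applied by resampling each grain. The parameter names follow the text rather than RTGS-X itself, and the numbers are only starting points for experiment.

import numpy as np

SR = 44100
rng = np.random.default_rng(1)

def grain_cloud(buffer, seconds=4.0, index=0.5, grain_delay=0.02,
                grain_len=0.08, transpose=1.0, jitter=0.3):
    out = np.zeros(int(seconds * SR))
    length = int(grain_len * SR)
    window = np.hanning(length)
    t = 0.0
    while t < seconds - grain_len:
        # randomize where in the buffer this grain starts (the index pointer)
        start = int(index * len(buffer) * (1 + jitter * rng.uniform(-1, 1)))
        start = int(np.clip(start, 0, len(buffer) - int(length * transpose) - 1))
        # transposition: read a longer or shorter slice, resample it to grain length
        src = buffer[start:start + int(length * transpose)]
        grain = np.interp(np.linspace(0, len(src) - 1, length),
                          np.arange(len(src)), src)
        dst = int(t * SR)
        out[dst:dst + length] += grain * window
        t += grain_delay                        # grain delay sets the grain frequency
    return out

test = np.sin(2 * np.pi * 440 * np.arange(2 * SR) / SR)   # stand-in for a sound file
cloud = grain_cloud(test, transpose=0.5)                  # an octave-down shimmer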


FIGURE 13.13 RTGS-X main window.

Absynth 4 has two granular modes. The sync granular mode works on internal wavetables and only has parameters for density and scatter. Density controls the overlap, but whether by manipulating grain frequency or length is not clear, and scatter randomizes the grain frequency. The latter is based on the transposition on the modulation tab, which produces a chorus-like sound when set at some interval from the base pitch. The more advanced granular mode works on external samples. Here the parameter selection is fairly complete, with control over play rate, start time (index), density, grain length, and randomization of frequency, index, and amplitude.


My favorite approach is to use a sample with a lot of variation of timbre and to control time with an envelope. DVD example 13.15 gives a couple of examples of granular synthesis in Absynth 4. The source file is a set of cell phone rings.

MODELING SYNTHESIS
The most active area of development in synthesis right now is probably modeling. In the grand view, modeling software mimics processes that occur in nature. Meteorologists attempt to model weather systems, economists attempt to model the stock market, and so on. In synthesis, modeling is happening in two areas. Analog modeling copies the processes of classic electronic circuits and acoustic modeling follows the physics of real instruments.

Analog Modeling
As an example of analog modeling, consider an oscillator. A software oscillator is a routine that looks up values in a wavetable. To generate a sound, the program steps through the table according to the sample rate and desired pitch, fetching the values to send to the output. Changing the frequency is simply a matter of changing the size of the step—direct and instantaneous. The modeling approach is based on an oscillator circuit, which is a combination of transistors, resistors, and capacitors with specific electrical properties. The program models the behavior of each component, including the discharge curve of the capacitors and the transfer function of the transistors. The frequency is indirectly controlled by specifying current through one of the transistors. When this is changed, the effects must spread through the whole circuit before the pitch is stable at the new value. This delayed response is subtle but audible. Other side effects include interesting variations in waveform and idiosyncratic responses to playing gestures. DVD example 13.16 illustrates the difference in sound of a short passage played twice on Absynth, then twice on a model of the classic ARP 2600 modular synthesizer. Note that on the analog model, the passage differs slightly in timbre between the first time it's played and the second time. There is no practical difference between using a digital synthesizer and an analog model. Many composers are unaware of what technique their instruments use, just that the sound is subtly different. In truth, most analog models compromise at points where the difference is insignificant. Tassman, for example, has model-based oscillators and filters, but digital-style amplifiers. The most obvious difference is in applications that aim to capture the physical experience of patching a modular machine, such as TimewARP 2600, the application used in DVD example 13.16. As Figure 13.14 shows, the interface is a photographic depiction of the instrument where the composer makes changes by moving patch cords and sliding knobs. Of course


FIGURE 13.14 TimewARP 2600 synthesizer.

there is no reason for the most faithful of emulations to copy the failings of the originals. The unstable intonation and high noise levels of the original instrument would not be acceptable in any instrument today, and it would be silly to deny the benefits of polyphony and keyboard velocity response. Another active area of analog modeling is recapturing the sounds of classic audio equipment of the 1950s and 1960s. This gear was frankly funky, but a lot of marvelous music was produced with it. Despite the imprecations heaped on these devices when they were the best available, there is now serious nostalgia for that archaic sound. Surviving equipment fetches premium prices, and a few manufacturers have reopened their assembly lines. Software plug-ins provide a more affordable approach, modeling the sound of the old gear, warts and all. These range from generic applications that add a tube-like distortion to any signal to meticulously designed simulations based on a part by part analysis of the original. It’s hard to describe the sonic characteristics these plug-ins provide, but it is good to know the classic sounds will not be lost.
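The contrast between the direct digital approach and the modeled response can be caricatured in a few lines. In the sketch below, the "digital" oscillator jumps to a new frequency instantly, while the "modeled" one slews toward it through a one-pole smoother—a deliberately crude stand-in for a circuit's settling time, not a simulation of any particular instrument or plug-in.

import numpy as np

SR = 44100
TABLE = np.sin(2 * np.pi * np.arange(2048) / 2048)        # one cycle in a wavetable

def wavetable_osc(freqs, smooth=0.0):
    """freqs holds the target frequency for every sample; smooth=0 is instant."""
    out = np.empty(len(freqs))
    phase = 0.0
    f = freqs[0]
    for n, target in enumerate(freqs):
        f += (target - f) * (1.0 - smooth)     # one-pole slew toward the target
        out[n] = TABLE[int(phase) % len(TABLE)]
        phase += f * len(TABLE) / SR           # the step size sets the pitch
    return out

freqs = np.where(np.arange(SR) < SR // 2, 220.0, 330.0)    # a jump after half a second
digital = wavetable_osc(freqs, smooth=0.0)                 # pitch changes instantly
modeled = wavetable_osc(freqs, smooth=0.9995)              # pitch settles over tens of ms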

Acoustic Modeling
Acoustic modeling offers unique sounds to the electroacoustic composer. Here the physical processes of real instruments form the basis of the computation. In a guitar model, a signal is derived from the behavior of a picked string, and that signal is


FIGURE 13.15 Modelonia acoustic modeling synthesizer.

transferred via a virtual bridge to a resonant body with its own modeled response. A faithful model will include factors for the weight of the strings, the stiffness of the frets, and other fine details, similar to the matters that occupy a skilled luthier. Of course, successful modeling depends on knowing exactly how the instrument works. Even the most humble instrument is a complex device, and the math that describes phenomena such as the flexing of a guitar back has yet to be completely worked out. It is an ongoing process—the cigar box banjo is pretty well understood, the violin not. Even cigar box banjos have musical interest, however, and for the rest, the old saying is true: if you need the sound of a fine violin, get a fine violinist. To demonstrate the fundamental aspects of acoustic modeling, we'll explore a program called Modelonia by Luigi Felici. Modelonia is available as shareware from nusofting.liqihsynth.com (Figure 13.15). This is a basic program that implements the process in a direct manner. Modelonia includes models of two types of sound, the plucked string and the blown pipe.
String Model
The most common string model is based on a delay line that feeds back into itself. This emulates the waveform traveling down the string and reflecting back at the


FIGURE 13.16 Plucked string model.

end. The length of the delay determines the pitch, just as we learned in chapter 5. The feedback signal is filtered to produce the desired timbre. (See Figure 13.16.) The plucked string algorithm can be explored by choosing basic pluck from the Sound Wizard menu (Figure 13.15). Modelonia imposes a synthetic ADSR envelope on the sounds, so set the attack and decay to minimum with sustain and release at maximum to hear the model clearly. When the instrument is quiet, all of the elements of the delay are zero. This may be thought of as a stretched string at rest. If we copy in the shape of a string deformed by a pick, the result will be similar to real life. The deformed shape will quickly become a pitched tone as it cycles through the loop. In Modelonia the initial deformation is displayed in the pick pane of the window. Moving the sliders changes the height, symmetry, shape, and slope of the displacement. These are analogous to changing the shape, stiffness, and position of a pick. DVD example 13.17 demonstrates the range of sonic change that can be produced with the pick parameters. The string can also be excited by noise—this is used in bowed string emulators (along with a more complex delay line arrangement to simulate the position of the bow). In Modelonia (Figure 13.17), a bit of noise can be added to the pick impulse, imparting a sparkle or chuff to the timbre. The noise can be confined to a narrow band with the BW control and peaked with a Q control. You can even add a bit of sine tone. The sustained sound of the string is strongly affected by signal processing in the feedback—this imitates the effects of energy reflecting at the bridge and fret. Feedback processing is provided by low-pass and high-pass filters on the string module (Figure 13.18). In DVD example 13.18 the LP control is turned from high to low—note how the sound shortens and becomes pure as the filter closes down. The stiffness control adds an all-pass filter to the loop, introducing phase changes that detune the harmonics. The stiffness control has a wider than usual range of adjustment.


FIGURE 13.17 Modelonia pick module.
FIGURE 13.18 Modelonia string module.

This starts (as the slider is raised) as slight detuning but goes through some wildly inharmonic timbres and arrives at something like struck metal bars. DVD example 13.19 demonstrates this effect. The modifier buttons change the modeled reflections in various ways. SM adds a bit of buzz via tension modulation, similar to a banjo head. CP changes equalization of the string. WI inverts the waveform at one reflection point, which doubles the effective string length and suppresses even harmonics. The second part of DVD example 13.19 has the SM and WI buttons engaged.
Pipe Model
Pipes are also simulated by delay lines connected in a feedback loop (see Figure 13.19). The most common pipe model is based on a pair of matched delays that feed back into each other. Filters on each end produce effects analogous to a mouthpiece and bell. The driving element is derived from the mechanics of a mouthpiece. The signal is modulated by feedback from the resonator using a technique that simulates


FIGURE 13.19 Pipe model.

the motion of a reed, producing pulses of energy locked to the pipe frequency. Since the sound is sustained, changing pitch presents special problems. A sudden change of delay time would produce a pop, so the design requires complex modeling of tone holes or a quick crossfade between taps on a shared delay line. The right side of Modelonia is a basic pipe model synthesizer (it is labeled horn; see Figure 13.20). There is a gentle low-pass filter for bell effects and a waveshaper (SM) at the mouthpiece end that adds in high-frequency energy. The synthesizer operates in polyphonic mode, so there is no need for tone hole modeling. The central control of the horn module is labeled R+B, for response plus brightness, and adjusts the overall feedback gain. This control is a reversible attenuator—the center is zero and lower settings invert the signal. When R+B is set to maximum, the pipe is just on the verge of resonance. At the lowest setting, the resonance is an octave lower with reduced even partials. The R+B setting determines the response of the pipe—it has to be tailored to the excitation mode and the desired playing range. Adjustment is a matter of finding a sweet spot where the pipe speaks reliably. Once this is found, the switches and filter determine the tone color, which ranges from mellow and clarinet-like to a brassy blat. DVD example 13.20 demonstrates a filter sweep from low to high. The horn module is driven by an excitation signal through the lips module. The excitation can be an oscillator, the noise module, or the output of the string section. The lips module is a resonator coupled to the horn. There is an amplifier with an attack and release envelope that opens and shuts the resonator with the MIDI note. The resonator has variable gain, but this does not go high enough to sustain oscillation on its own. The primary control in the lips module is another R+B slider. This controls the signal fed in from the horn, which can sustain resonance but obviously cannot start it. Feedback is the least when this control is set at the middle—


FIGURE 13.20 Modelonia horn module.

all you hear is the excitation tone. When it is at either of the ends the response is quick and pretty much independent of the exciter settings. The medium high and medium low settings allow the most leeway for adjustment. The PE button in the lips module turns on an oscillator that injects a sine tone to excite the lips and horn. The resonator attack will affect the response, but the strongest effect is with gain. The effect of release will not be heard unless the master envelope has a long release setting. With R+B a bit above or below the center you can find quite a variety of sounds. This is demonstrated in DVD example 13.21. The PE mode is fairly slow to respond so the tempo has been dialed back. The output of the noise generator can also be used to drive the horn model. A slider in the noise module will make this connection. Excitation by noise can tighten up the attack, rather like the way the action of the tongue affects wind instruments. The slightest amount seems to trigger a note, while more adds noise to the output. DVD example 13.22 has noise added to the settings in 13.20. Noise excitation without the PE oscillator gives a squeaky, accordion-like sound.
Hybrid Instruments
One of the most interesting features of Modelonia is the cross connection of the two models. The output of the string model can be used as excitation for the horn and vice versa. Horn to string can produce a bowed string sound, especially if the horn is driven by noise. Since the string is transitory, string to horn produces a sharply articulated sound. The string and horn seem to be out of phase, so the connection works best with high stiffness or with the string WI button engaged. The most pleasing sounds I have found use a mix of all three excitations. The sounds are sensitive enough to settings that I often shift to control view to make precision adjustments. DVD example 13.23 features some of my favorite presets. One section of Modelonia that I have ignored up to now is the output processing. This consists of a resonant EQ and a simple reverberator. These modules are not up to modeling the complex resonances found in instrument bodies, but they can


FIGURE 13.21 Modelonia lips module.

provide mild formants and generally add depth and cohesion to a sound. I often use a further EQ insert to model formants and generally adjust tone. The output of a modeling synthesizer will seldom be confused for the real thing. However, the sounds are charming in themselves and have just about the right amount of variability from note to note. You will find they also have many of the limitations of real instruments. For instance, a sound that is perfectly tweaked for the octave above middle C may become unbalanced and shrill in the higher octaves and may fail to speak reliably if shifted down.
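The delay-line-with-filtered-feedback structure behind the string model (and, with a pair of delays, the pipe) is easy to sketch. The fragment below is a plucked string in the classic Karplus-Strong style—a simplified cousin of what Modelonia does, not its actual algorithm. Lowering the brightness value behaves much like closing the LP control described earlier: the note becomes shorter and purer.

import numpy as np

SR = 44100

def plucked_string(freq=220.0, seconds=2.0, brightness=0.5, decay=0.996):
    n = int(SR / freq)                         # the delay length sets the pitch
    rng = np.random.default_rng(2)
    line = rng.uniform(-1, 1, n)               # initial "pick" displacement plus noise
    out = np.empty(int(seconds * SR))
    prev = 0.0
    for i in range(len(out)):
        s = line[i % n]
        # a one-pole low-pass in the feedback path stands in for the reflection filters
        filtered = brightness * s + (1 - brightness) * prev
        prev = filtered
        line[i % n] = filtered * decay         # a little energy lost at each reflection
        out[i] = s
    return out

note = plucked_string(220.0, brightness=0.7)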

String Studio
The company that produces Tassman, Applied Acoustic Systems, also has a wide range of acoustic modeling instruments. One of their best is String Studio, which takes the string model down to the tiniest details. This instrument is organized into two pages. The first page is for general performance, with the usual added reverb and delay effects, an arpeggiator, and tuning controls. The B page of String Studio allows us to play with the model (Figure 13.22). This is finely detailed, and most of the parameters are expressed in terms familiar to a musician. For instance, displacement is described as features of the plectrum like stiffness and protrusion. The one item that is a bit unclear is "damp," which turns up in a lot of places. This refers to damping ratio, which is a measurement of frequency response in a mechanical system (as opposed to plain damping, which describes friction and other losses in a mechanical system). If a part of the model features a damp control, turning this up will increase high-frequency components in the sound. The page is divided into modules that more or less correspond to the mechanics of most string instruments. The principal parts are excitator (pick, bow, or hammer), string (with adjustments for damping ratio, decay, and inharmonicity), and body, with extra detail provided by terminator and damper modules. Filter and EQ sections are not strictly models, but they provide extra refinement.


FIGURE 13.22 String Studio page B.

The manual supplied is excellent, so our usual approach of study and experiment will quickly lead to mastery of the instrument. For experimentation, I would definitely start with existing presets and tweak one control at a time. As an example, starting with the arco cello, I quickly discovered the sound is very sensitive to bow force (as any cello student would tell me). Reducing the force makes the sound thin with a tendency to jump harmonics, while too much creates a blast of distortion. Reducing the friction makes the attack scratchy, while increasing it slightly stops all sound. Velocity has a narrow operating range; however, increasing it just a little allows a bit more bow pressure and a more robust tone. Any setting that works will probably only be good for about a three-octave range—playing above middle C requires adjustments to the strings (slightly more decay and inharmonicity). When you consider that cellos have different strings for high notes, this doesn't seem too surprising. DVD example 13.24 demonstrates the effect of bow pressure (low to high) followed by bow damping. Notice how certain pressure values don't produce any sound. Pressure must be balanced with velocity to make notes speak consistently. String Studio provides a choice of body parameters including the type of body. Probably the most sensitive parameter is decay, which determines the liveliness of


the body. The type is a choice of violin, guitar, or piano. DVD example 13.25 demonstrates the effect of changing decay on the cello body and a piano with cello settings. It’s not very surprising that a bowed piano does not work well until the decay gets to be fairly long.
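One way to think about the body section of such a model is as a set of resonances imposed on the string signal. The sketch below fakes this with a few two-pole resonators; the mode frequencies and ring times are invented for illustration and are not measurements of any instrument, but longer ring times correspond to the livelier body that a longer decay setting produces.

import numpy as np

SR = 44100

def resonator(signal, freq, decay=0.02):
    """A two-pole resonator; decay is roughly the ring time in seconds."""
    r = np.exp(-1.0 / (decay * SR))            # pole radius derived from ring time
    a1 = 2 * r * np.cos(2 * np.pi * freq / SR)
    a2 = -r * r
    y1 = y2 = 0.0
    out = np.empty(len(signal))
    for n, x in enumerate(signal):
        y = x + a1 * y1 + a2 * y2
        out[n] = y
        y2, y1 = y1, y
    return out

def body(signal, modes=((100, 0.03), (200, 0.02), (400, 0.015))):
    """Sum a few resonant modes; the mode list is purely illustrative."""
    mix = sum(resonator(signal, f, d) for f, d in modes)
    return mix / np.max(np.abs(mix))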

ADVANCED SYNTHESIS AND THE COMPOSER The applications mentioned here are the forerunners of a major change in the kinds of sound available for electroacoustic composition. Many composers have found that subtractive synthesis and sampling offer a surprisingly limited palette of sounds—harsh and artificial for the former and imitative and inflexible for the latter. Too much of the work done with these materials has been rigid and clinical, not because of limited imagination or musical ability, but because the synthesis methods do not offer subtlety. Resynthesis and modeling do offer subtlety and the ability to transform sounds gently, gradually, and expressively. As I said earlier, this is a responsibility as well as an opportunity. A composer wanting to get the most out of spectral synthesis must become fluent with analysis tools, including such arcana as the trade-off in accuracy between frequency and time. A composer interested in modeling instruments must understand the functioning of the instruments (and other things) that are the basis of the model. None of this knowledge requires a physics degree. Practice with the software will teach most of it, and reading basic surveys of musical acoustics will supply the rest.

EXERCISES
1. Create a sound pool using the analysis and editing features of SPEAR. Use these to create a loop-based composition.
2. Compose a short work that features gradual evolution of an acoustic model.

RESOURCES FOR FURTHER STUDY
Advanced modes of synthesis are not as well documented as sampling and basic techniques. The best coverage of these topics is in Curtis Roads's massive tome on computer music. This book is a bit intimidating, but it should be on the shelf of every serious electroacoustic musician.


Roads, Curtis. 1996. The Computer Music Tutorial. Cambridge, MA: MIT Press. Roads has also written the definitive book on granular synthesis: Roads, Curtis. 2004. Microsound. Cambridge, MA: MIT Press. To get a grip on acoustic modeling, you first need to understand how real instruments work. Benade’s writing is proving to be timeless. Benade, Arthur. 1992. Horns, Strings, and Harmony, 2nd revised ed. New York: Dover Publications, Inc. Fletcher, Neville H., and Thomas Rossing. 1998. The Physics of Musical Instruments, 2nd ed. New York: Springer Verlag. The latest in acoustical analysis of real instruments is available at a website maintained by the University of New South Wales, www.phys.unsw.edu.au/jw/basics.html. Julius Smith has placed an excellent review of recent advances in physical modeling research on the Stanford University Center for Research in Music and Acoustics website, https://ccrma.stanford.edu/~jos/pmupd/.


Part 5 Research-style Synthesis The software examined in chapters 14 to 17 is not for the fainthearted. These are the programs used in cutting-edge research into synthesis and performance techniques. Each is immensely powerful, and when considered in terms of performance for price, an excellent value. In fact, several of these applications are free. Many actually consist of a series of overlapping programs or they are languages in which new programs are created. A mortal musician can get plenty of use out of software like this, but you will find few ready-made presets and a pretty steep learning curve. These chapters cannot cover all of the capabilities of each program but will offer an overview of their methods and possibilities so you can intelligently choose one or two to pursue in detail. Why choose? Why not get them all and use each for the functions they do best? Although they are economical in price, the use of any of these programs requires a substantial investment in learning and development time. Even full-time researchers in university labs usually master only one or two. I know more than one famous composer who continues to work in a program that was old-fashioned when his students were born, simply because of the effort required to translate a career’s worth of material to a new platform. In addition, you may find it difficult to work with two synthesis languages because of conflicting syntax. For instance, the statement 2 + 2 in Csound would be written (+ 2 2) in Common Lisp Music. For most old-timers, the choice of development environments was a matter of accident and limited availability—we were taught on one system and simply stayed with it. There are quite a few synthesis and audio processing programs out there, each with its supporters and detractors. The same debates found in the programming industry are encountered here, where the merits of the tried and true are set against the advantages of the fad of the week. There is actually quite a lot of common ground between synthesis programs, but none does absolutely everything, and some definitely favor a particular approach to music. Few of these programs come from the commercial software industry. Usually we can trace them back to a single inventor, with further development and distribution under the aegis of a research university or the open source community. Some inventors have tried to commercialize their work, but the potential market is small and the work required to develop a product from the research environment to a user-friendly application is daunting and uninteresting. This is both a benefit and drawback for the composer. The benefit is free software. The drawback is limited tech support, and that support is more likely to come from other users than the authors of the program.


All of these programs owe a deep debt to Max Mathews, the Bell Labs and Stanford University researcher who first produced music on a computer in the 1950s. His program MUSIC established the basic algorithms, structure, and terminology used in all audio software today. Mathews's original program was written for the computers of the day: "big iron" mainframe systems with punch card inputs. First-generation computer musicians' stories of the effort it took to translate the computer data into sound now have the ring of "walking ten miles in the snow to go to school," but the situation endured well into the 1970s and still exerts some effect on synthesis software. However, advances in human-computer interfaces and the transition from mainframe to personal computer systems have greatly broadened our choices. The next four chapters survey the synthesis programs and languages that are most commonly used today. They are grouped by style: chapter 14 covers old-school compiled languages, chapter 15 presents interactive languages suitable for live coding, chapter 16 introduces graphical composing environments such as Max/MSP, and chapter 17 explores hardware-based systems.


FOURTEEN Composing on a QWERTY Keyboard

In the early days of computer programming, each application was tailored for a specific computer. The original version of MUSIC was written for the IBM 704. If anyone wished to run it on a different computer (704s were discontinued in 1960), the program had to be rewritten. This provided an opportunity to improve the program, as each generation of computers provided advances in power. Max Mathews eventually produced versions up to MUSIC V, and he shared his code with musicians who spread the program to an ever widening set of machines. This expansion of MUSIC was mostly carried out in university research centers, several of which were founded in the 1960s to specialize in audio and musical applications. These centers accumulated substantial software libraries and developed new programs. Graduates hired to set up computer music studies at other schools would usually take these libraries with them and continue development, sending their improvements back to their alma maters. Thus the original MUSIC program became several major computer music packages. These packages can be viewed as vintage software, much as many contemporary musicians admire vintage modular synthesizers. The following are some packages that are still in use and are generally available free or for a nominal charge.
Common Lisp Music (CLM) was developed at the Stanford University Center for Computer Research in Music and Acoustics. It is based on the Lisp programming language, and you must have a working Lisp environment to use it. Common Lisp Music is the signal processing arm of a trio that also includes the algorithmic composition package Common Music and Common Music Notation, one of the first score editors. These can be found at http://ccrma.stanford.edu/software/.
The Computer Audio Research Laboratory (CARL) was based at the University of California, San Diego. It is currently part of the Center for Research in Computing for the Arts. CARL researchers originated many fundamental DSP algorithms, such as phase vocoding, known as PVOC. We have already encountered this technique in the pitch and time manipulation features of many applications. The CARL software distribution has more than two hundred programs for PC and UNIX, including the full-featured synthesis language Cmusic. It can be found at http://crca.ucsd.edu/cmusic/.


One excellent way to get your hands on PVOC is through the Composers Desktop Project developed at the University of Bath in the United Kingdom. The central interface is a program by Trevor Wishart called Sound Loom. This allows you to manage a giant library of source files and process them in just about any way imaginable. Options include progressive transformations of files and automated operation. There is a modest fee for this package, which is available for PC or in a slightly limited Macintosh version. The Composers Desktop Project home page is http://www.composersdesktop.com/.
All of these programs are in use by many well-known musicians, but Csound is probably the most accessible and best documented. One of the composers who was active in porting MUSIC to different machines is MIT professor Barry Vercoe. He wrote several versions before the programming language C made it practical to create one that could run on any computer. Csound is available free of charge at SourceForge.net in Windows, Macintosh, and Linux versions. Csound received a boost in interest with the publication of The Csound Book, edited by Richard Boulanger, which includes basic and advanced tutorials by highly respected programmers and composers. Since its publication in 2000, there has been an impressive upswing in the number of Csound users, and the program itself has been expanded and refined to better meet the needs of modern music.

OVERVIEW OF Csound
Bare Csound is a command line program. To run it in the traditional way, you open a terminal window and type something like what you see in Listing 14.1.

> cd /Users/pqe/Documents/Csound/Examples/
> csound -W -o mytune.wav mytune.csd

Csound from the command line.

This will generate a lot of printed output and a sound file called mytune.wav. A Csound composition is derived from two files called the orchestra and score, which are passed to the Csound command. These are plain text files, with the extensions .orc or .sco. They can be combined into an XML file with the extension .csd. Some front-end programs show these as two separate files and do the XML formatting automatically, but XML is not hard to learn. Many details about how the program runs are set by flags. The -W and -o in Listing 14.1 are examples of flags; -W specifies a .wav formatted output file and -o sets the name for the file. There are about a hundred possible flags controlling everything from output formats to details of MIDI mapping.

The terminal can be an awkward place to work, so all but the most hardcore code bashers use a front-end program that provides editing features and runs the Csound command in a transparent way. Popular front ends (as this is written) include WinXound by Stefano Bonetti and the multiplatform QuteCsound by Andrés


Cabrera. Front-end programs for Csound can be somewhat ephemeral. Many are simply personal projects by composers who are willing to share, and some are institutional efforts that depend on the availability of graduate students for maintenance and upgrades. It is difficult for even the most dedicated programmer to keep up with the quirks and updates from Microsoft and Apple, and Csound itself changes from time to time. However, the Csound file formats are solid and can be used with any front end. The nice thing about front ends made by active composers is that they respond to the needs of a working musician. It’s a good idea to keep up with the front-end world and to try new ones as they come out. The best way to do this is to check in with www.csounds.com routinely; you will find announcements of updates, an active user forum, and useful tutorials.

The .csd File

The Csound document file has sections for options, instruments, and the score. These are contained within the tags shown in Listing 14.2.

<CsoundSynthesizer>
<CsOptions>
.....flags
</CsOptions>
<CsInstruments>
.....Instrument definitions
</CsInstruments>
<CsScore>
.....the score
</CsScore>
</CsoundSynthesizer>

Listing 14.2  .csd syntax.

Note that the entire file is within the <CsoundSynthesizer> block. Any flags placed in the <CsOptions> block will override the settings of the front end, so you can leave your system set up for the way you usually work and stick the occasional exception in the file. Flags entered in the command line override everything.

The .orc File

Listing 14.3 shows a simple orchestra file, the lines between the <CsInstruments> tags in a .csd file.

/******* Basic tone ********/
sr      =       44100
kr      =       4410
ksmps   =       10
nchnls  =       1

        instr 1
a1      oscil   10000, 440, 1
        out     a1
        endin

Listing 14.3  An orchestra file.

Notice that everything lines up in columns. This is an artifact of the original implementation of MUSIC on early computers that were controlled by punch cards. These cards had twelve rows of eighty columns of potential holes, with a punch representing a digit or a combination of punches representing a letter (or bits—the formatting of punch cards changed over the years). Most systems organized these into fields with each field containing an instruction or data. When the same programs were adapted for keyboard input, the tab key was used to jump to the next field. Tabs are not required in Csound, just spaces and the occasional comma, but nicely aligned columns are easier to read.

The first four lines of Listing 14.3 are the header, which indicates how this orchestra should be processed. The code sr indicates the sample rate, and nchnls is the number of channels. The other two lines take a bit of explanation. While each audio sample must be computed for every tick at the sample rate, many numbers in the program just won't change fast enough to make the computation worthwhile. If an envelope ramp takes two seconds to change from zero to one, most of the calculations done at the sample rate will yield essentially the same result. So slowly changing values are computed at the control rate, which is indicated by kr. The value ksmps is an integer setting the number of audio calculations between control calculations, or sr/kr. I've always thought it ironic that a program that is going to perform millions of calculations to produce its output requires you to do the first one yourself (and in fact, this is optional in newer versions of Csound).

The next four lines are an instrument definition. The code instr 1 indicates the start of the first instrument. The code endin marks the last line of the instrument definition. The lines between will usually be in the format of a variable, an opcode, and one or more arguments. The opcode uses the arguments to calculate a value that is stored in the variable. The line in Listing 14.3 that starts with the variable a1 contains the meat of this simple instrument. The fact that the variable a1 begins with the letter a is significant. This calculation is to be performed at the audio rate, or once per sample. Calculations to be performed at the control rate have variables that begin with k, and calculations that are only done once per note begin with i.

Understanding the use of variables is key to mastering Csound. In any computer program, a variable is a storage location for data and usually refers to a place in


memory that can contain a single number. The name of the variable is placed in the code where data would be appropriate, and operations then use the data the variable contains. Many programs use an equal sign to signify that data is stored in a variable, and this can be done in Csound, but the storage is understood when a line begins with a variable and the equal sign is left out. One opcode that does not require a variable at the beginning of the line is out, which simply means "put this value in the output file."

The essential opcode in Listing 14.3 is oscil. This is a basic wavetable oscillator that extracts a value from a function table for each sample. If the table contains a sine function, a sine wave will be heard. Function tables are referenced by number, which is the third argument to oscil. The first argument to oscil specifies amplitude and is the actual maximum sample value to be produced. In 16-bit audio the maximum value before clipping is 32,767. If you specify a value higher than this, an error message will be generated. It's also pretty loud, so most scores use values between 8,192 (-12 dB) and 16,384 (-6 dB); 10,000 is common. In newer versions of Csound, you can change the way amplitudes are specified by including the line 0dbfs = 1 in the header. Then an amplitude of 1.0 will specify full-scale sound. Many of the Csound textbooks and code examples were written before this became available and use big numbers for amplitudes, but I prefer the new version and will use it in the following examples. The second argument to oscil is frequency. Later we will see how this can be extracted from various ways of specifying pitch.

Listing 14.4 is a slightly more complicated instrument, the basic beep introduced in chapter 10.

/******* Basic Beep *******/
sr      =       44100
kr      =       4410
ksmps   =       10
nchnls  =       1
0dbfs   =       1

        instr 1
k1      linen   p4, p6, p3, p7
a1      oscil   k1, p5, 1
        out     a1
        endin

Listing 14.4  Basic beep in Csound.

In Listing 14.4 most of the arguments are variables. Variables that start with the letter p refer to data supplied by the score. The score is organized into parameter fields, and the variable p5 means use the value from field 5. The oscil opcode


gets amplitude from the variable k1, which is the result of the linen operation; linen is a linear envelope and is similar to the AR-type envelope generator we have seen in other synthesizers. The linen operation takes four arguments: amplitude (the same range as for oscil), rise time, total duration in seconds, and decay time. The total duration should be the note duration as specified in the score, so it is usually taken from p3 (see below). The total duration should also be longer than the attack plus decay time. When this instrument is played the value of k1 is updated every ten samples. This will result in a smooth on and off, producing a short beep.

The .sco File

Now let's look at a score to play this instrument. Listing 14.5 will play one note.

/******* One Beep ********/
f1 0 4096 10 1             ;sine wave in function f1
;INS STRT DUR AMP FRQ ATK  REL
i1   0    2   1   440 0.02 1

Listing 14.5  A score file.

As you can see, a score is also a set of numbers in columns. There’s no header required, but it’s a good idea to include an identifying name for the piece at the beginning. Because the Csound program will not know what to do with the name itself, this kind of extra information should be contained in a comment, marked with a semicolon and extending to the end of the line. Multiline comments may be marked with /* at the start and */ at the end. The name of the piece in Listing 14.5 is “One Beep.” The lines in a score always begin with a letter identifying the type of statement. There are quite a few types available but the most common are function and instrument statements. Function statements create the tables used by oscil and other opcodes. They are indicated by the letter f and the number of the table to be filled. There need be no space between the f and the number, but you sometimes see one. The second field in the line is the time of execution, specified in beats. In most scores the functions are generated at beat 0, but they can be created at any time. The third field is the size of the table, which is a power of two, or a power of two plus one if the function is to be used in certain opcodes. Determining the optimum size for tables is an art, but on gigabyte computers there’s little penalty for generosity. The fourth field is the number of an operation called a GEN routine used to create the function. There are about three dozen standard GEN routines that include everything from reading values out of a file to calculating Chebyshev polynomials.
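For instance, a score might open with several function statements built by different GEN routines. The harmonic recipes and the file name below are only placeholders for illustration, not figures from the text:

f1 0 4096 10 1                     ; GEN10: a pure sine wave
f2 0 4096 10 1 0 0.33 0 0.2        ; GEN10: odd harmonics for a squarish tone
f3 0 0    1  "mysound.wav" 0 0 0   ; GEN01: read a sound file into a deferred-size table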


Instrument statements are the lines that generate notes. They start with the letter i and the number of the instrument (defined in the orchestra) to use. The second field (which the instrument will know as p2) is the time to start the note. This is in beats, with a default tempo of 60 bpm, so they may also be thought of as seconds. The times don't have to be in order, but the piece will be hard to work on if they are not. The third field (p3) is the duration of the note in beats. If the instrument has not finished producing sound when this time is up, there will probably be a click as the note is shut down. The fourth field (p4) is traditionally the amplitude and p5 is traditionally the value that determines frequency. Any following fields will be defined by the instrument design. In Listing 14.5 they are attack and release of the envelope. Most composers use a comment line to remind them of the meaning of these fields. The sound of Listing 14.5 with the instrument defined in 14.4 is found in DVD example 14.1.

When you render a score, exactly what happens depends on the front end you are using. Most front ends will automatically open the output file for playback, and some will play the score as it is generated. If your score is too complex to compute in real time, this playback will sound rough, but the finished file will be fine. You also may see some text and graphics. The graphics are drawings of your functions; the text is a running log of the compilation process. If anything goes wrong, you can consult this to find the cause.

Pitch Conversion

Few composers know the frequencies of pitches offhand, so Csound includes features for pitch to frequency conversion. The opcode cpspch converts a pitch specified as octave point pitch class (8ve.pc) format to frequency in Hz. The 8ve.pc format defines pitches exactly as it implies, with a number for the octave, a decimal point, then a two-digit pitch code. The pitch class scheme takes some getting used to but is a handy way to compute with pitches. We assign the number 0 to C, 1 to C-sharp, and on up to B at 11. In 8ve.pc notation, middle C is 8.00; A440 is 8.09. To use this notation in a score the instrument of Listing 14.4 is modified by substituting this line:

a1      oscil   k1, cpspch(p5), 1

Note that cpspch uses parentheses to enclose its argument. Operations of this type are called functions and can be used in arguments (so can ordinary math operators). Many beginning Csound composers have trouble keeping cpspch( ) (which converts 8ve.pc to Hz) and the opposite function pchcps( ) straight. Just remember that in Csound the destination of a calculation is found on the left.

Now a score with some real music is possible. A little tune is shown in Listing 14.6 and played in DVD example 14.2. There are several new things here. First of all, there is a t-statement to set a tempo. The parameters for a t-statement are pairs—beat number and tempo to start at that beat. The first pair must be for beat 0. There are no measures in a Csound score, but there's nothing to prevent you from marking them with comment lines.


The score proper is several i-statements that produce notes, one per line. These will all use the same instrument, i1. The second column determines the rhythm, which can be defined as precisely as you care to. I won't say anything beyond that except to point out that whole number start times will result in the squarest rhythms you can imagine. The durations in the third column are all much the same, so I used a shortcut to reduce the typing. A period is a ditto mark and means "use the value from the previous line." The fourth column sets amplitudes. When notes overlap, their amplitudes have to be adjusted to prevent them from adding up to a value bigger than 1.0. The pitches are in the fifth column and outline a C chord. (I assure you it won't take long to learn this scheme.) The final two columns set the attack and release for the envelope. As you can see, I shortened the release on the third and fourth notes—if the release plus attack of a linen type envelope is longer than the p3 duration, there will be a pop. It may look like there's plenty of time at first glance, but the tempo shortens the duration without affecting envelope times.

/* Fanfare for DC */
t 0 120
f1 0 4096 9 1 1 0
;INS STRT DUR  AMP  FRQ   ATK   REL
;+++++++++++++++++++++++++++++++++++++++++++
i1   0    2    0.2  8.00  0.1   1
i1   1    .    .    8.04  0.05  .
i1   2    1.5  .    8.07  .     0.5
i1   3.5  0.6  .    8.11  .     0.3
i1   4    2    .    8.00  .     1
i1   4    .    .    8.04  .     .
i1   4    .    .    8.07  .     .
i1   4    .    .    9.00  .     .

Listing 14.6  Score for a short tune.

Beep is a simple instrument and sounds like it, but more engaging instruments are not much more complex. Of the more than six hundred opcodes distributed with Csound, sixty or so are sound generators ranging from simple oscillators to a full implementation of a Minimoog. These include all of the techniques covered in this book, such as frequency modulation, granular and additive synthesis, and physical modeling, as well as commercial schemes like sound fonts, loopers, and samplers. One of the classics is pluck, which is an implementation of the Karplus-Strong algorithm (Figure 14.1).


FIGURE 14.1  The Karplus-Strong algorithm (a noise burst feeding a delay and an averaging stage).

This elegant algorithm is a string model. It starts with a table filled with random values that is played as though it contained a typical wave function. However, after the samples are read out they are modified, replaced with some form of average of the previously read and following samples. The sound starts as a noise but almost instantly clarifies into a pure tone which then decays in a string-like way. DVD example 14.3 demonstrates pluck. The code for this instrument is shown in Listing 14.7. As you can see, practically the only change is the substitution of the pluck opcode for oscil. The pluck opcode requires the frequency in two fields. The second value for frequency is an i-time parameter that determines the optimal size for the function buffer (the frequency proper could be a changing function). The following arguments specify the initial wavetable contents (0 for random, otherwise a function table number) and a choice of averaging methods.

        instr 1
k1      linen   ampdbfs(p4), 0.02, p3, 0.2
a1      pluck   0.5, cpspch(p5), cpspch(p5), 0, 1
        out     a1*k1
        endin

Listing 14.7  pluck instrument.

You may wonder about the need for an envelope (linen) since the pluck opcode decays on its own. The problem is that the decay is independent of the duration specified in the score and goes on for quite a long time. All linen does is provide a quick fade at the end of the note, avoiding the click that would occur if pluck were just chopped. There is no amplifier opcode in Csound—signal levels are modified by multiplication.


Real-time Performance

Csound is not restricted to rendering files from written scores. There is a full-featured and robust set of MIDI opcodes that can be used to play MIDI files or for live performance. Converting the Karplus-Strong instrument in Listing 14.7 is quite simple, requiring only a few modifications as shown in Listing 14.8.

        instr 1
icps    cpsmidi                        ; get note number and convert
iamp    ampmidi  0.007874              ; get velocity and convert
k1      linenr   iamp, 0.02, 0.3, 0.01
a1      pluck    0.99, icps, icps, 0, 1, 0.1, 2
        out      a1*k1
        endin

Listing 14.8  MIDI version of the Karplus-Strong instrument.

The instrument number is significant when MIDI is in use. The default channel assignment scheme attaches instrument 1 to channel 1, instrument 2 to channel 2, and so forth. (An massign opcode in the header can create more flexible routings.) When a note on is received, an instance of the instrument is created that will generate sound until a matching note off is received. The cpsmidi opcode will retrieve the note number from the message and convert it to frequency, and the ampmidi opcode will retrieve the velocity value. The argument to ampmidi is a multiplier that will convert the MIDI 0–127 value to something appropriate for Csound amplitudes. If you are using raw sample values, 200 will probably work well. If you have 0dbfs set to 1, the magic number is 0.007874. In Listing 14.8, frequency and velocity are assigned to icps and iamp respectively. You often see i-variables calculated at the top of an instrument instead of in the opcodes. This makes the code clearer to read and more efficient to compute—icps can be plugged directly into both frequency fields of pluck but only has to be converted once. The value of iamp is used in the amplitude field of the envelope generator linenr. This opcode is specialized for MIDI input. It will hold at the peak level until the note off is received, and it will extend sound generation long enough for the release phase. There are several similar envelope generators for more complex shapes, such as mxadsr. If more complex operations are needed at note off, the release opcode can trigger them and the xtratim opcode will extend the duration until the operations are completed. Connecting the instrument to a MIDI port is usually done in the front-end application. (It can also be done with the flag -M plus some detective work to discover the appropriate device number or name.) Depending on the front-end program in use, you have a choice of three MIDI input modes: none, virtual, or a MIDI library like portMIDI, which accesses hardware devices. The virtual mode opens a simple keyboard window that is useful for testing the instrument.
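As mentioned above, a massign statement in the orchestra header can change the default routing. For example, a single line like the following (a minimal sketch rather than code from one of the listings) sends every MIDI channel to instrument 1:

        massign  0, 1      ; a channel value of 0 means all channels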


Surprisingly, you still need a score in order to play from MIDI input. The typical performance score looks like Listing 14.9. All it does is keep Csound running, waiting to create an empty function an hour from the start time, although there's no reason not to specify a day's duration. When you are finished playing, use the stop button in the front end, or type "control c" on the terminal.

/* Play an hour */
f 0 3600
e

Listing 14.9  MIDI performance score.

Playing MIDI Files

To play a MIDI file, all you have to do is add the flag -F filename as in Listing 14.10. The best place to put the flag is in the <CsOptions> block. Playback will begin immediately, and you can play along if you like. If you include a -T flag, Csound will exit automatically when the file is over. A portion of Bach's two-part invention no. 8 realized with the Karplus-Strong instrument is given in DVD example 14.4.

<CsOptions>
-F invent8.mid -T
</CsOptions>

Listing 14.10  Importing a MIDI file to Csound.

LEARNING TO BUILD COMPLEX INSTRUMENTS

Most interesting instruments will be a bit more complicated than the examples shown so far. Csound possesses all of the audio processing functions you have come to expect in a synthesis program, including filters and dynamic modifications. Listing 14.11 shows how to add one of Csound's numerous filters.

        instr 1
icps    cpsmidi                        ; get note number and convert
iamp    ampmidi  0.007874              ; get velocity and convert
k1      linenr   iamp, 0.02, 0.3, 0.01
a1      pluck    0.99, icps, icps, 0, 1, 0.1, 2
a2      reson    a1, 3*icps, 500, 2
        out      a2*k1
        endin

Listing 14.11  Filtered pluck.


The filter opcode is reson, described in the reference as a second-order resonant filter. To see how this is used, we consult the manual, known as "The Canonical Csound Reference Manual" by Barry Vercoe et al., available as an HTML document and sometimes provided in front ends. (The best automatically show the page for selected terms.) The manual has grown over the years as features have been added and various authors have clarified certain points, but it is still a bit terse. The meat is in the opcode definitions, which are built around a declaration of the syntax:

ares reson asig, kcf, kbw [, iscl] [, iskip]

The declaration always starts with a sample variable name. The important fact you take from the sample variable name is whether it is an a-rate or k-rate operator (or either if the name starts with x). There may be two or more variables to indicate multichannel capability. The opcode name is in bold print. Listed below are some sample arguments which are named to suggest their function and to indicate the rate at which they may update. The system may seem mysterious at first, but it is consistent enough that you will soon be able to guess at the meaning. This is a valuable skill. Some front ends show the syntax declaration as you edit, and many sources include it as comments in sample orchestras.

The reference continues by defining the arguments, usually starting with the initializations (i-rate arguments). Square brackets indicate an argument is optional, but you can't skip over any if you want to set a later one. For instance, reson requires three arguments; if you have four you are also setting iscl, and five are needed if you want to set iskip. In the case of reson, the arguments are:

asig: an audio rate variable with the input signal. This is the way Csound is patched—the asig value will probably have been set by an opcode on the previous line.

kcf: the center frequency for the filter. In Listing 14.11 I calculate it from the same icps value that sets the frequency of the pluck opcode. You could easily give it a dedicated p-field or put a constant here for fixed filtering.

kbw: the bandwidth in Hz. (To keep a constant Q, multiply icps by a fraction.)

iscl: this is a scaling code. Filters of this nature tend to amplify the signal, which will lead to samples out of range if the input is not attenuated. Finding the correct input amplitude for the frequency and bandwidth is a long process of trial and error, but reson can calculate a likely value for you. An iscl of 1 will make the response curve peak at 1; an iscl of 2 will give an idealized RMS level of 1. Another way to put it is if the default iscl of 0 gives a clipped signal, try 1; if that is too soft, try 2.

iskip: this indicates whether the memory used by the filter is cleared for each note. The default value of 0 does so. This is an example of the incredible level of control Csound can give you.

In the manual the definitions are often followed by examples of the opcode in use, which may be enlightening or confusing. In many cases the examples are supplied as a .csd file, which you can quickly run in the terminal.


FIGURE 14.2  Flowchart for filtered pluck instrument (cpsmidi and ampmidi feed pluck and linenr; the pluck output passes through reson and is scaled by the envelope at out).

(Hint: many of these files have a sample rate of 22,050 and won't run on some computers. If this happens, you can specify a valid sample rate with a -r flag and a k-rate with a -k flag.) Editing the example files is one way to see what makes the opcodes tick, but these examples should not be directly modified, so copy the entire folder and open them in a front end. I wasn't being flippant when I said the examples may be confusing. Some authors can't resist showing everything an opcode can do in one function. The way to really get to know an opcode is to simplify the example as much as possible, then change parameters one at a time.

When designing complex instruments, it is often a good practice to sketch out the idea in a flowchart, a drawing reminiscent of the patching diagrams used in the synthesis tutorial. Figure 14.2 is an example of a flowchart. The flowcharts you see for Csound look more like computer programming flowcharts than the signal diagrams used in audio. For instance, audio signals are generally drawn flowing from left to right, while Csound charts are read from the top down. This helps keep track of the required order of computation, because the objects appear in pretty much the same order as the equivalent opcodes. When flowcharts find their way into books, they are generally tidied up with objects neatly arranged in rows. Hand-drawn versions with staggered boxes are actually more informative. Shapes taken from the standard ANSI flowchart library are used and


extended with a few specific traditions: a round-bottom box for signal sources, trapezoids for envelope generators, triangles for mixing, and circles with specific symbols for math and output functions. The interconnecting lines should be labeled with the variable names used to pass data. Note that the lines enter the tops of boxes in the same order as the arguments to the opcode. The other arguments are indicated either in the box or across the top.
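One refinement worth noting before moving on: as mentioned in the description of kbw above, the bandwidth of reson can be tied to the note frequency so that the filter keeps a roughly constant Q. A minimal sketch of that variation on Listing 14.11 (the 0.3 ratio is only an illustration, not a value from the text):

a2      reson    a1, 3*icps, 0.3*icps, 1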

Adding Widgets

It is possible to interact with a Csound performance by adding a control window. There are various ways to do this. Some front ends provide controls as an added feature, but that code is usually not portable. The standard approach to adding controls is to use Csound widgets. These are based on a cross-platform application library called FLTK or the Fast Light Toolkit. The control windows built with FLTK have a primitive look (no transparencies or blended colors), but they are functional and reasonably reliable. Some code that adds frequency and bandwidth control to the filter in Listing 14.11 is shown in Listing 14.12. This goes in the header area of the orchestra. The window produced is shown in Figure 14.3.

        FLpanel    "Filter", 300, 180, 50, 50   ; Width, Height, location
idispf  FLvalue    "Hertz", 70, 15, 65, 120
gkfreq, idfilt  FLknob  "Frequency", 50, 5000, -1, 1, idispf, \
        70, 65, 20
idispbw FLvalue    "Hertz", 70, 15, 165, 120
gkbw, idbw      FLknob  "Bandwidth", 50, 1000, 0, 1, idispbw, 70, \
        165, 20
        FLpanelEnd
        FLrun                        ; Run the widget thread
        FLsetVal_i 250, idfilt       ; initialize the controls
        FLsetVal_i 500, idbw

Listing 14.12  Code to display a control window.

A window is defined by the FLpanel opcode. This takes arguments to set a title and determine its size and location. FLpanel has to be matched with an FLpanelEnd statement. Everything in between will appear in the window. The FLrun statement starts the window running as a more or less separate program that will communicate with Csound via the variables you set up. The controls are called valuators in FLTK lingo. There are six basic types available, each with several cosmetic variants. Listing 14.12 uses the FLknob widget, which has this syntax:


kout, ihandle  FLknob  "label", imin, imax, iexp, itype, idisp, \
        iwidth, ix, iy [, icursorsize]

FIGURE 14.3  The Csound window produced by the code in Listing 14.12.

There are two output variables. The k-rate variable will provide the most recent setting of the control. Since the panel code is not within an instrument, the variable name actually used must start with gk to make it globally visible. The ihandle is a reference to the widget itself. This can be used to provide visual feedback if the score changes the value associated with the control. After the name argument, there are arguments to set the minimum and maximum values sent, and a code (iexp) that determines how rotation of the knob is translated to data. A 0 here means a simple linear relationship and a -1 means exponential, the option usually appropriate for pitches and volumes. Any other number refers to a function table with the control curve. Such tables must be defined in the orchestra header rather than the score. The itype argument chooses the shape from a menu of options, and icursor sizes the dot on the version shown. Finally, iwidth sets the size and ix and iy set the location of the widget within the window. The text display of the value from each knob is shown with an FLvalue opcode. This item is simply placed in the window, and it returns an ihandle value. That variable is used as the idisp argument to FLknob and will adjust the display as the knob is turned. After the FLrun command the FLsetVal_i statements place the pointers at their initial position. The color and other properties of the controls can be set by an additional series of opcodes.


The variables set by the two FLknob widgets are used in the reson opcode of the filtered pluck instrument. DVD example 14.5 has a video demonstration of the knobs in action. The audio in this example was recorded from the real-time output and has some glitches. The saved file is clean, but live performance with Csound should use external controls from MIDI or OSC rather than widgets.
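In that version, the reson line of Listing 14.11 reads the global widget variables instead of fixed values; a minimal sketch (not the author's exact code) would look like this:

a2      reson    a1, gkfreq, gkbw, 1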

Processing Audio Files

Many of the opcodes in Csound are useful for overall processing of audio files. I wouldn't recommend this for routine actions available in your audio editor, but for some specialized tricks and precise control Csound may provide the best approach. The first step is to create an instrument that simply plays an audio file, such as Listing 14.13.

        instr 1
asource soundin  "rachShort.aif"
        ; put process here
        out      asource
        endin

Listing 14.13  File playback.

This may seem the most useless of functions, but it opens a world of possibilities. Once the input file (here a clip of Rachmaninoff) is brought into Csound, any of the processing opcodes can be applied. For instance, it could be patched to the reson controlled by the knobs of Listing 14.12, and the action of the filter knobs would be immortalized as the file is rendered (DVD example 14.6). One slight hitch in that process is that the -o flag directs Csound to write to a file or the digital-to-analog converter (DAC), but never both. To actually hear our manipulation, we have to use the -odac flag and add an instrument that explicitly records the data, as in Listing 14.14.

; Instrument #999 - Save the global signal to a file.
        instr 999
aoutput monitor
        fout     "rachFilt.wav", 2, aoutput
        endin

Listing 14.14  Recording within Csound.

Listing 14.14 uses the opcode monitor to grab the signal after the out opcode of all instruments. The fout opcode writes data to a specified file. The argument after the filename is a code for formatting the file (alas, it is not enough to simply use a .wav suffix); 2 is the most likely, meaning use -A or -W format as specified by the flags. Although it is annoying to have to specify the filename in the instrument definition, this is still a handy way to document all manner of real-time action.


Of course the score must turn this instrument on for the duration of the piece.
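A score for this kind of real-time capture might therefore contain lines like the following. This is only a sketch; the 23-second duration is illustrative rather than taken from the text:

i1   0 23    ; play the source file through the processing instrument
i999 0 23    ; record the monitored output for the same span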

Resynthesis

Csound was the first platform capable of performing FFT analysis and resynthesis of an input signal. There are several ways of doing this, some dating back to the 1970s and a bit awkward, but the most interesting is real-time phase vocoding, pvs. This technique utilizes a new variable type, the fsig, which is a running analysis of the input. Listing 14.15 shows how to perform this analysis and resynthesis. It is a good idea to try this out simply to confirm that the conversion is not inherently damaging to the signal.

        instr 1
asource soundin  "rachShort.aif"
fsig    pvsanal  asource, 1024, 256, 1024, 0
        ; put process here
aout    pvsynth  fsig
        out      aout
        endin

Listing 14.15  Analyze and resynthesize.

In Listing 14.16 an opcode is added that changes the frequency of all components of the signal. We have encountered frequency shifting before. Since this change is not harmonically related to the pitch of the music, the result is a somewhat clangorous sound (DVD example 14.7). This process pushes some computers pretty hard, and you may hear glitches as the file renders. The finished file will be clean however.

        instr 1
asource soundin  "rachShort.aif"
fsig    pvsanal  asource, 1024, 256, 1024, 0
        ; change the frequency data
fproc   pvshift  fsig, -100, 0
aout    pvsynth  fproc
        out      aout
        endin

Listing 14.16  Frequency shifting.

The final example (Listing 14.17) is a vocoder (see chapters 5 and 13). It uses two sound files, the Rachmaninov clip and a young man reciting “Sally sells seashells by the seashore.” Since the poem is short, I have looped it with the


loscil opcode. This requires loading the file into a function table with a GEN01 routine. The loscil opcode has several optional arguments to control how it loops, but the most important is loop mode. The default mode looks for loop points in the file but does not detect them in many formats. Setting loop mode to 1 will loop the entire file, and other options are used to refine the loop points. Both files are converted to fsig format and sent to pvsvoc, which takes the frequency values from one signal and the amplitude values from the other. This can produce singing water heaters and other morphed sounds, but what you get depends on the actual frequency overlays. In this case, the vocal part is principally modulating the piano part of the Rachmaninov, as you can hear on DVD example 14.8.

<CsoundSynthesizer>
<CsOptions>
-W
</CsOptions>
<CsInstruments>
sr      =       44100
ksmps   =       128
nchnls  =       1
0dbfs   =       1

        instr 1
asigf   soundin  "rachShort.aif"
fsiga   pvsanal  asigf, 1024, 256, 1024, 0
asigamp loscil   1, 1, 1, 1, 1
fsigf   pvsanal  asigamp, 1024, 256, 1024, 0
fproc   pvsvoc   fsiga, fsigf, 1, 1
aout    pvsynth  fproc
        out      aout
        endin
</CsInstruments>
<CsScore>
f1 0 0 1 "sallyLoop.wav" 0 0 0
i 1 0 23
e
</CsScore>
</CsoundSynthesizer>

Listing 14.17  A Csound vocoder.


RESOURCES FOR FURTHER STUDY

Obviously, there is a great deal more to be learned about Csound. There are hundreds of opcodes, and more are added with each update. The traditional score may seem old-fashioned and tedious to use, but it can be the quickest way to generate short sections of audio with precise control over the results. Csound can also be used to generate files for samplers or source files for musique concrète techniques. It can be used in Max/MSP (see chapter 16) or connected via MIDI Yoke or virtual ports to the sequencing program of your choice. Csound is worth some study just for the understanding of synthesis you will develop. One of the features that recommends Csound is the number of excellent tutorials available.

Boulanger, Richard, ed. 2000. The Csound Book: Perspectives in Software Synthesis, Sound Design, Signal Processing, and Programming. Cambridge, MA: MIT Press. In addition to articles mentioned earlier, this book includes two CDs with hundreds of example instruments and scores.

Horner, Andrew, and Lydia Ayers. 2002. Cooking with Csound: Part 1: Woodwind and Brass Recipes. Middleton, WI: A-R Editions, Inc. A good practical approach to instrument design.

Roads, Curtis. 1996. The Computer Music Tutorial. Cambridge, MA: MIT Press. Addresses a wide range of topics but describes most things in Csound terminology.

There are also several online user groups where you can read about user mistakes and ask questions yourself: www.csounds.com/.


FIFTEEN Coding for Real-time Performance

The synthesis languages presented in the previous chapter implement an old-school computing workflow: code is written, the program executes, and the results are output. Even though Csound will process audio and respond to input as it is compiling, it is not a true instrument—it is playing a score, even if the score is only a duration. In addition, the underlying code is fixed. If you wish to change a routine, you must abort the compilation, edit the code, and start over. The latest generation of synthesis languages offers real-time coding—the ability to run the program and edit it at the same time. This allows a composer to work in a truly interactive way, hearing the results of an action immediately and immediately responding to what is heard. This is not the same as modifying parameters on the fly—new parameters can be invented and implemented or the very structure of the code can be changed.

Cmix is a MUSIC derivative language written by Paul Lansky at Princeton and further developed by several others at Columbia University. As the name suggests, it started out as a program for mixing sound files, with the mix controlled by a score document. Over the years the capabilities of the score were expanded to the point that Cmix became a full-blown synthesis language. The score in Cmix is not limited to a list of notes to play—it can generate new notes with algorithmic processes. RTcmix is the latest expansion of Cmix, written by Brad Garton and others. Instead of requiring a compilation, it runs in an interpreted language such as Python; this allows instant response to score commands. RTcmix can be used to design instruments that are both complex and flexible. The actual design is done in the host language, so appropriate programming skill is required. RTcmix is available at http://music.columbia.edu/cmc/RTcmix/.

SuperCollider was developed in the mid 1990s by James McCartney, who designed it as an interactive system from the ground up. SuperCollider follows a client-server paradigm: the server runs in the background of the operating system and will generate sound when it receives commands from a client. The server is controlled via a communications protocol known as Open Sound Control and can be managed by a variety of clients. Control code is written in an interpreted language with a syntax similar to C. You can type code into a window and execute it with a keystroke or


set up synth nodes to be played by external inputs. SuperCollider is extremely powerful—a skilled programmer would have little need for any other synthesis application. It is thoroughly documented with some good tutorials, but I would not recommend it as a starting language. SuperCollider is available at http://sourceforge.net/projects/supercollider/.

Impromptu is a real-time performance system developed by Andrew Sorensen. It allows direct manipulation of the Core Audio features of OS X Macintosh computers. This includes any plug-in that is compatible with the Apple Computer's Audio Unit format. The controlling language is Scheme, a dialect of Lisp known for fast response time. This combination is well suited to algorithmic composition, as it allows direct programmed control of all features of some of the best synthesis and processing software available. It is also well suited to interactive applications. It includes access to Apple graphics features such as QuickTime and Core Image. Composers who are familiar with Lisp will find Impromptu easy to learn and use. As an Apple-only system it will never be as widely accepted as others, but the intimate control it provides makes it worth investigating. Impromptu is available at http://impromptu.moso.com.au/index.html.

ChucK (http://chuck.cs.princeton.edu/) is a synthesis language designed expressly for improvisational programming, the performance practice known as live coding. Primary development was done by Princeton doctoral student Ge Wang under the supervision of Perry Cook and Dan Trueman. Graduate student computer language projects are often short lived, but Wang continues to support it as a faculty member at Stanford. ChucK is an example of a new breed of programs springing up postmillennium; these are focused on a specific programming domain, they tend to be fairly compact, and they value flexibility and productivity over rigor and consistency. Other programs a computer musician may encounter include the graphics language Processing and the Arduino microprocessor development environment Wiring. Languages of this class are quite accessible and make an effective introduction to programming, so we will explore ChucK in some depth.

OVERVIEW OF ChucK

ChucK has many unique features besides an innovative approach to capitalization, but the prime distinction is its approach to time. For programs such as Csound, time is a complete abstraction, a by-product of computing a given number of samples. In interactive programs such as Max/MSP, time is something to minimize in order to provide immediate response to inputs. In ChucK, time is the output canvas, the defining aspect of the music. The special privileges of time are seen in several features of the program, most notably that time does not pass unless the programmer allows it to. The main component of ChucK is the virtual machine (VM), a program that runs constantly in the background and to which the composer sends sections of code.


These pieces of code, called shreds (for scheduled thread), contain functions to execute and instructions as to when to execute them. There can be as many shreds active at a time as the hardware can manage, so the old paradigm of a computer executing one line of code at a time is thrown right out the window. A composition does not consist of a distinct score and an orchestra but of a library of instrument routines that know what to play. In addition, the traditional cycle of code, compile, execute is eliminated or greatly truncated. When a shred is sent to the VM, it begins to run immediately, right on top of (and synchronized with) whatever is already running. In a ChucK composition, the programmer becomes the performer. Wang and others organize concerts in which the participants assemble programs on the fly.

ChucK is a command-line application, but it comes with a choice of front ends: a fanciful three-dimensional interface called the Audicle and a more prosaic development environment called miniAudicle. Since the Audicle is designated an "alpha release," I will use miniAudicle for the examples here. MiniAudicle provides a window to control the virtual machine and a window for each shred. You type the code into a shred window, start the virtual machine, and click the icon labeled "add shred." If all is well, you will hear some tones. If not, an error message will appear in a console monitor window.

The ChucK language resembles C++ and Java, but there are significant differences. Looking at a bit of code (Listing 15.1), the first startling feature is that the data in a line flows from left to right.

SawOsc osc1 => dac;

Listing 15.1  The chuck operator.

This unusual arrangement is indicated by the => symbol, which is called the chuck operator. That's chuck as a verb, as in "chuck the can in the recycle bin." The meaning of the chuck operator varies with context, but it always implies transfer of data or a data stream. In Listing 15.1 the meaning is "a sawtooth oscillator named osc1 is created and connected to a destination called dac." The predefined variable dac indicates audio output. SawOsc is one of more than eighty unit generators that can produce or process audio. This line defines a tone, but nothing will happen until the tone is given duration. This is done with another use of the chuck operator (Listing 15.2).

2::second => now;

Listing 15.2  Advancing time.

The line in Listing 15.2 can be loosely construed as “continue for 2 seconds.” A second in the ChucK language is a unit of duration, which is treated as a variable type. (Typing in computer languages means variables are limited to particular values and treated in a specific way.) ChucK defines a set of time units from a single sample to a week. The same units are used for a related data type known as time.


FIGURE 15.1  MiniAudicle windows for ChucK.

The built-in variable now contains the current time, but not in the sense of reading a clock (which is the approach in most programming languages). The now variable only changes when the program explicitly changes it, and such changes occur at the real-time rate. The code in Listing 15.2 causes all operations besides signal generation to wait for two seconds, thus "advancing time."

SawOsc osc1 => dac;
2::second => now;

Listing 15.3  Two-second Tone.

The code in Listing 15.3 will play a tone for two seconds. The phrase SawOsc osc1 is an example of object-oriented programming. A thorough discussion of object-oriented programming requires a book rather than a chapter, but the short version defines an object as a collection of data with associated functions. The description of an object is contained in a class definition, and when the object is put


to use an instance of the class is constructed and named. The phrase SawOsc osc1 creates an instance of the SawOsc class called osc1. The phrase SawOsc osc2 creates another instance. The instances have distinct data and the same list of functions. The data associated with an object is accessed with a dot; SawOsc has a member variable called gain, and the gain of osc1 is named osc1.gain. Class variables may be set by chucking, as shown in Listing 15.4.

.5 => osc1.gain;

Listing 15.4  Class variable.

Functions that belong to a class are also accessed by a dot, the difference being that a function name always includes a pair of parentheses, which may contain arguments. The fact that the most common use for a member function is to set a variable, and those functions share the variable name is potentially confusing, but once you learn to watch for parentheses it will become clear. As an example, the ADSR class has a member variable called attackTime. The attack of an ADSR object named env1 can be set either by env1.attackTime(20::ms) or 20::ms => env1.attackTime. Objects of the ADSR class also have a set function that sets all four parameters at once, as shown in Listing 15.5.

SawOsc osc1 => ADSR env1 => dac;
env1.set(10::ms, 8::ms, .5, 500::ms);

Listing 15.5  Class variable set function.

You can see that the arguments to a function are separated by commas, just as we encountered in Csound. In this case, the four arguments are the familiar attack, initial decay, sustain level, and release. You will note that the output of osc1 is chucked to env1 which is then chucked to the dac for output. The ADSR object is actually a transient generator connected to an amplifier. It does not produce a control signal; instead it directly modifies the amplitude of its input signal. Thus ADSR is not directly available to control parameters other than signal gain, but it is convenient in the most common connection. The ChucK version of basic beep is written on one line. To play it, we just need to set parameters and add time control to the audio patch, as shown in Listing 15.6.

SawOsc osc1 => ADSR env1 => dac;
env1.set(10::ms, 8::ms, 0.5, 500::ms);
env1.keyOn();
400::ms => now;
env1.keyOff();
500::ms => now;

Listing 15.6  Basic beep.

The ADSR member function keyOn() begins the attack phase of the envelope, which will actually be computed as time advances during the following chuck to


now operation. Chucking 400 milliseconds will produce a tone of that duration. We follow the keyOff() function with another 500 milliseconds to allow time for the decay.

To generate a simple musical phrase (actually, just a scale) we use a basic program structure called a loop. ChucK uses the for style of loop as found in the languages C and Java. This is shown in Listing 15.7. The for keyword takes three arguments, each of which is a bit of code (note that they are separated by semicolons). The first argument declares a variable to use to count the iterations of the loop and initializes it. In Listing 15.7 the integer variable p is set to start with a value of 60 (60 is chucked to p). The second argument of the for statement is a test that will be performed at the beginning of each cycle. If this test is true, the loop will continue. In this case we test to see if the variable p is less than 73. With a starting value of 60, we are defining a chromatic scale of one octave. The third argument in the parentheses is an operation to update the loop variable. The rather enigmatic p++ means increment p by one. This happens at the end of each cycle. The actual work to do inside the loop is enclosed in curly braces. This will happen thirteen times, with p equal to 60, then to 61, and so on to 72.

SawOsc osc1 => ADSR env1 => dac;
env1.set(10::ms, 8::ms, 0.5, 500::ms);
for(60 => int p; p < 73; p++){
    Std.mtof(p) => osc1.freq;
    env1.keyOn();
    75::ms => now;
    env1.keyOff();
    75::ms => now;
}
500::ms => now;

Listing 15.7  A simple scale.

The code in curly braces will generate a note each time around. The function Std.mtof() converts the value of p to a frequency for osc1. The durations set a pattern of staccato eighth notes, which can be heard in DVD example 15.1. You will note that this scale is not polyphonic, as osc1 is retuned each time around. To get polyphonic sounds, we need to run several shreds at the same time. The first step is to define a function that plays a sound, as shown in Listing 15.8. The function declaration begins with the keyword fun, followed by the type of data the function returns (void means the function does not return data). The name of the function is next, followed by a parenthetical list of arguments. Each entry in the list defines a variable (with a type and a name) to use in the function definition. When the function is called, the arguments in the call will be passed into the function code in order. In the beep function defined in Listing 15.8 the first argument is used to set the variable named pitch.


// create a note function
fun void beep(int pitch, int duration){
    SawOsc osc1 => ADSR env1 => dac;
    env1.set(10::ms, 8::ms, 0.5, 500::ms);
    Std.mtof(pitch) => osc1.freq;
    env1.keyOn();
    duration::ms => now;
    env1.keyOff();
    500::ms => now;
}

Listing 15.8  Sound producing function.

The mechanism for starting a shred is called sporking. The spork call is shown in the third line of Listing 15.9. The tilde is part of the spork syntax. The loop will execute every 200 milliseconds, starting a new pitch each time. Since the sound defined in beep() is 500 ms longer than the duration, there will be up to three shreds playing at any time (if you watch the virtual machine window, you can see them turn on and off). This can be heard in DVD example 15.2.

// call the function concurrently
for(60 => int p; p < 73; p++){
    spork ~ beep(p, 200);
    200::ms => now;
}
1000::ms => now;

Listing 15.9  Sporking a function.

To make ChucK respond to MIDI notes, we adopt a similar strategy. This requires another ChucK concept, the event class. An event object can be chucked to now, with the result that the shred will pause until the event occurs. There are various types of events, including a MidiIn class. Use of this class is illustrated in Listing 15.10. The first half of the code is the beep function used before. After that, a MidiIn object called newMidi is created, along with an object of type MidiMsg, which is a container for the data of a MIDI message. A MidiIn object will not function until it has been attached to a MIDI input port on the computer. There’s a lot to this process, all covered in the open() function. Calling newMidi.open(0)will connect the newMidi event to MIDI port 0. If there is no MIDI port 0, or if it’s not working for some reason, the me.exit() function will shut the shred off (me is a shred’s name for itself; the shred is a class and thus has member functions like exit()). Discovering the number for a particular MIDI port can be a real chore. To simplify this, ChucK has a probe utility that will list the ports. Using numbers for access is tricky, because the numbers change when interfaces are changed or certain programs run.
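If you are working at the command line, the probe is run with the chuck command itself; the exact listing you get depends on your hardware and drivers. A minimal sketch (assuming the chuck command is on your path):

> chuck --probe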


Once a port is connected, chucking newMidi to now will prime the program to wait for MIDI input. You will note that this is in a while(true) loop. A while loop normally contains a test that determines whether to keep running. Generally, the test involves some action inside the loop. If the test is simply true, the loop will run until the shred is halted. In this case, nothing will happen until a MIDI message arrives. Once a MIDI message turns up the function newMidi.recv(msg) will transfer the data to the msg object. The actual contents of the message are contained in msg.data1, msg.data2, and if necessary msg.data3. These must be examined to see if this is a note. A note on channel one will have data1 (status) of 144. Sometimes a note on message will have a velocity of zero, meaning it's really a note off, so data3 has to be checked to see that it is not zero. The enigmatic code in the if statement does all of this; the double equal signs mean "is equal to," the != symbol means "not equal to," and the && ties the two tests together so they both must be true. When those conditions are met, the beep() function is sporked with the pitch derived from data2.

// create a note function
fun void beep(int pitch, int duration){
    SawOsc osc1 => ADSR env1 => dac;
    env1.set(10::ms, 8::ms, 0.5, 500::ms);
    Std.mtof(pitch) => osc1.freq;
    env1.keyOn();
    duration::ms => now;
    env1.keyOff();
    500::ms => now;
}

MidiIn newMidi;   // MIDI event object
MidiMsg msg;      // container for MIDI data

// open midi receiver, exit on fail
if (!newMidi.open(0)) me.exit();

while( true )     // loop forever
{
    // wait on midi event
    newMidi => now;
    newMidi.recv(msg);
    if(msg.data1 == 144 && msg.data3 != 0){
        spork ~ beep(msg.data2, 200);
    }
}

Listing 15.10  Playing from MIDI.


As you probably have gathered by now, ChucK requires a fair amount of programming skill. Luckily, ChucK is a good way to learn programming, and some schools have adopted it for computer music classes. We’ve already seen some basic examples, so let’s use ChucK to expand our knowledge.

PROGRAMMING LESSONS

Lesson 1—Variables

I mentioned parenthetically that ChucK uses typed variables. Variable typing is simply specifying what you intend to use a variable for. A variable is declared by writing the type followed by the name of the variable. Once a variable has been declared, you assign a value by chucking it to the variable. An error will be generated if you try to assign a value of the wrong type. In ChucK version 1.2 there are seven types:

Int—This is an integer. Restricted to whole numbers, ints are typically used for counting. They are efficient for the computer and work well in all math operations except division, where something like 3/2 will return 1.

Float—This is a floating point number, which will have a decimal point. Floats are double precision, accurate to about fifteen significant digits, and they can represent values as large as roughly 10 to the 308th power. You can assign an int to a float variable, but not vice versa. You can cast a float to an int however. Casting converts the type even if it will change the value. In ChucK this is done with a dollar sign, as shown in Listing 15.11.

7.5 $ int => int X;

Listing 15.11  A cast.

Time—This type is unique to ChucK. It includes a float number and a unit, which may be samp (one sample at the going rate), millisecond, second, minute, hour, day, or week. (I can’t imagine what kind of music would be calculated in weeks, except maybe opera.)
Dur—This is duration. The units are the same as for the time variable, but the meaning is different. The distinction becomes clear when you chuck a value to now. If you chuck a time, now is reset to that time (if it’s in the future). If you chuck a duration, it is added to the current now. If it’s just a number and unit that is chucked, it is treated as a duration. There’s a complicated scheme for doing math with time and durations that determines the type of the result, but it follows common sense (time – time is dur, time + dur is time, and so on; see the manual for details).


Void—This means nothing. Void is used in function definitions to mean the definition does not return anything.
Complex—A number based on the square root of -1. There is no such thing as √-1, but it would be handy, because you could square it and get a negative number. So mathematicians invented it and named it i (for imaginary), but engineers already had a use for i, so they call the √-1 j. Computer music uses j as well. Complex numbers are the sum of a real number and an imaginary number, written as A + Bj. You will remember from the discussion of FFTs in chapter 13 that each bin has a real and an imaginary part. In other words, FFTs produce complex numbers.
Polar—Another way of representing complex numbers. When we translated FFTs from real and imaginary form to amplitude and phase, we were converting to the polar form.
The point of a variable is that once it has been assigned a value, you can put the variable in any code that would work with that value. This lets you do math with numbers you don’t ever know. Listing 15.9 uses the variable p as a loop counter that also determines pitch. One thing about variables that often catches experienced as well as new programmers is scope. Scope describes where the variable is active (we say visible) in the program. A variable that is declared within a function or loop has local scope and is only visible within the curly braces. A variable that is declared in the body of the file, outside any curly braces, has global scope and is visible everywhere after the declaration, even inside functions and loops. If the same name is used for both a local and a global variable, the local version will hide the global one in the local scope. Many programmers develop one set of names for local variables and a different set for global variables. In any case, the name of a variable should clearly describe what it is for. (Always write code that can be read by strangers. The most likely stranger is you, in about six months.)
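To make these ideas concrete, here is a minimal sketch of my own (not one of the book's listings) that declares a few typed variables and shows local versus global scope:

440.0 => float freq;               // a float, global scope from here down
250::ms => dur step;               // a duration
now + 2::second => time stopAt;    // a time two seconds in the future

fun void pulse(){
    0 => int count;                // an int with local scope, gone when pulse() ends
    while(now < stopAt){
        count++;
        step => now;               // advance time by the duration
    }
    <<< count, "steps at", freq >>>;   // the globals are visible in here
}
spork ~ pulse();
2::second => now;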

Lesson 2—Operators
Once we have variables, we want to do math with them. The basic operators are + (add), – (subtract), * (multiply), and / (divide). These all work in the normal fashion and can be combined with the chuck operator to do math on the target. For example, 2 +=> X means add 2 to X. The minus sign can be placed in front of a variable to return the negation of the variable: if X is 5, -X will return -5. A double plus sign attached to an int variable, as in X++, means increment the variable by 1. You often see this in the middle of some code that uses the value of the variable. If the plus signs are after the variable, that means increment after the variable is used. If the plus signs come before, as in ++X, that means increment the variable, then use it. Using minus signs, instead of plus signs, decrements the variable.
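A quick sketch (my own) of these operators at work:

5 => int x;
2 +=> x;        // add-and-chuck: x is now 7
x++;            // post-increment: x is now 8
<<< -x >>>;     // negation, prints -8
x--;            // decrement: back to 7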


One operator only found in computer programs is % (rem (remainder) or mod (modulo)). It divides one integer into another and returns the remainder. It’s actually quite useful because you can repeatedly add a number to a running total and stay below a limit. The code in Listing 15.12 will count from 0 to 9 ten times.

for(0 => int i; i < 100; i++){
    <<< i % 10 >>>; // <<< >>> means print in the console window
}
Listing 15.12

The % operator.
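Another everyday musical use, sketched here as my own example, is folding any MIDI note number into a single octave:

79 => int note;                 // G5
note % 12 => int pitchClass;    // 7, a G
60 + pitchClass => int folded;  // 67, the G just above middle C
<<< folded >>>;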

There are operators that work on the bits that make up numbers in the computer. These follow a set of rules called Boolean logic that is the underlying technology of everything the computer does. Since they work on bits, the only possible input and results are 0 or 1. The rules are simple:

AND (&):  1 & 1 = 1,  0 & 1 = 0,  1 & 0 = 0,  0 & 0 = 0
OR (|):   1 | 1 = 1,  0 | 1 = 1,  1 | 0 = 1,  0 | 0 = 0
XOR (^):  1 ^ 1 = 0,  0 ^ 1 = 1,  1 ^ 0 = 1,  0 ^ 0 = 0

We also have <<, which shifts all bits left, and >>, which shifts all bits right. These are equivalent to multiplying or dividing by two. Bitwise operators are useful in deciphering MIDI codes, in which individual bits have meaning. The code in Listing 15.13 will set a variable ch to the channel number of the MIDI message that came in as msg. The channel number is encoded as the four least significant bits of the status message (data1). The number 15 works as a mask for these bits because its binary value is 00001111. The next line of the example will shift the upper four bits into the lower four, which isolates the status part of the message. A note on message would have a status of 9.

msg.data1 & 15 => int ch;
msg.data1 >> 4 => int status;
Listing 15.13

Bitwise operations.
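Here is a self-contained sketch of my own, patterned on Listing 15.10, that uses the shifted status to report note ons on any channel:

MidiIn newMidi;
MidiMsg msg;
if( !newMidi.open(0) ) me.exit();

while( true ){
    newMidi => now;                  // wait for a message
    newMidi.recv(msg);
    msg.data1 >> 4 => int status;    // 9 = note on, 8 = note off
    msg.data1 & 15 => int ch;        // channel 0-15
    if(status == 9 && msg.data3 != 0)
        <<< "note on, channel", ch, "pitch", msg.data2 >>>;
}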

Lesson 3—Library Functions
For more advanced math, we turn to libraries of functions. In ChucK, these functions are named by libraryname.functionname, so the square root of 2 would be found by Math.sqrt(2). There are about fifty such functions. The library function list in the ChucK documentation deserves a bookmark in your browser. The following is a list of some useful ones.
Std.abs(int x) returns the absolute value of an integer.
Std.fabs(float x) returns the absolute value of a float. A lot of functions exist in versions for floats and ints.


Std.rand() returns a random integer. These tend to be huge.
Std.rand2(int min, int max) returns a random integer between min and max. This is often more practical than Std.rand().
Std.randf() returns a random float between -1 and 1.
Std.rand2f(float min, float max) returns a random float between min and max.
Std.mtof(float x) returns a frequency for a MIDI note value. A fractional note number will be detuned; for example, the note number 60.5 is a quarter step above middle C.
Std.dbtorms(float x) returns an amplitude value for the specified dB level. This value is suitable for setting gains. I prefer to do my amplitude calculations in dB because I know what that sounds like. Std.dbtorms(100) returns 1, which is full gain.
Math.sin(float x) returns the sine of a value.
Math.cos(float x) returns the cosine of a value.
Math.pow(float x, float y) returns x raised to the yth power.
Math.trunc(float x) returns a float without anything after the decimal point.
Math.min(float x, float y) returns the lesser of x and y.
Math.max(float x, float y) returns the greater of x and y.
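A small sketch of my own that strings a few of these together:

Std.rand2(48, 72) => int note;    // a random MIDI note around middle C
Std.mtof(note) => float freq;     // convert it to Hz
Std.dbtorms(90) => float amp;     // a gain a little below full scale
<<< note, freq, amp >>>;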

Lesson 4—Decisions
A lot of code is about making choices. The most common choice is to do something or not. This is handled by the if statement, which has the form shown in Listing 15.14.

if(X == 1){
    // code to do if x equals 1;
}else{
    // optional code to do if x does not equal 1;
}
Listing 15.14

If statement.

The code in parentheses after if is the test. If that results in 0, the test fails and the code immediately following else will execute. If the test results in anything but 0, the code immediately following if will be executed instead. The curly braces are not required if there’s only one line of code, but I often use them anyway. Anything can be in the test (the word true is always 1, for instance), but you usually see logical operators in there. The logical operators always return 1 or 0. The most important is == which means equals: 2 == 2 returns 1, 2 == 3 returns 0. The others are:


Not equal (!=): 3 != 2 returns 1
Greater than (>): 3 > 2 returns 1
Less than (<): 2 < 3 returns 1
Greater than or equal (>=): 3 >= 3 returns 1, 2 >= 3 returns 0
Less than or equal (<=): 3 <= 3 returns 1, 3 <= 2 returns 0
Listing 15.15 puts the if statement to work. On each pass through a loop, a random number picks one of five bell pitches, and a chain of if statements sounds the chosen bell.

JCRev verb => dac; // a reverb
.8 => verb.gain;   // reverb settings
.2 => verb.mix;

fun void bell(int pitch){
    TubeBell bl => verb; // a bell model
    .5 => bl.gain;
    Std.mtof(pitch) => bl.freq;
    1.0 => bl.noteOn; // this sounds the bell
    800::ms => now;   // bells have long decays
}

for(0 => int i; i < 100; ++i){        // loop 100 times
    Std.rand2(0,5) => int choice;     // pick one of five options
    if(choice == 0)
        spork ~ bell(72);
    else if(choice == 1)
        spork ~ bell(76);
    else if(choice == 2)
        spork ~ bell(79);
    else if(choice == 3)
        spork ~ bell(71);
    else if(choice == 4)
        spork ~ bell(59);
    250::ms => now; // 1/8 notes at 120
}
spork ~ bell(72); // grand finale
spork ~ bell(76);
spork ~ bell(67);
2000::ms => now; // time for reverb tail
Listing 15.15

Windy chimes.

Lesson 5—More about Loops
The for loop is the workhorse of most programs. One interesting thing you can do with loops is put one inside another. Listing 15.16 uses the same TubeBell function in a different loop.

int i, k;
for(0 => i; i < 25; ++i){
    for(0 => k; k < 6; ++k){
        spork ~ bell(72 + i/2 + k);
        125::ms => now;
    }
}
spork ~ bell(72 + i/2 + k);
2000::ms => now; // time for reverb tail
Listing 15.16

Glassy chimes.

There are no random operations here; the pitches are calculated from the loop indices. The index variables i and k are declared before the loops to give them global scope. Note that they are integers. The outer loop, which uses the i index,
will run twenty-five times. The inner loop, controlled by k, will run six times for each i. That’s a total of 150 notes generated by four lines of code. The pitches are calculated by the formula 72 + i/2 + k. Here, 72 (fourth space C) will be the starting pitch, because i and k will both be 0. On the next five notes, k will increment while i stays 0, so the result is a six-tone chromatic scale. Then i increments to 1, but since i is divided by 2 in the formula, the scale will repeat, because an integer of 1 divided by 2 is 0. Only on the third iteration of the outer loop will the scale be transposed up a step. You can hear the result in DVD example 15.4. I added the final note primarily to illustrate what happens to the loop indices when we are done with them. A loop terminates when the test in the second part of the setup fails, so i will equal 25. Integer 25 divided by 2 will come out 12, which added to 72 is 84, so the last scale will start on high C. At the end, k will equal 6, so the tag note is 90 (F#).
We have already seen one example of a while loop (Listing 15.10). The while loop only has one test, which is checked before the loop executes. The code in Listing 15.17 will print the word loop three times.

0 => int i;
while(i < 3){
    <<< "loop" >>>;
    ++i;
}
Listing 15.17

A while loop can also end a process when a performer gives a signal. Listing 15.18 modifies the MIDI code of Listing 15.10 so that it keeps playing until a note above the staff arrives.

if ( !newMidi.open(0) ) me.exit();
0 => int lastPitch;
while(lastPitch < 84){ // loop until player hits high note
    newMidi => now;
    newMidi.recv(msg);
    if(msg.data1 == 144 && msg.data3 != 0){
        spork ~ beep(msg.data2, 200);
        msg.data2 => lastPitch;
    }
}
Listing 15.18

Stop via MIDI.

The break statement offers another way out: it jumps out of a loop immediately, wherever it appears. Listing 15.19 produces the same effect with a break, this time without playing the final high note.

if ( !newMidi.open(0) ) me.exit();
0 => int lastPitch;
while(true){ // loop until player hits high note
    newMidi => now;
    newMidi.recv(msg);
    if(msg.data1 == 144 && msg.data3 != 0){
        msg.data2 => lastPitch;
        if(lastPitch > 83)
            break;
        else
            spork ~ beep(msg.data2, 200);
    }
}
Listing 15.19

Code to produce a break.

There is another loop controller called until. This is the opposite of a while—looping will occur as long as the test is false. A variant of these uses the keyword do (with no test) at the beginning of the loop, and the while or until at the end. The code in a do loop (Listing 15.20) will execute once no matter what the conditions are.

0 => int i;
do{
    <<< "loop" >>>;
    ++i;
} while(i < 3);
Listing 15.20

Lesson 6—Arrays
Picking pitches with a chain of if statements works, but an array is tidier. An array is a numbered list of values that can be set up once and then read with an index. Listing 15.21 rewrites the wind chimes of Listing 15.15 to choose pitches from an array.

JCRev verb => dac; // a reverb
.8 => verb.gain;
.3 => verb.mix;
fun void bell(int pitch){
    TubeBell bl => verb; // a bell model
    .5 => bl.gain;
    Std.mtof(pitch) => bl.freq;
    1.0 => bl.noteOn; // this sounds the bell
    800::ms => now;   // bells have long decays
}

[72,76,79,71,59,0] @=> int pitches[]; // create array
for(0 => int i; i < 100; ++i){
    Std.rand2(0,5) => int choice;
    if(pitches[choice] != 0) // 0 = rest
        spork ~ bell(pitches[choice]); // play chosen pitches
    250::ms => now;
}
spork ~ bell(72); // grand finale
spork ~ bell(76);
spork ~ bell(67);
2000::ms => now; // time for reverb tail
Listing 15.21

Choosing from an array.

The line with the @=> (at chuck) creates an array of ints called pitches. The brackets define the initial contents of the array. To get the value of a member of an array, use the array name with an index contained in square brackets. The values in the array in Listing 15.21 can be accessed by pitches[0] up to pitches[5]. Pitches[2] will return 79, until changed by chucking an int to pitches[2]. The index can be no lower than 0 and no larger than one less than the array size—if an array AR has one hundred elements, asking for AR[100] is an error. You can also declare an array by simply providing type, name, and size as in Listing 15.22. The values are initially 0.

int anArray[100];
Listing 15.22

Array declaration.
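A quick sketch (not from the text) of filling and reading such an array in a loop:

int scale[8];
for(0 => int i; i < scale.size(); ++i)
    60 + 2 * i => scale[i];    // a whole-tone row starting at middle C
<<< scale[3] >>>;              // prints 66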

Here’s an important musical use for an array. The MIDI examples shown so far play notes of fixed length, without responding to note off messages. Since the note off is probably the second most important message, we must be able to deal with it. Listing 15.23 shows one way.

int notesPlaying[128]; // an array to mark playing notes

// create a note function
fun void beep(int pitch, int duration){
    SawOsc osc1 => ADSR env1 => dac;
    20::ms => env1.attackTime;
    env1.set(10::ms, 8::ms, 0.5, 500::ms);
    Std.mtof(pitch) => osc1.freq;
    env1.keyOn();
    while(notesPlaying[pitch]){ // check to see if note is off
        10::ms => now;
    }
    env1.keyOff();
    500::ms => now;
}

MidiIn newMidi; // MIDI event object
MidiMsg msg;    // container for MIDI data

// open midi receiver, exit on fail
if ( !newMidi.open(0) ) me.exit();

while( true ){
    newMidi => now; // wait on midi event
    newMidi.recv(msg);
    if((msg.data1 == 144 && msg.data3 == 0) || (msg.data1 == 128))
        0 => notesPlaying[msg.data2]; // if it's note off, clear
    if(msg.data1 == 144 && msg.data3 != 0){
        1 => notesPlaying[msg.data2]; // mark the note as playing
        spork ~ beep(msg.data2, 200);
    }
}
Listing 15.23

Note on and note off.

The code in Listing 15.23 starts by creating the array notesPlaying[] with 128 members, one for each possible note. Next, the beep function has been modified to check (at 10 ms intervals) if notesPlaying[pitch] contains a 1 or 0, and to shut off if the latter. Finally, another test is added to check for note off in both of its forms. Note off puts a 0 in notesPlaying[pitch]. Note on puts a 1 there, so the sound will be sustained from note on to note off. This rather convoluted process is diagrammed as a flowchart in Figure 15.2. The symbols in flowcharts may vary, but the diamonds always represent tests. Flowcharts can be a useful tool for designing programs, especially if there are a lot of conditions to respond to. Arrays in ChucK are a class. This means they have member functions that can do useful things. Array.size() returns the length of the array and can be used to change the length—for example, Array.size(12) will set the length to 12. Array.clear() will set all values to 0.
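A tiny sketch (mine) of those member functions:

int marks[16];
<<< marks.size() >>>;    // prints 16
marks.size(12);          // the array now holds 12 elements
marks.clear();           // every value back to 0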


FIGURE 15.2

Flowchart of MIDI parser. (Wait for a message; a note off clears notesPlaying[p]; a note on sets notesPlaying[p] and sporks the note; each sporked note waits in 10 ms steps until notesPlaying[p] returns to 0, then ends.)

Lesson 7—More about Functions
Functions are used for more than sporking notes. Any section of code that is repeated three or more times can profitably be replaced by a function. For instance, if we need to add up the values in an array, we would do something like Listing 15.24.


fun int arraySum(int theArray[]){
    theArray.size() => int length;
    0 => int total;
    for(0 => int i; i < length; ++i)
        theArray[i] +=> total;
    return total;
}
Listing 15.24

Function definition.
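A short usage sketch of my own for arraySum(), assuming the definition in Listing 15.24 sits above it in the file:

[3, 5, 7, 9] @=> int numbers[];
arraySum(numbers) => int total;
<<< total >>>;   // prints 24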

Notice the use of the keyword return. This instructs the compiler to place the returned value at the location of the function call. When a function returns a value, that’s it; no code after the return will be executed. It’s not unusual to see two returns in a function chosen by an if statement, or a return in the middle of a loop.
Functions make it possible to write simple programs that do complicated things. I’ll illustrate with a Markov chain program. Markov chains are basic tools of algorithmic composition. They set a probability of choosing a new pitch based on the current pitch. An ordinary probability is the likelihood of getting a particular pitch from a chance (or stochastic) operation. If you generate one hundred notes and twelve of them are G, the probability of G is 12/100 or 0.12. Listing 15.25 shows how to generate a series of pitches that follow a probability expressed in a list of twelve items. (Pitches are described by pitch class as discussed in the last chapter.) The pitches correspond to the index of the array; the value at that index is the probability of that pitch occurring.

//here's a test array
[0.2,0.,0.1,0.,0.15,0.2,0.,0.2,0.,0.1,0.,0.05] @=> float probArray[];

fun int getPitch(float pArray[]){
    arraySum(pArray) => float circumference; // assumes a version of arraySum() for float arrays
    Std.rand2f(0,circumference) => float pointer;
    0 => float runningTotal;
    for(0 => int i; i < pArray.size(); ++i){
        pArray[i] +=> runningTotal;
        if(runningTotal >= pointer) return i;
    }
}
Listing 15.25

Generating a probability series.

Listing 15.25 shows a typical probability array that should sound somewhat like C major. The numbers in the array add up to 1.0, which is usually required in probability distributions. This is known as a normalized array (you normalize an array by finding the sum of all elements and then dividing each element by the sum). I prefer to use nonnormalized arrays, which adds a bit of complexity to the probability routine but leaves the program immune to data entry mistakes.
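For reference, here is a minimal sketch of my own of normalizing an array in place, as described above:

[2., 1., 3., 2.] @=> float weights[];
0. => float sum;
for(0 => int i; i < weights.size(); ++i)
    weights[i] +=> sum;               // total of all elements
for(0 => int i; i < weights.size(); ++i)
    weights[i] / sum => weights[i];   // each element divided by the sum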


FIGURE 15.3

A wheel of pitches.

The function getPitch() works like a wheel of fortune. Imagine such a wheel with a circumference of 120 inches (Figure 15.3). If the wheel were divided into twelve sections, and each section was a 10 inch arc, a spin of the wheel would be just as likely to come up with one pitch as another. However, if the segments varied in width, some pitches would be more likely than others. The probability array sets the width of the segments. The first half of getPitch() is the wheel spin, leaving the variable pointer somewhere between 0 and the circumference. The for loop finds the place in the probability array that contains the pointer.

Lesson 8—Multidimensional Arrays
A Markov chain takes the process one step further. Not only do we pick a pitch according to an array of probabilities, we pick an array of probabilities depending on the current pitch. This requires twelve different arrays, all held in a superarray. An array of arrays is called a multidimensional array. A multidimensional array is defined by two or more indices, which in the two-dimensional form can be thought of as row and column. A two-dimensional array can be defined as shown in Listing 15.26.

[[1,2,3],
 [2,5,6],
 [7,8,9]] @=> int mdArray[][];
<<< mdArray[0][1] >>>; // prints 2
<<< mdArray[1][1] >>>; // prints 5
Listing 15.26

Two-dimensional array.


The ChucK compiler is indifferent to line breaks, so you are free to lay out the definition in a way that looks properly organized. The item in the upper left of the array has indices of [0][0]. The reference mdArray[0][1] returns 2, and mdArray[1][1] returns 5. Arrays can have more than two dimensions, although there are usually better ways to handle that amount of data. The Markov routine requires a getPitch() function that will work with two-dimensional arrays. Listing 15.27 is a simple modification of Listing 15.25.

fun int getPitch(float mArray[][], int row){
    mArray[row] @=> float rowArray[];
    arraySum(rowArray) => float circumference;
    Std.rand2f(0,circumference) => float pointer;
    0 => float runningTotal;
    for(0 => int i; i < rowArray.size(); ++i){
        rowArray[i] +=> runningTotal;
        if(runningTotal >= pointer) return i;
    }
}
Listing 15.27

Markov chain for choice of pitch.

This uses a trick unique to the ChucK language. You can use @=> to create a reference to a row in a two-dimensional array that ChucK will then treat as a simple array. This is just a reference to part of the same data. If you change the original array, the values accessed via the reference will change too. To put it to the test, create a twelve by twelve array and write a loop as in Listing 15.28.

[ // probability of next PC based on current PC
[0.,0.1,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.], //C
[0.,0.,0.1,0.,0.,0.,0.,0.,0.,0.,0.,0.], //C#
[0.,0.,0.,0.1,0.,0.,0.,0.,0.,0.,0.,0.], //D
[0.,0.,0.,0.,0.1,0.,0.,0.,0.,0.,0.,0.], //D#
[0.,0.,0.,0.,0.,0.1,0.,0.,0.,0.,0.,0.], //E
[0.,0.,0.,0.,0.,0.,0.1,0.,0.,0.,0.,0.], //F
[0.,0.,0.,0.,0.,0.,0.,0.1,0.,0.,0.,0.], //F#
[0.,0.,0.,0.,0.,0.,0.,0.,0.1,0.,0.,0.], //G
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.1,0.,0.], //G#
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.1,0.], //A
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.1], //A#
[0.1,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.]  //B
] @=> float markArray[][];

0 => int lastPitch;
while(true){
    getPitch(markArray,lastPitch) => int p;
    p => lastPitch;
    60 +=> p;
    spork ~ bell(p);
    200::ms => now;
}
1000::ms => now;
Listing 15.28

Markov chain test code.

The tune specified in markArray for Listing 15.28 is not inspiring, but it’s a good test of the routines. If all is well, this will play scales. With the modified getPitch() function, all we have to do is keep track of lastPitch and pass it to the function call. Listing 15.29 has a more satisfying result as heard in DVD example 15.5.

[ // probability of next PC based on current PC
[0.,0.,0.1,0.,0.1,0.4,0.,0.4,0.,0.,0.,0.],  //C
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.],      //C#
[0.,0.,0.,0.,0.,0.4,0.,0.5,0.,0.,0.,0.1],   //D
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.],      //D#
[0.3,0.,0.,0.,0.,0.3,0.,0.3,0.,0.1,0.,0.],  //E
[0.3,0.,0.,0.,0.4,0.,0.,0.3,0.,0.3,0.,0.],  //F
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.],      //F#
[0.5,0.,0.,0.,0.,0.3,0.,0.,0.,0.,0.,0.2],   //G
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.],      //G#
[0.3,0.,0.3,0.,0.3,0.,0.,0.,0.,0.,0.,0.1],  //A
[0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.],      //A#
[0.7,0.,0.2,0.,0.,0.,0.,0.1,0.,0.,0.,0.]    //B
] @=> float markArray[][];
Listing 15.29

A Markov tune.

Designing a Markov array is simply a matter of working through the pitches and deciding what the probability of each following pitch should be. It’s easy to play with values if they do not have to add up to any special number. With this code the relative values in a row are all that matter. This makes it simple to hook up sliders or graphs for user controls. Of course, there’s still a lot of work to do to transform patterns of pitches into satisfying music.

Lesson 9—More about Time
We have already seen that ChucK has special data types and units for time. However, these units are not particularly musical; in particular, they represent clock time, which always passes at a steady rate. Music is measured in duration using the values of whole note, quarter note, eighth note, and so on, including variants like
triplet eighths and dotted sixteenths. Furthermore, the time it takes to play these durations varies with tempo. To convert strict clock time to flexible musical time we will embrace a concept first met in chapter 8, the tick. To keep the numbers small, I’ll define a tick as 1/24th of a quarter note. This provides enough definition to go down to thirty-second notes, or even sixty-fourth notes, which would be 1.5 ticks. The relationship between tempo and ticks is straightforward. Tempo states the number of quarter notes per minute, so the number of ticks in 60,000 milliseconds will be 24 times the tempo. Code to compute that is shown in Listing 15.30. Once the tick is established, it’s handy to define a set of duration values, shown in Listing 15.31.

120. => float tempo;
60000./tempo/24. => float tick;
Listing 15.30

Computing milliseconds per tick.

96 * tick::ms => dur wn;    //whole note
72 * tick::ms => dur dh;    //dotted half
48 * tick::ms => dur hn;    //half note
36 * tick::ms => dur dq;    //dotted quarter
32 * tick::ms => dur th;    //triplet half
24 * tick::ms => dur qn;    //quarter note
18 * tick::ms => dur de;    //dotted eighth
16 * tick::ms => dur tq;    //triplet quarter
12 * tick::ms => dur en;    //eighth note
9 * tick::ms => dur ds;     //dotted sixteenth
8 * tick::ms => dur te;     //triplet eighth
6 * tick::ms => dur sn;     //sixteenth note
4.5 * tick::ms => dur dts;  //dotted thirtysecond
4 * tick::ms => dur ts;     //triplet sixteenth
3 * tick::ms => dur tsn;    //thirtysecond note
Listing 15.31

Defined durations.

These are all typed as durations so they are ready to chuck to now. They are defined after tick, of course, so setting the tempo is just a matter of changing one number. (If the tempo is changed while the shred is active, these variables need to be redefined.) With the durations defined as variables, it is easy to set up and play rhythm patterns as in Listing 15.32. The pattern can be written as an array of type dur. Then the play routine merely steps through the array chucking the elements to now. Listing 15.32 can be heard in DVD example 15.6.

[qn,qn,qn,en,en,dq,en,qn,qn,wn] @=> dur beats[];
JCRev verb => dac;
0.2 => verb.mix;
fun void pluckIt(int pitch, dur value){
    StifKarp KarpStrong => verb;
    0.25 => KarpStrong.gain;
    Std.mtof(pitch) => KarpStrong.freq;
    1.0 => KarpStrong.pluck;
    value => now;
}

for(0 => int i; i < beats.size(); ++i){
    spork ~ pluckIt(60, beats[i]);
    beats[i] => now;
}
Listing 15.32

Playing a pattern.
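Since the duration variables must be redefined whenever the tempo changes, one convenience (a sketch of my own, not from the text) is to wrap the recalculation in a function. It assumes the global tick and duration variables of Listings 15.30 and 15.31 are defined above it:

fun void setTempo(float bpm){
    60000./bpm/24. => tick;   // recompute the tick
    96 * tick::ms => wn;
    24 * tick::ms => qn;
    12 * tick::ms => en;
    // ...and so on for the rest of the values in Listing 15.31
}
setTempo(90.);   // everything chucked to now after this uses the slower tempo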

Since the basic idea of ChucK is to build pieces by layering shreds, we need a way to synchronize them. You can’t count on hitting the launch button at exactly the right time, because there are too many machine-dependent variables involved. However, synchronization is surprisingly easy. We can get the current ChucK time by reading now, which will give the number of samples since the virtual machine was started. A remainder operation between now and the duration of a measure tells how far we are into the current measure; subtracting that from the measure length gives the time until the measure’s end. Listing 15.33 shows how this is calculated and chucked to now. When this code is placed at the beginning of a shred, it will start on the next downbeat. Listing 15.33 with the definitions of Listings 15.31 and 15.32 will play beats. You can hear this combined with similar files in DVD example 15.7 (the only difference is the duration value in the while loop).

wn - (now % wn) => now;
while(true){
    spork ~ pluckIt(48, wn);
    wn => now;
}
Listing 15.33

Synchronized beats.

Lesson 10—Unit Generators
ChucK has nearly as rich a set of sound generators and processors as Csound. In fact, looking over the list it may seem familiar, as many of the basic Csound unit generators are here. As in Csound, the audio processors are called unit generators, or ugens for short. The ChucK ugens are objects, and their behavior is controlled by setting class variables. As we have seen, the pertinent variables are usually gain and freq, and some classes have noteOn and noteOff to trigger envelopes. In addition, there will be variables specific to each class, such as width for pulse generators.


Oscillators
The most generic ugen is probably Phasor, which makes ramp waves. This may be chucked to a filter for subtractive synthesis, or it may be used to drive a GenX routine. The GenX scheme is similar to Csound’s oscil, with a specified function table. The basic structure is shown in Listing 15.34.

Phasor osc1 => Gen9 g9 => dac;
440. => osc1.freq;
//gen9 coefs are ratio, amplitude, phase, ratio...
[1., 1., 0., 2., 0.5, 90., 3., 0.3, 180.] => g9.coefs;
0.5 => g9.gain;
1000::ms => now;
Listing 15.34

The unit generator phasor.

A Phasor is chucked to one of five GenX routines which have numbers that match the traditional Csound numbers. As in Csound, Gen9 and Gen10 are useful for waveform generation, as they specify the individual components of the shape. The routine Gen9 is used in Listing 15.34. The function table is filled in by setting the g9.coefs variable with an array containing ratio, amplitude, and phase for each desired partial. The array sets the table but is not kept, so the @=> is not needed (you can use a defined array if you like). The fiddly details like table size are handled automatically. Note that you adjust the gain by changing the gain of the GenX routine, not the Phasor. Reducing the Phasor gain would simply scan the first part of the table, producing a sound similar to pulse width modulation, which we explored in chapter 11. Listing 15.34 just produces a raw sound for comparison purposes—in use, some kind of envelope, such as an ADSR, should be provided. The geometric waves found in modular and other subtractive synthesizers are found in ugens also. Sine, pulse, square, triangle, and sawtooth waveforms are included.

Filters
I keep saying that the most audible difference between synthesis programs is in the filters, and ChucK does not prove me wrong. A first perusal of the list of filter options is a bit daunting, because there are names like biquad, one-pole, pole-zero, and similar unfamiliar terms. These are the basic building blocks of digital filters. If I wanted to teach a course in digital filter design, I would use ChucK, because all of the parts are here and fairly easy to put together. Filter design is beyond the scope of this book, but if you are curious, the following is a basic description. Digital filters work on groups of samples, some taken from the input to the filter and some taken from recent output. We have already experimented with delays that combine current and delayed samples with feedback from the output and noted the filtering effects. Filter design involves slightly more elaborate networks of delay and feedback and a good deal of math to produce predictable results.


FIGURE 15.4

Biquad filter.

In filter speak, a one-sample delay is known as a zero and a sample of feedback is known as a pole. Figure 15.4 shows a schematic of a popular filter network called a biquad. The boxes in Figure 15.4 represent one sample delay, so a signal that passes through two boxes is delayed by two samples. Following the lines in the diagram, you can see that the output of the filter is a combination of five things: the current sample, the previous sample (input delayed by 1), and the sample before that, along with the most recent output (necessarily a delay, since the current output is not calculated yet) and the output before that. These are not simply added together. Each element is multiplied by a coefficient (which may be negative, resulting in a subtraction of that part of the signal), and the weighting provided by the coefficients will determine the filter frequency and response curve. Figuring out the coefficients is the tricky part of filter design, but there are many tools to help. For instance, all of the classic analog filter designs have published coefficients, and there are web-based applications that allow you to enter a desired characteristic and frequency and return the coefficients ready to plug in. If you do not care to roll your own filters, don’t despair, because ChucK contains an ample list of traditional filters, too. There is nothing complicated about using these; simply chuck them into the signal path and set frequency and Q. One complication does arise if you want to sweep a filter frequency with an ADSR envelope. Since the ADSR is a unit generator, it cannot be chucked directly to filter.freq. The ChucK architecture requires that unit generators be connected to the dac in
order to function. However, connecting an ADSR directly to dac would produce strange sounds. To work around this, an alternative to dac called blackhole has been added to ChucK. A unit generator may be chucked to blackhole, where it will generate samples that are not heard. What is the point? Any unit generator has a member function called last() which will yield the most recent output value as a float. This is something like a built-in sample and hold. This last() value can be chucked periodically to any parameter you wish to modulate. Listing 15.35 shows how to use the ADSR.last() function to sweep the frequency of a filter.

SawOsc osc1 => LPF filt1 => dac;
440. => osc1.freq;
0.25 => osc1.gain;
4 => filt1.Q;
Step s => ADSR env1 => blackhole;
env1.set(40::ms, 22::ms, 0.7, 1200::ms);
env1.keyOn();
while( env1.state() < 2){
    (8000. * env1.last()) => filt1.freq;
    1::ms => now;
}
200::ms => now; // sustain time
env1.keyOff();
while( env1.state() < 4){
    (8000. * env1.last()) => filt1.freq;
    1::ms => now;
}
Listing 15.35

ADSR to filter.
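The same blackhole-and-last() trick works for any modulator. Here is a sketch of my own that uses a slow SinOsc in place of the ADSR to add a little vibrato:

SinOsc carrier => dac;
0.2 => carrier.gain;
SinOsc lfo => blackhole;      // runs silently, read with last()
5. => lfo.freq;
now + 2::second => time stopAt;
while(now < stopAt){
    440. + 10 * lfo.last() => carrier.freq;   // about +/- 10 Hz of vibrato
    1::ms => now;
}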

Since the ADSR is really a gain controller, it will have no output unless there is a signal at its input. The step unit generator produces a signal that is perfect for this, since it does not change. In the default, the step signal consists of the value 1.0 on each sample. (Such a signal has been called a constant in other applications we’ve seen.) Once processed by the ADSR, this becomes a control in the range 0.0 to 1.0. The code in Listing 15.35 multiplies the ADSR output by the peak frequency the filter should reach. This happens in the while loops, which are controlled by the ADSR state() function. For example, env1.state() returns 0 during the attack, 1 in the decay, 2 during sustain, 3 during release, and 4 when finished. You should take care in specifying the time delay in the while loops. This works just like k-rate in Csound. If the time is too long in comparison to the rate of change of the envelope, there will be distinct zipper distortion. DVD example 15.8 demonstrates this sound with update rates of 1 ms, 10 ms, and 100 ms.


The Synthesis Toolkit
You have probably noticed by now that there is a lot of similarity in the functions found in synthesis software. They all need basic waveform generation, filter routines, envelope control, and similar processes, regardless of how the user interface presents these things. Many of the ugens in ChucK are derived from Perry Cook’s previous work on a project called the Synthesis Toolkit (STK). The STK is not a program in itself; it is a library of source code routines that programmers may use (free of charge in research and academic applications) to handle much of the signal generation in their own applications. The instruments include several varieties of FM as well as assorted physical models. I have used some STK instruments in this chapter, namely TubeBell in Listing 15.15 and StifKarp in Listing 15.32. Some of the wind models can be extremely finicky to use, as there is no sound unless you get lip tension and note velocity just right for the frequency, but they do reward the effort. The Saxofony instrument in Listing 15.36 is typical. This is a combination wind and string waveguide model, similar to the Modelonia program in chapter 13.

Saxofony sax => dac;
263 => sax.freq;
0.6 => sax.stiffness;     // these
0.1 => sax.aperture;      // all
0.7 => sax.pressure;      // interact
0.4 => sax.noiseGain;
2 => sax.vibratoFreq;
0.1 => sax.vibratoGain;
0.3 => sax.blowPosition;  // try 0.1 and 0.5
0.9 => sax.rate;
0.7 => sax.startBlowing;
1000::ms => now;
1 => sax.stopBlowing;
200::ms => now;
Listing 15.36

Wind model from the STK.

The stiffness, aperture, and pressure variables are all descriptive of reed action and behave remarkably like a real reed. If there is too much pressure for the stiffness, there will be no sound, as any woodwind player could tell you. The blowPosition works like the excitation point on a string model. If it is set at 0.5, the result will be odd harmonics only. The vibrato is derived from variation in lip pressure, so the timbre changes as much as the volume. The best way to learn
these is to follow the format of Listing 15.36 and set up a shred that matches the parameter list, experimenting with different settings. Changes on the order of 0.05 often have a large effect. DVD example 15.9 shows a few possibilities.

RESOURCES FOR FURTHER STUDY
ChucK is a baby language, and it is difficult to predict what a baby will amount to. It probably will have changed by the time you read this, so the place to start is the ChucK website (http://chuck.cs.princeton.edu/). Here you can get the language, manuals, tutorials, and examples of it in use. Even if you do nothing with ChucK, you will get a laugh out of the ChucK anthem:
http://www.cs.princeton.edu/~prc/HackChuck.html
You will need to know some of the inner workings of MIDI to program for external control. The details of the MIDI protocol are found at:
www.midi.org
If you get far enough to want to design filters, an excellent tool can be found at:
www-users.cs.york.ac.uk/~fisher/mkfilter


SIXTEEN Programming with Boxes and Lines

The last two chapters should have made the power of writing your own code obvious, but they also probably revealed a couple of drawbacks: typing all of that code is tedious, and it’s difficult to grasp the overall structure of a program from deep in the thicket of operators and punctuation. The flowcharting technique shown with Csound is a helpful tool for visualizing program organization, but I suspect most musicians use it back to front: once an instrument is working to their satisfaction, they draw up a flowchart to document it. The alternative to text-based coding is a visual programming language (VPL), in which flowcharts are assembled graphically and used to generate compiled code. There are a fair number of these, including several designed for audio and music. Of the musical VPLs there is a further distinction between applications that simply allow assembly of modules (such as Tassman) and those that provide operations at a level of detail comparable to Csound. The most comprehensive of this second group are Max/MSP and Pure Data (Pd). The program that became Max was first developed at IRCAM in the mid 1980s by Miller S. Puckette and became the basis for a series of research applications that are still under development there. The software was licensed to Opcode Systems in 1990 for commercial distribution and enhancement. Opcode assigned David Zicarelli to transform it into a stable and user-friendly application. (Research institutions are not overly upset by software that crashes from time to time as long as serious work can be done.) When Opcode was forced out of business in 1999, Zicarelli acquired rights to Max and continued development with his own company. In recent years, Max has developed a user base much wider than that of any other academic-style program. Max is more expensive than some music software, but there is a substantial educational discount. There is a free trial version that will run long enough to try out the examples in this chapter, and there is a free runtime version (with editing disabled) that allows composers to distribute working code to performers. All of the examples in this chapter are on the accompanying DVD and will work in runtime Max (version 5 or later) on Macintosh or Windows. Selected patches have been recorded, but it will be more instructive to run the patches and play with them. Max/MSP/Jitter is actually three interlinked modules. Max is the foundation of the system. It is named after Max Mathews, although it does not do any audio synthesis
or processing. Max was originally a control application for hardware systems, including some esoteric machinery at IRCAM, DSP cards in the NeXT computer, and MIDI in the commercial version. It is geared toward real-time data processing and design of user interfaces. MSP is the signal processing arm of Max, and if you dig deep enough, you will find traces of MUSIC-style algorithms. MSP came along in 1996 when Macintosh computers finally developed enough horsepower to do satisfactory signal processing. MSP stands for Max signal processing or perhaps for Miller S. Puckette, who was involved in the development of its algorithms. Jitter is the final arm of the program, adding video processing and large-scale data handling. Jitter is named for a cat that belonged to a member of the programming team.

BASIC MAX
A Max code document is called a patcher, and the drawing it contains is a patch. The first impression of a patcher is a lot of boxes connected by a tangle of lines. The boxes are called objects, and the lines are patch cords. The connection points are called outlets or inlets, and cords are always drawn from an outlet to an inlet. The patch cords carry messages from one object to another. When an object receives a message it reacts appropriately, probably producing messages of its own. Some of the objects are user interface items: assorted buttons, sliders, and controls that allow mouse control of the patch. Figure 16.1 shows the basic elements of a patcher window and objects. The icons along the bottom of the window control various editing modes. For instance, the padlock icon switches between locked and unlocked mode. When the patcher is unlocked, items may be added, repositioned, and modified. If it is locked, items are fixed and mouse clicks are interpreted as performance actions. (It is possible to test user actions with the patcher unlocked using mouse and key combinations.) Objects are entered from a browser that appears when you double-click on an empty spot. In addition to the basic code object there are over four dozen other types of objects, most of which are devices for user input. If you select a code object, the palette will close and an empty object box will appear at the location where you double-clicked. After you type in the object name and any arguments, the proper number of inlets and outlets will appear. If you choose a user interface object, you can resize it by dragging the lower right corner and set its appearance and behavior by opening an inspector window. One of the engaging features of Max is that there is no distinction between editing and running code. An object is in operation as soon as the first patch cord is attached. This lets you see results constantly and greatly reduces time spent debugging. The function of an object is determined by its name. There are about six hundred objects shipped with Max, providing capabilities that range from simple addition to video transformation. If these are not enough, there are over a thousand objects written by independent programmers.


FIGURE 16.1

Basic elements of Max (Version 5). (Labeled in the figure: object name, object box, inlet, outlet, patch cords, argument.)

These third-party externals are not supplied with the program, but they are available through web-based clearing houses, and most are free. This ability to add functions is one of the major strengths of Max. A lot of music software is open source and theoretically extendable, but an outside programmer usually must work with the entire body of code and understand it thoroughly before proposing any change. Such changes often have to be approved by a code master or committee before becoming available to the general public. In Max, all a programmer needs to learn is a set of rules for how an object is organized and should behave. Objects can then be written and added in simply by placing them in a particular directory. Objects are coded in C, one of the most widely used programming languages. For programmers who prefer Java, there is a Max object that will execute Java code. Finally, Max objects can be created from existing Max objects. This does more than just save space on the screen, it allows any Max user to build up a personal library of useful routines. The built-in objects and most third-party objects are thoroughly documented. When the mouse is held over an inlet or outlet, a balloon appears with a description of the connection. If you click on an inlet, a list of messages accepted by the object will appear. When an object is selected, an auxiliary window called “clue” shows a short description of the item (and you can add to this if you like). If you option-click or alt-click on an object a help patcher will appear showing a great deal of information about the object and working examples of it in use. The help patcher contains a link to an even more detailed reference page. There is also a help menu that leads to overall documentation and a complete set of tutorial articles. The objects in Figure 16.1 are notein and noteout. The notein object produces three messages when a note is received, one at each outlet.


FIGURE 16.2

Data processing.

The rightmost outlet sends a number for the MIDI channel, the center sends the velocity (0 if this is a note off) and the left outlet sends the note number. The messages are sent in that order, right before left. The noteout object transmits MIDI messages to the outside world or to concurrently running synthesizers. The example has an argument of 2, which will set it to send MIDI note messages on channel 2. (Arguments in basic objects are like those in Csound, with the order determining the meaning of each number.) The noteout object interprets the data it receives according to the inlet used. Numbers on the right inlet set the MIDI channel (replacing the argument), a number at the center sets the velocity (again, 0 will produce a note off), and a number in the left will set the note number and trigger MIDI output. Almost all Max objects work this way, with data in the right-hand inlets setting up the message that will be completed and sent by data in the left. The patch shown accepts notes from a keyboard and sends them out again on channel 2. Figure 16.2 also shows three more ways this data can be processed. The section labeled “transpose” simply adds 7 to the note number and passes the result to the noteout. This will create a pitch a fifth above the key played. In the center code the note number is sent both directly to the noteout and to the noteout via the add object, with the result that two pitches are heard. Right-to-left ordering is observed when an outlet has more than one destination, so the normal pitch will be played before the raised note. Of course there is no perceivable delay; the two notes are sent just as fast as MIDI can manage (the actual computation takes about a microsecond). In the right section of Figure 16.2 a complete major chord is produced. The type of math performed by math objects such as the addition object in Figure 16.2 is determined by the type of their arguments. If the argument is an integer, integer output is produced. If the argument has a decimal point, floating point numbers are produced. Figure 16.3 illustrates this. (Leaving the decimal point out of a math box when floating point operation is required is probably the most common mistake made in Max patches.) In Figure 16.3 the objects with triangles in them are number boxes. Number boxes display any number they receive and pass the number along from the outlet. There are two types of number box: integer and float.


FIGURE 16.3


Integer and floating point math.

If an integer box receives a float number, it converts it to an integer, and vice versa. Integer boxes can be set to various styles of display, including MIDI note names. When the patcher window is locked, a selected number box can be edited by typing, with the change becoming effective upon return (hit escape to discard nascent changes). A click and drag on the box will scroll the numbers. When this is done, messages are actually produced about sixty times a second, so some numbers will be skipped. An inspector setting can change this behavior so that a single message is produced when the mouse button is released.

Decisions
Creating the proper chord for major tonality is a bit more complex than just adding semitones. Figure 16.4 shows one way to do this and introduces several new objects. This patch contains all of the objects from the right-hand section of Figure 16.2, plus some logic to determine major or minor third and perfect or diminished fifth. The object with the % sign performs the rem operation, dividing input by its argument and sending the remainder from the outlet. This process is applied to the note number from notein. Since MIDI note numbers begin on a very low C, the % 12 function will extract the pitch class (0–11). The message from the rem object is passed to a select object. A select object has an outlet for each argument plus one extra. If the input matches an argument, the associated outlet will send a bang message. A bang is analogous to a synthesizer trigger—it prompts objects to act. The solid grey objects connected to the select object are message boxes. A message box will send its message when it receives any message, or if a user clicks on it. The spiderweb that connects the select object to the message boxes is easy to trace. Start at the right outlet, which passes along any input to the select object that does not find a match. In this case, that would be the notes that require a major chord. Following the patch cords we see that pitches that reduce to anything other than 2, 4, 9, or 11 will set the arithmetic for major third and perfect fifth.


FIGURE 16.4

Chords in C major.

Notice that this mechanism is to the right of everything else, so the values will be set before any notes are triggered. The select outlets that match 2, 4, or 9 are also connected to the 7, but send a 3 along for the minor third. The outlet that matches 11 triggers the minor third while setting the fifth to 6 semitones to produce the diminished chord. (Anyone using this patch will just have to be careful not to play any chromatic pitches.)

User Interaction
Figure 16.5 shows how the performer can use the mouse to interact with a Max patch. The keyboard graphic is called a kslider and responds to clicks by sending data from two outlets. The data at the right outlet is proportional to the vertical position of the click. In keeping with most other interface objects, this is in the range of 0 to 127. The data at the left is determined by the key clicked. The range of key numbers the kslider produces is adjustable, along with the number of keys displayed. The rest of Figure 16.5 produces the note. Playing a note via MIDI requires two separate actions, an immediate note on and a note off message at the end of the note’s duration (we get so used to editors that show a note as a single event of specified duration that we forget this detail). The makenote object handles this by sending velocity and note number messages to the attached noteout object, then scheduling a zero velocity and matching note off some milliseconds later.


FIGURE 16.5


Simple keyboard.

The arguments to makenote set the velocity and duration if these are not received before a note number arrives. User interface objects tend to get lost within the tangle of boxes and patch cords of a complicated patch. They can be grouped at the top of the patch but this leads to long patch cords and difficulty following the program logic. Another approach is to hide most of the patch. Any item can be marked as “hide on lock” and will vanish when the patcher is locked. This is the technique used in many classic patches you will find discussed in books and tutorials. With Max version 5, the paradigm was changed. The objects we do want to show are marked “add to presentation.” When presentation mode is engaged, only those objects appear. In addition, the objects may be freely repositioned for presentation mode, so there is one layout for programming and another for performance. Figure 16.6 shows a presentation version of Figure 16.5.

Automatic Note Generation
The patch in Figure 16.7 shows how to generate notes, with random pitches at a steady rhythm. The square at the top of this patch is a toggle object. When a toggle is clicked, an X appears in the box and a 1 is sent out. Clicked again, the X is cleared and a 0 is sent out. Ones and zeros are used many places in Max to mean yes and no or on and off. One such place is the metro object which, when on, sends bang messages at intervals determined by its argument. In Figure 16.7, these bangs trigger the random object to produce an unpredictable number from zero to one less than its argument. An entire chapter could be written on the subtleties of random and probability-based composition, but I’ll only say here that one output is just as likely as any other. You have already met select and the rest of the objects in the patch, so the mechanism that completes the process should be clear. Some output from Figure 16.7 is recorded in DVD example 16.1.


FIGURE 16.6

Keyboard patch in presentation view.

FIGURE 16.7

Random note generator.


Timing Issues
The metro object is the prime motor of Max, and it is not uncommon to find several in a single patch. If you do use several metros, it can be difficult to keep them synchronized. They accurately bang at precise intervals, but they will only be rhythmic if they are started at precisely the same time. Figure 16.8 illustrates one of many ways to do this. A send object broadcasts any message it gets to any receive object with the same name. In Figure 16.8 both metros will be controlled by the toggle connected to the send metrosync object. A patch can contain any number of send and receive objects, which will also function between open patchers. Send and receive can be thought of as connections with invisible patch cords, with one difference. Whereas the order of messages passing through split patch cords is right before left, there is no way of knowing which of several identically named receive objects will get the message first. The round objects in Figure 16.8 are buttons. They will flash when banged, so this patch shows synchronization visually. Buttons are also important interface elements that flash and send a bang when clicked. Buttons will bang when they receive any message at all, so many composers use them to convert various data to bangs. Another way of synchronizing metro objects is through the global transport. A metro may have its interval specified in terms of a musical value instead of milliseconds. Thus you may see the value 4n, 8n, or 8nt for quarter note, eighth note, or eighth triplet. These concepts make no sense in the absence of a tempo, so there is a global transport window where tempo may be specified. Figure 16.9 shows the global transport window and two metro objects set up to work with it. Simply specifying interval in terms of note duration slaves the metro objects to the transport. They can only run when the transport is active. The words in the metro arguments that start with the @ sign are object attributes. An attribute is a setting internal to an object—the notation @autostart 1 means the autostart feature is turned on. Attributes are a more convenient approach to setting up objects than arguments because they can be in any order or left out if the default setting is satisfactory. Attributes can be changed at any time by a message that starts with the attribute name. The autostart feature will allow the transport to start the metro. The autostarttime attribute sets a precise point in terms of bars, beats, and subunits for the metro to start. This is tantamount to sending a one to the metro at that moment. Once started, the metro will pause if the transport is stopped but will continue when the transport restarts.

Rhythm in Max Rhythm patterns can be produced by changing the interval of a metro. Figure 16.10 illustrates one method with a metro that is slaved to the transport. Normally, when the interval of a metro is changed (via the right inlet) the metro finishes the current


FIGURE 16.8 Syncing metro objects.

FIGURE 16.9 Transport control of tempo.

cycle at the old rate. The exception is when the change is instigated by the bang coming out of the metro itself. In that situation, the interval change is effective immediately. In Figure 16.10 the metro sends a bang to the cycle object on each note. Cycle passes each bang out a different outlet in turn, starting over when it reaches the last one. The bangs are wired to messages with a duration symbol. If you trace the patch cords, you will see 4n is sent first, followed by 8n twice, then three instances of 8nt before the pattern repeats. The timepoint object is there to cure a subtle problem. If the transport is stopped and rewound, the metro will be started in sync with whatever else is going on, but the cycle object may be in the wrong phase. A timepoint bangs whenever the transport hits the time indicated, in this case sending a message to cycle instructing it to start over (set 0 means set inlet 0—the leftmost inlet—to bang on the next input). The makenote/noteout mechanism will play a snare hit on channel 10 (DVD example 16.2). Notice that the patch in Figure 16.10 sends messages from the bottom of the chain back to


FIGURE 16.10 A simple rhythm engine.

an earlier object, a kind of feedback. This is usually OK if the destination is the right inlet of some object, but if the feedback is misconnected to the left inlet, an infinite loop will be formed and Max will halt execution. An error alert will come up, and you will have to fix the problem before resuming operation.
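For readers who prefer to see the rhythm engine as plain code, here is a rough Python equivalent of the metro/cycle loop. The tempo and the printed trace are illustrative; the actual patch of course triggers makenote and noteout rather than printing.

```python
# The pattern traced from the patch cords: a quarter, two eighths, three triplets.
PATTERN = ["4n", "8n", "8n", "8nt", "8nt", "8nt"]

def one_bar(bpm=120):
    durations = {"4n": 60000.0 / bpm, "8n": 30000.0 / bpm, "8nt": 20000.0 / bpm}
    t = 0.0
    for step in PATTERN:                 # cycle hands each bang to the next outlet
        print("hit at", round(t, 1), "ms; next interval is", step)
        t += durations[step]             # the new interval takes effect immediately

one_bar()
```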

Max in Use Figure 16.11 shows a patcher I have used in many performances. It makes heavy use of the expr object, which allows the inclusion of one line of C code with tokens to indicate the inlets that supply data. For example, the tempo slider is connected to expr with the code 60000/$f1, which converts beats per minute to milliseconds per beat. The heart of the patch is expr $f2*$f1*(1-$f1). This implements a famous chaotic formula known as the logistic map. The usual formula is shown at the bottom of the patch. The use of chaos in composition deserves a whole chapter (at least). The output from this patch is strange and lively patterns of pitches, especially when the slider labeled “Sigma” is toward the right. Some examples of this patch are given in DVD example 16.3. When this patch is in operation, the metro bangs the float object, which may contain one of three values: 0.5 to start, a number derived from clicking on the kslider, or the result of the last calculation. When the float object is banged, it sends its current value to expr (a float object is rather like a variable in other languages).


FIGURE 16.11 Chaotic Player.

The expr does another calculation, and the result is converted to a playable note and sent back to the float for the next round. The chaotic pattern that results is indescribable, but each time a note is input a pattern exclusive to that note is restarted.
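The map itself is easy to experiment with outside Max. This Python sketch iterates the formula from the expr box and maps each value onto a MIDI pitch; the pitch mapping is an invented detail, since the patch in Figure 16.11 makes its own conversion to a playable note.

```python
def logistic_notes(sigma=3.8, x=0.5, count=16, low=48, span=24):
    """Iterate x' = sigma * x * (1 - x) and map each x onto a MIDI pitch.

    The pitch range (low to low + span) is an assumption made for this sketch.
    """
    notes = []
    for _ in range(count):
        x = sigma * x * (1.0 - x)
        notes.append(low + int(x * span))
    return notes

print(logistic_notes())      # lively, non-repeating patterns when sigma is near 4
```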

Simplifying Patches It should be apparent that Max patches tend to grow quickly. Working from a flowchart is efficient most of the time, but as the program grows complex, this loses some of its appeal for two reasons: the patch will no longer fit on a single screen, and the dozens of boxes required to perform a function make it hard to follow the larger picture. The cure for both of these ills is encapsulation. Encapsulation allows a section of a patch to be hidden within what appears to be a single


FIGURE 16.12 Encapsulation.

object. Figure 16.12 shows Figure 16.4 modified by encapsulating the interval selection logic. The rem, select, and message objects at the heart of the patch are now in a box labeled p pickintervals. This box can open into its own window as shown. The object name p stands for patcher, although these constructions are usually called subpatchers. They are part of the patch and are saved in the same file. The objects with numbers are outlet and inlet objects—these correspond to actual inlets and outlets of the subpatcher box and transfer messages in and out. Their number is determined automatically by their position in the subpatch. Move them around, and the numbers will change. (Interestingly, as you do this any attached patch cords in the surrounding patcher will rearrange themselves to maintain the same connections. This is one of the spookiest things I have ever seen in a programming language.) You can set up an encapsulation by typing p in a new object or by selecting part of a working patch and choosing Encapsulate from the Edit menu. The name is optional but essential for good documentation. Once an encapsulation is made, it can be copied and the copies can be modified. This can be an efficient way to build a patch with a lot of similar elements. Another way to create a subpatch is to start with a new window and build the desired part of the patch using inlet and outlet objects where data will come in and go out. Once it’s finished, save the window with a unique name. Type that name in an object box of another patch and the subpatcher will appear ready to be wired in. It is as though you had designed and added your own Max object. Many Max programmers call this type of file an abstraction. There


is one difference between this approach to encapsulation and the patcher object: if you go back and modify the abstraction file, the changes will apply to all copies of it wherever they are. Figure 16.12 is a good place to illustrate one of the particular gotchas of Max (every program has them). Because operation is strictly right before left, the pickintervals abstraction must be kept on the right side. If it is moved to the left of the add objects, the chords produced would be appropriate to the previous instead of current note. This disconcerting effect can be avoided with an object called trigger (abbreviated t) that provides multiple outputs in a disciplined way (see Figure 16.13). The arguments to trigger determine the number of outlets and the type of output for each. The argument b denotes a bang, i an integer, and so forth. Trigger will convert message types as specified, and is the preferred object to do this.

MSP AUDIO AND SYNTHESIS MSP is the branch of Max that handles audio processing and synthesis. MSP is a suite of objects that work with audio signals—they have a tilde in their name to indicate the audio association (perhaps because the tilde looks a bit like a sine wave). Thus, where the * object multiplies two numbers, the *~ object multiplies two audio signals. MSP patch cords also have a distinctive color—yellow with black stripes. Since audio work is potentially time consuming and might slow down the editing process, MSP objects are activated by a control in a menu item called Audio status. This item also contains settings to connect Max to the computer's audio environment. The patch in Figure 16.14 shows how simple signal processing works in Max. The object with the microphone drawing is ezadc~ and represents the connection to the sound card input. It also serves as a user interface, because you can click on it to turn audio processing on or off. The signal from ezadc~ is passed to the gain~ slider. (The distinctive barber pole pattern distinguishes this from the other data sliders.) The gain~ slider attenuates signals in a logarithmic manner. It can also accept number messages for control by other patch elements. The bar below the input level slider is a meter which will monitor the strength of the signal coming in. The input signal is fed to the tapin~ object, which represents a tapped delay line with the maximum delay indicated in the argument. The delay time is only limited by the computer memory, so Max will provide delays longer than I have seen in any other application or machine. The delay line is read by one or more connected tapout~ objects. The argument to tapout~ sets the current delay of the signal. It is this delayed signal that is heard, one second after the original. (The speaker icon indicates sound card output.) The delayed signal is also fed back to the tapin~ by way of a slider labeled "feedback." This produces the eternal echoes characteristic


FIGURE 16.13 Managing message order with trigger.

FIGURE 16.14 MSP in action.


of open loop performance. (For more information about this technique, see chapter 18.) The adjustment of the feedback slider is crucial—too little and the effect will be inaudible, too much and the echoes will build up to the distortion point. DVD example 16.4 has a simple performance with this patch. The tapin~ tapout~ combination is only effective for relatively long delays. For the filtering associated with short-term delays there is an object called comb~. Figure 16.15 shows a comb~ object applied to a sound file player. The sfplay~ object will play audio files recorded in wave (.wav) or audio interchange file (.aiff) format. The open message brings up a file dialog for choosing the file. When it returns, the first few seconds of the file are loaded into memory so playback can start instantly. The toggle will send a one to start or a zero to stop playback, and the functions of pause and resume should be clear. The speed message will affect the rate of playback. When a message box has a $1 token in it, the token is replaced by the value that triggered the message, so at the moment this snapshot was taken the message speed 0.25 had been sent to sfplay~. The signal from sfplay~ is sent both to the output and the comb~ object. You will remember that most delay effects come from combining delayed and original signal. The essentials of the comb~ are set by the arguments, which control maximum delay, initial delay, gain, feed forward, and feedback. In this patch the delay time is continually modified to produce a classic chorusing effect. This modulation originates in a cycle~ object, which is a basic sine wave oscillator. The argument to cycle~ sets the frequency at 0.25 Hz. The output of cycle~ (and most MSP generators) ranges from -1.0 to 1.0. This is the maximum amplitude that can be converted to audio without distortion, but if signals are only used for control, they may be any amplitude desired. Thus the signal from cycle~ is multiplied by five, then added to ten. This will produce a modulated delay from 5 to 15 milliseconds.
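For readers who want to see the mechanism rather than the patch, here is a bare-bones Python model of a tapped delay line with feedback. The buffer length, feedback, and mix values are arbitrary, and a real tapin~/tapout~ pair naturally works on audio-rate signals rather than Python lists.

```python
def echo(signal, delay_samples, feedback=0.5, mix=0.5):
    """A minimal tapped delay line with feedback, processed one sample at a time.

    Too little feedback and the echoes vanish quickly; values at or above 1.0
    build up without limit, the problem the text warns about.
    """
    buf = [0.0] * delay_samples          # the delay memory (the tapin~ buffer)
    write = 0
    out = []
    for x in signal:
        delayed = buf[write]             # the oldest sample is the tap
        buf[write] = x + delayed * feedback
        write = (write + 1) % delay_samples
        out.append(x + delayed * mix)
    return out

# A single click becomes a train of decaying echoes three samples apart.
print([round(v, 3) for v in echo([1.0] + [0.0] * 9, delay_samples=3)])
```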

Recording in Memory There is an sfrecord~ object to complement sfplay~, but I seldom use it. Most of the recording I do is of short clips directly into memory. Figure 16.16 demonstrates some of the capabilities of this procedure. The key to this patch is the buffer~ object in the lower left corner. Note that it is not connected to any of the other objects. The name of the buffer~—recordhere—establishes it as a destination other objects can access. The first object to use the buffer~ is the record~ object. All that record~ needs to capture audio in the buffer~ is a signal input and a one or zero to start and stop. Recording usually begins at the beginning of the buffer~ and stops when the buffer~ is full, but there is a loop mode that would allow continuous overrecording. Once the buffer~ contains audio, it may be played by a variety of objects, of which groove~ may be the most interesting. Groove~ requires a signal to enable playback, but the signal value indicates playback speed. The sig~ object converts


FIGURE 16.15 Flanging a recording.

FIGURE 16.16 Playback from buffer~.


numbers into signals of constant value. If the idea of a signal of constant value seems odd, think of it as a stream of repeating values at the sample rate. The values may be 0, 1 or 100, but there are 44,100 of them a second if the sample rate is 44.1 kHz. The groove~ object can play from anywhere in buffer~, as indicated by the loop start and loop end inlets. If the play rate is negative, the loop will be reversed. This patch is the foundation of an endless variety of loop-based performance patches. The fact that buffer~ can hold four channels of audio should be suggestive, as may be the fact that you can have more than one groove~ playing the same buffer~ at different rates. DVD example 16.5 demonstrates some groove~ tricks.
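The idea behind groove~ can be sketched in a few lines of ordinary code. The following Python fragment reads a buffer at an arbitrary rate with linear interpolation and wraps between loop points; it illustrates the concept only and is not a description of groove~'s internals.

```python
def play_loop(buf, rate=1.0, loop_start=0, loop_end=None, n_out=16):
    """Read a buffer at an arbitrary (even negative) rate with linear interpolation."""
    if loop_end is None:
        loop_end = len(buf)
    span = loop_end - loop_start
    pos = float(loop_start)
    out = []
    for _ in range(n_out):
        i = int(pos)
        frac = pos - i
        nxt = i + 1 if i + 1 < loop_end else loop_start
        out.append(buf[i] * (1 - frac) + buf[nxt] * frac)
        pos += rate
        while pos >= loop_end:            # wrap forward at the loop end
            pos -= span
        while pos < loop_start:           # wrap backward for negative rates
            pos += span
    return out

ramp = [i / 8 for i in range(8)]          # a stand-in for recorded audio
print([round(v, 2) for v in play_loop(ramp, rate=0.5)])   # half speed: an octave down
```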

Synthesis MSP can perform nearly any type of synthesis. Figure 16.17 is the Max version of basic beep. The notes will be triggered by a MIDI note on message. The cycle~ object requires a frequency, which is calculated from the note number by the mtof object. Like all good oscillators, cycle~ is running all the time, but there is no sound because its output is multiplied by zero. The envelope is generated by the mechanism from select to line~. If the velocity of the note is not zero, it is passed out the right outlet of select. Dividing by 127 will provide a value from zero to one. The result of the division will be combined into a message of four numbers. A message of two or more numbers has a special status in Max as a list data type. Many objects can produce complex behavior based on incoming lists. The line~ object interprets a list as a series of destination-time pairs. The output will transition smoothly to the destinations in the indicated time. With the values shown, the line~ object will produce an envelope that goes from zero to one in 10 milliseconds, then down to the velocity value in the next 20 milliseconds. These are the attack, decay, and sustain portions of the familiar ADSR. The sustain value lasts until the note off, when the list 0 500 instigates a slow release. The output of line~ is multiplied by the signal from cycle~. Since signals are all continuous, there is no need to worry about left inlet, right inlet order here. The changing values from line~ will produce the familiar beep. Since there is only one source of signal (cycle~) the patch is monophonic.
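Two small formulas underlie this patch: the MIDI-to-frequency conversion that mtof performs and the breakpoint lists sent to line~. This Python sketch spells both out, using the 10 ms, 20 ms, and 500 ms values from the description above; it is an illustration, not the patch itself.

```python
def mtof(note):
    """MIDI note number to frequency in hertz, the conversion mtof performs."""
    return 440.0 * 2 ** ((note - 69) / 12.0)

def envelope_lists(velocity):
    """Destination/time pairs in the spirit of the messages sent to line~:
    rise to 1.0 in 10 ms, fall to the scaled velocity in 20 ms, and
    (at note off) fade to 0 over 500 ms."""
    sustain = velocity / 127.0
    note_on_list = [1.0, 10, sustain, 20]
    note_off_list = [0.0, 500]
    return note_on_list, note_off_list

print(round(mtof(60), 2))        # middle C is about 261.63 Hz
print(envelope_lists(100))
```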

Polyphony The first step toward polyphony is to build an abstracted version of Figure 16.17 with some minor changes (see Figure 16.18). At the top of the patch, the notein object has been replaced by an object simply called in, and an unpack object. The in object is a version of the inlet specifically for polyphonic use. Unlike the inlet object, you have to assign the number manually—it still indicates the order of the inlet in the master window. Unpack is an object that disassembles lists. When a list arrives, the individual items are sent from the unpack outlets. The assumption for


FIGURE 16.17 Basic Beep in Max.

FIGURE 16.18 A beeping abstraction.


this application is that the incoming message will be a list of pitch and velocity. The next change is at the output end. Instead of going directly to the audio card, there is an out~ object. The tilde means this will be an audio connection. The thispoly~ object sends information to the host patch, which is shown in Figure 16.19. The secret ingredient in polyphony is the poly~ object. It is similar in concept to an encapsulated patch, but multiple copies of the abstraction are loaded into the program. The first argument to poly~ is the name of the abstraction to load, and the second argument is the number of copies desired. The example shown in Figure 16.19 has six copies of the beeper patch and will play up to six notes at once. The poly~ object will manage which copy of the subpatch is playing according to the midinote message, which should contain the note number and velocity of the note on or off coming in. These are provided as a list by the pack object and replace the $1 and $2 tokens in the message box. When a note on comes in, the poly~ object hands it to the first copy of beeper that is not busy. When a note off arrives, it is sent to the copy that is currently sounding that note. Poly~ keeps track of the state of the abstractions it contains via the thispoly~ objects. When thispoly~ gets a signal of zero, it is marked available. If more than six notes come in at once, poly~ can be set to ignore the new one or turn off the oldest.
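The bookkeeping poly~ performs can be pictured with a simple allocator like the Python sketch below. The ignore-extra-notes policy shown is just one of the behaviors mentioned above; the real object offers others, such as turning off the oldest note.

```python
class VoiceAllocator:
    """A bare-bones picture of the bookkeeping a host like poly~ does."""

    def __init__(self, n_voices=6):
        self.free = list(range(n_voices))
        self.playing = {}                      # pitch -> voice index

    def midinote(self, pitch, velocity):
        if velocity > 0:                       # note on: hand it to an idle voice
            if not self.free:
                return None                    # ignore the extra note (one possible policy)
            voice = self.free.pop(0)
            self.playing[pitch] = voice
            return voice
        voice = self.playing.pop(pitch, None)  # note off: free the voice holding this pitch
        if voice is not None:
            self.free.append(voice)
        return voice

alloc = VoiceAllocator()
print(alloc.midinote(60, 100), alloc.midinote(64, 100), alloc.midinote(60, 0))
```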

Max and the FFT We have seen the power of the fast Fourier transform and associated resynthesis techniques in Alchemy and Csound. MSP also does FFT, in a particularly elegant and flexible way. The heart of FFT in Max is the pfft~ object, which is a specialized extension of the poly~ object. Figure 16.20 shows a vocoder patch that uses it. The left side of Figure 16.20 shows the host patch with pfft~, the shaded area on the right shows the workings of the inner abstraction fft_vocode. The carrier sound for the vocoder is a simple MIDI-controlled sawtooth oscillator. The noise source modulates the frequency of the oscillator by 20 Hz, ensuring a rich spectrum for the vocal components to work with. Envelope control is not necessary because the voice should shut the oscillator off completely. (In practice, you may find a noise gate useful here.) The action occurs in the fft_vocode patch that is loaded by the pfft~ object. The fftin~ objects perform the analysis, producing real and imaginary components for the oscillator and voice inputs. The cartopol~ object converts the voice components into amplitude and phase form. Cartopol was originally designed to convert Cartesian to polar coordinates for graphics, but that turns out to be exactly the math needed here. The amplitude bins of the voice analysis are clipped to ensure they never exceed 1.0—there would be nasty sonic consequences if they did so. The voice amplitude is used to control both components of the carrier FFT. This happens bin by bin, so only spectral components that are strong in both signals will be heard. The fftout~ object performs the inverse FFT on what is left of the carrier spectrum. The pfft~ object does more than just provide a wrapper for all of


FIGURE 16.19 Six-voice polyphony.

FIGURE 16.20 FFT vocoder.


this. It also takes care of windowing and mixing overlapped FFT processes. The second argument controls the number of bins in the FFT, and the third sets the overlap factor. The version shown is demonstrated in DVD example 16.6.
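The bin-by-bin arithmetic is the essence of the vocoder, and it can be sketched outside Max. The fragment below assumes numpy is available, applies the clipped voice magnitudes to one carrier frame, and then transforms back; windowing and overlap-add, which pfft~ handles automatically, are omitted, and the normalization is an assumption made for the sketch.

```python
import numpy as np

def vocode_frame(carrier, voice):
    """One FFT frame of the vocoder idea: scale each carrier bin by the clipped
    magnitude of the matching voice bin, then take the inverse FFT.

    Dividing by half the frame length is just one way to keep magnitudes near 0-1.
    """
    C = np.fft.rfft(carrier)
    V = np.fft.rfft(voice)
    amp = np.clip(np.abs(V) / (len(voice) / 2), 0.0, 1.0)   # cartopol~ amplitude, clipped
    return np.fft.irfft(C * amp, n=len(carrier))

n = 512
t = np.arange(n)
carrier = np.sign(np.sin(2 * np.pi * 40 * t / n))   # a crude, harmonically rich carrier
voice = 0.5 * np.sin(2 * np.pi * 5 * t / n)         # a stand-in for one frame of a vocal
print(vocode_frame(carrier, voice)[:4])
```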

Expanding MSP One of the more surprising things you can do in MSP is run Csound. This is not built in; it requires a third-party object called Csound~ (more than one is available; I use the version by Davis Pyon) and a complete working installation of Csound itself. The first question that comes to mind is why bother? If Csound is working, what is the point of running it from Max, or conversely, what does Csound do that MSP does not? The two programs are excellent complements, filling in whatever deficiencies a composer may find with either. The widgets in Csound provide a quick and dirty user interface, but if you want a clean and finished look, the UI objects in Max are far superior. On the other hand, while the MSP synthesis modules provide a perfectly respectable set of basic sounds, the possibilities in Csound are simply unmatchable in flexibility and precision. There is also an object that enables RTcmix, which provides similar synthesis power in a slightly different package. This has just been a sampling of the audio and synthesis techniques possible in MSP. Nearly every style of synthesis is offered, if not in the included objects then in third-party externals. For instance, Perry Cook's Synthesis Toolkit is available as a set of objects. If all else fails, MSP will also host plug-ins, so nearly everything covered in this book is available.

A HINT OF JITTER Jitter is the graphics component of Max. Graphics are well beyond the scope of this book, so I’ll only give an overview and we’ll look at one of the many ways music and graphics can be combined. The heart of Jitter is a data structure known as a matrix. A matrix is a collection of simpler data structures called cells. The whole idea is built around (but by no means limited to) pixels on a screen, so I will use that as a basis for description. A pixel needs four numbers to describe its color: alpha (opacity), red, green, and blue values. The range needed to describe the intensity of a pixel color can be adequately expressed by an 8-bit number, the data size known as char. So a cell intended for screen display will contain four chars. The number of cells needed for a large screen can be enormous. As I write this, I’m looking at a particularly big one—2,560 by 1,440 pixels for a total of 3,686,400 pixels. If I wanted to use a Jitter matrix to hold the whole works, it would have to be defined with a cell size of four chars and dimensions of 2,560 by 1,440 which would take up fifteen megabytes of memory. We usually use more modest sizes in the interest of speed, but there are


no limits except the amount of memory in the computer. Cells can have as many values as you like (with choice of several data types) and there can be more than two dimensions. Figure 16.21 illustrates one use of a matrix. Just as MSP objects have a tilde-based naming convention, all objects in Jitter are marked with the prefix “jit”—the Jitter matrix is named jit.matrix. The arguments establish the number and type of values in a cell (this may also be called the number of planes) and the size of each dimension. You may name the matrix with an additional argument. If you do not, a name is automatically generated. A matrix has a large number of attributes and commands that make it extremely powerful in its own right. For instance, the importmovie command shown in Figure 16.21 will open and convert any type of image that can be shown in the QuickTime system. The bang message from the button will send the matrix name from the outlet. Using this name, any other Jitter object can access the data in the matrix. The image here is shown in a jit.pwindow object. This is distinct from the jit.window object, which is a freestanding window. The jit.pwindow is in a patcher. I’ve included two jit.pwindows to illustrate one of the powerful features of Jitter. When matrices are passed to various objects, the data is rescaled and interpolated to match the size of the object.
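The memory figure quoted above is easy to verify with a line of arithmetic; the sketch below simply multiplies out the dimensions and plane count.

```python
width, height, planes = 2560, 1440, 4        # ARGB: one char (one byte) per plane
total_bytes = width * height * planes
print(total_bytes, "bytes, about", round(total_bytes / 1e6), "megabytes")
# 14745600 bytes, about 15 megabytes
```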

FIGURE 16.21 Importing an image into Jitter.


FIGURE 16.22 Playing movies.

You can load a QuickTime movie into a matrix, but it will only show the first frame. Jitter has a dedicated object for playing .mov files called jit.qt.movie (see Figure 16.22). The essential movie player consists of the objects on the left side. The arguments in jit.qt.movie determine the initial resolution of the movie. Qmetro supplies bangs to send successive frames on to the jit.pwindow. Of course the point of showing movies in Jitter is to mess them up. There are several dozen graphic processes available ranging from simple math on the pixels to complex compositing schemes. Jit.rota is typical—it allows rotation and rescaling of the image (the anchor attributes designate the center of rotation).

Visualizing Audio There are many ways Jitter can integrate sound and images, from using a video camera to detect the actions of a conductor to using music to make 3D animations dance. Figure 16.23 shows one way to create images that should be familiar to users of iTunes. This patch starts with a file player built on sfplay~. We can listen to this, and the signal is passed to a jit.catch~. With both a jit prefix and a tilde suffix,


FIGURE 16.23 Visualizing audio.

you can guess jit.catch~ acts as a bridge between MSP and Jitter. It captures the sample values in a one-dimensional matrix with a floating point value in each cell. The jit.graph object converts such a matrix to a graph, so we have what amounts to an oscilloscope that gives us the display on the left. Oscilloscopes are only interesting to engineers, so I've used an old analog trick to spice this up a bit. That trick is video feedback, produced by pointing a camera at a monitor. The basic result is an image of an image of an image, but if you tilt the camera, the picture will be wildly scrambled. The same result can be produced digitally with a little math. The oscilloscope image is applied to the right display through a jit.op object set up to add two images together (jit.op does simple math with each cell value of the matrices). This combined image is also sent to a matrix named feedback. If two matrix objects share a name, they refer to the same data. A second matrix named feedback is placed in a good position to be banged down


FIGURE 16.24 Basic beep in Pd.

through a jit.rota object. This arrangement produces a vital one-frame delay in the process. (If I just wired the outlet of the jit.op to the inlet of the jit.rota object, there would be an infinite loop.) The jit.rota object modifies the image, which is passed through a jit.op multiplier before being combined with the new frame coming from the jit.graph. Multiplying by zero gives black and multiplying by one gives the unchanged image. Anything in between is a fade. The float number box that controls the multiplication actually controls the amount of feedback. Of course the original image passes through jit.rota many times before it fades completely, and it is turned a bit each time with the results shown. A static image does this no justice, so I’ve included a movie in DVD example 16.7 (note: the default color of jit.graph is yellow on black. I’ve modified the images to make them printable. The patch I actually use is more colorful, as you will see).
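The frame-by-frame math can be sketched without Jitter at all. The fragment below assumes numpy and stands in for the add, fade, and rotate steps; the one-pixel roll is a crude substitute for the small rotation jit.rota performs, and the sizes and fade amount are arbitrary.

```python
import numpy as np

def feedback_step(new_frame, previous, fade=0.9):
    """One frame of the digital feedback loop: nudge the previous output,
    scale it by the fade amount, and add the new image.

    A fade of 0 gives black, 1.0 keeps the old image at full strength,
    and anything between lets the trail die away gradually.
    """
    turned = np.roll(previous, shift=1, axis=1)   # stand-in for a slight rotation
    return np.clip(new_frame + fade * turned, 0.0, 1.0)

frame = np.zeros((8, 8))
frame[4, 4] = 1.0                 # one bright pixel standing in for the oscilloscope trace
out = frame.copy()
for _ in range(5):                # the trail smears sideways and fades over five frames
    out = feedback_step(frame, out)
print(np.round(out[4], 2))
```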

Pd: SIMPLE, EFFECTIVE, AND FREE If you find the visual approach to coding attractive but don’t want to invest in Max, you should give Pd a try. Pd is an open source program, well supported and available on all platforms. Best of all, it was originally written by Miller S. Puckette, the inventor of Max. Pd is not “Max lite.” It is a complete program that does all essential operations in a robust and reliable way, but it does have a research software feel to it. There are relatively few power user features like multiple patch cord fanning or automatic object alignment. The user interface objects are limited to basic sliders and buttons, and the appearance of the patch is sketchy (see Figure 16.24). But the results are all you could ask for.


Using Pd is much like using Max. There is a large stable of basic objects (greatly extended by third parties) that are connected by cords. For the most part, objects with the same function have the same name in both programs. Pd always uses floating point numbers, which avoids a few problems. Graphics are third party instead of built in, but audio is not an add-on. Pd will present some surprises to the hardened Max user, but they are not insurmountable. I encourage students to learn both. Pd is available at http://puredata.info.

RESOURCES FOR FURTHER STUDY See my website for a number of .pdf tutorials on basic and advanced techniques: www.peterelsea.com. There is also a lot of informative material at the Max website, www.cycling74.com. You will find the forum there especially useful, as it is populated by a large number of friendly and experienced folks who don't mind basic questions if you have read the manuals first. There's not a lot formally published about Max. Most books are about some performance or synthesis topic and use Max as the demonstration language. For the most part, any book about Max will apply to Pd and vice versa.

Blum, Frank. 2007. Digital Interactive Installations: Programming Interactive Installations Using the Software Package Max/MSP/Jitter. Saarbrucken: VDM Verlag.
Cipriani, Alessandro, and Maurizio Giri. 2010. Electronic Music and Sound Design: Theory and Practice with Max/MSP, vol. 1, translated by David Stutz. Rome: ConTempoNet.
Kreidler, Johannes. 2009. Loadbang: Programming Electronic Music in Pure Data. Berlin: Wolke Verlag. (Also available for download at www.pd-tutorial.com.)
Lyon, Eric. 2012. Designing Audio Objects for Max/MSP and Pd. Middleton, WI: A-R Editions, Inc.
Puckette, Miller S. 2007. The Theory and Technique of Electronic Music. Hackensack: World Scientific.
Winkler, Todd. 1998. Composing Interactive Music: Techniques and Ideas Using Max. Cambridge, MA: MIT Press.

Getting Max A thirty-day trial version of Max can be downloaded from www.cycling74.com. Once you have installed that, you will also have a runtime version that will play existing patches even after your thirty days are up. Max can be greatly expanded with objects written by third-party developers. I have a set called L-objects at www.peterelsea.com. The best clearing house for such objects is www.maxobjects.com.


SEVENTEEN Synthesis in Hardware

Is hardware dead? That's a perennial question in the world of electroacoustic music. It is never asked about hardware per se, but about a specific class of devices. The question was asked about modular systems when keyboard synthesizers appeared, yet there are more modular manufacturers than ever before and vintage systems command premium prices. The question was asked about analog synthesizers when MIDI and digital instruments appeared, yet Oberheim and Sequential instruments are prized parts of many studios, to say nothing of Minimoogs, which are again for sale in modern versions. It seemed for a while that sample-based instruments would displace abstract architectures like FM, but instead hybrid synthesizers with filters and envelope generators and a sample list of hundreds of waveforms were developed. Now I am looking at a hard drive full of softsynths (software synthesizers) and wondering if I will ever buy a knob-studded panel again. The answer is, "Of course I will." Computer-based synthesis provides a powerful set of tools, but dedicated audio hardware has several advantages over a general-purpose computer pretending to be a musical instrument.

Hardware is reliable. All you need to do is throw a switch, and you are making music. Hardware devices never have the sort of bugs we have gotten used to in software, and they are seldom shipped with a partial feature list. You don't have to edit a .bash file to get something to work, it won't stop working because of an operating system update, and you won't have to buy a new one when your hard drive crashes.

Hardware is compatible with everything. It doesn't care if your computer's operating system is Windows, Mac, Linux, or Be. The connections are MIDI or USB on one end and audio on the other, both predictable standards that work in Kansas City or Istanbul.

Hardware often provides better audio. Most computer audio interfaces are pretty poor performers, largely because computer power supplies are not designed with audio in mind. A dirty supply contaminates everything it powers, including USB and FireWire gadgets. Only top of the line interfaces have audio converters that meet professional standards for linearity and distortion, and even the digital outputs are consumer-grade S/PDIF connections, notorious for jitter and synchronization problems. Vintage hardware is often a bit hissy, but it's nothing a noise gate can't cure.

Hardware has knobs. Control surfaces need to be mapped for each application, and typically they are not labeled for what I want them to do (with the exception of Pro Tools). The knobs on most control surfaces have a one-way connection to the application. If the value is changed on screen, the knob is out of sync. When the knob is turned, either the parameter value jumps back to the knob position, or the knob does nothing until it matches the internal value. And frankly, the knobs on many control surfaces are junk. They often don't swing all the way from 0 to 127, and they feel like they will fall off at any moment.

Hardware is responsive. Although there is some latency on digital gear, it is nothing like the 20 to 30 ms common on computers, and analog equipment has instant response.

Hardware is here to stay, which is not to say that future hardware will be much like that of the past. This chapter is not intended to be a nostalgic tour of classic machines, but an introduction to the mysteries of hardware for the many students whose only exposure to synthesis has been through a computer display. We will look at some classics, but only with the purpose of understanding the typical interface. We will also look at some contemporary instruments. Finally, we will take a look at an instrument that is both a classic and a harbinger of things to come.

CLASSIC INSTRUMENTS There have probably been more than 5,000 different synthesizer models manufactured over the past thirty years. Few of these are still in production. Indeed, only a few companies still survive, but the instruments themselves endure. They are found at garage sales, used instrument stores, and in the closets of many older musicians. A quick look at an online auction house shows more than 6,000 offerings, with ’90s era rack modules going for around $100. It is educational and fun to add one or two of these to your studio, but there are a few steps to follow in order to avoid wasting your money. Don’t buy used keyboards. They take up too much room for the quality of sound you get, and the lightweight mechanisms in most wear out quickly. Only the oldest are repairable. In the 1980s keyboards had wire and spring contacts that could be cleaned or straightened, but from 1990 on the contacts are membrane switches. Years of use actually wears holes in them. Check the instrument before you buy. There’s a simple test to perform in a store or at a yard sale. Turn the unit on and see if the display lights up. If the display is readable with no missing letters, the instrument probably works. Some


instruments have a knob or adjustment screw labeled “contrast” or “viewing angle.” If this is misadjusted, the display may be black or washed out. If the display is readable from any angle, the unit is probably OK even if there is no adjustment knob, because the angle setting is in a menu. If the display is not legible from any angle, it is burned out. Give the instrument a pass. Listen to it. Many instruments have a demo mode. Plug in headphones and listen. If the demo is good, the sounds will be good. If the display lights up but there are no sounds listed, the unit may have a dead battery. These are easy to replace, and sounds will appear after you do a reset. Resets are often found in a menu or performed by holding one or two buttons down as you turn the power on. There are few buttons, so the magic combination won’t take long to find.

Learning the Instrument Once you have the instrument home, you have some detective work to do. Ideally, the manual will come along with the purchase. Realistically, you will have to do some searching to find one. Many companies have posted manuals of old instruments on their websites. For others more diligent search is required. One of the online stores, such as synthzone.com, is an excellent place to start. There may be some disappointments, but persistence will likely pay off. Looking up the Oberheim MX1000 (a real antique), I found a note that it is actually covered in the MX6 manual with a link to a copy someone had scanned and posted. The lack of a manual is not the end of the world. You can discover enough about an instrument to play it just by exploring the interface. The archetypical tone module has a one- or two-line display, a scattering of buttons, and probably only two knobs. One of these is volume, the other is a data encoder, an endless knob whose meaning changes according to what’s in the display. The buttons you find will likely include cursor controls, data increment and decrement, and enter. Moving the cursor changes what the data buttons and data encoder knob do. On larger displays the cursor will be an underlined character, a filled triangle, or some sort of highlight. On the smallest displays, only one parameter is shown at a time. Usually the initial display shows the bank, program number, and name of the current sound. (You will remember from chapter 7 that a bank is a listing of up to 128 programs.) Sounds are changed by placing the cursor under the program name or number and using the data buttons or wheel to change it. Banks are changed in the same way or by dedicated buttons. Larger displays show MIDI channel number, channel volume, and maybe some other control states. These may all be edited by moving the cursor and adjusting the data wheel. The initial display may show a voice group instead of a single voice. Voice grouping is common on keyboard instruments, where you often want to play more than one voice at a time (usually called layering) or where you want the left and right


FIGURE 17.1 Typical tone module (Korg NS5R).

hands to sound different (split keyboard). Voice grouping is also handy on a multitimbral tone generator where you might want to call up sixteen channels worth of sounds in a single command. A group may be called a performance, a combination, a multi, or a setup. The sounds themselves may be called parts, programs, presets, or voices. Switching from group to single voice operation is usually done with dedicated buttons. You can test the instrument by sending it MIDI from a keyboard or computer. If the instrument does not respond to MIDI, check that the channel matches the keyboard or the track setting in your sequencer. If the instrument still behaves badly, there are more settings to try. These are usually reached by a button labeled “master,” “utility,” or “global.” This reveals options that affect the instrument as a whole, such as tuning and transposition, the display contrast, and MIDI settings such as local control. The MIDI channel may be hidden here, especially in older instruments. Almost as important as channel is MIDI mode, which may be omni (the instrument responds to everything), mono (only one note plays at a time), poly (many notes on one channel, or possibly one note per channel), or multi (many notes on many channels). If the instrument has both group and individual voice modes, there will be a setting to determine if a program change calls up a single voice or a group. (Figure 17.1 illustrates a typical tone module.) There may be more settings, such as the ability to block certain types of message. It may surprise you that we often block program changes. This is because keyboards send program change messages when their own program is changed, and they are unlikely to match the tone generator programs we want. Some sequencers are also notorious for sending program changes when they start up, and these may destroy temporary tweaks in the sounds. One type of MIDI setting many find puzzling is control mapping. Most instruments only respond to a few control change messages out of the possible 120, and keyboards and control surfaces are limited in what they send. The likelihood of a random controller matching the instrument is pretty low, so the control messages recognized can be changed. The action of the control is then passed on to the sound generators under a generic name like cord 2 or KN3. Once you have discovered how to set the accepted control messages, go back to some of the duller presets and see what control changes can do to the sound.


Another MIDI mapping deals with program changes. The original MIDI specification allows for only 128 programs, but any instrument made since the late 1980s will have many more than this. Which ones should program changes call up? Eventually the problem was solved with bank change messages, but in the interim instruments contained a chart that determined what preset would load when a particular program change was requested. A related complexity is found in the program numbering scheme. Manufacturers never agreed on whether to label the programs from 0 to 127 (the actual numbers in the message) or from 1 to 128 (which matches the channel numbering scheme). Sequencer programmers also face this dilemma, and the resulting mismatches are always causing problems. Just to keep it interesting, a couple of instrument makers used a scheme left over from the days of two-digit displays and sixty-four voice memories. The programs were set up in eight groups of eight—the number 11 referred to the first program in the first group, or number 0. If the display reads 33, the program number is 18, and 88 refers to program 63. The scheme is sometimes called Roland Octal, even though it is not octal and Roland wasn’t the only company to use it. Start your exploration by listening to all of the built-in sounds. Follow the procedure outlined in chapter 8—noodle around on high and low sounds, try sending controls, and make note of sounds to explore further.
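A couple of lines of code make the conversion explicit; the numbers checked here are the ones quoted above, and this is only an illustration of the arithmetic.

```python
def display_to_program(display):
    """Convert a two-digit 'Roland Octal' display (11-88) to a 0-127 style number:
    eight groups of eight, with both digits counted from one."""
    group, member = divmod(display, 10)
    return (group - 1) * 8 + (member - 1)

for d in (11, 33, 88):
    print(d, "->", display_to_program(d))    # 11 -> 0, 33 -> 18, 88 -> 63
```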

Editing Sounds Once you have explored the built-in sounds, it’s time to play with editing. Most instruments have similar memory structures. The majority of the programs are in read only memory and can only be temporarily modified. Any changes you make will be lost if you turn off the power or call up a different program. There is also a small section of battery-powered memory available for saving modified programs. This is often called the user bank. If your editing experiments produce something you want to keep, you must explicitly store the program in user memory. Discovering how to do this is the first priority of learning to edit. There is often a dedicated save button, but sometimes the function is in a utility menu or at the end of the edit menu. A few instruments have a memory protect mode that must be disabled before saving a modified program is even possible. Editing is where a manual is most valuable. If one is not available, you have some work ahead of you. What you have to discover is the edit tree, that is, the structure of menus and submenus that provide access to the parameters of the sound. You work this out by pressing the edit button and following all of the various options. As you go, make a chart of the branches and the parameters at the ends. As this list grows, a sense of the edit tree will appear. Table 17.1 shows the edit tree for a complex instrument, the Yamaha FS1R. (Not shown are things that can be set in the main display, such as selected performance, the voices assigned to the parts in this performance, MIDI channel, and initial settings.) This instrument has four edit buttons—when you press a button, you get a choice of four groups of edit


commands. Highlighting a group and hitting the enter key reveals even more subgroups. At the bottom of each branch you step through a long list of parameters. For the part group, a pair of buttons step through the four parts—the same buttons select the operator in the operator group. So to lengthen an envelope in a voice, you would first get that voice showing in the main display, then press Voice Edit, cursor to Operator, press Enter, cursor to EG, press Enter, cursor to Time4, select the desired operator, then use the data buttons to set the value. An exit button eventually gets back to the main display.

Voice Architecture A chart of the edit tree will help you figure out the architecture. The complexity of the voice architecture is inversely related to the age of the instrument. The voice architecture in instruments of the 1980s was really not much beyond basic beep, but in the late 1990s and certainly for contemporary instruments designers lost their timidity and went for broke. Basic beep is still at the bottom of most of them (the exception being Yamaha FM instruments, which use the operator paradigm we explored in chapter 12), but you will probably find two to eight basic beep layers per voice with separate envelopes for pitch, filter, and amplitude control. The oscillators will be able to play a wide range of sampled waveforms and might have some variant on FM or waveshaping. Figure 17.2 shows a typical voice structure. In this example the voice consists of four simple layers, each with independent parameters. It’s not unusual to find fifty or more parameters, including things like keyboard range and envelope delays. There will also be parameters that apply to all layers, such as portamento and tuning. It’s usually not hard to figure out what a parameter controls—the names are similar to what we have already studied. What may be difficult is predicting how much effect a change in the value may have. This can only be found by experiment. First turn the values all the way up and down to find the range, then make smaller changes and note the effect of each. In addition to the sound generation elements, a voice may contain low-frequency oscillators (LFOs, either common or associated with individual elements), arpeggiators, pattern generators, and effects. Interconnecting all of this with text menus can get pretty intricate. One common paradigm is the virtual patch cord. This is a table of eight to twenty-four cords with source and destination specified for each. The sources will include all inputs, MIDI or otherwise, as well as various LFOs and envelope generators. Virtual cords usually have a scaling feature to determine how much effect the connection will have. Sometimes this scaling is in itself a destination for a cord. For the signal routing associated with effects, a bus paradigm may be adopted. The elements may tie into one of several buses and an effect specified for that bus. The buses are then mixed back in with the voice signal.


TABLE 17.1 Edit tree for the Yamaha FS1R (edit button, group, subgroup: contents).

PERFORMANCE
  COMMON
    CtrlSrc: Sources for control patches
    CtrlDst: Destinations for control patches
    Fseq: 13 formant sequencer parameters
    3 Others: Output and name parameters
  PART
    Tone: 12 parameters for each of four parts
    EG: 7 parameters
    Pitch: 7 parameters
    Others: 14 parameters
  STORE: Save edited performance
  RECALL: Recall previous version
EFFECT
  Rev: Parameters for reverb
  Var: Chorus etc. on for all parts
  Ins: Chorus etc. on for each part
  EQ: EQ for performance
VOICE
  COMMON
    LFO1, LFO2, Filter, PitchEG, Others: Parameters that affect everything in this voice
  OPERATOR
    Osc, EG, FrqEG, Sns: Parameters specific to one of the sixteen operators
  STORE: Voices are saved independently
  RECALL
UTILITY
  SYSTEM
    Master: Tuning etc.
    MIDI: Message filters
    Control: Control knobs setup
    Others: Contrast etc.
  DUMPOUT: Initiate sysex dump
  INITIAL: Initialize device memory
  DEMO: Play demo songs


FIGURE 17.2 Typical synthesizer voice structure.

Group Architecture Grouping is not a particularly difficult concept, but many manuals present it in such a complex way that it seems like rocket science. It does not help that the nomenclature varies from one manufacturer to another or that some companies are inconsistent in the names on different instruments. Figure 17.3 shows a typical group with four parts. Each part contains one voice of four elements, and the voices are mixed before going to the effects section. These effects are in addition to any effects applied in the voice. Most instruments offer somewhat different sets of effects for voices and performances. The groups may be set up in one of several ways:

Parts are linked to MIDI channel. This works well with sequencers, allowing tracks to control parts of polyphonic compositions.

Parts are linked to pitch. This works well for keyboard performance, providing different sounds for left and right hands.

Parts are mixed with hardware controls. This allows the performer to choose and blend sounds on the fly. This can be automated for vector synthesis.

Computer-assisted Editing If this seems a bit much to manage through a tiny display and five buttons, take heart. Many instruments can be edited via a computer program. The key to this feature is the presence of system exclusive (also known as sysex) messages that directly manipulate the parameters of a voice. Since about 2000, most instruments have included editing software in the box. If the program is lost, it may be found on the manufacturer's support websites. If an editor is not available or, more likely, requires an obsolete system like Macintosh OS 9, there is a good chance it is sup-


FIGURE 17.3 Four-part performance patch.

ported in Midi Quest or Unisyn. These are patch librarian and editing programs that have plug-in modules for a variety of synthesizers. Midi Quest supports more than 600 classic and contemporary instruments. It can work like a VST instrument in sequencing programs, giving hardware the same level of control softsynths have. It can also work as a stand-alone studio manager, allowing you to download and save banks of presets as well as move presets from bank to bank. Figure 17.4 shows part of the editing window for the Yamaha FS1R tone module. These programs are fairly easy to set up and use, but there are a couple of tricky issues in getting the hardware to communicate with the editor. First, MIDI connections need to run both ways between the instrument and the computer. Current settings are reported from the seldom used MIDI out jack. Second, system exclusive messages need to be enabled at each stage of the way. This may be turned off at the interface or in the instrument. Finally, MIDI devices contain a setting called device ID that allows you to manage duplicate instruments. The exact value is not important, but you need to enter it in the editor’s setup dialog box.

Production Methods Using hardware synthesizers in compositions is very similar to production with softsynths, but one operation that does differ is bounce. There is no bounce feature for


FIGURE 17.4 Editing synthesizer parameters on the computer screen.

hardware. When the time comes to convert the instrument output to an audio track, you have to connect the instrument to the computer audio input and record the track in real time. You should save this operation until late in the production process, because playing external MIDI creates less processor load than playing an audio track. When you are recording, you can listen to the instrument through the computer or through the mixer, but not both. The computer output will be slightly delayed, and the combination of signals will sound quite strange. Once the track is complete, you may have to shift it in time to compensate for recording latency. Once you have recorded an audio version of a synthesized track, remember to mute (but don’t delete) the MIDI version.

Using System Exclusive Messages The system exclusive messages that provide editing features can also be used to control the instruments in ways that go beyond the built-in controller patches. Potentially any feature or parameter can be modified on the fly by a programmable application like ChucK or Max. I say potentially because in order to do so the manufacturer had to include the requisite sysex capability in the first place, and there is quite a bit of variation in this. Some simple instruments have little or no sysex, whereas others give access to everything. (On the original DX7 even the power switch could be turned off with a sysex command.) The second obstacle is discovering what the messages are. This is often in the manual, but sometimes it is pub-


lished in a supplement that has to be requested separately. If the documentation can't be found on the Internet, the commands may be discovered if there is an editor for the instrument. It's not too hard to capture the commands the editor is sending to the instrument in Max/MSP or a MIDI monitoring program (just read the MIDI through port). The commands may be impossible to decipher but it's worth a try. Once you have some experience with well-documented instruments the captured data may become clear. One last potential problem: sometimes the published sysex documentation is wrong. This part of the manual is complex and often doesn't get the editorial attention the user instructions get. Don't lose heart when this happens. The commands are similar in structure, and you can often figure out mistakes with a bit of testing. Let's take a look at sysex control using an Evolver by Dave Smith Instruments. This is a desktop-style synthesizer with an advanced architecture and a unique programming interface. Figure 17.5 shows the Evolver's matrix of knobs and buttons. The function of a given knob is determined by the most recently pressed button. For example, filter resonance is changed by pressing button 3, then turning knob 7. Delay time is controlled by button 5, then knob 4. This makes it easy to adjust the sounds while a sequence is playing, and many artists do so. But suppose you want to change filter resonance smoothly and change delay simultaneously? With system exclusive messages this is a piece of cake. A system exclusive message consists of a long series of bytes. The first byte is always 240, the status message for sysex. Next is the manufacturer's ID, which might be one byte or three and is assigned by the MIDI Manufacturers Association. It's interesting that Dave Smith has ID 1, but that should not be surprising since he wrote the MIDI standard. All bytes after the ID are up to the instrument designer, but the next few generally identify the instrument and type of message. Smith chose 32, 1, and 1 to indicate the Evolver, version 1, parameter message. These are followed by a byte with the parameter number and two bytes for the actual value. Numbers larger than the MIDI limit of 127 are needed, so Smith sends four bits in each of two bytes. This scheme is called nibblizing, and the two bytes hold the least significant and most significant nibbles (lsn and msn). The final byte in any sysex message is 247. So to summarize, the message is 240 1 32 1 1 pn lsn msn 247. The Max patch in Figure 17.6 tests this. The sxformat object is designed to compose system exclusive messages. In Figure 17.6 it contains the Evolver parameter message with placeholders for the parameter number and data. The arithmetic boxes split the data value up into the required nibbles. To use this, first set the parameter number. Entering the data will send the message. If the Evolver knob related to the parameter was most recently turned, the data changes will be shown in the display. DVD example 17.1 has an example of this patch in action. The note patterns are coming from the chaotic player of Figure 16.11 in the last chapter. Of course this patch does nothing that can't be done on the instrument itself, but it would not be hard to expand it to control any desired combination of parameters, automate the changes, or connect to another type of control surface.
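Assembling such a message is straightforward in any language. This Python sketch simply follows the byte layout quoted above; the parameter number and value passed in are arbitrary examples, not settings from the figure.

```python
def evolver_param_message(param, value):
    """Build the Evolver parameter-change sysex described above:
    240, manufacturer ID 1, then 32, 1, 1, the parameter number, the value
    split into least and most significant nibbles, and 247 to close."""
    lsn = value & 0x0F            # low four bits
    msn = (value >> 4) & 0x0F     # high four bits
    return [240, 1, 32, 1, 1, param, lsn, msn, 247]

print(evolver_param_message(20, 99))    # parameter and value chosen only for illustration
```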

FIGURE 17.5 Evolver (Dave Smith Instruments).

FIGURE 17.6 System exclusive control of Evolver.

HARDWARE ACCELERATORS

The products discussed so far are instruments of the old school, black boxes that take MIDI in and send music out. These are actually computers, but they are computers that run exactly one program. When the program is obsolete or no longer interesting, the box has no further use. That's probably why they have been largely supplanted by software. It's much less painful to retire a program that cost $149 than to get rid of a 15-pound piece of equipment that cost $2,000. On the other hand, software synthesis really is limited by the capabilities of the host. Audio processing is one of the most demanding tasks a computer will face, and complex productions can easily overwhelm the most powerful rig. Computers are gaining speed and memory all the time, but it seems the new operating systems hog that power, so the resources available for music stay the same.

The fundamental limitation of a general-purpose computer is found in the architecture of the central processor itself. The ability to run a wide range of applications requires a design that can jump nimbly from task to task. A design that leaves out that feature can perform repeated mathematical functions efficiently and is better suited to digital signal processing. DSP chips are optimized for the algorithms used in video or audio and are the secret ingredient in synthesis hardware. A few companies are making DSP cards that plug into a computer to accelerate audio computation the same way graphics cards accelerate video. These cards support processing plug-ins of great power and sophistication (the cards that support the high-end version of Pro Tools are an example). Most of these are geared to the recording studio, but companies occasionally offer powerful synthesizers for their cards. The drawback to the accelerator market is that the cards are proprietary with few if any third-party applications available. Once a standard format emerges, this area of music technology will take off the way OpenGL has powered the game industry.

Virtual Studio Technology (VST) is a good candidate for such a standard, and hardware VST accelerators are beginning to appear. The pioneer in this is Muse Research, with a product called Receptor. Receptor is literally a VST computer. You can connect a keyboard and monitor to it, load VST instruments, and play them with high-quality output via MIDI control. Not all VST plug-ins will run on Receptor, but the supported list is quite impressive. In addition to freestanding mode, Receptor can connect to a computer to run in conjunction with a digital audio workstation or sequencer. The connection is via Ethernet, so it can be shared on a network and multiple Receptors can be used together.


KYMA

The preeminent DSP system is Kyma by Symbolic Sound. It was first released in 1990 and has become the instrument of choice for professional electroacoustic composers and sound designers. You have likely heard the output of Kyma—it has been used to create exotic sounds for blockbuster sci-fi and fantasy movies as well as for many popular games. There are actually three components to Kyma—a host computer, an extremely powerful DSP box, and the Kyma software, which runs on the host and programs the box. The DSP unit has been a series of progressively more powerful devices named after obscure South American rodents. The current machine is the Pacarana. It costs a bit more than a fully loaded Macintosh but is significantly more powerful. There is a leaner version called the Paca, which is still five times more powerful than the previous model. The Pacarana connects to the host computer via FireWire and requires a separate audio interface. There are no controls beyond a power switch, but control surfaces can be connected via USB or Ethernet.

The Kyma software is much like Max in use. You set up the instrument architecture by patching graphic modules, then load the program into the Pacarana to hear the result. There is no free demo for Kyma. Like Csound and Max, the major investment in Kyma is years of learning and experimentation, but the dollar cost is not trivial. The following discussion and accompanying examples on the DVD are intended to give you a feel for how the system works, so you can decide if it is the right investment for you.

Kyma software is written in a computer language called Smalltalk-80. Smalltalk is an object-oriented language, and much of the philosophy of object-oriented design carries over into the Kyma terminology and way of thinking. In Kyma, the objects you can work with are called sounds and are represented by icons that you manipulate in a graphic window. You derive your own sounds from an assortment of prototypes, which are complete enough to make some sort of noise by themselves. You modify copies of the prototypes by replacing some of their component sounds, adding other sounds of your choice, and setting parameters that control the operation of each sound. You can convert the sounds you create into prototypes if you like. As you develop your own sounds, they are stored in a sound file, not to be confused with a recording of sound, which Kyma calls a sample file.

DVD example 17.2 shows a video of how to create a sound. The process starts by choosing New from the file menu and selecting Sound File as the type to create. Look in the prototypes window for Oscillator (in the Sources and Generators list) and drag it down into the new window. This makes a new instance of an oscillator. The original in the prototypes window will not be affected by anything you do. Double-click on your copy to open the editor window (Figure 17.7).

FIGURE 17.7 Kyma editing window.

The top half of the window shows the sound architecture and the lower half shows parameters for the object selected in the upper half. You can hear this oscillator by hitting command (or control) p. When you do so, a virtual control surface (VCS) appears with two controls (Figure 17.8). One is labeled AmpLow and the other LogFreq. You will note that these odd names also appear in fields of the editing window preceded by an exclamation point. The exclamation point indicates that these are “event values” and will be represented by controls if they haven't otherwise been defined.

FIGURE 17.8 Kyma virtual control surface.

This sound has more parameters than we normally associate with simple oscillators. Some of these are labeled with italics, meaning they are hot parameters that may be updated as the module plays. These are the equivalent of control inputs. The others are set up as the module is compiled. If you hold the cursor over a parameter name, the cursor becomes a question mark and a click opens a window with information about the parameter. Clicking the sound name in the lower right part of the window produces a detailed description of the sound.

Many of the parameters of the oscillator are familiar. The wavetable refers to a sample file, which contains one or more waveforms, each 4,096 samples long. The floppy disk icon opens a file dialog box to change this. The index, which you will note is a hot parameter, points into the file in steps of 4,096 samples so waveforms can be quickly changed. The modulator field lists a source for FM. The formant parameter is similar to the fractalize feature of Absynth, making the waveform asymmetrical. The contents of the parameter fields can be quite complex, containing Smalltalk computations that are executed 1,000 times a second. Looking at the frequency parameter, we see the event value !LogFreq followed by the notation smoothed nn. Smoothed is a process applied to the data from the slider—any changes are smoothed over 100 ms. The entry nn indicates that the data coming in is a note number that should be converted to frequency. When you define your own event value for the virtual control surface, the slider will default to a range of 0 to 1.0, but the prototypes often have different definitions. The VCS is editable; controllers can be moved around, there are several controller styles, and it is possible to set the range and default values for each control. The VCS setup can be saved as part of the sound definition.

It is simple to convert the oscillator sound to basic beep. DVD example 17.3 gives a video demonstration of this. All that is needed is an envelope controller and links to the MIDI input. MIDI data is specified by predefined event values. Examples of these usually appear in the parameter Help, and you can see the entire list in a file called Global Map. To control the frequency of the oscillator, type !Pitch into the frequency field. !Pitch is a combination of MIDI note number and pitch bend. For the envelope, find an envelope prototype such as ADSR, copy it to the clipboard, and paste it into the envelope parameter field, replacing everything that is there. Once this action is taken, the oscillator icon in the upper half of the window will sprout a tail, indicating a hidden part of the patch. Clicking on the tail will show the icon for the ADSR. Double-click on that to select it and open the ADSR parameters, as shown in Figure 17.9.

FIGURE 17.9 Basic beep in Kyma.

Again we see a more complex object than usual. The legato value determines what the envelope does when a new gate arrives before the release is done. If legato is 0, the envelope will shut off quickly before the new cycle. If legato is 1, the envelope cycle will restart from the current value. Scaling affects the entire envelope amplitude. It defaults to control via velocity, whereas sustain will have a slider in the VCS. There is also a legato event value. This has nothing to do with the legato parameter; it merely affects the overall time of the envelope. All of these can be changed by editing what you see.

If you were to attempt to play the sound as it appears in Figure 17.9, all that you would hear would be a pop, because the play command applies to the selected sound in the window. The selected sound is not necessarily the sound that has parameters showing. Click once on the rightmost object to enable the entire window.

Our next step is to add a filter, which is demonstrated in DVD example 17.4. This is easily done by choosing a filter prototype and dragging its icon onto the thin part of the line between the oscillator icon and the little plus sign. The sounds so far play one note at a time. To get polyphony, we add a sound called MIDIVoice. Surprisingly, it goes at the right end of the patch and has the effect of duplicating everything that feeds into it as many times as requested for the voice count. The resulting sound is shown in Figure 17.10. MIDIVoice has few parameters: a keyboard range, MIDI channel, panning (shown as a ratio of left and right scaling), and the number of voices desired. It pays to be frugal with voices, as they use up computation time whether they are playing or not. MIDIVoice can accept performance instructions from three places: live MIDI input, a standard MIDI file, or a short program (written in Smalltalk) in the script field.

FIGURE 17.10 MIDI instrument in Kyma.

Notice that the name of the MIDIVoice object in the upper window is now MIDIbeep. I renamed it because new sounds take the name of the last object in the patch. This identifies the sound for future use—it can be used in other patches or extended to create something new. Figure 17.11 shows a sound derived from two MIDIbeeps. This adds a second MIDI channel. The new branch was added by opening MIDIbeep and dragging the MIDIbeep icon from the sound window onto the plus symbol next to the speaker icon—the plus symbol turns into a mixer. In Figure 17.11 the MIDIbeep sounds are folded up, showing only the final stage. These are renamed to clarify their function. MIDIbeep1 is in the editor, but the mixer (now named MIDIbeep2ch) is selected to play. The selections are indicated by subtle shades of brown and blue. Both MIDIVoices will begin playing the invention as soon as the sound is compiled.

FIGURE 17.11 Two-channel MIDI in Kyma.

DVD example 17.5 is a run-through with some manipulation of the filter controls. Recording the output is simply a matter of selecting Record from a menu and playing the sound.

This should provide a sense of how Kyma is programmed, but so far it is just another synthesizer—nice but hardly worth the fuss. It does implement nearly every technique we have covered—FM, granular synthesis, even some modeling. However, the power of Kyma is found in operations that are unavailable or difficult to obtain on other platforms. The following are a few examples.

Spectral Resynthesis

We've seen several applications that work with spectral analysis files. Alchemy even has the ability to create sounds by interpolating between two and four spectra with real-time control, an impressive breakthrough released in 2008. Kyma was doing interpolated spectral resynthesis in 1998. (The current release contains more than twenty morphing and resynthesis algorithms.) Kyma has an efficient spectral editor that can be used for subtle or startling modification of sounds. DVD example 17.6 has an example of resynthesis using the cry of a swan and a bell. The original sounds are heard first, then the mashup is used to perform a classical tune. Neither sample is used at the original pitch, and the original pitches were not the same, so there are a lot of strange effects. This process was performed with preanalyzed spectra, but Kyma also has the ability to analyze and resynthesize spectra in real time. This produces some elegant pitch changes and leads to some truly bizarre processes like spectral derangement or the “space radio,” illustrated in DVD example 17.7.

RE Synthesis In addition to spectral analysis, Kyma offers Resonator/Exciter analysis, known as RE for short. RE analysis produces two analysis files. The resonator file describes a time-varying filter that matches the spectrum of the source sound. The exciter file contains the sounds that would re-create the original sound when applied to the RE filter. The two together will reproduce the original quite well, and interesting results can be produced by mixing up excitation and resonance files. This results in something like vocoding. DVD example 17.8 applies “Jabberwocky” to a power drill. The CrossFilter sounds perform RE analysis in real time. Again, the result is similar to vocoding, but the resonant filter has a ringing property that intensifies the color imposed on the source. The filter computation is not continuous but is captured upon command or at the beginning of the file. DVD example 17.9 forces the cry of a swan through a harp glissando. You first hear the harp, then a long recording of swans at a lake, followed by the processed version.

Group Additive Synthesis Group additive (GA) synthesis builds on spectral resynthesis by analyzing spectra to see how they can be simplified by using fewer oscillators with more complex waveforms. If the GA analyzer discovers a set of harmonically related components with a constant relationship, it replaces the group of sine waves with a somewhat triangular wave of appropriate harmonic structure. This is more efficient to resynthesize because fewer oscillators are involved. Resynthesis of GA files can be morphed just as easily as spectral resynthesis files. DVD example 17.10 has a bit of Ravel’s “Pavane for a Dead Princess” performed on a piano that transforms into a horn.

FIGURE 17.12 Group additive synthesis in Kyma.

Vocoding

We have seen several applications with one sort of vocoder or another, and we have even learned how to program one in Max/MSP. The Kyma vocoder base class is especially powerful, with detailed control over the number of bands and the frequency range covered. Vocoding turns out to be one of those processes that is most interesting at low fidelity. After all, it's a communication process—done perfectly, all you would hear is the original. Low-fidelity vocoding is delicious, but certain combinations of sound seem to work better on one vocoder than another. Kyma allows us to tailor the vocoder to match the quality of input. DVD example 17.11 demonstrates various vocoding options. Figure 17.13 shows how this was set up.

FIGURE 17.13 Kyma vocoder.

Tau Editor

Many of the Kyma effects most beloved of sound designers use cross synthesis or spectral morphing techniques. Application of these processes to complex sounds requires tight synchronization. The Tau editor (time alignment utility) is similar to the beat-mapping effects built into some sequencers, but it provides access to all of the possibilities of this technique. It allows you to stretch selected portions of the file, moving them forward and backward in time as desired. Rather than forcing the sounds to some arbitrary beat, the Tau editor is completely under the composer's control.

The Tau editor works by performing a spectral analysis of the input file; the analysis is saved as a psi file. Once the analysis is complete, the editor shows envelopes for amplitude, frequency, and formants of the psi file. Any of these displays can be used for time adjustment. Adjustments are saved in a tau file; the psi version is not changed. The amplitude, frequency, and formants can also be adjusted—those changes are saved in the tau file as well. The results are played back in a sound called tauPlayer, which has controls to modify the amount of modification heard, as well as typical resynthesis tricks like pitch and rate control. DVD example 17.12 demonstrates some Tau modifications to the “Jabberwocky” file.

The Tau editor can work with more than one file at a time. This allows precise alignment of similar recordings, such as two different people reading the same text. Once multiple psi files have been created and attached to a tau file, the tauPlayer will handle spectral morphing among them, with individual sources for amplitude, pitch, and formants. DVD example 17.13 demonstrates morphing between two speakers. You will notice odd artifacts popping up where the two speakers are not quite in sync. The reality of morphing is that sounds have to be pretty similar to start with in order to combine convincingly. Single words are not a problem, but syncing an entire speech would take quite a while. On the other hand, the side effects are quite interesting. DVD example 17.14 shows what can happen when the controls are scrambled up.


There’s much more to Kyma than I have touched on here. The best place to learn more is the Symbolic Sound website. You will find recordings, videos, pointers to the work of Kyma artists, and, of course, an order form.

HARDWARE FOREVER? This chapter has been about the benefits of using hardware for at least some parts of electroacoustic composition. It may be my old-timer bias showing through, but it should be apparent that programs on a general-purpose computer can often be outperformed by dedicated instruments. Does that mean every studio should be walled with panels and knobs like the labs of the ’60s? No, but a composer should be aware of all possible tools and choose those that best fit his or her needs and working style. The hardware option is often expensive, but it should be carefully considered. Someday, a piece of gear may become the key to your best work.

RESOURCES FOR FURTHER STUDY A bit of study about historic instruments will help you spot bargains at yard sales and used instrument shops. These books have extensive descriptions of the classic synthesizers. Aikin, Jim. 2004. Power Tools for Synthesizer Programming: The Ultimate Reference for Sound Design. San Francisco: Backbeat Books. Casabona, Helen, and David Frederick. 1986. Beginning Synthesizer. Cupertino, CA: GPI Publications. Vail, Mark. 2000. Vintage Synthesizers: Pioneering Designers, Groundbreaking Instruments, Collecting Tips, Mutants of Technology, 2nd ed. San Francisco: Miller Freeman Books. Information about the Kyma system is available at the Symbolic Sound website, www.symbolicsound.com.


Part 6 Live Electroacoustic Music In the previous chapters we have considered electroacoustic music as a studio activity, produced at a leisurely pace with careful control of the results and disseminated via some form of recorded media. But that is only one facet of electronic music, the equivalent of Glenn Gould abandoning the concert stage to finish his career in the recording studio. The concert stage is important—without it, no one would have heard of Glenn Gould in the first place. The popularity of the Walkman and the iPod notwithstanding, people like to hear music in a social setting. People prefer to get recorded music for free, but they will pay serious money to attend a good live act. Concerts of recorded electronic music were never popular. We did them for a long time because there was no other way to present electronic music, especially when it required extraordinary sonic quality or multiple channels of sound. The most successful concerts were in the European tradition, where an artist would operate a mixing console on stage, taking the role of conductor and somewhat presaging DJs in the 1990s. Even so, audiences stayed home in droves. Today, an audience will sit still for a couple of short recorded pieces between live sets, but no more. We would like to believe in the purity of the music, that an audience is there only to listen, but that has never been true. To prove this all you have to do is listen to a recording of a performance you experienced in the flesh. No matter how exquisite the recording or how bad your seat was, there is something missing. It’s hard to define what that something is—eye contact with the performer, the emotional feedback from the audience, maybe just the performer’s sweat, but something is definitely different. From the musician’s point of view, there are a lot of reasons to perform live: it is fun to do, it is a good way to build a fan base, and it is a source of income. But the best reason is that music in front of an audience, no matter how technically augmented, is a transcendent experience for both the audience and performer. A lot more is going on than can be captured in the mere sound. There are also reasons not to perform: it’s hard work, time consuming, and stressful; it means hours of dull practice instead of exciting creation. But I urge you to give it a try. You don’t need to go on tour or be the house band at a bar, but when the opportunity presents itself, get out there and show them what you can do. Electroacoustic performance poses two kinds of problems—the physical issues of playing with electronics on stage and the artistic issues of composing a performance. These are discussed in separate chapters: chapter 18 is about the logistics of performance and chapter 19 will cover composition.


EIGHTEEN The Electroacoustic Performer

What makes a performance electroacoustic? It's not amplification. With today's large concert halls, every style of music is sometimes performed with some degree of reinforcement. I have met performers who are so used to the process that they insist on a sound system in a room for twenty people. But there is a point where the amplification does begin to be part of the music. John Cage's Child of Tree specifies vegetable material as a sound source—it's hard to imagine that piece without amplification, as there would be long stretches only the performer would hear. Many vocalists have learned to use the trusty Shure SM57 microphone to process their voice, moving it away from their lips on loud passages as a kind of manual compression. They also learn the precise angle of the microphone that produces the tone color they want. For these artists, the microphone is an instrument.

Moving up the signal chain, we find that sound reinforcement is becoming more and more creative. I once attended a concert by a famous singer known for the exceptional clarity and sultry quality of her voice. Imagine my surprise when I wandered backstage and discovered an entire rack of audio processors that was responsible for that quality. Of course the hallmark of good live sound is that it is unnoticed, that what the audience hears should be considered “natural” even if the effect is of the singer fifteen years younger.

Acoustic music acquires the electro when the artist wants the sound of electronics (or sounds only possible through electronics) to equal other aspects of the performance. I say equal because I expect musicians to find a balance. The electronic elements, acoustic elements, and theatrical elements all must participate fully in the ebb and flow of the audience's attention. There is some danger in this. If the electronic gadgets are not up to the job or are used carelessly, their contribution can amount to little more than annoying sonic artifacts. In the worst case the gadgets can fail, leaving a gaping hole in the music if not stopping the show entirely.

The arsenal of the electroacoustic performer is similar to the equipment of the studio. There has to be a source of signal, probably some processing gear, and a computer to move the performance away from familiar territory into new realms of expression. This all connects to the sound system. The demands on a system for performing in a hall are quite different from what we want from our studio monitors. Excellent sound is required, but there is more to reinforcement than quality of sound. There must be adequate sound level at every seat in the house and approximately the same level at every seat in the house. This simple requirement alone can lead to awesome complexity. The performers must be able to hear themselves and every other member of the ensemble. Most of the time, there is a house sound system to tie into, but sometimes the performer must provide his or her own. Then equipment has to be rented, bought, or scrounged—and the added requirements of portability and durability may outweigh everything else.

SOURCES AND CONTROLLERS A performance is a musician engaged in sound-making activity on a stage. The scale of the activity has to match the size of the venue. A classical guitarist wrapped around his instrument presents interest to the first dozen rows in an intimate recital hall, but a stadium requires dance steps, flailing drummers, and giant video screens. Luckily, we do not have to start out in a stadium, but a single figure hunched motionless over a laptop doesn’t work anywhere. To paraphrase a legal maxim, music must not only be played, it must be seen to be played. So the first element to consider for your performance is the interface or instrument you play. If you have experience with an acoustic instrument, that will naturally suggest a mode of performance, but you should consider all possibilities, either as a change of pace or an auxiliary to your main axe.

Keyboards A keyboard artist is well provided for, with a wide choice of instruments and a long tradition of performance. The keyboard is merely a way of instructing a machine such as a piano or an organ as to which pitches to activate, so connecting a keyboard to a computer is natural evolution. Picking a keyboard for electroacoustic performance starts with the traditional issue: Does it feel good to play? The factors that go into that are touch and responsiveness. Touch is mostly about mechanical construction. The advertisements make much of the weight of keys, but what really matters is the matching of keys. Each should respond identically to the force of the fingers. Heavy action is good if you regularly play an acoustic piano (your finger muscles will be built up by that) but light action is no limit to virtuosity. Response is how the motion of the key is interpreted electronically. All notes must trigger at the same point on the keystroke, and the relationship between force and MIDI velocity should be consistent. The best keyboards include a choice of response curves. The curve reflects not only how hard you hit the keys to get the highest velocity, but the range of force from softest to loudest. When the curve is matched to your playing style you can play expressively and with little fatigue.
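If an instrument's built-in curves don't suit you, the same shaping can be done in software between the keyboard and the synthesizer. Here is a minimal ChucK sketch of the idea; the exponent and the MIDI device number are arbitrary choices for illustration (an exponent below 1.0 makes the keyboard feel lighter, above 1.0 heavier).

    MidiIn min;
    MidiMsg msg;
    0.6 => float curve;            // response exponent: <1 lighter, >1 heavier

    if (!min.open(0)) me.exit();   // device 0 is an assumption

    while (true)
    {
        min => now;                // wait for incoming MIDI
        while (min.recv(msg))
        {
            // note-on on any channel with a nonzero velocity
            if ((msg.data1 & 240) == 144 && msg.data3 > 0)
            {
                // normalize, apply the curve, rescale to the 1-127 range
                Math.pow(msg.data3 / 127.0, curve) * 126.0 + 1.0 => float shaped;
                <<< "velocity", msg.data3, "->", Std.ftoi(shaped) >>>;
            }
        }
    }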


Most modern keyboards have both MIDI and USB connections. Your software sees it all as MIDI coming in from different ports. If the connection is USB 2.0 or better, the keyboard response will probably be faster than going through MIDI. You may seldom plug anything into a keyboard’s MIDI connection, but it is still needed because you never know what is going to be hooked up tomorrow. The other vital connections on a keyboard are for pedals. Sustain and volume are central to keyboard technique, but extra pedals can provide control of the software. I don’t think a keyboard can have too many pedals. The same applies to general purpose knobs and sliders. My first keyboard had one spare slider, my second had four. My current instrument has sixteen, but that’s not nearly enough. The need for a keyboard to produce sound is just a matter of taste. You will mostly use it to control other things, but a self-contained instrument can be handy. After all, you probably won’t want to drag the entire studio along to play at a Christmas party. There are a few keyboards with great synthesizers in them, but no matter how good they are, the sounds will become obsolete before the keys do. What I look for in keyboard sound is a decent piano and general MIDI presets that aren’t too embarrassing. The built-in sequencers and multichannel playing ability of so-called workstations can occasionally be useful, but few match the production quality of computer software that is actually less expensive, and I personally find working with the little screens on such keyboards hard on the eyes.

Guitars A guitar is more naturally electroacoustic, at least in the versions pioneered by Les Paul and Leo Fender. They did an excellent job, finding a simple way to derive a signal that retains the intimacy and subtlety of flesh touching a string. The choice of an instrument is too personal to cover here, but since the signal will probably be applied to a wide range of sensitive electronics, humbucker-style pickups are essential. Whether MIDI is contemplated or not, a hexaphonic pickup (one pickup per string) is an excellent option. Guitars with built-in USB audio interfaces are starting to appear on the market, but a USB cable may be somewhat limiting in live performance. A wireless pickup system would be more liberating at the same cost. The sounds available from a guitar can go well beyond the picked notes and strummed chords we are used to. Foreign objects in contact with the strings will produce a wonderful variety of tones—some harmonic, some percussive, many that would seem to bear no relation to guitar at all. This is done with paper clips, short bits of plastic, dinner forks, alligator clips, practically anything threaded between the strings. Other objects can rest on the strings bottleneck style. I’ve seen PingPong balls, corkscrews, even circular saw blades in use. In addition, the strings can be hammered with knitting needles or marimba mallets, bowed with fiddle bows or paintbrushes, even set into motion by electric motors. If you want to explore this, I suggest using a spare guitar, not because the tricks are inherently dangerous (you may go through strings rather quickly), but because they can take a while to set up.


The pitch and velocity information that a synthesis program needs is difficult to derive from a guitar. There are guitar-to-MIDI interfaces, but many performers find them unrewarding to play. The latency inherent in detecting pitch in the low range of the guitar makes the system sluggish, so most artists prefer to work directly with the audio signal (this does not preclude the use of MIDI for auxiliary control). There are some new systems that claim better MIDI performance, but the jury is still out. The most promising uses an artificial intelligence technique called neural nets to learn to identify the starting transients of each pitch. The response time improves as the instrument is played, and I suspect you lose a bit of ground when you change strings. No conversion technique will work with chords, so a hex pickup is required for any style of MIDI detection. Guitar-shaped MIDI controllers turn up from time to time. It must be a brutal corner of the industry, because few seem to survive for long. Most of these are just game controllers, but Starr Labs has a complete line of professional-grade instruments (at professional-grade prices). These do not have strings in the usual sense. There is an array of wires to strum, but they make no noise. The fingering is detected by touch-sensitive frets and picking the wires triggers notes. The result is a very snappy response that feels quite natural to play.
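The sluggishness is easy to quantify: a pitch detector typically needs to see at least a couple of complete cycles of the waveform before it can commit to a note, and on the low strings those cycles are long. A quick back-of-the-envelope calculation in ChucK:

    // rough latency estimate for pitch detection on the low E string
    82.41 => float lowE;               // low E fundamental in Hz
    1000.0 / lowE => float periodMs;   // one cycle, in milliseconds
    <<< "one period:", periodMs, "ms" >>>;          // about 12 ms
    <<< "two periods:", 2.0 * periodMs, "ms" >>>;   // roughly 24 ms before a note can even be identified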

Wind Interfaces I originally trained as a reed player and have been using Yamaha WX wind controllers for years. I’ve been through several models and am happy to report they have made steady improvements. I’ve never forgotten I was playing an imitation of the real thing, but I have played real instruments that were harder to play and sounded worse. The companion modeling synthesizer provides expressive and reasonably pleasant voices. AKAI has enjoyed similar success with the EWI series of controllers—their current lineup includes USB and wireless models. Brass players have not been as well served—the few trumpet-style controllers I have seen were handmade. The best have been quite sophisticated but hard to get and expensive. Yamaha does make a trumpet-shaped device, but it is more of an electronic kazoo. You hum into it and the pitch controls synthetic sounds of various types. Yamaha also makes a series of “silent brass” devices. These are mutes with microphones in them. The instrument is played naturally but does not make much sound. Instead you listen with headphones or an amplifier. These make an excellent source for an electronic processing rig.

String Interfaces Yamaha also makes a line of “silent” strings. These are actual instruments with a design reminiscent of electric guitar—a solid body and pickups on the strings. Characteristic violin, viola, cello, and bass tones are produced by sophisticated


electronics. There are similar instruments made by various custom makers, professional instruments sold at professional prices. MIDI interfaces are available for electric violins. Like the guitar interfaces, early models were problematic, but serious advances are made every year. The Zeta violin, produced for nearly two decades, featured a quadraphonic pickup that was connected to an outboard MIDI converter. The latest version of the converter is a four- or six-channel USB interface called StringPort developed by Keith McMillen. Software in a laptop provides the latest in MIDI note detection, as well as making the independent string sounds available to processing software such as Max/MSP or PD. Keith McMillen Instruments (KMI) also has an interesting product called the K-Bow. This is a bow with a small box that transmits bow motion, position, and pressure wirelessly to the computer.

Percussion Percussion instruments are easy. Not only are several excellent instruments available in various configurations from Alternate Mode (KAT), Yamaha, and Roland, it is simple to build your own. Homegrown triggers require an interface box of some sort, but these are not expensive, and many pad controllers offer extra trigger inputs. The heart of a percussion trigger is a piezoelectric transducer, a device shaped like a coin with the thickness of a piece of paper, which can be bought for a few dollars at electronic parts stores or for a few cents if ordered in bulk (see Figure 18.1). They have two thin wires which are easily soldered to a longer wire and a connector. This plugs into the trigger input of the instrument or interface. The piezo acts like a contact microphone, converting the vibration of anything to which it is attached into an audio signal. It’s not a high-fidelity microphone, but it is certainly suitable for triggers. In my performance classes, I hand out piezos and the students slap them on anything from shoe soles (for tap dancing) to boxing gloves. Piezo transducers are useful for more than triggers. The sounds they pick up may be band restricted, but they open up a whole world of microsounds, tones too faint for a standard microphone to capture without feedback. My classes build micro orchestras by attaching a piezo to a board and mounting various springs, nails, bits of rubber, etc., and playing them with fingers, mallets, and nail files, (Figure 18.2). Amplification of these tiny sounds reveals amazingly rich textures. DVD example 18.1 demonstrates some of the sounds available. This technique was inspired by Pauline Oliveros’s orange crates used at the San Francisco Tape Music Center in the 1960s. Phonograph cartridges can also be used to pick up microsounds, as John Cage pioneered in Cartridge Music. The cartridge can be mounted with the stylus on the vibrating surface, and the output can be wired to a microphone preamp if no actual phono input is available (in fact, a phono input may be unsatisfactory because of the built-in bass emphasis). With some cartridges it is possible to replace the stylus with an alligator clip attached to a short steel wire. This improves contact with thin vibrators such as springs or cactus spines.
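If a piezo is brought into the computer through an audio input rather than a dedicated trigger box, the same job can be done in software with an envelope follower. A minimal ChucK sketch of the idea follows; the threshold and smoothing values are guesses that will need tuning to the transducer and preamp gain.

    // simple trigger detector for a contact mic on the audio input
    adc => Gain g => OnePole env => blackhole;
    adc => g;            // feed the input to the gain a second time...
    3 => g.op;           // ...and multiply the inputs: output is input squared
    0.995 => env.pole;   // smooth the squared signal into an envelope

    0.01 => float threshold;   // tune for your transducer
    false => int armed;

    while (true)
    {
        if (!armed && env.last() > threshold)
        {
            <<< "hit! level:", env.last() >>>;   // fire a note, sample, etc. here
            true => armed;
        }
        if (armed && env.last() < threshold * 0.5) false => armed;  // re-arm
        1::ms => now;
    }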

FIGURE 18.1 Piezo transducers.

Circuit Bending While we are on the subject of using odd sounds as input for electronic performance, I should mention the burgeoning art of circuit bending. This involves finding devices that make electronic sounds, such as children’s toys, and modifying the sounds, primarily by touching the circuit boards to find spots that create interesting noises. Some artists attach switches or potentiometers to the magic spots for repeatable operation; others prefer to play freeform. Most of the sounds would be elementary to synthesize, but a table full of deconstructed toys is a definite attention getter. Circuit bending is well documented in the book Handmade Electronic Music by Nicolas Collins, and many guides and techniques for classic toys are posted on the Internet.

FIGURE 18.2 Orange crate board with vibrators.

NIME and the Future

The need for variety in electroacoustic performance is not just a whim of this author. There are enough musicians involved in exploring new performance techniques that a society has been formed, New Interfaces for Musical Expression (NIME). This group hosts an annual conference where papers are presented and truly exotic instruments are demonstrated. As an example, here is a description of the Nymophone2 designed by Kristian Nymoen at the University of Oslo:

The Nymophone2 has four metal strings across a flexible metal plate. The flexibility of the metal plate causes the control mappings of the instrument to be quite complex. Pulling one string causes the sounding pitch of the pulled string to rise, while the pitch from the remaining strings falls. To control the resonant frequency of each of the strings one may also shorten the strings with a bottleneck (a round glass or metal object commonly used on electric guitar) or with the fingers. Bending the plate will also affect the tuning of the strings. Especially interesting is the way the separate strings are affected to a different extent depending on how one bends the plate. (Description taken from the NIME 2009 conference website, which has since been taken down. A complete description of the instrument is found in Nymoen's thesis, “The Nymophone2: A Study of a New Multidimensionally Controllable Musical Instrument” [University of Oslo, 2008])


As you can imagine, the instrument has a character reminiscent of a musical saw, but with features of the steel guitar. Other equally fanciful instruments are being built in basements and kitchens around the world. Commercial production is unlikely in most cases, but such instruments can usually be built by anyone with a few basic tools. Some of these instruments achieve true sculptural status, such as the creations of San Francisco–based artist Oliver DiCicco. Experimental instruments may be totally acoustic, acoustic with pickups, or straight electronic controllers. The latter have become fairly easy to make with the advent of the Arduino board. The Arduino is a programmable microprocessor board with assorted digital and analog inputs that sells for less than $40. Other modular computing boards are well over $100 and a true microcontroller prototyping system is closer to $1,000. This is important, because the development of projects using such boards often leads to their damage. Another point in its favor is that an engineering degree is not required to build with the Arduino. It is designed to be a centerpiece of basic electronics for artists courses taught at many schools. There are also extensive online tutorials and sample projects. Since most controllers are built from simple switches and resistive controls, the actual electronics skill required is minimal. These parts are connected to the board as prescribed and the Arduino is programmed for the desired response. The programming can be done in a language similar to ChucK or the board can easily be set up to pass data to Max/MSP. Many experimental instruments use the Open Sound Control protocol. This protocol is derived from networking technology, using the same cabling and hardware as Internet connections. Controllers, tone generators, and computers communicate over a router. Since it’s all standard networking, wireless communication is a simple option. The protocol includes many more types of control than MIDI, so things like breath velocity and bow pressure can be precisely communicated. Open Sound Control is at the heart of many touch control apps for tablet computers and cell phones. These provide a pattern of lights and virtual buttons on the device, and manipulation of the buttons is transmitted wirelessly to a nearby computer. The result is a performance practice right out of Star Trek. There are apps that provide keyboard or violin style interfaces, as well as many that allow musicians to design their own.
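To give a sense of how lightweight the protocol is, here is a minimal ChucK sketch that transmits a made-up bow-pressure value over OSC; the address pattern, host, and port are invented for the example, and it assumes ChucK's OscOut class.

    // send a made-up bow-pressure control value over OSC
    OscOut xmit;
    xmit.dest("127.0.0.1", 9000);           // host and port are assumptions

    while (true)
    {
        xmit.start("/bow/pressure");        // address chosen for this example
        xmit.add(Math.random2f(0.0, 1.0));  // stand-in for a real sensor reading
        xmit.send();
        50::ms => now;
    }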

Voice Electroacoustic performance can certainly begin with voice. I recommend listening to some of the recordings of Cathy Berberian to get a sense of the astounding things that are possible beyond merely producing the lyrics. When extended singing techniques are used, it is a good idea to provide two microphones; a sensitive studio mic for delicate sounds and a robust public address model that can be held without handling noise. It is essential that the singer be able to hear both their own voice and the processed results. This requires monitors on the stage or in the ears.


PUTTING ELECTRONICS ON STAGE Mixing on Stage Whatever performing devices you choose, they are going to be hooked up to some sort of electronics. Unless you want to have a backstage assistant looking after things, the gear will share the spotlight with you. As in the studio, the central item is the mixer. An onstage mixer need not be particularly elaborate. All that’s required is a fader for each signal source, a master control for the feed to the house sound system, and monitoring controls with a headphone jack. (Headphones are essential on stage. You probably won’t perform with them on, but you will want to have some privacy when you set up the system, especially if you have to do any troubleshooting.) If processed vocals are part of the act, make sure the mixer has a decent preamp with insert or direct output. The direct output is patched to the processors or computer, then brought back to the mix on another channel. It is easier to balance processed and unprocessed vocals with this connection scheme than with wet/dry mix settings in the processors. Guitar should be used with a preamp patched directly to the mixer. An amp with speakers is seldom satisfactory because it is difficult to judge the balance with other parts of the setup. Any other signal source will connect easily to the mixer inputs. Wireless connections offer the performer a tempting level of mobility, but be sure to allow extra time to set them up before the show. You must let the system run for quite a while to make sure you are not going to pick up taxicabs or other unwanted communications. The frequency allocation for unlicensed stage microphones has recently been changed by government regulators, so make sure your system is still legal. Digital systems are less likely to pick up spurious signals than analog systems, but they can be considerably more expensive. Both analog and digital systems use various forms of compression and noise reduction and cannot match the audio fidelity of a microphone cable. The mixer main output goes to the sound system. It may be necessary to run the output signal through a direct box. This converts the line out signal from your mixer to the balanced connection sound systems need. There are two essential switches on a direct box. One is an attenuator to reduce the level of the setup to match a microphone input, which you should use only when absolutely necessary. The other is a ground lift, and this is the first thing to adjust when the PA system starts to hum. You will use only two settings on the mixer’s master fader—off and unity gain. The volume of sound in the hall is the responsibility of the house engineer, and you must send a consistent level if he is to do his job. If you are in an informal situation and have to control everything yourself, the master setting will determine the house level. Set your mark with the help of a trusted assistant or by playing a recording and wandering through the venue, but once the mark is made, stay with


it. A sound level meter is very useful in this process. If the meter measures 86 dBspl in the middle of the audience, your performance will be loud but not painful. You can’t perform if you can’t hear what you are doing. You can seldom hear the main speakers—they are deliberately set up to keep sound away from the stage. Stage monitors are generally available with the house system, but you will be happier if you provide your own. They needn’t be fancy, just some small units you can place near each member of your group. Monitors not only make the sound louder for the performers, they make the sound more immediate. They are essential to a responsive and well-coordinated performance. When I discuss onstage monitoring I always include this warning: most of the pop musicians of my generation are deaf. This is the result of putting massive amps and monitors on stage and listening at earsplitting levels. The danger is now well understood, and sophisticated tools such as custom-fitted in-ear monitors are becoming standard components of sound reinforcement. Whatever you wind up doing, always measure the sound level on stage and keep it below 80 dBspl.

Processors Many electroacoustic performances utilize freestanding effects devices. These are typically connected between the signal source and the mixer. There are two types: guitar stompboxes and line-level effects. Line-level devices are generally designed for studio installation, but they can be brought on stage in a portable rack. They can provide any effect from subtle EQ to multipart harmony, but they are usually not as interactive as stompboxes. Some models do include foot pedals or other remote controllers. Stompboxes are typically small and fairly rugged. Designed to be placed on the floor and operated with the feet, each box performs a single function. An electroacoustic piece may require a dozen boxes, presenting special problems in layout and wiring. When you set up complex arrays of stompboxes, you must pay attention to two basic factors of interconnections: level and grounding. The signal coming from a guitar pickup is very low in current. To work with this slight current, inputs suitable for guitar must be very sensitive—the technical description is high impedance. The inputs on effects devices will be marked for use with either high-impedance (often abbreviated hi-z) or line-level (abbreviated +4dBm) signals. Make sure you use the appropriate input; a mismatch will result in a weak and noisy signal. Grounding refers to the return path for current that is necessary to make any interconnection work. On high-impedance cables the grounding is provided by thin wires that are wrapped or braided around the insulated signal wire. This braid also provides some electromagnetic shielding. Any interruption of the ground path will result in signal loss and hum. Hum that appears suddenly is probably caused by a broken cable or jack. The guilty item can often be identified by flexing the cable or pushing on the connectors. If this affects the hum or produces crackling noises, replace the cable. If the replacement cable behaves the same way, suspect the jack on the box.


Another problem rears its head when the setup becomes complex. If there are two ground paths to the amplifier, current can flow through the shields on this circular path, a situation known as a ground loop. And yes, it produces hum. The solution to a ground loop is the ground lift switch found on some (but not all) effects boxes. A less-effective solution is to make sure parallel cables are the same length and bundle them together. Even faultless high-impedance wiring tends to pick up hum. This is minimized by careful layout. Keep cables as short as possible and away from hum sources such as power transformers and electric motors. Line-level gear is less likely to pick up noise, but the same wiring rules apply. If the equipment has balanced connections, use balanced cables. These have two wires within a shield and are terminated with microphone connectors or 1/4-inch TRS (tip, ring, sleeve) phone plugs.

The Laptop Most electroacoustic performances involve a computer. Today’s general-purpose laptop is only marginally suitable for onstage use, but it’s the best we’ve got. My advice for choosing a laptop is the same as for buying any computer. Pick out the software you need, then buy the computer that runs it. Given this qualification, if possible get a laptop with a keyboard that lights up and avoid highly reflective screens. Not only are such screens hard to read under stage lighting, they can reflect the lights in surprising ways. Set up your software so that all actions are triggered by keys; a mouse is a terrible performance interface, and a touchpad is worse. A computer used for performance will not be set up the same way as a production machine. For instance, the sample rate used on stage need not be any higher than 44.1 kHz. The effect this has on the CPU load should be obvious. A little less obvious is the benefit of a simple audio interface. If two channels of input and output are sufficient, there is no point in carrying along a sixteen-input box. In fact the extra traffic on the USB or FireWire connection will be a detriment (often all of those channels are transmitted, even when the data is zero). We have discussed the issue of latency at length already, but the optimum buffer size may be different in performance, where we might be willing to sacrifice some processor efficiency for quick response. Performance Software At this writing, the programs found most often on stage are Max/MSP, AudioMulch, and Ableton Live. AudioMulch looks a bit like Max. It has a graphical user interface, and patching is set up by dragging lines between boxes. The boxes are called contraptions and implement a wide range of useful and unique signal processors. The contraptions list ranges from basics like inputs and outputs to most of the standard effects like delay, EQ, and distortion. Many of these are enhanced by elegant touches. For instance, the input can double as a file player, so you can develop a patch with recorded material and test it with an input by just flipping a switch. In


addition, there are some unique effects such as the Risset Filter which produces a constantly upward sweeping comb effect. AudioMulch can host VST processors and synthesizers and features a variety of file players and loopers to support common performance strategies. As shown in Figure 18.3, one of AudioMulch’s nicest features is control. Any parameter in the setup can be assigned to MIDI, and MIDI can even be used to call up saved sets. Further, parameters can be automated with a graphic tool that is simple and accurate to use. External controller action can be recorded, edited, and saved for future performances. The paradigm of Ableton Live in performance mode is quite simple (Figure 18.4). You have a matrix of buttons, each associated with a short clip of MIDI data or audio (the buttons may be virtual or hardware from a variety of manufacturers). The clips have a duration of perhaps eight bars. Click on a button, and the associated clip plays in a loop. The columns of the matrix represent tracks, with mixing controls at the bottom. A track plays one clip at a time. If a clip is playing and you click a different button, the new clip will take over on the next downbeat. The rows across the matrix are called scenes. A button at the right of the window will activate all of the clips in the scene. Processors can be applied to tracks, and any parameter can be automated or assigned to MIDI control. All audio is played back via granular synthesis, so there is wide flexibility of pitch and tempo. You can record a new clip in the middle of a performance, making it possible to build up a fresh texture over canned loops. Ableton Live can also be used as a production tool, with the clips scripted in a horizontal timeline. In fact, a performance can be captured as an arrangement, edited a bit, and produced as an idealized version of the show. An Ableton Live performance requires a great deal of preparation, and the program provides tools to make this as efficient as possible. Ableton artists tend to accumulate huge libraries of loops and samples, and there is a system of organization based on the concept of projects and sets. A project may contain a large amount of related clips that sets can draw upon. Ableton Live provides a wide range of instruments for MIDI synthesis, and VST instruments are available as well. Some artists who have little interest in the looping paradigm still prefer Ableton Live as a synthesizer host. Ableton Live can be used in conjunction with a special version of Max/MSP. Max for Live operates within a track, allowing you to design your own processes with Ableton Live as the input interface. This combination plays well on the strengths of both programs.
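The launch-quantization idea described above (a newly selected clip waits for the next downbeat) is simple to model. A sketch of the logic in ChucK, with the tempo, bar length, and clip numbers chosen arbitrarily:

    // quantize clip changes to the next bar, in the spirit of Live's session view
    120.0 => float bpm;
    (240.0 / bpm)::second => dur bar;   // four beats per bar at this tempo
    0 => int currentClip;
    3 => int requestedClip;   // in a real patch this would be set by a key or controller

    while (true)
    {
        bar => now;   // sleep until the next downbeat
        if (requestedClip != currentClip)
        {
            requestedClip => currentClip;
            <<< "downbeat: now playing clip", currentClip >>>;
            // here the newly selected loop would be restarted from its first sample
        }
    }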

The Ergonomics of Performance

The physical arrangement of your equipment will help or hamper the performance much more than you think. The first goal is to place everything within reach, but you must also have eye contact with the audience. If the piece is performed on a laptop, place it so you can look over the screen to the center of the house.

FIGURE 18.3 AudioMulch.

If you will be mostly at a keyboard, place that toward the audience with the computer to one side. Play guitar with the laptop beside you on a stand, or place the laptop so that you can work behind it and step sideways to play. Height is equally important. You may sit or stand, but avoid doing both in the same piece. If you are standing, place the laptop on a lectern or counter-height table. You can use a music stand to support a mouse but not a computer. If your playing area is large, set up extra computer monitors so you can follow program progress from different positions. Sometimes a large monitor placed on the floor at the edge of the stage is best. Never place a monitor so the audience can see it—it will be too bright in the stage lighting. Set up a black screen saver so you don't have to shut your gear down when others are using the stage and want darkness. Use this mode instead of letting your computer sleep.

FIGURE 18.4 Ableton Live performance mode.

PUTTING YOUR SHOW ON STAGE

Any electroacoustic performance starts with one of the toughest jobs in electronic music: getting the gear set up and operating in front of an audience. No matter how much talent you have or how much practice you have put in, the show will be a flop if all you can do is stare at a bunch of lifeless and silent junk. I have attended many such shows. Often the delay is only a few minutes, but it might as well be hours—nothing ruins the mood of an audience like a long wait in the dark. The mantra for live electronic performance is well known: “If anything can go wrong, it will.” Here are some tips for preventing disaster.

Plan your show so you have enough gear. Surprisingly, the most common cause of equipment failure is leaving it home. Diagram your setup and make a checklist of all items. List the big pieces and all of the little stuff. Check items when you pack to go to the gig (and when you pack up after the show!).


Do not assume any item will be available at the venue. All you can count on is a single AC plug, so include more power strips than you think you will need. That way when the player next to you runs short, he won't unplug any of your gear to steal power.

Check out the venue and be sure you can interface. What does the venue provide? Is the house system workable? A house sound system means speakers and some sort of board; make sure it has the capabilities and quality you need. Look at the board and input connectors on stage. Does your mixer put out a signal that the house can handle? Can you cope if all they offer are mic inputs? How many lines to the board do you need? Should you bring a snake? What furniture is available? These questions must be answered well in advance of the show.

Invest in rugged versions of vital gear. Identify the key components of your system. Set up your stuff and stare at each item for a couple of minutes and imagine the consequences if it goes up in smoke. If the result is a show stopper, acquire the most reliable version you can afford. Apply this process to everything. Have you wondered why some power strips are made of plastic and cost $4 while some are metal and cost $20? The metal ones have real AC receptacles on them, the same you would find in a house. The cheap versions just have some long springy strips of metal. When the plastic wears a bit, electrical contact becomes a matter of luck. It is not true that all expensive gear is rugged and high quality, but it is true that all rugged and high-quality gear is expensive.

Routinely replace vital gear. How many times has your guitar cable been wound up tight to fit in the case? After a couple of years it will become an embarrassment waiting to happen. When you buy a piece of equipment, write the date on it. We expect to replace big items when they get long in the tooth, but it's easy to forget how old little things like foot pedals are. Don't keep every cable and adapter you ever bought. If you toss one in the drawer when it acts suspiciously, it will eventually work its way to the surface and fail you again. You can save money on cables if you buy them in bulk. Change color every time you buy a batch—that will give you an idea how old various cables really are.

Isolate flaky gear. We all have unique or vintage items that are lovely, but frankly fail from time to time. Make sure their failure won't bring other things down. A questionable reverb should be used in an insert of the mixer, not patched after the main output. That way, if it acts up, you can just yank the plug and you won't miss a beat. A classic synthesizer should not be your only keyboard if there's any doubt it will get through the show.

Back up as much gear as possible. This doesn't just mean carry a spare. Gear usually fails in sound check (moving is hard on it), but sometimes you will be caught in the middle of your act.


Have musical alternatives, so that any section of the show can be done more than one way. Spares are a good idea where feasible. When you have more than one of any item, label them so you know which is broken after you get home from the gig.

Label your cables. The second most common cause of failure is wrong connections. The light on stage is miserable, you are in a hurry, and people keep interrupting you as you are setting up. The easy fix: set the entire system up in your rehearsal space, and label each plug to match the jack it is plugged into. I know this seems silly because you have to search for the cable that is labeled instead of using the first 1/4-inch cable that comes to hand, but it actually speeds the setup. (Grab a cable, plug it into its matching jack. Hey, someone could help you without messing it all up!) Labels also make it simple to troubleshoot the wiring. You can see immediately if something is in the wrong spot. The best way to label a cable is to use stick-on labels (typically 5/8 in. x 7/8 in.) with a layer of clear tape wrapped all the way around the plug. Place them so you can read them with the plug pointing up—it will show correctly with the cable plugged in. Why clear tape? The glue on the label only lasts a few months, and the tape also protects the label so the writing doesn't smudge. Label spare cables too: put a number or letter at both ends so you can figure out where a cable comes out after being snaked under some large piece of equipment. Colored tapes are great on microphone cables because you can see from the mixer who has which mic.

Label the gear. That exciting black-on-black color scheme marketing departments favor is a pain in the rear on stage. If the connection labels or even the front panel labels are at all hard to read, make your own. If you take a bit of time you can do nice looking ones with a printer. Laminate the front with clear packing tape and use double-sided tape to attach them. While you're on this labeling kick, put your name on everything. A good stationery store can sell you a roll of 1,000 labels with the printing of your choice. You can also get white paint markers that show up well on black boxes. You might even want to engrave your name on your gear. A $30 engraver does an ugly job, but your name and address can help get equipment back if it is stolen. Labels are easily removed by heating them with a hair dryer. Any leftover residue can be cleaned off with lighter fluid or adhesive remover (be careful what you use it on).

Bring a light. Stages are dark places, especially since they always seem to be testing the lights when you are setting up. Always bring a couple of flashlights and spare batteries. You may want to include some LED lamps (the kind that clip onto books) in your layout so you can read knobs during blackouts.


Gordon Mumma used to wear a miner’s cap when he set up, and occasionally to perform. A bicyclist’s headlamp is the modern equivalent.

THE LAST WORD ON PERFORMANCE

Malcolm Gladwell has written, in Outliers (2008), that it takes 10,000 hours of directed practice to become expert at something. He based his conclusion partly on studies of musicians, so this observation probably applies to electroacoustic performance. He does not offer advice on how to spend those 10,000 hours, but I have a few suggestions based on my own experience.

Practice. Even if all you do is push a few buttons, figure out the tricky spots and woodshed them. Record what you do and listen to it. Are you really making music you would pay to hear, or is this simply self-indulgence?

Practice. Run all the way through from beginning to end. Do it again. Practice starting in the middle, or from any spot that makes sense. Then run through it all again.

Write it down. It doesn't have to be ready for publication, but it should be clean enough for you to read. Include enough information that a stranger could perform the piece.

Practice. Shut off your computer and practice launching your application and finding your files.

Play for your friends. Then listen to what they say without saying anything back (except thank you). You don't have to do what they say, but try to figure out why they said what they did.

Practice. Take all of your gear apart and pack it for moving, then put it all together and run through the show. Develop a version that is half as long. Then develop a very long version.

Practice. While wearing your costume. Figure out how to jump from one section to another. And then how to go back and repeat something.

Practice. With all the lights out.

Perform. Every chance you get.

The real final word is simply: have fun. Yes, it's an incredible amount of work. Yes, it's terrifying. Yes, you could earn more money flipping burgers. But if you are enjoying yourself, the audience will catch your enthusiasm, and they will send that energy and happiness right back to you.


RESOURCES FOR FURTHER STUDY

Books on performance tend to be somewhat mystical, but these are very practical:

Collins, Nicolas. 2006. Handmade Electronic Music: The Art of Hardware Hacking. New York: Routledge.
Gladwell, Malcolm. 2008. Outliers: The Story of Success. New York: Little, Brown & Company.
Greene, Don. 2002. Performance Success: Performing Your Best Under Pressure. New York: Routledge.

Oliver DiCicco's instruments may be viewed online at www.oliverdicicco.com, and several performances are available on YouTube. The instruments are played on the CD by Mobius Operandi, What Were We Thinking (Mobius Music, 1998).

Information about New Interfaces for Musical Expression (NIME) conferences is at www.nime.org.

Nymoen, Kristian, “The Nymophone2: A Study of a New Multidimensionally Controllable Musical Instrument,” master's thesis (University of Oslo, 2008), is available online at www.duo.uio.no/publ/IMV/2008/73952/nymoen_ma_thesis.pdf.

Information about the Arduino system can be found at www.arduino.cc.


NINETEEN

Composing for Electronic Performance

Electroacoustic performance is a wonderful metaphor for the digital age. A musician interacts with sophisticated systems to produce something beyond the capabilities of either alone. When I say interact, I mean just that. Electroacoustic performance includes a balance between performer and electronics as dynamic as that between the members of an acoustic ensemble. Each has their part to contribute, and it is the composer’s job to facilitate this synergy. There are so many ways to perform with electronics that I can only begin to list the possibilities. I’ll start by exploring several styles that have evolved in recent years.

THE CLASSIC APPROACH: LIVE INSTRUMENT(S) WITH PRERECORDED ACCOMPANIMENT

Many of the earliest electroacoustic pieces were nothing more complicated than a performer with a traditional instrument playing along with a recorded tape. Mario Davidovsky's Synchronisms usually comes to mind when this genre is discussed. The concept seems simple, but in practice many problems have to be confronted. The first is suggested by Davidovsky's title for his series. How do we synchronize the live performer and the fixed audio track? In the simplest approach, the audio track is continuous and the performer keeps up. This will put heavy demands on the artist unless the recorded sounds are clearly rhythmical or can be accurately represented by a score. (Or unless the sounds are so arrhythmic and amorphous it doesn't matter how the human plays!) In addition, there can be no solo sections because it is unrealistic to expect anybody to keep perfect tempo for more than one or two unaccompanied bars.

In a more developed version, the performer has a private cue track to follow on headphones. This is still tricky to carry off. The performer has three distinct things to listen to: the cue track ticking away, the sound of the audio track, and his or her own performance. Of course many musicians are skilled at playing with a click track—a lot of contemporary music is produced that way. If you have ever been stuck in an elevator, you know exactly how that sounds.


If you do use this technique, I suggest exploring richer sources than a metronome for the cue track. This could take the form of a piano part, such as the rehearsal score provided with some a cappella choral music. It need be nothing more than the beat and suggestions of the harmony, quiet except in phrases that prepare entrances and changes of tempo. In some pieces a hidden percussion part will do the trick. DVD example 19.1 demonstrates with a short piece for saxophone and whale sounds. The example is in two sections. The first section is what the performer hears. The left channel is the cue track and the right contains the recorded part. The cue part starts with a spoken count-in. More complex pieces may have spoken cues for rehearsal marks or changes in tempo. The click has distinct tones for downbeats and some variation in volume. The loud sections will prepare the performer for entrances, then the beat drops a bit to let the performer concentrate on expression. The second section of DVD example 19.1 presents what the audience hears, the recorded part and soloist.

When you ask a performer to play a part that is accompanied by recorded sound you have to deal with a problem that has nearly disappeared from electroacoustic work—notation. A few electronic music pieces were transcribed in the early days, for documentation or when the composer passed specifications to an engineer who would do the studio work. Nowadays, notation is usually little more than an input device for a sequencer, but a performer working with recorded sound is going to need a score that includes some graphic representation of the accompaniment tracks. The question is, what does a score of electronic sounds look like? The following are some guidelines.

If the recorded part includes anything that can be scored traditionally, do so. The result may look a lot like a percussion part, with creatively shaped note heads indicating style of sounds with standard beams and flags. Likewise, anything with a discernible pitch should be accurately placed on the staff, even if it is just a wiggly line. Dynamics can be indicated with the usual letter markings and hairpins.

Include timings in all parts. These are mostly useful for rehearsal, but they are essential for the performer to figure out exactly what must match up. This makes it worthwhile to look for a playback program that features a large time window. If the recorded part is on a compact disc, be sure to take into account the dead time at the beginning.

Use shapes to indicate pitch and texture to indicate timbre. The scoring process is likely to include two programs, one for notation and one for graphic elements. Exactly what gets imported to which depends on the capabilities of the programs. A drawing program will include a variety of crosshatchings and gradients. Use them consistently so that the performer will always know what to expect when seeing a striped wedge or a shaded square, for example. Don't bother with color, since it can be hard to make out under stage lighting.

Provide a detailed description of what every symbol means. Refer to timings in the recording where sounds stand out or include a file with solo samples of each sound.


FIGURE 19.1 Score for instrument and recorded sounds (soprano saxophone and tape parts, marked “freely, quarter note = 120,” with timing cues at 0:10, 0:20, and 0:30).

Figure 19.1 shows the score for the saxophone and whale piece in DVD example 19.1. There are two prominent types of recorded sound, both featuring whale song. Each is given a name, a graphic texture, and a shape indicating starting point and duration. It is not necessary to include internal pitches and rhythms for long taped sections unless the instrument part is cued by them. Note that the timings allow two seconds for the count-in included in the recording.

The musical problems posed by this format—instrument with audio track—are the same as what you would find with any piece for soloist and ensemble. Tonal balance and melodic interaction have been addressed in various ways from the concerto grosso to country music. The approach that has proved most successful through the ages is a simple alternation of focus: while the soloist plays the accompaniment is restrained; when the soloist takes a break the ensemble fires up. One issue unique to electroacoustic concertos is the disparity of timbre between the instrument and synthesized tones. As discussed in chapter 4, the music will work best if there are common elements between the various sonorities. I have heard several successful pieces with audio material derived from processed samples of the solo instrument.

Dynamic balance is often a thorny issue in orchestral work, but it's even more extreme with electronics. An unamplified soloist cannot compete in volume with an amplified audio track.


Here we borrow a technique explored in multitrack mixing. Study the spectrum of the solo instrument and apply equalization to the recorded audio to subtly reduce the masking harmonics. This will allow you to use the full dynamics of the sound system without burying the instrumental soloist.

The technical problems of this type of performance are those of any amplified performance. Make sure the audience hears an artistic balance of the instrument and tape, and make sure the instrumentalist can hear a balance of both parts. This may or may not require a microphone on the instrument, but a monitor for the performer is usually necessary. The sound from the hall and PA is distant and somewhat delayed, making accurate rhythmic synchronization difficult. This problem can be cured by a small speaker not far from the player's ear, which should not be audible to the audience.

In the most advanced form of solo instrumentalist with audio track, the recorded part is under direct control of another performer. The recording can then be divided into sections that are started on cue from the soloist. The performer handling the audio track needs to be alert to start accurately (and to avoid the more likely pitfall of letting the audio play beyond stopping points), but this is no more demanding than more traditional forms of accompaniment. Playback can be managed with a standard CD player but is a lot easier on a laptop. Depending on the nature of the electronic material, an added level of flexibility can be achieved by synthesizing the audio on the spot with a sequencing program. Then the accompanist has control over the tempo and track balance for some real give and take with the soloist. Few sequencers have a tempo knob, but it is not hard to cobble up a patch in Max/MSP that produces a MIDI beat clock, which most sequencers can follow. The Max patch can in turn be controlled by an external knob on a MIDI controller. Taken to its logical conclusion, the Max patch has input from the performer and determines the tempo from the notes as they are played.
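
If the mechanics of a beat clock are unfamiliar, the sketch below shows the arithmetic in Python rather than Max (Figure 19.2 shows the actual patch). The send_clock() routine is a hypothetical stand-in for whatever MIDI output your environment provides; the only fixed fact is that MIDI beat clock runs at 24 pulses per quarter note, so a tempo knob simply changes the interval between pulses.

    import time

    def send_clock():
        # Hypothetical stand-in for a real MIDI clock message.
        pass

    def pulse_interval(bpm):
        # MIDI beat clock is 24 pulses per quarter note.
        return 60.0 / (bpm * 24)

    def run_clock(get_bpm, beats=8):
        # get_bpm is polled at every pulse, so a controller knob
        # (or a tempo follower) can change the tempo while running.
        for _ in range(beats * 24):
            send_clock()
            time.sleep(pulse_interval(get_bpm()))

    run_clock(lambda: 96.0, beats=4)   # four beats at a fixed 96 BPM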

LOOPING AS AN ART FORM

Many electroacoustic performances of the pre-laptop era featured an instrument and a pair of tape recorders. A reel of tape was threaded from the supply reel of the left deck through both head blocks to the take-up reel of the deck on the right. The left deck would be set to record the instrument, with the right deck set to play back. Playback was delayed by the time it took the tape to move from the recording deck to the playing deck (a second for every 15 inches). The name of the musician who first performed this way is lost, but that person probably was the first to have access to two machines and a mixer. I suspect it was Les Paul, who had already made use of delay effects produced on a record cutter. Paul's work sparked a whole school of tape-augmented guitarists. Ray Butts's custom-built Echosonic guitar amplifier, introduced in 1952, featured a built-in tape delay, which was played by Chet Atkins and Elvis Presley among others. The familiar Echoplex followed in 1959.

FIGURE 19.2 Max beat clock patch.

Long tape delays featured in many experimental music performances, such as those at the San Francisco Tape Music Center in the early 1960s. Karlheinz Stockhausen's 1966 work Solo, for a melody instrument with feedback, requires six play heads with delay times up to 20 seconds (the score includes a part for the engineer mixing the delays). Probably the most famous example is Alvin Lucier's I Am Sitting in a Room (1969), which requires a delay of 75 seconds. The quirkiest version of delay I have seen was on a television program celebrating transatlantic broadcasting via one of the first communications satellites in the early 1960s. The signal from a flautist in the New York studio was relayed to Paris and back, producing a two-second delay which the performer used in a soft jazz improvisation.

Today we can make a distinction between two styles of delay. Short delays as found on most effects boxes are a sound process, producing the characteristic thick texture and a certain rhythmic drive. Delays of a measure or more are a formal device, becoming part of the composition's harmonic development. The latter can be further divided into two styles: open loop and closed loop.

The closed loop, produced by a digital looping recorder such as the Boss Loop Station or programs like Ableton Live, allows a real-time version of multitrack recording. The performer lays down one basic track that fixes the length of the verse.


Once captured, the loop begins to play and the performer adds a second layer, the trick being to keep the recorded levels consistent. The video clip in DVD example 19.2 demonstrates the process nicely, courtesy of vocal artist Amy X Neuburg. If you watch the left drumstick and occasionally her fingers, you can see the actions required to capture a layer and trigger playback. You will also see that it is not necessary to play layers forever in the order that they are recorded; a layer can be parked and brought back when it is musically most effective. This makes exceedingly complex forms possible. You can have loops already recorded so you do not need to capture all (or indeed any) of them on the fly, but the audience will get wise to this pretty quickly. I suggest having practice versions of your most important materials stored but bringing them in only if the live version is a complete dud.

The open loop is really just a long delay with feedback. The performer plays something, and after some interval the sound is heard again. The performer plays in reaction to this echo, and presently a more complex phrase turns up. As this goes on, the sound mutates into a continuous texture. The amount of feedback is critical. If there is not enough, repeats will be weak and will return only once or twice; if there is too much feedback, certain sounds will rise up like a tsunami and crush all musical sensibility. When the delay uses real tape, the sounds degrade gently on each pass, eventually joining a background murmur. The sound in digital delays maintains its identity as it fades unless processing is inserted in the loop. A simple high cut from a low-pass filter approximates tape fairly well, and some parametric boosts can produce predictable chords. In the Lucier work, for example, the processing is the acoustic response of the concert hall. The poorer the hall is for music, the more interesting the performance. Where closed loops tend to produce active rhythms and constant beats, open loops create a dreamlike effect, often used as ambience. This style of work seems simple, but the simplicity is deceptive. It takes serious planning to find sounds and gestures that are effective as foreground when they are first produced and form an appropriate background to what happens next. And if you lay down a clam, you have to live with it a very long time. DVD example 19.3 contains a short excerpt from a piece of this type featuring Veronica Elsea as vocalist.

Either style of looping can be thought of as technologically enabled counterpoint, and by keeping that in mind from the beginning, you may avoid some of the common pitfalls. The pitches you play must not only fit with the notes that have already been played, but with notes yet to happen. These pieces can easily become locked into a single chord or turn into an inharmonic mush. Scoring the whole thing out ahead of time is an excellent exercise and, with the cut and paste features of most notation programs, is not particularly difficult. Scoring electroacoustic work can be tedious, but in the case of loop compositions it is more efficient than working by ear. After all, if you are trying to decide whether a G or a D will work better in the third measure of a bass loop, you would otherwise need to go through the entire process of building layers for both versions. With a score, you can just change the note and see how it works.

Figure 19.3 shows the score for a delay piece, the beginning of a slow movement of my Requiem for Guitar. The excerpt is played in DVD example 19.4 by Mesut Özgen.


FIGURE 19.3 Excerpt from Requiem for Guitar (“Requiem III: Bargaining,” q = 80; tuning 1: Db, 2: C, 3: Gb, 4: D, 5: A, 6: E; the echo part is only partially transcribed, for cueing purposes).

You can see how the echo of the opening gestures is scored in the second line. Since this delay has feedback, the echoes quickly become complex—writing all of them down would be more confusing than helpful. Instead, only the entrances are accurately notated, with second and third repeats included when they are prominent in the texture.
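
To see why the feedback setting and any in-loop filtering matter so much, here is a rough model of an open-loop delay in Python. It is an idealized sketch, not a transcription of any particular pedal or tape rig: each pass through the loop is scaled by the feedback amount and run through a one-pole low-pass filter, which stands in for the gentle degradation of tape.

    def open_loop_delay(dry, delay_samples, feedback=0.6, damping=0.3):
        """Return the input plus regenerating echoes (lists of float samples)."""
        out = []
        loop = [0.0] * delay_samples   # circular buffer acting as the tape loop
        lp = 0.0                       # state of a one-pole low-pass filter
        for n, x in enumerate(dry):
            delayed = loop[n % delay_samples]             # what went in one loop ago
            lp = lp + damping * (delayed - lp)            # soften each repeat a little
            out.append(x + lp)
            loop[n % delay_samples] = x + feedback * lp   # write back into the loop
        return out

    # A single click followed by silence: the echoes return, each one quieter
    # and duller than the last, as long as feedback stays below 1.0.
    print(open_loop_delay([1.0] + [0.0] * 9, delay_samples=3))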

TRANSFORMATIONS AND PROCESSING

Preset Control

It probably goes without saying that an electronic performance will include some form of processing. This has been covered extensively enough that we know what the possibilities are, although some of our favorites (reversal, for instance) are a bit awkward to use in real time.


The main pitfall in processing is using too much of a good thing. Since everything in the show has to be packed up and moved to the stage, we find it tempting to bring as little as possible and rely on one or two surefire tricks. But it is all too easy to let the effect become the composition. I strongly encourage variety, and variety on two levels. You should vary the kind of processing in use, and you should vary the parameters of the process. Of course this can be taken too far: a performer will not thank you if more than four or five boxes are required or if presets have to be switched every eight measures.

It is vital that the score accurately specifies the parameter settings for processors. You cannot count on the device you composed with being available to the performer, so you can't just say “point the knob at 9 o'clock.” Specify beats per minute for delays and frequency range for wah-wah pedals. Include enough information about the sound you want to enable a performer to verify that the settings are working right. Your score must also include a diagram of how to set up the processors.

You need to be aware of practical limitations when changing parameters. It is impossible to operate a mouse or touchpad while playing, and knobs work only in some situations. You should provide switches, and foot switches are usually the best option. Foot switches are available in a variety of forms, but most have one of two functions: audio routing or changing presets. The latter are usually specific to a particular model of multieffect processor. Typically they have two or three buttons that step through presets in a programmed order. This makes the performer's job easy—he just hits the button at designated points in the score. It is up to the composer to get the effects into the right order. Be sure to indicate in the score which preset is supposed to come up—if the performer misses a cue, the rest of the piece could be played with the wrong settings. Also indicate the active preset at the top of each page; this simplifies practicing and provides reassurance during the performance.

Audio routing pedals come in many degrees of complexity. The simplest are built into stompboxes along with the effect circuitry. These are bypass switches—when disengaged, the signal is sent through the box with no modification. Using these does allow interesting combinations of effects, but there is a price to pay. As the signal always goes through all boxes, the noise in the system will rise with the number of boxes, and the reliability of the system will go down. Every cable contributes to noise and one bad connection will disable the entire system. There are also switch pedals with no circuitry—they simply connect a single jack to one of a pair. These work going either way, so they can be used to send audio to a choice of destinations or to choose sources. Simple switches are most useful if you need to switch a chain of complex effects on and off. To use these you must split the signal with a Y cable and combine the processed signals with another switch or a mixer (see Figure 19.4). Slightly more expensive switches include these features or combine several switches into a single unit. If you are using rack-mount processors, there are rack-mount switches with remote pedal controllers. These tend to be more expensive but offer a lot of flexibility.

If you are using laptop-based effects such as AudioMulch or Max/MSP, the best control option is a MIDI foot switch array.

FIGURE 19.4 Signal path with an A-B switch: a signal split with a switch and combined with a mixer (top), and a signal split with a Y cable and selected with a switch (bottom).

This is a number of switches on a board, each of which sends some kind of MIDI message. The exact messages sent are different from one model to the next, so make it easy to modify the patch to accommodate various messages. Figure 19.5 shows one way to do this in Max/MSP. This is simpler in AudioMulch and Ableton Live, which have a learn MIDI feature. The standard control numbers for foot switches are 64 to 71, and most send 0 for off and 127 for on. The Max/MSP patch will set the toggle on for the specified control number and off for any other. Figure 19.6 shows how to use a single pedal to step through a series of presets. Another option is to use a key on a MIDI keyboard to switch effects. The patch to do this looks quite similar, but notein is used instead of ctlin. The key might be an unneeded note at the end of the keyboard, or it may be on a mini keyboard placed close to hand.
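
The same logic is easy to express outside of Max. The Python sketch below assumes a stream of (control number, value) pairs arriving from whatever MIDI input library you use (that part is hypothetical, as are the class and function names); it toggles effects from the standard foot-switch controllers 64 to 71 and reserves one of them to step through a programmed list of presets, as in Figure 19.6.

    FOOT_SWITCH_CCS = range(64, 72)        # standard foot-switch controller numbers

    class PresetStepper:
        """One pedal steps through a programmed preset order, wrapping around."""
        def __init__(self, presets):
            self.presets = presets
            self.index = -1
        def tap(self):
            self.index = (self.index + 1) % len(self.presets)
            return self.presets[self.index]

    def handle_cc(control, value, toggles, stepper):
        # Most pedals send 0 for off and 127 for on; treat 64 and above as "pressed."
        if control in FOOT_SWITCH_CCS:
            toggles[control] = value >= 64
        if control == 67 and value >= 64:  # one switch reserved for preset stepping
            print("load preset:", stepper.tap())

    toggles = {}
    stepper = PresetStepper(["dry", "delay", "granular cloud"])
    for control, value in [(64, 127), (67, 127), (67, 0), (67, 127)]:  # simulated taps
        handle_cc(control, value, toggles, stepper)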

Ghost Scores

One of the most interesting things you can do with processing is to modify the parameters as the performer plays. This allows the sound to evolve throughout the piece. One of the first users of this technique was Morton Subotnick, who coined the term ghost score. His original approach used custom-built processors with control signals recorded on tape. The works Axolotl and Passages of the Beast are excellent examples of the technique.

FIGURE 19.5 Adjustable control numbers in Max.

FIGURE 19.6 Preset stepping in Max.

Ghost scores are best performed with laptop applications that feature parameter automation. Ableton has a powerful automation system, and it's not difficult to set up Max/MSP for this. It is even possible (but awkward) to do live processing with automation on some sequencers and digital audio workstations (DAWs). Automation in AudioMulch is easy to set up (see Figure 19.7).

FIGURE 19.7 Automation in AudioMulch.

Just build your contraption and choose the controls you want to automate with a right click. Then the changes are simply drawn in. The automation syncs to MIDI clock, so you can put a cue track in a DAW and play from there, which makes it easy to rehearse.

TRIGGERING SOUNDS AND PROCESSES

Ghost scores and cue tracks work well for many pieces, but they can lock the performer to predetermined tempos just like a recorded accompaniment. We really need more flexibility in the software, applications that can react to the music as it is played. There has been a fair amount of research into the problem of score following for accompaniment, including basic studies by Roger Dannenberg and a promising application called Music Plus One by Christopher Raphael. These include sophisticated analysis of audio input which is compared with a score to control the pace through a recorded accompaniment. There are some commercial spin-offs of this work, but you can't compose your own music for them. It's a lovely outlook for the future, but at the moment, the only way to get the computer to respond to the performer is through Max or Pd.


Catching Gestures

A computer can “listen” for three things. In decreasing order of reliability, these are MIDI data, amplitude envelopes, and monophonic frequency. In performance it is not necessary to try to follow an entire score in detail; this would be a lot of work and it would limit what the performer can do. Instead, we watch for trigger events—gestures that are unique enough to be easily recognized. These include specific chords, melodic fragments, or rhythm patterns. We can improve reliability by using other gestures as arming gestures. When an arming gesture is detected, the system informs the performer but nothing else. Only if a specific trigger is played within a few beats will it cause some action. Otherwise the arming gesture is forgotten.

Figure 19.8 shows one way to do this. In this example the arming gesture is four pitches: D, G, F, and B, all below middle C. The match object will bang when these are played in order, opening a gate for a half second. If a middle C arrives within that time, some event will be triggered. In fact, several things are triggered: the gate is shut, so further Cs will not trigger the event again, and it's also quite likely that the numbers to match and select would be changed. This allows the same patch to respond to different patterns. I'm deliberately not suggesting what this patch should do musically. It is simply a mechanism to trigger any kind of event, from a sound file to a parameter change.

The advantage of this system over making the performer hit a foot switch is that the synchronization will be tighter. Asking someone to hit a button at the same time as sounding a climactic note is asking for trouble. They must take their attention from the usual process of playing to finding a switch and hitting it at a precise moment. This is a distraction at best and, in the worst case, completely breaks the performer's concentration.
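
The arm-then-trigger idea is independent of Max. Here is a minimal sketch of the same state machine in Python, working on MIDI note numbers; the pitches and the half-second window match the example above, but the class name and the "armed"/"fire" return values are invented for illustration.

    import time

    ARMING = [50, 55, 53, 59]   # D, G, F, B below middle C (MIDI note numbers)
    TRIGGER = 60                # middle C
    WINDOW = 0.5                # seconds the gate stays open after arming

    class GestureDetector:
        def __init__(self):
            self.recent = []
            self.armed_at = None

        def note_on(self, pitch, now=None):
            now = time.monotonic() if now is None else now
            # If the gate is open, middle C fires the event and shuts the gate.
            if self.armed_at is not None and now - self.armed_at <= WINDOW:
                if pitch == TRIGGER:
                    self.armed_at = None
                    return "fire"
            # Otherwise keep watching for the arming gesture.
            self.recent = (self.recent + [pitch])[-len(ARMING):]
            if self.recent == ARMING:
                self.armed_at = now     # tell the performer, open the gate
                return "armed"
            return None

    detector = GestureDetector()
    for note in ARMING + [TRIGGER]:
        print(note, detector.note_on(note))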

Tempo Detection

One basic piece of information that can make a computer responsive is the performer's tempo. There are some sophisticated techniques for analyzing music and extracting tempo, but they are CPU hogs and seldom really necessary. Most of the time one of two tricks will work well. These are tap tempo and follow tempo.

Tap tempo is simple. The performer is asked to begin the piece with a series of quarter notes in strict tempo. The notes need not be audible, but depending on the style of music they might fit right in. The patch averages the times from hit to hit and uses that to calculate the tempo for the computer part. Measuring just one duration would not be accurate, because few performers are that precise. Usually three to eight notes will do. Figure 19.9 shows how to capture tap tempo in Max. This patch uses an object that is not standard with Max/MSP—the bonk~ object written by Miller S. Puckette. It analyzes the input signal and reacts when it detects a note attack. There are quite a few controls available, but all that is absolutely necessary is a sensitivity correction, which is set for the particular sounds it is reading.

FIGURE 19.8 Gesture detection in Max.

The output of bonk~ is elaborate, but a bang is all that is needed here. The timer object, when wired as shown, will measure the time between attacks. The split object throws away any values that don't fall between its arguments; in this case it removes times that are obviously too large or small. The first too-large value occurs at the first note, because this will be the time since the performer quit playing. A very short value is probably a mistake, either by the performer or the bonk~ object. The arguments shown for split will work for tempos from 60 to 240 but should be set for something appropriate to the piece. The rest of the patch exists to add up four of these measurements and average them. When the toggle is set, the gate is opened and the next four durations are added into an object called accum, short for accumulator, which keeps a running total of the input. The select 4 object bangs on count four, causing a cascade of actions culminating in setting the metro period and turning it on. Along the way, a tempo has been calculated and the gate shut.

Figure 19.10 shows how to make the tempo detection continuous. This is a similar process but works with more complicated input than straight quarter notes. To do this, the change must be restricted to small variations, which is done by discarding any values not within 10 percent of the target tempo. To make up for that, the patch tests values that are twice and half the expected beat. The test values will then be eighths, quarters, and half notes, but triplets and dotted eighths are ignored. The last eight measurements are averaged to get the new tempo.

FIGURE 19.9 Tap tempo detector in Max.

The zl stream object puts out a list of the last eight values, which zl sum totals up—this means the averages overlap and any change will be gradual. Further calculations convert this to beats per minute, which is fed back to the target tempo. The rate at which the tempo can change is limited by the averaging process. Shorten the list of values to average, and it will become more nimble. With four durations in the queue it will change quickly—perhaps too quickly for less skilled performers. This is a good factor to make adjustable.
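
Both tricks reduce to a little arithmetic on inter-onset times, so here is a compact Python paraphrase of the two patches (not a port). The tap_tempo() function averages the first few plausible intervals; follow_tempo() accepts a new interval only if it looks like an eighth, quarter, or half note at the current tempo (within 10 percent), then averages the last eight accepted beats. The millisecond limits and sample data are assumptions for the example.

    def tap_tempo(intervals_ms, taps=4, lo=250, hi=1000):
        """Average the first `taps` intervals inside a plausible range; return BPM."""
        good = [t for t in intervals_ms if lo <= t <= hi][:taps]
        if len(good) < taps:
            return None
        return 60000.0 / (sum(good) / taps)

    def follow_tempo(interval_ms, state, history=8, tolerance=0.10):
        """Nudge the running tempo from one new inter-onset interval."""
        beat = 60000.0 / state["bpm"]
        for mult in (0.5, 1.0, 2.0):         # eighth, quarter, or half note
            target = beat * mult
            if abs(interval_ms - target) <= tolerance * target:
                state["beats"].append(interval_ms / mult)   # normalize to quarters
                state["beats"] = state["beats"][-history:]
                state["bpm"] = 60000.0 / (sum(state["beats"]) / len(state["beats"]))
                break                         # anything else (triplets, noise) is ignored
        return state["bpm"]

    print(tap_tempo([512, 498, 505, 490]))             # about 120 BPM
    state = {"bpm": 120.0, "beats": []}
    for iv in (505, 240, 498, 1010):                    # quarter, eighth, quarter, half
        follow_tempo(iv, state)
    print(round(state["bpm"], 1))                       # drifts gently toward the playing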

FIGURE 19.10 Follow tempo in Max.

Rhythm Patterns

Detecting rhythm patterns is much more difficult than measuring tempo. When the source is audio, there are two main causes of error. The most important cause is that human performers aren't accurate, at least not in a mathematical sense. A second cause is that impulse detectors like bonk~ will often give false triggers or miss a few. The upshot is you can't just look for a duration of 250 milliseconds and call it an eighth note.

FIGURE 19.11 Rhythm pattern detection in Max.

A rhythmic pattern detector must be able to cope with a certain amount of variation. The best approach is to look at local relationships, as shown in Figure 19.11. This patch times the durations between MIDI note on messages. The split object passes times that may be expected for the patterns to detect, with 20 percent leeway. The numbers shown imply the range of interest is from an eighth to a half note at a tempo of 120 (the range can be calculated from a tempo follower). Values outside of the range are not discarded; they are replaced with 1. This will make sure contiguous groups of notes are tested. The rest of this patch makes use of lobjects, my own set of Max/MSP extensions, which are available online. The lobjects are primarily concerned with operations on lists. In this case, the llast object is similar to zl stream, providing lists of the last six inputs. Applying a list to a float box discards all but the first item in the list, which is applied here to the right inlet of ldiv. Ldiv divides every member of the list by the right input, so the result will be a list of ratios to the first duration. A pattern of quarter, eighth, eighth, quarter, half, quarter would be expressed by the ratios 1, 0.5, 0.5, 1, 2, 1. Lsub subtracts the ratios from its argument list to give a list of errors. The absolute values of the errors are totaled up and the result applied to a threshold detector. I've found a total error of less than one on a short pattern is reliable enough for most performers.


This mechanism can detect patterns of any complexity, although the more notes in a pattern the less likely it is to be played correctly. We can catch other six-note patterns just by sending a new template to the right inlet of the lsub object. Of course this is really detecting patterns of seven notes. If we apply the lyrics “shave and a haircut / two bits” to this, the trigger happens on the word bits.
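
The ratio-and-error test is easy to restate in ordinary code. The Python sketch below follows the same recipe as the patch: replace out-of-range durations with 1, divide everything by the first duration, subtract a template, and total the absolute errors. The function name, the millisecond limits, and the sample data are made up for the example.

    def rhythm_error(intervals_ms, template, lo=200, hi=1100):
        """Total error between played inter-onset times and a template of ratios."""
        if len(intervals_ms) != len(template):
            return None
        # Out-of-range values are replaced, not dropped, so note groups stay aligned.
        vals = [v if lo <= v <= hi else 1 for v in intervals_ms]
        first = float(vals[0])
        ratios = [v / first for v in vals]
        return sum(abs(r - t) for r, t in zip(ratios, template))

    # Quarter, eighth, eighth, quarter, half, quarter -- "shave and a haircut."
    TEMPLATE = (1, 0.5, 0.5, 1, 2, 1)
    played = [480, 250, 255, 500, 1020, 490]      # six durations, so seven notes
    if rhythm_error(played, TEMPLATE) < 1.0:
        print("two bits!")                        # the trigger lands on the last note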

Pitch Detection

Pitch detection in Max also depends on third-party externals. The detector of choice is Puckette's sigmund~, which produces a MIDI note number with a fraction indicating any detuning. Pitch detection is not infallible. There is a tendency to jump octaves or come up with completely wrong pitches. In fact a 90 percent accuracy rate is considered pretty good. The detectors require a clear single pitch, and even the reverb tail of a previous note can fool them. I don't recommend basing a significant feature of the composition on pitch detection. That doesn't mean it can't be effectively used in a situation that doesn't require absolute accuracy—in fact, a bit of indeterminacy can be fun. In use, audio pitch detection looks much like Figure 19.8. The pitches to match come from a sigmund~ object instead of stripnote.

ALGORITHMIC PERFORMANCE

The performance strategies discussed so far are all deterministic. The computer will always respond to the same actions in the same way. In my own music, I like the computer to surprise me—producing notes on its own that are unpredictable but make sense in relation to what I have recently played. This kind of application enters the realm of algorithmic composition. That term, which literally means “composition by rule,” covers a wide range of techniques from simple note generators like the chaos player in chapter 16 to artificial intelligence systems that write symphonies. Most of the classic algorithmic composition methods focus on generating scores, but some can be adapted for real-time performance.

An algorithmic performance program will take in data from the performer and respond to notes with events of its own choosing. The harmonizing patch used to introduce Max in chapter 16 does this in a rather boring way, producing the same chord for every occurrence of a particular note. The patch can be improved to avoid excess repetition of chords and generate interesting voice leading, but the improved version would be complex and the best result would be a good example of freshman music theory. Producing sophisticated harmonization requires advanced techniques like linguistic analysis or fuzzy logic, subjects that are too involved to cover here. Luckily, we can generate interesting performances by combining the methods explored already with a few tricks based on random number generators.


Musicians and mathematicians have differing understandings of the term random. Musicians equate random with unexpected. In mathematical terms, a random operation should produce equal numbers of all possible outcomes, an even distribution. If a coin is flipped a hundred times, there should be as many heads as tails. For math purposes it doesn't matter if the outcomes are always in the same order. Computer-based random number generators calculate a long series of numbers with an even distribution, but the same series is generated each time the program starts up. Unpredictability comes from starting with an unpredictable value, such as the number of milliseconds since the computer was turned on. The starting number is called the seed. Some programming languages allow the composer to choose a seed, which can be useful in testing and occasionally reveals a musically interesting series of values.

Random operations with even distribution can easily produce boring music. A composer needs to be able to shape the distribution of events, which leads to the concept of probability. We can visualize probabilities by considering a roulette wheel. There are thirty-eight numbered pockets on the wheel, and after each spin the ball can land in any one of them. If the wheel is honest, the ball is just as likely to land in one as another, so the chance of hitting a particular number is 1 out of 38. The probability of any event is the number of ways it can happen divided by the total number of outcomes. Eighteen of the roulette pockets are black, so the probability of landing in a black pocket is 18 out of 38 or 0.47 (47 percent). The chances of landing in a red pocket are the same. Two pockets are green, which will come up 6 percent of the time. The probabilities for black, red, and green are 0.47, 0.47, and 0.06. This list of numbers (or a graph that shows it) is called a probability distribution. A computer program that emulates a roulette wheel would generate a random number from 0 to 37. If the number were less than 18, black would win. If the number were greater than 35, green would win, and red would get the remainder. This technique of partitioning the range of a random operation can produce any arbitrary distribution. (We have already used this technique in the example of the Markov chain in chapter 15.)

The most famous probability distribution is the Gaussian distribution, also known as the bell curve. In a Gaussian distribution, values near the center of the curve turn up often while the values at either end are relatively rare. Gaussian distributions occur any time random operations are added together, such as throwing dice. There are thirty-six ways two dice can fall. Only one of these will add up to 2, but there are six possible ways to get 7. The more dice you use, the wider the range of results and the closer the distribution approaches the Gaussian ideal. Programming this is simply a matter of adding up the results of several random number generators.

Figure 19.12 shows how probability operations can be used in Max. There are four distinct processes going on here. The lreg object (from the lobjects) sends a list of keys held from the left outlet on each new note. When all keys are released, the right outlet bangs, so the wiring shown will turn the metro on while any keys are held.

FIGURE 19.12 Working with probabilities in Max.

The metro triggers the random object, which generates a number from 0 to 300. The result plus 50 is sent back to the metro and will determine the time until the next bang. The rhythmic result is much like a wind chime on a gusty day. The metro bang produces several more random numbers (remember, these happen right to left). Three of these are added together and rescaled to produce a velocity between 64 and 128. The addition of random values produces a Gaussian distribution. Another random object picks an octave from a choice of three. The multiplication in the expr converts the random integers of 0–2 to steps of 12. The addition gives the result an offset of 48. The final bang goes to the table.

A table object stores numbers at specified addresses. The message attached to the left-hand slider instructs the table to store the slider value at address 0. The others store numbers at addresses 2, 5, 7, and 9. Normally a table is triggered with an address and outputs the value stored at that address. When banged, a table performs a calculation using the value stored at each address as the probability of getting that address. The result of a bang will be one of 0, 2, 5, 7, or 9, so we are going to hear a pentatonic tune. DVD example 19.5 has a sample of this patch in action. In performance, I would control the sliders from a MIDI control surface or include a preset object in the patch with a variety of interesting settings already entered.
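
The two tricks at work here, partitioning a random range according to a list of weights and summing several uniform randoms to get a bell-curve spread, take only a few lines in any language. Here is a Python sketch of the same wind-chime idea; the weights, pitch offsets, and function names are illustrative rather than a transcription of the patch.

    import random

    def weighted_choice(weights):
        """Partition a random number according to a list of weights."""
        r = random.uniform(0, sum(weights))
        running = 0.0
        for index, w in enumerate(weights):
            running += w
            if r < running:
                return index
        return len(weights) - 1

    def gaussian_velocity(lo=64, hi=128, dice=3):
        """Sum several uniform randoms for a rough bell-curve distribution."""
        return int(lo + (hi - lo) * sum(random.random() for _ in range(dice)) / dice)

    PENTATONIC = [0, 2, 5, 7, 9]                 # the table addresses
    weights = [30, 10, 10, 30, 5]                # like the five sliders

    for _ in range(8):                           # eight chimes
        degree = PENTATONIC[weighted_choice(weights)]
        pitch = 48 + 12 * random.randint(0, 2) + degree   # one of three octaves
        wait_ms = 50 + random.randint(0, 300)    # gusty, irregular timing
        print(pitch, gaussian_velocity(), wait_ms)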

FIGURE 19.13 A probabilistic note player in Max.

Figure 19.13 also generates pitches according to an arbitrary distribution. The distribution is derived from the last eight notes played by a performer. If a note is repeated, its chances are increased. The heart of the patch is the histo object, which counts the number of times a particular number comes in. With the argument 12, it tracks the numbers 0 through 11, which will represent pitch classes. The pitches come from the llast 8 object, modified by lrem 12. When histo receives a number, it increments the count for that number, and sends the count and the number from its outlets. These outlets are connected to a table, storing the count at an address corresponding to the number. The histo and table are cleared at each note, so only the last eight notes are included. The first note played starts the metro object. The change object filters out repeated numbers to prevent the metro from restarting on following notes. Playing a note also restarts a delay object that is wired to stop the metro. Since notes coming in keep restarting the delay, the metro will shut off only if no notes are played for four seconds. This patch is heard on DVD example 19.6.
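
Counting recent pitch classes and using the counts as probabilities amounts to a small histogram. The sketch below does what the text describes—keep the last eight notes, tally their pitch classes, and draw new notes from that tally—but the class name and the output register are invented for the example.

    import random
    from collections import Counter

    class HistoPlayer:
        """Choose pitch classes with probabilities taken from recent playing."""
        def __init__(self, history=8):
            self.history = history
            self.recent = []

        def note_in(self, pitch):
            self.recent = (self.recent + [pitch % 12])[-self.history:]

        def note_out(self):
            if not self.recent:
                return None
            counts = Counter(self.recent)              # the histo step
            classes = list(counts)
            weights = [counts[c] for c in classes]     # repeats raise the odds
            return 60 + random.choices(classes, weights=weights)[0]

    player = HistoPlayer()
    for p in (62, 64, 67, 62, 69, 62):                 # D arrives three times
        player.note_in(p)
    print(player.note_out())                           # D is now the most likely reply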


The next level of complexity in probability is to change the distributions according to incoming data. Figure 19.14 is the basis for a probability-driven harmonizer. This is designed to add a chord to the note played in the right hand. The essential mechanism is in the coll objects. A coll is similar to a table, holding data at specified addresses, but a coll can hold lists of data. Figure 19.15 shows the contents of the coll named coll probs. There is a list of values for each pitch of a major scale (if a chromatic note comes in, the coll will ignore it). When a pitch number comes in, the data associated with that number is output to a table, with the listfunnel object matching the formats. The table is then banged, producing a number according to the probability distribution designated by the note. This number chooses a chord for the measure. The pitch C has a 30 in the 0 position, so the chance of a 0 output is 30/70. The chance of a 5 is the same, while the chance of a 9 is somewhat less. The coll named coll chords interprets these numbers, providing the spelling of the chords.

This mechanism, with a coll controlling probabilities that choose from a coll containing complex results, can be adapted to control any kind of parameter in the patch. It can just as easily choose colors as chords. The probability chooser can also use any kind of input. If the table of probabilities is chosen by the result of the last evaluation, we have the Markov process. This patch is demonstrated in DVD example 19.7.

There are many other techniques used in algorithmic composition that can be adapted for live performance. For a more extensive study of the field, the book Algorithmic Composition by Gerhard Nierhaus is a good place to start. For advanced study, I recommend The Algorithmic Composer by David Cope.
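
Stripped of the Max plumbing, the harmonizer is two lookup tables: one mapping an incoming scale degree to a weighted list of chord roots, the other mapping a chosen root to a chord spelling. The Python sketch below keeps the 30/30/10 weighting quoted above for C; every other row, and the chord voicings, are invented placeholders, not the contents of the book's colls.

    import random

    # scale degree of the melody note -> {chord root: weight}
    CHORD_PROBS = {
        0:  {0: 30, 5: 30, 9: 10},     # a C favors C or F chords, A minor less often
        2:  {7: 30, 2: 20, 5: 10},
        4:  {0: 30, 9: 20},
        5:  {5: 30, 2: 20, 7: 10},
        7:  {0: 30, 7: 30},
        9:  {9: 30, 5: 20, 2: 10},
        11: {7: 30, 4: 10},
    }

    # chord root -> a simple left-hand voicing (MIDI note numbers)
    CHORD_SPELLINGS = {
        0: (48, 55, 64), 2: (50, 57, 65), 4: (52, 59, 67),
        5: (41, 57, 65), 7: (43, 59, 62), 9: (45, 52, 64),
    }

    def harmonize(pitch):
        probs = CHORD_PROBS.get(pitch % 12)
        if probs is None:
            return None                    # ignore chromatic notes, as the coll does
        roots = list(probs)
        root = random.choices(roots, weights=[probs[r] for r in roots])[0]
        return CHORD_SPELLINGS[root]

    print(harmonize(72))    # a high C usually gets a C or F chord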

COMPOSING THE PERFORMANCE

No matter how expensive, exotic, or cutting edge the equipment and software is, any composition succeeds or fails because of musical factors. Do not let the technical elements distract you from composing a piece that appeals to the audience. You know what the appealing elements are: emotion, excitement, humor, meaning. These are seldom included in any equipment box; they must be provided by the composer and presented by the performer. You also know what can get in the way: ineptitude, boredom, and, of course, technical difficulties.

Ineptitude does not mean the performer is unskilled; it means a composer has asked the performer to do something he or she can't do. In electroacoustic work it is easy to design things that are impossible. The most common impossible task is playing a concert while the ink is still wet. Be sure the performer has adequate time to prepare with all of the electronics at his or her disposal. Come to a rehearsal after the performer has had a chance to get a handle on things but well before the last minute. If something is not working out, change it.

Boredom comes from music that is too simple or too obscure.

FIGURE 19.14 Probability-driven harmony in Max.

FIGURE 19.15 Contents of colls in Figure 19.14.


Since this is personal to each audience member, I urge you to work at many levels, considering everything from big structure to fiddly details. The first place to watch for boredom is in your performer. Be sure to give performers solo time and things to do that show off their hard-earned skills. Players will not tell you that you have written a dud, but you can tell from subtle hints in their behavior. Trust that instinct. The best that can happen with a disaffected performer is a lackluster performance.

Technical difficulties can have many causes, as I discussed in chapter 18. One cause that I did not cover is one that falls squarely on the shoulders of the composer: inadequate understanding of the system. Failure usually stems from not knowing the limits of the equipment or software, and the only way to gain that understanding is to experiment and test. The difference between these tasks is subtle: an experiment is trying something new to see if it will work; testing is trying things you expect to work to see if they always do. You should do plenty of both before placing anything new in front of an audience.

EXERCISES

1. Score a piece for solo instrument and musique concrète as covered in chapters 1 through 6. First, write all the notes and sounds down, then assemble the tape part.
2. Score and perform a loop piece (either open or closed) for household percussion instruments. The score can be a graphic representation.
3. Score a piece for solo instrument and ghost processing.
4. Score a piece for MIDI keyboard with tempo following and events triggered by gesture detection.

RESOURCES FOR FURTHER STUDY

Bernstein, David, ed. 2008. The San Francisco Tape Music Center: 1960s Counterculture and the Avant-garde. Berkeley, CA: University of California Press.
Cope, David. 2000. The Algorithmic Composer. Madison, WI: A-R Editions, Inc.
Nierhaus, Gerhard. 2009. Algorithmic Composition: Paradigms of Automated Music Generation. New York: Springer Verlag.
Christopher Raphael's Music Plus One website: music-plus-one.com
fiddle~ and bonk~ by Miller S. Puckette are available from crca.ucsd.edu/~tapel/software.html
Lobjects by Peter Elsea are available at peterelsea.com


Select Bibliography

Aikin, Jim. 2004. Power Tools for Synthesizer Programming: The Ultimate Reference for Sound Design. San Francisco: Backbeat Books.
Anderton, Craig. 1996. Home Recording for Musicians. New York: Amsco Publications.
Appleton, Jon, and Ronald Perera, eds. 1975. The Development and Practice of Electronic Music. Englewood Cliffs, NJ: Prentice Hall.
Benade, Arthur H. 2012. Fundamentals of Musical Acoustics, 2nd rev. ed. New York: Dover Publications, Inc.
Benade, Arthur. 1992. Horns, Strings, and Harmony, reprint ed. New York: Dover Publications, Inc.
Bernstein, David W., ed. 2008. The San Francisco Tape Music Center: 1960s Counterculture and the Avant-Garde. Berkeley, CA: University of California Press.
Blum, Frank. 2007. Digital Interactive Installations: Programming Interactive Installations Using the Software Package Max/MSP/Jitter. Saarbrücken: VDM Verlag.
Boulanger, Richard. 2000. The Csound Book: Perspectives in Software Synthesis, Sound Design, Signal Processing, and Programming. Cambridge, MA: MIT Press.
Cann, Simon. 2010. Becoming a Synthesizer Wizard. Boston: Course Technology [Cengage Learning].
Cann, Simon. 2007. How to Make a Noise: A Comprehensive Guide to Synthesizer Programming. New Malden: Coombe Hill Publishing.
Casabona, Helen, and David Frederick. 1986. Beginning Synthesizer. Cupertino, CA: GPI Publications.
Case, Alexander U. 2007. Sound FX: Unlocking the Creative Potential of Recording Studio Effects. Boston: Focal Press.
Chowning, John, and David Bristow. 1987. FM Theory and Applications: By Musicians for Musicians. Milwaukee: Hal Leonard.
Cipriani, Alessandro, and Maurizio Giri. 2010. Electronic Music and Sound Design: Theory and Practice with Max/MSP, vol. 1, translated by David Stutz. Rome: ConTempoNet.
Collins, Nicolas. 2013. Electronic Music. Cambridge Introductions to Music. New York: Cambridge University Press.


Collins, Nicolas. 2006. Handmade Electronic Music: The Art of Hardware Hacking. New York: Routledge.
Cope, David. 1997. Techniques of the Contemporary Composer. New York: Schirmer.
Cope, David. 2000. The Algorithmic Composer. Madison, WI: A-R Editions, Inc.
Davis, Gary. 1989. The Sound Reinforcement Handbook. Milwaukee: Hal Leonard Corporation.
Dwyer, Terence. 1971. Composing with Tape Recorders: Musique Concrète for Beginners. New York: Oxford University Press.
Ernst, David. 1972. Musique Concrète. Boston: Crescendo Publications.
Everest, F. Alton. 2006. Critical Listening Skills for Audio Professionals. Boston: Artistpro.
Everest, F. Alton. 1997. Sound Studio Construction on a Budget. New York: McGraw-Hill.
Everest, F. Alton. 2009. The Master Handbook of Acoustics, 5th ed. New York: McGraw-Hill.
Fletcher, Neville H., and Thomas Rossing. 1998. The Physics of Musical Instruments, 2nd ed. New York: Springer Verlag.
Gladwell, Malcolm. 2008. Outliers: The Story of Success. New York: Little, Brown & Co.
Greene, Don. 2002. Performance Success: Performing Your Best under Pressure. New York: Routledge.
Holmes, Thom. 2012. Electronic and Experimental Music: Technology, Music, and Culture, 4th ed. New York: Routledge.
Horner, Andrew, and Lydia Ayers. 2002. Cooking with Csound: Part 1: Woodwind and Brass Recipes. Middleton, WI: A-R Editions, Inc.
Huber, David. 2007. The MIDI Manual: A Practical Guide to MIDI in the Project Studio, 3rd ed. Boston: Focal Press.
Huber, David. 2010. Modern Recording Techniques. Boston: Focal Press.
Jenkins, Mark. 2007. Analog Synthesizers: Understanding, Performing, Buying. Boston: Focal Press.
Judd, F. C. 1961. Electronic Music and Musique Concrète. London: N. Spearman.
Keane, David. 1980. Tape Music Composition. New York: Oxford University Press.
Krause, Bernie. 2002. Wild Soundscapes: Discovering the Voice of the Natural World. Berkeley, CA: Wilderness Press.
Krause, Bernie. 2012. The Great Animal Orchestra: Finding the Origins of Music in the World's Wild Places. New York: Little, Brown.
Kreidler, Johannes. 2009. Loadbang: Programming Electronic Music in Pure Data. Berlin: Wolke Verlag. (Also available for download at www.scribd.com).
Levitin, Daniel J. 2006. This Is Your Brain on Music: The Science of a Human Obsession. New York: Dutton.
Lyon, Eric. 2012. Designing Audio Objects for Max/MSP and Pd. Middleton, WI: A-R Editions, Inc.
Manning, Peter. 2013. Electronic and Computer Music, 4th ed. New York: Oxford University Press.


Manzo, V. J. 2011. Max/MSP/Jitter for Music: A Practical Guide to Developing Interactive Music Systems for Education and More. New York: Oxford University Press.
Millward, Simon. 2002. Sound Synthesis with VST Instruments. Tonbridge: PC Publishing.
Nierhaus, Gerhard. 2009. Algorithmic Composition: Paradigms of Automated Music Generation. New York: Springer Verlag.
Oliveros, Pauline. 2005. Deep Listening: A Composer's Sound Practice. Lincoln, NE: iUniverse.
Owsinski, Bobby. 2009. The Recording Engineer's Handbook, 2nd ed. Boston: Course Technology [Cengage Learning].
Puckette, Miller S. 2007. The Theory and Technique of Electronic Music. Hackensack, NJ: World Scientific.
Roads, Curtis. 2004. Microsound. Cambridge, MA: MIT Press.
Roads, Curtis. 1996. The Computer Music Tutorial. Cambridge, MA: MIT Press.
Russ, Martin. 2008. Sound Synthesis and Sampling, 3rd ed. New York: Taylor & Francis.
Strange, Allen. 1983. Electronic Music: Systems, Techniques, and Controls, 2nd ed. Dubuque, IA: William C. Brown.
Touzeau, Jeff. 2009. Home Studio Essentials. Boston: Course Technology [Cengage Learning].
Vail, Mark. 2000. Vintage Synthesizers: Pioneering Designers, Groundbreaking Instruments, Collecting Tips, Mutants of Technology. San Francisco: Miller Freeman Books.
Wells, Thomas, and Eric Vogel. 1981. The Technique of Electronic Music. New York: Schirmer Books.
Winkler, Todd. 1998. Composing Interactive Music: Techniques and Ideas Using Max. Cambridge, MA: MIT Press.


Appendix: Contents of the Accompanying DVD

Chapter 1
1.1 Reference signal for level setting in the studio

Chapter 2
2.1 Rubbed wineglass
2.2 Various squeaks
2.3 Pan lid rubbed with a wooden dowel
2.4 Cookie sheet bowed with a knife sharpener
2.5 Sine waves at several frequencies
2.6 Plastic cup rolled across a table
2.7 Plastic cup rolled across a washing machine
2.8 Salt carton, oatmeal carton and faucet
2.9 Wineglass tipped to change the pitch
2.10 Faucet, gas burner, steaming kettle
2.11 Wineglass with varying amounts of water
2.12 Bursts of noise of various lengths
2.13 The effect of frequency on loudness
2.14 Clean and distorted guitar tones
2.15 Amplitude increase by 6 dB per step
2.16 Various sound envelopes
2.17 The sound of a saucepan lid twirling on its edge
2.18 Sine wave
2.19 The combination of two sine waves
2.20 Violin tones
2.21 Noise
2.22 Drum riff in Figure 2.9
2.23 Whale, balloon, gong and bottle
2.24 Melody with glasses, bowls and icemaker
2.25 Two wine glasses
2.26 Wine glass and bowl
2.27 A mix of noisy sounds
2.28 Kitchen mixer and handheld vacuum
2.29 The masking effect with spoken text

Chapter 3
3.1 Microphone directionality experiments
3.2 Proximity effect experiment
3.3 Microphone placement experiment
3.4 Six poor recordings (identified at the end of chapter 3)

Chapter 4
4.1 Melodic passages for editing
4.2 Some problematic edits
4.3 Pairs of similar sounds
4.4 Four musical expectations
4.5 Rubbed balloon
4.6 Adding variety to repeated patterns
4.7 Creating hybrid sounds
4.8 Pitch shifted piano scratch
4.9 Side effects of extreme pitch change
4.10 Complex pitch change with Soundhack
4.11 Cheese spreader
4.12 Tomato slicer
4.13 Roast holder
4.14 China bowl
4.15 Species I étude

Chapter 5
5.1 Pink noise
5.2 Faucet with Graphic EQ
5.3 Graphic EQ in action
5.4 Faucet with parametric EQ
5.5 Parametric EQ in action
5.6 Simple vocoder
5.7 Compression
5.8 Compression ratio
5.9 Compression on voice
5.10 Compression on bass
5.11 Compression on snare drum
5.12 Sustain applied to a gong
5.13 Multiband compression on gamelan
5.14 Gating
5.15 Basic echo
5.16 Flanging
5.17 Comb filtering
5.18 Phasing
5.19 Chorus
5.20 Resonance
5.21 Ring modulation and frequency shifting
5.22 Reverb adjustment
5.23 Music box
5.24 Decimation
5.25 Resampling
5.26 Distortion
5.27 Waveshaping
5.28 Rectification
5.29 Digital overload
5.30 Effects étude

Chapter 6
6.1 Foreground and background
6.2 Masking
6.3 Beating between tones
6.4 Loop with varied materials
6.5 Loop with minor alterations
6.6 Handmade loop
6.7 Layers étude

Chapter 9
9.1 Fine tuning a sample's loop
9.2 Artifacts in sample loops
9.3 Sample envelopes
9.4 Swept and enveloped filter

Chapter 10
10.1 Basic waves
10.2 Spectra of basic waves
10.3 Pulse wave duty cycle
10.4 Spectra of varying duty cycle
10.5 Aliasing
10.6 Attack and release
10.7 ADSR envelopes
10.8 Filter effects
10.9 Doubled oscillators
10.10 Vibrato
10.11 Amplitude modulation of sine waves
10.12 Amplitude modulation of sine and saw
10.13 Amplitude modulation with voice
10.14 Amplitude modulation and balanced modulation
10.15 Frequency modulation on an analog synthesizer
10.16 Sample and hold

Chapter 11
11.1 Setting waveforms on Remedy synthesizer
11.2 Pulsewave duty cycle on Remedy
11.3 Exploring the Remedy LFO
11.4 LFO shapes in Remedy
11.5 Frequency modulation in Remedy
11.6 Envelope control of FM in Remedy
11.7 Envelope control of amplitude in Remedy
11.8 Filter effects in Remedy
11.9 Waveforms in Absynth
11.10 Waveform design by spectrum
11.11 Absynth oscillator modes
11.12 Waveshaping, ring modulation and frequency shifting
11.13 Filters in Absynth
11.14 Low-pass filters in Absynth
11.15 High-pass filters in Absynth
11.16 Band-pass filter with high Q
11.17 All-pass filter
11.18 Notch filter
11.19 Comb filter
11.20 Coalescence
11.21 Phase effects with two oscillators
11.22 Detuning at unison and octave
11.23 Transformation by crossfading voices
11.24 Adding transients
11.25 Formants

Chapter 12
12.1 FM index sweep
12.2 FM index envelope
12.3 FM partial bounce
12.4 Brassy FM (see Table 12.1)
12.5 FM ratio 1:10
12.6 FM clavichord (see Table 12.2)
12.7 FM with integer ratios 1 to 15
12.8 FM bells (Table 12.3)
12.9 FM chimes (Table 12.4)
12.10 Fixed carrier in high octaves
12.11 Fixed carrier in low notes
12.12 Inverted modulation envelope (see Table 12.5)
12.13 FM drum (see Table 12.6)
12.14 Carrier at zero
12.15 Carrier at zero with inverted modulation envelope
12.16 FM8 modulator waveforms
12.17 FM with two modulators
12.18 Feedback modulation
12.19 FM with stacked modulators
12.20 FM noise
12.21 FM trumpet
12.22 FM choir
12.23 FM choir with voice doubling and chorus
12.24 Filtered version of FM Choir with doubling
12.25 FM strings components
12.26 FM strings complete
12.27 FM with velocity controlled modulation
12.28 FM evolution with envelopes
12.29 Looping envelopes with external control
12.30 Exercise in identifying FM sounds

Chapter 13
13.1 A video demonstration of SPEAR
13.2 Creating a harmonic structure in SPEAR
13.3 Additive synthesis in Alchemy
13.4 Morphing trumpet sounds in Alchemy
13.5 Time morphing in Alchemy
13.6 Three versions of resynthesis
13.7 Time stretching with resynthesis
13.8 Synthesis from image
13.9 A video demonstration of Metasynth
13.10 Granular time stretching
13.11 Granular pitch shifting
13.12 Grain delay in RTGS-X
13.13 Grain length in RTGS-X
13.14 The buffer index in RTGS-X
13.15 Granular synthesis using Absynth 4
13.16 Pure digital and modeled analog synthesis
13.17 Pick parameters in Modelonia string model
13.18 String filter in Modelonia
13.19 String stiffness and modifier buttons in Modelonia
13.20 A filter sweep in the Modelonia pipe model
13.21 Modelonia lips parameters
13.22 Noise excitation
13.23 Hybrid models in Modelonia
13.24 Bow pressure and damping in String Studio
13.25 Body parameters in String Studio

Chapter 14
14.1 Basic beep in Csound (see Listing 14.5)
14.2 Fanfare for DC (see Listing 14.6)
14.3 A demonstration of pluck (see Listing 14.7)
14.4 MIDI control of Csound (see Listing 14.8)
14.5 A video demonstration of the knobs in a Csound window in action
14.6 Audio processing in Csound
14.7 Frequency shifting
14.8 A Csound vocoder

Chapter 15
15.1 A simple scale in ChucK (see Listing 15.7)
15.2 Sporking (see Listing 15.9)
15.3 Windy chimes (see Listing 15.14)
15.4 Glassy chimes (see Listing 15.15)
15.5 The Markov tune (see Listing 15.28)
15.6 Rhythm patterns (see Listing 15.31)
15.7 Synchronized sporks (see Listing 15.32)
15.8 Zipper distortion
15.9 Variations on the wind model from the STK (see Listing 15.35)

Chapter 16
16.1 Random notes (see Figure 16.7)
16.2 Simple rhythm engine (see Figure 16.10)
16.3 Chaotic Player (see Figure 16.11)
16.4 Delays (see Figure 16.14)
16.5 Speed change with groove~ (see Figure 16.16)
16.6 FFT vocoder (see Figure 16.20)
16.7 Audio visualization with video feedback (see Figure 16.23)

Chapter 17
17.1 Sysex control of Evolver (see Figure 17.6)
17.2 Video introduction to Kyma
17.3 Basic beep in Kyma
17.4 Filtered beep in Kyma
17.5 MIDI in Kyma
17.6 Resynthesis in Kyma
17.7 Spectral manipulation in Kyma
17.8 RE analysis and resynthesis
17.9 Swan filtered by harp
17.10 Morphing with group additive synthesis
17.11 Vocoding in Kyma
17.12 Tau editing in Kyma
17.13 Spectral morphing with tau editor
17.14 Tau gone bad

Chapter 18
18.1 Sounds from amplified board

Chapter 19
19.1 Cue track and performance of Figure 19.1
19.2 This Loud by Amy X Neuburg (video used by permission)
19.3 Open loop excerpt from Curses and Prayers (Veronica Elsea vocals)
19.4 Excerpt from Requiem for Guitar, performed by Mesut Ozgen
19.5 Performance with probability distribution (see Figure 19.12)
19.6 A probabilistic accompaniment (see Figure 19.13)
19.7 Probability-driven harmony (see Figure 19.14)

Key to DVD example 12.30: h, f, d, b, a, c, e, g, i


Index

A Ableton Live, 458, 469, 474 Absynth, 247–49, 253, 261, 266, 308, 328 Acoustic modeling, 330–31 Acoustics, studio, 3 Additive synthesis, 205–6, 292–5, 313–20, 440 ADSR [Attack Decay Sustain Release], see Envelope generator AKAI, 187 Alchemy (synthesizer), 315–20, 412, 439 Algorithmic performance, 481–85 Aliasing, 60–61, 118 MIDI regions, 171 In synthesis, 216–17, 241 Ambience, 71 Amplifier module, 217–18 Amplitude, 36 Compression, 105–11 Limiter, 66–67, 104–5 Modulation, 228–30 Analog to digital converter (ADC), 60 Applied Acoustic Systems, 336 Architecture Hardware synthesizers, 426–28 Software synthesizers, 237–38, 239–40, 247–48 Arduino, 454 ARP 2600 modular synthesizer, 329 Atkins, Chet, 468 Attack (audio), 34, 138–39 Attack (in compression), 106 Attack (in synthesis) see Envelope generator

AU [Audio Units], 93, 364 Audio editing software, 73–82 Audio equipment, 4 Audio files File format, 63 Processing, 358–59 Audio interface, 11 Audio level, 21, 42–45 Audio recording and editing programs, 14–15, 73–82 Audio specifications, 4 Audio Units, see AU AudioMulch, 472 B Background and foreground, 138 Background noise, 70 Balanced modulation, see Ring modulation Basic beep, 224–25, 232, 239, 347, 367, 410, 437 Beatles, The, 187 “Strawberry Fields Forever,” 187 Bell Labs, 342 Berberian, Cathy, 454 Bessel function, 269 Bidirectional microphone, 53–56 Big Briar (Moog), 152 Bonetti, Stefano, 344 Boss Loop Station, 469 Boulanger, Richard, 344 Buchla, Don, 187 Bulletproofing, 460–63 Butts, Ray, 468


C Cabrera, Andrés, 444–45 Cage, John, 447, 451 Cartridge Music, 451 Child of Tree, 447 Camel Audio, 315 Cardioid microphone, 53–56 Carlos, Wendy, 313 Carrier, 103–4, 114, 228–31, 268–69 CD (Compact Disk), 64 Center for Research in Computer for the Arts, 343 Chamberlain Music Master, 187 Chaotic player, 403–4 Chipmunk effect, 89 Chorus, 112 Chowning, John, 267 ChucK, 363–92, 454 Arrays, 378–81, 383–85 Basic beep, 367 ChucK operator (=>), 365 For loop, 368, 376–78 Functions, 368–69, 381–83 If statement, 374–76 Library functions, 373–74 Math operators, 372–73 MIDI response, 369–70, 373 Overview, 364–71 Programming examples, 371–92 Shred, 365 Time, 365–63, 385–87 Unit generators, 387–92 Variables, 371–72 Virtual machine, 365 While loop, 370, 377 Circuit bending, 452 Click track, see Cue track Clipping, 105, 121, 246, 347 A Clockwork Orange (film), 313 Cmix, 363 Coalescence, 262 Coding, live, 363–92 Coloration, 70–71

Columbia University, 363 Comb filtering, 112 Common Lisp Music [CLM], 341, 343 Common Music, 343 Common Music Notation, 343 Complex instruments (in Csound), 353–56 Composers Desktop Project, 344 Composition In Audio editor, 89–92 Case study, 89–92, 144–45 Combining concrète and synthesis, 183 Critique, 84–85 In DAW, 137–45 For live performance, 465–87 In MIDI editors, 181–84 Performance emulation, 182–83 Structure, 83 Compression, 105–11 Attack time, 106 Bass, 107 Drums, 107 Multiband, 110–11 Output level (gain), 107 Ratio, 106 Release time, 106–7 Sidechain compression, 107 Stereo compressors, 109 Sustain, 109 Threshold, 106 Vocals, 108 Computer, 10–12 Basic requirements, 10 Computer Audio Research Laboratory [CARL], 343 Condenser microphone, 52–53 Control processors, 221–22 Cook, Perry, 364 Cope, David, 486 Copyright (and sampling), 202 Core Audio, 364 Core Image, 364


Crest factor, 66 Csound, 341 .csd file, 345 .orc file, 345–48 .sco file, 348–51 Basic beep, 347 Csound in Chuck, 387 Filtered pluck, 354–55 GEN routine, 348 MIDI file player, 353 Opcodes, 350–51, 354, 359 Overview of, 344–53 Pitch specification, 349–51 Play audio, 358 Pluck, 351 Processing audio, 359–60 Real-time performance, 352–53 Record audio, 358 Reson, 354–55 Resynthesis, 359–60 Widgets, 356–58 Cubase, 93 Cue track, 465–66

D Dannenberg, Roger, 475 Davidovsky, Mario, 465 Synchronisms, 465 DAW [Digital Audio Workstation], 93, 123–45, 165, 167, 322 Automation, 303–4 Composition, 137 File management, 126–27 Projects, 125–26 Tracks, 128–37 Troubleshooting, 136–37 dB, see Decibels Decibels, 32–34 Delay, 111–12 Basic echo, 111 Chorus, 112 Comb filtering, 112 Flanging, 111

Phasing, 112 Resonance, 112 Detecting events, 476–81 Detuning, 138, 228, 264, 276 DiCicco, Oliver, 454 Digital Audio Workstation, see DAW Digital Signal Processing, see DSP DirectX plug-ins, 93 Dissonance, 46–47 Distortion, 5–6, 107–8, 117–21 Decimation, 118 Harmonic vs inharmonic, 5–6 Nonlinearity, 118 Overdrive, 121 Rectification, 119–21 Resampling, 118 Waveshaping, 118–19 DSP [Digital Signal Processing], 80–81, 93–121 Duration Change, 113 Dynamic microphone, 51–53 DX7, 267, 270

E Editing audio, 74–80 Actions, 78–79 Advanced techniques, 85–89 Assembly, 83–84 Case study, 89–92 Creating new sounds, 88 Critique, 84–85 Exercises, 82–92 Planning structure, 83 Editing window, 74–75, 125–27 EFM1 synthesizer, 271–72 Electret microphone, 52 Electroacoustic performance, 465–87 Algorithmic performance, 481–86 Amplified performance, technical problems of, 468 Composing the performance, 486–87 Ghost scores, 473–75


Looping, 468–71 Pitch detection, 481 Prerecorded accompaniment, 465–68 Preset control, 471–73 Rhythm patters, 479–81 Transformations and process, 471–75 Triggering sounds and processes, 475–81 Electroacoustic performer(s), 447–63 Circuit bending, 452 Ergonomics, 458–59 Guitars, 449–50 Keyboards, 448–49 Laptop (computer), 457–59 Mixing, 455–56 Percussion, 451 Processors, 456–57 Set up, 460–63 Sources and controllers, 448–54 String interfaces, 450–51 Wind interfaces, 450 Elsea, Peter (composer), 470 Requiem for Guitar, 470 Elsea, Veronica, 470 E-mu Systems, 187 Emulator, 187 Ensoniq, 187 Envelope generator, 197–98, 219–21, 243–45, 249–52, 388–90 Envelope (of sound), 34–35, 248–51, 302, 316 Envelope (in synthesis) Basic functions, 219–20 ChucK envelope (ADSR), 367 Csound envelope (linen), 347–48 FM synthesis, 244–45, 271 Graphic, 248–52 Kyma, 436–37 Max (line~), 410 Sample envelopes, 195–96, 198–200 Techniques, 303 EQ, see Equalization Equal temperament, 29

Equalization [EQ], 81, 94–103 All-pass filter, 97–98, 332 Applications, 102–3 FFT EQ, 100–2 Graphic EQ, 98 High-pass filter, 95–98 Low-pass filter, 95–98 Parametric EQ, 98–100 Shelving, 97 Using EQ, 102–3 Equipment layout, 18–19 Equipment wiring, 20–22 Ergonomics, 19, 458–59 Evolver, 431–32 Excitator, 336 Expansion, 111 Exponential curve, 33–34 F Fade in and out, 80 Fairlight, 187 Fast Fourier transform, see FFT Feedback, 284–85 Felici, Luigi, 331 FFT [Fast Fourier Transform], 310–12, 324, 359, 412 FFT EQ, 100–2 Filter, 218–19, 245–46, 257–61, see also Equalization All-pass, 258, 332 Analysis, 312–13 Band-pass, 257–58 ChucK, 388–90 Comb, 261 Csound, 353–56 High-pass, 218–19, 257 Keyboard tracking, 245 Low-pass, 218–19, 257 Notch, 259–61 Resonance, 218 Use in synthesis, 245–47, 257–62 Flanging, 111 Fletcher-Munson curves, 32


Flowchart, 355, 393 FLTK [Fast Light Toolkit], 356 FM [Frequency Modulation], 230–32, 267–305 FM Synthesis, 267–305 Additive, 292–95 Analog, 230–32 Brass, 288–92 Carrier at zero, 277–79 Choir, 292–96 Evolution, 302–5 Feedback, 284–86 Fixed carrier, 276–77 Inharmonic ratios, 276 Math, 268–70 Noise synthesis, 287–88 Ratio 1:1, 271–73 Ratio 1:10, 274 Ratio 6:1, 274–76 Refinements, 288–302 Stacked modulators, 286–87 Strings, 296–300 Velocity response, 300–2 Voice, 292–96 FM8 synthesizer, 279–82 Foreground, 138–39 Formant, 31, 89, 200, 265–66 Fourier series, 307–8 Fourier transform, see FFT Frequency, 28–29 Frequency modulation, see FM Frequency response, 4–5 Frequency shifting, 114, 255–56 Function generators, see Envelope generator G Gain change, 80 GarageBand, 142 Garton, Brad, 363 Gating, 111 Gaussian distribution [bell curve], 482–83


GEN routine, 348 General MIDI, 162 Ghost score, 473–75 Gladwell, Malcolm, 463 Outliers, 463 Glass, Philip, 142 Granular synthesis, 324–29 Guitar, 331, 449 H Hammond organ, 319 Hardware, 421–43 Advantages of, 421–22 Architecture, 426–28 Classic instruments, 422–31 Editing, computer assisted, 428–29 Editing sounds, 425–26 MIDI, 424 Production methods, 429–31 Hardware accelerators, 432–34 Harmonic distortion, 5–6 Harmonic series, 29–30 Headphones, 6–7 Hop size, 311 Horn, 335 I Image Synth, 322, 324 Impedance, 58 Impromptu, 364 Inharmonic distortion, 5–6 Institutional studios, 21 Interval, 29–30 Intonation, 141–42 Invert audio clip, 81 IRCAM, 393–94 J Jitter, 414–18, see also Max/MSP/Jitter Matrix, 414 Pwindow, 415 Visualizer, 416–18 Joystick, 223, 315


K Karplus-Strong algorithm, 350–53, 387 Keyboard, 8 Analog, 222–23 Basic patch, 225 Connections, 156, 166 Max object (kslider), 398 MIDI, 151, 156–57 Performance criteria, 448–49 Sampler, 187 Synthesizer, 206 Used, 422 Keymap, 189, 191–92 Klingbeil, Michael, 313 SPEAR [Sinusoidal Partial Editing Analysis and Resynthesis], 313, 315 Korg, 315 Kyma, 434–43 Basic beep, 436–37 Editing sounds, 434–35 Group Additive Synthesis, 440 MIDIVoice, 437–38 RE synthesis [Resonator/Exciter synthesis], 440 Samples file(s), 434 Spectral resynthesis, 439–40 Tau editor, 441– 42 VCS (Virtual control surface), 434–36 Vocoding, 440–41 L Labeling, 462 Lansky, Paul, 363 Latency, 62–63, 124–25 Layering, 137 Level, see Audio level Level meters, 42–45 LFO [low frequency oscillator], 198–99, 213, 220 Envelope, 251–52 FM, 273, 297 Hardware, 426

Parameters, 243 Remedy synthesizer, 239–41 Sample and hold, 233–34 Vibrato, 228–29 Waveforms, 199 Limiter, Limiting, 66–67, 104–5 Lips module, 334 Listening levels, 21 Live coding, 363–92 Logarithmic curve, 33–34 Logic (sequencer), 171 EXS24, 188 Looping audio, 142–43 Looping (performance technique) Closed loop, 469–70 Open loop, 406–8, 470–71 Looping samples, 191–94 Loudness, 31–32, 138 Low frequency oscillator, see LFO Lucier, Alvin, 469 I Am Sitting in a Room, 469 M Malström, 327 Markov chain, 384–86, 485–86 Masking, 47–48, 139–40 Mathews, Max, 342–43, 393 Max signal processing [MSP], see Max/MSP/Jitter Max/MSP/Jitter, 342, 364, 393, 454, 468, 473–74, 476, 481 Audio visualizer, 416–18 Automatic note generation, 399 Basic beep, 410 Basic elements, 394–406 Chaotic player, 403–4 Decisions, 397–98 Encapsulation, 405–6 FFT, 412–14 Interaction, 398–99 Jitter graphics, 414–18 MSP audio and synthesis, 406–14


Playing audio, 408 Polyphonic synthesis, 410–12 Preset stepping, 474 Recording audio, 408–10 Rhythm detection, 479–81 Rhythm generation, 401–3 Tempo detection, 476–79 Timing, 401 McCartney, James, 363 McCluhan, Marshall, 149 McMillen, Keith, 451 Mellotron, 187 MetaSynth, 322–24 Microphone, 51–59 Controls, 56–57 Directionality, 53–56 Impedance, 58 Mounts, 58 Polar response, 53–56 Preamplifiers, 58–60 Proximity effect, 56 Types, 51–53 Microphone technique, 68–72 MIDI, 151–85, 251, 293, 352–53, 396–98, 424, 448–49, 468, 472–73, 476, 483 Aftertouch, 150, 159 Connections, 153–55 Editors, 181–84 Future MIDI, 163–64 General MIDI, 162 History of, 151–53 Interfaces, 11–12, 154–55 Mergers, 154 Network, 154 Notation and, 162 Playback, 169–70 Recording MIDI data, 168–69 Standard MIDI files, 161–62 Step entry, 170 MIDI effects, 171 Arpeggio, 171 Echo, 171


MIDI Manufacturers Association, 157 MIDI messages, 155–61 Active sensing, 160 Aftertouch, 159 Channel messages, 156–57 Control change, 157–58 MIDI time code, 160 Note on and off, 156–57 Pitch bend, 159 Program change, 158–59 Real-time messages, 160 System exclusive [Sysex], 161, 428, 430–32 System messages, 160 MIDI notes, editing of, 171–81 (See also MIDI sequencer) MIDI sequencer, 16–17, 165 Event list window, 172–73 Hardware setup, 166 MIDI Effects, 171 Notation window, 177–79 Note editing, 171–81 Percussion display, 175–76 Piano roll window, 173–75 Playback, 169–70 Quantization, 176 Recording, 168–69 Region editing, 170–71 Synchronizing movies, 179–80 Tempo tracks, 179 Transforms, 176–77 MIDIVoice, 437–38 MiniAudicle, 365 Minimoog, 151, 157, 350, 421 Pitch bend, 159 Mix and balance, 143–44 Mixer, 7, 132–36, 221 Analog architecture, 132–34 Automation, 136 Control surfaces, 12, 135–36 Digital emulation, 134 Stage, 455


Modeling synthesis, 329–38 Acoustic modeling, 330–31 Analog modeling, 329–30 Pipe model, 333–35 String model, 331–33 String Studio, 336–38 Modelonia, 331–36 Modulation, 228–31 Amplitude modulation, 228–30 Balanced modulation, 230 Feedback modulation, 284–85 Frequency modulation, 230–31 Index, 268 Ring modulation, 114, 230 Modulator, balanced, see Ring modulator Modulator wave, 228–29, 243, 268 see also FM synthesis Modules (synthesizer), 211–22 Amplifiers, 217–18 Control processors, 221–22 Envelope (function) generators, 219–20 Filters, 218–19 LFO [Low frequency oscillator], 220 Mixers, 221 Noise generator, 217 Oscillators, 212–17 Sample and hold, 220–21 Monitor, stage, 456 Monitor speakers, studio, 7, 18–19 Monk, Meredith, 142 Moog, Robert, 152 Moog (synthesizer), 227 Morph, 313, 318–20 MP3, 324 MSP, 406–14 Csound (third–party object), 414 Polyphony, 410–12 Synthesis, 410 Multiband compressor, 110–11 Multiple carriers in FM synthesis, 287 Multitrack production programs, 15–16 Munchkin (Chipmunk) effect, 89

Muse research, 433 MUSIC (program), 343–44 Cmix, 363 MUSIC V, 343 Music Plus One, 475 N Native Instruments, 188, 279 Absynth, 246 FM8, 279–80, 285, 291–92, 296, 303 Kontakt, 188–89 New Interfaces for Musical Expression, see NIME Nierhaus, Gerhard, 486 NIME [New Interfaces for Musical Expression], 452–54 Noise, 6, 28–29, 287, 309–10 Background noise, 70 Computer noise, 11 Noise generator, 217, 335 Normalize, 80 Note generation, 232–34 MIDI keyboards, 232–33 Sample-and-hold patterns, 233–34 Nymoen, Kristian, 453 Nymophone2, 453 O Oberheim, 421 Object oriented programming, 366–67 Oliveros, Pauline, 451 Omnidirectional microphone, 53–56 Opcode, 346, 393 Open Sound Control, 363, 454 Open source software, 13 Operator (FM), 271 Oscillators, 212–17, 241–43 Analog oscillator, 214–15, 329 Differences between analog and digital, 217–19 Low frequency oscillator, 220 Modes, 253 Waveforms, 213–16, 252 Overdrive, 121, 246–47


P Pacarana, 434 Patch, 209–11, 280, 404–6 Amplitude modulation, 228–29 Basic beep, 224–25, 232 Fat patches, 227 Filter patches, 225–26 FM [frequency modulation], 231–32 Vibrato, 228 Patchable synthesizers, 237–38 Patcher (for Max), 394–95 Paul, Les, 468 PeakLE, 73–76 Percussion interface, 451–52 Pd [Pure Data], 393, 418–19, 475 Performance, 445 Computer listening, 476–81 Equipment, 448–55 Looping, 468–71 Preparation, 460–63 Processing, 471–75 Scores 465–68 Setup, 455–60 Phase, 36 Fourier series, 308 Phasing effect, 112 In wave mixing, 37–38, 262–63 Phase vocoder, 320, 343–42, 359–60 Pipe model, 333–35 Pitch, 29–30 Pitch change, 81, 113, 325–26 Playlists, 81–82, 125 Plug-ins, 93–94 Polyphony, 196, 232–33 Practice routine, 463 Preamplifier, 7, 58 Presley, Elvis, 468 Preset bank, see Program bank Pro Tools, 93, 126, 129–30 Program bank, 159, 432 Programming Voices, 195–201 Connections, 201 Control sources, 197–98 Mapping, 195–96


Processes, 198–201 Processing samples, 196 Project, 125 Propellerhead Software, 327 Reason, 327 Proteus synthesizer, 188 Proximity effect, 56 Puckette, Miller S., 393, 418, 476 Pulse wave, 213, 241 Pulse width modulation, 243 Pure data, see Pd PVOC, see Phase vocoder Python, 363 Q Quantization, 176 QuickTime, 364, 415–16 QuteCsound, 344–45 R Rachmaninov, [Sergei], 359–60 Rack gear, 8–9 Range of hearing, 28 Raphael, Christopher, 475 Ratio (in compression), 106–8 Ratio (in FM), 271 Real-Time AudioSuite, see RTAS Real-Time Granular Synthesizer X, see, RTGS-X Real-time performance, 352–53 Coding for, 363–92 Receptor, 433 Recorders (audio), 59 Input and output conversion, 60–63 Recording software and controls, 65–67 Storage media, 63–64 Recording audio, 51–72 File format, 63 Latency, 61–62, 124–25 Process, 67–68, 126–27, 129–30 Quality, 69–70 Stereo recording, 71–72 Storage media, 63


Recording software, 14–16 Rectification, 119 Regions (in DAWs), 130–32, 167 Bouncing, 132 Editing, 131–32 Modifying MIDI regions, 170–71 Processing, 132 Trimming, 131 Repetition, 142–43 Resampling, 118 Research synthesis software, 341–42 Resonance, 31 Resonance, filter, 218, 245 Resynthesis, 318–20, 359–60, see also Additive synthesis Reverberation, 114–15 Decay time, 116 Diffuse reverberations, 115 Diffusion or density, 117 Direct sound, 115 Early reflections, 115, 117 Hi EQ, 117 Impulse response reverb, 117 Mix or balance, 116 Pre delay, 117 Width, 117 Reverse audio clip, 81 Rhythm, 35 Rhythmic accuracy in editing, 86, 140–41 Rhythmic loops, 87, 201–2 Ribbon controller, 223 Ribbon microphone, 51–52 Ring modulation, 114, 230, 255–56 Roland Juno keyboard, 151, 451 RTAS [Real-Time AudioSuite], 93 RTcmix, 363 RTGS–X [Real-Time Granular Synthesizer X], 326–29 S Sample and hold, 220, 233–34 Sample rate, 60

Sample word size, 60–61 Sampler, 153, 187–202 Sample, 187 Compression, 191 DC offset removal, 191 Envelope, 194–95, 198–200 Fade-in, 190 Level Adjustment, 190 Looping, 191–94, 201–2 Noise removal, 190–91 Pitch, 195 Pitch correction, 190 Recording, 189–90 Truncation, 190 Sampling and copyright, 202 Sampling Paradigm, 188–89 San Francisco Tape Music Center, 187, 451, 469 Sawtooth wave, 213–14, 241 Scheme, 364 Scratch track, 183 Sequencer module, 223–24 Sequencers, see also MIDI sequencers Sequential circuits, 151, 421 Prophet-5, 151 Shareware, 13 Sidebands, 114, 228–32, 243, 256, 268 Signal path, 9–10 Signal quality, 4 Silver Chordz, 300 Sine wave, 28, 36–39, 213–14, 307–8 Smalltalk, 80, 434 Smith, Dave, 151, 315, 431 SMPTE time format, 179–80 Soft Strings, 296 Software, 12–13 Editing, 14–16 MIDI sequencing, 16–17 Multitrack production, 15–16 Recording, 14–16 Synthesis, 17, 237–38 Utilities, 17–18 Sorenson, Andrew, 364


Sound, 27–49 Ambience, 71 Association, 82–83 Blends, 46 Coloration, 70–71 Comparing sounds, 45–48 Complexity, 143 Components, 27, 309–10 Envelope, 138–39 Harmonic series, 29–30 Level meter, 42–45 Loudness, 31–34 Masking, 47, 139–40 Matching, 45–46 Multiple sources, 262–65 Pitch, 29–30 Resonance, 31 Spectrum, 37–42 Timbre, 27–29 Waveform, 35–37 Sound files, 73–78 Edit lists, 79–80 Editing, 78–79 Sound Sources, 262–66 Detuning, 263 Formants, 264–65, 295 Phase, 262–63 Transformation, 264 Transients, 264, 310 Sound synthesis programs, 17–18 Sound system, 4–9 SPL [Sound pressure level], 32–33 Speakers and amplifiers, 6 SPEAR [Sinusoidal Partial Editing Analysis and Resynthesis], 313, 315 Spectral balance, 139–40 Spectral graph, 37–42, 308–9 Spectral synthesis, 320–24 Spectrograph, see Spectral graph Spectrum, 38–32, 308–9 Spectrum analyzer, 269–70 Spectrum manipulation, 253 Spectrum Synth, 322, 324


Speed change, 88–89 Splicing, see Editing audio Stanford University Center for Computer Research in Music, 343 Star Trek, 454 Stereo recording, 71–72 Stereo track, 128 STK (Synthesis toolkit), 391–92, 414 Stockhausen, Karlheinz, 469 Solo, 469 Streaming, 48 String interface, 450–51 String model, 331–32, 336–38, 350–51 String Studio, 336–38 Studio, 3–23 Institutional studios, 21–22 Subotnick, Morton, 473 Axolotl, 473 Ghost score, 473 Passages of the Beast, 473 Subtractive synthesis, 205, 238–47 SuperCollider, 363–64 Synchronization, 140–41 Synthesis, 205–338 Advanced, 338 Basic beep, 224–25, 232 Computer based, 206–7 Fat patches, 227 Filter patches, 225–26 Fundamental patches, 224–34 Fundamentals, 209–35 Granular, 324–29 Methodologies, 205–6 Modeling, 329–38 New approaches to, 207–39 Research-style, 341–42 Spectral, 320–24 Vibrato, 228 Synthesizers, 209–11, 222–25 Analog, 290 Architecture of, 239–40 Designs, 207–7 Initial assessment, 167–68


Joystick, 223, 315 Keyboard, 222–23 Modern synthesizer, 209, 265–66 Modular synthesizer, 209–10, 223, 329 Module overview, 211–22 Patches (for modular synthesizer), 210–11, 224–34 Performance interfaces, 222–24 Ribbon controller, 223 Step sequencer, 223–24 Sysex [MIDI system exclusive message], 161, 428, 430–32 T Tassman, 336 Threshold (in compression), 106–8 TimewARP 2600 synthesizer, 329 Tone module, 152, 424 Tracks, 128–37 Arm for recording, 129 Audio level meter, 128–29 Input connection, 128 Mute, 129 Solo, 129 Transcribe, 189 Transfer function, 118–19, 255–56 Transient (audio), 264, 310 Transient generator, see Envelope generator Transport controls, 74 Triangle wave, 213–15, 241 Trueman, Dan, 364 Trumpet, 288, 318 Tuning, 29, 141–42 U U&I Software, 322 Unit generator, 387–92 USB, 155 V VCA [Voltage controlled amplifier], 217–18

VCF [Voltage controlled Filter], 218–19 VCO [voltage controlled oscillator], 212–13, 239 Vector synthesis, 315 Velocity, 156–57, 170, 174, 177, 187, 198, 352, 370, 396, 398 Vercoe, Barry, 344 Vienna Instrument Library, 190 Violin, 297 Waveform, 39 Zeta violin, 451 Virtual Machine [VM], 364–65 Virtual Studio Technology, see VST Visual Programming Language, see VPL VM, see Virtual Machine Vocal Chipmunk effect, 89 Conpression, 108–9 EQ, 103 Formant, 31, 266 Masking, 140 Performance, 454 Processing, Synthesis, 292–96 Vocoder, 103–4, 320, 412 Phase vocoder [PVOC], 312–13, 343–44 Voice, human, 29, 31, 32 Voice, synthesizer, 130, 195, 426–27 Voicing Synthesizers, 237–66 Voltage controlled amplifier, see VCA Voltage controlled Filter, see VCF Voltage controlled oscillator, see VCO VPL [Visual Programming Language], 393 VST [Virtual Studio Technology], 93, 433 VU meter, 42 W Wang, Ge, 364 Waveform, 36–37 Combining, 38 Formant wave in FM, 282


Formula, 307–9 Modifying, 254–56 Noise, 40 Oscillator, 213–16, 241, 252–53 Rich waveforms in FM, 282–83 Sine, 36 Violin, 39 WavePad (application), 67–68 Waveshaping, 118–20, 254–55 Widgets, 356–58 Wiercks, Marcel, 326 WinXound, 344 Wind interface, 450 Windowing, 311, 324–25 Wiring, 20–21

X XML, 344 Y Yamaha, 267, 315, 426, 450 TX81Z, 282 Yamaha X instruments, 267 Z Zeta violin, 451 Zicarelli, David, 393
