A surround sound speaker configuration consisting of five speakers arranged in specific positions along the circumference of a circle, and a subwoofer (the “.1”). The speaker channels are typically designated as follows: left, center, right, left surround, right surround, and LFE (low-frequency effects).
Sometimes written as Q8.24 or fx8.24. A fixed-point sample size used as the canonical audio sample type for processing linear PCM audio in iPhone OS, in lieu of 32-bit floating point samples. In an 8.24 audio sample there are eight bits to the left of the radix point, forming the integer (or “magnitude”) portion of the value, and 24 bits to the right, forming the fractional portion.
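As a rough illustration of the 8.24 layout described above, the following sketch converts between floating-point values and 8.24 integers. It assumes a signed, two's-complement 32-bit container and saturating conversion; the function names are hypothetical and are not part of any Apple API.

```python
FRACTION_BITS = 24
SCALE = 1 << FRACTION_BITS          # 2**24 steps per unit
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1

def float_to_fixed_8_24(x: float) -> int:
    """Convert a float to an 8.24 integer, saturating at the 32-bit limits (assumed behavior)."""
    raw = int(round(x * SCALE))
    return max(INT32_MIN, min(INT32_MAX, raw))

def fixed_8_24_to_float(raw: int) -> float:
    """Convert an 8.24 integer back to a float."""
    return raw / SCALE
```

For example, float_to_fixed_8_24(1.0) returns 16777216 (0x01000000): the integer portion holds 1 and all 24 fractional bits are zero.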
A compressed, lossy, perceptual coding scheme, originally a component of the MPEG-2 standard as MPEG-2 AAC. Defined in 1997 as part of ISO/IEC 13818-7. Enhanced for the MPEG-4 standard as MPEG-4 AAC. MPEG-2 AAC provides better perceived audio quality at the same bit rate compared to MPEG-1, layer 3 (MP3), according to results published in ISO/IEC JTC1/SC29/WG11, N2006 (February 1998). MPEG-4 AAC extends MPEG-2 AAC with additional coding tools. See also lossy compression.
A compressed, lossy, perceptual audio coding format developed by Dolby Laboratories, Inc. Sometimes called Dolby Digital or Dolby Surround AC-3. See also lossy compression, perceptual coding.
In iPhone OS, used to describe an audio session state in which playback or recording can proceed. Compare inactive.
Circuitry that converts analog signals to corresponding digital code using sampling and quantization. ADCs are characterized by sample rate, amplitude resolution in terms of bit depth, quantization error and other distortion characteristics, and noise floor. Professional audio work usually employs ADCs with a linear response. Compare DAC (digital-to-analog converter). See also quantization, sample.
A variant of pulse-code modulation (PCM), and an extension of DPCM (differential pulse code modulation), that varies quantization step size to minimize bit rate for a given dynamic range.
An international society of audio professionals that has established many important standards related to digital audio.
A digital audio transport standard defined by the Audio Engineering Society, originally published in 1992. Also called the AES/EBU interface. Equivalent to IEC 60958 Part 4. The AES-3 standard includes parts for various physical connections including balanced twisted-pair wire, unbalanced coaxial cable, and optical fiber. The technical inspiration for AES-3 was the S/PDIF (Sony/Philips Digital Interface) standard.
An alternate name for AES-3. See also EBU (European Broadcasting Union).
A set of two or more audio devices interconnected to allow the set to be addressed by software applications as a single device. See also device.
A digital audio file format developed by Apple, Inc., based on the Interchange File Format (IFF) developed by Electronic Arts, Inc. The audio data in an AIFF file is uncompressed, big-endian PCM and is stored in chunks. See also chunk, linear PCM.
An extension of AIFF that supports storage of either compressed or uncompressed audio data. May also be abbreviated as AIFF-C. With the availability of newer audio compression schemes such as MP3 and AAC, AIFC is rarely used. It is still supported in Mac OS X.
Distortion resulting from sampling a signal containing energy at or above the Nyquist frequency. In audio, aliasing results in artifacts below the Nyquist frequency, sometimes called aliasing distortion. To avoid aliasing, audio signals must be low-pass filtered to remove energy at or above the Nyquist frequency before sampling.
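To make the folding behavior concrete, here is a minimal sketch that computes where a pure tone appears after sampling without an anti-aliasing filter. The function name is hypothetical, and the model assumes an ideal sampler and a single sinusoid.

```python
def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling at sample_rate_hz."""
    folded = signal_hz % sample_rate_hz           # wrap into [0, sample_rate)
    nyquist = sample_rate_hz / 2
    # Anything above the Nyquist frequency mirrors back below it.
    return sample_rate_hz - folded if folded > nyquist else folded
```

Under these assumptions, a 30,000 Hz tone sampled at 44,100 Hz shows up at 14,100 Hz, while a 1,000 Hz tone is unaffected.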
Apple’s universal audio file format. Apple Core Audio Format is sometimes called Core Audio Format or CAF. CAF files are chunk-based and can contain AAC, MP3, and PCM audio data, among many other audio data formats, as well as MIDI data. See also chunk, pulse-code modulation (PCM).
A compressed, lossless digital audio encoding format defined by Apple, Inc. See also lossless compression.
In Audio Queue Services, describes one of two ways to stop an audio queue. Asynchronous stopping happens after all queued buffers have been played or recorded. In digital communications, a transmission method that does not require the clock frequency of the sender and receiver to be the same. Compare synchronous.
In Core Audio, a software object of type AudioFileStreamID, which represents data obtained from a TCP stream and supports manipulation of that data. See also TCP (Transmission Control Protocol) stream.
A representation of a signal chain comprising an interconnection of audio units. Also called an AUGraph or graph. Core Audio represents such an interconnected network as a software object of type AUGraph. Audio processing graphs must end in an output unit. See also audio unit.
In Audio Queue Services, a software object of type AudioQueueRef, used for recording or playing back audio. There are two distinct types of audio queue. A recording (sometimes called input) audio queue typically accepts incoming audio from a hardware device and uses a callback function on its output side. A playback (sometimes called output) audio queue has a callback on its input side, and typically sends its output audio to external hardware.
In Audio Queue Services, a data structure used as a container for transient blocks of audio data being played or recorded. An audio queue buffer is managed by the audio queue that owns it.
An iPhone OS software abstraction that represents audio behavior for an application, in context, on an iPhone or iPod touch. An audio session has a category and can be active or inactive.
See category.
A Component Manager–based plug-in that adds an audio feature to a Mac OS X application. Audio units can provide effects such as filtering and reverb, MIDI-based music synthesis, audio data format conversions, mixing, panning, sound generation, and audio playback. Unlike application-specific plug-ins, audio units are available systemwide. Multiple instances of a single audio unit can run simultaneously.
An Apple-supplied audio unit used to interface with hardware input or output, so named because it interacts with the Hardware Abstraction Layer (HAL).
The AV/C standard, published by the IEEE (Institute of Electrical and Electronics Engineers), provides a music and audio device command protocol over FireWire (IEEE 1394) connections.
Describes an encoded audio representation that, while allowing variations in bit rate from frame to frame, maintains a specific average bit rate over a long time interval (typically between 10 and 60 seconds). You can use ABR-savvy encoders to fit a recording into a predetermined file size. Compare constant bit rate (CBR), variable bit rate (VBR).
A chunk-based, container file format defined by Microsoft Corporation in 1992. AVI is a specialization of the RIFF (Resource Interchange File Format) format, which in turn is based on IFF (Interchange File Format).
In surround sound and immersive audio, the real or apparent horizontal angle of an audio source referenced to a line drawn from the listener’s head to a point directly ahead of the listener.
1. In analog audio, the width of a frequency band for a transmission channel, from a lower to an upper frequency limit. The limits are defined in terms of signal attenuation, in decibels, relative to the level at the center of the band. See also decibel. 2. In digital data transmission, the available data throughput for a transmission channel. Digital bandwidth is typically expressed in terms of bits or bytes per second. See also bit rate.
The basic time unit of a musical piece; typically, the bottom number in a time signature. Core Audio’s music player uses the notion of beats in the tempo track.
Sample resolution; the number of bits per sample. Along with some other factors, bit depth determines the dynamic range of a digital system.
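As a hedged illustration of how bit depth relates to dynamic range, the sketch below uses the common idealization (not stated in the entry above) that an N-bit linear quantizer spans a ratio of 2^N between full scale and one quantization step, which works out to roughly 6.02 dB per bit. Real systems fall short of this figure because of the "other factors" mentioned above.

```python
import math

def ideal_dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of an ideal linear quantizer, in decibels."""
    return 20 * math.log10(2 ** bit_depth)
```

Under this idealization, 16-bit samples give about 96.3 dB and 24-bit samples about 144.5 dB.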
The data rate (or bandwidth) of a digital channel, in bits per second.
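For uncompressed linear PCM, the bit rate follows directly from the stream parameters. The helper below is an illustrative sketch with a hypothetical name; it ignores any container or transport overhead.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channel_count: int) -> int:
    """Bits per second for uncompressed linear PCM, excluding container overhead."""
    return sample_rate_hz * bit_depth * channel_count
```

For example, 44,100 Hz, 16-bit stereo audio yields pcm_bit_rate(44100, 16, 2) == 1411200, or about 1.4 megabits per second.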
Memory assigned to temporarily hold data between a source and a destination. For example, Core Audio uses buffers to supply audio to, and receive audio from, audio units. See also audio queue buffer.
In Audio Queue Services, an ordered list of audio queue buffers used by an audio queue. See also audio queue, audio queue buffer.
See element.
Also called audio session category. In iPhone OS, a collection of audio behaviors for an application. For example, a category specifies whether an application intends to mix its audio with other applications or silence them. You specify your application’s category after initializing its audio session.
The maximum allowable signal level in an audio system. The ratio of the ceiling to the noise floor is the dynamic range. Also called dynamic ceiling.
A discrete track of audio. A monaural recording or live performance has exactly one channel. A stereo recording or live performance has two channels. A multitrack recording or performance can have any number of channels. Between audio units, a connection has one or more channels. See also channel layout.
A description of the playback roles for the channels in an audio recording. For example, in a stereo recording, channel 1 has the role of “left front” and channel 2 has the role of “right front.”
A linear block of data consisting of a short, descriptive header followed by the described data. A chunk-based file is an on-disk file laid out as a series of chunks.
The descriptive, metadata section at the start of a chunk. Each element of information in a chunk header is called a field.
The data content of a chunk. The format of the data depends on the chunk type, as specified in the chunk header.
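The header-then-data layout of a chunk can be sketched as a small parser. This example assumes an IFF-style header (a 4-byte type identifier followed by a 4-byte big-endian size, with odd-sized data padded to an even boundary); other chunk-based formats, such as CAF, use different header fields, so treat the layout and the function name as assumptions.

```python
import struct

def iter_chunks(data: bytes):
    """Yield (chunk_id, chunk_data) pairs from a buffer of IFF-style chunks."""
    offset = 0
    while offset + 8 <= len(data):
        # Chunk header: 4-byte type field, then 4-byte big-endian size field.
        chunk_id, size = struct.unpack_from(">4sI", data, offset)
        body = data[offset + 8 : offset + 8 + size]
        yield chunk_id, body
        # Skip the data plus one pad byte when the size is odd (IFF convention).
        offset += 8 + size + (size & 1)
```

A caller would typically dispatch on chunk_id and interpret chunk_data according to that chunk type.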
Distortion of a waveform resulting from the limiting of signal amplitude to a specific level. See also distortion.
The regular, periodic signal in a digital audio system used to pace audio recording and playback.
The deviation, over time, of one clock relative to another, due to differing counting rates. Clock drift interferes with synchronization.
Extracting and reconstructing timing information from a data stream.
A generic term applied to, among other things, lossy and lossless audio compression technologies implemented in hardware or software. Encoded data can be wrapped in a file format appropriate for the data, or decoded from such a file format. For example, the MP3 file format is a wrapper that can hold perceptually-encoded audio data.
In Mac OS X, a plug-in whose interface is defined by the Component Manager. An audio unit is a component.
Hardware or software that implements either data compression or level compression. A data compressor, along with its corresponding decompressor, is sometimes referred to as a codec.
In Core Audio, a hand-off point for audio data entering or leaving an audio unit. A connection has one or more channels. See also channel.
A data encoding scheme that can be used to stream audio data over a channel at a constant bit rate while supporting real-time decoding. In most cases, packet size is constant in CBR streams. In the case of constant bit rate AAC streams, packet size may vary slightly. Some encoding schemes, such as PCM, support only CBR encoding. Compare average bit rate, variable bit rate (VBR). See also packet.
See magic cookie.
In Core Audio, for a panner unit, a parameter that specifies the maximum value for the distance parameter, in meters.
A set of iPhone OS and Mac OS X frameworks that provides audio services (depending on the platform) that include recording, playback, synchronization, signal processing, format conversion, panning and surround sound, hardware abstraction, and others.
A Mac OS X framework for controlling and communicating with MIDI devices.
Circuitry that converts digital data to a corresponding analog signal. DACs are characterized by maximum sampling frequency, amplitude resolution in terms of bit depth, monotonicity, distortion characteristics, and noise floor. Compare ADC (analog-to-digital converter).
Algorithmic reduction of data size to improve storage or transmission efficiency. Data compression can be lossy or lossless. Compression is a special case of encoding. See also lossless compression, lossy compression, perceptual coding.
See decibel.
An absolute measure of RMS voltage level in decibels relative to 0.775 Volts RMS. dBu measurements assume a circuit load with infinite impedance. See also RMS (root mean square).
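The relationship between dBu and RMS voltage can be sketched as follows, using the 0.775 V reference given above and the 20-times-logarithm rule for voltage ratios. The function names are hypothetical.

```python
import math

DBU_REFERENCE_VOLTS = 0.775  # RMS reference voltage for 0 dBu

def dbu_to_volts_rms(level_dbu: float) -> float:
    """Convert a dBu level to RMS volts."""
    return DBU_REFERENCE_VOLTS * 10 ** (level_dbu / 20)

def volts_rms_to_dbu(volts_rms: float) -> float:
    """Convert RMS volts to a dBu level."""
    return 20 * math.log10(volts_rms / DBU_REFERENCE_VOLTS)
```

With this reference, a +4 dBu signal corresponds to roughly 1.23 V RMS.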
A dimensionless unit for expressing the ratio of two quantities, abbreviated as dB. The decibel difference between two power levels is equal to 10 times the common logarithm of their ratio. The decibel difference between two voltage levels is equal to 20 times the common logarithm of their ratio. Decibel values are typically associated with a standard voltage or power level. For example, acoustic levels are commonly referenced to 0 dB SPL, equivalent to 20 µPa (micropascals). See also SPL (sound pressure level).
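The two decibel formulas above, plus the SPL reference, can be written out as a short sketch. The function names are illustrative only.

```python
import math

def db_from_power_ratio(power: float, reference_power: float) -> float:
    """Decibel difference between two power levels: 10 * log10(ratio)."""
    return 10 * math.log10(power / reference_power)

def db_from_voltage_ratio(voltage: float, reference_voltage: float) -> float:
    """Decibel difference between two voltage levels: 20 * log10(ratio)."""
    return 20 * math.log10(voltage / reference_voltage)

def db_spl(pressure_pa: float, reference_pa: float = 20e-6) -> float:
    """Sound pressure level relative to 20 micropascals (pressure is a voltage-like quantity)."""
    return 20 * math.log10(pressure_pa / reference_pa)
```

For instance, a hundredfold increase in power is 20 dB, a tenfold increase in voltage is also 20 dB, and a sound pressure of 1 Pa is about 94 dB SPL.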
To retrieve the original signal from an encoded representation of it. For lossy encoding schemes such as MP3, the retrieved signal approximates the original signal. See also codec (coder/decoder), encoding.
An Apple-supplied audio unit that connects with whichever hardware device the user has designated to be the default output.
A synonym for reverse multiplexing. In digital audio, retrieving discrete channels from an interleaved representation. Compare interleaving.
The time lag between one audio event and another. In audio processing, the second event is typically a processed or unprocessed copy of the original event. Delay is a settable parameter in the AUDelay audio unit included in Mac OS X.
In audio generally, a piece of equipment or a software entity that produces, transforms, transmits, receives, or stores audio data. In MIDI, a piece of equipment or a software entity that responds to MIDI control or provides MIDI data. In Audio Queue Services, a source or destination for audio, such as a microphone or a loudspeaker.
A generic term referring to embedded, electronic restriction over the use of electronic content. Usually applied to copyrighted material. See also FairPlay.
In audio, analyzing or transforming digital representations of audio. Such transformations include, among others, filtering and equalization, reverberation, level compression, data compression, and sound effects such as pitch shifting. Digital signal processing can be performed by hardware, software, or a combination of both.
In an audio data stream, a distinct break in the sequence of transmitted data. A discontinuity entails a period in which the stream is undefined. See also TCP (Transmission Control Protocol) stream.
In surround sound and immersive audio, the real or apparent straight line distance of an audio source from the listener.
A difference, typically unintentional and undesired, between the signals on the input and output of an audio device. Commonly measured types of distortion include harmonic distortion, intermodulation distortion, quantization distortion, and jitter. Intentional differences between input and output signals, such as level or equalization differences, are not described as distortion. Compare noise.
Low-amplitude noise applied to a signal to reduce quantization error. See also quantization noise.
A variant of pulse-code modulation (PCM) that encodes the difference between the current and previous sample.
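The difference-coding idea can be sketched in a few lines. This is a simplified, lossless illustration that assumes integer samples, an initial previous-sample value of zero, and no quantization of the differences; practical DPCM coders usually quantize the differences to save bits.

```python
def dpcm_encode(samples):
    """Replace each sample with its difference from the previous sample."""
    previous = 0                      # assumed starting value
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences

def dpcm_decode(differences):
    """Rebuild samples by accumulating the differences."""
    accumulator = 0
    samples = []
    for difference in differences:
        accumulator += difference
        samples.append(accumulator)
    return samples
```

Because neighboring audio samples tend to be close in value, the differences are typically smaller in magnitude than the samples themselves, which is what makes them cheaper to encode.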
A quality measure for an audio device or system that describes the difference between the loudest and softest signal that can appear at the output of the device. Dynamic range is equal to the ratio of dynamic ceiling to noise floor, typically described in decibels. See also ceiling, decibel, noise floor.
A Europe-based, international, audio and broadcasting standards organization.
In Core Audio, an audio unit of type 'aufx' that employs DSP to modify a stream of digital audio. See also digital signal processing (DSP).
In Core Audio, an audio unit programming context nested within a scope. When part of an input or output scope of an audio unit, an element is analogous to a device signal bus—and is sometimes called a bus. See also audio unit, scope.
In surround sound and immersive audio, the real or apparent vertical angle of an audio source referenced to a line drawn from a listener’s head to a point directly ahead of the listener.
Algorithmic conversion of a signal from one representation to another. For example, compressing linear PCM data to AAC format is a form of encoding. Can be applied to perceptual or lossless data compression. See also codec (coder/decoder), decode. Compare data compression.
See MIDI endpoint.
See MIDI entity.
A stream of MIDI or event data which can be played using a music player. See also sequence.
Describes a variable bit rate (VBR) audio format where information about the sizes of the frames is transmitted separately from the audio data stream. Compare internally framed. See also frame.
The digital rights management (DRM) system built into Apple’s QuickTime technology and used by the iPod music player, the iTunes music application, and the iTunes store. These systems use FairPlay to encrypt some AAC files to restrict their playback to authorized devices.
In electronics generally, to direct one output signal to multiple inputs. Audio units cannot perform fan out of this sort. To feed multiple audio unit inputs, you direct an audio unit output to a buffer (such as a splitter unit) that has multiple outputs, each of which can connect to a separate audio unit input.
Apple’s implementation of the IEEE 1394 standard serial bus for connecting digital devices such as cameras and hard drives.
A digital audio sample that uses a fixed-point numerical representation, such as 8.24. Fixed-point samples support fixed-point arithmetic, which is a less computation-intensive alternative to floating-point arithmetic.
In Core Audio, a set of samples that contains one sample from each channel in an audio data stream. In the most common case, the samples in a frame are time-coincident—that is, sampled at the same moment. For example, in a stereo stream each frame contains one sample from the left channel and a time-coincident sample from the right channel. More generally, the various channels in a stream, and therefore in a frame, may be from unrelated sources and may have originated at unrelated times. In video, a single image in a series that constitutes a movie. See also packet.
In Core Audio, the number of frames played per second for an audio data stream. Compare sample rate. In video playback, the number of video frames displayed per second.
The number of times a repeating phenomenon or activity occurs per unit time. The frequency of a sound wave is determined by the number of wavelengths (or fractions thereof) that pass a particular point per unit time. Sampling frequency indicates the number of digital samples taken per unit time. Frequency is typically measured in Hertz (cycles per second).
The ratio of output level to the corresponding input level for a device. Level is typically represented in terms of power or voltage, but gain is unitless and is identical whether voltages or powers were used to calculate it. Because gain is a ratio, it is usually described using decibels. A gain of 0 dB indicates no change in level, while a gain of 10 dB is perceived as approximately a doubling in loudness—depending on the nature of the sound and on the initial loudness.
An object-like interface between Core Audio objects and hardware. The hardware abstraction layer typically addresses hardware by means of an I/O Kit driver, but this is not a requirement. The HAL gives applications a consistent way to communicate with external devices—insulating them from the complexity of addressing multiple, specialized hardware drivers.
The final node in an audio processing graph in terms of signal flow; the output node of a graph. See also audio unit, node.
The range, expressed in decibels, between a standard reference signal level and the maximum allowable signal level (the ceiling). See also dynamic range.
A Mac OS X application that loads and uses audio units. See also audio unit.
The clock time used by the computer running an audio application.
Also called Anatomical Transfer Function, or ATF. A mathematical description of the frequency and phase filtering that takes place when an acoustic signal impinges on a person’s head and pinnae. The HRTF is used in DSP to add spatialization information to a signal. See also digital signal processing (DSP), panning, spatialization.
An international standards organization, founded in 1906, that collaborates with ISO (International Organization for Standardization) on defining a wide variety of perceptual coding formats.
An organization of electronics professionals that has established many technology and audio-related standards. Pronounced “eye triple-e.”
See FireWire.
A flexible, chunk-based file format for storing media content. Developed by Electronic Arts, Inc., and the technical inspiration for Apple’s AIFF (Audio Interchange File Format).
IMA is the abbreviation for Interactive Multimedia Association. A lossy, 16-bit audio compression format that provides 4:1 compression. The format is sometimes referred to as IMA or IMA4. See also ADPCM (adaptive delta pulse code modulation).
Sound reproduction or generation that seems to surround a listener. See also surround sound.
In electronics, the amount of opposition a circuit presents to an AC (alternating current) signal at a given frequency. Impedance includes both a resistive (frequency-independent) and a reactive (frequency-dependent) component. In acoustics, the ratio of average sound pressure to particle velocity over a given surface area and at a given frequency.
In iPhone OS, used to describe an audio session state in which playback or recording cannot proceed. Compare active.
In Core Audio, to configure an audio unit for use.
Also called recording audio queue. See audio queue.
In Core Audio, an audio unit of type 'aumu' that takes sound bank data and MIDI control data as inputs, and outputs digital audio.
A synonym for multiplexing. In digital audio, converting a set of data streams representing discrete channels into a single stream that retains the capacity to be converted back to separate channels. In Audio Converter Services and in audio file formats such as CAF, interleaving involves placing one sample from each channel in sequence such that a set of coincident samples, one from each channel represented in the data stream, appears in each frame. Compare deinterleaving.
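A minimal sketch of the frame-by-frame layout described above, together with its inverse. It assumes every channel holds the same number of samples; the function names are hypothetical and unrelated to Audio Converter Services.

```python
def interleave(channels):
    """Merge per-channel sample lists into one stream, one frame at a time."""
    return [sample for frame in zip(*channels) for sample in frame]

def deinterleave(stream, channel_count):
    """Split an interleaved stream back into per-channel sample lists."""
    return [stream[channel::channel_count] for channel in range(channel_count)]
```

For a stereo pair, interleave([[1, 2, 3], [10, 20, 30]]) returns [1, 10, 2, 20, 3, 30], where each adjacent left/right pair is one frame.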
Describes a variable-bit-rate audio format where information about the sizes of the frames is included in the audio data stream. Compare externally framed. See also frame, variable bit rate (VBR).
A generic term for the software- or hardware-based audio inputs and outputs for a device. Pronounced “eye-oh.”
ISO, based in Geneva, Switzerland, collaborates with the IEC (International Electrotechnical Commission) on defining a wide variety of perceptual coding formats. Pronounced “EYE-so.”
Time-based inconsistencies in the clock signal or clock component in a digital signal stream. In digital audio, jitter can result in audible distortion.
In digital audio processing, the time required for an audio sample to proceed from an input to a corresponding output. Total latency, depending on the scope of the system under consideration, can include unavoidable hardware latency (sometimes called I/O latency), safety offset latency (required for robust driver operation), and buffer latency (typically software controlled; dependent on digital signal processing requirements).
A description of the nominal audio signal strength resulting from a given input level and gain in an audio device or system. Level within analog audio circuitry is often measured in dBu. The instantaneous signal strength, for any nominal level, can vary from the noise floor to the dynamic ceiling. Professional “line level” typically indicates a nominal level of +4 dBu, while “consumer level” typically indicates a nominal level of –10 dBu. See also ceiling, dBu, noise floor. Compare volume.
Reduction of the dynamic range of an audio signal, typically by reducing the gain ratio for amplitudes above a specific level. Compare limiting.
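The "reduced gain ratio above a specific level" behavior can be illustrated with a static compression curve operating on levels in decibels. The threshold and ratio parameters, their default values, and the function name are assumptions chosen for illustration; real compressors also apply attack and release smoothing, which this sketch omits.

```python
def compress_level_db(level_db: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Map an input level to an output level using a hard-knee static curve."""
    if level_db <= threshold_db:
        return level_db                              # below threshold: unchanged
    # Above threshold: each `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio
```

With the assumed defaults, an input at -8 dB (12 dB over the threshold) comes out at -17 dB (3 dB over the threshold). As the ratio grows very large, the curve approaches limiting.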
One of the six typical channels in 5.1 Surround Sound. The LFE channel covers the bottom two or three octaves of audio and is typically used to enhance the realism of sound effects such as explosions.
Circuitry or software that limits signal amplitude to a user-defined maximum. Compare level compression.
The process of preventing signal amplitude from exceeding a user-defined maximum.
Describes a transfer function whose output signal is directly proportional to the input.
Short for linear pulse code modulation. A linear and lossless uncompressed audio data format. PCM is usually assumed to mean linear PCM, but sometimes the adjective linear is used to differentiate from nonlinear PCM formats. See also pulse-code modulation (PCM).
An excerpt of a recording, often a few seconds long or shorter, intended to be played repeatedly as part of a larger composition.
Data size reduction without loss of information. Common lossless audio compression formats include FLAC (free lossless audio codec) and Apple Lossless. Compare lossy compression.
Data size reduction that entails loss of information. Common lossy audio compression formats include MP3, AAC (Advanced Audio Coding), and IMA ADPCM. See also perceptual coding. Compare lossless compression.
A subjective term to describe perceived sound intensity. When SPL (sound pressure level) increases by 10 dB, loudness approximately doubles. Compare gain, volume.
See linear PCM.
Also called cookie. In digital audio, an opaque data structure for transporting audio format metadata. For audio formats that use them, such as AAC, a magic cookie is produced during encoding, accompanies the data stream that it describes, and is employed during decoding. Magic cookie data is not accessed directly, but rather via a codec-specific interface.
A standard data protocol for communication between computers and electronic music instruments, first adopted in 1983 by the AES (Audio Engineering Society). Core Audio uses MIDI to communicate with instrument unit audio units. MIDI data describes musical events, such as the starting or stopping of a note. Pronounced “MID-ee.”
An abstract representation of a MIDI cable connection (or port) as used by Core MIDI.
In Core MIDI, a logical grouping of MIDI endpoints. For example, a MIDI driver may group a MIDI-in and a MIDI-out endpoint together in a MIDI entity. See also MIDI endpoint.
A one-way (send or receive) connection point in a hardware-based or virtual MIDI network. Each port can support up to 16 channels of MIDI data. In Core MIDI, a port is represented abstractly in software by a MIDI endpoint. See also MIDI (Musical Instrument Digital Interface).
A music synchronization protocol, defined as part of the MIDI (Musical Instrument Digital Interface) protocol. MIDI timecode emulates SMPTE timecode. See also timecode.
A FireWire-based interconnection protocol that carries multichannel audio and MIDI over a single cable. See also MIDI (Musical Instrument Digital Interface).
Describes an instrument that plays only one note at a time. Compare polyphonic. See also monotimbral, multitimbral.
In Core Audio, describes an instrument unit configured to produce sounds of only a single timbre. Both monophonic and polyphonic instrument units can be monotimbral. Compare multitimbral.
Common short form for MPEG-1, audio layer 3. A lossy, perceptual compression format for audio data that can achieve 10:1 data compression with usable sound quality. MPEG-1 does not define a standard encoding algorithm for MP3; it specifies only the decoding algorithm, the bit stream (packet) format, and the file format. See also perceptual coding.
The MPEG-4 audio/video container format, also known as MPEG-4 Part 14. MP4 files can hold many different types of data, such as AAC and MP3 audio, or MPEG-2 and H.264 video. Typically, files with the .mp4 extension contain both audio and video data, while .m4a denotes files containing only audio data.
An international working group of ISO/IEC that develops standards for digitally-coded representations of audio and video. MPEG is part of the names of many perceptual coding formats published by the group. Pronounced “EM-peg.” See also IEC (International Electrotechnical Commission), ISO (International Organization for Standardization).
A set of audio and video perceptual coding formats, formally designated as ISO/IEC-11172. MPEG-1 encompasses the Video CD and MP3 formats.
See MP3.
A set of audio and video perceptual coding formats, formally designated as ISO/IEC-13818, first published in 1994. MPEG-2 encompasses formats of generally higher quality than MPEG-1, including broadcast-quality video and (with modifications) DVD movies.
A set of audio and video perceptual coding formats, formally designated as ISO/IEC-14496, first published in 1998. MPEG-4 encompasses many of the features introduced in MPEG-1 and MPEG-2 and adds features useful for streaming media and broadcast, among others.
See MP4.
See MIDI timecode (MTC).
A synonym for interleaving.
In Core Audio, describes an instrument unit configured to allow production of more than one timbre simultaneously. Compare monotimbral.
The Core Audio programming construct that applications use to play MIDI or other event data.
An algorithm or object used to avoid concurrent use of unsharable resources in a multithreaded environment.
An audio unit in an audio processing graph. Each node has one or more inputs and outputs that must be connected to other audio units. See also head node.
Undesired energy or data components in a communication channel included with the signal that the channel is carrying. See also noise floor, quantization noise. Compare distortion.
The amplitude of the noise in a communication channel when no signal is present, typically measured as a scalar, absolute level in decibels relative to a standard reference, such as dBu. Noise can vary according to frequency, and perceived noise is subject to psychoacoustics, so the derivation of a single number to describe noise floor can entail weighting. Common weighting schemes are dBA, dBC, and unweighted.
The highest frequency signal that can be faithfully recorded for a given sampling rate. Attempts to sample a signal containing higher frequencies result in the generation of an alias signal below the Nyquist frequency. The Nyquist frequency is half the sampling rate. See also aliasing.
A free collection of digital codecs for multimedia, including Ogg Vorbis for lossy compression of audio at medium-to-high bit rates, and Ogg FLAC for lossless audio.
A free, open source, lossless audio codec. Ogg FLAC typically compresses CD audio by 50% with no data loss. FLAC is an acronym for Free Lossless Audio Codec. See also lossless compression.
A free, open source, lossy audio codec intended to compete with MP3.
Also called playback audio queue. See audio queue.
An audio unit of type 'auou'. Output units can start and stop the flow of audio data in the signal chain. Examples include the default output unit and the AUHAL. See also head node.
In Core Audio, an encoding-defined unit of audio data comprising one or more frames. For PCM audio, each packet corresponds to one frame. For compressed audio, each packet corresponds to an encoding-defined number of uncompressed frames. For example, one packet of MPEG-2 AAC data decompresses to 1,024 frames of PCM data. In information technology, a packet is a block of data formatted for delivery over a network. Compare frame, sample.
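The packet-to-frame relationship lends itself to simple arithmetic. The sketch below uses hypothetical helper names and assumes a fixed number of frames per packet, as in the 1,024-frame AAC example above.

```python
def packet_count(total_frames: int, frames_per_packet: int) -> int:
    """Number of packets needed to hold total_frames (the last packet may be partly filled)."""
    return -(-total_frames // frames_per_packet)     # ceiling division on integers

def packet_duration_seconds(frames_per_packet: int, sample_rate_hz: float) -> float:
    """Playback time covered by one packet."""
    return frames_per_packet / sample_rate_hz
```

At a 44,100 Hz sample rate, one 1,024-frame packet covers about 23.2 milliseconds, and one second of audio (44,100 frames) needs 44 such packets.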
In a variable-packet-size audio file or stream, metadata that specifies where a packet of audio data starts as well as its size. In Core Audio, a data structure used to represent a packet description in an audio data buffer. See also packet, packet table.
In a variable-packet-size audio file or stream, metadata consisting of a table of packet descriptions. See also packet, packet description.
In Core Audio, an audio unit of type 'aupn' that distributes a set of input channels, using a spatialization algorithm, to a set of output channels. In the simplest case, a panner unit places a monaural signal at a left/right spot in a stereo field.
From “panorama.” In audio, the placement of a monaural signal within a stereo or multichannel (such as surround sound) sound field. Variations include stereo, SoundField, spherical head, vector, and HRTF (head related transfer function) panning. A more general term for panning is spatialization.
In an audio unit, a variable that defines an adjustable attribute such as volume, pitch, or filter cutoff frequency. Each audio unit parameter has a name, a unit (such as Hertz or decibels), a default value and a value range, and an optional set of flags. In an audio queue, a parameter has only a value. Compare element, property, scope.
In Audio File Stream Services, a software object of type AudioFileStreamID, used for reading audio file streams. In computer science generally, a program that works with a tokenizer to interpret a sequence of tokens.
Lossy compression that takes advantage of limitations in human perception. In perceptual coding, audio data is selectively removed based on how unlikely it is that a listener will notice the removal. MP3 and MPEG-2 AAC are popular examples of perceptual coding. See also lossy compression.
In psychoacoustics, a perceptual sound attribute that is roughly correlated with frequency. In general, pitch increases as the perceptually-dominant sound frequency increases. The strength of a pitch sensation depends on the sound character; noise-like sounds cause a weak pitch sensation, while pure tones evoke a strong pitch sensation.
Also called output audio queue. See audio queue.
See music player.
A portable collection of code that software applications can load and access through a standardized interface. For example, an audio unit is a plug-in whose interface is defined by the Mac OS X Component Manager.
Describes an instrument capable of playing more than one note simultaneously. Compare monophonic. See also monotimbral, multitimbral.
See MIDI port.
A predefined set of parameter values for an audio unit.
When decompressing audio data, adding dummy frames to the beginning of a buffer to compensate for latency in a decoder.
A frame representing silence that precedes the audio data frames in a stream. The number of priming frames depends on the audio format.
In Core Audio, a key-value pair that declares an attribute or behavior, such as audio data stream format or latency. Each property has an associated data type to hold its value. Properties are typically non-time-varying and not directly settable by the user. Compare parameter.
The study of the perception of sound. The development of perceptual coding techniques relies on psychoacoustics.
A lossless encoding technique widely used for working with audio, invented by Alec H. Reeves in 1937. Sometimes called LPCM for linear pulse-code modulation, which distinguishes the process from ADPCM (adaptive differential pulse code modulation). In pulse-code modulation, an analog signal is linearly encoded to a series of binary numbers by sampling it at regular intervals. See also encoding, linear, quantization.
In Core Audio, to request and receive audio data, typically from a buffer. Data typically moves through an audio processing graph by way of a cascade of pull requests initiated by the head node. The head node pulls, and each object upstream passes on the pull until the cascade reaches an audio data source.
The process of representing an analog (continuous-scale) value by a digital (discrete-scale) value. Quantization is characterized by a bit depth, which determines the dynamic range that can be represented, and a scaling factor, which determines the ratio between the analog and digital scales.
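The roles of bit depth and scaling factor can be sketched as follows. This is a minimal illustration, assuming a signed integer representation and an input range of -1.0 to 1.0; the function names and the clipping behavior are assumptions for the example, not a description of any particular converter.

```python
import math


def quantize(value, bit_depth, full_scale=1.0):
    """Map a value in [-full_scale, full_scale] to a signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1
    min_code = -(2 ** (bit_depth - 1))
    scale = max_code / full_scale  # scaling factor between the two scales
    code = round(value * scale)
    # Clip out-of-range input instead of letting it wrap around.
    return max(min_code, min(max_code, code))


def dynamic_range_db(bit_depth):
    """Theoretical ratio of the largest to the smallest step, in decibels."""
    return 20.0 * math.log10(2 ** bit_depth)


print(quantize(1.0, 16))     # 32767
print(quantize(0.25, 16))    # 8192
print(dynamic_range_db(16))  # about 96.3
```

Each additional bit adds roughly 6 dB to the representable dynamic range, which is why 16-bit audio is usually quoted at about 96 dB.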
The difference between an original analog signal value and its quantized digital representation. Quantization can sometimes result in a signal-correlated noise called quantization noise. See also dither.
Signal-correlated noise resulting from rounding errors when quantizing a series of data samples. Application of a dither signal during analog-to-digital conversion can decorrelate quantization noise from the signal. The perceptual result is low-amplitude noise instead of distortion.
Also called input audio queue. See audio queue.
In Core Audio, for a panner unit, a parameter that specifies the real or apparent distance of an audio source from the listener beyond which the source’s level attenuates.
In Core Audio, to apply a recipe or specification for signal processing to some audio data. An audio unit typically contains a rendering method to obtain audio data and perform processing.
The process of taking samples of a digitized signal at a rate different from that of the original recording. Specific types of resampling include downsampling (resampling at a rate lower than the original) and upsampling (resampling at a higher rate).
1. For audio units, to return an audio unit to its just-initialized state. 2. For codecs, to clear the codec’s input buffer and return the codec to its just-initialized state.
See reverberation.
An acoustic phenomenon produced by the cumulative addition of multiple sound reflections. Apple supplies the matrix reverb audio unit to simulate reverberation using digital signal processing (DSP).
A synonym for deinterleaving.
A minor variation on IFF (Interchange File Format) that uses little-endian integers.
A statistical measure of a time-varying value, such as voltage, current, or sound pressure. An RMS value is derived as the square root of the mean of the squares of a series of values. In the case of a continuously varying value, it is derived by integrating the square of the function over time. For the special case of a sine wave signal, the calculation simplifies to Vrms = 0.707 * Vpeak. May also be written in lowercase as rms.
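The RMS calculation for a series of values can be sketched directly from the definition. This minimal example also checks the sine wave special case numerically; the function name and the test signal are illustrative assumptions.

```python
import math


def rms(values):
    """Square root of the mean of the squares of the values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# One complete cycle of a sine wave with a peak amplitude of 2.0.
peak = 2.0
sample_count = 1000
one_cycle = [peak * math.sin(2 * math.pi * i / sample_count)
             for i in range(sample_count)]

print(rms(one_cycle))         # close to 1.414
print(peak / math.sqrt(2.0))  # the same value: about 0.707 * peak
```

The factor 0.707 is the rounded value of 1 divided by the square root of 2.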
A property of an audio unit or other audio device that specifies a time lag, in samples, to allow for improved robustness of driver operation. The safety offset required for a given architecture includes time needed for memory access and to account for inaccuracies in a driver’s timestamp resolution. Safety offset contributes to latency.
1. (noun) An instantaneous amplitude of the signal in a single audio channel, represented as an integer, floating-point, or fixed-point number. 2. (verb) To collect samples from an audio source, typically an analog audio source. Sampling typically involves collecting samples at regular, very brief intervals such as 1/44,100 seconds. 3. (noun) An excerpt of a longer recording. When the excerpt is intended to be played repeatedly, it is called a loop. 4. (verb) To record a sample to use as a loop or for inclusion in another recording. See also fixed-point sample.
An alternate name for sample rate.
The time span from one sample to the next. The inverse of sample rate.
During playback, the number of samples played per second for each channel of an audio file. During recording, the number of samples acquired per second for each channel. Also called sampling rate. More properly, but less commonly, called sampling frequency. Compare frame rate.
A technique used in AAC (Advanced Audio Coding) encoding (among other encoding technologies) to improve perceived audio quality.
In Core Audio, a programmatic context within an audio unit. Unlike the general computer science notion of scopes, however, audio unit scopes cannot be nested. Each scope is a discrete context. You use scopes when writing code that sets or retrieves values of parameters or properties. Compare element. See also parameter, property.
To set an audio file or buffer’s read position to a specified frame.
In Core Audio, a collection of tracks to be played by a music player. A sequence always contains one or more event tracks and a tempo track. See also event track.
Software or hardware for recording, playback, and editing of MIDI data or audio samples (excerpts or loops). See also loop, MIDI (Musical Instrument Digital Interface).
The range, expressed in decibels, between a nominal signal level and the noise floor. Compare dynamic range.
The number of frames requested and processed during one rendering cycle of an audio unit. See also frame.
A US association of media professionals that publishes standards related to film, television, and audio. Pronounced “SIMP-tea.”
A standard, time-based format for tagging film, video, and audio recordings to support synchronization and editing. The SMPTE timecode represents a given time in the format hours:minutes:seconds:frames.
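The hours:minutes:seconds:frames layout can be sketched as a conversion from a running frame count. This is a minimal illustration that assumes non-drop-frame counting at a whole-number frame rate (30 frames per second by default); drop-frame timecode needs extra handling that is not shown, and the function name is hypothetical.

```python
def frames_to_timecode(total_frames, frames_per_second=30):
    """Format a frame count as hours:minutes:seconds:frames (non-drop-frame)."""
    frames = total_frames % frames_per_second
    total_seconds = total_frames // frames_per_second
    seconds = total_seconds % 60
    minutes = (total_seconds // 60) % 60
    hours = total_seconds // 3600
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


print(frames_to_timecode(0))       # 00:00:00:00
print(frames_to_timecode(109845))  # 01:01:01:15
```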
Also called spectrogram. A three-dimensional visualization of a signal’s frequency content. Typically, a sonogram’s horizontal axis is time, its vertical axis is frequency, and the visual intensity (in terms of color or dot size) of each plotted point represents energy.
A four-channel acoustic recording technique developed by British company SoundField, Ltd.
In acoustics, the space in which a sound is produced, conveyed to a listener, and perceived. In audio reproduction, the virtual space from which a monaural sound can seem to emanate. See also spatialization.
The manipulation of audio signals to create perceived localization of sounds within a sound field. Compare panning. See also sound field.
See sonogram.
A consumer version of, and the inspiration for, the AES-3 format. Part of the IEC-60958 standard. Devices such as CD players and DAT recorders use S/PDIF.
A measure of sound intensity. SPL is commonly expressed as a ratio in decibels relative to 0 dB SPL, or as an absolute level in pascals (Pa). While SPL is sometimes used to approximately indicate loudness, the correlation of SPL to loudness is complex due to perceptual factors. See also weighting.
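The conversion between pascals and decibels can be sketched as follows. The glossary does not state the reference pressure; this example assumes the conventional reference for sound in air, 20 micropascals RMS, as the 0 dB SPL point. The names are illustrative.

```python
import math

# Assumption: 0 dB SPL corresponds to 20 micropascals RMS (sound in air).
REFERENCE_PRESSURE_PA = 20e-6


def spl_db(pressure_pa):
    """Convert an RMS sound pressure in pascals to decibels SPL."""
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


print(spl_db(20e-6))  # 0.0
print(spl_db(1.0))    # about 94 dB SPL
```

The result is an unweighted level; it says nothing about perceived loudness until a weighting is applied.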
1. (noun) A continuous flow of data over a transmission channel that can be interpreted as it is received. The packet boundaries used for encoding in a particular audio format may not coincide with transmission packet boundaries. 2. (verb) To send data as a stream. See also audio file stream, parser, TCP (Transmission Control Protocol) stream.
A loudspeaker configuration with more than two loudspeakers, intended to provide an immersive audio experience. See also 5.1 Surround Sound.
Common short form of synchronize. See synchronization.
The process of ensuring that the clocks of two or more systems remain locked together, counting at the same rate. This term is commonly used in the context of locking an audio track to a video track. See also clock, clock drift, SMPTE timecode.
In Audio Queue Services, describes one of two ways to stop an audio queue. Synchronous stopping happens immediately, without regard for previously buffered audio data. In digital communications, a transmission method that requires the clock frequency of the sender and receiver to be the same. Compare asynchronous.
In Mac OS X, the hardware destination for all system sounds.
An Apple-supplied audio unit that connects with whichever hardware device the user has designated to be the system output.
The time, beyond an audio unit’s latency, for a nominal-level signal to decay to silence at an audio unit’s output after it has gone instantaneously to silence at the input. Tail time is significant for audio units performing an effect such as delay or reverberation. An audio unit declares its tail time as a property.
A data stream used for audio delivery over networks. TCP is part of the IP (Internet Protocol) suite. It provides reliability and in-order delivery of packets, both of which are useful for audio data transmission. See also audio file stream.
A method of combining multiple digital signals in a single data stream by interleaving samples of each signal in time. For example, to carry a stereo signal on a single stream, the stream contains alternating samples of the left and right channels: L R L R L R.
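The L R L R pattern described above can be sketched with plain lists. This is a minimal illustration of interleaving two equal-length channels and recovering them again; the function names are hypothetical.

```python
def interleave(left, right):
    """Combine two equal-length channels into one L R L R ... stream."""
    if len(left) != len(right):
        raise ValueError("channels must contain the same number of samples")
    stream = []
    for left_sample, right_sample in zip(left, right):
        stream.extend((left_sample, right_sample))
    return stream


def deinterleave(stream, channel_count=2):
    """Split an interleaved stream back into one list per channel."""
    return [stream[channel::channel_count] for channel in range(channel_count)]


stream = interleave(["L0", "L1", "L2"], ["R0", "R1", "R2"])
print(stream)                # ['L0', 'R0', 'L1', 'R1', 'L2', 'R2']
print(deinterleave(stream))  # [['L0', 'L1', 'L2'], ['R0', 'R1', 'R2']]
```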
The general speed of a piece of music, often described in beats per minute (BPM).
A special track used to synchronize all the other tracks in a sequence. See also event track.
A preset signal level at which some sort of processing is activated. For example, a compressor audio unit can allow you to specify the threshold above which compression begins.
The perceived quality of a sound as distinct from pitch, volume, envelope, and duration. For example, a tuning fork can be described as having a “gentle” timbre, while a strongly hit crash cymbal can be described as having a “harsh” timbre.
A standardized indexing system for identifying specific portions of an audio file. Timecodes are often used for synchronizing or editing audio data. See also SMPTE timecode.
A visual representation of an audio signal over time.
An optical cable standard used to transmit digital audio signals. Short for ToshibaLink.
See event track. Compare channel.
A hardware or software conduit for the conveyance of a data stream or an analog signal.
Frames added to the beginning or end of a buffer to pad the audio data. Trim frames added before the audio data are typically used to prime an audio decompressor. See also frame, priming, priming frame.
To return an audio unit to its unconfigured state. Compare reset.
A gain of 0 dB.
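Why 0 dB means "no change" can be seen by converting decibels to a linear multiplier. This sketch assumes the amplitude convention (20 times the base-10 logarithm of the ratio); the function names are illustrative.

```python
import math


def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


def linear_to_db(multiplier):
    """Convert a linear amplitude multiplier to decibels."""
    return 20.0 * math.log10(multiplier)


print(db_to_linear(0.0))   # 1.0, so the signal passes through unchanged
print(db_to_linear(-6.0))  # about 0.5
print(linear_to_db(1.0))   # 0.0
```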
A serial bus standard for connecting hardware devices, such as computers, keyboards, and audio devices. Variations include USB 0.9, USB 1.0, USB 1.1, and USB 2.0. Specified by the USB Implementers Forum (USB-IF), an international industry standards body.
An encoding method available for some compression formats, such as AAC, that allows bit rate to vary according to the source material. The aim is to provide consistent perceived audio quality while minimizing file size. It does this by increasing the bit rate for difficult-to-encode portions and decreasing the bit rate for easy-to-encode portions. Compare average bit rate, constant bit rate (CBR).
In Core Audio, a designation by a software MIDI device indicating that it can receive MIDI data. Compare virtual source.
A designation by a software MIDI device indicating that it can transmit MIDI data. Compare virtual destination.
In Core Audio, the original version of the audio unit interface, deprecated in Mac OS X v10.2 and unsupported starting in Mac OS X v10.5. V1 audio units differ from V2 audio units in that they supported fan out, supported interleaved streams, and used a component type and subtype approach different from that of V2. New development should be done with the V2 audio unit interface. Compare V2.
In Core Audio, the current version of the audio unit interface, recommended since Mac OS X v10.2 and the only supported version starting with Mac OS X v10.5. Compare V1.
In psychoacoustics, the average perceived loudness of a sound. Compare level.
A chunk-based digital audio file format originally developed for IBM-compatible PCs. While WAV files can hold compressed audio data, they most commonly hold uncompressed linear PCM data. WAV is a variant of the RIFF bitstream format. See also RIFF (Resource Interchange File Format).
The shape of a signal when visualized as a graph showing its variation in amplitude over time.
The span of one complete cycle in a repeating waveform.
Systematic adjustment of a measurement to highlight a particular criterion. For example, SPL (sound pressure level) measurements can be weighted to approximate how people perceive sound, placing more emphasis on midrange frequencies than on higher or lower ones.
Last updated: 2008-07-07