AudioWorks

The complete manual...

Introduction

AudioWorks is a general purpose sound editing program. It can take sound samples in a variety of formats and then manipulate these samples. The samples can then be played or stored back to disc, again in a variety of formats.

Before describing how to use AudioWorks, it is helpful to explain some of the basic terminology and the various sample formats.

Digital Sound Samples

Historically, sounds have been recorded on analogue devices such as tape recorders. The continuously changing vibrations that make up a sound are converted into an electric current and then stored as a continuously changing magnetic signal on the tape itself.

If we were to look at a recorded analogue signal using an oscilloscope, we might see something like this:

This wave represents the vibrations which we hear as sound.

The waveform is regarded as having an amplitude of 0 when it touches the dotted centre line. Above the line are positive amplitudes, and below it are negative.

Volume

The more violent the swings between positive and negative amplitudes, the louder we perceive the sound to be. (The above sound is loudest at the beginning.) An amplitude of 0 is silence.

Thus, amplitude can, for our purposes, be considered roughly equivalent to volume.

(It is sometimes also called 'speaker deflection' as the speaker cone must move in and out in a similar pattern to the waveform to generate equivalent sound vibrations.)

Frequency

The time interval between successive swings is called the 'period' of the sound, and it determines the frequency. The number of complete swings each second, from positive to negative and back to positive, is the 'frequency', which we hear as 'pitch'. The period is longer for low-pitched sounds than for high-pitched ones.
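The relationship between period and frequency can be illustrated with a few lines of Python. This is only a sketch for the reader; the function name is invented for the example and is not part of AudioWorks.

```python
def period_seconds(frequency_hz):
    """Return the time taken by one complete swing (one period), in seconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# A low-pitched 100 Hz tone has a longer period than a high-pitched 10,000 Hz tone.
low_pitch_period = period_seconds(100)
high_pitch_period = period_seconds(10_000)
```

Here the 100 Hz tone completes one swing in a hundredth of a second, while the 10,000 Hz tone needs only a ten-thousandth of a second.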

Digital Sound and Sampling

So how do we store such a sound in a computer? Computers are based upon digital circuits. This means they can process numbers, but not continuously changing analogue signals.

The solution is to 'sample' the analogue signal, that is, to take 'snapshots' (samples) of the amplitude at frequent and regular intervals. Each snapshot can then be stored as a number representing the amplitude at that point, and a sequence of snapshots can be used to control a speaker to approximate the original waveform. (This is similar to cinema film, where a sequence of still pictures is flashed on the screen in rapid succession to give the appearance of movement.)

Continuing with the cinema analogy: if you slow the film down, it becomes flickery, and then jerky. Slowed down even more, objects seem to jump across the screen rather than move in a fluid, natural manner. This is because we are only using snapshots of the scene, and all movement between the snapshots is missed.
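The idea of taking numbered snapshots of a waveform can be sketched in Python as follows. The function name, its parameters, and the choice of a pure sine wave as the 'analogue' signal are assumptions made for this illustration; they do not describe how AudioWorks itself records sound.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, count, levels=256):
    """Take `count` snapshots of a sine wave, storing each as a signed integer."""
    peak = levels // 2 - 1  # largest positive value, 127 when there are 256 levels
    samples = []
    for n in range(count):
        t = n / sample_rate_hz  # the time at which this snapshot is taken
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1.0 and 1.0
        samples.append(round(amplitude * peak))
    return samples


# Eight snapshots of a 1,000 Hz tone taken 8,000 times a second: one full wave.
one_wave = sample_sine(1000, 8000, 8)
```

The resulting list rises from 0 to 127, falls through 0 to -127 and returns towards 0, which is one complete wave described purely as numbers.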

So what happens to the computer's representation of a sound if we sample too infrequently?

In the picture below, we record samples at regular intervals. Each grey bar shows the level captured by a sample in the middle of each interval.

Note how the first high wave is completely missed by our samples. In fact, most of the waves in the sample are missed.

If we double our 'sampling frequency' we get the following set of samples. Already, we catch a much better picture of the original waveform.

If we use five times our original sampling frequency, our digital samples come very close to the original waveform:

To successfully sample an analogue signal of frequency f we usually need to take at least 2f samples every second.
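The 2f rule can be explored with a short Python sketch. The function names are invented for this example. The second function uses the standard result that a tone sampled too slowly is indistinguishable, from its samples alone, from a lower tone (an effect usually called 'aliasing', a term not otherwise used in this manual).

```python
def minimum_sample_rate(highest_frequency_hz):
    """Lowest sampling rate that follows the '2f' rule for a given frequency."""
    return 2 * highest_frequency_hz


def apparent_frequency(signal_hz, sample_rate_hz):
    """Frequency that the recorded samples appear to contain.

    If the signal is below half the sampling rate it is returned unchanged;
    otherwise the samples look like a lower frequency.
    """
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)
```

For example, a 3,000 Hz tone sampled 8,000 times a second is captured as 3,000 Hz, but a 6,000 Hz tone sampled at the same rate yields samples that look like a 2,000 Hz tone.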

The human ear can detect frequencies between about 20 and 20,000 hertz (waves per second). Digital compact discs (CDs) are recorded at 44,100 samples per second, a little over twice the highest audible frequency, and digital audio tape (DAT) typically uses 48,000 samples per second. Telephone-line quality is much lower, at only 8,000 samples per second.

Of course, storing all these samples takes memory. Because of this, common sampling rates on computers are a compromise between quality and memory and normally range between 8,000 and 22,000 samples per second. The internal Acorn sound system defaults to 20,833 samples per second, but is capable of higher rates.

Sample formats

As well as a wide range of sampling rates, there is also a wide range of sample formats. AudioWorks supports several formats:

Bits per sample

Remember that each sample is a digitally stored number representing amplitude. The more accurately we can record this number, the more accurately we can represent an analogue amplitude level.

The most common method of storing samples on computers is to use a single byte (8 bits) per sample. This allows 256 different levels to be recorded and is a convenient size for computers to handle.

However, the never-ending quest for higher quality reproduction and the decreasing cost of computer memory have led to systems that use more bits per sample. The most common formats are 12 and 16 bits per sample, which can represent 4,096 and 65,536 different levels respectively.

Again, using more bits per sample has a corresponding cost in memory usage.
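The trade-off between quality and memory can be estimated with a small Python sketch. The function names are invented for this example, and it assumes that each sample occupies a whole number of bytes (so a 12-bit sample is counted as 2 bytes); the way AudioWorks actually stores 12-bit data is not stated here.

```python
def levels(bits_per_sample):
    """Number of distinct amplitude levels a sample of this size can record."""
    return 2 ** bits_per_sample


def bytes_needed(seconds, sample_rate_hz, bits_per_sample=8):
    """Approximate memory needed for a single-channel recording.

    Assumption: samples are stored in whole bytes, so 12 bits take 2 bytes.
    """
    bytes_per_sample = (bits_per_sample + 7) // 8
    return round(seconds * sample_rate_hz) * bytes_per_sample
```

On these assumptions, one minute of 8-bit sound at 20,833 samples per second needs 1,249,980 bytes, and the same minute at 16 bits per sample needs twice as much.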

Sample data format

To confuse matters further, although an 8-bit sample can represent 256 levels, different manufacturers interpret these values in different ways.

AudioWorks supports the standard Acorn interpretation and three other common interpretations. These are:

	Linear Signed: Each sample value represents exactly what you might expect. The range of possible amplitudes is divided into equal steps (256 of them for 8-bit samples). Positive amplitudes are represented as positive numbers, negative amplitudes as negative numbers. Linear signed samples can be represented in any number of bits per sample.
	Linear Unsigned: This is almost identical to Linear Signed format, but every value is stored as a non-negative number, so zero amplitude lies at the middle of the range (128 for 8-bit samples) rather than at 0. Linear unsigned samples can be represented in any number of bits per sample.
	Logarithmic: This is the Acorn native format. It records the amplitude on a logarithmic scale, which is closer to the natural response curve of the ear. This representation is only available in an 8-bit format.
	µLaw Logarithmic: This is a common format on NeXT and Sun workstations, and is very similar to Acorn Logarithmic format. (Technically, the sign bit is at the opposite end of the byte and all the bits are inverted.) This representation is only available in an 8-bit format.
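The differences between these interpretations can be sketched in Python. The function names are invented for this example. The linear conversion assumes the usual convention that unsigned samples place silence at the middle of the range. The logarithmic functions use the standard continuous µ-law curve with µ = 255 on amplitudes between -1.0 and 1.0; they show the shape of the curve only and do not reproduce the exact byte layout of either the Acorn or the µLaw format.

```python
import math

MU = 255  # the constant conventionally used for 8-bit µ-law


def signed_to_unsigned(value, bits=8):
    """Shift a signed linear sample (-128..127 for 8 bits) up to 0..255."""
    return value + (1 << (bits - 1))


def unsigned_to_signed(value, bits=8):
    """Shift an unsigned linear sample (0..255 for 8 bits) down to -128..127."""
    return value - (1 << (bits - 1))


def mu_law_compress(amplitude):
    """Map a linear amplitude in -1.0..1.0 onto the logarithmic µ-law curve."""
    scaled = math.log1p(MU * abs(amplitude)) / math.log1p(MU)
    return math.copysign(scaled, amplitude)


def mu_law_expand(value):
    """Reverse mu_law_compress, returning a linear amplitude in -1.0..1.0."""
    linear = ((1 + MU) ** abs(value) - 1) / MU
    return math.copysign(linear, value)
```

A quiet amplitude of 0.1 is mapped to roughly 0.59 on the logarithmic curve, so quiet sounds receive a much larger share of the available levels than they would in a linear format. Once the compressed value is rounded to one of 256 levels, expanding it no longer returns exactly the original amplitude, which is why such conversions are lossy.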

AudioWorks works in any of these sample formats, with 8, 12, or 16 bits per sample. Because it works directly in these formats where possible, conversion between formats is rarely needed. (Conversions between the logarithmic and linear formats always introduce distortions and are 'lossy'; that is, some information is lost.)

When copying and pasting between samples of different formats, AudioWorks automatically converts the source data to the correct format, if necessary. Thus, you rarely need to know which sample format you are working in.

Mono and Stereo

In addition to the various sample formats, sampled sounds can represent mono or stereo sound. To provide a stereo effect, two recordings are made of a sound, one for the left channel and one for the right. One channel is played to the left ear while the other is played to the right.

AudioWorks treats stereo samples almost as if they were two separate mono samples, with a special feature to link the two channels together so that they may be manipulated as a single stereo entity.
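One way to picture 'two mono samples with a link' is the Python sketch below. It is purely hypothetical: the class, its fields and the `scale` operation are invented for illustration and say nothing about how AudioWorks is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class StereoSound:
    """Two mono channels plus a flag saying whether they are linked."""
    left: list
    right: list
    linked: bool = True

    def scale(self, factor, channel="left"):
        """Change the volume of one channel, or of both when linked."""
        targets = ("left", "right") if self.linked else (channel,)
        for name in targets:
            scaled = [round(s * factor) for s in getattr(self, name)]
            setattr(self, name, scaled)


sound = StereoSound(left=[10, -20], right=[30, 40])
sound.scale(0.5)            # linked: both channels are halved
sound.linked = False
sound.scale(2, "right")     # unlinked: only the right channel is doubled
```

While the channels are linked, one operation affects both; once unlinked, each channel behaves like an independent mono sample.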

Terms used in AudioWorks

Sample Frequency/play speed/sampling rate

Sample frequency is usually referred to as play speed (when playing) or sampling rate (when sampling). A sample only sounds the same as the source when played back at the rate at which it was sampled. Thus play speed normally equals the sampling rate.
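The effect of playing a sound at a speed other than its sampling rate can be worked out with a short Python sketch. The function names are invented for this example.

```python
def duration_seconds(sample_count, play_speed_hz):
    """How long a sound lasts when played at the given speed."""
    return sample_count / play_speed_hz


def pitch_ratio(play_speed_hz, sampling_rate_hz):
    """How much higher (above 1.0) or lower (below 1.0) the sound is heard."""
    return play_speed_hz / sampling_rate_hz
```

A sound of 20,833 samples lasts one second when played at 20,833 samples per second. Played at twice that speed it lasts half a second and every frequency in it is doubled.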

Frequency is normally measured in samples per second. However, for convenience, three forms are used in AudioWorks:

Hertz (Hz): This usually means 'waveforms per second'. However, in digital sampling it has the special meaning of 'samples per second'. A typical sampling rate would be 20,833 Hz.
Kilohertz (kHz): This is a related term, meaning 'thousands of samples per second' for our purposes. 20,833 Hz is approximately 21 kHz.
Microseconds delay between samples ('µs delay', or just 'µs'): This is an alternative way of expressing sampling rate, namely the time interval between samples. This interval is usually measured in microseconds (millionths of a second).
µs delay = 1,000,000/Hz.
20,833 Hz is a delay between samples of almost exactly 48 µs (1,000,000/20,833 ≈ 48.0008).
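The three ways of expressing a sampling rate can be converted into one another with a few lines of Python. This is an illustrative sketch only; the function names are invented for the example.

```python
def hz_to_microseconds(rate_hz):
    """Delay between samples, in microseconds, for a rate in samples per second."""
    return 1_000_000 / rate_hz


def microseconds_to_hz(delay_us):
    """Rate in samples per second for a given delay between samples."""
    return 1_000_000 / delay_us


def hz_to_khz(rate_hz):
    """Rate in thousands of samples per second."""
    return rate_hz / 1000
```

With these, 20,833 Hz works out at a delay of about 48.0 µs, and a delay of exactly 48 µs corresponds to about 20,833.3 Hz.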

Samples

The word 'sample' is a little ambiguous, as it is commonly used to refer to both single samples and sampled sounds comprising many individual samples.

Usually the difference is clear from the context but where ambiguity may arise, we refer to a sampled sound as a 'sound file' or 'sound sample' and individual amplitude snapshots as 'samples'.

The Current Sound Sample

AudioWorks allows you to load, play, and edit several sound samples at once. However, some functions can only operate on a single sound at any time. These include the keyboard and Alter windows, as well as the envelope effect. (All these are detailed later.)

AudioWorks therefore has the concept of the 'current' sound sample.

By clicking in any sample window, you can make that sample the current sample. It is given input focus (the title bar changes to yellow and any key short-cuts apply to that sample). A red dot in the top left corner of the sample display indicates which is the current sample.

To make a sample the current sample, either:

click in the sample window, or
choose the sample from the menu of samples currently in memory, which some options (such as Alter) can display.

RISCWorld