=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
                              AM/FM TECHCORNER
                          The magic of "Octa"-sound
                         Written by Teijo Kinnunen.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

To all TechCorner readers:

Up to now, I have not received a single letter from you. Therefore, I would
really like to hear your comments about TechCorner. I would also be glad to
hear your suggestions about what I should cover in future TechCorners (I'll
soon run out of ideas!). You're also welcome to send any questions
concerning audio/music programming, which I'll attempt to answer in the
following TechCorner.

My address is:

        Teijo Kinnunen
        Oksantie 19
        SF-86300 OULAINEN
        FINLAND

(I'm sorry, but I probably won't have time to reply individually.)

or FidoNet: 2:228/402

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

And back to the point...

As you most likely know, there are music programs that can "split" the audio
channels, resulting in a maximum of eight independent channels. Such programs
include Oktalyzer, StarTrekker and OctaMED, OctaMED being the best
(advertisement!!). All these programs use a similar method to produce the
sound. They also have similar restrictions, for example:

        * heavy CPU load
        * volume control not possible on a channel-by-channel basis
        * rough looping resolution (a workaround could be possible, but
          quite complex)
        * decreased sound quality

Other means of producing eight-channel sound could be possible, but I will
describe the method all of the above programs use.

The "magic" method is simply to mix two samples into one, which is then
played out. It has to happen in real time, though. The critical point is
that the samples don't usually have the same playback rate. The mixing
routine has to remove or double bytes in order to achieve the correct
playback frequency; this leads to degradation in sound quality.

OctaMED uses two buffers per channel.
The samples are mixed into one buffer, which is then played at a constant
frequency (period). While one buffer is being played, the other one is being
filled. When the first buffer has been played, the other buffer starts
playing. This technique is called double-buffering.

As already mentioned, the output rate is fixed. Naturally it has to be slow
enough that the other buffer can be filled before the first has been played.
On 7 MHz 68000 machines a good output period is approx. H-2/C-3 (OctaMED
uses period 227 (non-HQ)). If not all four channels are split, or a fast
processor is being used, there's more time for filling the buffers, and a
higher output rate (that is, a smaller period value) can be used (in HQ
mode, OctaMED uses the highest possible frequency, period 124). The higher
the output rate, the better the sound quality.

The sample buffers are played out using normal DMA output. However, the
sample pointers (AUDxLC) have to be constantly swapped. The only correct
method to do this is via audio interrupts. In 5 - 8 channel modes, OctaMED
also uses this interrupt for timing the music, which is very handy.
StarTrekker, for example, as far as I know, uses VBlank timing for music and
"assumes" that a certain number of samples are played during one frame,
which is very bad.

It's wise to keep all channels in exactly the same phase with each other.
When the DMA is started (done only once), you have to set all audio DMA bits
with the same MOVE instruction. As a result, we can assume that the audio
interrupts will occur _exactly_ at the same time. So, you only need to have
one audio interrupt that handles all channels at the same time.

To clarify everything, let's have a look at a real 8-channel routine. It's a
very stripped-down version of the OctaMED routine. To simplify things, it
only handles one split channel, and doesn't handle repeat.

Below is the macro that fetches the samples, does the actual mixing, and
pushes the result into the playback buffer:

; This code does the magic 8 channel thing (mixing).
MAGIC_8TRK      MACRO
                swap    d6              ;integer offsets to the low words
                swap    d7
                move.b  0(a3,d6.w),d0   ;fetch byte from sample A
                add.b   0(a4,d7.w),d0   ;add byte from sample B
                move.b  d0,(a1)+        ;store into the playback buffer
                swap    d6              ;back to offset.fraction order
                swap    d7
                add.l   d1,d6           ;advance both positions
                add.l   d2,d7
                ENDM

This is the shortest way to do it (if someone can find a shorter/faster way,
*PLEASE* let me know ;-). This macro is repeated many times, once for each
byte of the playback buffer (on OctaMED max. 1600 times/interrupt), so it
had better be fast.

Let's examine this macro more closely. First we'll have a look at the
register usage. A3 and A4 are pointers to the _beginning_ of the samples to
mix. They remain constant throughout the mixing. Index registers D6.w and
D7.w are used to get the actual offset. The samples are mixed in D0, and the
resulting sample is pushed into the buffer (pointed to by A1). Note that the
sample data must be halved beforehand (converted into 7-bit dynamic range by
shifting the sample bytes right by one bit position); this saves us from
using an extra ASR instruction.

As mentioned above, D6 and D7 are used to index the sample data; they are
offsets from the beginning of the sample. However, a resolution of one byte
is far too coarse. Therefore we need to have a 16-bit fraction:

        SSSSSSSS SSSSSSSS FFFFFFFF FFFFFFFF

The upper word 'S' is the actual byte offset from the beginning of the
sample, and 'F' is the fraction part. When the value is updated (to point to
the next sample to mix), it's handled as a 32-bit value. (D1 and D2 contain
the 32-bit numbers to add each time; they are constant values based on the
current playback periods of the channels.)

However, when the sample value must be fetched, the fraction must be
discarded. A single SWAP instruction will do fine. As a result

        FFFFFFFF FFFFFFFF SSSSSSSS SSSSSSSS

the lower word can be easily used for indexing. Another SWAP, and everything
is back in place for a new cycle.
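
For readers who find the register juggling hard to follow, here is a rough C
sketch of what the macro does when it is repeated over a whole buffer. It is
not taken from OctaMED; the names frac_step and mix_two, and the
OUTPUT_PERIOD constant, are made up for this illustration. The step is
assumed to be 227 * 65536 / period, the same formula the interrupt code in
this article uses. Unlike the assembler version, which indexes with a 16-bit
word, the sketch simply shifts the 32-bit position right by 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed output period of the mix buffer (the non-HQ value). */
#define OUTPUT_PERIOD 227u

/* Hypothetical helper: 16.16 fixed-point step added per output byte.
   A period equal to OUTPUT_PERIOD gives a step of exactly 1.0 (65536). */
uint32_t frac_step(uint16_t period)
{
    assert(period != 0);              /* period 0 means silence */
    return (OUTPUT_PERIOD * 65536u) / period;
}

/* Hypothetical helper: mix `count` bytes from two samples into `out`.
   Both samples must already be halved (7-bit range), so the sum fits
   into a signed byte. The caller must make sure that neither position
   runs past the end of its sample. */
void mix_two(int8_t *out, size_t count,
             const int8_t *a, uint32_t step_a,
             const int8_t *b, uint32_t step_b)
{
    uint32_t pos_a = 0;               /* SSSS.FFFF position in sample A */
    uint32_t pos_b = 0;               /* SSSS.FFFF position in sample B */

    for (size_t i = 0; i < count; i++) {
        /* Drop the fraction (what SWAP does) and add the two bytes. */
        out[i] = (int8_t)(a[pos_a >> 16] + b[pos_b >> 16]);
        pos_a += step_a;              /* add.l d1,d6 */
        pos_b += step_b;              /* add.l d2,d7 */
    }
}
```

With a period of 454 the step is 0.5, so each byte of that sample is used
twice (bytes are "doubled"); with a period below 227 the step is greater than
1.0 and some bytes are skipped. This is the degradation mentioned earlier.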

This was the most critical part of 8-channel output, but let's also look at
the interrupt code.

_IntHandler8:
                movem.l d2/d5-d7/a2-a5,-(sp)

DB is a pointer to the data area, so we can use A6-relative data addressing.

                lea     DB,a6
; ================ 8 channel handling (buffer swap) ======
                not.b   whichbuff-DB(a6)        ;swap buffer
                bne.s   usebuff1

'whichbuff' tells us which buffer is currently in use. NOT.B toggles it and
we change the buffer pointers accordingly. A1 (int_Data) points to the
buffers (each 200 bytes long). A0 points to $DFF000 (custom chips).

                move.l  a1,$a0(a0)      ;ac_data = buffer 1 (offs = 0)
                move.w  #100,$a4(a0)    ;ac_len = 100 words = 200 bytes
                bra.s   buffset
usebuff1        lea     200(a1),a1      ;ac_data = buffer 2 (offs = 200)
                move.l  a1,$a0(a0)
                move.w  #100,$a4(a0)

The audio interrupt request bit MUST be cleared (very important).

buffset         move.w  #1<<7,$9c(a0)

Set the volume to maximum.

                move.w  #64,$a8(a0)
; ============== fill buffers ============

To make things easier, I've set up some pseudo-audio-hardware registers
(track0hw, track4hw). Instead of using ac_len, however, ac_end points to the
end of the sample.

startfillb      lea     track0hw-DB(a6),a2
;calculate channel A period

Some wizard-stuff again... It will calculate the fraction value to add each
cycle. The actual formula is:

                  227 * 65536     14876672
        fracval = -----------  =  --------
                    period         period

But as the result could be > 65535 and DIVU can't handle quotients that big,
it is calculated as

                   3719168
        fracval = --------- * 4
                   period

An ac_per of 0 is considered silence... D1 will contain fracval and D2 will
contain fracval / 4.

                move.l  #3719168,d7     ;227 * 16384
                move.w  ac_per(a2),d6
                beq.s   setpzero0
                move.l  d7,d2
                divu    d6,d2
                moveq   #0,d1
                move.w  d2,d1
                add.l   d1,d1
                add.l   d1,d1

Then we fetch the required addresses. A5 is the sample end pointer, and A3
(after checking) is the sample start pointer. Note: A3 is the _current_
start pointer; it will change after each fill.

;get channel A addresses
                move.l  ac_end(a2),a5
                move.l  (a2),d0
                beq.s   setpzero0
chA_dfnd        move.l  d0,a3
;a3 = start address, a5 = end address

The following operation checks whether the sample would run past the end
address during this fill. If so, the channel is turned off.

;calc bytes before end
                mulu    #200<<3,d2
                clr.w   d2
                swap    d2              ;d2 = # of bytes/fill
                add.l   a3,d2           ;d2 = end position after this fill
                sub.l   a5,d2           ;subtract sample end
                bmi.s   norestart0
                clr.l   (a2)
setpzero0       lea     zerodata-DB(a6),a3
                moveq   #0,d1
norestart0

Now repeat everything for the other channel....

;channel B period
                move.w  SIZE4TRKHW+ac_per(a2),d6
                beq.s   setpzero0b
                divu    d6,d7
                moveq   #0,d2
                move.w  d7,d2
                add.l   d2,d2
                add.l   d2,d2
;channel B addresses
                move.l  SIZE4TRKHW+ac_end(a2),a5
                move.l  SIZE4TRKHW(a2),d0
                beq.s   setpzero0b
                move.l  d0,a4
                mulu    #200<<3,d7
                clr.w   d7
                swap    d7
                add.l   a4,d7
                sub.l   a5,d7
                bmi.s   norestart0b
                clr.l   SIZE4TRKHW(a2)
setpzero0b      lea     zerodata-DB(a6),a4
                moveq   #0,d2
norestart0b

Finally, it's time to mix. It'll be done 200 times. To save time, DBF is
executed only after every 20th mix.

                moveq   #0,d6           ;clear index regs
                moveq   #0,d7
                moveq   #9,d5           ;DBF counter
do8trkmagic     MAGIC_8TRK              ;20 times..
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                MAGIC_8TRK
                dbf     d5,do8trkmagic  ;do until cnt zero

Then add the advanced sample offsets to the sample pointers (the fraction
part is cleared first).

end8trkmagic    clr.w   d6
                clr.w   d7
                swap    d6
                swap    d7
                add.l   d6,(a2)
                add.l   d7,SIZE4TRKHW(a2)

And exit the interrupt...

                movem.l (sp)+,d2/d5-d7/a2-a5
                rts

I have provided you with an example program that plays two samples through a
single channel (for two seconds). Its arguments are:

        example

(where periods are usually between 200 - 900)

Have a look at the sources as well. The program consists of an interface &
loader part (written in C), and the player & audio part (in assembler).

As usual, feel free to use the code in your own programs!