
Configuring the Audio Session

Configuring the audio session establishes basic audio behavior for your application. Most importantly, you set the audio session category according to what your app does and how you want it to interact with the device and the system. You can change the category, if needed, as your application runs.

Before you can configure the audio session, you must initialize it, so initialization is described first.

Initializing the Audio Session

The system gives you an audio session object upon launch of your application. Before working with the session, you must initialize it. How you handle initialization depends primarily on how you want to handle audio interruptions:

If using the AV Foundation framework for managing interruptions (explained in “Handling Interruptions Using Delegate Methods”), take advantage of the implicit initialization that occurs when you obtain a reference to the AVAudioSession object, as shown here:
// implicitly initializes the audio session
AVAudioSession *session = [AVAudioSession sharedInstance];
The session variable now represents the initialized audio session and can be used immediately. Apple recommends using implicit initialization when you handle audio interruptions with the AVAudioSession class’s interruption delegate methods, or with the delegate protocols of the AVAudioPlayer and AVAudioRecorder classes.
Alternatively, you can write a C callback function to handle audio interruptions (explained in “Handling Interruptions Using a Callback Function”). You must then attach that code to the audio session by using explicit initialization with the AudioSessionInitialize function. Typically, you’d perform explicit audio session initialization in your application’s main controller class, inside that class’s initialization method. “Initializing an Audio Session” provides a code example.

Activating and Deactivating the Audio Session

The system activates your audio session on application launch. Even so, Apple recommends that you explicitly activate your session—typically as part of your application’s viewDidLoad method. This gives you an opportunity to test whether or not activation succeeded. Likewise, when changing the audio session’s active/inactive state, check to ensure that the call is successful. Write your code to gracefully handle the system refusing to activate your session.

The system deactivates your audio session for a Clock or Calendar alarm or incoming phone call. When the user dismisses the alarm, or chooses to ignore a phone call, the system allows your session to become active again. Reactivate the session when the interruption ends to ensure that your audio works. For code examples, see “Activating and Deactivating the Audio Session.”

In the specific case of playing or recording audio with an AVAudioPlayer or AVAudioRecorder object, the system takes care of audio session reactivation.

Most applications never need to explicitly deactivate their audio session. A possible exception is a recording application. An active session should have the “recording” category only when the application is recording. When recording stops, you can deactivate the session to allow other sounds, such as incoming text message alerts, to play. Only deactivate the session, though, if your application has no sounds to play. If you provide playback of recorded audio, for example, change the category from recording to playback instead of deactivating the session.

Choosing the Best Category

iOS has six audio session categories:

Three for playback
One for recording
One that supports playback and recording—that need not occur simultaneously
One for offline audio processing

To pick the best category, consider a few factors:

The best category is the one that most closely supports your audio needs. Do you want to play audio, record audio, do both, or just perform offline audio processing?
Is audio that you play essential or peripheral to using your application? If essential, the best category is one that supports playback with the Ring/Silent switch set to silent. If peripheral, pick a category that goes silent with the Ring/Silent switch set to silent.
Is other audio (such as iPod audio) playing when the user launches your application? Checking during launch enables you to branch. For example, a game application could choose a category configuration that allows iPod audio to continue if it’s already playing, or choose a different category configuration to support a built-in app soundtrack otherwise.

The following list describes the categories. The first category allows other audio to continue playing; the remaining categories indicate that you want other audio to stop when your session becomes active. However, you can customize the non-mixing “playback” and “play and record” categories to allow mixing, as described in “Fine-Tuning the Category.”

AVAudioSessionCategoryAmbient or the equivalent kAudioSessionCategory_AmbientSound—Use this category for an application that plays sounds that add polish or interest but are not essential to the application’s use. Using this category, your audio is silenced by the Ring/Silent switch and when the screen locks.
This category allows audio from the iPod, Safari, and other built-in applications to play while your application is playing audio. You could, for example, use this category for an application that provides a virtual musical instrument that a user plays along to iPod audio.
AVAudioSessionCategorySoloAmbient or the equivalent kAudioSessionCategory_SoloAmbientSound—Use this category for an application whose audio you want silenced when the user switches the Ring/Silent switch to the “silent” position and when the screen locks. This is the default category. It differs from the AVAudioSessionCategoryAmbient category only in that it silences other audio.
AVAudioSessionCategoryPlayback or the equivalent kAudioSessionCategory_MediaPlayback—Use this category for an application whose audio playback is of primary importance. Your audio plays even with the screen locked and with the Ring/Silent switch set to silent.
Note: If you choose an audio session category that allows audio to keep playing when the screen locks, you should normally not disable the system’s sleep timer. The sleep timer ensures that the screen goes dark after a user-specified interval, saving battery power.
AVAudioSessionCategoryRecord or the equivalent kAudioSessionCategory_RecordAudio—Use this category for recording audio. All playback on the phone—except for ringtones and Clock and Calendar alarms—is silenced.
AVAudioSessionCategoryPlayAndRecord or the equivalent kAudioSessionCategory_PlayAndRecord—Use this category for an application that inputs and outputs audio. The input and output need not occur simultaneously, but can if needed. This is the category to use for audio chat applications.
AVAudioSessionCategoryAudioProcessing or the equivalent kAudioSessionCategory_AudioProcessing—Use this category when performing offline audio processing and not playing or recording.

The precise behaviors associated with each category are not under your application’s control, but rather are set by the operating system. Apple may refine category behavior in future versions of iOS. Your best strategy is to pick the category that most accurately describes your intentions for the audio behavior you want. The appendix, “Audio Session Categories,” describes the behavior details for each category.

You may want to check, during application launch, whether other audio is playing and then choose your category based on that. During launch—such as within the viewDidLoad method—check the value of the kAudioSessionProperty_OtherAudioIsPlaying property.
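The launch-time branch described above can be sketched in portable C. This is a minimal model, not Audio Session Services code: the enum, the name table, and both functions are hypothetical stand-ins, and the Boolean argument is assumed to hold the value your application has already read from the kAudioSessionProperty_OtherAudioIsPlaying property. The choice of ambient versus solo ambient follows the game example given earlier in this chapter; your application may need a different pair of categories.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two launch choices in the game example. */
typedef enum {
    LAUNCH_AMBIENT,       /* lets other audio (such as iPod audio) continue */
    LAUNCH_SOLO_AMBIENT,  /* silences other audio for a built-in soundtrack */
    LAUNCH_CATEGORY_COUNT
} LaunchCategory;

/* Category names from this chapter, indexed by LaunchCategory. */
static const char *const kLaunchCategoryNames[LAUNCH_CATEGORY_COUNT] = {
    "AVAudioSessionCategoryAmbient",
    "AVAudioSessionCategorySoloAmbient",
};

/* Choose a category from the "other audio is playing" flag read at launch. */
LaunchCategory choose_launch_category(bool other_audio_is_playing) {
    return other_audio_is_playing ? LAUNCH_AMBIENT : LAUNCH_SOLO_AMBIENT;
}

/* Map the model value back to the category name used in this chapter. */
const char *launch_category_name(LaunchCategory category) {
    assert((int)category >= 0 && (int)category < LAUNCH_CATEGORY_COUNT);
    return kLaunchCategoryNames[category];
}
```

In a real application you would pass the resulting category to the audio session instead of looking up its name; the lookup is here only to tie the model back to the category names listed above.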

How Categories Affect Encoding and Decoding

In iOS, you can encode uncompressed audio to a variety of compressed formats. You can also decode these formats for playback. The hardware-assisted codecs on a device consume less power than do the software codecs. In addition, the hardware codecs perform work that would otherwise be done by the main processor—thereby freeing CPU cycles for other tasks. For best performance and battery life, use the hardware-assisted codecs, if possible, when working with compressed audio formats. This can be particularly important for graphics-intensive applications such as games; using software audio decoding may limit your application’s maximum video frame rate.

A device’s hardware-assisted codecs are available to your application only if you configure your audio session category to silence other audio (such as iPod audio). In other words, to gain access to hardware-assisted audio encoding or decoding, you must claim exclusive rights to audio playback. Likewise, to use hardware-assisted encoding to a compressed audio format, your application must claim exclusive rights to audio input.

The following categories allow use of hardware-assisted audio encoding and decoding:

AVAudioSessionCategorySoloAmbient or the equivalent kAudioSessionCategory_SoloAmbientSound
AVAudioSessionCategoryPlayback or the equivalent kAudioSessionCategory_MediaPlayback
AVAudioSessionCategoryPlayAndRecord or the equivalent kAudioSessionCategory_PlayAndRecord
AVAudioSessionCategoryAudioProcessing or the equivalent kAudioSessionCategory_AudioProcessing

If you override the non-mixing nature of a playback category, as described in “Fine-Tuning the Category,” you cannot use hardware-assisted codecs. If you are performing offline audio conversion using a hardware-assisted codec, and don’t need to simultaneously play audio, use the AVAudioSessionCategoryAudioProcessing (or the equivalent kAudioSessionCategory_AudioProcessing) category.
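The rules in this section can be summarized as a small predicate. The sketch below is a portable C model under stated assumptions: the SessionCategory enum and the hardware_codecs_available function are hypothetical names invented for illustration, not part of any Apple framework, and the mix_with_others flag stands for having applied the mixing override described in “Fine-Tuning the Category.” It encodes only what this section states and is not a substitute for testing on a device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the six audio session categories. */
typedef enum {
    SESSION_AMBIENT,
    SESSION_SOLO_AMBIENT,
    SESSION_PLAYBACK,
    SESSION_RECORD,
    SESSION_PLAY_AND_RECORD,
    SESSION_AUDIO_PROCESSING,
    SESSION_CATEGORY_COUNT
} SessionCategory;

/* Returns true when, per this section, the category allows use of
 * hardware-assisted codecs. mix_with_others models the mixing override,
 * which applies only to the playback and play-and-record categories. */
bool hardware_codecs_available(SessionCategory category, bool mix_with_others) {
    assert((int)category >= 0 && (int)category < SESSION_CATEGORY_COUNT);
    switch (category) {
    case SESSION_SOLO_AMBIENT:
    case SESSION_AUDIO_PROCESSING:
        return true;
    case SESSION_PLAYBACK:
    case SESSION_PLAY_AND_RECORD:
        /* Allowing other audio to mix gives up hardware-assisted codecs. */
        return !mix_with_others;
    default:
        /* Categories not in the list above. */
        return false;
    }
}
```

For example, hardware_codecs_available(SESSION_PLAYBACK, true) is false, which matches the statement that overriding the non-mixing nature of a playback category rules out hardware-assisted codecs.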

Setting and Changing the Audio Session Category

For most iOS applications, setting the audio session category at launch—and never changing it—works well. This provides the best user experience because the device’s audio behavior remains consistent as your application runs. The main exception to this guideline is that an audio recording application should change categories depending on its state.

For recording, use the AVAudioSessionCategoryRecord (or the equivalent kAudioSessionCategory_RecordAudio) category. This ensures that recorded audio is not compromised by device sounds such as from incoming text messages. When finished recording—such as when the user taps Stop—change the category to the appropriate playback category depending on the needs of your application. Any of the playback categories allow text message alert sounds to be heard.

These same considerations apply for audio measurement applications that use audio input but don’t necessarily record, such as an audio spectrum analyzer.
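The state-dependent guideline for recording applications, combined with the earlier advice about when to deactivate the session, can be sketched as a decision function. This is a minimal model in portable C: RecorderState, SessionPlan, and plan_for_recorder are hypothetical names used only for illustration, and the has_sounds_to_play flag is an assumption standing for whether your application offers playback (for example, of the recorded audio).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical application states for a recording app. */
typedef enum {
    RECORDER_RECORDING,
    RECORDER_STOPPED,
    RECORDER_STATE_COUNT
} RecorderState;

/* Hypothetical description of what to do with the audio session. */
typedef enum {
    PLAN_USE_RECORD_CATEGORY,    /* AVAudioSessionCategoryRecord */
    PLAN_USE_PLAYBACK_CATEGORY,  /* a playback category suited to the app */
    PLAN_DEACTIVATE_SESSION      /* nothing to play: let other sounds through */
} SessionPlan;

/* Decide how to configure the session for the current recorder state. */
SessionPlan plan_for_recorder(RecorderState state, bool has_sounds_to_play) {
    assert((int)state >= 0 && (int)state < RECORDER_STATE_COUNT);
    if (state == RECORDER_RECORDING) {
        /* Keep device sounds from compromising the recording. */
        return PLAN_USE_RECORD_CATEGORY;
    }
    /* Recording has stopped: switch to playback if the app plays audio,
     * otherwise the session can be deactivated. */
    return has_sounds_to_play ? PLAN_USE_PLAYBACK_CATEGORY
                              : PLAN_DEACTIVATE_SESSION;
}
```

An audio measurement application could use the same shape, treating "measuring" as the recording state.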

For code examples that show how to set the audio session category, see “Setting the Category” in the “Audio Session Cookbook” chapter.

Fine-Tuning the Category

You can fine-tune an audio session category in a variety of ways. Depending on the category, you can:

Allow other audio (such as from the iPod) to mix with yours when a category normally disallows it
Change the audio output route from the receiver to the speaker
Allow Bluetooth audio input
Specify that other audio should reduce in volume (“duck”) when your audio plays

You can override the non-mixing characteristic of the AVAudioSessionCategoryPlayback (or the equivalent kAudioSessionCategory_MediaPlayback) and AVAudioSessionCategoryPlayAndRecord (or the equivalent kAudioSessionCategory_PlayAndRecord) categories. To perform the override, apply the kAudioSessionProperty_OverrideCategoryMixWithOthers property to the audio session. See “Modifying Playback Mixing Behavior” for a code example.

When you modify a playback category to allow other audio (such as iPod audio) to mix with yours, you cannot use hardware audio decoding of compressed formats.

You can programmatically influence the audio output route. When using the AVAudioSessionCategoryPlayAndRecord (or the equivalent kAudioSessionCategory_PlayAndRecord) category, audio normally goes to the receiver (the small speaker you hold to your ear when on a phone call). You can redirect audio to the speaker at the bottom of the phone by using a category routing override. For code examples, see “Redirecting Output Audio.”

Starting in iOS 3.1, you can configure the recording categories to allow input from a paired Bluetooth device that supports the Hands-Free Profile (HFP). For a code example, see “Supporting Bluetooth Audio Input.”

Finally, you can enhance a category to automatically lower the volume of other audio when your audio is playing. This could be used, for example, in an exercise application. Say the user is exercising along to their iPod when your application wants to overlay a verbal message—for instance, “You’ve been rowing for 10 minutes.” To ensure that the message from your application is intelligible, apply the kAudioSessionProperty_OtherMixableAudioShouldDuck property to the audio session. When ducking takes place, all other audio on the device—apart from phone audio—lowers in volume.


Last updated: 2010-07-09
