Microsoft DirectX 8.0

Time and Clocks in DirectShow

This article gives an overview of time and clocks in the Microsoft® DirectShow® architecture. It contains the following sections.

- Reference Clocks
- Clock Times
- Time Stamps and Media Times
- Live Sources

Reference Clocks

One function of the filter graph manager is to synchronize the filters in the graph. To accomplish this, the filter graph manager and the filters all rely on a single clock, called the reference clock. Any object that supports the IReferenceClock interface can serve as a reference clock. A filter with access to a hardware timer can provide a clock (an example is the audio renderer), or the filter graph manager can create one that uses the system time.

At run time, the filter graph manager selects a reference clock and calls IMediaFilter::SetSyncSource on all the filters to inform them of the choice. An application can override the filter graph manager's choice by calling SetSyncSource on the filter graph manager. Only do this if you have a particular reason to prefer another clock.
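
For example, the following sketch shows how an application might override the graph clock. It assumes that pGraph is an IGraphBuilder pointer for a graph that has already been built; error handling is omitted.

    IMediaFilter *pMediaFilter = NULL;
    pGraph->QueryInterface(IID_IMediaFilter, (void**)&pMediaFilter);

    // Pass a specific IReferenceClock pointer to select that clock,
    // or NULL to run the graph with no reference clock at all.
    pMediaFilter->SetSyncSource(NULL);
    pMediaFilter->Release();

Passing NULL causes the graph to run without a clock, so renderers present samples as soon as they arrive rather than waiting for their presentation times.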

The filter graph manager uses the following heuristic to select the reference clock.

- If the graph contains a live source filter that can provide a clock, use that filter's clock.
- Otherwise, if any connected filter in the graph exposes IReferenceClock, use that filter's clock, searching upstream from the renderers. In practice, this usually means the audio renderer provides the clock.
- Otherwise, use the system reference clock, which is based on the system time.

Reference clocks measure time in 100-nanosecond intervals. To retrieve a clock's current time, call the IReferenceClock::GetTime method. A clock's baseline—the time from which it starts counting—depends on the clock's implementation, so the particular time value it returns is not inherently meaningful. What matters is that it returns monotonically increasing values at a rate of one per 100 nanoseconds.
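
For example, the following sketch reads the current reference time and converts it to seconds; pClock is assumed to be a valid IReferenceClock pointer.

    // REFERENCE_TIME is a LONGLONG count of 100-nanosecond units.
    REFERENCE_TIME rtNow = 0;
    HRESULT hr = pClock->GetTime(&rtNow);
    if (SUCCEEDED(hr))
    {
        double seconds = (double)rtNow / 10000000.0;  // 10,000,000 units per second
    }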

Clock Times

DirectShow defines two related clock times: reference time and stream time.

Reference time is an absolute time, as returned by the reference clock's GetTime method.

Stream time is measured relative to when the graph last started running. While the graph runs, stream time equals the current reference time minus the start time, so it begins at zero; while the graph is paused, stream time does not advance, and after a seek operation it resets to zero. A sample with a time stamp of t is scheduled for presentation when stream time reaches t.

When an application calls IMediaControl::Run to run the filter graph, the filter graph manager calls IMediaFilter::Run on each filter. To compensate for the slight amount of time it takes for the filters to start running, the filter graph manager specifies a start time slightly in the future.
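
As a sketch, a filter that cached the clock from SetSyncSource and the start time passed to IMediaFilter::Run (the illustrative members m_pClock and m_tStart below) could compute the current stream time as follows:

    REFERENCE_TIME CMyFilter::GetStreamTime()
    {
        REFERENCE_TIME tNow = 0;
        m_pClock->GetTime(&tNow);   // current reference time
        return tNow - m_tStart;     // time elapsed since the graph started running
    }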

Time Stamps and Media Times

A media sample has two time-based properties: a time stamp, and a media time.

Time Stamps

The time stamp defines the sample's start and finish times, measured in stream time. The time stamp is sometimes called the presentation time.

When a renderer filter receives a sample, it schedules rendering based on the time stamp. If the sample arrives late, or has no time stamp, the filter renders the sample immediately. Otherwise, the filter waits until the sample's start time before it renders the sample. (It waits for the start time by calling the IReferenceClock::AdviseTime method.)
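
The following sketch shows how a renderer might block until a sample's start time; it assumes that m_tStart holds the start time passed to IMediaFilter::Run and that rtSampleStart is the sample's start time retrieved with IMediaSample::GetTime.

    DWORD_PTR dwAdviseCookie = 0;
    HANDLE hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

    // Ask the clock to signal the event at reference time
    // m_tStart + rtSampleStart.
    HRESULT hr = pClock->AdviseTime(m_tStart, rtSampleStart,
        (HEVENT)hEvent, &dwAdviseCookie);
    if (SUCCEEDED(hr))
    {
        // Blocks until the presentation time arrives. A request that is no
        // longer needed can be canceled with IReferenceClock::Unadvise.
        WaitForSingleObject(hEvent, INFINITE);
    }
    CloseHandle(hEvent);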

Source filters and parser filters are responsible for setting the correct time stamps on the samples they process. For file playback, the first sample is typically stamped with a start time of zero, and later time stamps are derived from the format's frame rate or bit rate. For live capture, each sample is stamped with the stream time at which it was captured.

To set the time stamp on a sample, call the IMediaSample::SetTime method.
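
For example, a source delivering video at roughly 30 frames per second might stamp its first frame as follows (the values are illustrative):

    // Time stamps are expressed in 100-nanosecond units of stream time.
    REFERENCE_TIME rtStart = 0;        // frame begins at stream time zero
    REFERENCE_TIME rtStop  = 333333;   // one frame lasts about 1/30 second
    HRESULT hr = pSample->SetTime(&rtStart, &rtStop);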

Media Time

Media time specifies the sample's original position within a seekable medium, such as a file on disk. For video, media time represents the frame number. For audio, media time represents the sample number in the stream. For example, if each packet contains one second of 44.1 kilohertz (kHz) audio, the first packet has a media start time of zero and a media stop time of 44100; the second packet runs from 44100 to 88200, and so on. Renderer filters can determine whether frames or samples have been dropped by checking for gaps in the media times.

Whereas the time stamp depends on factors outside the source, such as seeking and playback rate, the media time is always calculated relative to the original source.

To set the media time on a sample, call the IMediaSample::SetMediaTime method.
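
Continuing the 44.1-kHz audio example above, a sketch for stamping the first packet might look like this:

    // Media times count samples, not 100-nanosecond units.
    LONGLONG llStart = 0;
    LONGLONG llStop  = 44100;
    HRESULT hr = pSample->SetMediaTime(&llStart, &llStop);

    // The next packet would be stamped 44100 to 88200; a gap between one
    // packet's stop time and the next packet's start time indicates that
    // samples were dropped.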

Live Sources

A live source, also called a push source, receives media data in real time. Examples include video capture and network broadcasts. In general, a live source cannot control the rate at which data arrives. A live-source filter should implement the IAMPushSource interface on its output pin. The filter graph manager uses this interface to address two problems that can arise when a live source is rendered.

Latency

A filter's latency is the amount of time it takes the filter to process a sample. For live sources, the latency is determined by the size of the buffer used to hold samples. Suppose that the filter graph renders a video source with a latency of 33 milliseconds (ms), and an audio source with a latency of 500 ms. Each video frame arrives at the video renderer about 470 ms before the matching audio sample reaches the audio renderer. Unless the graph compensates for the difference, the audio and video will not be synchronized.

The filter graph manager does not synchronize live sources unless the application enables synchronization by calling the IAMGraphStreams::SyncUsingStreamOffset method. If synchronization is enabled, the filter graph manager queries each source filter for the IAMPushSource interface. For every filter that supports IAMPushSource, the graph calls IAMLatency::GetLatency to retrieve the filter's expected latency, and from these values determines the maximum expected latency. The graph then calls IAMPushSource::SetStreamOffset to give each source filter a stream offset, which that filter adds to the time stamps that it generates.
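
The following sketch shows an application enabling this behavior; pGraph is assumed to be an IGraphBuilder pointer for a graph that contains a live source.

    IAMGraphStreams *pGraphStreams = NULL;
    HRESULT hr = pGraph->QueryInterface(IID_IAMGraphStreams,
        (void**)&pGraphStreams);
    if (SUCCEEDED(hr))
    {
        // TRUE tells the graph to query the live sources for IAMPushSource,
        // determine the maximum expected latency, and set stream offsets.
        hr = pGraphStreams->SyncUsingStreamOffset(TRUE);
        pGraphStreams->Release();
    }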

Rate Matching

If a renderer filter schedules samples using one reference clock, but the source filter produces them using a different clock, glitches can occur in playback. If the renderer runs faster than the source, there will be gaps in the data. If the renderer runs slower than the source, samples will "bunch up," and at some point data will need to be dropped. Typically a live source cannot control its production rate, so instead the renderer matches clock rates with the source.

To perform rate matching, a renderer filter must adjust its rendering rate to match a clock other than its own. Which clock it should match depends on how the source filter generates time stamps.

The renderer calls IAMPushSource::GetPushSourceFlags on the source filter to determine how the source filter sets its time stamps. Currently, only the audio renderer performs rate matching, because glitches in audio playback are more noticeable than glitches in video.
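
As a sketch, a renderer could query these flags as follows, assuming pPin is the upstream output pin:

    IAMPushSource *pPushSource = NULL;
    HRESULT hr = pPin->QueryInterface(IID_IAMPushSource,
        (void**)&pPushSource);
    if (SUCCEEDED(hr))
    {
        ULONG flags = 0;
        hr = pPushSource->GetPushSourceFlags(&flags);
        // A value of zero indicates that the source stamps samples using
        // the graph's reference clock; flags such as
        // AM_PUSHSOURCECAPS_PRIVATE_CLOCK indicate that the source stamps
        // samples from its own clock.
        pPushSource->Release();
    }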