Getting your bits in a row
Some form of organization is needed so the receiver can reassemble and
identify the assorted bits of information contained in a digital audio
data stream. Organization involves assembling the data into blocks.
Each block consists of 192 frames of audio. Each of the 192 frames can
be divided into two sub-frames for two-channel audio. Each frame is
produced at the digital audio sampling rate, so each frame contains one
digital value per channel. At a 48kHz audio sampling rate, each frame
lasts 20.833µs (microseconds) and each 192-frame block lasts 4ms
(milliseconds).
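The frame and block timing above follows directly from the sampling rate. The short Python sketch below works through that arithmetic; the function names are illustrative only and are not part of any standard.

```python
FRAMES_PER_BLOCK = 192  # from the text: one block is 192 frames


def frame_period_us(sample_rate_hz):
    """Duration of one frame in microseconds (one frame per sample period)."""
    return 1_000_000 / sample_rate_hz


def block_duration_ms(sample_rate_hz):
    """Duration of one 192-frame block in milliseconds."""
    return FRAMES_PER_BLOCK * frame_period_us(sample_rate_hz) / 1000


fs = 48_000  # the 48kHz sampling rate used in the text
print(f"frame: {frame_period_us(fs):.3f} us, block: {block_duration_ms(fs):.3f} ms")
```

At 48kHz this gives a frame of about 20.833µs and a block of 4ms.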
Figure 3. Pulse code modulation uses a two's complement system to distinguish positive and negative binary coded values with word lengths from eight to 24 bits.
Figure 4. Jitter is variations in the transition times of the clock waveform.
Figure 5. Jitter causes timing errors when the audio signal is reconstructed by the receiver. The receiver locks and regenerates a clock from the incoming digital audio signal.
Figure 6. Receiver-generated clock jitter is related to transmitter (sampling) jitter, interface transition variations and transmission line noise. These effects can be accumulative.
Each frame can carry two audio channels. In two-channel mode,
the samples from both channels are transmitted in consecutive
sub-frames. Channel 1 is in sub-frame A and channel 2 is in sub-frame
B.
In addition to the digital audio word data bits, each sub-frame
contains additional data. Each sub-frame consists of 32 bits, which
includes 20 or 24 bits of audio word data and at least eight bits of
additional data. Each sub-frame includes bits for preamble or sync
data, auxiliary data, audio data word bits, validity (V), user (U),
channel status (C) and parity (P) data bits. Considering that each
frame consists of 32 × 2 bits, occurring in 20.833µs (FS = 48kHz),
the bit rate increases to 1,536,000 × 2 = 3,072,000 bits per second.
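The sub-frame budget and the resulting bit rate can be checked with a few lines of Python. This is a sketch under stated assumptions: the four-bit preamble and the single V, U, C and P bits come from the text, while the four-bit width of the auxiliary field is taken from the usual AES3 layout (in 24-bit mode those four bits carry audio data instead).

```python
BITS_PER_SUBFRAME = 32
SUBFRAMES_PER_FRAME = 2

# Field widths for 20-bit audio. The 4-bit auxiliary width is an assumption
# drawn from the usual AES3 layout; the other widths follow the text.
SUBFRAME_FIELDS_20BIT = {
    "preamble": 4,
    "auxiliary": 4,
    "audio": 20,
    "validity": 1,
    "user": 1,
    "channel_status": 1,
    "parity": 1,
}


def bit_rate_bps(sample_rate_hz):
    """Serial bit rate: one 64-bit frame (two sub-frames) per sample period."""
    return BITS_PER_SUBFRAME * SUBFRAMES_PER_FRAME * sample_rate_hz


# The field widths must add up to exactly one sub-frame.
assert sum(SUBFRAME_FIELDS_20BIT.values()) == BITS_PER_SUBFRAME

print(bit_rate_bps(48_000), "bits per second at Fs = 48kHz")
```

Using the exact frame rate of 48,000 frames per second avoids the rounding that creeps in when the 20.833µs frame period is used instead.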
The first four bits of each sub-frame are preamble, or sync, bits.
These bits identify the start of a new audio block and of each
sub-frame. A “Z” sync bit arrangement marks the start of the
first frame in the 192-frame block. The sync word “Y” indicates the
start of every B sub-frame. The sync word “X” indicates the start of
all remaining frames. The sync bit arrangement is used by a digital
audio receiver to identify the start of the audio blocks and
sub-frames.
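The Z, X and Y rules can be expressed as a small lookup. The Python sketch below is illustrative only: it works on symbolic preamble labels rather than the actual bit patterns on the wire, and the function names are hypothetical.

```python
FRAMES_PER_BLOCK = 192


def preamble_for(frame_index, subframe):
    """Return the preamble label for a sub-frame.

    frame_index: position of the frame within its block (0 to 191).
    subframe: "A" or "B".
    """
    if subframe not in ("A", "B"):
        raise ValueError("subframe must be 'A' or 'B'")
    if subframe == "B":
        return "Y"  # every B sub-frame
    if frame_index % FRAMES_PER_BLOCK == 0:
        return "Z"  # first frame of the block
    return "X"      # all remaining frames


def find_block_starts(preambles):
    """Indices in a flat list of preamble labels where a new block begins."""
    return [i for i, label in enumerate(preambles) if label == "Z"]


# Two consecutive blocks, written out sub-frame by sub-frame.
stream = [
    preamble_for(frame % FRAMES_PER_BLOCK, sub)
    for frame in range(2 * FRAMES_PER_BLOCK)
    for sub in "AB"
]
print(stream[:6])
print(find_block_starts(stream))
```

A receiver does essentially the reverse of `preamble_for`: it watches for the Z label to find the block boundary, then counts sub-frames from there.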
Analyzing frequency accuracy
At the heart of any digital system is a clock. This is a
crystal oscillator or voltage-controlled crystal oscillator circuit.
The oscillator output determines the resulting audio sample rate and
audio data rate. A perfect circuit would run at exactly the desired
frequency, and each cycle of the clock waveform would be identical in
duration.
The clock isn't a perfect circuit, as the crystal is not
perfectly accurate. Crystal accuracy is described by a
parts-per-million (PPM) rating. This indicates the maximum number of
cycles the frequency may deviate for every one million cycles or hertz.
A typical crystal rating is ±20 PPM. If the crystal frequency were
1,000,000Hz (1MHz), the generated frequency would be within ±20Hz
(999,980 to 1,000,020). The 20 PPM rating scales with frequency. A
crystal of 2,000,000Hz could deviate ±40Hz, while a 3,000,000Hz crystal
could deviate ±60Hz and so on.
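The PPM arithmetic is a simple proportion, sketched here in Python. The function names are illustrative; the ±20 PPM rating and the example frequencies are the ones used in the text.

```python
def max_deviation_hz(nominal_hz, ppm):
    """Largest frequency error allowed by a PPM rating."""
    return nominal_hz * ppm / 1_000_000


def frequency_limits_hz(nominal_hz, ppm):
    """Lowest and highest frequency allowed by a PPM rating."""
    deviation = max_deviation_hz(nominal_hz, ppm)
    return nominal_hz - deviation, nominal_hz + deviation


for crystal_hz in (1_000_000, 2_000_000, 3_000_000):
    low, high = frequency_limits_hz(crystal_hz, 20)
    print(f"{crystal_hz} Hz at 20 PPM: {low:.0f} to {high:.0f} Hz")
```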
In digital audio terms, a crystal frequency of 12,288,000Hz is
commonly selected. This is 256x the ideal sample rate of 48,000Hz. A 20
PPM error at this frequency calculates to an error in frequency of
±246Hz. Because this is a maximum error, one would expect typical
operational errors in PPM or Hz to be much less.
In digital audio systems, some frequency error is tolerable
because the clock frequency is embedded into the audio data stream and
used to recreate a matching clock frequency by subsequent digital audio
equipment. However, good maintenance and troubleshooting practices
should include a frequency measurement of the digital audio signal,
including the sample rate frequency (Fs) and clock frequency (256x Fs).
Periodic measurement ensures that when trouble strikes, you know good
from bad.
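A measured frequency can be turned back into a PPM figure and compared with the crystal rating. The sketch below assumes the ±20 PPM rating discussed earlier as the pass limit; the two measured readings are hypothetical values chosen for illustration, not real measurements.

```python
NOMINAL_FS_HZ = 48_000
NOMINAL_CLOCK_HZ = 256 * NOMINAL_FS_HZ  # 12,288,000Hz


def ppm_error(measured_hz, nominal_hz):
    """Signed frequency error in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1_000_000


def within_tolerance(measured_hz, nominal_hz, tolerance_ppm=20.0):
    """True if the measured frequency is inside the PPM limit."""
    return abs(ppm_error(measured_hz, nominal_hz)) <= tolerance_ppm


# Hypothetical analyzer readings of the 256x Fs clock.
for reading_hz in (12_288_100.0, 12_288_300.0):
    error = ppm_error(reading_hz, NOMINAL_CLOCK_HZ)
    verdict = "OK" if within_tolerance(reading_hz, NOMINAL_CLOCK_HZ) else "out of tolerance"
    print(f"{reading_hz:.0f} Hz: {error:+.2f} PPM ({verdict})")
```

Logging the PPM figure rather than the raw frequency makes readings taken at Fs and at 256x Fs directly comparable.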
When multiple AES digital audio signals are created by separate clocks,
differences in clock frequencies and sync timing exist. These
differences present challenges to digital audio equipment designed to
switch between or process multiple inputs. To produce multiple AES
digital audio signals at the same frequency and timing, master clocks
or digital audio reference signals (DARS) can be used to synchronize
oscillators and sync timing.
A discussion of clock frequency and timing errors would not be
complete without talking about jitter. With a perfect square-wave
clock, each subsequent clock cycle would be identical in time, with the
positive and negative parts of each cycle the same duration. The clock
would be a symmetrical square wave with each of its transitions
occurring at exact time increments from the previous transition.
Again, the clock is not perfect. Clock cycles may fluctuate in
time with cycles being slightly shorter or longer than previous cycles.
Clock positive and negative times may be slightly longer or shorter
causing transitions to occur at slightly different intervals in time.
These variations are called jitter.
In a digital system, it's all about timing. Consider how these
timing variations can cause audio signal degradation. For example,
take a perfect, jitter-free clock digitally sampling a linearly rising
waveform during the analog-to-digital conversion, as shown in Figure 5.
If the waveform is reconstructed by a digital-to-analog converter
containing some clock jitter, the linear rising voltage is no longer
linear. The digital values correctly indicate the audio level as it was
sampled, but because the levels are incorrectly placed in time, the
resulting waveform is distorted by the jitter component.
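The effect can be sketched numerically. The Python example below samples a linear ramp with an ideal clock, then replays each value at a jittered instant and compares it with where the ramp actually is at that instant. Gaussian jitter, the ramp slope and the 1ns RMS figure are all assumptions made for illustration.

```python
import random


def reconstruction_errors(slope_v_per_s, sample_rate_hz, n_samples,
                          jitter_rms_s, seed=0):
    """Amplitude error of each sample when replayed by a jittered clock."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    period = 1.0 / sample_rate_hz
    errors = []
    for n in range(n_samples):
        ideal_time = n * period
        value = slope_v_per_s * ideal_time            # sampled with a perfect clock
        actual_time = ideal_time + rng.gauss(0.0, jitter_rms_s)  # jittered DAC edge
        ramp_now = slope_v_per_s * actual_time        # where the ramp should be
        errors.append(value - ramp_now)
    return errors


clean = reconstruction_errors(1000.0, 48_000, 96, jitter_rms_s=0.0)
jittered = reconstruction_errors(1000.0, 48_000, 96, jitter_rms_s=1e-9)
print("largest error, no jitter:", max(abs(e) for e in clean))
print("largest error, 1ns RMS jitter:", max(abs(e) for e in jittered))
```

Each error equals the ramp slope multiplied by the timing error, so a faster-changing signal suffers more from the same amount of jitter.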
Jitter occurs in a digital audio system at the transmitter from
the imperfect clock or crystal oscillator circuit. This is commonly
called transmitter or sampling jitter. The digital audio signal is also
adversely affected on the interface transmission line, which
contributes to jitter. This is commonly called interface jitter. These
jitter elements are cumulative as the digital audio is transmitted and
moves through a transmission line to a receiver.
Digital audio embeds the clock signal and sync transitions
within the serial digital audio data stream. It is up to the receiver
to regenerate an oscillator locked to the incoming digital audio. As in
any digital system, the data transitions from high to low have
crossover points. These transition points are used to lock and correct
the oscillator frequency in the receiver. Influences on these
transition points contribute to jitter within the receiver's clock.
One contributor of jitter is the data transmission line, better
known as the connecting cable. The cable's capacitance and frequency
response characteristics can cause waveform shaping and slight DC
balance shifts to the digital audio waveform. This causes slight delays
or advances of the transition points along the digital waveform input
to the receiver. This is interpreted at the receiver as jitter. Noise
can also be induced into the transmission line, which can further shift
the crossover points.
Measuring jitter is an important step to ensuring a quality
digital audio signal. Jitter may be measured by an AES digital audio
analyzer. Typical jitter measurements are displayed as small time
errors and expressed in nanoseconds or picoseconds. Jitter errors are
commonly expressed as an average RMS value to reduce the measurement
effects of randomly occurring peak jitter errors. The AES/EBU standard
specifies that jitter be less than ±20ns. However, it is desirable to
minimize jitter to much lower levels to optimize digital sound quality.
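The difference between an RMS and a peak jitter reading is easy to see with a handful of numbers. In the Python sketch below the timing errors are hypothetical values in nanoseconds, with one deliberately large outlier; the 20ns limit is the figure quoted in the text.

```python
import math

JITTER_LIMIT_NS = 20.0  # limit quoted in the text


def rms(values):
    """Root-mean-square of a list of timing errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def peak(values):
    """Largest absolute timing error."""
    return max(abs(v) for v in values)


# Hypothetical transition timing errors in nanoseconds (one outlier at 6.0).
timing_errors_ns = [0.5, -0.8, 0.3, -0.2, 6.0, -0.4, 0.6, -0.5]

print(f"RMS jitter:  {rms(timing_errors_ns):.2f} ns")
print(f"Peak jitter: {peak(timing_errors_ns):.2f} ns")
print("Within limit:", peak(timing_errors_ns) <= JITTER_LIMIT_NS)
```

The single outlier sets the peak reading but is diluted in the RMS figure, which is why RMS readings are steadier from one measurement to the next.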
Kropuenske is an application engineer with Sencore.