Online Audio Processing


Mar 1, 2002 12:00 PM, By Cornelius Gould

It is generally advisable to use a separate processor when setting up a stream to rebroadcast your station over the Internet. With the recent AFTRA scare, many stations have resorted to splitting the over-the-air feed from a separate source for the Internet audience during commercial breaks.

Depending on the method used, stations may find themselves needing some kind of separate audio processing for webcasting operations. Many are faced with the question, "What should I do for Internet processing?"

There are all kinds of options available to you, ranging from just using an Aphex Compellor to employing one of the full-featured digital audio processors available from companies such as Omnia or Orban.

Audio processors designed for webcasting will produce the best sonic results.

What should you choose? It depends on what streaming system you're using. It will also depend on your format.

Digital reduction basics

Bit-rate reduction algorithms such as MPEG, Real Audio, Microsoft's MSV2 codec and others used for webcasting generally take the form of "lossy" data compression. Remember that a linear 16-bit, 44.1kHz stereo audio stream is about 1.4Mb/s. MPEG Layer 3 at 256kb/s therefore represents a compression ratio of about 5.5:1. At 128kb/s the ratio is approximately 11:1.
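The arithmetic behind these ratios can be checked with a short sketch. It assumes 16-bit samples (the word length is not stated above) and treats 1kb/s as 1,000 bits per second; the function names are illustrative only.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit-rate of an uncompressed (linear PCM) stream in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(linear_bps, coded_bps):
    """How many times smaller the coded stream is than the linear one."""
    return linear_bps / coded_bps


# Assumption: 16-bit stereo at 44.1kHz, the CD-style linear format.
linear = pcm_bit_rate(44100, 16, 2)

for coded in (256000, 128000, 28800):
    ratio = compression_ratio(linear, coded)
    print(f"{coded / 1000:g}kb/s -> about {ratio:.1f}:1")
```

Under these assumptions the linear stream comes to roughly 1.41Mb/s, which is the figure the 256kb/s and 128kb/s ratios are measured against.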

Through data reduction, data is removed to make the audio stream (or file) smaller. This data is gone forever. (A complete description of how masking works is covered in "How it works," BE Radio February 2002, page 64.)

The lower the bit-rate, the more artifacts are let through. This is why a 28.8kb/s MP3 Internet audio stream sounds so unnatural when compared to one at 128kb/s.

The critical area in any of these perceptual coding schemes is the high-frequency area. Our ears are most sensitive to what goes on in high-frequency areas, and it's pretty hard to manipulate data in this area without the change being noticeable.

The key to minimizing some of these effects is to keep the codec's input levels as close to zero as possible. It also helps to keep the upper spectrum free from clipping distortion or excessive high-frequency processing.

Most perceptual codecs were designed without audio processing and deliberate spectrum manipulation in mind, so in some ways, the use of audio processing is not the best approach. But the alternative is sloppy levels, which can be much worse.

Another option to consider is using a lower sample rate, as this will reduce the amount of high-frequency components the codec sees. This approach will usually result in better performance in suppressing artifacts.
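As a rough illustration of this trade-off, the sketch below lists the audio bandwidth available at a few sample rates, along with the average number of coded bits per sample at a fixed stream bit-rate. The bits-per-sample figure is only a crude way of looking at how hard the codec is being pushed; it is an assumption for illustration, not a measure any particular codec uses. The 32kb/s stereo stream is a hypothetical example.

```python
def codec_budget(sample_rate_hz, bit_rate_bps, channels=2):
    """Return (audio bandwidth in Hz, average coded bits per sample per channel)."""
    nyquist_hz = sample_rate_hz / 2  # highest frequency the stream can carry
    bits_per_sample = bit_rate_bps / (sample_rate_hz * channels)
    return nyquist_hz, bits_per_sample


# Hypothetical 32kb/s stereo stream at three common sample rates.
for rate in (44100, 22050, 11025):
    bandwidth, bits = codec_budget(rate, 32000)
    print(f"{rate}Hz: bandwidth to {bandwidth:.1f}Hz, "
          f"{bits:.2f} bits per sample per channel")
```

Halving the sample rate halves the bandwidth the codec has to describe and doubles the bits it can spend on each remaining sample.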

Practical solutions and caveats

A strong, quick solution to webcast processing is a general AGC such as the Aphex Compellor. However, a big weakness of such a device is its lack of consistent spectral balance over a wide range of material. About 50 percent of the programming content (for current music) will sound acceptable; the rest will fail due to the lack of proper spectrum management. Multiband audio processing is a must for a higher degree of quality control.

One drawback to using a spare multiband processor is that it was likely not designed for the Internet. Diode clippers and poor crossover design can make the sound worse.

Crossovers that are not dynamically flat are also an enemy of codecs. Many processors are designed using textbook crossover filters. These filters operate fine as long as the gain state is static, like a speaker crossover. However, when you change the gain relationship of the output of the crossover, many textbook filters will exhibit problems with peaks or notches forming in some parts of their passband due to the absolute phase relationship of the same frequencies appearing in the lowpass/highpass skirts of neighboring filters.

A high-quality sound card, preferably with digital inputs, is the best choice for good streaming audio quality.

In such a case, the phase angles of audio within the highpass/lowpass skirts will rotate and either be out of phase with or in phase with frequencies in the passband of neighboring filters. When this mess is all summed together after feeding through compressors, you have random peaks and notches floating around the audio spectrum.
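One textbook case makes the effect easy to see. The sketch below sums a second-order Butterworth lowpass/highpass pair, with both bands in the same polarity, after applying a separate gain to each band, as a two-band compressor would. This filter type is chosen purely for illustration and is not a model of any particular processor; the function names and the 1kHz crossover are assumptions.

```python
import math


def band_sum_db(freq_hz, crossover_hz, low_gain, high_gain):
    """Magnitude in dB of the summed bands at one frequency."""
    s = 1j * (freq_hz / crossover_hz)     # normalized complex frequency
    denom = s * s + math.sqrt(2) * s + 1  # shared Butterworth denominator
    lowpass = 1 / denom
    highpass = (s * s) / denom
    total = low_gain * lowpass + high_gain * highpass
    return 20 * math.log10(max(abs(total), 1e-12))  # clamp to avoid log(0)


def notch_frequency_hz(crossover_hz, low_gain, high_gain):
    """Frequency where the two bands cancel for this filter pair."""
    return crossover_hz * math.sqrt(low_gain / high_gain)


# High band pulled down 6dB, then 12dB: the notch moves with the gain ratio.
for high_gain in (1.0, 0.5, 0.25):
    notch = notch_frequency_hz(1000, 1.0, high_gain)
    depth = band_sum_db(notch, 1000, 1.0, high_gain)
    print(f"high-band gain {high_gain}: notch near {notch:.0f}Hz ({depth:.0f}dB)")
```

With equal band gains this pair cancels at the crossover frequency itself. As the high band is turned down, the cancellation point moves upward, which is the kind of roaming notch described above.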

Old analog multiband processors can suffer from another problem. Due to parts tolerances and changes in temperature and humidity, the left and right crossovers will most likely not match each other. The roaming notches will be different for the left and right channels.

These random, narrow-bandwidth peaks and notches roaming around the audio spectrum drive low-bitrate codecs crazy, contributing to the strange phased-out sound of many Internet channels that attempt to use multiband processing to some degree. This is especially true for any low-bitrate stereo encoding.

The DSP units produced by the leading audio processor manufacturers address all of these issues. But if you cannot afford one of these, knowing what you're up against can help you make sound judgement calls and nifty modifications to some old box lying in the corner of the shop.

As good as the weakest link

The first rule of sound cards is that not all sound cards are created equal. The biggest quality issue you will face is how well the sound card will function at the desired sample rate. A $3000 Net processor connected to a $20 sound card will yield unimpressive results.

Of concern here is how well the sound card filters frequencies above the Nyquist frequency of the sample rate in use for streaming. The Nyquist frequency is the highest analog frequency that can be converted to digital without severe problems. This point is exactly half of the sample rate being used. As an example, a 44.1kHz audio sample rate is capable of reproducing audio frequencies up to 22.05kHz.

Exceeding the Nyquist frequency generates digital aliasing distortion. Steep filters applied to the analog signal, called anti-aliasing filters, are used to ensure that no signal content beyond the Nyquist frequency is present at the input of the analog-to-digital converter.
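To see where an unfiltered tone ends up, the following sketch folds a frequency back into the range below the Nyquist frequency. It assumes an ideal sampler with no anti-aliasing filter at all, which is a worst case rather than a description of any real sound card; the function name is illustrative.

```python
def alias_frequency_hz(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal, unfiltered sampling."""
    nyquist_hz = sample_rate_hz / 2
    folded = tone_hz % sample_rate_hz  # sampling cannot tell these apart
    if folded > nyquist_hz:
        folded = sample_rate_hz - folded  # reflect back below the Nyquist frequency
    return folded


# A 15kHz tone fed to a converter running at 22.05kHz with no filtering.
print(alias_frequency_hz(15000, 22050))
```

In this example the 15kHz tone lands at 7.05kHz, squarely in the audible range and unrelated to the program material, which is why the filter ahead of the converter matters so much at reduced sample rates.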

This aliasing distortion accounts for most of the artifacts observed in the majority of webcasts. These aliasing products produce scratchy and/or metallic audio, high-frequency birdies or jingling noises heard on top of white noise sources (such as tape hiss or crowd cheers).

Choosing a sound card with support for your desired sample rate and with exceptional anti-aliasing filters is a must. Look to professional sound cards that support the sample rates you wish to use.

Another problem with most budget sound cards is poor design. The layouts on budget cards can introduce computer noise into the audio.

Use a sound card with a digital input. This will allow you to take the digital output of a DSP-based processor and feed your codec via a direct digital link. Alter the output sample rate of the DSP processor to match what is used online. A sample rate converter between the DSP box and the sound card can also be used.

Thanks to Rolf Taylor, a patient second set of ears as the author experimented with hundreds of different configurations to try to get the best webcast audio performance.

Gould is senior staff engineer for Infinity Broadcasting Corporation, Cleveland, OH.