Streaming Audio

August 1, 2008

If you aren't already streaming audio from your radio station, you probably will be soon. Arbitron's Portable People Meter (PPM) is about to become currency in New York City (along with other major markets) and Arbitron is now measuring streaming audio use just like it measures over-the-air radio use. Obviously it's imperative to give any potential listener the best opportunity to listen to your station, and it's clear that station management in PPM markets is going to want to make sure online listeners are counted. But still, even if you are in a smaller market, you're undoubtedly just as concerned about reaching listeners in any way you can.

And I believe it is important that we as terrestrial broadcasters, with respect to competition from the satellite broadcasters and more and more Internet-only content providers, leverage our broadcast technical facilities, our expertise in generating programs and other content, and our native promotional facilities to stay in the lead. We've been at this broadcasting game a long, long time after all.

Hopefully I've gotten you to jump on the streaming audio bandwagon. If so, you'll soon realize there are many questions to ask:

  • How do I actually generate the stream?
  • What encoding software should I use?
  • At what data rate should I encode?
  • How many listeners should I be prepared to handle?
  • How much will this cost?
  • How do I accomplish spot-insertion, to make a business of it?

    A streaming review

    My first experience with streaming audio was at KKSF in 1998. It was the dot-com boom in San Francisco, and having a website and streaming audio was important so that KKSF would be recognized as being technically hip. It was fun to add a new way for listeners to get the station; it was amusing to read e-mails from around the country and world from people who were listening in. Via SNMP we could remotely look at our streaming server, seeing how many users were being served, and when. However, from a business standpoint, it wasn't much more than a curiosity. It cost a certain amount per month, and there was no return on investment. Even its promotional value was intangible.

    Orban Optimod PC1100

    Soon after the dot-com bust, and the recession that followed, union announcers who voiced spots that were now being heard over the Internet insisted they be paid more for the spot reads. That, in addition to poor economic conditions, caused many streams to be simply turned off. The first streaming audio era, which peaked during the dot-com boom, reached a nadir during the dot-com bust.

    The recession of the early 2000s came to an end, of course. Broadband Internet connectivity became practically ubiquitous, and interest in streaming audio made a comeback. This time around, though, it was taken more seriously. The union announcer issue still existed, and so stations that streamed audio generally prevented the spot blocks from being sent over the Web stream. Some stations inserted simple fill-music, some public service announcements. Some did a combination of both. However, before too long, the notion of monetizing the streams came back.

    How do I generate the stream?

    There is more than one way to do this, but I think it's important to reiterate at this point that you will generally make use of the same audio source you already use for terrestrial broadcasts. (This is one advantage that we have over the newcomers to the business. We already have the infrastructure.) The most common method for generating the stream is to make use of another PC with Windows Media Encoder. Audio is fed into the computer through the sound card. The most common way to improve the quality of the audio stream generated in this manner is to make use of a better-quality sound card. (This is where the disciplines of IT and what is more traditionally known as radio engineering really dovetail.) The audio feeding the encoder should also be processed separately from the on-air chain, with settings suited to the bit-reduced stream that will be created.

    Omnia A/X

    Orban makes the Optimod PC1100, which is a sound card and audio processor that goes in the PC itself. Control of this card's functionality is done via a GUI operating under Windows. All the processing is done on the card itself through the on-board DSP. The card includes both analog inputs/outputs as well as AES ins and outs. Install this card in the streaming PC and provide every listener with processed audio (for many of the same reasons we have to process audio for our terrestrial stations).

    Omnia makes a software version of its Omnia.3net, known as Omnia A/X. This software runs on a PC in conjunction with a separate sound card, using the CPU to carry out the processing algorithms. A GUI controls all the processing functionality. There again, if this software is installed on the streaming generator, all the streaming audio listeners will enjoy the benefits of processed audio.

    Neural Audio offers its Neustar 4.0, a single-rack unit audio processor specifically designed to run in front of bit-rate reduced codecs. (It can operate in a stand-alone fashion or in conjunction with audio processors.) Its stated design purpose is to reduce objectionable artifacts in the audio created by the encoding/decoding process. A software version is available as the Neustar SW4.0.


    Once the stream is generated, it will likely be connected to a location with a very high-bandwidth Internet connection. This connection can be via LAN, WAN or the Internet itself. The reason behind this is simple: The streaming server itself has to be where it can handle mountains of data. For example, back in 1998, at KKSF we generated the stream at the studio and then pulled the stream from our ISP where our streaming server PC was physically located. During the business day, we topped out at 300 users, each getting a 32kb/s stream for a total of nearly 10Mb/s. That kind of data rate into a radio station was unheard of in those days (and would be uncommon even now). All we had between us and our ISP was a single T1; nothing else was needed since our streaming server was located at the ISP where there was tons of bandwidth available.
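    To put numbers on the KKSF example, here is a minimal Python sketch of the arithmetic. The function name is mine, for illustration only, and the 1.544Mb/s figure is the standard T1 line rate. It assumes every listener receives a separate copy of the stream.

```python
def aggregate_bandwidth_mbps(listeners: int, stream_kbps: float) -> float:
    """Total outbound bandwidth, in Mb/s, when each listener gets their own stream."""
    return listeners * stream_kbps / 1000.0


T1_MBPS = 1.544  # standard T1 line rate

kksf_peak = aggregate_bandwidth_mbps(listeners=300, stream_kbps=32)
print(f"Peak outbound traffic: {kksf_peak:.1f} Mb/s")
print(f"Equivalent T1 lines:   {kksf_peak / T1_MBPS:.1f}")
```

    The peak works out to 9.6Mb/s, or roughly six T1s' worth of traffic, which is why the server sat at the ISP while the single T1 carried just the one stream from the studio.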

    Today, being connected to a single streaming server at one location that happens to have a fast Internet connection isn't enough. Today most large organizations handling streaming content make use of a content distribution network (CDN). A CDN is a network of servers that provide content to end users. The servers are located at diverse geographic locations, and share the content to be served via one or multiple connections between themselves. It's the job of the CDN to see that the end user gets content in the optimal fashion, from the server best suited for the particular job. In the case of KKSF in 1998, if we had lost our T1 connection, all Internet streaming would have been down. Likewise, had there been an issue with our streaming server or even with the network connections at the ISP, then all streaming would have been down. Back then it didn't matter too much, but today that would be unacceptable. Furthermore, the more hops the end user is away from the actual streaming server, the slower and less reliable that connection is likely to be. If the listener has to connect to router A to get to router B to get to router C to get to router D to finally get to the server, it's probable there will be slower and less reliable performance than if he can connect to the server through only two hops. The CDN works to optimize the performance between the server and you. At the same time it provides redundant server possibilities (getting around the KKSF problems described above). I'll talk about some large CDNs later.
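    The two ideas in that paragraph, fewer hops and redundancy, can be illustrated with a toy Python sketch. Everything here is hypothetical (the class, the server names, the hop counts), and it is not how any particular CDN works; real networks weigh many more factors than hop count.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdgeServer:
    name: str      # hypothetical label
    hops: int      # router hops between this server and the listener
    healthy: bool  # False if the server or its network link is down


def pick_server(servers: List[EdgeServer]) -> Optional[EdgeServer]:
    """Choose the healthy server with the fewest hops, or None if all are down."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.hops)


servers = [
    EdgeServer("west", hops=2, healthy=False),   # closest, but down
    EdgeServer("central", hops=4, healthy=True),
    EdgeServer("east", hops=7, healthy=True),
]
choice = pick_server(servers)
print(choice.name if choice else "no server available")
```

    With a single server, as at KKSF in 1998, the list has one entry and any failure leaves the listener with nothing. With several, the listener is simply handed the next-best server.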

    However, this common approach isn't the only way, as I wrote earlier. You could use hardware specifically designed to do nothing but stream audio. Stream On provides a service that directly communicates with Imediatouch and delivers a stream via Ogg Vorbis. The company provides a preconfigured Linux PC that requires only an audio source and an Internet connection.


    Which streaming encoder should I use? Obviously to make it easy for users to access your stream you should use an encoder that can be decoded by the most ubiquitous players. Hands down, the most ubiquitous player is the Windows Media player. That is not to say that you should avoid others; just make sure that, at a minimum, you accommodate that one.

    At what data rate should I encode? Here is where streaming audio varies radically from what we are familiar with as terrestrial broadcasters. Of course, if you look at the cost of running a 100kW radio station (for example) you'll note the cost is the same whether there is one listener or 200,000 listeners. Streaming audio costs are the opposite of that. You can look at the cost of streaming audio in terms of the number of bits used over some period of time. So, multiply the number of users or streams (call this factor A) by the number of bits per second (call this factor B) by the number of seconds (call this factor C) that each user stays connected. Since you multiply factor A by factor B by factor C to get the answer, it's obvious that making any one (or all) of them smaller will in turn reduce the product (A)(B)(C). Since you are trying to get as many users to stay connected as long as possible (assuming you're selling spots for your stream) then the obvious way to reduce the overall cost is by cutting down on the number of bits per second in each of the streams, right? But wait. There is a direct correlation between the quality that each user will enjoy and the data rate of the stream. The higher the data rate, the higher the quality. So if you reduce the quality of the product (if it sounds lousy) then it's likely you'll have fewer users. I won't pretend to tell you the best rate to encode; however, my experience is as follows: 128kb/s is excessive for 99 percent of the users; 64kb/s will sound very good for the vast majority of users; 48kb/s will be fine for most users. (By the way, I mean stereo encoding. If you stream at less than 48kb/s, then by all means encode your stream as a mono source.) If you expect many users to use dial-up modems to log on, then make sure you include an option for a slow stream.
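    The (A)(B)(C) product is easy to sketch in Python. The listener count and listening time below are made-up figures for illustration, not measurements; only the three data rates come from the discussion above.

```python
def total_bits(listeners: int, kbps: int, seconds: int) -> int:
    """Factor A (listeners) x factor B (bits per second) x factor C (seconds connected)."""
    return listeners * kbps * 1000 * seconds


def total_gigabytes(listeners: int, kbps: int, seconds: int) -> float:
    """The same product expressed in decimal gigabytes (8 bits per byte, 1e9 bytes per GB)."""
    return total_bits(listeners, kbps, seconds) / 8 / 1e9


# Assumed for illustration: 500 listeners, each connected for one hour.
LISTENERS = 500
SECONDS = 3600

for kbps in (128, 64, 48):
    gb = total_gigabytes(LISTENERS, kbps, SECONDS)
    print(f"{kbps:>3} kb/s: {gb:.1f} GB transferred")
```

    Under those assumptions, dropping from 128kb/s to 64kb/s halves the data transferred, from 28.8GB to 14.4GB, while 48kb/s brings it down to 10.8GB. Whether that saving is worth the loss in audio quality is the judgment call described above.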

    Neural Neustar SW4.0

    Running a business

    How many users should I expect? How much will this cost? How do I make a business of it? The number of users to expect of course depends primarily on the market size you are in. With the promotion of your stream over your terrestrial signal and on your website you can expect the number to grow over time. Make sure to remember there will be a peak number of listeners during office hours. Some CDNs charge on a per-user basis with a max number of listeners, some charge flat rates, some charge for a certain amount of throughput on a per-month basis.

    Another important cost consideration is the (relatively) new licensing fees in effect for streaming audio. The licensing fees you pay for the terrestrial station do not cover what goes out over your audio stream. Many of the CDNs have royalty payment plans that can be included in your monthly streaming expense.

    Making a business out of streaming audio and growing the station's revenue is really the name of the game (at least for those of us in the commercial radio space). This is not an article about the effectiveness of one type of ad versus another, and so I won't (and can't) say which one is best. However from an engineering standpoint, you should be prepared to accommodate ad replacement over the audio stream. This will likely require some changes in the technical facility.

    Ando Media offers an ad-replacement system that works through a PC that performs the streaming audio function as well as the spot blocking/ad replacement function. This streaming computer (living somewhere in your technical facility) communicates directly with the station's play-out system and thus knows when the spot block is playing, and when to play out the replacement spots. Liquid Compass is one large CDN that makes use of the Ando technology.

    Self-contained streaming systems provide a plug-and-play method of initiating a stream, like this Stream On appliance.

    Spacial Audio is yet another player in the ad-replacement game, and its system also works by placing a PC at the station. This PC plays the role of streaming encoder as well as the function of ad replacement. Again, this PC knows when to perform the spot replacement, by way of communications directly with the station's play-out system. Jetcast is one streaming provider that makes use of the Spacial Audio technology.

    Stream Audio also offers an ad-replacement system, but it works differently. The streaming encoder lives at the station, and by communicating with the play-out system, tags the elements that play in that stream. When the stream is received at the Stream Audio network operations center, the tags are read and interpreted, and at the appropriate point in time, an ad-streaming server (that lives in its NOC) substitutes the replacement spots. Conceptually, it's the same as the other systems I've talked about but the technology behind it is somewhat different.
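    The tag-and-substitute idea can be sketched conceptually in Python. This is not Stream Audio's (or any vendor's) actual implementation; the class, the tag values and the titles are all hypothetical. It only shows the logic: elements tagged as spots are swapped for replacement spots, with fill material used when the replacements run out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    title: str
    kind: str  # hypothetical tag written by the encoder: "song", "spot" or "fill"


def replace_spots(on_air: List[Element],
                  replacements: List[Element],
                  fill: Element) -> List[Element]:
    """Build the stream-side log: on-air spots are swapped out, everything else passes through."""
    pool = iter(replacements)
    stream = []
    for element in on_air:
        if element.kind == "spot":
            # Use the next replacement spot, or fall back to fill when none remain.
            stream.append(next(pool, fill))
        else:
            stream.append(element)
    return stream


on_air = [
    Element("Song A", "song"),
    Element("Local car dealer", "spot"),
    Element("Union-voiced national", "spot"),
    Element("Song B", "song"),
]
replacements = [Element("Stream-only sponsor", "spot")]
fill = Element("Fill music", "fill")

for element in replace_spots(on_air, replacements, fill):
    print(element.title)
```

    In the systems described earlier, this decision is made by the PC at the station; in the tagged approach, the same decision is made downstream at the network operations center.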

    So, as you can see, the vendors mentioned here all have similar technology, and all have common requirements when they're placing gear in your technical facility:

    • The necessary audio feed from your on-air studio

    • Communications link with your play-out system

    • Physical space in the plant for placement of the streaming computer

    • A high-speed, reliable network connection so the stream can be further distributed by way of the CDN you choose. Preferably this would be a WAN connection, but the Internet could be used as well.

    • A means by which remote technicians from the CDN can gain access to your network in order to provide technical support

    After your streaming audio gains some traction with an audience, or after you start encoding it with PPM, you will likely want to monitor the stream in some fashion to be sure the PPM can be decoded. In New York we've recently installed PPM decoders for our streaming audio, and we now monitor it constantly, along with our terrestrial signals.

    Entertaining an audience with streaming audio makes use of technology that is very different from what we've become accustomed to over the years. However, as more of our current audience, and hopefully a new audience, use us in this fashion, it's imperative we learn the techniques and gain the experience necessary to keep them as listeners. Our competitors are, and we can't afford to be left behind in the proverbial technological dust.

    Resource Guide

    Providers of streaming services, ad insertion and streaming processing


    Akamai Technologies

    Ando Media

    Broadcast Electronics

    End to End Technologies



    Liquid Audio

    Liquid Compass





    Real Networks

    Spacial Audio

    Stream Audio

    Stream Guys

    Stream On

    Stream the World


    Warp Radio

    Irwin is chief engineer of WKTU, New York.
