IP audio: The new audio format

April 1, 2009


By now, “IP audio” is a well-known term that has been widely written about, and I suspect many readers have a pretty good understanding of how it works. The concept of combining all the audio and control in a simple data packet is appealing on a number of levels: reduced studio and facility cabling, ease of originating programming outside the facility, fewer required components, and the ability to operate wirelessly over Wi-Fi, a broadband modem, etc., for quick deployment or as a permanent STL system. In addition, IP audio can be sent to a single location or broadcast to many.

The ability to transport audio and video over Ethernet/Internet has been around for years, so how is IP audio different? The simple answer is that there are similarities in terms of how the data is transported, but that's where it stops. Let's start at the beginning with the evolution of IP audio.

Of course, no discussion of “audio contribution over IP” would be complete without a quick review of the protocol that makes transmission possible (that would be the IP part). Internet Protocol defines a specialized data packet. The data to be transmitted to a specific destination is encapsulated in the IP packet. The packet can be viewed as a train that carries a payload, in this case the encoded audio and control data. The engine of the train is called the header, which contains specific information such as where the train is leaving from, where it is going, in what parts of the train the data is enclosed and how to handle that data once it arrives.
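To make the train analogy a little more concrete, here is a minimal Python sketch that packs and unpacks the fixed 20-byte IPv4 header. The function names and addresses are hypothetical, chosen only for illustration, and the checksum is left at zero for brevity, so this is not something you would put on a wire as-is.

```python
import socket
import struct

# Fixed 20-byte IPv4 header layout (no options), in network byte order.
IPV4_FORMAT = "!BBHHHBBH4s4s"

def build_ipv4_header(src, dst, payload_len, protocol=17):
    """Pack a minimal IPv4 header. Checksum is left at zero (illustration only)."""
    version_ihl = (4 << 4) | 5           # version 4, header length 5 x 32-bit words
    total_length = 20 + payload_len      # header plus payload, in bytes
    return struct.pack(
        IPV4_FORMAT,
        version_ihl, 0, total_length,
        0, 0,                            # identification, flags/fragment offset
        64, protocol, 0,                 # TTL, protocol (17 = UDP), checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )

def parse_ipv4_header(header):
    """Read back the fields the 'engine of the train' carries."""
    f = struct.unpack(IPV4_FORMAT, header[:20])
    return {
        "version": f[0] >> 4,
        "header_bytes": (f[0] & 0x0F) * 4,
        "total_length": f[2],
        "ttl": f[5],
        "protocol": f[6],
        "source": socket.inet_ntoa(f[8]),        # where the train is leaving from
        "destination": socket.inet_ntoa(f[9]),   # where it is going
    }

if __name__ == "__main__":
    hdr = build_ipv4_header("192.168.1.10", "192.168.1.20", payload_len=160)
    print(parse_ipv4_header(hdr))
```

In practice the operating system builds this header for you; the sketch only shows which pieces of information ride up front.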

The IP packet alone doesn't have an enormous amount of intelligence; it is just a means to transport data. Other protocols are combined with IP to steer the packets to their intended destination, acting essentially as the engineer of the train. The two common protocols used for this purpose are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). The major difference between them is how they communicate back to the originating network interface. When an IP packet is sent under TCP, confirmation of proper delivery is returned to the sender. If a packet does not arrive, the sender resends it until delivery is complete; this is known as a reliable or connection-oriented transport method. IP under UDP control simply sends the data and assumes it has arrived properly. If a packet fails to arrive, the system on the receiving end must have the proper means to handle the error; for this reason, UDP is called an unreliable or connectionless transport scheme.
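As a rough illustration of the connectionless behavior, the following Python sketch sends one UDP datagram to a socket on the same machine. The function name is hypothetical. Notice there is no connect/accept handshake and no acknowledgment: the sender hands the datagram to the network and moves on.

```python
import socket

def udp_loopback_roundtrip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one UDP datagram over the loopback interface and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        receiver.settimeout(timeout)        # UDP gives no delivery guarantee, so don't wait forever
        # No handshake and no acknowledgment: just send and assume it arrives.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(udp_loopback_roundtrip(b"audio frame"))
```

On a real network the `recvfrom` call could time out if the datagram were dropped, and nothing on the sending side would know about it.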

A better IP audio method

For the transport of IP audio, one may first think that TCP transport would be the only sensible option, but in reality, what we really want is UDP. Why? Assume we are sending the streaming content (audio or video) over a busy network (Ethernet, Internet, etc.). Under TCP control, many of the packets will need to be resent; while this is going on, most of the other packets are making it through, so the packets are now showing up in a different order than they were originally sent. TCP has to hold the later packets until the missing ones have been resent and put back in place. That causes a multitude of problems, including significant delay (latency) between the source and destination codecs.

While IP audio systems are gaining acceptance, many have proprietary protocols that prevent interoperation at this time.

The fact is that UDP is the common transport protocol used for streaming content, including IP audio. The algorithms in the codec handle any errors and provide the necessary means to make the streaming content meet the intended quality criteria.
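What that receiving-end error handling might look like can be sketched in a few lines of Python. This is only an assumption-laden illustration, not any manufacturer's algorithm: it assumes each packet carries a sequence number, puts late arrivals back in order, and fills any missing frame with zeros (which is silence for linear PCM). The function name and frame sizes are hypothetical.

```python
def reassemble(packets, first_seq, count, frame_bytes):
    """Reorder received (seq, payload) pairs and conceal losses with silence.

    Returns the reassembled audio bytes and the list of lost sequence numbers.
    """
    by_seq = {seq: payload for seq, payload in packets}
    silence = bytes(frame_bytes)          # zero-filled frame; silence for linear PCM
    frames = []
    lost = []
    for seq in range(first_seq, first_seq + count):
        if seq in by_seq:
            frames.append(by_seq[seq])    # packet arrived, possibly out of order
        else:
            frames.append(silence)        # packet never arrived: conceal, don't wait
            lost.append(seq)
    return b"".join(frames), lost

if __name__ == "__main__":
    # Packets 0, 2 and 3 arrived out of order; packet 1 was lost.
    received = [(2, b"CC"), (0, b"AA"), (3, b"DD")]
    audio, lost = reassemble(received, first_seq=0, count=4, frame_bytes=2)
    print(audio, lost)
```

Real codecs typically use a jitter buffer and more sophisticated concealment than plain silence, but the principle is the same: the receiver keeps the audio moving rather than waiting for a resend.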

Here is where it gets interesting. To overcome some of the challenges presented by sending data using UDP, manufacturers have proprietary protocols that permit a higher level of error control between their encoder and decoder. This may mean that units from different manufacturers might not work together.



That was until February 2008 when the European Broadcasting Union (EBU) finalized a standard for IP audio. The solution was simple: Just add another protocol on top of the UDP protocol. Remember that all things in the data communications world utilize a layered architecture. The Open Systems Interconnection (OSI) model is based on seven layers, although IP transmission only needs four. There has been much written on the subject, but the simple explanation is that encoded data flows through several different layers, each with a specific job. These layers take care of the creation, packaging, management and control of data. Together they handle everything, including getting the signal through the cable. The process is reversed on the receiving end.

The EBU used a protocol called RTP (Real-time Transport Protocol), a protocol standard created in the mid-1990s specifically for the purpose of sending audio, video and telephony over the Internet. It provides a complete set of tools for the transmission of a wide variety of multimedia formats and provides management functions that minimize some problems found with transmissions over UDP alone. The details of RTP are an article in themselves, but basically the data from a codec is packaged in the RTP packet layer and then handed to the UDP layer for transmission.
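For readers who want to see that packaging step, here is a minimal Python sketch of the fixed 12-byte RTP header defined in the RTP specification (RFC 3550), wrapped around a block of codec data. The function names, SSRC value and payload are hypothetical; header extensions, padding and CSRC lists are assumed unused.

```python
import struct

RTP_FORMAT = "!BBHII"   # flags, marker/payload type, sequence, timestamp, SSRC

def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=0, marker=False):
    """Prefix codec data with a fixed RTP header (version 2, no extensions)."""
    byte0 = 2 << 6                                        # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)    # payload type 0 = G.711 mu-law
    header = struct.pack(
        RTP_FORMAT, byte0, byte1,
        seq & 0xFFFF,                 # lets the receiver detect loss and reordering
        timestamp & 0xFFFFFFFF,       # sample clock, used to play audio out on time
        ssrc & 0xFFFFFFFF,            # identifies the stream source
    )
    return header + payload

def parse_rtp_packet(packet):
    """Split a packet back into header fields and codec payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(RTP_FORMAT, packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }

if __name__ == "__main__":
    frame = b"\xff" * 160             # 20 ms of G.711 at 8 kHz (placeholder bytes)
    pkt = build_rtp_packet(frame, seq=1, timestamp=160, ssrc=0x1234)
    print(len(pkt), parse_rtp_packet(pkt)["sequence"])
```

The resulting bytes are what would be handed to a UDP socket for transmission. The sequence number and timestamp are the pieces that plain UDP lacks.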

More options

Other protocols used for IP audio include:

  • SIP (Session Initiation Protocol) controls the setup, termination and flow of the session, similar to how a telephone network controls a call.
  • SDP (Session Description Protocol) provides information about the specific audio format to the destination. This permits the codec on the receiving end to match the transmitted format.
  • SNMP (Simple Network Management Protocol) provides control and monitoring of the equipment. While SNMP has been a time-tested protocol for this purpose, the EBU is currently working on an IP audio specific standard.
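To give a feel for how SDP lets the receiving codec match the transmitted format, here is a small Python sketch that reads the audio lines of a session description. The description itself is a made-up example (the address, port and session name are hypothetical), and the parser handles only the two line types shown, not the full SDP grammar.

```python
EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.10
s=IP audio contribution
c=IN IP4 192.0.2.10
t=0 0
m=audio 5004 RTP/AVP 9
a=rtpmap:9 G722/8000
"""

def parse_audio_sdp(sdp_text):
    """Extract the audio port, offered payload types and their encoding names."""
    info = {"port": None, "payload_types": [], "rtpmap": {}}
    for line in sdp_text.strip().splitlines():
        line = line.strip()
        if line.startswith("m=audio"):
            # m=<media> <port> <transport> <payload types...>
            parts = line[2:].split()
            info["port"] = int(parts[1])
            info["payload_types"] = [int(p) for p in parts[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            info["rtpmap"][int(pt)] = encoding
    return info

if __name__ == "__main__":
    print(parse_audio_sdp(EXAMPLE_SDP))
```

A receiver reading this description would know to expect RTP on port 5004 carrying G.722 and could configure its decoder accordingly.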

The EBU specifies four primary audio encoding formats: MPEG Layer 2, ITU G.711, ITU G.722, and PCM. However, RTP can also support a number of other current and future formats.
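The choice of encoding format has a direct effect on how much network capacity a stream needs. As a back-of-the-envelope sketch, the Python below adds the RTP, UDP and IPv4 header overhead (12 + 8 + 20 = 40 bytes per packet) to the codec bit rate. The packet durations are my assumptions for illustration, the function name is hypothetical, and Ethernet framing is not counted.

```python
def stream_bitrate_kbps(codec_kbps, packet_ms=20, header_bytes=40):
    """Return (payload bytes per packet, total kbps on the wire at the IP layer).

    header_bytes defaults to RTP (12) + UDP (8) + IPv4 (20); link-layer
    framing such as Ethernet is not included.
    """
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    total_bps = (payload_bytes + header_bytes) * 8 * packets_per_second
    return payload_bytes, total_bps / 1000

if __name__ == "__main__":
    # G.711: 8,000 samples/s x 8 bits = 64 kbps, assumed 20 ms per packet.
    print(stream_bitrate_kbps(64, packet_ms=20))
    # Linear PCM: 48 kHz x 16 bits x 2 channels = 1,536 kbps, assumed 5 ms per packet.
    print(stream_bitrate_kbps(1536, packet_ms=5))
```

Under these assumptions a 64 kbps G.711 stream occupies about 80 kbps once the headers are added, which is worth remembering when sizing an STL or remote link.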

As manufacturers all move toward the IP audio environment, where you only need to deal with power and data, I wonder if anyone will miss the days of punching down all those cables.


McNamara is president of Applied Wireless, Cape Coral, FL.


