Understanding bandwidth and streaming media production

An Introduction to the Basics of Bandwidth and Streaming Media Production

Understanding bandwidth is really quite simple, and it is necessary to have a fundamental grasp of what bandwidth is if you are creating streaming media files such as WMV.

The purpose of this document is to provide an easy-to-understand, general explanation of what bandwidth means, and how it relates to video production and content delivery. It is not a technical dissertation, and will therefore, for reasons of simplicity, use approximation and rounding in most calculations.

Basically, bandwidth is simply a measure of how much data can be transmitted through a connection over a given period of time.

For example, a 28.8 kbps dialup modem connection is much slower than a cable modem connection: over the same period of time, the cable modem can download more data. The cable modem therefore has a higher bandwidth connection than the 28.8 kbps dialup modem does.
It's like comparing a garden hose to a fire hose. More water can pass through the fire hose in a minute than through a garden hose. So, in an odd sort of way, you could say that the fire hose is a higher bandwidth hose than the garden hose.

File Size, Bit Rate, Bandwidth and Data Transmission

File size is measured in bytes. For example, a small image file might be 20K (K or KB is the abbreviation for kilobyte), or about 20,000 bytes in size. One kilobyte = about 1,000 bytes.

If a computer could receive 5,000 bytes per second or 5K bytes per second, it would take 4 seconds for that computer to receive a file 20K in length.

But bandwidth is not measured in bytes; it is measured in bits. One byte contains eight bits. So, if a computer can receive 5KB or 5,000 bytes in one second, another way to put it is to say that it can receive 40,000 (5,000 x 8) bits per second, or 40 kbps.

Data transmission measured in bits per second is called the bit rate, and is the measure of bandwidth. (It is often loosely called the baud rate, although strictly speaking baud measures signal changes per second rather than bits per second.) It is commonly measured in thousands of bits per second or kilobits per second. The abbreviation for kilobits per second is kbps or Kbps or simply k. One kilobit = about 1,000 bits.

If a computer connects to the Internet using a 56 kbps dialup modem, in theory, it means that the computer could receive 56,000 bits (56 kbps) per second. That would mean that the computer could receive about 7,000 bytes per second. Remember that a byte is 8 bits, so 56,000bps / 8 = 7,000 bytes. So, to receive the 20K image file, the 56 kbps dialup connection would require slightly less than 3 seconds to receive the file.
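
The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the same rounding as the text (1 kilobit = 1,000 bits, 1K = 1,000 bytes); the function names are made up for this example.

```python
def bytes_per_second(kbps):
    """Convert a bit rate in kilobits per second to bytes per second.

    Uses the rounded figures from the text: 1 kilobit = 1,000 bits,
    and 1 byte = 8 bits.
    """
    return kbps * 1000 / 8


def transfer_seconds(file_bytes, kbps):
    """Seconds needed to receive a file of file_bytes over a kbps connection."""
    return file_bytes / bytes_per_second(kbps)


# A 56 kbps connection running at its full theoretical speed.
print(bytes_per_second(56))                    # 7000.0 bytes per second
print(round(transfer_seconds(20_000, 56), 2))  # 2.86 seconds for a 20K file
```

The same function gives 4 seconds for the 20K file on the 40 kbps (5,000 bytes per second) connection described earlier.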

Data transmission measured in bits per second is the bandwidth of the connection. In this case the bandwidth is 56 kbps.

One thing must be noted: A 56 kbps modem cannot actually communicate at 56 kbps. In reality, connection speeds fall somewhere between 35 and 52 kbps, and a 56 kbps modem connection generally only provides 35-45 kbps of bandwidth.

Bandwidth and Streaming Media File Transmission

Computer video files are basically a number of still images called frames that are combined sequentially into one file. When the file is played, the player, Media Player for example, chugs through the file and displays each consecutive image in the same way that movie film rolling through a projector displays a movie.

When a file is streamed, frames are continuously delivered from the computer that is streaming the video to the computer that is playing it. Each frame is displayed as it is received.

Consider a computer connected to the Internet using a dialup modem. This is a skinny little, low bandwidth pipe to be sure. If the modem connected at 40 kbps, it would mean that it could receive about 5,000 bytes of data per second. If each frame of the video was only 5KB then the modem could only receive 1 frame per second. Commercial motion pictures are 24 fps (frames per second), and television is 30 frames per second. So, a 1 fps video is a very slow and choppy video.

But with a higher bandwidth connection, more frames per second could be received. With a 128 kbps ISDN connection for example, which can receive about 16,000 bytes per second, about three 5KB frames (16,000 / 5,000 = 3.2) could be delivered per second.
But, a 5K image or frame is not very big. A small 320x200, 16 bit JPG file can easily be 20K in size. So, for the beleaguered modem connected with only 40 kbps of bandwidth, it would take 4 seconds to receive only one frame of the video! At that rate, the video would degrade into a slide show, and not be a video at all.
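
The frame-rate figures in this section can be checked with a short Python sketch. It assumes, as the text does at this point, that every frame is sent whole and uncompressed-by-difference, and it uses the rounded units from earlier (1 kilobit = 1,000 bits). The function name is hypothetical.

```python
def frames_per_second(kbps, frame_bytes):
    """How many whole frames of frame_bytes a kbps connection can deliver per second."""
    bytes_per_second = kbps * 1000 / 8
    return bytes_per_second / frame_bytes


print(frames_per_second(40, 5_000))    # dialup modem, 5KB frames
print(frames_per_second(128, 5_000))   # ISDN, 5KB frames
print(frames_per_second(40, 20_000))   # dialup modem, 20K frames

# Seconds per frame is simply the inverse of frames per second.
print(1 / frames_per_second(40, 20_000))
```

At 40 kbps the 20K frame works out to a quarter of a frame per second, which is the 4 seconds per frame described above.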

This is the reason why many videos on news sites such as CNN, for example, are very small, as little as 160x120 pixels in dimension, and why a dialup Internet connection just doesn't have enough bandwidth to enjoy a very rich multimedia experience.

Video Compression and Key Frames

Just as static image files are compressed using various compression algorithms such as JPG, the video and audio data in streaming media files is compressed. This reduces the number of bytes in each frame, thus reducing the bandwidth required to deliver the video. While data compression helps considerably, another step is taken to reduce bandwidth requirements.

As mentioned previously, video files are a number of still images called frames that are combined sequentially into one file. Each frame is displayed at some given number of frames per second to create the illusion of movement. But many times there is no movement or change in the video between one frame and the next. A video demonstration of an application for example, may show the opening of a new window in the application, and then not change for several minutes while the audio narration explains the application.
If nothing changes, there is no reason to send a new frame of video data. The player can just sit there and display the same frame. This of course hugely reduces bandwidth requirements.

But consider a video that is demonstrating some application, and all that is changing in the video is the mouse pointer moving around the application as the author of the video points out different areas of the application by using the mouse pointer as a pointing device. Instead of sending the entire frame, only the changes to the new frame are sent. If the only change between one frame and the next is that the mouse pointer is in a different position, then the only changes to the image are restoring the area under where the mouse pointer was, and drawing the mouse pointer in its new position. The mouse pointer is very small and the number of bytes of video data that represents it is minimal, so very little video data needs to be transmitted to reflect the change between frames. Sending only the part of the frame that has changed can also greatly reduce bandwidth requirements.

It now becomes obvious that movement, because it causes changes from one frame to the next, increases the bandwidth requirements of the video. The more movement there is, the more area of the screen is changed, resulting in more video data that must be sent to update to the next frame. If the entire screen changed from one frame to the next, the entire frame would have to be sent.

Movement increases bandwidth requirements

There are two types of video frames, key frames and delta frames. Key frames contain all of the pixels that comprise the complete frame. Delta frames only contain what has changed from the previous frame. Key frames are placed in the video at regular intervals, either every so many seconds or so many frames. Camtasia Recorder for example defaults to 1 key frame every 80 frames. It looks something like this:

Key frame | delta frame | delta | delta | delta | key | etc…

If there is no change from one frame to the next, delta frames can contain 0 bytes of data. If the only change from one frame to the next is the movement of the mouse pointer, the delta frame would contain very little data. If the entire screen had changed, the delta frame would be as large as a key frame, as it would have to contain bytes of data representing every pixel in the frame.
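
The idea of key frames and delta frames can be illustrated with a toy Python sketch. This is not how WMV or any real codec stores its data; it simply assumes a frame is a flat list of pixel values, all frames have the same length, and a delta frame is a table of the pixels that changed. The names encode and decode are made up for this example.

```python
def encode(frames, key_interval=80):
    """Turn a list of frames into key frames and delta frames.

    A key frame holds every pixel. A delta frame holds only the
    positions whose value differs from the previous frame.
    """
    encoded = []
    previous = None
    for i, frame in enumerate(frames):
        if previous is None or i % key_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            changes = {pos: new for pos, (old, new)
                       in enumerate(zip(previous, frame)) if old != new}
            encoded.append(("delta", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild the full frames by applying each delta to the frame before it."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)          # copy the previous frame
            for pos, value in payload.items():
                current[pos] = value         # apply only the changed pixels
        frames.append(current)
    return frames


frames = [
    [0, 0, 0, 0, 0, 0],   # first frame, sent as a key frame
    [0, 0, 0, 0, 0, 0],   # nothing changed
    [0, 9, 0, 0, 0, 0],   # a "mouse pointer" appears
    [0, 0, 9, 0, 0, 0],   # the pointer moves one pixel
    [5, 5, 5, 5, 5, 5],   # the entire screen changes
]
encoded = encode(frames)
for kind, payload in encoded:
    print(kind, len(payload))   # number of pixel values carried by each frame
assert decode(encoded) == frames
```

In this sketch the unchanged frame carries no pixel values at all, the pointer movement carries two (the old position and the new one), and the full-screen change carries as many as a key frame.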

Frame Rate and Bandwidth

Frame rate may have a large effect on the bandwidth requirements of the video, or almost none at all. If there is a lot of change between frames, then the size of each frame is larger and more data must be transmitted for each frame. In this case, higher frame rates require increased bandwidth. But if there is little or no change between frames, then little or no video data is transmitted for each frame. So, depending on the content of the video, increasing the frame rate may have either a little or a great effect on how much bandwidth is required.

Network Congestion, Bandwidth Spikes and Buffering

Because streamed video is displayed as it is received, if for some reason the data stream is slowed or interrupted, the video will stop playing. Network congestion and other problems are fairly common, and to help ameliorate the interruption of the data stream, buffering is implemented.

Buffering works by storing a portion of the video locally, and then playing the video by retrieving data from the local buffer. Before the video starts playing, the player downloads some amount of the video and stores it locally. Generally this is not a large portion of the video, usually 10 seconds or so. It then plays the video by retrieving frames from this local buffer while continually downloading more of the video to keep the buffer full.

If the network becomes congested, or if the stream is interrupted for some reason, the player can continue playing from the buffer, and hopefully the interruption will be corrected before the buffer is depleted and the video stops playing.
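
A rough Python sketch of this behavior follows. It is a deliberately simplified model, not a description of how any particular player works: time advances in one-second steps, each step the network delivers some number of seconds' worth of video, and the player consumes exactly one second of video. The function name and the sample network pattern are assumptions made for illustration.

```python
def plays_without_stalling(delivered_per_second, prebuffer_seconds=10):
    """Return True if playback never runs out of buffered video.

    delivered_per_second lists, for each second of playback, how many
    seconds' worth of video the network delivered during that second
    (1 = keeping up, 0 = stream interrupted).
    """
    buffered = prebuffer_seconds
    for delivered in delivered_per_second:
        buffered += delivered
        if buffered < 1:
            return False      # buffer depleted, the video stops playing
        buffered -= 1         # the player consumes one second of video
    return True


# Five good seconds, an eight-second interruption, then five good seconds.
network = [1] * 5 + [0] * 8 + [1] * 5

print(plays_without_stalling(network, prebuffer_seconds=10))  # True
print(plays_without_stalling(network, prebuffer_seconds=5))   # False
```

With a 10-second buffer the eight-second interruption is absorbed; with only 5 seconds buffered the player runs dry before the stream recovers.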

Buffering can also help when encoding videos that contain spikes of high bandwidth. This can occur if something in the video suddenly requires more bandwidth. For example, in a video demonstrating an application, for many frames the only movement might be the mouse pointer moving about the screen. If the author of the video pushes a button in the application that causes a new window to open, the entire frame might change, requiring a large block of data to be transmitted to update the next frame. This, of course, causes a spike in the required bandwidth.

Modern media encoders take into account the extra time afforded by the buffer: while the extra data caused by the bandwidth spike is being delivered, the video can be played from the buffer, so playback is not interrupted. Increasing the amount of buffering time can make the difference between a successful and a failed encoding process.

Audio and Bandwidth

Audio is a very important consideration when streaming content over limited bandwidth. Audio requires bandwidth just as video does. The higher the quality of the audio, the more bandwidth it will consume.

That is why streaming media encoders such as Media Encoder always use compressed audio. Uncompressed audio gobbles up bandwidth. PCM uncompressed audio, 22.050 kHz, 16 bit mono for instance requires about 353 kbps of bandwidth (22,050 samples per second x 16 bits per sample), which is roughly 43 kilobytes per second. If you consider a dialup modem connected at 40 kbps, it is evident that any attempt to stream this is doomed to fail. Highly compressed, lower quality audio such as ACELP.net 8 kHz, mono for example requires only 5 kbps of bandwidth.
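
The bit rate of uncompressed PCM audio is simply the sample rate multiplied by the bits per sample and the number of channels. A minimal Python sketch, using the rounded 1 kilobit = 1,000 bits from earlier and a hypothetical function name:

```python
def pcm_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


modem_kbps = 40                  # the dialup connection used in the examples above
uncompressed = pcm_kbps(22_050, 16)
acelp_kbps = 5                   # ACELP.net 8 kHz mono figure quoted in the text

print(uncompressed)                   # 352.8 kbps
print(uncompressed * 1000 / 8)        # 44100.0 bytes per second
print(uncompressed <= modem_kbps)     # False: far too much for the modem
print(acelp_kbps <= modem_kbps)       # True: fits with room left for video
```

The 44,100 bytes per second figure is about 43 kilobytes per second when a kilobyte is counted as 1,024 bytes.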