Telecom Made Simple: voice quality

Jitter | What Makes Voice over IP Quality Suffer

Jitter is the variation in delay that the receiver experiences. The user does not hear jitter directly, because phones employ a jitter buffer to absorb the variation. Jitter can be defined in a number of ways. One is the standard deviation or maximum deviation of the per-packet delay around the mean delay. Another uses the known transmission interval (such as 20ms): take the gaps between consecutive arrivals of packets that were not lost, subtract the known interval from each, and then take the standard deviation or the maximum deviation of the result. Either way, the jitter, expressed as a time or as a percentage of the mean delay, tells how variable the network is.
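As a rough illustration, the two definitions above can be sketched in Python. The function names and the sample timestamps are hypothetical, and the second function assumes each packet carries a sequence number so that gaps spanning a lost packet can be compared against the right multiple of the interval.

```python
import statistics

def delay_jitter(send_ms, recv_ms):
    """First definition: deviation of per-packet delay around the mean delay."""
    delays = [r - s for s, r in zip(send_ms, recv_ms)]
    mean_delay = statistics.mean(delays)
    std_dev = statistics.pstdev(delays)
    max_dev = max(abs(d - mean_delay) for d in delays)
    return mean_delay, std_dev, max_dev

def interarrival_jitter(arrivals, interval_ms=20.0):
    """Second definition: deviation of arrival gaps from the known interval.

    arrivals is a list of (sequence_number, arrival_time_ms) pairs for the
    packets that were actually received, in order. Lost packets are simply
    absent, so the expected gap is scaled by the sequence-number difference.
    """
    deviations = [
        (t_b - t_a) - (seq_b - seq_a) * interval_ms
        for (seq_a, t_a), (seq_b, t_b) in zip(arrivals, arrivals[1:])
    ]
    return statistics.pstdev(deviations), max(abs(d) for d in deviations)

# Hypothetical capture: packets sent every 20ms, received with varying delay.
send_ms = [0.0, 20.0, 40.0, 60.0]
recv_ms = [50.0, 72.0, 88.0, 110.0]
print(delay_jitter(send_ms, recv_ms))
print(interarrival_jitter(list(enumerate(recv_ms))))
```

The first function needs synchronized clocks on both ends, while the second needs only the receiver's clock, which is one reason the interval-based definition is often easier to apply in practice.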

Jitter is introduced by variable queuing delays within network equipment. Phones and PBXs are well known for having very regular transmission intervals. However, the intervening network may have variable traffic. As the queue depths change and the network loads fluctuate, and as contention-based media such as Wi-Fi links become congested with more devices, packets are forced to wait. Wireless networks are the biggest culprit for introducing variable delay into an enterprise private network. This is because wireless packets can be lost and retransmitted, and each retransmission usually takes on the order of a millisecond.

A jitter buffer's job is to sit on the receiver and prevent the jitter from causing an underrun of the voice decoder. An underrun is an awkward period of silence that happens when the phone has finished playing the previous packet and needs another packet to play, but one has not yet arrived. These underruns count as a form of error or loss, even if every packet does make it to the receiver, and loss concealment will work to disguise them. The problem is that, assuming no packets are dropped, an underrun must be followed by an increase in delay of the same amount, because the delayed packet holds up the line for the packets behind it.

Here, the value of the jitter buffer can be seen. The jitter buffer lets the receiver build up a slight delay in the output. If this delay is greater than the amount of actual jitter on the network, the jitter buffer will be able to smooth things out without underrunning.

In this sense, the jitter buffer converts jitter directly into delay. If the jitter becomes too large, the jitter buffer may have limited room, and start dropping earlier samples in the buffer to let the call catch up to be closer to real time. In this way, the jitter buffer can convert the jitter directly into loss.
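The tradeoff can be made concrete with a minimal sketch. It assumes a fixed-delay jitter buffer, which is only one of the designs a phone might use, and it treats any packet that arrives after its playout deadline as lost. The function name and the arrival times are hypothetical.

```python
def late_packets(arrival_ms, buffer_ms, interval_ms=20.0):
    """Count packets that miss their playout deadline with a fixed buffer.

    arrival_ms[i] is the arrival time of packet i. Playout of packet i is
    scheduled at arrival_ms[0] + buffer_ms + i * interval_ms, so buffer_ms
    is the extra delay the listener pays on every packet.
    """
    base = arrival_ms[0] + buffer_ms
    return sum(
        1 for i, t in enumerate(arrival_ms) if t > base + i * interval_ms
    )

# Hypothetical arrivals for packets sent every 20ms; packet 3 is held up.
arrivals = [50.0, 72.0, 88.0, 135.0, 131.0, 150.0]

for depth in (0.0, 10.0, 30.0):
    print(depth, late_packets(arrivals, depth))
```

With no buffering, several packets miss their slot. A deeper buffer trades those misses for added delay, and once the depth exceeds the worst-case jitter in the capture, no packet is late.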

Because jitter is always converted into delay first, then loss, it does not have a direct impact on the E-model by itself, but instead can be folded into the other measures. However, a complication arises because the user or administrator does not usually know the exact parameters of the jitter buffer. How many samples, or how much delay, will the jitter buffer take before it starts to drop audio? Does the jitter buffer start off with a fixed delay? Does it build up the delay as jitter forces it to? Or does it try to proactively build in some delay, which can grow or shrink as the underruns occur? These all have an impact on the E-model call quality.

As a result, a rule of thumb here is to match the jitter tolerance to the delay tolerance. The network, at least, should not introduce more than 50ms of jitter.

Handoff Breaks | What Makes Voice over IP Quality Suffer

Handoffs cause consecutive packet losses. As mentioned in our previous discussion on packet loss, the impact of a handoff glitch can become large. The E-model does not capture the annoyance of handoff breaks well, because it takes into account only the average burst length. Handoffs can cause burst loss far longer than the average, and these losses can delete entire words or parts of sentences.

Later chapters explore the details of where handoff breaks can occur. The two general categories are intratechnology handoffs, such as Wi-Fi access point to access point, and intertechnology handoffs, such as from Wi-Fi to cellular. Both kinds of handoff can cause losses lasting up to a second, and the intertechnology handoff losses can be potentially far higher if the line is busy or the network is congested when the handoff takes place.

The exact tolerance for handoff breaks depends on the mobility of the user, the density or cell sizes of the wireless technology currently in use, and the frequency of handoffs. Mobility tends to cut both ways: the more mobile the user is at the time of handoff, the more forgiving the user might be, so long as the handoff glitches stop when the user does. The density of the network base stations and the sizes of the cells determine how often a station hands off and how many choices a station has when doing so. These both add to the frequency of the glitches and the average delays the glitches see. Finally, the number of glitches a user sees during a call influences how they feel about the call and the technology.

There are no rules for how often the glitches should occur, except for the obvious one that the glitches should not be so many or for so long that they represent a packet loss rate beginning to approach a half of a percentage point. That represents one packet loss in a four second window, for 20ms packets. Therefore, a glitch of 100ms takes five packets, and so the glitch should certainly not occur more than once every 20 seconds. Glitches longer than that also run the risk of increasing the burst loss factor, and even more so run the risk of causing too many noticeable flaws in the voice call, even if they do not happen every few seconds. If, every two minutes, the caller is forced to repeat something because a choice word or two has been lost, then he would be right to consider that there is something wrong with the call or the technology, even though these cases do not fit well in the E-model.
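The arithmetic above can be checked with a short sketch. The function name and defaults are illustrative. The 0.5% loss ceiling and the 20ms packet size are the figures from the text.

```python
def min_seconds_between_glitches(glitch_ms, max_loss_rate=0.005):
    """Shortest glitch spacing that keeps average loss under max_loss_rate.

    A glitch of glitch_ms loses that much audio, so the glitch may occupy
    at most max_loss_rate of the time between one glitch and the next.
    """
    return glitch_ms / max_loss_rate / 1000.0

def packets_lost(glitch_ms, packet_ms=20.0):
    """Number of packets wiped out by a glitch of the given length."""
    return glitch_ms / packet_ms

for glitch_ms in (20.0, 50.0, 100.0):
    print(glitch_ms, packets_lost(glitch_ms), min_seconds_between_glitches(glitch_ms))
```

A single lost 20ms packet corresponds to a spacing of about four seconds, and a 100ms glitch (five packets) to about 20 seconds, matching the figures in the paragraph above.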

Furthermore, handoff glitches may not always result in a pure loss, but rather in a loss followed by a delay, as the packets may have been held during the handoff. This delay causes the jitter buffer (jitter is explained in Section 3.2.4) to grow, and forces the loss to happen at another time, possibly with more delay accumulated.

A good rule of thumb is to look for technologies that keep handoff glitches less than 50ms. This keeps the delaying effect and the loss effect to reasonable limits. The only exception to this would be for handoffs between technologies, such as a fixed-mobile convergence handoff between Wi-Fi and cellular. As long as those events are kept not only rare but predictable, such as that they happen only on entering or exiting the building, the user is likely to forgive the glitch because it represents the convenience of keeping the phone call alive, knowing that it would otherwise have died. In this case, it is reasonable to not want the handoff break to exceed two seconds, and to have it average around a half of a second.

Voice Communications

Voice communication is the transmission and reception of audio and other signals that can be represented within the frequency band used for voice signal transmission. Telephone systems transfer voice signals in a variety of forms by wire, radio, light, and other electronic or electromagnetic systems. These forms include analog and digital voice signals. Options for voice communications include different voice quality of service levels and voice privacy options.

Voice Quality
Voice quality is a measurement of the level of audio quality, often expressed as a mean opinion score (MOS). The MOS is a number that is determined by a panel of listeners who subjectively rate the quality of audio on various samples. The rating level varies from 1 (bad) to 5 (excellent). Good quality telephone service (called toll quality) has a MOS level of 4.0.

The first telephone systems used analog signals to represent the voice. To overcome the cumulative noise limitations of analog signal transmission, digital transmission systems were created. These digital transmission signals represent voice signals by discrete levels that can be recreated, eliminating the noise. As a result, in the 1960s, many telephone systems began to offer digital voice communications.

The first digital voice services converted (digitized) the analog voice signal to a 64 kbps digital signal. This 64 kbps digital channel (called a DS0) provided “toll quality” voice with a MOS of 4.0 or above.

Generally, there is a tradeoff between system efficiency (bandwidth used) and the level of voice quality. To gain system efficiency (to add more customers per interconnection line), some telephone systems compress the voice using speech-coding (data compression) technology. The first compressed voice service used adaptive differential pulse code modulation (ADPCM), which compresses the 64 kbps DS0 to 32 kbps.

Other compressed voice services have been developed that use low bit-rate standard or proprietary speech compression algorithms. These can further compress the 64 kbps DS0 to below 16 kbps or even 8 kbps.
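The bit rates above, and the efficiency gain they buy, follow from simple arithmetic. The sketch below assumes the usual telephony sampling rate of 8,000 samples per second, with 8 bits per sample for the DS0 and 4 bits per sample for 32 kbps ADPCM. Those figures are not stated in the text above. The link capacity of 1,536 kbps is purely an illustrative value.

```python
SAMPLE_RATE_HZ = 8000  # assumption: standard telephony sampling rate

def bitrate_kbps(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bit rate of an uncompressed or per-sample-coded voice channel."""
    return bits_per_sample * sample_rate_hz / 1000

def calls_per_link(link_kbps, call_kbps):
    """How many voice channels of call_kbps fit in a link of link_kbps."""
    return int(link_kbps // call_kbps)

ds0 = bitrate_kbps(8)    # 8 bits per sample
adpcm = bitrate_kbps(4)  # 4 bits per sample

LINK_KBPS = 1536  # illustrative link capacity, not from the text
for name, rate in (("DS0", ds0), ("ADPCM", adpcm), ("8 kbps codec", 8)):
    print(name, rate, calls_per_link(LINK_KBPS, rate))
```

Halving the bit rate doubles the number of calls the same line can carry, which is the efficiency side of the tradeoff. The cost in voice quality is not modeled here.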

Voice Privacy
Voice privacy is a process that is used to prevent unauthorized listening to communications by other people. Voice privacy involves coding or encrypting the voice signal with a key so that only authorized users with the correct key and decryption program can listen to the communication.

Digital systems are inherently more secure than analog systems because they can easily use an encrypted mode of operation. This encrypted mode of operation “scrambles” voice data before it is sent to other users in the network. The encryption uses a key (mask value) that is calculated from some form of secret data. When the voice data is received, it must be decrypted using the same mask value that was used to encrypt it. Although an interceptor may be capable of receiving the data signals, they cannot learn the true data value unless the secret number that was added to it is also known.
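The mask idea can be illustrated with a toy sketch. It assumes the mask is combined with the voice data by XOR (addition modulo 2) and derives the mask from the secret with SHA-256 in a simple counter construction. Both choices are assumptions made for illustration, the function names are hypothetical, and this is not the method of any particular telephone system nor something to use for real security.

```python
import hashlib

def keystream(secret: bytes, length: int) -> bytes:
    """Derive a mask of the requested length from the shared secret."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(secret + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:length])

def apply_mask(data: bytes, secret: bytes) -> bytes:
    """XOR the data with the mask. Applying it twice restores the data."""
    mask = keystream(secret, len(data))
    return bytes(d ^ m for d, m in zip(data, mask))

voice_frame = bytes(range(16))          # stand-in for digitized voice samples
secret = b"shared secret"               # hypothetical shared secret data

scrambled = apply_mask(voice_frame, secret)
restored = apply_mask(scrambled, secret)
print(restored == voice_frame)
print(apply_mask(scrambled, b"wrong secret") == voice_frame)
```

The same operation both scrambles and unscrambles, which is why the receiver must hold the same mask value as the sender. An interceptor without the secret sees only the scrambled bytes.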

While the telephone system can offer an encryption mode that encrypts the signaling between the end-user’s phone line and the telephone network, it is more common for end-users to maintain their own voice encryption system. This helps to prevent unauthorized access to the telephone system. It also allows the end-user to choose among many different voice encryption algorithms. The voice encryption algorithms are typically stored on the end-user’s telephone devices.
