Telecom Made Simple: Quality of Service

Local UE does not provide an IP BS Manager

The UE does not provide an IP BS Manager. The end-to-end IP QoS bearer service towards the remote terminal is controlled from the GGSN. The scenario assumes that the GGSN supports DiffServ functions, and the backbone IP network is DiffServ enabled. In this scenario, the control of the QoS over the UMTS access network (from the UE to the GGSN) may be performed either from the terminal using the PDP context signaling or from the SGSN by subscription data.

The IP QoS for the downlink direction is controlled by the remote terminal up to the GGSN. The GGSN will apply receiver control DiffServ edge functions and can reclassify the data (remarking the DiffServ Code Point = DSCP). This may affect the QoS applied to the data over the UMTS access (the TFT may use the DSCP to identify the data to be allocated to the PDP context).

In the scenario shown in the figure below, the end-to-end QoS is provided by a local mechanism in the UE, the PDP context over the UMTS access network, DiffServ through the backbone IP network, and DiffServ in the remote access network. The GGSN provides the interworking between the PDP context and the DiffServ function. This interworking may use information about the established PDP context, or be controlled from static profiles, or be controlled dynamically through other means such as proprietary or HTTP-based mechanisms. The UE is expected to be responsible for the control of the PDP context, but this may instead be controlled from the SGSN by subscription.

3GPP Concept of QoS

3GPP Standard TS 23.207 provides the framework for end-to-end QoS in GPRS and UMTS. The end-to-end QoS architecture is shown in the figure below. It describes the interaction between the TE/MT (Terminal Equipment/Mobile Terminal) Local Bearer Service, the GPRS Bearer Service, and the External Bearer Service, and how these together provide Quality of Service for the End-to-End Service.

It also describes the IP-level mechanisms necessary for providing end-to-end Quality of Service and the possible interaction between the IP level and the GPRS level, as well as between the application level and the IP level. It covers different architectural aspects of the end-to-end Quality of Service concept and architecture with varying levels of detail. In general, other specifications shall be referred to for further details; these other specifications enable the reader to acquire a full understanding of the end-to-end Quality of Service concept and architecture.

QoS Management Functions in the Network: to provide IP QoS end-to-end, it is necessary to manage the QoS within each domain. An IP BS (Bearer Service) Manager is used to control the external IP bearer service. Because the IP network uses different techniques from UMTS, the IP BS Manager communicates with the UMTS BS Manager through the Translation function. The standard also sets out the QoS management functions for controlling the external IP bearer services and how they relate to the UMTS bearer service QoS management functions.

QoS Conceptual Model: many different end-to-end scenarios may occur from a UE connected to a UMTS network. The following examples depict how end-to-end QoS will be delivered for a number of scenarios that are considered to be significant.

The Concept of QoS by ETSI

ETSI standard TS 102 250-2 v2.2.1 (2011) belongs to a series covering the QoS aspects for popular services in GSM and 3G networks. The series is divided into six parts, identified below:

• ETSI TS 102 250 Part 1 identifies QoS criteria for popular services in GSM and 3G networks. They are considered to be suitable for the quantitative characterization of the dominant technical QoS aspects as experienced from the customer perspective.

• ETSI TS 102 250 Part 2 defines QoS parameters and their computation for popular services in GSM and 3G networks.

• ETSI TS 102 250 Part 3 describes typical procedures used for QoS measurements over GSM, along with settings and parameters for such measurements.

• ETSI TS 102 250 Part 4 defines the minimum requirements of QoS measurement equipment for GSM and 3G networks.

• ETSI TS 102 250 Part 5 specifies test profiles which are required to enable benchmarking of different GSM or 3G networks both within and outside national boundaries.

• ETSI TS 102 250 Part 6 describes procedures to be used for statistical calculations in the field of QoS measurement of GSM and 3G networks using probing systems.

General Consideration: ETSI identifies QoS criteria for popular services in GSM and 3G. They are considered to be suitable for the quantitative characterization of the dominant technical QoS aspects as experienced from the customer perspective. The criteria are described by their name and a short description from the customer point of view.

Phases of Service from the Customer's Point of View

The figure shows the different phases (Quality of Service aspects) during service use from the customer’s point of view. The five QoS aspects are:

1. Network Availability: the probability that a telecommunications service can be offered to customers through the network infrastructure.

2. Network Accessibility: the probability that users can successfully register on the network so that it can provide telecommunication services. The network can only be accessed when it is available to the user.

3. Service Accessibility: the probability that the user can access the service they want to use. If the customer wants to use a service, the network operator should provide access to it as quickly as possible.

4. Service Integrity: describes the QoS during service use and covers elements such as the quality of the transmitted content, for example sound quality, video quality, and the number of bit errors in a transmitted file. Service integrity can only be measured once the service has been accessed successfully.

5. Service Retainability: describes the termination of services, whether in accordance with or against the will of the user. Examples of service retainability parameters are the call cut-off ratio and the data cut-off ratio.

Grade of Service

ITU-T Recommendation E.771 proposes network Grade of Service (GOS) parameters for current and evolving land mobile services. These parameters are defined, and their target values specified, assuming that the network and the network components are operating in their normal mode (i.e. are fully operational). Further, the parameters and their target values assume normal (as opposed to distress or emergency) traffic.

In this Recommendation, the following traffic GOS parameters are specified for mobile circuit switched services:

•Post Selection Delay: defined as the time interval from the instant the first bit of the initial SETUP message containing all the selection digits is passed by the calling terminal to the access signaling system until the last bit of the first message indicating call disposition is received by the calling terminal (ALERTING message in case of successful call).

•Answer signal delay: defined as the time interval from the instant that the called terminal passes the first bit of the CONNECT message to its access Signaling system until the last bit of the CONNECT message is received by the calling terminal.

•Call release delay: defined as the time interval from the instant the DISCONNECT message is passed by the user terminal which terminated the call to the access Signaling system, until the RELEASE message is received by the same terminal (indicating that the terminals can initiate/receive a new call).

•Probability of end-to-end blocking: defined as the probability that any call attempt will be unsuccessful due to a lack of network resources.

•Probability of unsuccessful land cellular handover: defined as the probability that a handover attempt fails because of lack of radio resources in the target cell, or because of a lack of free resources for establishing the new network connection. The failure condition is based either on a specified time interval since the handover request was first issued or on a threshold on signal strength.

User Perception of QoS vs Operational Performance in Practice

Why are there differences between measurements of QoSE (QoS Experienced by the user) and QoSD (QoS Delivered by the provider), even though measurements of QoS and of network performance do not contradict each other?

In practice, many factors influence the customer's perception of the QoS they receive from the provider.

In general, customers compare the quality of service they perceive with the quality they expect. Customer expectations are influenced by the rates they pay and by what they learn from the media and from books. In general, if customers feel a service is expensive, their expectations for service quality are high as well.

Providers operate telecommunications equipment, owned or rented, against performance standards called KPIs (Key Performance Indicators). The better the KPIs are defined, and the more realistic the service rates, the stronger the correlation between customer expectations of QoS and the performance of the telecommunications system.

To better understand the expectations of its customers, the provider must have good customer service. Customer service staff should have a very good understanding of operational performance as measured through Key Performance Indicators, and should also understand the relationship between customer complaints and those performance indicators.

The task of customer service is two-way. On the one hand, staff should be able to answer customer complaints properly, according to the technical conditions of operation. On the other hand, they should be able to give direction to the company by translating the customer's wishes into technical performance criteria.

Weaker providers, in general, neglect customer service. As a result, customers become frustrated and disappointed, because they feel their complaints were not answered correctly. The provider's engineers are also discouraged: they have operated the equipment in accordance with technical standards, yet their work is still judged poorly by a company that reads so many reports of customer disappointment.

QoSE (QoS Experienced by the User)

QoS experienced by users reflects the subjective point of view of a user in the particular circumstances they experience. Customer satisfaction is one of the driving factors for this type of QoS. In general, QoSE is described in nontechnical parameters. Telecom service providers can measure the level of QoSE by surveying their customers or by seeking advice and input from them. At this stage, a user combines personal experience with the expected technical quality of the service in use. In addition to technological aspects, several other factors affect the level of QoSE, such as the signing of the contract between the user and the service provider, the provider's ability to handle problems faced by customers, and the overall relationship between the customer and the service provider. QoSE is therefore quite difficult to measure, because several "hidden" factors are not easy to identify.

QoSD (QoS Delivered by the Service Provider)
QoSD reflects the level of QoS that has actually been achieved by the telecom service provider. QoSD can be used to test the ability of a telecommunications service provider to deliver the promised QoS.

Quality of Service with WMM-How Voice and Data Are Kept Separate

The first challenge is to address the unique nature of voice. Unlike data, which is usually carried over protocols such as TCP that are good at making sure they take the available bandwidth and nothing more, ensuring a continuous stream of data no matter what the network conditions, voice is picky. One packet every 20 milliseconds. No more, no less. The packets cannot be late, or the call becomes unusable as the callers are forced to wait for maddening periods before they hear the other side of their conversation come through. The packets cannot arrive unpredictably, or else the buffers on the phones overrun and the call becomes choppy and impossible to hear. And, of course, every lost packet is lost time and lost sounds or words.

On Ethernet, as we have seen, the notion of 802.1p or Diffserv can be used to give prioritization for voice traffic over data. When the routers or switches are congested, the voice packets get to move through priority queues, ahead of the data traffic, thus ensuring that their resources do not get starved, while still allowing the TCP-based data traffic to continue, albeit at a possibly lesser rate.

A similar principle applies to Wi-Fi. The Wi-Fi Multimedia (WMM) specification lays out a method for Wi-Fi networks to also prioritize traffic according to four common classes of service, each known as an access category (AC):

AC_VO: highest-priority voice traffic
AC_VI: medium-priority video traffic
AC_BE: standard-priority data traffic, also known as "best effort"
AC_BK: background traffic, that may be disposed of when the network is congested

The underscore between the AC and the two-letter abbreviation is a part of the correct designation, unfortunately. You may note that the term "best effort" applies to only one of the four categories. Please keep in mind that all four access categories of Wi-Fi are really best effort, but that the higher-priority categories get a better effort than the lower ones. We'll discuss the consequences of this shortly.

The access category for each packet is specified using either 802.1p tagging, when available and supported by the access point, or by the use of Diffserv Code Points (DSCP), which are carried in the IP header of each packet. DSCP is the more common method, because the per-packet tags do not require any complexity on the wired network, and are able to survive multiple router hops with ease. In other words, DSCP tags survive crossing any network equipment that is not aware of DSCP tags, whereas 802.1p requires 802.1p-aware links throughout the network, all carried over 802.1Q VLAN links.

There are eight DSCP tags, which map to the four access categories. The application that generates the traffic is responsible for filling in the DSCP tag. The standard mapping is given in Table 1.

Table 1: DSCP tags and AC mappings
DSCP	TOS Field Value	Priority	Traffic Type	AC
0x38 (56)	0xE0 (224)	7	Voice	AC_VO
0x30 (48)	0xC0 (192)	6	Voice	AC_VO
0x28 (40)	0xA0 (160)	5	Video	AC_VI
0x20 (32)	0x80 (128)	4	Video	AC_VI
0x18 (24)	0x60 (96)	3	Best Effort	AC_BE
0x10 (16)	0x40 (64)	2	Background	AC_BK
0x08 (8)	0x20 (32)	1	Background	AC_BK
0x00 (0)	0x00 (0)	0	Best Effort	AC_BE

There are a few things to note here. First is that the eight "priorities" (again, the correct term, unfortunately) map to only four truly different classes. There is no difference in quality of service between Priority 7 and Priority 6 traffic. This was done to simplify the design of Wi-Fi, in which it was felt that four classes are enough. The next thing to note is that many packet capture analyzers will still show the one-byte DSCP field in the IP header using the older TOS interpretation. Therefore, the values in the TOS column will be meaningless in the old TOS interpretation, but you can look for those specific values and map them back to the necessary ACs. Even the DSCP field itself has a lot of possibilities; nonetheless, you should count on only the previous eight values as having any meaning for Wi-Fi, unless the documentation for your equipment explicitly states otherwise. Finally, note that the default value of 0 maps to best effort data, as does the Priority 3 (DSCP 0x18) value. This strange inversion, where background traffic, with an actual lower over-the-air priority, has a higher Priority code value than the default best effort traffic, can cause some confusion when used; thankfully, most applications do not use Priority 3, and its use is not recommended here either.

A word of warning about DSCP and WMM. The DSCP codes listed in Table 1 are neither Expedited Forwarding nor Assured Forwarding codes, but rather use the backward-compatibility requirement in DSCP for TOS precedence. TOS precedence uses the top three bits of the DSCP to represent the priorities in Table 1, and assigns other meanings to the lower bits. If a device is using the one-byte DSCP field as a TOS field, WMM devices may or may not ignore the lower bits, and so can sometimes give no quality of service for tagged packets. Further complicating the situation are endpoints that generate Expedited Forwarding (EF) DSCP tags (with a code value of 46). Expedited Forwarding is the tag that devices use when they want to provide higher quality of service in general, and such devices will usually mark all quality-of-service packets as EF, and all best effort packets with a DSCP of 0. The EF code of 46 maps, however, to the Priority value of 5, which is a video category, not voice. Thus, WMM devices may map all packets tagged with Expedited Forwarding as video. A wireless protocol analyzer shows exactly what the mapping is through the value of the TID/Access Category field in the WMM header.
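The mapping can be sketched in a few lines of Python. This is only an illustration, assuming a device that looks at nothing but the top three bits of the 6-bit DSCP (the TOS-precedence interpretation); the function names are made up for this example and do not come from any WMM implementation.

```python
# Assumption: the device derives the priority from the top three bits of
# the 6-bit DSCP and ignores the lower bits. Real devices may differ.
PRIORITY_TO_AC = {
    7: "AC_VO", 6: "AC_VO",
    5: "AC_VI", 4: "AC_VI",
    3: "AC_BE", 0: "AC_BE",
    2: "AC_BK", 1: "AC_BK",
}


def _check_dscp(dscp: int) -> None:
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")


def dscp_to_tos(dscp: int) -> int:
    """One-byte TOS field value that an analyzer shows for a given DSCP."""
    _check_dscp(dscp)
    return dscp << 2


def dscp_to_priority(dscp: int) -> int:
    """Priority (0-7) taken from the top three bits of the DSCP."""
    _check_dscp(dscp)
    return dscp >> 3


def dscp_to_ac(dscp: int) -> str:
    """Access category for a DSCP under the top-three-bits assumption."""
    return PRIORITY_TO_AC[dscp_to_priority(dscp)]


if __name__ == "__main__":
    for dscp in (56, 46, 24, 16, 0):
        print(dscp, hex(dscp_to_tos(dscp)), dscp_to_priority(dscp), dscp_to_ac(dscp))
```

Running it shows the Expedited Forwarding case from the text: DSCP 46 has priority 5 and lands in AC_VI rather than AC_VO.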

This mapping can be configured on some devices. However, changing these values from the defaults can cause problems with the more advanced pieces of WMM, such as WMM Power Save and WMM Admission Control, so it is not recommended to make those changes. (The specific problem that would happen is that the mobile device is required to know what priority the other side of the call will be sending to it, and if the network changes it in between, then the protocols will get confused and not put the downstream traffic into the right buckets.)

Once the Wi-Fi device—the access point or the client—has the packet and knows its tag, it will assign the packet into one of four priority queues, based on the access categories. However, these queues are not like their wired Ethernet brethren. That is because it is not enough that voice be prioritized over data within the device; voice must also be prioritized over the air.

To achieve this, WMM changes the backoff procedure. Instead of each device waiting a random time less than some interval fixed in the standard, each device's access category gets to contend for the air individually. Furthermore, to get the over-the-air prioritization, higher quality-of-service access categories, such as voice, get more aggressive access parameters.

Each access category gets four parameters that together determine how much priority the traffic in that category gets over the air, compared to the other categories. The first parameter is a unique per-packet minimum wait time called the Arbitration Interframe Spacing (AIFS). This parameter is the minimum amount of time that a packet in this category must wait before it can even start to back off. The longer the AIFS, the more a packet must wait, and the more likely it is that a higher-priority packet will have finished its backoff cycle and started transmitting. The key about the AIFS is that it is counted after every time the medium is busy. That means that a packet with a very high AIFS could wait a very long time, because the amount of time spent waiting for an AIFS does not count if the medium becomes busy in the meantime. The AIFS is measured in units of the number of slots, and thus is also called the AIFSn (AIFS number).

The second value is the minimum backoff CW, called the CWmin. This sets the size of the contention window that the backoff counter for this particular AC must start with. As with pre-WMM Wi-Fi, the CW is not the exact number of slots that the client must wait, but the maximum number of slots that the packet must wait: the packet waits a random number of slots up to this value. The difference is that there is a different CWmin for each access category. The CWmin is still measured in slots, but communicated to the client from the access point as the exponent of the power of two that it must equal. This exponent is called the ECWmin. Thus, if the ECWmin for video is 3, then the AC must pick a random number between 0 and 2³ − 1 = 7 slots. The CWmin is just as powerful as the AIFS in distinguishing traffic, by making access more aggressive by capping the number of slots the AC must wait to send its traffic.

The third parameter is similar to the minimum backoff CW, and is called the CWmax, or the maximum backoff CW. If you recall, the CW is required to double every time the sender fails to get an acknowledgement for a frame. However, that doubling is capped by the CWmax. This parameter is far less powerful for controlling how much priority one AC gets over the other. As with the CWmin, there is a different CWmax for each AC.
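The relationship between the exponents, the contention window, and the capped doubling can be sketched as follows. This is a simplified illustration, not a model of a real MAC: the function names are hypothetical, and the "doubling" is written as CW → 2(CW + 1) − 1 so that the window stays of the form 2ⁿ − 1.

```python
import random


def cw_from_exponent(ecw: int) -> int:
    """Contention window in slots for an advertised exponent (ECWmin/ECWmax)."""
    return (1 << ecw) - 1


def next_cw(cw: int, cw_max: int) -> int:
    """Window after a failed transmission: doubled, but capped at CWmax."""
    return min(2 * (cw + 1) - 1, cw_max)


def pick_backoff_slots(cw: int, rng=random) -> int:
    """Random backoff drawn uniformly from 0..cw inclusive."""
    return rng.randint(0, cw)


def cw_progression(ecw_min: int, ecw_max: int, retries: int) -> list:
    """Window sizes seen over a run of consecutive failed attempts."""
    cw_max = cw_from_exponent(ecw_max)
    cw = cw_from_exponent(ecw_min)
    sizes = [cw]
    for _ in range(retries):
        cw = next_cw(cw, cw_max)
        sizes.append(cw)
    return sizes


if __name__ == "__main__":
    print("voice:      ", cw_progression(2, 3, 3))
    print("best effort:", cw_progression(4, 10, 7))
```

With the client defaults, voice goes from 3 to 7 and stays there, while best effort keeps growing until it reaches 1023.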

The last parameter is how many microseconds the AC can burst out packets, before it has to yield the channel. This is known as the Transmit Opportunity Limit (TXOP Limit), and is measured in units of 32 microseconds (although user interfaces may show the microsecond equivalent). This notion of TXOPs is new with WMM, and is designed to allow for this bursting. For voice, bursting is usually not necessary or useful, because voice packets come on regular, well-spaced intervals, and rarely come back-to-back in properly functioning networks.

The access point has the ability to set these four AC parameters for every device in the network, by broadcasting the parameters to all of the clients. Every client, thus, has to share the same parameters. The access point may also have a different set for itself. Some access points set these values by themselves to optimize network access; others expose them to the user, who can manually override the defaults. The method that WMM uses to set these values to the clients is through the WMM Parameter Set information element, a structure that is present in every beacon, and can be seen clearly with a wireless packet capture system. Table 2 has the defaults for the WMM parameters.

Table 2: Common default values for the WMM parameters for 802.11
AC	Client AIFS	Client CWmax	AP AIFS	AP CWmax	CWmin	TXOP limit (802.11b)	TXOP limit (802.11agn)
Background (BK)	7	2¹⁰ − 1 = 1023	7	2¹⁰ − 1 = 1023	2⁴ − 1 = 15	0μs	0μs
Best Effort (BE)	3	2¹⁰ − 1 = 1023	3	2⁶ − 1 = 63	2⁴ − 1 = 15	0μs	0μs
Video (VI)	2	2⁴ − 1 = 15	1	2⁴ − 1 = 15	2³ − 1 = 7	6016μs	3008μs
Voice (VO)	2	2³ − 1 = 7	1	2³ − 1 = 7	2² − 1 = 3	3264μs	1504μs
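To see how these defaults translate into over-the-air priority, the client-side values can be put into a small Python structure. The sketch below converts the TXOP limit into the 32-microsecond units mentioned earlier and computes a rough "average initial wait" in slots as AIFSN plus half of CWmin. That figure is a deliberate simplification introduced for this example: it ignores the fixed interframe gap, backoff freezing while the medium is busy, and collisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcParams:
    aifsn: int     # AIFS in slots
    ecw_min: int   # exponent for CWmin
    ecw_max: int   # exponent for CWmax
    txop_us: int   # TXOP limit in microseconds (802.11a/g/n column)


# Client-side defaults from the table above.
CLIENT_DEFAULTS = {
    "BK": AcParams(aifsn=7, ecw_min=4, ecw_max=10, txop_us=0),
    "BE": AcParams(aifsn=3, ecw_min=4, ecw_max=10, txop_us=0),
    "VI": AcParams(aifsn=2, ecw_min=3, ecw_max=4, txop_us=3008),
    "VO": AcParams(aifsn=2, ecw_min=2, ecw_max=3, txop_us=1504),
}


def txop_units(params: AcParams) -> int:
    """TXOP limit expressed in units of 32 microseconds."""
    return params.txop_us // 32


def mean_initial_wait_slots(params: AcParams) -> float:
    """Rough average wait on a first attempt: AIFSN plus mean backoff."""
    cw_min = (1 << params.ecw_min) - 1
    return params.aifsn + cw_min / 2


if __name__ == "__main__":
    for name, params in CLIENT_DEFAULTS.items():
        print(name, txop_units(params), mean_initial_wait_slots(params))
```

Under these assumptions voice waits about 3.5 slots on average, video 5.5, best effort 10.5, and background 14.5, which is the ordering the access categories are meant to produce.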

Quality-of-Service Mechanisms and Provisioning

There are a few common ways for quality of service to be provided in networks, using enterprise-grade wireline infrastructure. The concepts all stem around handling the packets differently when it comes to queuing. Why? Most wireline networks can handle a fairly large amount of traffic, because the wireline technologies, such as Gigabit Ethernet, have enough throughput to make congestion be less of an issue. However, certain protocols are designed to take up as much bandwidth as they can—to specifically expand into the space that you give them. It will always be important on voice mobility networks to keep the voice traffic protected from these applications, especially if they cause changes in delay. Moreover, network congestion can cause loss rates to become problematic. All of the problems happen to the packets not as they are on the wire, but as they back up in queues within the choke points of the network, the routers or switches that connect the links together. What happens in those queues makes the difference.

Thankfully, using the packet classification capabilities from differentiated services, enterprise-grade wireline infrastructure can be used to both police flows that get out of hand and give the ones that are being squeezed out the help they need. These techniques go under the broad category of queuing disciplines, as they provide the discipline that is used to maintain order in the queues.

The idea is to take what was once one monolithic queue for the chokepoint, and to create possibly different queues, each queue leading to the same eventual chokepoint. As traffic heads towards the bottoms of the queues, an element called a scheduler chooses from which queues to take packets, and then provides those packets for transmission.

We'll take queuing disciplines and scheduling together for this discussion.

FIFO

The simplest behavior is to do no particularly new behavior at all. First-in, first-out (FIFO) queuing refers to using the one queue that is there, and to putting packets in with the same order in which they arrived, and pulling them out the same way. This sort of queuing is precisely what causes congestion and variable delays.

For the purposes of voice, the longer the queue gets, the longer the potential maximum delay the queue can cause the voice packet to suffer. The alternative is not much better: if the queue gets longer than it can handle, the packets will be dropped.

Classification

The first step is to determine whether there is any structure in the packets that can be used to differentiate them. Enterprise-grade classification techniques can use a wide, rich array of properties about the individual packet, including the sender, receiver, size, DSCP value, ports, applications, and routes. These can all be applied in a stateless manner, meaning that the router or switch need look at each packet only in isolation. An additional option exists for some routers and switches with a lot of memory and processing ability. They can use flow state to create stateful classification, in which previous packets that are related to the current one dictate the behavior. This distinction is identical to that used in firewalling. Once packets are classified, they can be placed into queues by their classes. These queues can be administratively created, or they can be created on the fly based on the class divisions, ensuring that packets from each class stay in separate queues.
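A stateless classifier of this kind can be sketched as a function that looks at one packet at a time. Everything specific here is hypothetical: the `Packet` fields, the class names, the set of DSCP values treated as voice, and the ports treated as bulk traffic are illustrative policy choices, not anything a particular router uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dst_port: int
    dscp: int
    size: int


# Hypothetical policy values chosen only for this example.
VOICE_DSCPS = {46, 48, 56}
VIDEO_DSCPS = {32, 40}
BULK_PORTS = {20, 21}


def classify(pkt: Packet) -> str:
    """Stateless classification: the decision uses this packet alone."""
    if pkt.dscp in VOICE_DSCPS:
        return "voice"
    if pkt.dscp in VIDEO_DSCPS:
        return "video"
    if pkt.dst_port in BULK_PORTS:
        return "bulk"
    return "default"


def sort_into_queues(packets):
    """Place each packet into the queue named after its class."""
    queues = {}
    for pkt in packets:
        queues.setdefault(classify(pkt), []).append(pkt)
    return queues
```

A stateful classifier would additionally keep a table keyed by flow, so that earlier packets of the same flow could influence the class of later ones.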

Class-based queuing (CBQ) is an extension of this basic concept. Instead of having one level of discrimination, the concept can be extended to a hierarchy of queues, all set up by the administrator. This hierarchy can be powerful in preventing flows and users from stepping on each other, and for shaping the bursts and behavior of the traffic. Traffic shaping is a highly important function for variable bitrate, expansive applications, to prevent them from overwhelming other applications that may not deserve the highest prioritization, but still need to be metered.

Once the packets are classified into sibling queues, the schedulers need to be selected, to determine how to get the packets out of the queues.

Round-Robin

The simplest scheduler is the round-robin scheduler. As the name suggests, the round-robin scheduler takes packets in turn from each queue, wrapping around when it hits the last one. Empty queues get skipped over, but otherwise, everyone gets a shot.

Round robin is good for creating packet fairness, where every class gets an equal shot at sending a packet. However, if some of the classes should have a higher priority than the others, then round robin will not suffice.
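A minimal round-robin scheduler, written as a Python generator, might look like the following. The function name and the use of plain lists of packet labels are choices made for this sketch.

```python
from collections import deque


def round_robin(queues):
    """Yield packets by visiting each queue in turn, skipping empty ones."""
    queues = [deque(q) for q in queues]
    while any(queues):
        for q in queues:
            if q:
                yield q.popleft()


if __name__ == "__main__":
    print(list(round_robin([["a1", "a2", "a3"], [], ["c1"]])))
```

Each non-empty queue gives up one packet per pass, regardless of packet size or class importance, which is exactly the packet fairness (and the lack of prioritization) described above.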

Strict Prioritization

Strict prioritization is a very simple scheduler. Classes are ordered, strictly, from highest to lowest. The scheduler always starts with the highest queue. If there are no packets in the highest-priority queue, it checks the one with the next highest priority. This continues until the scheduler finds a packet, which it then sends.

By draining the highest-priority queue before moving onto the others, strict prioritization ensures that the traffic with the highest prioritization moves right to the head of the line. Even if the lower-priority queues are heavily backed up and congested, if the highest-priority queue is empty and a highest-priority packet comes in, it will move right past the long lines and be sent first.

Strict prioritization is often good enough for voice, especially when the issue is preventing data from competing with voice. However, for elastic or variable applications where one should get more resources than the other, but not too much more, strict prioritization will not suffice either.
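Strict prioritization is simple enough to sketch directly. The class below is an illustration with made-up names; index 0 is taken to be the highest-priority queue.

```python
from collections import deque


class StrictPriorityScheduler:
    """Always serve the highest-priority non-empty queue first."""

    def __init__(self, levels: int):
        # Index 0 is the highest priority (a convention for this sketch).
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority: int, packet) -> None:
        self.queues[priority].append(packet)

    def dequeue(self):
        for q in self.queues:
            if q:
                return q.popleft()
        return None  # nothing to send


if __name__ == "__main__":
    s = StrictPriorityScheduler(levels=2)
    s.enqueue(1, "data1")
    s.enqueue(1, "data2")
    s.enqueue(0, "voice1")
    print(s.dequeue(), s.dequeue(), s.dequeue())
```

Even though the voice packet arrives after two data packets, it is sent first. The flip side is that a steady stream of high-priority packets would starve the lower queues entirely.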

Weighted Fair Queuing

To provide a sense of both prioritization, of which strict prioritization may provide too much, and fairness, of which round robin may provide too little, there is the notion of fair queuing. In fair queuing, the goal is to provide a fair bitrate to each of the classes. Round robin provides a fair packet rate, which is the same only if the packets are all the same size. On top of fair queuing, however, the bitrate should be adjustable so that higher-quality flows get more throughput, without exhausting all the throughput available. This is the concept of weighted fair queuing (WFQ).

The idea behind WFQ is that each queue gets a relative weight. That relative weight is used to adjust the data rate that the queue gets. The amount of traffic that the queue gets is always based on how many other queues are active and for how long; the goal is not to tightly control throughputs or to ensure that no one queue gets ahead of the other, but that queues with equal amounts to send get their weight's worth of relative throughput.

The scheduler's goal is to give the appearance that each queue with a byte in it has a byte taken out fairly (as if, say, by round robin, though order does not matter). This gives rise to thinking about packets flowing through the queues like fluids. The output requires a given data rate, or velocity, and each of the packets are extruded through their queues a little at a time, in equal amounts. The first packet out, then, would be the one whose last byte gets drawn out first—that is, the one that finishes first. The problem, of course, is that packets are packets, not bytes, and cannot be drawn out in this manner.

What can be done is that the scheduler can do the math that simulates the bit-by-bit extraction, and make sure to dequeue packets, then, in that proportion. The scheduler calculates the expected time the packet at the end of each queue would get drawn out, in units of virtual time, that don't depend on real time but still flow forward. This gives the precise order of the packets that should come out. As a packet comes out, the new packet's virtual end time is calculated, and so on. This technique ensures that packets flow out in the order they should.

The weightings come into play by adjusting the velocity, in virtual time, that a queue extrudes its packets. Higher-weighted queues extrude packets more quickly, and thus those packets finish more quickly in virtual time, and hit the wire sooner.

It is important to observe that WFQ is a work-conserving process. Work conservation means that the scheduler never delays sending traffic. If there is a packet to send, in any queue, then at least one packet will be sent. At no time will a work-conserving process refuse to send traffic, or delay sending traffic, in hopes of getting a more even throughput. Work conservation is important for not wasting network resources for the sake of "quality."
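The virtual-time bookkeeping can be sketched in Python. Note the substitution: instead of simulating the exact bit-by-bit system, this sketch advances virtual time to the finish tag of the packet just sent, a common simplification sometimes called self-clocked fair queuing. Class and method names are hypothetical.

```python
import heapq
from itertools import count


class WeightedFairQueue:
    """Approximate WFQ: send packets in order of virtual finish time."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.last_finish = {name: 0.0 for name in self.weights}
        self.virtual_time = 0.0
        self._heap = []
        self._seq = count()  # tie-breaker: equal tags keep arrival order

    def enqueue(self, queue: str, size_bytes: int, packet) -> None:
        # A packet starts after the queue's previous packet finishes, or
        # at the current virtual time if the queue had gone idle.
        start = max(self.virtual_time, self.last_finish[queue])
        finish = start + size_bytes / self.weights[queue]
        self.last_finish[queue] = finish
        heapq.heappush(self._heap, (finish, next(self._seq), packet))

    def dequeue(self):
        # Work-conserving: if any packet is waiting, one is always sent.
        if not self._heap:
            return None
        finish, _, packet = heapq.heappop(self._heap)
        self.virtual_time = finish  # self-clocked approximation
        return packet


if __name__ == "__main__":
    wfq = WeightedFairQueue({"voice": 2, "data": 1})
    for i in (1, 2, 3):
        wfq.enqueue("data", 200, f"d{i}")
    for i in (1, 2, 3):
        wfq.enqueue("voice", 300, f"v{i}")
    print([wfq.dequeue() for _ in range(6)])
```

In this run the voice queue has twice the weight, so its larger packets still finish earlier in virtual time and interleave ahead of the data packets, without ever blocking them completely.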

Traffic Shaping

Traffic shaping is more severe than fair queuing. Whereas fair queuing is concerned with fairness, traffic shaping is concerned with ensuring that a precise rate of traffic is met by a given class.

Traffic shaping is usually performed through the use of some form of token bucket, first mentioned in the context of RSVP. To recap, the idea of a token bucket is that virtual tokens, corresponding to permission to send bytes, are deposited into the virtual bucket corresponding to the queue at a fixed rate. This rate is the goal at which traffic should be sent. The token bucket then requires that a packet from the queue have enough tokens before it can be let past. This requirement ensures a constant bit rate to the flow.

Token buckets are general ways of metering the flow of traffic. Using them to shape traffic, by holding up packets until there are enough tokens for them, is clearly not work-conserving, as the hold up will happen regardless of whether the line will go idle because of it. On the other hand, token buckets have a bucket depth for a reason. If traffic does happen to go idle for a while in the queue that owns the tokens, the queue is allowed to save up its backlog of tokens for when it might need it. Once the traffic resumes, it can use up all of its saved tokens without waiting. This allows for the average traffic rate to be more manageable, even if the incoming flow is not perfectly regular.
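A token bucket shaper can be sketched as follows. This is a minimal illustration using simulated timestamps rather than a real clock. The class name `TokenBucketShaper` and its parameters are hypothetical, and it assumes no packet is larger than the bucket depth.

```python
class TokenBucketShaper:
    """Token bucket that delays packets until enough tokens exist."""

    def __init__(self, rate, depth):
        self.rate = rate      # tokens (bytes) deposited per second
        self.depth = depth    # maximum tokens the bucket may hold
        self.tokens = depth   # assumption: the bucket starts full
        self.last = 0.0       # time up to which tokens are accounted for

    def release_time(self, now, size):
        """Return the earliest time a packet of `size` bytes may leave."""
        # A packet cannot leave before the one queued ahead of it.
        now = max(now, self.last)
        # Deposit the tokens earned since the last update, capped at depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return now
        # Not enough tokens: hold the packet until the shortfall is earned.
        wait = (size - self.tokens) / self.rate
        self.tokens = 0.0
        self.last = now + wait
        return self.last

if __name__ == "__main__":
    shaper = TokenBucketShaper(rate=1000, depth=1500)
    for arrival, size in [(0.0, 1500), (0.0, 500), (0.1, 1000), (10.0, 1500)]:
        print(arrival, size, shaper.release_time(arrival, size))
```

The last packet in the usage example arrives after a long idle period, so the bucket has refilled to its depth and the packet leaves without waiting. This is the saved-up backlog of tokens described above.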

Traffic shaping holds an important place in keeping variable flows in check, so that they do not exceed specific service-level agreements (SLAs), which often specify a minimum available bandwidth. The goal of an SLA is to give a fat pipe that is shared among users the appearance that it really is a dedicated thin pipe for that one user. This is reminiscent of the reason we embarked on this journey, to make packet networks seem more like dedicated circuits. For voice, a constant, inelastic traffic, traffic shaping does not hold much interest in itself for what we need. However, traffic shaping does highlight one advantage of packet-based networks. They are flexible enough to provide circuit-like throughput guarantees for some services when needed while providing expandable prioritization for other services, all on the same wire.

Policing

Policing is the other side of the coin of scheduling and queuing discipline. Instead of holding a packet in a queue until it meets the criteria, a policer watches each class against the same criteria and drops the packets that exceed them.

The point of policing is that it does not require building up the long lines of delayed packets as queuing would. Instead, the policer can just observe and drop packets that go over the mark. Policing is a lot less forgiving than queuing, but it requires fewer resources in the network.

Token buckets are often used for policing. With token bucket policing, when a packet comes by that does not have enough tokens, it is simply dropped. Packets never delay in this model.
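The difference from shaping is easiest to see in code. The sketch below is illustrative only. The function name `police` and the `(arrival_time, size)` packet representation are assumptions made for the example.

```python
def police(packets, rate, depth):
    """Return one verdict per packet: True = forwarded, False = dropped.

    packets -- list of (arrival_time_seconds, size_bytes), in time order
    rate    -- tokens (bytes) deposited per second
    depth   -- maximum tokens the bucket may hold
    """
    tokens = depth   # assumption: the bucket starts full
    last = 0.0
    verdicts = []
    for now, size in packets:
        # Deposit tokens earned since the previous packet, capped at depth.
        tokens = min(depth, tokens + (now - last) * rate)
        last = now
        if tokens >= size:
            tokens -= size
            verdicts.append(True)
        else:
            # Out of profile: drop immediately, never queue or delay.
            verdicts.append(False)
    return verdicts

if __name__ == "__main__":
    print(police([(0.0, 1000), (0.1, 1000), (1.0, 1000)], rate=1000, depth=1500))
```

The second packet arrives only 0.1 seconds after the first, finds too few tokens, and is dropped on the spot. A shaper with the same bucket would have delayed it instead.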

Policing is a tough tactic to get right, because it necessarily works by dropping packets that could have been queued or sent. For voice networks, where the goal is to prevent data from interfering with voice, policing is useful only for preventing runaway or hijacked voice streams, which are high priority, from taking over the network. Prioritization is a better method to keep data from affecting voice quality.

Random Early Detection

Along with policing comes the idea of how to drop a packet when the queue is filling. Congestion, for data, is a major issue, and as data backs up, it can cause major problems for any traffic that shares the link with it.

The concept behind random early detection (RED) is that congestion can be signaled to TCP, or any other elastic and responsive traffic protocol, before the congestion gets so bad that it causes unfair loss. Congestion causes that unfair loss by hitting whichever flow happens to own the packet that is one too many for the queue and so gets dropped first. As such a flow loses packets, it slows down, and other flows expand to take its place. To restore some symmetry between the flows, RED uses a sliding scale of random drop probabilities to keep the backup at bay. When the queue is nearly empty, nothing is dropped. As the queue fills, however, RED kicks in by increasing its drop probability. This slow but steady increase starts backing the flows off before the queue gets too full, while the fact that dropping stops when the queue empties means RED applies pressure when the queue is filling and permissiveness when there is plenty of room.
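The sliding scale can be illustrated with a simple linear ramp. This sketch follows the common textbook formulation of RED, with a minimum threshold, a maximum threshold, and a maximum drop probability. Those three parameters and the function names are assumptions for the example, not values taken from the text above.

```python
import random

def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for a given average queue length."""
    if avg_queue < min_th:
        return 0.0                      # plenty of room: never drop
    if avg_queue >= max_th:
        return 1.0                      # queue too full: always drop
    # In between, ramp linearly from 0 up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)

def red_should_drop(avg_queue, min_th, max_th, max_p, rng=random.random):
    """Randomly decide whether to drop one arriving packet."""
    return rng() < red_drop_probability(avg_queue, min_th, max_th, max_p)

if __name__ == "__main__":
    for q in (5, 20, 29, 30):
        print(q, red_drop_probability(q, min_th=10, max_th=30, max_p=0.1))
```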

On top of RED is a concept called weighted random early detection (WRED). WRED uses weights, based on the classifications we have seen already, to alter the drop probabilities. Using the classifications for voice allows administrators to keep WRED from kicking in for voice, which is inelastic and will not respond to being dropped, when the administrator has no ability to place voice in a separate queue or route. For data, more critical connections, such as the TCP-based SIP signaling needed in calls, can be given a higher priority by sparing them the higher drop probability, while normal data is still slowed down.
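One way to picture the weighting is as a separate drop profile per class. The sketch below is hypothetical throughout: the class names, the thresholds, and the choice to exempt voice entirely are illustrative assumptions, not settings from any particular product.

```python
# Per-class profile: (min_threshold, max_threshold, max_drop_probability).
# None means the class is exempt from early drop. All values are made up.
WRED_PROFILES = {
    "voice": None,
    "signaling": (30, 40, 0.05),
    "data": (10, 40, 0.20),
}

def wred_drop_probability(traffic_class, avg_queue, profiles=WRED_PROFILES):
    """Drop probability for one class at a given average queue length."""
    profile = profiles[traffic_class]
    if profile is None:
        return 0.0                      # inelastic traffic: dropping will not slow it
    min_th, max_th, max_p = profile
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

if __name__ == "__main__":
    for cls in WRED_PROFILES:
        print(cls, wred_drop_probability(cls, avg_queue=25))
```

At an average queue length of 25, normal data is already being dropped with some probability, while signaling and voice are untouched.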

The problem with RED is the problem with policing. Packets that may have been needed to keep the link from going idle are dropped even though there are resources for them, causing lost work and wasted resources.

Explicit Congestion Notification

Instead of dropping the packets that RED selects, routers have the option of marking them. TCP endpoints that know to look for congestion-marked packets will treat a marked packet as if it had, somehow, been lost, and will back off or slow down, but without the packet's data disappearing. This increases the performance of the network and improves efficiency, though, needless to say, it does nothing if the endpoints are not aware of the congestion notification scheme.

On TCP, explicit congestion notification (ECN) works by the TCP endpoints negotiating that they support this protocol. Both sides need to support it, because the only way the sender can know if an intervening router has marked a packet is for the receiver to echo that fact back to the sender over TCP itself. Once a flow is established, the sender sets the ECN bit, bit 6, in the DSCP usage of the TOS field in IP. This lets routers know that the packet supports ECN. When a router uses RED to decide that the packet should be dropped early, but notices that the packet is marked for ECN support and the router supports ECN itself, it will not drop the packet. Instead, it will set bit 7, the second bit of the ECN field, known as the CE or Congestion Experienced bit, marking that the packet should be handled as if it had been dropped.

The TCP receiver notices that the packet has been marked, and so needs to echo this fact back in the acknowledgment. The receiver sets the ECE, or ECN-echo bit in the TCP flags field in the acknowledgement. The sender gets the acknowledgement, and uses this flag to cut its congestion window in half, as if the original packet were lost.
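The marking and echo steps can be sketched as bit operations on the TOS/DSCP byte. This is a simplified model that follows the bit positions described above, with bit 0 as the most significant bit. The function names are hypothetical, and the congestion window is modeled as a plain integer count of segments.

```python
ECT_BIT = 0x02   # bit 6 of the byte: packet belongs to an ECN-capable flow
CE_BIT = 0x01    # bit 7 of the byte: Congestion Experienced

def router_handle(tos_byte, red_says_drop):
    """Return (forwarded, new_tos_byte) for one packet at an ECN-aware router."""
    if not red_says_drop:
        return True, tos_byte
    if tos_byte & ECT_BIT:
        # Mark instead of dropping; the data still reaches the receiver.
        return True, tos_byte | CE_BIT
    # Flow did not advertise ECN support: fall back to an early drop.
    return False, tos_byte

def receiver_ece_flag(tos_byte):
    """Receiver echoes congestion (ECE) if the packet arrived with CE set."""
    return bool(tos_byte & CE_BIT)

def sender_on_ack(cwnd, ece_flag):
    """Sender halves its congestion window when the ACK carries ECE."""
    return max(1, cwnd // 2) if ece_flag else cwnd

if __name__ == "__main__":
    forwarded, tos = router_handle(0xB8 | ECT_BIT, red_says_drop=True)
    print(forwarded, hex(tos), sender_on_ack(20, receiver_ece_flag(tos)))
```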

Differentiated Services | Quality of Service on Wired Networks

So how do wireline networks get quality of service? They do so through the use of prioritization. Instead of asking for, and accounting for, resources and reservations and policing, the network becomes very simple. Traffic is divided up into classes. Some classes are better than others, and will get special treatment. Most likely, this treatment is just to cut to the head of the line. Each packet, not flow, is independently marked with the priority or class it belongs to. Every router and switch along the way that understands the tags will provide that differentiation, and the ones that do not simply ignore the tags and treat the packet as best effort.

This is the concept of differentiated services. For IP networks, the TOS/DSCP field in IPv4 and the Traffic Class field in IPv6 are expected to hold the specific class or priority that the packet belongs to. The sender self-marks the packet, and the network takes it from there.

Here, the two conflicting concepts of the IPv4 Type of Service (TOS) come in contact with the Differentiated Services Code Point (DSCP) definition, for the same byte in the header. Each is a mechanism that was created to try to classify packets on a per-packet basis. TOS is the older mechanism, and is now considered to have fallen out of use. However, for the purposes of voice mobility, a lot is similar about TOS and DSCP. TOS defined, among other things, eight priority levels.

The format of the now formally deprecated TOS field is shown in Table 1.

Table 1: The TOS Field in IPv4
	Precedence	Delay	Throughput	Reliability	Reserved
Bit:	0-2	3	4	5	6-7

The precedence value is a prioritization that is used within the network to determine the packet's handling. The values run from 0 to 7, with 0 being the lower end of the range. The definitions originally conceived for this value are given in Table 2.

Table 2: The TOS Precedence

Value	Old Meaning	802.1p Meaning	WMM Meaning
7	Network Control	Network Management	Voice
6	Internetwork Control	Voice	Voice
5	CRITIC/ECP	Video	Video
4	Flash Override	Controlled Load	Video
3	Flash	Excellent Effort	Best Effort
2	Immediate	Undefined	Background
1	Priority	Background	Background
0	Routine	Best Effort	Best Effort

The table suggests a gradual rise in priority from 0 to 7. The problem with this definition is that different technologies use the 0-7 range for priorities in different ways. Most equipment endeavors to maintain a consistent mapping from the number to a priority level, no matter how the priority got onto the packet. The three different meanings are shown in the columns. The second meaning is from IEEE 802.1p, which is a per-frame prioritization extension to Ethernet and uses a special header to advertise the priority. The third meaning is that of the same eight values in WMM, the Wi-Fi prioritization standard. In general, it is best to assume the meaning of the final two columns. Note that the priorities for values 1 and 2 are actually lower than best effort in that case. When in doubt, do not use those priorities.

The remaining three flags in Table 1 represent extra information that may have been useful for the packet. Setting the delay bit was meant to ask for low delay, whereas setting the throughput or reliability bit was meant to signal that throughput or reliability was a greater concern to the application.

TOS is considered to be replaced, and yet many modern devices in the world of IP telephones use the TOS meanings, and not the later DSCP meanings, in order to support older network configurations that may still be in use.

DSCP requires that the TOS meanings for the top three bits still be preserved, as long as the remaining bits are zero. However, DSCP looks at the one byte a different way. Table 3 shows the new meaning.

Table 3: The DSCP Field in IPv4 (Same Byte as TOS; Different Meaning)
	Code Selector	ECN
Bit:	0-5	6-7

There are a couple of RFCs that define what the code selector maps to. The goal of the DSCP is to interpret the selector as a somewhat arbitrary code, mapping into a specific quality of service type.
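The two views of the same byte can be illustrated with simple shifts and masks. The function names below are hypothetical; the bit positions follow Tables 1 and 3, with bit 0 as the most significant bit.

```python
def split_ds_byte(byte):
    """DSCP view: return (code_selector, ecn) from the one-byte field."""
    dscp = byte >> 2        # bits 0-5: the six-bit code selector
    ecn = byte & 0x03       # bits 6-7: the ECN bits
    return dscp, ecn

def tos_precedence(byte):
    """TOS view: return the three-bit precedence from the same byte."""
    return byte >> 5        # bits 0-2

def build_ds_byte(dscp, ecn=0):
    """Pack a code selector and ECN bits back into one byte."""
    return ((dscp & 0x3F) << 2) | (ecn & 0x03)

if __name__ == "__main__":
    byte = build_ds_byte(46)            # 46 is the Expedited Forwarding code
    print(hex(byte), split_ds_byte(byte), tos_precedence(byte))
```

A device that still reads the byte as TOS sees only the top three bits, which is why the top three bits of a code selector matter to older equipment.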

RFC 2597 defines the concept of Assured Forwarding (AF), the purpose of which is to allow a service provider to accept markings of packets and apply a certain amount of guaranteed bandwidth, as well as allowing more bandwidth to be given. Each class is named AFxx, where the first x is a number from one to four, representing the class of traffic, and the second x is a number from one to three, representing the drop probability from low to high (see Table 4).

Table 4: Assured Forwarding DSCP Values
Drop Probability	Class 1	Class 2	Class 3	Class 4
Low	AF11 = 10	AF21 = 18	AF31 = 26	AF41 = 34
Medium	AF12 = 12	AF22 = 20	AF32 = 28	AF42 = 36
High	AF13 = 14	AF23 = 22	AF33 = 30	AF43 = 38

The network administrator is expected to assign meanings to the four classes, in terms of assured, set-aside bandwidth that these codes can eat into. The drop probabilities are meant to be set by the traffic originator to make sure that, if resources are getting exhausted, some packets get more protection than others.
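The values in Table 4 follow a regular pattern: the class occupies the top three bits of the six-bit code and the drop level the next two, so the decimal value is 8 times the class plus 2 times the drop level. A short sketch, with hypothetical function names:

```python
def af_dscp(af_class, drop):
    """Return the DSCP value for AF<class><drop>, e.g. af_dscp(3, 2) for AF32."""
    if not (1 <= af_class <= 4 and 1 <= drop <= 3):
        raise ValueError("AF class must be 1-4 and drop level 1-3")
    return 8 * af_class + 2 * drop

def af_decode(dscp):
    """Return (af_class, drop) for an AF code point, or None if it is not one."""
    af_class, drop = dscp >> 3, (dscp >> 1) & 0x03
    if dscp & 0x01 or not (1 <= af_class <= 4 and 1 <= drop <= 3):
        return None
    return af_class, drop

if __name__ == "__main__":
    for drop in (1, 2, 3):
        print([f"AF{c}{drop} = {af_dscp(c, drop)}" for c in (1, 2, 3, 4)])
```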

A different concept is defined in RFC 2598. Expedited Forwarding (EF) sets up a specific codepoint, 46, to allow packets to be marked as belonging to a "virtual leased line," a high-performing point-to-point measure of quality of service. (There is a wrinkle with this DSCP code as it applies to Wi-Fi: All EF-tagged packets get transmitted in the class of service designated for video because of the way the EF tag is coded.)
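The Wi-Fi wrinkle falls out of the arithmetic. The sketch below assumes the device derives the WMM class from the top three bits of the code selector, using the WMM column of Table 2. The function name and that mapping rule are assumptions for illustration, and real equipment may be configured differently.

```python
# WMM meaning for each 0-7 priority value, copied from Table 2.
WMM_BY_PRIORITY = {
    7: "Voice", 6: "Voice",
    5: "Video", 4: "Video",
    3: "Best Effort",
    2: "Background", 1: "Background",
    0: "Best Effort",
}

def wmm_class_for_dscp(dscp):
    """Map a six-bit DSCP to a WMM class via its top three bits (assumed rule)."""
    priority = dscp >> 3
    return WMM_BY_PRIORITY[priority]

if __name__ == "__main__":
    # EF is 46, binary 101110. The top three bits are 101, which is 5.
    print(46 >> 3, wmm_class_for_dscp(46))
```

Priority 5 lands in the video row of Table 2, not the voice rows at 6 and 7.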

In total, there are 21 commonly seen DSCPs: the twelve AFs, the EF codepoint, and the eight original precedence values, now known as Default and CS1 to CS7.
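As a quick check of that count, the whole set can be generated rather than typed out. The function name is hypothetical. The class selector values rely on the rule stated earlier, that the old precedence stays in the top three bits with the remaining bits zero.

```python
def common_dscps():
    """Build a name -> DSCP value table for the commonly seen code points."""
    table = {"Default": 0}
    # Class selectors: the old precedence value in the top three bits.
    for prec in range(1, 8):
        table[f"CS{prec}"] = prec << 3
    # Assured Forwarding: class in the top three bits, drop level in the next two.
    for af_class in range(1, 5):
        for drop in range(1, 4):
            table[f"AF{af_class}{drop}"] = (af_class << 3) | (drop << 1)
    # Expedited Forwarding has a single fixed code point.
    table["EF"] = 46
    return table

if __name__ == "__main__":
    table = common_dscps()
    print(len(table))
    for name, value in sorted(table.items(), key=lambda item: item[1]):
        print(f"{name:8s}{value:3d}  {value:06b}")
```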

Nothing in DSCP or differentiated services defines just what the qualities of the differentiated services are to be. This is the advantage of differentiated services: the differentiation is up to the administrator, and can grow as the network grows.
