Voice Mobility with Wi-Fi Capacity


How the Capacity is Determined


Through either admission control scheme, the network needs to keep track of how much capacity is available. From the previous discussions on the effects of RF variability and cellular overlap, you can appreciate that this is a difficult problem to completely solve. As devices get further away from the access points, data rates drop. Changing levels of interference, from within the network or without, can cause increasing retransmissions and easily overrun surplus bandwidth allowances.
In the end, networks today adopt one of two stands, and may even show both to the user. The more complicated stand for the network (but simpler for the user) is for the network to automatically take the variability of RF into account, and to determine its own capacities. In systems that do this, there is no notion of a static maximum number of calls. Instead, the system accepts as many calls as it can handle. If conditions change, and fewer calls can be handled in the system, the network reserves the right to proactively end a client's reservation, often in concert with load balancing.
The other stand, simpler for the network but far more complicated for the user, is for the administrator to be required to enter the maximum number of calls per access point (or some other static metric). The idea here is that the administrator or installer is assumed to have gone through a planning process to determine how many calls can be safely allowed per access point, while still leaving room for best effort data. That number is usually far lower than the best-case maximum capacity, and is designed to be a low water mark: barring external changes, the network will be able to achieve that many calls most of the time. This number is then manually input into the wireless network, which then counts the number of calls. If the maximum number of calls is reached on that access point, the system will not let any more in. These static metrics may be entered either as the number of calls, or a percentage of airtime. Systems that work as a percentage of airtime can sometimes take in a padding factor to allow for calls that are roaming into the network.
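The static scheme can be pictured as a simple counter against an administrator-entered ceiling. The following is a minimal sketch only, not any vendor's implementation: the class name, the 40% ceiling, the 10% roaming padding, and the 5% per-call airtime figure are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StaticAdmission:
    """Per-access-point limit entered by the administrator (hypothetical model)."""
    max_airtime_pct: float       # static ceiling for voice, as a percentage of airtime
    roaming_padding_pct: float   # part of the ceiling held back for calls roaming in
    used_pct: float = 0.0

    def admit(self, call_pct: float, roaming: bool = False) -> bool:
        # New calls may use the ceiling minus the padding;
        # calls that roam in from a neighbor may use the full ceiling.
        limit = self.max_airtime_pct
        if not roaming:
            limit -= self.roaming_padding_pct
        if self.used_pct + call_pct > limit:
            return False
        self.used_pct += call_pct
        return True


ap = StaticAdmission(max_airtime_pct=40.0, roaming_padding_pct=10.0)
new_calls = [ap.admit(5.0) for _ in range(7)]   # seven brand-new calls, 5% airtime each
roamed_in = ap.admit(5.0, roaming=True)         # one call arriving by roaming
print(new_calls, roamed_in)
```

The padding only helps if the per-call airtime estimate is realistic; as the text notes, a figure that ignores retransmissions or unseen interference will still overcommit the cell.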
Setting these values can be fraught with difficulty. Pick a number that's too low, and airtime is being wasted. Pick a number that's too high, however, and sometimes call quality will suffer. Even percentage of airtime calculations are not very good, because they may not take into account airtime that is unusable because of variable channel conditions or co-channel interference that the access point cannot directly see, such as client-to-client interference.
All in all, you might find vendors recommending setting the values to a low, safe value that allows for voice to work even if there is plenty of variability in the network. This works well for networks that are predominantly data-oriented, but voice-only networks cannot usually afford that luxury.

WMM Admission Control



Building on even more of the specification in the 802.11e quality-of-service amendment is WMM Admission Control. This specification and interoperability program from the Wi-Fi Alliance, which is required to achieve Voice Enterprise certification, uses an explicit layer-2 reservation scheme. This scheme, in a similar vein as the lightly used RSVP protocol (RFC 2205), requires the mobile device to reach out and request resources explicitly from the access point, using a new protocol built on top of 802.11 management frames.
This protocol is heavily dependent on the concept of a traffic specification (TSPEC). The TSPEC is created by the mobile phone, and specifies how much of the air resources either or both directions of the call (or whatever resource is being requested) will take. The access point processes the request as an admission controller (a function often placed literally on the controller, by coincidence), which is in charge of maintaining an account of which clients have requested what resources and whether they are available.
The overall protocol is rather simple. The mobile device, usually when it determines that it has a call incoming or outgoing, will send an Add Traffic Stream (ADDTS) Request message (a special type of Action management frame) to the access point, containing the TSPEC that will be able to carry the phone call. The access point will decide whether it can carry that call, based on whatever scheme it uses (see following discussion), and send an ADDTS Response message stating whether the stream was admitted.
WMM Admission Control can be set to mandatory or optional for each access category. For example, WMM Admission Control can be required for voice and video, but not for best effort and background data. What this would mean is that no client is allowed to transmit voice or video packets without first requesting and being granted admission for flows in those access categories, whereas all clients would be allowed to freely transmit best effort and background data as they see fit. Which access categories require admission control is signaled as a part of the WMM information element, which goes out in beacons and some other frames.
For WMM Admission Control, it is worth looking at the details of the concepts. The main concept is one of a traffic stream itself, and how it is identified and recognized. Traffic streams are represented by Traffic Identifiers (TID), a number from 0-7 (the standard allows up to 15, but WMM limits this to only 7) that represents the stream. Each client gets its own set of eight TIDs to use.
Each traffic stream, represented by its TID, maps onto real traffic by naming which of the eight priority values in WMM will belong to this traffic stream. Thus, if the phone intends to send and knows it is going to receive priority 7—recall that this is the highest of the two voice AC priorities—it can establish a traffic stream that maps priority 7 traffic to it, and get both sides of the call. In order for that to work, the client can specify whether the traffic stream is upstream-only, downstream-only, or bidirectional. It is possible for the client to request both an upstream-only and downstream-only stream mapping to the same priority (different TIDs, though!), if it knows that the airtime used by the downstream side is different than the upstream side—useful for video calls—or it may request both at once in one TID, with the same airtime usage. All of this freedom leads to some complexity, but thankfully there is a rule preventing there from being more than one downstream and one upstream flow (bidirectional counts as one of each) for each access category. Thus, the AC_VO voice access category will only have one admitted bidirectional phone call in it at any given time.[*]
The client requests the traffic stream using the TSPEC.
Table 1 shows the contents of the TSPEC that is carried in an ADDTS message.
Table 1: WMM admission control TSPEC
Field                          Size
TS Info                        3 bytes
Nominal MSDU Size              2 bytes
Maximum MSDU Size              2 bytes
Minimum Service Interval       4 bytes
Maximum Service Interval       4 bytes
Inactivity Interval            4 bytes
Suspension Interval            4 bytes
Service Start Time             4 bytes
Minimum Data Rate              4 bytes
Mean Data Rate                 4 bytes
Peak Data Rate                 4 bytes
Maximum Burst Size             4 bytes
Delay Bound                    4 bytes
Minimum PHY Rate               4 bytes
Surplus Bandwidth Allowance    2 bytes
Medium Time                    2 bytes
There's quite a lot of information in a TSPEC, so let's break it down slowly, using the example of a 20 millisecond G.711 (nearly uncompressed) one-way traffic flow:
  • The TS Info field (see Table 2) identifies the TID for the stream, the priority of the data frames that belong to this stream, what direction the stream is going in (00 = up, 01 = down, 10 = reserved, 11 = bidirectional), and whether the AC the stream belongs to is to be WMM Power Save delivery enabled (1) or not (0). The rest of the fields are not used in WMM Admission Control, and have specific values that will never change (Access Policy = 01, the rest are 0).
    Table 2: The TS info field
    Field                Bits
    Traffic Type         0
    TID                  1-4
    Direction            5-6
    Access Policy        7-8
    Aggregation          9
    WMM Power Save       10
    Priority             11-13
    TSInfo Ack Policy    14-15
    Schedule             16
    Reserved             17-23
  • The Nominal MSDU Size field gives the expected packet size, with the highest-order bit set to signify that the packet size never changes. G.711 20ms packets are 160 bytes of audio, plus 12 bytes of RTP header, 8 bytes of UDP header, 20 bytes of IP header, and 8 bytes of SNAP header, creating a data payload (excluding WPA/WPA2 overhead) of 208 = 0xD0. Because the packet size for G.711 never changes, this field would be set to 0x80D0.
  • The Maximum MSDU Size field specifies the largest size a data packet in the stream can reach. For G.711, that's the same as the nominal size. There is no special bit for fixed sizes, so the value is 208 = 0x00D0. This can also be left as 0, as it is an optional field.
  • The Inactivity Interval specifies how long the stream can be idle—no traffic matching it—in microseconds, before the access point can go ahead and delete the flow. 0 means not to delete the flow automatically, and that's the common value.
  • The Mean Data Rate specifies, in bits per second, what the expected throughput is for the stream. For G.711, 208 bytes every 20 milliseconds results in a throughput of 83200 bits per second.
  • The Minimum Data Rate and Peak Data Rate specify the minimum and maximum throughput the traffic stream can expect. These are optional and can be set to 0. For G.711, these will be the same 83,200 bits per second.
  • The Minimum PHY Rate field specifies what the physical layer data rate assumptions are for the stream, in bits per second. If the client is assuming that the data rate could drop as low as 6Mbps for 802.11a/g, then it would encode the field at 6Mbps = 6,000,000bps = 0x005B8D80.
  • The Surplus Bandwidth Allowance is a fudge factor that the phone can request, to account for the fact that packets might be retransmitted. It's a multiplier, in units of 1/8192nds. A value of 1.5 times as an allowance would be encoded as 0x3000 = 001.1000000000000, in binary.
  • The other fields are unused by the client, and can be set to 0.
In other words, the client simply requests the direction, priority, packet size, data rate, and surplus allowance.
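To make the field values concrete, here is a rough sketch that packs the G.711 example into the 55-byte TSPEC body laid out in Table 1. It is an illustration under stated assumptions rather than a reference encoder: the function names are invented, fields are packed little-endian (the usual 802.11 convention), the TID and priority are both taken as 7, and "Access Policy = 01" is interpreted as setting bit 7 only.

```python
import struct

DIR_UP, DIR_DOWN, DIR_BIDI = 0b00, 0b01, 0b11


def ts_info(tid: int, direction: int, priority: int, power_save: bool) -> bytes:
    """Pack the 3-byte TS Info field using the bit positions in Table 2."""
    value = (
        ((tid & 0xF) << 1)            # bits 1-4: TID
        | ((direction & 0x3) << 5)    # bits 5-6: direction
        | (0b01 << 7)                 # bits 7-8: Access Policy (assumed: bit 7 set)
        | (int(power_save) << 10)     # bit 10: WMM Power Save
        | ((priority & 0x7) << 11)    # bits 11-13: priority
    )
    return value.to_bytes(3, "little")


def g711_tspec() -> bytes:
    msdu = 160 + 12 + 8 + 20 + 8      # audio + RTP + UDP + IP + SNAP = 208 bytes
    nominal_msdu = 0x8000 | msdu      # high bit set: the packet size never changes
    mean_rate = msdu * 8 * 50         # one packet every 20 ms = 50 packets per second
    min_phy_rate = 6_000_000          # lowest data rate the client plans for, in bps
    surplus = int(1.5 * 8192)         # 1.5x allowance in units of 1/8192
    body = struct.pack(
        "<2H11I2H",
        nominal_msdu, msdu,               # Nominal and Maximum MSDU Size
        0, 0, 0, 0, 0,                    # service intervals, inactivity, suspension, start time
        mean_rate, mean_rate, mean_rate,  # Minimum, Mean, and Peak Data Rate
        0, 0,                             # Maximum Burst Size, Delay Bound (unused)
        min_phy_rate,
        surplus,
        0,                                # Medium Time is filled in by the access point
    )
    return ts_info(tid=7, direction=DIR_UP, priority=7, power_save=False) + body


tspec = g711_tspec()
print(len(tspec), tspec.hex())
```

Unpacking the result at the right offsets gives back the values worked out in the list above: 0x80D0 for the nominal size, 83,200 for the mean data rate, 0x005B8D80 for the minimum PHY rate, and 0x3000 for the surplus bandwidth allowance.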
The access point gets this information, and processes it using whatever algorithms it wants. This is not specified by the standard, but we'll look at what sorts of considerations tend to be used. Normally, we'll assume that the access point knows what percentage of airtime is available. The access point will then decide how much airtime the requested flow will take, as a percentage, and see whether it exceeds its maximum allowance (say, 100% of airtime used). If so, the flow is denied, and a failing ADDTS Response is sent. If not, the access point updates its measure of how much airtime is being used, and then allows the flow. The succeeding ADDTS Response has a TSPEC in it that is a mirror of the one the client requested, except that now the Medium Time field is filled in. This field specifies exactly how much airtime, in 32-microsecond units per second, the client can take for the flow.
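Since the admission algorithm is left to the implementer, the following is only one plausible sketch of the bookkeeping. The class and function names are hypothetical, and the 100-microsecond per-frame overhead (standing in for preamble, headers, interframe spacing, and the ACK) is an assumed round number, not a value taken from any specification.

```python
import math

UNITS_PER_SECOND = 1_000_000 // 32   # Medium Time is counted in 32-microsecond units


def exchange_time_us(msdu_bytes, phy_rate_bps, overhead_us=100.0):
    """Crude airtime for one frame exchange; overhead_us is an assumed lump sum."""
    return msdu_bytes * 8 / phy_rate_bps * 1e6 + overhead_us


def medium_time(mean_rate_bps, msdu_bytes, phy_rate_bps, surplus):
    """Airtime per second for a flow, in 32-microsecond units, including the surplus."""
    packets_per_second = math.ceil(mean_rate_bps / 8 / msdu_bytes)
    airtime_us = surplus * packets_per_second * exchange_time_us(msdu_bytes, phy_rate_bps)
    return math.ceil(airtime_us / 32)


class AccessPoint:
    def __init__(self, limit_fraction=1.0):
        self.limit_units = int(UNITS_PER_SECOND * limit_fraction)
        self.used_units = 0

    def handle_addts(self, mean_rate_bps, msdu_bytes, phy_rate_bps, surplus):
        """Return the granted Medium Time, or None if the flow is refused."""
        needed = medium_time(mean_rate_bps, msdu_bytes, phy_rate_bps, surplus)
        if self.used_units + needed > self.limit_units:
            return None
        self.used_units += needed
        return needed


ap = AccessPoint(limit_fraction=0.5)   # hypothetical policy: voice may use half the airtime
grants = []
while (grant := ap.handle_addts(83_200, 208, 6_000_000, 1.5)) is not None:
    grants.append(grant)
print(len(grants), "flows admitted,", grants[0], "units each")
```

Under these assumptions each one-way G.711 flow at a 6 Mbps minimum PHY rate costs a little under 3% of airtime, which shows how strongly the client's choice of Minimum PHY Rate and Surplus Bandwidth Allowance drives the number of calls a cell will accept.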
The definition of how much airtime a client uses is based on what packets are sent to it or that it sends as a part of a flow. Both traffic sent by the client to the access point and sent by the access point to the client are counted, as well as the times for any RTSs, CTSs, ACKs, and interframe spacings that are between those frames. Another way of thinking about it is that the time from the first bit of the first preamble to the last bit of the last frame of the TXOP counts, including gaps in between. In general, you will never need to try to count this. Just know that WMM Admission Control requires that the clients count their usage. If they exceed their usage in the access category they are using, they have to send all subsequent frames with a lower access category—and one that is not admission control enabled—or drop them.
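The client-side rule at the end of this paragraph can be sketched as a small decision function. This is an illustrative model only: the function name, the dictionary of admission-control flags, and the unit counts are hypothetical, and a real client would also reset its usage counter at the end of each accounting period.

```python
AC_ORDER = ["AC_VO", "AC_VI", "AC_BE", "AC_BK"]   # highest to lowest priority


def choose_ac(wanted_ac, used_units, granted_units, frame_units, acm_required):
    """Pick the access category for the next frame, or None to drop the frame."""
    if not acm_required[wanted_ac]:
        return wanted_ac                  # no admission control on this category
    if used_units + frame_units <= granted_units:
        return wanted_ac                  # still within the granted Medium Time
    # Over budget: fall back to a lower category that does not require admission.
    for ac in AC_ORDER[AC_ORDER.index(wanted_ac) + 1:]:
        if not acm_required[ac]:
            return ac
    return None


acm = {"AC_VO": True, "AC_VI": True, "AC_BE": False, "AC_BK": False}
print(choose_ac("AC_VO", used_units=880, granted_units=885, frame_units=4, acm_required=acm))
print(choose_ac("AC_VO", used_units=880, granted_units=885, frame_units=12, acm_required=acm))
```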
One advantage of WMM Admission Control is that it works for all traffic types, without requiring the network to have any smarts. Rather, the client is required to know everything about the flows it will both send and receive, and how much airtime those flows will take. The network just plays the role of arbiter, allowing some flows in and rejecting others. Thus, if the client is sufficiently smart, WMM Admission Control will work whether the protocol is SIP, H.323, some proprietary protocol, or even video or streaming data. The disadvantage of that, however, is that the client is required to be smart, and all of its pieces, from wireless to phone software, have to be well integrated. That pretty much eliminates most softphones, and brings the focus squarely on purpose-built phones. Furthermore, the client needs to know what type of traffic the party on the other side of the call will send to it. Some higher-level signaling protocols can convey this, such as with SDP within SIP, but doing so may be optional and may not always be followed. For a phone talking to a media gateway, for example, the phone needs to know exactly how the media gateway will send its traffic, including knowing the codec and packet rate and sizing, before it can request airtime. That can lead to situations in which the call needs to be initiated and agreed to by both parties before the network can be asked for permission to admit the flow, meaning that the call might have to be terminated by the network midway through ringing, if airtime is not available. Because WMM Admission Control is so new (by the time of publication, WMM Admission Control should be launching shortly and large numbers of devices may not yet be available), it remains to be seen how well all of the pieces will fit together. It is notoriously difficult for general-purpose devices to be built that run the gamut of technologies correctly, and so these new programs might be more useful for highly specific purpose-built phones.

SIP-Based Admission Control



The first method is to rely on the call setup signaling. Because the most common mechanism today is SIP, we can refer to this as SIP-based admission control. The idea is fairly simple. The access point, most likely in concert with a controller if the architecture in use has one, uses a firewall-based flow-detection system to observe the SIP messages as they are sent from the phones to the SIP servers and back. Specifically, when the call is initiated, either by the phone sending a SIP Invite, or receiving one from another party, the wireless network determines whether there is available capacity to take the call. If there is available capacity, then the wireless network lets the messages flow as usual, and the call is initiated.
On the other hand, if the wireless network determines that there is no room for the call, it will intercept the SIP Invite messages, preventing them from reaching the other party, and interject its own message to the caller (as if from the called party, usually), with one of a few possible SIP busy statuses. The call never completes, and the caller will get some sort of failure message, or a busy tone.
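The intercept-and-reject behavior can be sketched as follows. This is a deliberately simplified illustration, not a working SIP proxy: the function name and the call-count policy are hypothetical, and a real implementation would also have to track dialogs, add a To tag, and distinguish a new call from a re-INVITE. The 486 Busy Here status is one of the standard SIP busy responses.

```python
ECHOED_HEADERS = ("via", "from", "to", "call-id", "cseq")


def handle_sip(message: str, calls_in_progress: int, max_calls: int):
    """Return ("forward", message) or ("reject", response) for one SIP message."""
    head = message.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    if not lines[0].startswith("INVITE "):
        return "forward", message          # only call setup is subject to admission
    if calls_in_progress < max_calls:
        return "forward", message          # capacity available: let the call proceed
    # No capacity: answer the caller as if the called party were busy,
    # echoing the headers the caller needs to match the response to its request.
    echoed = [l for l in lines[1:] if l.split(":", 1)[0].strip().lower() in ECHOED_HEADERS]
    response = "\r\n".join(["SIP/2.0 486 Busy Here", *echoed, "Content-Length: 0"]) + "\r\n\r\n"
    return "reject", response


invite = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP phone1.example.com;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n\r\n"
)
action, response = handle_sip(invite, calls_in_progress=3, max_calls=3)
print(action)
print(response)
```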
Other, more advanced behaviors are also possible, such as performing load balancing, once the network has determined that the call is not going to complete.
The advantage of using SIP flow detection to do the admission control is that it does not require any more sophistication on the mobile devices than they would already have with SIP. Furthermore, by having that awareness from tracking the SIP state, the network can provide a list of both calls in progress and registered phones not yet in a call. The disadvantage is that this system will not work for SIP calls that are encrypted end-to-end, such as being carried over a VPN link.

Wi-Fi Multimedia (WMM) Power Save



To provide power saving while the mobile device is in a call, the Wi-Fi Alliance came up with the second power saving technique, WMM Power Save. This technique, based on the quality-of-service additions in the 802.11e amendment to the standard, acts as a parallel scheme to the legacy one, using similar concepts but in a way that avoids having to wait for beacons and can apply on a per-access-category basis.
If you notice, there is nothing in the standard that prevents clients that are using the legacy power save scheme from ignoring beacons, for the most part, and sending PS Polls whenever they want. If the client were sure that there is going to be a packet for it waiting every so often—say, 20 milliseconds—then it could just send PS Polls every 20 milliseconds, collect its data, and have real-time power save. Of course, this doesn't happen for legacy power save, because the client has no guarantee that it won't get some other frames rather than what it is looking for. However, this is the concept that WMM Power Save builds on.
WMM Power Save is optional, and support for it is signaled by the WMM information elements in the Association messages and the beacons. Unlike with legacy power save, WMM Power Save (capitalized, as it is a formal name) is aware of the WMM access categories and can apply to a subset of them. The two subsets are delivery-enabled access categories and trigger-enabled access categories.
First, let's start with the polling protocol. The client no longer checks the beacons to see if there is traffic. Instead, it is responsible for knowing that traffic is waiting for it, and how often. For phones, this is not a problem, as voice is bidirectional and consistent. Instead of sending a PS Poll frame, or using the PSNonPoll mechanism, the phone sends data frames in access categories that it has specified to be trigger-enabled. The access point looks for those data frames, and uses that as a trigger—just as it does in legacy with Power Save Poll frames—sending packets in response from the power save buffer. Those packets, however, can only come from the delivery-enabled access categories. Which categories are delivery- and trigger-enabled are usually specified in the Association Request from the client—there, a bitmask specifies which categories are legacy and which are delivery and trigger enabled together—or in TSPEC messages, which we will come to later.
Here's a common example. The phone associates, and tells the access point that it wants the voice category (AC_VO) to be delivery- and trigger-enabled. That means that the other three categories work on the legacy scheme. If packets come in for those other categories while the client is asleep, the TIM bit on the beacon will be set and the client will use legacy power save mechanisms to get the frames. But when a voice packet is sent to the access point, the access point silently holds onto the packet. The only way the client can get the voice packet is to send a voice packet of its own.
When it does, that causes the access point to respond with one or more voice packets in its buffer. Unlike with legacy power save, the client can ask for more than one packet at a time. Using the concept of a service period, which is set at Association time by the client and specifies the number of frames the client wants to get for every trigger (either two, four, six, or all), the access point will send out the correct number of frames. The last frame, whether because the buffer is empty or the service period has been exceeded, will have a special end of service period (EOSP) bit set in the QoS header. Once the client gets that frame, it can go back to sleep.
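A service period can be modeled in a few lines. This is a toy sketch with hypothetical names: frames are plain strings, the buffer holds only delivery-enabled traffic, and the case of a trigger arriving when nothing is buffered is left out.

```python
from collections import deque


def service_period(buffer: deque, max_frames):
    """Deliver frames in response to one trigger frame.

    max_frames is the service period length chosen at association:
    2, 4, or 6 frames, or None to mean "all buffered frames".
    """
    count = len(buffer) if max_frames is None else min(max_frames, len(buffer))
    delivered = []
    for i in range(count):
        frame = buffer.popleft()
        # The final frame of the service period carries the EOSP bit,
        # telling the client that it may go back to sleep.
        delivered.append({"payload": frame, "eosp": i == count - 1})
    return delivered


voice_buffer = deque(["voice-1", "voice-2", "voice-3"])
first = service_period(voice_buffer, max_frames=2)    # client sends one voice frame
second = service_period(voice_buffer, max_frames=2)   # client sends the next one
print(first)
print(second)
```

In a steady call the buffer rarely holds more than one frame, so each uplink voice packet normally retrieves exactly one downlink packet and the client sleeps for most of every 20 millisecond interval.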
As you can see, the legacy and WMM Power Save schemes operate simultaneously and independently. The only overlap is that the client goes into power save mode for both schemes simultaneously. This means that devices that are actively using WMM Power Save should never use the PSNonPoll method during that time, because the client waking up from power save mode will cause the access point to send all frames, whether they are from the legacy or WMM Power Save access categories.
The capability to support WMM Power Save should be considered nearly mandatory for most voice equipment. Some mobile devices use proprietary mechanisms that may or may not be supported by every access point, but the trend is towards using WMM Power Save. 

Legacy Power Save



The first mode, known as legacy power saving because it was the original power saving technique for Wi-Fi, is used for saving battery during standby operation.
This power save mode is not designed for quality-of-service applications, but rather for data applications. The way it works is that the mobile device tells the access point when it is going to sleep. After that time, the access point buffers up frames directed to the mobile device, and sets a bit in the beacon to advertise when one or more frames are buffered. The mobile device is expected to wake every so many beacons and look for its bit set in the beacon. When the bit is set, the client then uses one of two mechanisms to get the access point to send the buffered frames.
This sort of system can be thought of as a paging mechanism, as the client is told when the access point has data for it—such as notification of an incoming call. Figure 1 shows the basics of the protocol.

 
Figure 1: Wi-Fi Legacy Power Save
The most important part of the protocol is the paging itself. Each client is assigned an association ID (AID) when it associates. The value is given out by the access point, in a field in the Association Response that it sent out when the client connected to it. The AID is a number from 1 to 2007 (an extremely high number for an access point) that is used by the client to figure out what bit to look at in the beacon. Each beacon carries a Traffic Indication Map (TIM), which is an abbreviated bit field. Each client who has a frame buffered for it has its bit set in the TIM, based on the AID. For example, if a client with AID of 10 has one or more frames buffered for it, the tenth bit (counting from zero) of the TIM would be set.
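The AID-to-bit lookup amounts to a byte index and a bit index. The sketch below assumes the bitmap has already been expanded to its full, uncompressed form, with bits numbered from the least significant bit of each byte; the function name is illustrative.

```python
def tim_bit_set(bitmap: bytes, aid: int) -> bool:
    """Check whether the bit for this AID is set in a full, uncompressed bitmap."""
    byte_index, bit_index = divmod(aid, 8)
    if byte_index >= len(bitmap):
        return False                      # bits beyond the bitmap are treated as clear
    return bool((bitmap[byte_index] >> bit_index) & 1)


# AID 10 lives in byte 1 (the second byte), bit 2.
bitmap = bytes([0x00, 0x04])
print(tim_bit_set(bitmap, 10), tim_bit_set(bitmap, 9))
```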
Because beacons are sent periodically, using specific timing that ensures that a beacon never goes out before its time, each client can plan on the earliest it needs to wake up to hear the beacon. That doesn't guarantee that the client will hear the beacon at exactly that time, however. Beacons can be delayed if the air is occupied at that time. Furthermore, because beacons are sent out as broadcasts, the client might just miss the beacon or the beacon can be collided with. If the client does hear the beacon, it can then go to sleep so long as no traffic is buffered for it.
Clients may also skip beacons. They would do this to save additional battery, at the expense of increasing the amount of time the frames would be buffered. Clients usually let the access points know how many beacons they will skip by sending a listen interval in their Association Request messages. A listen interval of 1 means that the client will wake for every beacon; a listen interval of 10 means that the client will wake only for every tenth beacon. Be careful, however; some clients do not follow the listen interval they state, waiting either for more or fewer beacons than they advertise.
The client signals that it is going to sleep by using the power management bit in any unicast frame it sends to the access point (except for non-Action management frames). The power management bit is in the Frame Control field for the frame. When the client sends a frame with the power management bit set and when it gets an Acknowledgement in response, it knows that the access point has heard the client's change of state and can now go to sleep. From this moment on, the access point will buffer frames, until the client sends any frame to the access point with the power management bit not set. That signals that the client is now awake, and can be sent packets as usual.
While the client is in power save mode, and it wakes to find that its TIM bit is set to signify that it has frames available for it, the client has two choices on how to gather those frames. The first choice is known as the PSPoll mechanism, and uses the Power Save Poll (PS Poll) frames. After the beacon with the client's TIM bit set, the client would send a PS Poll frame to the access point. This frame, which is usually acknowledged right away, triggers the access point to deliver exactly one of the buffered frames for the client. That buffered frame is put into the transmit queue, using the appropriate access category for WMM. The frame that is sent also has its More Data bit in the Frame Control field set if there are subsequent frames that are buffered. Once the client has the frame, it can choose to send another PS Poll to get another frame. This one-PS-Poll/one-data-frame exchange continues until the access point's buffer is drained or the client wishes to sleep more.
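The one-poll, one-frame exchange can be sketched as a simple loop. The class and function names here are hypothetical, and acknowledgements, losses, and queueing delay are all ignored.

```python
from collections import deque


class BufferingAP:
    """Toy access point holding frames for one sleeping client."""

    def __init__(self, frames):
        self.buffer = deque(frames)

    def tim_bit(self) -> bool:
        return bool(self.buffer)          # set in the beacon while frames are buffered

    def ps_poll(self):
        """Release exactly one frame; the second value plays the role of More Data."""
        frame = self.buffer.popleft()
        return frame, bool(self.buffer)


def drain(ap: BufferingAP):
    """Client side: poll until a frame arrives with More Data cleared."""
    received, polls = [], 0
    more_data = ap.tim_bit()
    while more_data:
        frame, more_data = ap.ps_poll()
        polls += 1
        received.append(frame)
    return received, polls


received, polls = drain(BufferingAP(["frame-a", "frame-b", "frame-c"]))
print(received, polls)
```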
The other option the client has is to use the PSNonPoll mechanism. This mechanism is quite simple: the client simply sends a data frame, usually a Null data frame, stating that it is no longer sleeping, by clearing the power management bit. The access point will proceed to queue all of the buffered frames, each using its own WMM access category. The client can then wait for a certain amount of time, hoping that it got all of the frames it was going to get, after which it can send another Null data frame, signifying it is going back to sleep. Any frames that may have still been in a transmit queue might get buffered again by the access point, for a later PSNonPoll exercise. The advantage of the PSNonPoll mechanism is that it is simple and doesn't require a significant back-and-forth. The disadvantage is that the client has no way of knowing if there are any remaining frames for it, without going to sleep and waiting for the next beacon.
The choice between PSPoll and PSNonPoll modes is often left up to the client's software implementation, and not exposed to you. However, some clients do give a choice up front, or have specific behavior where they will use one method or the other, depending on how aggressively you set their power save settings (using a slider, say). It should be clear that neither mode is good for quality-of-service traffic, because the client can be forced to wait as much as a beacon interval (times its listen interval) before it finds out traffic is available. If the beacon interval is set to the typical 100 milliseconds, and the listen interval is 10, then that can be up to a second of delay.
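The delay arithmetic is easy to check, and it helps to express the result in voice packets. The function name below is illustrative, and the 20 millisecond packet interval is the one used for the voice examples elsewhere in this text.

```python
def worst_case_wait_ms(beacon_interval_ms: int, listen_interval: int) -> int:
    """Longest a buffered frame can wait before the client next checks a beacon."""
    return beacon_interval_ms * listen_interval


wait_ms = worst_case_wait_ms(beacon_interval_ms=100, listen_interval=10)
voice_packets_waiting = wait_ms // 20     # one voice packet every 20 ms
print(wait_ms, voice_packets_waiting)
```

A full second of delay corresponds to 50 voice packets piling up at the access point, far more than a phone's playout buffer is typically sized to absorb.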
Broadcast and multicast frames are also covered in the legacy scheme. However, no polling is necessary for those frames to be delivered. Instead, the access point sets aside a certain number of the beacons for multicast traffic. If any client on the access point is in legacy power save mode, the access point will buffer all multicast traffic. The special beacons known as Delivery Traffic Indication Messages (the poorly named DTIM) are just like regular beacons, except that they come every so many beacons—when the next one is coming is signaled as a part of the TIM in every beacon—and they signal if multicast traffic is buffered. If multicast traffic is buffered, the TIM has the zeroth bit, corresponding to AID 0, set. If clients receive a beacon with that bit set, they know that the next frames coming from the access point will be all of the multicast frames buffered. Each multicast frame, except for the last one, will have the More Data bit set. Thus, clients can stay awake to collect all multicast traffic, and then go back to sleep after the last multicast data frame, with the cleared More Data bit, comes through. (Of course, if that last frame is lost, being multicast, the clients will have to decide on their own when to return to sleep.) The consequence of the all-or-nothing multicast buffering is that multicast traffic on Wi-Fi when any device is in power save is not generally suitable for real-time traffic! Look for architectures that provide solutions for this problem if real-time multicast is a priority for your network.
Finally, I haven't gone into details on how the TIM bits are compressed. It is not easy to read the TIM bits by hand, but a good wireless protocol analyzer will be able to read them for you, and let you know which AIDs are set in any beacon.

How Wi-Fi Multimedia (WMM) Works



It is not easy to directly see the consequences of WMM creating multiple queues that act to access the air independently. But it is important to understand what makes WMM work, to understand how WMM (and thus, voice) scales in the network.
Looking at the common WMM parameters, we can see that the main way that WMM provides priority for voice is by letting voice use a faster backoff process than data. The shorter AIFS helps, by giving voice a small chance of transmitting before data even gets a chance, but the main mechanism is by allowing voice to transmit, on average, with a quarter of the waiting time that best effort data has.
This mechanism works quite well when there is a small amount of voice traffic on a network with a potentially large amount of data. As long as voice traffic is scarce, any given voice packet is much more likely to get on the air as soon as it is ready, causing data to build up as a lower priority. This is one of the consequences of having different queues for traffic. As an analogy, picture the security lines at airports. Busy airports usually have two separate lines, one line for the average traveler, and another line for first-class passengers and those who fly enough to gain "elite" status on the airlines. When the line for the average traveler (the "best effort" line) is full of people, a short line for first-class passengers gives those passengers a real advantage. In other words, we can think of best effort and voice as mostly independent. The problem, then, is if there are too many first-class passengers. For WMM, the problem happens when there is "too much" voice traffic. Unlike with the children of Lake Wobegon, not everyone can be above average.
Let's look at this more methodically. The backoff value is the primary mechanism through which Wi-Fi is affected by density. As the number of clients increases, the chance of collision increases. Unfortunately, WMM provides for quality of service by reducing the number of slots of the backoff, thus making the network more sensitive to density. Again, if voice is rare, then its own density is low, and so a voice packet is not likely to collide with other voice packets, and the aggressive backoff settings for voice, compared to data, allow for voice to get on the network with higher probability. However, when the density of voice goes up, the aggressive voice backoff settings cause each voice packet to fight with the other voice packets, leading to more collisions and higher loss.
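The density effect can be made concrete with a simplified model. The sketch below assumes every contending station draws its backoff uniformly from 0 to the contention window and that a collision occurs whenever the earliest slot is chosen by more than one station; AIFS differences, retries, and window doubling are ignored. The window values of 3 for voice and 15 for best effort are assumed typical defaults, not figures from this text.

```python
def collision_probability(stations: int, cw: int) -> float:
    """Chance that the earliest backoff slot is picked by more than one station."""
    slots = cw + 1
    # A clean win: one station picks slot k and all the others pick a later slot.
    clean_win = sum(
        stations * (1 / slots) * ((slots - 1 - k) / slots) ** (stations - 1)
        for k in range(slots)
    )
    return 1 - clean_win


for n in (2, 5, 10, 20):
    voice = collision_probability(n, cw=3)
    best_effort = collision_probability(n, cw=15)
    print(f"{n:2d} stations: voice {voice:.2f}, best effort {best_effort:.2f}")
```

With two contenders the model gives a collision chance of one in four for the voice window against one in sixteen for the best effort window, and the gap widens quickly as more voice stations contend at once.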
One solution for this problem is to limit the number of voice calls in a cell, thus ensuring that the density of voice never gets that high. This is called admission control. Another, independent solution is for the system to provide a more deterministic quality of service, by intelligently setting the WMM parameters away from the defaults. This exact purpose is envisioned by the standard, but most equipment today expects the user to hand-tune these values, something which is not easy.

Quality of Service with WMM-How Voice and Data Are Kept Separate

The first challenge is to address the unique nature of voice. Unlike data, which is usually carried over protocols such as TCP that are good at making sure they take the available bandwidth and nothing more, ensuring a continuous stream of data no matter what the network conditions, voice is picky. One packet every 20 milliseconds. No more, no less. The packets cannot be late, or the call becomes unusable as the callers are forced to wait for maddening periods before they hear the other side of their conversation come through. The packets cannot arrive unpredictably, or else the buffers on the phones overrun and the call becomes choppy and impossible to hear. And, of course, every lost packet is lost time and lost sounds or words.
On Ethernet, as we have seen, the notion of 802.1p or Diffserv can be used to give prioritization for voice traffic over data. When the routers or switches are congested, the voice packets get to move through priority queues, ahead of the data traffic, thus ensuring that their resources do not get starved, while still allowing the TCP-based data traffic to continue, albeit at a possibly lesser rate.
A similar principle applies to Wi-Fi. The Wi-Fi Multimedia (WMM) specification lays out a method for Wi-Fi networks to also prioritize traffic according to four common classes of service, each known as an access category (AC):
  • AC_VO: highest-priority voice traffic
  • AC_VI: medium-priority video traffic
  • AC_BE: standard-priority data traffic, also known as "best effort"
  • AC_BK: background traffic, that may be disposed of when the network is congested
The underscore between the AC and the two-letter abbreviation is a part of the correct designation, unfortunately. You may note that the term "best effort" applies to only one of the four categories. Please keep in mind that all four access categories of Wi-Fi are really best effort, but that the higher-priority categories get a better effort than the lower ones. We'll discuss the consequences of this shortly.
The access category for each packet is specified using either 802.1p tagging, when available and supported by the access point, or Diffserv Code Points (DSCP), which are carried in the IP header of each packet. DSCP is the more common protocol, because the per-packet tags do not require any complexity on the wired network, and are able to survive multiple router hops with ease. In other words, DSCP tags survive crossing network equipment that is not aware of DSCP tags, whereas 802.1p requires 802.1p-aware links throughout the network, all carried over 802.1Q VLAN links.
There are eight DSCP tags, which map to the four access categories. The application that generates the traffic is responsible for filling in the DSCP tag. The standard mapping is given in Table 1.
Table 1: DSCP tags and AC mappings
DSCP        TOS Field Value   Priority   Traffic Type   AC
0x38 (56)   0xE0 (224)        7          Voice          AC_VO
0x30 (48)   0xC0 (192)        6          Voice          AC_VO
0x28 (40)   0xA0 (160)        5          Video          AC_VI
0x20 (32)   0x80 (128)        4          Video          AC_VI
0x18 (24)   0x60 (96)         3          Best Effort    AC_BE
0x10 (16)   0x40 (64)         2          Background     AC_BK
0x08 (8)    0x20 (32)         1          Background     AC_BK
0x00 (0)    0x00 (0)          0          Best Effort    AC_BE
There are a few things to note here. First is that the eight "priorities" (again, the correct term, unfortunately) map to only four truly different classes. There is no difference in quality of service between Priority 7 and Priority 6 traffic. This was done to simplify the design of Wi-Fi, in which it was felt that four classes are enough. The next thing to note is that many packet capture analyzers will still show the one-byte DSCP field in the IP header using the older TOS interpretation. The values in the TOS column are therefore meaningless under the old TOS interpretation, but you can look for those specific values and map them back to the corresponding ACs. Even the DSCP field itself has a lot of possibilities; nonetheless, you should count on only the previous eight values as having any meaning for Wi-Fi, unless the documentation for your equipment explicitly states otherwise. Finally, note that the default value of 0 maps to best effort data, as does the Priority 3 (DSCP 0x18) value. This strange inversion, where background traffic, with an actual lower over-the-air priority, has a higher Priority code value than the default best effort traffic, can cause some confusion; thankfully, most applications do not use Priority 3, and its use is not recommended here either.
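The relationship between the columns of Table 1 can be written down as a few lines of code. This is a minimal sketch that assumes a device looks only at the top three bits of the DSCP, which is the default mapping described above; the function and constant names are illustrative and do not come from any WMM API.

```python
# Priority (top three bits of the DSCP) -> access category, per Table 1.
PRIORITY_TO_AC = {
    7: "AC_VO", 6: "AC_VO",
    5: "AC_VI", 4: "AC_VI",
    3: "AC_BE", 0: "AC_BE",
    2: "AC_BK", 1: "AC_BK",
}


def tos_byte_to_dscp(tos: int) -> int:
    """Convert the one-byte TOS value shown by an analyzer to the DSCP."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("the TOS field is one byte")
    return tos >> 2  # the low two bits of the byte are not part of the DSCP


def dscp_to_priority(dscp: int) -> int:
    """Return the priority carried in the top three bits of a 6-bit DSCP."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("the DSCP is a 6-bit value")
    return dscp >> 3


def access_category(dscp: int) -> str:
    """Look up the access category for a DSCP using the Table 1 mapping."""
    return PRIORITY_TO_AC[dscp_to_priority(dscp)]


if __name__ == "__main__":
    for tos in (0xE0, 0xC0, 0xA0, 0x80, 0x60, 0x40, 0x20, 0x00):
        dscp = tos_byte_to_dscp(tos)
        print(f"TOS 0x{tos:02X} -> DSCP {dscp:2d} -> "
              f"Priority {dscp_to_priority(dscp)} -> {access_category(dscp)}")
```

Reading a capture then becomes a matter of taking the TOS byte the analyzer displays, shifting it right by two bits to get the DSCP, and looking up the access category.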
A word of warning about DSCP and WMM. The DSCP codes listed in Table 1 are neither Expedited Forwarding nor Assured Forwarding codes, but rather use the backward-compatibility requirement in DSCP for TOS precedence. TOS precedence uses the top three bits of the DSCP to represent the priorities in Table 1, and assigns other meanings to the lower bits. If a device is using the one-byte DSCP field as a TOS field, WMM devices may or may not ignore the lower bits, and so can sometimes give no quality of service for tagged packets. Further complicating the situation are endpoints that generate Expedited Forwarding (EF) DSCP tags (with a code value of 46). Expedited Forwarding is the tag that devices use when they want to provide higher quality of service in general; such devices will usually mark all quality-of-service packets as EF, and all best effort packets with a DSCP of 0. The EF code of 46 maps, however, to the Priority value of 5, which is a video, not voice, category. Thus, WMM devices may map all packets tagged with Expedited Forwarding as video. A wireless protocol analyzer shows exactly what the mapping is: look at the value of the TID/Access Category field in the WMM header.
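To see the Expedited Forwarding problem from the application's side, the sketch below marks a UDP socket with a TOS byte and checks which priority the top three bits select. It uses the standard IP_TOS socket option; whether the operating system honors it, and whether every hop preserves the marking, is an assumption that must be verified on the actual platform. The constant and function names are illustrative.

```python
import socket

VOICE_TOS = 0xC0  # TOS byte for DSCP 48 (Priority 6, AC_VO in Table 1)
EF_TOS = 0xB8     # TOS byte for DSCP 46 (Expedited Forwarding)


def wmm_priority_from_tos(tos: int) -> int:
    """Top three bits of the TOS byte, which the default mapping keys on."""
    return (tos >> 5) & 0x7


def open_marked_udp_socket(tos: int = VOICE_TOS) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets request `tos`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TOS sets the whole one-byte field; the DSCP is its top six bits.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock


if __name__ == "__main__":
    print("EF priority:", wmm_priority_from_tos(EF_TOS))
    print("Voice priority:", wmm_priority_from_tos(VOICE_TOS))
    sock = open_marked_udp_socket()
    sock.close()
```

An endpoint that marks its media with the EF byte lands on Priority 5 under the default mapping, while the 0xC0 byte lands on Priority 6, which is one reason to check the marking an endpoint actually emits.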
This mapping can be configured on some devices. However, changing these values from the defaults can cause problems with the more advanced pieces of WMM, such as WMM Power Save and WMM Admission Control, so it is not recommended to make those changes. (The specific problem that would happen is that the mobile device is required to know what priority the other side of the call will be sending to it, and if the network changes it in between, then the protocols will get confused and not put the downstream traffic into the right buckets.)
Once the Wi-Fi device—the access point or the client—has the packet and knows its tag, it will assign the packet into one of four priority queues, based on the access categories. However, these queues are not like their wired Ethernet brethren. That is because it is not enough that voice be prioritized over data within the device; voice must also be prioritized over the air.
To achieve this, WMM changes the backoff procedure. Instead of each device waiting a random time less than some interval fixed in the standard, each device's access category gets to contend for the air individually. Furthermore, to get the over-the-air prioritization, higher quality-of-service access categories, such as voice, get more aggressive access parameters.
Each access category gets four parameters that together determine how much priority the traffic in that category gets over the air, compared to the other categories. The first parameter is a unique per-packet minimum wait time called the Arbitration Interframe Spacing (AIFS). This parameter is the minimum amount of time that a packet in this category must wait before it can even start to back off. The longer the AIFS, the more a packet must wait, and the more likely it is that a higher-priority packet will have finished its backoff cycle and started transmitting. The key about the AIFS is that it is counted after every time the medium is busy. That means that a packet with a very high AIFS could wait a very long time, because the amount of time spent waiting for an AIFS does not count if the medium becomes busy in the meantime. The AIFS is measured in units of the number of slots, and thus is also called the AIFSn (AIFS number).
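As a rough illustration of what the AIFS number means in time, the sketch below converts it to microseconds. It relies on two things that are not stated in this chapter and are included as assumptions from the 802.11 standard: that the AIFS duration is one SIFS plus AIFSn slot times, and the slot and SIFS durations for the 802.11b and 802.11a physical layers. The names are illustrative.

```python
# Slot and SIFS durations in microseconds. These PHY timings are taken from
# the 802.11 standard rather than from the text, and are assumptions here.
PHY_TIMING_US = {
    "802.11b": {"slot": 20, "sifs": 10},
    "802.11a": {"slot": 9, "sifs": 16},
}


def aifs_us(aifsn: int, phy: str = "802.11a") -> int:
    """AIFS duration in microseconds: one SIFS plus AIFSn slot times."""
    timing = PHY_TIMING_US[phy]
    return timing["sifs"] + aifsn * timing["slot"]


if __name__ == "__main__":
    for name, aifsn in (("voice", 2), ("best effort", 3), ("background", 7)):
        print(f"{name:12s} AIFSn={aifsn} -> {aifs_us(aifsn)} us on 802.11a")
```

The differences look small in absolute terms, but because the wait restarts after every busy period, each extra slot is a recurring head start for the higher-priority categories.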
The second value is the minimum backoff CW, called the CWmin. This sets the smallest contention window, in slots, that the backoff counter for this particular AC starts with. As with pre-WMM Wi-Fi, the CW is not the exact number of slots that the client must wait, but the maximum number of slots that the packet must wait: the packet waits a random number of slots no greater than this value. The difference is that there is a different CWmin for each access category. The CWmin is still measured in slots, but it is communicated to the client from the access point as an exponent of two, called the ECWmin, from which it is derived: CWmin = 2^ECWmin - 1. Thus, if the ECWmin for video is 3, then the AC must pick a random number between 0 and 2^3 - 1 = 7 slots. The CWmin is just as powerful as the AIFS in distinguishing traffic, by making access more aggressive by capping the number of slots the AC must wait to send its traffic.
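The combined effect of the AIFS and the CWmin can be estimated with a short enumeration. This is a minimal sketch under a deliberately simple assumption: two categories begin one fresh contention round together, and each waits its AIFS number of slots plus a uniformly random backoff between 0 and its CWmin. Retries, frozen counters, and collisions are ignored, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def cw_from_exponent(ecw: int) -> int:
    """Contention window in slots for an exponent such as the ECWmin."""
    return (1 << ecw) - 1


def p_first(aifsn_a: int, ecwmin_a: int, aifsn_b: int, ecwmin_b: int) -> Fraction:
    """Probability that category A starts transmitting strictly before B.

    Simplified model (an assumption): a single fresh contention round in
    which each category waits AIFSn + uniform(0..CWmin) slots.
    """
    waits_a = [aifsn_a + s for s in range(cw_from_exponent(ecwmin_a) + 1)]
    waits_b = [aifsn_b + s for s in range(cw_from_exponent(ecwmin_b) + 1)]
    wins = sum(1 for a, b in product(waits_a, waits_b) if a < b)
    return Fraction(wins, len(waits_a) * len(waits_b))


if __name__ == "__main__":
    # Example values: voice with AIFSn 2 and ECWmin 2, data with AIFSn 3 and ECWmin 4.
    print(float(p_first(2, 2, 3, 4)))
```

The example values are chosen for illustration; substituting the parameters a given access point advertises shows how strongly a small CWmin tilts the contest.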
The third parameter is similar to the minimum backoff CW, and is called the CWmax, or the maximum backoff CW. If you recall, the CW is required to double every time the sender fails to get an acknowledgement for a frame. However, that doubling is capped by the CWmax. This parameter is far less powerful for controlling how much priority one AC gets over the other. As with the CWmin, there is a different CWmax for each AC.
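The capped doubling is easy to trace in code. The sketch below assumes the usual form of the doubling, in which a window of CW slots becomes 2 x (CW + 1) - 1 after a failed attempt, and takes the CWmin and CWmax as exponents of two; the function name is illustrative.

```python
from typing import List


def cw_sequence(ecwmin: int, ecwmax: int, retries: int) -> List[int]:
    """Contention window used for the first attempt and each retry."""
    cw = (1 << ecwmin) - 1
    cwmax = (1 << ecwmax) - 1
    sequence = [cw]
    for _ in range(retries):
        # Double the window after a missing acknowledgement, capped at CWmax.
        cw = min(2 * (cw + 1) - 1, cwmax)
        sequence.append(cw)
    return sequence


if __name__ == "__main__":
    print("exponents 2..3: ", cw_sequence(2, 3, 3))
    print("exponents 4..10:", cw_sequence(4, 10, 7))
```

With exponents 2 and 3 the window reaches its cap after a single failure, which is why the CWmax does little to separate the categories compared with the CWmin.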
The last parameter is how many microseconds the AC can burst out packets, before it has to yield the channel. This is known as the Transmit Opportunity Limit (TXOP Limit), and is measured in units of 32 microseconds (although user interfaces may show the microsecond equivalent). This notion of TXOPs is new with WMM, and is designed to allow for this bursting. For voice, bursting is usually not necessary or useful, because voice packets come on regular, well-spaced intervals, and rarely come back-to-back in properly functioning networks.
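Because user interfaces may show microseconds while the protocol carries units of 32 microseconds, a small conversion helper can be handy. This is a minimal sketch with illustrative names.

```python
TXOP_UNIT_US = 32  # the TXOP Limit is carried in units of 32 microseconds


def txop_units_to_us(units: int) -> int:
    """Convert a TXOP Limit field value to microseconds."""
    return units * TXOP_UNIT_US


def txop_us_to_units(microseconds: int) -> int:
    """Convert microseconds to a TXOP Limit field value."""
    if microseconds % TXOP_UNIT_US:
        raise ValueError("TXOP limits are whole multiples of 32 microseconds")
    return microseconds // TXOP_UNIT_US


if __name__ == "__main__":
    for us in (1504, 3008, 3264, 6016):
        print(f"{us} us -> field value {txop_us_to_units(us)}")
```

A value of 0 is a special case worth remembering when reading these fields: it is not a zero-length opportunity, and is commonly understood to allow one frame per channel access.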
The access point has the ability to set these four AC parameters for every device in the network, by broadcasting the parameters to all of the clients. Every client, thus, has to share the same parameters. The access point may also have a different set for itself. Some access points set these values by themselves to optimize network access; others expose them to the user, who can manually override the defaults. WMM conveys these values to the clients through the WMM Parameter Set information element, a structure that is present in every beacon and can be seen clearly with a wireless packet capture system. Table 2 has the defaults for the WMM parameters.
Table 2: Common default values for the WMM parameters for 802.11
                  Client                  Access Point                           TXOP limit
AC                AIFS  CWmax             AIFS  CWmax             CWmin          802.11b   802.11agn
Background (BK)   7     2^10 - 1 = 1023   7     2^10 - 1 = 1023   2^4 - 1 = 15   0 μs      0 μs
Best Effort (BE)  3     2^10 - 1 = 1023   3     2^6 - 1 = 63      2^4 - 1 = 15   0 μs      0 μs
Video (VI)        2     2^4 - 1 = 15      1     2^4 - 1 = 15      2^3 - 1 = 7    6016 μs   3008 μs
Voice (VO)        2     2^3 - 1 = 7       1     2^3 - 1 = 7       2^2 - 1 = 3    3264 μs   1504 μs
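When these values are read out of a beacon capture, they appear as four 4-byte records, one per access category. The sketch below decodes such records. The field layout it assumes (AIFSn in the low four bits of the first byte, the category index in bits 5 and 6, the ECWmin and ECWmax in the low and high halves of the second byte, and a little-endian TXOP limit in units of 32 microseconds) is my reading of the WMM specification and should be checked against it; the names and the hand-encoded sample bytes are illustrative.

```python
import struct
from typing import List, NamedTuple


class AcParams(NamedTuple):
    ac: str
    aifsn: int
    cwmin: int
    cwmax: int
    txop_us: int


# Category index -> name (assumed ordering; verify against the specification).
ACI_NAMES = {0: "BE", 1: "BK", 2: "VI", 3: "VO"}


def decode_ac_records(data: bytes) -> List[AcParams]:
    """Decode consecutive 4-byte AC parameter records."""
    if len(data) % 4:
        raise ValueError("expected a whole number of 4-byte records")
    records = []
    for aci_aifsn, ecw, txop in struct.iter_unpack("<BBH", data):
        records.append(AcParams(
            ac=ACI_NAMES[(aci_aifsn >> 5) & 0x3],
            aifsn=aci_aifsn & 0x0F,
            cwmin=(1 << (ecw & 0x0F)) - 1,  # low nibble is the ECWmin
            cwmax=(1 << (ecw >> 4)) - 1,    # high nibble is the ECWmax
            txop_us=txop * 32,
        ))
    return records


# Client defaults from Table 2 (802.11agn TXOP column), encoded by hand.
DEFAULT_CLIENT_RECORDS = bytes.fromhex("03a40000" "27a40000" "42435e00" "62322f00")

if __name__ == "__main__":
    for record in decode_ac_records(DEFAULT_CLIENT_RECORDS):
        print(record)
```

Decoding the sample bytes gives back the client column of Table 2, which is a quick way to confirm whether an access point is advertising the defaults or a tuned set.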