Real-Time Transport | IP Telephony-Related Standards

This section deals with the standards that are pertinent to the mechanisms of carrying voice and video over IP networks. These standards are essential to interworking with the PSTN because IP telephony gateways need to convert the IP voice and video payload into a form that is accepted by the PSTN and, conversely, translate the PSTN voice and video payload into a form that is accepted by IP networks. The gateways also need to reconstruct the voice or video stream so that it is as close to the original as possible. Naturally, such reconstruction should retain the real-time properties of the original stream. In addition, an interactive application (such as a two-person voice call) requires that the transport service itself be fast, reliable, and perceived as “free” of jitter (that is, high variation in delay) to maintain the perception of a real-time interaction.

These real-time transport requirements explain why the protocol suite developed by the IETF Audio/Video Transport (avt) working group has been called the Real-Time Transport Protocol (RTP) in RFC 1889 (Schulzrinne et al., 1996). RTP has been designed for multicast as well as point-to-point transmission and is accompanied by its quality control component, the RTP Control Protocol (RTCP). Both protocols are carried by the User Datagram Protocol (UDP).

RTP specifies the header of the packets that carry streams of encoded audio or video samples. This encoding is performed by a device (or software module) called a coder; the subsequent decoding is performed by a decoder, but for full-duplex communications, both are usually combined in a codec. RTP specifies the payload format, which, in turn, identifies a specific codec. (The avt working group has also developed a number of RFCs that deal with payload formats.) The codec header, which is appended to the RTP header, determines the format of the attached encoded data unit (called a frame).
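As a rough illustration of what "specifying the header" means in practice, the sketch below decodes the 12-byte fixed RTP header laid out in RFC 1889. The names `RtpHeader` and `parse_rtp_header` are hypothetical, and the sketch ignores any contributing-source list or header extension that may follow the fixed part.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Fields of the fixed RTP header (hypothetical container type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the first 12 bytes of an RTP packet (network byte order)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet carries at least a 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,              # top 2 bits
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),       # the marker bit
        payload_type=b1 & 0x7F,       # identifies the payload format
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

The payload type field is what a receiver would use to pick the matching decoder; the mapping from number to codec is defined in the separate payload-format documents, not in this sketch.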

Since UDP does not guarantee sequencing (that is, arrival of packets in the order they were sent), this function is assisted by RTP, which stipulates the inclusion of sequence numbers in packets. Sequence numbers are used at the receiver not only to reconstruct the original sequence, but also to keep count of lost packets (one of the quality of service statistics fed back to the sender via RTCP).
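The receiver-side use of sequence numbers can be sketched as follows. This is a minimal illustration, not an implementation from RFC 1889: the function name is hypothetical, duplicates are simply dropped, and the wraparound of the 16-bit sequence number is deliberately ignored.

```python
def reorder_and_count_losses(packets):
    """Restore sending order and count gaps.

    packets: iterable of (sequence_number, payload) in arrival order.
    Returns (payloads_in_order, lost_count).
    Assumption: sequence numbers do not wrap within this batch.
    """
    unique = dict(packets)              # drops duplicate sequence numbers
    ordered = sorted(unique.items())    # restore the original order
    if not ordered:
        return [], 0
    first, last = ordered[0][0], ordered[-1][0]
    expected = last - first + 1         # how many packets should have arrived
    lost = expected - len(ordered)
    return [payload for _, payload in ordered], lost


# Packets 2 and 5 never arrive; 3 overtakes 1 in the network.
arrivals = [(3, "c"), (1, "a"), (4, "d"), (6, "f")]
payloads, lost = reorder_and_count_losses(arrivals)
```

The loss count computed this way is the kind of statistic a receiver would report back through RTCP.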

RTP deals with any jitter by time-stamping packets. At the receiving end, the play-out devices buffer the packets and then reconstruct the stream at the original rate. Another synchronization mechanism is the marker bit of the header, which, according to RFC 1889, “is intended to allow significant events such as frame boundaries to be marked in the packet stream.”
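To make the role of the timestamps concrete, here is a small sketch of the running interarrival jitter estimate described in RFC 1889, in which each new difference in transit time is folded into the estimate with a gain of 1/16. The function name and the input format are assumptions made for the example; arrival times are taken to be expressed in the same units as the RTP timestamps.

```python
def interarrival_jitter(samples):
    """Running jitter estimate over (rtp_timestamp, arrival_time) pairs.

    For consecutive packets, D is the change in transit time
    (arrival - timestamp), and the estimate is updated as
    J = J + (|D| - J) / 16.
    """
    jitter = 0.0
    previous_transit = None
    for sent, received in samples:
        transit = received - sent
        if previous_transit is not None:
            d = abs(transit - previous_transit)
            jitter += (d - jitter) / 16.0
        previous_transit = transit
    return jitter
```

A stream whose packets all take the same time to cross the network yields an estimate of zero, whatever the absolute delay is; only the variation counts.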

The RTCP packets are sent to exactly the same addresses as the RTP ones, but on different ports. The primary function of RTCP is to carry, from receivers to senders, the statistics on the number of lost packets, jitter, and round-trip delays. RTCP carries sender reports in the opposite direction. The statistics are used by senders to adjust encoding rates (and, possibly, the choice of codecs) in order to use less bandwidth. In addition, the statistics are useful for network management as the mechanism to detect the type and location of network problems (such as congestion). In addition to supporting quality control, RTCP performs the following functions:

  • Synchronization of video and audio streams
  • Identification of session participants (by their full names, telephone numbers, and e-mail addresses)
  • Session control (through indication that a user is leaving the session and user-to-user control messages)
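The "same addresses, different ports" arrangement for RTP and RTCP follows a simple convention in RFC 1889: RTP uses an even UDP port and RTCP the next higher odd one. A minimal sketch, with a hypothetical helper name:

```python
from typing import Tuple


def rtp_rtcp_ports(port: int) -> Tuple[int, int]:
    """Return the (RTP, RTCP) port pair for a session.

    If an odd port is supplied, it is rounded down to the even port
    below it, and RTCP takes the odd port immediately above.
    """
    rtp_port = port - (port % 2)
    return rtp_port, rtp_port + 1
```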

Real-Time Streaming Protocol (RTSP), developed in the Multiparty Multimedia Session Control (mmusic) working group, is a network remote control for multimedia services, as defined in RFC 2326 (Schulzrinne et al., 1998). The main purpose of the protocol is to control a device for so-called stored media [for example, a compact disc (CD) player, tape recorder, and so on]. But the control here actually encompasses playing the device, which involves the transfer of the stream across the network. The applicability of this protocol to the task of integrating the PSTN and the Internet can be found in the areas of voice and video messaging. Like SIP, RTSP is a descendant of HTTP, but unlike SIP, RTSP maintains a virtual connection by assigning a session identifier at the beginning of the session and then keeping it in all messages relevant to the session. RTSP defines its own URL in reference to the media servers. RTSP can also interwork with SIP, as explained in Schulzrinne and Rosenberg (1999).

Intelligent Network

The Intelligent Network technology is at the heart of PSTN service provision. The whole range of convergence products and services (such as Internet call-waiting, click-to-dial, and universal mailbox, just to mention a few) rely on and expand this technology.

The Intelligent Network standards are published by ITU in the Q.12xy series. The IN architecture, in accordance with open distributed processing (ODP) principles, is viewed in terms of planes: the service plane is concerned only with the service description in terms of service features; the global functional plane deals with the service-independent building blocks (SIBs); the distributed functional plane addresses the elements of the architecture involved in the IN message exchange in terms of functional entities (FEs) (that is, the objects that are not associated with any box) and information flows (IFs), which model the message exchange among FEs; and the physical plane defines the actual boxes, called physical entities (PEs), and maps the FEs to PEs.

As with most other ITU-T standards, the IN Recommendations are being adopted by regional standards bodies for use in their respective countries; however, we address only the ITU-T standards. You can find a detailed description of the IN standards up to CS-2 in Faynberg et al. (1997), but the text of the ITU-T Recommendations and their regional counterparts is, naturally, the ultimate reference.

Although the IN standards have provided in both IN CS-1 and CS-2 an effective model for service creation by specifying the so-called service-independent building blocks (SIBs), the standardization of SIBs stopped after CS-2. Because standardization efforts ended, and the existing SIBs were not designed for interworking the PSTN and the Internet, we do not cover them.

The functional architecture and the mechanism for triggering the interactions between its elements are essential topics related to capability sets, addressed in the sections that follow. Figure 1 describes a subset of the presently standardized FEs and their interconnections.

Figure 1: IN capability set 1 (CS-1) functional architecture
Source: ITU-T Recommendation Q.1211

The IN FEs are grouped according to their role in supporting IN: FEs involved in service execution and FEs involved in service creation and management.

The service execution FEs are:

  • Call control agent function (CCAF). Provides user access capabilities. It may be viewed as a proxy for a telephone (or ISDN terminal) through which a user interacts with the network.
  • Call control function (CCF). Provides the basic switching capabilities available in any (IN or non-IN) switching system. These include the capabilities to establish, manipulate, and release calls and connections. It is the CCF that provides the trigger capabilities; however, another object called the service switching function (SSF) is needed to support the recognition of triggers as well as interactions with the service control. The SSF and CCF are supposed to be colocated (that is, they cannot be placed in different PEs).
  • Service control function (SCF). Executes service logic. It provides capabilities to influence call processing by requesting the SSF/CCF and other service execution FEs to perform specified actions. Implicitly, the SCF provides mechanisms for introducing new services and service features independent of switching systems. It is therefore the function that interworks (via applicable gateways) with the IP hosts in support of joint service control. Two main principles of both CS-1 and CS-2 standards are single-endedness (that is, the service logic is aware of only one relation with the SSF/CCF for the purpose of a given terminating or originating call process) and single point of control (that is, only one instance of service logic may be in contact with the SSF/CCF for the purpose of a given terminating or originating call process).
  • Specialized resource function (SRF). Provides a set of real-time capabilities, which Recommendation Q.1204 calls specialized. These capabilities include playing announcements and collecting user input [either dual-tone-multi-frequency (DTMF) or voice, depending on the facilities]. The SRF is also responsible for conference bridging, fax support, and certain types of protocol conversion as well as text-to-speech (and vice versa) conversion. The SRF is crucial to supporting services like click-to-fax. To this end, the SRF also interworks with IP hosts, although—unlike the SCF—it supports the delivery of a service rather than the control of it.
  • Service data function (SDF). Provides generic database capabilities to either the SCF or another SDF.

The following three service creation and management FEs are defined in ITU-T Recommendation Q.1204:

  • Service creation environment function (SCEF). Responsible for developing (programming) and testing service logic, which is then sent to the service management function (SMF).
  • Service management function (SMF). Deploys the service logic (originally developed within the SCEF) to the service execution FEs, and otherwise administers these FEs by supplying user-defined parameters for customization of the service and collecting from them the billing information and service execution statistics.
  • Service management agent function (SMAF). Acts as a computer terminal that provides the user interface to the SMF.
These entities serve to complete the architecture and reflect the industry development. No associated protocols have been defined by ITU-T.

As you may recall, the fundamental idea of IN is to open the basic switching process (run by switches) to external influence. This is done by defining and standardizing the call model.

The CS-2 Basic Call State Model (BCSM) is the latest released standard. It models both the originating and terminating basic call processes, which are depicted in Figures 3 and 4, respectively. In both processes, the primary states of calls as seen by a switch, depicted within rectangles, are termed points in call (PICs). In addition, certain transitions lead to other states called detection points (DPs). It is at DPs that the switch may interrupt its processing by sending a message to the SCF. Each DP may be either armed or unarmed. As far as IN is concerned, being armed is the first essential prerequisite for being active, for only when a DP is armed is the external service logic (within the SCF) informed that the DP has been encountered.

Figure 3: Originating CS-2 BCSM.
Source: ITU-T Recommendation Q.1224

A DP may be armed either statically (from the SMF, as the result of the service feature provisioning) or dynamically (by the SCF). If it is statically armed, the DP remains armed until the SMF disarms it—as long as the service that needs it is to be offered; if it is dynamically armed, the DP will remain armed for no longer than the duration of a particular SCF-to-SSF relationship. A statically armed DP is called a trigger detection point (TDP); a dynamically armed DP is called an event detection point (EDP). The DP nomenclature is illustrated in Figure 5.
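The difference between static and dynamic arming can be modeled in a few lines. The sketch below is only an illustration of the rules just stated: the class and method names are hypothetical, the DP names are used as plain labels, and a real switch would of course keep this state per call and per subscriber.

```python
class DetectionPoints:
    """Tracks which DPs are armed and how (illustrative model only)."""

    def __init__(self):
        self._static = set()    # armed from the SMF by provisioning: TDPs
        self._dynamic = set()   # armed by the SCF for one relationship: EDPs

    def arm_static(self, dp):
        self._static.add(dp)

    def disarm_static(self, dp):
        self._static.discard(dp)

    def arm_dynamic(self, dp):
        self._dynamic.add(dp)

    def end_relationship(self):
        # Dynamically armed DPs do not outlive the SCF-to-SSF relationship.
        self._dynamic.clear()

    def classify(self, dp):
        if dp in self._dynamic:
            return "EDP"
        if dp in self._static:
            return "TDP"
        return None             # unarmed: the SCF is not informed

    def should_notify_scf(self, dp):
        return self.classify(dp) is not None


dps = DetectionPoints()
dps.arm_static("Collected_Info")   # provisioned with the service
dps.arm_dynamic("O_Answer")        # requested by service logic mid-call
```

After `dps.end_relationship()` only the statically armed DP would still cause the SCF to be informed.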

Figure 4: Terminating CS-2 BCSM.
Source: ITU-T Recommendation Q.1224

Figure 5: Detection point processing.
Source: ITU-T Recommendation Q.1214

The DP nomenclature is essential for understanding how IN can interact with the IP networks. When a service originates from the Internet, the DPs in the PSTN may be armed only dynamically. When, on the other hand, a service originates from the PSTN, all DPs that can invoke it must be armed beforehand (but other DPs can be armed dynamically in the process of service delivery). These cases are respectively addressed by the IETF PSTN/Internet INTernetworking (pint) working group (pint-charter.html) and the Service in the PSTN/IN Requesting Internet Service (spirits) working group, in cooperation with ITU-T SG 11.

Figure 6 depicts the distribution of the IN functional entities among the subset of physical entities in CS-1. (For purposes of our discussion, we have reduced this subset to the bare minimum.) The physical entities are:

  • Service switching point (SSP). A switch that provides access to IN capabilities.
  • Service control point (SCP). A general-purpose computer that has access to the SS No. 7 network for communicating with SSPs and IPs.
  • Service data point (SDP). Contains only the SDF. The SCP can access data in an SDP either directly or through a signaling network.
  • Intelligent peripheral (IP). Its function is primarily to support the SRF. However, it may also include the SSF/CCF to provide external access to resources.
  • Adjunct (AD). Functionally equivalent to an SCP, but connected to a single switch via a high-speed network, not the SS No. 7 network.
  • Service node (SN). Similar to an AD, but in addition to performing the role of an SCP, it can perform the role of an IP. The SN connects to switches via the ISDN interface. (As you will see in Part Three, present implementations often include the SCP and a small SSP as part of SN offers; however, those SCPs and SSPs act independently rather than as part of the group of standard SN functions.)

Figure 6: IN CS-1 physical architecture (after Q.1215).
Source: ITU-T Recommendation Q.1215

The Intelligent Network Application Part (INAP) protocol is defined as a TC user.

Signalling System No. 7 (SS No. 7)

The SS No. 7 standards are essential to a wide range of convergence products for a number of reasons, of which we list three here. First, Internet access servers need to access the PSTN signaling network by means of the SS No. 7 gateways. Second, IP telephony gateways need to interact with the PSTN switches in order to establish PC-to-phone (or phone-to-PC) calls and IP trunking. Third, IP telephony gateways, soft switches, and access servers need to access the PSTN service control. For some of these purposes, SS No. 7 needs to be ported into the IP environment (that is, carried by IP-based networks), which, as mentioned before, is one activity carried out by the IETF sigtran working group.

SS No. 7 was standardized (based on the common channel experience) in order to satisfy the need for a common signaling interface both within and across national borders. Another benefit of standardization has been cost reduction through multivendor interoperability. As it happens, European and American SS No. 7 versions differ from one another, and even in the United States implementations in some interexchange carriers’ networks differ from those of local exchange carriers. Nevertheless, signaling across networks works very well, at least as far as the provision of the basic call is concerned.

The overall objective of SS No. 7 is to provide a reliable means of information transfer in support of call control; remote control; and operations, management, and administration. ITU also specifies SS No. 7 measurements and performance requirements. When printed double-sided and stacked one page on top of another, the SS No. 7 ITU-T Recommendations Series (Q.7xx—starting with Recommendation Q.700) is about a yard high. These Recommendations define all aspects of a four-layer protocol stack as depicted in Figure 1.

Figure 1: The SS No. 7 stack.

There are two types of SS No. 7 application users:

  1. Applications that use service-related (but not circuit-related) transactions between the switches and network databases.

  2. Switching applications that rely on the exchange of circuit-related information in order to set up, test, maintain, and tear down trunks.
Applications of the first type (for example, Intelligent Network) use the Transaction Capabilities (TC) Application Part protocol; applications of the second type use either the Telephone User Part (TUP) protocol or ISDN User Part (ISUP), which has been the SS No. 7 response to the ISDN. Both TC and ISUP rely on the Signalling Connection Control Part (SCCP) protocol, which in turn runs on top of the Message Transfer Part (MTP), the lowest layer in the SS No. 7 stack. ISUP also uses MTP directly; TUP uses only MTP.

The rest of this section sheds a bit more light on MTP, SCCP, TC, and ISUP, respectively.

Message Transfer Part (MTP)

As Figure 1 demonstrates, MTP has three levels. The first two levels correspond to the physical and data link layers of the OSI model, respectively. The third level (Level 3) performs certain functions (such as routing and data delivery) of the OSI network and transport layers. In addition, MTP is responsible for the network management functions associated with the control of routing tables and other network configuration data.

Physically, MTP can be implemented in the endpoints (that is, switches, network databases, or operation and maintenance centers) or signaling transfer points (STPs), or both. The messages exchanged between any pair of these elements may traverse more than one intermediate STP. MTP does not guarantee in-sequence arrival and otherwise provides an unreliable connectionless transport mechanism to its users.

Each endpoint and each STP are identified by a unique point code, which is an exact equivalent of an IP address in the Internet. The MTP routing label contains three parts: the originating point code (OPC), destination point code (DPC), and signaling link selection (SLS). (The SLS field is used, among other things, for load sharing among STPs.) MTP can assign a specific hard link value [called signaling link code (SLC)] to an SLS, which is always done for MTP management information messages.
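As an illustration of the three-part routing label, the sketch below packs and unpacks it as a 32-bit value. The field widths (14-bit point codes and a 4-bit SLS) are those of the ITU-T international format and are an assumption here, since the text above does not give them; national variants use other widths. The function names and the choice to hold the label as a Python integer, with the DPC in the low-order bits, are also assumptions of the sketch.

```python
POINT_CODE_BITS = 14   # assumed: ITU-T international point code width
SLS_BITS = 4           # assumed: ITU-T signaling link selection width


def pack_routing_label(dpc: int, opc: int, sls: int) -> int:
    """Combine DPC, OPC, and SLS into one 32-bit routing label."""
    if not 0 <= dpc < (1 << POINT_CODE_BITS):
        raise ValueError("DPC out of range")
    if not 0 <= opc < (1 << POINT_CODE_BITS):
        raise ValueError("OPC out of range")
    if not 0 <= sls < (1 << SLS_BITS):
        raise ValueError("SLS out of range")
    return dpc | (opc << POINT_CODE_BITS) | (sls << (2 * POINT_CODE_BITS))


def unpack_routing_label(label: int):
    """Split a 32-bit routing label back into (dpc, opc, sls)."""
    mask = (1 << POINT_CODE_BITS) - 1
    dpc = label & mask
    opc = (label >> POINT_CODE_BITS) & mask
    sls = (label >> (2 * POINT_CODE_BITS)) & ((1 << SLS_BITS) - 1)
    return dpc, opc, sls
```

With 4 bits of SLS, traffic between one pair of point codes can be spread over at most 16 selections, which is what makes the field usable for load sharing.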

In order to recognize the user (for example, SCCP or ISUP) of the incoming message, MTP employs a combination of the service indicator (identifying the user) and a 2-bit-long network indicator (which, in combination with OPC and DPC, determines whether national or international signaling is involved) fields.
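The two fields travel together in a single service information octet. The sketch below decodes it under the ITU-T layout, which is not spelled out in the text above and is therefore an assumption of the example: the service indicator occupies the four low-order bits and the network indicator the two high-order bits. Only the three service indicator values relevant to this section are listed, and the function name is hypothetical.

```python
# Service indicator values for the MTP users discussed in this section.
SERVICE_INDICATORS = {3: "SCCP", 4: "TUP", 5: "ISUP"}


def decode_service_information_octet(octet: int):
    """Return (user, scope) for a service information octet."""
    service_indicator = octet & 0x0F          # low 4 bits: which MTP user
    network_indicator = (octet >> 6) & 0x03   # high 2 bits: 2-bit indicator
    user = SERVICE_INDICATORS.get(
        service_indicator, "other ({})".format(service_indicator)
    )
    # The high bit of the network indicator separates national (1x)
    # from international (0x) signaling.
    scope = "national" if network_indicator & 0x02 else "international"
    return user, scope
```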

Note that MTP procedures include congestion control, of which MTP informs its users.

Signaling Connection Control Part (SCCP)

SCCP, described in Recommendations Q.711 through Q.716, provides both connectionless and connection-oriented transport services. The services are grouped into the following four classes (enumerated from 0):

  0. Basic connectionless class, which does not guarantee in-sequence delivery.

  1. In-sequence-delivery connectionless class.

  2. Basic connection-oriented class.

  3. Flow-control connection-oriented class.
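Because the four classes are numbered from 0, they map naturally onto an enumeration. The sketch below is merely one way to capture the list in code; the type and property names are hypothetical.

```python
from enum import IntEnum


class SccpClass(IntEnum):
    """The four SCCP protocol classes, numbered as in the list above."""
    BASIC_CONNECTIONLESS = 0
    IN_SEQUENCE_CONNECTIONLESS = 1
    BASIC_CONNECTION_ORIENTED = 2
    FLOW_CONTROL_CONNECTION_ORIENTED = 3

    @property
    def connection_oriented(self) -> bool:
        return self.value >= 2

    @property
    def in_sequence(self) -> bool:
        # Only class 0 gives no in-sequence guarantee.
        return self.value != 0


connectionless = [c for c in SccpClass if not c.connection_oriented]
```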

SCCP peers address each other by the DPC–global title–subsystem number (SSN) triplet. Global title is defined in Recommendation Q.700 as a set of “dialled digits or another form of address that will not be recognized in the SS No. 7 network.” The subsystem number identifies a user part or a TC application entity.

When the connection-oriented service classes are used, SCCP provides a reliable transport mechanism. During connection establishment, certain routing functions are provided by SCCP as well (in addition to those provided by MTP), as noted in Recommendation Q.711.

Transaction Capabilities (TC) Application Part

TC is most typically used as a protocol between a switch and a network database, but it can also be used between two network databases. The primary user of TC is Intelligent Network, but there are other users (such as mobile service applications, administration of closed user groups, and transaction-oriented operations and maintenance applications). The main feature of all these applications is best defined through an affirmation—they are transaction-oriented—and a negation—they are not circuit-related.
The term transaction-oriented simply means that TC is designed to support the request/response type of communications, although in reality the protocol has evolved to support a dialogue of multiple requests and responses issued by either side of the communications link. Figure 2 depicts the place of TC in SS No. 7 and its structure.

Figure 2: Transaction capabilities (TC) application part.

First of all, TC has two sublayers: the component sublayer (CSL) and transaction sublayer (TSL). We don’t concentrate on the TSL, but we should describe the CSL, which provides the actual interface to the TC user. The CSL is responsible for:

  1. Associating the user’s requests (whose nature is discussed in a later section) with the responses.

  2. Handling all abnormalities.
In a nutshell, the CSL provides the appearance of a (remote) procedure call to the user. In other words, CSL operations can be viewed (and implemented) by a programmer as procedure calls. To this end, TC is partially aligned with the capabilities of ITU-T Recommendations X.219 and X.229, developed jointly by ITU-T and the ISO. The CSL messaging unit is called a component. The user initiating a transaction issues an INVOKE component, which contains the operation code and arguments of the operation. Recommendation Q.771 defines four classes of operations in respect to the expected response:

  1. Both success and failure are reported.

  2. Only failure is reported.

  3. Only success is reported.

  4. Neither success nor failure is reported.
The response can also carry a refusal to perform the requested operation. What is most interesting is that the responder may, in turn, send its own INVOKE component before returning a response, by means of linking the components into a dialogue. (A classical example of such a dialogue is the freephone service. When a switch asks the service control point to translate a number, the latter requests that the switch connect to the device that plays an announcement and collects digits. After the service control gets what it needs, it finally sends the response back to the switch.) The response components are:

  • RETURN RESULT NOT LAST. Contains the list of parameters defining the result, and also indicates that other RETURN components are going to be issued (thus, the response may be segmented).

  • RETURN RESULT LAST. Contains the list of parameters defining the result.

  • REJECT. Contains the problem code (for example, malformed component).

  • ERROR. Contains the error code (the error being the result of performing the operation).
TC is based only on the connectionless capabilities (that is, classes 0 and 1) of SCCP and uses the same addressing mechanism as SCCP.
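The way the component sublayer associates responses with requests, including a segmented result, can be sketched as below. This is an illustration under stated assumptions rather than the behavior mandated by the Recommendations: components are represented as plain dictionaries, the class and field names are hypothetical, and timers, operation classes, and linked invocations are left out.

```python
class ComponentSublayer:
    """Correlates INVOKE components with the responses to them (sketch)."""

    def __init__(self):
        self._next_invoke_id = 0
        self._pending = {}   # invoke ID -> result parameters received so far

    def invoke(self, operation, arguments):
        """Build an INVOKE component and remember that a response is due."""
        invoke_id = self._next_invoke_id
        self._next_invoke_id += 1
        self._pending[invoke_id] = []
        return {"type": "INVOKE", "invoke_id": invoke_id,
                "operation": operation, "arguments": arguments}

    def receive(self, component):
        """Process a response; return the full result once it is complete."""
        invoke_id = component["invoke_id"]
        if invoke_id not in self._pending:
            raise KeyError("no outstanding invocation {}".format(invoke_id))
        kind = component["type"]
        if kind == "RETURN RESULT NOT LAST":
            # A segment of the result; more RETURN components will follow.
            self._pending[invoke_id].extend(component["parameters"])
            return None
        if kind == "RETURN RESULT LAST":
            return self._pending.pop(invoke_id) + list(component["parameters"])
        if kind in ("ERROR", "REJECT"):
            del self._pending[invoke_id]
            raise RuntimeError("{}: {}".format(kind, component["code"]))
        raise ValueError("unexpected component type {!r}".format(kind))
```

From the TC user's point of view, `invoke` followed by the value eventually returned from `receive` behaves much like a remote procedure call whose result may arrive in pieces.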

ISDN User Part (ISUP)

As mentioned, SS No. 7 was adapted to interworking with Q.931, and so became the system of choice for the ISDN network support for call management. Call management is in turn realized through switch-to-switch signaling. ISUP is the protocol used for switch-to-switch signaling.

ISUP entities address each other with the MTP addressing scheme, augmented by circuit identification, which refers to a specific trunk.

The ISUP call model views three call phases:

  1. Call setup.

  2. Conversation (includes pure data exchange).

  3. Call teardown.
Accordingly, the ISUP messages are used to establish, maintain, and terminate different phases of a call. In addition, calls originating from ISDN terminals may be supplied with more detailed call progress information.
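The three phases can be pictured as a small state machine driven by ISUP messages. The sketch below uses five well-known ISUP messages that the text above does not name: Initial Address Message (IAM), Address Complete Message (ACM), Answer Message (ANM), Release (REL), and Release Complete (RLC). The transition table is a deliberately simplified assumption made for illustration and covers only the ordinary successful call and its release.

```python
# (current phase, received message) -> next phase
TRANSITIONS = {
    ("idle", "IAM"): "setup",              # call setup begins
    ("setup", "ACM"): "setup",             # address complete, still setting up
    ("setup", "ANM"): "conversation",      # called party answered
    ("setup", "REL"): "teardown",          # call abandoned or refused
    ("conversation", "REL"): "teardown",   # either side releases
    ("teardown", "RLC"): "idle",           # circuit is free again
}


def run_call(messages, phase="idle"):
    """Apply a sequence of ISUP message names and return the final phase."""
    for message in messages:
        try:
            phase = TRANSITIONS[(phase, message)]
        except KeyError:
            raise ValueError(
                "message {} not expected in phase {}".format(message, phase)
            )
    return phase
```

For example, `run_call(["IAM", "ACM", "ANM"])` ends in the conversation phase, and appending `"REL"` and `"RLC"` brings the circuit back to idle.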

ISUP employs two signaling methods: link-by-link, which passes messages through all intermediate exchanges (where they can be modified, hot-potato-style), and end-to-end, in which messages are exchanged directly between the ISUP endpoints (for example, local exchanges or international gateways). The end-to-end method typically employs either the connectionless or connection-oriented services of the SCCP; employing the latter makes things much simpler.

Unfortunately, there are too many ISUP messages to describe here. (Just the minimum internationally recognized set contains 29!) Instead, we list the basic message categories, followed by a call setup example:

  • Forward setup. The messages in this category are involved in setting up a call with particular characteristics in the direction toward the called party.

  • Backward setup. The messages in this category complete the call establishment (when it is possible) in the direction from the exchange containing the called party toward the calling party. Accounting and charging procedures belong in this category.

  • General setup. The messages in this category carry additional call-related information needed to set up a call.

  • Call supervision. The messages in this category are notifications of events like the call being answered, the circuit being released, or the need for an international operator intervention.

  • Circuit supervision. The messages in this category are all kinds of notifications of the events related to circuits allocated for a call.

  • Circuit group supervision. The messages (primarily of request/response type) in this category relate to circuit groups rather than individual circuits, and are used for network management purposes (for example, as preventive measures—such as call blocking on the indicated trunk groups, or circuit group status queries).

  • In-call modification. The messages in this group support modification of the existing call characteristics (for example, a change from a voice call to a data call) or invoking a particular medium (facility).

  • End-to-end. The messages in this group include user-to-user signaling independent of call control messages.
