
A Novel Feedback Controlled Multimedia Transmission Scheme

Gabriel-Miro Muntean and Liam Murphy
Department of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
[email protected], [email protected]

ABSTRACT

The number of multimedia transmissions over the existing network infrastructure is continually increasing. This leads to longer periods of network congestion, which affect these transmissions and hence their play-out. We propose a novel feedback controlled multimedia transmission scheme in order to ensure continuous stream delivery and play-out, even in the case of network congestion. Data transmission and the exchange of control information are done via a double-channel (TCP and UDP) link. A special protocol (Client Initiated Protocol) has been defined to provide the transmission mechanism with a reduced overhead. We describe both its components, the Client Initiated Streaming Protocol (CISP) used for control and the Client Initiated Transport Protocol (CITP) used for data transmissions. We also present the feedback scheme and describe the server's possible state transitions. A multicast approach is explored and its advantages and disadvantages outlined. We present some experimental results to show the functionality of our scheme.

I. INTRODUCTION

With the current explosion in both the number of computer users and the number of computers with Internet access, communication via the Internet is continually growing. Large amounts of data are being transferred between remotely situated computers. Multimedia transmissions, especially of a real-time nature, are becoming very popular. In general, the very large file sizes associated with multimedia transmissions cause problems for the network, and therefore multimedia data is compressed before being transmitted. MPEG [1, 2] is a widely-used compression standard because it offers both good compression ratios and reduced vulnerability to transmission errors.
Another problem for the network is the continuous nature of multimedia streams, which have strict timing constraints in order to allow the receiver to decode and play out all the streamed data [3]. Given the large resource requirements of multimedia transmissions, network congestion can occur. This leads to large packet delays, data losses, or both, which affect the timing requirements and lower the transmission quality. Many proposals have been made to support transmitting time-sensitive applications over existing IP networks, which by default treat all transmitted packets equally in so-called "best-effort" service.

One approach is to enhance the IP network with mechanisms such as resource reservation (e.g. Resource reSerVation Protocol (RSVP) [4]), admission control [5] and special scheduling algorithms [6]. Implementing such mechanisms for millions of Internet users and for the increasingly large number of real-time applications running over the Internet is a difficult problem. Besides, multimedia streams’ requirements are becoming more complex (e.g. better quality, interactivity), making these mechanisms even more difficult to implement.

Thus there is growing interest in another approach: real-time adjustment of the bandwidth used by applications [7, 8]. The main advantage of this approach is that it works with the network as-is, adapting time-sensitive applications to the network’s dynamically changing conditions.

The novel transmission scheme we propose uses one MPEG coding algorithm feature: adjustment of the transmitted bit-rate, which consequently modifies the quality of the transmission. It takes advantage of the fact that viewers of a remotely transmitted multimedia stream can tolerate a certain degree of quality reduction, and prefer this to interruptions in the transmission while buffering [9]. The idea is to vary, in real-time, the encoding rate of the video stream (which is the main component of the traffic, relative to audio or text components), modifying the transmission quality and thus the quantity of data to be transmitted. The client, by analysing the statistics of the data received, is in the best position to realise that the network is becoming congested. By continuously transmitting feedback reports to the sender, the client initiates the adjustment process. The resulting adjustments have to be done on-the-fly by the server after the analysis of the received feedback control information. For the multicasting case, an arbitration scheme has to be implemented.
Different criteria have to be taken into account for deciding which is the best measure to be taken by the server in the case of divergent control data received from its clients. Also the Client Initiated Protocol had to be extended to accommodate both the multicasting approach and the implementation of the feedback scheme. A feedback controlled client-server system which implements the proposed transmission scheme is described in the following section. We then describe the double-channel communication scheme, the Client Initiated Protocol, and the Feedback scheme we have implemented. We show some experimental results, and then conclude with some suggestions for further work.

II. SYSTEM OVERVIEW

The system built in order to implement and test the proposed feedback controlled multimedia scheme consists of server and client applications. They communicate using a double channel, TCP and UDP (described in the next section). Both server and client applications have been built in Visual C++ 6.0 using a multithreading object-oriented approach. They rely on Windows communication (WinSock 2), events and messaging systems.

[Figure 1: block diagram of the server. Labels: Audio/Video Acquirer, MPEG Encoder, Server Feedback Manager, Tx Shaper, Connection Manager, Control, Data, Network.]

Fig. 1 The structure of the server application

The server application (Fig. 1) has five main components: the Capture Unit (Audio/Video Acquirer), the MPEG Encoder, the Connection Manager, the Traffic Shaper, and the Feedback Manager. The Capture Unit and the MPEG Encoder are in charge of acquiring video and audio and encoding them as MPEG. This can be done either in real-time, in which case the data is sent directly to the transmission buffer, or prior to transmission, in which case the data has to be saved to, and later retrieved from, the hard drive. The Connection Manager listens for client requests to establish TCP connections and, later on, UDP unicast or multicast channels. It also has to maintain and control existing communications. The Feedback Manager takes decisions according to the feedback control information received from the clients, in order to adjust the quality of the transmission. This is done with the help of the MPEG Encoder and the Traffic Shaper. The latter, controlled by the Feedback Manager, adjusts some of the transmission parameters, e.g. the frame size or the sender transmission frequency. It implements a modified token bucket mechanism, as described in [10].
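The modifications to the token bucket are described in [10] and are not reproduced here. As a point of reference only, the following minimal sketch shows a plain, textbook token bucket whose rate can be changed at run time, which is the kind of hook a Feedback Manager could use. The class name, method names and parameters are illustrative assumptions, not the authors' implementation.

```python
class TokenBucket:
    """Plain token bucket (not the modified version of [10]).

    Tokens are counted in bytes: they accrue at `rate` bytes per second
    up to `capacity` bytes, and a packet may be sent only if enough
    tokens are available.
    """

    def __init__(self, rate, capacity, now=0.0):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.last = float(now)

    def _refill(self, now):
        # Add the tokens earned since the last update, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_send(self, nbytes, now):
        """Return True and consume tokens if `nbytes` may be sent at time `now`."""
        self._refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

    def set_rate(self, rate, now):
        """Change the fill rate, e.g. after a feedback decision (hypothetical hook)."""
        self._refill(now)  # account for time elapsed at the old rate first
        self.rate = float(rate)
```

In this sketch the caller supplies the clock value explicitly, so the shaper's behaviour can be checked without waiting on real time.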

[Figure 2: block diagram of the client. Labels: MPEG Decoder, Synchro Unit, Audio/Video Player, Client Feedback Indication, Connection Manager, Control, Data, Network.]

Fig. 2 The structure of the client application

The client application (Fig. 2) consists of the Connection Manager, the MPEG Decoder, the Multimedia Playing Unit (Audio/Video Player) and the Feedback Indication Unit. The client’s Connection Manager has the same role as in the server: to establish and control client-server communications. The MPEG Decoder transforms the received data into a playable/displayable format for the Multimedia Playing Unit. The latter is in charge of video stream display, audio stream play, and their synchronisation. The Feedback Indication Unit continuously gathers statistical data and periodically sends it to the server's Feedback Manager, as described in detail in Section V. More information and a detailed description of the implemented system can be found in [11] and [12].

III. DOUBLE-CHANNEL (TCP AND UDP) COMMUNICATION

Client-server communication in our system takes place over a double channel. A client contacts the server by specifying the IP address and port number of the server. If a threshold number of current clients has not been reached, the server accepts the connection and a reliable TCP control channel is created. During its setup, the server and client exchange the information necessary to create UDP sockets on both sides, allowing data to be sent across them later on. Unlike the TCP connection, which permits control messages to be sent in both directions, UDP communication is a unidirectional data flow from the server to the client (Fig. 3).

[Figure 3: the Server Application connected to Unicast Client 1 through Unicast Client N. Legend: TCP bi-directional control channel, UDP unidirectional data stream.]

Fig. 3 Unicast: the server and each client communicate via a double-channel: TCP and UDP

After completion of the setup process, the client can choose one of the streams the server is offering for transmission. A real-time or a pre-recorded multimedia transfer follows. During the transmission, feedback control messages are repeatedly sent over the TCP channel from client to server, informing it about the reception status. The server's Feedback Manager processes this information and, if necessary, adjusts some of the transmission parameters.

In the multicasting case (Fig. 4), after the TCP connection has been established, multicast UDP sockets are created both at the server (one for all the clients) and at the clients (one for each of them). The group IP address and the group port number for multicasting are sent to the clients in order to allow them to join the server's multicast transmission group.

[Figure 4: the Server Application connected to Multicast Client 1, Multicast Client 2 through Multicast Client N. Legend: TCP bi-directional control channel, UDP unidirectional data stream.]

Fig. 4 Multicast: the communication between the server and the clients

One of the advantages of such a double-protocol double-channel approach is that it combines the advantages of a reliable TCP connection (which guarantees the in-order arrival of unduplicated data) with those of a connectionless UDP channel (fast delivery and multicast capabilities). Thus the system can rely on the control messages sent via the TCP connection in all phases of the session: setup, data request, data transmission control, and shutdown. Data delivery is done via the UDP channel. To ensure and control end-to-end transmission of multimedia data, a Client Initiated Protocol (CIP) was built. CIP is described in the following section.
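To make the double-channel idea concrete, here is a minimal single-process sketch over the loopback interface: a TCP connection carries the control exchange, during which the client announces the UDP port it listens on, and the server then sends one datagram to that port. The function names, the text encoding of the messages and the use of a port number as the GET_UDPSOCK parameter are assumptions made for illustration; the paper does not specify the message format.

```python
import socket


def _recv_line(sock):
    """Read from a TCP socket until a newline arrives; return the stripped text."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.decode("ascii").strip()


def demo_double_channel(payload=b"frame-0"):
    """Run one control exchange over TCP and one data transfer over UDP."""
    # Server side: listen for the TCP control connection on a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # Client side: open the control channel and a UDP socket for incoming data.
    ctrl_client = socket.create_connection(listener.getsockname(), timeout=2.0)
    data_client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    data_client.bind(("127.0.0.1", 0))
    data_client.settimeout(2.0)

    ctrl_server, _ = listener.accept()
    ctrl_server.settimeout(2.0)
    data_server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Control channel: the client announces its UDP port (assumed format).
        udp_port = data_client.getsockname()[1]
        ctrl_client.sendall(f"GET_UDPSOCK {udp_port}\n".encode("ascii"))

        # The server parses the request and acknowledges it over TCP.
        method, port_text = _recv_line(ctrl_server).split()
        ctrl_server.sendall(b"PUT_UDPSOCK\n")
        reply = _recv_line(ctrl_client)

        # Data channel: one-way UDP datagram from server to client.
        data_server.sendto(payload, ("127.0.0.1", int(port_text)))
        received, _ = data_client.recvfrom(65535)
        return method, reply, received
    finally:
        for s in (listener, ctrl_client, ctrl_server, data_client, data_server):
            s.close()
```

Calling `demo_double_channel(b"abc")` returns the request method seen by the server, the reply seen by the client, and the datagram payload received on the UDP socket.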

IV. THE CLIENT INITIATED PROTOCOL

The Client Initiated Protocol (CIP) uses features of the Real-time Transport Protocol (RTP), including some of those offered by the RTP Control Protocol (RTCP) [13] and the Real Time Streaming Protocol (RTSP) [14].

The reason we decided to implement a new protocol, rather than just use an existing one, is that our CIP combines some of the advantages of existing protocols. It has a low complexity and reduces the control information to a minimum. CIP has two main components which work in conjunction: the Client Initiated Streaming Protocol (CISP) and the Client Initiated Transport Protocol (CITP). A complete CISP session consists of a SETUP procedure, one or more calls of the PLAY, PAUSE and STOP methods, and a SHUTDOWN procedure.

[Figure 5: message sequence between Client and Server. Labels: GET_CONNECTED, PUT_CONNECTED or PUT_REJECTED (Setup TCP Connection); GET_UDPSOCK, PUT_UDPSOCK (Create UDP Socks); GET_DATABASELIST, PUT_DATABASELIST (List Available Streams/Files).]

Fig. 5 CIP unicast SETUP procedure

[Figure 6: message sequence between Client and Server. Labels: GET_CONNECTED, PUT_CONNECTED or PUT_REJECTED (Setup TCP Connection); GET_MCASTSOCK, PUT_MCASTSOCK (Create Multicast UDP Socks).]

Fig. 6 CIP multicast SETUP procedure

The SETUP procedure allows the client to set up a transport mechanism for both a continuous media stream and its control information in the unicast case (Fig. 5), as well as in a multicast session (Fig. 6). Later on the client can start the stream’s transmission using PLAY, pause it from time to time using PAUSE, or stop it using STOP (Fig. 7). In the multicasting version, some data may be lost while a client is paused. When the client wants to end the session, it initiates the SHUTDOWN procedure (Fig. 9). The SHUTDOWN process can also be initiated by the server, or caused by either a client or a server timeout. The timeouts help to free used resources when communication between the server and the client is broken.

[Figure 7: message sequence between Client and Server. Labels: GET_PLAY, PUT_PLAY (Starts Tx Stream); GET_PAUSE, PUT_PAUSE (Pauses Tx Stream); or GET_STOP, PUT_STOP (Stop Tx Stream).]

Fig. 7 PLAY, PAUSE and STOP methods

The session includes feedback control messages repeatedly sent by the client to the server (Fig. 8). The methods used by the protocol in all phases of the session are the appropriate GET_XXX and PUT_XXX ones, followed by parameters if necessary.

[Figure 8: message sequence between Client and Server. Labels: PUT_CTRLDATA, PUT_CTRLDATA (Repeatedly Send Control Data).]

Fig. 8 CIP control messages

As in RTCP, CISP provides the capability to send and receive control messages. Only one type of control message is defined, which carries reports from the client to the server about the quality of the transmission. Unlike in RTCP, these reports do not consist of statistical data to be processed at the server, but instead carry a summary of the client’s performance analysis in the form of a “grade” which describes the reception quality. The reception quality is currently determined by the number of lost frames and the number of frames that arrived out of order or too late for play-out. The occupancies of the reception and decoder buffers are also taken into account.

[Figure 9: message sequence between Client and Server. Labels: SRV_SHUTDOWN or CLI_SHUTDOWN or SRV_TIMEOUT or CLI_TIMEOUT or SRV_TXENDED (Closing Session Or Connection Control Packets).]

Fig. 9 SHUTDOWN procedures

The CITP is in charge of data transmission from the server to the client. It relies on the UDP channel for data transmission, so some of the delivered packets may be out-of-order, and some transmitted packets may be lost or duplicated. A frame number is attached to every packet. A timestamp is added to allow real-time computation of one-way delay and jitter. The CITP packet has a shorter header than RTP and has the structure shown in Fig. 10.
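The paper states that the header is 6 bytes long and carries a frame number and a timestamp, but it does not give the field widths. The sketch below therefore assumes a 2-byte frame number followed by a 4-byte millisecond timestamp in network byte order; this split, and the function names, are assumptions chosen only to illustrate how such a compact header can be built and parsed.

```python
import struct

# Assumed layout (not specified in the paper): 16-bit frame number,
# 32-bit timestamp in milliseconds, network byte order, no padding.
CITP_HEADER = struct.Struct("!HI")


def pack_citp(frame_number, timestamp_ms, payload):
    """Prefix `payload` with the assumed 6-byte header; values wrap on overflow."""
    header = CITP_HEADER.pack(frame_number & 0xFFFF, timestamp_ms & 0xFFFFFFFF)
    return header + payload


def unpack_citp(packet):
    """Split a packet into (frame_number, timestamp_ms, payload)."""
    frame_number, timestamp_ms = CITP_HEADER.unpack_from(packet)
    return frame_number, timestamp_ms, packet[CITP_HEADER.size:]
```

For comparison, the fixed part of an RTP header is 12 bytes, so a header of this size halves the per-packet protocol overhead above UDP.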

The main purpose of our CIP implementation is to support the implementation of the feedback control mechanism, which is described in detail in the next section.

[Figure 10: packet layout. IP Header (20 or 40 bytes) | UDP Header (8 bytes) | CIP Header (6 bytes) | Payload (variable size)]

Fig. 10 CITP packet structure

V. THE FEEDBACK SCHEME

[Figure 11: client threads and buffers. Labels: Network, Rx Thread, RxCircBuf, Decoder Thread, DriverBuf, Play Thread, Feedback Indication Unit.]

Fig. 11 The client’s functional structure

The client’s functional structure (Fig. 11) includes a receiver thread with a higher priority, so that it can collect all the incoming packets, an MPEG decoder thread, and a player thread, as well as a unit for stream synchronisation. They share a receiver buffer and a driver buffer for each of the played streams. The Feedback Indication Unit gathers statistical data from the receiver buffer and from the driver buffers. This data includes the buffer occupancies and the number of lost or late packets. It is processed locally by a unit which weights each component and grades the overall transmission quality. The result is transmitted to the server in the feedback control messages. Five quality levels were defined to describe the transmission: GOOD, ABOVE_NORMAL, NORMAL, BELOW_NORMAL and BAD. Another five describe the receiver and decoder buffer occupancies: EMPTY, HALF_EMPTY, NORMAL, HALF_FULL and FULL.
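The weights and thresholds used by the Feedback Indication Unit are not given in the paper. The following sketch only illustrates the general shape of such a grader: it maps buffer occupancy to one of the five occupancy labels and combines a frame-arrival score with two buffer scores into one of the five quality levels. The 20% occupancy bands, the scoring functions and the default weights are all assumptions.

```python
QUALITY = ["BAD", "BELOW_NORMAL", "NORMAL", "ABOVE_NORMAL", "GOOD"]
OCCUPANCY = ["EMPTY", "HALF_EMPTY", "NORMAL", "HALF_FULL", "FULL"]


def _clamp_percent(percent):
    return max(0.0, min(100.0, float(percent)))


def occupancy_level(percent):
    """Map a buffer occupancy in percent to a label (assumed 20% bands)."""
    return OCCUPANCY[min(int(_clamp_percent(percent) // 20), 4)]


def arrival_score(lost, late, total):
    """Fraction of frames that arrived in time, in [0, 1]."""
    if total <= 0:
        return 1.0
    return max(0.0, 1.0 - (lost + late) / total)


def buffer_score(percent):
    """1.0 at 50% occupancy, falling linearly to 0.0 at 0% or 100% (assumed)."""
    return 1.0 - abs(_clamp_percent(percent) - 50.0) / 50.0


def overall_grade(lost, late, total, rx_percent, dec_percent,
                  weights=(0.5, 0.25, 0.25)):
    """Weighted grade from frame arrival and the two buffer occupancies."""
    score = (weights[0] * arrival_score(lost, late, total)
             + weights[1] * buffer_score(rx_percent)
             + weights[2] * buffer_score(dec_percent))
    return QUALITY[min(int(score * 5), 4)]
```

Only the resulting label needs to travel in the feedback control message, which is what keeps the control traffic small compared with sending raw statistics.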

[Figure 12: server threads and buffers. Labels: Feedback Manager, HDD, Capture, Encode & Copy Thread, TxCircBuf, Tx Thread, Network.]

Fig. 12 The server’s functional structure

The server (Fig. 12) can work in two ways: transmitting pre-stored MPEG streams, or capturing and transmitting real-time multimedia data. In the first case, only the Copy Thread is active during the transmission, while the Capture and the Encoder threads have to be used prior to the transmission. In the second mode, all three threads have to be active in parallel in order to supply data in time for a good real-time transmission.

The server has a number of defined states. Each server state is associated with a certain transmission quality. After the SETUP procedure, the server starts transmitting data when the client requests a particular stream. The server assumes that the network is not congested, so the transmission can be performed at the best rate and quality. During the transmission, the server receives control data from the client which grades the transmission quality from its point of view. Depending on the received feedback information, the server’s Feedback Manager takes decisions to adjust some parameters of the multimedia stream transmission and thus to change the server's state.

In the MPEG video encoding algorithm, the Discrete Cosine Transform (DCT) is used to convert the video data from the spatial domain into the frequency domain. In order to reduce the quantity of data, the encoder reduces the high-frequency spatial components of the image, since the human viewer is more sensitive to reconstruction errors in the low-frequency components [15]. The components are divided by both a "quantization scale factor" (common to all of them) and special "quantization coefficients", and the results are rounded to integers. This is the fundamental information-loss step in DCT-based coding. The larger the quantization scale factor and the quantization coefficients, the more data is discarded. The ability to modify the quantization scale factor, with a value from 1 to 31, makes it useful as a tuning parameter for bit-rate control.

For the real-time stream transmission case, if the receiver reports a bad transmission quality, the server's Feedback Manager can decide to increase the quantization scale factor, thus reducing the volume of data to be transmitted. The quality of the real-time video stream itself will degrade, but its continuity will be maintained. Other measures can be taken if the server continues to receive bad reception reports. Dropping colours (i.e. only black-and-white images are transmitted), further decreasing the quality of the stream by increasing the quantization scale factor, and tougher measures such as dropping some of the B and even P MPEG-type frames, could be used (Fig. 13).

[Figure 13: state diagram. State labels: Normal, Lower Q, Black/White, Drop B, Drop P.]

Fig. 13 The server's possible states

To avoid repeated transitions between the same states, the Feedback Manager changes from one state to another only if multiple reports from the client ask for it. In the multicasting case, multiple feedback reports are received from the concurrent clients. A simple arbitration scheme has been implemented in the Feedback Manager in order to make the adjustment decisions. If the server adapts the video quality to support receivers with low bandwidth, clients with high-bandwidth links will get a stream with lower quality than they can handle. If the server adapts the video quality to support the high-bandwidth receivers, clients with lower-bandwidth links will suffer serious quality degradation. A compromise is needed, and our arbitration scheme takes decisions favourable to the majority of existing clients.
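The two rules described above, acting only after several consistent reports and following the majority of clients, can be sketched as a small state machine. The linear ordering of the states, the "DOWN"/"UP"/"STAY" request vocabulary, the strict-majority rule and the default of three confirmations are assumptions for illustration; the paper does not specify the exact transitions of Fig. 13 or the arbitration details.

```python
from collections import Counter

# Assumed ordering from best quality to most aggressive reduction.
STATES = ["NORMAL", "LOWER_Q", "BLACK_WHITE", "DROP_B", "DROP_P"]


def majority_vote(requests):
    """Return the request made by more than half of the clients, else 'STAY'."""
    if not requests:
        return "STAY"
    choice, count = Counter(requests).most_common(1)[0]
    return choice if count * 2 > len(requests) else "STAY"


class FeedbackManager:
    """Changes state only after `confirmations` consecutive identical decisions."""

    def __init__(self, confirmations=3):
        self.index = 0
        self.confirmations = confirmations
        self.pending = None
        self.streak = 0

    @property
    def state(self):
        return STATES[self.index]

    def on_reports(self, requests):
        """Process one round of client requests ('DOWN', 'UP' or 'STAY')."""
        decision = majority_vote(requests)
        if decision == "STAY":
            self.pending, self.streak = None, 0
            return self.state
        if decision == self.pending:
            self.streak += 1
        else:
            self.pending, self.streak = decision, 1
        if self.streak >= self.confirmations:
            step = 1 if decision == "DOWN" else -1
            self.index = max(0, min(len(STATES) - 1, self.index + step))
            self.pending, self.streak = None, 0
        return self.state
```

In the unicast case the same class applies with a single-element request list, so the majority rule reduces to following the one client.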

VI. EXPERIMENTAL RESULTS

For computing one-way packet delays, the sender and destination computers need closely synchronized clocks. Our experiments deal with delays of the order of milliseconds, so we can use the Network Time Protocol (NTP) [16] to synchronize both the server's and the client's clocks, by connecting to the Atomic Clock time server in Boulder, Colorado (USA) and adjusting both computers' clocks to match its value.
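With synchronized clocks, the one-way delay of a frame is simply its arrival time minus the timestamp carried in its packet. The paper does not define how jitter is computed; the sketch below assumes it is the signed difference between the delays of consecutive frames. The function names are illustrative.

```python
def one_way_delays(send_ms, recv_ms):
    """Per-frame delay: receive time minus the timestamp set by the sender."""
    return [received - sent for sent, received in zip(send_ms, recv_ms)]


def jitter(delays):
    """Signed change in delay between consecutive frames (assumed definition)."""
    return [later - earlier for earlier, later in zip(delays, delays[1:])]
```

For example, frames sent at 0, 40 and 80 ms and received at 800, 845 and 878 ms have delays of 800, 805 and 798 ms, and a jitter sequence of 5 and -7 ms under this definition.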

[Figure 14: plot titled "One-way Frame Delays Over LAN". Axis labels: Frames, Delay (msec). Series: Congested LAN, Normal LAN.]

Fig. 14 One-way frame delays on a LAN

In Fig. 14 we plot the one-way delays for a 9-second MPEG system stream (1.6 Mbytes) sent over a LAN, measured in two cases: a normally-loaded network and a congested one, using a traffic shaping scheme as mentioned in Section II.

[Figure 15: plot titled "One-way Jitter On A LAN". Axis labels: Frames, Jitter (msec). Series: Congested LAN, Normal LAN.]

Fig. 15 One-way jitter on a LAN

In both cases, the transmission did not experience high delays and the jitter remained small (Fig. 15). The delays and the jitter varied, but not significantly, across experiments performed within a short period of time, during which the network traffic could be considered constant. Thus we consider the results meaningful, and they show the stability of our solution. Even in congested network conditions, the jitter did not increase significantly, so receiver buffering can absorb it.

[Figure 16: plot titled "Receiver and Decoder Buffer Occupancies". Axis labels: Feedback Control Messages, Occupancies (percent). Series: Receiver Buffer, Decoder Buffer.]

Fig. 16 Receiver and decoder buffer occupancies

In Fig. 16 we show the dynamics of the client's receiver and decoder buffer occupancies during the transmission of the MPEG stream. Based on these data, the client's Feedback Indication Unit grades both buffer occupancies separately, as described in Section V (Fig. 17).

[Figure 17: plot titled "Receiver and Decoder Buffer Feedback Grades". Axis labels: Feedback Control Messages, Grade Level. Series: Receiver Buffer, Decoder Buffer.]

Fig. 17 Client receiver and decoder buffer-based grade

A weighted overall grade is computed based on the buffer occupancy grading and the frame arrival based grading. The latter takes into account the number of late and lost packets (Fig. 18).

[Figure 18: plot titled "Feedback Grade According To The Frame Arrival". Labels: Frames, Lost Frames (percent), Feedback Grade.]

Fig. 18 Frame arrival based feedback grading

We also measured and plotted the quantity of data handled (sent and received) by the server during a multicast session with different numbers of clients joined to the multicasting group, and during multiple unicast sessions opened by the same number of clients (Fig. 19).

[Figure 19: chart titled "Server Handled Data Multicast vs. Unicast". Axis labels: Number of Clients, Size (MBytes). Series: Multiple unicast sessions, Multicast session.]

Fig. 19 Server handled data in multiple unicast sessions and one multicast session

Although the payload size of the transmitted packets and the frequency of the control packets may vary, consider a 5 Kbyte payload for the UDP packets, 4 bytes for the TCP control packets, and one control packet sent by the client for every five data packets received. By adding the size of the headers (20 bytes each for IP and TCP, 8 bytes for UDP and 6 bytes for CIP), we obtain a server load of 1.614 Mbytes for the transmission of the 1.6 Mbytes MPEG stream, in both cases, for a single connected client. (Taking 1 Kbyte as 1000 bytes, this corresponds to 320 data packets with 34 bytes of headers each and 64 control packets of 44 bytes each, about 13.7 Kbytes in addition to the stream itself.) The quantity of received data is the same in both cases and increases in proportion to the number of clients, whereas the quantity of data sent increases drastically in the case of multiple unicast sessions. The server's CPU also experiences an increased load as the number of clients grows, because for each unicast session the server application creates two new threads for handling the transmission and one high-priority thread for receiving the control packets. Thus the number of threads competing for the server's CPU time becomes significant and can overload the server if the number of concurrent clients grows large. This is not the case in a multicast session, where only two threads are created for the transmission to all the clients, apart from the ones created for reception.

VII. CONCLUSIONS AND FURTHER WORK

This paper proposes a novel feedback controlled transmission scheme in order to ensure multimedia stream transmission even in the case of network congestion. The scheme was implemented by a multithreading event-driven client-server system. A new double-channel approach is used for client-server communication: TCP for control and UDP for data. A new real-time streaming protocol (CIP) has been designed, which uses some features of RTP and RTSP but is simpler. Our proposed scheme involves the server changing its state, and therefore the transmission quality, in response to feedback information from the client(s). This is possible in a real-time capturing process by varying some of the encoding parameters. We are also studying the case of pre-recorded transmissions: to be able to change the server's state, multiple streams have to be prepared for transmission, each with a different quality. The idea is to switch between these streams at well-determined checkpoints so as not to cause flow disturbances.
The experimental results to date show that even in a congested network the packet delays were not significantly increased by our scheme, while the jitter remained small. The next step is to study how our feedback scheme affects the received stream quality. The results obtained from the multicast experiments show that multicast sessions offer an obvious advantage in server load. Unfortunately, this advantage is offset by reduced control over the quality of the transmitted stream, even with the feedback scheme enabled. This is because the arbitration scheme takes a decision which is favourable to the majority of the clients and does not respect each client's wishes individually, as in the unicast case. We are studying a compromise between the two approaches which takes into account the possibility of splitting the current multicasting transmission group into two or more groups in real-time, with different stream transmission qualities. The clients will be asked to join the newly created group that best matches their requirements.

REFERENCES

[1] ISO/IEC International Standard 11172, "MPEG-1 Coding of Moving Pictures & Associated Audio for Digital Storage Media up to 1.5 Mbits/s", Nov. 1993
[2] ISO/IEC International Standard 13818, "MPEG-2 Generic Coding of Moving Pictures and Associated Audio Information", November 1994
[3] Arturo A. Rodriguez, K. Morse: "Evaluating Video Codecs", IEEE Multimedia, Vol. 1 No. 3, Fall 1994
[4] Robert Braden, L. Zhang, S. Berson, S. Herzog, S. Jamin: RFC2205 "Resource ReSerVation Protocol (RSVP) - Version 1 Functional Specification", September 1997, http://www.ietf.org/rfc/rfc2205.txt
[5] Lee Breslau, S. Jamin, S. Shenker, "Comparison of Measurement - Based Admission Control Algorithms for Control Load Service", Proc. IEEE INFOCOM 2000, March 26-30, 2000, http://www.ieee-infocom.org/1997/papers/jamin.pdf
[6] Michael J. Hu, Tao Luo: "Adaptive Control Framework and Its Applications in Real-Time Multimedia Service on the Internet Architecture", IEICE Transactions on Communications, Vol. E82B, No. 7, July 1999, pp. 998 - 1008, http://search.ieice.or.jp/1999/pdf/e82-b_7_998.pdf
[7] Hui Zhang, S. Keshav: "Comparison in Rate-Based Service Disciplines", Proceedings of ACM SIGCOMM'1991, Zurich, Switzerland, Sept. 1991, http://redriver.cmcl.cs.cmu.edu/~hzhang-ftp/SIGCOM91.pdf
[8] Xin Wang, Henning Schulzrinne: "Comparison of Adaptive Internet Multimedia Applications", IEICE Transactions on Communications, Vol. E82 - B, No. 6, June 1999, pp. 806 - 818, http://search.ieice.or.jp/1999/pdf/e82-b_6_806.pdf
[9] George Ghinea, J. P. Thomas, "QoS Impact on User Perception and Understanding of Multimedia Video Clips", Proceedings of ACM Multimedia '98, Bristol, United Kingdom, 1998
[10] Gabriel-Miro Muntean, Liam Murphy, "Feedback Controlled Traffic Shaping for Multimedia Transmissions in a Real-Time Client-Server System", International Conference on Networking ICN'2001, July 9-13, 2001, Colmar, France
[11] Gabriel-Miro Muntean, Liam Murphy, "An Object Oriented Prototype System for Feedback Controlled Multimedia Networking", ISSC'2000, University College of Dublin, Ireland, June 29-30, 2000
[12] Gabriel-Miro Muntean, Liam Murphy, "Some Software Issues of a Real-Time Multimedia Networking System", CONTI'2000, Timisoara, Romania, October 12-13, 2000
[13] Henning Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RFC1889: "RTP: A Transport Protocol for Real-Time Applications", January 1996, http://www.ietf.org/rfc/rfc1889.txt
[14] H. Schulzrinne, A. Rao, R. Lanphier: RFC2326, "Real Time Streaming Protocol (RTSP)", April 1998, http://www.ietf.org/rfc/rfc2326.txt
[15] Joan L. Mitchell, W. B. Pennebaker, C. E. Fogg, D. J. LeGall: "MPEG Video Compression Standard", Chapman & Hall, New York, USA, 1996
[16] David L. Mills, "Network Time Protocol (Version 3) Specification, Implementation and Analysis", March 1992, http://www.ietf.org/rfc/rfc1305.txt