Providing QoS Mapping Rule using Data mining

IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.12, December 2007


Providing QoS Mapping Rule using Data mining techniques

Laila Fetjah, Abderrahim Sekkaki
Department of Mathematics and Computer Science, University Hassan II, Ain Chock, Faculty of Sciences, P.O Box 5366, Mâarif – Casablanca, Morocco.

Summary
Video on demand (VoD) has evolved over the last decades. This application manipulates huge amounts of data and therefore requires QoS support. One of the main challenges in QoS management is the heterogeneity of quality information coming from the different layers of a distributed system, such as the user, service, system or resource layer. This leads us to provide QoS mapping between layers. Many QoS architectures use a static QoS mapping, generally based on predefined tables that enable some transformations between layers. In this paper, we define a dynamic method that enables QoS mapping between user requirements and the system’s offer. Specifically, we present a new approach based on data mining techniques that extracts the actual system offer and classifies user requirements, which makes it possible to create a personalized QoS mapping rule according to both the user requirements and the system’s offer.

Key words: VoD, QoS mapping, QoS monitoring, Data mining.

1. Introduction

Nowadays, we are witnessing the emergence of new applications that are revolutionizing the way people work together and communicate. These are advanced Internet applications, such as news on demand or video on demand, which offer new opportunities and ways of communicating. Typically, these applications involve a rich set of interactive media using high-quality data, and offer interactive access to, and real-time manipulation of, large amounts of distributed data transmitted over an inefficient network such as the Internet. Moreover, these applications require huge amounts of bandwidth and are highly sensitive to any loss of data. This makes QoS (Quality of Service) provision necessary. QoS management is a general concept that covers all the techniques guaranteeing that a distributed system offers the QoS level required by users or applications. One of the main issues in QoS management is the heterogeneity of quality information coming from the different layers of a distributed system, such as the user, service, system or resource layer. In [4] we proposed an approach based on modeling and model management in order to make this information homogeneous.

Manuscript received December 5, 2007. Manuscript revised December 20, 2007.

The quality information is first specified by the user in a subjective and informal way, according to his perception of the service, for example "the sound must be clear and neat". This qualitative specification must then be transformed into an application-level quality specification, for example Pulse Code Modulation, which defines an audio coding of 64 kbps, or Adaptive Delta Modulation 6, which defines an audio coding of 56 kbps. This in turn allows the resource quality to be defined, for example a throughput of 64 kbps. This chain of transformations is what we call QoS mapping. All the QoS architectures presented in [4] use a static QoS mapping, generally based on predefined tables that enable some transformations between layers. Our goal in this paper is to define a dynamic method that enables QoS mapping between user requirements and the system’s offer. Specifically, we present a new approach based on data mining techniques and knowledge management that extracts the actual system offer and classifies user requirements, which makes it possible to create a personalized QoS mapping rule according to both the user requirements and the system offer.

The paper is organized as follows. In Section 2 we present the video on demand application. In Section 3 we explain the QoS mapping activity performed in QoS management. In Section 4 we describe the QoS monitoring tools used for collecting QoS information about systems, and we present the metrics used to evaluate video streaming. In Section 5 we introduce the data mining techniques used for producing dynamic QoS mapping rules. Finally, Section 6 concludes the paper and presents some future work.

2. Video on Demand

Video on Demand covers all services in which the end user can select and watch video content over a network such as the Internet. The first commercial VoD service was introduced by the Hong Kong Telephone Company in the early 1990s. However, the service was not a success, mainly because of high prices, a complicated user interface, and the difficulty of getting the public to grasp the concept of pay-per-view.


However, with the increasing success of the Internet and the growing capacity offered to end users through fibre, Gigabit Ethernet, xDSL and other technologies, the VoD market has evolved dramatically and now offers huge revenue to ISPs. Today, VoD over IP is offered by a large number of service providers. Each ISP promises cheap, on-demand video content with an easier user interface. The end user can browse or search for available content and view information about the video currently playing. When satisfied, he may enter full-screen mode for the best possible experience. Typically, a VoD architecture consists of three major parts: the client, the network and the server. The client side represents all the end users requesting the service, in this case the video. The server could be a computer streaming media to a client computer on request through some sort of network. The server must also control the storage system and the traffic characteristics in order to optimize its performance.

Figure 1: Data Flow of a VoD Server [12]

Figure 1 shows a possible logical architecture of a VoD server. A client who wants to use the VoD service must set up a connection and send a request. The request is handled by the Admission Control Unit and, depending on the user’s privileges, is granted or not. When the customer requests media content, the Storage Control unit checks whether the content is available in the (local) file system. If not, a request is made to the Storage Subsystem (which could be distributed), and the content is loaded into internal server memory. The streaming is then initiated by the server at the bit rate specified by the Service Level Agreement (SLA) between the customer and the service provider [12].

VoD applications integrate several media such as sound, images and video sequences. It is therefore necessary to supply these applications with system management mechanisms that offer QoS support, application adaptation and system scalability [12]. For example, a VoD application needs a certain bandwidth so that the transmitted images are correct. System scalability is the capacity of the system to evolve according to the load it encounters. Application adaptation is the capacity of the system to change its behaviour according to variations in the processing environment.

QoS management is a general concept that covers all the techniques guaranteeing that a distributed system offers the QoS level required by users or applications. Generally, a user specifies his or her requirements, which concern system performance, and the system has to deliver the specified level by transforming the user specifications into constraints aimed at the transport layer [10]. Most QoS-related research focuses on resource allocation; little of it addresses content adaptation or the quality perceived by the user. New approaches have to put the user first by taking into account user-perceived QoS characteristics, which must then be transformed into constraints targeted at the actors of the distributed multimedia system. This is Quality-Driven Delivery (QDD) [6], where the objective is to supply services by considering the quality level specified by the user. QDD makes it possible to offer and support levels of service adapted to users’ requirements by letting them specify non-functional requirements. QDD covers QoS activities such as specification, monitoring and mapping [6]. Quality information is involved in these activities and generally comes from different sources, which makes it heterogeneous in its definition, representation and manipulation.

3. QoS Mapping

Most current distributed systems are based on layered architectures comprising, for example, user, service, system and resource layers [14], and each layer has its own specific data. QoS mapping enables the translation of data between these layers. For example, a QoS specification at the user layer often includes subjective or qualitative information such as good, bad, or image quality. Lower layers use objective or quantitative information such as bandwidth, frame rate, packet rate or buffer capacity. The QoS manager is then responsible for achieving the QoS level corresponding to the user requirements. Thus, a mapping should be provided to transform QoS information between the different layers.

Specifically, for VoD applications at the video application layer, each video packet is characterized by its loss and delay properties, which contribute to the end-to-end video quality and service. These video packets are then classified and optimally mapped to the link transmission characteristics under the rate constraint. The video application, represented by the user layer QoS, and the link transmission characteristics, represented by the resource layer QoS, are allowed to interact with each other. The objective is to find the QoS contract that provides the video service desired by the end users with the available transmission resources.

The mapping activity is essential for making QoS decisions. In [10], the authors identify two types of mapping, vertical and horizontal, used respectively for transforming information between layers and for exchanging information between services of the same layer. In our architecture presented in [4], these two categories have to be implemented by the mapping between quality information models. The mapping activity uses mapping rules, which we classify into two categories:
· Mathematical rules: based on mathematical formulas relating the QoS dimensions of the different quality models.
· Experimentation rules: built by the QoS Manager or the user using experimental tests.

QoS dimensions like packet rate, inter-arrival jitter and end-to-end delay can be expressed in terms of QoS dimensions of the Network quality model, like delay, jitter and packet loss, using the mathematical rules provided by [13]. Unfortunately, we do not have a sufficient number of mathematical rules to perform all the mappings in a system. This leads us to use experimentation rules as an alternative. Our goal is to construct dynamic QoS mapping rules generated automatically by the system according to the context of the application and its specific characteristics, the online user QoS requirements and the actual system offer. The main idea is that our system will be able to generate customised and dynamic QoS mapping rules.
This will be possible by collecting and exploiting QoS information from all layers, using QoS monitoring and data mining techniques, in order to build a QoS mapping rule for these layers. We propose to realize this type of QoS mapping using a three-stage process:
· Statistics collection,
· Classification and clustering,
· Mapping rule creation.
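As a point of comparison for the dynamic rules targeted here, the static, table-based mapping used by earlier architectures can be sketched as two lookup tables chained together. This is only an illustration: the codec bit rates are the ones quoted in the introduction, while the user-level labels, the codec abbreviations and the function name `static_map` are hypothetical.

```python
# Static, table-based QoS mapping (illustrative sketch).
# Bit rates are those quoted in the introduction; the user-level labels
# and all identifiers below are hypothetical.

# User layer (qualitative requirement) -> application layer (audio codec).
USER_TO_APPLICATION = {
    "clear sound": "PCM",
    "acceptable sound": "ADM",
}

# Application layer (codec) -> resource layer (required throughput in kbps).
APPLICATION_TO_RESOURCE_KBPS = {
    "PCM": 64,
    "ADM": 56,
}


def static_map(user_requirement: str) -> int:
    """Translate a qualitative requirement into a throughput in kbps."""
    codec = USER_TO_APPLICATION[user_requirement]
    return APPLICATION_TO_RESOURCE_KBPS[codec]


print(static_map("clear sound"))  # 64
```

Because both tables are fixed in advance, such a mapping cannot take the measured system offer into account, which is the limitation the dynamic approach is meant to address.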

4. QoS Monitoring

4.1 QoS Monitoring tools

QoS monitoring is one of the most important activities in QoS management. It provides visibility into the system performance.


Measurement methods can be classified in different ways. The first classification distinguishes direct from indirect measurements [13]. Indirect measurement methods rely on network models and take full account of the mechanisms used throughout the network architecture. Direct measurement methods rely on direct traffic observation at several points within the architecture, without regard to any model or expected behaviour.

The second classification distinguishes passive from active measurement methods. Passive measurement methods collect information without disturbing network operation or interfering with operational network traffic; examples are SNMP-based network management tools, tcpdump [7] and NetFlow [2]. Active measurement methods inject measurement traffic into the network and therefore interfere with operational traffic; examples are NIMI [9], Surveyor [5] and AMP [8].

The third classification distinguishes hardware monitoring tools, software monitoring tools and protocol-specific monitoring tools:
· Hardware tools, which can be classified as follows:
  o Dedicated measurement devices used by the network operations centre. They provide QoS information at the network level.
  o Embedded measurement tools integrated into network equipment such as routers and switches. They focus on measuring network performance.
· Software tools, which can be classified as follows:
  o Application tools integrated into end-user applications. They provide QoS information at the application level.
  o Database tools, which provide QoS information about the DBMS used by the VoD server. They fall into two categories:
    - Integrated tools provided by the DBMS vendor (Oracle or SQL Server…),
    - A set of standard benchmark tests such as TPC-B or TPC-C, which measure OLTP performance on various hardware and software configurations. The Transaction Processing Performance Council (TPC) defines transaction processing and database benchmarks independent of the DBMS used.
  o System tools used by the operating system to provide QoS information at the system level.
· Protocol-specific tools such as SNMP, RTCP, ICMP and others. These protocols are independent of hardware devices, systems, databases and applications. They provide QoS information at many levels, such as the network, application, system or database level.

Generally, monitoring tools store a huge volume of statistical data about components in log files. These data always come from different levels and therefore have to be analysed by users. Furthermore, in many cases the user has to extract the information that matches his requirements. In the next subsection we describe the most important QoS metrics used in a Video on Demand application.
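One possible way to make measurements from different levels comparable is to normalise them into a common record before analysis. The sketch below assumes a deliberately simple record schema (`layer`, `metric`, `value`); the schema, the metric names and the sample values are hypothetical and are not taken from any particular monitoring tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSRecord:
    """One measurement reported by a monitoring tool (hypothetical schema)."""
    layer: str    # e.g. "network", "system", "database", "application"
    metric: str   # e.g. "delay_ms", "cpu_load"
    value: float


def group_by_layer(records):
    """Index measurements as {layer: {metric: [values]}} for later analysis."""
    warehouse = defaultdict(lambda: defaultdict(list))
    for rec in records:
        warehouse[rec.layer][rec.metric].append(rec.value)
    return warehouse


# Hypothetical measurements coming from two different levels.
records = [
    QoSRecord("network", "delay_ms", 40.0),
    QoSRecord("network", "delay_ms", 60.0),
    QoSRecord("system", "cpu_load", 0.7),
]
warehouse = group_by_layer(records)
delays = warehouse["network"]["delay_ms"]
mean_delay = sum(delays) / len(delays)
print(mean_delay)  # 50.0
```

Grouping by layer first keeps the later steps simple: each layer's metrics can be analysed separately without knowing which tool produced them.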

4.2 QoS metrics

Quality metrics describe qualitative or quantitative information related to the quality levels of the delivery service actors or to the level specified by users. The qualitative dimensions concern the level of quality from the user’s perspective, whereas the quantitative dimensions are measurable quality levels representing the real capacities of the service provider. Our architecture is based on the quality information model described in [6], which is composed of the user quality model and the actor quality model. The user quality model describes the dimensions used to specify the desired quality level. The architecture distinguishes the qualitative quality model, which contains the qualitative dimensions, from the quantitative quality model, which groups together the quantitative quality dimensions. The Actor Quality Model groups the quantitative quality dimensions along which a quality level is described. The architecture distinguishes the Media Quality Model, which contains the dimensions that represent the quality level of an object to be delivered (image, video sequences, audio, binary data…), from the Resource Quality Model, which describes the quality level offered by a system component (communication network, database, operating system, storage equipment, video server, etc.). Table 1 presents an example of the most important metrics at each level for a VoD application. The collected information constitutes a QoS data warehouse that can be analysed using data mining techniques in order to extract QoS information. In this paper we propose an architecture that collects statistical data using heterogeneous monitoring tools in order to analyse them and to provide a dynamic QoS mapping rule.
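As an example of a quantitative metric, PSNR (listed in Table 1) is commonly defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between a reference signal and its degraded version and MAX is the largest possible sample value. The sketch below applies this standard definition to two flat sequences of 8-bit pixel values; the function name and the choice of plain Python lists are assumptions made for illustration.

```python
import math


def psnr(reference, degraded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(degraded) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean squared error between the reference and the degraded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)
```

A higher value indicates a degraded signal closer to the reference; identical signals yield an infinite PSNR.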

5. Mining QoS Information

Many successful approaches have been developed in the field of data mining [3]. The main goal of these techniques is to extract knowledge automatically from sampled and preprocessed data. The extracted knowledge is then used to derive rules and patterns. These approaches are based on different interpretations of learning and exploit different principles to extract knowledge from data.

Table 1: QoS metrics for a VoD application

User Quality Model
  Qualitative Quality Model: excellent, very good, good, fair, bad
  Quantitative Quality Model: MOS (mean opinion score), SNR (signal-to-noise ratio), PSNR (peak signal-to-noise ratio), loss tolerance, frame rate, MPQM (moving pictures quality metric)
Actor Quality Model
  Media Quality Model: image resolution, bit rates, sample sizes
  Resource Quality Model
    Network Quality Model: throughput, delay, jitter, reliability
    User device Quality Model: delay loss
    Encoding Quality Model: bit rates, supported resolution, supported rate control strategies, quality compression
    System Quality Model: CPU, process, hard disk capacity
    Video Server Quality Model
      DBMS Quality Model: RdbmsSrvInfoLogicalReads, RdbmsSrvInfoDiskRead
      Device Quality Model: image/s, delay, loss rate
      Memory Quality Model: capacity, caching

These techniques can be summarized as follows:
· Classification techniques support the categorization of elements within a data set into predefined classes.
· Association rule learning concerns the discovery of co-occurring data elements, including the uncovering of causal relationships.
· Clustering techniques partition a data set into subsets so that the elements within each subset are similar.
· Multidimensional scaling techniques are used to detect meaningful underlying dimensions in a high-dimensional data set.

The main differences between them relate to the way the learning process is implemented and to the way the extracted knowledge is represented. We propose to use classification and clustering techniques to extract and group QoS monitoring information and to create a QoS mapping rule from this information. Figure 2 presents our QoS data mining architecture. This architecture emphasizes the use of data mining concepts and techniques for uncovering interesting data patterns hidden in a large data warehouse. Our architecture is composed of three steps. In the first step, a data warehouse is constructed using QoS monitoring agents acting in the three parts of the VoD application, namely the client, the server and the network. In the second step, a QoS Analyzer separates user data from system data. Finally, in the third step, we use data mining methods to generate a QoS mapping rule.

[Figure 2 diagram: Client, Network and Server, each with an Agent; QoS Monitoring Data Warehouse; QoS Data Mining, comprising the QoS Analyser, User requirements Classification and System offer Clustering; QoS Mapping Rule]

Figure 2: QoS Data mining Architecture

5.1 User requirement classification

The main idea in our architecture is to compare the user’s QoS requirements with the capabilities of the user’s device. First of all, we monitor the client side in order to make sure that there is no contradiction between the capabilities of the user’s system and the desired level of service. Moreover, we need to classify user requirements into a limited number of classes in order to simplify their comparison. This classification makes it possible to provide users with the most appropriate service. The goal of classification is to form groups of objects so that:
· the objects in the same group are as similar as possible;
· the groups are as different as possible.

Generally, users are not able to describe their requirements in terms of quantitative metrics such as loss rate, resolution or response time. With our classification, any user can easily specify his requirements using five classes: bad, fair, good, very good and excellent. After predicting the maximal service level that the user’s capabilities allow, we build a decision tree. Decision tree algorithms work top-down, seeking at each node of the tree the attribute that best separates the classes. The shortest tree is assumed to be the best and is always preferred. In our decision tree, the leaves represent the user class of the service and the nodes represent the user configuration, such as Internet connection, disk storage, CPU frequency, memory size, device resolution and bit rates, which can be extracted using monitoring tools.
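The top-down construction described above can be sketched as follows. The paper does not name a specific decision tree algorithm, so this sketch assumes an ID3-style criterion (information gain on binary threshold splits); the attribute names and the training data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (gain, attribute, threshold) of the most informative binary split."""
    base = entropy(labels)
    best = (0.0, None, None)
    for attr in rows[0]:
        for threshold in sorted({row[attr] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[attr] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[attr] > threshold]
            if not left or not right:
                continue
            remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            if base - remainder > best[0]:
                best = (base - remainder, attr, threshold)
    return best


def build_tree(rows, labels):
    """Grow the tree top-down; leaves hold a user class, nodes test one attribute."""
    _gain, attr, threshold = best_split(rows, labels)
    if attr is None:  # pure node or no useful split left: make a leaf
        return Counter(labels).most_common(1)[0][0]
    low = [(r, l) for r, l in zip(rows, labels) if r[attr] <= threshold]
    high = [(r, l) for r, l in zip(rows, labels) if r[attr] > threshold]
    return {
        "attr": attr,
        "threshold": threshold,
        "low": build_tree([r for r, _ in low], [l for _, l in low]),
        "high": build_tree([r for r, _ in high], [l for _, l in high]),
    }


def classify(tree, row):
    """Walk from the root to a leaf and return the predicted user class."""
    while isinstance(tree, dict):
        tree = tree["low"] if row[tree["attr"]] <= tree["threshold"] else tree["high"]
    return tree


# Hypothetical monitored client configurations and their maximal service class.
clients = [
    {"bandwidth_kbps": 256, "cpu_mhz": 800},
    {"bandwidth_kbps": 512, "cpu_mhz": 1000},
    {"bandwidth_kbps": 2048, "cpu_mhz": 1200},
    {"bandwidth_kbps": 4096, "cpu_mhz": 2400},
    {"bandwidth_kbps": 8192, "cpu_mhz": 3000},
]
classes = ["bad", "fair", "good", "very good", "excellent"]
tree = build_tree(clients, classes)
print(classify(tree, {"bandwidth_kbps": 3000, "cpu_mhz": 2000}))  # very good
```

With real monitoring data, several configurations would fall into each class and the tree would only test the attributes that actually discriminate between classes.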

5.2 System offer clustering

QoS dimensions generally belong to a definition domain. Our goal is to divide this domain into several classes in order to establish a correspondence with the user requirement classes described above. This division can be done using clustering methods. Clustering is a set of methodologies for the automatic classification of samples into a number of groups using a measure of association, so that the samples in one group are similar and samples belonging to different groups are not. Clustering algorithms can be classified as follows:
· Exclusive clustering
· Overlapping clustering
· Hierarchical clustering
· Probabilistic clustering

In exclusive clustering algorithms, data are grouped in an exclusive way: if a data item belongs to a given cluster, it cannot be included in another one. Overlapping clustering algorithms use fuzzy sets to cluster data, so that a data item can belong to two or more clusters. Hierarchical clustering algorithms are based on the union of the two nearest clusters. Finally, the last kind of clustering uses a probabilistic approach. We have examined four of the most widely used clustering algorithms:
· K-means
· Fuzzy C-means
· Hierarchical clustering
· Mixture of Gaussians

Each of these algorithms belongs to one of the clustering types listed above: K-means is an exclusive clustering algorithm, Fuzzy C-means is an overlapping clustering algorithm, hierarchical clustering is, as its name indicates, a hierarchical algorithm, and Mixture of Gaussians is a probabilistic clustering algorithm. We propose to implement system offer clustering using K-means [11], one of the simplest unsupervised learning algorithms. The procedure offers a simple way to classify a given data set into a certain number of clusters (say k clusters) fixed beforehand.
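A minimal sketch of K-means applied to a single QoS dimension is given below. It assumes the standard Lloyd iteration (assign each value to its nearest centroid, then recompute the centroids) with a deterministic initialisation chosen here for reproducibility; the throughput samples and the use of k = 5 to mirror the five user classes are illustrative assumptions.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's K-means on scalar values; returns (centroids, assignments)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the k initial centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid (exclusive clustering).
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: the centroids no longer move
        centroids = updated
    assignments = [min(range(k), key=lambda i: abs(v - centroids[i])) for v in values]
    return centroids, assignments


# Hypothetical throughput measurements (kbps) collected from the system side.
throughput_kbps = [64, 70, 250, 260, 510, 520, 1000, 1040, 2000, 2100]
centroids, assignments = kmeans_1d(throughput_kbps, k=5)

# Rank clusters by centroid so that the lowest offer corresponds to "bad".
names = ["bad", "fair", "good", "very good", "excellent"]
rank = {c: r for r, c in enumerate(sorted(range(5), key=lambda i: centroids[i]))}
offer_classes = [names[rank[a]] for a in assignments]
print(centroids)  # [67.0, 255.0, 515.0, 1020.0, 2050.0]
```

Fixing k to the number of user classes is what allows each cluster of the system offer to be put in correspondence with one user requirement class.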



5.3 Generating QoS mapping rule

Once we have, on the one hand, predicted the main classes to which a system offer belongs, which describe the QoS levels that can be provided to the user, and, on the other hand, generated the user QoS requirement classes according to these levels, a QoS mapping rule can be generated. Specifically, we are able to establish a correspondence between the QoS system offer and the QoS user requirements. The system offer and the user QoS requirements are presented in two tables generated by the classification and clustering processes. Each table groups information by class: excellent, very good, good, fair, bad. We then correlate the two tables by a mapping link. Table 2 presents an example of a correlation that could be made between a set of QoS user requirements and the system offer.


User Class    MOS value    Resolution    Frame rate
excellent     >4.3         740x480       27-30
V. good       4 – 4.3      480x360       25-27
Good          3.6 – 4      360x240
Fair          3.1 – 3.6    160x120

Bad

76
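The correlation between the two tables can be sketched as a lookup that first places a user requirement in a class and then follows the mapping link to the system offer class with the same label. The MOS thresholds below are the ones legible in the table above; treating "bad" as the fallback class, assigning interval boundaries to the lower class, and the throughput values of the system offer are assumptions made for illustration.

```python
# MOS thresholds taken from the legible rows of the table: (class, lower MOS bound).
# Assumption: a MOS equal to a bound is assigned to the lower class.
USER_CLASSES = [
    ("excellent", 4.3),
    ("very good", 4.0),
    ("good", 3.6),
    ("fair", 3.1),
]

# Hypothetical result of the system offer clustering: class -> mean throughput (kbps).
SYSTEM_OFFER = {
    "bad": 67,
    "fair": 255,
    "good": 515,
    "very good": 1020,
    "excellent": 2050,
}


def user_class(mos: float) -> str:
    """Return the user class whose MOS interval contains `mos`."""
    for name, lower_bound in USER_CLASSES:
        if mos > lower_bound:
            return name
    return "bad"  # assumed fallback: the "bad" row is incomplete in the source


def mapping_rule(mos: float) -> dict:
    """Link a user requirement class to the system offer class with the same label."""
    name = user_class(mos)
    return {"user_class": name, "offer_kbps": SYSTEM_OFFER[name]}


print(mapping_rule(3.8))  # {'user_class': 'good', 'offer_kbps': 515}
```

In a dynamic setting both tables would be regenerated from fresh monitoring data, so the same lookup would yield different offers as the system load changes.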