ON THE HEAVY-TAILED DISTRIBUTION OF THE SCENE DURATION IN VBR VIDEO

E. Casilari, A. Reyes, A. Díaz-Estrella and F. Sandoval
Dpto. Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071 Málaga (Spain)
Telephone: +34-95-2132755; Fax: +34-95-2131447; E-mail: [email protected]

INDEXING TERMS
VBR video traffic, scene-oriented model, heavy-tailed distribution, Hill's estimate, Long-Range Dependence.

ABSTRACT
In this letter we propose a classification of the scene detection techniques commonly used in Variable Bit Rate (VBR) video traffic modelling. Using real video traces, we show that scene duration follows heavy-tailed distributions. This heavy-tailed nature is shown to be independent of the technique and the threshold used for scene detection. This invariant property of video sequences offers a physical explanation for the existence of Long Range Dependence (LRD) in video traffic.

INTRODUCTION

Due to the increasing importance of video services, Variable Bit Rate (VBR) video modelling has become a key issue in multimedia traffic engineering. Accurate video traffic models are needed to solve the still open problems raised by multimedia traffic management, such as policing, shaping or call admission control. VBR traffic is determined by the nature of the audiovisual sequences being transmitted. Many services (such as video on demand or broadcast TV) impose a special variability on video traffic because of their intrinsic evolution through scenes of different complexity and degree of motion. Moreover, it has been argued that Long Range Dependence (LRD), that is, the long term variability existing in VBR video traffic, could be caused by the existence of scenes [1]. This LRD, which can seriously impact network performance [2], can be modelled in two ways: using fractal processes (such as fractional Gaussian noises or fractionally integrated ARMA filters) or, alternatively, adopting a scene-oriented modelling strategy. While fractal processes approximate reality in a behaviourist way ("black box" modelling), including parameters which do not have an evident physical meaning (such as the Hurst parameter), scene-oriented models offer a structural approximation ("white box" modelling) which directly imitates the underlying mechanisms of traffic generation. Scene-oriented models consider the existence of a higher level (the scene) which modulates the long term traffic flow. A necessary step to define this scene level is to divide the real VBR video traces into different scenes. A proper fit of the statistical distribution of the scene duration is therefore fundamental to fully characterise the impact of scenes on traffic. In this letter we propose a classification of the scene detection criteria used in the modelling literature. Considering all these criteria and utilising a wide set of traces with different compression schemes, we show that infinite variance (or Noah effect) is an invariant property of the scene duration in video services.

SCENE DETECTION TECHNIQUES

The following classification summarises the most common techniques to determine scene changes in real VBR video traces:

- Visual detection: according to this non-analytical technique, scenes are detected visually by means of a thorough observation of the real uncompressed video sequence. This solution obviously presents several drawbacks: visual series are often not available, long human monitoring (frame by frame) is required, and scene limits are not always visually clear (because of effects such as camera panning, zooms or fading). Moreover, there is not always a correspondence between the real scenes and the changes in the bit rate. For instance, two visually different scenes can generate the same traffic if they have the same degree of motion and image complexity.

- High pass filtering [3]: as scene changes usually provoke sudden rate changes in the traffic flow, a first or second order high pass filter can be enough to detect them. According to this technique, a scene change occurs in the i-th interval if:

\[
X[i] - X[i-1] > X_T \qquad (1)
\]

where X[i] represents the traffic generated during the i-th interval (normally the frame period) and X_T is a threshold.

- A combination of low and high pass filtering [1]: this is a variant of the previous technique. In order to avoid noise, the difference between adjacent samples is compared with the mean value of the last W samples, computed with an averaging filter, in the form:

\[
\frac{X[i] - X[i-1]}{\frac{1}{W}\sum_{j=i-W}^{i-1} X[j]} > X_T \qquad (2)
\]

- Low pass filtering and scene classification [4]: in this case, after averaging the traffic series, each sample of the signal is classified into one of N different types of scene, depending on the averaged traffic of the W adjacent samples.

- Clustering of the state space [5]: this method divides the state space (X[n], X[n+1]) into N clusters representing N types of scenes. The division is performed by minimising the distances to N centroids within the state space. An iterative search algorithm is required to optimise the position of the N centroids.

ESTIMATION OF THE HEAVY-TAILED NATURE OF THE SCENE DURATION

A probability distribution function F_X(x) of a random variable X[n] is said to be heavy-tailed if it exhibits a hyperbolic decay of the form G_X(x) = Pr(X[n] > x) = 1 - F_X(x) = G_o · x^(-α) when x → ∞, where G_X(x) represents the complementary distribution function and G_o is a constant. It can be proved that for α equal to or lower than 2 the distribution has infinite variance (Noah effect), which is related to an extreme variability of X[n]. There are two common methods to determine α from a series X[n]:

- Plotting G_X(x) on a log-log scale results, for a heavy-tailed distribution, in an approximately straight line for large values of x, with a slope of -α. Thus, a common way to estimate α is to perform a least squares regression on this representation.

- A statistically more rigorous method is known as Hill's estimate [6]:

\[
\hat{\alpha}(k) \propto \left[ \frac{1}{k} \sum_{i=1}^{k-1} \log\!\left( \frac{X[N-i]}{X[N-k]} \right) \right]^{-1} \qquad (3)
\]

where X[N-i] is the i-th largest element of the N samples. As the index k increases, the series α̂(k) stabilises at values close to α. The presence of the Noah effect is strongly related to the existence of LRD. In [6] it is proved that traffic sources whose activity periods (scenes, in the case of video) follow heavy-tailed distributions (e.g. Pareto) generate LRD or self-similar traffic, that is, traffic whose variability is not limited to a certain time scale.

To analyse the nature of the scene duration distribution and the influence of choosing a particular detection technique, we consider five long real VBR video traces compressed under different schemes (interframe and intraframe): 1) a series called "Wurzburg", consisting of several 30-minute sequences of different video signals (TV programmes, news, films, sports and cartoons), compressed with an MPEG encoder at the University of Wurzburg (Germany); 2) the film "E.T. the Extraterrestrial", compressed under Motion JPEG (intraframe); 3) a series called "MTV", also compressed under M-JPEG, including several hours of a TV channel; 4) and 5) the film "Star Wars" with MPEG and M-JPEG compression, respectively.

Figure 1 shows the evolution of Hill's estimate when it is applied to the scene durations of these traces. In all cases, a combination of high and low pass filtering was used to detect scene changes. The figure also plots the estimate for an exponentially distributed random series of 10000 samples. In contrast to the exponential series, for all the real video series the estimator converges to values under 2, reflecting the existence of infinite variance. These results are confirmed in Table I, where the regression method is applied to calculate α. Figure 2 depicts Hill's estimate when it is applied to the scenes of "Wurzburg" obtained with different methods and thresholds (the thresholds are normalised by the mean value). In particular, we consider low pass filtering (with W=100 and N=3 types of scenes), high pass filtering (with threshold XT=0.5), a clustering method (with N=2 types of scenes) and four combinations of high and low pass filters. In three of these combinations the window W is fixed to 100 and three thresholds are contemplated (XT=0.5, 0.8 and 1); in the fourth, XT is 0.5 and W is changed to 300. Although the estimates diverge among the different methods, it can again be observed that α tends to stabilise at values lower than 2, independently of the threshold or of the window size of the averaging filter.
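To make the procedure concrete, the following Python sketch shows how the combined low/high pass detection rule of equation (2) and Hill's estimate of equation (3) could be applied to a frame-size trace. It is only an illustrative sketch under stated assumptions: the trace is a synthetic placeholder (the real traces are not reproduced here), the function names (detect_scene_changes, hill_estimate) and the scene model are our own, and the absolute value of the frame-to-frame difference is taken so that both rate increases and decreases trigger a detection. The parameter values XT=0.5 and W=100 follow the setting mentioned above.

```python
import numpy as np

def detect_scene_changes(x, x_t=0.5, w=100):
    """Combined low/high pass detection (equation (2)): frame i starts a new
    scene when |x[i] - x[i-1]| exceeds x_t times the mean of the previous
    w samples."""
    changes = []
    for i in range(w, len(x)):
        window_mean = np.mean(x[i - w:i])
        if window_mean > 0 and abs(x[i] - x[i - 1]) / window_mean > x_t:
            changes.append(i)
    return np.asarray(changes)

def hill_estimate(sample, k):
    """Hill's estimate of the tail index alpha (equation (3)) from the
    k largest elements of the sample."""
    d = np.sort(np.asarray(sample, dtype=float))   # ascending order
    n = len(d)
    # X[N-i] / X[N-k] for i = 1, ..., k-1 (i-th largest over k-th largest)
    ratios = d[n - k + 1:] / d[n - k]
    return k / np.sum(np.log(ratios))

# Synthetic placeholder trace: heavy-tailed scene lengths and per-scene
# activity levels, only to exercise the two functions above.
rng = np.random.default_rng(0)
scene_lengths = (rng.pareto(1.5, 1000) * 50 + 10).astype(int)   # frames per scene
scene_levels = rng.pareto(1.2, 1000) + 1.0                      # mean rate per scene
trace = np.concatenate([lvl * np.abs(rng.normal(1.0, 0.1, n))
                        for lvl, n in zip(scene_levels, scene_lengths)])

changes = detect_scene_changes(trace, x_t=0.5, w=100)
durations = np.diff(changes)          # scene durations in frames
for k in (50, 100, 200):
    print(f"k = {k:3d}  alpha(k) = {hill_estimate(durations, k):.2f}")
```

As in Figure 1 and Figure 2, the estimate is inspected over a range of k values rather than at a single point, since it only stabilises once enough tail samples are included.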

CONCLUSIONS

In this letter we have shown that the scene duration of VBR video is intrinsically heavy-tailed, exhibiting infinite variance. This property has been detected independently of the nature of the audiovisual signal (film or TV) and of the criterion used to detect scene changes. This phenomenon provides a physical explanation for the existence of LRD or self-similarity in video traffic, and establishes an invariant property that VBR traffic models should take into account.

ACKNOWLEDGEMENTS

This work has been partially supported by the Spanish Comisión Interministerial de Ciencia y Tecnología (CICYT), Project No. TIC96-0743.

REFERENCES

[1] Jelenkovic, P.R., Lazar, A.A., and Semret, N.: "The Effect of Multiple Time Scales and Subexponentiality in MPEG Video Streams on Queueing Behavior", IEEE Journal on Selected Areas in Communications, Vol. 15, No. 6, August 1997, pp. 1052-1071.
[2] Huang, C., Devetsikiotis, M., Lambadaris, I., and Kaye, A.R.: "Self-Similar Traffic and Its Implications for ATM Network Design", Proc. of ICCT'96, Beijing, China, May 1996.
[3] Melamed, B., and Pendarakis, D.: "A TES-Based Model for Compressed "Star Wars" Video", Proceedings of the Communications Theory Mini-Conference at GLOBECOM'94, San Francisco, California, USA, November 1994, pp. 70-81.
[4] Casilari, E., Lorente, M., Reyes, A., Díaz-Estrella, A., and Sandoval, F.: "Scene Oriented Model for VBR Video", IEE Electronics Letters, Vol. 34, No. 2, January 1998, pp. 166-168.
[5] Chandra, K., and Reibman, A.R.: "Modeling One and Two-Layer Variable Bit Rate Video", Research report, AT&T Laboratories, New Jersey, USA, 1997.
[6] Willinger, W., Taqqu, M.S., Sherman, R., and Wilson, D.V.: "Self-Similarity Through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level", Proceedings of ACM SIGCOMM'95, Cambridge, Massachusetts, USA, August 1995, pp. 100-113.

FIGURE CAPTIONS:

Figure 1. Hill's estimate of α for the scene duration of different video traces.
Figure 2. Hill's estimate of α for the scene duration of "Wurzburg" using different scene detection techniques.

[Figure 1. Hill's estimate α(k) versus k (0 ≤ k ≤ 200) for the scene durations of the Wurzburg (MPEG), E.T. (M-JPEG), MTV (intraframe), Star Wars (M-JPEG) and Star Wars (MPEG) traces and for an exponentially distributed series; the finite-variance and infinite-variance regions are indicated.]

[Figure 2. Hill's estimate α(k) versus k (0 ≤ k ≤ 200) for the scene durations of "Wurzburg" obtained with low pass filtering (N=3, W=100), high pass filtering (XT=0.5), clustering (N=2) and combined low-high pass filtering with (XT=0.5, W=100), (XT=0.5, W=300), (XT=0.8, W=100) and (XT=1, W=100); the finite-variance and infinite-variance regions are indicated.]

Table I. Estimation of α using the regression method

  Video trace                        α
  Wurzburg                           1.1153
  E.T.                               1.1588
  MTV                                0.8411
  Star Wars (M-JPEG)                 1.6914
  Star Wars (MPEG)                   1.1360
  Exponentially distributed series   2.7161
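For completeness, the following Python sketch illustrates the least-squares regression estimate of α used for Table I: the empirical complementary distribution G_X(x) is taken on a log-log scale and the slope of its tail is fitted by linear regression. The data are a synthetic Pareto sample, and the function name (regression_alpha) and the tail fraction used for the fit are illustrative assumptions, not values taken from the letter.

```python
import numpy as np

def regression_alpha(sample, tail_fraction=0.1):
    """Least-squares estimate of the tail index alpha: fit a straight line
    to log G_X(x) versus log x over the largest samples, where
    G_X(x) = Pr(X > x) is the empirical complementary distribution."""
    d = np.sort(np.asarray(sample, dtype=float))   # ascending order
    n = len(d)
    g = (n - np.arange(n)) / n                     # empirical G_X at each order statistic
    start = int(n * (1.0 - tail_fraction))         # keep only the tail region
    slope, _ = np.polyfit(np.log(d[start:]), np.log(g[start:]), 1)
    return -slope                                  # the fitted slope is -alpha

# Illustration on a synthetic Pareto sample with true alpha = 1.2
rng = np.random.default_rng(1)
sample = rng.pareto(1.2, 10_000) + 1.0             # Pr(X > x) = x**-1.2 for x >= 1
print(f"Regression estimate of alpha: {regression_alpha(sample):.2f}")
```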

