Utilizing Perceptual Image Quality Metrics for Link Adaptation Based on Region of Interest

Tubagus Maulana Kusuma (1,2), Manora Caldera (2), and Hans-Jürgen Zepernick (1)

(1) Blekinge Institute of Technology, Department of Signal Processing, SE–372 25 Ronneby, Sweden. E-mail: {maulana.kusuma, hans-jurgen.zepernick}@bth.se
(2) Western Australian Telecommunications Research Institute, Wireless Systems Laboratory, 39 Fairway, Nedlands, WA 6907, Australia. E-mail: {mkusuma, caldera}@watri.org.au

Abstract— An implicit link adaptation technique based on hybrid automatic repeat request (H-ARQ) and soft-combining is considered for transmission of Joint Photographic Experts Group 2000 (JPEG2000) images over wireless channels. Adaptation is carried out utilizing an objective perceptual image quality metric that takes into account the human perception. Retransmissions focus on the Region of Interest (ROI) part of the JPEG2000 image to efficiently utilize the bandwidth. Numerical results show that the combination of the proposed perceptual image quality metric with link adaptation provides robust link performance while meeting satisfactory quality constraints.

I. INTRODUCTION

Wireless imaging is expected to become an important application with the deployment of third-generation and later wireless communication networks [1]. Because the bandwidth available for mobile applications is limited, images are heavily compressed using formats such as JPEG2000 [2]. Moreover, the time-varying nature of the wireless channel, caused by multipath propagation and changing interference conditions, makes the channel very unreliable. Imaging services over wireless channels are therefore impaired not only by lossy compression but also by the burst errors associated with the channel. Various strategies, such as link adaptation techniques, have been used to combat these problems.

The existing link adaptation techniques for wireless communications use conventional measures such as signal-to-noise ratio (SNR), bit error rate (BER), frame error rate (FER), or combinations of them as quality indicators. However, for multimedia communications, it has been shown that these metrics do not necessarily correlate well with the quality perceived by the human end user [3], [4]. In this paper, we therefore consider link adaptation techniques for image transmission over wireless channels that use objective perceptual image quality metrics, which take the characteristics of human perception into account. Specifically, implicit link adaptation is carried out based on hybrid automatic repeat request (H-ARQ) in combination with a soft-combining algorithm [5].

0-7803-9206-X/05/$20.00 2005 IEEE

Region of interest (ROI) coding is one of the most useful features supported by the JPEG2000 standard. With it, the ROI, which can be selected as the most important region, is encoded with better quality than the rest of the image (the background). In this paper, retransmissions are carried out on the ROI part in order to use the bandwidth efficiently.

This paper is organized as follows. Section II presents a brief overview of objective perceptual image quality measures. The proposed retransmission scheme based on ROI and perceptual image quality metrics is detailed in Section III. Numerical results are presented in Section IV, and Section V concludes the paper.

II. OBJECTIVE PERCEPTUAL IMAGE QUALITY MEASURES

The mean square error (MSE) and the peak signal-to-noise ratio (PSNR) are among the most widely adopted objective metrics for measuring image quality. However, quality results based on these metrics are poorly correlated with subjective test results. Moreover, these measures are not suitable for in-service quality monitoring, as required in link adaptation techniques for wireless multimedia communications, because the reference image is not available at the receiver. Major emphasis in recent research is therefore given to measurement techniques that do not require a reference image and that correlate highly with subjective measurement results [6], [7]. However, quality measures obtained using no-reference systems may not be suitable for link adaptation schemes, as the benchmark used for retransmission termination may not always be reached. Therefore, a reduced-reference scheme that uses a hybrid image quality metric (HIQM) [8], which takes into account the artifacts related to human perception, is proposed for JPEG2000 and utilized in link adaptation.
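
As a point of comparison, the two full-reference measures named above can be written in a few lines. The sketch below is a generic textbook formulation, not code from this paper; the function names and the 8-bit peak value of 255 are assumptions made for illustration. Both functions take the original pixels as an argument, which is exactly what the receiver lacks in an in-service setting.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], received: Sequence[float]) -> float:
    """Mean square error between two equally sized, flattened pixel sequences."""
    if len(reference) != len(received) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    squared_errors = ((a - b) ** 2 for a, b in zip(reference, received))
    return sum(squared_errors) / len(reference)


def psnr(reference: Sequence[float], received: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; `peak` = 255 assumes 8-bit samples."""
    error = mse(reference, received)
    if error == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / error)
```

A reduced-reference metric avoids this dependency by sending only a single baseline number with the image.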
The features related to the artifacts of blocking, blur, image activity, and intensity masking are measured using computationally efficient algorithms [8].

In other words, each individual feature extraction algorithm emulates the behavior of a visual sensor. Once the individual quality measures related to each artifact are obtained, the overall quality measure HIQM is obtained using a weighted sum of all the metrics. Since human vision reacts differently to various artifacts, the perceptual weight factor allocation for these individual sensors was based on the impact of the output of the sensor on the overall perceptibility of images by human vision. In this paper, the weights listed in Table I are used to calculate the overall quality measure HIQM. It should be noted that these weight factors were obtained using a subjective quality test conducted according to the methodology specified in the ITU-R Recommendation BT.500-11 [9]. Some of the findings from these subjective quality tests shall be given in the sequel. A detailed description of the conducted subjective quality tests and their analysis can be found in [10].

The subjective ratings from the experiments were averaged into a mean opinion score (MOS) which represents the subjective quality of a particular image. On the other hand, HIQM relates to the objective image quality and can be used to predict perceived image quality automatically. In particular, as each original image has its characteristic HIQM value or baseline, the magnitude of the difference between this baseline and the HIQM value of the distorted image

$$\Delta_{HIQM} = |HIQM_T - HIQM_R| \qquad (1)$$

constitutes a comprehensive image quality metric suitable for combination with implicit link adaptation techniques. In (1), the HIQM values of the transmitted and received image are denoted by $HIQM_T$ and $HIQM_R$, respectively. Figure 1 depicts MOS in relation to HIQM difference. Also, the MOS values for the individual image samples are shown along with the fitting curve and the 95% confidence interval. It turned out that an exponential function can be used for curve fitting to finally derive the prediction function

$$MOS_{\Delta_{HIQM}} = 78.87\, e^{-0.215 \cdot \Delta_{HIQM}} \qquad (2)$$

TABLE I
ARTIFACT EVALUATION.

| Metric | Weight |
| Blocking | 0.82 |
| Blur | 0.55 |
| Image activity metric - Edge related | 0.77 |
| Image activity metric - Gradient related | 0.21 |
| Masking | 0.22 |

Fig. 1. Subjective scores versus ∆HIQM [10]. (Plot of MOS, 0 to 100, against ∆HIQM, 0 to 9: image samples, fitting curve, and 95% prediction bounds.)
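
The weighted sum behind HIQM, the difference (1), and the prediction curve (2) can be sketched together as below. This is a minimal illustration only: the weights are those of Table I and the constants are those of (2), but the dictionary keys, the function names, and any feature values passed in are hypothetical, and the feature extraction algorithms of [8] are not reproduced.

```python
import math
from typing import Mapping

# Perceptual weights from Table I (the key names are illustrative).
WEIGHTS = {
    "blocking": 0.82,
    "blur": 0.55,
    "edge_activity": 0.77,
    "gradient_activity": 0.21,
    "masking": 0.22,
}


def hiqm(features: Mapping[str, float]) -> float:
    """Overall quality measure: weighted sum of the individual artifact measures."""
    return sum(weight * features[name] for name, weight in WEIGHTS.items())


def delta_hiqm(hiqm_transmitted: float, hiqm_received: float) -> float:
    """Equation (1): magnitude of the difference from the baseline."""
    return abs(hiqm_transmitted - hiqm_received)


def predicted_mos(delta: float) -> float:
    """Equation (2): exponential fit of MOS against the HIQM difference."""
    return 78.87 * math.exp(-0.215 * delta)
```

For example, `predicted_mos(delta_hiqm(2.7, 4.7))` evaluates (2) at a difference of 2 and gives roughly 51.3, while a difference of 0 returns the ceiling of the fitted curve, 78.87.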

As the proposed hybrid metric does not require the full reference image at the receiver but can be represented by a single number, it is well suited for in-service quality assessment and link adaptation purposes. In the latter application, the prediction curve (2) can be used to translate the HIQM difference $\Delta_{HIQM}$ into a predicted mean opinion score $MOS_{\Delta_{HIQM}}$ quantifying the image quality as perceived by human subjects.

III. IMPLICIT LINK ADAPTATION BASED ON ROI

The JPEG2000 standard provides the capability of defining the ROI, which is the area of an image to which the human eye tends to pay more attention than to the rest. This feature is intended for applications where a certain region of the image is of higher importance than the others [11]. The image progression order can also be defined based on layer, resolution, component, and position. At the source encoder, the ROI is selected and encoded with higher quality than the background. Depending on the importance of this region, the ROI can be compressed at any level up to and including lossless. Moreover, the ROI bitstream is located in the early segment of the whole bitstream, so that it can be transmitted first or with a higher priority. With retransmissions based on ROI in H-ARQ schemes, the number of retransmitted bits can be largely reduced.

With conventional H-ARQ schemes, whenever a codeword is to be retransmitted, the initial codeword is discarded and replaced by its retransmitted copy. The decoder thus ignores the information gathered through the failed decoding attempt, which leads to largely reduced throughput efficiency over noisy channels. This is avoided in this paper by using a soft-combining algorithm [5], which preserves the information obtained with each decoding attempt and incorporates it with the retransmitted copies of the codeword. With this, the reliability information is included in the soft-combining algorithm using the soft values on a symbol-by-symbol basis.

In the proposed H-ARQ scheme using the perceptual image quality metric and ROI, the baseline, i.e., the quality measure of the ROI part of the original image, is included with the first transmission. If the file is undecodable, retransmission of the whole image is requested. Otherwise, the quality of the ROI in terms of HIQM is measured. The HIQM difference between the received and the transmitted ROI is first obtained and then translated into a predicted mean opinion score $MOS_{HIQM}$ using (1) and (2). When the normalised $MOS_{HIQM}$ of the ROI of the received image has not reached the baseline within a selected margin, retransmission of only the ROI part is requested. These requests continue until the quality of the received ROI reaches the baseline of the original ROI within the selected margin or the number of retransmissions reaches the specified maximum. With each retransmission of the ROI, the soft-output values of the received ROI are combined with the retransmitted copies as suggested in the soft-combining algorithm [5].

IV. NUMERICAL RESULTS

The performance of the proposed implicit link adaptation scheme based on H-ARQ is obtained using computer simulations. In the simulations, a 38 kilobyte (kb) JPEG2000 test image “Yacht” is transmitted 200 times over an uncorrelated flat Rayleigh fading channel in the presence of additive white Gaussian noise (AWGN) with an average bit energy to noise power spectral density ratio (Eb/N0) of 5 dB. A (31, 21) Bose-Chaudhuri-Hocquenghem (BCH) code is used for error protection, and the coded bits are modulated using binary phase shift keying (BPSK). The maximum number of retransmissions in the soft-combining algorithm is set to 5. The original image was compressed using the Kakadu software [12] with Layer-Resolution-Component-Position (LRCP) progression order and six quality layers. With LRCP, the bitstream ordering is based on the quality layers. This type of progression order is also called signal-to-noise ratio (SNR) progressive [13]. The JPEG2000 error resilience parameters (ERTERM, RESET, RESTART and SEGMARK) as well as the start-of-packet (SOP) and end-of-packet-header (EPH) markers were enabled for maximum protection of the bitstream. These error-resilient features allow the decoder to perform bitstream truncation at the receiver in order to decode the heavily corrupted images.
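
The retransmission rule of Section III, together with the retransmission limit of 5 used in these simulations, can be sketched as follows. This is a hedged illustration rather than the simulator behind the results: decoding, soft combining, and HIQM measurement are hidden behind a hypothetical callable `receive(part)` that returns the HIQM of the combined ROI, or `None` when the file is undecodable. Two further assumptions are made: the normalised MOS is taken to be (2) rescaled so that a zero HIQM difference maps to 100, and an undecodable transmission counts towards the retransmission limit.

```python
import math
from typing import Callable, List, Optional, Tuple


def normalised_mos(delta_hiqm: float) -> float:
    """Curve (2) rescaled so that delta_hiqm = 0 gives 100 (assumed normalisation)."""
    return 100.0 * math.exp(-0.215 * delta_hiqm)


def adapt_link(receive: Callable[[str], Optional[float]],
               hiqm_baseline: float,
               margin: float,
               max_retx: int = 5) -> Tuple[bool, int, List[str]]:
    """Return (accepted, retransmissions used, list of requested parts)."""
    requests: List[str] = []
    part = "image"  # the initial transmission carries the whole image
    for n_retx in range(max_retx + 1):  # index 0 is the initial transmission
        hiqm_received = receive(part)
        if hiqm_received is None:
            part = "image"  # undecodable file: request the whole image again
        else:
            delta = abs(hiqm_baseline - hiqm_received)  # equation (1)
            if 100.0 - normalised_mos(delta) <= margin:
                return True, n_retx, requests  # baseline reached within margin
            part = "roi"  # decodable but below target: request only the ROI
        if n_retx < max_retx:
            requests.append(part)
    return False, max_retx, requests  # retransmission limit exhausted
```

Keeping the quality test separate from the transport makes the stopping criterion easy to swap, for instance to compare it against a PSNR-based threshold.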
Prior to the truncation process, the decoder first attempts to recover as many correct bits as possible. In the worst case, the first layer may be opened, since the data from this layer were given priority in the retransmission. The MAXSHIFT algorithm was used to distinguish the ROI from the background with the parameters Rlevels = 10 and Rshift = 12. A code block size of 64 was selected to avoid significant loss in compression efficiency, especially at high bit rates, since 4 bits per pixel (bpp) was chosen [14]. It has been observed that the maximum HIQM value of the selected ROI of the given example is reached with one layer, as shown in Fig. 2. In this case, an ROI of the test image of size 7 kb and a quality baseline of 2.7 based on the HIQM is selected for retransmissions.

Fig. 2. Layer progression of selected ROI of image “Yacht”: (a) Layer 1, HIQM=2.7, file size=7kb; (b) Layer 2, HIQM=2.7, file size=9.9kb; (c) Layer 3, HIQM=2.7, file size=14kb.

The quality measures obtained using the proposed reduced-reference scheme based on the normalised $MOS_{HIQM}$ are compared with those of PSNR, as it is the most commonly used quality metric for testing image and video communication systems. Figs. 3a and 3b show the average performance in terms of PSNR and normalised $MOS_{HIQM}$ of the ROI of the received image, respectively. Here, the averaging is carried out considering the individual measures of fully openable images. The reference PSNR of the ROI is obtained using a pixel-by-pixel comparison of the uncompressed ROI with the transmitted ROI. The reference PSNR of the ROI is found to be 45.33 dB and the normalised $MOS_{HIQM}$ of the ROI is 100. It is observed that the quality measures based on the normalised $MOS_{HIQM}$ reach the quality baseline within a tight margin after the third retransmission. Hence, the retransmission request in the given example can be terminated after the fourth retransmission. On the other hand, it was observed that the PSNR value never comes close to the reference quality. Thus, the termination will never occur and the link adaptation will continue until the maximum number of retransmissions allocated is reached, leading to a very inefficient scheme. Furthermore, the proposed $MOS_{HIQM}$, which provides a better measure of quality as perceived by the end user, can be incorporated into link adaptation as a method to terminate the retransmission requests more efficiently.

Fig. 3. Average performance of link adaptation for image “Yacht”: (a) PSNR; (b) Normalized MOS. (Panel (a) plots the average PSNR (dB) of the ROI and panel (b) the normalized MOS of the ROI against the retransmission number, 0 to 5 with rmax = 5; each panel marks the reference value of the ROI.)

Figure 4 illustrates the progress of image quality improvement with each retransmission. A large improvement in image quality is obtained with each retransmission up to the third; no further improvement is observed after the third retransmission. This gives the system an indication of when to terminate the retransmissions. Again, the proposed reduced-reference scheme constitutes an extra criterion for stopping the retransmissions. Moreover, with ROI-based selective retransmission, the retransmitted file size is reduced from 38 kb to 7 kb, a reduction of approximately 82%. This means that only 18% of the information needs to be retransmitted to obtain a quality improvement of the received image.

V. CONCLUSIONS

An implicit link adaptation method that takes into account the human perception of the ROI of JPEG2000 images is proposed. The predicted normalized MOS based on the perceptual image quality metric HIQM has been used as a criterion to terminate the retransmission requests in H-ARQ using a soft-combining algorithm over wireless channels. With the proposed scheme, the normalized MOS provides a good quality estimate of the received image without needing the reference image at the receiver. Also, the numerical results show that the use of the objective perceptual quality measure and ROI in link adaptation over wireless channels provides more efficient use of the limited bandwidth while meeting the perceived quality constraints.

VI. ACKNOWLEDGMENT

This work was partly supported by the Commonwealth of Australia through the Cooperative Research Centre program.

REFERENCES

[1] M. Grangetto, E. Magli, and G. Olmo, “Error Sensitivity Data Structures and Retransmission Strategies for Robust JPEG 2000 Wireless Imaging,” IEEE Trans. on Consum. Elec., vol. 49, no. 4, pp. 872-882, Nov. 2003.
[2] ISO/IEC JTC 1 SC 29 WG 1 15444-1, “JPEG 2000 Part 1-Core Coding System”.
[3] S. Winkler, E. D. Gelasca, and T. Ebrahimi, “Perceptual Quality Assessment for Video Watermarking,” in Proc. of Inter. Conf. on Inf. Tech., Coding and Computing, Nevada, USA, pp. 90-94, Apr. 2002.
[4] A. W. Rix, A. Bourret, and M. P. Hollier, “Models of Human Perception,” Journal of BT Technology, vol. 17, no. 1, pp. 24-34, Jan. 1999.
[5] H.-J. Zepernick, B. Rohani, and M. Caldera, “A Soft-combining Technique for LUEP Codes,” IEE Electronics Letters, vol. 38, no. 5, pp. 234-235, Feb. 2002.
[6] Z. Yu and H. R. Wu, “Human Visual System Based Objective Digital Video Quality Metrics,” in Proc. of 5th International Conf. on Signal Processing, Beijing, China, vol. 2, pp. 1088-1095, Aug. 2000.
[7] S. Winkler, “Digital Video Quality,” John Wiley & Sons, Chichester, 2005.
[8] T. M. Kusuma and H.-J. Zepernick, “In-Service Image Monitoring Using Perceptual Objective Quality Metrics,” Journal of Electrical Engineering, vol. 54, no. 9-10, pp. 237-243, Dec. 2003.
[9] ITU-R Recommendation BT.500-11, “Methodology for the Subjective Assessment of the Quality of Television Pictures,” Int. Telecommun. Union, Geneva, Switzerland, 2002.
[10] T. M. Kusuma, H.-J. Zepernick, and M. Caldera, “On the Development of a Reduced-Reference Perceptual Image Quality Metric,” to appear in Proc. of International Conference on Multimedia Communications Systems, Montreal, Canada, Aug. 2005.
[11] A. Skodras, C. Christopoulos, and T. Ebrahimi, “The JPEG 2000 Still Image Compression Standard,” IEEE Sig. Proc. Magazine, pp. 36-58, Sept. 2001.
[12] D. S. Taubman, “A Comprehensive Framework for JPEG2000: Kakadu Software [online],” [http://www.kakadusoftware.com], [Retrieved 1 July, 2004].
[13] D. S. Taubman and M. W. Marcelin, “JPEG2000: Image Compression Fundamentals, Standards and Practice,” Kluwer Academic Publishers, 2002.
[14] A. P. Bradley and F. W. M. Stentiford, “JPEG2000 and Region of Interest Coding,” in Proc. of Digital Image Computing Techniques and Applications, Melbourne, Australia, pp. 1-6, Jan. 2002.

Fig. 4. Quality improvement of image “Yacht” with each retransmission: (a) Original image; (b) Initial transmission (undecodable); (c) First retransmission; (d) Second retransmission; (e) Third retransmission; (f) Fourth retransmission.