Video Improvement Technique for Vibrating Video


Letter Paper Int. J. on Recent Trends in Engineering & Technology, Vol. 05, No. 02, Mar 2011

Video Improvement Technique for Vibrating Video Signals in Surveillance Applications

Vasujadevi M 1, S Nagakishore Bhavanam 2, N Mohan Reddy 3, S Pradeep Kumar Reddy 4

1, 3, 4 Bandari Srinivas Institute of Technology, E.C.E Dept., Hyderabad, India
Email: {vasujadevi, pradeepkumarreddys, nalabolumohanreddy}@gmail.com
2 Aurora’s Technological & Research Institute, E.C.E Dept., Hyderabad, India
Email: {kishorereddy.vlsi, satyabhavanam}@gmail.com

Abstract: Modern surveillance systems are often mounted on air vehicles and carry sensors for video capture and IR imaging. Most of the time, unmanned aerial vehicles (UAVs) are preferred for such applications. The video recorded from such airborne platforms is affected by vibration of the platform, and several automatic surveillance algorithms perform better if this vibration is removed. In this context, digital image stabilization (DIS) algorithms are investigated with respect to implementation feasibility and complexity. In this project, one of the most computationally efficient algorithms is simulated to produce qualitative results. We implement a fast digital image stabilizer using Gray-coded bit-plane matching (GC-BPM), which is robust to irregular conditions such as moving objects and intentional panning. The proposed GC-BPM algorithm is used for block-matching motion estimation in the DIS system. It is computationally efficient because it uses binary Boolean operations, which significantly reduce computational complexity, and because it also benefits from the three-step search (3SS). GNU Octave (MATLAB scripting) is used to implement the algorithm. Applications of these algorithms are studied and documented, with the main focus on UAV-based applications.

Keywords: DIS, GC-BPM, 3SS, UAV, GNU Octave, MATLAB.

I. INTRODUCTION

Video enhancement techniques have attracted great interest in recent years. Hand-held and mobile video cameras have become increasingly popular in the consumer market and in industry as their cost has decreased. The main goal of video stabilization is to remove the unwanted vibration caused by the person holding the camera or by mechanical shake, and to synthesize a new image sequence as seen from a new, stabilized camera trajectory. Two kinds of methods have been proposed to solve this problem: hardware approaches and image processing approaches. The hardware approach, or optical stabilization, uses camera motion sensors to drive an optical system that compensates when shake occurs, such as Steadicam gyroscopic stabilizers. Even though this method can work well in practice, it is not widely chosen because of its cost and its limited ability to handle gross motion of the camera. The other approach is image post-processing, which is our concern in this paper. In general, digital stabilization includes three stages:
(1) Interframe motion estimation.
(2) Motion smoothing and compensation.
(3) Filling up the missing image areas.

© 2011 ACEEE
DOI: 01.IJRTET.05.02.20

A. Previous work

The development of video stabilization can be traced back to the work of Ratakonda, who performed profile matching and sub-sampling to produce a low-resolution video stream in real time. Chang J. K. presented an approach to feature tracking based on optical flow, calculated on a fixed grid of points in the video. Buehler proposed a novel approach that applies image-based rendering techniques to video stabilization. The camera motion was estimated by a “non-metric” algorithm, and image-based rendering was then applied to reconstruct a stabilized video from the smoothed camera motion. This method avoided the problems with non-planar scenes and rotational camera motion that exist in homography-based schemes. However, it performs well only with simple, slow camera motion. A 2.5D motion model was introduced by Jin, adding a depth parameter to handle videos with large depth variations. However, none of the three depth motion models could simultaneously handle horizontal translation, vertical translation and rotation. Litvin M. applied probabilistic methods to estimate the intended camera motion. This method produced very accurate results, but it required tuning the parameters of the camera motion model to match the type of camera motion in the video. Finally, Matsushita developed an improved method for reconstructing undefined regions, called motion inpainting, together with a practical motion deblurring method. This method produced good results in most cases, but it relies strongly on the result of global motion estimation.

Recently, the use of invariant features for object recognition and matching has increased greatly. Invariant features, which are detected more repeatably and matched more reliably than traditional features such as Harris corners, are designed to be invariant under scaling and rotation.
In this paper, we use Lowe’s Scale Invariant Feature Transform (SIFT) features to estimate the interframe transformation. Because of their excellent properties, scale-invariant features have been widely used in object recognition in recent years, for example in the fully automatic panorama construction method proposed by Brown and Lowe. Video completion is still a challenge in recent research. The most widely used approach is mosaicing, which blends neighbouring frames to fill up the missing image areas. Unfortunately, significant artifacts can occur when moving objects appear at the boundary of the video frame or when the scene is non-planar. Wexler sampled spatio-temporal volume patches from different portions of the same video to repair the holes.

However, this has a high computational cost and requires a long video sequence to increase the chance of finding correct matches. Jia segmented the video into two layers, foreground and background, and repaired each layer individually. This approach also required a long video sequence, or at least a sequence containing a single motion period of the moving object. Matsushita A. propagated local motion from defined areas to missing areas, naturally filling up the missing areas even when scene regions were non-planar and dynamic. This method was free from the smearing and tearing present in previous methods. However, it may fail when fast-moving objects are in the scene, and real-time frame rates are not currently possible.

Translational or dolly motion with respect to the scene is slow and smooth compared with unwanted, parasitic camera movements. The stable motion we expect is not completely motionless; only the high-frequency camera motion is removed. The advantage of our method is that the undefined area caused by motion compensation is kept as small as possible, retaining more information in the video. The comparison is made in detail later. Finally, a new mosaicing method based on dynamic programming (DP) is proposed to fill up the missing area. The idea of using DP is inspired by Davis’s work, in which a single ‘correct’ frame is used to mosaic the region containing the moving object, to avoid discontinuity and blur in the object of interest, with the dividing boundary falling along a path of low intensity in the difference image. This segmenting mosaic method is also useful for inexact registration resulting from lens distortion or unintentional parallax from image discrepancy. Since our problem does not require a globally optimal path, the dynamic programming algorithm is more effective than Dijkstra’s algorithm.
The primary contributions of this paper are:
• tracking scale invariant feature transform (SIFT) features to estimate the global motion;
• using a segmenting mosaic method based on dynamic programming (DP) to fill up the missing areas in the stabilized frames.
To the best of our knowledge, neither of these two ideas has been applied to the video stabilization problem so far. The rest of this paper is organized as follows. Section 2 describes camera motion estimation based on scale-invariant features. Intentional motion estimation with Gaussian kernel smoothing and parabolic fitting is described in Section 3. Section 4 presents the proposed mosaic method using dynamic programming.

B. Motion Estimation

The first step of a video stabilization algorithm is to estimate the interframe motion. The feature-based approach has been used by the majority of existing stabilization techniques. The most commonly used features in previous work are image contours and region boundaries, both of which are sensitive to changes in image scale and likely to be disrupted by cluttered backgrounds near object boundaries. In this paper, we estimate the global motion based on SIFT features instead. First, we describe the selection of the motion model in the following subsection.

C. Motion Smoothing

The intentional motion in a video is usually slow and smooth, so a stabilized motion can be obtained by removing the undesired motion fluctuation, that is, the high-frequency component of the original video sequence. There is no unified standard for evaluating smoothness; the goal of video stabilization is to produce a visually pleasing video. In our work, we combine two smoothing methods to produce a more acceptable stabilized motion. The first is similar to a method used in previous work: to avoid the accumulated error caused by cascading the original and smoothed transformation chains, the local displacement among neighbouring frames is smoothed to generate a compensation motion. The second is local parabolic fitting. To preserve the camera’s main motion, a large Gaussian kernel is not appropriate, since it may lead to over-smoothing. A small Gaussian kernel, however, is not effective at reducing high-frequency camera motion, so it is difficult to choose a suitable kernel parameter. We therefore add local parabolic fitting to the motion smoothing. Curve fitting has been widely used in the stabilization problem. The motion path is controlled by the order of the curve, and fitting can minimize the undefined regions. Here, a parabola can satisfy the camera motion model. The advantage of this combination is that it not only produces smooth motion but also retains the main camera motion path.

II. DIGITAL IMAGE PROCESSING

Image processing has been developed in response to three major problems concerned with pictures:
• Picture digitization and coding, to facilitate transmission, printing and storage of pictures.
• Picture enhancement and restoration, in order, for example, to interpret more easily pictures of the surface of other planets taken by various probes.
• Picture segmentation and description, as an early stage in machine vision.
A monochrome image can be defined as a 2-dimensional light intensity function f(x, y), where x and y are spatial coordinates and the value of f at (x, y) is proportional to the brightness of the image at that point [2]. For a multicolour image, f is a vector, each component of which indicates the brightness of the image at point (x, y) in the corresponding colour band.

A digital image is an image f(x, y) that has been discretized both in spatial coordinates and in brightness. It is represented by a 2-dimensional integer array, or by a series of 2-dimensional arrays, one for each colour band [1]. The digitized brightness value is called the grey level value. Each element of the array is called a pixel or pel, derived from the term “picture element”. Usually, the size of such an array is a few hundred pixels by a few hundred pixels, and there are several dozen possible grey levels. Thus, a digital image looks like this:

$$ f(x, y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & & \vdots \\ f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1) \end{bmatrix} $$

with $0 \le f(x, y) \le G - 1$, where usually N and G are expressed as integer powers of 2 ($N = 2^n$, $G = 2^m$). The general block diagram of an image processing system is shown in Figure 2.1.

Figure 2.1: General block diagram of image processing system

Depending on the source, the acquisition method also varies [2]. To scan (digitize) a document, a document scanner is used, while a frame grabber is required to capture live video. A frame grabber can acquire images from any video source: video cameras, VCRs, etc.

Military applications: missile guidance and detection, target identification, navigation of pilotless vehicles, range finding, etc.
Astronomy and space applications: restoration of images suffering from geometric and photometric distortions, computing close-up pictures of planetary surfaces, etc.
Biomedical applications: ECG, EEG, EMG analysis; cytological, histological and stereological applications; automated radiology and pathology; X-ray

III. DIGITAL VIDEO STABILIZATION

A. Introduction

The digital image stabilizer consists of a motion (hand movement) estimation system and a motion correction system. In general, the motion estimation system generates several local motion vectors from sub-images at different positions in the frame using a block matching algorithm (BMA). The motion correction system determines the global motion of a frame by appropriately processing these local motion vectors, and decides whether the motion of the frame is caused by undesirable fluctuation of the camera. The stabilized image is generated by reading out the proper block of the fluctuated image from the frame memory.

A digital image stabilization system first estimates unwanted motion and then applies corrections to the image sequence. Image motion can be estimated using spatio-temporal or region matching approaches. Spatio-temporal approaches include parametric block matching, direct optical flow estimation, and a least mean-square error matrix inversion approach. Region matching methods include bit-plane matching, point-to-line correspondence, feature tracking, pyramidal approaches and block matching. Various computationally efficient image stabilization algorithms, such as representative point matching (RPM), edge pattern matching (EPM) and bit-plane matching (BPM), have been developed. However, RPM is sensitive to irregular conditions such as moving objects and intentional panning, and EPM requires a large amount of computation because of the pre-processing needed to generate edge maps.

B. GC-BPM and Motion Correction

In this report, we present a new motion estimation and correction technique based on Gray-coded BPM (GC-BPM) for camcorders. The proposed motion estimation algorithm performs fast binary matching using Gray-coded bit-planes of the image sequence. The proposed digital image stabilization (DIS) system also uses a motion correction algorithm that determines the global motion of a frame from the order statistics of the current local motion vectors and the past global motion vector. Generic DIS systems are commonly broken into the following major blocks: local motion estimation, global motion estimation, motion smoothing (i.e. filtering or integration) and motion compensation. The GC-BPM algorithm can be broken down into a pre-processing step followed by four major blocks (see Fig. 3.2.1). The algorithm begins by pre-processing the image data to produce Gray-coded bit-planes from the standard binary representation.

Figure 3.2.1: Block diagram of the GC-BPM image stabilization algorithm

The image frame $f_t(x, y)$ can be represented by a K-bit Gray code. The K-bit Gray code $g_{K-1} \ldots g_1 g_0$ can be computed from

$$ g_{K-1} = a_{K-1}, \qquad g_k = a_k \oplus a_{k+1}, \quad 0 \le k \le K-2, $$

where $\oplus$ is the exclusive OR operation and $a_i$ is the ith bit of the base-2 representation given by

$$ f_t(x, y) = a_{K-1} 2^{K-1} + a_{K-2} 2^{K-2} + \cdots + a_1 2^1 + a_0 2^0. $$

This Gray code has the unique property that successive code words differ in only one bit position. Thus, small changes in gray level are less likely to affect all K bit-planes. When gray levels 127 and 128 are adjacent, for instance, only the 7th bit-plane contains a 0 to 1 transition, because the Gray codes that correspond to 127 and 128 are 01000000 and 11000000, respectively. Fig. 3.2.1 shows Gray-coded bit-planes decomposed from a gray-scale image, where $g^k(x, y)$ is the kth-order Gray-coded bit-plane image. Only the higher-order bit-plane images contain visually significant data, whereas the other bit-planes contribute more subtle details within the image. Let the size of each sub-image be M × N and the search window be (M + 2p) × (N + 2p). For the proposed GC-BPM, we define the correlation measure

$$ c_j(m, n) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} g_t^k(x, y) \oplus g_{t-1}^k(x + m, y + n), \qquad -p \le m, n \le p, $$

where $g_t^k(x, y)$ and $g_{t-1}^k(x, y)$ are the current and previous kth-order Gray-coded bit-planes, respectively, and p is the maximum displacement in the search window.
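As a minimal sketch of the Gray-code decomposition and the matching measure, the following Python functions compute the Gray code of a pixel value, extract one Gray-coded bit-plane, and evaluate the fraction of unmatched bits (the count divided by the sub-image size) for one candidate displacement. The function names, the list-of-rows frame layout (first index x, second index y) and the block anchor (x0, y0) are assumptions made for illustration; they are not taken from the original GNU Octave/MATLAB implementation.

```python
def gray_code(value):
    """Gray code of a non-negative integer: g_k = a_k XOR a_(k+1)."""
    return value ^ (value >> 1)


def gray_bit_plane(frame, k):
    """k-th Gray-coded bit-plane (k = 0 is the least significant bit).

    `frame` is a list of rows of integer gray levels (assumed layout).
    """
    return [[(gray_code(pixel) >> k) & 1 for pixel in row] for row in frame]


def mismatch(curr_plane, prev_plane, x0, y0, size_m, size_n, m, n):
    """Fraction of unmatched bits between the M x N sub-image of the current
    bit-plane anchored at (x0, y0) and the previous bit-plane displaced by (m, n).

    The caller must keep all indices inside both bit-planes.
    """
    count = 0
    for x in range(size_m):
        for y in range(size_n):
            # XOR is 1 exactly where the two binary pixels differ.
            count += curr_plane[x0 + x][y0 + y] ^ prev_plane[x0 + x + m][y0 + y + n]
    return count / (size_m * size_n)
```

For example, `gray_code(127)` and `gray_code(128)` give `0b01000000` and `0b11000000`, which differ only in bit 7.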

Figure 3.2.3: The 3SS procedure, using a different Gray-coded bit-plane at each step

C. Median-based Motion Correction:

Figure 3.2.2: Illustration of the block searching method for comparing the previous frame to the current frame for motion estimation.

At each (m, n) within the search range, the proposed matching method calculates $c_j(m, n)$, which is the number of unmatched bits between the reference sub-image in the current bit-plane and the compared sub-image in the previous bit-plane. The smallest $c_j(m, n)$ yields the best match for each sub-image, and thus the local motion vector $V_j$ of the jth sub-image is obtained as

$$ V_j = \arg\min_{(m, n)} c_j(m, n), \qquad -p \le m, n \le p. $$

This motion estimation technique replaces the arithmetic calculations of BMAs based on the conventional MAD and MSE criteria with simple Boolean exclusive OR operations, and thus has significantly reduced computational complexity. Since GC-BPM performs motion estimation using a single bit-plane, it is important to select an appropriate bit-plane for matching. Simulation results with real video indicate that the 6th-order bit-plane $g^6(x, y)$, which contains both the global information and details of the original image, produces the best result.

To further reduce the computational complexity, we combine GC-BPM with the three-step search (3SS) for the DIS system. GC-BPM successively applies the 3SS steps to different Gray-coded bit-planes. Fig. 3.2.3 shows the proposed scheme using bit-planes 4, 5 and 6. In the first step, the proposed method, GC-BPM (3SS), performs GC-BPM using the 6th Gray-coded bit-plane at nine checking points on a 9x9 window. When the point with the smallest $c_j(m, n)$ is found, the centre of the search window is shifted to that point. In the second step, the 5th Gray-coded bit-plane is used for GC-BPM and the search window is reduced to 5x5. In the final step, the search strategy is the same as in the first step, but the search window is reduced to 3x3 and the 4th Gray-coded bit-plane is used.
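The three-step procedure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' Octave/MATLAB code: the function names, the square block of side `size` anchored at (x0, y0), the list-of-rows frame layout and the tie-breaking rule (prefer the point closest to the current centre) are all hypothetical choices. The step sizes 4, 2 and 1 correspond to nine checking points on 9x9, 5x5 and 3x3 windows, and the unnormalized count of unmatched bits is used because a constant factor does not change the minimizer.

```python
def gray_bit_plane(frame, k):
    """k-th Gray-coded bit-plane of a frame given as a list of rows of integers."""
    return [[((v ^ (v >> 1)) >> k) & 1 for v in row] for row in frame]


def unmatched_bits(curr, prev, x0, y0, size, m, n):
    """Number of unmatched bits between the size x size block of `curr` at
    (x0, y0) and the block of `prev` displaced by (m, n)."""
    return sum(
        curr[x0 + x][y0 + y] ^ prev[x0 + x + m][y0 + y + n]
        for x in range(size)
        for y in range(size)
    )


def three_step_search(curr_frame, prev_frame, x0, y0, size,
                      planes=(6, 5, 4), steps=(4, 2, 1)):
    """Local motion vector (m, n) of one block, using one bit-plane per step.

    The block plus the maximum displacement (7 pixels with the default
    steps) must lie inside both frames; this sketch does no bounds checks.
    """
    centre = (0, 0)
    for k, step in zip(planes, steps):
        curr = gray_bit_plane(curr_frame, k)
        prev = gray_bit_plane(prev_frame, k)
        best = None
        for dm in (-1, 0, 1):
            for dn in (-1, 0, 1):
                m, n = centre[0] + dm * step, centre[1] + dn * step
                # Ties are broken in favour of the point closest to the centre.
                score = (unmatched_bits(curr, prev, x0, y0, size, m, n),
                         abs(dm) + abs(dn))
                if best is None or score < best[0]:
                    best = (score, (m, n))
        # Shift the centre of the search window to the best checking point.
        centre = best[1]
    return centre
```

Recomputing the bit-planes for every block keeps the sketch short; a practical implementation would extract each bit-plane once per frame and reuse it for all sub-images.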


In general, motion vectors from sub-images containing moving objects are not reliable, and two successive frames disturbed by camera shake should have similar global motion. Based on these properties of camera motion, we use a simple and robust motion correction scheme [3], in which the global motion decision is made using the current local motion vectors ($V_1^t, V_2^t, V_3^t, V_4^t$) and the previous global motion vector $V_g^{t-1}$. The current global motion vector is obtained by

$$ V_g^t = \operatorname{median}\{ V_1^t, V_2^t, V_3^t, V_4^t, V_g^{t-1} \}. $$
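The median-based global motion decision can be sketched in a few lines of Python. The function name and the representation of motion vectors as (m, n) tuples are assumptions for illustration; the median is taken separately for each vector component over the four local vectors and the previous global vector.

```python
from statistics import median


def global_motion(local_vectors, prev_global):
    """Component-wise median of the local motion vectors and the previous
    global motion vector; each vector is an (m, n) tuple."""
    vectors = list(local_vectors) + [prev_global]
    return (median(v[0] for v in vectors), median(v[1] for v in vectors))
```

With five input vectors the median is always one of the observed component values, so a single outlier, such as a sub-image dominated by a moving object, does not affect the result.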

By minimizing this error operator, the resulting values of (m, n) produce a local motion vector for each of the four sub-images depicted in Fig. 3.2.2. Then the four local motion vectors, along with the previous global motion vector, are passed through a median operator to produce the current global motion vector estimate $V_g^t$. Here the median of the vectors is determined by separately selecting the median of each vector element. The global motion estimate is then passed through a filter that is tuned to preserve intentional camera motion (e.g. intentional panning) while removing the undesirable high-frequency motion. The final filtered motion estimate is then compensated for by shifting the current frame by an integer number of pixels in the opposite direction of the motion. Although GC-BPM, as formulated, works well for translational (horizontal and vertical) motion, it includes no estimation or compensation capability for rotational or zooming motion. The output image is generated by reading out the proper block of the input image from the frame memory.

IV. SIMULATION RESULTS

The most interesting simulation results for this project are the stabilized videos. The simulation is written and optimized for performance in MATLAB by storing all of the image data as unsigned 8-bit integers and applying fast vector operations as much as possible [4]. A useful way to visualize the image stabilization process is to look at plots of the estimated motion vectors. If jitter is added to images after they have been digitized, these plots can include the added jitter data for comparison. For natural stable image sequences, this visualization approach is useful for characterizing the amount of jitter present in a test sequence.
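The text does not specify the motion filter used before compensation. As one possible illustration only, the sketch below assumes a leaky integrator that accumulates the frame-to-frame global motion with a damping factor, so that a sustained pan decays out of the correction while jitter is cancelled, and then shifts each frame by the rounded, negated accumulated motion. The function names, the `damping` default of 0.95 and the `fill` value for uncovered pixels are hypothetical.

```python
def correction_offsets(global_vectors, damping=0.95):
    """Integer shift to apply to each frame, given the per-frame global
    motion vectors (m, n). Assumes a leaky integrator as the motion filter."""
    offsets = []
    acc_x = acc_y = 0.0
    for vx, vy in global_vectors:
        # Accumulate motion; damping < 1 lets slow intentional motion leak away.
        acc_x = damping * acc_x + vx
        acc_y = damping * acc_y + vy
        # Compensate in the direction opposite to the accumulated motion.
        offsets.append((-round(acc_x), -round(acc_y)))
    return offsets


def shift_frame(frame, dx, dy, fill=0):
    """Shift a frame (list of rows) by dx along the first index and dy along
    the second; pixels with no source are set to `fill`."""
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for x in range(height):
        for y in range(width):
            sx, sy = x - dx, y - dy
            if 0 <= sx < height and 0 <= sy < width:
                out[x][y] = frame[sx][sy]
    return out
```

Plotting the accumulated motion before and after filtering gives motion-vector traces of the kind discussed in this section.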
Another way of visualizing the effectiveness of image stabilization is to difference consecutive frames of the original and stabilized image sequences. Fig. 4 shows an example of this technique, taking the difference between frames 16 and 17 and mapping the output values ranging from -255 to +255 to black and white, respectively, so that gray is zero. In this report, we have proposed a GC-BPM algorithm for block-matching motion estimation in a DIS system.

Fig. 4.1. Original frame 120

Fig. 4.2. Difference between original frames 120 and 121.

Fig. 4.3. Difference between GC-BPM motion-corrected frames 120 and 121.

Fig. 4.4. Result of video stabilization. Top row: original input sequence (frames 5, 10, 15 and 20 are shown). Middle row: stabilized sequence, which still has missing image areas. Bottom row: stabilized

Fig. (a) The original frame (b) The reconstructed frame by our algorithm.
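The frame-differencing visualization described in this section can be sketched as follows. The function names and the list-of-rows frame layout are assumptions; the mapping sends a signed difference of -255 to black (0), +255 to white (255) and zero to mid-gray. The mean absolute difference is added here only as a simple, hypothetical summary number and is not a metric reported in the paper.

```python
def difference_image(frame_a, frame_b):
    """Map the signed difference frame_b - frame_a from [-255, 255] to
    [0, 255], so that zero difference is displayed as mid-gray."""
    return [
        [(b - a + 255) // 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def mean_abs_difference(frame_a, frame_b):
    """Average absolute gray-level difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(b - a)
            count += 1
    return total / count
```

A well-stabilized pair of consecutive frames should give a difference image that is mostly mid-gray, with structure remaining mainly around moving objects.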

V. CONCLUSIONS AND FUTURE WORK

The proposed GC-BPM algorithm is computationally efficient because it uses binary Boolean operations, which significantly reduce computational complexity, and because it also has the advantage of the 3SS. Gray-coded bit-plane matching is a robust optimization of the maximum-likelihood block matching approach that affords efficient implementations with quite acceptable qualitative performance. Further interesting work might relate the performance of GC-BPM and other motion estimation algorithms to a bound based on human perception, allowing a more quantitative and useful description of the motion stabilization system’s performance when it is used to produce image sequences for human viewing.

Future work: In the future, we want to speed up our algorithm and achieve dynamic, real-time stabilization. We will extend the algorithm to stabilize videos under various conditions, not only the translation-only motion we assume now but also translation with rotation, panning and so on. We could also develop a video stabilization algorithm for the Pocket PC and integrate it with a wireless network so that people can hold video conferences wherever they are. However, because of the limited transmission bandwidth and processing speed of the Pocket PC, video conferencing requires a much lower video bit rate. One option is to compress the image more strongly and adopt MPEG-7 technology to extract the part of the image we want, i.e. the human face, from the background. Since the background is less important than the face, its information need not be sent in every frame. Video stabilization can also be applied to consumer electronics such as camcorders. It is hard to eliminate the hand-shake effect completely, but this algorithm can reduce it. This will improve video quality and let people enjoy the multimedia world more.

REFERENCES
[1] Rafael C. Gonzalez, Richard E. Woods, “Digital Image Processing”, Second Edition, Pearson Education Inc., 2004.
[2] Anil K. Jain, “Fundamentals of Digital Image Processing”, Prentice Hall of India, 2002.
[3] David Salomon, “Data Compression: The Complete Reference”, Second Edition, Springer-Verlag New York Inc., 2001.
[4] Rafael C. Gonzalez, Richard E. Woods, Steven Eddins, “Digital Image Processing Using MATLAB”, Pearson Education Inc., 2004.

[5] William K. Pratt, “Digital Image Processing”, John Wiley, New York, 2002.
[6] Milan Sonka, Vaclav Hlavac, Roger Boyle, “Image Processing, Analysis, and Machine Vision”, Second Edition, Brooks/Cole, Vikas Publishing House, 1999.
[7] S. Ko, S. Lee, S. Jeon, and E. Kang, “Fast digital image stabilizer based on gray-coded bit-plane matching”, IEEE Transactions on Consumer Electronics, vol. 45, no. 3, pp. 598-603, Aug. 1999.

About Authors

S Nagakishore Bhavanam is an M.Tech postgraduate in VLSI-SD from Aurora’s Technological & Research Institute (ATRI), Department of ECE, JNTUH. He obtained his B.Tech from S.V.V.S.N Engineering College, Ongole. He has 14 months of teaching experience. He has 4 research papers published in IEEE Xplore, 3 international journal publications and 6 papers published in international and national conferences. His fields of interest are low-power VLSI, digital system design, sensor networks and communications.


Vasujadevi Midasala is an M.Tech postgraduate in VLSI from Bandari Institute of Science and Technology, Department of ECE, JNTUH. She obtained her B.Tech from S.V.V.S.N Engineering College, Ongole. She has 4 research papers published in international and national journals. Her fields of interest are low-power VLSI and design for testability.

N Mohan Reddy is an M.Tech postgraduate in VLSI from C.V.S.R Engineering College, Hyderabad, Department of Electronics and Communications Engineering. He obtained his B.Tech from Adams Engineering College, Palvancha. He has 3 years of teaching experience. His fields of interest are low-power VLSI and digital system design.

Sangala Pradeep Kumar Reddy is an M.Tech postgraduate from Vardhaman College of Engineering, Hyderabad, and is Associate Professor and H.O.D, Department of E.C.E, Bandari Srinivas Institute of Technology (BSIT). He obtained his B.Tech from Vijaynagara Engineering College, Bellary, Karnataka. He has more than 10 years of teaching experience.