NEW DATA ON NOISE VISIBILITY AND ITS APPLICATION TO IMAGE TRANSMISSION

by

ULICK OLIVER MALONE

B.A., B.A.I., Trinity College Dublin (1975)

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY JANUARY 1977

Signature of Author: (signature redacted), Department of Electrical Engineering and Computer Science, January 31, 1977

Certified by: (signature redacted)

Accepted by: (signature redacted), Chairman, Department Committee on Graduate Students

NEW DATA ON NOISE VISIBILITY AND ITS APPLICATION TO IMAGE TRANSMISSION

by ULICK OLIVER MALONE

Submitted to the Department of Electrical Engineering and Computer Science on January 31, 1977 in partial fulfillment of the requirements for the Degree of Master of Science.

ABSTRACT

A series of noise visibility experiments has been undertaken. The results of these experiments are used to validate the form log(1+ab) of the functional transfer model of vision. Certain of the results are found to be incompatible with Stockham's visual model. A theoretical framework for image dependent companding is set up using the functional transfer model of vision. Examples are given which show that this technique is an improvement on the traditional approach to optimum companding. All experiments and applications were implemented using a general purpose computer based image processing facility.

Name and Title of Thesis Supervisor: Donald E. Troxel, Associate Professor of Electrical Engineering.

ACKNOWLEDGEMENTS

Many thanks are due to my wife Cathy for the encouragement she gave me during the year I worked on this project. I am very grateful for the guidance I received from my supervisor Professor Donald Troxel and for the many hours of assistance given me by Charles Lynn.

TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGEMENTS

CHAPTER 1. INTRODUCTION

CHAPTER 2. EXPERIMENTAL TECHNIQUES

CHAPTER 3. OPTIMUM COMPANDING

CHAPTER 4. PICTURE DEPENDENT COMPANDING

CHAPTER 5. THE INFLUENCE OF BACKGROUND ADAPTION ON NOISE VISIBILITY

APPENDIX 1

BIBLIOGRAPHY AND REFERENCES

CHAPTER 1

INTRODUCTION

The subject of noise visibility is of fundamental importance in image processing and transmission due to the fact that very many of the techniques of image technology give rise to pictorial noise. As a result much effort has been devoted to the development of methods of reducing the detrimental effects of noise on picture quality. A good example is the quantization noise in PCM systems for pictures. This can lead to obvious discontinuities in the appearance of the pictures and false contours in areas of low detail. The visibility of such contours becomes irritating if a resolution of less than four bits per pel is attempted. A variety of techniques have been developed for either eliminating contours or lowering their visibility. For example, Graham (1) found that the visibility of the contours could be reduced by applying certain filtering operations to the image before and after quantization. Quantization contours may be considered to be the result of the addition of highly structured, picture correlated noise to the image, and as has been shown in various studies (3, 4, 5) such noise is more visible than random white noise of the same amplitude. Roberts devised a scheme which takes advantage of these facts (6). In this scheme pseudo-random noise is added to the image before quantization and the same noise signal is subsequently subtracted from it. The resulting noise is pseudo-random noise of the same amplitude as the quantization noise, but of lower visibility. Fairly acceptable pictures are produced by the Roberts scheme using only three bits per pel.

An understanding of the process of vision is essential to an understanding of noise visibility. The classical Weber fraction experiments (5, 6) have given rise to a simple but powerful visual model. In this model the output intensity at any point is considered to be some function of the intensity of the corresponding point in the input scene. This function v(b) defines the visual model and may be referred to as the visual transfer function. The results of the Weber fraction experiments have led to the conclusion that the visual transfer function is logarithmic. This information can be used to make predictions about noise visibility. For example using the logarithmic model it can easily be shown that noise should be more visible in the dark tones than in the bright tones of a picture, and this, as is well known, is true.

As a more practical example, Hashizume (8) used this model to show how noise visibility may be made independent of intensity. This manipulation of noise visibility is referred to as companding and is achieved by performing a tone scale transformation on the picture before noise is added and then performing the inverse transformation after the noise is added. Hashizume used the functional transfer model of vision to show that the function v(b) is the companding function which achieves noise visibility independent of intensity. Since this equalization of the noise in a picture usually results in an overall decrease in its visibility, the logarithmic companding scheme is often used in combination with the Roberts technique for further improvement in image quality (reference 10 is a good example).

The results of the Weber fraction experiments and the work of Hashizume have left some doubt as to the exact form of v(b). The Weber fraction experiments were mostly conducted in unusual conditions of dark adaption so it is not clear that the results of these experiments apply to more comfortable viewing conditions such as office lighting. For this reason part of this thesis deals with a new noise visibility experiment similar to the Weber fraction experiments which not only provides valuable new data on noise visibility but also allows a derivation of the exact form of v(b). This experiment was conducted under comfortable lighting conditions with a view to obtaining a result for v(b) which would apply in practical situations. This new result for v(b) was also intended for comparison with Hashizume's postulate v(b) = k log(1+ab) which, though successful in companding applications, was not verified directly.

Optimum companding using v(b) has the property of causing noise visibility to be independent of intensity. This necessitates a decrease in noise in the dark tones but an increase of noise in the bright tones. So if a picture is nearly all bright, optimum companding can have the undesirable effect of increasing the overall noise in the picture. A major portion of this thesis deals with methods of overcoming this inability of optimum companding to match itself to the intensity distribution of the individual picture.

A variety of optical illusions exist which cannot be explained by the functional transfer model of vision. Mach bands, simultaneous contrast and brightness constancy (7, 14) are the best known of these effects, and all are examples of the output intensity from the vision system not being functionally related to the intensity of the corresponding point in the input scene, and hence of the breakdown of the functional transfer model. Most attempts at developing a model which explains these illusions have concluded that the appropriate model is a log stage followed by a linear shift invariant filter (7, 12, 13, 14). Stockham's visual model (14) has been particularly successful in dealing with illusions. It appears to be the best visual model to date. As such it has the potential of being very useful in the mathematical analysis of noise visibility, and also in the field of noise reduction where it could be used as a companding processor. Unfortunately little or no research appears to have been done in this area since Stockham's paper was published in 1972.

Experiments have been described in the literature (5) which demonstrate that the sensitivity of vision in a small area is decreased by increasing the contrast between the small area and its background. Part of the work of this thesis deals with an investigation of this phenomenon in which the variation of noise visibility was measured as a function of contrast. A second experiment was designed to determine under what conditions contrast influenced noise visibility. It was hoped that the results of these experiments would give an indication of whether this effect is of any relevance to practical image processing. The decrease of noise visibility as contrast increases is another example of an effect which the functional transfer model fails to explain, but it is not intuitively clear whether Stockham's visual model can account for it or not. For this reason an analysis of the compatibility of this effect with Stockham's model will be given in this report.

The fact that noise is more visible in blank fields than in areas of detail (5) raises the issue of the relationship between noise visibility and the spectra of the noise and picture. Greenwood (3) and Mitchell (4) have studied this relationship and found it to be quite complex. Greenwood found that both spectra influence the visibility of the noise. Mitchell's experiments indicated that noise is most effectively concealed in the details of a picture when both picture and noise have the same frequency content. White noises with different probability distributions but equal variances have been found to have equal visibility (11), so it may be concluded that probability distribution is not an important factor in noise visibility.

A survey of the present knowledge of noise visibility has now been completed, and it may be concluded that the subject is very complex and not yet fully understood. The aim of this work has been to accumulate some new experimental data on noise visibility, investigate the implications of this data for visual models and their ability to predict noise visibility, and finally to use this new knowledge to improve on the traditional approach to companding.

CHAPTER 2

EXPERIMENTAL TECHNIQUES

2.1: The APED system.

This work was carried out using the APED image processing facility of the Cognitive Information Processing Group at the Research Laboratory of Electronics, M.I.T. This system is supervised by a custom designed real time multiprocessing operating system for a PDP-11/40 minicomputer. APED was designed to respond to a simple set of powerful user commands which may be entered into the system via a keyboard. The multiprocessing feature of APED allows it to perform a variety of tasks needed to keep the system in order concurrently with its real time servicing of user commands.

APED was designed to receive, transmit, display and process pictorial data. The basic data structure of APED is the picture file. A picture file is composed of lines, each of which is made up of a number of pels (picture elements). Each pel is internally represented as a binary number. For monochrome pictures this binary number is proportional to the intensity at a point of the picture being represented. A picture file is thus a two dimensional digital signal corresponding to the digitized samples of the intensities in the picture being represented.

Operator commands enable the user to input pictures to the system from the Associated Press news photo wire and a facsimile receiver device. Once received, a variety of processing operations may be carried out on the picture such as filtering, sharpening or enlarging. The processed picture may then be transmitted to its final destination, for example disk storage or the T.V. display.

2.2: Software for noise visibility experiments.

A variety of new APED commands were developed for this research. These included commands to add random noise to a picture, commands to generate noisy test patterns for the noise visibility experiments, and commands to reformat news photos for the purpose of testing applications. A picture format of 256 lines of 128 pels with 8 bit resolution was selected as the standard for these commands. A Tektronix 633 picture monitor was used for display purposes. Existing hardware was used to display the 256 x 128 pel pictures on this T.V., and produced a square picture of dimensions 28 cm x 28 cm. Thus the vertical spacing of pels was twice as close as the horizontal spacing.

Software development was carried out using the manufacturer's operating system DOS for the PDP-11. The new commands were programmed using assembly language, assembled, debugged, and then integrated into the APED system.

APED was found to be very suitable for implementing the experiments and applications of this research. Its great flexibility allowed all the new commands to be implemented in software and without any modifications to the existing hardware. This highlights the utility of general purpose systems such as APED in implementing a great variety of tasks with minimum effort.

2.3: The generation of pseudo-random numbers.

In this work pictorial noise was produced by adding a sequence of random numbers to the picture signal before displaying it. A subroutine was therefore required to produce a sequence of random numbers with acceptable statistical properties. The required probabilistic behavior for this application is that the output of the random number generator should behave as a discrete random variable n with P.M.F.

$$P_n(n_0) = \frac{1}{2N+1}, \quad -N \le n_0 \le +N,$$

where N, the noise amplitude, is the input parameter to the generator. It was decided to achieve this by first generating a random number from the range (0,1), multiplying this number by 2N + 1, truncating the result and then subtracting N. This equivalent procedure simplifies the problem to generating random numbers with uniform distribution on the range (0,1).

As a first attempt, the linear congruential algorithm as described by Knuth (17) was implemented and tested. This algorithm can be summarized as follows:

$$X_{n+1} = (a X_n + c) \bmod M$$

where $X_{n+1}$ and $X_n$ are the (n+1)th and nth values of the random sequence. a and c were chosen according to the constraints laid down by Knuth. M was chosen to be $2^{15}$ since this allowed modulo arithmetic to be programmed with ease on the PDP-11. The seed $X_0$ was initialized with the computer clock time. The foregoing procedure guarantees that all $2^{15}$ possible values of $X_n$ are generated before the sequence starts to repeat. The periodicity of generators such as this one is the reason why they are referred to as pseudo-random rather than random number generators.

The 15 bit numbers produced by the Knuth algorithm may be considered to be 15 bit binary fractions selected with uniform probability from the range (0,1) and so may be used to obtain a random sequence of amplitude N using the calculation $\lfloor (2N+1) X_n \rfloor - N$. An example of the pictorial noise produced using this procedure is given in Fig. 2.1(a). The vertical stripe pattern indicates a high degree of correlation between every 128th number in the sequence. Attempts to eliminate the stripes by varying the values of a and c met with no success. It may be concluded that for $M = 2^{15}$ these undesirable vertical stripes are an inescapable result when using the linear congruential generator and a line size of 128. For this reason it was decided not to use the linear congruential method.

The problem of vertical stripe patterns in the pictorial noise was eliminated by using a pseudo-random number generator based on a feedback shift register. The particular logical configuration selected has already been described in a paper by Troxel (10), and corresponds to the following bit equations for an 18 bit register:

b0 = b3 XOR b10    b1 = b4 XOR b11    b2 = b5 XOR b12    b3 = b6 XOR b13
b4 = b7 XOR b14    b5 = b8 XOR b15    b6 = b9 XOR b16    b7 = b10 XOR b17
b8 = b0    b9 = b1    b10 = b2    b11 = b3    b12 = b4
b13 = b5    b14 = b6    b15 = b7    b16 = b8    b17 = b9

Repeated invocation of these feedback equations gives rise to a sequence of random 8 bit patterns in bits 10-17 of the register. The period of the random sequence is determined by the register length.

An example of the pictorial noise produced by the shift register random number generator is given in Fig. 2.1(b). Unlike Fig. 2.1(a), this photograph is free of undesirable patterns.

Fig. 2.1(a) Noise field using linear congruential generator, N = 16.

Fig. 2.1(b) Noise field using shift register generator, N = 16.
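A minimal Python sketch of a shift register generator of this kind is given below. It assumes the feedback takes the form b_i = b_(i+3) XOR b_(i+10) for i = 0 to 7 and b_i = b_(i-8) for i = 8 to 17; this is a reading of the partly illegible printed equations, not a confirmed transcription of Troxel's configuration. The seed value is arbitrary.

```python
REGISTER_BITS = 18


def step(state):
    """Apply the assumed feedback equations once to an 18 bit state."""
    old = [(state >> i) & 1 for i in range(REGISTER_BITS)]
    new = [0] * REGISTER_BITS
    for i in range(8):                       # b0..b7: feedback bits
        new[i] = old[i + 3] ^ old[i + 10]
    for i in range(8, REGISTER_BITS):        # b8..b17: copies of old b0..b9
        new[i] = old[i - 8]
    return sum(bit << i for i, bit in enumerate(new))


def random_bytes(seed, count):
    """Return `count` 8 bit patterns read from bits 10-17 of the register."""
    state = seed & (2 ** REGISTER_BITS - 1)
    if state == 0:
        state = 1                            # the all-zero state never changes
    out = []
    for _ in range(count):
        state = step(state)
        out.append((state >> 10) & 0xFF)
    return out


samples = random_bytes(seed=0x2A5F3, count=1024)
print(samples[:8])
```

Under this assumed form each invocation is equivalent to clocking a one-bit feedback shift register eight times, which is why a fresh 8 bit pattern is available after every call.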

2.4: Statistical testing of the shift register pseudo-random number generator.

In order to test the statistical behavior of this generator the average and average squared values of its output for N = 6 were calculated for sequence lengths of 72. Each sequence was initialized using computer clock time. Since N = 6 was the largest amplitude and 72 was the smallest sequence length needed for quantitative results in this work, the statistical behavior for these values of the parameters represents the worst case which can arise.

Using the model $P_n(n_0) = 1/13$, $-6 \le n_0$
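As a point of reference for such a test, the moments implied by a uniform P.M.F. on the integers from -6 to +6 can be computed directly. The sketch below is illustrative only; the spread of the 72-sample average assumes independent samples, which is an assumption made here rather than a statement from the text.

```python
from fractions import Fraction

N = 6
values = range(-N, N + 1)
p = Fraction(1, 2 * N + 1)                 # uniform P.M.F., 1/13 for N = 6

mean = sum(p * n for n in values)          # expected average
mean_square = sum(p * n * n for n in values)   # expected average squared value

# Standard deviation of the average of 72 samples, assuming independence.
sample_count = 72
std_of_average = (float(mean_square) / sample_count) ** 0.5
print(mean, mean_square, std_of_average)
```

Measured averages over 72 samples that fall within a few multiples of this spread of zero would be consistent with the uniform model.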
48 and < 64 (this agrees with the data from all 5 subjects).

In the absence of further data it seems reasonable to use b = 56, the midpoint between 48 and 64, as the first data point.

There was less agreement among the subjects on the highest value of b such that $N_c = 2$. Transitions from $N_c = 2$ to $N_c = 3$ were reported between 160 and 176 by two of the subjects, between 96 and 112 by one subject, between 128 and 144 by one subject, and between 192 and 208 by one subject. The method chosen of determining the average behavior based on this data has been to take the arithmetic average of the midpoints of the above intervals:

$$\frac{1}{5}(136 + 168 + 168 + 200 + 104) = 155.2.$$
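The averaging of interval midpoints can be checked with a few lines of Python. The interval list simply restates the five reported transitions; the variable names are illustrative.

```python
# Reported transition intervals from N_c = 2 to N_c = 3, one per subject.
intervals = [(160, 176), (160, 176), (96, 112), (128, 144), (192, 208)]

midpoints = [(low + high) / 2 for low, high in intervals]
representative_b = sum(midpoints) / len(midpoints)
print(midpoints, representative_b)
```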

It is more difficult to decide on a representative value for the transition from $N_c = 3$ to $N_c = 4$. Three of the subjects were able to locate the target with $N_c = 3$ for the maximum value of b tested, b = 250, though not with great consistency. The other two subjects reported transition from $N_c = 3$ to $N_c = 4$ in the intervals 208-224 and 224-240. One possible conclusion that may be drawn from this data is that the representative value should be placed at or about the maximum value tested, b = 250.

The three representative values of b selected, 56, 155.2, and 250, are the points at which the noise is just visible for amplitudes N = 1, 2, and 3. As shown in Fig. 3.2 these three data points are very close to the linear relationship .435 + .0101b. As explained earlier this implies that the function $\Delta b_c(b) = J(b)$ is the same linear function multiplied by a constant:

$$J(b) = k_0 (.435 + .0101b)$$

Letting m = .435 and n = .0101,

$$v(b) = \int_0^b \frac{k \, db}{k_0 (m + nb)} = K \log\left(1 + \frac{n}{m} b\right)$$

where $K = \frac{k}{k_0 n}$. Substituting for m and n gives:

$$v(b) = K \log(1 + ab)$$

where K is a constant, and a = .0232.
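The fit of the three data points to the line and the resulting value of a can be checked numerically. The sketch below evaluates the stated line .435 + .0101b at the three representative brightnesses; it does not refit the data, and the variable names are illustrative.

```python
m, n = 0.435, 0.0101
# (brightness b, just-visible noise amplitude N) for the three data points.
points = [(56.0, 1), (155.2, 2), (250.0, 3)]

# Difference between the line and each measured amplitude.
residuals = [(m + n * b) - amplitude for b, amplitude in points]

# Parameter of v(b) = K log(1 + a b), with a = n / m.
a = n / m
print(residuals, a)
```

The residuals are small compared with the unit spacing of the noise amplitudes, and the ratio n/m reproduces the quoted value of a.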

The values of m and n used to obtain this result are based on the allocation of the three data points of Fig. 3.2. Though these three points were selected as objectively as possible from the mass of experimental data it is clear that these values may not be considered to be highly accurate. However it does appear reasonable to assume that the three points are linearly, or very nearly linearly, related. Allowing the assumption of a linear relationship, it is worthwhile to investigate the potential error in the estimate of the parameter a = n/m of the final result for v(b). The differential of a is

$$da = \frac{1}{m} \, dn - \frac{n}{m^2} \, dm.$$

Allowing for an error of 10% in the values of m and n results in a maximum positive error in the value of a of approximately

$$\Delta a = \frac{1}{m}(.1n) - \frac{n}{m^2}(-.1m) = .2a.$$

Similarly the greatest negative error would be $\Delta a = -.2a$. Thus allowing for 10% variations in the values of m and n would restrict a to the interval (.018, .028).

Fig. 3.2 Data Points and the Line .435 + .0101b ($N_c$ plotted against b).
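The first-order error estimate can be compared with the exact extremes of n/m under the same 10% variations. This is a sketch for illustration; the comparison with the exact ratio is an addition and is not part of the original analysis.

```python
m, n = 0.435, 0.0101
a = n / m
rel = 0.10                                   # assumed 10 % error in m and in n

# First-order (differential) estimate: da = dn/m - n*dm/m**2,
# with dn = +rel*n and dm = -rel*m for the largest positive error.
da_max = (rel * n) / m - n * (-rel * m) / m ** 2
linear_interval = (a - da_max, a + da_max)

# Exact extremes of n/m when m and n each vary by +/- 10 %.
exact_interval = (n * (1 - rel) / (m * (1 + rel)),
                  n * (1 + rel) / (m * (1 - rel)))
print(linear_interval, exact_interval)
```

The exact interval is slightly asymmetric about a, but it leads to the same practical conclusion as the linearized one.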

The foregoing error analysis serves to illustrate that the value a = .0232 may not be considered a precision value. However it is doubtful that it should be measured with any greater precision, since the average behavior of human vision is not itself a very precise idea. In the next section it will be shown that the value a = .02 is accurate enough for companding applications, and furthermore that this value is not critical (for example with a = .01 the effect of companding is indistinguishable from using a = .02). The real value of this experiment has been to derive the form log(1+ab), and obtain some idea of the value of the parameter a. It is doubtful that any practical purposes would be served by setting up a more precise experiment than this one.

For comparison purposes, the experiment was carried out with two subjects using slightly modified lighting conditions (a small amount of daylight was allowed instead of using the 75 W lamp). The same trends were observed in the experimental results as with the controlled lighting conditions. Apparently moderate changes in lighting are not critical.

In contrast, when the experiment was carried out in darkness, a completely different set of results was obtained: a noise amplitude N = 1 was visible throughout most of the dynamic range of the T.V. The implication is that viewing the T.V. in total darkness radically alters its perception.

3.6: Companding.

Companding is the process of manipulating the visibility of additive noise in a picture by means of processing the picture both before and after the noise is added. The traditional approach to companding has been as shown in Fig. 3.3(a). Each pel intensity b is transformed by a companding function c(b) before the noise is added. After the noise is added the value c(b)+n is inverse transformed by the inverse of c(b) to obtain $c^{-1}(c(b)+n)$. This process alters the visibility properties of the additive noise. The traditional aim of companding has been to cause the visibility of noise to be independent of intensity, in contrast with the absence of companding when the noise is more visible in the dark regions than in the bright regions of a picture. The most important application of this is in conjunction with the Roberts technique of converting the pictorial contours due to intensity quantization to "snow" noise. As is well known the Roberts technique effectively adds uniformly distributed discrete random noise to the picture, so that companding may be used to manipulate the visibility of this noise. This combination of companding and the Roberts technique is of great value in image transmission applications where it is desired to transmit as few bits per pel as possible.

Fig. 3.3(a). The Companding Process: b -> c( ) -> c(b) -> add n -> c(b)+n -> $c^{-1}$( ) -> $c^{-1}(c(b)+n)$.
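The companding process can be sketched for a single pel as follows. The sketch assumes the logarithmic compandor c(b) = K log(1 + ab) with a = .02, scaled so that c(255) = 255; the scaling, the display range and the function names are assumptions made for illustration.

```python
import math

A_PARAM = 0.02                       # value of a suggested in the text
B_MAX = 255.0                        # assumed top of the display range
K = B_MAX / math.log(1 + A_PARAM * B_MAX)


def compand(b):
    """Forward tone scale transformation c(b) = K log(1 + a b)."""
    return K * math.log(1 + A_PARAM * b)


def expand(y):
    """Inverse transformation applied after the noise is added."""
    return (math.exp(y / K) - 1) / A_PARAM


def output_error(b, n):
    """Error left in the output when noise n is added between c and its inverse."""
    return expand(compand(b) + n) - b


dark_error = output_error(10.0, 4.0)
bright_error = output_error(240.0, 4.0)
print(dark_error, bright_error)
```

The same added noise leaves a smaller error in the dark tones and a larger one in the bright tones, which is how the compandor offsets the greater visibility of noise in dark regions.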

3.7: Determination of an optimum companding function.

For simplicity of discussion, a companding function c(b) on the range 0 < b
(3) or c'(b) < 1. Similarly the region in which apparent noise is decreased is defined by (4) < (3) or c'(b) > 1. These results are again intuitively agreeable. An interval in which c'(b) < 1 is compressed by the application of c(b), so that additive noise will be expanded by the application of $c^{-1}(b)$. Similarly an interval in which c'(b) > 1 is expanded by c(b), so that additive noise will be compressed by $c^{-1}(b)$. Note that these results are again independent of v(b), so that they apply in any viewing situation which may be modelled by a visual transfer function.

Continuing with this analysis, it is possible to determine whether apparent noise increases or decreases as b increases when companding is used. It increases if

$$\frac{d}{db} \frac{v'(b)}{c'(b)} > 0$$

or if

$$c'(b) v''(b) > c''(b) v'(b)$$

since c'(b) > 0. There is no variation of noise visibility if c'(b)v''(b) = c''(b)v'(b) and noise visibility decreases as b increases if c'(b)v''(b) < c''(b)v'(b). Unlike previous results these depend on v(b). For the case v(b) = k log(1+ab), a = .02, $v'(b) = \frac{ka}{1+ab}$, and $v''(b) = -\frac{ka^2}{(1+ab)^2}$, the above may be restated as: the apparent noise increases with b if -ac'(b) > (1+ab)c''(b); it is independent of b if -ac'(b) = (1+ab)c''(b) and it decreases with b if -ac'(b)
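The condition on c'(b) and c''(b) can be checked numerically for a logarithmic compandor. The sketch below assumes c(b) = k1 log(1 + a1 b) and drops the constant factors k and k1, which do not affect the sign of the condition; the function names and the sample brightness levels are illustrative.

```python
A_VISUAL = 0.02                      # a in v(b) = k log(1 + a b)


def apparent_noise(b, a1):
    """v'(b) / c'(b) for c(b) = k1 log(1 + a1 b), up to a constant factor."""
    v_prime = A_VISUAL / (1 + A_VISUAL * b)
    c_prime = a1 / (1 + a1 * b)
    return v_prime / c_prime


def condition(b, a1):
    """Return -a c'(b) - (1 + a b) c''(b); positive means noise grows with b."""
    c_prime = a1 / (1 + a1 * b)
    c_second = -a1 ** 2 / (1 + a1 * b) ** 2
    return -A_VISUAL * c_prime - (1 + A_VISUAL * b) * c_second


levels = [0, 64, 128, 192, 255]
steeper = [apparent_noise(b, 0.05) for b in levels]       # a1 > a
matched = [apparent_noise(b, A_VISUAL) for b in levels]   # a1 = a
print(steeper, matched)
```

With a1 = a the apparent noise is the same at every level, while a1 > a makes it rise with b, in agreement with the sign of the condition.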


$b_i = k_1 \log a_1 k_1$, and it causes an increase in noise for $b < b_i$. This type of companding would clearly be useful for pictures which have most of their area of intensity $> k_1 \log a_1 k_1$. For example, using $a_1 = .05$ gives $b_i = 154$ so that noise should be reduced in most of the test pattern of Fig. 4.3(a) for this value of $a_1$. This may be verified by comparing Figs. 4.3(c) and 4.3(d). In contrast optimum companding results in apparent noise even greater than with no companding as can be seen by comparing Figs. 4.3(b) and 4.3(c). In this situation the "optimum" companding function is far from optimum in the sense of achieving reduction in noise visibility.

Earlier it was shown that a companding function $k_1 \log(1+a_1 b)$, $a_1 > a = .02$, is more effective at reducing noise visibility for $b < b_i = k_1 - \frac{1}{a_1}$ than the optimum compandor. So a picture which has most of its area of intensity $< k_1 - \frac{1}{a_1}$ would benefit more from the use of $c(b) = k_1 \log(1+a_1 b)$ than from optimum companding. These two examples of situations in which optimum companding does not achieve the most reduction in noise visibility suggest the possibility of choosing picture dependent companding functions as an alternative. Optimum companding, though it does result in noise visibility which is independent of intensity, does not take advantage of the intensity distribution of the individual picture and the additional potential for noise reduction which may arise from this distribution.
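The intermediate brightnesses of the two compandors can be evaluated for $a_1 = .05$. The sketch assumes that both functions are scaled so that c(255) = 255, which fixes $k_1 = 255 / \log(1 + 255 a_1)$; this normalization is inferred from the quoted value of 154 rather than stated explicitly.

```python
import math

B_MAX = 255.0
a1 = 0.05
k1 = B_MAX / math.log(1 + a1 * B_MAX)        # assumed scaling, c(255) = 255

# c(b) = k1 log(1 + a1 b): c'(b) = k1 a1 / (1 + a1 b) = 1 at b_i = k1 - 1/a1.
b_i_log = k1 - 1 / a1

# c(b) = (1/a1)(exp(b/k1) - 1): c'(b) = exp(b/k1) / (a1 k1) = 1 at b_i = k1 log(a1 k1).
b_i_exp = k1 * math.log(a1 * k1)
print(b_i_log, b_i_exp)
```

The exponential form gives a value that rounds to the quoted 154, and the logarithmic form gives an intermediate brightness in the darker part of the range.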

4.2: Companding functions with two or more intermediate brightnesses.

So far companding functions with one intermediate brightness $b_i$ have been discussed. In one case the companding function decreased noise visibility for $b > b_i$ and in the other case noise visibility was decreased for $b < b_i$. Thus the former type of companding function is suitable for predominantly bright pictures and the latter type is suitable for predominantly dark pictures.

By using APED's facility to compute and display the intensity histogram of a picture it has been found that many photographs are bimodal with peaks both in the dark and bright tones, with the midtones occupying a relatively small area of the picture. With this type of a picture, using a compandor $k_1 \log(1+a_1 b)$, $a_1 > a$, would reduce noise visibility in the dark tones at the expense of greatly increasing it in the bright tones. Similarly using a compandor $\frac{1}{a_1}(\exp(\frac{b}{k_1}) - 1)$ would reduce noise visibility in the bright tones at the expense of greatly increasing it in the dark tones. A new approach must be taken to reduce noise visibility both in the bright and dark tones simultaneously.

A compandor will now be analyzed which has this capability:

$$c(b) = \frac{1-d}{127.5^2}(b - 127.5)^3 + d(b - 127.5) + 127.5$$

As before, this function has been selected to accommodate the dynamic range of the T.V. so that c(0) = 0 and c(255) = 255.

$$c'(b) = \frac{1-d}{127.5^2} \cdot 3(b - 127.5)^2 + d,$$

so that $b_i$ is the solution of

$$\frac{1-d}{127.5^2} \cdot 3(b_i - 127.5)^2 + d = 1$$

or

$$b_i = 127.5 \pm \frac{127.5}{\sqrt{3}}.$$

Thus there are 2 intermediate brightnesses, $b_{i1} = 54$ and $b_{i2} = 201$, at which noise visibility is unchanged by companding.
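The properties of this cubic compandor can be verified numerically. In the sketch below the value d = 0.5 is a hypothetical choice; the end points and the intermediate brightnesses do not depend on it.

```python
import math

MID = 127.5                                   # mid-point of the 0-255 range


def c(b, d):
    """Cubic compandor with slope d at mid-range."""
    return (1 - d) / MID ** 2 * (b - MID) ** 3 + d * (b - MID) + MID


def c_prime(b, d):
    """Derivative of the cubic compandor."""
    return 3 * (1 - d) / MID ** 2 * (b - MID) ** 2 + d


d = 0.5                                       # hypothetical value of d
b_i1 = MID - MID / math.sqrt(3)               # lower intermediate brightness
b_i2 = MID + MID / math.sqrt(3)               # upper intermediate brightness
print(c(0.0, d), c(255.0, d), b_i1, b_i2, c_prime(b_i1, d))
```

Because the slope exceeds 1 outside the two intermediate brightnesses and is below 1 between them when d < 1, such a compandor favors both the dark and the bright tones at the expense of the midtones.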

Given the restriction 0 < c'(127.5) and the fact that c'(127.5) = d, it may be shown that c'(b) > 1 for $0 < b < b_{i1}$