Pattern Recognition

Pattern Recognition Prof. Christian Bauckhage

outline lecture 04

  recap
  types of measurements
  basic linear algebra
  Aitchison geometry
  summary

recap

a problem domain $Q$ consists of quantifiable objects from a specific application area; the $q \in Q$ are called patterns

pattern recognition deals with mathematical and technical aspects of processing and analyzing patterns

the usual goal is to map patterns to symbols or data structures in order to classify them

classes or categories result from decomposing $Q$ into $k$ or $k + 1$ subsets $\Omega_1, \Omega_2, \ldots$

if there are $k + 1$ classes, $\Omega_0$ is called the rejection class

a pattern has features that are characteristic of the class(es) it belongs to, and there is a function $f$ that extracts features from patterns

  $f(q) = x$

classification is to map the features $x$ extracted from a pattern $q$ to a class $\Omega_i$

  $y(x) = i$
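The two mappings above can be sketched in code. This is only an illustration under stated assumptions: the type `Pattern`, the feature "mean gray value", the function names `extract_features` and `classify`, and the threshold rule are all hypothetical and do not come from the lecture.

```cpp
#include <numeric>
#include <vector>

// Hypothetical pattern q: a list of raw gray values (assumption for illustration).
struct Pattern {
    std::vector<double> pixels;
};

// f(q) = x: extract a single feature, the mean gray value.
// An empty pattern yields a negative value so that it can be rejected later.
double extract_features(const Pattern& q) {
    if (q.pixels.empty()) return -1.0;
    const double sum = std::accumulate(q.pixels.begin(), q.pixels.end(), 0.0);
    return sum / static_cast<double>(q.pixels.size());
}

// y(x) = i: map the feature to a class index.
// 0 plays the role of the rejection class, 1 means "dark", 2 means "bright".
int classify(double x) {
    if (x < 0.0 || x > 255.0) return 0;  // feature out of range: reject
    return (x < 128.0) ? 1 : 2;
}
```

A call such as `classify(extract_features(q))` then composes the two steps, feature extraction followed by classification.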

the process of building and using a classifier involves 3½ phases

  1) training phase
     validation phase (optional)
  2) test phase
  3) application phase

for training and testing, we need independent, representative, labeled samples of patterns

types of measurements

four types of data / measurements

nominal (categorical, qualitative differentials)
  examples:   sex (male / female), nationality, ...
  operations: $=$

ordinal (allow for ranking)
  examples:   health (healthy vs. sick), agreement, ...
  operations: $=, <$

interval (degrees of difference, distance has meaning)
  examples:   dates (e.g. 50 BC), temperatures in °C, ...
  operations: $=, <, +, -$

ratio (zero is unique and meaningful)
  examples:   length, duration, energy, temperatures in K, ...
  operations: $=, <, +, -, \times, \div$

note

different kinds of measurements require different kinds of mathematical treatment (different classification algorithms)

throughout this course, we will mostly consider ratio data
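Why the scale matters can be seen from temperatures: 20 °C is not "twice as warm" as 10 °C, because division is only meaningful on a ratio scale such as kelvin. A minimal sketch, with hypothetical function names:

```cpp
// Convert a temperature from degrees Celsius (interval scale) to kelvin (ratio scale).
double celsius_to_kelvin(double celsius) {
    return celsius + 273.15;
}

// Ratio of two temperatures given in degrees Celsius, computed on the kelvin
// scale, where zero is unique and meaningful.
double temperature_ratio(double celsius_a, double celsius_b) {
    return celsius_to_kelvin(celsius_a) / celsius_to_kelvin(celsius_b);
}
```

For 20 °C and 10 °C the naive quotient of the Celsius values is 2, whereas `temperature_ratio(20.0, 10.0)` is 293.15 / 283.15, roughly 1.035.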

basic linear algebra

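vector space $(V, \oplus, \odot)$ over a field $K$

axioms of addition: $\forall\, x, y \in V : \exists\, x \oplus y \in V$ with

  $x \oplus y = y \oplus x$                                        commutativity
  $x \oplus (y \oplus z) = (x \oplus y) \oplus z$                  associativity
  $\exists\, 0 \in V : \forall\, x \in V : x \oplus 0 = x$         neutral element
  $\forall\, x \in V : \exists\, (\ominus x) \in V : x \oplus (\ominus x) = 0$    inverse element

axioms of multiplication: $\forall\, a, b \in K,\ x \in V : \exists\, a \odot x \in V$ with

  $a \odot (b \odot x) = (a \cdot b) \odot x$                      associativity
  $a \odot (x \oplus y) = a \odot x \oplus a \odot y$              distributivity
  $(a + b) \odot x = a \odot x \oplus b \odot x$                   distributivity
  $\exists\, 1 \in K : 1 \odot x = x$                              neutral element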

note

difference between $+$ and $\oplus$ and between $\cdot$ and $\odot$, namely

  $+ : K \times K \to K$          addition of scalars
  $\oplus : V \times V \to V$     addition of vectors
  $\cdot : K \times K \to K$      multiplication of scalars
  $\odot : K \times V \to V$      multiplication of a scalar and a vector

note

programmers know this! consider, say, this C++ code

    class Vector2D {
    public:
        double x;
        double y;

        Vector2D(double xval, double yval) : x(xval), y(yval) {}

        // addition of two vectors
        friend Vector2D operator+(Vector2D u, Vector2D v) {
            return Vector2D(u.x + v.x, u.y + v.y);
        }

        // multiplication of a scalar and a vector
        friend Vector2D operator*(double a, Vector2D v) {
            return Vector2D(a * v.x, a * v.y);
        }
    };

    int main() {
        double a = 2., b = 3., c = a + b;        // scalar addition
        Vector2D u(1, 0), v(0, 1), w = u + v;    // vector addition
        Vector2D s = c * w;                      // scalar times vector
        return 0;
    }

note

notationally, these differences are usually not made explicit

for example, instead of $a \odot (x \oplus y) = (a \odot x) \oplus (a \odot y)$ we typically find $a (x + y) = a x + a y$

we will henceforth follow this convention

examples

vector spaces are a dime a dozen ...

  $\mathbb{R}^m$, the vector space of real valued $m$-dim vectors
  $\mathbb{R}^{m \times n}$, the vector space of real valued $m \times n$ matrices
  $P_n$, the vector space of polynomials of degree at most $n$ over $\mathbb{R}$
  $F[a, b]$, the vector space of functions $f : [a, b] \subset \mathbb{R} \to \mathbb{R}$
  $C[a, b]$, the vector space of functions $f : [a, b] \subset \mathbb{R} \to \mathbb{C}$
  ...

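subspaces

$W \subset V$ is a subspace if it is also a vector space (with the operations inherited from $V$)

for instance

  $\mathbb{R}^2$ is a subspace of $\mathbb{R}^3$
  $\{0\}$ is a subspace of $\mathbb{R}^m$                 why?
  $\emptyset$ is not a subspace of $\mathbb{R}^m$         why not?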

examples

subspace or not?

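inner product space $(V, +, \cdot, \langle \cdot, \cdot \rangle)$

$\forall\, (x, y) \in V \times V : \exists\, \langle x, y \rangle \in K$ with

  $\langle x, y \rangle = \langle y, x \rangle^*$
  $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$
  $\langle a x, y \rangle = a \langle x, y \rangle$
  $\langle x, x \rangle \geq 0$
  $\langle x, x \rangle = 0 \Leftrightarrow x = 0$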



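examples

$x, y \in \mathbb{R}^m$

  $\langle x, y \rangle = \sum_{i=1}^{m} x_i y_i = x^T y$

$A, B \in \mathbb{R}^{m \times n}$

  $\langle A, B \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij} = \operatorname{tr}(A^T B) = \operatorname{tr}(A B^T)$

$f(x), g(x) \in C[a, b]$

  $\langle f(x), g(x) \rangle = \int_a^b f(x)\, g^*(x)\, dx$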
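The first two inner products can be checked numerically. The following is a minimal sketch, assuming row-major matrices stored as nested `std::vector`s with rows of equal length; the names `dot_product`, `frobenius_inner_product`, and `trace_of_At_B` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;  // row-major, rows of equal length

// <x, y> = sum_i x_i * y_i for x, y in R^m (sizes are assumed to match).
double dot_product(const Vector& x, const Vector& y) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
    return sum;
}

// <A, B> = sum_i sum_j a_ij * b_ij for A, B in R^{m x n}.
double frobenius_inner_product(const Matrix& A, const Matrix& B) {
    double sum = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i) sum += dot_product(A[i], B[i]);
    return sum;
}

// tr(A^T B), using that the j-th diagonal entry of A^T B is sum_i a_ij * b_ij.
double trace_of_At_B(const Matrix& A, const Matrix& B) {
    double trace = 0.0;
    const std::size_t n = A.empty() ? 0 : A[0].size();
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < A.size(); ++i) trace += A[i][j] * B[i][j];
    return trace;
}
```

Both matrix functions add up the same products $a_{ij} b_{ij}$, only in a different order, which is why the double sum and the trace agree.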

orthogonality and orthonormality

$x_i, x_j \in V$ are orthogonal, if $\langle x_i, x_j \rangle = 0$

$x_i, x_j \in V$ are orthonormal, if $\langle x_i, x_j \rangle = \delta_{ij}$

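normed space $(V, +, \cdot, \| \cdot \|)$

$\forall\, x \in V : \|x\| \in \mathbb{R}$ with

  $\|x\| \geq 0$
  $\|x\| = 0 \Leftrightarrow x = 0$
  $\|a x\| = |a| \, \|x\|$
  $\|x + y\| \leq \|x\| + \|y\|$

typically, the inner product is used to induce a norm, $\|x\| = \sqrt{\langle x, x \rangle}$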

unit vectors

$x$ is a unit vector or a vector of length or magnitude 1, if $\|x\| = \|x\|^2 = \langle x, x \rangle = 1$
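Any nonzero vector can be turned into a unit vector by dividing it by its norm. A minimal sketch for $\mathbb{R}^m$ with the norm induced by the standard inner product; the function names are hypothetical, and returning the zero vector unchanged is a choice made here for illustration.

```cpp
#include <cmath>
#include <vector>

// Norm induced by the standard inner product on R^m: ||x|| = sqrt(<x, x>).
double induced_norm(const std::vector<double>& x) {
    double sum = 0.0;
    for (double xi : x) sum += xi * xi;
    return std::sqrt(sum);
}

// Scale x to unit length; the zero vector has no direction and is returned unchanged.
std::vector<double> normalize(std::vector<double> x) {
    const double length = induced_norm(x);
    if (length == 0.0) return x;
    for (double& xi : x) xi /= length;
    return x;
}
```

For example, normalizing the vector with entries 3 and 4 (length 5) gives entries 0.6 and 0.8.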

distances

if for each ordered pair of elements $x, y \in V$ there exists a number $d(x, y) \in \mathbb{R}$ so that for all $x, y, z \in V$

  $d(x, y) \geq 0 \,\wedge\, d(x, y) = 0 \Leftrightarrow x = y$    non-negativity
  $d(x, y) = d(y, x)$                                               symmetry
  $d(x, z) \leq d(x, y) + d(y, z)$                                  triangle inequality

then $d(x, y)$ is called a distance between the vectors $x$ and $y$

$d(x, y) = \|x - y\|$ is the Euclidean distance between $x$ and $y$

observe

norms / distances are a dime a dozen, too

for instance, for $x \in \mathbb{R}^m$ and $p \in \mathbb{R}$, $p \geq 1$, the $L_p$-norm of $x$ is given by

  $\|x\|_p = \left( \sum_{i=1}^{m} |x_i|^p \right)^{1/p}$
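The formula translates almost literally into code. In this sketch the name `lp_norm` is hypothetical, and treating an infinite `p` as the maximum norm (the limit of the $L_p$-norm for $p \to \infty$) is an addition not spelled out on the slide.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// L_p norm of x for p >= 1; an infinite p yields the maximum norm.
double lp_norm(const std::vector<double>& x, double p) {
    if (std::isinf(p)) {
        double largest = 0.0;
        for (double xi : x) largest = std::max(largest, std::fabs(xi));
        return largest;
    }
    double sum = 0.0;
    for (double xi : x) sum += std::pow(std::fabs(xi), p);
    return std::pow(sum, 1.0 / p);
}
```

For the vector with entries 3 and -4, the norms for p = 1, p = 2, and infinite p are 7, 5, and 4, so the same vector has a different length under each norm.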

question

considering the space $\mathbb{R}^2$, what is a unit circle?

answer

the set $C = \{ x \in \mathbb{R}^2 \mid d(x, 0) = 1 \}$

examples

unit circles in $\mathbb{R}^2$ for $p = 1$, $p = 2$, $p = 4$, $p = 8$, and $p = \infty$ (figure)

linear combinations

given a set of vectors $\{x_1, x_2, \ldots, x_n\} \subset V$ over $K$ and a set of coefficients $\{w_1, w_2, \ldots, w_n\} \subset K$, the vector

  $x = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$

is called a linear combination of the vectors $x_i$

note

for $w_i \in \mathbb{R}$, there are different types of linear combinations

                              $w_i \geq 0$ true     $w_i \geq 0$ false
  $\sum_i w_i = 1$ true       convex                affine
  $\sum_i w_i = 1$ false      conic                 linear
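The table can be read as a small decision procedure on the coefficients. A minimal sketch, assuming a hypothetical function name `combination_type` and a floating point tolerance for the test "sum equals 1" (both are assumptions, not part of the lecture):

```cpp
#include <cmath>
#include <string>
#include <vector>

// Name the most specific type of linear combination defined by real coefficients w_i.
// The tolerance tol for the test "sum = 1" is an assumption for floating point input.
std::string combination_type(const std::vector<double>& w, double tol = 1e-12) {
    bool non_negative = true;
    double sum = 0.0;
    for (double wi : w) {
        if (wi < 0.0) non_negative = false;
        sum += wi;
    }
    const bool sums_to_one = std::fabs(sum - 1.0) <= tol;
    if (non_negative && sums_to_one) return "convex";
    if (sums_to_one) return "affine";
    if (non_negative) return "conic";
    return "linear";
}
```

The function reports the most specific name only: every convex combination is also affine and conic, and all of them are linear combinations.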

span

the span of a nonempty subset $X$ of $V$ over $K$ is

  $\operatorname{span} X = \left\{ \sum_{i=1}^{n} w_i x_i \;\middle|\; w_i \in K \wedge x_i \in X \right\}$

linear independence

a finite set of $n$ vectors $\{x_1, x_2, \ldots, x_n\} \subset V$ is linearly independent, if

  $0 = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$

only admits the trivial solution $w_1 = w_2 = \ldots = w_n = 0$
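For vectors in $\mathbb{R}^m$, one common way to test this numerically is to stack the vectors as rows of a matrix and compare its rank with the number of vectors. The sketch below uses Gaussian elimination with partial pivoting; this technique, the names `rank_of` and `linearly_independent`, and the tolerance are assumptions chosen for illustration, not something the lecture prescribes.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vector = std::vector<double>;

// Rank of the matrix whose rows are the given vectors (Gaussian elimination
// with partial pivoting). Pivot candidates smaller than tol are treated as zero.
std::size_t rank_of(std::vector<Vector> rows, double tol = 1e-10) {
    std::size_t rank = 0;
    const std::size_t cols = rows.empty() ? 0 : rows[0].size();
    for (std::size_t c = 0; c < cols && rank < rows.size(); ++c) {
        // Among the remaining rows, find the largest entry in column c.
        std::size_t pivot = rank;
        for (std::size_t r = rank + 1; r < rows.size(); ++r)
            if (std::fabs(rows[r][c]) > std::fabs(rows[pivot][c])) pivot = r;
        if (std::fabs(rows[pivot][c]) < tol) continue;  // no pivot in this column
        std::swap(rows[rank], rows[pivot]);
        // Eliminate column c from all rows below the pivot row.
        for (std::size_t r = rank + 1; r < rows.size(); ++r) {
            const double factor = rows[r][c] / rows[rank][c];
            for (std::size_t k = c; k < cols; ++k) rows[r][k] -= factor * rows[rank][k];
        }
        ++rank;
    }
    return rank;
}

// The vectors are linearly independent iff the rank equals their number.
bool linearly_independent(const std::vector<Vector>& vectors) {
    return rank_of(vectors) == vectors.size();
}
```

With floating point input the result depends on the tolerance, so nearly dependent vectors may be reported either way.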

bases and dimensions

if a finite set of $n$ vectors $\{x_1, x_2, \ldots, x_n\} \subset V$ is linearly independent and spans $V$, it is a basis of $V$

if $\{x_1, x_2, \ldots, x_n\} \subset V$ is a basis of $V$, then $\dim V = n$ is the dimension of $V$

in this course, we are mainly concerned with $\mathbb{R}^m$

let us construct a vector space

Aitchison geometry

John Aitchison (∗1926)

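standard simplex in $\mathbb{R}^m$

  $\Delta^{m-1} = \left\{ x \in \mathbb{R}^m \;\middle|\; x_i \geq 0 \wedge \sum_{i=1}^{m} x_i = 1 \right\} = \left\{ x \in \mathbb{R}^m_+ \;\middle|\; \|x\|_1 = 1 \right\}$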

note

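$\Delta^{m-1}$ is not a vector space (under ordinary addition, the entries of the sum of two of its points add up to 2, not 1)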

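space of compositional data in $\mathbb{R}^m$

  $S^m = \left\{ x \in \mathbb{R}^m \;\middle|\; x_i > 0 \wedge \sum_{i=1}^{m} x_i = 1 \right\} = \Delta^{m-1} \setminus \partial \Delta^{m-1}$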

closure, perturbation, and powering

closure of $x \in \mathbb{R}^m_+$

  $C(x) = \frac{x}{\sum_i x_i} = \frac{x}{\|x\|_1}$

perturbation of $x \in S^m$ by $y \in S^m$

  $x \oplus y = C(x_1 \cdot y_1, x_2 \cdot y_2, \ldots, x_m \cdot y_m)$

powering of $x \in S^m$ by $a \in \mathbb{R}$

  $a \odot x = C(x_1^a, x_2^a, \ldots, x_m^a)$
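The three operations can be sketched directly from their definitions. The type alias `Composition` and the function names `closure`, `perturb`, and `power` are hypothetical; inputs are assumed to have strictly positive entries and matching sizes, which the code does not check.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Composition = std::vector<double>;

// Closure C(x): divide a vector with positive entries by the sum of its entries.
Composition closure(Composition x) {
    double sum = 0.0;
    for (double xi : x) sum += xi;
    for (double& xi : x) xi /= sum;
    return x;
}

// Perturbation: closure of the entrywise product (x_1 * y_1, ..., x_m * y_m).
Composition perturb(const Composition& x, const Composition& y) {
    Composition z(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) z[i] = x[i] * y[i];
    return closure(z);
}

// Powering: closure of the entrywise power (x_1^a, ..., x_m^a).
Composition power(double a, const Composition& x) {
    Composition z(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) z[i] = std::pow(x[i], a);
    return closure(z);
}
```

Here `perturb` plays the role of vector addition and `power` the role of multiplication by a scalar; for instance, `perturb(x, power(-1.0, x))` returns the composition with all entries equal to $1/m$, up to rounding.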

example

addition vs. perturbation

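note

$(S^m, \oplus, \odot)$ is a vector space over $\mathbb{R}$, because

  $x \oplus y = y \oplus x$
  $x \oplus (y \oplus z) = (x \oplus y) \oplus z$
  $0 = \left( \tfrac{1}{m}, \ldots, \tfrac{1}{m} \right) \Leftrightarrow x \oplus 0 = x$
  $x \ominus x = x \oplus ((-1) \odot x) = 0$
  $a \odot (b \odot x) = (a b) \odot x$
  $a \odot (x \oplus y) = (a \odot x) \oplus (a \odot y)$
  $(a + b) \odot x = (a \odot x) \oplus (b \odot x)$
  $1 \odot x = x$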
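Two of these axioms, the neutral and the inverse element, can be verified by hand. The short derivation below is a sketch that only uses the definitions of closure, perturbation, and powering, the fact that $\sum_i x_i = 1$ for $x \in S^m$, and the scale invariance of the closure, $C(c\, v) = C(v)$ for $c > 0$; the abbreviation $s = \sum_j x_j^{-1}$ is introduced here for convenience.

```latex
\begin{align*}
x \oplus 0
  &= C\!\left( \tfrac{x_1}{m}, \ldots, \tfrac{x_m}{m} \right)
   = C(x_1, \ldots, x_m)
   = \frac{x}{\sum_i x_i}
   = x \\[1ex]
x \oplus \bigl( (-1) \odot x \bigr)
  &= C\!\left( x_1 \cdot \tfrac{x_1^{-1}}{s}, \ldots, x_m \cdot \tfrac{x_m^{-1}}{s} \right)
   = C\!\left( \tfrac{1}{s}, \ldots, \tfrac{1}{s} \right)
   = \left( \tfrac{1}{m}, \ldots, \tfrac{1}{m} \right)
   = 0
\end{align*}
```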

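inner product, norm, and distance

inner product of $x, y \in S^m$

  $\langle x, y \rangle_a = \frac{1}{2m} \sum_{i=1}^{m} \sum_{j=1}^{m} \ln \frac{x_i}{x_j} \, \ln \frac{y_i}{y_j}$

norm of $x \in S^m$

  $\|x\|_a = \sqrt{\langle x, x \rangle_a}$

distance between $x, y \in S^m$

  $d_a(x, y) = \|x \ominus y\|_a$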
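These three definitions can be sketched as follows. The names are hypothetical, and inputs are assumed to be compositions with strictly positive entries and equal sizes. One shortcut is taken in the distance: the entries of $x \ominus y$ are proportional to $x_i / y_i$, and since the inner product only involves ratios of entries, the closure is skipped.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Composition = std::vector<double>;

// <x, y>_a = 1/(2m) * sum_i sum_j ln(x_i / x_j) * ln(y_i / y_j)
double aitchison_inner(const Composition& x, const Composition& y) {
    const std::size_t m = x.size();
    double sum = 0.0;
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < m; ++j)
            sum += std::log(x[i] / x[j]) * std::log(y[i] / y[j]);
    return sum / (2.0 * static_cast<double>(m));
}

// ||x||_a = sqrt(<x, x>_a)
double aitchison_norm(const Composition& x) {
    return std::sqrt(aitchison_inner(x, x));
}

// d_a(x, y) = ||x (-) y||_a; the difference is proportional to (x_i / y_i),
// and the norm does not depend on the scale, so no closure is applied.
double aitchison_distance(const Composition& x, const Composition& y) {
    Composition diff(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) diff[i] = x[i] / y[i];
    return aitchison_norm(diff);
}
```

For $m = 2$ the norm reduces to $|\ln(x_1 / x_2)| / \sqrt{2}$, which gives a quick sanity check; the double loop costs $O(m^2)$ operations, which is fine for a sketch.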

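example

unit circle (figure: simplex with vertices $e_1$, $e_2$, $e_3$)

example

circles about an arbitrary point (figure: simplex with vertices $e_1$, $e_2$, $e_3$)

example

equidistant parallel lines (figure: simplex with vertices $e_1$, $e_2$, $e_3$)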

summary

we now know about

  basic terminology and concepts of linear algebra
  the Aitchison geometry of the simplex
    $\Leftrightarrow$ a beautiful and useful example of an abstract vector space