Pattern Recognition Prof. Christian Bauckhage
outline lecture 04
  recap
  types of measurements
  basic linear algebra
  Aitchison geometry
  summary
recap
a problem domain Q consists of quantifiable objects from a specific application area; the q ∈ Q are called patterns
pattern recognition deals with mathematical and technical aspects of processing and analyzing patterns
the usual goal is to map patterns to symbols or data structures in order to classify them
classes or categories result from decomposing Q into k or k + 1 subsets Ω1, Ω2, ...
if there are k + 1 classes, Ω0 is called the rejection class
a pattern has features that are characteristic of the class(es) it belongs to, and there is a function f that extracts features from patterns
  f(q) = x
classification is to map the features x extracted from a pattern q to a class Ωi
  y(x) = i
the process of building and using a classifier involves 3½ phases
  1) training phase
     validation phase (optional)
  2) test phase
  3) application phase
for training and testing, we need independent, representative, labeled samples of patterns
types of measurements
four types of data / measurements
nominal (categorical, qualitative differentials)
  examples: sex (male / female), nationality, ...
  operations: =
ordinal (allow for ranking)
  examples: health (healthy vs. sick), agreement, ...
  operations: =, <
interval (degrees of difference, distance has meaning)
  examples: dates (e.g. 50 BC), temperatures in °C, ...
  operations: =, <, +, −
ratio (zero is unique and meaningful)
  examples: length, duration, energy, temperatures in K, ...
  operations: =, <, +, −, ×, ÷
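a minimal sketch of why ÷ is listed for ratio data only: a quotient of two temperatures is meaningful on the kelvin scale, where zero is unique, but not on the Celsius scale. The function names `celsius_to_kelvin` and `temperature_ratio` are hypothetical and not part of the lecture.

```cpp
#include <cassert>

// Convert a temperature from degrees Celsius (interval scale) to kelvin (ratio scale).
double celsius_to_kelvin(double c) {
    assert(c >= -273.15);  // nothing is colder than absolute zero
    return c + 273.15;
}

// Quotient of two temperatures, formed on the kelvin scale where zero is meaningful.
// Both arguments are given in degrees Celsius; c2 must be above absolute zero.
double temperature_ratio(double c1, double c2) {
    return celsius_to_kelvin(c1) / celsius_to_kelvin(c2);
}
```

for 20 °C and 10 °C this yields a ratio of roughly 1.035, not 2: "twice as warm" is not a valid statement about interval data.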
note
different kinds of measurements require different kinds of mathematical treatment (different classification algorithms)
throughout this course, we will mostly consider ratio data
basic linear algebra
vector space (V, ⊕, ⊙) over a field K
axioms of addition: ∀ x, y ∈ V : ∃ x ⊕ y ∈ V :
  x ⊕ y = y ⊕ x                              commutativity
  x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z                  associativity
  ∃ 0 ∈ V : ∀ x ∈ V : x ⊕ 0 = x              neutral element
  ∀ x ∈ V : ∃ (⊖x) ∈ V : x ⊕ (⊖x) = 0        inverse element
axioms of multiplication: ∀ a, b ∈ K, x ∈ V : ∃ a ⊙ x ∈ V :
  a ⊙ (b ⊙ x) = (a · b) ⊙ x                  associativity
  a ⊙ (x ⊕ y) = a ⊙ x ⊕ a ⊙ y                distributivity
  (a + b) ⊙ x = a ⊙ x ⊕ b ⊙ x                distributivity
  ∃ 1 ∈ K : 1 ⊙ x = x                        neutral element
note
difference between + and ⊕ and between · and ⊙, namely
  + : K × K → K    addition of scalars
  ⊕ : V × V → V    addition of vectors
  · : K × K → K    multiplication of scalars
  ⊙ : K × V → V    multiplication of a scalar and a vector
note
programmers know this!!! consider, say, this C++ code

    class Vector2D {
    public:
        double x;
        double y;

        Vector2D(double xval, double yval) : x(xval), y(yval) {}

        // vector addition: V × V → V
        friend Vector2D operator+(const Vector2D& u, const Vector2D& v) {
            return Vector2D(u.x + v.x, u.y + v.y);
        }

        // multiplication of a scalar and a vector: K × V → V
        friend Vector2D operator*(double a, const Vector2D& v) {
            return Vector2D(a * v.x, a * v.y);
        }
    };

    int main() {
        double a = 2., b = 3., c = a + b;        // scalar addition
        Vector2D u(1, 0), v(0, 1), w = u + v;    // vector addition
        return 0;
    }
note
notationally, these differences are usually not made explicit
for example, instead of a ⊙ (x ⊕ y) = (a ⊙ x) ⊕ (a ⊙ y) we typically find a (x + y) = a x + a y
we will henceforth follow this convention
examples
vector spaces are a dime a dozen ...
  R^m, the vector space of real valued m-dim vectors
  R^(m×n), the vector space of real valued m × n matrices
  P_n, the vector space of polynomials of degree at most n over R
  F[a, b], the vector space of functions f : [a, b] ⊂ R → R
  C[a, b], the vector space of functions f : [a, b] ⊂ R → C
  ...
subspaces
W ⊂ V is a subspace if it is also a vector space
for instance
  R^2 is a subspace of R^3
  {0} is a subspace of R^m (why?)
  ∅ is not a subspace of R^m (why not?)
examples subspace or not?
inner product space (V, +, ·, ⟨·, ·⟩)
∀ (x, y) ∈ V × V : ∃ ⟨x, y⟩ ∈ K :
  ⟨x, y⟩ = ⟨y, x⟩*
  ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩
  ⟨a x, y⟩ = a ⟨x, y⟩
  ⟨x, x⟩ ≥ 0
  ⟨x, x⟩ = 0 ⇔ x = 0
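examples
x, y ∈ R^m
$$\langle x, y \rangle = \sum_{i=1}^{m} x_i y_i = x^T y$$
A, B ∈ R^(m×n)
$$\langle A, B \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij} = \operatorname{tr}(A^T B) = \operatorname{tr}(A B^T)$$
f(x), g(x) ∈ C[a, b]
$$\langle f(x), g(x) \rangle = \int_a^b f(x) \, g^*(x) \, dx$$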
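the first two inner products above can be sketched in C++ as follows. This is a minimal illustration, assuming real valued data stored in `std::vector`; the names `Vec`, `Mat`, `dot`, `frobenius` and `trace_AtB` are hypothetical. The last function computes tr(AᵀB) directly from the definition of the trace so that the identity on the slide can be checked numerically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major; all rows are assumed to have equal length

// Inner product on R^m: sum_i x_i * y_i.
double dot(const Vec& x, const Vec& y) {
    assert(x.size() == y.size());
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
    return s;
}

// Inner product on R^(m x n): sum_i sum_j a_ij * b_ij.
double frobenius(const Mat& A, const Mat& B) {
    assert(A.size() == B.size());
    double s = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i) s += dot(A[i], B[i]);
    return s;
}

// tr(A^T B): sum over the diagonal entries (A^T B)_jj = sum_i a_ij * b_ij.
double trace_AtB(const Mat& A, const Mat& B) {
    assert(A.size() == B.size() && !A.empty());
    const std::size_t n = A[0].size();
    double tr = 0.0;
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < A.size(); ++i)
            tr += A[i][j] * B[i][j];
    return tr;
}
```

both matrix functions visit the same products a_ij b_ij, only in a different order, which is why the two expressions agree.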
orthogonality and orthonormality
x_i, x_j ∈ V are orthogonal, if ⟨x_i, x_j⟩ = 0
x_i, x_j ∈ V are orthonormal, if ⟨x_i, x_j⟩ = δ_ij
normed space (V, +, ·, ‖·‖)
∀ x ∈ V : ‖x‖ ∈ R :
  ‖x‖ ≥ 0
  ‖x‖ = 0 ⇔ x = 0
  ‖a x‖ = |a| ‖x‖
  ‖x + y‖ ≤ ‖x‖ + ‖y‖
typically, the inner product is used to induce a norm
unit vectors
x is a unit vector, or a vector of length or magnitude 1, if ‖x‖ = ‖x‖₂ = √⟨x, x⟩ = 1
distances
if for each ordered pair of elements x, y ∈ V there exists a number d(x, y) ∈ R so that for all x, y, z ∈ V
  d(x, y) ≥ 0 ∧ d(x, y) = 0 ⇔ x = y      non-negativity
  d(x, y) = d(y, x)                        symmetry
  d(x, z) ≤ d(x, y) + d(y, z)              triangle inequality
then d(x, y) is called a distance between the vectors x and y
d(x, y) = ‖x − y‖ is the Euclidean distance between x and y
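a minimal sketch of the Euclidean distance for vectors in R^m, assuming coordinates stored in `std::vector`; the function name `euclidean_distance` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean distance d(x, y) = ||x - y|| between two vectors of equal length.
double euclidean_distance(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        s += d * d;  // accumulate squared coordinate differences
    }
    return std::sqrt(s);
}
```

for x = (0, 0) and y = (3, 4) the distance is 5 in either argument order, as the symmetry property requires.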
observe
norms / distances are a dime a dozen, too
for instance, for x ∈ R^m and p ∈ R, p ≥ 1, the L_p-norm of x is given by
$$\|x\|_p = \left( \sum_{i=1}^{m} |x_i|^p \right)^{1/p}$$
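the L_p-norm translates almost literally into code. The sketch below assumes coordinates in a `std::vector`; the names `lp_norm` and `linf_norm` are hypothetical. The case p = ∞ is not spelled out as a formula in the slides; the code uses the standard limit, the largest absolute coordinate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// L_p norm for finite p >= 1: (sum_i |x_i|^p)^(1/p).
double lp_norm(const std::vector<double>& x, double p) {
    assert(p >= 1.0);
    double s = 0.0;
    for (double xi : x) s += std::pow(std::fabs(xi), p);
    return std::pow(s, 1.0 / p);
}

// Limit p -> infinity: the largest absolute coordinate.
double linf_norm(const std::vector<double>& x) {
    double m = 0.0;
    for (double xi : x) m = std::max(m, std::fabs(xi));
    return m;
}
```

for x = (3, −4) the norms are 7 for p = 1, 5 for p = 2 and 4 for p = ∞; the value shrinks as p grows.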
question
considering the space R^2, what is a unit circle?
answer
the set
$$C = \{ x \in \mathbb{R}^2 \mid d(x, 0) = 1 \}$$
examples unit circles
[figure: unit circles in R^2 under the L_p-norm for p = 1, 2, 4, 8, ∞]
linear combinations
given a set of vectors {x_1, x_2, ..., x_n} ⊂ V over K and a set of coefficients {w_1, w_2, ..., w_n} ⊂ K, the vector
  x = w_1 x_1 + w_2 x_2 + ... + w_n x_n
is called a linear combination of the vectors x_i
note
for w_i ∈ R, there are different types of linear combinations

                        w_i ≥ 0 true    w_i ≥ 0 false
  Σ_i w_i = 1 true      convex          affine
  Σ_i w_i = 1 false     conic           linear
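the table can be read as a decision rule on the coefficients. A minimal sketch, assuming real coefficients in a `std::vector`; the function name `combination_type` and the tolerance `tol` used to test Σ_i w_i = 1 in floating point are hypothetical choices.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Name the most specific type of linear combination given by the coefficients w.
// tol is an assumed tolerance for the floating-point test sum_i w_i = 1.
std::string combination_type(const std::vector<double>& w, double tol = 1e-12) {
    assert(!w.empty());
    bool nonnegative = true;
    double sum = 0.0;
    for (double wi : w) {
        if (wi < 0.0) nonnegative = false;
        sum += wi;
    }
    const bool sums_to_one = std::fabs(sum - 1.0) <= tol;
    if (nonnegative && sums_to_one) return "convex";
    if (sums_to_one) return "affine";
    if (nonnegative) return "conic";
    return "linear";
}
```

the function reports the most specific type only; every convex combination is of course also affine, conic and linear.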
span
the span of a nonempty subset X of V over K is
$$\operatorname{span} X = \left\{ \sum_{i=1}^{n} w_i x_i \;\middle|\; w_i \in K \wedge x_i \in X \right\}$$
linear independence
a finite set of n vectors {x_1, x_2, ..., x_n} ⊂ V is linearly independent, if
  0 = w_1 x_1 + w_2 x_2 + ... + w_n x_n
only admits the trivial solution w_1 = w_2 = ... = w_n = 0
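for vectors in R^m, the condition above can be tested numerically: the homogeneous system has only the trivial solution exactly when the matrix formed by the vectors has rank n. The sketch below uses Gaussian elimination with partial pivoting, a technique not discussed in the slides; the names `rank_of` and `linearly_independent` and the zero threshold `eps` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Rank of the matrix whose rows are the given vectors, computed by Gaussian
// elimination with partial pivoting. eps is an assumed threshold for "zero".
std::size_t rank_of(std::vector<std::vector<double>> rows, double eps = 1e-10) {
    if (rows.empty()) return 0;
    const std::size_t n = rows.size(), m = rows[0].size();
    for (const auto& r : rows) assert(r.size() == m);
    std::size_t rank = 0;
    for (std::size_t col = 0; col < m && rank < n; ++col) {
        // choose the remaining row with the largest entry in this column
        std::size_t pivot = rank;
        for (std::size_t r = rank + 1; r < n; ++r)
            if (std::fabs(rows[r][col]) > std::fabs(rows[pivot][col])) pivot = r;
        if (std::fabs(rows[pivot][col]) < eps) continue;  // no pivot in this column
        std::swap(rows[rank], rows[pivot]);
        // eliminate the column entry from all rows below the pivot row
        for (std::size_t r = rank + 1; r < n; ++r) {
            const double f = rows[r][col] / rows[rank][col];
            for (std::size_t c = col; c < m; ++c) rows[r][c] -= f * rows[rank][c];
        }
        ++rank;
    }
    return rank;
}

// The vectors are linearly independent iff the rank equals their number.
bool linearly_independent(const std::vector<std::vector<double>>& vectors) {
    return rank_of(vectors) == vectors.size();
}
```

with floating-point input the answer depends on `eps`, so nearly dependent vectors may be classified either way.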
bases and dimensions
if a finite set of n vectors {x_1, x_2, ..., x_n} ⊂ V is linearly independent and spans V, it is a basis of V
if {x_1, x_2, ..., x_n} ⊂ V is a basis of V, then dim V = n is the dimension of V
in this course, we are mainly concerned with R^m
let us construct a vector space
Aitchison geometry
John Aitchison (∗1926)
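standard simplex in R^m
$$\Delta^{m-1} = \left\{ x \in \mathbb{R}^m \;\middle|\; x_i \geq 0 \wedge \sum_{i=1}^{m} x_i = 1 \right\} = \left\{ x \in \mathbb{R}^m_+ \;\middle|\; \|x\|_1 = 1 \right\}$$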
note
Δ^(m−1) is not a vector space
space of compositional data in R^m
$$S^m = \left\{ x \in \mathbb{R}^m \;\middle|\; x_i > 0 \wedge \sum_{i=1}^{m} x_i = 1 \right\} = \Delta^{m-1} \setminus \partial\Delta^{m-1}$$
closure, perturbation, and powering
closure of x ∈ R^m_+
$$C(x) = \frac{x}{\sum_i x_i} = \frac{x}{\|x\|_1}$$
perturbation of x ∈ S^m by y ∈ S^m
$$x \oplus y = C(x_1 \cdot y_1, x_2 \cdot y_2, \ldots, x_m \cdot y_m)$$
powering of x ∈ S^m by a ∈ R
$$a \odot x = C(x_1^a, x_2^a, \ldots, x_m^a)$$
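the three operations are short enough to sketch directly. The code below assumes compositions stored as `std::vector<double>` with strictly positive entries; the names `Composition`, `closure`, `perturb` and `power` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Composition = std::vector<double>;

// Closure C(x): divide a vector with positive entries by the sum of its entries.
Composition closure(const Composition& x) {
    double sum = 0.0;
    for (double xi : x) {
        assert(xi > 0.0);
        sum += xi;
    }
    Composition c(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) c[i] = x[i] / sum;
    return c;
}

// Perturbation: closure of the component-wise product x_i * y_i.
Composition perturb(const Composition& x, const Composition& y) {
    assert(x.size() == y.size());
    Composition p(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) p[i] = x[i] * y[i];
    return closure(p);
}

// Powering: closure of the component-wise power x_i^a.
Composition power(double a, const Composition& x) {
    Composition p(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) p[i] = std::pow(x[i], a);
    return closure(p);
}
```

as a quick check, perturbing x = (1/4, 1/2, 1/4) by (1/3, 1/3, 1/3) returns x, and perturbing x by `power(-1.0, x)` returns (1/3, 1/3, 1/3), up to rounding.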
example addition vs. perturbation
note
(S^m, ⊕, ⊙) is a vector space over R, because
  x ⊕ y = y ⊕ x
  x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
  0 = (1/m, ..., 1/m) ⇔ x ⊕ 0 = x
  x ⊖ x = x ⊕ (−1 ⊙ x) = 0
  a ⊙ (b ⊙ x) = (ab) ⊙ x
  a ⊙ (y ⊕ x) = (a ⊙ x) ⊕ (a ⊙ y)
  (a + b) ⊙ x = (a ⊙ x) ⊕ (b ⊙ x)
  1 ⊙ x = x
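inner product, norm, and distance
inner product of x, y ∈ S^m
$$\langle x, y \rangle_a = \frac{1}{2m} \sum_{i=1}^{m} \sum_{j=1}^{m} \ln\frac{x_i}{x_j} \, \ln\frac{y_i}{y_j}$$
norm of x ∈ S^m
$$\|x\|_a = \sqrt{\langle x, x \rangle_a}$$
distance between x, y ∈ S^m
$$d_a(x, y) = \|x \ominus y\|_a$$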
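a minimal sketch of these three quantities, assuming compositions stored as `std::vector<double>` with strictly positive entries; the names `aitchison_inner`, `aitchison_norm` and `aitchison_distance` are hypothetical. The inner product only involves ratios x_i / x_j, so the difference of two compositions can be passed in unnormalized: the closure would cancel anyway.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Composition = std::vector<double>;

// Aitchison inner product: (1 / 2m) * sum_i sum_j ln(x_i / x_j) * ln(y_i / y_j).
double aitchison_inner(const Composition& x, const Composition& y) {
    assert(x.size() == y.size() && !x.empty());
    const std::size_t m = x.size();
    double s = 0.0;
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < m; ++j)
            s += std::log(x[i] / x[j]) * std::log(y[i] / y[j]);
    return s / (2.0 * static_cast<double>(m));
}

// Aitchison norm: square root of the inner product of x with itself.
double aitchison_norm(const Composition& x) {
    return std::sqrt(aitchison_inner(x, x));
}

// Aitchison distance: norm of the difference, whose components are
// proportional to x_i / y_i (closure omitted, see the note above).
double aitchison_distance(const Composition& x, const Composition& y) {
    assert(x.size() == y.size());
    Composition q(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) q[i] = x[i] / y[i];
    return aitchison_norm(q);
}
```

the neutral element (1/m, ..., 1/m) has norm 0 under this definition, as expected of the zero vector, because all of its log-ratios vanish.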
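example unit circle
[figure: unit circle in the simplex with vertices e1, e2, e3]
example circles about an arbitrary point
[figure: circles about an arbitrary point in the simplex with vertices e1, e2, e3]
example equidistant parallel lines
[figure: equidistant parallel lines in the simplex with vertices e1, e2, e3]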
summary
we now know about
  basic terminology and concepts of linear algebra
  the Aitchison geometry of the simplex ⇔ a beautiful and useful example of an abstract vector space