Professor Dr. Ulrich Bühler
Matrix Algebra
Matrix Algebra and Applications

1   Algebraic Foundations
1.1 Vectors and Vector Spaces
1.2 Matrices and Determinants
1.3 Algebraic Equations

2   The Eigenvalue Problem
2.1 Eigenvalues and Eigenvectors
2.2 Numerical Methods
2.3 Functions of a Matrix
Department of Computer Science Tel. +49-661-9640-325/-300 Fax +49-661-9640-349 E-Mail:
[email protected]
Fulda University of Applied Sciences Marquardstraße 35 36039 Fulda, Germany
References

J. B. Fraleigh, A First Course in Abstract Algebra, Addison-Wesley, Fifth Edition, 1994
Glyn James, Modern Engineering Mathematics, Addison-Wesley, 1992
Glyn James, Advanced Modern Engineering Mathematics, Addison-Wesley, 1993
H. F. Mattson, Discrete Mathematics with Applications, John Wiley & Sons, 1993
Jonny Olive, Maths – A Student Survival Guide, Cambridge University Press, 1998
1 Algebraic Foundations

1.1 Vectors and Vector Spaces

What is a vector?
One meets vectors in physical contexts such as force, velocity, position and so on; they have both magnitude and direction and live in two- or three-dimensional space; a generalization to n-space is necessary.
Let R be the set of all real numbers. Then we consider the Cartesian product R^n = {(x1, ..., xn) | xi ∈ R}. Any real n-tuple

x = (x1, ..., xn) ∈ R^n

is called an n-dimensional vector (n is called the dimension).
Remark: if there is no confusion we leave out the arrow over the vector symbol.
In particular:
n = 2:  a = (2, 5), b = (3, -7)        (two-dimensional)
n = 3:  x = (1, 1, 1), y = (3, -1, 2)  (three-dimensional)
Graphical representation in 3 dimensions: arrows.

Cartesian coordinate system: we refer to a system of three straight axes x, y, z in space which are at right angles to each other; the crossing point is the origin 0. We denote by

e1 = (1, 0, 0),  e2 = (0, 1, 0),  e3 = (0, 0, 1)

the vectors from the origin directed along the x, y, z axes respectively. So any vector from the origin may be written in the form

v = (x, y, z) = x e1 + y e2 + z e3

in the Cartesian coordinate system {0; e1, e2, e3}.
Vector from a point A(ax, ay, az) to another point B(bx, by, bz):

v = (bx - ax, by - ay, bz - az)

Magnitude and direction of a vector v = (x, y, z):

‖v‖ = √(x² + y² + z²) is called the magnitude of v (length, norm).

The direction of v is represented by the angles α, β, γ between v and the three axes respectively; we have

cos(α) = x/‖v‖,  cos(β) = y/‖v‖,  cos(γ) = z/‖v‖

Notice: the location of a vector is not specified, only its magnitude and direction.
Operations with vectors in real n-space

Addition: x + y = (x1 + y1, ..., xn + yn) with x, y ∈ R^n

Scalar multiplication: α·x = (αx1, αx2, ..., αxn), α ∈ R and x ∈ R^n
(in R³ the length of αx is |α| times the length of x; the direction of αx is the same as that of x if α > 0, and the opposite if α < 0)

Difference: x - y = x + (-1)·y

Norm of a vector: ‖x‖ = √(x1² + ... + xn²)
Notice: the norm is a function ϑ: R^n → R with the properties
(i)   ϑ(t·x) = |t|·ϑ(x)
(ii)  ϑ(x + y) ≤ ϑ(x) + ϑ(y)   (triangle inequality)
(iii) ϑ(x) ≥ 0, and ϑ(x) = 0 only for x = 0

Dot product (scalar product, inner product): x·y = x1 y1 + ... + xn yn
Notice: the inner product is a function φ: R^n × R^n → R with the properties
(i)   φ(x, y) = φ(y, x)
(ii)  t·φ(x, y) = φ(t·x, y) = φ(x, t·y)
(iii) φ(x + y, z) = φ(x, z) + φ(y, z)
(iv)  φ(x, x) > 0 for x ≠ 0
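The n-space operations above translate directly into code. A minimal sketch in Python (the helper names dot and norm are our own, not from the notes):

```python
import math

def dot(x, y):
    """Dot product x·y = x1*y1 + ... + xn*yn."""
    assert len(x) == len(y), "vectors must have the same dimension"
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm ||x|| = sqrt(x·x)."""
    return math.sqrt(dot(x, x))

x = (1.0, 1.0, 1.0)
y = (3.0, -1.0, 2.0)
print(dot(x, y))            # 3 - 1 + 2 = 4.0
print(norm((3.0, 4.0)))     # 5.0
print(norm(x))              # √3 ≈ 1.732...
```

Note that norm is defined via dot, mirroring the relation ‖x‖² = x·x.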
Angle between two vectors: x·y = ‖x‖·‖y‖·cos(α)

Notice: this definition of the angle α is possible because of the relation

-1 ≤ (x·y) / (‖x‖·‖y‖) ≤ +1

Two vectors x, y are called orthogonal iff their dot product is zero: x·y = 0.
Cross product (only in 3-space): the vector v = x × y with
(i)   length ‖v‖ = ‖x‖·‖y‖·sin(x, y)
(ii)  v orthogonal to both x and y
(iii) x, y, v forming a right-handed system

We have (without derivation), writing the cross product as a symbolic determinant and expanding along the first row:

        | e1 e2 e3 |   | x2 x3 |      | x1 x3 |      | x1 x2 |
x × y = | x1 x2 x3 | = | y2 y3 | e1 - | y1 y3 | e2 + | y1 y2 | e3
        | y1 y2 y3 |

that is, x × y = (x2 y3 - x3 y2, x3 y1 - x1 y3, x1 y2 - x2 y1).

‖x × y‖ = the area of the parallelogram with sides x and y.
Two vectors x, y are parallel iff their cross product is the zero vector.

Properties:
x × y = -y × x
α(x × y) = (αx) × y = x × (αy), α a scalar
x × (y + z) = x × y + x × z

Special vectors:
0 := (0, ..., 0)  zero vector (direction is not defined)
ei := (0, ..., 0, 1, 0, ..., 0) with the 1 in position i, 1 ≤ i ≤ n
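The componentwise formula can be sketched in a few lines (illustrative helper names; plain tuples stand in for 3-vectors):

```python
def dot(x, y):
    """Dot product, used here only to check orthogonality."""
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    """Cross product in 3-space, read off from the determinant expansion."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

e1, e2 = (1, 0, 0), (0, 1, 0)
v = cross(e1, e2)
print(v)                        # (0, 0, 1), i.e. e1 × e2 = e3
print(cross(e2, e1))            # (0, 0, -1): anticommutativity x × y = -y × x
print(dot(v, e1), dot(v, e2))   # 0 0: v is orthogonal to both factors
```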
Vector space (V, +, *, K)
V: a nonempty set with elements a, b, c, ...      (vectors)
K = (K, ⊕, ⊗): a field with elements α, β, ...    (scalars)
Binary operation +: V × V → V
"Multiplication" *: K × V → V

(A) The structure (V, +) forms a commutative group:
(i)   x + y = y + x                (commutativity)
(ii)  x + (y + z) = (x + y) + z    (associativity)
(iii) x + 0 = x for all x ∈ V      (neutral element 0)
(iv)  x + (-x) = 0                 (inverse element)

(B) The "multiplication" fulfills the properties:
(i)   α * x ∈ V for x ∈ V, α ∈ K
(ii)  α * (β * x) = (α ⊗ β) * x
(iii) (α ⊕ β) * x = α * x + β * x and α * (x + y) = α * x + α * y
(iv)  1 * x = x for all x ∈ V

A vector space with an inner product is called a Euclidean vector space.

Examples
(1) The real numbers form a vector space (R, +, ·, R).
(2) The n-dimensional real vectors x, y, z, ... with the real scalars α, β, γ, ... define a Euclidean vector space (R^n, +, ·, R).
(3) The real (m, n)-matrices form a vector space (A_{m,n}, +, *, R).
Linear independence of vectors
The vectors a1, ..., am ∈ V are called linearly independent iff α1 * a1 + ... + αm * am = 0 implies that α1 = 0, ..., αm = 0.

The set of all vectors of the form v := α1 * a1 + ... + αm * am for some scalars α1, ..., αm ∈ K forms the vector space S(a1, ..., am); it is a subspace of (V, +, *, K).

Basis of (V, +, *, K)
The set B = {b1, ..., bn} of vectors of (V, +, *, K) is called a basis of the vector space if
(a) S(b1, ..., bn) = V,
(b) {b1, ..., bn} are linearly independent.

Examples
(1) The set {e1, ..., en} is a basis of (R^n, +, ·, R).
(2) The set {E11, ..., E1n, E21, ..., E2n, ..., Em1, ..., Emn} of (m, n)-matrices Eij, each with entry 1 at position (i, j) and 0 everywhere else, forms a basis of the vector space (A_{m,n}, +, *, R).
1.2 Matrices and Determinants
What is a matrix? One needs matrices to organize a large number of quantities; in a certain manner a matrix is a collection of several n-dimensional vectors.

A matrix A is a rectangular array of quantities in m rows and n columns:

    ( a11 a12 .... a1n )
    ( a21 a22 .... a2n )
A = (  .    .        . ) ,    A_{m×n} = {aij}
    (  .    .        . )
    ( am1 am2 .... amn )

m × n is called the form or dimension of the matrix.
M^K_{m×n} denotes the set of all (m × n)-matrices with elements from the field K.

Two matrices A, B are said to be equal if they are of the same dimension and if their corresponding elements are equal. The quantities aij are called the elements and can be arbitrary objects like real numbers, complex numbers, functions, differential operators or even matrices themselves.
m = n: square matrix of order n  →  M^K_{n×n}

A = {aij} is a diagonal matrix of order n if the only nonzero elements lie on the main diagonal: A = diag(a11, ..., ann) (that is, aij = 0 for i ≠ j).
In particular: I_n = diag(1, ..., 1) is called the identity (unit) matrix.
Operations with matrices from M^K_{m×n}

Matrix addition: A + B = {aij + bij} with A, B ∈ M^K_{m×n} (the same dimension)

Scalar multiplication: αA = {α·aij}, α ∈ K, A ∈ M^K_{m×n}, (αA) ∈ M^K_{m×n}
especially: -A = (-1)A, the additive inverse of A

Difference: A - B = A + (-B)

Matrix multiplication: C = A·B with C = {cij} ∈ M^K_{m×p} and

cij = Σ_{k=1}^{n} aik bkj,

where A ∈ M^K_{m×n} and B ∈ M^K_{n×p} (conformable matrices); conformable for multiplication means that the number of columns of A is equal to the number of rows of B.
Notice: matrix multiplication means carrying out the scalar product of each row of the first matrix with each column of the second one.

Some properties:
A + B = B + A,  A + (B + C) = (A + B) + C
A + 0 = A,  A + (-A) = 0
α(βA) = (αβ)A,  (α + β)A = αA + βA
1A = A,  0A = 0
A·B ≠ B·A in general,  A·(B + C) = A·B + A·C
A·(B·C) = (A·B)·C,  I·A = A·I = A
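The row-by-column rule can be sketched directly; here matrices are plain lists of rows, and matmul is an illustrative helper, not part of the notes:

```python
def matmul(A, B):
    """C = A·B with c_ij = sum_k a_ik * b_kj; A must have as many columns as B has rows."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "matrices are not conformable"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
print(matmul(B, A))   # [[3, 4], [1, 2]], so A·B ≠ B·A in general
```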
Elementary row or column operations
(i)   interchanging two rows / columns
(ii)  multiplying the elements of one row / column by a non-zero real number
(iii) adding to the elements of one row / column any multiple of the corresponding elements of another row / column

Elementary matrix: a matrix which can be obtained from the identity matrix by performing just one elementary operation.

Two matrices A, B ∈ M^K_{m×n} are equivalent (symbol: A ≡ B) if B can be obtained from A by a finite number of elementary operations op_i, 1 ≤ i ≤ k:

A ≡ B :⇔ B = op_k(op_{k-1}(... op_1(A) ...))

Helpful property for matrix calculations: every regular square matrix with elements from a field K = (K, +, *) can be represented as a finite product of elementary matrices.
Some special matrices
A^T:  the transposed matrix of A, with elements (A^T)ij = aji
A = A^T:  symmetric matrix
Ā^T:  the adjoint (conjugate transpose) matrix of A
Ā^T = A:  Hermitian matrix
Determinant
A scalar quantity associated with a square matrix A_{n×n}; a function (like a rule) det: M^R_{n×n} → R.

n = 1:  det A = a11
n = 2:  det A = a11 a22 - a21 a12
n ≥ 3:  more generally:

Let σ = (k1 k2 ... kn) be a permutation of the set N_n^+ := {1, 2, ..., n}, S_n the set of all permutations of N_n^+, and I(σ) the number of inversions of σ. Then the determinant of A is

det A := Σ_{σ ∈ S_n} (-1)^{I(σ)} a_{1 k1} a_{2 k2} ... a_{n kn}
In particular: n = 3 ⇒ rule of Sarrus.

det A = det A^T

Some notations:
The minor M_ij of the element aij in A is the determinant of order n-1 that survives when the ith row and the jth column are struck out.
The cofactor A_ij of the element aij in A is defined as A_ij = (-1)^{i+j} M_ij.

Cofactor expansion of a determinant by its ith row (Theorem of Laplace):

det A = a_{i1} A_{i1} + a_{i2} A_{i2} + ... + a_{in} A_{in}

Notice: if all elements in a row are zero then det A = 0.
If det A = 0 then A is called singular, otherwise regular.
Example: compute the determinant of

    ( 0  2  -1 )
A = ( 4  3   5 )
    ( 2  0  -4 )

Using the expansion with i = 1 we have

det A = a11 A11 + a12 A12 + a13 A13
      = 0·(-1)²·M11 + 2·(-1)³·M12 + (-1)·(-1)⁴·M13

with the minors

M11 = | 3  5 | ,  M12 = | 4  5 | ,  M13 = | 4  3 |
      | 0 -4 |          | 2 -4 |          | 2  0 |

det A = 0·(-12 + 0) - 2·(-16 - 10) + (-1)·(0 - 6) = 58
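The Laplace expansion lends itself to a short recursive sketch (fine for small n only; the cost grows like n!, so this is illustrative, not practical):

```python
def det(A):
    """Determinant by cofactor expansion along the first row (Theorem of Laplace)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # strike out row 1, column j+1
        total += (-1) ** j * A[0][j] * det(minor)          # sign (-1)^(1+(j+1)) = (-1)^j
    return total

A = [[0, 2, -1],
     [4, 3,  5],
     [2, 0, -4]]
print(det(A))   # 58, matching the example above
```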
Properties of determinants
(i)   Two equal rows or columns:
      det(a1, ..., ar, ..., ar, ..., an) = 0
(ii)  Multiple of a row or column by a scalar:
      det(a1, ..., t·ar, ..., an) = t·det(a1, ..., ar, ..., an)
(iii) Interchange of two rows or columns:
      det(a1, ..., ar, ..., as, ..., an) = -det(a1, ..., as, ..., ar, ..., an)
(iv)  One row or column a multiple of another:
      det(a1, ..., ar, ..., t·ar, ..., an) = 0
(v)   Adding multiples of a row or column:
      det(a1, ..., ar + t·as, ..., an) = det(a1, ..., ar, ..., an)
Inverse Matrix
Let A be of order n. Any matrix Y of order n which satisfies the equation

A·Y = Y·A = I

is called the inverse of A, denoted by Y ≡ A^{-1}. The matrix A is then said to be invertible.

Notice: if A is invertible, then A^{-1} is unique. A has an inverse iff det A ≠ 0.

Some properties:
(A^{-1})^{-1} = A
(A·B)^{-1} = B^{-1}·A^{-1}
(A^T)^{-1} = (A^{-1})^T
How to obtain the inverse of a matrix A?
(1) compute all cofactors A_ij of the elements aij in A:
    A_ij = (-1)^{i+j} M_ij,  i, j = 1, ..., n
(2) compute the determinant of A by cofactor expansion:
    det A = a_{i1} A_{i1} + a_{i2} A_{i2} + ... + a_{in} A_{in},  i fixed
(3) compute

                     ( A11 A21 ... An1 )
A^{-1} = (1/det A) · ( A12 A22 ... An2 )
                     (  .   .       .  )
                     ( A1n A2n ... Ann )

(note the transposed arrangement: the cofactor A_ji appears at position (i, j)).

But in practice we use numerical methods!
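The three steps can be sketched with exact arithmetic via the standard library's fractions module (helper names are ours; as the notes warn, this cofactor approach is for illustration, not for practical computation):

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def inverse(A):
    """A^{-1} = (1/det A) times the transposed matrix of cofactors."""
    n, d = len(A), det(A)
    if d == 0:
        raise ValueError("det A = 0: the matrix is singular, no inverse exists")
    def cofactor(i, j):
        minor = [r[:j] + r[j + 1:] for k, r in enumerate(A) if k != i]
        return (-1) ** (i + j) * det(minor)
    # note the transposition: entry (i, j) of A^{-1} uses the cofactor A_{ji}
    return [[Fraction(cofactor(j, i), d) for j in range(n)] for i in range(n)]

A = [[2, 1],
     [5, 3]]
print(inverse(A))   # [[3, -1], [-5, 2]] as Fractions, since det A = 1
```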
Rank
The maximal number of linearly independent rows or columns of the matrix A_{m×n}. Obviously we have: rk(A) ≤ min(m, n).

Criterion of Frobenius: rk(A) = the order r of the largest square submatrix with non-zero determinant.

Rank criterion: let the matrix B be obtained from a matrix A by a finite sequence of elementary row or column operations. Then the matrices A, B have the same rank:

A, B ∈ M^K_{m×n} with A ≡ B ⇒ rk(B) = rk(A)

(the maximal number of linearly independent rows or columns has not changed).

Computational method:
1. Transform the matrix A by elementary row or column operations so that zero elements arise (in particular we get an upper triangular matrix).
2. Compute the order r of the largest square submatrix with non-zero determinant (use the properties of determinants).

More generally: every matrix A ∈ M^K_{m×n} with rank rk(A) = r is equivalent to the matrix of the form

I_{m×n} := ( I_r          0_{r×(n-r)}     )
           ( 0_{(m-r)×r}  0_{(m-r)×(n-r)} )

⇒ Numerical algorithm: apply elementary operations on A ∈ M^K_{m×n} until the matrix I_{m×n} is reached.
1.3 Algebraic Equations

What is an algebraic equation?
An equation f(x) = 0 is said to be algebraic (polynomial) over the field (K, +, *) if the function f is expressible in the form

f(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0,  a_n ≠ 0, a_i ∈ K;

n ∈ N is called the degree. Otherwise f is said to be transcendental.

Examples:
algebraic:      3x² + x - 2 = 0,  -2x⁴ + 5x² + x = 0
transcendental: e^x - sin(x) = 0,  x + 1 - cos(x) = 0

In particular:
n = 1:  f(x) = mx + n        (linear, first-degree polynomial)
n = 2:  f(x) = ax² + bx + c  (nonlinear, second-degree)
Graphical representation: straight lines, parabolas

Special forms:
zeros equation    f(x) = 0: the solutions are the zeros of f
fixpoint equation f(x) = x: the solutions are the fixpoints of f, that is, the points of intersection between the graphs of f(x) and g(x) = x

The collection of all the solutions is called the solution set.
Systems of equations
More than one equation and more than one unknown, say m equations in n unknowns:

f1(x1, x2, ..., xn) = 0
f2(x1, x2, ..., xn) = 0
.............................
fm(x1, x2, ..., xn) = 0

If there exist solutions, the system is consistent (otherwise inconsistent). If there is precisely one solution: unique solution (otherwise nonunique).

Important special case: linear algebraic systems

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
..................................................
am1 x1 + am2 x2 + ... + amn xn = bm

with aij, bi ∈ K, 1 ≤ i ≤ m, 1 ≤ j ≤ n, or in matrix notation:

A_{m×n} · x = b  with A ∈ M^K_{m×n}, b ∈ K^m, x ∈ K^n

Solution methods:
1. elimination methods: Gauss elimination (most important)
2. iterative methods: Jacobi method, Gauss-Seidel iteration

Notice: two linear systems are equivalent if one can be obtained from the other by a finite number of elementary operations (interchange of two equations, addition of a multiple of one equation to another, multiplication of an equation by a nonzero constant).
Solving applied problems
Diagram with the major conceptual steps to solve applied problems:

Original statement of problem → (formulation) → Mathematical statement of problem → (solution) → Solution of mathematical problem → (interpretation) → Solution in the language of the original problem

Simulation:
Real system → (projection) → Mathematical model → (implementation) → Computer model; calculations with the computer model, interpretation of the results, and gain of insight into the behavior of the real system.
Solution of a nonlinear system
We consider the vector equation f(x) = 0. In the process of solving a problem or setting up a mathematical model we encounter an equation of this form. (Real problem → mathematical model)
How do we get a solution of this zeros equation?
Problem: unfortunately it is often difficult or impossible to calculate these zeros exactly.

Numerical methods (mathematical model → computer model)
Solution idea: an iteration process, repeating a procedure over and over, starting with an approximation value, until a desired degree of accuracy of the approximation is obtained:

x_{k+1} = F(x_k),  k = 0, 1, ...,  with starting value x0 ∈ I(x*)
⇒ sequence {x_k} with lim_{k→∞} x_k = x* and f(x*) = 0

Newton's method
Approximation scheme for Newton's method:

x_{k+1} = x_k - [f'(x_k)]^{-1} f(x_k),  k = 0, 1, ...

with initial value x0 ∈ U(x*).
In particular, for m = n = 1:

x_{k+1} = x_k - f(x_k)/f'(x_k),  k = 0, 1, ...   (if f'(x_k) ≠ 0)
Example: calculate √3 using Newton's method.
Take f(x) = x² - 3, so the solution of f(x) = 0 is the value √3.

f'(x) = 2x ⇒ the approximation scheme of Newton's method is

x_{n+1} = x_n - (x_n² - 3)/(2 x_n) = (1/2)(x_n + 3/x_n),  n = 0, 1, ...

f(1) = -2 < 0, f(2) = 1 > 0 ⇒ x* ∈ [1, 2] =: I(x*)

initial guess: choose x0 = 1 ∈ I(x*)
iteration process:
x1 = (1/2)(1 + 3/1) = 2,  x2 = (1/2)(2 + 3/2) = 1.75,
x3 = ?,  x4 = ?, .....
⇒ {x_k};  stop if |x_{k+1} - x_k| < 0.001
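The iteration for √3, with the stopping rule from the notes, takes only a few lines (the function name is our own):

```python
def newton_sqrt3(x0=1.0, tol=1e-3):
    """Newton's method for f(x) = x^2 - 3: x_{n+1} = (x_n + 3/x_n) / 2."""
    xk = x0
    while True:
        xk1 = 0.5 * (xk + 3.0 / xk)
        if abs(xk1 - xk) < tol:   # stop if |x_{k+1} - x_k| < 0.001
            return xk1
        xk = xk1

print(newton_sqrt3())   # ≈ 1.7320508, the first iterates being 1, 2, 1.75, ...
```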
Now we come back to the general case of m equations in n unknowns of the form

f1(x1, x2, ..., xn) = 0
f2(x1, x2, ..., xn) = 0        ⇔   f(x) = 0 with x ∈ K^n, f: K^n → K^m
.............................
fm(x1, x2, ..., xn) = 0

Newton's method
Approximation scheme for Newton's method:

x_{k+1} = x_k - [f'(x_k)]^{-1} f(x_k),  k = 0, 1, ...

Numerical realisation: set u_k := x_{k+1} - x_k and

                          ( f11 ........... f1n )
A_{m×n}(x_k) := f'(x_k) ≡ (  .               .  ) (x_k) ∈ M^K_{m×n}
                          ( fm1 ........... fmn )

where f_ij(x_k) := ∂f_i/∂x_j (x1^k, ..., xn^k), 1 ≤ i ≤ m, 1 ≤ j ≤ n, are the partial derivatives of f with respect to x at x_k. Then each step is:

(a) solve the linear system   A_{m×n}(x_k) · u_k = -f(x_k)
(b) iterate                   x_{k+1} = x_k + u_k
System of linear algebraic equations

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
................................................
am1 x1 + am2 x2 + ... + amn xn = bm

in matrix notation: A x = b, where A is the m × n coefficient matrix

    ( a11 a12 .... a1n )
A = ( a21 a22 .... a2n ) ,    A_{m×n} = (aij) ∈ M^K_{m×n},
    (  .    .        . )
    ( am1 am2 .... amn )

x = (x1, ..., xn)^T ∈ K^n is the n-dimensional unknown column vector and b = (b1, ..., bm)^T ∈ K^m is the given m-dimensional column vector.

b = 0: homogeneous system
b ≠ 0: inhomogeneous system

Solution set: S(A, b) := {x ∈ K^n | A·x = b};  S(A, b) = ∅: unsolvable

Two linear systems A1·x = b1, A2·x = b2 are equivalent if they have the same solution set: S(A1, b1) = S(A2, b2).
How to solve a linear system?

I. Quadratic case: Ax = b with A ∈ M^K_{n×n} and x, b ∈ K^n

(A) det A ≠ 0
Because of det A ≠ 0 the inverse A^{-1} exists, and multiplying both sides by A^{-1} gives

A^{-1}(Ax) = A^{-1}b,  (A^{-1}A)x = A^{-1}b  or  Ix = A^{-1}b,

and finally x = A^{-1}b is the unique solution of the system.

⇒ (1) det A ≠ 0 ?
  (2) calculate the inverse A^{-1}
  (3) x = A^{-1}b is the unique solution

(B) det A = 0
Because of det A = 0 the inverse A^{-1} does not exist. Some rows are linearly dependent, so we have fewer independent rows than unknowns: either there are no solutions or infinitely many ⇒ case II.
Numerical method: Cramer's rule
We have seen that if A is invertible then the solution is x = A^{-1}b. Cramer's rule avoids the use of the inverse of A.

(1) D := det A ≠ 0 ?
(2) Let D_i be the determinant of the matrix that results from replacing the ith column of A by the vector b. Compute D1, ..., Dn.
(3) The solution vector is given by

x1 = D1/D,  x2 = D2/D,  ...,  xn = Dn/D

Notice: Cramer's rule is mainly of theoretical importance as an explicit expression of the solution. It is not used for systems containing many equations.
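A sketch of the rule for the 3 × 3 case, with det3 implementing the rule of Sarrus (both helper names and the diagonal test system below are our own illustrations):

```python
def det3(M):
    """Determinant of a 3x3 matrix via the rule of Sarrus."""
    return (M[0][0] * M[1][1] * M[2][2] + M[0][1] * M[1][2] * M[2][0]
            + M[0][2] * M[1][0] * M[2][1] - M[0][2] * M[1][1] * M[2][0]
            - M[0][0] * M[1][2] * M[2][1] - M[0][1] * M[1][0] * M[2][2])

def cramer(A, b):
    """Solve A x = b for a regular 3x3 matrix A via x_i = D_i / D."""
    D = det3(A)
    if D == 0:
        raise ValueError("D = det A = 0: Cramer's rule is not applicable")
    xs = []
    for i in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = b[r]          # replace the ith column of A by b
        xs.append(det3(Ai) / D)
    return xs

A = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]
b = [4, 9, 8]
print(cramer(A, b))   # [2.0, 3.0, 2.0]
```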
II. Non-quadratic case: Ax = b with A ∈ M^K_{m×n} and x ∈ K^n, b ∈ K^m

(i) We consider the homogeneous system Ax = 0:
Obviously every homogeneous system has the solution x = 0. A non-trivial solution x ≠ 0 exists (for square A) iff det A = 0.
If there are two non-trivial solutions x1, x2 then we have infinitely many solutions:

A·(αx1 + βx2) = α(A·x1) + β(A·x2) = 0.

(ii)
We consider the inhomogeneous system Ax = b, b ≠ 0:
The matrix B := (A | b) ∈ M^K_{m×(n+1)} is called the augmented matrix of the linear system.

Existence of solutions:       rk(A) = rk(B) !
Uniqueness of the solution:   rk(A) = n
Infinitely many solutions:    rk(A) = rk(B) = r < n

Transformation: by finitely many elementary operations every inhomogeneous system can be transformed into an equivalent system of the form

a11 x1 + a12 x2 + ... + a1n xn = b1
         a'22 x2 + ... + a'2n xn = b'2
                  ...
         a''qq xq + ... + a''qn xn = b''q
                                 0 = b''_{q+1}
                                 ...
                                 0 = b''_n

⇒ the basis for numerical methods like Gauss-Jordan elimination.
Numerical method: Gauss-Jordan elimination
In the general case we have a large number of equations, so we need a solution procedure that amounts to a systematic application of arithmetic steps that reduce our system to a simpler and simpler form until the solution is evident.

⇒ a method of successive elimination: we transform our system through a sequence of elementary operations into an upper triangular form (it is another system!).
Have we thereby altered the solution set?
Answer: no, since we have only used a finite number of elementary operations.

Gauss-Jordan elimination steps via example
Consider the linear system

2 x1 +   x2 - 2 x3 = 4
  x1 + 2 x2 +   x3 = 4    ,  c ∈ R
3 x1 + 3 x2 -   x3 = c

Step 1: eliminate the x1 variable from the second through mth equation by adding (assume a11 ≠ 0)
-a21/a11 times the first equation to the second
...............
-am1/a11 times the first equation to the mth
⇒ an (equivalent) indented system of the form

2 x1 + x2 - 2 x3 = 4
       (3/2) x2 + 2 x3 = 2
       (3/2) x2 + 2 x3 = c - 6
Step 2: eliminate x2 from the third through mth equation by adding (assume a'22 ≠ 0)
-a'32/a'22 times the second equation to the third
...............
-a'm2/a'22 times the second equation to the mth
⇒ an (equivalent) indented system of the form

2 x1 + x2 - 2 x3 = 4
       (3/2) x2 + 2 x3 = 2
                      0 = c - 8
Step n-1: continuing in this manner we eliminate x3, x4, ..., x_{n-1}, and the result is a system of the form

a11 x1 + a12 x2 + ... + a1n xn = b1
         a'22 x2 + ... + a'2n xn = b'2
                  ...
         a''qq xq + ... + a''qn xn = b''q
                                 0 = b''_{q+1}
                                 ...
                                 0 = b''_n
Step n: apply back substitution starting in the qth equation.

Results:
i)   if q < n and b''_{q+1}, ..., b''_n are not all zero, the system has no solution
ii)  if q < n and b''_{q+1}, ..., b''_n are all zero, the system has an (n - q)-parameter family of solutions (infinitely many solutions)
iii) if q = n, the system has a unique solution
Back to our example: there are two possibilities.
1. If c ≠ 8 then we have a contradiction and the solution set is empty; hence the considered system has no solution (because of equivalence): S(A, b) = ∅.
2. If c = 8 then the third equation can be omitted and we set x3 = t as a free parameter (arbitrary), and then

x2 = 4(1 - t)/3,  x1 = (4 + 5t)/3

⇒ S(A, b) = {x ∈ R³ | x = ((4 + 5t)/3, (4 - 4t)/3, t), t ∈ R}
⇒ infinitely many solutions
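The elimination and back-substitution steps can be sketched for the regular square case (a simplified illustration; partial pivoting is added for numerical safety, and the helper name and the test system, a regular variant of the example above, are our own):

```python
def gauss_solve(A, b):
    """Solve A x = b for a regular n x n matrix A by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]             # augmented matrix (A | b)
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))     # partial pivoting
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = -M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] += factor * M[k][c]                  # add -a_rk/a_kk times row k
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                           # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[2, 1, -2],
     [1, 2,  1],
     [3, 3,  1]]
b = [4, 4, 8]
print(gauss_solve(A, b))   # ≈ [4/3, 4/3, 0]
```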
2 The Eigenvalue Problem

2.1 Eigenvalues and Eigenvectors

What is an eigenvalue problem?
A concept of solving a linear equation; important in many applications, for example in continuous dynamic systems and the control of processes.

Solve the equation

Ax = λx,  x ≠ 0

for a given square complex matrix A ∈ M^C_{n×n}.

Eigenvalue: a value λ ∈ C for which a solution x ≠ 0 exists is called an eigenvalue of the matrix.
Eigenvector: the corresponding solution x is called an eigenvector of λ.

Let the matrix be of the form A = B + iC with B, C ∈ M^R_{n×n}; then with x = u + iv we get

Ax = (B + iC)(u + iv) = (Bu - Cv) + i(Cu + Bv) = λ(u + iv) = λu + iλv

This is equivalent to the real system

( B  -C ) (u)     (u)
( C   B ) (v) = λ (v)

and so we will consider only the eigenvalues of real matrices!

Remark: with the transformation y = Ax the eigenvalue problem is equal to the question: are there any vectors that are mapped to a multiple of themselves under the transformation?
How to find the eigenvalues of a given matrix?
The defining equation can be transformed into the homogeneous system

(A - λI)·x = 0,  x ≠ 0,

and so there are only non-trivial solutions x if det(A - λI) = 0.

Characteristic polynomial: p_n(λ) := det(A - λI) is a polynomial in λ of degree n (the characteristic polynomial of the matrix A).

⇒ the eigenvalues of a real matrix A ∈ M^R_{n×n} are the solutions (roots) of the characteristic equation det(A - λI) = 0.

By the Fundamental Theorem of Algebra we get:
· a characteristic equation of degree n has exactly n roots (counted with multiplicity), and so the matrix A ∈ M^R_{n×n} has exactly n eigenvalues
· these eigenvalues may be real or complex and are not necessarily distinct

Example
We calculate the eigenvalues of the matrix

    ( 1  2  3 )
A = ( 2 -4 -2 )
    ( 3 -2  1 )

We get p3(λ) := det(A - λI) = -λ³ - 2λ² + 24λ = (-λ)(λ² + 2λ - 24) and the roots are

λ1 = 0,  λ2 = 4,  λ3 = -6
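The result can be checked numerically; a sketch assuming NumPy is available:

```python
import numpy as np

A = np.array([[1,  2,  3],
              [2, -4, -2],
              [3, -2,  1]])

# np.poly returns the coefficients of the monic characteristic polynomial
# det(λI - A): here λ³ + 2λ² - 24λ, i.e. the p3(λ) above multiplied by (-1)³
print(np.poly(A))                          # ≈ [1, 2, -24, 0]
print(sorted(np.linalg.eigvals(A).real))   # ≈ [-6, 0, 4]
```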
How to find the eigenvectors of a given eigenvalue?
Let λ ∈ C be an eigenvalue of the matrix A ∈ M^R_{n×n}; then all the non-trivial solutions x ∈ C^n of the linear homogeneous system

(A - λI)·x = 0

are the corresponding eigenvectors of λ.

Example
We determine the eigenvectors of the eigenvalues λ1 = 0, λ2 = 4, λ3 = -6 of the above matrix:

(a) λ1 = 0:  (A - λ1 I)·x = A·x = 0
With elementary operations we obtain the equivalent system

( 1  2  3 ) (x1)   (0)
( 0 -1 -1 ) (x2) = (0)
( 0  0  0 ) (x3)   (0)

We choose x3 = t, and then we have x2 = -t and x1 = -t. Hence an eigenvector of λ1 = 0 is e1 = (1, 1, -1)^T.

(b) λ2 = 4:

               ( -3  2  3 )
(A - λ2 I)·x = (  2 -8 -2 ) · x = 0
               (  3 -2 -3 )

With elementary operations we obtain the equivalent system

( -3  2  3 ) (x1)   (0)
(  0 20  0 ) (x2) = (0)
(  0  0  0 ) (x3)   (0)

and an eigenvector e2 = (1, 0, 1)^T of λ2 = 4.

(c) λ3 = -6: We get an eigenvector e3 = (-1, 2, 1)^T of λ3 = -6.
Some important notions

Algebraic multiplicity of an eigenvalue
If the characteristic polynomial has r distinct roots then it may be factorised in the form

p_n(λ) = (-1)^n (λ - λ1)^{k1} (λ - λ2)^{k2} ... (λ - λr)^{kr}  with  Σ_{i=1}^{r} ki = n,

and the root λi has order ki. The integer ki is called the algebraic multiplicity of λi.

Geometric multiplicity of an eigenvalue
The number of corresponding linearly independent eigenvectors is called the geometric multiplicity of the eigenvalue.

Eigenspace of an eigenvalue
The space spanned by the linearly independent eigenvectors of an eigenvalue is called the eigenspace of the eigenvalue.

Some properties
The eigenvalues of a real symmetric matrix are real.
For a symmetric matrix A ∈ M^R_{n×n} there exist n linearly independent eigenvectors that are mutually orthogonal. Thus they may be used as a basis of R^n.
Similar matrices have the same eigenvalues: let λ be an eigenvalue of A and x a corresponding eigenvector. Furthermore let B be similar to A. Then B = C^{-1}AC and with y = C^{-1}x we have

By = C^{-1}ACy = C^{-1}ACC^{-1}x = C^{-1}Ax = C^{-1}λx = λC^{-1}x = λy.
2.2 Numerical Methods

I. In applications it is often required to find the dominant eigenvalue and the corresponding eigenvector.

Power method
Consider a matrix A ∈ M^R_{n×n} with n distinct eigenvalues λ1, ..., λn and n corresponding linearly independent eigenvectors e1, ..., en. We assume that the eigenvalues are ordered in the form |λ1| > |λ2| > ... > |λn|; λ1 is called the dominant eigenvalue of A. Then any vector x = (x1, ..., xn)^T may be represented with the basis {e1, ..., en} in the form x = α1 e1 + ... + αn en, and we get

Ax = A(α1 e1 + ... + αn en) = α1 A e1 + ... + αn A en = α1 λ1 e1 + ... + αn λn en.
Furthermore we have for any positive integer k

A^k x = α1 λ1^k e1 + ... + αn λn^k en

and hence

A^k x = λ1^k [ α1 e1 + α2 (λ2/λ1)^k e2 + ... + αn (λn/λ1)^k en ]   (if λ1 ≠ 0).

Since |λi/λ1| < 1 for i ≥ 2, obviously

lim_{k→∞} (1/λ1^k) A^k x = α1 e1,

so for large k the iterates point in the direction of e1. This gives rise to the iterative process

x_{k+1} = A x_k = A(A x_{k-1}) = A² x_{k-1} = ... = A^{k+1} x0,  k = 0, 1, ...,

where x0 must be some arbitrary vector not orthogonal to e1, to compute the dominant eigenvalue λ1 and the corresponding eigenvector e1.
Disadvantage of the iteration: if |λ1| is large then A^k x0 will become very large ⇒ scale the vector x_k after each iteration, for example using the largest element max(x_k) of x_k.

Resulting iterative process:

y_{k+1} = A x_k,   x_{k+1} = y_{k+1} / max(y_{k+1}),   k = 0, 1, ...   with x0 := (1, 1, ..., 1)^T
Example

We calculate the dominant eigenvalue $\lambda_1$ and the corresponding eigenvector $e_1$ of a matrix with the power method in the above form:

$A = \begin{pmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{pmatrix}$

$y_1 = Ax_0 = (0,2,0)^T = 2(0,1,0)^T$, $\max(y_1) = 2$ and $\lambda_1^{(1)} = 2$, $\quad x_1 = \frac{1}{2} y_1 = (0,1,0)^T$

$y_2 = Ax_1 = (1,2,1)^T = 2(0.5,1,0.5)^T$, $\max(y_2) = 2$ and $\lambda_1^{(2)} = 2$, $\quad x_2 = \frac{1}{2} y_2 = (\tfrac12,1,\tfrac12)^T$

Continuing in this way we get $y_3 = 2(0.25,1,0.25)^T$, $y_4 = 2(0.375,1,0.375)^T$, …, $y_7 = 2(0.328,1,0.328)^T$, $y_8 = 2(0.336,1,0.336)^T$ $\Rightarrow$ the iterates approach $2(\tfrac13,1,\tfrac13)^T$, and so we have $\lambda_1 = 2$, $e_1 = \left(\tfrac13,1,\tfrac13\right)^T$.
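The scaled iteration above is only a few lines of code. The following NumPy sketch runs it on the example matrix (the iteration count of 50 is an arbitrary choice that is more than sufficient here, since the error shrinks by a factor of about $|\lambda_2/\lambda_1| = 1/2$ per step):

```python
import numpy as np

# Scaled power method applied to the example matrix from the text.
A = np.array([[ 1.0, 1.0, -2.0],
              [-1.0, 2.0,  1.0],
              [ 0.0, 1.0, -1.0]])

x = np.array([1.0, 1.0, 1.0])        # x0 = (1,1,...,1)^T
for _ in range(50):
    y = A @ x                         # y_{k+1} = A x_k
    lam = y[np.argmax(np.abs(y))]     # scaling factor: largest element of y
    x = y / lam                       # x_{k+1} = y_{k+1} / max(y_{k+1})
```

After the loop, `lam` approximates the dominant eigenvalue 2 and `x` the scaled eigenvector (1/3, 1, 1/3).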
II. Diagonalization Method

If the eigenvectors of a square matrix are known, the transformation of the matrix to diagonal form can in some cases be carried out in a very effective manner.

Let $A \in M^{\mathbb{R}}_{n\times n}$ be a matrix with $n$ eigenvalues $\lambda_1,\dots,\lambda_n$ and a full set of $n$ linearly independent eigenvectors $e_1,\dots,e_n$.

Modal matrix: $M := (e_1,\dots,e_n)$ (the eigenvectors as columns)

Spectral matrix: $\Lambda := \mathrm{diag}(\lambda_1,\dots,\lambda_n)$

Then we have

$A \cdot M = (Ae_1,\dots,Ae_n) = (\lambda_1 e_1,\dots,\lambda_n e_n) = M \cdot \Lambda$.

Since $\det M \neq 0$, the matrix $M$ is regular and the inverse $M^{-1}$ exists. Hence we get the equation $M^{-1} \cdot A \cdot M = \Lambda$.
$\Rightarrow$ Knowing the full set of linearly independent eigenvectors, the matrix may be transformed into diagonal form with the eigenvalues on the diagonal (in the corresponding order).

$\Rightarrow$ Conversely, a matrix is uniquely determined once the eigenvalues and the full set of corresponding eigenvectors are known.
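A minimal numerical check of $M^{-1}AM = \Lambda$, reusing the example matrix from the power method section (NumPy's `eig` returns the eigenvectors as the columns of its second output, which is exactly the modal matrix):

```python
import numpy as np

# Diagonalization via the modal matrix: M^{-1} A M = Lambda.
A = np.array([[ 1.0, 1.0, -2.0],
              [-1.0, 2.0,  1.0],
              [ 0.0, 1.0, -1.0]])

lams, M = np.linalg.eig(A)           # eigenvalues, eigenvectors (columns of M)
Lambda = np.linalg.inv(M) @ A @ M    # should be diag(lam_1, ..., lam_n)

# Everything off the diagonal should vanish (up to rounding).
off_diag = Lambda - np.diag(np.diag(Lambda))
```

For this matrix the eigenvalues are 2, 1, and -1, so the diagonal of `Lambda` contains exactly these values (in the order `eig` happens to return them).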
2.3
Functions of a Matrix
Let $A \in M^{\mathbb{R}}_{n\times n}$. Then we may build the matrix products $A^2, A^3,\dots$, and $A^i \in M^{\mathbb{R}}_{n\times n}$ for all $i = 0,1,2,\dots$ with $A^0 := I$.

Power Series Function

$f_m(A) = c_0 I + c_1 A + c_2 A^2 + \dots + c_m A^m = \sum_{i=0}^{m} c_i A^i$ (polynomial function)

$f_m : M^{\mathbb{R}}_{n\times n} \to M^{\mathbb{R}}_{n\times n}$ with $c_i \in \mathbb{R}$ for every fixed $m \in \mathbb{N}$

Consider $f_m(A) := \sum_{i=0}^{m} c_i A^i$ for $m = 1,2,\dots$. If the sequence $\{f_m(A)\}$ converges for $m \to \infty$ to a constant matrix $G \in M^{\mathbb{R}}_{n\times n}$, then, in analogy to the scalar power series, we write

$f(A) = \sum_{i=0}^{\infty} c_i A^i$

and we call it the power series function of the matrix $A$.

Example: Exponential matrix function

$f(A) = e^{At} := I + \frac{A}{1!}t + \frac{A^2}{2!}t^2 + \dots = \sum_{i=0}^{\infty} \frac{A^i}{i!}\, t^i$

Problem: How do we compute such functions of a square matrix? Where do we get the positive integer powers of the matrix from?
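One naive answer, workable for small and well-scaled matrices, is to sum the series term by term until the terms become negligible. The helper name `expm_series` and the cutoffs are our own choices for this sketch, not part of the text:

```python
import numpy as np

def expm_series(A, t, tol=1e-12, max_terms=200):
    """Naive e^{At} by summing I + At/1! + A^2 t^2/2! + ... until terms are tiny."""
    n = A.shape[0]
    result = np.eye(n)
    term = np.eye(n)                  # current term A^i t^i / i!
    for i in range(1, max_terms):
        term = term @ A * (t / i)     # next term from the previous one
        result = result + term
        if np.abs(term).max() < tol:  # stop once contributions are negligible
            break
    return result

# Sanity check case: for a diagonal matrix, e^{At} is elementwise exp
# on the diagonal, so e^{A} = diag(e^1, e^2) here.
A = np.diag([1.0, 2.0])
E = expm_series(A, 1.0)
```

Truncated series summation is numerically fragile for matrices with large norm; the eigenvalue-based method developed below is usually preferable.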
Let $\lambda_1,\dots,\lambda_n$ be the eigenvalues of a square matrix $A \in M^{\mathbb{R}}_{n\times n}$ and

$p_n(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \dots + c_1\lambda + c_0$

the characteristic polynomial of $A$.

Cayley-Hamilton Theorem: Every square matrix $A \in M^{\mathbb{R}}_{n\times n}$ satisfies its own characteristic equation:

$A^n + c_{n-1}A^{n-1} + \dots + c_1 A + c_0 I = 0$.

$\Rightarrow$ Positive integer powers of the square matrix $A$ may be expressed in terms of powers of $A$ up to $A^{n-1}$:

$A^k = \sum_{i=0}^{n-1} b_i A^i$ for all $k \geq n$

(the coefficients $b_i$ depend on $k$ and are not the $c_i$ of the characteristic polynomial).
Let us consider first the particular case $n = 2$: $A \in M^{\mathbb{R}}_{2\times 2}$ with $p_2(\lambda) = \lambda^2 + c_1\lambda + c_0$

$\Rightarrow \quad A^2 + c_1 A + c_0 I = 0$ (Cayley-Hamilton)

$\Rightarrow \quad A^2 = -c_1 A - c_0 I$

$\Rightarrow \quad A^3 = -c_1 A^2 - c_0 A$ (by multiplying the above equation by $A$) $= -c_1(-c_1 A - c_0 I) - c_0 A = (c_1^2 - c_0)A + c_1 c_0 I$

Obviously we can continue this process, so that we obtain an expression $A^k = b_0 I + b_1 A$ for any $k \geq 2$.
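Cayley-Hamilton is easy to verify numerically in the 2×2 case, where the characteristic polynomial $\lambda^2 + c_1\lambda + c_0$ has $c_1 = -\mathrm{tr}(A)$ and $c_0 = \det A$. The matrix below is an illustrative choice, not from the text:

```python
import numpy as np

# Illustrative 2x2 matrix; char. polynomial is lambda^2 - 5*lambda - 2.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
c1 = -np.trace(A)          # c1 = -(a11 + a22)
c0 = np.linalg.det(A)      # c0 = det A

# Cayley-Hamilton says A^2 + c1*A + c0*I is the zero matrix.
residual = A @ A + c1 * A + c0 * np.eye(2)
```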
How do we compute the constants $b_0, b_1 \in \mathbb{R}$? Since $\lambda_1, \lambda_2$ are eigenvalues of $A$, we have the equations

$p_2(\lambda_i) = \lambda_i^2 + c_1\lambda_i + c_0 = 0$, i.e. $\lambda_i^2 = -c_1\lambda_i - c_0$, $\quad i = 1,2$,

and therefore

$\lambda_i^3 = -c_1\lambda_i^2 - c_0\lambda_i = (c_1^2 - c_0)\lambda_i + c_1 c_0$.

Proceeding in this way we get the analogous expression for each eigenvalue,

$\lambda_i^k = b_0 + b_1\lambda_i$ for $i = 1,2$ and any $k \geq 2$,

and from these two equations we may calculate the constants $b_0, b_1 \in \mathbb{R}$.

Remark: In the case that the matrix $A$ has only one eigenvalue $\lambda$ with multiplicity $m = 2$, we can differentiate the equation $\lambda^k = b_0 + b_1\lambda$ with respect to $\lambda$ to get a second equation for computing the two unknowns $b_0, b_1$.
Hence, in the general case we have

$f(A) = \sum_{i=0}^{\infty} c_i A^i = \sum_{i=0}^{n-1} b_i A^i$,

where we obtain the coefficients $b_0, b_1,\dots,b_{n-1}$ by solving the $n$ equations

$f(\lambda_i) = \sum_{s=0}^{n-1} b_s \lambda_i^s$, $\quad i = 1,2,\dots,n$.
Remark: If the eigenvalue $\lambda_i$ has the multiplicity $m_i$, then the derivative equations

$\frac{d^k}{d\lambda_i^k} f(\lambda_i) = \frac{d^k}{d\lambda_i^k} \sum_{s=0}^{n-1} b_s \lambda_i^s$, $\quad k = 1,2,\dots,m_i - 1$,

are also satisfied by $\lambda_i$.
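For distinct eigenvalues the $n$ equations form a small linear system that can be solved directly. The sketch below applies the recipe to $f(A) = A^k$ for a 2×2 matrix; the matrix (eigenvalues 5 and 2) is illustrative, not from the text:

```python
import numpy as np

# A^k = b0*I + b1*A, where b0, b1 solve lambda_i^k = b0 + b1*lambda_i.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # illustrative; eigenvalues are 5 and 2
l1, l2 = 5.0, 2.0
k = 6

# Two linear equations in the unknowns b0, b1:
#   b0 + b1*l1 = l1^k
#   b0 + b1*l2 = l2^k
V = np.array([[1.0, l1],
              [1.0, l2]])
b0, b1 = np.linalg.solve(V, np.array([l1**k, l2**k]))

Ak = b0 * np.eye(2) + b1 * A       # should equal A multiplied by itself k times
```

Note that only one matrix multiple is formed at the end; no powers of $A$ beyond $A^1$ are ever computed, which is the whole point of the Cayley-Hamilton approach.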
If $A \in M^{\mathbb{R}}_{n\times n}$ possesses $n$ linearly independent eigenvectors $e_1,\dots,e_n$, then we have an easier method of computing functions $f(A) = \sum_{i=0}^{k} c_i A^i$ of $A$:

With the modal matrix $M := (e_1,\dots,e_n)$ and the spectral matrix $\Lambda := \mathrm{diag}(\lambda_1,\dots,\lambda_n)$ we have the expression $M^{-1} \cdot A \cdot M = \Lambda$ and therefore

$M^{-1} \cdot f(A) \cdot M = \sum_{i=0}^{k} c_i (M^{-1} A^i M) = \sum_{i=0}^{k} c_i (M^{-1} A M)^i = \sum_{i=0}^{k} c_i \Lambda^i = \sum_{i=0}^{k} c_i\,\mathrm{diag}(\lambda_1^i,\dots,\lambda_n^i) = \mathrm{diag}\!\left(\sum_{i=0}^{k} c_i\lambda_1^i,\ \sum_{i=0}^{k} c_i\lambda_2^i,\ \dots,\ \sum_{i=0}^{k} c_i\lambda_n^i\right) = \mathrm{diag}\big(f(\lambda_1), f(\lambda_2),\dots,f(\lambda_n)\big)$.

Finally we get

$f(A) = M \cdot \mathrm{diag}\big(f(\lambda_1), f(\lambda_2),\dots,f(\lambda_n)\big) \cdot M^{-1}$.
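This formula can be sketched in a few lines, here with $f = \exp$ and the example matrix from section 2.2 (NumPy's `eig` supplies the modal matrix as its second output):

```python
import numpy as np

# f(A) = M * diag(f(lambda_1), ..., f(lambda_n)) * M^{-1}, with f = exp.
A = np.array([[ 1.0, 1.0, -2.0],
              [-1.0, 2.0,  1.0],
              [ 0.0, 1.0, -1.0]])

lams, M = np.linalg.eig(A)            # eigenvalues, modal matrix M
Minv = np.linalg.inv(M)
fA = M @ np.diag(np.exp(lams)) @ Minv  # e^A via the eigenvalues only
```

The same three-factor product works for any $f$: replacing `np.exp` with squaring must reproduce $A^2$ exactly, which makes a convenient correctness check.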
Example

We compute $f(A) := A^k$ for the matrix

$A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}$ with eigenvalues $\lambda_1 = -1$ and $\lambda_2 = -2$

and corresponding eigenvectors $e_1 = (1,-1)^T$, $e_2 = (1,-2)^T$.

Hence we get the modal matrix $M = \begin{pmatrix} 1 & 1 \\ -1 & -2 \end{pmatrix}$ and the spectral matrix $\Lambda = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}$. The inverse of $M$ is $M^{-1} = \begin{pmatrix} 2 & 1 \\ -1 & -1 \end{pmatrix}$.

Since $f(-1) = (-1)^k$ and $f(-2) = (-2)^k$, we get

$f(A) = M \cdot \begin{pmatrix} (-1)^k & 0 \\ 0 & (-2)^k \end{pmatrix} \cdot M^{-1}$

and therefore

$A^k = \begin{pmatrix} 2(-1)^k - (-2)^k & (-1)^k - (-2)^k \\ 2\big((-2)^k - (-1)^k\big) & 2(-2)^k - (-1)^k \end{pmatrix}$.
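The closed form is easy to check numerically against repeated multiplication:

```python
import numpy as np

# The example matrix and the closed form for A^k derived above.
A = np.array([[ 0.0,  1.0],
              [-2.0, -3.0]])

def Ak_closed(k):
    a, b = (-1.0)**k, (-2.0)**k   # f(lambda_1) = (-1)^k, f(lambda_2) = (-2)^k
    return np.array([[2*a - b,       a - b],
                     [2*(b - a), 2*b - a]])

# Compare against explicit repeated multiplication for k = 5.
A5 = np.linalg.matrix_power(A, 5)
```

Setting $k = 1$ in the closed form must reproduce $A$ itself, which is a quick consistency check on the signs.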