Basic Linear Algebra
Week 1
For any two points $A$ and $B$: $$ \overrightarrow{AB}=\overrightarrow{OB}-\overrightarrow{OA} $$ This can be written as a column or row vector: $$ \left[ \begin{matrix} b_1 - a_1 \\\ b_2 - a_2 \end{matrix} \right] \ or \ \left[ \begin{matrix} b_1 - a_1 , b_2 - a_2 \end{matrix} \right] $$
- $\overrightarrow{OA}$ is the position vector of A.
- $\overrightarrow{AB}$ is the displacement vector from A to B.
Vector addition satisfies:
- Commutativity
- Associativity
Scalar multiplication is distributive over vector addition.
A vector space is a set $V$ of vectors such that:
- For any $\vec v,\vec w \in V$, we have $\vec v+\vec w\in V$.
- For any $\vec v \in V, c \in R$, we have $c\vec v \in V$.
Two vectors $\vec u$ and $\vec v$ are parallel if $\vec u = c\vec v$ for some scalar $c \in R$.
$\vec v$ is a linear combination of k vectors $\vec v_1, \vec v_2,\cdots,\vec v_k$ if $\vec v = c_1\vec v_1+c_2\vec v_2+\cdots+c_k\vec v_k$ for some scalars $c_1, c_2, \cdots, c_k \in R$.
Week 2
The result of a dot product is a scalar: $$ \vec u \cdot\vec v = u_1 \cdot v_1 + u_2 \cdot v_2 + \cdots + u_n\cdot v_n $$
In $R^n$, the length of a vector is $\lVert \vec v \rVert = \sqrt{\vec v \cdot \vec v}$, so ${\lVert \vec v \rVert}^2=\vec v \cdot \vec v$.
If $\vec u$ is orthogonal to $\vec v$, $\vec u \cdot \vec v = 0$.
The Cauchy-Schwarz inequality:
For any vectors $\vec u, \vec v \in R^n$: $$ \lvert \vec u \cdot\vec v \rvert \leq \lVert \vec u\rVert \lVert \vec v\rVert $$
The triangle inequality:
For any vectors $\vec u, \vec v \in R^n$: $$ \lVert \vec u +\vec v \rVert \leq \lVert \vec u\rVert + \lVert \vec v\rVert $$
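A quick numerical sanity check of the dot product, norms, and these two inequalities; a minimal NumPy sketch (the vectors are arbitrary examples, not from the course):

```python
import numpy as np

# Two example vectors in R^3 (arbitrary choices for illustration).
u = np.array([1.0, 2.0, -1.0])
v = np.array([3.0, 0.0, 4.0])

dot = np.dot(u, v)                  # u . v = 1*3 + 2*0 + (-1)*4 = -1
norm_u = np.linalg.norm(u)          # ||u|| = sqrt(u . u)
norm_v = np.linalg.norm(v)

# Cauchy-Schwarz: |u . v| <= ||u|| ||v||
assert abs(dot) <= norm_u * norm_v

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
assert np.linalg.norm(u + v) <= norm_u + norm_v

print(dot, norm_u, norm_v)
```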
A unit vector is a vector of length 1.
In $R ^n$, there are n standard unit vectors, given by $\vec e_1, \vec e_2, \cdots , \vec e_n$, where $\vec e_i$ has 1 as its $i^{th}$ component and 0 for all other components: $$ \vec e_i= \left[ \begin{matrix} 0 \\\ 0 \\\ \vdots \\\ 1 \\\ \vdots \\\ 0 \end{matrix} \right] $$ The normalization of $\vec v$ is the unique vector $\widehat v$ with length 1 and the same direction as $\vec v$: $$ \widehat v = \frac{1}{\lVert \vec v \rVert} \vec v $$
The distance between two vectors $\vec u, \vec v \in R ^n$ is: $$ \begin{align} d(\vec u, \vec v) &= \lVert \vec u - \vec v\rVert \\\ &= \sqrt{(u_1-v_1)^2+\cdots+(u_n-v_n)^2} \end{align} $$ The angle between $\vec u, \vec v \in R ^n$ is the $\theta \in [0,\pi]$ with: $$ \cos \theta = \frac{\vec u \cdot \vec v}{\lVert \vec u\rVert\cdot\lVert \vec v\rVert} $$
- If $\vec u \cdot \vec v > 0$, then $\cos \theta > 0$, so $\theta$ is acute.
- If $\vec u \cdot \vec v < 0$, then $\cos \theta < 0$, so $\theta$ is obtuse.
- If $\vec u \cdot \vec v = 0$, then $\cos \theta = 0$, so $\theta$ is a right angle; we say that these two vectors are orthogonal.
For $\vec u,\vec v \in R^n$, with $\vec u \neq \vec 0$, the projection of the vector $\vec v$ onto $\vec u$ is denoted by $proj_{\vec u}(\vec v)$ and defined by: $$ proj_{\vec u}(\vec v)=\frac{\vec u \cdot \vec v}{{\lVert \vec u\rVert}^2}\vec u $$ We can always write $\vec v$ as: $$ \vec v = \vec v_p + \vec v_0 $$ where $\vec v_p$ is parallel to $\vec u$ and $\vec v_0$ is orthogonal to $\vec u$. Because $proj_{\vec u}(\vec v)$ is parallel to $\vec u$, we can take $\vec v_p = proj_{\vec u}(\vec v)$ and $\vec v_0 = \vec v - proj_{\vec u}(\vec v)$, which is orthogonal to $\vec u$.
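A small NumPy sketch of the projection formula and the decomposition $\vec v = \vec v_p + \vec v_0$ (the example vectors are my own choice):

```python
import numpy as np

def proj(u, v):
    """Projection of v onto u: (u . v / ||u||^2) * u  (u must be non-zero)."""
    return (np.dot(u, v) / np.dot(u, u)) * u

# Example vectors (arbitrary illustration).
u = np.array([2.0, 0.0])
v = np.array([3.0, 4.0])

v_p = proj(u, v)          # component of v parallel to u
v_0 = v - v_p             # component of v orthogonal to u

# Check the decomposition v = v_p + v_0 and orthogonality v_0 . u = 0.
assert np.allclose(v_p + v_0, v)
assert np.isclose(np.dot(v_0, u), 0.0)
print(v_p, v_0)           # [3. 0.] [0. 4.]
```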
Week 3
The cross product of two vectors $\vec u = [u_1, u_2, u_3]$, $\vec v=[v_1,v_2,v_3]$: $$ \vec u \times \vec v = \left[ \begin{matrix} u_2v_3 - u_3v_2 \\\ u_3v_1 - u_1v_3 \\\ u_1v_2 - u_2v_1 \end{matrix} \right] $$ For standard unit vectors:
- $\vec e_1 \times \vec e_2 = \vec e_3$.
- $\vec e_2 \times \vec e_3 = \vec e_1$.
- $\vec e_3 \times \vec e_1 = \vec e_2$.
Anti-commutativity: $\vec u \times \vec v = -\vec v \times \vec u$.
$\vec u \times \vec u = \vec 0$.
$\vec u \times \vec v$ is orthogonal to both $\vec u$ and $\vec v$.
If $\vec u, \vec v$ are parallel, then $\vec u \times \vec v = \vec 0$.
The length of $\vec u \times \vec v$ is: $$ \lVert\vec u \times \vec v\rVert=\lVert \vec u\rVert \lVert\vec v\rVert\sin\theta $$ where $\theta$ is the angle between $\vec u$ and $\vec v$.
The area of the triangle spanned by $\vec u$ and $\vec v$ is $\frac{1}{2}\lVert\vec u\times\vec v\rVert$.
The area of the parallelogram spanned by $\vec u$ and $\vec v$ is $\lVert \vec u\times \vec v\rVert$.
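A minimal NumPy check of the cross product, its orthogonality, and the area formulas (the example vectors are chosen for illustration):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # arbitrary example vectors
v = np.array([4.0, 5.0, 6.0])

w = np.cross(u, v)              # [u2*v3 - u3*v2, u3*v1 - u1*v3, u1*v2 - u2*v1]

# u x v is orthogonal to both u and v.
assert np.isclose(np.dot(w, u), 0.0)
assert np.isclose(np.dot(w, v), 0.0)

# ||u x v|| = ||u|| ||v|| sin(theta), and it equals the area of the parallelogram.
parallelogram_area = np.linalg.norm(w)
triangle_area = parallelogram_area / 2
print(w, parallelogram_area, triangle_area)
```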
The normal form of the line $l$ in $R^2$ is given by the equation: $$ \vec x \cdot \vec m = \vec p \cdot\vec m $$
$$ \left[ \begin{matrix} x \\\ y \end{matrix} \right] \cdot\vec m=\overrightarrow{OP}\cdot\vec m $$
where $\vec m$ is a normal vector (orthogonal to the line $l$) and $\vec p = \overrightarrow{OP}$ is the position vector of a point $P$ on the line $l$.
The general form of the line $l$ in $R^2$ is given by the equation: $$ l=\{(x,y)\mid ax+by=c\} $$ where $[a,b]=\vec m$ and $c=\overrightarrow{OP}\cdot\vec m$.
The vector form of the line $l$ in $R^2$ is given by the equation: $$ \vec x = \vec p + t\vec d \ for \ some \ t \in R $$ where $\vec d$ is the direction vector of line $l$, $t\in R$ is called a parameter.
The parametric form of the line $l$ in $R ^2$ is given by the equations: $$ \begin{align} x &= p_1 + td_1 \\\ y &= p_2 + td_2 \end{align} \space t\in R $$ The vector form and parametric form can also be used in $R^3$.
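A small NumPy sketch tying the vector, normal, and general forms together for one example line (the point and direction vector are made up):

```python
import numpy as np

# A line in R^2 through p with direction d (arbitrary example values).
p = np.array([1.0, 2.0])
d = np.array([3.0, -1.0])

# Vector form: x = p + t*d. Sample a few points on the line.
points = [p + t * d for t in (-1.0, 0.0, 0.5, 2.0)]

# A normal vector m is orthogonal to d; in R^2 we can take m = [d2, -d1].
m = np.array([d[1], -d[0]])
c = np.dot(p, m)                 # general form: m1*x + m2*y = c

# Every sampled point satisfies the normal/general form x . m = p . m.
for x in points:
    assert np.isclose(np.dot(x, m), c)
print(m, c)
```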
Week 4
The normal form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \begin{align} \vec m \cdot(\vec x - \vec p) &= 0 \\\ \vec m \cdot \vec x &= \vec m \cdot \vec p \end{align} $$ where $\vec m$ is a vector which is orthogonal to the plane $P$, and $\vec p$ is the position vector of a point on the plane $P$.
The general form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ ax+by+cz=d $$ where $[a,b,c]=\vec m$ and $d=ap_1+bp_2+cp_3=\vec m\cdot\vec p$.
Given three points $P,Q,R\in R^n$, if there is a line $l$ which passes through all three of them, then $P,Q,R$ are collinear.
The vector form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \vec x = \vec p + s\vec u + t\vec v, \quad s,t\in R $$ where $\vec p$ is the position vector of a point on the plane $P$, and $\vec u,\vec v$ are two (non-parallel) direction vectors.
The parametric form of the equation of the plane $P$ in $R^3$ is given by the equation:
$$ \begin{cases} x = p_1 + su_1 + tv_1 \\\ y = p_2 + su_2 + tv_2 \\\ z = p_3 + su_3 + tv_3 \end{cases} \quad s,t\in R $$
If $S=\{\vec v_1, \vec v_2, \cdots, \vec v_k\}$ is a set of vectors in $R^n$, then the span of $\vec v_1, \vec v_2, \cdots, \vec v_k$ is denoted $span(S)$ or $span(\vec v_1, \vec v_2, \cdots, \vec v_k)$: $$ span(S)=\{\vec v \in R^n \mid \vec v = c_1\vec v_1 + c_2\vec v_2 + \cdots + c_k\vec v_k, \ c_i \in R\} $$ If $span(S)=R^n$, we say that $S$ is a spanning set for $R^n$, or that $\vec v_1, \cdots, \vec v_k$ span $R^n$.
A set of vectors $\vec v_1, \cdots, \vec v_k$ is called linearly independent if the equation: $$ c_1\vec v_1 + c_2\vec v_2 + \cdots + c_k\vec v_k = \vec 0 $$ has exactly one solution: $$ c_1=c_2=\cdots=c_k=0 $$ For exactly $n$ vectors in $R^n$: $\vec v_1, \vec v_2, \cdots, \vec v_n$ are linearly independent if and only if they span $R^n$.
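One practical way to test linear independence is to place the vectors as columns of a matrix and compare its rank to the number of vectors; a minimal NumPy sketch (the example vectors are my own):

```python
import numpy as np

# Three example vectors in R^3 (arbitrary illustration).
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 3.0])    # v3 = v1 + v2, so the set is dependent

A = np.column_stack([v1, v2, v3]) # columns are the vectors

# The vectors are linearly independent iff the rank equals the number of
# vectors, i.e. c1*v1 + c2*v2 + c3*v3 = 0 has only the trivial solution.
rank = np.linalg.matrix_rank(A)
print(rank, rank == 3)            # 2, False -> linearly dependent
```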
A system of linear equations is a finite set of linear equations, each with the same variables. $$ \begin{matrix} a_{11}x_1 &+ a_{12}x_2 &+ \cdots &+ a_{1n}x_n &=b_1 \\\ a_{21}x_1 &+ a_{22}x_2 &+ \cdots &+ a_{2n}x_n &=b_2 \\\ \vdots & \vdots & & \vdots & \vdots\\\ a_{m1}x_1 &+ a_{m2}x_2 &+ \cdots &+ a_{mn}x_n &=b_m \end{matrix} $$
- $a_{ij}$ are called coefficients.
- $b_i$ are called constant terms.
The system is called homogeneous if all $b_i$ are 0.
A system is said to be consistent if it has at least one solution.
Every system of linear equations has either:
- a unique solution
- infinitely many solutions
- no solution
A system of m linear equations in n variables can also be written as an augmented matrix: $$ \begin{matrix} \left[ \begin{array}{cccc | c} a_{11} & a_{12} & \cdots &a_{1n} & b_1 \\\ a_{21} & a_{22} & \cdots &a_{2n} & b_2 \\\ \cdots & \cdots & \cdots &\cdots & \cdots\\\ a_{m1} & a_{m2} & \cdots &a_{mn} & b_m \end{array} \right] \end{matrix} $$
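A minimal NumPy sketch of solving such a system when it has a unique solution (the coefficients are made-up example values; `np.linalg.solve` assumes a square, invertible coefficient matrix):

```python
import numpy as np

# An example system of 3 equations in 3 unknowns (illustrative coefficients):
#   x + 2y +  z = 4
#  2x -  y + 3z = 3
#   x +  y -  z = 1
A = np.array([[1.0,  2.0,  1.0],
              [2.0, -1.0,  3.0],
              [1.0,  1.0, -1.0]])
b = np.array([4.0, 3.0, 1.0])

x = np.linalg.solve(A, b)   # unique solution when A is square and invertible
assert np.allclose(A @ x, b)
print(x)
```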
Week 5
The following three elementary row operations (EROs) don’t change the solutions:
- Swapping two equations. $$ R_i \leftrightarrow R_j $$
- Multiplying both sides of one equation by a non-zero scalar $c\in R$. $$ R_i \rightarrow cR_i $$
- Adding a multiple of one equation to another. $$ R_i \rightarrow R_i + cR_j $$
An augmented matrix is in row echelon form if:
- Any rows in which all entries are 0 are at the bottom.
- In each non-zero row, the leftmost non-zero entry (called the leading entry or the pivot) has all zeros below it.
In the REF, a variable whose column contains a leading entry is called a leading variable; a variable whose column contains no leading entry is called a free variable.
- If the augmented matrix has a row of the form $[0 \cdots0\mid k]$ with $k \neq 0$, the system is inconsistent.
- If the system is consistent and the REF of its augmented matrix has at least one free variable, it has infinitely many solutions.
- Otherwise, it has exactly one solution.
Use Gaussian elimination to bring a matrix to REF.
An augmented matrix is in reduced row echelon form if it satisfies the following conditions:
- It is in row echelon form.
- The leading entries are all ones.
- Each column containing a leading 1 has zeros everywhere else in that column.
Use Gauss-Jordan elimination to bring a matrix to RREF.
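A teaching-oriented sketch of Gauss-Jordan elimination in NumPy, using only the three EROs above (the function name `rref` and the example augmented matrix are my own; this is not an optimized routine):

```python
import numpy as np

def rref(M, tol=1e-12):
    """Gauss-Jordan elimination: return the reduced row echelon form of M."""
    A = M.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        # Pick the row (at or below pivot_row) with the largest entry in this column.
        pivot = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
        if abs(A[pivot, col]) < tol:
            continue                                      # no pivot here -> free variable
        A[[pivot_row, pivot]] = A[[pivot, pivot_row]]     # swap rows (ERO 1)
        A[pivot_row] = A[pivot_row] / A[pivot_row, col]   # scale to a leading 1 (ERO 2)
        for r in range(rows):                             # clear the rest of the column (ERO 3)
            if r != pivot_row:
                A[r] = A[r] - A[r, col] * A[pivot_row]
        pivot_row += 1
    return A

# Augmented matrix of an example system (arbitrary numbers).
M = np.array([[1.0, 2.0, -1.0, 3.0],
              [2.0, 4.0,  1.0, 9.0],
              [1.0, 3.0,  2.0, 8.0]])
print(rref(M))   # identity on the left, solution (0, 2, 1) on the right
```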
Week 6
Let $A=(a_{ij})$ be an $m\times n$ matrix; the transpose of $A$ is the $n \times m$ matrix $A^T=(a_{ji})$ obtained by swapping the rows and columns of $A$.
Transposition is self-inverse: for any matrix $A$, $(A^T)^T=A$.
- $(A+B)^T=A^T+B^T$.
- $(\lambda A)^T=\lambda(A^T)$.
- $(AB)^T=B^TA^T$.
- $(A^k)^T=(A^T)^k$.
- If $AB=BA$, we say that these two matrices commute.
Let $A$ be a square matrix:
- If $A^T=A$, we say that A is a symmetric matrix.
- If $A^T = -A$, we say that A is a skew-symmetric matrix.
Week 7
A system of linear equations can be written as a matrix-vector product: $$ \left[ \begin{matrix} ax_1 + bx_2 + cx_3 \\\ dx_1 + ex_2 + fx_3 \\\ gx_1 + hx_2 + ix_3 \end{matrix} \right]= \left[ \begin{matrix} a & b & c \\\ d & e & f \\\ g & h & i \end{matrix} \right] \left[ \begin{matrix} x_1 \\\ x_2 \\\ x_3 \end{matrix} \right] $$
Let $A$ be a matrix; an inverse for $A$ is a matrix $B$ such that: $$ AB=I \ and \ BA = I $$ It follows that $A$ can only have an inverse if it is square.
- If A has an inverse, then we say that A is invertible.
- Inverses are unique.
The inverse of $A$ is denoted $A^{-1}$; for a square matrix $A$, if $A^{-1}A=I$ then $AA^{-1}=I$ holds automatically.
Properties: suppose $A,B$ are invertible with inverses $A^{-1},B^{-1}$:
- $A^{-1}$ is invertible, $(A^{-1})^{-1}=A$.
- For $c \neq 0$, $cA$ is invertible, $(cA)^{-1}=\frac{1}{c}A^{-1}$.
- $AB$ is invertible, $(AB)^{-1}=B^{-1}A^{-1}$.
- $A^{k}$ is invertible, $(A^k)^{-1}=(A^{-1})^k$.
- $A^T$ is invertible, $(A^T)^{-1}=(A^{-1})^T$.
For $2\times 2$ matrices:
- $\det(AB)=\det(A)\det(B)$.
- $\det(A^{-1})=\frac{1}{\det(A)}$.
Suppose $A=\left[\begin{matrix}a & b \\\ c & d\end{matrix}\right]$, then $\det(A)=ad-bc$; if $\det(A)\neq 0$, the inverse of $A$ is: $$ A^{-1}=\frac{1}{\det(A)} \left[ \begin{matrix} d & -b \\\ -c & a \end{matrix} \right] $$
To calculate the inverse of an $n \times n$ matrix $A$:
- Construct $[A|I]$.
- Do EROs on the whole augmented matrix until the left-hand side is in REF; if the left-hand side has a row of zeros (meaning the rows of $A$ are linearly dependent), then $A$ is not invertible.
- Continue until it has the form $[I|B]$, then $B=A^{-1}$.
An $n\times n$ matrix is invertible if and only if:
- Its REF doesn’t have a row of zeros.
- Its RREF is $I_n$.
Suppose A is an $n \times n$ matrix, giving a system of linear equations $A\vec x = \vec b$. If A is invertible, then the system has a unique solution, given by $\vec x = A^{-1}\vec b$.
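A small NumPy sketch checking the $2\times 2$ inverse formula against a general inverse routine, and solving $A\vec x=\vec b$ via $\vec x = A^{-1}\vec b$ (the example matrix and right-hand side are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # example matrix, det = 1*4 - 2*3 = -2 != 0
b = np.array([5.0, 6.0])

# 2x2 formula: A^{-1} = (1/det) * [[d, -b], [-c, a]]
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
A_inv_formula = (1 / det) * np.array([[ A[1, 1], -A[0, 1]],
                                      [-A[1, 0],  A[0, 0]]])

A_inv = np.linalg.inv(A)          # general-purpose inverse routine
assert np.allclose(A_inv, A_inv_formula)
assert np.allclose(A @ A_inv, np.eye(2))

# Unique solution of A x = b when A is invertible: x = A^{-1} b.
x = A_inv @ b
assert np.allclose(A @ x, b)
print(x)                          # [-4.   4.5]
```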
Week 8
An elementary matrix is an $n\times n$ matrix that is obtained from the identity matrix $I_n$ by doing a single row operation, so there are three types of elementary matrices.
Let $E$ be the elementary matrix obtained from $I_n$ by some ERO; then $EA$ is the same as the result of performing that ERO on $A$.
Every elementary matrix $E$ is invertible, $E^{-1}$ is also an elementary matrix.
To undo the effect of $E$, do $E^{-1}A$.
A matrix which has a whole row or column of zeros cannot be invertible.
$A$ is invertible if and only if $\det(A)\neq 0$.
If $A$ is upper triangular or lower triangular, then $\det(A)$ is the product of the diagonal entries.
Week 9
Facts:
- If a matrix $A$ has one row which is a scalar multiple of a second row, then $\det(A)=0$.
- If a matrix $A$ has one column which is a scalar multiple of a second column, then $\det(A) = 0$.
- $\det(A)=\det(A^T)$.
If $B$ is obtained from $A$ by swapping two rows, $\det(B)=-\det(A)$.
If $B$ is obtained from $A$ by scaling one row by $\lambda$, $\det(B)=\lambda\det(A)$.
If $B$ is obtained from $A$ by adding a scalar multiple of one row to another, $\det(B)=\det(A)$.
Let $A,B$ be $n\times n$ matrices, then $\det(A)\det(B)=\det(AB)$.
Let $M$ be an $n \times n$ matrix, suppose $\vec v$ is a non-zero $n\times 1$ column vector, and $\lambda \in R$ is a scalar such that: $$ M\vec v = \lambda \vec v $$
- $\lambda$ is an eigenvalue of $M$.
- $\vec v$ is an eigenvector of $M$.
The eigenvalues of $M$ are the solutions $\lambda$ of: $$ \det(M-\lambda I)=0 $$
The polynomial $\det(M-\lambda I)$ is called the characteristic polynomial of $M$.
An $n \times n$ matrix has at most n distinct eigenvalues.
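A minimal NumPy sketch: for an example matrix, the eigenvalues returned by `np.linalg.eig` are the roots of the characteristic polynomial, and each eigenpair satisfies $M\vec v=\lambda\vec v$ (the matrix is my own illustrative choice):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # example symmetric matrix

# Eigenvalues are the roots of det(M - lambda*I) = lambda^2 - 4*lambda + 3 = 0,
# i.e. lambda = 1 and lambda = 3.
eigenvalues, eigenvectors = np.linalg.eig(M)   # eigenvectors are the columns

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Check M v = lambda v for each eigenpair.
    assert np.allclose(M @ v, lam * v)

print(eigenvalues)                    # [3. 1.] (order may vary)
```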
Week 10
The trace of $A$ is the sum of its diagonal entries: $$ \operatorname{tr}(A) = a_{11}+a_{22}+\cdots+a_{nn} $$
Let A be an $n\times n$ matrix, then:
- $\det(A)$ is the product of the eigenvalues.
- $\operatorname{tr}(A) $ is the sum of the eigenvalues.
Let $U$ be a vector space, let $V\subset U$ be a non-empty subset. $V$ is called a subspace of $U$ if it satisfies the following two properties:
- if $v_1,v_2\in V$, then $v_1+v_2\in V$, $V$ is closed under addition.
- if $v\in V, c\in R$, then $cv\in V$, $V$ is closed under scalar multiplication.
An $n\times n$ matrix $A$ has determinant not equal to 0 if and only if:
- Its columns are linearly independent.
- Its rows are linearly independent.
A basis of a vector space $V$ is a set $S=\{\vec v_1, \vec v_2, \cdots, \vec v_n\}$ such that:
- $span(S)=V$.
- $S$ is linearly independent.
Facts:
- Any two bases $S$ and $S'$ of $V$ have the same number of elements; this number is called the dimension of $V$, written $\dim(V)$.
- The dimension is the size of the smallest spanning set one can find.
- The dimension is the size of the biggest set of linearly independent vectors one can find.
Let $A$ be an $n\times n$ matrix and $\lambda\in R$ a scalar. The $\lambda$-eigenspace of $A$ is the set of all solutions $\vec v$ of the equation $A\vec v=\lambda \vec v$, including $\vec 0$.
- If $\lambda$ is not an eigenvalue, then the $\lambda$-eigenspace is just $\{\vec 0\}$.
- $\vec 0$ is never an eigenvector for $\lambda$, but it is always in the eigenspace.
- The $\lambda$-eigenspace is always a subspace of $R^n$.
The algebraic multiplicity of an eigenvalue is the number of times it appears as a root of $\det(A-\lambda I)$.
The geometric multiplicity of an eigenvalue $\lambda$ is the dimension of the $\lambda$-eigenspace of $A$.
For each eigenvalue $\lambda$ of an $n\times n$ matrix: $$ 1\leq \text{geometric multiplicity} \leq \text{algebraic multiplicity} \leq n $$ Let $\lambda_1,\cdots,\lambda_k$ be the distinct eigenvalues of $A$; the sum of their algebraic multiplicities is $n$.
If $A$ is triangular, the eigenvalues are the diagonal entries $a_{11},a_{22},\cdots,a_{nn}$.
Suppose a matrix $A$ has a column $j$ in which all entries are 0, except possibly the diagonal entry $a_{jj}$. Then $a_{jj}$ is an eigenvalue of $A$ and $\vec e_j$ is a corresponding eigenvector.
Let $X$ be an $n\times n$ matrix with an eigenvector $\vec v$ and eigenvalue $\lambda$; then for any $k\geq0$, $X^k$ has eigenvalue $\lambda^k$ with eigenvector $\vec v$.
Week 11
Let $A$ be any $n\times n$ matrix; $A$ is diagonalizable if there exist:
- some invertible matrix $P$, and
- some diagonal matrix $D=\left[\begin{matrix}\lambda_1 & & & \\\ & \lambda_2 & & \\\ & & \ddots & \\\ & & & \lambda_n\end{matrix}\right]$,
such that $A=PDP^{-1}$.
An $n\times n$ matrix $A$ is diagonalizable if and only if we can find eigenvectors $\vec v_1, \vec v_2, \cdots, \vec v_n$ with eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_n$ which are linearly independent.
Then $A=PDP^{-1}$, where $P$ has columns $\vec v_1, \vec v_2, \cdots, \vec v_n$ and $D=\left[\begin{matrix}\lambda_1 & & & \\\ & \lambda_2 & & \\\ & & \ddots & \\\ & & & \lambda_n\end{matrix}\right]$.
- Suppose an $n\times n$ matrix $A$ has distinct eigenvalues $\lambda_1> \lambda_2> \cdots> \lambda_k$, with eigenspaces $V_1, V_2, \cdots, V_k$. If $\vec v_1\in V_1, \vec v_2\in V_2, \cdots, \vec v_k\in V_k$ satisfy $$ \vec v_1 + \vec v_2 + \cdots+ \vec v_k = \vec 0 $$ then we must have $\vec v_1 = \vec v_2 = \cdots = \vec v_k = \vec 0$.
- Suppose an $n\times n$ matrix $A$ has distinct eigenvalues $\lambda_1> \lambda_2> \cdots> \lambda_k$, with eigenspaces $V_1, V_2, \cdots, V_k$. Suppose that for each $i$ we have a set $S_i$ of linearly independent vectors in $V_i$; then the set $S_1\cup S_2\cup\cdots\cup S_k$ of all of these vectors is still linearly independent.
- $A$ is diagonalizable if and only if for each eigenvalue of $A$, the geometric multiplicity is equal to the algebraic multiplicity.
- If $A$ is an $n\times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable.
If $A=PDP^{-1}$ is a diagonalizable matrix, then for any $k\geq 0$: $$ A^k=PD^kP^{-1} $$
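A short NumPy sketch of diagonalization and of computing powers via $A^k=PD^kP^{-1}$ (the example matrix has two distinct eigenvalues, so it is diagonalizable):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])           # example matrix with distinct eigenvalues 1 and 3

eigenvalues, P = np.linalg.eig(A)    # columns of P are eigenvectors
D = np.diag(eigenvalues)             # diagonal matrix of eigenvalues

# A = P D P^{-1}
assert np.allclose(A, P @ D @ np.linalg.inv(P))

# A^k = P D^k P^{-1}: powers of a diagonal matrix are cheap (power the diagonal).
k = 5
A_k = P @ np.diag(eigenvalues**k) @ np.linalg.inv(P)
assert np.allclose(A_k, np.linalg.matrix_power(A, k))
print(A_k)
```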
Leslie matrix: $$ \left[\begin{matrix} b_1 & b_2 & b_3 & b_4\\\ s_1 & 0 & 0 & 0\\\ 0 & s_2 & 0 & 0 \\\ 0 & 0 & s_3 & 0 \end{matrix}\right] $$
- $b_i$ is the birth rate of group $i$.
- $s_i$ is the survival probability of group $i$.
$$ \vec x_{n+1} = L\vec x_n=L^{n+1}\vec x_0 $$
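A minimal sketch of iterating a Leslie matrix in NumPy (the birth rates, survival probabilities, and initial population are made-up illustrative numbers):

```python
import numpy as np

# Example Leslie matrix with 4 age groups (illustrative rates only).
L = np.array([[0.0, 1.0, 2.0, 1.0],   # birth rates b1..b4
              [0.5, 0.0, 0.0, 0.0],   # survival s1
              [0.0, 0.8, 0.0, 0.0],   # survival s2
              [0.0, 0.0, 0.7, 0.0]])  # survival s3

x0 = np.array([100.0, 80.0, 60.0, 40.0])   # initial population per age group

# x_{n+1} = L x_n, so x_n = L^n x_0.
x = x0
for _ in range(10):
    x = L @ x

assert np.allclose(x, np.linalg.matrix_power(L, 10) @ x0)
print(x)
```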
Week 12
A probability vector is a vector $\vec v = \left[\begin{matrix}v_1\\\ v_2\\\ \vdots \\\ v_n\end{matrix}\right]\in R^n$ such that:
- For each $i$, $0\leq v_i \leq 1$.
- $v_1+v_2+\cdots+v_n=1$.
A stochastic matrix is a square matrix $P$ such that each column is a probability vector.
A matrix is positive if all of its entries are positive (greater than 0).
A stochastic matrix $P$ is regular if there exists $n\geq 1$ such that $P^n$ is positive.
- If $P$ is positive, then $P^n$ is positive for all $n\geq 1$.
- If $P$ is stochastic, then $P^n$ is stochastic for all $n\geq 1$.
A Markov chain is a stochastic model. It consists of finitely many states; the state vector is denoted: $$ \vec x_n = \left[\begin{matrix}x_1\\\ x_2\\\ \vdots\\\ x_k\end{matrix}\right]\in R^k $$ The probability of moving from one state to another is called the transition probability. The transition probability of moving from state $j$ to state $i$ is denoted by: $$ P_{ij} $$
An eigenvector $\vec x$ with eigenvalue 1 for the transition matrix $P$ is called a steady state vector (SSV):
- It satisfies $P\vec x = \vec x$.
- Its entries are non-negative and sum to the total number of entities in the system.
A steady state probability vector (SSPV) is a probability vector $\vec x$ satisfying $P\vec x=\vec x$.
A Markov chain always has a steady state vector and a steady state probability vector.
A Markov chain is regular if its transition matrix $P$ is regular.
- If $P$ is regular, then $-1$ is NOT an eigenvalue.
- If $P$ is regular, then there is a unique SSV $\vec x$ and a unique SSPV $\vec y$.
- Asymptotic behaviour: for every initial state $\vec x_0$, we have $\vec x_n=P^n\vec x_0$.
- If $P$ is regular, then as $n\rightarrow \infty$, $P^n$ approaches the long range transition matrix L of the Markov chain: $$ P^n\rightarrow L= \left[ \begin{matrix} \vec y | \vec y | \cdots |\vec y \end{matrix} \right] $$