Basic Linear Algebra

Week 1

For any vector: $$ \overrightarrow{AB}=\overrightarrow{OB}-\overrightarrow{OA} $$ Can be denoted as: $$ \left[ \begin{matrix} b_1 - a_1 \\\ b_2 - a_2 \end{matrix} \right] or \left[ \begin{matrix} b_1 - a_1 , b_2 - a_2 \end{matrix} \right] $$

$\overrightarrow{OA}$ is the position vector of A.
$\overrightarrow{AB}$ is the displacement vector from A to B.

Vector addition supports:

Commutativity
Associativity

Scalar multiplication supports distributivity.

A vector space is a set $V$ that:

For any $\vec v,\vec w \in V$, we have $\vec v+\vec w\in V$.
For any $\vec v \in V, c \in R$, we have $c\vec v \in V$.

Two vectors are parallel if $\vec u = c\vec v$.

$\vec v$ is a linear combination of k vectors $\vec v_1, \vec v_2,\cdots,\vec v_k$ if $\vec v = c_1\vec v_1+c_2\vec v_2+\cdots+c_k\vec v_k$.

Week 2

The result of a dot product is a scalar: $$ \vec u \cdot\vec v = u_1 \cdot v_1 + u_2 \cdot v_1 + \cdots + u_n\cdot v_n $$

In $R^n$, the length of a vector is $\lVert \vec v \rVert = \sqrt{\vec v \cdot \vec v}$, so ${\lVert \vec v \rVert}^2=\vec v \cdot \vec v$.

If $\vec u$ is orthogonal to $\vec v$, $\vec u \cdot \vec v = 0$.

The Cauchy-Schwartz inequality:

For any vectors $\vec u, \vec v \in R^n$: $$ \lVert \vec u \cdot\vec v \rVert \leq \lVert \vec u\rVert \lVert \vec v\rVert $$

The triangle inequality:

For any vectors $\vec u, \vec v \in R^n$: $$ \lVert \vec u +\vec v \rVert \leq \lVert \vec u\rVert + \lVert \vec v\rVert $$

A unit vector is a vector of length 1.

In $R ^n$, there are n standard unit vectors, given by $\vec e_1, \vec e_2, \cdots , \vec e_n$, where $\vec e_i$ has 1 as its $i^{th}$ components and 0 for all other components: $$ \vec e_i= \left[ \begin{matrix} 0 \\\ 0 \\\ \cdots \\\ 1 \\\ \cdots \\\ 0 \end{matrix} \right] $$ The normalization of $\vec v$ is the unique vector $\widehat v$ with length 1 and direction the same as $\vec v$: $$ \widehat v = \frac{1}{\lVert \vec v \rVert} \vec v $$

The distance between two vectors $\vec u, \vec v \in R ^n$ is: $$ \begin{align} d(\vec u, \vec v) &= \lVert \vec u - \vec v\rVert \ &= \sqrt{(u_1-v_2)^2+\cdots+(u_n-v_n)^2} \end{align} $$ The angle between $\vec u, \vec v \in R ^n$ is $\theta \in [-1,1]$: $$ \cos \theta = \frac{\vec u \cdot \vec v}{\lVert u\rVert\cdot\lVert v\rVert} $$

If $\vec u \cdot \vec v > 0$, then $\cos \theta > 0$, so $\theta$ is acute.
If $\vec u \cdot \vec v < 0$, then $\cos \theta < 0$, so $\theta$ is obtuse.
If $\vec u \cdot \vec v = 0$, then $\cos \theta = 0$, so $\theta$ is right angle, we say that these two vectors are orthogonal.

For $\vec u,\vec v \in R^n$, with $\vec u \neq \vec 0$, the projection of the vector $\vec v$ onto $\vec u$ is denoted by $proj_{\vec u}(\vec v)$ and defined by: $$ proj_{\vec u}(\vec v)=\frac{\vec u \cdot \vec v}{{\lVert \vec u\rVert}^2}\vec u $$ We can always write $\vec v$ as: $$ \vec v = \vec v_p + \vec v_0 $$ Where $\vec v_p$ is parallel to $\vec u$ and $\vec v_0$ is orthogonal to $\vec u$. Because $proj_{\vec u}(\vec v)$ is parallel to $\vec u$, $\vec v - proj_{\vec u}(\vec v)$ is orthogonal to $\vec u$.

Week 3

The cross product of two vectors $\vec u = [u_1, u_2, u_3]$, $\vec v=[v_1,v_2,v_3]$: $$ \vec u \times \vec v = \left[ \begin{matrix} u_2v_3 - u_3v_2 \\\ u_3v_1 - u_1v_3 \\\ u_1v_2 - u_2v_1 \end{matrix} \right] $$ For standard unit vectors:

$\vec e_1 \times \vec e_2 = \vec e_3$.
$\vec e_2 \times \vec e_3 = \vec e_1$.
$\vec e_3 \times \vec e_1 = \vec e_2$.

Anti-commutativity: $\vec u \times \vec v = -\vec v \times \vec u$.

$\vec u \times \vec u = \vec 0$.

$\vec u \times \vec v$ is orthogonal to both $\vec u$ and $\vec v$.

If $\vec u, \vec v$ is parallel, $\vec u \times \vec v = \vec 0$.

The length of $\vec u \times \vec v$ is: $$ \lVert\vec u \times \vec v\rVert=\lVert \vec u\lVert \lVert\vec v\lVert\sin\theta $$

The area of the triangle spanned by $\vec u$ and $\vec v$ is $\frac{1}{2}\lVert\vec u\times\vec v\rVert$.

The area of the parallelogram spanned by $\vec u$ and $\vec v$ is $\lVert \vec u\times \vec v\rVert$.

The normal form of the line $l$ in $R^2$ is given by the equation: $$ \vec x \cdot \vec m = \vec p \cdot\vec m $$

$$ \left[ \begin{matrix} u_2v_3 - u_3v_2 \\\ u_3v_1 - u_1v_3 \\\ u_1v_2 - u_2v_1 \end{matrix} \right] \cdot\vec m=\overrightarrow{OP}\cdot\vec m $$

where $\vec m$ is a vector which is orthogonal to line $l$, $\vec p$ is one point on the line $l$.

The general form of the line $l$ in $R^2$ is given by the equation: $$ l={(x,y)|ax+by=c},c=\overrightarrow{OP}\cdot\vec m $$

The vector form of the line $l$ in $R^2$ is given by the equation: $$ \vec x = \vec p + t\vec d \ for \ some \ t \in R $$ where $\vec d$ is the direction vector of line $l$, $t\in R$ is called a parameter.

The parametric form of the line $l$ in $R ^2$ is given by the equation: $$ \begin{align} x &= p_1 + td_1 \ y &= p_2 + td_2 \end{align} \space t\in R $$ The vector form and parametric form can be used in $R^3$.

Week 4

The normal form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \begin{align} \vec m \cdot(\vec x - \vec p) &= 0 \ \vec m \cdot \vec x &= \vec m \cdot \vec p \end{align} $$ where $\vec m$ is a vector which is orthogonal to plane $P$, $\vec p$ is one point on the plane $P$.

The general form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ ax+by+cz=d, where \ d=ap_1+bp_2+cp_3 $$

Given three points $P,Q,R\in R^n$, if there is a line $l$ which passes through all three of them, then $P,Q,R$ are collinear.

The vector form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \vec x = \vec p + s\vec v + t\vec u, s,t\in R $$ where point $p$ is inside plain $P$, with two direction vectors $\vec u,\vec v$.

The parametric form of the equation of the plane $P$ in $R^3$ is given by the equation:

$$ \left[ \begin{matrix} x = p_1 + su_1 + tv_1 \\\ y = p_2 + su_2 + tv_2 \\\ z = p_3 + su_3 + tv_3 \end{matrix} \right. \space , s,t\in R $$

If $S={\vec v_1, \vec v_2, \cdots, \vec v_n}$ is a set of vectors in $R^n$, then the span of $\vec v_1, \vec v_2, \cdots, \vec v_n$ is denoted: $$ span(S) \ or \ span(\vec v_1, \vec v_2, \cdots, \vec v_n) \ span(S)={\vec v \in R^n | \vec v = c_1\vec v_1 + c_2\vec v_2 + \cdots + c_n\vec v_n} $$ If $span(S)=R^n$, we say that $S$ is a spanning set for $R^n$, or the $\vec v_1, \cdots, \vec v_k$ span $R^n$.

A set of vectors $\vec v_1, \cdots, \vec v_n$ is called linearly independent if the equation: $$ c_1\vec v_1 + c_2\vec v_2 + \cdots + c_n\vec v_n = 0 $$ has exactly one solution: $$ c_1=c_2=\cdots=c_n=0 $$ $\vec v_1, \vec v_2, \cdots, \vec v_n$ are linearly independent only if $\vec v_1, \vec v_2, \cdots, \vec v_n$ span $R^n$.

A system of linear equations is a finite set of linear equations, each with the same variables. $$ \begin{matrix} a_{11}x_1 &+ a_{12}x_2 &+ \cdots &+ a_{1n}xn &=b_1 \\\ a_{21}x_1 &+ a_{22}x_2 &+ \cdots &+ a_{2n}xn &=b_2 \\\ \cdots & \cdots & & \cdots & \cdots\\\ a_{m1}x_1 &+ a_{m2}x_2 &+ \cdots &+ a_{mn}xn &=b_m \end{matrix} $$

$a_{ij}$ are called coefficients.
$b_i$ are called constant terms.

The system is called homogeneous is all $b_i$ are 0.

A system is said to be consistent if it has at least one solution.

Every system of linear equations has either:

a unique solution
infinitely many solutions
no solution

A system of m linear equations in n variables can also be written as $$ \begin{matrix} \left[ \begin{array}{cccc | c} a_{11} & a_{12} & \cdots &a_{1n} & b_1 \\\ a_{21} & a_{22} & \cdots &a_{2n} & b_2 \\\ \cdots & \cdots & \cdots &\cdots & \cdots\\\ a_{m1} & a_{m2} & \cdots &a_{mn} & b_m \end{array} \right] \end{matrix} $$

Week 5

The following three row elementary operations don’t change the solutions:

Swapping two equations. $$ R_i \leftrightarrow R_j $$
Multiplying both sides of one equation by a non-zero scalar $c\in R$. $$ R_i \rightarrow cR_i $$
Adding a multiple of one equation to another. $$ R_i \rightarrow R_i + cR_j $$

An augmented matrix is in row echelon form if:

Any rows in which all entries are 0 are at the bottom.
In each non-zero row, the leftmost non-zero entry (called the leading entry or the pivot) has all zeros below it.

In the REF the columns corresponding to $x,y,w$ have leading terms while the column corresponding to $z$ doesn’t, we call $z$ a free variable.

If the augmented matrix has a row of the form $[0 \cdots0|k]$, the system is inconsistent.
If the REF of an augmented matrix has at least one free variable, it has Infinitely many solutions.
Otherwise, it has exactly one solution.

Use Gaussian elimination to approach REF.

An augmented matrix is in reduced row echelon form if it satisfies the following conditions:

It is in row echelon form.
The leading entries are all ones.
Each column containing a leading 1 has zeros everywhere else in this columns.

Use Gauss-Jordan elimination to approach RREF.

Week 6

Let $A=(a_{ij})$ be an $m\times n$ matrix, the transpose of A is the $n \times m$ matrix $A^T=(a_{ji})$ denoted by swapping the rows and columns of A.

transposition is self-inverse: for any matrix $A$, $(A^T)^T=A$.

$(A+B)^T=A^T+B^T$.
$(\lambda A)^T=\lambda(A^T)$.
$(AB)^T=B^TA^T$.
$(A^k)^T=(A^T)^k$.
If $AB=BA$, these two matrices commu

Let $A$ be a square matrix:

If $A^T=A$, we say that A is a symmetric matrix.
If $A^T = -A$, we say that A is a stew-symmetric matrix.

Week 7

$$ \left[ \begin{matrix} ax_1 + bx_2 + cx_3 \\\ dx_1 + ex_2 + fx_3 \\\ gx_1 + hx_2 + ix_3 \end{matrix} \right]= \left[ \begin{matrix} a & b & c \\\ d & e & f \\\ g & h & i \end{matrix} \right] \left[ \begin{matrix} x_1 \\\ x_2 \\\ x_3 \end{matrix} \right] $$

Let $A$ be a matrix, an inverse for $A$ is a matrix $B$ such that: $$ AB=I \ and \ BA = I $$ So that $A$ can only have an inverse if it is square.

If A has an inverse, then we say that A is invertible.
Inverses are unique.

The inverse of A is denoted as $A^{-1}$, if $A^{-1}A=I$, $AA^{-1}=I$ must exist.

Properties, suppose $A,B$ are invertible with inverse $A^{-1},B^{-1}$

$A^{-1}$ is in invertible.
$cA$ is invertible, $(cA)^{-1}=\frac{1}{c}A^{-1}$.
$AB$ is invertible, $(AB)^{-1}=B^{-1}A^{-1}$.
$A^{k}$ is invertible, $(A^k)^{-1}=(A^{-1})^k$.
$A^T$ is invertible, $(A^T)^-1=(A^{-1})^T$.

For $2\times 2$ matrices:

$\det(AB)=\det(A)\det(B)$.
$\det(A^{-1})=\frac{1}{\det(A)}$.

Suppose $A=\left[\begin{matrix}a & b \\\ c & d\end{matrix}\right]$, then $\det(A)=ad-bc$, the inverse of $A$ is: $$ A^{-1}=\frac{1}{\det(A)} \left[ \begin{matrix} d & -b \\\ -c & a \end{matrix} \right] $$

To calculate the inverse of an $n \times n$ matrix $A$:

Construct $[A|I]$.
Do EROs on the whole augmented matrix until the left hand is in REF, if the left hand has a row of zeros (means the vectors are linearly dependent), then $A$ is not invertible.
Continue until it has the form $[I|B]$, then $B=A^{-1}$.

An $n\times n$ is invertible only if:

Its REF doesn’t have a row of zeros.
Its RREF is $I_n$.

Suppose A is an $n \times n$ matrix, giving a system of linear equations $A\vec x = \vec b$. If A is invertible, then the system has a unique solution, given by $\vec x = A^{-1}\vec b$.

Week 8

An elementary matrix is an $n\times n$ matrix that is obtained from the identity matrix $I_n$ by doing a single row operation, so there is three types of elementary matrix.

Let $E$ be the elementary matrix, the result of $EA$ is the same as performing that ERO on A.

Every elementary matrix $E$ is invertible, $E^{-1}$ is also an elementary matrix.

To undo the effect of $E$, do $E^{-1}A$.

A matrix which has a whole row or column of zeros cannot be invertible.

$A$ is invertible only if $\det(A)\neq 0$.

If $A$ is upper triangular or lower triangular, then $\det(A)$ is the product of the diagonal entries.

Week 9

Facts:

If a matrix $A$ has one row which is a scalar multiple of a second row, then $\det(A)=0$.
If a matrix $A$ has one column which is a scalar multiple of a second column, then$\det(A) = 0$.
$\det(A)=\det(A^T)$.

If $B$ is obtained from $A$ be swapping two rows, $\det(B)=-\det(A)$.

If $B$ is obtained from $A$ by scaling one row by $\lambda$, $\det(B)=\lambda\det(A)$.

If $B$ is obtained from A by adding a scalar multiple of one row to another, $\det(A)=\det(B)$.

Let $A,B$ be $n\times n$ matrices, then $\det(A)\det(B)=\det(AB)$.

Let $M$ be an $n \times n$ matrix, suppose $\vec v$ is a non-zero $n\times 1$ column vector, and $\lambda \in R$ is a scalar such that: $$ M\vec v = \lambda \vec v $$

$\lambda$ is an eigenvalue of $M$.
$\vec v$ is an eigenvector of $M$.

$$ \det(M-\lambda I)=0 $$

This polynomial is called the characteristic polynomial of $M$.

An $n \times n$ matrix has matrix has at most n distinct eigenvalues.

Week 10

The trace of $A$ is the sum of its diagonal entries: $$ \operatorname{tr}(A) = a_{11}+a_{22}+\cdots+a_{nn} $$

Let A be an $n\times n$ matrix, then:

$\det(A)$ is the product of the eigenvalues.
$\operatorname{tr}(A) $ is the sum of the eigenvalues.

Let $U$ be a vector space, let $V\subset U$ be a non-empty subset. $V$ is called a subspace of $U$ if it satisfies the following two properties:

if $v_1,v_2\in V$, then $v_1+v_2\in V$, $V$ is closed under addition.
if $v\in V, c\in R$, then $cv\in V$, $V$ is closed under scalar multiplication.

An $n\times n$ matrix $A$ has determinant not equal to 0:

Its columns are linearly independent.
Its rows are linearly independent.

A basis of a vector space $V$ is a set $S={\vec v_1, \vec v_2, \cdots, \vec v_n}$ such that:

$span(S)=V$.
$S$ is linearly independent.

Facts:

Any two bases $S$ and $S’$ of $V$ have the same number of elements.
The dimension is the size of the smallest spanning set one can find.
The dimension is the size of the biggest set of linearly independent vectors one can find.

Let $A$ be an $n\times n$ matrix and $\lambda\in R$ a scalar. The $\lambda \cdot eigenspace$ of A is the set of all solutions $\vec v$ of the equation $A\vec v=\lambda \vec v$, including $\vec 0$.

If $\lambda$ is not an eigenvalue, then the $\lambda \cdot eigenspace$ is just ${\vec 0}$.
${\vec 0}$ is never an eigenvector for $\lambda$, but it is always in the eigenspace.
Eigenspace is always a subspace of $R^n$.

The algebraic multiplicity of an eigenvalue is the number of times it appears as a root of $\det(A-\lambda I)$.

The geometric multiplicity of an eigenvalue $\lambda$ is the dimension of the $\lambda\cdot eigenspace$ of $A$.

For each eigenvalue $\lambda$ of an $n\times n$ matrix: $$ 1\leq geometric \ multiplicity \leq algebraic \ multiplicity \leq n $$ Let $\lambda_1,\cdots,\lambda_k$ be the distinct eigenvalues of $A$, the sum of their algebraic multiplicities is n.

If $A$ is triangular, the eigenvalues are the diagonal entries $a_{11},a_{22},\cdots,a_{nn}$.

Suppose a matrix A has a column $c_j$ all entries 0, except for possibly the $a_{jj}$ entry. Then $a_{jj}$ is an eigenvalue for $A$ and $\vec e_j$ is the corresponding eigenvector.

Let $X$ be an $n\times n$ matrix and $\vec v$ and eigenvalue $\lambda$, then for any $k\geq0$, $X^k$ has eigenvector $\lambda^k$ with eigenvector $\vec v$.

Week 11

Let $A$ be any $n\times n$ matrix, $A$ is diagonalizable if:

Some invertible matrix $P$.
Some diagonal matrix $D=\left[\begin{matrix}\lambda_1 & & & &\\\ & \lambda_2 & & & \\\ & & \cdots & \ & & & & \lambda_n\end{matrix}\right]$.

such that $A=PDP^{-1}$.

An $n\times n$ matrix $A$ is diagonalizable only if we can find eigenvectors $\vec v_1, \vec v_2, \cdots, \vec v_n$ with

eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_n$ which are linearly independent.

Then $A=PDP^{-1}$, where $P$ has columns $\vec v_1, \vec v_2, \cdots, \vec v_n$ and $D=\left[\begin{matrix}\lambda_1 & & & &\\\ & \lambda_2 & & & \\\ & & \cdots & \\\ & & & & \lambda_n\end{matrix}\right]$.

Suppose an $n\times n$ matrix $A$ has eigenvalues $\lambda_1> \lambda_2> \cdots> \lambda_n$, with eigenspaces $\vec v_1, \vec v_2, \cdots, \vec v_n$, then if $\vec v_1\in V_1, \vec v_2\in V_2, \cdots, \vec v_n\in V_n$ satisfy $$ \vec v_1 + \vec v_2 + \cdots+ \vec v_n = \vec 0 $$ we must have $\vec v_1, \vec v_2, \cdots, \vec v_n = \vec 0$.
Suppose an $n\times n$ matrix $A$ has eigenvalues $\lambda_1> \lambda_2> \cdots> \lambda_n$, with eigenspaces $\vec v_1, \vec v_2, \cdots, \vec v_n$. Suppose that for each $i$ we have a set $S_i$ of linearly independent vectors in $v_i$, then the set $S_1\cup S_2\cup\cdots\cup S_n$ of all of these vectors is still linearly independent.
$A$ is diagonalizable only if for each eigenvalue of $A$, the geometric multiplicity is equal to the algebraic multiplicity.
If $A$ is an $n\times n$ matrix with $n$ distinct eigenvalues, than $A$ is diagonalizable.

If $A$ is a diagonalizable matrix: $$ A^n=PD^nP^{-1} $$

Leslie matrix: $$ \left[\begin{matrix} b_1 & b_2 & b_3 & b_4\\\ s_1 & 0 & 0 & 0\\\ 0 & s_2 & 0 & 0 \\\ 0 & 0 & s_3 & 0 \end{matrix}\right] $$

$b_i$ is the birth rate of group $i$.
$s_i$ is the survival probability of group $i$.

$$ \vec x_{n+1} = L\vec x_n=L^{n+1}\vec x_0 $$

Week 12

A probability vector is a vector $\vec v = \left[\begin{matrix}v_1\\\ v_2\\\ \cdots \\\ v_n\end{matrix}\right]\in R^n$ such that:

For each $i$, $0\leq v_i \leq 1$.
$v_1+v_2+\cdots+v_n=1$.

A stochastic matrix is a square matrix $P$ such that each column is a probability vector.

A matrix is positive if all of its entries are positive (greater than 0).

A stochastic matrix $P$ is regular if there exists $n\geq 1$ such that $P^n$, is positive.

If $P$ is positive, $P^n$ is positive.
If $P$ is stochastic, $p^n$ is stochastic.

A Markov chain is a stochastic model, i t consists of finitely many variables called states, state vector is denoted as: $$ \vec x_n = \left[\begin{matrix}x_1\\\ x_2\\\ \cdots\\\ x_k\end{matrix}\right]\in R^k $$ The probability of moving from one state to another is called the transition probability. The transition probability of moving from state $j$ to $i$ by: $$ P_{ij} $$

An eigenvector $\vec v$ with eigenvalue 1 for the transition matrix $P$ is called a steady state vector.

$P\vec x = \vec x$.
Non-negative entries summing to the total number of entities in the system.

A steady state probability vector is a probability vector $\vec x$ satisfying $P\vec x=\vec x$.

The Markov chain always has a steady state vector and a steady state probability vector.

A Markov chain is regular if its transition matrix $P$ is regular.

If $P$ is regular, the $-1$ is NOT an eigenvalue.
If $P$ is regular, then there is a unique SSV $\vec x$ and a unique SSPV $\vec y$.
Asymptotic behaviors: For every initial state $\vec x_0$, we have $\vec x_n=P^n\vec x_0$.
If $P$ is regular, then as $n\rightarrow \infty$, $P^n$ approaches the long range transition matrix L of the Markov chain: $$ P^n\rightarrow L= \left[ \begin{matrix} \vec y | \vec y | \cdots |\vec y \end{matrix} \right] $$