Basic Linear Algebra

Week 1

For the vector from a point $A=(a_1,a_2)$ to a point $B=(b_1,b_2)$: $$ \overrightarrow{AB}=\overrightarrow{OB}-\overrightarrow{OA} $$ This can be written as a column or as a row: $$ \left[ \begin{matrix} b_1 - a_1 \\\ b_2 - a_2 \end{matrix} \right] \text{ or } \left[ \begin{matrix} b_1 - a_1 , b_2 - a_2 \end{matrix} \right] $$

Vector addition supports:

Scalar multiplication supports distributivity.

A vector space is a set $V$ that:

Two vectors $\vec u, \vec v$ are parallel if $\vec u = c\vec v$ for some scalar $c$.

$\vec v$ is a linear combination of $k$ vectors $\vec v_1, \vec v_2,\cdots,\vec v_k$ if $\vec v = c_1\vec v_1+c_2\vec v_2+\cdots+c_k\vec v_k$ for some scalars $c_1, c_2, \cdots, c_k$.

Week 2

The result of a dot product is a scalar: $$ \vec u \cdot\vec v = u_1 \cdot v_1 + u_2 \cdot v_2 + \cdots + u_n\cdot v_n $$

In $R^n$, the length of a vector is $\lVert \vec v \rVert = \sqrt{\vec v \cdot \vec v}$, so ${\lVert \vec v \rVert}^2=\vec v \cdot \vec v$.

If $\vec u$ is orthogonal to $\vec v$, $\vec u \cdot \vec v = 0$.

The Cauchy-Schwarz inequality:

For any vectors $\vec u, \vec v \in R^n$: $$ \lvert \vec u \cdot\vec v \rvert \leq \lVert \vec u\rVert \lVert \vec v\rVert $$

The triangle inequality:

For any vectors $\vec u, \vec v \in R^n$: $$ \lVert \vec u +\vec v \rVert \leq \lVert \vec u\rVert + \lVert \vec v\rVert $$

A unit vector is a vector of length 1.

In $R ^n$, there are $n$ standard unit vectors, given by $\vec e_1, \vec e_2, \cdots , \vec e_n$, where $\vec e_i$ has 1 as its $i^{th}$ component and 0 for all other components: $$ \vec e_i= \left[ \begin{matrix} 0 \\\ 0 \\\ \vdots \\\ 1 \\\ \vdots \\\ 0 \end{matrix} \right] $$ The normalization of a non-zero vector $\vec v$ is the unique vector $\widehat v$ with length 1 and the same direction as $\vec v$: $$ \widehat v = \frac{1}{\lVert \vec v \rVert} \vec v $$

The distance between two vectors $\vec u, \vec v \in R ^n$ is: $$ \begin{align} d(\vec u, \vec v) &= \lVert \vec u - \vec v\rVert \\\ &= \sqrt{(u_1-v_1)^2+\cdots+(u_n-v_n)^2} \end{align} $$ The angle between two non-zero vectors $\vec u, \vec v \in R ^n$ is the $\theta \in [0,\pi]$ satisfying: $$ \cos \theta = \frac{\vec u \cdot \vec v}{\lVert \vec u\rVert\lVert \vec v\rVert} $$ By the Cauchy-Schwarz inequality, the right-hand side always lies in $[-1,1]$.
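The distance and angle formulas can be checked numerically. The following is a minimal Python sketch; the helper names `dot`, `norm`, `distance` and `angle` and the sample vectors are assumptions made for illustration, not part of the notes.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Length of v, computed as sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def distance(u, v):
    """Distance between u and v: the length of u - v."""
    return norm([ui - vi for ui, vi in zip(u, v)])

def angle(u, v):
    """Angle in [0, pi] between two non-zero vectors."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to guard against rounding pushing the value just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))

# Illustrative vectors: the first two standard unit vectors in R^2.
u = [1.0, 0.0]
v = [0.0, 1.0]
print(distance(u, v))  # length of [1, -1], i.e. sqrt(2)
print(angle(u, v))     # orthogonal vectors, so the angle is pi/2
```

The clamp is a defensive choice for floating-point input; with exact arithmetic it would not be needed.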

For $\vec u,\vec v \in R^n$, with $\vec u \neq \vec 0$, the projection of the vector $\vec v$ onto $\vec u$ is denoted by $proj_{\vec u}(\vec v)$ and defined by: $$ proj_{\vec u}(\vec v)=\frac{\vec u \cdot \vec v}{{\lVert \vec u\rVert}^2}\vec u $$ We can always write $\vec v$ as: $$ \vec v = \vec v_p + \vec v_0 $$ Where $\vec v_p$ is parallel to $\vec u$ and $\vec v_0$ is orthogonal to $\vec u$. Because $proj_{\vec u}(\vec v)$ is parallel to $\vec u$, $\vec v - proj_{\vec u}(\vec v)$ is orthogonal to $\vec u$.
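The decomposition $\vec v = \vec v_p + \vec v_0$ can be sketched in Python as follows. The function names and the sample vectors are assumptions chosen for illustration.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Length of v, computed as sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def proj(u, v):
    """Projection of v onto a non-zero vector u: ((u . v) / ||u||^2) u."""
    scale = dot(u, v) / dot(u, u)
    return [scale * ui for ui in u]

# Illustrative vectors (not from the notes).
u = [2.0, 0.0]
v = [3.0, 4.0]

v_p = proj(u, v)                           # component of v parallel to u
v_0 = [vi - pi for vi, pi in zip(v, v_p)]  # component of v orthogonal to u

print(v_p)          # [3.0, 0.0]
print(v_0)          # [0.0, 4.0]
print(dot(u, v_0))  # 0.0, confirming v_0 is orthogonal to u
print(norm(v))      # 5.0
```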

Week 3

The cross product of two vectors $\vec u = [u_1, u_2, u_3]$, $\vec v=[v_1,v_2,v_3]$: $$ \vec u \times \vec v = \left[ \begin{matrix} u_2v_3 - u_3v_2 \\\ u_3v_1 - u_1v_3 \\\ u_1v_2 - u_2v_1 \end{matrix} \right] $$ For standard unit vectors:

Anti-commutativity: $\vec u \times \vec v = -\vec v \times \vec u$.

$\vec u \times \vec u = \vec 0$.

$\vec u \times \vec v$ is orthogonal to both $\vec u$ and $\vec v$.

If $\vec u, \vec v$ are parallel, $\vec u \times \vec v = \vec 0$.

The length of $\vec u \times \vec v$ is: $$ \lVert\vec u \times \vec v\rVert=\lVert \vec u\rVert \lVert\vec v\rVert\sin\theta $$ where $\theta$ is the angle between $\vec u$ and $\vec v$.

The area of the triangle spanned by $\vec u$ and $\vec v$ is $\frac{1}{2}\lVert\vec u\times\vec v\rVert$.

The area of the parallelogram spanned by $\vec u$ and $\vec v$ is $\lVert \vec u\times \vec v\rVert$.
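A short Python sketch of the cross product and the two area formulas follows. The helper names and the vectors $\vec u, \vec v$ are assumptions picked so the arithmetic is easy to follow.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3, using the component formula."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Length of v."""
    return math.sqrt(dot(v, v))

# Illustrative vectors lying in the xy-plane.
u = [2, 0, 0]
v = [1, 3, 0]

w = cross(u, v)
parallelogram_area = norm(w)
triangle_area = 0.5 * norm(w)

print(w)                     # [0, 0, 6]
print(dot(w, u), dot(w, v))  # 0 0, so w is orthogonal to both u and v
print(parallelogram_area)    # 6.0
print(triangle_area)         # 3.0
```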

The normal form of the line $l$ in $R^2$ is given by the equation: $$ \vec x \cdot \vec m = \vec p \cdot\vec m $$

$$ \left[ \begin{matrix} x \\\ y \end{matrix} \right] \cdot\vec m=\overrightarrow{OP}\cdot\vec m $$

where $\vec m$ is a non-zero vector orthogonal to the line $l$, and $\vec p=\overrightarrow{OP}$ is the position vector of a point $P$ on the line $l$.

The general form of the line $l$ in $R^2$ is given by the equation: $$ l=\{(x,y) \mid ax+by=c\},\quad c=\overrightarrow{OP}\cdot\vec m $$ where $\vec m = [a, b]$.

The vector form of the line $l$ in $R^2$ is given by the equation: $$ \vec x = \vec p + t\vec d \text{ for some } t \in R $$ where $\vec d$ is the direction vector of line $l$, and $t\in R$ is called a parameter.

The parametric form of the line $l$ in $R ^2$ is given by the equation: $$ \begin{align} x &= p_1 + td_1 \\\ y &= p_2 + td_2 \end{align} \quad t\in R $$ The vector form and parametric form can also be used in $R^3$.

Week 4

The normal form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \begin{align} \vec m \cdot(\vec x - \vec p) &= 0 \\\ \vec m \cdot \vec x &= \vec m \cdot \vec p \end{align} $$ where $\vec m$ is a non-zero vector orthogonal to the plane $P$, and $\vec p$ is the position vector of a point on the plane $P$.

The general form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ ax+by+cz=d, \text{ where } d=ap_1+bp_2+cp_3 $$ and $\vec m = [a, b, c]$.

Given three points $P,Q,R\in R^n$, if there is a line $l$ which passes through all three of them, then $P,Q,R$ are collinear.

The vector form of the equation of the plane $P$ in $R^3$ is given by the equation: $$ \vec x = \vec p + s\vec u + t\vec v, \quad s,t\in R $$ where $\vec p$ is a point in the plane $P$, and $\vec u,\vec v$ are two non-parallel direction vectors.

The parametric form of the equation of the plane $P$ in $R^3$ is given by the equation:

$$ \left[ \begin{matrix} x = p_1 + su_1 + tv_1 \\\ y = p_2 + su_2 + tv_2 \\\ z = p_3 + su_3 + tv_3 \end{matrix} \right. \space , s,t\in R $$

If $S=\{\vec v_1, \vec v_2, \cdots, \vec v_k\}$ is a set of vectors in $R^n$, then the span of $\vec v_1, \vec v_2, \cdots, \vec v_k$, denoted $span(S)$ or $span(\vec v_1, \vec v_2, \cdots, \vec v_k)$, is the set of all their linear combinations: $$ span(S)=\{\vec v \in R^n \mid \vec v = c_1\vec v_1 + c_2\vec v_2 + \cdots + c_k\vec v_k \text{ for some } c_1,\cdots,c_k \in R\} $$ If $span(S)=R^n$, we say that $S$ is a spanning set for $R^n$, or that $\vec v_1, \cdots, \vec v_k$ span $R^n$.

A set of vectors $\vec v_1, \cdots, \vec v_n$ is called linearly independent if the equation: $$ c_1\vec v_1 + c_2\vec v_2 + \cdots + c_n\vec v_n = \vec 0 $$ has exactly one solution: $$ c_1=c_2=\cdots=c_n=0 $$ If $\vec v_1, \vec v_2, \cdots, \vec v_n$ are $n$ vectors in $R^n$, they are linearly independent if and only if they span $R^n$.
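As a small worked illustration (the vectors are chosen for this example and are not taken from the notes), let $\vec v_1 = [1, 2]$ and $\vec v_2 = [2, 4]$ in $R^2$. Then: $$ 2\vec v_1 - \vec v_2 = \vec 0 $$ so $c_1 = 2$, $c_2 = -1$ is a solution other than $c_1=c_2=0$, and the two vectors are linearly dependent. By contrast, for the standard unit vectors $\vec e_1, \vec e_2$: $$ c_1\vec e_1 + c_2\vec e_2 = \left[ \begin{matrix} c_1 \\\ c_2 \end{matrix} \right] = \vec 0 $$ forces $c_1 = c_2 = 0$, so $\vec e_1, \vec e_2$ are linearly independent.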

A system of linear equations is a finite set of linear equations, each with the same variables. $$ \begin{matrix} a_{11}x_1 &+ a_{12}x_2 &+ \cdots &+ a_{1n}x_n &=b_1 \\\ a_{21}x_1 &+ a_{22}x_2 &+ \cdots &+ a_{2n}x_n &=b_2 \\\ \cdots & \cdots & & \cdots & \cdots\\\ a_{m1}x_1 &+ a_{m2}x_2 &+ \cdots &+ a_{mn}x_n &=b_m \end{matrix} $$

The system is called homogeneous if all $b_i$ are 0.

A system is said to be consistent if it has at least one solution.

Every system of linear equations has either no solution, exactly one solution, or infinitely many solutions.

A system of $m$ linear equations in $n$ variables can also be written as an augmented matrix: $$ \begin{matrix} \left[ \begin{array}{cccc | c} a_{11} & a_{12} & \cdots &a_{1n} & b_1 \\\ a_{21} & a_{22} & \cdots &a_{2n} & b_2 \\\ \cdots & \cdots & \cdots &\cdots & \cdots\\\ a_{m1} & a_{m2} & \cdots &a_{mn} & b_m \end{array} \right] \end{matrix} $$

Week 5

The following three elementary row operations (EROs) don’t change the solutions:

  1. Swapping two equations. $$ R_i \leftrightarrow R_j $$

  2. Multiplying both sides of one equation by a non-zero scalar $c\in R$. $$ R_i \rightarrow cR_i $$

  3. Adding a multiple of one equation to another. $$ R_i \rightarrow R_i + cR_j $$

An augmented matrix is in row echelon form if:

  1. Any rows in which all entries are 0 are at the bottom.
  2. In each non-zero row, the leftmost non-zero entry (called the leading entry or the pivot) has all zeros below it.

A variable whose column in the REF has no leading entry is called a free variable. For example, if the columns corresponding to $x,y,w$ have leading entries while the column corresponding to $z$ doesn’t, then $z$ is a free variable.

Use Gaussian elimination to reach REF.

An augmented matrix is in reduced row echelon form if it satisfies the following conditions:

  1. It is in row echelon form.
  2. The leading entries are all ones.
  3. Each column containing a leading 1 has zeros everywhere else in that column.

Use Gauss-Jordan elimination to reach RREF.
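The elimination procedure can be sketched in Python using only the three elementary row operations. This is a minimal illustration, not a definitive implementation: the function name `rref`, the use of `Fraction` for exact arithmetic, and the sample system are all assumptions made here.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no leading entry in this column
        # ERO 1: swap rows so the pivot sits in pivot_row.
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # ERO 2: scale the pivot row so its leading entry becomes 1.
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        # ERO 3: subtract multiples of the pivot row to clear the rest of the column.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

# Illustrative system: x + 2y = 5 and 3x + 4y = 6, as an augmented matrix.
augmented = [[1, 2, 5], [3, 4, 6]]
result = rref(augmented)
print(result)  # rows [1, 0, -4] and [0, 1, 9/2], i.e. x = -4, y = 9/2
```

Exact fractions are used so that the zero tests are reliable; with floating-point entries a tolerance would be needed instead.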

Week 6

Let $A=(a_{ij})$ be an $m\times n$ matrix. The transpose of $A$ is the $n \times m$ matrix $A^T=(a_{ji})$ obtained by swapping the rows and columns of $A$.

Transposition is self-inverse: for any matrix $A$, $(A^T)^T=A$.

Let $A$ be a square matrix:

Week 7

$$ \left[ \begin{matrix} ax_1 + bx_2 + cx_3 \\\ dx_1 + ex_2 + fx_3 \\\ gx_1 + hx_2 + ix_3 \end{matrix} \right]= \left[ \begin{matrix} a & b & c \\\ d & e & f \\\ g & h & i \end{matrix} \right] \left[ \begin{matrix} x_1 \\\ x_2 \\\ x_3 \end{matrix} \right] $$

Let $A$ be a matrix. An inverse for $A$ is a matrix $B$ such that: $$ AB=I \text{ and } BA = I $$ It follows that $A$ can only have an inverse if it is square.

The inverse of $A$ is denoted as $A^{-1}$. For a square matrix $A$, if $A^{-1}A=I$, then $AA^{-1}=I$ must also hold.

Properties (suppose $A,B$ are invertible with inverses $A^{-1},B^{-1}$):

For $2\times 2$ matrices:

Suppose $A=\left[\begin{matrix}a & b \\\ c & d\end{matrix}\right]$, then $\det(A)=ad-bc$. If $\det(A)\neq 0$, the inverse of $A$ is: $$ A^{-1}=\frac{1}{\det(A)} \left[ \begin{matrix} d & -b \\\ -c & a \end{matrix} \right] $$
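The $2\times 2$ formula translates directly into code. The sketch below assumes integer entries so that `Fraction` can keep the result exact; the function name `inverse_2x2` and the sample matrix are illustrative choices.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 integer matrix [[a, b], [c, d]] via the determinant formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is 0)")
    # (1 / det) * [[d, -b], [-c, a]]
    return [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]

# Illustrative matrix with determinant 2*3 - 1*5 = 1.
A = [[2, 1], [5, 3]]
A_inv = inverse_2x2(A)
print(A_inv)  # entries 3, -1, -5, 2
```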

To calculate the inverse of an $n \times n$ matrix $A$:

  1. Construct $[A|I]$.
  2. Do EROs on the whole augmented matrix until the left-hand side is in REF. If the left-hand side has a row of zeros (meaning the rows of $A$ are linearly dependent), then $A$ is not invertible.
  3. Continue until it has the form $[I|B]$, then $B=A^{-1}$.
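The three steps above can be sketched as a short Python routine. The function name `inverse`, the exact `Fraction` arithmetic, and the sample matrix are assumptions made for this illustration; the routine reduces column by column rather than stopping at REF first, which reaches the same $[I|B]$.

```python
from fractions import Fraction

def inverse(a):
    """Invert a square matrix by row-reducing [A | I] to [I | B]."""
    n = len(a)
    # Step 1: construct the augmented matrix [A | I].
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Find a usable pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            # Step 2: no pivot means the left-hand side gets a row of zeros.
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]   # swap rows
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]       # scale pivot row to a leading 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Step 3: the left half is now I, so the right half is the inverse.
    return [row[n:] for row in aug]

A = [[2, 1], [5, 3]]  # illustrative matrix
print(inverse(A))     # entries 3, -1, -5, 2
```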

An $n\times n$ matrix is invertible if and only if either of the following equivalent conditions holds:

  1. Its REF doesn’t have a row of zeros.
  2. Its RREF is $I_n$.

Suppose A is an $n \times n$ matrix, giving a system of linear equations $A\vec x = \vec b$. If A is invertible, then the system has a unique solution, given by $\vec x = A^{-1}\vec b$.
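A small worked example, with $A$ and $\vec b$ chosen purely for illustration: $$ A=\left[\begin{matrix}2 & 1 \\\ 5 & 3\end{matrix}\right],\quad \vec b=\left[\begin{matrix}1 \\\ 2\end{matrix}\right] $$ Here $\det(A)=2\cdot 3-1\cdot 5=1$, so: $$ A^{-1}=\left[\begin{matrix}3 & -1 \\\ -5 & 2\end{matrix}\right],\quad \vec x=A^{-1}\vec b=\left[\begin{matrix}3\cdot 1-1\cdot 2 \\\ -5\cdot 1+2\cdot 2\end{matrix}\right]=\left[\begin{matrix}1 \\\ -1\end{matrix}\right] $$ Substituting back gives $2(1)+1(-1)=1$ and $5(1)+3(-1)=2$, as required.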

Week 8

An elementary matrix is an $n\times n$ matrix that is obtained from the identity matrix $I_n$ by doing a single elementary row operation, so there are three types of elementary matrix.

Let $E$ be the elementary matrix obtained by performing some ERO on $I_n$. The result of $EA$ is the same as performing that ERO on $A$.

Every elementary matrix $E$ is invertible, and $E^{-1}$ is also an elementary matrix.

To undo the effect of $E$ on $A$, multiply by $E^{-1}$: $E^{-1}(EA)=A$.

A matrix which has a whole row or column of zeros cannot be invertible.

A square matrix $A$ is invertible if and only if $\det(A)\neq 0$.

If $A$ is upper triangular or lower triangular, then $\det(A)$ is the product of the diagonal entries.

Week 9


If $B$ is obtained from $A$ by swapping two rows, $\det(B)=-\det(A)$.

If $B$ is obtained from $A$ by scaling one row by $\lambda$, $\det(B)=\lambda\det(A)$.

If $B$ is obtained from $A$ by adding a scalar multiple of one row to another, $\det(B)=\det(A)$.

Let $A,B$ be $n\times n$ matrices, then $\det(A)\det(B)=\det(AB)$.
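These row-operation rules give a practical way to compute a determinant: reduce to upper triangular form, track the sign changes from swaps, and multiply the diagonal. The following Python sketch assumes exact `Fraction` arithmetic; the names `det` and `matmul` and the sample matrices are illustrative.

```python
from fractions import Fraction

def det(a):
    """Determinant via row reduction to upper triangular form."""
    rows = [[Fraction(x) for x in row] for row in a]
    n = len(rows)
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero will sit on the diagonal
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign        # swapping two rows flips the sign
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            # Adding a multiple of one row to another leaves det unchanged.
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    product = Fraction(1)
    for i in range(n):
        product *= rows[i][i]   # triangular: product of the diagonal entries
    return sign * product

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[0, 2], [3, 4]]  # illustrative matrices
B = [[1, 2], [3, 4]]
print(det(A), det(B))     # -6 -2
print(det(matmul(A, B)))  # 12, which equals det(A) * det(B)
```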

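Let $M$ be an $n \times n$ matrix. Suppose $\vec v$ is a non-zero $n\times 1$ column vector, and $\lambda \in R$ is a scalar such that: $$ M\vec v = \lambda \vec v $$ Then $\lambda$ is called an eigenvalue of $M$, and $\vec v$ is called an eigenvector of $M$ with eigenvalue $\lambda$. The eigenvalues of $M$ are the solutions $\lambda$ of:

$$ \det(M-\lambda I)=0 $$

The polynomial $\det(M-\lambda I)$ in $\lambda$ is called the characteristic polynomial of $M$.

An $n \times n$ matrix has at most $n$ distinct eigenvalues.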
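As a worked example (the matrix is chosen for illustration), take: $$ M=\left[\begin{matrix}2 & 1 \\\ 1 & 2\end{matrix}\right],\quad \det(M-\lambda I)=(2-\lambda)^2-1=\lambda^2-4\lambda+3=(\lambda-1)(\lambda-3) $$ so the eigenvalues are $\lambda=1$ and $\lambda=3$. Solving $M\vec v=\lambda\vec v$ in each case gives, for instance: $$ M\left[\begin{matrix}1 \\\ -1\end{matrix}\right]=1\cdot\left[\begin{matrix}1 \\\ -1\end{matrix}\right],\quad M\left[\begin{matrix}1 \\\ 1\end{matrix}\right]=3\cdot\left[\begin{matrix}1 \\\ 1\end{matrix}\right] $$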

Week 10

The trace of $A$ is the sum of its diagonal entries: $$ \operatorname{tr}(A) = a_{11}+a_{22}+\cdots+a_{nn} $$

Let A be an $n\times n$ matrix, then:

Let $U$ be a vector space, let $V\subset U$ be a non-empty subset. $V$ is called a subspace of $U$ if it satisfies the following two properties:

  1. If $v_1,v_2\in V$, then $v_1+v_2\in V$ ($V$ is closed under addition).
  2. If $v\in V, c\in R$, then $cv\in V$ ($V$ is closed under scalar multiplication).

An $n\times n$ matrix $A$ has determinant not equal to 0 if and only if:

  1. Its columns are linearly independent.
  2. Its rows are linearly independent.

A basis of a vector space $V$ is a set $S=\{\vec v_1, \vec v_2, \cdots, \vec v_n\}$ such that:

  1. $span(S)=V$.
  2. $S$ is linearly independent.


  1. Any two bases $S$ and $S'$ of $V$ have the same number of elements.
  2. The dimension is the size of the smallest spanning set one can find.
  3. The dimension is the size of the biggest set of linearly independent vectors one can find.

Let $A$ be an $n\times n$ matrix and $\lambda\in R$ a scalar. The $\lambda$-eigenspace of $A$ is the set of all solutions $\vec v$ of the equation $A\vec v=\lambda \vec v$, including $\vec 0$.

The algebraic multiplicity of an eigenvalue is the number of times it appears as a root of $\det(A-\lambda I)$.

The geometric multiplicity of an eigenvalue $\lambda$ is the dimension of the $\lambda$-eigenspace of $A$.

For each eigenvalue $\lambda$ of an $n\times n$ matrix: $$ 1\leq \text{geometric multiplicity} \leq \text{algebraic multiplicity} \leq n $$ Let $\lambda_1,\cdots,\lambda_k$ be the distinct eigenvalues of $A$. If complex eigenvalues are counted, the sum of their algebraic multiplicities is $n$.

If $A$ is triangular, the eigenvalues are the diagonal entries $a_{11},a_{22},\cdots,a_{nn}$.

Suppose the $j$-th column of a matrix $A$ has all entries 0, except possibly the $a_{jj}$ entry. Then $a_{jj}$ is an eigenvalue of $A$ and $\vec e_j$ is a corresponding eigenvector.

Let $X$ be an $n\times n$ matrix with eigenvector $\vec v$ and eigenvalue $\lambda$. Then for any $k\geq0$, $X^k$ has eigenvalue $\lambda^k$ with eigenvector $\vec v$.

Week 11

Let $A$ be any $n\times n$ matrix. $A$ is diagonalizable if there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $A=PDP^{-1}$.

An $n\times n$ matrix $A$ is diagonalizable if and only if we can find $n$ linearly independent eigenvectors $\vec v_1, \vec v_2, \cdots, \vec v_n$, with eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_n$.

Then $A=PDP^{-1}$, where $P$ has columns $\vec v_1, \vec v_2, \cdots, \vec v_n$ and $D=\left[\begin{matrix}\lambda_1 & & & \\\ & \lambda_2 & & \\\ & & \ddots & \\\ & & & \lambda_n\end{matrix}\right]$.

If $A=PDP^{-1}$ is a diagonalizable matrix, then for any $k\geq 0$: $$ A^k=PD^kP^{-1} $$
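The power formula can be checked on a small example. In the Python sketch below, the matrix `A`, its eigenvectors (the columns of `P`), the eigenvalues, and the exponent `k` are all assumptions chosen for illustration; `P_inv` was computed by hand from the $2\times 2$ inverse formula.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diag_power(eigenvalues, k):
    """The diagonal matrix D^k: raise each diagonal entry to the k-th power."""
    n = len(eigenvalues)
    return [[eigenvalues[i] ** k if i == j else 0 for j in range(n)]
            for i in range(n)]

A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]            # columns are eigenvectors [1, 1] and [1, -1]
eigenvalues = [3, 1]             # eigenvalues in the same order as the columns of P
P_inv = [[Fraction(1, 2), Fraction(1, 2)],
         [Fraction(1, 2), Fraction(-1, 2)]]

k = 3
via_diagonalization = matmul(matmul(P, diag_power(eigenvalues, k)), P_inv)
direct = matmul(matmul(A, A), A)  # A^3 by repeated multiplication

print(direct)                         # [[14, 13], [13, 14]]
print(via_diagonalization == direct)  # True
```

The benefit shows up for large exponents: powers of a diagonal matrix cost one exponentiation per diagonal entry, instead of repeated full matrix products.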

Leslie matrix: $$ \left[\begin{matrix} b_1 & b_2 & b_3 & b_4\\\ s_1 & 0 & 0 & 0\\\ 0 & s_2 & 0 & 0 \\\ 0 & 0 & s_3 & 0 \end{matrix}\right] $$

$$ \vec x_{n+1} = L\vec x_n=L^{n+1}\vec x_0 $$
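The recurrence can be simulated directly. The sketch below assumes the usual reading of a Leslie matrix, in which the $b_i$ are birth rates and the $s_i$ are survival rates of the age classes (the notes do not define them); every numeric value is hypothetical.

```python
from fractions import Fraction

def matvec(m, x):
    """Matrix-vector product, with m given as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]

def project(leslie, x0, steps):
    """Apply x_{n+1} = L x_n the given number of times, starting from x0."""
    x = list(x0)
    for _ in range(steps):
        x = matvec(leslie, x)
    return x

half = Fraction(1, 2)
# Hypothetical rates: first row holds b_1..b_4, the sub-diagonal holds s_1..s_3.
L = [
    [0,    1,    2,    1],
    [half, 0,    0,    0],
    [0,    half, 0,    0],
    [0,    0,    half, 0],
]
x0 = [100, 0, 0, 0]  # hypothetical initial population, all in the first age class

print(project(L, x0, 1))  # [0, 50, 0, 0]
print(project(L, x0, 2))  # [50, 0, 25, 0]
```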

Week 12

A probability vector is a vector $\vec v = \left[\begin{matrix}v_1\\\ v_2\\\ \cdots \\\ v_n\end{matrix}\right]\in R^n$ such that every entry is non-negative ($v_i \geq 0$) and the entries sum to 1 ($v_1+v_2+\cdots+v_n=1$).

A stochastic matrix is a square matrix $P$ such that each column is a probability vector.

A matrix is positive if all of its entries are positive (greater than 0).

A stochastic matrix $P$ is regular if there exists $n\geq 1$ such that $P^n$ is positive.

A Markov chain is a stochastic model. It consists of finitely many variables called states, and the state vector is denoted as: $$ \vec x_n = \left[\begin{matrix}x_1\\\ x_2\\\ \cdots\\\ x_k\end{matrix}\right]\in R^k $$ The probability of moving from one state to another is called the transition probability. The transition probability of moving from state $j$ to state $i$ is denoted by: $$ P_{ij} $$

An eigenvector $\vec v$ with eigenvalue 1 for the transition matrix $P$ is called a steady state vector.

  1. $P\vec x = \vec x$.
  2. Non-negative entries summing to the total number of entities in the system.

A steady state probability vector is a probability vector $\vec x$ satisfying $P\vec x=\vec x$.

A Markov chain always has a steady state vector and a steady state probability vector.

A Markov chain is regular if its transition matrix $P$ is regular.
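For a regular chain, repeatedly applying $P$ to any probability vector approaches the steady state probability vector, which suggests a simple numerical sketch. The transition matrix below is hypothetical (each column is a probability vector), and the function name `steady_state` and the step count are assumptions.

```python
def matvec(p, x):
    """Matrix-vector product, with p given as a list of rows."""
    return [sum(p_ij * x_j for p_ij, x_j in zip(row, x)) for row in p]

def steady_state(p, steps=100):
    """Approximate the steady state probability vector by iterating x -> P x."""
    n = len(p)
    x = [1.0 / n] * n  # start from the uniform probability vector
    for _ in range(steps):
        x = matvec(p, x)
    return x

# Hypothetical two-state chain: column j holds the probabilities of leaving state j.
P = [
    [0.9, 0.5],
    [0.1, 0.5],
]

x = steady_state(P)
print(x)             # close to [5/6, 1/6]
print(matvec(P, x))  # approximately the same vector, since P x = x
```

Solving $P\vec x=\vec x$ by hand for this matrix gives $x_1 = 5x_2$, so the exact answer is $[5/6, 1/6]$. The iteration is used here only because it mirrors how the chain evolves.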