# A Linear Algebra Reminders

## A.1 Vectors

We write \(v \in \mathbb{R}^d\) if \(v = (v_1, \ldots, v_d)\) for numbers \(v_1, \ldots, v_d \in\mathbb{R}\). We say that \(v\) is a \(d\)-dimensional vector, and \(\mathbb{R}^d\) is the \(d\)-dimensional Euclidean space. Vectors are often graphically represented as “column vectors”: \[\begin{equation*} v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_d \end{pmatrix}. \end{equation*}\]

If \(u,v\in\mathbb{R}^d\) are two vectors, the **inner product** of \(u\) and \(v\)
is given by
\[\begin{equation}
u^\top v
= \sum_{i=1}^d u_i v_i. \tag{A.1}
\end{equation}\]
Note that the two vectors must have the same length for the inner product
to exist.

Using this notation, the **Euclidean length** of a vector \(v\)
can be written as
\[\begin{equation*}
\|v\|
= \sqrt{\sum_{i=1}^d v_i^2}
= \sqrt{v^\top v}.
\end{equation*}\]
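As a concrete illustration, the inner product and the Euclidean length can be computed directly from these definitions. The sketch below uses plain Python lists; the function names `inner` and `norm` are our own choices, not a standard API:

```python
import math

def inner(u, v):
    # u^T v = sum_i u_i v_i; both vectors must have the same length
    assert len(u) == len(v), "vectors must have the same dimension"
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def norm(v):
    # ||v|| = sqrt(v^T v)
    return math.sqrt(inner(v, v))

# For example, (1, 2, 3)^T (4, 5, 6) = 4 + 10 + 18 = 32,
# and the vector (3, 4) has Euclidean length 5.
```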

Vectors \(v_1, \ldots, v_n\) are said to be **orthogonal**
if \(v_i^\top v_j = 0\) for all \(i \neq j\). The vectors are said to be
**orthonormal** if they are orthogonal and satisfy \(\|v_i\| = 1\)
for all \(i\).

## A.2 Matrices

We write \(A \in \mathbb{R}^{m\times n}\) if \[\begin{equation*} A = \begin{pmatrix} a_{1,1} & \ldots & a_{1,n}\\ a_{2,1} & \ldots & a_{2,n}\\ \vdots & \ddots & \vdots\\ a_{m,1} & \ldots & a_{m,n} \end{pmatrix}, \end{equation*}\] where \(a_{i,j}\), sometimes also written as \(a_{ij}\), are numbers for \(i \in \{1, \ldots, m\}\) and \(j \in \{1, \ldots, n\}\).

### A.2.1 Transpose

If \(A \in \mathbb{R}^{m\times n}\), then the **transpose** of \(A\) is the matrix
\(A^\top \in \mathbb{R}^{n\times m}\), with \((A^\top)_{ij} = a_{ji}\) for all \(i \in \{1, \ldots, n\}\) and \(j \in \{1, \ldots, m\}\). Graphically,
this can be written as
\[\begin{equation*}
A^\top
= \begin{pmatrix}
a_{1,1} & a_{2,1} & \ldots & a_{m,1} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1,n} & a_{2,n} & \ldots & a_{m,n}
\end{pmatrix}.
\end{equation*}\]

**Definition A.1** A matrix \(A\) is called **symmetric** if \(A^\top = A\).
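The transpose and the symmetry check translate directly into code; the following is a small sketch over nested Python lists (the helper names are ours):

```python
def transpose(A):
    # (A^T)_{ij} = a_{ji}: the rows of A become the columns of A^T
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    # Definition A.1: A is symmetric if A^T = A
    return transpose(A) == A

A = [[1, 2, 3],
     [4, 5, 6]]   # a 2x3 matrix; its transpose is 3x2
```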

### A.2.2 Matrix-vector Product

If \(A \in \mathbb{R}^{m \times n}\) and \(v \in \mathbb{R}^n\), then \(Av \in \mathbb{R}^m\) is the vector with \[\begin{equation*} (Av)_i = \sum_{j=1}^n a_{ij} v_j \end{equation*}\] for all \(i \in \{1, \ldots, m\}\).

If we consider \(v\) to be an \((n\times 1)\)-matrix instead of a vector, \(Av\) can also be interpreted as a matrix-matrix product between an \(m \times n\) and an \(n\times 1\) matrix. Using this convention, \(v^\top\) is then interpreted as a \(1 \times n\) matrix and if \(u\in\mathbb{R}^m\) we have \(u^\top A \in \mathbb{R}^{1 \times n} \cong \mathbb{R}^n\) with \[\begin{equation*} (u^\top A)_j = \sum_{i=1}^m u_i a_{ij} \end{equation*}\] for all \(j \in \{1, \ldots, n\}\). Going one step further, this notation also motivates the expression \(u^\top v\) in equation (A.1).
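Both products can be sketched in a few lines of Python (the function names are ours):

```python
def matvec(A, v):
    # (Av)_i = sum_j a_{ij} v_j; len(v) must equal the number of columns of A
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def vecmat(u, A):
    # (u^T A)_j = sum_i u_i a_{ij}; the result is treated as a plain vector again
    return [sum(u_i * row[j] for u_i, row in zip(u, A)) for j in range(len(A[0]))]

A = [[1, 2],
     [3, 4],
     [5, 6]]   # a 3x2 matrix
```

For this \(A\), `matvec(A, [1, 1])` gives `[3, 7, 11]` and `vecmat([1, 0, 1], A)` gives `[6, 8]`.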

### A.2.3 Matrix-matrix Product

If \(A \in \mathbb{R}^{\ell \times m}\) and \(B \in \mathbb{R}^{m\times n}\), then
\(AB \in \mathbb{R}^{\ell \times n}\) is the matrix with
\[\begin{equation*}
(AB)_{ik}
= \sum_{j=1}^m a_{ij} b_{jk}
\end{equation*}\]
for all \(i \in \{1, \ldots, \ell\}\) and \(k \in \{1, \ldots, n\}\).
This is called the **matrix product** of \(A\) and \(B\). Note
that \(A\) and \(B\) must have compatible shapes for the product to exist.
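As a sketch, the defining formula translates into a triple loop over plain Python lists (a naive implementation for illustration, not an efficient one):

```python
def matmul(A, B):
    # (AB)_{ik} = sum_j a_{ij} b_{jk}; the number of columns of A must
    # equal the number of rows of B
    l, m, n = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "incompatible shapes"
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(n)]
            for i in range(l)]
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives `[[19, 22], [43, 50]]`.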

Properties:

- The matrix product is **associative**: if \(A\), \(B\) and \(C\) are matrices with shapes such that \(AB\) and \(BC\) exist, then we have \(A(BC) = (AB)C\). It does not matter in which order we perform the matrix products here.
- The matrix product is **distributive** over addition: if \(A\), \(B\) and \(C\) have the correct shapes, we have \(A(B+C) = AB + AC\).
- The matrix product is *not* **commutative**: if \(AB\) exists, in general \(A\) and \(B\) don't have the correct shapes for \(BA\) to also exist, and even if \(BA\) exists, in general we have \(AB \neq BA\).
- Taking the transpose swaps the order in a matrix product: we have \[\begin{equation} (AB)^\top = B^\top A^\top \tag{A.2} \end{equation}\]
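These properties are easy to check numerically; the snippet below verifies equation (A.2) and the failure of commutativity for one concrete pair of matrices (the helper functions are our own):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

# (AB)^T = B^T A^T, with the order of the factors swapped:
transpose_rule_holds = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

# Even though both AB and BA exist here, they differ:
noncommutative = matmul(A, B) != matmul(B, A)
```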

### A.2.4 Rank

**Definition A.2** The **rank** of a matrix \(A \in \mathbb{R}^{m \times n}\) is the dimension of the
image space of \(A\):
\[\begin{equation*}
\mathop{\mathrm{rank}}(A)
= \dim \bigl\{ Av \bigm| v\in \mathbb{R}^n \bigr\}.
\end{equation*}\]

The rank can also be characterised as the largest number of linearly independent columns of \(A\), or the largest number of linearly independent rows of \(A\). Thus, for \(A \in \mathbb{R}^{m \times n}\) we always have \(\mathop{\mathrm{rank}}(A) \leq \min(m, n)\). The matrix \(A\) is said to have “full rank” if \(\mathop{\mathrm{rank}}(A) = \min(m, n)\).
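For instance, NumPy's `numpy.linalg.matrix_rank` computes the rank numerically (by counting nonzero singular values). Assuming NumPy is available, the examples below show a rank-deficient and a full-rank matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],     # = 2 * first row
              [3.0, 6.0]])    # = 3 * first row: only one independent row
rank_A = np.linalg.matrix_rank(A)   # 1 < min(3, 2), so A is rank-deficient

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])    # two linearly independent columns
rank_B = np.linalg.matrix_rank(B)   # min(3, 2) = 2, so B has full rank
```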

### A.2.5 Trace

**Definition A.3** The **trace** of a matrix \(A \in \mathbb{R}^{n\times n}\) is given by
\[\begin{equation*}
\mathop{\mathrm{tr}}(A) = \sum_{i=1}^n a_{ii}.
\end{equation*}\]

Properties:

- \(\mathop{\mathrm{tr}}(A+B) = \mathop{\mathrm{tr}}(A) + \mathop{\mathrm{tr}}(B)\)
- \(\mathop{\mathrm{tr}}(A^\top) = \mathop{\mathrm{tr}}(A)\)
- \(\mathop{\mathrm{tr}}(ABC) = \mathop{\mathrm{tr}}(BCA) = \mathop{\mathrm{tr}}(CAB)\). The individual matrices \(A, B, C\) don't need to be square for this relation to hold, but the relation only holds for cyclic permutations as shown. In general \(\mathop{\mathrm{tr}}(ABC) \neq \mathop{\mathrm{tr}}(ACB)\).
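The cyclic property can be checked numerically. In the sketch below the shapes \(2\times 3\), \(3\times 2\) and \(2\times 2\) are chosen so that \(ABC\), \(BCA\) and \(CAB\) are all square (the helper functions are ours):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    # tr(A) = sum of the diagonal entries of a square matrix
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 0, 2], [0, 1, 1]]    # 2x3
B = [[1, 1], [0, 2], [1, 0]]  # 3x2
C = [[2, 0], [1, 1]]          # 2x2

t_abc = trace(matmul(matmul(A, B), C))
t_bca = trace(matmul(matmul(B, C), A))
t_cab = trace(matmul(matmul(C, A), B))
# All three traces agree, even though ABC is 2x2 while BCA is 3x3.
```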

### A.2.6 Matrix Inverse

If \(A\) is a square matrix and if there is a matrix \(B\) such that
\(AB = I\), then \(A\) is called **invertible** and the matrix \(B\) is
called the **inverse** of \(A\), denoted by \(A^{-1} := B\).
Some important properties of the inverse:

- The inverse, if it exists, is unique.
- Left-inverse and right-inverse for matrices are the same: \(A^{-1} A = I\) holds if and only if \(A A^{-1} = I\).
- If \(A\) is symmetric and invertible, then \(A^{-1}\) is also symmetric. This is true because \(A (A^{-1})^\top = A^\top (A^{-1})^\top = (A^{-1} A)^\top = I^\top = I\), using \(A = A^\top\) and equation (A.2), and thus \((A^{-1})^\top\) is an inverse of \(A\). Since the inverse is unique, \((A^{-1})^\top = A^{-1}\).
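For a \(2\times 2\) matrix the inverse can be written down explicitly; the sketch below uses this closed form and checks \(AB = I\) numerically (the function names are ours):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(A):
    # Explicit inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]]
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "matrix is not invertible"
    return [[d / det, -b / det],
            [-c / det, a / det]]

A = [[4.0, 7.0],
     [2.0, 6.0]]
I = matmul(A, inv2(A))   # equals the identity up to floating-point rounding
```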

**Theorem A.1** Let \(A \in \mathbb{R}^{n\times n}\). Then the following statements are equivalent:

- \(A\) is invertible.
- \(A\) has full rank, *i.e.* \(\mathop{\mathrm{rank}}(A) = n\).
- The equation \(Ax = 0\) has \(x = 0\) as its only solution.
- The equation \(Ax = b\) has exactly one solution \(x\) for every \(b\in\mathbb{R}^n\).

### A.2.7 Orthogonal Matrices

**Definition A.4** A matrix \(U\) is called **orthogonal** if \(U^\top U = I = U U^\top\).

Some important properties of orthogonal matrices:

- If \(U\) is orthogonal, the inverse and the transpose are the same: \(U^\top = U^{-1}\).
- We have \(\| U x \|^2 = x^\top U^\top U x = x^\top x = \| x \|^2\). Thus, multiplying a vector \(x\) by an orthogonal matrix \(U\) does not change its length.
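Rotation matrices are the standard example of orthogonal matrices; the sketch below checks the length-preservation property for a \(2\times 2\) rotation (the helper names are ours):

```python
import math

def rotation(theta):
    # 2x2 rotation by angle theta; U^T U = I holds for every theta
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

U = rotation(0.7)
x = [3.0, 4.0]
# ||Ux|| = ||x||: the rotation does not change the length of x
length_preserved = abs(norm(matvec(U, x)) - norm(x)) < 1e-12
```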

### A.2.8 Positive Definite Matrices

**Definition A.5** A symmetric matrix \(A \in \mathbb{R}^{n\times n}\) is called **positive definite** if
\[\begin{equation*}
x^\top A x > 0
\end{equation*}\]
for all \(x \in \mathbb{R}^n\) with \(x\neq 0\). The matrix is called
**positive semi-definite** if
\[\begin{equation*}
x^\top A x \geq 0
\end{equation*}\]
for all \(x \in \mathbb{R}^n\).
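The quadratic form \(x^\top A x\) is easy to evaluate directly. The sketch below checks positivity for a handful of sample directions; note that finitely many samples illustrate, but do not prove, positive definiteness:

```python
def quadratic_form(A, x):
    # x^T A x = sum over i, j of x_i * a_{ij} * x_j
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

A = [[2.0, -1.0],
     [-1.0, 2.0]]   # symmetric; this particular matrix is positive definite

# A few nonzero directions; the quadratic form is strictly positive on each
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [-2.0, 3.0]]
all_positive = all(quadratic_form(A, x) > 0 for x in samples)
```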

## A.3 Eigenvalues

**Definition A.7** Let \(A \in\mathbb{R}^{n\times n}\) be a square matrix and \(\lambda\in\mathbb{R}\).
The number \(\lambda\) is called an **eigenvalue** of \(A\) if there
exists a vector \(x \neq 0\) such that \(A x = \lambda x\). Any
such vector \(x\) is called an **eigenvector** of \(A\) with eigenvalue \(\lambda\).

While there are many results about eigenvalues and eigenvectors in linear algebra, here we will only use a small number of them. We summarise what we need for this module:

- If \(A\) is idempotent, *i.e.* \(A^2 = A\), and \(x\) is an eigenvector with eigenvalue \(\lambda\), then we have \(\lambda x = A x = A^2 x = \lambda Ax = \lambda^2 x\). Thus we have \(\lambda^2 = \lambda\). This shows that the only eigenvalues possible for idempotent matrices are \(0\) and \(1\).

**Theorem A.2** Let \(A\in\mathbb{R}^{n\times n}\) be symmetric. Then there is an orthogonal
matrix \(U\) such that \(D := U A U^\top\) is diagonal. The diagonal
elements of \(D\) are the eigenvalues of \(A\) and the rows of \(U\)
are corresponding eigenvectors.
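Assuming NumPy is available, `numpy.linalg.eigh` computes this decomposition for symmetric matrices. Note that NumPy returns the eigenvectors as the *columns* of its matrix, so that matrix plays the role of \(U^\top\) in the theorem's notation:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric

eigenvalues, V = np.linalg.eigh(A)  # columns of V are eigenvectors of A

# V is orthogonal, and V^T A V is diagonal with the eigenvalues of A on
# the diagonal (so U = V^T in the notation of Theorem A.2):
D = V.T @ A @ V
```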