A foundational course for understanding mathematical modeling and machine learning.
This course introduces the fundamental ideas behind linear algebra, including topics such as vectors, matrices and vector spaces.
As you progress through this course, you will find links to related topics in other courses (such as algebra). This interconnected structure is designed to help you quickly revisit prerequisite ideas without breaking your workflow.
We begin with what a vector is, why we care about it, and when it is useful. A vector, at least one of finite length, is an ordered collection of numbers called components. The components generally represent different variables. For example, if we wanted to count the number of cars and the number of people in those cars, it wouldn't make sense to mix the two numbers together, so we keep them as separate components. A vector is the collection of these components, and it can be written in several ways. A common way is to write a vector \(v\) as \(v = \begin{bmatrix} c \\ p \end{bmatrix}\), where in this example I am using \(c\) to represent the number of cars and \(p\) to represent the number of people. A vector written like this is called a column vector. We can also write the vector as a row, \(v^T = [c, p]\), where the superscript \(T\) denotes the transpose. For vectors, the transpose takes a column vector to a row vector and a row vector to a column vector. Note that occasionally we may also write vectors in the format \(v^T = (c, p)\), though this is less common. The first component of \(v\) (also the first component of \(v^T\)) is \(c\) and the second component is \(p\). This example vector is a two dimensional vector. For now we will stick to 2 and 3 dimensional vectors for our examples, but a vector can have any number of dimensions. Now that we have an idea of what a vector is, we will take a look at what operations we can perform on vectors.
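To make the column and row notation concrete, here is a minimal Python sketch. It assumes a column vector is stored as a list of one-element rows and a row vector as a single row; the `transpose` helper and the counts 3 and 7 are hypothetical choices for illustration, not part of the course material.

```python
def transpose(matrix):
    """Swap the rows and columns of a rectangular list of lists."""
    return [list(row) for row in zip(*matrix)]


# Hypothetical counts: c cars and p people.
c, p = 3, 7

v = [[c], [p]]      # column vector: two rows, one entry each
v_T = transpose(v)  # row vector: one row, two entries

print(v_T)             # [[3, 7]]
print(transpose(v_T))  # [[3], [7]]
```

Transposing twice returns the original column vector, which matches the statement that the transpose takes columns to rows and rows to columns.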
If you want the fast version, below is a short video introducing vectors.
In the previous section we discovered what a vector is, but what useful things can you do with vectors? To see this, we will continue with the example from the last section. Recall that \(v = \begin{bmatrix}c \\ p\end{bmatrix}\). Now imagine that you and a friend are trying to count the total number of cars and the total number of people entering a gated area with two different entrances, and that one entrance cannot be seen while sitting at the other. So you and your friend decide to split up, with each of you counting cars and people at one entrance. Suppose your friend counts 20 cars and 32 people at their entrance, and you count 15 cars and 40 people at yours. How many total cars and people entered the gated area? Naturally, we would add the cars to the cars and the people to the people to find the totals. In vector format, \(v_{\mbox{friend}} = \begin{bmatrix}20 \\ 32 \end{bmatrix}\) and \(v_{\mbox{you}} = \begin{bmatrix}15 \\ 40 \end{bmatrix}\), so the total is \(v_{\mbox{total}} = v_{\mbox{friend}} + v_{\mbox{you}} = \begin{bmatrix}20 \\ 32 \end{bmatrix} + \begin{bmatrix}15 \\ 40 \end{bmatrix} = \begin{bmatrix}20+15 \\ 32+40 \end{bmatrix} = \begin{bmatrix}35 \\ 72 \end{bmatrix}\). Notice that vector addition works exactly as we would hope: the components add together. It wouldn't make much sense to add the number of cars to the number of people, but adding cars to cars and people to people does make sense. In some way this highlights the essence of a vector: it is a collection of different things that can be represented by numbers. In our example above, the numbers are counting (also known as natural) numbers, but a vector can contain any type of number that makes sense for the situation, including real numbers and even complex numbers as appropriate.
Most of the time in this course, we will be working with vectors whose components are real numbers (and for many computations, we will make things "simpler" by restricting ourselves to integers).
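The entrance-counting example above can be sketched in a few lines of Python. This is only an illustration that assumes vectors are stored as plain lists; the helper name `add_vectors` is made up for this sketch.

```python
def add_vectors(u, v):
    """Add two vectors component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a + b for a, b in zip(u, v)]


v_friend = [20, 32]  # [cars, people] counted by your friend
v_you = [15, 40]     # [cars, people] counted by you

v_total = add_vectors(v_friend, v_you)
print(v_total)  # [35, 72]
```

The length check reflects the fact that addition only makes sense when both vectors have the same components, here cars and people.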
Another operation that will be of interest is scalar multiplication. Imagine I have a number \(a\), which is a scalar (as in only 1 number, not multiple components). Then I can "scale" a vector by multiplying that vector by \(a\). Here is how this works: suppose I have a three dimensional vector \(m = \begin{bmatrix}x \\ y\\ z\end{bmatrix}\). Then \(am = \begin{bmatrix}ax \\ ay\\ az\end{bmatrix}\), that is, \(a\) multiplies each of the components of the vector. For example, in the above vector let \(a = 3, x=-3.2, y = 1.1, z = 0.9\). Then \(m = \begin{bmatrix}-3.2 \\ 1.1\\ 0.9\end{bmatrix}\), and \(am = \begin{bmatrix}3(-3.2) \\ 3(1.1)\\ 3(0.9)\end{bmatrix} = \begin{bmatrix}-9.6 \\ 3.3\\ 2.7\end{bmatrix}\). I should note that there are other operations that will be of interest to us later. However, the primary operations we will be using are of this form: vector addition, scalar multiplication, or a combination of the two.
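As a rough check of the scalar multiplication example, here is a small Python sketch. The helper name `scale` is an assumption made for illustration. Because Python stores decimals such as 1.1 as binary floating-point values, the products may differ from the hand-computed values in the last digits, so the sketch compares them with `math.isclose` rather than exact equality.

```python
import math


def scale(a, v):
    """Multiply every component of the vector v by the scalar a."""
    return [a * x for x in v]


a = 3
m = [-3.2, 1.1, 0.9]
am = scale(a, m)

# Values computed by hand in the text.
expected = [-9.6, 3.3, 2.7]
matches = all(math.isclose(x, y) for x, y in zip(am, expected))
print(matches)  # True
```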
If you want the fast version, below is a short video introducing vector operations.
We will now move into more abstract mathematics. Suppose we have a set \(V\) of vectors. Additionally, suppose that we have an operation, called vector addition, such that if we have two vectors \(\mathbf{x}, \mathbf{y}\) in \(V\), then the sum \(\mathbf{x}+\mathbf{y}\) (assuming of course they are the same size) is also a vector in \(V\). Suppose further that we have an operation, called scalar multiplication, such that for any scalar \(c\) we have that \(c \cdot \mathbf{x}\) is also a vector in \(V\). Then the set \(V\) along with vector addition and scalar multiplication is called a vector space if all of the following properties hold:
Now let us suppose that \(V\) is a vector space (i.e. a set with vector addition, scalar multiplication and the above listed properties). Then we call a set \(W\) a subspace (or a linear subspace) of \(V\) if \(W\) is a subset of \(V\) which also satisfies all of the properties of a vector space. Note that this requires \(W\) to contain the \(\mathbf{0}\) vector and the additive inverse of \(\mathbf{x}\) (i.e. \(-1\mathbf{x}\)) for all \(\mathbf{x} \in W\); both requirements follow from the properties of vector spaces. An example of a subspace of the vector space \(\mathbb{R}^n\) is the set \(W = \{\mathbf{x}\in \mathbb{R}^n | A\mathbf{x} = \mathbf{0}\}\), for an appropriately sized matrix \(A\). This example subspace is also called the null space of the matrix \(A\), as you should remember from earlier in this course.
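The null space example can be spot-checked numerically. The sketch below assumes a small hypothetical matrix `A` and two hand-picked vectors `x` and `y` that satisfy \(A\mathbf{x} = \mathbf{0}\); the helper names are made up for illustration. It checks a few instances of the subspace requirements, which is a sanity check and not a proof.

```python
def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def in_null_space(A, x):
    """Return True when A x is the zero vector."""
    return all(entry == 0 for entry in mat_vec(A, x))


# Hypothetical 2x3 matrix, chosen only for this example.
A = [[1, 2, -1],
     [2, 4, -2]]

x = [1, 0, 1]
y = [-2, 1, 0]

x_plus_y = [a + b for a, b in zip(x, y)]
five_x = [5 * a for a in x]
minus_x = [-1 * a for a in x]

print(in_null_space(A, x), in_null_space(A, y))  # True True
print(in_null_space(A, x_plus_y))                # True
print(in_null_space(A, five_x))                  # True
print(in_null_space(A, [0, 0, 0]))               # True
print(in_null_space(A, minus_x))                 # True
```

Integer entries are used so that the comparison with zero is exact.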
We will now discuss the concept of linear independence. Let \(V\) be a vector space, with \(\mathbf{a}_1,\dots,\mathbf{a}_m \in V\). This set of vectors is said to be linearly independent if, for every \(\mathbf{x} \in V\), there is at most one tuple of scalars \(c_1,\dots,c_m\) such that $$ \mathbf{x} = c_1\mathbf{a}_1 + \dots + c_m \mathbf{a}_m. $$
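As a small worked illustration of this definition (the specific vectors are chosen here only as an example), take \(V = \mathbb{R}^2\) with \(\mathbf{a}_1 = \begin{bmatrix}1 \\ 0\end{bmatrix}\) and \(\mathbf{a}_2 = \begin{bmatrix}0 \\ 1\end{bmatrix}\). For any \(\mathbf{x} = \begin{bmatrix}x_1 \\ x_2\end{bmatrix}\) we have $$ c_1\mathbf{a}_1 + c_2\mathbf{a}_2 = \begin{bmatrix}c_1 \\ c_2\end{bmatrix} = \begin{bmatrix}x_1 \\ x_2\end{bmatrix}, $$ which forces \(c_1 = x_1\) and \(c_2 = x_2\). There is only one possible tuple of scalars, so these two vectors are linearly independent. By contrast, take \(\mathbf{a}_1 = \begin{bmatrix}1 \\ 2\end{bmatrix}\) and \(\mathbf{a}_2 = \begin{bmatrix}2 \\ 4\end{bmatrix}\). The vector \(\mathbf{x} = \begin{bmatrix}2 \\ 4\end{bmatrix}\) can be written in two different ways, $$ \mathbf{x} = 2\mathbf{a}_1 + 0\mathbf{a}_2 = 0\mathbf{a}_1 + 1\mathbf{a}_2, $$ so more than one tuple of scalars works and these two vectors are not linearly independent.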
An important concept in vector spaces is the dimension. One might think that we simply measure this by counting the number of components. However, what we are actually interested in is the largest number of linearly independent vectors that the space contains. In some sense this truly measures the dimension: if the vectors in a space each have 3 components, but no more than 2 of them can be linearly independent at once, then the potential solutions in this space are restricted to two dimensions and the third dimension becomes "redundant".
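One way to see the "redundant" third dimension is to count how many linearly independent vectors a set contains. The sketch below does this with Gaussian elimination, which is a standard technique but is not introduced in the text above; the `rank` helper and the three example vectors are assumptions made for illustration. Exact fractions are used so that rounding cannot hide a zero.

```python
from fractions import Fraction


def rank(vectors):
    """Count the linearly independent vectors using Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a row at or below position `found` with a nonzero entry.
        pivot = next(
            (r for r in range(found, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        # Clear this column in every row below the pivot row.
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


# Three vectors with 3 components each; the third is the sum of the first two.
vectors = [[1, 0, 1],
           [0, 1, 1],
           [1, 1, 2]]

print(rank(vectors))  # 2
```

Every vector here has 3 components, yet only 2 of them are linearly independent, so the space they span is two dimensional.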
More to come!