Intro to Matrices
Matrices are a natural extension of the vectors that we have been working with in the last couple of readings: where a vector is a collection of data of the same type ordered along a single dimension, a matrix is a collection of data of the same type ordered along two dimensions.
If you’ve taken a linear algebra course before, the idea of a matrix will be very familiar, but if you haven’t, you can think of a matrix as a collection of vectors of the same length lined up side-by-side. For example, a 3x3 matrix might look something like:
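As a rough illustration (the numbers are arbitrary and the names v1, v2, v3 and example_matrix are made up for this sketch), three vectors of length three placed side by side give a 3x3 matrix. The cbind() function used here is covered in the Constructing Matrices section:

```r
# Three arbitrary vectors, each of length three (illustrative values only)
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
v3 <- c(7, 8, 9)

# Lining the vectors up side by side as columns gives a 3x3 matrix
example_matrix <- cbind(v1, v2, v3)
example_matrix

# dim() reports the number of rows, then the number of columns
dim(example_matrix)
```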
Just as vectors are commonly used in social science because we usually don’t just have a single observation of data, but instead lots of observations (different survey respondents) that we might want to put into a vector, so too do we often have information not just on one type of measurement (age), but lots of different measurements about each observation (age, income, years of education, etc.). Matrices are commonly used to represent this type of data by using each row for an observation (a survey respondent), and each column for a different thing we’re measuring.
For example, suppose we surveyed three people. The first was twenty years old, had an income of 22,000 dollars, and had twelve years of education; the second was thirty-five years old, had an income of 65,000 dollars, and had sixteen years of education; and the third was fifty-five years old, had an income of 19,000 dollars, and had eleven years of education. We could represent that information in a matrix (one row per respondent, with columns for age, income, and years of education) that looks like:

$$
\begin{bmatrix}
20 & 22000 & 12 \\
35 & 65000 & 16 \\
55 & 19000 & 11
\end{bmatrix}
$$
And while it may not be immediately obvious why, this way of representing our data will turn out to be not only a useful organizational scheme, but also incredibly valuable for statistical analyses.
Why Learn About Matrices?
There are three reasons to learn about matrices as a social scientist.
The first is that matrices underlie nearly all of the statistical models that we use as social scientists, like linear regression, probit models, GMM, etc. The average social scientist may not interact with matrices directly on a day-to-day basis, since most social scientists use convenient tools for fitting these models (like lm in R, or reg in Stata). But in order to ensure social science graduate students understand what these models are doing, most graduate statistical methods courses require students to work directly with matrices to implement these models before they are allowed to move on to just using these convenient tools.
The second reason is that matrices are a natural stepping stone from vectors to the data structure that we’ll talk about next: data.frames. So everything we learn here will be immediately applicable in our next lesson.
The final reason is that while many social scientists don’t work with matrices on a day-to-day basis, you may end up being one of the social scientists who does. I think it’s helpful to make sure you’re familiar with what matrices can do so that, if you run into a situation where matrices are the right tool for the job, you’ll recognize it.
Constructing Matrices
As with vectors, there are a couple of ways of constructing matrices. The first is just by combining a handful of vectors with either cbind() (which makes each input vector into a column of the matrix) or rbind() (which makes each input vector into a row of the matrix):
[1]:
income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)
cbind(income, age, education)
income | age | education |
---|---|---|
22000 | 20 | 12 |
65000 | 35 | 16 |
19000 | 55 | 11 |
[2]:
respondent1 <- c(20, 22000, 12)
respondent2 <- c(35, 65000, 16)
respondent3 <- c(55, 19000, 11)
rbind(respondent1, respondent2, respondent3)
respondent1 | 20 | 22000 | 12 |
respondent2 | 35 | 65000 | 16 |
respondent3 | 55 | 19000 | 11 |
In addition, you can also construct a matrix by giving matrix() a long vector and telling it how to “fold” the entries. This approach is actually not super useful in the real world, but it is very convenient for making examples, so… I’ll use it a little here.
[3]:
matrix(1:9, nrow = 3, ncol = 3)
1 | 4 | 7 |
2 | 5 | 8 |
3 | 6 | 9 |
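As the output above shows, matrix() fills in the entries column by column by default. A small sketch of the alternative, using the byrow argument of matrix() (the variable names by_column and by_row are just for illustration):

```r
# Default behaviour: the vector is "folded" column by column
by_column <- matrix(1:9, nrow = 3, ncol = 3)

# With byrow = TRUE the vector is folded row by row instead
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
by_row

# The two results are transposes of one another
identical(by_row, t(by_column))
```

Filling by row can be handy when the numbers are typed in the same order in which they would be read off a table.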
Matrix Math
As with vectors, matrices provide an extremely efficient way of doing a lot of mathematical operations. Suppose that we had a matrix that had information on survey respondents and their incomes at different periods of time:
[4]:
salary_1980 <- c(30000, 45000, 22000)
salary_1990 <- c(37000, 42000, 29000)
salary_2000 <- c(49000, 47000, 33000)
salaries <- cbind(salary_1980, salary_1990, salary_2000)
salaries
salary_1980 | salary_1990 | salary_2000 |
---|---|---|
30000 | 37000 | 49000 |
45000 | 42000 | 47000 |
22000 | 29000 | 33000 |
And suppose that, instead of writing them out in dollars, I wanted to convert them to thousands of dollars to make the numbers easier to fit on a graph. I can just do:
[5]:
salaries_in_thousands <- salaries / 1000
salaries_in_thousands
salary_1980 | salary_1990 | salary_2000 |
---|---|---|
30 | 37 | 49 |
45 | 42 | 47 |
22 | 29 | 33 |
Similarly, matrices can be added if they have the same size. If we also had a matrix of tax refunds, and we wanted to calculate everyone’s total income, we could just add the matrices:
[6]:
tax_refunds_1980 <- c(10000, 2000, 15000)
tax_refunds_1990 <- c(0, 0, 14000)
tax_refunds_2000 <- c(0, 0, 7000)
refunds <- cbind(tax_refunds_1980, tax_refunds_1990, tax_refunds_2000)
refunds
tax_refunds_1980 | tax_refunds_1990 | tax_refunds_2000 |
---|---|---|
10000 | 0 | 0 |
2000 | 0 | 0 |
15000 | 14000 | 7000 |
[7]:
total_income <- salaries + refunds
total_income
salary_1980 | salary_1990 | salary_2000 |
---|---|---|
40000 | 37000 | 49000 |
47000 | 42000 | 47000 |
37000 | 43000 | 40000 |
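Notice that the result above keeps the column names of the left-hand matrix (salary_1980 and so on). The following sketch, which rebuilds the two matrices so it can run on its own, checks that the dimensions match before adding and then relabels the columns. The names total_1980 etc. and the variable mismatch are illustrative choices, not part of the lesson:

```r
# Rebuild the two matrices from the examples above
salaries <- cbind(
  salary_1980 = c(30000, 45000, 22000),
  salary_1990 = c(37000, 42000, 29000),
  salary_2000 = c(49000, 47000, 33000)
)
refunds <- cbind(
  tax_refunds_1980 = c(10000, 2000, 15000),
  tax_refunds_1990 = c(0, 0, 14000),
  tax_refunds_2000 = c(0, 0, 7000)
)

# Addition is element-wise, so both matrices need the same dimensions
stopifnot(identical(dim(salaries), dim(refunds)))
total_income <- salaries + refunds

# Relabel the columns so they describe what they now hold
colnames(total_income) <- c("total_1980", "total_1990", "total_2000")
total_income

# Adding matrices of different sizes raises an error; here we catch it
mismatch <- tryCatch(
  salaries + refunds[, 1:2],
  error = function(e) "error"
)
mismatch
```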
Summarizing Matrices
Just as vectors had lots of tools for summarizing their properties, so too do matrices. For example:
[8]:
# Num rows
nrow(total_income)
[9]:
# Num cols
ncol(total_income)
[10]:
# Overall statistics
mean(total_income)
[11]:
# And statistics calculated for each column
colMeans(total_income)
- salary_1980: 41333.3333333333
- salary_1990: 40666.6666666667
- salary_2000: 45333.3333333333
[12]:
# Or for each row!
rowMeans(total_income)
- 42000
- 45333.3333333333
- 40000
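For summaries that don’t have a dedicated helper like colMeans() or rowMeans(), base R’s apply() function can run any summary function over the rows or columns. This is a minimal sketch that rebuilds total_income so it can run on its own; the result names (col_max, row_min, col_sd) are just illustrative:

```r
# Rebuild the total income matrix from the example above
total_income <- cbind(
  salary_1980 = c(40000, 47000, 37000),
  salary_1990 = c(37000, 42000, 43000),
  salary_2000 = c(49000, 47000, 40000)
)

# apply(X, MARGIN, FUN): MARGIN = 1 works over rows, MARGIN = 2 over columns
col_max <- apply(total_income, 2, max)  # largest value in each column
row_min <- apply(total_income, 1, min)  # smallest value in each row
col_sd <- apply(total_income, 2, sd)    # standard deviation of each column

col_max
row_min

# colMeans() gives the same answer as applying mean() to each column
all.equal(colMeans(total_income), apply(total_income, 2, mean))
```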
Linear Algebra
In addition to being able to do element-wise operations like those illustrated above, R also supports an array of linear algebra operations for matrices. If you haven’t taken a linear algebra course before, don’t worry too much about what’s in this section!
For example, matrix multiplication:
[13]:
matrix_1 <- matrix(1:9, nrow = 3)
matrix_1
1 | 4 | 7 |
2 | 5 | 8 |
3 | 6 | 9 |
[14]:
matrix_2 <- matrix(9:17, nrow = 3)
matrix_2
9 | 12 | 15 |
10 | 13 | 16 |
11 | 14 | 17 |
[15]:
# Matrix multiplication
matrix_1 %*% matrix_2
126 | 162 | 198 |
156 | 201 | 246 |
186 | 240 | 294 |
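It’s easy to mix up %*% with the ordinary * operator, which multiplies matrices element by element. A short sketch of the difference, rebuilding the same two matrices (the names elementwise and matrix_product are illustrative):

```r
matrix_1 <- matrix(1:9, nrow = 3)
matrix_2 <- matrix(9:17, nrow = 3)

# * multiplies each entry by the entry in the same position
elementwise <- matrix_1 * matrix_2

# %*% is true matrix multiplication (rows of the first times columns of the second)
matrix_product <- matrix_1 %*% matrix_2

elementwise[1, 1]     # 1 * 9
matrix_product[1, 1]  # 1*9 + 4*10 + 7*11

# Unlike element-wise multiplication, matrix multiplication depends on the order
identical(matrix_1 %*% matrix_2, matrix_2 %*% matrix_1)
```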
[16]:
# Transpose matrix
t(matrix_1)
1 | 2 | 3 |
4 | 5 | 6 |
7 | 8 | 9 |
[17]:
# Eigenvalues and eigenvectors of the matrix
eigen(matrix_1)
eigen() decomposition
$values
[1] 1.611684e+01 -1.116844e+00 -5.700691e-16
$vectors
[,1] [,2] [,3]
[1,] -0.4645473 -0.8829060 0.4082483
[2,] -0.5707955 -0.2395204 -0.8164966
[3,] -0.6770438 0.4038651 0.4082483
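If you want to convince yourself of what eigen() returns, one possible check is the defining property of an eigenvector: multiplying the matrix by the vector gives the same vector scaled by its eigenvalue. The names decomposition, lambda_1, v_1, lhs and rhs below are made up for this sketch:

```r
matrix_1 <- matrix(1:9, nrow = 3)
decomposition <- eigen(matrix_1)

# Take the first eigenvalue and its matching eigenvector (first column)
lambda_1 <- decomposition$values[1]
v_1 <- decomposition$vectors[, 1]

# Defining property: matrix_1 %*% v_1 should equal lambda_1 * v_1
lhs <- as.vector(matrix_1 %*% v_1)
rhs <- lambda_1 * v_1
all.equal(lhs, rhs)

# The third eigenvalue is zero up to rounding error
abs(decomposition$values[3]) < 1e-8
```

An eigenvalue of (numerically) zero is a sign that the matrix cannot be inverted.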
Getting the inverse of a matrix is also easy with the function solve().
However, note that not all matrices are invertible, so if you try to invert a matrix that is not invertible (like matrix_1), you will get an error like this:
> solve(matrix_1)
Error in solve.default(matrix_1): Lapack routine dgesv: system is exactly singular: U[3,3] = 0
Traceback:
1. solve(matrix_1)
2. solve.default(matrix_1)
So here it is with a different matrix:
[18]:
solve(salaries)
salary_1980 | 8.607784e-06 | 7.485030e-05 | -0.0001193862 |
salary_1990 | -1.687874e-04 | -3.293413e-05 | 0.0002975299 |
salary_2000 | 1.425898e-04 | -2.095808e-05 | -0.0001515719 |
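One way to sanity-check an inverse is to multiply it by the original matrix and confirm that the result is the identity matrix, up to floating-point rounding. The sketch below also shows one possible way to catch the error from a non-invertible matrix with tryCatch(). The names salaries_inverse, identity_check and singular_result are illustrative:

```r
salaries <- cbind(
  salary_1980 = c(30000, 45000, 22000),
  salary_1990 = c(37000, 42000, 29000),
  salary_2000 = c(49000, 47000, 33000)
)

# A matrix times its inverse should give the identity matrix
salaries_inverse <- solve(salaries)
identity_check <- salaries %*% salaries_inverse
round(identity_check, 10)

# Compare against diag(3), ignoring dimension names and tiny rounding errors
all.equal(identity_check, diag(3), check.attributes = FALSE, tolerance = 1e-6)

# For a non-invertible matrix, solve() raises an error; catch it and return NULL
matrix_1 <- matrix(1:9, nrow = 3)
singular_result <- tryCatch(solve(matrix_1), error = function(e) NULL)
is.null(singular_result)
```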
Linear Regression with Matrices
To illustrate how matrices relate to the type of statistics that we do on a daily basis as social scientists, let’s quickly illustrate how linear regressions actually work!
(If you’ve never seen linear regression before, feel free to skip this section, but if you’re a political science graduate student, I guarantee you’ll see this in your methods class, so… maybe bookmark this for later? :))
Suppose that we wanted to regress income on age and education using the toy data set we made above. The easy way to do this in R is by using the lm() function as follows (don’t worry about the details of how I do this, I just want to quickly illustrate the way most people have probably seen linear regression done if they’ve worked with R before):
[19]:
lm(income ~ age + education)
Call:
lm(formula = income ~ age + education)
Coefficients:
(Intercept) age education
-102000 200 10000
Now, as you’ll learn in your statistics courses, linear regression can actually be expressed in terms of matrices as

$$\hat{\beta} = (X'X)^{-1} X'Y$$

(The inverse of the transpose of X matrix-multiplied by X, times the transpose of X times Y, where X is a matrix of predictors, and Y is the variable we’re trying to predict.) So we can write this directly as:
[20]:
# Make X. Note that we need a column of 1s for our intercept.
X <- cbind(age, education, rep(1, 3))
X
age | education | |
---|---|---|
20 | 12 | 1 |
35 | 16 | 1 |
55 | 11 | 1 |
[21]:
# Then we can calculate the components separately
# to make the mapping to the formula clear.
# Here's X' (i.e. X transpose)
X_transpose <- t(X)
X_transpose
age | 20 | 35 | 55 |
education | 12 | 16 | 11 |
 | 1 | 1 | 1 |
[22]:
# And now the block on the left: (X'X)^-1
inverse_of_X_transpose_times_X <- solve(X_transpose %*% X)
[23]:
# And the block on the right: X'Y
X_transpose_y <- X_transpose %*% income
[24]:
# And putting it all together we get:
inverse_of_X_transpose_times_X %*% X_transpose_y
age | 200 |
education | 10000 |
 | -102000 |
Ta-da! That final matrix gives us our regression coefficients (the unnamed last row is the intercept), which are identical to those from lm! Now you know how to do linear regression using matrices in R!
Note we broke out all the terms to make it easier to map the R code onto the equation for linear regression. You can of course just run this in one line if you want:
[25]:
solve(t(X) %*% X) %*% (t(X) %*% income)
age | 200 |
education | 10000 |
 | -102000 |
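As a variation on the one-liner above (not the approach used in this lesson), the same coefficients can be obtained without forming the inverse explicitly: crossprod(X) computes X'X, crossprod(X, income) computes X'Y, and solve(A, b) solves the linear system directly. The sketch below also names the intercept column and compares the result against lm(). The names beta_hat and lm_coefficients, and the choice to put the intercept column first, are assumptions made for this example:

```r
income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)

# Put a named column of 1s first so the order matches lm()'s output
X <- cbind(intercept = 1, age, education)

# crossprod(X) is t(X) %*% X, and crossprod(X, income) is t(X) %*% income.
# solve(A, b) solves A %*% beta = b without computing the inverse of A.
beta_hat <- solve(crossprod(X), crossprod(X, income))
beta_hat

# Compare with the coefficients reported by lm()
lm_coefficients <- coef(lm(income ~ age + education))
all.equal(as.vector(beta_hat), unname(lm_coefficients))
```

Solving the system directly is generally considered the more numerically stable habit, although for a toy data set like this one both approaches give the same answer.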
Recap
Matrices are a natural extension of vectors from one dimension into two dimensions.
You can think of a matrix as a collection of vectors of the same length side by side.
Like vectors, we can do math using full matrices, as well as get summary statistics on both matrices as a whole and their rows/columns.
R implements lots of linear algebra-specific mathematical operations for matrices, like matrix multiplication and inverses.
Next Steps
Now that we’re familiar with matrices, time to learn to manipulate them!