Intro to Matrices

Matrices are a natural extension of the vectors that we have been working with in the last couple of readings; where a vector is a collection of data of the same type ordered along a single dimension, a matrix is a collection of data of the same type ordered along two dimensions.

If you’ve taken a linear algebra course before, the idea of a matrix will be very familiar, but if you haven’t, you can think of a matrix as a collection of vectors lined up side-by-side. For example, a 3x3 matrix might look something like:

\[\begin{split}\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \quad\end{split}\]

Vectors are commonly used in social science because we usually don’t have just a single observation of data, but lots of observations (different survey respondents) that we might want to put into a vector. In the same way, we often have information not just on one type of measurement (age), but on lots of different measurements for each observation (age, income, years of education, etc.). Matrices are commonly used to represent this type of data by using each row for an observation (a survey respondent), and each column for a different thing we’re measuring.

For example, suppose we surveyed three people. The first was twenty years old, had an income of 22,000 dollars, and had twelve years of education; the second was thirty-five years old, had an income of 65,000 dollars, and had sixteen years of education; and the third was fifty-five years old, had an income of 19,000 dollars, and had eleven years of education. We could represent that information in a matrix that looks like:

\[\begin{split}\begin{bmatrix} 20 & 22000 & 12 \\ 35 & 65000 & 16 \\ 55 & 19000 & 11 \\ \end{bmatrix} \quad\end{split}\]

And while it may not be immediately obvious why, this way of representing our data will turn out to be not only a useful organizational scheme, but also incredibly valuable for statistical analyses.

Why Learn About Matrices?

There are three reasons to learn about matrices as a social scientist.

The first is that matrices underlie nearly all of the statistical models that we use as social scientists, like linear regression, probit models, GMM, etc. The average social scientist may not interact with matrices directly on a day-to-day basis, since most social scientists use convenient tools for fitting these models (like lm in R, or reg in Stata). But in order to ensure social science graduate students understand what these models are doing, most graduate statistical methods courses will require students to work directly with matrices to implement these models before they are allowed to move on to just using these convenient tools.

The second reason is that matrices are a natural stepping stone from vectors to the data structure that we’ll talk about next: data.frames. So everything we learn here will be immediately applicable in our next lesson.

The final reason is that while many social scientists don’t work with matrices on a day-to-day basis, you may end up being one of the social scientists who does, so I think it’s helpful to make sure you’re familiar with what matrices can do, so that if you run into a situation where matrices are the right tool for the job, you’ll recognize it.

Constructing Matrices

As with vectors, there are a couple of ways of constructing matrices. The first is just by combining a handful of vectors with either cbind() (which will make each input vector into a column of the matrix) or rbind() (which will make each input vector into a row of the matrix):

[1]:
income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)

cbind(income, age, education)
A matrix: 3 × 3 of type dbl
income  age  education
 22000   20         12
 65000   35         16
 19000   55         11
[2]:
respondent1 <- c(20, 22000, 12)
respondent2 <- c(35, 65000, 16)
respondent3 <- c(55, 19000, 11)

rbind(respondent1, respondent2, respondent3)
A matrix: 3 × 3 of type dbl
respondent1  20  22000  12
respondent2  35  65000  16
respondent3  55  19000  11

In addition, you can also construct a matrix by giving it a long vector and telling it how to “fold” the entries. This approach is actually not super useful in the real world, but it is very convenient for making examples, so… I’ll use it a little here.

[3]:
matrix(1:9, nrow = 3, ncol = 3)
A matrix: 3 × 3 of type int
1  4  7
2  5  8
3  6  9
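
Notice that matrix() fills its entries column by column by default, which is why 1:9 runs down the first column before moving on to the second. As a minimal sketch of how the direction of the “fold” can be changed, here is the byrow argument of matrix() in action (by_column and by_row are just names I made up for illustration):

```r
# Default: entries fill down each column in turn
by_column <- matrix(1:6, nrow = 2, ncol = 3)

# byrow = TRUE: entries fill across each row in turn
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

by_column
by_row
```

Both matrices have two rows and three columns; only the order in which the six numbers are laid out differs.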

Matrix Math

As with vectors, matrices provide an extremely efficient way of doing a lot of mathematical operations. Suppose that we had a matrix with information on survey respondents and their incomes at different periods of time:

[4]:
salary_1980 <- c(30000, 45000, 22000)
salary_1990 <- c(37000, 42000, 29000)
salary_2000 <- c(49000, 47000, 33000)

salaries <- cbind(salary_1980, salary_1990, salary_2000)
salaries
A matrix: 3 × 3 of type dbl
salary_1980  salary_1990  salary_2000
      30000        37000        49000
      45000        42000        47000
      22000        29000        33000

Suppose that, instead of writing them out in dollars, I wanted to convert them to thousands of dollars to make the numbers easier to fit in a table or on a graph. I can just do:

[5]:
salaries_in_thousands <- salaries / 1000
salaries_in_thousands
A matrix: 3 × 3 of type dbl
salary_1980  salary_1990  salary_2000
         30           37           49
         45           42           47
         22           29           33

Similarly, matrices can be added if they have the same size. If we also had a matrix of tax refunds, and we wanted to calculate everyone’s total income, we could just add the matrices:

[6]:
tax_refunds_1980 <- c(10000, 2000, 15000)
tax_refunds_1990 <- c(0, 0, 14000)
tax_refunds_2000 <- c(0, 0, 7000)

refunds <- cbind(tax_refunds_1980, tax_refunds_1990, tax_refunds_2000)
refunds

A matrix: 3 × 3 of type dbl
tax_refunds_1980  tax_refunds_1990  tax_refunds_2000
           10000                 0                 0
            2000                 0                 0
           15000             14000              7000
[7]:
total_income <- salaries + refunds
total_income
A matrix: 3 × 3 of type dbl
salary_1980  salary_1990  salary_2000
      40000        37000        49000
      47000        42000        47000
      37000        43000        40000
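
To see what “the same size” means in practice, here is a small sketch using two hypothetical matrices, a and b, that are not part of the survey example. They hold the same six numbers but have different dimensions, so R refuses to add them. I wrap the addition in tryCatch() only so the block keeps running after the error:

```r
# Hypothetical matrices, just for illustration
a <- matrix(1:6, nrow = 2, ncol = 3)   # 2 rows, 3 columns
b <- matrix(1:6, nrow = 3, ncol = 2)   # 3 rows, 2 columns

dim(a)
dim(b)

# Adding matrices with different dimensions raises an error;
# tryCatch() captures the error message instead of stopping.
result <- tryCatch(a + b, error = function(e) conditionMessage(e))
result

# With matching dimensions, addition works entry by entry
a + a
```

Checking dim() on both matrices before adding them is a quick way to catch this kind of mismatch.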

Summarizing Matrices

Just as vectors had lots of tools for summarizing their properties, so too do matrices. For example:

[8]:
# Num rows
nrow(total_income)
3
[9]:
# Num cols
ncol(total_income)
3
[10]:
# Overall statistics
mean(total_income)
42444.4444444444
[11]:
# And statistics calculated for each column
colMeans(total_income)
salary_1980  41333.3333333333
salary_1990  40666.6666666667
salary_2000  45333.3333333333
[12]:
# Or for each row!
rowMeans(total_income)
  1. 42000
  2. 45333.3333333333
  3. 40000
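
Statistics that don’t have a dedicated helper like colMeans() can be computed for each row or column with base R’s apply(). Here is a minimal sketch, re-creating total_income by hand so the block runs on its own (col_max and row_max are names I made up for illustration):

```r
# Same values as the total_income matrix above
total_income <- cbind(
  salary_1980 = c(40000, 47000, 37000),
  salary_1990 = c(37000, 42000, 43000),
  salary_2000 = c(49000, 47000, 40000)
)

# Second argument 2 means "apply the function to each column"
col_max <- apply(total_income, 2, max)

# Second argument 1 means "apply the function to each row"
row_max <- apply(total_income, 1, max)

col_max
row_max
```

Swapping max for mean should reproduce what colMeans() and rowMeans() return.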

Linear Algebra

In addition to being able to do element-wise operations like those illustrated above, R also supports an array of linear algebra operations for matrices. If you haven’t taken a linear algebra course before, don’t worry too much about what’s in this section!

For example, matrix multiplication:

[13]:
matrix_1 <- matrix(1:9, nrow = 3)
matrix_1
A matrix: 3 × 3 of type int
1  4  7
2  5  8
3  6  9
[14]:
matrix_2 <- matrix(9:17, nrow = 3)
matrix_2

A matrix: 3 × 3 of type int
 9  12  15
10  13  16
11  14  17
[15]:
# Matrix multiplication
matrix_1 %*% matrix_2
A matrix: 3 × 3 of type dbl
126  162  198
156  201  246
186  240  294
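
Matrix multiplication with %*% is easy to confuse with the element-wise * operator, so here is a small sketch contrasting the two on the same matrices (elementwise and matrix_product are names I made up for illustration):

```r
matrix_1 <- matrix(1:9, nrow = 3)
matrix_2 <- matrix(9:17, nrow = 3)

# Element-wise: entry [1, 1] is just 1 * 9
elementwise <- matrix_1 * matrix_2

# Matrix multiplication: entry [1, 1] is row 1 of matrix_1 times
# column 1 of matrix_2, summed: 1*9 + 4*10 + 7*11
matrix_product <- matrix_1 %*% matrix_2

elementwise[1, 1]
matrix_product[1, 1]

# The same entry computed by hand from the row and the column
sum(matrix_1[1, ] * matrix_2[, 1])
```
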
[16]:
# Transpose matrix
t(matrix_1)
A matrix: 3 × 3 of type int
1  2  3
4  5  6
7  8  9
[17]:
# Eigenvalues and eigenvectors of matrix
eigen(matrix_1)
eigen() decomposition
$values
[1]  1.611684e+01 -1.116844e+00 -5.700691e-16

$vectors
           [,1]       [,2]       [,3]
[1,] -0.4645473 -0.8829060  0.4082483
[2,] -0.5707955 -0.2395204 -0.8164966
[3,] -0.6770438  0.4038651  0.4082483
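
If you have seen eigenvectors before, you can check this output against the definition: multiplying the matrix by one of its eigenvectors should give the same result as multiplying that eigenvector by its eigenvalue. A minimal sketch of that check for the first eigenvalue (decomposition, lambda_1 and v_1 are names I made up for illustration):

```r
matrix_1 <- matrix(1:9, nrow = 3)
decomposition <- eigen(matrix_1)

# First eigenvalue and its eigenvector (first column of $vectors)
lambda_1 <- decomposition$values[1]
v_1 <- decomposition$vectors[, 1]

# These two should agree up to floating-point rounding
matrix_1 %*% v_1
lambda_1 * v_1

all.equal(as.vector(matrix_1 %*% v_1), lambda_1 * v_1)
```
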

Getting the inverse of a matrix is also easy with the function solve.

However, note that not all matrices are invertible, so if you try to solve a matrix that is not invertible, you will get an error like this:

> solve(matrix_1)

Error in solve.default(matrix_1): Lapack routine dgesv: system is exactly singular: U[3,3] = 0
Traceback:

1. solve(matrix_1)
2. solve.default(matrix_1)

So here it is with a different matrix:

[18]:
solve(salaries)
A matrix: 3 × 3 of type dbl
salary_1980   8.607784e-06   7.485030e-05  -0.0001193862
salary_1990  -1.687874e-04  -3.293413e-05   0.0002975299
salary_2000   1.425898e-04  -2.095808e-05  -0.0001515719
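
One way to convince yourself that solve() really returned an inverse is to multiply it back against the original matrix, which should give the identity matrix up to floating-point rounding. The sketch below does that, and also uses det() as a rough check for invertibility: a determinant of zero (or numerically very close to it) means the matrix has no inverse. I re-create salaries here so the block runs on its own, and salaries_inverse and product are names I made up for illustration:

```r
# Same values as the salaries matrix above
salaries <- cbind(
  salary_1980 = c(30000, 45000, 22000),
  salary_1990 = c(37000, 42000, 29000),
  salary_2000 = c(49000, 47000, 33000)
)

salaries_inverse <- solve(salaries)

# Inverse times original should be (approximately) the identity matrix
product <- salaries_inverse %*% salaries
round(product, 10)

# Determinants: clearly non-zero for salaries,
# zero (or extremely close to it) for the 1:9 matrix
det(salaries)
det(matrix(1:9, nrow = 3))
```
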

Linear Regression with Matrices

To illustrate how matrices relate to the type of statistics that we do on a daily basis as social scientists, let’s quickly illustrate how linear regressions actually work!

(If you’ve never seen linear regression before, feel free to skip this section, but if you’re a political science graduate student, I guarantee you’ll see this in your methods class, so… maybe bookmark this for later? :))

Suppose that we wanted to regress income on age and education using the toy data set we made above. The easy way to do this in R is by using the lm function as follows (don’t worry about the details of how I do this, I just want to quickly illustrate the way most people have probably seen linear regression done if they’ve worked with R before):

[19]:
lm(income ~ age + education)

Call:
lm(formula = income ~ age + education)

Coefficients:
(Intercept)          age    education
    -102000          200        10000

Now as you’ll learn in your statistics courses, the coefficients of a linear regression can actually be expressed in terms of matrices as

\[(X'X)^{-1}X'Y\]

(In words: take the transpose of X times X, invert it, and multiply the result by the transpose of X times Y, where X is a matrix of predictors, and Y is the variable we’re trying to predict.) So we can write this directly as:

[20]:
# Make X. Note we need a vector of 1s for our intercept.
X <- cbind(age, education, rep(1, 3))
X
A matrix: 3 × 3 of type dbl
age  education
 20         12  1
 35         16  1
 55         11  1
[21]:
# Then we can calculate the components separately
# to make the mapping to the formula clear.
# Here's X' (i.e. X transpose)

X_transpose <- t(X)
X_transpose
A matrix: 3 × 3 of type dbl
age        20  35  55
education  12  16  11
            1   1   1
[22]:
# And now the block on the left: (X'X)^-1
inverse_of_X_transpose_times_X <- solve(X_transpose %*% X)
[23]:
# And the block on the right: X'Y
X_transpose_y <- X_transpose %*% income

[24]:
# And putting it all together we get:
inverse_of_X_transpose_times_X %*% X_transpose_y
A matrix: 3 × 1 of type dbl
age            200
education    10000
           -102000

Ta-da! That final matrix gives us our regression coefficients, which are identical to those from lm! Now you know how to do linear regression using matrices in R!

Note that we broke out all the terms to make it easier to map the R code onto the equation for linear regression. You can of course just run this in one line if you want:

[25]:
solve(t(X) %*% X) %*% (t(X) %*% income)
A matrix: 3 × 1 of type dbl
age            200
education    10000
           -102000
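
As a sanity check, the matrix result can be compared to lm programmatically rather than by eye. The sketch below is one way to do that, not the only one. It uses two base R conveniences that are not used elsewhere in this reading: crossprod(X), which computes t(X) %*% X, and the two-argument form solve(A, b), which solves for the coefficients directly without forming the inverse first. I also put the column of 1s first and name it so the coefficients come out in the same order as lm (beta and lm_coefficients are names I made up for illustration):

```r
income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)

# Column of 1s first, so the order matches lm's (Intercept), age, education
X <- cbind(intercept = 1, age, education)

# crossprod(X) is t(X) %*% X, and crossprod(X, income) is t(X) %*% income.
# solve(A, b) finds beta such that A %*% beta = b.
beta <- solve(crossprod(X), crossprod(X, income))
beta

# Compare with the coefficients lm reports
lm_coefficients <- coef(lm(income ~ age + education))
all.equal(as.vector(beta), unname(lm_coefficients))
```
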

Recap

  • Matrices are a natural extension of vectors from one dimension into two dimensions.

  • You can think of a matrix as a collection of vectors of the same length side by side.

  • Like vectors, matrices let us do math on the full matrix at once, and we can get summary statistics both on a matrix as a whole and on its rows/columns.

  • R implements lots of linear algebra-specific mathematical operations for matrices, like matrix multiplication and inverses.

Next Steps

Now that we’re familiar with matrices, time to learn to manipulate them!