Intro to Matrices

Matrices are a natural extension of the vectors that we have been working with in the last couple of readings: where a vector is a collection of data of the same type ordered along a single dimension, a matrix is a collection of data of the same type ordered along two dimensions.

If you’ve taken a linear algebra course before, the idea of a matrix will be very familiar, but if you haven’t, you can think of a matrix as a collection of vectors lined up side-by-side. For example, a 3x3 matrix might look something like:

\[\begin{split}\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \quad\end{split}\]

Vectors are commonly used in social science because we usually have not just a single observation of data but lots of observations (different survey respondents) that we might want to put into a vector. In the same way, we often have information not just on one type of measurement (age), but on lots of different measurements about each observation (age, income, years of education, etc.). Matrices are commonly used to represent this type of data by using each row for an observation (a survey respondent), and each column for a different thing we’re measuring.

For example, suppose we surveyed three people. The first was twenty years old, had an income of 22,000 dollars, and had twelve years of education; the second was thirty-five years old, had an income of 65,000 dollars, and had sixteen years of education; and the third was fifty-five years old, had an income of 19,000 dollars, and had eleven years of education. We could represent that information in a matrix that looks like:

\[\begin{split}\begin{bmatrix} 20 & 22000 & 12 \\ 35 & 65000 & 16 \\ 55 & 19000 & 11 \\ \end{bmatrix} \quad\end{split}\]

And while it may not be immediately obvious why, this way of representing our data will turn out to not only be a useful organizational scheme, but it will be incredibly valuable for statistical analyses.

Why Learn About Matrices?

There are three reasons to learn about matrices as a social scientist.

The first is that matrices underlie nearly all of the statistical models that we use as social scientists, like linear regression, probit models, GMM, etc. The average social scientist may not interact with matrices directly on a day-to-day basis, since most social scientists use convenient tools for fitting these models (like lm in R, or reg in Stata). But in order to ensure social science graduate students understand what these models are doing, most graduate statistical methods courses will require students to work directly with matrices to implement these models before they are allowed to move on to just using these convenient tools.

The second reason is that matrices are a natural steppingstone from vectors to the data structure that we’ll talk about next – data.frames. So everything we learn here will be immediately applicable in our next lesson.

The final reason is that while many social scientists don’t work with matrices on a day-to-day basis, you may end up being one of the social scientists who does, so I think it’s helpful to make sure you’re familiar with what matrices can do, so that if you run into a situation where matrices are the right tool for the job, you’ll recognize it.

Constructing Matrices

As with vectors, there are a couple of ways of constructing matrices. The first is just by combining a handful of vectors with either cbind() (which will make each input vector into a column of the matrix) or rbind() (which will make each input vector into a row of the matrix):

[1]:

income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)

cbind(income, age, education)

A matrix: 3 × 3 of type dbl
income	age	education
22000	20	12
65000	35	16
19000	55	11

[2]:

respondent1 <- c(20, 22000, 12)
respondent2 <- c(35, 65000, 16)
respondent3 <- c(55, 19000, 11)

rbind(respondent1, respondent2, respondent3)

A matrix: 3 × 3 of type dbl
respondent1	20	22000	12
respondent2	35	65000	16
respondent3	55	19000	11
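
Notice that cbind() labeled the columns with the names of the input vectors, while rbind() labeled the rows and left the columns unnamed. Here is a minimal sketch of adding the column labels yourself with base R's colnames(); the name `survey` is just something I made up for this illustration:

```r
respondent1 <- c(20, 22000, 12)
respondent2 <- c(35, 65000, 16)
respondent3 <- c(55, 19000, 11)

# `survey` is a hypothetical name used only for this example
survey <- rbind(respondent1, respondent2, respondent3)

# Label what each column measures
colnames(survey) <- c("age", "income", "education")
survey

# Number of rows, then number of columns
dim(survey)
```

With both rows and columns labeled, a single entry can be looked up by name, e.g. `survey["respondent2", "income"]`.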

In addition, you can also construct a matrix by giving matrix() a long vector and telling it how to “fold” the entries. This is an approach that is actually not super useful in the real world, but is very convenient for making examples, so… I’ll use it a little here.

[3]:

matrix(1:9, nrow = 3, ncol = 3)

A matrix: 3 × 3 of type int
1	4	7
2	5	8
3	6	9
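
The 1, 2, 3 run down the first column in that output because matrix() fills column by column by default. As a small sketch, matrix()'s byrow argument folds the vector row by row instead, which reproduces the 3x3 example from the top of this reading (the names `by_column` and `by_row` are just for illustration):

```r
# Default: entries fill down each column in turn
by_column <- matrix(1:9, nrow = 3, ncol = 3)

# byrow = TRUE: entries fill across each row in turn
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
by_row

# The two foldings are transposes of one another
identical(by_row, t(by_column))
```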

Matrix Math

As with vectors, matrices provide an extremely efficient way of doing a lot of mathematical operations. Suppose that we had a matrix that had information on survey respondents and their incomes at different periods of time:

[4]:

salary_1980 <- c(30000, 45000, 22000)
salary_1990 <- c(37000, 42000, 29000)
salary_2000 <- c(49000, 47000, 33000)

salaries <- cbind(salary_1980, salary_1990, salary_2000)
salaries

A matrix: 3 × 3 of type dbl
salary_1980	salary_1990	salary_2000
30000	37000	49000
45000	42000	47000
22000	29000	33000

And suppose that instead of writing them out in dollars, I wanted to convert them to thousands of dollars to make the numbers easier to fit on a graph. I can just do:

[5]:

salaries_in_thousands <- salaries / 1000
salaries_in_thousands

A matrix: 3 × 3 of type dbl
salary_1980	salary_1990	salary_2000
30	37	49
45	42	47
22	29	33

Similarly, matrices can be added if they have the same size. If we also had a matrix of tax refunds, and we wanted to calculate everyone’s total income, we could just add the matrices:

[6]:

tax_refunds_1980 <- c(10000, 2000, 15000)
tax_refunds_1990 <- c(0, 0, 14000)
tax_refunds_2000 <- c(0, 0, 7000)

refunds <- cbind(tax_refunds_1980, tax_refunds_1990, tax_refunds_2000)
refunds

A matrix: 3 × 3 of type dbl
tax_refunds_1980	tax_refunds_1990	tax_refunds_2000
10000	0	0
2000	0	0
15000	14000	7000

[7]:

total_income <- salaries + refunds
total_income

A matrix: 3 × 3 of type dbl
salary_1980	salary_1990	salary_2000
40000	37000	49000
47000	42000	47000
37000	43000	40000
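
To see why the "same size" condition matters, here is a small sketch of what happens when the dimensions don't match. The matrices `two_periods` and `one_period` are made up for this illustration, and tryCatch() is only there so the error message is captured instead of halting the script:

```r
# Hypothetical matrices with mismatched shapes
two_periods <- cbind(c(30000, 45000, 22000), c(37000, 42000, 29000))  # 3 x 2
one_period  <- cbind(c(10000, 2000, 15000))                           # 3 x 1

# Checking dimensions first is a cheap safeguard
identical(dim(two_periods), dim(one_period))

# Adding them raises an error; capture its message as a string
result <- tryCatch(
  two_periods + one_period,
  error = function(e) conditionMessage(e)
)
result
```

R refuses to guess how a 3 x 1 matrix should line up against a 3 x 2 one, so it reports the arrays as non-conformable rather than silently recycling values.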

Summarizing Matrices

Just as vectors had lots of tools for summarizing their properties, so too do matrices. For example:

[8]:

# Num rows
nrow(total_income)

3

[9]:

# Num cols
ncol(total_income)

3

[10]:

# Overall statistics
mean(total_income)

42444.4444444444

[11]:

# And statistics calculated for each column
colMeans(total_income)

salary_1980: 41333.3333333333
salary_1990: 40666.6666666667
salary_2000: 45333.3333333333

[12]:

# Or for each row!
rowMeans(total_income)

42000
45333.3333333333
40000
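
A few more base R summary tools work the same way. This sketch rebuilds the total_income matrix from above so it runs on its own, then uses dim(), colSums(), rowSums(), and apply() (which runs any function over rows or columns):

```r
# Rebuild total_income with the values shown above
total_income <- cbind(
  salary_1980 = c(40000, 47000, 37000),
  salary_1990 = c(37000, 42000, 43000),
  salary_2000 = c(49000, 47000, 40000)
)

# Rows and columns in one call
dim(total_income)

# Totals for each column and each row
colSums(total_income)
rowSums(total_income)

# apply(): second argument 1 means "over rows", 2 means "over columns"
apply(total_income, 2, max)
```

apply() is handy when there is no ready-made colSomething() function for the statistic you want.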

Linear Algebra

In addition to being able to do element-wise operations like those illustrated above, R also supports an array of linear algebra operations for matrices. If you haven’t taken a linear algebra course before, don’t worry too much about what’s in this section!

For example, matrix multiplication:

[13]:

matrix_1 <- matrix(1:9, nrow = 3)
matrix_1

A matrix: 3 × 3 of type int
1	4	7
2	5	8
3	6	9

[14]:

matrix_2 <- matrix(9:17, nrow = 3)
matrix_2

A matrix: 3 × 3 of type int
9	12	15
10	13	16
11	14	17

[15]:

# Matrix multiplication
matrix_1 %*% matrix_2

A matrix: 3 × 3 of type dbl
126	162	198
156	201	246
186	240	294
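
It's easy to mix up `%*%` with plain `*`, which multiplies matching entries one pair at a time. A short sketch contrasting the two on the same matrices (the names `elementwise` and `product` are just for illustration):

```r
matrix_1 <- matrix(1:9, nrow = 3)
matrix_2 <- matrix(9:17, nrow = 3)

# Element-wise: each entry times the entry in the same position
elementwise <- matrix_1 * matrix_2

# Matrix multiplication: rows of matrix_1 against columns of matrix_2
product <- matrix_1 %*% matrix_2

elementwise[1, 1]   # 1 * 9
product[1, 1]       # 1*9 + 4*10 + 7*11

# The [1, 1] entry of the product is row 1 of matrix_1
# multiplied into column 1 of matrix_2, then summed
sum(matrix_1[1, ] * matrix_2[, 1])
```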

[16]:

# Transpose matrix
t(matrix_1)

A matrix: 3 × 3 of type int
1	2	3
4	5	6
7	8	9

[17]:

# Eigenvalues and eigenvectors of matrix
eigen(matrix_1)

eigen() decomposition
$values
[1]  1.611684e+01 -1.116844e+00 -5.700691e-16

$vectors
           [,1]       [,2]       [,3]
[1,] -0.4645473 -0.8829060  0.4082483
[2,] -0.5707955 -0.2395204 -0.8164966
[3,] -0.6770438  0.4038651  0.4082483
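
If you'd like to convince yourself that eigen() is returning what it claims, here's a quick sketch that checks the defining property (the matrix times an eigenvector equals the eigenvalue times that eigenvector) for the first pair. The names `decomp`, `lambda`, `v`, `lhs`, and `rhs` are mine:

```r
matrix_1 <- matrix(1:9, nrow = 3)
decomp <- eigen(matrix_1)

# First eigenvalue and its eigenvector (the first column of $vectors)
lambda <- decomp$values[1]
v <- decomp$vectors[, 1]

# Both sides of the defining equation
lhs <- as.vector(matrix_1 %*% v)
rhs <- lambda * v

# Compare up to floating-point rounding
all.equal(lhs, rhs)
```

Note that the third eigenvalue in the output above is zero up to rounding error, which is another way of seeing that this matrix is not invertible.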

Getting the inverse of a matrix is also easy with the function solve.

However, note that not all matrices are invertible, so if you try to invert a matrix that is not invertible (like matrix_1 from above), you will get an error like this:

> solve(matrix_1)

Error in solve.default(matrix_1): Lapack routine dgesv: system is exactly singular: U[3,3] = 0
Traceback:

1. solve(matrix_1)
2. solve.default(matrix_1)

So here it is with a different matrix:

[18]:

solve(salaries)

A matrix: 3 × 3 of type dbl
salary_1980	8.607784e-06	7.485030e-05	-0.0001193862
salary_1990	-1.687874e-04	-3.293413e-05	0.0002975299
salary_2000	1.425898e-04	-2.095808e-05	-0.0001515719
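
Two quick sanity checks, sketched under the assumption that you want to confirm these results yourself: multiplying a matrix by its inverse should give the identity matrix (up to rounding), and a determinant of zero signals that no inverse exists. The names `salaries_inverse` and `check` are just for this example:

```r
salaries <- cbind(
  salary_1980 = c(30000, 45000, 22000),
  salary_1990 = c(37000, 42000, 29000),
  salary_2000 = c(49000, 47000, 33000)
)

salaries_inverse <- solve(salaries)

# Inverse times original should be the 3 x 3 identity, up to rounding
check <- salaries_inverse %*% salaries
all.equal(check, diag(3), check.attributes = FALSE)

# The determinant of the matrix that failed is zero (up to rounding),
# while the determinant of salaries is not
det(matrix(1:9, nrow = 3))
det(salaries)
```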

Linear Regression with Matrices

To illustrate how matrices relate to the type of statistics that we do on a daily basis as social scientists, let’s quickly illustrate how linear regressions actually work!

(If you’ve never seen linear regression before, feel free to skip this section, but if you’re a political science graduate student, I guarantee you’ll see this in your methods class, so… maybe bookmark this for later? :))

Suppose that we wanted to regress income on age and education using the toy data set we made above. The easy way to do this in R is by using the lm function as follows (don’t worry about the details of how I do this, I just want to quickly illustrate the way most people have probably seen linear regression done if they’ve worked with R before):

[19]:

lm(income ~ age + education)


Call:
lm(formula = income ~ age + education)

Coefficients:
(Intercept)          age    education
    -102000          200        10000

Now, as you’ll learn in your statistics courses, the coefficients of a linear regression can actually be expressed in terms of matrices as

\[(X'X)^{-1}X'Y\]

(That is, the inverse of the transpose of X times X, matrix multiplied by the transpose of X times Y, where X is a matrix of predictors and Y is the variable we’re trying to predict.) So we can write this directly as:

[20]:

# Make X. Note we need a column of 1s for our intercept.
X <- cbind(age, education, rep(1, 3))
X

A matrix: 3 × 3 of type dbl
age	education
20	12	1
35	16	1
55	11	1

[21]:

# Then we can calculate the components separately
# to make the mapping to the formula clear.
# Here's X' (i.e. X transpose)

X_transpose <- t(X)
X_transpose

A matrix: 3 × 3 of type dbl
age	20	35	55
education	12	16	11
	1	1	1

[22]:

# And now the block on the left: (X'X)^-1
inverse_of_X_transpose_times_X <- solve(X_transpose %*% X)

[23]:

# And the block on the right: X'Y
X_transpose_y <- X_transpose %*% income

[24]:

# And putting it all together we get:
inverse_of_X_transpose_times_X %*% X_transpose_y

A matrix: 3 × 1 of type dbl
age	200
education	10000
	-102000

Ta-da! That final matrix gives us our regression coefficients, which are identical to those from lm! Now you know how to do linear regression using matrices in R!

Note that we broke out all the terms to make it easier to map the R code onto the equation for linear regression; you can of course just run this in one line if you want:

[25]:

solve(t(X) %*% X) %*% (t(X) %*% income)

A matrix: 3 × 1 of type dbl
age	200
education	10000
	-102000
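
As an aside, here is a sketch of a common variation rather than anything required for this lesson: solve(A, b) solves the system A %*% beta = b directly without forming the inverse, and crossprod(X) is base R shorthand for t(X) %*% X. Putting the intercept column first (my choice, so the order matches lm's output) lets us compare the result against coef() from lm:

```r
income <- c(22000, 65000, 19000)
age <- c(20, 35, 55)
education <- c(12, 16, 11)

# Intercept column first so coefficients line up with lm's ordering
X <- cbind(intercept = 1, age, education)

# Solve (X'X) beta = X'Y without explicitly inverting X'X
beta <- solve(crossprod(X), crossprod(X, income))
beta

# Compare against the coefficients lm reports
lm_coefs <- coef(lm(income ~ age + education))
all.equal(as.vector(beta), unname(lm_coefs))
```

For a toy example like this either approach is fine; the two-argument form of solve() is generally the more numerically stable habit for larger problems.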

Recap

Matrices are a natural extension of vectors from one dimension into two dimensions.
You can think of a matrix as a collection of vectors of the same length side by side.
Like vectors, we can do math using full matrices, as well as get summary statistics on both matrices as a whole and their rows/columns.
R implements lots of linear algebra-specific mathematical operations for matrices, like matrix multiplication and inverses.

Next Steps

Now that we’re familiar with matrices, time to learn to manipulate them!