Session 1: R Basics

Rutgers University Sociology R Bootcamp

Fred Traylor, Lab TA (he/him)

September 1, 2022

Welcome!!

Schedule

Bootcamp Goals

  1. R

    • Programming Basics
    • Data Types
    • Data Management
    • Packages
    • Functions (using, not writing)
  2. Statistics

    • Data Structures

What you will need for the bootcamp

Computer with:

A copy of these slides

How to Use These Slides

These slides will work with any web browser and operate like PowerPoint.

On slides with code snippets, you can click the red clipboard button on the right side of each chunk to copy the code in the snippet. You can then paste it into R for quick running.

How This Will Flow

We’ll be pairing these slides with some work in R.

Please work through the R as we go so you can get practice with the code

Feel free to ask questions at any time throughout

We will also have “Official Question Times,” often combined with short breaks

Today’s Goals

  1. Intro to R

  2. R Basics

    • Objects

    • Groups of Objects

    • Groups of Groups of Objects

    • Some basic functions

    • Two keyboard shortcuts

Why R?

Why learn and use R, especially if you know another statistics program already?

R vs. RStudio

R

RStudio

R Cheatsheets

As I mentioned before, R has a very large, very great online community. One of the premier features they’ve put together is a series of cheatsheets. The following cheatsheets will be helpful for you in learning our bootcamp content.

Here is the link to others if you lose yours and/or just want to look around: https://www.rstudio.com/resources/cheatsheets/

The RStudio Window

Four Panes

  1. Console (bottom left)

  2. Source (top left)

  3. Environment / History (top right)

  4. Files / Plots / Packages / Help / Viewer (bottom right)

This layout can be changed, but it is the default and works well for almost everyone.



(If you don’t see the “Source” pane, click one of the buttons on the top right of the Console pane.)

Let’s do R: The Very Basics

Running Code from the Console

If you click in the console (bottom left), you can begin typing code into R.

To run it, simply hit Enter, and R will run what you’ve typed.

Let’s get started with that.

R as a Calculator

At its most basic, R (like all computers) is just a calculator. We can do addition, subtraction, multiplication, division, long functions, complex functions, (nearly) anything.

8+2
## [1] 10
5-6
## [1] -1
45*2
## [1] 90
6/3
## [1] 2
8+4-15*753*4-12
## [1] -45180
(8+2)-(6^(-3+7))
## [1] -1286
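As a quick illustration of the last two lines above, here is a minimal sketch of how R orders its arithmetic. The object names (no_parens, with_parens) are made up for this example.

```r
# Multiplication happens before addition unless parentheses say otherwise
no_parens <- 2 + 3 * 4       # 3 * 4 is evaluated first, then 2 is added
with_parens <- (2 + 3) * 4   # parentheses force the addition to happen first
no_parens
## [1] 14
with_parens
## [1] 20
```

When in doubt, add parentheses. They cost nothing and make the intent obvious.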

Running Code from the Source Pane

While you can type everything directly into the console pane (bottom left), it is good practice to begin typing your script into the source pane (top left).

To run a line from the source pane: Press Ctrl + Enter (Cmd + Enter on a Mac) and R will run everything it thinks you want. You can also click the “Run” button in the top right.

Storing Objects

Sometimes we want to store numbers so we can reference them again. We can use the assignment arrow <- to assign values to x and y.

x <- 8 + 2
x
## [1] 10
y <- x + 7
y 
## [1] 17

Because we set x equal to the sum of 8 and 2 (which is 10), we can use x later on to calculate y.

A short note on saving values

Assignment (aka saving something to something else) can be done with both = and <-.

For the remainder of this bootcamp and through this next year, we’ll be using <- to assign.

This is the standard method because it makes it clear which item is being assigned to which.

To make <- quickly, press ALT and - (the minus key) at the same time. (On a Mac, press Option and -.)

The Global Environment

If you look at the top right pane of your RStudio window, you’ll see two “Values” being stored: x, set at 10, and y, which is set to 17.

As we go along, you can always take a look at the environment to see what you have stored and what’s inside.

Vectors: Groups of Values

While it’s nice to use R just as a plain calculator, we almost always have more than one value we’re working with at a time.

We can create “vectors” of these items with the c() command. (c stands for “concatenate.”)

scores <- c(2, 4, 6, 8, 14)
scores
## [1]  2  4  6  8 14

What we did here was create an object called “scores,” which has five values. We then printed it to make sure it was what we wanted. (Not always necessary, but it doesn’t hurt to double-check.)

When we use a math operator (like addition or division) on a vector, it uses that operator on each piece of the vector. For example:

newscores <- scores * 1.5
newscores
## [1]  3  6  9 12 21

Above, I multiplied scores times 1.5 and assigned this to a new object, newscores. I then printed it to show what our new scores are.
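The same element-by-element behavior applies when both sides of the operator are vectors of the same length. Here is a small sketch; the bonus vector is invented purely for illustration.

```r
scores <- c(2, 4, 6, 8, 14)
bonus <- c(1, 0, 2, 0, 1)    # hypothetical bonus points, not part of our real data
total <- scores + bonus      # first + first, second + second, and so on
total
## [1]  3  4  8  8 15
```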

Vectors - Indexing

If we want to pull a specific value from our vector, we can use what’s called “indexing.”

We use square brackets [ ] to do this.

It takes the form: vectorname[element]

For example, if we want the third element of scores, we can index it like so:

scores
## [1]  2  4  6  8 14
scores[3]
## [1] 6

We can also ask for more than one element in a vector at a time like so:

scores[c(2,3)]
## [1] 4 6

See what we did there? We used a vector ( c(2,3) ) to index another vector (scores).
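Since the thing inside the brackets is just a vector, we can also store it first and reuse it. A minimal sketch, where the name wanted is only an example:

```r
scores <- c(2, 4, 6, 8, 14)
wanted <- c(1, 5)     # positions to pull out; R counts positions starting at 1
scores[wanted]
## [1]  2 14
scores[1]             # the first element sits at position 1, not 0
## [1] 2
```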

Official Question Time 1

Since we started, we’ve done:

  1. What is R? What is RStudio?

  2. R as a Calculator

  3. Storing Objects

  4. Vectors

  5. Indexing a Vector

Text as Data

We can also use R for things that are not just numbers. Let’s create a vector of words.

cities <- c(Oklahoma City, Dallas, Charlotte, Piscataway)
print(cities)

Because the words aren’t in quotation marks, R tries to read them as object names (and trips over the space in Oklahoma City), so the above code doesn’t work. Instead, we have to put quotation marks around everything that is text.

cities <- c("Oklahoma City", "Dallas", "Charlotte", "Piscataway")
print(cities)
## [1] "Oklahoma City" "Dallas"        "Charlotte"     "Piscataway"

Object Names

As an aside, R is great at being able to handle object names that are both very short and very long. There are a few rules R makes you follow for naming:

  1. Names must start with a letter or a period, though it is better to start with a letter.

  2. Names can only contain letters, numbers, underscores, and periods.

  3. You can’t use certain special keywords as names.

    • if, else, TRUE, FALSE, NA, NaN, NULL, while, repeat, for, function, and some others

There are, however, a few naming conventions. The key one is that names should be as short as possible while still remaining accurate.

Our vector “cities” could have also been called “cities_where_fred_has_lived,” but the name is unnecessarily long, so “cities” works better for now.

If we have vectors of cities for more than one person, it’d be better then to call it fred_cities or cities_fred so it can be distinguished from someone else’s cities (like cities_tom or cities_quan).

Finally, R won’t stop you from overwriting another object, so use caution with common names like “data” or “file” or “vector.”
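To see what overwriting looks like in practice, here is a short sketch. It reuses the cities_fred name suggested above, and the split of cities between the two lines is arbitrary.

```r
cities_fred <- c("Oklahoma City", "Dallas")
cities_fred <- c("Charlotte", "Piscataway")   # no warning: the old values are simply gone
cities_fred
## [1] "Charlotte"  "Piscataway"
```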

R Logical Operators

R can also perform “logical tests” for us. These ask whether two values are equal to each other. For example:

1==3
## [1] FALSE
2==2
## [1] TRUE
(4+3)==(14/2)
## [1] TRUE

When performing these, we use two equal signs (==) so that it knows what we’re trying to do.

R can also do logical tests for greater than (>), less than (<), greater than or equal to (>=), and less than or equal to (<=).

4>19
## [1] FALSE
5>=3
## [1] TRUE
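The difference between one equal sign and two is easy to miss, so here is a small sketch of what each one does. The object x here is just a throwaway example.

```r
x <- 10
x == 3    # two equal signs: a test, so x is left unchanged
## [1] FALSE
x = 3     # one equal sign: an assignment, so x is now 3
x == 3
## [1] TRUE
```

If a logical test seems to "change" your data, an accidental single = is a likely culprit.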

Logical Test over a Vector

Let’s say our scores from earlier are considered passing if they are larger than five. We can create a vector, called passing, that contains whether each score is greater than five.

scores
## [1]  2  4  6  8 14
passing <- scores > 5
passing
## [1] FALSE FALSE  TRUE  TRUE  TRUE
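A logical vector like passing can be put to work right away. The sketch below uses it inside square brackets to keep only the passing scores, and uses sum(), a function we have not covered yet, to count them.

```r
scores <- c(2, 4, 6, 8, 14)
passing <- scores > 5
scores[passing]   # keeps only the elements where passing is TRUE
## [1]  6  8 14
sum(passing)      # TRUE counts as 1 and FALSE as 0, so this counts the passing scores
## [1] 3
```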

Data Type Review

We’ve worked with three types of data so far.

In case we forget what type our data is, R has functions built in that can help us remember. Let’s take another look at our vectors and see what’s inside them.

scores
## [1]  2  4  6  8 14
cities
## [1] "Oklahoma City" "Dallas"        "Charlotte"     "Piscataway"
passing
## [1] FALSE FALSE  TRUE  TRUE  TRUE

So the first vector (scores) has numbers. The second vector (cities) has text, and the third (passing) has logic (TRUE and FALSE).

To make sure R has them correct, we can use the class() function on each of them, and R will tell us what kind they are.

class(scores)
## [1] "numeric"
class(cities)
## [1] "character"
class(passing)
## [1] "logical"

Official Question Time 2

Since the last OQT, we’ve done:

  1. Text Data

  2. Object naming conventions

  3. Logical tests and logical data

  4. Logical tests over a vector

  5. Data type review

  6. Common Error #1: Using = instead of == when performing logical tests

Functions

Intro to Functions

Beyond the base calculator options, functions are how things get done in R. We’ve already used two functions so far:

Let’s try out a few more useful functions on the next few slides.

Each function takes the form: func(argument, argument, ..., etc)

Functions: seq()

seq() creates a sequence of numbers. It takes the form: seq(from, to, by). Let’s examine the function by typing ?seq into the console and pressing enter. This will give us the documentation for the sequence function.

We can start by creating a sequence of numbers from 3 to 27, counting by 3s.

seq1 <- seq(from = 3,
            to = 27,
            by = 3)
seq1
## [1]  3  6  9 12 15 18 21 24 27

Functions: Order of Arguments

There are a few other ways we can write the same function, though. We could put the arguments in a different order.

seq2 <- seq(to = 27,
            from = 3,
            by = 3)
seq2
## [1]  3  6  9 12 15 18 21 24 27
seq1 == seq2
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

We could also omit the names of the arguments altogether, BUT they have to be in the same order as in the function’s documentation. (Otherwise, R will either throw an error or quietly return a different result.)

seq3 <- seq(3, 27, 3)

seq3
## [1]  3  6  9 12 15 18 21 24 27
seq1 == seq3
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
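Here is a short sketch of a few more ways to call seq(), to show why argument order and names matter. The length.out argument appears in ?seq; the object names are made up for this example.

```r
by_name <- seq(from = 3, to = 27, by = 3)
mixed <- seq(3, 27, by = 3)                          # first two by position, last by name
counted <- seq(from = 3, to = 27, length.out = 9)    # ask for 9 values instead of a step size
reversed <- seq(27, 3, -3)                           # same numbers in a new order: a different result
reversed
## [1] 27 24 21 18 15 12  9  6  3
```

Naming your arguments takes a few more keystrokes, but it protects you from this kind of silent mix-up.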

Functions: length(), head(), and tail()

Let’s create a really long vector using seq().

longvec <- seq(5, 10000, 12)
longvec
##   [1]    5   17   29   41   53   65   77   89  101  113  125  137  149  161  173
##  [16]  185  197  209  221  233  245  257  269  281  293  305  317  329  341  353
##  [31]  365  377  389  401  413  425  437  449  461  473  485  497  509  521  533
##  [46]  545  557  569  581  593  605  617  629  641  653  665  677  689  701  713
##  [61]  725  737  749  761  773  785  797  809  821  833  845  857  869  881  893
##  [76]  905  917  929  941  953  965  977  989 1001 1013 1025 1037 1049 1061 1073
##  [91] 1085 1097 1109 1121 1133 1145 1157 1169 1181 1193 1205 1217 1229 1241 1253
## [106] 1265 1277 1289 1301 1313 1325 1337 1349 1361 1373 1385 1397 1409 1421 1433
## [121] 1445 1457 1469 1481 1493 1505 1517 1529 1541 1553 1565 1577 1589 1601 1613
## [136] 1625 1637 1649 1661 1673 1685 1697 1709 1721 1733 1745 1757 1769 1781 1793
## [151] 1805 1817 1829 1841 1853 1865 1877 1889 1901 1913 1925 1937 1949 1961 1973
## [166] 1985 1997 2009 2021 2033 2045 2057 2069 2081 2093 2105 2117 2129 2141 2153
## [181] 2165 2177 2189 2201 2213 2225 2237 2249 2261 2273 2285 2297 2309 2321 2333
## [196] 2345 2357 2369 2381 2393 2405 2417 2429 2441 2453 2465 2477 2489 2501 2513
## [211] 2525 2537 2549 2561 2573 2585 2597 2609 2621 2633 2645 2657 2669 2681 2693
## [226] 2705 2717 2729 2741 2753 2765 2777 2789 2801 2813 2825 2837 2849 2861 2873
## [241] 2885 2897 2909 2921 2933 2945 2957 2969 2981 2993 3005 3017 3029 3041 3053
## [256] 3065 3077 3089 3101 3113 3125 3137 3149 3161 3173 3185 3197 3209 3221 3233
## [271] 3245 3257 3269 3281 3293 3305 3317 3329 3341 3353 3365 3377 3389 3401 3413
## [286] 3425 3437 3449 3461 3473 3485 3497 3509 3521 3533 3545 3557 3569 3581 3593
## [301] 3605 3617 3629 3641 3653 3665 3677 3689 3701 3713 3725 3737 3749 3761 3773
## [316] 3785 3797 3809 3821 3833 3845 3857 3869 3881 3893 3905 3917 3929 3941 3953
## [331] 3965 3977 3989 4001 4013 4025 4037 4049 4061 4073 4085 4097 4109 4121 4133
## [346] 4145 4157 4169 4181 4193 4205 4217 4229 4241 4253 4265 4277 4289 4301 4313
## [361] 4325 4337 4349 4361 4373 4385 4397 4409 4421 4433 4445 4457 4469 4481 4493
## [376] 4505 4517 4529 4541 4553 4565 4577 4589 4601 4613 4625 4637 4649 4661 4673
## [391] 4685 4697 4709 4721 4733 4745 4757 4769 4781 4793 4805 4817 4829 4841 4853
## [406] 4865 4877 4889 4901 4913 4925 4937 4949 4961 4973 4985 4997 5009 5021 5033
## [421] 5045 5057 5069 5081 5093 5105 5117 5129 5141 5153 5165 5177 5189 5201 5213
## [436] 5225 5237 5249 5261 5273 5285 5297 5309 5321 5333 5345 5357 5369 5381 5393
## [451] 5405 5417 5429 5441 5453 5465 5477 5489 5501 5513 5525 5537 5549 5561 5573
## [466] 5585 5597 5609 5621 5633 5645 5657 5669 5681 5693 5705 5717 5729 5741 5753
## [481] 5765 5777 5789 5801 5813 5825 5837 5849 5861 5873 5885 5897 5909 5921 5933
## [496] 5945 5957 5969 5981 5993 6005 6017 6029 6041 6053 6065 6077 6089 6101 6113
## [511] 6125 6137 6149 6161 6173 6185 6197 6209 6221 6233 6245 6257 6269 6281 6293
## [526] 6305 6317 6329 6341 6353 6365 6377 6389 6401 6413 6425 6437 6449 6461 6473
## [541] 6485 6497 6509 6521 6533 6545 6557 6569 6581 6593 6605 6617 6629 6641 6653
## [556] 6665 6677 6689 6701 6713 6725 6737 6749 6761 6773 6785 6797 6809 6821 6833
## [571] 6845 6857 6869 6881 6893 6905 6917 6929 6941 6953 6965 6977 6989 7001 7013
## [586] 7025 7037 7049 7061 7073 7085 7097 7109 7121 7133 7145 7157 7169 7181 7193
## [601] 7205 7217 7229 7241 7253 7265 7277 7289 7301 7313 7325 7337 7349 7361 7373
## [616] 7385 7397 7409 7421 7433 7445 7457 7469 7481 7493 7505 7517 7529 7541 7553
## [631] 7565 7577 7589 7601 7613 7625 7637 7649 7661 7673 7685 7697 7709 7721 7733
## [646] 7745 7757 7769 7781 7793 7805 7817 7829 7841 7853 7865 7877 7889 7901 7913
## [661] 7925 7937 7949 7961 7973 7985 7997 8009 8021 8033 8045 8057 8069 8081 8093
## [676] 8105 8117 8129 8141 8153 8165 8177 8189 8201 8213 8225 8237 8249 8261 8273
## [691] 8285 8297 8309 8321 8333 8345 8357 8369 8381 8393 8405 8417 8429 8441 8453
## [706] 8465 8477 8489 8501 8513 8525 8537 8549 8561 8573 8585 8597 8609 8621 8633
## [721] 8645 8657 8669 8681 8693 8705 8717 8729 8741 8753 8765 8777 8789 8801 8813
## [736] 8825 8837 8849 8861 8873 8885 8897 8909 8921 8933 8945 8957 8969 8981 8993
## [751] 9005 9017 9029 9041 9053 9065 9077 9089 9101 9113 9125 9137 9149 9161 9173
## [766] 9185 9197 9209 9221 9233 9245 9257 9269 9281 9293 9305 9317 9329 9341 9353
## [781] 9365 9377 9389 9401 9413 9425 9437 9449 9461 9473 9485 9497 9509 9521 9533
## [796] 9545 9557 9569 9581 9593 9605 9617 9629 9641 9653 9665 9677 9689 9701 9713
## [811] 9725 9737 9749 9761 9773 9785 9797 9809 9821 9833 9845 9857 9869 9881 9893
## [826] 9905 9917 9929 9941 9953 9965 9977 9989

Luckily, R prints index numbers in brackets on the left to help us keep track of things, but what if we want the exact number? (Say we want to store it as an object for later usage.)

We can use the length() function to see exactly how long the vector is.

length(longvec)
## [1] 833
longveclength <- length(longvec)
longveclength
## [1] 833

We can also use the head() and tail() functions to get the first or last however many values.

If we look at their documentation (?head or ?tail), we can see they have two arguments: the object and the number of values. We can also see that there is a default of 6 values. This default means that, unless we specify how many we want, it will give us 6 values. (This is helpful if we want a quick look and don’t care how many we get.)

head(longvec)
## [1]  5 17 29 41 53 65
head(longvec, 2)
## [1]  5 17

If we look at the documentation for tail(), we can see that it just gives us the documentation for head(). This is because they are essentially the same function, just for different ends of the vector.

tail(longvec, 21)
##  [1] 9749 9761 9773 9785 9797 9809 9821 9833 9845 9857 9869 9881 9893 9905 9917
## [16] 9929 9941 9953 9965 9977 9989
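These functions combine nicely with indexing. As a small sketch, here are two ways to get the very last value of longvec; the name last_index is just for illustration.

```r
longvec <- seq(5, 10000, 12)
last_index <- length(longvec)   # 833, the position of the final element
longvec[last_index]             # index with the stored length
## [1] 9989
tail(longvec, 1)                # or ask tail() for one value
## [1] 9989
```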

Comments

In your R source code (top left of the window), you can (and should) make comments to help remind you of things. Anything following a number sign/pound sign/hashtag/octothorpe (this thing: #) will not be run.

2+2
## [1] 4
# 2+2
2+2 # equals four
## [1] 4
2 # +2
## [1] 2

This is very useful when you’re writing code/script and want to keep a note of something. This is especially helpful if you’re sharing your code with another person (or even Future You), since it lets you label what you did and why.

# Below I do addition
2+2
## [1] 4
2+3  # I used three here because it's larger than two
## [1] 5


Throughout this year, you should also add comments onto your homework assignments so that Quan, Tom, and I know what you’re doing. It’s part of showing your work. I’ll provide examples of this when we do big practice sessions later on.

It’s also a great way to keep notes when we do lab sessions in class.



A quick way to “comment out” a whole line or group of lines is to press Ctrl + Shift + c (Cmd + Shift + c on a Mac).


This means we now have learned two R shortcuts:

  1. Assignment arrow: ALT and - (Option and - on a Mac)
  2. Comment: Ctrl + Shift + c (Cmd + Shift + c on a Mac)

(One more to come tomorrow)

Official Question Time 3

Since the last OQT, we’ve done:

  1. Functions

    • seq()

    • length()

    • head()

    • tail()

  2. Common Error #2: Forgetting to separate arguments with a comma

  3. Common Error #3: Forgetting to close a function with a parenthesis )

  4. Comments

    • Shortcut for commenting a whole line: Ctrl/Cmd + Shift + c

More R: Groups of Vectors

Groups of Vectors

There are two main ways that we can store multiple vectors together: Lists and Data Frames.

We’re going to skip lists for now. They’ll become (minorly) important in a few months, so it’s not worth looking at just yet.

Data Frames

Data frames are R objects that combine vectors both horizontally and vertically. If you’ve worked with any sort of data before, including Excel sheets, data frames will look familiar to you.

Data frames are also how we will be doing the vast majority of our work this year.

Let’s create one using our previous vectors.

firstdf <- data.frame(scores, newscores)
firstdf
##   scores newscores
## 1      2         3
## 2      4         6
## 3      6         9
## 4      8        12
## 5     14        21

We can see that our data frame “firstdf” has names for the columns. Fortunately, they do not take up a row themselves. The numbers on the left also don’t count, but are helpful for our purposes later.

There are several very helpful functions for working with data frames.

For example, we can use the ncol() function to see the number of columns there are and nrow() to see the number of rows there are.

If we wanted them together, we could call dim(firstdf), short for dimensions, and it would give us both.

ncol(firstdf)
## [1] 2
nrow(firstdf)
## [1] 5
dim(firstdf)
## [1] 5 2

Lastly, we can also use the function str() to analyze the structure of the data frame. (This works better for small data frames but gets unwieldy very easily when used with larger ones.)

str(firstdf)
## 'data.frame':    5 obs. of  2 variables:
##  $ scores   : num  2 4 6 8 14
##  $ newscores: num  3 6 9 12 21

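This function is neat because it gives us the number of observations (rows) and variables (columns), along with each column’s name, type, and first few values.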

Data Frames - Indexing

If we want to extract certain values from firstdf, we can index them. Indexing data frames takes the form dataframe[row,column].

If we index the first row (firstdf[1, ]), for example, we’ll get the first row’s numbers (2, 3).

firstdf[1, ]   # First Row
##   scores newscores
## 1      2         3
firstdf[ ,1]   # First Column
## [1]  2  4  6  8 14
firstdf[1,1]   # First Row, First Column
## [1] 2

The comma is important here. Without it, R treats the number as a column and returns that column as a one-column data frame. If we use double brackets without a comma, it returns the column as a plain vector.

firstdf
##   scores newscores
## 1      2         3
## 2      4         6
## 3      6         9
## 4      8        12
## 5     14        21
firstdf[1]
##   scores
## 1      2
## 2      4
## 3      6
## 4      8
## 5     14
firstdf[[1]]
## [1]  2  4  6  8 14
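If the printed output makes it hard to tell these apart, class() can confirm what each form returns. A minimal sketch that rebuilds firstdf from scratch so it runs on its own:

```r
scores <- c(2, 4, 6, 8, 14)
newscores <- scores * 1.5
firstdf <- data.frame(scores, newscores)

class(firstdf[1])      # single brackets, no comma: still a data frame
## [1] "data.frame"
class(firstdf[[1]])    # double brackets: a plain vector
## [1] "numeric"
class(firstdf[, 1])    # comma form: also a plain vector
## [1] "numeric"
```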

Data Frames - Rows and Columns

Remember how data frames are groups of vectors? Well we can treat each column as a vector, complete with its own name. Let’s start by seeing what those names are. We can use colnames() or just names() to do so.

colnames(firstdf)
## [1] "scores"    "newscores"
names(firstdf)
## [1] "scores"    "newscores"

We can also use rownames() to get the names of our rows.

rownames(firstdf)
## [1] "1" "2" "3" "4" "5"

Reassigning Row and Column Names

Our rows are currently just numbers, so let’s assign them something else.

rownames(firstdf)
## [1] "1" "2" "3" "4" "5"
rownames(firstdf) <- c("karen", "quan", "tom", "kristen", "paul")
rownames(firstdf)
## [1] "karen"   "quan"    "tom"     "kristen" "paul"

See what we did there?

  1. We knew that rownames(firstdf) would give us a vector of our row names.
  2. We used our assigning arrow (<-) to assign each of our row names to a new name, found in the vector to the right of it.

Let’s try it again, this time renaming the columns.

newcolnames <- c("before_curve", "after_curve")     # Creating a vector for names
newcolnames
## [1] "before_curve" "after_curve"
colnames(firstdf) <- newcolnames                    # Assigning the new names
print(firstdf)
##         before_curve after_curve
## karen              2           3
## quan               4           6
## tom                6           9
## kristen            8          12
## paul              14          21

Data Frames - Indexing with Row and Column Names

Just like we were able to index rows and columns with their numbers (e.g. firstdf[1,]), we can do the same using their row and/or column names. We can also use the dollar sign ($) to select a column from a data frame (but not a row).

When we index with a column name like this, we don’t need to include a comma. It already knows which column, so all it needs now is a row.

firstdf$before_curve
## [1]  2  4  6  8 14
firstdf$before_curve[3]
## [1] 6
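Names also work inside square brackets, as long as they are in quotation marks. The sketch below rebuilds the data frame in one step so it runs on its own; the row.names argument of data.frame() is used here as a shortcut for the rownames() step we did earlier.

```r
firstdf <- data.frame(before_curve = c(2, 4, 6, 8, 14),
                      after_curve = c(3, 6, 9, 12, 21),
                      row.names = c("karen", "quan", "tom", "kristen", "paul"))

firstdf[ , "after_curve"]        # one column, by name
## [1]  3  6  9 12 21
firstdf["tom", "after_curve"]    # one cell: row name first, then column name
## [1] 9
tom_row <- firstdf["tom", ]      # one whole row, by name (note the comma)
tom_row$before_curve
## [1] 6
```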

Data Frames - Functions

Because data frames are just groups of vectors, we can use functions on them.

firstdf$column_three <- seq(8, 12)     # Number of new values must match the number of rows in the data frame
firstdf$c1_plus_c2 <- firstdf$before_curve + firstdf$after_curve
firstdf
##         before_curve after_curve column_three c1_plus_c2
## karen              2           3            8          5
## quan               4           6            9         10
## tom                6           9           10         15
## kristen            8          12           11         20
## paul              14          21           12         35

We can also use head() and tail() with data frames.

head(firstdf, 2)  # First two rows
##       before_curve after_curve column_three c1_plus_c2
## karen            2           3            8          5
## quan             4           6            9         10
tail(firstdf, 4)  # Last four rows
##         before_curve after_curve column_three c1_plus_c2
## quan               4           6            9         10
## tom                6           9           10         15
## kristen            8          12           11         20
## paul              14          21           12         35

And with columns inside those data frames.

head(firstdf$column_three, 5)  # First five values in column_three
## [1]  8  9 10 11 12
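Logical tests work on columns too, since each column is a vector. Here is a small sketch that adds a TRUE/FALSE column; the column name passed and the cutoff of 5 are assumptions borrowed from our earlier passing example.

```r
firstdf <- data.frame(before_curve = c(2, 4, 6, 8, 14),
                      after_curve = c(3, 6, 9, 12, 21))

firstdf$passed <- firstdf$after_curve > 5   # one TRUE or FALSE per row
firstdf$passed
## [1] FALSE  TRUE  TRUE  TRUE  TRUE
class(firstdf$passed)
## [1] "logical"
```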

Official Question Time 4

Since the last OQT, we’ve done:

  1. Data Frames

    • Creating

    • Indexing

    • Row and Column Length

    • Row and Column Naming

    • Functions: head() and tail()

Viewing Data

Looking at Data with View()

With any data structure, you can use the View() function to see inside. It works fine for vectors, but becomes extremely helpful with data frames. Go ahead and run View(firstdf) now.

You’ll see something like this in the source pane:

The table() Function

Another small dataset

For the next few minutes, we’ll be working with a small dataset. Let’s build it below:

smalldata <- data.frame(colors = c("Scarlet", "Black", "Black", "Scarlet", "Scarlet"),
                        teams = c("Knights", "Hoosiers", "Knights", "Knights", "Hoosiers"))

Let’s take a look at what’s inside:

print(smalldata)
##    colors    teams
## 1 Scarlet  Knights
## 2   Black Hoosiers
## 3   Black  Knights
## 4 Scarlet  Knights
## 5 Scarlet Hoosiers
str(smalldata)
## 'data.frame':    5 obs. of  2 variables:
##  $ colors: chr  "Scarlet" "Black" "Black" "Scarlet" ...
##  $ teams : chr  "Knights" "Hoosiers" "Knights" "Knights" ...
View(smalldata)

So we can see that we’re working with a data frame that has 5 observations of 2 variables. Both variables are character, too.


The table() function

The table() function gives us some quick counts of our data. We can use it when we want to look at how frequently one or two variables appear.

table(smalldata$colors)
## 
##   Black Scarlet 
##       2       3
table(smalldata$colors, smalldata$teams)
##          
##           Hoosiers Knights
##   Black          1       1
##   Scarlet        1       2

The first variable listed will always go on the left, and the second variable across the top.

We can see from the dataset itself that “Black” is used twice and “Scarlet” three times. This is reflected in the tables we made as well.

And we can see that Black is on the same line as “Hoosiers” once, so there is a 1 where Black and Hoosiers intersect on the second table. Similarly, there are two rows in the data frame that have both Scarlet and Knights, so the cell where they meet in the second table has a 2.
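A table can be stored and indexed like our other objects. As a minimal sketch, we can save the two-way table under a name (counts is just an example) and pull out single cells using the row and column labels.

```r
smalldata <- data.frame(colors = c("Scarlet", "Black", "Black", "Scarlet", "Scarlet"),
                        teams = c("Knights", "Hoosiers", "Knights", "Knights", "Hoosiers"))

counts <- table(smalldata$colors, smalldata$teams)
counts["Scarlet", "Knights"]    # row label first, then column label
## [1] 2
counts["Black", "Hoosiers"]
## [1] 1
sum(counts)                     # all the cells add up to the number of rows
## [1] 5
```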

Official Question Time 5

Since the last OQT, we’ve done:

  1. Viewing data

    • View()

    • str()

    • table()

One Last Thing: The Global Environment

After we assigned our first two values, I asked you to take a look at the “Environment” pane of your RStudio window (in the top right). Now that we’ve added more since then, take another look. In that pane, you’ll see everything we’ve made so far, separated into “Data” (which includes data frames and lists) and “Values” (which holds vectors and standalone values).