4.5 Vectors

Usually, we don’t want to work with just a single data point - we will typically have multiple values that we want to store together. The most convenient way of doing this in R is using vectors. A vector stores multiple data points, preserving their order.

We create a vector using the c() (short for “concatenate”) function. For example:

plants <- c("Feverfew", "Ivy", "Willow")
print(plants)
## [1] "Feverfew" "Ivy"      "Willow"

The data within a vector may be of any type, but all elements of a vector must be of the same data type. What happens if we try to create a vector with multiple data types?

beetles <- c("Weevil", "Firefly", 5)
print(beetles)
## [1] "Weevil"  "Firefly" "5"

Here, we mix character and numeric data. This isn’t allowed, so the numeric 5 is converted to the string "5".

4.5.1 Indexing

We will often want to take a larger vector and extract specific data points from it. To do this, we index our vector using the general syntax:

vectorName[itemPosition]

The position of the first item in the list is 1 (in other words, R is 1-indexed). Let’s try indexing using our plants vector, made above.

print(plants)
## [1] "Feverfew" "Ivy"      "Willow"

To extract "Ivy", we would do:

plants[2]
## [1] "Ivy"

We can also use a colon to extract multiple subsequent elements:

plants[1:2]
## [1] "Feverfew" "Ivy"

We can also provide a vector to index multiple values:

plants[c(1,3)]
## [1] "Feverfew" "Willow"

We often want to extract elements near the end of a vector. plants is short and we can count to the end of it easily, but most of the data we will work with is a lot longer. One easy way to index items near the end of a vector is to use the length() function. We can index the final entry in plants as so:

plants[length(plants)]
## [1] "Willow"

length(plants) is 3, so writing plants[length(plants)] is equivalent to writing plants[3]

Likewise, we can index the second element by doing some math:

plants[length(plants) - 1]
## [1] "Ivy"

4.5.2 Logical Indexing

We often want to subset our data not by the position of elements, but based on whether or not they meet a certain criterion. Below, I have generated a short list of numbers:

myNumbers <- c(1, 54, 12.2, 70, 18, 24, 94)

Let’s say we want to extract just the values that are greater than 15 from this list. We can use any of our comparative operators with a vector to compare all values within the vector:

myNumbers > 15
## [1] FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

This generates a logical vector. We can provide this vector as an index to myNumbers to pull out the elements that are TRUE.

myNumbers[myNumbers > 15]
## [1] 54 70 18 24 94

We can use the logical AND (&) and OR (|) operators to combine conditions. For example, extracting values greater than 15 and less than 30:

myNumbers[(myNumbers > 15) & (myNumbers < 30)]
## [1] 18 24

4.5.3 Modifying Vectors

Once we point to elements within a vector, we can modify them using the assignment operator. For example, making the second item in myNumbers equal to 200:

myNumbers[2] <- 200
print(myNumbers)
## [1]   1.0 200.0  12.2  70.0  18.0  24.0  94.0

We can also modify multiple elements at once. For example, making every value less than 50 equal to 0:

myNumbers[myNumbers < 50] <- 0
print(myNumbers)
## [1]   0 200   0  70   0   0  94

4.5.4 Adding to Vectors

We can add to vectors using the concatenate function:

plants <- c(plants, "Philodendron")
print(plants)
## [1] "Feverfew"     "Ivy"          "Willow"       "Philodendron"