Chapter 2 Introduction to R

2.1 Instructions

This tutorial will introduce you to the basics of the language of R. We will cover how to set up our working environment, mathematical and logical operators, the most common data types in R, explore a simple dataset, write an R function as well as how to seek for help within and outside of R.

Accompanying this tutorial is a short Google quiz for your own self-assessment. The instructions of this tutorial will clearly indicate when you should answer which question.

2.2 Learning Objectives

  • Be familiar with the basic procedures for setting up an R session with functions such as getwd(), setwd(), dir(), install.packages(), and library().
  • Understand the very basic of how and when to use arithmetic and logical operators in R.
  • Be familiar with the most common types of data in R including string, vector, data frame, and list.
  • Explore a dataset using basic Base R functions.
  • Know how to write a new function in R.
  • Be comfortable with and know how to seek for help within and outside of R.

2.3 Set Up Basics

2.3.1 Working Directory

One of the most important function in R is getwd(), or “get working directory.” The output of this code is the pathway of your current R file. Interestingly, getwd() does not have any argument. In other words, you do not have to type anything in the ().

It is highly recommended that all of your files (the R file, any data files, images, etc) be in the same directory. This will make your project much more organized and your life a lot easier when we get into more complicated data analysis that involves more data files.

By default, your working directory is whatever folder your current R file is in.

getwd()
## [1] "C:/Users/ehsan/Documents/GitHub/intro2R"

If at any point, you want to change your working directory to another folder, you can use setwd(). Different from getwd(), setwd() requires an argument within its brackets. To set a new working directory, you need to copy and paste the pathway within the brackets and in quotation marks ("").

This code is helpful when you need to pull files outside of the default working directory. However, you should be mindful when using this function because it gets very confusing very quickly.

#setwd("")

Another important function is dir(). This function lets you check all of the files that exist in your working directory.

This function is a good option if you want to check if there are any extra or missing files from your working directory.

dir()
##  [1] "_book"                                 
##  [2] "_bookdown.yml"                         
##  [3] "_bookdown_files"                       
##  [4] "_build.sh"                             
##  [5] "_deploy.sh"                            
##  [6] "_output.yml"                           
##  [7] "0-r-and-rstudio-set-up.Rmd"            
##  [8] "1-introduction-to-r.Rmd"               
##  [9] "2-importing-data-into-r-with-readr.Rmd"
## [10] "3-introduction-to-nhanes.Rmd"          
## [11] "4-data-analysis-with-dplyr.Rmd"        
## [12] "5-data-visualization-with-ggplot.Rmd"  
## [13] "6-date-time-data-with-lubridate.Rmd"   
## [14] "7-data-summary-with-tableone.Rmd"      
## [15] "8-Exercise-Solutions.Rmd"              
## [16] "9-references.Rmd"                      
## [17] "book.bib"                              
## [18] "data"                                  
## [19] "DESCRIPTION"                           
## [20] "Dockerfile"                            
## [21] "docs"                                  
## [22] "header.html"                           
## [23] "images"                                
## [24] "index.Rmd"                             
## [25] "intro2R.log"                           
## [26] "intro2R.Rmd"                           
## [27] "intro2R.tex"                           
## [28] "intro2R_cache"                         
## [29] "intro2R_files"                         
## [30] "LICENSE"                               
## [31] "now.json"                              
## [32] "packages.bib"                          
## [33] "preamble.tex"                          
## [34] "R.Rproj"                               
## [35] "README.md"                             
## [36] "style.css"                             
## [37] "toc.css"

You will see that there are 2 CSV files in our working directory: last_15_bpx.csv and last_15_demo.csv. Do not worry about what they are right now (we will cover this in later tutorials). All you have to know for now is that these two files are currently residing in our input/tutorial-demo folder, AKA our working directory.

The example above is only to demonstrate how we would change our working directory. But since we want to remain in our default working directory for the rest of this tutorial, we will set our working directory back to the original directory.

setwd("..") # this ".." argument allows us to move back 1 folder

#setwd("") # and then we can set our working directory back to the original by copying and pasting its pathway

After setting a new working directory, it is in our best interest to check the working directory again to see if we are in the right place.

# check to see if we're back to our original directory
getwd()
## [1] "C:/Users/ehsan/Documents/GitHub/intro2R"

Functions Debunked

Throughout our tutorials, you will see a recurring section named Functions Debunked. These sections aim to break down the function that you were just introduced to. Each of these section will include a link for you to find more information about the function, the different arguments that can be nested within each function, and an example.

For this first section, we will debunk #. When you see a # in a code chunk, this means that the following information is a note or comment. In other words, it doesn’t code for anything - it is just notes explaining what we are doing. You can try adding a # in front of our getwd() code above to see what happens!

If this doesn’t make any sense to you right now, do not worry! It will make more sense as we move along the tutorials.

2.3.2 Installing and Attaching Packages

Now that we understand what working directories are, we can move onto installing and attaching packages.

There are a lot of packages on R, each has its own set of functions for different purposes. To access each set of function, we need to install the respective package. To install a package, we use install.packages(). Within the brackets, the only argument you need is the name of the package in quotation marks ("").

The most basic package on R is Base R. We do not actually need to install this package as it should be built into R by default. Therefore, by default, we should already have access to a range of basic R functions without having to install any packages. In this tutorial, we will only be using functions in this Base R package. However, in future tutorials, we will need to install pacakages such as dplyr and ggplot, which will give you access to even more and more advanced functions.

For the sake of demonstration, the ggplot2 package is installed below, but note that we will not be using any ggplot functions in this tutorial.

# install.packages("ggplot2")

A related function is library(). This function is used to attach installed packages to your R session. Unlike install.packages() where you only need to use once, library() needs to be run every R session. In other words, you need to attach whatever package you need everytime to you and then reopen R.

In future tutorials, if the library() function does not work for you, it is most likely because you have not installed the package, and therefore need to use install.packages() first before library().

# another difference from install.packages() is that we do not need "" in library()
library(ggplot2)

To make sure the package is successfully attached, we can try running a function in that package. After running the code below, you should only see a blank square. This is correct! We will go over why this is in tutorial 5.

ggplot()

DO QUESTIONS 1-3 OF THE QUIZ NOW

  • What is a “Working Directory?”

  • What is the main difference between setwd() and getwd()?

  • We need to install the packages first before we can load them using library(). (True or False)

2.4 Arithmetic Operators

As expected from a data analysis software, you can use R like a calcultor using arithmetic operators! Here is a list of a few basic and common arithmetic operators in R:

Figure 1. Arithmetic Operators in R

These operators will prove themselves to be more useful in data analysis when we get to later tutorials, especially our Tutorial 4 on the dplyr package.

Try it yourself 2.1

  1. Can you replicate and solve these problems in R?
  • 2^2
  • 2 × 2
  • 2 + 5 × (5 ÷ 4)^6
  • what is the remainder of 52 ÷ 5
  • what is the whole number solution to 82 ÷ 8
  1. Can you solve for x using R?

a <- 9 + 3 * 6

x <- a ÷ 2

DO QUESTION 4 OF THE QUIZ NOW

  • The output of 10 %% 2 is equal to which of the following?

2.5 Logical Operators

In addition, logical operators are also available on R. The main difference between arithmetic operators and logical operators is that logical operators will yield a logical (or TRUE/FALSE) output. This is a list of some common logical operators that can be used on R:

Figure 2. Logical Operators in R

Similarly, these operators will be more useful when we learn about filtering data in future tutorials and you will also be provided with more examples then.

Here are a few example codes that you can try running:

a <- 5 > 4 
    # the <- indicates that the information 5 > 4 is stored in the variable a - we will cover this in more detail in Section 4 of this tutorial
b <- 8
b != 8
## [1] FALSE
a == b
## [1] FALSE
9 + 10 * 15 - 8 <= 103
## [1] FALSE
11+ 3^9 == 19694
## [1] TRUE
    # note that in R, "equal to" is coded by ==, = has another meaning that you will see in section 4

Try it yourself 2.2

Translate the following into R and find the output: * 8 times 3 is greater than 8? * eleven divided by seven is not equal to 2? * 9 is less than or equal to 18?

DO QUESTION 5 OF THE QUIZ NOW

  • Which of the following operators code for “equal to?”

2.6 Most Common Data Types In R

2.6.1 Strings

Strings are either single character or a collection of characters. Note that all strings are in "". For example:

"hello, my name is Alex"
## [1] "hello, my name is Alex"
"Where are you?"
## [1] "Where are you?"
"I like to eat 6 apples"
## [1] "I like to eat 6 apples"

2.6.2 Vectors

Vector is the simplest data type in R. It is basically a list of components stored in the same place (or variable). To write a vector, starts with c() and input the appropriate components within the brackets. For example:

c(1, 2, 3) # numeric
## [1] 1 2 3
c("hello", "bonjour", "ciao") # character – note that text needs to be in “”
## [1] "hello"   "bonjour" "ciao"
c(TRUE, FALSE, TRUE) # logical – we will cover this in more detail later
## [1]  TRUE FALSE  TRUE
c(1, "hello", TRUE) # mixed
## [1] "1"     "hello" "TRUE"

We can also store these vectors using the symbol <-.

Note: If you are using RStudio, they will be stored in our environment located in the top right window.

numeric <- c(1, 2, 3, 10:12) # 10:12 means 10, 11, 12!
character <- c("hello", "bonjour", "ciao")
logical <- c(TRUE, FALSE, TRUE)

What we just did was assigning values to variables where numeric, character, and logical are all variables!

Try it yourself 2.3

Can you try storing a string? Assign the string “hello world, I am here” to the variable named start.

Note how the string is in "" but the variable name is not. Why do you think this is?

After assigning values to our variables, we can tell R to retrieve them by typing any of these variable names. R will give us the components of data that we assigned to each variable as the output.

numeric
## [1]  1  2  3 10 11 12
character
## [1] "hello"   "bonjour" "ciao"
logical
## [1]  TRUE FALSE  TRUE

If you want to extract a particular component from a variable, you can use []. For example:

numeric[1:4] # the first 4 components
## [1]  1  2  3 10
character[2] # only the 2nd component
## [1] "bonjour"

Try it yourself 2.4

It is important to note that R is case-sensitive. This means that it distinguishes capitalized from non-capitalized characters, so logical and Logical are read as two separate things by R!

Try typing Logical with a capitalized “L.” How does R respond to this?

We can also replace <- with = when assigning values to variables. But = has other uses as well - you will be introduced to their slight differences in future tutorials. Also, note that in R, = does not mean “equal to.” As you have see in the previous section, “equal to” is coded by ==.

2.6.3 Lists

A list is a collection of possibly unrelated components. It allows you to gather different types of data into one place. In the code below, we have numbers, characters, and data frame all in one place.

list <- list(numeric = 1, character = c("bonjour", "hello"), "I like to eat 6 apples")

Similarly, you can also extract specific information from this list using []. Note that in the code below, the output is “bonjour” AND “hello,” this is because the second component of our list is a vector that houses both of these words.

list[2]
## $character
## [1] "bonjour" "hello"

Functions Debunked

The arguments for list() are as follows:

list(Variable name of any data type = Any data stored within that variable OR Any string)

For example: list(numeric = 1, character = c("bonjour", "hello"), dataframe = logical)

2.6.4 Dataframe

Dataframe, you guessed it, stores your data in the form of a dataframe or a table! Dataframe allows you to store multiple vectors into one single table.

dataframe <- data.frame(numeric, character, logical)
dataframe
##   numeric character logical
## 1       1     hello    TRUE
## 2       2   bonjour   FALSE
## 3       3      ciao    TRUE
## 4      10     hello    TRUE
## 5      11   bonjour   FALSE
## 6      12      ciao    TRUE

Functions debunked

The arguments for data.frame() are as follows:

data.frame(Vector 1, Vector 2, Vector n)

For example: dataframe <- data.frame(numeric, character, logical)

If you want to change the column names of your data frame, you can use the function names().

names(dataframe) <- c("Number", "Text", "T/F")

Now if we check our data frame again, the new column names should appear.

dataframe
##   Number    Text   T/F
## 1      1   hello  TRUE
## 2      2 bonjour FALSE
## 3      3    ciao  TRUE
## 4     10   hello  TRUE
## 5     11 bonjour FALSE
## 6     12    ciao  TRUE

Functions Debunked

The arguments for names() are as follows:

names(Name of Dataset) <- A vector of components which length matches with that of the dataset (we will go over length in section 5 of this tutorial)

For example: names(dataframe) <- c("Number", "Text", "T/F")

Try it yourself 2.5

Why do you think numeric, character, and logical are not in "" but Number, Text, and T/F are?

Similar to how we extracted information from vectors and lists, we can also use [] to extract certain rows, columns, or cells in a data frame.

dataframe[1, ] # only fhe first row
##   Number  Text  T/F
## 1      1 hello TRUE
dataframe[, 2] # only the second column
## [1] "hello"   "bonjour" "ciao"    "hello"   "bonjour" "ciao"
dataframe[3, 2] # only cell (3,2) - the third row and second column
## [1] "ciao"

DO QUESTIONS 6 & 7 OF THE QUIZ NOW

  • Which of the following codes would extract only rows 1, 3, 6 and only column 1 from our data frame?

  • What is the value of the cell in the first row and third column of our data frame?

We can also add a new column to our data frame using the function cbind() like below. Note how the column name is in "". It is also important that the new column has the same number of values as the rest of the columns. If the new column contains less values than the other columns, you can use NA, or “not available” values, to fill up the rest of the places!

new_column <- cbind(dataframe, "new column" = c(2, 3, 4, 5, 1, NA))
    # we will learn more about NA values in tutorial 4

Similarly, the function to add a new row is rbind(). The two functions work almost identical, but rbind() does not require a row name.

(new_row <- rbind(new_column, c(13, "hello", FALSE, NA)))
##   Number    Text   T/F new column
## 1      1   hello  TRUE          2
## 2      2 bonjour FALSE          3
## 3      3    ciao  TRUE          4
## 4     10   hello  TRUE          5
## 5     11 bonjour FALSE          1
## 6     12    ciao  TRUE       <NA>
## 7     13   hello FALSE       <NA>

You may have noticed that the code above has an extra () that encompasses the whole code. This () is another way for us to print the output of our function - it is equivalent to if we just run the name of data frame new_row. Try removing the extra () and see what happens!

Functions Debunked

cbind() is used to create new columns in a data frame. The arguments are as follows:

cbind(the Current Data Frame, “Name of the New Column” = Value(s) in the New Column)

rbind is used to create new rows in a data frame. The arguments are as follows:

rbind(the Current Data Frame, Values in the New Row)

There exists many other types of data types on R, you are free to explore them on your own time. But what we have been introduced to are the most basic ones.

2.7 Exploring a Dataset

Now that you are familiar with the different data types and operators of R, we can move on to the fun parts of this tutorial: exploring a dataset!

To introduce you to the concept of exploring data on R, we will be using a dataset already available on R - in other words, we will not be importing data into R yet, we will cover this in another tutorial. Conveniently, R has a set of built-in datasets that we can use to practice using basic R functions. In this tutorial, we will use the dataset named “faithful” which contains information on the Old Faithful Geyser in Yellowstone National Park. Run the codes below to explore the dataset.

# information on the data
# ?faithful
# the actual data
# faithful
# print(faithful)

2.7.0.1 Functions Debunked

print() is another option for you to use if you want to see a variable, dataset, or any other type of output.

print(Any Object)

For example: print(faithful), print(faithful$eruptions), print(1:12)

Try it yourself 2.6

  1. What are 2 ways that we can print rows 1 to 5 of the data frame faithful?
  2. What is the value of the cell in the fourth row and second column of the data frame faithful?

2.7.1 Dimensions

Usually, the first thing we want to do when exploring a dataset is to check its dimensions. To do this, we use the function dim() with the dataset name between the ().

dim(faithful)
## [1] 272   2

You should see two numbers as the output: 272 and 2. This tells us that the dataset faithful has 272 rows (AKA observations) and 2 columns (AKA variables). Checking the dimensions of our datasets may be helpful when we want to check how large our dataset is after a certain data manipulation method. This may also be helpful to check if our manipulated or original data is abnormally large or small.

2.7.2 Structure

We can also check the structure of our data using the function str() with the dataset name between the ().

str(faithful)
## 'data.frame':    272 obs. of  2 variables:
##  $ eruptions: num  3.6 1.8 3.33 2.28 4.53 ...
##  $ waiting  : num  79 54 74 62 85 55 88 85 51 85 ...

In this case, we are provided with several pieces of information: 1. the dataset faithful is a data frame 2. there are two columns, or variables, in this dataset: eruptions and waiting 3. both eruptions and waiting contain numerical data

As you can see, str() can be very helpful if we want to check what kind of data we are working with and how large the data is.

2.7.3 Class

The class of our data can also be checked using the function class(). Similarly, the dataset name or the variable name can go between the ().

class(faithful)
## [1] "data.frame"
class(faithful$eruptions)
## [1] "numeric"

Note how if we are checking the class of the variable eruption, we need to have the dataset name followed by a $ first before we can write the variable name.

Try it yourself 2.7

Write a code to find the structure of the variable waiting in the faithful dataset.

2.7.4 Length

We can also check the length of our dataset or variable using length().

length(faithful)
## [1] 2
    ## the output should be 2 - the number of variables in our dataset!
length(faithful$eruption)
## [1] 272
    ## the output should be 272 - the number of observations in this variable!

Try it yourself 2.8

Remember those variables that we created earlier in the tutorial? Try finding the lengths of data frame and numeric.

Challenge: Psst! There are actually 2 ways for you to find the length of numeric.

2.7.5 Head and Tail

So we are now somewhat familiar with the basic functions to explore the general information about our dataset, YAY! If you want to check the actual dataset (AKA see the actual table), but do not want to see the whole data frame with 272 rows, head() and tail() are good options.

head() shows you the first few rows of your dataset.

head(faithful)
##   eruptions waiting
## 1     3.600      79
## 2     1.800      54
## 3     3.333      74
## 4     2.283      62
## 5     4.533      85
## 6     2.883      55

tail() shows you the last few rows of your dataset.

tail(faithful)
##     eruptions waiting
## 267     4.750      75
## 268     4.117      81
## 269     2.150      46
## 270     4.417      90
## 271     1.817      46
## 272     4.467      74

You can also choose how many rows you want to see

head(faithful, 10)
##    eruptions waiting
## 1      3.600      79
## 2      1.800      54
## 3      3.333      74
## 4      2.283      62
## 5      4.533      85
## 6      2.883      55
## 7      4.700      88
## 8      3.600      85
## 9      1.950      51
## 10     4.350      85
tail(faithful, 2)
##     eruptions waiting
## 271     1.817      46
## 272     4.467      74

As you can see, head() and tail() allow you to check just a portion of the dataset. This is especially useful when you’re working with large datasets and you only want to see part of it to make sure everything is okay!

DO QUESTION 8 OF THE QUIZ NOW

  • Which of the following codes is best to find how large our dataset is?

2.7.6 Mathematical Functions in R

R has a range of mathemtical functions for us to use. Below are only a few basic ones, we will cover much more as we move through our tutorials. Note that these functions only work if the data class is numeric.

We can find the mean of waiting:

mean(faithful$waiting)
## [1] 70.89706

We can also find the maximum and minimum values of waiting:

max(faithful$waiting)
## [1] 96
min(faithful$waiting)
## [1] 43

And the 1st, 2nd, 3rd, and 4th quantile of waiting:

quantile(faithful$waiting, 0.25)
## 25% 
##  58
quantile(faithful$waiting, 0.5)
## 50% 
##  76
quantile(faithful$waiting, 0.75)
## 75% 
##  82
quantile(faithful$waiting, 1)
## 100% 
##   96

As well as the median of waiting:

median(faithful$waiting)
## [1] 76

Another powerful function in R is summary(). It literally summarizes everything that we have just covered in this subsection in one single table. Not only that, if you place the dataset name within the (), it actually runs all of the functions above for all variables in the dataset.

summary(faithful)
##    eruptions        waiting    
##  Min.   :1.600   Min.   :43.0  
##  1st Qu.:2.163   1st Qu.:58.0  
##  Median :4.000   Median :76.0  
##  Mean   :3.488   Mean   :70.9  
##  3rd Qu.:4.454   3rd Qu.:82.0  
##  Max.   :5.100   Max.   :96.0

Try it yourself 2.9

Recall that in order for us to refer to a variable in a dataset, we need to first type the dataset name following by a $ before we can type the variable name.

A way to avoid repeating faithful$ everytime is to attach the dataset using attach(faithful).

Try attaching the dataset faithful then find the mean of the variable eruptions without using $!

DO QUESTION 9 OF THE QUIZ NOW

  • What information does the function summary() provide us with? (select all that apply)

2.8 Writing a New Function In R

While R and its existing packages has a lot to offer, there may be times when the function that we want to use does not really exist. In situations like this, we may want to just write our own function! A function in R is a script and it aims to help you to write reproducible code.

For example, we want to write a function that helps us convert temperature from Celsius to Farenheit. To do this, we would need this shell first:

F_to_C <- function(F) {}

F_to_C is our function name, and function(F) tells R that we want to write a new function. The actual function that we write will be placed between the brackets {}.

Let’s think of the math really quickly. To convert °F to °C, we would need to subtract 32 then times that by 5/9. So that should look like this: (F - 32) * 5 / 9. And we want the function to print the results, so our final function should look like this:

F_to_C <- function(F) {
    print((F - 32) * 5 / 9)
}

Each function has its own name and it is meant to be different from any other function names. To run it, you need to call the function name.

Now let’s see if your function works. We can test it using a known value. Since we know that 32°F is equal to 0°C:

F_to_C(32)
## [1] 0

Awesome! It works!

Here are a few other ways that we can create new functions! Note that we can also nest existing functions inside new functions to make our own functions! For example, knowing that sprintf() is an existing function, we can write the following:

hello <- function(name){
    sprintf("Hello %s", name)
}
hello("Tim")
## [1] "Hello Tim"
hello("Elisa")
## [1] "Hello Elisa"

We can also write new functions that will give us a return statement. A return statement means you want some output or result after running the script to be returned. For example, the following function will always plus 1 to whatever number we nest between the ().

plus_1 <- function(x){
    return(x+1)
}
plus_1(10)
## [1] 11
plus_1(20.93)
## [1] 21.93

Note also that there can be more than one argument in a function! For example, the following function requires the arguments x and y, of which x will be multiplied by 10 and y will be added to the new x.

math_work <- function(x, y){
    x = x * 10
    y = y + x
    result <- list(x, y)
    return(result)
}
math_work(3, 5)
## [[1]]
## [1] 30
## 
## [[2]]
## [1] 35

Once we have written a new function, it will saved in your Global Environment and we can continue to use it as long as the Environment is not wiped. Note that the new functions will not work if you open a new R session that does not store them as functions. But we can also run the code chunks above again to remind R of the new functions.

2.9 For Loops

In R, we can also write for loops that iterate a particular action/code that we want. For example, we can write a loop that adds 1 to every number of a vector. To do this, we first need to create a vector and assign it to a variable. Let’s say we want the vector 1:10 to be assigned to k.

k <- 1:10

After that, the for loop is written like so:

for (i in k){
  print(k[i] + 1)
}
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11

Another way for us to yield the same output is to define the vector 1:10 within the for loop directly, like so:

for (i in 1:10){
  print(i + 1)
}
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11

We can write more complicated for loops by adding more functions, arithmetic operators, or even vectors! For example, if we want to create a different for loop that calculates for the k^2, we should first create a new vector, k.sq, with the the same length as vector k.

k.sq <- 1:10

After that, we can write our for loop like so:

for (i in k){    
  k.sq[i] <- k[i]^2
  print(k.sq[i])
}
## [1] 1
## [1] 4
## [1] 9
## [1] 16
## [1] 25
## [1] 36
## [1] 49
## [1] 64
## [1] 81
## [1] 100

Now, everytime we call for k.sq, the new vector should be the output!

k.sq
##  [1]   1   4   9  16  25  36  49  64  81 100

We can also write for loops for data frames. Let’s take a look at our data frame, dataframe, again.

dataframe
##   Number    Text   T/F
## 1      1   hello  TRUE
## 2      2 bonjour FALSE
## 3      3    ciao  TRUE
## 4     10   hello  TRUE
## 5     11 bonjour FALSE
## 6     12    ciao  TRUE

Now, let’s say we want to loop through the entire data frame to find only the values under the “Number” column. To do that, we first need to know the number of rows our data frame has - we can use nrow() for this. But as we have learned before, we can just define this value directly in our for loop instead of having to do an extra step of finding the number of rows outside of the for loop.

In other words, our code would look like so:

for (i in 1:nrow(dataframe)){
  print(dataframe[i,"Number"])
}
## [1] 1
## [1] 2
## [1] 3
## [1] 10
## [1] 11
## [1] 12

We can also add conditions to our for loops by using if. For instance, we only want to print values under the “T/F” column where i > 3. To do that, we would need to write the following codes:

for (i in 1:nrow(dataframe)){
  if (i > 3) print(dataframe[i,"T/F"])
}
## [1] TRUE
## [1] FALSE
## [1] TRUE

2.10 Help Within and Outside Of R

It gets a while to get used to the language of R and no tutorials can fully explain the complexities and nuances of this language. Therefore, it is important to know how to and where to get help about R! There are so many ways for you to search for help on R, here are only a few methods:

2.10.1 Within R

If you have questions about a function or dataset in R, the easiest way to get help is to type ? before that particular function or dataset - as we have previously seen. Try running the codes below, what do you see?

# ?mean
# ?faithful

If you want to know whether there is a function for a particular action, you can use ?? before the action. If your action is more than one word long (e.g. geometric functions), you can put the entire phrase in quotation marks ("").

# ??"geometic functions"

After running the code above, you should see a list of functions, the package that it belongs to, as well as what it does. For example, the function phil() in the R package named BAS can be usedd to compound confluent hypergeometric function of two variables.

If you are using RStudio, you can also use the search box in the help window at the bottom right corner of our screen.

Figure 3. Help window in RStudio

2.10.2 Outside of R

Outside of R, there are also plenty of resources for you to tap into R Documentation is a good starting point.

Another resource is Stack Overflow, an online forum where you can ask and answer questions relating to coding!

DO QUESTION 10 OF THE QUIZ NOW

  • What are some of the ways that we can find help about R? (select all that apply)

2.11 Summary and Takeaways

In this tutorial, we learned a few basic steps of using R as a data analysis language. Completing this tutorial will prepare you more advanced tutorials in the future. Learning R is like learning another language, so the biggest tip is to practice practice practice!

After this tutorial, you should be familiar with setting up for an R session using basic working directory functions. Additionally, you should also be comfortable with using arithmetic and logical operators and a few common data types of R. You are also introduced with a few basic functions for exploring a dataset as well as several common methods on how to seek help within and outside of R.