R basics

Start using R

To get started with R, follow these steps:

  • Download and install R: Get the latest version from the official R website. Tip: Download from a Comprehensive R Archive Network (CRAN) mirror near your geographic location.

  • Download and install RStudio: You can get it from this link. Note: RStudio is an integrated development environment (IDE) that provides a user-friendly interface for running R commands, saving scripts, inspecting results, managing data, and more.

  • Begin with RStudio: Once you open RStudio, you can start using R. To begin, write your code in an R script and save it, so that you can revise and extend the code later.

Basic syntax

R is a programming language for statistics and data analysis. Let’s break down some of the fundamental aspects of R’s syntax.

  1. Using R as a Calculator

R can be used like a calculator for basic arithmetic operations. For instance:

# Simple arithmetic
1 + 1
#> [1] 2

This is a basic addition, resulting in 2.

A more intricate calculation:

# Complex calculation involving 
# multiplication, subtraction, division, powers, and square root
20 * 5 - 10 * (3/4) * (2^3) + sqrt(25)
#> [1] 45

This shows that R evaluates compound arithmetic expressions using the usual order of operations.
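To make the order of operations concrete, here is a small sketch. The variable names (no_parens, with_parens, quotient, remainder) are chosen only for this illustration; the operators themselves are standard base R.

```r
# ^ is evaluated before * and /, which are evaluated before + and -
no_parens <- 2 + 3 * 4^2      # same as 2 + (3 * 16)
no_parens
#> [1] 50

# Parentheses change the grouping
with_parens <- (2 + 3) * 4^2  # same as 5 * 16
with_parens
#> [1] 80

# Integer division and remainder
quotient <- 17 %/% 5
remainder <- 17 %% 5
quotient
#> [1] 3
remainder
#> [1] 2
```

When in doubt about precedence, adding parentheses costs nothing and makes the intent explicit.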

  1. Variable Assignment in R

R lets you store values in variables, which act like labeled containers whose contents can be recalled and manipulated later. For example:

# Assigning a value of 2 to variable x1
x1 <- 2
print(x1)
#> [1] 2

Similarly:

x2 <- 9
x2
#> [1] 9
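Two details of assignment are worth seeing in action: assigning to an existing name overwrites its value, and names are case-sensitive. The sketch below uses a hypothetical variable named counter so that it does not disturb x1 and x2.

```r
# Assigning again to the same name overwrites the old value
counter <- 2
counter <- counter + 3
counter
#> [1] 5

# Names are case-sensitive: Counter is a different variable from counter
Counter <- 10
counter
#> [1] 5
Counter
#> [1] 10
```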
  1. Creating New Variables Using Existing Ones

You can combine and manipulate previously assigned variables to create new ones.

# Using variable x1 
# to compute its square and assign to y1
y1 <- x1^2
y1
#> [1] 4

You can also use multiple variables in a single expression:

y2 <- 310 - x1 + 2*x2 - 5*y1^3
y2
#> [1] 6
  1. Creating Functions

Functions act as reusable blocks of code. Once defined, they can be called multiple times with different arguments. Here’s how to define a function that squares a number:

z <- function(x) {x^2}

Call the function with different arguments:

z(x = 0.5)
#> [1] 0.25
z(x = 2)
#> [1] 4
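Functions can take more than one argument, and arguments can have default values. The following is a minimal sketch; the function name power and its argument p are hypothetical names used only for illustration.

```r
# A function with a default argument: p = 2 is used unless the caller overrides it
power <- function(x, p = 2) {
  x^p
}

power(3)              # uses the default p = 2
#> [1] 9
power(3, p = 3)       # overrides the default
#> [1] 27
power(c(1, 2, 3))     # works element by element on a vector
#> [1] 1 4 9
```

Because arithmetic in R is vectorized, the same function works on a single number or on a whole vector without any extra code.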

  1. Utilizing Built-In Functions

R also comes with many built-in functions. Examples include exp (the exponential function) and rnorm (random number generation from a normal distribution).

For instance, using the exponential function and its inverse, the natural logarithm:

# Calling functions
exp(x1)
#> [1] 7.389056
log(exp(x1))
#> [1] 2

The rnorm function draws random samples from a normal distribution. Below we draw 10 random values from the normal distribution with mean 0 and standard deviation 1:

rnorm(n = 10, mean = 0, sd = 1)
#>  [1]  0.03824617 -0.41388256 -1.45095902 -0.48677231 -0.27970404  0.67712693
#>  [7] -1.37815945  0.05783297 -0.11008730 -0.42272413

Because the random number generator's internal state changes after every draw, the results differ each time the same command is run.

# Random sampling (again)
rnorm(n = 10, mean = 0, sd = 1)
#>  [1]  0.91614807 -1.46007554  0.05019372  0.26000754 -0.12177978  1.10821941
#>  [7]  0.41004611 -0.43442647 -0.19412417  0.53161582

However, by setting a seed, we can reproduce identical random results:

# Random sampling (again, but with a seed)
set.seed(11)
rnorm(n = 10, mean = 0, sd = 1)
#>  [1] -0.59103110  0.02659437 -1.51655310 -1.36265335  1.17848916 -0.93415132
#>  [7]  1.32360565  0.62491779 -0.04572296 -1.00412058
# Random sampling (reproducing the same numbers)
set.seed(11)
rnorm(n = 10, mean = 0, sd = 1)
#>  [1] -0.59103110  0.02659437 -1.51655310 -1.36265335  1.17848916 -0.93415132
#>  [7]  1.32360565  0.62491779 -0.04572296 -1.00412058

As we can see, setting the same seed yields exactly the same random numbers. This matters whenever results need to be reproducible. There are many other built-in functions in R.
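Rather than comparing the printed numbers by eye, the match can be checked programmatically. This is a small sketch using base R's identical(); the variable names first_draw, second_draw, and third_draw are made up for the example.

```r
# Two draws made right after setting the same seed
set.seed(11)
first_draw <- rnorm(n = 10, mean = 0, sd = 1)

set.seed(11)
second_draw <- rnorm(n = 10, mean = 0, sd = 1)

identical(first_draw, second_draw)
#> [1] TRUE

# A further draw without resetting the seed continues the sequence,
# so it does not repeat the earlier values
third_draw <- rnorm(n = 10, mean = 0, sd = 1)
identical(first_draw, third_draw)
#> [1] FALSE
```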

  1. Seeking Help in R

R’s help function, invoked with ?function_name, provides detailed documentation on functions, assisting users with unclear or forgotten arguments:

# Searching for help if you know 
# the exact name of the function with a question mark
?curve

Below is an example that uses the built-in curve function to plot the function z (defined earlier as the square of x) over the range from -10 to 10.

# Plotting a function
curve(z, from = -10, to = 10, xlab = "x", ylab = "Squared x")

If you cannot remember some of the arguments, or want to see what else a function can do, consult its help page. For example, typing help(curve) or ?curve opens the documentation for the curve function.

If you’re uncertain about a function’s precise name, two question marks can assist in the search:

# Searching for help if you don't know 
# the exact name of the function
??boxplot
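Besides ? and ??, a few other base R helpers can answer quick questions without opening a full help page. The sketch below shows three of them; the variable names arg_names and matches are chosen only for this example.

```r
# Print a function's arguments and their defaults
args(rnorm)

# Get the argument names as a character vector
arg_names <- names(formals(rnorm))
arg_names
#> [1] "n"    "mean" "sd"

# List objects on the search path whose names contain "boxplot"
matches <- apropos("boxplot")
matches
```

The exact output of apropos() depends on which packages are attached in your session, so it is not shown here.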
  1. Creating Vectors

Vectors are sequences of data elements of the same basic type. Here are some methods to create them:

# Creating vectors in different ways
x3 <- c(1, 2, 3, 4, 5)
print(x3)
#> [1] 1 2 3 4 5

x4 <- 1:7
print(x4)
#> [1] 1 2 3 4 5 6 7

x5 <- seq(from = 0, to = 100, by = 10)
print(x5)
#>  [1]   0  10  20  30  40  50  60  70  80  90 100

x6 <- seq(10, 30, length.out = 7)
x6
#> [1] 10.00000 13.33333 16.66667 20.00000 23.33333 26.66667 30.00000
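Once a vector exists, most operations apply to all of its elements at once, and individual elements can be picked out with square brackets. A brief sketch, redefining x3 so that the block runs on its own (doubled and large_values are names invented for the example):

```r
x3 <- c(1, 2, 3, 4, 5)

# Arithmetic is applied element by element
doubled <- x3 * 2
doubled
#> [1]  2  4  6  8 10

# Indexing starts at 1
x3[2]
#> [1] 2

# A logical condition inside the brackets keeps only matching elements
large_values <- x3[x3 > 3]
large_values
#> [1] 4 5

# Summaries of the whole vector
length(x3)
#> [1] 5
mean(x3)
#> [1] 3
```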
  1. Plotting in R

R provides numerous plotting capabilities. For instance, the plot function can create scatter plots and line graphs:

# Scatter plot
plot(x5, type = "p", main = "Scatter plot")

# Line graph
plot(x = x6, y = x6^2, type = "l", main = "Line graph")
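Points and lines can also be combined in one figure by adding layers to an existing plot. The following is a minimal sketch using base graphics; the title and axis labels are arbitrary choices for the illustration.

```r
x6 <- seq(10, 30, length.out = 7)

# Draw the line first
plot(x = x6, y = x6^2, type = "l",
     main = "Line graph with points",
     xlab = "x", ylab = "Squared x")

# Then add the individual points on top of the same plot
points(x = x6, y = x6^2, pch = 19)
```

Alternatively, type = "b" in a single plot() call draws both points and lines.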

  1. Character Vectors

Apart from numeric values, R also supports character vectors. For example, we can create a sex variable with the values females, males, and other.

# Character vector
sex <- c("females", "males", "other")
sex
#> [1] "females" "males"   "other"

To determine a variable’s type, use the mode function:

# Check data type
mode(sex)
#> [1] "character"
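Because a vector holds a single basic type, mixing types inside c() makes R convert everything to a common type. The sketch below illustrates this, and also shows typeof(), which reports the internal storage type and can be more specific than mode(). The names count and mixed are made up for the example.

```r
sex <- c("females", "males", "other")
mode(sex)
#> [1] "character"

# mode() and typeof() can differ for numbers
count <- 1:3
mode(count)
#> [1] "numeric"
typeof(count)
#> [1] "integer"

# Mixing types coerces every element to character
mixed <- c(1, "a", TRUE)
mixed
#> [1] "1"    "a"    "TRUE"
mode(mixed)
#> [1] "character"
```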

Package Management

Packages in R are collections of functions and datasets developed by the community. They enhance the capability of R by adding new functions for data analysis, visualization, data import, and more. Understanding how to install and load packages is essential for effective R programming.

  1. Installing Packages from CRAN

CRAN (the Comprehensive R Archive Network) is the primary source of R packages. You can install packages from CRAN directly from within R using the install.packages() function.

# Installing the 'ggplot2' package
install.packages("ggplot2")
  1. Loading a Package

After a package is installed, it must be loaded to use its functions. This is done with the library() function.

# Loading the 'ggplot2' package
library(ggplot2)

You only need to install a package once, but you’ll need to load it every time you start a new R session and want to use its functions.
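Two related patterns may be useful here. A single function can be called through its package name with the :: operator, without loading the whole package, and requireNamespace() can check whether a package is installed before loading it. A minimal sketch, where the variable has_ggplot2 is a name chosen for the example and the stats package stands in for any installed package:

```r
# Call one function via its namespace without attaching the package
stats::median(c(1, 3, 10))
#> [1] 3

# Check that ggplot2 is installed before trying to load it
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)

if (has_ggplot2) {
  library(ggplot2)
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first")
}
```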

  1. Updating Packages

R packages are frequently updated. To bring your installed packages up to their latest versions, use the update.packages() function.

# Updating all installed packages
# could be time consuming!
update.packages(ask = FALSE)  
# 'ask = FALSE' updates all without asking for confirmation
  1. Listing Installed Packages

You can view all the installed packages on your R setup using the installed.packages() function.

# Listing installed packages
installed.packages()[, "Package"]
  1. Removing a Package

If you no longer need a package, it can be removed using the remove.packages() function.

# Removing the 'ggplot2' package
remove.packages("ggplot2")
  1. Installing Packages from Other Sources

While CRAN is the primary source, sometimes you might need to install packages from GitHub or other repositories. The devtools package provides the install_github() function for this.

# Installing devtools first
install.packages("devtools")
# Loading devtools
library(devtools)
# Install a package from GitHub
# https://github.com/ehsanx/simMSM
install_github("ehsanx/simMSM")

When you are working on a project, it is good practice to list and install the required packages at the beginning of your R script.
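One way to follow this practice is sketched below. It is an illustration, not a prescribed setup: the package names in required_packages are placeholders (two packages that ship with R, so the block runs anywhere) and should be replaced with the packages your project actually needs.

```r
# Packages this script depends on (placeholders; replace with your own)
required_packages <- c("stats", "utils")

# Work out which of them are not installed yet
missing_packages <- setdiff(required_packages, rownames(installed.packages()))

# Install only what is missing (requires internet access and a CRAN mirror)
if (length(missing_packages) > 0) {
  install.packages(missing_packages)
}

# Load each package; character.only = TRUE lets library() accept a string
for (pkg in required_packages) {
  library(pkg, character.only = TRUE)
}
```

Keeping the list in one place at the top of the script makes it easy for someone else to see what the script needs before running it.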

Video content (optional)

For those who prefer a video walkthrough, feel free to watch the video below, which covers an earlier version of the content above.