Data types

Matrix

In R, a matrix is a two-dimensional, rectangular data structure created with the matrix() function. All elements of a matrix must be of the same type, such as all numeric or all character.

To construct a matrix, we often start with a vector and specify how we want to reshape it. For instance:

# Matrix 1
x <- 1:10
matrix1 <- matrix(x, nrow = 5, ncol = 2, byrow = TRUE)
matrix1
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10

Here, the vector x contains the integers from 1 to 10, and we reshape it into a matrix with 5 rows and 2 columns. With byrow = TRUE, the values of x fill the matrix one row at a time: the first row is completed before the second row begins.

To fill the matrix column-wise instead, set byrow = FALSE (this is also the default when byrow is omitted):

# Matrix 2
matrix2 <- matrix(x, nrow = 5, ncol = 2, byrow = FALSE)
matrix2
#>      [,1] [,2]
#> [1,]    1    6
#> [2,]    2    7
#> [3,]    3    8
#> [4,]    4    9
#> [5,]    5   10
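
As a quick check of the two fill orders, the sketch below rebuilds both matrices from scratch. It relies on two base R facts: byrow defaults to FALSE, and t() transposes a matrix. The names by_col_default and by_row_via_t are illustrative only and do not appear elsewhere in this chapter.

```r
x <- 1:10

# Same matrices as in the text
matrix1 <- matrix(x, nrow = 5, ncol = 2, byrow = TRUE)
matrix2 <- matrix(x, nrow = 5, ncol = 2, byrow = FALSE)

# Omitting byrow gives the column-wise fill, because FALSE is the default
by_col_default <- matrix(x, nrow = 5, ncol = 2)
identical(by_col_default, matrix2)

# A row-wise fill equals the transpose of a column-wise fill
# built with the dimensions swapped
by_row_via_t <- t(matrix(x, nrow = 2, ncol = 5))
all(by_row_via_t == matrix1)

# The dimensions are the same either way; only the order of the values differs
dim(matrix1)
dim(matrix2)
```

Both comparisons should return TRUE, which is one way to confirm which fill order a given matrix() call used.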

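You can also combine matrices. cbind() joins matrices side by side (by columns), so they must have the same number of rows; rbind() stacks them on top of each other (by rows), so they must have the same number of columns.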

# Merging 2 matrices
cbind(matrix1, matrix2)
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    2    1    6
#> [2,]    3    4    2    7
#> [3,]    5    6    3    8
#> [4,]    7    8    4    9
#> [5,]    9   10    5   10

# Appending 2 matrices
rbind(matrix1, matrix2)
#>       [,1] [,2]
#>  [1,]    1    2
#>  [2,]    3    4
#>  [3,]    5    6
#>  [4,]    7    8
#>  [5,]    9   10
#>  [6,]    1    6
#>  [7,]    2    7
#>  [8,]    3    8
#>  [9,]    4    9
#> [10,]    5   10
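
The dimension rules for cbind() and rbind() can be checked directly. The following is a minimal sketch; the object names wide, long, small, and result are hypothetical, and tryCatch() is used only so that the failing call does not stop the script.

```r
matrix1 <- matrix(1:10, nrow = 5, ncol = 2, byrow = TRUE)
matrix2 <- matrix(1:10, nrow = 5, ncol = 2)

# cbind() keeps the row count and adds the column counts
wide <- cbind(matrix1, matrix2)
dim(wide)   # 5 rows, 4 columns

# rbind() keeps the column count and adds the row counts
long <- rbind(matrix1, matrix2)
dim(long)   # 10 rows, 2 columns

# A 3 x 2 matrix cannot be column-bound to a 5 x 2 matrix:
# the row counts differ, so cbind() raises an error
small <- matrix(1:6, nrow = 3, ncol = 2)
result <- tryCatch(
  cbind(matrix1, small),
  error = function(e) conditionMessage(e)
)
result      # the error message, captured as a character string

# Stacking by rows works, because both matrices have 2 columns
dim(rbind(matrix1, small))   # 8 rows, 2 columns
```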

You can also create a matrix without supplying any data. The result is not truly empty: every cell is filled with NA.

matrix(nrow = 5, ncol = 5)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   NA   NA   NA   NA   NA
#> [2,]   NA   NA   NA   NA   NA
#> [3,]   NA   NA   NA   NA   NA
#> [4,]   NA   NA   NA   NA   NA
#> [5,]   NA   NA   NA   NA   NA
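
A common reason to create an NA-filled matrix is to reserve space and fill it in later. The sketch below shows that pattern under illustrative assumptions: the names na_matrix, times_table, and no_rows are hypothetical, and the multiplication table is just a stand-in for real computed values.

```r
# With no data supplied, the cells hold logical NA values
na_matrix <- matrix(nrow = 2, ncol = 2)
typeof(na_matrix)   # "logical"

# Pre-allocate a numeric matrix, then fill it one row at a time
times_table <- matrix(NA_real_, nrow = 3, ncol = 3)
for (i in seq_len(nrow(times_table))) {
  times_table[i, ] <- i * (1:3)
}
times_table

# A matrix with zero rows is the closest thing to a truly empty matrix
no_rows <- matrix(numeric(0), nrow = 0, ncol = 3)
dim(no_rows)        # 0 rows, 3 columns
```

Using NA_real_ instead of a bare NA makes the matrix numeric from the start, so its type does not change when the first number is assigned.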

List

In R, a list is a collection that stores objects of different kinds under a single name. Its components can be any R object, such as a vector, matrix, array, data frame, table, or even another list, which makes lists very versatile.

For instance:

# List of 2 matrices
list1 <- list(matrix1, matrix2)
list1
#> [[1]]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10
#> 
#> [[2]]
#>      [,1] [,2]
#> [1,]    1    6
#> [2,]    2    7
#> [3,]    3    8
#> [4,]    4    9
#> [5,]    5   10

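A list can also contain other lists alongside other objects. Below, list2 holds list1, a numeric vector, a character vector, and a matrix:

x6 <- seq(10, 30, length.out = 7)
sex <- c("females", "males", "other")
# A larger list that nests list1 and adds more items
list2 <- list(list1, x6, sex, matrix1)
list2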
#> [[1]]
#> [[1]][[1]]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10
#> 
#> [[1]][[2]]
#>      [,1] [,2]
#> [1,]    1    6
#> [2,]    2    7
#> [3,]    3    8
#> [4,]    4    9
#> [5,]    5   10
#> 
#> 
#> [[2]]
#> [1] 10.00000 13.33333 16.66667 20.00000 23.33333 26.66667 30.00000
#> 
#> [[3]]
#> [1] "females" "males"   "other"  
#> 
#> [[4]]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10
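
The printed labels such as [[1]] and [[1]][[2]] are also the indices used to retrieve components. The sketch below illustrates the main access patterns on a small named list; the list name study and its component names (scores, cutoffs, groups, n_groups) are hypothetical and chosen only for this example.

```r
matrix1 <- matrix(1:10, nrow = 5, ncol = 2, byrow = TRUE)
x6 <- seq(10, 30, length.out = 7)
sex <- c("females", "males", "other")

# Naming the components makes them easier to retrieve later
study <- list(scores = matrix1, cutoffs = x6, groups = sex)

# [[ ]] and $ return the component itself
study[[2]]                # the numeric vector
study$groups              # the character vector
study[["scores"]][5, 2]   # row 5, column 2 of the matrix: 10

# [ ] returns a shorter list that still wraps the component
class(study[2])           # "list"
class(study[[2]])         # "numeric"

# Components can be added after the list is created
study$n_groups <- length(study$groups)
names(study)
```

The difference between [ ] and [[ ]] matters in practice: functions such as mean() work on study[[2]] but not on study[2], because the latter is still a list.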

A matrix, by contrast, can hold only one type. Combining numeric and character vectors into a single matrix coerces every value to character:

# A matrix with numeric and character variables
id <- c(1, 2)
score <- c(85, 85)
sex <- c("M", "F")
new.matrix <- cbind(id, score, sex)
new.matrix
#>      id  score sex
#> [1,] "1" "85"  "M"
#> [2,] "2" "85"  "F"

To check the storage mode of the values in the matrix:

mode(new.matrix)
#> [1] "character"
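
The practical cost of this coercion is that the numbers can no longer be used in calculations until they are converted back. The following sketch illustrates this; the names arith and score_num are hypothetical, and the string returned by the error handler is a placeholder written for this example, not R's own error message.

```r
id <- c(1, 2)
score <- c(85, 85)
sex <- c("M", "F")
new.matrix <- cbind(id, score, sex)

# The scores are now text, so arithmetic on them fails
is.character(new.matrix[, "score"])   # TRUE
arith <- tryCatch(
  new.matrix[, "score"] + 5,
  error = function(e) "error: non-numeric argument"
)
arith

# Converting the column back recovers usable numbers
score_num <- as.numeric(new.matrix[, "score"])
mean(score_num)                       # 85

# Binding only the numeric vectors keeps the matrix numeric
mode(cbind(id, score))                # "numeric"
```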

Data frame

As we can see, combining numeric and character variables into a matrix produces a matrix of character values. To keep the numeric variables numeric and the character variables character, we can use the data.frame() function.

  1. Creating a data frame

A data frame is similar to a matrix but allows for columns of different types (numeric, character, factor, etc.). It’s a standard format for storing data sets in R.

df <- data.frame(id, score, sex)
df
#>   id score sex
#> 1  1    85   M
#> 2  2    85   F

Checking the mode of a data frame returns "list", because a data frame is stored internally as a list of equal-length columns:

mode(df)
#> [1] "list"
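
Because mode() reports only how the object is stored, it does not say whether each column kept its own type. A minimal sketch of some base R functions that do is shown below. The data frame is rebuilt here so the example runs on its own; note that in R versions before 4.0, data.frame() converts character columns to factors by default, so the reported class of sex depends on the R version.

```r
df <- data.frame(
  id = c(1, 2),
  score = c(85, 85),
  sex = c("M", "F")
)

mode(df)            # "list": how the object is stored
class(df)           # "data.frame": how R treats the object
is.list(df)         # TRUE

# Each column keeps its own type
sapply(df, class)

# str() gives a compact summary of the dimensions and column types
str(df)
```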
  2. Extracting elements

Data frames allow easy extraction and modification of specific elements. For example, we can extract the value in the first row and first column as follows:

df[1, 1]
#> [1] 1

Similarly, the first column can be extracted as follows:

df[, 1]
#> [1] 1 2

The first row can be extracted as follows:

df[1, ]
#>   id score sex
#> 1  1    85   M

Columns can also be accessed by name using the $ operator. Below, we extract the id column:

df$id
#> [1] 1 2
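
Positions and $ are not the only ways to extract data. The sketch below shows a few other base R indexing patterns, such as selecting by column name and filtering rows with a condition. The data frame is rebuilt so the block is self-contained, and the names one_col and females are hypothetical.

```r
df <- data.frame(id = c(1, 2), score = c(85, 85), sex = c("M", "F"))

# A column by name, as a vector
df[["score"]]
df[, "score"]

# drop = FALSE keeps a one-column selection as a data frame
one_col <- df[, "score", drop = FALSE]
class(one_col)                 # "data.frame"

# Rows that satisfy a condition, with all columns
females <- df[df$sex == "F", ]
females

# Selected rows and selected columns together
df[df$score >= 85, c("id", "sex")]
```

Selecting by name rather than by position tends to be more robust, since the code keeps working if the column order changes.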
  3. Modifying values

We can also edit values in the data frame. For example, we can change the score from 85 to 90 for the row where id is 1:

df$score[df$id == 1] <- 90
df
#>   id score sex
#> 1  1    90   M
#> 2  2    85   F

We can also change the names of the variables (columns):

colnames(df) <- c("Studyid", "Grade", "Sex")
df
#>   Studyid Grade Sex
#> 1       1    90   M
#> 2       2    85   F
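
Assigning to colnames() replaces every name at once. When only one column needs a new name, it can be matched by its current name instead, as in the sketch below. The new name Score and the derived column Passed (with its cutoff of 90) are hypothetical and used only for illustration.

```r
df <- data.frame(id = c(1, 2), score = c(90, 85), sex = c("M", "F"))
colnames(df) <- c("Studyid", "Grade", "Sex")

# Rename one column by matching its current name
names(df)[names(df) == "Grade"] <- "Score"
names(df)

# Add a derived column; the existing columns are left untouched
df$Passed <- df$Score >= 90
df
```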
  4. Combining data frames

We can also append the rows of another data frame that has the same variables using the rbind() function:

# Create a new dataset
df2 <- data.frame(Studyid = c(10, 15, 50), Grade = c(75, 90, 65), Sex = c("F", "M", "M"))

# Combining two data frames
df.new <- rbind(df, df2)

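# Print up to the first 6 rows (here, all 5)
head(df.new)
#>   Studyid Grade Sex
#> 1       1    90   M
#> 2       2    85   F
#> 3      10    75   F
#> 4      15    90   M
#> 5      50    65   M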
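
For data frames, rbind() aligns columns by name rather than by position, which is why the two data frames need the same variable names. The sketch below illustrates both the working and the failing case. The data frames df_reordered and df_badname are hypothetical, and the string returned by the error handler is a placeholder written for this example, not R's own error message.

```r
df <- data.frame(Studyid = c(1, 2), Grade = c(90, 85), Sex = c("M", "F"))

# Same column names in a different order: rbind() aligns them by name
df_reordered <- data.frame(Sex = "F", Grade = 75, Studyid = 10)
combined <- rbind(df, df_reordered)
combined

# A differently named column cannot be matched, so rbind() stops with an error
df_badname <- data.frame(Studyid = 15, Score = 90, Sex = "M")
outcome <- tryCatch(
  rbind(df, df_badname),
  error = function(e) "error: column names do not match"
)
outcome
```

The combined data frame keeps the column order of the first argument.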
  5. Checking the dimensions

To see the dimensions of the data frame (i.e., the number of rows and columns), we can use the dim() function:

dim(df.new)
#> [1] 5 3

As we can see, we have 5 rows and 3 columns. The nrow() and ncol() functions return these two values separately:

nrow(df.new)
#> [1] 5
ncol(df.new)
#> [1] 3
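
The relationship between these functions can be verified with a short sketch. The combined data frame is rebuilt here with the same values so the block runs on its own.

```r
df.new <- data.frame(
  Studyid = c(1, 2, 10, 15, 50),
  Grade = c(90, 85, 75, 90, 65),
  Sex = c("M", "F", "F", "M", "M")
)

# dim() is simply nrow() and ncol() side by side
identical(dim(df.new), c(nrow(df.new), ncol(df.new)))   # TRUE

# length() counts columns, because a data frame is a list of columns
length(df.new)                                          # 3

# The dimensions are handy for indexing, e.g. the last row
df.new[nrow(df.new), ]
```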