Data types
Matrix
In R, matrices are two-dimensional rectangular data sets that can be created using the matrix() function. All elements of a matrix must be of the same type, such as all numeric or all character.
To construct a matrix, we often start with a vector and specify how we want to reshape it. For instance:
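A minimal sketch of this step, assuming the vector is named x and the result is stored as matrix1 (the name matrix1 is used later in this section):

```r
# Assumed names: x (input vector) and matrix1 (resulting matrix)
x <- 1:10

# Reshape x into 5 rows and 2 columns, filling one row at a time
matrix1 <- matrix(x, nrow = 5, ncol = 2, byrow = TRUE)
matrix1
```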
Here, the vector x contains the numbers from 1 to 10. We reshape it into a matrix with 5 rows and 2 columns. The byrow = TRUE argument means the matrix is filled row-wise with the numbers from the vector.
Conversely, if you want the matrix to be filled column-wise, set byrow = FALSE (the default):
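A minimal sketch of the column-wise fill, using the same vector; the name matrix2 is an assumption made for illustration:

```r
x <- 1:10

# Fill one column at a time: 1..5 go in column 1, 6..10 in column 2
# (matrix2 is a hypothetical name)
matrix2 <- matrix(x, nrow = 5, ncol = 2, byrow = FALSE)
matrix2
```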
You can also combine or concatenate matrices. cbind() joins matrices by columns, while rbind() joins them by rows.
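As an illustration, the two 5 x 2 matrices described above can be joined either way. The object names here (matrix2, by_cols, by_rows) are assumptions made for this sketch:

```r
matrix1 <- matrix(1:10, nrow = 5, ncol = 2, byrow = TRUE)
matrix2 <- matrix(1:10, nrow = 5, ncol = 2, byrow = FALSE)

# cbind() places the matrices side by side: row counts must match
by_cols <- cbind(matrix1, matrix2)
dim(by_cols)

# rbind() stacks the matrices: column counts must match
by_rows <- rbind(matrix1, matrix2)
dim(by_rows)
```

The first result has 5 rows and 4 columns; the second has 10 rows and 2 columns.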
Creating an empty matrix is also possible:
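One common way to do this, shown here as a sketch with arbitrarily chosen dimensions, is to fill the matrix with NA as a placeholder:

```r
# A 3 x 2 matrix with every cell set to NA
# (empty_matrix and the dimensions are hypothetical choices)
empty_matrix <- matrix(NA, nrow = 3, ncol = 2)
empty_matrix
```

The cells can then be filled in later by assignment, for example empty_matrix[1, 1] <- 5.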
List
In R, a list is a collection that stores different objects under a single name. It is very versatile because its components can be any type of R object: a vector, matrix, array, data frame, table, or even another list.
For instance:
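A minimal sketch, assuming list1 holds the row-wise and column-wise matrices from the previous section (matrix2 is an assumed name):

```r
matrix1 <- matrix(1:10, nrow = 5, ncol = 2, byrow = TRUE)
matrix2 <- matrix(1:10, nrow = 5, ncol = 2, byrow = FALSE)

# Store both matrices in one list
list1 <- list(matrix1, matrix2)

length(list1)   # number of components in the list
list1[[2]]      # double brackets extract a single component
```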
Lists can also be expanded to include multiple items:
x6 <- seq(10, 30, length.out = 7)
sex <- c("females", "males", "other")
# Expanding list to include more items
list2 <- list(list1, x6, sex, matrix1)
list2
#> [[1]]
#> [[1]][[1]]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10
#>
#> [[1]][[2]]
#>      [,1] [,2]
#> [1,]    1    6
#> [2,]    2    7
#> [3,]    3    8
#> [4,]    4    9
#> [5,]    5   10
#>
#>
#> [[2]]
#> [1] 10.00000 13.33333 16.66667 20.00000 23.33333 26.66667 30.00000
#>
#> [[3]]
#> [1] "females" "males"   "other"
#>
#> [[4]]
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
#> [4,]    7    8
#> [5,]    9   10
Combining different types of data into a single matrix converts everything to a character type:
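A sketch of this coercion, using hypothetical vectors. The names id, score, and sex and all values are made up for illustration, apart from the score of 85 for id 1 mentioned later in this section:

```r
# Hypothetical example data
id    <- c(1, 2, 3, 4, 5)
score <- c(85, 70, 92, 64, 78)
sex   <- c("F", "M", "M", "F", "M")

# cbind() builds a matrix, so every value is coerced to character
mixed <- cbind(id, score, sex)
mixed
```

In the printed result the numbers appear in quotes, which shows they are now stored as text.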
To check the type of data in your matrix:
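One way to check, sketched here with mode() and typeof() on a small hypothetical matrix:

```r
# Hypothetical matrix mixing numbers and text
mixed <- cbind(id = c(1, 2, 3), sex = c("F", "M", "M"))

mode(mixed)     # storage mode of the elements
typeof(mixed)   # internal type of the elements
```

Both calls report "character" here, because the numeric ids were coerced to text when the matrix was built.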
Data frame
As we have seen, combining numeric and character variables into a matrix produces a matrix of character values. To keep the numeric variables numeric and the character variables character, we can use the data.frame function.
- Creating a data frame
A data frame is similar to a matrix but allows for columns of different types (numeric, character, factor, etc.). It’s a standard format for storing data sets in R.
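A minimal sketch of creating a data frame. The column names id, score, and sex and the values are hypothetical, chosen to be consistent with the 5 rows and 3 columns described later in this section:

```r
# Hypothetical data: 5 students, 3 variables
df <- data.frame(
  id    = c(1, 2, 3, 4, 5),
  score = c(85, 70, 92, 64, 78),
  sex   = c("F", "M", "M", "F", "M")
)

# str() lists each column with its type
str(df)
```

Unlike the matrix version, id and score stay numeric while sex is stored separately as text.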
To check the mode or type of your data frame:
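A sketch of these checks on a hypothetical data frame (column names and values are illustrative):

```r
# Hypothetical data frame
df <- data.frame(
  id    = c(1, 2, 3, 4, 5),
  score = c(85, 70, 92, 64, 78),
  sex   = c("F", "M", "M", "F", "M")
)

mode(df)    # a data frame is stored internally as a list of columns
class(df)   # the class identifies it as a data frame
```

Here mode(df) gives "list" and class(df) gives "data.frame".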
- Extract elements
Data frames allow easy extraction and modification of specific elements. For example, we can extract the value in the first row and first column as follows:
Similarly, the first column can be extracted as follows:
The first row can be extracted as follows:
Columns can also be accessed using $. Below we extract the column id:
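The extraction forms described in this subsection can be sketched together on a hypothetical data frame (the values are illustrative; only the column name id comes from the text):

```r
# Hypothetical data frame
df <- data.frame(
  id    = c(1, 2, 3, 4, 5),
  score = c(85, 70, 92, 64, 78),
  sex   = c("F", "M", "M", "F", "M")
)

df[1, 1]   # value in the first row and first column
df[, 1]    # first column, returned as a vector
df[1, ]    # first row, returned as a one-row data frame
df$id      # column id, accessed by name
```

The index before the comma selects rows and the index after it selects columns; leaving one side empty selects everything along that dimension.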
- Modifying values
We can edit the values in the data frame as well. For example, we can change the score from 85 to 90 for id 1:
We can also change the name of the variables/columns:
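Both modifications can be sketched as follows. The starting data frame is hypothetical; the new column names Studyid, Grade, and Sex are the ones used in the next subsection:

```r
# Hypothetical data frame in which id 1 has a score of 85
df <- data.frame(
  id    = c(1, 2, 3, 4, 5),
  score = c(85, 70, 92, 64, 78),
  sex   = c("F", "M", "M", "F", "M")
)

# Change the score to 90 for the row where id equals 1
df$score[df$id == 1] <- 90

# Rename all three columns
colnames(df) <- c("Studyid", "Grade", "Sex")
df
```

Selecting the row with a logical condition (df$id == 1) rather than a fixed row number keeps the edit correct even if the rows are reordered.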
- Combining data frames
We can also append the rows of another data frame that has the same variables (column names) using the rbind function:
# Create a new dataset
df2 <- data.frame(Studyid = c(10, 15, 50), Grade = c(75, 90, 65), Sex = c("F", "M", "M"))
# Combining two data frames
df.new <- rbind(df, df2)
# Print the first 6 rows
head(df.new)
- Checking the dimensions
To see the dimensions of the data frame (i.e., the number of rows and columns), we can use the dim function:
As we can see, we have 5 rows and 3 columns. We can use the nrow and ncol functions to get the number of rows and columns, respectively:
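A sketch of the three calls on a hypothetical data frame with 5 rows and 3 columns (the values are illustrative):

```r
# Hypothetical data frame with 5 rows and 3 columns
df <- data.frame(
  Studyid = c(1, 2, 3, 4, 5),
  Grade   = c(90, 70, 92, 64, 78),
  Sex     = c("F", "M", "M", "F", "M")
)

dim(df)    # rows first, then columns
nrow(df)   # number of rows only
ncol(df)   # number of columns only
```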