Automating tasks
Repeating a task
The for loop is a control flow statement in R that lets you repeat a particular task multiple times. The loop runs once for each element of a sequence of numbers or a vector of values.
Consider a simple real-life analogy: imagine you are filling 10 bottles with water, one by one. Instead of doing it manually 10 times, you can set a machine to do it in a loop until all 10 bottles are filled.
- Example 1
Let’s initialize a counter k at 0 and add 5 to k with each iteration of the loop (i.e., every time it “runs”). The loop prints k in each cycle and stops after 10 cycles.
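A minimal sketch of this counter loop is given here. The index name i is an assumption; any name would work, since it is only used to count the 10 cycles.

```r
# Start the counter at 0
k <- 0

# Repeat 10 times: add 5 to k, then print the updated value
for (i in 1:10) {
  k <- k + 5
  print(k)
}
```

After the last cycle, k holds 50 (10 cycles of adding 5).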
- Example 2
We create a variable x5 containing the values 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100. Let us print the first 5 values using a for loop.
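One way to write this is sketched here. Building x5 with seq() and naming the loop index j are assumptions; the text only specifies the values stored in x5.

```r
# Values from 0 to 100 in steps of 10 (11 values in total)
x5 <- seq(from = 0, to = 100, by = 10)

# Print the values at positions 1 to 5
for (j in 1:5) {
  print(x5[j])
}
```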
This loop cycles through the first five values of the previously created variable x5 and prints them. Each printed value corresponds to one of the positions 1 to 5 in x5.
- Example 3
Let us use the for loop in a more complicated scenario. First, we create a vector of numeric values, k, and square it.
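A small sketch of the squaring step follows. The values stored in k are an assumption chosen for illustration; the text only tells us that k has 5 elements.

```r
# Illustrative vector of 5 numeric values (assumed values)
k <- c(1, 2, 3, 4, 5)

# The ^ operator is vectorized, so every element is squared at once
k^2
```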
This is just squaring each value in the vector k.
- Example 4
With a for loop, we can create the same vector of squared values as in Example 3. To do so, we (i) create a null object, (ii) loop over each element of the vector k, (iii) square each element, and (iv) store each squared value in the new vector. In this example, the length of k is 5, so the loop runs from the first to the fifth element of k. Here k.sq[1] is the first stored squared value, k.sq[2] is the second, and so on.
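The four steps can be sketched as follows. The values in k are again an assumption, and seq_along(k) is used as a convenient way to generate the positions 1 to 5.

```r
# Illustrative vector of 5 numeric values (assumed values)
k <- c(1, 2, 3, 4, 5)

# (i) create a null object to hold the results
k.sq <- NULL

# (ii) loop over the positions of k
for (i in seq_along(k)) {
  # (iii) square the i-th element and (iv) store it at position i
  k.sq[i] <- k[i]^2
}

k.sq
```

Growing k.sq from NULL keeps the example close to the description; for long vectors, preallocating with numeric(length(k)) is usually the faster choice.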
Here, we achieve the same result as in the third example but use a for loop. We prepare an empty object k.sq and then use the loop to square each value in k, storing the result in k.sq.
- Example 5
This loop prints the “Sex” column value for each row in the df.new data frame.
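A loop of this kind could look like the sketch below. The data frame df.new is created elsewhere, so a small stand-in with made-up rows is built here purely for illustration; only the column name Sex comes from the text.

```r
# Hypothetical stand-in for df.new (made-up rows, for illustration only)
df.new <- data.frame(
  ID = 1:4,
  Sex = c("Female", "Male", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Go through the rows one by one and print the Sex value of each row
for (i in 1:nrow(df.new)) {
  print(df.new$Sex[i])
}
```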
Functions
A function in R is a piece of code that can take inputs, process them, and return an output. There are functions built into R, like mean(), which calculates the average of a set of numbers.
- Built-in function
Here, we’re using the built-in mean() function to find the average of the numbers from 1 to 100.
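As a short sketch, the call can be written like this. The variable name avg is an assumption used only to keep the result.

```r
# Average of the integers 1, 2, ..., 100
avg <- mean(1:100)
print(avg)
```

The result is 50.5, since the numbers sum to 5050 and there are 100 of them.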
- Custom-made function
To understand how functions work, sometimes it’s helpful to build our own. Now we will create our own function to calculate the mean, where we will use the following equation to calculate it:
\(\text{Mean} = \frac{\sum_{i=1}^{n} x_i}{n},\)
where \(x_1\), \(x_2\),…, \(x_n\) are the values in the vector and \(n\) is the sample size. Let us create the function for calculating the mean.
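A minimal sketch of such a function follows, using the names mean.own, Sum, and n that appear in the description. The exact layout of the function body is an assumption.

```r
# Custom mean: sum of the values divided by the number of values
mean.own <- function(x) {
  Sum <- sum(x)      # numerator: sum of all values in x
  n <- length(x)     # denominator: sample size
  return(Sum / n)
}
```

Unlike the built-in mean(), this sketch has no handling for missing values or empty vectors.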
This function, mean.own, calculates the average. We add up all the numbers in a vector (Sum <- sum(x)) and divide by the number of items in that vector (n <- length(x)). The result is then returned.
By using our custom-made function, we calculate the mean of the numbers from 1 to 100, getting the same result as the built-in mean() function.
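The comparison can be sketched as follows. The function is defined again so that the snippet runs on its own, and the names own.result and builtin.result are assumptions.

```r
# Custom mean, as described above (repeated so the snippet is self-contained)
mean.own <- function(x) {
  Sum <- sum(x)
  n <- length(x)
  return(Sum / n)
}

own.result <- mean.own(1:100)     # custom-made function
builtin.result <- mean(1:100)     # built-in function

# TRUE when both approaches agree
all.equal(own.result, builtin.result)
```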
Video content (optional)
For those who prefer a video walkthrough, feel free to watch the video below, which offers a description of an earlier version of the above content.