Premature optimization is the root of all evil -- Donald Knuth
The humble for loop is often considered distasteful by seasoned programmers because it is inefficient; however, the for loop is one of the most useful and generalizable programming structures in R. If you can learn how to construct and understand for loops then you can code almost any iterative task. Once your loop works you can always work to optimize your code and increase its efficiency.
Before attempting these exercises you should review the lesson R intermediate in which loops were covered.
Examine the following for loop, and then complete the exercises
data(iris)
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
sp_ids <- unique(iris$Species)
output <- matrix(0, nrow=length(sp_ids), ncol=ncol(iris)-1)
rownames(output) <- sp_ids
colnames(output) <- names(iris[ , -ncol(iris)])
for(i in seq_along(sp_ids)) {
iris_sp <- subset(iris, subset=Species == sp_ids[i], select=-Species)
for(j in 1:(ncol(iris_sp))) {
x <- 0
y <- 0
if (nrow(iris_sp) > 0) {
for(k in 1:nrow(iris_sp)) {
x <- x + iris_sp[k, j]
y <- y + 1
}
output[i, j] <- x / y
}
}
}
output
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## setosa 5.006 3.428 1.462 0.246
## versicolor 5.936 2.770 4.260 1.326
## virginica 6.588 2.974 5.552 2.026
Describe the values stored in the object output
. In other words what did the loops create?
Describe using pseudo-code how output
was calculated, for example,
Loop from 1 to length of species identities
Take a subset of iris data
Loop from 1 to number of columns of the iris data
If ... occurs then do ...
The variables in the loop were named so as to be vague. How can the objects output
, x
, and y
be renamed such that it is clearer what is occurring in the loop.
It is possible to accomplish the same task using fewer lines of code? Please suggest one other way to calculate output
that decreases the number of loops by 1.
You have a vector x
with the numbers 1:10. Write a for loop that will produce a vector y
that contains the sum of x
up to that index of x
. So for example the elements of x
are 1, 2, 3, and so on and the elements of y
would be 1, 3, 6, and so on.
Modify your for loop so that if the sum is greater than 10 the value of y
is set to NA
Place your for loop into a function that accepts as its argument any vector of arbitrary length and it will return y
.
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...
Write and apply a simple R function that can accomplish this task with a for loop. Then write a function that computes the ratio of each sequential pair of Fibonacci numbers. Do they asympoticly approch the golden ratio (1 + sqrt(5)) / 2) ?