---
output: rmarkdown::html_vignette
title: Loop-invariant Code Motion
vignette: >
%\VignetteIndexEntry{Loop-invariant Code Motion}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r echo=FALSE, message=FALSE}
library("rco")
library("microbenchmark")
library("ggplot2")
autoplot.microbenchmark <- function(obj) {
levels(obj$expr) <- paste0("Expr_", seq_along(levels(obj$expr)))
microbenchmark:::autoplot.microbenchmark(obj)
}
speed_up <- function(obj) {
levels(obj$expr) <- paste0("Expr_", seq_along(levels(obj$expr)))
obj <- as.data.frame(obj)
summaries <- do.call(rbind, by(obj$time, obj$expr, summary))
res <- c()
for (i in seq_len(nrow(summaries) - 1) + 1) {
res <- rbind(res, summaries[1, ] / summaries[i, ])
}
rownames(res) <- levels(obj$expr)[-1]
return(res)
}
```
# Loop-invariant Code Motion
## Idea
Invariant code is a piece of code that could be run outside of a loop without affecting the actual outcome of a program. This may come in the form of variables that are not modified after going through the loop and as such are called "invariants" or non-changing. If these pieces of code were removed from the loops, this would speed up the script being run.
## Background
Loop-invariant code consists of statements or expressions which can be moved outside the body of a loop without affecting the semantics of the program. Loop-invariant code motion is a compiler optimization which performs this movement automatically.
For example, consider the following code:
```{r eval=FALSE}
i <- 0
n <- 1000
y <- rnorm(1)
z <- rnorm(1)
a <- c()
while (i < n) {
x <- y + z
a[i] <- 6 * i + x * x
i <- i + 1
}
```
Here, `x` is assigned a value `y + z` that is constant in each loop execution. In this example, the assignment would be executed `n` times. The same result, but much faster, would be obtained by doing the assign outside the loop.
## Example
Consider the previous example to be automatically optimized.
```{r}
code <- paste(
"i <- 0",
"n <- 1000",
"y <- rnorm(1)",
"z <- rnorm(1)",
"a <- c()",
"while (i < n) {",
" x <- y + z",
" a[i] <- 6 * i + x * x",
" i <- i + 1",
"}",
sep = "\n"
)
cat(code)
```
Then, the automatically optimized code would be:
```{r}
opt_code <- opt_loop_invariant(list(code))
cat(opt_code$codes[[1]])
```
And if we measure the execution time of each one, and the speed-up:
```{r message=FALSE}
bmark_res <- microbenchmark({
eval(parse(text = code))
}, {
eval(parse(text = opt_code))
})
autoplot(bmark_res)
speed_up(bmark_res)
```
## Implementation
The `opt_loop_invariant` optimizer does:
* Finds `for` and `while` loops in the code.
* Discards loops that have function calls inside, as function calls can edit the environment.
* Gets which variables are variant inside the loop.
* Detects which expressions do not use any variant variable.
* Gets these invariant variables, and moves them outside the loop, but inside an `if` statement (with the same conditional as the loop).
## To-Do
* `if/else` code motion?
Actually, the `opt_loop_invariant` does not consider `if/else` expressions to move. In this sense, the code:
```{r eval=FALSE}
while (i < n) {
if (n > 3) {
something_without_i
}
}
```
Will not be optimized to its equivalent code:
```{r eval=FALSE}
if (i < n) {
if (n > 3) {
something_without_i
}
}
while (i < n) {
}
```
* Invariant subexpressions motion?
Actually, the `opt_loop_invariant` considers only full expressions, it is not working on subexpressions, for instance, the code:
```{r eval=FALSE}
while (i < n) {
i <- (x * y) + 1
}
```
as `x` and `y` are invariant, could be optimized to:
```{r eval=FALSE}
is_1 <- (x * y)
while (i < n) {
i <- is_1 + 1
}
```
* Include `repeat` optimization?
Since determining the conditional that causes a `repeat` loop to stop is not that simple, it is not easy to remove invariant variables and place them in an `if`.
For example, the code:
```{r eval=FALSE}
y <- 1
repeat{
if (y > 4) {
break
}
x <- 8 * 8
y <- y + 1
}
```
Must be optimized to:
```{r eval=FALSE}
y <- 1
if (y <= 4) {
x <- 8 * 8
}
repeat{
if (y > 4) {
break
}
y <- y + 1
}
```