As a general rule of thumb, in any programming language we should manage memory explicitly wherever we can. When we grow a vector inside a loop, R has to request extra memory while the program is running, and typically copy the existing elements into the new space, before it can proceed. This can happen again and again as the loop iterates, and the cost adds up to substantial delays. We should therefore pre-allocate the memory a vector needs to avoid such delays.
The Memory Allocation Optimizer checks the code for vectors that have been initialized without proper memory allocation and, whenever possible, replaces that initialization with an allocation of fixed size, taken from the loops where the vector is used.
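To make the cost of growing a vector concrete, here is a minimal sketch that fills the same vector in both ways and times each one. The function names `grow` and `prealloc` and the size `n` are chosen only for illustration and are not part of the optimizer; the size of the gap will depend on the R version and on `n`.

```r
n <- 1e5

# Grow the vector one element at a time, starting from NULL.
grow <- function(n) {
  vec <- NULL
  for (i in seq_len(n)) {
    vec[i] <- i^2
  }
  vec
}

# Allocate the full length up front, then fill the vector in place.
prealloc <- function(n) {
  vec <- numeric(n)
  for (i in seq_len(n)) {
    vec[i] <- i^2
  }
  vec
}

grown <- grow(n)
filled <- prealloc(n)

# Both approaches must produce exactly the same vector.
same_result <- identical(grown, filled)

# Rough timings; use microbenchmark for more careful measurements.
system.time(grow(n))
system.time(prealloc(n))
```

Only the way the memory is obtained differs between the two functions, so any difference in the timings can be attributed to the allocation strategy.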
Consider the following example:
code <- paste(
"square_vec <- NULL",
"mul_vec <- NULL",
"for(i in 1:100) {",
" square_vec[i] <- i^2",
"}",
"for(i in 1:100) {",
" mul_vec[i] <- i * i",
"}",
"identical(square_vec, mul_vec)",
sep = "\n"
)
cat(code)
## square_vec <- NULL
## mul_vec <- NULL
## for(i in 1:100) {
## square_vec[i] <- i^2
## }
## for(i in 1:100) {
## mul_vec[i] <- i * i
## }
## identical(square_vec, mul_vec)
Then, the automatically optimized code would be:
## square_vec <- vector(length = 100)
## mul_vec <- vector(length = 100)
## for(i in 1:100) {
## square_vec[i] <- i^2
## }
## for(i in 1:100) {
## mul_vec[i] <- i * i
## }
## identical(square_vec, mul_vec)
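We can then measure the execution time of each version and the resulting speed-up:
library("microbenchmark")
library("ggplot2") # provides the autoplot() generic
# Compare the original code with the optimized code shown above.
bmark_res <- microbenchmark({
 eval(parse(text = code))
}, {
 eval(parse(text = opt_code))
})
autoplot(bmark_res)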
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## Expr_2 103.3015 95.46479 76.94973 87.12953 76.4189 83.21681
The memory allocation optimizer searches the entire code snippet for vectors that have been initialized without proper allocation of memory. For instance, a vector vec can be initialized without proper memory allocation in the following ways:
vec <- NULL
vec = c()
vec <- NA
logical() -> vec
When we spot these improper initializations of vectors, we change the initialization to include a suitable memory allocation in the following way, if possible:
vec <- vector(length = 10)
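The following sketch, which is illustrative and not taken from the optimizer itself, shows why the initializations listed above do not reserve useful space, and what `vector(length = 10)` gives us instead. The variable names are assumptions made for this example.

```r
# The four "improper" initializations hold at most one element.
inits <- list(
  null_init = NULL,
  c_init = c(),
  na_init = NA,
  logical_init = logical()
)
init_lengths <- vapply(inits, length, integer(1))
init_lengths

# vector(length = 10) allocates ten elements up front.
# Its default mode is "logical", so every element starts as FALSE.
vec <- vector(length = 10)
type_before <- typeof(vec)

# The first numeric assignment coerces the whole vector to double,
# so the loop still ends up with a numeric result.
for (i in 1:10) {
  vec[i] <- i^2
}
type_after <- typeof(vec)
```

If the element type is known in advance, a typed constructor such as `numeric(10)` avoids the coercion step; `vector(length = 10)` has the advantage of working without knowing what the loop will store.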
While the highly flexible and accommodating nature of the R language is a boon for programmers, that very nature has limited the scope of this optimizer to a certain extent. In this optimizer the following points have been kept in mind:
Only for loops have been considered, because their sizes can, most of the time, be predicted accurately, which is not the case with while and repeat loops.
[...] for loop.
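To illustrate why the size of a for loop can usually be predicted, here is a small sketch that reads the iteration count off the loop expression without running the loop. The helper `loop_size` is hypothetical and written only for this example; it is not the optimizer's implementation, and it deliberately gives up whenever the loop sequence depends on a variable.

```r
# Hypothetical helper: returns the number of iterations of a `for` loop
# when its sequence is a literal expression, and NA otherwise.
loop_size <- function(loop_expr) {
  stopifnot(is.call(loop_expr), identical(loop_expr[[1]], as.name("for")))
  seq_expr <- loop_expr[[3]] # the part after `in`
  if (length(all.vars(seq_expr)) > 0) {
    # The sequence depends on variables whose values are unknown here.
    return(NA_integer_)
  }
  length(eval(seq_expr, baseenv()))
}

# The sequence is a literal, so the size is known before the loop runs.
fixed_loop <- quote(for (i in 1:100) square_vec[i] <- i^2)
fixed_size <- loop_size(fixed_loop)

# The sequence depends on `x`, so the size cannot be read off statically.
data_loop <- quote(for (i in seq_along(x)) y[i] <- x[i])
data_size <- loop_size(data_loop)
```

A while or repeat loop has no such sequence to measure: its number of iterations is decided by a condition that is evaluated while the loop runs, so there is nothing comparable to inspect ahead of time.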