Comments on Coffee and Econometrics in the Morning: Avoiding Loops in R: An Example with Principal Minors

2011-07-19T16:12:20.793-07:00

On my system, this one-liner is a bit quicker:

> system.time(replicate(N, {
+ minors <- combn(nrow(mat), 2, function(x) {det(mat[x, x])})
+ }))
user system elapsed
35.758 0.000 35.763

And here are the results of Berend's Method 2:

> system.time(replicate(N, {
+ index_vec <- t(combn(nrow(mat),2))
+ minors.2 <- apply(index_vec, MARGIN=1, FUN=function(ix){minor2(mat,ix)})
+ }))
user system elapsed
41.755 0.024 41.810
>
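
Before comparing timings, it may be worth confirming that the one-liner and the apply() version return the same values. The following is a minimal sketch on a small matrix; the seed, the 6 x 6 size, and the names minors_combn and minors_apply are assumptions made for illustration, and minor2 is the helper defined elsewhere in this thread.

```r
# Small illustrative matrix; seed and size are arbitrary choices for this sketch.
set.seed(1)
mat <- matrix(rnorm(6 * 6), nrow = 6)

# Helper as defined elsewhere in the thread: determinant of a principal submatrix.
minor2 <- function(mat, idx) det(mat[idx, idx])

# One-liner: combn() calls FUN on each index pair as it generates them.
minors_combn <- combn(nrow(mat), 2, function(x) det(mat[x, x]))

# apply() over the rows of the transposed index matrix.
index_vec <- t(combn(nrow(mat), 2))
minors_apply <- apply(index_vec, MARGIN = 1, FUN = function(ix) minor2(mat, ix))

# Both vectors hold one minor per index pair, in the same order.
stopifnot(length(minors_combn) == choose(nrow(mat), 2))
stopifnot(isTRUE(all.equal(minors_combn, minors_apply)))
```

The one-liner skips building the intermediate index matrix, which is one plausible reason for the gap in the timings above.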

2011-07-18T22:08:32.039-07:00

Some more tests.
I have tested the following 4 methods:

Method 1.
----------------------------

minors.1 <- NULL
system.time(replicate(N, {
for(i in 1:(nrow(mat)-1)){
start <- i + 1
for(j in start:nrow(mat)){
thisminor <- det(mat[c(i,j),c(i,j)])
minors.1 <- c(minors.1, thisminor)
}
}
}))

Method 2.
----------------------------

minors.1 <- numeric(nrow(mat)*nrow(mat)/2+nrow(mat)) # upper bound; only choose(nrow(mat), 2) entries get filled
system.time(replicate(N, {
k <- 0
for(i in 1:(nrow(mat)-1)){
start <- i + 1
for(j in start:nrow(mat)){
thisminor <- det(mat[c(i,j),c(i,j)])
k <- k+1
minors.1[k] <- thisminor
}
}
}))

Method 3.
----------------------------

system.time(replicate(N, {
index_vec <- t(combn(nrow(mat),2))
minors.2 <- apply(index_vec, MARGIN=1, FUN=function(ix){ minor2(mat,ix)})
}))

Method 4.
----------------------------

index_vec <- t(combn(nrow(mat),2))
system.time(replicate(N, {
minors.2 <- apply(index_vec, MARGIN=1, FUN=function(ix){ minor2(mat,ix)})
}))

Timing results
------------------

N=25 nrow(mat)=200

Elapsed time
1. 55.862
2. 21.816
3. 33.192
4. 25.049

N=100 nrow(mat)=100

Elapsed time
1. 26.007
2. 21.968
3. 33.569
4. 24.758

In both cases, method 2 appears to be the quickest.
So I would conclude that the for loop with preallocation of the result vector
is likely to be the quickest.

The R-provided function combn is also a bit of a bottleneck, as can be seen from the timings for methods 3 and 4.
Executing combn once helps, but the for loop with preallocation remains the quickest in the cases presented here.

Berend
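
For minors of order two specifically, there is an alternative to all four methods that was not timed in this thread: the determinant of a 2 x 2 principal submatrix is a_ii * a_jj - a_ij * a_ji, so the whole vector can be computed with matrix indexing and no call to det() or combn(). This is a hedged sketch, not one of the methods above; the names i, j, d and minors_vec are assumptions for illustration, and the result is checked against the combn() one-liner rather than benchmarked.

```r
set.seed(1)
n <- 6                                   # small size, chosen only for the check
mat <- matrix(rnorm(n * n), nrow = n)

# Index pairs (i, j) with i < j, in the same order as combn(n, 2),
# built directly so that combn() is not needed.
i <- rep(seq_len(n - 1), times = (n - 1):1)
j <- i + sequence((n - 1):1)

# Closed form for each 2 x 2 principal minor:
# det([[a_ii, a_ij], [a_ji, a_jj]]) = a_ii * a_jj - a_ij * a_ji
d <- diag(mat)
minors_vec <- d[i] * d[j] - mat[cbind(i, j)] * mat[cbind(j, i)]

# Reference values from the det()-based approach.
minors_ref <- combn(n, 2, function(x) det(mat[x, x]))
stopifnot(isTRUE(all.equal(minors_vec, minors_ref)))
```

This only works for order two; higher-order minors would still need det() or an equivalent.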

2011-07-18T14:51:14.815-07:00

Good point on preallocating the vector (copying slows R down quite a bit).

One caveat to your suggestion: The size you propose isn't right. There are usually many more minors of order two than rows in the matrix.
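
To make the size concrete: an n x n matrix has choose(n, 2) = n(n-1)/2 principal minors of order two, one per pair of row indices. A minimal sketch of a preallocated loop with that exact size follows; the function name principal_minors2 is hypothetical and not from the original post.

```r
# Number of order-two principal minors for a 200-row matrix.
n <- 200
n_minors <- choose(n, 2)   # 19900, far more than the 200 rows

# Hypothetical helper: nested loop writing into a vector of the exact size.
principal_minors2 <- function(mat) {
  n <- nrow(mat)
  minors <- numeric(choose(n, 2))   # allocate once, no copying via c()
  k <- 0L
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      k <- k + 1L
      minors[k] <- det(mat[c(i, j), c(i, j)])
    }
  }
  minors
}
```

Using seq_len(n - 1) instead of 1:(n - 1) means a 1 x 1 matrix yields an empty result rather than looping over c(1, 0).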

2011-07-18T14:33:38.865-07:00

I think some speed would be gained by preallocating a vector of size nrow, so that R does not have to copy data by c(...).

minors[i] = thisminor

By the way, apply has an internal for loop...

2011-07-18T12:44:48.943-07:00

Thanks again for another challenging comment. Maybe I should not have been so unequivocal in my language about avoiding for loops. For small problems, they're fine.

I have been working with bigger matrices (on the order of 100 or 200 rows). In this setting, apply() saves time. I tested the code you provided (with combinations, rather than combn) as follows:

> mat = matrix(rnorm(200*200), nrow = 200)
> N = 25
>
> system.time(replicate(N,
+ {
+ minors.1 = NULL
+ for(i in 1:(nrow(mat)-1)){
+ start <- i + 1
+ for(j in start:nrow(mat)){
+ thisminor <- det(mat[c(i,j),c(i,j)])
+ minors.1 <- c(minors.1, thisminor)
+ }
+ }
+ }))
user system elapsed
30.75 0.03 30.79
>
>
> minor2 = function(mat, idx){
+ prin = mat[idx,idx]
+ return(det(prin))
+ }
>
> system.time(replicate(N, {
+ index_vec <- combinations(nrow(mat), 2)
+ minors.2 <- apply(index_vec, MARGIN=1, FUN=function(ix){ minor2(mat,ix)})
+ }))
user system elapsed
21.64 0.00 21.67

Conclusion: apply() scales better than the nested for loop. But, you're right. For smaller matrices, the for loop will be the faster option.

2011-07-18T12:20:03.218-07:00

About the speed.

I tested as follows: number of replications N <- 1000

> system.time(replicate(N, {
+ for(i in 1:(nrow(mat)-1)){
+ start <- i + 1
+ for(j in start:nrow(mat)){
+ thisminor <- det(mat[c(i,j),c(i,j)])
+ minors.1 <- c(minors.1, thisminor)
+ }
+ }
+ }))
user system elapsed
0.417 0.001 0.421

> system.time(replicate(N, {
+ index_vec <- t(combn(nrow(mat),2))
+ minors.2 <- apply(index_vec, MARGIN=1, FUN=function(ix){ minor2(mat,ix)})
+ }))
user system elapsed
0.824 0.002 0.829

Even if you move the combn call to before the second system.time, the for loop is still faster, albeit by a much smaller margin.

Conclusion: For loop is faster!

Berend

2011-07-18T12:19:09.323-07:00

Thanks for the correction and the tip on combn. With regard to why to use gregmisc, I suppose that it depends on your application. It turns out that gregmisc is faster even without transposing (I tried that too).

> system.time(combinations(400,2))
user system elapsed
0.29 0.00 0.29
> system.time(t(combn(400,2)))
user system elapsed
1.04 0.00 1.05

But, gregmisc's combinations() function breaks on combinations much larger than this (I picked 1000 choose 2... which broke gregmisc's combinations, but not combn).

2011-07-18T12:10:42.271-07:00

1. The first for loop has an error and should read:

for(i in 1:(nrow(mat)-1)){

2. why use gregmisc? You can use the R provided function combn and then use t(combn(nrow(mat),2)) to get the same result.

Berend
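
Both points in this comment can be illustrated on a tiny case. The sketch below assumes a matrix with 4 rows; the names n, bad_range and idx are chosen only for this example. It shows why the outer loop has to stop at nrow(mat)-1, and what t(combn(n, 2)) returns.

```r
n <- 4

# If the outer loop reached i = n, the inner range start:n would count DOWN,
# because the colon operator does that when the start exceeds the end.
bad_range <- (n + 1):n     # c(5, 4): index 5 is out of bounds for a 4-row matrix

# Index pairs, one row per pair (i, j) with i < j.
idx <- t(combn(n, 2))
idx
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    1    4
# [4,]    2    3
# [5,]    2    4
# [6,]    3    4
```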