Section: Fast Loop
This section shows how to write a fast loop
and how much time each approach takes.
Five examples demonstrate different ways
to write a loop for the same purpose, which is
"sum a large matrix several times",
as follows:
- "Sum by for 1" --
use nested "for()" loops with the column index in the inner loop,
so the matrix is traversed row by row.
- "Sum by for 2" --
use nested "for()" loops with the row index in the inner loop,
so the matrix is traversed column by column.
- "Sum by apply" --
use the "apply()" function to sum up the matrix.
- "Sum by rowSums" --
use the internal "rowSums()" function to sum up the matrix.
- "Sum by dyn" --
use dynamic loading to call an external compiled function
that sums up the matrix.
Finally, the computing times of the above methods
are listed.
Parallel versions of these methods
are demonstrated and compared in
the section
"LAM/MPI/Rmpi".
- Sum by for 1
First, create an R code file "loop_for_1.r" containing the following:
|
|
# File name: loop_for_1.r
my.loop <- 20
m.dim <- list(nrow = 200000, ncol = 10)
m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
ret <- 0
start <- Sys.time()
for (k in 1 : my.loop) {
  for (i in 1 : m.dim$nrow) {
    for (j in 1 : m.dim$ncol) {
      ret <- ret + m[i, j]
    }
  }
}
Sys.time() - start
|
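The scripts in this section measure elapsed time with "Sys.time()". As a minimal sketch of an alternative, "system.time()" can wrap the timed expression directly. The helper name "time.it" and the reduced matrix size are assumptions made for illustration and are not part of the original scripts.

```r
# Hypothetical helper (not in the original scripts): evaluate an expression
# and return the elapsed wall-clock time in seconds.
time.it <- function(expr) {
  unname(system.time(expr)["elapsed"])
}

# Reduced matrix size (assumption) so the sketch runs quickly.
m <- matrix(1, nrow = 2000, ncol = 10)
ret <- 0

elapsed <- time.it({
  for (k in 1:20) {
    ret <- ret + sum(rowSums(m))
  }
})

ret      # total accumulated over the 20 repetitions
elapsed  # elapsed seconds, machine dependent
```

Because the braces are evaluated in the calling environment, "ret" is updated exactly as it would be in the untimed loop.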
- Sum by for 2
Next, create an R code file "loop_for_2.r" containing the following:
|
|
# File name: loop_for_2.r
my.loop <- 20
m.dim <- list(nrow = 200000, ncol = 10)
m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
ret <- 0
start <- Sys.time()
for (k in 1 : my.loop) {
  for (j in 1 : m.dim$ncol) {
    for (i in 1 : m.dim$nrow) {
      ret <- ret + m[i, j]
    }
  }
}
Sys.time() - start
|
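The two "for" variants differ only in the order in which the matrix elements are visited. The following sketch records that order on a small 2 x 3 matrix; the matrix size and the variable names "visit.for1" and "visit.for2" are chosen here for illustration only.

```r
# Small illustrative matrix (assumption; the scripts use 200000 x 10).
# R fills it column by column: column 1 is (1, 2), column 2 is (3, 4), ...
m <- matrix(1:6, nrow = 2, ncol = 3)

# Order used by "loop_for_1.r": row index outside, column index inside.
visit.for1 <- integer(0)
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    visit.for1 <- c(visit.for1, m[i, j])
  }
}

# Order used by "loop_for_2.r": column index outside, row index inside.
visit.for2 <- integer(0)
for (j in seq_len(ncol(m))) {
  for (i in seq_len(nrow(m))) {
    visit.for2 <- c(visit.for2, m[i, j])
  }
}

visit.for1                          # 1 3 5 2 4 6
visit.for2                          # 1 2 3 4 5 6 (the storage order)
sum(visit.for1) == sum(visit.for2)  # TRUE, both sums are 21
```

Both orders give the same sum; only the second one walks through the elements in the order in which R stores them.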
- Sum by apply
Then, create an R code file "loop_apply.r" containing the following:
|
|
# File name: loop_apply.r
my.loop <- 20
m.dim <- list(nrow = 200000, ncol = 10)
m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
ret <- 0
start <- Sys.time()
for (k in 1 : my.loop) {
  ret <- ret + sum(apply(m, 1, sum))
}
Sys.time() - start
|
- Sum by rowSums
Then, create an R code file "loop_rowSums.r" containing the following:
|
|
# File name: loop_rowSums.r
my.loop <- 20
m.dim <- list(nrow = 200000, ncol = 10)
m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
ret <- 0
start <- Sys.time()
for (k in 1 : my.loop) {
  ret <- ret + sum(rowSums(m))
}
Sys.time() - start
|
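The "apply" and "rowSums" scripts compute the same quantity. As a quick check on a small matrix (the 2 x 3 size and the variable names are assumptions for illustration), the row totals and the grand total agree:

```r
# Small illustrative matrix (assumption; the scripts use 200000 x 10).
m <- matrix(1:6, nrow = 2, ncol = 3)

by.apply   <- apply(m, 1, sum)  # row totals via apply(): 9 12
by.rowSums <- rowSums(m)        # row totals via rowSums(): 9 12

total.apply   <- sum(by.apply)
total.rowSums <- sum(by.rowSums)
total.direct  <- sum(m)         # grand total computed in one call: 21

all.equal(total.apply, total.rowSums)  # TRUE
```

"apply()" calls the R function "sum" once per row, while "rowSums()" does the whole job in compiled code, which is consistent with the timing difference reported for the two scripts.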
- Sum by dyn
Create a Fortran code file "loop_dyn.f" containing the following:
|
|
c     File name: loop_dyn.f
c     For dynamic loading, compile with g77:
c     SHELL> g77 -c loop_dyn.f ; g77 -shared -o loop_dyn.so loop_dyn.o
      subroutine dynsum(nrow, ncol, m, ret)
      integer i, j, nrow, ncol
      real*8 m(nrow, ncol), ret
      ret = 0
c     Inner loop over rows: Fortran arrays are stored by column.
      do j = 1, ncol
        do i = 1, nrow
          ret = ret + m(i, j)
        end do
      end do
      return
      end
c     The output is a shared library "loop_dyn.so" that R can call.
|
Then, create an R code file "loop_dyn.r" containing the following:
|
|
# File name: loop_dyn.r
dyn.load("loop_dyn.so")
# On Windows, the call looks like this:
# dyn.load("C:/Windows/Desktop/loop_dyn.dll")
my.loop <- 20
m.dim <- list(nrow = 200000, ncol = 10)
m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
ret <- 0
# Wrapper around the Fortran subroutine "dynsum".
dynsum.f <- function(m) {
  # "ret" is a length-one output argument, initialized to 0.
  ret <- .Fortran("dynsum", nrow = as.integer(nrow(m)),
                  ncol = as.integer(ncol(m)),
                  m = as.double(m), ret = as.double(0))
  ret$ret
}
start <- Sys.time()
for (k in 1 : my.loop) {
  ret <- ret + dynsum.f(m)
}
Sys.time() - start
dyn.unload("loop_dyn.so")
# On Windows, the call looks like this:
# dyn.unload("C:/Windows/Desktop/loop_dyn.dll")
|
To test on Windows, download the example "loop_dyn.dll" to "C:\Windows\Desktop\".
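Before timing the compiled routine, it can be worth checking its result against "sum()". The sketch below assumes the shared library built from "loop_dyn.f" sits in the current working directory. The variable names "lib" and "check.ok" are hypothetical, and the check is skipped when the library is not found.

```r
# Library file name for the current platform: "loop_dyn.so" or "loop_dyn.dll".
# Assumption: the file is in the current working directory.
lib <- paste0("loop_dyn", .Platform$dynlib.ext)

# Small illustrative matrix (assumption; the scripts use 200000 x 10).
m <- matrix(as.double(1:6), nrow = 2, ncol = 3)

if (file.exists(lib)) {
  lib.path <- normalizePath(lib)
  dyn.load(lib.path)
  out <- .Fortran("dynsum", nrow = as.integer(nrow(m)),
                  ncol = as.integer(ncol(m)),
                  m = as.double(m), ret = as.double(0))
  # The Fortran result should match R's own sum of the matrix.
  check.ok <- isTRUE(all.equal(out$ret, sum(m)))
  dyn.unload(lib.path)
} else {
  # Library not built or not downloaded: nothing to compare.
  check.ok <- NA
}

check.ok
```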
- Computing time
On a PIII-1.4G PC, the measured computing times are as follows:
| Sum by Loop | for 1 | for 2 | apply | rowSums | dyn |
| Time (secs) | 331   | 307   | 117   | 2       | 19  |
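The timings are easier to compare as speedups relative to the slowest method. The following sketch does that arithmetic with the numbers reported for the PIII-1.4G PC; the variable names are chosen here for illustration.

```r
# Times in seconds as reported for the PIII-1.4G PC.
times <- c("for 1" = 331, "for 2" = 307, "apply" = 117,
           "rowSums" = 2, "dyn" = 19)

# Speedup of each method relative to "for 1", rounded to one decimal.
speedup <- round(times[["for 1"]] / times, 1)
speedup
# for 1: 1.0, for 2: 1.1, apply: 2.8, rowSums: 165.5, dyn: 17.4
```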
- Conclusion
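Prefer R's built-in internal functions such as "rowSums()"; this was the fastest method tested (2 secs).
Where no built-in function fits, use an external compiled function (19 secs).
Use "apply" in place of a "for" loop (117 secs versus more than 300 secs).
See also "lapply", "tapply", and "sapply".
Use a column-wise data structure in R and Fortran, which store matrices column by column.
Use a row-wise data structure in C, which stores arrays row by row.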
|
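For readers who have not used the other members of the "apply" family named in the conclusion, here is a minimal sketch on toy data. The data and variable names are assumptions for illustration.

```r
# Toy data (assumption): a list of two integer vectors.
x <- list(a = 1:3, b = 4:6)

res.lapply <- lapply(x, sum)  # returns a list: a = 6, b = 15
res.sapply <- sapply(x, sum)  # simplifies to a named vector: a = 6, b = 15

# tapply() applies a function to values grouped by a factor-like vector.
values <- c(1, 2, 3, 4)
groups <- c("u", "v", "u", "v")
res.tapply <- tapply(values, groups, sum)  # u = 4, v = 6

res.lapply
res.sapply
res.tapply
```

"lapply()" always returns a list, "sapply()" simplifies the result when it can, and "tapply()" sums within groups, so the choice depends on the shape of the input and of the desired output.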