R_note -- The Exploration of Statistical Software R (統計軟體 R 深度歷險)

Section: Fast Loop

This section shows how to write a fast loop and how much faster it can be. Five examples demonstrate different ways to write a loop for the same task, "sum a large matrix several times":

  1. "Sum by for 1" -- use nested for() loops with the row index in the outer loop, so the matrix is traversed row by row.
  2. "Sum by for 2" -- use nested for() loops with the column index in the outer loop, so the matrix is traversed column by column.
  3. "Sum by apply" -- use the apply() function to sum the matrix.
  4. "Sum by rowSums" -- use the built-in rowSums() function to sum the matrix.
  5. "Sum by dyn" -- use dynamic loading to call an external compiled function that sums the matrix.
Finally, the computing times of these methods are listed. Parallel versions of these methods are demonstrated and compared in the section "LAM/MPI/Rmpi".
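
Before timing anything, it can help to confirm on a small matrix that the four pure-R approaches compute the same total. The sketch below is illustrative only: the matrix size and the names `small`, `total.for1`, `total.for2`, `total.apply`, and `total.rowSums` are assumptions chosen for this example and do not appear in the scripts that follow.

```r
# Small matrix so the check runs instantly (size is an assumption for illustration).
small <- matrix(seq_len(12), nrow = 4, ncol = 3)

# "Sum by for 1": row index in the outer loop.
total.for1 <- 0
for (i in seq_len(nrow(small))) {
  for (j in seq_len(ncol(small))) {
    total.for1 <- total.for1 + small[i, j]
  }
}

# "Sum by for 2": column index in the outer loop.
total.for2 <- 0
for (j in seq_len(ncol(small))) {
  for (i in seq_len(nrow(small))) {
    total.for2 <- total.for2 + small[i, j]
  }
}

# "Sum by apply" and "Sum by rowSums".
total.apply <- sum(apply(small, 1, sum))
total.rowSums <- sum(rowSums(small))

# All four totals should be identical (1 + 2 + ... + 12 = 78).
stopifnot(total.for1 == total.for2,
          total.for1 == total.apply,
          total.for1 == total.rowSums)
print(total.for1)
```

The methods differ only in how the elements are visited, so any difference in the timings comes from loop overhead and memory access order, not from the arithmetic.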


  • Sum by for 1
    First, create an R script file "loop_for_1.r" containing the following:

        
    # File name: loop_for_1.r
    
    my.loop <- 20
    m.dim <- list(nrow = 200000, ncol = 10)
    m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
    ret <- 0
    
    start <- Sys.time()
    for(k in 1 : my.loop){
      for (i in 1 : m.dim$nrow){
        for (j in 1 : m.dim$ncol){
          ret <- ret + m[i, j]
        }
      }
    }
    Sys.time() - start
    

  • Sum by for 2
    Next, create an R script file "loop_for_2.r" containing the following:

        
    # File name: loop_for_2.r
    
    my.loop <- 20
    m.dim <- list(nrow = 200000, ncol = 10)
    m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
    ret <- 0
    
    start <- Sys.time()
    for(k in 1 : my.loop){
      for (j in 1 : m.dim$ncol){
        for (i in 1 : m.dim$nrow){
          ret <- ret + m[i, j]
        }
      }
    }
    Sys.time() - start
    

  • Sum by apply
    Then create an R script file "loop_apply.r" containing the following:

        
    # File name: loop_apply.r
    
    my.loop <- 20
    m.dim <- list(nrow = 200000, ncol = 10)
    m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
    ret <- 0
    
    start <- Sys.time()
    for(k in 1 : my.loop){
      ret <- ret + sum(apply(m, 1, sum))
    }
    Sys.time() - start
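
    In apply(m, 1, sum), the second argument selects the margin: 1 applies the function to each row and 2 to each column. A minimal sketch on a hypothetical 2 x 3 matrix (the names `x`, `row.totals`, and `col.totals` are assumptions for illustration):

```r
# Hypothetical 2 x 3 matrix, filled column by column:
# row 1 is (1, 3, 5) and row 2 is (2, 4, 6).
x <- matrix(1:6, nrow = 2, ncol = 3)

row.totals <- apply(x, 1, sum)   # one sum per row
col.totals <- apply(x, 2, sum)   # one sum per column

print(row.totals)   # 9 12
print(col.totals)   # 3 7 11

# Summing either set of partial sums gives the grand total.
stopifnot(sum(row.totals) == sum(x), sum(col.totals) == sum(x))
```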
    

  • Sum by rowSums
    Then create an R script file "loop_rowSums.r" containing the following:

        
    # File name: loop_rowSums.r
    
    my.loop <- 20
    m.dim <- list(nrow = 200000, ncol = 10)
    m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
    ret <- 0
    
    start <- Sys.time()
    for(k in 1 : my.loop){
      ret <- ret + sum(rowSums(m))
    }
    Sys.time() - start
    

  • Sum by dyn
    Create a Fortran source file "loop_dyn.f" containing the following:

        
    c File name: loop_dyn.f
    c For dynamic loading; compile with g77.
    c SHELL> g77 -c loop_dyn.f ; g77 -shared -o loop_dyn.so loop_dyn.o
    
          subroutine dynsum(nrow, ncol, m, ret)
            integer i, j, nrow, ncol
            real*8 m(nrow, ncol), ret
    
            ret = 0
            do j = 1, ncol
              do i = 1, nrow
                ret = ret + m(i, j) 
              end do
            end do
    
            return
          end
    
    c The output is a shared library "loop_dyn.so" that can be called by R.
    

    Then create an R script file "loop_dyn.r" containing the following:

        
    # File name: loop_dyn.r
    
    dyn.load("loop_dyn.so")
    # On Windows, the call looks like this:
    # dyn.load("C:/Windows/Desktop/loop_dyn.dll")
    
    my.loop <- 20
    m.dim <- list(nrow = 200000, ncol = 10)
    m <- matrix(1, nrow = m.dim$nrow, ncol = m.dim$ncol)
    ret <- 0
    
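    dynsum.f <- function(m) {
      # ret must be a single double; the Fortran routine overwrites it with the sum.
      ret <- .Fortran("dynsum", nrow = as.integer(nrow(m)), ncol = as.integer(ncol(m)),
               m = as.double(m), ret = as.double(0))
      ret$ret
    }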
    
    start <- Sys.time()
    for(k in 1 : my.loop){
      ret <- ret + dynsum.f(m)
    }
    Sys.time() - start
    
    dyn.unload("loop_dyn.so")
    # On Windows, the call looks like this:
    # dyn.unload("C:/Windows/Desktop/loop_dyn.dll")
    

    To test on Windows, download the example "loop_dyn.dll" to "C:\Windows\Desktop\".

  • Computing time
    On a PIII-1.4G PC, the measured computing times were as follows:

    Sum by        for 1   for 2   apply   rowSums   dyn
    Time (secs)     331     307     117         2    19
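
    Timings like these depend heavily on the hardware and the R version, so it is worth re-measuring on the target machine. A minimal sketch using system.time(), which reports user, system, and elapsed time for an expression. The reduced matrix size and the helper name `time.it` are assumptions for illustration and are not taken from the scripts above:

```r
# Smaller matrix than in the scripts above so the for() loop finishes quickly
# (dimensions are an assumption for illustration).
m <- matrix(1, nrow = 2000, ncol = 10)

# Hypothetical helper: evaluate an expression and return the elapsed seconds.
time.it <- function(expr) {
  unname(system.time(expr)["elapsed"])
}

t.for <- time.it({
  ret <- 0
  for (j in seq_len(ncol(m))) {
    for (i in seq_len(nrow(m))) {
      ret <- ret + m[i, j]
    }
  }
})
t.apply   <- time.it(sum(apply(m, 1, sum)))
t.rowSums <- time.it(sum(rowSums(m)))

# Elapsed seconds for each approach; actual values vary by machine.
print(c(loop = t.for, apply = t.apply, rowSums = t.rowSums))
```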

  • Conclusion
    Prefer built-in internal functions such as rowSums().
    Otherwise, use an external compiled function.
    Use "apply" in place of a "for" loop.
    See also "apply", "lapply", "tapply", "sapply".
    Use column-wise (column-major) data structures in R and Fortran.
    Use row-wise (row-major) data structures in C.
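
    The advice on column-wise structures follows from how R stores a matrix: as a single vector in column-major order, the same layout Fortran uses. A minimal sketch of that layout (the names `x`, `v`, and `idx` are assumptions for illustration):

```r
# Hypothetical 2 x 3 matrix; R fills it column by column.
x <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2, ncol = 3)

# The underlying storage is one vector in column-major order.
v <- as.vector(x)
print(v)   # 10 20 30 40 50 60

# Element [i, j] sits at position (j - 1) * nrow + i of that vector.
i <- 2
j <- 3
idx <- (j - 1) * nrow(x) + i
stopifnot(x[i, j] == v[idx])   # both are 60

# Walking down a column touches adjacent elements of the vector,
# which is why column-by-column access matches the storage order.
stopifnot(identical(x[, 2], v[3:4]))
```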




Maintained: Wei-Chen Chen
E-Mail: wccsnow @ gmail.com
Last Revised: Dec 12 2016, 09:44 (CST Taipei, Taiwan)
Created: Oct 06 2003