R_note -- The Exploration of Statistical Software R (in Chinese: "A Deep Adventure in Statistical Software R")

Section: LAM/MPI/Rmpi

This page demonstrates a parallel computing tool implemented in R. You need an MPI environment and the Rmpi package to run these examples. It also shows a comparison with other methods. In general, not every computing problem can be easily parallelized (MCMC is one example), and parallel computing is not always more efficient than serial computing. Its performance also depends heavily on the program and the algorithm.


  • MPI -- Message Passing Interface
    With LAM/MPI and the R package Rmpi by Dr. Hao Yu.

  • Download
    For a Mandrake Linux system:
        LAM/MPI requires lam-devel-6.5.9-2mdk, lam-runtime-6.5.9-2mdk, lam-doc-6.5.9-2mdk, and liblam0-devel-6.5.9-2mdk.
        Rmpi requires Rmpi_0.4-6.tar.gz.

  • Master & Slave
    The basic idea is that the Master is MPI rank 0 and the Slaves are MPI ranks 1 to n, where n is the universe size of your MPI environment.
    Steps:
    0. Initialize.
    1. Master sends to Slaves. (bcast, send)
    2. Slaves receive from Master. (bcast, recv)
    3. Compute.
    4. Slaves send results to Master. (send)
    5. Master receives from Slaves. (recv)
    6. Complete and quit.
    Here, create a file "rmpi_ms.r" as follows,

        
    # File name: rmpi_ms.r
    
    call.mpi.master <- function(){
      library(Rmpi)
    
      # Spawn the slaves, ship them the slave function, and start it there.
      mpi.spawn.Rslaves(needlog = FALSE)
      mpi.bcast.Robj2slave(call.mpi.slave)
      mpi.bcast.cmd(call.mpi.slave())
    
      # Step 1: broadcast an integer to all slaves at once.
      x <- 100
      mpi.bcast(as.integer(x), type = 1)
    
      # Step 1 (cont.): send an integer to each slave individually.
      mysize <- mpi.universe.size()
      y <- 200
      for(i in 1 : mysize){
        mpi.send(as.integer(y), type = 1, dest = i, tag = 1)
      }
    
      # Step 5: collect one result vector from each slave.
      ret <- NULL
      for(i in 1 : mysize){
        ret.slave <- mpi.recv.Robj(source = i, tag = 2)
        ret <- rbind(ret, ret.slave)
      }
      ret
    }
    
    call.mpi.slave <- function(){
      # Steps 2 and 3: receive the broadcast and the point-to-point message,
      # then compute a result that depends on this slave's rank.
      x <- mpi.bcast(integer(1), type = 1)
      y <- mpi.recv(integer(1), type = 1, source = 0, tag = 1)
    
      myrank <- mpi.comm.rank()
    
      # Step 4: send the result back to the master (rank 0).
      ret.slave <- c(myrank, x, y, myrank, x * myrank + y)
      mpi.send.Robj(ret.slave, dest = 0, tag = 2)
    }
    
    # Step 6: run the example, print the result, and shut the slaves down.
    print(call.mpi.master())
    mpi.close.Rslaves()
    

    The output will look like this,

        
              [,1] [,2] [,3] [,4] [,5]
    ret.slave    1  100  200    1  300
    ret.slave    2  100  200    2  400
    ret.slave    3  100  200    3  500
    ret.slave    4  100  200    4  600
    
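    Assuming a working LAM/MPI installation, a session typically looks like the following (the hostfile name is hypothetical; it lists your cluster nodes):

    ```
    lamboot my_hostfile       # boot the LAM daemons on every node listed
    R --no-save < rmpi_ms.r   # run the master script; it spawns the slaves
    lamhalt                   # shut the LAM daemons down again
    ```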


  • Sum by Rmpi
    As on the Loop page, use MPI to split the for loop and distribute the pieces to slaves, reducing the computing time. Example scripts: "rmpi_for_1.r", "rmpi_for_2.r", "rmpi_apply.r" and "rmpi_rowSums.r".
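    The scripts themselves are not reproduced on this page. As a rough sketch of the rowSums-style idea (not the original "rmpi_rowSums.r"; the matrix size is made up for illustration), Rmpi's high-level helper mpi.parApply() can split the rows of a matrix across the spawned slaves:

    ```r
    # Hypothetical sketch: compute row sums of a matrix in parallel by
    # distributing rows across the slaves.
    library(Rmpi)
    mpi.spawn.Rslaves(needlog = FALSE)

    X <- matrix(rnorm(10000 * 100), nrow = 10000)

    # mpi.parApply() partitions the rows of X among the slaves, applies
    # sum() to each row, and gathers the results back on the master.
    row.sums <- mpi.parApply(X, 1, sum)

    mpi.close.Rslaves()
    ```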

  • Computing time
    For a PIII-1.4G PC cluster with 4 nodes, the test computing times are as follows,

                                for 1   for 2   apply   rowSums   dyn
        Sum by Rmpi (secs)         86      79      41         5     -
        Sum by Loop (secs)        331     307     117         2    19


  • Conclusion
    The conclusion is the same as on the Loop page: use MPI to split independent jobs.
    Two caveats: the speedup is bounded by the number of CPUs, and communication time can outweigh the savings for operations that are already fast (rowSums takes 2 secs serially but 5 secs with Rmpi).
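    The timing table can be turned into speedups with a quick check in plain R (the numbers are taken directly from the table above, 4 nodes):

    ```r
    # Speedup = serial (Loop) time / Rmpi time, in seconds.
    serial <- c(for1 = 331, for2 = 307, apply = 117, rowSums = 2)
    rmpi   <- c(for1 =  86, for2 =  79, apply =  41, rowSums = 5)
    round(serial / rmpi, 2)
    # for1 ~ 3.85x, for2 ~ 3.89x, apply ~ 2.85x, rowSums ~ 0.4x (slower!)
    ```

    With 4 nodes the best possible speedup is 4x; the for-loop versions come close, while rowSums is already so fast serially that communication dominates.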




Maintained: Wei-Chen Chen
E-Mail: wccsnow @ gmail.com
Last Revised: Dec 12 2016, 09:44 (CST Taipei, Taiwan)
Created: Oct 06 2003