Using the R plug-in in IBM Spectrum LSF

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS. To download R, go to https://cran.r-project.org/mirrors.html and choose your preferred CRAN mirror. The R plug-in allows you to submit R jobs as regular LSF jobs.

Before you begin

Answers to questions about R, such as how to download and install the software or what the license terms are, can be found at https://cran.r-project.org/faqs.html.

Requirements for running the R workload (script) with IBM Spectrum LSF:

  • Ensure that the R script does not try to print any graphics. You can create and save graphical objects to files, but do not “print them to the screen”.
  • Ensure that R is installed in the same location on every host in your IBM Spectrum LSF cluster, and that it is accessible from all of those hosts.

Procedure

  1. Install IBM Spectrum LSF.

    Follow the instructions in the IBM Spectrum LSF Knowledge Center.

  2. Install R.

    R can be downloaded from The R Project for Statistical Computing.

    There are no special requirements for R, but ensure that your R workload can be executed successfully.
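
One way to confirm that an R workload runs successfully is a local smoke test on a host where R is installed, before anything is submitted through LSF. The following is a minimal sketch, not part of the R plug-in: the function name check_r_workload is hypothetical, and the paths passed to it are whatever applies on your hosts.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF or R command): run an R script locally
# before submitting it to LSF.
# Usage: check_r_workload <path-to-Rscript> <path-to-R-script>
check_r_workload() {
    rscript_bin="$1"
    r_script="$2"

    # The interpreter must exist and be executable on this host.
    if [ ! -x "$rscript_bin" ]; then
        echo "Rscript not executable: $rscript_bin" >&2
        return 1
    fi

    # The workload itself must be readable.
    if [ ! -r "$r_script" ]; then
        echo "R script not readable: $r_script" >&2
        return 1
    fi

    # Run the workload; a non-zero exit status means that it failed.
    if "$rscript_bin" "$r_script"; then
        echo "OK: $r_script ran successfully"
    else
        echo "FAILED: $r_script" >&2
        return 1
    fi
}
```

For example, with the paths used later in this procedure, the call would be check_r_workload /usr/local/bin/Rscript /home/user/R/helloworld.r. Running the same check on each host also helps confirm that R is in the same location everywhere.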

  3. Set the IBM Spectrum LSF environment and start LSF.

    To set the LSF environment:

    • For csh or tcsh: % source <LSF_TOP>/conf/cshrc.lsf
    • For sh, ksh, or bash: $ . <LSF_TOP>/conf/profile.lsf
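
For sh, ksh, or bash, the environment setup can be wrapped in a small check so that a wrong <LSF_TOP> is caught early. This is a sketch under assumptions: load_lsf_env is a hypothetical function name, and the test for bsub on PATH is only one possible way to confirm that the profile took effect.

```shell
#!/bin/sh
# Hypothetical helper: source the LSF environment for sh, ksh, or bash.
# Usage: load_lsf_env <LSF_TOP>
load_lsf_env() {
    lsf_profile="$1/conf/profile.lsf"

    if [ ! -r "$lsf_profile" ]; then
        echo "Cannot read $lsf_profile" >&2
        return 1
    fi

    # Source the profile into the current shell.
    . "$lsf_profile"

    # Report whether the LSF commands are now on PATH.
    if command -v bsub >/dev/null 2>&1; then
        echo "LSF environment loaded: bsub found"
    else
        echo "Profile sourced, but bsub is not on PATH" >&2
        return 1
    fi
}
```

The function must be called in the current shell (not in a subshell or a separate script) for the environment changes to persist.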

    To start LSF, use the following commands. Starting LSF daemons as a user other than root is supported only for these lsadmin and badmin subcommands:

    1. badmin hstartup
    2. lsadmin limstartup
    3. lsadmin resstartup
  4. Create resources or an application profile for R in LSF.
    • To create resources:
      Note: This solution is recommended because it makes it easy to configure which hosts belong to the R environment.
    1. Edit the $LSB_SHAREDIR/lsf.shared file.

      Add or define a new Boolean resource. The following example defines the R_workload resource name:

      Begin Resource
      RESOURCENAME  TYPE    INTERVAL INCREASING  DESCRIPTION
      R_workload    Boolean ()       ()          (Host installed R)
      End Resource
    2. Edit the $LSB_SHAREDIR/lsf.cluster.cluster_name file to set which hosts have R installed.

      The RESOURCES parameter is configured in the lsf.cluster.cluster_name file to the value that you set in the $LSB_SHAREDIR/lsf.shared file.

      The following example defines hostA and hostC as having R installed.

      Begin   Host
      HOSTNAME  model    type        server  RESOURCES
      hostA     !        !           1       (mg R_workload)
      hostB     !        !           1       (mg)
      hostC     !        !           1       (mg R_workload)
      End     Host
    3. Run the lsadmin reconfig and badmin mbdrestart commands to apply changes.
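
Before reconfiguring, it can help to confirm that the new resource name really sits inside a Begin Resource ... End Resource section. The following sketch assumes the section layout shown in the example above; resource_defined is a hypothetical function name, not an LSF command.

```shell
#!/bin/sh
# Hypothetical helper: check that a resource name appears inside a
# "Begin Resource" ... "End Resource" section of an lsf.shared-style file.
# Usage: resource_defined <file> <resource_name>
# Exit status is 0 if the name is defined, 1 otherwise.
resource_defined() {
    awk -v name="$2" '
        $1 == "Begin" && $2 == "Resource" { in_section = 1; next }
        $1 == "End"   && $2 == "Resource" { in_section = 0; next }
        # Inside the section, match on the first column only.
        in_section && $1 == name { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

For example, resource_defined "$LSB_SHAREDIR/lsf.shared" R_workload && echo "defined". After the reconfiguration, lshosts -R R_workload should list only the hosts that were given the resource (hostA and hostC in the example).
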
    • To create a new application profile:
    1. Edit the $LSB_CONFDIR/cluster_name/configdir/lsb.applications file and add or define a new application. The following example defines the R_Application application profile.
      Note: The CPULIMIT value is given for the host that has R installed (wylinto in this example).

      For more information about the lsb.applications file, refer to the IBM Spectrum LSF Knowledge Center.

      Begin Application
      NAME         = R_Application
      DESCRIPTION  = run R workload
      CPULIMIT     = 24:0/wylinto
      End Application
    2. Run badmin reconfig to apply the changes.
  5. Write your R application.

    The following example uses helloworld.r.

    # My first program in R Programming
    myString <- "Hello, World!"
    print(myString)

    After you write the application, submit it to LSF. For example, the following job is submitted to the "interactive" queue, runs in interactive mode, and specifies R_workload as the resource that identifies the hosts where R is installed.

    [user@host R]$ bsub -R R_workload -q interactive -I /usr/local/bin/Rscript /home/user/R/helloworld.r >>/home/user/R/output.txt
    <<Waiting for dispatch ...>>
    <<Starting on hostA>>
    [user@host R]$ bjobs
    JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
    185     user    PEND  interactiv user                *loworld.r Apr  3 04:56
    [user@host R]$ bjobs
    No unfinished job found
    [user@host R]$ cat output.txt
    Job <185> is submitted to queue <interactive>.
    [1] "Hello, World!"
    [user@host R]$

    While the job is running, you can check the status of the R application in the output file: /home/user/R/output.txt

    Use the bsub -R R_workload -Is command to submit R to LSF as an interactive job. The command starts R in interactive mode.

    [user@host R]$ bsub -R R_workload -q interactive -Is /usr/local/bin/R
    Job <192> is submitted to queue <interactive>.
    <<Waiting for dispatch ...>>
    <<Starting on hostA>>
    
    R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
    Copyright (C) 2018 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.
    
      Natural language support but running in an English locale
    
    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.
    
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.
    
    [Previously saved workspace restored]
    
    > myString <- "Hello, World!"
    > print ( myString)
    [1] "Hello, World!"
    > q()
    Save workspace image? [y/n/c]: n
    [user@host R]$

    You can also use the application profile to submit R jobs. The commands are similar to those above, but use the -app option instead of -R:

    bsub -app R_Application -q interactive -I /usr/local/bin/Rscript /home/user/R/helloworld.r >>/home/user/R/output.txt

    or

    bsub -app R_Application -q interactive -Is /usr/local/bin/R
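
The two submission styles differ only in how the job is matched to hosts with R (-R with a resource, or -app with an application profile). The sketch below makes that explicit by printing the command line for either style; r_bsub_command is a hypothetical helper, and it only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: print the bsub command line for an interactive
# Rscript job, selecting hosts by resource or by application profile.
# Usage: r_bsub_command <resource|app> <name> <queue> <Rscript> <script>
r_bsub_command() {
    mode="$1"; name="$2"; queue="$3"; rscript_bin="$4"; r_script="$5"

    case "$mode" in
        resource) selector="-R $name" ;;    # select hosts by Boolean resource
        app)      selector="-app $name" ;;  # use an application profile
        *)
            echo "mode must be 'resource' or 'app'" >&2
            return 1
            ;;
    esac

    # Print rather than run, so the command can be reviewed first.
    echo "bsub $selector -q $queue -I $rscript_bin $r_script"
}
```

For example, r_bsub_command app R_Application interactive /usr/local/bin/Rscript /home/user/R/helloworld.r prints the first command shown above, without the output redirection.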

    If you forget your application profile name, view it using the bapp command:

    [user@host R]$ bapp
    APP_NAME            NJOBS     PEND      RUN     SUSP
    R_Application           0        0        0        0
    [user@host R]$
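
When only the profile names are needed, for example in a script, the bapp output can be reduced to its first column. The following sketch assumes the column layout shown above, with a header row that starts with APP_NAME; list_app_names is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper: read bapp-style output on stdin and print the
# application profile names, one per line.
list_app_names() {
    # Skip blank lines and the header row (first column APP_NAME).
    awk 'NF > 0 && $1 != "APP_NAME" { print $1 }'
}
```

A typical call would be bapp | list_app_names.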