RStudio overview
R is a popular statistical analysis and machine learning package that enables data management and includes tests, models, analyses, and graphics. RStudio, included in Watson Studio Local, provides an IDE for working with R.
An RStudio session created in Watson Studio Local includes 2 GB of storage and 5 GB of memory available for your use.
For information about how to set up and start using RStudio, see the blog post Using RStudio in IBM Data Science Experience and the Using RStudio article on the RStudio Support site.
Related tasks:
Install a package
To connect to relational data sources from RStudio on a Watson Studio Local cluster without Internet access, the Watson Studio Local administrator must copy the required packages to your RStudio pods, and you then follow the installation steps for installing from a downloaded package. If the Watson Studio Local cluster has Internet access, complete the following steps:
- In the Tools shell, install the database driver in the /user-home/ directory. Example:
  pwd
  /user-home/1003/DSX_Projects/project-nb-test/rstudio
  cd /user-home/1003/; wget https://jdbc.postgresql.org/download/postgresql-42.2.0.jar
- Configure Java on the pod:
  R CMD javareconf
- Return to the RStudio script and install the RJDBC package and its dependencies:
  install.packages("RJDBC", dep=TRUE)
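On a cluster without Internet access, install from the files that the administrator copied to the pod instead. A minimal sketch, assuming the RJDBC source package was downloaded as /user-home/1003/RJDBC_0.2-10.tar.gz (a hypothetical file name and path); its dependencies, such as rJava and DBI, must be installed the same way first:
# install RJDBC from a local source file rather than from CRAN
install.packages("/user-home/1003/RJDBC_0.2-10.tar.gz", repos = NULL, type = "source")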
PostgreSQL example:
library(RJDBC)
driverClassName <- "org.postgresql.Driver"
driverPath <- "/user-home/1003/postgresql-42.2.0.jar"
url <- "jdbc:postgresql://9.876.543.21:27422/compose"
databaseUsername <- "admin"
databasePassword <- "ABCDEFGHIJKLMNOP"
databaseSchema <- "public"
databaseTable <- "cars"
# load the JDBC driver and connect to the database
drv <- JDBC(driverClassName, driverPath)
conn <- dbConnect(drv, url, databaseUsername, databasePassword)
# dbListTables(conn)
data <- dbReadTable(conn, databaseTable)
# data <- dbReadTable(conn, paste(databaseSchema, '.', databaseTable, sep=''))
data
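When you finish, close the connection. RJDBC implements the DBI interface, so dbDisconnect() releases the JDBC connection:
dbDisconnect(conn)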
Change the Spark version
- Sparklyr library
- To change the default Spark 2.0.2 service to a Spark 2.2.1 service, use the spark_connect() function (a sketch follows the SparkR example below).
- SparkR library
- To change the default Spark 2.0.2 service to a Spark 2.2.1 service, use the $SPARK_HOME environment variable to specify the Spark 2.2.1 installation location in RStudio:
Sys.setenv("SPARK_HOME" = "/usr/local/spark-2.2.1-bin-hadoop2.7")
# import SparkR
library(SparkR, lib.loc = "/usr/local/spark-2.2.1-bin-hadoop2.7/R/lib")
# initialize the Spark session
sc <- sparkR.session(master = "spark://spark-master221-svc:7077", appName = "dsxlRstudioSpark221")
See SparkR (R on Spark) for more information.
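For the sparklyr library, a minimal sketch of the spark_connect() call follows; the master URL, Spark home path, and application name are assumptions that mirror the SparkR example and might differ on your cluster:
library(sparklyr)
# connect to the Spark 2.2.1 service instead of the default Spark 2.0.2 service
sc <- spark_connect(master = "spark://spark-master221-svc:7077",
                    spark_home = "/usr/local/spark-2.2.1-bin-hadoop2.7",
                    version = "2.2.1",
                    app_name = "dsxlRstudioSparklyr221")
When you are done, spark_disconnect(sc) closes the connection.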
Transfer files to and from your user project folder
Using the file explorer in RStudio, a Watson Studio Local user can upload and download files between their project folder and a local disk outside of the cluster:
- To download an RStudio file, select it, click more, and click Export to save the file to your local disk. To upload an RStudio file, click Upload and select the file to upload.
- To download a Jupyter file, click ..., type ~/../jupyter, select the file, click more, and click Export to save the file to your local disk. To upload a Jupyter file, click Upload and select the file to upload.
Learn more
Read and write data to and from IBM Cloud object storage in RStudio