Analyzing data with RStudio (RStudio Server Runtimes)
R is a popular statistical analysis and machine-learning package that enables data management and includes tests, models, analyses and graphics. RStudio provides an integrated development environment for working with R scripts.
Service: The RStudio Server Runtimes service is not available by default. An administrator must install this service on the IBM Cloud Pak for Data platform. To determine whether the service is installed, open the Services catalog and check whether the service is enabled.
- Required services
- RStudio Server with R 4.2
- Watson Studio
- Optional service
- Watson Machine Learning
You can use the RStudio IDE in a project with or without Git integration.
RStudio without Git integration
When you work in RStudio from a project that does not have integration with a Git repository, you can create R scripts or Shiny apps, and work with data assets from the project, but you can't add your scripts or Shiny apps to the project as assets to share with other users.
However, you can deploy applications from a deployment space by using a code package. See Adding code packages to a deployment space.
RStudio with Git integration
When you work in RStudio from a project that is integrated with a Git repository, you can share your work on R scripts and Shiny apps with other users in your project.
The Git extension is pre-installed. When you launch RStudio, you have access to the repository that you associated with your project, and the Git tab is added to the RStudio toolbar.
If you have the Watson Machine Learning service installed, you can deploy your applications in a deployment space as URLs that are accessible to users.
You can associate a project with a Git repository only while you're creating the project. The integration with Git differs depending on whether you create a project with default Git integration or create a project using deprecated Git integration. See Creating projects with Git integration.
Accessing RStudio
You access RStudio from within a project. The RStudio IDE runs in an RStudio environment. A default RStudio environment is included with the RStudio Server Runtimes service. You can also create custom RStudio environment templates if you have the execution engine for Apache Hadoop. See RStudio environments.
To start RStudio in your project:
- Click RStudio from the Launch IDE menu on your project's action bar.
- Select an environment.
- Click Launch.
The environment runtime is initiated and the development environment opens.
Sometimes an RStudio session does not start because the RStudio state from a previous session is corrupted. If this happens, select the option to reset the workspace when you select the RStudio environment, and then start the RStudio IDE again. Resetting the workspace starts RStudio with the default settings and a clean RStudio workspace.
The RStudio IDE launch might fail when it tries to allocate the memory needed to handle large R objects. This happens because, by default, RStudio ignores the rlimit_data setting. To enforce a memory limit and prevent the IDE from crashing, patch the RStudio custom resource (CR) by using this command:
oc patch rstudioaddon ibm-cpd-rstudio-rt231 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--type=merge \
--patch='{"spec": {"useRStudioDataLimits": true}}'
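A merge patch body must be valid JSON, so a key and its value are separated by a colon, not an equals sign. As a minimal sketch, the patch body can be checked locally before it is passed to oc. The validate_patch helper below is hypothetical and not part of the product, and it assumes that python3 is available on the workstation:

```shell
#!/bin/sh
# Hypothetical helper (not part of Cloud Pak for Data): report whether a
# patch body parses as JSON. Assumes python3 is on PATH; the json.tool
# module exits with a non-zero status when the input fails to parse.
validate_patch() {
  if printf '%s' "$1" | python3 -m json.tool >/dev/null 2>&1; then
    echo "valid"
  else
    echo "invalid"
  fi
}

# The patch body used with the oc patch command.
PATCH='{"spec": {"useRStudioDataLimits": true}}'
validate_patch "$PATCH"
```

If the helper prints invalid, correct the patch body before you run the oc patch command.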