Working in RStudio with deprecated Git integration (Watson Studio)
You can create R scripts and R Shiny applications in projects with deprecated Git integration.
R Shiny is an R package that makes it easy to develop interactive web applications straight from R. You can create, develop and refine Shiny apps in RStudio, whether to create a unique data visualization dashboard or publish applications into different places, for example to deployment spaces.
Creating R scripts and Shiny apps
When RStudio starts, the directory browser opens at the lower right. If you are working in RStudio with Git, navigate to <your_git_repo>/assets/rstudio to ensure that all your files sync from that folder. You can create as many subfolders as needed with different R files.
The Git extension is preinstalled, enabling access to the repository that you associated with your project at launch time and adding the Git tab to the RStudio toolbar.
The Git repository referenced in the project is cloned by the RStudio environment at launch time and can be viewed in the files browser at the lower right of the IDE GUI in the folder called project_git_repo/<your_git_repo>.
To sync with Git, you must make all changes to your R files in that folder. Files that you do not need to sync can be saved anywhere.
Note that if a folder or subfolder contains R Shiny app files (that is, files with the names app.R, ui.R, or server.R), all files in that folder are considered to belong to the Shiny app (including .R files). Otherwise, all .R files are considered R script assets.
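The naming rule in the note above can be sketched in a few lines of R. This is only an illustration of the rule as stated; `classify_folder`, the demo folder names, and the return labels are hypothetical and do not reflect how Watson Studio itself detects assets.

```r
# File names that mark a folder as a Shiny app, per the note above
shiny_markers <- c("app.R", "ui.R", "server.R")

# Hypothetical helper: classify one folder by the files directly inside it
classify_folder <- function(dir) {
  files <- list.files(dir)
  if (any(shiny_markers %in% files)) {
    "shiny_app"      # every file in the folder belongs to the Shiny app
  } else if (any(grepl("\\.R$", files))) {
    "r_scripts"      # each .R file is treated as an R script asset
  } else {
    "no_r_assets"
  }
}

# Two throwaway folders to try the rule on
demo <- file.path(tempdir(), "classify_demo")
dir.create(file.path(demo, "dashboard"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo, "etl"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo, "dashboard", c("app.R", "helpers.R")))
file.create(file.path(demo, "etl", "clean.R"))

dashboard_kind <- classify_folder(file.path(demo, "dashboard"))
etl_kind <- classify_folder(file.path(demo, "etl"))
```

In this sketch, helpers.R counts as part of the Shiny app because it shares a folder with app.R, which is why keeping standalone scripts in their own subfolder is the safer layout.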
- Optional: Add collaborators to the project if you want to work on the same file with others. See Collaboration.
- Optional: Preinstall any R libraries that you source for your Shiny app from an external network at the global location /cc-home/_global_/R or in a persistent storage volume to avoid installing these libraries every time the Shiny app is deployed. Ensure that you are connected to the storage volume when you deploy the Shiny app.
- Start working on R scripts:
  - Select New File > R Script or upload an R file from your local machine.
  - When you are finished working on the files, click File > Save to save your changes to your local clone before you commit to the Git repository.
- Or start working on Shiny apps:
  - Click New File > Shiny Web App... to open the window for creating a new Shiny application.
  - Enter a name for your Shiny application and leave userfs as the Create within directory setting. You must work in this directory or any of its subdirectories to enable syncing with the Git repository. Both app.R and ui.R/server.R contain the instructions needed to build your app and provide a sample app that you can test run.
    - Create a single-file application (app.R) if your application is simple and can be contained within one file.
    - Create a multiple-file application (ui.R/server.R) if your application is more complex and its different facets need to be edited separately.
  - When you are done with the configuration, click Create.
  - To test run your app, click Run App. A pop-up window that contains your application opens.
- You can use data from a data set in your scripts or apps. Supported data set formats include text, CSV, SPSS, SAS, and Stata. To use data assets that are already imported into the project, click Import dataset under the Environment tab, or click File and browse for the file under userfs/assets/data-asset. To upload files from your local machine, click Upload in the Data panel on the lower right. You can preview the data assets in the editing panel.
  Note: Data sets larger than 5 MB cannot be previewed in RStudio.
- Push your file changes to the Git repository by using the Git button on the top menu bar of the main editing panel. Click Commit.
  - Select all the files that you changed and want to push to the Git repository. Add a change description and commit your staged changes to the local clone of your repository in your RStudio session.
  - Click Push to push your changes to the remote repository, where other users can see and access them. Resolve any merge conflicts that might be caused by competing changes to files you are collaborating on. By clicking Pull in the Git actions panel, you can also pull file changes made by collaborators to your repository clone.
- After you have pushed your changes, sync the changes made to the Git repository with the R scripts in your project. See Syncing Git changes with your project.
  By syncing the R file changes in Git with the project, you update the common shared project clone to reflect what was last pushed to the Git repository.
  The R files appear as project assets, which you can then click to preview and promote to a deployment space. Regular code and text files can be previewed in Watson Studio; other file types cannot. Note that you cannot edit, run, or sync R files without first launching RStudio.
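For the optional step of preinstalling libraries at the global location, a minimal sketch in R might look like the following. The path /cc-home/_global_/R comes from the step above; the helper `missing_packages` and the package name in the commented-out lines are hypothetical, and the install call is left commented because it needs network access and write permission to that location.

```r
# Global library location named in the procedure above
global_lib <- "/cc-home/_global_/R"

# Put the global library first on the search path when it is available,
# so that library() finds preinstalled packages there
if (dir.exists(global_lib)) {
  .libPaths(c(global_lib, .libPaths()))
}

# Hypothetical helper: which of 'pkgs' are not yet installed in library 'lib'?
missing_packages <- function(pkgs, lib) {
  setdiff(pkgs, rownames(installed.packages(lib.loc = lib)))
}

# Install only what is missing (placeholder package name; uncomment to run):
# to_install <- missing_packages(c("plotly"), global_lib)
# if (length(to_install) > 0) install.packages(to_install, lib = global_lib)
```

Checking what is already present before calling install.packages keeps repeated runs of the same setup script cheap.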
Collaboration
With the Git version control system added through the Git extension in RStudio, users can share their work on files in RStudio. To enable sharing when working on files, users must be added to the project as collaborators and must have access to the associated project Git repository.
To enable users in a project to collaborate on file changes in RStudio:
- Add users as collaborators to the project and assign them either the Admin or Editor role. You can invite only users who have an existing IBM Cloud Pak for Data account. See Adding collaborators.
- Give all collaborators the appropriate access permissions to the project Git repository.
- Instruct all collaborators to create their own personal access token for the associated project repository. See Creating personal access tokens for Git repositories.
  When you open RStudio, you will see your personal Git access token in the list. Select it to begin working on the RStudio project.
Storing intermediate .rda files
You can store any intermediate files, for example .rda and .md files, log files, or text files, in the directory /project_data_folder/data_asset, which is part of the project clone and can therefore be accessed by all project collaborators and in R Shiny applications or jobs that run R scripts.
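As a minimal sketch, an intermediate object can be written to and restored from an .rda file with base R's save() and load(). The directory is the one named above; the fallback to a temporary folder, the object `model_input`, and the file name are assumptions made so that the example runs anywhere.

```r
# Directory from the text; fall back to a temp folder when it does not exist
data_dir <- "/project_data_folder/data_asset"
if (!dir.exists(data_dir)) data_dir <- tempdir()

# Hypothetical intermediate result
model_input <- data.frame(id = 1:3, score = c(0.2, 0.5, 0.9))

# Write the object to an .rda file
rda_path <- file.path(data_dir, "model_input.rda")
save(model_input, file = rda_path)

# Later (for example in a Shiny app or a job), restore it by name
rm(model_input)
loaded_names <- load(rda_path)   # returns the names of the restored objects
```

Because load() restores objects under their original names, it can overwrite variables in the current session; saveRDS() and readRDS() are a common alternative when a single object should be read into a name of your choosing.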
Working with data files
In RStudio, you can work with data files from different sources:
- Files in the RStudio server file structure, which you can view by clicking Files in the lower right section of RStudio. This is where you can create folders, upload files from your local system, and delete files.
  To access these files in R, set the working directory to the directory that contains the files. To do this, navigate to that directory and click More > Set as Working Directory.
  Be aware that files stored in the Home directory of your RStudio instance persist within your instance only and cannot be shared across environments or within your project.
- Project data assets, which you can view by clicking Files > Home in the lower right section of RStudio. In projects with deprecated Git integration, the data assets are in the folder called project_data_asset. In projects with default Git integration, the data assets are in the folder called assets/data_asset. To view the contents of a file or import the data set, click the asset.
  If you add a data file to this folder, the file is not added as a data asset to the project. To add data files as project data assets, see Adding project assets.
  Connected data assets cannot be opened and viewed in the project_data_asset directory. You can only access connected data assets programmatically from an R script in RStudio.
- Data stored in a database system.
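For files in the RStudio server file structure, the scripted counterpart of More > Set as Working Directory is a call to setwd(). The sketch below assumes a hypothetical folder and CSV file, created in a temporary location so that the example is self-contained.

```r
# Hypothetical folder and file, created only for this example
work_dir <- file.path(tempdir(), "rstudio_files")
dir.create(work_dir, showWarnings = FALSE)
write.csv(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  file.path(work_dir, "sample.csv"),
  row.names = FALSE
)

# Set the working directory, then read the file by its relative name
old_dir <- setwd(work_dir)        # setwd() returns the previous directory
sample_data <- read.csv("sample.csv")
setwd(old_dir)                    # restore the previous working directory
```

Restoring the previous working directory at the end keeps later relative paths in the same session from resolving against the wrong folder.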
Loading and accessing data
Data loading options per compute engine type
Data loading options | Anaconda R distribution | R + Spark |
---|---|---|
Load data into a sparkSessionDataFrame | ✓ | |
Load data into an R data frame | ✓ | ✓ |
Generating code that loads data directly to RStudio
Loading data from local files
To generate code that inserts data from local files to RStudio:
- Click the Code snippets icon and then click Read data.
- Select the data source from your project and then select Copy to clipboard.
- Paste the code in the RStudio file editor.
Supported file types:
- CSV/delimited files
- Excel files (.xls, .xlsx, .xlsm)
- JSON files
- SAS files
Loading data from data source connections
Before you can load data from an IBM data service or from an external data source, you must create or add a connection to your project. See Adding connections to projects.
To generate code that inserts data from database connections to RStudio:
- Click the Code snippets icon and then click Read data.
- Select the connection from your project.
- Select the data source from the connection and then select Copy to clipboard.
- Paste the code in the RStudio file editor. The generated code serves as a quick start to begin working with a data set or connection. For production systems, carefully review the inserted code to determine whether you should write your own code that better meets your needs.
- If necessary, enter your personal credentials for locked data connections that are marked with the Key icon. This is a one-time step that permanently unlocks the connection for you. After you have unlocked the connection, the key icon is no longer displayed. See Adding connections to projects.
- If no code can be generated for the connection, load the credentials and open the database connection that references your credentials. Write code to load the data.
RStudio supports the same database connections as Jupyter notebooks. For details, see Data load support in notebooks.
Adding or deleting project assets
To use data files in RStudio, upload them by clicking the Upload asset to project icon on your project's Assets page. Files that are uploaded this way are automatically added as data assets to your project.
If you uploaded or created data files in RStudio instead, you can add these files to your project as project data assets. These files must be in the Home/project_data_asset folder in RStudio. To add these files as data assets to the project:
- On the Assets page of the project, click Import assets.
- Select Project files and the file in the Home/project_data_asset folder that you want to add to the project as an asset.
If you delete a data asset from the Home/project_data_asset folder in RStudio, you must also delete the data asset in the project:
- On the Assets page of the project, select the data asset that you want to delete.
- Select Delete from the options list.
Running an R script as a job
You can run the script as a job in an RStudio environment in Watson Studio or on a remote Hadoop cluster. See:
- To create a job to run an R script in an RStudio environment, see Creating jobs for files in a project with deprecated Git integration.
- To create a job to run an R script on a Hadoop cluster, you need a Hadoop cluster that supports R and R scripts, and you must enable the feature on the Hadoop cluster by modifying a configuration file. For details, see the scriptLanguages subsection under Details on the content of the json files in Administering Apache Hadoop clusters. In addition, all the libraries that your R script needs must be available on the cluster.
  To run a job on the Hadoop cluster, you must first create a Hadoop environment. After you have created this Hadoop Yarn environment, you can select it when you create the job for the R script from the Assets page of the project.
Creating a Hadoop Yarn environment
- The Watson Studio administrator needs to add the Hadoop cluster configuration to your platform.
- Open the drop-down menu from the sandwich button on the Watson Studio home page, and click Configure Platform.
- Click Add Registration to add the Hadoop cluster to the project's configuration.
- Go to your project and open the Environments page. Click New template to create a custom environment.
- After you give the custom environment a name, select Hadoop as the environment type.
- Select the Hadoop configuration that you want to use.
- A Hadoop cluster that is set up for R scripts must be able to use Yarn because certain R scripts require it. If the cluster is set up correctly, a field called Execution type appears, in which you can select Yarn. If you do not see the Execution type field, your Hadoop admin has likely not set up the Hadoop cluster and configuration file to support the R environment. After the setup is done on the Hadoop side, your admin must refresh the Hadoop registration before the Execution type option becomes available. Select Yarn to run R scripts.
- Select the language, Yarn size and Yarn container memory. These fields are bounded by the admin's settings.
- Click Create to complete the creation of the environment.
- You can change the default settings of the custom environment later by clicking the environment on the Environments page, for example, to increase or decrease the memory of the Yarn container.
Creating an app deployment
If you have an R Shiny asset saved to a project, you can promote it to a deployment space, then deploy it as an app and make the URL available to users.
To create an app deployment:
- From the deployment space, click the name of the saved R Shiny app you want to deploy. The asset detail page opens.
- From the Deployments tab, click Add new deployment.
- Choose App as the deployment type.
- Provide a name and adjust any optional settings for the deployment, then click Create Deployment to create the deployment. Optional settings you can configure include:
Setting | Description |
---|---|
Software configuration | Not configurable. It must match the version of R that you used to create the asset. |
Hardware configuration | Choose a hardware configuration to match your app. |
Copies | The number of copies to create. |
Share with | Choose whether to share with anyone who has the URL, any authenticated user (logged in to watsonx), or users who are collaborators in the project. |
Working with prompts
If the watsonx.ai service is installed on your cluster, you can add various sample prompts for specific models into your R code. To add a sample prompt, click the Code snippets icon, select Prompt Engineering, and browse the various categories to find a sample prompt. When you select a prompt, click Copy to clipboard and then paste the code in the RStudio file editor.
Learn more
- Creating jobs for files in a project with deprecated Git integration
- RStudio Overview
- Hadoop Environments
- Using Spark in RStudio
- Using libs from Anaconda Repository
- Accessing data in MySQL databases by using the RMariaDB library
- Connecting your Shiny application to a persistent storage volume