Projects
A project is a collection of assets that you use to achieve a particular data analysis goal.
Your project assets can include:
- Notebooks
- RStudio files
- Models
- Data sets
- Scripts
You can also export or import a project as a ZIP or TAR.GZ file.
A sample project named dsx-samples is available to all users, with sample notebooks to help you get started. In this sample project, you can create new notebooks, models, scripts, and data sets, but you cannot add jobs, collaborators, or SPSS Modeler flows.
Tasks you can perform:
- Create a project
- Manage collaborators
- Manage assets
- Add data sources
- Publish assets
- Export a project
- Rename a project
- Delete a project
- View all projects
- Create a script
- Set up runtime environments
- Run jobs in the background
Create a project
To create a project, go to the Projects list and click Add Project.
- For a new blank project, click the New tab.
- To import a preexisting project from your local device, click the From file tab and upload the ZIP or TAR.GZ file. Restriction: Data source credentials are not imported. For any data sources with credentials, you must open the imported project and specify the credentials for the data source again.
- To import your project from a GitHub, GitHub Enterprise, BitBucket, or BitBucket Server repository, click the From Git repository tab. To set up the connection to the repository, you must save a GitHub or BitBucket personal access token. See Import and commit projects on a Git repository for details on how to work with a Git repo. Note that you can only create one project from a Git repo.
- Project names
- A project name cannot contain any special characters, and it cannot contain spaces. If your project connects to a Git repository, then you cannot edit the project name.
- Asset names
- Asset names, including data source and remote data set names, cannot contain any special characters. If your project connects to a Git repository, then use the Git repo guidelines for notebook names.
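The project naming rule above can be checked before you create or rename a project. The following is a minimal sketch, not the product's own validation: it assumes that "special characters" means anything other than ASCII letters, digits, hyphens, and underscores, and the function name `is_valid_project_name` is hypothetical.

```python
import re

# Assumption: only ASCII letters, digits, hyphens, and underscores are allowed.
# The product's actual definition of "special characters" may differ.
_PROJECT_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if the name has no spaces and no special characters."""
    return _PROJECT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("churn_analysis_2019", "churn analysis", "churn@analysis"):
        print(candidate, is_valid_project_name(candidate))
```

Asset names follow a similar rule, but the text above does not say whether spaces are allowed in them, so this sketch covers project names only.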
| Project type | Collaboration privileges | Master repository | Repository copy |
|---|---|---|---|
| Standard | Managed in the product | Master repository exists in the cluster file system | Each collaborator gets a copy |
| GitHub | Managed outside of the product | Master repository exists in GitHub | Each user gets a copy when the project is imported from GitHub |
| BitBucket | Managed outside of the product | Master repository exists in BitBucket | Each user gets a copy when the project is imported from BitBucket |
Click Create. Your new project opens and you can start adding collaborators and assets to it.
Manage collaborators
If you have Admin permissions for a project, you can add collaborators, change collaborator permissions, or remove collaborators from that project on its Collaborators page.
The collaborator permissions are:
- Viewer
- Can view the project, accept changes, and commit changes to their own local copy of the project.
- Editor
- Can control project assets. Can accept, commit, and push changes.
- Admin
- Can control project assets, collaborators, and settings. Can accept, commit, and push changes.
The actions that are available for a project from the Git Actions icon in the project action bar depend on your collaborator permissions. An Admin or Editor can commit and push changes. A Viewer can add an asset but cannot push changes. An Admin, Editor, or Viewer can pull changes from the master repository or reset the project to what is currently in the master repository.
You can click Leave next to a project to remove yourself from it. However, if you are the only collaborator with Admin permissions, you must give another collaborator Admin permissions before you can leave the project.
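The permission levels and the rule for leaving a project can be summarized as a small lookup table. This is an illustrative sketch only: the action labels (`"push"`, `"manage_assets"`, and so on) and the functions `can` and `can_leave` are hypothetical names chosen for this example, not part of any product API.

```python
from enum import Enum
from typing import Dict


class Role(Enum):
    VIEWER = "Viewer"
    EDITOR = "Editor"
    ADMIN = "Admin"


# Hypothetical action labels that mirror the descriptions above.
# A Viewer's "commit" applies only to their own local copy of the project.
_COMMON = {"view", "pull", "reset", "commit"}
PERMISSIONS = {
    Role.VIEWER: set(_COMMON),
    Role.EDITOR: _COMMON | {"push", "manage_assets"},
    Role.ADMIN: _COMMON
    | {"push", "manage_assets", "manage_collaborators", "manage_settings"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS[role]


def can_leave(collaborators: Dict[str, Role], user: str) -> bool:
    """A sole Admin must promote another collaborator before leaving."""
    if collaborators[user] is not Role.ADMIN:
        return True
    return any(
        role is Role.ADMIN
        for name, role in collaborators.items()
        if name != user
    )


if __name__ == "__main__":
    team = {"ana": Role.ADMIN, "bo": Role.EDITOR}
    print(can(Role.VIEWER, "push"))   # Viewers cannot push
    print(can_leave(team, "ana"))     # ana is the only Admin
```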
Manage assets
If you have Admin or Editor permissions on a project, you can add assets from its Assets page.
If you have Admin permissions on a project, you can delete an asset by clicking Delete next to the asset.
Add data sources
A data source provides data for your project. For example, a data source can be a database table or data stream. A data source allows you to securely store information about your database and credentials. To add a data source, go to the Data Sources page in your project.
Publish assets
A project Editor or Admin can share a read-only copy of an asset either within the community or with people outside of the product. For example, you can share a Jupyter notebook, PDF, text file, or PNG graphic. To do so, go to the Assets page and click Publish next to the file. The publish action creates a read-only snapshot of the current version of the asset and copies it to a published content directory in the user-home file system (if the file already exists, then it is versioned). Except for models, it also automatically generates a URL where the asset can be viewed.
The following assets can be published:
- Jupyter notebook content
- "Local" data sets
- R Shiny web apps
You can set the following content visibility permissions for the published asset (except for models):
- All users with the URL (anyone outside of the product can view it).
- Any authenticated user (only signed-in users can view it).
- Restricted to members in the selected project (only collaborators in the selected project can view it). You can publish only to projects that you have Admin access to, and you cannot publish an asset to a project that was imported from GitHub (because these are not managed projects).
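The three visibility levels above amount to a simple access check. The sketch below only illustrates that logic under stated assumptions: the `Visibility` enum, the `can_view` function, and the representation of an unauthenticated visitor as `None` are all hypothetical and do not reflect how the product implements it.

```python
from enum import Enum
from typing import Optional, Set


class Visibility(Enum):
    ANYONE_WITH_URL = "All users with the URL"
    AUTHENTICATED = "Any authenticated user"
    PROJECT_MEMBERS = "Restricted to members in the selected project"


def can_view(
    visibility: Visibility,
    user: Optional[str],
    project_members: Set[str],
) -> bool:
    """Decide whether a visitor may view a published asset.

    `user` is None for a visitor who is not signed in (an assumption
    made for this sketch).
    """
    if visibility is Visibility.ANYONE_WITH_URL:
        return True
    if user is None:
        return False
    if visibility is Visibility.AUTHENTICATED:
        return True
    return user in project_members


if __name__ == "__main__":
    members = {"ana", "bo"}
    print(can_view(Visibility.PROJECT_MEMBERS, "cy", members))
```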
If you publish a Jupyter notebook, then the published copy is automatically converted to HTML. You can publish the notebook with the following options:
- You can either rerun the entire notebook (which might take a while) or publish it as-is.
- You can either include code cells in the published copy, or hide the code cells so that only the output appears.
If you publish an R Shiny app, then the URL displays it as an interactive UI where users can dynamically input their own variables to explore trends.
Except for models, a permalink URL to the published asset is automatically generated, and you can copy it. Alternatively, users can view the published asset in the Published Assets page. The Published Assets page shows only the assets that the signed-in user has permission to view. To unpublish a file, go to the Published Assets page and click Unpublish next to it.
Export a project
You can download a project as a ZIP or TAR.GZ file by clicking Export as next to it. Note that the environments in the project are not exported.
- Data source credentials
- When you import this project, data source credentials will not be imported. For any data sources with credentials, you will need to open the imported project and specify the credentials for the data source again.
- Export timeout
- There is no set limit on project size, but if your cluster is very busy or the export file is 10 GB or larger, the export generally does not complete within 10 minutes. If Watson Studio cannot create the export file within 10 minutes, the request results in a 504 timeout. Try again later when the cluster is less busy. If your project contains large data files, consider moving them from the project to a library to reduce the export size. If you don't have any large data files and don't expect a large export file, use the terminal window to look for unnecessary files (for example, core files) that can be deleted.
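If you have terminal access, a short script can help with the cleanup described under Export timeout. The following is a rough sketch under several assumptions: you know the path of the project directory, core files are named `core` or `core.<pid>`, and the 100 MB threshold for a "large" file is an arbitrary choice. The total it reports is the uncompressed size on disk, which only approximates the size of the export file.

```python
import os
import re
from typing import List, Tuple

TEN_GB = 10 * 1024**3  # export size at which timeouts become likely
# Assumption: core dumps are named "core" or "core.<pid>".
CORE_FILE = re.compile(r"core(\.\d+)?")


def scan_project(
    root: str, large_file_bytes: int = 100 * 1024**2
) -> Tuple[int, List[Tuple[str, int]], List[str]]:
    """Return (total bytes, large files sorted by size, core files)."""
    total = 0
    large: List[Tuple[str, int]] = []
    cores: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file disappeared or is unreadable
            total += size
            if CORE_FILE.fullmatch(name):
                cores.append(path)
            elif size >= large_file_bytes:
                large.append((path, size))
    large.sort(key=lambda item: item[1], reverse=True)
    return total, large, cores


if __name__ == "__main__":
    total, large, cores = scan_project(".")  # run from the project directory
    print(f"Total size: {total / 1024**3:.2f} GB (10 GB or more may time out)")
    for path, size in large:
        print(f"large: {path} ({size / 1024**2:.0f} MB)")
    for path in cores:
        print(f"core file: {path}")
```

The script only reports candidates; review the list before you move or delete anything.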
Rename a project
If you have Admin permissions on a project, you can rename the project by clicking Rename next to it. This renames the project for all of the collaborators and automatically stops the Admin's runtime environments that are active for that project.
When the renaming completes, any access to notebooks or RStudio automatically starts the runtime environments in the context of the new project. The Admin can also start them manually in the Environments page. Because the containers are not stopped for the other collaborators, each collaborator must stop the runtime environments that are associated with the old project name in the All Active Environments page. Any subsequent access to notebooks or RStudio then automatically starts the runtime environments with the correct project name context, or the collaborator can start the runtimes manually in the Environments page.
Collaborators should verify that assets like notebooks and scripts do not directly specify the project name, for example, in any of the paths (the paths should always be relative for portability).
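One way to perform this check is to search the project's notebooks and scripts for the old project name. The sketch below assumes that you can read the project files from a terminal and that a plain substring match is good enough. The function name `find_project_name_references` and the list of file extensions are hypothetical choices for this example.

```python
import os
from typing import List, Tuple


def find_project_name_references(
    root: str,
    project_name: str,
    extensions: Tuple[str, ...] = (".ipynb", ".py", ".R"),
) -> List[Tuple[str, int]]:
    """Return (file path, line number) pairs that mention the project name."""
    hits: List[Tuple[str, int]] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" avoids failures on files with odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if project_name in line:
                        hits.append((path, lineno))
    return hits


if __name__ == "__main__":
    # "old_project_name" is a placeholder; substitute the previous name.
    for path, lineno in find_project_name_references(".", "old_project_name"):
        print(f"{path}:{lineno}")
```

A substring match can report false positives, for example when the project name is also a common word, so treat the results as a list of lines to review rather than a list of errors.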
Delete a project
If you have Admin permissions on a standard project, you can delete it by clicking Delete next to it. This deletes the project for all of the collaborators, and deletes all assets (and the storage directories) associated with the project.
If an Admin deletes a GitHub project, then only the Watson Studio Local copy of the project will be deleted (not the remote repository on GitHub).
View all projects
Click the Tree View icon to view all projects in the system and expand their contents. You can click any folder, Jupyter notebook, or CSV file to preview it.