Enabling access to a Git repository

To support collaboration with stakeholders and the data science community at large, you can integrate your project with Git. You can use this integration to work with Python scripts and notebooks in JupyterLab, to work with R scripts and Shiny apps in RStudio, to export project assets, or to back up assets for source code management purposes.

Git in analytics projects

Analytics projects allow you to work with code assets, such as notebooks or Python and R scripts; tool-specific assets, such as Data Refinery flows and Decision Optimization models; and data assets, such as CSV files.

These assets are stored in different subdirectories of the Git repository. JupyterLab code assets are stored in assets/jupyterlab, RStudio code assets in assets/rstudio, and data assets in assets/data_asset. All other directories in the repository store the tool-specific assets along with other metadata for the project.
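
As an illustration of this layout, the following Python sketch maps a repository path to the kind of asset it holds. It is not part of Watson Studio: the function name classify_repo_path and the example file names are hypothetical, and only the three directory names come from the description above.

```python
from pathlib import PurePosixPath

# Directory names as described above; everything else is treated as
# tool-specific assets or project metadata.
ASSET_DIRS = {
    ("assets", "jupyterlab"): "JupyterLab code asset",
    ("assets", "rstudio"): "RStudio code asset",
    ("assets", "data_asset"): "data asset",
}


def classify_repo_path(path: str) -> str:
    """Return the asset category for a path relative to the repository root."""
    parts = PurePosixPath(path).parts
    # Only the first two path components decide the category.
    return ASSET_DIRS.get(parts[:2], "tool-specific asset or project metadata")


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the function.
    for example in ("assets/jupyterlab/train.ipynb", "assets/data_asset/sales.csv"):
        print(example, "->", classify_repo_path(example))
```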

Assets in the Git repository can be added as project assets. All these assets are listed on the project’s Assets page, including the code assets, but the code assets themselves are read-only and can be edited only in the appropriate IDE.

You must enable Git integration when you create the project, for any of the following purposes:

  • To work with notebooks or Python scripts in the JupyterLab IDE
  • To work with R scripts and Shiny apps in RStudio
  • To enable on-demand synchronization between project assets and a Git repository. This enables:

    • Pulling notebook and Python script changes pushed from JupyterLab
    • Pulling R Shiny apps and R scripts pushed from RStudio
    • Exporting assets from the project to enable creating a new project from Git
  • To create a project from assets that were exported to a Git repository

Important:

  • Each project must have its own Git repository. No two projects can use the same Git repository. This is especially relevant when working in projects with JupyterLab and RStudio. You can’t create a project to work with JupyterLab and associate your new project with the Git repository that you used in another project with JupyterLab.
  • Git integration in a project is only available to enterprise users with a valid certificate to the platform on which they create the Git repository to associate with the project.

Enterprise-grade Git instance certificates

The enterprise-grade instances of Git that you can associate with a project in Watson Studio, namely GitHub Enterprise, GitLab Self-Managed, and Bitbucket Enterprise, mostly use a publicly trusted CA-signed certificate for secure Git client traffic. This CA-signed certificate is used automatically to authenticate to the platform when the platform access token is created and does not have to be provided during project creation. Sometimes, however, an enterprise-grade Git platform might use a self-signed certificate for authentication. If a self-signed certificate is used, you need to provide the certificate details when you create the platform access token. Ask your Git administrator for those details.

You can override a certificate by creating a new platform access token and providing the new certificate details if:

  • You need to move from a CA certificate to a self-signed certificate

    All new projects will start using the new certificate. To use the Git platform in existing projects, you need to contact IBM Support.

  • You need to update a self-signed certificate, for example because it expired and is no longer valid

    All new and existing projects will start using the new certificate.

  • You need to move from a self-signed certificate to a CA certificate

    All new and existing projects will start using the new certificate. This is the only time you need to provide the CA certificate details.
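
Certificate details are pasted into the Git integration dialog in PEM format when you create the token. Before pasting a self-signed certificate, it can help to confirm that the text at least has the PEM envelope. The following Python sketch is an illustration only and is not part of Watson Studio. The function name looks_like_pem_certificate is hypothetical, and the sketch assumes a single certificate rather than a chain. It checks the BEGIN/END markers and the Base64 encoding, not whether the content is a valid X.509 certificate.

```python
import base64
import binascii
import re

# One certificate: header line, Base64 body, footer line.
PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)"
    r"\s*-----END CERTIFICATE-----"
)


def looks_like_pem_certificate(text: str) -> bool:
    """Return True if text has a PEM certificate envelope around valid Base64."""
    match = PEM_CERT_RE.fullmatch(text.strip())
    if match is None:
        return False
    # Remove line breaks before decoding the Base64 body.
    body = "".join(match.group("body").split())
    try:
        base64.b64decode(body, validate=True)
    except binascii.Error:
        return False
    return True


if __name__ == "__main__":
    # Placeholder body, not a real certificate.
    sample = "-----BEGIN CERTIFICATE-----\nTUlJQg==\n-----END CERTIFICATE-----\n"
    print(looks_like_pem_certificate(sample))
```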

Access a Git repository

If you are using a Git server, you will need to allow network traffic between the compute nodes on your Watson Studio cluster and your Git server. Check with your Git server provider to learn the specific ports that are used by your Git server.
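
One way to confirm that traffic is allowed is to attempt a TCP connection to the Git server from a machine on the same network as the compute nodes. The following Python sketch is a minimal example under stated assumptions: the function names git_endpoint and can_connect are hypothetical, and the default ports are only the conventional ones for each protocol. Your Git server might use different ports, so confirm them with your provider.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit

# Conventional defaults only; your Git server may be configured differently.
DEFAULT_PORTS = {"https": 443, "http": 80, "ssh": 22}


def git_endpoint(url: str) -> Tuple[str, int]:
    """Extract the host and port to test from a Git server URL."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host: " + url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError("No port given and no default known for: " + url)
    return parts.hostname, port


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example domain taken from this documentation; replace it with your own.
    host, port = git_endpoint("https://dse-bitbucket.mylab.mycompany:8443")
    print(host, port, can_connect(host, port))
```

A successful connection shows only that the port is reachable from where the script runs. It does not verify the certificate or the access token.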

Follow these steps to get access to a Git repository when creating a project:

  1. For an empty project: Select to integrate the project with Git. This selection enables synchronization between the project and the Git repository after the project is created.
  2. Select an existing access token or create a new one. To create a token:

    1. Click New Token.
    2. Select the platform you want to create a personal access token for. The following Git servers are supported:

      • GitHub
      • GitHub Enterprise
      • GitLab
      • GitLab Self-Managed
      • Bitbucket
      • Bitbucket Enterprise
    3. Log in to the platform and follow the instructions to generate a new token with repository scope. The token must have read and write access to the repository.

      Note for Bitbucket Enterprise: You must have admin rights to create a token.

      Tokens to Git repositories are managed at the user level, not at the project level. This means that every user must create their own token.

    4. Copy the newly generated access token and paste it in the Git integration dialog window.
    5. Optional for GitHub Enterprise, GitLab Self-Managed, and Bitbucket Enterprise if you need to provide a self-signed certificate or override an existing certificate: Paste the certificate details that you got from your Git administrator in the Git integration dialog window.

      These details must be in PEM format.

    6. For GitHub Enterprise, GitLab Self-Managed, and Bitbucket Enterprise only: Enter the domain name and your user name.

      The domain name URL that you enter when you create the access token must use the https protocol and must not end with a forward slash (/). An example is: https://dse-bitbucket.mylab.mycompany:8443. Note that existing Git Enterprise tokens that don’t use the https protocol will not work. You must create new ones that use a correct domain name URL.

    7. Give the generated token a name.
  3. Select this token on the create project page.
  4. Enter the URL to a repository in the platform you selected. For example, to a repository in GitHub, enter https://github.com/myName/projectrepo.git.

    Note for Bitbucket Enterprise only: You must enter the URL to your repository in the following format: https://<repo-url>.git. Do not include your user name in the URL.

    Repository restrictions when creating a project:

    • For an empty project:
      • The repository doesn’t have to be empty; however, it can’t contain an exported project. All files in the repository will be deleted during project creation.
      • If the repository is locked by another project, you can reassign the repository to your new project by deleting the .project-lock.json file. The other project loses access to the repository.
    • For a project created from Git:
      • The repository must contain exported project assets.
      • The repository can be locked by another project. To enable synchronization between the project and the repository, you must delete the .project-lock.json file in the repository. The other project loses access to the repository.

      Important: If you want to collaborate with other users on files in JupyterLab or RStudio, you must invite them to the project as collaborators and give them the Editor or Admin role for the project. You must also give those users read/write access to the Git repository associated with the project.

  5. After the repository is validated, select a branch. This branch is the main branch for your overall Git workflow and cannot be changed. All push or pull operations are performed in that branch.
  6. For a project created from Git: If you deleted the .project-lock.json file, select to enable on-demand synchronization between the project and the Git repository.
  7. To work in the JupyterLab IDE, select to edit notebooks only in JupyterLab.
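
The URL rules in the steps above can be checked before you paste values into the dialog. The following Python sketch is an illustration, not the validation that Watson Studio itself performs. The function names check_domain_url and check_repo_url are hypothetical, and bitbucket.example.com is a placeholder host. The first function covers the domain name rules for enterprise platforms; the second covers the repository URL format described for Bitbucket Enterprise.

```python
from typing import List
from urllib.parse import urlsplit


def check_domain_url(url: str) -> List[str]:
    """Return problems with an enterprise domain name URL (empty list if none)."""
    problems = []
    if urlsplit(url).scheme != "https":
        problems.append("must use the https protocol")
    if url.endswith("/"):
        problems.append("must not end with a forward slash")
    return problems


def check_repo_url(url: str) -> List[str]:
    """Return problems with a repository URL of the form https://<repo-url>.git."""
    problems = []
    parts = urlsplit(url)
    if not parts.path.endswith(".git"):
        problems.append("must end with .git")
    if parts.username is not None:
        problems.append("must not include a user name")
    return problems


if __name__ == "__main__":
    print(check_domain_url("https://dse-bitbucket.mylab.mycompany:8443"))
    print(check_repo_url("https://jdoe@bitbucket.example.com/scm/proj/repo"))
```

An empty list means that the URL passes these particular checks. The platform can still reject a URL for other reasons, for example during repository validation.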
