Git operations

In a project with default Git integration, the Git repository associated with the project does not need to be empty when the project is created. For example, if you developed code in an IDE on your workstation and want to move to a project in Cloud Pak for Data, you can clone the repository that already contains your code and continue working in JupyterLab in your project, starting from the files pulled from the repository.

You can perform Git operations from:

  • The JupyterLab or RStudio IDEs in a project

  • A project's action bar by selecting the Git icon. You can select the following Git operations:

    • Pull: pulls the latest changes from the remote branch in the repository to the project

      If you are new to the project and haven't created an access token to the repository, you are prompted to create one.

    • Commit: commits untracked local changes to your cloned branch

    • Push: pushes your committed local changes to the remote branch so that other collaborators can pull them

    • Checkout branch: enables selecting or switching the Git branch for the project

    • Merge conflict: enables selecting which version of a file (local or remote) to keep when a merge conflict occurs

      If Git is not able to merge automatically, you can choose to abandon your own changes by using the remote copy of the file or abandon the incoming changes by using your own copy of the file.

      For any files where you want to keep both the local and remote changes, select neither local nor remote and instead resolve the conflict manually. You can use the project terminal to resolve complex conflicts manually. See Project terminal. For notebook .ipynb files, go to JupyterLab, where you can view the file changes and resolve the merge conflicts visually. See Merging conflicts.

    • Commit history: logs your commit history
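
The local-or-remote choice described under Merge conflict can be illustrated outside the UI. The following is a minimal Python sketch, not the product's implementation. It assumes the file contains Git's default conflict markers (<<<<<<<, =======, >>>>>>>) produced by a merge-based pull, where the first side is the local copy and the second side is the incoming remote copy. The function name resolve_conflicts and the sample content are hypothetical.

```python
def resolve_conflicts(text: str, keep: str) -> str:
    """Replace every conflict block in `text` with one side.

    keep="local"  keeps the lines between <<<<<<< and =======
    keep="remote" keeps the lines between ======= and >>>>>>>
    Assumes Git's default two-sided markers (no diff3 base section).
    """
    if keep not in ("local", "remote"):
        raise ValueError("keep must be 'local' or 'remote'")

    kept_lines = []
    state = "normal"  # one of: normal, local, remote
    for line in text.splitlines(keepends=True):
        if state == "normal" and line.startswith("<<<<<<< "):
            state = "local"      # start of the local side
        elif state == "local" and line.startswith("======="):
            state = "remote"     # separator: start of the remote side
        elif state == "remote" and line.startswith(">>>>>>> "):
            state = "normal"     # end of the conflict block
        elif state == "normal" or state == keep:
            kept_lines.append(line)
    return "".join(kept_lines)


# Hypothetical conflicted file content, as left behind by a pull.
conflicted = (
    "import utils\n"
    "<<<<<<< HEAD\n"
    "rows = utils.load(path)\n"
    "=======\n"
    "rows = utils.load(path, header=True)\n"
    ">>>>>>> origin/feature\n"
    "print(len(rows))\n"
)

# Abandon the local change and use the remote copy of the conflicting lines.
print(resolve_conflicts(conflicted, keep="remote"), end="")
```

Choosing a whole side works for simple conflicts only. To keep parts of both changes, edit the file manually as described above.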

Merge conflict scenarios

The following scenarios show how merge conflicts might arise when performing Git operations and how best to solve them:

  • Using private credentials for a connection:

    • User1 creates a new connection and an associated connected data asset in the project interface. User1 provides private credentials for the connection.
    • User1 commits and pushes this change to the remote branch.
    • User2 pulls this change from the remote branch and sees the new connection and the connected data asset on the project's Assets page.
    • When user2 uses the connection or the connected data asset for the first time, user2 is prompted to provide the private credentials. User1 must provide user2 with the credentials.
  • Working on the same notebook file on a feature branch:

    • User1 and user2 make conflicting changes to a notebook file, but user2 commits and pushes first.
    • When user1 pulls these changes prior to committing, user1 sees a file-level conflict in the .ipynb file.
    • User1 determines which change to keep directly in the project's Git user interface. Alternatively, user1 can go to JupyterLab to examine and merge the changes.
  • Working on the same Data Refinery flow on a feature branch:

    • User1 and user2 make conflicting changes to an existing Data Refinery flow, but user2 commits and pushes first. Changes to a Data Refinery flow affect files in both the assets/.METADATA and in the assets/data_flow directories.
    • When user1 pulls these changes prior to committing, user1 sees that there is a conflict.
    • From the file name, user1 is able to determine which Data Refinery flow the conflict occurs in. User1 determines which change to keep and makes the file changes in both the assets/.METADATA and assets/data_flow directories.
  • Changes to a utility Python file on a feature branch break a calling script:

    • User1 commits and pushes changes to a utility Python file on the feature branch.
    • User2 pulls user1's changes from the remote repository.
    • No conflicts occur, but user2 discovers the breaking change during testing.
    • User2 fixes the calling script before committing and pushing changes.
  • Resolving conflicting changes to the description of a data asset on a feature branch in the project's Git user interface:

    • User1 and user2 make conflicting changes to the description of a data asset, but user2 commits and pushes first.
    • When user1 pulls user2's changes prior to committing, user1 sees a conflict occurring in the relevant .json file in the assets/.metadata directory.
    • From the file name, user1 is able to determine which asset the conflict occurs in. User1 determines which change to keep.
  • Resolving conflicting changes to the description of a data asset on a feature branch in JupyterLab:

    • User1 and user2 make conflicting changes to the description of a data asset, but user2 commits and pushes first.
    • When user1 pulls user2's changes prior to committing, user1 sees a conflict occurring in the relevant .json file in the assets/.metadata directory.
    • From the file name, user1 is able to determine which asset the conflict occurs in. User1 cancels the merge.
    • When user1 searches for that asset on the project's Assets page to see the details, an error message is displayed indicating that the asset doesn't exist.
    • User1 opens JupyterLab and starts a terminal window. User1 navigates to the .metadata directory and edits the file to resolve the conflict by combining the information from the two descriptions.
    • User1 runs the git add command to mark the conflict as resolved for that file, confirms in the project UI that the conflict no longer exists, and then commits and pushes the changes.
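
The manual resolution in the last scenario, combining the two descriptions, can be sketched in Python. This is an illustration under stated assumptions, not the product's method: the function name combine_descriptions, the sample asset content, and the top-level description field are hypothetical, and the real layout of the asset metadata file might differ. The sketch also assumes that the description is the only field that differs between the two versions.

```python
import json


def combine_descriptions(local_json: str, remote_json: str,
                         key: str = "description") -> str:
    """Merge two versions of a JSON document by joining one text field.

    `key` is a hypothetical top-level field name. All other fields are
    taken from the remote version, on the assumption that they are equal.
    """
    local = json.loads(local_json)
    remote = json.loads(remote_json)

    # Collect the non-empty descriptions, local first, without duplicates.
    parts = []
    for text in (local.get(key, ""), remote.get(key, "")):
        if text and text not in parts:
            parts.append(text)

    merged = dict(remote)
    merged[key] = " ".join(parts)
    return json.dumps(merged, indent=2)


# Hypothetical local and remote versions of the same asset file.
local_version = '{"name": "sales.csv", "description": "Monthly sales."}'
remote_version = '{"name": "sales.csv", "description": "Includes EMEA region."}'

print(combine_descriptions(local_version, remote_version))
```

During a merge, the local and remote versions of a conflicted file are typically available from the Git index with git show :2:<path> and git show :3:<path>. After the combined content is written back to the file, the git add step in the scenario marks the conflict as resolved.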
