Set up Data Science Experience Local

After you install DSX Local, you can configure it in the following ways.

Set up an SSL connection for the web client

If you need to enable an HTTPS connection to the DSX Local web client with your own SSL certificate and private key (both in PEM format) rather than the default, complete the following steps:

  1. Ensure that the SSL certificate and private key are in the same directory, and note the absolute path of that directory for later. The SSL certificate can be a bundle that contains your server, intermediate, and root certificates concatenated (in the proper order) into one file. The necessary certificates should be installed as trusted certificates on the clients that connect to your DSX Local instance.
  2. Name the SSL certificate cert.crt, and name the private key cert.key. You can verify the information supplied in the certificate signing request using the following command:
    
    openssl x509 -noout -text -in ./cert.crt
    
    
  3. Verify that the private key and certificate's public key match by entering the following commands to retrieve the md5 hash of the moduli of the private key and certificate's public key:
    
    openssl x509 -noout -modulus -in ./cert.crt | openssl md5
    openssl rsa -noout -modulus -in ./cert.key | openssl md5
    
    

    If they match, then the private key and certificate match, and this certificate will be accepted by nginx.

  4. Replace the nginx default SSL certificate with your own certificate by entering the following commands:
    
    cd /wdp/utils
    ./change_nginx_cert.sh replace <your_chosen_directory_absolute_path>
    
    

    Alternatively, in the Admin Console, click the menu icon and click Scripts. In the Script pull-down menu, select Replace an existing cert.key and cert.crt certificates (change_nginx_cert.sh) to replace the SSL certificate.

You can now sign into the DSX Local client using the domain name that matches the certificate, to verify that the certificate has been replaced.
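
The check in step 3 can be wrapped in a small shell function so that it gives a yes-or-no answer before you run change_nginx_cert.sh. This is a minimal sketch, assuming an RSA key and the file names cert.crt and cert.key from step 2; the function name cert_matches_key is hypothetical and is not part of DSX Local.

```shell
# Hypothetical helper (not shipped with DSX Local): succeeds when the RSA
# private key and the certificate in the given directory share a modulus.
cert_matches_key() {
  dir=${1:-.}
  # Extract each modulus first so that a read failure is not hidden by a pipe.
  cert_mod=$(openssl x509 -noout -modulus -in "$dir/cert.crt" 2>/dev/null) || return 2
  key_mod=$(openssl rsa -noout -modulus -in "$dir/cert.key" 2>/dev/null) || return 2
  # Hash the moduli, as in step 3, so that the values are short enough to read.
  cert_md5=$(printf '%s\n' "$cert_mod" | openssl md5)
  key_md5=$(printf '%s\n' "$key_mod" | openssl md5)
  echo "cert.crt: $cert_md5"
  echo "cert.key: $key_md5"
  [ "$cert_md5" = "$key_md5" ]
}
```

For example, cert_matches_key /path/to/certs && echo "key and certificate match" prints both hashes and the message only when they are identical. A return code of 2 means that one of the two files could not be read.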

Troubleshooting tip: If your web browser still reports an insecure connection after the certificate was imported, verify that the certificate can be viewed. Ensure that all the information is correct, and that the chain is signed by the intended certificate authority.

If you ever need to change the SSL certificate back to the default, enter the following commands:

  cd /wdp/utils 
  ./change_nginx_cert.sh rollback
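
To view the certificate that the server actually presents, as the troubleshooting tip suggests, the openssl command line can be used from any client machine. The following is a sketch, assuming openssl is installed on that machine; describe_cert and show_served_cert are hypothetical helper names, and ibm-nginx-svc stands in for the domain name of your own instance.

```shell
# Hypothetical helper: print subject, issuer, and validity dates of a
# PEM certificate read from standard input.
describe_cert() {
  openssl x509 -noout -subject -issuer -dates
}

# Hypothetical helper: fetch the certificate presented at host:port
# (default port 443) and describe it.
show_served_cert() {
  host=$1
  port=${2:-443}
  # </dev/null ends the TLS session as soon as the handshake completes.
  openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null |
    describe_cert
}
```

Running show_served_cert ibm-nginx-svc after a replace or a rollback shows whether the subject and issuer are the ones you expect. describe_cert < cert.crt prints the same fields for the local file, which makes the two easy to compare.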

Configure DSX Local settings and users

To configure DSX Local, complete the following steps:

  1. By default, you can sign in to the DSX Local client by using your high availability proxy IP address, for example, https://123.45.67.89.
    Optional: In the local DNS server, you can add an entry to resolve to the HA proxy IP address. For example: 123.45.67.89 ibm-nginx-svc. As a result, you are able to sign in to the DSX Local client through the https://ibm-nginx-svc web address.
  2. Open the admin console by signing into the DSX Local client from a web browser and switching to Admin Console.

    Alternatively, you can sign in to the admin console directly if you append /dsx-admin to your DSX Local client URL: https://ibm-nginx-svc/dsx-admin

  3. In the Admin Console, click the menu icon and click User Management. Edit the admin user to set an email address and change the password for the primary administrator.
  4. You can configure a connection to your SMTP server so that DSX Local can send email to users and admins. DSX Local sends emails to users when they are given access to DSX Local and to administrators when a new user signs up for DSX Local, an alert is triggered, or an application setting, such as the alert threshold, is changed.

    To enable DSX Local to send email:

    1. From your username, select Settings.
    2. In the SMTP settings section, specify the following information:
      • The SMTP mail server address.
      • The port number of your SMTP server.

        Important: If you specify a secure port, you must select Use SSL encryption. If you specify a secure port but do not select this option, DSX Local cannot communicate with your SMTP server.

      • Depending on your SMTP server, you might need to specify your SMTP credentials:

        • If your SMTP server doesn't have a mailer daemon, you must specify an SMTP username and password.

        • If your SMTP server does have a mailer daemon, communications from DSX Local are associated with the mailer daemon account automatically. To associate communications with a specific account instead, provide the credentials for that account.

    3. Click Save. If your SMTP configuration is successful, you receive a confirmation email.
  5. Add users or set up an LDAP server. See Manage users for details.
  6. Switch to the IBM Data Science Experience Local client.

  7. Verify that the sample notebooks display successfully. Create a test project.
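
The sign-in addresses from steps 1 and 2 can be checked from a shell before you open a browser. This is a minimal sketch, assuming curl is installed; admin_console_url and probe_admin_console are hypothetical helper names, and the -k option is used only because the default certificate might still be in place.

```shell
# Hypothetical helper: derive the admin console URL from the client URL
# by appending the /dsx-admin path described in step 2.
admin_console_url() {
  # Strip one trailing slash, if any, before appending the path.
  printf '%s/dsx-admin\n' "${1%/}"
}

# Hypothetical helper: print the HTTP status code that the admin console
# returns. -k skips certificate verification.
probe_admin_console() {
  curl -k -s -o /dev/null -w '%{http_code}\n' "$(admin_console_url "$1")"
}
```

For example, probe_admin_console https://ibm-nginx-svc requests https://ibm-nginx-svc/dsx-admin and prints the status code, which shows whether the DNS entry and the proxy address resolve as intended.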

Configure DSX Local to work with the HDP or CDH cluster

If your HDP or CDH cluster does not use security, then just ensure DSX Local can access it. No additional configuration is needed.

Requirement: The following steps apply only to a secure HDP cluster or a secure CDH cluster.

To configure DSX Local to work with a secure HDP or CDH cluster, complete the following steps:

  1. In the DSX Local master node, run the /wdp/utils/add_endpoint.sh script to add the certificate to securely connect to the HDP or CDH cluster. Additionally, you can run the script to set up the default Livy endpoint for the DSX Local cluster. Example:
    
    ./add_endpoint.sh --knox-url=https://9.87.654.323:8443 --addcert
    ./add_endpoint.sh --knox-url=https://9.87.654.323:8443 --livy-url=https://9.87.654.323:8443/gateway/dsx/livy/v1 --addcert
    
    

    where https://9.87.654.323:8443/gateway/dsx/livy2/v1 represents the secure Livy endpoint that is defined in dsx.xml. As a result, the script automatically creates a default_endpoints.conf file.

    Alternatively, in the Admin Console, click the menu icon and click Scripts. In the Script pull-down menu, select Set the default Livy endpoint for DSX Local (add_endpoint.sh) to perform the same tasks.

  2. Restart your Jupyter kernel and Zeppelin interpreter to pick up the new certificates.
  3. To ensure the same usernames exist in both DSX Local and HDP or CDH, set up the HDP or CDH LDAP server in DSX Local. See Manage users for details.
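
Because the add_endpoint.sh arguments in step 1 are long, it can help to assemble the command from variables and review it before running it. The following sketch only prints the command line; add_endpoint_cmd is a hypothetical wrapper, the host names are placeholders, and the only flags used are the ones shown in step 1.

```shell
# Hypothetical wrapper: print the add_endpoint.sh command line for a Knox
# URL and, optionally, a Livy URL. Nothing is executed.
add_endpoint_cmd() {
  knox_url=$1
  livy_url=${2:-}
  cmd="./add_endpoint.sh --knox-url=$knox_url"
  # Add the Livy endpoint only when one was supplied.
  if [ -n "$livy_url" ]; then
    cmd="$cmd --livy-url=$livy_url"
  fi
  printf '%s --addcert\n' "$cmd"
}

# Example with placeholder host names:
# add_endpoint_cmd https://knox.example.com:8443 \
#   https://knox.example.com:8443/gateway/dsx/livy/v1
```

After checking the printed line, run it from the /wdp/utils directory on the master node.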

Configure DSX Local to work with Microsoft Azure VMs

Requirement: DSX Local must be installed on Microsoft Azure VMs.

For users to access the DSX Local client, you must make the private IP addresses of all three master nodes (in either the three-node or the nine-node configuration) accessible. Complete the following steps on each master node:

  1. In the /wdp/k8s/dsx-local-proxy/k8s/ directory, back up nginx-service.yaml to nginx-service.yaml.orig.
  2. Edit nginx-service.yaml and change the IP addresses to the three private IP addresses of the three master nodes (follow the same format as in the file, and ensure each IP is on a separate line). Example:
    
    externalIPs:
    - 10.0.0.100
    - 10.0.0.7
    - 10.0.0.8
    - 10.0.0.9
    
    
  3. Run the command: kubectl delete -f nginx-service.yaml.orig --namespace=ibm-private-cloud
  4. Run the command: kubectl create -f nginx-service.yaml --namespace=ibm-private-cloud
  5. Test for an HTML response by running the command: curl -k https://
  6. Order a Load Balancer within Azure, and set up the Load Balancer for HTTPS (port 443) to point to the three private IP interfaces of the three master nodes.
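
Steps 1, 3, and 4 have to be repeated on each master node, so they can be sketched as two shell functions. This is an illustration under stated assumptions, not a supported script: the function names are hypothetical, the functions expect to be called from the /wdp/k8s/dsx-local-proxy/k8s/ directory, and setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: back up the service definition, keeping an existing backup.
backup_service_yaml() {
  [ -e nginx-service.yaml.orig ] ||
    run_or_print cp nginx-service.yaml nginx-service.yaml.orig
}

# Steps 3 and 4: delete the service that was created from the original
# file, then create it again from the edited file.
recreate_nginx_service() {
  run_or_print kubectl delete -f nginx-service.yaml.orig --namespace=ibm-private-cloud &&
    run_or_print kubectl create -f nginx-service.yaml --namespace=ibm-private-cloud
}
```

A typical sequence is backup_service_yaml, then the manual edit from step 2, then recreate_nginx_service. Running the last function once with DRY_RUN=1 set shows exactly which kubectl commands will be issued.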

Optional configuration settings

You can optionally adjust when alerts are generated, how long log files and metrics are stored, and how frequently the metrics on the dashboard are refreshed.

To configure refresh and retention settings:

  1. From your username, select Settings.
  2. In the Refresh and alert settings, adjust the appropriate settings:
    Log retention (days)

    The number of days to keep logs before they are automatically deleted.

    The default is 10 days.

    Metrics retention (days)

    The number of days to keep metrics history (such as the CPU and memory usage shown in the dashboard) before it is automatically deleted.

    The default is 1 day.

    Remember: If you increase the retention period and increase the frequency with which the dashboard metrics are refreshed, you use more storage in the Mongo database where metrics are stored.

    Dashboard refresh (seconds)

    The frequency with which the data in the admin dashboard is refreshed.

    The default is 10 seconds.

    Alert threshold (%)

    The usage threshold at which an alert is triggered. When the usage reaches this threshold, the node color immediately changes to red. The alert is generated if the usage stays above the threshold longer than the time that is specified for the Alert time threshold setting.

    The default is 90%.

    Alert warning threshold (%)

    The usage threshold at which a warning is triggered and the node color changes to yellow.

    The default is 70%.

    Alert time threshold (minutes)

    The length of time that must elapse before an alert is generated.

    For example, if CPU usage goes above 90% for 30 seconds during a complex computation, you probably don't need to be alerted. But if CPU usage stays above 90% for 5 minutes, it might be cause for concern.

  3. Click Save.
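
The way the threshold and retention settings interact can be illustrated with a little shell arithmetic. This is a sketch of the behavior described above, not DSX Local code: both function names are hypothetical, the 5-minute alert time is borrowed from the CPU example because no default is stated, values are whole numbers, and the storage estimate assumes that one sample per metric is stored at every dashboard refresh.

```shell
# Hypothetical illustration: classify a node from its usage percentage and
# the number of minutes the usage has stayed at or above the alert threshold.
# Optional arguments override the alert %, warning %, and alert time (minutes).
node_status() {
  usage=$1
  minutes_high=$2
  alert=${3:-90}
  warn=${4:-70}
  alert_minutes=${5:-5}   # assumed value, taken from the example in the text
  if [ "$usage" -ge "$alert" ]; then
    # The node turns red immediately; the alert waits for the time threshold.
    if [ "$minutes_high" -ge "$alert_minutes" ]; then
      echo "red alert"
    else
      echo "red"
    fi
  elif [ "$usage" -ge "$warn" ]; then
    echo "yellow warning"
  else
    echo "normal"
  fi
}

# Samples kept per metric, assuming one stored sample per dashboard refresh.
metric_samples() {
  retention_days=$1
  refresh_seconds=$2
  echo $(( retention_days * 86400 / refresh_seconds ))
}
```

With the default settings, metric_samples 1 10 gives 8640 samples per metric. Raising the retention to 7 days and the refresh to every 5 seconds gives 120960, fourteen times as many, which is why the two settings are best changed with storage in mind.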