Installing connectors on remote data sources (Data Virtualization)

To access data that is stored in remote data sources, Data Virtualization requires the installation of a remote connector.

Before you begin

Required role: To complete this task, you must have the Data Virtualization Admin or Engineer role.

  • You must have IBM Java 8 installed on the remote data source to add a connector. For more information about installing Java 8, see the product documentation.
  • To run the dv-endpoint.sh script, you must also install the following tools:
    • Curl
    • Netstat
    • TAR
  • Ensure that you updated the HAProxy configuration file.
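
The tool prerequisites above can be checked by hand before you generate the script. The following is a minimal sketch, not part of the product: the missing_tools function name is hypothetical, and the sketch assumes a POSIX shell on the remote data source. It checks only the three tools listed above, because Java is located by the directory that you specify later rather than through PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any commands that are not on PATH.
# Returns 0 when every command is found, 1 otherwise.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    missing="${missing# }"   # drop the leading space
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}

# Check the tools that dv-endpoint.sh needs, as listed above.
if out=$(missing_tools curl netstat tar); then
    echo "All prerequisite tools found."
else
    echo "Missing tools: $out"
fi
```

If any tool is reported as missing, install it with the package manager of the remote host before you run dv-endpoint.sh.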

About this task

A remote connector enables Data Virtualization to automatically access data, such as files, that is located in a remote data source. If you need to virtualize data that is stored on a remote data source or file system, you must install a remote connector on the data source where the data is located. The credentials used to establish the connection to the data source determine what data in the data source can be accessed by the Data Virtualization service.

Data Virtualization provides an easy way to add a connector on a remote data source by generating and running the dv-endpoint.sh configuration script. The dv-endpoint.sh script performs the following tasks:
  1. Checks that all prerequisites for adding a remote connector are met.
  2. Sets the specified parameters for the remote connector.
  3. Downloads and extracts the remote connector installation package.
  4. Starts the remote connector and checks its status.

Procedure

To add connectors to remote data sources, complete the following steps:

  1. Go to Data > Data Virtualization > Virtualization > Data sources.
  2. Click Set up remote connector.
  3. Enter the name and description of the remote connector.
  4. To generate the dv-endpoint.sh configuration script:
    1. Select the operating system of the remote data source.
    2. Specify the directory where Java is installed on the remote data source.
    3. Specify the directory where you want to install the remote connector.
      For example, you can use the /home/user/<user-id>/dvendpoint directory.
    4. Specify the node port that the connector will use on the remote data source.
      Each connector that you install on the remote data source can use a different node port. By default, the remote connector uses port 6414.
      Note: On Microsoft Windows, if the port is already in use, the installation of the endpoint fails. You can see error messages in the dvendpoint.log file in the endpoint install directory. For example, you might see error messages that are similar to the following example.
      An exception occurred during the Install phase.
      System.ComponentModel.Win32Exception: The specified service already exists
    5. Click Generate script.
  5. To run the script on the remote data source:
    • If you download the script, you must:
      1. Use SCP or FTP to transfer the dv-endpoint.sh file to the remote data source.
      2. Set the script as an executable file:
        chmod +x dv-endpoint.sh
      3. Run the script on the remote data source:
        ./dv-endpoint.sh
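    • If you copy the script to the clipboard, you must:
      1. Create a new file on the remote host.
      2. Paste the script content from the clipboard and save the file.
      3. Set the file as executable and run it on the remote data source, as in the download steps.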
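
Because the installation can fail when the node port is already taken, it can help to confirm that the port is free before you run the script. The following is a minimal sketch under stated assumptions: the port_is_listening function name is hypothetical, and the pattern assumes the common netstat -na layout in which the local address ends in :port or .port and listening sockets are marked LISTEN. Output formats differ between platforms, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: succeed if netstat-style text on stdin shows a
# listening socket whose local address ends in the given port.
port_is_listening() {
    grep -Eq "[.:]$1[[:space:]].*LISTEN"
}

PORT=6414   # default node port named in this procedure

if command -v netstat >/dev/null 2>&1 &&
   netstat -na 2>/dev/null | port_is_listening "$PORT"; then
    echo "Port $PORT is already in use; choose another node port."
else
    echo "No listener detected on port $PORT."
fi
```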

Results

You can now virtualize data that is stored on the remote data source or file system.

What to do next

By default, the remote connector has the access permissions of the user who runs the script. Therefore, it is recommended that you follow these steps on your operating system:
  1. Create a functional ID and start the remote connector under this functional ID.
  2. Create a group to which the data to be virtualized can be added.
  3. Grant the group read access to the data to be virtualized.
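
On Linux, the recommended steps above could be sketched as follows. This is an illustration only: the user name dvconnector, the group name dvdata, and the /data/to_virtualize path are hypothetical, and adding the functional ID to the group is an assumption about how the read access is meant to reach the connector. The install directory is the example directory used in this documentation. By default the sketch only prints the commands; set APPLY=1 and run it as root to execute them.

```shell
#!/bin/sh
# Hypothetical names; adjust them to local conventions.
SVC_USER=dvconnector            # functional ID that runs the connector
DATA_GROUP=dvdata               # group that owns the data to be virtualized
DATA_DIR=/data/to_virtualize    # location of the data to be virtualized
INSTALL_DIR=/home/user1/dvendpoint

# Print each command by default; execute it only when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

run groupadd "$DATA_GROUP"                     # create the data group
run useradd -m -G "$DATA_GROUP" "$SVC_USER"    # create the functional ID (assumed group member)
run chgrp -R "$DATA_GROUP" "$DATA_DIR"         # add the data to the group
run chmod -R g+rX "$DATA_DIR"                  # grant the group read access
# Start the connector under the functional ID.
run su - "$SVC_USER" -c "cd $INSTALL_DIR && nohup ./datavirtualization_start.sh"
```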
To manage remote connectors, see Managing connectors on remote data sources.

Managing connectors on remote data sources

You can start and stop connectors on remote data sources.

Procedure

  • To start the remote connector:
    1. Go to the directory where you installed the remote connector.
      For example, /home/user1/dvendpoint.
      cd /home/user1/dvendpoint
    2. Run the following command to start the remote connector:
      nohup ./datavirtualization_start.sh
  • To stop the remote connector:
    1. Go to the directory where you installed the remote connector.
      For example, /home/user1/dvendpoint.
      cd /home/user1/dvendpoint
    2. Run the following command to stop the remote connector:
      ./sysroot/killGaianServers.sh
  • To change the default memory setting that is used by the remote connector:
    1. Stop the remote connector.
    2. Set the JAVA_OPTS parameter.
      The following example changes the memory setting from the default value of 256 MB to 512 MB:
      export JAVA_OPTS=-Xmx512m
    3. Start the remote connector.
  • To change the default port (6414) that is used by the remote connector in an already generated dv-endpoint.sh script:
    1. Complete steps 1-4 of the procedure in Installing connectors on remote data sources to generate the dv-endpoint.sh script.
    2. Add the -p port-number parameter.
    3. Update the _config_cmd parameter and other references to the default port (6414) in the script.
      The following example uses port 6415:
      _config_cmd="./config.sh -p 6415 -i 93a46053-81ee-4ec3-a68a-b052c5e16c8e \
       -c 9MUbMH1tf20KAQAQOk07Xp9BN2qFUmY/wS9RvBpKovfSSwyJHG8rW2I6TtJvB \
      4UTCp3tRJbPgKDrJuNOks7SiV3Vi9P3id1DUNUl9I28hVe5q2fCKr57RSJszOwC8bM6Y7Hz6XETRP \
      bVdKSEjmhTM4mq7wsrqYXVph0nll4NTyEJxbE \
      k3RGna2mIpnujckBAprAA8e2YV0r2R1en6D0czPvMW6j3qLdFrNR7iTX/drldOyg9Jm7AhBpJZv0IHnERpVGX8I0gpwfcXjx \
      lfmglB04NA/06Js2vx8fom2Xj6bITmC6RvYreWw5sLruDF/Mn49HiGbdv/DkDWPGpMc7Up5rkTu8DU1kYGEOeXs \
      LvqjIY8SkTDdbN1cRgZFkNcR4eIRvAi \
      ayI9qJtX2PSeLNEss4gpFtHYq7FxJQeWvXcppLCAld8ghSZEMoYzZqZoUtliKSrCBgOK8lN5JSQN4YhSrMLxDVNIW \
      x37rG0Q/GCI5BZUeF1h2U2TILycFvkSwRUEB2qe8imfeXd2txPsHvTjlxiKpRP/nOo2Ud0="
    4. Update all references in the script from netstat -na|grep 6414 to netstat -na|grep 6415.
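
The netstat update in the last step can be scripted. The following is a minimal sketch: the rewrite_netstat_port function name and the dv-endpoint-6415.sh output file name are hypothetical, and the sketch assumes a sed that supports the -E option. It rewrites only the netstat -na|grep references and then lists any remaining lines that mention the old port, so that you can update the _config_cmd parameter and other references by hand.

```shell
#!/bin/sh
# Hypothetical helper: read script text on stdin and rewrite the port in
# "netstat -na|grep OLD" references, with or without spaces around the pipe.
# $1 = old port, $2 = new port. The result is written to stdout.
rewrite_netstat_port() {
    sed -E "s/(netstat -na[[:space:]]*\|[[:space:]]*grep )$1([^0-9]|\$)/\1$2\2/g"
}

OLD_PORT=6414
NEW_PORT=6415

if [ -f dv-endpoint.sh ]; then
    # Write to a new file so that the generated script stays untouched.
    rewrite_netstat_port "$OLD_PORT" "$NEW_PORT" < dv-endpoint.sh > dv-endpoint-6415.sh
    # Show remaining references to the old port for manual review.
    grep -n "$OLD_PORT" dv-endpoint-6415.sh || echo "No remaining references to $OLD_PORT."
fi
```

Writing to a copy rather than editing in place keeps the original script available for comparison if the connector does not start on the new port.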