Network requirements for Data Virtualization

The Data Virtualization service exposes the following network communication ports to allow connections from outside of the Cloud Pak for Data cluster. This is an optional task.

Ports exposed by Data Virtualization

The following table lists the ports that are exposed by Data Virtualization and their usage.

Table 1. Ports exposed by the Data Virtualization service
Port usage: External client applications connect to Data Virtualization via JDBC with SSL.
External port: To get the external port:
  1. Go to Collect > Data Virtualization > Connection details.
  2. Select the With SSL option in the Connection configuration resources section.
The external port is the value of the Port number field.
Optionally, you can run the following command:
    oc get -n Project -o jsonpath="{.spec.ports[?(@.name=='bigsqldb2jdbcssl')].nodePort}" services dv-server
Replace Project with the project (namespace) where the Data Virtualization service is installed.
For example:
    oc get -n dv-project -o jsonpath="{.spec.ports[?(@.name=='bigsqldb2jdbcssl')].nodePort}" services dv-server
    31961
Internal port: 32052
Communication: TCP

Port usage: External client applications connect to Data Virtualization via JDBC without SSL.
External port: To get the external port:
  1. Go to Collect > Data Virtualization > Connection details.
  2. Select the Without SSL option in the Connection configuration resources section.
The external port is the value of the Port number field.
Optionally, you can run the following command:
    oc get -n Project -o jsonpath="{.spec.ports[?(@.name=='bigsqldb2jdbc')].nodePort}" services dv-server
Replace Project with the project (namespace) where the Data Virtualization service is installed.
For example:
    oc get -n dv-project -o jsonpath="{.spec.ports[?(@.name=='bigsqldb2jdbc')].nodePort}" services dv-server
    32162
Internal port: 32051
Communication: TCP

Port usage: Automated discovery to streamline the process of accessing remote data sources. See Discovering remote data sources.
External port: To get the external port, run the following command:
    oc get -n Project -o jsonpath="{.spec.ports[?(@.name=='qpdiscovery')].nodePort}" services dv-server
Replace Project with the project (namespace) where the Data Virtualization service is installed.
For example:
    oc get -n dv-project -o jsonpath="{.spec.ports[?(@.name=='qpdiscovery')].nodePort}" services dv-server
    30503
Internal port: 7777
Communication: UDP

To get the list of Kubernetes NodePort ports exposed by the Data Virtualization service and internal-to-external port mapping, run the following command:
oc get -n Project services dv-server
Replace Project with the project (namespace) where the Data Virtualization service is installed.
For example:
oc get -n dv-project services dv-server

NAME        TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                                           AGE
dv-server   NodePort   172.30.140.105   <none>        7777:30503/UDP,32051:32162/TCP,32052:31961/TCP    2d
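
Each entry in the PORT(S) column has the form internal:external/protocol. As a minimal sketch, the following Python snippet splits that column into a lookup table. The function name parse_port_mappings is hypothetical (it is not part of any IBM or OpenShift tooling), and the snippet assumes the column format shown in the example output above.

```python
def parse_port_mappings(ports_column):
    """Parse a PORT(S) value such as '7777:30503/UDP' into a dict.

    Keys are internal ports; values are (external_port, protocol) tuples.
    """
    mappings = {}
    for entry in ports_column.split(","):
        port_pair, protocol = entry.strip().split("/")
        internal, external = port_pair.split(":")
        mappings[int(internal)] = (int(external), protocol)
    return mappings


# Value copied from the example output above; real clusters will differ.
example = "7777:30503/UDP,32051:32162/TCP,32052:31961/TCP"
for internal, (external, protocol) in parse_port_mappings(example).items():
    print(f"internal {internal} -> external {external} ({protocol})")
```

With the example value, the lookup for internal port 32051 yields external port 32162 over TCP, which matches the JDBC without SSL entry in Table 1.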

Network requirements for load-balancing environments

By using the iptables utility or the firewall-cmd command, you can ensure that the external ports listed in Table 1 and their communication are not blocked by local firewall rules or load balancers.
Note: For more information about checking ports for communication blockages, see Managing data using the NCAT utility in the Red Hat® documentation.
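
A quick end-to-end check from a client machine can complement the firewall tools. The following is a minimal sketch, not an official procedure: the function name tcp_port_open and the host name in the comment are assumptions. It covers only TCP ports, because a UDP port such as the discovery port cannot be verified with a plain connection attempt.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call (hypothetical host name; 32162 is the example external
# JDBC port from Table 1 and will differ on a real cluster):
#   tcp_port_open("cluster.example.com", 32162)
```

A result of False only shows that the connection did not succeed from this client; it does not identify whether a firewall, a load balancer, or the service itself is the cause.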

If your Cloud Pak for Data deployment uses a load balancer and you get a timeout error when you try to connect to the Data Virtualization service, increase the load balancer timeout values by updating the /etc/haproxy/haproxy.cfg file. For more information, see Limitations and known issues in Data Virtualization.

Switching service port from UDP to TCP
If the Cloud Pak for Data load balancer does not support UDP traffic, you can switch the port from UDP to TCP in the Data Virtualization service:
  1. Run the following command:
    oc edit svc dv-server
  2. Change the protocol parameter from UDP to TCP for the qpdiscovery port:
    - name: qpdiscovery
      nodePort: 32071
      port: 7777
      protocol: TCP
      targetPort: 7777
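
To illustrate what the edit changes, the sketch below applies the same change to a Python list shaped like the ports entry shown above. It is not a replacement for oc edit; the function name set_port_protocol and the sample data are assumptions made for illustration.

```python
import copy


def set_port_protocol(ports, name, protocol):
    """Return a copy of a Service ports list with one port's protocol changed."""
    updated = copy.deepcopy(ports)
    for port in updated:
        if port.get("name") == name:
            port["protocol"] = protocol
            return updated
    raise KeyError(f"no port named {name!r}")


# Sample data mirroring the qpdiscovery entry before the edit.
ports = [{"name": "qpdiscovery", "nodePort": 32071, "port": 7777,
          "protocol": "UDP", "targetPort": 7777}]
print(set_port_protocol(ports, "qpdiscovery", "TCP")[0]["protocol"])  # TCP
```

Only the protocol field changes; the nodePort, port, and targetPort values stay as they were.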
Defining alternative gateway host
If the Cloud Pak for Data load balancer does not support UDP port mapping, Data Virtualization provides an alternative gateway host, which handles incoming requests to map UDP ports for remote connectors. To manually define the gateway:
  1. Click Collect > Data Virtualization > SQL editor.
  2. Run the QPLEXSYS.DEFINEGATEWAYS() stored procedure. For example:
    CALL QPLEXSYS.DEFINEGATEWAYS('host1:6414, host2:6414')
    Replace host1 and host2 with the hostname or IP address of each remote connector. This example uses port 6414, which you specify when you generate the dv-endpoint configuration script. To determine which port mapping to use in the QPLEXSYS.DEFINEGATEWAYS() stored procedure, check the QueryPlex_config.log file on your remote connector and search for the GAIAN_NODE_PORT value. For example:
    GAIAN_NODE_PORT=6414
    If you use port forwarding (for example, NAT or VPN) to reach the remote connector, you must specify two ports, the external port followed by the internal port:
    CALL QPLEXSYS.DEFINEGATEWAYS('host1:37400:6414, host2:37400:6414')
    In this example, two remote connectors listen internally on port 6414, but this port is not exposed externally by their hosts. For example, the remote connectors might be accessible from Cloud Pak for Data only through a VPN server that is configured to map external VPN port 37400 to internal port 6414. Defining the gateway enables Data Virtualization to open a connection to the remote connectors running on host1 and host2. Data Virtualization connects to port 37400 on the remote host, and the VPN forwards traffic to the remote connector's internal port 6414.
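
The argument string follows a regular pattern: host:port for a direct connection and host:external:internal when a port is forwarded. As a minimal sketch, the following Python helper builds that string from a list of connectors. The name gateway_argument is hypothetical, and the format is inferred only from the two examples above.

```python
def gateway_argument(connectors):
    """Build the string passed to QPLEXSYS.DEFINEGATEWAYS.

    Each connector is (host, internal_port), or
    (host, internal_port, external_port) when port forwarding is used.
    """
    parts = []
    for connector in connectors:
        host, internal_port, *rest = connector
        if rest:
            # Forwarded case: host:external:internal, as in the example above.
            parts.append(f"{host}:{rest[0]}:{internal_port}")
        else:
            parts.append(f"{host}:{internal_port}")
    return ", ".join(parts)


print(gateway_argument([("host1", 6414), ("host2", 6414)]))
# host1:6414, host2:6414
print(gateway_argument([("host1", 6414, 37400), ("host2", 6414, 37400)]))
# host1:37400:6414, host2:37400:6414
```

Generating the string this way can help avoid swapping the external and internal ports when many remote connectors are involved.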

Updating HAProxy configuration file

If you use an external infrastructure node to route external Data Virtualization traffic into the Red Hat OpenShift® cluster, you might need to ensure that traffic is forwarded to your cluster:
  1. On the infrastructure node, open the HAProxy configuration file located at /etc/haproxy/haproxy.cfg.
  2. Modify the haproxy.cfg file to include the following content:
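    defaults
           log                     global
           option                  dontlognull
           option                  tcp-smart-accept
           option                  tcp-smart-connect
           retries                 3
           timeout queue           1m
           timeout connect         10s
           timeout client          1m
           timeout server          1m
           timeout check           10s
           maxconn                 3000
    frontend dv-nonssl
           bind *:dv NodePort
           default_backend dv-nonssl
           mode tcp
           option tcplog
    backend dv-nonssl
           balance source
           mode tcp
           server master1 Master1-privateIP:dv NodePort
    Here, dv NodePort is a placeholder for the external port that you want to route (for example, the JDBC without SSL port from Table 1), and Master1-privateIP is a placeholder for the private IP address of the master node.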
  3. Reload HAProxy:
    systemctl reload haproxy
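
When several ports or nodes are involved, the frontend and backend pair can be generated rather than typed by hand. The following Python sketch renders a pair in the style of the configuration above. The function name render_dv_stanza, the port 32162 (the example non-SSL external port from Table 1), and the address 10.0.0.11 are assumptions for illustration only.

```python
def render_dv_stanza(name, node_port, servers):
    """Render a frontend/backend pair for one Data Virtualization port.

    servers maps a server label to its private IP address.
    """
    lines = [
        f"frontend {name}",
        f"       bind *:{node_port}",
        f"       default_backend {name}",
        "       mode tcp",
        "       option tcplog",
        f"backend {name}",
        "       balance source",
        "       mode tcp",
    ]
    for label, ip in servers.items():
        lines.append(f"       server {label} {ip}:{node_port}")
    return "\n".join(lines)


# 32162 is the example non-SSL external port; 10.0.0.11 is a made-up address.
print(render_dv_stanza("dv-nonssl", 32162, {"master1": "10.0.0.11"}))
```

The rendered text still has to be reviewed and added to /etc/haproxy/haproxy.cfg manually before HAProxy is reloaded.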