IBM Support

SCRIPTING_04 - Jython stage does not bypass proxy leading to timeout

Troubleshooting


Problem

  1. Requests made from the Jython stage can ignore http.nonProxyHosts because the Python requests library does not support wildcards in the no_proxy environment variable.

  2. StreamSets Platform automatically copies the value of http.nonProxyHosts to the no_proxy environment variable when the deployment starts.

  3. However, the Python requests library does not use the same syntax, and wildcard characters such as * are not interpreted as wildcards.

  4. As a result, nonProxyHosts entries such as *.host.com or 10.0.0.* are not matched by the requests library, and requests to those hosts still go through the proxy.
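
The mismatch described above can be illustrated with a minimal sketch. It assumes that hostname entries in no_proxy are compared as literal suffixes of the target hostname, which is consistent with the behavior described in this document. The function name matches_no_proxy is hypothetical, and the code is a simplified model rather than the requests library's implementation; it covers hostnames only, not IP addresses.

```python
def matches_no_proxy(hostname, no_proxy):
    """Simplified model: True if hostname ends with any no_proxy entry.

    Entries are compared literally, so '*' has no special meaning.
    """
    entries = [e.strip() for e in no_proxy.split(",") if e.strip()]
    return any(hostname.endswith(entry) for entry in entries)


# The wildcard form is compared character by character and never matches.
print(matches_no_proxy("api.host.com", "*.host.com"))  # False
# The leading-dot form is a plain suffix of the hostname and matches.
print(matches_no_proxy("api.host.com", ".host.com"))   # True
```

Under this model, an entry that a Java-style wildcard matcher would accept is simply a string that no real hostname ends with.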

  1. You can disable the use of proxies in your Jython script by creating a requests session and setting session.trust_env to False, so that the session ignores proxy settings taken from the environment. For example:

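    import requests

    url = "http://test.com"

    # Ignore environment-derived settings, including http_proxy, https_proxy and no_proxy.
    session = requests.Session()
    session.trust_env = False

    # Create the record before the try block so that it is defined
    # if the request fails and the except block runs.
    record = sdc.createRecord('test')
    try:
        response = session.get(url)
        record.value = response.text
        cur_batch.add(record)
    except Exception as e:
        # cur_batch, entityName and offset come from the surrounding origin script.
        cur_batch.addError(record, str(e))
        cur_batch.process(entityName, str(offset))

  1. Alternatively, the requests library has been shown to support domain suffixes and CIDR ranges in the no_proxy variable, in the following formats: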

    • .host.com

    • 10.0.0.0/8

    • Example: http.nonProxyHosts=.host.com, 10.0.0.0/8

  2. Because Data Collector and the requests library interpret different syntaxes, include both forms of each entry in your deployment’s http.nonProxyHosts list (for example, both *.host.com and .host.com).

Symptom

  1. With StreamSets Platform, you can configure your Data Collector deployments to use a proxy server: https://docs.streamsets.com/portal/platform-datacollector/latest/datacollector/UserGuide/Configuration/ProxyServer.html?hl=proxy

  2. You are also able to define http.nonProxyHosts, which lists hosts that Data Collector can connect to without going through the proxy.

  3. However, when using the Jython stage to make requests (using the requests library) to these hosts, you may see that it still tries to connect through the proxy. Example script:

  4. If a host listed in http.nonProxyHosts does not accept connections from the proxy, the request can fail with a timeout error.

Document Location

Worldwide

Line of Business: Data Platform
Business Unit: IBM Software
Product: IBM StreamSets Control Hub, IBM StreamSets Data Collector
Platform: Platform Independent
Version: All Version(s)

Document Information

Modified date:
15 March 2025

UID

ibm17186240