HDFS_READ

The HDFS_READ function reads data from a delimiter-separated file in the Hadoop Distributed File System (HDFS).

HDFS_READ( file-url, options )

The schema is SYSFUN.

file-url
An expression that specifies the server address and path of the input file in HDFS. file-url is a VARCHAR(512) value.
options
An expression that specifies a list of name=value pairs. Each pair must be separated from the following pair by a space character. options is a VARCHAR(256) value. options can contain any of the following name and value pairs:
delimiter=delimiter-value
Identifies the character that is used as the delimiter in the input file that is specified by file-url.
user=user-value
Specifies an IBM® InfoSphere® BigInsights® user name that has access to the input file that is specified by file-url.
password=password-value
Specifies the password for the IBM InfoSphere BigInsights user that is identified by user=user-value.
authport=authport-value
Specifies the port for form-based authentication of the input. The default is 8080.
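As an illustration of how several name=value pairs combine in one options string, the following sketch passes a delimiter, credentials, and a non-default authentication port, separated by single spaces. The port value 8443 is a hypothetical placeholder chosen for this sketch; the URL and credentials are reused from the examples in this topic and must be replaced with values for your own environment.

```sql
-- Sketch only: authport=8443 is an assumed, non-default port.
-- Pairs in the options string are separated by a single space.
SELECT * FROM TABLE(
 HDFS_READ(
  'http://hdfssrv.svl.ibm.com:14000/webhdfs/v1/tmp/test1.csv',
  'delimiter=, user=biadmin password=passw0rd authport=8443'))
 AS T1(C1 DECIMAL(8,3), C2 INTEGER);
```

If authport is omitted, the default port of 8080 applies.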

The HDFS_READ function returns a table with one row for each record in the input file. HDFS_READ is a generic table function, which means that the columns in the returned table are defined when the table is referenced, instead of when the function is defined.
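Because the columns are defined at the point of reference, the correlation clause that follows the TABLE(...) expression supplies both the column names and the data types. The following sketch names the two fields and then uses them like the columns of any other table reference. The column names AMOUNT and QTY and the filter value are hypothetical, and the sketch assumes that the input file has a DECIMAL(8,3) field followed by an INTEGER field.

```sql
-- Sketch only: AMOUNT and QTY are illustrative column names that are
-- declared in the correlation clause, not stored with the HDFS file.
SELECT T1.AMOUNT, T1.QTY
 FROM TABLE(
  HDFS_READ(
   'http://hdfssrv.svl.ibm.com:14000/webhdfs/v1/tmp/test1.csv',
   'delimiter=, user=biadmin password=passw0rd'))
 AS T1(AMOUNT DECIMAL(8,3), QTY INTEGER)
 WHERE T1.QTY > 100;
```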

Example 1: Read an HDFS file whose URL is http://hdfssrv.svl.ibm.com:8080. The input file delimiter is a comma. Use the default authentication port. Each record in the input file has two fields: a DECIMAL(8,3) field and an INTEGER field.
SELECT * FROM TABLE(
 HDFS_READ(
 'http://hdfssrv.svl.ibm.com:8080',
 'delimiter=, user=biadmin password=passw0rd'))
 AS T1(C1 DECIMAL(8,3), C2 INTEGER);
Example 2: Read an HDFS file whose URL is the location to which the output of a successful Jaql query is written. That location is specified by the return-string parameter of the JAQL_SUBMIT invocation that submits the Jaql query.
SELECT * FROM TABLE(
 HDFS_READ(
  JAQL_SUBMIT(
   '[[15.3, 16],[170.99,180]]-> 
    write(del(location=''/tmp/test1.csv''));',
   'http://hdfssrv.svl.ibm.com:14000/webhdfs/v1/tmp/test1.csv',
   'http://jaqlsrv.svl.ibm.com:8080',
   'timeout=60 user=biadmin password=passw0rd'),
  'user=biadmin password=passw0rd'))
 AS T1(C1 DECIMAL(8,3), C2 INTEGER);