HDFS_READ
The HDFS_READ function reads data from a delimiter-separated file in the Hadoop Distributed File System (HDFS).
The schema is SYSFUN.
- file-url
- An expression that specifies the server address and path of the input file in HDFS. file-url is a VARCHAR(512) value.
- options
- An expression that specifies a list of name=value pairs. Each pair must be separated from the following pair by a space character. options is a VARCHAR(256) value. options can contain any of the following name and value pairs:
- delimiter=delimiter-value
- Identifies the character that is used as the delimiter in the input file that is specified by file-url.
- user=user-value
- Specifies an IBM® InfoSphere® BigInsights® user name that has access to the input file that is specified by file-url.
- password=password-value
- Specifies the password for the IBM InfoSphere BigInsights user that is identified by user=user-value.
- authport=authport-value
- Specifies the port for form-based authentication of the input. The default is 8080.
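As an illustration of how the pairs combine into one options string, the following sketch sets all four documented pairs. The host name, file path, pipe delimiter, port 8081, and credentials are placeholder assumptions, not values from a real system.

```sql
-- Sketch only: host, path, delimiter, port, and credentials are hypothetical.
-- Each name=value pair is separated from the next by a single space.
SELECT * FROM TABLE(
  HDFS_READ(
    'http://hdfssrv.example.com:14000/webhdfs/v1/tmp/sales.psv',
    'delimiter=| user=biadmin password=passw0rd authport=8081'))
  AS T1(C1 DECIMAL(8,3), C2 INTEGER);
```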
The HDFS_READ function returns a table with one row for each record in the input file. HDFS_READ is a generic table function, which means that the columns in the returned table are defined when the function is referenced, instead of when the function is defined.
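A minimal sketch of what this means in practice: the column names and data types are supplied in the AS clause of the reference, so they can be chosen to fit each input file. The file URL, credentials, and three-column layout shown here are assumptions made for illustration.

```sql
-- Sketch only: the file URL, credentials, and column layout are hypothetical.
-- The column names and types come from the AS clause of this reference,
-- not from the definition of HDFS_READ.
SELECT T.ID, T.NAME
  FROM TABLE(
    HDFS_READ(
      'http://hdfssrv.example.com:14000/webhdfs/v1/tmp/customers.csv',
      'delimiter=, user=biadmin password=passw0rd'))
    AS T(ID INTEGER, NAME VARCHAR(40), BALANCE DECIMAL(10,2));
```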
Example 1: Read an HDFS file whose URL is http://hdfssrv.svl.ibm.com:8080. The input file delimiter is a comma. Use the default authentication port by omitting the authport pair. The records in the input file have two fields: a DECIMAL(8,3) field and an INTEGER field.
SELECT * FROM TABLE(
HDFS_READ(
'http://hdfssrv.svl.ibm.com:8080',
'delimiter=, user=biadmin password=passw0rd'))
AS T1(C1 DECIMAL(8,3), C2 INTEGER);
Example 2: Read an HDFS file whose URL is the location to which the output of a successful Jaql query is written. That location is specified by the return-string parameter of the JAQL_SUBMIT invocation that submits the Jaql query.
SELECT * FROM TABLE(
HDFS_READ(
JAQL_SUBMIT(
'[[15.3, 16],[170.99,180]]->
write(del(location=''/tmp/test1.csv''));',
'http://hdfssrv.svl.ibm.com:14000/webhdfs/v1/tmp/test1.csv',
'http://jaqlsrv.svl.ibm.com:8080',
'timeout=60 user=biadmin password=passw0rd'),
'user=biadmin password=passw0rd'))
AS T1(C1 DECIMAL(8,3), C2 INTEGER);
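Because HDFS_READ returns a table, its result can presumably be filtered and aggregated like any other table reference. The following sketch reuses the server and credentials from Example 1. The WHERE predicate and the result column names ROW_COUNT and C1_TOTAL are illustrative assumptions.

```sql
-- Sketch only: reuses the Example 1 inputs; the predicate and the
-- result column names are hypothetical.
SELECT COUNT(*) AS ROW_COUNT, SUM(T1.C1) AS C1_TOTAL
  FROM TABLE(
    HDFS_READ(
      'http://hdfssrv.svl.ibm.com:8080',
      'delimiter=, user=biadmin password=passw0rd'))
    AS T1(C1 DECIMAL(8,3), C2 INTEGER)
  WHERE T1.C2 > 0;
```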