Operator FTPReader


The FTPReader operator is a source operator that can scan a directory at a remote host or receive files from a remote host by using the FTP, FTPS, or SFTP protocols. If the operator works as a directory scanner, the contents of the directory can be received line by line. Specialized output functions return dedicated information about each directory entry, such as file name, size, or date.

If the operator works as a file source, the content of the file can be delivered either in binary format as a blob or in text format line by line.

The FTPReader operator must not be used inside a consistent region!

Summary

Ports
This operator has 2 input ports and 2 output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 15 parameters.

Required: host, path, protocol, username

Optional: connectionCloseMode, connectionTimeout, curlVerbose, filename, isDirReader, password, skipPASVIp, transferTimeout, useEPRT, useEPSV, usePORT

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single threaded execution context.

Input Ports

Ports (0)

This input port triggers a file transfer / directory scan. Typically the requested filename/directory is received from a stream attribute.

Properties

Ports (1)

This optional input port may be used to set/change the username and password.

Properties

Output Ports

Output Functions
GetData
blob Binary()
Get the data as a binary blob. This function must not be used if the operator works as a directory scanner (isDirReader : true). This function must not be used if one of the following output functions is used:
  • Line()
  • FileName()
  • FileSize()
  • FileDate()
  • FileUser()
  • FileGroup()
  • FileInfo()
  • IsFile()

One file may be transmitted in several blobs. The blob size is determined by the underlying library. An empty blob indicates an empty file.

rstring Line()

Get the data of the directory/file line by line. This requires that the received file is a text file. The line is delivered with the trailing newline character. A completely empty line indicates an empty file. This function must not be used if the function Binary() is used.

rstring Url()

This function returns a string with the URL, including the scheme, of the received file or scanned directory.

rstring FileName()

This function returns one file name of the scanned directory. If this function is used, the parameter isDirReader must be true.

uint64 FileSize()

This function returns the file size of one file in the scanned directory. If this function is used, the parameter isDirReader must be true.

rstring FileDate()

This function returns the file date string of one file in the scanned directory. If this function is used, the parameter isDirReader must be true.

rstring FileUser()

This function returns the file user of one file in the scanned directory. If this function is used, the parameter isDirReader must be true.

rstring FileGroup()

This function returns the file group of one file in the scanned directory. If this function is used, the parameter isDirReader must be true.

rstring FileInfo()

This function returns the file access rights of one file in the scanned directory. If this function is used, the parameter isDirReader must be true.

boolean IsFile()

This function returns true if this directory entry is a regular file (first character in file info equals '-'). If this function is used, the parameter isDirReader must be true.

int32 Sequence()

This function returns a sequence number of the output tuple. The sequence starts with 0 for each file/directory.
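Because one file may arrive in several blobs and Sequence() restarts at 0 for each file, a downstream operator can use the sequence number to detect the start of a new file. The following fragment is a minimal sketch under stated assumptions: the stream names Files and Chunks, the host, the path, and the credentials are hypothetical, and Files is assumed to carry an rstring attribute fileName.

```spl
// Assumes an upstream stream Files with an attribute "rstring fileName" (hypothetical).
stream<blob content, int32 sequence> Chunks as OUT = FTPReader(Files as IN) {
	param
		protocol : ftp;
		host : "ftp.example.com";   // hypothetical host
		path : "/incoming/";
		filename : IN.fileName;
		username : "demo";          // hypothetical credentials
		password : "demo";
	output OUT :
		content = Binary(),
		sequence = Sequence();
}

// Adds up the blob sizes of one file; the counter is reset when a new file starts.
() as ChunkLogger = Custom(Chunks as C) {
	logic
		state : mutable uint64 bytesInFile = 0ul;
		onTuple C : {
			if (C.sequence == 0) {
				// Sequence 0 marks the first tuple of a new file.
				bytesInFile = 0ul;
			}
			bytesInFile += blobSize(C.content);
			printStringLn("chunk " + (rstring) C.sequence + ", bytes so far: " + (rstring) bytesInFile);
		}
}
```

The sketch only counts bytes. The blob size is determined by the underlying library, so the number of tuples per file should not be relied on.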

uint32 NoTransfers()

Deprecated: This function returns the number of completed ftp transfers. Use function TransferCount() instead.

uint32 TransferCount()

This function returns the number of completed ftp transfers.

uint32 NoTransferFailures()

Deprecated: This function returns the number of failed ftp transfers. Use function TransferFailureCount() instead.

uint32 TransferFailureCount()

This function returns the number of failed ftp transfers.

uint64 NoBytesTransferred()

Deprecated: This function returns the number of bytes transferred in successfully completed transfers. Use function BytesTransferred() instead.

uint64 BytesTransferred()

This function returns the number of bytes transferred in successfully completed transfers.

float64 TransferSpeed()

This function returns the transfer speed in bytes per second of the last transfer. The value is valid after the end of the file has been reached.

<any T> T AsIs(T)

Return the argument unchanged.

GetError
int32 Error()

Deprecated: Get the error number. Use function ErrorCode() instead.

int32 ErrorCode()

Get the error number.

rstring ErrorText()

Get the error description.

rstring Url()

This function returns a string with the URL, including the scheme, of the received file or scanned directory.

uint32 NoTransfers()

Deprecated: This function returns the number of completed ftp transfers. Use function TransferCount() instead.

uint32 TransferCount()

This function returns the number of completed ftp transfers.

uint32 NoTransferFailures()

Deprecated: This function returns the number of failed ftp transfers. Use function TransferFailureCount() instead.

uint32 TransferFailureCount()

This function returns the number of failed ftp transfers.

uint64 NoBytesTransferred()

Deprecated: This function returns the number of bytes transferred in successfully completed transfers. Use function BytesTransferred() instead.

uint64 BytesTransferred()

This function returns the number of bytes transferred in successfully completed transfers.

<any T> T AsIs(T)

Return the argument unchanged.

Ports (0)

This mandatory port emits the file/directory content. The GetData output functions must be applied to this port.

Assignments
This port set allows any SPL expression of the correct type to be assigned to output attributes. Attributes not assigned in the output clause will be automatically assigned from the attributes of the input ports that have the same name and type. If there is no such input attribute, an error is reported at compile-time.

Properties

Ports (1)

This optional port may be used to carry error information and diagnostics. The GetError output functions may be applied to this port. If no output assignment is applied, the output stream must have a single attribute of type rstring.
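An invocation that uses both output ports might look like the following sketch. It applies the GetData functions to port 0 and the GetError functions to port 1. The stream names, host, path, and credentials are hypothetical, and the input stream Files is assumed to carry an rstring attribute fileName.

```spl
// Assumes an upstream stream Files with an attribute "rstring fileName" (hypothetical).
(stream<rstring line, int32 sequence> Lines as OUT;
 stream<int32 code, rstring text, rstring url> Errors as ERR) = FTPReader(Files as IN) {
	param
		protocol : ftp;
		host : "ftp.example.com";   // hypothetical host
		path : "/incoming/";
		filename : IN.fileName;
		username : "demo";          // hypothetical credentials
		password : "demo";
	output
		// Port 0: file content, one tuple per line.
		OUT :
			line = Line(),
			sequence = Sequence();
		// Port 1: error information and diagnostics.
		ERR :
			code = ErrorCode(),
			text = ErrorText(),
			url = Url();
}
```

The non-deprecated functions ErrorCode() and ErrorText() are used here instead of Error().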

Assignments
This port set allows any SPL expression of the correct type to be assigned to output attributes.

Properties

Parameters

This operator supports 15 parameters.

Required: host, path, protocol, username

Optional: connectionCloseMode, connectionTimeout, curlVerbose, filename, isDirReader, password, skipPASVIp, transferTimeout, useEPRT, useEPSV, usePORT

connectionCloseMode

This optional parameter controls when the connection is closed after a transfer. The parameter takes one value of type ConnectionCloseMode. Default value is never.

Properties

connectionTimeout

This is the maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase; once the connection is established, this option has no further effect. Set to zero to switch to the default built-in connection timeout of 120 seconds. See also the transferTimeout parameter.

Properties

curlVerbose

Verbose mode for the curl library. Default value is false. The curl information is sent to stderr.

Properties

filename

The filename part of the file/directory if the path does not contain a filename part.

Properties

host

Hostname or IP address of the remote host in form hostname[:port].

Properties

isDirReader

If this parameter is true, the operator acts as a directory scanner and each directory entry produces one line at the output. Special output functions to get the properties of the directory entry are available in this case. Default is false.
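The two operating modes can be chained: one FTPReader scans a directory, and a second one fetches each regular file that the scan reports. The composite below is a minimal sketch of that pattern, not a definitive implementation. The composite name, stream names, host, path, and credentials are hypothetical.

```spl
use com.ibm.streamsx.inet.ftp::FTPReader;

composite ScanThenRead {
	graph
		// A single trigger tuple starts one directory scan.
		stream<int32 tick> Trigger = Beacon() {
			param iterations : 1u;
		}

		// Directory scanner: one output tuple per directory entry.
		stream<rstring fileName, boolean isFile> Entries as OUT = FTPReader(Trigger) {
			param
				protocol : ftp;
				isDirReader : true;
				host : "ftp.example.com";   // hypothetical host
				path : "/incoming/";
				username : "demo";          // hypothetical credentials
				password : "demo";
			output OUT :
				fileName = FileName(),
				isFile = IsFile();
		}

		// Keep regular files only; subdirectories and links are dropped.
		stream<Entries> RegularFiles = Filter(Entries) {
			param filter : isFile;
		}

		// File source: isDirReader keeps its default (false).
		stream<rstring line, int32 sequence> Lines as OUT = FTPReader(RegularFiles as IN) {
			param
				protocol : ftp;
				host : "ftp.example.com";
				path : "/incoming/";
				filename : IN.fileName;
				username : "demo";
				password : "demo";
			output OUT :
				line = Line(),
				sequence = Sequence();
		}
}
```

Filtering on IsFile() before the second operator avoids requesting directory entries that are not regular files.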

Properties

password

The password for the ftp user. If the operator has two input ports and this parameter is an attribute parameter, it must be supplied from the second input port.

Properties

path

The path of the remote file/directory. The path may contain a filename part. If the filename part is omitted, the parameter must end with a '/'. The path should begin with '/'. In case of the ftp protocols, the path is relative to the home directory of the user and may depend on the server configuration; an absolute path must start with '//'. In case of sftp, the path is an absolute path. A path relative to the user's home directory may be entered in the form '~/'.
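The path rules for the ftp protocol can be illustrated with the following sketch. The stream names, host, directory names, file name, and credentials are hypothetical, and the upstream stream Trigger is assumed to exist. The commented lines show alternative spellings of the same parameter.

```spl
// Assumes an upstream trigger stream named Trigger (hypothetical).
stream<rstring line> Lines as OUT = FTPReader(Trigger) {
	param
		protocol : ftp;
		host : "ftp.example.com";   // hypothetical host
		username : "demo";          // hypothetical credentials
		password : "demo";
		// Path with a filename part, relative to the user's home directory on the server.
		path : "/incoming/report.txt";
		// Alternative: absolute path on the server (ftp protocol), starting with '//'.
		//   path : "//srv/ftp/incoming/report.txt";
		// Alternative: path without a filename part must end with '/';
		// the file name is then given separately.
		//   path : "/incoming/";
		//   filename : "report.txt";
	output OUT :
		line = Line();
}
```

How a relative path is resolved may depend on the server configuration, so the absolute form is the less ambiguous choice when the server layout is known.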

Properties

protocol

Protocol and encryption properties of the connection. This parameter takes one value of type Protocol.

Properties

skipPASVIp

If set to true, it instructs libcurl to not use the IP address the server suggests in its 227-response to libcurl's PASV command when libcurl connects the data connection. Instead libcurl will re-use the same IP address it already uses for the control connection. But it will use the port number from the 227-response.

Properties

transferTimeout

This is the maximum time in seconds that you allow the libcurl transfer operation to take. Normally, name lookups can take a considerable time, and limiting operations to less than a few minutes risks aborting perfectly normal operations. Default timeout is 0 (zero), which means it never times out.

Properties

useEPRT

If the value is true, it tells curl to use the EPRT (and LPRT) command when doing active FTP downloads (which is enabled by ftpPORT). The default is true. Using EPRT means that it will first attempt to use EPRT and then LPRT before using PORT, but if you pass false to this option, it will not try using EPRT or LPRT, only plain PORT. If the server is an IPv6 host, this option will have no effect as of libcurl 7.12.3.

Properties

useEPSV

If the value is true, it tells curl to use the EPSV command when doing passive FTP downloads. The default is true. Using EPSV means that it will first attempt to use EPSV before using PASV, but if you pass false to this option, it will not try using EPSV, only plain PASV. If the server is an IPv6 host, this option will have no effect.

Properties

usePORT

This parameter is used to get the IP address to use for the FTP PORT instruction. The PORT instruction tells the remote server to connect to our specified IP address. The string may be a plain IP address, a host name, a network interface name, or just a '-' symbol to let the library use your system's default IP address. Default FTP operations are passive, and thus won't use PORT. The address can be followed by a ':' to specify a port, optionally followed by a '-' to specify a port range. If the port specified is 0, the operating system will pick a free port. If a range is provided and none of the ports in the range is available, libcurl will report CURLE_FTP_PORT_FAILED for the handle. Invalid port/range settings are ignored. IPv6 addresses followed by a port or port range have to be in brackets. IPv6 addresses without port/range specifier can be in brackets. (added in libcurl 7.19.5)

Examples with specified ports:

  • eth0:0
  • 192.168.1.2:32000-33000
  • curl.se:32123
  • [::1]:1234-4567

You disable PORT again and go back to using the passive version by setting this option to an empty string.
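As an illustration, an active-mode transfer might be configured as in the following sketch. The stream names, host, path, and credentials are hypothetical, and the upstream stream Trigger is assumed to exist. Whether plain PORT is preferable to EPRT depends on the server and network, so the useEPRT setting is only one possible choice.

```spl
// Assumes an upstream trigger stream named Trigger (hypothetical).
stream<rstring line> Lines as OUT = FTPReader(Trigger) {
	param
		protocol : ftp;
		host : "ftp.example.com";   // hypothetical host
		path : "/incoming/report.txt";
		username : "demo";          // hypothetical credentials
		password : "demo";
		// "-" lets the library use the system's default IP address for the PORT instruction.
		usePORT : "-";
		// Do not try EPRT or LPRT first; send plain PORT only.
		useEPRT : false;
	output OUT :
		line = Line();
}
```

A port range such as "eth0:32000-33000" can be given instead of "-" when a firewall only admits incoming data connections on specific ports.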

Properties

username

The ftp username. If the operator has two input ports and this parameter is an attribute parameter, it must be supplied from the second input port.

Properties

Code Templates

FTPReader-DirectoryScanner


stream<rstring fileName, uint64 size, rstring date, rstring user, boolean isFile> ${FilenameStream} as OUT = FTPReader(${TriggerStream}) {
	param
		protocol : ftp;
		isDirReader : true;
		host : "${host}";
		path : "${path}/";
		username : "${username}";
		password : "${password}";
	output OUT :
		fileName = FileName(),
		size = FileSize(),
		date = FileDate(),
		user = FileUser(),
		isFile = IsFile();
}
      

FTPReader-TextFileReader


stream<rstring line, int32 sequence> ${FileStream} as OUT = FTPReader(${FilenameStream} as IN) {
	param
		protocol : ftp;
		isDirReader : false;
		host : "${host}";
		path : "/${path}/";
		filename : IN.fileName;
		username : "${username}";
		password : "${password}";
	output OUT :
		line = Line(),
		sequence = Sequence();
}
      

FTPReader-BinaryFileReader


stream<blob content, int32 sequence> ${FileStream} as OUT = FTPReader(${FilenameStream} as IN; ${PasswordStream} as PWD) {
	param
		protocol : ftp;
		host : "${host}";
		path : "/${path}/";
		filename : IN.fileName;
		username : "${username}";
		password : PWD.password;
	output OUT :
		content = Binary(),
		sequence = Sequence();
}
      

Libraries

curl lib
Library Name: curl
FTP wrapper lib
Library Name: inettoolkit
Library Path: ../../impl/lib
Include Path: ../../impl/include