Pattern matching based on wildcard characters

In read mode, you can use wildcard characters in the File name property. The connector uses the glob syntax for pattern matching.

Wildcard characters

You can use any of the following wildcard characters:

*
Specify an asterisk (*) to match zero or more characters. For example, specify *.txt to match all files with the .txt extension.
?
Specify a question mark (?) to match one character.
[range]
Specify a range to match a single character from a set or a range of characters. For example, specify the range [a-f] to match any one of the characters a, b, c, d, e, or f.
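The three wildcards can be tried out with Python's fnmatch module, which implements the same basic glob rules. This is only an illustration: the file names are hypothetical, and the connector's own matcher may differ in details that this page does not describe.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
names = ["a.txt", "b.txt", "g.txt", "ab.txt", "notes.csv"]

def matching(pattern, candidates):
    """Return the candidates that match a glob pattern (case-sensitive)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

star = matching("*.txt", names)        # * matches zero or more characters
single = matching("?.txt", names)      # ? matches exactly one character
ranged = matching("[a-f].txt", names)  # [a-f] matches one character from a to f

print(star)    # ['a.txt', 'b.txt', 'g.txt', 'ab.txt']
print(single)  # ['a.txt', 'b.txt', 'g.txt']
print(ranged)  # ['a.txt', 'b.txt']
```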

If multiple files match the pattern, the connector reads each file, one after another. When jobs run in parallel, each processing node reads a different part of the matching file or files.
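One way to picture a parallel read is to give every node its own byte range of each matching file. The sketch below is an assumption made for illustration, not the connector's documented algorithm: the function name split_ranges, the file names, and the sizes are hypothetical, and record boundaries are ignored.

```python
def split_ranges(file_sizes, node_count):
    """Divide each file into contiguous byte ranges, one per node.

    Returns {node_index: [(file_name, start, end), ...]}, with end exclusive.
    Assumption: a simple even split that does not align to record boundaries.
    """
    plan = {node: [] for node in range(node_count)}
    for name, size in file_sizes.items():
        for node in range(node_count):
            start = size * node // node_count
            end = size * (node + 1) // node_count
            if end > start:  # skip empty ranges for very small files
                plan[node].append((name, start, end))
    return plan

# Hypothetical matching files and their sizes in bytes.
sizes = {"part-0.txt": 100, "part-1.txt": 10}
plan = split_ranges(sizes, 4)

for node, ranges in plan.items():
    print(node, ranges)
```

Each byte of every file is assigned to exactly one node, so no part is read twice and none is skipped.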

Directory-level wildcard characters

This feature supports reading files from partitioned Hive tables and works only in WebHDFS and HttpFS modes. You can use any of the wildcard characters in a directory name only if the directory name contains the equal sign (=).

*
Specify an asterisk (*) to match zero or more characters. For example, specify c3=* to match the directories for all values of the partition column c3.
?
Specify a question mark (?) to match one character.
[range]
Specify a range to match a single character from a set or a range of characters. For example, specify the range [a-f] to match any one of the characters a, b, c, d, e, or f.
Examples:
/apps/hive/warehouse/*/*/* is not supported because the directory names that contain wildcards do not contain the equal sign (=).
/apps/hive/warehouse/c3=[a-z]/c?=*/* is supported.
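The rule behind these two paths can be checked before a job runs. The following is a minimal sketch, assuming that the last path segment is the file name and that only directory segments are subject to the equal sign requirement. The helper name unsupported_directories is hypothetical and is not part of the connector.

```python
import re

# A segment uses a wildcard if it contains *, ?, or an opening bracket.
WILDCARD = re.compile(r"[*?\[]")

def unsupported_directories(path):
    """Return the directory segments that use a wildcard but contain no '='."""
    # Assumption: everything before the last "/" is a directory name.
    *directories, _file_name = path.split("/")
    return [d for d in directories if WILDCARD.search(d) and "=" not in d]

bad = unsupported_directories("/apps/hive/warehouse/*/*/*")
good = unsupported_directories("/apps/hive/warehouse/c3=[a-z]/c?=*/*")

print(bad)   # ['*', '*']
print(good)  # []
```

An empty list means that every directory name with a wildcard also contains the equal sign.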
Sample use cases:
A directory name can contain multiple occurrences of a single wildcard or a concatenation of different wildcards, as shown in the following paths:
/user/hive/warehouse/testmultipart2/??=[a-z]*/*=[a-z][a-z]?/[0-9]* 
/user/hive/warehouse/testmultipart2/?3=test1/?[3-4]=ex[1-2]/* 
/user/hive/warehouse/testmultipart2/c3=test?/c4=ex?/* 
/user/hive/warehouse/testmultipart2/c3=test?/c4=ex[0,1]/* 
/user/hive/warehouse/testmultipart2/[a-z][0-9]=[a-z,0-9]*/c?=ex[2]/*
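To see which partition directories such a pattern selects, the pattern can be compared with candidate paths one segment at a time. This sketch uses Python's fnmatch module and assumes that a wildcard never matches across a "/" separator. The partition layout and the data file name 000000_0 are hypothetical, and path_matches is an illustrative helper, not a connector function.

```python
from fnmatch import fnmatchcase

def path_matches(pattern, path):
    """Match a path against a pattern one segment at a time.

    Each wildcard is confined to its own segment, so * never crosses a "/".
    """
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatchcase(part, pat) for pat, part in zip(pattern_parts, path_parts))

pattern = "/user/hive/warehouse/testmultipart2/c3=test?/c4=ex?/*"

# Hypothetical partition layout, used only for illustration.
paths = [
    "/user/hive/warehouse/testmultipart2/c3=test1/c4=ex1/000000_0",
    "/user/hive/warehouse/testmultipart2/c3=test2/c4=ex2/000000_0",
    "/user/hive/warehouse/testmultipart2/c3=test10/c4=ex1/000000_0",
    "/user/hive/warehouse/testmultipart2/c3=prod1/c4=ex1/000000_0",
]

matched = [p for p in paths if path_matches(pattern, p)]
for p in matched:
    print(p)
```

Only the first two paths match: c3=test10 has two characters where the single ? allows one, and c3=prod1 does not start with test. In Python's fnmatch, a comma inside brackets is a literal member of the set, so ex[0,1] would also match "ex,". Whether the connector treats the comma the same way is not stated on this page.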
The path that you specify in the File name property can be absolute or relative. Wildcards are supported in both:
Relative path: testRel/c3=*/c4=*/*.txt
Absolute path: /user/hdfs/testRel/c3=1/c4=2/2.txt