date formats

Specifying the format of a date field that is passed as a parameter to a conversion.

Four conversions, string_from_date, ustring_from_date, date_from_string, and ustring_from_date, take as a parameter of the conversion a date format or a date uformat. These formats are described below. The default format of the date contained in the string is yyyy-mm-dd.

The format string requires that you provide enough information for InfoSphere® DataStage® to determine a complete date (either day, month, and year, or year and day of year).

date uformat

The date uformat provides support for international components in date fields. It's syntax is:


String%macroString%macroString%macroString

where %macro is a date formatting macro such as %mmm for a 3-character English month. See the following table for a description of the date format macros. Only the String components of date uformat can include multi-byte Unicode characters.

date format

The format string requires that you provide enough information for InfoSphere DataStage to determine a complete date (either day, month, and year, or year and day of year).

The format_string can contain one or a combination of the following elements:

Table 1. Date format tags
Tag Variable width availability Description Value range Options
%d import Day of month, variable width 1...31 s
%dd   Day of month, fixed width 01...31 s
%ddd with v option Day of year 1...366 s, v
%m import Month of year, variable width 1...12 s
%mm   Month of year, fixed width 01...12 s
%mmm   Month of year, short name, locale specific Jan, Feb ... t, u, w
%mmmm import/export Month of year, full name, locale specific January, February ... t, u, w, -N, +N
%yy   Year of century 00...99 s
%yyyy   Four digit year 0001 ...9999  
%NNNNyy   Cutoff year plus year of century yy = 00...99 s
%e   Day of week, Sunday = day 1 1...7  
%E   Day of week, Monday = day 1 1...7  
%eee   Weekday short name, locale specific Sun, Mon ... t, u, w
%eeee import/export Weekday long name, locale specific Sunday, Monday ... t, u, w, -N, +N
%W import Week of year (ISO 8601, Mon) 1...53 s
%WW   Week of year (ISO 8601, Mon) 01...53 s

When you specify a date format string, prefix each component with the percent symbol (%) and separate the string's components with a suitable literal character.

The default date_format is %yyyy-%mm-%dd.

Where indicated the tags can represent variable-width data elements. Variable-width date elements can omit leading zeroes without causing errors.

The following options can be used in the format string where indicated in the table:

s
Specify this option to allow leading spaces in date formats. The s option is specified in the form:
%(tag,s)
Where tag is the format string. For example:
%(m,s)
indicates a numeric month of year field in which values can contain leading spaces or zeroes and be one or two characters wide. If you specified the following date format property:
%(d,s)/%(m,s)/%yyyy
Then the following dates would all be valid:
8/ 8/1958
08/08/1958
8/8/1958
v
Use this option in conjunction with the %ddd tag to represent day of year in variable-width format. So the following date property:
%(ddd,v)
represents values in the range 1 to 366. (If you omit the v option then the range of values would be 001 to 366.)
u
Use this option to render uppercase text on output.
w
Use this option to render lowercase text on output.
t
Use this option to render titlecase text (initial capitals) on output.
The u, w, and t options are mutually exclusive. They affect how text is formatted for output. Input dates will still be correctly interpreted regardless of case.
-N
Specify this option to left justify long day or month names so that the other elements in the date will be aligned.
+N
Specify this option to right justify long day or month names so that the other elements in the date will be aligned.
Names are left justified or right justified within a fixed width field of N characters (where N is between 1 and 99). Names will be truncated if necessary. The following are examples of justification in use:

%dd-%(mmmm,-5)-%yyyyy

21-Augus-2006

%dd-%(mmmm,-10)-%yyyyy

21-August    -2005

%dd-%(mmmm,+10)-%yyyyy

21-    August-2005

The locale for determining the setting of the day and month names can be controlled through the locale tag. This has the format:

%(L,'locale')

Where locale specifies the locale to be set using the language_COUNTRY.variant naming convention supported by ICU. The default locale for month names and weekday names markers is English unless overridden by a %L tag or the APT_IMPEXP_LOCALE environment variable (the tag takes precedence over the environment variable if both are set).

Use the locale tag in conjunction with your time format, for example the format string:

%(L,'es')%eeee, %dd %mmmm %yyyy

Specifies the Spanish locale and would result in a date with the following format:

miércoles, 21 septembre 2005

The format string is subject to the restrictions laid out in the following table. A format string can contain at most one tag from each row. In addition some rows are mutually incompatible, as indicated in the 'incompatible with' column. When some tags are used the format string requires that other tags are present too, as indicated in the 'requires' column.
Table 2. Format tag restrictions
Element Numeric format tags Text format tags Requires Incompatible with
year %yyyy, %yy, %[nnnn]yy - - -
month %mm, %m %mmm, %mmmm year week of year
day of month %dd, %d - month day of week, week of year
day of year %ddd   year day of month, day of week, week of year
day of week %e, %E %eee, %eeee month, week of year day of year
week of year %WW   year month, day of month, day of year

When a numeric variable-width input tag such as %d or %m is used, the field to the immediate right of the tag (if any) in the format string cannot be either a numeric tag, or a literal substring that starts with a digit. For example, all of the following format strings are invalid because of this restriction:

%d%m-%yyyy

%d%mm-%yyyy

%(d)%(mm)-%yyyy

%h00 hours

The year_cutoff is the year defining the beginning of the century in which all two-digit years fall. By default, the year cutoff is 1900; therefore, a two-digit year of 97 represents 1997.

You can specify any four-digit year as the year cutoff. All two-digit years then specify the next possible year ending in the specified two digits that is the same or greater than the cutoff. For example, if you set the year cutoff to 1930, the two-digit year 30 corresponds to 1930, and the two-digit year 29 corresponds to 2029.

On import and export, the year_cutoff is the base year.

This property is mutually exclusive with days_since, text, and julian.

You can include literal text in your date format. Any Unicode character other than null, backslash, or the percent sign can be used (although it is better to avoid control codes and other non-graphic characters). The following table lists special tags and escape sequences:
Tag Escape sequence
%% literal percent sign
\% literal percent sign
\n newline
\t horizontal tab
\\ single backslash

For example, the format string %mm/%dd/%yyyy specifies that slashes separate the string's date components; the format %ddd-%yy specifies that the string stores the date as a value from 1 to 366, derives the year from the current year cutoff of 1900, and separates the two components with a dash (-).

The diagram shows the modification of a date field to three integers. The modify operator takes:

  • The day of the month portion of a date field and writes it to an 8-bit integer
  • The month portion of a date field and writes it to an 8-bit integer
  • The year portion of a date field and writes it to a 16-bit integer
    Shows the modification of a date field to output three integer fields representing the date

Use the following osh command:


$ osh "...| modify 'dayField = month_day_from_date(dField);
            monthField = month_from_date(dField);
            yearField = year_from_date(dField);' | ..."