Parses a string and extracts a date from it.
Synopsis
string viv:parse-date(date-str, format, timezone, range);
string date-str;
string format;
string timezone;
boolean range;
Arguments
- date-str: the date string to parse.
- format: a format string identifying the format of the date supplied
as the first argument. If not supplied, the function tries each of the automatically
recognized date formats listed in the Description section.
- timezone: a string identifying the time zone to use. This can be
either a numeric time zone offset or a time zone abbreviation, as specified earlier in
this reference document. If not specified, the default value for timezone is
'localtime'.
- range: a Boolean value that, when true, restricts valid dates to those
that fit in a 32-bit value. The default value is false(), which does not restrict the
date range to 32-bit values. When the 32-bit check is enabled, dates outside the
32-bit range, like any other invalid date, return a value of 0.
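The effect of the range argument can be illustrated with a minimal Python sketch. It is not the engine's implementation: the helper name apply_range_check is hypothetical, and treating the "32-bit value" as a signed 32-bit integer is an assumption.

```python
# Bounds of a signed 32-bit integer (assumed interpretation of "32-bit value").
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def apply_range_check(epoch_seconds: int, restrict_to_32_bit: bool = False) -> int:
    """Return epoch_seconds, or 0 when the 32-bit check is on and the value does not fit."""
    if restrict_to_32_bit and not (INT32_MIN <= epoch_seconds <= INT32_MAX):
        return 0
    return epoch_seconds


print(apply_range_check(784080000, True))    # 1994-11-06 UTC fits, so it is returned unchanged
print(apply_range_check(4102444800, True))   # 2100-01-01 UTC does not fit, so 0 is returned
print(apply_range_check(4102444800, False))  # check disabled, value returned unchanged
```

With the check disabled (the default), a date such as 2100-01-01 keeps its full value.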
Description
Attempts to parse a string and extract a date from it. The value returned by this function
is the extracted date represented as the number of seconds since 00:00:00 on January 1,
1970, Coordinated Universal Time (UTC).
The optional format string specifies how the date should be parsed.
The format specifiers are intended to be compatible with PHP or C strftime() format
strings. This function supports the following codes:
- %a or %A - The day-of-the-week name, either abbreviated (e.g. "Mon")
or full (e.g. "Monday").
- %b, %B, or %h - The month name, either abbreviated (e.g.
"Jan") or full (e.g. "January").
- %c - A shortcut for the time and date in %a %b %e %H:%M:%S %Y format,
e.g. Sat Jul 7 15:30:45 2007.
- %C - The century (00-99).
- %d or %e - The day of the month (1-31), with or without leading
zeros.
- %D - A shortcut for the date in %m/%d/%y form, e.g. 03/19/06.
- %F - A shortcut for the date in %Y-%m-%d form, e.g. 2004-09-25.
- %H - The hour in 24-hour format (00-23), as a 2-digit number with leading
zeros.
- %I - The hour in 12-hour format (01-12), as a 2-digit number with leading
zeros. Requires %p, the "AM/PM" specifier, to be included in the format
string.
- %j - The day number in the year (1-366), with or without leading zeros.
- %k - The hour in 24-hour format (0-23), as a 1- or 2-digit number without
leading zeros.
- %l - The hour in 12-hour format (1-12), as a 1- or 2-digit number without
leading zeros. Requires %p, the "AM/PM" specifier, to be included in the format
string.
- %m - The month number (1-12), with or without leading zeros.
- %M - The minute (0-59), with or without leading zeros.
- %n - The newline character.
- %p - "AM" or "PM", or "am" or "pm". Required when using 12-hour format.
- %r - A shortcut for the time in 12-hour format: %I:%M:%S %p, e.g.
10:40:22 PM.
- %R - A shortcut for the 24-hour time without seconds: %H:%M.
- %s - The number of seconds since the Unix Epoch (00:00:00 UTC on 1 January
1970).
- %S - The seconds (0-61), with or without leading zeros.
- %t - The tab character.
- %T - A shortcut for the 24-hour time with seconds: %H:%M:%S, e.g.
23:59:59.
- %u - The weekday number (1-7), with or without leading zeros, where 1 is
Monday.
- %U - The week number in the year (0-53), with or without leading zeros, where
Sunday is the first day of the week.
- %w - The weekday number (0-6), with or without leading zeros, where 0 is
Sunday.
- %W - The week number in the year (0-53), with or without leading zeros, where
Monday is the first day of the week.
- %y - The 2-digit year, with or without leading zeros, where 69-99 refer to
1969-1999 and 00-68 refer to 2000-2068.
- %Y - The 4-digit year.
- %Z - The time zone. This may be specified either as a standard abbreviation
(listed below), or as a signed offset in hours (and optionally minutes).
- %% - The % character.
In the format string, the space character matches any number
of whitespace characters.
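Because the codes above follow strftime() conventions, Python's datetime.strptime, which accepts many of the same codes, can serve as a rough stand-in to show how a format string maps onto a date string. This is only a sketch: the helper name parse_utc is hypothetical, it treats every parsed value as UTC (viv:parse-date defaults to 'localtime'), and Python does not support every code listed here.

```python
from datetime import datetime, timezone


def parse_utc(date_str: str, fmt: str) -> int:
    """Parse date_str with a strftime-style format and return epoch seconds, assuming UTC."""
    parsed = datetime.strptime(date_str, fmt)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# %m/%d/%y is the expansion of %D; the 2-digit year 94 falls in 69-99, so it means 1994.
print(parse_utc("11/06/94", "%m/%d/%y"))

# %Y-%m-%d %H:%M:%S is the expansion of "%F %T".
print(parse_utc("1994-11-06 20:49:37", "%Y-%m-%d %H:%M:%S"))

# The 12-hour code %I is paired with %p so that 08 PM is read as 20:00.
print(parse_utc("1994-11-06 08:49:37 PM", "%Y-%m-%d %I:%M:%S %p"))
```

The last two calls describe the same instant, so they yield the same number of seconds.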
Numeric time zone offsets may be in formats such as +0200, -03:30,
+12, or -5. Supported time zone abbreviations are: GMT,
UT, UTC, Z, WET, WEST, BST,
ART, BRT, BRST, NST, NDT, AST,
ADT, CLT, CLST, EST, EDT, CST,
CDT, MST, MDT, PST, PDT, AKST,
AKDT, HST, HAST, HADT, SST, WAT,
CET, CEST, MET, MEZ, MEST, MESZ,
EET, EEST, CAT, SAST, EAT, MSK,
MSD, IST, SGT, KST, JST, GST,
NZST, and NZDT.
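The numeric offset forms listed above can be converted to a signed number of seconds along the following lines. This is a minimal Python sketch, not the engine's parser: the function name offset_to_seconds is hypothetical, and rejecting any form other than the four shown (for example a 3-digit offset) is an assumption.

```python
import re

# Matches a sign followed by HHMM or HH:MM, or by a bare 1- or 2-digit hour.
_OFFSET_RE = re.compile(r"^([+-])(?:(\d{2}):?(\d{2})|(\d{1,2}))$")


def offset_to_seconds(offset: str) -> int:
    """Convert an offset such as +0200, -03:30, +12, or -5 to seconds east of UTC."""
    match = _OFFSET_RE.match(offset)
    if match is None:
        raise ValueError(f"unsupported time zone offset: {offset!r}")
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2) or match.group(4))
    minutes = int(match.group(3) or 0)
    return sign * (hours * 3600 + minutes * 60)


for text in ("+0200", "-03:30", "+12", "-5"):
    print(text, offset_to_seconds(text))
```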
The default value for the format parameter is the empty string (""). If a
format string is not specified, the following date formats will be
automatically recognized:
- RFC 2616, section 3.3.1 (HTTP protocol), which supports:
- Sun, 06 Nov 1994 08:49:37 GMT
- Sunday, 06-Nov-94 08:49:37 GMT
- Sun Nov 6 08:49:37 1994
- Dates without week day name:
- 06 Nov 1994 08:49:37 GMT
- 06-Nov-94 08:49:37 GMT
- Nov 6 08:49:37 1994
- Dates without the time zone:
- 06 Nov 1994 08:49:37
- 06-Nov-94 08:49:37
- 1994-11-06 08:49:37 PM
- 1994-11-06 20:49:37
- 1994/11/06 08:49:37 PM
- 1994/11/06 20:49:37
- 11-06-1994 08:49:37 PM
- 11-06-1994 20:49:37
- 11/06/1994 08:49:37 PM
- 11/06/1994 20:49:37
- 19941106204937
- 1994-11-06T20:49:37
- Dates in a weird order:
- 1994 Nov 6 08:49:37
- GMT 08:49:37 06-Nov-94 Sunday
- 94 6 Nov 08:49:37
- Dates without times:
- 1994 Nov 6
- 06-Nov-94
- Sun Nov 6 94
- 1994/11/06
- 1994-11-06
- 19941106
- 11/06/1994
- 11-06-1994
- Dates with unusual separators:
- 1994.Nov.6
- Sun/Nov/6/94/GMT
- Time zones specified using RFC 822 style:
- Sun, 12 Sep 2004 15:05:58 -0700
- Sat, 11 Sep 2004 21:32:11 +0200
- Compact numerical date strings:
- 20040912 15:05:58 -0700
- 20040911 +0200
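One simple way to approximate this kind of automatic recognition is to try a list of candidate formats in turn and return 0 when none matches. The Python sketch below does this for a small subset of the formats above. It is not how the engine is known to work: the names CANDIDATE_FORMATS and guess_date are hypothetical, and treating input without a time zone as UTC is an assumption (viv:parse-date defaults to 'localtime').

```python
from datetime import datetime, timezone

# A small, illustrative subset of the formats listed above (not the engine's actual table).
CANDIDATE_FORMATS = [
    "%a, %d %b %Y %H:%M:%S %z",  # Sun, 12 Sep 2004 15:05:58 -0700
    "%Y-%m-%dT%H:%M:%S",         # 1994-11-06T20:49:37
    "%Y-%m-%d %H:%M:%S",         # 1994-11-06 20:49:37
    "%Y%m%d%H%M%S",              # 19941106204937
    "%Y-%m-%d",                  # 1994-11-06
    "%Y%m%d",                    # 19941106
]


def guess_date(date_str: str) -> int:
    """Return epoch seconds for the first matching format, or 0 if nothing matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(date_str, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
        if parsed.tzinfo is None:
            # Assumption: input without a time zone is interpreted as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return int(parsed.timestamp())
    return 0


print(guess_date("Sun, 12 Sep 2004 15:05:58 -0700"))
print(guess_date("1994-11-06T20:49:37"))
print(guess_date("not a date"))  # unparseable input yields 0
```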
Returns
The number of seconds since 00:00:00 on January 1, 1970, Coordinated Universal Time (UTC), or
0 if the input argument could not be parsed.
Example
Input Example:
<process-xsl>
<xsl:variable name="datesec" select="viv:parse-date('06-Nov-94')" />
<datesec><xsl:value-of select="$datesec" /></datesec>
<date><xsl:value-of select="viv:seconds-to-local-date-time($datesec)" /></date>
<timezone><xsl:value-of select="viv:time-zone-name()" /></timezone>
</process-xsl>
Output Example:
<datesec>784080000</datesec>
<date>1994-11-05T20:00:00-04:00</date>
<timezone>EDT</timezone>
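The relationship between the datesec and date values in this output can be checked outside the engine. The following Python sketch uses only the standard library; the fixed -04:00 offset is taken from the output above rather than looked up from a time zone database.

```python
from datetime import datetime, timedelta, timezone

datesec = 784080000  # value of <datesec> in the output above

# The same instant rendered in UTC and at a fixed -04:00 offset.
utc = datetime.fromtimestamp(datesec, tz=timezone.utc)
local = datetime.fromtimestamp(datesec, tz=timezone(timedelta(hours=-4)))

print(utc.isoformat())    # 1994-11-06T00:00:00+00:00
print(local.isoformat())  # 1994-11-05T20:00:00-04:00
```

Midnight UTC on November 6 falls on the evening of November 5 at a -04:00 offset, which is why the local date is a day earlier than the input string.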
Known Issues
The viv:parse-date() function only supports 32-bit dates on 32-bit Linux
systems.
Notes
Fast-Indexing and the Date Type
When fast-indexing a field that contains a date, Watson Explorer Engine provides a date type
to automate conversion from time values expressed as strings. Fields in a fast index that
are declared as being of the date type automatically process their content using
viv:parse-date. For this reason, you should not attempt to use
viv:parse-date to process fast-indexed variables of type date.
As an example, assume that you had the following content:
<document xmlns:xi="http://www.w3.org/2003/XInclude">
<content name="last-update">Thu, 29 Mar 2007 15:20:01 +0100</content>
</document>
If you fast-index the last-update field as a date (last-update|date), the
indexer automatically fast-indexes the result of
viv:parse-date(content[@name="last-update"]), so the value that is
fast-indexed is an integer (1175178001). By default, when used within
fast-indexing, viv:parse-date restricts date values to 32 bits. To fast-index dates outside
this range, call viv:parse-date yourself and fast-index the resulting value as an
integer. Any searches on this field must then be expressed as integer values (or
converted into integer values before performing the search) so that the comparison
works correctly.
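The integer quoted for the last-update example can be reproduced independently. The Python sketch below uses the standard library's RFC 822 date parser in place of viv:parse-date, and it assumes the 32-bit restriction refers to a signed 32-bit integer.

```python
from email.utils import parsedate_to_datetime

last_update = "Thu, 29 Mar 2007 15:20:01 +0100"

# Parse the RFC 822 style date and convert it to seconds since the Unix Epoch.
seconds = int(parsedate_to_datetime(last_update).timestamp())
print(seconds)  # 1175178001

# Assumption: the 32-bit restriction means a signed 32-bit integer.
fits_in_32_bits = -2**31 <= seconds <= 2**31 - 1
print(fits_in_32_bits)  # True
```

A search against this field would compare against integers computed in the same way.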