Advanced parsing information

Advanced parsing includes parsing multiple strings, parsing with DBCS characters, and special cases where absolute and relative positional patterns do not work identically. Flow charts that depict a conceptual view of parsing are provided.

Parsing multiple strings

Only ARG and PARSE ARG can have more than one source string. To parse multiple strings, you can specify multiple comma-separated templates.

Here is an example:
parse arg template1, template2, template3

This instruction consists of the keywords PARSE ARG and three comma-separated templates. (For an ARG instruction, the source strings to parse come from arguments you specify when you call a program or CALL a subroutine or function.) Each comma is an instruction to the parser to move on to the next string.

Example:
/* Parsing multiple strings in a subroutine                      */
num='3'
musketeers="Porthos Athos Aramis D'Artagnan"
CALL Sub num,musketeers  /* Passes num and musketeers to sub     */
SAY total; say fourth /* Displays: "4" and "D'Artagnan"          */
EXIT

Sub:
parse arg subtotal, . . . fourth
total=subtotal+1
RETURN
When a REXX program is started as a command, only one argument string is recognized. You can pass multiple argument strings for parsing in the following cases:
  • When one REXX program calls another REXX program with the CALL instruction or a function call.
  • When programs written in other languages start a REXX program.
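
As a minimal sketch of the first case, a function call can pass two argument strings to an internal routine, which parses each string with its own template. The routine name JoinEnds and the sample data are hypothetical and are used only for illustration.

```rexx
/* Sketch: a function call that passes two argument strings      */
say JoinEnds('red green', 'blue')   /* Displays: "red-blue"      */
exit

JoinEnds:                           /* Hypothetical routine name */
parse arg first ., last             /* One template per string   */
return first || '-' || last
```

The period in the first template discards the second word of the first string, and the comma moves parsing on to the second string.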

If there are more templates than source strings, each variable in a leftover template receives a null string. If there are more source strings than templates, the language processor ignores leftover source strings. If a template is empty (two commas in a row) or contains no variable names, parsing proceeds to the next template and source string.
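
The following sketch illustrates these three cases. The routine name Demo, the variable names, and the argument values are hypothetical.

```rexx
/* Sketch: templates and source strings that do not pair up      */
call Demo 'one two', 'three', 'four'
exit

Demo:                               /* Hypothetical routine name */
/* Two templates, three strings: 'four' is ignored               */
parse arg w1 w2, s2
say '<' || w1 || '> <' || w2 || '> <' || s2 || '>'
                          /* Displays: "<one> <two> <three>"     */

/* Empty second template: 'three' is skipped                     */
parse arg s1, , s3
say '<' || s1 || '> <' || s3 || '>'
                          /* Displays: "<one two> <four>"        */

/* Four templates, three strings: extra gets the null string     */
parse arg ., ., s3, extra
say '<' || s3 || '> <' || extra || '>'
                          /* Displays: "<four> <>"               */
return
```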

Parsing with DBCS characters

Parsing with DBCS characters generally follows the same rules as parsing with SBCS characters. Literal strings and symbols can contain DBCS characters, but numbers must be in SBCS characters. See PARSE instruction with DBCS for examples of DBCS parsing.

Combining string and positional patterns: a special case

We have shown how parsing with a template that contains a string pattern skips over the data in the source string that matches the pattern (see Templates that contain string patterns). But a template that contains the following sequence does not skip over the matching data:
  • string pattern
  • variable name
  • relative positional pattern

A relative positional pattern moves relative to the first character matching a string pattern. As a result, assignment includes the data in the source string that matches the string pattern.

/* Template containing string pattern, then variable name, then  */
/*  relative positional pattern does not skip over any data.     */
string='REstructured eXtended eXecutor'
parse var string var1 3 junk 'X' var2 +1 junk 'X' var3 +1 junk
say var1||var2||var3 /* Concatenates variables; displays: "REXX" */
Here is how this template works:

|var1  3|   |junk 'X'|     |var2 +1|   |junk  'X'|   |var3 +1 |  | junk |
+-------+   +--------+     +-------+   +---------+   +--------+  +------+
    |           |              |            |            |          |
Put         Starting       Starting     Starting     Starting    Starting
characters  at 3, put      with first   with char-   with        with char-
1 through   characters     'X' put 1    acter after  second 'X'  acter
2 in var1.  up to (not     (+1)         first 'X'    put 1 (+1)  after sec-
(Stopping   including)     character    put up to    character   ond 'X'
point is    first 'X'      in var2.     second 'X'   in var3.    put rest
3.)         in junk.                    in junk.                 in junk.
 
var1='RE'   junk=          var2='X'     junk=        var3='X'    junk=
            'structured e'              'tended e'              'ecutor'
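
To make the difference concrete, the following sketch parses the same source string with three templates that each begin with the string pattern 'X'. The variable names after, skipped, and kept are hypothetical. In this string, the first 'X' is at position 15.

```rexx
/* Sketch: what follows the variable decides whether the matched */
/*  'X' is skipped or kept.                                      */
string='REstructured eXtended eXecutor'

/* String pattern, variable, string pattern: 'X' is skipped      */
parse var string 'X' after ' '
say after                 /* Displays: "tended"                  */

/* String pattern, variable, absolute positional pattern:        */
/*  'X' is skipped                                               */
parse var string 'X' skipped 18
say skipped               /* Displays: "te"                      */

/* String pattern, variable, relative positional pattern:        */
/*  'X' is kept                                                  */
parse var string 'X' kept +3
say kept                  /* Displays: "Xte"                     */
```

Only the relative positional pattern counts from the first character of the match, so only kept includes the 'X'.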

Flow charts showing the details of steps in parsing

The following figures provide a conceptual view of parsing. They do not include error cases.

The figures use the following terms:
  • string start is the beginning of the source string (or substring).
  • string end is the end of the source string (or substring).
  • length is the length of the source string.
  • match start is in the source string and is the first character of the match.
  • match end is in the source string. For a string pattern, it is the first character after the end of the match. For a positional pattern, it is the same as match start.
  • match position is in the source string. For a string pattern, it is the first matching character. For a positional pattern, it is the position of the matching character.
  • token is a distinct syntactic element in a template, such as a variable, a period, a pattern, or a comma.
  • value is the numeric value of a positional pattern. This can be either a constant or the resolved value of a variable.
Figure 1. Conceptual overview of parsing

              +----------------------------------------+                    
              V                                        |                    
    +--------------------------------+                 |                    
    |START                           |                 |                    
    |Token is first one in template. |                 |                    
    |Length=length(source string)    |                 |                    
    |Match start=1. Match end=1.     |                 |                    
    +--------------------------------+                 |                    
 +----------> |                                        |                    
 |            V                                        |                    
 |  +-------------------+yes +--------------------+    |                    
 |  |End of template?   |--->|Parsing complete.   |    |                    
 |  +-------------------+    +--------------------+    |                    
 |            V no                                     |                    
 |  +-------------------+                              |                    
 |  |CALL Find Next     |                              |                    
 |  | Pattern.          |                              |                    
 |  +-------------------+                              |                    
 |            V                                        |                    
 |  +-------------------+                              |                    
 |  |CALL Word Parsing. |                              |                    
 |  +-------------------+                              |                    
 |            V                                        |                    
 |  +-------------------+                              |                    
 |  |Step to next token.|                              |                    
 |  +-------------------+                              |                    
 |            V                                        |                    
 |  +-------------------+ yes +--------------------+   |                    
 |  |Token a comma?     |---->|Set next source     |   |                    
 |  +-------------------+     |string and template.|---+                    
 |            | no            +--------------------+                        
 +------------+
Figure 2. Conceptual view of finding next pattern
       +------------------------------------------------+
       V                                                |
+-------------+    +--------------------------------+   |
|Start:       |yes |String start=match end.         |   |
|End of       |--->|Match start=length + 1.         |   |
|template?    |    |Match end=length + 1. Return.   |   |
+-------------+    +--------------------------------+   |
      V no                                              |
+-------------+    +--------------------------------+   |
|Token period |yes |                                |   |
|or variable? |--->|Step to next token.             |---+
+-------------+    +--------------------------------+
      V no
+-------------+    +---------+    +----------+   +---------------------------------+
|Token a plus?|yes |Variable |yes |Resolve   |   |String start=match start.        |
|             |--->|form?    |--->|its value.|-->|Match start=min(length + 1,      |
+-------------+    +---------+    +----------+A  | match start + value).           |
      | no              | no                  |  |Match end=match start. Return.   |
      V                 +---------------------+  +---------------------------------+
+-------------+    +---------+    +----------+   +---------------------------------+
|Token a      |yes |Variable |yes |Resolve   |   |String start=match start.        |
|minus?       |--->|form?    |--->|its value.|-->|Match start=max(1, match         |
+-------------+    +---------+    +----------+A  | start - value).                 |
      | no              | no                  |  |Match end=match start. Return.   |
      V                 +---------------------+  +---------------------------------+
+-------------+    +---------+    +----------+   +---------------------------------+
|Token an     |yes |Variable |yes |Resolve   |   |String start=match end.          |
|equal?       |--->|form?    |--->|its value.|-->|Match start=min(length+1, value).|
+-------------+    +---------+    +----------+A  |Match end=match start. Return.   |
      | no              | no                  |  +---------------------------------+
      V                 +---------------------+
+-------------+    +-----------------------------------+
|Token a      |yes |String start=match end.            |
|number?      |--->|Match start=min(length+1, value).  |
+-------------+    |Match end=match start. Return.     |
      V no         +-----------------------------------+
+-------------+
|Token a lit- |yes
|eral string? |--------------------------+
+-------------+                          |
      | no                               |
      V                                  V
+-------------+    +----------+   +---------------+    +---------------------------+
|Token a var- |yes |Resolve   |   |Match found in |yes |String start=match end.    |
|iable string?|--->|its value.|-->|rest of string?|--->|Match start=match position.|
+-------------+    +----------+   +---------------+    |Match end=match position + |
      | no                               | no          | pattern length.  Return.  |
      |                                  V             +---------------------------+
      |                  +--------------------------------+
      |                  |String start=match end.         |
      |                  |Match start=length + 1.         |
      |                  |Match end=length + 1. Return.   |
      V                  +--------------------------------+
+-------------+          +--------------------------------+
|Token a      |yes       |Match start=length + 1.         |
| comma?      |--------->|Match end=length + 1. Return.   |
+-------------+          +--------------------------------+
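
The min and max expressions in Figure 2 mean that a positional pattern never moves before position 1 or past length + 1. The following sketch shows that limiting, along with the variable form of a relative positional pattern. The variable names and the sample string are hypothetical.

```rexx
/* Sketch: positional patterns stay within 1 to length + 1       */
text = 'abcdef'                     /* Length is 6               */

parse var text 3 part +100 tail     /* +100 stops at length + 1  */
say '<' || part || '> <' || tail || '>'
                          /* Displays: "<cdef> <>"               */

parse var text 4 . -100 head +2     /* -100 stops at position 1  */
say '<' || head || '>'    /* Displays: "<ab>"                    */

n = 2
parse var text 2 mid +(n)           /* Variable form of +2       */
say '<' || mid || '>'     /* Displays: "<bc>"                    */
```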
Figure 3. Conceptual view of word parsing
+-------------------------+    +------------------------+
|Start:  Match end <=     |no  |                        |
|        string start?    |--->|String end=match start. |
+-------------------------+    +------------------------+
            V yes
+-------------------------+
|String end=length + 1.   |
+-------------------------+
            V
+----------------------------------------------------------------------+
|Substring=substr(source string,string start,(string end-string start))|
|Token=previous pattern.                                               |
+----------------------------------------------------------------------+
            V <-----------------------------------------------+
+-------------------------+no                                 |
|Any more tokens?         |-------------+                     |
+-------------------------+             |                     |
            V yes                       |                     |
+-------------------------+             |                     |
|Step to next token.      |             |                     |
+-------------------------+             |                     |
            V                           V                     |
+-------------------------+no  +------------------------+     |
|Token a variable or a    |--->|Return.                 |     |
|period?                  |    +------------------------+     |
+-------------------------+                                   |
            V yes                                             |
+-------------------------+no                                 |
|Any more tokens?         |-------------+                     |
+-------------------------+             |                     |
            V yes                       V                     |
+-------------------------+    +------------------------+     |
|Next token a variable or | no |Assign rest of substring|     |
|period?                  |--->|to variable.            |     |
+-------------------------+    +------------------------+     |
            V yes                            +--------------->|
+-------------------------+ no +------------------------+     |
|Any substring left?      |--->|Assign null string to   |     |
+-------------------------+    |variable.               |     |
            V yes              +------------------------+     |
+-------------------------+                  +--------------->|
|Strip any leading blanks.|                                   |
+-------------------------+                                   |
            V                                                 |
+-------------------------+ no +------------------------+     |
|Any substring left?      |--->|Assign null string to   |     |
+-------------------------+    |variable.               |     |
            |                  +------------------------+     |
            V yes                            +--------------->|
+-------------------------+ no +------------------------+     |
|Blank found in substring?|--->|Assign rest of substring|     |
|                         |    |to variable.            |     |
+-------------------------+    +------------------------+     |
            V yes                            +--------------->|
+-----------------------------------------------------------+ |
|Assign word from substring to variable and step past blank.| |
+-----------------------------------------------------------+ |
                    +-----------------------------------------+
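
The following sketch follows the paths in Figure 3 for a source string that contains extra blanks and for one that has fewer words than the template has variables. The variable names and the sample strings are hypothetical.

```rexx
/* Sketch: word assignment and the last variable                 */
parse value '  alpha   beta   gamma  ' with w1 w2 rest
say '<' || w1 || '> <' || w2 || '> <' || rest || '>'
                          /* Displays: "<alpha> <beta> <  gamma  >" */

/* Fewer words than variables: the leftover variables receive    */
/*  the null string.                                             */
parse value 'solo' with one two three
say '<' || one || '> <' || two || '> <' || three || '>'
                          /* Displays: "<solo> <> <>"            */
```

Leading blanks are stripped before each word is assigned, and only one blank is stepped past after it. The last variable receives whatever remains, so rest keeps two leading and two trailing blanks.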