Example of a DATA-INTO parser

Note: Detailed explanation is provided only for the aspects of the example that are related to the DATA-INTO operation.

In this example, a parser parses a properties file for the DATA-INTO operation, and an RPG program uses the parser to import data from the properties file into a data structure.

This parser works with UCS-2 data, which means that it can parse documents in any CCSID. The "ccsid" option for the DATA-INTO operation defaults to "ccsid=ucs2", so RPG programmers who use this parser do not have to code the "ccsid" option.

CAUTION:
If the parser expected its data in the job CCSID, then data might be lost if the document contained data that could not be converted to the job CCSID.

The following shows the RPG program that uses the DATA-INTO operation.

Note the following aspects of the program:
  1. The data structures are defined with a subfield for each property expected in the "properties" file.
  2. The "properties" file is specified in the first operand of the %DATA built-in function. Option "doc=file" indicates that the first operand is the name of a file. Option "allowextra=yes" allows the "properties" file to have additional properties.
  3. The program that does the parsing is specified as the first operand of the %PARSER built-in function. See Program to parse a "properties" file for the source for the program.
  4. The second DATA-INTO operation parses properties in a string. The "doc=file" option is not specified.
  5. The string "sep=;" is specified as the second operand of the %PARSER built-in function. The parser will receive this value as a null-terminated string. See Main procedure to see how the parser handles this null-terminated string. See A DATA-INTO parser that uses a data structure as a communication area for an example of a parser which uses a data structure to communicate between the parser and the program with the DATA-INTO operation.

DCL-DS props1 QUALIFIED; //  1 
   company VARCHAR(30);
   language VARCHAR(10);
   version VARCHAR(10);
END-DS;
DCL-DS props2 QUALIFIED; //  1 
   city VARCHAR(30);
   province VARCHAR(10);
END-DS;

DCL-S propString VARCHAR(50) INZ('city=Toronto;province=Ontario;');

DATA-INTO props1 %DATA(propfileName : 'doc=file allowextra=yes') //  2 
                 %PARSER('PARSPROP'); //  3 

DATA-INTO props2 %DATA(propString : 'allowextra=yes') //  4 
                 %PARSER('PARSPROP' : 'sep=;'); //  5 
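
The fragment above does not show how propfileName is defined or what happens when the parse fails. The following is a minimal sketch of one way to complete the first DATA-INTO operation, not part of the original example. The path in propfileName is hypothetical, and the MONITOR group is an assumption about how a caller might choose to handle a failed parse.

```rpgle
**free
// Hypothetical path: the original example does not define propfileName
dcl-s propfileName varchar(100) inz('/home/myuser/app.properties');

dcl-ds props1 qualified;
   company varchar(30);
   language varchar(10);
   version varchar(10);
end-ds;

monitor;
   data-into props1 %data(propfileName : 'doc=file allowextra=yes')
                    %parser('PARSPROP');
   // Each subfield now holds the value of the property with the same name
   dsply ('company=' + props1.company);
on-error;
   // The operation failed, for example because the parser reported
   // an error or sent an escape message
   dsply 'DATA-INTO failed';
endmon;

*inlr = *on;
return;
```

With a properties file that contains lines such as company=..., language=... and version=..., each subfield of props1 should receive the value from the line with the matching name.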

Program to parse a "properties" file

  1. Copy in the file with the definition for the parameter passed to the parser and the prototypes for the callback procedures.
  2. Define named constants for the error codes issued by this parser. Each parser can define its own error codes.
  3. Define other constants and templates related to parsing in UCS-2.
  4. Define a data structure template to hold the information about the parse.

**free
ctl-opt main(parsProp);
ctl-opt option(*srcstmt);

// Parameter definition and callback prototypes  1 
/copy qoar/qrpglesrc,qrndtainto

// Error codes for this parser  2 
dcl-c ERROR_missing_equal1        1;
dcl-c ERROR_blankName2            2;
dcl-c ERROR_blankInName3          3;

// Constants related to working in UCS-2  3 
dcl-c UCS2_CCSID 13488;
dcl-c UTF16_CCSID 1200;
dcl-c CR %ucs2(X'0D');
dcl-c LF %ucs2(X'15');
dcl-c CHAR_SIZE 2; // The size of a UCS-2 character
dcl-c MAX_SEP 20; // The maximum length of the separator, in characters
dcl-s oneChar_t UCS2(1);

dcl-ds parseInfo_t template qualified; //  4 
   lineStartOffset int(10);
   lineLength int(10);
   equalOffset int(10);
   curOffset int(10);
   sep varUcs2(20);
end-ds;

Main procedure

  1. The parser is passed a single parameter. See Parameter passed to a DATA-INTO parser.
  2. This parser supports a null-terminated string as the option for the %PARSER built-in function of the DATA-INTO operation. This parser expects the value of the null-terminated string to begin with "sep=", followed by the value that separates each property in the data. If this option is not specified, this parser assumes that the data came from a stream file, and that each line ends with CR, LF, or both.
  3. This parser expects the data to be UCS-2 or UTF-16. If the RPG programmer specified option "ccsid=job", this parser sends an escape message which will cause the DATA-INTO operation to fail.
  4. Enable access to the callback procedures.
  5. QrnDiStart must be called first.
  6. Call QrnDiStartStruct to indicate that the document is a structure. Reporting a name for the outermost structure is not required.
  7. The parse() procedure will report the names and values within the document.
  8. Call QrnDiEndStruct to indicate that the outer data structure has ended.
  9. QrnDiFinish must be called last.

dcl-proc parsProp;
   dcl-pi *n extpgm;
      parm likeds(QrnDiParm_T) const; //  1 
   end-pi;
   dcl-ds parseInfo likeds(parseInfo_t) inz;
   dcl-s userParm varchar(30);

   if  parm.dataCcsid <> UCS2_CCSID //  3 
   and parm.dataCcsid <> UTF16_CCSID;
     //We can only parse if option "ccsid=ucs2" was specified!
     //Send an escape message in this case, since it's a user error
     signalException ('%DATA must have ccsid=ucs2 for this parser'
                    : %proc());
     // Control will not reach here
   endif;

   if parm.userParmIsNullTermString; //  2 
      userParm = %str(parm.userParm);
      if  %len(userParm) > 4
      and %scan('sep=' : userParm) = 1;
         // The parameter starts with 'sep='
         // The separator is the remaining part of the parameter
         parseInfo.sep = %subst(userParm : 5);
      endif;
   endif;

   pQrnDiEnv = parm.env; //  4 

   QrnDiStart (parm.handle); //  5 

   QrnDiStartStruct (parm.handle); //  6 

   // Parse the document
   parse (parm : parseInfo); //  7 

   // End the outer structure
   QrnDiEndStruct (parm.handle); //  8 

   // End the parse
   QrnDiFinish(parm.handle); //  9 

on-exit;
   // Nothing to do here yet
end-proc;
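
To make the required order of the callbacks easier to see, here is a minimal sketch of a parser that ignores the document and always reports a single property. The name miniParser and the hard-coded name and value are hypothetical. The sketch assumes that the DATA-INTO operation uses the default "ccsid=ucs2", so it reports UCS-2 data, and it assumes that the lengths passed to QrnDiReportName and QrnDiReportValue are in bytes, as in the parser above.

```rpgle
**free
ctl-opt main(miniParser);

/copy qoar/qrpglesrc,qrndtainto

dcl-proc miniParser;
   dcl-pi *n extpgm;
      parm likeds(QrnDiParm_T) const;
   end-pi;
   // Hard-coded property, for illustration only
   dcl-s name ucs2(4) inz(%ucs2('city'));
   dcl-s value ucs2(7) inz(%ucs2('Toronto'));

   pQrnDiEnv = parm.env;             // Enable access to the callbacks

   QrnDiStart (parm.handle);         // Must be called first
   QrnDiStartStruct (parm.handle);   // The document is a structure

   // Report one name and its value; %SIZE returns the length in bytes
   QrnDiReportName (parm.handle : %addr(name) : %size(name));
   QrnDiReportValue (parm.handle : %addr(value) : %size(value));

   QrnDiEndStruct (parm.handle);     // End the outer structure
   QrnDiFinish (parm.handle);        // Must be called last
end-proc;
```

A DATA-INTO operation that names this program in %PARSER and targets a data structure with a city subfield should set that subfield to Toronto, whatever the document contains.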

parse() procedure

This procedure loops through the document, reporting one property for each line it finds in the document.


dcl-proc parse;
   dcl-pi *n extproc(*dclcase);
      parserParm likeds(QrnDiParm_T) const;
      parseInfo likeds(parseInfo_t);
   end-pi;

   dow findNextLine (parserParm : parseInfo) = *on;
      reportProperty (parserParm : parseInfo);
   enddo;
   return;
end-proc;

findNextLine() procedure

This procedure loops through the data until it finds the end of a line.
  1. If the options indicated a separator string, the parser checks whether it has found the separator. If so, the line is complete, and the procedure returns.
  2. If the options did not indicate a separator string, the parser checks whether it has found an end-of-line character, either CR (carriage-return) or LF (line-feed). If the procedure has already found data for the line, it returns. Otherwise, it begins a new line without returning the blank line. This allows the lines in the document to end with both CR and LF.
  3. If the document is not valid according to the rules of this parser, the parser calls the halt() procedure to indicate that the document is invalid.
    Note: Control does not return to the parser after a call to the halt() procedure, because the halt() procedure calls the QrnDiReportError procedure, which causes the parse to end immediately.

dcl-proc findNextLine;
   dcl-pi *n ind extproc(*dclcase);
      parserParm likeds(QrnDiParm_T) const;
      parseInfo likeds(parseInfo_t);
   end-pi;
   dcl-s viewCur like(oneChar_T) based(pData);
   dcl-s viewNext like(oneChar_T) based(pDataNext);
   dcl-s viewSep ucs2(MAX_SEP) based(pData); // must use %SUBST
   dcl-s sep varUcs2(MAX_SEP);
   dcl-s sepSize int(10);

   parseInfo.lineStartOffset = parseInfo.curOffset;
   parseInfo.lineLength = 0;
   parseInfo.equalOffset = 0;
   sep = parseInfo.sep;
   sepSize = %len(sep) * CHAR_SIZE;

   pData = parserParm.data + parseInfo.curOffset;
   dow parseInfo.curOffset < parserParm.dataLen;
      if %len(sep) > 0; // The separator is a string  1 
         if parseInfo.curOffset + sepSize <= parserParm.dataLen
         and %subst(viewSep : 1 : %len(sep)) = sep;
            parseInfo.curOffset += sepSize;
            return *on; // End of line
         endif;
      endif;

      parseInfo.curOffset += CHAR_SIZE;
      if %len(sep) = 0 and (viewCur = CR or viewCur = LF); //  2 
         if parseInfo.lineLength > 0;
            return *on; // The line was not empty
         else; // The previous line was empty, so start again
            parseInfo.lineStartOffset = parseInfo.curOffset;
            parseInfo.lineLength = 0;
            parseInfo.equalOffset = 0;
         endif;
      else;
         parseInfo.lineLength += CHAR_SIZE;
         if viewCur = '=';
            parseInfo.equalOffset = parseInfo.curOffset - CHAR_SIZE;
         elseif viewCur = ' ';
            if parseInfo.equalOffset = 0; //  3 
               if parseInfo.lineLength = CHAR_SIZE; // The blank is the first character
                  halt (parserParm : parseInfo : ERROR_blankName2);
               else; // Blanks are not allowed before the equal sign
                  halt (parserParm : parseInfo : ERROR_blankInName3);
               endif;
            endif;
         endif;
      endif;
      pData += CHAR_SIZE; // Next character (curOffset already updated)
   enddo;
   return parseInfo.lineLength > 0; // *ON if the line is not empty
end-proc;

reportProperty() procedure

This procedure reports the property found on a line in the document.
  1. If the line is not valid, the parser reports the error using the QrnDiReportError callback.
    Note: Control does not return to the parser after a call to QrnDiReportError.
  2. The callback QrnDiReportName is used to report the name of the property. The DATA-INTO operation will use this name to locate a subfield in the target data structure.
  3. The callback QrnDiReportValue is used to report the value of the property. The DATA-INTO operation will assign this value to the subfield that was located by the call to QrnDiReportName.

dcl-proc reportProperty;
   dcl-pi *n extproc(*dclcase);
      parserParm likeds(QrnDiParm_T) const;
      parseInfo likeds(parseInfo_t);
   end-pi;
   dcl-s len int(10);

   if parseInfo.equalOffset = 0;
      halt (parserParm : parseInfo //  1 
          : ERROR_missing_equal1);
   elseif parseInfo.equalOffset = parseInfo.lineStartOffset;
      halt (parserParm : parseInfo //  1 
          : ERROR_blankName2);
   endif;

   // Report the name
   len = parseInfo.equalOffset - parseInfo.lineStartOffset;
   QrnDiReportName (parserParm.handle //  2 
                  : parserParm.data + parseInfo.lineStartOffset
                  : len);

   // Report the value
   len = parseInfo.lineLength - (len + CHAR_SIZE);
   QrnDiReportValue (parserParm.handle //  3 
                  : parserParm.data + parseInfo.equalOffset + CHAR_SIZE
                  : len);
   return;
end-proc;
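
The offset arithmetic can be checked by hand for the first property in the string 'city=Toronto;province=Ontario;' with the separator ';'. The following standalone sketch is not part of the parser. It uses %SCAN to locate the equal sign and the separator, then applies the same formulas as findNextLine() and reportProperty(). The variable names mirror the subfields of parseInfo_t, but the program itself is illustrative only.

```rpgle
**free
dcl-c CHAR_SIZE 2; // The size of a UCS-2 character
dcl-s doc ucs2(30) inz(%ucs2('city=Toronto;province=Ontario;'));
dcl-s lineStartOffset int(10) inz(0); // The first line starts at offset 0
dcl-s equalOffset int(10);
dcl-s lineLength int(10);
dcl-s nameLen int(10);
dcl-s valueLen int(10);

// %SCAN returns 1-based character positions;
// the parser works with 0-based byte offsets
equalOffset = (%scan(%ucs2('=') : doc) - 1) * CHAR_SIZE;
// equalOffset = 8: the equal sign is the 5th character

lineLength = (%scan(%ucs2(';') : doc) - 1) * CHAR_SIZE
           - lineStartOffset;
// lineLength = 24: 12 characters precede the separator

nameLen = equalOffset - lineStartOffset;
// nameLen = 8: the 4 characters of 'city'

valueLen = lineLength - (nameLen + CHAR_SIZE);
// valueLen = 14: the 7 characters of 'Toronto'

*inlr = *on;
return;
```

The value starts at byte offset equalOffset + CHAR_SIZE = 10, which is the character immediately after the equal sign.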

halt() procedure

This procedure reports a parsing error.
  1. The parser reports the error using the QrnDiReportError callback.
  2. Control does not return to the parser after a call to QrnDiReportError.

dcl-proc halt;
   dcl-pi *n extproc(*dclcase);
      parserParm likeds(QrnDiParm_T) const;
      parseInfo likeds(parseInfo_T) const;
      errorCode int(10) value;
   end-pi;

   QrnDiReportError (parserParm.handle //  1 
                   : errorCode
                   : parseInfo.curOffset - 1);
   // Control will not reach here after the call to QrnDiReportError  2 

end-proc;

signalException() procedure

This procedure sends an escape message.
  1. The message is sent to the main procedure.

dcl-proc signalException;
   dcl-pi *n;
      msg varchar(200) const;
      mainProcName varchar(200) const;
   end-pi;
   dcl-pr QMHSNDPM extpgm;
      msgId char(7) const;
      msgFile likeds(qualMsgf);
      msgData char(500) const;
      dataLen int(10) const;
      msgType char(10) const;
      stackEntry char(10) const;
      stackOffset int(10) const;
      msgKey char(4) const;
      errorCode likeds(errcode);
   end-pr;
   dcl-ds qualMsgf qualified;
      msgf char(10) inz('QCPFMSG');
      lib char(10) inz('*LIBL');
   end-ds;
   dcl-ds errCode qualified;
      bytesProvided int(10) inz(0); // issue exception if bad parms
      bytesAvailable int(10) inz(0);
   end-ds;
   dcl-s msgkey char(4);

   QMHSNDPM ('CPF9898' : qualMsgf : msg : %len(msg) : '*ESCAPE'
           : mainProcName : 0  // Send to our main procedure  1 
           : msgkey : errCode);
   // Control will not return here after sending the escape message
end-proc;