11 replies. Latest post: 2013-11-26T20:50:21Z by Fu Lin Yiu
tlwtheq

List of characters that need escape, i.e., \

2013-03-19T11:41:54Z
Does anyone know which characters besides the dot (.) need to be preceded
by the escape character (\)?
Thanks.
Updated on 2013-03-21T03:34:49Z by tlwtheq
  • SystemAdmin
    2013-03-19T14:37:49Z in response to tlwtheq
    :) what a coincidence :)

    Your question was answered 9 minutes before you posted it.
    Check http://www.ibm.com/developerworks/forums/thread.jspa?messageID=14958436

    @Mathias: it seems your support is faster than light.
    • tlwtheq
      2013-03-19T14:44:34Z in response to SystemAdmin
      So ^ ? , $ don't need \ to be taken literally?
      • tlwtheq
        2013-03-19T15:22:02Z in response to tlwtheq
        It's kind of annoying the documentation doesn't give the exact list.
    • llandale
      2013-03-19T18:16:56Z in response to SystemAdmin
      Yup, MM is ahead of our time. 7 hours ahead I think.
      • tlwtheq
        2013-03-19T18:21:43Z in response to llandale
        NYUK NYUK
        • Mathias Mamsch
          2013-03-19T21:44:09Z in response to tlwtheq

          Well, ^ is something of an exception: you only need to escape it at the beginning of the regular expression, where it has a special meaning (start of string). To be safe, however, I would always escape it. With the code below you can easily test the escaping rules yourself.

          string s = "^x"

          Regexp re1 = regexp2 "^x"    // does not match: ^ anchors the start, so x is expected first, but s starts with ^
          Regexp re2 = regexp2 "^^x"   // does match: only the first ^ is an anchor, the second is literal
          Regexp re3 = regexp2 "\\^x"  // does match: escaped ^ is literal (backslash doubled inside the DXL string)

          print (re1 s)
          print (re2 s)
          print (re3 s)

          Regards, Mathias

          Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS

          Updated on 2014-01-06T10:31:42Z by iron-man
          • Mathias Mamsch
            2013-03-19T21:48:49Z in response to Mathias Mamsch

            And ? of course needs to be escaped too ;-) Well, it's funny: the following code

            int i
            for (i = 32; i < 127; i++) {
               char c = charOf i
               string s = "x" c ""

               noError()
               Regexp re1 = regexp2 "^x" c "$"  // pattern: x followed by the unescaped character c
               string sErr = lastError()

               // print c if the pattern did not compile or did not match the string "x" c
               if (!null sErr || !(re1 s)) print c " "
            }

            results in: ( ) * + ? [ \

            Regards, Mathias

            Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS

            Updated on 2014-01-06T10:31:58Z by iron-man
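
            Building on the list above, one way to make arbitrary text safe for use inside a pattern is to prefix each special character with a backslash. The sketch below is not from this thread: reEscapeLiteral is a hypothetical helper name, and the character set (the list above plus . | ^ $) is an assumption that has not been checked against every DOORS version.

            ```dxl
            // Hypothetical helper (not part of DXL): backslash-escape every
            // character assumed to be special so the text is matched literally.
            string reEscapeLiteral(string s) {
                string special = "\\.*+?()[|^$"   // assumed set, see note above
                string result = ""
                int i, j
                for (i = 0; i < length(s); i++) {
                    char c = s[i]
                    bool isSpecial = false
                    for (j = 0; j < length(special); j++) {
                        if (special[j] == c) {
                            isSpecial = true
                            break
                        }
                    }
                    if (isSpecial) {
                        result = result "\\" c ""   // insert one backslash before c
                    } else {
                        result = result c ""
                    }
                }
                return result
            }

            // Usage: anchor the escaped text so the whole string must match.
            string pattern = "^" (reEscapeLiteral("a.b(c)")) "$"
            Regexp reLit = regexp2 pattern
            if (reLit "a.b(c)")    print "literal text matches\n"
            if (!(reLit "axb(c)")) print "the dot is no longer a wildcard\n"
            ```

            Escaping a character that does not strictly need it (such as $ in the middle of a pattern) is assumed to be harmless here; if a given DOORS release disagrees, trim the special string accordingly.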
            • llandale
              2013-03-20T15:24:13Z in response to Mathias Mamsch
              Looks like you found the characters that must be escaped, thanks. I must have missed something; what is "funny" about that?
              • Mathias Mamsch
                2013-03-20T21:17:13Z in response to llandale
                Well, what counts as funny always depends on the reader's sense of humour, but I found it funny that, for example, both parentheses must be escaped, while for square brackets only the opening bracket needs it. I was also surprised at how hard this question is to answer. Even the list above is incomplete: an unescaped "." also matches a literal ".", so it passes the test and never shows up on the list, and there are special rules for escaping ] and ^ and such. The answer seems obvious, but when I think about it, it is really easy to get wrong. The most common mistake is typing "." to match a dot without realizing that it will match any other character too. That is what I think is funny!

                Regards, Mathias


                Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS
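
                The dot pitfall described above can be checked with a few lines. This is a minimal sketch, assuming regexp2 behaves as in the earlier examples in this thread; printMatch is a hypothetical helper name introduced only for readable output.

                ```dxl
                // Hypothetical helper: print a label and whether the regexp matched.
                void printMatch(string label, bool matched) {
                    if (matched) {
                        print label ": match\n"
                    } else {
                        print label ": no match\n"
                    }
                }

                Regexp anyChar = regexp2 "^a.c$"     // unescaped dot: any single character
                Regexp dotOnly = regexp2 "^a\\.c$"   // escaped dot: a literal "." only

                bool m1 = (anyChar "a.c")
                bool m2 = (anyChar "abc")
                bool m3 = (dotOnly "a.c")
                bool m4 = (dotOnly "abc")

                printMatch("unescaped dot against a.c", m1)
                printMatch("unescaped dot against abc", m2)
                printMatch("escaped dot against a.c", m3)
                printMatch("escaped dot against abc", m4)
                ```

                If the manual's description of . and \. holds, only the last line should report no match, which is exactly why the unescaped dot never shows up in a "does it fail?" test.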
                • tlwtheq
                  2013-03-21T03:34:49Z in response to Mathias Mamsch
                  This is one of those "Aw Geez" moments. To begin with, no one knew what my
                  "NYUK NYUK" was about, and then no one was clear about what Louie's response to
                  "funny" meant.

                  My NYUK NYUK referred to Louie mentioning the time difference business.
                  After that, I cannot offer a logical account about people's reactions to statements.

                  I think all you folks are sensational, and thank you for offering
                  your possible solutions.

                  Many heads seem to be better than one on numerous accounts.

                  TW
  • Fu Lin Yiu
    2013-11-26T20:50:21Z in response to tlwtheq
    I was hoping I could (1) go to a regular expression test app, (2) dork around until I get it working, and (3) paste the pre-tested regexp into DXL and run with it.
    I got close to this goal, but DXL reports a bad token for sequences like \* (a backslash followed by a character that is not a valid DXL escape).  When the pattern needs no escaped regexp special characters other than the backslash itself, I can get by with, for example:
    // Test app is http://www.myregextester.com/index.php#matchtab
    // {\\pntxtb.*?} copied from test app with no transposing

    Regexp re = regexp2( (reEsc("{\\pntxtb.*?}")) );
    For something more complicated (like a literal *), I had to build the pattern from three segments:
    // find: {\\*\\pn\\pnlvlblt ... } in rich text
    Regexp re;   re = regexp2( (reEsc("{\\")) (reEsc('*')) (reEsc("\\pn\\pnlvlblt.*}")));
    Explanations are included in the code comments.  If anyone can do better for direct cut and paste from the GUI test tool, please post code.

    //------------------------------------------------------------------------------
    // Use this for RegExp escaping of any single character except backslash ( \ )
    //     NOTE: This function is only needed when you want a regular expression
    //           "special char" to be matched literally
    //------------------------------------------------------------------------------
    string reEsc(char ch) {
        return "\\" ch "";
    }
    //------------------------------------------------------------------------------
    // Use this to RegExp escape strings.
    //     NOTE A - It is standard DXL practice to represent
    //              a backslash ( \ ) as "\\".  This policy applies here as well.
    //     NOTE B - Use the separate function
    //                  string reEsc(char);
    //              to escape non-backslash single characters
    //     NOTE C - DXL "gotcha" warning.  When appending a bunch of functions
    //              that have string arguments and return string together, surround
    //              function calls with () to thwart argument bleed-through.
    //------------------------------------------------------------------------------
    string reEsc(string str) {
        string retVal = "";
        int i;
        for (i = 0; i < length(str); i++) {
            if (str[i] == '\\') {
                retVal = retVal "\\\\";
            } else {
                retVal = retVal str[i] "";
            }
        }
        return retVal;
    }
    //------------------------------------------------------------------------------
    // EXAMPLE: using http://www.myregextester.com/index.php
    //------------------------------------------------------------------------------
    // NOTE: This example parses rich text which is ugly to start with!
    //
    // LET SOURCE TEXT (as pasted into My RegEx Tester page):
    //     {\pntext\'B7\tab}{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}\li1440 Hi
    //
    // LET MATCH PATTERN (as pasted into My RegEx Tester page):
    //     {\\\*\\pn\\pnlvlblt.*}
    //
    // My RegEx Tester page results:
    //     {\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}
    //
    // OBJECTIVES: Upon getting a RegEx expression working correctly (not trivial),
    //             transpose it into DXL with a minimum chance of screwing it up
    //             in the process (one extra/missing backslash and it is late for
    //             supper).
    //
    // TRANSPOSING NOTES: Here are the markers and what to do with them.
    //       {\\\*\\pn\\pnlvlblt.*}
    //       ^ ^^^^               ^
    // STEP: AAABBCCCCCCCCCCCCCCCCC
    //                  AAA: (reEsc("{\\"))
    //                   BB: (reEsc('*'))
    //    CCCCCCCCCCCCCCCCC: (reEsc("\\pn\\pnlvlblt.*}"))
    //------------------------------------------------------------------------------
    void regExpTest() {
        string sourceToParse = //-
    "{\\pntext\\'B7\\tab}{\\*\\pn\\pnlvlblt\\pnindent0{\\pntxtb\\'B7}}\\li1440 Hi";
        // find: {\\*\\pn\\pnlvlblt ... } in rich text
        Regexp re;
        re = regexp2( (reEsc("{\\")) (reEsc('*')) (reEsc("\\pn\\pnlvlblt.*}")));
        if (re sourceToParse) {
            print("Found pnlvlblt '"     //-
                  sourceToParse[match 0] //-
                  "':\n"                 //-
                  sourceToParse          //-
                  "\n"                    );
            print("Matched substring starts at " (start(0)) //-
                  " and ends at " (end(0)) "\n"             );
        } else {
            print("No pnlvlblt in " sourceToParse "\n");
        }
    }
    // Launch example:
    regExpTest();
    //------------------------------------------------------------------------------
    // Example's output:
    //------------------------------------------------------------------------------
    // Found pnlvlblt '{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}':
    // {\pntext\'B7\tab}{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}\li1440 Hi
    // Matched substring starts at 17 and ends at 55
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // DXL Manual Says ...
    //     The following symbols can be used in Regexp expressions:
    //------------------------------------------------------------------------------
    //   Chars: *
    // Meaning: zero or more occurrences
    // Example: a*
    // Matches: any number of a characters, or none
    //------------------------------------------------------------------------------
    //   Chars: +
    // Meaning: one or more occurrences
    // Example: x+
    // Matches: one or more x characters
    //------------------------------------------------------------------------------
    //   Chars: .
    // Meaning: any single character except new line
    // Example: .*
    // Matches: any number of any characters (any string)
    //------------------------------------------------------------------------------
    //   Chars: \
    // Meaning: escape (literal text char)
    // Example: \.
    // Matches: literally a . (dot) character
    //------------------------------------------------------------------------------
    //   Chars: ^
    // Meaning: start of the string (if at start of Regexp)
    // Example: ^The.*
    // Matches: any string starting with The or starting with The after any
    //          new line (see also [ ] below)
    //------------------------------------------------------------------------------
    //   Chars: $
    // Meaning: end of the string (if at end of Regexp)
    // Example: end\\.$
    // Matches: any string ending with end.
    //------------------------------------------------------------------------------
    //   Chars: ( )
    // Meaning: Groupings
    // Example: (ref) + (bind) *
    // Matches: at least one ref string then any number of bind strings
    //------------------------------------------------------------------------------
    //   Chars: [ ]
    // Meaning: character range (letters or digits)
    // Example: [sS]hall.*\\.$
    // Matches: any string containing shall or Shall and ending in a literal
    //          dot (any requirement sentence)
    //
    // Example: [^abc]
    // Matches: any character except a, b, or c
    //
    // Example: [a-zA-Z]
    // Matches: any alphabetic character
    //
    // Example: [0-9]
    // Matches: any digit
    //------------------------------------------------------------------------------
    //   Chars: |
    // Meaning: Alternative
    // Example: (dat|doc)
    // Matches: either the string dat or the string doc
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // Mathias Mamsch says:
    //     "Well its funny , the following code results in: ( ) * + ? [ \ "
    //
    // I find that "?" seems to work in regular expressions but the DXL manual
    // does not mention it.
    //
    // What else works? ... Left as an exercise for the reader ... Welcome to DOORS!
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // REFERENCES:
    //     http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/
    //------------------------------------------------------------------------------

     

    Regards ...

    Updated on 2013-11-27T21:23:15Z by Fu Lin Yiu
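
    One possible way to keep the pattern in a single pasted segment, offered as an untested assumption rather than confirmed DOORS behaviour: write the literal * as the character class [*]. That form contains no backslash-plus-symbol sequence for the DXL parser to reject. In the test tool the pattern would then read {\\[*]\\pn\\pnlvlblt.*}. To stay self-contained, the sketch below doubles each backslash by hand instead of calling reEsc; the variable name reStar is made up for illustration.

    ```dxl
    string sourceToParse = "{\\pntext\\'B7\\tab}{\\*\\pn\\pnlvlblt\\pnindent0{\\pntxtb\\'B7}}\\li1440 Hi"

    // Regexp text as it would be tested in the GUI tool: {\\[*]\\pn\\pnlvlblt.*}
    // In DXL source every backslash of that text is written twice.
    Regexp reStar = regexp2 "{\\\\[*]\\\\pn\\\\pnlvlblt.*}"

    if (reStar sourceToParse) {
        print "Found: " sourceToParse[match 0] "\n"
    } else {
        print "No match\n"
    }
    ```

    With the reEsc(string) function from the post above, the same idea would presumably reduce to a single call on the pasted text, reEsc("{\\[*]\\pn\\pnlvlblt.*}"), since that literal contains only doubled backslashes.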