tlwtheq
tlwtheq
50 Posts

Pinned topic List of characters that need escape, i.e., \

‏2013-03-19T11:41:54Z |
Anyone know what other characters besides the dot (.) need to be preceded
by the escape (\) character?
Thanks.
Updated on 2013-03-21T03:34:49Z at 2013-03-21T03:34:49Z by tlwtheq
  • SystemAdmin
    SystemAdmin
    3180 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T14:37:49Z  
    :) what a coincidence :)

    Your question has been answered 9 minutes before you posted it.
    Check http://www.ibm.com/developerworks/forums/thread.jspa?messageID=14958436

    @Mathias: seems your support is faster than light
  • tlwtheq
    tlwtheq
    50 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T14:44:34Z  
    So ^ ? , $ don't need \ to be taken literally?
  • tlwtheq
    tlwtheq
    50 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T15:22:02Z  
    It's kind of annoying the documentation doesn't give the exact list.
  • llandale
    llandale
    3035 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T18:16:56Z  
    Yup, MM is ahead of our time. 7 hours ahead I think.
  • tlwtheq
    tlwtheq
    50 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T18:21:43Z  
    NYUK NYUK
  • Mathias Mamsch
    Mathias Mamsch
    2183 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T21:44:09Z  
    Well, ^ is something of an exception: you only need to escape it when it is at the beginning of the regular expression, where it has a special meaning (start of string). To be safe, though, I would always escape it. With the code below you can easily test the escaping rules yourself.

    string s = "^x"

    Regexp re1 = regexp2 "^x"    // does not match: ^ anchors the start, so "x" is expected first
    Regexp re2 = regexp2 "^^x"   // does match: the second ^ is taken literally
    Regexp re3 = regexp2 "\\^x"  // does match: escaped ^ (the backslash is doubled in a DXL string)

    print (re1 s)
    print (re2 s)
    print (re3 s)


    Regards, Mathias

     

     


    Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS

     

    Updated on 2014-01-06T10:31:42Z at 2014-01-06T10:31:42Z by iron-man
  • Mathias Mamsch
    Mathias Mamsch
    2183 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-19T21:48:49Z  

    And ? of course needs to be escaped too ;-) Well, it's funny: the following code

    int i
    for (i = 32; i < 127; i++) {
       char c = charOf i
       string s = "x" c ""

       noError()
       Regexp re1 = regexp2 "^x" c "$"   // c is inserted unescaped
       string sErr = lastError()

       // print every character that raised an error or did not match itself
       if (!null sErr || !(re1 s)) print c " "
    }

     


    results in: ( ) * + ? [ \
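
    A possible follow-up check, sketched under the assumption that a single backslash in front of each reported character is enough to make it literal (this was not run in the thread). The loop reuses the same calls as the scan above; the string `specials` is a hypothetical name that simply holds the characters the scan printed.

    ```dxl
    // Assumption: each character reported by the scan becomes literal
    // once it is preceded by a backslash ("\\" in a DXL string literal).
    string specials = "()*+?[\\"   // hypothetical list: the characters printed above
    int i
    for (i = 0; i < length(specials); i++) {
       char c = specials[i]
       string s = "x" c ""

       noError()
       Regexp re = regexp2 "^x\\" c "$"   // same pattern as before, but c is escaped
       string sErr = lastError()

       if (null sErr && (re s)) {
          print c " is matched literally when escaped\n"
       } else {
          print c " still fails\n"
       }
    }
    ```

    Any character that lands in the "still fails" branch would need a closer look, so the output is worth checking rather than assuming.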

    Regards, Mathias

    Updated on 2014-01-06T10:31:58Z at 2014-01-06T10:31:58Z by iron-man
  • llandale
    llandale
    3035 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-20T15:24:13Z  

    Looks like you found the characters that must be escaped, thanks. I must have missed something; what is "funny" about that?
  • Mathias Mamsch
    Mathias Mamsch
    2183 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-20T21:17:13Z  
    Well, what counts as funny always depends on the reader's sense of humour, but I found it funny that, for example, ( and ) must both be escaped, while for square brackets only the opening bracket must be escaped.

    I was surprised that this question is so hard to answer. Even the list above is not correct: "." matches "." and therefore does not show up on the list, and there are special rules for escaping ] and ^ and such. The answer seems so obvious, but when I think about it, it is really easy to make a mistake here, the most obvious one being typing "." to match a dot and not realizing that it will match everything else too. That is what I think is funny!

    Regards, Mathias
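
    The dot pitfall described above can be sketched as follows. This is a minimal illustration, assuming that "\\." in DXL source reaches the regexp engine as \. (a literal dot); the variable names are made up for the example.

    ```dxl
    // Assumption: "\\." in DXL source is seen by the regexp engine as \. (a literal dot).
    string withDot    = "a.b"
    string withoutDot = "axb"

    Regexp reAny = regexp2 "^a.b$"     // unescaped dot: any single character
    Regexp reDot = regexp2 "^a\\.b$"   // escaped dot: only a literal "."

    if (reAny withoutDot) {
       print "unescaped . also matches axb\n"
    }
    if (!(reDot withoutDot)) {
       print "escaped . rejects axb\n"
    }
    if (reDot withDot) {
       print "escaped . still matches a.b\n"
    }
    ```

    If the engine behaves as described in this thread, all three messages should appear. A scan like the one earlier cannot catch this case, because the unescaped dot matches itself as well as everything else.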
  • tlwtheq
    tlwtheq
    50 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-03-21T03:34:49Z  
    This is one of those "Aw Geez" kind of moments. To begin with, no one knew what my
    "NYUK NYUK" was about, and then no one was clear about what Louie's response to
    "funny" meant.

    My NYUK NYUK referred to Louie mentioning the time difference business.
    After that, I cannot offer a logical account about people's reactions to statements.

    I think all you folks are sensational, and thank you for offering
    your possible solutions.

    Many heads seem to be better than one on numerous accounts.

    TW
  • Fu Lin Yiu
    Fu Lin Yiu
    15 Posts

    Re: List of characters that need escape, i.e., \

    ‏2013-11-26T20:50:21Z  
    I was hoping I could (1) go to a regular expression test app, (2) dork around until I get it working, and (3) paste the pre-tested regexp into DXL and run with it.
    I got close to this goal, but got bad token for sequences like \* (backslash preceding a non-DXL char constant).  In cases where no escaping of non-backslash regexp special characters is required, I can get by with (for example):
    // Test app is http://www.myregextester.com/index.php#matchtab
    // {\\pntxtb.*?} copied from test app with no transposing
     
    Regexp re = regexp2( (reEsc("{\\pntxtb.*?}")) );
    For something more complicated (like a literal *), I had to resort to constructing with three segments like:
      // find: {\\*\\pn\\pnlvlblt ... } in rich text  
    Regexp re;   re = regexp2( (reEsc("{\\")) (reEsc('*')) (reEsc("\\pn\\pnlvlblt.*}")));
    Explanations are included in the code comments.  If anyone can do any better for direct cut and paste from the GUI test tool, please post code.

    //------------------------------------------------------------------------------
    // Use this for RegExp escaping of any character except backslash ( \ )
    //     NOTE: Use of this function is not needed unless you wish to use
    //           a regular expression "special char"
    //------------------------------------------------------------------------------
    string reEsc(char ch) {
        return "\\" ch "";
    }
    //------------------------------------------------------------------------------
    // Use this to RegExp escape strings.
    //     NOTE A - It is standard DXL practice to represent
    //              a backslash ( \ ) as "\\".  This policy applies here as well.
    //     NOTE B - Use the separate function
    //                  string reEsc(char);
    //              to escape non-backslash single characters
    //     NOTE C - DXL "gotcha" warning.  When appending a bunch of functions
    //              that have string arguments and return string together, surround
    //              function calls with () to thwart argument bleed-through.
    //------------------------------------------------------------------------------
    string reEsc(string str) {
        string retVal = "";
        int i;
        for (i = 0; i < length(str); i++) {
            if (str[i] == '\\') {
                retVal = retVal "\\\\";
            } else {
                retVal = retVal str[i] "";
            }
        }
        return retVal;
    }
    //------------------------------------------------------------------------------
    // EXAMPLE: using http://www.myregextester.com/index.php
    //------------------------------------------------------------------------------
    // NOTE: This example parses rich text which is ugly to start with!
    //
    // LET SOURCE TEXT (as pasted into My RegEx Tester page):
    //     {\pntext\'B7\tab}{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}\li1440 Hi
    //
    // LET MATCH PATTERN (as pasted into My RegEx Tester page):
    //     {\\\*\\pn\\pnlvlblt.*}
    //
    // My RegEx Tester page results:
    //     {\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}
    //
    // OBJECTIVES: Upon getting a RegEx expression working correctly (not trivial),
    //             transpose it into DXL with a minimum chance of screwing it up
    //             in the process (one extra/missing backslash and it is late for
    //             supper).
    //
    // TRANSPOSING NOTES: Here are the markers and what to do with them.
    //       {\\\*\\pn\\pnlvlblt.*}
    //       ^ ^^^^               ^
    // STEP: AAABBCCCCCCCCCCCCCCCCC
    //                  AAA: (reEsc("{\\"))
    //                   BB: (reEsc('*'))
    //    CCCCCCCCCCCCCCCCC: (reEsc("\\pn\\pnlvlblt.*}"))
    //------------------------------------------------------------------------------
    void regExpTest() {
        string sourceToParse = //-
    "{\\pntext\\'B7\\tab}{\\*\\pn\\pnlvlblt\\pnindent0{\\pntxtb\\'B7}}\\li1440 Hi";
        // find: {\\*\\pn\\pnlvlblt ... } in rich text
        Regexp re;
        re = regexp2( (reEsc("{\\")) (reEsc('*')) (reEsc("\\pn\\pnlvlblt.*}")));
        if (re sourceToParse) {
            print("Found pnlvlblt '"     //-
                  sourceToParse[match 0] //-
                  "':\n"                 //-
                  sourceToParse          //-
                  "\n"                    );
            print("Matched substring starts at " (start(0)) //-
                  " and ends at " (end(0)) "\n"             );
        } else {
            print("No pnlvlblt in " sourceToParse "\n");
        }
    }
    // Launch example:
    regExpTest();
    //------------------------------------------------------------------------------
    // Example's output:
    //------------------------------------------------------------------------------
    // Found pnlvlblt '{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}':
    // {\pntext\'B7\tab}{\*\pn\pnlvlblt\pnindent0{\pntxtb\'B7}}\li1440 Hi
    // Matched substring starts at 17 and ends at 55
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // DXL Manual Says ...
    //     The following symbols can be used in Regexp expressions:
    //------------------------------------------------------------------------------
    //   Chars: *
    // Meaning: zero or more occurrences
    // Example: a*
    // Matches: any number of a characters, or none
    //------------------------------------------------------------------------------
    //   Chars: +
    // Meaning: one or more occurrences
    // Example: x+
    // Matches: one or more x characters
    //------------------------------------------------------------------------------
    //   Chars: .
    // Meaning: any single character except new line
    // Example: .*
    // Matches: any number of any characters (any string)
    //------------------------------------------------------------------------------
    //   Chars: \
    // Meaning: escape (literal text char)
    // Example: \.
    // Matches: literally a . (dot) character
    //------------------------------------------------------------------------------
    //   Chars: ^
    // Meaning: start of the string (if at start of Regexp)
    // Example: ^The.*
    // Matches: any string starting with The or starting with The after any
    //          new line (see also [ ] below)
    //------------------------------------------------------------------------------
    //   Chars: $
    // Meaning: end of the string (if at end of Regexp)
    // Example: end\\.$
    // Matches: any string ending with end.
    //------------------------------------------------------------------------------
    //   Chars: ( )
    // Meaning: Groupings
    // Example: (ref) + (bind) *
    // Matches: at least one ref string then any number of bind strings
    //------------------------------------------------------------------------------
    //   Chars: [ ]
    // Meaning: character range (letters or digits)
    // Example: [sS]hall.*\\.$
    // Matches: any string containing shall or Shall and ending in a literal
    //          dot (any requirement sentence)
    //
    // Example: [^abc]
    // Matches: any character except a, b, or c
    //
    // Example: [a-zA-Z]
    // Matches: any alphabetic character
    //
    // Example: [0-9]
    // Matches: any digit
    //------------------------------------------------------------------------------
    //   Chars: |
    // Meaning: Alternative
    // Example: (dat|doc)
    // Matches: either the string dat or the string doc
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // Mathias Mamsch says:
    //     "Well its funny , the following code results in: ( ) * + ? [ \ "
    //
    // I find that "?" seems to work in regular expressions but the DXL manual
    // does not mention it.
    //
    // What else works? ... Left as an exercise for the reader ... Welcome to DOORS!
    //------------------------------------------------------------------------------
    //------------------------------------------------------------------------------
    // REFERENCES:
    //     http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/
    //------------------------------------------------------------------------------
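
    When the goal is only to match a piece of text literally, one possible shortcut is to escape every special character in a single pass. The sketch below is not from the manual: `isReSpecial` and `reEscLiteral` are hypothetical names, and the character list is an assumption that combines the characters found by Mathias's scan with . ^ $ and | from the manual table above. The closing bracket ] is left out because the thread reports that only the opening bracket needs escaping, and it is also assumed (not verified) that escaping ^ and $ in every position is harmless.

    ```dxl
    // Hypothetical helper: true if ch is in the assumed list of special characters.
    bool isReSpecial(char ch) {
        string specials = "\\.^$*+?()[|";
        int i;
        for (i = 0; i < length(specials); i++) {
            if (specials[i] == ch) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: prefix every special character in str with a backslash
    // so that the whole string is matched literally.
    string reEscLiteral(string str) {
        string retVal = "";
        int i;
        for (i = 0; i < length(str); i++) {
            if (isReSpecial(str[i])) {
                retVal = retVal "\\";
            }
            retVal = retVal str[i] "";
        }
        return retVal;
    }

    // Usage sketch: look for the literal text {\*\pn in rich text.
    Regexp reLit;
    reLit = regexp2( (reEscLiteral("{\\*\\pn")) );
    if (reLit "{\\pntext\\'B7\\tab}{\\*\\pn\\pnlvlblt}") {
        print("Found literal {\\*\\pn\n");
    }
    ```

    Unlike reEsc(string), this helper cannot be used for patterns that contain real operators such as .* because it would escape those too; it is meant only for the literal parts of a pattern.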

     

    Regards ...

     

    Updated on 2013-11-27T21:23:15Z at 2013-11-27T21:23:15Z by Fu Lin Yiu