6 replies Latest Post - ‏2014-04-24T13:53:01Z by llandale
Estebell

Regexp with new lines

2014-04-22T12:38:44Z

Hi,

I want to parse an object's text and put each match in a new attribute.

But when the text contains new lines, my function doesn't work. Any help?

 

/*My current object is literally :
"Version 2 :
Add some links
with "Satisfy" link module" 

I want to have :
"Add some links
with "Satisfy" link module"
in match [3] 
*/

Object curro = current Object
string Text_Version = curro."Object Text"
string Text_Description

Regexp MyText = regexp2 ".*(Version [0-9]*:) [^\\n*](\\\\~| *)*(.*)"
if (MyText Text_Version){
Text_Description = Text_Version [match 3]
}
else Text_Description = "Nothing"

print Text_Version [match 0] "\n" 
print Text_Version [match 1] "\n"
print Text_Version [match 2] "\n"
print Text_Version [match 3] "\n"

/*
match[0] = "ersion 2 :
Add some links
with "Satisfy" link module"

match[1] = "ersion 2 :
Add some links
with "Satisfy" link module"

match [2] = ""

match[3] = ""
*/

What is wrong?

Why is the V of "Version" not matched?

Updated on 2014-04-22T12:40:09Z by Estebell
  • Mathias Mamsch

    Re: Regexp with new lines

    ‏2014-04-22T14:29:29Z  in response to Estebell

    I don't quite get your regular expression, but in DXL you can match newlines in several ways. The "." placeholder, however, does not match newlines:

    string sText = "\n"; 
    
    Regexp re1 = regexp "\\\n"  // match
    Regexp re2 = regexp "\\n"   // match
    Regexp re3 = regexp "\n"    // match
    Regexp re4 = regexp "."     // no match
    
    if (re1 sText) print "Match re1\n"; 
    if (re2 sText) print "Match re2\n"; 
    if (re3 sText) print "Match re3\n"; 
    if (re4 sText) print "Match re4\n";
    

    I don't know what the (\\\\~| *) part of your regexp is supposed to match, but the following test at least matches in groups 1 and 3. Note that your text has a space between 'Version 2' and the ':', which your pattern did not allow for.

    string Text_Version = "Version 2 :
    Add some links
    with \"Satisfy\" link module"
    
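    Regexp MyText = regexp2 "(Version [0-9]*[ ]*:)[ \n]*(\\\\~| *)*(.*)"
    
    // Apply the regexp first: the match ranges are only set by a successful match
    if (MyText Text_Version) {
        print "Match 0: " Text_Version [match 0] "\n"
        print "Match 1: " Text_Version [match 1] "\n"
        print "Match 2: " Text_Version [match 2] "\n"
        print "Match 3: " Text_Version [match 3] "\n"
    }
    else print "No match\n"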
    

    Regards, Mathias

  • llandale

    Re: Regexp with new lines

    ‏2014-04-22T17:04:16Z  in response to Estebell

    Not really following; but I will say:

    • (.*)   will match any number of characters, not including a new-line (EOL)
    • Other RegExp references to "end of string" generally mean "end of string or next EOL".

    Thus, parsing text with EOLs causes problems.  I resolve that with this:

    • const string cl_re_strAnyChar = "[" charOf(1) "-" charOf(255)"]"    // any character except null
    • Regexp re = regexp2(whatever (cl_re_strAnyChar)* whatever)

    That seems to handle EOLs in the text.

    -Louie

    • Estebell

      Re: Regexp with new lines

      ‏2014-04-23T07:26:18Z  in response to llandale

      Well, I tried your const string cl_re_strAnyChar but it doesn't work...

      I've simplified my object text.

      // object text : "Version 2 : Add some links with link module" (without any EOL nor special characters)
      
      Object curro = current Object
      string Text_Version = curro."Object Text"
      const string str_anychar = "["charOf(1)"-"charOf(255)"]"
      
      Regexp MyText = regexp2 "([A-Z a-z 0-9]*:)(str_anychar)*"
      
      if (MyText Text_Version)
      {
          print Text_Version [match(0)] "\n"
          print Text_Version [match(1)] "\n"
          print Text_Version [match(2)] "\n"
      }
      
      // match [0] = "Version 2 :"
      // match [1] = "Version 2 :"
      // match [2] = ""
      

      Why are match(0) and match(2) wrong?

      I expected match(0) = "Version 2 : Add some links with link module" and match(2) = "Add some links with link module". Even without EOLs and with the const string, the regexp does not match!

      • Mathias Mamsch

        Re: Regexp with new lines

        ‏2014-04-23T21:14:02Z  in response to Estebell

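        Your line 7 is wrong: str_anychar sits inside the string literal, so it is matched as literal text instead of being substituted. It should read: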

        Regexp MyText = regexp2 "([A-Za-z0-9 ]*:)(" str_anychar ")*"
        

        Regards, Mathias

         

        • Estebell

          Re: Regexp with new lines

          ‏2014-04-24T07:10:53Z  in response to Mathias Mamsch

          Thanks so much!

          It works fine!

           

        • llandale

          Re: Regexp with new lines

          ‏2014-04-24T13:53:01Z  in response to Mathias Mamsch

          Beat my head against the wall yesterday and missed that.  Doh!

          However, I think you should move that last asterisk * inside the parens; this gives "match 2" the entire rest of the string.  The way you have it, match 2 is just the last character, in this case "e".

          const string str_anychar = "["charOf(1)"-"charOf(255)"]"

          Regexp MyText = regexp2 "^([A-Za-z0-9 ]*:)(" str_anychar "*)"

          void Test(string in_String)
          {
           print "[" in_String "]\n"
           if (MyText in_String)
           {
               print "\t0  [" in_String [match(0)] "]\n"
               print "\t1  [" in_String [match(1)] "]\n"
               print "\t2  [" in_String [match(2)] "]\n"
               print "\t3  [" in_String [match(3)] "]\n"
           }
           else print "\tNo Match\n"
          }
          Test("Version 2 : Add some links with link module")
          Test("Version 2 : Add some: links with link module")
          Test("Version 2 : Add some: links \nwith link module")
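
          For reference, the same pattern can be applied to the multi-line text from the original question. This is only a sketch: the names anyChar, versionRe and sample are made up for illustration, the [ \n]* skip is borrowed from Mathias's pattern, and it assumes the Object Text separates lines with a plain \n.

```dxl
// Sketch only: anyChar, versionRe and sample are illustrative names
const string anyChar = "[" charOf(1) "-" charOf(255) "]"   // any character except null, EOLs included

// Group 1: the version label (optional blanks before the colon)
// [ \n]*  : skips the blanks and the EOL that follow the colon
// Group 2: everything after that, across line breaks (asterisk inside the parens)
Regexp versionRe = regexp2 "(Version [0-9]* *:)[ \n]*(" anyChar "*)"

string sample = "Version 2 :\nAdd some links\nwith \"Satisfy\" link module"

if (versionRe sample)
{
    print "label       [" sample[match 1] "]\n"
    print "description [" sample[match 2] "]\n"
}
else print "No Match\n"
```

          With a real object, sample would be replaced by the Object Text string, and the group 2 match could then be written to the target attribute.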

          -Louie

          Updated on 2014-04-24T13:53:36Z by llandale