JimB21
2 Posts

Limited Repetition in DXL Regular Expression

2014-03-31T15:41:15Z

I'm trying to write a script to extract some information from the Object Text attribute and populate an enumerated field.  The text I'd like to extract is always at the beginning of the Object Text and enclosed in parentheses.  For example,

(Alice) This object text is marked Alice.
(Bob) This object text is marked Bob (because Bob wrote it).

I have tried applying a regular expression like the one below to the object text:

Regexp MARKING = regexp2 "^\\(.{1,10}\\)"

However, this does not find any matches. I'd like to limit the repetition (i.e. {1,10}) because otherwise I sometimes pick up a later closing parenthesis in the text and end up with the whole Object Text string as the match. Does DXL support limited repetition, and if so, what is the syntax?

Thanks,
Jim

  • llandale
    3005 Posts

    Re: Limited Repetition in DXL Regular Expression

    ‏2014-03-31T21:27:45Z  

    Look up "Regular expression" on Wikipedia, then find the part about "lazy" matching. You want to write a "lazy" (non-"greedy") expression by placing a '?' character after the quantifier.

    http://en.wikipedia.org/wiki/Regular_Expression

    I think you also need to put your desired search string inside unescaped parentheses (a group), so that when you get a general match you can extract the name of the person.

    Regexp MARKING = regexp2 "^(\\(.*?\\))"

    void Test(string S)
    {
     if (MARKING S) //
     then print "Found\t" S[match 1] "\t<" S ">\n"
     else print "Nope\t\t<" S ">\n"
    }

    Test("(ABC) more stuff")
    Test("(ABC no end paren")
    Test("(ABC) adjacent(parens)")
    Test("(ABC(nested parens) xx)")
    Test("x(ABC) paren not start of string")

    I see nested parens cause a problem.
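
    One possible way around the nested-paren case, sketched under the assumption that DXL's Regexp accepts a negated character class such as [^()]: exclude both parenthesis characters from the captured text, so the match cannot run past the first closing parenthesis and does not depend on a lazy quantifier. MARKING_STRICT and TestStrict are illustrative names only.

    ```dxl
    // Hypothetical variant of MARKING: group 1 is the text between the
    // opening "(" at the start of the string and the first ")".
    Regexp MARKING_STRICT = regexp2 "^\\(([^()]+)\\)"

    void TestStrict(string S)
    {
        if (MARKING_STRICT S)
        {
            print "Found\t" S[match 1] "\t<" S ">\n"
        }
        else
        {
            print "Nope\t\t<" S ">\n"
        }
    }

    TestStrict("(ABC) adjacent(parens)")     // intended to capture only ABC
    TestStrict("(ABC(nested parens) xx)")    // intended to be rejected
    ```

    Note that group 1 here holds only the name, without the surrounding parentheses.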

    -Louie

  • Mathias Mamsch
    2003 Posts

    Re: Limited Repetition in DXL Regular Expression

    ‏2014-04-01T18:46:11Z  

    No, DXL Regexp does not support the bounded repetition operator {a,b}. You will have to make do with + or * and then possibly:

    - manually check the number of occurrences: place a group around the expression, then apply the expression to the group's value to count how many occurrences it contains. Not nice, but in most cases simply using + and ignoring the count is enough.

    - in certain cases (as Louie said) where you do not want a greedy match you might be forced to simply repeat the pattern.
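
    The two workarounds above could look roughly like the sketch below. It assumes that DXL's Regexp accepts the negated class [^()] and the ? operator, and it checks the length of the captured group directly rather than re-applying the expression. MAX_MARK_LEN, MARKING_ANY, MARKING_1_10 and getMarking are illustrative names, not part of DXL.

    ```dxl
    // Sketch only: emulating "^\\(.{1,10}\\)" without the {a,b} operator.
    int MAX_MARK_LEN = 10

    // Option 1: match with +, then check the captured length by hand.
    Regexp MARKING_ANY = regexp2 "^\\(([^()]+)\\)"

    // Returns the marking if it is 1..MAX_MARK_LEN characters long, else "".
    string getMarking(string s)
    {
        if (MARKING_ANY s)
        {
            string mark = s[match 1]
            if (length(mark) <= MAX_MARK_LEN) return mark
        }
        return ""
    }

    // Option 2: repeat the pattern. One required character followed by
    // (MAX_MARK_LEN - 1) optional ones stands in for {1,10}.
    string pat = "^\\(([^()]"
    int i
    for (i = 1; i < MAX_MARK_LEN; i++) pat = pat "[^()]?"
    pat = pat ")\\)"
    Regexp MARKING_1_10 = regexp2 pat

    string sample = "(Bob) This object text is marked Bob (because Bob wrote it)."
    string viaLength = getMarking(sample)
    print "length check: <" viaLength ">\n"
    if (MARKING_1_10 sample) print "repeated pattern: <" sample[match 1] ">\n"
    ```

    Option 1 keeps the expression short and the limit easy to change; option 2 keeps the limit inside the expression itself, at the cost of a longer pattern.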

    I have also always wanted to write a simple recursive descent parser in DXL that uses regexps to make grammars easier to define, to gain more flexibility in parsing, and to get around the DXL regexp restrictions. So if you are interested in doing something like this together, just ping me :-)

    Regards, Mathias