  • Latest Post - ‏2012-11-08T15:07:10Z by SystemAdmin
rlnielsen
3 Posts

Parsing Rule With Multiple, Alternative Part of Speech

‏2012-11-07T19:43:01Z |
I want to create a parsing rule that matches:

1) A determiner
2) Zero or more adjectives, quantifiers, or cardinal numbers (since I consider all these adjectives)
3) One or more nouns

I can figure out everything but 2. How can I define a parsing rule where "zero or more of the following types" can be used?

I realize I can write three separate rules (one for adjectives, one for quantifiers, and one for cardinal numbers), but this will eventually lead to an explosion of rules further down the pipeline, since those later rules need to process the annotations derived from each of these rules. So I would rather have one rule that produces one annotation.

Is there a superclass that matches all three? Can I group things somehow?

Robert
Updated on 2012-11-08T15:07:10Z by SystemAdmin
  • bfoyle
    60 Posts

    Re: Parsing Rule With Multiple, Alternative Part of Speech

    ‏2012-11-07T21:38:37Z  
    I would try grouping them.

    -Start with a phrase that mirrors what you are trying to model, such as "the six best large companies", and drag it into your parsing rule.
    -Tick the "require" check box on each of the adjective and quantifier tokens and set each one to 'zero or more'.
    -Change the second adjective token to 'cardinal number' (right-click the part of speech and click match criteria), then set it to 'zero or more' as well.
    -Group the adjective, quantifier, and cardinal number tokens. I'm guessing the group should be disordered.
    -Set the disordered group to occur 'one or more times'.

    You should end up with something like the attached image.

    You may also have to add tokens for cases where the adjectives are separated by commas (such as "the six best, large, awesome, incredible companies").
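
For readers who think in code, the grouped rule described in this post behaves roughly like a regular expression over part-of-speech tags. The sketch below is plain Python, not the rule editor itself. The tag names (DET, ADJ, QUANT, CARD, COMMA, NOUN) and the function name is_noun_phrase are made up for illustration, and the optional comma handling is only one possible reading of the comma case.

```python
import re

# Hypothetical tag names, used for illustration only; the tool's real
# part-of-speech labels may differ.
MOD = r"(?:ADJ|QUANT|CARD)"  # any one of the three "adjective-like" tags

# DET, then an optional run of modifiers (optionally comma separated),
# then one or more NOUN tags.
NOUN_PHRASE = re.compile(rf"DET(?: {MOD}(?:(?: COMMA)? {MOD})*)?(?: NOUN)+")

def is_noun_phrase(tags):
    """Return True if the whole tag sequence matches the pattern."""
    return NOUN_PHRASE.fullmatch(" ".join(tags)) is not None

# "the six best large companies"
print(is_noun_phrase(["DET", "CARD", "ADJ", "ADJ", "NOUN"]))  # True
# determiner followed directly by nouns (zero modifiers)
print(is_noun_phrase(["DET", "NOUN", "NOUN"]))  # True
# no noun at the end
print(is_noun_phrase(["DET", "ADJ"]))  # False
```

The alternation inside MOD plays the role of the group, and the repetition around it plays the role of the group's occurrence setting.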

    bf
  • SystemAdmin
    197 Posts

    Re: Parsing Rule With Multiple, Alternative Part of Speech

    ‏2012-11-08T11:25:42Z  
    Another way of doing this is to create rules that promote these parts of speech to adjectives.
    Create three phrase rules, one per token type. Each rule has one of the match criteria you need (quantifier, adjective, cardinal number), and you annotate each of them as "Adjective".

    After this, you create a rule with Determiner Adjective* Noun+

    This solution may perform better because it has fewer loops.
    I would recommend that you try both solutions and see which one works better for you.
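
As a rough illustration of this two-step idea (promote first, then match one simple pattern), here is a small Python sketch. It is not the product's rule language. The tag names, the PROMOTE table, and the functions promote and find_noun_phrases are all hypothetical and exist only to show the shape of the approach.

```python
# Hypothetical tag names. Pass 1 "promotes" three fine-grained tags
# to a single "Adjective" label; every other tag is left unchanged.
PROMOTE = {"ADJ": "Adjective", "QUANT": "Adjective", "CARD": "Adjective"}

def promote(tags):
    """Pass 1: relabel adjective-like tags as "Adjective"."""
    return [PROMOTE.get(tag, tag) for tag in tags]

def find_noun_phrases(tags):
    """Pass 2: return (start, end) spans matching DET Adjective* NOUN+."""
    labels = promote(tags)
    spans = []
    i = 0
    while i < len(labels):
        if labels[i] != "DET":
            i += 1
            continue
        j = i + 1
        while j < len(labels) and labels[j] == "Adjective":  # Adjective*
            j += 1
        k = j
        while k < len(labels) and labels[k] == "NOUN":  # NOUN+
            k += 1
        if k > j:  # at least one noun was found
            spans.append((i, k))
            i = k
        else:
            i += 1
    return spans

tags = ["VERB", "DET", "CARD", "ADJ", "ADJ", "NOUN", "VERB", "DET", "NOUN", "NOUN"]
print(find_noun_phrases(tags))  # [(1, 6), (7, 10)]
```

The second pass only ever looks at one modifier label, which is the point of promoting: later rules do not need to know about the three original types.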
  • rlnielsen
    3 Posts

    Re: Parsing Rule With Multiple, Alternative Part of Speech

    ‏2012-11-08T14:51:29Z  
    Thanks bf ... I will try it.

    Robert
  • rlnielsen
    3 Posts

    Re: Parsing Rule With Multiple, Alternative Part of Speech

    ‏2012-11-08T14:53:21Z  
    Thanks Amine ... I will try it. But I have a secondary question ... shouldn't the base "Adjective" annotation be a supertype of Quantifier and CardinalNumber? If it were, I could just use the base Adjective for my purposes.

    Robert
  • SystemAdmin
    197 Posts

    Re: Parsing Rule With Multiple, Alternative Part of Speech

    ‏2012-11-08T15:07:10Z  
    I understand your point. The reason behind this classification is to give more flexibility in terms of the available parts of speech. If we stopped our analysis at adjectives only, we would lose this fine-grained classification, and you wouldn't be able to do what you want to do. With the way things are classified now, you can still group all these elements under one flag (although I understand that it is not the most convenient situation :-) )