Removing some special words

‏2008-05-29T12:24:59Z |
Hi,
We need to remove keywords such as Jr, Sr, II, III, etc. from a field.

For example:
"Richard Headley SR" should be changed to "Richard Headley"
"Richard SR Headley" should be changed to "Richard Headley"
"Sr Richard Headley" should be changed to "Richard Headley"
"Siva Sriram" should be kept as "Siva Sriram" -- it contains "sr", but only inside a word, so nothing should be replaced.
"Sriram Krishnan" should be kept as "Sriram Krishnan"

Our requirement covers at least 50 different keywords.
Please suggest a way to implement this functionality.

Thanks,
Nasimul
  • SystemAdmin

    Re: Removing some special words

    ‏2008-05-30T23:12:47Z  
    The various NAME RuleSets should identify these as NameGeneration and tokenize them as G.

    Either use a Rule Override to remove this token or modify the pattern action.

    Stewart
  • SystemAdmin

    Re: Removing some special words

    ‏2008-09-10T10:05:45Z  
    Hi,

    Classify all the keywords as G (or any class you choose) in your classification table.

    In the pattern-action file, match on that G class and retype the token to 0 (the null class), i.e.

    *G
    retype [1] 0

    Try this and let me know.

    This won't find "sr" inside "srikrishna". Only a token that appears separately as "sr" is matched; it is retyped to null and will not show up in your later patterns.

    Place the above code before the subroutine calls.

    Regards
    vairamuthu
  • SystemAdmin

    Re: Removing some special words

    ‏2008-11-05T22:14:40Z  
    As a general rule, changes like these should reside in the Input_Modifications subroutine. That subroutine and Unhandled_Modifications are intentionally left blank so that end users can modify the rules without disrupting the flow.

    Additionally, the code should be specific, since JR, II, Junior, and Senior would all be caught by vairamuthu's suggestion.

    *G = T = "SR"
    retype [1] 0
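
For anyone who wants to check the expected results outside QualityStage, here is a minimal Python sketch of the same whole-token idea. It is not QualityStage code and not what the rule set does internally; the `KEYWORDS` set and the `strip_keywords` function are hypothetical names made up for illustration, and the sketch assumes tokens are separated by whitespace and compared case-insensitively.

```python
# Hypothetical keyword list; a real one would hold all 50+ keywords.
KEYWORDS = {"JR", "SR", "II", "III"}


def strip_keywords(name, keywords=KEYWORDS):
    """Drop whole tokens that match a keyword, ignoring case.

    Matching is done per token, so "Sriram" is kept even though
    it starts with "Sr".
    """
    kept = [tok for tok in name.split() if tok.upper() not in keywords]
    return " ".join(kept)


if __name__ == "__main__":
    # The sample values from the original question.
    for sample in (
        "Richard Headley SR",
        "Richard SR Headley",
        "Sr Richard Headley",
        "Siva Sriram",
        "Sriram Krishnan",
    ):
        print(repr(sample), "->", repr(strip_keywords(sample)))
```

Passing a smaller set, for example `strip_keywords(name, {"SR"})`, mirrors the more specific pattern above, which targets only SR instead of every token classified as G.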