Topic
  • 10 replies
  • Latest Post - ‏2017-12-15T13:14:17Z by RobertDickson
SystemAdmin
SystemAdmin
7754 Posts

Pinned topic Remove Numbers using Rule Set

‏2012-08-11T22:30:00Z |
Hi All,

I need to create a rule set that removes the trailing numbers from values like the one below.
3259C83

Basically, I want to separate 3259C and 83.

I used the Investigate stage to identify the pattern; it is reported as a complex mix (qsInvPattern: @).

Please let me know how to create a rule set to handle this pattern.

Regards,
NP
Updated on 2012-08-22T22:43:30Z at 2012-08-22T22:43:30Z by SystemAdmin
  • RobertDickson
    RobertDickson
    36 Posts

    Re: Remove Numbers using Rule Set

    ‏2012-08-12T15:19:03Z  
    Hi NP,

    There are multiple possible ways. Can you provide a bit more information?
    -Is the 'separator' character ALWAYS a C?
    -Is it ALWAYS number-C-number? (i.e. a @ pattern)
    -Is the trailing number ALWAYS 2 positions?
    -Is this the ONLY data for this rule, or are there other words in the data? (e.g. is the data something like 'This is a bronze version of part number 3259C83; currently out of stock.')

    The reason for the questions is that there are potential solutions in both Pattern Action Language in the Standardize stage, and the Transformer stage.


    Regards,
    Robert
  • SystemAdmin
    SystemAdmin
    7754 Posts

    Re: Remove Numbers using Rule Set

    ‏2012-08-18T22:37:43Z  
    Hi Robert,

    Thanks for the quick response. Below is a summary of my issue.

    So far I have created a rule set to handle the patterns below.

    111
    TRUST3
    NNN111
    RE FOR MBA117
    ABN A DD DFD 2
    O'BRIEN MBA 8213
    BENEFIT A CN 087 648 815
    C ASSC 17 428 543 685
    O'BRIEN MBA 8213 2009 125 33 44

    Where a number is concatenated at the end of the word, strip the number into a separate field and use this number field, along with the string field, in matching.

    Ex:
    Original word field value: HAWKE VIC 4505 2009
    New String field value: HAWKE VIC
    New Number field value: 45052009

    I need to handle the 3259C83 pattern in the same way; basically, I need to separate it as below.

    Original word field value: 3259C83
    New String field value: C
    New Number field value: 325983

    The pattern is always 4 numbers, 1 character, 2 numbers.

    Please find the attached rule set herewith.

    Regards,
    NP

  • RobertDickson
    RobertDickson
    36 Posts

    Re: Remove Numbers using Rule Set

    ‏2012-08-19T16:16:11Z  
    Hi Koditun,

    For the 4 Number - 1 Char - 2 Number:
    Use the PICT option. For example:

    *@ ;Complex with four numeric, 1 char, 2 numeric
    COPY [1](1:4) {leadingNumeric}
    COPY [1](5:5) {middleChar}
    COPY [1](6:7) {trailingNumeric}
    RETYPE [1] 0

    For some of the numeric ones, you can try something more generic (if you can assume that the data, once numerics start, will continue with numeric until the end of the line):

    *^ | ** | A ; From the first numeric to the ZQWORDTWOZQ separator
    COPY [1] temp
    CONCAT [2] temp ;this will remove the spaces from [2], so '123 456 789' will become '123456789'
    COPY temp {NEWNUM}
    RETYPE [1] 0
    RETYPE [2] 0

    The above will handle an unlimited number of numerics after the first, so you do not have to code to be very specific as to the number of numerics (3, 4, 5, 6, etc).
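
For anyone who wants to check the intended results outside QualityStage, here is a minimal Python sketch of what the two rules above are meant to produce. It is not Pattern Action Language and does not reproduce how the Standardize stage tokenizes input. The function names `split_fixed` and `collect_trailing_numbers` are made up for illustration, and the second function assumes, as stated above, that once numerics start they continue to the end of the line.

```python
def split_fixed(token):
    """Split a token such as 3259C83 by position: 4 digits, 1 letter, 2 digits."""
    leading = token[0:4]    # positions 1-4 in the rule
    middle = token[4:5]     # position 5
    trailing = token[5:7]   # positions 6-7
    return leading, middle, trailing


def collect_trailing_numbers(text):
    """Return (words, number).

    words  = everything before the first all-numeric word
    number = that word and everything after it, joined without spaces
    """
    words = text.split()
    for i, word in enumerate(words):
        if word.isdigit():
            return " ".join(words[:i]), "".join(words[i:])
    return text, ""  # no numeric word found


print(split_fixed("3259C83"))                           # ('3259', 'C', '83')
print(collect_trailing_numbers("HAWKE VIC 4505 2009"))  # ('HAWKE VIC', '45052009')
```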

    I hope this helps!

    Robert
  • SystemAdmin
    SystemAdmin
    7754 Posts

    Re: Remove Numbers using Rule Set

    ‏2012-08-22T22:43:30Z  
    Hi Robert,

    It worked. Thanks for all your comments; I really appreciate it.

    Regards,
    NP
  • GIQU
    GIQU
    5 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-12T08:07:45Z  
    Continuing this topic, I have a similar question. If my input token is like 234AD45H7, I want to extract only the numeric values by writing a rule in Pattern Action Language.

    I tried something like this:

    COPY [1](n) temp
    CONCAT [1](-n) {DictonaryColumn}

    But it's not working as expected. My expected output is 234457.

  • RobertDickson
    RobertDickson
    36 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-13T22:30:54Z  
    You copy the leading numerics into temp but never use temp again, so they are discarded.

    The CONCAT statement picks up only the trailing number (which is 7).
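
To make that concrete, here is a small Python sketch (not Pattern Action Language) of what the two operands select, assuming `(n)` means the leading run of digits and `(-n)` the trailing run of digits, as described above. The helper names are hypothetical.

```python
import re


def leading_digits(token):
    """Digits at the very start of the token, or '' if it starts with something else."""
    match = re.match(r"[0-9]+", token)
    return match.group(0) if match else ""


def trailing_digits(token):
    """Digits at the very end of the token, or '' if it ends with something else."""
    match = re.search(r"[0-9]+$", token)
    return match.group(0) if match else ""


token = "234AD45H7"
print(leading_digits(token))   # 234
print(trailing_digits(token))  # 7
```

Neither selection reaches the 45 in the middle, so the expected 234457 cannot come from these two operands alone.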

    You have a complex alpha-numeric. The example you give is 234AD45H7. Will it always be of the format NNNAANNAN?

    Robert

  • GIQU
    GIQU
    5 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-14T04:34:00Z  

    Hi Robert,

    The format will not always be the same. The values are alphanumeric and arrive in different formats. A value may also begin and end with letters, with numbers in the middle. So there is no specific format as such.

     

     

  • RobertDickson
    RobertDickson
    36 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-14T23:34:42Z  
    Create the main pattern action like:

    & ; Any input
    COPY [1] originalValue
    CALL startLoop

    **
    COPY outputString (2:-1) {outString}
    COPY outputValue (2:-1) {outValue}

     

    And the subroutines like:

    \SUB startLoop


    &
    COPY originalValue workingValue
    COPY "-" outputValue
    COPY "-" outputString

    [workingValue LEN > 0]
    COPY workingValue(1:1) testValue
    COPY workingValue(2:-1) temp
    COPY temp workingValue
    CALL testCurrentValue
    REPEAT


    \END_SUB

    \SUB testCurrentValue

    [testValue PICT = "c"]
    CONCAT testValue outputString

    [testValue PICT = "n"]
    CONCAT testValue outputValue

     

    \END_SUB
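
As a cross-check of the logic only, the same character-by-character loop can be sketched in Python. This is not Pattern Action Language; it assumes that a PICT of "c" means a single alphabetic character and "n" a single digit, and that any other character matches neither test and is dropped. The function name is hypothetical.

```python
def split_letters_and_digits(token):
    """Walk the token one character at a time and sort each character into a bucket."""
    letters = []  # plays the role of outputString
    digits = []   # plays the role of outputValue
    for ch in token:
        if ch.isalpha():
            letters.append(ch)
        elif ch.isdigit():
            digits.append(ch)
        # anything else (punctuation, spaces) is ignored
    return "".join(letters), "".join(digits)


print(split_letters_and_digits("234AD45H7"))  # ('ADH', '234457')
```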

     

    I hope that helps!

  • GIQU
    GIQU
    5 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-15T05:24:46Z  

    Thanks Rob.

     

    Actually, I am a bit confused by this one. If I understand correctly, outputString and outputValue in the code below are variables, right? Where are these variables assigned before the substring is taken?

     

    COPY outputString (2:-1) {outString}
    COPY outputValue (2:-1) {outValue}
    

     

  • RobertDickson
    RobertDickson
    36 Posts

    Re: Remove Numbers using Rule Set

    ‏2017-12-15T13:14:17Z  
    Hi,

     

    Notice further up where I do:

    **
    COPY outputString (2:-1) {outString}
    COPY outputValue (2:-1) {outValue}

     

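    The variables are first assigned in startLoop, where I COPY a "-" into both outputValue and outputString. I do this to avoid confusion between COPY and CONCAT later (they can all be CONCAT). But that means that I need to ignore the leading dash. Hence the (2:-1).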
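
The seed-then-strip idea can be illustrated with a short Python sketch. Again this is not Pattern Action Language, and the names `SEED`, `append_char` and `strip_seed` are made up for the example.

```python
SEED = "-"  # placeholder so the buffer exists before the first append


def append_char(buffer, ch):
    """Stand-in for CONCAT: always append, no special case for the first character."""
    return buffer + ch


def strip_seed(buffer):
    """Stand-in for the (2:-1) substring: drop character 1 and keep the rest."""
    return buffer[1:]


output_value = SEED
for ch in "234457":
    output_value = append_char(output_value, ch)

print(output_value)              # -234457
print(strip_seed(output_value))  # 234457
```

If no digit is ever appended, the buffer is still just the seed, and stripping it leaves an empty string.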

     

    I hope this helps!

    Robert