Topic
  • 12 replies
  • Latest Post - ‏2012-03-30T13:13:28Z by smithha
prarthanab
12 Posts

Pinned topic Run time Error in Unduplicate match

‏2012-03-29T16:05:04Z |
Hi,
I am using QualityStage 8.5 and working through a few examples in the tutorial. I created 2 jobs:
1. Extract data: This job takes a CSV file as its input source and copies the data into 2 data sets. One is the member data set and the other is the Memberfrequencies data set. So here I have 1 source, 1 Copy stage, 1 Match Frequency stage, and 2 data sets. This job compiles fine and creates both data sets when it runs.

2. Executemembermatch: Here I provide the 2 data sets from the previous job as input. The job has 1 Unduplicate Match stage, 1 Funnel stage, and 3 data sets. It compiles fine but throws an error at run time. Here is the error:

UnduplicateMatch:Unable to locate column MatchprimaryWord1NYSIIS_USNAME in the input link with following schema:
record(
UniqueIdentifier:ustringmax=10;
ApplicantSSN:nullable ustringmax=9;
name:nullable ustringmax=70;
AddressLine1:nullable ustringmax=50;
Addressline2:nullable ustringmax=50;
City:nullable ustringmax=40;
State:nullable ustringmax=2;
Zip5:nullable ustringmax=5;
Zip4:nullable ustringmax=4;
)

I am relatively new to QualityStage. Please help me resolve this error. By the way, I am not using a standardization job anywhere in the process.
Updated on 2012-03-30T13:13:28Z at 2012-03-30T13:13:28Z by smithha
  • smithha
    23 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T16:35:57Z  
    I suspect you are using one of the out-of-the-box match specifications which would assume (and include) prior use of name and address standardization. MatchprimaryWord1NYSIIS_USNAME is created by the USNAME standardization process.

    You will want to open up your Match Specification to confirm, but it probably contains the above and other fields based on standardization. Your options would then be to either: 1) remove the fields from the passes in the Match Spec and replace with those available in your schema (and then provision the match spec and passes); or 2) add in standardization so that the fields called for in the Match Spec are available (and remember if you do to ensure those fields also go through the Match Frequency process).
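
    To confirm which column is missing, the error text itself can be split into the column name and the list of columns on the input link. The following is a minimal Python sketch, not part of QualityStage; the function name `parse_missing_column_error` is hypothetical, and it assumes the message keeps the "name: type;" layout shown in the error above.

```python
import re

def parse_missing_column_error(message):
    """Split an 'Unable to locate column' message into (missing column, schema columns)."""
    found = re.search(r"Unable to locate column (\w+)", message)
    if found is None:
        raise ValueError("message does not report a missing column")
    missing = found.group(1)
    # Schema entries look like 'City:nullable ustringmax=40;', optionally after '('.
    columns = re.findall(
        r"^[ \t]*\(?[ \t]*(\w+)[ \t]*:",
        message[found.end():],
        flags=re.MULTILINE,
    )
    return missing, columns

# Shortened copy of the first error in this thread.
sample = """UnduplicateMatch:Unable to locate column MatchprimaryWord1NYSIIS_USNAME in the input link with following schema:
record(
UniqueIdentifier:ustringmax=10;
ApplicantSSN:nullable ustringmax=9;
name:nullable ustringmax=70;
)"""

missing, columns = parse_missing_column_error(sample)
print(missing, "present in input:", missing in columns)
```

    If the column is not in the list, the Match Specification is asking for a field that the upstream job never produced.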

    Harald
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T16:53:07Z  
    Harald,
    Thanks a lot for your reply. That's exactly what I suspected, so I added standardization to my first job. The job seems to be running fine now, but I still don't see a finished status. Any idea how long an unduplicate job with almost 1000 records should take to finish?
    My job has been running for 5 minutes.
  • smithha
    23 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T17:55:14Z  
    Glad to hear you got the process running.
    In general, it is advantageous to run standardization processes before unduplication or matching processes. The parsing and standardization give you more control and consistency over inputs used in matching (both data and frequencies).

    As for job execution time, I'd expect 1000 records to finish generally within 5 mins. There's usually some startup time, but it's a very small data set. If you are working in the Designer, you should see the result flow displayed graphically (it would be green if all went fine). Otherwise, look in the DS Director for the detailed log output.

    Harald
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T19:18:43Z  
    Hi Harald,
    After running for some time, however, the job did not complete successfully. It is now failing at run time with the following error.

    Unduplicate_Match_11: Unable to locate column MatchFirst1 in the input link with the following schema:
    record
    ( UniqueIdentifier: ustringmax=10;
    ApplicantSSN: nullable ustringmax=9;
    Name: nullable ustringmax=70;
    AddressLine1: nullable ustringmax=50;
    AddressLine2: nullable ustringmax=50;
    City: nullable ustringmax=40;
    State: nullable ustringmax=2;
    Zip5: nullable ustringmax=5;
    Zip4: nullable ustringmax=4;
    NameType_USNAME: nullable ustringmax=1;
    GenderCode_USNAME: nullable ustringmax=1;
    NamePrefix_USNAME: nullable ustringmax=20;
    FirstName_USNAME: nullable ustringmax=25;
    MiddleName_USNAME: nullable ustringmax=25;
    PrimaryName_USNAME: nullable ustringmax=50;
    NameGeneration_USNAME: nullable ustringmax=10;
    NameSuffix_USNAME: nullable ustringmax=20;
    AdditionalName_USNAME: nullable ustringmax=50;
    MatchFirstName_USNAME: nullable ustringmax=25;
    MatchFirstNameNYSIIS_USNAME: nullable ustringmax=8;
    MatchFirstNameRVSNDX_USNAME: nullable ustringmax=4;
    MatchPrimaryName_USNAME: nullable ustringmax=50;
    MatchPrimaryNameHashKey_USNAME: nullable ustringmax=10;
    MatchPrimaryNamePackKey_USNAME: nullable ustringmax=20;
    NumofMatchPrimaryWords_USNAME: nullable ustringmax=1;
    MatchPrimaryWord1_USNAME: nullable ustringmax=15;
    MatchPrimaryWord2_USNAME: nullable ustringmax=15;
    MatchPrimaryWord3_USNAME: nullable ustringmax=15;
    MatchPrimaryWord4_USNAME: nullable ustringmax=15;
    MatchPrimaryWord5_USNAME: nullable ustringmax=15;
    MatchPrimaryWord1NYSIIS_USNAME: nullable ustringmax=8;
    MatchPrimaryWord1RVSNDX_USNAME: nullable ustringmax=4;
    MatchPrimaryWord2NYSIIS_USNAME: nullable ustringmax=8;
    MatchPrimaryWord2RVSNDX_USNAME: nullable ustringmax=4;
    UnhandledPattern_USNAME: nullable ustringmax=30;
    UnhandledData_USNAME: nullable ustringmax=100;
    InputPattern_USNAME: nullable ustringmax=30;
    ExceptionData_USNAME: nullable ustringmax=25;
    UserOverrideFlag_USNAME: nullable ustringmax=2;
    HouseNumber_USADDR: nullable ustringmax=10;
    HouseNumberSuffix_USADDR: nullable ustringmax=10;
    StreetPrefixDirectional_USADDR: nullable ustringmax=3;
    StreetPrefixType_USADDR: nullable ustringmax=20;
    StreetName_USADDR: nullable ustringmax=25;
    StreetSuffixType_USADDR: nullable ustringmax=5;
    StreetSuffixQualifier_USADDR: nullable ustringmax=5;
    StreetSuffixDirectional_USADDR: nullable ustringmax=3;
    RuralRouteType_USADDR: nullable ustringmax=3;
    RuralRouteValue_USADDR: nullable ustringmax=10;
    BoxType_USADDR: nullable ustringmax=7;
    BoxValue_USADDR: nullable ustringmax=10;
    FloorType_USADDR: nullable ustringmax=5;
    FloorValue_USADDR: nullable ustringmax=10;
    UnitType_USADDR: nullable ustringmax=5;
    UnitValue_USADDR: nullable ustringmax=10;
    MultiUnitType_USADDR: nullable ustringmax=5;
    MultiUnitValue_USADDR: nullable ustringmax=10;
    BuildingName_USADDR: nullable ustringmax=30;
    AdditionalAddress_USADDR: nullable ustringmax=50;
    AddressType_USADDR: nullable ustringmax=1;
    StreetNameNYSIIS_USADDR: nullable ustringmax=8;
    StreetNameRVSNDX_USADDR: nullable ustringmax=4;
    UnhandledPattern_USADDR: nullable ustringmax=30;
    UnhandledData_USADDR: nullable ustringmax=50;
    InputPattern_USADDR: nullable ustringmax=30;
    ExceptionData_USADDR: nullable ustringmax=50;
    UserOverrideFlag_USADDR: nullable ustringmax=2;
    CityName_USAREA: nullable ustringmax=30;
    StateAbbreviation_USAREA: nullable ustringmax=3;
    ZipCode_USAREA: nullable ustringmax=5;
    Zip4AddonCode_USAREA: nullable ustringmax=4;
    CountryCode_USAREA: nullable ustringmax=2;
    CityNameNYSIIS_USAREA: nullable ustringmax=8;
    CityNameRVSNDX_USAREA: nullable ustringmax=4;
    UnhandledPattern_USAREA: nullable ustringmax=30;
    UnhandledData_USAREA: nullable ustringmax=50;
    InputPattern_USAREA: nullable ustringmax=30;
    ExceptionData_USAREA: nullable ustringmax=50;
    UserOverrideFlag_USAREA: nullable ustringmax=2;
    ValidFlag_USTAXID: nullable ustringmax=1;
    TaxID_USTAXID: nullable ustringmax=10;
    UnhandledPattern_USTAXID: nullable ustringmax=30;
    UnhandledData_USTAXID: nullable ustringmax=50;
    InputPattern_USTAXID: nullable ustringmax=30;
    ExceptionData_USTAXID: nullable ustringmax=50;
    UserOverrideFlag_USTAXID: nullable ustringmax=2;
    )
    Thanks
    Prarthana.
  • smithha
    23 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T19:30:05Z  
    As the error log shows, the Match Specification expects a field/column called MatchFirst1.
    But looking through the record layout, there are no columns of that name.
    You need to make sure that all Passes in the Match Specification use column names that are in the input layout.
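
    That check can also be done outside the tool by comparing two lists of names. The sketch below is only an illustration in Python: `find_missing_columns` is a hypothetical helper, and the list of pass columns is assumed to have been copied by hand from the Match Designer. It also flags names that differ only in letter case, without making any claim about how the product treats case.

```python
def find_missing_columns(required, available):
    """Return (required_name, case_insensitive_match_or_None) for each name not in available."""
    exact = set(available)
    by_lower = {name.lower(): name for name in available}
    problems = []
    for name in required:
        if name not in exact:
            problems.append((name, by_lower.get(name.lower())))
    return problems

# Hypothetical list of columns referenced by the passes in the Match Specification.
pass_columns = ["MatchprimaryWord1NYSIIS_USNAME", "MatchFirst1", "ZipCode_USAREA"]
# A few of the columns from the input schema reported in the error.
input_columns = ["MatchFirstName_USNAME", "MatchPrimaryWord1NYSIIS_USNAME", "ZipCode_USAREA"]

for name, near in find_missing_columns(pass_columns, input_columns):
    hint = f" (differs only in case from {near})" if near else ""
    print(f"not in input layout: {name}{hint}")
```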

    Harald
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T19:37:33Z  
    Thanks, Harald, for your input. I am actually working on some tutorial samples. There is a Transformer stage in the first job that maps something to this field (MatchFirst1). In total, that Transformer stage adds 3 fields. But I have bypassed that stage (I didn't use it in my job), and I don't want to use a Transformer stage. How can I remove those fields from the match specification? How do I open it in the first place? I right-clicked on it but didn't see an option to edit it. Please advise.
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T19:44:21Z  
    From your earlier post:

    You will want to open up your Match Specification to confirm, but it probably contains the above and other fields based on standardization. Your options would then be to either: 1) remove the fields from the passes in the Match Spec and replace with those available in your schema (and then provision the match spec and passes); or 2) add in standardization so that the fields called for in the Match Spec are available (and remember if you do to ensure those fields also go through the Match Frequency process).

    How do I perform step 1?
    I want to remove MatchFirst1.
    As you mentioned, yes, I am using the out-of-the-box NameandAddress Match specification.
  • smithha
    23 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T19:53:44Z  
    To open a Match Specification, find the icon for it in your repository, and either double-click it or right-click and select "Properties".

    See this section in the user documentation for details on how to work with the specifications and passes in the Match Designer: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.qs.ug.doc/topics/c_Defining_and_testing_match_criteria.html

    You can also find more screenshots in the QualityStage Redbook at: http://www.redbooks.ibm.com/abstracts/sg247546.html
    Particularly look at pages 95-110.

    Harald
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T20:59:09Z  
    Thanks, Harald. I was able to run that job successfully the first time after I deleted the MatchFirst1 column from the match specification. But starting from the 2nd run, the job fails with the following error. Please advise.
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T20:59:48Z  
    Here is the error:

    node_node2: player 9 terminated unexpectedly.
  • prarthanab
    12 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T21:35:02Z  
    OK, here is what I have observed. In my 2nd job, which has the Unduplicate Match stage, if I write the results to only 1 file, it works. Whether it is the matched, unmatched, or clerical data, I get proper results. But when I write the results to 2 files (matched and unmatched data files) from the match stage, it fails with the "player terminated unexpectedly" error. Any clue?
  • smithha
    23 Posts

    Re: Run time Error in Unduplicate match

    ‏2012-03-30T13:13:28Z  
    I think you'll need to dig into the logs in DS Director for more details. "player terminated unexpectedly" is pointing to an issue in one of the partitions - could be in handling some type of bad data condition.
    Given your comments about it working with 1 file, but not 2 files, you probably want to look at how the results are being written out. Matched and unmatched data have different output fields available, so that could be an issue.
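
    When digging through the log, the entries just before the "terminated unexpectedly" line are usually the ones worth reading first. Here is a small Python sketch for pulling them out. It assumes the log has been saved as plain text with one entry per line, which may not match how your log is exported; `context_before` is a hypothetical helper, and apart from the last line the sample entries are invented placeholders.

```python
def context_before(lines, needle="terminated unexpectedly", before=3):
    """Return, for each line containing needle, that line plus up to `before` preceding lines."""
    windows = []
    for index, line in enumerate(lines):
        if needle in line:
            windows.append(lines[max(0, index - before): index + 1])
    return windows

# Only the last entry comes from this thread; the others are invented placeholders.
log_lines = [
    "placeholder entry 1",
    "placeholder entry 2",
    "placeholder entry 3",
    "placeholder entry 4",
    "node_node2: player 9 terminated unexpectedly.",
]

for window in context_before(log_lines):
    print("\n".join(window))
```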

    Read through some of the material in the Parallel Design guide here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/g_deeref_Parallel_Jobs_General_Information.html

    Also look for information about writing out files: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/readandwritingfiles.html

    Hope these suggestions help.

    Harald