12 replies Latest Post - ‏2012-03-30T13:13:28Z by smithha
prarthanab

Run time Error in Unduplicate match

‏2012-03-29T16:05:04Z |
Hi,
I am using QualityStage 8.5 and working through a few examples in the tutorial. I created 2 jobs:
1. Extract data: This job takes a CSV file as its input source and copies the data into 2 data sets: a member data set and a Memberfrequencies data set. So it has 1 source, 1 Copy stage, 1 Match Frequency stage, and 2 data sets. It compiles fine and creates both data sets after running.

2. Executemembermatch: Here I provide the 2 data sets from the previous job as input, with 1 Unduplicate Match stage, 1 Funnel, and 3 data sets. This job compiles fine but throws an error at run time. Here is the error:

UnduplicateMatch:Unable to locate column MatchprimaryWord1NYSIIS_USNAME in the input link with following schema:
record(
UniqueIdentifier:ustringmax=10;
ApplicantSSN:nullable ustringmax=9;
name:nullable ustringmax=70;
AddressLine1:nullable ustringmax=50;
Addressline2:nullable ustringmax=50;
City:nullable ustringmax=40;
State:nullable ustringmax=2;
Zip5:nullable ustringmax=5;
Zip4:nullable ustringmax=4;
)

I am relatively new to QualityStage. Please help me resolve this error. By the way, I am not using a Standardize job anywhere in the process.
Updated on 2012-03-30T13:13:28Z by smithha
  • smithha

    Re: Run time Error in Unduplicate match

    ‏2012-03-29T16:35:57Z  in response to prarthanab
    I suspect you are using one of the out-of-the-box match specifications which would assume (and include) prior use of name and address standardization. MatchprimaryWord1NYSIIS_USNAME is created by the USNAME standardization process.

    You will want to open up your Match Specification to confirm, but it probably contains the above and other fields based on standardization. Your options would then be to either: 1) remove the fields from the passes in the Match Spec and replace with those available in your schema (and then provision the match spec and passes); or 2) add in standardization so that the fields called for in the Match Spec are available (and remember if you do to ensure those fields also go through the Match Frequency process).

    Harald
    • prarthanab

      Re: Run time Error in Unduplicate match

      ‏2012-03-29T16:53:07Z  in response to smithha
      Harald,
      Thanks a lot for your reply. That's exactly what I suspected, so I added a standardization job to my first job. The job seems to be running fine now, but I still don't see a finished status. Any idea how long an unduplicate job with almost 1000 records should take to finish?
      My job has been running for 5 minutes.
      • smithha

        Re: Run time Error in Unduplicate match

        ‏2012-03-29T17:55:14Z  in response to prarthanab
        Glad to hear you got the process running.
        In general, it is advantageous to run standardization processes before unduplication or matching processes. The parsing and standardization give you more control and consistency over inputs used in matching (both data and frequencies).

        As for job execution time, I'd expect 1000 records to finish generally within 5 mins. There's usually some startup time, but it's a very small data set. If you are working in the Designer, then you should see the result flow displayed graphically (it would be green if all went fine). Otherwise, look in the DS Director for the detailed log output.

        Harald
        • prarthanab

          Re: Run time Error in Unduplicate match

          ‏2012-03-29T19:18:43Z  in response to smithha
          Hi Harald,
          After running for some time, the job did not complete successfully. It is now failing at run time with the following error.

          Unduplicate_Match_11: Unable to locate column MatchFirst1 in the input link with the following schema:
          record
          ( UniqueIdentifier: ustringmax=10;
          ApplicantSSN: nullable ustringmax=9;
          Name: nullable ustringmax=70;
          AddressLine1: nullable ustringmax=50;
          AddressLine2: nullable ustringmax=50;
          City: nullable ustringmax=40;
          State: nullable ustringmax=2;
          Zip5: nullable ustringmax=5;
          Zip4: nullable ustringmax=4;
          NameType_USNAME: nullable ustringmax=1;
          GenderCode_USNAME: nullable ustringmax=1;
          NamePrefix_USNAME: nullable ustringmax=20;
          FirstName_USNAME: nullable ustringmax=25;
          MiddleName_USNAME: nullable ustringmax=25;
          PrimaryName_USNAME: nullable ustringmax=50;
          NameGeneration_USNAME: nullable ustringmax=10;
          NameSuffix_USNAME: nullable ustringmax=20;
          AdditionalName_USNAME: nullable ustringmax=50;
          MatchFirstName_USNAME: nullable ustringmax=25;
          MatchFirstNameNYSIIS_USNAME: nullable ustringmax=8;
          MatchFirstNameRVSNDX_USNAME: nullable ustringmax=4;
          MatchPrimaryName_USNAME: nullable ustringmax=50;
          MatchPrimaryNameHashKey_USNAME: nullable ustringmax=10;
          MatchPrimaryNamePackKey_USNAME: nullable ustringmax=20;
          NumofMatchPrimaryWords_USNAME: nullable ustringmax=1;
          MatchPrimaryWord1_USNAME: nullable ustringmax=15;
          MatchPrimaryWord2_USNAME: nullable ustringmax=15;
          MatchPrimaryWord3_USNAME: nullable ustringmax=15;
          MatchPrimaryWord4_USNAME: nullable ustringmax=15;
          MatchPrimaryWord5_USNAME: nullable ustringmax=15;
          MatchPrimaryWord1NYSIIS_USNAME: nullable ustringmax=8;
          MatchPrimaryWord1RVSNDX_USNAME: nullable ustringmax=4;
          MatchPrimaryWord2NYSIIS_USNAME: nullable ustringmax=8;
          MatchPrimaryWord2RVSNDX_USNAME: nullable ustringmax=4;
          UnhandledPattern_USNAME: nullable ustringmax=30;
          UnhandledData_USNAME: nullable ustringmax=100;
          InputPattern_USNAME: nullable ustringmax=30;
          ExceptionData_USNAME: nullable ustringmax=25;
          UserOverrideFlag_USNAME: nullable ustringmax=2;
          HouseNumber_USADDR: nullable ustringmax=10;
          HouseNumberSuffix_USADDR: nullable ustringmax=10;
          StreetPrefixDirectional_USADDR: nullable ustringmax=3;
          StreetPrefixType_USADDR: nullable ustringmax=20;
          StreetName_USADDR: nullable ustringmax=25;
          StreetSuffixType_USADDR: nullable ustringmax=5;
          StreetSuffixQualifier_USADDR: nullable ustringmax=5;
          StreetSuffixDirectional_USADDR: nullable ustringmax=3;
          RuralRouteType_USADDR: nullable ustringmax=3;
          RuralRouteValue_USADDR: nullable ustringmax=10;
          BoxType_USADDR: nullable ustringmax=7;
          BoxValue_USADDR: nullable ustringmax=10;
          FloorType_USADDR: nullable ustringmax=5;
          FloorValue_USADDR: nullable ustringmax=10;
          UnitType_USADDR: nullable ustringmax=5;
          UnitValue_USADDR: nullable ustringmax=10;
          MultiUnitType_USADDR: nullable ustringmax=5;
          MultiUnitValue_USADDR: nullable ustringmax=10;
          BuildingName_USADDR: nullable ustringmax=30;
          AdditionalAddress_USADDR: nullable ustringmax=50;
          AddressType_USADDR: nullable ustringmax=1;
          StreetNameNYSIIS_USADDR: nullable ustringmax=8;
          StreetNameRVSNDX_USADDR: nullable ustringmax=4;
          UnhandledPattern_USADDR: nullable ustringmax=30;
          UnhandledData_USADDR: nullable ustringmax=50;
          InputPattern_USADDR: nullable ustringmax=30;
          ExceptionData_USADDR: nullable ustringmax=50;
          UserOverrideFlag_USADDR: nullable ustringmax=2;
          CityName_USAREA: nullable ustringmax=30;
          StateAbbreviation_USAREA: nullable ustringmax=3;
          ZipCode_USAREA: nullable ustringmax=5;
          Zip4AddonCode_USAREA: nullable ustringmax=4;
          CountryCode_USAREA: nullable ustringmax=2;
          CityNameNYSIIS_USAREA: nullable ustringmax=8;
          CityNameRVSNDX_USAREA: nullable ustringmax=4;
          UnhandledPattern_USAREA: nullable ustringmax=30;
          UnhandledData_USAREA: nullable ustringmax=50;
          InputPattern_USAREA: nullable ustringmax=30;
          ExceptionData_USAREA: nullable ustringmax=50;
          UserOverrideFlag_USAREA: nullable ustringmax=2;
          ValidFlag_USTAXID: nullable ustringmax=1;
          TaxID_USTAXID: nullable ustringmax=10;
          UnhandledPattern_USTAXID: nullable ustringmax=30;
          UnhandledData_USTAXID: nullable ustringmax=50;
          InputPattern_USTAXID: nullable ustringmax=30;
          ExceptionData_USTAXID: nullable ustringmax=50;
          UserOverrideFlag_USTAXID: nullable ustringmax=2;
          )
          Thanks
          Prarthana.
          • smithha

            Re: Run time Error in Unduplicate match

            ‏2012-03-29T19:30:05Z  in response to prarthanab
            As the error log shows, the Match Specification expects a field/column called MatchFirst1.
            But looking through the record layout, there are no columns of that name.
            You need to make sure that all Passes in the Match Specification use column names that are in the input layout.
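One way to run that check outside the tool is sketched below. It is only an illustration, not a QualityStage feature: it pulls the column names out of a schema dump like the one in the error log and lists any required column that is absent. The helper names (`schema_columns`, `missing_columns`) and the `required` list are assumptions made for the example, and the comparison assumes column names must match with exact case.

```python
import re

def schema_columns(schema_text):
    """Return the column names found in a record schema dump from the job log."""
    # Each column line looks like "Name: type;" or "( Name: type;".
    pattern = r"^[ \t]*\(?[ \t]*([A-Za-z_]\w*)[ \t]*:"
    return re.findall(pattern, schema_text, flags=re.MULTILINE)

def missing_columns(required, schema_text):
    """List the required columns that do not appear in the schema (exact-case match)."""
    available = set(schema_columns(schema_text))
    return [name for name in required if name not in available]

# Shortened copy of the schema from the error log above.
schema = """record
( UniqueIdentifier: ustringmax=10;
ApplicantSSN: nullable ustringmax=9;
Name: nullable ustringmax=70;
MatchFirstName_USNAME: nullable ustringmax=25;
MatchPrimaryWord1NYSIIS_USNAME: nullable ustringmax=8;
)"""

# Hypothetical list of columns referenced by the passes in the Match Specification.
required = ["MatchFirst1", "MatchPrimaryWord1NYSIIS_USNAME"]

print(missing_columns(required, schema))  # ['MatchFirst1']
```

Any name it prints has to be either removed from the passes or supplied by an upstream stage.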

            Harald
            • prarthanab

              Re: Run time Error in Unduplicate match

              ‏2012-03-29T19:37:33Z  in response to smithha
              Thanks, Harald, for your input. I am working on some tutorial samples. The first job in the tutorial has a Transformer stage that maps something to this field (MatchFirst1); in total, that Transformer stage adds 3 fields. But I bypassed that stage (I didn't use it in my job), and I don't want to use a Transformer stage. How can I remove those fields from the match specification? How do I open it in the first place? I right-clicked on it but didn't see an option to edit it. Please advise.
              • prarthanab

                Re: Run time Error in Unduplicate match

                ‏2012-03-29T19:44:21Z  in response to prarthanab
                From your earlier post :

                You will want to open up your Match Specification to confirm, but it probably contains the above and other fields based on standardization. Your options would then be to either: 1) remove the fields from the passes in the Match Spec and replace with those available in your schema (and then provision the match spec and passes); or 2) add in standardization so that the fields called for in the Match Spec are available (and remember if you do to ensure those fields also go through the Match Frequency process).

                How do I perform step 1?
                I want to remove MatchFirst1.
                As you mentioned, yes, I am using the out-of-the-box NameandAddress Match specification.
              • smithha

                Re: Run time Error in Unduplicate match

                ‏2012-03-29T19:53:44Z  in response to prarthanab
                To open a Match Specification, find the icon for it in your repository, and either double-click it, or right-click and select "Properties".

                See this section in the user documentation for details on how to work with the specifications and passes in the Match Designer: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.qs.ug.doc/topics/c_Defining_and_testing_match_criteria.html

                You can also find more screenshots in the QualityStage Redbook at: http://www.redbooks.ibm.com/abstracts/sg247546.html
                Particularly look at pages 95-110.

                Harald
                • prarthanab

                  Re: Run time Error in Unduplicate match

                  ‏2012-03-29T20:59:09Z  in response to smithha
                  Thanks, Harald. I successfully ran that job once after I deleted the MatchFirst1 column from the match specification. But starting from the 2nd run, the job fails with the following error. Please advise.
                  • prarthanab

                    Re: Run time Error in Unduplicate match

                    ‏2012-03-29T20:59:48Z  in response to prarthanab
                    Here is the error :

                    node_node2: player 9 terminated unexpectedly.
                    • prarthanab

                      Re: Run time Error in Unduplicate match

                      ‏2012-03-29T21:35:02Z  in response to prarthanab
                      OK, here is what I have observed. In my 2nd job, which has the Unduplicate stage, if I output the result to only 1 file, it works. Whether it is matched, unmatched, or clerical data, it gives me proper results. But when I output the results to 2 files (matched and unmatched data files) from the match stage, it fails with the "player terminated unexpectedly" error. Any clue?
                      • smithha

                        Re: Run time Error in Unduplicate match

                        ‏2012-03-30T13:13:28Z  in response to prarthanab
                        I think you'll need to dig into the logs in DS Director for more details. "player terminated unexpectedly" is pointing to an issue in one of the partitions - could be in handling some type of bad data condition.
                        Given your comments about it working with 1 file, but not 2 files, you probably want to look at how the results are being written out. Matched and unmatched data have different output fields available, so that could be an issue.
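A quick way to see where two output links differ is to compare their column lists side by side. The sketch below is only an illustration of that comparison. The column names `match_set_id` and `match_weight` are hypothetical placeholders, not the actual fields the Unduplicate Match stage writes; substitute the column lists shown on the Output tabs of your own links.

```python
def compare_links(matched_cols, unmatched_cols):
    """Report which columns are unique to each output link and which are shared."""
    matched, unmatched = set(matched_cols), set(unmatched_cols)
    return {
        "only_matched": sorted(matched - unmatched),
        "only_unmatched": sorted(unmatched - matched),
        "shared": sorted(matched & unmatched),
    }

# Hypothetical column lists copied by hand from the two output links.
matched_link = ["UniqueIdentifier", "Name", "match_set_id", "match_weight"]
unmatched_link = ["UniqueIdentifier", "Name"]

diff = compare_links(matched_link, unmatched_link)
print(diff["only_matched"])    # ['match_set_id', 'match_weight']
print(diff["only_unmatched"])  # []
```

If a downstream stage such as a Funnel or a target file expects a column that appears only on one link, that mismatch is a reasonable first place to look.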

                        Read through some of the material in the Parallel Design guide here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/g_deeref_Parallel_Jobs_General_Information.html

                        Also look for information about writing out files: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/readandwritingfiles.html

                        Hope these suggestions help.

                        Harald