9 replies Latest Post - ‏2012-04-17T20:26:07Z by dstageevo
prarthanab
12 Posts

Input to Reference Match stage

‏2012-04-05T16:51:09Z |
Hi,
I am building a job with a Reference Match stage. My data source is in a standardized format (standardization appends many more columns to the input data). My question is: should the reference data source also be standardized? If not, how will the columns on the two sides match?

Example: I have Name and Address fields.
My data source (standardized) has additional columns such as MatchFirstname, MatchPrimaryWordNYSIIS, etc.
My reference source does not have those standardized columns.
If I try to find matches for a record in the reference data source, nothing matches when I match on Name and Matchprimaryword_NYSIIS, not even as a duplicate.

Please let me know if you need more info.
  • smithha
    23 Posts

    Re: Input to Reference Match stage

    ‏2012-04-05T18:21:49Z  in response to prarthanab
    Hi,

    Yes, you want to have your reference source parsed and standardized to the same level as your incoming data.
    That statement has a couple of implications:
    1) before bringing in new input data, you will need to build at least a standardization process (and ideally a deduplication process) that will parse and generate the standardized output you want for subsequent reference matching.
    2) you will need to store the parsed/standardized output either in the existing table containing the data you want to match to or in an associated cross-reference table.
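As a rough illustration of point 2, the sketch below runs one standardization routine over both the incoming record and the reference rows, and keeps the standardized values in a separate cross-reference structure keyed by record ID. The `standardize` function, the record IDs, and the sample names are hypothetical stand-ins; in QualityStage the Standardize stage and its rule sets do this work.

```python
import re

def standardize(name: str) -> str:
    """Toy standardization: uppercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", " ", name).upper()
    return " ".join(cleaned.split())

# Hypothetical reference rows as they exist in the member table.
reference_rows = {101: "Smith,  John", 102: "O'Brien, Mary"}

# Cross-reference table: record ID -> standardized value, built once up front.
xref = {rec_id: standardize(raw) for rec_id, raw in reference_rows.items()}

# The incoming record goes through the same routine before any comparison.
incoming = standardize("smith john")
matches = [rec_id for rec_id, std in xref.items() if std == incoming]
```

Because both sides pass through the same routine, the comparison sees values of the same shape, which is the point of standardizing the reference source as well.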

    Also, from a matching perspective, data comparisons are very sensitive to the presence (or absence) of additional data, which is why parsing and standardization are important pieces of the process. This is what you are seeing in your current process. Be aware as well of the Match Comparison type and associated parameters you are using. A CHAR comparison, for instance, requires two strings to match exactly, a situation you will almost never have between standardized and non-standardized data. If there is a high disparity between the contents of two fields, the best option is to use a MULT_UNCERT comparison to try to find some overlap between the contents regardless of data order.
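To make the difference concrete, here is a minimal sketch that contrasts a strict equality test with an order-independent word overlap score. It is not the actual CHAR or MULT_UNCERT algorithm; the function names and the overlap formula (shared words divided by all distinct words) are assumptions chosen only to show why exact comparison fails on differently shaped values.

```python
def exact_match(a: str, b: str) -> bool:
    """Strict comparison: the two strings must be identical."""
    return a == b

def token_overlap(a: str, b: str) -> float:
    """Share of distinct words the two strings have in common, ignoring word order."""
    words_a, words_b = set(a.upper().split()), set(b.upper().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

standardized = "JOHN SMITH"
raw = "Smith John Jr"

strict = exact_match(standardized, raw)
overlap = token_overlap(standardized, raw)
```

Here the strict test rejects the pair outright, while the overlap score still reflects that two of the three distinct words are shared.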

    Harald
    • prarthanab
      12 Posts

      Re: Input to Reference Match stage

      ‏2012-04-05T20:05:59Z  in response to smithha
      Hi Harald,
      Thank you very much for the reply. I got my reference match job compiled and running. I now need to replace my reference data source (previously a CSV file) with a web service. From what I have read, it seems we should use the ISD Input stage to receive the web service input, but I don't see many configuration parameters there for looking up a web service. Basically, I want to send a request (as the reference data source) from a portal. Please suggest any good material. Thanks!
      • smithha
        23 Posts

        Re: Input to Reference Match stage

        ‏2012-04-06T14:39:17Z  in response to prarthanab
        For a broad overview of ISD usage with QualityStage, see this reference: Link: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/c_isd_user_ds_qs_job_topologies.html

        There are more specific considerations for reference matches since the ISD services only support one input and one output service link. See: Link: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/t_create_realtime_ref_match.html

        You will need to determine whether you want a dynamic reference source or a static one (i.e. a snapshot).
        An example of a QS ISD service with a dynamic reference is shown here: Link: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/t_setup_realtime_ref_opt_three.html
        An example with a static reference is shown here: Link: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/t_setup_realtime_ref_opt_two.html

        It is also feasible to pass both input data and reference source into a single service but you must put them into a single input and filter within the service if you do so. An example is shown here: Link: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/t_setup_realtime_ref_opt_one.html
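The single-input approach can be pictured roughly as follows: every row carries a marker saying which side it belongs to, and the first step inside the service splits the combined stream back into data rows and reference rows. The `role` column and its `DATA`/`REF` values are hypothetical; the linked IBM example defines the real layout.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

def split_by_role(rows: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Route rows from one combined input into data rows and reference rows."""
    data_rows = [r for r in rows if r["role"] == "DATA"]
    reference_rows = [r for r in rows if r["role"] == "REF"]
    return data_rows, reference_rows

# One combined request: a single data row plus the reference rows to match against.
combined = [
    {"role": "DATA", "name": "JOHN SMITH"},
    {"role": "REF", "name": "JOHN SMITH"},
    {"role": "REF", "name": "MARY JONES"},
]
data_rows, reference_rows = split_by_role(combined)
```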

        These examples should get you started.

        Harald
        • prarthanab
          12 Posts

          Re: Input to Reference Match stage

          ‏2012-04-06T16:14:44Z  in response to smithha
          Hi Harald,
          Thanks for your reply. I am a little lost here, and I want to make sure my requirement is clear. I have a job that performs a member match using a Reference Match stage. Here are more details on the job:

          1 data source - a data set (the result of my previous job, which extracts data from the DB)
          1 reference source - currently a direct input from a CSV file

          I am feeding these 2 inputs to the Reference Match stage (along with the frequency files, of course).

          Now, if I want my reference source to be a web service request that comes from the portal (instead of a CSV file), how do I implement it?

          Requirement: search for a member in the DB. We will enter the member's name in the portal (only 1 record at a time, which I am calling the reference data source) and click Search. This search should invoke a QS job that performs the member match.

          Questions:
          1. Should I convert the entire job into a service (which I do from the server console) and expose it as a web service so that the portal can invoke it? That would mean the web service accepts 1 input and outputs several matched rows.
          2. Is there a way to send the portal input as the reference source to my match job without converting the job into a web service?

          Please clarify! This is key to my WebSphere Portal -> QualityStage requirement.

          Thanks in advance.
          • smithha
            23 Posts

            Re: Input to Reference Match stage

            ‏2012-04-06T20:10:16Z  in response to prarthanab
            OK, so that provides more insight into what you want to accomplish (by the way, I would generally refer to the name entered in the portal as the input data and the member DB as the reference source, since that is what you are matching against).

            There are generally 2 key questions to ask when you are looking at this type of requirement:
            1) do you need a response back? (I believe you are indicating that you do.)
            2) if yes, do you need the data/response back quickly? (Generally if I think of a portal where you are entering member information, I expect someone wants to see the result fairly soon.)

            Sending information from a service into a job which performs a match is going to have 3 requirements:
            1) you have to get the data to the job, which means it has to go to a file, queue, or other transient data set
            2) you have to start the job (which has overhead), process the data, and complete the job (possibly more overhead)
            3) you have to get the data from the job back to the portal, which again means it has to go to something the portal can read
            It's a very decoupled process, and it will result in more latency than you are likely to want (particularly if the answer to question 2 above is yes).

            Coming back to your question, while I certainly don't know all specifics, I suspect you want to convert the job into a web service. That has several advantages:
            1) the service is always on, so startup cost is paid only once (when the service is started) and per-request latency is minimal.
            2) the data can be passed from service to service so it does not require additional landing to a file or queue.
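A toy model of those two advantages: the service below loads its reference data once at startup and then answers each request in memory, with nothing landed to a file or queue in between. The class, the loader function, and the sample names are all hypothetical, and the membership test stands in for a real match; this is only meant to show the shape of an always-on service.

```python
class MatchService:
    """Toy always-on service: reference data is loaded once, at startup."""

    def __init__(self, reference_loader):
        self.load_count = 0
        self._loader = reference_loader
        self._reference = self._load()

    def _load(self):
        self.load_count += 1
        return set(self._loader())

    def match(self, name: str) -> bool:
        # Each request is answered in memory; nothing is landed to a file or queue.
        return name.upper() in self._reference

def load_members():
    # Hypothetical stand-in for reading the member database.
    return ["JOHN SMITH", "MARY JONES"]

service = MatchService(load_members)
results = [service.match(n) for n in ("john smith", "Bob Brown", "mary jones")]
```

However many requests arrive, the load step runs once, which is where the latency saving over a start-process-stop job comes from.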

            Harald

            Edited by: smithha_admin on Apr 6, 2012 4:08 PM
            • prarthanab
              12 Posts

              Re: Input to Reference Match stage

              ‏2012-04-06T21:18:50Z  in response to smithha
              Hi Harald,
              Thanks for the reply. With the material you provided I was finally able to convert my job into a web service. Here are the answers to your questions:

              1) Do you need a response back? Yes.
              2) If yes, do you need the data/response back quickly? Yes, it should be a synchronous call. We enter some search criteria on screen and hit Enter; I expect this search to invoke my match web service (the one I just converted) and return all matched records on the portal screen immediately.

              Here is what I am doing now:

              I want to test this service from SoapUI. I loaded the WSDL and entered some input data, but I did not get a response back in SoapUI. At the bottom of the screen it says java.net.SocketTimeoutException: Read timed out.

              When I looked at the Information director, there were many instances of this job (about 50): some running, some aborted, and some in compiled state. I don't know what is happening. Why were so many instances created (I invoked the service only 2 times)? Why didn't I get a response, and where else can I check logs to figure this out? Please help.
              • smithha
                23 Posts

                Re: Input to Reference Match stage

                ‏2012-04-09T12:40:20Z  in response to prarthanab
                I don't know specifically what you may be running into, so it may be worthwhile to talk to IBM Support if you find you cannot debug the issue yourself.

                There is some reference to timeouts in QualityStage services here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/c_isd_user_timeout_value_wisd_input_output.html
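Separately from the service-side timeout settings, one client-side check is to call the service with a longer read timeout than the test tool's default, to tell a slow response apart from no response at all. The sketch below uses only the Python standard library; the operation name, namespace, and field name are hypothetical placeholders that would come from your WSDL, and a longer timeout helps only if the service is actually responding slowly.

```python
import urllib.request
from xml.sax.saxutils import escape

def build_envelope(operation: str, namespace: str, fields: dict) -> str:
    """Build a minimal SOAP 1.1 envelope for a single operation call."""
    body = "".join(f"<m:{k}>{escape(str(v))}</m:{k}>" for k, v in fields.items())
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" '
        f'xmlns:m="{namespace}">'
        f"<soapenv:Body><m:{operation}>{body}</m:{operation}></soapenv:Body>"
        "</soapenv:Envelope>"
    )

def call_service(url: str, envelope: str, timeout_seconds: float = 120.0) -> str:
    """POST the envelope and wait up to timeout_seconds for the response."""
    request = urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": '""'},
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")

# Hypothetical operation, namespace, and field; take the real ones from the WSDL.
envelope = build_envelope("memberMatch", "http://example.com/match", {"name": "John & Jane Smith"})
```

Calling `call_service` with the endpoint URL from the WSDL and this envelope either returns the response text or raises a timeout error after the chosen number of seconds.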

                Since you are using SOAP, you may want to look through the bindings reference here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/c_isd_user_bindings_container.html

                Also review the steps for enabling QS jobs as services here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.infoservdir.user.doc/topics/t_isd_user_adding_operations_ds_qs.html

                As for logs, reference is here: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.found.admin.nav.doc/containers/cont_iisinfsrv_log.html
                If you've been using the Information Services Director console, you can create a log view there that selects the ISD logs: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.found.moz.rc.logging.doc/topics/t_rclogs_working_with_logs.html

                Hope these help.

                Harald
                • Prardhana
                  9 Posts

                  Re: Input to Reference Match stage

                  ‏2012-04-13T21:16:46Z  in response to smithha
                  Hi Harald,
                  How do we undeploy a deployed QualityStage service? I could undeploy an application, but I don't see an option to undeploy only a service. Even after I undeploy the application, I can still see the service up and running fine. Also, in Director, I see that the instance associated with the job is still running. How do I stop or delete that instance? Even though I click Stop (the red button above), it still shows as running. I also have a couple more questions (I could not find much information in the Redbook or on the IBM site).

                  When is an instance of a job created once it is enabled as an information service? As soon as it is deployed, or when it is invoked by a client? How many instances are created per invocation (I hope only 1 per invocation)? And is there any better way to integrate with WebSphere Portal than exposing this job as a service and invoking that service from the portal?

                  Thanks
                  Prarthana.
                  • dstageevo
                    1 Post

                    Re: Input to Reference Match stage

                    ‏2012-04-17T20:26:07Z  in response to Prardhana
                    Hi Prarthana...

                    Some thoughts on this below....

                    Ernie

                    Ernie Ostic


                    > Hi Harald,
                    > How do we undeploy a deployed QualityStage service? I could undeploy an application, but I don't see an option to undeploy only a service.

                    Applications are deployed and undeployed. Services, which "belong" to an application, can be "disabled". Applications and Operations can also be "disabled". Disable/Enable (done at the Deployed Application Workspace: look for the edit button on the bottom right and then a new button that appears towards the bottom left) is the preferred way to just "stop" a job instance in QS if you only want to change a transform or something and then recompile (and then re-enable). It is far quicker. You only need to redeploy if you change something in the signature of the Service (the ultimate input and output from the ISDInput and ISDOutput Stages).

                    > Even after I undeploy the application, I can still see the service up and running fine.

                    Something might have gone wrong during undeployment. It's better to disable first, but usually an undeploy should also shut things down.

                    > Also, in Director, I see that the instance associated with the job is still running. How do I stop or delete that instance? Even though I click Stop (the red button above), it still shows as running. I also have a couple more questions (I could not find much information in the Redbook or on the IBM site).

                    The best way to fix it now is probably to deploy the application again at the regular Applications Workspace. Hopefully it will cycle things through, though you may need to cycle your Info Server.

                    > When is an instance of a job created once it is enabled as an information service? As soon as it is deployed, or when it is invoked by a client? How many instances are created per invocation (I hope only 1 per invocation)? And is there any better way to integrate with WebSphere Portal than exposing this job as a service and invoking that service from the portal?

                    These are more complex questions that depend on the topology of the job and the minimum number of instances. Assuming that you have an ISDInput Stage (we call this an "always on" Job) and the minimum number of instances is at least one, the QS Job will be started as soon as the Application is fully deployed. Invocations by a client will all flow through that same Job. Things change if you deploy a Job that does NOT have an ISDInput Stage, but that's a whole other subject.
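The always-on case described above can be pictured with a small toy model: instances are started when the application is deployed, and client invocations are routed to an instance that is already running instead of creating a new one. The class, the instance naming, and the round-robin routing are assumptions made for illustration; this does not model how ISD actually schedules or scales instances.

```python
class DeployedApplication:
    """Toy model of an always-on job: instances start at deploy time, not per call."""

    def __init__(self, min_instances: int = 1):
        self.min_instances = min_instances
        self.instances = []   # one entry per running job instance
        self.handled = []     # (instance_id, payload) for every invocation

    def deploy(self) -> None:
        # Instances are started as soon as the application is deployed.
        self.instances = [f"job-instance-{i}" for i in range(1, self.min_instances + 1)]

    def invoke(self, payload: str) -> str:
        if not self.instances:
            raise RuntimeError("application is not deployed")
        # Requests are routed to an already-running instance; none is created here.
        instance = self.instances[len(self.handled) % len(self.instances)]
        self.handled.append((instance, payload))
        return instance

app = DeployedApplication(min_instances=1)
app.deploy()
used = {app.invoke(name) for name in ("JOHN SMITH", "MARY JONES", "BOB BROWN")}
```

With a minimum of one instance, all three invocations land on the same job instance, and the instance count is unchanged by the calls.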

                    As far as integrating with WebSphere goes, there is no better way! Whether you use SOAP or EJB, there are lots of tools that should be able to generate a client (an EJB, for example) that can invoke this service. Alternatively, you might consider using MQSeries if you have that tooling. That's not an ISD-based method, but it would work for real-time sharing of functionality.
