Information Management IBM InfoSphere Master Data Management, Version 10.1

Example: scraping instructions

It is important to start with a clear design for the IBM® InfoSphere® Master Data Management Healthcare Point of Service Integrator scraping process. A successful design involves understanding the scraping process from start to finish.

In this example, the scraping design is as follows:

  1. HostSearch wakes up when any of three legacy search windows is presented.
  2. Since the intended design of HostSearch is to be a replacement Search window, if the user cancels from the HostSearch window, then the legacy search is also canceled.
  3. If the user decides that no member meets the entered criteria, then only the entered name is placed in the legacy search window. The entered name is preceded by an <Alt>E to clear all fields, and followed by an <Alt>d (which is the legacy application search window shortcut that forces a New Patient action). HostSearch returns to idle mode for the next requested legacy search.
  4. If the user decides that an entity with a local-system representative should be registered, then only the local-system ID number is placed in the legacy search window. The entered system ID is preceded by a pound sign (which is the usual notation that the legacy system uses to indicate that a primary person identifier is being used). Again an initial <Alt>E is used to be sure that no stray data is in the search screen. Because users normally press <Enter> when selecting a patient in this manner (though it is a rare case that the user has the primary identifier available), it is used here as well.
  5. If the user picks an entity that has presented only at other sites in the healthcare system, then a more complicated process is required. First, nothing should be present in any input field on the legacy search window. An <Alt>d should always be sent, which is a clue for the legacy system that a new patient is being entered. HostSearch then goes away, because the user must first work through certain legacy process items that are not part of the HostSearch information. When the user has finished entering those items, the actual demographic scraping begins when the legacy system presents a particular demographics window, and HostSearch goes away again upon completion.
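
The three non-cancel outcomes in this design differ only in the keystrokes that reach the legacy search window. The following Python sketch is illustrative only and is not part of the product. The function name `legacy_keystrokes` is hypothetical, and the notation is an assumption borrowed from the send strings in this example, where `%E` appears to stand for <Alt>E, `%d` for <Alt>d, and `{ENTER}` for <Enter>.

```python
def legacy_keystrokes(outcome, value=""):
    """Build the keystroke string for one search outcome (hypothetical helper).

    Assumed notation: "%E" = <Alt>E, "%d" = <Alt>d, "{ENTER}" = <Enter>.
    """
    if outcome == "new":
        # Clear all fields, type the entered name, then force New Patient.
        return "%E" + value + "%d"
    if outcome == "update":
        # Clear all fields, then send "#" plus the local-system ID and <Enter>.
        return "%E#" + value + "{ENTER}"
    if outcome == "add":
        # Clear all fields and signal a new patient; no data is typed yet.
        return "%E%d"
    raise ValueError("unknown outcome: " + outcome)


if __name__ == "__main__":
    for outcome, value in [("new", "SMITH,JOHN"), ("update", "12345"), ("add", "")]:
        print(outcome, "->", legacy_keystrokes(outcome, value))
```

The name and ID values shown are made up for the demonstration.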

Assume that the initialization uses the following commands:

Emulator Type = HBOC8
Def Trigger 1 = Group=1^Style=1
Trigger 1 = Group=1^Type=1^WinTitle=Patient MPI Search
Trigger 2 = Group=1^Type=1^WinTitle=Guarantor Search
Trigger 3 = Group=1^Type=1^WinTitle=Employee Search

Def Trigger 2 = Group=2^Style=1
Trigger 5 = Group=2^Type=1^WinTitle=Admission^SubTitle=[CHS

Screen Order = New=1^Update=2^Add=3^Demogjump=4^Demog1=5^Demog2=6^Cancel=7

App Field 90 = Screen=7^Seq=1^Detect=wt+Patient^Action=send:%c{ENTER}~Proc=exit
App Field 91 = Screen=7^Seq=1^Detect=wt+Guarantor^Action=send:%Z{ENTER}~Proc=exit
App Field 92 = Screen=7^Seq=1^Detect=wt+Employee^Action=send:{ESC}~Proc=exit

App Field 01 = Screen=1^Seq=1^Detect=wt+Patient MPI Search~Guarantor Search
App Field 02 = Screen=1^Seq=2^Target=New Patient^Action=send:%E$A%d

App Field 03 = Screen=2^Seq=1^Detect=wt+Patient MPI Search~Guarantor Search
App Field 04 = Screen=2^Seq=2^Target=MRN^Action=send:%E#$A{ENTER}

App Field 05 = Screen=3^Seq=1^Detect=wt+Patient MPI Search~Guarantor Search
App Field 06 = Screen=3^Seq=2^Target=Add^Action=send:%E%d^Proc=mode=2:Demogjump

App Field 10 = Screen=4^Seq=1^Detect=st+[CHS Register]^Proc=jump=Demog1
App Field 11 = Screen=4^Seq=2^Detect=st+[CHS Schedule]^Proc=jump=Demog2

App Field 20 = Screen=5^Seq=1^Detect=st+[CHS Register]^Proc=required~activate
App Field 21 = Screen=5^Seq=2^Target=Name^Putloc=0,0^Attr=LGLNAME^Proc=l,fm
App Field 22 = Screen=5^Seq=3^Target=Suffix^Putloc=0,0^Attr=LGLNAME:sfx
App Field 23 = Screen=5^Seq=4^Target=Skipping Race^Putloc=0,0^Attr=NULL
App Field 24 = Screen=5^Seq=5^Target=SSN^Putloc=0,0^Attr=SSN:idnumber
App Field 25 = Screen=5^Seq=6^Detect=anywin+Alternatives^Proc=repeat=3~abort=M175:"SSN'SSN Match"
App Field 26 = Screen=5^Seq=7^Target=Address #1^Putloc=0,0^Attr=HMADDR:street1
App Field 27 = Screen=5^Seq=8^Target=Address #2^Putloc=0,0^Attr=HMADDR:street2
App Field 28 = Screen=5^Seq=9^Target=City^Putloc=0,0^Attr=HMADDR:city
App Field 29 = Screen=5^Seq=10^Target=Zipcode^Putloc=0,0^Attr=HMADDR:zipcode

App Field 40 = Screen=6^Seq=1^Detect=st+[CHS Schedule]^Proc=required~activate
App Field 41 = Screen=6^Seq=2^Target=SSN^Putloc=0,0^Attr=SSN:idnumber
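
To make the trigger lines easier to read, the following Python sketch splits a specification into its modifiers and tests a window caption against it. It is an illustration only, not the product's parser. It assumes that modifiers are separated by `^`, that WinTitle must match the beginning of the caption, and that SubTitle can match anywhere in the caption, as this example describes. The function names, the sample captions, and the case-sensitive comparison are assumptions.

```python
def parse_modifiers(spec):
    """Split 'Key=Value^Key=Value' into a dict, cutting each pair at its first '='."""
    fields = {}
    for part in spec.split("^"):
        key, _, value = part.partition("=")
        fields[key.strip()] = value
    return fields


def trigger_matches(spec, caption):
    """Return True if the caption satisfies the trigger's WinTitle and SubTitle."""
    fields = parse_modifiers(spec)
    title = fields.get("WinTitle", "")   # assumed: must begin the caption
    sub = fields.get("SubTitle", "")     # assumed: may appear anywhere
    return caption.startswith(title) and sub in caption


if __name__ == "__main__":
    group2 = "Group=2^Type=1^WinTitle=Admission^SubTitle=[CHS"
    # Hypothetical captions, made up for the demonstration.
    print(trigger_matches(group2, "Admission [CHS Register]"))
    print(trigger_matches(group2, "Admission Summary"))
```

Because each pair is cut at its first `=`, a value such as `mode=2:Demogjump` in a Proc modifier stays intact.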

Then the following scraping activity occurs:

  1. The chosen emulator type for this example is HBOC8, which is a window-only emulator (it is not capable of direct-text scraping, so it can be accessed only by window-type calls). By default, a <Tab> is always appended to HBOC8 attribute Put operations.
  2. The main trigger group (Group 1) on which HostSearch initially wakes up and presents itself for the user to search, consists of three possible windows. In fact, it could be more, because the WinTitle modifier specifies that the window caption need only begin with the given text. But in this case, enough caption text is given to make sure that the overwritten windows are unique in the target legacy application.
  3. The second trigger group represents a continuation of scraping when it needs to be divided into two parts. In this example, the first part is responsible only for sending the clue as to new-patient registration and the second part is responsible for demographics. Both WinTitle and SubTitle are invoked, because there are many windows that begin with the text “Admission,” but only those that have “[CHS” somewhere in the caption begin the demographics process. And even though there are many windows in the New Patient Registration process that might have this caption sub-text, that does not matter because it is always at the first such window that demographics are scraped. Because HostSearch wakes up the first time it sees such a window, and then returns at the end of the process to look for Group 1 trigger windows, this seemingly broad text matching is fine for the Group 2 trigger windows.
  4. All the instruction blocks are given names. The only reserved names are New, Update, Add, and Cancel. New represents criteria-only scraping. Update represents local-member scraping. Add represents non-local-member scraping. Cancel refers to any action that invokes a cancel, such as clicking a mapped Cancel button, closing the HostSearch window, or pressing a hotkey that is mapped to the Cancel button. You can name the other instruction blocks as you want. In this example, the name “Demogjump” seemed appropriate for the beginning of Stage 2 of the split-scrape process, since the purpose of that block is single-minded: decide which specific screen triggered the process, and jump to an instruction block appropriate for that screen.
  5. The first instruction block is the one pertaining to Cancel (Screen=7, and in Screen Order, Cancel=7). The App Field numbering, order, and proximity do not matter; it just seemed appropriate to number the Cancel block high (to account for possible App Field additions in other blocks later), but present it early for readability purposes. In this case, all that is required is a shortened Detect window title, since we know that HostSearch came up only because of the Trigger Group 1 windows. The first Detect that succeeds sends the associated characters to the legacy window, then immediately exits (once a window is found, there is no harm in continuing through the Cancel block, but no reason to go on and test for more either) and returns to idle mode to listen for Group 1 triggers again.
  6. The New and Update actions are nearly identical, differing in just a few characters. The blocks have a Detect at the beginning. While not required, the Detect verifies that the user has not used the legacy application while HostSearch was supposed to be at the forefront. Also, only one Detect is needed, as it can verify multiple target windows (and in this case, all such windows use the same data feeding technique).
  7. The Add action is also nearly identical. The exception is at the end of its scraping where HostSearch is instructed to go into idle mode and listen specifically for Group 2 triggers. When a trigger is found, HostSearch continues with instruction block “Demogjump”. This splitting of the scraping process is required due to the screen interruption that the user must first deal with before continuing with scraping.
  8. The Demogjump instruction block tests for more specific window captioning than was in the Group 2 trigger specification. It immediately routes scraping to one of two blocks, the second of which is the simpler: only a Social Security Number is required. As a precaution, both blocks verify at their start that the window exists, and request an “activate”, just in case the legacy system has not brought the window to the foreground.
  9. The Demog1 instruction block is the most complicated but, looked at carefully, it is a repetition of the same attribute scrape instruction. The “PutLoc” is required to force attribute scraping, but is set to “0,0” because the emulator does not have position-scraping capability. Instead, the attributes are scrape-instructed in the <Tab> field order, so that the implicitly appended <Tab> character moves the focus from each input field to the next.
  10. For that reason, the desire to skip the Race field requires an attribute, but it is set to NULL, so that the <Tab> is still sent. Instead of using this method, the programmer could have used an instruction with “Action=send:{TAB}”, but then a separate report instruction would have been needed to make a progress message appear to the user. With the attribute scraping comes the implicit progress messages.
  11. After the SSN is scraped, a special processing trap is made. In this fictitious legacy system, there is a rule imposed: no two patients are to have the same Social Security Number. If a user enters a previously used number, an error window appears that forces the user to choose an alternative processing method. So in this instruction block, this popup window is tested three times, and if it does appear, then the scraping process is halted with an error message to that effect. The error is M175, with the “%1” in its definition replaced with the text “SSN,” and the “%2” replaced with “SSN Match.” This instruction is rather fast (it occurs in milliseconds) and might not give the legacy system enough time to both process its mini-search and present the popup window during times of heavy use. Thus, you can add a “pause=3” to the SSN scraping, which forces a full three-second pause into the scraping process. Alternatively, adding “pause=1” in the Proc modifier of the Detect instruction means that each test for the window is accompanied by a 1-second pause, so that the legacy system has up to three seconds to present the window, but if it is running faster, it is detected sooner. In either case, if the popup window does not appear, it is a three-second overall process wait.
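
The timing trade-off in the last step can be sketched in Python. This is an illustration of the repeat-with-pause idea, not HostSearch code. The function `detect_with_retries` and its parameters are hypothetical, and placing the pause before each test (rather than after it) is an assumption.

```python
import time


def detect_with_retries(window_present, repeat=3, pause=1.0, sleep=time.sleep):
    """Test for a window up to `repeat` times, pausing before each test.

    Returns the 1-based attempt on which the window was seen, or None if it
    never appeared. `window_present` is any zero-argument callable that
    reports whether the popup is currently showing (hypothetical hook).
    """
    for attempt in range(1, repeat + 1):
        sleep(pause)  # give the legacy system time to present the popup
        if window_present():
            return attempt
    return None


if __name__ == "__main__":
    # Simulate a popup that appears 1.5 seconds after the SSN is sent,
    # using a fake sleep that only records the requested pauses.
    waited = []
    found = detect_with_retries(lambda: sum(waited) >= 1.5, sleep=waited.append)
    print("detected on attempt", found, "after", sum(waited), "seconds")
```

With three tests and a one-second pause, a popup that appears early is caught on the first or second test, while a popup that never appears costs the full three seconds, which matches the single “pause=3” alternative only in the worst case.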



Timestamp Last updated: 15 Aug 2012
