combine lwr pear with analysis engine in aggregate engine

Replies: 6 - Last Post: Nov 18, 2009 6:20 AM - Last Post By: t.oby
t.oby

Posts: 53
Registered: Feb 19, 2009 08:57:13 AM
combine lwr pear with analysis engine in aggregate engine
Posted: Oct 15, 2009 01:40:21 PM
Hi,

I developed an AE which uses the results of an LWR Pear. Currently I use an aggregate analysis engine descriptor XML file which contains (i) the LWRPearAnalysisEngineDescriptor file and (ii) the MyAnalysisEngineDescriptor file, and which looks like this:

// code snippet
<delegateAnalysisEngineSpecifiers>
<delegateAnalysisEngine key="com.ibm.dltj.ruleannotator_pear">
<import location="com.ibm.dltj.ruleannotator_pear.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="OntologyAE">
<import location="OntologyAE.xml"/>
</delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>
// code snippet

Since I can now successfully load the LWR Pear as a PackageBrowser and then generate an AE from it (thanks to Kevin!), I would like to build an aggregate engine on the fly. My first attempt looks like this:

// code snippet

// ae1: load AE from Pear PackageBrowser
AnalysisEngine ae1 = loadEFromPEAR(uimaPear);
AnalysisEngineDescription ae1_desc = ....?

// ae2: load OntologyAE
AnalysisEngineDescription ae2_desc = loadAEDescFromJavaClassName(ONTOLOGY_AE_JAVA_CLASS_NAME);

// agg: generate aggregate description from ae1 and ae2
AnalysisEngineDescription aggDesc = new AnalysisEngineDescription_impl();
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE1_LWR", ae1_desc);
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE2_MY", ae2_desc);
FixedFlow_impl flow = new FixedFlow_impl();
flow.setFixedFlow(new String[] { "AE1_LWR","AE2_MY"});
aggDesc.getAnalysisEngineMetaData().setFlowConstraints(flow);

// code snippet

Unfortunately I have no idea how to generate an AnalysisEngineDescription from a Pear, or whether this is even possible, since the Pear comes with a PearSpecifier, which is not an AnalysisEngineDescription. I tried to set "ae1_desc" to the PearSpecifier by parsing it from "uimaPear.getComponentPearDescPath()", which caused an org.apache.uima.util.InvalidXMLException, as expected.

I would be very grateful for any help on this.

Cheers,
Toby

KevinCunnane

Posts: 72
Registered: Feb 03, 2009 10:46:24 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Oct 21, 2009 11:30:04 PM   in response to: t.oby's post
Hi Toby, sorry for the delay in getting back to you. I think that in an earlier mail to you I covered how to get an Analysis Engine descriptor from a Pear? The relevant snippet is:

// Install PEAR
final PackageBrowser pear = installPear();

// Get the descriptor for the pear, and build Analysis Engine based on that
final XMLInputSource in = new XMLInputSource(pear.getInstallationDescriptor().getMainComponentDesc());
final AnalysisEngineDescription analysisDesc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(in);
in.close();


This uses the main component descriptor, and if you look at the earlier mail it sets up a Resource Manager that loads the classpath and datapath for the Pear. Hopefully this solves your problem? I think there should be a simpler way but I haven't tested it. From the UIMA References Guide section on the Pear API:

// Create analysis engine from the installed PEAR package using
  // the created PEAR specifier
  XMLInputSource in = 
        new XMLInputSource(instPear.getComponentPearDescPath());
  ResourceSpecifier specifier =
        UIMAFramework.getXMLParser().parseResourceSpecifier(in);
  AnalysisEngine ae = 
        UIMAFramework.produceAnalysisEngine(specifier, rsrcMgr, null);


Note that this gets a ResourceSpecifier, not an AnalysisEngineDescription, but it should still be equivalent. I would advise trying this out since it doesn't require any setup of the Resource Manager.

By the way, you don't have to create an Aggregate Engine if you are working programmatically. CasCreationUtils provides support for creating CAS objects based on a Collection of descriptors. So if you prefer, you could create AnalysisEngines for the Pear and your annotator, create a CAS from both descriptors, and then call the AnalysisEngine.process() method on each engine in turn to run the processing. This is another option if setting up an aggregate descriptor proves too annoying.

Hope this helps,

Kevin
t.oby

Posts: 53
Registered: Feb 19, 2009 08:57:13 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Nov 16, 2009 02:42:02 PM   in response to: KevinCunnane's post
Hi Kevin, thanks for your reply, and sorry for my late reply again :) Unfortunately I have not yet been able to get a working AnalysisEngineDescription from the Pear which I could use to generate my aggregate AnalysisEngine.

First off, to generate my aggregate AnalysisEngine I initialized an AnalysisEngine with the descriptor generated from the pear and the descriptor from my analysis engine in a fixed flow setup:
AnalysisEngineDescription aggDesc = new AnalysisEngineDescription_impl();
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE1_PEAR", ae1_pear); 
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE2_MY", ae2_java);
FixedFlow_impl flow = new FixedFlow_impl();
flow.setFixedFlow(new String[] { "AE1_PEAR","AE2_MY"});
aggDesc.getAnalysisEngineMetaData().setFlowConstraints(flow);

I hope this is correct.
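
One part of a setup like this can be sanity-checked in plain Java without any UIMA dependency: every key passed to setFixedFlow should also be a key in the delegate map. The following is only a sketch under that assumption. The class and method names (FlowKeyCheck, missingDelegates) are hypothetical helpers for illustration and are not part of the UIMA API, and the map here is a stand-in for the one returned by getDelegateAnalysisEngineSpecifiersWithImports().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of UIMA: compares flow keys with delegate keys.
class FlowKeyCheck {

    // Returns the flow nodes that have no delegate registered under the same key.
    static List<String> missingDelegates(Map<String, ?> delegates, String[] flow) {
        List<String> missing = new ArrayList<>();
        for (String node : flow) {
            if (!delegates.containsKey(node)) {
                missing.add(node);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for aggDesc.getDelegateAnalysisEngineSpecifiersWithImports();
        // in real code the values would be the delegate descriptors.
        Map<String, Object> delegates = new LinkedHashMap<>();
        delegates.put("AE1_PEAR", new Object());
        delegates.put("AE2_MY", new Object());

        String[] flow = { "AE1_PEAR", "AE2_MY" };
        System.out.println("Missing delegates: " + missingDelegates(delegates, flow));
    }
}
```

An empty result means the flow only references keys that exist in the delegate map. It says nothing about whether the delegates themselves load correctly.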

OK, to generate the descriptor from the pear I tried two different ways.

First (this is more a workaround) I manually installed the pear into "/tmp/pearInstalls" via the UIMA Pear Installer (org.apache.uima.tools.pear.install.InstallPear) and then called
PackageBrowser pear = new PackageBrowser("/tmp/pearInstalls/com.ibm.dltj.ruleannotator/");

This would return a Pear with the file
/tmp/pearInstalls/com.ibm.dltj.ruleannotator/desc/lw_PosRule.xml

as the main component descriptor (pear.getInstallationDescriptor().getMainComponentDesc()), and with the following pear specifier file path (pear.getComponentPearDescPath()):
/tmp/pearInstalls/com.ibm.dltj.ruleannotator/com.ibm.dltj.ruleannotator_pear.xml

When validating the pear AnalysisEngineDescription via the doFullValidation method, a ResourceInitializationException is thrown:
org.apache.uima.resource.ResourceInitializationException: Annotator class "com.ibm.dltj.uima_annotator.tagger.POSTagger" was not found. (Descriptor: file:/tmp/pearInstalls/com.ibm.dltj.ruleannotator/desc/LWPOSTagger.xml)

I'm not sure whether "LWPOSTagger.xml" would be the correct main component descriptor to set up the AnalysisEngineDescription.

The second way was to directly use the result from the installed pear to generate the AnalysisEngineDescription of the Pear.
PackageBrowser pear = PackageInstaller.installPackage(new File("/tmp/pearInstalls/"), pearPackageFile, false);
// Path of the main component descriptor inside the installed pear
String mainComponentDesc = pear.getInstallationDescriptor().getMainComponentDesc();
XMLInputSource ae1_desc_xml = new XMLInputSource(mainComponentDesc);
AnalysisEngineDescription ae_desc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(ae1_desc_xml);

This gave the same result.

Some files in my desc directory are
langware.xml
lw_PosRule.xml
LWPOSTagger.xml
LWShallowParser0.xml
Scanners/0/DB-News-pri.xml
Scanners/0/DB-News-ts.xml
Scanners/0/DB-News-tsin.xml
SPGrammarTypes.xml
SPPriorities.xml
tt_core_typesystem.xml
tt_extension_typesystem.xml


I am also a bit confused why the main component descriptor would be the specific POS descriptor and not "/tmp/pearInstalls/com.ibm.dltj.ruleannotator/com.ibm.dltj.ruleannotator_pear.xml", which is a pear specifier and cannot be parsed as an AnalysisEngineDescription from the XMLInputSource.

Thanks for any help.

Cheers,
Toby
sgnkei

Posts: 2
Registered: Apr 28, 2009 05:49:44 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Nov 16, 2009 10:33:22 PM   in response to: t.oby's post
Toby,

lw_PosRule.xml is the main component descriptor of the LRW model,
so you need to use lw_PosRule.xml to set up the AnalysisEngineDescription.

When using the main component descriptor instead of the Pear descriptor,
you need to specify the UIMA data path and class path manually.

Here is some sample code:
// PEAR directory
final String PEAR_DIR = "pear";
// PEAR file
final String PEAR_FILE = PEAR_DIR + File.separator + "LRW_Model.pear";
// LRW data resource path
final String LRW_DATAPATH = "resources";

// Install the PEAR file
final PackageBrowser pearPkg = PackageInstaller.installPackage(
new File(PEAR_DIR), new File(PEAR_FILE), true);

// Get LRW main descriptor (aggregated) instead of PEAR descriptor
final AnalysisEngineDescription ae_desc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(
new XMLInputSource(pearPkg.getInstallationDescriptor().getMainComponentDesc()));

// Prepare a resource manager
final ResourceManager resourceMgr = UIMAFramework.newDefaultResourceManager();
// Set data path and class path
resourceMgr.setDataPath(
new File(pearPkg.getRootDirectory(), LRW_DATAPATH).getAbsolutePath());
resourceMgr.setExtensionClassPath(pearPkg.buildComponentClassPath(), true);

// Do full validation
ae_desc.doFullValidation(resourceMgr);

// Create an aggregate engine descriptor on the fly
AnalysisEngineDescription aggDesc = new AnalysisEngineDescription_impl();
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE1_LWR", ae_desc);
FixedFlow_impl flow = new FixedFlow_impl();
flow.setFixedFlow(new String[] { "AE1_LWR" });
aggDesc.getAnalysisEngineMetaData().setFlowConstraints(flow);

// Create the aggregate engine using the resource manager
final AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(aggDesc, resourceMgr, null);

Hope this helps,
Kei
t.oby

Posts: 53
Registered: Feb 19, 2009 08:57:13 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Nov 17, 2009 09:05:29 PM   in response to: sgnkei's post
Hi Kei, thanks a lot for your help. It seems I finally got the Pear AE to run. Validating the Pear AE with the resource manager (ae1) and also my own AE (ae2) worked.

Unfortunately combining both programmatically in the fixed flow setup

AnalysisEngineDescription aggDesc = new AnalysisEngineDescription_impl();
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE1", ae_desc1);
aggDesc.getDelegateAnalysisEngineSpecifiersWithImports().put("AE2", ae_desc2);
FixedFlow_impl flow = new FixedFlow_impl();
flow.setFixedFlow(new String[] { "AE1", "AE2" });
aggDesc.getAnalysisEngineMetaData().setFlowConstraints(flow);
AnalysisEngine combinedAE = UIMAFramework.produceAnalysisEngine(aggDesc, resourceM, null);


with a CPE taking a collection reader as input

CollectionReaderDescription collectionDesc = getReaderDescription(inputDir);
CollectionReader reader = UIMAFramework.produceCollectionReader(collectionDesc);
CollectionProcessingManager cpeM = UIMAFramework.newCollectionProcessingManager();
// combinedAE is the aggregate engine produced in the snippet above
cpeM.setAnalysisEngine(combinedAE);
cpeM.process(reader);


threw a NullPointerException:

java.lang.NullPointerException
	at java.util.Hashtable.containsKey(Hashtable.java:314)
	at org.apache.uima.collection.impl.cpm.engine.CPMEngine.addParallizableCasProcessor(CPMEngine.java:1049)
	at org.apache.uima.collection.impl.cpm.engine.CPMEngine.classifyCasProcessors(CPMEngine.java:1120)
	at org.apache.uima.collection.impl.cpm.engine.CPMEngine.deployCasProcessors(CPMEngine.java:1469)
	at org.apache.uima.collection.impl.cpm.BaseCPMImpl.run(BaseCPMImpl.java:458)
	at java.lang.Thread.run(Thread.java:613)


As described in an earlier post, everything works when the Pear is already installed in a fixed directory and an XML descriptor (which combines both AEs) is used in a CPE, with a CpeDescription initialized by a CpeCasProcessor which takes the XML descriptor file as input:

CpeDescription cpeDesc = CpeDescriptorFactory.produceDescriptor();
...
CpeCasProcessor casProc = CpeDescriptorFactory.produceCasProcessor("UserAE");
casProc.setDescriptor(xmlAEDescriptorFile);
CollectionProcessingEngine mCPE = UIMAFramework.produceCollectionProcessingEngine(cpeDesc);
mCPE.process();


I would be really grateful for any help.

Cheers,
Toby
sgnkei

Posts: 2
Registered: Apr 28, 2009 05:49:44 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Nov 18, 2009 02:54:43 AM   in response to: t.oby's post
Hi Toby,

It seems that you need to set the name of the aggregate engine in this case:

aggDesc.getAnalysisEngineMetaData().setName("Aggregated Engine");
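
One plausible reading of the stack trace in the previous post, offered as an assumption rather than something confirmed against the UIMA source: the CPM appears to look up each CAS processor by name in a java.util.Hashtable, and an aggregate description built in code has no name until setName is called. Hashtable rejects null keys, which the plain Java sketch below reproduces. The class and method names (NullNameDemo, isRegistered) are hypothetical and stand in for whatever the CPM does internally.

```java
import java.util.Hashtable;

// Hypothetical stand-in for a registry keyed by engine name; this is not UIMA code.
class NullNameDemo {

    // Looks up a name in the table. Hashtable.containsKey throws
    // NullPointerException when the key is null.
    static boolean isRegistered(Hashtable<String, Object> registry, String name) {
        return registry.containsKey(name);
    }

    public static void main(String[] args) {
        Hashtable<String, Object> registry = new Hashtable<>();
        registry.put("Aggregated Engine", new Object());

        // A named engine can be looked up without trouble.
        System.out.println(isRegistered(registry, "Aggregated Engine"));

        // An engine whose name was never set (null) fails inside containsKey.
        try {
            isRegistered(registry, null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for null name");
        }
    }
}
```

If that reading is right, any non-null name should avoid the exception; the exact string is not important.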

Thanks,
Kei
t.oby

Posts: 53
Registered: Feb 19, 2009 08:57:13 AM
Re: combine lwr pear with analysis engine in aggregate engine
Posted: Nov 18, 2009 06:20:33 AM   in response to: sgnkei's post
Brilliant - it's running!

Thanks for your help guys!

Best Regards,
Toby
