

IBM Unstructured Information Modeler


     

 
 

Replies: 2 - Pages: 1 - Last Post: Jan 16, 2008 11:28 AM - Last Post By: spangles
Zeno Davatz

Posts: 2
Registered: Aug 09, 2006 12:02:26 PM
IBM Unstructured Information Modeler
Posted: May 07, 2007 10:37:51 AM
Hi There!

I just read this post, interesting stuff.
http://alphaworks.ibm.com/tech/uimodeler/

I would like to add my 5 cents about unstructured information:

Talking about Unstructured Information (Ontology, Search and Taxonomy), the only software I know of that really works is InfoCodex. InfoCodex comes with a linguistic database of 3 million words and terms in German, French, English, Italian and Spanish. What does this mean? It means InfoCodex can actually do a cross-language search and find similar documents in another language. This is true cross-language search. As far as I know, none of Autonomy, Fast or OmniFind can do that. InfoCodex will also produce a graphical heat map of all the contents in any given collection. Also see: InfoCodex Procedure

Posts: 1
Registered: Apr 27, 2007 10:49:44 AM
Re: IBM Unstructured Information Modeler
Posted: May 15, 2007 12:48:17 PM   in response to: Zeno Davatz's post
Is it possible that you could re-compile so that the parser accepts accented letters? At the moment it treats them as word boundaries.
spangles

Posts: 4
Registered: Jul 25, 2005 09:43:59 AM
Re: IBM Unstructured Information Modeler
Posted: Jan 16, 2008 11:28:31 AM
It's true that this version of the software is somewhat limited, in that it only works on ASCII text. There is an upgrade that will allow it to work on Unicode:

First you need to download the package "icu4j.jar", which contains the Unicode parser:

http://www.icu-project.org/download/

Then start up UIM using the following command line call (the ";" classpath separator shown here is the Windows form; on Linux or Mac OS X use ":" instead):

java -cp uimodeler.jar;icu4j.jar -Dunicode=yes com.ibm.cv.text.EAdvisor

This should cause it to parse the text input as Unicode instead of ASCII.
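To illustrate why an ASCII-only parser treats accented letters as word boundaries, here is a minimal sketch. It is not UIM's actual parser: the class name TokenizerSketch and both methods are made up for illustration, and it uses only the standard Java library rather than ICU4J. The ASCII variant ends a word at any character outside A-Z and a-z, while the Unicode variant relies on Character.isLetter, which also accepts letters such as é and ü.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical tokenizer, not the one shipped in uimodeler.jar.
public class TokenizerSketch {

    // ASCII-only rule: any character outside A-Z / a-z ends the current word.
    static List<String> asciiTokens(String text) {
        return split(text, false);
    }

    // Unicode-aware rule: Character.isLetter also accepts accented letters.
    static List<String> unicodeTokens(String text) {
        return split(text, true);
    }

    private static List<String> split(String text, boolean unicode) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean letter = unicode
                    ? Character.isLetter(cp)
                    : (cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z');
            if (letter) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-letter closes the word collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Escapes are used so the example does not depend on source file encoding.
        String text = "caf\u00e9 M\u00fcnchen";
        System.out.println(asciiTokens(text));   // [caf, M, nchen]
        System.out.println(unicodeTokens(text)); // both words stay whole
    }
}
```

With the ASCII rule the two words break into three fragments, which matches the behaviour reported above. With the Unicode rule they stay intact.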
