Dushyant
Dynamic Regex Generation for Search

2012-08-07T17:57:00Z
Hi All,

I have a general search design question I could use help with, using stock symbol search as an example:

1. Data is pre-crawled into an OmniFind index and available for searching (e.g. AAPL, MSFT, GOOG ... thousands of stock symbols).

2. Custom code runs and asks the user to provide 3 stock symbols (any 3 out of the thousands possible).

3. In a loop of 3, we then want to run a regex-based OmniFind search for each symbol. For example, if the user entered NYX, F, MSFT, we would form a simple word-boundary regex for each:

NYX --> \bNYX\b
F --> \bF\b
MSFT --> \bMSFT\b

4. The key design point here is NOT to use PEAR files uploaded ahead of time. This is for 2 reasons:

- The master list of symbols is unknown and changes often, so it's practically impossible to write a regex/PEAR for an unbounded data universe.
- The regex above is very simple (and not very useful), but the real solution will have multiple levels of regex complexity. So we want to write regex-generator Java functions that create the expression for each of the 3 symbols in the loop and fire the search against the index.
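
To make the idea concrete, here is a minimal sketch of the kind of regex-generator function I have in mind. The class and method names (SymbolRegexGenerator, toWholeWordRegex, generate) are made up for illustration and are not part of OmniFind or UIMA. It assumes a plain whole-word match on each symbol, and the actual search call is left as a placeholder because I don't know which API, if any, would accept the expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an OmniFind or UIMA class.
class SymbolRegexGenerator {

    // Builds a whole-word regex for one symbol.
    // Pattern.quote escapes regex metacharacters (e.g. the dot in "BRK.B"),
    // so user input is always treated as literal text.
    static String toWholeWordRegex(String symbol) {
        return "\\b" + Pattern.quote(symbol) + "\\b";
    }

    // Generates one expression per user-supplied symbol (the "loop of 3").
    static List<String> generate(List<String> symbols) {
        List<String> expressions = new ArrayList<>();
        for (String symbol : symbols) {
            expressions.add(toWholeWordRegex(symbol.trim()));
        }
        return expressions;
    }

    public static void main(String[] args) {
        for (String regex : generate(Arrays.asList("NYX", "F", "MSFT"))) {
            // Placeholder: this is where the expression would be handed
            // to the search call against the index.
            System.out.println(regex);
        }
    }
}
```

The real generators would be more involved than this, but the shape is the same: symbols in, expressions out, nothing defined ahead of time.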

I would appreciate thoughts on feasibility. I would expect UIMA/OmniFind to handle dynamically generated regex-based searches without having to pre-define them in PEAR files. I haven't seen examples, but I'm hoping we aren't the first ones attempting this. Thanks in advance!

Dushyant.