3 replies. Latest post: 2012-01-17T13:21:32Z by reddz
reddz

Unescaped HTML Characters in the Description of Search Results

2011-12-22T15:08:10Z
Hi all,
While crawling RSS feeds with the Web crawler, the search results show unescaped HTML
(<script>) characters in the description, although the actual RSS feeds don't contain these HTML characters.

I tried to unescape the HTML characters from the plug-in, but in vain.
I also tried adding rules to the HTML parser, since the parser in use is the HTML parser:
https://www-304.ibm.com/support/docview.wss?q1=htmlparser&rs=63&uid=swg27011251&context=SS5SQ7&cs=utf-8&lang=en&loc=en_US

That didn't yield any results either.

Can you please guide me on how to proceed further with this issue?

The server is OmniFind Enterprise Edition 9.1.
Updated on 2012-01-17T13:21:32Z by reddz
  • bfoyle

    Re: Unescaped HTML Characters in the Description of Search Results

    2012-01-09T22:59:09Z in response to reddz
    You might be able to use field filters to change or remove these character sequences. I have used this feature in the past to change things like &amp; to &, etc.
    • bfoyle

      Re: Unescaped HTML Characters in the Description of Search Results

      2012-01-10T16:14:02Z in response to bfoyle
      I have been informed that OEE doesn't have the field filtering that was introduced in ICA, so you may have to move to an ICA search collection if you want to use field filtering. If that's not possible, I think you will need to write a crawler plug-in to find those escaped characters.
      • reddz

        Re: Unescaped HTML Characters in the Description of Search Results

        2012-01-17T13:21:32Z in response to bfoyle
        Hi bfoyle,

        Thanks for the suggestion. We have handled it through a custom plug-in.

        Regards,
        Reddz
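
The thread does not show the plug-in that resolved the issue. Purely as an illustration of the kind of clean-up discussed above (decoding entity-encoded markup such as &lt;script&gt; and then removing the tags), here is a minimal Python sketch. It is not the poster's code and it does not use the OmniFind crawler plug-in API; the function name `clean_description` and the sample input are hypothetical, and the regular expressions are a rough approximation rather than a full HTML parser.

```python
import html
import re

# <script>...</script> and <style>...</style> blocks, contents included.
_SCRIPT_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
# Any remaining tag, e.g. <b>, </p>, <br/>.
_TAG = re.compile(r"<[^>]+>")
_WHITESPACE = re.compile(r"\s+")


def clean_description(raw: str) -> str:
    """Return plain text for a feed description (hypothetical helper)."""
    # Assumption: the feed carries markup as entities (&lt;script&gt;),
    # so decode once to turn it into real tags that can be removed.
    text = html.unescape(raw)
    # Drop script/style blocks together with their contents.
    text = _SCRIPT_STYLE.sub(" ", text)
    # Drop the remaining tags but keep the text between them.
    text = _TAG.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return _WHITESPACE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = (
        "Fish &amp; chips &lt;script&gt;alert(1)&lt;/script&gt; "
        "&lt;b&gt;today&lt;/b&gt;"
    )
    print(clean_description(sample))
```

The sketch decodes entities only once, so double-escaped input (for example &amp;lt;b&amp;gt;) would come out as &lt;b&gt; rather than being stripped. The result is plain text, so whatever renders the search result would still need to escape it for display.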