8 replies Latest Post - 2012-06-07T14:39:33Z by bfoyle
andy100
17 Posts

Pinned topic Question about analyzing pictures with Content Analytics

2011-03-23T10:32:54Z
Hi,

Does anybody know whether it is possible to analyze the metadata of picture files (such as *.jpg and *.bmp) with Content Analytics? I created a text analytics collection and a crawler for a mix of files (*.txt and *.jpg). The crawler finished and picked up all the data (10 documents: 5 txt files and 5 jpg files). The next step was parsing and indexing the files. The result was 5 parsed and indexed files (all the txt files) and 5 dropped files (the jpg files).

Does anybody know how I can get the picture files analyzed as well?
Updated on 2012-06-07T14:39:33Z by bfoyle
  • bfoyle
    60 Posts

    Re: Question about analyzing pictures with Content Analytics

    2011-04-03T17:36:19Z in response to andy100
    Andy, help me understand what you are trying to analyze about the .jpg files, and we can figure out how to get you to the outcome you are trying to reach.
  • andy100
    17 Posts

    Re: Question about analyzing pictures with Content Analytics

    2011-04-06T07:06:30Z in response to andy100
    What I want to do is analyze the metadata of pictures. Let me give an example: take a picture file on your hard disk (say test123.jpg), right-click it, choose "Properties", and switch to the Summary tab in advanced mode (I used a Windows system). It shows a lot of information, such as "Title", "Subject", "Author", "Comments", and much more.

    To analyze this kind of data, I want to crawl picture files (*.jpg), parse and index them, and analyze the metadata structure. Is this kind of analysis possible within Content Analytics?

    Regards,

    Andy
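
For readers who want to check whether a given .jpg actually carries embedded metadata before crawling it, here is a minimal Python sketch. It is not part of Content Analytics. It assumes that properties like the ones above live inside the file, typically in an Exif block (APP1 segment) or as a plain comment (COM segment); whether the Windows dialog writes them there was not verified here. The function names `jpeg_segments` and `summarize_metadata` are made up for this illustration, and the scan only walks the header segments before the image data.

```python
import struct

SOI, EOI, SOS = 0xD8, 0xD9, 0xDA   # start of image, end of image, start of scan
APP1, COM = 0xE1, 0xFE             # Exif/XMP container, plain comment


def jpeg_segments(data: bytes):
    """Yield (marker, payload) for each header segment before the image scan."""
    if data[:2] != bytes([0xFF, SOI]):
        raise ValueError("not a JPEG: SOI marker missing")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker in (EOI, SOS):
            break  # no more header segments after this point
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def summarize_metadata(data: bytes) -> dict:
    """Report comment text and whether an Exif block is present."""
    summary = {"comments": [], "has_exif": False}
    for marker, payload in jpeg_segments(data):
        if marker == COM:
            summary["comments"].append(payload.decode("latin-1"))
        elif marker == APP1 and payload.startswith(b"Exif\x00\x00"):
            summary["has_exif"] = True
    return summary
```

A call such as `summarize_metadata(open("test123.jpg", "rb").read())` would show whether the file has an Exif block at all. Decoding the individual Exif fields is a separate step that this sketch leaves out.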
    • Mitch DeFelice
      7 Posts

      Re: Question about analyzing pictures with Content Analytics

      2011-07-12T18:53:04Z in response to andy100
      Have you tried removing the .jpg extension from the exclusion list?

      I noticed when setting up a crawler that there is an edit option for configuring an individual Windows subdirectory. When you edit the subdirectory, there is a list of file extensions to exclude. That list contains the .gif and .jpeg extensions.

      Regards,

      Mitch
      • andy100
        17 Posts

        Re: Question about analyzing pictures with Content Analytics

        2011-07-13T13:06:27Z in response to Mitch DeFelice
        Hi Mitch,

        Thanks for your input. To answer your question: yes, I removed the *.jpg and *.jpeg extensions from the exclude list and added them to the include list, which I use explicitly. That list contains only the *.txt, *.jpg, and *.jpeg file extensions to crawl. But the problem with crawling *.jpg or *.jpeg files persists.

        Best Regards,
        Andy
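
The include and exclude lists discussed in the two posts above can be pictured with a small Python sketch. This is only an illustration, not how Content Analytics implements its crawler: the function name `should_crawl` and the rule that an exclusion beats an inclusion are assumptions. It shows that a `*.jpeg` pattern on its own does not cover `.jpg` files, so both spellings need to be listed.

```python
from fnmatch import fnmatchcase


def should_crawl(filename: str, include: list[str], exclude: list[str]) -> bool:
    """Decide whether a file passes hypothetical include/exclude extension lists.

    Assumption: a match in the exclude list always wins; otherwise the file
    must match at least one include pattern. Matching ignores letter case.
    """
    name = filename.lower()
    if any(fnmatchcase(name, pattern.lower()) for pattern in exclude):
        return False
    return any(fnmatchcase(name, pattern.lower()) for pattern in include)


include = ["*.txt", "*.jpg", "*.jpeg"]   # the include list described above
for name in ["notes.txt", "test123.jpg", "scan.JPEG", "logo.gif"]:
    print(name, should_crawl(name, include, exclude=[]))
```

With these lists the pictures pass the crawler filter, which fits what was reported at the start of the thread: the crawler picked up all 10 documents, and the jpg files were dropped later, during parsing and indexing.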
  • SystemAdmin
    197 Posts

    Re: Question about analyzing pictures with Content Analytics

    2011-08-02T00:42:22Z in response to andy100
    Here is the link to the top page on how to configure document format detection and parser assignment. I believe that you need to configure the parser to accept JPEG image files.
  • andy100
    17 Posts

    Re: Question about analyzing pictures with Content Analytics

    2011-08-11T13:34:03Z in response to andy100
    Thanks for your input! I'll try that :-)
  • mauriziog
    5 Posts

    Re: Question about analyzing pictures with Content Analytics

    2012-04-06T14:23:05Z in response to andy100
    I was able to gather and index metadata from image files in OmniFind and search for them.
    Take a look at this OmniFind thread to see whether it helps:
    https://www.ibm.com/developerworks/forums/thread.jspa?messageID=14043745&#14043745
    • bfoyle
      60 Posts

      Re: Question about analyzing pictures with Content Analytics

      2012-06-07T14:39:33Z in response to mauriziog
      I can add some information here now that version 3.0 is released.

      We've added some capabilities to support binary images and some multimedia.

      Here is the link to the "What's new in 3.0" page in the information center.

      http://pic.dhe.ibm.com/infocenter/analytic/v3r0m0/topic/com.ibm.discovery.es.common.doc/iiysawhatsnew.htm