Topic
  • 8 replies
  • Latest Post - ‏2012-06-07T14:39:33Z by bfoyle
andy100
17 Posts

Pinned topic Question about analyzing pictures with Content Analytics

2011-03-23T10:32:54Z
Hi,

Does anybody know whether it is possible to analyze the metadata of picture files (such as *.jpg or *.bmp) with Content Analytics? I created a text analytics collection and a crawler for a mix of files (*.txt and *.jpg). The crawler finished and picked up all the data (10 documents: 5 txt files and 5 jpg files). Next, I parsed and indexed the files. The result was 5 parsed and indexed files (all of them txt files) and 5 dropped files (the jpg files).

Does anybody know how I can get the picture files analyzed as well?
  • bfoyle
    60 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-04-03T17:36:19Z  
    Andy, help me understand what you are trying to analyze about the .jpg files and we can figure out how to get you to the outcome you are trying to reach.
  • andy100
    17 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-04-06T07:06:30Z  
    What I want to do is analyze the metadata of pictures. Let me give an example: you have a picture file on your hard disk (say, test123.jpg). Right-click it, choose "Properties", and switch to the Summary tab in advanced mode (I used a Windows system). It shows a lot of information such as "Title", "Subject", "Author", "Comments", and much more.

    To analyze this kind of data, I want to crawl picture files (*.jpg), parse and index them, and analyze the metadata structure. Is this kind of analysis possible within Content Analytics?

    Regards,

    Andy
  • Mitch DeFelice
    7 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-07-12T18:53:04Z  
    Have you tried removing the .jpg extension from the exclusion list?

    I noticed when setting up a crawler that there is an edit option to configure an individual Windows subdirectory. When editing the subdirectory, there is a list of file extensions to exclude. That list contains the .gif and .jpeg extensions.

    Regards,

    Mitch
  • andy100
    17 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-07-13T13:06:27Z  
    Hi Mitch,

    Thanks for your input. To answer your question: yes, I removed the *.jpg and *.jpeg extensions from the exclude list and added them to the include list, which I use explicitly. This list contains only the *.txt, *.jpg and *.jpeg file extensions to crawl. But the problem with the *.jpg and *.jpeg files still persists.

    Best Regards,
    Andy
  • SystemAdmin
    197 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-08-02T00:42:22Z  
    Here is the link to the top page on how to configure document format detection and parser assignment. I believe that you need to configure the parser to accept JPEG image files.
  • andy100
    17 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2011-08-11T13:34:03Z  
    Thanks for your input! I'll try that :-)
  • mauriziog
    5 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2012-04-06T14:23:05Z  
    I was able to gather and index metadata from image files in OmniFind and search for them.
    Have a look at this OmniFind thread to see whether it helps you:
    https://www.ibm.com/developerworks/forums/thread.jspa?messageID=14043745&#14043745
  • bfoyle
    60 Posts

    Re: Question about analyzing pictures with Content Analytics

    ‏2012-06-07T14:39:33Z  
    I can add some additional information here now that version 3.0 is released.

    We've added some additional capabilities for binary image support and for some multimedia content.

    Here is the link to the "what's new in 3.0" page on the infocenter.

    http://pic.dhe.ibm.com/infocenter/analytic/v3r0m0/topic/com.ibm.discovery.es.common.doc/iiysawhatsnew.htm