Does anybody know whether it is possible to analyze the metadata of picture files (such as *.jpg or *.bmp) with Content Analytics? I created a text analytics collection and a crawler over mixed files (*.txt and *.jpg). The crawler finished and picked up all the data (10 documents: 5 txt files and 5 jpg files). The next step was parsing and indexing the files. There I got 5 parsed and indexed files (all the txt files) but also 5 dropped files (the jpg files).
Does anybody know how I can get the picture files analyzed too?
8 replies Latest Post - 2012-06-07T14:39:33Z by bfoyle
Pinned topic Question about analyzing pictures with Content Analytics
bfoyle 060001WDQ360 Posts
Re: Question about analyzing pictures with Content Analytics
2011-04-06T07:06:30Z in response to andy100
What I want to do is analyze the metadata of pictures. Let me give an example: you have a picture file on your hard disk (say test123.jpg) and right-click on it. Click "Properties" and switch to the Summary tab (advanced mode); I used a Windows system. There is a lot of information there, such as "Title", "Subject", "Author", "Comments" and much more.
To analyze this kind of data I want to crawl picture files (*.jpg), parse and index them, and analyze the metadata structure. I don't know whether this kind of analysis is possible within Content Analytics.
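Before looking at the parser side, it can help to confirm that the pictures actually carry embedded metadata. The following is only a rough sketch in Python, independent of Content Analytics, and it assumes the properties are stored inside the JPEG as EXIF or XMP segments (which may not be true for every file). The function name `list_jpeg_metadata_segments` is made up for this illustration. It walks the JPEG segment headers and reports which metadata segments appear before the image data.

```python
import struct

# Hypothetical helper for illustration; not part of Content Analytics.
def list_jpeg_metadata_segments(data):
    """Return labels for metadata segments found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # unexpected byte; this sketch does not try to recover
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):
            break  # SOS (start of scan) or EOI: no more header segments
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.append("EXIF")
        elif marker == 0xE1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            found.append("XMP")
        elif marker == 0xFE:
            found.append("COMMENT")
        pos += 2 + length
    return found
```

To try it on a real file, read the bytes first, for example `list_jpeg_metadata_segments(open("test123.jpg", "rb").read())`. An empty list would suggest there is no embedded metadata for any parser to extract.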
Mitch DeFelice 0600027BV77 Posts ACCEPTED ANSWER
Re: Question about analyzing pictures with Content Analytics
2011-07-12T18:53:04Z in response to andy100
Have you tried removing the .jpg extension from the exclusion list?
When setting up a crawler, I noticed that there is an edit option for configuring an individual Windows subdirectory. When you edit the subdirectory, there is a list of file extensions to exclude. That list includes the .gif & .jpeg extensions.
Re: Question about analyzing pictures with Content Analytics
2011-07-13T13:06:27Z in response to Mitch DeFelice
Hi Mitch,
thanks for your input. To answer your question: yes, I did remove the *.jpg and *.jpeg extensions from the exclude list and added them to the include list, which I use explicitly. That list contains only the *.txt, *.jpg and *.jpeg file extensions to crawl. But the problem with the *.jpg and *.jpeg files still exists.
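As a cross-check outside the product, one could count which files on disk match that include list and compare the number with what the crawler reports. This is just a sketch in Python; it does not reproduce the crawler's own matching logic, and the case-insensitive comparison is an assumption on my part. The names `INCLUDE_PATTERNS` and `matches_include_list` are invented for the example.

```python
from fnmatch import fnmatch

# The three patterns mentioned in the post above.
INCLUDE_PATTERNS = ["*.txt", "*.jpg", "*.jpeg"]

# Hypothetical helper; the crawler's real matching rules may differ.
def matches_include_list(filename, patterns=INCLUDE_PATTERNS):
    """Return True if the file name matches any include pattern."""
    name = filename.lower()  # assumption: extensions compare case-insensitively
    return any(fnmatch(name, pattern) for pattern in patterns)
```

If the count of matching files equals the number of crawled documents, the include list is doing its job and the drop is happening later, at the parse stage.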
SystemAdmin 110000D4XK197 Posts ACCEPTED ANSWER
Re: Question about analyzing pictures with Content Analytics
2011-08-02T00:42:22Z in response to andy100
Here is the link to the top-level page on how to configure document format detection and parser assignment. I believe that you need to configure the parser to accept JPEG image files.
mauriziog 110000D3455 Posts ACCEPTED ANSWER
Re: Question about analyzing pictures with Content Analytics
2012-04-06T14:23:05Z in response to andy100
I was able to gather and index metadata from image files in OmniFind and search for them.
Take a look at this OmniFind thread to see whether it can help you:
bfoyle 060001WDQ360 Posts ACCEPTED ANSWER
Re: Question about analyzing pictures with Content Analytics
2012-06-07T14:39:33Z in response to mauriziog
I can add some information here now that version 3.0 has been released.
We've added some capabilities to support binary images and some multimedia.
Here is the link to the "what's new in 3.0" page on the infocenter.