Is there a similar web-based content search & mining tool?
This question is not answered.

Webmining

Posts: 1
Registered: Jan 23, 2008 08:44:48 PM
Posted: Jan 23, 2008 08:58:01 PM
 
For my research, I plan to use a web-based content mining methodology: obtain the frequency of keyword occurrences on a set of target websites, and then combine those frequencies with factor analysis.

This approach requires a tool, the Inventive Firms API (IFAPI, which uses Google's SOAP Search API), to conduct automated website searches. The tool was introduced in the following article:
- Diana Hicks, Dirk Libaers, Alan Porter, and David Schoeneck. 2006. Identification of the Technology Commercialization Strategies of High-tech Small Firms. No. 289. http://www.sba.gov/advo/research/rs289tot.pdf.

As described in this article (pages 7-9), the basic process of using IFAPI to obtain the frequency of keyword occurrences on the target websites is as follows:
- Locate the target web addresses;
- Identify the keywords of interest;
- Use IFAPI to search for each keyword on each target website, and obtain the hit count for each term on each firm's website as well as the total number of pages on that website (so that the hit counts can be normalized by website size).
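
To make the process above concrete, here is a minimal Python sketch of the batch loop and the normalization step. It is not IFAPI's actual code. The function name `normalized_hit_counts` and the two callables `count_hits` and `count_pages` are hypothetical placeholders for whatever search backend is available, and dividing the hit count by the total page count is only my assumption about what "normalized by size of website" means.

```python
from typing import Callable, Dict, Iterable


def normalized_hit_counts(
    sites: Iterable[str],
    keywords: Iterable[str],
    count_hits: Callable[[str, str], int],
    count_pages: Callable[[str], int],
) -> Dict[str, Dict[str, float]]:
    """Return {site: {keyword: hits / total_pages}} for every site/keyword pair.

    count_hits(site, keyword) and count_pages(site) are hypothetical hooks
    that must be supplied by the caller (e.g. wrappers around a search API).
    """
    keyword_list = list(keywords)  # allow the keywords to be reused per site
    table: Dict[str, Dict[str, float]] = {}
    for site in sites:
        total_pages = count_pages(site)
        row: Dict[str, float] = {}
        for keyword in keyword_list:
            hits = count_hits(site, keyword)
            # Assumed normalization: hits per indexed page; 0.0 if no pages.
            row[keyword] = hits / total_pages if total_pages > 0 else 0.0
        table[site] = row
    return table
```

The resulting site-by-keyword table is the kind of input that a factor analysis would then take. Passing the search functions in as parameters keeps the loop independent of any particular search service.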

There are currently two major obstacles:
1) Google has enacted a new policy: "As of December 5, 2006, we are no longer issuing new API keys for the SOAP Search API. Developers with existing SOAP Search API keys will not be affected." (http://code.google.com/apis/soapsearch/)
2) There is no plan to integrate IFAPI with Google's new AJAX Search API in the foreseeable future.

Therefore, in order to continue the research, it looks like I have two options:
- Find another search tool with similar functions;
- Or run the searches one by one in Google by hand, which would be a tremendous workload.

So, could you please tell me:
- Is there a similar web-based content search & mining tool that can conduct automatic batch processing?
- Does the IBM Unstructured Information Modeler have similar functions?

Thanks in advance.
