Is it possible to use GROUP BY with the EXTRACT statement?
I am extracting the different parts of speech from Twitter feeds. The problem is that I am getting all of the matches (e.g. all the verbs and adjectives) across the whole collection, and I am not able to group them on a per-tweet basis.
For example:
create view adverbs as
extract parts_of_speech 'RB' with language 'en'
on P.queryTweets as adverbs
from queryOutputDisplay P; -- Can I use GROUP BY to group the adverbs by tweet?
queryOutputDisplay contains a number of rows; each row contains a single tweet.
1 reply Latest Post - 2012-09-24T18:44:22Z by SystemAdmin
Pinned topic Aql Capabilites and syntax (For Sentiment Analysis)
SystemAdmin, ACCEPTED ANSWER
Re: Aql Capabilites and syntax (For Sentiment Analysis), 2012-09-24T18:44:22Z, in response to D7NU_rohit_haritash
Hi Rohit,
Please see the answer below, which I obtained from development:
"The GROUP BY clause cannot be used directly as part of an EXTRACT statement. Refer to the general form of the EXTRACT statement here: http://pic.dhe.ibm.com/infocenter/bigins/v1r4/topic/com.ibm.swg.im.infosphere.biginsights.text.doc/doc/biginsights_aqlref_ref_extract-statement.html
However, you can simply wrap the EXTRACT statement in a view, and use GROUP BY in a SELECT statement that operates on that view, as in:
create view AdverbsGroup as
select ...
from Adverbs A
group by A. ... ;
An example AQL snippet along these lines is available in the documentation for the GROUP BY statement here: http://pic.dhe.ibm.com/infocenter/bigins/v1r4/topic/com.ibm.swg.im.infosphere.biginsights.text.doc/doc/biginsights_aqlref_ref_group-by-clause.html
However, your message reveals a fundamental disconnect between the high-level goal of the application and the way the analytics are implemented. If you wish to extract information from Twitter messages, then the extractor should be applied to a single Twitter message at a time. That is, the value of the "text" attribute of the view Document should be the text of a single Twitter message. The extractor should not be applied to a file containing multiple Twitter messages at once (i.e., the value of Document.text should not contain the text of multiple Twitter messages). In other words, you should not be in a situation in which you need to group extraction results by the individual messages within a file containing multiple messages."
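To make the two suggestions in the answer concrete, here is a rough, untested sketch. The view names Adverbs, AdverbsPerTweet and AdverbsInDocument, and the column names tweet, adverb and adverbCount, are made up for illustration; queryOutputDisplay and queryTweets are taken from the question. It assumes that the EXTRACT statement accepts a pass-through select list before the extraction specification and that the Count and List aggregate functions are available, as described in the AQL reference linked above; please check both against your BigInsights version.

```aql
-- Option 1 (workaround): carry the tweet through the extraction,
-- then group in a separate SELECT over that view.
create view Adverbs as
  extract P.queryTweets as tweet,   -- pass-through column (assumed name)
    parts_of_speech 'RB' with language 'en'
    on P.queryTweets as adverb
  from queryOutputDisplay P;

create view AdverbsPerTweet as
  select GetText(A.tweet) as tweet,
         Count(A.adverb) as adverbCount,  -- number of adverbs in the tweet
         List(A.adverb) as adverbs        -- all adverb spans of the tweet
  from Adverbs A
  group by GetText(A.tweet);

-- Option 2 (recommended in the answer): feed one tweet per document,
-- so Document.text holds a single tweet and no grouping is needed.
create view AdverbsInDocument as
  extract parts_of_speech 'RB' with language 'en'
    on D.text as adverb
  from Document D;
```

Note that Option 1 groups on the tweet text, so two tweets with identical text would fall into the same group. Option 2 avoids that problem, because every result tuple already belongs to exactly one document.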