  • Latest Post - ‏2012-09-24T18:44:22Z by SystemAdmin
D7NU_rohit_haritash
16 Posts

Pinned topic AQL Capabilities and syntax (For Sentiment Analysis)

‏2012-09-21T11:47:45Z |
Hi
Is it possible to use GROUP BY with the EXTRACT statement?

I am extracting the different parts of speech from Twitter feeds. The problem is that I am getting all of them (e.g. verbs and adjectives) in one collection, and I am not able to group them on a per-tweet basis.

E.g.:
create view adverbs as
extract parts_of_speech 'RB' with language 'en'
on P.queryTweets as adverbs
from queryOutputDisplay P; -- Can I use GROUP BY here to group the adverbs by tweet?

queryOutputDisplay contains a number of rows; each row contains a single tweet.

Thanks
Updated on 2012-09-24T18:44:22Z at 2012-09-24T18:44:22Z by SystemAdmin
  • SystemAdmin
    603 Posts

    Re: AQL Capabilities and syntax (For Sentiment Analysis)

    ‏2012-09-24T18:44:22Z  
    Hi Rohit,

    Please see the answer below, which I obtained from development:

    "The GROUP BY clause cannot be used directly as part of an EXTRACT statement. Refer to the general form of the EXTRACT statement here: http://pic.dhe.ibm.com/infocenter/bigins/v1r4/topic/com.ibm.swg.im.infosphere.biginsights.text.doc/doc/biginsights_aqlref_ref_extract-statement.html

    However, you can simply wrap the EXTRACT statement in a view, and use GROUP BY in a SELECT statement that operates on that view, as in:

    create view AdverbsGroup as
    select ...
    from Adverbs A
    group by A. ... ;

    An example AQL snippet along these lines is available in the documentation for the GROUP BY statement here: http://pic.dhe.ibm.com/infocenter/bigins/v1r4/topic/com.ibm.swg.im.infosphere.biginsights.text.doc/doc/biginsights_aqlref_ref_group-by-clause.html

    However, your message reveals a fundamental disconnect between the high-level goal of the application and the way the analytics are implemented. If you wish to extract information from Twitter messages, then the extractor should be applied to a single Twitter message at a time. That is, the value of the "text" attribute of the view Document should be the text of a single Twitter message. The extractor should not be applied to a file containing multiple Twitter messages at a time (i.e., the value of Document.text should not contain the text of multiple Twitter messages). In other words, you should not be in a situation in which you need to group extraction results based on the individual messages within a file containing multiple messages."
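
    For illustration, the wrap-then-group pattern described in the reply could be filled in roughly as follows. This is only a sketch and has not been run against a BigInsights installation. The column names tweet, adverb, and numAdverbs are hypothetical, and it assumes that the select list of the EXTRACT statement can pass P.queryTweets through unchanged and that the Count aggregate behaves as described in the AQL reference linked above.

    ```aql
    -- Sketch only: the column names tweet, adverb, and numAdverbs are hypothetical.
    -- Carry the source tweet along with each extracted adverb so that there is
    -- a column to group on later.
    create view Adverbs as
      extract P.queryTweets as tweet,
              parts_of_speech 'RB' with language 'en'
              on P.queryTweets as adverb
      from queryOutputDisplay P;

    -- GROUP BY is not allowed inside EXTRACT, so it goes in a SELECT over the view.
    create view AdverbsGroup as
      select A.tweet as tweet,
             Count(A.adverb) as numAdverbs
      from Adverbs A
      group by A.tweet;

    output view AdverbsGroup;
    ```

    If each tweet is instead supplied as its own document, as the reply recommends, the extraction would presumably run on D.text from Document D, and the grouping step should not be needed at all.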