Big data in motion
JacquesRoy 120000A2MS 1,492 Views
This has been in the works for quite a while, but now it’s out!
This new version adds several interesting new features, including:
- Streaming data to Excel
- Easy setup for high-availability
- Resilient processing with the consistent region annotation
- Toolkits enhancements
Streaming data to Microsoft Excel makes it easy to create user interfaces that give real-time feedback on what’s happening. It also puts all of Excel's capabilities at your disposal for additional processing of the data received.
A lot has been done on the high-availability front. It is much easier to set up redundant administrative services and have them fail over automatically when needed. In addition, there is no need for a DB2 database. Instead, Streams now relies on ZooKeeper to preserve all the state information. Also, to continue to improve high availability, Streams no longer requires a shared file system.
There is a new feature that guarantees at-least-once processing of tuples within a region, that is, a set of operators. It is easy to use: we simply add annotations that define the region and set a few parameters.
There have been enhancements to existing toolkits, and new ones have been added, such as support for Kafka in the messaging toolkit and the new HBase toolkit.
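To make the idea of at-least-once processing concrete, here is a minimal sketch in Python. It is not Streams code and does not use the Streams annotations; the names `run_region`, `make_flaky_operator`, and `TransientFailure` are hypothetical and exist only for this illustration. The assumption is that a region periodically checkpoints its progress and, when an operator fails, resets to the last checkpoint and replays the tuples that came after it.

```python
from collections import Counter


class TransientFailure(Exception):
    """Simulated failure of an operator inside the region (hypothetical)."""


def run_region(tuples, process, checkpoint_every=3, max_resets=10):
    """Process every tuple at least once.

    Progress is checkpointed after each batch. If processing fails partway
    through a batch, the region resets to the last checkpoint and replays
    the whole batch, so some tuples may be processed more than once.
    """
    committed = 0  # number of tuples covered by the last checkpoint
    resets = 0
    while committed < len(tuples):
        end = min(committed + checkpoint_every, len(tuples))
        try:
            for item in tuples[committed:end]:
                process(item)
        except TransientFailure:
            resets += 1
            if resets > max_resets:
                raise
            continue  # reset: replay from the last checkpoint
        committed = end  # checkpoint: this batch is safely done
    return resets


def make_flaky_operator(fail_once_on):
    """Build an operator that fails the first time it sees certain tuples."""
    seen = Counter()
    already_failed = set()

    def process(item):
        if item in fail_once_on and item not in already_failed:
            already_failed.add(item)
            raise TransientFailure(item)
        seen[item] += 1

    return process, seen


process, seen = make_flaky_operator(fail_once_on={4})
resets = run_region(list(range(10)), process)
print(resets, dict(seen))
```

In this run the operator fails once on tuple 4, so the batch holding tuples 3 to 5 is replayed and tuple 3 ends up processed twice. No tuple is lost, but duplicates are possible, which is exactly what "at least once" means and why downstream logic should tolerate repeated tuples.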
There is more to the new release of Streams. You can find the online documentation in the knowledge center at:
To get an idea of what’s new in this release, take a look at:
JacquesRoy 120000A2MS 1,013 Views
The general session started with an example of context computing and an interview with Captain Phillips.
All that was pretty exciting, but what stole the show was the announcement of the partnership
Then I went on my way to attend Streams sessions talking about use cases.
The first one I attended was about a partner, Voci, that has an appliance that converts audio to text.
The next session was a panel of experts on geospatial analytics.
In the afternoon, I attended a session on the features of the new Streams beta that was announced last Friday.
I followed with a session on context computing used to counter fraud. I finished my day
The conference is winding down with the last day tomorrow.
JacquesRoy 120000A2MS 1,147 Views
Another full day.
It started at 7:00 with a breakfast meeting and was followed by a conference call.
"The Power of Now: Real-Time Analytics and IBM InfoSphere Streams"
My afternoon was taken by a Streams and text analytics lab.
I went back to the conference floor and had interesting conversations with many technical people
I'll be able to catch up on some Streams sessions tomorrow. I can't wait to hear some customer and partner stories
Also, I heard through the grapevine that there may be a big announcement at the general session.
JacquesRoy 120000A2MS 761 Views
After walking by 3 different Starbucks, I arrived at the conference breakfast hall.
Then it was time to attend the general session that started at 8:15.
Multiple speakers expanded on these themes.
I particularly liked the line: "Geospatial data will become analytics superfood".
There were many interesting sessions to choose from but because of multiple engagements, I only attended
There was so much going on that, if you are not at the conference, you may want to look for InsightGo to attend some general sessions remotely.
Now it's time to move on to Tuesday!
JacquesRoy 120000A2MS 1,103 Views
The event went as planned at the Mandalay Bay convention center with presentations on:
Many people attended and were engaged in the presentations. Overall a success.
The Insight conference officially started with the opening reception.
JacquesRoy 120000A2MS 993 Views
We're up and going.
The conference is still being set up, but there are events happening this Saturday.
All sorts of other sessions are taking place in other areas of the Mandalay Bay convention center.
If you are already in Las Vegas for the Insight conference, this would be a good use of your time.
Finally, Sunday evening, the Insight conference officially starts with the Solution EXPO Grand Opening Reception
I'll post comments on the conference daily so, stay tuned!
JacquesRoy 120000A2MS 904 Views
We are barely more than two weeks away from the Insight conference.
As you know, Streams is excellent at providing real-time analytics. It can be used with other
It happens that I'll be participating in an IoT deep dive on Sunday, October 26.
I'll be joining the main speakers:
The technical section is divided into three parts:
You can register for the event at: http://insight-deep-dive.eventbrite.com
Don't forget to come see me at Insight in my sessions and labs as well as a book signing
The book is: "The Power of Now: Real-Time Analytics and IBM InfoSphere Streams"
See you in Vegas!
JacquesRoy 120000A2MS 1,549 Views
Ok, this is probably not news to you but there is information you should know.
The Insight conference, formerly known as Information on Demand (IOD), is going on Oct 26-30.
For the week, I am particularly interested in the Streams sessions such as:
Just to name a few. I am involved in a few sessions:
The other exciting part for me is that I am coming out with a new book:
I am doing a book signing on Tuesday between 9:30 and 10:30.
The Insight conference provides excellent learning opportunities on many subjects, including cloud, mobile/social, security, analytics, and more.
It is also a great opportunity to network with experts from IBM, partners, and other customers.
A while back, I started reading a book called "Thinking, Fast and Slow" by Daniel Kahneman.
Daniel Kahneman is a professor of psychology who won a Nobel Prize in economics.
I have to admit, I am not done reading it. I need more "plane" time
Today, I just want to relate some parts of chapter 14 where he put together a test to see how people would classify individuals
"Tom W is of high intelligence, although lacking in true creativity.
After reading the description, the subject was asked to figure out which field of study Tom was most likely in.
The description was actually designed so people should rank computer science among the best fitting
I laughed out loud when I read that part. I immediately thought of one of my co-workers, Robert U., who
For those who read this blog: if you make corny jokes or puns and graduated in computer science, rejoice.
The book is full of interesting information including the fact that even statisticians can misuse/misinterpret statistics.
"you dispose of a limited budget of attention that you can allocate to activities. . .
My conclusion: if someone tells you they are multitasking, they are doing trivial work.
JacquesRoy 120000A2MS 1,036 Views
When we talk about processing data in real time, it is easy to just write a program and be done with it.
A program is easy to write when it can process records sequentially. Once you reach the limit of sequential processing, you start adding complexity that may end up representing the bulk of your work: you start by using multi-threading, and eventually you also need multi-processing to take advantage of multiple machines. It is much easier to use a framework to reduce those issues.
Still, a framework may give you the ability to distribute your processing, but how easy is it to do? Now you want proper tools to assemble the many operations that you want to link together. Then you also need tools to easily identify bottlenecks so you can parallelize your operations. And what about all the standard operations you would expect to be able to do?
This is where a platform comes in. It gives you the foundation for distributed processing but also gives you pre-built capabilities to interact with the outside world (files, message queues, databases, and so on) and also analytics so you don't have to reinvent the wheel.
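As a small illustration of the first step described above, going from sequential processing to multi-threading, here is a minimal Python sketch. The `enrich` operation and the record layout are hypothetical stand-ins for whatever your program does to each record. Even this tiny change raises questions the sequential version never had to answer: how many workers, whether results must keep their order, and what happens when one record fails.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich(record):
    """Hypothetical per-record operation (a stand-in for real work)."""
    return {"id": record["id"], "value": record["value"] * 2}


def process_sequential(records):
    """The easy version: one record after another."""
    return [enrich(r) for r in records]


def process_threaded(records, workers=4):
    """Same work spread over a pool of threads.

    Executor.map returns results in input order, so the output matches
    the sequential version even though records are handled concurrently.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(enrich, records))


records = [{"id": i, "value": i} for i in range(8)]
print(process_threaded(records) == process_sequential(records))
```

Note that in CPython, threads mostly help when the per-record work waits on I/O. CPU-bound work typically needs multiple processes, and spreading it across machines is where the real complexity begins, which is the point of using a framework or a platform.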
JacquesRoy 120000A2MS 1,598 Views
InfoSphere Streams is starting to engage the open-source community to provide additional capabilities to its real-time analytics platform.
This is still very early in the process, and we can assume we'll see it evolve quickly. That may also be a way to consolidate
One of the projects is under the name resourceManagers.
Learn more about what is available for Streams on GitHub by looking at the newest page from the InfoSphere Streams playbook:
JacquesRoy 120000A2MS 1,416 Views
Does anyone remember this cartoon? I think the first time I saw it was in the '80s. Still, it keeps coming back.
This used to apply to IT requests. It can also be applied to all sorts of things, including how quickly you want to go from data to actionable information.
Real-time analytics applies in many industries, including medical, telecommunications, and security. You can find additional examples in the
There is a special need in processing machine data. The data can be generated at such a rate that we need machines to analyze all that data.
Data in motion processing is here to stay. It is a great approach to solve many business problems. Of course, this approach does not work in a vacuum.
The IBM solution for data in motion is InfoSphere Streams. You can download a free copy of the software to learn about it.
JacquesRoy 120000A2MS 1,277 Views
Do you know about IBM Data Magazine? It is the regular newsletter based on ibmdatamag.com that many people receive in their inbox
This online magazine contains articles on big data and warehousing, databases, information strategy, and integration and governance.
My first article was published on January 31st and is titled: "Getting the big data ball rolling".
I have put together a plan for a series of articles. When it gets more in depth, I will complement the articles with
Until next time...
JacquesRoy 120000A2MS 1,281 Views
I have to say, these are busy times!
With TimeSeries PoC and multiple activities around Streams, time flies by quickly.
It's been a while since I updated the InfoSphere Streams Playbook. This was overdue. There are new videos, training material and capabilities that were not reflected in the playbook. Here's what I updated:
With the end of the year so close, we can expect everyone to prepare for the new year. Looks like 2014 will be another fun year!
JacquesRoy 120000A2MS 1,604 Views
The other day I ran across an article on InfoWorld.com: “Cloudera pitches Hadoop for everything. Really?”
Of course, the article starts by mentioning the expression about hammers and nails. This is an old story, and it appears to be getting ready to repeat itself. As has been said: “those who forget the past are doomed to repeat it”.
Hadoop has been the biggest star of the big data story. I have to say that it is revolutionizing data processing and for good reasons. Many seem to point to the use of cheap clusters based on commodity hardware. I personally prefer to attribute it to the large amount of data that has different requirements from traditional data processing.