The IBM Challenge was a big success. One of the contestants, Ken Jennings, [welcomes our new computer overlords]. Congratulations are in order to the IBM Research team who pulled off this Herculean effort!
Some folks have poked fun at some of the odd responses and wager amounts from the IBM Watson computer during the three-day tournament. Others were surprised as I was that the impressive feat was done with less than 1TB of stored data. Here is what John Webster wrote in CNET yesterday, in hist article [What IBM's Watson says to storage systems developers]:
"All well and good. But here's what I find most interesting as a result of what IBM has done in response to the Grand Challenge that motivated Watson's creators. We know, from Tony Pearson's blog, that the foundation of Watson's data storage system is a modified IBM SONAS cluster with a total of 21.6TB of raw capacity. But Pearson also reveals another very significant, and to me, surprising data point: "When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, the actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1 Terabyte."
To better appreciate how difficult the challenge was, and how a small amount of data can answer a billion different questions, I thought I would cover Business Intelligence, Data Retrieval and Text Mining concepts.
"In this paper, business is a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera. The communication facility serving the conduct of a business (in the broad sense) may be referred to as an intelligence system. The notion of intelligence is also defined here, in a more general sense, as the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal."
Ideally, when you need "Business Intelligence" to help you make a better decision, you perform data retrieval from a structured database for the specific information you are looking for. In other cases, you might be looking for insight, patterns or trends. In that case, you go "data mining" against your structured databases.
And that's not including more ethereal questions, such as:
This is just for a small set, two market segments (by gender) and two products (apples and oranges). However, if you have many market segments (perhaps by age group, zip code, etc.) and many products, the number of queries that can be supported is huge. For small sets of data, you can easily do this with a spreadsheet program like IBM Lotus Symphony or Microsoft Excel.
Second, you had to be skilled at SQL to phrase your queries correctly to retrieve the data you are after. What ended up happening was that skilled SQL programmers would develop "canned reports" with fixed SQL parameters, so that less-skilled business decision makers could base their decisions from these reports.
IBM has fully integrated stacks to help process structured data, combining servers, storage, and advanced analytics software into a complete appliance. IBM offers the [Smart Analytics System] for robust, customized deployments, and recently acquired [Netezza] for pre-configured, and more rapid deployments.
However, the bigger problem is that more than 80 percent of information is not structured! Semi-structured data like email provides some searchable fields like From and Subject. The rest of the information is unstructured, such as text files, photographs, video and audio. To look for specific information in unstructured sources can be like looking for a needle in a haystack, and trying to get insight, patterns or trends involves text mining.
IBM is a leader in Business Analytics and has made great progress in dealing with unstructured data. This includes [IBM OmniFind Enterprise Edition], [IBM e-Discovery Manager] and [IBM Cognos Business Intelligence].
This, in effect, is what IBM Watson was able to perform so well this week. Finding the needle in the haystacks of unstructured data from 200 million pages of text stored in its system, combined with the ability to apprehend the interrelationships of meaning and subtle nuance, resulted in an impressive technology demonstration. Certainly, this new technology will be powerful for a variety of use cases across a broad set of industries!
To learn more, read the Arizona Daily Star's article [After 'Jeopardy!' win, IBM program steps out].
My colleagues, Harley Puckett (left) and Jack Arnold (right) were highlighted in today's Arizona Daily Star, our local newspaper, as part of an article on IBM's success and leadership in the IT storage industry. At 1400 employees here in Tucson, IBM is Southern Arizona's 36th largest employer.
Highlighted in the article:
Read the full article [IBMers Crank Out 4 New Offerings To Handle Data Deluge]
With the economy recovering from the [Global Recession], my manager has been given authorization to hire new Subject Matter Experts (SMEs) for the [IBM Tucson Executive Briefing Center!] Here are a few answers to questions you might have:
Where is the job located?
The job is located in Tucson, Arizona, which is a great place to live! Tucson is the headquarters for IBM storage design and development, with the largest collection of engineers, software developers and testers. The IBM Tucson Executive Briefing Center is located on the [University of Arizona Science and Technology Park] campus that houses over 7,000 employees from 50 different companies.
What does the job entail?
Primarily, you will be developing, customizing and presenting Powerpoint presentations and live product demos. For some briefings, you will work with sales reps, IBM Business Partners, and clients to develop an agenda of topics to discuss. At times, the presentation may involve working to solve the client's problems, drawing on the whiteboard or flip charts to help capture the requirements and architect a solution.
Which products are we talking about?
The [IBM System Storage product line] includes solid-state drives (SSD), block and file-based disk systems, tape drives and libraries, storage virtualization, and storage management software.
Is there any opportunity for travel?
Most of the presentations will be performed in Tucson, either in person, by webcast or video conference call. Sometimes, this includes discussions over drinks, dinner or golfing. Occasionally, there will be travel to present at client locations, IBM branch offices, events or conferences. My manager estimates approximately 10 percent travel.
Is the pay based on a commission?
Absolutely not! We are consultants, not salespeople. To maintain our "trusted advisor" status, it is a flat salary, with possibility for year-end bonus based on how well our division does overall. This allows us to present and position all of the products fairly to the clients at briefings without bias. Our clients appreciate that! The job is considered pre-sales technical support.
Is training included?
Yes. Assuming you already have a strong background in storage hardware and software, and how these connect to SAN and LAN networks for a variety of operating systems like z/OS, AIX, Windows and Linux, there will be training for the latest updates and features of the IBM products throughout the year. Also, there will be professional training to build up your public speaking and meeting facilitation skills.
How do I apply?
If you are an American citizen, fluent in the English language, and have at least a Bachelor's Degree, go to the [IBM Employment website], look for "Storage Support Specialist" position using job code "STG-0524037" or "STG-0525309". IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.
The job is immediately available. Apply today!