Explore the advanced analytics platform, Part 2: Explore use cases that cross multiple industries using the advanced analytics platform

Discover how the advanced analytics platform supports multiple big data use cases

In the first article of this series, you learned about big data and the "four Vs" that characterize this data (Volume, Velocity, Variety, and Veracity) and how an integrated analytics platform supports their diverse requirements. You also saw a high-level overview of the Advanced Analytics Platform (AAP) architecture components and how they support various aspects of big data.

The analytics platform that is created requires investment in products and the sourcing of data from multiple sources. By its very nature, big data leads to extreme requirements in data volume and velocity that make it hard to replicate data across organizational silos. To justify such investment, the platform must support multiple use cases for an organizational data lake (a common big data store across many divisions of the organization). Explore multiple successful big data use cases and the flow of one use case to learn how the architecture supports them.

Dr Arvind Sathi (asathi@us.ibm.com), WW Architect, Big Data, IBM

Arvind Sathi is the worldwide Communication Sector Architect for big data at IBM. His primary focus has been creating visions and roadmaps for Advanced Analytics at leading IBM clients in telecommunications, media and entertainment, and energy and utilities organizations worldwide. He has conducted a number of strategic BAO engagements with marquee customers in the communications industry.



Mathews Thomas (matthom@us.ibm.com), Executive IT Architect, IBM

Mathews is the lead architect for the Communications Sector at the IBM Global Industry Solution Center (GISC), which hosts the primary IBM industry solution center and labs in North America for the telecommunications, media and entertainment, and energy and utility industries. The GISC is also an IBM Analytics Solution Center, where Mathews is the lead architect.



Mr Jinesh Radadia (radadia@us.ibm.com), Partner, Watson and Big Data Analytics, IBM

Jinesh leads the IBM services and solutions business for its breakthrough innovations in Watson DeepQA and big data analytics to enable strategic, analytics-driven business transformation for customers across all industries.



Mr Ken Kralick (ken.kralick@us.ibm.com), Global Solution Executive, IBM Telecommunications Leadership Team, IBM

Ken is a member of the IBM Telecommunications Industry leadership team, focused on bringing Customer Experience Solutions to Communications Service Provider clients globally. The focus includes big data, analytics, and BSS/OSS solutions (including software, services, and hardware).



Mr Richard Lanahan (rlanahan@us.ibm.com), WW Communications Sector Lead, Big Data Industry Solutions, IBM

Rich leads the Global Communications Sector, Big Data Solutions Team. The Big Data Industry team works to identify and accelerate value in big data initiatives with IBM clients globally. Key areas that Rich and his team see evolving in the Communications Industry include IT transformation, advanced network analytics, personalized customer experience, and new business model and service creation.



24 September 2013

Introduction and review of previous article

In Part 1, we discussed big data and the "four Vs" that characterize this data (Volume, Velocity, Variety, and Veracity) and how an integrated analytics platform supports their diverse requirements. In a high-level overview of the Advanced Analytics Platform (AAP), you looked at the AAP architecture components and how they support various aspects of big data.

The platform that is created requires investment in products and the sourcing of data from multiple diverse sources. The very nature of big data leads to extreme requirements in data volume and velocity that make it hard to replicate data across organizational silos. To justify such investment, the platform must support multiple use cases using an "organizational data lake" (a common big data store across many divisions of the organization). In Part 2, you look at many successful big data use cases plus the flow of one use case to see how the architecture supports them.


Use cases

With a strong surge in evaluations to replace warehouses with big data tools, most organizations are beginning to realize that big data enables a number of important use cases that traditional business intelligence was unable to support. IBM has implemented a number of these use cases for big data using AAP. The rest of this section provides examples of several. We then take a simplified use case and explain how to address it with the Advanced Analytics Platform.

Social media command center

Marketers often have a number of staffers who are dedicated to surfing the net for any brand-related information that is posted by external sources. Even so, someone outside often discovers a problem on Twitter before the internal monitoring organization does. Many junior staffers who are employed by marketing, customer service, and public relations search through social media for relevant information. A Social Media Command Center combines automated search and display of consumer feedback that is expressed publicly on social media. Often, the feedback is summarized in the form of positive or negative sentiment. After the feedback is obtained, the marketer can respond to specific comments through a conversation with the affected consumers, whether to answer questions about an outage or obtain feedback about a new product offering.

Big data analytics can help you monitor social media for feedback on product, price, and promotions plus automate the actions that are taken in response to the feedback. This feedback might require communication with several internal organizations, tracking a product or service problem, and dialog with customers as the feedback results in product or service changes. When consumers provide feedback, the dialog can be created only if the responses are provided in a timely manner. The automated solutions are far better at systematically finding the information, categorizing it based on available attributes, organizing it into a dashboard, and orchestrating a response at conversation speed.
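A minimal sketch of the categorization step is shown in Listing 1. It assumes a hand-built keyword lexicon, and the post text is invented; a production command center would use trained sentiment models and far richer attributes than this toy rule.

Listing 1. Tagging social media posts by sentiment (illustrative)

import re

# Toy sentiment lexicons; a real command center would use trained NLP models.
POSITIVE = {"love", "great", "fast", "reliable"}
NEGATIVE = {"outage", "slow", "broken", "hate"}

def score_post(text):
    """Categorize one post as positive, negative, or neutral."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

posts = [
    "Love the new plan, great value",
    "Another outage, service is broken again",
]
for post in posts:
    print(score_post(post), "->", post)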

Product knowledge hub

As consumers turn into sophisticated users of technology and the marketplace becomes specialized, the product knowledge seldom belongs to one organization. Take the Apple iPhone as an example. Apple markets the iPhone. iPhone parts come from a large supply chain pool, iPhone apps come from a large community of app developers, and the communications service is provided by a Communications Service Provider (CSP). Google's Android is even more diverse, as Google provides the operating system while a cell phone manufacturer makes the device. The smartphones do not work in isolation. They communicate with each other across CSPs and WiFi networks and even act as WiFi hubs for other devices. In most cases, if we have a question about how these products work together, we can find the answer by searching with any popular search engine. However, the solutions do not always favor the CSPs, and they are often dated, failing to take into account the latest offerings. Between the device operating system, the offerings from CSPs, and the apps, you must tread carefully through the versions to make sure that the discovered solution is for the same version of software that is on the device.

The solution involves three sets of technologies. The first part of the solution is the capability to tap any sources of data. A CSP might already have pieces of the solution on its intranet, put together by product managers or customer service subject matter experts. Or, the information might be on a device manufacturer site or a third-party site. All this data must be pulled and stripped of the control information so that the raw text is available for reuse. The second part of the solution is to create a set of indexes so that the raw information can be categorized and found when needed. Because many combinations of products exist, we like to collect and combine information for the devices searched. With the federated indexing system, we can organize the information for easy access. The third part of the solution involves creating an XML document in response to a query that can either be rendered by a mashup engine or made available to a third-party application.
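Listing 2 sketches the second and third parts of the solution under simplifying assumptions: the documents are taken as already pulled and stripped to raw text, a plain in-memory dictionary stands in for the federated indexing system, and the document names and text are invented.

Listing 2. A toy index and XML query response for a knowledge hub (illustrative)

import xml.etree.ElementTree as ET
from collections import defaultdict

# Documents already pulled from their sources and stripped to raw text
# (part one of the solution). Names and text are invented.
docs = {
    "iphone-wifi-setup": "Configure WiFi hotspot sharing on the iPhone",
    "android-apn-fix": "Update the APN settings when data fails on Android",
}

# Part two: a toy inverted index mapping each term to the documents using it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

# Part three: answer a query with an XML document that a mashup engine
# or a third-party application could consume.
def query_to_xml(term):
    root = ET.Element("results", attrib={"query": term})
    for doc_id in sorted(index.get(term.lower(), [])):
        hit = ET.SubElement(root, "document", attrib={"id": doc_id})
        hit.text = docs[doc_id]
    return ET.tostring(root, encoding="unicode")

print(query_to_xml("WiFi"))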

What we created is a knowledge hub, which can now be used directly from a website or made available to the call centers. It significantly reduces call-handling time in the call centers and also increases first call resolution. By placing the information on the web, we are now promoting the CSP's website as the source of knowledge, which increases web traffic and reduces the number of people who resort to contacting the call center.

After you create a single knowledge source, this source can be used to upsell other products, connect usage knowledge to product features, and use the knowledge pool to discover new product or business partnership ideas. Much stray, fragmented knowledge about the products can be rapidly organized and put to various other uses.

A very different domain with a similar use of this approach is the healthcare industry. Just as device and situation-specific information must be accessed, organized, indexed, and made available, patient and condition-specific information in healthcare has the same needs. We will explore this in a future article in this series.

Infrastructure and operations studies

A number of industries are exploring the use of advanced analytics to improve their infrastructure. In many situations, the best way to improve the infrastructure is to understand its use and how bottlenecks or configurations affect performance. In the past, gathering this data required extensive and costly manual collection. Big data provides a natural source of data with minimal collection costs.

The city of Boston decided to use big data to identify potholes in the streets by sponsoring a competition in the analyst community. A winner came from Sprout & Company, a nonprofit group in Somerville, Massachusetts. The solution included the use of magnitude-of-acceleration spikes along a cell phone's z-axis to spot impacts, plus more filters to distinguish potholes from other irregularities on the road. The new algorithm made Street Bump, a free download in Apple's App Store, a winner. This analysis can save significant road survey cost. Navigation systems can also use the cell phone data to avoid traffic congestion and offer alternative routes. This type of use of big data is one of the best ways to gain acceptance without getting into privacy or security issues.
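Listing 3 shows the general shape of such an impact detector. The spike threshold, the neighboring-sample filter, and the accelerometer trace are invented for illustration; the winning Street Bump algorithm is more sophisticated.

Listing 3. Spotting pothole-like spikes in z-axis accelerometer data (illustrative)

# Invented spike threshold, in g; real filters are tuned against survey data.
Z_SPIKE_G = 2.5

def detect_impacts(z_samples, threshold=Z_SPIKE_G):
    """Return indices of candidate pothole impacts in a z-axis trace."""
    candidates = [i for i, z in enumerate(z_samples) if abs(z) > threshold]
    # Filter: require an elevated neighboring sample so that isolated
    # one-sample glitches (a dropped phone, say) are ignored.
    return [i for i in candidates
            if (i > 0 and abs(z_samples[i - 1]) > 1.5)
            or (i + 1 < len(z_samples) and abs(z_samples[i + 1]) > 1.5)]

trace = [1.0, 1.1, 0.9, 2.8, 3.1, 1.0, 2.9, 1.0]
print(detect_impacts(trace))   # [3, 4]: two adjacent spikes form one impact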

In another example, city bus and train agencies are making their real-time transit information available to riders. This information significantly improves the user experience and reduces the uncertainty that is associated with both planned and unexpected delays. Transloc (www.transloc.com) provides this information for riders through various technologies, including smartphones, web, and SMS messages. It also provides prediction capabilities on expected arrival time. When the app is loaded on a smartphone, the rider can use it to accurately estimate travel time and also review the travel route.

The IBM Smarter Cities® initiative uses big data in applications that are directed at city infrastructure and operations. Location data from cell phones provides raw material for detecting traffic patterns. These patterns are then used to decide on new transportation projects, to change controls, or to redirect traffic in an emergency.

Product selection, design, and engineering

Product automation provides an enormous opportunity to measure customer experience. You take photos digitally and then post them on Facebook, providing an opportunity for face recognition without requiring laborious cycles in digitization. You listen to songs on Pandora, creating an opportunity to measure what you like or dislike or how often you skip a song after you listen to the part of it that you like the most. You read books electronically online or on your favorite handheld devices, giving publishers an opportunity to understand what you read, how many times you read it, and which parts you view. You watch television on a two-way set-top box that can record each channel click and correlate it to analyze whether the channel was switched right before, during, or after a commercial break. Even mechanical products such as automobiles increasingly interact electronically. You make all of your ordering transactions electronically, giving third parties opportunities to analyze your spending habits by month, by season, by postal code, and by tens of thousands of micro-segments. Usage data can be synthesized to study the quality of customer experience and can be mined for component defects, successes, or extensions. Marketing analysts can identify micro-segments using this data. For example, at a wireless company, we traced cell phone usage problems to a defective device antenna by analyzing call quality and comparing it across devices.

Products can be test-marketed and changed based on feedback. They can also be customized and personalized for every consumer or micro-segment based on their needs. Analytics plays a major role in customizing, personalizing, and changing products based on customer feedback. Product engineering combines a set of independent components into a product in response to a customer need. Component quality impacts overall product performance. Can you use analytics to isolate poorly performing components and replace them with good ones? In addition, can you simplify the overall product by removing components that are rarely used and offer no real value to the customer? Many product engineering analytics based on customer experience data can lead to building simplified products that best meet customer requirements.

To conduct this analysis and predictive modeling, you need a good understanding of the components that are used and how they participate in the customer experience. After a good amount of data is collected, the model can isolate badly performing components by isolating the observations from customer experience and tracing them to the poorly performing component. Complex products, such as automobiles, telecommunications networks, and engineering goods, benefit from this type of analytics around product engineering.

The first level of analysis is identifying a product portfolio mix and its success with customers. For example, if a marketer has many products, these products can be aligned to customer segments and their usage. You might find several products that were purchased and hardly used, leading to their discontinuation within six months, while other products were heavily used and rarely discontinued.

When you identify less-used products, the next analysis question is whether you can isolate the cause of customer disinterest. By analyzing usage patterns, you can differentiate between successful products and unsuccessful ones. Were the unsuccessful ones never launched? Did many users get stuck with the initial security screen? Maybe the identification process was too cumbersome. How many people used the product to perform basic functions offered by the product? What were the highest frequency functions?

The next level of analysis is to understand component failures. How many times did the product fail to perform? Where were the failures most likely? What led to the failure? What did the user do after the failure? Can you isolate the component, replace it, and repair the product online?

These analysis capabilities can now be combined with product changes to create a sophisticated test-marketing framework. You can change the product, try the modified product on a test market, observe the impact, and, after repeated adjustments, offer the altered product to the marketplace.

Let us illustrate how big data is shaping improved product engineering and operations at communications service providers. Major CSPs collect enormous amounts of data about the network, including network transport information that comes from the routers and the switches, plus usage information, popularly known as call detail records (CDRs). CDRs are recorded each time people use telephones to connect with one another. As the CSP networks grew in sophistication, the CDRs were extended to data and video signals through IP detail records (IPDRs). Most CSPs refer to this usage information as xDRs (where x is a variable that stands for "any" usage information). For larger CSPs, the usage statistics not only are high volume (billions of transactions a day) but also require low-latency analytics for a number of applications. For example, detecting a fraudulent transaction or abusive network user in the middle of a video download or call might be more valuable than finding out this information the next day. A strategic driver for CSPs is to lay out all the network and usage information on their network topology and geography and use various automated analytics and manual visualization techniques to connect network trouble or inefficiencies with usage. These analytics provide CSPs with a valuable capability to improve the quality of the communication. If every user call is dropping in a particular area that is a popular location for premier customers, it might lead to churn of those customers to competitors.

The information about xDRs, network events, customer trouble tickets, blogs, and tweets in the social media can be correlated for various business purposes. CSPs use this type of analytics to detect spots with poor network performance to reorganize towers and boosters. The differences in usage can be analyzed to detect device problems such as faulty antennas on specific models. The variations can also be analyzed to find and fix network policies or routing problems. As CSPs race to implement high-volume, low-latency xDR hubs, they find plenty of business incentives to fund these programs and reap benefits in the form of improved product offerings for customers.
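Listing 4 gives a small taste of this kind of analysis: a dropped-call rate per cell tower, computed from CDRs. The record fields, the sample data, and the 5 percent alert threshold are invented; a real xDR hub runs such aggregations over billions of records at low latency.

Listing 4. Dropped-call rate per tower from CDRs (illustrative)

from collections import Counter

# Invented CDR sample; real CDRs carry many more fields.
cdrs = [
    {"tower": "T-101", "dropped": False},
    {"tower": "T-101", "dropped": True},
    {"tower": "T-202", "dropped": False},
    {"tower": "T-101", "dropped": True},
]

total, dropped = Counter(), Counter()
for cdr in cdrs:
    total[cdr["tower"]] += 1
    dropped[cdr["tower"]] += cdr["dropped"]

for tower in sorted(total):
    rate = dropped[tower] / total[tower]
    flag = "  <-- investigate" if rate > 0.05 else ""
    print(f"{tower}: {rate:.0%} dropped calls{flag}")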

Location-based services

Various industries have location information about their customers. Cell phone operators know customer location through the location of the phones. Credit-card companies know the location of transactions, and auto manufacturers the location of cars, while social media encourages customers to disclose their location to their friends and family.

Let us take a wireless CSP example to study how to collect and summarize location information. A cell phone is served by a collection of cell phone towers, and its specific location can be inferred by triangulating its distance from the nearest cell towers. In addition, most smartphones can provide GPS location information that is more accurate (down to about one square meter). The location data includes longitude and latitude and, if completely stored, takes about 26 bytes of information. If you deal with 50 million subscribers and want to store 24 hours of location information at the frequency of once a minute, the stored data is about two terabytes of information per day. This is the amount of information that is stored in the location servers at a typical CSP. One can use fewer bytes to store the information and in doing so reduce the location accuracy.
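The two-terabyte figure checks out with quick arithmetic, as Listing 5 shows.

Listing 5. Checking the location-data storage estimate

# 50 million subscribers, one 26-byte location record per minute, kept 24 hours.
subscribers = 50_000_000
records_per_day = 24 * 60            # one location record per minute
bytes_per_record = 26

total_bytes = subscribers * records_per_day * bytes_per_record
print(f"{total_bytes / 1e12:.2f} TB per day")   # 1.87 TB, about two terabytes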

Customer locations can be summarized into "hangouts" at different levels of granularity. The location information can be aggregated into geohashes that draw geo boundaries and transform latitude-longitude data into geohash so that it can be counted and statistically analyzed. The presence of a person in a specific location for a certain duration is considered a space-time box. This information can be used to encode the hangout of an individual in a specific business or residential location for a specific time period.
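Listing 6 makes the latitude-longitude-to-geohash transformation concrete with a minimal encoder. The algorithm interleaves longitude and latitude bits and emits base-32 characters; a six-character geohash names a cell of roughly 1.2 km by 0.6 km.

Listing 6. A minimal geohash encoder

# Base-32 alphabet used by the geohash encoding.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=6):
    """Encode a latitude-longitude pair as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits, bit_count, chars = 0, 0, []
    even = True                      # even bits encode longitude
    while len(chars) < precision:
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits = bits * 2 + 1
            rng[0] = mid             # keep the upper half
        else:
            bits = bits * 2
            rng[1] = mid             # keep the lower half
        even = not even
        bit_count += 1
        if bit_count == 5:           # five bits per base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash(37.7749, -122.4194))   # San Francisco -> 9q8yyk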

Many smartphone apps collect location data, if a subscriber opts in. If a marketer is interested in increasing the traffic to a grocery store that is in a specific geohash, they can run an effective marketing campaign by analyzing which neighborhoods' residents are most likely to hang out or shop at that specific grocery store. Instead of blasting a promotion to all neighborhoods, the communication can now be directed to specific neighborhoods, increasing the efficiency of the marketing campaign. This analysis can be conducted by using a 6-byte location geohash over a one-hour span and finding all the cell phones that visited the grocery store regularly. A predictive model can compute the probability of a customer visit to a grocery store based on the past hangout history of the customer. Customer residence information can be clustered to identify the neighborhoods most likely to visit the shopping center.

Analysis of machine-to-machine transaction data with big data technologies is revolutionizing how location-based services can be personalized and offered at low latency. Consider the example of Shopkick, a retail campaign tool that can be downloaded on a smartphone. Shopkick seeks and uses location data to offer campaigns. After the app is downloaded, Shopkick seeks permission to use current location as recorded by the smartphone. In addition, Shopkick has a database of retailers and their geo-locations. It runs campaigns on behalf of the merchants and collects its revenues from merchants. Shopkick tells you, for example, that the department store in your neighborhood would like you to visit the store. As a further incentive, Shopkick deposits shopping points in your account for just visiting the store. As you walk through the store, Shopkick can use your current location in the smartphone to record your presence at the store and award points.

Why would a customer opt in? Device makers, CSPs, and retailers are beginning to offer a number of location-based services, in exchange for location opt in. For example, smartphones offer "find my phone" services, which can locate a phone. If the phone is lost, the last known location can be ascertained through a website. In exchange, the CSP or the device manufacturer might seek location data for product or service improvement. These location-based services can also generate revenue. A CSP might decide to charge for a configuration service that switches a smartphone to silent mode when the subscriber enters the movie theater and switches back to a normal ring tone once the subscriber leaves the theater. Prepaid wireless providers are engaging in location-based campaigns that are targeted at customers who are about to run out of prepaid minutes. These customers are the most likely to churn to a competitor and might easily continue with their current wireless provider if directed to a store that sells prepaid wireless cards.

Micro-segmentation and Next Best Action

Automation provides tremendous opportunity to use sensors to collect data in every step of the customer-facing processes, such as click streams in the use of a website. Sensor data gives you an opportunity to establish behavioral patterns using analytics. The early evolution was in use of analytics for segmentation. The original segmentations were demographic in nature and used hard consumer data, such as geography, age, gender, and ethnic characteristics to establish market segmentations. Marketers soon realized that behavioral traits were also important parameters to segment customers.

As understanding grew, we saw more emphasis on micro-segments — specific niche markets that are based on analytics-driven parameters. For example, marketers started to differentiate innovators and early adopters from late adopters in their willingness to purchase new electronic gadgets. Through customer experience data, you can characterize innovators who are eager to share experiences early on and are more tolerant of product defects.

In the mid-1990s, with automation in customer touch points and use of the Internet for customer self-service, marketing became more interested in personalization and 1:1 marketing. As Martha Rogers and Don Peppers point out in their book The One to One Future:

"The basis for 1:1 marketing is share of customer, not just market share. Instead of selling as many products as possible over the next sales period to whomever will buy them, the goal of the 1:1 marketer is to sell one customer at a time as many products as possible over the lifetime of that customer's patronage. Mass marketers develop a product and try to find customers for that product. But 1:1 marketers develop a customer and try to find products for that customer."

Early analytics systems were reporting systems that provided raw segmentation data to the marketing team so that they could use the data to decide on marketing activities, such as campaigns. Automation in marketing and operations gave us the opportunity to close the loop — to use analytics to collect effectiveness data to revise and improve campaigns. We are seeing surges in campaign activity. Marketers are interested in micro-campaigns that are designed specifically for a micro-segment or, in some cases, for specific customers. The customer experience information gives criteria for including a customer in the campaign.

At Northeastern University in Boston, network physicists discovered just how predictable people are by studying the travel routines of 100,000 European mobile-phone users. After researchers analyzed more than 16 million records of call dates, times, and locations, they determined that, when compiled, people's movements appeared to follow a mathematical pattern. The researchers stated that with enough information about past movements, they can forecast someone's future whereabouts with 93.6 percent accuracy.

How do you use location data to derive micro-segments? At the simplest level, if you take the past three months of location data across a set of people, you can differentiate between globe trotters, people who do field jobs, "9-to-5ers" (that is, people who work desk jobs during regular office hours), and people who work from home. At the next level, you can start to infer frequent behaviors. By observing how many times a person visits a coffee shop, the mall, or a golf course, for example, you can establish his hangouts through frequency rules (such as "more than four visits per month, each for a duration of an hour or longer" constitutes a hangout). A marketer might ask a customer to opt in to sharing location information and then offer location- and context-specific promotions.
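The frequency rule just quoted translates directly into code, as Listing 7 shows. The visit records, place categories, and durations are invented; a real system would first derive visits and durations from raw location data.

Listing 7. Deriving hangouts from a frequency rule (illustrative)

from collections import defaultdict

# One month of visit records: (place_category, duration_minutes). Invented.
visits = [
    ("coffee_shop", 75), ("coffee_shop", 90), ("golf_course", 240),
    ("coffee_shop", 65), ("coffee_shop", 80), ("coffee_shop", 120),
    ("coffee_shop", 50),
]

def hangouts(month_of_visits, min_visits=4, min_minutes=60):
    """Apply the rule: more than min_visits visits, each min_minutes or longer."""
    counts = defaultdict(int)
    for place, minutes in month_of_visits:
        if minutes >= min_minutes:
            counts[place] += 1
    return [place for place, count in counts.items() if count > min_visits]

print(hangouts(visits))   # ['coffee_shop']: five hour-long visits this month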

Next Best Action (NBA) recommends an activity that is based on the customer's latest experience with the product. This activity might include an up-sell or cross-sell based on current product ownership, usage level, and behavioral profile. You can offer an NBA whenever the sales organization has the opportunity to connect with the customer through a touch point. NBA is far more effective in sales conversion than predefined rules that offer the same product over and over across a customer interaction channel. (Imagine that your airline offered you a discounted trip to your favorite warm-weather golf vacation spot on a cold day.)

NBA can also be revised based on feedback from customer reactions. For a number of decades, television producers relied on a control sample of audience viewing habits to gauge the popularity of their television shows. This data was collected through extensive surveys in the early days of television programming and then through special devices that were placed on a sample of television sets by companies such as Nielsen. With the advancement in the cable set-top box (STB) and digital network that support the cable and satellite industries, you can now collect channel surfing data from all the STBs capable of providing this information. As a result, the size of data collected has grown considerably, providing finer insights not previously available. This information is valuable because you can use it to correlate channel surfing with a number of micro-segmentation variables.

Grocery stores are equally busy developing their understanding of customers. Most grocers offer frequent-shopper cards that can be used by the grocer to track purchase habits and by the shoppers to redeem discounts and other useful campaigns. With identifying information collected from the customer, this shopper card can be correlated with a name and an address. With retailer information from the frequent-shopper card and cable provider information about television viewing habits, you can correlate channel surfing data with retail purchases by the household and insert appropriate commercials to run micro-campaigns that are based on household purchases.

Retailers toyed with the idea of providing shopping gadgets to shoppers and eventually realized that creating a smartphone app to run on an existing device is easier than engineering a new device. The shoppers can activate a mobile app as soon as they enter the retail store. The app starts to collect GPS-level accurate location information about the shopper and lets the shopper check in grocery items on the smartphone. At the checkout counter, the shopper connects the smartphone to the point-of-sale (PoS) device, and the grocery bill is automatically paid by the credit card that is associated with the app. As the person walks through the grocery store and checks in grocery items with a smartphone, a campaign management system downloads mobile coupons that are based on customer profile, past grocery purchases, and currently active promotions.

While advertising agencies made the connection through messaging, you can now connect at micro-segment or one-to-one marketing levels. That is, you can air commercials and see their impact on customer purchases, or air commercials that are based on what a specific consumer is buying. It requires the ability to connect retail and cable advertising data plus an ecosystem where the two analytics systems (retail and cable) can collaborate.

A triple-play CSP (providing cable, broadband, and wireless services) might use its customer database to correlate customer activities across these three screens. Many consumers are viewing media using Internet over their desktops or tablets. You can now start to correlate media viewing, location-based micro-segments, and customer purchase intentions as known through social media to make retail offers. The consumer registers on the retailer's website, giving permission to the retailer to use profile data. The retailer uses consumer context and location to tailor a specific promotion.

This is another good place to peek ahead in this article series, to the healthcare applications of big data and the AAP. Patient micro-segmentation based on health evidence and behavior, rather than merely on demographic data, is creating better treatment models, similar to how customer segmentation creates better offers.

Online advertising

Television and radio used advertising as their funding model for decades. As online content distribution became popular, advertising followed the content distribution with increasing volume and acceptance in the marketplace. Online advertising is also becoming increasingly sophisticated. The biggest focus is the advertisement bidding that is managed for a publisher, such as Google, by either a Supply Side Platform (SSP) or an Advertising Exchange. Online advertising provides tremendous opportunity for advertising to a micro-segment and also for context-based advertising.

The advertiser's main goal is to reach the most receptive online audience in the right context. The audience members then engage with the displayed ad and eventually take the wanted action that is identified by the type of campaign. Big data provides you with an opportunity to collect myriads of behavioral information. This information can be collated and analyzed to build two sets of insights about the customers, both of which are relevant to online advertising. First, the micro-segmentation information and associated purchase history allows you to establish buyer patterns for each micro-segment. Second, you can use the context of an online interaction to drive context-specific advertising. For example, for someone who is searching and shopping for a product, a number of related products can be offered in the advertisements that are placed on the web page.

Turn's Demand Side Platform (DSP) delivers over 500,000 advertisements per second by participating in ad bidding at most major publishers, including Google, Yahoo, and Facebook. A DSP manages online advertising campaigns for a number of advertisers through real-time auctions or bidding. Unlike a direct-buy market (such as print or television), where the price is decided in advance based on reach and opportunities to see, the real-time Ad Exchange accepts bids for each impression opportunity. The impression is sold to the highest bidder in a public auction. DSPs are the platforms where all the information about users, pages, ads, and campaign constraints comes together to make the best decision for advertisers.

Consider an example to understand the flow of information and collaboration between publisher, Ad Exchange, DSP, and advertiser to deliver online advertisements. If a user initiates a web search for food in a particular postal code on a search engine, the search engine takes the request, parses it, and starts to deliver the search results. As the search results are delivered, the search engine decides to place a couple of advertisements on the screen. The search engine seeks bids for those spots, which are accumulated through the Ad Exchange and offered to a number of DSPs competing for the opportunities to place advertisements for their advertisers. In seeking the bid, the publisher might supply some contextual information that can be matched with any additional information known to the DSP about the user. The DSP decides whether to participate in this specific bid and makes an offer to place an ad. The highest bidder is chosen, and their advertisement is delivered to the user in response to the search. Typically, this entire process takes about 80 milliseconds.
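Listing 8 compresses this flow into a toy first-price auction. The bidder functions, their valuation rules, and the prices are invented; real DSPs evaluate user data and campaign constraints within a strict latency budget.

Listing 8. A toy real-time ad auction (illustrative)

# Context supplied by the publisher with the bid request.
context = {"query": "food", "postal_code": "10001"}

def dsp_a(ctx):
    """This DSP knows its user segment responds to restaurant ads."""
    return 1.20 if ctx["query"] == "food" else 0.10

def dsp_b(ctx):
    """This DSP has no user match and bids a low default price."""
    return 0.30

def run_auction(ctx, bidders):
    """Award the impression to the highest bidder (first-price auction)."""
    bids = [(bidder(ctx), bidder.__name__) for bidder in bidders]
    price, winner = max(bids)
    return winner, price

winner, price = run_auction(context, [dsp_a, dsp_b])
print(f"{winner} wins the impression at {price:.2f}")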

A Data Management Platform (DMP) can collect valuable statistics about the advertisement and the advertising process. The key performance indicators (KPIs) include the number of times a user clicked the advertisement, which provides a measure of success. If a user receives a single advertisement many times, it can cause saturation and reduce the probability that the user will click the advertisement.

As online advertising is integrated with online purchasing, the value of placing an advertisement in the right context can go up. If the placement of the ad results in the immediate purchase of the product, the advertiser is likely to offer a higher price to the publisher. DSP and DMP success depends directly on their ability to track and match consumers based on their perceived information need and their ability to find advertising opportunities that are related closely to an online sale of associated goods or services.

Improved risk management

A credit-card company can use cell phone location data to differentiate an authentic user from a fraudulent one. When the credit card is used, the transaction location can be matched with the customer's cell phone location to reduce the risk of fraudulent transactions.

The premise for credit-card fraud is that someone might steal a credit card and use it. A typical fraud rule looks for an unusual purchase that is initiated in a new, unusual location. Unfortunately, for frequent travelers, irregular personal credit-card use can easily mimic these fraudulent transactions. However, these travelers carry their smartphones all the time when they travel. A traveler could authorize her credit-card company to check her phone location whenever a concern about credit-card usage arises, and she could even download an app to her phone that asks her to authorize the charges through a secure login or password, to eliminate the possibility that the phone itself was stolen.
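A minimal version of this location cross-check is a distance test between the transaction and the phone's last known position, as sketched in Listing 9. The 100 km threshold and the coordinates are illustrative; a real system would combine this signal with travel history and in-app confirmation.

Listing 9. Cross-checking transaction and phone locations (illustrative)

from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))   # mean Earth radius of 6371 km

def flag_transaction(txn_loc, phone_loc, max_km=100):
    """Flag the charge if the card and the phone are suspiciously far apart."""
    return haversine_km(*txn_loc, *phone_loc) > max_km

phone = (40.7128, -74.0060)           # phone last seen in New York
purchase = (51.5074, -0.1278)         # card swiped in London
print(flag_transaction(purchase, phone))   # True: ask the app to confirm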

Example flow

Now look at a simplified version of the location-based services use case to see how the AAP addresses it and how information flows across the architecture.

A company decides to offer coupons to individuals based on their location and other characteristics and also monitors how successful the campaign is. The following steps outline how the process works. Figure 1 relates each step of the flow to the relevant architecture components. We are considering a telecommunications solution, so the architecture is modified to reflect data sources that are relevant to a telecommunications environment.

The initial level of analytics required is data-at-rest analytics where the appropriate models, entity resolution, and campaigns are created through static insights that are derived from data at rest.

  1. Data from various sources is collected. In this case, we assume social media data, customer loyalty data, web log data that indicates how users interacted with company sites, customer location data, and customer profile data that the company has about the customer.
  2. The preceding data goes through an extract-transform-load process in the appropriate ETL tool if necessary. In many cases, the data can be loaded as is.
  3. The data is stored into the appropriate repository according to whether it is structured or unstructured data.
  4. Entity resolution tools can further process this data to provide a complete user profile. This profile gives a complete view of the user that is based on the different sources that are described in step 1. For example, the profile links customer loyalty data to how the customer interacts with the website, so you know what this customer bought and what the customer might be interested in buying.
  5. Appropriate predictive models are created from the customer profile information and location information. These models can determine users' movement habits, where users hang out, who they hang out with, and more details to better segment the target market (see the modeling sketch after Figure 1).
  6. Appropriate campaigns are created in the campaign management system for the target market segment that includes the marketing channel and message for each channel.
Figure 1. Location-based services analytics – data at rest
Diagram of location-based services analytics for data at rest
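Listing 10 is a stand-in for the model building in step 5. Within the AAP, a tool such as SPSS Modeler would typically build these models; scikit-learn is used here only to keep the sketch self-contained, and the features, training data, and propensity target are all invented.

Listing 10. Training a toy coupon-propensity model (illustrative)

from sklearn.linear_model import LogisticRegression

# Features per customer: [visits_near_store_per_month, loyalty_purchases_per_month]
X = [[0, 1], [1, 0], [6, 4], [8, 5], [2, 1], [7, 3]]
y = [0, 0, 1, 1, 0, 1]               # 1 = redeemed a past coupon

model = LogisticRegression().fit(X, y)

# Score a new customer: frequent nearby visits, a few loyalty purchases.
propensity = model.predict_proba([[5, 2]])[0][1]
print(f"coupon-redemption propensity: {propensity:.2f}")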

When a customer enters a specific location, real-time analytics must be performed on the available data. The AAP acts as noted in Figure 2:

  1. The location data is obtained and processed by the streaming engine in real time.
  2. Real-time analytics is performed on the data to determine whether to send a coupon to this customer. This step invokes the predictive models in real time and receives a score to determine whether the customer falls within a target segment (see the scoring sketch after Figure 2).
  3. If the customer falls within the target segment, the campaign management system determines what message to send and a coupon is sent through the appropriate channel. Examples of channels include mobile, social media, and web.
  4. The real-time data is stored in the appropriate repository for future historical analysis.
  5. Feedback indicates whether the customer accepted the coupon.
  6. The models are continuously refined based on the success of the campaign.
Figure 2. Location-based services analytics – data in motion
Diagram of location-based services analytics for data in motion
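Listing 11 condenses steps 1 through 3 of this data-in-motion flow into plain Python. In a real deployment, InfoSphere Streams would host this logic; the event format, target geohash cell, and scoring cutoff are invented.

Listing 11. Scoring location events in motion (illustrative)

TARGET_GEOHASH = "9q8yyk"            # cell that contains the store
PROPENSITY_CUTOFF = 0.5

def on_location_event(event, propensity_of):
    """Steps 1-3: receive a location event, score it, decide on a coupon."""
    if event["geohash"] != TARGET_GEOHASH:
        return None                  # customer is not near the store
    if propensity_of(event["customer_id"]) >= PROPENSITY_CUTOFF:
        return {"customer_id": event["customer_id"], "channel": "mobile"}
    return None

# Toy stand-ins for the model scores and the incoming event stream.
scores = {"c-42": 0.8, "c-77": 0.2}
stream = [{"customer_id": "c-42", "geohash": "9q8yyk"},
          {"customer_id": "c-77", "geohash": "9q8yyk"}]

for event in stream:
    decision = on_location_event(event, lambda cid: scores[cid])
    print(event["customer_id"], "->", decision or "no coupon")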

Conclusions and what comes next

We discussed some key use cases that cross multiple industries where you can use the Advanced Analytics Platform. We then walked through the flow of one use case to show how the components come together to implement it. Implementing such a use case requires:

  • Understanding the data
  • Processing the data to a state where the analytics tools can be applied to it effectively
  • Identifying the appropriate repositories for structured and unstructured data
  • Determining which analytics to do with data at rest and data in motion
  • Performing the analytics and then executing the actions that follow from the derived insights

The next article will examine the unstructured analytics pattern and how the Advanced Analytics Platform enables this kind of analytics. We will examine what is meant by unstructured data, potential sources for this data, the macro and micro patterns associated with such data, specific use cases, and tooling, with code snippets that enable you to perform such analytics.

Resources

Learn

  • Big Data Analytics – Disruptive Technologies for Changing the Game (Arvind Sathi, MC Press, October 2012): In this practitioner's view of big data analytics, discover how big data changes analytics architecture.
  • Information on Geohash codes: Get codes and learn tips and tricks about geocoding.
  • The developerWorks Business analytics topic: Find how-to information, tools, and updates to help you improve outcome and control risk.
  • The developerWorks Big data content area: Learn more about big data. Find technical documentation, how-to articles, education, downloads, product information, and more.
  • Explore some key big data and analytics products:
    • InfoSphere DataExplorer: Develop and deploy enterprise information navigation and 360-degree information applications for heterogeneous data sources and application data repositories with the DataExplorer platform.
    • InfoSphere BigInsights: Check out BigInsights, a platform powered by Apache Hadoop, for the analysis and visualization of Internet-scale data volumes.
    • InfoSphere Streams: Develop and execute applications that process information in data streams with this software platform.
    • SPSS Modeler: Put this modeling tool to work.
    • Cognos Business Intelligence (BI): Learn about this web product with integrated reporting, analysis, scorecarding, and event management features.
    • PureData Systems: Maximize the benefit of integrated systems and patterns with solutions that accelerate deployment and simplify management for cloud, business, and infrastructure applications.

Get products and technologies

  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment.

Discuss

  • Get involved in the developerWorks community. Connect with other developerWorks users while you explore the developer-driven blogs, forums, groups, and wikis.
