Cloud-Nanny needed an architecture that could check hundreds of thousands of web requests and decide whether to allow or block each one, without noticeable impact on the end-user’s browsing experience. It targeted a processing time of no more than 40 microseconds to look up a site in its database and return a decision.
Martijn Rooks, CEO of Cloud-Nanny, comments: “IBM Db2® on Cloud was an ideal solution for quickly checking requests against our database of blacklisted and whitelisted sites – it’s very fast at performing this kind of query, and as a cloud-based database platform it can scale easily. Best of all, IBM provides it as a managed service, which means we can focus on developing our solution, instead of spending time on low-level database administration tasks.”
Looking up sites in a database is simple enough, but what happens if a child tries to access a site that isn’t already in the database? That’s where the intelligent part of the solution kicks in. Cloud-Nanny used a large collection of websites to train a model tailored to its needs, with machine learning algorithms running in IBM Analytics for Apache® Spark™. The Spark cluster provides the computing power to build the website classifier, which categorizes content in real time, for example as a gaming site, a video site, or a site that contains adult material.
The solution then compares the result with the family’s existing profile to check whether the site’s category is permitted or prohibited for the device or user making the request. If the categorization algorithm is very confident about the site’s category, the request is allowed or blocked accordingly. If it is less certain about the classification, it can alert the parents and ask them to make a judgment call. The parents’ decision is then fed back into the model, helping it learn and improve over time.
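The decision flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cloud-Nanny’s actual implementation: the names `Decision`, `FamilyProfile`, `decide` and `handle_request`, the shape of the profile, and the 0.9 confidence threshold are all hypothetical, and the classifier is passed in as a plain function rather than a real Spark model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, Set, Tuple


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ASK_PARENT = "ask_parent"


@dataclass
class FamilyProfile:
    # Hypothetical profile: categories the parents have prohibited
    # for one particular user or device.
    blocked_categories: Set[str] = field(default_factory=set)


def decide(category: str, confidence: float, profile: FamilyProfile,
           threshold: float = 0.9) -> Decision:
    """Turn a classifier result into a decision (threshold is an assumed value)."""
    if confidence < threshold:
        # The classifier is unsure, so the parents make the call.
        return Decision.ASK_PARENT
    if category in profile.blocked_categories:
        return Decision.BLOCK
    return Decision.ALLOW


def handle_request(site: str,
                   known_sites: Dict[str, Decision],
                   classify: Callable[[str], Tuple[str, float]],
                   profile: FamilyProfile) -> Decision:
    """Check the known-sites table first; classify only unknown sites."""
    if site in known_sites:
        return known_sites[site]
    category, confidence = classify(site)
    return decide(category, confidence, profile)


def record_parent_choice(site: str, allowed: bool,
                         known_sites: Dict[str, Decision]) -> None:
    """Store the parents' answer so the next request is a plain lookup.

    In the real service this feedback would also flow back into model
    training; that step is left out of this sketch.
    """
    known_sites[site] = Decision.ALLOW if allowed else Decision.BLOCK
```

In this sketch, a site the classifier labels as adult material with high confidence is blocked for a profile that prohibits that category, while a low-confidence result is returned as `ASK_PARENT` until a parent answers and `record_parent_choice` stores the outcome.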
“The intelligent part of the solution is that it is built around the idea that Internet safety isn’t a black-or-white issue—there are lots of gray areas, and different parents will have different views on what is or isn’t acceptable for each of their children,” says Martijn Rooks. “Moreover, those views will likely change over time—sites that aren’t appropriate for a 10-year-old might be fine for a 14-year-old. Machine learning with Spark is so powerful, because it means our solution can adapt and evolve along with the needs of the family.”
Cloud-Nanny was able to take the solution from initial proof-of-concept through to a production-ready service in just 14 months. The company credits this rapid development cycle to its decision to build the solution on IBM Bluemix®.
“When we built the initial proof-of-concept for the Cloud-Nanny product, we used another hosting provider,” says Martijn Rooks. “It took us two months just to get the infrastructure set up and configured, before we could even begin the real development work. With IBM Cloud™, we were able to get up and running almost immediately. Once you have learned how the platform works, and how easy it is to bring different services together, you can put together a basic app in a couple of days.
“Building a product and bringing it to market in 14 months from end to end is something that would have been almost unthinkable a few years ago—and with such an advanced project, using state-of-the-art technologies like Spark, it’s especially impressive. In total, we estimate that getting a project up and running with Bluemix is at least 50 percent faster than with a more traditional software development environment.”