How to use IBM Cloud Code Engine for GitHub traffic analytics.
Some years back, we introduced an IBM Cloud solution tutorial for GitHub traffic analytics based on Cloud Foundry and IBM Cloud Functions. A Cloud Functions action is triggered daily to collect traffic data. The action stores the data in a Db2 database. Users can then analyse the data in a Python Flask app served by Cloud Foundry.
Today, that same solution scenario and app are still available, but they are served by IBM Cloud Code Engine. Code Engine is a fully managed, serverless platform that runs your containerized workloads, including web apps, microservices, event-driven functions or batch jobs. The slightly renamed tutorial, “Serverless web app and eventing for data retrieval and analytics”, demonstrates how the existing app can be containerized and both served as a web app and used to process the daily data collection event:
Serverless application
Serverless computing is an approach to computing that offloads responsibility for common infrastructure management tasks (e.g., scaling, scheduling, patching, provisioning, etc.) to cloud providers and tools, allowing engineers to focus their time and effort on the business logic specific to their applications or process. It is also the computing model utilized by Code Engine. With that, it is a great fit for the solution scenario discussed in the tutorial.
Users can access the web application to look at and analyse GitHub traffic data (see [1] in the architecture diagram above). The app itself relies on the IBM Cloud App ID security service to authenticate users. It also stores the traffic data and other app data in an IBM Db2 on Cloud database (see [2]). Moreover, the app is triggered daily (see [3]) to collect new traffic data from GitHub (see [4]). Since the web application is only needed infrequently and the data collection happens once a day, the solution benefits from the serverless compute model and its automatic up- and downscaling of resources (autoscaling). When the app is not in use, there are no costs.
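To make the daily collection step ([3] and [4]) more concrete, here is a minimal sketch of how traffic data could be fetched from the GitHub REST API and flattened into rows for a database table. It is not the tutorial's actual code. The function names, the row layout and the use of `urllib` are assumptions for illustration. The endpoint `GET /repos/{owner}/{repo}/traffic/views` is part of the GitHub REST API and requires a token with push access to the repository.

```python
import json
import urllib.request
from datetime import datetime

GITHUB_API = "https://api.github.com"


def traffic_views_request(owner, repo, token):
    """Build the request for the daily page-view statistics of one repository."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/traffic/views"
    return urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })


def views_to_rows(owner, repo, payload):
    """Flatten the API response into (repository, day, views, unique visitors) tuples.

    The row layout is hypothetical; a real table schema may differ.
    """
    rows = []
    for entry in payload.get("views", []):
        # GitHub reports one entry per day with a UTC timestamp.
        day = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%SZ").date()
        rows.append((f"{owner}/{repo}", day.isoformat(),
                     entry["count"], entry["uniques"]))
    return rows


def collect(owner, repo, token):
    """Fetch the current traffic views and return rows ready to be stored."""
    request = traffic_views_request(owner, repo, token)
    with urllib.request.urlopen(request) as response:
        return views_to_rows(owner, repo, json.load(response))
```

In a setup like the one described here, the daily event would call something like `collect(...)` for each tracked repository and write the resulting rows to the database. Because GitHub only keeps traffic data for a limited period, collecting it regularly is what makes longer-term analysis possible.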
Containerized solution
The code for this tutorial is available in a GitHub repository. In the tutorial, we set up a Code Engine project with an app. We configure the app to be built as a container image from the code on GitHub. The container image is stored in the IBM Cloud Container Registry. From there, it is fetched by Code Engine and used to deploy a revision of our app.
Only a few steps are needed to set up everything and deploy the app and its required services. Once done, the traffic data is collected daily and you can start analyzing it in the app, either as data tables or with a chart like the one below:
Summary
The solution for analysing GitHub traffic statistics is a good fit for IBM Cloud Code Engine. It combines a scheduled daily collection of traffic data (eventing) with a web app that is used only infrequently (serving). Thus, it benefits from the serverless compute model and its automatic up- and downscaling of resources (autoscaling). When the app is not in use, there are no costs. Deploying the container directly from the Git repository makes the solution easy to use.
- Get started with IBM Cloud Code Engine (which is based on technologies like Knative, Istio and Tekton).
- Learn about Text analysis with Code Engine.
- Check out the other IBM Cloud tutorials.
If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.