How to use IBM Cloud Code Engine for GitHub traffic analytics.

Some years back, we introduced an IBM Cloud solution tutorial for GitHub traffic analytics based on Cloud Foundry and IBM Cloud Functions. A Cloud Functions action is triggered daily to collect traffic data. The action stores the data in a Db2 database. Users can then analyse the data in a Python Flask app served by Cloud Foundry.

Today, the same solution scenario and app are still available, but they are served by IBM Cloud Code Engine. Code Engine is a fully managed, serverless platform that runs your containerized workloads, including web apps, microservices, event-driven functions, or batch jobs. The slightly renamed tutorial, “Serverless web app and eventing for data retrieval and analytics”, demonstrates how the existing app can be containerized, served as a web app, and used to process the daily data collection event:

Solution architecture for serverless web app and eventing.
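
To illustrate the dual role of the containerized app, here is a minimal sketch of a single process that both serves a web page and handles the daily collection event. It is not the tutorial's Flask code. It uses only the Python standard library, and the route `/collectStats`, the `collect_stats` helper, and the port handling through a `PORT` environment variable with a fallback of 8080 are assumptions made for illustration.

```python
import os
from wsgiref.simple_server import make_server


def collect_stats():
    """Hypothetical placeholder for the daily data collection logic."""
    return "collection started"


def app(environ, start_response):
    """Route web requests and the scheduled event within one WSGI app."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "/")

    if method == "POST" and path == "/collectStats":
        # Assumed event route: a scheduled trigger posts to this path.
        status, body = "200 OK", collect_stats()
    elif method == "GET" and path == "/":
        # Regular web traffic from users of the app.
        status, body = "200 OK", "GitHub traffic analytics"
    else:
        status, body = "404 Not Found", "not found"

    data = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(data))),
    ])
    return [data]


if __name__ == "__main__":
    # Assumption: the platform tells the container which port to listen on.
    port = int(os.environ.get("PORT", "8080"))
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()
```

Because the event arrives as an ordinary HTTP request, the same container image can cover both serving and eventing without a separate function.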

Serverless application

Serverless computing is an approach to computing that offloads responsibility for common infrastructure management tasks (e.g., scaling, scheduling, patching, and provisioning) to cloud providers and tools, allowing engineers to focus their time and effort on the business logic specific to their applications or processes. It is also the computing model used by Code Engine, which makes it a great fit for the solution scenario discussed in the tutorial.

Users can access the web application to look at and analyse GitHub traffic data (see [1] in the architecture diagram above). The app itself relies on the IBM Cloud App ID security service to authenticate users. It also stores the traffic data and other app data in an IBM Db2 on Cloud database (see [2]). Moreover, the app is triggered daily (see [3]) to collect new traffic data from GitHub (see [4]). Since the web application is only needed infrequently and the data collection happens once a day, the solution benefits from the serverless compute model and its automatic up- and downscaling of resources (autoscaling). When the app is not in use, it incurs no costs.
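
The daily collection step (see [4]) could look roughly like the following sketch. It is not the code from the tutorial repository. It assumes the GitHub REST API endpoint for repository traffic views and an access token with sufficient permissions on the repository. The function names and the row layout of `(day, count, uniques)` are hypothetical.

```python
import json
import urllib.request
from datetime import datetime

GITHUB_API = "https://api.github.com"


def build_views_request(owner, repo, token):
    """Build the request for the traffic views of one repository."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/traffic/views"
    return urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })


def parse_views(payload):
    """Turn the JSON payload into (day, count, uniques) rows."""
    rows = []
    for entry in payload.get("views", []):
        # Timestamps are expected in the form 2023-01-01T00:00:00Z.
        day = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%SZ").date()
        rows.append((day.isoformat(), entry["count"], entry["uniques"]))
    return rows


def fetch_daily_views(owner, repo, token):
    """Fetch and parse the views; requires network access and a valid token."""
    request = build_views_request(owner, repo, token)
    with urllib.request.urlopen(request) as response:
        return parse_views(json.load(response))
```

Rows in this shape could then be written to the database, keyed by repository and day, so that running the collection again does not duplicate data.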

Containerized solution

The code for this tutorial is available in a GitHub repository. In the tutorial, we set up a Code Engine project with an app. We configure the app to be built as a container image from the code on GitHub. The container image is stored in the IBM Cloud Container Registry. From there, it is fetched by Code Engine and used to deploy a revision of our app.

Only a few steps are needed to set up everything and deploy the app and its required services. Once done, the traffic data is collected daily and you can start analyzing it in the app, either as data tables or with a chart like the one below:

Line chart with daily views for GitHub repositories.

Summary

The solution for analysing GitHub traffic statistics is a good fit for IBM Cloud Code Engine. It combines a scheduled daily collection of traffic data (eventing) with a web app that is used only infrequently (serving). Thus, it benefits from the serverless compute model and its automatic up- and downscaling of resources (autoscaling). When the app is not in use, it incurs no costs. Building and deploying the container directly from the Git repository makes the solution easy to use.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.
