Predictive Cloud Computing for professional golf and tennis, Part 5

Continuous integration and deployment

Content series:

This content is part # of 8 in the series: Predictive Cloud Computing for professional golf and tennis

Stay tuned for additional content in this series.

In this tutorial, we provide an overview of our continuous integration and continuous deployment architecture designed and implemented for PCC. We discuss the use of Jenkins continuous integration (CI) and IBM UrbanCode™ Deploy, showing how our unit and integration testing has been implemented in Jenkins CI and describing the connection between Jenkins and IBM UrbanCode Deploy. We also show how IBM UrbanCode can automatically deploy to test environments and enable developers to push changes to pre-production and production environments without manual steps or elevated user privileges on systems.

Figure 1. Continuous integration/delivery architecture
Chart showing continuous integration/delivery architecture

Jenkins continuous integration

Continuous integration is the practice of automatically detecting source code changes, through either push or pull mechanisms, and the subsequent building of that source code into artifacts that can be automatically tested. CI systems promote testing and the propagation of small changes by offloading and automating the build and test procedures. Our PCC has used CI systems extensively, permitting developers to check in changes to source control so that the code is automatically tested and deployed into test environments.

Jenkins executes builds and tests through the use of job definitions. Job definitions contain a variety of information, including but not limited to source locations, polling intervals, plugins used, build steps, and conditions. The components of a build can be mixed and configured within any combination that allows flexibility to match any build and testing goal. Jenkins provides many project types, including freestyle, Maven, workflow, and multi-configuration. For PCC, we used a mixture of freestyle and Maven-type projects.

While Jenkins provides numerous built-in functions for continuous integration, the system can be extensively customized through plugins. PCC makes extensive use of this plugin ecosystem; plugins used by the project include the Maven Integration Plugin, Maven Repository Server Plugin, Artifactory Plugin, Conditional BuildStep Plugin, and the UrbanCode Deploy Plugin.

Figure 2 depicts the 15 Jenkins jobs used to provide continuous integration services for PCC. In general, each job represents a distinct function for PCC that may require separate compilation, testing, or packaging. The colored circles on the left represent the state of the last build where blue is success, red is failure, and yellow is unstable. The weather icons next to the circles indicate the health of the most recent build.

Table 1 lists all the projects in Jenkins and their respective functions within PCC.

Table 1. Projects in Jenkins
Project: Function
BigEngine: Performs the multithreaded job runs for big data, establishes the endpoints for RESTful services, and holds the configuration for WebSphere® Liberty Profile
BigEngine-Integration: Performs integration testing for the BigEngine project
Config: Parses application, tennis, golf, and overall tournament configurations
FactorsJar: Feature extractor algorithms from golf and tennis simulations
Http-Proxy-Servlet: Servlet to proxy traffic from BigEngine to the web front end
Predictive Cloud Database: Project to build Liquibase changes
RacketStream: Project to extract sentiment analysis in InfoSphere Streams
scripts: Collection of scripts used to manage BigEngine
Shared: Elements shared between PCC projects
Shared-Integration: Integration testing for shared projects
StreamsLogAggregator: InfoSphere® Streams project to analyze web logs in real time
Twitter Analysis: InfoSphere Streams project to perform analysis on tweet volumes
TwitterExporter: InfoSphere Streams project to export Twitter feeds to other InfoSphere Streams processes
TwitterFeed: InfoSphere Streams project to read from the Twitter API
WebLogAnalyzer: Java™ program to analyze web log content and extract player mentions

The BigEngine job in Figure 2 is the primary job for PCC. This job builds the analytic and decision engine used by PCC to forecast site traffic. The BigEngine-Integration job performs the time-consuming integration testing for BigEngine and is segregated from the BigEngine job to speed up delivery time for small changes. The Twitter Analysis, TwitterExporter, and StreamsLogAggregator jobs create artifacts for deployment within InfoSphere Streams. Additional projects, such as WebLogAnalyzer, provide big data functions that support PCC.

Figure 2. The Jenkins continuous integration jobs for PCC
Chart showing Jenkins continuous integration jobs for PCC

Figure 3 contains the high-level configuration directives for the BigEngine project. The main element to note is that Jenkins is configured to discard old build information by rotating off any builds older than the most recent 300. The retained builds and their logs are useful for tracking down exactly how and why a build fails. Additionally, this build is restricted to run only on the Jenkins master node. Other builds, such as the InfoSphere Streams-related builds, are executed only on Jenkins clients that support Streams.

Figure 3. BigEngine job configuration in Jenkins
Screen capture showing BigEngine job configuration in Jenkins

The source code management configuration for BigEngine, shown in Figure 4, describes how the build system accesses and discovers changes within the source code repository. PCC used Git as its source code repository. Access to an internal Git server, including the credentials needed to read from and write to that repository, is defined within the source code management section of a build. The branch specifier names the specific branch to build, such as the Wimbledon branch. For each sporting event, a dedicated branch enabled the team to manage tournament-specific configurations. Finally, the job was configured to trigger only when specific files or subdirectories change: in this configuration, a build is triggered by any change in the source code (BigEngine/src/*) or in the Maven build description (BigEngine/pom.xml).
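For illustration only, the Git settings described above would appear roughly as follows in the job's on-disk config.xml; the repository URL, credentials ID, and branch name here are hypothetical placeholders, and the element layout follows the standard Jenkins Git plugin format.

```xml
<!-- Hypothetical sketch; URL, credentials, and branch are placeholders -->
<scm class="hudson.plugins.git.GitSCM">
  <userRemoteConfigs>
    <hudson.plugins.git.UserRemoteConfig>
      <url>ssh://git.example.com/pcc.git</url>
      <credentialsId>pcc-git-credentials</credentialsId>
    </hudson.plugins.git.UserRemoteConfig>
  </userRemoteConfigs>
  <branches>
    <hudson.plugins.git.BranchSpec>
      <name>*/wimbledon</name> <!-- tournament-specific branch -->
    </hudson.plugins.git.BranchSpec>
  </branches>
  <extensions>
    <!-- build only when these paths change -->
    <hudson.plugins.git.extensions.impl.PathRestriction>
      <includedRegions>BigEngine/src/.*
BigEngine/pom.xml</includedRegions>
    </hudson.plugins.git.extensions.impl.PathRestriction>
  </extensions>
</scm>
```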

Figure 4. BigEngine source code management section of Jenkins job
Screen capture showing BigEngine source code management section of Jenkins job

Build triggers for BigEngine are shown in Figure 5. Build triggers describe how and when Jenkins starts a build job to produce deployment artifacts for PCC. Two build triggers are defined for this job. The SNAPSHOT dependency trigger causes Jenkins to inspect the project's Project Object Model (POM) to determine whether any of the project's dependencies are built on the same Jenkins server; Jenkins then automatically sets up a dependency relationship so that any change in a dependency triggers a downstream build, which assists with continuous integration. Additionally, the job is configured to poll the source code server every three minutes (H/3) to discover changes that warrant a build.
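The polling interval uses Jenkins cron syntax: in the schedule `H/3 * * * *`, the `H` token is a hash of the job name, which spreads jobs polling at the same frequency across the interval instead of having them all fire at once. As a sketch following the standard config.xml layout, the polling trigger looks like:

```xml
<triggers>
  <hudson.triggers.SCMTrigger>
    <!-- poll the Git server roughly every three minutes -->
    <spec>H/3 * * * *</spec>
  </hudson.triggers.SCMTrigger>
</triggers>
```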

Figure 5. Build triggers for Jenkins job
Screen capture showing build triggers

The build environment definition for BigEngine is shown in Figure 6. Build environment options control many aspects of how the build is completed. For example, the build workspace is deleted before each BigEngine build starts, ensuring that every build starts from a clean slate. A timestamp is added to the output of each build to help trace and track it. An upstream Maven repository is defined to assist with artifact resolution. Finally, an Artifactory server is defined to resolve private artifacts needed during a build.

Figure 6. Build environment for Jenkins job
Screen capture showing build environment

Figure 7 contains the pre-build, build, and post-build steps for BigEngine. The pre-build step uses Maven to clean all projects. The build step defines the Maven POM and goals to use for building. BigEngine defines several options that customize a build. The option -T 8 tells Maven to build with eight threads, which speeds up the build process. Option -X instructs Maven to provide debug output useful for analyzing build failures. The -U option makes Maven check remote repositories for updated snapshots to ensure the build uses the latest artifacts. The -Dskip.integration.tests=true option instructs Maven to skip integration tests because those are handled by a different job. The clean and install goals instruct Maven to clean the build setup and then perform all phases of the build lifecycle up to install: validate, compile, test, package, verify, and install into the local repository. Within the final post-build step, Jenkins copies over the server.xml for deployment on WebSphere Liberty.
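Taken together, the build step corresponds roughly to the following Maven invocation; the POM path is an assumption for illustration.

```shell
# Approximate equivalent of the Jenkins build step (POM path is assumed).
# -T 8 : build with eight threads
# -X   : verbose debug output for diagnosing build failures
# -U   : force snapshot updates from remote repositories
# -Dskip.integration.tests=true : integration tests run in the separate job
mvn -f BigEngine/pom.xml -T 8 -X -U -Dskip.integration.tests=true clean install
```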

Figure 7. Build steps for Jenkins job
Screen capture showing build steps for Jenkins job

Figure 8 depicts part of the configuration for UrbanCode Deploy. This configuration enables the artifacts built in Jenkins to be sent to UrbanCode Deploy for deployment. The target server for deployment is selected from a drop-down menu, such as a production server. A username and password are configured to authenticate with UrbanCode Deploy. The built artifacts are mapped to a component in UrbanCode Deploy, such as the Forecast Engine. The Base Artifact Directory defines where the Jenkins plugin for UrbanCode Deploy should look for uploadable artifacts, and the Directory Offset is an offset from the Base Artifact Directory that further refines which files are uploaded. The version is used as the component version in UrbanCode Deploy so that the BigEngine version matches the Jenkins build number; it is set automatically by Jenkins on each build. The include directive lists the artifacts from the build process that should be sent to UrbanCode Deploy for each version of a component.
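These fields map onto plugin settings roughly as follows; every value shown here is a hypothetical placeholder, not the project's actual configuration.

```
# Hypothetical values for the UrbanCode Deploy publish step
component             = Forecast Engine
baseArtifactDirectory = BigEngine/target
directoryOffset       = .
version               = ${BUILD_NUMBER}      # matches the Jenkins build number
fileIncludePatterns   = **/*.war, server.xml
```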

Figure 8. UrbanCode Deploy publish configuration for Jenkins job
Screen capture showing UrbanCode Deploy publish configuration

Figure 9 shows the remaining parts of the UrbanCode Deploy Jenkins plugin configuration. For the BigEngine project, the plugin was configured to automatically deploy changes to the PCC test environment whenever a build completes. The deploy checkbox, when enabled, instructs the plugin to request that UrbanCode Deploy execute a deployment process. The application to deploy is defined, along with the UrbanCode Deploy process that orchestrates the deployment. Within minutes, any change in source control is reflected in the test environment.

Figure 9. UrbanCode Deploy continuous deployment configuration for Jenkins job
Screen capture showing UrbanCode Deploy continuous deployment

UrbanCode Deploy

Continuous deployment (CD) is the practice of automatic or automated deployment of build artifacts through one or more deployment environments, such as test, staging, and production. CD systems promote propagating small change sets into production because they offload, automate, and orchestrate tedious deployment procedures. PCC used UrbanCode Deploy extensively as its CD system. Promoting many small change sets rather than a few large ones was advantageous for PCC debugging and change rollback.

UrbanCode Deploy topology has applications at the top level. An application consists of a group of components, component processes, and application processes. The components are versioned artifacts with associated processes that describe how to deploy those artifacts. The application processes describe how to orchestrate a deployment across components. Components are associated with a resource, often a computing device, which describes where to run the component processes for a given application. Application processes can be executed across different deployment environments such as test, staging, and production.

For PCC, UrbanCode Deploy was used to deploy all components of the project, including WebSphere Liberty applications, Hadoop jobs, Streams jobs, and the visualization components.

Figure 10 depicts the six deployment environments configured in UrbanCode Deploy for PCC. The environment marked ECC is the test environment. The pre-production environments were distinct staging environments segregated by cloud region. Likewise, production environments were separated by cloud region, enabling developers to push updates to one cloud region at a time.

Figure 10. UrbanCode Deploy application view for PCC
Screen capture showing UrbanCode Deploy application view for PCC

Figure 11 shows the configuration for the test environment, denoted ECC in this configuration. Every component that can be deployed to production can also be deployed to the test environment. UrbanCode Deploy supports component versioning such that each deployed component version within a specific environment is viewable.

Figure 11. Test environment in UrbanCode Deploy
Screen capture showing test environment in UrbanCode Deploy

One of the production environments, with its corresponding components and versions, is depicted in Figure 12. Production Plex 3 mirrors the configuration of the other production environments. A PCC developer pushed component versions to a single production plex (region) and verified functionality. After verification and a successful system test, the deployment was pushed to the other regions.

Figure 12. Production environment in UCD
Screen capture showing production environment in UCD

Figure 13 shows the high-level application orchestration processes for several deployments. Each distinct process performs a deployment or an action such as a server restart. The PCC user selects a process using the play buttons shown in the corresponding environment views in Figures 10, 11, and 12.

Figure 13. Application processes in UCD
Screen capture showing application processes in UCD

A high-level application process for deploying the Predictive Cloud Forecasting service is shown in Figure 14. First the configuration is deployed, followed by any application server updates that stop the application server. Next, the application server is started, followed by post-processing scripts.

Figure 14. BigEngine deployment process in UCD
Screen capture showing BigEngine deployment process in UCD

Figure 15 shows the high-level application process for deploying the UIMA components. Many different Java virtual machines and analytic engines are deployed in parallel. After each analytic engine has started and joined the deployment, the top-level aggregator engine is deployed. Deployments can be complex yet remain repeatable with UrbanCode Deploy.

Figure 15. UIMA deployment in UrbanCode Deploy
Screen capture showing UIMA deployment in UrbanCode Deploy

The steps executed to update a Liberty component within a component process called "Install Forecast Engine" are depicted in Figure 16. In this process, properties such as the WebSphere Liberty Profile (WLP) installation directory and the WLP user are imported from Chef and used in subsequent automation steps. After the WebSphere Liberty server is stopped, UrbanCode Deploy downloads the server configuration and the application code. Tokens such as ports and security features are then replaced in the server configuration to customize WebSphere Liberty for the deployment environment. Finally, the server is started by another component process.
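As an example of the token replacement step, the downloaded Liberty server.xml might contain environment-specific placeholders like the following; the token names and feature values here are hypothetical, not taken from the project.

```xml
<!-- Hypothetical server.xml fragment before token replacement -->
<server description="PCC Forecast Engine">
  <featureManager>
    <feature>jaxrs-1.1</feature>
    <feature>@SECURITY_FEATURE@</feature> <!-- e.g., appSecurity-2.0 in production -->
  </featureManager>
  <httpEndpoint id="defaultHttpEndpoint"
                httpPort="@HTTP_PORT@"
                httpsPort="@HTTPS_PORT@" />
</server>
```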

Figure 16. WebSphere Liberty component deployment process in UrbanCode Deploy
Screen capture showing Liberty component deployment process in UrbanCode Deploy

Figure 17 depicts the steps executed to update an InfoSphere Streams component. The component process stops the Streams application, creates the necessary application directories (if not previously created), downloads the Streams application as a tarball and unpacks it, downloads the configuration, and restarts the application.
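Those steps can be sketched in shell as follows; all paths, URLs, and job identifiers are placeholders, and the exact streamtool options vary by InfoSphere Streams version.

```shell
# Hypothetical sketch of the Streams component process; paths, URLs,
# and the job ID are placeholders, and streamtool flags vary by version.
APP_DIR=/opt/pcc/streams/twitter-analysis

streamtool canceljob 42                      # stop the running Streams job
mkdir -p "$APP_DIR"                          # create app directories if absent
curl -fsSL -o /tmp/app.tar.gz \
    http://ucd.example.com/artifacts/twitter-analysis.tar.gz
tar -xzf /tmp/app.tar.gz -C "$APP_DIR"       # unpack the application tarball
curl -fsSL -o "$APP_DIR/config.properties" \
    http://ucd.example.com/artifacts/config.properties
streamtool submitjob "$APP_DIR/twitter-analysis.sab"   # restart the application
```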

Figure 17. InfoSphere Streams component deployment process in UrbanCode Deploy
Screen capture showing InfoSphere Streams component deployment process in UrbanCode Deploy

UrbanCode Deploy provides a web-based editor to ease the creation of repeatable and automated deployment processes for complex applications and components of all types. UrbanCode was used in PCC to deploy and manage versions of components across multiple cloud regions and multiple host environments, such as test, staging, and production. UrbanCode's flexibility permitted the management of deployment across a diverse set of component types, including Hadoop jobs, Streams jobs, WebSphere Liberty applications, UIMA applications, database changes, and configuration files.

Conclusion

The combination of Jenkins for CI and UrbanCode Deploy for CD and deployment automation increased the reliability of PCC while reducing the time required to promote changes. Figure 18 shows the savings realized during the 2015 Australian Open by moving this project to a CI/CD environment.

Figure 18. Time savings using CI/CD during 2015 Australian Open
Chart showing time savings

In Part 6, we examine predictive modeling with SPSS Modeler and SPSS Statistics. In addition, we depict the use of Unstructured Information Management Architecture Asynchronous Scaleout (UIMA-AS) for the discovery of feature vectors that predict large spikes in web traffic demand.



ArticleID=1031020
ArticleTitle=Predictive Cloud Computing for professional golf and tennis, Part 5: Continuous integration and deployment
publish-date=03082016