Date: 2021-07-14

CodeCommit

AWS CodeCommit takes the place of a conventional Git repository: it is the central location where all of a project's code is stored.

CodeDeploy/CodeBuild

CodeBuild runs all unit and integration tests and builds a tarball from the specified Python sources, which can later be deployed into a Docker container. CodeDeploy executes a specified deployment scenario, which will, for example, build the Docker image, push it to a Docker image repository, and finally load the image in a production setting.
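The deployment scenario described above can be sketched as a sequence of Docker CLI invocations. This is a minimal illustration only: the function name `deployment_commands`, the image name, and the registry address are hypothetical, and a real CodeDeploy setup would define these steps in its own configuration rather than in Python.

```python
from typing import List


def deployment_commands(image: str, tag: str, registry: str) -> List[List[str]]:
    """Return the Docker CLI commands for a build, tag, and push scenario.

    All argument values are placeholders chosen for illustration.
    """
    local_ref = f"{image}:{tag}"
    remote_ref = f"{registry}/{image}:{tag}"
    return [
        # Build the image from the Dockerfile in the current directory.
        ["docker", "build", "-t", local_ref, "."],
        # Give the image a name that points at the target registry.
        ["docker", "tag", local_ref, remote_ref],
        # Upload the image to the registry.
        ["docker", "push", remote_ref],
    ]


if __name__ == "__main__":
    for command in deployment_commands("trainer", "1.0", "registry.example.com"):
        print(" ".join(command))
```

Returning the commands as lists rather than running them keeps the sketch free of side effects; each list could be passed to `subprocess.run` on a machine with Docker installed.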

AWS ECR

AWS ECR functions as the repository for all Docker images built in the pipeline described above. It acts as a repository for images just as CodeCommit acts as a repository for config files and source code. This is where AWS SageMaker looks for a specified Docker image when a training job is triggered from outside with the respective parameters.
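An image stored in ECR is addressed by a URI made up of the AWS account ID, the region, the repository name, and a tag. A small helper that assembles such a URI might look as follows; the function name `ecr_image_uri` and all example values (account ID, region, repository) are placeholders, not taken from the project described here.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Assemble the URI under which an image in ECR is addressed."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(ecr_image_uri("123456789012", "eu-central-1", "training-images", "1.0"))
```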

AWS SageMaker

AWS SageMaker acts as the runtime environment for all training jobs. It can be triggered via an API or its Python bindings. The user specifies what kind of model is to be run and where the respective input and output data are located. AWS SageMaker accepts Docker images with a predefined entry point containing the training code; it is also possible to run a TensorFlow-, MXNet-, or ONNX-defined job there. SageMaker offers a user interface for administration and, as a managed service, scales elastically, so the user can choose from a wide variety of machine types on which to train a specific model. AWS SageMaker can also perform hyperparameter tuning, which can likewise be triggered via the API; the tool then automatically selects the best-performing combination of hyperparameters. The results of a run can be written directly to S3 or even DynamoDB.
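To make the triggering step concrete, the following sketch assembles the request for a training job and, separately, submits it with the boto3 SageMaker client's `create_training_job` call. The helper names, the channel name `training`, the instance type, the volume size, and the runtime limit are assumptions chosen for illustration; the image URI, role ARN, and S3 locations have to come from the actual environment.

```python
from typing import Any, Dict


def training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    input_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.m5.xlarge",  # assumed default, pick per workload
) -> Dict[str, Any]:
    """Build the keyword arguments for SageMaker's create_training_job."""
    return {
        "TrainingJobName": job_name,
        # The Docker image whose entry point contains the training code.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Where the input data is located.
        "InputDataConfig": [
            {
                "ChannelName": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        # Where the results of the run are written.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # The machine(s) used for training.
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit_training_job(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("sagemaker").create_training_job(**request)
```

Keeping the request builder separate from the submission makes it possible to inspect or unit-test the parameters without an AWS account.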

AWS S3

AWS S3 acts as the basic file store for input and output files. S3 is usually used to store large training data files, and it can also hold serialized models. AWS S3 integrates seamlessly with SageMaker.
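Input and output locations in S3 are commonly written as `s3://bucket/key` URIs, while the boto3 S3 client expects bucket and key as separate arguments. The sketch below splits such a URI and uploads a local file with the client's `upload_file` method; the helper names and the example bucket and paths are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload(local_path: str, s3_uri: str) -> None:
    """Upload a local file; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, key = split_s3_uri(s3_uri)
    boto3.client("s3").upload_file(local_path, bucket, key)


if __name__ == "__main__":
    # Placeholder URI for illustration only.
    print(split_s3_uri("s3://example-bucket/training/data.csv"))
```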

AWS DynamoDB

AWS DynamoDB is a key-value NoSQL database that is fully managed by AWS and can be scaled on demand. The database can hold the KPIs from a model run, for example to track model performance over time. It is also used to store runtime information and performance metadata for a model run. DynamoDB data can also be visualized in QuickSight, the data visualization tool offered by AWS.
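Storing the KPIs of a model run could look roughly like the sketch below. The table name `model_runs`, the key attributes `model_name` and `run_timestamp`, and the helper names are assumptions made for illustration. One detail worth noting is that the boto3 DynamoDB resource does not accept Python floats, so numeric KPIs are converted to `Decimal` first.

```python
from decimal import Decimal
from typing import Any, Dict


def kpi_item(model_name: str, run_timestamp: str, kpis: Dict[str, float]) -> Dict[str, Any]:
    """Build a DynamoDB item holding the KPIs of one model run."""
    item: Dict[str, Any] = {
        "model_name": model_name,        # assumed partition key
        "run_timestamp": run_timestamp,  # assumed sort key
    }
    for name, value in kpis.items():
        # Convert via str() so 0.93 is stored as 0.93, not a long binary fraction.
        item[name] = Decimal(str(value))
    return item


def store_kpis(item: Dict[str, Any], table_name: str = "model_runs") -> None:
    """Write the item; needs boto3, credentials, and an existing table."""
    import boto3  # imported lazily so kpi_item works without boto3

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)


if __name__ == "__main__":
    print(kpi_item("churn-model", "2021-07-14T12:00:00Z", {"accuracy": 0.93}))
```

With the timestamp as sort key, all runs of one model can be read back in chronological order, which matches the goal of tracking performance over time.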

AWS Elastic Inference

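AWS Elastic Inference attaches GPU-powered inference acceleration to EC2 or SageMaker instances; informally, the result is an EC2 instance on steroids. Models trained in AWS SageMaker can be hosted on an EI-accelerated instance for prediction. The underlying machine(s) can be scaled on demand.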
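When a model is hosted through SageMaker, an Elastic Inference accelerator is requested in the endpoint configuration via the `AcceleratorType` field of a production variant. The sketch below builds such a configuration for boto3's `create_endpoint_config`; the helper names, the variant name, and the default instance and accelerator types are illustrative assumptions.

```python
from typing import Any, Dict


def endpoint_config_request(
    config_name: str,
    model_name: str,
    instance_type: str = "ml.m5.large",       # assumed host instance
    accelerator_type: str = "ml.eia2.medium",  # assumed EI accelerator size
) -> Dict[str, Any]:
    """Build the keyword arguments for SageMaker's create_endpoint_config."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
                # Attaches the Elastic Inference accelerator to the host instance.
                "AcceleratorType": accelerator_type,
            }
        ],
    }


def create_endpoint_config(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the configuration; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("sagemaker").create_endpoint_config(**request)
```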

Developing trustworthy AI

The ethics question is not just a modelling problem but a business problem. 60% of companies see compliance as a barrier to achieving success in applying AI, in part due to a lack of trust in and understanding of the system. IBM designed a three-pronged approach to nurture trust, transparency, and fairness, so that AI can be run, maintained, and scaled consistently while maintaining trust and reducing brand and reputation risk. IBM can assist the client with the culture needed to adopt and safely scale AI, with AI engineering through forensic tools that see inside black-box algorithms, and with the governance to make sure the engineering sticks to the culture. At the center of trustworthy AI is telemetry and forensic tooling, an area in which IBM sees itself as a leader through its open-source and Linux® Foundation work.

IBM Services for AI at Scale is framed around the IBM Research open-source toolkit AI Fairness 360 and around fact sheets. Developers are able to share and receive state-of-the-art code and data sets related to AI bias detection and mitigation. These IBM Research efforts also led IBM to integrate IBM Watson® OpenScale™, a commercial offering designed to build AI-based solutions for enterprises to detect, manage, and mitigate AI bias.
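To give a sense of what bias detection measures, the sketch below computes two common group-fairness metrics by hand: statistical parity difference and disparate impact. Toolkits such as AI Fairness 360 report metrics of this kind, but the code here does not use that library; the function names and the toy data are made up for illustration.

```python
from typing import Sequence


def positive_rate(labels: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Share of favorable outcomes (label 1) within one group."""
    selected = [y for y, g in zip(labels, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(labels, groups, unprivileged, privileged) -> float:
    """Rate(unprivileged) minus rate(privileged); 0 means parity."""
    return positive_rate(labels, groups, unprivileged) - positive_rate(labels, groups, privileged)


def disparate_impact(labels, groups, unprivileged, privileged) -> float:
    """Rate(unprivileged) divided by rate(privileged); 1 means parity."""
    privileged_rate = positive_rate(labels, groups, privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return positive_rate(labels, groups, unprivileged) / privileged_rate


if __name__ == "__main__":
    # Toy predictions: group "a" receives 3 of 4 favorable outcomes, group "b" 1 of 4.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(statistical_parity_difference(labels, groups, "b", "a"))
    print(disparate_impact(labels, groups, "b", "a"))
```

In the toy data the unprivileged group "b" receives favorable outcomes at a rate 0.5 lower than group "a", which is the kind of gap a mitigation step would then try to reduce.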