Building the digital representation with Digital Twin using AWS stack

By and Sanjay B Panikkar | 6 minute read | September 15, 2021

A Digital Twin is a dynamic, virtual representation of its physical counterpart, usually across multiple stages of its lifecycle. It uses real-world data combined with engineering, simulation or machine learning models to enhance operations and support human decision making. A Digital Twin mirrors a unique physical object, process, organization, person or other abstraction. It can be used to answer what-if questions, present insights in an intuitive way and provide a way to interact with the physical object/twin.

The key elements in a Digital Twin:

While there are many different views of what a digital twin is and is not, the following characteristics are present in nearly every digital twin:

  • Connectivity, created by IoT sensors on the physical product to obtain data and integrate it through various technologies. Alternatively, field information about assets can be captured through drone photography or LIDAR scans, followed by 3D reconstruction using various techniques
  • Digital Thread, a key enabler interconnecting all relevant systems and functional processes
  • Homogenization, which decouples the information from its physical form
  • Re-programmable and smart, enabling a physical product to be reprogrammed manually or automatically
  • Digital traces and modularity, to diagnose the source of a problem

A Digital Twin is not a single-technology play; rather, it is realized through a combination of multiple technologies such as:

  • Visualization, AR/VR: Usually the topmost layer in a digital twin, which combines the data and insights to present, advise and interact with the user or other machines
  • Workflow and APIs: Extracts and shares data from multiple sources when creating the digital twin and/or infuses the insights into the workflows of the digital twin
  • Artificial Intelligence/analytics: Uses machine learning frameworks and analytics to make real-time decisions based on historical and streaming data
  • Knowledge graph: Creates a digital thread based on a semantic model, data dictionary and knowledge graph
  • Internet of Things and Data Platform: Real-time ingestion of data gathered via sensors and gateways from physical assets/objects, covering state, conditions and events. The platform also integrates, persists, transforms and governs the collected data
  • Digital infrastructure: Hybrid infrastructure including cloud, edge compute, in-plant infrastructure, etc.
  • Physical infrastructure: Instrumentation of physical objects via sensors, gateways, network (IT/OT), etc.
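
To make the knowledge graph layer above more concrete, here is a minimal sketch of a digital thread held as subject-predicate-object triples. It is an illustration only and not how any particular product works: the `KnowledgeGraph` class, the asset names such as `pump-17` and the predicate names are all hypothetical.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Tiny in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        # subject -> list of (predicate, object) pairs
        self._edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        """Record one fact, e.g. ("pump-17", "monitored_by", "sensor-vib-02")."""
        self._edges[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """Return everything linked to `subject` through `predicate`."""
        return [o for p, o in self._edges.get(subject, []) if p == predicate]

    def thread(self, start):
        """Walk every entity reachable from `start`, breadth-first."""
        seen, queue, order = {start}, deque([start]), []
        while queue:
            node = queue.popleft()
            order.append(node)
            for _, nxt in self._edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return order


# Hypothetical lifecycle facts about one asset
kg = KnowledgeGraph()
kg.add("pump-17", "designed_in", "cad-model-A3")
kg.add("pump-17", "monitored_by", "sensor-vib-02")
kg.add("sensor-vib-02", "writes_to", "telemetry-stream")
kg.add("pump-17", "maintained_via", "work-order-881")

print(kg.objects("pump-17", "monitored_by"))  # ['sensor-vib-02']
print(kg.thread("pump-17"))
```

Starting from the asset, the walk reaches its design model, its sensor, the stream that sensor writes to and its maintenance record, which is the kind of linkage across lifecycle stages that a digital thread is meant to provide.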

Emerging architecture of Digital Twin, an IBM point of view:

The architecture for a digital twin essentially comprises all the elements described in the section above. The objective is to support the functionality to connect, monitor, predict and simulate multiple physical objects, assets and/or processes.
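
The four functions named above (connect, monitor, predict and simulate) can be sketched in a few lines. This is a toy model under stated assumptions, not a reference implementation: the `PumpTwin` class, the 80 °C limit and the linear-trend prediction are all hypothetical stand-ins for the engineering or machine learning models a real twin would use.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PumpTwin:
    """Hypothetical twin of a pump that tracks bearing temperature in °C."""

    asset_id: str
    limit_c: float = 80.0  # assumed safe operating limit
    readings: list = field(default_factory=list)  # history of telemetry

    def connect(self, temperature_c: float) -> None:
        """Connect: accept one telemetry reading from the physical asset."""
        self.readings.append(temperature_c)

    def monitor(self) -> bool:
        """Monitor: True if a reading exists and the latest is within the limit."""
        return bool(self.readings) and self.readings[-1] <= self.limit_c

    def predict(self, steps_ahead: int = 1) -> float:
        """Predict: extrapolate the average change between consecutive readings."""
        if len(self.readings) < 2:
            raise ValueError("need at least two readings to predict")
        deltas = [b - a for a, b in zip(self.readings, self.readings[1:])]
        return self.readings[-1] + mean(deltas) * steps_ahead

    def simulate(self, extra_load_c: float, steps_ahead: int = 1) -> bool:
        """Simulate: what-if check, does the limit still hold with added heat?"""
        return self.predict(steps_ahead) + extra_load_c <= self.limit_c


twin = PumpTwin("pump-17")
for reading in (70.0, 72.0, 74.0):
    twin.connect(reading)

print(twin.monitor())           # True: 74.0 is within the 80.0 limit
print(twin.predict(2))          # 78.0: trend of +2.0 per step, two steps ahead
print(twin.simulate(5.0, 2))    # False: 78.0 + 5.0 exceeds the limit
```

The what-if question ("would the pump stay within limits under extra load?") is answered against the twin rather than the physical asset, which is the point of the simulate function.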


Figure 1: Digital Twin Logical Architecture


Building Digital Twin with AWS technology stack:

Proposed here is a digital twin that is primarily based on the AWS software stack:

Figure 2: Digital Twin Technology Architecture with AWS


Architecture Layer | Technology | Vendor | Purpose
Data Platform: Ingestion | Kinesis Data Streams, Kinesis Data Firehose, IoT Core | AWS | Ingesting and processing device telemetry securely is one of the key requirements of a Digital Twin. A combination of IoT Core and Kinesis allows connection to heterogeneous data sources over multiple protocols and supports streaming ingestion.
Data Platform: Persistence | DynamoDB, RDS/Aurora, S3, Timestream, Redshift | AWS | Considering the diverse nature of the data handled by a Digital Twin, polyglot storage is recommended: a combination of relational, non-relational, time-series, data lake and warehouse stores.
Data Platform: Integration, Transformation & Quality | Glue, Lambda | AWS | Provides consistency of data through validation, enrichment, cataloguing and transformation conforming to the standard, and integrates with the other systems.
Digital Thread: Knowledge Graph | KITT | IBM | The Knowledge Graph represents a collection of interlinked descriptions of entities (objects, events or concepts). It puts data in context via linking and semantic metadata. The IBM asset KITT is a general-purpose knowledge graph that enables the Digital Thread required to link lifecycle information and data together. KITT is proposed to be deployed on Red Hat OpenShift.
Modelling & Execution: Analytics, Machine Learning | SageMaker, Athena, Kinesis Data Analytics, EMR | AWS | Cloud-based machine learning platform for building, training and deploying models based on the data collected via the data platform. It also provides the ability to perform analytics on streaming data. Athena is proposed as the interactive query service that makes it easy to analyse data from, e.g., S3 using standard SQL. EMR is also proposed for processing and analysing data at scale.
Consumption & Visualization: Dashboard, Apps | QuickSight, Fargate | AWS | For the portal, real-time dashboards and command centres, QuickSight is proposed as the BI service and Fargate for building web apps.
Consumption & Visualization: AR/VR/3D/2D | Sumerian | AWS | AR and VR services are required to visualize diagnostics, predictions and recommendations for the physical world. A digital twin also requires an extension of dashboard visualization that provides a 3D/2D view.
Consumption & Visualization: API & Microservices | API Gateway, Node.js, Spring Boot | AWS | To provide secured access to application APIs and batch files, we propose microservice-based applications built using Node.js/Spring Boot and exposed via AWS API Gateway.
Workflow Mgmt.: Intelligent Workflow | Lambda, AppSync, Simple Workflow Service, Airflow, Red Hat Process Automation | AWS, Apache, Red Hat | Workflow management in a digital twin is intended to deal with business processes, simulation and event-based flows. Besides the AWS stack, we also propose Red Hat Process Automation and Apache Airflow, either of which could be hosted in RHOS on AWS.
Governance & Operations: DataOps | Glue | AWS | DataOps is an essential element of a Digital Twin: it sits atop big data that must be available on time, automated and well managed to extract value. AWS Glue, with its end-to-end capability to discover, prepare and combine data for analytics and machine learning, is the right fit for the purpose.
Governance & Operations: DevOps | CloudFormation, CodeBuild, CodePipeline, CodeDeploy, CodeStar | AWS | AWS CodePipeline helps to build a continuous integration or continuous delivery workflow that uses AWS CodeBuild, AWS CodeDeploy and other tools; each service can also be used separately. With AWS CodeStar, the entire continuous delivery chain can be set up for a scalable DevOps solution.
Governance & Operations: MLOps | SageMaker | AWS | AWS SageMaker as the MLOps layer fulfils the AI@Scale goals, providing the capability to build, train, deploy and maintain machine learning models in production reliably and efficiently.
Governance & Operations: Data Governance | Collibra | Collibra on AWS | The Collibra Data Governance and Catalog solutions help users find, understand and trust the data, ensuring quality and accessibility in a digital twin.
Hybrid Infrastructure: Edge & Cloud | Greengrass, Elastic Kubernetes Service, Elastic Compute Cloud, Red Hat OpenShift on AWS | AWS, Red Hat | Hybrid infrastructure that extends beyond the cloud is a reality in digital. However, the core components will be on the cloud, with the edge as a key component residing outside it. EKS and/or ROSA based containers are the ideal choice for the non-serverless components. Components requiring VM-style infrastructure can be served by EC2 instances.
Security and Monitoring | ECR, IAM, Secrets Manager, CloudWatch | AWS | A Digital Twin has multiple aspects of security that need to be catered for, including identity management, information protection, management of infrastructure secrets and monitoring.
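
The polyglot storage idea in the Persistence row can be illustrated with a small routing function that sends each kind of record to the category of store suited to it. This is a sketch only: the record kinds (`telemetry`, `device_state` and so on) and the `route` helper are hypothetical, and no AWS API is called here.

```python
from enum import Enum


class Store(Enum):
    """Storage services from the Persistence row, by category."""

    TIMESTREAM = "Timestream"  # time-series
    DYNAMODB = "DynamoDB"      # non-relational
    AURORA = "RDS/Aurora"      # relational
    S3 = "S3"                  # data lake
    REDSHIFT = "Redshift"      # warehouse


# Hypothetical record kinds mapped to the store category that suits them
ROUTES = {
    "telemetry": Store.TIMESTREAM,
    "device_state": Store.DYNAMODB,
    "asset_master": Store.AURORA,
    "raw_file": Store.S3,
    "kpi_aggregate": Store.REDSHIFT,
}


def route(record: dict) -> Store:
    """Pick the target store for a record from its 'kind' field."""
    kind = record.get("kind")
    try:
        return ROUTES[kind]
    except KeyError:
        raise ValueError(f"no store configured for record kind {kind!r}") from None


print(route({"kind": "telemetry", "asset": "pump-17", "temp_c": 74.0}).value)  # Timestream
print(route({"kind": "raw_file", "uri": "scan-0001.las"}).value)               # S3
```

Keeping the mapping in one table makes the storage decision explicit and easy to review; in a real deployment the routing would more likely live in the ingestion and transformation layers (for example IoT Core rules, Kinesis or Glue) than in application code.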


A Digital Twin is intended to connect the digital and physical worlds, as it takes IoT, machine learning and virtual reality, in tandem, to the next level. Read more about digital twin here.

Creating an end-to-end digital twin platform requires a lot more than a single set of capabilities: it involves a heterogeneous software and hardware stack and multiple architectures, with data as the principal theme. Read more about data architecture here.

There are specialized software vendors for each layer of the architecture; the flexibility comes from adopting a hybrid approach in which the hyperscaler forms the core of the solution. While the individual customer environment will determine the choice of hyperscaler, the advantage of AWS, with its IoT-centric platform and strong data products, can be leveraged to build powerful digital twin solutions. Refer to the AWS Architecture Center for its diverse architecture knowledge base.