Building a digital representation with a digital twin using the AWS stack

15 September 2021

A digital twin is a dynamic, virtual representation of its physical counterpart, usually across multiple stages of its lifecycle. It uses real-world data combined with engineering, simulation or machine learning models to enhance operations and support human decision making. A digital twin mirrors a unique physical object, process, organization, person or other abstraction and can be used to answer what-if questions, present insights in an intuitive way and provide a way to interact with the physical object/twin.

The key elements in a digital twin:

While there are many different views of what a digital twin is and is not, the following characteristics are present in nearly every digital twin:

  • Connectivity created by IoT sensors on the physical product to obtain data and integrate it through various technologies. Alternatively, in-field information about assets can be captured through drone photography or LIDAR scans, followed by 3D reconstruction using different techniques
  • Digital thread, a key enabler that interconnects all relevant systems and functional processes; homogenization of the data decouples the information from its physical form
  • Re-programmable and smart, enabling a physical product to be reprogrammed manually or automatically
  • Digital traces and modularity, to diagnose the source of a problem

A digital twin is not a single-technology play; rather, it is realized through an amalgamation of multiple technologies such as:

  • Visualization, AR/VR: Usually the topmost layer in a digital twin; it combines data and insights to present to, advise and interact with the user or other machines
  • Workflow and APIs: Extract and share data from multiple sources to create the digital twin and/or infuse the insights into the workflow of the digital twin
  • Artificial intelligence/analytics: Uses machine learning frameworks and analytics to make real-time decisions based on historical and streaming data
  • Knowledge graph: Creates a digital thread based on a semantic model, data dictionary and knowledge graph
  • Internet of Things and data platform: Ingests real-time data about state, conditions and events, gathered through sensors and gateways from physical assets/objects, and integrates, persists, transforms and governs the collected data
  • Digital infrastructure: Hybrid infrastructure including cloud, edge compute, in-plant infrastructure, and so on
  • Physical infrastructure: Instrumentation of physical objects through sensors, gateways, network (IT/OT), and so on
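
To make the knowledge graph layer more concrete, here is a minimal sketch of a digital thread held as subject-predicate-object links in memory. It is an illustration only, not how any particular product works; the class name `DigitalThread`, the asset identifiers and the predicate names are all hypothetical.

```python
from collections import defaultdict


class DigitalThread:
    """Tiny in-memory store of (subject, predicate, object) links."""

    def __init__(self):
        # subject -> list of (predicate, object) pairs, in insertion order
        self._links = defaultdict(list)

    def link(self, subject, predicate, obj):
        """Record that `subject` is related to `obj` through `predicate`."""
        self._links[subject].append((predicate, obj))

    def related(self, subject, predicate):
        """Return the objects linked to `subject` by `predicate`."""
        return [o for p, o in self._links.get(subject, []) if p == predicate]

    def trace(self, start):
        """Return every entity reachable from `start`, breadth-first."""
        seen, queue = [start], [start]
        while queue:
            node = queue.pop(0)
            for _, obj in self._links.get(node, []):
                if obj not in seen:
                    seen.append(obj)
                    queue.append(obj)
        return seen


# Hypothetical lifecycle data for one asset.
thread = DigitalThread()
thread.link("pump-17", "designed_in", "cad-model-A3")
thread.link("pump-17", "has_sensor", "vibration-sensor-4")
thread.link("vibration-sensor-4", "streams_to", "telemetry-stream")
thread.link("pump-17", "maintained_by", "work-order-981")

print(thread.related("pump-17", "has_sensor"))  # ['vibration-sensor-4']
print(thread.trace("pump-17"))
```

Following the links from one asset gathers its design, sensor, telemetry and maintenance records in one pass, which is the "data in context" idea behind a digital thread. A production graph would add a semantic model and persistent storage.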

Emerging architecture of digital twin, an IBM point of view:

The architecture for a digital twin essentially comprises all the elements described in the section above. The objective is to support the functionality to connect, monitor, predict and simulate multiple physical objects, assets and/or processes.
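
The four functions named above can be pictured with a toy example. The sketch below assumes a single hypothetical pump with one temperature signal; the class `PumpTwin`, the 80 °C threshold and the linear extrapolation are illustrative assumptions, standing in for the engineering or machine learning models a real twin would use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PumpTwin:
    """Toy twin of one hypothetical pump, tracking a single temperature signal."""
    asset_id: str
    max_temp_c: float = 80.0  # assumed alert threshold
    readings: List[float] = field(default_factory=list)

    def ingest(self, temp_c: float) -> None:
        """Connect: record the latest reading from the physical asset."""
        self.readings.append(temp_c)

    def is_overheating(self) -> bool:
        """Monitor: compare the latest reading with the threshold."""
        return bool(self.readings) and self.readings[-1] > self.max_temp_c

    def predict_next(self) -> float:
        """Predict: extrapolate linearly from the last two readings."""
        if len(self.readings) < 2:
            raise ValueError("need at least two readings to extrapolate")
        last, previous = self.readings[-1], self.readings[-2]
        return last + (last - previous)

    def simulate(self, rise_per_step: float, steps: int) -> List[float]:
        """Simulate: project a what-if temperature rise without changing state."""
        if not self.readings:
            raise ValueError("need at least one reading to simulate from")
        start = self.readings[-1]
        return [start + rise_per_step * i for i in range(1, steps + 1)]


twin = PumpTwin("pump-17")
for temp in (70.0, 74.0):
    twin.ingest(temp)

print(twin.is_overheating())   # False
print(twin.predict_next())     # 78.0
print(twin.simulate(4.0, 3))   # [78.0, 82.0, 86.0]
```

The simulation answers a what-if question ("what if the temperature keeps rising by 4 °C per step?") without touching the recorded state, so the twin continues to mirror the physical asset.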

Figure 1: Digital twin logical architecture

Building a digital twin with the AWS technology stack:

Proposed here is a digital twin that is primarily based on the AWS software stack:

  • Data platform ingestion: Kinesis Data Streams, Kinesis Data Firehose, IoT Core (AWS). Ingesting and processing device telemetry securely is one of the key requirements in digital twins. A combination of IoT Core and Kinesis allows connection to heterogeneous data sources over multiple protocols and supports streaming ingestion.
  • Data platform persistence: DynamoDB, RDS/Aurora, S3, Timestream, Redshift (AWS). Considering the diverse nature of the data that a digital twin handles, polyglot storage is recommended. Thus a combination of relational, non-relational, time-series, data lake and warehouse storage is required.
  • Data platform integration, transformation and quality: Glue, Lambda (AWS). Provides data consistency through validation, enrichment, cataloguing and transformation to conform to the standard, and integrates with other systems.
  • Digital thread knowledge graph: KITT (IBM). The knowledge graph represents a collection of interlinked descriptions of entities: objects, events or concepts. It puts data in context through linking and semantic metadata. The IBM asset KITT is a general-purpose knowledge graph that enables the digital thread required to link lifecycle information and data together. KITT is proposed to be deployed on Red Hat® OpenShift®.
  • Modelling and execution analytics, machine learning: SageMaker, Athena, Kinesis Data Analytics, EMR (AWS). SageMaker is a cloud-based machine learning platform for building, training and deploying models based on the data collected through the data platform. Kinesis Data Analytics provides the ability to perform analytics on streaming data. Athena is an interactive query service that makes it easy to analyse data, for example in S3, using standard SQL. To process and analyse large volumes of data we also propose Amazon EMR.
  • Consumption and visualization dashboard, apps: QuickSight, Fargate (AWS). For the portal, real-time dashboards and command centres we propose QuickSight as the BI service and Fargate to run the web apps.
  • Consumption and visualization AR/VR/3D/2D: Sumerian (AWS). AR and VR services are required to visualize diagnostics, predictions and recommendations for the physical world. A digital twin also requires an extension of dashboard visualization to provide 3D/2D views.
  • Consumption and visualization API and microservices: API Gateway, Node.js, Spring Boot (AWS). To provide secured access to application APIs and batch files we propose microservice-based applications built using Node.js or Spring Boot and exposed through API Gateway.
  • Workflow management intelligent workflow: Lambda, AppSync, Simple Workflow Service, Airflow, Red Hat Process Automation (AWS, Apache, Red Hat). Workflow management in a digital twin deals with business processes, simulation and event-based flows. Besides the AWS stack we also propose Red Hat Process Automation and Apache Airflow, either of which could be hosted on Red Hat OpenShift (RHOS) on AWS.
  • Governance and operations DataOps: Glue (AWS). DataOps is an essential element of a digital twin: it sits atop the big data, which must be available on time, automated and well managed to extract value. AWS Glue, with its end-to-end capability to discover, prepare and combine data for analytics and machine learning, is a good fit for the purpose.
  • Governance and operations DevOps: CloudFormation, CodeBuild, CodePipeline, CodeDeploy, CodeStar (AWS). AWS CodePipeline helps build a continuous integration or continuous delivery workflow that uses AWS CodeBuild, AWS CodeDeploy and other tools; each service can also be used separately. With AWS CodeStar the entire continuous delivery chain can be set up for a scalable DevOps solution.
  • Governance and operations MLOps: SageMaker (AWS). Amazon SageMaker, used for MLOps, fulfils the AI@Scale goals by providing the capability to build, train, deploy and maintain machine learning models in production reliably and efficiently.
  • Governance and operations data governance: Collibra (Collibra on AWS). The Collibra Data Governance and Catalog solutions help users find, understand and trust the data, ensuring quality and accessibility in a digital twin.
  • Hybrid infrastructure edge and cloud: Greengrass, Elastic Kubernetes Service, Elastic Compute Cloud, Red Hat OpenShift on AWS (AWS, Red Hat). Hybrid infrastructure that extends beyond the cloud is a reality in digital twins. The core components will be on the cloud, with the edge as a key component residing outside it. EKS and/or ROSA-based containers are the ideal choice for the non-serverless components. Components that require VM-style infrastructure can be served by EC2 instances.
  • Security and monitoring: ECR, IAM, Secrets Manager, CloudWatch (AWS). A digital twin has multiple aspects of security that need to be catered for, including identity management, information protection, managing infrastructure secrets, monitoring and so on.
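
As an illustration of the ingestion and persistence layers in the list above, the sketch below shapes one telemetry reading into the `StreamName`, `Data` and `PartitionKey` fields that a Kinesis Data Streams put-record request takes, and then picks a storage style for it. Nothing is sent to AWS. The stream name, the payload fields and the mapping of record kinds to stores are assumptions made for this example, not part of the proposed architecture.

```python
import json

STREAM_NAME = "twin-telemetry"  # hypothetical stream name

# Assumed mapping from record kind to a store from the persistence layer.
STORE_BY_KIND = {
    "telemetry": "Timestream",    # time-series
    "asset_state": "DynamoDB",    # non-relational
    "work_order": "RDS/Aurora",   # relational
    "aggregate": "Redshift",      # warehouse
}


def to_kinesis_record(device_id, metric, value, timestamp):
    """Shape one reading as the fields of a Kinesis put-record request."""
    payload = {
        "device_id": device_id,
        "metric": metric,
        "value": value,
        "ts": timestamp,
    }
    return {
        "StreamName": STREAM_NAME,
        "Data": json.dumps(payload).encode("utf-8"),
        # Using the device id as the partition key keeps one device's
        # readings together on the same shard.
        "PartitionKey": device_id,
    }


def choose_store(kind):
    """Pick a store for a record kind; unknown kinds land in the S3 data lake."""
    return STORE_BY_KIND.get(kind, "S3")


record = to_kinesis_record("pump-17", "temp_c", 71.5, "2021-09-15T10:00:00Z")
print(record["PartitionKey"])         # pump-17
print(choose_store("telemetry"))      # Timestream
print(choose_store("drone_imagery"))  # S3
```

With the AWS SDK for Python and valid credentials, a dictionary of this shape could be passed to the Kinesis client as `put_record(**record)`. In practice the routing would more likely live in Glue jobs or Lambda functions than in a lookup table.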

Conclusion:

A digital twin is intended to connect the digital and physical worlds, as it takes IoT, machine learning and virtual reality, in tandem, to the next level. Read more about digital twins here: https://www.ibm.com/topics/what-is-a-digital-twin. Creating an end-to-end digital twin platform requires a lot more than a single set of capabilities: it needs a heterogeneous software and hardware stack and multiple architectures, with data as the principal theme. Read more about data architecture here: https://www.ibm.com/data-fabric. There are specialized software vendors for each layer of the architecture; the flexibility comes from adopting a hybrid approach in which the hyperscaler forms the core part of the solution. While the individual customer environment will determine the hyperscaler, the advantages of AWS, with its IoT-centric platform and strong data products, can be leveraged to build powerful digital twin solutions. Refer to the AWS Architecture Center for its diverse architecture knowledge base: https://aws.amazon.com/architecture/ (link resides outside ibm.com).

Author

Sujay Nandi

Sanjay B Panikkar
