Deployment profiles
This section describes the options available for the operator setup.
StarterPak
This is a basic setup that does not require any specific configuration. It is intended only for nonproduction environments.
CRD
See the Basic setup section for more details.
Node Resources
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Process Mining | 64 | 16 | 100 | 1 |
IBM Task Mining | 16 | 4 | 100 | 1 |
Total | 80 | 20 | 200 | 2 |
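If the target cluster already exists, the table above can be compared with what the cluster actually offers. The following is a minimal, illustrative sketch (not part of the operator) that assumes the official `kubernetes` Python client and a kubeconfig context pointing at the target cluster; it compares only allocatable memory and CPU, not disk.

```python
# Minimal sketch (not part of the operator): compare the StarterPak minimums
# from the table above with the allocatable resources reported by Kubernetes.
# Only the common binary memory suffixes are handled.
from kubernetes import client, config

STARTERPAK_MINIMUMS = {
    "IBM Process Mining": {"memory_gb": 64, "cpu_cores": 16},
    "IBM Task Mining": {"memory_gb": 16, "cpu_cores": 4},
}

def to_gb(quantity: str) -> float:
    """Convert a Kubernetes memory quantity such as '65807736Ki' to GB."""
    factors = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
    for suffix, factor in factors.items():
        if quantity.endswith(suffix):
            return float(quantity[:-2]) * factor / 1e9
    return float(quantity) / 1e9  # plain bytes

def to_cores(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '16' or '15800m' to cores."""
    return float(quantity[:-1]) / 1000 if quantity.endswith("m") else float(quantity)

config.load_kube_config()
nodes = client.CoreV1Api().list_node().items

for component, minimum in STARTERPAK_MINIMUMS.items():
    # Any node whose allocatable memory and CPU meet the minimums can host the component.
    candidates = [
        node.metadata.name
        for node in nodes
        if to_gb(node.status.allocatable["memory"]) >= minimum["memory_gb"]
        and to_cores(node.status.allocatable["cpu"]) >= minimum["cpu_cores"]
    ]
    print(component, "->", ", ".join(candidates) if candidates else "no node is large enough")
```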
Production Setup
This is the suggested configuration for a production installation.
CRD
See the [Production setup](crd.md) section for more details.
Node Resources
Listed below are three sizing configurations for processing event logs with an increasing number of events. In addition to data volumes and data complexity, another factor that can affect the sizing of the system is the number of users active on the application at the same time: what matters is not how many users are working, but what they are doing concurrently.
The most important information required to size a production environment is the following:
1. Number of projects to be managed by the application and their classification, that is:
   - Flat (only one business entity involved)
   - Multi-level (from 2 to 5 business entities involved, for example P2P or O2C), together with the number of mapped entities.

   Note: Knowing the type of process is important because multi-level processes require more complex and resource-consuming algorithms than flat processes.
2. Number of events for each project listed at point 1, at least by range:
   - Up to 10 M
   - From 10 M to 50 M
   - Over 50 M

   Notes:
   - A. Knowing the distribution of events per project, rather than only the total number of events, is important: the same number of events requires far more computing resources for a multi-level process, especially when there are high cardinalities between interleaving business entities (for example, 1:10000 between contracts and invoices).
   - B. We do not indicate an upper threshold, but it is worth noting that we have never managed a production environment hosting multi-level processes with more than 25 M events.

   It is also worth noting that the quantity of collected events is not the decisive factor for IBM Process Mining; the quality of the data is what really matters. The number of events only needs to be large enough to cover the complete range of variants and to provide reliable statistics over a relevant time period: obsolete data from past periods can distort the mining results and the statistics because it may reflect workflows that no longer exist.
3. Number of mapped custom fields per project, at least by range:
   - Up to 20
   - From 20 to 50
   - From 50 to 80

   Notes:
   - A. No more than 80 custom fields should be used; this is the maximum threshold we have tested on the product. Both this limit and the event ranges above can be checked mechanically, as in the sketch after this list.
   - B. Knowing the number of custom fields is important because they affect the complexity of the processing.
4. Number of users that will access the application, and the quota of those users that will also access Data Analytics.

   Note: Knowing the number of users is important to understand how many snapshots of the process will have to be stored and managed. Knowing which users will access Analytics is important because Analytics is the most resource-consuming component of the suite (especially in terms of RAM usage).
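Once these figures have been collected, the thresholds mentioned in the notes above can be checked before a sizing is chosen. The sketch below is illustrative only: the `Project` structure and its field names are assumptions, and the checks simply encode the 80-custom-field limit and the 25 M multi-level observation from the notes.

```python
# Illustrative sketch only: validate the sizing inputs collected above against
# the thresholds mentioned in the notes. Field names are assumptions.
from dataclasses import dataclass

@dataclass
class Project:
    name: str
    multi_level: bool   # True if 2 to 5 business entities are involved
    events: int         # total number of events for this project
    custom_fields: int  # number of mapped custom fields

def check_inputs(projects: list[Project]) -> list[str]:
    warnings = []
    for p in projects:
        if p.custom_fields > 80:
            warnings.append(f"{p.name}: {p.custom_fields} custom fields exceeds the tested maximum of 80")
        if p.multi_level and p.events > 25_000_000:
            warnings.append(f"{p.name}: multi-level project with more than 25 M events is beyond tested production volumes")
    return warnings

# Example: a flat P2P extract and a large multi-level O2C project.
print(check_inputs([
    Project("procure-to-pay", multi_level=False, events=8_000_000, custom_fields=25),
    Project("order-to-cash", multi_level=True, events=30_000_000, custom_fields=90),
]))
```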
Because it is difficult to collect all of this information in advance, and even more difficult to estimate the maximum workload generated by concurrent user activity, we provide three sizing ranges based on event volume, assuming a worst-case concurrency level of 10 users (a selection sketch follows the tables below).
Up to 10 M events
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Process Mining | 64 | 16 | 300 | 1 |
Up to 50 M events
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Process Mining | 128 | 32 | 600 | 1 |
Up to 100 M events
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Process Mining | 192 | 48 | 1000 | 1 |
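Read together, the three ranges above amount to a simple lookup on event volume. The sketch below is illustrative only (the function and lookup table are not part of the product) and ignores the other factors discussed earlier, such as multi-level processes and user concurrency, which may push an installation into a larger range.

```python
# Illustrative sketch: map an expected event volume onto one of the three
# production sizing ranges listed above (memory GB, CPU cores, disk GB).
SIZING_RANGES = [
    (10_000_000, (64, 16, 300)),
    (50_000_000, (128, 32, 600)),
    (100_000_000, (192, 48, 1000)),
]

def production_sizing(total_events: int) -> tuple[int, int, int]:
    """Return (memory_gb, cpu_cores, disk_gb) for the smallest matching range."""
    for max_events, resources in SIZING_RANGES:
        if total_events <= max_events:
            return resources
    raise ValueError("more than 100 M events is beyond the documented sizing ranges")

print(production_sizing(42_000_000))  # -> (128, 32, 600)
```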
Task Mining
A common configuration is suggested for this component across all sizing ranges:
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Task Mining | 32 | 8 | 300 | 1 |
HA Setup
This is the suggested configuration for an HA installation.
CRD
See the Custom setup section for more details on how to increase the number of pod replicas.
Node Resources
This is an example configuration for an installation that can manage up to 50 M events.
Software | Memory (GB) | CPU (cores) | Disk (GB) | Nodes |
---|---|---|---|---|
IBM Process Mining | 128 | 48 | 200 | 3 |
IBM Task Mining | 32 | 8 | 300 | 3 |
Total | 160 | 56 | 500 | 6 |