Performance factors overview
Job performance in DataStage® depends on several factors. Actual system requirements depend on concurrency, parallelism, and the type, number and size of jobs. Consider the following general factors while planning your workload.
- Disk space: requirements depend mostly on the size and concurrency of your jobs. As a best practice, multiply the expected required space by 3 or 4 to leave room for growth and estimation errors. You need sufficient storage for the data held in DataStage tables and files, plus additional space for temporary data that is stored while a DataStage job runs.
- Memory: actual memory requirements depend mostly on the type of processing, the degree of parallelism, and the number and type of stages that run concurrently. Some stages, for example Join, Lookup, XML Input, and XML Output, can use a large amount of memory.
- CPU: processor requirements depend mostly on the degree of parallelism, the concurrency of jobs, and the type of processing that your jobs perform. For example, aggregations, decimal calculations, and complex data parsing are more CPU-intensive than sorts and string manipulations.
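The disk space rule of thumb in the list above can be sketched as a small calculation. This is a planning aid only, not part of DataStage: the function name, its parameters, and the sample figures are hypothetical, and the baseline formula (persistent data plus temporary space for each concurrent job) is an assumption about how you might combine the two kinds of storage mentioned above.

```python
def estimate_disk_space_gb(persistent_gb, scratch_gb_per_job,
                           concurrent_jobs, safety_factor=3.0):
    """Return a rough disk space estimate in GB.

    Assumption (not from the product documentation): the baseline is the
    persistent data held in tables and files plus the temporary space
    needed by each job that runs at the same time.
    """
    if safety_factor < 1:
        raise ValueError("safety_factor must be at least 1")
    if min(persistent_gb, scratch_gb_per_job, concurrent_jobs) < 0:
        raise ValueError("sizes and job counts must not be negative")

    baseline_gb = persistent_gb + scratch_gb_per_job * concurrent_jobs
    # Multiply by 3 or 4 to allow for growth and estimation errors.
    return baseline_gb * safety_factor


# Illustrative figures only: 200 GB of persistent data and
# 4 concurrent jobs that each need about 25 GB of temporary space.
low = estimate_disk_space_gb(200, 25, 4, safety_factor=3)
high = estimate_disk_space_gb(200, 25, 4, safety_factor=4)
print(f"Plan for roughly {low:.0f} to {high:.0f} GB of disk space")
```

Computing both the 3x and 4x figures gives a planning range rather than a single number, which is useful when job sizes are still uncertain.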
For information on customizing resources, dynamic workload management, and more, see Administering DataStage.