Dictionary Dataflow Scenario Fails with java.lang.OutOfMemoryError: Java heap space
Problem
Some database Dictionary Dataflow scenarios fail with
PLATFORM_CLI_ERRORS FAILED_EXECUTING_SCENARIO java.lang.OutOfMemoryError: Java heap space
after processing all the batches. The complete error message in the log is:
2023-08-04 17:34:24.243 [CLI] 0 INFO eu.profinit.manta.dataflow.generator.modelutils.GraphScenario Processed 438 of 438 files in 1,486,121 ms.
2023-08-04 17:34:29.585 [CLI] 0 INFO eu.profinit.manta.dataflow.repository.merger.client.AbstractMergerWriter Sending data to server http://localhost:8080/manta-dataflow-server/api/merge.
2023-08-04 17:34:32.858 [CLI] 0 INFO eu.profinit.manta.dataflow.repository.merger.client.AbstractMergerWriter Response from server: {"processingReport":{"time":2531,"newObjects":{"node":0,"edge_attribute":0,"edge":0,"resource":0,"node_attribute":20059,"source_code":0,"layer":0},"errorObjects":{"node":0,"edge_attribute":0,"edge":0,"resource":0,"node_attribute":0,"source_code":0,"layer":0},"existedObjects":{"node":4100,"edge_attribute":0,"edge":0,"resource":1,"node_attribute":1,"source_code":0,"layer":1},"unknownTypesCount":0,"processedObjectsCount":24162,"requestedSourceCodes":0},"processingTime":2563}.
2023-08-04 17:34:32.863 [CLI] 0 INFO eu.profinit.manta.dataflow.repository.merger.client.StandardSourceCodeService Waiting for source code upload to finish.
2023-08-04 17:34:32.867 [CLI] 0 INFO eu.profinit.manta.dataflow.repository.merger.client.StandardSourceCodeService Source code upload has been completed.
2023-08-04 17:34:32.891 [CLI] ISSUE eu.profinit.manta.platform.cli.CliImpl
Event: SCENARIO_FAILURE_EVENT
Message: Scenario 'snowflakeDictionaryDataflowScenario' failed to execute.
Type: ISSUE
Priority: HIGH
2023-08-04 17:34:32.915 [CLI] 0 ERROR eu.profinit.manta.platform.cli.CliImpl
PLATFORM_CLI_ERRORS FAILED_EXECUTING_SCENARIO
User message: Failed executing scenario snowflakeDictionaryDataflowScenario
Technical message: Failed executing scenario snowflakeDictionaryDataflowScenario
Solution: Please contact MANTA Support at portal.getmanta.com and submit a support bundle/log export.
Impact: SCENARIO
java.lang.OutOfMemoryError: Java heap space
at org.apache.commons.io.output.AbstractByteArrayOutputStream.needNewBuffer(AbstractByteArrayOutputStream.java:106) ~[manta-connector-jms-artemis-client-37.1.0.jar:?]
at org.apache.commons.io.output.AbstractByteArrayOutputStream.writeImpl(AbstractByteArrayOutputStream.java:135) ~[manta-connector-jms-artemis-client-37.1.0.jar:?]
at org.apache.commons.io.output.ByteArrayOutputStream.write(ByteArrayOutputStream.java:66) ~[manta-connector-jms-artemis-client-37.1.0.jar:?]
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233) ~[?:?]
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303) ~[?:?]
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281) ~[?:?]
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) ~[?:?]
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135) ~[?:?]
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:226) ~[?:?]
at java.io.PrintWriter.write(PrintWriter.java:542) ~[?:?]
at java.io.PrintWriter.write(PrintWriter.java:542) ~[?:?]
at java.io.PrintWriter.write(PrintWriter.java:559) ~[?:?]
at au.com.bytecode.opencsv.CSVWriter.writeNext(CSVWriter.java:243) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.common.GraphCsvSerializationHelper.printNodeAtributes(GraphCsvSerializationHelper.java:181) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.common.GraphCsvSerializationHelper.writeGraph(GraphCsvSerializationHelper.java:82) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.client.MergerWriter.createData(MergerWriter.java:103) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.client.MergerWriter.writeInnerGraph(MergerWriter.java:82) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.client.MergerWriter.write(MergerWriter.java:52) ~[?:?]
at eu.profinit.manta.dataflow.repository.merger.client.MergerWriter.write(MergerWriter.java:31) ~[?:?]
at eu.profinit.manta.dataflow.generator.modelutils.GraphScenario.doExecute(GraphScenario.java:151) ~[?:?]
at eu.profinit.manta.platform.automation.AbstractScenario.execute(AbstractScenario.java:97) ~[manta-platform-automation-37.1.0.jar:?]
at eu.profinit.manta.platform.cli.CliImpl.execute(CliImpl.java:261) [manta-platform-cli-37.1.0.jar:?]
at eu.profinit.manta.platform.cli.launcher.Main.lambda$main$1(Main.java:66) [manta-platform-cli-launcher-37.1.0.jar:37.1.0]
at eu.profinit.manta.platform.cli.launcher.Main$$Lambda$2/0x0000000800067440.run(Unknown Source) [manta-platform-cli-launcher-37.1.0.jar:37.1.0]
at java.lang.Thread.run(Thread.java:829) [?:?]
More Details
The OutOfMemoryError can occur when the dictionary dataflow analysis runs against a large database that contains more objects than fit into memory. Memory requirements are roughly 1 kB per object (schema, table, column, procedure, data type, etc.). With the default 3 GB of RAM per scenario, approximately 3 million objects can be processed. For larger databases, increase the memory allocated to the dictionary dataflow scenario.
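The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch, not part of the product: the function names and the rounding to whole gigabytes are assumptions made for illustration, and the 1 kB per object figure is only the approximation quoted in this article.

```python
# Rough heap estimate for a dictionary dataflow scenario.
# Assumption: ~1 kB of heap per database object, as quoted in this article.
BYTES_PER_OBJECT = 1_000
BYTES_PER_GB = 1_000_000_000
DEFAULT_HEAP_GB = 3  # default scenario memory mentioned in this article


def estimate_heap_gb(object_count: int) -> int:
    """Return the estimated heap size in whole GB, rounded up (minimum 1)."""
    needed_bytes = object_count * BYTES_PER_OBJECT
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-needed_bytes // BYTES_PER_GB))


def needs_more_memory(object_count: int, current_heap_gb: int = DEFAULT_HEAP_GB) -> bool:
    """Return True when the estimate exceeds the currently configured heap."""
    return estimate_heap_gb(object_count) > current_heap_gb


if __name__ == "__main__":
    for count in (2_500_000, 3_000_000, 8_000_000):
        print(count, estimate_heap_gb(count), needs_more_memory(count))
```

Because the per-object figure is approximate, treat the result as a lower bound and allow some headroom when choosing the actual value.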
Solution
- Estimate how much memory is likely to be needed: number of objects × 1 kB gives the approximate heap size, that is, about 1 GB per million objects.
- Increase the memory for the failing Dictionary Dataflow scenario using SCENARIO_LOAD_MEMORY, as described in Configure Runtime and Limitations (Scenario Execution Runtime Limits).
- Rerun the lineage analysis (the Extraction step does not necessarily need to be rerun).