Duplicated records are produced when System Data Engine aggregates data in stream mode

When the System Data Engine aggregates data in stream mode, duplicated records might be produced.


In stream mode, System Data Engine must send the data that it has gathered in memory to Data Streamer in near real time, so it can aggregate only the data that is gathered within a single interval. If two SMF records that have the same GROUP BY key are written out in different intervals, they cannot be aggregated into one record, and duplicated records are produced. If two SMF records that have the same GROUP BY key are written out in the same interval, they are aggregated into one record.

For example, suppose that you use the START TIME of a transaction and the TRANSACTION ID as the GROUP BY key to group SMF 110 records. The SMF records are not written out in the order of the START TIME of the transactions: a transaction that starts earlier might be written out later than transactions that start after it. As a result, SMF records for transactions that start in the same minute might be aggregated in different aggregation intervals, which produces duplicated records.
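
The effect can be illustrated with a small simulation. The following Python sketch is not System Data Engine code; it only models the behavior described above under simplifying assumptions. The function name aggregate_by_interval, the record layout, and the 60-second interval are all hypothetical.

```python
from collections import defaultdict


def aggregate_by_interval(records, interval_seconds):
    """Aggregate records separately for each flush interval.

    Each record is a tuple (write_time, group_key, value), where
    write_time is the time in seconds at which the record was written out.
    Records are only combined when they share both the interval and the key,
    so a key that appears in two intervals yields two output rows.
    """
    buckets = defaultdict(int)
    for write_time, key, value in records:
        interval = write_time // interval_seconds  # interval the record falls in
        buckets[(interval, key)] += value
    return [(interval, key, total)
            for (interval, key), total in sorted(buckets.items())]


# Hypothetical records that share one GROUP BY key (start minute, transaction).
key = ("10:00", "TRN1")
records = [
    (5, key, 1),    # written in the first interval
    (20, key, 1),   # written in the first interval, aggregated with the above
    (70, key, 1),   # written out late, lands in the second interval
]

rows = aggregate_by_interval(records, 60)
for row in rows:
    print(row)
```

In this sketch the first two records are combined into one row, while the third record, which has the same key but is written out in the next interval, produces a second row for that key. That second row is the duplicated record.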


If you do not want duplicated records to be produced, you must specify time-series fields. For example, you can specify the SMF record time as the GROUP BY key and specify the DURATION parameter in the GROUP BY definition:
GROUP BY                       	
The value of DURATION specifies how long System Data Engine holds the aggregation buffer. For more information, see DURATION integer SECONDS/MINUTES in sde_lang_define_update.html.