essam.Salah
1 Post

Pinned topic Infosphere Change Data Capture (CDC)

‏2009-12-24T18:18:05Z |
Geeks,

I am evaluating InfoSphere Change Data Capture for use with Oracle and SQL Server, and I have the following questions:

1. There are two Oracle versions, trigger-based and redo-log-based. What are the selection criteria?
2. How does it integrate with DataStage?
3. Are there best practices for configuration?

Thanks
Essam
Updated on 2011-06-30T19:53:10Z by GlenSakuth
  • G-Steffler
    3 Posts

    Re: Infosphere Change Data Capture (CDC)

    ‏2010-02-02T20:55:22Z  
    I will answer the questions out of order.

    2.

    CDC integrates with DataStage through a product known as CDC DataStage 6.3 FP3, which writes flat files and generates a job that can be imported into DataStage to process those files. Documentation is available on IBM.com.

    3.

    A best practices guide is a work in progress for the CDC Information Development team. For background, DataMirror (acquired by IBM in late 2007) held this information within its CDC services team, which has since become part of the larger IBM services organization.

    1.

    Log-based change data capture is generally (almost always) preferable to trigger-based replication.

    The CDC Oracle 6.3 product shares 95% of its code between the two technologies; only the front-end scraper component differs. This means that the user interface, command-line utilities, documentation, and scriptability are the same. Both are tested rigorously in QA with thousands of scenarios and on dozens of platform combinations.

    A common DBA concern is enabling triggers on a production database, because of the performance and maintenance considerations. This concern has been the primary driver for many customers to consider only log-based replication. Redo-log-based change data capture is the recommended technology for the majority of database workloads. The trigger-based technology has been retained because it does have a narrow set of advantages for specific workloads, specifically: (a) the behavior of large batch processing; (b) the memory requirements of log-based CDC, which has to stage transactions; (c) support for systems where log-based replication cannot easily be supported.

    Trigger-based capture will reduce Oracle performance because of the triggers used for change capture. The triggers add latency to the transactions issued by applications running on the source Oracle database: subsecond for small transactions, and a few seconds to minutes for larger ones. In other words, applications may take longer to execute transactional statements (insert, update, delete). Large batch transactions are affected most, as they tend to be highly sensitive to the extra I/O being performed. High-volume replication of millions of transactions per day will be a performance tuning challenge for a trigger-based capture product.

    With triggers, operations are replicated in the order they arrive in the trigger journal, so there is no need to "order" or "stage" transactions: the entries in the journal are already ordered and committed by the database. The triggers consume database CPU and disk I/O during the transaction, because the operations are staged in a database journal table. Log-based replication must stage transactions until a commit or rollback is seen before sending them to the target. This means that the redo-log scraper processes transactions after they are committed, whereas the triggers have already processed the transaction while it was being generated on the database. Log-based replication requires very little database CPU, but it does require staging to disk for very large transactions; small transactions are typically staged in memory.
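
To make the staging behaviour described above a little more concrete, here is a minimal Python sketch of the general idea: hold each transaction's operations until its commit or rollback record shows up, then forward or discard them. This is not how CDC itself is implemented. The record shape `(txn_id, kind, payload)` and the function name `replay_log` are invented purely for illustration, and everything is kept in memory, whereas a real product would spill very large transactions to disk.

```python
from collections import defaultdict


def replay_log(records):
    """Stage operations per transaction until a commit or rollback is seen.

    `records` is an iterable of (txn_id, kind, payload) tuples, where kind is
    "op", "commit", or "rollback". This record shape is hypothetical.
    Returns (sent, pending): operations forwarded to the target in commit
    order, and operations of transactions that are still open.
    """
    staged = defaultdict(list)  # txn_id -> operations seen so far
    sent = []                   # what would be sent to the target
    for txn_id, kind, payload in records:
        if kind == "op":
            staged[txn_id].append(payload)
        elif kind == "commit":
            # Only now is it safe to forward the whole transaction.
            sent.extend(staged.pop(txn_id, []))
        elif kind == "rollback":
            # Rolled-back work is dropped and never reaches the target.
            staged.pop(txn_id, None)
        else:
            raise ValueError(f"unknown record kind: {kind!r}")
    return sent, dict(staged)


# Interleaved transactions as they might appear in a log (made-up data).
log = [
    ("t1", "op", "INSERT a"),
    ("t2", "op", "INSERT b"),
    ("t2", "commit", None),
    ("t1", "op", "UPDATE a"),
    ("t3", "op", "DELETE c"),
    ("t3", "rollback", None),
    ("t1", "commit", None),
    ("t4", "op", "INSERT d"),
]
sent, pending = replay_log(log)
print(sent)
print(pending)
```

The `staged` dictionary is where the memory cost of the log-based approach comes from: a long-running or very large transaction sits there until its commit arrives. A trigger journal has no equivalent structure, because rows are already written in order by the database, at the price of extra work inside each source transaction.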
  • hayyamgsu
    1 Post

    Re: Infosphere Change Data Capture (CDC)

    ‏2010-10-12T20:25:08Z  
    Hi, is the best practices guide available?
    Thanks
  • G-Steffler
    3 Posts

    Re: Infosphere Change Data Capture (CDC)

    ‏2010-10-19T15:06:44Z  
    It is available as an internal IBM document on the CDC wiki on W3.

    Search W3 for "icdc" and follow the "Collaboration Spaces preview" links on the right-hand side for "Infosphere CDC".

    The topics will become public at a later date via a Redbook, Techdoc, or other external process.
  • SystemAdmin
    61 Posts

    Re: Infosphere Change Data Capture (CDC)

    ‏2011-06-30T19:20:52Z  
    Is there any update on the best practices info?
  • GlenSakuth
    9 Posts

    Re: Infosphere Change Data Capture (CDC)

    ‏2011-06-30T19:53:10Z  
    Hi,

    A thorough Redbook with best practices is currently in the works and just entering the review stage. It should be publicly available in a few months.