We are excited to announce the GA of IBM Big Replicate 2.1.2.
What is IBM Big Replicate?
IBM Big Replicate ensures business continuity by providing active-active data replication capability for your Big Data environment. Big Replicate enables data replication for Hadoop and cloud object stores with data consistency.
What’s new with this release?
1. New Platforms Supported
Big Replicate 2.1.2 adds support for the following platforms, which were not supported in Big Replicate 2.1.1:
- CDH 5.12
- CDH 5.11
- HDP 2.6.2
2. Notable New Features
This release includes the following major new features.
Big Replicate Kernel and Performance
Extensive improvements have been made to the core engine, which is now referred to as the Fusion Kernel.
Testing across a variety of load types shows improved throughput and reduced memory requirements. Users can expect benefits ranging from 40% to 75% compared to previous releases.
Replication Memberships
Replication rule creation is simplified by the removal of the membership concept.
Memberships, which were used in previous versions of IBM Big Replicate, have been replaced by simpler priority selection among zones and the ability to control specific Big Replicate server roles in each zone. Memberships no longer need to be created, and there is no need to remove memberships that are no longer in use by replication rules.
Non-Blocking Consistency Check
Consistency checks provide a mechanism to determine whether there are any differences in the state of content within the scope of a replication rule. In earlier versions, no changes could be made via Big Replicate to the content being checked while a consistency check was running, so that the results of the check remained valid.
This release introduces an alternative, non-blocking consistency check that determines the consistency state without blocking other activity while the check is underway. It tracks changes made to the content under check during execution, and reports one of three states for each item checked: consistent, not-consistent, or potentially inconsistent.
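The three reported states can be illustrated with a minimal Java sketch. It is not the product's implementation: the class name, the path-to-checksum maps, and the rule that any item changed during the check is reported as potentially inconsistent are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from Big Replicate itself.
final class ConsistencyCheckSketch {
    enum State { CONSISTENT, NOT_CONSISTENT, POTENTIALLY_INCONSISTENT }

    // zoneA / zoneB: assumed path -> checksum snapshots gathered while the check runs.
    // changedDuringCheck: assumed set of paths modified while the check was underway.
    static Map<String, State> classify(Map<String, String> zoneA,
                                       Map<String, String> zoneB,
                                       Set<String> changedDuringCheck) {
        Set<String> paths = new TreeSet<>(zoneA.keySet());
        paths.addAll(zoneB.keySet());

        Map<String, State> result = new TreeMap<>();
        for (String path : paths) {
            if (changedDuringCheck.contains(path)) {
                // The comparison raced with a write, so its outcome cannot be trusted.
                result.put(path, State.POTENTIALLY_INCONSISTENT);
            } else if (zoneA.containsKey(path) && zoneB.containsKey(path)
                    && Objects.equals(zoneA.get(path), zoneB.get(path))) {
                result.put(path, State.CONSISTENT);
            } else {
                // Missing in one zone, or checksums differ.
                result.put(path, State.NOT_CONSISTENT);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> zoneA = Map.of("/data/a", "c1", "/data/b", "c2", "/data/c", "c3");
        Map<String, String> zoneB = Map.of("/data/a", "c1", "/data/b", "zz", "/data/c", "c9");
        Set<String> changed = Set.of("/data/c");
        classify(zoneA, zoneB, changed)
                .forEach((path, state) -> System.out.println(path + " -> " + state));
    }
}
```

In this sketch writes are never blocked. Items touched during the check are flagged for a later re-check instead of being reported with a possibly stale result.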
Bulk Replication Rules
Multiple replication rules can be created at the same time when they share attributes other than file system location.
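As a rough illustration of the idea, the Java sketch below builds one rule per file system location from a single set of shared attributes. The `Rule` class, its fields (`zones`, `priorityZone`), and the `createBulk` helper are hypothetical and do not reflect the product's actual rule schema or API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the rule model below is illustrative only.
final class BulkRuleSketch {
    static final class Rule {
        final String path;          // file system location: the only per-rule attribute
        final List<String> zones;   // shared attribute (assumed)
        final String priorityZone;  // shared attribute (assumed)

        Rule(String path, List<String> zones, String priorityZone) {
            this.path = path;
            this.zones = zones;
            this.priorityZone = priorityZone;
        }

        @Override
        public String toString() {
            return path + " zones=" + zones + " priority=" + priorityZone;
        }
    }

    // Creates one rule per path; every rule shares the remaining attributes.
    static List<Rule> createBulk(List<String> paths, List<String> zones, String priorityZone) {
        List<Rule> rules = new ArrayList<>();
        for (String path : paths) {
            rules.add(new Rule(path, zones, priorityZone));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<Rule> rules = createBulk(
                List.of("/repl/sales", "/repl/logs", "/repl/archive"),
                List.of("zone-east", "zone-west"),
                "zone-east");
        rules.forEach(System.out::println);
    }
}
```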
Sidelining
IBM Big Replicate versions before 2.1.2 included a feature called “sidelining”. When a Big Replicate node fell behind the agreement processing performed among the network of nodes by a configurable degree, it could be sidelined so that it no longer participated in agreement processing. This protected the overall health of a network under memory-constrained conditions by preventing the slow processing speed of an individual node from halting the progress of the entire network.
A sidelined node required an intrusive process (“unsidelining”) to bring it back into the network to continue processing agreements.
IBM Big Replicate 2.1.2 supports operation in a manner that eliminates the potential for sidelining when nodes exceed memory constraints for agreement processing.
Logging
IBM Big Replicate logging has been changed. Previously, the Big Replicate server logged information to a set of rolling log files in /var/log/fusion/server named fusion-dcone.log.
Big Replicate client logging is disabled by default and can be re-enabled through the Settings > Log Settings view.
Non-Coordinated Notification of File Content
This version of IBM Big Replicate does not use coordinated activities to communicate information among zones about the availability of new file content. This removes a significant portion of communication through the Big Replicate Kernel related to progress in writing file content and reduces the overall load on the coordination engine as a result.
Broader HDFS API Support
The set of HDFS API methods coordinated by Big Replicate is extended to include the following methods, which were not coordinated previously:
public void concat(Path trg, Path[] psrcs)
public boolean mkdir(Path f, FsPermission permission)
public FSDataOutputStream append(Path f, final EnumSet
public FSDataOutputStream create(Path f, FsPermission permission, EnumSet
public FSDataOutputStream create(Path f, FsPermission permission, EnumSet
public HdfsDataOutputStream create(final Path f, final FsPermission permission, final boolean overwrite, final int bufferSize, final short replication, final long blockSize, final Progressable progress, final InetSocketAddress[] favoredNodes)
public void rename(Path src, Path dst, Rename... options)