HDInsight Kafka Disaster Recovery Solution

Chandra Manral 1 Reputation point
2022-01-07T13:10:37.33+00:00

Hi,

We need to create a DR strategy for an HDInsight Kafka cluster. We are considering cross-region unidirectional replication via MirrorMaker from the primary to the secondary cluster.

Our questions are:
- What RTO and RPO can we expect from this approach as documented?
- If the target is near-zero RTO and RPO, what other solution can we use?

Also, HDInsight doesn't guarantee zonal redundancy for HA, as there is no way to deploy a cluster across multiple AZs. What is the alternative? Please suggest.

Azure HDInsight
An Azure managed cluster service for open-source analytics.

1 answer

  1. PRADEEPCHEEKATLA-MSFT 89,466 Reputation points Microsoft Employee
    2022-01-10T06:22:40.767+00:00

    Hello @Chandra Manral,

    Thanks for the question and for using the Microsoft Q&A platform.

    With MirrorMaker, Kafka topics are mirrored from the primary region to the secondary region in an Active-Passive arrangement. An alternative to Kafka replication is to produce to Kafka in both regions.

    To enable cross-region availability, HDInsight 4.0 supports Kafka MirrorMaker, which can be used to maintain a secondary replica of the primary Kafka cluster in a different region. MirrorMaker acts as a high-level consumer-producer pair: it consumes from a specific topic in the primary cluster and produces to a topic with the same name in the secondary. Cross-cluster replication for high availability and disaster recovery with MirrorMaker assumes that producers and consumers fail over to the replica cluster.

    Depending on how long a topic had existed when replication started, MirrorMaker topic replication can lead to different offsets between the source and replica topics. HDInsight Kafka clusters also support topic partition replication, which is a high availability feature at the individual cluster level.
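
    One way to see why offsets can differ: if older records on the source have already been removed by retention when mirroring starts, the source's first available offset is above 0, while the replica topic numbers the copied records from its own beginning. The Python sketch below assumes exactly that scenario; `mirror_with_offsets` and the sample data are hypothetical and only illustrate the bookkeeping.

```python
def mirror_with_offsets(source_records, replica_next_offset=0):
    """Copy (offset, value) records to a replica and return the replica log plus an offset map.

    `source_records` holds only what is still retained on the source, so its
    first offset may be well above 0. The replica assigns its own offsets.
    """
    replica_log = []
    offset_map = {}
    for source_offset, value in source_records:
        replica_log.append((replica_next_offset, value))
        offset_map[source_offset] = replica_next_offset
        replica_next_offset += 1
    return replica_log, offset_map


# Hypothetical topic that has existed for a while: offsets 0..99 were already
# removed by retention before mirroring started.
retained = [(100, "a"), (101, "b"), (102, "c")]

replica_log, offset_map = mirror_with_offsets(retained)
print(replica_log)      # [(0, 'a'), (1, 'b'), (2, 'c')]
print(offset_map[102])  # 2
```

    In this model, a consumer that had committed offset 102 on the source would need to resume at offset 2 on the replica, so committed offsets from the source cannot simply be reused after a failover.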

    [Image: 163497-image.png]

    Kafka Replication: Active – Passive

    An Active-Passive setup enables asynchronous unidirectional mirroring from the Active cluster to the Passive cluster. Producers and consumers need to be aware that both an Active and a Passive cluster exist, and must be ready to fail over to the Passive cluster if the Active one fails. Below are some advantages and disadvantages of an Active-Passive setup.

    [Image: 163483-image.png, advantages and disadvantages of the Active-Passive setup]
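
    To put the failover requirement and the RTO/RPO question in concrete terms: because mirroring is asynchronous, whatever has not yet been mirrored when the Active cluster fails is the data at risk, and the time clients need to detect the failure and switch contributes to recovery time. A minimal Python sketch follows. The `ClusterState` fields, the `.example` addresses, and the offset numbers are made up for illustration and are not HDInsight or Kafka APIs.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    bootstrap: str       # placeholder address, not a real endpoint
    healthy: bool
    latest_offset: int   # messages written to (active) or mirrored into (passive) the topic


def pick_bootstrap(active, passive):
    """Return the bootstrap address clients should use right now."""
    if active.healthy:
        return active.bootstrap
    if passive.healthy:
        return passive.bootstrap
    raise RuntimeError("neither cluster is reachable")


def unmirrored_messages(active_written, passive_mirrored):
    """Messages that exist only on the active cluster (at risk if it fails now)."""
    return max(active_written - passive_mirrored, 0)


active = ClusterState("primary-broker.example:9092", healthy=True, latest_offset=1000)
passive = ClusterState("secondary-broker.example:9092", healthy=True, latest_offset=990)

print(pick_bootstrap(active, passive))   # primary-broker.example:9092

active.healthy = False                   # simulated regional outage
print(pick_bootstrap(active, passive))   # secondary-broker.example:9092
print(unmirrored_messages(active.latest_offset, passive.latest_offset))  # 10
```

    In practice the mirroring lag would have to be measured on the real clusters; the sketch only shows that with asynchronous replication it is generally not zero, so an Active-Passive design by itself does not give zero RPO.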

    Kafka Replication: Active – Active

    An Active-Active setup involves two regionally separated, VNet-peered HDInsight Kafka clusters with bidirectional asynchronous replication using MirrorMaker. In this design, messages consumed by the consumers in the primary are also made available to consumers in the secondary, and vice versa. Below are some advantages and disadvantages of an Active-Active setup.

    [Image: 163522-image.png, advantages and disadvantages of the Active-Active setup]
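
    A bidirectional setup generally has to make sure a record that was mirrored in one direction is not mirrored straight back. The toy Python model below illustrates this with two in-memory logs. Tagging each record with its region of origin is an assumption of this sketch, not a description of how MirrorMaker is configured on HDInsight, and the `mirror` function is hypothetical.

```python
def mirror(source_log, target_log, source_region):
    """Copy records that originated in `source_region` to the other cluster.

    Records are (origin_region, value) tuples. Skipping records that did not
    originate locally is this sketch's assumption for avoiding a mirror loop.
    """
    already_there = set(target_log)
    for record in source_log:
        origin, _ = record
        if origin == source_region and record not in already_there:
            target_log.append(record)
            already_there.add(record)


east = [("east", "e1"), ("east", "e2")]
west = [("west", "w1")]

# One mirroring pass in each direction.
mirror(east, west, "east")
mirror(west, east, "west")

# A second pass changes nothing, because mirrored records are not sent back.
mirror(east, west, "east")
mirror(west, east, "west")

print(sorted(east))  # [('east', 'e1'), ('east', 'e2'), ('west', 'w1')]
print(sorted(west))  # [('east', 'e1'), ('east', 'e2'), ('west', 'w1')]
```

    After the passes, consumers in either region see the records produced in both, which matches the behaviour described for the Active-Active design. The set-based duplicate check is a simplification that would also drop genuinely repeated values.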

    For more details, refer to Azure HDInsight business continuity architectures - Apache Kafka.

    Hope this helps. Please let us know if you have any further queries.
