How to set up multiple consumers for an Azure Event Hub

Bharath Narayan 0 Reputation points
2024-04-02T07:14:55.9033333+00:00

Problem statement:

I have an event hub to which multiple apps send messages.
The load is very high (about a million records every 5 hours).

I have a single Azure Function as an Event Hub consumer, which reads each message and inserts it into Cosmos DB.

Due to the load, the current throughput is very low compared to the rate of incoming messages.

I know adding another Azure Function as a consumer would help with throughput, but my doubt is whether the same events would get processed by multiple Azure Functions, so that duplicate data could get inserted into Cosmos DB.

Please share your insights on how to improve performance without duplicating events in Cosmos DB.


2 answers

  1. Sander van de Velde | MVP 30,711 Reputation points MVP
    2024-04-02T12:19:51.63+00:00

    Hello @Bharath Narayan ,

    Welcome to this moderated Azure community forum.

    As @MayankBargali-MSFT already explained, you probably need to try to scale out your (single) Azure Function or build your own Event Processor.

    This is because the Event Hub is optimized for 'fan in - fan out' usage.

    This means multiple producers can write to the same Event Hub, and multiple consumers (via consumer groups) each get their own copy of the event stream.

    It seems you are looking for a queue.

    Using a queue is a strategy to decouple producers and consumers, where each message is delivered to only one consumer (the competing-consumers pattern). So a message is normally not consumed twice unless an error occurs the first time.
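    The competing-consumers behaviour described above can be sketched in-process (a minimal illustration using Python's `queue.Queue` and threads as a stand-in for a Storage account queue, not the Azure SDK):

    ```python
    # Illustrative sketch: each message is taken from the queue by exactly one
    # worker, so no message is processed twice -- the property that makes a
    # queue safe for parallel consumers.
    import queue
    import threading

    work_queue = queue.Queue()
    processed = []            # (worker_id, message) pairs
    lock = threading.Lock()

    def worker(worker_id: int) -> None:
        while True:
            try:
                msg = work_queue.get(timeout=0.2)   # competing consumers
            except queue.Empty:
                return
            with lock:
                processed.append((worker_id, msg))
            work_queue.task_done()

    # A producer puts 100 messages on the queue.
    for i in range(100):
        work_queue.put(f"event-{i}")

    # Four workers drain the queue in parallel.
    threads = [threading.Thread(target=worker, args=(w,)) for w in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    ```

    After the workers finish, every message appears exactly once in `processed`, regardless of how the work was split across workers.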

    This functionality is offered using the Azure Function trigger for a Storage account queue.

    Check the documentation (and example) regarding concurrency and retries (Poison messages).

    Although I have no experience with the Azure Service Bus Queue trigger, this could offer a similar experience too.

    If you want to make use of a queue for parallel processing, you need to check your architecture. You either need to start with a (Storage Account) Queue instead of the EventHub or you need to pass all incoming messages from the Event Hub to the queue first before parallel processing can happen.


    If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.

    1 person found this answer helpful.

  2. MayankBargali-MSFT 69,946 Reputation points
    2024-04-02T09:54:36.3966667+00:00

    @Bharath Narayan Thanks for reaching out.

    Azure Function Event Hub trigger can consume events from multiple partitions of an Event Hub. When you create an Event Hub trigger for an Azure Function, you can specify the Event Hub name, the connection string, and the consumer group to use. Each consumer group provides a separate view of the event stream, allowing multiple consumers to read from the same Event Hub without interfering with each other.
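    The "separate view" behaviour of consumer groups can be sketched as follows (an illustrative model, not the Azure SDK: each group is just an independent read position over the same partition log):

    ```python
    # Illustrative sketch: a consumer group is an independent cursor over the
    # same event log. Every group sees every event; groups do not split the
    # stream between them.
    stream = [f"event-{i}" for i in range(10)]   # the partition's event log

    # Each consumer group keeps its own checkpoint (offset) into the log.
    checkpoints = {"$Default": 0, "reporting": 0}

    def receive(group: str, batch_size: int) -> list:
        start = checkpoints[group]
        batch = stream[start:start + batch_size]
        checkpoints[group] = start + len(batch)   # advance this group's cursor only
        return batch

    a = receive("$Default", 10)    # one function app drains the whole log
    b = receive("reporting", 10)   # a second group still sees every event

    assert a == b == stream        # both groups got their own full copy
    ```

    This is why adding a second function app on a *different* consumer group does not divide the work; it duplicates it.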

    By default, the Event Hub trigger consumes events from all partitions of the Event Hub. With the Event Hub trigger, there is no way to specify a particular partition ID to consume from.

    My suggestion would be to first try to increase the throughput of your Azure Function Event Hub consumer by scaling out the number of instances of the function app. This can be done by increasing the instance count in the App Service plan, or by using the Azure Functions Premium plan, which scales automatically based on the number of events in the event hub. If you have already reached the maximum capacity of your App Service plan and cannot scale further, you can reach out to us to see whether there is an alternative way to increase this limit. If that is not possible, you would need to write your own consumer using the Event Processor Host library, where you can consume events from a single partition within the same consumer group.
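    The scale-out model being described is partition-parallel: within one consumer group, each partition is owned by at most one processor instance at a time, so work parallelizes up to the partition count. A minimal sketch of that idea (illustrative only; in reality the Functions runtime or the event processor library handles partition leasing):

    ```python
    # Illustrative sketch of partition-parallel consumption: events are spread
    # across partitions by key, and each partition is processed by exactly one
    # worker, so events are handled once and stay ordered within a partition.
    from concurrent.futures import ThreadPoolExecutor

    NUM_PARTITIONS = 4
    partitions = {p: [] for p in range(NUM_PARTITIONS)}

    # Events land on partitions by a hash of their key (simplified).
    for i in range(1000):
        partitions[hash(f"device-{i}") % NUM_PARTITIONS].append(f"event-{i}")

    def process_partition(partition_id: int) -> int:
        # One owner per partition: no two workers touch the same partition.
        return len(partitions[partition_id])

    with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
        counts = list(pool.map(process_partition, range(NUM_PARTITIONS)))
    ```

    Every event is processed exactly once (the per-partition counts sum to the total), which is why increasing the partition count is the main lever for more parallelism.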

    I know adding another Azure Function as a consumer would help with throughput, but my doubt is whether the same events would get processed by multiple Azure Functions, so that duplicate data could get inserted into Cosmos DB.

    A consumer group is a view of the event stream, so a different Azure Function using a different consumer group would get its own view and consume the same events. In other words, two function apps on separate consumer groups would each process every event, which is what would lead to duplicates in Cosmos DB.
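    If some double processing is unavoidable (for example, after retries on poison messages), the usual mitigation on the Cosmos DB side is an idempotent upsert keyed on a stable event id. A sketch of the idea (using a plain dict as an illustrative stand-in for a Cosmos DB container; the real SDK exposes the same behaviour via the container's `upsert_item` with the event id as the document id):

    ```python
    # Illustrative sketch: an idempotent upsert keyed on event id means that
    # processing the same event twice overwrites the same document instead of
    # creating a duplicate.
    cosmos_container = {}   # id -> document (stand-in for a Cosmos container)

    def upsert(event: dict) -> None:
        # Writing the same id twice replaces the document; never duplicates it.
        cosmos_container[event["id"]] = event

    events = [{"id": "evt-1", "value": 10}, {"id": "evt-2", "value": 20}]

    # Two consumers (e.g. on different consumer groups) each process every event.
    for consumer in range(2):
        for e in events:
            upsert(e)
    ```

    Despite every event being processed twice, the container ends up with exactly one document per event id.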
