Using Azure Functions to validate event hub requests

Ivan Wilson 121 Reputation points
2021-01-27T10:14:55.483+00:00

We need to accept a large number of REST API calls and send the data to SQL. We expect to receive up to 1 million API calls per day, each containing up to 100 rows of data.
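For a sense of scale, the stated volumes can be turned into average rates with a little arithmetic. This is only a rough sketch: it assumes traffic is spread evenly over the day, which real traffic rarely is, so peak rates will be higher than these averages.

```python
# Back-of-the-envelope load estimate from the figures stated in the question.
CALLS_PER_DAY = 1_000_000       # stated upper bound
MAX_ROWS_PER_CALL = 100         # stated upper bound
SECONDS_PER_DAY = 24 * 60 * 60

# Averages only: this assumes a perfectly even spread of calls over the day.
avg_calls_per_second = CALLS_PER_DAY / SECONDS_PER_DAY
max_rows_per_day = CALLS_PER_DAY * MAX_ROWS_PER_CALL
avg_rows_per_second = max_rows_per_day / SECONDS_PER_DAY

print(f"average calls/s: {avg_calls_per_second:.1f}")   # 11.6
print(f"max rows/day:    {max_rows_per_day:,}")         # 100,000,000
print(f"average rows/s:  {avg_rows_per_second:.0f}")    # 1157
```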

We have set up a solution using API Management, Event Hubs, Stream Analytics and Azure SQL. This works fine, EXCEPT it doesn't have a good mechanism for handling invalid data.

We can configure Stream Analytics to discard any items that it fails to log to the SQL database. But really what we want to do is inform the sender that there is a problem with their data.

Is it appropriate to use Azure Functions to perform the validation logic? We can define a JSON schema and validate the inbound requests before sending them to the Event Hub. We have done a quick prototype and it seems to work. My concern is that it could affect performance and cost.
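To make the idea concrete, here is a minimal sketch of the kind of check that would run before anything is sent to the Event Hub. It is not our prototype and it is not a real JSON Schema validator: it is a hand-rolled stand-in that uses only the Python standard library. The `rows` wrapper and the field names `deviceId`, `timestamp` and `value` are hypothetical, chosen purely for illustration.

```python
import json

MAX_ROWS = 100  # upper bound stated in the question

# Hypothetical row shape (illustration only): field name -> required type.
ROW_FIELDS = {"deviceId": str, "timestamp": str, "value": (int, float)}


def validate_payload(body):
    """Return a list of error strings; an empty list means the payload is valid."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return [f"body is not valid JSON: {exc.msg}"]

    rows = payload.get("rows") if isinstance(payload, dict) else None
    if not isinstance(rows, list) or not rows:
        return ["'rows' must be a non-empty array"]
    if len(rows) > MAX_ROWS:
        return [f"'rows' has {len(rows)} items; the maximum is {MAX_ROWS}"]

    errors = []
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            errors.append(f"rows[{index}] must be an object")
            continue
        for field, expected_type in ROW_FIELDS.items():
            if field not in row:
                errors.append(f"rows[{index}].{field} is missing")
            # bool is a subclass of int in Python, so reject it explicitly.
            elif isinstance(row[field], bool) or not isinstance(row[field], expected_type):
                errors.append(f"rows[{index}].{field} has the wrong type")
    return errors
```

Collecting every error rather than stopping at the first one lets the sender fix a whole batch in a single round trip.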

Does anyone have real-world experience with doing something like this? Are there pitfalls we haven't considered?

Thanks

Azure Functions: An Azure service that provides an event-driven serverless compute platform.
Azure Event Hubs: An Azure real-time data ingestion service.

Accepted answer
  1. ChaitanyaNaykodi-MSFT 24,231 Reputation points Microsoft Employee
    2021-01-29T00:25:19.263+00:00

    (Posting this as an answer instead of a comment due to character restrictions.)
    Hello @Ivan Wilson , apologies for the delay in my response.
    Yes, you are correct: Azure Functions scale differently for different triggers. If I understand the question correctly, you want to compare the performance of the following two methods.

    Method 1:

    API Management -> Azure Function(Http triggered) -> Event Hubs

    Method 2:

    API Management -> Event Hubs -> Azure Function(EventHub triggered)

    For Method 1, you can return an HTTP status code of 201 or 422 in the response to the original request by using policies in APIM. You can configure the host behavior to handle concurrency. A good way to understand the scaling behavior and to tune the host settings is to conduct a load test, as shown here by Anthony. Another advantage of HTTP-triggered function apps is that new instances are allocated at a faster rate than for non-HTTP-triggered function apps.

    For Method 2, you can go through this guide to understand how scaling works in Event Hub-triggered function apps.
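    As a rough illustration of the Method 1 flow, the sketch below shows only the decision logic: validate the body, return 422 with the errors if it fails, otherwise forward it and return 201. It deliberately uses plain Python rather than the Azure Functions or Event Hubs SDKs. `handle_request`, `require_rows` and the `send_to_event_hub` callable are hypothetical names, and in a real function app the sender would wrap an Event Hubs output binding or client.

```python
import json


def handle_request(body, validate, send_to_event_hub):
    """Return (status_code, response_body) for one inbound request.

    validate:          callable taking the raw body, returning a list of errors
    send_to_event_hub: callable taking the raw body (hypothetical stand-in
                       for an Event Hubs output binding or client)
    """
    errors = validate(body)
    if errors:
        # Invalid data never reaches the Event Hub; the caller is told why.
        return 422, json.dumps({"errors": errors})
    send_to_event_hub(body)
    return 201, json.dumps({"status": "accepted"})


def require_rows(body):
    """Toy validator for the demo: the body must be a JSON object with 'rows'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ["body is not valid JSON"]
    if isinstance(payload, dict) and payload.get("rows"):
        return []
    return ["'rows' is required"]


sent = []  # stands in for the Event Hub in this demo
print(handle_request('{"rows": [1, 2]}', require_rows, sent.append))
print(handle_request('{"oops": true}', require_rows, sent.append))
print(f"messages forwarded: {len(sent)}")
```

    Passing the validator and the sender in as arguments keeps the decision logic easy to unit test without any Azure resources.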

    If you are planning to use the Consumption plan, understanding the concept of serverless cold-start might be helpful.

    Please let me know if there are any additional concerns, I will be glad to continue with our discussion. Thank you!


3 additional answers

  1. ChaitanyaNaykodi-MSFT 24,231 Reputation points Microsoft Employee
    2021-01-27T21:20:41.367+00:00

    Hello @Ivan Wilson , thank you for reaching out! I think Azure Functions will be well suited to your scenario. Going through this architectural overview and these examples might help you make the decision. You can also explore these real-world customer stories for Azure Functions.

    Additionally, you can go through these best-practice guidelines for Azure Functions and this guide on how to manage connections in Azure Functions. You can also use the resources listed here for pricing details. Please let me know if you have any additional concerns; I will be glad to continue our discussion. Thank you!


  2. Ivan Wilson 121 Reputation points
    2021-01-27T21:42:04.153+00:00

    Thanks @ChaitanyaNaykodi-MSFT . From a quick scan of the examples, many use a queueing mechanism like Event Hubs to collect the incoming requests. The function then pulls these out for processing.

    I understand the benefits of this model, but it doesn't allow us to let the submitter know that their request has failed validation. Ideally, we would like to use the HTTP response to the original request to return an HTTP status code of 201 or 422.

    Under what circumstances does it make sense to expose an Azure Function directly to requests versus keeping it behind a queueing mechanism?

    My guess is that it can't scale to very high volumes the way Event Hubs can. If that is the case, is there any guidance on what the limits are? I know that under the Consumption plan there is a maximum scale-out of 200 instances of the function app. If our scenario can stay below this limit, should we consider Azure Functions a suitable option?
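    One rough way to sanity-check the 200-instance ceiling is to estimate how many requests are in flight at once (arrival rate multiplied by time per request) and divide by how many requests one instance handles concurrently. In the sketch below only the daily volume comes from our requirements. The peak factor, the time per request and the per-instance concurrency are assumptions picked for illustration, and the estimate ignores cold starts and the rate at which new instances are added.

```python
import math

MAX_INSTANCES = 200                    # Consumption plan scale-out limit
AVERAGE_RPS = 1_000_000 / 86_400       # 1 million calls/day, spread evenly

# Assumptions for illustration only; measure these in a load test.
PEAK_FACTOR = 20                       # peak traffic relative to the average
SECONDS_PER_REQUEST = 0.5              # validation plus send to Event Hubs
CONCURRENT_REQUESTS_PER_INSTANCE = 1   # deliberately pessimistic


def instances_needed(requests_per_second, seconds_per_request, per_instance):
    """Estimate instances as in-flight requests divided by per-instance concurrency."""
    in_flight = requests_per_second * seconds_per_request
    return max(1, math.ceil(in_flight / per_instance))


peak_instances = instances_needed(
    AVERAGE_RPS * PEAK_FACTOR, SECONDS_PER_REQUEST, CONCURRENT_REQUESTS_PER_INSTANCE
)
print(f"estimated peak instances: {peak_instances} of {MAX_INSTANCES}")
```

    Under these pessimistic assumptions the estimate comes to 116 instances, so the stated volume would stay under the limit, but the result is only as good as the assumed inputs.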


  3. Ivan Wilson 121 Reputation points
    2021-01-29T05:26:01.477+00:00

    Thanks @ChaitanyaNaykodi-MSFT , that is very helpful.

    We have done some initial testing and Azure Functions is scaling well. The costs are also low - less than 1c for 6,000 executions. We plan to do testing at higher volumes next week.
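    Extrapolating that figure to the full volume gives a rough upper bound. This assumes cost scales linearly with the number of executions and treats "less than 1c" as exactly 1 cent. It ignores any free grant, changes in execution time or memory at higher load, and the cost of API Management and Event Hubs.

```python
# Linear extrapolation of the observed test cost (an assumption, not a quote).
OBSERVED_COST_CENTS = 1.0        # upper bound: "less than 1c"
OBSERVED_EXECUTIONS = 6_000
CALLS_PER_DAY = 1_000_000        # expected maximum volume
DAYS_PER_MONTH = 30

cost_per_execution_cents = OBSERVED_COST_CENTS / OBSERVED_EXECUTIONS
daily_cost = cost_per_execution_cents * CALLS_PER_DAY / 100   # in whole currency units
monthly_cost = daily_cost * DAYS_PER_MONTH

print(f"daily cost:   under {daily_cost:.2f}")     # under 1.67
print(f"monthly cost: under {monthly_cost:.2f}")   # under 50.00
```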

    We are using a combination of .NET console apps and https://artillery.io to generate test data. I believe Microsoft has dropped the load testing feature in Azure DevOps that Anthony used in the article you referenced above.
