Long-Running Jobs with Azure Service Bus

Azure May 24, 2020

Azure Service Bus is a cloud messaging service offered as Platform-as-a-Service. It is intended for enterprise applications that require advanced messaging features such as transactions, ordering, duplicate detection and more.

Scenario: I have a job that takes more than 5 minutes to complete (5 minutes is the maximum lock duration Service Bus lets you configure). How can I keep the message locked until I am finished?

There are a few techniques to handle this. One commonly discussed approach is AutoRenewTimeout: your Service Bus client renews the lock at regular intervals. Although this technique works, it can be challenging to use with durable functions that might call nested long-running functions. Lock renewal is also not guaranteed; because it is initiated by the Service Bus client, it should be considered best-effort.

Here is an interesting way to solve this issue:

Solution

You can take advantage of two features Service Bus provides:

  1. Deferred message: after you receive a message from the queue, you can ask Service Bus to set it aside until you are ready for it. A deferred message is no longer delivered to any consumer unless the consumer explicitly asks for it by presenting its message sequence number.
  2. Scheduled delivery: you can send a message with a future enqueue time. Consumers can see the message only once that time arrives.
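
To make the two behaviours concrete, here is a small in-memory model in Python. It is only a toy: `ToyQueue`, `Message` and their methods are hypothetical names invented for this illustration, not the Service Bus API, and the model ignores locks, sessions and delivery counts.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Message:
    body: str
    sequence_number: int
    enqueue_time: datetime
    deferred: bool = False


@dataclass
class ToyQueue:
    """In-memory stand-in for a queue (hypothetical, not the real service)."""
    messages: list = field(default_factory=list)
    next_sequence: int = 1

    def send(self, body, enqueue_time):
        # A future enqueue_time models scheduled delivery.
        msg = Message(body, self.next_sequence, enqueue_time)
        self.next_sequence += 1
        self.messages.append(msg)
        return msg

    def receive(self, now):
        # Normal receive skips deferred messages and messages not yet due.
        for msg in self.messages:
            if not msg.deferred and msg.enqueue_time <= now:
                return msg
        return None

    def defer(self, msg):
        msg.deferred = True

    def receive_deferred(self, sequence_number):
        # A deferred message is only reachable by its sequence number.
        for msg in self.messages:
            if msg.deferred and msg.sequence_number == sequence_number:
                return msg
        return None


now = datetime(2020, 5, 24, 12, 0, tzinfo=timezone.utc)
queue = ToyQueue()
job = queue.send("job", enqueue_time=now)
later = queue.send("check", enqueue_time=now + timedelta(minutes=30))

queue.defer(queue.receive(now))
print(queue.receive(now))                                # None
print(queue.receive_deferred(job.sequence_number).body)  # job
print(queue.receive(now + timedelta(minutes=30)).body)   # check
```

After the deferral, a normal receive finds nothing: the job is hidden and the scheduled message is not yet due. The job can still be fetched by sequence number, and the scheduled message shows up once its enqueue time has passed.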

Using this knowledge, we can do the following to hold on to a message for a long time. In this solution I will use two queues:

Two Queues Main and Cleanup
  1. Main Queue - where messages are sent
  2. Cleanup Queue - used to clean up any deferred messages left in the Main Queue (which can happen if the main consumer crashes)

Here is what the steps look like for each consumer (both can run independently):

Main Queue Consumer

Main Queue Consumer Sample
  1. Receive a message (M) from the Main Queue
  2. Record the sequence number (SQ) of M
  3. Create a new message (C) and schedule it on the Cleanup Queue so that it points back to M
    i) Message Id of C = SQ, i.e. the sequence number from step 2
    ii) Enqueue time of C = later than the maximum time your consumer needs to process the message
  4. Mark message M as deferred
  5. Start processing the message
  6. If all goes well, retrieve message M using SQ and mark it as complete or failed
  7. If the consumer crashes, the Cleanup Consumer will take care of M
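
The steps above can be sketched in Python as follows. This is a minimal sketch, not the code from the sample repository. The `receiver` and `cleanup_sender` arguments are duck-typed; the method names (`receive_messages`, `defer_message`, `receive_deferred_messages`, `complete_message`, `dead_letter_message`, `schedule_messages`) mirror the `azure-servicebus` Python SDK as I understand it, so check them against the SDK version you use. `process_with_deferral`, `make_message` and `work` are hypothetical names, and treating "failed" as dead-lettering is my assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable


def process_with_deferral(
    receiver: Any,                                # Main Queue receiver
    cleanup_sender: Any,                          # Cleanup Queue sender
    make_message: Callable[[str, str], Any],      # (body, message_id) -> message
    work: Callable[[Any], None],                  # the long-running job
    max_processing_time: timedelta,
) -> bool:
    """Handle one message; return False if the Main Queue was empty."""
    # Step 1: receive M.
    received = receiver.receive_messages(max_message_count=1, max_wait_time=5)
    if not received:
        return False
    message = received[0]

    # Step 2: record SQ.
    sequence_number = message.sequence_number

    # Step 3: schedule C with Message Id = SQ, due after the longest
    # time the job is expected to take.
    cleanup = make_message(str(sequence_number), str(sequence_number))
    due = datetime.now(timezone.utc) + max_processing_time
    cleanup_sender.schedule_messages(cleanup, due)

    # Step 4: defer M so no other consumer picks it up.
    receiver.defer_message(message)

    # Step 5: do the work, remembering any failure.
    failure = None
    try:
        work(message)
    except Exception as exc:
        failure = exc

    # Step 6: fetch M again by SQ and settle it.
    deferred = receiver.receive_deferred_messages(sequence_number)[0]
    if failure is None:
        receiver.complete_message(deferred)
    else:
        # Assumption: "failed" is modelled here as dead-lettering.
        receiver.dead_letter_message(
            deferred, reason="ProcessingFailed", error_description=str(failure)
        )
    return True
```

With the real SDK, `make_message` could be something like `lambda body, message_id: ServiceBusMessage(body, message_id=message_id)`. Step 7 needs no code here: if the process dies between steps 4 and 6, M stays deferred and the scheduled message C eventually triggers the cleanup.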

Cleanup Queue Consumer

Cleanup Queue Consumer Sample
  1. Read a message (C) from the Cleanup Queue
  2. Set the sequence number (SQ) = Message Id of C
  3. Try to retrieve the deferred message (M) from the Main Queue using SQ
  4. If M is still present, something likely went wrong in the main consumer. You can either create a new message by copying M and re-post it to the Main Queue, or simply log the error. Then mark M as complete
  5. Mark C as complete
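
A matching sketch of the cleanup side, again in Python and under the same caveats: the receivers are duck-typed with method names that mirror the `azure-servicebus` Python SDK as I understand it, while `run_cleanup_once`, `on_orphan` and `not_found_errors` are hypothetical names. How the SDK reports a deferred message that no longer exists is something to verify; the sketch accepts either an empty list or one of the exception types you pass in.

```python
from typing import Any, Callable, Optional, Tuple, Type


def run_cleanup_once(
    cleanup_receiver: Any,                        # Cleanup Queue receiver
    main_receiver: Any,                           # Main Queue receiver
    on_orphan: Optional[Callable[[Any], None]] = None,
    not_found_errors: Tuple[Type[BaseException], ...] = (LookupError,),
) -> bool:
    """Handle one cleanup message; return False if the queue was empty."""
    # Step 1: read C.
    received = cleanup_receiver.receive_messages(
        max_message_count=1, max_wait_time=5
    )
    if not received:
        return False
    cleanup = received[0]

    # Step 2: SQ is stored in the Message Id of C.
    sequence_number = int(cleanup.message_id)

    # Step 3: try to fetch the deferred message M by SQ.
    # Assumption: a missing message shows up either as an empty result
    # or as one of the exception types in not_found_errors.
    try:
        orphans = main_receiver.receive_deferred_messages(sequence_number)
    except not_found_errors:
        orphans = []

    # Step 4: M still exists, so the main consumer never settled it.
    for orphan in orphans:
        if on_orphan is not None:
            on_orphan(orphan)  # e.g. re-post a copy to Main Queue, or log
        main_receiver.complete_message(orphan)

    # Step 5: C has done its job.
    cleanup_receiver.complete_message(cleanup)
    return True
```

In the normal case the main consumer has already completed M, step 3 finds nothing, and the function simply completes C.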

Code Sample:

kunalbabre/ServiceBus-Long-Running-Consumer-Sample
Sample to handle long-running jobs with Azure Service Bus

Kunal Babre

Azure and Kubernetes Certified Cloud Architect, with a passion for building robust cloud architecture. I work with strategic customers to spearhead their journey into the Cloud.
