Webhook Circuit Breaker (WCB)

This article explains how the Webhook Circuit Breaker (WCB) gets tripped and how that impacts the business.

Written By Ops UrbanPiper (Collaborator)

Updated at March 26th, 2020

As you may already be aware, we have retry logic (3 attempts) in place to deal with request failures (Bad Request / connection time-out) when making webhook callbacks to 3rd party URLs. To keep our infrastructure stable, and to avoid repeatedly retrying against systems whose webhook endpoints are down for a prolonged time or keep returning the same Bad Request error, we have implemented a mechanism called the "Webhook Circuit Breaker" (WCB).
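The retry behaviour described above can be sketched roughly as follows. This is only an illustration, not our actual implementation: the function name deliver_with_retry and the send callable are hypothetical, and the sketch assumes that "3 attempts" means three tries in total and that any non-2xx response counts as a failure.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # from the article: 3 attempts per webhook callback


def deliver_with_retry(send: Callable[[], int], max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Call `send` up to `max_attempts` times; return True on the first success.

    `send` is a hypothetical callable that performs one HTTP request and
    returns the response status code, or raises on a connection problem.
    """
    for _ in range(max_attempts):
        try:
            status = send()
        except (ConnectionError, TimeoutError):
            continue  # a connection time-out counts as a failed attempt
        if 200 <= status < 300:
            return True  # delivered, stop retrying
        # any other status (e.g. 400 Bad Request) counts as a failed attempt
    return False
```

Every failed attempt here is what the article calls a request failure.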

When a 3rd party system's URL returns more than 15 request failures within a minute for a particular business (biz), our system detects the breach and automatically disables the webhook endpoints configured for that biz. At this point, the webhook callback is said to have tripped the circuit breaker. The endpoints that get disabled are the ones whose URLs belong to the same domain as the failing URL.

While the WCB is tripped, no webhook payloads are pushed to the affected 3rd party URLs. Even if only a single webhook callback URL (say, Rider Status Update) associated with the 3rd party trips the breaker, the other webhook callbacks are disabled as well. Our system checks which domain the failures are coming from, and once the failure threshold is breached, it automatically disables all the webhook endpoints on that domain.
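To make the trip condition concrete, here is a minimal sketch of counting failures per biz and domain. The class and function names are hypothetical, and the use of a sliding 60-second window is an assumption; the article only states "more than 15 request failures within a minute".

```python
from collections import defaultdict, deque
from urllib.parse import urlparse

FAILURE_THRESHOLD = 15  # the breaker trips on MORE than 15 failures
WINDOW_SECONDS = 60     # "within a minute" (sliding window is an assumption)


class FailureTracker:
    """Tracks recent request failures per (biz, domain) pair."""

    def __init__(self) -> None:
        self._failures = defaultdict(deque)  # (biz_id, domain) -> failure timestamps

    def record_failure(self, biz_id: str, url: str, now: float) -> bool:
        """Record one failure; return True if the breaker should trip."""
        key = (biz_id, urlparse(url).netloc)
        window = self._failures[key]
        window.append(now)
        # Drop failures that are older than the window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        return len(window) > FAILURE_THRESHOLD


def endpoints_to_disable(endpoints: list, failing_url: str) -> list:
    """Return every configured endpoint that shares the failing URL's domain."""
    domain = urlparse(failing_url).netloc
    return [url for url in endpoints if urlparse(url).netloc == domain]
```

With this sketch, the 16th failure inside one minute for the same biz and domain returns True, and endpoints_to_disable then selects all callbacks on that domain, not just the one that failed.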

Our system automatically re-enables the webhook endpoints 15 minutes after it disabled them.

When the webhooks are re-enabled, our system tries to push, all at once, the orders that were not relayed during those 15 minutes. If the third-party system is still not responding during this re-push, the WCB will be tripped again.
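The cool-down and re-push can be pictured with the sketch below. It is illustrative only: the WebhookBreaker class, its method names and the send callable are hypothetical, and what happens to orders that fail again during the re-push is not specified here, so the sketch simply returns them to the caller.

```python
from typing import Callable, List, Optional

COOLDOWN_SECONDS = 15 * 60  # endpoints are re-enabled 15 minutes after being disabled


class WebhookBreaker:
    """Holds back orders while tripped and replays them once re-enabled."""

    def __init__(self) -> None:
        self.tripped_at: Optional[float] = None
        self.pending: List[str] = []  # order ids not relayed while disabled

    def trip(self, now: float) -> None:
        self.tripped_at = now

    def is_disabled(self, now: float) -> bool:
        return self.tripped_at is not None and now - self.tripped_at < COOLDOWN_SECONDS

    def hold(self, order_id: str) -> None:
        """Remember an order that could not be relayed while disabled."""
        self.pending.append(order_id)

    def replay_pending(self, now: float, send: Callable[[str], bool]) -> List[str]:
        """Push all held orders at once if re-enabled; return the ones that failed."""
        if self.is_disabled(now):
            return []  # still inside the 15-minute window, nothing is pushed
        queue, self.pending = self.pending, []
        return [order_id for order_id in queue if not send(order_id)]
```

Failures returned by replay_pending count as request failures like any other, which is why a third-party system that is still down can trip the WCB again right after re-enablement.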

We have an email notification that is triggered when the WCB gets tripped. Please check with the concerned ACM/OM/Integration PoC to get it configured for the business you onboard to our platform, and share the list of email IDs that should receive the notifications. The email provides the following details:

  • the list and details of all the request failures
  • which business was impacted
  • which URLs got disabled

The email alert will have the subject: Webhooks disabled for {{biz_name}}

Note: When the webhooks are re-enabled, our system tries to push the missed orders all at once to the third-party URLs. In this case, the state/order_state attribute of the Order Relay payload carries whatever state the order is in within the UrbanPiper system at that moment.

For example, if at the time of the retried order push the order's state in our system is already Acknowledged because of a status change made from an external tool (say, satellite), then the state/order_state attribute in the Order Placed payload will carry Acknowledged instead of Placed. Make sure that your integration does not expect only the Placed value in that attribute.
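On the receiving side, this means an order-relay handler should accept a new order in any state. A minimal sketch, assuming the state value has already been read from the state/order_state attribute (the function name, the in-memory store and the return labels are hypothetical):

```python
def handle_relayed_order(order_id: str, state: str, store: dict) -> str:
    """Create or update an order without assuming a new order arrives as Placed.

    `store` is a hypothetical in-memory mapping of order id -> last known state.
    """
    is_new = order_id not in store
    store[order_id] = state  # always keep the state that was actually sent
    if is_new and state != "Placed":
        # First time we see this order, but it has already moved on,
        # e.g. Acknowledged after a delayed re-push.
        return "created_with_advanced_state"
    return "created" if is_new else "updated"
```

The point of the sketch is that a first-time order with state Acknowledged is stored rather than rejected.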

If you have any questions, please reach out to us.
