Vinicius Aguiar
Architecture

Circuit Breaker in Node.js: protecting systems against cascading failures

Apr 16, 2026 · 9 min read

When your application depends on external APIs — payment providers, marketplaces, shipping services — you're accepting that part of your system is out of your control. These dependencies can become slow, return errors, or simply stop responding. Without protection, an unstable external API can bring down your entire system. The Circuit Breaker is the pattern that prevents this.

The problem: cascading failures

Imagine a real scenario: your application calls a marketplace API during checkout. Normally this call takes 200ms. But the marketplace is having issues and starts taking 30 seconds to respond — or simply doesn't respond at all.

What happens without protection:

  1. The user's checkout hangs waiting for the marketplace response
  2. While waiting, new requests arrive and also get stuck
  3. Your server's connection pool runs out
  4. Your server stops responding to all users — not just those depending on the marketplace
  5. The entire system goes down because of an external dependency

This is a cascading failure. An unstable dependency propagates the failure through the entire system. The Circuit Breaker stops this propagation.
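
A rough back-of-the-envelope check shows why the pool runs out. By Little's law, the average number of requests in flight equals the arrival rate multiplied by how long each request is held. In the sketch below, the 200ms and 30s latencies come from the scenario above; the traffic figure (50 requests per second) and the pool size (100) are made-up numbers for illustration only.

```typescript
// Little's law: average in-flight requests = arrival rate × time each request is held.
function inFlightRequests(requestsPerSecond: number, latencyMs: number): number {
  return (requestsPerSecond * latencyMs) / 1000
}

// Hypothetical numbers, not taken from a real system.
const POOL_SIZE = 100
const REQUESTS_PER_SECOND = 50

const healthy = inFlightRequests(REQUESTS_PER_SECOND, 200)     // 10 in flight
const degraded = inFlightRequests(REQUESTS_PER_SECOND, 30_000) // 1500 in flight

console.log(`healthy: ${healthy} in flight, pool exhausted: ${healthy > POOL_SIZE}`)
console.log(`degraded: ${degraded} in flight, pool exhausted: ${degraded > POOL_SIZE}`)
```

Under these assumptions the same traffic that needs 10 connections on a good day would need 1500 when the marketplace slows down, which is how one slow dependency ends up starving every other request.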

How the Circuit Breaker works

The pattern works like an electrical circuit breaker. It monitors calls to an external dependency and has three states:

  • CLOSED — normal state. Requests pass through to the external API. If failures accumulate beyond the threshold, transitions to OPEN.
  • OPEN — protection state. Requests do not go to the API. Returns fallback immediately. After a cooldown period, transitions to HALF-OPEN.
  • HALF-OPEN — testing state. Allows one request through as a test. If successful, goes back to CLOSED. If it fails, goes back to OPEN.

[Figure: Circuit Breaker state machine diagram showing the CLOSED, OPEN and HALF-OPEN states and their transitions]
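
The three states and their transitions can also be written down as a small lookup table. This is a standalone sketch of the state machine only, with no timing or failure counting; the event names (THRESHOLD_REACHED, COOLDOWN_ELAPSED, PROBE_SUCCEEDED, PROBE_FAILED) are labels invented here for illustration.

```typescript
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN'
type CircuitEvent =
  | 'THRESHOLD_REACHED'
  | 'COOLDOWN_ELAPSED'
  | 'PROBE_SUCCEEDED'
  | 'PROBE_FAILED'

// Transition table: for each state, the events that move it somewhere else.
const transitions: Record<CircuitState, Partial<Record<CircuitEvent, CircuitState>>> = {
  CLOSED: { THRESHOLD_REACHED: 'OPEN' },
  OPEN: { COOLDOWN_ELAPSED: 'HALF_OPEN' },
  HALF_OPEN: { PROBE_SUCCEEDED: 'CLOSED', PROBE_FAILED: 'OPEN' },
}

// Events that don't apply to the current state leave it unchanged.
function nextState(state: CircuitState, event: CircuitEvent): CircuitState {
  return transitions[state][event] ?? state
}

console.log(nextState('CLOSED', 'THRESHOLD_REACHED')) // OPEN
console.log(nextState('HALF_OPEN', 'PROBE_FAILED'))   // OPEN
```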

TypeScript implementation

I'll implement a generic Circuit Breaker that can wrap any external call. The idea is to make it reusable across different dependencies.

type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN'

interface CircuitBreakerOptions {
  failureThreshold: number  // failures before opening
  cooldownMs: number        // time in OPEN before testing
  timeoutMs: number         // timeout per request
}

class CircuitBreaker {
  private state: CircuitState = 'CLOSED'
  private failureCount = 0
  private lastFailureTime = 0
  private readonly options: CircuitBreakerOptions

  constructor(options: Partial<CircuitBreakerOptions> = {}) {
    this.options = {
      failureThreshold: options.failureThreshold ?? 5,
      cooldownMs: options.cooldownMs ?? 60_000,
      timeoutMs: options.timeoutMs ?? 10_000,
    }
  }

  async execute<T>(fn: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime >= this.options.cooldownMs) {
        this.state = 'HALF_OPEN'
      } else {
        return fallback()
      }
    }

    try {
      const result = await this.withTimeout(fn())
      this.onSuccess()
      return result
    } catch (error) {
      this.onFailure()
      return fallback()
    }
  }

  private onSuccess() {
    this.failureCount = 0
    this.state = 'CLOSED'
  }

  private onFailure() {
    this.failureCount++
    this.lastFailureTime = Date.now()

    if (this.failureCount >= this.options.failureThreshold) {
      this.state = 'OPEN'
    }
  }

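  private withTimeout<T>(promise: Promise<T>): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error('Timeout')), this.options.timeoutMs)
    })
    // Clear the timer once either promise settles so it doesn't linger.
    // Note: this stops waiting for the call; it does not cancel it.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
  }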

  getState(): CircuitState {
    return this.state
  }
}
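
One detail worth noting: once the cooldown has elapsed, the class above lets every concurrent request through until the first one settles, while the HALF-OPEN description says a single test request. Below is a minimal sketch of one way to enforce that. The SingleProbeBreaker name, the probeInFlight flag and the injectable now clock are hypothetical additions made for this example, and the timeout handling is left out to keep it short.

```typescript
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN'

class SingleProbeBreaker {
  private state: CircuitState = 'CLOSED'
  private failureCount = 0
  private openedAt = 0
  private probeInFlight = false
  private readonly failureThreshold: number
  private readonly cooldownMs: number
  private readonly now: () => number

  // `now` is injectable so the cooldown can be tested without real waiting.
  constructor(failureThreshold = 3, cooldownMs = 30_000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold
    this.cooldownMs = cooldownMs
    this.now = now
  }

  async execute<T>(fn: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = 'HALF_OPEN'
    }
    if (this.state === 'OPEN') return fallback()

    let isProbe = false
    if (this.state === 'HALF_OPEN') {
      // Only one caller may test the dependency; everyone else gets the fallback.
      if (this.probeInFlight) return fallback()
      this.probeInFlight = true
      isProbe = true
    }

    try {
      const result = await fn()
      this.failureCount = 0
      this.state = 'CLOSED'
      return result
    } catch {
      this.failureCount++
      // A failed probe reopens the circuit immediately and restarts the cooldown.
      if (isProbe || this.failureCount >= this.failureThreshold) {
        this.state = 'OPEN'
        this.openedAt = this.now()
      }
      return fallback()
    } finally {
      if (isProbe) this.probeInFlight = false
    }
  }

  getState(): CircuitState {
    return this.state
  }
}
```

The trade-off is that while the probe is running, all other callers still receive the fallback, so a slow probe delays recovery a little in exchange for not flooding a dependency that may still be struggling.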

Practical usage: protecting an API call

With the class created, wrapping any external call is straightforward:

// One circuit breaker per dependency
const marketplaceBreaker = new CircuitBreaker({
  failureThreshold: 3,   // opens after 3 failures
  cooldownMs: 30_000,    // waits 30s before testing
  timeoutMs: 5_000,      // 5s timeout per call
})

const paymentBreaker = new CircuitBreaker({
  failureThreshold: 2,   // more sensitive — it's payments
  cooldownMs: 60_000,    // longer cooldown
  timeoutMs: 10_000,     // longer timeout — payment providers are slow
})

// Protected call
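async function getProductFromMarketplace(productId: string) {
  return marketplaceBreaker.execute(
    // Actual call. fetch only rejects on network errors, so an HTTP error
    // status is turned into a failure explicitly for the breaker to count.
    async () => {
      const res = await fetch(`https://api.marketplace.com/products/${productId}`)
      if (!res.ok) throw new Error(`Marketplace responded with HTTP ${res.status}`)
      return res.json()
    },
    // Fallback when the circuit is open or the call fails
    () => getCachedProduct(productId)
  )
}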

The key point: each external dependency should have its own Circuit Breaker. If the marketplace goes down, the marketplace circuit opens — but the payment provider circuit stays closed and keeps working normally.
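
One way to keep exactly one breaker per dependency is to create them through a small registry, so a breaker is never accidentally constructed per request (which would reset its failure count every time). The sketch below is an assumption about how this could be organized, not part of the implementation above; BreakerRegistry and FakeBreaker are hypothetical names, and FakeBreaker only stands in for the CircuitBreaker class to keep the snippet self-contained.

```typescript
// Lazily creates one breaker per dependency name and reuses it afterwards.
class BreakerRegistry<B> {
  private readonly breakers = new Map<string, B>()
  private readonly create: (name: string) => B

  constructor(create: (name: string) => B) {
    this.create = create
  }

  get(name: string): B {
    let breaker = this.breakers.get(name)
    if (breaker === undefined) {
      breaker = this.create(name)
      this.breakers.set(name, breaker)
    }
    return breaker
  }
}

// Stand-in for the CircuitBreaker class, only to keep this snippet self-contained.
class FakeBreaker {
  readonly name: string
  constructor(name: string) {
    this.name = name
  }
}

const registry = new BreakerRegistry<FakeBreaker>(name => new FakeBreaker(name))

// Same name, same instance: state accumulates across requests.
console.log(registry.get('marketplace') === registry.get('marketplace')) // true
// Different dependencies never share a breaker.
console.log(registry.get('marketplace') === registry.get('payments'))    // false
```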

Fallback strategies

The fallback is what your system returns when the circuit is open, or when a call fails while it is still closed. This part requires the most product thinking, because the fallback needs to be useful enough that the user barely notices the degradation. Some strategies:

  • Cache — return the last valid response. Works well for data that changes slowly (product catalog, configurations).
  • Default value — return a safe pre-defined value. Works for shipping calculations ("shipping to be confirmed") or inventory ("check availability").
  • Reduced functionality — checkout works without marketplace validation, but warns the user that the price will be confirmed later.
  • Queue for later retry — save the operation to retry when the service comes back. Works for operations that don't need an immediate response.

// Fallback with cache
const productCache = new Map<string, Product>()

async function getProduct(id: string): Promise<Product> {
  return marketplaceBreaker.execute(
    async () => {
      const product = await fetchFromMarketplace(id)
      productCache.set(id, product) // update cache on success
      return product
    },
    () => {
      const cached = productCache.get(id)
      if (cached) return cached
      // No cache — return product with unavailability flag
      return { id, name: 'Product unavailable', available: false }
    }
  )
}
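
The "queue for later retry" strategy from the list above could look roughly like the following. This is a minimal in-memory sketch under stated assumptions: RetryQueue and PendingOperation are hypothetical names, and a real system would most likely use a persistent queue so pending operations survive a restart.

```typescript
interface PendingOperation {
  id: string
  run: () => Promise<void>
}

class RetryQueue {
  private pending: PendingOperation[] = []

  enqueue(op: PendingOperation): void {
    this.pending.push(op)
  }

  get size(): number {
    return this.pending.length
  }

  // Re-runs every queued operation; the ones that still fail stay queued.
  async flush(): Promise<void> {
    const batch = this.pending
    this.pending = []
    for (const op of batch) {
      try {
        await op.run()
      } catch {
        this.pending.push(op)
      }
    }
  }
}

// Intended use (sketch): the breaker's fallback enqueues the operation and
// returns something like { status: 'queued' }; a periodic job calls flush().
const retryQueue = new RetryQueue()
retryQueue.enqueue({ id: 'order-1', run: async () => { /* call the external API here */ } })
console.log(retryQueue.size) // 1
```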

Combining with retry

Circuit Breaker and retry solve different problems:

  • Retry — for transient failures (one request failed, the next one will probably work). Uses exponential backoff.
  • Circuit Breaker — for sustained failures (the service is down, so there's no point in continuing to try). Protects the entire system.

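The correct combination: retry inside the circuit breaker. The breaker then sees each retry sequence as a single call: if one of the attempts succeeds it counts as a success, and only a sequence that fails on every attempt counts toward the threshold. Keep in mind that in the implementation above timeoutMs applies to the whole function passed to execute, so it has to cover every attempt plus the backoff delays between them.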

async function fetchWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3 // total number of attempts
): Promise<T> {
  let lastError: unknown

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn()
    } catch (error) {
      lastError = error
      // No point in waiting after the final attempt
      if (attempt === maxRetries - 1) break
      // Exponential backoff between attempts: 1s, 2s, ...
      await new Promise(r => setTimeout(r, Math.pow(2, attempt) * 1000))
    }
  }

  throw lastError
}

// Retry inside circuit breaker
async function getProductReliably(id: string) {
  return marketplaceBreaker.execute(
    () => fetchWithRetry(() => fetchFromMarketplace(id), 3),
    () => getCachedProduct(id)
  )
}
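
Because the breaker's timeout wraps the whole retry sequence, it helps to check that the two settings fit together. The helpers below are a hypothetical sketch: backoffDelays and worstCaseMs are names made up here, the schedule assumes no wait after the last attempt, and perAttemptMs assumes each individual call has its own time limit, which fetchWithRetry as written does not enforce.

```typescript
// Delays between attempts for exponential backoff: base, 2×base, 4×base, ...
// N attempts need only N - 1 waits.
function backoffDelays(maxAttempts: number, baseMs = 1000): number[] {
  return Array.from({ length: Math.max(0, maxAttempts - 1) }, (_, i) => baseMs * 2 ** i)
}

// Worst case for the whole sequence: every attempt runs to its per-attempt
// limit and every backoff delay is waited out.
function worstCaseMs(maxAttempts: number, perAttemptMs: number, baseMs = 1000): number {
  const waits = backoffDelays(maxAttempts, baseMs).reduce((sum, delay) => sum + delay, 0)
  return maxAttempts * perAttemptMs + waits
}

console.log(backoffDelays(3))     // [ 1000, 2000 ]
console.log(worstCaseMs(3, 1000)) // 6000
```

With the marketplaceBreaker settings shown earlier (timeoutMs of 5s), three attempts with 1s and 2s of backoff leave about 2s in total for the calls themselves, so either the breaker timeout or the retry schedule would need adjusting.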

Monitoring: knowing when the circuit opens

A circuit breaker that opens silently is dangerous — you need to know when an external service is unstable. Adding state transition events solves this:

class ObservableCircuitBreaker extends CircuitBreaker {
  private name: string

  constructor(name: string, options?: Partial<CircuitBreakerOptions>) {
    super(options)
    this.name = name
  }

  async execute<T>(fn: () => Promise<T>, fallback: () => T): Promise<T> {
    const previousState = this.getState()
    const result = await super.execute(fn, fallback)
    const currentState = this.getState()

    if (previousState !== currentState) {
      this.logTransition(previousState, currentState)
    }

    return result
  }

  private logTransition(from: string, to: string) {
    const message = `[CircuitBreaker:${this.name}] ${from} → ${to}`

    if (to === 'OPEN') {
      console.error(message) // alert when opening
      // Send to monitoring (Datadog, Sentry, etc.)
    } else if (to === 'CLOSED') {
      console.info(message) // inform when recovering
    } else {
      console.warn(message) // half-open is a transition
    }
  }
}

With this, you know exactly when an external service starts failing and when it recovers. In production, these logs should go to an alerting system — if the payment provider circuit opens, someone needs to be notified.
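
One limitation of comparing the state before and after execute is that it only sees the net change of a call. A recovery that passes through HALF_OPEN is reported as OPEN → CLOSED, and a failed test request (OPEN → HALF_OPEN → OPEN) is not reported at all. An alternative is to route every state change through a setter that notifies listeners. The sketch below is an assumption about how that could be done; ObservableState and onTransition are hypothetical names, not part of the classes above.

```typescript
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN'
type TransitionListener = (from: CircuitState, to: CircuitState) => void

// Holds the state and notifies listeners on every actual change.
class ObservableState {
  private current: CircuitState = 'CLOSED'
  private readonly listeners: TransitionListener[] = []

  onTransition(listener: TransitionListener): void {
    this.listeners.push(listener)
  }

  get(): CircuitState {
    return this.current
  }

  set(next: CircuitState): void {
    if (next === this.current) return // not a transition
    const previous = this.current
    this.current = next
    for (const listener of this.listeners) listener(previous, next)
  }
}

const state = new ObservableState()
const seenTransitions: string[] = []
state.onTransition((from, to) => {
  seenTransitions.push(`${from} -> ${to}`)
})

// Circuit opens, then a test request fails: every step is reported.
state.set('OPEN')
state.set('HALF_OPEN')
state.set('OPEN')
console.log(seenTransitions)
```

A breaker built this way would call set wherever it currently assigns to its state field, and the logging shown above would move into a listener.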

Real scenario: inconsistent marketplace API

A scenario I faced: a marketplace API returned product data inconsistently. Sometimes it worked in 200ms, sometimes it took 15 seconds, sometimes it returned 500. The checkout depended on this API to validate price and stock.

The solution was:

  1. Circuit Breaker with threshold of 3 failures and 30s cooldown
  2. Aggressive timeout of 3s (if it didn't respond in 3s, consider it a failure)
  3. Cache as fallback — last valid product response, with a "price subject to confirmation" flag
  4. Later reconciliation — job running every 5 minutes checking if cached prices are still correct

The result: checkout never hung because of the marketplace again. When the API was unstable, users saw cached prices with a subtle warning. When it recovered, the circuit closed and everything went back to real-time.
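
The reconciliation step could be sketched as follows. This is not the job that ran in that project, only an illustration of the idea under stated assumptions: findStalePrices, CachedPrice and fetchCurrentPrice are hypothetical names, and what happens with the stale ids (refreshing the cache, notifying the customer) is left out.

```typescript
interface CachedPrice {
  productId: string
  price: number
}

// Compares cached prices with fresh ones and returns the ids that drifted.
async function findStalePrices(
  cached: CachedPrice[],
  fetchCurrentPrice: (productId: string) => Promise<number>,
): Promise<string[]> {
  const stale: string[] = []
  for (const entry of cached) {
    try {
      const current = await fetchCurrentPrice(entry.productId)
      if (current !== entry.price) stale.push(entry.productId)
    } catch {
      // Dependency still unavailable for this product; check again on the next run.
    }
  }
  return stale
}

// Scheduling sketch, every 5 minutes (loadCachedPrices is hypothetical too):
// setInterval(() => { void findStalePrices(loadCachedPrices(), fetchCurrentPrice) }, 5 * 60 * 1000)
```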

When not to use Circuit Breaker

Not every external call needs a circuit breaker:

  • Already resilient calls — if the dependency has native retry and high SLA (e.g., AWS S3), the circuit overhead may be unnecessary.
  • Idempotent operations with a queue — if you already use a queue with retry (e.g., webhook processing), the queue already plays the circuit's role.
  • Non-critical calls — analytics, external logging, tracking. If it fails, it doesn't affect the user.

Summary

The Circuit Breaker is one of the most important tools for applications that depend on external services. The pattern itself is simple — a state machine with three states. What requires engineering is:

  1. Calibrate thresholds — how many failures before opening, how long to cooldown
  2. Design useful fallbacks — cache, default values, reduced functionality
  3. Monitor transitions — know when it opened and when it closed
  4. One breaker per dependency — isolate failures so one unstable API doesn't affect the others
  5. Combine with retry — retry for transient failures, circuit for sustained failures

The difference between a system that goes down with its dependencies and one that degrades gracefully is almost always a well-configured Circuit Breaker.