
Rate Limit and Retry: A Basic Scheme for Reliable Integrations

◷ 7 min read 3/4/2026


In any integration, sooner or later, the same problem arises: you send a request to the API and it does not go through. Sometimes the server responds with an error, sometimes the network drops, and sometimes the API simply says "too many requests".

In the logs, it looks something like this:

code
429 Too Many Requests

or

code
500 Internal Server Error

If the system fails to respond properly to such situations, the integration becomes unstable:

  • requests fail
  • data is not synchronized
  • events are lost
  • the system starts repeating requests endlessly

To prevent this from happening, any serious integration uses two basic mechanisms: rate limit and retry.

The first is responsible for controlling the speed of requests, the second is responsible for repeating attempts when errors occur.

If implemented correctly, integration becomes sustainable even with network failures and high loads.


What is a rate limit

A rate limit is a cap on the number of requests that can be sent to an API in a given period of time.

Almost all public APIs use such restrictions.

Examples:

  • GitHub API – 5,000 requests per hour
  • Stripe – approximately 100 requests per second
  • many SaaS APIs – 60 requests per minute

When the limit is exceeded, the server returns the response:

code
429 Too Many Requests

Sometimes with an additional header:

code
Retry-After: 10

This means that a new request can be sent only after 10 seconds.

Such restrictions are necessary to protect the infrastructure and distribute resources fairly among clients.
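When the server does provide Retry-After, it is better to honor it than to guess. A minimal sketch of parsing the header (the function name and fallback value are assumptions; the header can hold either a number of seconds or an HTTP date):

```javascript
// Parse a Retry-After header value into a delay in milliseconds.
// The value can be a number of seconds ("10") or an HTTP date;
// fall back to a default when it is missing or malformed.
function retryAfterMs(headerValue, fallbackMs = 1000) {
  if (!headerValue) return fallbackMs
  const seconds = Number(headerValue)
  if (!Number.isNaN(seconds)) return seconds * 1000
  const date = Date.parse(headerValue)
  return Number.isNaN(date) ? fallbackMs : Math.max(0, date - Date.now())
}
```

With fetch, this could be used as `await sleep(retryAfterMs(response.headers.get("Retry-After")))` after a 429.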


Why APIs restrict requests

At first glance, it may seem that the rate limit is just an inconvenience for developers. But there are important reasons for this mechanism.

The first reason is server protection. If one client sends thousands of requests per second, it can overload the system.

The second reason is fair use of resources. Restrictions ensure that one user does not consume all of the service's capacity.

The third reason is security. Rate limiting helps protect against brute-force and other attacks.


What happens without rate limit control

Imagine a simple integration that sends data to an API.

The code might look like this:

javascript
for (const event of events) {
  await api.send(event)
}

If there are few events, everything works fine. But if there are thousands of them, the system starts sending a huge number of requests.

At some point, the API begins to respond with an error:

code
429 Too Many Requests

If the code doesn't know how to handle these responses correctly, the integration degrades:

  • requests fail
  • the system overloads the API
  • part of the data is lost

Therefore, it is important to control the speed of sending requests.
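The most direct fix is to pause between sends. A minimal sketch (the function name and the 200 ms delay are example choices; `send` stands in for the `api.send` call from the loop above):

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Send events one at a time with a fixed pause between requests,
// keeping the rate at roughly 1000 / delayMs requests per second.
async function sendThrottled(events, send, delayMs = 200) {
  for (const event of events) {
    await send(event)
    await sleep(delayMs) // ~5 requests per second at 200 ms
  }
}
```

This loses the simplicity of the original loop, but the API no longer sees a burst of thousands of requests at once.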


What is retry

Retry is a repeated attempt to execute a request after an error.

The point of retry is that many errors are temporary.

For example:

  • the server is overloaded
  • the load balancer returned an error
  • the network dropped for a second

In such cases, repeated requests are often successful.

Typical errors in which retry is used:

  • 500 Internal Server Error
  • 502 Bad Gateway
  • 503 Service Unavailable
  • timeout

In all these situations, trying again makes sense.


When not to retry

Some errors mean that the request will never be successful until the data changes.

For example:

  • 400 Bad Request
  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found

If you repeat such requests, the system will simply create an extra load.

Therefore, retry should only be applied to temporary errors.
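In code, this distinction can be captured in one small helper (a sketch; the exact list of retryable statuses depends on the API):

```javascript
// Decide whether an HTTP status code is worth retrying.
// 429 and 5xx are usually temporary; other 4xx codes mean the
// request itself is wrong and will keep failing until it changes.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status < 600)
}
```

A retry loop can then check `isRetryable(response.status)` before scheduling another attempt.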


The problem of naive retry

The simplest implementation of retry is as follows:

javascript
try {
  await api.request()
} catch (e) {
  await api.request()
}

But this approach can lead to a serious problem.

If the server is already overloaded, instant repeated requests will only increase the load. As a result, the system can end up in a state where thousands of clients start repeating requests at the same time.

This is called a retry storm: a storm of repeated requests.

To avoid this, a more accurate algorithm is used.


Exponential backoff

One of the most popular retry algorithms is exponential backoff.

The idea is simple: each successive attempt is made with an increasing delay.

For example:

  • attempt 1 – immediately
  • attempt 2 – after 1 second
  • attempt 3 – after 2 seconds
  • attempt 4 – after 4 seconds
  • attempt 5 – after 8 seconds

This gives the server time to recover and dramatically reduces the load.


Example of retry in JavaScript

Simple implementation of retry with exponential backoff:

javascript
async function requestWithRetry(fn, retries = 5) {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await fn()
    } catch (error) {
      // Out of attempts: give up and surface the error
      if (attempt === retries - 1) {
        throw error
      }

      // Exponential backoff: 1s, 2s, 4s, 8s, ...
      const delay = 2 ** attempt * 1000

      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }
}

Usage:

javascript
await requestWithRetry(() => fetch("https://api.example.com"))

If the request ends in an error, the function will automatically try again.
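In practice, a random "jitter" is usually added to the delay so that many clients do not all wake up at the same instant and recreate the retry storm. A sketch of the "full jitter" variant of the delay calculation (the function name and the base/cap values are example choices):

```javascript
// Full jitter: pick a random delay between 0 and the exponential cap.
// This spreads simultaneous retries across the whole waiting window.
function backoffWithJitter(attempt, baseMs = 1000, maxMs = 30000) {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt)
  return Math.random() * cap
}
```

Replacing `const delay = 2 ** attempt * 1000` with `const delay = backoffWithJitter(attempt)` is enough to desynchronize clients.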


Limiting the speed of requests

In addition to retry, you often need to control the speed of sending requests.

The simplest scheme is to use a queue.

First, tasks are queued, then the worker sends them to the API at a controlled rate.

The simplest limiter can look like this:

javascript
class RateLimiter {
  constructor(limit, interval) {
    this.limit = limit        // max requests in flight
    this.interval = interval  // cooldown per slot, in ms
    this.queue = []
    this.active = 0
  }

  schedule(task) {
    return new Promise((resolve, reject) => {
      this.queue.push({ task, resolve, reject })
      this.run()
    })
  }

  run() {
    if (this.active >= this.limit || this.queue.length === 0) return

    const { task, resolve, reject } = this.queue.shift()
    this.active++

    task()
      .then(resolve, reject) // propagate errors instead of hanging the caller
      .finally(() => {
        // free the slot only after the interval, capping the request rate
        setTimeout(() => {
          this.active--
          this.run()
        }, this.interval)
      })
  }
}

This limiter lets you send, for example, no more than 5 requests per second.


What it looks like in real architecture

In production systems, the scheme usually looks like this:

  1. events are put into a queue
  2. a worker takes a task
  3. the rate limiter controls the speed
  4. the API request is sent
  5. on error, retry kicks in

This architecture allows the system to:

  • stay within the API limits
  • handle temporary errors correctly
  • avoid losing data during network failures
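The whole pipeline can be condensed into one function. This is a simplified, self-contained sketch, not production code: `send` stands in for the real API call, and the rate and retry counts are example values.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Drain a queue of events at a fixed rate, retrying temporary
// failures with exponential backoff; undelivered events are
// returned instead of being silently lost.
async function processQueue(queue, send, { ratePerSec = 5, retries = 3 } = {}) {
  const gap = 1000 / ratePerSec // pause between events to respect the rate
  const failed = []

  for (const event of queue) {
    let delivered = false

    for (let attempt = 0; attempt < retries && !delivered; attempt++) {
      try {
        await send(event)
        delivered = true
      } catch {
        await sleep(2 ** attempt * 100) // exponential backoff between attempts
      }
    }

    if (!delivered) failed.push(event) // keep the data for later processing
    await sleep(gap)
  }

  return failed
}
```

In a real system the queue would be durable (a database table or a message broker), but the control flow is the same.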

Summary

Rate limit and retry are the foundation of any reliable integration.

Rate limit controls the speed of requests and protects the API from overload. Retry helps the system automatically recover from temporary errors.

Even the simple implementation of these mechanisms significantly improves the stability of integrations and prevents data loss when working with external services.