How to Handle Concurrent Requests in JavaScript
Learn practical techniques to limit and control concurrent API requests in your JavaScript apps, with robust solutions for both Axios and fetch.
TL;DR
Ever had your app fire off a ton of API requests at once and wondered how to keep things under control? This post shows you how to limit concurrent requests in JavaScript, whether you use Axios or fetch. The best part: you don’t have to mess with your UI logic at all.
Why Should You Care About Concurrent Requests?
Modern data fetching libraries make API calls feel effortless, but sometimes you need to put the brakes on. If your app loads a bunch of data at once (think: table rows, batch jobs, or bulk updates), you can easily overwhelm your server or hit rate limits. That’s where request throttling comes in.
The Good Old Days of API Calls
Once upon a time, you used jQuery to handle API calls. (I’m too young to know what we did before that :P) Then React came along, and with it, Redux. After some tinkering, we settled on handling async API requests inside a “thunk”: as soon as you got a response, you’d dispatch a Redux action.
Today, we have more convenient—if not downright sophisticated—ways to handle this. Libraries like SWR and TanStack Query let you treat API responses (almost) just like ordinary local state. You just call a hook and declare the JSX you want for the loading and resolved states. Done.
You don’t really “call” an API anymore. You just “use” a state within a component—one that may or may not be defined yet. This is great because you don’t have to care about when or how you should call APIs. The library handles it.
The Problem: Too Many Requests at Once
However, there are scenarios where timing does matter.
For example, one of my apps deals with a lot of data in a table, where each row uses the same API but with different parameters. Normally, this is fine—users add rows manually, one at a time. The app persists those states (essentially the params each row uses) to localStorage. When the window refreshes, the persisted params let every row fetch its data at the same time: 30 to 50 requests fired almost at once. It’d be sensible to reduce the number of concurrent requests in these situations.
The Solution: Control Your Requests
You want to control when and how many requests you make at any given time, preferably without messing with the render logic. It would be a nightmare if you had to selectively render table rows based on how many API calls you’re allowed to make concurrently. Could you do it? Sure. Should you? Absolutely not. That’s complexity you don’t need.
Whether you’re using fetch directly or Axios as a fetcher for SWR or TanStack Query, there’s an easy, robust way to control this.
Solution 1: Axios with Interceptors
For Axios, you can provide request and response interceptors to limit how many requests Axios handles at any given time.
Here’s the key insight: Axios’s request interceptor can return a Promise that resolves to the request config. If we don’t resolve that promise immediately, the request just… waits. It’s like putting someone on hold, but for HTTP requests.
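Stripped to its essence, that “on hold” trick is just a deferred promise: create a promise, keep its resolver, and call the resolver whenever you decide. A tiny sketch (the `createDeferred` helper name is mine, not part of Axios):

```typescript
// A deferred promise in miniature: keep the resolver, call it later.
const createDeferred = <T>() => {
  let resolve!: (value: T) => void;
  const promise = new Promise<T>((res) => {
    resolve = res;
  });
  return { promise, resolve };
};

const slot = createDeferred<string>();
slot.promise.then((msg) => console.log(msg)); // stays pending...
slot.resolve("your turn"); // ...until we release it
```

This is exactly what the request interceptor does with each request config: hold the promise while we’re at the limit, resolve it when a slot frees up.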
import axios from "axios";
import type { AxiosError, AxiosResponse, InternalAxiosRequestConfig } from "axios";
type TQueuedReq = {
  request: InternalAxiosRequestConfig;
  resolver: (value: InternalAxiosRequestConfig) => void;
};

const MAX_CONCURRENT = 10;
const queue: TQueuedReq[] = [];
let _runningCount = 0;

// Let queued requests through while there are free slots.
const _processQueue = () => {
  while (_runningCount < MAX_CONCURRENT && queue.length > 0) {
    const queued = queue.shift();
    if (!queued) break;
    _runningCount += 1;
    queued.resolver(queued.request);
  }
};

// Called whenever a request finishes, successfully or not.
const _onComplete = () => {
  _runningCount -= 1;
  _processQueue();
};

// If there's a free slot, resolve immediately; otherwise park the
// request in the queue until a slot opens up.
const axiosRequestInterceptor = (req: InternalAxiosRequestConfig): Promise<InternalAxiosRequestConfig> => {
  return new Promise((resolve) => {
    if (_runningCount < MAX_CONCURRENT) {
      _runningCount += 1;
      resolve(req);
      return;
    }
    queue.push({ request: req, resolver: resolve });
  });
};

const axiosResponseInterceptor = <T>(res: AxiosResponse<T>): AxiosResponse<T> => {
  _onComplete();
  return res;
};

const axiosResponseErrorInterceptor = (error: AxiosError): Promise<never> => {
  _onComplete();
  return Promise.reject(error);
};
export const axiosInstance = axios.create();
axiosInstance.interceptors.request.use(axiosRequestInterceptor);
axiosInstance.interceptors.response.use(axiosResponseInterceptor, axiosResponseErrorInterceptor);
How It Works
Let me break this down step by step:

1. The Queue: We maintain a `queue` array that holds pending requests and a `_runningCount` to track how many requests are currently in flight.
2. Request Interceptor (`axiosRequestInterceptor`):
   - Every request passes through this function before being sent.
   - If we’re under the limit (`_runningCount < MAX_CONCURRENT`), we increment the counter and immediately resolve the request—it goes through.
   - If we’re at the limit, we push the request into the queue and don’t resolve. The request just sits there, waiting patiently.
3. Response Interceptors (`axiosResponseInterceptor` and `axiosResponseErrorInterceptor`):
   - When a request completes (success or error), we call `_onComplete()`.
   - This decrements `_runningCount` and calls `_processQueue()`.
4. Processing the Queue (`_processQueue`):
   - Loops through queued requests as long as we’re under the limit.
   - For each one, increments the counter and calls the stored `resolver()` to finally let the request through.
The beauty here is that Axios doesn’t know anything about our queue. It just sees a promise that eventually resolves. We’re essentially building a waiting room for HTTP requests.
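Nothing about this waiting room is HTTP-specific, by the way. The same counter-plus-queue gate can wrap any async task. Here’s a minimal standalone sketch (names like `createGate` are mine, not part of Axios):

```typescript
// Generic version of the pattern above: at most `max` tasks run at once.
const createGate = (max: number) => {
  let running = 0;
  const waiting: Array<() => void> = [];

  // Free a slot and, if anyone is waiting, let the next task start.
  const release = () => {
    running -= 1;
    waiting.shift()?.();
  };

  return <T>(task: () => Promise<T>): Promise<T> => {
    const acquire = new Promise<void>((resolve) => {
      if (running < max) {
        running += 1;
        resolve(); // free slot: start right away
      } else {
        // No slot: park the "start" callback, just like the request queue.
        waiting.push(() => {
          running += 1;
          resolve();
        });
      }
    });
    return acquire.then(task).finally(release);
  };
};

// Usage: const limited = createGate(5); limited(() => doSomethingAsync());
```

The design choice is identical: the caller just sees a promise that eventually resolves, and never learns whether its task ran immediately or waited in line.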
Solution 2: Plain Fetch with a Queue
You can do exactly the same thing with plain fetch. The approach is slightly different because fetch doesn’t have interceptors, but the core concept is identical.
type TQueuedFetch = {
  input: RequestInfo | URL;
  init?: RequestInit;
  resolve: (value: Response) => void;
  reject: (reason?: unknown) => void;
};

const MAX_CONCURRENT = 5;
const queue: TQueuedFetch[] = [];
let runningCount = 0;

// Start queued fetches while there are free slots; each completed
// fetch frees a slot and checks the queue again.
const processQueue = () => {
  while (runningCount < MAX_CONCURRENT && queue.length > 0) {
    const next = queue.shift();
    if (!next) break;
    runningCount += 1;
    globalThis
      .fetch(next.input, next.init)
      .then(next.resolve, next.reject)
      .finally(() => {
        runningCount -= 1;
        processQueue();
      });
  }
};

export const queuedFetch: typeof fetch = (input: RequestInfo | URL, init?: RequestInit) => {
  return new Promise<Response>((resolve, reject) => {
    queue.push({ input, init, resolve, reject });
    processQueue();
  });
};
How It Works
1. Drop-in Replacement: `queuedFetch` has the same signature as `fetch`. You can swap it in anywhere you’re using `fetch` directly.
2. Queue Everything: Every call to `queuedFetch` pushes the request details (URL, options, and the promise’s resolve/reject functions) onto the queue, then attempts to process it.
3. Process with Limits (`processQueue`):
   - While under the limit and there are queued items, pop one off.
   - Increment `runningCount` and make the actual `fetch` call.
   - When the fetch completes (via `.finally()`), decrement the counter and call `processQueue()` again.
4. Self-Healing: The `.finally()` ensures we always decrement and check the queue, even if the request fails. No request gets lost.
Example Scenario
Let’s say each call takes about 6 seconds (which it shouldn’t, but bear with me), and you make 6 requests in quick succession (within 3 seconds) with MAX_CONCURRENT = 5:

- Requests 1-5: Start immediately, so `runningCount = 5`.
- Request 6: Gets queued because we’re at the limit.
- After ~6 seconds, request 1 completes.
- `.finally()` fires: `runningCount` drops to 4, and `processQueue()` runs.
- Request 6 finally gets executed.
The user never notices. The component just waits a bit longer for that 6th response. No 500 errors. No angry servers. Everyone’s happy.
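If you want to watch this happen, here’s a standalone simulation: the limiter from Solution 2 repeated inline so the snippet runs on its own, pointed at a fake `fetch` that records how many requests are in flight (timings scaled down from ~6 s to ~60 ms):

```typescript
// Simulation of the scenario above: 6 requests, MAX_CONCURRENT = 5.
type TQueuedFetch = {
  input: RequestInfo | URL;
  init?: RequestInit;
  resolve: (value: Response) => void;
  reject: (reason?: unknown) => void;
};

const MAX_CONCURRENT = 5;
const queue: TQueuedFetch[] = [];
let runningCount = 0;

// Fake fetch: ~60 ms stands in for the post's ~6 s request, and we
// record the highest number of simultaneous requests we ever see.
let inFlight = 0;
let peak = 0;
const fakeFetch = (_input: RequestInfo | URL, _init?: RequestInit): Promise<Response> => {
  inFlight += 1;
  peak = Math.max(peak, inFlight);
  return new Promise((resolve) =>
    setTimeout(() => {
      inFlight -= 1;
      resolve(new Response("ok"));
    }, 60)
  );
};

const processQueue = () => {
  while (runningCount < MAX_CONCURRENT && queue.length > 0) {
    const next = queue.shift();
    if (!next) break;
    runningCount += 1;
    fakeFetch(next.input, next.init)
      .then(next.resolve, next.reject)
      .finally(() => {
        runningCount -= 1;
        processQueue();
      });
  }
};

const queuedFetch = (input: RequestInfo | URL, init?: RequestInit): Promise<Response> =>
  new Promise<Response>((resolve, reject) => {
    queue.push({ input, init, resolve, reject });
    processQueue();
  });

await Promise.all(Array.from({ length: 6 }, (_, i) => queuedFetch(`/rows/${i}`)));
console.log(peak); // peak concurrency stays at 5, not 6
```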

