How Node.js Handles Multiple Requests with a Single Thread

Software developer passionate about building scalable web applications with React and backend technologies. I enjoy solving problems, building projects, and sharing my learning with the community.

Most developers, when they first hear "Node.js is single-threaded," assume that means it can only handle one request at a time. That sounds like a terrible idea for a web server. If user A makes a request and it takes two seconds, does user B have to wait?

The answer is no — and understanding why is one of the most important things you can learn about how Node.js actually works.


First: what does single-threaded even mean?

A thread is the smallest unit of execution in a program: a sequence of instructions that a CPU can run. Many traditional servers dedicate a thread to every incoming request. The thread handles that request from start to finish (reading data, querying the database, sending a response) and then it's done.

This works, but it's expensive. Each thread reserves memory for its stack (typically 1–8MB per thread by default in languages like Java or C#, depending on the runtime and operating system). If 10,000 users connect at once, you potentially need 10,000 threads. The server spends a significant share of its time and memory managing those threads, switching between them, and waiting for each one to finish its work.
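
To put rough numbers on that, here is a back-of-the-envelope sketch. The function name and the per-thread stack sizes are assumptions for illustration, and the result is reserved stack space rather than memory a real server would necessarily touch.

```javascript
// Rough estimate of stack memory reserved by a thread-per-request server.
// stackMB is an assumed per-thread stack reservation, not a measured value.
function estimateThreadStackGB(connections, stackMB) {
  return (connections * stackMB) / 1024;
}

console.log(estimateThreadStackGB(10_000, 1).toFixed(2)); // prints 9.77
console.log(estimateThreadStackGB(10_000, 8).toFixed(2)); // prints 78.13
```

Even at the low end of the range, the reservation alone is on the order of gigabytes before any request data is stored.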

Node.js takes a different approach. It runs on a single main thread. There is one call stack, one thing executing at any given moment. But it doesn't block that thread waiting for slow operations — and that's the entire game.


The chef analogy

Imagine two kinds of restaurants.

Restaurant A hires one chef per customer. When you arrive, your chef takes your order, walks to the fridge, gets the ingredients, cooks your food, plates it, brings it to you, and only then goes to serve the next customer. This is the traditional multi-threaded model. It scales by adding more chefs (threads), but each chef (thread) stands idle every time they're waiting for something — the oven to heat, the pasta to boil.

Restaurant B has one head chef. You arrive and give your order. The chef passes the order to the kitchen assistants (database, file system, external APIs) and immediately turns around to take the next customer's order. When the kitchen finishes your food, they ring a bell. The chef hears the bell, picks up your plate, and serves you. Meanwhile, they've taken orders from thirty other people and those are all being prepared simultaneously in the kitchen.

Node.js is Restaurant B. The head chef is the main thread. The kitchen assistants are the background workers. The bell is the event loop signalling that an async task is complete.

The chef is never idle, never waiting, always moving to the next order. The concurrency is real — it's just happening in the kitchen, not at the chef's station.


The call stack: where code runs

Before understanding the event loop, you need a clear picture of the call stack.

The call stack is where JavaScript executes code. It's a last-in, first-out data structure. When you call a function, it gets pushed onto the stack. When it returns, it gets popped off.

function greet(name) {
  const message = formatMessage(name);
  console.log(message);
}

function formatMessage(name) {
  return `Hello, ${name}`;
}

greet("Aarav");

The call stack during this code:

1. greet("Aarav") is pushed
2. formatMessage("Aarav") is pushed
3. formatMessage returns "Hello, Aarav" → popped
4. console.log("Hello, Aarav") is pushed
5. console.log returns → popped
6. greet returns → popped
7. Stack is empty

JavaScript can only execute one thing at a time — one frame on the call stack. This is what single-threaded means in practice.
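
If you want to see the stack for yourself, one option is to capture it from inside the innermost function. The sketch below is a variation on the greet example, changed so that formatMessage also returns the frames V8 reports at that moment. The frames property is an addition for illustration only.

```javascript
function formatMessage(name) {
  // Capture the stack while formatMessage is the top frame.
  const frames = new Error().stack
    .split("\n")
    .slice(1) // drop the "Error" header line
    .map(line => line.trim());
  return { text: `Hello, ${name}`, frames };
}

function greet(name) {
  return formatMessage(name);
}

const greeting = greet("Aarav");
console.log(greeting.text);
// Innermost frame first: formatMessage, then greet
console.log(greeting.frames.slice(0, 2));
```

The most recently called function sits at the top of the list, which is the last-in, first-out behaviour described above.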

Now here's the question: what happens when you call something slow, like reading a file or querying a database?


Blocking vs non-blocking: the core distinction

There are two ways to handle a slow operation.

Blocking (synchronous): The thread sits and waits until the operation completes. Nothing else can happen on that thread during this time.

const fs = require("fs");

// Blocking — freezes the entire thread until the file is read
const data = fs.readFileSync("largefile.txt", "utf-8");
console.log(data);
console.log("This only runs after the file is fully read");

If reading that file takes 500ms, your entire Node.js process is frozen for 500ms. Every other request waiting? They all wait too.

Non-blocking (asynchronous): The operation is handed off, and the thread moves on immediately. When the operation finishes, a callback is called.

const fs = require("fs");

// Non-blocking — hands off the task, continues immediately
fs.readFile("largefile.txt", "utf-8", (err, data) => {
  if (err) {
    console.error(err); // e.g. the file does not exist
    return;
  }
  console.log(data); // runs when the file is ready
});

console.log("This runs immediately, before the file is read");

The output:

This runs immediately, before the file is read
...file contents...

The file read happened — but it happened in the background. The main thread never waited.
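
A self-contained way to check that ordering is to record each step in an array. This is a minimal sketch: the scratch file in the temp directory and the readWithOrder helper are stand-ins invented for the example, so it does not depend on largefile.txt existing.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file so the example does not depend on largefile.txt.
const demoFile = path.join(os.tmpdir(), `readfile-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, "file contents");

function readWithOrder() {
  const order = [];
  return new Promise((resolve, reject) => {
    fs.readFile(demoFile, "utf-8", (err, data) => {
      if (err) return reject(err);
      order.push(`callback: ${data}`);
      resolve(order);
    });
    // Reached before the callback above, because readFile only schedules work.
    order.push("after readFile call");
  });
}

const readDemo = readWithOrder();
readDemo
  .then(order => console.log(order))
  // Expected: [ 'after readFile call', 'callback: file contents' ]
  .finally(() => fs.unlink(demoFile, () => {})); // best-effort cleanup
```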


Who does the actual work? libuv and the thread pool

Here's where it gets interesting. When you call fs.readFile, the actual file reading doesn't happen on the main JavaScript thread. It gets handed to libuv.

libuv is a C library that Node.js is built on. It provides two mechanisms for handling async work:

1. OS-level async I/O: For operations like network I/O (TCP/UDP sockets), libuv uses the operating system's native async interfaces — epoll on Linux, kqueue on macOS, IOCP on Windows. The OS itself manages waiting for data to arrive on a socket. Node.js registers interest and moves on. When data arrives, the OS signals libuv, which queues a callback.

2. The thread pool: For operations that don't have great OS-level async support (file system operations, DNS lookups made through dns.lookup, some crypto operations), libuv maintains a pool of worker threads (default: 4 threads, configurable up to 1024 with the UV_THREADPOOL_SIZE environment variable). These threads do the blocking work in the background. When they finish, they signal the main thread.

Your JavaScript (main thread)
         │
         │  fs.readFile("file.txt", callback)
         ▼
       libuv
         │
         ├── Network I/O  → OS async (epoll/kqueue/IOCP)
         │
         └── File I/O     → Thread pool (4 worker threads)
                                │
                                └── Thread reads file
                                    When done → queues callback
                                    Main thread picks it up

This is the key insight: the single thread never does the slow work. It delegates, and comes back when the work is done.
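
One way to observe the thread pool is to start more pool-backed jobs than there are pool threads and watch when each finishes. The sketch below uses crypto.pbkdf2, which Node runs on the libuv pool. The hashBatch helper, the job count, and the iteration count are arbitrary choices for illustration.

```javascript
const crypto = require("crypto");

// Run `count` key derivations at once and record when each one finishes.
// crypto.pbkdf2 is one of the calls libuv runs on its thread pool.
function hashBatch(count, iterations = 100_000) {
  const start = Date.now();
  const jobs = Array.from({ length: count }, (_, i) =>
    new Promise((resolve, reject) => {
      crypto.pbkdf2("secret", "salt", iterations, 64, "sha512", (err) => {
        if (err) return reject(err);
        resolve({ job: i + 1, finishedMs: Date.now() - start });
      });
    })
  );
  return Promise.all(jobs);
}

const batch = hashBatch(6);
batch.then(results => console.table(results));
```

With the default pool of four threads you would typically expect the first four jobs to finish at about the same time and the remaining two noticeably later, although exact timings depend on the machine. Starting the process with UV_THREADPOOL_SIZE=6 should let all six overlap.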


The event loop: the engine that makes it all work

The event loop is the mechanism that lets a single thread manage thousands of concurrent operations. It runs in a continuous cycle, checking different queues and executing callbacks when their associated operations are complete.

A simplified version of the event loop cycle looks like this:

┌──────────────────────────────────┐
│           Event Loop             │
│                                  │
│  1. timers                       │
│     (setTimeout, setInterval)    │
│                                  │
│  2. pending callbacks            │
│     (I/O callbacks deferred)     │
│                                  │
│  3. idle, prepare                │
│     (internal use)               │
│                                  │
│  4. poll                         │
│     (retrieve new I/O events)    │
│     ← spends most time here      │
│                                  │
│  5. check                        │
│     (setImmediate callbacks)     │
│                                  │
│  6. close callbacks              │
│     (socket.on('close', ...))    │
│                                  │
└──────────────────────────────────┘
         ↑ repeats until no more work

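The poll phase is where the event loop spends most of its time. It waits for I/O completion events. When a file read finishes, a database query returns, or data arrives on a network socket, the poll phase retrieves the event and runs its callback on the main thread.

Most I/O callbacks run right there in the poll phase. Only a few kinds wait for their own phase: expired timers run in the timers phase, setImmediate callbacks in the check phase, and close handlers in the close callbacks phase.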
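
The phase order has a visible consequence. Inside an I/O callback, a setImmediate callback runs before a zero-delay timer, because the check phase comes right after poll while timers have to wait for the next iteration. Here is a small sketch of that behaviour. The phaseOrder helper is made up for the example, and fs.stat on the current directory is just a convenient I/O call.

```javascript
const fs = require("fs");

// Inside an I/O callback the loop is in the poll phase, so the check phase
// (setImmediate) comes next, and timers only run on the following iteration.
function phaseOrder() {
  return new Promise((resolve, reject) => {
    fs.stat(".", (err) => {
      if (err) return reject(err);
      const order = [];
      setTimeout(() => {
        order.push("timeout");
        resolve(order);
      }, 0);
      setImmediate(() => order.push("immediate"));
    });
  });
}

const phases = phaseOrder();
phases.then(order => console.log(order)); // [ 'immediate', 'timeout' ]
```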


Microtask queues: Promises and process.nextTick

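There are actually two higher-priority queues that sit outside the phases of the event loop. They are drained whenever the current JavaScript operation finishes, which since Node.js 11 means after every individual callback, before the loop runs anything else:

process.nextTick queue: Callbacks registered with process.nextTick run at the end of the current operation, before any Promise callbacks and before the event loop moves on. It's the highest-priority async mechanism in Node.js.

Microtask queue (Promises): Promise callbacks (.then, .catch, .finally, and the code after an await) run once the nextTick queue is empty but before the event loop advances.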

console.log("1 - synchronous");

setTimeout(() => console.log("5 - setTimeout"), 0);

Promise.resolve().then(() => console.log("4 - Promise"));

process.nextTick(() => console.log("3 - nextTick"));

console.log("2 - synchronous end");

Output:

1 - synchronous
2 - synchronous end
3 - nextTick
4 - Promise
5 - setTimeout

The synchronous code runs first (call stack). Then nextTick. Then resolved Promises. Then the event loop phases (timers, I/O, etc.).

Understanding this order matters when you're debugging race conditions or unexpected callback ordering in complex async code.
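
The same priority applies inside callbacks, not only at startup. In the sketch below (microtaskOrder is a name made up for the example), the first timer schedules a nextTick callback and a Promise callback, and on Node.js 11 and later both run before the second timer gets its turn.

```javascript
function microtaskOrder() {
  return new Promise(resolve => {
    const order = [];
    setTimeout(() => {
      order.push("timer 1");
      Promise.resolve().then(() => order.push("promise from timer 1"));
      process.nextTick(() => order.push("nextTick from timer 1"));
    }, 0);
    setTimeout(() => {
      order.push("timer 2");
      resolve(order);
    }, 0);
  });
}

const microtasks = microtaskOrder();
microtasks.then(order => console.log(order));
// [ 'timer 1', 'nextTick from timer 1', 'promise from timer 1', 'timer 2' ]
```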


Handling multiple client requests: what actually happens

Let's make this concrete. Here's a simple Express server:

const express = require("express");
const fs = require("fs/promises");

const app = express();

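app.get("/user/:id", async (req, res) => {
  try {
    const data = await fs.readFile(`./users/${req.params.id}.json`, "utf-8");
    res.json(JSON.parse(data));
  } catch (err) {
    // Express 4 does not catch rejections from async handlers, so handle them here
    res.status(err.code === "ENOENT" ? 404 : 500).json({ error: "Could not load user" });
  }
});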

app.listen(3000);

Now imagine 1000 requests arrive in rapid succession — all hitting /user/:id.

Here's what happens, request by request, from Node's perspective:

Request 1 arrives
→ Event loop picks it up
→ Calls readFile for user/1.json
→ Hands off to libuv thread pool
→ Main thread immediately free

Request 2 arrives
→ Event loop picks it up
→ Calls readFile for user/2.json
→ Hands off to libuv thread pool
→ Main thread immediately free

...same for requests 3 through 1000...

libuv thread pool finishes user/1.json
→ Callback queued in event loop
→ Event loop picks it up
→ res.json() sends the response

libuv thread pool finishes user/47.json
→ Callback queued
→ Event loop picks it up
→ Response sent

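All 1000 requests are in flight simultaneously. None of them blocked the main thread. Because file reads go through libuv's thread pool, only as many reads as there are pool threads (four by default) are executing at any instant; the rest wait in libuv's queue, not on the main thread. Network operations, such as most database calls, go through OS async I/O and have no such limit. The main thread just keeps picking up completed callbacks and sending responses.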

This is concurrency — many things in progress at the same time — but not parallelism — multiple things literally executing at the same instant on multiple CPUs.
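
You can approximate this without a server. The sketch below stands in for 1000 requests that each wait 50ms on I/O. fakeIo and handleMany are hypothetical helpers, and the timer is only a placeholder for a real database or file call.

```javascript
// Stand-in for a database or file read: resolves after delayMs without
// occupying the main thread. fakeIo is a hypothetical helper.
function fakeIo(id, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve({ id }), delayMs));
}

async function handleMany(count, delayMs) {
  const start = Date.now();
  const responses = await Promise.all(
    Array.from({ length: count }, (_, i) => fakeIo(i + 1, delayMs))
  );
  return { served: responses.length, elapsedMs: Date.now() - start };
}

const simulated = handleMany(1000, 50);
simulated.then(stats => console.log(stats));
```

If the waits were handled one after another, the total would be around 50 seconds. Because they overlap, elapsedMs should land close to the length of a single wait.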


Concurrency vs parallelism: why the distinction matters

This is worth being precise about because the terms get confused constantly.

Parallelism means multiple operations executing at literally the same instant, on multiple CPU cores. For this you need multiple threads or processes.

Concurrency means multiple operations are in progress (started but not finished) at the same time, even if only one is actively executing at any instant. They're interleaved.

Node.js achieves concurrency on a single thread. While one callback is executing, thousands of I/O operations can be in progress in libuv. The moment that callback finishes, the event loop picks up the next completed operation.
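
Interleaving is easy to see with two small async tasks. In this sketch, nextTurn, task, and interleave are names invented for the example, and setImmediate stands in for an I/O wait. Each task gives up the thread at every await, so their steps alternate even though only one line of JavaScript runs at a time.

```javascript
// Resolves on a later turn of the event loop, standing in for an I/O wait.
const nextTurn = () => new Promise(resolve => setImmediate(resolve));

async function task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name} step ${i}`);
    await nextTurn(); // yield the thread so other tasks can run
  }
}

async function interleave() {
  const log = [];
  await Promise.all([task("A", 2, log), task("B", 2, log)]);
  return log;
}

const interleaved = interleave();
interleaved.then(log => console.log(log));
// [ 'A step 1', 'B step 1', 'A step 2', 'B step 2' ]
```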

The critical insight: I/O is the bottleneck for most web applications, not CPU. A web server mostly waits — for databases, for file reads, for external APIs. Node.js is designed so that the main thread never waits. It only executes, delegates, and processes results.

A traditional multi-threaded server uses many threads to handle waiting. Node.js uses one thread that never has to wait at all.


What CAN block the event loop

Since there's only one main thread, anything that keeps it busy for a long time is a problem. These operations block the event loop:

CPU-intensive synchronous computation:

// This blocks the event loop for however long it takes
app.get("/compute", (req, res) => {
  let result = 0;
  for (let i = 0; i < 10_000_000_000; i++) {
    result += i; // pure CPU work — nothing to delegate
  }
  res.json({ result });
});

While this loop runs, every other request is frozen. No callbacks process, no other routes respond.

Synchronous file/database calls:

// Never do this in a request handler
app.get("/file", (req, res) => {
  const data = fs.readFileSync("big.json"); // blocks entire thread
  res.send(data);
});

JSON parsing of extremely large payloads:

// Parsing 50MB of JSON is synchronous and CPU-bound
const parsed = JSON.parse(hugeString); // blocks thread until done

The rule is simple: never put long-running synchronous work on the main thread. Anything CPU-intensive should be offloaded to a Worker Thread or a child process.
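
When the work has to stay on the main thread, a common mitigation is to split it into slices and yield between them. The following is a minimal sketch of that idea, not a replacement for a worker. sumInChunks and its chunkSize default are assumptions, and the setInterval heartbeat is only there to show that other callbacks still get a turn.

```javascript
// Sum 0..n-1 in slices, yielding to the event loop between slices.
// chunkSize is an assumed tuning knob, not a recommended value.
function sumInChunks(n, chunkSize = 1_000_000) {
  return new Promise(resolve => {
    let result = 0;
    let i = 0;
    function runChunk() {
      const end = Math.min(i + chunkSize, n);
      for (; i < end; i++) result += i;
      if (i < n) setImmediate(runChunk); // let pending callbacks run first
      else resolve(result);
    }
    runChunk();
  });
}

let heartbeats = 0;
const heartbeat = setInterval(() => heartbeats++, 1);

const chunked = sumInChunks(50_000_000).then(result => {
  clearInterval(heartbeat);
  return { result, heartbeats };
});
chunked.then(out => console.log(out));
```

The total amount of CPU work is unchanged, so this keeps the server responsive rather than making the computation faster.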


Worker Threads: when you actually need parallelism

Node.js added the worker_threads module in v10 (stable in v12) for exactly this scenario. Worker threads are separate JavaScript execution environments that run in parallel on different CPU cores.

const { Worker, isMainThread, parentPort, workerData } = require("worker_threads");

if (isMainThread) {
  // This is the main thread
  const app = require("express")();

  app.get("/heavy", (req, res) => {
    const worker = new Worker(__filename, {
      workerData: { input: 1_000_000_000 }
    });

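    worker.on("message", result => {
      res.json({ result }); // main thread stays free while worker computes
    });

    // Without this, a crashed worker would leave the request hanging
    worker.on("error", err => {
      res.status(500).json({ error: err.message });
    });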
  });

  app.listen(3000);

} else {
  // This runs in the worker thread
  let result = 0;
  for (let i = 0; i < workerData.input; i++) {
    result += i;
  }
  parentPort.postMessage(result);
}

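Worker threads communicate via message passing (messages are copied by default) and can also share memory through SharedArrayBuffer. Starting a worker has a cost of its own, so production code usually reuses a pool of workers instead of creating one per request. They're the right tool when you genuinely need parallelism for CPU-heavy tasks like image processing, video transcoding, cryptography, or machine learning inference.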
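
For reuse across routes, the worker call can be wrapped in a Promise. This is a sketch under a few assumptions: sumInWorker is a hypothetical helper, and the worker source is passed inline with the eval option purely so the example fits in one file.

```javascript
const { Worker } = require("worker_threads");

// Worker source kept inline (eval: true) so the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  let result = 0;
  for (let i = 0; i < workerData.input; i++) result += i;
  parentPort.postMessage(result);
`;

// Hypothetical helper: wraps one worker run in a Promise.
function sumInWorker(input) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { input } });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", code => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

const workerRun = sumInWorker(1_000_000);
workerRun.then(result => console.log(result)); // 499999500000
```

A route handler could then await sumInWorker(...) inside a try/catch, which keeps the error path in one place.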


Why Node.js scales well for I/O-heavy workloads

Traditional servers use a thread-per-request model. Resources scale with threads — memory, context switching overhead, OS scheduling.

Node.js uses an event-driven, callback-based model. Resources scale with the number of concurrent I/O operations in flight, which has a much lower overhead than threads.

In practical terms:

  • A thread-per-connection Java or Python server handling 10,000 simultaneous connections might need 10,000 threads, consuming gigabytes of memory

  • A Node.js server handling 10,000 simultaneous connections uses one main thread, a handful of libuv worker threads, and OS async I/O — consuming a fraction of the memory

This is why companies like Netflix, LinkedIn, and PayPal moved parts of their infrastructure to Node.js and reported massive improvements in throughput and reductions in server count. LinkedIn reduced their mobile server count from 30 servers to 3 after switching. Netflix reported a 70% reduction in startup time.

The gain isn't magic — it's architectural. Node's model eliminates the overhead of thread management for the common case: I/O-bound web servers.


The scenarios where Node.js is the wrong tool

Understanding when Node.js scales well also means understanding when it doesn't.

CPU-intensive applications: Video encoding, complex scientific computation, machine learning training — these need CPU time, not I/O waiting. The event loop doesn't help here. Languages with true parallelism (Go, Rust, C++) are better suited.

Blocking ORM or database drivers: If your database library doesn't support async operations and blocks the main thread, you lose all the concurrency benefits. Always verify your dependencies are non-blocking.

Long-polling without streaming: Long-lived connections that sit idle are cheap for Node.js. If each connection instead holds the thread for a long time doing synchronous work, you lose the concurrency advantage quickly.
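
A quick way to check whether something is blocking the main thread is to measure how late a timer fires. The sketch below uses a busy-wait as a stand-in for a synchronous driver call. measureLag and its parameters are made up for the example.

```javascript
// Schedule a timer, then hog the thread. The gap between when the timer
// should have fired and when it did is the event-loop lag.
function measureLag(blockMs, timerMs = 10) {
  return new Promise(resolve => {
    const scheduled = Date.now();
    setTimeout(() => {
      resolve(Date.now() - scheduled - timerMs);
    }, timerMs);

    const until = Date.now() + blockMs;
    while (Date.now() < until) {
      // busy-wait: stands in for a synchronous driver call
    }
  });
}

const lag = measureLag(200);
lag.then(ms => console.log(`timer fired about ${ms}ms late`));
```

For continuous monitoring, the built-in perf_hooks module offers monitorEventLoopDelay, which records the same kind of delay in a histogram.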

Node.js excels at: REST APIs, GraphQL servers, real-time applications (chat, notifications, live updates), microservices, file servers, and anything that mostly reads/writes data and waits for I/O. This covers the majority of web application use cases, which is why Node.js adoption has been so broad.


Putting it all together

Here's the complete mental model:

Client request arrives
        │
        ▼
  Event Loop (main thread)
        │
        ├── Synchronous code runs immediately on call stack
        │
        ├── Async I/O (fs, db, http)?
        │       │
        │       └── Delegated to libuv
        │               │
        │               ├── Network I/O → OS async (epoll/kqueue/IOCP)
        │               └── File/DNS    → Thread pool (4+ workers)
        │
        ├── Main thread immediately free for next request
        │
        └── When async work finishes:
                Callback queued → Event loop picks it up
                → Runs on main thread → Response sent

One thread. One call stack. But thousands of operations in flight simultaneously because the thread never waits — it only executes and delegates.


Summary

Node.js being single-threaded is not a weakness. It's a deliberate architectural decision built around a key observation: most web server time is spent waiting for I/O, not executing code. If you eliminate waiting from the main thread, one thread can handle an enormous number of concurrent operations.

The event loop is the scheduler that makes this possible. libuv is the engine that handles the actual waiting in the background. Together, they give Node.js its concurrency model — not parallel execution, but extraordinarily efficient interleaving.

Understanding this changes how you write Node.js code. You start thinking in terms of never blocking the main thread, always delegating slow work, and trusting the event loop to manage the orchestration. That mindset is what separates developers who use Node.js from developers who understand it.

Chetan Chauhan | Tech Blog | chetan71
