Signal-driven I/O is rarely used in modern applications because its disadvantages generally outweigh its benefits. Newer APIs like epoll (Linux), kqueue (BSD/macOS), and IOCP (Windows) are far superior.
- Complexity of Signal Handling: Dealing with signals is notoriously difficult and error-prone. Signal handlers have many restrictions (e.g., only a limited set of functions, known as async-signal-safe functions, can be safely called from within them).
- Unreliable Signal Queuing: Standard signals are not queued. If two I/O events occur in rapid succession, the kernel might deliver only a single SIGIO signal. The code the signal wakes up must therefore loop, reading (or writing) until the operation would block (returning an EWOULDBLOCK or EAGAIN error); forgetting this leads to lost I/O events. A minimal sketch follows this list.
- Complexity in Multithreading: Signal handling in a multithreaded program is extremely complex. It is often unclear which thread will receive the signal, leading to difficult synchronization problems.
- Still a Synchronous Model: According to the POSIX standard, signal-driven I/O is still a synchronous I/O model. While the notification is asynchronous, the actual I/O call (e.g., recvfrom()) is initiated by the process and can still block in the signal handler or main loop. This is different from true asynchronous I/O (AIO), where the kernel performs the entire operation (including the data copy) and only notifies the process upon completion.
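To make the queuing pitfall concrete, here is a minimal sketch of arming signal-driven I/O on a descriptor and then draining it until EAGAIN. The flag-and-drain split (the handler only sets a flag, which keeps it async-signal-safe) and the buffer size are illustrative assumptions, not code from any particular source:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t io_ready = 0;

/* Keep the handler async-signal-safe: it only sets a flag. */
static void on_sigio(int sig) { (void)sig; io_ready = 1; }

/* Arm signal-driven I/O: deliver SIGIO to this process and
 * put the descriptor into O_ASYNC | O_NONBLOCK mode. */
static int arm_sigio(int fd)
{
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = on_sigio;
    if (sigaction(SIGIO, &sa, NULL) == -1) return -1;
    if (fcntl(fd, F_SETOWN, getpid()) == -1) return -1;
    int flags = fcntl(fd, F_GETFL);
    return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK);
}

/* SIGIO is not queued, so each wakeup must drain the descriptor
 * until EAGAIN/EWOULDBLOCK, or events are silently lost. */
static void drain(int fd)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) continue;            /* process buf[0..n) here    */
        if (n == 0) break;              /* EOF                       */
        if (errno == EINTR) continue;   /* interrupted, retry        */
        break;                          /* EAGAIN/EWOULDBLOCK or err */
    }
}

/* Main loop shape: while (1) { if (io_ready) { io_ready = 0; drain(fd); } ... } */
```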
The main difference between asynchronous I/O and signal-driven I/O is that with signal-driven I/O, the kernel tells us when an I/O operation can be initiated, whereas with asynchronous I/O, the kernel tells us when an I/O operation is complete.
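To make the completion-based model concrete before weighing its pros and cons, here is a minimal sketch using the POSIX AIO calls. The file name data.bin and the busy-wait completion check are illustrative assumptions; real code would use a notification mechanism or a less aggressive wait:

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);   /* hypothetical input file */
    if (fd == -1) return 1;

    char buf[4096];
    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof buf;
    cb.aio_offset = 0;

    if (aio_read(&cb) == -1) return 1;      /* request handed off; call returns at once */

    /* The process is free to do other computation here while the read proceeds. */

    while (aio_error(&cb) == EINPROGRESS)
        ;                                   /* crude busy-poll for completion */

    ssize_t n = aio_return(&cb);            /* completion result: bytes read, or -1 */
    printf("read %zd bytes\n", n);
    close(fd);
    return 0;
}
```

On older glibc this needs to be linked with -lrt.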
- True Asynchronicity: This is the only model (besides the modern io_uring) that is truly asynchronous. The application is completely unblocked and can perform other computations while the I/O is in progress.
- Parallel I/O and Computation: It allows an application to overlap its computations with its I/O operations, which can lead to significant performance gains, especially in data-intensive applications like database servers.
- Request Queuing: You can submit (queue) multiple I/O requests to the kernel at once, allowing the kernel to potentially optimize the scheduling of these operations (e.g., re-ordering disk reads).
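The batch-submission call for this is lio_listio(). Here is a sketch of queuing several reads in one shot; the descriptor, chunk size, and fixed offsets are illustrative assumptions:

```c
#include <aio.h>
#include <fcntl.h>
#include <string.h>

enum { NREQ = 4, CHUNK = 4096 };

/* Queue NREQ reads at consecutive offsets in a single call. */
int queue_reads(int fd)
{
    static char bufs[NREQ][CHUNK];
    static struct aiocb cbs[NREQ];
    struct aiocb *list[NREQ];

    for (int i = 0; i < NREQ; i++) {
        memset(&cbs[i], 0, sizeof cbs[i]);
        cbs[i].aio_fildes     = fd;
        cbs[i].aio_buf        = bufs[i];
        cbs[i].aio_nbytes     = CHUNK;
        cbs[i].aio_offset     = (off_t)i * CHUNK;
        cbs[i].aio_lio_opcode = LIO_READ;
        list[i] = &cbs[i];
    }
    /* LIO_NOWAIT: submit the whole batch and return immediately;
     * completion is checked later with aio_error()/aio_return(). */
    return lio_listio(LIO_NOWAIT, list, NREQ, NULL);
}
```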
- Poor Linux Implementation: This is the biggest problem. The standard glibc implementation of POSIX AIO is not a true kernel-level AIO. Instead, it's implemented in user-space by creating a pool of worker threads. When you call aio_read(), it just hands the request to one of these hidden threads, which then performs a normal blocking read(). This adds all the overhead of threading and synchronization, often making it slower and more resource-intensive than just managing your own thread pool.
- Limited to Disk Files: Even the "true" kernel AIO support (which requires using O_DIRECT) works only for disk files. It does not work for network sockets. For high-performance networking, epoll (I/O multiplexing) is the standard.
- API Complexity: The API is complex, requiring you to manage aiocb (AIO control block) structures for every request and handle notifications, which often fall back to signals (with all their associated problems) or require you to poll for completion.
- Superseded by io_uring: On modern Linux, AIO is considered obsolete. io_uring is the modern, high-performance interface for true asynchronous I/O. It works for both file I/O and network I/O, is vastly more efficient, and is designed to eliminate the flaws of POSIX AIO.
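For contrast, here is a minimal sketch of a single read expressed with liburing, the common userspace wrapper for io_uring. It assumes liburing is installed (link with -luring), a kernel with io_uring support, and an illustrative file name data.bin:

```c
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct io_uring ring;
    if (io_uring_queue_init(8, &ring, 0) < 0) return 1;

    int fd = open("data.bin", O_RDONLY);     /* hypothetical input file */
    if (fd == -1) return 1;

    char buf[4096];
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    if (!sqe) return 1;                      /* submission queue full */
    io_uring_prep_read(sqe, fd, buf, sizeof buf, 0);
    io_uring_submit(&ring);                  /* kernel performs the read */

    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);          /* completion, not readiness */
    printf("read %d bytes\n", cqe->res);
    io_uring_cqe_seen(&ring, cqe);

    close(fd);
    io_uring_queue_exit(&ring);
    return 0;
}
```

Note that the completion event reports a finished operation, not mere readiness, which is exactly the distinction drawn above.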
Comparison chart:

| Model | Kernel notifies when... | Who performs the I/O (data copy) | Sockets supported? |
|-------|-------------------------|----------------------------------|--------------------|
| Signal-driven I/O | an operation can be initiated (SIGIO) | the process (e.g., recvfrom() after the signal) | yes |
| POSIX AIO | the operation is complete | the kernel (glibc emulates it with user-space threads) | no (kernel AIO is limited to disk files) |
| io_uring | the operation is complete | the kernel | yes (file and network I/O) |