What is IOPS? Storage Metrics for System Performance

Farouk Ben. - Founder at Odown

Storage performance is one of those topics that seems straightforward until you actually need to make decisions based on it. I've spent years working with storage systems, and let me tell you—IOPS (Input/Output Operations Per Second) is both simpler and more complex than most people realize.

Whether you're building high-performance applications, managing databases, or just trying to figure out why your system feels sluggish, understanding IOPS and related metrics is essential. But these numbers don't exist in isolation, and that's where things get interesting.

What is IOPS?

IOPS stands for Input/Output Operations Per Second. It measures how many read or write operations a storage device can perform in one second. Think of it as the number of "tasks" your storage can handle per second, regardless of how big or small those tasks are.

Let's say your storage system has an IOPS rating of 10,000. This means it can handle 10,000 separate read or write operations each second. But here's where it gets tricky—this doesn't tell you how much data is being read or written.

IOPS is pronounced "eye-ops," by the way. And no, it has nothing to do with Apple products!

The concept is straightforward, but the implications are profound. IOPS directly impacts how responsive your applications feel, how quickly your database queries run, and how many users your system can support simultaneously.

How IOPS is calculated

Calculating IOPS might seem like rocket science, but the basic formula is actually quite simple:

IOPS = (Read Operations + Write Operations) / Time in Seconds

But hang on—it gets more complicated when you start factoring in real-world conditions. The formula above only gives you the raw number, but several factors affect actual IOPS performance:

  1. The size of each operation (block size)
  2. The ratio of reads to writes
  3. Whether operations are sequential or random
  4. Queue depth (how many operations are waiting to be processed)
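As a minimal sketch, the raw formula translates directly into code (the operation counts and time window below are made-up numbers for illustration):

```python
def iops(read_ops: int, write_ops: int, seconds: float) -> float:
    """Raw IOPS: total operations completed divided by elapsed time."""
    return (read_ops + write_ops) / seconds

# Example: 45,000 reads and 15,000 writes completed over a 10-second window.
print(iops(45_000, 15_000, 10.0))  # 6000.0
```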

For those dealing with RAID configurations or erasure coding, there's another wrinkle: one write operation might actually require multiple physical operations. For example, in a RAID 5 setup, a single write operation might require:

  • Reading existing data
  • Reading parity data
  • Computing new parity
  • Writing new data
  • Writing new parity

That's four physical I/O operations for a single logical write (the parity computation itself happens in the controller, not on disk), the classic RAID 5 write penalty of 4. No wonder RAID 5 hurts write performance.
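A back-of-the-envelope model of that penalty, using commonly cited penalty factors (the raw IOPS figure here is a made-up example):

```python
def effective_write_iops(raw_iops: float, write_penalty: int) -> float:
    """Logical write IOPS after dividing by the RAID write penalty.

    Commonly cited penalties: RAID 0 -> 1, RAID 1/10 -> 2,
    RAID 5 -> 4, RAID 6 -> 6.
    """
    return raw_iops / write_penalty

# A 10,000 IOPS array configured as RAID 5:
print(effective_write_iops(10_000, 4))  # 2500.0
```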

IOPS vs. throughput

While IOPS measures the number of operations, throughput measures the volume of data transferred in a given time period, typically measured in bytes per second (B/s), kilobytes per second (KB/s), megabytes per second (MB/s), or gigabytes per second (GB/s).

Here's the relationship:

Throughput = IOPS × Block Size

For example, if your storage can perform 1,000 IOPS with a block size of 4KB, your throughput would be: 1,000 × 4KB = 4,000KB/s or 4MB/s

Now for a real-world example: Imagine two different storage devices.

  • Device A: 5,000 IOPS at 4KB block size = 20MB/s throughput
  • Device B: 2,000 IOPS at 16KB block size = 32MB/s throughput

Despite Device A having higher IOPS, Device B actually has higher throughput! This illustrates why looking at IOPS alone isn't enough to understand performance.
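The two devices above can be checked against the throughput formula directly:

```python
def throughput_mb_s(iops: float, block_size_kb: float) -> float:
    """Throughput in MB/s, using 1 MB = 1000 KB for simplicity."""
    return iops * block_size_kb / 1000

print(throughput_mb_s(5_000, 4))   # Device A: 20.0 MB/s
print(throughput_mb_s(2_000, 16))  # Device B: 32.0 MB/s
```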

And this is exactly why storage vendors love to quote IOPS numbers without context—they can manipulate the numbers to sound impressive while hiding the complete picture.

Understanding latency

Latency is the third piece of the storage performance puzzle. While IOPS tells you how many operations can be performed, and throughput tells you how much data can be transferred, latency tells you how long each operation takes.

Latency is typically measured in milliseconds (ms) or microseconds (μs) and represents the time between sending a request to the storage device and receiving a response.

Low latency is crucial for interactive applications and databases. You can have extremely high IOPS and throughput, but if your latency is high, users will still perceive the system as slow.

Factors affecting latency include:

  • Physical distance to the storage
  • Network congestion
  • Storage controller processing time
  • Queue depth
  • Storage medium (SSDs have lower latency than HDDs)

With spinning hard drives (HDDs), latency includes the time it takes for the disk to rotate to the correct position (rotational latency) and for the read/write head to move to the right track (seek time). This is why HDDs have much higher latency than SSDs, which have no moving parts.

The relationship between IOPS, throughput, and latency

These three metrics—IOPS, throughput, and latency—are interrelated and collectively define storage performance. Understanding how they interact is crucial for proper system design and troubleshooting.

Here's how they relate:

  1. IOPS and throughput: As we've seen, throughput = IOPS × block size. Higher IOPS leads to higher throughput, but only if block size remains constant.

  2. IOPS and latency: Generally, as you push a system to its maximum IOPS capacity, latency increases. This relationship isn't linear—latency often stays low until you approach the IOPS limit, then increases dramatically.

  3. Latency and throughput: Lower latency allows for more operations to be completed in a given time period, potentially increasing throughput.

Think of it like a highway: IOPS is how many cars pass through per second, block size is how many passengers each car carries, throughput is the total number of people transported per second, and latency is how long each car takes to complete the journey.

But there's a critical point here that's often missed: optimizing for one metric may negatively impact others. For example, increasing block size can improve throughput but may reduce IOPS and increase latency.
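One way to see the IOPS-latency link quantitatively is Little's Law: sustained IOPS is roughly the number of outstanding operations (queue depth) divided by average latency. A sketch, with made-up numbers:

```python
def iops_from_littles_law(queue_depth: int, avg_latency_s: float) -> float:
    """Little's Law: concurrency = arrival rate x time in system,
    so sustained IOPS ~= queue depth / average latency."""
    return queue_depth / avg_latency_s

# 32 outstanding operations at 0.5 ms average latency:
print(iops_from_littles_law(32, 0.0005))  # 64000.0
```

This is also why latency spikes as you approach the IOPS ceiling: once the device can't complete operations any faster, adding queue depth only inflates the time each operation spends waiting.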

Block size: The hidden multiplier

Block size deserves special attention because it's the hidden multiplier that connects IOPS and throughput. It represents how much data is transferred in each I/O operation.

Common block sizes include:

  • 4KB (typical filesystem block size)
  • 8KB (common for databases)
  • 16KB, 32KB, 64KB (larger blocks for sequential operations)
  • 128KB, 256KB (very large blocks for bulk transfers)

The impact of block size on performance is dramatic:

  Block Size    IOPS      Throughput
  4KB           10,000    40MB/s
  8KB           10,000    80MB/s
  16KB          10,000    160MB/s
  32KB          10,000    320MB/s

However, larger block sizes aren't always better. Most storage systems can handle more IOPS with smaller block sizes and fewer IOPS with larger block sizes. There's usually a sweet spot that balances IOPS and throughput for your specific workload.

Also, your application may not be able to take advantage of larger block sizes if it's designed to work with smaller ones. Changing block sizes often requires application modifications, filesystem changes, or storage reconfiguration.
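One simple way to model that sweet spot is to assume the device is limited both by a maximum operation rate and by a maximum bandwidth: achievable IOPS at a given block size is the smaller of the two limits. The limits below are illustrative, not taken from any specific device:

```python
def achievable_iops(block_size_kb: float, max_iops: float, max_mb_s: float) -> float:
    """IOPS is capped by the controller's op rate for small blocks
    and by the bandwidth ceiling for large blocks."""
    bandwidth_limited = max_mb_s * 1000 / block_size_kb  # 1 MB = 1000 KB
    return min(max_iops, bandwidth_limited)

# A hypothetical device: 100,000 max IOPS, 2,000 MB/s max bandwidth.
for kb in (4, 16, 64, 256):
    print(kb, achievable_iops(kb, max_iops=100_000, max_mb_s=2_000))
```

Under this model, small blocks are IOPS-limited and large blocks are bandwidth-limited; the crossover point is where the two ceilings meet.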

Read vs. write operations

Not all operations are created equal. Read and write operations often have different performance characteristics, and understanding this difference is crucial for optimizing storage systems.

Generally, reads are faster than writes for several reasons:

  1. Writes often require additional operations (like updating metadata)
  2. Many storage systems use write-back caching, which adds complexity
  3. In flash storage, cells must be erased before being written to

The ratio of reads to writes in your workload has a significant impact on overall performance. Some common read/write patterns include:

  • Read-heavy workloads (80% read, 20% write): Web servers, content delivery, analytical databases
  • Balanced workloads (50% read, 50% write): Transactional databases, email servers
  • Write-heavy workloads (20% read, 80% write): Logging systems, IoT data collection, streaming ingest

For read-heavy workloads, you might optimize for read IOPS and read latency. For write-heavy workloads, you'd focus on write throughput and durability.

And here's something storage vendors don't want you to know: the IOPS numbers they quote are often best-case scenarios, typically for read operations, with optimal block sizes and queue depths. Real-world performance, especially for writes, can be much lower.

Storage types and their IOPS capabilities

Different storage technologies have vastly different IOPS capabilities. Here's a general comparison:

  Storage Type                  Typical IOPS Range    Typical Latency
  HDD (7200 RPM)                75-100                10-20ms
  SATA SSD                      3,000-40,000          0.1-1ms
  NVMe SSD                      100,000-1,000,000+    0.02-0.1ms
  Storage Area Network (SAN)    Varies widely         1-10ms
  Cloud Block Storage           3,000-250,000         0.5-10ms

SSDs dramatically outperform HDDs in terms of IOPS because they have no moving parts. Within the SSD category, NVMe drives outperform SATA drives because the NVMe protocol was designed specifically for flash storage and uses the PCIe bus instead of the older SATA interface.

Cloud storage performance varies widely based on the provider, tier, and configuration. Most cloud providers offer storage options with provisioned IOPS, allowing you to pay for the specific performance level you need.

RAID configurations also impact IOPS, with some providing IOPS improvements (like RAID 0) and others reducing IOPS (like RAID 5/6, especially for writes).

Common IOPS values and what they mean

It's helpful to have some context for what different IOPS values mean in practice:

  • 100-200 IOPS: Typical for a single HDD. Sufficient for basic file storage and light use.
  • 1,000-5,000 IOPS: Entry-level SSD or small SAN. Good for small databases, virtualization hosts with a few VMs, and general-purpose servers.
  • 10,000-20,000 IOPS: Mid-range SSD or SAN. Suitable for medium-sized databases, busy web servers, and virtualization hosts with many VMs.
  • 50,000-100,000 IOPS: High-performance SSD or SAN. Appropriate for large databases, high-traffic web applications, and intensive data processing.
  • 100,000+ IOPS: Enterprise NVMe or high-end storage arrays. Necessary for very large databases, AI/ML workloads, and high-frequency trading applications.

But remember—these numbers are meaningless without context. An application that performs sequential reads of large files might be perfectly fine with low IOPS but high throughput, while a database that does many small random reads and writes will need high IOPS even if the total throughput is relatively low.

Measuring IOPS in real-world scenarios

Benchmarking storage performance is an art in itself. Many tools exist for measuring IOPS, throughput, and latency:

  • fio: Flexible I/O tester, highly configurable
  • iometer: Popular for Windows systems
  • diskspd: Microsoft's recommended disk subsystem performance tool
  • dd: A simple Unix utility that can be used for basic tests
  • sysbench: Multi-purpose benchmark tool with storage testing capabilities

When measuring IOPS, it's important to simulate your actual workload as closely as possible. This includes:

  1. Using realistic block sizes
  2. Mimicking your application's read/write ratio
  3. Testing both sequential and random access patterns
  4. Setting appropriate queue depths
  5. Running tests long enough to overcome caching effects

For example, if you're testing for a database server, you might use 8KB blocks, a 70/30 read/write ratio, random access patterns, and a queue depth of 32.

Be skeptical of benchmark results unless they're transparent about their methodology. I've seen many cases where changing just one parameter can double or halve the measured IOPS.
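To get a feel for what these tools measure, here is a deliberately minimal random-read microbenchmark in Python. It is illustrative only: use fio or diskspd for real numbers, since Python overhead and the OS page cache both distort the result.

```python
import os
import random
import time

def random_read_iops(path: str, block_size: int = 8192, ops: int = 2000) -> float:
    """Issue random single-block reads against a file and report IOPS.

    Caveat: without O_DIRECT most reads are served from the page cache,
    so this measures the cached path, not raw device performance.
    """
    blocks = os.path.getsize(path) // block_size
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(ops):
            offset = random.randrange(blocks) * block_size
            os.pread(fd, block_size, offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return ops / elapsed

if __name__ == "__main__":
    # Create a 16 MB scratch file to read against, then clean it up.
    with open("scratch.bin", "wb") as f:
        f.write(os.urandom(16 * 1024 * 1024))
    print(f"{random_read_iops('scratch.bin'):.0f} IOPS (cached)")
    os.remove("scratch.bin")
```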

Optimizing for IOPS: Best practices

If you need to improve IOPS performance, consider these strategies:

  1. Use appropriate storage media: Match your storage type to your workload requirements. Don't use HDDs for high-IOPS workloads if you can avoid it.

  2. Implement caching: Use memory-based caches (like Redis) or storage-level caching to reduce the load on your primary storage.

  3. Optimize block size: Align your application's block size with your storage system's optimal block size.

  4. Increase queue depth: Many storage systems perform better with higher queue depths, but only up to a point. Finding the optimal queue depth requires testing.

  5. Use RAID intelligently: RAID 0 increases IOPS but eliminates redundancy. RAID 10 provides a good balance of performance and redundancy. Avoid RAID 5/6 for write-intensive workloads.

  6. Distribute workloads: Spread I/O-intensive workloads across multiple storage devices or volumes.

  7. Consider storage tiering: Use fast storage for hot data and slower, cheaper storage for cold data.

  8. Optimize your application: Reduce unnecessary I/O operations through efficient algorithms and data structures.

  9. Use parallel I/O: Where possible, design your application to perform I/O operations in parallel rather than sequentially.

  10. Consider specialized solutions: For extremely high-IOPS requirements, technologies like in-memory databases or storage-class memory might be appropriate.
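Strategy 2 (caching) can start as simply as memoizing reads in application memory. The sketch below uses functools.lru_cache as a stand-in for a real cache layer such as Redis; the read function is hypothetical:

```python
from functools import lru_cache

DISK_READS = 0  # counts how often we actually hit storage

@lru_cache(maxsize=1024)
def read_block(block_id: int) -> bytes:
    """Return a block's data, hitting storage only on cache misses."""
    global DISK_READS
    DISK_READS += 1
    return f"data-for-block-{block_id}".encode()  # stand-in for a real read

for block in [1, 2, 1, 1, 2, 3]:
    read_block(block)
print(DISK_READS)  # 3: six logical reads became three physical reads
```

The ratio of logical to physical reads is your cache hit rate; every hit is an I/O operation your storage never has to serve.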

How to avoid IOPS bottlenecks

Preventing IOPS bottlenecks is often easier than fixing them after they occur. Here are some proactive approaches:

  1. Baseline your performance: Understand your normal IOPS requirements before you have problems.

  2. Monitor trends: Watch for gradual increases in IOPS utilization or latency that might indicate growing problems.

  3. Size appropriately: Don't undersize your storage infrastructure to save money—it often costs more in the long run.

  4. Plan for peaks: Design for peak IOPS requirements, not just average loads.

  5. Use QoS (Quality of Service): In shared storage environments, use QoS to prevent one workload from starving others.

  6. Schedule intensive operations: Run backups, batch jobs, and other I/O-intensive tasks during off-peak hours.

  7. Limit noisy neighbors: In virtualized environments, be aware of how VMs on the same physical hardware might impact each other.

  8. Test before production: Benchmark your storage performance with realistic workloads before deploying critical applications.

I once worked with a company that couldn't figure out why their database was suddenly running slow every day at 2 PM. Turns out, they had scheduled a backup job that was consuming all their IOPS. Moving the backup to midnight solved the problem instantly.

Monitoring IOPS for optimal performance

Continuous monitoring of storage performance is essential for maintaining optimal system operation. Key metrics to monitor include:

  1. IOPS utilization: How close are you to your maximum IOPS capacity?

  2. Latency trends: Is latency increasing over time, indicating potential problems?

  3. Queue depths: Are operations being queued, suggesting insufficient IOPS?

  4. Read/write distribution: Has your workload pattern changed?

  5. Block size distribution: Are operations using efficient block sizes?

  6. Throughput utilization: Are you approaching bandwidth limits?

Effective monitoring tools can provide early warning of developing issues before they impact users. Most modern infrastructure monitoring solutions include storage performance metrics, but specialized tools might provide more detailed insights.
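On Linux, per-device IOPS can be derived from two samples of /proc/diskstats, where (per the kernel's iostats documentation) field 4 is reads completed and field 8 is writes completed, counting from 1 after the major/minor/device-name columns. The sample lines below are fabricated for illustration:

```python
def completed_ops(diskstats_line: str) -> int:
    """Total completed read + write operations from one /proc/diskstats line."""
    fields = diskstats_line.split()
    # fields[0..2] are major, minor, device name;
    # fields[3] is reads completed, fields[7] is writes completed.
    return int(fields[3]) + int(fields[7])

def iops_between(sample_a: str, sample_b: str, interval_s: float) -> float:
    """IOPS over the interval between two samples of the same device."""
    return (completed_ops(sample_b) - completed_ops(sample_a)) / interval_s

# Two fabricated samples of the same device taken 5 seconds apart:
a = "259 0 nvme0n1 100000 0 800000 50 200000 0 1600000 90 0 120 140"
b = "259 0 nvme0n1 101500 0 812000 52 203500 0 1628000 95 0 121 142"
print(iops_between(a, b, 5.0))  # 1000.0
```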

For web applications and APIs, monitoring your entire stack's performance is crucial—not just storage. That's where tools like Odown come in handy. Odown provides comprehensive monitoring for websites and APIs, helping you identify performance bottlenecks throughout your infrastructure.

Besides uptime monitoring, Odown also offers SSL certificate monitoring to ensure your security certificates remain valid and public status pages to keep your users informed about system performance. These integrated capabilities allow you to maintain visibility not just into your storage performance, but your entire application stack.

So while IOPS is an important metric for understanding storage performance, it's just one piece of the larger performance puzzle. By understanding how IOPS relates to throughput and latency, and by implementing appropriate monitoring and optimization strategies, you can ensure your storage infrastructure meets the needs of your applications and users.

Remember, the goal isn't to maximize IOPS for its own sake—it's to provide sufficient performance for your specific workload while balancing considerations like cost, reliability, and scalability. Finding that balance is both an art and a science, but with the right approach, you can achieve optimal storage performance for your unique requirements.