This post is the second in a series looking at the use and misuse of IOPS for storage system performance analysis or specification.
In this experiment, the latency and bandwidth of random NFS writes are examined. Conventional wisdom says that jumbo frames and large I/Os are better than the default frame size or small I/Os. If that is the case, then we expect to see a correlation between I/O size and latency. Remember, latency is what we care about for performance, not operations per second (OPS). The test case is a typical VM workload where the client generates lots of small random write I/Os, here produced by the iozone benchmark. The operations are measured at the NFS server along with their size, internal latency, and bandwidth. The internal latency is the time the NFS server takes to respond to an NFS operation request. The NFS client sees the internal latency plus the transport latency.
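To make the distinction between OPS, bandwidth, and latency concrete, here is a minimal sketch of how per-operation records could be summarized. The summarize function and the sample records are hypothetical and invented for illustration. They are not the measurements from this experiment, and this is not the tooling used to collect them.

```python
from statistics import mean

def summarize(ops, interval_s):
    """Summarize operations completed during interval_s seconds.

    ops is a list of (size_bytes, latency_s) tuples, one per operation.
    This record layout is an assumption made for illustration.
    """
    sizes = [size for size, _ in ops]
    latencies = [latency for _, latency in ops]
    return {
        "ops_per_sec": len(ops) / interval_s,
        "bandwidth_bytes_per_sec": sum(sizes) / interval_s,
        "mean_latency_s": mean(latencies),
    }

# Made-up records: both workloads move 2MB in one second.
many_small = [(32 * 1024, 0.001)] * 64     # 64 x 32KB writes, 1 ms each
few_large = [(1024 * 1024, 0.020)] * 2     # 2 x 1MB writes, 20 ms each

for name, ops in (("small", many_small), ("large", few_large)):
    print(name, summarize(ops, interval_s=1.0))
```

With these invented numbers the two workloads report identical bandwidth, yet their OPS and latency differ widely, which is why no single one of these figures describes performance on its own.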
If the large I/O theory holds, we expect to see better performance with larger I/Os. By default, the NFSv3 I/O size for the server and client in this case is 1MB. It can be tuned to something smaller, so for comparison, we also measured with the I/O size set to 32KB (the NFSv2 default).
Toss the results into JMP and we get this nice chart that shows two consecutive iozone benchmark runs - the first with the NFS I/O size limited to 32KB, the second with the default NFS I/O size of 1MB:
The results are not as expected. The expectation is that larger I/Os are more efficient and therefore offer better effective bandwidth while reducing overall latency. What we see instead is higher bandwidth and significantly lower latency with the smaller I/O size! The small I/O size configuration on the left clearly outperforms the same system using large I/O sizes.
The way I like to describe this is using the cars vs trains analogy. Trains are much more efficient at moving people from one place to another. Hundreds or thousands of people can be carried on a train at high speed (except in the US, where high-speed trains are unknown, but that is a different topic). By contrast, cars can carry only a few people at a time, but can move about without regard to train schedules and without having to wait as hundreds of people load or unload. On the other hand, if a car and a train approach a crossing at the same time, the car must wait for the train to pass. And that can take some time. The same thing happens on a network, where small packets must wait until large packets pass through the interface. Hence, there is no correlation between the size of the packets and how quickly they move through the network, because when large packets are moving, the small packets can be blocked - cars wait at the crossing for the train to pass.
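A rough back-of-the-envelope sketch of the crossing delay follows. It assumes a 1 Gbit/s link, which is my assumption and not the link speed of the test system. It also uses a deliberately simplified FIFO model in which one I/O drains completely before the next starts, and it ignores protocol overhead. The function names are hypothetical.

```python
# Assumption: 1 Gbit/s link. The link speed of the test system is not stated.
LINK_BITS_PER_SEC = 1_000_000_000

SMALL_IO = 32 * 1024      # 32KB NFS I/O size from the experiment
LARGE_IO = 1024 * 1024    # 1MB NFS I/O size from the experiment

def serialization_delay_ms(io_bytes, link_bits_per_sec=LINK_BITS_PER_SEC):
    """Milliseconds needed to put io_bytes on the wire, ignoring overhead."""
    return io_bytes * 8 / link_bits_per_sec * 1000

def latency_behind_ms(own_bytes, ahead_bytes):
    """Wire time for an I/O queued directly behind one other I/O.

    Simplified FIFO model: the I/O ahead drains fully, then ours is sent.
    """
    return serialization_delay_ms(ahead_bytes) + serialization_delay_ms(own_bytes)

print(f"32KB I/O alone:       {serialization_delay_ms(SMALL_IO):.3f} ms")
print(f"32KB behind 32KB I/O: {latency_behind_ms(SMALL_IO, SMALL_IO):.3f} ms")
print(f"32KB behind 1MB I/O:  {latency_behind_ms(SMALL_IO, LARGE_IO):.3f} ms")
```

Under these assumptions, the same 32KB I/O spends far longer on the wire when it is stuck behind a 1MB transfer than when it follows another 32KB transfer: the car waiting at the crossing.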
This notion leads to a design choice that runs counter to the conventional wisdom. To improve the overall performance of the system, smaller I/O sizes can be better. As usual with performance issues, many factors contribute to the constraints, but consider that performance can improve when the I/O sizes are more like cars than trains.