device %busy avque r+w/s blks/s avwait avserv
sd3 82 0.9 134 879 0.0 6.6
sd4 90 0.9 133 879 0.0 6.9
sd5 73 17.1 777 12366 0.0 22.0
You can draw conflicting conclusions from this data:
- On the one hand disk “sd5” seems to be performing slower than disk “sd4”, at 22.0 milliseconds per disk I/O request from Solaris versus only 6.9 milliseconds for “sd4”.
- But on the other hand “sd5” is clearly doing more work than “sd4” – 777 I/Os per second versus 133 I/Os per seconds (6 times more), and 12,366 blocks per second versus 879 (14 times more). Does this make “sd5” actually faster than “sd4” overall?
I actually think that “sd5” is faster, given how many more disk I/Os it is doing per second, and the size of its average queue length. How can I prove this one way or the other? Well our old friend “Queuing Theory” can help, with its set of formulae describing how such things work.
A key point to realise is that modern disks have internal queues, and will accept more than one request from the operating system at a time. From the operating system’s perspective it can send a new I/O request to a disk before all the previous ones have finished. From the disk’s perspective it has an internal queue in front of the real disk, and the real disk can still only do one I/O at a time. We can see that this is the case in Solaris because the average queue length is 17.1 for “sd5”. Also the average wait time is 0.0 for all disks, because there is no waiting or queuing within Solaris, which is what this measures. Solaris was always able to immediately issue a new I/O request to the disk, and never exceeded any limit on concurrent requests to the disks.
So although “sd5” looks slow at 22.0 milliseconds service time, this is the full service time measured by Solaris, which includes any queuing time within the disk device itself. And with 17.1 concurrent requests on average, this could be quite a large queue, meaning that the 22.0 milliseconds reported by Solaris could include a significant amount of time waiting within the disk before the I/O was actually performed and the data returned.
Queuing Theory can help us “look inside the disk device” and see how big its queue is on average, and what the “real service time” of an I/O is within the disk when it performs it.
What do we know about the disks behaviour?
- Average completed requests per second are 777 for sd5 and 133 for sd4.
- External service times are 22.0 ms for sd5 and 6.9 ms for sd4
- Average requests in the disk device are 17.1 for sd5, and 0.9 for sd4
We would like to know the actual service time within the disk, separate from the queue time within the disk. We can use a formula from Queuing Theory for this:
- S = R / (1 + N)
- sd5: S = 22.0 / (1 + 17.1) = 22.0 / 18.1 = 1.215 milliseconds
- sd4: S = 6.9 / (1 + 0.9) = 6.9 / 1.9 = 3.632 milliseconds
In terms of the utilisation of each disk, the Queuing Theory formula is U = X * S, so:
- sd5: U = 777 * 0.001215 = 0.944 = 94.4%
- sd4: U = 133 * 0.003632 = 0.483 = 48.3%
In this scenario I would suggest trying to move some of the I/O workload off “sd5” and onto some other disks somehow. Any reduction in the workload on “sd5” would dramatically reduce the number of concurrent requests (average queue length) and so dramatically reduce the service time as measured by Solaris. In other words, “sd5” is a hot and busy disk.