
Disk Performance

Windows Performance Counters

Disk Performance can be measured using performance counters.

There are many performance counters available; these are the ones I find most useful:
  • % Disk Read Time
    • % Disk Read Time is the percentage of elapsed time that the selected disk drive was busy servicing read requests.
  • % Disk Write Time
    • % Disk Write Time is the percentage of elapsed time that the selected disk drive was busy servicing write requests.
  • Current Disk Queue Length
    • Current Disk Queue Length is the number of requests outstanding on the disk at the time the performance data is collected. It also includes requests in service at the time of the collection. This is an instantaneous snapshot, not an average over the time interval. Multi-spindle disk devices can have multiple requests that are active at one time, but other concurrent requests are awaiting service. This counter might reflect a transitory high or low queue length, but if there is a sustained load on the disk drive, it is likely that this will be consistently high. Requests experience delays proportional to the length of this queue minus the number of spindles on the disks. For good performance, this difference should average less than two.
  • Disk Read Bytes/sec
    • Disk Read Bytes/sec is the rate at which bytes are transferred from the disk during read operations.
  • Disk Write Bytes/sec
    • Disk Write Bytes/sec is the rate at which bytes are transferred to the disk during write operations.
  • Split IO/Sec
    • Split IO/Sec reports the rate at which I/Os to the disk were split into multiple I/Os. A split I/O may result from requesting data in a size too large to fit into a single I/O, or from a fragmented disk.
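
The queue length rule of thumb above (queue length minus the number of spindles, averaging less than two) can be worked through with a small helper. This is only a sketch: the function name queue_excess and the sample values are illustrative assumptions, not part of NSClient++ or Nagios.

```shell
# Hypothetical helper: subtract the spindle count from the sampled queue length.
# The counter description suggests this difference should average less than two.
queue_excess() {
    queue_length=$1
    spindles=$2
    echo $(( queue_length - spindles ))
}

# Example: a 4-spindle array sampled with 5 outstanding requests.
excess=$(queue_excess 5 4)
if [ "$excess" -lt 2 ]; then
    echo "queue looks healthy (excess=$excess)"
else
    echo "queue is backing up (excess=$excess)"
fi
```

Because the counter is an instantaneous snapshot, a single sample like this says little on its own; the comparison is only meaningful across many samples.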

The performance counters can be collected from either:
  • Logical Disk
    • The Logical Disk performance object consists of counters that monitor logical partitions of hard or fixed disk drives. Performance Monitor identifies logical disks by their drive letter, such as D:.
    • These are referenced as \LogicalDisk(D:)\Counter Object
  • Physical Disk
    • The Physical Disk performance object consists of counters that monitor hard or fixed disk drives on a computer. The values of physical disk counters are sums of the values of the logical disks (or partitions) into which they are divided.
    • These are referenced as \PhysicalDisk(1 D:)\Counter Object
The main difference between the two counter types is that Physical Disk counter objects reference both the physical disk number and the drive letter:
  • For example, D: on one computer could be:
    • \PhysicalDisk(1 D:)\Counter Object
  • On another computer it could be:
    • \PhysicalDisk(3 D:)\Counter Object

The reason for explaining all this has to do with advanced Nagios service configurations. When you create services that are assigned to host groups (so you don't need individual services for each host), the performance counter has to be the same on all hosts. With logical disk performance counters, the D: drive is always the same performance counter object.
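
To illustrate why a host-independent counter path matters, here is a minimal sketch that prints the same check for several hosts. The host names are hypothetical stand-ins for the members of a host group, and the commands are only printed, not executed.

```shell
# Hypothetical host names standing in for the members of a Nagios host group.
hosts="server01 server02 server03"

# One logical disk counter path; it is identical on every host that has a D: drive.
counter='\LogicalDisk(D:)\% Disk Read Time'

# Print (rather than run) the same check for each host.
build_checks() {
    for host in $hosts; do
        printf "check_nrpe -H %s -c CheckCounter -a 'Counter=%s' ShowAll MaxWarn=50 MaxCrit=75\n" "$host" "$counter"
    done
}

build_checks
```

With a physical disk counter the path would have to change per host (for example 1 D: on one machine and 3 D: on another), so a single shared definition like this would not work.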

NOTE: There are reasons why you would use physical disk performance counters. However, if you have just one volume (like D:) on its own disk and no other volumes on the same disk, then just stick to logical disk performance counters.

Warning and Critical Threshold Values
The values in the examples provided are for demonstration purposes; you will need to adjust them to suit your environment.

Labels When Using NSClient++
Normally the output is as follows:
  • Command:
    • check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter=\LogicalDisk(C:)\% Disk Read Time' ShowAll MaxWarn=50 MaxCrit=75
  • Output:
    • OK: \LogicalDisk(C:)\% Disk Read Time: 0|'\LogicalDisk(C:)\% Disk Read Time'=0;50;75

However, I like to provide my own label, as I don't need to display the whole name of the performance counter in my output. To do this, add the label between Counter and the equals sign, starting with a colon: Counter:<label>=. For example:
  • Command:
    • check_nrpe -H 192.168.142.137 -c CheckCounter -a "Counter:C: % Read Time=\LogicalDisk(C:)\% Disk Read Time" ShowAll MaxWarn=50 MaxCrit=75
  • Output:
    • OK: C: % Read Time: 0|'C: % Read Time'=0;50;75
So you can see it looks cleaner ... well that's my opinion anyway :)
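
If you build a lot of these arguments, the label syntax can be wrapped in a small helper. This is just a sketch; the function name labelled_counter is my own invention and not something NSClient++ provides.

```shell
# Hypothetical helper: build a labelled Counter argument for CheckCounter.
# $1 = label shown in the output, $2 = full performance counter path.
labelled_counter() {
    printf 'Counter:%s=%s\n' "$1" "$2"
}

labelled_counter 'C: % Read Time' '\LogicalDisk(C:)\% Disk Read Time'
# prints: Counter:C: % Read Time=\LogicalDisk(C:)\% Disk Read Time
```

The result still has to be quoted as a single argument when it is passed to check_nrpe, because both the label and the counter path contain spaces.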


OK so on with the examples ...

% Disk Read Time

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: % Read Time=\LogicalDisk(C:)\% Disk Read Time' ShowAll MaxWarn=50 MaxCrit=75

Output:
OK: C: % Read Time: 0|'C: % Read Time'=0;50;75

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: % Read Time=\LogicalDisk(C:)\% Disk Read Time' 'warning=value>50' 'critical=value>75' 'perf-config=*(suffix:none)'

Output:
OK: C: % Read Time = 0|'C: % Read Time'=0;50;75
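
Comparing the two commands above, the MaxWarn and MaxCrit numbers of CheckCounter appear to map onto the warning and critical expressions of check_pdh. The sketch below generates the newer form from the two numbers; the function name pdh_thresholds is hypothetical, and I am assuming a simple "greater than" comparison as used in the examples, so check the boundary behaviour yourself if it matters.

```shell
# Hypothetical helper: turn two threshold numbers into the
# warning/critical expressions used in the check_pdh examples.
# $1 = warning threshold, $2 = critical threshold.
pdh_thresholds() {
    printf "'warning=value>%s' 'critical=value>%s'\n" "$1" "$2"
}

pdh_thresholds 50 75
# prints: 'warning=value>50' 'critical=value>75'
```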

% Disk Write Time

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: % Write Time=\LogicalDisk(C:)\% Disk Write Time' ShowAll MaxWarn=50 MaxCrit=75

Output:
OK: C: % Write Time: 0.641025|'C: % Write Time'=0.64102;50;75

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: % Write Time=\LogicalDisk(C:)\% Disk Write Time' 'warning=value>50' 'critical=value>75' 'perf-config=*(suffix:none)'

Output:
OK: C: % Write Time = 0|'C: % Write Time'=0;50;75

Current Disk Queue Length

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: Queue Length=\LogicalDisk(C:)\Current Disk Queue Length' ShowAll MaxWarn=10 MaxCrit=20

Output:
OK: C: Queue Length: 0|'C: Queue Length'=0;10;20

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: Queue Length=\LogicalDisk(C:)\Current Disk Queue Length' 'warning=value>10' 'critical=value>20' 'perf-config=*(suffix:none)'

Output:
OK: C: Queue Length = 0|'C: Queue Length'=0;10;20

Disk Read Bytes/sec

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: Read Bytes/sec=\LogicalDisk(C:)\Disk Read Bytes/sec' ShowAll MaxWarn=100 MaxCrit=500

Output:
OK: C: Read Bytes/sec: 0|'C: Read Bytes/sec'=0;100;500

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: Read Bytes/sec=\LogicalDisk(C:)\Disk Read Bytes/sec' 'warning=value>100' 'critical=value>500' 'perf-config=*(suffix:none)'

Output:
OK: C: Read Bytes/sec = 0|'C: Read Bytes/sec'=0;100;500

Disk Write Bytes/sec

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: Write Bytes/sec=\LogicalDisk(C:)\Disk Write Bytes/sec' ShowAll MaxWarn=100 MaxCrit=500

Output:
OK: C: Write Bytes/sec: 0|'C: Write Bytes/sec'=0;100;500

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: Write Bytes/sec=\LogicalDisk(C:)\Disk Write Bytes/sec' 'warning=value>100' 'critical=value>500' 'perf-config=*(suffix:none)'

Output:
OK: C: Write Bytes/sec = 0|'C: Write Bytes/sec'=0;100;500

Split IO/Sec

NSClient++ 0.3.9 & 0.4.1
No additional configuration is required.
Command:
check_nrpe -H 192.168.142.137 -c CheckCounter -a 'Counter:C: Split IO/sec=\LogicalDisk(C:)\Split IO/Sec' ShowAll MaxWarn=10 MaxCrit=20

Output:
OK: C: Split IO/sec: 0|'C: Split IO/sec'=0;10;20

NSClient++ 0.5.0.17 onwards
No additional configuration is required.
Command:
check_nrpe -H 10.25.14.2 -c check_pdh -a 'counter:C: Split IO/sec=\LogicalDisk(C:)\Split IO/Sec' 'warning=value>10' 'critical=value>20' 'perf-config=*(suffix:none)'

Output:
OK: C: Split IO/sec = 0|'C: Split IO/sec'=0;10;20
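
All of the output lines above share one shape: a status text, a "|" separator, then performance data as 'label'=value;warn;crit. If you want to pull the numbers out in a script, something like the following sketch may help. The function name perfdata_fields is hypothetical, and it assumes a single counter per line with no "=" inside the label.

```shell
# Hypothetical helper: extract "value warn crit" from a single-counter output line.
# Assumes the performance data format shown in the examples: 'label'=value;warn;crit
perfdata_fields() {
    perfdata=${1#*"|"}        # drop everything up to and including the first "|"
    numbers=${perfdata##*=}   # drop the quoted label and the "="
    printf '%s\n' "$numbers" | tr ';' ' '
}

perfdata_fields "OK: C: % Write Time: 0.641025|'C: % Write Time'=0.64102;50;75"
# prints: 0.64102 50 75
```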