Your virtual machine shares resources with other virtual machines on a hypervisor (the server that hosts the virtual machines). CPU Steal is the share of time your virtual CPU has to wait for the hypervisor to give it a real CPU. This happens when neighbouring VMs are using up the real CPU of the hypervisor.
CPU Steal is an important metric to monitor if you're using virtualized servers. It can have a significant impact on the performance of your machine.
Monitoring CPU Steal
The top command is a great way to keep track of CPU Steal in real time.
top - 15:43:41 up 59 days, 22:41, 1 user, load average: 0.02, 0.07, 0.20
Tasks: 110 total, 1 running, 109 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.3 us, 1.3 sy, 0.0 ni, 91.5 id, 0.0 wa, 0.0 hi, 0.3 si, 4.6 st
KiB Mem : 1998976 total, 268548 free, 364992 used, 1365436 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 1172056 avail Mem
The third row shows a breakdown of the CPU usage; the steal value (st) is the last one in this row. In this case the CPU steal is 4.6%.
The other values in this row are:
- us (2.3%): Time the CPU spends in user space
- sy (1.3%): Time the CPU spends running the system kernel
- ni (0.0%): Time the CPU spends running user space programs that are niced
- id (91.5%): Time the CPU is not doing anything (idle)
- wa (0.0%): Time the CPU has to wait for I/O (disk/network...) to complete
- hi (0.0%): Time the CPU spends on hardware interrupts
- si (0.3%): Time the CPU spends on software interrupts
- st (4.6%): Time the virtual CPU spends waiting for the hypervisor to give it a real CPU (steal)
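On Linux, these percentages are derived from the cumulative counters on the first line of /proc/stat. The following is a minimal sketch of that calculation, assuming a kernel that reports at least the first eight counters (user, nice, system, idle, iowait, irq, softirq, steal). The function names are illustrative and not part of any tool.

```python
import time

# Order of the counters on the aggregate "cpu" line of /proc/stat,
# labelled here with the abbreviations that top uses.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Return the first eight counters of a /proc/stat 'cpu' line as a dict."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    if len(parts) < 9:
        raise ValueError("kernel does not report a steal counter")
    return dict(zip(FIELDS, (int(value) for value in parts[1:9])))


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two snapshots."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between the snapshots.
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_steal(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return steal in %."""
    with open(path) as stat_file:
        before = parse_cpu_line(stat_file.readline())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_cpu_line(stat_file.readline())
    return cpu_percentages(before, after)["st"]
```

Calling sample_steal() on a Linux machine returns the steal percentage over roughly one second. The counters are cumulative since boot, so the difference between two snapshots is what matters, not the raw values.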
Resolving CPU Steal time
Occasional CPU steal below 10% should be fine. If it stays higher than that, there are a few things you can do about CPU Steal.
- Upgrade your virtual machine to a more powerful CPU.
- Increase the number of available CPU cores.
- Migrate to a different node with less noisy neighbours.
- Contact your hosting provider; they will be able to investigate on the node itself.
Using Nixstats to monitor CPU Steal time
CPU Steal time is monitored by default on Nixstats. Just install the monitoring agent and you should be able to view the amount of CPU Steal.
A system with about 5% CPU Steal time.
Receive alerts for CPU Steal
Example of an alert set up to trigger if CPU Steal is higher than 10% for more than 60 minutes on any server with the tag VM.
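To illustrate what "higher than 10% for more than 60 minutes" means in practice, here is a rough sketch of such a rule evaluated over a series of samples. It is not how Nixstats implements alerts; the function name, the sample format (Unix timestamp in seconds, steal percentage) and the defaults are assumptions made for this example, and the tag filter is left out.

```python
def should_alert(samples, threshold=10.0, min_duration=60 * 60):
    """Return True if steal has stayed above `threshold` for more than
    `min_duration` seconds, up to and including the latest sample.

    `samples` is an iterable of (unix_seconds, steal_percent) tuples in
    chronological order (hypothetical format for this sketch).
    """
    streak_start = None  # timestamp at which the current high-steal streak began
    last_timestamp = None
    for timestamp, steal in samples:
        if steal > threshold:
            if streak_start is None:
                streak_start = timestamp
        else:
            # Any sample at or below the threshold resets the streak.
            streak_start = None
        last_timestamp = timestamp
    if streak_start is None:
        return False
    return last_timestamp - streak_start > min_duration
```

A single sample at or below the threshold resets the timer, so short spikes do not trigger an alert, which matches the idea that occasional steal is harmless.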