* Cyclictest with small interval in guest makes host cpu go very high
From: Florent Carli @ 2022-05-13  6:15 UTC
  To: kvm; +Cc: nsaenzju

Hello,

When I run a cyclictest with a small interval in a guest, even though
the guest's cpu load is small (2-3%) the host qemu-system thread is
showing 100% cpu utilization, almost all of it being "system/kernel".
There seems to be a threshold effect:
- on my system an interval of 220us creates no problem (host
qemu-system thread is 4% user and 1% system)
- an interval of 210us puts the host qemu-system thread at 4% user
and 50% system
- an interval of 200us puts the host qemu-system thread at 4% user
and 95% system

Those threshold values are probably not universal...

I'm using kvm with qemu on x86-64, and this issue seems easily
reproducible (yocto with a 5.15rt kernel, debian stable with a 5.10rt
kernel, or a non-rt 5.10 or a backported 5.16rt kernel, etc.). I
reproduced this issue on a debian stable non-RT kernel to be sure the
problem was not due to preempt-rt.
My cmdlines for host and guest are very basic: ipv6.disable=1 efi=runtime
vCPU pinning does not change the outcome.

I'd love to understand the cause of this behavior and if there's
something to be done to solve this.
Thanks a lot.

Florent.


* Re: Cyclictest with small interval in guest makes host cpu go very high
From: Christophe de Dinechin @ 2022-05-13  6:26 UTC
  To: Florent Carli; +Cc: kvm, nsaenzju



> On 13 May 2022, at 08:15, Florent Carli <fcarli@gmail.com> wrote:
> 
> Hello,
> 
> When I run a cyclictest with a small interval in a guest, even though
> the guest's cpu load is small (2-3%) the host qemu-system thread is
> showing 100% cpu utilization, almost all of it being "system/kernel".
> There seems to be a threshold effect:
> - on my system an interval of 220us creates no problem (host
> qemu-system thread is 4% user and 1% system)
> - an interval of 210us puts the host qemu-system thread at 4% user
> and 50% system
> - an interval of 200us puts the host qemu-system thread at 4% user
> and 95% system
> 
> Those threshold values are probably not universal...
> 
> I'm using kvm with qemu on x86-64, and this issue seems easily
> reproducible (yocto with a 5.15rt kernel, debian stable with a 5.10rt
> kernel, or a non-rt 5.10 or a backported 5.16rt kernel, etc.). I
> reproduced this issue on a debian stable non-RT kernel to be sure the
> problem was not due to preempt-rt.
> My cmdlines for host and guest are very basic: ipv6.disable=1 efi=runtime
> vCPU pinning does not change the outcome.
> 
> I'd love to understand the cause of this behavior and if there's
> something to be done to solve this.
> Thanks a lot.

I suspect this is related to this:

https://lkml.kernel.org/kvm/ad6184a3-5de6-9a9d-77f8-84b6b47efb04@gmail.com/T/

Can you try adjusting poll_threshold_ns to confirm?

> 
> Florent.
> 



* Re: Cyclictest with small interval in guest makes host cpu go very high
From: Florent Carli @ 2022-05-16 15:48 UTC
  To: Christophe de Dinechin; +Cc: kvm

Thank you, Christophe, for your idea; it led me in the right direction.

I found that the root cause is actually the default value of
halt_poll_ns (200000 ns, i.e. 200 us).

"The KVM halt polling system provides a feature within KVM whereby the
latency of a guest can, under some circumstances, be reduced by
polling in the host for some time period after the guest has elected
to no longer run by cedeing."

When the cyclictest interval is larger than halt_poll_ns, the polling
does not help (no wakeup ever arrives during the poll) and the
grow/shrink algorithm drives the polling interval to 0 ("In the event
that the total block time was greater than the global max polling
interval then the host will never poll for long enough (limited by the
global max) to wakeup during the polling interval so it may as well be
shrunk in order to avoid pointless polling.").

But once the cyclictest interval becomes smaller than halt_poll_ns, a
wakeup source is received while polling ("During polling if a wakeup
source is received within the halt polling interval, the interval is
left unchanged."), so polling continues with the same value, again and
again, which puts us in this known situation:

"Care should be taken when setting the halt_poll_ns module parameter
as a large value has the potential to drive the cpu usage to 100% on a
machine which would be almost entirely idle otherwise. This is because
even if a guest has wakeups during which very little work is done and
which are quite far apart, if the period is shorter than the global
max polling interval (halt_poll_ns) then the host will always poll for
the entire block time and thus cpu utilisation will go to 100%."
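
To make the two regimes concrete, here is a toy Python model of the
grow/shrink behaviour quoted above. It is only a sketch under stated
assumptions: the function name polled_fraction is made up, the grow
factor (2), grow start (10000 ns) and reset-to-zero shrink are assumed
defaults, and every block is assumed to last exactly one cyclictest
interval, which ignores the time the guest actually spends running.

```python
def polled_fraction(block_ns, max_poll_ns, grow=2, grow_start=10_000, rounds=1000):
    """Toy model: fraction of guest idle time the host spends busy-polling.

    block_ns    -- time between halt and next wakeup (assumed constant)
    max_poll_ns -- the global cap, i.e. halt_poll_ns
    grow, grow_start -- assumed defaults, not read from a real system
    """
    poll_ns = 0   # per-vCPU polling interval, assumed to start at zero
    polled = 0    # total nanoseconds spent polling
    for _ in range(rounds):
        if block_ns <= poll_ns:
            # Wakeup arrived while polling: the whole block was polled
            # and the interval is left unchanged.
            polled += block_ns
            continue
        # The poll timed out; the vCPU then sleeps until the wakeup.
        polled += poll_ns
        if block_ns > max_poll_ns:
            # Block time exceeds the global max: polling is pointless, shrink.
            poll_ns = 0
        elif block_ns < max_poll_ns and poll_ns < max_poll_ns:
            # Polling a bit longer would have caught the wakeup: grow.
            poll_ns = min(max(poll_ns * grow, grow_start), max_poll_ns)
    return polled / (rounds * block_ns)


if __name__ == "__main__":
    for block_us in (190, 250):
        print(block_us, "us:", polled_fraction(block_us * 1000, 200_000))
```

With a 200000 ns cap the model polls almost all of a 190us block time
and none of a 250us one; lowering the cap to 100000 ns removes the
polling for 190us as well. The real threshold is softer than this
(210us already showed 50% system here), so treat it as an illustration
only.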

It just took me a while to realize that halt_poll_ns = 200000ns = 200us
= my problematic cyclictest interval threshold...
If I set halt_poll_ns to 100000 (and restart the vm, that's
important), then the 200us cyclictest interval works fine.
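
For anyone wanting to check their own host, here is a small Python
sketch that reads the current value and compares it with a few
cyclictest intervals. The sysfs path is the usual location of kvm
module parameters on Linux; the helper names and the simple
"interval <= halt_poll_ns" test are assumptions made for this sketch,
not something taken from the kernel documentation.

```python
from pathlib import Path

# Usual sysfs location of the kvm module parameter on a Linux host.
HALT_POLL_PATH = Path("/sys/module/kvm/parameters/halt_poll_ns")


def read_halt_poll_ns(path=HALT_POLL_PATH):
    """Return halt_poll_ns as an int, or None if the file does not exist."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def interval_at_risk(interval_us, halt_poll_ns):
    """Hypothetical check: True if the wakeup period fits in the poll window."""
    return interval_us * 1000 <= halt_poll_ns


if __name__ == "__main__":
    value = read_halt_poll_ns()
    if value is None:
        print("kvm module parameters not found on this machine")
    else:
        for interval_us in (200, 210, 220):
            print(interval_us, "us at risk:", interval_at_risk(interval_us, value))
```

Writing a new value to the same file as root should change the
parameter; as noted above, the vm has to be restarted for it to take
effect.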
