* Re: [PATCH] KVM: add halt_attempted_poll to VCPU stats
From: David Matlack @ 2015-09-15 17:21 UTC
  To: Paolo Bonzini; +Cc: linux-kernel, kvm list, rkrcmar

On Tue, Sep 15, 2015 at 9:27 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This new statistic can help diagnose VCPUs that, for any reason,
> trigger bad behavior of halt_poll_ns autotuning.
>
> For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
> at 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
> 10+20+40+80+160+320+480 = 1110 microseconds out of every
> 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU is then
> consuming about 30% more CPU than it would without
> polling.  This would show up as an abnormally high number of
> attempted polls compared to successful polls.
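
The arithmetic checks out; here is a tiny userspace sketch that
reproduces it (the 10us starting value, the doubling grow factor, and
the 480us cap are assumptions about the autotune defaults, not
something this patch changes):

/* poll_waste.c - sketch reproducing the commit message arithmetic.
 * Assumes the poll window starts at 10us, doubles after each failed
 * poll, and is capped at 480us (halt_poll_ns = 480000).
 */
#include <stdio.h>

int main(void)
{
	const int cap_us = 480;
	const int wakeups_us[] = { 479, 481, 479, 481, 479, 481, 479 };
	int poll_us = 10, wasted_us = 0, elapsed_us = 0;

	for (int i = 0; i < 7; i++) {
		wasted_us += poll_us;	/* every poll misses the wakeup */
		elapsed_us += wakeups_us[i];
		poll_us *= 2;		/* autotune grows the window */
		if (poll_us > cap_us)
			poll_us = cap_us;
	}
	printf("wasted %dus of %dus (%.0f%% extra CPU)\n",
	       wasted_us, elapsed_us, 100.0 * wasted_us / elapsed_us);
	return 0;
}

This prints "wasted 1110us of 3359us (33% extra CPU)", in line with
the ~30% figure above.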

Reviewed-by: David Matlack <dmatlack@google.com>

>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: David Matlack <dmatlack@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/arm/include/asm/kvm_host.h     | 1 +
>  arch/arm64/include/asm/kvm_host.h   | 1 +
>  arch/mips/include/asm/kvm_host.h    | 1 +
>  arch/mips/kvm/mips.c                | 1 +
>  arch/powerpc/include/asm/kvm_host.h | 1 +
>  arch/powerpc/kvm/book3s.c           | 1 +
>  arch/powerpc/kvm/booke.c            | 1 +
>  arch/s390/include/asm/kvm_host.h    | 1 +
>  arch/s390/kvm/kvm-s390.c            | 1 +
>  arch/x86/include/asm/kvm_host.h     | 1 +
>  arch/x86/kvm/x86.c                  | 1 +
>  virt/kvm/kvm_main.c                 | 1 +
>  12 files changed, 12 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index dcba0fa5176e..687ddeba3611 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -148,6 +148,7 @@ struct kvm_vm_stat {
>
>  struct kvm_vcpu_stat {
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>  };
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 415938dc45cf..486594583cc6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -195,6 +195,7 @@ struct kvm_vm_stat {
>
>  struct kvm_vcpu_stat {
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>  };
>
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index e8c8d9d0c45f..3a54dbca9f7e 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -128,6 +128,7 @@ struct kvm_vcpu_stat {
>         u32 msa_disabled_exits;
>         u32 flush_dcache_exits;
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>  };
>
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index cd4c129ce743..49ff3bfc007e 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -55,6 +55,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>         { "msa_disabled", VCPU_STAT(msa_disabled_exits), KVM_STAT_VCPU },
>         { "flush_dcache", VCPU_STAT(flush_dcache_exits), KVM_STAT_VCPU },
>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll), KVM_STAT_VCPU },
> +       { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), KVM_STAT_VCPU },
>         { "halt_wakeup",  VCPU_STAT(halt_wakeup),        KVM_STAT_VCPU },
>         {NULL}
>  };
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 98eebbf66340..195886a583ba 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -108,6 +108,7 @@ struct kvm_vcpu_stat {
>         u32 dec_exits;
>         u32 ext_intr_exits;
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>         u32 dbell_exits;
>         u32 gdbell_exits;
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index d75bf325f54a..cf009167d208 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -53,6 +53,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>         { "ext_intr",    VCPU_STAT(ext_intr_exits) },
>         { "queue_intr",  VCPU_STAT(queue_intr) },
>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
> +       { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), },
>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>         { "pf_storage",  VCPU_STAT(pf_storage) },
>         { "sp_storage",  VCPU_STAT(sp_storage) },
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index ae458f0fd061..fd5875179e5c 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -63,6 +63,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>         { "dec",        VCPU_STAT(dec_exits) },
>         { "ext_intr",   VCPU_STAT(ext_intr_exits) },
>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
> +       { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll) },
>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>         { "doorbell", VCPU_STAT(dbell_exits) },
>         { "guest doorbell", VCPU_STAT(gdbell_exits) },
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 3d012e071647..6ce4a0b7e8da 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -210,6 +210,7 @@ struct kvm_vcpu_stat {
>         u32 exit_validity;
>         u32 exit_instruction;
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>         u32 instruction_lctl;
>         u32 instruction_lctlg;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c91eb941b444..2f807ab1725e 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -63,6 +63,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>         { "exit_program_interruption", VCPU_STAT(exit_program_interruption) },
>         { "exit_instr_and_program_int", VCPU_STAT(exit_instr_and_program) },
>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
> +       { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll) },
>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>         { "instruction_lctlg", VCPU_STAT(instruction_lctlg) },
>         { "instruction_lctl", VCPU_STAT(instruction_lctl) },
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09acaa64ef8e..5bca6ac91e7c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -718,6 +718,7 @@ struct kvm_vcpu_stat {
>         u32 nmi_window_exits;
>         u32 halt_exits;
>         u32 halt_successful_poll;
> +       u32 halt_attempted_poll;
>         u32 halt_wakeup;
>         u32 request_irq_exits;
>         u32 irq_exits;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 18d59b584dee..96e748c46bee 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -149,6 +149,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>         { "nmi_window", VCPU_STAT(nmi_window_exits) },
>         { "halt_exits", VCPU_STAT(halt_exits) },
>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
> +       { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll) },
>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>         { "hypercalls", VCPU_STAT(hypercalls) },
>         { "request_irq", VCPU_STAT(request_irq_exits) },
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 0b48aadedcee..c5de4ba9b33e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2004,6 +2004,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>         if (vcpu->halt_poll_ns) {
>                 ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>
> +               ++vcpu->stat.halt_attempted_poll;
>                 do {
>                         /*
>                          * This sets KVM_REQ_UNHALT if an interrupt
> --
> 1.8.3.1
>
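
With the patch applied, the counter appears in KVM's debugfs next to
halt_successful_poll; a minimal sketch for eyeballing the poll hit
rate (assuming debugfs is mounted at /sys/kernel/debug and you have
permission to read it):

/* poll_rate.c - sketch: read the aggregated halt-poll counters from
 * /sys/kernel/debug/kvm/ and report the fraction of successful polls.
 */
#include <stdio.h>

static long read_stat(const char *name)
{
	char path[128];
	long val = -1;
	FILE *f;

	snprintf(path, sizeof(path), "/sys/kernel/debug/kvm/%s", name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

int main(void)
{
	long attempted = read_stat("halt_attempted_poll");
	long successful = read_stat("halt_successful_poll");

	if (attempted <= 0) {
		fprintf(stderr, "counters unavailable\n");
		return 1;
	}
	printf("attempted %ld, successful %ld, hit rate %.1f%%\n",
	       attempted, successful, 100.0 * successful / attempted);
	return 0;
}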


* Re: [PATCH] KVM: add halt_attempted_poll to VCPU stats
From: Christian Borntraeger @ 2015-09-16 10:12 UTC
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, David Hildenbrand, David Matlack, Jens Freimann

On 15.09.2015 18:27, Paolo Bonzini wrote:
> This new statistic can help diagnose VCPUs that, for any reason,
> trigger bad behavior of halt_poll_ns autotuning.
>
> For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
> at 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
> 10+20+40+80+160+320+480 = 1110 microseconds out of every
> 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU is then
> consuming about 30% more CPU than it would without
> polling.  This would show up as an abnormally high number of
> attempted polls compared to successful polls.
> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: David Matlack <dmatlack@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>

yes, this will help to detect some bad cases, but not all.

PS: 
upstream maintenance keeps me really busy at the moment :-)
I am looking into a case right now where auto polling goes
completely nuts on my system:

guest1: 8 vcpus		guest2: 1 vcpu
iperf with 25 processes (-P25) from guest1 to guest2.

I/O interrupts on s390 are floating (pending on all CPUs), so on
ALL VCPUs that go to sleep, polling will consider any pending
network interrupt a successful poll. So with auto polling the
guest consumes up to 5 host CPUs; without auto polling, only 1.
Reducing halt_poll_ns to 100000 seems to work (goes back to
1 cpu).

The proper way might be to feed the result of the
interrupt dequeue back into the heuristics. I don't know yet how
to handle that properly.
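
Something like the following, perhaps (an untested sketch;
kvm_arch_vcpu_irq_was_floating() is a made-up hook, not an existing
function):

/* Sketch: after a wakeup, penalize the poll window if the interrupt
 * we dequeued was floating, i.e. any other VCPU could just as well
 * have taken it.
 */
static void adjust_poll_after_dequeue(struct kvm_vcpu *vcpu)
{
	if (kvm_arch_vcpu_irq_was_floating(vcpu))
		/* floating IRQs should not inflate the poll window */
		vcpu->halt_poll_ns /= 2;
	else if (vcpu->halt_poll_ns * 2 <= halt_poll_ns)
		vcpu->halt_poll_ns *= 2;
}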

Christian



* Re: [PATCH] KVM: add halt_attempted_poll to VCPU stats
From: Paolo Bonzini @ 2015-09-16 10:15 UTC
  To: Christian Borntraeger, linux-kernel, kvm
  Cc: rkrcmar, David Hildenbrand, David Matlack, Jens Freimann



On 16/09/2015 12:12, Christian Borntraeger wrote:
> I am looking into a case right now where auto polling goes
> completely nuts on my system:
>
> guest1: 8 vcpus		guest2: 1 vcpu
> iperf with 25 processes (-P25) from guest1 to guest2.
>
> I/O interrupts on s390 are floating (pending on all CPUs), so on
> ALL VCPUs that go to sleep, polling will consider any pending
> network interrupt a successful poll. So with auto polling the
> guest consumes up to 5 host CPUs; without auto polling, only 1.
> Reducing halt_poll_ns to 100000 seems to work (goes back to
> 1 cpu).
>
> The proper way might be to feed the result of the
> interrupt dequeue back into the heuristics. I don't know yet how
> to handle that properly.

I think it's simplest to disable halt_poll_ns by default on s390.  On
x86, for example, you can mark an interrupt so that it _can_ be
delivered to all CPUs but only one will actually get it.

You can add a Kconfig symbol for that on the other architectures, and not on s390.
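
Something along these lines (a sketch; HAVE_KVM_HALT_POLL and
KVM_HALT_POLL_NS_DEFAULT are placeholder names, not existing symbols):

/* Each architecture that wants polling would do, in its Kconfig:
 *     select HAVE_KVM_HALT_POLL
 * s390 simply would not.  Then in virt/kvm/kvm_main.c:
 */
#ifdef CONFIG_HAVE_KVM_HALT_POLL
static unsigned int halt_poll_ns = KVM_HALT_POLL_NS_DEFAULT;
#else
static unsigned int halt_poll_ns;	/* 0 disables polling entirely */
#endif
module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);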

Paolo


* Re: [PATCH] KVM: add halt_attempted_poll to VCPU stats
From: Wanpeng Li @ 2015-09-16 12:12 UTC
  To: Christian Borntraeger, Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, David Hildenbrand, David Matlack, Jens Freimann

On 9/16/15 6:12 PM, Christian Borntraeger wrote:
> On 15.09.2015 18:27, Paolo Bonzini wrote:
>> This new statistic can help diagnose VCPUs that, for any reason,
>> trigger bad behavior of halt_poll_ns autotuning.
>>
>> For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
>> at 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
>> 10+20+40+80+160+320+480 = 1110 microseconds out of every
>> 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU is then

For the first 481us wakeup, block_ns should be 481us; since block_ns >
halt_poll_ns (480us), a long halt is detected and vcpu->halt_poll_ns
will be shrunk.
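
For reference, the autotune step at the end of kvm_vcpu_block looks
roughly like this (a paraphrase, not the exact code):

	/* after waking up: block_ns is how long we actually blocked */
	if (halt_poll_ns) {
		if (block_ns <= vcpu->halt_poll_ns)
			;			/* poll window was adequate */
		else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
			shrink_halt_poll_ns(vcpu);	/* long halt: stop wasting polls */
		else if (vcpu->halt_poll_ns < halt_poll_ns &&
			 block_ns < halt_poll_ns)
			grow_halt_poll_ns(vcpu);	/* short halt: poll longer */
	}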

>> consuming about 30% more CPU than it would without
>> polling.  This would show up as an abnormally high number of
>> attempted polls compared to successful polls.
>>
>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: David Matlack <dmatlack@google.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
>
> yes, this will help to detect some bad cases, but not all.
>
> PS:
> upstream maintenance keeps me really busy at the moment :-)
> I am looking into a case right now where auto polling goes
> completely nuts on my system:
>
> guest1: 8 vcpus		guest2: 1 vcpu
> iperf with 25 processes (-P25) from guest1 to guest2.
>
> I/O interrupts on s390 are floating (pending on all CPUs), so on
> ALL VCPUs that go to sleep, polling will consider any pending
> network interrupt a successful poll. So with auto polling the
> guest consumes up to 5 host CPUs; without auto polling, only 1.
> Reducing halt_poll_ns to 100000 seems to work (goes back to
> 1 cpu).
>
> The proper way might be to feed the result of the
> interrupt dequeue back into the heuristics. I don't know yet how
> to handle that properly.

Can this be reproduced on an x86 platform?

Regards,
Wanpeng Li


* Re: [PATCH] KVM: add halt_attempted_poll to VCPU stats
From: David Hildenbrand @ 2015-09-17 10:28 UTC
  To: Christian Borntraeger
  Cc: Paolo Bonzini, linux-kernel, kvm, rkrcmar, David Matlack, Jens Freimann

> On 15.09.2015 18:27, Paolo Bonzini wrote:
> > This new statistic can help diagnose VCPUs that, for any reason,
> > trigger bad behavior of halt_poll_ns autotuning.
> >
> > For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
> > at 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
> > 10+20+40+80+160+320+480 = 1110 microseconds out of every
> > 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU is then
> > consuming about 30% more CPU than it would without
> > polling.  This would show up as an abnormally high number of
> > attempted polls compared to successful polls.
> > 
> > Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> > Cc: David Matlack <dmatlack@google.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> yes, this will help to detect some bad cases, but not all.
> 
> PS: 
> upstream maintenance keeps me really busy at the moment :-)
> I am looking into a case right now where auto polling goes
> completely nuts on my system:
>
> guest1: 8 vcpus		guest2: 1 vcpu
> iperf with 25 processes (-P25) from guest1 to guest2.
>
> I/O interrupts on s390 are floating (pending on all CPUs), so on
> ALL VCPUs that go to sleep, polling will consider any pending
> network interrupt a successful poll. So with auto polling the
> guest consumes up to 5 host CPUs; without auto polling, only 1.
> Reducing halt_poll_ns to 100000 seems to work (goes back to
> 1 cpu).
>
> The proper way might be to feed the result of the
> interrupt dequeue back into the heuristics. I don't know yet how
> to handle that properly.
> 
> Christian

I think the main problem is that we have two different kinds of wakeups, and
they can't be properly reported for now: "runnability" says nothing about the
"reason".

a) "forced wakeup" - "local workload"
- "local" interrupt/timer pending
- signal
- (request bits, nested irqs ... for some archs)
-> Impacts halt_poll_ns

b) "trial wakeup" - "floating" workload
-> floating interrupts
Another vcpu might be faster and dequeue the floating interrupt.
However, if we have a high I/O load, we want all VCPUs to reduce their
halt_poll_ns value. Special cases would also be:
- Only one VCPU in the system
- Only one VCPU running
- Only one VCPU that is enabled for this type of interrupt
-> Impacts halt_poll_ns only partially

So kvm_arch_vcpu_runnable() returns true for multiple
VCPUs, although only one might get the interrupt. If we could change that
internally (one VCPU reserving that interrupt), we might get this working out of
the box. As kvm_arch_vcpu_runnable is on the hot path and also called from other
VCPUs, this isn't that trivial. Will play with it. Until then, I'll prepare a
patch to disable it for s390x, just as Paolo suggested.
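
For the kvm_arch_vcpu_runnable() idea, roughly what I have in mind
(sketch only; both helpers are made-up names and the locking is
hand-waved):

/* Sketch: make only one VCPU see a pending floating interrupt as a
 * wakeup reason, so polling does not fire on all of them.
 */
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
	if (kvm_s390_local_irq_pending(vcpu))
		return 1;	/* case a): always a genuine wakeup */
	/* case b): floating -- let exactly one VCPU claim it */
	return kvm_s390_try_claim_float_irq(vcpu);
}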

David


