* nested KVM slower than QEMU with gnumach guest kernel
@ 2014-11-11 18:55 Samuel Thibault
  2014-11-11 23:42 ` Jan Kiszka
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2014-11-11 18:55 UTC (permalink / raw)
  To: kvm, gleb, pbonzini, abelg, nyh, oritw, benami, muli

Hello,

jenkins.debian.net is running inside a KVM VM, and it runs nested
KVM guests for its installation attempts.  This goes fine with Linux
kernels, but it is extremely slow with gnumach kernels.  I have
reproduced the issue on my laptop with a Linux 3.17 host kernel, a
3.16 L1-guest kernel, and an i7-2720QM CPU, with similar results; it's
actually even slower than letting qemu emulate the CPU... For these
tests I'm using the following image:

http://people.debian.org/~sthibault/tmp/netinst.iso

The reference test here boils down to running qemu -cdrom netinst.iso -m
512, choosing the "Automated install" option, and waiting for the "Loading
additional components" step to complete.  (Yes, the boot menu currently
gets mangled; there's apparently a bug between qemu and grub.)

My host is A, my level1-KVM-guest is B.

KVM:
A$ qemu -enable-kvm -cdrom netinst.iso -m 512M
takes ~1 minute.

QEMU:
A$ qemu             -cdrom netinst.iso -m 512M
takes ~7 minutes.

KVM-in-KVM:
B$ qemu -enable-kvm -cdrom netinst.iso -m 512M
takes ~10 minutes, when it doesn't get completely stuck, which happens
quite often, actually...

QEMU-in-KVM:
B$ qemu             -cdrom netinst.iso -m 512M
takes ~7 minutes.

I don't see such a horrible slowdown with a Linux image.  Is there
something particular that could explain such a difference?  What tools
or counters could I use to investigate which area of KVM is getting
slow?
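(What I have at hand for that, assuming a recent enough perf, is
something like this, to get per-exit-reason counts on the host:

A$ perf kvm stat record -a
A$ perf kvm stat report

but I'm not sure what to look for in its output.)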

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-11 18:55 nested KVM slower than QEMU with gnumach guest kernel Samuel Thibault
@ 2014-11-11 23:42 ` Jan Kiszka
  2014-11-16 22:18   ` Samuel Thibault
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2014-11-11 23:42 UTC (permalink / raw)
  To: Samuel Thibault, kvm, gleb, pbonzini, abelg, nyh, oritw, benami, muli

On 2014-11-11 19:55, Samuel Thibault wrote:
> Hello,
> 
> jenkins.debian.net is running inside a KVM VM, and it runs nested
> KVM guests for its installation attempts.  This goes fine with Linux
> kernels, but it is extremely slow with gnumach kernels.  I have
> reproduced the issue on my laptop with a Linux 3.17 host kernel, a
> 3.16 L1-guest kernel, and an i7-2720QM CPU, with similar results; it's
> actually even slower than letting qemu emulate the CPU... For these
> tests I'm using the following image:
> 
> http://people.debian.org/~sthibault/tmp/netinst.iso
> 
> The reference test here boils down to running qemu -cdrom netinst.iso -m
> 512, choosing the "Automated install" option, and waiting for the "Loading
> additional components" step to complete.  (Yes, the boot menu currently
> gets mangled; there's apparently a bug between qemu and grub.)
> 
> My host is A, my level1-KVM-guest is B.
> 
> KVM:
> A$ qemu -enable-kvm -cdrom netinst.iso -m 512M
> takes ~1 minute.
> 
> QEMU:
> A$ qemu             -cdrom netinst.iso -m 512M
> takes ~7 minutes.
> 
> KVM-in-KVM:
> B$ qemu -enable-kvm -cdrom netinst.iso -m 512M
> takes ~10 minutes, when it doesn't get completely stuck, which happens
> quite often, actually...
> 
> QEMU-in-KVM:
> B$ qemu             -cdrom netinst.iso -m 512M
> takes ~7 minutes.
> 
> I don't see such a horrible slowdown with a Linux image.  Is there
> something particular that could explain such a difference?  What tools
> or counters could I use to investigate which area of KVM is getting
> slow?

You can try to catch a trace (ftrace) on the physical host.
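For instance something like this (a sketch, assuming trace-cmd and the
kvm trace events are available on that kernel):

A$ trace-cmd record -e kvm
(run the test, then stop with ^C)
A$ trace-cmd report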

I suspect the setup forces a lot of instruction emulation, either on L0
or L1. And that is slower than QEMU if KVM does not optimize like QEMU does.

Jan


* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-11 23:42 ` Jan Kiszka
@ 2014-11-16 22:18   ` Samuel Thibault
  2014-11-17  6:28     ` Jan Kiszka
  2014-11-17  8:58     ` Gleb Natapov
  0 siblings, 2 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-16 22:18 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm, gleb, pbonzini

Hello,

Jan Kiszka, le Wed 12 Nov 2014 00:42:52 +0100, a écrit :
> On 2014-11-11 19:55, Samuel Thibault wrote:
> > jenkins.debian.net is running inside a KVM VM, and it runs nested
> > KVM guests for its installation attempts.  This goes fine with Linux
> > kernels, but it is extremely slow with gnumach kernels.

> You can try to catch a trace (ftrace) on the physical host.
> 
> I suspect the setup forces a lot of instruction emulation, either on L0
> or L1. And that is slower than QEMU if KVM does not optimize like QEMU does.

Here is a sample of the trace-cmd output: the same kind of pattern
repeats over and over, with EXTERNAL_INTERRUPT exits happening roughly
every other microsecond:

 qemu-system-x86-9752  [003]  4106.187755: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
 qemu-system-x86-9752  [003]  4106.187756: kvm_entry:            vcpu 0
 qemu-system-x86-9752  [003]  4106.187757: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
 qemu-system-x86-9752  [003]  4106.187758: kvm_entry:            vcpu 0
 qemu-system-x86-9752  [003]  4106.187759: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
 qemu-system-x86-9752  [003]  4106.187760: kvm_entry:            vcpu 0

The various functions being interrupted are vmx_vcpu_run
(0xffffffffa02848b1 and 0xffffffffa0284972), handle_io
(0xffffffffa027ee62), vmx_get_cpl (0xffffffffa027a7de),
load_vmcs12_host_state (0xffffffffa027ea31), native_read_tscp
(0xffffffff81050a84), native_write_msr_safe (0xffffffff81050aa6),
vmx_decache_cr0_guest_bits (0xffffffffa027a384),
vmx_handle_external_intr (0xffffffffa027a54d).

AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
however don't see any of them in either L0's or L1's
/proc/interrupts...
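(I was looking at the IWI lines, i.e. something like

A$ grep IWI /proc/interrupts
B$ grep IWI /proc/interrupts

assuming the kernels account IRQ work interrupts there at all.)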

Samuel

[-- Attachment #2: trace.bz2 --]
[-- Type: application/octet-stream, Size: 8303 bytes --]

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-16 22:18   ` Samuel Thibault
@ 2014-11-17  6:28     ` Jan Kiszka
  2014-11-17  8:50       ` Samuel Thibault
                         ` (2 more replies)
  2014-11-17  8:58     ` Gleb Natapov
  1 sibling, 3 replies; 16+ messages in thread
From: Jan Kiszka @ 2014-11-17  6:28 UTC (permalink / raw)
  To: Samuel Thibault, kvm, gleb, pbonzini

On 2014-11-16 23:18, Samuel Thibault wrote:
> Hello,
> 
> Jan Kiszka, le Wed 12 Nov 2014 00:42:52 +0100, a écrit :
>> On 2014-11-11 19:55, Samuel Thibault wrote:
>>> jenkins.debian.net is running inside a KVM VM, and it runs nested
>>> KVM guests for its installation attempts.  This goes fine with Linux
>>> kernels, but it is extremely slow with gnumach kernels.
> 
>> You can try to catch a trace (ftrace) on the physical host.
>>
>> I suspect the setup forces a lot of instruction emulation, either on L0
>> or L1. And that is slower than QEMU if KVM does not optimize like QEMU does.
> 
> Here is a sample of the trace-cmd output: the same kind of pattern
> repeats over and over, with EXTERNAL_INTERRUPT exits happening roughly
> every other microsecond:
> 
>  qemu-system-x86-9752  [003]  4106.187755: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187756: kvm_entry:            vcpu 0
>  qemu-system-x86-9752  [003]  4106.187757: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187758: kvm_entry:            vcpu 0
>  qemu-system-x86-9752  [003]  4106.187759: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187760: kvm_entry:            vcpu 0

You may want to turn on more trace events, if not all of them, to see
what Linux is doing then. The next level after that is function tracing
(which may require a kernel rebuild or the distro's tracing kernel).
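E.g., as a sketch (-e all and the function_graph plugin both depend on
the kernel configuration):

A$ trace-cmd record -e all
A$ trace-cmd record -p function_graph -e kvm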

> 
> The various functions being interrupted are vmx_vcpu_run
> (0xffffffffa02848b1 and 0xffffffffa0284972), handle_io
> (0xffffffffa027ee62), vmx_get_cpl (0xffffffffa027a7de),
> load_vmcs12_host_state (0xffffffffa027ea31), native_read_tscp
> (0xffffffff81050a84), native_write_msr_safe (0xffffffff81050aa6),
> vmx_decache_cr0_guest_bits (0xffffffffa027a384),
> vmx_handle_external_intr (0xffffffffa027a54d).
> 
> AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
> however don't see any of them in either L0's or L1's
> /proc/interrupts...

I suppose this is an SMP host and guest?  Does reducing the CPUs to 1
change the picture?  If not, it may at least make cause and effect
easier to understand.
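E.g. booting the kernels with maxcpus=1 (or nosmp) and giving -smp 1 to
QEMU explicitly, assuming your setup allows it:

B$ qemu -enable-kvm -smp 1 -cdrom netinst.iso -m 512M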

Jan



* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  6:28     ` Jan Kiszka
@ 2014-11-17  8:50       ` Samuel Thibault
  2014-11-24  0:42       ` Samuel Thibault
  2014-12-15  0:09       ` Samuel Thibault
  2 siblings, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-17  8:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm, gleb, pbonzini

Jan Kiszka, le Mon 17 Nov 2014 07:28:23 +0100, a écrit :
> > AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
> > however don't see any of them in either L0's or L1's
> > /proc/interrupts...
> 
> I suppose this is an SMP host and guest?

L0 is a hyperthreaded quad-core, but L1 has only 1 VCPU.  In the trace,
L1 apparently always happened to be scheduled on the same L0 CPU:
trace-cmd tells me that CPUs 0-2 and 4-7 are empty.

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-16 22:18   ` Samuel Thibault
  2014-11-17  6:28     ` Jan Kiszka
@ 2014-11-17  8:58     ` Gleb Natapov
  2014-11-17  9:03       ` Samuel Thibault
  1 sibling, 1 reply; 16+ messages in thread
From: Gleb Natapov @ 2014-11-17  8:58 UTC (permalink / raw)
  To: Samuel Thibault, Jan Kiszka, kvm, pbonzini

On Sun, Nov 16, 2014 at 11:18:28PM +0100, Samuel Thibault wrote:
> Hello,
> 
> Jan Kiszka, le Wed 12 Nov 2014 00:42:52 +0100, a écrit :
> > On 2014-11-11 19:55, Samuel Thibault wrote:
> > > jenkins.debian.net is running inside a KVM VM, and it runs nested
> > > KVM guests for its installation attempts.  This goes fine with Linux
> > > kernels, but it is extremely slow with gnumach kernels.
> 
> > You can try to catch a trace (ftrace) on the physical host.
> > 
> > I suspect the setup forces a lot of instruction emulation, either on L0
> > or L1. And that is slower than QEMU if KVM does not optimize like QEMU does.
> 
> Here is a sample of the trace-cmd output: the same kind of pattern
> repeats over and over, with EXTERNAL_INTERRUPT exits happening roughly
> every other microsecond:
> 
>  qemu-system-x86-9752  [003]  4106.187755: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187756: kvm_entry:            vcpu 0
>  qemu-system-x86-9752  [003]  4106.187757: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187758: kvm_entry:            vcpu 0
>  qemu-system-x86-9752  [003]  4106.187759: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
>  qemu-system-x86-9752  [003]  4106.187760: kvm_entry:            vcpu 0
> 
> The various functions being interrupted are vmx_vcpu_run
> (0xffffffffa02848b1 and 0xffffffffa0284972), handle_io
> (0xffffffffa027ee62), vmx_get_cpl (0xffffffffa027a7de),
> load_vmcs12_host_state (0xffffffffa027ea31), native_read_tscp
> (0xffffffff81050a84), native_write_msr_safe (0xffffffff81050aa6),
> vmx_decache_cr0_guest_bits (0xffffffffa027a384),
> vmx_handle_external_intr (0xffffffffa027a54d).
> 
> AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
> however don't see any of them in either L0's or L1's
> /proc/interrupts...
> 
Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
Which clock device is it using?

--
			Gleb.

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  8:58     ` Gleb Natapov
@ 2014-11-17  9:03       ` Samuel Thibault
  2014-11-17  9:04         ` Jan Kiszka
  0 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2014-11-17  9:03 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Jan Kiszka, kvm, pbonzini

Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
> Which clock device is it using?

It uses the PIT every 10ms, in square mode
(PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
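Concretely, the programming boils down to something like this (a sketch;
the constant values are my reading of the 0x36 control word, and the
helper names are hypothetical, not gnumach's actual ones):

/* The PIT input clock is 1193182 Hz; HZ = 100 gives a tick every 10ms. */
#define PIT_HZ          1193182
#define HZ              100

#define PIT_C0          0x00    /* control word: select counter 0 */
#define PIT_SQUAREMODE  0x06    /* mode 3: square wave generator */
#define PIT_READMODE    0x30    /* load LSB first, then MSB */

static inline void pit_outb(unsigned short port, unsigned char val)
{
        asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

static void pit_init(void)
{
        unsigned short count = PIT_HZ / HZ;     /* 11932 PIT cycles per interrupt */

        pit_outb(0x43, PIT_C0 | PIT_SQUAREMODE | PIT_READMODE); /* = 0x36 */
        pit_outb(0x40, count & 0xff);   /* counter 0 data port: LSB... */
        pit_outb(0x40, count >> 8);     /* ...then MSB */
}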

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:03       ` Samuel Thibault
@ 2014-11-17  9:04         ` Jan Kiszka
  2014-11-17  9:10           ` Samuel Thibault
  2014-11-23 21:50           ` Samuel Thibault
  0 siblings, 2 replies; 16+ messages in thread
From: Jan Kiszka @ 2014-11-17  9:04 UTC (permalink / raw)
  To: Samuel Thibault, Gleb Natapov, kvm, pbonzini

On 2014-11-17 10:03, Samuel Thibault wrote:
> Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
>> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
>> Which clock device is it using?
> 
> It uses the PIT every 10ms, in square mode
> (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).

Wow... how retro. That feature might be unsupported - does user space
irqchip work better?

Jan



* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:04         ` Jan Kiszka
@ 2014-11-17  9:10           ` Samuel Thibault
  2014-11-17  9:21             ` Gleb Natapov
  2014-11-23 21:50           ` Samuel Thibault
  1 sibling, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2014-11-17  9:10 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm, pbonzini

Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit :
> On 2014-11-17 10:03, Samuel Thibault wrote:
> > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
> >> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
> >> Which clock device is it using?
> > 
> > It uses the PIT every 10ms, in square mode
> > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
> 
> Wow... how retro. That feature might be unsupported - does user space
> irqchip work better?

I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm,
with the same bad performance and the same EXTERNAL_INTERRUPT pattern in
the trace.

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:10           ` Samuel Thibault
@ 2014-11-17  9:21             ` Gleb Natapov
  2014-11-17  9:37               ` Samuel Thibault
  2014-11-17  9:38               ` Samuel Thibault
  0 siblings, 2 replies; 16+ messages in thread
From: Gleb Natapov @ 2014-11-17  9:21 UTC (permalink / raw)
  To: Samuel Thibault, Jan Kiszka, kvm, pbonzini

On Mon, Nov 17, 2014 at 10:10:25AM +0100, Samuel Thibault wrote:
> Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit :
> > On 2014-11-17 10:03, Samuel Thibault wrote:
> > > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
> > >> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
> > >> Which clock device is it using?
> > > 
> > > It uses the PIT every 10ms, in square mode
> > > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
> > 
> > Wow... how retro. That feature might be unsupported - does user space
> > irqchip work better?
> 
> I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm,
> with the same bad performance and the same EXTERNAL_INTERRUPT pattern
> in the trace.
> 
They will always be in the trace, but do you see them every 1ms or every
10ms with the user space irqchip?

--
			Gleb.

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:21             ` Gleb Natapov
@ 2014-11-17  9:37               ` Samuel Thibault
  2014-11-17  9:38               ` Samuel Thibault
  1 sibling, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-17  9:37 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Jan Kiszka, kvm, pbonzini

Gleb Natapov, le Mon 17 Nov 2014 11:21:22 +0200, a écrit :
> On Mon, Nov 17, 2014 at 10:10:25AM +0100, Samuel Thibault wrote:
> > Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit :
> > > On 2014-11-17 10:03, Samuel Thibault wrote:
> > > > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
> > > >> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
> > > >> Which clock device is it using?
> > > > 
> > > > It uses the PIT every 10ms, in square mode
> > > > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
> > > 
> > > Wow... how retro. That feature might be unsupported - does user space
> > > irqchip work better?
> > 
> > I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm,
> > with the same bad performance and the same EXTERNAL_INTERRUPT pattern
> > in the trace.
> > 
> They will always be in the trace, but do you see them every 1ms or
> every 10ms with the user space irqchip?

The external interrupts come every 1 *microsecond*, not millisecond,
with irqchip=off or not.

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:21             ` Gleb Natapov
  2014-11-17  9:37               ` Samuel Thibault
@ 2014-11-17  9:38               ` Samuel Thibault
  1 sibling, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-17  9:38 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Jan Kiszka, kvm, pbonzini

Also, I have made gnumach display a timer counter: it does get PIT
interrupts every 10ms as expected, not more often.

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  9:04         ` Jan Kiszka
  2014-11-17  9:10           ` Samuel Thibault
@ 2014-11-23 21:50           ` Samuel Thibault
  1 sibling, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-23 21:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm, pbonzini

Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit :
> On 2014-11-17 10:03, Samuel Thibault wrote:
> > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit :
> >> Do you know how gnumach timekeeping works? Does it have a timer that fires every 1ms?
> >> Which clock device is it using?
> > 
> > It uses the PIT every 10ms, in square mode
> > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
> 
> Wow... how retro. That feature might be unsupported

(BTW, I tried the more common divide-by-N rate-generator mode, 0x34,
with the same result.)

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  6:28     ` Jan Kiszka
  2014-11-17  8:50       ` Samuel Thibault
@ 2014-11-24  0:42       ` Samuel Thibault
  2014-12-15  0:09       ` Samuel Thibault
  2 siblings, 0 replies; 16+ messages in thread
From: Samuel Thibault @ 2014-11-24  0:42 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm, gleb, pbonzini

Jan Kiszka, le Mon 17 Nov 2014 07:28:23 +0100, a écrit :
> I suppose this is an SMP host and guest?  Does reducing the CPUs to 1
> change the picture?

Oddly enough, putting my host into UniProcessor mode makes L1 realmode
emulation awfully slow.  That also happens when binding kvm to a single
hardware thread like this:

hwloc-bind pu:0 kvm ...

but not when binding kvm to both threads of the same core like this:

hwloc-bind core:0 kvm ...

...

> > Here is a sample of the trace-cmd output: the same kind of pattern
> > repeats over and over, with EXTERNAL_INTERRUPT exits happening roughly
> > every other microsecond:
> > 
> >  qemu-system-x86-9752  [003]  4106.187755: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
> >  qemu-system-x86-9752  [003]  4106.187756: kvm_entry:            vcpu 0
> >  qemu-system-x86-9752  [003]  4106.187757: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
> >  qemu-system-x86-9752  [003]  4106.187758: kvm_entry:            vcpu 0
> >  qemu-system-x86-9752  [003]  4106.187759: kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xffffffffa02848b1 info 0 800000f6
> >  qemu-system-x86-9752  [003]  4106.187760: kvm_entry:            vcpu 0

Switching to UniProcessor mode did drop them, however, but the slowness
is still there, so they are probably not the source of the issue.  I'm
now wondering whether they weren't simply coming from the tracing engine
itself: I see some irq_work_queue calls in kernel/trace/ring_buffer.c
and kernel/events/core.c.

> You may want to turn on more trace events, if not all of them, to see
> what Linux is doing then.

With the EXTERNAL_INTERRUPTs mostly gone, I get this over and over:

		qemu-system-x86-2138  [000]   247.558705: kvm_exit:             reason MSR_READ rip 0xffffffff81050a82 info 0 0
native_read_msr_safe
		qemu-system-x86-2138  [000]   247.558705: kvm_msr:              msr_read 1d9 = 0x0
		qemu-system-x86-2138  [000]   247.558705: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558706: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558706: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558706: kvm_exit:             reason VMRESUME rip 0xffffffffa03058ae info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558711: kvm_mmu_get_page:     [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=248173 role=114692 root_count=0 unsync=0 created=0
		qemu-system-x86-2138  [000]   247.558712: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558712: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558712: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558712: kvm_exit:             reason IO_INSTRUCTION rip 0xc0109769 info a10040 0
gnumach accesses the PIC
		qemu-system-x86-2138  [000]   247.558713: kvm_nested_vmexit:    rip: 0x00000000c0109769 reason: IO_INSTRUCTION ext_inf1: 0x0000000000a10040 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
		qemu-system-x86-2138  [000]   247.558713: kvm_nested_vmexit_inject: reason: IO_INSTRUCTION ext_inf1: 0x0000000000a10040 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
		qemu-system-x86-2138  [000]   247.558718: kvm_mmu_get_page:     [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=0 role=122884 root_count=0 unsync=0 created=0
		qemu-system-x86-2138  [000]   247.558718: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558719: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558719: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558719: kvm_exit:             reason VMREAD rip 0xffffffffa0305956 info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558720: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558720: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558720: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558721: kvm_exit:             reason VMREAD rip 0xffffffffa030596f info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558721: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558721: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558721: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558722: kvm_exit:             reason VMREAD rip 0xffffffffa02fb333 info 0 0
vmx_read_l1_tsc
		qemu-system-x86-2138  [000]   247.558722: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558722: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558722: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558723: kvm_exit:             reason VMREAD rip 0xffffffffa02fb54a info 0 0
vmx_handle_external_intr
		qemu-system-x86-2138  [000]   247.558723: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558724: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558724: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558724: kvm_exit:             reason VMREAD rip 0xffffffffa02ffe5f info 0 0
handle_io
		qemu-system-x86-2138  [000]   247.558725: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558725: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558725: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558726: kvm_exit:             reason VMREAD rip 0xffffffffa02fcce5 info 0 0
vmx_cache_reg
		qemu-system-x86-2138  [000]   247.558726: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558726: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558726: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558727: kvm_exit:             reason VMREAD rip 0xffffffffa02ffa2e info 0 0
skip_emulated_instruction
		qemu-system-x86-2138  [000]   247.558727: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558727: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558727: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558728: kvm_exit:             reason VMREAD rip 0xffffffffa02fb81a info 0 0
vmx_set_interrupt_shadow
		qemu-system-x86-2138  [000]   247.558728: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558728: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558728: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558729: kvm_exit:             reason VMREAD rip 0xffffffffa02fb270 info 0 0
vmx_get_rflags
		qemu-system-x86-2138  [000]   247.558730: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558730: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558730: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558730: kvm_exit:             reason VMREAD rip 0xffffffffa02fcbf5 info 0 0
vmx_interrupt_allowed
		qemu-system-x86-2138  [000]   247.558731: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558731: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558731: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558734: kvm_exit:             reason VMWRITE rip 0xffffffffa02fb7db info 0 0
vmx_get_cpl
		qemu-system-x86-2138  [000]   247.558735: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558735: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558735: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558736: kvm_exit:             reason VMWRITE rip 0xffffffffa02fb7db info 0 0
vmx_get_cpl
		qemu-system-x86-2138  [000]   247.558736: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558736: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558736: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558737: kvm_exit:             reason VMWRITE rip 0xffffffffa02fb7db info 0 0
vmx_get_cpl
		qemu-system-x86-2138  [000]   247.558737: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558737: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558737: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558738: kvm_exit:             reason VMWRITE rip 0xffffffffa02fb7db info 0 0
vmx_get_cpl
		qemu-system-x86-2138  [000]   247.558738: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558738: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558739: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558739: kvm_exit:             reason VMWRITE rip 0xffffffffa02fb7db info 0 0
vmx_get_cpl
		qemu-system-x86-2138  [000]   247.558740: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558740: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558740: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558740: kvm_exit:             reason MSR_READ rip 0xffffffff81050a82 info 0 0
native_read_msr_safe
		qemu-system-x86-2138  [000]   247.558741: kvm_msr:              msr_read 1d9 = 0x0
		qemu-system-x86-2138  [000]   247.558741: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558741: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558741: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558741: kvm_exit:             reason VMRESUME rip 0xffffffffa03058ae info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558746: kvm_mmu_get_page:     [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=248173 role=114692 root_count=0 unsync=0 created=0
		qemu-system-x86-2138  [000]   247.558747: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558747: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558747: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558748: kvm_exit:             reason IO_INSTRUCTION rip 0xc01095f5 info 210040 0
		qemu-system-x86-2138  [000]   247.558748: kvm_nested_vmexit:    rip: 0x00000000c01095f5 reason: IO_INSTRUCTION ext_inf1: 0x0000000000210040 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
		qemu-system-x86-2138  [000]   247.558749: kvm_nested_vmexit_inject: reason: IO_INSTRUCTION ext_inf1: 0x0000000000210040 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
		qemu-system-x86-2138  [000]   247.558753: kvm_mmu_get_page:     [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=0 role=122884 root_count=0 unsync=0 created=0
		qemu-system-x86-2138  [000]   247.558754: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558754: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558754: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558755: kvm_exit:             reason VMREAD rip 0xffffffffa0305956 info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558755: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558756: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558756: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558756: kvm_exit:             reason VMREAD rip 0xffffffffa030596f info 0 0
vmx_vcpu_run
		qemu-system-x86-2138  [000]   247.558757: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558757: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558757: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558757: kvm_exit:             reason VMREAD rip 0xffffffffa02fb333 info 0 0
vmx_read_l1_tsc
		qemu-system-x86-2138  [000]   247.558758: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558758: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558758: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558759: kvm_exit:             reason VMREAD rip 0xffffffffa02fb54a info 0 0
vmx_handle_external_intr
		qemu-system-x86-2138  [000]   247.558759: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558759: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558759: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558760: kvm_exit:             reason VMREAD rip 0xffffffffa02ffe5f info 0 0
handle_io
		qemu-system-x86-2138  [000]   247.558760: rcu_utilization:      Start context switch
		qemu-system-x86-2138  [000]   247.558760: rcu_utilization:      End context switch
		qemu-system-x86-2138  [000]   247.558760: kvm_entry:            vcpu 0
		qemu-system-x86-2138  [000]   247.558761: kvm_exit:             reason VMREAD rip 0xffffffffa02fcce5 info 0 0
vmx_cache_reg

When running the same kind of operation with non-nested KVM, I get this
kind of trace:

		qemu-system-x86-3667  [000]  1399.213498: kvm_exit:             reason IO_INSTRUCTION rip 0x801090c8 info 210040 0
		qemu-system-x86-3667  [000]  1399.213498: kvm_pio:              pio_write at 0x21 size 1 count 1 val 0xff
		qemu-system-x86-3667  [000]  1399.213499: rcu_utilization:      Start context switch
		qemu-system-x86-3667  [000]  1399.213499: rcu_utilization:      End context switch
		qemu-system-x86-3667  [000]  1399.213499: kvm_entry:            vcpu 0
		qemu-system-x86-3667  [000]  1399.213500: kvm_exit:             reason IO_INSTRUCTION rip 0x801090d1 info a10040 0
		qemu-system-x86-3667  [000]  1399.213500: kvm_pio:              pio_write at 0xa1 size 1 count 1 val 0xff 
		qemu-system-x86-3667  [000]  1399.213500: rcu_utilization:      Start context switch
		qemu-system-x86-3667  [000]  1399.213501: rcu_utilization:      End context switch
		qemu-system-x86-3667  [000]  1399.213501: kvm_entry:            vcpu 0
		qemu-system-x86-3667  [000]  1399.213501: kvm_exit:             reason IO_INSTRUCTION rip 0x80108f5d info 210040 0
		qemu-system-x86-3667  [000]  1399.213501: kvm_pio:              pio_write at 0x21 size 1 count 1 val 0x68 
		qemu-system-x86-3667  [000]  1399.213502: rcu_utilization:      Start context switch
		qemu-system-x86-3667  [000]  1399.213502: rcu_utilization:      End context switch
		qemu-system-x86-3667  [000]  1399.213502: kvm_entry:            vcpu 0

I.e. just one kvm_exit per guest IO instruction, not 18 like above.
Are all those exits really expected?

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-11-17  6:28     ` Jan Kiszka
  2014-11-17  8:50       ` Samuel Thibault
  2014-11-24  0:42       ` Samuel Thibault
@ 2014-12-15  0:09       ` Samuel Thibault
  2014-12-15  9:41         ` Paolo Bonzini
  2 siblings, 1 reply; 16+ messages in thread
From: Samuel Thibault @ 2014-12-15  0:09 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm, gleb, pbonzini

Hello,

Just FTR, it seems that the overhead is due to gnumach sometimes using
the PIC quite a lot.  It used not to be too much of a concern with just
kvm, but kvm-on-kvm becomes too expensive for that.  I've fixed gnumach
to be a lot more reasonable, and the performance issues went away.
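The shape of the fix is essentially to shadow the PIC mask and skip
redundant reprogramming, something like this (a sketch, not the actual
gnumach patch; the names are hypothetical):

static unsigned short curr_pic_mask;    /* shadow of master+slave IMR */

static inline void pic_outb(unsigned short port, unsigned char val)
{
        asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

static void pic_set_mask(unsigned short mask)
{
        if (mask == curr_pic_mask)
                return;         /* each outb costs a nested vmexit: skip no-ops */
        curr_pic_mask = mask;
        pic_outb(0x21, mask & 0xff);    /* master PIC IMR (OCW1) */
        pic_outb(0xa1, mask >> 8);      /* slave PIC IMR */
}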

Samuel

* Re: nested KVM slower than QEMU with gnumach guest kernel
  2014-12-15  0:09       ` Samuel Thibault
@ 2014-12-15  9:41         ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2014-12-15  9:41 UTC (permalink / raw)
  To: Samuel Thibault, Jan Kiszka, kvm, gleb



On 15/12/2014 01:09, Samuel Thibault wrote:
> Hello,
> 
> Just FTR, it seems that the overhead is due to gnumach sometimes using
> the PIC quite a lot.  It used not to be too much of a concern with just
> kvm, but kvm-on-kvm becomes too expensive for that.  I've fixed gnumach
> to be a lot more reasonable, and the performance issues went away.

Thanks!

Paolo
