qemu-devel.nongnu.org archive mirror
* Testing the virtio-vhost-user QEMU patch
@ 2020-07-21  7:14 Alyssa Ross
  2020-07-21  8:30 ` Stefan Hajnoczi
  0 siblings, 1 reply; 8+ messages in thread
From: Alyssa Ross @ 2020-07-21  7:14 UTC (permalink / raw)
  To: Nikos Dragazis, Stefan Hajnoczi; +Cc: qemu-devel

Hi -- I hope it's okay me reaching out like this.

I've been trying to test out the virtio-vhost-user implementation that's
been posted to this list a couple of times, but have been unable to get
it to boot a kernel following the steps listed either on
<https://wiki.qemu.org/Features/VirtioVhostUser> or
<https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.

Specifically, the kernel appears to be unable to write to the
virtio-vhost-user device's PCI registers.  I've included the full panic
output from the kernel at the end of this message.  The panic is
reproducible with two different kernels I tried (with different configs
and versions).  I tried both versions of the virtio-vhost-user I was
able to find[1][2], and both exhibited the same behaviour.

Is this a known issue?  Am I doing something wrong?

Thanks in advance -- I'm excitedly following the progress of this
feature.

Alyssa Ross

[1]: https://github.com/ndragazis/qemu/commits/virtio-vhost-user
[2]: https://github.com/stefanha/qemu/commits/virtio-vhost-user


[    1.287979] BUG: unable to handle page fault for address: ffffb8ca40025014
[    1.288311] #PF: supervisor write access in kernel mode
[    1.288311] #PF: error_code(0x000b) - reserved bit violation
[    1.288311] PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE 8000002000000073
[    1.288311] Oops: 000b [#1] SMP PTI
[    1.288311] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.28 #1-NixOS
[    1.288311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[    1.288311] RIP: 0010:iowrite8+0xe/0x30
[    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
[    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
[    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
[    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
[    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
[    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
[    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
[    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
[    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
[    1.288311] Call Trace:
[    1.288311]  vp_reset+0x1b/0x50
[    1.288311]  register_virtio_device+0x74/0xe0
[    1.288311]  virtio_pci_probe+0xaf/0x140
[    1.288311]  local_pci_probe+0x42/0x80
[    1.288311]  pci_device_probe+0x104/0x1b0
[    1.288311]  really_probe+0x147/0x3c0
[    1.288311]  driver_probe_device+0xb6/0x100
[    1.288311]  device_driver_attach+0x53/0x60
[    1.288311]  __driver_attach+0x8a/0x150
[    1.288311]  ? device_driver_attach+0x60/0x60
[    1.288311]  bus_for_each_dev+0x78/0xc0
[    1.288311]  bus_add_driver+0x14d/0x1f0
[    1.288311]  driver_register+0x6c/0xc0
[    1.288311]  ? dma_bus_init+0xbf/0xbf
[    1.288311]  do_one_initcall+0x46/0x1f4
[    1.288311]  kernel_init_freeable+0x176/0x200
[    1.288311]  ? rest_init+0xab/0xab
[    1.288311]  kernel_init+0xa/0x105
[    1.288311]  ret_from_fork+0x35/0x40
[    1.288311] Modules linked in:
[    1.288311] CR2: ffffb8ca40025014
[    1.288311] ---[ end trace 5164b2fa531e028f ]---
[    1.288311] RIP: 0010:iowrite8+0xe/0x30
[    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
[    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
[    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
[    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
[    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
[    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
[    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
[    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
[    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
[    1.288311] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    1.288311] Kernel Offset: 0x21200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    1.288311] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-21  7:14 Testing the virtio-vhost-user QEMU patch Alyssa Ross
@ 2020-07-21  8:30 ` Stefan Hajnoczi
  2020-07-21 16:02   ` Alyssa Ross
  2020-07-23 22:27   ` Alyssa Ross
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-07-21  8:30 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: Nikos Dragazis, qemu-devel

On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
> Hi -- I hope it's okay me reaching out like this.
> 
> I've been trying to test out the virtio-vhost-user implementation that's
> been posted to this list a couple of times, but have been unable to get
> it to boot a kernel following the steps listed either on
> <https://wiki.qemu.org/Features/VirtioVhostUser> or
> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
> 
> Specifically, the kernel appears to be unable to write to the
> virtio-vhost-user device's PCI registers.  I've included the full panic
> output from the kernel at the end of this message.  The panic is
> reproducible with two different kernels I tried (with different configs
> and versions).  I tried both versions of the virtio-vhost-user I was
> able to find[1][2], and both exhibited the same behaviour.
> 
> Is this a known issue?  Am I doing something wrong?

Hi,
Unfortunately I'm not sure what the issue is. This is an early
virtio-pci register access before a driver for any specific device type
(net, blk, vhost-user, etc) comes into play.

Did you test the git trees linked below or did you rebase the commits
on top of your own QEMU tree?

Is your guest kernel a stock kernel.org/distro kernel or has it been
modified (especially with security patches)?

If no one else knows what is wrong here then it will be necessary to
check the Intel manuals to figure out the exact meaning of
"error_code(0x000b) - reserved bit violation" and why Linux triggers it
with "PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE
8000002000000073".

Stefan

> 
> Thanks in advance -- I'm excitedly following the progress of this
> feature.
> 
> Alyssa Ross
> 
> [1]: https://github.com/ndragazis/qemu/commits/virtio-vhost-user
> [2]: https://github.com/stefanha/qemu/commits/virtio-vhost-user
> 
> 
> [    1.287979] BUG: unable to handle page fault for address: ffffb8ca40025014
> [    1.288311] #PF: supervisor write access in kernel mode
> [    1.288311] #PF: error_code(0x000b) - reserved bit violation
> [    1.288311] PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE 8000002000000073
> [    1.288311] Oops: 000b [#1] SMP PTI
> [    1.288311] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.28 #1-NixOS
> [    1.288311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> [    1.288311] RIP: 0010:iowrite8+0xe/0x30
> [    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
> [    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
> [    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
> [    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
> [    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
> [    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
> [    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
> [    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
> [    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
> [    1.288311] Call Trace:
> [    1.288311]  vp_reset+0x1b/0x50
> [    1.288311]  register_virtio_device+0x74/0xe0
> [    1.288311]  virtio_pci_probe+0xaf/0x140
> [    1.288311]  local_pci_probe+0x42/0x80
> [    1.288311]  pci_device_probe+0x104/0x1b0
> [    1.288311]  really_probe+0x147/0x3c0
> [    1.288311]  driver_probe_device+0xb6/0x100
> [    1.288311]  device_driver_attach+0x53/0x60
> [    1.288311]  __driver_attach+0x8a/0x150
> [    1.288311]  ? device_driver_attach+0x60/0x60
> [    1.288311]  bus_for_each_dev+0x78/0xc0
> [    1.288311]  bus_add_driver+0x14d/0x1f0
> [    1.288311]  driver_register+0x6c/0xc0
> [    1.288311]  ? dma_bus_init+0xbf/0xbf
> [    1.288311]  do_one_initcall+0x46/0x1f4
> [    1.288311]  kernel_init_freeable+0x176/0x200
> [    1.288311]  ? rest_init+0xab/0xab
> [    1.288311]  kernel_init+0xa/0x105
> [    1.288311]  ret_from_fork+0x35/0x40
> [    1.288311] Modules linked in:
> [    1.288311] CR2: ffffb8ca40025014
> [    1.288311] ---[ end trace 5164b2fa531e028f ]---
> [    1.288311] RIP: 0010:iowrite8+0xe/0x30
> [    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
> [    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
> [    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
> [    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
> [    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
> [    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
> [    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
> [    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
> [    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
> [    1.288311] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
> [    1.288311] Kernel Offset: 0x21200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [    1.288311] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
> 


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-21  8:30 ` Stefan Hajnoczi
@ 2020-07-21 16:02   ` Alyssa Ross
  2020-07-23 22:27   ` Alyssa Ross
  1 sibling, 0 replies; 8+ messages in thread
From: Alyssa Ross @ 2020-07-21 16:02 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>> Hi -- I hope it's okay me reaching out like this.
>> 
>> I've been trying to test out the virtio-vhost-user implementation that's
>> been posted to this list a couple of times, but have been unable to get
>> it to boot a kernel following the steps listed either on
>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>> 
>> Specifically, the kernel appears to be unable to write to the
>> virtio-vhost-user device's PCI registers.  I've included the full panic
>> output from the kernel at the end of this message.  The panic is
>> reproducible with two different kernels I tried (with different configs
>> and versions).  I tried both versions of the virtio-vhost-user I was
>> able to find[1][2], and both exhibited the same behaviour.
>> 
>> Is this a known issue?  Am I doing something wrong?
>
> Hi,
> Unfortunately I'm not sure what the issue is. This is an early
> virtio-pci register access before a driver for any specific device type
> (net, blk, vhost-user, etc) comes into play.
>
> Did you test the git trees linked below or did you rebase the commits
> on top of your own QEMU tree?

I tested the git trees.  For yours I had to make a slight modification
to delete the memfd syscall wrapper in util/memfd.c, since it
conflicted with the one now provided by glibc.  I used Nikos's tree
completely unmodified.

> Is your guest kernel a stock kernel.org/distro kernel or has it been
> modified (especially with security patches)?

I tried a slightly modified Chromium OS kernel (5.4.23), and a stock
Ubuntu 18.10 kernel (4.15.0).  I think the most "normal" setup I tried
was building QEMU on Fedora 32, and then attempting to boot a freshly
installed Ubuntu Server 18.10 VM with

    -chardev socket,id=chardev0,path=vhost-user.sock,server,nowait \
    -device virtio-vhost-user-pci,chardev=chardev0

(The crash was reproducible with the full QEMU command lines in the
write-ups, but these seemed to be the load-bearing bits.)
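
For anyone wanting to reproduce this, a minimal invocation along the
following lines should be enough (the disk image and memory size are
placeholders for whatever bootable guest you have to hand):

    qemu-system-x86_64 \
        -enable-kvm \
        -cpu SandyBridge \
        -m 2G \
        -drive file=guest.qcow2,if=virtio \
        -chardev socket,id=chardev0,path=vhost-user.sock,server,nowait \
        -device virtio-vhost-user-pci,chardev=chardev0 \
        -nographic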

> If no one else knows what is wrong here then it will be necessary to
> check the Intel manuals to figure out the exact meaning of
> "error_code(0x000b) - reserved bit violation" and why Linux triggers it
> with "PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE
> 8000002000000073".

Thanks for your insight.  Now I at least have a place to start if nobody
else knows what's up. :)


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-21  8:30 ` Stefan Hajnoczi
  2020-07-21 16:02   ` Alyssa Ross
@ 2020-07-23 22:27   ` Alyssa Ross
  2020-07-24 10:58     ` Alyssa Ross
  1 sibling, 1 reply; 8+ messages in thread
From: Alyssa Ross @ 2020-07-23 22:27 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>> Hi -- I hope it's okay me reaching out like this.
>> 
>> I've been trying to test out the virtio-vhost-user implementation that's
>> been posted to this list a couple of times, but have been unable to get
>> it to boot a kernel following the steps listed either on
>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>> 
>> Specifically, the kernel appears to be unable to write to the
>> virtio-vhost-user device's PCI registers.  I've included the full panic
>> output from the kernel at the end of this message.  The panic is
>> reproducible with two different kernels I tried (with different configs
>> and versions).  I tried both versions of the virtio-vhost-user I was
>> able to find[1][2], and both exhibited the same behaviour.
>> 
>> Is this a known issue?  Am I doing something wrong?
>
> Hi,
> Unfortunately I'm not sure what the issue is. This is an early
> virtio-pci register access before a driver for any specific device type
> (net, blk, vhost-user, etc) comes into play.

Small update here: I tried on another computer, and it worked.  Made
sure that it was exactly the same QEMU binary, command line, and VM
disk/initrd/kernel, so I think I can fairly confidently say the panic
depends on what hardware QEMU is running on.  I set -cpu value to the
same on both as well (SandyBridge).

I also discovered that it works on my primary computer (the one it
panicked on before) with KVM disabled.

Note that I've only got so far as finding that it boots on the other
machine -- I haven't verified yet that it actually works.

Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
Good host CPU: AMD EPYC 7401P 24-Core Processor

May I ask what host CPUs other people have tested this on?  Having more
data would probably be useful.  Could it be an AMD vs. Intel thing?
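
If anyone wants to chip in, the exact model string can be grabbed on
the host with:

    $ grep -m1 'model name' /proc/cpuinfo
    model name      : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz

(The output shown here is just the i5-2520M example from above.)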


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-23 22:27   ` Alyssa Ross
@ 2020-07-24 10:58     ` Alyssa Ross
  2020-07-24 12:32       ` Stefan Hajnoczi
  0 siblings, 1 reply; 8+ messages in thread
From: Alyssa Ross @ 2020-07-24 10:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Nikos Dragazis, qemu-devel

Alyssa Ross <hi@alyssa.is> writes:

> Stefan Hajnoczi <stefanha@redhat.com> writes:
>
>> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>>> Hi -- I hope it's okay me reaching out like this.
>>> 
>>> I've been trying to test out the virtio-vhost-user implementation that's
>>> been posted to this list a couple of times, but have been unable to get
>>> it to boot a kernel following the steps listed either on
>>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
>>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>>> 
>>> Specifically, the kernel appears to be unable to write to the
>>> virtio-vhost-user device's PCI registers.  I've included the full panic
>>> output from the kernel at the end of this message.  The panic is
>>> reproducible with two different kernels I tried (with different configs
>>> and versions).  I tried both versions of the virtio-vhost-user I was
>>> able to find[1][2], and both exhibited the same behaviour.
>>> 
>>> Is this a known issue?  Am I doing something wrong?
>>
>> Hi,
>> Unfortunately I'm not sure what the issue is. This is an early
>> virtio-pci register access before a driver for any specific device type
>> (net, blk, vhost-user, etc) comes into play.
>
> Small update here: I tried on another computer, and it worked.  Made
> sure that it was exactly the same QEMU binary, command line, and VM
> disk/initrd/kernel, so I think I can fairly confidently say the panic
> depends on what hardware QEMU is running on.  I set -cpu value to the
> same on both as well (SandyBridge).
>
> I also discovered that it works on my primary computer (the one it
> panicked on before) with KVM disabled.
>
> Note that I've only got so far as finding that it boots on the other
> machine -- I haven't verified yet that it actually works.
>
> Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> Good host CPU: AMD EPYC 7401P 24-Core Processor
>
> May I ask what host CPUs other people have tested this on?  Having more
> data would probably be useful.  Could it be an AMD vs. Intel thing?

I think I've figured it out!

Sandy Bridge and Ivy Bridge hosts encounter this panic because the
"additional resources" bar size is too big, at 1 << 36.  If I change
this to 1 << 35, no more kernel panic.

Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
Skylake were Haswell and Broadwell, but I couldn't find anybody who was
able to help me test on either of those, so I don't know what they do.

Perhaps related, the hosts that produce panics all seem to have a
physical address size of 36 bits, while the hosts that work have larger
physical address sizes, as reported by lscpu.
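
For reference, by "physical address size" I mean the "Address sizes"
figure lscpu reports, which also appears directly in /proc/cpuinfo; on
the panicking laptop it reads something like:

    $ grep -m1 'address sizes' /proc/cpuinfo
    address sizes   : 36 bits physical, 48 bits virtual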


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-24 10:58     ` Alyssa Ross
@ 2020-07-24 12:32       ` Stefan Hajnoczi
  2020-07-24 21:56         ` Alyssa Ross
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-07-24 12:32 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: Nikos Dragazis, qemu-devel

On Fri, Jul 24, 2020 at 10:58:45AM +0000, Alyssa Ross wrote:
> Alyssa Ross <hi@alyssa.is> writes:
> 
> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> >
> >> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
> >>> Hi -- I hope it's okay me reaching out like this.
> >>> 
> >>> I've been trying to test out the virtio-vhost-user implementation that's
> >>> been posted to this list a couple of times, but have been unable to get
> >>> it to boot a kernel following the steps listed either on
> >>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
> >>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
> >>> 
> >>> Specifically, the kernel appears to be unable to write to the
> >>> virtio-vhost-user device's PCI registers.  I've included the full panic
> >>> output from the kernel at the end of this message.  The panic is
> >>> reproducible with two different kernels I tried (with different configs
> >>> and versions).  I tried both versions of the virtio-vhost-user I was
> >>> able to find[1][2], and both exhibited the same behaviour.
> >>> 
> >>> Is this a known issue?  Am I doing something wrong?
> >>
> >> Hi,
> >> Unfortunately I'm not sure what the issue is. This is an early
> >> virtio-pci register access before a driver for any specific device type
> >> (net, blk, vhost-user, etc) comes into play.
> >
> > Small update here: I tried on another computer, and it worked.  Made
> > sure that it was exactly the same QEMU binary, command line, and VM
> > disk/initrd/kernel, so I think I can fairly confidently say the panic
> > depends on what hardware QEMU is running on.  I set -cpu value to the
> > same on both as well (SandyBridge).
> >
> > I also discovered that it works on my primary computer (the one it
> > panicked on before) with KVM disabled.
> >
> > Note that I've only got so far as finding that it boots on the other
> > machine -- I haven't verified yet that it actually works.
> >
> > Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> > Good host CPU: AMD EPYC 7401P 24-Core Processor
> >
> > May I ask what host CPUs other people have tested this on?  Having more
> > data would probably be useful.  Could it be an AMD vs. Intel thing?
> 
> I think I've figured it out!
> 
> Sandy Bridge and Ivy Bridge hosts encounter this panic because the
> "additional resources" bar size is too big, at 1 << 36.  If I change
> this to 1 << 35, no more kernel panic.
> 
> Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
> Skylake were Haswell and Broadwell, but I couldn't find anybody who was
> able to help me test on either of those, so I don't know what they do.
> 
> Perhaps related, the hosts that produce panics all seem to have a
> physical address size of 36 bits, while the hosts that work have larger
> physical address sizes, as reported by lscpu.

I have run it successfully on Broadwell but never tried 64GB or larger
shared memory resources.

Stefan


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-24 12:32       ` Stefan Hajnoczi
@ 2020-07-24 21:56         ` Alyssa Ross
  2020-07-27 10:00           ` Stefan Hajnoczi
  0 siblings, 1 reply; 8+ messages in thread
From: Alyssa Ross @ 2020-07-24 21:56 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Fri, Jul 24, 2020 at 10:58:45AM +0000, Alyssa Ross wrote:
>> Alyssa Ross <hi@alyssa.is> writes:
>> 
>> > Stefan Hajnoczi <stefanha@redhat.com> writes:
>> >
>> >> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>> >>> Hi -- I hope it's okay me reaching out like this.
>> >>> 
>> >>> I've been trying to test out the virtio-vhost-user implementation that's
>> >>> been posted to this list a couple of times, but have been unable to get
>> >>> it to boot a kernel following the steps listed either on
>> >>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
>> >>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>> >>> 
>> >>> Specifically, the kernel appears to be unable to write to the
>> >>> virtio-vhost-user device's PCI registers.  I've included the full panic
>> >>> output from the kernel at the end of this message.  The panic is
>> >>> reproducible with two different kernels I tried (with different configs
>> >>> and versions).  I tried both versions of the virtio-vhost-user I was
>> >>> able to find[1][2], and both exhibited the same behaviour.
>> >>> 
>> >>> Is this a known issue?  Am I doing something wrong?
>> >>
>> >> Hi,
>> >> Unfortunately I'm not sure what the issue is. This is an early
>> >> virtio-pci register access before a driver for any specific device type
>> >> (net, blk, vhost-user, etc) comes into play.
>> >
>> > Small update here: I tried on another computer, and it worked.  Made
>> > sure that it was exactly the same QEMU binary, command line, and VM
>> > disk/initrd/kernel, so I think I can fairly confidently say the panic
>> > depends on what hardware QEMU is running on.  I set -cpu value to the
>> > same on both as well (SandyBridge).
>> >
>> > I also discovered that it works on my primary computer (the one it
>> > panicked on before) with KVM disabled.
>> >
>> > Note that I've only got so far as finding that it boots on the other
>> > machine -- I haven't verified yet that it actually works.
>> >
>> > Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
>> > Good host CPU: AMD EPYC 7401P 24-Core Processor
>> >
>> > May I ask what host CPUs other people have tested this on?  Having more
>> > data would probably be useful.  Could it be an AMD vs. Intel thing?
>> 
>> I think I've figured it out!
>> 
>> Sandy Bridge and Ivy Bridge hosts encounter this panic because the
>> "additional resources" bar size is too big, at 1 << 36.  If I change
>> this to 1 << 35, no more kernel panic.
>> 
>> Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
>> Skylake were Haswell and Broadwell, but I couldn't find anybody who was
>> able to help me test on either of those, so I don't know what they do.
>> 
>> Perhaps related, the hosts that produce panics all seem to have a
>> physical address size of 36 bits, while the hosts that work have larger
>> physical address sizes, as reported by lscpu.
>
> I have run it successfully on Broadwell but never tried 64GB or larger
> shared memory resources.

To clarify, I haven't been using big shared memory resources either --
this has all been about getting the backend VM to start at all.  The
panic happens at boot, and the 1 << 36 BAR allocation comes from here,
during realization:
https://github.com/ndragazis/qemu/blob/f9ab08c0c8/hw/virtio/virtio-vhost-user-pci.c#L291
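
If the link moves, the same spot can be inspected locally with
something like:

    $ git clone --depth 1 --branch virtio-vhost-user https://github.com/ndragazis/qemu
    $ sed -n '285,295p' qemu/hw/virtio/virtio-vhost-user-pci.c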


* Re: Testing the virtio-vhost-user QEMU patch
  2020-07-24 21:56         ` Alyssa Ross
@ 2020-07-27 10:00           ` Stefan Hajnoczi
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-07-27 10:00 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: Nikos Dragazis, qemu-devel

On Fri, Jul 24, 2020 at 09:56:53PM +0000, Alyssa Ross wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > On Fri, Jul 24, 2020 at 10:58:45AM +0000, Alyssa Ross wrote:
> >> Alyssa Ross <hi@alyssa.is> writes:
> >> 
> >> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> >> >
> >> >> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
> >> >>> Hi -- I hope it's okay me reaching out like this.
> >> >>> 
> >> >>> I've been trying to test out the virtio-vhost-user implementation that's
> >> >>> been posted to this list a couple of times, but have been unable to get
> >> >>> it to boot a kernel following the steps listed either on
> >> >>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
> >> >>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
> >> >>> 
> >> >>> Specifically, the kernel appears to be unable to write to the
> >> >>> virtio-vhost-user device's PCI registers.  I've included the full panic
> >> >>> output from the kernel at the end of this message.  The panic is
> >> >>> reproducible with two different kernels I tried (with different configs
> >> >>> and versions).  I tried both versions of the virtio-vhost-user I was
> >> >>> able to find[1][2], and both exhibited the same behaviour.
> >> >>> 
> >> >>> Is this a known issue?  Am I doing something wrong?
> >> >>
> >> >> Hi,
> >> >> Unfortunately I'm not sure what the issue is. This is an early
> >> >> virtio-pci register access before a driver for any specific device type
> >> >> (net, blk, vhost-user, etc) comes into play.
> >> >
> >> > Small update here: I tried on another computer, and it worked.  Made
> >> > sure that it was exactly the same QEMU binary, command line, and VM
> >> > disk/initrd/kernel, so I think I can fairly confidently say the panic
> >> > depends on what hardware QEMU is running on.  I set -cpu value to the
> >> > same on both as well (SandyBridge).
> >> >
> >> > I also discovered that it works on my primary computer (the one it
> >> > panicked on before) with KVM disabled.
> >> >
> >> > Note that I've only got so far as finding that it boots on the other
> >> > machine -- I haven't verified yet that it actually works.
> >> >
> >> > Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> >> > Good host CPU: AMD EPYC 7401P 24-Core Processor
> >> >
> >> > May I ask what host CPUs other people have tested this on?  Having more
> >> > data would probably be useful.  Could it be an AMD vs. Intel thing?
> >> 
> >> I think I've figured it out!
> >> 
> >> Sandy Bridge and Ivy Bridge hosts encounter this panic because the
> >> "additional resources" bar size is too big, at 1 << 36.  If I change
> >> this to 1 << 35, no more kernel panic.
> >> 
> >> Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
> >> Skylake were Haswell and Broadwell, but I couldn't find anybody who was
> >> able to help me test on either of those, so I don't know what they do.
> >> 
> >> Perhaps related, the hosts that produce panics all seem to have a
> >> physical address size of 36 bits, while the hosts that work have larger
> >> physical address sizes, as reported by lscpu.
> >
> > I have run it successfully on Broadwell but never tried 64GB or larger
> > shared memory resources.
> 
> To clarify, I haven't been using big shared memory resources either --
> this has all been about getting the backend VM to start at all.  The
> panic happens at boot, and the 1 << 36 BAR allocation comes from here,
> during realization:
> https://github.com/ndragazis/qemu/blob/f9ab08c0c8/hw/virtio/virtio-vhost-user-pci.c#L291

Okay, then that worked on Broadwell :)

Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
https://ark.intel.com/content/www/us/en/ark/products/85215/intel-core-i7-5600u-processor-4m-cache-up-to-3-20-ghz.html

Stefan
