* Testing the virtio-vhost-user QEMU patch
@ 2020-07-21 7:14 UTC
From: Alyssa Ross
To: Nikos Dragazis, Stefan Hajnoczi
Cc: qemu-devel

Hi -- I hope it's okay me reaching out like this.

I've been trying to test out the virtio-vhost-user implementation that's
been posted to this list a couple of times, but have been unable to get
it to boot a kernel following the steps listed either on
<https://wiki.qemu.org/Features/VirtioVhostUser> or
<https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.

Specifically, the kernel appears to be unable to write to the
virtio-vhost-user device's PCI registers.  I've included the full panic
output from the kernel at the end of this message.  The panic is
reproducible with two different kernels I tried (with different configs
and versions).  I tried both versions of the virtio-vhost-user I was
able to find[1][2], and both exhibited the same behaviour.

Is this a known issue?  Am I doing something wrong?

Thanks in advance -- I'm excitedly following the progress of this
feature.
Alyssa Ross

[1]: https://github.com/ndragazis/qemu/commits/virtio-vhost-user
[2]: https://github.com/stefanha/qemu/commits/virtio-vhost-user

[    1.287979] BUG: unable to handle page fault for address: ffffb8ca40025014
[    1.288311] #PF: supervisor write access in kernel mode
[    1.288311] #PF: error_code(0x000b) - reserved bit violation
[    1.288311] PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE 8000002000000073
[    1.288311] Oops: 000b [#1] SMP PTI
[    1.288311] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.28 #1-NixOS
[    1.288311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[    1.288311] RIP: 0010:iowrite8+0xe/0x30
[    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
[    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
[    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
[    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
[    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
[    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
[    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
[    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
[    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
[    1.288311] Call Trace:
[    1.288311]  vp_reset+0x1b/0x50
[    1.288311]  register_virtio_device+0x74/0xe0
[    1.288311]  virtio_pci_probe+0xaf/0x140
[    1.288311]  local_pci_probe+0x42/0x80
[    1.288311]  pci_device_probe+0x104/0x1b0
[    1.288311]  really_probe+0x147/0x3c0
[    1.288311]  driver_probe_device+0xb6/0x100
[    1.288311]  device_driver_attach+0x53/0x60
[    1.288311]  __driver_attach+0x8a/0x150
[    1.288311]  ? device_driver_attach+0x60/0x60
[    1.288311]  bus_for_each_dev+0x78/0xc0
[    1.288311]  bus_add_driver+0x14d/0x1f0
[    1.288311]  driver_register+0x6c/0xc0
[    1.288311]  ? dma_bus_init+0xbf/0xbf
[    1.288311]  do_one_initcall+0x46/0x1f4
[    1.288311]  kernel_init_freeable+0x176/0x200
[    1.288311]  ? rest_init+0xab/0xab
[    1.288311]  kernel_init+0xa/0x105
[    1.288311]  ret_from_fork+0x35/0x40
[    1.288311] Modules linked in:
[    1.288311] CR2: ffffb8ca40025014
[    1.288311] ---[ end trace 5164b2fa531e028f ]---
[    1.288311] RIP: 0010:iowrite8+0xe/0x30
[    1.288311] Code: fe ff ff 48 c7 c0 ff ff ff ff c3 48 8b 3f 48 89 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 89 f8 48 89 f7 48 81 fe ff ff 3
[    1.288311] RSP: 0000:ffffb8ca40013cd8 EFLAGS: 00010292
[    1.288311] RAX: 0000000000000000 RBX: ffffb8ca40013d60 RCX: 0000000000000000
[    1.288311] RDX: 000000000000002f RSI: ffffb8ca40025014 RDI: ffffb8ca40025014
[    1.288311] RBP: ffff9c742ea20400 R08: ffff9c742f0a60af R09: 0000000000000000
[    1.288311] R10: 0000000000000018 R11: ffff9c742f0a60af R12: 0000000000000000
[    1.288311] R13: ffff9c742ea20410 R14: 0000000000000000 R15: 0000000000000000
[    1.288311] FS:  0000000000000000(0000) GS:ffff9c743b700000(0000) knlGS:0000000000000000
[    1.288311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.288311] CR2: ffffb8ca40025014 CR3: 0000000037a0a001 CR4: 0000000000060ee0
[    1.288311] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    1.288311] Kernel Offset: 0x21200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    1.288311] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-21 8:30 UTC
From: Stefan Hajnoczi
To: Alyssa Ross
Cc: Nikos Dragazis, qemu-devel

On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
> Hi -- I hope it's okay me reaching out like this.
>
> I've been trying to test out the virtio-vhost-user implementation that's
> been posted to this list a couple of times, but have been unable to get
> it to boot a kernel following the steps listed either on
> <https://wiki.qemu.org/Features/VirtioVhostUser> or
> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>
> Specifically, the kernel appears to be unable to write to the
> virtio-vhost-user device's PCI registers.  I've included the full panic
> output from the kernel at the end of this message.  The panic is
> reproducible with two different kernels I tried (with different configs
> and versions).  I tried both versions of the virtio-vhost-user I was
> able to find[1][2], and both exhibited the same behaviour.
>
> Is this a known issue?  Am I doing something wrong?

Hi,
Unfortunately I'm not sure what the issue is. This is an early
virtio-pci register access before a driver for any specific device type
(net, blk, vhost-user, etc) comes into play.

Did you test the git trees linked below or did you rebase the commits
on top of your own QEMU tree?

Is your guest kernel a stock kernel.org/distro kernel or has it been
modified (especially with security patches)?
If no one else knows what is wrong here then it will be necessary to
check the Intel manuals to figure out the exact meaning of
"error_code(0x000b) - reserved bit violation" and why Linux triggers it
with "PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE
8000002000000073".

Stefan

> [...]
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-21 16:02 UTC
From: Alyssa Ross
To: Stefan Hajnoczi
Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>> [...]
>>
>> Is this a known issue?  Am I doing something wrong?
>
> Hi,
> Unfortunately I'm not sure what the issue is. This is an early
> virtio-pci register access before a driver for any specific device type
> (net, blk, vhost-user, etc) comes into play.
>
> Did you test the git trees linked below or did you rebase the commits
> on top of your own QEMU tree?

I tested the git trees.  For your one I had to make a slight
modification to delete the memfd syscall wrapper in util/memfd.c, since
it conflicted with the one that is now provided by Glibc.  Nikos's tree
I used totally unmodified.

> Is your guest kernel a stock kernel.org/distro kernel or has it been
> modified (especially with security patches)?
I tried a slightly modified Chromium OS kernel (5.4.23), and a stock
Ubuntu 18.10 kernel (4.15.0).

I think the most "normal" setup I tried was building QEMU on Fedora 32,
and then attempting to boot a freshly installed Ubuntu Server 18.10 VM
with

    -chardev socket,id=chardev0,path=vhost-user.sock,server,nowait \
    -device virtio-vhost-user-pci,chardev=chardev0

(The crash was reproducible with the full QEMU command lines in the
write-ups, but these seemed to be the load-bearing bits.)

> If no one else knows what is wrong here then it will be necessary to
> check the Intel manuals to figure out the exact meaning of
> "error_code(0x000b) - reserved bit violation" and why Linux triggers it
> with "PGD 3b128067 P4D 3b128067 PUD 3b129067 PMD 3b12a067 PTE
> 8000002000000073".

Thanks for your insight.  Now I at least have a place to start if nobody
else knows what's up. :)
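The error code can in fact be decoded without digging through the
manuals: the x86 #PF error code is a bitmask (bit 0 = page was present,
bit 1 = write access, bit 2 = user mode, bit 3 = reserved bit set in a
paging-structure entry, bit 4 = instruction fetch).  A small sketch --
the helper name is made up for illustration:

```python
# Decode an x86 page-fault (#PF) error code.  Bit meanings:
#   bit 0: page present (protection/reserved-bit violation, not a miss)
#   bit 1: write access        bit 2: access from user mode
#   bit 3: reserved bit set in a paging-structure entry
#   bit 4: instruction fetch
def decode_pf_error(code):
    flags = []
    flags.append("present" if code & 0x1 else "not-present")
    flags.append("write" if code & 0x2 else "read")
    flags.append("user" if code & 0x4 else "supervisor")
    if code & 0x8:
        flags.append("reserved-bit")
    if code & 0x10:
        flags.append("instruction-fetch")
    return flags

# The error code from the panic above:
print(decode_pf_error(0x000b))
# -> ['present', 'write', 'supervisor', 'reserved-bit']
```

So 0x000b is a supervisor-mode write to a present page whose translation
has a reserved bit set, matching the "#PF: supervisor write access" and
"reserved bit violation" lines in the panic.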
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-23 22:27 UTC
From: Alyssa Ross
To: Stefan Hajnoczi
Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>> [...]
>
> Hi,
> Unfortunately I'm not sure what the issue is. This is an early
> virtio-pci register access before a driver for any specific device type
> (net, blk, vhost-user, etc) comes into play.

Small update here: I tried on another computer, and it worked.  Made
sure that it was exactly the same QEMU binary, command line, and VM
disk/initrd/kernel, so I think I can fairly confidently say the panic
depends on what hardware QEMU is running on.  I set -cpu value to the
same on both as well (SandyBridge).

I also discovered that it works on my primary computer (the one it
panicked on before) with KVM disabled.

Note that I've only got so far as finding that it boots on the other
machine -- I haven't verified yet that it actually works.

Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
Good host CPU: AMD EPYC 7401P 24-Core Processor

May I ask what host CPUs other people have tested this on?  Having more
data would probably be useful.  Could it be an AMD vs. Intel thing?
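Alongside the CPU model, the host's physical address width is an easy
datapoint to collect; on Linux it appears in /proc/cpuinfo and lscpu as
"address sizes".  A throwaway helper for gathering reports (the function
name and sample line are illustrative, not from any existing tool):

```python
import re

# Pull the physical address width out of /proc/cpuinfo text, which
# contains a line like "address sizes : 36 bits physical, 48 bits virtual".
def phys_addr_bits(cpuinfo_text):
    m = re.search(r"address sizes\s*:\s*(\d+)\s+bits physical", cpuinfo_text)
    return int(m.group(1)) if m else None

# Example line in the usual /proc/cpuinfo format:
sample = "address sizes\t: 36 bits physical, 48 bits virtual\n"
print(phys_addr_bits(sample))
# -> 36
```

On a real host this would be fed the contents of /proc/cpuinfo, e.g.
`phys_addr_bits(open("/proc/cpuinfo").read())`.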
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-24 10:58 UTC
From: Alyssa Ross
To: Stefan Hajnoczi
Cc: Nikos Dragazis, qemu-devel

Alyssa Ross <hi@alyssa.is> writes:

> Small update here: I tried on another computer, and it worked.  Made
> sure that it was exactly the same QEMU binary, command line, and VM
> disk/initrd/kernel, so I think I can fairly confidently say the panic
> depends on what hardware QEMU is running on.  I set -cpu value to the
> same on both as well (SandyBridge).
>
> I also discovered that it works on my primary computer (the one it
> panicked on before) with KVM disabled.
>
> Note that I've only got so far as finding that it boots on the other
> machine -- I haven't verified yet that it actually works.
>
> Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> Good host CPU: AMD EPYC 7401P 24-Core Processor
>
> May I ask what host CPUs other people have tested this on?  Having more
> data would probably be useful.  Could it be an AMD vs. Intel thing?

I think I've figured it out!

Sandy Bridge and Ivy Bridge hosts encounter this panic because the
"additional resources" BAR size is too big, at 1 << 36.  If I change
this to 1 << 35, no more kernel panic.

Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
Skylake were Haswell and Broadwell, but I couldn't find anybody who was
able to help me test on either of those, so I don't know what they do.

Perhaps related, the hosts that produce panics all seem to have a
physical address size of 36 bits, while the hosts that work have larger
physical address sizes, as reported by lscpu.
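A back-of-the-envelope check is consistent with this finding: a PCI BAR
is naturally aligned to its size, so a 1 << 36 BAR placed above the
32-bit MMIO window cannot end below 1 << 37, which is out of reach of a
36-bit physical address space, whereas a 1 << 35 BAR still fits.  A
sketch (the helper is illustrative; it assumes natural alignment and a
4 GiB floor for 64-bit BARs):

```python
# Smallest address range a naturally-aligned BAR can occupy when it
# must be placed at or above `floor` (e.g. 4 GiB, above 32-bit MMIO).
def min_bar_range(bar_size, floor=1 << 32):
    base = -(-floor // bar_size) * bar_size  # round floor up to alignment
    return base, base + bar_size - 1

PHYS_BITS = 36  # the "bad" Sandy Bridge / Ivy Bridge hosts
for shift in (35, 36):
    base, end = min_bar_range(1 << shift)
    fits = end < (1 << PHYS_BITS)
    print(f"BAR size 1 << {shift}: ends at {hex(end)}, "
          f"fits in {PHYS_BITS}-bit phys: {fits}")
# -> BAR size 1 << 35: ends at 0xfffffffff, fits in 36-bit phys: True
# -> BAR size 1 << 36: ends at 0x1ffffffffff, fits in 36-bit phys: False
```

That would also fit the PTE in the panic, 8000002000000073, which has a
physical-address bit above bit 36 set -- exactly the kind of bit that is
reserved on a host with only 36 physical address bits.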
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-24 12:32 UTC
From: Stefan Hajnoczi
To: Alyssa Ross
Cc: Nikos Dragazis, qemu-devel

On Fri, Jul 24, 2020 at 10:58:45AM +0000, Alyssa Ross wrote:
> [...]
>
> I think I've figured it out!
>
> Sandy Bridge and Ivy Bridge hosts encounter this panic because the
> "additional resources" bar size is too big, at 1 << 36.  If I change
> this to 1 << 35, no more kernel panic.
>
> Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
> Skylake were Haswell and Broadwell, but I couldn't find anybody who was
> able to help me test on either of those, so I don't know what they do.
>
> Perhaps related, the hosts that produce panics all seem to have a
> physical address size of 36 bits, while the hosts that work have larger
> physical address sizes, as reported by lscpu.

I have run it successfully on Broadwell but never tried 64GB or larger
shared memory resources.

Stefan
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-24 21:56 UTC
From: Alyssa Ross
To: Stefan Hajnoczi
Cc: Nikos Dragazis, qemu-devel

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Fri, Jul 24, 2020 at 10:58:45AM +0000, Alyssa Ross wrote:
>> [...]
>>
>> Perhaps related, the hosts that produce panics all seem to have a
>> physical address size of 36 bits, while the hosts that work have
>> larger physical address sizes, as reported by lscpu.
>
> I have run it successfully on Broadwell but never tried 64GB or larger
> shared memory resources.

To clarify, I haven't been using big shared memory resources either --
this has all been about getting the backend VM to start at all.  The
panic happens at boot, and the 1 << 36 BAR allocation comes from here,
during realization:
https://github.com/ndragazis/qemu/blob/f9ab08c0c8/hw/virtio/virtio-vhost-user-pci.c#L291
* Re: Testing the virtio-vhost-user QEMU patch
@ 2020-07-27 10:00 UTC
From: Stefan Hajnoczi
To: Alyssa Ross
Cc: Nikos Dragazis, qemu-devel

On Fri, Jul 24, 2020 at 09:56:53PM +0000, Alyssa Ross wrote:
> [...]
>
> To clarify, I haven't been using big shared memory resources either --
> this has all been about getting the backend VM to start at all.  The
> panic happens at boot, and the 1 << 36 BAR allocation comes from here,
> during realization:
> https://github.com/ndragazis/qemu/blob/f9ab08c0c8/hw/virtio/virtio-vhost-user-pci.c#L291

Okay, then that worked on Broadwell :)

Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
https://ark.intel.com/content/www/us/en/ark/products/85215/intel-core-i7-5600u-processor-4m-cache-up-to-3-20-ghz.html

Stefan