qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet
@ 2020-03-10 19:39 Robert Henry
  2020-03-17  3:11 ` [Bug 1866892] " Robert Henry
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-10 19:39 UTC (permalink / raw)
  To: qemu-devel

Public bug reported:

The linux guest OS catches a page fault bug when running the dotnet
application.

host = metal = x86_64
host OS = ubuntu 19.10
qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
guest emulation = x86_64
guest OS = ubuntu 19.10
guest app = dotnet, running any program

qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
2020)

qemu invocation is:

qemu/build/x86_64-softmmu/qemu-system-x86_64 \
  -m size=4096 \
  -smp cpus=1 \
  -machine type=pc-i440fx-5.0,accel=tcg \
  -cpu Skylake-Server-v1 \
  -nographic \
  -bios OVMF-pure-efi.fd \
  -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
  -device virtio-blk,drive=hd0 \
  -drive if=none,id=cloud,file=linux_cloud_config.img \
  -device virtio-blk,drive=cloud \
  -netdev user,id=user0,hostfwd=tcp::2223-:22 \
  -device virtio-net,netdev=user0


Here's the guest kernel console output:


[ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
[ 2834.009895] #PF: supervisor read access in user mode
[ 2834.013872] #PF: error_code(0x0001) - permissions violation
[ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
[ 2834.022242] LDTR: NULL
[ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
[ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
[ 2834.038672] Oops: 0001 [#4] SMP PTI
[ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
[ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 2834.054785] RIP: 0033:0x1555547eaeda
[ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
[ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
[ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
[ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
[ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
[ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
[ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
[ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
[ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
[ 2834.122539] CR2: 00007fffffffc2c0
[ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
[ 2834.131239] RIP: 0033:0x14d793262eda
[ 2834.135715] Code: Bad RIP value.
[ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
[ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
[ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
[ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
[ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
[ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
[ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
[ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
[ 2834.178892] PKRU: 55555554

I run the application from a shell with `ulimit -s unlimited` (unlimited
stack to size).

The application creates a number of threads, and those threads make a
lot of calls to sigaltstack() and mprotect(); see the relevant source
for dotnet here
https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

using strace -f on the app shows that no alt stacks come anywhere near
the failing address; all alt stacks are in the heap, as expected.  None
of the mmap/mprotect/munmap syscalls were given arguments in the high
memory 0x7fffffff0000 and up.

gdb (with default signal stop/print/pass semantics) does not report any
signals prior to the kernel bug being tripped, so I doubt the alternate
signal stack is actually used.

When I run the same dotnet binary on the host (eg, on "bare metal"), the
host kernel seems happy and dotnet runs as expected.

I have not tried different qemu or guest or host O/S.

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
@ 2020-03-17  3:11 ` Robert Henry
  2020-03-19 18:48 ` Robert Henry
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-17  3:11 UTC (permalink / raw)
  To: qemu-devel

A simpler case seems to produce the same error.  See
https://bugs.launchpad.net/qemu/+bug/1824344

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
  2020-03-17  3:11 ` [Bug 1866892] " Robert Henry
@ 2020-03-19 18:48 ` Robert Henry
  2020-03-19 19:38 ` Peter Maydell
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-19 18:48 UTC (permalink / raw)
  To: qemu-devel

I have confirmed that the dotnet guest application is executing a
"iretq" instruction when this guest kernel bug is hit. A first round of
analysis shows nothing unreasonable at the point the iretq is executed.
The $rsp points into the middle of a mapped in page, the returned-to
$rip looks reasonable, etc. We continue our analysis of qemu and the
dotnet runtime.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
  2020-03-17  3:11 ` [Bug 1866892] " Robert Henry
  2020-03-19 18:48 ` Robert Henry
@ 2020-03-19 19:38 ` Peter Maydell
  2020-03-19 20:06 ` Robert Henry
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2020-03-19 19:38 UTC (permalink / raw)
  To: qemu-devel

Is dotnet intentionally doing an iret? It seems like an odd thing for a
userspace program to do, given it's basically "return from interrupt".

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (2 preceding siblings ...)
  2020-03-19 19:38 ` Peter Maydell
@ 2020-03-19 20:06 ` Robert Henry
  2020-03-19 20:35 ` Peter Maydell
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-19 20:06 UTC (permalink / raw)
  To: qemu-devel

yes, it is intentional.  I don't yet understand why, but am talking to
those who do.
https://github.com/dotnet/runtime/blob/1b02665be501b695b9c22c1ebd69148c07a225f6/src/coreclr/src/pal/src/arch/amd64/context2.S#L183

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (3 preceding siblings ...)
  2020-03-19 20:06 ` Robert Henry
@ 2020-03-19 20:35 ` Peter Maydell
  2020-03-24 20:43 ` Robert Henry
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2020-03-19 20:35 UTC (permalink / raw)
  To: qemu-devel

Thanks, that's pretty clear. I expect you'll find the bug is just that
QEMU doesn't get the semantics of an iret from userspace correct. The
helper_iret_protected() function is probably a good place to look.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (4 preceding siblings ...)
  2020-03-19 20:35 ` Peter Maydell
@ 2020-03-24 20:43 ` Robert Henry
  2020-03-25  1:07 ` Robert Henry
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-24 20:43 UTC (permalink / raw)
  To: qemu-devel

I've stepped/nexted from the helper_iret_protected, going deep into the
bowels of the TLB, MMU and page table engine.  None of which I
understand. The helper_ret_protected faults in the first POPQ_RA.  I'll
investigate the value of sp at the time of the POPQ_RA.

Here's the POPQ_RA in i386/seg_helper.c:2140

    sp = env->regs[R_ESP];
    ssp = env->segs[R_SS].base;
    new_eflags = 0; /* avoid warning */
#ifdef TARGET_X86_64
    if (shift == 2) {
        POPQ_RA(sp, new_eip, retaddr);
        POPQ_RA(sp, new_cs, retaddr);
        new_cs &= 0xffff;
        if (is_iret) {
            POPQ_RA(sp, new_eflags, retaddr);
        }

and here's the stack.  Note some of the logical intermediate frames are
optimized out due to -O3 and inline. (the value of env-errorcode is 1)

0  0x0000555555a370c0 in raise_interrupt2
    (env=env@entry=0x5555566ef200, intno=14, is_int=is_int@entry=0, error_code=1, next_eip_addend=next_eip_addend@entry=0, retaddr=retaddr@entry=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/include/exec/cpu-all.h:426
#1  0x0000555555a377f9 in raise_exception_err_ra
    (env=env@entry=0x5555566ef200, exception_index=<optimized out>, error_code=<optimized out>, retaddr=retaddr@entry=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/excp_helper.c:127
#2  0x0000555555a37d69 in x86_cpu_tlb_fill
    (cs=0x5555566e69a0, addr=140727872411616, size=<optimized out>, access_type=MMU_DATA_LOAD, mmu_idx=0, probe=<optimized out>, retaddr=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/excp_helper.c:697
#3  0x0000555555952295 in tlb_fill
    (cpu=0x5555566e69a0, addr=140727872411616, size=8, access_type=MMU_DATA_LOAD, mmu_idx=0, retaddr=140736367565663)
    at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1017
#4  0x0000555555956320 in load_helper
    (full_load=0x555555956140 <helper_le_ldq_mmu>, code_read=false, op=MO_64, retaddr=93825010692608, oi=48, addr=140727872411616, env=0x5555566ef200) at /mnt/robhenry/qemu_robhenry_amd64/qemu/include/exec/cpu-all.h:426
#5  0x0000555555956320 in helper_le_ldq_mmu
    (env=env@entry=0x5555566ef200, addr=addr@entry=140727872411616, oi=oi@entry=48, retaddr=retaddr@entry=140736367565663)
    at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1688
#6  0x0000555555956dc0 in cpu_load_helper
    (full_load=0x555555956140 <helper_le_ldq_mmu>, op=MO_64, retaddr=140736367565663, mmu_idx=<optimized out>, addr=140727872411616, env=0x5555566ef200) at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1752
#7  0x0000555555956dc0 in cpu_ldq_mmuidx_ra
    (env=env@entry=0x5555566ef200, addr=addr@entry=140727872411616, mmu_idx=<optimized out>, ra=ra@entry=140736367565663)
--Type <RET> for more, q to quit, c to continue without paging--
    at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1799
#8  0x0000555555a4ff09 in helper_ret_protected
    (env=env@entry=0x5555566ef200, shift=shift@entry=2, is_iret=is_iret@entry=1, addend=addend@entry=0, retaddr=140736367565663)
    at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/seg_helper.c:2140
#9  0x0000555555a50ff5 in helper_iret_protected (env=0x5555566ef200, shift=2, next_eip=-999377888)
    at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/seg_helper.c:2363
#10 0x00007fffbd321b5f in code_gen_buffer ()

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (5 preceding siblings ...)
  2020-03-24 20:43 ` Robert Henry
@ 2020-03-25  1:07 ` Robert Henry
  2020-03-25  9:14 ` Peter Maydell
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-25  1:07 UTC (permalink / raw)
  To: qemu-devel

Peter: I think your intuition is right.  The POPQ_RA (pop quad, passing
through return address handle) is only called from helper_ret_protected,
and it suspiciously calls cpu_ldq_kernel_ra which calls
cpu_mmu_index_kernel which only is prepared for kernel space iretq (and
of course the substring _kernel in the function name tells us that too).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (6 preceding siblings ...)
  2020-03-25  1:07 ` Robert Henry
@ 2020-03-25  9:14 ` Peter Maydell
  2020-03-25 15:49 ` Richard Henderson
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2020-03-25  9:14 UTC (permalink / raw)
  To: qemu-devel

That function is for "do a guest memory load as if from the kernel" (ie
with the permissions and page table settings that guest kernel-mode
accesses would use). We'd need to look at what the required semantics
for the instruction are for user-mode iretq: are the loads supposed to
be done with only the user-mode access and page tables?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (7 preceding siblings ...)
  2020-03-25  9:14 ` Peter Maydell
@ 2020-03-25 15:49 ` Richard Henderson
  2020-03-25 16:14 ` Robert Henry
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2020-03-25 15:49 UTC (permalink / raw)
  To: qemu-devel

Robert, please try replacing s/_kernel_/_data_/g.

That will use the current address space and permissions as opposed to
cpl=0 address space and permissions.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (8 preceding siblings ...)
  2020-03-25 15:49 ` Richard Henderson
@ 2020-03-25 16:14 ` Robert Henry
  2021-05-05 17:41 ` Thomas Huth
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Robert Henry @ 2020-03-25 16:14 UTC (permalink / raw)
  To: qemu-devel

The change should only be dynamically visible when doing an iretq from
and to the same protection level, AFAICT. The code clearly[sic] works
now for the interrupt return that is used by the linux kernel,
presumably {from=kernel, to=kernel} or {from=kernel, to=user}.  I would
claim that to make this code correct it needs to work for all 4*4 state
changes, as windows (notoriously) uses all 4 protection levels.  I'm
still digging, but your suggestion is certainly on the path forward.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (9 preceding siblings ...)
  2020-03-25 16:14 ` Robert Henry
@ 2021-05-05 17:41 ` Thomas Huth
  2021-05-05 18:01 ` Peter Maydell
  2021-05-10  7:44 ` Thomas Huth
  12 siblings, 0 replies; 14+ messages in thread
From: Thomas Huth @ 2021-05-05 17:41 UTC (permalink / raw)
  To: qemu-devel

The QEMU project is currently considering to move its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.

If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.

Thank you and sorry for the inconvenience.


** Changed in: qemu
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  Incomplete

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (10 preceding siblings ...)
  2021-05-05 17:41 ` Thomas Huth
@ 2021-05-05 18:01 ` Peter Maydell
  2021-05-10  7:44 ` Thomas Huth
  12 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2021-05-05 18:01 UTC (permalink / raw)
  To: qemu-devel

The QEMU code implementing the iret insn is still doing loads/stores as
if kernel mode, so this is still a bug.


** Changed in: qemu
       Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  Confirmed

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
  2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
                   ` (11 preceding siblings ...)
  2021-05-05 18:01 ` Peter Maydell
@ 2021-05-10  7:44 ` Thomas Huth
  12 siblings, 0 replies; 14+ messages in thread
From: Thomas Huth @ 2021-05-10  7:44 UTC (permalink / raw)
  To: qemu-devel

This is an automated cleanup. This bug report has been moved to QEMU's
new bug tracker on gitlab.com and thus gets marked as 'expired' now.
Please continue with the discussion here:

 https://gitlab.com/qemu-project/qemu/-/issues/249


** Changed in: qemu
       Status: Confirmed => Expired

** Bug watch added: gitlab.com/qemu-project/qemu/-/issues #249
   https://gitlab.com/qemu-project/qemu/-/issues/249

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page  fault bug when running dotnet

Status in QEMU:
  Expired

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet
  application.

  host = metal = x86_64
  host OS = ubuntu 19.10
  qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
  guest emulation = x86_64
  guest OS = ubuntu 19.10
  guest app = dotnet, running any program

  qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10,
  2020)

  qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

  
  Here's the guest kernel console output:

  
  [ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
  [ 2834.009895] #PF: supervisor read access in user mode
  [ 2834.013872] #PF: error_code(0x0001) - permissions violation
  [ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
  [ 2834.022242] LDTR: NULL
  [ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
  [ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
  [ 2834.038672] Oops: 0001 [#4] SMP PTI
  [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G      D           5.3.0-29-generic #31-Ubuntu
  [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  [ 2834.054785] RIP: 0033:0x1555547eaeda
  [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
  [ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
  [ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
  [ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
  [ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
  [ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
  [ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
  [ 2834.097309] FS:  0000155554fa5740 GS:  0000000000000000
  [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
  [ 2834.122539] CR2: 00007fffffffc2c0
  [ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
  [ 2834.131239] RIP: 0033:0x14d793262eda
  [ 2834.135715] Code: Bad RIP value.
  [ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
  [ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
  [ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
  [ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
  [ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
  [ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
  [ 2834.166239] FS:  0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
  [ 2834.170529] CS:  0033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
  [ 2834.178892] PKRU: 55555554

  I run the application from a shell with `ulimit -s unlimited`
  (unlimited stack to size).

  The application creates a number of threads, and those threads make a
  lot of calls to sigaltstack() and mprotect(); see the relevant source
  for dotnet here
  https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

  using strace -f on the app shows that no alt stacks come anywhere near
  the failing address; all alt stacks are in the heap, as expected.
  None of the mmap/mprotect/munmap syscalls were given arguments in the
  high memory 0x7fffffff0000 and up.

  gdb (with default signal stop/print/pass semantics) does not report
  any signals prior to the kernel bug being tripped, so I doubt the
  alternate signal stack is actually used.

  When I run the same dotnet binary on the host (eg, on "bare metal"),
  the host kernel seems happy and dotnet runs as expected.

  I have not tried different qemu or guest or host O/S.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-05-10  7:53 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-10 19:39 [Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet Robert Henry
2020-03-17  3:11 ` [Bug 1866892] " Robert Henry
2020-03-19 18:48 ` Robert Henry
2020-03-19 19:38 ` Peter Maydell
2020-03-19 20:06 ` Robert Henry
2020-03-19 20:35 ` Peter Maydell
2020-03-24 20:43 ` Robert Henry
2020-03-25  1:07 ` Robert Henry
2020-03-25  9:14 ` Peter Maydell
2020-03-25 15:49 ` Richard Henderson
2020-03-25 16:14 ` Robert Henry
2021-05-05 17:41 ` Thomas Huth
2021-05-05 18:01 ` Peter Maydell
2021-05-10  7:44 ` Thomas Huth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).