* [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability
@ 2019-04-10  2:55 贞贵李
  2019-04-13  2:04 ` [Qemu-devel] [Bug 1824053] " 贞贵李
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-10  2:55 UTC (permalink / raw)
  To: qemu-devel

Public bug reported:

Hi, I found that qemu-img convert occasionally appears to hang on an
aarch64 host.

The command line is "qemu-img convert -f qcow2 -O raw disk.qcow2
disk.raw".

The backtrace is below:

Thread 2 (Thread 0x40000b776e50 (LWP 27215)):
#0  0x000040000a3f2994 in sigtimedwait () from /lib64/libc.so.6
#1  0x000040000a39c60c in sigwait () from /lib64/libpthread.so.0
#2  0x0000aaaaaae82610 in sigwait_compat (opaque=0xaaaac5163b00) at util/compatfd.c:37
#3  0x0000aaaaaae85038 in qemu_thread_start (args=args@entry=0xaaaac5163b90) at util/qemu_thread_posix.c:496
#4  0x000040000a3918bc in start_thread () from /lib64/libpthread.so.0
#5  0x000040000a492b2c in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x40000b573370 (LWP 27214)):
#0  0x000040000a489020 in ppoll () from /lib64/libc.so.6
#1  0x0000aaaaaadaefc0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at qemu_timer.c:391
#3  0x0000aaaaaadae014 in os_host_main_loop_wait (timeout=<optimized out>) at main_loop.c:272
#4  0x0000aaaaaadae190 in main_loop_wait (nonblocking=<optimized out>) at main_loop.c:534
#5  0x0000aaaaaad97be0 in convert_do_copy (s=0xffffdc32eb48) at qemu-img.c:1923
#6  0x0000aaaaaada2d70 in img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2414
#7  0x0000aaaaaad99ac4 in main (argc=7, argv=<optimized out>) at qemu-img.c:5305


The problem seems very similar to the one described by this patch, which
forces a main loop wakeup with SIGIO:
https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-ev/0025-aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch

However, that patch was later reverted by:
http://ovirt.repo.nfrance.com/src/qemu-kvm-ev/kvm-Revert-aio_notify-force-main-loop-wakeup-with-SIGIO-.patch

The problem still seems to exist on aarch64 hosts. The QEMU version I used is 2.8.1, and the host kernel version is 4.19.28-1.2.108.aarch64.
Do you have any suggestions for fixing it? Thanks for your reply!
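
For reference, frame #1 of thread 1 shows ppoll() called with
__timeout=0x0, so the main loop is waiting with no timeout and returns
only when a descriptor becomes ready or a signal interrupts the call. A
minimal sketch of that dependency, assuming Linux and using plain poll()
on an eventfd rather than any QEMU code (the helper name
poll_eventfd_once is hypothetical, and the 10 ms timeout exists only so
the sketch cannot hang):

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU code: create an eventfd, optionally signal
 * it, and report how many descriptors poll() sees as ready.
 * Returns -1 if the eventfd could not be created or written.
 */
static int poll_eventfd_once(int signal_first)
{
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (efd < 0) {
        return -1;
    }

    if (signal_first) {
        uint64_t one = 1;   /* the eventfd counter is an 8-byte value */
        if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(efd);
            return -1;
        }
    }

    struct pollfd pfd = { .fd = efd, .events = POLLIN };

    /*
     * 10 ms timeout so this sketch always returns. A negative timeout here
     * (the counterpart of the NULL timeout passed to ppoll() in the
     * backtrace) would block until the descriptor becomes readable or a
     * signal interrupts the call.
     */
    int ready = poll(&pfd, 1, 10);

    close(efd);
    return ready;
}
```

Calling poll_eventfd_once(1) reports one ready descriptor, while
poll_eventfd_once(0) times out with none ready. With an infinite timeout
the second case would never return, which matches the shape of the hang
in the backtrace.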

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1824053

Title:
  Qemu-img convert appears to be stuck on aarch64 host with low
  probability

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1824053/+subscriptions

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
@ 2019-04-13  2:04 ` 贞贵李
  2019-04-15  4:07 ` 贞贵李
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-13  2:04 UTC (permalink / raw)
  To: qemu-devel

Has anyone else seen a similar problem?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
  2019-04-13  2:04 ` [Qemu-devel] [Bug 1824053] " 贞贵李
@ 2019-04-15  4:07 ` 贞贵李
  2019-04-15 19:02 ` John Snow
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-15  4:07 UTC (permalink / raw)
  To: qemu-devel

I can't reproduce this problem with qemu.git/master. It seems to have
been fixed there.

But I haven't found which patch fixed this problem between QEMU version
2.8.1 and qemu.git/master.

Could anybody give me some suggestions? Thanks for your reply.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
  2019-04-13  2:04 ` [Qemu-devel] [Bug 1824053] " 贞贵李
  2019-04-15  4:07 ` 贞贵李
@ 2019-04-15 19:02 ` John Snow
  2019-04-16  5:31 ` Thomas Huth
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: John Snow @ 2019-04-15 19:02 UTC (permalink / raw)
  To: qemu-devel

Hi, unfortunately a lot has changed since 2.8, and it might be hard to
identify a single fix that is responsible for this; there are
aio_context fixes that go in nearly every version.

It may be quickest (unfortunately) to start git-bisecting the problem to
find which commit alleviates the behavior, and then see whether it is
something you can backport directly. You might find, though, that this
particular fix has a lot of prerequisites and is difficult to backport
to 2.8.1.

Best of luck,
--js


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (2 preceding siblings ...)
  2019-04-15 19:02 ` John Snow
@ 2019-04-16  5:31 ` Thomas Huth
  2019-04-16  7:53 ` 贞贵李
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Thomas Huth @ 2019-04-16  5:31 UTC (permalink / raw)
  To: qemu-devel

Marking this bug as fixed according to comment 2.

** Changed in: qemu
       Status: New => Fix Released


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (3 preceding siblings ...)
  2019-04-16  5:31 ` Thomas Huth
@ 2019-04-16  7:53 ` 贞贵李
  2019-04-16  8:08 ` Thomas Huth
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-16  7:53 UTC (permalink / raw)
  To: qemu-devel

dann frazier hit the same problem as me in
https://bugs.launchpad.net/qemu/+bug/1805256.

He said this bug still persists with the latest upstream (@ afccfc0).
His reply to me is below:

No, sorry - this bug still persists w/ latest upstream (@ afccfc0). I
found a report of similar symptoms:

  https://patchwork.kernel.org/patch/10047341/
  https://bugzilla.redhat.com/show_bug.cgi?id=1524770#c13

To be clear, ^ is already fixed upstream, so it is not the *same* issue
- but perhaps related.


** Bug watch added: Red Hat Bugzilla #1524770
   https://bugzilla.redhat.com/show_bug.cgi?id=1524770


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (4 preceding siblings ...)
  2019-04-16  7:53 ` 贞贵李
@ 2019-04-16  8:08 ` Thomas Huth
  2019-04-20  1:46 ` 贞贵李
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Thomas Huth @ 2019-04-16  8:08 UTC (permalink / raw)
  To: qemu-devel

Ok, we can track the bug reported by Dann Frazier in ticket 1805256
instead.

** Bug watch removed: Red Hat Bugzilla #1524770
   https://bugzilla.redhat.com/show_bug.cgi?id=1524770


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (5 preceding siblings ...)
  2019-04-16  8:08 ` Thomas Huth
@ 2019-04-20  1:46 ` 贞贵李
  2019-04-20  2:49 ` 贞贵李
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-20  1:46 UTC (permalink / raw)
  To: qemu-devel

** Changed in: qemu
       Status: Fix Released => Confirmed


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (6 preceding siblings ...)
  2019-04-20  1:46 ` 贞贵李
@ 2019-04-20  2:49 ` 贞贵李
  2019-04-20  2:55 ` 贞贵李
  2019-06-06 22:57 ` dann frazier
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-20  2:49 UTC (permalink / raw)
  To: qemu-devel

I can reproduce this problem with qemu.git/master, so it still exists
there. I found that when an I/O request completes in a worker thread, the
thread calls aio_notify to wake up the main loop, but it can find that
ctx->notify_me has already been cleared to 0 by the main loop in
aio_ctx_check (via atomic_and(&ctx->notify_me, ~1)). In that case the
worker thread does not write to the eventfd to notify the main loop. If
the following sequence occurs, the main loop hangs:

   main loop                          worker thread 1                          worker thread 2
--------------------------------------------------------------------------------------------------------------------
   qemu_poll_ns                       aio_worker
                                      qemu_bh_schedule(pool->completion_bh)
   glib_pollfds_poll
   g_main_context_check
   aio_ctx_check                                                               aio_worker
   atomic_and(&ctx->notify_me, ~1)
                                                                               qemu_bh_schedule(pool->completion_bh)
   /* do something for event */
   qemu_poll_ns
   /* hangs !!! */


As we know, ctx->notify_me is accessed by both the worker threads and the main loop. I think we should add lock protection for ctx->notify_me to prevent this from happening.
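
The interleaving described above can be replayed deterministically in a
single thread. The following is only a toy model under stated
assumptions, not QEMU code: struct toy_ctx and all toy_* functions are
hypothetical names, the eventfd is replaced by a counter, and the toy
main loop handles only the work it saw at check time and does not look
for pending bottom halves again before blocking. It says nothing about
memory ordering on aarch64.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the fields discussed above; not QEMU's AioContext. */
struct toy_ctx {
    atomic_int notify_me;   /* nonzero while the main loop wants a wakeup */
    int eventfd_writes;     /* wakeups that would have been written */
    int pending_bh;         /* scheduled bottom halves not yet handled */
};

/* Worker side: queue work, then notify only if notify_me is set. */
static void toy_bh_schedule(struct toy_ctx *ctx)
{
    ctx->pending_bh++;
    if (atomic_load(&ctx->notify_me) != 0) {
        ctx->eventfd_writes++;
    }
}

/* Main loop, before blocking in poll: ask to be notified. */
static void toy_prepare(struct toy_ctx *ctx)
{
    atomic_fetch_or(&ctx->notify_me, 1);
}

/* Main loop, after poll returns: clear notify_me, consume the wakeups,
 * and return how much work was visible at this point. */
static int toy_check(struct toy_ctx *ctx)
{
    atomic_fetch_and(&ctx->notify_me, ~1);
    ctx->eventfd_writes = 0;
    return ctx->pending_bh;
}

/* Main loop: handle only the work that was seen at check time. */
static void toy_dispatch(struct toy_ctx *ctx, int seen)
{
    ctx->pending_bh -= seen;
}

/* Returns true if the replay ends with the main loop about to block
 * with no wakeup queued while work is still pending. */
static bool toy_replay(bool worker2_after_clear)
{
    struct toy_ctx ctx;
    atomic_init(&ctx.notify_me, 0);
    ctx.eventfd_writes = 0;
    ctx.pending_bh = 0;

    toy_prepare(&ctx);               /* main loop: first qemu_poll_ns */
    toy_bh_schedule(&ctx);           /* worker 1: notify_me set, writes */
    if (!worker2_after_clear) {
        toy_bh_schedule(&ctx);       /* worker 2 before the clear: writes */
    }
    int seen = toy_check(&ctx);      /* main loop: clears notify_me */
    if (worker2_after_clear) {
        toy_bh_schedule(&ctx);       /* worker 2 after the clear: no write */
    }
    toy_dispatch(&ctx, seen);        /* main loop: do something for event */
    toy_prepare(&ctx);               /* main loop: second qemu_poll_ns */

    return ctx.eventfd_writes == 0 && ctx.pending_bh > 0;
}
```

In this model toy_replay(true) follows the sequence in the table and ends
with work pending and no wakeup queued, while toy_replay(false), where
worker thread 2 schedules before notify_me is cleared, does not.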


* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (7 preceding siblings ...)
  2019-04-20  2:49 ` 贞贵李
@ 2019-04-20  2:55 ` 贞贵李
  2019-06-06 22:57 ` dann frazier
  9 siblings, 0 replies; 11+ messages in thread
From: 贞贵李 @ 2019-04-20  2:55 UTC (permalink / raw)
  To: qemu-devel

** Description changed:

  Hi,  I found a problem that qemu-img convert appears to be stuck on
  aarch64 host with low probability.
  
  The convert command  line is  "qemu-img convert -f qcow2 -O raw
  disk.qcow2 disk.raw ".
  
  The bt is below:
  
  Thread 2 (Thread 0x40000b776e50 (LWP 27215)):
  #0  0x000040000a3f2994 in sigtimedwait () from /lib64/libc.so.6
  #1  0x000040000a39c60c in sigwait () from /lib64/libpthread.so.0
  #2  0x0000aaaaaae82610 in sigwait_compat (opaque=0xaaaac5163b00) at util/compatfd.c:37
  #3  0x0000aaaaaae85038 in qemu_thread_start (args=args@entry=0xaaaac5163b90) at util/qemu_thread_posix.c:496
  #4  0x000040000a3918bc in start_thread () from /lib64/libpthread.so.0
  #5  0x000040000a492b2c in thread_start () from /lib64/libc.so.6
  
  Thread 1 (Thread 0x40000b573370 (LWP 27214)):
  #0  0x000040000a489020 in ppoll () from /lib64/libc.so.6
  #1  0x0000aaaaaadaefc0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
  #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at qemu_timer.c:391
  #3  0x0000aaaaaadae014 in os_host_main_loop_wait (timeout=<optimized out>) at main_loop.c:272
  #4  0x0000aaaaaadae190 in main_loop_wait (nonblocking=<optimized out>) at main_loop.c:534
  #5  0x0000aaaaaad97be0 in convert_do_copy (s=0xffffdc32eb48) at qemu-img.c:1923
  #6  0x0000aaaaaada2d70 in img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2414
  #7  0x0000aaaaaad99ac4 in main (argc=7, argv=<optimized out>) at qemu-img.c:5305
  
- 
- The problem seems to be very similar to the phenomenon described by this patch (https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-ev/0025-aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch), 
+ The problem seems to be very similar to the phenomenon described by this
+ patch (https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-ev/0025
+ -aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch),
  
  which force main loop wakeup with SIGIO.  But this patch was reverted by
  the patch (http://ovirt.repo.nfrance.com/src/qemu-kvm-ev/kvm-Revert-
  aio_notify-force-main-loop-wakeup-with-SIGIO-.patch).
  
- The problem still seems to exist in aarch64 host. The qemu version I used is 2.8.1. The host version is 4.19.28-1.2.108.aarch64.
-  Do you have any solutions to fix it?  Thanks for your reply !
+ I can reproduce this problem with qemu.git/master, so it still exists there. I found that when an I/O request completes in
+ a worker thread and the worker calls aio_notify to wake up main_loop, it can find that ctx->notify_me has already been cleared to 0 by main_loop in aio_ctx_check via atomic_and(&ctx->notify_me, ~1). The worker thread then does not write to the eventfd to notify main_loop. If this happens, main_loop hangs:
+     main loop                        worker thread1                          worker thread2
+ ------------------------------------------------------------------------------------------------------------------
+     qemu_poll_ns                     aio_worker
+                                      qemu_bh_schedule(pool->completion_bh)
+     glib_pollfds_poll
+     g_main_context_check
+     aio_ctx_check                                                            aio_worker
+     atomic_and(&ctx->notify_me, ~1)                                          qemu_bh_schedule(pool->completion_bh)
+ 
+     /* do something for event */
+     qemu_poll_ns
+     /* hangs !!! */
+ 
+ As we know, ctx->notify_me is accessed by both the worker threads and
+ the main loop. I think we should add lock protection for
+ ctx->notify_me to prevent this from happening.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1824053

Title:
  Qemu-img convert appears to be stuck on aarch64 host with low
  probability

Status in QEMU:
  Confirmed

Bug description:
  Hi, I found that qemu-img convert occasionally (with low probability)
  gets stuck on an aarch64 host.

  The convert command line is "qemu-img convert -f qcow2 -O raw
  disk.qcow2 disk.raw".

  The backtrace is below:

  Thread 2 (Thread 0x40000b776e50 (LWP 27215)):
  #0  0x000040000a3f2994 in sigtimedwait () from /lib64/libc.so.6
  #1  0x000040000a39c60c in sigwait () from /lib64/libpthread.so.0
  #2  0x0000aaaaaae82610 in sigwait_compat (opaque=0xaaaac5163b00) at util/compatfd.c:37
  #3  0x0000aaaaaae85038 in qemu_thread_start (args=args@entry=0xaaaac5163b90) at util/qemu_thread_posix.c:496
  #4  0x000040000a3918bc in start_thread () from /lib64/libpthread.so.0
  #5  0x000040000a492b2c in thread_start () from /lib64/libc.so.6

  Thread 1 (Thread 0x40000b573370 (LWP 27214)):
  #0  0x000040000a489020 in ppoll () from /lib64/libc.so.6
  #1  0x0000aaaaaadaefc0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
  #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at qemu_timer.c:391
  #3  0x0000aaaaaadae014 in os_host_main_loop_wait (timeout=<optimized out>) at main_loop.c:272
  #4  0x0000aaaaaadae190 in main_loop_wait (nonblocking=<optimized out>) at main_loop.c:534
  #5  0x0000aaaaaad97be0 in convert_do_copy (s=0xffffdc32eb48) at qemu-img.c:1923
  #6  0x0000aaaaaada2d70 in img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2414
  #7  0x0000aaaaaad99ac4 in main (argc=7, argv=<optimized out>) at qemu-img.c:5305

  The problem seems to be very similar to the phenomenon described by
  this patch
  (https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-ev/0025-aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch),
  which forces a main loop wakeup with SIGIO. But that patch was
  reverted by this patch
  (http://ovirt.repo.nfrance.com/src/qemu-kvm-ev/kvm-Revert-aio_notify-force-main-loop-wakeup-with-SIGIO-.patch).

  I can reproduce this problem with qemu.git/master, so it still exists there. I found that when an I/O request completes in
  a worker thread and the worker calls aio_notify to wake up main_loop, it can find that ctx->notify_me has already been cleared to 0 by main_loop in aio_ctx_check via atomic_and(&ctx->notify_me, ~1). The worker thread then does not write to the eventfd to notify main_loop. If this happens, main_loop hangs:
      main loop                        worker thread1                          worker thread2
  ------------------------------------------------------------------------------------------------------------------
      qemu_poll_ns                     aio_worker
                                       qemu_bh_schedule(pool->completion_bh)
      glib_pollfds_poll
      g_main_context_check
      aio_ctx_check                                                            aio_worker
      atomic_and(&ctx->notify_me, ~1)                                          qemu_bh_schedule(pool->completion_bh)

      /* do something for event */
      qemu_poll_ns
      /* hangs !!! */

  As we know, ctx->notify_me is accessed by both the worker threads and
  the main loop. I think we should add lock protection for
  ctx->notify_me to prevent this from happening.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1824053/+subscriptions

* [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
  2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
                   ` (8 preceding siblings ...)
  2019-04-20  2:55 ` 贞贵李
@ 2019-06-06 22:57 ` dann frazier
  9 siblings, 0 replies; 11+ messages in thread
From: dann frazier @ 2019-06-06 22:57 UTC (permalink / raw)
  To: qemu-devel

*** This bug is a duplicate of bug 1805256 ***
    https://bugs.launchpad.net/bugs/1805256

** This bug has been marked a duplicate of bug 1805256
   qemu-img hangs on high core count ARM system

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1824053

Title:
  Qemu-img convert appears to be stuck on aarch64 host with low
  probability

Status in QEMU:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1824053/+subscriptions


Thread overview: 11+ messages
2019-04-10  2:55 [Qemu-devel] [Bug 1824053] [NEW] Qemu-img convert appears to be stuck on aarch64 host with low probability 贞贵李
2019-04-13  2:04 ` [Qemu-devel] [Bug 1824053] " 贞贵李
2019-04-15  4:07 ` 贞贵李
2019-04-15 19:02 ` John Snow
2019-04-16  5:31 ` Thomas Huth
2019-04-16  7:53 ` 贞贵李
2019-04-16  8:08 ` Thomas Huth
2019-04-20  1:46 ` 贞贵李
2019-04-20  2:49 ` 贞贵李
2019-04-20  2:55 ` 贞贵李
2019-06-06 22:57 ` dann frazier
