All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] QEMU hangs in main_loop_wait
@ 2018-10-17 15:44 Frank Yang
  2018-10-17 17:03 ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Frank Yang @ 2018-10-17 15:44 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

Hi all,

After quite some backports, such as

https://android.googlesource.com/platform/external/qemu/+/a8da859b1b011e509056f03cbcb73df27afe4337

we are not seeing hangs in flatview_do_translate, but are still getting
hangs in main-loop:

    qemu_mutex_unlock_iothread();
    replay_mutex_unlock();

    ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);
<-------------this seems to be the hanging part

    replay_mutex_lock();
    qemu_mutex_lock_iothread();

Any idea how to diagnose the cause of taking too much time in g_poll in
qemu_poll_ns?

Frank

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] QEMU hangs in main_loop_wait
  2018-10-17 15:44 [Qemu-devel] QEMU hangs in main_loop_wait Frank Yang
@ 2018-10-17 17:03 ` Paolo Bonzini
  2018-10-23 15:02   ` Frank Yang
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2018-10-17 17:03 UTC (permalink / raw)
  To: Frank Yang, qemu-devel

On 17/10/2018 17:44, Frank Yang wrote:
> 
> we are not seeing hangs in flatview_do_translate, but are still getting
> hangs in main-loop:
> 
>     qemu_mutex_unlock_iothread();
>     replay_mutex_unlock();
> 
>     ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
> timeout); <-------------this seems to be the hanging part
> 
>     replay_mutex_lock();
>     qemu_mutex_lock_iothread();
> 
> Any idea how to diagnose the cause of taking too much time in g_poll in
> qemu_poll_ns?

Is there any event (timer or file descriptor) that you know should be
ready, but is not causing a wakeup?

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] QEMU hangs in main_loop_wait
  2018-10-17 17:03 ` Paolo Bonzini
@ 2018-10-23 15:02   ` Frank Yang
  2018-10-23 15:33     ` Frank Yang
  0 siblings, 1 reply; 4+ messages in thread
From: Frank Yang @ 2018-10-23 15:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

We don't yet have visibility into that, but is there a way to enumerate
what callbacks are running? We see that sometimes, actually qemu_poll_ns
takes much, much longer than the specified timeout to return:

    qemu_mutex_unlock_iothread();
    replay_mutex_unlock();

    struct timeval tv;
    gettimeofday(&tv, NULL);

    uint64_t start_us = tv.tv_usec + tv.tv_sec * 1000000LL;
    uint64_t duration = 0;
        ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
timeout);
        gettimeofday(&tv, NULL);
        uint64_t end_us = tv.tv_usec + tv.tv_sec * 1000000LL;
        duration = end_us - start_us;

    if (duration > 100000LL) {
        fprintf(stderr, "%s: polled for %f ms (timeout: %f ms) ret: %d
nfds: %d\n", __func__, duration / 1000.0f, timeout / 1000000.0f, ret,
gpollfds->len);
    }

prints:

os_host_main_loop_wait: polled for 580.517029 ms (timeout: 6.150904 ms)
ret: 1 nfds: 50
os_host_main_loop_wait: polled for 10001.063477 ms (timeout: 1.277816 ms)
ret: 0 nfds: 50
os_host_main_loop_wait: polled for 792.466003 ms (timeout: 0.149536 ms)
ret: 1 nfds: 47
os_host_main_loop_wait: polled for 8359.308594 ms (timeout: 1.791392 ms)
ret: 1 nfds: 47
os_host_main_loop_wait: polled for 2478.096924 ms (timeout: 5.850920 ms)
ret: 1 nfds: 47
os_host_main_loop_wait: polled for 139.272995 ms (timeout: 5.442976 ms)
ret: 1 nfds: 47
os_host_main_loop_wait: polled for 10001.864258 ms (timeout: 1.946600 ms)
ret: 0 nfds: 47


On Wed, Oct 17, 2018 at 10:03 AM Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 17/10/2018 17:44, Frank Yang wrote:
> >
> > we are not seeing hangs in flatview_do_translate, but are still getting
> > hangs in main-loop:
> >
> >     qemu_mutex_unlock_iothread();
> >     replay_mutex_unlock();
> >
> >     ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
> > timeout); <-------------this seems to be the hanging part
> >
> >     replay_mutex_lock();
> >     qemu_mutex_lock_iothread();
> >
> > Any idea how to diagnose the cause of taking too much time in g_poll in
> > qemu_poll_ns?
>
> Is there any event (timer or file descriptor) that you know should be
> ready, but is not causing a wakeup?
>
> Paolo
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] QEMU hangs in main_loop_wait
  2018-10-23 15:02   ` Frank Yang
@ 2018-10-23 15:33     ` Frank Yang
  0 siblings, 0 replies; 4+ messages in thread
From: Frank Yang @ 2018-10-23 15:33 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Also, those only happen when using the browser in the guest; leave it on a
single webpage, and get these long timeouts. We are using slirp.

On Tue, Oct 23, 2018 at 8:02 AM Frank Yang <lfy@google.com> wrote:

> We don't yet have visibility into that, but is there a way to enumerate
> what callbacks are running? We see that sometimes, actually qemu_poll_ns
> takes much, much longer than the specified timeout to return:
>
>     qemu_mutex_unlock_iothread();
>     replay_mutex_unlock();
>
>     struct timeval tv;
>     gettimeofday(&tv, NULL);
>
>     uint64_t start_us = tv.tv_usec + tv.tv_sec * 1000000LL;
>     uint64_t duration = 0;
>         ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
> timeout);
>         gettimeofday(&tv, NULL);
>         uint64_t end_us = tv.tv_usec + tv.tv_sec * 1000000LL;
>         duration = end_us - start_us;
>
>     if (duration > 100000LL) {
>         fprintf(stderr, "%s: polled for %f ms (timeout: %f ms) ret: %d
> nfds: %d\n", __func__, duration / 1000.0f, timeout / 1000000.0f, ret,
> gpollfds->len);
>     }
>
> prints:
>
> os_host_main_loop_wait: polled for 580.517029 ms (timeout: 6.150904 ms)
> ret: 1 nfds: 50
> os_host_main_loop_wait: polled for 10001.063477 ms (timeout: 1.277816 ms)
> ret: 0 nfds: 50
> os_host_main_loop_wait: polled for 792.466003 ms (timeout: 0.149536 ms)
> ret: 1 nfds: 47
> os_host_main_loop_wait: polled for 8359.308594 ms (timeout: 1.791392 ms)
> ret: 1 nfds: 47
> os_host_main_loop_wait: polled for 2478.096924 ms (timeout: 5.850920 ms)
> ret: 1 nfds: 47
> os_host_main_loop_wait: polled for 139.272995 ms (timeout: 5.442976 ms)
> ret: 1 nfds: 47
> os_host_main_loop_wait: polled for 10001.864258 ms (timeout: 1.946600 ms)
> ret: 0 nfds: 47
>
>
> On Wed, Oct 17, 2018 at 10:03 AM Paolo Bonzini <pbonzini@redhat.com>
> wrote:
>
>> On 17/10/2018 17:44, Frank Yang wrote:
>> >
>> > we are not seeing hangs in flatview_do_translate, but are still getting
>> > hangs in main-loop:
>> >
>> >     qemu_mutex_unlock_iothread();
>> >     replay_mutex_unlock();
>> >
>> >     ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
>> > timeout); <-------------this seems to be the hanging part
>> >
>> >     replay_mutex_lock();
>> >     qemu_mutex_lock_iothread();
>> >
>> > Any idea how to diagnose the cause of taking too much time in g_poll in
>> > qemu_poll_ns?
>>
>> Is there any event (timer or file descriptor) that you know should be
>> ready, but is not causing a wakeup?
>>
>> Paolo
>>
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-10-23 15:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-17 15:44 [Qemu-devel] QEMU hangs in main_loop_wait Frank Yang
2018-10-17 17:03 ` Paolo Bonzini
2018-10-23 15:02   ` Frank Yang
2018-10-23 15:33     ` Frank Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.