* [Qemu-devel] Question: an IO hang problem
@ 2018-03-13  9:38 sochin.jiang
  2018-03-15  6:37 ` Fam Zheng
  0 siblings, 1 reply; 2+ messages in thread
From: sochin.jiang @ 2018-03-13  9:38 UTC (permalink / raw)
  To: kwolf, mreitz
  Cc: qemu-block, qemu-devel, Subo (A), Lulina (A), Fangyi (C), sochin.jiang


 Hi, guys,

 Recently I hit an occasional I/O hang that I can no longer reproduce.

 I analyzed the problem carefully; the critical stack is as follows:


After reading the code in linux-aio.c (see the ioq_submit() function), I found two situations that could lead here:

1) no AIOs are in flight (s->io_q.in_flight is 0) and another call to io_submit() returns -EAGAIN

2) no AIOs are in flight (s->io_q.in_flight is 0) and the s->io_q.pending requests reach MAX_EVENTS at once

In both situations, the do{...}while loop breaks out and sets s->io_q.blocked to true.

After that, the AIO completion callback is never called, and neither is ioq_submit(), so all pending requests hang.
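
To make the reasoning concrete, here is a toy model of the stall described above. It is only a sketch under the premise stated in this message: the struct and the toy_* helpers are hypothetical stand-ins, not the real code in linux-aio.c, and the `accepted` argument merely simulates how many requests io_submit() took (0 standing in for -EAGAIN).

```c
#include <stdbool.h>

/* Toy stand-in for the queue state named above; not QEMU's struct. */
struct toy_queue {
    int in_flight;   /* requests the kernel has accepted */
    int pending;     /* requests still waiting to be submitted */
    bool blocked;    /* set when submission stopped with work left over */
};

/* Hypothetical submit step. `accepted` simulates the io_submit() result;
 * 0 models the -EAGAIN case. */
static void toy_submit(struct toy_queue *q, int accepted)
{
    if (accepted > q->pending) {
        accepted = q->pending;
    }
    q->pending -= accepted;
    q->in_flight += accepted;
    q->blocked = q->pending > 0;
}

/* Hypothetical completion step, modeled as the only place that retries
 * submission. Returns false when nothing is in flight, i.e. when no
 * completion event can ever arrive. */
static bool toy_complete_one(struct toy_queue *q, int accepted_on_retry)
{
    if (q->in_flight == 0) {
        return false;
    }
    q->in_flight--;
    if (q->blocked) {
        q->blocked = false;
        toy_submit(q, accepted_on_retry);
    }
    return true;
}

/* The stall condition: work is queued, the queue is blocked, and there is
 * no in-flight request whose completion could unblock it. */
static bool toy_is_stalled(const struct toy_queue *q)
{
    return q->blocked && q->in_flight == 0 && q->pending > 0;
}
```

In this model, starting from in_flight == 0 with four pending requests and calling toy_submit(&q, 0) leaves toy_is_stalled(&q) true, and toy_complete_one() can never run to clear it. If even one request had been accepted, its completion would retry the rest.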


Is there a proper way to fix this without affecting (stalling) the guest?

Hoping for a reply, thanks.


Sochin.


* Re: [Qemu-devel] Question: an IO hang problem
  2018-03-13  9:38 [Qemu-devel] Question: an IO hang problem sochin.jiang
@ 2018-03-15  6:37 ` Fam Zheng
  0 siblings, 0 replies; 2+ messages in thread
From: Fam Zheng @ 2018-03-15  6:37 UTC (permalink / raw)
  To: sochin.jiang
  Cc: kwolf, mreitz, Lulina (A), qemu-block, Subo (A), Fangyi (C), qemu-devel

On Tue, 03/13 17:38, sochin.jiang wrote:
> 
>  Hi, guys,
> 
>  Recently I hit an occasional I/O hang that I can no longer reproduce.
> 
>  I analyzed the problem carefully; the critical stack is as follows:
> 
> 
> After reading the code in linux-aio.c (see the ioq_submit() function), I found two situations that could lead here:
> 
> 1) no AIOs are in flight (s->io_q.in_flight is 0) and another call to io_submit() returns -EAGAIN

So if there is no in-flight I/O, why would it return -EAGAIN? The tricky thing
here is that since we're not expecting a completion, when should we retry?

> 
> 2) no AIOs are in flight (s->io_q.in_flight is 0) and the s->io_q.pending requests reach MAX_EVENTS at once

I don't understand this case. We have,

        len = 0;
        QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
            iocbs[len++] = &aiocb->iocb;
            if (s->io_q.in_flight + len >= MAX_EVENTS) {
                break;
            }
        }

        ret = io_submit(s->ctx, len, iocbs);

If in_flight is 0, at most MAX_EVENTS requests can be added to iocbs (the loop
breaks once in_flight + len reaches MAX_EVENTS), so io_submit shouldn't return
-EAGAIN.
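
The bound can be checked by tracing the quoted loop on its own. The sketch below keeps only the counting logic; batch_len() is a hypothetical helper and the value of MAX_EVENTS is assumed purely for illustration. Because the check runs after len is incremented, the batch stops as soon as in_flight + len reaches MAX_EVENTS.

```c
#define MAX_EVENTS 128  /* assumed value, for illustration only */

/* Mirrors the counting in the quoted loop: returns how many of `pending`
 * queued requests end up in one io_submit() batch when `in_flight`
 * requests are already outstanding. */
static int batch_len(int in_flight, int pending)
{
    int len = 0;

    for (int i = 0; i < pending; i++) {
        len++;                               /* iocbs[len++] = &aiocb->iocb; */
        if (in_flight + len >= MAX_EVENTS) { /* same check as the original */
            break;
        }
    }
    return len;
}
```

With these assumptions, batch_len(0, 1000) returns MAX_EVENTS and batch_len(100, 1000) returns MAX_EVENTS - 100, so a single batch never pushes in_flight + len past MAX_EVENTS as long as in_flight starts below it.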

> 
> In both situations, the do{...}while loop breaks out and sets s->io_q.blocked to true.
> 
> After that, the AIO completion callback is never called, and neither is ioq_submit(), so all pending requests hang.
> 
> 
> Is there a proper way to fix this without affecting (stalling) the guest?

Fam
