qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
@ 2016-02-16 15:53 Paolo Bonzini
  2016-02-17  2:57 ` Fam Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-16 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz, stefanha, qemu-block, qemu-stable

The current implementation of bdrv_qed_drain can cause a double
completion of a request.

The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
unconditionally, but this is not correct if an allocating write
is queued.  In this case, qed_unplug_allocating_write_reqs will
restart the allocating write and possibly cause it to complete.
The aiocb however is still in use for the L2/L1 table writes,
and will then be completed again as soon as the table writes
are stable.

The fix is to only call qed_plug_allocating_write_reqs and
bdrv_aio_flush (which is the same as the timer callback---the patch
makes this explicit) only if the timer was scheduled in the first
place.  This fixes qemu-iotests test 011.

Cc: qemu-stable@nongnu.org
Cc: qemu-block@nongnu.org
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/qed.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 404be1e..ebba220 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
 
-    /* Cancel timer and start doing I/O that were meant to happen as if it
-     * fired, that way we get bdrv_drain() taking care of the ongoing requests
-     * correctly. */
-    qed_cancel_need_check_timer(s);
-    qed_plug_allocating_write_reqs(s);
-    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
+    /* Fire the timer immediately in order to start doing I/O as soon as the
+     * header is flushed.
+     */
+    if (s->need_check_timer && timer_pending(s->need_check_timer)) {
+        qed_cancel_need_check_timer(s);
+        qed_need_check_timer_cb(s);
+    }
 }
 
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-16 15:53 [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain Paolo Bonzini
@ 2016-02-17  2:57 ` Fam Zheng
  2016-02-17 11:28   ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Fam Zheng @ 2016-02-17  2:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: stefanha, qemu-devel, qemu-block, qemu-stable

On Tue, 02/16 16:53, Paolo Bonzini wrote:
> The current implementation of bdrv_qed_drain can cause a double
> completion of a request.
> 
> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
> unconditionally, but this is not correct if an allocating write
> is queued.  In this case, qed_unplug_allocating_write_reqs will
> restart the allocating write and possibly cause it to complete.
> The aiocb however is still in use for the L2/L1 table writes,
> and will then be completed again as soon as the table writes
> are stable.
> 
> The fix is to only call qed_plug_allocating_write_reqs and
> bdrv_aio_flush (which is the same as the timer callback---the patch
> makes this explicit) only if the timer was scheduled in the first
> place.  This fixes qemu-iotests test 011.
> 
> Cc: qemu-stable@nongnu.org
> Cc: qemu-block@nongnu.org
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/qed.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/block/qed.c b/block/qed.c
> index 404be1e..ebba220 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>  {
>      BDRVQEDState *s = bs->opaque;
>  
> -    /* Cancel timer and start doing I/O that were meant to happen as if it
> -     * fired, that way we get bdrv_drain() taking care of the ongoing requests
> -     * correctly. */
> -    qed_cancel_need_check_timer(s);
> -    qed_plug_allocating_write_reqs(s);
> -    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
> +    /* Fire the timer immediately in order to start doing I/O as soon as the
> +     * header is flushed.
> +     */
> +    if (s->need_check_timer && timer_pending(s->need_check_timer)) {

We can assert(s->need_check_timer);

> +        qed_cancel_need_check_timer(s);
> +        qed_need_check_timer_cb(s);
> +    }

What if an allocating write is queued (the else branch case)? Its completion
will be in bdrv_drain and it could arm the need_check_timer which is wrong.

We need to drain the allocating_write_reqs queue before checking the timer.

Fam

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-17  2:57 ` Fam Zheng
@ 2016-02-17 11:28   ` Paolo Bonzini
  2016-02-23  5:57     ` Fam Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-17 11:28 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-block, qemu-devel, stefanha, qemu-stable



On 17/02/2016 03:57, Fam Zheng wrote:
> On Tue, 02/16 16:53, Paolo Bonzini wrote:
>> The current implementation of bdrv_qed_drain can cause a double
>> completion of a request.
>>
>> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
>> unconditionally, but this is not correct if an allocating write
>> is queued.  In this case, qed_unplug_allocating_write_reqs will
>> restart the allocating write and possibly cause it to complete.
>> The aiocb however is still in use for the L2/L1 table writes,
>> and will then be completed again as soon as the table writes
>> are stable.
>>
>> The fix is to only call qed_plug_allocating_write_reqs and
>> bdrv_aio_flush (which is the same as the timer callback---the patch
>> makes this explicit) only if the timer was scheduled in the first
>> place.  This fixes qemu-iotests test 011.
>>
>> Cc: qemu-stable@nongnu.org
>> Cc: qemu-block@nongnu.org
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  block/qed.c | 13 +++++++------
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/qed.c b/block/qed.c
>> index 404be1e..ebba220 100644
>> --- a/block/qed.c
>> +++ b/block/qed.c
>> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>>  {
>>      BDRVQEDState *s = bs->opaque;
>>  
>> -    /* Cancel timer and start doing I/O that were meant to happen as if it
>> -     * fired, that way we get bdrv_drain() taking care of the ongoing requests
>> -     * correctly. */
>> -    qed_cancel_need_check_timer(s);
>> -    qed_plug_allocating_write_reqs(s);
>> -    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
>> +    /* Fire the timer immediately in order to start doing I/O as soon as the
>> +     * header is flushed.
>> +     */
>> +    if (s->need_check_timer && timer_pending(s->need_check_timer)) {
> 
> We can assert(s->need_check_timer);

I've seen it NULL, but didn't check why.  This was also a source of
segmentation faults.

>> +        qed_cancel_need_check_timer(s);
>> +        qed_need_check_timer_cb(s);
>> +    }
> 
> What if an allocating write is queued (the else branch case)? Its completion
> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> 
> We need to drain the allocating_write_reqs queue before checking the timer.

You're right, but how?  That's what bdrv_drain(bs) does, it's a
chicken-and-egg problem.

In my new series, draining works separately on each BlockDriverState
along the chain, from parent (bs) to children (bs->file->bs).  We could
then have .before_drain and .after_drain; .before_drain disables the
timer, while .after_drain does the same operation as the timer.  But
that couldn't be backported, and 2.5 would remain broken forever.  This
patch at least is a band-aid to make QED functional.

Perhaps the alternative is to remove write support for QED altogether
(and the drain callback with it, since it's the only user).  Not sure
why anyone would use it these days.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-17 11:28   ` Paolo Bonzini
@ 2016-02-23  5:57     ` Fam Zheng
  2016-02-23 10:43       ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Fam Zheng @ 2016-02-23  5:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-block, qemu-devel, stefanha, qemu-stable

On Wed, 02/17 12:28, Paolo Bonzini wrote:
> 
> 
> On 17/02/2016 03:57, Fam Zheng wrote:
> > On Tue, 02/16 16:53, Paolo Bonzini wrote:
> >> The current implementation of bdrv_qed_drain can cause a double
> >> completion of a request.
> >>
> >> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
> >> unconditionally, but this is not correct if an allocating write
> >> is queued.  In this case, qed_unplug_allocating_write_reqs will
> >> restart the allocating write and possibly cause it to complete.
> >> The aiocb however is still in use for the L2/L1 table writes,
> >> and will then be completed again as soon as the table writes
> >> are stable.
> >>
> >> The fix is to only call qed_plug_allocating_write_reqs and
> >> bdrv_aio_flush (which is the same as the timer callback---the patch
> >> makes this explicit) only if the timer was scheduled in the first
> >> place.  This fixes qemu-iotests test 011.
> >>
> >> Cc: qemu-stable@nongnu.org
> >> Cc: qemu-block@nongnu.org
> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  block/qed.c | 13 +++++++------
> >>  1 file changed, 7 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/block/qed.c b/block/qed.c
> >> index 404be1e..ebba220 100644
> >> --- a/block/qed.c
> >> +++ b/block/qed.c
> >> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
> >>  {
> >>      BDRVQEDState *s = bs->opaque;
> >>  
> >> -    /* Cancel timer and start doing I/O that were meant to happen as if it
> >> -     * fired, that way we get bdrv_drain() taking care of the ongoing requests
> >> -     * correctly. */
> >> -    qed_cancel_need_check_timer(s);
> >> -    qed_plug_allocating_write_reqs(s);
> >> -    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
> >> +    /* Fire the timer immediately in order to start doing I/O as soon as the
> >> +     * header is flushed.
> >> +     */
> >> +    if (s->need_check_timer && timer_pending(s->need_check_timer)) {
> > 
> > We can assert(s->need_check_timer);
> 
> I've seen it NULL, but didn't check why.  This was also a source of
> segmentation faults.
> 
> >> +        qed_cancel_need_check_timer(s);
> >> +        qed_need_check_timer_cb(s);
> >> +    }
> > 
> > What if an allocating write is queued (the else branch case)? Its completion
> > will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> > 
> > We need to drain the allocating_write_reqs queue before checking the timer.
> 
> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> chicken-and-egg problem.

Maybe use an aio_poll loop before the if?

Fam

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-23  5:57     ` Fam Zheng
@ 2016-02-23 10:43       ` Paolo Bonzini
  2016-02-23 12:49         ` Fam Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-23 10:43 UTC (permalink / raw)
  To: Fam Zheng; +Cc: stefanha, qemu-devel, qemu-block, qemu-stable



On 23/02/2016 06:57, Fam Zheng wrote:
>>>> +        qed_cancel_need_check_timer(s);
>>>> +        qed_need_check_timer_cb(s);
>>>> +    }
>>>
>>> What if an allocating write is queued (the else branch case)? Its completion
>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
>>>
>>> We need to drain the allocating_write_reqs queue before checking the timer.
>>
>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
>> chicken-and-egg problem.
> 
> Maybe use an aio_poll loop before the if?

That would not change the fact that you're reimplementing bdrv_drain
inside bdrv_qed_drain.

Perhaps for now it's simplest to just remove the QED .bdrv_drain
callback, if you think this patch is not a good stopgap measure to avoid
the segmentation faults.

Once the bdrv_drain rework is in, we can move the callback _after_ I/O
is drained on bs and before it is drained on bs->file->bs.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-23 10:43       ` Paolo Bonzini
@ 2016-02-23 12:49         ` Fam Zheng
  2016-02-23 13:54           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Fam Zheng @ 2016-02-23 12:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: stefanha, qemu-devel, qemu-block, qemu-stable

On Tue, 02/23 11:43, Paolo Bonzini wrote:
> 
> 
> On 23/02/2016 06:57, Fam Zheng wrote:
> >>>> +        qed_cancel_need_check_timer(s);
> >>>> +        qed_need_check_timer_cb(s);
> >>>> +    }
> >>>
> >>> What if an allocating write is queued (the else branch case)? Its completion
> >>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> >>>
> >>> We need to drain the allocating_write_reqs queue before checking the timer.
> >>
> >> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> >> chicken-and-egg problem.
> > 
> > Maybe use an aio_poll loop before the if?
> 
> That would not change the fact that you're reimplementing bdrv_drain
> inside bdrv_qed_drain.
> 

But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
would be iterating through the allocating_write_reqs list and process reqs one
by one synchronously, which still involves aio_poll indirectly.

> Perhaps for now it's simplest to just remove the QED .bdrv_drain
> callback, if you think this patch is not a good stopgap measure to avoid
> the segmentation faults.

OK, I'm fine with this as a stopgap measure.

> 
> Once the bdrv_drain rework is in, we can move the callback _after_ I/O
> is drained on bs and before it is drained on bs->file->bs.

Sounds good.

Fam

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-02-23 12:49         ` Fam Zheng
@ 2016-02-23 13:54           ` Paolo Bonzini
  2016-03-07 16:57             ` [Qemu-devel] [Qemu-block] " Kevin Wolf
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-23 13:54 UTC (permalink / raw)
  To: Fam Zheng; +Cc: stefanha, qemu-devel, qemu-block, qemu-stable



On 23/02/2016 13:49, Fam Zheng wrote:
> On Tue, 02/23 11:43, Paolo Bonzini wrote:
>>
>>
>> On 23/02/2016 06:57, Fam Zheng wrote:
>>>>>> +        qed_cancel_need_check_timer(s);
>>>>>> +        qed_need_check_timer_cb(s);
>>>>>> +    }
>>>>>
>>>>> What if an allocating write is queued (the else branch case)? Its completion
>>>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
>>>>>
>>>>> We need to drain the allocating_write_reqs queue before checking the timer.
>>>>
>>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
>>>> chicken-and-egg problem.
>>>
>>> Maybe use an aio_poll loop before the if?
>>
>> That would not change the fact that you're reimplementing bdrv_drain
>> inside bdrv_qed_drain.
> 
> But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
> would be iterating through the allocating_write_reqs list and process reqs one
> by one synchronously, which still involves aio_poll indirectly.

The easy way would be better then.

Stefan, any second opinion?

Paolo

>> Perhaps for now it's simplest to just remove the QED .bdrv_drain
>> callback, if you think this patch is not a good stopgap measure to avoid
>> the segmentation faults.
> 
> OK, I'm fine with this as a stopgap measure.
> 
>> Once the bdrv_drain rework is in, we can move the callback _after_ I/O
>> is drained on bs and before it is drained on bs->file->bs.
> 
> Sounds good.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-02-23 13:54           ` Paolo Bonzini
@ 2016-03-07 16:57             ` Kevin Wolf
  2016-03-07 20:56               ` Stefan Hajnoczi
  2016-03-07 20:57               ` Stefan Hajnoczi
  0 siblings, 2 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-03-07 16:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-block, Fam Zheng, qemu-devel, stefanha, qemu-stable

Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> 
> 
> On 23/02/2016 13:49, Fam Zheng wrote:
> > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> >>
> >>
> >> On 23/02/2016 06:57, Fam Zheng wrote:
> >>>>>> +        qed_cancel_need_check_timer(s);
> >>>>>> +        qed_need_check_timer_cb(s);
> >>>>>> +    }
> >>>>>
> >>>>> What if an allocating write is queued (the else branch case)? Its completion
> >>>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> >>>>>
> >>>>> We need to drain the allocating_write_reqs queue before checking the timer.
> >>>>
> >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> >>>> chicken-and-egg problem.
> >>>
> >>> Maybe use an aio_poll loop before the if?
> >>
> >> That would not change the fact that you're reimplementing bdrv_drain
> >> inside bdrv_qed_drain.
> > 
> > But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
> > would be iterating through the allocating_write_reqs list and process reqs one
> > by one synchronously, which still involves aio_poll indirectly.
> 
> The easy way would be better then.
> 
> Stefan, any second opinion?

What's the status here? It would be good to have qed not segfaulting all
the time.

Kevin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-07 16:57             ` [Qemu-devel] [Qemu-block] " Kevin Wolf
@ 2016-03-07 20:56               ` Stefan Hajnoczi
  2016-03-07 21:22                 ` [Qemu-devel] " Paolo Bonzini
  2016-03-08  9:58                 ` Kevin Wolf
  2016-03-07 20:57               ` Stefan Hajnoczi
  1 sibling, 2 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2016-03-07 20:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, Fam Zheng, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 1941 bytes --]

On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote:
> Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> > 
> > 
> > On 23/02/2016 13:49, Fam Zheng wrote:
> > > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> > >>
> > >>
> > >> On 23/02/2016 06:57, Fam Zheng wrote:
> > >>>>>> +        qed_cancel_need_check_timer(s);
> > >>>>>> +        qed_need_check_timer_cb(s);
> > >>>>>> +    }
> > >>>>>
> > >>>>> What if an allocating write is queued (the else branch case)? Its completion
> > >>>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> > >>>>>
> > >>>>> We need to drain the allocating_write_reqs queue before checking the timer.
> > >>>>
> > >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> > >>>> chicken-and-egg problem.
> > >>>
> > >>> Maybe use an aio_poll loop before the if?
> > >>
> > >> That would not change the fact that you're reimplementing bdrv_drain
> > >> inside bdrv_qed_drain.
> > > 
> > > But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
> > > would be iterating through the allocating_write_reqs list and process reqs one
> > > by one synchronously, which still involves aio_poll indirectly.
> > 
> > The easy way would be better then.
> > 
> > Stefan, any second opinion?
> 
> What's the status here? It would be good to have qed not segfaulting all
> the time.

I think the timer concept itself is troublesome.  A radical approach but
limited to changing QED itself is to drop the timer and instead keep a
timestamp for the last allocating write request.  Next time a write
request (allocating or rewriting) is about to complete, do the flush and
clear the need check flag as part of the write request (if 5 seconds
have passed since the timestamp).

Because the need check flag clearing is part of the write request
completion, it will integrate properly with drain.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-07 16:57             ` [Qemu-devel] [Qemu-block] " Kevin Wolf
  2016-03-07 20:56               ` Stefan Hajnoczi
@ 2016-03-07 20:57               ` Stefan Hajnoczi
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2016-03-07 20:57 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, Fam Zheng, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 138 bytes --]

By the way, I'll send a patch on Tuesday showing the timerless need
check mechanism.  It's a little too late tonight to start it.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain
  2016-03-07 20:56               ` Stefan Hajnoczi
@ 2016-03-07 21:22                 ` Paolo Bonzini
  2016-03-08  9:52                   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2016-03-08  9:58                 ` Kevin Wolf
  1 sibling, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-03-07 21:22 UTC (permalink / raw)
  To: Stefan Hajnoczi, Kevin Wolf
  Cc: Fam Zheng, qemu-devel, qemu-block, qemu-stable



On 07/03/2016 21:56, Stefan Hajnoczi wrote:
> I think the timer concept itself is troublesome.  A radical approach but
> limited to changing QED itself is to drop the timer and instead keep a
> timestamp for the last allocating write request.  Next time a write
> request (allocating or rewriting) is about to complete, do the flush and
> clear the need check flag as part of the write request (if 5 seconds
> have passed since the timestamp).

bdrv_qed_drain should be easy to fix in my new drain implementation,
which is based on draining the parent before the BdrvChild-ren.  It's
just troublesome in the current one which alternates flushing and draining.

I would just revert the patch that introduced bdrv_qed_drain for now,
and reintroduce it later (note however that something was messed up in
commit df9a681, "qed: Implement .bdrv_drain", 2015-11-12, because it
includes some dirty bitmap stuff).

Or perhaps the idea of making QED read-only has some merit...

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-07 21:22                 ` [Qemu-devel] " Paolo Bonzini
@ 2016-03-08  9:52                   ` Stefan Hajnoczi
  2016-03-08  9:59                     ` Kevin Wolf
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2016-03-08  9:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, qemu-devel, qemu-stable,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

On Mon, Mar 07, 2016 at 10:22:58PM +0100, Paolo Bonzini wrote:
> On 07/03/2016 21:56, Stefan Hajnoczi wrote:
> > I think the timer concept itself is troublesome.  A radical approach but
> > limited to changing QED itself is to drop the timer and instead keep a
> > timestamp for the last allocating write request.  Next time a write
> > request (allocating or rewriting) is about to complete, do the flush and
> > clear the need check flag as part of the write request (if 5 seconds
> > have passed since the timestamp).
> 
> bdrv_qed_drain should be easy to fix in my new drain implementation,
> which is based on draining the parent before the BdrvChild-ren.  It's
> just troublesome in the current one which alternates flushing and draining.
> 
> I would just revert the patch that introduced bdrv_qed_drain for now,
> and reintroduce it later (note however that something was messed up in
> commit df9a681, "qed: Implement .bdrv_drain", 2015-11-12, because it
> includes some dirty bitmap stuff).

You're right, that might be the best solution for the time being.  AFAIK
the need check write is harmless.  It does not result in a guest-visible
change and is basically just a flush + header update.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-07 20:56               ` Stefan Hajnoczi
  2016-03-07 21:22                 ` [Qemu-devel] " Paolo Bonzini
@ 2016-03-08  9:58                 ` Kevin Wolf
  2016-03-09 15:37                   ` Stefan Hajnoczi
  1 sibling, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-03-08  9:58 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Paolo Bonzini, Fam Zheng, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 2593 bytes --]

Am 07.03.2016 um 21:56 hat Stefan Hajnoczi geschrieben:
> On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote:
> > Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> > > 
> > > 
> > > On 23/02/2016 13:49, Fam Zheng wrote:
> > > > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> > > >>
> > > >>
> > > >> On 23/02/2016 06:57, Fam Zheng wrote:
> > > >>>>>> +        qed_cancel_need_check_timer(s);
> > > >>>>>> +        qed_need_check_timer_cb(s);
> > > >>>>>> +    }
> > > >>>>>
> > > >>>>> What if an allocating write is queued (the else branch case)? Its completion
> > > >>>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> > > >>>>>
> > > >>>>> We need to drain the allocating_write_reqs queue before checking the timer.
> > > >>>>
> > > >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> > > >>>> chicken-and-egg problem.
> > > >>>
> > > >>> Maybe use an aio_poll loop before the if?
> > > >>
> > > >> That would not change the fact that you're reimplementing bdrv_drain
> > > >> inside bdrv_qed_drain.
> > > > 
> > > > But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
> > > > would be iterating through the allocating_write_reqs list and process reqs one
> > > > by one synchronously, which still involves aio_poll indirectly.
> > > 
> > > The easy way would be better then.
> > > 
> > > Stefan, any second opinion?
> > 
> > What's the status here? It would be good to have qed not segfaulting all
> > the time.
> 
> I think the timer concept itself is troublesome.  A radical approach but
> limited to changing QED itself is to drop the timer and instead keep a
> timestamp for the last allocating write request.  Next time a write
> request (allocating or rewriting) is about to complete, do the flush and
> clear the need check flag as part of the write request (if 5 seconds
> have passed since the timestamp).

Isn't that kind of backwards, though, because it's very likely that
after some new activity a second write request will come in soon and
mark the image dirty again?

Apart from that, doesn't it fail to provide the intended effect, that
the image doesn't stay dirty for a long time and a repair isn't usually
needed after a crash?

I think rather than doing this, I'd just remove the mechanism altogether
and instead mark the image clean when the guest flushes the disk cache.

Kevin

> Because the need check flag clearing is part of the write request
> completion, it will integrate properly with drain.
> 
> Stefan



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-08  9:52                   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-03-08  9:59                     ` Kevin Wolf
  0 siblings, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-03-08  9:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-block, qemu-devel, qemu-stable, Stefan Hajnoczi,
	Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

Am 08.03.2016 um 10:52 hat Stefan Hajnoczi geschrieben:
> On Mon, Mar 07, 2016 at 10:22:58PM +0100, Paolo Bonzini wrote:
> > On 07/03/2016 21:56, Stefan Hajnoczi wrote:
> > > I think the timer concept itself is troublesome.  A radical approach but
> > > limited to changing QED itself is to drop the timer and instead keep a
> > > timestamp for the last allocating write request.  Next time a write
> > > request (allocating or rewriting) is about to complete, do the flush and
> > > clear the need check flag as part of the write request (if 5 seconds
> > > have passed since the timestamp).
> > 
> > bdrv_qed_drain should be easy to fix in my new drain implementation,
> > which is based on draining the parent before the BdrvChild-ren.  It's
> > just troublesome in the current one which alternates flushing and draining.
> > 
> > I would just revert the patch that introduced bdrv_qed_drain for now,
> > and reintroduce it later (note however that something was messed up in
> > commit df9a681, "qed: Implement .bdrv_drain", 2015-11-12, because it
> > includes some dirty bitmap stuff).
> 
> You're right, that might be the best solution for the time being.  AFAIK
> the need check write is harmless.  It does not result in a guest-visible
> change and is basically just a flush + header update.

I think it caused occasional assertion failures.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
  2016-03-08  9:58                 ` Kevin Wolf
@ 2016-03-09 15:37                   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2016-03-09 15:37 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, Fam Zheng, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 2580 bytes --]

On Tue, Mar 08, 2016 at 10:58:47AM +0100, Kevin Wolf wrote:
> Am 07.03.2016 um 21:56 hat Stefan Hajnoczi geschrieben:
> > On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote:
> > > Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> > > > 
> > > > 
> > > > On 23/02/2016 13:49, Fam Zheng wrote:
> > > > > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> > > > >>
> > > > >>
> > > > >> On 23/02/2016 06:57, Fam Zheng wrote:
> > > > >>>>>> +        qed_cancel_need_check_timer(s);
> > > > >>>>>> +        qed_need_check_timer_cb(s);
> > > > >>>>>> +    }
> > > > >>>>>
> > > > >>>>> What if an allocating write is queued (the else branch case)? Its completion
> > > > >>>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> > > > >>>>>
> > > > >>>>> We need to drain the allocating_write_reqs queue before checking the timer.
> > > > >>>>
> > > > >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> > > > >>>> chicken-and-egg problem.
> > > > >>>
> > > > >>> Maybe use an aio_poll loop before the if?
> > > > >>
> > > > >> That would not change the fact that you're reimplementing bdrv_drain
> > > > >> inside bdrv_qed_drain.
> > > > > 
> > > > > But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
> > > > > would be iterating through the allocating_write_reqs list and process reqs one
> > > > > by one synchronously, which still involves aio_poll indirectly.
> > > > 
> > > > The easy way would be better then.
> > > > 
> > > > Stefan, any second opinion?
> > > 
> > > What's the status here? It would be good to have qed not segfaulting all
> > > the time.
> > 
> > I think the timer concept itself is troublesome.  A radical approach but
> > limited to changing QED itself is to drop the timer and instead keep a
> > timestamp for the last allocating write request.  Next time a write
> > request (allocating or rewriting) is about to complete, do the flush and
> > clear the need check flag as part of the write request (if 5 seconds
> > have passed since the timestamp).
> 
> Isn't that kind of backwards, though, because it's very likely that
> after some new activity a second write request will come in soon and
> mark the image dirty again?
> 
> Apart from that, doesn't it fail to provide the intended effect, that
> the image doesn't stay dirty for a long time and a repair isn't usually
> needed after a crash?

Yes, it relies on a non-allocating write request after the timeout.

I've reverted the drain patch for now.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-03-09 15:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-16 15:53 [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain Paolo Bonzini
2016-02-17  2:57 ` Fam Zheng
2016-02-17 11:28   ` Paolo Bonzini
2016-02-23  5:57     ` Fam Zheng
2016-02-23 10:43       ` Paolo Bonzini
2016-02-23 12:49         ` Fam Zheng
2016-02-23 13:54           ` Paolo Bonzini
2016-03-07 16:57             ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2016-03-07 20:56               ` Stefan Hajnoczi
2016-03-07 21:22                 ` [Qemu-devel] " Paolo Bonzini
2016-03-08  9:52                   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-03-08  9:59                     ` Kevin Wolf
2016-03-08  9:58                 ` Kevin Wolf
2016-03-09 15:37                   ` Stefan Hajnoczi
2016-03-07 20:57               ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).