* aio_wait_bh_oneshot() thread-safety question
@ 2022-05-23 16:04 Vladimir Sementsov-Ogievskiy
  2022-05-24  7:08 ` Paolo Bonzini
  0 siblings, 1 reply; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-05-23 16:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: kwolf, Paolo Bonzini, Stefan Hajnoczi, hreitz

Hi all (yes, that's my new address, hopefully for a long time).

I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I see that data->done is not accessed atomically and has no barrier protecting it.

Is the following possible:

main-loop                           iothread
                                 |
aio_wait_bh_oneshot()           |
     aio_bh_schedule_oneshot()   |
                                 |  handle bh:
                                 | 1. set data->done = true
                                 | 2. call aio_wait_kick(), inserting the
                                 | dummy bh into main context
                                 |
  ... in AIO_WAIT_WHILE():
    handle dummy bh, go to next
    iteration, but still read
    data->done=false due to some
    processor data reordering,
    go to next iteration of polling
    and hang

?
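
For reference, the code in question looks roughly like this (paraphrased from util/aio-wait.c in master; details may differ):

/* Paraphrased from util/aio-wait.c; may not be exact. */
typedef struct {
    bool done;          /* written by the iothread BH, read by the main loop */
    QEMUBHFunc *cb;
    void *opaque;
} AioWaitBHData;

void aio_wait_bh_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
    AioWaitBHData data = {
        .cb = cb,
        .opaque = opaque,
    };

    /* Must run in the main loop thread */
    assert(qemu_get_current_aio_context() == qemu_get_aio_context());

    aio_bh_schedule_oneshot(ctx, aio_wait_bh, &data);
    AIO_WAIT_WHILE(ctx, !data.done);  /* plain, non-atomic load of data.done */
}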

I've seen the following deadlock on a 2.12-based QEMU, but failed to find whether (and how) it is fixed in master:

1. main() thread is stuck in qemu_mutex_lock_iothread()

2. The global mutex is taken by migration_thread(), which has the following stack:

aio_poll ( ctx=qemu_aio_context, blocking=true )
aio_wait_bh_oneshot ( ctx=context_of_iothread, cb=virtio_blk_data_plane_stop_bh )
virtio_blk_data_plane_stop
virtio_bus_stop_ioeventfd
virtio_vmstate_change
vm_state_notify
do_vm_stop
migration_completion

The iothread itself is in qemu_poll_ns() -> ppoll(). data->done of the BH is true, so I assume the iothread completely handled the BH. Also, there is no dummy BH in the main QEMU AioContext's BH list, so I assume it was either already handled, or aio_wait_kick() was called even before entering AIO_WAIT_WHILE(). But still, AIO_WAIT_WHILE() somehow went into a blocking aio_poll(), as if data->done were false.
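
For reference, AIO_WAIT_WHILE() boils down to roughly the following (a condensed sketch of include/block/aio-wait.h from memory, showing only the main-loop-waits-on-iothread path):

/* Condensed sketch; error paths and the in-home-context case omitted. */
#define AIO_WAIT_WHILE(ctx, cond) ({                               \
    bool waited_ = false;                                          \
    AioWait *wait_ = &global_aio_wait;                             \
    /* announce ourselves so aio_wait_kick() knows to wake us */   \
    qatomic_inc(&wait_->num_waiters);                              \
    while ((cond)) {                                               \
        aio_context_release(ctx);                                  \
        aio_poll(qemu_get_aio_context(), true);  /* blocks here */ \
        aio_context_acquire(ctx);                                  \
        waited_ = true;                                            \
    }                                                              \
    qatomic_dec(&wait_->num_waiters);                              \
    waited_; })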

-- 
Best regards,
Vladimir



* Re: aio_wait_bh_oneshot() thread-safety question
  2022-05-23 16:04 aio_wait_bh_oneshot() thread-safety question Vladimir Sementsov-Ogievskiy
@ 2022-05-24  7:08 ` Paolo Bonzini
  2022-05-24 12:40   ` Kevin Wolf
  2022-05-24 17:56   ` Emanuele Giuseppe Esposito
  0 siblings, 2 replies; 6+ messages in thread
From: Paolo Bonzini @ 2022-05-24  7:08 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, Stefan Hajnoczi, hreitz, Emanuele Giuseppe Esposito

On 5/23/22 18:04, Vladimir Sementsov-Ogievskiy wrote:
> 
> I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I see
> that data->done is not accessed atomically and has no barrier
> protecting it.
> 
> Is the following possible:
> 
> main-loop                           iothread
>                                  |
> aio_wait_bh_oneshot()           |
>      aio_bh_schedule_oneshot()   |
>                                  |  handle bh:
>                                  | 1. set data->done = true
>                                  | 2. call aio_wait_kick(), inserting the
>                                  | dummy bh into main context
>                                  |
>   ... in AIO_WAIT_WHILE():
>     handle dummy bh, go to next
>     iteration, but still read
>     data->done=false due to some
>     processor data reordering,
>     go to next iteration of polling
>     and hang
Yes, barriers are missing:

https://lore.kernel.org/qemu-devel/You6FburTi7gVyxy@stefanha-x1.localdomain/T/#md97146c6eae1fce2ddd687fdc3f2215eee03f6f4

It seems like the issue was never observed, at least on x86.

Paolo




* Re: aio_wait_bh_oneshot() thread-safety question
  2022-05-24  7:08 ` Paolo Bonzini
@ 2022-05-24 12:40   ` Kevin Wolf
  2022-05-24 13:46     ` Vladimir Sementsov-Ogievskiy
  2022-05-24 17:21     ` Paolo Bonzini
  2022-05-24 17:56   ` Emanuele Giuseppe Esposito
  1 sibling, 2 replies; 6+ messages in thread
From: Kevin Wolf @ 2022-05-24 12:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel,
	Stefan Hajnoczi, hreitz, Emanuele Giuseppe Esposito

Am 24.05.2022 um 09:08 hat Paolo Bonzini geschrieben:
> On 5/23/22 18:04, Vladimir Sementsov-Ogievskiy wrote:
> > 
> > I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I
> > see that data->done is not accessed atomically and has no barrier
> > protecting it.
> > 
> > Is the following possible:
> > 
> > main-loop                           iothread
> >                                  |
> > aio_wait_bh_oneshot()           |
> >      aio_bh_schedule_oneshot()   |
> >                                  |  handle bh:
> >                                  | 1. set data->done = true
> >                                  | 2. call aio_wait_kick(), inserting the
> >                                  | dummy bh into main context
> >                                  |
> >   ... in AIO_WAIT_WHILE():
> >     handle dummy bh, go to next
> >     iteration, but still read
> >     data->done=false due to some
> >     processor data reordering,
> >     go to next iteration of polling
> >     and hang
> Yes, barriers are missing:
> 
> https://lore.kernel.org/qemu-devel/You6FburTi7gVyxy@stefanha-x1.localdomain/T/#md97146c6eae1fce2ddd687fdc3f2215eee03f6f4
> 
> It seems like the issue was never observed, at least on x86.

Why is the barrier in aio_bh_enqueue() not enough? Is the comment there
wrong?

aio_notify() has another barrier. This is a little bit too late, but if
I misunderstood the aio_bh_enqueue() one, it could explain why it was
never observed.
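
For reference, the code and comment I mean look roughly like this (quoting util/async.c from memory, so possibly not exact):

static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
{
    AioContext *ctx = bh->ctx;
    unsigned old_flags;

    /*
     * The memory barrier implicit in qatomic_fetch_or makes sure that:
     * 1. idle & any writes needed by the callback are done before the
     *    locations are read in the aio_bh_poll.
     * 2. ctx is loaded before the callback has a chance to execute and
     *    bh could be freed.
     */
    old_flags = qatomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
    if (!(old_flags & BH_PENDING)) {
        QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
    }

    aio_notify(ctx);
}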

Kevin




* Re: aio_wait_bh_oneshot() thread-safety question
  2022-05-24 12:40   ` Kevin Wolf
@ 2022-05-24 13:46     ` Vladimir Sementsov-Ogievskiy
  2022-05-24 17:21     ` Paolo Bonzini
  1 sibling, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-05-24 13:46 UTC (permalink / raw)
  To: Kevin Wolf, Paolo Bonzini
  Cc: qemu-block, qemu-devel, Stefan Hajnoczi, hreitz,
	Emanuele Giuseppe Esposito

On 5/24/22 15:40, Kevin Wolf wrote:
> Am 24.05.2022 um 09:08 hat Paolo Bonzini geschrieben:
>> On 5/23/22 18:04, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>> I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I
>>> see that data->done is not accessed atomically and has no barrier
>>> protecting it.
>>>
>>> Is the following possible:
>>>
>>> main-loop                           iothread
>>>                                   |
>>> aio_wait_bh_oneshot()           |
>>>       aio_bh_schedule_oneshot()   |
>>>                                   |  handle bh:
>>>                                   | 1. set data->done = true
>>>                                   | 2. call aio_wait_kick(), inserting the
>>>                                   | dummy bh into main context
>>>                                   |
>>>    ... in AIO_WAIT_WHILE():
>>>      handle dummy bh, go to next
>>>      iteration, but still read
>>>      data->done=false due to some
>>>      processor data reordering,
>>>      go to next iteration of polling
>>>      and hang
>> Yes, barriers are missing:
>>
>> https://lore.kernel.org/qemu-devel/You6FburTi7gVyxy@stefanha-x1.localdomain/T/#md97146c6eae1fce2ddd687fdc3f2215eee03f6f4
>>
>> It seems like the issue was never observed, at least on x86.
> 
> Why is the barrier in aio_bh_enqueue() not enough? Is the comment there
> wrong?
> 
> aio_notify() has another barrier. This is a little bit too late, but if
> I misunderstood the aio_bh_enqueue() one, it could explain why it was
> never observed.
> 
> Kevin
> 

I'd consider two cases:

1. aio_wait_kick() reads num_waiters as 0 and doesn't schedule any BH into the main ctx.

In this case aio_wait_kick() does only one atomic operation, qatomic_read(&global_aio_wait.num_waiters), which, as I understand it, is not a barrier.
So data->done=true may be reordered with this operation.

main-loop                                iothread

   aio_wait_bh_oneshot()          |
      aio_bh_schedule_oneshot()   |
                                  |  atomic read num_waiters = 0 => don't kick
      AIO_WAIT_WHILE              |
       atomic inc num_waiters     |
       read done = false, go      |
       into blocking aio_poll()   |
                                  |  set data->done = true  # reordered to the end
                                  |    - but that doesn't help to wake main loop


In this case the iothread simply doesn't call aio_bh_enqueue() or aio_notify(), so any barriers in them don't help.


2. aio_wait_kick() reads num_waiters > 0 and does schedule the BH

In this case it seems you are right: if the main loop dequeued the dummy BH, it should be guaranteed that after handling this BH the main loop will see data->done=true. That is, if the comment is correct; I hope it is. At least it corresponds to what I've read in https://www.kernel.org/doc/Documentation/atomic_t.txt (how general that information is, I don't know).

In 2.12 there were no enqueue()/dequeue() functions, but there was an smp_wmb() in aio_bh_schedule_oneshot(), paired with an atomic_xchg() in aio_bh_poll(), with a similar comment about an implicit barrier.
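
Roughly, from memory (so take it with a grain of salt), the 2.12 code was:

void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
    QEMUBH *bh = g_new(QEMUBH, 1);
    *bh = (QEMUBH){
        .ctx = ctx,
        .cb = cb,
        .opaque = opaque,
    };
    qemu_lockcnt_lock(&ctx->list_lock);
    bh->next = ctx->first_bh;
    bh->scheduled = 1;
    bh->deleted = 1;    /* one-shot BHs are freed after they run */
    /* Make sure that the members are ready before putting bh into list */
    smp_wmb();
    ctx->first_bh = bh;
    qemu_lockcnt_unlock(&ctx->list_lock);
    aio_notify(ctx);
}

and in aio_bh_poll() the consuming side was something like:

    /* paired with the smp_wmb() above; atomic_xchg is a full barrier */
    if (atomic_xchg(&bh->scheduled, 0)) {
        aio_bh_call(bh);
    }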


-- 
Best regards,
Vladimir



* Re: aio_wait_bh_oneshot() thread-safety question
  2022-05-24 12:40   ` Kevin Wolf
  2022-05-24 13:46     ` Vladimir Sementsov-Ogievskiy
@ 2022-05-24 17:21     ` Paolo Bonzini
  1 sibling, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2022-05-24 17:21 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel,
	Stefan Hajnoczi, hreitz, Emanuele Giuseppe Esposito

On 5/24/22 14:40, Kevin Wolf wrote:
> Why is the barrier in aio_bh_enqueue() not enough? Is the comment there
> wrong?
> 
> aio_notify() has another barrier. This is a little bit too late, but if
> I misunderstood the aio_bh_enqueue() one, it could explain why it was
> never observed.

The missing one that I (and I think Vladimir) were talking about is at the
end of the execution of the bottom half, not at the beginning:

/* Context: BH in IOThread */
static void aio_wait_bh(void *opaque)
{
     AioWaitBHData *data = opaque;

     data->cb(data->opaque);

     data->done = true;
     aio_wait_kick();
}

void aio_wait_kick(void)
{
     /* The barrier (or an atomic op) is in the caller.  */
     if (qatomic_read(&global_aio_wait.num_waiters)) {
         aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
     }
}

where there is no barrier in the caller to separate the write to data->done
(qatomic_set() would be nice, if only for clarity) from the read of num_waiters.
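
A minimal fix would be a full barrier between the two, along these lines (only a sketch; the actual patch may look different):

/* Sketch of a possible fix, not the final patch. */
void aio_wait_kick(void)
{
    /*
     * Pairs with the implicit full barrier of qatomic_inc() in
     * AIO_WAIT_WHILE():
     *   here:            write(condition); smp_mb(); read(num_waiters);
     *   AIO_WAIT_WHILE:  write(num_waiters); smp_mb(); read(condition);
     */
    smp_mb();

    if (qatomic_read(&global_aio_wait.num_waiters)) {
        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
    }
}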

Paolo



* Re: aio_wait_bh_oneshot() thread-safety question
  2022-05-24  7:08 ` Paolo Bonzini
  2022-05-24 12:40   ` Kevin Wolf
@ 2022-05-24 17:56   ` Emanuele Giuseppe Esposito
  1 sibling, 0 replies; 6+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-05-24 17:56 UTC (permalink / raw)
  To: Paolo Bonzini, Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, Stefan Hajnoczi, hreitz



Am 24/05/2022 um 09:08 schrieb Paolo Bonzini:
> On 5/23/22 18:04, Vladimir Sementsov-Ogievskiy wrote:
>>
>> I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I
>> see that data->done is not accessed atomically and has no
>> barrier protecting it.
>>
>> Is the following possible:
>>
>> main-loop                           iothread
>>                                  |
>> aio_wait_bh_oneshot()           |
>>      aio_bh_schedule_oneshot()   |
>>                                  |  handle bh:
>>                                  | 1. set data->done = true
>>                                  | 2. call aio_wait_kick(), inserting the
>>                                  | dummy bh into main context
>>                                  |
>>   ... in AIO_WAIT_WHILE():
>>     handle dummy bh, go to next
>>     iteration, but still read
>>     data->done=false due to some
>>     processor data reordering,
>>     go to next iteration of polling
>>     and hang
> Yes, barriers are missing:
> 
> https://lore.kernel.org/qemu-devel/You6FburTi7gVyxy@stefanha-x1.localdomain/T/#md97146c6eae1fce2ddd687fdc3f2215eee03f6f4
> 
> 
> It seems like the issue was never observed, at least on x86.
> 
> Paolo
> 

Sent the fix as a separate patch:
https://patchew.org/QEMU/20220524173054.12651-1-eesposit@redhat.com/

Thank you,
Emanuele



