All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal()
@ 2020-08-04 14:53 Alexandru Elisei
  2020-08-05  7:53 ` Milan Kocian
  2020-08-21 12:28 ` Will Deacon
  0 siblings, 2 replies; 4+ messages in thread
From: Alexandru Elisei @ 2020-08-04 14:53 UTC (permalink / raw)
  To: kvm
  Cc: milon, julien.thierry.kdev, will, jean-philippe, andre.przywara,
	Anvay Virkar

The guest programs used_event in the avail ring to let the host know when
it wants a notification from the device. The host notifies the guest when
the used ring index passes used_event. It is possible for the guest to
submit a buffer, and then go into uninterruptible sleep waiting for this
notification.

The virtio-blk guest driver, in the notification callback virtblk_done(),
increments the last known used ring index, then sets used_event to this
value, which means it will get a notification after the next buffer is
consumed by the host. virtblk_done() exits after the value of the used
ring idx has been propagated from the host thread.

On the host side, the virtio-blk device increments the used ring index,
then compares it to used_event to decide if a notification should be sent.

This is a common communication pattern between two threads, called store
buffer. Memory barriers are needed in order for the pattern to work
correctly, otherwise it is possible for the host to miss sending a required
notification.

Initial state: vring.used.idx = 2, vring.used_event = 1 (idx passes
used_event, which means kvmtool notifies the guest).

	GUEST (in virtblk_done())	| KVMTOOL (in virtio_blk_complete())
					|
(increment vq->last_used_idx = 2)	|
// virtqueue_enable_cb_prepare_split():	| // virt_queue__used_idx_advance():
write vring.used_event = 2		| write vring.used.idx = 3
// virtqueue_poll():			|
mb()					| wmb()
// virtqueue_poll_split():		| // virt_queue__should_signal():
read vring.used.idx = 2			| read vring.used_event = 1
// virtblk_done() exits.		| // No notification.

The write memory barrier on the host side is not enough to prevent
reordering of the read in the kvmtool thread, which can lead to the guest
thread waiting forever for IO to complete. Replace it with a full memory
barrier to get the correct store buffer pattern described in the Linux
litmus test SB+fencembonceonces.litmus, which forbids both threads reading
the initial values.

Also move the barrier in virtio_queue__should_signal(), because the barrier
is needed for notifications to work correctly, and it makes more sense to
have it in the function that determines if the host should notify the
guest.

Reported-by: Anvay Virkar <anvay.virkar@arm.com>
Suggested-by: Anvay Virkar <anvay.virkar@arm.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
Changes in v2:
- Updated commit message with suggestions from Jean. Hoping that now it's
  clearer that the hang is caused by kvmtool not implementing the store
  buffer pattern correctly.

I've added a link to the online herd7 memory model tool with the store
buffer pattern [1]. The DMB SY can be replaced with DMB ST in one of
the threads, to mimic what kvmtool does, and the forbidden behaviour
becomes possible.

This was observed by Anvay, where both the guest and kvmtool read the
previous values of used.idx, and used_event respectively. The guest goes
into uninterruptible sleep and the notification is not sent.

I *think* this also fixes the VM hang reported in [2], where several
processes in the guest were stuck in uninterruptible sleep. I am not
familiar with the block layer, but my theory is that the threads were stuck
in wait_for_completion_io(), from blk_execute_rq() executing a flush
request. Milan has agreed to give this patch a spin [3], but that might
take a while because the bug is not easily reproducible. I believe the
patch can be merged on its own.

[1] http://diy.inria.fr/www/index.html?record=aarch64&cat=aarch64-v04&litmus=SB%2Bdmb.sys&cfg=new-web
[2] https://www.spinics.net/lists/kvm/msg204543.html
[3] https://www.spinics.net/lists/kvm/msg222201.html

 virtio/core.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/virtio/core.c b/virtio/core.c
index f5b3c07fc100..90a661d12e14 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -33,13 +33,6 @@ void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump)
 	wmb();
 	idx += jump;
 	queue->vring.used->idx = virtio_host_to_guest_u16(queue, idx);
-
-	/*
-	 * Use wmb to assure used idx has been increased before we signal the guest.
-	 * Without a wmb here the guest may ignore the queue since it won't see
-	 * an updated idx.
-	 */
-	wmb();
 }
 
 struct vring_used_elem *
@@ -194,6 +187,14 @@ bool virtio_queue__should_signal(struct virt_queue *vq)
 {
 	u16 old_idx, new_idx, event_idx;
 
+	/*
+	 * Use mb to assure used idx has been increased before we signal the
+	 * guest, and we don't read a stale value for used_event. Without a mb
+	 * here we might not send a notification that we need to send, or the
+	 * guest may ignore the queue since it won't see an updated idx.
+	 */
+	mb();
+
 	if (!vq->use_event_idx) {
 		/*
 		 * When VIRTIO_RING_F_EVENT_IDX isn't negotiated, interrupt the
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal()
  2020-08-04 14:53 [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal() Alexandru Elisei
@ 2020-08-05  7:53 ` Milan Kocian
  2020-08-10 15:14   ` Alexandru Elisei
  2020-08-21 12:28 ` Will Deacon
  1 sibling, 1 reply; 4+ messages in thread
From: Milan Kocian @ 2020-08-05  7:53 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: kvm, julien.thierry.kdev, will, jean-philippe, andre.przywara,
	Anvay Virkar

Hello,

> I *think* this also fixes the VM hang reported in [2], where several
> processes in the guest were stuck in uninterruptible sleep. I am not
> familiar with the block layer, but my theory is that the threads were stuck
> in wait_for_completion_io(), from blk_execute_rq() executing a flush
> request. Milan has agreed to give this patch a spin [3], but that might
> take a while because the bug is not easily reproducible. I believe the
> patch can be merged on its own.
> 
> [1] http://diy.inria.fr/www/index.html?record=aarch64&cat=aarch64-v04&litmus=SB%2Bdmb.sys&cfg=new-web
> [2] https://www.spinics.net/lists/kvm/msg204543.html
> [3] https://www.spinics.net/lists/kvm/msg222201.html
> 

Unfortunately it didn't help. I can see the problem again now.

Thanks.

Best regards.

-- 
Milan Kocian

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal()
  2020-08-05  7:53 ` Milan Kocian
@ 2020-08-10 15:14   ` Alexandru Elisei
  0 siblings, 0 replies; 4+ messages in thread
From: Alexandru Elisei @ 2020-08-10 15:14 UTC (permalink / raw)
  To: Milan Kocian
  Cc: kvm, julien.thierry.kdev, will, jean-philippe, andre.przywara,
	Anvay Virkar

Hi Milan,

On 8/5/20 8:53 AM, Milan Kocian wrote:
> Hello,
>
>> I *think* this also fixes the VM hang reported in [2], where several
>> processes in the guest were stuck in uninterruptible sleep. I am not
>> familiar with the block layer, but my theory is that the threads were stuck
>> in wait_for_completion_io(), from blk_execute_rq() executing a flush
>> request. Milan has agreed to give this patch a spin [3], but that might
>> take a while because the bug is not easily reproducible. I believe the
>> patch can be merged on its own.
>>
>> [1] http://diy.inria.fr/www/index.html?record=aarch64&cat=aarch64-v04&litmus=SB%2Bdmb.sys&cfg=new-web
>> [2] https://www.spinics.net/lists/kvm/msg204543.html
>> [3] https://www.spinics.net/lists/kvm/msg222201.html
>>
> Unfortunately it didn't help. I can see the problem again now.

Thank you for giving this a go. Unfortunately this means there's another bug
lurking in kvmtool. I'll put this on my todo list and I'll try to figure something
out when I have some spare time.

Thanks,
Alex

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal()
  2020-08-04 14:53 [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal() Alexandru Elisei
  2020-08-05  7:53 ` Milan Kocian
@ 2020-08-21 12:28 ` Will Deacon
  1 sibling, 0 replies; 4+ messages in thread
From: Will Deacon @ 2020-08-21 12:28 UTC (permalink / raw)
  To: kvm, Alexandru Elisei
  Cc: catalin.marinas, kernel-team, Will Deacon, jean-philippe,
	Anvay Virkar, andre.przywara, julien.thierry.kdev, milon

On Tue, 4 Aug 2020 15:53:17 +0100, Alexandru Elisei wrote:
> The guest programs used_event in the avail ring to let the host know when
> it wants a notification from the device. The host notifies the guest when
> the used ring index passes used_event. It is possible for the guest to
> submit a buffer, and then go into uninterruptible sleep waiting for this
> notification.
> 
> The virtio-blk guest driver, in the notification callback virtblk_done(),
> increments the last known used ring index, then sets used_event to this
> value, which means it will get a notification after the next buffer is
> consumed by the host. virtblk_done() exits after the value of the used
> ring idx has been propagated from the host thread.
> 
> [...]

Applied to kvmtool (master), thanks!

[1/1] virtio: Fix ordering of virtio_queue__should_signal()
      https://git.kernel.org/will/kvmtool/c/d7d79bd51412

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-08-21 12:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-04 14:53 [PATCH v2 kvmtool] virtio: Fix ordering of virtio_queue__should_signal() Alexandru Elisei
2020-08-05  7:53 ` Milan Kocian
2020-08-10 15:14   ` Alexandru Elisei
2020-08-21 12:28 ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.