* [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
@ 2018-02-10 20:48 Michael Kelley
2018-02-11 23:50 ` Stephen Hemminger
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Michael Kelley @ 2018-02-10 20:48 UTC (permalink / raw)
To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang,
leann.ogasawara, marcelo.cerri, sthemmin, kys
Cc: mikelley
Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:
1. The interrupt_mask must not be used to determine whether to signal
on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
*after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
versions of Hyper-V that don't implement this feature will poll.
Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
---
drivers/hv/ring_buffer.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 50e0714..b64be18 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -423,7 +423,11 @@ struct vmpacket_descriptor *
void hv_pkt_iter_close(struct vmbus_channel *channel)
{
struct hv_ring_buffer_info *rbi = &channel->inbound;
- u32 orig_write_sz = hv_get_bytes_to_write(rbi);
+ u32 curr_write_sz;
+ u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
+ (rbi->priv_read_index - rbi->ring_buffer->read_index) :
+ (rbi->ring_datasize - rbi->ring_buffer->read_index +
+ rbi->priv_read_index);
/*
* Make sure all reads are done before we update the read index since
@@ -446,27 +450,31 @@ void hv_pkt_iter_close(struct vmbus_channel *channel)
*/
virt_mb();
- /* If host has disabled notifications then skip */
- if (rbi->ring_buffer->interrupt_mask)
- return;
-
if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
/*
+ * Ensure the read of write_index in hv_get_bytes_to_write()
+ * happens after the read of pending_send_sz.
+ */
+ virt_rmb();
+ curr_write_sz = hv_get_bytes_to_write(rbi);
+
+ /*
* If there was space before we began iteration,
* then host was not blocked. Also handles case where
* pending_sz is zero then host has nothing pending
* and does not need to be signaled.
*/
- if (orig_write_sz > pending_sz)
+ if (curr_write_sz - delta > pending_sz)
return;
/* If pending write will not fit, don't give false hope. */
- if (hv_get_bytes_to_write(rbi) < pending_sz)
+ if (curr_write_sz <= pending_sz)
return;
+
+ vmbus_setevent(channel);
}
- vmbus_setevent(channel);
}
EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
--
1.8.3.1
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-10 20:48 [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling Michael Kelley
@ 2018-02-11 23:50 ` Stephen Hemminger
2018-02-12 1:14 ` KY Srinivasan
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2018-02-11 23:50 UTC (permalink / raw)
To: Michael Kelley
Cc: olaf, sthemmin, gregkh, jasowang, linux-kernel, mikelley,
marcelo.cerri, apw, devel, vkuznets, leann.ogasawara
On Sat, 10 Feb 2018 20:48:49 +0000
Michael Kelley <mhkelley@outlook.com> wrote:
> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer->read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
>
Would it would make sense for this to be a wrapper function like other ring operations?
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-10 20:48 [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling Michael Kelley
2018-02-11 23:50 ` Stephen Hemminger
@ 2018-02-12 1:14 ` KY Srinivasan
2018-02-12 5:13 ` Michael Kelley (EOSG)
2018-02-13 17:34 ` Stephen Hemminger
2018-02-14 18:12 ` Stephen Hemminger
3 siblings, 1 reply; 7+ messages in thread
From: KY Srinivasan @ 2018-02-12 1:14 UTC (permalink / raw)
To: Michael Kelley (EOSG),
gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang,
leann.ogasawara, marcelo.cerri, Stephen Hemminger
> -----Original Message-----
> From: Michael Kelley [mailto:mhkelley@outlook.com]
> Sent: Saturday, February 10, 2018 12:49 PM
> To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; apw@canonical.com;
> vkuznets@redhat.com; jasowang@redhat.com;
> leann.ogasawara@canonical.com; marcelo.cerri@canonical.com; Stephen
> Hemminger <sthemmin@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>
> Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>
> Subject: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
>
> Fix bugs in signaling the Hyper-V host when freeing space in the
> host->guest ring buffer:
>
> 1. The interrupt_mask must not be used to determine whether to signal
> on the host->guest ring buffer
> 2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
> *after* pending_send_sz is read in order to avoid a race condition
> 3. Comparisons with pending_send_sz must treat the "equals" case as
> not-enough-space
> 4. Don't signal if the pending_send_sz feature is not present. Older
> versions of Hyper-V that don't implement this feature will poll.
>
> Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
> Signed-off-by: Michael Kelley <mhkelley@outlook.com>
> ---
> drivers/hv/ring_buffer.c | 24 ++++++++++++++++--------
> 1 file changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 50e0714..b64be18 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -423,7 +423,11 @@ struct vmpacket_descriptor *
> void hv_pkt_iter_close(struct vmbus_channel *channel)
> {
> struct hv_ring_buffer_info *rbi = &channel->inbound;
> - u32 orig_write_sz = hv_get_bytes_to_write(rbi);
> + u32 curr_write_sz;
> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer-
> >read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
>
> /*
> * Make sure all reads are done before we update the read index
> since
> @@ -446,27 +450,31 @@ void hv_pkt_iter_close(struct vmbus_channel
> *channel)
> */
> virt_mb();
>
> - /* If host has disabled notifications then skip */
> - if (rbi->ring_buffer->interrupt_mask)
> - return;
> -
> if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> >pending_send_sz);
>
> /*
> + * Ensure the read of write_index in
> hv_get_bytes_to_write()
> + * happens after the read of pending_send_sz.
> + */
> + virt_rmb();
We can avoid the read barrier by making the initialization of curr_write_sz conditional
on pending_send_sz being non-zero. Indeed you can make all the signaling code conditional
on pending_send_sz being non-zero.
> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
> * If there was space before we began iteration,
> * then host was not blocked. Also handles case where
> * pending_sz is zero then host has nothing pending
> * and does not need to be signaled.
> */
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
> return;
>
> /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
> return;
> +
> + vmbus_setevent(channel);
> }
>
> - vmbus_setevent(channel);
> }
> EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> --
> 1.8.3.1
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-12 1:14 ` KY Srinivasan
@ 2018-02-12 5:13 ` Michael Kelley (EOSG)
0 siblings, 0 replies; 7+ messages in thread
From: Michael Kelley (EOSG) @ 2018-02-12 5:13 UTC (permalink / raw)
To: KY Srinivasan, gregkh, linux-kernel, devel, olaf, apw, vkuznets,
jasowang, leann.ogasawara, marcelo.cerri, Stephen Hemminger
> -----Original Message-----
> From: KY Srinivasan
> Sent: Sunday, February 11, 2018 5:14 PM
--- snip ---
> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> > >pending_send_sz);
> >
> > /*
> > + * Ensure the read of write_index in
> > hv_get_bytes_to_write()
> > + * happens after the read of pending_send_sz.
> > + */
> > + virt_rmb();
> We can avoid the read barrier by making the initialization of curr_write_sz conditional
> on pending_send_sz being non-zero. Indeed you can make all the signaling code conditional
> on pending_send_sz being non-zero.
>
>
I agree that we can immediately test pending_send_sz for zero, and exit
if zero. A zero value will be by far the most common path, and would
save executing the read barrier and hv_get_bytes_to_write(). Can also
move the calculation of "delta" to after the test for zero.
But I believe the read barrier is still needed on the path where
pending_send_sz is non-zero. Just the testing the value of pending_send_sz
doesn't guarantee that the write_index wouldn't be speculatively read,
and potentially out-of-order. And we have to consider the out-of-order
behaviors on ARM64 as well.
>
> > + curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > + /*
> > * If there was space before we began iteration,
> > * then host was not blocked. Also handles case where
> > * pending_sz is zero then host has nothing pending
> > * and does not need to be signaled.
> > */
> > - if (orig_write_sz > pending_sz)
> > + if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > - if (hv_get_bytes_to_write(rbi) < pending_sz)
> > + if (curr_write_sz <= pending_sz)
> > return;
> > +
> > + vmbus_setevent(channel);
> > }
> >
> > - vmbus_setevent(channel);
> > }
> > EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> > --
> > 1.8.3.1
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-10 20:48 [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling Michael Kelley
2018-02-11 23:50 ` Stephen Hemminger
2018-02-12 1:14 ` KY Srinivasan
@ 2018-02-13 17:34 ` Stephen Hemminger
2018-02-14 2:12 ` Michael Kelley (EOSG)
2018-02-14 18:12 ` Stephen Hemminger
3 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2018-02-13 17:34 UTC (permalink / raw)
To: Michael Kelley
Cc: mikelley, gregkh, linux-kernel, devel, olaf, apw, vkuznets,
jasowang, leann.ogasawara, marcelo.cerri, sthemmin, kys
> if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
>
> /*
> + * Ensure the read of write_index in hv_get_bytes_to_write()
> + * happens after the read of pending_send_sz.
> + */
> + virt_rmb();
> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
> * If there was space before we began iteration,
> * then host was not blocked. Also handles case where
> * pending_sz is zero then host has nothing pending
> * and does not need to be signaled.
> */
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
> return;
>
> /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
> return;
> +
> + vmbus_setevent(channel);
> }
>
> - vmbus_setevent(channel);
> }
I think this won't work on older versions of Windows where feat_pending_sz
is never set. On those versions vmbus_setevent needs to always be called.
^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-13 17:34 ` Stephen Hemminger
@ 2018-02-14 2:12 ` Michael Kelley (EOSG)
0 siblings, 0 replies; 7+ messages in thread
From: Michael Kelley (EOSG) @ 2018-02-14 2:12 UTC (permalink / raw)
To: Stephen Hemminger, Michael Kelley
Cc: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang,
leann.ogasawara, marcelo.cerri, Stephen Hemminger, KY Srinivasan
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, February 13, 2018 9:35 AM
> To: Michael Kelley <mhkelley@outlook.com>
> Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>; gregkh@linuxfoundation.org;
> linux-kernel@vger.kernel.org; devel@linuxdriverproject.org; olaf@aepfle.de;
> apw@canonical.com; vkuznets@redhat.com; jasowang@redhat.com;
> leann.ogasawara@canonical.com; marcelo.cerri@canonical.com; Stephen Hemminger
> <sthemmin@microsoft.com>; KY Srinivasan <kys@microsoft.com>
> Subject: Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
>
>
> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
> >
> > /*
> > + * Ensure the read of write_index in hv_get_bytes_to_write()
> > + * happens after the read of pending_send_sz.
> > + */
> > + virt_rmb();
> > + curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > + /*
> > * If there was space before we began iteration,
> > * then host was not blocked. Also handles case where
> > * pending_sz is zero then host has nothing pending
> > * and does not need to be signaled.
> > */
> > - if (orig_write_sz > pending_sz)
> > + if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > - if (hv_get_bytes_to_write(rbi) < pending_sz)
> > + if (curr_write_sz <= pending_sz)
> > return;
> > +
> > + vmbus_setevent(channel);
> > }
> >
> > - vmbus_setevent(channel);
> > }
>
> I think this won't work on older versions of Windows where feat_pending_sz
> is never set. On those versions vmbus_setevent needs to always be called.
The older Windows/Hyper-V hosts poll if they can't write to the
host->guest ring buffer because it is full. We don't want to call
vmbus_setevent() every time through this routine because then
the guest would be signaling the host every time the host interrupts
the guest.
Michael
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
2018-02-10 20:48 [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling Michael Kelley
` (2 preceding siblings ...)
2018-02-13 17:34 ` Stephen Hemminger
@ 2018-02-14 18:12 ` Stephen Hemminger
3 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2018-02-14 18:12 UTC (permalink / raw)
To: Michael Kelley
Cc: olaf, sthemmin, gregkh, jasowang, linux-kernel, mikelley,
marcelo.cerri, apw, devel, vkuznets, leann.ogasawara
What about the following simplification. If on older hosts you can avoid more code
and the additional barrier. Also cleanup comment wording somewhat.
>From de2d566d9092cbbc8e5974dea581617ef787ff69 Mon Sep 17 00:00:00 2001
From: Michael Kelley <mhkelley@outlook.com>
Date: Sat, 10 Feb 2018 20:48:49 +0000
Subject: [PATCH 1/9] Drivers: hv: vmbus: Fix ring buffer signaling
Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:
1. The interrupt_mask must not be used to determine whether to signal
on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
*after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
versions of Hyper-V that don't implement this feature will poll.
Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
---
drivers/hv/ring_buffer.c | 67 +++++++++++++++++++++++++++++-------------------
1 file changed, 40 insertions(+), 27 deletions(-)
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 50e071444a5c..d27cbb66279c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -417,13 +417,25 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
}
EXPORT_SYMBOL_GPL(__hv_pkt_iter_next);
+/* How many bytes were read in this iterator cycle */
+static u32 hv_pkt_iter_bytes_read(const struct hv_ring_buffer_info *rbi)
+{
+ if (rbi->ring_buffer->read_index < rbi->priv_read_index)
+ return rbi->priv_read_index - rbi->ring_buffer->read_index;
+ else
+ return rbi->ring_datasize -
+ rbi->ring_buffer->read_index + rbi->priv_read_index;
+}
+
/*
* Update host ring buffer after iterating over packets.
*/
void hv_pkt_iter_close(struct vmbus_channel *channel)
{
struct hv_ring_buffer_info *rbi = &channel->inbound;
- u32 orig_write_sz = hv_get_bytes_to_write(rbi);
+ u32 curr_write_sz;
+ u32 delta = hv_pkt_iter_bytes_read(rbi);
+ u32 pending_sz;
/*
* Make sure all reads are done before we update the read index since
@@ -433,39 +445,40 @@ void hv_pkt_iter_close(struct vmbus_channel *channel)
virt_rmb();
rbi->ring_buffer->read_index = rbi->priv_read_index;
+ /* Older versions do not require signalling */
+ if (!rbi->ring_buffer->feature_bits.feat_pending_send_sz)
+ return;
+
/*
- * Issue a full memory barrier before making the signaling decision.
- * Here is the reason for having this barrier:
- * If the reading of the pend_sz (in this function)
- * were to be reordered and read before we commit the new read
- * index (in the calling function) we could
- * have a problem. If the host were to set the pending_sz after we
- * have sampled pending_sz and go to sleep before we commit the
- * read index, we could miss sending the interrupt. Issue a full
- * memory barrier to address this.
+ * If the reading of the pend_sz were to be reordered and read
+ * before we commit the new read index then we would have a
+ * problem. If the host were to set the pending_sz after we
+ * have sampled pending_sz and go to sleep before we commit
+ * the read index, we could miss sending the interrupt.
*/
virt_mb();
- /* If host has disabled notifications then skip */
- if (rbi->ring_buffer->interrupt_mask)
- return;
+ pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
- if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
- u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
+ /*
+ * Ensure the read of write_index in hv_get_bytes_to_write()
+ * happens after the read of pending_send_sz.
+ */
+ virt_rmb();
+ curr_write_sz = hv_get_bytes_to_write(rbi);
- /*
- * If there was space before we began iteration,
- * then host was not blocked. Also handles case where
- * pending_sz is zero then host has nothing pending
- * and does not need to be signaled.
- */
- if (orig_write_sz > pending_sz)
- return;
+ /*
+ * If there was space before we began iteration,
+ * then host was not blocked. Also handles case where
+ * pending_sz is zero then host has nothing pending and does
+ * not need to be signaled.
+ */
+ if (curr_write_sz - delta > pending_sz)
+ return;
- /* If pending write will not fit, don't give false hope. */
- if (hv_get_bytes_to_write(rbi) < pending_sz)
- return;
- }
+ /* If pending write will not fit, don't give false hope. */
+ if (curr_write_sz <= pending_sz)
+ return;
vmbus_setevent(channel);
}
--
2.15.1
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2018-02-14 18:12 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-10 20:48 [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling Michael Kelley
2018-02-11 23:50 ` Stephen Hemminger
2018-02-12 1:14 ` KY Srinivasan
2018-02-12 5:13 ` Michael Kelley (EOSG)
2018-02-13 17:34 ` Stephen Hemminger
2018-02-14 2:12 ` Michael Kelley (EOSG)
2018-02-14 18:12 ` Stephen Hemminger
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.