* Re: [PATCH] virtio_ring: fix packed ring event may missing
       [not found] <20191021171004.18729-1-yong.liu@intel.com>
@ 2019-10-22  2:44 ` Jason Wang
       [not found]   ` <86228AFD5BCD8E4EBFD2B90117B5E81E633D74EF@SHSMSX103.ccr.corp.intel.com>
  2019-10-25  9:32 ` Michael S. Tsirkin
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2019-10-22  2:44 UTC (permalink / raw)
  To: Marvin Liu, mst, tiwei.bie; +Cc: virtualization

On 2019/10/22 上午1:10, Marvin Liu wrote:
> When callback is delayed, virtio expect that vhost will kick when
> rolling over event offset. Recheck should be taken as used index may
> exceed event offset between status check and driver event update.
>
> However, it is possible that flags was not modified if descriptors are
> chained or in_order feature was negotiated. So flags at event offset
> may not be valid for descriptor's status checking. Fix it by using last
> used index as replacement. Tx queue will be stopped if there's not
> enough freed buffers after recheck.
>
> Signed-off-by: Marvin Liu <yong.liu@intel.com>
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index bdc08244a648..a8041e451e9e 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>  	 * counter first before updating event flags.
>  	 */
>  	virtio_wmb(vq->weak_barriers);
> -	} else {
> -		used_idx = vq->last_used_idx;
> -		wrap_counter = vq->packed.used_wrap_counter;
>  	}
>
>  	if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) {
> @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>  	 */
>  	virtio_mb(vq->weak_barriers);
>
> -	if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
> +	if (is_used_desc_packed(vq,
> +				vq->last_used_idx,
> +				vq->packed.used_wrap_counter)) {
>  		END_USE(vq);
>  		return false;
>  	}

Hi Marvin:

Two questions:

1) Do we support IN_ORDER in kernel driver?

2) Should we check IN_ORDER in this case otherwise we may end up with
interrupt storm when IN_ORDER is not negotiated?

Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 9+ messages in thread
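As background for the thread below: the check both the old and the new code rely on, is_used_desc_packed(), decides whether a packed-ring slot has been consumed by decoding its flags field against the driver's used wrap counter. A minimal standalone sketch of that decoding (bit positions follow the virtio 1.1 packed-ring layout; the function name and shape here are illustrative, not copied from the kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VRING_PACKED_DESC_F_AVAIL 7
#define VRING_PACKED_DESC_F_USED  15

/* A packed-ring slot is "used" when its AVAIL and USED bits agree with
 * each other and with the driver's current used wrap counter. */
static bool is_used_flags(uint16_t flags, bool used_wrap_counter)
{
	bool avail = !!(flags & (1u << VRING_PACKED_DESC_F_AVAIL));
	bool used  = !!(flags & (1u << VRING_PACKED_DESC_F_USED));

	return avail == used && used == used_wrap_counter;
}
```

The wrap counter is what lets the same two bits mean "available" on one pass around the ring and "used" on the next: the device makes a slot used by writing both bits equal to the driver's current counter, and the counter flips each time the driver's poll position wraps past the ring end.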
[parent not found: <86228AFD5BCD8E4EBFD2B90117B5E81E633D74EF@SHSMSX103.ccr.corp.intel.com>]
* Re: [PATCH] virtio_ring: fix packed ring event may missing [not found] ` <86228AFD5BCD8E4EBFD2B90117B5E81E633D74EF@SHSMSX103.ccr.corp.intel.com> @ 2019-10-22 13:05 ` Jason Wang [not found] ` <86228AFD5BCD8E4EBFD2B90117B5E81E633DA298@SHSMSX103.ccr.corp.intel.com> 0 siblings, 1 reply; 9+ messages in thread From: Jason Wang @ 2019-10-22 13:05 UTC (permalink / raw) To: Liu, Yong, mst, Bie, Tiwei; +Cc: virtualization On 2019/10/22 下午2:48, Liu, Yong wrote: > Hi Jason, > My answers are inline. > >> -----Original Message----- >> From: Jason Wang [mailto:jasowang@redhat.com] >> Sent: Tuesday, October 22, 2019 10:45 AM >> To: Liu, Yong <yong.liu@intel.com>; mst@redhat.com; Bie, Tiwei >> <tiwei.bie@intel.com> >> Cc: virtualization@lists.linux-foundation.org >> Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing >> >> >> On 2019/10/22 上午1:10, Marvin Liu wrote: >>> When callback is delayed, virtio expect that vhost will kick when >>> rolling over event offset. Recheck should be taken as used index may >>> exceed event offset between status check and driver event update. >>> >>> However, it is possible that flags was not modified if descriptors are >>> chained or in_order feature was negotiated. So flags at event offset >>> may not be valid for descriptor's status checking. Fix it by using last >>> used index as replacement. Tx queue will be stopped if there's not >>> enough freed buffers after recheck. >>> >>> Signed-off-by: Marvin Liu <yong.liu@intel.com> >>> >>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >>> index bdc08244a648..a8041e451e9e 100644 >>> --- a/drivers/virtio/virtio_ring.c >>> +++ b/drivers/virtio/virtio_ring.c >>> @@ -1499,9 +1499,6 @@ static bool >> virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) >>> * counter first before updating event flags. 
>>> */ >>> virtio_wmb(vq->weak_barriers); >>> - } else { >>> - used_idx = vq->last_used_idx; >>> - wrap_counter = vq->packed.used_wrap_counter; >>> } >>> >>> if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) >> { >>> @@ -1518,7 +1515,9 @@ static bool >> virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) >>> */ >>> virtio_mb(vq->weak_barriers); >>> >>> - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { >>> + if (is_used_desc_packed(vq, >>> + vq->last_used_idx, >>> + vq->packed.used_wrap_counter)) { >>> END_USE(vq); >>> return false; >>> } >> >> Hi Marvin: >> >> Two questions: >> >> 1) Do we support IN_ORDER in kernel driver? >> > Not support by now. But issue still can be possible if in_direct disabled and meanwhile descs are chained. > Due to packed ring desc status should check one by one, chose arbitrary position may cause issue. Right, then it's better to mention IN_ORDER as future features. > >> 2) Should we check IN_ORDER in this case otherwise we may end up with >> interrupt storm when IN_ORDER is not negotiated? > Interrupt number will not increase here, event offset value calculated as before. > Here just recheck whether new used descs is enough for next around xmit. > If backend was slow, most likely Tx queue will sleep for a while until used index go over event offset. Ok, but what if the backend is almost as fast as guest driver? E.g in virtio-net we had: if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { netif_stop_subqueue(dev, qnum); if (!use_napi && unlikely(!virtqueue_enable_cb_delayed(sq->vq))) { /* More just got used, free them then recheck. */ free_old_xmit_skbs(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); virtqueue_disable_cb(sq->vq); } } } I worry that we may end up with toggling queue state in the case (sq->vq->num_free is near 2 + MAX_SKB_FRAGS). It looks to me the correct thing to implement is to calculate the head descriptor of a chain that sits at 3/4. 
Thanks > > Thanks, > Marvin > >> Thanks >> _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 9+ messages in thread
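The 3/4 threshold Jason refers to comes from the EVENT_IDX branch of virtqueue_enable_cb_delayed_packed(), which places the event offset part-way through the outstanding descriptors and flips the wrap counter when the offset crosses the end of the ring. A hedged sketch of that index arithmetic (the struct and field names here are illustrative stand-ins, not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ring_state {
	uint16_t num;            /* ring size */
	uint16_t num_free;       /* free descriptor slots */
	uint16_t last_used_idx;  /* next slot the driver will poll */
	bool     used_wrap_counter;
};

/* Place the event offset 3/4 of the way through the outstanding
 * descriptors; wrapping past the ring end flips the wrap counter. */
static uint16_t event_offset(const struct ring_state *rs, bool *wrap)
{
	uint16_t bufs = (uint16_t)((rs->num - rs->num_free) * 3 / 4);
	uint16_t idx  = (uint16_t)(rs->last_used_idx + bufs);

	*wrap = rs->used_wrap_counter;
	if (idx >= rs->num) {
		idx = (uint16_t)(idx - rs->num);
		*wrap = !*wrap;
	}
	return idx;
}
```

The thread's dispute is about the recheck that follows this arithmetic, not the arithmetic itself: when buffers span multi-descriptor chains, the slot this computation picks may never have its flags rewritten by the device.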
[parent not found: <86228AFD5BCD8E4EBFD2B90117B5E81E633DA298@SHSMSX103.ccr.corp.intel.com>]
* Re: [PATCH] virtio_ring: fix packed ring event may missing [not found] ` <86228AFD5BCD8E4EBFD2B90117B5E81E633DA298@SHSMSX103.ccr.corp.intel.com> @ 2019-10-24 3:50 ` Jason Wang 2019-10-27 9:54 ` Michael S. Tsirkin 0 siblings, 1 reply; 9+ messages in thread From: Jason Wang @ 2019-10-24 3:50 UTC (permalink / raw) To: Liu, Yong, mst, Bie, Tiwei; +Cc: virtualization On 2019/10/24 上午11:26, Liu, Yong wrote: > >> -----Original Message----- >> From: Jason Wang [mailto:jasowang@redhat.com] >> Sent: Tuesday, October 22, 2019 9:06 PM >> To: Liu, Yong <yong.liu@intel.com>; mst@redhat.com; Bie, Tiwei >> <tiwei.bie@intel.com> >> Cc: virtualization@lists.linux-foundation.org >> Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing >> >> >> On 2019/10/22 下午2:48, Liu, Yong wrote: >>> Hi Jason, >>> My answers are inline. >>> >>>> -----Original Message----- >>>> From: Jason Wang [mailto:jasowang@redhat.com] >>>> Sent: Tuesday, October 22, 2019 10:45 AM >>>> To: Liu, Yong <yong.liu@intel.com>; mst@redhat.com; Bie, Tiwei >>>> <tiwei.bie@intel.com> >>>> Cc: virtualization@lists.linux-foundation.org >>>> Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing >>>> >>>> >>>> On 2019/10/22 上午1:10, Marvin Liu wrote: >>>>> When callback is delayed, virtio expect that vhost will kick when >>>>> rolling over event offset. Recheck should be taken as used index may >>>>> exceed event offset between status check and driver event update. >>>>> >>>>> However, it is possible that flags was not modified if descriptors are >>>>> chained or in_order feature was negotiated. So flags at event offset >>>>> may not be valid for descriptor's status checking. Fix it by using last >>>>> used index as replacement. Tx queue will be stopped if there's not >>>>> enough freed buffers after recheck. 
>>>>> >>>>> Signed-off-by: Marvin Liu <yong.liu@intel.com> >>>>> >>>>> diff --git a/drivers/virtio/virtio_ring.c >> b/drivers/virtio/virtio_ring.c >>>>> index bdc08244a648..a8041e451e9e 100644 >>>>> --- a/drivers/virtio/virtio_ring.c >>>>> +++ b/drivers/virtio/virtio_ring.c >>>>> @@ -1499,9 +1499,6 @@ static bool >>>> virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) >>>>> * counter first before updating event flags. >>>>> */ >>>>> virtio_wmb(vq->weak_barriers); >>>>> - } else { >>>>> - used_idx = vq->last_used_idx; >>>>> - wrap_counter = vq->packed.used_wrap_counter; >>>>> } >>>>> >>>>> if (vq->packed.event_flags_shadow == >> VRING_PACKED_EVENT_FLAG_DISABLE) >>>> { >>>>> @@ -1518,7 +1515,9 @@ static bool >>>> virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) >>>>> */ >>>>> virtio_mb(vq->weak_barriers); >>>>> >>>>> - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { >>>>> + if (is_used_desc_packed(vq, >>>>> + vq->last_used_idx, >>>>> + vq->packed.used_wrap_counter)) { >>>>> END_USE(vq); >>>>> return false; >>>>> } >>>> Hi Marvin: >>>> >>>> Two questions: >>>> >>>> 1) Do we support IN_ORDER in kernel driver? >>>> >>> Not support by now. But issue still can be possible if in_direct disabled >> and meanwhile descs are chained. >>> Due to packed ring desc status should check one by one, chose arbitrary >> position may cause issue. >> >> >> Right, then it's better to mention IN_ORDER as future features. >> >> >>>> 2) Should we check IN_ORDER in this case otherwise we may end up with >>>> interrupt storm when IN_ORDER is not negotiated? >>> Interrupt number will not increase here, event offset value calculated as >> before. >>> Here just recheck whether new used descs is enough for next around xmit. >>> If backend was slow, most likely Tx queue will sleep for a while until >> used index go over event offset. >> >> >> Ok, but what if the backend is almost as fast as guest driver? 
E.g in >> virtio-net we had: >> >> if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { >> netif_stop_subqueue(dev, qnum); >> if (!use_napi && >> unlikely(!virtqueue_enable_cb_delayed(sq->vq))) { >> /* More just got used, free them then recheck. */ >> free_old_xmit_skbs(sq, false); >> if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { >> netif_start_subqueue(dev, qnum); >> virtqueue_disable_cb(sq->vq); >> } >> } >> } >> >> I worry that we may end up with toggling queue state in the case >> (sq->vq->num_free is near 2 + MAX_SKB_FRAGS). >> > Yes, at this worst case each packet will add extra twice event flags write. Due to backend only read this value, the cost won't too much. For driver, it means extra overheads, atomics, less batching, stats updating etc. For backend, cacheline will bounce between two cpus. > Even we can track down chained descs and figure out whether event offset indexed desc is used. There's still possibility that flags is invalid. > One case is that backend can buffer multiple descs by not updating the first one. We cannot guarantee that later flags is usable until check from the first one. In this case, since we've stopped tx queue, so there's no new buffers added. It doesn't matter we get notified when the 3/4 or all of the descriptors has been used. Thanks > > Regards, > Marvin > >> It looks to me the correct thing to implement is to calculate the head >> descriptor of a chain that sits at 3/4. >> >> Thanks >> >> >>> Thanks, >>> Marvin >>> >>>> Thanks >>>> _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 9+ messages in thread
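The chained-descriptor hazard discussed above follows from how used descriptors are written back: per the packed-ring spec, the device writes a single used descriptor per buffer, even when the buffer occupied a multi-descriptor chain, so the remaining slots of the chain keep their stale driver-written flags. A toy illustration (bit positions match the spec; everything else is made up for the demonstration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F_AVAIL (1u << 7)   /* VRING_PACKED_DESC_F_AVAIL */
#define F_USED  (1u << 15)  /* VRING_PACKED_DESC_F_USED */

enum { CHAIN = 3 };

/* Device side: completing a buffer writes back exactly one used
 * descriptor, at the chain head; the rest of the chain is skipped. */
static void device_complete_chain(uint16_t flags[], uint16_t head)
{
	flags[head] = F_AVAIL | F_USED;  /* first ring pass: wrap counter 1 */
}

/* Driver side: the used test, as in is_used_desc_packed(). */
static bool slot_is_used(uint16_t f, bool wrap)
{
	bool avail = !!(f & F_AVAIL), used = !!(f & F_USED);
	return avail == used && used == wrap;
}
```

Polling a mid-chain slot therefore never observes "used", which is why an event offset landing inside a chain can miss the roll-over entirely.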
* Re: [PATCH] virtio_ring: fix packed ring event may missing 2019-10-24 3:50 ` Jason Wang @ 2019-10-27 9:54 ` Michael S. Tsirkin 0 siblings, 0 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2019-10-27 9:54 UTC (permalink / raw) To: Jason Wang; +Cc: Liu, Yong, virtualization On Thu, Oct 24, 2019 at 11:50:51AM +0800, Jason Wang wrote: > > On 2019/10/24 上午11:26, Liu, Yong wrote: > > > > > -----Original Message----- > > > From: Jason Wang [mailto:jasowang@redhat.com] > > > Sent: Tuesday, October 22, 2019 9:06 PM > > > To: Liu, Yong <yong.liu@intel.com>; mst@redhat.com; Bie, Tiwei > > > <tiwei.bie@intel.com> > > > Cc: virtualization@lists.linux-foundation.org > > > Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing > > > > > > > > > On 2019/10/22 下午2:48, Liu, Yong wrote: > > > > Hi Jason, > > > > My answers are inline. > > > > > > > > > -----Original Message----- > > > > > From: Jason Wang [mailto:jasowang@redhat.com] > > > > > Sent: Tuesday, October 22, 2019 10:45 AM > > > > > To: Liu, Yong <yong.liu@intel.com>; mst@redhat.com; Bie, Tiwei > > > > > <tiwei.bie@intel.com> > > > > > Cc: virtualization@lists.linux-foundation.org > > > > > Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing > > > > > > > > > > > > > > > On 2019/10/22 上午1:10, Marvin Liu wrote: > > > > > > When callback is delayed, virtio expect that vhost will kick when > > > > > > rolling over event offset. Recheck should be taken as used index may > > > > > > exceed event offset between status check and driver event update. > > > > > > > > > > > > However, it is possible that flags was not modified if descriptors are > > > > > > chained or in_order feature was negotiated. So flags at event offset > > > > > > may not be valid for descriptor's status checking. Fix it by using last > > > > > > used index as replacement. Tx queue will be stopped if there's not > > > > > > enough freed buffers after recheck. 
> > > > > > > > > > > > Signed-off-by: Marvin Liu <yong.liu@intel.com> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c > > > b/drivers/virtio/virtio_ring.c > > > > > > index bdc08244a648..a8041e451e9e 100644 > > > > > > --- a/drivers/virtio/virtio_ring.c > > > > > > +++ b/drivers/virtio/virtio_ring.c > > > > > > @@ -1499,9 +1499,6 @@ static bool > > > > > virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > > > > > * counter first before updating event flags. > > > > > > */ > > > > > > virtio_wmb(vq->weak_barriers); > > > > > > - } else { > > > > > > - used_idx = vq->last_used_idx; > > > > > > - wrap_counter = vq->packed.used_wrap_counter; > > > > > > } > > > > > > > > > > > > if (vq->packed.event_flags_shadow == > > > VRING_PACKED_EVENT_FLAG_DISABLE) > > > > > { > > > > > > @@ -1518,7 +1515,9 @@ static bool > > > > > virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > > > > > */ > > > > > > virtio_mb(vq->weak_barriers); > > > > > > > > > > > > - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > > > > > > + if (is_used_desc_packed(vq, > > > > > > + vq->last_used_idx, > > > > > > + vq->packed.used_wrap_counter)) { > > > > > > END_USE(vq); > > > > > > return false; > > > > > > } > > > > > Hi Marvin: > > > > > > > > > > Two questions: > > > > > > > > > > 1) Do we support IN_ORDER in kernel driver? > > > > > > > > > Not support by now. But issue still can be possible if in_direct disabled > > > and meanwhile descs are chained. > > > > Due to packed ring desc status should check one by one, chose arbitrary > > > position may cause issue. > > > > > > > > > Right, then it's better to mention IN_ORDER as future features. > > > > > > > > > > > 2) Should we check IN_ORDER in this case otherwise we may end up with > > > > > interrupt storm when IN_ORDER is not negotiated? > > > > Interrupt number will not increase here, event offset value calculated as > > > before. 
> > > > Here just recheck whether new used descs is enough for next around xmit. > > > > If backend was slow, most likely Tx queue will sleep for a while until > > > used index go over event offset. > > > > > > > > > Ok, but what if the backend is almost as fast as guest driver? E.g in > > > virtio-net we had: > > > > > > if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { > > > netif_stop_subqueue(dev, qnum); > > > if (!use_napi && > > > unlikely(!virtqueue_enable_cb_delayed(sq->vq))) { > > > /* More just got used, free them then recheck. */ > > > free_old_xmit_skbs(sq, false); > > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { > > > netif_start_subqueue(dev, qnum); > > > virtqueue_disable_cb(sq->vq); > > > } > > > } > > > } > > > > > > I worry that we may end up with toggling queue state in the case > > > (sq->vq->num_free is near 2 + MAX_SKB_FRAGS). > > > > > Yes, at this worst case each packet will add extra twice event flags write. Due to backend only read this value, the cost won't too much. > > > For driver, it means extra overheads, atomics, less batching, stats updating > etc. For backend, cacheline will bounce between two cpus. > > > > Even we can track down chained descs and figure out whether event offset indexed desc is used. There's still possibility that flags is invalid. > > One case is that backend can buffer multiple descs by not updating the first one. We cannot guarantee that later flags is usable until check from the first one. > > > In this case, since we've stopped tx queue, so there's no new buffers added. > It doesn't matter we get notified when the 3/4 or all of the descriptors has > been used. > > Thanks Well - checking the next descriptor will likely result in moving the event index forward, which will thinkably reduce the # of interrupts. So it's hard to predict which is better. I'll apply the patch for now as it's simple and safe. If someone has the time to work on tuning all this, that would be great. 
> > > > > Regards, > > Marvin > > > > > It looks to me the correct thing to implement is to calculate the head > > > descriptor of a chain that sits at 3/4. > > > > > > Thanks > > > > > > > > > > Thanks, > > > > Marvin > > > > > > > > > Thanks > > > > > _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] virtio_ring: fix packed ring event may missing [not found] <20191021171004.18729-1-yong.liu@intel.com> 2019-10-22 2:44 ` [PATCH] virtio_ring: fix packed ring event may missing Jason Wang @ 2019-10-25 9:32 ` Michael S. Tsirkin 2019-10-27 9:09 ` Michael S. Tsirkin 2019-10-25 10:53 ` Michael S. Tsirkin 2019-10-27 9:51 ` Michael S. Tsirkin 3 siblings, 1 reply; 9+ messages in thread From: Michael S. Tsirkin @ 2019-10-25 9:32 UTC (permalink / raw) To: Marvin Liu; +Cc: virtualization On Tue, Oct 22, 2019 at 01:10:04AM +0800, Marvin Liu wrote: > When callback is delayed, virtio expect that vhost will kick when > rolling over event offset. Recheck should be taken as used index may > exceed event offset between status check and driver event update. > > However, it is possible that flags was not modified if descriptors are > chained or in_order feature was negotiated. So flags at event offset This mention of event offset I don't understand: your patch only affects code that runs when !event. So how can it affect event offset? > may not be valid for descriptor's status checking. Fix it by using last > used index as replacement. Tx queue will be stopped if there's not > enough freed buffers after recheck. > > Signed-off-by: Marvin Liu <yong.liu@intel.com> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index bdc08244a648..a8041e451e9e 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > * counter first before updating event flags. > */ > virtio_wmb(vq->weak_barriers); > - } else { > - used_idx = vq->last_used_idx; > - wrap_counter = vq->packed.used_wrap_counter; > } Is all this theorectical? Or did you actually see a problem and then fixed it? Because as far as I could see after this patch and with event index off, used_idx and wrap_counter will be used without being initialized. 
OTOH the behaviour with event index on is completely unaffected. > > if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { OK so trying to unpack the scenario. First you patch only affects code running when EVENT_IDX is off, so legal values for flags are enable and disable. Next point, this calculates the index at which we are going to look for the flags to change, in other words it affects the line if (is_used_desc_packed(vq, used_idx, wrap_counter)) { below. Without your patch, we simply look at the next descriptor. This is exactly what the spec says we should do: Writes of device and driver descriptors can generally be reordered, but each side (driver and device) are only required to poll (or test) a single location in memory: the next device descriptor after the one they processed previously, in circular order. > @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > */ > virtio_mb(vq->weak_barriers); > > - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > + if (is_used_desc_packed(vq, > + vq->last_used_idx, > + vq->packed.used_wrap_counter)) { > END_USE(vq); > return false; > } > -- > 2.17.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] virtio_ring: fix packed ring event may missing 2019-10-25 9:32 ` Michael S. Tsirkin @ 2019-10-27 9:09 ` Michael S. Tsirkin 0 siblings, 0 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2019-10-27 9:09 UTC (permalink / raw) To: Marvin Liu; +Cc: virtualization On Fri, Oct 25, 2019 at 05:32:49AM -0400, Michael S. Tsirkin wrote: > On Tue, Oct 22, 2019 at 01:10:04AM +0800, Marvin Liu wrote: > > When callback is delayed, virtio expect that vhost will kick when > > rolling over event offset. Recheck should be taken as used index may > > exceed event offset between status check and driver event update. > > > > However, it is possible that flags was not modified if descriptors are > > chained or in_order feature was negotiated. So flags at event offset > > This mention of event offset I don't understand: your patch > only affects code that runs when !event. So how can it > affect event offset? > > > > > may not be valid for descriptor's status checking. Fix it by using last > > used index as replacement. Tx queue will be stopped if there's not > > enough freed buffers after recheck. > > > > Signed-off-by: Marvin Liu <yong.liu@intel.com> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index bdc08244a648..a8041e451e9e 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > * counter first before updating event flags. > > */ > > virtio_wmb(vq->weak_barriers); > > - } else { > > - used_idx = vq->last_used_idx; > > - wrap_counter = vq->packed.used_wrap_counter; > > } > > > Is all this theorectical? Or did you actually see a problem > and then fixed it? > Because as far as I could see after this patch and with > event index off, used_idx and wrap_counter will be used > without being initialized. > > OTOH the behaviour with event index on is completely unaffected. 
> > > > > > if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { > > OK so trying to unpack the scenario. > > First you patch only affects code running when EVENT_IDX is off, so > legal values for flags are enable and disable. > > > Next point, this calculates the index at which we are going > to look for the flags to change, in other words > it affects the line > if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > below. > > Without your patch, we simply look at the next descriptor. > This is exactly what the spec says we should do: > > Writes of device and driver descriptors can generally be > reordered, but each side (driver and device) are only required to > poll (or test) a single location in memory: the next device descriptor after > the one they processed previously, in circular order. > OK please ignore all this, I misunderstood the patch. Sorry about the noise. > > > > > > @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > */ > > virtio_mb(vq->weak_barriers); > > > > - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > > + if (is_used_desc_packed(vq, > > + vq->last_used_idx, > > + vq->packed.used_wrap_counter)) { > > END_USE(vq); > > return false; > > } > > -- > > 2.17.1 > _______________________________________________ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] virtio_ring: fix packed ring event may missing [not found] <20191021171004.18729-1-yong.liu@intel.com> 2019-10-22 2:44 ` [PATCH] virtio_ring: fix packed ring event may missing Jason Wang 2019-10-25 9:32 ` Michael S. Tsirkin @ 2019-10-25 10:53 ` Michael S. Tsirkin 2019-10-27 9:12 ` Michael S. Tsirkin 2019-10-27 9:51 ` Michael S. Tsirkin 3 siblings, 1 reply; 9+ messages in thread From: Michael S. Tsirkin @ 2019-10-25 10:53 UTC (permalink / raw) To: Marvin Liu; +Cc: virtualization On Tue, Oct 22, 2019 at 01:10:04AM +0800, Marvin Liu wrote: > When callback is delayed, virtio expect that vhost will kick when > rolling over event offset. Recheck should be taken as used index may > exceed event offset between status check and driver event update. > > However, it is possible that flags was not modified if descriptors are > chained or in_order feature was negotiated. So flags at event offset > may not be valid for descriptor's status checking. Fix it by using last > used index as replacement. Tx queue will be stopped if there's not > enough freed buffers after recheck. > > Signed-off-by: Marvin Liu <yong.liu@intel.com> So as I wrote separately, I don't think the patch is correct. And it doesn't look like it was tested properly. However, I do think virtqueue_enable_cb_delayed_packed is implemented incorrectly in case of chained s/g and ring that is close to being full. It's just that chained only happens on OOM at the moment, so it's hard to trigger. virtio_ring.c can be hacked to force chained to make it trigger more. And with napi_tx the default this function is barely used at all. So it's hard to test, and I think in this area we really should include info on how patch was tested before applying. But just theoretically speaking I agree: the trick virtqueue_enable_cb_delayed_packed is using to check that 3/4 of the ring has been used just by looking at a single s/g entry can't work reliably. We generally really need to scan the ring. How much to scan? 
If we want to implement this efficiently, I propose that we track the number of outstanding bufs. Then # of s/g - # of bufs is the max number of entries we need to scan. For the common case of a single s/g this will give exactly 0. Alternatively, we can just scan the whole ring up to the index using the standard logic. virtqueue_enable_cb_delayed_packed is rare enough. > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index bdc08244a648..a8041e451e9e 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > * counter first before updating event flags. > */ > virtio_wmb(vq->weak_barriers); > - } else { > - used_idx = vq->last_used_idx; > - wrap_counter = vq->packed.used_wrap_counter; > } > > if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { > @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > */ > virtio_mb(vq->weak_barriers); > > - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > + if (is_used_desc_packed(vq, > + vq->last_used_idx, > + vq->packed.used_wrap_counter)) { > END_USE(vq); > return false; > } > -- > 2.17.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
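Michael's bookkeeping proposal can be stated concretely: if the driver tracked both the number of outstanding descriptor slots and the number of outstanding buffers, the difference bounds how many slots are non-head chain members carrying no used flag of their own, i.e. how far a scan would have to extend to land on a chain head. A small sketch of that bound plus the head search (purely illustrative; the kernel does not track these counters this way today):

```c
#include <assert.h>
#include <stdint.h>

/* With sg_entries descriptor slots outstanding across bufs buffers,
 * at most (sg_entries - bufs) slots are non-head members of a chain
 * and will never be overwritten with a used flag by the device. */
static uint16_t scan_slack(uint16_t sg_entries, uint16_t bufs)
{
	return (uint16_t)(sg_entries - bufs);
}

/* Given the chain lengths of the outstanding buffers, find the slot
 * index of the chain head that covers (or precedes) a target offset,
 * e.g. the 3/4 mark computed for the delayed callback. */
static uint16_t head_covering(const uint16_t *chain_len, uint16_t nbufs,
			      uint16_t target)
{
	uint16_t idx = 0, b;

	for (b = 0; b < nbufs && idx + chain_len[b] <= target; b++)
		idx = (uint16_t)(idx + chain_len[b]);
	return idx;
}
```

For the common case of single-descriptor buffers, scan_slack() is exactly 0 and head_covering() degenerates to the target index itself, matching Michael's observation.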
* Re: [PATCH] virtio_ring: fix packed ring event may missing 2019-10-25 10:53 ` Michael S. Tsirkin @ 2019-10-27 9:12 ` Michael S. Tsirkin 0 siblings, 0 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2019-10-27 9:12 UTC (permalink / raw) To: Marvin Liu; +Cc: virtualization On Fri, Oct 25, 2019 at 06:53:55AM -0400, Michael S. Tsirkin wrote: > On Tue, Oct 22, 2019 at 01:10:04AM +0800, Marvin Liu wrote: > > When callback is delayed, virtio expect that vhost will kick when > > rolling over event offset. Recheck should be taken as used index may > > exceed event offset between status check and driver event update. > > > > However, it is possible that flags was not modified if descriptors are > > chained or in_order feature was negotiated. So flags at event offset > > may not be valid for descriptor's status checking. Fix it by using last > > used index as replacement. Tx queue will be stopped if there's not > > enough freed buffers after recheck. > > > > Signed-off-by: Marvin Liu <yong.liu@intel.com> > > So as I wrote separately, I don't think the patch is correct. And it > doesn't look like it was tested properly. I was wrong to say this: the patch is correct. Thanks for working on this, and sorry about creating the confusion. It might not be the optimal thing to do but it fixes a real bug, so I think it's a good idea to merge it for now, and implement optimizations on top. > However, I do think > virtqueue_enable_cb_delayed_packed is implemented incorrectly in case of > chained s/g and ring that is close to being full. It's just that > chained only happens on OOM at the moment, so it's hard to trigger. > virtio_ring.c can be hacked to force chained to make it trigger more. > And with napi_tx the default this function is barely used at all. > > So it's hard to test, and I think in this area we really should include > info on how patch was tested before applying. 
> > > But just theoretically speaking I agree: the trick > virtqueue_enable_cb_delayed_packed is using to check that 3/4 of the > ring has been used just by looking at a single s/g entry can't work > reliably. > > We generally really need to scan the ring. How much to scan? If we > want to implement this efficiently, I propose that we track the number > of outstanding bufs. Then # of s/g - # of bufs is the max number of > entries we need to scan. For the common case of a single s/g this will > give exactly 0. > > > Alternatively, we can just scan the whole ring up to the index using the > standard logic. virtqueue_enable_cb_delayed_packed is rare enough. > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index bdc08244a648..a8041e451e9e 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > * counter first before updating event flags. > > */ > > virtio_wmb(vq->weak_barriers); > > - } else { > > - used_idx = vq->last_used_idx; > > - wrap_counter = vq->packed.used_wrap_counter; > > } > > > > if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { > > @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) > > */ > > virtio_mb(vq->weak_barriers); > > > > - if (is_used_desc_packed(vq, used_idx, wrap_counter)) { > > + if (is_used_desc_packed(vq, > > + vq->last_used_idx, > > + vq->packed.used_wrap_counter)) { > > END_USE(vq); > > return false; > > } > > -- > > 2.17.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] virtio_ring: fix packed ring event may missing
  [not found] <20191021171004.18729-1-yong.liu@intel.com>
  ` (2 preceding siblings ...)
  2019-10-25 10:53 ` Michael S. Tsirkin
@ 2019-10-27  9:51 ` Michael S. Tsirkin
  3 siblings, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2019-10-27 9:51 UTC (permalink / raw)
  To: Marvin Liu; +Cc: virtualization

On Tue, Oct 22, 2019 at 01:10:04AM +0800, Marvin Liu wrote:
> When callback is delayed, virtio expect that vhost will kick when
> rolling over event offset. Recheck should be taken as used index may
> exceed event offset between status check and driver event update.
>
> However, it is possible that flags was not modified if descriptors are
> chained or in_order feature was negotiated. So flags at event offset
> may not be valid for descriptor's status checking. Fix it by using last
> used index as replacement. Tx queue will be stopped if there's not
> enough freed buffers after recheck.
>
> Signed-off-by: Marvin Liu <yong.liu@intel.com>

OK I rewrote the commit log slightly:

    When VIRTIO_F_RING_EVENT_IDX is negotiated, virtio devices can use
    virtqueue_enable_cb_delayed_packed to reduce the number of device
    interrupts. At the moment, this is the case for virtio-net when the
    napi_tx module parameter is set to false.

    In this case, the virtio driver selects an event offset in the ring
    and expects that the device will send a notification when rolling
    over the event offset in the ring. However, if this roll-over
    happens before the event suppression structure update, the
    notification won't be sent. To address this race condition, the
    driver needs to check whether the device rolled over this offset
    after updating the event suppression structure.

    With VIRTIO_F_RING_PACKED, the virtio driver did this by reading the
    flags field at the specified offset in the descriptor.

    Unfortunately, checking at the event offset isn't reliable: if
    descriptors are chained (e.g. when INDIRECT is off), not all
    descriptors are overwritten by the device, so it's possible that the
    device skipped the specific descriptor the driver is checking when
    writing out used descriptors. If this happens, the driver won't
    detect the race condition and will incorrectly expect the device to
    send a notification.

    For virtio-net, the result will be a TX queue stall, with
    transmission getting blocked forever.

    With the packed ring, it isn't easy to find a location which is
    guaranteed to change upon the roll-over, except the next device
    descriptor, as described in the spec:

        Writes of device and driver descriptors can generally be
        reordered, but each side (driver and device) are only required
        to poll (or test) a single location in memory: the next device
        descriptor after the one they processed previously, in circular
        order.

    While this might be sub-optimal, let's do exactly this for now.

And applied this. Thanks a lot for working on this, and sorry again for
not understanding the patch originally and thinking it was not tested!

> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index bdc08244a648..a8041e451e9e 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>  	 * counter first before updating event flags.
>  	 */
>  	virtio_wmb(vq->weak_barriers);
> -	} else {
> -		used_idx = vq->last_used_idx;
> -		wrap_counter = vq->packed.used_wrap_counter;
>  	}
>
>  	if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) {
> @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>  	 */
>  	virtio_mb(vq->weak_barriers);
>
> -	if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
> +	if (is_used_desc_packed(vq,
> +				vq->last_used_idx,
> +				vq->packed.used_wrap_counter)) {
>  		END_USE(vq);
>  		return false;
>  	}
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 9+ messages in thread
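[Editorial aside, not part of the archived thread.] To make the spec's "next device descriptor ... in circular order" wording concrete, here is a small model of how a driver advances its poll location, flipping the wrap counter each time the index wraps past the ring size. The struct and function names are illustrative assumptions, not the kernel's:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative driver-side state: where the next device write lands. */
struct poll_state {
	uint16_t last_used_idx;  /* next device descriptor to test */
	bool used_wrap_counter;  /* expected value of the avail/used bits */
	uint16_t num;            /* ring size */
};

/* After consuming one used buffer, move to the next slot in circular
 * order; crossing the end of the ring flips the wrap counter, so the
 * same flag bits read as "not yet used" on the next lap. */
static void advance_poll_location(struct poll_state *s)
{
	if (++s->last_used_idx >= s->num) {
		s->last_used_idx = 0;
		s->used_wrap_counter = !s->used_wrap_counter;
	}
}
```

Under this model, the patched recheck simply tests the slot at last_used_idx against used_wrap_counter — a location the device is guaranteed to write next — instead of the event offset, which a chained-buffer completion may never touch.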
end of thread, other threads:[~2019-10-27  9:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20191021171004.18729-1-yong.liu@intel.com>
2019-10-22  2:44 ` [PATCH] virtio_ring: fix packed ring event may missing Jason Wang
     [not found]   ` <86228AFD5BCD8E4EBFD2B90117B5E81E633D74EF@SHSMSX103.ccr.corp.intel.com>
2019-10-22 13:05     ` Jason Wang
     [not found]       ` <86228AFD5BCD8E4EBFD2B90117B5E81E633DA298@SHSMSX103.ccr.corp.intel.com>
2019-10-24  3:50         ` Jason Wang
2019-10-27  9:54           ` Michael S. Tsirkin
2019-10-25  9:32 ` Michael S. Tsirkin
2019-10-27  9:09   ` Michael S. Tsirkin
2019-10-25 10:53 ` Michael S. Tsirkin
2019-10-27  9:12   ` Michael S. Tsirkin
2019-10-27  9:51 ` Michael S. Tsirkin