From: Cristian Marussi <cristian.marussi@arm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, sudeep.holla@arm.com,
	james.quinlan@broadcom.com, Jonathan.Cameron@huawei.com,
	f.fainelli@gmail.com, etienne.carriere@linaro.org,
	vincent.guittot@linaro.org, souvik.chakravarty@arm.com,
	igor.skalkin@opensynergy.com,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
Date: Sun, 23 Jan 2022 22:45:54 +0000
Message-ID: <20220123224554.GG6113@e120937-lin>
In-Reply-To: <20220123172950-mutt-send-email-mst@kernel.org>

On Sun, Jan 23, 2022 at 05:40:08PM -0500, Michael S. Tsirkin wrote:
> On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> > I was thinking...keeping the current virtqueue_poll() interface: our
> > possible issue arises from the used_index wrapping around exactly on
> > top of the same polled index, and currently the API returns an
> > unsigned "opaque" value that really carries just the 16-bit index
> > (plus, possibly, the wrap bit as bit 15 for packed vqs) and is meant
> > to be fed back as-is to virtqueue_poll()...
> > 
> > ...so why don't we just keep an internal, full-fledged per-virtqueue
> > wrap counter and return it as the upper 16 bits of the opaque value
> > returned by virtqueue_enable_cb_prepare(), then check it back in
> > virtqueue_poll() when the opaque is fed in? (filtering it out before
> > it reaches the internal helpers)
> > 
> > As in the example below the scissors.
> > 
> > I mean, if the internal wrap count at that point differs from the one
> > previously handed to virtqueue_poll() via the opaque poll_idx value,
> > there is certainly something new to fetch without even looking at the
> > indexes; at the same time, exposing an opaque index built as
> > (wraps << 16 | idx) implicitly 'binds' each index to a specific
> > wrap-iteration, so they can be distinguished (...ok, until the upper
> > 16-bit wrap count wraps too...but...)
> > 
> > I am not extremely familiar with the internals of virtio, so I could
> > be missing something obvious...feel free to insult me :P
> > 
> > (...and I have not made any perf measurements or considerations at
> > this point...nor addressed the redundancy with the existing packed
> > used_wrap_counter bit...)
> > 
> > Thanks,
> > Cristian
> > 
> > ----
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 00f64f2f8b72..bda6af121cd7 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -117,6 +117,8 @@ struct vring_virtqueue {
> >         /* Last used index we've seen. */
> >         u16 last_used_idx;
> >  
> > +       u16 wraps;
> > +
> >         /* Hint for event idx: already triggered no need to disable. */
> >         bool event_triggered;
> >  
> > @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> >         ret = vq->split.desc_state[i].data;
> >         detach_buf_split(vq, i, ctx);
> >         vq->last_used_idx++;
> > +       if (unlikely(!vq->last_used_idx))
> > +               vq->wraps++;
> 
> I wonder whether
>                vq->wraps += !vq->last_used_idx;
> is faster or slower. No branch but OTOH a dependency.
> 
> 
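Not sure offhand; for reference, these are the two variants side by
side (just a sketch to spell out the tradeoff, nothing measured yet):

        /* Branchy form: unlikely(), almost-never-taken branch. */
        vq->last_used_idx++;
        if (unlikely(!vq->last_used_idx))
                vq->wraps++;

        /* Branchless form: no branch to mispredict, but the add
         * carries a data dependency on the freshly incremented
         * last_used_idx. */
        vq->last_used_idx++;
        vq->wraps += !vq->last_used_idx;
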
> >         /* If we expect an interrupt for the next entry, tell host
> >          * by writing event index and flush out the write before
> >          * the read in the next get_buf call. */
> > @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> >         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> >                 vq->last_used_idx -= vq->packed.vring.num;
> >                 vq->packed.used_wrap_counter ^= 1;
> > +               vq->wraps++;
> >         }
> >  
> >         /*
> > @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->packed_ring = true;
> > @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> >   */
> >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> >  {
> > +       unsigned last_used_idx;
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >  
> >         if (vq->event_triggered)
> >                 vq->event_triggered = false;
> >  
> > -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > -                                virtqueue_enable_cb_prepare_split(_vq);
> > +       last_used_idx = vq->packed_ring ?
> > +                       virtqueue_enable_cb_prepare_packed(_vq) :
> > +                       virtqueue_enable_cb_prepare_split(_vq);
> > +
> > +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >  
> > @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> >         if (unlikely(vq->broken))
> >                 return false;
> >  
> > +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> > +               return true;
> > +
> >         virtio_mb(vq->weak_barriers);
> > -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> > -                                virtqueue_poll_split(_vq, last_used_idx);
> > +       return vq->packed_ring ?
> > +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> > +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_poll);
> >  
> > @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->use_dma_api = vring_use_dma_api(vdev);
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 476d3e5c0fe7..e6b03017ebd7 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -77,6 +77,17 @@
> >   */
> >  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
> >  
> > +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> > +#define VRING_GET_IDX(opaque)                          \
> > +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > +
> > +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> > +#define VRING_GET_WRAPS(opaque)                                \
> > +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > +
> > +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> > +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > +
> >  /* We support indirect buffer descriptors */
> >  #define VIRTIO_RING_F_INDIRECT_DESC    28
> 
> Yeah, I think this patch increases the time it takes to wrap around
> from 2^16 to 2^32 used buffers, which seems good enough.
> It needs some comments to explain the logic, though.
> Would be interesting to see perf data.
> 
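Right - and as a back-of-the-envelope check (assuming a high
consumption rate of 10^6 used buffers per second): a bare 16-bit index
can alias again after as little as 2^16 / 10^6 s ~= 65 ms of polling,
while the combined 32-bit opaque takes 2^32 / 10^6 s ~= 71 minutes to
do the same.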

Thanks for your feedback!

I'll try to gather some perf data on this in the next few days
(and eventually clean it up and add comments if it looks good enough...)
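
For reference, this is roughly how I'd expect a polling caller to
consume the opaque value (a sketch only, layered on this diff; the
'deadline' timeout below is hypothetical and up to the caller):

        unsigned opaque;
        unsigned int len;
        void *buf;

        /* Snapshot the ring state: the opaque embeds both the
         * 16-bit last_used index and the 16-bit wrap count. */
        opaque = virtqueue_enable_cb_prepare(vq);

        /* Busy-poll until there is something new to fetch; callers
         * stay unchanged since the value remains opaque to them. */
        while (!virtqueue_poll(vq, opaque)) {
                if (time_is_before_jiffies(deadline))
                        break;  /* timed out, give up polling */
                cpu_relax();
        }

        virtqueue_disable_cb(vq);
        buf = virtqueue_get_buf(vq, &len);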

Thanks,
Cristian

