From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1C60C433EF for ; Sun, 23 Jan 2022 22:46:02 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240397AbiAWWqC (ORCPT ); Sun, 23 Jan 2022 17:46:02 -0500
Received: from foss.arm.com ([217.140.110.172]:38140 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230077AbiAWWqA (ORCPT ); Sun, 23 Jan 2022 17:46:00 -0500
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 79BB71FB; Sun, 23 Jan 2022 14:45:59 -0800 (PST)
Received: from e120937-lin (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5CD1F3F774; Sun, 23 Jan 2022 14:45:57 -0800 (PST)
Date: Sun, 23 Jan 2022 22:45:54 +0000
From: Cristian Marussi
To: "Michael S. Tsirkin"
Cc: Peter Hilber, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, sudeep.holla@arm.com,
	james.quinlan@broadcom.com, Jonathan.Cameron@huawei.com,
	f.fainelli@gmail.com, etienne.carriere@linaro.org,
	vincent.guittot@linaro.org, souvik.chakravarty@arm.com,
	igor.skalkin@opensynergy.com,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
Message-ID: <20220123224554.GG6113@e120937-lin>
References: <20211220195646.44498-10-cristian.marussi@arm.com>
	<20211221140027.41524-1-cristian.marussi@arm.com>
	<20220119122338.GE6113@e120937-lin>
	<2f1ea794-a0b9-2099-edc0-b2aeb3ca6b92@opensynergy.com>
	<20220120150418-mutt-send-email-mst@kernel.org>
	<20220123200254.GF6113@e120937-lin>
	<20220123172950-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20220123172950-mutt-send-email-mst@kernel.org>
User-Agent: Mutt/1.9.4 (2018-02-28)
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Jan 23, 2022 at 05:40:08PM -0500, Michael S. Tsirkin wrote:
> On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> > I was thinking...keeping the current virtqueue_poll interface, since our
> > possible issue arises from the used_index wrapping around exactly on top
> > of the same polled index and given that currently the API returns an
> > unsigned "opaque" value really carrying just the 16-bit index (and possibly
> > the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> > it is to the virtqueue_poll() function....
> >
> > ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> > and return that as the MSB 16-bit of the opaque value returned by
> > virtqueue_enable_cb_prepare and then check it back in virtqueue_poll when the
> > opaque is fed back ? (filtering it out from the internal helpers machinery)
> >
> > As in the example below the scissors.
> >
> > I mean if the internal wrap count is at that point different from the
> > one provided to virtqueue_poll() via the opaque poll_idx value previously
> > provided, certainly there is something new to fetch without even looking
> > at the indexes: at the same time, exposing an opaque index built as
> > (wraps << 16 | idx) implicitly 'binds' each index to a specific
> > wrap-iteration, so they can be distinguished (..ok until the wrap-count
> > upper 16bit wraps too....but...)
> >
> > I am not really extremely familiar with the internals of virtio so I
> > could be missing something obvious...feel free to insult me :P
> >
> > (..and I have not made any perf measurements or consideration at this
> > point....nor considered the redundancy of the existent packed
> > used_wrap_counter bit...)
> >
> > Thanks,
> > Cristian
> >
> > ----
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 00f64f2f8b72..bda6af121cd7 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -117,6 +117,8 @@ struct vring_virtqueue {
> >  	/* Last used index we've seen. */
> >  	u16 last_used_idx;
> >
> > +	u16 wraps;
> > +
> >  	/* Hint for event idx: already triggered no need to disable. */
> >  	bool event_triggered;
> >
> > @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> >  	ret = vq->split.desc_state[i].data;
> >  	detach_buf_split(vq, i, ctx);
> >  	vq->last_used_idx++;
> > +	if (unlikely(!vq->last_used_idx))
> > +		vq->wraps++;
>
> I wonder whether
> 	vq->wraps += !vq->last_used_idx;
> is faster or slower. No branch but OTOH a dependency.
>
>
> >  	/* If we expect an interrupt for the next entry, tell host
> >  	 * by writing event index and flush out the write before
> >  	 * the read in the next get_buf call. */
> > @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> >  	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> >  		vq->last_used_idx -= vq->packed.vring.num;
> >  		vq->packed.used_wrap_counter ^= 1;
> > +		vq->wraps++;
> >  	}
> >
> >  	/*
> > @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >  	vq->weak_barriers = weak_barriers;
> >  	vq->broken = false;
> >  	vq->last_used_idx = 0;
> > +	vq->wraps = 0;
> >  	vq->event_triggered = false;
> >  	vq->num_added = 0;
> >  	vq->packed_ring = true;
> > @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> >   */
> >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> >  {
> > +	unsigned last_used_idx;
> >  	struct vring_virtqueue *vq = to_vvq(_vq);
> >
> >  	if (vq->event_triggered)
> >  		vq->event_triggered = false;
> >
> > -	return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > -		virtqueue_enable_cb_prepare_split(_vq);
> > +	last_used_idx = vq->packed_ring ?
> > +		virtqueue_enable_cb_prepare_packed(_vq) :
> > +		virtqueue_enable_cb_prepare_split(_vq);
> > +
> > +	return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >
> > @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> >  	if (unlikely(vq->broken))
> >  		return false;
> >
> > +	if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> > +		return true;
> > +
> >  	virtio_mb(vq->weak_barriers);
> > -	return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> > -		virtqueue_poll_split(_vq, last_used_idx);
> > +	return vq->packed_ring ?
> > +		virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> > +		virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_poll);
> >
> > @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >  	vq->weak_barriers = weak_barriers;
> >  	vq->broken = false;
> >  	vq->last_used_idx = 0;
> > +	vq->wraps = 0;
> >  	vq->event_triggered = false;
> >  	vq->num_added = 0;
> >  	vq->use_dma_api = vring_use_dma_api(vdev);
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 476d3e5c0fe7..e6b03017ebd7 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -77,6 +77,17 @@
> >   */
> >  #define VRING_PACKED_EVENT_F_WRAP_CTR 15
> >
> > +#define VRING_IDX_MASK GENMASK(15, 0)
> > +#define VRING_GET_IDX(opaque) \
> > +	((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > +
> > +#define VRING_WRAPS_MASK GENMASK(31, 16)
> > +#define VRING_GET_WRAPS(opaque) \
> > +	((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > +
> > +#define VRING_BUILD_OPAQUE(idx, wraps) \
> > +	(FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > +
> >  /* We support indirect buffer descriptors */
> >  #define VIRTIO_RING_F_INDIRECT_DESC 28
>
> Yea I think this patch increases the time it takes to wrap around from
> 2^16 to 2^32 which seems good enough.
> Need some comments to explain the logic.
> Would be interesting to see perf data.
>

Thanks for your feedback!

I'll try to gather some perf data on it over the next few days
(and eventually clean it up and add comments if it is good enough...)

Thanks,
Cristian
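
The (wraps << 16 | idx) encoding discussed in the thread can be sketched in
plain userspace C as below. This is only a toy model under stated
assumptions, not the kernel code: build_opaque(), opaque_idx(),
opaque_wraps() and struct toy_vq are hypothetical stand-ins for the
VRING_* macros and struct vring_virtqueue in the patch, and the poll check
compares against the toy queue's own counter rather than the device's used
index. It uses the branchless increment suggested in the review and shows
why an index-only comparison misses a full 2^16 wrap while the combined
value does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the VRING_* macros in the patch. */
#define TOY_IDX_MASK    0x0000ffffu
#define TOY_WRAPS_SHIFT 16

/* Pack the 16-bit index in the low half and the wrap count in the high half. */
static inline uint32_t build_opaque(uint16_t idx, uint16_t wraps)
{
	return ((uint32_t)wraps << TOY_WRAPS_SHIFT) | idx;
}

static inline uint16_t opaque_idx(uint32_t opaque)
{
	return (uint16_t)(opaque & TOY_IDX_MASK);
}

static inline uint16_t opaque_wraps(uint32_t opaque)
{
	return (uint16_t)(opaque >> TOY_WRAPS_SHIFT);
}

/* Toy model of the per-virtqueue state touched by the patch. */
struct toy_vq {
	uint16_t last_used_idx;
	uint16_t wraps;
};

/* Consume n buffers, counting each time the 16-bit index wraps to zero. */
static inline void toy_consume(struct toy_vq *vq, uint32_t n)
{
	while (n--) {
		vq->last_used_idx++;
		/* Branchless form from the review comment. */
		vq->wraps += !vq->last_used_idx;
	}
}

/* Analogue of the prepare step: snapshot index and wrap count together. */
static inline uint32_t toy_snapshot(const struct toy_vq *vq)
{
	return build_opaque(vq->last_used_idx, vq->wraps);
}

/* Index-only check: blind to an exact 2^16 wrap. */
static inline bool toy_changed_idx_only(const struct toy_vq *vq, uint32_t opaque)
{
	return vq->last_used_idx != opaque_idx(opaque);
}

/* Wrap-aware check: a differing wrap count alone means something is new. */
static inline bool toy_changed(const struct toy_vq *vq, uint32_t opaque)
{
	if (vq->wraps != opaque_wraps(opaque))
		return true;
	return vq->last_used_idx != opaque_idx(opaque);
}
```

Taking a snapshot at index 5, consuming exactly 65536 entries and checking
again leaves the index-only test reporting no change, while the wrap-aware
test reports new work. The same ambiguity reappears only after 2^32
entries, when the 16-bit wrap count itself rolls over.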