linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] Add SCMI Virtio & Clock atomic support
@ 2022-02-01 17:15 Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount Cristian Marussi
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

Hi,

This small series is the tail-subset of the previous V8 series about atomic
support in SCMI [1], whose 8-patches head-subset has now been queued on
[2]; as such, it is based on [2] on top of tag scmi-updates-5.17:

commit 94d0cd1da14a ("firmware: arm_scmi: Add new parameter to
		     mark_txdone")

Patch [1/9] replaces the virtio-scmi ready flag and lock with a reference
counter that tracks the lifetime of vio channels, while removing the need
for a wide spinlocked section (which would cause issues with the
introduction of virtio polling support).

Patch [2/9] adds a few helpers to handle the TX free_list and a dedicated
spinlock to reduce the reliance on the main one.

Patch [3/9] is an RFC patch to the virtio core around virtqueue_poll() API:
the aim is to solve a possible ABA problem while polling on a very busy
virtqueue.

Patch [4/9] adds polling mode to SCMI VirtIO transport in order to support
atomic operations on such transport.

Patches [5,6/9] introduce a new optional SCMI binding, atomic_threshold, to
configure a platform specific time threshold used in the following patches
to select with a finer grain which SCMI resources should be eligible for
atomic operations when requested.

Patch [7/9] exposes new SCMI Clock protocol operations to allow an SCMI
user to request atomic mode on clock enable commands.

Patch [8/9] adds support to SCMI Clock protocol for a new clock attributes
field which advertises typical enable latency for a specific resource.
It is marked as RFC since the SCMI spec including such addition is still
to be finalized.

Finally, patch [9/9] adds support for atomic operations to the SCMI clock
driver; the underlying logic here is that we register with the Clock
framework atomic-capable clock resources if and only if the backing SCMI
transport is capable of atomic operations AND the specific clock resource
has been advertised by the SCMI platform as having:

	clock_enable_latency <= atomic_threshold

The idea is to avoid costly atomic busy-waiting for resources that have
been advertised as 'slow' to operate upon (e.g. a PLL as opposed to a
gating clock). This last patch is marked as RFC too, since it depends on
the previous one (which is also RFC-tagged).
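
As a rough illustration only (this is not code from the series), the
registration rule described above can be modelled with the standalone C
sketch below. The struct and function names are hypothetical, and both
values are assumed to be expressed in the same time unit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; these are not identifiers from the series. */
struct clk_caps_sketch {
	bool transport_is_atomic;   /* transport can complete without sleeping */
	uint32_t atomic_threshold;  /* platform-wide threshold from the binding */
	uint32_t enable_latency;    /* latency advertised for this clock */
};

/*
 * A clock is treated as atomic-capable only when the transport supports
 * atomic operations AND the advertised latency does not exceed the
 * configured threshold.
 */
static bool clk_is_atomic_eligible(const struct clk_caps_sketch *c)
{
	if (!c->transport_is_atomic)
		return false;

	return c->enable_latency <= c->atomic_threshold;
}
```

With a threshold of 100, a clock advertising a latency of 10 would be
eligible, while one advertising 500 would fall back to the sleeping path.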

To ease testing, the whole series can be found at [3].

Any feedback/testing welcome as usual.

Thanks,
Cristian

[1]: https://lore.kernel.org/linux-arm-kernel/20211220195646.44498-1-cristian.marussi@arm.com/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/tag/?h=scmi-updates-5.17
[3]: https://gitlab.arm.com/linux-arm/linux-cm/-/commits/scmi_atomic_clk_virtio_V2/

---
V1 --> V2
 - added vio channel refcount support patch
 - reviewed free_list support and usage
 - added virtio_ring RFC patch
 - shrunk spinlocked section within virtio_poll_done to exclude
   virtqueue_poll call
 - removed poll_lock
 - use vio channel refcount acquire/release logic when polling
 - using new free_list accessors
 - added new dedicated pending_lock to access pending_cmds_list
 - fixed a few comments

Cristian Marussi (9):
  firmware: arm_scmi: Add a virtio channel refcount
  firmware: arm_scmi: Review virtio free_list handling
  [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  firmware: arm_scmi: Add atomic mode support to virtio transport
  dt-bindings: firmware: arm,scmi: Add atomic_threshold optional
    property
  firmware: arm_scmi: Support optional system wide atomic_threshold
  firmware: arm_scmi: Add atomic support to clock protocol
  [RFC] firmware: arm_scmi: Add support for clock_enable_latency
  [RFC] clk: scmi: Support atomic clock enable/disable API

 .../bindings/firmware/arm,scmi.yaml           |  11 +
 drivers/clk/clk-scmi.c                        |  71 ++-
 drivers/firmware/arm_scmi/Kconfig             |  15 +
 drivers/firmware/arm_scmi/clock.c             |  34 +-
 drivers/firmware/arm_scmi/driver.c            |  36 +-
 drivers/firmware/arm_scmi/virtio.c            | 504 +++++++++++++++---
 drivers/virtio/virtio_ring.c                  |  51 +-
 include/linux/scmi_protocol.h                 |   9 +-
 8 files changed, 622 insertions(+), 109 deletions(-)

-- 
2.17.1



* [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 22:05   ` Michael S. Tsirkin
  2022-02-01 17:15 ` [PATCH v2 2/9] firmware: arm_scmi: Review virtio free_list handling Cristian Marussi
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

Currently SCMI VirtIO channels are marked with a ready flag and related
lock to track channel lifetime and support proper synchronization at
shutdown when virtqueues have to be stopped.

This leads to some extended spinlocked sections with IRQs off on the RX
path to keep hold of the ready flag, and it does not scale well, especially
once SCMI VirtIO polling mode is introduced.

Add an SCMI VirtIO channel dedicated refcount to track active users on both
the TX and the RX path and properly enforce synchronization and cleanup at
shutdown, inhibiting further usage of the channel once freed.
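
For readers unfamiliar with the pattern, the acquire/release scheme can be
sketched in userspace C11 as shown below. This is only a simplified model
under stated assumptions: the names are hypothetical, C11 atomics stand in
for the kernel refcount_t API, and a plain flag stands in for the
completion that the real shutdown path waits on:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a refcounted channel. */
struct chan_sketch {
	atomic_uint users;        /* 0 means the channel is no longer usable */
	bool shutdown_signalled;  /* stand-in for a kernel completion */
};

static void chan_ready(struct chan_sketch *ch)
{
	ch->shutdown_signalled = false;
	atomic_init(&ch->users, 1);  /* initial reference held by the owner */
}

/* Take a reference unless the count has already dropped to zero. */
static bool chan_acquire(struct chan_sketch *ch)
{
	unsigned int old = atomic_load(&ch->users);

	while (old != 0) {
		/* On failure the CAS reloads 'old', so the loop retries. */
		if (atomic_compare_exchange_weak(&ch->users, &old, old + 1))
			return true;
	}

	return false;
}

/* Drop a reference; the last user out signals the shutdown waiter. */
static void chan_release(struct chan_sketch *ch)
{
	if (atomic_fetch_sub(&ch->users, 1) == 1)
		ch->shutdown_signalled = true;
}
```

Once the owner drops its initial reference and all in-flight users have
released theirs, any later acquire attempt fails, which is what inhibits
further usage of a freed channel.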

Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/virtio.c | 148 +++++++++++++++++++----------
 1 file changed, 100 insertions(+), 48 deletions(-)

diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..536e46eab462 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -17,7 +17,9 @@
  * virtqueue. Access to each virtqueue is protected by spinlocks.
  */
 
+#include <linux/completion.h>
 #include <linux/errno.h>
+#include <linux/refcount.h>
 #include <linux/slab.h>
 #include <linux/virtio.h>
 #include <linux/virtio_config.h>
@@ -27,6 +29,7 @@
 
 #include "common.h"
 
+#define VIRTIO_MAX_RX_TIMEOUT_MS	60000
 #define VIRTIO_SCMI_MAX_MSG_SIZE 128 /* Value may be increased. */
 #define VIRTIO_SCMI_MAX_PDU_SIZE \
 	(VIRTIO_SCMI_MAX_MSG_SIZE + SCMI_MSG_MAX_PROT_OVERHEAD)
@@ -39,23 +42,21 @@
  * @cinfo: SCMI Tx or Rx channel
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
  * @is_rx: Whether channel is an Rx channel
- * @ready: Whether transport user is ready to hear about channel
  * @max_msg: Maximum number of pending messages for this channel.
- * @lock: Protects access to all members except ready.
- * @ready_lock: Protects access to ready. If required, it must be taken before
- *              lock.
+ * @lock: Protects access to all members except users.
+ * @shutdown_done: A reference to a completion used when freeing this channel.
+ * @users: A reference count to currently active users of this channel.
  */
 struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
 	struct list_head free_list;
 	bool is_rx;
-	bool ready;
 	unsigned int max_msg;
-	/* lock to protect access to all members except ready. */
+	/* lock to protect access to all members except users. */
 	spinlock_t lock;
-	/* lock to rotects access to ready flag. */
-	spinlock_t ready_lock;
+	struct completion *shutdown_done;
+	refcount_t users;
 };
 
 /**
@@ -76,6 +77,71 @@ struct scmi_vio_msg {
 /* Only one SCMI VirtIO device can possibly exist */
 static struct virtio_device *scmi_vdev;
 
+static void scmi_vio_channel_ready(struct scmi_vio_channel *vioch,
+				   struct scmi_chan_info *cinfo)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	cinfo->transport_info = vioch;
+	/* Indirectly setting channel not available any more */
+	vioch->cinfo = cinfo;
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	refcount_set(&vioch->users, 1);
+}
+
+static inline bool scmi_vio_channel_acquire(struct scmi_vio_channel *vioch)
+{
+	return refcount_inc_not_zero(&vioch->users);
+}
+
+static inline void scmi_vio_channel_release(struct scmi_vio_channel *vioch)
+{
+	if (refcount_dec_and_test(&vioch->users)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (vioch->shutdown_done) {
+			vioch->cinfo = NULL;
+			complete(vioch->shutdown_done);
+		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
+	}
+}
+
+static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
+{
+	int timeout;
+	char *vq_name;
+	unsigned long flags;
+	struct device *dev;
+	DECLARE_COMPLETION_ONSTACK(vioch_shutdown_done);
+
+	/*
+	 * Prepare to wait for the last release if not already released
+	 * or in progress.
+	 */
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!vioch->cinfo || vioch->shutdown_done) {
+		spin_unlock_irqrestore(&vioch->lock, flags);
+		return;
+	}
+	vioch->shutdown_done = &vioch_shutdown_done;
+	vq_name = vioch->is_rx ? "RX" : "TX";
+	/* vioch->cinfo could be NULLified after the release */
+	dev = vioch->cinfo->dev;
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	scmi_vio_channel_release(vioch);
+
+	timeout = msecs_to_jiffies(VIRTIO_MAX_RX_TIMEOUT_MS + 10);
+	/* Let any possibly concurrent RX path release the channel */
+	if (!wait_for_completion_timeout(vioch->shutdown_done, timeout))
+		dev_warn(dev,
+			 "Timeout shutting down %s VQ.\n", vq_name);
+}
+
 static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 {
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
@@ -119,7 +185,7 @@ static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 {
-	unsigned long ready_flags;
+	unsigned long flags;
 	unsigned int length;
 	struct scmi_vio_channel *vioch;
 	struct scmi_vio_msg *msg;
@@ -130,27 +196,27 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	vioch = &((struct scmi_vio_channel *)vqueue->vdev->priv)[vqueue->index];
 
 	for (;;) {
-		spin_lock_irqsave(&vioch->ready_lock, ready_flags);
-
-		if (!vioch->ready) {
+		if (!scmi_vio_channel_acquire(vioch)) {
 			if (!cb_enabled)
 				(void)virtqueue_enable_cb(vqueue);
-			goto unlock_ready_out;
+			return;
 		}
 
-		/* IRQs already disabled here no need to irqsave */
-		spin_lock(&vioch->lock);
+		spin_lock_irqsave(&vioch->lock, flags);
 		if (cb_enabled) {
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
-			if (virtqueue_enable_cb(vqueue))
-				goto unlock_out;
+			if (virtqueue_enable_cb(vqueue)) {
+				spin_unlock_irqrestore(&vioch->lock, flags);
+				scmi_vio_channel_release(vioch);
+				return;
+			}
 			cb_enabled = true;
 		}
-		spin_unlock(&vioch->lock);
+		spin_unlock_irqrestore(&vioch->lock, flags);
 
 		if (msg) {
 			msg->rx_len = length;
@@ -161,19 +227,14 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 		}
 
 		/*
-		 * Release ready_lock and re-enable IRQs between loop iterations
-		 * to allow virtio_chan_free() to possibly kick in and set the
-		 * flag vioch->ready to false even in between processing of
-		 * messages, so as to force outstanding messages to be ignored
-		 * when system is shutting down.
+		 * Release vio channel between loop iterations to allow
+		 * virtio_chan_free() to eventually fully release it when
+		 * shutting down; in such a case, any outstanding message will
+		 * be ignored since this loop will bail out at the next
+		 * iteration.
 		 */
-		spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
+		scmi_vio_channel_release(vioch);
 	}
-
-unlock_out:
-	spin_unlock(&vioch->lock);
-unlock_ready_out:
-	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
 }
 
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
@@ -273,35 +334,20 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		}
 	}
 
-	spin_lock_irqsave(&vioch->lock, flags);
-	cinfo->transport_info = vioch;
-	/* Indirectly setting channel not available any more */
-	vioch->cinfo = cinfo;
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
-	spin_lock_irqsave(&vioch->ready_lock, flags);
-	vioch->ready = true;
-	spin_unlock_irqrestore(&vioch->ready_lock, flags);
+	scmi_vio_channel_ready(vioch, cinfo);
 
 	return 0;
 }
 
 static int virtio_chan_free(int id, void *p, void *data)
 {
-	unsigned long flags;
 	struct scmi_chan_info *cinfo = p;
 	struct scmi_vio_channel *vioch = cinfo->transport_info;
 
-	spin_lock_irqsave(&vioch->ready_lock, flags);
-	vioch->ready = false;
-	spin_unlock_irqrestore(&vioch->ready_lock, flags);
+	scmi_vio_channel_cleanup_sync(vioch);
 
 	scmi_free_channel(cinfo, data, id);
 
-	spin_lock_irqsave(&vioch->lock, flags);
-	vioch->cinfo = NULL;
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
 	return 0;
 }
 
@@ -316,10 +362,14 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	int rc;
 	struct scmi_vio_msg *msg;
 
+	if (!scmi_vio_channel_acquire(vioch))
+		return -EINVAL;
+
 	spin_lock_irqsave(&vioch->lock, flags);
 
 	if (list_empty(&vioch->free_list)) {
 		spin_unlock_irqrestore(&vioch->lock, flags);
+		scmi_vio_channel_release(vioch);
 		return -EBUSY;
 	}
 
@@ -342,6 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 
 	spin_unlock_irqrestore(&vioch->lock, flags);
 
+	scmi_vio_channel_release(vioch);
+
 	return rc;
 }
 
@@ -416,7 +468,6 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		unsigned int sz;
 
 		spin_lock_init(&channels[i].lock);
-		spin_lock_init(&channels[i].ready_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
 		channels[i].vqueue = vqs[i];
 
@@ -503,7 +554,8 @@ const struct scmi_desc scmi_virtio_desc = {
 	.transport_init = virtio_scmi_init,
 	.transport_exit = virtio_scmi_exit,
 	.ops = &scmi_virtio_ops,
-	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
+	/* for non-realtime virtio devices */
+	.max_rx_timeout_ms = VIRTIO_MAX_RX_TIMEOUT_MS,
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
 };
-- 
2.17.1



* [PATCH v2 2/9] firmware: arm_scmi: Review virtio free_list handling
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value Cristian Marussi
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

Add a new spinlock dedicated to the access of the TX free list and a couple
of helpers to get and put messages back and forth from the free_list.

Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/virtio.c | 88 +++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 31 deletions(-)

diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index 536e46eab462..22d5a632262f 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -40,20 +40,23 @@
  *
  * @vqueue: Associated virtqueue
  * @cinfo: SCMI Tx or Rx channel
+ * @free_lock: Protects access to the @free_list.
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
  * @is_rx: Whether channel is an Rx channel
  * @max_msg: Maximum number of pending messages for this channel.
- * @lock: Protects access to all members except users.
+ * @lock: Protects access to all members except users, free_list.
  * @shutdown_done: A reference to a completion used when freeing this channel.
  * @users: A reference count to currently active users of this channel.
  */
 struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
+	/* lock to protect access to the free list. */
+	spinlock_t free_lock;
 	struct list_head free_list;
 	bool is_rx;
 	unsigned int max_msg;
-	/* lock to protect access to all members except users. */
+	/* lock to protect access to all members except users, free_list  */
 	spinlock_t lock;
 	struct completion *shutdown_done;
 	refcount_t users;
@@ -142,18 +145,49 @@ static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
 			 "Timeout shutting down %s VQ.\n", vq_name);
 }
 
+/* Assumes to be called with vio channel acquired already */
+static struct scmi_vio_msg *
+scmi_virtio_get_free_msg(struct scmi_vio_channel *vioch)
+{
+	unsigned long flags;
+	struct scmi_vio_msg *msg;
+
+	spin_lock_irqsave(&vioch->free_lock, flags);
+	if (list_empty(&vioch->free_list)) {
+		spin_unlock_irqrestore(&vioch->free_lock, flags);
+		return NULL;
+	}
+
+	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
+	list_del_init(&msg->list);
+	spin_unlock_irqrestore(&vioch->free_lock, flags);
+
+	return msg;
+}
+
+/* Assumes to be called with vio channel acquired already */
+static void scmi_virtio_put_free_msg(struct scmi_vio_channel *vioch,
+				     struct scmi_vio_msg *msg)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&vioch->free_lock, flags);
+	list_add_tail(&msg->list, &vioch->free_list);
+	spin_unlock_irqrestore(&vioch->free_lock, flags);
+}
+
 static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 {
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
 }
 
 static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
-			       struct scmi_vio_msg *msg,
-			       struct device *dev)
+			       struct scmi_vio_msg *msg)
 {
 	struct scatterlist sg_in;
 	int rc;
 	unsigned long flags;
+	struct device *dev = &vioch->vqueue->vdev->dev;
 
 	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
 
@@ -170,17 +204,17 @@ static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
 	return rc;
 }
 
+/*
+ * Assume to be called with channel already acquired or not ready at all;
+ * vioch->lock MUST NOT have been already acquired.
+ */
 static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 				  struct scmi_vio_msg *msg)
 {
-	if (vioch->is_rx) {
-		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
-	} else {
-		/* Here IRQs are assumed to be already disabled by the caller */
-		spin_lock(&vioch->lock);
-		list_add(&msg->list, &vioch->free_list);
-		spin_unlock(&vioch->lock);
-	}
+	if (vioch->is_rx)
+		scmi_vio_feed_vq_rx(vioch, msg);
+	else
+		scmi_virtio_put_free_msg(vioch, msg);
 }
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -295,7 +329,6 @@ static bool virtio_chan_available(struct device *dev, int idx)
 static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 			     bool tx)
 {
-	unsigned long flags;
 	struct scmi_vio_channel *vioch;
 	int index = tx ? VIRTIO_SCMI_VQ_TX : VIRTIO_SCMI_VQ_RX;
 	int i;
@@ -325,13 +358,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		if (!msg->input)
 			return -ENOMEM;
 
-		if (tx) {
-			spin_lock_irqsave(&vioch->lock, flags);
-			list_add_tail(&msg->list, &vioch->free_list);
-			spin_unlock_irqrestore(&vioch->lock, flags);
-		} else {
-			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
-		}
+		scmi_finalize_message(vioch, msg);
 	}
 
 	scmi_vio_channel_ready(vioch, cinfo);
@@ -365,33 +392,31 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	if (!scmi_vio_channel_acquire(vioch))
 		return -EINVAL;
 
-	spin_lock_irqsave(&vioch->lock, flags);
-
-	if (list_empty(&vioch->free_list)) {
-		spin_unlock_irqrestore(&vioch->lock, flags);
+	msg = scmi_virtio_get_free_msg(vioch);
+	if (!msg) {
 		scmi_vio_channel_release(vioch);
 		return -EBUSY;
 	}
 
-	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
-	list_del(&msg->list);
-
 	msg_tx_prepare(msg->request, xfer);
 
 	sg_init_one(&sg_out, msg->request, msg_command_size(xfer));
 	sg_init_one(&sg_in, msg->input, msg_response_size(xfer));
 
+	spin_lock_irqsave(&vioch->lock, flags);
+
 	rc = virtqueue_add_sgs(vioch->vqueue, sgs, 1, 1, msg, GFP_ATOMIC);
-	if (rc) {
-		list_add(&msg->list, &vioch->free_list);
+	if (rc)
 		dev_err(vioch->cinfo->dev,
 			"failed to add to TX virtqueue (%d)\n", rc);
-	} else {
+	else
 		virtqueue_kick(vioch->vqueue);
-	}
 
 	spin_unlock_irqrestore(&vioch->lock, flags);
 
+	if (rc)
+		scmi_virtio_put_free_msg(vioch, msg);
+
 	scmi_vio_channel_release(vioch);
 
 	return rc;
@@ -468,6 +493,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		unsigned int sz;
 
 		spin_lock_init(&channels[i].lock);
+		spin_lock_init(&channels[i].free_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
 		channels[i].vqueue = vqs[i];
 
-- 
2.17.1



* [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 2/9] firmware: arm_scmi: Review virtio free_list handling Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 18:27   ` Michael S. Tsirkin
  2022-02-01 17:15 ` [PATCH v2 4/9] firmware: arm_scmi: Add atomic mode support to virtio transport Cristian Marussi
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Michael S. Tsirkin,
	virtualization

Exported API virtqueue_poll() can be used to support polling mode operation
on top of virtio layer if needed; currently the parameter last_used_idx is
the opaque value that needs to be passed to the virtqueue_poll() function
to check if there are new pending used buffers in the queue: such opaque
value would have been previously obtained by a call to the API function
virtqueue_enable_cb_prepare().

Since this opaque value is (roughly) just a snapshot in time of the
internal last_used_idx, it is possible that, if exactly 2**16 buffers are
marked as used between two successive calls to virtqueue_poll(), the
caller is fooled into thinking that nothing is pending (ABA problem).

Keep a full-fledged internal wraps counter per virtqueue and embed it into
the upper 16 bits of the returned opaque value, so that the above scenario
can be detected transparently by virtqueue_poll(): this way identical
last_used_idx values taken in different wraps yield different opaque values.
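
The effect can be illustrated with the standalone C sketch below. It is a
deliberately simplified model, not the virtio_ring code: the names are
hypothetical, and the "changed" check compares the snapshot against the
consumer-side index rather than against the device's used ring as the real
poll helpers do:

```c
#include <stdbool.h>
#include <stdint.h>

#define IDX_MASK	0x0000ffffu
#define WRAPS_SHIFT	16

/* Hypothetical model of the two 16-bit counters kept per queue. */
struct vq_sketch {
	uint16_t last_used_idx;
	uint16_t wraps;
};

/* Pack the wraps counter in bits 31..16 and the index in bits 15..0. */
static uint32_t vq_build_opaque(const struct vq_sketch *vq)
{
	return ((uint32_t)vq->wraps << WRAPS_SHIFT) | vq->last_used_idx;
}

/* Consume n buffers, bumping wraps each time the index rolls over to 0. */
static void vq_consume_n(struct vq_sketch *vq, uint32_t n)
{
	while (n--) {
		vq->last_used_idx++;
		if (vq->last_used_idx == 0)
			vq->wraps++;
	}
}

/* True if the queue state moved on since the opaque snapshot was taken. */
static bool vq_changed_since(const struct vq_sketch *vq, uint32_t opaque)
{
	if (vq->wraps != (uint16_t)(opaque >> WRAPS_SHIFT))
		return true;

	return vq->last_used_idx != (uint16_t)(opaque & IDX_MASK);
}
```

After exactly 2**16 buffers have been consumed, the bare 16-bit index is
back at its snapshot value, but the wraps field differs, so the change is
still detected.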

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
Still no perf data on this; I was wondering what exactly to measure, in
terms of perf metrics, to evaluate the impact of the rolling vq->wraps
counter.
---
 drivers/virtio/virtio_ring.c | 51 +++++++++++++++++++++++++++++++++---
 1 file changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 00f64f2f8b72..613ec0503509 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -12,6 +12,8 @@
 #include <linux/hrtimer.h>
 #include <linux/dma-mapping.h>
 #include <linux/spinlock.h>
+#include <linux/bits.h>
+#include <linux/bitfield.h>
 #include <xen/xen.h>
 
 static bool force_used_validation = false;
@@ -69,6 +71,17 @@ module_param(force_used_validation, bool, 0444);
 #define LAST_ADD_TIME_INVALID(vq)
 #endif
 
+#define VRING_IDX_MASK					GENMASK(15, 0)
+#define VRING_GET_IDX(opaque)				\
+	((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
+
+#define VRING_WRAPS_MASK				GENMASK(31, 16)
+#define VRING_GET_WRAPS(opaque)				\
+	((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
+
+#define VRING_BUILD_OPAQUE(idx, wraps)			\
+	(FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
+
 struct vring_desc_state_split {
 	void *data;			/* Data for callback. */
 	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
@@ -117,6 +130,8 @@ struct vring_virtqueue {
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
+	u16 wraps;
+
 	/* Hint for event idx: already triggered no need to disable. */
 	bool event_triggered;
 
@@ -806,6 +821,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
 	ret = vq->split.desc_state[i].data;
 	detach_buf_split(vq, i, ctx);
 	vq->last_used_idx++;
+	if (unlikely(!vq->last_used_idx))
+		vq->wraps++;
 	/* If we expect an interrupt for the next entry, tell host
 	 * by writing event index and flush out the write before
 	 * the read in the next get_buf call. */
@@ -1508,6 +1525,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
 	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
 		vq->last_used_idx -= vq->packed.vring.num;
 		vq->packed.used_wrap_counter ^= 1;
+		vq->wraps++;
 	}
 
 	/*
@@ -1744,6 +1762,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
 	vq->last_used_idx = 0;
+	vq->wraps = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
 	vq->packed_ring = true;
@@ -2092,13 +2111,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
  */
 unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
 {
+	unsigned int last_used_idx;
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	if (vq->event_triggered)
 		vq->event_triggered = false;
 
-	return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
-				 virtqueue_enable_cb_prepare_split(_vq);
+	last_used_idx = vq->packed_ring ?
+			virtqueue_enable_cb_prepare_packed(_vq) :
+			virtqueue_enable_cb_prepare_split(_vq);
+
+	return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
 
@@ -2107,6 +2130,21 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
  * @_vq: the struct virtqueue we're talking about.
  * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
  *
+ * The provided last_used_idx, as returned by virtqueue_enable_cb_prepare(),
+ * is an opaque value representing the queue state and it is built as follows:
+ *
+ *	---------------------------------------------------------
+ *	|	vq->wraps	|	vq->last_used_idx	|
+ *	31------------------------------------------------------0
+ *
+ * The MSB 16bits embedding the wraps counter for the underlying virtqueue
+ * is stripped out here before reaching into the lower layer helpers.
+ *
+ * This structure of the opaque value mitigates the scenario in which, when
+ * exactly 2**16 messages are marked as used between two successive calls to
+ * virtqueue_poll(), the caller is fooled into thinking nothing new has arrived
+ * since the pure last_used_idx is exactly the same.
+ *
  * Returns "true" if there are pending used buffers in the queue.
  *
  * This does not need to be serialized.
@@ -2118,9 +2156,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
 	if (unlikely(vq->broken))
 		return false;
 
+	if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
+		return true;
+
 	virtio_mb(vq->weak_barriers);
-	return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
-				 virtqueue_poll_split(_vq, last_used_idx);
+	return vq->packed_ring ?
+		virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
+			virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
 }
 EXPORT_SYMBOL_GPL(virtqueue_poll);
 
@@ -2245,6 +2287,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
 	vq->last_used_idx = 0;
+	vq->wraps = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
 	vq->use_dma_api = vring_use_dma_api(vdev);
-- 
2.17.1



* [PATCH v2 4/9] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (2 preceding siblings ...)
  2022-02-01 17:15 ` [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 5/9] dt-bindings: firmware: arm,scmi: Add atomic_threshold optional property Cristian Marussi
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Michael S. Tsirkin,
	virtualization

Add support for .mark_txdone and .poll_done transport operations to SCMI
VirtIO transport as pre-requisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions, leaving it disabled by
default.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v1 --> v2
- shrunk spinlocked section within virtio_poll_done to exclude
  virtqueue_poll
- removed poll_lock
- use vio channel refcount acquire/release logic when polling
- using new free_list accessors
- added new dedicated pending_lock to access pending_cmds_list
- fixed a few comments

v0 --> v1
- check for deferred_wq existence before queueing work to avoid
  race at driver removal time
- changed mark_txdone decision-logic about message release
- fixed race while checking for msg polled from another thread
- using dedicated poll_status instead of poll_idx upper bits
- pick initial poll_idx earlier inside send_message to avoid missing
  early replies
- removed F_NOTIFY mention in comment
- clearing xfer->priv on the IRQ tx path once message has been fetched
- added some store barriers
- updated some comments
---
 drivers/firmware/arm_scmi/Kconfig  |  15 ++
 drivers/firmware/arm_scmi/driver.c |   9 +-
 drivers/firmware/arm_scmi/virtio.c | 278 ++++++++++++++++++++++++++++-
 3 files changed, 292 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
 	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
 	  take care of the needed conversions, say N.
 
+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+	bool "Enable atomic mode for SCMI VirtIO transport"
+	depends on ARM_SCMI_TRANSPORT_VIRTIO
+	help
+	  Enable support of atomic operation for SCMI VirtIO based transport.
+
+	  If you want the SCMI VirtIO based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 endif #ARM_SCMI_PROTOCOL
 
 config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index c2e7897ff56e..dc972a54e93e 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -648,7 +648,8 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo,
 
 	unpack_scmi_header(msg_hdr, &xfer->hdr);
 	if (priv)
-		xfer->priv = priv;
+		/* Ensure order between xfer->priv store and following ops */
+		smp_store_mb(xfer->priv, priv);
 	info->desc->ops->fetch_notification(cinfo, info->desc->max_msg_size,
 					    xfer);
 	scmi_notify(cinfo->handle, xfer->hdr.protocol_id,
@@ -680,8 +681,12 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
 		xfer->rx.len = info->desc->max_msg_size;
 
 	if (priv)
-		xfer->priv = priv;
+		/* Ensure order between xfer->priv store and following ops */
+		smp_store_mb(xfer->priv, priv);
 	info->desc->ops->fetch_response(cinfo, xfer);
+	if (priv)
+		/* Ensure order between xfer->priv clear and later accesses */
+		smp_store_mb(xfer->priv, NULL);
 
 	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
 			   xfer->hdr.protocol_id, xfer->hdr.seq,
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index 22d5a632262f..46d40da7eeaa 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -3,8 +3,8 @@
  * Virtio Transport driver for Arm System Control and Management Interface
  * (SCMI).
  *
- * Copyright (C) 2020-2021 OpenSynergy.
- * Copyright (C) 2021 ARM Ltd.
+ * Copyright (C) 2020-2022 OpenSynergy.
+ * Copyright (C) 2021-2022 ARM Ltd.
  */
 
 /**
@@ -42,6 +42,10 @@
  * @cinfo: SCMI Tx or Rx channel
  * @free_lock: Protects access to the @free_list.
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @deferred_tx_work: Worker for TX deferred replies processing
+ * @deferred_tx_wq: Workqueue for TX deferred replies
+ * @pending_lock: Protects access to the @pending_cmds_list.
+ * @pending_cmds_list: List of pre-fetched commands queued for later processing
  * @is_rx: Whether channel is an Rx channel
  * @max_msg: Maximum number of pending messages for this channel.
  * @lock: Protects access to all members except users, free_list.
@@ -54,6 +58,11 @@ struct scmi_vio_channel {
 	/* lock to protect access to the free list. */
 	spinlock_t free_lock;
 	struct list_head free_list;
+	/* lock to protect access to the pending list. */
+	spinlock_t pending_lock;
+	struct list_head pending_cmds_list;
+	struct work_struct deferred_tx_work;
+	struct workqueue_struct *deferred_tx_wq;
 	bool is_rx;
 	unsigned int max_msg;
 	/* lock to protect access to all members except users, free_list  */
@@ -62,6 +71,12 @@ struct scmi_vio_channel {
 	refcount_t users;
 };
 
+enum poll_states {
+	VIO_MSG_NOT_POLLED,
+	VIO_MSG_POLLING,
+	VIO_MSG_POLL_DONE,
+};
+
 /**
  * struct scmi_vio_msg - Transport PDU information
  *
@@ -69,12 +84,17 @@ struct scmi_vio_channel {
  * @input: SDU used for (delayed) responses and notifications
  * @list: List which scmi_vio_msg may be part of
  * @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ *	      transaction reply was configured for polling.
+ * @poll_status: Polling state for this message.
  */
 struct scmi_vio_msg {
 	struct scmi_msg_payld *request;
 	struct scmi_msg_payld *input;
 	struct list_head list;
 	unsigned int rx_len;
+	unsigned int poll_idx;
+	enum poll_states poll_status;
 };
 
 /* Only one SCMI VirtIO device can possibly exist */
@@ -120,6 +140,7 @@ static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
 	unsigned long flags;
 	struct device *dev;
 	DECLARE_COMPLETION_ONSTACK(vioch_shutdown_done);
+	void *deferred_wq = NULL;
 
 	/*
 	 * Prepare to wait for the last release if not already released
@@ -130,12 +151,22 @@ static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
 		spin_unlock_irqrestore(&vioch->lock, flags);
 		return;
 	}
+
 	vioch->shutdown_done = &vioch_shutdown_done;
 	vq_name = vioch->is_rx ? "RX" : "TX";
 	/* vioch->cinfo could be NULLified after the release */
 	dev = vioch->cinfo->dev;
+
+	if (!vioch->is_rx && vioch->deferred_tx_wq) {
+		deferred_wq = vioch->deferred_tx_wq;
+		/* Cannot be kicked anymore after this...*/
+		vioch->deferred_tx_wq = NULL;
+	}
 	spin_unlock_irqrestore(&vioch->lock, flags);
 
+	if (deferred_wq)
+		destroy_workqueue(deferred_wq);
+
 	scmi_vio_channel_release(vioch);
 
 	timeout = msecs_to_jiffies(VIRTIO_MAX_RX_TIMEOUT_MS + 10);
@@ -171,6 +202,8 @@ static void scmi_virtio_put_free_msg(struct scmi_vio_channel *vioch,
 {
 	unsigned long flags;
 
+	msg->poll_status = VIO_MSG_NOT_POLLED;
+
 	spin_lock_irqsave(&vioch->free_lock, flags);
 	list_add_tail(&msg->list, &vioch->free_list);
 	spin_unlock_irqrestore(&vioch->free_lock, flags);
@@ -241,6 +274,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
+
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
 			if (virtqueue_enable_cb(vqueue)) {
@@ -271,6 +305,40 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	}
 }
 
+static void scmi_vio_deferred_tx_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg, *tmp;
+
+	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
+
+	if (!scmi_vio_channel_acquire(vioch))
+		return;
+
+	/* Process pre-fetched messages */
+	spin_lock_irqsave(&vioch->pending_lock, flags);
+
+	/* Scan the list of possibly pre-fetched messages during polling. */
+	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
+		list_del(&msg->list);
+
+		/* Channel is acquired here and cannot vanish */
+		scmi_rx_callback(vioch->cinfo,
+				 msg_read_header(msg->input), msg);
+
+		/* Free the processed message once done */
+		scmi_virtio_put_free_msg(vioch, msg);
+	}
+
+	spin_unlock_irqrestore(&vioch->pending_lock, flags);
+
+	/* Process possibly still pending messages */
+	scmi_vio_complete_cb(vioch->vqueue);
+
+	scmi_vio_channel_release(vioch);
+}
+
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
 
 static vq_callback_t *scmi_vio_complete_callbacks[] = {
@@ -338,6 +406,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
 
+	/* Setup a deferred worker for polling. */
+	if (tx && !vioch->deferred_tx_wq) {
+		vioch->deferred_tx_wq =
+			alloc_workqueue(dev_name(&scmi_vdev->dev),
+					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
+					0);
+		if (!vioch->deferred_tx_wq)
+			return -ENOMEM;
+
+		INIT_WORK(&vioch->deferred_tx_work,
+			  scmi_vio_deferred_tx_worker);
+	}
+
 	for (i = 0; i < vioch->max_msg; i++) {
 		struct scmi_vio_msg *msg;
 
@@ -405,6 +486,18 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 
 	spin_lock_irqsave(&vioch->lock, flags);
 
+	/*
+	 * If polling was requested for this transaction:
+	 *  - retrieve last used index (will be used as polling reference)
+	 *  - bind the polled message to the xfer via .priv
+	 */
+	if (xfer->hdr.poll_completion) {
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		msg->poll_status = VIO_MSG_POLLING;
+		/* Ensure initialized msg is visibly bound to xfer */
+		smp_store_mb(xfer->priv, msg);
+	}
+
 	rc = virtqueue_add_sgs(vioch->vqueue, sgs, 1, 1, msg, GFP_ATOMIC);
 	if (rc)
 		dev_err(vioch->cinfo->dev,
@@ -414,8 +507,11 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 
 	spin_unlock_irqrestore(&vioch->lock, flags);
 
-	if (rc)
+	if (rc) {
+		/* Ensure order between xfer->priv clear and vq feeding */
+		smp_store_mb(xfer->priv, NULL);
 		scmi_virtio_put_free_msg(vioch, msg);
+	}
 
 	scmi_vio_channel_release(vioch);
 
@@ -427,10 +523,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_response(msg->input, msg->rx_len, xfer);
-		xfer->priv = NULL;
-	}
 }
 
 static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -438,10 +532,173 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
-		xfer->priv = NULL;
+}
+
+/**
+ * virtio_mark_txdone  - Mark transmission done
+ *
+ * Free only completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list inside the TX code path;
+ * we instead let IRQ/RX callbacks eventually clean up such messages once,
+ * finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * explicitly by the SCMI platform could lead to subtle bugs due to message
+ * corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to exhaustion of available messages.
+ *
+ * For this same reason, here, we take care to free only the polled messages
+ * that have been replied to and not, by chance, processed on the IRQ path,
+ * since they won't be freed elsewhere; possible late replies to timed-out
+ * polled messages will anyway be freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			       struct scmi_xfer *xfer)
+{
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (!scmi_vio_channel_acquire(vioch))
+		return;
+
+	/* Must be a polled xfer and not already freed on the IRQ path */
+	if (!xfer->hdr.poll_completion || !msg) {
+		scmi_vio_channel_release(vioch);
+		return;
+	}
+
+	/* Ensure msg is unbound from xfer anyway at this point */
+	smp_store_mb(xfer->priv, NULL);
+
+	/* Do not free timedout polled messages */
+	if (ret != -ETIMEDOUT)
+		scmi_virtio_put_free_msg(vioch, msg);
+
+	scmi_vio_channel_release(vioch);
+}
+
+/**
+ * virtio_poll_done  - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being polled for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side, but the only way to check if the freshly
+ * arrived buffer was indeed what we were waiting for is to compare the newly
+ * arrived message descriptor with the one we are polling on.
+ *
+ * As a consequence it can happen to dequeue something different from the buffer
+ * we were poll-waiting for: if that is the case such early fetched buffers are
+ * then added to the @pending_cmds_list list for later processing by a
+ * dedicated deferred worker.
+ *
+ * So, basically, once something new is spotted we proceed to de-queue all the
+ * freshly received used buffers until we find the one we were polling on, or,
+ * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
+ * in the vqueue at the end of the polling loop (possible due to inherent races
+ * in virtqueues handling mechanisms), we similarly kick the deferred worker
+ * and let it process those, to avoid indefinitely looping in the .poll_done
+ * busy-waiting helper.
+ *
+ * Note that, since we do NOT have per-message suppress notification mechanism,
+ * the message we are polling for could be alternatively delivered via usual
+ * IRQs callbacks on another core which happened to have IRQs enabled while we
+ * are actively polling for it here: in such a case it will be handled as such
+ * by scmi_rx_callback() and the polling loop in the SCMI Core TX path will be
+ * transparently terminated anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+			     struct scmi_xfer *xfer)
+{
+	bool pending, ret = false;
+	unsigned int length, any_prefetched = 0;
+	unsigned long flags;
+	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	if (!msg)
+		return true;
+
+	/* Processed already by other polling loop on another CPU ? */
+	if (msg->poll_status == VIO_MSG_POLL_DONE)
+		return true;
+
+	if (!scmi_vio_channel_acquire(vioch))
+		return true;
+
+	/* Has cmdq index moved at all ? */
+	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+	if (!pending) {
+		scmi_vio_channel_release(vioch);
+		return false;
+	}
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	virtqueue_disable_cb(vioch->vqueue);
+
+	/*
+	 * Process all new messages till the polled-for message is found OR
+	 * the vqueue is empty.
+	 */
+	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
+		next_msg->rx_len = length;
+		/* Is the message we were polling for ? */
+		if (next_msg == msg) {
+			ret = true;
+			break;
+		}
+
+		if (next_msg->poll_status == VIO_MSG_NOT_POLLED) {
+			any_prefetched++;
+
+			spin_lock(&vioch->pending_lock);
+			list_add_tail(&next_msg->list,
+				      &vioch->pending_cmds_list);
+			spin_unlock(&vioch->pending_lock);
+		} else {
+			/* We picked another currently polled msg */
+			smp_store_mb(next_msg->poll_status, VIO_MSG_POLL_DONE);
+		}
+	}
+
+	/*
+	 * When the polling loop has successfully terminated if something
+	 * else was queued in the meantime, it will be served by a deferred
+	 * worker OR by the normal IRQ/callback OR by other poll loops.
+	 *
+	 * If we are still looking for the polled reply, the polling index has
+	 * to be updated to the current vqueue last used index.
+	 */
+	if (ret) {
+		pending = !virtqueue_enable_cb(vioch->vqueue);
+	} else {
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
 	}
+
+	if (vioch->deferred_tx_wq && (any_prefetched || pending))
+		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	scmi_vio_channel_release(vioch);
+
+	return ret;
 }
 
 static const struct scmi_transport_ops scmi_virtio_ops = {
@@ -453,6 +710,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
 	.send_message = virtio_send_message,
 	.fetch_response = virtio_fetch_response,
 	.fetch_notification = virtio_fetch_notification,
+	.mark_txdone = virtio_mark_txdone,
+	.poll_done = virtio_poll_done,
 };
 
 static int scmi_vio_probe(struct virtio_device *vdev)
@@ -495,6 +754,8 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		spin_lock_init(&channels[i].lock);
 		spin_lock_init(&channels[i].free_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
+		spin_lock_init(&channels[i].pending_lock);
+		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
 		channels[i].vqueue = vqs[i];
 
 		sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -584,4 +845,5 @@ const struct scmi_desc scmi_virtio_desc = {
 	.max_rx_timeout_ms = VIRTIO_MAX_RX_TIMEOUT_MS,
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 5/9] dt-bindings: firmware: arm,scmi: Add atomic_threshold optional property
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (3 preceding siblings ...)
  2022-02-01 17:15 ` [PATCH v2 4/9] firmware: arm_scmi: Add atomic mode support to virtio transport Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 6/9] firmware: arm_scmi: Support optional system wide atomic_threshold Cristian Marussi
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Rob Herring,
	devicetree

SCMI protocols in the platform can optionally signal to the OSPM agent
the expected execution latency for a specific resource/operation pair.

Introduce an optional SCMI system-wide property describing a global time
threshold, configurable on a per-platform basis. An SCMI command advertised
to have a latency higher than the threshold is not considered for atomic
operation, even if requested: high-latency SCMI synchronous commands should
preferably be issued in the usual non-atomic mode.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v1 --> v2
- rephrased the property description
---
 .../devicetree/bindings/firmware/arm,scmi.yaml        | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
index eae15df36eef..646bdf2873b5 100644
--- a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
+++ b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
@@ -81,6 +81,15 @@ properties:
   '#size-cells':
     const: 0
 
+  atomic_threshold:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      An optional time value, expressed in microseconds, representing, on this
+      platform, the threshold above which any SCMI command, advertised to have
+      a higher-than-threshold execution latency, should not be considered for
+      atomic mode of operation, even if requested.
+      If left unconfigured, it defaults to zero.
+
   arm,smc-id:
     $ref: /schemas/types.yaml#/definitions/uint32
     description:
@@ -264,6 +273,8 @@ examples:
             #address-cells = <1>;
             #size-cells = <0>;
 
+            atomic_threshold = <10000>;
+
             scmi_devpd: protocol@11 {
                 reg = <0x11>;
                 #power-domain-cells = <1>;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 6/9] firmware: arm_scmi: Support optional system wide atomic_threshold
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (4 preceding siblings ...)
  2022-02-01 17:15 ` [PATCH v2 5/9] dt-bindings: firmware: arm,scmi: Add atomic_threshold optional property Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 17:15 ` [PATCH v2 7/9] firmware: arm_scmi: Add atomic support to clock protocol Cristian Marussi
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

An SCMI agent can be configured system-wide with a well-defined atomic
threshold: only SCMI synchronous commands whose latency has been advertised
by the SCMI platform to be lower than or equal to this configured threshold
will be considered for atomic operations, when requested and if supported by
the underlying transport at all.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
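As a rough illustration of how a caller might consume the extended
is_transport_atomic() query, here is a userspace sketch. It assumes the
semantics stated in the commit message (a latency lower than or equal to the
threshold qualifies); struct transport_info and command_is_atomic_eligible()
are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-instance SCMI transport state. */
struct transport_info {
	bool atomic_enabled;
	bool polling_capable;
	unsigned int atomic_threshold_us;
};

/*
 * Mirrors the shape of the new callback: report whether the transport is
 * atomic and, only in that case, optionally return the threshold too.
 */
static bool is_transport_atomic(const struct transport_info *info,
				unsigned int *atomic_threshold)
{
	bool ret = info->atomic_enabled && info->polling_capable;

	if (ret && atomic_threshold)
		*atomic_threshold = info->atomic_threshold_us;

	return ret;
}

/* Hypothetical helper: decide if a command may be issued atomically. */
static bool command_is_atomic_eligible(const struct transport_info *info,
				       unsigned int latency_us)
{
	unsigned int threshold = 0;

	if (!is_transport_atomic(info, &threshold))
		return false;

	return latency_us <= threshold;
}
```

Note that, with the threshold left at its default of zero, only commands
advertising a zero latency would qualify in this sketch; callers that do not
care about the threshold can simply pass NULL.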
 drivers/firmware/arm_scmi/driver.c | 27 ++++++++++++++++++++++++---
 include/linux/scmi_protocol.h      |  5 ++++-
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index dc972a54e93e..d2ac6291b84b 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -131,6 +131,12 @@ struct scmi_protocol_instance {
  *	MAX_PROTOCOLS_IMP elements allocated by the base protocol
  * @active_protocols: IDR storing device_nodes for protocols actually defined
  *		      in the DT and confirmed as implemented by fw.
+ * @atomic_threshold: Optional system wide DT-configured threshold, expressed
+ *		      in microseconds, for atomic operations.
+ *		      Only SCMI synchronous commands reported by the platform
+ *		      to have an execution latency lesser-equal to the threshold
+ *		      should be considered for atomic mode operation: such
+ *		      decision is finally left up to the SCMI drivers.
  * @notify_priv: Pointer to private data structure specific to notifications.
  * @node: List head
  * @users: Number of users of this instance
@@ -149,6 +155,7 @@ struct scmi_info {
 	struct mutex protocols_mtx;
 	u8 *protocols_imp;
 	struct idr active_protocols;
+	unsigned int atomic_threshold;
 	void *notify_priv;
 	struct list_head node;
 	int users;
@@ -1409,15 +1416,22 @@ static void scmi_devm_protocol_put(struct scmi_device *sdev, u8 protocol_id)
  * SCMI instance is configured as atomic.
  *
  * @handle: A reference to the SCMI platform instance.
+ * @atomic_threshold: An optional return value for the system wide currently
+ *		      configured threshold for atomic operations.
  *
  * Return: True if transport is configured as atomic
  */
-static bool scmi_is_transport_atomic(const struct scmi_handle *handle)
+static bool scmi_is_transport_atomic(const struct scmi_handle *handle,
+				     unsigned int *atomic_threshold)
 {
+	bool ret;
 	struct scmi_info *info = handle_to_scmi_info(handle);
 
-	return info->desc->atomic_enabled &&
-		is_transport_polling_capable(info);
+	ret = info->desc->atomic_enabled && is_transport_polling_capable(info);
+	if (ret && atomic_threshold)
+		*atomic_threshold = info->atomic_threshold;
+
+	return ret;
 }
 
 static inline
@@ -1957,6 +1971,13 @@ static int scmi_probe(struct platform_device *pdev)
 	handle->version = &info->version;
 	handle->devm_protocol_get = scmi_devm_protocol_get;
 	handle->devm_protocol_put = scmi_devm_protocol_put;
+
+	/* System wide atomic threshold for atomic ops .. if any */
+	if (!of_property_read_u32(np, "atomic_threshold",
+				  &info->atomic_threshold))
+		dev_info(dev,
+			 "SCMI System wide atomic threshold set to %d us\n",
+			 info->atomic_threshold);
 	handle->is_transport_atomic = scmi_is_transport_atomic;
 
 	if (desc->ops->link_supplier) {
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 9f895cb81818..fdf6bd83cc59 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -619,6 +619,8 @@ struct scmi_notify_ops {
  *			 be interested to know if they can assume SCMI
  *			 command transactions associated to this handle will
  *			 never sleep and act accordingly.
+ *			 An optional atomic threshold value could be returned
+ *			 where configured.
  * @notify_ops: pointer to set of notifications related operations
  */
 struct scmi_handle {
@@ -629,7 +631,8 @@ struct scmi_handle {
 		(*devm_protocol_get)(struct scmi_device *sdev, u8 proto,
 				     struct scmi_protocol_handle **ph);
 	void (*devm_protocol_put)(struct scmi_device *sdev, u8 proto);
-	bool (*is_transport_atomic)(const struct scmi_handle *handle);
+	bool (*is_transport_atomic)(const struct scmi_handle *handle,
+				    unsigned int *atomic_threshold);
 
 	const struct scmi_notify_ops *notify_ops;
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 7/9] firmware: arm_scmi: Add atomic support to clock protocol
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (5 preceding siblings ...)
  2022-02-01 17:15 ` [PATCH v2 6/9] firmware: arm_scmi: Support optional system wide atomic_threshold Cristian Marussi
@ 2022-02-01 17:15 ` Cristian Marussi
  2022-02-01 17:16 ` [PATCH v2 8/9] [RFC] firmware: arm_scmi: Add support for clock_enable_latency Cristian Marussi
  2022-02-01 17:16 ` [PATCH v2 9/9] [RFC] clk: scmi: Support atomic clock enable/disable API Cristian Marussi
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:15 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

Introduce new _atomic variants of the SCMI clock protocol enable and disable
operations: when an atomic operation is required, the xfer poll_completion
flag is set for that transaction.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v1 --> v2
- removed enable_latency flag in clock_info since it does not belong to
  this commit really
---
 drivers/firmware/arm_scmi/clock.c | 22 +++++++++++++++++++---
 include/linux/scmi_protocol.h     |  3 +++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
index 35b56c8ba0c0..72f930c0e3e2 100644
--- a/drivers/firmware/arm_scmi/clock.c
+++ b/drivers/firmware/arm_scmi/clock.c
@@ -273,7 +273,7 @@ static int scmi_clock_rate_set(const struct scmi_protocol_handle *ph,
 
 static int
 scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
-		      u32 config)
+		      u32 config, bool atomic)
 {
 	int ret;
 	struct scmi_xfer *t;
@@ -284,6 +284,8 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 	if (ret)
 		return ret;
 
+	t->hdr.poll_completion = atomic;
+
 	cfg = t->tx.buf;
 	cfg->id = cpu_to_le32(clk_id);
 	cfg->attributes = cpu_to_le32(config);
@@ -296,12 +298,24 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 
 static int scmi_clock_enable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE);
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, false);
 }
 
 static int scmi_clock_disable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, 0);
+	return scmi_clock_config_set(ph, clk_id, 0, false);
+}
+
+static int scmi_clock_enable_atomic(const struct scmi_protocol_handle *ph,
+				    u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, true);
+}
+
+static int scmi_clock_disable_atomic(const struct scmi_protocol_handle *ph,
+				     u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, 0, true);
 }
 
 static int scmi_clock_count_get(const struct scmi_protocol_handle *ph)
@@ -330,6 +344,8 @@ static const struct scmi_clk_proto_ops clk_proto_ops = {
 	.rate_set = scmi_clock_rate_set,
 	.enable = scmi_clock_enable,
 	.disable = scmi_clock_disable,
+	.enable_atomic = scmi_clock_enable_atomic,
+	.disable_atomic = scmi_clock_disable_atomic,
 };
 
 static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index fdf6bd83cc59..306e576835f8 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -82,6 +82,9 @@ struct scmi_clk_proto_ops {
 			u64 rate);
 	int (*enable)(const struct scmi_protocol_handle *ph, u32 clk_id);
 	int (*disable)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*enable_atomic)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*disable_atomic)(const struct scmi_protocol_handle *ph,
+			      u32 clk_id);
 };
 
 /**
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 8/9] [RFC] firmware: arm_scmi: Add support for clock_enable_latency
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (6 preceding siblings ...)
  2022-02-01 17:15 ` [PATCH v2 7/9] firmware: arm_scmi: Add atomic support to clock protocol Cristian Marussi
@ 2022-02-01 17:16 ` Cristian Marussi
  2022-02-01 17:16 ` [PATCH v2 9/9] [RFC] clk: scmi: Support atomic clock enable/disable API Cristian Marussi
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:16 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi

An SCMI platform can optionally advertise an enable latency typically
associated with a specific clock resource: add support for parsing this
optional message field and export the information in the usual publicly
accessible clock descriptor.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
RFC tag is due to the fact that this SCMI spec Clock protocol field
addition is still to be published.

v1 --> v2
- moved enable_latency flag definition from the previous commit
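
The length-based detection of the optional trailing field can be sketched in
userspace C as follows. This is an illustration only: the layout (a
little-endian attributes word, a fixed-size name, then an optional
little-endian latency word) follows the struct in this patch, while
MAX_STR_SIZE, get_le32() and parse_clock_attributes() are hypothetical names.
Unlike the patch, the sketch requires all four bytes of the optional field to
be present before reading it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STR_SIZE 16

struct clock_info {
	char name[MAX_STR_SIZE];
	unsigned int enable_latency;
};

/* Decode a 32-bit little-endian value regardless of host endianness. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Parse a clock attributes reply of @len bytes. The latency field is
 * read only when the reply is long enough to carry it, otherwise it is
 * reported as zero.
 */
static bool parse_clock_attributes(const uint8_t *buf, size_t len,
				   struct clock_info *clk)
{
	const size_t base_len = sizeof(uint32_t) + MAX_STR_SIZE;

	if (len < base_len)
		return false;

	memcpy(clk->name, buf + sizeof(uint32_t), MAX_STR_SIZE);
	clk->name[MAX_STR_SIZE - 1] = '\0';

	clk->enable_latency = 0;
	if (len >= base_len + sizeof(uint32_t))
		clk->enable_latency = get_le32(buf + base_len);

	return true;
}
```

Keying on the reply length keeps older platforms working unchanged, at the
cost of being less explicit than a check on the protocol version.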
---
 drivers/firmware/arm_scmi/clock.c | 12 +++++++++---
 include/linux/scmi_protocol.h     |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
index 72f930c0e3e2..552610bfb91e 100644
--- a/drivers/firmware/arm_scmi/clock.c
+++ b/drivers/firmware/arm_scmi/clock.c
@@ -27,7 +27,8 @@ struct scmi_msg_resp_clock_protocol_attributes {
 struct scmi_msg_resp_clock_attributes {
 	__le32 attributes;
 #define	CLOCK_ENABLE	BIT(0)
-	    u8 name[SCMI_MAX_STR_SIZE];
+	u8 name[SCMI_MAX_STR_SIZE];
+	__le32 clock_enable_latency;
 };
 
 struct scmi_clock_set_config {
@@ -116,10 +117,15 @@ static int scmi_clock_attributes_get(const struct scmi_protocol_handle *ph,
 	attr = t->rx.buf;
 
 	ret = ph->xops->do_xfer(ph, t);
-	if (!ret)
+	if (!ret) {
 		strlcpy(clk->name, attr->name, SCMI_MAX_STR_SIZE);
-	else
+		/* TODO Use CLOCK Proto Versioning once SCMI spec is updated */
+		if (t->rx.len > sizeof(attr->attributes) + SCMI_MAX_STR_SIZE)
+			clk->enable_latency =
+				le32_to_cpu(attr->clock_enable_latency);
+	} else {
 		clk->name[0] = '\0';
+	}
 
 	ph->xops->xfer_put(ph, t);
 	return ret;
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 306e576835f8..b87551f41f9f 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -42,6 +42,7 @@ struct scmi_revision_info {
 
 struct scmi_clock_info {
 	char name[SCMI_MAX_STR_SIZE];
+	unsigned int enable_latency;
 	bool rate_discrete;
 	union {
 		struct {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 9/9] [RFC] clk: scmi: Support atomic clock enable/disable API
  2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
                   ` (7 preceding siblings ...)
  2022-02-01 17:16 ` [PATCH v2 8/9] [RFC] firmware: arm_scmi: Add support for clock_enable_latency Cristian Marussi
@ 2022-02-01 17:16 ` Cristian Marussi
  8 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-01 17:16 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Michael Turquette,
	Stephen Boyd, linux-clk

Also support the atomic enable/disable clk_ops, besides the bare non-atomic
ones (prepare/unprepare), when the underlying SCMI transport is configured
to support atomic transactions for synchronous commands.

Compare the SCMI system-wide configured atomic threshold latency with
the per-clock advertised enable latency (if any) to choose whether to
provide sleeping prepare/unprepare or atomic enable/disable.

Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-clk@vger.kernel.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
Tagged as RFC since dependent on a previous RFC patch
---
 drivers/clk/clk-scmi.c | 71 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 11 deletions(-)

diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
index 1e357d364ca2..2c7a830ce308 100644
--- a/drivers/clk/clk-scmi.c
+++ b/drivers/clk/clk-scmi.c
@@ -2,7 +2,7 @@
 /*
  * System Control and Power Interface (SCMI) Protocol based clock driver
  *
- * Copyright (C) 2018-2021 ARM Ltd.
+ * Copyright (C) 2018-2022 ARM Ltd.
  */
 
 #include <linux/clk-provider.h>
@@ -88,21 +88,51 @@ static void scmi_clk_disable(struct clk_hw *hw)
 	scmi_proto_clk_ops->disable(clk->ph, clk->id);
 }
 
+static int scmi_clk_atomic_enable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
+}
+
+static void scmi_clk_atomic_disable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
+}
+
+/*
+ * We can provide enable/disable atomic callbacks only if the underlying SCMI
+ * transport for an SCMI instance is configured to handle SCMI commands in an
+ * atomic manner.
+ *
+ * When no SCMI atomic transport support is available we instead provide only
+ * the prepare/unprepare API, as allowed by the clock framework when atomic
+ * calls are not available.
+ *
+ * Two distinct sets of clk_ops are provided since we could have multiple SCMI
+ * instances with different underlying transport quality, so they cannot be
+ * shared.
+ */
 static const struct clk_ops scmi_clk_ops = {
 	.recalc_rate = scmi_clk_recalc_rate,
 	.round_rate = scmi_clk_round_rate,
 	.set_rate = scmi_clk_set_rate,
-	/*
-	 * We can't provide enable/disable callback as we can't perform the same
-	 * in atomic context. Since the clock framework provides standard API
-	 * clk_prepare_enable that helps cases using clk_enable in non-atomic
-	 * context, it should be fine providing prepare/unprepare.
-	 */
 	.prepare = scmi_clk_enable,
 	.unprepare = scmi_clk_disable,
 };
 
-static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
+static const struct clk_ops scmi_atomic_clk_ops = {
+	.recalc_rate = scmi_clk_recalc_rate,
+	.round_rate = scmi_clk_round_rate,
+	.set_rate = scmi_clk_set_rate,
+	.enable = scmi_clk_atomic_enable,
+	.disable = scmi_clk_atomic_disable,
+};
+
+static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk,
+			     const struct clk_ops *scmi_ops)
 {
 	int ret;
 	unsigned long min_rate, max_rate;
@@ -110,7 +140,7 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
 	struct clk_init_data init = {
 		.flags = CLK_GET_RATE_NOCACHE,
 		.num_parents = 0,
-		.ops = &scmi_clk_ops,
+		.ops = scmi_ops,
 		.name = sclk->info->name,
 	};
 
@@ -139,6 +169,8 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
 static int scmi_clocks_probe(struct scmi_device *sdev)
 {
 	int idx, count, err;
+	unsigned int atomic_threshold;
+	bool is_atomic;
 	struct clk_hw **hws;
 	struct clk_hw_onecell_data *clk_data;
 	struct device *dev = &sdev->dev;
@@ -168,8 +200,11 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 	clk_data->num = count;
 	hws = clk_data->hws;
 
+	is_atomic = handle->is_transport_atomic(handle, &atomic_threshold);
+
 	for (idx = 0; idx < count; idx++) {
 		struct scmi_clk *sclk;
+		const struct clk_ops *scmi_ops;
 
 		sclk = devm_kzalloc(dev, sizeof(*sclk), GFP_KERNEL);
 		if (!sclk)
@@ -184,13 +219,27 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 		sclk->id = idx;
 		sclk->ph = ph;
 
-		err = scmi_clk_ops_init(dev, sclk);
+		/*
+		 * Note that when transport is atomic but SCMI protocol did not
+		 * specify (or support) an enable_latency associated with a
+		 * clock, we default to use atomic operations mode.
+		 */
+		if (is_atomic &&
+		    sclk->info->enable_latency <= atomic_threshold)
+			scmi_ops = &scmi_atomic_clk_ops;
+		else
+			scmi_ops = &scmi_clk_ops;
+
+		err = scmi_clk_ops_init(dev, sclk, scmi_ops);
 		if (err) {
 			dev_err(dev, "failed to register clock %d\n", idx);
 			devm_kfree(dev, sclk);
 			hws[idx] = NULL;
 		} else {
-			dev_dbg(dev, "Registered clock:%s\n", sclk->info->name);
+			dev_dbg(dev, "Registered clock:%s%s\n",
+				sclk->info->name,
+				scmi_ops == &scmi_atomic_clk_ops ?
+				" (atomic ops)" : "");
 			hws[idx] = &sclk->hw;
 		}
 	}
-- 
2.17.1
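
The per-clock decision made in scmi_clocks_probe() boils down to a single
predicate. A minimal standalone sketch follows; enum clk_ops_kind and
select_clk_ops() are hypothetical names used only for illustration, while
the condition itself is the one in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tags standing in for the two clk_ops tables. */
enum clk_ops_kind { OPS_SLEEPING, OPS_ATOMIC };

/*
 * Atomic ops are chosen only when the transport is atomic AND the clock's
 * advertised enable latency does not exceed the configured threshold.
 * A latency of 0 (not advertised) therefore selects atomic ops whenever
 * the transport is atomic, matching the comment in the patch.
 */
static enum clk_ops_kind select_clk_ops(bool transport_atomic,
					unsigned int atomic_threshold,
					unsigned int enable_latency)
{
	if (transport_atomic && enable_latency <= atomic_threshold)
		return OPS_ATOMIC;

	return OPS_SLEEPING;
}
```

For example, with a threshold of 100 a clock advertising a latency of 50
would get atomic ops, while one advertising 150 would fall back to
prepare/unprepare.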



* Re: [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  2022-02-01 17:15 ` [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value Cristian Marussi
@ 2022-02-01 18:27   ` Michael S. Tsirkin
  2022-02-03 10:51     ` Cristian Marussi
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2022-02-01 18:27 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin, virtualization

Looks correct, thanks. Some minor comments below:

On Tue, Feb 01, 2022 at 05:15:55PM +0000, Cristian Marussi wrote:
> Exported API virtqueue_poll() can be used to support polling mode operation
> on top of virtio layer if needed; currently the parameter last_used_idx is
> the opaque value that needs to be passed to the virtqueue_poll() function
> to check if there are new pending used buffers in the queue: such opaque
> value would have been previously obtained by a call to the API function
> virtqueue_enable_cb_prepare().
> 
> Since such opaque value is indeed containing simply a snapshot in time of
> the internal

to add: 16 bit

> last_used_index (roughly), it is possible that,

to add here: 

if another thread calls virtqueue_add_*()
at the same time (which existing drivers don't do,
but does not seem to be documented as prohibited anywhere), and

> if exactly
> 2**16 buffers are marked as used between two successive calls to
> virtqueue_poll(), the caller is fooled into thinking that nothing is
> pending (ABA problem).
> Keep a full fledged internal wraps counter

s/full fledged/a 16 bit/

since I don't see why is a 16 bit counter full but not e.g. a 32 bit one

> per virtqueue and embed it into
> the upper 16bits of the returned opaque value, so that the above scenario
> can be detected transparently by virtqueue_poll(): this way each single
> possible last_used_idx value is really belonging to a different wrap.

Just to add here: the ABA problem can in theory still happen but
now that's after 2^32 requests, which seems sufficient in practice.

> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> Still no perf data on this, I was wondering what exactly to measure in
> term of perf metrics to evaluate the impact of the rolling vq->wraps
> counter.
> ---
>  drivers/virtio/virtio_ring.c | 51 +++++++++++++++++++++++++++++++++---
>  1 file changed, 47 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 00f64f2f8b72..613ec0503509 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -12,6 +12,8 @@
>  #include <linux/hrtimer.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/spinlock.h>
> +#include <linux/bits.h>
> +#include <linux/bitfield.h>
>  #include <xen/xen.h>
>  
>  static bool force_used_validation = false;
> @@ -69,6 +71,17 @@ module_param(force_used_validation, bool, 0444);
>  #define LAST_ADD_TIME_INVALID(vq)
>  #endif
>  
> +#define VRING_IDX_MASK					GENMASK(15, 0)
> +#define VRING_GET_IDX(opaque)				\
> +	((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> +
> +#define VRING_WRAPS_MASK				GENMASK(31, 16)
> +#define VRING_GET_WRAPS(opaque)				\
> +	((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> +
> +#define VRING_BUILD_OPAQUE(idx, wraps)			\
> +	(FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> +

Maybe prefix with VRING_POLL_  since that is the only user.


>  struct vring_desc_state_split {
>  	void *data;			/* Data for callback. */
>  	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
> @@ -117,6 +130,8 @@ struct vring_virtqueue {
>  	/* Last used index we've seen. */
>  	u16 last_used_idx;
>  
> +	u16 wraps;
> +
>  	/* Hint for event idx: already triggered no need to disable. */
>  	bool event_triggered;
>  
> @@ -806,6 +821,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>  	ret = vq->split.desc_state[i].data;
>  	detach_buf_split(vq, i, ctx);
>  	vq->last_used_idx++;
> +	if (unlikely(!vq->last_used_idx))
> +		vq->wraps++;
>  	/* If we expect an interrupt for the next entry, tell host
>  	 * by writing event index and flush out the write before
>  	 * the read in the next get_buf call. */

So most drivers don't call virtqueue_poll.
I am concerned about the overhead here: another option is
a flag that a driver has to set whenever it
wants to use virtqueue_poll.
Could you pls do a quick perf test e.g. using tools/virtio/
to see what's faster?



> @@ -1508,6 +1525,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
>  	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
>  		vq->last_used_idx -= vq->packed.vring.num;
>  		vq->packed.used_wrap_counter ^= 1;
> +		vq->wraps++;
>  	}
>  
>  	/*
> @@ -1744,6 +1762,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>  	vq->weak_barriers = weak_barriers;
>  	vq->broken = false;
>  	vq->last_used_idx = 0;
> +	vq->wraps = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
>  	vq->packed_ring = true;
> @@ -2092,13 +2111,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>   */
>  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
> +	unsigned int last_used_idx;
>  	struct vring_virtqueue *vq = to_vvq(_vq);
>  
>  	if (vq->event_triggered)
>  		vq->event_triggered = false;
>  
> -	return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> -				 virtqueue_enable_cb_prepare_split(_vq);
> +	last_used_idx = vq->packed_ring ?
> +			virtqueue_enable_cb_prepare_packed(_vq) :
> +			virtqueue_enable_cb_prepare_split(_vq);
> +
> +	return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>  
> @@ -2107,6 +2130,21 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>   * @_vq: the struct virtqueue we're talking about.
>   * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
>   *
> + * The provided last_used_idx, as returned by virtqueue_enable_cb_prepare(),
> + * is an opaque value representing the queue state and it is built as follows:
> + *
> + *	---------------------------------------------------------
> + *	|	vq->wraps	|	vq->last_used_idx	|
> + *	31------------------------------------------------------0
> + *
> + * The MSB 16bits embedding the wraps counter for the underlying virtqueue
> + * is stripped out here before reaching into the lower layer helpers.
> + *
> + * This structure of the opaque value mitigates the scenario in which, when
> + * exactly 2**16 messages are marked as used between two successive calls to
> + * virtqueue_poll(), the caller is fooled into thinking nothing new has arrived
> + * since the pure last_used_idx is exactly the same.
> + *

Do you want to move this comment to where the macros implementing it
are?

>   * Returns "true" if there are pending used buffers in the queue.
>   *
>   * This does not need to be serialized.
> @@ -2118,9 +2156,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
>  	if (unlikely(vq->broken))
>  		return false;
>  
> +	if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> +		return true;
> +
>  	virtio_mb(vq->weak_barriers);
> -	return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> -				 virtqueue_poll_split(_vq, last_used_idx);
> +	return vq->packed_ring ?
> +		virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> +			virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> @@ -2245,6 +2287,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>  	vq->weak_barriers = weak_barriers;
>  	vq->broken = false;
>  	vq->last_used_idx = 0;
> +	vq->wraps = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
>  	vq->use_dma_api = vring_use_dma_api(vdev);
> -- 
> 2.17.1
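
The ABA scenario discussed in this message can be reproduced with a small
userspace model. This is a sketch under simplifying assumptions, not the
virtio_ring code: struct fake_vq and all the helpers are hypothetical, and
the model collapses the device-side used index and the driver-side
last_used_idx into a single 16-bit counter. It only shows why a bare 16-bit
snapshot looks unchanged after exactly 2**16 buffers, and how the wraps
counter in the upper 16 bits of the opaque value exposes the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vring_virtqueue fields involved. */
struct fake_vq {
	uint16_t last_used_idx;
	uint16_t wraps;
};

/* Pack the wraps counter into bits 31..16 and the index into bits 15..0. */
static uint32_t poll_build_opaque(uint16_t idx, uint16_t wraps)
{
	return ((uint32_t)wraps << 16) | idx;
}

static uint16_t poll_get_idx(uint32_t opaque)
{
	return (uint16_t)(opaque & 0xffff);
}

static uint16_t poll_get_wraps(uint32_t opaque)
{
	return (uint16_t)(opaque >> 16);
}

/* Consume n used buffers, bumping wraps each time the index rolls over. */
static void consume(struct fake_vq *vq, uint32_t n)
{
	while (n--) {
		vq->last_used_idx++;
		if (vq->last_used_idx == 0)
			vq->wraps++;
	}
}

/* Take the opaque snapshot a poller would later hand back. */
static uint32_t snapshot(const struct fake_vq *vq)
{
	return poll_build_opaque(vq->last_used_idx, vq->wraps);
}

/* True if the queue state moved since the snapshot was taken. */
static bool changed_since(const struct fake_vq *vq, uint32_t opaque)
{
	if (vq->wraps != poll_get_wraps(opaque))
		return true;

	return vq->last_used_idx != poll_get_idx(opaque);
}
```

After consume(&vq, 65536) the index part of a fresh snapshot equals the old
one, yet changed_since() reports a change because the wraps field differs.
The same blind spot reappears only after 2**32 buffers, when both 16-bit
fields return to their old values.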



* Re: [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount
  2022-02-01 17:15 ` [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount Cristian Marussi
@ 2022-02-01 22:05   ` Michael S. Tsirkin
  2022-02-03 10:08     ` Cristian Marussi
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2022-02-01 22:05 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin

On Tue, Feb 01, 2022 at 05:15:53PM +0000, Cristian Marussi wrote:
> Currently SCMI VirtIO channels are marked with a ready flag and related
> lock to track channel lifetime and support proper synchronization at
> shutdown when virtqueues have to be stopped.
> 
> This leads to some extended spinlocked sections with IRQs off on the RX
> path to keep hold of the ready flag and does not scale well especially when
> SCMI VirtIO polling mode will be introduced.
> 
> Add an SCMI VirtIO channel dedicated refcount to track active users on both
> the TX and the RX path and properly enforce synchronization and cleanup at
> shutdown, inhibiting further usage of the channel once freed.
> 
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
>  drivers/firmware/arm_scmi/virtio.c | 148 +++++++++++++++++++----------
>  1 file changed, 100 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..536e46eab462 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -17,7 +17,9 @@
>   * virtqueue. Access to each virtqueue is protected by spinlocks.
>   */
>  
> +#include <linux/completion.h>
>  #include <linux/errno.h>
> +#include <linux/refcount.h>
>  #include <linux/slab.h>
>  #include <linux/virtio.h>
>  #include <linux/virtio_config.h>
> @@ -27,6 +29,7 @@
>  
>  #include "common.h"
>  
> +#define VIRTIO_MAX_RX_TIMEOUT_MS	60000
>  #define VIRTIO_SCMI_MAX_MSG_SIZE 128 /* Value may be increased. */
>  #define VIRTIO_SCMI_MAX_PDU_SIZE \
>  	(VIRTIO_SCMI_MAX_MSG_SIZE + SCMI_MSG_MAX_PROT_OVERHEAD)
> @@ -39,23 +42,21 @@
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
>   * @is_rx: Whether channel is an Rx channel
> - * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> - * @lock: Protects access to all members except ready.
> - * @ready_lock: Protects access to ready. If required, it must be taken before
> - *              lock.
> + * @lock: Protects access to all members except users.
> + * @shutdown_done: A reference to a completion used when freeing this channel.
> + * @users: A reference count to currently active users of this channel.
>   */
>  struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
>  	bool is_rx;
> -	bool ready;
>  	unsigned int max_msg;
> -	/* lock to protect access to all members except ready. */
> +	/* lock to protect access to all members except users. */
>  	spinlock_t lock;
> -	/* lock to rotects access to ready flag. */
> -	spinlock_t ready_lock;
> +	struct completion *shutdown_done;
> +	refcount_t users;
>  };
>  
>  /**
> @@ -76,6 +77,71 @@ struct scmi_vio_msg {
>  /* Only one SCMI VirtIO device can possibly exist */
>  static struct virtio_device *scmi_vdev;
>  
> +static void scmi_vio_channel_ready(struct scmi_vio_channel *vioch,
> +				   struct scmi_chan_info *cinfo)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	cinfo->transport_info = vioch;
> +	/* Indirectly setting channel not available any more */
> +	vioch->cinfo = cinfo;
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	refcount_set(&vioch->users, 1);
> +}
> +
> +static inline bool scmi_vio_channel_acquire(struct scmi_vio_channel *vioch)
> +{
> +	return refcount_inc_not_zero(&vioch->users);
> +}
> +
> +static inline void scmi_vio_channel_release(struct scmi_vio_channel *vioch)
> +{
> +	if (refcount_dec_and_test(&vioch->users)) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (vioch->shutdown_done) {
> +			vioch->cinfo = NULL;
> +			complete(vioch->shutdown_done);
> +		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
> +	}
> +}
> +
> +static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
> +{
> +	int timeout;
> +	char *vq_name;
> +	unsigned long flags;
> +	struct device *dev;
> +	DECLARE_COMPLETION_ONSTACK(vioch_shutdown_done);
> +
> +	/*
> +	 * Prepare to wait for the last release if not already released
> +	 * or in progress.
> +	 */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!vioch->cinfo || vioch->shutdown_done) {
> +		spin_unlock_irqrestore(&vioch->lock, flags);
> +		return;
> +	}
> +	vioch->shutdown_done = &vioch_shutdown_done;
> +	vq_name = vioch->is_rx ? "RX" : "TX";
> +	/* vioch->cinfo could be NULLified after the release */
> +	dev = vioch->cinfo->dev;
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	scmi_vio_channel_release(vioch);
> +
> +	timeout = msecs_to_jiffies(VIRTIO_MAX_RX_TIMEOUT_MS + 10);
> +	/* Let any possibly concurrent RX path release the channel */
> +	if (!wait_for_completion_timeout(vioch->shutdown_done, timeout))
> +		dev_warn(dev,
> +			 "Timeout shutting down %s VQ.\n", vq_name);
> +}
> +

Hmm. So if it times out then what? It's ok to corrupt memory then?
Why? I suspect if you want to recover from this you need to mark device
as broken, synchronize with all callbacks (we don't have an API
for that but we really should). Only then you will know it's
not doing anything.


>  static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  {
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> @@ -119,7 +185,7 @@ static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  {
> -	unsigned long ready_flags;
> +	unsigned long flags;
>  	unsigned int length;
>  	struct scmi_vio_channel *vioch;
>  	struct scmi_vio_msg *msg;
> @@ -130,27 +196,27 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	vioch = &((struct scmi_vio_channel *)vqueue->vdev->priv)[vqueue->index];
>  
>  	for (;;) {
> -		spin_lock_irqsave(&vioch->ready_lock, ready_flags);
> -
> -		if (!vioch->ready) {
> +		if (!scmi_vio_channel_acquire(vioch)) {
>  			if (!cb_enabled)
>  				(void)virtqueue_enable_cb(vqueue);
> -			goto unlock_ready_out;
> +			return;
>  		}
>  
> -		/* IRQs already disabled here no need to irqsave */
> -		spin_lock(&vioch->lock);
> +		spin_lock_irqsave(&vioch->lock, flags);
>  		if (cb_enabled) {
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
> -			if (virtqueue_enable_cb(vqueue))
> -				goto unlock_out;
> +			if (virtqueue_enable_cb(vqueue)) {
> +				spin_unlock_irqrestore(&vioch->lock, flags);
> +				scmi_vio_channel_release(vioch);
> +				return;
> +			}
>  			cb_enabled = true;
>  		}
> -		spin_unlock(&vioch->lock);
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  
>  		if (msg) {
>  			msg->rx_len = length;
> @@ -161,19 +227,14 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  		}
>  
>  		/*
> -		 * Release ready_lock and re-enable IRQs between loop iterations
> -		 * to allow virtio_chan_free() to possibly kick in and set the
> -		 * flag vioch->ready to false even in between processing of
> -		 * messages, so as to force outstanding messages to be ignored
> -		 * when system is shutting down.
> +		 * Release vio channel between loop iterations to allow
> +		 * virtio_chan_free() to eventually fully release it when
> +		 * shutting down; in such a case, any outstanding message will
> +		 * be ignored since this loop will bail out at the next
> +		 * iteration.
>  		 */
> -		spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
> +		scmi_vio_channel_release(vioch);
>  	}
> -
> -unlock_out:
> -	spin_unlock(&vioch->lock);
> -unlock_ready_out:
> -	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
> @@ -273,35 +334,20 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		}
>  	}
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -	cinfo->transport_info = vioch;
> -	/* Indirectly setting channel not available any more */
> -	vioch->cinfo = cinfo;
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
> -	spin_lock_irqsave(&vioch->ready_lock, flags);
> -	vioch->ready = true;
> -	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> +	scmi_vio_channel_ready(vioch, cinfo);
>  
>  	return 0;
>  }
>  
>  static int virtio_chan_free(int id, void *p, void *data)
>  {
> -	unsigned long flags;
>  	struct scmi_chan_info *cinfo = p;
>  	struct scmi_vio_channel *vioch = cinfo->transport_info;
>  
> -	spin_lock_irqsave(&vioch->ready_lock, flags);
> -	vioch->ready = false;
> -	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> +	scmi_vio_channel_cleanup_sync(vioch);
>  
>  	scmi_free_channel(cinfo, data, id);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -	vioch->cinfo = NULL;
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return 0;
>  }
>  
> @@ -316,10 +362,14 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	int rc;
>  	struct scmi_vio_msg *msg;
>  
> +	if (!scmi_vio_channel_acquire(vioch))
> +		return -EINVAL;
> +
>  	spin_lock_irqsave(&vioch->lock, flags);
>  
>  	if (list_empty(&vioch->free_list)) {
>  		spin_unlock_irqrestore(&vioch->lock, flags);
> +		scmi_vio_channel_release(vioch);
>  		return -EBUSY;
>  	}
>  
> @@ -342,6 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  
>  	spin_unlock_irqrestore(&vioch->lock, flags);
>  
> +	scmi_vio_channel_release(vioch);
> +
>  	return rc;
>  }
>  
> @@ -416,7 +468,6 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		unsigned int sz;
>  
>  		spin_lock_init(&channels[i].lock);
> -		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
>  		channels[i].vqueue = vqs[i];
>  
> @@ -503,7 +554,8 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.transport_init = virtio_scmi_init,
>  	.transport_exit = virtio_scmi_exit,
>  	.ops = &scmi_virtio_ops,
> -	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
> +	/* for non-realtime virtio devices */
> +	.max_rx_timeout_ms = VIRTIO_MAX_RX_TIMEOUT_MS,
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
>  };
> -- 
> 2.17.1
> 
> 



* Re: [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount
  2022-02-01 22:05   ` Michael S. Tsirkin
@ 2022-02-03 10:08     ` Cristian Marussi
  0 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-03 10:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin

On Tue, Feb 01, 2022 at 05:05:27PM -0500, Michael S. Tsirkin wrote:
> On Tue, Feb 01, 2022 at 05:15:53PM +0000, Cristian Marussi wrote:
> > Currently SCMI VirtIO channels are marked with a ready flag and related
> > lock to track channel lifetime and support proper synchronization at
> > shutdown when virtqueues have to be stopped.
> > 
> > This leads to some extended spinlocked sections with IRQs off on the RX
> > path to keep hold of the ready flag and does not scale well especially when
> > SCMI VirtIO polling mode will be introduced.
> > 
> > Add an SCMI VirtIO channel dedicated refcount to track active users on both
> > the TX and the RX path and properly enforce synchronization and cleanup at
> > shutdown, inhibiting further usage of the channel once freed.
> > 
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> >  drivers/firmware/arm_scmi/virtio.c | 148 +++++++++++++++++++----------
> >  1 file changed, 100 insertions(+), 48 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..536e46eab462 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -17,7 +17,9 @@
> >   * virtqueue. Access to each virtqueue is protected by spinlocks.
> >   */
> >  

Hi Michael,

thanks for your review first of all.

> > +#include <linux/completion.h>
> >  #include <linux/errno.h>
> > +#include <linux/refcount.h>
> >  #include <linux/slab.h>
> >  #include <linux/virtio.h>
> >  #include <linux/virtio_config.h>
> > @@ -27,6 +29,7 @@
> >  
> >  #include "common.h"
> >  
> > +#define VIRTIO_MAX_RX_TIMEOUT_MS	60000
> >  #define VIRTIO_SCMI_MAX_MSG_SIZE 128 /* Value may be increased. */
> >  #define VIRTIO_SCMI_MAX_PDU_SIZE \
> >  	(VIRTIO_SCMI_MAX_MSG_SIZE + SCMI_MSG_MAX_PROT_OVERHEAD)
> > @@ -39,23 +42,21 @@
> >   * @cinfo: SCMI Tx or Rx channel
> >   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> >   * @is_rx: Whether channel is an Rx channel
> > - * @ready: Whether transport user is ready to hear about channel
> >   * @max_msg: Maximum number of pending messages for this channel.
> > - * @lock: Protects access to all members except ready.
> > - * @ready_lock: Protects access to ready. If required, it must be taken before
> > - *              lock.
> > + * @lock: Protects access to all members except users.
> > + * @shutdown_done: A reference to a completion used when freeing this channel.
> > + * @users: A reference count to currently active users of this channel.
> >   */
> >  struct scmi_vio_channel {
> >  	struct virtqueue *vqueue;
> >  	struct scmi_chan_info *cinfo;
> >  	struct list_head free_list;
> >  	bool is_rx;
> > -	bool ready;
> >  	unsigned int max_msg;
> > -	/* lock to protect access to all members except ready. */
> > +	/* lock to protect access to all members except users. */
> >  	spinlock_t lock;
> > -	/* lock to rotects access to ready flag. */
> > -	spinlock_t ready_lock;
> > +	struct completion *shutdown_done;
> > +	refcount_t users;
> >  };
> >  
> >  /**
> > @@ -76,6 +77,71 @@ struct scmi_vio_msg {
> >  /* Only one SCMI VirtIO device can possibly exist */
> >  static struct virtio_device *scmi_vdev;
> >  
> > +static void scmi_vio_channel_ready(struct scmi_vio_channel *vioch,
> > +				   struct scmi_chan_info *cinfo)
> > +{
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	cinfo->transport_info = vioch;
> > +	/* Indirectly setting channel not available any more */
> > +	vioch->cinfo = cinfo;
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	refcount_set(&vioch->users, 1);
> > +}
> > +
> > +static inline bool scmi_vio_channel_acquire(struct scmi_vio_channel *vioch)
> > +{
> > +	return refcount_inc_not_zero(&vioch->users);
> > +}
> > +
> > +static inline void scmi_vio_channel_release(struct scmi_vio_channel *vioch)
> > +{
> > +	if (refcount_dec_and_test(&vioch->users)) {
> > +		unsigned long flags;
> > +
> > +		spin_lock_irqsave(&vioch->lock, flags);
> > +		if (vioch->shutdown_done) {
> > +			vioch->cinfo = NULL;
> > +			complete(vioch->shutdown_done);
> > +		}
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> > +	}
> > +}
> > +
> > +static void scmi_vio_channel_cleanup_sync(struct scmi_vio_channel *vioch)
> > +{
> > +	int timeout;
> > +	char *vq_name;
> > +	unsigned long flags;
> > +	struct device *dev;
> > +	DECLARE_COMPLETION_ONSTACK(vioch_shutdown_done);
> > +
> > +	/*
> > +	 * Prepare to wait for the last release if not already released
> > +	 * or in progress.
> > +	 */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!vioch->cinfo || vioch->shutdown_done) {
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> > +		return;
> > +	}
> > +	vioch->shutdown_done = &vioch_shutdown_done;
> > +	vq_name = vioch->is_rx ? "RX" : "TX";
> > +	/* vioch->cinfo could be NULLified after the release */
> > +	dev = vioch->cinfo->dev;
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	scmi_vio_channel_release(vioch);
> > +
> > +	timeout = msecs_to_jiffies(VIRTIO_MAX_RX_TIMEOUT_MS + 10);
> > +	/* Let any possibly concurrent RX path release the channel */
> > +	if (!wait_for_completion_timeout(vioch->shutdown_done, timeout))
> > +		dev_warn(dev,
> > +			 "Timeout shutting down %s VQ.\n", vq_name);
> > +}
> > +
> 
> Hmm. So if it times out then what? It's ok to corrupt memory then?
> Why? I suspect if you want to recover from this you need to mark device
> as broken, synchronize with all callbacks (we don't have an API
> for that but we really should). Only then you will know it's
> not doing anything.
> 

In fact I was not sure how to address this situation, nor even whether
it was a real possibility, but indeed a constant flood of evilly
interleaved messages coming from the queues could keep the refcount
indefinitely > 0 (and the fact that I am shutting down is no excuse to
corrupt memory...my bad)

So what if I mark the device broken here BEFORE this possible final
release, so as to completely disrupt any communication on the virtqueues
first (also inhibiting any callback invocation on IRQ) and then sync
on my channel refcount as it is now
(and then just wait indefinitely, without any timeout)?

Something like (not tested yet):

....
	vq_name = vioch->is_rx ? "RX" : "TX";
	/* vioch->cinfo could be NULLified after the release */
	dev = vioch->cinfo->dev;
+	virtio_break_device(vioch->vqueue->vdev);
	spin_unlock_irqrestore(&vioch->lock, flags);

	scmi_vio_channel_release(vioch);

	/* Let any possibly concurrent RX path release the channel */
+	wait_for_completion(vioch->shutdown_done);


Thanks,
Cristian
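
For reference, the acquire/release discipline behind the channel refcount
can be modelled in userspace with C11 atomics. This is only a sketch with
hypothetical names (struct channel, channel_ready(), channel_acquire(),
channel_release()); it stands in for refcount_inc_not_zero() and
refcount_dec_and_test() and replaces the completion with a plain flag.
Unlike the patch, it marks shutdown as done on the last release
unconditionally and takes no spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the refcounted channel. */
struct channel {
	atomic_uint users;         /* 0 means the channel is no longer usable */
	atomic_bool shutdown_done; /* stands in for the completion */
};

/* Mark the channel usable: one initial reference, dropped at cleanup. */
static void channel_ready(struct channel *ch)
{
	atomic_init(&ch->shutdown_done, false);
	atomic_init(&ch->users, 1);
}

/* Take a reference unless the count has already dropped to zero. */
static bool channel_acquire(struct channel *ch)
{
	unsigned int old = atomic_load(&ch->users);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ch->users, &old, old + 1))
			return true;
	}

	return false;
}

/* Drop a reference; the last one out signals that shutdown is complete. */
static void channel_release(struct channel *ch)
{
	assert(atomic_load(&ch->users) > 0);
	if (atomic_fetch_sub(&ch->users, 1) == 1)
		atomic_store(&ch->shutdown_done, true);
}
```

Cleanup drops the initial reference, so shutdown completes only once every
in-flight user has released the channel, and any later acquire fails. The
model also shows the concern raised in the review: if users keep arriving
before the count reaches zero, completion can be delayed indefinitely
unless new activity on the queue is stopped first.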



* Re: [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  2022-02-01 18:27   ` Michael S. Tsirkin
@ 2022-02-03 10:51     ` Cristian Marussi
  2022-02-03 11:32       ` Michael S. Tsirkin
  0 siblings, 1 reply; 16+ messages in thread
From: Cristian Marussi @ 2022-02-03 10:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin, virtualization

On Tue, Feb 01, 2022 at 01:27:38PM -0500, Michael S. Tsirkin wrote:
> Looks correct, thanks. Some minor comments below:
> 

Hi Michael,

thanks for the feedback.

> On Tue, Feb 01, 2022 at 05:15:55PM +0000, Cristian Marussi wrote:
> > Exported API virtqueue_poll() can be used to support polling mode operation
> > on top of virtio layer if needed; currently the parameter last_used_idx is
> > the opaque value that needs to be passed to the virtqueue_poll() function
> > to check if there are new pending used buffers in the queue: such opaque
> > value would have been previously obtained by a call to the API function
> > virtqueue_enable_cb_prepare().
> > 
> > Since such opaque value is indeed containing simply a snapshot in time of
> > the internal
> 
> to add: 16 bit
> 
> > last_used_index (roughly), it is possible that,
> 
> to add here: 
> 
> if another thread calls virtqueue_add_*()
> at the same time (which existing drivers don't do,
> but does not seem to be documented as prohibited anywhere), and
> 
> > if exactly
> > 2**16 buffers are marked as used between two successive calls to
> > virtqueue_poll(), the caller is fooled into thinking that nothing is
> > pending (ABA problem).
> > Keep a full fledged internal wraps counter
> 
> s/full fledged/a 16 bit/
> 
> since I don't see why is a 16 bit counter full but not e.g. a 32 bit one
> 
.. :D I wanted to stress the fact that this 16-bit counter has a much
longer rollover period than the 1-bit wrap_counter already in use... but
indeed they are all just counters in the end, it's just the wraparound
that changes...

I'll fix.

> > per virtqueue and embed it into
> > the upper 16bits of the returned opaque value, so that the above scenario
> > can be detected transparently by virtqueue_poll(): this way each single
> > possible last_used_idx value is really belonging to a different wrap.
> 
> Just to add here: the ABA problem can in theory still happen but
> now that's after 2^32 requests, which seems sufficient in practice.
> 

Sure, I'll fix the commit message as advised above.
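
For readers following along, the ABA scenario and the effect of the embedded
wraps counter can be reproduced with a small user-space model. This is a
sketch only: the names are hypothetical rather than the kernel macros, and
the model collapses the device-side used index and the driver-side
last_used_idx into a single counter for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model only: names are hypothetical, not the kernel macros. */
struct model_vq {
	uint16_t last_used_idx;	/* free-running 16-bit index (split ring) */
	uint16_t wraps;		/* bumped whenever last_used_idx rolls over */
};

/* Upper 16 bits: wraps counter; lower 16 bits: last used index. */
static uint32_t model_build_opaque(const struct model_vq *vq)
{
	return ((uint32_t)vq->wraps << 16) | vq->last_used_idx;
}

/* Account for n consumed buffers, counting index rollovers. */
static void model_consume(struct model_vq *vq, uint32_t n)
{
	while (n--) {
		vq->last_used_idx++;
		if (!vq->last_used_idx)
			vq->wraps++;
	}
}

/* Index-only comparison: what a bare 16-bit snapshot can detect. */
static bool model_poll_idx_only(const struct model_vq *vq, uint32_t opaque)
{
	return (uint16_t)opaque != vq->last_used_idx;
}

/* Wrap-aware comparison: both halves of the opaque value must match. */
static bool model_poll_with_wraps(const struct model_vq *vq, uint32_t opaque)
{
	return opaque != model_build_opaque(vq);
}

/* Exactly 2**16 buffers between two polls: only the second check notices. */
static void model_selfcheck(void)
{
	struct model_vq vq = { 0, 0 };
	uint32_t snapshot = model_build_opaque(&vq);

	model_consume(&vq, UINT32_C(65536));
	assert(!model_poll_idx_only(&vq, snapshot));
	assert(model_poll_with_wraps(&vq, snapshot));
}
```

Once 2^32 buffers have gone by, the 16-bit wraps counter rolls over as well
and the wrap-aware check is fooled too, which is the residual case Michael
points out.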

> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > Still no perf data on this, I was wondering what exactly to measure in
> > term of perf metrics to evaluate the impact of the rolling vq->wraps
> > counter.
> > ---
> >  drivers/virtio/virtio_ring.c | 51 +++++++++++++++++++++++++++++++++---
> >  1 file changed, 47 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 00f64f2f8b72..613ec0503509 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -12,6 +12,8 @@
> >  #include <linux/hrtimer.h>
> >  #include <linux/dma-mapping.h>
> >  #include <linux/spinlock.h>
> > +#include <linux/bits.h>
> > +#include <linux/bitfield.h>
> >  #include <xen/xen.h>
> >  
> >  static bool force_used_validation = false;
> > @@ -69,6 +71,17 @@ module_param(force_used_validation, bool, 0444);
> >  #define LAST_ADD_TIME_INVALID(vq)
> >  #endif
> >  
> > +#define VRING_IDX_MASK					GENMASK(15, 0)
> > +#define VRING_GET_IDX(opaque)				\
> > +	((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > +
> > +#define VRING_WRAPS_MASK				GENMASK(31, 16)
> > +#define VRING_GET_WRAPS(opaque)				\
> > +	((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > +
> > +#define VRING_BUILD_OPAQUE(idx, wraps)			\
> > +	(FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > +
> 
> Maybe prefix with VRING_POLL_  since that is the only user.
> 

I'll do.

> 
> >  struct vring_desc_state_split {
> >  	void *data;			/* Data for callback. */
> >  	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
> > @@ -117,6 +130,8 @@ struct vring_virtqueue {
> >  	/* Last used index we've seen. */
> >  	u16 last_used_idx;
> >  
> > +	u16 wraps;
> > +
> >  	/* Hint for event idx: already triggered no need to disable. */
> >  	bool event_triggered;
> >  
> > @@ -806,6 +821,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> >  	ret = vq->split.desc_state[i].data;
> >  	detach_buf_split(vq, i, ctx);
> >  	vq->last_used_idx++;
> > +	if (unlikely(!vq->last_used_idx))
> > +		vq->wraps++;
> >  	/* If we expect an interrupt for the next entry, tell host
> >  	 * by writing event index and flush out the write before
> >  	 * the read in the next get_buf call. */
> 
> So most drivers don't call virtqueue_poll.
> Concerned about the overhead here: another option is
> with a flag that will have to be set whenever a driver
> wants to use virtqueue_poll.

Do you mean a compile-time flag/Kconfig to remove the possible overhead
instructions altogether when not needed by the driver?

Or do you mean a runtime flag, since checking the flag every time should
be less costly than checking the wraps each time AND counting when it
happens?

> Could you pls do a quick perf test e.g. using tools/virtio/
> to see what's faster?

Yes, I'll do that, thanks for the hint. I had some compilation issues in
tools/virtio due to my additions (missing mirrored headers) or to some
recently added stuff (missing drv_to_virtio & friends for the
suppressed_used_validation thing)... anyway, I have fixed those now and
I'll post the related tools/virtio patches with the next iteration.

Anyway, you mean perf data from vringh_test and virtio_test/vhost,
right? (ringtest/ excluded because it does not use any API and is just
prototyping.)

> 
> 
> 
> > @@ -1508,6 +1525,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> >  	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> >  		vq->last_used_idx -= vq->packed.vring.num;
> >  		vq->packed.used_wrap_counter ^= 1;
> > +		vq->wraps++;
> >  	}
> >  
> >  	/*
> > @@ -1744,6 +1762,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >  	vq->weak_barriers = weak_barriers;
> >  	vq->broken = false;
> >  	vq->last_used_idx = 0;
> > +	vq->wraps = 0;
> >  	vq->event_triggered = false;
> >  	vq->num_added = 0;
> >  	vq->packed_ring = true;
> > @@ -2092,13 +2111,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> >   */
> >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> >  {
> > +	unsigned int last_used_idx;
> >  	struct vring_virtqueue *vq = to_vvq(_vq);
> >  
> >  	if (vq->event_triggered)
> >  		vq->event_triggered = false;
> >  
> > -	return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > -				 virtqueue_enable_cb_prepare_split(_vq);
> > +	last_used_idx = vq->packed_ring ?
> > +			virtqueue_enable_cb_prepare_packed(_vq) :
> > +			virtqueue_enable_cb_prepare_split(_vq);
> > +
> > +	return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >  
> > @@ -2107,6 +2130,21 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >   * @_vq: the struct virtqueue we're talking about.
> >   * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
> >   *
> > + * The provided last_used_idx, as returned by virtqueue_enable_cb_prepare(),
> > + * is an opaque value representing the queue state and it is built as follows:
> > + *
> > + *	---------------------------------------------------------
> > + *	|	vq->wraps	|	vq->last_used_idx	|
> > + *	31------------------------------------------------------0
> > + *
> > + * The MSB 16bits embedding the wraps counter for the underlying virtqueue
> > + * is stripped out here before reaching into the lower layer helpers.
> > + *
> > + * This structure of the opaque value mitigates the scenario in which, when
> > + * exactly 2**16 messages are marked as used between two successive calls to
> > + * virtqueue_poll(), the caller is fooled into thinking nothing new has arrived
> > + * since the pure last_used_idx is exactly the same.
> > + *
> 
> Do you want to move this comment to where the macros implementing it
> are?
> 

Sure, I'll do.

Thanks,
Cristian



* Re: [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  2022-02-03 10:51     ` Cristian Marussi
@ 2022-02-03 11:32       ` Michael S. Tsirkin
  2022-02-07 18:52         ` Cristian Marussi
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2022-02-03 11:32 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin, virtualization

On Thu, Feb 03, 2022 at 10:51:19AM +0000, Cristian Marussi wrote:
> On Tue, Feb 01, 2022 at 01:27:38PM -0500, Michael S. Tsirkin wrote:
> > Looks correct, thanks. Some minor comments below:
> > 
> 
> Hi Michael,
> 
> thanks for the feedback.
> 
> > On Tue, Feb 01, 2022 at 05:15:55PM +0000, Cristian Marussi wrote:
> > > Exported API virtqueue_poll() can be used to support polling mode operation
> > > on top of virtio layer if needed; currently the parameter last_used_idx is
> > > the opaque value that needs to be passed to the virtqueue_poll() function
> > > to check if there are new pending used buffers in the queue: such opaque
> > > value would have been previously obtained by a call to the API function
> > > virtqueue_enable_cb_prepare().
> > > 
> > > Since such opaque value is indeed containing simply a snapshot in time of
> > > the internal
> > 
> > to add: 16 bit
> > 
> > > last_used_index (roughly), it is possible that,
> > 
> > to add here: 
> > 
> > if another thread calls virtqueue_add_*()
> > at the same time (which existing drivers don't do,
> > but does not seem to be documented as prohibited anywhere), and
> > 
> > > if exactly
> > > 2**16 buffers are marked as used between two successive calls to
> > > virtqueue_poll(), the caller is fooled into thinking that nothing is
> > > pending (ABA problem).
> > > Keep a full fledged internal wraps counter
> > 
> > s/full fledged/a 16 bit/
> > 
> > since I don't see why is a 16 bit counter full but not e.g. a 32 bit one
> > 
> .. :D I wanted to stress the fact that this 16-bit counter has a much
> longer rollover period than the 1-bit wrap_counter already in use... but
> indeed they are all just counters in the end, it's just the wraparound
> that changes...
> 
> I'll fix.
> 
> > > per virtqueue and embed it into
> > > the upper 16bits of the returned opaque value, so that the above scenario
> > > can be detected transparently by virtqueue_poll(): this way each single
> > > possible last_used_idx value is really belonging to a different wrap.
> > 
> > Just to add here: the ABA problem can in theory still happen but
> > now that's after 2^32 requests, which seems sufficient in practice.
> > 
> 
> Sure, I'll fix the commit message as advised above.
> 
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > > Cc: virtualization@lists.linux-foundation.org
> > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > > ---
> > > Still no perf data on this, I was wondering what exactly to measure in
> > > term of perf metrics to evaluate the impact of the rolling vq->wraps
> > > counter.
> > > ---
> > >  drivers/virtio/virtio_ring.c | 51 +++++++++++++++++++++++++++++++++---
> > >  1 file changed, 47 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 00f64f2f8b72..613ec0503509 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -12,6 +12,8 @@
> > >  #include <linux/hrtimer.h>
> > >  #include <linux/dma-mapping.h>
> > >  #include <linux/spinlock.h>
> > > +#include <linux/bits.h>
> > > +#include <linux/bitfield.h>
> > >  #include <xen/xen.h>
> > >  
> > >  static bool force_used_validation = false;
> > > @@ -69,6 +71,17 @@ module_param(force_used_validation, bool, 0444);
> > >  #define LAST_ADD_TIME_INVALID(vq)
> > >  #endif
> > >  
> > > +#define VRING_IDX_MASK					GENMASK(15, 0)
> > > +#define VRING_GET_IDX(opaque)				\
> > > +	((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > > +
> > > +#define VRING_WRAPS_MASK				GENMASK(31, 16)
> > > +#define VRING_GET_WRAPS(opaque)				\
> > > +	((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > > +
> > > +#define VRING_BUILD_OPAQUE(idx, wraps)			\
> > > +	(FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > > +
> > 
> > Maybe prefix with VRING_POLL_  since that is the only user.
> > 
> 
> I'll do.
> 
> > 
> > >  struct vring_desc_state_split {
> > >  	void *data;			/* Data for callback. */
> > >  	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
> > > @@ -117,6 +130,8 @@ struct vring_virtqueue {
> > >  	/* Last used index we've seen. */
> > >  	u16 last_used_idx;
> > >  
> > > +	u16 wraps;
> > > +
> > >  	/* Hint for event idx: already triggered no need to disable. */
> > >  	bool event_triggered;
> > >  
> > > @@ -806,6 +821,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > >  	ret = vq->split.desc_state[i].data;
> > >  	detach_buf_split(vq, i, ctx);
> > >  	vq->last_used_idx++;
> > > +	if (unlikely(!vq->last_used_idx))
> > > +		vq->wraps++;
> > >  	/* If we expect an interrupt for the next entry, tell host
> > >  	 * by writing event index and flush out the write before
> > >  	 * the read in the next get_buf call. */
> > 
> > So most drivers don't call virtqueue_poll.
> > Concerned about the overhead here: another option is
> > with a flag that will have to be set whenever a driver
> > wants to use virtqueue_poll.
> 
> Do you mean a compile-time flag/Kconfig to remove the possible overhead
> instructions altogether when not needed by the driver?
> 
> Or do you mean a runtime flag, since checking the flag every time should
> be less costly than checking the wraps each time AND counting when it
> happens?

The latter.

> > Could you pls do a quick perf test e.g. using tools/virtio/
> > to see what's faster?
> 
> Yes, I'll do that, thanks for the hint. I had some compilation issues in
> tools/virtio due to my additions (missing mirrored headers) or to some
> recently added stuff (missing drv_to_virtio & friends for the
> suppressed_used_validation thing)... anyway, I have fixed those now and
> I'll post the related tools/virtio patches with the next iteration.
> 
> Anyway, you mean perf data from vringh_test and virtio_test/vhost,
> right? (ringtest/ excluded because it does not use any API and is just
> prototyping.)

can be either or both, virtio_test/vhost is a bit easier to use.

> > 
> > 
> > 
> > > @@ -1508,6 +1525,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > >  	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> > >  		vq->last_used_idx -= vq->packed.vring.num;
> > >  		vq->packed.used_wrap_counter ^= 1;
> > > +		vq->wraps++;
> > >  	}
> > >  
> > >  	/*
> > > @@ -1744,6 +1762,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> > >  	vq->weak_barriers = weak_barriers;
> > >  	vq->broken = false;
> > >  	vq->last_used_idx = 0;
> > > +	vq->wraps = 0;
> > >  	vq->event_triggered = false;
> > >  	vq->num_added = 0;
> > >  	vq->packed_ring = true;
> > > @@ -2092,13 +2111,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> > >   */
> > >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> > >  {
> > > +	unsigned int last_used_idx;
> > >  	struct vring_virtqueue *vq = to_vvq(_vq);
> > >  
> > >  	if (vq->event_triggered)
> > >  		vq->event_triggered = false;
> > >  
> > > -	return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > > -				 virtqueue_enable_cb_prepare_split(_vq);
> > > +	last_used_idx = vq->packed_ring ?
> > > +			virtqueue_enable_cb_prepare_packed(_vq) :
> > > +			virtqueue_enable_cb_prepare_split(_vq);
> > > +
> > > +	return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> > >  }
> > >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> > >  
> > > @@ -2107,6 +2130,21 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> > >   * @_vq: the struct virtqueue we're talking about.
> > >   * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
> > >   *
> > > + * The provided last_used_idx, as returned by virtqueue_enable_cb_prepare(),
> > > + * is an opaque value representing the queue state and it is built as follows:
> > > + *
> > > + *	---------------------------------------------------------
> > > + *	|	vq->wraps	|	vq->last_used_idx	|
> > > + *	31------------------------------------------------------0
> > > + *
> > > + * The MSB 16bits embedding the wraps counter for the underlying virtqueue
> > > + * is stripped out here before reaching into the lower layer helpers.
> > > + *
> > > + * This structure of the opaque value mitigates the scenario in which, when
> > > + * exactly 2**16 messages are marked as used between two successive calls to
> > > + * virtqueue_poll(), the caller is fooled into thinking nothing new has arrived
> > > + * since the pure last_used_idx is exactly the same.
> > > + *
> > 
> > Do you want to move this comment to where the macros implementing it
> > are?
> > 
> 
> Sure, I'll do.
> 
> Thanks,
> Cristian



* Re: [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value
  2022-02-03 11:32       ` Michael S. Tsirkin
@ 2022-02-07 18:52         ` Cristian Marussi
  0 siblings, 0 replies; 16+ messages in thread
From: Cristian Marussi @ 2022-02-07 18:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, peter.hilber, igor.skalkin, virtualization

On Thu, Feb 03, 2022 at 06:32:29AM -0500, Michael S. Tsirkin wrote:
> On Thu, Feb 03, 2022 at 10:51:19AM +0000, Cristian Marussi wrote:
> > On Tue, Feb 01, 2022 at 01:27:38PM -0500, Michael S. Tsirkin wrote:
> > > Looks correct, thanks. Some minor comments below:
> > > 
> > 
> > Hi Michael,
> > 
> > thanks for the feedback.
> > 

Hi Michael,

 
[snip]

> > > > @@ -806,6 +821,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > >  	ret = vq->split.desc_state[i].data;
> > > >  	detach_buf_split(vq, i, ctx);
> > > >  	vq->last_used_idx++;
> > > > +	if (unlikely(!vq->last_used_idx))
> > > > +		vq->wraps++;
> > > >  	/* If we expect an interrupt for the next entry, tell host
> > > >  	 * by writing event index and flush out the write before
> > > >  	 * the read in the next get_buf call. */
> > > 
> > > So most drivers don't call virtqueue_poll.
> > > Concerned about the overhead here: another option is
> > > with a flag that will have to be set whenever a driver
> > > wants to use virtqueue_poll.
> > 
> > Do you mean a compile-time flag/Kconfig to remove the possible overhead
> > instructions altogether when not needed by the driver?
> > 
> > Or do you mean a runtime flag, since checking the flag every time should
> > be less costly than checking the wraps each time AND counting when it
> > happens?
> 
> The latter.
> 
> > > Could you pls do a quick perf test e.g. using tools/virtio/
> > > to see what's faster?
> > 
> > Yes, I'll do that, thanks for the hint. I had some compilation issues in
> > tools/virtio due to my additions (missing mirrored headers) or to some
> > recently added stuff (missing drv_to_virtio & friends for the
> > suppressed_used_validation thing)... anyway, I have fixed those now and
> > I'll post the related tools/virtio patches with the next iteration.
> > 
> > Anyway, you mean perf data from vringh_test and virtio_test/vhost,
> > right? (ringtest/ excluded because it does not use any API and is just
> > prototyping.)
> 
> can be either or both, virtio_test/vhost is a bit easier to use.
> 

After a number of test rounds with tools/virtio/virtio_test, below you can
find the most reliable results I obtained.

Using the flag as you suggested, as in:

if (unlikely(vq->use_wrap_counter))
	vq->wraps += !vq->last_used_idx;


seems definitely better, as per the results for virtio_test_flag_V1.

The last run, virtio_test_flag_V1 --wrap-counters, is the case in which
the test code requests that the use_wrap_counter flag be set to true at
start.

Since such a flag is not spec related, I added a new exported API

virtqueue_use_wrap_counter()

to allow a driver that wants to use polling to ask for wrap counting at
probe time.
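
To make the opt-in behaviour concrete, here is a minimal user-space sketch of
the flag-gated accounting. It is an illustration under assumptions, not the
posted code: the struct is a stripped-down stand-in for the real virtqueue,
model_use_wrap_counter() is a hypothetical counterpart of the proposed
virtqueue_use_wrap_counter() (whose actual signature is not shown in this
thread), and the kernel's unlikely() hint is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model; names only mirror those used in the thread. */
struct model_vq {
	uint16_t last_used_idx;
	uint16_t wraps;
	bool use_wrap_counter;	/* off by default: most drivers never poll */
};

/* Hypothetical stand-in for the opt-in call made at probe time. */
static void model_use_wrap_counter(struct model_vq *vq)
{
	vq->use_wrap_counter = true;
}

/* Per-buffer accounting: wraps are counted only for queues that opted in. */
static void model_account_used(struct model_vq *vq)
{
	vq->last_used_idx++;
	if (vq->use_wrap_counter)
		vq->wraps += !vq->last_used_idx;
}

/* Run both kinds of queue through one full 16-bit index cycle. */
static void model_selfcheck(void)
{
	struct model_vq plain = { 0, 0, false };
	struct model_vq polled = { 0, 0, false };
	uint32_t i;

	model_use_wrap_counter(&polled);
	for (i = 0; i < UINT32_C(65536); i++) {
		model_account_used(&plain);
		model_account_used(&polled);
	}
	assert(plain.wraps == 0);	/* never opted in, never counted */
	assert(polled.wraps == 1);	/* one full index rollover */
}
```

The model follows the split-ring case, where the index rolls over at 2^16; in
the packed-ring hunk of the patch the counter is bumped when the index passes
the ring size instead.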

Since all of this required a few build fixes in tools/virtio, both before
and after my additions, I'm going to post the proposed change in a new series
independent from this SCMI virtio series (and later add the call to
virtqueue_use_wrap_counter() to the SCMI virtio driver).

How does this sound?

Thanks,
Cristian

---
+ cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_unpatched
 Performance counter stats for 'nice -n -20 /root/virtio_test_unpatched' (25 runs):

           6100.81 msec task-clock                #    1.002 CPUs utilized            ( +-  0.16% )
                19      context-switches          #    3.126 /sec                     ( +-  2.08% )
                 0      cpu-migrations            #    0.000 /sec                   
               134      page-faults               #   22.049 /sec                     ( +-  0.03% )
       18249525657      cycles                    #    3.003 GHz                      ( +-  0.07% )
       45583397473      instructions              #    2.52  insn per cycle           ( +-  0.09% )
       14009712668      branches                  #    2.305 G/sec                    ( +-  0.09% )
          10075872      branch-misses             #    0.07% of all branches          ( +-  0.83% )

            6.0908 +- 0.0107 seconds time elapsed  ( +-  0.18% )


+ cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_wraps_noflag
 Performance counter stats for 'nice -n -20 /root/virtio_test_wraps_noflag' (25 runs):

           7982.99 msec task-clock                #    0.996 CPUs utilized            ( +-  0.14% )
                16      context-switches          #    1.999 /sec                     ( +-  2.56% )
                 0      cpu-migrations            #    0.000 /sec                   
               134      page-faults               #   16.744 /sec                     ( +-  0.03% )
       23691074946      cycles                    #    2.960 GHz                      ( +-  0.06% )
       68176350359      instructions              #    2.88  insn per cycle           ( +-  0.09% )
       21037768642      branches                  #    2.629 G/sec                    ( +-  0.09% )
           9083084      branch-misses             #    0.04% of all branches          ( +-  0.74% )

            8.0125 +- 0.0114 seconds time elapsed  ( +-  0.14% )


+ cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_flag_V1
 Performance counter stats for 'nice -n -20 /root/virtio_test_flag_V1' (25 runs):

           6182.21 msec task-clock                #    1.007 CPUs utilized            ( +-  0.25% )
                19      context-switches          #    3.104 /sec                     ( +-  1.68% )
                 0      cpu-migrations            #    0.000 /sec                   
               134      page-faults               #   21.889 /sec                     ( +-  0.03% )
       18142274957      cycles                    #    2.963 GHz                      ( +-  0.13% )
       48973010013      instructions              #    2.71  insn per cycle           ( +-  0.18% )
       15064825126      branches                  #    2.461 G/sec                    ( +-  0.18% )
           8697800      branch-misses             #    0.06% of all branches          ( +-  0.89% )

            6.1382 +- 0.0172 seconds time elapsed  ( +-  0.28% )

+ cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_flag_V1 --wrap-counters
 Performance counter stats for 'nice -n -20 /root/virtio_test_flag_V1 --wrap-counters' (25 runs):

           6051.58 msec task-clock                #    0.984 CPUs utilized            ( +-  0.22% )
                21      context-switches          #    3.424 /sec                     ( +-  1.25% )
                 0      cpu-migrations            #    0.000 /sec                   
               134      page-faults               #   21.846 /sec                     ( +-  0.03% )
       17928356478      cycles                    #    2.923 GHz                      ( +-  0.11% )
       48147192304      instructions              #    2.67  insn per cycle           ( +-  0.14% )
       14808798588      branches                  #    2.414 G/sec                    ( +-  0.15% )
           9108899      branch-misses             #    0.06% of all branches          ( +-  1.22% )

            6.1525 +- 0.0155 seconds time elapsed  ( +-  0.25% )
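
As a reading aid for the four runs above, the relative cost of each variant
against the unpatched baseline can be worked out with a few lines of C. The
figures are copied verbatim from the perf stat output; the helper and table
names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Relative change of `measured` against `baseline`, in percent. */
static double overhead_pct(double baseline, double measured)
{
	assert(baseline > 0.0);
	return (measured - baseline) / baseline * 100.0;
}

struct perf_run {
	const char *name;
	double elapsed_s;	/* "seconds time elapsed" */
	double instructions;	/* "instructions" counter */
};

/* Figures copied from the perf stat output above; runs[0] is the baseline. */
static const struct perf_run runs[] = {
	{ "unpatched",               6.0908, 45583397473.0 },
	{ "wraps_noflag",            8.0125, 68176350359.0 },
	{ "flag_V1",                 6.1382, 48973010013.0 },
	{ "flag_V1 --wrap-counters", 6.1525, 48147192304.0 },
};

/* Print elapsed-time and instruction-count deltas for each variant. */
static void print_overheads(void)
{
	size_t i;

	for (i = 1; i < sizeof(runs) / sizeof(runs[0]); i++)
		printf("%-24s elapsed %+6.2f%%  instructions %+6.2f%%\n",
		       runs[i].name,
		       overhead_pct(runs[0].elapsed_s, runs[i].elapsed_s),
		       overhead_pct(runs[0].instructions,
				    runs[i].instructions));
}
```

By this arithmetic the unconditional counter costs on the order of 30% in
elapsed time, while both flag-gated runs stay within roughly 1% of the
baseline.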




end of thread, other threads:[~2022-02-07 18:54 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2022-02-01 17:15 [PATCH 0/9] Add SCMI Virtio & Clock atomic support Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 1/9] firmware: arm_scmi: Add a virtio channel refcount Cristian Marussi
2022-02-01 22:05   ` Michael S. Tsirkin
2022-02-03 10:08     ` Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 2/9] firmware: arm_scmi: Review virtio free_list handling Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 3/9] [RFC] virtio_ring: Embed a wrap counter in opaque poll index value Cristian Marussi
2022-02-01 18:27   ` Michael S. Tsirkin
2022-02-03 10:51     ` Cristian Marussi
2022-02-03 11:32       ` Michael S. Tsirkin
2022-02-07 18:52         ` Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 4/9] firmware: arm_scmi: Add atomic mode support to virtio transport Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 5/9] dt-bindings: firmware: arm,scmi: Add atomic_threshold optional property Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 6/9] firmware: arm_scmi: Support optional system wide atomic_threshold Cristian Marussi
2022-02-01 17:15 ` [PATCH v2 7/9] firmware: arm_scmi: Add atomic support to clock protocol Cristian Marussi
2022-02-01 17:16 ` [PATCH v2 8/9] [RFC] firmware: arm_scmi: Add support for clock_enable_latency Cristian Marussi
2022-02-01 17:16 ` [PATCH v2 9/9] [RFC] clk: scmi: Support atomic clock enable/disable API Cristian Marussi
