linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
@ 2021-07-12 14:18 Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure Cristian Marussi
                   ` (18 more replies)
  0 siblings, 19 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Hi all,

While reworking this series starting from the work done up to V3 by
OpenSynergy, I am keeping the original autorship and list distribution
unchanged.

The main aim of this rework, as said, is to simplify where possible the
SCMI VirtIO support added in V3 by adding at first some new general
mechanisms in the SCMI Transport layer.

Indeed, after some initial small fixes, patches 05/06/07/08 add such new
additional mechanisms to the SCMI core to ease implementation of more
complex transports like virtio, while also addressing a few general issues
already potentially affecting existing transports.

In terms of rework I dropped original V3 patches 05/06/07/08/12 as no more
needed, and modified where needed the remaining original patches to take
advantage of the above mentioned new SCMI transport features.

DT bindings patch has been ported on top of freshly YAML converted arm,scmi
bindings.

Moreover, since V5 I dropped support for polling mode from the virtio-scmi
transport, since it is an optional general mechanism provided by the core
to allow transports lacking a completion IRQ to work and it seemed a
needless addition/complication in the context of virtio transport.

Additionally, in V5 I could also simplify a bit the virtio transport
probing sequence starting from the observation that, by the VirtIO spec,
in fact, only one single SCMI VirtIO device can possibly exist on a system.

The series has been tested using an emulated fake SCMI device and also a
proper SCP-fw stack running through QEMU vhost-users, with the SCMI stack
compiled, in both cases, as builtin and as a loadable module, running tests
against mocked SCMI Sensors using HWMON and IIO interfaces to check the
functionality of notifications and sync/async commands.

Virtio-scmi support has been exercised in the following testing scenario
on a JUNO board:

 - normal sync/async command transfers
 - notifications
 - concurrent delivery of correlated response and delayed responses
 - out-of-order delivery of delayed responses before related responses
 - unexpected delayed response delivery for sync commands
 - late delivery of timed-out responses and delayed responses

Some basic regression testing against mailbox transport has been performed
for commands and notifications too.

No sensible overhead in total handling time of commands and notifications
has been observed, even though this series do indeed add a considerable
amount of code to execute on TX path.
More test and measurements could be needed in these regards.

This series is based on top of v5.14-rc1.

Any feedback/testing is welcome :D

Thanks,
Cristian
---
V5 --> V6:
 - removed delegated xfers and its usage
 - add and use *priv optional parameter in scmi_rx_callback()
 - made .poll_done and .clear_channel ops optional

V4 --> V5:
 - removed msg raw_payload helpers
 - reworked msg helpers to use xfer->priv reference
 - simplified SCMI device probe sequence (one static device)
 - added new SCMI Kconfig layout
 - removed SCMI virtio polling support

V3 --> V4:
 - using new delegated xfers support and monotonically increasing tokens
   in virtio transport
 - ported SCMI virtio transport DT bindings to YAML format
 - added virtio-scmi polling support
 - added delegated xfers support

Cristian Marussi (11):
  firmware: arm_scmi: Avoid padding in sensor message structure
  firmware: arm_scmi: Fix max pending messages boundary check
  firmware: arm_scmi: Add support for type handling in common functions
  firmware: arm_scmi: Remove scmi_dump_header_dbg() helper
  firmware: arm_scmi: Add transport optional init/exit support
  firmware: arm_scmi: Introduce monotonically increasing tokens
  firmware: arm_scmi: Handle concurrent and out-of-order messages
  firmware: arm_scmi: Add priv parameter to scmi_rx_callback
  firmware: arm_scmi: Make .clear_channel optional
  firmware: arm_scmi: Make polling mode optional
  firmware: arm_scmi: Make SCMI transports configurable

Igor Skalkin (4):
  firmware: arm_scmi: Make shmem support optional for transports
  firmware: arm_scmi: Add method to override max message number
  dt-bindings: arm: Add virtio transport for SCMI
  firmware: arm_scmi: Add virtio transport

Peter Hilber (2):
  firmware: arm_scmi: Add message passing abstractions for transports
  firmware: arm_scmi: Add optional link_supplier() transport op

 .../bindings/firmware/arm,scmi.yaml           |   8 +-
 MAINTAINERS                                   |   1 +
 drivers/firmware/Kconfig                      |  34 +-
 drivers/firmware/arm_scmi/Kconfig             |  97 +++
 drivers/firmware/arm_scmi/Makefile            |   8 +-
 drivers/firmware/arm_scmi/common.h            |  94 ++-
 drivers/firmware/arm_scmi/driver.c            | 651 +++++++++++++++---
 drivers/firmware/arm_scmi/mailbox.c           |   2 +-
 drivers/firmware/arm_scmi/msg.c               | 113 +++
 drivers/firmware/arm_scmi/sensors.c           |   6 +-
 drivers/firmware/arm_scmi/smc.c               |   3 +-
 drivers/firmware/arm_scmi/virtio.c            | 491 +++++++++++++
 include/uapi/linux/virtio_ids.h               |   1 +
 include/uapi/linux/virtio_scmi.h              |  24 +
 14 files changed, 1389 insertions(+), 144 deletions(-)
 create mode 100644 drivers/firmware/arm_scmi/Kconfig
 create mode 100644 drivers/firmware/arm_scmi/msg.c
 create mode 100644 drivers/firmware/arm_scmi/virtio.c
 create mode 100644 include/uapi/linux/virtio_scmi.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check Cristian Marussi
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Structure scmi_resp_sensor_reading_complete is meant to represent an SCMI
asynchronous reading complete message: representing the readings field with
a 64bit type forces padding and breaks reads in scmi_sensor_reading_get.

Split it in two adjacent 32bit readings_low/high subfields to avoid padding
or the need to make it packed.

Fixes: e2083d3673916 ("firmware: arm_scmi: Add SCMI v3.0 sensors timestamped reads")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/sensors.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scmi/sensors.c b/drivers/firmware/arm_scmi/sensors.c
index 2c88aa221559..308471586381 100644
--- a/drivers/firmware/arm_scmi/sensors.c
+++ b/drivers/firmware/arm_scmi/sensors.c
@@ -166,7 +166,8 @@ struct scmi_msg_sensor_reading_get {
 
 struct scmi_resp_sensor_reading_complete {
 	__le32 id;
-	__le64 readings;
+	__le32 readings_low;
+	__le32 readings_high;
 };
 
 struct scmi_sensor_reading_resp {
@@ -717,7 +718,8 @@ static int scmi_sensor_reading_get(const struct scmi_protocol_handle *ph,
 
 			resp = t->rx.buf;
 			if (le32_to_cpu(resp->id) == sensor_id)
-				*value = get_unaligned_le64(&resp->readings);
+				*value =
+					get_unaligned_le64(&resp->readings_low);
 			else
 				ret = -EPROTO;
 		}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-14 16:46   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 03/17] firmware: arm_scmi: Add support for type handling in common functions Cristian Marussi
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

SCMI message headers carry a sequence number and such field is sized to
allow for MSG_TOKEN_MAX distinct numbers; moreover zero is not really an
acceptable maximum number of pending in-flight messages.

Fix accordignly the checks performed on the value exported by transports
in scmi_desc.max_msg.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/driver.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 66e5e694be7d..d2c98642cb14 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -1025,8 +1025,9 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 	const struct scmi_desc *desc = sinfo->desc;
 
 	/* Pre-allocated messages, no more than what hdr.seq can support */
-	if (WARN_ON(desc->max_msg >= MSG_TOKEN_MAX)) {
-		dev_err(dev, "Maximum message of %d exceeds supported %ld\n",
+	if (WARN_ON(!desc->max_msg || desc->max_msg > MSG_TOKEN_MAX)) {
+		dev_err(dev,
+			"Invalid max_msg %d. Maximum messages supported %lu.\n",
 			desc->max_msg, MSG_TOKEN_MAX);
 		return -EINVAL;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 03/17] firmware: arm_scmi: Add support for type handling in common functions
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 04/17] firmware: arm_scmi: Remove scmi_dump_header_dbg() helper Cristian Marussi
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Add SCMI type handling to pack/unpack_scmi_header common helper functions.
Initialize hdr.type properly when initializing a command xfer.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
Needed later in the series to support serialization
---
 drivers/firmware/arm_scmi/common.h | 6 +++++-
 drivers/firmware/arm_scmi/driver.c | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 8685619d38f9..7c2b9fd7e929 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -70,6 +70,7 @@ struct scmi_msg_resp_prot_version {
  *
  * @id: The identifier of the message being sent
  * @protocol_id: The identifier of the protocol used to send @id message
+ * @type: The SCMI type for this message
  * @seq: The token to identify the message. When a message returns, the
  *	platform returns the whole message header unmodified including the
  *	token
@@ -80,6 +81,7 @@ struct scmi_msg_resp_prot_version {
 struct scmi_msg_hdr {
 	u8 id;
 	u8 protocol_id;
+	u8 type;
 	u16 seq;
 	u32 status;
 	bool poll_completion;
@@ -89,13 +91,14 @@ struct scmi_msg_hdr {
  * pack_scmi_header() - packs and returns 32-bit header
  *
  * @hdr: pointer to header containing all the information on message id,
- *	protocol id and sequence id.
+ *	protocol id, sequence id and type.
  *
  * Return: 32-bit packed message header to be sent to the platform.
  */
 static inline u32 pack_scmi_header(struct scmi_msg_hdr *hdr)
 {
 	return FIELD_PREP(MSG_ID_MASK, hdr->id) |
+		FIELD_PREP(MSG_TYPE_MASK, hdr->type) |
 		FIELD_PREP(MSG_TOKEN_ID_MASK, hdr->seq) |
 		FIELD_PREP(MSG_PROTOCOL_ID_MASK, hdr->protocol_id);
 }
@@ -110,6 +113,7 @@ static inline void unpack_scmi_header(u32 msg_hdr, struct scmi_msg_hdr *hdr)
 {
 	hdr->id = MSG_XTRACT_ID(msg_hdr);
 	hdr->protocol_id = MSG_XTRACT_PROT_ID(msg_hdr);
+	hdr->type = MSG_XTRACT_TYPE(msg_hdr);
 }
 
 /**
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index d2c98642cb14..53c17fba0059 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -565,6 +565,7 @@ static int xfer_get_init(const struct scmi_protocol_handle *ph,
 
 	xfer->tx.len = tx_size;
 	xfer->rx.len = rx_size ? : info->desc->max_msg_size;
+	xfer->hdr.type = MSG_TYPE_COMMAND;
 	xfer->hdr.id = msg_id;
 	xfer->hdr.poll_completion = false;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 04/17] firmware: arm_scmi: Remove scmi_dump_header_dbg() helper
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (2 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 03/17] firmware: arm_scmi: Add support for type handling in common functions Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support Cristian Marussi
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Being a while that we have SCMI trace events in the SCMI stack, remove
this debug helper and its call sites.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/driver.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 53c17fba0059..d2c183fa7949 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -171,19 +171,6 @@ static inline int scmi_to_linux_errno(int errno)
 	return -EIO;
 }
 
-/**
- * scmi_dump_header_dbg() - Helper to dump a message header.
- *
- * @dev: Device pointer corresponding to the SCMI entity
- * @hdr: pointer to header.
- */
-static inline void scmi_dump_header_dbg(struct device *dev,
-					struct scmi_msg_hdr *hdr)
-{
-	dev_dbg(dev, "Message ID: %x Sequence ID: %x Protocol: %x\n",
-		hdr->id, hdr->seq, hdr->protocol_id);
-}
-
 void scmi_notification_instance_data_set(const struct scmi_handle *handle,
 					 void *priv)
 {
@@ -287,7 +274,6 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	}
 
 	unpack_scmi_header(msg_hdr, &xfer->hdr);
-	scmi_dump_header_dbg(dev, &xfer->hdr);
 	info->desc->ops->fetch_notification(cinfo, info->desc->max_msg_size,
 					    xfer);
 	scmi_notify(cinfo->handle, xfer->hdr.protocol_id,
@@ -338,8 +324,6 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
 	if (msg_type == MSG_TYPE_DELAYED_RESP)
 		xfer->rx.len = info->desc->max_msg_size;
 
-	scmi_dump_header_dbg(dev, &xfer->hdr);
-
 	info->desc->ops->fetch_response(cinfo, xfer);
 
 	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (3 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 04/17] firmware: arm_scmi: Remove scmi_dump_header_dbg() helper Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-28 11:40   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens Cristian Marussi
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Some SCMI transport could need to perform some transport specific setup
before they can be used by the SCMI core transport layer: typically this
early setup consists in registering with some other kernel subsystem.

Add the optional capability for a transport to provide a couple of .init
and .exit functions that are assured to be called early during the SCMI
core initialization phase, well before the SCMI core probing step.

[ Peter: Adapted RFC patch by Cristian for submission to upstream. ]
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: Fixed scmi_transports_exit point of invocation ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v4 --> V5
- removed useless pr_debug
- moved scmi_transport_exit() invocation
---
 drivers/firmware/arm_scmi/common.h |  8 +++++
 drivers/firmware/arm_scmi/driver.c | 56 ++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 7c2b9fd7e929..6bb734e0e3ac 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -321,6 +321,12 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
 /**
  * struct scmi_desc - Description of SoC integration
  *
+ * @init: An optional function that a transport can provide to initialize some
+ *	  transport-specific setup during SCMI core initialization, so ahead of
+ *	  SCMI core probing.
+ * @exit: An optional function that a transport can provide to de-initialize
+ *	  some transport-specific setup during SCMI core de-initialization, so
+ *	  after SCMI core removal.
  * @ops: Pointer to the transport specific ops structure
  * @max_rx_timeout_ms: Timeout for communication with SoC (in Milliseconds)
  * @max_msg: Maximum number of messages that can be pending
@@ -328,6 +334,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  * @max_msg_size: Maximum size of data per message that can be handled.
  */
 struct scmi_desc {
+	int (*init)(void);
+	void (*exit)(void);
 	const struct scmi_transport_ops *ops;
 	int max_rx_timeout_ms;
 	int max_msg;
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index d2c183fa7949..4c77ee13b1ad 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -1578,10 +1578,64 @@ static struct platform_driver scmi_driver = {
 	.remove = scmi_remove,
 };
 
+/**
+ * __scmi_transports_setup  - Common helper to call transport-specific
+ * .init/.exit code if provided.
+ *
+ * @init: A flag to distinguish between init and exit.
+ *
+ * Note that, if provided, we invoke .init/.exit functions for all the
+ * transports currently compiled in.
+ *
+ * Return: 0 on Success.
+ */
+static inline int __scmi_transports_setup(bool init)
+{
+	int ret = 0;
+	const struct of_device_id *trans;
+
+	for (trans = scmi_of_match; trans->data; trans++) {
+		const struct scmi_desc *tdesc = trans->data;
+
+		if ((init && !tdesc->init) || (!init && !tdesc->exit))
+			continue;
+
+		if (init)
+			ret = tdesc->init();
+		else
+			tdesc->exit();
+
+		if (ret) {
+			pr_err("SCMI transport %s FAILED initialization!\n",
+			       trans->compatible);
+			break;
+		}
+	}
+
+	return ret;
+}
+
+static int __init scmi_transports_init(void)
+{
+	return __scmi_transports_setup(true);
+}
+
+static void __exit scmi_transports_exit(void)
+{
+	__scmi_transports_setup(false);
+}
+
 static int __init scmi_driver_init(void)
 {
+	int ret;
+
 	scmi_bus_init();
 
+	/* Initialize any compiled-in transport which provided an init/exit */
+	ret = scmi_transports_init();
+	if (ret)
+		return ret;
+
 	scmi_base_register();
 
 	scmi_clock_register();
@@ -1610,6 +1664,8 @@ static void __exit scmi_driver_exit(void)
 
 	scmi_bus_exit();
 
+	scmi_transports_exit();
+
 	platform_driver_unregister(&scmi_driver);
 }
 module_exit(scmi_driver_exit);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (4 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-28 14:17   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Tokens are sequence numbers embedded in the each SCMI message header: they
are used to correlate commands with responses (and delayed responses), but
their usage and policy of selection is entirely up to the caller (usually
the OSPM agent), while they are completely opaque to the callee (SCMI
server platform) which merely copies them back from the command into the
response message header.
This also means that the platform does not, can not and should not enforce
any kind of policy on received messages depending on the contained sequence
number: platform can perfectly handle concurrent requests carrying the same
identifiying token if that should happen.

Moreover the platform is not required to produce in-order responses to
agent requests, the only constraint in these regards is that in case of
an asynchronous message the delayed response must be sent after the
immediate response for the synchronous part of the command transaction.

Currenly the SCMI stack of the OSPM agent selects a token for the egressing
commands picking the lowest possible number which is not already in use by
an existing in-flight transaction, which means, in other words, that we
immediately reuse any token after its transaction has completed or it has
timed out: this policy indeed does simplify management and lookup of tokens
and associated xfers.

Under the above assumptions and constraints, since there is really no state
shared between the agent and the platform to let the platform know when a
token and its associated message has timed out, the current policy of early
reuse of tokens can easily lead to the situation in which a spurious or
late received response (or delayed_response), related to an old stale and
timed out transaction, can be wrongly associated to a newer valid in-flight
xfer that just happens to have reused the same token.

This misbehaviour on such ghost responses is more easily exposed on those
transports that naturally have an higher level of parallelism in processing
multiple concurrent in-flight messages.

This commit introduces a new policy of selection of tokens for the OSPM
agent: each new command transfer now gets the next available, monotonically
increasing token, until tokens are exhausted and the counter rolls over.

Such new policy mitigates the above issues with ghost responses since the
tokens are now reused as late as possible (when they roll back ideally)
and so it is much easier to identify such ghost responses to stale timed
out transactions: this also helps in simplifying the specific transports
implementation since stale transport messages can be easily identified
and discarded early on in the rx path without the need to cross check
their actual state with the core transport layer.
This mitigation is even more effective when, as is usually the case, the
maximum number of pending messages is capped by the platform to a much
lower number than the whole possible range of tokens values (2^10).

This internal policy change in the core SCMI transport layer is fully
transparent to the specific transports so it has not and should not have
any impact on the transports implementation.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v4 --> V5
- removed empirical profiling info from commit msg
- do NOT use monotonic tokens and pending HT for notifications (not needed)
- release xfer_lock later in scmi_xfer_get
---
 drivers/firmware/arm_scmi/common.h |  25 +++
 drivers/firmware/arm_scmi/driver.c | 260 +++++++++++++++++++++++++----
 2 files changed, 249 insertions(+), 36 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 6bb734e0e3ac..2233d0a188fc 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -14,7 +14,10 @@
 #include <linux/device.h>
 #include <linux/errno.h>
 #include <linux/kernel.h>
+#include <linux/hashtable.h>
+#include <linux/list.h>
 #include <linux/module.h>
+#include <linux/refcount.h>
 #include <linux/scmi_protocol.h>
 #include <linux/types.h>
 
@@ -138,6 +141,10 @@ struct scmi_msg {
  *	buffer for the rx path as we use for the tx path.
  * @done: command message transmit completion event
  * @async_done: pointer to delayed response message received event completion
+ * @users: A refcount to track the active users for this xfer
+ * @pending: True for xfers added to @pending_xfers hashtable
+ * @node: An hlist_node reference used to store this xfer, alternatively, on
+ *	  the free list @free_xfers or in the @pending_xfers hashtable
  */
 struct scmi_xfer {
 	int transfer_id;
@@ -146,8 +153,26 @@ struct scmi_xfer {
 	struct scmi_msg rx;
 	struct completion done;
 	struct completion *async_done;
+	refcount_t users;
+	bool pending;
+	struct hlist_node node;
 };
 
+/*
+ * An helper macro to lookup an xfer from the @pending_xfers hashtable
+ * using the message sequence number token as a key.
+ */
+#define XFER_FIND(__ht, __k)					\
+({								\
+	typeof(__k) k_ = __k;					\
+	struct scmi_xfer *xfer_ = NULL;				\
+								\
+	hash_for_each_possible((__ht), xfer_, node, k_)		\
+		if (xfer_->hdr.seq == k_)			\
+			break;					\
+	xfer_;							\
+})
+
 struct scmi_xfer_ops;
 
 /**
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 4c77ee13b1ad..245ede223302 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -21,6 +21,7 @@
 #include <linux/io.h>
 #include <linux/kernel.h>
 #include <linux/ktime.h>
+#include <linux/hashtable.h>
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
@@ -65,19 +66,29 @@ struct scmi_requested_dev {
 	struct list_head node;
 };
 
+#define SCMI_PENDING_XFERS_HT_ORDER_SZ	9
+
 /**
  * struct scmi_xfers_info - Structure to manage transfer information
  *
- * @xfer_block: Preallocated Message array
  * @xfer_alloc_table: Bitmap table for allocated messages.
  *	Index of this bitmap table is also used for message
  *	sequence identifier.
  * @xfer_lock: Protection for message allocation
+ * @last_token: A counter to use as base to generate for monotonically
+ *		increasing tokens.
+ * @free_xfers: A free list for available to use xfers. It is initialized with
+ *		a number of xfers equal to the maximum allowed in-flight
+ *		messages.
+ * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
+ *		   currently in-flight messages.
  */
 struct scmi_xfers_info {
-	struct scmi_xfer *xfer_block;
 	unsigned long *xfer_alloc_table;
 	spinlock_t xfer_lock;
+	atomic_t last_token;
+	struct hlist_head free_xfers;
+	DECLARE_HASHTABLE(pending_xfers, SCMI_PENDING_XFERS_HT_ORDER_SZ);
 };
 
 /**
@@ -190,45 +201,177 @@ void *scmi_notification_instance_data_get(const struct scmi_handle *handle)
 	return info->notify_priv;
 }
 
+/**
+ * scmi_xfer_token_set  - Reserve and set new token for the xfer at hand
+ *
+ * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ * @xfer: The xfer to act upon
+ *
+ * Pick the next unused monotonically increasing token and set it into
+ * xfer->hdr.seq: picking a monotonically increasing value avoids immediate
+ * reuse of freshly completed or timed-out xfers, thus mitigating the risk
+ * of incorrect association of a late and expired xfer with a live in-flight
+ * transaction, both happening to re-use the same token identifier.
+ *
+ * Since platform is NOT required to answer our request in-order we should
+ * account for a few rare but possible scenarios:
+ *
+ *  - exactly 'next_token' may be NOT available so pick xfer_id >= next_token
+ *    using find_next_zero_bit() starting from candidate next_token bit
+ *
+ *  - all tokens ahead upto (MSG_TOKEN_ID_MASK - 1) are used in-flight but we
+ *    are plenty of free tokens at start, so try a second pass using
+ *    find_next_zero_bit() and starting from 0.
+ *
+ *  X = used in-flight
+ *
+ * Normal
+ * ------
+ *
+ *		|- xfer_id picked
+ *   -----------+----------------------------------------------------------
+ *   | | |X|X|X| | | | | | ... ... ... ... ... ... ... ... ... ... ...|X|X|
+ *   ----------------------------------------------------------------------
+ *		^
+ *		|- next_token
+ *
+ * Out-of-order pending at start
+ * -----------------------------
+ *
+ *	  |- xfer_id picked, last_token fixed
+ *   -----+----------------------------------------------------------------
+ *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... ... ...|X| |
+ *   ----------------------------------------------------------------------
+ *    ^
+ *    |- next_token
+ *
+ *
+ * Out-of-order pending at end
+ * ---------------------------
+ *
+ *	  |- xfer_id picked, last_token fixed
+ *   -----+----------------------------------------------------------------
+ *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... |X|X|X||X|X|
+ *   ----------------------------------------------------------------------
+ *								^
+ *								|- next_token
+ *
+ * Context: Assumes to be called with @xfer_lock already acquired.
+ *
+ * Return: 0 on Success or error
+ */
+static int scmi_xfer_token_set(struct scmi_xfers_info *minfo,
+			       struct scmi_xfer *xfer)
+{
+	unsigned long xfer_id, next_token;
+
+	/* Pick a candidate monotonic token in range [0, MSG_TOKEN_MAX - 1] */
+	next_token = (atomic_inc_return(&minfo->last_token) &
+		      (MSG_TOKEN_MAX - 1));
+
+	/* Pick the next available xfer_id >= next_token */
+	xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
+				     MSG_TOKEN_MAX, next_token);
+	if (xfer_id == MSG_TOKEN_MAX) {
+		/*
+		 * After heavily out-of-order responses, there are no free
+		 * tokens ahead, but only at start of xfer_alloc_table so
+		 * try again from the beginning.
+		 */
+		xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
+					     MSG_TOKEN_MAX, 0);
+		/*
+		 * Something is wrong if we got here since there can be a
+		 * maximum number of (MSG_TOKEN_MAX - 1) in-flight messages
+		 * but we have not found any free token [0, MSG_TOKEN_MAX - 1].
+		 */
+		if (WARN_ON_ONCE(xfer_id == MSG_TOKEN_MAX))
+			return -ENOMEM;
+	}
+
+	/* Update +/- last_token accordingly if we skipped some hole */
+	if (xfer_id != next_token)
+		atomic_add((int)(xfer_id - next_token), &minfo->last_token);
+
+	/* Set in-flight */
+	set_bit(xfer_id, minfo->xfer_alloc_table);
+	xfer->hdr.seq = (u16)xfer_id;
+
+	return 0;
+}
+
+/**
+ * scmi_xfer_token_clear  - Release the token
+ *
+ * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ * @xfer: The xfer to act upon
+ */
+static inline void scmi_xfer_token_clear(struct scmi_xfers_info *minfo,
+					 struct scmi_xfer *xfer)
+{
+	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
+}
+
 /**
  * scmi_xfer_get() - Allocate one message
  *
  * @handle: Pointer to SCMI entity handle
  * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ * @set_pending: If true a monotonic token is picked and the xfer is added to
+ *		 the pending hash table.
  *
  * Helper function which is used by various message functions that are
  * exposed to clients of this driver for allocating a message traffic event.
  *
- * This function can sleep depending on pending requests already in the system
- * for the SCMI entity. Further, this also holds a spinlock to maintain
- * integrity of internal data structures.
+ * Picks an xfer from the free list @free_xfers (if any available) and, if
+ * required, sets a monotonically increasing token and stores the inflight xfer
+ * into the @pending_xfers hashtable for later retrieval.
+ *
+ * The successfully initialized xfer is refcounted.
+ *
+ * Context: Holds @xfer_lock while manipulating @xfer_alloc_table and
+ *	    @free_xfers.
  *
  * Return: 0 if all went fine, else corresponding error.
  */
 static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
-				       struct scmi_xfers_info *minfo)
+				       struct scmi_xfers_info *minfo,
+				       bool set_pending)
 {
-	u16 xfer_id;
+	int ret;
+	unsigned long flags;
 	struct scmi_xfer *xfer;
-	unsigned long flags, bit_pos;
-	struct scmi_info *info = handle_to_scmi_info(handle);
 
-	/* Keep the locked section as small as possible */
 	spin_lock_irqsave(&minfo->xfer_lock, flags);
-	bit_pos = find_first_zero_bit(minfo->xfer_alloc_table,
-				      info->desc->max_msg);
-	if (bit_pos == info->desc->max_msg) {
+	if (hlist_empty(&minfo->free_xfers)) {
 		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
 		return ERR_PTR(-ENOMEM);
 	}
-	set_bit(bit_pos, minfo->xfer_alloc_table);
-	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
 
-	xfer_id = bit_pos;
+	/* grab an xfer from the free_list */
+	xfer = hlist_entry(minfo->free_xfers.first, struct scmi_xfer, node);
+	hlist_del_init(&xfer->node);
+
+	if (set_pending) {
+		/* Pick and set monotonic token */
+		ret = scmi_xfer_token_set(minfo, xfer);
+		if (!ret) {
+			hash_add(minfo->pending_xfers, &xfer->node,
+				 xfer->hdr.seq);
+			xfer->pending = true;
+		} else {
+			dev_err(handle->dev,
+				"Failed to get monotonic token %d\n", ret);
+			hlist_add_head(&xfer->node, &minfo->free_xfers);
+			xfer = ERR_PTR(ret);
+		}
+	}
 
-	xfer = &minfo->xfer_block[xfer_id];
-	xfer->hdr.seq = xfer_id;
-	xfer->transfer_id = atomic_inc_return(&transfer_last_id);
+	if (!IS_ERR(xfer)) {
+		refcount_set(&xfer->users, 1);
+		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
+	}
+	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
 
 	return xfer;
 }
@@ -239,6 +382,9 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
  * @minfo: Pointer to Tx/Rx Message management info based on channel type
  * @xfer: message that was reserved by scmi_xfer_get
  *
+ * After refcount check, possibly release an xfer, clearing the token slot,
+ * removing xfer from @pending_xfers and putting it back into free_xfers.
+ *
  * This holds a spinlock to maintain integrity of internal data structures.
  */
 static void
@@ -246,16 +392,44 @@ __scmi_xfer_put(struct scmi_xfers_info *minfo, struct scmi_xfer *xfer)
 {
 	unsigned long flags;
 
-	/*
-	 * Keep the locked section as small as possible
-	 * NOTE: we might escape with smp_mb and no lock here..
-	 * but just be conservative and symmetric.
-	 */
 	spin_lock_irqsave(&minfo->xfer_lock, flags);
-	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
+	if (refcount_dec_and_test(&xfer->users)) {
+		if (xfer->pending) {
+			scmi_xfer_token_clear(minfo, xfer);
+			hash_del(&xfer->node);
+			xfer->pending = false;
+		}
+		hlist_add_head(&xfer->node, &minfo->free_xfers);
+	}
 	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
 }
 
+/**
+ * scmi_xfer_lookup_unlocked  -  Helper to lookup an xfer_id
+ *
+ * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ * @xfer_id: Token ID to lookup in @pending_xfers
+ *
+ * Refcounting is untouched.
+ *
+ * Context: Assumes to be called with @xfer_lock already acquired.
+ *
+ * Return: A valid xfer on Success or error otherwise
+ */
+static struct scmi_xfer *
+scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
+{
+	struct scmi_xfer *xfer = NULL;
+
+	if (xfer_id >= MSG_TOKEN_MAX)
+		return ERR_PTR(-EINVAL);
+
+	if (test_bit(xfer_id, minfo->xfer_alloc_table))
+		xfer = XFER_FIND(minfo->pending_xfers, xfer_id);
+
+	return xfer ?: ERR_PTR(-EINVAL);
+}
+
 static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 {
 	struct scmi_xfer *xfer;
@@ -265,7 +439,7 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	ktime_t ts;
 
 	ts = ktime_get_boottime();
-	xfer = scmi_xfer_get(cinfo->handle, minfo);
+	xfer = scmi_xfer_get(cinfo->handle, minfo, false);
 	if (IS_ERR(xfer)) {
 		dev_err(dev, "failed to get free message slot (%ld)\n",
 			PTR_ERR(xfer));
@@ -291,19 +465,22 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 static void scmi_handle_response(struct scmi_chan_info *cinfo,
 				 u16 xfer_id, u8 msg_type)
 {
+	unsigned long flags;
 	struct scmi_xfer *xfer;
 	struct device *dev = cinfo->dev;
 	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
 	struct scmi_xfers_info *minfo = &info->tx_minfo;
 
 	/* Are we even expecting this? */
-	if (!test_bit(xfer_id, minfo->xfer_alloc_table)) {
+	spin_lock_irqsave(&minfo->xfer_lock, flags);
+	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
+	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
+	if (IS_ERR(xfer)) {
 		dev_err(dev, "message for %d is not expected!\n", xfer_id);
 		info->desc->ops->clear_channel(cinfo);
 		return;
 	}
 
-	xfer = &minfo->xfer_block[xfer_id];
 	/*
 	 * Even if a response was indeed expected on this slot at this point,
 	 * a buggy platform could wrongly reply feeding us an unexpected
@@ -540,7 +717,7 @@ static int xfer_get_init(const struct scmi_protocol_handle *ph,
 	    tx_size > info->desc->max_msg_size)
 		return -ERANGE;
 
-	xfer = scmi_xfer_get(pi->handle, minfo);
+	xfer = scmi_xfer_get(pi->handle, minfo, true);
 	if (IS_ERR(xfer)) {
 		ret = PTR_ERR(xfer);
 		dev_err(dev, "failed to get free message slot(%d)\n", ret);
@@ -1017,18 +1194,25 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 		return -EINVAL;
 	}
 
-	info->xfer_block = devm_kcalloc(dev, desc->max_msg,
-					sizeof(*info->xfer_block), GFP_KERNEL);
-	if (!info->xfer_block)
-		return -ENOMEM;
+	hash_init(info->pending_xfers);
 
-	info->xfer_alloc_table = devm_kcalloc(dev, BITS_TO_LONGS(desc->max_msg),
+	/* Allocate a bitmask sized to hold MSG_TOKEN_MAX tokens */
+	info->xfer_alloc_table = devm_kcalloc(dev, BITS_TO_LONGS(MSG_TOKEN_MAX),
 					      sizeof(long), GFP_KERNEL);
 	if (!info->xfer_alloc_table)
 		return -ENOMEM;
 
-	/* Pre-initialize the buffer pointer to pre-allocated buffers */
-	for (i = 0, xfer = info->xfer_block; i < desc->max_msg; i++, xfer++) {
+	/*
+	 * Preallocate a number of xfers equal to max inflight messages,
+	 * pre-initialize the buffer pointer to pre-allocated buffers and
+	 * attach all of them to the free list
+	 */
+	INIT_HLIST_HEAD(&info->free_xfers);
+	for (i = 0; i < desc->max_msg; i++) {
+		xfer = devm_kzalloc(dev, sizeof(*xfer), GFP_KERNEL);
+		if (!xfer)
+			return -ENOMEM;
+
 		xfer->rx.buf = devm_kcalloc(dev, sizeof(u8), desc->max_msg_size,
 					    GFP_KERNEL);
 		if (!xfer->rx.buf)
@@ -1036,8 +1220,12 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 
 		xfer->tx.buf = xfer->rx.buf;
 		init_completion(&xfer->done);
+
+		/* Add initialized xfer to the free list */
+		hlist_add_head(&xfer->node, &info->free_xfers);
 	}
 
+	atomic_set(&info->last_token, -1);
 	spin_lock_init(&info->xfer_lock);
 
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (5 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-15 16:36   ` Peter Hilber
  2021-08-02 10:10   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback Cristian Marussi
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Even though in case of asynchronous commands an SCMI platform server is
constrained to emit the delayed response message only after the related
message response has been sent, the configured underlying transport could
still deliver such messages together or in inverted order, causing races
due to the concurrent or out-of-order access to the underlying xfer.

Introduce a mechanism to grant exclusive access to an xfer in order to
properly serialize concurrent accesses to the same xfer originating from
multiple correlated messages.

Add additional state information to xfer descriptors so as to be able to
identify out-of-order message deliveries and act accordingly:

 - when a delayed response is expected but delivered before the related
   response, the synchronous response is considered as successfully
   received and the delayed response processing is carried on as usual.

 - when/if the missing synchronous response is subsequently received, it
   is discarded as not congruent with the current state of the xfer, or
   simply, because the xfer has been already released and so, now, the
   monotonically increasing sequence number carried by the late response
   is stale.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v5 --> v6
- added spinlock comment
---
 drivers/firmware/arm_scmi/common.h |  18 ++-
 drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++-----
 2 files changed, 212 insertions(+), 35 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 2233d0a188fc..9efebe1406d2 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -19,6 +19,7 @@
 #include <linux/module.h>
 #include <linux/refcount.h>
 #include <linux/scmi_protocol.h>
+#include <linux/spinlock.h>
 #include <linux/types.h>
 
 #include <asm/unaligned.h>
@@ -145,6 +146,13 @@ struct scmi_msg {
  * @pending: True for xfers added to @pending_xfers hashtable
  * @node: An hlist_node reference used to store this xfer, alternatively, on
  *	  the free list @free_xfers or in the @pending_xfers hashtable
+ * @busy: An atomic flag to ensure exclusive write access to this xfer
+ * @state: The current state of this transfer, with states transitions deemed
+ *	   valid being:
+ *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ]
+ *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
+ *	      (Missing synchronous response is assumed OK and ignored)
+ * @lock: A spinlock to protect state and busy fields.
  */
 struct scmi_xfer {
 	int transfer_id;
@@ -156,6 +164,15 @@ struct scmi_xfer {
 	refcount_t users;
 	bool pending;
 	struct hlist_node node;
+#define SCMI_XFER_FREE		0
+#define SCMI_XFER_BUSY		1
+	atomic_t busy;
+#define SCMI_XFER_SENT_OK	0
+#define SCMI_XFER_RESP_OK	1
+#define SCMI_XFER_DRESP_OK	2
+	int state;
+	/* A lock to protect state and busy fields */
+	spinlock_t lock;
 };
 
 /*
@@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 void scmi_notification_instance_data_set(const struct scmi_handle *handle,
 					 void *priv);
 void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
-
 #endif /* _SCMI_COMMON_H */
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 245ede223302..5ef33d692670 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
 
 	if (!IS_ERR(xfer)) {
 		refcount_set(&xfer->users, 1);
+		atomic_set(&xfer->busy, SCMI_XFER_FREE);
 		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
 	}
 	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
@@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
 	return xfer ?: ERR_PTR(-EINVAL);
 }
 
+/**
+ * scmi_msg_response_validate  - Validate message type against state of related
+ * xfer
+ *
+ * @cinfo: A reference to the channel descriptor.
+ * @msg_type: Message type to check
+ * @xfer: A reference to the xfer to validate against @msg_type
+ *
+ * This function checks if @msg_type is congruent with the current state of
+ * a pending @xfer; if an asynchronous delayed response is received before the
+ * related synchronous response (Out-of-Order Delayed Response) the missing
+ * synchronous response is assumed to be OK and completed, carrying on with the
+ * Delayed Response: this is done to address the case in which the underlying
+ * SCMI transport can deliver such out-of-order responses.
+ *
+ * Context: Assumes to be called with xfer->lock already acquired.
+ *
+ * Return: 0 on Success, error otherwise
+ */
+static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo,
+					     u8 msg_type,
+					     struct scmi_xfer *xfer)
+{
+	/*
+	 * Even if a response was indeed expected on this slot at this point,
+	 * a buggy platform could wrongly reply feeding us an unexpected
+	 * delayed response we're not prepared to handle: bail-out safely
+	 * blaming firmware.
+	 */
+	if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) {
+		dev_err(cinfo->dev,
+			"Delayed Response for %d not expected! Buggy F/W ?\n",
+			xfer->hdr.seq);
+		return -EINVAL;
+	}
+
+	switch (xfer->state) {
+	case SCMI_XFER_SENT_OK:
+		if (msg_type == MSG_TYPE_DELAYED_RESP) {
+			/*
+			 * Delayed Response expected but delivered earlier.
+			 * Assume message RESPONSE was OK and skip state.
+			 */
+			xfer->hdr.status = SCMI_SUCCESS;
+			xfer->state = SCMI_XFER_RESP_OK;
+			complete(&xfer->done);
+			dev_warn(cinfo->dev,
+				 "Received valid OoO Delayed Response for %d\n",
+				 xfer->hdr.seq);
+		}
+		break;
+	case SCMI_XFER_RESP_OK:
+		if (msg_type != MSG_TYPE_DELAYED_RESP)
+			return -EINVAL;
+		break;
+	case SCMI_XFER_DRESP_OK:
+		/* No further message expected once in SCMI_XFER_DRESP_OK */
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static bool scmi_xfer_is_free(struct scmi_xfer *xfer)
+{
+	int ret;
+
+	ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY);
+
+	return ret == SCMI_XFER_FREE;
+}
+
+/**
+ * scmi_xfer_command_acquire  -  Helper to lookup and acquire a command xfer
+ *
+ * @cinfo: A reference to the channel descriptor.
+ * @msg_hdr: A message header to use as lookup key
+ *
+ * When a valid xfer is found for the sequence number embedded in the provided
+ * msg_hdr, reference counting is properly updated and exclusive access to this
+ * xfer is granted till released with @scmi_xfer_command_release.
+ *
+ * Return: A valid @xfer on Success or error otherwise.
+ */
+static inline struct scmi_xfer *
+scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr)
+{
+	int ret;
+	unsigned long flags;
+	struct scmi_xfer *xfer;
+	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
+	struct scmi_xfers_info *minfo = &info->tx_minfo;
+	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
+	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
+
+	/* Are we even expecting this? */
+	spin_lock_irqsave(&minfo->xfer_lock, flags);
+	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
+	if (IS_ERR(xfer)) {
+		dev_err(cinfo->dev,
+			"Message for %d type %d is not expected!\n",
+			xfer_id, msg_type);
+		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
+		return xfer;
+	}
+	refcount_inc(&xfer->users);
+	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
+
+	spin_lock_irqsave(&xfer->lock, flags);
+	ret = scmi_msg_response_validate(cinfo, msg_type, xfer);
+	/*
+	 * If a pending xfer was found which was also in a congruent state with
+	 * the received message, acquire exclusive access to it setting the busy
+	 * flag.
+	 * Spins only on the rare limit condition of concurrent reception of
+	 * RESP and DRESP for the same xfer.
+	 */
+	if (!ret) {
+		spin_until_cond(scmi_xfer_is_free(xfer));
+		xfer->hdr.type = msg_type;
+	}
+	spin_unlock_irqrestore(&xfer->lock, flags);
+
+	if (ret) {
+		dev_err(cinfo->dev,
+			"Invalid message type:%d for %d - HDR:0x%X  state:%d\n",
+			msg_type, xfer_id, msg_hdr, xfer->state);
+		/* On error the refcount incremented above has to be dropped */
+		__scmi_xfer_put(minfo, xfer);
+		xfer = ERR_PTR(-EINVAL);
+	}
+
+	return xfer;
+}
+
+static inline void scmi_xfer_command_release(struct scmi_info *info,
+					     struct scmi_xfer *xfer)
+{
+	atomic_set(&xfer->busy, SCMI_XFER_FREE);
+	__scmi_xfer_put(&info->tx_minfo, xfer);
+}
+
+/**
+ * scmi_xfer_state_update  - Update xfer state
+ *
+ * @xfer: A reference to the xfer to update
+ *
+ * Context: Assumes to be called on an xfer exclusively acquired using the
+ *	    busy flag.
+ */
+static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
+{
+	switch (xfer->hdr.type) {
+	case MSG_TYPE_COMMAND:
+		xfer->state = SCMI_XFER_RESP_OK;
+		break;
+	case MSG_TYPE_DELAYED_RESP:
+		xfer->state = SCMI_XFER_DRESP_OK;
+		break;
+	}
+}
+
 static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 {
 	struct scmi_xfer *xfer;
@@ -462,57 +625,37 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	info->desc->ops->clear_channel(cinfo);
 }
 
-static void scmi_handle_response(struct scmi_chan_info *cinfo,
-				 u16 xfer_id, u8 msg_type)
+static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
 {
-	unsigned long flags;
 	struct scmi_xfer *xfer;
-	struct device *dev = cinfo->dev;
 	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
-	struct scmi_xfers_info *minfo = &info->tx_minfo;
 
-	/* Are we even expecting this? */
-	spin_lock_irqsave(&minfo->xfer_lock, flags);
-	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
-	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
+	xfer = scmi_xfer_command_acquire(cinfo, msg_hdr);
 	if (IS_ERR(xfer)) {
-		dev_err(dev, "message for %d is not expected!\n", xfer_id);
 		info->desc->ops->clear_channel(cinfo);
 		return;
 	}
 
-	/*
-	 * Even if a response was indeed expected on this slot at this point,
-	 * a buggy platform could wrongly reply feeding us an unexpected
-	 * delayed response we're not prepared to handle: bail-out safely
-	 * blaming firmware.
-	 */
-	if (unlikely(msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done)) {
-		dev_err(dev,
-			"Delayed Response for %d not expected! Buggy F/W ?\n",
-			xfer_id);
-		info->desc->ops->clear_channel(cinfo);
-		/* It was unexpected, so nobody will clear the xfer if not us */
-		__scmi_xfer_put(minfo, xfer);
-		return;
-	}
+	scmi_xfer_state_update(xfer);
 
 	/* rx.len could be shrunk in the sync do_xfer, so reset to maxsz */
-	if (msg_type == MSG_TYPE_DELAYED_RESP)
+	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP)
 		xfer->rx.len = info->desc->max_msg_size;
 
 	info->desc->ops->fetch_response(cinfo, xfer);
 
 	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
 			   xfer->hdr.protocol_id, xfer->hdr.seq,
-			   msg_type);
+			   xfer->hdr.type);
 
-	if (msg_type == MSG_TYPE_DELAYED_RESP) {
+	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP) {
 		info->desc->ops->clear_channel(cinfo);
 		complete(xfer->async_done);
 	} else {
 		complete(&xfer->done);
 	}
+
+	scmi_xfer_command_release(info, xfer);
 }
 
 /**
@@ -529,7 +672,6 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
  */
 void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
 {
-	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
 	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
 
 	switch (msg_type) {
@@ -538,7 +680,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
 		break;
 	case MSG_TYPE_COMMAND:
 	case MSG_TYPE_DELAYED_RESP:
-		scmi_handle_response(cinfo, xfer_id, msg_type);
+		scmi_handle_response(cinfo, msg_hdr);
 		break;
 	default:
 		WARN_ONCE(1, "received unknown msg_type:%d\n", msg_type);
@@ -550,7 +692,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
  * xfer_put() - Release a transmit message
  *
  * @ph: Pointer to SCMI protocol handle
- * @xfer: message that was reserved by scmi_xfer_get
+ * @xfer: message that was reserved by xfer_get_init
  */
 static void xfer_put(const struct scmi_protocol_handle *ph,
 		     struct scmi_xfer *xfer)
@@ -568,7 +710,12 @@ static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
 {
 	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
 
+	/*
+	 * Poll also on xfer->done so that polling can be forcibly terminated
+	 * in case of out-of-order receptions of delayed responses
+	 */
 	return info->desc->ops->poll_done(cinfo, xfer) ||
+	       try_wait_for_completion(&xfer->done) ||
 	       ktime_after(ktime_get(), stop);
 }
 
@@ -608,6 +755,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 			      xfer->hdr.protocol_id, xfer->hdr.seq,
 			      xfer->hdr.poll_completion);
 
+	xfer->state = SCMI_XFER_SENT_OK;
 	ret = info->desc->ops->send_message(cinfo, xfer);
 	if (ret < 0) {
 		dev_dbg(dev, "Failed to send message %d\n", ret);
@@ -619,10 +767,22 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 
 		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
 
-		if (ktime_before(ktime_get(), stop))
-			info->desc->ops->fetch_response(cinfo, xfer);
-		else
+		if (ktime_before(ktime_get(), stop)) {
+			unsigned long flags;
+
+			/*
+			 * Do not fetch_response if an out-of-order delayed
+			 * response is being processed.
+			 */
+			spin_lock_irqsave(&xfer->lock, flags);
+			if (xfer->state == SCMI_XFER_SENT_OK) {
+				info->desc->ops->fetch_response(cinfo, xfer);
+				xfer->state = SCMI_XFER_RESP_OK;
+			}
+			spin_unlock_irqrestore(&xfer->lock, flags);
+		} else {
 			ret = -ETIMEDOUT;
+		}
 	} else {
 		/* And we wait for the response. */
 		timeout = msecs_to_jiffies(info->desc->max_rx_timeout_ms);
@@ -1220,6 +1380,7 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 
 		xfer->tx.buf = xfer->rx.buf;
 		init_completion(&xfer->done);
+		spin_lock_init(&xfer->lock);
 
 		/* Add initialized xfer to the free list */
 		hlist_add_head(&xfer->node, &info->free_xfers);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (6 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-28 14:26   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 09/17] firmware: arm_scmi: Make .clear_channel optional Cristian Marussi
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Add a new opaque void *priv parameter to scmi_rx_callback which can be
optionally provided by the transport layer when invoking scmi_rx_callback
and that will be passed back to the transport layer in xfer->priv.

This can be used by transports that needs to keep track of their specific
data structures together with the valid xfers.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/common.h  |  4 +++-
 drivers/firmware/arm_scmi/driver.c  | 17 ++++++++++++-----
 drivers/firmware/arm_scmi/mailbox.c |  2 +-
 drivers/firmware/arm_scmi/smc.c     |  3 ++-
 4 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 9efebe1406d2..5707789de66c 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -153,6 +153,7 @@ struct scmi_msg {
  *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
  *	      (Missing synchronous response is assumed OK and ignored)
  * @lock: A spinlock to protect state and busy fields.
+ * @priv: A pointer for transport private usage.
  */
 struct scmi_xfer {
 	int transfer_id;
@@ -173,6 +174,7 @@ struct scmi_xfer {
 	int state;
 	/* A lock to protect state and busy fields */
 	spinlock_t lock;
+	void *priv;
 };
 
 /*
@@ -389,7 +391,7 @@ extern const struct scmi_desc scmi_mailbox_desc;
 extern const struct scmi_desc scmi_smc_desc;
 #endif
 
-void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr);
+void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr, void *priv);
 void scmi_free_channel(struct scmi_chan_info *cinfo, struct idr *idr, int id);
 
 /* shmem related declarations */
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 5ef33d692670..dc14bc63cb43 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -593,7 +593,8 @@ static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
 	}
 }
 
-static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
+static void scmi_handle_notification(struct scmi_chan_info *cinfo,
+				     u32 msg_hdr, void *priv)
 {
 	struct scmi_xfer *xfer;
 	struct device *dev = cinfo->dev;
@@ -611,6 +612,8 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	}
 
 	unpack_scmi_header(msg_hdr, &xfer->hdr);
+	if (priv)
+		xfer->priv = priv;
 	info->desc->ops->fetch_notification(cinfo, info->desc->max_msg_size,
 					    xfer);
 	scmi_notify(cinfo->handle, xfer->hdr.protocol_id,
@@ -625,7 +628,8 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	info->desc->ops->clear_channel(cinfo);
 }
 
-static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
+static void scmi_handle_response(struct scmi_chan_info *cinfo,
+				 u32 msg_hdr, void *priv)
 {
 	struct scmi_xfer *xfer;
 	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
@@ -642,6 +646,8 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
 	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP)
 		xfer->rx.len = info->desc->max_msg_size;
 
+	if (priv)
+		xfer->priv = priv;
 	info->desc->ops->fetch_response(cinfo, xfer);
 
 	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
@@ -663,6 +669,7 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
  *
  * @cinfo: SCMI channel info
  * @msg_hdr: Message header
+ * @priv: Transport specific private data.
  *
  * Processes one received message to appropriate transfer information and
  * signals completion of the transfer.
@@ -670,17 +677,17 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
  * NOTE: This function will be invoked in IRQ context, hence should be
  * as optimal as possible.
  */
-void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
+void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr, void *priv)
 {
 	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
 
 	switch (msg_type) {
 	case MSG_TYPE_NOTIFICATION:
-		scmi_handle_notification(cinfo, msg_hdr);
+		scmi_handle_notification(cinfo, msg_hdr, priv);
 		break;
 	case MSG_TYPE_COMMAND:
 	case MSG_TYPE_DELAYED_RESP:
-		scmi_handle_response(cinfo, msg_hdr);
+		scmi_handle_response(cinfo, msg_hdr, priv);
 		break;
 	default:
 		WARN_ONCE(1, "received unknown msg_type:%d\n", msg_type);
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index e3dcb58314ae..e09eb12bf421 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -43,7 +43,7 @@ static void rx_callback(struct mbox_client *cl, void *m)
 {
 	struct scmi_mailbox *smbox = client_to_scmi_mailbox(cl);
 
-	scmi_rx_callback(smbox->cinfo, shmem_read_header(smbox->shmem));
+	scmi_rx_callback(smbox->cinfo, shmem_read_header(smbox->shmem), NULL);
 }
 
 static bool mailbox_chan_available(struct device *dev, int idx)
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index bed5596c7209..4effecc3bb46 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -154,7 +154,8 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 	if (scmi_info->irq)
 		wait_for_completion(&scmi_info->tx_complete);
 
-	scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
+	scmi_rx_callback(scmi_info->cinfo,
+			 shmem_read_header(scmi_info->shmem), NULL);
 
 	mutex_unlock(&scmi_info->shmem_lock);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 09/17] firmware: arm_scmi: Make .clear_channel optional
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (7 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Make transport operation .clear_channel optional since some transports
do not need it and so avoid to have them implement dummy callbacks.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/driver.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index dc14bc63cb43..a952b6527b8a 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -593,6 +593,13 @@ static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
 	}
 }
 
+static inline void scmi_clear_channel(struct scmi_info *info,
+				      struct scmi_chan_info *cinfo)
+{
+	if (info->desc->ops->clear_channel)
+		info->desc->ops->clear_channel(cinfo);
+}
+
 static void scmi_handle_notification(struct scmi_chan_info *cinfo,
 				     u32 msg_hdr, void *priv)
 {
@@ -607,7 +614,7 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo,
 	if (IS_ERR(xfer)) {
 		dev_err(dev, "failed to get free message slot (%ld)\n",
 			PTR_ERR(xfer));
-		info->desc->ops->clear_channel(cinfo);
+		scmi_clear_channel(info, cinfo);
 		return;
 	}
 
@@ -625,7 +632,7 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo,
 
 	__scmi_xfer_put(minfo, xfer);
 
-	info->desc->ops->clear_channel(cinfo);
+	scmi_clear_channel(info, cinfo);
 }
 
 static void scmi_handle_response(struct scmi_chan_info *cinfo,
@@ -636,7 +643,7 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
 
 	xfer = scmi_xfer_command_acquire(cinfo, msg_hdr);
 	if (IS_ERR(xfer)) {
-		info->desc->ops->clear_channel(cinfo);
+		scmi_clear_channel(info, cinfo);
 		return;
 	}
 
@@ -655,7 +662,7 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
 			   xfer->hdr.type);
 
 	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP) {
-		info->desc->ops->clear_channel(cinfo);
+		scmi_clear_channel(info, cinfo);
 		complete(xfer->async_done);
 	} else {
 		complete(&xfer->done);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (8 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 09/17] firmware: arm_scmi: Make .clear_channel optional Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-15 16:36   ` Peter Hilber
  2021-07-28 14:34   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable Cristian Marussi
                   ` (8 subsequent siblings)
  18 siblings, 2 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Add a check for the presence of .poll_done transport operation so that
transports that do not need to support polling mode have no need to provide
a dummy .poll_done callback either and polling mode can be disabled in the
SCMI core for that tranport.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/driver.c | 43 ++++++++++++++++++------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index a952b6527b8a..4183d25c9289 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -777,25 +777,34 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	}
 
 	if (xfer->hdr.poll_completion) {
-		ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
-
-		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
-
-		if (ktime_before(ktime_get(), stop)) {
-			unsigned long flags;
-
-			/*
-			 * Do not fetch_response if an out-of-order delayed
-			 * response is being processed.
-			 */
-			spin_lock_irqsave(&xfer->lock, flags);
-			if (xfer->state == SCMI_XFER_SENT_OK) {
-				info->desc->ops->fetch_response(cinfo, xfer);
-				xfer->state = SCMI_XFER_RESP_OK;
+		if (info->desc->ops->poll_done) {
+			ktime_t stop = ktime_add_ns(ktime_get(),
+						    SCMI_MAX_POLL_TO_NS);
+
+			spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer,
+								  stop));
+
+			if (ktime_before(ktime_get(), stop)) {
+				unsigned long flags;
+
+				/*
+				 * Do not fetch_response if an out-of-order delayed
+				 * response is being processed.
+				 */
+				spin_lock_irqsave(&xfer->lock, flags);
+				if (xfer->state == SCMI_XFER_SENT_OK) {
+					info->desc->ops->fetch_response(cinfo,
+									xfer);
+					xfer->state = SCMI_XFER_RESP_OK;
+				}
+				spin_unlock_irqrestore(&xfer->lock, flags);
+			} else {
+				ret = -ETIMEDOUT;
 			}
-			spin_unlock_irqrestore(&xfer->lock, flags);
 		} else {
-			ret = -ETIMEDOUT;
+			dev_warn_once(dev,
+				      "Polling mode is not supported by transport.\n");
+			ret = EINVAL;
 		}
 	} else {
 		/* And we wait for the response. */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (9 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-28 14:50   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 12/17] firmware: arm_scmi: Make shmem support optional for transports Cristian Marussi
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Add configuration options to be able to select which SCMI transports have
to be compiled into the SCMI stack.

Mailbox and SMC are by default enabled if their related dependencies are
satisfied.

While doing that move all SCMI related config options in their own
dedicated submenu.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
Used a BUILD_BUG_ON() to avoid the scenario where SCMI is configured
without any transport. Coul dnot do in any other way in Kconfig due to
circular dependencies.

This will be neeed later on to add new Virtio based transport and
optionally exclude other transports.
---
 drivers/firmware/Kconfig           | 34 +--------------
 drivers/firmware/arm_scmi/Kconfig  | 70 ++++++++++++++++++++++++++++++
 drivers/firmware/arm_scmi/Makefile |  4 +-
 drivers/firmware/arm_scmi/common.h |  4 +-
 drivers/firmware/arm_scmi/driver.c |  6 ++-
 5 files changed, 80 insertions(+), 38 deletions(-)
 create mode 100644 drivers/firmware/arm_scmi/Kconfig

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 1db738d5b301..8d41f73f5395 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -6,39 +6,7 @@
 
 menu "Firmware Drivers"
 
-config ARM_SCMI_PROTOCOL
-	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
-	depends on ARM || ARM64 || COMPILE_TEST
-	depends on MAILBOX || HAVE_ARM_SMCCC_DISCOVERY
-	help
-	  ARM System Control and Management Interface (SCMI) protocol is a
-	  set of operating system-independent software interfaces that are
-	  used in system management. SCMI is extensible and currently provides
-	  interfaces for: Discovery and self-description of the interfaces
-	  it supports, Power domain management which is the ability to place
-	  a given device or domain into the various power-saving states that
-	  it supports, Performance management which is the ability to control
-	  the performance of a domain that is composed of compute engines
-	  such as application processors and other accelerators, Clock
-	  management which is the ability to set and inquire rates on platform
-	  managed clocks and Sensor management which is the ability to read
-	  sensor data, and be notified of sensor value.
-
-	  This protocol library provides interface for all the client drivers
-	  making use of the features offered by the SCMI.
-
-config ARM_SCMI_POWER_DOMAIN
-	tristate "SCMI power domain driver"
-	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
-	default y
-	select PM_GENERIC_DOMAINS if PM
-	help
-	  This enables support for the SCMI power domains which can be
-	  enabled or disabled via the SCP firmware
-
-	  This driver can also be built as a module.  If so, the module
-	  will be called scmi_pm_domain. Note this may needed early in boot
-	  before rootfs may be available.
+source "drivers/firmware/arm_scmi/Kconfig"
 
 config ARM_SCPI_PROTOCOL
 	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
new file mode 100644
index 000000000000..479fc8a3533e
--- /dev/null
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: GPL-2.0-only
+menu "ARM System Control and Management Interface Protocol"
+
+config ARM_SCMI_PROTOCOL
+	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
+	depends on ARM || ARM64 || COMPILE_TEST
+	help
+	  ARM System Control and Management Interface (SCMI) protocol is a
+	  set of operating system-independent software interfaces that are
+	  used in system management. SCMI is extensible and currently provides
+	  interfaces for: Discovery and self-description of the interfaces
+	  it supports, Power domain management which is the ability to place
+	  a given device or domain into the various power-saving states that
+	  it supports, Performance management which is the ability to control
+	  the performance of a domain that is composed of compute engines
+	  such as application processors and other accelerators, Clock
+	  management which is the ability to set and inquire rates on platform
+	  managed clocks and Sensor management which is the ability to read
+	  sensor data, and be notified of sensor value.
+
+	  This protocol library provides interface for all the client drivers
+	  making use of the features offered by the SCMI.
+
+config ARM_SCMI_HAVE_TRANSPORT
+	bool
+	help
+	  This declares whether at least one SCMI transport has been configured.
+	  Used to trigger a build bug when trying to build SCMI without any
+	  configured transport.
+
+config ARM_SCMI_TRANSPORT_MAILBOX
+	bool "SCMI transport based on Mailbox"
+	depends on ARM_SCMI_PROTOCOL && MAILBOX
+	select ARM_SCMI_HAVE_TRANSPORT
+	default y
+	help
+	  Enable mailbox based transport for SCMI.
+
+	  If you want the ARM SCMI PROTOCOL stack to include support for a
+	  transport based on mailboxes, answer Y.
+	  A matching DT entry will also be needed to indicate the effective
+	  presence of this kind of transport.
+
+config ARM_SCMI_TRANSPORT_SMC
+	bool "SCMI transport based on SMC"
+	depends on ARM_SCMI_PROTOCOL && HAVE_ARM_SMCCC_DISCOVERY
+	select ARM_SCMI_HAVE_TRANSPORT
+	default y
+	help
+	  Enable SMC based transport for SCMI.
+
+	  If you want the ARM SCMI PROTOCOL stack to include support for a
+	  transport based on SMC, answer Y.
+	  A matching DT entry will also be needed to indicate the effective
+	  presence of this kind of transport.
+
+config ARM_SCMI_POWER_DOMAIN
+	tristate "SCMI power domain driver"
+	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
+	default y
+	select PM_GENERIC_DOMAINS if PM
+	help
+	  This enables support for the SCMI power domains which can be
+	  enabled or disabled via the SCP firmware
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called scmi_pm_domain. Note this may needed early in boot
+	  before rootfs may be available.
+
+endmenu
diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
index 6a2ef63306d6..38163d6991b3 100644
--- a/drivers/firmware/arm_scmi/Makefile
+++ b/drivers/firmware/arm_scmi/Makefile
@@ -2,8 +2,8 @@
 scmi-bus-y = bus.o
 scmi-driver-y = driver.o notify.o
 scmi-transport-y = shmem.o
-scmi-transport-$(CONFIG_MAILBOX) += mailbox.o
-scmi-transport-$(CONFIG_HAVE_ARM_SMCCC_DISCOVERY) += smc.o
+scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
+scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
 scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
 scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
 		    $(scmi-transport-y)
diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 5707789de66c..9d5f88016d4a 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -386,8 +386,10 @@ struct scmi_desc {
 	int max_msg_size;
 };
 
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
 extern const struct scmi_desc scmi_mailbox_desc;
-#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#endif
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC
 extern const struct scmi_desc scmi_smc_desc;
 #endif
 
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 4183d25c9289..ffcac9bd3edd 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -1929,10 +1929,10 @@ ATTRIBUTE_GROUPS(versions);
 
 /* Each compatible listed below must have descriptor associated with it */
 static const struct of_device_id scmi_of_match[] = {
-#ifdef CONFIG_MAILBOX
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
 	{ .compatible = "arm,scmi", .data = &scmi_mailbox_desc },
 #endif
-#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC
 	{ .compatible = "arm,scmi-smc", .data = &scmi_smc_desc},
 #endif
 	{ /* Sentinel */ },
@@ -2003,6 +2003,8 @@ static int __init scmi_driver_init(void)
 
 	scmi_bus_init();
 
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_ARM_SCMI_HAVE_TRANSPORT));
+
 	/* Initialize any compiled-in transport which provided an init/exit */
 	ret = scmi_transports_init();
 	if (ret)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 12/17] firmware: arm_scmi: Make shmem support optional for transports
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (10 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 13/17] firmware: arm_scmi: Add method to override max message number Cristian Marussi
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Igor Skalkin <igor.skalkin@opensynergy.com>

Upcoming new SCMI transports won't need any kind of shared memory support.
Compile shmem.c only if a shmem based transport is selected.

Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: Adapted patch/commit_msg to new SCMI Kconfig layout ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v4 --> v5
- Adapted to new SCMI Kconfig layout
---
 drivers/firmware/arm_scmi/Kconfig  | 8 ++++++++
 drivers/firmware/arm_scmi/Makefile | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index 479fc8a3533e..ee6517b24080 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -28,10 +28,17 @@ config ARM_SCMI_HAVE_TRANSPORT
 	  Used to trigger a build bug when trying to build SCMI without any
 	  configured transport.
 
+config ARM_SCMI_HAVE_SHMEM
+	bool
+	help
+	  This declares whether a shared memory based transport for SCMI is
+	  available.
+
 config ARM_SCMI_TRANSPORT_MAILBOX
 	bool "SCMI transport based on Mailbox"
 	depends on ARM_SCMI_PROTOCOL && MAILBOX
 	select ARM_SCMI_HAVE_TRANSPORT
+	select ARM_SCMI_HAVE_SHMEM
 	default y
 	help
 	  Enable mailbox based transport for SCMI.
@@ -45,6 +52,7 @@ config ARM_SCMI_TRANSPORT_SMC
 	bool "SCMI transport based on SMC"
 	depends on ARM_SCMI_PROTOCOL && HAVE_ARM_SMCCC_DISCOVERY
 	select ARM_SCMI_HAVE_TRANSPORT
+	select ARM_SCMI_HAVE_SHMEM
 	default y
 	help
 	  Enable SMC based transport for SCMI.
diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
index 38163d6991b3..e0e6bd3dba9e 100644
--- a/drivers/firmware/arm_scmi/Makefile
+++ b/drivers/firmware/arm_scmi/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 scmi-bus-y = bus.o
 scmi-driver-y = driver.o notify.o
-scmi-transport-y = shmem.o
+scmi-transport-$(CONFIG_ARM_SCMI_HAVE_SHMEM) = shmem.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
 scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 13/17] firmware: arm_scmi: Add method to override max message number
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (11 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 12/17] firmware: arm_scmi: Make shmem support optional for transports Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports Cristian Marussi
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Igor Skalkin <igor.skalkin@opensynergy.com>

The maximum number of simultaneously pending messages is a transport
specific quantity that is usually described statically in struct scmi_desc.

Some transports, though, can calculate such number only at run-time after
some initial transport specific setup and probing is completed; moreover
the resulting max message numbers could also be different between rx and
tx channels.

Add an optional get_max_msg() operation so that a transport can report more
accurate max message numbers for each channel type.

The value in scmi_desc.max_msg is still used as default when transport does
not provide any get_max_msg() method.

Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: refactored how get_max_msg() is used to minimize core changes ]
Co-developed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v4 --> v5
- refactored usage of get_max_msg()
---
 drivers/firmware/arm_scmi/common.h |  9 +++++--
 drivers/firmware/arm_scmi/driver.c | 40 +++++++++++++++++++++++++++---
 2 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 9d5f88016d4a..14457f0d5dea 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -334,6 +334,9 @@ struct scmi_chan_info {
  * @chan_available: Callback to check if channel is available or not
  * @chan_setup: Callback to allocate and setup a channel
  * @chan_free: Callback to free a channel
+ * @get_max_msg: Optional callback to provide max_msg dynamically
+ *		 Returns the maximum number of messages for the channel type
+ *		 (tx or rx) that can be pending simultaneously in the system
  * @send_message: Callback to send a message
  * @mark_txdone: Callback to mark tx as done
  * @fetch_response: Callback to fetch response
@@ -346,6 +349,7 @@ struct scmi_transport_ops {
 	int (*chan_setup)(struct scmi_chan_info *cinfo, struct device *dev,
 			  bool tx);
 	int (*chan_free)(int id, void *p, void *data);
+	unsigned int (*get_max_msg)(struct scmi_chan_info *base_cinfo);
 	int (*send_message)(struct scmi_chan_info *cinfo,
 			    struct scmi_xfer *xfer);
 	void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret);
@@ -373,8 +377,9 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  *	  after SCMI core removal.
  * @ops: Pointer to the transport specific ops structure
  * @max_rx_timeout_ms: Timeout for communication with SoC (in Milliseconds)
- * @max_msg: Maximum number of messages that can be pending
- *	simultaneously in the system
+ * @max_msg: Maximum number of messages for a channel type (tx or rx) that can
+ *	be pending simultaneously in the system. May be overridden by the
+ *	get_max_msg op.
  * @max_msg_size: Maximum size of data per message that can be handled.
  */
 struct scmi_desc {
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index ffcac9bd3edd..24039be7599d 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -75,6 +75,7 @@ struct scmi_requested_dev {
  *	Index of this bitmap table is also used for message
  *	sequence identifier.
  * @xfer_lock: Protection for message allocation
+ * @max_msg: Maximum number of messages that can be pending
  * @last_token: A counter to use as base to generate for monotonically
  *		increasing tokens.
  * @free_xfers: A free list for available to use xfers. It is initialized with
@@ -86,6 +87,7 @@ struct scmi_requested_dev {
 struct scmi_xfers_info {
 	unsigned long *xfer_alloc_table;
 	spinlock_t xfer_lock;
+	int max_msg;
 	atomic_t last_token;
 	struct hlist_head free_xfers;
 	DECLARE_HASHTABLE(pending_xfers, SCMI_PENDING_XFERS_HT_ORDER_SZ);
@@ -1370,10 +1372,10 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 	const struct scmi_desc *desc = sinfo->desc;
 
 	/* Pre-allocated messages, no more than what hdr.seq can support */
-	if (WARN_ON(!desc->max_msg || desc->max_msg > MSG_TOKEN_MAX)) {
+	if (WARN_ON(!info->max_msg || info->max_msg > MSG_TOKEN_MAX)) {
 		dev_err(dev,
 			"Invalid max_msg %d. Maximum messages supported %lu.\n",
-			desc->max_msg, MSG_TOKEN_MAX);
+			info->max_msg, MSG_TOKEN_MAX);
 		return -EINVAL;
 	}
 
@@ -1391,7 +1393,7 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 	 * attach all of them to the free list
 	 */
 	INIT_HLIST_HEAD(&info->free_xfers);
-	for (i = 0; i < desc->max_msg; i++) {
+	for (i = 0; i < info->max_msg; i++) {
 		xfer = devm_kzalloc(dev, sizeof(*xfer), GFP_KERNEL);
 		if (!xfer)
 			return -ENOMEM;
@@ -1415,10 +1417,40 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
 	return 0;
 }
 
+static int scmi_channels_max_msg_configure(struct scmi_info *sinfo)
+{
+	const struct scmi_desc *desc = sinfo->desc;
+
+	if (!desc->ops->get_max_msg) {
+		sinfo->tx_minfo.max_msg = desc->max_msg;
+		sinfo->rx_minfo.max_msg = desc->max_msg;
+	} else {
+		struct scmi_chan_info *base_cinfo;
+
+		base_cinfo = idr_find(&sinfo->tx_idr, SCMI_PROTOCOL_BASE);
+		if (!base_cinfo)
+			return -EINVAL;
+		sinfo->tx_minfo.max_msg = desc->ops->get_max_msg(base_cinfo);
+
+		/* RX channel is optional so can be skipped */
+		base_cinfo = idr_find(&sinfo->rx_idr, SCMI_PROTOCOL_BASE);
+		if (base_cinfo)
+			sinfo->rx_minfo.max_msg =
+				desc->ops->get_max_msg(base_cinfo);
+	}
+
+	return 0;
+}
+
 static int scmi_xfer_info_init(struct scmi_info *sinfo)
 {
-	int ret = __scmi_xfer_info_init(sinfo, &sinfo->tx_minfo);
+	int ret;
+
+	ret = scmi_channels_max_msg_configure(sinfo);
+	if (ret)
+		return ret;
 
+	ret = __scmi_xfer_info_init(sinfo, &sinfo->tx_minfo);
 	if (!ret && idr_find(&sinfo->rx_idr, SCMI_PROTOCOL_BASE))
 		ret = __scmi_xfer_info_init(sinfo, &sinfo->rx_minfo);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (12 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 13/17] firmware: arm_scmi: Add method to override max message number Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-15 16:36   ` Peter Hilber
  2021-07-12 14:18 ` [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op Cristian Marussi
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Peter Hilber <peter.hilber@opensynergy.com>

Add abstractions for future transports using message passing, such as
virtio. Derive the abstractions from the shared memory abstractions.

Abstract the transport SDU through the opaque struct scmi_msg_payld.
Also enable the transport to determine all other required information
about the transport SDU.

Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: Adapted to new SCMI Kconfig layout, updated Copyrigths ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v4 --> v5
- adapted to new SCMI Kconfig
- removed raw_payload msg helpers
v3 --> v4
- added raw_payload msg helpers
---
 drivers/firmware/arm_scmi/Kconfig  |   6 ++
 drivers/firmware/arm_scmi/Makefile |   1 +
 drivers/firmware/arm_scmi/common.h |  15 ++++
 drivers/firmware/arm_scmi/msg.c    | 113 +++++++++++++++++++++++++++++
 4 files changed, 135 insertions(+)
 create mode 100644 drivers/firmware/arm_scmi/msg.c

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index ee6517b24080..1fdaa8ad7d3f 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -34,6 +34,12 @@ config ARM_SCMI_HAVE_SHMEM
 	  This declares whether a shared memory based transport for SCMI is
 	  available.
 
+config ARM_SCMI_HAVE_MSG
+	bool
+	help
+	  This declares whether a message passing based transport for SCMI is
+	  available.
+
 config ARM_SCMI_TRANSPORT_MAILBOX
 	bool "SCMI transport based on Mailbox"
 	depends on ARM_SCMI_PROTOCOL && MAILBOX
diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
index e0e6bd3dba9e..aaad9f6589aa 100644
--- a/drivers/firmware/arm_scmi/Makefile
+++ b/drivers/firmware/arm_scmi/Makefile
@@ -4,6 +4,7 @@ scmi-driver-y = driver.o notify.o
 scmi-transport-$(CONFIG_ARM_SCMI_HAVE_SHMEM) = shmem.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
+scmi-transport-$(CONFIG_ARM_SCMI_HAVE_MSG) += msg.o
 scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
 scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
 		    $(scmi-transport-y)
diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 14457f0d5dea..7a1e84dc191b 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -415,6 +415,21 @@ void shmem_clear_channel(struct scmi_shared_mem __iomem *shmem);
 bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 		     struct scmi_xfer *xfer);
 
+/* declarations for message passing transports */
+struct scmi_msg_payld;
+
+/* Maximum overhead of message w.r.t. struct scmi_desc.max_msg_size */
+#define SCMI_MSG_MAX_PROT_OVERHEAD (2 * sizeof(__le32))
+
+size_t msg_response_size(struct scmi_xfer *xfer);
+size_t msg_command_size(struct scmi_xfer *xfer);
+void msg_tx_prepare(struct scmi_msg_payld *msg, struct scmi_xfer *xfer);
+u32 msg_read_header(struct scmi_msg_payld *msg);
+void msg_fetch_response(struct scmi_msg_payld *msg, size_t len,
+			struct scmi_xfer *xfer);
+void msg_fetch_notification(struct scmi_msg_payld *msg, size_t len,
+			    size_t max_len, struct scmi_xfer *xfer);
+
 void scmi_notification_instance_data_set(const struct scmi_handle *handle,
 					 void *priv);
 void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
diff --git a/drivers/firmware/arm_scmi/msg.c b/drivers/firmware/arm_scmi/msg.c
new file mode 100644
index 000000000000..639969b7dc10
--- /dev/null
+++ b/drivers/firmware/arm_scmi/msg.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * For transports using message passing.
+ *
+ * Derived from shm.c.
+ *
+ * Copyright (C) 2019-2021 ARM Ltd.
+ * Copyright (C) 2020-2021 OpenSynergy GmbH
+ */
+
+#include <linux/io.h>
+#include <linux/processor.h>
+#include <linux/types.h>
+
+#include "common.h"
+
+/*
+ * struct scmi_msg_payld - Transport SDU layout
+ *
+ * The SCMI specification requires all parameters, message headers, return
+ * arguments or any protocol data to be expressed in little endian format only.
+ */
+struct scmi_msg_payld {
+	__le32 msg_header;
+	__le32 msg_payload[];
+};
+
+/**
+ * msg_command_size() - Actual size of transport SDU for command.
+ *
+ * @xfer: message which core has prepared for sending
+ *
+ * Return: transport SDU size.
+ */
+size_t msg_command_size(struct scmi_xfer *xfer)
+{
+	return sizeof(struct scmi_msg_payld) + xfer->tx.len;
+}
+
+/**
+ * msg_response_size() - Maximum size of transport SDU for response.
+ *
+ * @xfer: message which core has prepared for sending
+ *
+ * Return: transport SDU size.
+ */
+size_t msg_response_size(struct scmi_xfer *xfer)
+{
+	return sizeof(struct scmi_msg_payld) + sizeof(__le32) + xfer->rx.len;
+}
+
+/**
+ * msg_tx_prepare() - Set up transport SDU for command.
+ *
+ * @msg: transport SDU for command
+ * @xfer: message which is being sent
+ */
+void msg_tx_prepare(struct scmi_msg_payld *msg, struct scmi_xfer *xfer)
+{
+	msg->msg_header = cpu_to_le32(pack_scmi_header(&xfer->hdr));
+	if (xfer->tx.buf)
+		memcpy(msg->msg_payload, xfer->tx.buf, xfer->tx.len);
+}
+
+/**
+ * msg_read_header() - Read SCMI header from transport SDU.
+ *
+ * @msg: transport SDU
+ *
+ * Return: SCMI header
+ */
+u32 msg_read_header(struct scmi_msg_payld *msg)
+{
+	return le32_to_cpu(msg->msg_header);
+}
+
+/**
+ * msg_fetch_response() - Fetch response SCMI payload from transport SDU.
+ *
+ * @msg: transport SDU with response
+ * @len: transport SDU size
+ * @xfer: message being responded to
+ */
+void msg_fetch_response(struct scmi_msg_payld *msg, size_t len,
+			struct scmi_xfer *xfer)
+{
+	size_t prefix_len = sizeof(*msg) + sizeof(msg->msg_payload[0]);
+
+	xfer->hdr.status = le32_to_cpu(msg->msg_payload[0]);
+	xfer->rx.len = min_t(size_t, xfer->rx.len,
+			     len >= prefix_len ? len - prefix_len : 0);
+
+	/* Take a copy to the rx buffer.. */
+	memcpy(xfer->rx.buf, &msg->msg_payload[1], xfer->rx.len);
+}
+
+/**
+ * msg_fetch_notification() - Fetch notification payload from transport SDU.
+ *
+ * @msg: transport SDU with notification
+ * @len: transport SDU size
+ * @max_len: maximum SCMI payload size to fetch
+ * @xfer: notification message
+ */
+void msg_fetch_notification(struct scmi_msg_payld *msg, size_t len,
+			    size_t max_len, struct scmi_xfer *xfer)
+{
+	xfer->rx.len = min_t(size_t, max_len,
+			     len >= sizeof(*msg) ? len - sizeof(*msg) : 0);
+
+	/* Take a copy to the rx buffer.. */
+	memcpy(xfer->rx.buf, msg->msg_payload, xfer->rx.len);
+}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (13 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-28 15:36   ` Sudeep Holla
  2021-07-12 14:18 ` [PATCH v6 16/17] dt-bindings: arm: Add virtio transport for SCMI Cristian Marussi
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Peter Hilber <peter.hilber@opensynergy.com>

Some transports are also effectively registered with other kernel subsystem
in order to be properly probed and initialized; as a consequence such kind
of transports, and their related devices, might still not have been probed
and initialized at the time the main SCMI core driver is probed.

Add an optional .link_supplier() transport operation which can be used by
the core SCMI stack to dynamically check if the transport is ready and
dynamically link its device to the platform instance device.

Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: reworded commit message ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v5 --> v6
- reworded commit message
---
 drivers/firmware/arm_scmi/common.h | 2 ++
 drivers/firmware/arm_scmi/driver.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 7a1e84dc191b..e6403ebe59ca 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -331,6 +331,7 @@ struct scmi_chan_info {
 /**
  * struct scmi_transport_ops - Structure representing a SCMI transport ops
  *
+ * @link_supplier: Optional callback to add link to a supplier device
  * @chan_available: Callback to check if channel is available or not
  * @chan_setup: Callback to allocate and setup a channel
  * @chan_free: Callback to free a channel
@@ -345,6 +346,7 @@ struct scmi_chan_info {
  * @poll_done: Callback to poll transfer status
  */
 struct scmi_transport_ops {
+	int (*link_supplier)(struct device *dev);
 	bool (*chan_available)(struct device *dev, int idx);
 	int (*chan_setup)(struct scmi_chan_info *cinfo, struct device *dev,
 			  bool tx);
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 24039be7599d..1be86ab40be3 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -1810,6 +1810,12 @@ static int scmi_probe(struct platform_device *pdev)
 	handle->devm_protocol_get = scmi_devm_protocol_get;
 	handle->devm_protocol_put = scmi_devm_protocol_put;
 
+	if (desc->ops->link_supplier) {
+		ret = desc->ops->link_supplier(dev);
+		if (ret)
+			return ret;
+	}
+
 	ret = scmi_txrx_setup(info, dev, SCMI_PROTOCOL_BASE);
 	if (ret)
 		return ret;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 16/17] dt-bindings: arm: Add virtio transport for SCMI
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (14 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-12 14:18 ` [PATCH v6 17/17] firmware: arm_scmi: Add virtio transport Cristian Marussi
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Igor Skalkin <igor.skalkin@opensynergy.com>

Document the properties for arm,scmi-virtio compatible nodes.
The backing virtio SCMI device is described in patch [1].

While doing that, make shmem property required only for pre-existing
mailbox and smc transports, since virtio-scmi does not need it.

[1] https://lists.oasis-open.org/archives/virtio-comment/202102/msg00018.html

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: converted to yaml format, moved shmen required property. ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v3 --> V4
- convertd to YAML
- make shmem required only for pre-existing mailbox and smc transport
- updated VirtIO specification patch message reference
- dropped virtio-mmio SCMI device example since really not pertinent to
  virtio-scmi dt bindings transport: it is not even referenced in SCMI
  virtio DT node since they are enumerated by VirtIO subsystem and there
  could be PCI based SCMI devices anyway.
---
 Documentation/devicetree/bindings/firmware/arm,scmi.yaml | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
index cebf6ffe70d5..5c4c6782e052 100644
--- a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
+++ b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
@@ -34,6 +34,10 @@ properties:
       - description: SCMI compliant firmware with ARM SMC/HVC transport
         items:
           - const: arm,scmi-smc
+      - description: SCMI compliant firmware with SCMI Virtio transport.
+                     The virtio transport only supports a single device.
+        items:
+          - const: arm,scmi-virtio
 
   interrupts:
     description:
@@ -172,6 +176,7 @@ patternProperties:
       Each sub-node represents a protocol supported. If the platform
       supports a dedicated communication channel for a particular protocol,
       then the corresponding transport properties must be present.
+      The virtio transport does not support a dedicated communication channel.
 
     properties:
       reg:
@@ -195,7 +200,6 @@ patternProperties:
 
 required:
   - compatible
-  - shmem
 
 if:
   properties:
@@ -209,6 +213,7 @@ then:
 
   required:
     - mboxes
+    - shmem
 
 else:
   if:
@@ -219,6 +224,7 @@ else:
   then:
     required:
       - arm,smc-id
+      - shmem
 
 examples:
   - |
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v6 17/17] firmware: arm_scmi: Add virtio transport
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (15 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 16/17] dt-bindings: arm: Add virtio transport for SCMI Cristian Marussi
@ 2021-07-12 14:18 ` Cristian Marussi
  2021-07-15 16:35 ` [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Peter Hilber
  2021-08-11  9:31 ` Floris Westermann
  18 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-12 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtualization, virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, igor.skalkin, peter.hilber, alex.bennee,
	jean-philippe, mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

From: Igor Skalkin <igor.skalkin@opensynergy.com>

This transport enables communications with an SCMI platform through virtio;
the SCMI platform will be represented by a virtio device.

Implement an SCMI virtio driver according to the virtio SCMI device spec
[1]. Virtio device id 32 has been reserved for the SCMI device [2].

The virtio transport has one Tx channel (virtio cmdq, A2P channel) and
at most one Rx channel (virtio eventq, P2A channel).

The following feature bit defined in [1] is not implemented:
VIRTIO_SCMI_F_SHARED_MEMORY.

The number of messages which can be pending simultaneously is restricted
according to the virtqueue capacity negotiated at probing time.

As soon as Rx channel message buffers are allocated or have been read
out by the arm-scmi driver, feed them back to the virtio device.

Since some virtio devices may not have the short response time exhibited
by SCMI platforms using other transports, set a generous response
timeout.

SCMI polling mode is not supported by this virtio transport since deemed
meaningless: polling mode operation is offered by the SCMI core to those
transports that could not provide a completion interrupt on the TX path,
which is never the case for virtio whose core callbacks can easily call
into core scmi_rx_callback upon messages reception.

[1] https://github.com/oasis-tcs/virtio-spec/blob/master/virtio-scmi.tex
[2] https://www.oasis-open.org/committees/ballot.php?id=3496

Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: simplified driver logic, changed link_supplier and channel
	    available/setup logic, removed dummy callbacks ]
Co-developed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
NOTE THAT VIRTIO TRANSPORT IS ADDED AS default=n

V5 --> V6
- removed usage of delegated xfers
- using new scmi_rx_callback with priv argument
- removed .dummy clear_channel/.poll_done callbacks
- added missing spinlock comments
- updated Copyrights

V4 --> V5
- adapted Virtio transport config to new SCMI Kconfig layout
- removed support for polling
- added validate virtio method support
- removed usage of raw_payload helpers
- removed dynamic search of matching devices
- added one single statically configured device

V3 --> V4
- using delegated xfers
- using raw_payload msg helpers
---
 MAINTAINERS                        |   1 +
 drivers/firmware/arm_scmi/Kconfig  |  13 +
 drivers/firmware/arm_scmi/Makefile |   1 +
 drivers/firmware/arm_scmi/common.h |   3 +
 drivers/firmware/arm_scmi/driver.c |   3 +
 drivers/firmware/arm_scmi/virtio.c | 491 +++++++++++++++++++++++++++++
 include/uapi/linux/virtio_ids.h    |   1 +
 include/uapi/linux/virtio_scmi.h   |  24 ++
 8 files changed, 537 insertions(+)
 create mode 100644 drivers/firmware/arm_scmi/virtio.c
 create mode 100644 include/uapi/linux/virtio_scmi.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a61f4f3b78a9..db1c7b74642e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17940,6 +17940,7 @@ F:	drivers/regulator/scmi-regulator.c
 F:	drivers/reset/reset-scmi.c
 F:	include/linux/sc[mp]i_protocol.h
 F:	include/trace/events/scmi.h
+F:	include/uapi/linux/virtio_scmi.h
 
 SYSTEM RESET/SHUTDOWN DRIVERS
 M:	Sebastian Reichel <sre@kernel.org>
diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index 1fdaa8ad7d3f..149b88f773a1 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -68,6 +68,19 @@ config ARM_SCMI_TRANSPORT_SMC
 	  A matching DT entry will also be needed to indicate the effective
 	  presence of this kind of transport.
 
+config ARM_SCMI_TRANSPORT_VIRTIO
+	bool "SCMI transport based on VirtIO"
+	depends on ARM_SCMI_PROTOCOL && VIRTIO
+	select ARM_SCMI_HAVE_TRANSPORT
+	select ARM_SCMI_HAVE_MSG
+	help
+	  This enables the virtio based transport for SCMI.
+
+	  If you want the ARM SCMI PROTOCOL stack to include support for a
+	  transport based on VirtIO, answer Y.
+	  A matching DT entry will also be needed to indicate the effective
+	  presence of this kind of transport.
+
 config ARM_SCMI_POWER_DOMAIN
 	tristate "SCMI power domain driver"
 	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
index aaad9f6589aa..1dcf123d64ab 100644
--- a/drivers/firmware/arm_scmi/Makefile
+++ b/drivers/firmware/arm_scmi/Makefile
@@ -5,6 +5,7 @@ scmi-transport-$(CONFIG_ARM_SCMI_HAVE_SHMEM) = shmem.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
 scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
 scmi-transport-$(CONFIG_ARM_SCMI_HAVE_MSG) += msg.o
+scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO) += virtio.o
 scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
 scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
 		    $(scmi-transport-y)
diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index e6403ebe59ca..877236b99602 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -399,6 +399,9 @@ extern const struct scmi_desc scmi_mailbox_desc;
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC
 extern const struct scmi_desc scmi_smc_desc;
 #endif
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO
+extern const struct scmi_desc scmi_virtio_desc;
+#endif
 
 void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr, void *priv);
 void scmi_free_channel(struct scmi_chan_info *cinfo, struct idr *idr, int id);
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 1be86ab40be3..650da2ef59bf 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -1972,6 +1972,9 @@ static const struct of_device_id scmi_of_match[] = {
 #endif
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC
 	{ .compatible = "arm,scmi-smc", .data = &scmi_smc_desc},
+#endif
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO
+	{ .compatible = "arm,scmi-virtio", .data = &scmi_virtio_desc},
 #endif
 	{ /* Sentinel */ },
 };
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
new file mode 100644
index 000000000000..47a9a01b6fe5
--- /dev/null
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -0,0 +1,491 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtio Transport driver for Arm System Control and Management Interface
+ * (SCMI).
+ *
+ * Copyright (C) 2020-2021 OpenSynergy.
+ * Copyright (C) 2021 ARM Ltd.
+ */
+
+/**
+ * DOC: Theory of Operation
+ *
+ * The scmi-virtio transport implements a driver for the virtio SCMI device.
+ *
+ * There is one Tx channel (virtio cmdq, A2P channel) and at most one Rx
+ * channel (virtio eventq, P2A channel). Each channel is implemented through a
+ * virtqueue. Access to each virtqueue is protected by spinlocks.
+ */
+
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+
+#include <uapi/linux/virtio_ids.h>
+#include <uapi/linux/virtio_scmi.h>
+
+#include "common.h"
+
+#define VIRTIO_SCMI_MAX_MSG_SIZE 128 /* Value may be increased. */
+#define VIRTIO_SCMI_MAX_PDU_SIZE \
+	(VIRTIO_SCMI_MAX_MSG_SIZE + SCMI_MSG_MAX_PROT_OVERHEAD)
+#define DESCRIPTORS_PER_TX_MSG 2
+
+/**
+ * struct scmi_vio_channel - Transport channel information
+ *
+ * @vqueue: Associated virtqueue
+ * @cinfo: SCMI Tx or Rx channel
+ * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @is_rx: Whether channel is an Rx channel
+ * @ready: Whether transport user is ready to hear about channel
+ * @max_msg: Maximum number of pending messages for this channel.
+ * @lock: Protects access to all members except ready.
+ * @ready_lock: Protects access to ready. If required, it must be taken before
+ *              lock.
+ */
+struct scmi_vio_channel {
+	struct virtqueue *vqueue;
+	struct scmi_chan_info *cinfo;
+	struct list_head free_list;
+	bool is_rx;
+	bool ready;
+	unsigned int max_msg;
+	/* lock to protect access to all members except ready. */
+	spinlock_t lock;
+	/* lock to rotects access to ready flag. */
+	spinlock_t ready_lock;
+};
+
+/**
+ * struct scmi_vio_msg - Transport PDU information
+ *
+ * @request: SDU used for commands
+ * @input: SDU used for (delayed) responses and notifications
+ * @list: List which scmi_vio_msg may be part of
+ * @rx_len: Input SDU size in bytes, once input has been received
+ */
+struct scmi_vio_msg {
+	struct scmi_msg_payld *request;
+	struct scmi_msg_payld *input;
+	struct list_head list;
+	unsigned int rx_len;
+};
+
+/* Only one SCMI VirtIO device can possibly exist */
+static struct virtio_device *scmi_vdev;
+
+static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
+{
+	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
+}
+
+static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
+			       struct scmi_vio_msg *msg)
+{
+	struct scatterlist sg_in;
+	int rc;
+	unsigned long flags;
+
+	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
+
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
+	if (rc)
+		dev_err_once(vioch->cinfo->dev,
+			     "failed to add to virtqueue (%d)\n", rc);
+	else
+		virtqueue_kick(vioch->vqueue);
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	return rc;
+}
+
+static void scmi_finalize_message(struct scmi_vio_channel *vioch,
+				  struct scmi_vio_msg *msg)
+{
+	if (vioch->is_rx) {
+		scmi_vio_feed_vq_rx(vioch, msg);
+	} else {
+		unsigned long flags;
+
+		spin_lock_irqsave(&vioch->lock, flags);
+		list_add(&msg->list, &vioch->free_list);
+		spin_unlock_irqrestore(&vioch->lock, flags);
+	}
+}
+
+static void scmi_vio_complete_cb(struct virtqueue *vqueue)
+{
+	unsigned long ready_flags;
+	unsigned long flags;
+	unsigned int length;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg;
+	bool cb_enabled = true;
+
+	if (WARN_ON_ONCE(!vqueue->vdev->priv))
+		return;
+	vioch = &((struct scmi_vio_channel *)vqueue->vdev->priv)[vqueue->index];
+
+	for (;;) {
+		spin_lock_irqsave(&vioch->ready_lock, ready_flags);
+
+		if (!vioch->ready) {
+			if (!cb_enabled)
+				(void)virtqueue_enable_cb(vqueue);
+			goto unlock_ready_out;
+		}
+
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (cb_enabled) {
+			virtqueue_disable_cb(vqueue);
+			cb_enabled = false;
+		}
+		msg = virtqueue_get_buf(vqueue, &length);
+		if (!msg) {
+			if (virtqueue_enable_cb(vqueue))
+				goto unlock_out;
+			cb_enabled = true;
+		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
+
+		if (msg) {
+			msg->rx_len = length;
+			scmi_rx_callback(vioch->cinfo,
+					 msg_read_header(msg->input), msg);
+
+			scmi_finalize_message(vioch, msg);
+		}
+
+		spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
+	}
+
+unlock_out:
+	spin_unlock_irqrestore(&vioch->lock, flags);
+unlock_ready_out:
+	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
+}
+
+static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
+
+static vq_callback_t *scmi_vio_complete_callbacks[] = {
+	scmi_vio_complete_cb,
+	scmi_vio_complete_cb
+};
+
+static unsigned int virtio_get_max_msg(struct scmi_chan_info *base_cinfo)
+{
+	struct scmi_vio_channel *vioch = base_cinfo->transport_info;
+
+	return vioch->max_msg;
+}
+
+static int virtio_link_supplier(struct device *dev)
+{
+	if (!scmi_vdev) {
+		dev_notice_once(dev,
+				"Deferring probe after not finding a bound scmi-virtio device\n");
+		return -EPROBE_DEFER;
+	}
+
+	if (!device_link_add(dev, &scmi_vdev->dev,
+			     DL_FLAG_AUTOREMOVE_CONSUMER)) {
+		dev_err(dev, "Adding link to supplier virtio device failed\n");
+		return -ECANCELED;
+	}
+
+	return 0;
+}
+
+static bool virtio_chan_available(struct device *dev, int idx)
+{
+	struct scmi_vio_channel *channels, *vioch = NULL;
+
+	if (WARN_ON_ONCE(!scmi_vdev))
+		return false;
+
+	channels = (struct scmi_vio_channel *)scmi_vdev->priv;
+
+	switch (idx) {
+	case VIRTIO_SCMI_VQ_TX:
+		vioch = &channels[VIRTIO_SCMI_VQ_TX];
+		break;
+	case VIRTIO_SCMI_VQ_RX:
+		if (scmi_vio_have_vq_rx(scmi_vdev))
+			vioch = &channels[VIRTIO_SCMI_VQ_RX];
+		break;
+	default:
+		return false;
+	}
+
+	return vioch && !vioch->cinfo ? true : false;
+}
+
+static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
+			     bool tx)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	int index = tx ? VIRTIO_SCMI_VQ_TX : VIRTIO_SCMI_VQ_RX;
+	int i;
+
+	if (!scmi_vdev)
+		return -EPROBE_DEFER;
+
+	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
+
+	for (i = 0; i < vioch->max_msg; i++) {
+		struct scmi_vio_msg *msg;
+
+		msg = devm_kzalloc(cinfo->dev, sizeof(*msg), GFP_KERNEL);
+		if (!msg)
+			return -ENOMEM;
+
+		if (tx) {
+			msg->request = devm_kzalloc(cinfo->dev,
+						    VIRTIO_SCMI_MAX_PDU_SIZE,
+						    GFP_KERNEL);
+			if (!msg->request)
+				return -ENOMEM;
+		}
+
+		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
+					  GFP_KERNEL);
+		if (!msg->input)
+			return -ENOMEM;
+
+		if (tx) {
+			spin_lock_irqsave(&vioch->lock, flags);
+			list_add_tail(&msg->list, &vioch->free_list);
+			spin_unlock_irqrestore(&vioch->lock, flags);
+		} else {
+			scmi_vio_feed_vq_rx(vioch, msg);
+		}
+	}
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	cinfo->transport_info = vioch;
+	/* Indirectly setting channel not available any more */
+	vioch->cinfo = cinfo;
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	spin_lock_irqsave(&vioch->ready_lock, flags);
+	vioch->ready = true;
+	spin_unlock_irqrestore(&vioch->ready_lock, flags);
+
+	return 0;
+}
+
+static int virtio_chan_free(int id, void *p, void *data)
+{
+	unsigned long flags;
+	struct scmi_chan_info *cinfo = p;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	spin_lock_irqsave(&vioch->ready_lock, flags);
+	vioch->ready = false;
+	spin_unlock_irqrestore(&vioch->ready_lock, flags);
+
+	scmi_free_channel(cinfo, data, id);
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	vioch->cinfo = NULL;
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	return 0;
+}
+
+static int virtio_send_message(struct scmi_chan_info *cinfo,
+			       struct scmi_xfer *xfer)
+{
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scatterlist sg_out;
+	struct scatterlist sg_in;
+	struct scatterlist *sgs[DESCRIPTORS_PER_TX_MSG] = { &sg_out, &sg_in };
+	unsigned long flags;
+	int rc;
+	struct scmi_vio_msg *msg;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	if (list_empty(&vioch->free_list)) {
+		spin_unlock_irqrestore(&vioch->lock, flags);
+		return -EBUSY;
+	}
+
+	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
+	list_del(&msg->list);
+
+	msg_tx_prepare(msg->request, xfer);
+
+	sg_init_one(&sg_out, msg->request, msg_command_size(xfer));
+	sg_init_one(&sg_in, msg->input, msg_response_size(xfer));
+
+	rc = virtqueue_add_sgs(vioch->vqueue, sgs, 1, 1, msg, GFP_ATOMIC);
+	if (rc) {
+		list_add(&msg->list, &vioch->free_list);
+		dev_err_once(vioch->cinfo->dev,
+			     "%s() failed to add to virtqueue (%d)\n", __func__,
+			     rc);
+	} else {
+		virtqueue_kick(vioch->vqueue);
+	}
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	return rc;
+}
+
+static void virtio_fetch_response(struct scmi_chan_info *cinfo,
+				  struct scmi_xfer *xfer)
+{
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (msg) {
+		msg_fetch_response(msg->input, msg->rx_len, xfer);
+		xfer->priv = NULL;
+	}
+}
+
+static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
+				      size_t max_len, struct scmi_xfer *xfer)
+{
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (msg) {
+		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
+		xfer->priv = NULL;
+	}
+}
+
+static const struct scmi_transport_ops scmi_virtio_ops = {
+	.link_supplier = virtio_link_supplier,
+	.chan_available = virtio_chan_available,
+	.chan_setup = virtio_chan_setup,
+	.chan_free = virtio_chan_free,
+	.get_max_msg = virtio_get_max_msg,
+	.send_message = virtio_send_message,
+	.fetch_response = virtio_fetch_response,
+	.fetch_notification = virtio_fetch_notification,
+};
+
+static int scmi_vio_probe(struct virtio_device *vdev)
+{
+	struct device *dev = &vdev->dev;
+	struct scmi_vio_channel *channels;
+	bool have_vq_rx;
+	int vq_cnt;
+	int i;
+	int ret;
+	struct virtqueue *vqs[VIRTIO_SCMI_VQ_MAX_CNT];
+
+	/* Only one SCMI VirtiO device allowed */
+	if (scmi_vdev)
+		return -EINVAL;
+
+	have_vq_rx = scmi_vio_have_vq_rx(vdev);
+	vq_cnt = have_vq_rx ? VIRTIO_SCMI_VQ_MAX_CNT : 1;
+
+	channels = devm_kcalloc(dev, vq_cnt, sizeof(*channels), GFP_KERNEL);
+	if (!channels)
+		return -ENOMEM;
+
+	if (have_vq_rx)
+		channels[VIRTIO_SCMI_VQ_RX].is_rx = true;
+
+	ret = virtio_find_vqs(vdev, vq_cnt, vqs, scmi_vio_complete_callbacks,
+			      scmi_vio_vqueue_names, NULL);
+	if (ret) {
+		dev_err(dev, "Failed to get %d virtqueue(s)\n", vq_cnt);
+		return ret;
+	}
+
+	for (i = 0; i < vq_cnt; i++) {
+		unsigned int sz;
+
+		spin_lock_init(&channels[i].lock);
+		spin_lock_init(&channels[i].ready_lock);
+		INIT_LIST_HEAD(&channels[i].free_list);
+		channels[i].vqueue = vqs[i];
+
+		sz = virtqueue_get_vring_size(channels[i].vqueue);
+		/* Tx messages need multiple descriptors. */
+		if (!channels[i].is_rx)
+			sz /= DESCRIPTORS_PER_TX_MSG;
+
+		if (sz > MSG_TOKEN_MAX) {
+			dev_info_once(dev,
+				      "%s virtqueue could hold %d messages. Only %ld allowed to be pending.\n",
+				      channels[i].is_rx ? "rx" : "tx",
+				      sz, MSG_TOKEN_MAX);
+			sz = MSG_TOKEN_MAX;
+		}
+		channels[i].max_msg = sz;
+	}
+
+	vdev->priv = channels;
+	scmi_vdev = vdev;
+
+	return 0;
+}
+
+static void scmi_vio_remove(struct virtio_device *vdev)
+{
+	vdev->config->reset(vdev);
+	vdev->config->del_vqs(vdev);
+	scmi_vdev = NULL;
+}
+
+static int scmi_vio_validate(struct virtio_device *vdev)
+{
+	if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+		dev_err(&vdev->dev,
+			"device does not comply with spec version 1.x\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static unsigned int features[] = {
+	VIRTIO_SCMI_F_P2A_CHANNELS,
+};
+
+static const struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_SCMI, VIRTIO_DEV_ANY_ID },
+	{ 0 }
+};
+
+static struct virtio_driver virtio_scmi_driver = {
+	.driver.name = "scmi-virtio",
+	.driver.owner = THIS_MODULE,
+	.feature_table = features,
+	.feature_table_size = ARRAY_SIZE(features),
+	.id_table = id_table,
+	.probe = scmi_vio_probe,
+	.remove = scmi_vio_remove,
+	.validate = scmi_vio_validate,
+};
+
+static int __init virtio_scmi_init(void)
+{
+	return register_virtio_driver(&virtio_scmi_driver);
+}
+
+static void __exit virtio_scmi_exit(void)
+{
+	unregister_virtio_driver(&virtio_scmi_driver);
+}
+
+const struct scmi_desc scmi_virtio_desc = {
+	.init = virtio_scmi_init,
+	.exit = virtio_scmi_exit,
+	.ops = &scmi_virtio_ops,
+	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
+	.max_msg = 0, /* overridden by virtio_get_max_msg() */
+	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+};
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 70a8057ad4bb..f74155f6882d 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -55,6 +55,7 @@
 #define VIRTIO_ID_FS			26 /* virtio filesystem */
 #define VIRTIO_ID_PMEM			27 /* virtio pmem */
 #define VIRTIO_ID_MAC80211_HWSIM	29 /* virtio mac80211-hwsim */
+#define VIRTIO_ID_SCMI			32 /* virtio SCMI */
 #define VIRTIO_ID_BT			40 /* virtio bluetooth */
 
 /*
diff --git a/include/uapi/linux/virtio_scmi.h b/include/uapi/linux/virtio_scmi.h
new file mode 100644
index 000000000000..f8ddd04a3ace
--- /dev/null
+++ b/include/uapi/linux/virtio_scmi.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/*
+ * Copyright (C) 2020-2021 OpenSynergy GmbH
+ * Copyright (C) 2021 ARM Ltd.
+ */
+
+#ifndef _UAPI_LINUX_VIRTIO_SCMI_H
+#define _UAPI_LINUX_VIRTIO_SCMI_H
+
+#include <linux/virtio_types.h>
+
+/* Device implements some SCMI notifications, or delayed responses. */
+#define VIRTIO_SCMI_F_P2A_CHANNELS 0
+
+/* Device implements any SCMI statistics shared memory region */
+#define VIRTIO_SCMI_F_SHARED_MEMORY 1
+
+/* Virtqueues */
+
+#define VIRTIO_SCMI_VQ_TX 0 /* cmdq */
+#define VIRTIO_SCMI_VQ_RX 1 /* eventq */
+#define VIRTIO_SCMI_VQ_MAX_CNT 2
+
+#endif /* _UAPI_LINUX_VIRTIO_SCMI_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check
  2021-07-12 14:18 ` [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check Cristian Marussi
@ 2021-07-14 16:46   ` Sudeep Holla
  0 siblings, 0 replies; 48+ messages in thread
From: Sudeep Holla @ 2021-07-14 16:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, virtio-dev, virtualization,
	Cristian Marussi
  Cc: Sudeep Holla, Vasyl.Vavrychuk, f.fainelli, jean-philippe,
	Jonathan.Cameron, james.quinlan, igor.skalkin, alex.bennee,
	Andriy.Tryshnivskyy, peter.hilber, etienne.carriere,
	mikhail.golubev, vincent.guittot, souvik.chakravarty,
	anton.yakovlev

On Mon, 12 Jul 2021 15:18:16 +0100, Cristian Marussi wrote:
> SCMI message headers carry a sequence number and such field is sized to
> allow for MSG_TOKEN_MAX distinct numbers; moreover zero is not really an
> acceptable maximum number of pending in-flight messages.
>
> Fix accordignly the checks performed on the value exported by transports
>in scmi_desc.max_msg.
>
> [...]

Applied to sudeep.holla/linux (for-next/scmi), thanks!

[02/17] firmware: arm_scmi: Fix max pending messages boundary check
        https://git.kernel.org/sudeep.holla/c/bdb8742dc6

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (16 preceding siblings ...)
  2021-07-12 14:18 ` [PATCH v6 17/17] firmware: arm_scmi: Add virtio transport Cristian Marussi
@ 2021-07-15 16:35 ` Peter Hilber
  2021-07-19 11:36   ` Cristian Marussi
  2021-08-11  9:31 ` Floris Westermann
  18 siblings, 1 reply; 48+ messages in thread
From: Peter Hilber @ 2021-07-15 16:35 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel, virtualization,
	virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 12.07.21 16:18, Cristian Marussi wrote:
> Hi all,
> 

Hi Cristian,

thanks for your update. Please find some additional comments in this 
reply and the following.

Best regards,

Peter

> While reworking this series starting from the work done up to V3 by
> OpenSynergy, I am keeping the original autorship and list distribution
> unchanged.
> 
> The main aim of this rework, as said, is to simplify where possible the
> SCMI VirtIO support added in V3 by adding at first some new general
> mechanisms in the SCMI Transport layer.
> 
> Indeed, after some initial small fixes, patches 05/06/07/08 add such new
> additional mechanisms to the SCMI core to ease implementation of more
> complex transports like virtio, while also addressing a few general issues
> already potentially affecting existing transports.
> 
> In terms of rework I dropped original V3 patches 05/06/07/08/12 as no more
> needed, and modified where needed the remaining original patches to take
> advantage of the above mentioned new SCMI transport features.
> 
> DT bindings patch has been ported on top of freshly YAML converted arm,scmi
> bindings.
> 
> Moreover, since V5 I dropped support for polling mode from the virtio-scmi
> transport, since it is an optional general mechanism provided by the core
> to allow transports lacking a completion IRQ to work and it seemed a
> needless addition/complication in the context of virtio transport.
> 

Just for correctness, in my understanding polling is not completely 
optional ATM. Polling would be required by scmi_cpufreq_fast_switch(). 
But that requirement might be irrelevant for now.

> Additionally, in V5 I could also simplify a bit the virtio transport
> probing sequence starting from the observation that, by the VirtIO spec,
> in fact, only one single SCMI VirtIO device can possibly exist on a system.
> 

I wouldn't say that the virtio spec restricts the # of virtio-scmi 
devices to one. But I do think the one device limitation in the kernel 
is acceptable.

> The series has been tested using an emulated fake SCMI device and also a
> proper SCP-fw stack running through QEMU vhost-users, with the SCMI stack
> compiled, in both cases, as builtin and as a loadable module, running tests
> against mocked SCMI Sensors using HWMON and IIO interfaces to check the
> functionality of notifications and sync/async commands.
> 
> Virtio-scmi support has been exercised in the following testing scenario
> on a JUNO board:
> 
>   - normal sync/async command transfers
>   - notifications
>   - concurrent delivery of correlated response and delayed responses
>   - out-of-order delivery of delayed responses before related responses
>   - unexpected delayed response delivery for sync commands
>   - late delivery of timed-out responses and delayed responses
> 
> Some basic regression testing against mailbox transport has been performed
> for commands and notifications too.
> 
> No sensible overhead in total handling time of commands and notifications
> has been observed, even though this series do indeed add a considerable
> amount of code to execute on TX path.
> More test and measurements could be needed in these regards.
> 
> This series is based on top of v5.14-rc1.
> 
> Any feedback/testing is welcome :D
> 
> Thanks,
> Cristian
> ---
> V5 --> V6:
>   - removed delegated xfers and its usage
>   - add and use *priv optional parameter in scmi_rx_callback()
>   - made .poll_done and .clear_channel ops optional
> 
> V4 --> V5:
>   - removed msg raw_payload helpers
>   - reworked msg helpers to use xfer->priv reference
>   - simplified SCMI device probe sequence (one static device)
>   - added new SCMI Kconfig layout
>   - removed SCMI virtio polling support
> 
> V3 --> V4:
>   - using new delegated xfers support and monotonically increasing tokens
>     in virtio transport
>   - ported SCMI virtio transport DT bindings to YAML format
>   - added virtio-scmi polling support
>   - added delegated xfers support
> 
> Cristian Marussi (11):
>    firmware: arm_scmi: Avoid padding in sensor message structure
>    firmware: arm_scmi: Fix max pending messages boundary check
>    firmware: arm_scmi: Add support for type handling in common functions
>    firmware: arm_scmi: Remove scmi_dump_header_dbg() helper
>    firmware: arm_scmi: Add transport optional init/exit support
>    firmware: arm_scmi: Introduce monotonically increasing tokens
>    firmware: arm_scmi: Handle concurrent and out-of-order messages
>    firmware: arm_scmi: Add priv parameter to scmi_rx_callback
>    firmware: arm_scmi: Make .clear_channel optional
>    firmware: arm_scmi: Make polling mode optional
>    firmware: arm_scmi: Make SCMI transports configurable
> 
> Igor Skalkin (4):
>    firmware: arm_scmi: Make shmem support optional for transports
>    firmware: arm_scmi: Add method to override max message number
>    dt-bindings: arm: Add virtio transport for SCMI
>    firmware: arm_scmi: Add virtio transport
> 
> Peter Hilber (2):
>    firmware: arm_scmi: Add message passing abstractions for transports
>    firmware: arm_scmi: Add optional link_supplier() transport op
> 
>   .../bindings/firmware/arm,scmi.yaml           |   8 +-
>   MAINTAINERS                                   |   1 +
>   drivers/firmware/Kconfig                      |  34 +-
>   drivers/firmware/arm_scmi/Kconfig             |  97 +++
>   drivers/firmware/arm_scmi/Makefile            |   8 +-
>   drivers/firmware/arm_scmi/common.h            |  94 ++-
>   drivers/firmware/arm_scmi/driver.c            | 651 +++++++++++++++---
>   drivers/firmware/arm_scmi/mailbox.c           |   2 +-
>   drivers/firmware/arm_scmi/msg.c               | 113 +++
>   drivers/firmware/arm_scmi/sensors.c           |   6 +-
>   drivers/firmware/arm_scmi/smc.c               |   3 +-
>   drivers/firmware/arm_scmi/virtio.c            | 491 +++++++++++++
>   include/uapi/linux/virtio_ids.h               |   1 +
>   include/uapi/linux/virtio_scmi.h              |  24 +
>   14 files changed, 1389 insertions(+), 144 deletions(-)
>   create mode 100644 drivers/firmware/arm_scmi/Kconfig
>   create mode 100644 drivers/firmware/arm_scmi/msg.c
>   create mode 100644 drivers/firmware/arm_scmi/virtio.c
>   create mode 100644 include/uapi/linux/virtio_scmi.h
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi
@ 2021-07-15 16:36   ` Peter Hilber
  2021-07-19  9:14     ` Cristian Marussi
  2021-08-02 10:10   ` Sudeep Holla
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Hilber @ 2021-07-15 16:36 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel, virtualization,
	virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 12.07.21 16:18, Cristian Marussi wrote:
> Even though in case of asynchronous commands an SCMI platform server is
> constrained to emit the delayed response message only after the related
> message response has been sent, the configured underlying transport could
> still deliver such messages together or in inverted order, causing races
> due to the concurrent or out-of-order access to the underlying xfer.
> 
> Introduce a mechanism to grant exclusive access to an xfer in order to
> properly serialize concurrent accesses to the same xfer originating from
> multiple correlated messages.
> 
> Add additional state information to xfer descriptors so as to be able to
> identify out-of-order message deliveries and act accordingly:
> 
>   - when a delayed response is expected but delivered before the related
>     response, the synchronous response is considered as successfully
>     received and the delayed response processing is carried on as usual.
> 
>   - when/if the missing synchronous response is subsequently received, it
>     is discarded as not congruent with the current state of the xfer, or
>     simply, because the xfer has been already released and so, now, the
>     monotonically increasing sequence number carried by the late response
>     is stale.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v5 --> v6
> - added spinlock comment
> ---
>   drivers/firmware/arm_scmi/common.h |  18 ++-
>   drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++-----
>   2 files changed, 212 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 2233d0a188fc..9efebe1406d2 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -19,6 +19,7 @@
>   #include <linux/module.h>
>   #include <linux/refcount.h>
>   #include <linux/scmi_protocol.h>
> +#include <linux/spinlock.h>
>   #include <linux/types.h>
>   
>   #include <asm/unaligned.h>
> @@ -145,6 +146,13 @@ struct scmi_msg {
>    * @pending: True for xfers added to @pending_xfers hashtable
>    * @node: An hlist_node reference used to store this xfer, alternatively, on
>    *	  the free list @free_xfers or in the @pending_xfers hashtable
> + * @busy: An atomic flag to ensure exclusive write access to this xfer
> + * @state: The current state of this transfer, with states transitions deemed
> + *	   valid being:
> + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ]
> + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
> + *	      (Missing synchronous response is assumed OK and ignored)
> + * @lock: A spinlock to protect state and busy fields.
>    */
>   struct scmi_xfer {
>   	int transfer_id;
> @@ -156,6 +164,15 @@ struct scmi_xfer {
>   	refcount_t users;
>   	bool pending;
>   	struct hlist_node node;
> +#define SCMI_XFER_FREE		0
> +#define SCMI_XFER_BUSY		1
> +	atomic_t busy;
> +#define SCMI_XFER_SENT_OK	0
> +#define SCMI_XFER_RESP_OK	1
> +#define SCMI_XFER_DRESP_OK	2
> +	int state;
> +	/* A lock to protect state and busy fields */
> +	spinlock_t lock;
>   };
>   
>   /*
> @@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
>   void scmi_notification_instance_data_set(const struct scmi_handle *handle,
>   					 void *priv);
>   void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> -
>   #endif /* _SCMI_COMMON_H */
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index 245ede223302..5ef33d692670 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
>   
>   	if (!IS_ERR(xfer)) {
>   		refcount_set(&xfer->users, 1);
> +		atomic_set(&xfer->busy, SCMI_XFER_FREE);
>   		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
>   	}
>   	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> @@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
>   	return xfer ?: ERR_PTR(-EINVAL);
>   }
>   
> +/**
> + * scmi_msg_response_validate  - Validate message type against state of related
> + * xfer
> + *
> + * @cinfo: A reference to the channel descriptor.
> + * @msg_type: Message type to check
> + * @xfer: A reference to the xfer to validate against @msg_type
> + *
> + * This function checks if @msg_type is congruent with the current state of
> + * a pending @xfer; if an asynchronous delayed response is received before the
> + * related synchronous response (Out-of-Order Delayed Response) the missing
> + * synchronous response is assumed to be OK and completed, carrying on with the
> + * Delayed Response: this is done to address the case in which the underlying
> + * SCMI transport can deliver such out-of-order responses.
> + *
> + * Context: Assumes to be called with xfer->lock already acquired.
> + *
> + * Return: 0 on Success, error otherwise
> + */
> +static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo,
> +					     u8 msg_type,
> +					     struct scmi_xfer *xfer)
> +{
> +	/*
> +	 * Even if a response was indeed expected on this slot at this point,
> +	 * a buggy platform could wrongly reply feeding us an unexpected
> +	 * delayed response we're not prepared to handle: bail-out safely
> +	 * blaming firmware.
> +	 */
> +	if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) {
> +		dev_err(cinfo->dev,
> +			"Delayed Response for %d not expected! Buggy F/W ?\n",
> +			xfer->hdr.seq);
> +		return -EINVAL;
> +	}
> +
> +	switch (xfer->state) {
> +	case SCMI_XFER_SENT_OK:
> +		if (msg_type == MSG_TYPE_DELAYED_RESP) {
> +			/*
> +			 * Delayed Response expected but delivered earlier.
> +			 * Assume message RESPONSE was OK and skip state.
> +			 */
> +			xfer->hdr.status = SCMI_SUCCESS;
> +			xfer->state = SCMI_XFER_RESP_OK;
> +			complete(&xfer->done);
> +			dev_warn(cinfo->dev,
> +				 "Received valid OoO Delayed Response for %d\n",
> +				 xfer->hdr.seq);
> +		}
> +		break;
> +	case SCMI_XFER_RESP_OK:
> +		if (msg_type != MSG_TYPE_DELAYED_RESP)
> +			return -EINVAL;
> +		break;
> +	case SCMI_XFER_DRESP_OK:
> +		/* No further message expected once in SCMI_XFER_DRESP_OK */
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool scmi_xfer_is_free(struct scmi_xfer *xfer)
> +{
> +	int ret;
> +
> +	ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY);

Naming: Rather unusual to change state in an _is_free() function, 
looking at other _is_free() functions in the kernel.

> +
> +	return ret == SCMI_XFER_FREE;
> +}
> +
> +/**
> + * scmi_xfer_command_acquire  -  Helper to lookup and acquire a command xfer
> + *
> + * @cinfo: A reference to the channel descriptor.
> + * @msg_hdr: A message header to use as lookup key
> + *
> + * When a valid xfer is found for the sequence number embedded in the provided
> + * msg_hdr, reference counting is properly updated and exclusive access to this
> + * xfer is granted till released with @scmi_xfer_command_release.
> + *
> + * Return: A valid @xfer on Success or error otherwise.
> + */
> +static inline struct scmi_xfer *
> +scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr)
> +{
> +	int ret;
> +	unsigned long flags;
> +	struct scmi_xfer *xfer;
> +	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> +	struct scmi_xfers_info *minfo = &info->tx_minfo;
> +	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
> +	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
> +
> +	/* Are we even expecting this? */
> +	spin_lock_irqsave(&minfo->xfer_lock, flags);
> +	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> +	if (IS_ERR(xfer)) {
> +		dev_err(cinfo->dev,
> +			"Message for %d type %d is not expected!\n",
> +			xfer_id, msg_type);
> +		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +		return xfer;
> +	}
> +	refcount_inc(&xfer->users);
> +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +
> +	spin_lock_irqsave(&xfer->lock, flags);
> +	ret = scmi_msg_response_validate(cinfo, msg_type, xfer);
> +	/*
> +	 * If a pending xfer was found which was also in a congruent state with
> +	 * the received message, acquire exclusive access to it setting the busy
> +	 * flag.
> +	 * Spins only on the rare limit condition of concurrent reception of
> +	 * RESP and DRESP for the same xfer.
> +	 */
> +	if (!ret) {
> +		spin_until_cond(scmi_xfer_is_free(xfer));

Maybe there should be an additional comment to indicate that the 
xfer->lock cannot be reaquired later during response processing, to 
avoid a deadlock in conjunction with the xfer->busy flag.

> +		xfer->hdr.type = msg_type;
> +	}
> +	spin_unlock_irqrestore(&xfer->lock, flags);
> +
> +	if (ret) {
> +		dev_err(cinfo->dev,
> +			"Invalid message type:%d for %d - HDR:0x%X  state:%d\n",
> +			msg_type, xfer_id, msg_hdr, xfer->state);
> +		/* On error the refcount incremented above has to be dropped */
> +		__scmi_xfer_put(minfo, xfer);
> +		xfer = ERR_PTR(-EINVAL);
> +	}
> +
> +	return xfer;
> +}
> +
> +static inline void scmi_xfer_command_release(struct scmi_info *info,
> +					     struct scmi_xfer *xfer)
> +{
> +	atomic_set(&xfer->busy, SCMI_XFER_FREE);
> +	__scmi_xfer_put(&info->tx_minfo, xfer);
> +}
> +
> +/**
> + * scmi_xfer_state_update  - Update xfer state
> + *
> + * @xfer: A reference to the xfer to update
> + *
> + * Context: Assumes to be called on an xfer exclusively acquired using the
> + *	    busy flag.
> + */
> +static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
> +{
> +	switch (xfer->hdr.type) {
> +	case MSG_TYPE_COMMAND:
> +		xfer->state = SCMI_XFER_RESP_OK;
> +		break;
> +	case MSG_TYPE_DELAYED_RESP:
> +		xfer->state = SCMI_XFER_DRESP_OK;
> +		break;
> +	}
> +}
> +
>   static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
>   {
>   	struct scmi_xfer *xfer;
> @@ -462,57 +625,37 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
>   	info->desc->ops->clear_channel(cinfo);
>   }
>   
> -static void scmi_handle_response(struct scmi_chan_info *cinfo,
> -				 u16 xfer_id, u8 msg_type)
> +static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
>   {
> -	unsigned long flags;
>   	struct scmi_xfer *xfer;
> -	struct device *dev = cinfo->dev;
>   	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> -	struct scmi_xfers_info *minfo = &info->tx_minfo;
>   
> -	/* Are we even expecting this? */
> -	spin_lock_irqsave(&minfo->xfer_lock, flags);
> -	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> -	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +	xfer = scmi_xfer_command_acquire(cinfo, msg_hdr);
>   	if (IS_ERR(xfer)) {
> -		dev_err(dev, "message for %d is not expected!\n", xfer_id);
>   		info->desc->ops->clear_channel(cinfo);
>   		return;
>   	}
>   
> -	/*
> -	 * Even if a response was indeed expected on this slot at this point,
> -	 * a buggy platform could wrongly reply feeding us an unexpected
> -	 * delayed response we're not prepared to handle: bail-out safely
> -	 * blaming firmware.
> -	 */
> -	if (unlikely(msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done)) {
> -		dev_err(dev,
> -			"Delayed Response for %d not expected! Buggy F/W ?\n",
> -			xfer_id);
> -		info->desc->ops->clear_channel(cinfo);
> -		/* It was unexpected, so nobody will clear the xfer if not us */
> -		__scmi_xfer_put(minfo, xfer);
> -		return;
> -	}
> +	scmi_xfer_state_update(xfer);

Since this update is not protected by the xfer->lock any more, it may 
not become visible in time to a concurrent response which is checking 
and possibly updating state in scmi_msg_response_validate(). I think 
this should be avoided, even if it might not cause practical problems ATM.

>   
>   	/* rx.len could be shrunk in the sync do_xfer, so reset to maxsz */
> -	if (msg_type == MSG_TYPE_DELAYED_RESP)
> +	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP)
>   		xfer->rx.len = info->desc->max_msg_size;
>   
>   	info->desc->ops->fetch_response(cinfo, xfer);
>   
>   	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
>   			   xfer->hdr.protocol_id, xfer->hdr.seq,
> -			   msg_type);
> +			   xfer->hdr.type);
>   
> -	if (msg_type == MSG_TYPE_DELAYED_RESP) {
> +	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP) {
>   		info->desc->ops->clear_channel(cinfo);
>   		complete(xfer->async_done);
>   	} else {
>   		complete(&xfer->done);
>   	}
> +
> +	scmi_xfer_command_release(info, xfer);
>   }
>   
>   /**
> @@ -529,7 +672,6 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
>    */
>   void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
>   {
> -	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
>   	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
>   
>   	switch (msg_type) {
> @@ -538,7 +680,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
>   		break;
>   	case MSG_TYPE_COMMAND:
>   	case MSG_TYPE_DELAYED_RESP:
> -		scmi_handle_response(cinfo, xfer_id, msg_type);
> +		scmi_handle_response(cinfo, msg_hdr);
>   		break;
>   	default:
>   		WARN_ONCE(1, "received unknown msg_type:%d\n", msg_type);
> @@ -550,7 +692,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
>    * xfer_put() - Release a transmit message
>    *
>    * @ph: Pointer to SCMI protocol handle
> - * @xfer: message that was reserved by scmi_xfer_get
> + * @xfer: message that was reserved by xfer_get_init
>    */
>   static void xfer_put(const struct scmi_protocol_handle *ph,
>   		     struct scmi_xfer *xfer)
> @@ -568,7 +710,12 @@ static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
>   {
>   	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
>   
> +	/*
> +	 * Poll also on xfer->done so that polling can be forcibly terminated
> +	 * in case of out-of-order receptions of delayed responses
> +	 */
>   	return info->desc->ops->poll_done(cinfo, xfer) ||
> +	       try_wait_for_completion(&xfer->done) ||
>   	       ktime_after(ktime_get(), stop);
>   }
>   
> @@ -608,6 +755,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>   			      xfer->hdr.protocol_id, xfer->hdr.seq,
>   			      xfer->hdr.poll_completion);
>   
> +	xfer->state = SCMI_XFER_SENT_OK;

To be completely safe, this assignment could also be protected by the 
xfer->lock.

>   	ret = info->desc->ops->send_message(cinfo, xfer);
>   	if (ret < 0) {
>   		dev_dbg(dev, "Failed to send message %d\n", ret);
> @@ -619,10 +767,22 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>   
>   		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
>   
> -		if (ktime_before(ktime_get(), stop))
> -			info->desc->ops->fetch_response(cinfo, xfer);
> -		else
> +		if (ktime_before(ktime_get(), stop)) {
> +			unsigned long flags;
> +
> +			/*
> +			 * Do not fetch_response if an out-of-order delayed
> +			 * response is being processed.
> +			 */
> +			spin_lock_irqsave(&xfer->lock, flags);
> +			if (xfer->state == SCMI_XFER_SENT_OK) {
> +				info->desc->ops->fetch_response(cinfo, xfer);
> +				xfer->state = SCMI_XFER_RESP_OK;
> +			}
> +			spin_unlock_irqrestore(&xfer->lock, flags);
> +		} else {
>   			ret = -ETIMEDOUT;
> +		}
>   	} else {
>   		/* And we wait for the response. */
>   		timeout = msecs_to_jiffies(info->desc->max_rx_timeout_ms);
> @@ -1220,6 +1380,7 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
>   
>   		xfer->tx.buf = xfer->rx.buf;
>   		init_completion(&xfer->done);
> +		spin_lock_init(&xfer->lock);
>   
>   		/* Add initialized xfer to the free list */
>   		hlist_add_head(&xfer->node, &info->free_xfers);
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional
  2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi
@ 2021-07-15 16:36   ` Peter Hilber
  2021-07-19  9:15     ` Cristian Marussi
  2021-07-28 14:34   ` Sudeep Holla
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Hilber @ 2021-07-15 16:36 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel, virtualization,
	virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 12.07.21 16:18, Cristian Marussi wrote:
> Add a check for the presence of .poll_done transport operation so that
> transports that do not need to support polling mode have no need to provide
> a dummy .poll_done callback either and polling mode can be disabled in the
> SCMI core for that tranport.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
>   drivers/firmware/arm_scmi/driver.c | 43 ++++++++++++++++++------------
>   1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index a952b6527b8a..4183d25c9289 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -777,25 +777,34 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>   	}
>   
>   	if (xfer->hdr.poll_completion) {
> -		ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
> -
> -		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
> -
> -		if (ktime_before(ktime_get(), stop)) {
> -			unsigned long flags;
> -
> -			/*
> -			 * Do not fetch_response if an out-of-order delayed
> -			 * response is being processed.
> -			 */
> -			spin_lock_irqsave(&xfer->lock, flags);
> -			if (xfer->state == SCMI_XFER_SENT_OK) {
> -				info->desc->ops->fetch_response(cinfo, xfer);
> -				xfer->state = SCMI_XFER_RESP_OK;
> +		if (info->desc->ops->poll_done) {
> +			ktime_t stop = ktime_add_ns(ktime_get(),
> +						    SCMI_MAX_POLL_TO_NS);
> +
> +			spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer,
> +								  stop));
> +
> +			if (ktime_before(ktime_get(), stop)) {
> +				unsigned long flags;
> +
> +				/*
> +				 * Do not fetch_response if an out-of-order delayed
> +				 * response is being processed.
> +				 */
> +				spin_lock_irqsave(&xfer->lock, flags);
> +				if (xfer->state == SCMI_XFER_SENT_OK) {
> +					info->desc->ops->fetch_response(cinfo,
> +									xfer);
> +					xfer->state = SCMI_XFER_RESP_OK;
> +				}
> +				spin_unlock_irqrestore(&xfer->lock, flags);
> +			} else {
> +				ret = -ETIMEDOUT;
>   			}
> -			spin_unlock_irqrestore(&xfer->lock, flags);
>   		} else {
> -			ret = -ETIMEDOUT;
> +			dev_warn_once(dev,
> +				      "Polling mode is not supported by transport.\n");
> +			ret = EINVAL;

s/EINVAL/-EINVAL

>   		}
>   	} else {
>   		/* And we wait for the response. */
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports
  2021-07-12 14:18 ` [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports Cristian Marussi
@ 2021-07-15 16:36   ` Peter Hilber
  2021-07-19  9:16     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Hilber @ 2021-07-15 16:36 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel, virtualization,
	virtio-dev
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 12.07.21 16:18, Cristian Marussi wrote:
> From: Peter Hilber <peter.hilber@opensynergy.com>
> 
> Add abstractions for future transports using message passing, such as
> virtio. Derive the abstractions from the shared memory abstractions.
> 
> Abstract the transport SDU through the opaque struct scmi_msg_payld.
> Also enable the transport to determine all other required information
> about the transport SDU.
> 
> Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
> [ Cristian: Adapted to new SCMI Kconfig layout, updated Copyrigths ]
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v4 --> v5
> - adapted to new SCMI Kconfig
> - removed raw_payload msg helpers
> v3 --> v4
> - added raw_payload msg helpers
> ---
>   drivers/firmware/arm_scmi/Kconfig  |   6 ++
>   drivers/firmware/arm_scmi/Makefile |   1 +
>   drivers/firmware/arm_scmi/common.h |  15 ++++
>   drivers/firmware/arm_scmi/msg.c    | 113 +++++++++++++++++++++++++++++
>   4 files changed, 135 insertions(+)
>   create mode 100644 drivers/firmware/arm_scmi/msg.c
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index ee6517b24080..1fdaa8ad7d3f 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -34,6 +34,12 @@ config ARM_SCMI_HAVE_SHMEM
>   	  This declares whether a shared memory based transport for SCMI is
>   	  available.
>   
> +config ARM_SCMI_HAVE_MSG
> +	bool
> +	help
> +	  This declares whether a message passing based transport for SCMI is
> +	  available.
> +
>   config ARM_SCMI_TRANSPORT_MAILBOX
>   	bool "SCMI transport based on Mailbox"
>   	depends on ARM_SCMI_PROTOCOL && MAILBOX
> diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
> index e0e6bd3dba9e..aaad9f6589aa 100644
> --- a/drivers/firmware/arm_scmi/Makefile
> +++ b/drivers/firmware/arm_scmi/Makefile
> @@ -4,6 +4,7 @@ scmi-driver-y = driver.o notify.o
>   scmi-transport-$(CONFIG_ARM_SCMI_HAVE_SHMEM) = shmem.o
>   scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
>   scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
> +scmi-transport-$(CONFIG_ARM_SCMI_HAVE_MSG) += msg.o
>   scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
>   scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
>   		    $(scmi-transport-y)
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 14457f0d5dea..7a1e84dc191b 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -415,6 +415,21 @@ void shmem_clear_channel(struct scmi_shared_mem __iomem *shmem);
>   bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
>   		     struct scmi_xfer *xfer);
>   
> +/* declarations for message passing transports */
> +struct scmi_msg_payld;
> +
> +/* Maximum overhead of message w.r.t. struct scmi_desc.max_msg_size */
> +#define SCMI_MSG_MAX_PROT_OVERHEAD (2 * sizeof(__le32))
> +
> +size_t msg_response_size(struct scmi_xfer *xfer);
> +size_t msg_command_size(struct scmi_xfer *xfer);
> +void msg_tx_prepare(struct scmi_msg_payld *msg, struct scmi_xfer *xfer);
> +u32 msg_read_header(struct scmi_msg_payld *msg);
> +void msg_fetch_response(struct scmi_msg_payld *msg, size_t len,
> +			struct scmi_xfer *xfer);
> +void msg_fetch_notification(struct scmi_msg_payld *msg, size_t len,
> +			    size_t max_len, struct scmi_xfer *xfer);
> +
>   void scmi_notification_instance_data_set(const struct scmi_handle *handle,
>   					 void *priv);
>   void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> diff --git a/drivers/firmware/arm_scmi/msg.c b/drivers/firmware/arm_scmi/msg.c
> new file mode 100644
> index 000000000000..639969b7dc10
> --- /dev/null
> +++ b/drivers/firmware/arm_scmi/msg.c
> @@ -0,0 +1,113 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * For transports using message passing.
> + *
> + * Derived from shm.c.
> + *
> + * Copyright (C) 2019-2021 ARM Ltd.
> + * Copyright (C) 2020-2021 OpenSynergy GmbH
> + */
> +
> +#include <linux/io.h>
> +#include <linux/processor.h>

The above two includes appear to be unnecessary.

> +#include <linux/types.h>
> +
> +#include "common.h"
> +
> +/*
> + * struct scmi_msg_payld - Transport SDU layout
> + *
> + * The SCMI specification requires all parameters, message headers, return
> + * arguments or any protocol data to be expressed in little endian format only.
> + */
> +struct scmi_msg_payld {
> +	__le32 msg_header;
> +	__le32 msg_payload[];
> +};
> +
> +/**
> + * msg_command_size() - Actual size of transport SDU for command.
> + *
> + * @xfer: message which core has prepared for sending
> + *
> + * Return: transport SDU size.
> + */
> +size_t msg_command_size(struct scmi_xfer *xfer)
> +{
> +	return sizeof(struct scmi_msg_payld) + xfer->tx.len;
> +}
> +
> +/**
> + * msg_response_size() - Maximum size of transport SDU for response.
> + *
> + * @xfer: message which core has prepared for sending
> + *
> + * Return: transport SDU size.
> + */
> +size_t msg_response_size(struct scmi_xfer *xfer)
> +{
> +	return sizeof(struct scmi_msg_payld) + sizeof(__le32) + xfer->rx.len;
> +}
> +
> +/**
> + * msg_tx_prepare() - Set up transport SDU for command.
> + *
> + * @msg: transport SDU for command
> + * @xfer: message which is being sent
> + */
> +void msg_tx_prepare(struct scmi_msg_payld *msg, struct scmi_xfer *xfer)
> +{
> +	msg->msg_header = cpu_to_le32(pack_scmi_header(&xfer->hdr));
> +	if (xfer->tx.buf)
> +		memcpy(msg->msg_payload, xfer->tx.buf, xfer->tx.len);
> +}
> +
> +/**
> + * msg_read_header() - Read SCMI header from transport SDU.
> + *
> + * @msg: transport SDU
> + *
> + * Return: SCMI header
> + */
> +u32 msg_read_header(struct scmi_msg_payld *msg)
> +{
> +	return le32_to_cpu(msg->msg_header);
> +}
> +
> +/**
> + * msg_fetch_response() - Fetch response SCMI payload from transport SDU.
> + *
> + * @msg: transport SDU with response
> + * @len: transport SDU size
> + * @xfer: message being responded to
> + */
> +void msg_fetch_response(struct scmi_msg_payld *msg, size_t len,
> +			struct scmi_xfer *xfer)
> +{
> +	size_t prefix_len = sizeof(*msg) + sizeof(msg->msg_payload[0]);
> +
> +	xfer->hdr.status = le32_to_cpu(msg->msg_payload[0]);
> +	xfer->rx.len = min_t(size_t, xfer->rx.len,
> +			     len >= prefix_len ? len - prefix_len : 0);
> +
> +	/* Take a copy to the rx buffer.. */
> +	memcpy(xfer->rx.buf, &msg->msg_payload[1], xfer->rx.len);
> +}
> +
> +/**
> + * msg_fetch_notification() - Fetch notification payload from transport SDU.
> + *
> + * @msg: transport SDU with notification
> + * @len: transport SDU size
> + * @max_len: maximum SCMI payload size to fetch
> + * @xfer: notification message
> + */
> +void msg_fetch_notification(struct scmi_msg_payld *msg, size_t len,
> +			    size_t max_len, struct scmi_xfer *xfer)
> +{
> +	xfer->rx.len = min_t(size_t, max_len,
> +			     len >= sizeof(*msg) ? len - sizeof(*msg) : 0);
> +
> +	/* Take a copy to the rx buffer.. */
> +	memcpy(xfer->rx.buf, msg->msg_payload, xfer->rx.len);
> +}
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-15 16:36   ` Peter Hilber
@ 2021-07-19  9:14     ` Cristian Marussi
  2021-07-22  8:32       ` Peter Hilber
  0 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-19  9:14 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Thu, Jul 15, 2021 at 06:36:03PM +0200, Peter Hilber wrote:
> On 12.07.21 16:18, Cristian Marussi wrote:
> > Even though in case of asynchronous commands an SCMI platform server is
> > constrained to emit the delayed response message only after the related
> > message response has been sent, the configured underlying transport could
> > still deliver such messages together or in inverted order, causing races
> > due to the concurrent or out-of-order access to the underlying xfer.
> > 
> > Introduce a mechanism to grant exclusive access to an xfer in order to
> > properly serialize concurrent accesses to the same xfer originating from
> > multiple correlated messages.
> > 
> > Add additional state information to xfer descriptors so as to be able to
> > identify out-of-order message deliveries and act accordingly:
> > 
> >   - when a delayed response is expected but delivered before the related
> >     response, the synchronous response is considered as successfully
> >     received and the delayed response processing is carried on as usual.
> > 
> >   - when/if the missing synchronous response is subsequently received, it
> >     is discarded as not congruent with the current state of the xfer, or
> >     simply, because the xfer has been already released and so, now, the
> >     monotonically increasing sequence number carried by the late response
> >     is stale.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---

Hi Peter,

thanks for the review, I replied online down below.

> > v5 --> v6
> > - added spinlock comment
> > ---
> >   drivers/firmware/arm_scmi/common.h |  18 ++-
> >   drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++-----
> >   2 files changed, 212 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 2233d0a188fc..9efebe1406d2 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -19,6 +19,7 @@
> >   #include <linux/module.h>
> >   #include <linux/refcount.h>
> >   #include <linux/scmi_protocol.h>
> > +#include <linux/spinlock.h>
> >   #include <linux/types.h>
> >   #include <asm/unaligned.h>
> > @@ -145,6 +146,13 @@ struct scmi_msg {
> >    * @pending: True for xfers added to @pending_xfers hashtable
> >    * @node: An hlist_node reference used to store this xfer, alternatively, on
> >    *	  the free list @free_xfers or in the @pending_xfers hashtable
> > + * @busy: An atomic flag to ensure exclusive write access to this xfer
> > + * @state: The current state of this transfer, with states transitions deemed
> > + *	   valid being:
> > + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ]
> > + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
> > + *	      (Missing synchronous response is assumed OK and ignored)
> > + * @lock: A spinlock to protect state and busy fields.
> >    */
> >   struct scmi_xfer {
> >   	int transfer_id;
> > @@ -156,6 +164,15 @@ struct scmi_xfer {
> >   	refcount_t users;
> >   	bool pending;
> >   	struct hlist_node node;
> > +#define SCMI_XFER_FREE		0
> > +#define SCMI_XFER_BUSY		1
> > +	atomic_t busy;
> > +#define SCMI_XFER_SENT_OK	0
> > +#define SCMI_XFER_RESP_OK	1
> > +#define SCMI_XFER_DRESP_OK	2
> > +	int state;
> > +	/* A lock to protect state and busy fields */
> > +	spinlock_t lock;
> >   };
> >   /*
> > @@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
> >   void scmi_notification_instance_data_set(const struct scmi_handle *handle,
> >   					 void *priv);
> >   void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> > -
> >   #endif /* _SCMI_COMMON_H */
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index 245ede223302..5ef33d692670 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> >   	if (!IS_ERR(xfer)) {
> >   		refcount_set(&xfer->users, 1);
> > +		atomic_set(&xfer->busy, SCMI_XFER_FREE);
> >   		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> >   	}
> >   	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > @@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
> >   	return xfer ?: ERR_PTR(-EINVAL);
> >   }
> > +/**
> > + * scmi_msg_response_validate  - Validate message type against state of related
> > + * xfer
> > + *
> > + * @cinfo: A reference to the channel descriptor.
> > + * @msg_type: Message type to check
> > + * @xfer: A reference to the xfer to validate against @msg_type
> > + *
> > + * This function checks if @msg_type is congruent with the current state of
> > + * a pending @xfer; if an asynchronous delayed response is received before the
> > + * related synchronous response (Out-of-Order Delayed Response) the missing
> > + * synchronous response is assumed to be OK and completed, carrying on with the
> > + * Delayed Response: this is done to address the case in which the underlying
> > + * SCMI transport can deliver such out-of-order responses.
> > + *
> > + * Context: Assumes to be called with xfer->lock already acquired.
> > + *
> > + * Return: 0 on Success, error otherwise
> > + */
> > +static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo,
> > +					     u8 msg_type,
> > +					     struct scmi_xfer *xfer)
> > +{
> > +	/*
> > +	 * Even if a response was indeed expected on this slot at this point,
> > +	 * a buggy platform could wrongly reply feeding us an unexpected
> > +	 * delayed response we're not prepared to handle: bail-out safely
> > +	 * blaming firmware.
> > +	 */
> > +	if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) {
> > +		dev_err(cinfo->dev,
> > +			"Delayed Response for %d not expected! Buggy F/W ?\n",
> > +			xfer->hdr.seq);
> > +		return -EINVAL;
> > +	}
> > +
> > +	switch (xfer->state) {
> > +	case SCMI_XFER_SENT_OK:
> > +		if (msg_type == MSG_TYPE_DELAYED_RESP) {
> > +			/*
> > +			 * Delayed Response expected but delivered earlier.
> > +			 * Assume message RESPONSE was OK and skip state.
> > +			 */
> > +			xfer->hdr.status = SCMI_SUCCESS;
> > +			xfer->state = SCMI_XFER_RESP_OK;
> > +			complete(&xfer->done);
> > +			dev_warn(cinfo->dev,
> > +				 "Received valid OoO Delayed Response for %d\n",
> > +				 xfer->hdr.seq);
> > +		}
> > +		break;
> > +	case SCMI_XFER_RESP_OK:
> > +		if (msg_type != MSG_TYPE_DELAYED_RESP)
> > +			return -EINVAL;
> > +		break;
> > +	case SCMI_XFER_DRESP_OK:
> > +		/* No further message expected once in SCMI_XFER_DRESP_OK */
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static bool scmi_xfer_is_free(struct scmi_xfer *xfer)
> > +{
> > +	int ret;
> > +
> > +	ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY);
> 
> Naming: Rather unusual to change state in an _is_free() function, looking at
> other _is_free() functions in the kernel.
> 

Indeed, but this naming was meant to reflect more the fact that is called
from spin_until_cond(is_free). The use of the atomic cmpxchg trick on
the busy flag is an attempt to limit as much as possible the size of the
spinlocked section.

> > +
> > +	return ret == SCMI_XFER_FREE;
> > +}
> > +
> > +/**
> > + * scmi_xfer_command_acquire  -  Helper to lookup and acquire a command xfer
> > + *
> > + * @cinfo: A reference to the channel descriptor.
> > + * @msg_hdr: A message header to use as lookup key
> > + *
> > + * When a valid xfer is found for the sequence number embedded in the provided
> > + * msg_hdr, reference counting is properly updated and exclusive access to this
> > + * xfer is granted till released with @scmi_xfer_command_release.
> > + *
> > + * Return: A valid @xfer on Success or error otherwise.
> > + */
> > +static inline struct scmi_xfer *
> > +scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr)
> > +{
> > +	int ret;
> > +	unsigned long flags;
> > +	struct scmi_xfer *xfer;
> > +	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> > +	struct scmi_xfers_info *minfo = &info->tx_minfo;
> > +	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
> > +	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
> > +
> > +	/* Are we even expecting this? */
> > +	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > +	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> > +	if (IS_ERR(xfer)) {
> > +		dev_err(cinfo->dev,
> > +			"Message for %d type %d is not expected!\n",
> > +			xfer_id, msg_type);
> > +		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > +		return xfer;
> > +	}
> > +	refcount_inc(&xfer->users);
> > +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > +
> > +	spin_lock_irqsave(&xfer->lock, flags);
> > +	ret = scmi_msg_response_validate(cinfo, msg_type, xfer);
> > +	/*
> > +	 * If a pending xfer was found which was also in a congruent state with
> > +	 * the received message, acquire exclusive access to it setting the busy
> > +	 * flag.
> > +	 * Spins only on the rare limit condition of concurrent reception of
> > +	 * RESP and DRESP for the same xfer.
> > +	 */
> > +	if (!ret) {
> > +		spin_until_cond(scmi_xfer_is_free(xfer));
> 
> Maybe there should be an additional comment to indicate that the xfer->lock
> cannot be reaquired later during response processing, to avoid a deadlock in
> conjunction with the xfer->busy flag.
> 
I'll add that. Thanks.

> > +		xfer->hdr.type = msg_type;
> > +	}
> > +	spin_unlock_irqrestore(&xfer->lock, flags);
> > +
> > +	if (ret) {
> > +		dev_err(cinfo->dev,
> > +			"Invalid message type:%d for %d - HDR:0x%X  state:%d\n",
> > +			msg_type, xfer_id, msg_hdr, xfer->state);
> > +		/* On error the refcount incremented above has to be dropped */
> > +		__scmi_xfer_put(minfo, xfer);
> > +		xfer = ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	return xfer;
> > +}
> > +
> > +static inline void scmi_xfer_command_release(struct scmi_info *info,
> > +					     struct scmi_xfer *xfer)
> > +{
> > +	atomic_set(&xfer->busy, SCMI_XFER_FREE);
> > +	__scmi_xfer_put(&info->tx_minfo, xfer);
> > +}
> > +
> > +/**
> > + * scmi_xfer_state_update  - Update xfer state
> > + *
> > + * @xfer: A reference to the xfer to update
> > + *
> > + * Context: Assumes to be called on an xfer exclusively acquired using the
> > + *	    busy flag.
> > + */
> > +static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
> > +{
> > +	switch (xfer->hdr.type) {
> > +	case MSG_TYPE_COMMAND:
> > +		xfer->state = SCMI_XFER_RESP_OK;
> > +		break;
> > +	case MSG_TYPE_DELAYED_RESP:
> > +		xfer->state = SCMI_XFER_DRESP_OK;
> > +		break;
> > +	}
> > +}
> > +
> >   static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >   {
> >   	struct scmi_xfer *xfer;
> > @@ -462,57 +625,37 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >   	info->desc->ops->clear_channel(cinfo);
> >   }
> > -static void scmi_handle_response(struct scmi_chan_info *cinfo,
> > -				 u16 xfer_id, u8 msg_type)
> > +static void scmi_handle_response(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >   {
> > -	unsigned long flags;
> >   	struct scmi_xfer *xfer;
> > -	struct device *dev = cinfo->dev;
> >   	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> > -	struct scmi_xfers_info *minfo = &info->tx_minfo;
> > -	/* Are we even expecting this? */
> > -	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > -	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> > -	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > +	xfer = scmi_xfer_command_acquire(cinfo, msg_hdr);
> >   	if (IS_ERR(xfer)) {
> > -		dev_err(dev, "message for %d is not expected!\n", xfer_id);
> >   		info->desc->ops->clear_channel(cinfo);
> >   		return;
> >   	}
> > -	/*
> > -	 * Even if a response was indeed expected on this slot at this point,
> > -	 * a buggy platform could wrongly reply feeding us an unexpected
> > -	 * delayed response we're not prepared to handle: bail-out safely
> > -	 * blaming firmware.
> > -	 */
> > -	if (unlikely(msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done)) {
> > -		dev_err(dev,
> > -			"Delayed Response for %d not expected! Buggy F/W ?\n",
> > -			xfer_id);
> > -		info->desc->ops->clear_channel(cinfo);
> > -		/* It was unexpected, so nobody will clear the xfer if not us */
> > -		__scmi_xfer_put(minfo, xfer);
> > -		return;
> > -	}
> > +	scmi_xfer_state_update(xfer);
> 
> Since this update is not protected by the xfer->lock any more, it may not
> become visible in time to a concurrent response which is checking and
> possibly updating state in scmi_msg_response_validate(). I think this should
> be avoided, even if it might not cause practical problems ATM.
> 

Right. I missed this, and looking back at this code it's also awkward
indeed to have a status update disjoint from the msg state validation.
(I think is a left over from using delegated xfers...my bad)
For V7 I moved scmi_xfer_state_update() inside the spinlocked section
in scmi_xfer_command_acquire() right after the busy flag is set.

> >   	/* rx.len could be shrunk in the sync do_xfer, so reset to maxsz */
> > -	if (msg_type == MSG_TYPE_DELAYED_RESP)
> > +	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP)
> >   		xfer->rx.len = info->desc->max_msg_size;
> >   	info->desc->ops->fetch_response(cinfo, xfer);
> >   	trace_scmi_rx_done(xfer->transfer_id, xfer->hdr.id,
> >   			   xfer->hdr.protocol_id, xfer->hdr.seq,
> > -			   msg_type);
> > +			   xfer->hdr.type);
> > -	if (msg_type == MSG_TYPE_DELAYED_RESP) {
> > +	if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP) {
> >   		info->desc->ops->clear_channel(cinfo);
> >   		complete(xfer->async_done);
> >   	} else {
> >   		complete(&xfer->done);
> >   	}
> > +
> > +	scmi_xfer_command_release(info, xfer);
> >   }
> >   /**
> > @@ -529,7 +672,6 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
> >    */
> >   void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >   {
> > -	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
> >   	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
> >   	switch (msg_type) {
> > @@ -538,7 +680,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >   		break;
> >   	case MSG_TYPE_COMMAND:
> >   	case MSG_TYPE_DELAYED_RESP:
> > -		scmi_handle_response(cinfo, xfer_id, msg_type);
> > +		scmi_handle_response(cinfo, msg_hdr);
> >   		break;
> >   	default:
> >   		WARN_ONCE(1, "received unknown msg_type:%d\n", msg_type);
> > @@ -550,7 +692,7 @@ void scmi_rx_callback(struct scmi_chan_info *cinfo, u32 msg_hdr)
> >    * xfer_put() - Release a transmit message
> >    *
> >    * @ph: Pointer to SCMI protocol handle
> > - * @xfer: message that was reserved by scmi_xfer_get
> > + * @xfer: message that was reserved by xfer_get_init
> >    */
> >   static void xfer_put(const struct scmi_protocol_handle *ph,
> >   		     struct scmi_xfer *xfer)
> > @@ -568,7 +710,12 @@ static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
> >   {
> >   	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> > +	/*
> > +	 * Poll also on xfer->done so that polling can be forcibly terminated
> > +	 * in case of out-of-order receptions of delayed responses
> > +	 */
> >   	return info->desc->ops->poll_done(cinfo, xfer) ||
> > +	       try_wait_for_completion(&xfer->done) ||
> >   	       ktime_after(ktime_get(), stop);
> >   }
> > @@ -608,6 +755,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
> >   			      xfer->hdr.protocol_id, xfer->hdr.seq,
> >   			      xfer->hdr.poll_completion);
> > +	xfer->state = SCMI_XFER_SENT_OK;
> 
> To be completely safe, this assignment could also be protected by the
> xfer->lock.
> 

In fact this would be true being xfer->lock meant to protect the state but it
seemed to me unnecessary here given that this is a brand new xfer with a
brand new (monotonic) seq number so that any possibly late-received msg will
carry an old stale seq number certainly different from this such that cannot be
possibly mapped to this same xfer. (but just discarded on xfer lookup in
xfer_command_acquire)

The issue indeed could still exist only for do_xfer loops (as you pointed out
already early on) where the seq_num is used, but in that case on a timeout we
would have already bailed out of the loop and reported an error so any timed-out
late received response would have been anyway discarded; so at the end I thought
I could avoid spinlocking here.

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional
  2021-07-15 16:36   ` Peter Hilber
@ 2021-07-19  9:15     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-19  9:15 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Thu, Jul 15, 2021 at 06:36:10PM +0200, Peter Hilber wrote:
> On 12.07.21 16:18, Cristian Marussi wrote:
> > Add a check for the presence of .poll_done transport operation so that
> > transports that do not need to support polling mode have no need to provide
> > a dummy .poll_done callback either and polling mode can be disabled in the
> > SCMI core for that tranport.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> >   drivers/firmware/arm_scmi/driver.c | 43 ++++++++++++++++++------------
> >   1 file changed, 26 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index a952b6527b8a..4183d25c9289 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -777,25 +777,34 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
> >   	}
> >   	if (xfer->hdr.poll_completion) {
> > -		ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
> > -
> > -		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
> > -
> > -		if (ktime_before(ktime_get(), stop)) {
> > -			unsigned long flags;
> > -
> > -			/*
> > -			 * Do not fetch_response if an out-of-order delayed
> > -			 * response is being processed.
> > -			 */
> > -			spin_lock_irqsave(&xfer->lock, flags);
> > -			if (xfer->state == SCMI_XFER_SENT_OK) {
> > -				info->desc->ops->fetch_response(cinfo, xfer);
> > -				xfer->state = SCMI_XFER_RESP_OK;
> > +		if (info->desc->ops->poll_done) {
> > +			ktime_t stop = ktime_add_ns(ktime_get(),
> > +						    SCMI_MAX_POLL_TO_NS);
> > +
> > +			spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer,
> > +								  stop));
> > +
> > +			if (ktime_before(ktime_get(), stop)) {
> > +				unsigned long flags;
> > +
> > +				/*
> > +				 * Do not fetch_response if an out-of-order delayed
> > +				 * response is being processed.
> > +				 */
> > +				spin_lock_irqsave(&xfer->lock, flags);
> > +				if (xfer->state == SCMI_XFER_SENT_OK) {
> > +					info->desc->ops->fetch_response(cinfo,
> > +									xfer);
> > +					xfer->state = SCMI_XFER_RESP_OK;
> > +				}
> > +				spin_unlock_irqrestore(&xfer->lock, flags);
> > +			} else {
> > +				ret = -ETIMEDOUT;
> >   			}
> > -			spin_unlock_irqrestore(&xfer->lock, flags);
> >   		} else {
> > -			ret = -ETIMEDOUT;
> > +			dev_warn_once(dev,
> > +				      "Polling mode is not supported by transport.\n");
> > +			ret = EINVAL;
> 
> s/EINVAL/-EINVAL

Ah. Right. I'll fix.

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports
  2021-07-15 16:36   ` Peter Hilber
@ 2021-07-19  9:16     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-19  9:16 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Thu, Jul 15, 2021 at 06:36:26PM +0200, Peter Hilber wrote:
> On 12.07.21 16:18, Cristian Marussi wrote:
> > From: Peter Hilber <peter.hilber@opensynergy.com>
> > 
> > Add abstractions for future transports using message passing, such as
> > virtio. Derive the abstractions from the shared memory abstractions.
> > 
> > Abstract the transport SDU through the opaque struct scmi_msg_payld.
> > Also enable the transport to determine all other required information
> > about the transport SDU.
> > 
> > Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
> > [ Cristian: Adapted to new SCMI Kconfig layout, updated Copyrigths ]
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v4 --> v5
> > - adapted to new SCMI Kconfig
> > - removed raw_payload msg helpers
> > v3 --> v4
> > - added raw_payload msg helpers
> > ---
> >   drivers/firmware/arm_scmi/Kconfig  |   6 ++
> >   drivers/firmware/arm_scmi/Makefile |   1 +
> >   drivers/firmware/arm_scmi/common.h |  15 ++++
> >   drivers/firmware/arm_scmi/msg.c    | 113 +++++++++++++++++++++++++++++
> >   4 files changed, 135 insertions(+)
> >   create mode 100644 drivers/firmware/arm_scmi/msg.c
> > 
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index ee6517b24080..1fdaa8ad7d3f 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -34,6 +34,12 @@ config ARM_SCMI_HAVE_SHMEM
> >   	  This declares whether a shared memory based transport for SCMI is
> >   	  available.
> > +config ARM_SCMI_HAVE_MSG
> > +	bool
> > +	help
> > +	  This declares whether a message passing based transport for SCMI is
> > +	  available.
> > +
> >   config ARM_SCMI_TRANSPORT_MAILBOX
> >   	bool "SCMI transport based on Mailbox"
> >   	depends on ARM_SCMI_PROTOCOL && MAILBOX
> > diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
> > index e0e6bd3dba9e..aaad9f6589aa 100644
> > --- a/drivers/firmware/arm_scmi/Makefile
> > +++ b/drivers/firmware/arm_scmi/Makefile
> > @@ -4,6 +4,7 @@ scmi-driver-y = driver.o notify.o
> >   scmi-transport-$(CONFIG_ARM_SCMI_HAVE_SHMEM) = shmem.o
> >   scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += mailbox.o
> >   scmi-transport-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += smc.o
> > +scmi-transport-$(CONFIG_ARM_SCMI_HAVE_MSG) += msg.o
> >   scmi-protocols-y = base.o clock.o perf.o power.o reset.o sensors.o system.o voltage.o
> >   scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
> >   		    $(scmi-transport-y)
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 14457f0d5dea..7a1e84dc191b 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -415,6 +415,21 @@ void shmem_clear_channel(struct scmi_shared_mem __iomem *shmem);
> >   bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
> >   		     struct scmi_xfer *xfer);
> > +/* declarations for message passing transports */
> > +struct scmi_msg_payld;
> > +
> > +/* Maximum overhead of message w.r.t. struct scmi_desc.max_msg_size */
> > +#define SCMI_MSG_MAX_PROT_OVERHEAD (2 * sizeof(__le32))
> > +
> > +size_t msg_response_size(struct scmi_xfer *xfer);
> > +size_t msg_command_size(struct scmi_xfer *xfer);
> > +void msg_tx_prepare(struct scmi_msg_payld *msg, struct scmi_xfer *xfer);
> > +u32 msg_read_header(struct scmi_msg_payld *msg);
> > +void msg_fetch_response(struct scmi_msg_payld *msg, size_t len,
> > +			struct scmi_xfer *xfer);
> > +void msg_fetch_notification(struct scmi_msg_payld *msg, size_t len,
> > +			    size_t max_len, struct scmi_xfer *xfer);
> > +
> >   void scmi_notification_instance_data_set(const struct scmi_handle *handle,
> >   					 void *priv);
> >   void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> > diff --git a/drivers/firmware/arm_scmi/msg.c b/drivers/firmware/arm_scmi/msg.c
> > new file mode 100644
> > index 000000000000..639969b7dc10
> > --- /dev/null
> > +++ b/drivers/firmware/arm_scmi/msg.c
> > @@ -0,0 +1,113 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * For transports using message passing.
> > + *
> > + * Derived from shm.c.
> > + *
> > + * Copyright (C) 2019-2021 ARM Ltd.
> > + * Copyright (C) 2020-2021 OpenSynergy GmbH
> > + */
> > +
> > +#include <linux/io.h>
> > +#include <linux/processor.h>
> 
> The above two includes appear to be unnecessary.

Right. Thanks, I'll fix.

Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
  2021-07-15 16:35 ` [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Peter Hilber
@ 2021-07-19 11:36   ` Cristian Marussi
  2021-07-22  8:30     ` Peter Hilber
  0 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-19 11:36 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Thu, Jul 15, 2021 at 06:35:38PM +0200, Peter Hilber wrote:
> On 12.07.21 16:18, Cristian Marussi wrote:
> > Hi all,
> > 
> 
> Hi Cristian,
> 
> thanks for your update. Please find some additional comments in this reply
> and the following.
> 
> Best regards,
> 
> Peter

Hi Peter,

thanks for the feedback.

> 
> > While reworking this series starting from the work done up to V3 by
> > OpenSynergy, I am keeping the original autorship and list distribution
> > unchanged.
> > 
> > The main aim of this rework, as said, is to simplify where possible the
> > SCMI VirtIO support added in V3 by adding at first some new general
> > mechanisms in the SCMI Transport layer.
> > 
> > Indeed, after some initial small fixes, patches 05/06/07/08 add such new
> > additional mechanisms to the SCMI core to ease implementation of more
> > complex transports like virtio, while also addressing a few general issues
> > already potentially affecting existing transports.
> > 
> > In terms of rework I dropped original V3 patches 05/06/07/08/12 as no more
> > needed, and modified where needed the remaining original patches to take
> > advantage of the above mentioned new SCMI transport features.
> > 
> > DT bindings patch has been ported on top of freshly YAML converted arm,scmi
> > bindings.
> > 
> > Moreover, since V5 I dropped support for polling mode from the virtio-scmi
> > transport, since it is an optional general mechanism provided by the core
> > to allow transports lacking a completion IRQ to work and it seemed a
> > needless addition/complication in the context of virtio transport.
> > 
> 
> Just for correctness, in my understanding polling is not completely optional
> ATM. Polling would be required by scmi_cpufreq_fast_switch(). But that
> requirement might be irrelevant for now.
> 

Cpufreq core can use .fast_switch (scmi_cpufreq_fast_switch) op only if 
policy->fast_switch_enabled is true which in turn reported as true by
the SCMI cpufreq driver iff SCMI FastChannels are supported by Perf
implementation server side, but the SCMI Device VirtIO spec (5.17)
explicitly does NOT support SCMI FastChannels as of now.

Anyway, even though we should support in the future SCMI FastChannels on
VirtIO SCMI transport, fastchannels are by defintion per-protocol/per-command/
per-domain-id specific, based on sharedMem or MMIO, unidirectional and do not
even allow for a response from the platform (SCMIV3.0 4.1.1 5.3) so polling
won't be a thing anyway unless I'm missing something.

BUT you made a good point in fact anyway, because the generic perf->freq_set/get
API CAN be indeed invoked in polling mode, and, even though we do not use them
in polling as of now (if not in the FastChannel scenario above) this could be a
potential problem in general if when the underlying transport do not support poll
the core just drop any poll_completion=true messages.

So, while I still think it is not sensible to enable poll mode in SCMI Virtio,
because would be a sort of faked polling and increases complexity, I'm now
considering the fact that maybe the right behaviour of the SCMI core in such a
scenario would be to warn the user as it does now AND then fallback to use
non-polling, probably better if such a behavior is made condtional on some
transport config desc flag that allow such fallback behavior.

Any thought ?

> > Additionally, in V5 I could also simplify a bit the virtio transport
> > probing sequence starting from the observation that, by the VirtIO spec,
> > in fact, only one single SCMI VirtIO device can possibly exist on a system.
> > 
> 
> I wouldn't say that the virtio spec restricts the # of virtio-scmi devices
> to one. But I do think the one device limitation in the kernel is
> acceptable.
> 

Indeed it is not that VirtIO spec explicitly forbids multiple devices,
but the fact that SCMI devices are not identifiable from VirtIO layer
(if not by vendor mmaybe) nor from DT config (as of now) leads to the fact
that only one single device can be possibly used; if you define multiple
MMIO devices (or multiple PCI are discovered) there is now way to tell which
is which and, say, assign those to distinct channels per different protocols.

Having said that, this upcoming series by Viresh, may be useful in the
future:

https://lore.kernel.org/lkml/aa4bf68fdd13b885a6dc1b98f88834916d51d97d.1626173013.git.viresh.kumar@linaro.org/

since it would add the capability to 'link' an SCMI device user (SCMI
virtio transport instance or specific channel) to a specific virtio-mmio
node. Once this is in, I could try to multi SCMI VirtIO device support, but not
within this series probably.


Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
  2021-07-19 11:36   ` Cristian Marussi
@ 2021-07-22  8:30     ` Peter Hilber
  0 siblings, 0 replies; 48+ messages in thread
From: Peter Hilber @ 2021-07-22  8:30 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 19.07.21 13:36, Cristian Marussi wrote:
> On Thu, Jul 15, 2021 at 06:35:38PM +0200, Peter Hilber wrote:
>> On 12.07.21 16:18, Cristian Marussi wrote:
>>> Hi all,
>>>
>>
>> Hi Cristian,
>>
>> thanks for your update. Please find some additional comments in this reply
>> and the following.
>>
>> Best regards,
>>
>> Peter
> 
> Hi Peter,
> 
> thanks for the feedback.
> 
>>
>>> While reworking this series starting from the work done up to V3 by
>>> OpenSynergy, I am keeping the original autorship and list distribution
>>> unchanged.
>>>
>>> The main aim of this rework, as said, is to simplify where possible the
>>> SCMI VirtIO support added in V3 by adding at first some new general
>>> mechanisms in the SCMI Transport layer.
>>>
>>> Indeed, after some initial small fixes, patches 05/06/07/08 add such new
>>> additional mechanisms to the SCMI core to ease implementation of more
>>> complex transports like virtio, while also addressing a few general issues
>>> already potentially affecting existing transports.
>>>
>>> In terms of rework I dropped original V3 patches 05/06/07/08/12 as no more
>>> needed, and modified where needed the remaining original patches to take
>>> advantage of the above mentioned new SCMI transport features.
>>>
>>> DT bindings patch has been ported on top of freshly YAML converted arm,scmi
>>> bindings.
>>>
>>> Moreover, since V5 I dropped support for polling mode from the virtio-scmi
>>> transport, since it is an optional general mechanism provided by the core
>>> to allow transports lacking a completion IRQ to work and it seemed a
>>> needless addition/complication in the context of virtio transport.
>>>
>>
>> Just for correctness, in my understanding polling is not completely optional
>> ATM. Polling would be required by scmi_cpufreq_fast_switch(). But that
>> requirement might be irrelevant for now.
>>
> 
> Cpufreq core can use .fast_switch (scmi_cpufreq_fast_switch) op only if
> policy->fast_switch_enabled is true which in turn reported as true by
> the SCMI cpufreq driver iff SCMI FastChannels are supported by Perf
> implementation server side, but the SCMI Device VirtIO spec (5.17)
> explicitly does NOT support SCMI FastChannels as of now.
> 
> Anyway, even though we should support in the future SCMI FastChannels on
> VirtIO SCMI transport, fastchannels are by defintion per-protocol/per-command/
> per-domain-id specific, based on sharedMem or MMIO, unidirectional and do not
> even allow for a response from the platform (SCMIV3.0 4.1.1 5.3) so polling
> won't be a thing anyway unless I'm missing something.
> 
> BUT you made a good point in fact anyway, because the generic perf->freq_set/get
> API CAN be indeed invoked in polling mode, and, even though we do not use them
> in polling as of now (if not in the FastChannel scenario above) this could be a
> potential problem in general if when the underlying transport do not support poll
> the core just drop any poll_completion=true messages.
> 
> So, while I still think it is not sensible to enable poll mode in SCMI Virtio,
> because would be a sort of faked polling and increases complexity, I'm now
> considering the fact that maybe the right behaviour of the SCMI core in such a
> scenario would be to warn the user as it does now AND then fallback to use
> non-polling, probably better if such a behavior is made condtional on some
> transport config desc flag that allow such fallback behavior.
> 
> Any thought ?
> 

Maybe the SCMI protocols should request "atomic" instead of "polling"? 
That semantics are the actual intent in my understanding. So the 
"Introduce atomic support for SCMI transports" patch series [1] could 
potentially address this?

Best regards,

Peter


[1] https://lkml.org/lkml/2021/7/12/3089

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-19  9:14     ` Cristian Marussi
@ 2021-07-22  8:32       ` Peter Hilber
  2021-07-28  8:31         ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Hilber @ 2021-07-22  8:32 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On 19.07.21 11:14, Cristian Marussi wrote:
> On Thu, Jul 15, 2021 at 06:36:03PM +0200, Peter Hilber wrote:
>> On 12.07.21 16:18, Cristian Marussi wrote:

[snip]

>>> @@ -608,6 +755,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>>>    			      xfer->hdr.protocol_id, xfer->hdr.seq,
>>>    			      xfer->hdr.poll_completion);
>>> +	xfer->state = SCMI_XFER_SENT_OK;
>>
>> To be completely safe, this assignment could also be protected by the
>> xfer->lock.
>>
> 
> In fact this would be true being xfer->lock meant to protect the state but it
> seemed to me unnecessary here given that this is a brand new xfer with a
> brand new (monotonic) seq number so that any possibly late-received msg will
> carry an old stale seq number certainly different from this such that cannot be
> possibly mapped to this same xfer. (but just discarded on xfer lookup in
> xfer_command_acquire)
> 
> The issue indeed could still exist only for do_xfer loops (as you pointed out
> already early on) where the seq_num is used, but in that case on a timeout we
> would have already bailed out of the loop and reported an error so any timed-out
> late received response would have been anyway discarded; so at the end I thought
> I could avoid spinlocking here.
> 
> Thanks,
> Cristian
> 

I mostly meant to refer to the possibility of a very fast response not 
seeing this assignment, since the next line is

>  	ret = info->desc->ops->send_message(cinfo, xfer);

and during that a regular scmi_rx_callback(), reading xfer->state, can 
already arrive. But maybe this is too theoretical.

Best regards,

Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-22  8:32       ` Peter Hilber
@ 2021-07-28  8:31         ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-28  8:31 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, alex.bennee, jean-philippe, mikhail.golubev,
	anton.yakovlev, Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Thu, Jul 22, 2021 at 10:32:58AM +0200, Peter Hilber wrote:
> On 19.07.21 11:14, Cristian Marussi wrote:
> > On Thu, Jul 15, 2021 at 06:36:03PM +0200, Peter Hilber wrote:
> > > On 12.07.21 16:18, Cristian Marussi wrote:
> 
> [snip]
> 
> > > > @@ -608,6 +755,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
> > > >    			      xfer->hdr.protocol_id, xfer->hdr.seq,
> > > >    			      xfer->hdr.poll_completion);
> > > > +	xfer->state = SCMI_XFER_SENT_OK;
> > > 
> > > To be completely safe, this assignment could also be protected by the
> > > xfer->lock.
> > > 
> > 
> > In fact this would be true being xfer->lock meant to protect the state but it
> > seemed to me unnecessary here given that this is a brand new xfer with a
> > brand new (monotonic) seq number so that any possibly late-received msg will
> > carry an old stale seq number certainly different from this such that cannot be
> > possibly mapped to this same xfer. (but just discarded on xfer lookup in
> > xfer_command_acquire)
> > 
> > The issue indeed could still exist only for do_xfer loops (as you pointed out
> > already early on) where the seq_num is used, but in that case on a timeout we
> > would have already bailed out of the loop and reported an error so any timed-out
> > late received response would have been anyway discarded; so at the end I thought
> > I could avoid spinlocking here.
> > 
> > Thanks,
> > Cristian
> > 

Hi Peter,

sorry for the late answer.

> 
> I mostly meant to refer to the possibility of a very fast response not
> seeing this assignment, since the next line is
> 
> >  	ret = info->desc->ops->send_message(cinfo, xfer);
> 
> and during that a regular scmi_rx_callback(), reading xfer->state, can
> already arrive. But maybe this is too theoretical.
> 

Right, that's a possibility indeed to account for even if remote: given
that, though, no race is possible here on state as said, I'd still avoid the
spinlock and related irq-off and opt instead for a barrier to avoid
re-ordering and to be sure that the scmi_rx_callback() on the RX processor
can see the latest value (a dmb(ish) + cache coherence magic should be enough)

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support
  2021-07-12 14:18 ` [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support Cristian Marussi
@ 2021-07-28 11:40   ` Sudeep Holla
  2021-07-28 12:28     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 11:40 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:21PM +0100, Cristian Marussi wrote:
> Some SCMI transport could need to perform some transport specific setup
> before they can be used by the SCMI core transport layer: typically this
> early setup consists in registering with some other kernel subsystem.
> 
> Add the optional capability for a transport to provide a couple of .init
> and .exit functions that are assured to be called early during the SCMI
> core initialization phase, well before the SCMI core probing step.
> 
> [ Peter: Adapted RFC patch by Cristian for submission to upstream. ]
> Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
> [ Cristian: Fixed scmi_transports_exit point of invocation ]
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v4 --> V5
> - removed useless pr_debug
> - moved scmi_transport_exit() invocation
> ---
>  drivers/firmware/arm_scmi/common.h |  8 +++++
>  drivers/firmware/arm_scmi/driver.c | 56 ++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+)
> 
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 7c2b9fd7e929..6bb734e0e3ac 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -321,6 +321,12 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
>  /**
>   * struct scmi_desc - Description of SoC integration
>   *
> + * @init: An optional function that a transport can provide to initialize some
> + *	  transport-specific setup during SCMI core initialization, so ahead of
> + *	  SCMI core probing.
> + * @exit: An optional function that a transport can provide to de-initialize
> + *	  some transport-specific setup during SCMI core de-initialization, so
> + *	  after SCMI core removal.
>   * @ops: Pointer to the transport specific ops structure
>   * @max_rx_timeout_ms: Timeout for communication with SoC (in Milliseconds)
>   * @max_msg: Maximum number of messages that can be pending
> @@ -328,6 +334,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
>   * @max_msg_size: Maximum size of data per message that can be handled.
>   */
>  struct scmi_desc {
> +	int (*init)(void);
> +	void (*exit)(void);

Does it make sense to rename scmi_desc as scmi_transport or scmi_transport_desc ?
I reason I ask is plain init/exit here doesn't make sense. You can change it
to transport_init/exit if we don't want to rename the structure.

>  	const struct scmi_transport_ops *ops;

I assume we don't want init/exit inside ops as it is shared with protocols ?
Looks good other than the above comment.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support
  2021-07-28 11:40   ` Sudeep Holla
@ 2021-07-28 12:28     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-28 12:28 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 12:40:18PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:21PM +0100, Cristian Marussi wrote:
> > Some SCMI transport could need to perform some transport specific setup
> > before they can be used by the SCMI core transport layer: typically this
> > early setup consists in registering with some other kernel subsystem.
> > 
> > Add the optional capability for a transport to provide a couple of .init
> > and .exit functions that are assured to be called early during the SCMI
> > core initialization phase, well before the SCMI core probing step.
> > 
> > [ Peter: Adapted RFC patch by Cristian for submission to upstream. ]
> > Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
> > [ Cristian: Fixed scmi_transports_exit point of invocation ]
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v4 --> V5
> > - removed useless pr_debug
> > - moved scmi_transport_exit() invocation
> > ---

Hi Sudeep,

thanks for having a look.

> >  drivers/firmware/arm_scmi/common.h |  8 +++++
> >  drivers/firmware/arm_scmi/driver.c | 56 ++++++++++++++++++++++++++++++
> >  2 files changed, 64 insertions(+)
> > 
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 7c2b9fd7e929..6bb734e0e3ac 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -321,6 +321,12 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
> >  /**
> >   * struct scmi_desc - Description of SoC integration
> >   *
> > + * @init: An optional function that a transport can provide to initialize some
> > + *	  transport-specific setup during SCMI core initialization, so ahead of
> > + *	  SCMI core probing.
> > + * @exit: An optional function that a transport can provide to de-initialize
> > + *	  some transport-specific setup during SCMI core de-initialization, so
> > + *	  after SCMI core removal.
> >   * @ops: Pointer to the transport specific ops structure
> >   * @max_rx_timeout_ms: Timeout for communication with SoC (in Milliseconds)
> >   * @max_msg: Maximum number of messages that can be pending
> > @@ -328,6 +334,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
> >   * @max_msg_size: Maximum size of data per message that can be handled.
> >   */
> >  struct scmi_desc {
> > +	int (*init)(void);
> > +	void (*exit)(void);
> 
> Does it make sense to rename scmi_desc as scmi_transport or scmi_transport_desc ?
> I reason I ask is plain init/exit here doesn't make sense. You can change it
> to transport_init/exit if we don't want to rename the structure.
> 

Yes indeed I'll rename these to transport_init/exit in V7.

> >  	const struct scmi_transport_ops *ops;
> 
> I assume we don't want init/exit inside ops as it is shared with protocols ?
> Looks good other than the above comment.
> 

It seemed to me that scmi_transport_ops were more related to an initialized
instance of a transport and as such used when the scmi instance is probed or
later, while these transport_init/exit are more general transport specific
methods that have to be called, if provided, at scmi driver init, way before
scmi_probe(), to allow for early transport inits, as an example virtio-scmi
uses these to register at first with the virtio subsystem; so I kept them
separated.

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
  2021-07-12 14:18 ` [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens Cristian Marussi
@ 2021-07-28 14:17   ` Sudeep Holla
  2021-07-28 16:54     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 14:17 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:22PM +0100, Cristian Marussi wrote:
> Tokens are sequence numbers embedded in the each SCMI message header: they
> are used to correlate commands with responses (and delayed responses), but
> their usage and policy of selection is entirely up to the caller (usually
> the OSPM agent), while they are completely opaque to the callee (SCMI
> server platform) which merely copies them back from the command into the
> response message header.
> This also means that the platform does not, can not and should not enforce
> any kind of policy on received messages depending on the contained sequence
> number: platform can perfectly handle concurrent requests carrying the same
> identifiying token if that should happen.
> 
> Moreover the platform is not required to produce in-order responses to
> agent requests, the only constraint in these regards is that in case of
> an asynchronous message the delayed response must be sent after the
> immediate response for the synchronous part of the command transaction.
> 
> Currenly the SCMI stack of the OSPM agent selects a token for the egressing
> commands picking the lowest possible number which is not already in use by
> an existing in-flight transaction, which means, in other words, that we
> immediately reuse any token after its transaction has completed or it has
> timed out: this policy indeed does simplify management and lookup of tokens
> and associated xfers.
> 
> Under the above assumptions and constraints, since there is really no state
> shared between the agent and the platform to let the platform know when a
> token and its associated message has timed out, the current policy of early
> reuse of tokens can easily lead to the situation in which a spurious or
> late received response (or delayed_response), related to an old stale and
> timed out transaction, can be wrongly associated to a newer valid in-flight
> xfer that just happens to have reused the same token.
> 
> This misbehaviour on such ghost responses is more easily exposed on those
> transports that naturally have an higher level of parallelism in processing
> multiple concurrent in-flight messages.
>

The term ghost is used here without any reference to what it means. That
could make it difficult to follow if someone unaware of it is trying to
understand.

> This commit introduces a new policy of selection of tokens for the OSPM
> agent: each new command transfer now gets the next available, monotonically
> increasing token, until tokens are exhausted and the counter rolls over.
> 
> Such new policy mitigates the above issues with ghost responses since the
> tokens are now reused as late as possible (when they roll back ideally)
> and so it is much easier to identify such ghost responses to stale timed
> out transactions: this also helps in simplifying the specific transports
> implementation since stale transport messages can be easily identified
> and discarded early on in the rx path without the need to cross check
> their actual state with the core transport layer.
> This mitigation is even more effective when, as is usually the case, the
> maximum number of pending messages is capped by the platform to a much
> lower number than the whole possible range of tokens values (2^10).
> 
> This internal policy change in the core SCMI transport layer is fully
> transparent to the specific transports so it has not and should not have
> any impact on the transports implementation.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v4 --> V5
> - removed empirical profiling info from commit msg
> - do NOT use monotonic tokens and pending HT for notifications (not needed)
> - release xfer_lock later in scmi_xfer_get
> ---
>  drivers/firmware/arm_scmi/common.h |  25 +++
>  drivers/firmware/arm_scmi/driver.c | 260 +++++++++++++++++++++++++----
>  2 files changed, 249 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 6bb734e0e3ac..2233d0a188fc 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -14,7 +14,10 @@
>  #include <linux/device.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> +#include <linux/hashtable.h>
> +#include <linux/list.h>
>  #include <linux/module.h>
> +#include <linux/refcount.h>
>  #include <linux/scmi_protocol.h>
>  #include <linux/types.h>
>  
> @@ -138,6 +141,10 @@ struct scmi_msg {
>   *	buffer for the rx path as we use for the tx path.
>   * @done: command message transmit completion event
>   * @async_done: pointer to delayed response message received event completion
> + * @users: A refcount to track the active users for this xfer
> + * @pending: True for xfers added to @pending_xfers hashtable
> + * @node: An hlist_node reference used to store this xfer, alternatively, on
> + *	  the free list @free_xfers or in the @pending_xfers hashtable
>   */
>  struct scmi_xfer {
>  	int transfer_id;
> @@ -146,8 +153,26 @@ struct scmi_xfer {
>  	struct scmi_msg rx;
>  	struct completion done;
>  	struct completion *async_done;
> +	refcount_t users;
> +	bool pending;
> +	struct hlist_node node;
>  };
>  
> +/*
> + * An helper macro to lookup an xfer from the @pending_xfers hashtable
> + * using the message sequence number token as a key.
> + */
> +#define XFER_FIND(__ht, __k)					\
> +({								\
> +	typeof(__k) k_ = __k;					\
> +	struct scmi_xfer *xfer_ = NULL;				\
> +								\
> +	hash_for_each_possible((__ht), xfer_, node, k_)		\
> +		if (xfer_->hdr.seq == k_)			\
> +			break;					\
> +	xfer_;							\
> +})
> +
>  struct scmi_xfer_ops;
>  
>  /**
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index 4c77ee13b1ad..245ede223302 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -21,6 +21,7 @@
>  #include <linux/io.h>
>  #include <linux/kernel.h>
>  #include <linux/ktime.h>
> +#include <linux/hashtable.h>
>  #include <linux/list.h>
>  #include <linux/module.h>
>  #include <linux/of_address.h>
> @@ -65,19 +66,29 @@ struct scmi_requested_dev {
>  	struct list_head node;
>  };
>  
> +#define SCMI_PENDING_XFERS_HT_ORDER_SZ	9
> +

Is there any particular reason to choose half the token size as hash bucket
size ? IOW why not 1/3 or 1/4th of it ? I would appreciate a comment here.
I see it is mentioned in the commit log. Also is it not better to associate
or keep it close to MSG_TOKEN_ID_MASK and associated macros.

>  /**
>   * struct scmi_xfers_info - Structure to manage transfer information
>   *
> - * @xfer_block: Preallocated Message array
>   * @xfer_alloc_table: Bitmap table for allocated messages.
>   *	Index of this bitmap table is also used for message
>   *	sequence identifier.
>   * @xfer_lock: Protection for message allocation
> + * @last_token: A counter to use as base to generate for monotonically
> + *		increasing tokens.
> + * @free_xfers: A free list for available to use xfers. It is initialized with
> + *		a number of xfers equal to the maximum allowed in-flight
> + *		messages.
> + * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
> + *		   currently in-flight messages.
>   */
>  struct scmi_xfers_info {
> -	struct scmi_xfer *xfer_block;
>  	unsigned long *xfer_alloc_table;
>  	spinlock_t xfer_lock;
> +	atomic_t last_token;

Can we merge this and transfer_last_id ? Let this be free running like
transfer_last_id and just use [0-9] from this ? I don't see any point
having 2 different monotonically increasing tokens/id.

> +	struct hlist_head free_xfers;
> +	DECLARE_HASHTABLE(pending_xfers, SCMI_PENDING_XFERS_HT_ORDER_SZ);
>  };
>  
>  /**
> @@ -190,45 +201,177 @@ void *scmi_notification_instance_data_get(const struct scmi_handle *handle)
>  	return info->notify_priv;
>  }
>  
> +/**
> + * scmi_xfer_token_set  - Reserve and set new token for the xfer at hand
> + *
> + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> + * @xfer: The xfer to act upon
> + *
> + * Pick the next unused monotonically increasing token and set it into
> + * xfer->hdr.seq: picking a monotonically increasing value avoids immediate
> + * reuse of freshly completed or timed-out xfers, thus mitigating the risk
> + * of incorrect association of a late and expired xfer with a live in-flight
> + * transaction, both happening to re-use the same token identifier.
> + *
> + * Since platform is NOT required to answer our request in-order we should
> + * account for a few rare but possible scenarios:
> + *
> + *  - exactly 'next_token' may be NOT available so pick xfer_id >= next_token
> + *    using find_next_zero_bit() starting from candidate next_token bit
> + *
> + *  - all tokens ahead upto (MSG_TOKEN_ID_MASK - 1) are used in-flight but we
> + *    are plenty of free tokens at start, so try a second pass using
> + *    find_next_zero_bit() and starting from 0.
> + *
> + *  X = used in-flight
> + *
> + * Normal
> + * ------
> + *
> + *		|- xfer_id picked
> + *   -----------+----------------------------------------------------------
> + *   | | |X|X|X| | | | | | ... ... ... ... ... ... ... ... ... ... ...|X|X|
> + *   ----------------------------------------------------------------------
> + *		^
> + *		|- next_token
> + *
> + * Out-of-order pending at start
> + * -----------------------------
> + *
> + *	  |- xfer_id picked, last_token fixed
> + *   -----+----------------------------------------------------------------
> + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... ... ...|X| |
> + *   ----------------------------------------------------------------------
> + *    ^
> + *    |- next_token
> + *
> + *
> + * Out-of-order pending at end
> + * ---------------------------
> + *
> + *	  |- xfer_id picked, last_token fixed
> + *   -----+----------------------------------------------------------------
> + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... |X|X|X||X|X|
> + *   ----------------------------------------------------------------------
> + *								^
> + *								|- next_token
> + *
> + * Context: Assumes to be called with @xfer_lock already acquired.
> + *
> + * Return: 0 on Success or error
> + */
> +static int scmi_xfer_token_set(struct scmi_xfers_info *minfo,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long xfer_id, next_token;
> +
> +	/* Pick a candidate monotonic token in range [0, MSG_TOKEN_MAX - 1] */
> +	next_token = (atomic_inc_return(&minfo->last_token) &
> +		      (MSG_TOKEN_MAX - 1));
> +
> +	/* Pick the next available xfer_id >= next_token */
> +	xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> +				     MSG_TOKEN_MAX, next_token);
> +	if (xfer_id == MSG_TOKEN_MAX) {
> +		/*
> +		 * After heavily out-of-order responses, there are no free
> +		 * tokens ahead, but only at start of xfer_alloc_table so
> +		 * try again from the beginning.
> +		 */
> +		xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> +					     MSG_TOKEN_MAX, 0);
> +		/*
> +		 * Something is wrong if we got here since there can be a
> +		 * maximum number of (MSG_TOKEN_MAX - 1) in-flight messages
> +		 * but we have not found any free token [0, MSG_TOKEN_MAX - 1].
> +		 */
> +		if (WARN_ON_ONCE(xfer_id == MSG_TOKEN_MAX))
> +			return -ENOMEM;
> +	}
> +
> +	/* Update +/- last_token accordingly if we skipped some hole */
> +	if (xfer_id != next_token)
> +		atomic_add((int)(xfer_id - next_token), &minfo->last_token);
> +
> +	/* Set in-flight */
> +	set_bit(xfer_id, minfo->xfer_alloc_table);
> +	xfer->hdr.seq = (u16)xfer_id;
> +
> +	return 0;
> +}
> +
> +/**
> + * scmi_xfer_token_clear  - Release the token
> + *
> + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> + * @xfer: The xfer to act upon
> + */
> +static inline void scmi_xfer_token_clear(struct scmi_xfers_info *minfo,
> +					 struct scmi_xfer *xfer)
> +{
> +	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> +}
> +
>  /**
>   * scmi_xfer_get() - Allocate one message
>   *
>   * @handle: Pointer to SCMI entity handle
>   * @minfo: Pointer to Tx/Rx Message management info based on channel type
> + * @set_pending: If true a monotonic token is picked and the xfer is added to
> + *		 the pending hash table.
>   *
>   * Helper function which is used by various message functions that are
>   * exposed to clients of this driver for allocating a message traffic event.
>   *
> - * This function can sleep depending on pending requests already in the system
> - * for the SCMI entity. Further, this also holds a spinlock to maintain
> - * integrity of internal data structures.
> + * Picks an xfer from the free list @free_xfers (if any available) and, if
> + * required, sets a monotonically increasing token and stores the inflight xfer
> + * into the @pending_xfers hashtable for later retrieval.
> + *
> + * The successfully initialized xfer is refcounted.
> + *
> + * Context: Holds @xfer_lock while manipulating @xfer_alloc_table and
> + *	    @free_xfers.
>   *
>   * Return: 0 if all went fine, else corresponding error.
>   */
>  static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> -				       struct scmi_xfers_info *minfo)
> +				       struct scmi_xfers_info *minfo,
> +				       bool set_pending)
>  {
> -	u16 xfer_id;
> +	int ret;
> +	unsigned long flags;
>  	struct scmi_xfer *xfer;
> -	unsigned long flags, bit_pos;
> -	struct scmi_info *info = handle_to_scmi_info(handle);
>  
> -	/* Keep the locked section as small as possible */
>  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> -	bit_pos = find_first_zero_bit(minfo->xfer_alloc_table,
> -				      info->desc->max_msg);
> -	if (bit_pos == info->desc->max_msg) {
> +	if (hlist_empty(&minfo->free_xfers)) {
>  		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
>  		return ERR_PTR(-ENOMEM);
>  	}
> -	set_bit(bit_pos, minfo->xfer_alloc_table);
> -	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
>  
> -	xfer_id = bit_pos;
> +	/* grab an xfer from the free_list */
> +	xfer = hlist_entry(minfo->free_xfers.first, struct scmi_xfer, node);
> +	hlist_del_init(&xfer->node);
> +
> +	if (set_pending) {
> +		/* Pick and set monotonic token */
> +		ret = scmi_xfer_token_set(minfo, xfer);
> +		if (!ret) {
> +			hash_add(minfo->pending_xfers, &xfer->node,
> +				 xfer->hdr.seq);
> +			xfer->pending = true;
> +		} else {
> +			dev_err(handle->dev,
> +				"Failed to get monotonic token %d\n", ret);
> +			hlist_add_head(&xfer->node, &minfo->free_xfers);
> +			xfer = ERR_PTR(ret);
> +		}
> +	}
>  
> -	xfer = &minfo->xfer_block[xfer_id];
> -	xfer->hdr.seq = xfer_id;
> -	xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> +	if (!IS_ERR(xfer)) {
> +		refcount_set(&xfer->users, 1);
> +		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> +	}
> +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
>  
>  	return xfer;
>  }
> @@ -239,6 +382,9 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
>   * @minfo: Pointer to Tx/Rx Message management info based on channel type
>   * @xfer: message that was reserved by scmi_xfer_get
>   *
> + * After refcount check, possibly release an xfer, clearing the token slot,
> + * removing xfer from @pending_xfers and putting it back into free_xfers.
> + *
>   * This holds a spinlock to maintain integrity of internal data structures.
>   */
>  static void
> @@ -246,16 +392,44 @@ __scmi_xfer_put(struct scmi_xfers_info *minfo, struct scmi_xfer *xfer)
>  {
>  	unsigned long flags;
>  
> -	/*
> -	 * Keep the locked section as small as possible
> -	 * NOTE: we might escape with smp_mb and no lock here..
> -	 * but just be conservative and symmetric.
> -	 */
>  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> -	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> +	if (refcount_dec_and_test(&xfer->users)) {

The introduction of users in this patch seems useless. I am assuming there are
multiple users and this is to prevent some race. I was about to ask if we can
manage without it, but seeing some additional use of it in later patches, I
will comment later. It may still make sense to move this to later patch as
it doesn't add anything here ?

> +		if (xfer->pending) {
> +			scmi_xfer_token_clear(minfo, xfer);
> +			hash_del(&xfer->node);
> +			xfer->pending = false;
> +		}
> +		hlist_add_head(&xfer->node, &minfo->free_xfers);
> +	}
>  	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
>  }
>  
> +/**
> + * scmi_xfer_lookup_unlocked  -  Helper to lookup an xfer_id
> + *
> + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> + * @xfer_id: Token ID to lookup in @pending_xfers
> + *
> + * Refcounting is untouched.
> + *
> + * Context: Assumes to be called with @xfer_lock already acquired.
> + *
> + * Return: A valid xfer on Success or error otherwise
> + */
> +static struct scmi_xfer *
> +scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
> +{
> +	struct scmi_xfer *xfer = NULL;
> +
> +	if (xfer_id >= MSG_TOKEN_MAX)
> +		return ERR_PTR(-EINVAL);
> +

Is this really needed ? I guess we always use MSG_XTRACT_TOKEN(hdr) to
fetch the xfer_id, no ?

> +	if (test_bit(xfer_id, minfo->xfer_alloc_table))
> +		xfer = XFER_FIND(minfo->pending_xfers, xfer_id);
> +
> +	return xfer ?: ERR_PTR(-EINVAL);
> +}
> +
>  static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
>  {
>  	struct scmi_xfer *xfer;
> @@ -265,7 +439,7 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
>  	ktime_t ts;
>  
>  	ts = ktime_get_boottime();
> -	xfer = scmi_xfer_get(cinfo->handle, minfo);
> +	xfer = scmi_xfer_get(cinfo->handle, minfo, false);
>  	if (IS_ERR(xfer)) {
>  		dev_err(dev, "failed to get free message slot (%ld)\n",
>  			PTR_ERR(xfer));
> @@ -291,19 +465,22 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo, u32 msg_hdr)
>  static void scmi_handle_response(struct scmi_chan_info *cinfo,
>  				 u16 xfer_id, u8 msg_type)
>  {
> +	unsigned long flags;
>  	struct scmi_xfer *xfer;
>  	struct device *dev = cinfo->dev;
>  	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
>  	struct scmi_xfers_info *minfo = &info->tx_minfo;
>  
>  	/* Are we even expecting this? */
> -	if (!test_bit(xfer_id, minfo->xfer_alloc_table)) {
> +	spin_lock_irqsave(&minfo->xfer_lock, flags);
> +	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +	if (IS_ERR(xfer)) {
>  		dev_err(dev, "message for %d is not expected!\n", xfer_id);
>  		info->desc->ops->clear_channel(cinfo);
>  		return;
>  	}
>  
> -	xfer = &minfo->xfer_block[xfer_id];
>  	/*
>  	 * Even if a response was indeed expected on this slot at this point,
>  	 * a buggy platform could wrongly reply feeding us an unexpected
> @@ -540,7 +717,7 @@ static int xfer_get_init(const struct scmi_protocol_handle *ph,
>  	    tx_size > info->desc->max_msg_size)
>  		return -ERANGE;
>  
> -	xfer = scmi_xfer_get(pi->handle, minfo);
> +	xfer = scmi_xfer_get(pi->handle, minfo, true);
>  	if (IS_ERR(xfer)) {
>  		ret = PTR_ERR(xfer);
>  		dev_err(dev, "failed to get free message slot(%d)\n", ret);
> @@ -1017,18 +1194,25 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
>  		return -EINVAL;
>  	}
>  
> -	info->xfer_block = devm_kcalloc(dev, desc->max_msg,
> -					sizeof(*info->xfer_block), GFP_KERNEL);
> -	if (!info->xfer_block)
> -		return -ENOMEM;
> +	hash_init(info->pending_xfers);
>  
> -	info->xfer_alloc_table = devm_kcalloc(dev, BITS_TO_LONGS(desc->max_msg),
> +	/* Allocate a bitmask sized to hold MSG_TOKEN_MAX tokens */
> +	info->xfer_alloc_table = devm_kcalloc(dev, BITS_TO_LONGS(MSG_TOKEN_MAX),
>  					      sizeof(long), GFP_KERNEL);
>  	if (!info->xfer_alloc_table)
>  		return -ENOMEM;
>  
> -	/* Pre-initialize the buffer pointer to pre-allocated buffers */
> -	for (i = 0, xfer = info->xfer_block; i < desc->max_msg; i++, xfer++) {
> +	/*
> +	 * Preallocate a number of xfers equal to max inflight messages,
> +	 * pre-initialize the buffer pointer to pre-allocated buffers and
> +	 * attach all of them to the free list
> +	 */
> +	INIT_HLIST_HEAD(&info->free_xfers);
> +	for (i = 0; i < desc->max_msg; i++) {
> +		xfer = devm_kzalloc(dev, sizeof(*xfer), GFP_KERNEL);
> +		if (!xfer)
> +			return -ENOMEM;
> +
>  		xfer->rx.buf = devm_kcalloc(dev, sizeof(u8), desc->max_msg_size,
>  					    GFP_KERNEL);
>  		if (!xfer->rx.buf)
> @@ -1036,8 +1220,12 @@ static int __scmi_xfer_info_init(struct scmi_info *sinfo,
>  
>  		xfer->tx.buf = xfer->rx.buf;
>  		init_completion(&xfer->done);
> +
> +		/* Add initialized xfer to the free list */
> +		hlist_add_head(&xfer->node, &info->free_xfers);
>  	}
>  
> +	atomic_set(&info->last_token, -1);
>  	spin_lock_init(&info->xfer_lock);
>  
>  	return 0;
> -- 
> 2.17.1
> 

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback
  2021-07-12 14:18 ` [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback Cristian Marussi
@ 2021-07-28 14:26   ` Sudeep Holla
  2021-07-28 17:25     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 14:26 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:24PM +0100, Cristian Marussi wrote:
> Add a new opaque void *priv parameter to scmi_rx_callback which can be
> optionally provided by the transport layer when invoking scmi_rx_callback
> and that will be passed back to the transport layer in xfer->priv.
> 
> This can be used by transports that needs to keep track of their specific
> data structures together with the valid xfers.
>

This change looks simple but doesn't make sense on its own. I assume you don't
want to add all these in the patch making use of priv which makes sense. Not
sure if the next patch uses it, but I prefer to keep them as close as possible
if it is not too much of a hassle.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional
  2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi
  2021-07-15 16:36   ` Peter Hilber
@ 2021-07-28 14:34   ` Sudeep Holla
  2021-07-28 17:41     ` Cristian Marussi
  1 sibling, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 14:34 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:26PM +0100, Cristian Marussi wrote:
> Add a check for the presence of .poll_done transport operation so that
> transports that do not need to support polling mode have no need to provide
> a dummy .poll_done callback either and polling mode can be disabled in the
> SCMI core for that tranport.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
>  drivers/firmware/arm_scmi/driver.c | 43 ++++++++++++++++++------------
>  1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index a952b6527b8a..4183d25c9289 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -777,25 +777,34 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>  	}
>  
>  	if (xfer->hdr.poll_completion) {
> -		ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
> -
> -		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
> -
> -		if (ktime_before(ktime_get(), stop)) {
> -			unsigned long flags;
> -
> -			/*
> -			 * Do not fetch_response if an out-of-order delayed
> -			 * response is being processed.
> -			 */
> -			spin_lock_irqsave(&xfer->lock, flags);
> -			if (xfer->state == SCMI_XFER_SENT_OK) {
> -				info->desc->ops->fetch_response(cinfo, xfer);
> -				xfer->state = SCMI_XFER_RESP_OK;
> +		if (info->desc->ops->poll_done) {
> +			ktime_t stop = ktime_add_ns(ktime_get(),
> +						    SCMI_MAX_POLL_TO_NS);
> +
> +			spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer,
> +								  stop));
> +
> +			if (ktime_before(ktime_get(), stop)) {
> +				unsigned long flags;
> +
> +				/*
> +				 * Do not fetch_response if an out-of-order delayed
> +				 * response is being processed.
> +				 */
> +				spin_lock_irqsave(&xfer->lock, flags);
> +				if (xfer->state == SCMI_XFER_SENT_OK) {
> +					info->desc->ops->fetch_response(cinfo,
> +									xfer);
> +					xfer->state = SCMI_XFER_RESP_OK;
> +				}
> +				spin_unlock_irqrestore(&xfer->lock, flags);
> +			} else {
> +				ret = -ETIMEDOUT;
>  			}
> -			spin_unlock_irqrestore(&xfer->lock, flags);
>  		} else {
> -			ret = -ETIMEDOUT;
> +			dev_warn_once(dev,
> +				      "Polling mode is not supported by transport.\n");
> +			ret = EINVAL;

Can't we just return this error as early as possible if the user isn't
expected to use polling with this transport ? That would simplify the patch
(as most of it is due to indentation which can go away) as you need not
check it later ?

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable
  2021-07-12 14:18 ` [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable Cristian Marussi
@ 2021-07-28 14:50   ` Sudeep Holla
  2021-07-29 16:18     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 14:50 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:27PM +0100, Cristian Marussi wrote:
> Add configuration options to be able to select which SCMI transports have
> to be compiled into the SCMI stack.
> 
> Mailbox and SMC are by default enabled if their related dependencies are
> satisfied.
> 
> While doing that move all SCMI related config options in their own
> dedicated submenu.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> Used a BUILD_BUG_ON() to avoid the scenario where SCMI is configured
> without any transport. Coul dnot do in any other way in Kconfig due to
> circular dependencies.
> 
> This will be neeed later on to add new Virtio based transport and
> optionally exclude other transports.
> ---
>  drivers/firmware/Kconfig           | 34 +--------------
>  drivers/firmware/arm_scmi/Kconfig  | 70 ++++++++++++++++++++++++++++++
>  drivers/firmware/arm_scmi/Makefile |  4 +-
>  drivers/firmware/arm_scmi/common.h |  4 +-
>  drivers/firmware/arm_scmi/driver.c |  6 ++-
>  5 files changed, 80 insertions(+), 38 deletions(-)
>  create mode 100644 drivers/firmware/arm_scmi/Kconfig
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 1db738d5b301..8d41f73f5395 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -6,39 +6,7 @@
>  
>  menu "Firmware Drivers"
>  
> -config ARM_SCMI_PROTOCOL
> -	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
> -	depends on ARM || ARM64 || COMPILE_TEST
> -	depends on MAILBOX || HAVE_ARM_SMCCC_DISCOVERY
> -	help
> -	  ARM System Control and Management Interface (SCMI) protocol is a
> -	  set of operating system-independent software interfaces that are
> -	  used in system management. SCMI is extensible and currently provides
> -	  interfaces for: Discovery and self-description of the interfaces
> -	  it supports, Power domain management which is the ability to place
> -	  a given device or domain into the various power-saving states that
> -	  it supports, Performance management which is the ability to control
> -	  the performance of a domain that is composed of compute engines
> -	  such as application processors and other accelerators, Clock
> -	  management which is the ability to set and inquire rates on platform
> -	  managed clocks and Sensor management which is the ability to read
> -	  sensor data, and be notified of sensor value.
> -
> -	  This protocol library provides interface for all the client drivers
> -	  making use of the features offered by the SCMI.
> -
> -config ARM_SCMI_POWER_DOMAIN
> -	tristate "SCMI power domain driver"
> -	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
> -	default y
> -	select PM_GENERIC_DOMAINS if PM
> -	help
> -	  This enables support for the SCMI power domains which can be
> -	  enabled or disabled via the SCP firmware
> -
> -	  This driver can also be built as a module.  If so, the module
> -	  will be called scmi_pm_domain. Note this may needed early in boot
> -	  before rootfs may be available.
> +source "drivers/firmware/arm_scmi/Kconfig"
>  
>  config ARM_SCPI_PROTOCOL
>  	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> new file mode 100644
> index 000000000000..479fc8a3533e
> --- /dev/null
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -0,0 +1,70 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +menu "ARM System Control and Management Interface Protocol"
> +
> +config ARM_SCMI_PROTOCOL
> +	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
> +	depends on ARM || ARM64 || COMPILE_TEST
> +	help
> +	  ARM System Control and Management Interface (SCMI) protocol is a
> +	  set of operating system-independent software interfaces that are
> +	  used in system management. SCMI is extensible and currently provides
> +	  interfaces for: Discovery and self-description of the interfaces
> +	  it supports, Power domain management which is the ability to place
> +	  a given device or domain into the various power-saving states that
> +	  it supports, Performance management which is the ability to control
> +	  the performance of a domain that is composed of compute engines
> +	  such as application processors and other accelerators, Clock
> +	  management which is the ability to set and inquire rates on platform
> +	  managed clocks and Sensor management which is the ability to read
> +	  sensor data, and be notified of sensor value.
> +
> +	  This protocol library provides interface for all the client drivers
> +	  making use of the features offered by the SCMI.
> +

May be you can add if condition here to remove the depends on ARM_SCMI_PROTOCOL

if ARM_SCMI_PROTOCOL

> +config ARM_SCMI_HAVE_TRANSPORT
> +	bool
> +	help
> +	  This declares whether at least one SCMI transport has been configured.
> +	  Used to trigger a build bug when trying to build SCMI without any
> +	  configured transport.
> +
> +config ARM_SCMI_TRANSPORT_MAILBOX
> +	bool "SCMI transport based on Mailbox"
> +	depends on ARM_SCMI_PROTOCOL && MAILBOX

And drop ARM_SCMI_PROTOCOL above

> +	select ARM_SCMI_HAVE_TRANSPORT
> +	default y

Do we need a user visible choice if it is always default on ?

> +	help
> +	  Enable mailbox based transport for SCMI.
> +
> +	  If you want the ARM SCMI PROTOCOL stack to include support for a
> +	  transport based on mailboxes, answer Y.
> +	  A matching DT entry will also be needed to indicate the effective
> +	  presence of this kind of transport.
> +

I would drop the above comment on matching DT.

> +config ARM_SCMI_TRANSPORT_SMC
> +	bool "SCMI transport based on SMC"
> +	depends on ARM_SCMI_PROTOCOL && HAVE_ARM_SMCCC_DISCOVERY

Ditto

> +	select ARM_SCMI_HAVE_TRANSPORT
> +	default y
> +	help
> +	  Enable SMC based transport for SCMI.
> +
> +	  If you want the ARM SCMI PROTOCOL stack to include support for a
> +	  transport based on SMC, answer Y.
> +	  A matching DT entry will also be needed to indicate the effective
> +	  presence of this kind of transport.
> +

endif #ARM_SCMI_PROTOCOL

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op
  2021-07-12 14:18 ` [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op Cristian Marussi
@ 2021-07-28 15:36   ` Sudeep Holla
  2021-07-29 16:19     ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-07-28 15:36 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:31PM +0100, Cristian Marussi wrote:
> From: Peter Hilber <peter.hilber@opensynergy.com>
> 
> Some transports are also effectively registered with other kernel subsystem
> in order to be properly probed and initialized; as a consequence such kind
> of transports, and their related devices, might still not have been probed
> and initialized at the time the main SCMI core driver is probed.
> 
> Add an optional .link_supplier() transport operation which can be used by
> the core SCMI stack to dynamically check if the transport is ready and
> dynamically link its device to the platform instance device.
>

To be precise,

s/platform instance device/SCMI platform instance device

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
  2021-07-28 14:17   ` Sudeep Holla
@ 2021-07-28 16:54     ` Cristian Marussi
  2021-08-02 10:24       ` Sudeep Holla
  0 siblings, 1 reply; 48+ messages in thread
From: Cristian Marussi @ 2021-07-28 16:54 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 03:17:46PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:22PM +0100, Cristian Marussi wrote:
> > Tokens are sequence numbers embedded in the each SCMI message header: they
> > are used to correlate commands with responses (and delayed responses), but
> > their usage and policy of selection is entirely up to the caller (usually
> > the OSPM agent), while they are completely opaque to the callee (SCMI
> > server platform) which merely copies them back from the command into the
> > response message header.
> > This also means that the platform does not, can not and should not enforce
> > any kind of policy on received messages depending on the contained sequence
> > number: platform can perfectly handle concurrent requests carrying the same
> > identifiying token if that should happen.
> > 
> > Moreover the platform is not required to produce in-order responses to
> > agent requests, the only constraint in these regards is that in case of
> > an asynchronous message the delayed response must be sent after the
> > immediate response for the synchronous part of the command transaction.
> > 
> > Currenly the SCMI stack of the OSPM agent selects a token for the egressing
> > commands picking the lowest possible number which is not already in use by
> > an existing in-flight transaction, which means, in other words, that we
> > immediately reuse any token after its transaction has completed or it has
> > timed out: this policy indeed does simplify management and lookup of tokens
> > and associated xfers.
> > 
> > Under the above assumptions and constraints, since there is really no state
> > shared between the agent and the platform to let the platform know when a
> > token and its associated message has timed out, the current policy of early
> > reuse of tokens can easily lead to the situation in which a spurious or
> > late received response (or delayed_response), related to an old stale and
> > timed out transaction, can be wrongly associated to a newer valid in-flight
> > xfer that just happens to have reused the same token.
> > 
> > This misbehaviour on such ghost responses is more easily exposed on those
> > transports that naturally have an higher level of parallelism in processing
> > multiple concurrent in-flight messages.
> >

Hi,

> 
> The term ghost is used here without any reference to what it means. That
> could make it difficult to follow if someone unaware of it is trying to
> understand.
> 

Right, I'll reword this.

> > This commit introduces a new policy of selection of tokens for the OSPM
> > agent: each new command transfer now gets the next available, monotonically
> > increasing token, until tokens are exhausted and the counter rolls over.
> > 
> > Such new policy mitigates the above issues with ghost responses since the
> > tokens are now reused as late as possible (when they roll back ideally)
> > and so it is much easier to identify such ghost responses to stale timed
> > out transactions: this also helps in simplifying the specific transports
> > implementation since stale transport messages can be easily identified
> > and discarded early on in the rx path without the need to cross check
> > their actual state with the core transport layer.
> > This mitigation is even more effective when, as is usually the case, the
> > maximum number of pending messages is capped by the platform to a much
> > lower number than the whole possible range of tokens values (2^10).
> > 
> > This internal policy change in the core SCMI transport layer is fully
> > transparent to the specific transports so it has not and should not have
> > any impact on the transports implementation.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v4 --> V5
> > - removed empirical profiling info from commit msg
> > - do NOT use monotonic tokens and pending HT for notifications (not needed)
> > - release xfer_lock later in scmi_xfer_get
> > ---
> >  drivers/firmware/arm_scmi/common.h |  25 +++
> >  drivers/firmware/arm_scmi/driver.c | 260 +++++++++++++++++++++++++----
> >  2 files changed, 249 insertions(+), 36 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 6bb734e0e3ac..2233d0a188fc 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -14,7 +14,10 @@
> >  #include <linux/device.h>
> >  #include <linux/errno.h>
> >  #include <linux/kernel.h>
> > +#include <linux/hashtable.h>
> > +#include <linux/list.h>
> >  #include <linux/module.h>
> > +#include <linux/refcount.h>
> >  #include <linux/scmi_protocol.h>
> >  #include <linux/types.h>
> >  
> > @@ -138,6 +141,10 @@ struct scmi_msg {
> >   *	buffer for the rx path as we use for the tx path.
> >   * @done: command message transmit completion event
> >   * @async_done: pointer to delayed response message received event completion
> > + * @users: A refcount to track the active users for this xfer
> > + * @pending: True for xfers added to @pending_xfers hashtable
> > + * @node: An hlist_node reference used to store this xfer, alternatively, on
> > + *	  the free list @free_xfers or in the @pending_xfers hashtable
> >   */
> >  struct scmi_xfer {
> >  	int transfer_id;
> > @@ -146,8 +153,26 @@ struct scmi_xfer {
> >  	struct scmi_msg rx;
> >  	struct completion done;
> >  	struct completion *async_done;
> > +	refcount_t users;
> > +	bool pending;
> > +	struct hlist_node node;
> >  };
> >  
> > +/*
> > + * An helper macro to lookup an xfer from the @pending_xfers hashtable
> > + * using the message sequence number token as a key.
> > + */
> > +#define XFER_FIND(__ht, __k)					\
> > +({								\
> > +	typeof(__k) k_ = __k;					\
> > +	struct scmi_xfer *xfer_ = NULL;				\
> > +								\
> > +	hash_for_each_possible((__ht), xfer_, node, k_)		\
> > +		if (xfer_->hdr.seq == k_)			\
> > +			break;					\
> > +	xfer_;							\
> > +})
> > +
> >  struct scmi_xfer_ops;
> >  
> >  /**
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index 4c77ee13b1ad..245ede223302 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/io.h>
> >  #include <linux/kernel.h>
> >  #include <linux/ktime.h>
> > +#include <linux/hashtable.h>
> >  #include <linux/list.h>
> >  #include <linux/module.h>
> >  #include <linux/of_address.h>
> > @@ -65,19 +66,29 @@ struct scmi_requested_dev {
> >  	struct list_head node;
> >  };
> >  
> > +#define SCMI_PENDING_XFERS_HT_ORDER_SZ	9
> > +
> 
> Is there any particular reason to choose half the token size as hash bucket
> size ? IOW why not 1/3 or 1/4th of it ? I would appreciate a comment here.
> I see it is mentioned in the commit log. Also is it not better to associate
> or keep it close to MSG_TOKEN_ID_MASK and associated macros.
> 

I'll move this in the proper place where associated macros are defined.

The reason for the size choice is tricky (and not sure about its value
still...so I have not commented yet :D); the ideal size of this hashtable would
be desc->max_msg so equal to the maximum number of inflight messages allowed on
the system in order to minimize (probably to zero) collisions on the hashtable:
unfortunately max_msg is only finally available at runtime time and the
kernel hashtable is statically sized by design....

I tried to play some tricks to define dynamically the size but everything falls
apart since a lot of stuff in linux/hashtable.h is based on ARRAY_SIZE() and
friends (to speedup all I suppose). Another non-fit (in my opinion)
alternative would be using relativistic hashtable (linux/rhashtable.h) but
those are definitely overkill in our case since they are hashtables that
can be resized completely at runtime while populated O_o. (with even
more overhead)

At the end the size that fits all possible in-flight messages minimizing
collisions in any possible case that I can set at compile time would be 10,
which means really 2^10 1024 HT entries (equal to MAX_MSG_TOKEN) each of which
is a struct list_head (*prev,*next 16bytes) i.e. 16KB HT: Peter pointed out
that it would be a lot of wasted space on normal systems in which max in-flight
messages are far-less than 1024 AND would not even fit in one 4Kb page, so I
reduced it to 512 entries but the best would be 256 (8) if we want to
fit in one regular 4kb page. The drawback will be a bit of HT collisions on
system with more than 256 possible and effective in-flight messages.

So, best of all would be runtime sizing of the HT, which is not possible
with current statically sized Kernel hashtable.h (even though I started looking
into that I have to admit :D), or sizing to 8 (256 entries 4k) to fit in a page,
or 10 (1024 entries 16kb) to minimize any possible collision if we do
not care about wasted memory and/or page boundary.

> >  /**
> >   * struct scmi_xfers_info - Structure to manage transfer information
> >   *
> > - * @xfer_block: Preallocated Message array
> >   * @xfer_alloc_table: Bitmap table for allocated messages.
> >   *	Index of this bitmap table is also used for message
> >   *	sequence identifier.
> >   * @xfer_lock: Protection for message allocation
> > + * @last_token: A counter to use as base to generate for monotonically
> > + *		increasing tokens.
> > + * @free_xfers: A free list for available to use xfers. It is initialized with
> > + *		a number of xfers equal to the maximum allowed in-flight
> > + *		messages.
> > + * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
> > + *		   currently in-flight messages.
> >   */
> >  struct scmi_xfers_info {
> > -	struct scmi_xfer *xfer_block;
> >  	unsigned long *xfer_alloc_table;
> >  	spinlock_t xfer_lock;
> > +	atomic_t last_token;
> 
> Can we merge this and transfer_last_id ? Let this be free running like
> transfer_last_id and just use [0-9] from this ? I don't see any point
> having 2 different monotonically increasing tokens/id.
> 

Mmm I was tempted about that, but the reason I did not was that in some
rare limit condition as you can see in the ASCII art (:O) I can find a hole in
the next available token ids so I have to skip and update last_token itself,
not sure if this could cause confusion seeing transfer_ids with holes during
tracing if I unify them.

> > +	struct hlist_head free_xfers;
> > +	DECLARE_HASHTABLE(pending_xfers, SCMI_PENDING_XFERS_HT_ORDER_SZ);
> >  };
> >  
> >  /**
> > @@ -190,45 +201,177 @@ void *scmi_notification_instance_data_get(const struct scmi_handle *handle)
> >  	return info->notify_priv;
> >  }
> >  
> > +/**
> > + * scmi_xfer_token_set  - Reserve and set new token for the xfer at hand
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer: The xfer to act upon
> > + *
> > + * Pick the next unused monotonically increasing token and set it into
> > + * xfer->hdr.seq: picking a monotonically increasing value avoids immediate
> > + * reuse of freshly completed or timed-out xfers, thus mitigating the risk
> > + * of incorrect association of a late and expired xfer with a live in-flight
> > + * transaction, both happening to re-use the same token identifier.
> > + *
> > + * Since platform is NOT required to answer our request in-order we should
> > + * account for a few rare but possible scenarios:
> > + *
> > + *  - exactly 'next_token' may be NOT available so pick xfer_id >= next_token
> > + *    using find_next_zero_bit() starting from candidate next_token bit
> > + *
> > + *  - all tokens ahead upto (MSG_TOKEN_ID_MASK - 1) are used in-flight but we
> > + *    are plenty of free tokens at start, so try a second pass using
> > + *    find_next_zero_bit() and starting from 0.
> > + *
> > + *  X = used in-flight
> > + *
> > + * Normal
> > + * ------
> > + *
> > + *		|- xfer_id picked
> > + *   -----------+----------------------------------------------------------
> > + *   | | |X|X|X| | | | | | ... ... ... ... ... ... ... ... ... ... ...|X|X|
> > + *   ----------------------------------------------------------------------
> > + *		^
> > + *		|- next_token
> > + *
> > + * Out-of-order pending at start
> > + * -----------------------------
> > + *
> > + *	  |- xfer_id picked, last_token fixed
> > + *   -----+----------------------------------------------------------------
> > + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... ... ...|X| |
> > + *   ----------------------------------------------------------------------
> > + *    ^
> > + *    |- next_token
> > + *
> > + *
> > + * Out-of-order pending at end
> > + * ---------------------------
> > + *
> > + *	  |- xfer_id picked, last_token fixed
> > + *   -----+----------------------------------------------------------------
> > + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... |X|X|X||X|X|
> > + *   ----------------------------------------------------------------------
> > + *								^
> > + *								|- next_token
> > + *
> > + * Context: Assumes to be called with @xfer_lock already acquired.
> > + *
> > + * Return: 0 on Success or error
> > + */
> > +static int scmi_xfer_token_set(struct scmi_xfers_info *minfo,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long xfer_id, next_token;
> > +
> > +	/* Pick a candidate monotonic token in range [0, MSG_TOKEN_MAX - 1] */
> > +	next_token = (atomic_inc_return(&minfo->last_token) &
> > +		      (MSG_TOKEN_MAX - 1));
> > +
> > +	/* Pick the next available xfer_id >= next_token */
> > +	xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> > +				     MSG_TOKEN_MAX, next_token);
> > +	if (xfer_id == MSG_TOKEN_MAX) {
> > +		/*
> > +		 * After heavily out-of-order responses, there are no free
> > +		 * tokens ahead, but only at start of xfer_alloc_table so
> > +		 * try again from the beginning.
> > +		 */
> > +		xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> > +					     MSG_TOKEN_MAX, 0);
> > +		/*
> > +		 * Something is wrong if we got here since there can be a
> > +		 * maximum number of (MSG_TOKEN_MAX - 1) in-flight messages
> > +		 * but we have not found any free token [0, MSG_TOKEN_MAX - 1].
> > +		 */
> > +		if (WARN_ON_ONCE(xfer_id == MSG_TOKEN_MAX))
> > +			return -ENOMEM;
> > +	}
> > +
> > +	/* Update +/- last_token accordingly if we skipped some hole */
> > +	if (xfer_id != next_token)
> > +		atomic_add((int)(xfer_id - next_token), &minfo->last_token);
> > +
> > +	/* Set in-flight */
> > +	set_bit(xfer_id, minfo->xfer_alloc_table);
> > +	xfer->hdr.seq = (u16)xfer_id;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * scmi_xfer_token_clear  - Release the token
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer: The xfer to act upon
> > + */
> > +static inline void scmi_xfer_token_clear(struct scmi_xfers_info *minfo,
> > +					 struct scmi_xfer *xfer)
> > +{
> > +	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> > +}
> > +
> >  /**
> >   * scmi_xfer_get() - Allocate one message
> >   *
> >   * @handle: Pointer to SCMI entity handle
> >   * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @set_pending: If true a monotonic token is picked and the xfer is added to
> > + *		 the pending hash table.
> >   *
> >   * Helper function which is used by various message functions that are
> >   * exposed to clients of this driver for allocating a message traffic event.
> >   *
> > - * This function can sleep depending on pending requests already in the system
> > - * for the SCMI entity. Further, this also holds a spinlock to maintain
> > - * integrity of internal data structures.
> > + * Picks an xfer from the free list @free_xfers (if any available) and, if
> > + * required, sets a monotonically increasing token and stores the inflight xfer
> > + * into the @pending_xfers hashtable for later retrieval.
> > + *
> > + * The successfully initialized xfer is refcounted.
> > + *
> > + * Context: Holds @xfer_lock while manipulating @xfer_alloc_table and
> > + *	    @free_xfers.
> >   *
> >   * Return: 0 if all went fine, else corresponding error.
> >   */
> >  static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> > -				       struct scmi_xfers_info *minfo)
> > +				       struct scmi_xfers_info *minfo,
> > +				       bool set_pending)
> >  {
> > -	u16 xfer_id;
> > +	int ret;
> > +	unsigned long flags;
> >  	struct scmi_xfer *xfer;
> > -	unsigned long flags, bit_pos;
> > -	struct scmi_info *info = handle_to_scmi_info(handle);
> >  
> > -	/* Keep the locked section as small as possible */
> >  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > -	bit_pos = find_first_zero_bit(minfo->xfer_alloc_table,
> > -				      info->desc->max_msg);
> > -	if (bit_pos == info->desc->max_msg) {
> > +	if (hlist_empty(&minfo->free_xfers)) {
> >  		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  		return ERR_PTR(-ENOMEM);
> >  	}
> > -	set_bit(bit_pos, minfo->xfer_alloc_table);
> > -	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  
> > -	xfer_id = bit_pos;
> > +	/* grab an xfer from the free_list */
> > +	xfer = hlist_entry(minfo->free_xfers.first, struct scmi_xfer, node);
> > +	hlist_del_init(&xfer->node);
> > +
> > +	if (set_pending) {
> > +		/* Pick and set monotonic token */
> > +		ret = scmi_xfer_token_set(minfo, xfer);
> > +		if (!ret) {
> > +			hash_add(minfo->pending_xfers, &xfer->node,
> > +				 xfer->hdr.seq);
> > +			xfer->pending = true;
> > +		} else {
> > +			dev_err(handle->dev,
> > +				"Failed to get monotonic token %d\n", ret);
> > +			hlist_add_head(&xfer->node, &minfo->free_xfers);
> > +			xfer = ERR_PTR(ret);
> > +		}
> > +	}
> >  
> > -	xfer = &minfo->xfer_block[xfer_id];
> > -	xfer->hdr.seq = xfer_id;
> > -	xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> > +	if (!IS_ERR(xfer)) {
> > +		refcount_set(&xfer->users, 1);
> > +		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> > +	}
> > +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  
> >  	return xfer;
> >  }
> > @@ -239,6 +382,9 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> >   * @minfo: Pointer to Tx/Rx Message management info based on channel type
> >   * @xfer: message that was reserved by scmi_xfer_get
> >   *
> > + * After refcount check, possibly release an xfer, clearing the token slot,
> > + * removing xfer from @pending_xfers and putting it back into free_xfers.
> > + *
> >   * This holds a spinlock to maintain integrity of internal data structures.
> >   */
> >  static void
> > @@ -246,16 +392,44 @@ __scmi_xfer_put(struct scmi_xfers_info *minfo, struct scmi_xfer *xfer)
> >  {
> >  	unsigned long flags;
> >  
> > -	/*
> > -	 * Keep the locked section as small as possible
> > -	 * NOTE: we might escape with smp_mb and no lock here..
> > -	 * but just be conservative and symmetric.
> > -	 */
> >  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > -	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> > +	if (refcount_dec_and_test(&xfer->users)) {
> 
> The introduction of users in this patch seems useless. I am assuming there are
> multiple users and this is to prevent some race. I was about to ask if we can
> manage without it, but seeing some additional use of it in later patches, I
> will comment later. It may still make sense to move this to later patch as
> it doesn't add anything here ?
> 

Indeed I was not sure about introducing it here, the problem that addresses is
as follows: a normal cmd flow is xfer_get/send/rx/xfer_put ... but if a command
times out on the TX path BEFORE is finally xfer_put() by the invoking protocol,
the MSG is still valid (as state) and pending for a while and if something is
received in the RX path concurrently, the transport calls scmi_rx_callback and
the risk would be that the xfer would be acquired just fine and processed by RX
(still has to be xfer_put as said) and then be suddendly vanished by the final
xfer_put on the TX path while RX is working on it. (or worst released while RX
is working on it and then picked from the free_list and reused for send while
RX is still processing it...ok bit of unlucky case I have to admit..)

So xfer->users is meant to avoid the xfer being vanished by the xfer_put() in
the TX path while rx_callback is still working on it....in such a case the
rx_callback will be allowed to process safely the xfer (even though it timed
out in TX) till the end of its RX-processing and the result will be then finally
discarded at the end with the final __scmi_xfer_put inside handle_response
that at the end releases the xfer once for all.

Using a refcount allows me to avoid bigger spinlocked critical sections (or at least
was the idea :D)

> > +		if (xfer->pending) {
> > +			scmi_xfer_token_clear(minfo, xfer);
> > +			hash_del(&xfer->node);
> > +			xfer->pending = false;
> > +		}
> > +		hlist_add_head(&xfer->node, &minfo->free_xfers);
> > +	}
> >  	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  }
> >  
> > +/**
> > + * scmi_xfer_lookup_unlocked  -  Helper to lookup an xfer_id
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer_id: Token ID to lookup in @pending_xfers
> > + *
> > + * Refcounting is untouched.
> > + *
> > + * Context: Assumes to be called with @xfer_lock already acquired.
> > + *
> > + * Return: A valid xfer on Success or error otherwise
> > + */
> > +static struct scmi_xfer *
> > +scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
> > +{
> > +	struct scmi_xfer *xfer = NULL;
> > +
> > +	if (xfer_id >= MSG_TOKEN_MAX)
> > +		return ERR_PTR(-EINVAL);
> > +
> 
> Is this really needed ? I guess we always use MSG_XTRACT_TOKEN(hdr) to
> fetch the xfer_id, no ?
> 

Right, indeed comes from a MSG_XTRACT_TOKEN, I'll drop it.

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback
  2021-07-28 14:26   ` Sudeep Holla
@ 2021-07-28 17:25     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-28 17:25 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 03:26:25PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:24PM +0100, Cristian Marussi wrote:
> > Add a new opaque void *priv parameter to scmi_rx_callback which can be
> > optionally provided by the transport layer when invoking scmi_rx_callback
> > and that will be passed back to the transport layer in xfer->priv.
> > 
> > This can be used by transports that needs to keep track of their specific
> > data structures together with the valid xfers.
> >

Hi,

> 
> This change looks simple but doesn't make sense on its own. I assume you don't
> want to add all these in the patch making use of priv which makes sense. Not
> sure if the next patch uses it, but I prefer to keep them as close as possible
> if it is not too much of a hassle.
> 

Right, yes I'll move it close to 17/17 which is the actual user of this
at the end.

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional
  2021-07-28 14:34   ` Sudeep Holla
@ 2021-07-28 17:41     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-28 17:41 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 03:34:18PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:26PM +0100, Cristian Marussi wrote:
> > Add a check for the presence of .poll_done transport operation so that
> > transports that do not need to support polling mode have no need to provide
> > a dummy .poll_done callback either and polling mode can be disabled in the
> > SCMI core for that tranport.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---

Hi,

> >  drivers/firmware/arm_scmi/driver.c | 43 ++++++++++++++++++------------
> >  1 file changed, 26 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index a952b6527b8a..4183d25c9289 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -777,25 +777,34 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
> >  	}
> >  
> >  	if (xfer->hdr.poll_completion) {
> > -		ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
> > -
> > -		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
> > -
> > -		if (ktime_before(ktime_get(), stop)) {
> > -			unsigned long flags;
> > -
> > -			/*
> > -			 * Do not fetch_response if an out-of-order delayed
> > -			 * response is being processed.
> > -			 */
> > -			spin_lock_irqsave(&xfer->lock, flags);
> > -			if (xfer->state == SCMI_XFER_SENT_OK) {
> > -				info->desc->ops->fetch_response(cinfo, xfer);
> > -				xfer->state = SCMI_XFER_RESP_OK;
> > +		if (info->desc->ops->poll_done) {
> > +			ktime_t stop = ktime_add_ns(ktime_get(),
> > +						    SCMI_MAX_POLL_TO_NS);
> > +
> > +			spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer,
> > +								  stop));
> > +
> > +			if (ktime_before(ktime_get(), stop)) {
> > +				unsigned long flags;
> > +
> > +				/*
> > +				 * Do not fetch_response if an out-of-order delayed
> > +				 * response is being processed.
> > +				 */
> > +				spin_lock_irqsave(&xfer->lock, flags);
> > +				if (xfer->state == SCMI_XFER_SENT_OK) {
> > +					info->desc->ops->fetch_response(cinfo,
> > +									xfer);
> > +					xfer->state = SCMI_XFER_RESP_OK;
> > +				}
> > +				spin_unlock_irqrestore(&xfer->lock, flags);
> > +			} else {
> > +				ret = -ETIMEDOUT;
> >  			}
> > -			spin_unlock_irqrestore(&xfer->lock, flags);
> >  		} else {
> > -			ret = -ETIMEDOUT;
> > +			dev_warn_once(dev,
> > +				      "Polling mode is not supported by transport.\n");
> > +			ret = EINVAL;
> 
> Can't we just return this error as early as possible if the user isn't
> expected to use polling with this transport ? That would simplify the patch
> (as most of it is due to indentation which can go away) as you need not
> check it later ?
> 

Yes, indeed at first it was something like

if (xfer->hdr.poll_completion && !info->desc->ops->poll_done)
	return -EINVAL;

at the very beginning of do_xfer(), even before attempting to send
anything...and I liked much more but then I thought I would have run such
if-test for each and every command do_xfer() attempted...but maybe it's
just irrelevant and instead much more tidy if done as above.

I'll fix it as above.

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable
  2021-07-28 14:50   ` Sudeep Holla
@ 2021-07-29 16:18     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-29 16:18 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 03:50:33PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:27PM +0100, Cristian Marussi wrote:
> > Add configuration options to be able to select which SCMI transports have
> > to be compiled into the SCMI stack.
> > 
> > Mailbox and SMC are by default enabled if their related dependencies are
> > satisfied.
> > 
> > While doing that move all SCMI related config options in their own
> > dedicated submenu.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > Used a BUILD_BUG_ON() to avoid the scenario where SCMI is configured
> > without any transport. Coul dnot do in any other way in Kconfig due to
> > circular dependencies.
> > 
> > This will be neeed later on to add new Virtio based transport and
> > optionally exclude other transports.
> > ---

Hi Sudeep,

> >  drivers/firmware/Kconfig           | 34 +--------------
> >  drivers/firmware/arm_scmi/Kconfig  | 70 ++++++++++++++++++++++++++++++
> >  drivers/firmware/arm_scmi/Makefile |  4 +-
> >  drivers/firmware/arm_scmi/common.h |  4 +-
> >  drivers/firmware/arm_scmi/driver.c |  6 ++-
> >  5 files changed, 80 insertions(+), 38 deletions(-)
> >  create mode 100644 drivers/firmware/arm_scmi/Kconfig
> > 
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 1db738d5b301..8d41f73f5395 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -6,39 +6,7 @@
> >  
> >  menu "Firmware Drivers"
> >  
> > -config ARM_SCMI_PROTOCOL
> > -	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
> > -	depends on ARM || ARM64 || COMPILE_TEST
> > -	depends on MAILBOX || HAVE_ARM_SMCCC_DISCOVERY
> > -	help
> > -	  ARM System Control and Management Interface (SCMI) protocol is a
> > -	  set of operating system-independent software interfaces that are
> > -	  used in system management. SCMI is extensible and currently provides
> > -	  interfaces for: Discovery and self-description of the interfaces
> > -	  it supports, Power domain management which is the ability to place
> > -	  a given device or domain into the various power-saving states that
> > -	  it supports, Performance management which is the ability to control
> > -	  the performance of a domain that is composed of compute engines
> > -	  such as application processors and other accelerators, Clock
> > -	  management which is the ability to set and inquire rates on platform
> > -	  managed clocks and Sensor management which is the ability to read
> > -	  sensor data, and be notified of sensor value.
> > -
> > -	  This protocol library provides interface for all the client drivers
> > -	  making use of the features offered by the SCMI.
> > -
> > -config ARM_SCMI_POWER_DOMAIN
> > -	tristate "SCMI power domain driver"
> > -	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
> > -	default y
> > -	select PM_GENERIC_DOMAINS if PM
> > -	help
> > -	  This enables support for the SCMI power domains which can be
> > -	  enabled or disabled via the SCP firmware
> > -
> > -	  This driver can also be built as a module.  If so, the module
> > -	  will be called scmi_pm_domain. Note this may needed early in boot
> > -	  before rootfs may be available.
> > +source "drivers/firmware/arm_scmi/Kconfig"
> >  
> >  config ARM_SCPI_PROTOCOL
> >  	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > new file mode 100644
> > index 000000000000..479fc8a3533e
> > --- /dev/null
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -0,0 +1,70 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +menu "ARM System Control and Management Interface Protocol"
> > +
> > +config ARM_SCMI_PROTOCOL
> > +	tristate "ARM System Control and Management Interface (SCMI) Message Protocol"
> > +	depends on ARM || ARM64 || COMPILE_TEST
> > +	help
> > +	  ARM System Control and Management Interface (SCMI) protocol is a
> > +	  set of operating system-independent software interfaces that are
> > +	  used in system management. SCMI is extensible and currently provides
> > +	  interfaces for: Discovery and self-description of the interfaces
> > +	  it supports, Power domain management which is the ability to place
> > +	  a given device or domain into the various power-saving states that
> > +	  it supports, Performance management which is the ability to control
> > +	  the performance of a domain that is composed of compute engines
> > +	  such as application processors and other accelerators, Clock
> > +	  management which is the ability to set and inquire rates on platform
> > +	  managed clocks and Sensor management which is the ability to read
> > +	  sensor data, and be notified of sensor value.
> > +
> > +	  This protocol library provides interface for all the client drivers
> > +	  making use of the features offered by the SCMI.
> > +
> 
> May be you can add if condition here to remove the depends on ARM_SCMI_PROTOCOL
> 
> if ARM_SCMI_PROTOCOL
> 

I'll do.

> > +config ARM_SCMI_HAVE_TRANSPORT
> > +	bool
> > +	help
> > +	  This declares whether at least one SCMI transport has been configured.
> > +	  Used to trigger a build bug when trying to build SCMI without any
> > +	  configured transport.
> > +
> > +config ARM_SCMI_TRANSPORT_MAILBOX
> > +	bool "SCMI transport based on Mailbox"
> > +	depends on ARM_SCMI_PROTOCOL && MAILBOX
> 
> And drop ARM_SCMI_PROTOCOL above
> 

Ditto

> > +	select ARM_SCMI_HAVE_TRANSPORT
> > +	default y
> 
> Do we need a user visible choice if it is always default on ?
> 

I'll leave for now as said offline to be able to not build unsupported
transport.

> > +	help
> > +	  Enable mailbox based transport for SCMI.
> > +
> > +	  If you want the ARM SCMI PROTOCOL stack to include support for a
> > +	  transport based on mailboxes, answer Y.
> > +	  A matching DT entry will also be needed to indicate the effective
> > +	  presence of this kind of transport.
> > +
> 
> I would drop the above comment on matching DT.
> 

Yes right, I was in doubt in fact if it was sensible place for that
comment.

> > +config ARM_SCMI_TRANSPORT_SMC
> > +	bool "SCMI transport based on SMC"
> > +	depends on ARM_SCMI_PROTOCOL && HAVE_ARM_SMCCC_DISCOVERY
> 
> Ditto
> 

DittoAck :D

> > +	select ARM_SCMI_HAVE_TRANSPORT
> > +	default y
> > +	help
> > +	  Enable SMC based transport for SCMI.
> > +
> > +	  If you want the ARM SCMI PROTOCOL stack to include support for a
> > +	  transport based on SMC, answer Y.
> > +	  A matching DT entry will also be needed to indicate the effective
> > +	  presence of this kind of transport.
> > +
> 
> endif #ARM_SCMI_PROTOCOL
> 

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op
  2021-07-28 15:36   ` Sudeep Holla
@ 2021-07-29 16:19     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-07-29 16:19 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 04:36:47PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:31PM +0100, Cristian Marussi wrote:
> > From: Peter Hilber <peter.hilber@opensynergy.com>
> > 
> > Some transports are also effectively registered with other kernel subsystem
> > in order to be properly probed and initialized; as a consequence such kind
> > of transports, and their related devices, might still not have been probed
> > and initialized at the time the main SCMI core driver is probed.
> > 
> > Add an optional .link_supplier() transport operation which can be used by
> > the core SCMI stack to dynamically check if the transport is ready and
> > dynamically link its device to the platform instance device.
> >
> 
> To be precise,
> 
> s/platform instance device/SCMI platform instance device

Yes I'll fix.

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi
  2021-07-15 16:36   ` Peter Hilber
@ 2021-08-02 10:10   ` Sudeep Holla
  2021-08-02 10:27     ` Cristian Marussi
  1 sibling, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-08-02 10:10 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Sudeep Holla, Andriy.Tryshnivskyy

On Mon, Jul 12, 2021 at 03:18:23PM +0100, Cristian Marussi wrote:
> Even though in case of asynchronous commands an SCMI platform server is

Drop the term "server"

> constrained to emit the delayed response message only after the related
> message response has been sent, the configured underlying transport could
> still deliver such messages together or in inverted order, causing races
> due to the concurrent or out-of-order access to the underlying xfer.
> 
> Introduce a mechanism to grant exclusive access to an xfer in order to
> properly serialize concurrent accesses to the same xfer originating from
> multiple correlated messages.
> 
> Add additional state information to xfer descriptors so as to be able to
> identify out-of-order message deliveries and act accordingly:
> 
>  - when a delayed response is expected but delivered before the related
>    response, the synchronous response is considered as successfully
>    received and the delayed response processing is carried on as usual.
> 
>  - when/if the missing synchronous response is subsequently received, it
>    is discarded as not congruent with the current state of the xfer, or
>    simply, because the xfer has been already released and so, now, the
>    monotonically increasing sequence number carried by the late response
>    is stale.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v5 --> v6
> - added spinlock comment
> ---
>  drivers/firmware/arm_scmi/common.h |  18 ++-
>  drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++-----
>  2 files changed, 212 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 2233d0a188fc..9efebe1406d2 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -19,6 +19,7 @@
>  #include <linux/module.h>
>  #include <linux/refcount.h>
>  #include <linux/scmi_protocol.h>
> +#include <linux/spinlock.h>
>  #include <linux/types.h>
>  
>  #include <asm/unaligned.h>
> @@ -145,6 +146,13 @@ struct scmi_msg {
>   * @pending: True for xfers added to @pending_xfers hashtable
>   * @node: An hlist_node reference used to store this xfer, alternatively, on
>   *	  the free list @free_xfers or in the @pending_xfers hashtable
> + * @busy: An atomic flag to ensure exclusive write access to this xfer
> + * @state: The current state of this transfer, with states transitions deemed
> + *	   valid being:
> + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ]
> + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
> + *	      (Missing synchronous response is assumed OK and ignored)
> + * @lock: A spinlock to protect state and busy fields.
>   */
>  struct scmi_xfer {
>  	int transfer_id;
> @@ -156,6 +164,15 @@ struct scmi_xfer {
>  	refcount_t users;
>  	bool pending;
>  	struct hlist_node node;
> +#define SCMI_XFER_FREE		0
> +#define SCMI_XFER_BUSY		1
> +	atomic_t busy;
> +#define SCMI_XFER_SENT_OK	0
> +#define SCMI_XFER_RESP_OK	1
> +#define SCMI_XFER_DRESP_OK	2
> +	int state;
> +	/* A lock to protect state and busy fields */
> +	spinlock_t lock;
>  };
>  
>  /*
> @@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
>  void scmi_notification_instance_data_set(const struct scmi_handle *handle,
>  					 void *priv);
>  void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> -
>  #endif /* _SCMI_COMMON_H */
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index 245ede223302..5ef33d692670 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
>  
>  	if (!IS_ERR(xfer)) {
>  		refcount_set(&xfer->users, 1);
> +		atomic_set(&xfer->busy, SCMI_XFER_FREE);
>  		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
>  	}
>  	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> @@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
>  	return xfer ?: ERR_PTR(-EINVAL);
>  }
>  
> +/**
> + * scmi_msg_response_validate  - Validate message type against state of related
> + * xfer
> + *
> + * @cinfo: A reference to the channel descriptor.
> + * @msg_type: Message type to check
> + * @xfer: A reference to the xfer to validate against @msg_type
> + *
> + * This function checks if @msg_type is congruent with the current state of
> + * a pending @xfer; if an asynchronous delayed response is received before the
> + * related synchronous response (Out-of-Order Delayed Response) the missing
> + * synchronous response is assumed to be OK and completed, carrying on with the
> + * Delayed Response: this is done to address the case in which the underlying
> + * SCMI transport can deliver such out-of-order responses.
> + *
> + * Context: Assumes to be called with xfer->lock already acquired.
> + *
> + * Return: 0 on Success, error otherwise
> + */
> +static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo,
> +					     u8 msg_type,
> +					     struct scmi_xfer *xfer)
> +{
> +	/*
> +	 * Even if a response was indeed expected on this slot at this point,
> +	 * a buggy platform could wrongly reply feeding us an unexpected
> +	 * delayed response we're not prepared to handle: bail-out safely
> +	 * blaming firmware.
> +	 */
> +	if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) {
> +		dev_err(cinfo->dev,
> +			"Delayed Response for %d not expected! Buggy F/W ?\n",
> +			xfer->hdr.seq);
> +		return -EINVAL;
> +	}
> +
> +	switch (xfer->state) {
> +	case SCMI_XFER_SENT_OK:
> +		if (msg_type == MSG_TYPE_DELAYED_RESP) {
> +			/*
> +			 * Delayed Response expected but delivered earlier.
> +			 * Assume message RESPONSE was OK and skip state.
> +			 */
> +			xfer->hdr.status = SCMI_SUCCESS;
> +			xfer->state = SCMI_XFER_RESP_OK;
> +			complete(&xfer->done);
> +			dev_warn(cinfo->dev,
> +				 "Received valid OoO Delayed Response for %d\n",
> +				 xfer->hdr.seq);
> +		}
> +		break;
> +	case SCMI_XFER_RESP_OK:
> +		if (msg_type != MSG_TYPE_DELAYED_RESP)
> +			return -EINVAL;
> +		break;
> +	case SCMI_XFER_DRESP_OK:
> +		/* No further message expected once in SCMI_XFER_DRESP_OK */

Do we really need this case ? If so, how can this happen.

> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool scmi_xfer_is_free(struct scmi_xfer *xfer)
> +{
> +	int ret;
> +
> +	ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY);
> +
> +	return ret == SCMI_XFER_FREE;
> +}
> +
> +/**
> + * scmi_xfer_command_acquire  -  Helper to lookup and acquire a command xfer
> + *
> + * @cinfo: A reference to the channel descriptor.
> + * @msg_hdr: A message header to use as lookup key
> + *
> + * When a valid xfer is found for the sequence number embedded in the provided
> + * msg_hdr, reference counting is properly updated and exclusive access to this
> + * xfer is granted till released with @scmi_xfer_command_release.
> + *
> + * Return: A valid @xfer on Success or error otherwise.
> + */
> +static inline struct scmi_xfer *
> +scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr)
> +{
> +	int ret;
> +	unsigned long flags;
> +	struct scmi_xfer *xfer;
> +	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> +	struct scmi_xfers_info *minfo = &info->tx_minfo;
> +	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
> +	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
> +
> +	/* Are we even expecting this? */
> +	spin_lock_irqsave(&minfo->xfer_lock, flags);
> +	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> +	if (IS_ERR(xfer)) {
> +		dev_err(cinfo->dev,
> +			"Message for %d type %d is not expected!\n",
> +			xfer_id, msg_type);
> +		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +		return xfer;
> +	}
> +	refcount_inc(&xfer->users);
> +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> +
> +	spin_lock_irqsave(&xfer->lock, flags);
> +	ret = scmi_msg_response_validate(cinfo, msg_type, xfer);
> +	/*
> +	 * If a pending xfer was found which was also in a congruent state with
> +	 * the received message, acquire exclusive access to it setting the busy
> +	 * flag.
> +	 * Spins only on the rare limit condition of concurrent reception of
> +	 * RESP and DRESP for the same xfer.
> +	 */
> +	if (!ret) {
> +		spin_until_cond(scmi_xfer_is_free(xfer));

I agree with the discussion between you and Peter around this, so I assume
it will be renamed or handled accordingly.

> +		xfer->hdr.type = msg_type;
> +	}
> +	spin_unlock_irqrestore(&xfer->lock, flags);
> +
> +	if (ret) {
> +		dev_err(cinfo->dev,
> +			"Invalid message type:%d for %d - HDR:0x%X  state:%d\n",
> +			msg_type, xfer_id, msg_hdr, xfer->state);
> +		/* On error the refcount incremented above has to be dropped */
> +		__scmi_xfer_put(minfo, xfer);
> +		xfer = ERR_PTR(-EINVAL);
> +	}
> +
> +	return xfer;
> +}
> +
> +static inline void scmi_xfer_command_release(struct scmi_info *info,
> +					     struct scmi_xfer *xfer)
> +{
> +	atomic_set(&xfer->busy, SCMI_XFER_FREE);
> +	__scmi_xfer_put(&info->tx_minfo, xfer);
> +}
> +
> +/**
> + * scmi_xfer_state_update  - Update xfer state
> + *
> + * @xfer: A reference to the xfer to update
> + *
> + * Context: Assumes to be called on an xfer exclusively acquired using the
> + *	    busy flag.
> + */
> +static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
> +{
> +	switch (xfer->hdr.type) {
> +	case MSG_TYPE_COMMAND:
> +		xfer->state = SCMI_XFER_RESP_OK;
> +		break;
> +	case MSG_TYPE_DELAYED_RESP:
> +		xfer->state = SCMI_XFER_DRESP_OK;
> +		break;
> +	}
> +}

Can't this be if () ..  else if(), switch sounds unnecessary for 2 conditions.

Other than the things already discussed with you and Peter, don't have much to
add ATM. I may look at this with fresh eyes once again in the next version.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
  2021-07-28 16:54     ` Cristian Marussi
@ 2021-08-02 10:24       ` Sudeep Holla
  2021-08-03 12:52         ` Cristian Marussi
  0 siblings, 1 reply; 48+ messages in thread
From: Sudeep Holla @ 2021-08-02 10:24 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Wed, Jul 28, 2021 at 05:54:30PM +0100, Cristian Marussi wrote:
> On Wed, Jul 28, 2021 at 03:17:46PM +0100, Sudeep Holla wrote:
> > On Mon, Jul 12, 2021 at 03:18:22PM +0100, Cristian Marussi wrote:

[...]

> > >  
> > > +#define SCMI_PENDING_XFERS_HT_ORDER_SZ	9
> > > +
> > 
> > Is there any particular reason to choose half the token size as hash bucket
> > size ? IOW why not 1/3 or 1/4th of it ? I would appreciate a comment here.
> > I see it is mentioned in the commit log. Also is it not better to associate
> > or keep it close to MSG_TOKEN_ID_MASK and associated macros.
> > 
> 
> I'll move this in the proper place where associated macros are defined.
> 
> The reason for the size choice is tricky (and not sure about its value
> still...so I have not commented yet :D); the ideal size of this hashtable would
> be desc->max_msg so equal to the maximum number of inflight messages allowed on
> the system in order to minimize (probably to zero) collisions on the hashtable:
> unfortunately max_msg is only finally available at runtime time and the
> kernel hashtable is statically sized by design....
> 
> I tried to play some tricks to define dynamically the size but everything falls
> apart since a lot of stuff in linux/hashtable.h is based on ARRAY_SIZE() and
> friends (to speedup all I suppose). Another non-fit (in my opinion)
> alternative would be using relativistic hashtable (linux/rhashtable.h) but
> those are definitely overkill in our case since they are hashtables that
> can be resized completely at runtime while populated O_o. (with even
> more overhead)
> 
> At the end the size that fits all possible in-flight messages minimizing
> collisions in any possible case that I can set at compile time would be 10,
> which means really 2^10 1024 HT entries (equal to MAX_MSG_TOKEN) each of which
> is a struct list_head (*prev,*next 16bytes) i.e. 16KB HT: Peter pointed out
> that it would be a lot of wasted space on normal systems in which max in-flight
> messages are far-less than 1024 AND would not even fit in one 4Kb page, so I
> reduced it to 512 entries but the best would be 256 (8) if we want to
> fit in one regular 4kb page. The drawback will be a bit of HT collisions on
> system with more than 256 possible and effective in-flight messages.
>

I agree, 256 should be fine for now. Just add a note that it is chosen to
fit a page and can be updated if required.


> > >  /**
> > >   * struct scmi_xfers_info - Structure to manage transfer information
> > >   *
> > > - * @xfer_block: Preallocated Message array
> > >   * @xfer_alloc_table: Bitmap table for allocated messages.
> > >   *	Index of this bitmap table is also used for message
> > >   *	sequence identifier.
> > >   * @xfer_lock: Protection for message allocation
> > > + * @last_token: A counter to use as base to generate for monotonically
> > > + *		increasing tokens.
> > > + * @free_xfers: A free list for available to use xfers. It is initialized with
> > > + *		a number of xfers equal to the maximum allowed in-flight
> > > + *		messages.
> > > + * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
> > > + *		   currently in-flight messages.
> > >   */
> > >  struct scmi_xfers_info {
> > > -	struct scmi_xfer *xfer_block;
> > >  	unsigned long *xfer_alloc_table;
> > >  	spinlock_t xfer_lock;
> > > +	atomic_t last_token;
> > 
> > Can we merge this and transfer_last_id ? Let this be free running like
> > transfer_last_id and just use [0-9] from this ? I don't see any point
> > having 2 different monotonically increasing tokens/id.
> > 
> 
> Mmm I was tempted about that, but the reason I did not was that in some
> rare limit condition as you can see in the ASCII art (:O) I can find a hole in
> the next available token ids so I have to skip and update last_token itself,
> not sure if this could cause confusion seeing transfer_ids with holes during
> tracing if I unify them.
>

That should be fine as it won't be used at all.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages
  2021-08-02 10:10   ` Sudeep Holla
@ 2021-08-02 10:27     ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-08-02 10:27 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Mon, Aug 02, 2021 at 11:10:32AM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:23PM +0100, Cristian Marussi wrote:
> > Even though in case of asynchronous commands an SCMI platform server is
> 
> Drop the term "server"
> 

Sure.

> > constrained to emit the delayed response message only after the related
> > message response has been sent, the configured underlying transport could
> > still deliver such messages together or in inverted order, causing races
> > due to the concurrent or out-of-order access to the underlying xfer.
> > 
> > Introduce a mechanism to grant exclusive access to an xfer in order to
> > properly serialize concurrent accesses to the same xfer originating from
> > multiple correlated messages.
> > 
> > Add additional state information to xfer descriptors so as to be able to
> > identify out-of-order message deliveries and act accordingly:
> > 
> >  - when a delayed response is expected but delivered before the related
> >    response, the synchronous response is considered as successfully
> >    received and the delayed response processing is carried on as usual.
> > 
> >  - when/if the missing synchronous response is subsequently received, it
> >    is discarded as not congruent with the current state of the xfer, or
> >    simply, because the xfer has been already released and so, now, the
> >    monotonically increasing sequence number carried by the late response
> >    is stale.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v5 --> v6
> > - added spinlock comment
> > ---
> >  drivers/firmware/arm_scmi/common.h |  18 ++-
> >  drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++-----
> >  2 files changed, 212 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 2233d0a188fc..9efebe1406d2 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -19,6 +19,7 @@
> >  #include <linux/module.h>
> >  #include <linux/refcount.h>
> >  #include <linux/scmi_protocol.h>
> > +#include <linux/spinlock.h>
> >  #include <linux/types.h>
> >  
> >  #include <asm/unaligned.h>
> > @@ -145,6 +146,13 @@ struct scmi_msg {
> >   * @pending: True for xfers added to @pending_xfers hashtable
> >   * @node: An hlist_node reference used to store this xfer, alternatively, on
> >   *	  the free list @free_xfers or in the @pending_xfers hashtable
> > + * @busy: An atomic flag to ensure exclusive write access to this xfer
> > + * @state: The current state of this transfer, with states transitions deemed
> > + *	   valid being:
> > + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ]
> > + *	    - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK
> > + *	      (Missing synchronous response is assumed OK and ignored)
> > + * @lock: A spinlock to protect state and busy fields.
> >   */
> >  struct scmi_xfer {
> >  	int transfer_id;
> > @@ -156,6 +164,15 @@ struct scmi_xfer {
> >  	refcount_t users;
> >  	bool pending;
> >  	struct hlist_node node;
> > +#define SCMI_XFER_FREE		0
> > +#define SCMI_XFER_BUSY		1
> > +	atomic_t busy;
> > +#define SCMI_XFER_SENT_OK	0
> > +#define SCMI_XFER_RESP_OK	1
> > +#define SCMI_XFER_DRESP_OK	2
> > +	int state;
> > +	/* A lock to protect state and busy fields */
> > +	spinlock_t lock;
> >  };
> >  
> >  /*
> > @@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
> >  void scmi_notification_instance_data_set(const struct scmi_handle *handle,
> >  					 void *priv);
> >  void *scmi_notification_instance_data_get(const struct scmi_handle *handle);
> > -
> >  #endif /* _SCMI_COMMON_H */
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index 245ede223302..5ef33d692670 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> >  
> >  	if (!IS_ERR(xfer)) {
> >  		refcount_set(&xfer->users, 1);
> > +		atomic_set(&xfer->busy, SCMI_XFER_FREE);
> >  		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> >  	}
> >  	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > @@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
> >  	return xfer ?: ERR_PTR(-EINVAL);
> >  }
> >  
> > +/**
> > + * scmi_msg_response_validate  - Validate message type against state of related
> > + * xfer
> > + *
> > + * @cinfo: A reference to the channel descriptor.
> > + * @msg_type: Message type to check
> > + * @xfer: A reference to the xfer to validate against @msg_type
> > + *
> > + * This function checks if @msg_type is congruent with the current state of
> > + * a pending @xfer; if an asynchronous delayed response is received before the
> > + * related synchronous response (Out-of-Order Delayed Response) the missing
> > + * synchronous response is assumed to be OK and completed, carrying on with the
> > + * Delayed Response: this is done to address the case in which the underlying
> > + * SCMI transport can deliver such out-of-order responses.
> > + *
> > + * Context: Assumes to be called with xfer->lock already acquired.
> > + *
> > + * Return: 0 on Success, error otherwise
> > + */
> > +static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo,
> > +					     u8 msg_type,
> > +					     struct scmi_xfer *xfer)
> > +{
> > +	/*
> > +	 * Even if a response was indeed expected on this slot at this point,
> > +	 * a buggy platform could wrongly reply feeding us an unexpected
> > +	 * delayed response we're not prepared to handle: bail-out safely
> > +	 * blaming firmware.
> > +	 */
> > +	if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) {
> > +		dev_err(cinfo->dev,
> > +			"Delayed Response for %d not expected! Buggy F/W ?\n",
> > +			xfer->hdr.seq);
> > +		return -EINVAL;
> > +	}
> > +
> > +	switch (xfer->state) {
> > +	case SCMI_XFER_SENT_OK:
> > +		if (msg_type == MSG_TYPE_DELAYED_RESP) {
> > +			/*
> > +			 * Delayed Response expected but delivered earlier.
> > +			 * Assume message RESPONSE was OK and skip state.
> > +			 */
> > +			xfer->hdr.status = SCMI_SUCCESS;
> > +			xfer->state = SCMI_XFER_RESP_OK;
> > +			complete(&xfer->done);
> > +			dev_warn(cinfo->dev,
> > +				 "Received valid OoO Delayed Response for %d\n",
> > +				 xfer->hdr.seq);
> > +		}
> > +		break;
> > +	case SCMI_XFER_RESP_OK:
> > +		if (msg_type != MSG_TYPE_DELAYED_RESP)
> > +			return -EINVAL;
> > +		break;
> > +	case SCMI_XFER_DRESP_OK:
> > +		/* No further message expected once in SCMI_XFER_DRESP_OK */
> 
> Do we really need this case ? If so, how can this happen.
> 

Given that I am checking for state validity I thought to account also
for the case of possible (even though rare) duplicated delayed response.

> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static bool scmi_xfer_is_free(struct scmi_xfer *xfer)
> > +{
> > +	int ret;
> > +
> > +	ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY);
> > +
> > +	return ret == SCMI_XFER_FREE;
> > +}
> > +
> > +/**
> > + * scmi_xfer_command_acquire  -  Helper to lookup and acquire a command xfer
> > + *
> > + * @cinfo: A reference to the channel descriptor.
> > + * @msg_hdr: A message header to use as lookup key
> > + *
> > + * When a valid xfer is found for the sequence number embedded in the provided
> > + * msg_hdr, reference counting is properly updated and exclusive access to this
> > + * xfer is granted till released with @scmi_xfer_command_release.
> > + *
> > + * Return: A valid @xfer on Success or error otherwise.
> > + */
> > +static inline struct scmi_xfer *
> > +scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr)
> > +{
> > +	int ret;
> > +	unsigned long flags;
> > +	struct scmi_xfer *xfer;
> > +	struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
> > +	struct scmi_xfers_info *minfo = &info->tx_minfo;
> > +	u8 msg_type = MSG_XTRACT_TYPE(msg_hdr);
> > +	u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr);
> > +
> > +	/* Are we even expecting this? */
> > +	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > +	xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
> > +	if (IS_ERR(xfer)) {
> > +		dev_err(cinfo->dev,
> > +			"Message for %d type %d is not expected!\n",
> > +			xfer_id, msg_type);
> > +		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > +		return xfer;
> > +	}
> > +	refcount_inc(&xfer->users);
> > +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> > +
> > +	spin_lock_irqsave(&xfer->lock, flags);
> > +	ret = scmi_msg_response_validate(cinfo, msg_type, xfer);
> > +	/*
> > +	 * If a pending xfer was found which was also in a congruent state with
> > +	 * the received message, acquire exclusive access to it setting the busy
> > +	 * flag.
> > +	 * Spins only on the rare limit condition of concurrent reception of
> > +	 * RESP and DRESP for the same xfer.
> > +	 */
> > +	if (!ret) {
> > +		spin_until_cond(scmi_xfer_is_free(xfer));
> 
> I agree with the discussion between you and Peter around this, so I assume
> it will be renamed or handled accordingly.
> 

Ok I'll rename it.

> > +		xfer->hdr.type = msg_type;
> > +	}
> > +	spin_unlock_irqrestore(&xfer->lock, flags);
> > +
> > +	if (ret) {
> > +		dev_err(cinfo->dev,
> > +			"Invalid message type:%d for %d - HDR:0x%X  state:%d\n",
> > +			msg_type, xfer_id, msg_hdr, xfer->state);
> > +		/* On error the refcount incremented above has to be dropped */
> > +		__scmi_xfer_put(minfo, xfer);
> > +		xfer = ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	return xfer;
> > +}
> > +
> > +static inline void scmi_xfer_command_release(struct scmi_info *info,
> > +					     struct scmi_xfer *xfer)
> > +{
> > +	atomic_set(&xfer->busy, SCMI_XFER_FREE);
> > +	__scmi_xfer_put(&info->tx_minfo, xfer);
> > +}
> > +
> > +/**
> > + * scmi_xfer_state_update  - Update xfer state
> > + *
> > + * @xfer: A reference to the xfer to update
> > + *
> > + * Context: Assumes to be called on an xfer exclusively acquired using the
> > + *	    busy flag.
> > + */
> > +static inline void scmi_xfer_state_update(struct scmi_xfer *xfer)
> > +{
> > +	switch (xfer->hdr.type) {
> > +	case MSG_TYPE_COMMAND:
> > +		xfer->state = SCMI_XFER_RESP_OK;
> > +		break;
> > +	case MSG_TYPE_DELAYED_RESP:
> > +		xfer->state = SCMI_XFER_DRESP_OK;
> > +		break;
> > +	}
> > +}
> 
> Can't this be if () ..  else if(), switch sounds unnecessary for 2 conditions.
> 

Yes indeed I'll rework in V7.

> Other than the things already discussed with you and Peter, don't have much to
> add ATM. I may look at this with fresh eyes once again in the next version.
> 

Thanks for the review.

Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
  2021-08-02 10:24       ` Sudeep Holla
@ 2021-08-03 12:52         ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-08-03 12:52 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin, peter.hilber,
	alex.bennee, jean-philippe, mikhail.golubev, anton.yakovlev,
	Vasyl.Vavrychuk, Andriy.Tryshnivskyy

On Mon, Aug 02, 2021 at 11:24:25AM +0100, Sudeep Holla wrote:
> On Wed, Jul 28, 2021 at 05:54:30PM +0100, Cristian Marussi wrote:
> > On Wed, Jul 28, 2021 at 03:17:46PM +0100, Sudeep Holla wrote:
> > > On Mon, Jul 12, 2021 at 03:18:22PM +0100, Cristian Marussi wrote:
> 
> [...]
> 

Hi Sudeep,

I'm goignt to post V7 with all the remarks on the series from you and Peter
addressed.  A few notes beow on this patch.

> > > >  
> > > > +#define SCMI_PENDING_XFERS_HT_ORDER_SZ	9
> > > > +
> > > 
> > > Is there any particular reason to choose half the token size as hash bucket
> > > size ? IOW why not 1/3 or 1/4th of it ? I would appreciate a comment here.
> > > I see it is mentioned in the commit log. Also is it not better to associate
> > > or keep it close to MSG_TOKEN_ID_MASK and associated macros.
> > > 
> > 
> > I'll move this in the proper place where associated macros are defined.
> > 
> > The reason for the size choice is tricky (and not sure about its value
> > still...so I have not commented yet :D); the ideal size of this hashtable would
> > be desc->max_msg so equal to the maximum number of inflight messages allowed on
> > the system in order to minimize (probably to zero) collisions on the hashtable:
> > unfortunately max_msg is only finally available at runtime time and the
> > kernel hashtable is statically sized by design....
> > 
> > I tried to play some tricks to define dynamically the size but everything falls
> > apart since a lot of stuff in linux/hashtable.h is based on ARRAY_SIZE() and
> > friends (to speedup all I suppose). Another non-fit (in my opinion)
> > alternative would be using relativistic hashtable (linux/rhashtable.h) but
> > those are definitely overkill in our case since they are hashtables that
> > can be resized completely at runtime while populated O_o. (with even
> > more overhead)
> > 
> > At the end the size that fits all possible in-flight messages minimizing
> > collisions in any possible case that I can set at compile time would be 10,
> > which means really 2^10 1024 HT entries (equal to MAX_MSG_TOKEN) each of which
> > is a struct list_head (*prev,*next 16bytes) i.e. 16KB HT: Peter pointed out
> > that it would be a lot of wasted space on normal systems in which max in-flight
> > messages are far-less than 1024 AND would not even fit in one 4Kb page, so I
> > reduced it to 512 entries but the best would be 256 (8) if we want to
> > fit in one regular 4kb page. The drawback will be a bit of HT collisions on
> > system with more than 256 possible and effective in-flight messages.
> >
> 
> I agree, 256 should be fine for now. Just add a note that it is chosen to
> fit a page and can be updated if required.
> 
> 

Yes I added a comment explaining the situation, but, my bad, the 'good'
size that fits the HT into a 4k page is indeed 512 (9) because the HT array
is made of hlist_head (so only one pointer, good I double checked on myself :P)

> > > >  /**
> > > >   * struct scmi_xfers_info - Structure to manage transfer information
> > > >   *
> > > > - * @xfer_block: Preallocated Message array
> > > >   * @xfer_alloc_table: Bitmap table for allocated messages.
> > > >   *	Index of this bitmap table is also used for message
> > > >   *	sequence identifier.
> > > >   * @xfer_lock: Protection for message allocation
> > > > + * @last_token: A counter to use as base to generate for monotonically
> > > > + *		increasing tokens.
> > > > + * @free_xfers: A free list for available to use xfers. It is initialized with
> > > > + *		a number of xfers equal to the maximum allowed in-flight
> > > > + *		messages.
> > > > + * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
> > > > + *		   currently in-flight messages.
> > > >   */
> > > >  struct scmi_xfers_info {
> > > > -	struct scmi_xfer *xfer_block;
> > > >  	unsigned long *xfer_alloc_table;
> > > >  	spinlock_t xfer_lock;
> > > > +	atomic_t last_token;
> > > 
> > > Can we merge this and transfer_last_id ? Let this be free running like
> > > transfer_last_id and just use [0-9] from this ? I don't see any point
> > > having 2 different monotonically increasing tokens/id.
> > > 
> > 
> > Mmm I was tempted about that, but the reason I did not was that in some
> > rare limit condition as you can see in the ASCII art (:O) I can find a hole in
> > the next available token ids so I have to skip and update last_token itself,
> > not sure if this could cause confusion seeing transfer_ids with holes during
> > tracing if I unify them.
> >
> 
> That should be fine as it won't be used at all.
> 

In V7 I removed mx_info last_token and I'm using last_transfer_id; my
only concern (mostly theoretical I think) is that last_transfer_id is
picked from a global atomic and so it is shared by any platform or
channel instance. (and being global is good for tracing correlation
in fact...)

In particular this means that it is also shared with notifications
(which do not use monotonic seqnums), so that, if a flood of notifs
suddenly kicks in, a lot of transfer_idS would be burnt (for tracing
notifs.. not for seqnums) and any following command will end up picking
a good valid monotonic seqnum BUT far away from the previous one
(i.e. there will be holes in the set of monotonic seqnums): no issue
with that but it could lead, under some very awkward limit conditions,
to an early reuse of seqnums for commands (i.e. a less effective
mitigation of seqnums reusing).

Anyway in my tests on this new version I have see no issues and the
above mentioned case is clearly borderline so I made the change (and
noted the above scenario in a comment)

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
  2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
                   ` (17 preceding siblings ...)
  2021-07-15 16:35 ` [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Peter Hilber
@ 2021-08-11  9:31 ` Floris Westermann
  2021-08-11 15:26   ` Cristian Marussi
  18 siblings, 1 reply; 48+ messages in thread
From: Floris Westermann @ 2021-08-11  9:31 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, peter.hilber, alex.bennee, jean-philippe,
	mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

Hi Cristian,

I am currently working on an interface for VMs to communicate their
performance requirements to the hosts by passing through cpu frequency
adjustments.

Your patch looks very interesting but I have some questions:


On Mon, Jul 12, 2021 at 03:18:16PM +0100, Cristian Marussi wrote:
>
> The series has been tested using an emulated fake SCMI device and also a
> proper SCP-fw stack running through QEMU vhost-users, with the SCMI stack
> compiled, in both cases, as builtin and as a loadable module, running tests
> against mocked SCMI Sensors using HWMON and IIO interfaces to check the
> functionality of notifications and sync/async commands.
>
> Virtio-scmi support has been exercised in the following testing scenario
> on a JUNO board:
>
>  - normal sync/async command transfers
>  - notifications
>  - concurrent delivery of correlated response and delayed responses
>  - out-of-order delivery of delayed responses before related responses
>  - unexpected delayed response delivery for sync commands
>  - late delivery of timed-out responses and delayed responses
>
> Some basic regression testing against mailbox transport has been performed
> for commands and notifications too.
>
> No sensible overhead in total handling time of commands and notifications
> has been observed, even though this series do indeed add a considerable
> amount of code to execute on TX path.
> More test and measurements could be needed in these regards.
>

Can you share any data and benchmarks using you fake SCMI device.
Also, could you provide the emulated device code so that the results can
be reproduced.


Cheers,
Floris

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/17] Introduce SCMI transport based on VirtIO
  2021-08-11  9:31 ` Floris Westermann
@ 2021-08-11 15:26   ` Cristian Marussi
  0 siblings, 0 replies; 48+ messages in thread
From: Cristian Marussi @ 2021-08-11 15:26 UTC (permalink / raw)
  To: Floris Westermann
  Cc: linux-kernel, linux-arm-kernel, virtualization, virtio-dev,
	sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, peter.hilber, alex.bennee, jean-philippe,
	mikhail.golubev, anton.yakovlev, Vasyl.Vavrychuk,
	Andriy.Tryshnivskyy

On Wed, Aug 11, 2021 at 09:31:21AM +0000, Floris Westermann wrote:
> Hi Cristian,
> 

Hi Floris,

> I am currently working on an interface for VMs to communicate their
> performance requirements to the hosts by passing through cpu frequency
> adjustments.
> 

So something like looking up SCMI requests from VMs in the hypervisor
and act on VM underlying hw accordingly ? Where the SCMI server is meant
to live ?

> Your patch looks very interesting but I have some questions:
> 

Happy to hear that, a new V7 (with minor cleanups) which is (hopefully)
being pulled these days is at:

https://lore.kernel.org/linux-arm-kernel/20210803131024.40280-1-cristian.marussi@arm.com/
> 
> On Mon, Jul 12, 2021 at 03:18:16PM +0100, Cristian Marussi wrote:
> >
> > The series has been tested using an emulated fake SCMI device and also a
> > proper SCP-fw stack running through QEMU vhost-users, with the SCMI stack
> > compiled, in both cases, as builtin and as a loadable module, running tests
> > against mocked SCMI Sensors using HWMON and IIO interfaces to check the
> > functionality of notifications and sync/async commands.
> >
> > Virtio-scmi support has been exercised in the following testing scenario
> > on a JUNO board:
> >
> >  - normal sync/async command transfers
> >  - notifications
> >  - concurrent delivery of correlated response and delayed responses
> >  - out-of-order delivery of delayed responses before related responses
> >  - unexpected delayed response delivery for sync commands
> >  - late delivery of timed-out responses and delayed responses
> >
> > Some basic regression testing against mailbox transport has been performed
> > for commands and notifications too.
> >
> > No sensible overhead in total handling time of commands and notifications
> > has been observed, even though this series do indeed add a considerable
> > amount of code to execute on TX path.
> > More test and measurements could be needed in these regards.
> >
> 
> Can you share any data and benchmarks using you fake SCMI device.
> Also, could you provide the emulated device code so that the results can
> be reproduced.
> 

Not really, because the testing based on the fake SCMI VirtIO device was
purely functional, just to exercise some rare limit conditions not
easily reproducible with a regular SCMI stack, I've made no benchmark
using the fake emulated SCMI virtio device because it mimics VirtIO
transfers but there's not even an host/guest in my fake emulation.
Moreover is a hacked driver+userspace blob not really in a state to be
shared :P

While developing this series I needed somehow to be able to let the Kernel
SCMI-agent in the guest "speak" some basic SCMI commands and notifs to
some SCMI server platform sitting somewhere across the new VirtIO SCMI
transport, which basically means that it's not enough to create an SCMI
VirtIO device reachable from the VirtIO layer, the SCMI stack itself
(or part of it) must live behind such device somehow/somewhere to be
able to receive meaningful replies.

One proper way to do that is to use some QEMU/SCP-fw vhost-users support
cooked by Linaro (not in the upstream for now) so as to basically run a
full proper SCP SCMI-platform fw stack in host userspace and let it speak
to a guest through the new scmi virtio transport and the vhost-users magic:
the drawback of this kind of approach is that it made hard to test limit
conditions like stale or out-of-order SCMI replies because to do so you
have to patch the official SCP/SCMI stack to behave badly and out-of specs,
which is not something is designed to do. (and also the fact that the
vhost-users QEMU/SCP-fw solution was only available later during devel).
Hence the emulation hack for testing rare limit conditions.

Now, the emulation hack, beside really ugly, is clearly a totally fake
VirtIO environment (there's not even a guest really...) and, as said, it
just served the need to exercise the SCMI virtio transport code enough to
test anomalous and bad-behaving SCMI commands flows and notifications and
as such made really no sense to be used as a performance testbed.

In fact, what I was really meaning (poorly) while saying:

> > No sensible overhead in total handling time of commands and notifications
> > has been observed, even though this series do indeed add a considerable
> > amount of code to execute on TX path.

is that I have NOT seen any sensible overhead/slowdown in the context of
OTHER real SCMI transports (like mailboxes), because the virtio series
contains a number of preliminary SCMI common core changes unrelated to
virtio (monotonic tokens/handle concurrent and out-of-order replies) that,
even though easing the devel of SCMI virtio, are really needed and used
by any other existent SCMI transport, so my fear was to introduce some
common slowdown in the core: in those regards only, I said that I have
not seen (looking at cmd traces) any slowdown with such additional core
changes even though more code is clearly now run in the TX path.
(contention can indeed only happen under very rare limit conditions)

Having said that (sorry for the flood of words) what I can give you are
a few traces (non statistically significant probably) showing the typical
round-trip time for some plain SCMI command sensor requests on an idle
system (JUNO)

In both cases the sensors being read are mocked, so the time is purely
related to SCMI core stack and virtio exchanges (i.e. there's no delay
introduced by reading real hw sensors)


-> Using a proper SCP-fw/QEMU vhost-users stack:

root@deb-guest:~# cat /sys/class/hwmon/hwmon0/temp1_input 
             cat-195     [000] ....  7044.614295: scmi_xfer_begin: transfer_id=27 msg_id=6 protocol_id=21 seq=27 poll=0
25000
          <idle>-0       [000] d.h3  7044.615342: scmi_rx_done: transfer_id=27 msg_id=6 protocol_id=21 seq=27 msg_type=0
             cat-195     [000] ....  7044.615420: scmi_xfer_end: transfer_id=27 msg_id=6 protocol_id=21 seq=27 status=0

root@deb-guest:~# cat /sys/class/hwmon/hwmon0/temp1_input 
             cat-196     [000] ....  7049.200349: scmi_xfer_begin: transfer_id=28 msg_id=6 protocol_id=21 seq=28 poll=0
          <idle>-0       [000] d.h3  7049.202053: scmi_rx_done: transfer_id=28 msg_id=6 protocol_id=21 seq=28 msg_type=0
             cat-196     [000] ....  7049.202152: scmi_xfer_end: transfer_id=28 msg_id=6 protocol_id=21 seq=28 status=0
25000

root@deb-guest:~# cat /sys/class/hwmon/hwmon0/temp1_input 
             cat-197     [000] ....  7053.699713: scmi_xfer_begin: transfer_id=29 msg_id=6 protocol_id=21 seq=29 poll=0
25000
          <idle>-0       [000] d.H3  7053.700366: scmi_rx_done: transfer_id=29 msg_id=6 protocol_id=21 seq=29 msg_type=0
             cat-197     [000] ....  7053.700468: scmi_xfer_end: transfer_id=29 msg_id=6 protocol_id=21 seq=29 status=0

root@deb-guest:~# cat /sys/class/hwmon/hwmon0/temp1_input 
             cat-198     [001] ....  7058.944442: scmi_xfer_begin: transfer_id=30 msg_id=6 protocol_id=21 seq=30 poll=0
             cat-173     [000] d.h2  7058.944959: scmi_rx_done: transfer_id=30 msg_id=6 protocol_id=21 seq=30 msg_type=0
             cat-198     [001] ....  7058.945500: scmi_xfer_end: transfer_id=30 msg_id=6 protocol_id=21 seq=30 status=0
25000

root@deb-guest:~# cat /sys/class/hwmon/hwmon0/temp1_input 
             cat-199     [000] ....  7064.598797: scmi_xfer_begin: transfer_id=31 msg_id=6 protocol_id=21 seq=31 poll=0
25000
          <idle>-0       [000] d.h3  7064.599710: scmi_rx_done: transfer_id=31 msg_id=6 protocol_id=21 seq=31 msg_type=0
             cat-199     [000] ....  7064.599787: scmi_xfer_end: transfer_id=31 msg_id=6 protocol_id=21 seq=31 status=0



-> Using the fake hack SCMI device that relays packets to userspace:

             cat-1306    [000] ....  7614.373161: scmi_xfer_begin: transfer_id=78 msg_id=6 protocol_id=21 seq=78 poll=0
 scmi_sniffer_ng-342     [000] d.h2  7614.373699: scmi_rx_done: transfer_id=78 msg_id=6 protocol_id=21 seq=78 msg_type=0
             cat-1306    [000] ....  7614.377653: scmi_xfer_end: transfer_id=78 msg_id=6 protocol_id=21 seq=78 status=0


             cat-1308    [004] ....  7626.677176: scmi_xfer_begin: transfer_id=79 msg_id=6 protocol_id=21 seq=79 poll=0
 scmi_sniffer_ng-342     [000] d.h2  7626.677653: scmi_rx_done: transfer_id=79 msg_id=6 protocol_id=21 seq=79 msg_type=0
             cat-1308    [004] ....  7626.677705: scmi_xfer_end: transfer_id=79 msg_id=6 protocol_id=21 seq=79 status=0


             cat-1309    [004] ....  7631.249412: scmi_xfer_begin: transfer_id=80 msg_id=6 protocol_id=21 seq=80 poll=0
 scmi_sniffer_ng-342     [000] d.h2  7631.250182: scmi_rx_done: transfer_id=80 msg_id=6 protocol_id=21 seq=80 msg_type=0
             cat-1309    [004] ....  7631.250237: scmi_xfer_end: transfer_id=80 msg_id=6 protocol_id=21 seq=80 status=0

             cat-1312    [004] ....  7642.210034: scmi_xfer_begin: transfer_id=81 msg_id=6 protocol_id=21 seq=81 poll=0
 scmi_sniffer_ng-342     [000] d.h2  7642.210514: scmi_rx_done: transfer_id=81 msg_id=6 protocol_id=21 seq=81 msg_type=0
             cat-1312    [004] ....  7642.210567: scmi_xfer_end: transfer_id=81 msg_id=6 protocol_id=21 seq=81 status=0

             cat-1314    [003] ....  7645.810775: scmi_xfer_begin: transfer_id=82 msg_id=6 protocol_id=21 seq=82 poll=0
 scmi_sniffer_ng-342     [000] d.h2  7645.811255: scmi_rx_done: transfer_id=82 msg_id=6 protocol_id=21 seq=82 msg_type=0
             cat-1314    [003] ....  7645.811307: scmi_xfer_end: transfer_id=82 msg_id=6 protocol_id=21 seq=82 status=0


In both cases SCMI requests are effectively relayed to userspace so
that's probably the reason timings are similar. (despite the hackish
internals of latter solution)

Not sure if all the above madness helped you at all :D

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2021-08-11 15:26 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check Cristian Marussi
2021-07-14 16:46   ` Sudeep Holla
2021-07-12 14:18 ` [PATCH v6 03/17] firmware: arm_scmi: Add support for type handling in common functions Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 04/17] firmware: arm_scmi: Remove scmi_dump_header_dbg() helper Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support Cristian Marussi
2021-07-28 11:40   ` Sudeep Holla
2021-07-28 12:28     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens Cristian Marussi
2021-07-28 14:17   ` Sudeep Holla
2021-07-28 16:54     ` Cristian Marussi
2021-08-02 10:24       ` Sudeep Holla
2021-08-03 12:52         ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi
2021-07-15 16:36   ` Peter Hilber
2021-07-19  9:14     ` Cristian Marussi
2021-07-22  8:32       ` Peter Hilber
2021-07-28  8:31         ` Cristian Marussi
2021-08-02 10:10   ` Sudeep Holla
2021-08-02 10:27     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback Cristian Marussi
2021-07-28 14:26   ` Sudeep Holla
2021-07-28 17:25     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 09/17] firmware: arm_scmi: Make .clear_channel optional Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi
2021-07-15 16:36   ` Peter Hilber
2021-07-19  9:15     ` Cristian Marussi
2021-07-28 14:34   ` Sudeep Holla
2021-07-28 17:41     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable Cristian Marussi
2021-07-28 14:50   ` Sudeep Holla
2021-07-29 16:18     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 12/17] firmware: arm_scmi: Make shmem support optional for transports Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 13/17] firmware: arm_scmi: Add method to override max message number Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports Cristian Marussi
2021-07-15 16:36   ` Peter Hilber
2021-07-19  9:16     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op Cristian Marussi
2021-07-28 15:36   ` Sudeep Holla
2021-07-29 16:19     ` Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 16/17] dt-bindings: arm: Add virtio transport for SCMI Cristian Marussi
2021-07-12 14:18 ` [PATCH v6 17/17] firmware: arm_scmi: Add virtio transport Cristian Marussi
2021-07-15 16:35 ` [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Peter Hilber
2021-07-19 11:36   ` Cristian Marussi
2021-07-22  8:30     ` Peter Hilber
2021-08-11  9:31 ` Floris Westermann
2021-08-11 15:26   ` Cristian Marussi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).