All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 00/11] Introduce atomic support for SCMI transports
@ 2021-12-20 19:56 ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Hi all,

This series mainly aims to introduce atomic support for SCMI transports
that can support it.

In [01/11], as a closely related addition, it is introduced a common way
for a transport to signal to the SCMI core that it does not offer
completion interrupts, so that the usual polling behaviour will be
required: this can be done enabling statically a global polling behaviour
for the whole transport with flag scmi_desc.force_polling OR dynamically
enabling at runtime such polling behaviour on a per-channel basis using
the flag scmi_chan_info.no_completion_irq, typically during .chan_setup().
The usual per-command polling selection behaviour based on
hdr.poll_completion is preserved as before.

Patch [02/11], ports SMC transport to use the common core completions when
completion interrupt is available or otherwise revert to use common core
polling mechanism above introduced: this avoids the improper call of
scmi_rx_callback directly in smc_send_message.

With [03/11] I introduce a flag to allow a transport to signal to the core
that upon return of a .send_message() the requested command execution can
be assumed by the core to have been fully completed by the platform, so
that the response payload (if any) can be immediately fetched without the
need to effectively poll the channel.

In [04/11] and [05/11] I enable such flag for SMC amd OPTEE transports.

With [06/11] a transport that supports atomic operations on its TX
path can now declare itself as .atomic_enabled and, if the platform has
been also configured to use such atomic operation mode via Kconfig, the
SCMI core will refrain itself too from sleeping on the RX path and instead
rely on polling on the correspondent when such atomic behaviour is
requested from the upper layers. (like the Clock framework)

Then in [07/11] SMC is converted to be .atomic_enabled by substituting
the mutexes with busy-waiting to keep the channel 'locked' ONLY IF the
SMC transport is configured to operate in atomic mode.
(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE=y)

In [08/11] a new parameter is added to mark_txdone transport operation;
this will be needed to support polling/atomic support in SCMI virtio.

[09/11] adds polling and configurable atomic mode support to SCMI virtio.
(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE=y)

Finally [10/11] adds support for atomic clock enable/disable requests to
the SCMI Clock protocol layer, while in [11/11] the SCMI Clock driver is
modified to gain optional support for atomic operations when operating on
an atomically capable and configured SCMI transport.

NOTE THAT once atomic mode is available and configured on the underlying
SCMI transport, selected SCMI operations can be performed in atomic mode:
in this series this is done only for Clock enable operations; moreover
ALL the known clock devices are currently moved to atomic operations if
supported by transport: later a more fine grained control to avoid atomic
operations on 'slow' clock domains will be introduced, but this latter
solution requires an SCMI Spec change which is still underway.

Atomic support has been tested against the virtio transport in regards to
the above mentioned Clock enable/disable operations.
(echo 1 > /sys/kernel/debug/clk/myclock/clk_prepare_enable)
Pure polling has been tested on mailbox transport.

The series is based on sudeep/for-next/scmi [1] on top of:

commit f872af09094c ("firmware: arm_scmi: Use new trace event
		     scmi_xfer_response_wait")

Any feedback welcome.

Thanks,

Cristian

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/log/?h=for-next/scmi 

---
V7 --> V8
- rebased on top sudeep/for-next/scmi
- removed clock prepare/unprepare for atomic transports
- removed ifdeffery in SCMI virtio atomic/polling support
- added SCMI virtio polling deferred worker
- simplified SCMI virtio spinlocking around free list
- removed ifdeffery in SMC atomic support
- removed polling_capable/polling_enabled internal flags, using macros
- renaming to sync_cmds_completed_on_ret

V6 --> V7
- rebased on lates sudeep/for-next/scmi
- fixed missing list_del() in vio_complete while processing pre-fetched buffers
- added proper spilocking while accessing pending_cmds_list
- reduced unneeded spinlocked region in virtio_poll_done
- using __unused macro in optional mark_txdone param

V5 --> V6
- changed default polling timeout
- refactored SCMI RX path
- added timeout_ms to traces in RX-path
- added OPTEE support for sync_cmds_atomic_replies
- reworked the whole atomic support in SCMI core removing any reliance
  on IRQs. (using pure polling)
- added new parameter to mark_txdone
- added new SCMI VirtIO polling/atomic support
- reviewed Clock SCMI Driver atomic support: now supporting both
  clk_prepare and clk_enable when atomic transport is detected
- added CCs

V4 --> V5
- removed RFCs tags
- added scmi_desc.atomic_enabled flags and a few Kconfig options to set
  atomic mode for SMC and VirtIO transports. Default disabled.
- added Kconfig option to enable forced polling as a whole on the Mailbox
  transport
- removed .poll_done callback from SMC transport since no real polling is
  needed once sync_cmds_atomic_replies is set
- made atomic_capable changes on SMC transport dependent on Kconfig
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE: so no change and no busy waiting
  if atomic mode is NOT enabled in Kconfig.
- made const: force_polling/atomic_capable/atomic_enabled/sync_cmds_atomic_replies

V3 --> V4
- rebased on linux-next/master next-20210824
- renamed .needs_polling to .no_completion_irq
- added .sync_cmds_atomic_replies
- make SMC use .sync_cmd_atomic_replies

V2 --> v3
- rebased on SCMI VirtIO V6 which in turn is based on v5.14-rc1

Cristian Marussi (11):
  firmware: arm_scmi: Add configurable polling mode for transports
  firmware: arm_scmi: Make smc transport use common completions
  firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
  firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
  firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
  firmware: arm_scmi: Add support for atomic transports
  firmware: arm_scmi: Add atomic mode support to smc transport
  firmware: arm_scmi: Add new parameter to mark_txdone
  firmware: arm_scmi: Add atomic mode support to virtio transport
  firmware: arm_scmi: Add atomic support to clock protocol
  clk: scmi: Support atomic clock enable/disable API

 drivers/clk/clk-scmi.c              |  54 +++++-
 drivers/firmware/arm_scmi/Kconfig   |  29 +++
 drivers/firmware/arm_scmi/clock.c   |  22 ++-
 drivers/firmware/arm_scmi/common.h  |  23 ++-
 drivers/firmware/arm_scmi/driver.c  | 123 ++++++++++--
 drivers/firmware/arm_scmi/mailbox.c |   3 +-
 drivers/firmware/arm_scmi/optee.c   |  18 +-
 drivers/firmware/arm_scmi/smc.c     |  98 +++++++---
 drivers/firmware/arm_scmi/virtio.c  | 291 +++++++++++++++++++++++++---
 include/linux/scmi_protocol.h       |  11 ++
 10 files changed, 586 insertions(+), 86 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v8 00/11] Introduce atomic support for SCMI transports
@ 2021-12-20 19:56 ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Hi all,

This series mainly aims to introduce atomic support for SCMI transports
that can support it.

In [01/11], as a closely related addition, it is introduced a common way
for a transport to signal to the SCMI core that it does not offer
completion interrupts, so that the usual polling behaviour will be
required: this can be done enabling statically a global polling behaviour
for the whole transport with flag scmi_desc.force_polling OR dynamically
enabling at runtime such polling behaviour on a per-channel basis using
the flag scmi_chan_info.no_completion_irq, typically during .chan_setup().
The usual per-command polling selection behaviour based on
hdr.poll_completion is preserved as before.

Patch [02/11], ports SMC transport to use the common core completions when
completion interrupt is available or otherwise revert to use common core
polling mechanism above introduced: this avoids the improper call of
scmi_rx_callback directly in smc_send_message.

With [03/11] I introduce a flag to allow a transport to signal to the core
that upon return of a .send_message() the requested command execution can
be assumed by the core to have been fully completed by the platform, so
that the response payload (if any) can be immediately fetched without the
need to effectively poll the channel.

In [04/11] and [05/11] I enable such flag for SMC amd OPTEE transports.

With [06/11] a transport that supports atomic operations on its TX
path can now declare itself as .atomic_enabled and, if the platform has
been also configured to use such atomic operation mode via Kconfig, the
SCMI core will refrain itself too from sleeping on the RX path and instead
rely on polling on the correspondent when such atomic behaviour is
requested from the upper layers. (like the Clock framework)

Then in [07/11] SMC is converted to be .atomic_enabled by substituting
the mutexes with busy-waiting to keep the channel 'locked' ONLY IF the
SMC transport is configured to operate in atomic mode.
(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE=y)

In [08/11] a new parameter is added to mark_txdone transport operation;
this will be needed to support polling/atomic support in SCMI virtio.

[09/11] adds polling and configurable atomic mode support to SCMI virtio.
(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE=y)

Finally [10/11] adds support for atomic clock enable/disable requests to
the SCMI Clock protocol layer, while in [11/11] the SCMI Clock driver is
modified to gain optional support for atomic operations when operating on
an atomically capable and configured SCMI transport.

NOTE THAT once atomic mode is available and configured on the underlying
SCMI transport, selected SCMI operations can be performed in atomic mode:
in this series this is done only for Clock enable operations; moreover
ALL the known clock devices are currently moved to atomic operations if
supported by transport: later a more fine grained control to avoid atomic
operations on 'slow' clock domains will be introduced, but this latter
solution requires an SCMI Spec change which is still underway.

Atomic support has been tested against the virtio transport in regards to
the above mentioned Clock enable/disable operations.
(echo 1 > /sys/kernel/debug/clk/myclock/clk_prepare_enable)
Pure polling has been tested on mailbox transport.

The series is based on sudeep/for-next/scmi [1] on top of:

commit f872af09094c ("firmware: arm_scmi: Use new trace event
		     scmi_xfer_response_wait")

Any feedback welcome.

Thanks,

Cristian

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/log/?h=for-next/scmi 

---
V7 --> V8
- rebased on top sudeep/for-next/scmi
- removed clock prepare/unprepare for atomic transports
- removed ifdeffery in SCMI virtio atomic/polling support
- added SCMI virtio polling deferred worker
- simplified SCMI virtio spinlocking around free list
- removed ifdeffery in SMC atomic support
- removed polling_capable/polling_enabled internal flags, using macros
- renaming to sync_cmds_completed_on_ret

V6 --> V7
- rebased on lates sudeep/for-next/scmi
- fixed missing list_del() in vio_complete while processing pre-fetched buffers
- added proper spilocking while accessing pending_cmds_list
- reduced unneeded spinlocked region in virtio_poll_done
- using __unused macro in optional mark_txdone param

V5 --> V6
- changed default polling timeout
- refactored SCMI RX path
- added timeout_ms to traces in RX-path
- added OPTEE support for sync_cmds_atomic_replies
- reworked the whole atomic support in SCMI core removing any reliance
  on IRQs. (using pure polling)
- added new parameter to mark_txdone
- added new SCMI VirtIO polling/atomic support
- reviewed Clock SCMI Driver atomic support: now supporting both
  clk_prepare and clk_enable when atomic transport is detected
- added CCs

V4 --> V5
- removed RFCs tags
- added scmi_desc.atomic_enabled flags and a few Kconfig options to set
  atomic mode for SMC and VirtIO transports. Default disabled.
- added Kconfig option to enable forced polling as a whole on the Mailbox
  transport
- removed .poll_done callback from SMC transport since no real polling is
  needed once sync_cmds_atomic_replies is set
- made atomic_capable changes on SMC transport dependent on Kconfig
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE: so no change and no busy waiting
  if atomic mode is NOT enabled in Kconfig.
- made const: force_polling/atomic_capable/atomic_enabled/sync_cmds_atomic_replies

V3 --> V4
- rebased on linux-next/master next-20210824
- renamed .needs_polling to .no_completion_irq
- added .sync_cmds_atomic_replies
- make SMC use .sync_cmd_atomic_replies

V2 --> v3
- rebased on SCMI VirtIO V6 which in turn is based on v5.14-rc1

Cristian Marussi (11):
  firmware: arm_scmi: Add configurable polling mode for transports
  firmware: arm_scmi: Make smc transport use common completions
  firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
  firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
  firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
  firmware: arm_scmi: Add support for atomic transports
  firmware: arm_scmi: Add atomic mode support to smc transport
  firmware: arm_scmi: Add new parameter to mark_txdone
  firmware: arm_scmi: Add atomic mode support to virtio transport
  firmware: arm_scmi: Add atomic support to clock protocol
  clk: scmi: Support atomic clock enable/disable API

 drivers/clk/clk-scmi.c              |  54 +++++-
 drivers/firmware/arm_scmi/Kconfig   |  29 +++
 drivers/firmware/arm_scmi/clock.c   |  22 ++-
 drivers/firmware/arm_scmi/common.h  |  23 ++-
 drivers/firmware/arm_scmi/driver.c  | 123 ++++++++++--
 drivers/firmware/arm_scmi/mailbox.c |   3 +-
 drivers/firmware/arm_scmi/optee.c   |  18 +-
 drivers/firmware/arm_scmi/smc.c     |  98 +++++++---
 drivers/firmware/arm_scmi/virtio.c  | 291 +++++++++++++++++++++++++---
 include/linux/scmi_protocol.h       |  11 ++
 10 files changed, 586 insertions(+), 86 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

SCMI communications along TX channels can optionally be provided of a
completion interrupt; when such interrupt is not available, command
transactions should rely on polling, where the SCMI core takes care to
repeatedly evaluate the transport-specific .poll_done() function, if
available, to determine if and when a request was fully completed or
timed out.

Such mechanism is already present and working on a single transfer base:
SCMI protocols can indeed enable hdr.poll_completion on specific commands
ahead of each transfer and cause that transaction to be handled with
polling.

Introduce a couple of flags to be able to enforce such polling behaviour
globally at will:

 - scmi_desc.force_polling: to statically switch the whole transport to
   polling mode.

 - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
   to polling mode if, at runtime, is determined that no completion
   interrupt was available for such channel.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed internal poling_enabled flag, using macros
v5 --> v6
- removed check on replies received by IRQs when xfer was requested
  as poll_completion (not all transport can suppress IRQs on an xfer basis)
v4 --> v5
- make force_polling const
- introduce polling_enabled flag to simplify checks on do_xfer
v3 --> v4:
- renamed .needs_polling flag to .no_completion_irq
- refactored error path when polling needed but not supported
---
 drivers/firmware/arm_scmi/common.h |  8 ++++++++
 drivers/firmware/arm_scmi/driver.c | 32 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 6438b5248c24..652e5d95ee65 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -339,11 +339,16 @@ void scmi_protocol_release(const struct scmi_handle *handle, u8 protocol_id);
  * @dev: Reference to device in the SCMI hierarchy corresponding to this
  *	 channel
  * @handle: Pointer to SCMI entity handle
+ * @no_completion_irq: Flag to indicate that this channel has no completion
+ *		       interrupt mechanism for synchronous commands.
+ *		       This can be dynamically set by transports at run-time
+ *		       inside their provided .chan_setup().
  * @transport_info: Transport layer related information
  */
 struct scmi_chan_info {
 	struct device *dev;
 	struct scmi_handle *handle;
+	bool no_completion_irq;
 	void *transport_info;
 };
 
@@ -402,6 +407,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  *	be pending simultaneously in the system. May be overridden by the
  *	get_max_msg op.
  * @max_msg_size: Maximum size of data per message that can be handled.
+ * @force_polling: Flag to force this whole transport to use SCMI core polling
+ *		   mechanism instead of completion interrupts even if available.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -410,6 +417,7 @@ struct scmi_desc {
 	int max_rx_timeout_ms;
 	int max_msg;
 	int max_msg_size;
+	const bool force_polling;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 476b91845e40..cb9fc12503f2 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -36,6 +36,23 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/scmi.h>
 
+#define IS_POLLING_REQUIRED(__c, __i)					\
+	((__c)->no_completion_irq || (__i)->desc->force_polling)	\
+
+#define IS_TRANSPORT_POLLING_CAPABLE(__i)				\
+	((__i)->desc->ops->poll_done)
+
+#define IS_POLLING_ENABLED(__c, __i)					\
+({									\
+	bool __ret;							\
+	typeof(__c) c_ = __c;						\
+	typeof(__i) i_ = __i;						\
+									\
+	__ret = (IS_POLLING_REQUIRED(c_, i_) &&				\
+			IS_TRANSPORT_POLLING_CAPABLE(i_));		\
+	__ret;								\
+})
+
 enum scmi_error_codes {
 	SCMI_SUCCESS = 0,	/* Success */
 	SCMI_ERR_SUPPORT = -1,	/* Not supported */
@@ -817,6 +834,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	struct device *dev = info->dev;
 	struct scmi_chan_info *cinfo;
 
+	/* Check for polling request on custom command xfers at first */
 	if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
 		dev_warn_once(dev,
 			      "Polling mode is not supported by transport.\n");
@@ -827,6 +845,10 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	if (unlikely(!cinfo))
 		return -EINVAL;
 
+	/* True ONLY if also supported by transport. */
+	if (IS_POLLING_ENABLED(cinfo, info))
+		xfer->hdr.poll_completion = true;
+
 	/*
 	 * Initialise protocol id now from protocol handle to avoid it being
 	 * overridden by mistake (or malice) by the protocol code mangling with
@@ -1527,6 +1549,16 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
 	if (ret)
 		return ret;
 
+	if (tx && IS_POLLING_REQUIRED(cinfo, info)) {
+		if (IS_TRANSPORT_POLLING_CAPABLE(info))
+			dev_info(dev,
+				 "Enabled polling mode TX channel - prot_id:%d\n",
+				 prot_id);
+		else
+			dev_warn(dev,
+				 "Polling mode NOT supported by transport.\n");
+	}
+
 idr_alloc:
 	ret = idr_alloc(idr, cinfo, prot_id, prot_id + 1, GFP_KERNEL);
 	if (ret != prot_id) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

SCMI communications along TX channels can optionally be provided of a
completion interrupt; when such interrupt is not available, command
transactions should rely on polling, where the SCMI core takes care to
repeatedly evaluate the transport-specific .poll_done() function, if
available, to determine if and when a request was fully completed or
timed out.

Such mechanism is already present and working on a single transfer base:
SCMI protocols can indeed enable hdr.poll_completion on specific commands
ahead of each transfer and cause that transaction to be handled with
polling.

Introduce a couple of flags to be able to enforce such polling behaviour
globally at will:

 - scmi_desc.force_polling: to statically switch the whole transport to
   polling mode.

 - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
   to polling mode if, at runtime, is determined that no completion
   interrupt was available for such channel.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed internal poling_enabled flag, using macros
v5 --> v6
- removed check on replies received by IRQs when xfer was requested
  as poll_completion (not all transport can suppress IRQs on an xfer basis)
v4 --> v5
- make force_polling const
- introduce polling_enabled flag to simplify checks on do_xfer
v3 --> v4:
- renamed .needs_polling flag to .no_completion_irq
- refactored error path when polling needed but not supported
---
 drivers/firmware/arm_scmi/common.h |  8 ++++++++
 drivers/firmware/arm_scmi/driver.c | 32 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 6438b5248c24..652e5d95ee65 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -339,11 +339,16 @@ void scmi_protocol_release(const struct scmi_handle *handle, u8 protocol_id);
  * @dev: Reference to device in the SCMI hierarchy corresponding to this
  *	 channel
  * @handle: Pointer to SCMI entity handle
+ * @no_completion_irq: Flag to indicate that this channel has no completion
+ *		       interrupt mechanism for synchronous commands.
+ *		       This can be dynamically set by transports at run-time
+ *		       inside their provided .chan_setup().
  * @transport_info: Transport layer related information
  */
 struct scmi_chan_info {
 	struct device *dev;
 	struct scmi_handle *handle;
+	bool no_completion_irq;
 	void *transport_info;
 };
 
@@ -402,6 +407,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  *	be pending simultaneously in the system. May be overridden by the
  *	get_max_msg op.
  * @max_msg_size: Maximum size of data per message that can be handled.
+ * @force_polling: Flag to force this whole transport to use SCMI core polling
+ *		   mechanism instead of completion interrupts even if available.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -410,6 +417,7 @@ struct scmi_desc {
 	int max_rx_timeout_ms;
 	int max_msg;
 	int max_msg_size;
+	const bool force_polling;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 476b91845e40..cb9fc12503f2 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -36,6 +36,23 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/scmi.h>
 
+#define IS_POLLING_REQUIRED(__c, __i)					\
+	((__c)->no_completion_irq || (__i)->desc->force_polling)	\
+
+#define IS_TRANSPORT_POLLING_CAPABLE(__i)				\
+	((__i)->desc->ops->poll_done)
+
+#define IS_POLLING_ENABLED(__c, __i)					\
+({									\
+	bool __ret;							\
+	typeof(__c) c_ = __c;						\
+	typeof(__i) i_ = __i;						\
+									\
+	__ret = (IS_POLLING_REQUIRED(c_, i_) &&				\
+			IS_TRANSPORT_POLLING_CAPABLE(i_));		\
+	__ret;								\
+})
+
 enum scmi_error_codes {
 	SCMI_SUCCESS = 0,	/* Success */
 	SCMI_ERR_SUPPORT = -1,	/* Not supported */
@@ -817,6 +834,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	struct device *dev = info->dev;
 	struct scmi_chan_info *cinfo;
 
+	/* Check for polling request on custom command xfers at first */
 	if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
 		dev_warn_once(dev,
 			      "Polling mode is not supported by transport.\n");
@@ -827,6 +845,10 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	if (unlikely(!cinfo))
 		return -EINVAL;
 
+	/* True ONLY if also supported by transport. */
+	if (IS_POLLING_ENABLED(cinfo, info))
+		xfer->hdr.poll_completion = true;
+
 	/*
 	 * Initialise protocol id now from protocol handle to avoid it being
 	 * overridden by mistake (or malice) by the protocol code mangling with
@@ -1527,6 +1549,16 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
 	if (ret)
 		return ret;
 
+	if (tx && IS_POLLING_REQUIRED(cinfo, info)) {
+		if (IS_TRANSPORT_POLLING_CAPABLE(info))
+			dev_info(dev,
+				 "Enabled polling mode TX channel - prot_id:%d\n",
+				 prot_id);
+		else
+			dev_warn(dev,
+				 "Polling mode NOT supported by transport.\n");
+	}
+
 idr_alloc:
 	ret = idr_alloc(idr, cinfo, prot_id, prot_id + 1, GFP_KERNEL);
 	if (ret != prot_id) {
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 02/11] firmware: arm_scmi: Make smc transport use common completions
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

When a completion irq is available use it and delegate command completion
handling to the core SCMI completion mechanism.

If no completion irq is available revert to polling, using the core common
polling machinery.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v6 --> v7
- removed spurios blank line removal
v4 --> v5
- removed RFC tag
v3 --> v4
- renamed usage of .needs_polling to .no_completion_irq
---
 drivers/firmware/arm_scmi/smc.c | 39 +++++++++++++++++----------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 4effecc3bb46..d6c6ad9f6bab 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -25,8 +25,6 @@
  * @shmem: Transmit/Receive shared memory area
  * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
  * @func_id: smc/hvc call function id
- * @irq: Optional; employed when platforms indicates msg completion by intr.
- * @tx_complete: Optional, employed only when irq is valid.
  */
 
 struct scmi_smc {
@@ -34,15 +32,14 @@ struct scmi_smc {
 	struct scmi_shared_mem __iomem *shmem;
 	struct mutex shmem_lock;
 	u32 func_id;
-	int irq;
-	struct completion tx_complete;
 };
 
 static irqreturn_t smc_msg_done_isr(int irq, void *data)
 {
 	struct scmi_smc *scmi_info = data;
 
-	complete(&scmi_info->tx_complete);
+	scmi_rx_callback(scmi_info->cinfo,
+			 shmem_read_header(scmi_info->shmem), NULL);
 
 	return IRQ_HANDLED;
 }
@@ -111,8 +108,8 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 			dev_err(dev, "failed to setup SCMI smc irq\n");
 			return ret;
 		}
-		init_completion(&scmi_info->tx_complete);
-		scmi_info->irq = irq;
+	} else {
+		cinfo->no_completion_irq = true;
 	}
 
 	scmi_info->func_id = func_id;
@@ -142,26 +139,22 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 	struct arm_smccc_res res;
 
+	/*
+	 * Channel lock will be released only once response has been
+	 * surely fully retrieved, so after .mark_txdone()
+	 */
 	mutex_lock(&scmi_info->shmem_lock);
 
 	shmem_tx_prepare(scmi_info->shmem, xfer);
 
-	if (scmi_info->irq)
-		reinit_completion(&scmi_info->tx_complete);
-
 	arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
 
-	if (scmi_info->irq)
-		wait_for_completion(&scmi_info->tx_complete);
-
-	scmi_rx_callback(scmi_info->cinfo,
-			 shmem_read_header(scmi_info->shmem), NULL);
-
-	mutex_unlock(&scmi_info->shmem_lock);
-
 	/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
-	if (res.a0)
+	if (res.a0) {
+		mutex_unlock(&scmi_info->shmem_lock);
 		return -EOPNOTSUPP;
+	}
+
 	return 0;
 }
 
@@ -173,6 +166,13 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(scmi_info->shmem, xfer);
 }
 
+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+{
+	struct scmi_smc *scmi_info = cinfo->transport_info;
+
+	mutex_unlock(&scmi_info->shmem_lock);
+}
+
 static bool
 smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
 {
@@ -186,6 +186,7 @@ static const struct scmi_transport_ops scmi_smc_ops = {
 	.chan_setup = smc_chan_setup,
 	.chan_free = smc_chan_free,
 	.send_message = smc_send_message,
+	.mark_txdone = smc_mark_txdone,
 	.fetch_response = smc_fetch_response,
 	.poll_done = smc_poll_done,
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 02/11] firmware: arm_scmi: Make smc transport use common completions
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

When a completion irq is available use it and delegate command completion
handling to the core SCMI completion mechanism.

If no completion irq is available revert to polling, using the core common
polling machinery.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v6 --> v7
- removed spurios blank line removal
v4 --> v5
- removed RFC tag
v3 --> v4
- renamed usage of .needs_polling to .no_completion_irq
---
 drivers/firmware/arm_scmi/smc.c | 39 +++++++++++++++++----------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 4effecc3bb46..d6c6ad9f6bab 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -25,8 +25,6 @@
  * @shmem: Transmit/Receive shared memory area
  * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
  * @func_id: smc/hvc call function id
- * @irq: Optional; employed when platforms indicates msg completion by intr.
- * @tx_complete: Optional, employed only when irq is valid.
  */
 
 struct scmi_smc {
@@ -34,15 +32,14 @@ struct scmi_smc {
 	struct scmi_shared_mem __iomem *shmem;
 	struct mutex shmem_lock;
 	u32 func_id;
-	int irq;
-	struct completion tx_complete;
 };
 
 static irqreturn_t smc_msg_done_isr(int irq, void *data)
 {
 	struct scmi_smc *scmi_info = data;
 
-	complete(&scmi_info->tx_complete);
+	scmi_rx_callback(scmi_info->cinfo,
+			 shmem_read_header(scmi_info->shmem), NULL);
 
 	return IRQ_HANDLED;
 }
@@ -111,8 +108,8 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 			dev_err(dev, "failed to setup SCMI smc irq\n");
 			return ret;
 		}
-		init_completion(&scmi_info->tx_complete);
-		scmi_info->irq = irq;
+	} else {
+		cinfo->no_completion_irq = true;
 	}
 
 	scmi_info->func_id = func_id;
@@ -142,26 +139,22 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 	struct arm_smccc_res res;
 
+	/*
+	 * Channel lock will be released only once response has been
+	 * surely fully retrieved, so after .mark_txdone()
+	 */
 	mutex_lock(&scmi_info->shmem_lock);
 
 	shmem_tx_prepare(scmi_info->shmem, xfer);
 
-	if (scmi_info->irq)
-		reinit_completion(&scmi_info->tx_complete);
-
 	arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
 
-	if (scmi_info->irq)
-		wait_for_completion(&scmi_info->tx_complete);
-
-	scmi_rx_callback(scmi_info->cinfo,
-			 shmem_read_header(scmi_info->shmem), NULL);
-
-	mutex_unlock(&scmi_info->shmem_lock);
-
 	/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
-	if (res.a0)
+	if (res.a0) {
+		mutex_unlock(&scmi_info->shmem_lock);
 		return -EOPNOTSUPP;
+	}
+
 	return 0;
 }
 
@@ -173,6 +166,13 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(scmi_info->shmem, xfer);
 }
 
+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+{
+	struct scmi_smc *scmi_info = cinfo->transport_info;
+
+	mutex_unlock(&scmi_info->shmem_lock);
+}
+
 static bool
 smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
 {
@@ -186,6 +186,7 @@ static const struct scmi_transport_ops scmi_smc_ops = {
 	.chan_setup = smc_chan_setup,
 	.chan_free = smc_chan_free,
 	.send_message = smc_send_message,
+	.mark_txdone = smc_mark_txdone,
 	.fetch_response = smc_fetch_response,
 	.poll_done = smc_poll_done,
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 03/11] firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a flag to let the transport signal to the core if its handling of sync
command implies that, after .send_message has returned successfully, the
requested command can be assumed to be fully and completely executed on
SCMI platform side so that any possible response value is already
immediately available to be retrieved by a .fetch_response: in other words
the polling phase can be skipped in such a case and the response values
accessed straight away.

Note that all of the above applies only when polling mode of operation was
selected by the core: if instead a completion IRQ was found to be available
the normal response processing path based on completions will still be
followed.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- renaming to sync_cmds_completed_on_ret
- removed poling_capable flag, using macros
v5 --> v6
- added polling_capable helper flag
v4 --> v5
- removed RFC tag
- consider sync_cmds_atomic_replies flag when deciding if polling is to be
  supported and .poll_done() is not provided.
- reviewed commit message
---
 drivers/firmware/arm_scmi/common.h |  8 ++++++
 drivers/firmware/arm_scmi/driver.c | 40 ++++++++++++++++++++++--------
 2 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 652e5d95ee65..24b1d1ac5f12 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -409,6 +409,13 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  * @max_msg_size: Maximum size of data per message that can be handled.
  * @force_polling: Flag to force this whole transport to use SCMI core polling
  *		   mechanism instead of completion interrupts even if available.
+ * @sync_cmds_completed_on_ret: Flag to indicate that the transport assures
+ *				synchronous-command messages are atomically
+ *				completed on .send_message: no need to poll
+ *				actively waiting for a response.
+ *				Used by core internally only when polling is
+ *				selected as a waiting for reply method: i.e.
+ *				if a completion irq was found use that anyway.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -418,6 +425,7 @@ struct scmi_desc {
 	int max_msg;
 	int max_msg_size;
 	const bool force_polling;
+	const bool sync_cmds_completed_on_ret;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index cb9fc12503f2..d4dd42ebc5e8 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -40,7 +40,14 @@
 	((__c)->no_completion_irq || (__i)->desc->force_polling)	\
 
 #define IS_TRANSPORT_POLLING_CAPABLE(__i)				\
-	((__i)->desc->ops->poll_done)
+({									\
+	bool __ret;							\
+	typeof(__i) i_ = __i;						\
+									\
+	__ret = ((i_)->desc->ops->poll_done ||				\
+			(i_)->desc->sync_cmds_completed_on_ret);	\
+	__ret;								\
+})
 
 #define IS_POLLING_ENABLED(__c, __i)					\
 ({									\
@@ -780,10 +787,28 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
 				      xfer->hdr.poll_completion);
 
 	if (xfer->hdr.poll_completion) {
-		ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+		/*
+		 * Real polling is needed only if transport has NOT declared
+		 * itself to support synchronous commands replies.
+		 */
+		if (!info->desc->sync_cmds_completed_on_ret) {
+			/*
+			 * Poll on xfer using transport provided .poll_done();
+			 * assumes no completion interrupt was available.
+			 */
+			ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+
+			spin_until_cond(scmi_xfer_done_no_timeout(cinfo,
+								  xfer, stop));
+			if (ktime_after(ktime_get(), stop)) {
+				dev_err(dev,
+					"timed out in resp(caller: %pS) - polling\n",
+					(void *)_RET_IP_);
+				ret = -ETIMEDOUT;
+			}
+		}
 
-		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
-		if (ktime_before(ktime_get(), stop)) {
+		if (!ret) {
 			unsigned long flags;
 
 			/*
@@ -796,11 +821,6 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
 				xfer->state = SCMI_XFER_RESP_OK;
 			}
 			spin_unlock_irqrestore(&xfer->lock, flags);
-		} else {
-			dev_err(dev,
-				"timed out in resp(caller: %pS) - polling\n",
-				(void *)_RET_IP_);
-			ret = -ETIMEDOUT;
 		}
 	} else {
 		/* And we wait for the response. */
@@ -835,7 +855,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	struct scmi_chan_info *cinfo;
 
 	/* Check for polling request on custom command xfers at first */
-	if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
+	if (xfer->hdr.poll_completion && !IS_TRANSPORT_POLLING_CAPABLE(info)) {
 		dev_warn_once(dev,
 			      "Polling mode is not supported by transport.\n");
 		return -EINVAL;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 03/11] firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a flag to let the transport signal to the core if its handling of sync
command implies that, after .send_message has returned successfully, the
requested command can be assumed to be fully and completely executed on
SCMI platform side so that any possible response value is already
immediately available to be retrieved by a .fetch_response: in other words
the polling phase can be skipped in such a case and the response values
accessed straight away.

Note that all of the above applies only when polling mode of operation was
selected by the core: if instead a completion IRQ was found to be available
the normal response processing path based on completions will still be
followed.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- renaming to sync_cmds_completed_on_ret
- removed poling_capable flag, using macros
v5 --> v6
- added polling_capable helper flag
v4 --> v5
- removed RFC tag
- consider sync_cmds_atomic_replies flag when deciding if polling is to be
  supported and .poll_done() is not provided.
- reviewed commit message
---
 drivers/firmware/arm_scmi/common.h |  8 ++++++
 drivers/firmware/arm_scmi/driver.c | 40 ++++++++++++++++++++++--------
 2 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 652e5d95ee65..24b1d1ac5f12 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -409,6 +409,13 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  * @max_msg_size: Maximum size of data per message that can be handled.
  * @force_polling: Flag to force this whole transport to use SCMI core polling
  *		   mechanism instead of completion interrupts even if available.
+ * @sync_cmds_completed_on_ret: Flag to indicate that the transport assures
+ *				synchronous-command messages are atomically
+ *				completed on .send_message: no need to poll
+ *				actively waiting for a response.
+ *				Used by core internally only when polling is
+ *				selected as a waiting for reply method: i.e.
+ *				if a completion irq was found use that anyway.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -418,6 +425,7 @@ struct scmi_desc {
 	int max_msg;
 	int max_msg_size;
 	const bool force_polling;
+	const bool sync_cmds_completed_on_ret;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index cb9fc12503f2..d4dd42ebc5e8 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -40,7 +40,14 @@
 	((__c)->no_completion_irq || (__i)->desc->force_polling)	\
 
 #define IS_TRANSPORT_POLLING_CAPABLE(__i)				\
-	((__i)->desc->ops->poll_done)
+({									\
+	bool __ret;							\
+	typeof(__i) i_ = __i;						\
+									\
+	__ret = ((i_)->desc->ops->poll_done ||				\
+			(i_)->desc->sync_cmds_completed_on_ret);	\
+	__ret;								\
+})
 
 #define IS_POLLING_ENABLED(__c, __i)					\
 ({									\
@@ -780,10 +787,28 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
 				      xfer->hdr.poll_completion);
 
 	if (xfer->hdr.poll_completion) {
-		ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+		/*
+		 * Real polling is needed only if transport has NOT declared
+		 * itself to support synchronous commands replies.
+		 */
+		if (!info->desc->sync_cmds_completed_on_ret) {
+			/*
+			 * Poll on xfer using transport provided .poll_done();
+			 * assumes no completion interrupt was available.
+			 */
+			ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+
+			spin_until_cond(scmi_xfer_done_no_timeout(cinfo,
+								  xfer, stop));
+			if (ktime_after(ktime_get(), stop)) {
+				dev_err(dev,
+					"timed out in resp(caller: %pS) - polling\n",
+					(void *)_RET_IP_);
+				ret = -ETIMEDOUT;
+			}
+		}
 
-		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
-		if (ktime_before(ktime_get(), stop)) {
+		if (!ret) {
 			unsigned long flags;
 
 			/*
@@ -796,11 +821,6 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
 				xfer->state = SCMI_XFER_RESP_OK;
 			}
 			spin_unlock_irqrestore(&xfer->lock, flags);
-		} else {
-			dev_err(dev,
-				"timed out in resp(caller: %pS) - polling\n",
-				(void *)_RET_IP_);
-			ret = -ETIMEDOUT;
 		}
 	} else {
 		/* And we wait for the response. */
@@ -835,7 +855,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 	struct scmi_chan_info *cinfo;
 
 	/* Check for polling request on custom command xfers at first */
-	if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
+	if (xfer->hdr.poll_completion && !IS_TRANSPORT_POLLING_CAPABLE(info)) {
 		dev_warn_once(dev,
 			      "Polling mode is not supported by transport.\n");
 		return -EINVAL;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 04/11] firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Enable sync_cmds_completed_on_ret in the SMC transport descriptor and
remove SMC specific .poll_done callback support since polling is bypassed
when sync_cmds_completed_on_ret is set.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- renaming to sync_cmds_completed_on_ret
v4 --> v5
- removed RFC tag
- added comment on setting flag
- remove smc_poll_done
---
 drivers/firmware/arm_scmi/smc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index d6c6ad9f6bab..df0defd9f8bb 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -173,14 +173,6 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 	mutex_unlock(&scmi_info->shmem_lock);
 }
 
-static bool
-smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
-{
-	struct scmi_smc *scmi_info = cinfo->transport_info;
-
-	return shmem_poll_done(scmi_info->shmem, xfer);
-}
-
 static const struct scmi_transport_ops scmi_smc_ops = {
 	.chan_available = smc_chan_available,
 	.chan_setup = smc_chan_setup,
@@ -188,7 +180,6 @@ static const struct scmi_transport_ops scmi_smc_ops = {
 	.send_message = smc_send_message,
 	.mark_txdone = smc_mark_txdone,
 	.fetch_response = smc_fetch_response,
-	.poll_done = smc_poll_done,
 };
 
 const struct scmi_desc scmi_smc_desc = {
@@ -196,4 +187,13 @@ const struct scmi_desc scmi_smc_desc = {
 	.max_rx_timeout_ms = 30,
 	.max_msg = 20,
 	.max_msg_size = 128,
+	/*
+	 * Setting .sync_cmds_atomic_replies to true for SMC assumes that,
+	 * once the SMC instruction has completed successfully, the issued
+	 * SCMI command would have been already fully processed by the SCMI
+	 * platform firmware and so any possible response value expected
+	 * for the issued command will be immmediately ready to be fetched
+	 * from the shared memory area.
+	 */
+	.sync_cmds_completed_on_ret = true,
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 04/11] firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Enable sync_cmds_completed_on_ret in the SMC transport descriptor and
remove SMC specific .poll_done callback support since polling is bypassed
when sync_cmds_completed_on_ret is set.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- renaming to sync_cmds_completed_on_ret
v4 --> v5
- removed RFC tag
- added comment on setting flag
- remove smc_poll_done
---
 drivers/firmware/arm_scmi/smc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index d6c6ad9f6bab..df0defd9f8bb 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -173,14 +173,6 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 	mutex_unlock(&scmi_info->shmem_lock);
 }
 
-static bool
-smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
-{
-	struct scmi_smc *scmi_info = cinfo->transport_info;
-
-	return shmem_poll_done(scmi_info->shmem, xfer);
-}
-
 static const struct scmi_transport_ops scmi_smc_ops = {
 	.chan_available = smc_chan_available,
 	.chan_setup = smc_chan_setup,
@@ -188,7 +180,6 @@ static const struct scmi_transport_ops scmi_smc_ops = {
 	.send_message = smc_send_message,
 	.mark_txdone = smc_mark_txdone,
 	.fetch_response = smc_fetch_response,
-	.poll_done = smc_poll_done,
 };
 
 const struct scmi_desc scmi_smc_desc = {
@@ -196,4 +187,13 @@ const struct scmi_desc scmi_smc_desc = {
 	.max_rx_timeout_ms = 30,
 	.max_msg = 20,
 	.max_msg_size = 128,
+	/*
+	 * Setting .sync_cmds_atomic_replies to true for SMC assumes that,
+	 * once the SMC instruction has completed successfully, the issued
+	 * SCMI command would have been already fully processed by the SCMI
+	 * platform firmware and so any possible response value expected
+	 * for the issued command will be immmediately ready to be fetched
+	 * from the shared memory area.
+	 */
+	.sync_cmds_completed_on_ret = true,
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 05/11] firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Declare each OPTEE SCMI channel as not having a completion_irq so as to
enable polling mode and then enable also .sync_cmds_completed_on_ret flag
in the OPTEE transport descriptor so that real polling is itself
effectively bypassed on the rx path: once the optee command invocation has
successfully returned the core will directly fetch the response from the
shared memory area.

Remove OPTEE SCMI transport specific .poll_done callback support since
real polling is effectively bypassed when .sync_cmds_completed_on_ret is
set.

Add OPTEE SCMI transport specific .mark_txdone callback support in order to
properly handle channel locking along the tx path.

Cc: Etienne Carriere <etienne.carriere@linaro.org>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 -> v8
- renaming to sync_cmds_completed_on_ret
v6 --> v7
- reviewed commit message
---
 drivers/firmware/arm_scmi/optee.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index 175b39bcd470..f460e12be4ea 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -363,6 +363,9 @@ static int scmi_optee_chan_setup(struct scmi_chan_info *cinfo, struct device *de
 	if (ret)
 		goto err_close_sess;
 
+	/* Enable polling */
+	cinfo->no_completion_irq = true;
+
 	mutex_lock(&scmi_optee_private->mu);
 	list_add(&channel->link, &scmi_optee_private->channel_list);
 	mutex_unlock(&scmi_optee_private->mu);
@@ -423,9 +426,8 @@ static int scmi_optee_send_message(struct scmi_chan_info *cinfo,
 	shmem_tx_prepare(shmem, xfer);
 
 	ret = invoke_process_smt_channel(channel);
-
-	scmi_rx_callback(cinfo, shmem_read_header(shmem), NULL);
-	mutex_unlock(&channel->mu);
+	if (ret)
+		mutex_unlock(&channel->mu);
 
 	return ret;
 }
@@ -439,13 +441,11 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(shmem, xfer);
 }
 
-static bool scmi_optee_poll_done(struct scmi_chan_info *cinfo,
-				 struct scmi_xfer *xfer)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 {
 	struct scmi_optee_channel *channel = cinfo->transport_info;
-	struct scmi_shared_mem *shmem = get_channel_shm(channel, xfer);
 
-	return shmem_poll_done(shmem, xfer);
+	mutex_unlock(&channel->mu);
 }
 
 static struct scmi_transport_ops scmi_optee_ops = {
@@ -454,9 +454,9 @@ static struct scmi_transport_ops scmi_optee_ops = {
 	.chan_setup = scmi_optee_chan_setup,
 	.chan_free = scmi_optee_chan_free,
 	.send_message = scmi_optee_send_message,
+	.mark_txdone = scmi_optee_mark_txdone,
 	.fetch_response = scmi_optee_fetch_response,
 	.clear_channel = scmi_optee_clear_channel,
-	.poll_done = scmi_optee_poll_done,
 };
 
 static int scmi_optee_ctx_match(struct tee_ioctl_version_data *ver, const void *data)
@@ -562,4 +562,5 @@ const struct scmi_desc scmi_optee_desc = {
 	.max_rx_timeout_ms = 30,
 	.max_msg = 20,
 	.max_msg_size = SCMI_OPTEE_MAX_MSG_SIZE,
+	.sync_cmds_completed_on_ret = true,
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 05/11] firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Declare each OPTEE SCMI channel as not having a completion_irq so as to
enable polling mode and then enable also .sync_cmds_completed_on_ret flag
in the OPTEE transport descriptor so that real polling is itself
effectively bypassed on the rx path: once the optee command invocation has
successfully returned the core will directly fetch the response from the
shared memory area.

Remove OPTEE SCMI transport specific .poll_done callback support since
real polling is effectively bypassed when .sync_cmds_completed_on_ret is
set.

Add OPTEE SCMI transport specific .mark_txdone callback support in order to
properly handle channel locking along the tx path.

Cc: Etienne Carriere <etienne.carriere@linaro.org>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 -> v8
- renaming to sync_cmds_completed_on_ret
v6 --> v7
- reviewed commit message
---
 drivers/firmware/arm_scmi/optee.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index 175b39bcd470..f460e12be4ea 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -363,6 +363,9 @@ static int scmi_optee_chan_setup(struct scmi_chan_info *cinfo, struct device *de
 	if (ret)
 		goto err_close_sess;
 
+	/* Enable polling */
+	cinfo->no_completion_irq = true;
+
 	mutex_lock(&scmi_optee_private->mu);
 	list_add(&channel->link, &scmi_optee_private->channel_list);
 	mutex_unlock(&scmi_optee_private->mu);
@@ -423,9 +426,8 @@ static int scmi_optee_send_message(struct scmi_chan_info *cinfo,
 	shmem_tx_prepare(shmem, xfer);
 
 	ret = invoke_process_smt_channel(channel);
-
-	scmi_rx_callback(cinfo, shmem_read_header(shmem), NULL);
-	mutex_unlock(&channel->mu);
+	if (ret)
+		mutex_unlock(&channel->mu);
 
 	return ret;
 }
@@ -439,13 +441,11 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(shmem, xfer);
 }
 
-static bool scmi_optee_poll_done(struct scmi_chan_info *cinfo,
-				 struct scmi_xfer *xfer)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 {
 	struct scmi_optee_channel *channel = cinfo->transport_info;
-	struct scmi_shared_mem *shmem = get_channel_shm(channel, xfer);
 
-	return shmem_poll_done(shmem, xfer);
+	mutex_unlock(&channel->mu);
 }
 
 static struct scmi_transport_ops scmi_optee_ops = {
@@ -454,9 +454,9 @@ static struct scmi_transport_ops scmi_optee_ops = {
 	.chan_setup = scmi_optee_chan_setup,
 	.chan_free = scmi_optee_chan_free,
 	.send_message = scmi_optee_send_message,
+	.mark_txdone = scmi_optee_mark_txdone,
 	.fetch_response = scmi_optee_fetch_response,
 	.clear_channel = scmi_optee_clear_channel,
-	.poll_done = scmi_optee_poll_done,
 };
 
 static int scmi_optee_ctx_match(struct tee_ioctl_version_data *ver, const void *data)
@@ -562,4 +562,5 @@ const struct scmi_desc scmi_optee_desc = {
 	.max_rx_timeout_ms = 30,
 	.max_msg = 20,
 	.max_msg_size = SCMI_OPTEE_MAX_MSG_SIZE,
+	.sync_cmds_completed_on_ret = true,
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 06/11] firmware: arm_scmi: Add support for atomic transports
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

An SCMI transport can be configured as .atomic_enabled in order to signal
to the SCMI core that all its TX path is executed in atomic context and
that, when requested, polling mode should be used while waiting for command
responses.

When a specific platform configuration had properly configured such a
transport as .atomic_enabled, the SCMI core will also take care not to
sleep in the corresponding RX path while waiting for a response if that
specific command transaction was requested as atomic using polling mode.

Asynchronous commands should not be used in an atomic context and so a
warning is emitted if polling was requested for an asynchronous command.

Add also a method to check, from the SCMI drivers, if the underlying SCMI
transport is currently configured to support atomic transactions: this will
be used by upper layers to determine if atomic requests can be supported at
all on this SCMI instance.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed usage of poling_capable, use macros
v6 --> v7
- reviewed commit message
- converted async WARN_ON to WARN_ON_ONCE
v5 --> v6
- removed atomic_capable
- fully relying on transport polling capabilities
- removed polling/atomic support for delayed_reponse and WARN
- merged with is_transport_atomic() patch
- is_transport_atomic() now considers polling_capable and
  atomic_enabled flags
v4 --> v5
- added .atomic_enabled flag to decide wheter to enable atomic mode or not
  for atomic_capable transports
- reviewed commit message
---
 drivers/firmware/arm_scmi/common.h |  4 +++
 drivers/firmware/arm_scmi/driver.c | 51 ++++++++++++++++++++++++++++--
 include/linux/scmi_protocol.h      |  8 +++++
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 24b1d1ac5f12..01d42c2069d4 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -416,6 +416,9 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  *				Used by core internally only when polling is
  *				selected as a waiting for reply method: i.e.
  *				if a completion irq was found use that anyway.
+ * @atomic_enabled: Flag to indicate that this transport, which is assured not
+ *		    to sleep anywhere on the TX path, can be used in atomic mode
+ *		    when requested.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -426,6 +429,7 @@ struct scmi_desc {
 	int max_msg_size;
 	const bool force_polling;
 	const bool sync_cmds_completed_on_ret;
+	const bool atomic_enabled;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index d4dd42ebc5e8..0406e030db1b 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -928,6 +928,20 @@ static void reset_rx_to_maxsz(const struct scmi_protocol_handle *ph,
  * @ph: Pointer to SCMI protocol handle
  * @xfer: Transfer to initiate and wait for response
  *
+ * Using asynchronous commands in atomic/polling mode should be avoided since
+ * it could cause long busy-waiting here, so ignore polling for the delayed
+ * response and WARN if it was requested for this command transaction since
+ * upper layers should refrain from issuing such kind of requests.
+ *
+ * The only other option would have been to refrain from using any asynchronous
+ * command even if made available, when an atomic transport is detected, and
+ * instead forcibly use the synchronous version (thing that can be easily
+ * attained at the protocol layer), but this would also have led to longer
+ * stalls of the channel for synchronous commands and possibly timeouts.
+ * (in other words there is usually a good reason if a platform provides an
+ *  asynchronous version of a command and we should prefer to use it...just not
+ *  when using atomic/polling mode)
+ *
  * Return: -ETIMEDOUT in case of no delayed response, if transmit error,
  *	return corresponding error, else if all goes well, return 0.
  */
@@ -939,12 +953,24 @@ static int do_xfer_with_response(const struct scmi_protocol_handle *ph,
 
 	xfer->async_done = &async_response;
 
+	/*
+	 * Delayed responses should not be polled, so an async command should
+	 * not have been used when requiring an atomic/poll context; WARN and
+	 * perform instead a sleeping wait.
+	 * (Note Async + IgnoreDelayedResponses are sent via do_xfer)
+	 */
+	WARN_ON_ONCE(xfer->hdr.poll_completion);
+
 	ret = do_xfer(ph, xfer);
 	if (!ret) {
-		if (!wait_for_completion_timeout(xfer->async_done, timeout))
+		if (!wait_for_completion_timeout(xfer->async_done, timeout)) {
+			dev_err(ph->dev,
+				"timed out in delayed resp(caller: %pS)\n",
+				(void *)_RET_IP_);
 			ret = -ETIMEDOUT;
-		else if (xfer->hdr.status)
+		} else if (xfer->hdr.status) {
 			ret = scmi_to_linux_errno(xfer->hdr.status);
+		}
 	}
 
 	xfer->async_done = NULL;
@@ -1378,6 +1404,22 @@ static void scmi_devm_protocol_put(struct scmi_device *sdev, u8 protocol_id)
 	WARN_ON(ret);
 }
 
+/**
+ * scmi_is_transport_atomic  - Method to check if underlying transport for an
+ * SCMI instance is configured as atomic.
+ *
+ * @handle: A reference to the SCMI platform instance.
+ *
+ * Return: True if transport is configured as atomic
+ */
+static bool scmi_is_transport_atomic(const struct scmi_handle *handle)
+{
+	struct scmi_info *info = handle_to_scmi_info(handle);
+
+	return info->desc->atomic_enabled &&
+		IS_TRANSPORT_POLLING_CAPABLE(info);
+}
+
 static inline
 struct scmi_handle *scmi_handle_get_from_info_unlocked(struct scmi_info *info)
 {
@@ -1915,6 +1957,7 @@ static int scmi_probe(struct platform_device *pdev)
 	handle->version = &info->version;
 	handle->devm_protocol_get = scmi_devm_protocol_get;
 	handle->devm_protocol_put = scmi_devm_protocol_put;
+	handle->is_transport_atomic = scmi_is_transport_atomic;
 
 	if (desc->ops->link_supplier) {
 		ret = desc->ops->link_supplier(dev);
@@ -1933,6 +1976,10 @@ static int scmi_probe(struct platform_device *pdev)
 	if (scmi_notification_init(handle))
 		dev_err(dev, "SCMI Notifications NOT available.\n");
 
+	if (info->desc->atomic_enabled && !IS_TRANSPORT_POLLING_CAPABLE(info))
+		dev_err(dev,
+			"Transport is not polling capable. Atomic mode not supported.\n");
+
 	/*
 	 * Trigger SCMI Base protocol initialization.
 	 * It's mandatory and won't be ever released/deinit until the
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 80e781c51ddc..9f895cb81818 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -612,6 +612,13 @@ struct scmi_notify_ops {
  * @devm_protocol_get: devres managed method to acquire a protocol and get specific
  *		       operations and a dedicated protocol handler
  * @devm_protocol_put: devres managed method to release a protocol
+ * @is_transport_atomic: method to check if the underlying transport for this
+ *			 instance handle is configured to support atomic
+ *			 transactions for commands.
+ *			 Some users of the SCMI stack in the upper layers could
+ *			 be interested to know if they can assume SCMI
+ *			 command transactions associated to this handle will
+ *			 never sleep and act accordingly.
  * @notify_ops: pointer to set of notifications related operations
  */
 struct scmi_handle {
@@ -622,6 +629,7 @@ struct scmi_handle {
 		(*devm_protocol_get)(struct scmi_device *sdev, u8 proto,
 				     struct scmi_protocol_handle **ph);
 	void (*devm_protocol_put)(struct scmi_device *sdev, u8 proto);
+	bool (*is_transport_atomic)(const struct scmi_handle *handle);
 
 	const struct scmi_notify_ops *notify_ops;
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 06/11] firmware: arm_scmi: Add support for atomic transports
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

An SCMI transport can be configured as .atomic_enabled in order to signal
to the SCMI core that all its TX path is executed in atomic context and
that, when requested, polling mode should be used while waiting for command
responses.

When a specific platform configuration had properly configured such a
transport as .atomic_enabled, the SCMI core will also take care not to
sleep in the corresponding RX path while waiting for a response if that
specific command transaction was requested as atomic using polling mode.

Asynchronous commands should not be used in an atomic context and so a
warning is emitted if polling was requested for an asynchronous command.

Add also a method to check, from the SCMI drivers, if the underlying SCMI
transport is currently configured to support atomic transactions: this will
be used by upper layers to determine if atomic requests can be supported at
all on this SCMI instance.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed usage of poling_capable, use macros
v6 --> v7
- reviewed commit message
- converted async WARN_ON to WARN_ON_ONCE
v5 --> v6
- removed atomic_capable
- fully relying on transport polling capabilities
- removed polling/atomic support for delayed_reponse and WARN
- merged with is_transport_atomic() patch
- is_transport_atomic() now considers polling_capable and
  atomic_enabled flags
v4 --> v5
- added .atomic_enabled flag to decide wheter to enable atomic mode or not
  for atomic_capable transports
- reviewed commit message
---
 drivers/firmware/arm_scmi/common.h |  4 +++
 drivers/firmware/arm_scmi/driver.c | 51 ++++++++++++++++++++++++++++--
 include/linux/scmi_protocol.h      |  8 +++++
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 24b1d1ac5f12..01d42c2069d4 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -416,6 +416,9 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
  *				Used by core internally only when polling is
  *				selected as a waiting for reply method: i.e.
  *				if a completion irq was found use that anyway.
+ * @atomic_enabled: Flag to indicate that this transport, which is assured not
+ *		    to sleep anywhere on the TX path, can be used in atomic mode
+ *		    when requested.
  */
 struct scmi_desc {
 	int (*transport_init)(void);
@@ -426,6 +429,7 @@ struct scmi_desc {
 	int max_msg_size;
 	const bool force_polling;
 	const bool sync_cmds_completed_on_ret;
+	const bool atomic_enabled;
 };
 
 #ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index d4dd42ebc5e8..0406e030db1b 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -928,6 +928,20 @@ static void reset_rx_to_maxsz(const struct scmi_protocol_handle *ph,
  * @ph: Pointer to SCMI protocol handle
  * @xfer: Transfer to initiate and wait for response
  *
+ * Using asynchronous commands in atomic/polling mode should be avoided since
+ * it could cause long busy-waiting here, so ignore polling for the delayed
+ * response and WARN if it was requested for this command transaction since
+ * upper layers should refrain from issuing such kind of requests.
+ *
+ * The only other option would have been to refrain from using any asynchronous
+ * command even if made available, when an atomic transport is detected, and
+ * instead forcibly use the synchronous version (thing that can be easily
+ * attained at the protocol layer), but this would also have led to longer
+ * stalls of the channel for synchronous commands and possibly timeouts.
+ * (in other words there is usually a good reason if a platform provides an
+ *  asynchronous version of a command and we should prefer to use it...just not
+ *  when using atomic/polling mode)
+ *
  * Return: -ETIMEDOUT in case of no delayed response, if transmit error,
  *	return corresponding error, else if all goes well, return 0.
  */
@@ -939,12 +953,24 @@ static int do_xfer_with_response(const struct scmi_protocol_handle *ph,
 
 	xfer->async_done = &async_response;
 
+	/*
+	 * Delayed responses should not be polled, so an async command should
+	 * not have been used when requiring an atomic/poll context; WARN and
+	 * perform instead a sleeping wait.
+	 * (Note Async + IgnoreDelayedResponses are sent via do_xfer)
+	 */
+	WARN_ON_ONCE(xfer->hdr.poll_completion);
+
 	ret = do_xfer(ph, xfer);
 	if (!ret) {
-		if (!wait_for_completion_timeout(xfer->async_done, timeout))
+		if (!wait_for_completion_timeout(xfer->async_done, timeout)) {
+			dev_err(ph->dev,
+				"timed out in delayed resp(caller: %pS)\n",
+				(void *)_RET_IP_);
 			ret = -ETIMEDOUT;
-		else if (xfer->hdr.status)
+		} else if (xfer->hdr.status) {
 			ret = scmi_to_linux_errno(xfer->hdr.status);
+		}
 	}
 
 	xfer->async_done = NULL;
@@ -1378,6 +1404,22 @@ static void scmi_devm_protocol_put(struct scmi_device *sdev, u8 protocol_id)
 	WARN_ON(ret);
 }
 
+/**
+ * scmi_is_transport_atomic  - Method to check if underlying transport for an
+ * SCMI instance is configured as atomic.
+ *
+ * @handle: A reference to the SCMI platform instance.
+ *
+ * Return: True if transport is configured as atomic
+ */
+static bool scmi_is_transport_atomic(const struct scmi_handle *handle)
+{
+	struct scmi_info *info = handle_to_scmi_info(handle);
+
+	return info->desc->atomic_enabled &&
+		IS_TRANSPORT_POLLING_CAPABLE(info);
+}
+
 static inline
 struct scmi_handle *scmi_handle_get_from_info_unlocked(struct scmi_info *info)
 {
@@ -1915,6 +1957,7 @@ static int scmi_probe(struct platform_device *pdev)
 	handle->version = &info->version;
 	handle->devm_protocol_get = scmi_devm_protocol_get;
 	handle->devm_protocol_put = scmi_devm_protocol_put;
+	handle->is_transport_atomic = scmi_is_transport_atomic;
 
 	if (desc->ops->link_supplier) {
 		ret = desc->ops->link_supplier(dev);
@@ -1933,6 +1976,10 @@ static int scmi_probe(struct platform_device *pdev)
 	if (scmi_notification_init(handle))
 		dev_err(dev, "SCMI Notifications NOT available.\n");
 
+	if (info->desc->atomic_enabled && !IS_TRANSPORT_POLLING_CAPABLE(info))
+		dev_err(dev,
+			"Transport is not polling capable. Atomic mode not supported.\n");
+
 	/*
 	 * Trigger SCMI Base protocol initialization.
 	 * It's mandatory and won't be ever released/deinit until the
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 80e781c51ddc..9f895cb81818 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -612,6 +612,13 @@ struct scmi_notify_ops {
  * @devm_protocol_get: devres managed method to acquire a protocol and get specific
  *		       operations and a dedicated protocol handler
  * @devm_protocol_put: devres managed method to release a protocol
+ * @is_transport_atomic: method to check if the underlying transport for this
+ *			 instance handle is configured to support atomic
+ *			 transactions for commands.
+ *			 Some users of the SCMI stack in the upper layers could
+ *			 be interested to know if they can assume SCMI
+ *			 command transactions associated to this handle will
+ *			 never sleep and act accordingly.
  * @notify_ops: pointer to set of notifications related operations
  */
 struct scmi_handle {
@@ -622,6 +629,7 @@ struct scmi_handle {
 		(*devm_protocol_get)(struct scmi_device *sdev, u8 proto,
 				     struct scmi_protocol_handle **ph);
 	void (*devm_protocol_put)(struct scmi_device *sdev, u8 proto);
+	bool (*is_transport_atomic)(const struct scmi_handle *handle);
 
 	const struct scmi_notify_ops *notify_ops;
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 07/11] firmware: arm_scmi: Add atomic mode support to smc transport
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a Kernel configuration option to enable SCMI SMC transport atomic
mode operation for selected SCMI transactions and leave it as default
disabled.

Substitute mutex usages with busy-waiting and declare smc transport as
.atomic_enabled if such Kernel configuration option is enabled.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed ifdeffery, using IS_ENABLED
v5 --> v6
- remove usage of atomic_capable
- removed needless union
- reviewed Kconfig help
v4 --> v5
- removed RFC tag
- add CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE option
- add .atomic_enable support
- make atomic_capable dependent on
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
- make also usage of mutexes vs busy-waiting dependent on
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
---
 drivers/firmware/arm_scmi/Kconfig | 14 ++++++++
 drivers/firmware/arm_scmi/smc.c   | 56 +++++++++++++++++++++++++++----
 2 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index 638ecec89ff1..d429326433d1 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -78,6 +78,20 @@ config ARM_SCMI_TRANSPORT_SMC
 	  If you want the ARM SCMI PROTOCOL stack to include support for a
 	  transport based on SMC, answer Y.
 
+config ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+	bool "Enable atomic mode support for SCMI SMC transport"
+	depends on ARM_SCMI_TRANSPORT_SMC
+	help
+	  Enable support of atomic operation for SCMI SMC based transport.
+
+	  If you want the SCMI SMC based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 config ARM_SCMI_TRANSPORT_VIRTIO
 	bool "SCMI transport based on VirtIO"
 	depends on VIRTIO=y || VIRTIO=ARM_SCMI_PROTOCOL
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index df0defd9f8bb..6c7871a40611 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/arm-smccc.h>
+#include <linux/atomic.h>
 #include <linux/device.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
@@ -14,6 +15,7 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/processor.h>
 #include <linux/slab.h>
 
 #include "common.h"
@@ -23,14 +25,20 @@
  *
  * @cinfo: SCMI channel info
  * @shmem: Transmit/Receive shared memory area
- * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
+ * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
+ *		Used when NOT operating in atomic mode.
+ * @inflight: Atomic flag to protect access to Tx/Rx shared memory area.
+ *	      Used when operating in atomic mode.
  * @func_id: smc/hvc call function id
  */
 
 struct scmi_smc {
 	struct scmi_chan_info *cinfo;
 	struct scmi_shared_mem __iomem *shmem;
+	/* Protect access to shmem area */
 	struct mutex shmem_lock;
+#define INFLIGHT_NONE	MSG_TOKEN_MAX
+	atomic_t inflight;
 	u32 func_id;
 };
 
@@ -54,6 +62,41 @@ static bool smc_chan_available(struct device *dev, int idx)
 	return true;
 }
 
+static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+	else
+		mutex_init(&scmi_info->shmem_lock);
+}
+
+static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
+{
+	int ret;
+
+	ret = atomic_cmpxchg(inflight, INFLIGHT_NONE, xfer->hdr.seq);
+
+	return ret == INFLIGHT_NONE;
+}
+
+static inline void
+smc_channel_lock_acquire(struct scmi_smc *scmi_info,
+			 struct scmi_xfer *xfer __maybe_unused)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		spin_until_cond(smc_xfer_inflight(xfer, &scmi_info->inflight));
+	else
+		mutex_lock(&scmi_info->shmem_lock);
+}
+
+static inline void smc_channel_lock_release(struct scmi_smc *scmi_info)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+	else
+		mutex_unlock(&scmi_info->shmem_lock);
+}
+
 static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 			  bool tx)
 {
@@ -114,7 +157,7 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	scmi_info->func_id = func_id;
 	scmi_info->cinfo = cinfo;
-	mutex_init(&scmi_info->shmem_lock);
+	smc_channel_lock_init(scmi_info);
 	cinfo->transport_info = scmi_info;
 
 	return 0;
@@ -140,10 +183,10 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 	struct arm_smccc_res res;
 
 	/*
-	 * Channel lock will be released only once response has been
+	 * Channel will be released only once response has been
 	 * surely fully retrieved, so after .mark_txdone()
 	 */
-	mutex_lock(&scmi_info->shmem_lock);
+	smc_channel_lock_acquire(scmi_info, xfer);
 
 	shmem_tx_prepare(scmi_info->shmem, xfer);
 
@@ -151,7 +194,7 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 
 	/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
 	if (res.a0) {
-		mutex_unlock(&scmi_info->shmem_lock);
+		smc_channel_lock_release(scmi_info);
 		return -EOPNOTSUPP;
 	}
 
@@ -170,7 +213,7 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 {
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 
-	mutex_unlock(&scmi_info->shmem_lock);
+	smc_channel_lock_release(scmi_info);
 }
 
 static const struct scmi_transport_ops scmi_smc_ops = {
@@ -196,4 +239,5 @@ const struct scmi_desc scmi_smc_desc = {
 	 * from the shared memory area.
 	 */
 	.sync_cmds_completed_on_ret = true,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE),
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 07/11] firmware: arm_scmi: Add atomic mode support to smc transport
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a Kernel configuration option to enable SCMI SMC transport atomic
mode operation for selected SCMI transactions and leave it as default
disabled.

Substitute mutex usages with busy-waiting and declare smc transport as
.atomic_enabled if such Kernel configuration option is enabled.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed ifdeffery, using IS_ENABLED
v5 --> v6
- remove usage of atomic_capable
- removed needless union
- reviewed Kconfig help
v4 --> v5
- removed RFC tag
- add CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE option
- add .atomic_enable support
- make atomic_capable dependent on
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
- make also usage of mutexes vs busy-waiting dependent on
  CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
---
 drivers/firmware/arm_scmi/Kconfig | 14 ++++++++
 drivers/firmware/arm_scmi/smc.c   | 56 +++++++++++++++++++++++++++----
 2 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index 638ecec89ff1..d429326433d1 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -78,6 +78,20 @@ config ARM_SCMI_TRANSPORT_SMC
 	  If you want the ARM SCMI PROTOCOL stack to include support for a
 	  transport based on SMC, answer Y.
 
+config ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+	bool "Enable atomic mode support for SCMI SMC transport"
+	depends on ARM_SCMI_TRANSPORT_SMC
+	help
+	  Enable support of atomic operation for SCMI SMC based transport.
+
+	  If you want the SCMI SMC based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 config ARM_SCMI_TRANSPORT_VIRTIO
 	bool "SCMI transport based on VirtIO"
 	depends on VIRTIO=y || VIRTIO=ARM_SCMI_PROTOCOL
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index df0defd9f8bb..6c7871a40611 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/arm-smccc.h>
+#include <linux/atomic.h>
 #include <linux/device.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
@@ -14,6 +15,7 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/processor.h>
 #include <linux/slab.h>
 
 #include "common.h"
@@ -23,14 +25,20 @@
  *
  * @cinfo: SCMI channel info
  * @shmem: Transmit/Receive shared memory area
- * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
+ * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
+ *		Used when NOT operating in atomic mode.
+ * @inflight: Atomic flag to protect access to Tx/Rx shared memory area.
+ *	      Used when operating in atomic mode.
  * @func_id: smc/hvc call function id
  */
 
 struct scmi_smc {
 	struct scmi_chan_info *cinfo;
 	struct scmi_shared_mem __iomem *shmem;
+	/* Protect access to shmem area */
 	struct mutex shmem_lock;
+#define INFLIGHT_NONE	MSG_TOKEN_MAX
+	atomic_t inflight;
 	u32 func_id;
 };
 
@@ -54,6 +62,41 @@ static bool smc_chan_available(struct device *dev, int idx)
 	return true;
 }
 
+static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+	else
+		mutex_init(&scmi_info->shmem_lock);
+}
+
+static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
+{
+	int ret;
+
+	ret = atomic_cmpxchg(inflight, INFLIGHT_NONE, xfer->hdr.seq);
+
+	return ret == INFLIGHT_NONE;
+}
+
+static inline void
+smc_channel_lock_acquire(struct scmi_smc *scmi_info,
+			 struct scmi_xfer *xfer __maybe_unused)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		spin_until_cond(smc_xfer_inflight(xfer, &scmi_info->inflight));
+	else
+		mutex_lock(&scmi_info->shmem_lock);
+}
+
+static inline void smc_channel_lock_release(struct scmi_smc *scmi_info)
+{
+	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
+		atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+	else
+		mutex_unlock(&scmi_info->shmem_lock);
+}
+
 static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 			  bool tx)
 {
@@ -114,7 +157,7 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	scmi_info->func_id = func_id;
 	scmi_info->cinfo = cinfo;
-	mutex_init(&scmi_info->shmem_lock);
+	smc_channel_lock_init(scmi_info);
 	cinfo->transport_info = scmi_info;
 
 	return 0;
@@ -140,10 +183,10 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 	struct arm_smccc_res res;
 
 	/*
-	 * Channel lock will be released only once response has been
+	 * Channel will be released only once response has been
 	 * surely fully retrieved, so after .mark_txdone()
 	 */
-	mutex_lock(&scmi_info->shmem_lock);
+	smc_channel_lock_acquire(scmi_info, xfer);
 
 	shmem_tx_prepare(scmi_info->shmem, xfer);
 
@@ -151,7 +194,7 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
 
 	/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
 	if (res.a0) {
-		mutex_unlock(&scmi_info->shmem_lock);
+		smc_channel_lock_release(scmi_info);
 		return -EOPNOTSUPP;
 	}
 
@@ -170,7 +213,7 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
 {
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 
-	mutex_unlock(&scmi_info->shmem_lock);
+	smc_channel_lock_release(scmi_info);
 }
 
 static const struct scmi_transport_ops scmi_smc_ops = {
@@ -196,4 +239,5 @@ const struct scmi_desc scmi_smc_desc = {
 	 * from the shared memory area.
 	 */
 	.sync_cmds_completed_on_ret = true,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE),
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 08/11] firmware: arm_scmi: Add new parameter to mark_txdone
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a new xfer parameter to mark_txdone transport operation which enables
the SCMI core to optionally pass back into the transport layer a reference
to the xfer descriptor that is being handled.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v6 --> v7
- use __unused macro for new param in existing transports
---
 drivers/firmware/arm_scmi/common.h  | 3 ++-
 drivers/firmware/arm_scmi/driver.c  | 2 +-
 drivers/firmware/arm_scmi/mailbox.c | 3 ++-
 drivers/firmware/arm_scmi/optee.c   | 3 ++-
 drivers/firmware/arm_scmi/smc.c     | 3 ++-
 5 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 01d42c2069d4..4fda84bfab42 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -378,7 +378,8 @@ struct scmi_transport_ops {
 	unsigned int (*get_max_msg)(struct scmi_chan_info *base_cinfo);
 	int (*send_message)(struct scmi_chan_info *cinfo,
 			    struct scmi_xfer *xfer);
-	void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret);
+	void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret,
+			    struct scmi_xfer *xfer);
 	void (*fetch_response)(struct scmi_chan_info *cinfo,
 			       struct scmi_xfer *xfer);
 	void (*fetch_notification)(struct scmi_chan_info *cinfo,
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 0406e030db1b..b4bbfe89368d 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -902,7 +902,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 		ret = scmi_to_linux_errno(xfer->hdr.status);
 
 	if (info->desc->ops->mark_txdone)
-		info->desc->ops->mark_txdone(cinfo, ret);
+		info->desc->ops->mark_txdone(cinfo, ret, xfer);
 
 	trace_scmi_xfer_end(xfer->transfer_id, xfer->hdr.id,
 			    xfer->hdr.protocol_id, xfer->hdr.seq, ret);
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index e09eb12bf421..08ff4d110beb 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -140,7 +140,8 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo,
 	return ret;
 }
 
-static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+				struct scmi_xfer *__unused)
 {
 	struct scmi_mailbox *smbox = cinfo->transport_info;
 
diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index f460e12be4ea..734f1eeee161 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -441,7 +441,8 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(shmem, xfer);
 }
 
-static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+				   struct scmi_xfer *__unused)
 {
 	struct scmi_optee_channel *channel = cinfo->transport_info;
 
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 6c7871a40611..745acfdd0b3d 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -209,7 +209,8 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(scmi_info->shmem, xfer);
 }
 
-static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			    struct scmi_xfer *__unused)
 {
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 08/11] firmware: arm_scmi: Add new parameter to mark_txdone
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Add a new xfer parameter to mark_txdone transport operation which enables
the SCMI core to optionally pass back into the transport layer a reference
to the xfer descriptor that is being handled.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v6 --> v7
- use __unused macro for new param in existing transports
---
 drivers/firmware/arm_scmi/common.h  | 3 ++-
 drivers/firmware/arm_scmi/driver.c  | 2 +-
 drivers/firmware/arm_scmi/mailbox.c | 3 ++-
 drivers/firmware/arm_scmi/optee.c   | 3 ++-
 drivers/firmware/arm_scmi/smc.c     | 3 ++-
 5 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 01d42c2069d4..4fda84bfab42 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -378,7 +378,8 @@ struct scmi_transport_ops {
 	unsigned int (*get_max_msg)(struct scmi_chan_info *base_cinfo);
 	int (*send_message)(struct scmi_chan_info *cinfo,
 			    struct scmi_xfer *xfer);
-	void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret);
+	void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret,
+			    struct scmi_xfer *xfer);
 	void (*fetch_response)(struct scmi_chan_info *cinfo,
 			       struct scmi_xfer *xfer);
 	void (*fetch_notification)(struct scmi_chan_info *cinfo,
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 0406e030db1b..b4bbfe89368d 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -902,7 +902,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
 		ret = scmi_to_linux_errno(xfer->hdr.status);
 
 	if (info->desc->ops->mark_txdone)
-		info->desc->ops->mark_txdone(cinfo, ret);
+		info->desc->ops->mark_txdone(cinfo, ret, xfer);
 
 	trace_scmi_xfer_end(xfer->transfer_id, xfer->hdr.id,
 			    xfer->hdr.protocol_id, xfer->hdr.seq, ret);
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index e09eb12bf421..08ff4d110beb 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -140,7 +140,8 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo,
 	return ret;
 }
 
-static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+				struct scmi_xfer *__unused)
 {
 	struct scmi_mailbox *smbox = cinfo->transport_info;
 
diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index f460e12be4ea..734f1eeee161 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -441,7 +441,8 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(shmem, xfer);
 }
 
-static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+				   struct scmi_xfer *__unused)
 {
 	struct scmi_optee_channel *channel = cinfo->transport_info;
 
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 6c7871a40611..745acfdd0b3d 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -209,7 +209,8 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
 	shmem_fetch_response(scmi_info->shmem, xfer);
 }
 
-static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			    struct scmi_xfer *__unused)
 {
 	struct scmi_smc *scmi_info = cinfo->transport_info;
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael S. Tsirkin, Igor Skalkin, Peter Hilber,
	virtualization

Add support for .mark_txdone and .poll_done transport operations to SCMI
VirtIO transport as pre-requisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions while leaving it default
disabled.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed ifdeffery
- reviewed comments
- simplified spinlocking aroung scmi_feed_vq_tx/rx
- added deferred worker for TX replies to aid while polling mode is active
V6 --> V7
- added a few comments about virtio polling internals
- fixed missing list_del on pending_cmds_list processing
- shrinked spinlocked areas in virtio_poll_done
- added proper spinlocking to scmi_vio_complete_cb while scanning list
  of pending cmds
---
 drivers/firmware/arm_scmi/Kconfig  |  15 ++
 drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
 2 files changed, 280 insertions(+), 26 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
 	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
 	  take care of the needed conversions, say N.
 
+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+	bool "Enable atomic mode for SCMI VirtIO transport"
+	depends on ARM_SCMI_TRANSPORT_VIRTIO
+	help
+	  Enable support of atomic operation for SCMI VirtIO based transport.
+
+	  If you want the SCMI VirtIO based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 endif #ARM_SCMI_PROTOCOL
 
 config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..f589bbcc5db9 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -3,8 +3,8 @@
  * Virtio Transport driver for Arm System Control and Management Interface
  * (SCMI).
  *
- * Copyright (C) 2020-2021 OpenSynergy.
- * Copyright (C) 2021 ARM Ltd.
+ * Copyright (C) 2020-2022 OpenSynergy.
+ * Copyright (C) 2021-2022 ARM Ltd.
  */
 
 /**
@@ -38,6 +38,9 @@
  * @vqueue: Associated virtqueue
  * @cinfo: SCMI Tx or Rx channel
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @deferred_tx_work: Worker for TX deferred replies processing
+ * @deferred_tx_wq: Workqueue for TX deferred replies
+ * @pending_cmds_list: List of pre-fetched commands queueud for later processing
  * @is_rx: Whether channel is an Rx channel
  * @ready: Whether transport user is ready to hear about channel
  * @max_msg: Maximum number of pending messages for this channel.
@@ -49,6 +52,9 @@ struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
 	struct list_head free_list;
+	struct list_head pending_cmds_list;
+	struct work_struct deferred_tx_work;
+	struct workqueue_struct *deferred_tx_wq;
 	bool is_rx;
 	bool ready;
 	unsigned int max_msg;
@@ -65,12 +71,22 @@ struct scmi_vio_channel {
  * @input: SDU used for (delayed) responses and notifications
  * @list: List which scmi_vio_msg may be part of
  * @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ *	      transaction reply was configured for polling.
+ *	      Note that since virtqueue used index is an unsigned 16-bit we can
+ *	      use some out-of-scale values to signify particular conditions.
+ * @poll_lock: Protect access to @poll_idx.
  */
 struct scmi_vio_msg {
 	struct scmi_msg_payld *request;
 	struct scmi_msg_payld *input;
 	struct list_head list;
 	unsigned int rx_len;
+#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
+#define VIO_MSG_POLL_DONE	0xffffffffUL
+	unsigned int poll_idx;
+	/* lock to protect access to poll_idx. */
+	spinlock_t poll_lock;
 };
 
 /* Only one SCMI VirtIO device can possibly exist */
@@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
 static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
 			       struct scmi_vio_msg *msg,
 			       struct device *dev)
 {
 	struct scatterlist sg_in;
 	int rc;
-	unsigned long flags;
 
 	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
 
-	spin_lock_irqsave(&vioch->lock, flags);
-
 	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
 	if (rc)
 		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
 	else
 		virtqueue_kick(vioch->vqueue);
 
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
 	return rc;
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
+static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
+				       struct scmi_vio_msg *msg)
+{
+	spin_lock(&msg->poll_lock);
+	msg->poll_idx = VIO_MSG_NOT_POLLED;
+	spin_unlock(&msg->poll_lock);
+
+	list_add(&msg->list, &vioch->free_list);
+}
+
 static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 				  struct scmi_vio_msg *msg)
 {
-	if (vioch->is_rx) {
+	if (vioch->is_rx)
 		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
-	} else {
-		/* Here IRQs are assumed to be already disabled by the caller */
-		spin_lock(&vioch->lock);
-		list_add(&msg->list, &vioch->free_list);
-		spin_unlock(&vioch->lock);
-	}
+	else
+		scmi_vio_feed_vq_tx(vioch, msg);
 }
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
+
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
 			if (virtqueue_enable_cb(vqueue))
@@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			scmi_rx_callback(vioch->cinfo,
 					 msg_read_header(msg->input), msg);
 
+			spin_lock(&vioch->lock);
 			scmi_finalize_message(vioch, msg);
+			spin_unlock(&vioch->lock);
 		}
 
 		/*
@@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
 }
 
+static void scmi_vio_deferred_tx_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg, *tmp;
+
+	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
+
+	/* Process pre-fetched messages */
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	/* Scan the list of possibly pre-fetched messages during polling. */
+	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
+		list_del(&msg->list);
+
+		scmi_rx_callback(vioch->cinfo,
+				 msg_read_header(msg->input), msg);
+
+		/* Free the processed message once done */
+		scmi_vio_feed_vq_tx(vioch, msg);
+	}
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	/* Process possibly still pending messages */
+	scmi_vio_complete_cb(vioch->vqueue);
+}
+
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
 
 static vq_callback_t *scmi_vio_complete_callbacks[] = {
@@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
 
+	/* Setup a deferred worker for polling. */
+	if (tx && !vioch->deferred_tx_wq) {
+		vioch->deferred_tx_wq =
+			alloc_workqueue(dev_name(&scmi_vdev->dev),
+					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
+					0);
+		if (!vioch->deferred_tx_wq)
+			return -ENOMEM;
+
+		INIT_WORK(&vioch->deferred_tx_work,
+			  scmi_vio_deferred_tx_worker);
+	}
+
 	for (i = 0; i < vioch->max_msg; i++) {
 		struct scmi_vio_msg *msg;
 
@@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 						    GFP_KERNEL);
 			if (!msg->request)
 				return -ENOMEM;
+			spin_lock_init(&msg->poll_lock);
 		}
 
 		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
@@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		if (!msg->input)
 			return -ENOMEM;
 
-		if (tx) {
-			spin_lock_irqsave(&vioch->lock, flags);
-			list_add_tail(&msg->list, &vioch->free_list);
-			spin_unlock_irqrestore(&vioch->lock, flags);
-		} else {
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (tx)
+			scmi_vio_feed_vq_tx(vioch, msg);
+		else
 			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
-		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
 	}
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
 	vioch->ready = false;
 	spin_unlock_irqrestore(&vioch->ready_lock, flags);
 
+	if (!vioch->is_rx && vioch->deferred_tx_wq) {
+		destroy_workqueue(vioch->deferred_tx_wq);
+		vioch->deferred_tx_wq = NULL;
+	}
+
 	scmi_free_channel(cinfo, data, id);
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	}
 
 	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
-	list_del(&msg->list);
+	/* Re-init element so we can discern anytime if it is still in-flight */
+	list_del_init(&msg->list);
 
 	msg_tx_prepare(msg->request, xfer);
 
@@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 		dev_err(vioch->cinfo->dev,
 			"failed to add to TX virtqueue (%d)\n", rc);
 	} else {
+		/*
+		 * If polling was requested for this transaction:
+		 *  - retrieve last used index (will be used as polling reference)
+		 *  - bind the polled message to the xfer via .priv
+		 */
+		if (xfer->hdr.poll_completion) {
+			spin_lock(&msg->poll_lock);
+			msg->poll_idx =
+				virtqueue_enable_cb_prepare(vioch->vqueue);
+			spin_unlock(&msg->poll_lock);
+			/* Ensure initialized msg is visibly bound to xfer */
+			smp_store_mb(xfer->priv, msg);
+		}
 		virtqueue_kick(vioch->vqueue);
 	}
 
@@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_response(msg->input, msg->rx_len, xfer);
-		xfer->priv = NULL;
-	}
 }
 
 static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
-		xfer->priv = NULL;
+}
+
+/**
+ * virtio_mark_txdone  - Mark transmission done
+ *
+ * Free only successfully completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list, even on timeout, inside
+ * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
+ * messages once, finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * explicitly by the SCMI platform could lead to subtle bugs due to message
+ * corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to exhaustion of available messages.
+ *
+ * For this same reason, here, we take care to free only the successfully
+ * completed polled messages, since they won't be freed elsewhere; late replies
+ * to timed-out polled messages would be anyway freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			       struct scmi_xfer *xfer)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (!msg)
+		return;
+
+	/* Ensure msg is unbound from xfer before pushing onto the free list  */
+	smp_store_mb(xfer->priv, NULL);
+
+	/* Is a successfully completed polled message still to be finalized ? */
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
+		scmi_vio_feed_vq_tx(vioch, msg);
+	spin_unlock_irqrestore(&vioch->lock, flags);
+}
+
+/**
+ * virtio_poll_done  - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being poll for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side but the only way to check if the freshly
+ * arrived buffer was what we were waiting for is to compare the newly arrived
+ * message descriptors with the one we are polling on.
+ *
+ * As a consequence it can happen to dequeue something different from the buffer
+ * we were poll-waiting for: if that is the case such early fetched buffers are
+ * then added to a the @pending_cmds_list list for later processing by a
+ * dedicated deferred worker.
+ *
+ * So, basically, once something new is spotted we proceed to de-queue all the
+ * freshly received used buffers until we found the one we were polling on, or,
+ * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
+ * in the vqueue at the end of the polling loop (possible due to inherent races
+ * in virtqueues handling mechanisms), we similarly kick the deferred worker
+ * and let it process those, to avoid indefinitely looping in the .poll_done
+ * helper.
+ *
+ * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
+ * when polling since such flag is per-virtqueues and we do not want to
+ * suppress notifications as a whole: so, if the message we are polling for is
+ * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
+ * will be handled as such by scmi_rx_callback() and the polling loop in the
+ * SCMI Core TX path will be transparently terminated anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+			     struct scmi_xfer *xfer)
+{
+	bool pending, ret = false;
+	unsigned int length, any_prefetched = 0;
+	unsigned long flags;
+	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	if (!msg)
+		return true;
+
+	spin_lock_irqsave(&msg->poll_lock, flags);
+	/* Processed already by other polling loop on another CPU ? */
+	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
+		spin_unlock_irqrestore(&msg->poll_lock, flags);
+		return true;
 	}
+
+	/* Has cmdq index moved at all ? */
+	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+	spin_unlock_irqrestore(&msg->poll_lock, flags);
+	if (!pending)
+		return false;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	virtqueue_disable_cb(vioch->vqueue);
+
+	/*
+	 * If something arrived we cannot be sure, without dequeueing, if it
+	 * was the reply to the xfer we are polling for, or, to other, even
+	 * possibly non-polling, pending xfers: process all new messages
+	 * till the polled-for message is found OR the vqueue is empty.
+	 */
+	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
+		next_msg->rx_len = length;
+		/* Is the message we were polling for ? */
+		if (next_msg == msg) {
+			ret = true;
+			break;
+		}
+
+		spin_lock(&next_msg->poll_lock);
+		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
+			any_prefetched++;
+			list_add_tail(&next_msg->list,
+				      &vioch->pending_cmds_list);
+		} else {
+			next_msg->poll_idx = VIO_MSG_POLL_DONE;
+		}
+		spin_unlock(&next_msg->poll_lock);
+	}
+
+	/*
+	 * When the polling loop has successfully terminated if something
+	 * else was queued in the meantime, it will be served by a deferred
+	 * worker OR by the normal IRQ/callback OR by other poll loops.
+	 *
+	 * If we are still looking for the polled reply, the polling index has
+	 * to be updated to the current vqueue last used index.
+	 */
+	if (ret) {
+		pending = !virtqueue_enable_cb(vioch->vqueue);
+	} else {
+		spin_lock(&msg->poll_lock);
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+		spin_unlock(&msg->poll_lock);
+	}
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	if (any_prefetched || pending)
+		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
+
+	return ret;
 }
 
 static const struct scmi_transport_ops scmi_virtio_ops = {
@@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
 	.send_message = virtio_send_message,
 	.fetch_response = virtio_fetch_response,
 	.fetch_notification = virtio_fetch_notification,
+	.mark_txdone = virtio_mark_txdone,
+	.poll_done = virtio_poll_done,
 };
 
 static int scmi_vio_probe(struct virtio_device *vdev)
@@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		spin_lock_init(&channels[i].lock);
 		spin_lock_init(&channels[i].ready_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
+		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
 		channels[i].vqueue = vqs[i];
 
 		sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
 	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael S. Tsirkin, Igor Skalkin, Peter Hilber,
	virtualization

Add support for .mark_txdone and .poll_done transport operations to SCMI
VirtIO transport as pre-requisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions while leaving it default
disabled.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v7 --> v8
- removed ifdeffery
- reviewed comments
- simplified spinlocking aroung scmi_feed_vq_tx/rx
- added deferred worker for TX replies to aid while polling mode is active
V6 --> V7
- added a few comments about virtio polling internals
- fixed missing list_del on pending_cmds_list processing
- shrinked spinlocked areas in virtio_poll_done
- added proper spinlocking to scmi_vio_complete_cb while scanning list
  of pending cmds
---
 drivers/firmware/arm_scmi/Kconfig  |  15 ++
 drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
 2 files changed, 280 insertions(+), 26 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
 	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
 	  take care of the needed conversions, say N.
 
+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+	bool "Enable atomic mode for SCMI VirtIO transport"
+	depends on ARM_SCMI_TRANSPORT_VIRTIO
+	help
+	  Enable support of atomic operation for SCMI VirtIO based transport.
+
+	  If you want the SCMI VirtIO based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 endif #ARM_SCMI_PROTOCOL
 
 config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..f589bbcc5db9 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -3,8 +3,8 @@
  * Virtio Transport driver for Arm System Control and Management Interface
  * (SCMI).
  *
- * Copyright (C) 2020-2021 OpenSynergy.
- * Copyright (C) 2021 ARM Ltd.
+ * Copyright (C) 2020-2022 OpenSynergy.
+ * Copyright (C) 2021-2022 ARM Ltd.
  */
 
 /**
@@ -38,6 +38,9 @@
  * @vqueue: Associated virtqueue
  * @cinfo: SCMI Tx or Rx channel
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @deferred_tx_work: Worker for TX deferred replies processing
+ * @deferred_tx_wq: Workqueue for TX deferred replies
+ * @pending_cmds_list: List of pre-fetched commands queueud for later processing
  * @is_rx: Whether channel is an Rx channel
  * @ready: Whether transport user is ready to hear about channel
  * @max_msg: Maximum number of pending messages for this channel.
@@ -49,6 +52,9 @@ struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
 	struct list_head free_list;
+	struct list_head pending_cmds_list;
+	struct work_struct deferred_tx_work;
+	struct workqueue_struct *deferred_tx_wq;
 	bool is_rx;
 	bool ready;
 	unsigned int max_msg;
@@ -65,12 +71,22 @@ struct scmi_vio_channel {
  * @input: SDU used for (delayed) responses and notifications
  * @list: List which scmi_vio_msg may be part of
  * @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ *	      transaction reply was configured for polling.
+ *	      Note that since virtqueue used index is an unsigned 16-bit we can
+ *	      use some out-of-scale values to signify particular conditions.
+ * @poll_lock: Protect access to @poll_idx.
  */
 struct scmi_vio_msg {
 	struct scmi_msg_payld *request;
 	struct scmi_msg_payld *input;
 	struct list_head list;
 	unsigned int rx_len;
+#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
+#define VIO_MSG_POLL_DONE	0xffffffffUL
+	unsigned int poll_idx;
+	/* lock to protect access to poll_idx. */
+	spinlock_t poll_lock;
 };
 
 /* Only one SCMI VirtIO device can possibly exist */
@@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
 static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
 			       struct scmi_vio_msg *msg,
 			       struct device *dev)
 {
 	struct scatterlist sg_in;
 	int rc;
-	unsigned long flags;
 
 	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
 
-	spin_lock_irqsave(&vioch->lock, flags);
-
 	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
 	if (rc)
 		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
 	else
 		virtqueue_kick(vioch->vqueue);
 
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
 	return rc;
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
+static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
+				       struct scmi_vio_msg *msg)
+{
+	spin_lock(&msg->poll_lock);
+	msg->poll_idx = VIO_MSG_NOT_POLLED;
+	spin_unlock(&msg->poll_lock);
+
+	list_add(&msg->list, &vioch->free_list);
+}
+
 static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 				  struct scmi_vio_msg *msg)
 {
-	if (vioch->is_rx) {
+	if (vioch->is_rx)
 		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
-	} else {
-		/* Here IRQs are assumed to be already disabled by the caller */
-		spin_lock(&vioch->lock);
-		list_add(&msg->list, &vioch->free_list);
-		spin_unlock(&vioch->lock);
-	}
+	else
+		scmi_vio_feed_vq_tx(vioch, msg);
 }
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
+
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
 			if (virtqueue_enable_cb(vqueue))
@@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			scmi_rx_callback(vioch->cinfo,
 					 msg_read_header(msg->input), msg);
 
+			spin_lock(&vioch->lock);
 			scmi_finalize_message(vioch, msg);
+			spin_unlock(&vioch->lock);
 		}
 
 		/*
@@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
 }
 
+static void scmi_vio_deferred_tx_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg, *tmp;
+
+	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
+
+	/* Process pre-fetched messages */
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	/* Scan the list of possibly pre-fetched messages during polling. */
+	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
+		list_del(&msg->list);
+
+		scmi_rx_callback(vioch->cinfo,
+				 msg_read_header(msg->input), msg);
+
+		/* Free the processed message once done */
+		scmi_vio_feed_vq_tx(vioch, msg);
+	}
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	/* Process possibly still pending messages */
+	scmi_vio_complete_cb(vioch->vqueue);
+}
+
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
 
 static vq_callback_t *scmi_vio_complete_callbacks[] = {
@@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
 
+	/* Setup a deferred worker for polling. */
+	if (tx && !vioch->deferred_tx_wq) {
+		vioch->deferred_tx_wq =
+			alloc_workqueue(dev_name(&scmi_vdev->dev),
+					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
+					0);
+		if (!vioch->deferred_tx_wq)
+			return -ENOMEM;
+
+		INIT_WORK(&vioch->deferred_tx_work,
+			  scmi_vio_deferred_tx_worker);
+	}
+
 	for (i = 0; i < vioch->max_msg; i++) {
 		struct scmi_vio_msg *msg;
 
@@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 						    GFP_KERNEL);
 			if (!msg->request)
 				return -ENOMEM;
+			spin_lock_init(&msg->poll_lock);
 		}
 
 		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
@@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		if (!msg->input)
 			return -ENOMEM;
 
-		if (tx) {
-			spin_lock_irqsave(&vioch->lock, flags);
-			list_add_tail(&msg->list, &vioch->free_list);
-			spin_unlock_irqrestore(&vioch->lock, flags);
-		} else {
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (tx)
+			scmi_vio_feed_vq_tx(vioch, msg);
+		else
 			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
-		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
 	}
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
 	vioch->ready = false;
 	spin_unlock_irqrestore(&vioch->ready_lock, flags);
 
+	if (!vioch->is_rx && vioch->deferred_tx_wq) {
+		destroy_workqueue(vioch->deferred_tx_wq);
+		vioch->deferred_tx_wq = NULL;
+	}
+
 	scmi_free_channel(cinfo, data, id);
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	}
 
 	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
-	list_del(&msg->list);
+	/* Re-init element so we can discern anytime if it is still in-flight */
+	list_del_init(&msg->list);
 
 	msg_tx_prepare(msg->request, xfer);
 
@@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 		dev_err(vioch->cinfo->dev,
 			"failed to add to TX virtqueue (%d)\n", rc);
 	} else {
+		/*
+		 * If polling was requested for this transaction:
+		 *  - retrieve last used index (will be used as polling reference)
+		 *  - bind the polled message to the xfer via .priv
+		 */
+		if (xfer->hdr.poll_completion) {
+			spin_lock(&msg->poll_lock);
+			msg->poll_idx =
+				virtqueue_enable_cb_prepare(vioch->vqueue);
+			spin_unlock(&msg->poll_lock);
+			/* Ensure initialized msg is visibly bound to xfer */
+			smp_store_mb(xfer->priv, msg);
+		}
 		virtqueue_kick(vioch->vqueue);
 	}
 
@@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_response(msg->input, msg->rx_len, xfer);
-		xfer->priv = NULL;
-	}
 }
 
 static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
-		xfer->priv = NULL;
+}
+
+/**
+ * virtio_mark_txdone  - Mark transmission done
+ *
+ * Free only successfully completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list, even on timeout, inside
+ * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
+ * messages once, finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * explicitly by the SCMI platform could lead to subtle bugs due to message
+ * corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to exhaustion of available messages.
+ *
+ * For this same reason, here, we take care to free only the successfully
+ * completed polled messages, since they won't be freed elsewhere; late replies
+ * to timed-out polled messages would be anyway freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			       struct scmi_xfer *xfer)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (!msg)
+		return;
+
+	/* Ensure msg is unbound from xfer before pushing onto the free list  */
+	smp_store_mb(xfer->priv, NULL);
+
+	/* Is a successfully completed polled message still to be finalized ? */
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
+		scmi_vio_feed_vq_tx(vioch, msg);
+	spin_unlock_irqrestore(&vioch->lock, flags);
+}
+
+/**
+ * virtio_poll_done  - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being poll for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side but the only way to check if the freshly
+ * arrived buffer was what we were waiting for is to compare the newly arrived
+ * message descriptors with the one we are polling on.
+ *
+ * As a consequence it can happen to dequeue something different from the buffer
+ * we were poll-waiting for: if that is the case such early fetched buffers are
+ * then added to a the @pending_cmds_list list for later processing by a
+ * dedicated deferred worker.
+ *
+ * So, basically, once something new is spotted we proceed to de-queue all the
+ * freshly received used buffers until we found the one we were polling on, or,
+ * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
+ * in the vqueue at the end of the polling loop (possible due to inherent races
+ * in virtqueues handling mechanisms), we similarly kick the deferred worker
+ * and let it process those, to avoid indefinitely looping in the .poll_done
+ * helper.
+ *
+ * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
+ * when polling since such flag is per-virtqueues and we do not want to
+ * suppress notifications as a whole: so, if the message we are polling for is
+ * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
+ * will be handled as such by scmi_rx_callback() and the polling loop in the
+ * SCMI Core TX path will be transparently terminated anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+			     struct scmi_xfer *xfer)
+{
+	bool pending, ret = false;
+	unsigned int length, any_prefetched = 0;
+	unsigned long flags;
+	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	if (!msg)
+		return true;
+
+	spin_lock_irqsave(&msg->poll_lock, flags);
+	/* Processed already by other polling loop on another CPU ? */
+	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
+		spin_unlock_irqrestore(&msg->poll_lock, flags);
+		return true;
 	}
+
+	/* Has cmdq index moved at all ? */
+	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+	spin_unlock_irqrestore(&msg->poll_lock, flags);
+	if (!pending)
+		return false;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	virtqueue_disable_cb(vioch->vqueue);
+
+	/*
+	 * If something arrived we cannot be sure, without dequeueing, if it
+	 * was the reply to the xfer we are polling for, or, to other, even
+	 * possibly non-polling, pending xfers: process all new messages
+	 * till the polled-for message is found OR the vqueue is empty.
+	 */
+	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
+		next_msg->rx_len = length;
+		/* Is the message we were polling for ? */
+		if (next_msg == msg) {
+			ret = true;
+			break;
+		}
+
+		spin_lock(&next_msg->poll_lock);
+		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
+			any_prefetched++;
+			list_add_tail(&next_msg->list,
+				      &vioch->pending_cmds_list);
+		} else {
+			next_msg->poll_idx = VIO_MSG_POLL_DONE;
+		}
+		spin_unlock(&next_msg->poll_lock);
+	}
+
+	/*
+	 * When the polling loop has successfully terminated if something
+	 * else was queued in the meantime, it will be served by a deferred
+	 * worker OR by the normal IRQ/callback OR by other poll loops.
+	 *
+	 * If we are still looking for the polled reply, the polling index has
+	 * to be updated to the current vqueue last used index.
+	 */
+	if (ret) {
+		pending = !virtqueue_enable_cb(vioch->vqueue);
+	} else {
+		spin_lock(&msg->poll_lock);
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+		spin_unlock(&msg->poll_lock);
+	}
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	if (any_prefetched || pending)
+		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
+
+	return ret;
 }
 
 static const struct scmi_transport_ops scmi_virtio_ops = {
@@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
 	.send_message = virtio_send_message,
 	.fetch_response = virtio_fetch_response,
 	.fetch_notification = virtio_fetch_notification,
+	.mark_txdone = virtio_mark_txdone,
+	.poll_done = virtio_poll_done,
 };
 
 static int scmi_vio_probe(struct virtio_device *vdev)
@@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		spin_lock_init(&channels[i].lock);
 		spin_lock_init(&channels[i].ready_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
+		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
 		channels[i].vqueue = vqs[i];
 
 		sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
 	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 10/11] firmware: arm_scmi: Add atomic support to clock protocol
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Introduce new _atomic variant for SCMI clock protocol operations related
to enable disable operations: when an atomic operation is required the xfer
poll_completion flag is set for that transaction.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/clock.c | 22 +++++++++++++++++++---
 include/linux/scmi_protocol.h     |  3 +++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
index 35b56c8ba0c0..72f930c0e3e2 100644
--- a/drivers/firmware/arm_scmi/clock.c
+++ b/drivers/firmware/arm_scmi/clock.c
@@ -273,7 +273,7 @@ static int scmi_clock_rate_set(const struct scmi_protocol_handle *ph,
 
 static int
 scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
-		      u32 config)
+		      u32 config, bool atomic)
 {
 	int ret;
 	struct scmi_xfer *t;
@@ -284,6 +284,8 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 	if (ret)
 		return ret;
 
+	t->hdr.poll_completion = atomic;
+
 	cfg = t->tx.buf;
 	cfg->id = cpu_to_le32(clk_id);
 	cfg->attributes = cpu_to_le32(config);
@@ -296,12 +298,24 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 
 static int scmi_clock_enable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE);
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, false);
 }
 
 static int scmi_clock_disable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, 0);
+	return scmi_clock_config_set(ph, clk_id, 0, false);
+}
+
+static int scmi_clock_enable_atomic(const struct scmi_protocol_handle *ph,
+				    u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, true);
+}
+
+static int scmi_clock_disable_atomic(const struct scmi_protocol_handle *ph,
+				     u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, 0, true);
 }
 
 static int scmi_clock_count_get(const struct scmi_protocol_handle *ph)
@@ -330,6 +344,8 @@ static const struct scmi_clk_proto_ops clk_proto_ops = {
 	.rate_set = scmi_clock_rate_set,
 	.enable = scmi_clock_enable,
 	.disable = scmi_clock_disable,
+	.enable_atomic = scmi_clock_enable_atomic,
+	.disable_atomic = scmi_clock_disable_atomic,
 };
 
 static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 9f895cb81818..d4971c991963 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -82,6 +82,9 @@ struct scmi_clk_proto_ops {
 			u64 rate);
 	int (*enable)(const struct scmi_protocol_handle *ph, u32 clk_id);
 	int (*disable)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*enable_atomic)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*disable_atomic)(const struct scmi_protocol_handle *ph,
+			      u32 clk_id);
 };
 
 /**
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 10/11] firmware: arm_scmi: Add atomic support to clock protocol
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi

Introduce new _atomic variant for SCMI clock protocol operations related
to enable disable operations: when an atomic operation is required the xfer
poll_completion flag is set for that transaction.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
 drivers/firmware/arm_scmi/clock.c | 22 +++++++++++++++++++---
 include/linux/scmi_protocol.h     |  3 +++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
index 35b56c8ba0c0..72f930c0e3e2 100644
--- a/drivers/firmware/arm_scmi/clock.c
+++ b/drivers/firmware/arm_scmi/clock.c
@@ -273,7 +273,7 @@ static int scmi_clock_rate_set(const struct scmi_protocol_handle *ph,
 
 static int
 scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
-		      u32 config)
+		      u32 config, bool atomic)
 {
 	int ret;
 	struct scmi_xfer *t;
@@ -284,6 +284,8 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 	if (ret)
 		return ret;
 
+	t->hdr.poll_completion = atomic;
+
 	cfg = t->tx.buf;
 	cfg->id = cpu_to_le32(clk_id);
 	cfg->attributes = cpu_to_le32(config);
@@ -296,12 +298,24 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
 
 static int scmi_clock_enable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE);
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, false);
 }
 
 static int scmi_clock_disable(const struct scmi_protocol_handle *ph, u32 clk_id)
 {
-	return scmi_clock_config_set(ph, clk_id, 0);
+	return scmi_clock_config_set(ph, clk_id, 0, false);
+}
+
+static int scmi_clock_enable_atomic(const struct scmi_protocol_handle *ph,
+				    u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, true);
+}
+
+static int scmi_clock_disable_atomic(const struct scmi_protocol_handle *ph,
+				     u32 clk_id)
+{
+	return scmi_clock_config_set(ph, clk_id, 0, true);
 }
 
 static int scmi_clock_count_get(const struct scmi_protocol_handle *ph)
@@ -330,6 +344,8 @@ static const struct scmi_clk_proto_ops clk_proto_ops = {
 	.rate_set = scmi_clock_rate_set,
 	.enable = scmi_clock_enable,
 	.disable = scmi_clock_disable,
+	.enable_atomic = scmi_clock_enable_atomic,
+	.disable_atomic = scmi_clock_disable_atomic,
 };
 
 static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 9f895cb81818..d4971c991963 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -82,6 +82,9 @@ struct scmi_clk_proto_ops {
 			u64 rate);
 	int (*enable)(const struct scmi_protocol_handle *ph, u32 clk_id);
 	int (*disable)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*enable_atomic)(const struct scmi_protocol_handle *ph, u32 clk_id);
+	int (*disable_atomic)(const struct scmi_protocol_handle *ph,
+			      u32 clk_id);
 };
 
 /**
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-20 19:56   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael Turquette, Stephen Boyd, linux-clk

Support also atomic enable/disable clk_ops beside the bare non-atomic one
(prepare/unprepare) when the underlying SCMI transport is configured to
support atomic transactions for synchronous commands.

Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-clk@vger.kernel.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
BASED ON CLOCK DEVICES ENABLE LATENCY.
THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.

v7 --> v8
- provide prepare/unprepare only on non-atomic transports
V5 --> V6
- add concurrent availability of atomic and non atomic reqs
---
 drivers/clk/clk-scmi.c | 54 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
index 1e357d364ca2..3da99f81ead9 100644
--- a/drivers/clk/clk-scmi.c
+++ b/drivers/clk/clk-scmi.c
@@ -88,21 +88,51 @@ static void scmi_clk_disable(struct clk_hw *hw)
 	scmi_proto_clk_ops->disable(clk->ph, clk->id);
 }
 
+static int scmi_clk_atomic_enable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
+}
+
+static void scmi_clk_atomic_disable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
+}
+
+/*
+ * We can provide enable/disable atomic callbacks only if the underlying SCMI
+ * transport for an SCMI instance is configured to handle SCMI commands in an
+ * atomic manner.
+ *
+ * When no SCMI atomic transport support is available we instead provide only
+ * the prepare/unprepare API, as allowed by the clock framework when atomic
+ * calls are not available.
+ *
+ * Two distinct sets of clk_ops are provided since we could have multiple SCMI
+ * instances with different underlying transport quality, so they cannot be
+ * shared.
+ */
 static const struct clk_ops scmi_clk_ops = {
 	.recalc_rate = scmi_clk_recalc_rate,
 	.round_rate = scmi_clk_round_rate,
 	.set_rate = scmi_clk_set_rate,
-	/*
-	 * We can't provide enable/disable callback as we can't perform the same
-	 * in atomic context. Since the clock framework provides standard API
-	 * clk_prepare_enable that helps cases using clk_enable in non-atomic
-	 * context, it should be fine providing prepare/unprepare.
-	 */
 	.prepare = scmi_clk_enable,
 	.unprepare = scmi_clk_disable,
 };
 
-static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
+static const struct clk_ops scmi_atomic_clk_ops = {
+	.recalc_rate = scmi_clk_recalc_rate,
+	.round_rate = scmi_clk_round_rate,
+	.set_rate = scmi_clk_set_rate,
+	.enable = scmi_clk_atomic_enable,
+	.disable = scmi_clk_atomic_disable,
+};
+
+static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk,
+			     const struct clk_ops *scmi_ops)
 {
 	int ret;
 	unsigned long min_rate, max_rate;
@@ -110,7 +140,7 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
 	struct clk_init_data init = {
 		.flags = CLK_GET_RATE_NOCACHE,
 		.num_parents = 0,
-		.ops = &scmi_clk_ops,
+		.ops = scmi_ops,
 		.name = sclk->info->name,
 	};
 
@@ -145,6 +175,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 	struct device_node *np = dev->of_node;
 	const struct scmi_handle *handle = sdev->handle;
 	struct scmi_protocol_handle *ph;
+	const struct clk_ops *scmi_ops;
 
 	if (!handle)
 		return -ENODEV;
@@ -168,6 +199,11 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 	clk_data->num = count;
 	hws = clk_data->hws;
 
+	if (handle->is_transport_atomic(handle))
+		scmi_ops = &scmi_atomic_clk_ops;
+	else
+		scmi_ops = &scmi_clk_ops;
+
 	for (idx = 0; idx < count; idx++) {
 		struct scmi_clk *sclk;
 
@@ -184,7 +220,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 		sclk->id = idx;
 		sclk->ph = ph;
 
-		err = scmi_clk_ops_init(dev, sclk);
+		err = scmi_clk_ops_init(dev, sclk, scmi_ops);
 		if (err) {
 			dev_err(dev, "failed to register clock %d\n", idx);
 			devm_kfree(dev, sclk);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
@ 2021-12-20 19:56   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-20 19:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael Turquette, Stephen Boyd, linux-clk

Support also atomic enable/disable clk_ops beside the bare non-atomic one
(prepare/unprepare) when the underlying SCMI transport is configured to
support atomic transactions for synchronous commands.

Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-clk@vger.kernel.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
BASED ON CLOCK DEVICES ENABLE LATENCY.
THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.

v7 --> v8
- provide prepare/unprepare only on non-atomic transports
V5 --> V6
- add concurrent availability of atomic and non atomic reqs
---
 drivers/clk/clk-scmi.c | 54 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
index 1e357d364ca2..3da99f81ead9 100644
--- a/drivers/clk/clk-scmi.c
+++ b/drivers/clk/clk-scmi.c
@@ -88,21 +88,51 @@ static void scmi_clk_disable(struct clk_hw *hw)
 	scmi_proto_clk_ops->disable(clk->ph, clk->id);
 }
 
+static int scmi_clk_atomic_enable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
+}
+
+static void scmi_clk_atomic_disable(struct clk_hw *hw)
+{
+	struct scmi_clk *clk = to_scmi_clk(hw);
+
+	scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
+}
+
+/*
+ * We can provide enable/disable atomic callbacks only if the underlying SCMI
+ * transport for an SCMI instance is configured to handle SCMI commands in an
+ * atomic manner.
+ *
+ * When no SCMI atomic transport support is available we instead provide only
+ * the prepare/unprepare API, as allowed by the clock framework when atomic
+ * calls are not available.
+ *
+ * Two distinct sets of clk_ops are provided since we could have multiple SCMI
+ * instances with different underlying transport quality, so they cannot be
+ * shared.
+ */
 static const struct clk_ops scmi_clk_ops = {
 	.recalc_rate = scmi_clk_recalc_rate,
 	.round_rate = scmi_clk_round_rate,
 	.set_rate = scmi_clk_set_rate,
-	/*
-	 * We can't provide enable/disable callback as we can't perform the same
-	 * in atomic context. Since the clock framework provides standard API
-	 * clk_prepare_enable that helps cases using clk_enable in non-atomic
-	 * context, it should be fine providing prepare/unprepare.
-	 */
 	.prepare = scmi_clk_enable,
 	.unprepare = scmi_clk_disable,
 };
 
-static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
+static const struct clk_ops scmi_atomic_clk_ops = {
+	.recalc_rate = scmi_clk_recalc_rate,
+	.round_rate = scmi_clk_round_rate,
+	.set_rate = scmi_clk_set_rate,
+	.enable = scmi_clk_atomic_enable,
+	.disable = scmi_clk_atomic_disable,
+};
+
+static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk,
+			     const struct clk_ops *scmi_ops)
 {
 	int ret;
 	unsigned long min_rate, max_rate;
@@ -110,7 +140,7 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
 	struct clk_init_data init = {
 		.flags = CLK_GET_RATE_NOCACHE,
 		.num_parents = 0,
-		.ops = &scmi_clk_ops,
+		.ops = scmi_ops,
 		.name = sclk->info->name,
 	};
 
@@ -145,6 +175,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 	struct device_node *np = dev->of_node;
 	const struct scmi_handle *handle = sdev->handle;
 	struct scmi_protocol_handle *ph;
+	const struct clk_ops *scmi_ops;
 
 	if (!handle)
 		return -ENODEV;
@@ -168,6 +199,11 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 	clk_data->num = count;
 	hws = clk_data->hws;
 
+	if (handle->is_transport_atomic(handle))
+		scmi_ops = &scmi_atomic_clk_ops;
+	else
+		scmi_ops = &scmi_clk_ops;
+
 	for (idx = 0; idx < count; idx++) {
 		struct scmi_clk *sclk;
 
@@ -184,7 +220,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
 		sclk->id = idx;
 		sclk->ph = ph;
 
-		err = scmi_clk_ops_init(dev, sclk);
+		err = scmi_clk_ops_init(dev, sclk, scmi_ops);
 		if (err) {
 			dev_err(dev, "failed to register clock %d\n", idx);
 			devm_kfree(dev, sclk);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2021-12-20 19:56   ` Cristian Marussi
  (?)
@ 2021-12-20 23:17     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2021-12-20 23:17 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Igor Skalkin, Peter Hilber, virtualization

On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v7 --> v8
> - removed ifdeffery
> - reviewed comments
> - simplified spinlocking aroung scmi_feed_vq_tx/rx
> - added deferred worker for TX replies to aid while polling mode is active
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrinked spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
>   of pending cmds
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
>  2 files changed, 280 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operation for SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..f589bbcc5db9 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> + *	      use some out-of-scale values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		destroy_workqueue(vioch->deferred_tx_wq);
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern anytime if it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)
> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> + * messages once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing by a
> + * dedicated deferred worker.
> + *
> + * So, basically, once something new is spotted we proceed to de-queue all the
> + * freshly received used buffers until we found the one we were polling on, or,
> + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> + * in the vqueue at the end of the polling loop (possible due to inherent races
> + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> + * and let it process those, to avoid indefinitely looping in the .poll_done
> + * helper.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> + * will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.
> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Processed already by other polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
>  	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, if it
> +	 * was the reply to the xfer we are polling for, or, to other, even
> +	 * possibly non-polling, pending xfers: process all new messages
> +	 * till the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
> +	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated if something
> +	 * else was queued in the meantime, it will be served by a deferred
> +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> +	 *
> +	 * If we are still looking for the polled reply, the polling index has
> +	 * to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (any_prefetched || pending)
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

I don't see any attempt to make sure the queued work is no longer
running on e.g. device or driver removal.


> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
> -- 
> 2.17.1


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2021-12-20 23:17     ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2021-12-20 23:17 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: f.fainelli, vincent.guittot, Igor Skalkin, sudeep.holla,
	linux-kernel, virtualization, Peter Hilber, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v7 --> v8
> - removed ifdeffery
> - reviewed comments
> - simplified spinlocking aroung scmi_feed_vq_tx/rx
> - added deferred worker for TX replies to aid while polling mode is active
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrinked spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
>   of pending cmds
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
>  2 files changed, 280 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operation for SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..f589bbcc5db9 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> + *	      use some out-of-scale values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		destroy_workqueue(vioch->deferred_tx_wq);
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern anytime if it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)
> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> + * messages once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing by a
> + * dedicated deferred worker.
> + *
> + * So, basically, once something new is spotted we proceed to de-queue all the
> + * freshly received used buffers until we found the one we were polling on, or,
> + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> + * in the vqueue at the end of the polling loop (possible due to inherent races
> + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> + * and let it process those, to avoid indefinitely looping in the .poll_done
> + * helper.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> + * will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.
> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Processed already by other polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
>  	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, if it
> +	 * was the reply to the xfer we are polling for, or, to other, even
> +	 * possibly non-polling, pending xfers: process all new messages
> +	 * till the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
> +	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated if something
> +	 * else was queued in the meantime, it will be served by a deferred
> +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> +	 *
> +	 * If we are still looking for the polled reply, the polling index has
> +	 * to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (any_prefetched || pending)
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

I don't see any attempt to make sure the queued work is no longer
running on e.g. device or driver removal.


> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
> -- 
> 2.17.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2021-12-20 23:17     ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2021-12-20 23:17 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Igor Skalkin, Peter Hilber, virtualization

On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v7 --> v8
> - removed ifdeffery
> - reviewed comments
> - simplified spinlocking aroung scmi_feed_vq_tx/rx
> - added deferred worker for TX replies to aid while polling mode is active
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrinked spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
>   of pending cmds
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
>  2 files changed, 280 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operation for SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..f589bbcc5db9 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> + *	      use some out-of-scale values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		destroy_workqueue(vioch->deferred_tx_wq);
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern anytime if it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)
> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> + * messages once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing by a
> + * dedicated deferred worker.
> + *
> + * So, basically, once something new is spotted we proceed to de-queue all the
> + * freshly received used buffers until we found the one we were polling on, or,
> + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> + * in the vqueue at the end of the polling loop (possible due to inherent races
> + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> + * and let it process those, to avoid indefinitely looping in the .poll_done
> + * helper.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> + * will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.
> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Processed already by other polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
>  	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, if it
> +	 * was the reply to the xfer we are polling for, or, to other, even
> +	 * possibly non-polling, pending xfers: process all new messages
> +	 * till the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
> +	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated if something
> +	 * else was queued in the meantime, it will be served by a deferred
> +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> +	 *
> +	 * If we are still looking for the polled reply, the polling index has
> +	 * to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (any_prefetched || pending)
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

I don't see any attempt to make sure the queued work is no longer
running on e.g. device or driver removal.


> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
> -- 
> 2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2021-12-20 23:17     ` Michael S. Tsirkin
@ 2021-12-21 12:09       ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-21 12:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Igor Skalkin, Peter Hilber, virtualization

On Mon, Dec 20, 2021 at 06:17:03PM -0500, Michael S. Tsirkin wrote:
> On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> > 
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> > 

Hi Michael,

thanks for your review.

> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v7 --> v8
> > - removed ifdeffery
> > - reviewed comments
> > - simplified spinlocking aroung scmi_feed_vq_tx/rx
> > - added deferred worker for TX replies to aid while polling mode is active
> > V6 --> V7
> > - added a few comments about virtio polling internals
> > - fixed missing list_del on pending_cmds_list processing
> > - shrinked spinlocked areas in virtio_poll_done
> > - added proper spinlocking to scmi_vio_complete_cb while scanning list
> >   of pending cmds
> > ---
> >  drivers/firmware/arm_scmi/Kconfig  |  15 ++
> >  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
> >  2 files changed, 280 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> >  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> >  	  take care of the needed conversions, say N.
> >  
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +	bool "Enable atomic mode for SCMI VirtIO transport"
> > +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> > +	help
> > +	  Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > +	  If you want the SCMI VirtIO based transport to operate in atomic
> > +	  mode, avoiding any kind of sleeping behaviour for selected
> > +	  transactions on the TX path, answer Y.
> > +
> > +	  Enabling atomic mode operations allows any SCMI driver using this
> > +	  transport to optionally ask for atomic SCMI transactions and operate
> > +	  in atomic context too, at the price of using a number of busy-waiting
> > +	  primitives all over instead. If unsure say N.
> > +
> >  endif #ARM_SCMI_PROTOCOL
> >  
> >  config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..f589bbcc5db9 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -3,8 +3,8 @@
> >   * Virtio Transport driver for Arm System Control and Management Interface
> >   * (SCMI).
> >   *
> > - * Copyright (C) 2020-2021 OpenSynergy.
> > - * Copyright (C) 2021 ARM Ltd.
> > + * Copyright (C) 2020-2022 OpenSynergy.
> > + * Copyright (C) 2021-2022 ARM Ltd.
> >   */
> >  
> >  /**
> > @@ -38,6 +38,9 @@
> >   * @vqueue: Associated virtqueue
> >   * @cinfo: SCMI Tx or Rx channel
> >   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @deferred_tx_work: Worker for TX deferred replies processing
> > + * @deferred_tx_wq: Workqueue for TX deferred replies
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> >   * @is_rx: Whether channel is an Rx channel
> >   * @ready: Whether transport user is ready to hear about channel
> >   * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +52,9 @@ struct scmi_vio_channel {
> >  	struct virtqueue *vqueue;
> >  	struct scmi_chan_info *cinfo;
> >  	struct list_head free_list;
> > +	struct list_head pending_cmds_list;
> > +	struct work_struct deferred_tx_work;
> > +	struct workqueue_struct *deferred_tx_wq;
> >  	bool is_rx;
> >  	bool ready;
> >  	unsigned int max_msg;
> > @@ -65,12 +71,22 @@ struct scmi_vio_channel {
> >   * @input: SDU used for (delayed) responses and notifications
> >   * @list: List which scmi_vio_msg may be part of
> >   * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + *	      transaction reply was configured for polling.
> > + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> > + *	      use some out-of-scale values to signify particular conditions.
> > + * @poll_lock: Protect access to @poll_idx.
> >   */
> >  struct scmi_vio_msg {
> >  	struct scmi_msg_payld *request;
> >  	struct scmi_msg_payld *input;
> >  	struct list_head list;
> >  	unsigned int rx_len;
> > +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> > +#define VIO_MSG_POLL_DONE	0xffffffffUL
> > +	unsigned int poll_idx;
> > +	/* lock to protect access to poll_idx. */
> > +	spinlock_t poll_lock;
> >  };
> >  
> >  /* Only one SCMI VirtIO device can possibly exist */
> > @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
> >  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> >  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> >  			       struct scmi_vio_msg *msg,
> >  			       struct device *dev)
> >  {
> >  	struct scatterlist sg_in;
> >  	int rc;
> > -	unsigned long flags;
> >  
> >  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
> >  
> > -	spin_lock_irqsave(&vioch->lock, flags);
> > -
> >  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
> >  	if (rc)
> >  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
> >  	else
> >  		virtqueue_kick(vioch->vqueue);
> >  
> > -	spin_unlock_irqrestore(&vioch->lock, flags);
> > -
> >  	return rc;
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> > +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> > +				       struct scmi_vio_msg *msg)
> > +{
> > +	spin_lock(&msg->poll_lock);
> > +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> > +	spin_unlock(&msg->poll_lock);
> > +
> > +	list_add(&msg->list, &vioch->free_list);
> > +}
> > +
> >  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> >  				  struct scmi_vio_msg *msg)
> >  {
> > -	if (vioch->is_rx) {
> > +	if (vioch->is_rx)
> >  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> > -	} else {
> > -		/* Here IRQs are assumed to be already disabled by the caller */
> > -		spin_lock(&vioch->lock);
> > -		list_add(&msg->list, &vioch->free_list);
> > -		spin_unlock(&vioch->lock);
> > -	}
> > +	else
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> >  }
> >  
> >  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> > @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			virtqueue_disable_cb(vqueue);
> >  			cb_enabled = false;
> >  		}
> > +
> >  		msg = virtqueue_get_buf(vqueue, &length);
> >  		if (!msg) {
> >  			if (virtqueue_enable_cb(vqueue))
> > @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			scmi_rx_callback(vioch->cinfo,
> >  					 msg_read_header(msg->input), msg);
> >  
> > +			spin_lock(&vioch->lock);
> >  			scmi_finalize_message(vioch, msg);
> > +			spin_unlock(&vioch->lock);
> >  		}
> >  
> >  		/*
> > @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
> >  }
> >  
> > +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch;
> > +	struct scmi_vio_msg *msg, *tmp;
> > +
> > +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> > +
> > +	/* Process pre-fetched messages */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +
> > +	/* Scan the list of possibly pre-fetched messages during polling. */
> > +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> > +		list_del(&msg->list);
> > +
> > +		scmi_rx_callback(vioch->cinfo,
> > +				 msg_read_header(msg->input), msg);
> > +
> > +		/* Free the processed message once done */
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	}
> > +
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	/* Process possibly still pending messages */
> > +	scmi_vio_complete_cb(vioch->vqueue);
> > +}
> > +
> >  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
> >  
> >  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> > @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  
> >  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
> >  
> > +	/* Setup a deferred worker for polling. */
> > +	if (tx && !vioch->deferred_tx_wq) {
> > +		vioch->deferred_tx_wq =
> > +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> > +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> > +					0);
> > +		if (!vioch->deferred_tx_wq)
> > +			return -ENOMEM;
> > +
> > +		INIT_WORK(&vioch->deferred_tx_work,
> > +			  scmi_vio_deferred_tx_worker);
> > +	}
> > +
> >  	for (i = 0; i < vioch->max_msg; i++) {
> >  		struct scmi_vio_msg *msg;
> >  
> > @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  						    GFP_KERNEL);
> >  			if (!msg->request)
> >  				return -ENOMEM;
> > +			spin_lock_init(&msg->poll_lock);
> >  		}
> >  
> >  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> > @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  		if (!msg->input)
> >  			return -ENOMEM;
> >  
> > -		if (tx) {
> > -			spin_lock_irqsave(&vioch->lock, flags);
> > -			list_add_tail(&msg->list, &vioch->free_list);
> > -			spin_unlock_irqrestore(&vioch->lock, flags);
> > -		} else {
> > +		spin_lock_irqsave(&vioch->lock, flags);
> > +		if (tx)
> > +			scmi_vio_feed_vq_tx(vioch, msg);
> > +		else
> >  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> > -		}
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
> >  	vioch->ready = false;
> >  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> >  
> > +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> > +		destroy_workqueue(vioch->deferred_tx_wq);
> > +		vioch->deferred_tx_wq = NULL;
> > +	}
> > +
> >  	scmi_free_channel(cinfo, data, id);
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  	}
> >  
> >  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> > -	list_del(&msg->list);
> > +	/* Re-init element so we can discern anytime if it is still in-flight */
> > +	list_del_init(&msg->list);
> >  
> >  	msg_tx_prepare(msg->request, xfer);
> >  
> > @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  		dev_err(vioch->cinfo->dev,
> >  			"failed to add to TX virtqueue (%d)\n", rc);
> >  	} else {
> > +		/*
> > +		 * If polling was requested for this transaction:
> > +		 *  - retrieve last used index (will be used as polling reference)
> > +		 *  - bind the polled message to the xfer via .priv
> > +		 */
> > +		if (xfer->hdr.poll_completion) {
> > +			spin_lock(&msg->poll_lock);
> > +			msg->poll_idx =
> > +				virtqueue_enable_cb_prepare(vioch->vqueue);
> > +			spin_unlock(&msg->poll_lock);
> > +			/* Ensure initialized msg is visibly bound to xfer */
> > +			smp_store_mb(xfer->priv, msg);
> > +		}
> >  		virtqueue_kick(vioch->vqueue);
> >  	}
> >  
> > @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> > -		xfer->priv = NULL;
> > -	}
> >  }
> >  
> >  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> > -		xfer->priv = NULL;
> > +}
> > +
> > +/**
> > + * virtio_mark_txdone  - Mark transmission done
> > + *
> > + * Free only successfully completed polling transfer messages.
> > + *
> > + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> > + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> > + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> > + * messages once, finally, a late reply is received and discarded (if ever).
> > + *
> > + * This approach was deemed preferable since those pending timed-out buffers are
> > + * still effectively owned by the SCMI platform VirtIO device even after timeout
> > + * expiration: forcibly freeing and reusing them before they had been returned
> > + * explicitly by the SCMI platform could lead to subtle bugs due to message
> > + * corruption.
> > + * An SCMI platform VirtIO device which never returns message buffers is
> > + * anyway broken and it will quickly lead to exhaustion of available messages.
> > + *
> > + * For this same reason, here, we take care to free only the successfully
> > + * completed polled messages, since they won't be freed elsewhere; late replies
> > + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> > + *
> > + * @cinfo: SCMI channel info
> > + * @ret: Transmission return code
> > + * @xfer: Transfer descriptor
> > + */
> > +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	struct scmi_vio_msg *msg = xfer->priv;
> > +
> > +	if (!msg)
> > +		return;
> > +
> > +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> > +	smp_store_mb(xfer->priv, NULL);
> > +
> > +	/* Is a successfully completed polled message still to be finalized ? */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +}
> > +
> > +/**
> > + * virtio_poll_done  - Provide polling support for VirtIO transport
> > + *
> > + * @cinfo: SCMI channel info
> > + * @xfer: Reference to the transfer being poll for.
> > + *
> > + * VirtIO core provides a polling mechanism based only on last used indexes:
> > + * this means that it is possible to poll the virtqueues waiting for something
> > + * new to arrive from the host side but the only way to check if the freshly
> > + * arrived buffer was what we were waiting for is to compare the newly arrived
> > + * message descriptors with the one we are polling on.
> > + *
> > + * As a consequence it can happen to dequeue something different from the buffer
> > + * we were poll-waiting for: if that is the case such early fetched buffers are
> > + * then added to a the @pending_cmds_list list for later processing by a
> > + * dedicated deferred worker.
> > + *
> > + * So, basically, once something new is spotted we proceed to de-queue all the
> > + * freshly received used buffers until we found the one we were polling on, or,
> > + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> > + * in the vqueue at the end of the polling loop (possible due to inherent races
> > + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> > + * and let it process those, to avoid indefinitely looping in the .poll_done
> > + * helper.
> > + *
> > + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> > + * when polling since such flag is per-virtqueues and we do not want to
> > + * suppress notifications as a whole: so, if the message we are polling for is
> > + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> > + * will be handled as such by scmi_rx_callback() and the polling loop in the
> > + * SCMI Core TX path will be transparently terminated anyway.
> > + *
> > + * Return: True once polling has successfully completed.
> > + */
> > +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > +			     struct scmi_xfer *xfer)
> > +{
> > +	bool pending, ret = false;
> > +	unsigned int length, any_prefetched = 0;
> > +	unsigned long flags;
> > +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +
> > +	if (!msg)
> > +		return true;
> > +
> > +	spin_lock_irqsave(&msg->poll_lock, flags);
> > +	/* Processed already by other polling loop on another CPU ? */
> > +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +		return true;
> >  	}
> > +
> > +	/* Has cmdq index moved at all ? */
> > +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +	if (!pending)
> > +		return false;
> > +
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	virtqueue_disable_cb(vioch->vqueue);
> > +
> > +	/*
> > +	 * If something arrived we cannot be sure, without dequeueing, if it
> > +	 * was the reply to the xfer we are polling for, or, to other, even
> > +	 * possibly non-polling, pending xfers: process all new messages
> > +	 * till the polled-for message is found OR the vqueue is empty.
> > +	 */
> > +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> > +		next_msg->rx_len = length;
> > +		/* Is the message we were polling for ? */
> > +		if (next_msg == msg) {
> > +			ret = true;
> > +			break;
> > +		}
> > +
> > +		spin_lock(&next_msg->poll_lock);
> > +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> > +			any_prefetched++;
> > +			list_add_tail(&next_msg->list,
> > +				      &vioch->pending_cmds_list);
> > +		} else {
> > +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> > +		}
> > +		spin_unlock(&next_msg->poll_lock);
> > +	}
> > +
> > +	/*
> > +	 * When the polling loop has successfully terminated if something
> > +	 * else was queued in the meantime, it will be served by a deferred
> > +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> > +	 *
> > +	 * If we are still looking for the polled reply, the polling index has
> > +	 * to be updated to the current vqueue last used index.
> > +	 */
> > +	if (ret) {
> > +		pending = !virtqueue_enable_cb(vioch->vqueue);
> > +	} else {
> > +		spin_lock(&msg->poll_lock);
> > +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> > +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +		spin_unlock(&msg->poll_lock);
> > +	}
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	if (any_prefetched || pending)
> > +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
> 
> I don't see any attempt to make sure the queued work is no longer
> running on e.g. device or driver removal.
> 

Right, I destroy the workqueue and nullify the deferred_tx_wq pointer in
chan_free, I'll add a check here within the spinlocking area to avoid the
race with the destroy in chan_free.

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2021-12-21 12:09       ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-21 12:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Igor Skalkin, Peter Hilber, virtualization

On Mon, Dec 20, 2021 at 06:17:03PM -0500, Michael S. Tsirkin wrote:
> On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> > 
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> > 

Hi Michael,

thanks for your review.

> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v7 --> v8
> > - removed ifdeffery
> > - reviewed comments
> > - simplified spinlocking aroung scmi_feed_vq_tx/rx
> > - added deferred worker for TX replies to aid while polling mode is active
> > V6 --> V7
> > - added a few comments about virtio polling internals
> > - fixed missing list_del on pending_cmds_list processing
> > - shrinked spinlocked areas in virtio_poll_done
> > - added proper spinlocking to scmi_vio_complete_cb while scanning list
> >   of pending cmds
> > ---
> >  drivers/firmware/arm_scmi/Kconfig  |  15 ++
> >  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
> >  2 files changed, 280 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> >  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> >  	  take care of the needed conversions, say N.
> >  
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +	bool "Enable atomic mode for SCMI VirtIO transport"
> > +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> > +	help
> > +	  Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > +	  If you want the SCMI VirtIO based transport to operate in atomic
> > +	  mode, avoiding any kind of sleeping behaviour for selected
> > +	  transactions on the TX path, answer Y.
> > +
> > +	  Enabling atomic mode operations allows any SCMI driver using this
> > +	  transport to optionally ask for atomic SCMI transactions and operate
> > +	  in atomic context too, at the price of using a number of busy-waiting
> > +	  primitives all over instead. If unsure say N.
> > +
> >  endif #ARM_SCMI_PROTOCOL
> >  
> >  config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..f589bbcc5db9 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -3,8 +3,8 @@
> >   * Virtio Transport driver for Arm System Control and Management Interface
> >   * (SCMI).
> >   *
> > - * Copyright (C) 2020-2021 OpenSynergy.
> > - * Copyright (C) 2021 ARM Ltd.
> > + * Copyright (C) 2020-2022 OpenSynergy.
> > + * Copyright (C) 2021-2022 ARM Ltd.
> >   */
> >  
> >  /**
> > @@ -38,6 +38,9 @@
> >   * @vqueue: Associated virtqueue
> >   * @cinfo: SCMI Tx or Rx channel
> >   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @deferred_tx_work: Worker for TX deferred replies processing
> > + * @deferred_tx_wq: Workqueue for TX deferred replies
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> >   * @is_rx: Whether channel is an Rx channel
> >   * @ready: Whether transport user is ready to hear about channel
> >   * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +52,9 @@ struct scmi_vio_channel {
> >  	struct virtqueue *vqueue;
> >  	struct scmi_chan_info *cinfo;
> >  	struct list_head free_list;
> > +	struct list_head pending_cmds_list;
> > +	struct work_struct deferred_tx_work;
> > +	struct workqueue_struct *deferred_tx_wq;
> >  	bool is_rx;
> >  	bool ready;
> >  	unsigned int max_msg;
> > @@ -65,12 +71,22 @@ struct scmi_vio_channel {
> >   * @input: SDU used for (delayed) responses and notifications
> >   * @list: List which scmi_vio_msg may be part of
> >   * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + *	      transaction reply was configured for polling.
> > + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> > + *	      use some out-of-scale values to signify particular conditions.
> > + * @poll_lock: Protect access to @poll_idx.
> >   */
> >  struct scmi_vio_msg {
> >  	struct scmi_msg_payld *request;
> >  	struct scmi_msg_payld *input;
> >  	struct list_head list;
> >  	unsigned int rx_len;
> > +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> > +#define VIO_MSG_POLL_DONE	0xffffffffUL
> > +	unsigned int poll_idx;
> > +	/* lock to protect access to poll_idx. */
> > +	spinlock_t poll_lock;
> >  };
> >  
> >  /* Only one SCMI VirtIO device can possibly exist */
> > @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
> >  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> >  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> >  			       struct scmi_vio_msg *msg,
> >  			       struct device *dev)
> >  {
> >  	struct scatterlist sg_in;
> >  	int rc;
> > -	unsigned long flags;
> >  
> >  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
> >  
> > -	spin_lock_irqsave(&vioch->lock, flags);
> > -
> >  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
> >  	if (rc)
> >  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
> >  	else
> >  		virtqueue_kick(vioch->vqueue);
> >  
> > -	spin_unlock_irqrestore(&vioch->lock, flags);
> > -
> >  	return rc;
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> > +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> > +				       struct scmi_vio_msg *msg)
> > +{
> > +	spin_lock(&msg->poll_lock);
> > +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> > +	spin_unlock(&msg->poll_lock);
> > +
> > +	list_add(&msg->list, &vioch->free_list);
> > +}
> > +
> >  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> >  				  struct scmi_vio_msg *msg)
> >  {
> > -	if (vioch->is_rx) {
> > +	if (vioch->is_rx)
> >  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> > -	} else {
> > -		/* Here IRQs are assumed to be already disabled by the caller */
> > -		spin_lock(&vioch->lock);
> > -		list_add(&msg->list, &vioch->free_list);
> > -		spin_unlock(&vioch->lock);
> > -	}
> > +	else
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> >  }
> >  
> >  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> > @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			virtqueue_disable_cb(vqueue);
> >  			cb_enabled = false;
> >  		}
> > +
> >  		msg = virtqueue_get_buf(vqueue, &length);
> >  		if (!msg) {
> >  			if (virtqueue_enable_cb(vqueue))
> > @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			scmi_rx_callback(vioch->cinfo,
> >  					 msg_read_header(msg->input), msg);
> >  
> > +			spin_lock(&vioch->lock);
> >  			scmi_finalize_message(vioch, msg);
> > +			spin_unlock(&vioch->lock);
> >  		}
> >  
> >  		/*
> > @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
> >  }
> >  
> > +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch;
> > +	struct scmi_vio_msg *msg, *tmp;
> > +
> > +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> > +
> > +	/* Process pre-fetched messages */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +
> > +	/* Scan the list of possibly pre-fetched messages during polling. */
> > +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> > +		list_del(&msg->list);
> > +
> > +		scmi_rx_callback(vioch->cinfo,
> > +				 msg_read_header(msg->input), msg);
> > +
> > +		/* Free the processed message once done */
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	}
> > +
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	/* Process possibly still pending messages */
> > +	scmi_vio_complete_cb(vioch->vqueue);
> > +}
> > +
> >  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
> >  
> >  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> > @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  
> >  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
> >  
> > +	/* Setup a deferred worker for polling. */
> > +	if (tx && !vioch->deferred_tx_wq) {
> > +		vioch->deferred_tx_wq =
> > +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> > +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> > +					0);
> > +		if (!vioch->deferred_tx_wq)
> > +			return -ENOMEM;
> > +
> > +		INIT_WORK(&vioch->deferred_tx_work,
> > +			  scmi_vio_deferred_tx_worker);
> > +	}
> > +
> >  	for (i = 0; i < vioch->max_msg; i++) {
> >  		struct scmi_vio_msg *msg;
> >  
> > @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  						    GFP_KERNEL);
> >  			if (!msg->request)
> >  				return -ENOMEM;
> > +			spin_lock_init(&msg->poll_lock);
> >  		}
> >  
> >  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> > @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  		if (!msg->input)
> >  			return -ENOMEM;
> >  
> > -		if (tx) {
> > -			spin_lock_irqsave(&vioch->lock, flags);
> > -			list_add_tail(&msg->list, &vioch->free_list);
> > -			spin_unlock_irqrestore(&vioch->lock, flags);
> > -		} else {
> > +		spin_lock_irqsave(&vioch->lock, flags);
> > +		if (tx)
> > +			scmi_vio_feed_vq_tx(vioch, msg);
> > +		else
> >  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> > -		}
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
> >  	vioch->ready = false;
> >  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> >  
> > +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> > +		destroy_workqueue(vioch->deferred_tx_wq);
> > +		vioch->deferred_tx_wq = NULL;
> > +	}
> > +
> >  	scmi_free_channel(cinfo, data, id);
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  	}
> >  
> >  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> > -	list_del(&msg->list);
> > +	/* Re-init element so we can discern anytime if it is still in-flight */
> > +	list_del_init(&msg->list);
> >  
> >  	msg_tx_prepare(msg->request, xfer);
> >  
> > @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  		dev_err(vioch->cinfo->dev,
> >  			"failed to add to TX virtqueue (%d)\n", rc);
> >  	} else {
> > +		/*
> > +		 * If polling was requested for this transaction:
> > +		 *  - retrieve last used index (will be used as polling reference)
> > +		 *  - bind the polled message to the xfer via .priv
> > +		 */
> > +		if (xfer->hdr.poll_completion) {
> > +			spin_lock(&msg->poll_lock);
> > +			msg->poll_idx =
> > +				virtqueue_enable_cb_prepare(vioch->vqueue);
> > +			spin_unlock(&msg->poll_lock);
> > +			/* Ensure initialized msg is visibly bound to xfer */
> > +			smp_store_mb(xfer->priv, msg);
> > +		}
> >  		virtqueue_kick(vioch->vqueue);
> >  	}
> >  
> > @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> > -		xfer->priv = NULL;
> > -	}
> >  }
> >  
> >  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> > -		xfer->priv = NULL;
> > +}
> > +
> > +/**
> > + * virtio_mark_txdone  - Mark transmission done
> > + *
> > + * Free only successfully completed polling transfer messages.
> > + *
> > + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> > + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> > + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> > + * messages once, finally, a late reply is received and discarded (if ever).
> > + *
> > + * This approach was deemed preferable since those pending timed-out buffers are
> > + * still effectively owned by the SCMI platform VirtIO device even after timeout
> > + * expiration: forcibly freeing and reusing them before they had been returned
> > + * explicitly by the SCMI platform could lead to subtle bugs due to message
> > + * corruption.
> > + * An SCMI platform VirtIO device which never returns message buffers is
> > + * anyway broken and it will quickly lead to exhaustion of available messages.
> > + *
> > + * For this same reason, here, we take care to free only the successfully
> > + * completed polled messages, since they won't be freed elsewhere; late replies
> > + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> > + *
> > + * @cinfo: SCMI channel info
> > + * @ret: Transmission return code
> > + * @xfer: Transfer descriptor
> > + */
> > +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	struct scmi_vio_msg *msg = xfer->priv;
> > +
> > +	if (!msg)
> > +		return;
> > +
> > +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> > +	smp_store_mb(xfer->priv, NULL);
> > +
> > +	/* Is a successfully completed polled message still to be finalized ? */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +}
> > +
> > +/**
> > + * virtio_poll_done  - Provide polling support for VirtIO transport
> > + *
> > + * @cinfo: SCMI channel info
> > + * @xfer: Reference to the transfer being poll for.
> > + *
> > + * VirtIO core provides a polling mechanism based only on last used indexes:
> > + * this means that it is possible to poll the virtqueues waiting for something
> > + * new to arrive from the host side but the only way to check if the freshly
> > + * arrived buffer was what we were waiting for is to compare the newly arrived
> > + * message descriptors with the one we are polling on.
> > + *
> > + * As a consequence it can happen to dequeue something different from the buffer
> > + * we were poll-waiting for: if that is the case such early fetched buffers are
> > + * then added to a the @pending_cmds_list list for later processing by a
> > + * dedicated deferred worker.
> > + *
> > + * So, basically, once something new is spotted we proceed to de-queue all the
> > + * freshly received used buffers until we found the one we were polling on, or,
> > + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> > + * in the vqueue at the end of the polling loop (possible due to inherent races
> > + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> > + * and let it process those, to avoid indefinitely looping in the .poll_done
> > + * helper.
> > + *
> > + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> > + * when polling since such flag is per-virtqueues and we do not want to
> > + * suppress notifications as a whole: so, if the message we are polling for is
> > + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> > + * will be handled as such by scmi_rx_callback() and the polling loop in the
> > + * SCMI Core TX path will be transparently terminated anyway.
> > + *
> > + * Return: True once polling has successfully completed.
> > + */
> > +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > +			     struct scmi_xfer *xfer)
> > +{
> > +	bool pending, ret = false;
> > +	unsigned int length, any_prefetched = 0;
> > +	unsigned long flags;
> > +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +
> > +	if (!msg)
> > +		return true;
> > +
> > +	spin_lock_irqsave(&msg->poll_lock, flags);
> > +	/* Processed already by other polling loop on another CPU ? */
> > +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +		return true;
> >  	}
> > +
> > +	/* Has cmdq index moved at all ? */
> > +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +	if (!pending)
> > +		return false;
> > +
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	virtqueue_disable_cb(vioch->vqueue);
> > +
> > +	/*
> > +	 * If something arrived we cannot be sure, without dequeueing, if it
> > +	 * was the reply to the xfer we are polling for, or, to other, even
> > +	 * possibly non-polling, pending xfers: process all new messages
> > +	 * till the polled-for message is found OR the vqueue is empty.
> > +	 */
> > +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> > +		next_msg->rx_len = length;
> > +		/* Is the message we were polling for ? */
> > +		if (next_msg == msg) {
> > +			ret = true;
> > +			break;
> > +		}
> > +
> > +		spin_lock(&next_msg->poll_lock);
> > +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> > +			any_prefetched++;
> > +			list_add_tail(&next_msg->list,
> > +				      &vioch->pending_cmds_list);
> > +		} else {
> > +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> > +		}
> > +		spin_unlock(&next_msg->poll_lock);
> > +	}
> > +
> > +	/*
> > +	 * When the polling loop has successfully terminated if something
> > +	 * else was queued in the meantime, it will be served by a deferred
> > +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> > +	 *
> > +	 * If we are still looking for the polled reply, the polling index has
> > +	 * to be updated to the current vqueue last used index.
> > +	 */
> > +	if (ret) {
> > +		pending = !virtqueue_enable_cb(vioch->vqueue);
> > +	} else {
> > +		spin_lock(&msg->poll_lock);
> > +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> > +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +		spin_unlock(&msg->poll_lock);
> > +	}
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	if (any_prefetched || pending)
> > +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
> 
> I don't see any attempt to make sure the queued work is no longer
> running on e.g. device or driver removal.
> 

Right, I destroy the workqueue and nullify the deferred_tx_wq pointer in
chan_free, I'll add a check here within the spinlocking area to avoid the
race with the destroy in chan_free.

Thanks,
Cristian


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2021-12-20 19:56   ` Cristian Marussi
@ 2021-12-21 14:00     ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-21 14:00 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Michael S. Tsirkin,
	virtualization

Add support for .mark_txdone and .poll_done transport operations to SCMI
VirtIO transport as pre-requisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions while leaving it default
disabled.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v8 --> v9
- check for deferred_wq existence before queueing work to avoid
  race at driver removal time
---
 drivers/firmware/arm_scmi/Kconfig  |  15 ++
 drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
 2 files changed, 287 insertions(+), 26 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
 	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
 	  take care of the needed conversions, say N.
 
+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+	bool "Enable atomic mode for SCMI VirtIO transport"
+	depends on ARM_SCMI_TRANSPORT_VIRTIO
+	help
+	  Enable support of atomic operation for SCMI VirtIO based transport.
+
+	  If you want the SCMI VirtIO based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 endif #ARM_SCMI_PROTOCOL
 
 config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..e54d14971c07 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -3,8 +3,8 @@
  * Virtio Transport driver for Arm System Control and Management Interface
  * (SCMI).
  *
- * Copyright (C) 2020-2021 OpenSynergy.
- * Copyright (C) 2021 ARM Ltd.
+ * Copyright (C) 2020-2022 OpenSynergy.
+ * Copyright (C) 2021-2022 ARM Ltd.
  */
 
 /**
@@ -38,6 +38,9 @@
  * @vqueue: Associated virtqueue
  * @cinfo: SCMI Tx or Rx channel
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @deferred_tx_work: Worker for TX deferred replies processing
+ * @deferred_tx_wq: Workqueue for TX deferred replies
+ * @pending_cmds_list: List of pre-fetched commands queueud for later processing
  * @is_rx: Whether channel is an Rx channel
  * @ready: Whether transport user is ready to hear about channel
  * @max_msg: Maximum number of pending messages for this channel.
@@ -49,6 +52,9 @@ struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
 	struct list_head free_list;
+	struct list_head pending_cmds_list;
+	struct work_struct deferred_tx_work;
+	struct workqueue_struct *deferred_tx_wq;
 	bool is_rx;
 	bool ready;
 	unsigned int max_msg;
@@ -65,12 +71,22 @@ struct scmi_vio_channel {
  * @input: SDU used for (delayed) responses and notifications
  * @list: List which scmi_vio_msg may be part of
  * @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ *	      transaction reply was configured for polling.
+ *	      Note that since virtqueue used index is an unsigned 16-bit we can
+ *	      use some out-of-scale values to signify particular conditions.
+ * @poll_lock: Protect access to @poll_idx.
  */
 struct scmi_vio_msg {
 	struct scmi_msg_payld *request;
 	struct scmi_msg_payld *input;
 	struct list_head list;
 	unsigned int rx_len;
+#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
+#define VIO_MSG_POLL_DONE	0xffffffffUL
+	unsigned int poll_idx;
+	/* lock to protect access to poll_idx. */
+	spinlock_t poll_lock;
 };
 
 /* Only one SCMI VirtIO device can possibly exist */
@@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
 static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
 			       struct scmi_vio_msg *msg,
 			       struct device *dev)
 {
 	struct scatterlist sg_in;
 	int rc;
-	unsigned long flags;
 
 	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
 
-	spin_lock_irqsave(&vioch->lock, flags);
-
 	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
 	if (rc)
 		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
 	else
 		virtqueue_kick(vioch->vqueue);
 
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
 	return rc;
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
+static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
+				       struct scmi_vio_msg *msg)
+{
+	spin_lock(&msg->poll_lock);
+	msg->poll_idx = VIO_MSG_NOT_POLLED;
+	spin_unlock(&msg->poll_lock);
+
+	list_add(&msg->list, &vioch->free_list);
+}
+
 static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 				  struct scmi_vio_msg *msg)
 {
-	if (vioch->is_rx) {
+	if (vioch->is_rx)
 		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
-	} else {
-		/* Here IRQs are assumed to be already disabled by the caller */
-		spin_lock(&vioch->lock);
-		list_add(&msg->list, &vioch->free_list);
-		spin_unlock(&vioch->lock);
-	}
+	else
+		scmi_vio_feed_vq_tx(vioch, msg);
 }
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
+
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
 			if (virtqueue_enable_cb(vqueue))
@@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			scmi_rx_callback(vioch->cinfo,
 					 msg_read_header(msg->input), msg);
 
+			spin_lock(&vioch->lock);
 			scmi_finalize_message(vioch, msg);
+			spin_unlock(&vioch->lock);
 		}
 
 		/*
@@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
 }
 
+static void scmi_vio_deferred_tx_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg, *tmp;
+
+	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
+
+	/* Process pre-fetched messages */
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	/* Scan the list of possibly pre-fetched messages during polling. */
+	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
+		list_del(&msg->list);
+
+		scmi_rx_callback(vioch->cinfo,
+				 msg_read_header(msg->input), msg);
+
+		/* Free the processed message once done */
+		scmi_vio_feed_vq_tx(vioch, msg);
+	}
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	/* Process possibly still pending messages */
+	scmi_vio_complete_cb(vioch->vqueue);
+}
+
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
 
 static vq_callback_t *scmi_vio_complete_callbacks[] = {
@@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
 
+	/* Setup a deferred worker for polling. */
+	if (tx && !vioch->deferred_tx_wq) {
+		vioch->deferred_tx_wq =
+			alloc_workqueue(dev_name(&scmi_vdev->dev),
+					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
+					0);
+		if (!vioch->deferred_tx_wq)
+			return -ENOMEM;
+
+		INIT_WORK(&vioch->deferred_tx_work,
+			  scmi_vio_deferred_tx_worker);
+	}
+
 	for (i = 0; i < vioch->max_msg; i++) {
 		struct scmi_vio_msg *msg;
 
@@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 						    GFP_KERNEL);
 			if (!msg->request)
 				return -ENOMEM;
+			spin_lock_init(&msg->poll_lock);
 		}
 
 		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
@@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		if (!msg->input)
 			return -ENOMEM;
 
-		if (tx) {
-			spin_lock_irqsave(&vioch->lock, flags);
-			list_add_tail(&msg->list, &vioch->free_list);
-			spin_unlock_irqrestore(&vioch->lock, flags);
-		} else {
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (tx)
+			scmi_vio_feed_vq_tx(vioch, msg);
+		else
 			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
-		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
 	}
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
 	unsigned long flags;
 	struct scmi_chan_info *cinfo = p;
 	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	void *deferred_wq = NULL;
 
 	spin_lock_irqsave(&vioch->ready_lock, flags);
 	vioch->ready = false;
 	spin_unlock_irqrestore(&vioch->ready_lock, flags);
 
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!vioch->is_rx && vioch->deferred_tx_wq) {
+		deferred_wq = vioch->deferred_tx_wq;
+		vioch->deferred_tx_wq = NULL;
+	}
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	if (deferred_wq)
+		destroy_workqueue(deferred_wq);
+
 	scmi_free_channel(cinfo, data, id);
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	}
 
 	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
-	list_del(&msg->list);
+	/* Re-init element so we can discern anytime if it is still in-flight */
+	list_del_init(&msg->list);
 
 	msg_tx_prepare(msg->request, xfer);
 
@@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 		dev_err(vioch->cinfo->dev,
 			"failed to add to TX virtqueue (%d)\n", rc);
 	} else {
+		/*
+		 * If polling was requested for this transaction:
+		 *  - retrieve last used index (will be used as polling reference)
+		 *  - bind the polled message to the xfer via .priv
+		 */
+		if (xfer->hdr.poll_completion) {
+			spin_lock(&msg->poll_lock);
+			msg->poll_idx =
+				virtqueue_enable_cb_prepare(vioch->vqueue);
+			spin_unlock(&msg->poll_lock);
+			/* Ensure initialized msg is visibly bound to xfer */
+			smp_store_mb(xfer->priv, msg);
+		}
 		virtqueue_kick(vioch->vqueue);
 	}
 
@@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_response(msg->input, msg->rx_len, xfer);
-		xfer->priv = NULL;
-	}
 }
 
 static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
-		xfer->priv = NULL;
+}
+
+/**
+ * virtio_mark_txdone  - Mark transmission done
+ *
+ * Free only successfully completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list, even on timeout, inside
+ * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
+ * messages once, finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * explicitly by the SCMI platform could lead to subtle bugs due to message
+ * corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to exhaustion of available messages.
+ *
+ * For this same reason, here, we take care to free only the successfully
+ * completed polled messages, since they won't be freed elsewhere; late replies
+ * to timed-out polled messages would be anyway freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			       struct scmi_xfer *xfer)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (!msg)
+		return;
+
+	/* Ensure msg is unbound from xfer before pushing onto the free list  */
+	smp_store_mb(xfer->priv, NULL);
+
+	/* Is a successfully completed polled message still to be finalized ? */
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
+		scmi_vio_feed_vq_tx(vioch, msg);
+	spin_unlock_irqrestore(&vioch->lock, flags);
+}
+
+/**
+ * virtio_poll_done  - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being poll for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side but the only way to check if the freshly
+ * arrived buffer was what we were waiting for is to compare the newly arrived
+ * message descriptors with the one we are polling on.
+ *
+ * As a consequence it can happen to dequeue something different from the buffer
+ * we were poll-waiting for: if that is the case such early fetched buffers are
+ * then added to a the @pending_cmds_list list for later processing by a
+ * dedicated deferred worker.
+ *
+ * So, basically, once something new is spotted we proceed to de-queue all the
+ * freshly received used buffers until we found the one we were polling on, or,
+ * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
+ * in the vqueue at the end of the polling loop (possible due to inherent races
+ * in virtqueues handling mechanisms), we similarly kick the deferred worker
+ * and let it process those, to avoid indefinitely looping in the .poll_done
+ * helper.
+ *
+ * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
+ * when polling since such flag is per-virtqueues and we do not want to
+ * suppress notifications as a whole: so, if the message we are polling for is
+ * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
+ * will be handled as such by scmi_rx_callback() and the polling loop in the
+ * SCMI Core TX path will be transparently terminated anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+			     struct scmi_xfer *xfer)
+{
+	bool pending, ret = false;
+	unsigned int length, any_prefetched = 0;
+	unsigned long flags;
+	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	if (!msg)
+		return true;
+
+	spin_lock_irqsave(&msg->poll_lock, flags);
+	/* Processed already by other polling loop on another CPU ? */
+	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
+		spin_unlock_irqrestore(&msg->poll_lock, flags);
+		return true;
+	}
+
+	/* Has cmdq index moved at all ? */
+	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+	spin_unlock_irqrestore(&msg->poll_lock, flags);
+	if (!pending)
+		return false;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	virtqueue_disable_cb(vioch->vqueue);
+
+	/*
+	 * If something arrived we cannot be sure, without dequeueing, if it
+	 * was the reply to the xfer we are polling for, or, to other, even
+	 * possibly non-polling, pending xfers: process all new messages
+	 * till the polled-for message is found OR the vqueue is empty.
+	 */
+	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
+		next_msg->rx_len = length;
+		/* Is the message we were polling for ? */
+		if (next_msg == msg) {
+			ret = true;
+			break;
+		}
+
+		spin_lock(&next_msg->poll_lock);
+		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
+			any_prefetched++;
+			list_add_tail(&next_msg->list,
+				      &vioch->pending_cmds_list);
+		} else {
+			next_msg->poll_idx = VIO_MSG_POLL_DONE;
+		}
+		spin_unlock(&next_msg->poll_lock);
 	}
+
+	/*
+	 * When the polling loop has successfully terminated if something
+	 * else was queued in the meantime, it will be served by a deferred
+	 * worker OR by the normal IRQ/callback OR by other poll loops.
+	 *
+	 * If we are still looking for the polled reply, the polling index has
+	 * to be updated to the current vqueue last used index.
+	 */
+	if (ret) {
+		pending = !virtqueue_enable_cb(vioch->vqueue);
+	} else {
+		spin_lock(&msg->poll_lock);
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+		spin_unlock(&msg->poll_lock);
+	}
+
+	if (vioch->deferred_tx_wq && (any_prefetched || pending))
+		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	return ret;
 }
 
 static const struct scmi_transport_ops scmi_virtio_ops = {
@@ -376,6 +618,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
 	.send_message = virtio_send_message,
 	.fetch_response = virtio_fetch_response,
 	.fetch_notification = virtio_fetch_notification,
+	.mark_txdone = virtio_mark_txdone,
+	.poll_done = virtio_poll_done,
 };
 
 static int scmi_vio_probe(struct virtio_device *vdev)
@@ -418,6 +662,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		spin_lock_init(&channels[i].lock);
 		spin_lock_init(&channels[i].ready_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
+		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
 		channels[i].vqueue = vqs[i];
 
 		sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -506,4 +751,5 @@ const struct scmi_desc scmi_virtio_desc = {
 	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2021-12-21 14:00     ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2021-12-21 14:00 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	peter.hilber, igor.skalkin, cristian.marussi, Michael S. Tsirkin,
	virtualization

Add support for .mark_txdone and .poll_done transport operations to SCMI
VirtIO transport as pre-requisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions while leaving it default
disabled.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
v8 --> v9
- check for deferred_wq existence before queueing work to avoid
  race at driver removal time
---
 drivers/firmware/arm_scmi/Kconfig  |  15 ++
 drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
 2 files changed, 287 insertions(+), 26 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
 	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
 	  take care of the needed conversions, say N.
 
+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+	bool "Enable atomic mode for SCMI VirtIO transport"
+	depends on ARM_SCMI_TRANSPORT_VIRTIO
+	help
+	  Enable support of atomic operation for SCMI VirtIO based transport.
+
+	  If you want the SCMI VirtIO based transport to operate in atomic
+	  mode, avoiding any kind of sleeping behaviour for selected
+	  transactions on the TX path, answer Y.
+
+	  Enabling atomic mode operations allows any SCMI driver using this
+	  transport to optionally ask for atomic SCMI transactions and operate
+	  in atomic context too, at the price of using a number of busy-waiting
+	  primitives all over instead. If unsure say N.
+
 endif #ARM_SCMI_PROTOCOL
 
 config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..e54d14971c07 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -3,8 +3,8 @@
  * Virtio Transport driver for Arm System Control and Management Interface
  * (SCMI).
  *
- * Copyright (C) 2020-2021 OpenSynergy.
- * Copyright (C) 2021 ARM Ltd.
+ * Copyright (C) 2020-2022 OpenSynergy.
+ * Copyright (C) 2021-2022 ARM Ltd.
  */
 
 /**
@@ -38,6 +38,9 @@
  * @vqueue: Associated virtqueue
  * @cinfo: SCMI Tx or Rx channel
  * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @deferred_tx_work: Worker for TX deferred replies processing
+ * @deferred_tx_wq: Workqueue for TX deferred replies
+ * @pending_cmds_list: List of pre-fetched commands queueud for later processing
  * @is_rx: Whether channel is an Rx channel
  * @ready: Whether transport user is ready to hear about channel
  * @max_msg: Maximum number of pending messages for this channel.
@@ -49,6 +52,9 @@ struct scmi_vio_channel {
 	struct virtqueue *vqueue;
 	struct scmi_chan_info *cinfo;
 	struct list_head free_list;
+	struct list_head pending_cmds_list;
+	struct work_struct deferred_tx_work;
+	struct workqueue_struct *deferred_tx_wq;
 	bool is_rx;
 	bool ready;
 	unsigned int max_msg;
@@ -65,12 +71,22 @@ struct scmi_vio_channel {
  * @input: SDU used for (delayed) responses and notifications
  * @list: List which scmi_vio_msg may be part of
  * @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ *	      transaction reply was configured for polling.
+ *	      Note that since virtqueue used index is an unsigned 16-bit we can
+ *	      use some out-of-scale values to signify particular conditions.
+ * @poll_lock: Protect access to @poll_idx.
  */
 struct scmi_vio_msg {
 	struct scmi_msg_payld *request;
 	struct scmi_msg_payld *input;
 	struct list_head list;
 	unsigned int rx_len;
+#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
+#define VIO_MSG_POLL_DONE	0xffffffffUL
+	unsigned int poll_idx;
+	/* lock to protect access to poll_idx. */
+	spinlock_t poll_lock;
 };
 
 /* Only one SCMI VirtIO device can possibly exist */
@@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
 	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
 static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
 			       struct scmi_vio_msg *msg,
 			       struct device *dev)
 {
 	struct scatterlist sg_in;
 	int rc;
-	unsigned long flags;
 
 	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
 
-	spin_lock_irqsave(&vioch->lock, flags);
-
 	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
 	if (rc)
 		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
 	else
 		virtqueue_kick(vioch->vqueue);
 
-	spin_unlock_irqrestore(&vioch->lock, flags);
-
 	return rc;
 }
 
+/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
+static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
+				       struct scmi_vio_msg *msg)
+{
+	spin_lock(&msg->poll_lock);
+	msg->poll_idx = VIO_MSG_NOT_POLLED;
+	spin_unlock(&msg->poll_lock);
+
+	list_add(&msg->list, &vioch->free_list);
+}
+
 static void scmi_finalize_message(struct scmi_vio_channel *vioch,
 				  struct scmi_vio_msg *msg)
 {
-	if (vioch->is_rx) {
+	if (vioch->is_rx)
 		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
-	} else {
-		/* Here IRQs are assumed to be already disabled by the caller */
-		spin_lock(&vioch->lock);
-		list_add(&msg->list, &vioch->free_list);
-		spin_unlock(&vioch->lock);
-	}
+	else
+		scmi_vio_feed_vq_tx(vioch, msg);
 }
 
 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			virtqueue_disable_cb(vqueue);
 			cb_enabled = false;
 		}
+
 		msg = virtqueue_get_buf(vqueue, &length);
 		if (!msg) {
 			if (virtqueue_enable_cb(vqueue))
@@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 			scmi_rx_callback(vioch->cinfo,
 					 msg_read_header(msg->input), msg);
 
+			spin_lock(&vioch->lock);
 			scmi_finalize_message(vioch, msg);
+			spin_unlock(&vioch->lock);
 		}
 
 		/*
@@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
 }
 
+static void scmi_vio_deferred_tx_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch;
+	struct scmi_vio_msg *msg, *tmp;
+
+	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
+
+	/* Process pre-fetched messages */
+	spin_lock_irqsave(&vioch->lock, flags);
+
+	/* Scan the list of possibly pre-fetched messages during polling. */
+	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
+		list_del(&msg->list);
+
+		scmi_rx_callback(vioch->cinfo,
+				 msg_read_header(msg->input), msg);
+
+		/* Free the processed message once done */
+		scmi_vio_feed_vq_tx(vioch, msg);
+	}
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	/* Process possibly still pending messages */
+	scmi_vio_complete_cb(vioch->vqueue);
+}
+
 static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
 
 static vq_callback_t *scmi_vio_complete_callbacks[] = {
@@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 
 	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
 
+	/* Setup a deferred worker for polling. */
+	if (tx && !vioch->deferred_tx_wq) {
+		vioch->deferred_tx_wq =
+			alloc_workqueue(dev_name(&scmi_vdev->dev),
+					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
+					0);
+		if (!vioch->deferred_tx_wq)
+			return -ENOMEM;
+
+		INIT_WORK(&vioch->deferred_tx_work,
+			  scmi_vio_deferred_tx_worker);
+	}
+
 	for (i = 0; i < vioch->max_msg; i++) {
 		struct scmi_vio_msg *msg;
 
@@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 						    GFP_KERNEL);
 			if (!msg->request)
 				return -ENOMEM;
+			spin_lock_init(&msg->poll_lock);
 		}
 
 		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
@@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
 		if (!msg->input)
 			return -ENOMEM;
 
-		if (tx) {
-			spin_lock_irqsave(&vioch->lock, flags);
-			list_add_tail(&msg->list, &vioch->free_list);
-			spin_unlock_irqrestore(&vioch->lock, flags);
-		} else {
+		spin_lock_irqsave(&vioch->lock, flags);
+		if (tx)
+			scmi_vio_feed_vq_tx(vioch, msg);
+		else
 			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
-		}
+		spin_unlock_irqrestore(&vioch->lock, flags);
 	}
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
 	unsigned long flags;
 	struct scmi_chan_info *cinfo = p;
 	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	void *deferred_wq = NULL;
 
 	spin_lock_irqsave(&vioch->ready_lock, flags);
 	vioch->ready = false;
 	spin_unlock_irqrestore(&vioch->ready_lock, flags);
 
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!vioch->is_rx && vioch->deferred_tx_wq) {
+		deferred_wq = vioch->deferred_tx_wq;
+		vioch->deferred_tx_wq = NULL;
+	}
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	if (deferred_wq)
+		destroy_workqueue(deferred_wq);
+
 	scmi_free_channel(cinfo, data, id);
 
 	spin_lock_irqsave(&vioch->lock, flags);
@@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 	}
 
 	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
-	list_del(&msg->list);
+	/* Re-init element so we can discern anytime if it is still in-flight */
+	list_del_init(&msg->list);
 
 	msg_tx_prepare(msg->request, xfer);
 
@@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
 		dev_err(vioch->cinfo->dev,
 			"failed to add to TX virtqueue (%d)\n", rc);
 	} else {
+		/*
+		 * If polling was requested for this transaction:
+		 *  - retrieve last used index (will be used as polling reference)
+		 *  - bind the polled message to the xfer via .priv
+		 */
+		if (xfer->hdr.poll_completion) {
+			spin_lock(&msg->poll_lock);
+			msg->poll_idx =
+				virtqueue_enable_cb_prepare(vioch->vqueue);
+			spin_unlock(&msg->poll_lock);
+			/* Ensure initialized msg is visibly bound to xfer */
+			smp_store_mb(xfer->priv, msg);
+		}
 		virtqueue_kick(vioch->vqueue);
 	}
 
@@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_response(msg->input, msg->rx_len, xfer);
-		xfer->priv = NULL;
-	}
 }
 
 static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
 {
 	struct scmi_vio_msg *msg = xfer->priv;
 
-	if (msg) {
+	if (msg)
 		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
-		xfer->priv = NULL;
+}
+
+/**
+ * virtio_mark_txdone  - Mark transmission done
+ *
+ * Free only successfully completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list, even on timeout, inside
+ * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
+ * messages once, finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * explicitly by the SCMI platform could lead to subtle bugs due to message
+ * corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to exhaustion of available messages.
+ *
+ * For this same reason, here, we take care to free only the successfully
+ * completed polled messages, since they won't be freed elsewhere; late replies
+ * to timed-out polled messages would be anyway freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+			       struct scmi_xfer *xfer)
+{
+	unsigned long flags;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+	struct scmi_vio_msg *msg = xfer->priv;
+
+	if (!msg)
+		return;
+
+	/* Ensure msg is unbound from xfer before pushing onto the free list  */
+	smp_store_mb(xfer->priv, NULL);
+
+	/* Is a successfully completed polled message still to be finalized ? */
+	spin_lock_irqsave(&vioch->lock, flags);
+	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
+		scmi_vio_feed_vq_tx(vioch, msg);
+	spin_unlock_irqrestore(&vioch->lock, flags);
+}
+
+/**
+ * virtio_poll_done  - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being poll for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side but the only way to check if the freshly
+ * arrived buffer was what we were waiting for is to compare the newly arrived
+ * message descriptors with the one we are polling on.
+ *
+ * As a consequence it can happen to dequeue something different from the buffer
+ * we were poll-waiting for: if that is the case such early fetched buffers are
+ * then added to a the @pending_cmds_list list for later processing by a
+ * dedicated deferred worker.
+ *
+ * So, basically, once something new is spotted we proceed to de-queue all the
+ * freshly received used buffers until we found the one we were polling on, or,
+ * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
+ * in the vqueue at the end of the polling loop (possible due to inherent races
+ * in virtqueues handling mechanisms), we similarly kick the deferred worker
+ * and let it process those, to avoid indefinitely looping in the .poll_done
+ * helper.
+ *
+ * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
+ * when polling since such flag is per-virtqueues and we do not want to
+ * suppress notifications as a whole: so, if the message we are polling for is
+ * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
+ * will be handled as such by scmi_rx_callback() and the polling loop in the
+ * SCMI Core TX path will be transparently terminated anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+			     struct scmi_xfer *xfer)
+{
+	bool pending, ret = false;
+	unsigned int length, any_prefetched = 0;
+	unsigned long flags;
+	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
+	struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+	if (!msg)
+		return true;
+
+	spin_lock_irqsave(&msg->poll_lock, flags);
+	/* Processed already by other polling loop on another CPU ? */
+	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
+		spin_unlock_irqrestore(&msg->poll_lock, flags);
+		return true;
+	}
+
+	/* Has cmdq index moved at all ? */
+	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+	spin_unlock_irqrestore(&msg->poll_lock, flags);
+	if (!pending)
+		return false;
+
+	spin_lock_irqsave(&vioch->lock, flags);
+	virtqueue_disable_cb(vioch->vqueue);
+
+	/*
+	 * If something arrived we cannot be sure, without dequeueing, if it
+	 * was the reply to the xfer we are polling for, or, to other, even
+	 * possibly non-polling, pending xfers: process all new messages
+	 * till the polled-for message is found OR the vqueue is empty.
+	 */
+	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
+		next_msg->rx_len = length;
+		/* Is the message we were polling for ? */
+		if (next_msg == msg) {
+			ret = true;
+			break;
+		}
+
+		spin_lock(&next_msg->poll_lock);
+		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
+			any_prefetched++;
+			list_add_tail(&next_msg->list,
+				      &vioch->pending_cmds_list);
+		} else {
+			next_msg->poll_idx = VIO_MSG_POLL_DONE;
+		}
+		spin_unlock(&next_msg->poll_lock);
 	}
+
+	/*
+	 * When the polling loop has successfully terminated if something
+	 * else was queued in the meantime, it will be served by a deferred
+	 * worker OR by the normal IRQ/callback OR by other poll loops.
+	 *
+	 * If we are still looking for the polled reply, the polling index has
+	 * to be updated to the current vqueue last used index.
+	 */
+	if (ret) {
+		pending = !virtqueue_enable_cb(vioch->vqueue);
+	} else {
+		spin_lock(&msg->poll_lock);
+		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
+		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
+		spin_unlock(&msg->poll_lock);
+	}
+
+	if (vioch->deferred_tx_wq && (any_prefetched || pending))
+		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
+
+	spin_unlock_irqrestore(&vioch->lock, flags);
+
+	return ret;
 }
 
 static const struct scmi_transport_ops scmi_virtio_ops = {
@@ -376,6 +618,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
 	.send_message = virtio_send_message,
 	.fetch_response = virtio_fetch_response,
 	.fetch_notification = virtio_fetch_notification,
+	.mark_txdone = virtio_mark_txdone,
+	.poll_done = virtio_poll_done,
 };
 
 static int scmi_vio_probe(struct virtio_device *vdev)
@@ -418,6 +662,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
 		spin_lock_init(&channels[i].lock);
 		spin_lock_init(&channels[i].ready_lock);
 		INIT_LIST_HEAD(&channels[i].free_list);
+		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
 		channels[i].vqueue = vqs[i];
 
 		sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -506,4 +751,5 @@ const struct scmi_desc scmi_virtio_desc = {
 	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
 	.max_msg = 0, /* overridden by virtio_get_max_msg() */
 	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
  2021-12-20 19:56   ` Cristian Marussi
@ 2021-12-21 19:38     ` Pratyush Yadav
  -1 siblings, 0 replies; 61+ messages in thread
From: Pratyush Yadav @ 2021-12-21 19:38 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty

Hi,

On 20/12/21 07:56PM, Cristian Marussi wrote:
> SCMI communications along TX channels can optionally be provided of a
> completion interrupt; when such interrupt is not available, command
> transactions should rely on polling, where the SCMI core takes care to
> repeatedly evaluate the transport-specific .poll_done() function, if
> available, to determine if and when a request was fully completed or
> timed out.
> 
> Such mechanism is already present and working on a single transfer base:
> SCMI protocols can indeed enable hdr.poll_completion on specific commands
> ahead of each transfer and cause that transaction to be handled with
> polling.
> 
> Introduce a couple of flags to be able to enforce such polling behaviour
> globally at will:
> 
>  - scmi_desc.force_polling: to statically switch the whole transport to
>    polling mode.
> 
>  - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
>    to polling mode if, at runtime, is determined that no completion
>    interrupt was available for such channel.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>

This patch breaks linux-next build for me with LLVM.

drivers/firmware/arm_scmi/driver.c:869:6: error: variable 'i_' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
        if (IS_POLLING_ENABLED(cinfo, info))
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/firmware/arm_scmi/driver.c:59:33: note: expanded from macro 'IS_POLLING_ENABLED'
                        IS_TRANSPORT_POLLING_CAPABLE(i_));              \
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
drivers/firmware/arm_scmi/driver.c:45:19: note: expanded from macro 'IS_TRANSPORT_POLLING_CAPABLE'
        typeof(__i) i_ = __i;                                           \
                    ~~   ^~~
1 error generated.

-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
@ 2021-12-21 19:38     ` Pratyush Yadav
  0 siblings, 0 replies; 61+ messages in thread
From: Pratyush Yadav @ 2021-12-21 19:38 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty

Hi,

On 20/12/21 07:56PM, Cristian Marussi wrote:
> SCMI communications along TX channels can optionally be provided of a
> completion interrupt; when such interrupt is not available, command
> transactions should rely on polling, where the SCMI core takes care to
> repeatedly evaluate the transport-specific .poll_done() function, if
> available, to determine if and when a request was fully completed or
> timed out.
> 
> Such mechanism is already present and working on a single transfer base:
> SCMI protocols can indeed enable hdr.poll_completion on specific commands
> ahead of each transfer and cause that transaction to be handled with
> polling.
> 
> Introduce a couple of flags to be able to enforce such polling behaviour
> globally at will:
> 
>  - scmi_desc.force_polling: to statically switch the whole transport to
>    polling mode.
> 
>  - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
>    to polling mode if, at runtime, is determined that no completion
>    interrupt was available for such channel.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>

This patch breaks linux-next build for me with LLVM.

drivers/firmware/arm_scmi/driver.c:869:6: error: variable 'i_' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
        if (IS_POLLING_ENABLED(cinfo, info))
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/firmware/arm_scmi/driver.c:59:33: note: expanded from macro 'IS_POLLING_ENABLED'
                        IS_TRANSPORT_POLLING_CAPABLE(i_));              \
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
drivers/firmware/arm_scmi/driver.c:45:19: note: expanded from macro 'IS_TRANSPORT_POLLING_CAPABLE'
        typeof(__i) i_ = __i;                                           \
                    ~~   ^~~
1 error generated.

-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
  2021-12-21 19:38     ` Pratyush Yadav
@ 2021-12-21 20:23       ` Sudeep Holla
  -1 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2021-12-21 20:23 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Cristian Marussi, linux-kernel, linux-arm-kernel, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, Sudeep Holla,
	vincent.guittot, souvik.chakravarty

On Wed, Dec 22, 2021 at 01:08:54AM +0530, Pratyush Yadav wrote:
> Hi,
> 
> On 20/12/21 07:56PM, Cristian Marussi wrote:
> > SCMI communications along TX channels can optionally be provided of a
> > completion interrupt; when such interrupt is not available, command
> > transactions should rely on polling, where the SCMI core takes care to
> > repeatedly evaluate the transport-specific .poll_done() function, if
> > available, to determine if and when a request was fully completed or
> > timed out.
> > 
> > Such mechanism is already present and working on a single transfer base:
> > SCMI protocols can indeed enable hdr.poll_completion on specific commands
> > ahead of each transfer and cause that transaction to be handled with
> > polling.
> > 
> > Introduce a couple of flags to be able to enforce such polling behaviour
> > globally at will:
> > 
> >  - scmi_desc.force_polling: to statically switch the whole transport to
> >    polling mode.
> > 
> >  - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
> >    to polling mode if, at runtime, is determined that no completion
> >    interrupt was available for such channel.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> 
> This patch breaks linux-next build for me with LLVM.
> 
> drivers/firmware/arm_scmi/driver.c:869:6: error: variable 'i_' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
>         if (IS_POLLING_ENABLED(cinfo, info))
>             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/firmware/arm_scmi/driver.c:59:33: note: expanded from macro 'IS_POLLING_ENABLED'
>                         IS_TRANSPORT_POLLING_CAPABLE(i_));              \
>                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
> drivers/firmware/arm_scmi/driver.c:45:19: note: expanded from macro 'IS_TRANSPORT_POLLING_CAPABLE'
>         typeof(__i) i_ = __i;                                           \
>                     ~~   ^~~
> 1 error generated.
> 

Thanks for the report. The bot complained last night and I have pushed
the update though not in time for linux-next today, must make it to
tomorrow's cut.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports
@ 2021-12-21 20:23       ` Sudeep Holla
  0 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2021-12-21 20:23 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Cristian Marussi, linux-kernel, linux-arm-kernel, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, Sudeep Holla,
	vincent.guittot, souvik.chakravarty

On Wed, Dec 22, 2021 at 01:08:54AM +0530, Pratyush Yadav wrote:
> Hi,
> 
> On 20/12/21 07:56PM, Cristian Marussi wrote:
> > SCMI communications along TX channels can optionally be provided of a
> > completion interrupt; when such interrupt is not available, command
> > transactions should rely on polling, where the SCMI core takes care to
> > repeatedly evaluate the transport-specific .poll_done() function, if
> > available, to determine if and when a request was fully completed or
> > timed out.
> > 
> > Such mechanism is already present and working on a single transfer base:
> > SCMI protocols can indeed enable hdr.poll_completion on specific commands
> > ahead of each transfer and cause that transaction to be handled with
> > polling.
> > 
> > Introduce a couple of flags to be able to enforce such polling behaviour
> > globally at will:
> > 
> >  - scmi_desc.force_polling: to statically switch the whole transport to
> >    polling mode.
> > 
> >  - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
> >    to polling mode if, at runtime, is determined that no completion
> >    interrupt was available for such channel.
> > 
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> 
> This patch breaks linux-next build for me with LLVM.
> 
> drivers/firmware/arm_scmi/driver.c:869:6: error: variable 'i_' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
>         if (IS_POLLING_ENABLED(cinfo, info))
>             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/firmware/arm_scmi/driver.c:59:33: note: expanded from macro 'IS_POLLING_ENABLED'
>                         IS_TRANSPORT_POLLING_CAPABLE(i_));              \
>                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
> drivers/firmware/arm_scmi/driver.c:45:19: note: expanded from macro 'IS_TRANSPORT_POLLING_CAPABLE'
>         typeof(__i) i_ = __i;                                           \
>                     ~~   ^~~
> 1 error generated.
> 

Thanks for the report. The bot complained last night and I have pushed
the update though not in time for linux-next today, must make it to
tomorrow's cut.

--
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 00/11] (subset) Introduce atomic support for SCMI transports
  2021-12-20 19:56 ` Cristian Marussi
@ 2021-12-22 14:23   ` Sudeep Holla
  -1 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2021-12-22 14:23 UTC (permalink / raw)
  To: Cristian Marussi, linux-arm-kernel, linux-kernel
  Cc: Sudeep Holla, etienne.carriere, souvik.chakravarty,
	Jonathan.Cameron, vincent.guittot, james.quinlan, f.fainelli

On Mon, 20 Dec 2021 19:56:35 +0000, Cristian Marussi wrote:
> This series mainly aims to introduce atomic support for SCMI transports
> that can support it.
> 
> In [01/11], as a closely related addition, it is introduced a common way
> for a transport to signal to the SCMI core that it does not offer
> completion interrupts, so that the usual polling behaviour will be
> required: this can be done enabling statically a global polling behaviour
> for the whole transport with flag scmi_desc.force_polling OR dynamically
> enabling at runtime such polling behaviour on a per-channel basis using
> the flag scmi_chan_info.no_completion_irq, typically during .chan_setup().
> The usual per-command polling selection behaviour based on
> hdr.poll_completion is preserved as before.
> 
> [...]


Applied to sudeep.holla/linux (for-next/scmi), thanks!

[01/11] firmware: arm_scmi: Add configurable polling mode for transports
        https://git.kernel.org/sudeep.holla/c/a690b7e6e7
[02/11] firmware: arm_scmi: Make smc transport use common completions
        https://git.kernel.org/sudeep.holla/c/f716cbd33f
[03/11] firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
        https://git.kernel.org/sudeep.holla/c/31d2f803c1
[04/11] firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
        https://git.kernel.org/sudeep.holla/c/117542b81f
[05/11] firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
        https://git.kernel.org/sudeep.holla/c/bf322084fe
[06/11] firmware: arm_scmi: Add support for atomic transports
        https://git.kernel.org/sudeep.holla/c/69255e7468
[07/11] firmware: arm_scmi: Add atomic mode support to smc transport
        https://git.kernel.org/sudeep.holla/c/0bfdca8a86
[08/11] firmware: arm_scmi: Add new parameter to mark_txdone
        https://git.kernel.org/sudeep.holla/c/94d0cd1da1


Deferring the last 3 for next release. We can see if we can include
latest spec change for clock atomic support 🤞.

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 00/11] (subset) Introduce atomic support for SCMI transports
@ 2021-12-22 14:23   ` Sudeep Holla
  0 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2021-12-22 14:23 UTC (permalink / raw)
  To: Cristian Marussi, linux-arm-kernel, linux-kernel
  Cc: Sudeep Holla, etienne.carriere, souvik.chakravarty,
	Jonathan.Cameron, vincent.guittot, james.quinlan, f.fainelli

On Mon, 20 Dec 2021 19:56:35 +0000, Cristian Marussi wrote:
> This series mainly aims to introduce atomic support for SCMI transports
> that can support it.
> 
> In [01/11], as a closely related addition, it is introduced a common way
> for a transport to signal to the SCMI core that it does not offer
> completion interrupts, so that the usual polling behaviour will be
> required: this can be done enabling statically a global polling behaviour
> for the whole transport with flag scmi_desc.force_polling OR dynamically
> enabling at runtime such polling behaviour on a per-channel basis using
> the flag scmi_chan_info.no_completion_irq, typically during .chan_setup().
> The usual per-command polling selection behaviour based on
> hdr.poll_completion is preserved as before.
> 
> [...]


Applied to sudeep.holla/linux (for-next/scmi), thanks!

[01/11] firmware: arm_scmi: Add configurable polling mode for transports
        https://git.kernel.org/sudeep.holla/c/a690b7e6e7
[02/11] firmware: arm_scmi: Make smc transport use common completions
        https://git.kernel.org/sudeep.holla/c/f716cbd33f
[03/11] firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag
        https://git.kernel.org/sudeep.holla/c/31d2f803c1
[04/11] firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret
        https://git.kernel.org/sudeep.holla/c/117542b81f
[05/11] firmware: arm_scmi: Make optee support sync_cmds_completed_on_ret
        https://git.kernel.org/sudeep.holla/c/bf322084fe
[06/11] firmware: arm_scmi: Add support for atomic transports
        https://git.kernel.org/sudeep.holla/c/69255e7468
[07/11] firmware: arm_scmi: Add atomic mode support to smc transport
        https://git.kernel.org/sudeep.holla/c/0bfdca8a86
[08/11] firmware: arm_scmi: Add new parameter to mark_txdone
        https://git.kernel.org/sudeep.holla/c/94d0cd1da1


Deferring the last 3 for next release. We can see if we can include
latest spec change for clock atomic support 🤞.

--
Regards,
Sudeep


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 00/11] Introduce atomic support for SCMI transports
  2021-12-20 19:56 ` Cristian Marussi
@ 2022-01-11 18:13   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-11 18:13 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty

On Mon, Dec 20, 2021 at 07:56:35PM +0000, Cristian Marussi wrote:
> Hi all,
> 

Hi

this series has now been partially queued (01-->08), the remaining
patches on virtio polling and SCMI CLK atomic support will be posted as
a new small patchset in the next cycle.

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 00/11] Introduce atomic support for SCMI transports
@ 2022-01-11 18:13   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-11 18:13 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty

On Mon, Dec 20, 2021 at 07:56:35PM +0000, Cristian Marussi wrote:
> Hi all,
> 

Hi

this series has now been partially queued (01-->08), the remaining
patches on virtio polling and SCMI CLK atomic support will be posted as
a new small patchset in the next cycle.

Thanks,
Cristian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
  2021-12-20 19:56   ` Cristian Marussi
@ 2022-01-14 23:08     ` Stephen Boyd
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephen Boyd @ 2022-01-14 23:08 UTC (permalink / raw)
  To: Cristian Marussi, linux-arm-kernel, linux-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael Turquette, linux-clk

Quoting Cristian Marussi (2021-12-20 11:56:46)
> Support also atomic enable/disable clk_ops beside the bare non-atomic one
> (prepare/unprepare) when the underlying SCMI transport is configured to
> support atomic transactions for synchronous commands.
> 
> Cc: Michael Turquette <mturquette@baylibre.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
> Cc: linux-clk@vger.kernel.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> BASED ON CLOCK DEVICES ENABLE LATENCY.
> THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.

Why are you yelling on the internet? :-) I guess I need to ack this.

Acked-by: Stephen Boyd <sboyd@kernel.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
@ 2022-01-14 23:08     ` Stephen Boyd
  0 siblings, 0 replies; 61+ messages in thread
From: Stephen Boyd @ 2022-01-14 23:08 UTC (permalink / raw)
  To: Cristian Marussi, linux-arm-kernel, linux-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	cristian.marussi, Michael Turquette, linux-clk

Quoting Cristian Marussi (2021-12-20 11:56:46)
> Support also atomic enable/disable clk_ops beside the bare non-atomic one
> (prepare/unprepare) when the underlying SCMI transport is configured to
> support atomic transactions for synchronous commands.
> 
> Cc: Michael Turquette <mturquette@baylibre.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
> Cc: linux-clk@vger.kernel.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> BASED ON CLOCK DEVICES ENABLE LATENCY.
> THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.

Why are you yelling on the internet? :-) I guess I need to ack this.

Acked-by: Stephen Boyd <sboyd@kernel.org>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
  2022-01-14 23:08     ` Stephen Boyd
@ 2022-01-17 10:31       ` Sudeep Holla
  -1 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2022-01-17 10:31 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Cristian Marussi, Sudeep Holla, linux-arm-kernel, linux-kernel,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, Michael Turquette,
	linux-clk

On Fri, Jan 14, 2022 at 03:08:37PM -0800, Stephen Boyd wrote:
> Quoting Cristian Marussi (2021-12-20 11:56:46)
> > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > (prepare/unprepare) when the underlying SCMI transport is configured to
> > support atomic transactions for synchronous commands.
> > 
> > Cc: Michael Turquette <mturquette@baylibre.com>
> > Cc: Stephen Boyd <sboyd@kernel.org>
> > Cc: linux-clk@vger.kernel.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> > OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> > BASED ON CLOCK DEVICES ENABLE LATENCY.
> > THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.
> 
> Why are you yelling on the internet? :-) I guess I need to ack this.
>

It is for the partners who request such changes. We are trying to prototype
and share the code and ask for feedback before we finalise the specification.

In fact it is other way around for you 😁. Not to ack as it is not yet final
😉. At least I need to wait until the spec contents are finalised before I
can merge with your ack 😁. But I agree RFC would have indicated that along
with the above background instead of *yelling*. Cristian assumed everyone
is aware of the content as quite a few are involved in offline discussions.

> Acked-by: Stephen Boyd <sboyd@kernel.org>

Thanks anyways, will use it if nothing changes.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
@ 2022-01-17 10:31       ` Sudeep Holla
  0 siblings, 0 replies; 61+ messages in thread
From: Sudeep Holla @ 2022-01-17 10:31 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Cristian Marussi, Sudeep Holla, linux-arm-kernel, linux-kernel,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, Michael Turquette,
	linux-clk

On Fri, Jan 14, 2022 at 03:08:37PM -0800, Stephen Boyd wrote:
> Quoting Cristian Marussi (2021-12-20 11:56:46)
> > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > (prepare/unprepare) when the underlying SCMI transport is configured to
> > support atomic transactions for synchronous commands.
> > 
> > Cc: Michael Turquette <mturquette@baylibre.com>
> > Cc: Stephen Boyd <sboyd@kernel.org>
> > Cc: linux-clk@vger.kernel.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> > OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> > BASED ON CLOCK DEVICES ENABLE LATENCY.
> > THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.
> 
> Why are you yelling on the internet? :-) I guess I need to ack this.
>

It is for the partners who request such changes. We are trying to prototype
and share the code and ask for feedback before we finalise the specification.

In fact it is other way around for you 😁. Not to ack as it is not yet final
😉. At least I need to wait until the spec contents are finalised before I
can merge with your ack 😁. But I agree RFC would have indicated that along
with the above background instead of *yelling*. Cristian assumed everyone
is aware of the content as quite a few are involved in offline discussions.

> Acked-by: Stephen Boyd <sboyd@kernel.org>

Thanks anyways, will use it if nothing changes.

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
  2022-01-17 10:31       ` Sudeep Holla
@ 2022-01-17 12:40         ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-17 12:40 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Stephen Boyd, linux-arm-kernel, linux-kernel, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Michael Turquette, linux-clk

On Mon, Jan 17, 2022 at 10:31:00AM +0000, Sudeep Holla wrote:
> On Fri, Jan 14, 2022 at 03:08:37PM -0800, Stephen Boyd wrote:
> > Quoting Cristian Marussi (2021-12-20 11:56:46)
> > > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > > (prepare/unprepare) when the underlying SCMI transport is configured to
> > > support atomic transactions for synchronous commands.
> > > 

Hi,

> > > Cc: Michael Turquette <mturquette@baylibre.com>
> > > Cc: Stephen Boyd <sboyd@kernel.org>
> > > Cc: linux-clk@vger.kernel.org
> > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > > ---
> > > NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> > > OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> > > BASED ON CLOCK DEVICES ENABLE LATENCY.
> > > THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.
> > 
> > Why are you yelling on the internet? :-) I guess I need to ack this.
> >
> 

Sorry I did not mean to yell really, just to warn partners using this.

> It is for the partners who request such changes. We are trying to prototype
> and share the code and ask for feedback before we finalise the specification.
> 
> In fact it is other way around for you 😁. Not to ack as it is not yet final
> 😉. At least I need to wait until the spec contents are finalised before I
> can merge with your ack 😁. But I agree RFC would have indicated that along
> with the above background instead of *yelling*. Cristian assumed everyone
> is aware of the content as quite a few are involved in offline discussions.
> 
> > Acked-by: Stephen Boyd <sboyd@kernel.org>
> 
> Thanks anyways, will use it if nothing changes.
> 

As Sudeep said, V8 it is indeed not the final version which is going to
be posted soon after the merge-windows (without yelling :D) and which it
will be indeed still marked as RFC since it does include a new protocol
feature which is still under review and not published.

Sorry for the noise.

Thanks,
Cristian

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API
@ 2022-01-17 12:40         ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-17 12:40 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Stephen Boyd, linux-arm-kernel, linux-kernel, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, Michael Turquette, linux-clk

On Mon, Jan 17, 2022 at 10:31:00AM +0000, Sudeep Holla wrote:
> On Fri, Jan 14, 2022 at 03:08:37PM -0800, Stephen Boyd wrote:
> > Quoting Cristian Marussi (2021-12-20 11:56:46)
> > > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > > (prepare/unprepare) when the underlying SCMI transport is configured to
> > > support atomic transactions for synchronous commands.
> > > 

Hi,

> > > Cc: Michael Turquette <mturquette@baylibre.com>
> > > Cc: Stephen Boyd <sboyd@kernel.org>
> > > Cc: linux-clk@vger.kernel.org
> > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > > ---
> > > NOTE THAT STILL THERE'S NO FINE GRAIN CONTROL OVER SELECTION
> > > OF WHICH CLOCK DEVICES CAN SUPPORT ATOMIC AND WHICH SHOULD NOT
> > > BASED ON CLOCK DEVICES ENABLE LATENCY.
> > > THIS HAS STILL TO BE ADDED IN SCMI PROTOCOL SPEC.
> > 
> > Why are you yelling on the internet? :-) I guess I need to ack this.
> >
> 

Sorry I did not mean to yell really, just to warn partners using this.

> It is for the partners who request such changes. We are trying to prototype
> and share the code and ask for feedback before we finalise the specification.
> 
> In fact it is other way around for you 😁. Not to ack as it is not yet final
> 😉. At least I need to wait until the spec contents are finalised before I
> can merge with your ack 😁. But I agree RFC would have indicated that along
> with the above background instead of *yelling*. Cristian assumed everyone
> is aware of the content as quite a few are involved in offline discussions.
> 
> > Acked-by: Stephen Boyd <sboyd@kernel.org>
> 
> Thanks anyways, will use it if nothing changes.
> 

As Sudeep said, V8 it is indeed not the final version which is going to
be posted soon after the merge-windows (without yelling :D) and which it
will be indeed still marked as RFC since it does include a new protocol
feature which is still under review and not published.

Sorry for the noise.

Thanks,
Cristian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2021-12-21 14:00     ` Cristian Marussi
@ 2022-01-18 14:21       ` Peter Hilber
  -1 siblings, 0 replies; 61+ messages in thread
From: Peter Hilber @ 2022-01-18 14:21 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, Michael S. Tsirkin, virtualization

On 21.12.21 15:00, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 

Hi Cristian,

thanks for the update. I have some more remarks inline below.

My impression is that the virtio core does not expose helper functions suitable
to busy-poll for used buffers. But changing this might not be difficult. Maybe
more_used() from virtio_ring.c could be exposed via a wrapper?

Best regards,

Peter

> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v8 --> v9
> - check for deferred_wq existence before queueing work to avoid
>   race at driver removal time
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
>  2 files changed, 287 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operation for SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..e54d14971c07 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> + *	      use some out-of-scale values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	unsigned long flags;
>  	struct scmi_chan_info *cinfo = p;
>  	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	void *deferred_wq = NULL;
>  
>  	spin_lock_irqsave(&vioch->ready_lock, flags);
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		deferred_wq = vioch->deferred_tx_wq;
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (deferred_wq)
> +		destroy_workqueue(deferred_wq);
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern anytime if it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)

The virtqueue_add_sgs() called before already exposed the message to the virtio
device, so a fast response might have arrived already, so the reference value
might unexpectedly correspond to the current message.

> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> + * messages once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))

Could this not also be a message which was already finalized and then
immediately reused for another polling xfer?

> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing by a
> + * dedicated deferred worker.
> + *
> + * So, basically, once something new is spotted we proceed to de-queue all the
> + * freshly received used buffers until we found the one we were polling on, or,
> + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> + * in the vqueue at the end of the polling loop (possible due to inherent races
> + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> + * and let it process those, to avoid indefinitely looping in the .poll_done
> + * helper.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> + * will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.

Why is VIRTQ_USED_F_NO_NOTIFY mentioned? It is a flag associated with only one
of two buffer notification suppression methods, which are transparently handled
by the virtio core. Cf. `Driver Requirements: Used Buffer Notification
Suppression' in the virtio spec.

> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Processed already by other polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
> +	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);

In my understanding, the polling comparison could still be subject to the ABA
problem when exactly 2**16 messages have been marked as used since
msg->poll_idx was set (unlikely scenario, granted).

I think this would be a lot simpler if the virtio core exported some
concurrency-safe helper function for such polling (similar to more_used() from
virtio_ring.c), as discussed at the top.

> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, if it
> +	 * was the reply to the xfer we are polling for, or, to other, even
> +	 * possibly non-polling, pending xfers: process all new messages
> +	 * till the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
>  	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated if something
> +	 * else was queued in the meantime, it will be served by a deferred
> +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> +	 *
> +	 * If we are still looking for the polled reply, the polling index has
> +	 * to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +
> +	if (vioch->deferred_tx_wq && (any_prefetched || pending))
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

What if queue_work() returns false because the single work item for the channel
was already pending? Couldn't it then happen that the worker CPU will not yet
see the updates implied by `any_prefetched || pending'?

> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +618,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +662,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +751,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-18 14:21       ` Peter Hilber
  0 siblings, 0 replies; 61+ messages in thread
From: Peter Hilber @ 2022-01-18 14:21 UTC (permalink / raw)
  To: Cristian Marussi, linux-kernel, linux-arm-kernel
  Cc: sudeep.holla, james.quinlan, Jonathan.Cameron, f.fainelli,
	etienne.carriere, vincent.guittot, souvik.chakravarty,
	igor.skalkin, Michael S. Tsirkin, virtualization

On 21.12.21 15:00, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 

Hi Cristian,

thanks for the update. I have some more remarks inline below.

My impression is that the virtio core does not expose helper functions suitable
to busy-poll for used buffers. But changing this might not be difficult. Maybe
more_used() from virtio_ring.c could be exposed via a wrapper?

Best regards,

Peter

> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v8 --> v9
> - check for deferred_wq existence before queueing work to avoid
>   race at driver removal time
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
>  2 files changed, 287 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operation for SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..e54d14971c07 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> + *	      use some out-of-scale values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	unsigned long flags;
>  	struct scmi_chan_info *cinfo = p;
>  	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	void *deferred_wq = NULL;
>  
>  	spin_lock_irqsave(&vioch->ready_lock, flags);
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		deferred_wq = vioch->deferred_tx_wq;
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (deferred_wq)
> +		destroy_workqueue(deferred_wq);
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern anytime if it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)

The virtqueue_add_sgs() called before already exposed the message to the virtio
device, so a fast response might have arrived already, so the reference value
might unexpectedly correspond to the current message.

> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> + * messages once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))

Could this not also be a message which was already finalized and then
immediately reused for another polling xfer?

> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing by a
> + * dedicated deferred worker.
> + *
> + * So, basically, once something new is spotted we proceed to de-queue all the
> + * freshly received used buffers until we found the one we were polling on, or,
> + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> + * in the vqueue at the end of the polling loop (possible due to inherent races
> + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> + * and let it process those, to avoid indefinitely looping in the .poll_done
> + * helper.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> + * will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.

Why is VIRTQ_USED_F_NO_NOTIFY mentioned? It is a flag associated with only one
of two buffer notification suppression methods, which are transparently handled
by the virtio core. Cf. `Driver Requirements: Used Buffer Notification
Suppression' in the virtio spec.

> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Processed already by other polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
> +	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);

In my understanding, the polling comparison could still be subject to the ABA
problem when exactly 2**16 messages have been marked as used since
msg->poll_idx was set (unlikely scenario, granted).

I think this would be a lot simpler if the virtio core exported some
concurrency-safe helper function for such polling (similar to more_used() from
virtio_ring.c), as discussed at the top.

> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, if it
> +	 * was the reply to the xfer we are polling for, or, to other, even
> +	 * possibly non-polling, pending xfers: process all new messages
> +	 * till the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
>  	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated if something
> +	 * else was queued in the meantime, it will be served by a deferred
> +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> +	 *
> +	 * If we are still looking for the polled reply, the polling index has
> +	 * to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +
> +	if (vioch->deferred_tx_wq && (any_prefetched || pending))
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

What if queue_work() returns false because the single work item for the channel
was already pending? Couldn't it then happen that the worker CPU will not yet
see the updates implied by `any_prefetched || pending'?

> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +618,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +662,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +751,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-18 14:21       ` Peter Hilber
@ 2022-01-19 12:23         ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-19 12:23 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, igor.skalkin, Michael S. Tsirkin,
	virtualization

On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> On 21.12.21 15:00, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> > 
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> > 
> 
> Hi Cristian,
> 
> thanks for the update. I have some more remarks inline below.
> 

Hi Peter,

thanks for your review, much appreciated, please see my replies online.

> My impression is that the virtio core does not expose helper functions suitable
> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> more_used() from virtio_ring.c could be exposed via a wrapper?
> 

While I definitely agree that the virtio core support for polling is far from
ideal, some support is provided and my point was at first to try implement SCMI
virtio polling leveraging what we have now in the core and see if it was attainable
(indeed I tried early in this series to avoid as a whole to have to support polling
at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
attempt that led nowhere good...)

Btw, I was planning to post a new series next week (after merge-windows) with some
fixes I did already, at this point I'll include also some fixes derived
from some of your remarks.

> Best regards,
> 
> Peter
> 
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v8 --> v9
> > - check for deferred_wq existence before queueing work to avoid
> >   race at driver removal time
> > ---
> >  drivers/firmware/arm_scmi/Kconfig  |  15 ++
> >  drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
> >  2 files changed, 287 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> >  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> >  	  take care of the needed conversions, say N.
> >  
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +	bool "Enable atomic mode for SCMI VirtIO transport"
> > +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> > +	help
> > +	  Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > +	  If you want the SCMI VirtIO based transport to operate in atomic
> > +	  mode, avoiding any kind of sleeping behaviour for selected
> > +	  transactions on the TX path, answer Y.
> > +
> > +	  Enabling atomic mode operations allows any SCMI driver using this
> > +	  transport to optionally ask for atomic SCMI transactions and operate
> > +	  in atomic context too, at the price of using a number of busy-waiting
> > +	  primitives all over instead. If unsure say N.
> > +
> >  endif #ARM_SCMI_PROTOCOL
> >  
> >  config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..e54d14971c07 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -3,8 +3,8 @@
> >   * Virtio Transport driver for Arm System Control and Management Interface
> >   * (SCMI).
> >   *
> > - * Copyright (C) 2020-2021 OpenSynergy.
> > - * Copyright (C) 2021 ARM Ltd.
> > + * Copyright (C) 2020-2022 OpenSynergy.
> > + * Copyright (C) 2021-2022 ARM Ltd.
> >   */
> >  
> >  /**
> > @@ -38,6 +38,9 @@
> >   * @vqueue: Associated virtqueue
> >   * @cinfo: SCMI Tx or Rx channel
> >   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @deferred_tx_work: Worker for TX deferred replies processing
> > + * @deferred_tx_wq: Workqueue for TX deferred replies
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> >   * @is_rx: Whether channel is an Rx channel
> >   * @ready: Whether transport user is ready to hear about channel
> >   * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +52,9 @@ struct scmi_vio_channel {
> >  	struct virtqueue *vqueue;
> >  	struct scmi_chan_info *cinfo;
> >  	struct list_head free_list;
> > +	struct list_head pending_cmds_list;
> > +	struct work_struct deferred_tx_work;
> > +	struct workqueue_struct *deferred_tx_wq;
> >  	bool is_rx;
> >  	bool ready;
> >  	unsigned int max_msg;
> > @@ -65,12 +71,22 @@ struct scmi_vio_channel {
> >   * @input: SDU used for (delayed) responses and notifications
> >   * @list: List which scmi_vio_msg may be part of
> >   * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + *	      transaction reply was configured for polling.
> > + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> > + *	      use some out-of-scale values to signify particular conditions.
> > + * @poll_lock: Protect access to @poll_idx.
> >   */
> >  struct scmi_vio_msg {
> >  	struct scmi_msg_payld *request;
> >  	struct scmi_msg_payld *input;
> >  	struct list_head list;
> >  	unsigned int rx_len;
> > +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> > +#define VIO_MSG_POLL_DONE	0xffffffffUL
> > +	unsigned int poll_idx;
> > +	/* lock to protect access to poll_idx. */
> > +	spinlock_t poll_lock;
> >  };
> >  
> >  /* Only one SCMI VirtIO device can possibly exist */
> > @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
> >  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> >  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> >  			       struct scmi_vio_msg *msg,
> >  			       struct device *dev)
> >  {
> >  	struct scatterlist sg_in;
> >  	int rc;
> > -	unsigned long flags;
> >  
> >  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
> >  
> > -	spin_lock_irqsave(&vioch->lock, flags);
> > -
> >  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
> >  	if (rc)
> >  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
> >  	else
> >  		virtqueue_kick(vioch->vqueue);
> >  
> > -	spin_unlock_irqrestore(&vioch->lock, flags);
> > -
> >  	return rc;
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> > +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> > +				       struct scmi_vio_msg *msg)
> > +{
> > +	spin_lock(&msg->poll_lock);
> > +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> > +	spin_unlock(&msg->poll_lock);
> > +
> > +	list_add(&msg->list, &vioch->free_list);
> > +}
> > +
> >  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> >  				  struct scmi_vio_msg *msg)
> >  {
> > -	if (vioch->is_rx) {
> > +	if (vioch->is_rx)
> >  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> > -	} else {
> > -		/* Here IRQs are assumed to be already disabled by the caller */
> > -		spin_lock(&vioch->lock);
> > -		list_add(&msg->list, &vioch->free_list);
> > -		spin_unlock(&vioch->lock);
> > -	}
> > +	else
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> >  }
> >  
> >  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> > @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			virtqueue_disable_cb(vqueue);
> >  			cb_enabled = false;
> >  		}
> > +
> >  		msg = virtqueue_get_buf(vqueue, &length);
> >  		if (!msg) {
> >  			if (virtqueue_enable_cb(vqueue))
> > @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			scmi_rx_callback(vioch->cinfo,
> >  					 msg_read_header(msg->input), msg);
> >  
> > +			spin_lock(&vioch->lock);
> >  			scmi_finalize_message(vioch, msg);
> > +			spin_unlock(&vioch->lock);
> >  		}
> >  
> >  		/*
> > @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
> >  }
> >  
> > +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch;
> > +	struct scmi_vio_msg *msg, *tmp;
> > +
> > +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> > +
> > +	/* Process pre-fetched messages */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +
> > +	/* Scan the list of possibly pre-fetched messages during polling. */
> > +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> > +		list_del(&msg->list);
> > +
> > +		scmi_rx_callback(vioch->cinfo,
> > +				 msg_read_header(msg->input), msg);
> > +
> > +		/* Free the processed message once done */
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	}
> > +
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	/* Process possibly still pending messages */
> > +	scmi_vio_complete_cb(vioch->vqueue);
> > +}
> > +
> >  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
> >  
> >  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> > @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  
> >  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
> >  
> > +	/* Setup a deferred worker for polling. */
> > +	if (tx && !vioch->deferred_tx_wq) {
> > +		vioch->deferred_tx_wq =
> > +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> > +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> > +					0);
> > +		if (!vioch->deferred_tx_wq)
> > +			return -ENOMEM;
> > +
> > +		INIT_WORK(&vioch->deferred_tx_work,
> > +			  scmi_vio_deferred_tx_worker);
> > +	}
> > +
> >  	for (i = 0; i < vioch->max_msg; i++) {
> >  		struct scmi_vio_msg *msg;
> >  
> > @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  						    GFP_KERNEL);
> >  			if (!msg->request)
> >  				return -ENOMEM;
> > +			spin_lock_init(&msg->poll_lock);
> >  		}
> >  
> >  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> > @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  		if (!msg->input)
> >  			return -ENOMEM;
> >  
> > -		if (tx) {
> > -			spin_lock_irqsave(&vioch->lock, flags);
> > -			list_add_tail(&msg->list, &vioch->free_list);
> > -			spin_unlock_irqrestore(&vioch->lock, flags);
> > -		} else {
> > +		spin_lock_irqsave(&vioch->lock, flags);
> > +		if (tx)
> > +			scmi_vio_feed_vq_tx(vioch, msg);
> > +		else
> >  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> > -		}
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
> >  	unsigned long flags;
> >  	struct scmi_chan_info *cinfo = p;
> >  	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	void *deferred_wq = NULL;
> >  
> >  	spin_lock_irqsave(&vioch->ready_lock, flags);
> >  	vioch->ready = false;
> >  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> >  
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> > +		deferred_wq = vioch->deferred_tx_wq;
> > +		vioch->deferred_tx_wq = NULL;
> > +	}
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	if (deferred_wq)
> > +		destroy_workqueue(deferred_wq);
> > +
> >  	scmi_free_channel(cinfo, data, id);
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  	}
> >  
> >  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> > -	list_del(&msg->list);
> > +	/* Re-init element so we can discern anytime if it is still in-flight */
> > +	list_del_init(&msg->list);
> >  
> >  	msg_tx_prepare(msg->request, xfer);
> >  
> > @@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  		dev_err(vioch->cinfo->dev,
> >  			"failed to add to TX virtqueue (%d)\n", rc);
> >  	} else {
> > +		/*
> > +		 * If polling was requested for this transaction:
> > +		 *  - retrieve last used index (will be used as polling reference)
> 
> The virtqueue_add_sgs() called before already exposed the message to the virtio
> device, so a fast response might have arrived already, so the reference value
> might unexpectedly correspond to the current message.
> 

Good point I have wrongly considered the kick as the only trigger but if
indeed the device was already kicked and processing the queues the scenario
you depicted is possible. I'll move this logic earlier.

> > +		 *  - bind the polled message to the xfer via .priv
> > +		 */
> > +		if (xfer->hdr.poll_completion) {
> > +			spin_lock(&msg->poll_lock);
> > +			msg->poll_idx =
> > +				virtqueue_enable_cb_prepare(vioch->vqueue);
> > +			spin_unlock(&msg->poll_lock);
> > +			/* Ensure initialized msg is visibly bound to xfer */
> > +			smp_store_mb(xfer->priv, msg);
> > +		}
> >  		virtqueue_kick(vioch->vqueue);
> >  	}
> >  
> > @@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> > -		xfer->priv = NULL;
> > -	}
> >  }
> >  
> >  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > @@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> > -		xfer->priv = NULL;
> > +}
> > +
> > +/**
> > + * virtio_mark_txdone  - Mark transmission done
> > + *
> > + * Free only successfully completed polling transfer messages.
> > + *
> > + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> > + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> > + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> > + * messages once, finally, a late reply is received and discarded (if ever).
> > + *
> > + * This approach was deemed preferable since those pending timed-out buffers are
> > + * still effectively owned by the SCMI platform VirtIO device even after timeout
> > + * expiration: forcibly freeing and reusing them before they had been returned
> > + * explicitly by the SCMI platform could lead to subtle bugs due to message
> > + * corruption.
> > + * An SCMI platform VirtIO device which never returns message buffers is
> > + * anyway broken and it will quickly lead to exhaustion of available messages.
> > + *
> > + * For this same reason, here, we take care to free only the successfully
> > + * completed polled messages, since they won't be freed elsewhere; late replies
> > + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> > + *
> > + * @cinfo: SCMI channel info
> > + * @ret: Transmission return code
> > + * @xfer: Transfer descriptor
> > + */
> > +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	struct scmi_vio_msg *msg = xfer->priv;
> > +
> > +	if (!msg)
> > +		return;
> > +
> > +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> > +	smp_store_mb(xfer->priv, NULL);
> > +
> > +	/* Is a successfully completed polled message still to be finalized ? */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> 
> Could this not also be a message which was already finalized and then
> immediately reused for another polling xfer?
> 

Fist of all, in the upcoming series there is already a fix here, because I do
not want (as in non-polling mode and as stated in the comment above) to forcibly
release -ETIMEOUT xfer here. New check being like:

	if (ret != -ETIMEDOUT && xfer->hdr.poll_completion && list_empty(&msg->list))

BUT, if I understood correctly what you meant you are talking about a polled
message which had been received by chance on the IRQ path and there finalized
already, and so subsequently happily picked up from the free list for another
polled transfer....I think you are right, even if remote, this is a possibility
to account for (and the list_empty() check fails at this): I'll take care to clear
xfer->priv after a command is fetched BUT only if received across the IRQ path,
and drop the list_empty() check above which at this point is redundant: in this
way mark_tx_done will be skip even for polled cmds that have already been
finalized on the IRQ path. (notifications/delayed responses are unaffected
since not handled on this path...)

Thanks for spotting this.

> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +}
> > +
> > +/**
> > + * virtio_poll_done  - Provide polling support for VirtIO transport
> > + *
> > + * @cinfo: SCMI channel info
> > + * @xfer: Reference to the transfer being poll for.
> > + *
> > + * VirtIO core provides a polling mechanism based only on last used indexes:
> > + * this means that it is possible to poll the virtqueues waiting for something
> > + * new to arrive from the host side but the only way to check if the freshly
> > + * arrived buffer was what we were waiting for is to compare the newly arrived
> > + * message descriptors with the one we are polling on.
> > + *
> > + * As a consequence it can happen to dequeue something different from the buffer
> > + * we were poll-waiting for: if that is the case such early fetched buffers are
> > + * then added to a the @pending_cmds_list list for later processing by a
> > + * dedicated deferred worker.
> > + *
> > + * So, basically, once something new is spotted we proceed to de-queue all the
> > + * freshly received used buffers until we found the one we were polling on, or,
> > + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> > + * in the vqueue at the end of the polling loop (possible due to inherent races
> > + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> > + * and let it process those, to avoid indefinitely looping in the .poll_done
> > + * helper.
> > + *
> > + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> > + * when polling since such flag is per-virtqueues and we do not want to
> > + * suppress notifications as a whole: so, if the message we are polling for is
> > + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> > + * will be handled as such by scmi_rx_callback() and the polling loop in the
> > + * SCMI Core TX path will be transparently terminated anyway.
> 
> Why is VIRTQ_USED_F_NO_NOTIFY mentioned? It is a flag associated with only one
> of two buffer notification suppression methods, which are transparently handled
> by the virtio core. Cf. `Driver Requirements: Used Buffer Notification
> Suppression' in the virtio spec.
> 

I saw that paragraph on the spec, what I wanted to stress was that we have no
way to implement a per-message notification suppression (and probably neither a
per-channel one since VIRTQ_USED_F_NO_NOTIFY is not really exposed), so that
a polled message could be indeed received instead through the usual IRQ based
callbacks if it happens to be delivered on a different core or if the
current polling core is not running such loops with IRQs off, but such
behaviour is supported and handled.

Anyway it is probably confusing mentioning the presence of the
VIRTQ_USED_F_NO_NOTIFY flag, so I'll drop that mention.

> > + *
> > + * Return: True once polling has successfully completed.
> > + */
> > +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > +			     struct scmi_xfer *xfer)
> > +{
> > +	bool pending, ret = false;
> > +	unsigned int length, any_prefetched = 0;
> > +	unsigned long flags;
> > +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +
> > +	if (!msg)
> > +		return true;
> > +
> > +	spin_lock_irqsave(&msg->poll_lock, flags);
> > +	/* Processed already by other polling loop on another CPU ? */
> > +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +		return true;
> > +	}
> > +
> > +	/* Has cmdq index moved at all ? */
> > +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> 
> In my understanding, the polling comparison could still be subject to the ABA
> problem when exactly 2**16 messages have been marked as used since
> msg->poll_idx was set (unlikely scenario, granted).
> 
> I think this would be a lot simpler if the virtio core exported some
> concurrency-safe helper function for such polling (similar to more_used() from
> virtio_ring.c), as discussed at the top.

So this is the main limitation indeed of the current implementation, I
cannot distinguish if there was an exact full wrap and I'm reading the same
last_idx as before but a whoppying 2**16 messages have instead gone through...

The tricky part seems to me here that even introducing dedicated helpers
for polling in order to account for such wrapping (similar to more_used())
those would be based by current VirtIO spec on a single bit wrap counter,
so how do you discern if 2 whole wraps have happened (even more unlikely..) ?

Maybe I'm missing something though...

I'll have a though about this, but in my opinion this seems something so
unlikely that we could live with it, for the moment at least...

> 
> > +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +	if (!pending)
> > +		return false;
> > +
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	virtqueue_disable_cb(vioch->vqueue);
> > +
> > +	/*
> > +	 * If something arrived we cannot be sure, without dequeueing, if it
> > +	 * was the reply to the xfer we are polling for, or, to other, even
> > +	 * possibly non-polling, pending xfers: process all new messages
> > +	 * till the polled-for message is found OR the vqueue is empty.
> > +	 */
> > +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> > +		next_msg->rx_len = length;
> > +		/* Is the message we were polling for ? */
> > +		if (next_msg == msg) {
> > +			ret = true;
> > +			break;
> > +		}
> > +
> > +		spin_lock(&next_msg->poll_lock);
> > +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> > +			any_prefetched++;
> > +			list_add_tail(&next_msg->list,
> > +				      &vioch->pending_cmds_list);
> > +		} else {
> > +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> > +		}
> > +		spin_unlock(&next_msg->poll_lock);
> >  	}
> > +
> > +	/*
> > +	 * When the polling loop has successfully terminated if something
> > +	 * else was queued in the meantime, it will be served by a deferred
> > +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> > +	 *
> > +	 * If we are still looking for the polled reply, the polling index has
> > +	 * to be updated to the current vqueue last used index.
> > +	 */
> > +	if (ret) {
> > +		pending = !virtqueue_enable_cb(vioch->vqueue);
> > +	} else {
> > +		spin_lock(&msg->poll_lock);
> > +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> > +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +		spin_unlock(&msg->poll_lock);
> > +	}
> > +
> > +	if (vioch->deferred_tx_wq && (any_prefetched || pending))
> > +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
> 
> What if queue_work() returns false because the single work item for the channel
> was already pending? Couldn't it then happen that the worker CPU will not yet
> see the updates implied by `any_prefetched || pending'?
> 

The short asnwer is, no this can't happen: I had an extensive discussion in the
past around similar concerns about this about SCMI notifs dispatching (which use
workqueues:
https://lore.kernel.org/linux-arm-kernel/0234c256-2e29-69f8-99ae-2599f982961e@arm.com/).

In a nutshell all bails down to how the Kenrel cmwq susbsystem works
internally.

queue_work() does fail if the same work item was already queued recently
and suck work is still marked as WORK_PENDING, BUT that flag happens to be set
only if such queued work is still fully pending in the sense that the worker
thread has still not even being waken by the Kernel cmwq subsystem.

The above flag is indeed cleared right before the related worker
thread is started, roughly:

> kernel/workqueue.c:process_one_work()
> 
> 	set_work_pool_and_clear_pending(work, pool->id);
> 	....
> 	worker->current_func(work);
> 

So queue_work() return false only if a previous work was pending AND the
worker thread was still dormient; it is not possible that the worker
thread was running and, say, had already emptied the above @pending_cmds_list
but was still marked pending since it still has to terminate (that would be
indeed a problem and you would end up with this new work items not seen
till the next worker run is kicked)

Moreover, here the work items are really contained in a spinlocked list
whose spinlock is hold here while filling up the list and in the worker
thread while emptying it, so that it is neither possible to
modify/corrupt the list while the worker is consuming it, in case you
successfully queue more work while the worker is running.

Thanks again for your feedback,
Cristian

P.S: .. from the other thread..
> ... it seems to me that is exactly how I'm using it, and moreover I
> don't see any other way via the VirtIO API to grab that last_used_idx
> that I need for virtqueu_poll.

>> I meant to say that the VIO_MSG_POLL_DONE special value should best not be put
>> into the .poll_idx (since the special value might in theory overlap with an
>> opaque value). Another variable could hold the special states VIO_MSG_POLL_DONE
>> and VIO_MSG_NOT_POLLED.

Yes indeed, I was trying to avoid to keep one more variable ignoring indeed
the opaqueness :D ... I'll find a better way.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-19 12:23         ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-19 12:23 UTC (permalink / raw)
  To: Peter Hilber
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, igor.skalkin, Michael S. Tsirkin,
	virtualization

On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> On 21.12.21 15:00, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> > 
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> > 
> 
> Hi Cristian,
> 
> thanks for the update. I have some more remarks inline below.
> 

Hi Peter,

thanks for your review, much appreciated, please see my replies online.

> My impression is that the virtio core does not expose helper functions suitable
> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> more_used() from virtio_ring.c could be exposed via a wrapper?
> 

While I definitely agree that the virtio core support for polling is far from
ideal, some support is provided and my point was at first to try implement SCMI
virtio polling leveraging what we have now in the core and see if it was attainable
(indeed I tried early in this series to avoid as a whole to have to support polling
at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
attempt that led nowhere good...)

Btw, I was planning to post a new series next week (after merge-windows) with some
fixes I did already, at this point I'll include also some fixes derived
from some of your remarks.

> Best regards,
> 
> Peter
> 
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> > Cc: Peter Hilber <peter.hilber@opensynergy.com>
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v8 --> v9
> > - check for deferred_wq existence before queueing work to avoid
> >   race at driver removal time
> > ---
> >  drivers/firmware/arm_scmi/Kconfig  |  15 ++
> >  drivers/firmware/arm_scmi/virtio.c | 298 ++++++++++++++++++++++++++---
> >  2 files changed, 287 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> >  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> >  	  take care of the needed conversions, say N.
> >  
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +	bool "Enable atomic mode for SCMI VirtIO transport"
> > +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> > +	help
> > +	  Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > +	  If you want the SCMI VirtIO based transport to operate in atomic
> > +	  mode, avoiding any kind of sleeping behaviour for selected
> > +	  transactions on the TX path, answer Y.
> > +
> > +	  Enabling atomic mode operations allows any SCMI driver using this
> > +	  transport to optionally ask for atomic SCMI transactions and operate
> > +	  in atomic context too, at the price of using a number of busy-waiting
> > +	  primitives all over instead. If unsure say N.
> > +
> >  endif #ARM_SCMI_PROTOCOL
> >  
> >  config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..e54d14971c07 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -3,8 +3,8 @@
> >   * Virtio Transport driver for Arm System Control and Management Interface
> >   * (SCMI).
> >   *
> > - * Copyright (C) 2020-2021 OpenSynergy.
> > - * Copyright (C) 2021 ARM Ltd.
> > + * Copyright (C) 2020-2022 OpenSynergy.
> > + * Copyright (C) 2021-2022 ARM Ltd.
> >   */
> >  
> >  /**
> > @@ -38,6 +38,9 @@
> >   * @vqueue: Associated virtqueue
> >   * @cinfo: SCMI Tx or Rx channel
> >   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @deferred_tx_work: Worker for TX deferred replies processing
> > + * @deferred_tx_wq: Workqueue for TX deferred replies
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> >   * @is_rx: Whether channel is an Rx channel
> >   * @ready: Whether transport user is ready to hear about channel
> >   * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +52,9 @@ struct scmi_vio_channel {
> >  	struct virtqueue *vqueue;
> >  	struct scmi_chan_info *cinfo;
> >  	struct list_head free_list;
> > +	struct list_head pending_cmds_list;
> > +	struct work_struct deferred_tx_work;
> > +	struct workqueue_struct *deferred_tx_wq;
> >  	bool is_rx;
> >  	bool ready;
> >  	unsigned int max_msg;
> > @@ -65,12 +71,22 @@ struct scmi_vio_channel {
> >   * @input: SDU used for (delayed) responses and notifications
> >   * @list: List which scmi_vio_msg may be part of
> >   * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + *	      transaction reply was configured for polling.
> > + *	      Note that since virtqueue used index is an unsigned 16-bit we can
> > + *	      use some out-of-scale values to signify particular conditions.
> > + * @poll_lock: Protect access to @poll_idx.
> >   */
> >  struct scmi_vio_msg {
> >  	struct scmi_msg_payld *request;
> >  	struct scmi_msg_payld *input;
> >  	struct list_head list;
> >  	unsigned int rx_len;
> > +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> > +#define VIO_MSG_POLL_DONE	0xffffffffUL
> > +	unsigned int poll_idx;
> > +	/* lock to protect access to poll_idx. */
> > +	spinlock_t poll_lock;
> >  };
> >  
> >  /* Only one SCMI VirtIO device can possibly exist */
> > @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
> >  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> >  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> >  			       struct scmi_vio_msg *msg,
> >  			       struct device *dev)
> >  {
> >  	struct scatterlist sg_in;
> >  	int rc;
> > -	unsigned long flags;
> >  
> >  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
> >  
> > -	spin_lock_irqsave(&vioch->lock, flags);
> > -
> >  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
> >  	if (rc)
> >  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
> >  	else
> >  		virtqueue_kick(vioch->vqueue);
> >  
> > -	spin_unlock_irqrestore(&vioch->lock, flags);
> > -
> >  	return rc;
> >  }
> >  
> > +/* Expect to be called with vioch->lock acquired by the caller and IRQs off */
> > +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> > +				       struct scmi_vio_msg *msg)
> > +{
> > +	spin_lock(&msg->poll_lock);
> > +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> > +	spin_unlock(&msg->poll_lock);
> > +
> > +	list_add(&msg->list, &vioch->free_list);
> > +}
> > +
> >  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> >  				  struct scmi_vio_msg *msg)
> >  {
> > -	if (vioch->is_rx) {
> > +	if (vioch->is_rx)
> >  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> > -	} else {
> > -		/* Here IRQs are assumed to be already disabled by the caller */
> > -		spin_lock(&vioch->lock);
> > -		list_add(&msg->list, &vioch->free_list);
> > -		spin_unlock(&vioch->lock);
> > -	}
> > +	else
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> >  }
> >  
> >  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> > @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			virtqueue_disable_cb(vqueue);
> >  			cb_enabled = false;
> >  		}
> > +
> >  		msg = virtqueue_get_buf(vqueue, &length);
> >  		if (!msg) {
> >  			if (virtqueue_enable_cb(vqueue))
> > @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  			scmi_rx_callback(vioch->cinfo,
> >  					 msg_read_header(msg->input), msg);
> >  
> > +			spin_lock(&vioch->lock);
> >  			scmi_finalize_message(vioch, msg);
> > +			spin_unlock(&vioch->lock);
> >  		}
> >  
> >  		/*
> > @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> >  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
> >  }
> >  
> > +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch;
> > +	struct scmi_vio_msg *msg, *tmp;
> > +
> > +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> > +
> > +	/* Process pre-fetched messages */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +
> > +	/* Scan the list of possibly pre-fetched messages during polling. */
> > +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> > +		list_del(&msg->list);
> > +
> > +		scmi_rx_callback(vioch->cinfo,
> > +				 msg_read_header(msg->input), msg);
> > +
> > +		/* Free the processed message once done */
> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	}
> > +
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	/* Process possibly still pending messages */
> > +	scmi_vio_complete_cb(vioch->vqueue);
> > +}
> > +
> >  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
> >  
> >  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> > @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  
> >  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
> >  
> > +	/* Setup a deferred worker for polling. */
> > +	if (tx && !vioch->deferred_tx_wq) {
> > +		vioch->deferred_tx_wq =
> > +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> > +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> > +					0);
> > +		if (!vioch->deferred_tx_wq)
> > +			return -ENOMEM;
> > +
> > +		INIT_WORK(&vioch->deferred_tx_work,
> > +			  scmi_vio_deferred_tx_worker);
> > +	}
> > +
> >  	for (i = 0; i < vioch->max_msg; i++) {
> >  		struct scmi_vio_msg *msg;
> >  
> > @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  						    GFP_KERNEL);
> >  			if (!msg->request)
> >  				return -ENOMEM;
> > +			spin_lock_init(&msg->poll_lock);
> >  		}
> >  
> >  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> > @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> >  		if (!msg->input)
> >  			return -ENOMEM;
> >  
> > -		if (tx) {
> > -			spin_lock_irqsave(&vioch->lock, flags);
> > -			list_add_tail(&msg->list, &vioch->free_list);
> > -			spin_unlock_irqrestore(&vioch->lock, flags);
> > -		} else {
> > +		spin_lock_irqsave(&vioch->lock, flags);
> > +		if (tx)
> > +			scmi_vio_feed_vq_tx(vioch, msg);
> > +		else
> >  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> > -		}
> > +		spin_unlock_irqrestore(&vioch->lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -291,11 +354,22 @@ static int virtio_chan_free(int id, void *p, void *data)
> >  	unsigned long flags;
> >  	struct scmi_chan_info *cinfo = p;
> >  	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	void *deferred_wq = NULL;
> >  
> >  	spin_lock_irqsave(&vioch->ready_lock, flags);
> >  	vioch->ready = false;
> >  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
> >  
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> > +		deferred_wq = vioch->deferred_tx_wq;
> > +		vioch->deferred_tx_wq = NULL;
> > +	}
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +
> > +	if (deferred_wq)
> > +		destroy_workqueue(deferred_wq);
> > +
> >  	scmi_free_channel(cinfo, data, id);
> >  
> >  	spin_lock_irqsave(&vioch->lock, flags);
> > @@ -324,7 +398,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  	}
> >  
> >  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> > -	list_del(&msg->list);
> > +	/* Re-init element so we can discern anytime if it is still in-flight */
> > +	list_del_init(&msg->list);
> >  
> >  	msg_tx_prepare(msg->request, xfer);
> >  
> > @@ -337,6 +412,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> >  		dev_err(vioch->cinfo->dev,
> >  			"failed to add to TX virtqueue (%d)\n", rc);
> >  	} else {
> > +		/*
> > +		 * If polling was requested for this transaction:
> > +		 *  - retrieve last used index (will be used as polling reference)
> 
> The virtqueue_add_sgs() called before already exposed the message to the virtio
> device, so a fast response might have arrived already, so the reference value
> might unexpectedly correspond to the current message.
> 

Good point I have wrongly considered the kick as the only trigger but if
indeed the device was already kicked and processing the queues the scenario
you depicted is possible. I'll move this logic earlier.

> > +		 *  - bind the polled message to the xfer via .priv
> > +		 */
> > +		if (xfer->hdr.poll_completion) {
> > +			spin_lock(&msg->poll_lock);
> > +			msg->poll_idx =
> > +				virtqueue_enable_cb_prepare(vioch->vqueue);
> > +			spin_unlock(&msg->poll_lock);
> > +			/* Ensure initialized msg is visibly bound to xfer */
> > +			smp_store_mb(xfer->priv, msg);
> > +		}
> >  		virtqueue_kick(vioch->vqueue);
> >  	}
> >  
> > @@ -350,10 +438,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> > -		xfer->priv = NULL;
> > -	}
> >  }
> >  
> >  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > @@ -361,10 +447,166 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> >  {
> >  	struct scmi_vio_msg *msg = xfer->priv;
> >  
> > -	if (msg) {
> > +	if (msg)
> >  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> > -		xfer->priv = NULL;
> > +}
> > +
> > +/**
> > + * virtio_mark_txdone  - Mark transmission done
> > + *
> > + * Free only successfully completed polling transfer messages.
> > + *
> > + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> > + * messages by forcibly re-adding them to the free-list, even on timeout, inside
> > + * the TX code path; we instead let IRQ/RX callbacks eventually clean up such
> > + * messages once, finally, a late reply is received and discarded (if ever).
> > + *
> > + * This approach was deemed preferable since those pending timed-out buffers are
> > + * still effectively owned by the SCMI platform VirtIO device even after timeout
> > + * expiration: forcibly freeing and reusing them before they had been returned
> > + * explicitly by the SCMI platform could lead to subtle bugs due to message
> > + * corruption.
> > + * An SCMI platform VirtIO device which never returns message buffers is
> > + * anyway broken and it will quickly lead to exhaustion of available messages.
> > + *
> > + * For this same reason, here, we take care to free only the successfully
> > + * completed polled messages, since they won't be freed elsewhere; late replies
> > + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> > + *
> > + * @cinfo: SCMI channel info
> > + * @ret: Transmission return code
> > + * @xfer: Transfer descriptor
> > + */
> > +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long flags;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +	struct scmi_vio_msg *msg = xfer->priv;
> > +
> > +	if (!msg)
> > +		return;
> > +
> > +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> > +	smp_store_mb(xfer->priv, NULL);
> > +
> > +	/* Is a successfully completed polled message still to be finalized ? */
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> 
> Could this not also be a message which was already finalized and then
> immediately reused for another polling xfer?
> 

Fist of all, in the upcoming series there is already a fix here, because I do
not want (as in non-polling mode and as stated in the comment above) to forcibly
release -ETIMEOUT xfer here. New check being like:

	if (ret != -ETIMEDOUT && xfer->hdr.poll_completion && list_empty(&msg->list))

BUT, if I understood correctly what you meant you are talking about a polled
message which had been received by chance on the IRQ path and there finalized
already, and so subsequently happily picked up from the free list for another
polled transfer....I think you are right, even if remote, this is a possibility
to account for (and the list_empty() check fails at this): I'll take care to clear
xfer->priv after a command is fetched BUT only if received across the IRQ path,
and drop the list_empty() check above which at this point is redundant: in this
way mark_tx_done will be skip even for polled cmds that have already been
finalized on the IRQ path. (notifications/delayed responses are unaffected
since not handled on this path...)

Thanks for spotting this.

> > +		scmi_vio_feed_vq_tx(vioch, msg);
> > +	spin_unlock_irqrestore(&vioch->lock, flags);
> > +}
> > +
> > +/**
> > + * virtio_poll_done  - Provide polling support for VirtIO transport
> > + *
> > + * @cinfo: SCMI channel info
> > + * @xfer: Reference to the transfer being poll for.
> > + *
> > + * VirtIO core provides a polling mechanism based only on last used indexes:
> > + * this means that it is possible to poll the virtqueues waiting for something
> > + * new to arrive from the host side but the only way to check if the freshly
> > + * arrived buffer was what we were waiting for is to compare the newly arrived
> > + * message descriptors with the one we are polling on.
> > + *
> > + * As a consequence it can happen to dequeue something different from the buffer
> > + * we were poll-waiting for: if that is the case such early fetched buffers are
> > + * then added to a the @pending_cmds_list list for later processing by a
> > + * dedicated deferred worker.
> > + *
> > + * So, basically, once something new is spotted we proceed to de-queue all the
> > + * freshly received used buffers until we found the one we were polling on, or,
> > + * we have 'seemingly' emptied the virtqueue; if some buffers are still pending
> > + * in the vqueue at the end of the polling loop (possible due to inherent races
> > + * in virtqueues handling mechanisms), we similarly kick the deferred worker
> > + * and let it process those, to avoid indefinitely looping in the .poll_done
> > + * helper.
> > + *
> > + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> > + * when polling since such flag is per-virtqueues and we do not want to
> > + * suppress notifications as a whole: so, if the message we are polling for is
> > + * delivered via usual IRQs callbacks, on another core which are IRQs-on, it
> > + * will be handled as such by scmi_rx_callback() and the polling loop in the
> > + * SCMI Core TX path will be transparently terminated anyway.
> 
> Why is VIRTQ_USED_F_NO_NOTIFY mentioned? It is a flag associated with only one
> of two buffer notification suppression methods, which are transparently handled
> by the virtio core. Cf. `Driver Requirements: Used Buffer Notification
> Suppression' in the virtio spec.
> 

I saw that paragraph on the spec, what I wanted to stress was that we have no
way to implement a per-message notification suppression (and probably neither a
per-channel one since VIRTQ_USED_F_NO_NOTIFY is not really exposed), so that
a polled message could be indeed received instead through the usual IRQ based
callbacks if it happens to be delivered on a different core or if the
current polling core is not running such loops with IRQs off, but such
behaviour is supported and handled.

Anyway it is probably confusing mentioning the presence of the
VIRTQ_USED_F_NO_NOTIFY flag, so I'll drop that mention.

> > + *
> > + * Return: True once polling has successfully completed.
> > + */
> > +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > +			     struct scmi_xfer *xfer)
> > +{
> > +	bool pending, ret = false;
> > +	unsigned int length, any_prefetched = 0;
> > +	unsigned long flags;
> > +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +
> > +	if (!msg)
> > +		return true;
> > +
> > +	spin_lock_irqsave(&msg->poll_lock, flags);
> > +	/* Processed already by other polling loop on another CPU ? */
> > +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +		return true;
> > +	}
> > +
> > +	/* Has cmdq index moved at all ? */
> > +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> 
> In my understanding, the polling comparison could still be subject to the ABA
> problem when exactly 2**16 messages have been marked as used since
> msg->poll_idx was set (unlikely scenario, granted).
> 
> I think this would be a lot simpler if the virtio core exported some
> concurrency-safe helper function for such polling (similar to more_used() from
> virtio_ring.c), as discussed at the top.

So this is the main limitation indeed of the current implementation, I
cannot distinguish if there was an exact full wrap and I'm reading the same
last_idx as before but a whoppying 2**16 messages have instead gone through...

The tricky part seems to me here that even introducing dedicated helpers
for polling in order to account for such wrapping (similar to more_used())
those would be based by current VirtIO spec on a single bit wrap counter,
so how do you discern if 2 whole wraps have happened (even more unlikely..) ?

Maybe I'm missing something though...

I'll have a though about this, but in my opinion this seems something so
unlikely that we could live with it, for the moment at least...

> 
> > +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +	if (!pending)
> > +		return false;
> > +
> > +	spin_lock_irqsave(&vioch->lock, flags);
> > +	virtqueue_disable_cb(vioch->vqueue);
> > +
> > +	/*
> > +	 * If something arrived we cannot be sure, without dequeueing, if it
> > +	 * was the reply to the xfer we are polling for, or, to other, even
> > +	 * possibly non-polling, pending xfers: process all new messages
> > +	 * till the polled-for message is found OR the vqueue is empty.
> > +	 */
> > +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> > +		next_msg->rx_len = length;
> > +		/* Is the message we were polling for ? */
> > +		if (next_msg == msg) {
> > +			ret = true;
> > +			break;
> > +		}
> > +
> > +		spin_lock(&next_msg->poll_lock);
> > +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> > +			any_prefetched++;
> > +			list_add_tail(&next_msg->list,
> > +				      &vioch->pending_cmds_list);
> > +		} else {
> > +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> > +		}
> > +		spin_unlock(&next_msg->poll_lock);
> >  	}
> > +
> > +	/*
> > +	 * When the polling loop has successfully terminated if something
> > +	 * else was queued in the meantime, it will be served by a deferred
> > +	 * worker OR by the normal IRQ/callback OR by other poll loops.
> > +	 *
> > +	 * If we are still looking for the polled reply, the polling index has
> > +	 * to be updated to the current vqueue last used index.
> > +	 */
> > +	if (ret) {
> > +		pending = !virtqueue_enable_cb(vioch->vqueue);
> > +	} else {
> > +		spin_lock(&msg->poll_lock);
> > +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> > +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > +		spin_unlock(&msg->poll_lock);
> > +	}
> > +
> > +	if (vioch->deferred_tx_wq && (any_prefetched || pending))
> > +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);
> 
> What if queue_work() returns false because the single work item for the channel
> was already pending? Couldn't it then happen that the worker CPU will not yet
> see the updates implied by `any_prefetched || pending'?
> 

The short asnwer is, no this can't happen: I had an extensive discussion in the
past around similar concerns about this about SCMI notifs dispatching (which use
workqueues:
https://lore.kernel.org/linux-arm-kernel/0234c256-2e29-69f8-99ae-2599f982961e@arm.com/).

In a nutshell all bails down to how the Kenrel cmwq susbsystem works
internally.

queue_work() does fail if the same work item was already queued recently
and suck work is still marked as WORK_PENDING, BUT that flag happens to be set
only if such queued work is still fully pending in the sense that the worker
thread has still not even being waken by the Kernel cmwq subsystem.

The above flag is indeed cleared right before the related worker
thread is started, roughly:

> kernel/workqueue.c:process_one_work()
> 
> 	set_work_pool_and_clear_pending(work, pool->id);
> 	....
> 	worker->current_func(work);
> 

So queue_work() return false only if a previous work was pending AND the
worker thread was still dormient; it is not possible that the worker
thread was running and, say, had already emptied the above @pending_cmds_list
but was still marked pending since it still has to terminate (that would be
indeed a problem and you would end up with this new work items not seen
till the next worker run is kicked)

Moreover, here the work items are really contained in a spinlocked list
whose spinlock is hold here while filling up the list and in the worker
thread while emptying it, so that it is neither possible to
modify/corrupt the list while the worker is consuming it, in case you
successfully queue more work while the worker is running.

Thanks again for your feedback,
Cristian

P.S: .. from the other thread..
> ... it seems to me that is exactly how I'm using it, and moreover I
> don't see any other way via the VirtIO API to grab that last_used_idx
> that I need for virtqueu_poll.

>> I meant to say that the VIO_MSG_POLL_DONE special value should best not be put
>> into the .poll_idx (since the special value might in theory overlap with an
>> opaque value). Another variable could hold the special states VIO_MSG_POLL_DONE
>> and VIO_MSG_NOT_POLLED.

Yes indeed, I was trying to avoid to keep one more variable ignoring indeed
the opaqueness :D ... I'll find a better way.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-19 12:23         ` Cristian Marussi
@ 2022-01-20 19:09           ` Peter Hilber
  -1 siblings, 0 replies; 61+ messages in thread
From: Peter Hilber @ 2022-01-20 19:09 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, igor.skalkin, Michael S. Tsirkin,
	virtualization

On 19.01.22 13:23, Cristian Marussi wrote:
> On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
>> On 21.12.21 15:00, Cristian Marussi wrote:
>>> Add support for .mark_txdone and .poll_done transport operations to SCMI
>>> VirtIO transport as pre-requisites to enable atomic operations.
>>>
>>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
>>> and atomic mode for selected SCMI transactions while leaving it default
>>> disabled.
>>>
>>
>> Hi Cristian,
>>
>> thanks for the update. I have some more remarks inline below.
>>
> 
> Hi Peter,
> 
> thanks for your review, much appreciated, please see my replies online.
> 
>> My impression is that the virtio core does not expose helper functions suitable
>> to busy-poll for used buffers. But changing this might not be difficult. Maybe
>> more_used() from virtio_ring.c could be exposed via a wrapper?
>>
> 
> While I definitely agree that the virtio core support for polling is far from
> ideal, some support is provided and my point was at first to try implement SCMI
> virtio polling leveraging what we have now in the core and see if it was attainable
> (indeed I tried early in this series to avoid as a whole to have to support polling
> at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> attempt that led nowhere good...)
> 
> Btw, I was planning to post a new series next week (after merge-windows) with some
> fixes I did already, at this point I'll include also some fixes derived
> from some of your remarks.
> 
>> Best regards,
>>
>> Peter
>>
[snip]>>> + *
>>> + * Return: True once polling has successfully completed.
>>> + */
>>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
>>> +			     struct scmi_xfer *xfer)
>>> +{
>>> +	bool pending, ret = false;
>>> +	unsigned int length, any_prefetched = 0;
>>> +	unsigned long flags;
>>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
>>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
>>> +
>>> +	if (!msg)
>>> +		return true;
>>> +
>>> +	spin_lock_irqsave(&msg->poll_lock, flags);
>>> +	/* Processed already by other polling loop on another CPU ? */
>>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
>>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
>>> +		return true;
>>> +	}
>>> +
>>> +	/* Has cmdq index moved at all ? */
>>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
>>
>> In my understanding, the polling comparison could still be subject to the ABA
>> problem when exactly 2**16 messages have been marked as used since
>> msg->poll_idx was set (unlikely scenario, granted).
>>
>> I think this would be a lot simpler if the virtio core exported some
>> concurrency-safe helper function for such polling (similar to more_used() from
>> virtio_ring.c), as discussed at the top.
> 
> So this is the main limitation indeed of the current implementation, I
> cannot distinguish if there was an exact full wrap and I'm reading the same
> last_idx as before but a whoppying 2**16 messages have instead gone through...
> 
> The tricky part seems to me here that even introducing dedicated helpers
> for polling in order to account for such wrapping (similar to more_used())
> those would be based by current VirtIO spec on a single bit wrap counter,
> so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> 
> Maybe I'm missing something though...
> 

In my understanding, there is no need to keep track of the old state. We
actually only want to check whether the device has marked any buffers as `used'
which we did not retrieve yet via virtqueue_get_buf_ctx().

This is what more_used() checks in my understanding. One would just need to
translate the external `struct virtqueue' param to the virtio_ring.c internal
representation `struct vring_virtqueue' and then call `more_used()'.

There would be no need to keep `poll_idx` then.

Best regards,

Peter


> I'll have a though about this, but in my opinion this seems something so
> unlikely that we could live with it, for the moment at least...
> [snip]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-20 19:09           ` Peter Hilber
  0 siblings, 0 replies; 61+ messages in thread
From: Peter Hilber @ 2022-01-20 19:09 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-kernel, linux-arm-kernel, sudeep.holla, james.quinlan,
	Jonathan.Cameron, f.fainelli, etienne.carriere, vincent.guittot,
	souvik.chakravarty, igor.skalkin, Michael S. Tsirkin,
	virtualization

On 19.01.22 13:23, Cristian Marussi wrote:
> On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
>> On 21.12.21 15:00, Cristian Marussi wrote:
>>> Add support for .mark_txdone and .poll_done transport operations to SCMI
>>> VirtIO transport as pre-requisites to enable atomic operations.
>>>
>>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
>>> and atomic mode for selected SCMI transactions while leaving it default
>>> disabled.
>>>
>>
>> Hi Cristian,
>>
>> thanks for the update. I have some more remarks inline below.
>>
> 
> Hi Peter,
> 
> thanks for your review, much appreciated, please see my replies online.
> 
>> My impression is that the virtio core does not expose helper functions suitable
>> to busy-poll for used buffers. But changing this might not be difficult. Maybe
>> more_used() from virtio_ring.c could be exposed via a wrapper?
>>
> 
> While I definitely agree that the virtio core support for polling is far from
> ideal, some support is provided and my point was at first to try implement SCMI
> virtio polling leveraging what we have now in the core and see if it was attainable
> (indeed I tried early in this series to avoid as a whole to have to support polling
> at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> attempt that led nowhere good...)
> 
> Btw, I was planning to post a new series next week (after merge-windows) with some
> fixes I did already, at this point I'll include also some fixes derived
> from some of your remarks.
> 
>> Best regards,
>>
>> Peter
>>
[snip]>>> + *
>>> + * Return: True once polling has successfully completed.
>>> + */
>>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
>>> +			     struct scmi_xfer *xfer)
>>> +{
>>> +	bool pending, ret = false;
>>> +	unsigned int length, any_prefetched = 0;
>>> +	unsigned long flags;
>>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
>>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
>>> +
>>> +	if (!msg)
>>> +		return true;
>>> +
>>> +	spin_lock_irqsave(&msg->poll_lock, flags);
>>> +	/* Processed already by other polling loop on another CPU ? */
>>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
>>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
>>> +		return true;
>>> +	}
>>> +
>>> +	/* Has cmdq index moved at all ? */
>>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
>>
>> In my understanding, the polling comparison could still be subject to the ABA
>> problem when exactly 2**16 messages have been marked as used since
>> msg->poll_idx was set (unlikely scenario, granted).
>>
>> I think this would be a lot simpler if the virtio core exported some
>> concurrency-safe helper function for such polling (similar to more_used() from
>> virtio_ring.c), as discussed at the top.
> 
> So this is the main limitation indeed of the current implementation, I
> cannot distinguish if there was an exact full wrap and I'm reading the same
> last_idx as before but a whoppying 2**16 messages have instead gone through...
> 
> The tricky part seems to me here that even introducing dedicated helpers
> for polling in order to account for such wrapping (similar to more_used())
> those would be based by current VirtIO spec on a single bit wrap counter,
> so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> 
> Maybe I'm missing something though...
> 

In my understanding, there is no need to keep track of the old state. We
actually only want to check whether the device has marked any buffers as `used'
which we did not retrieve yet via virtqueue_get_buf_ctx().

This is what more_used() checks in my understanding. One would just need to
translate the external `struct virtqueue' param to the virtio_ring.c internal
representation `struct vring_virtqueue' and then call `more_used()'.

There would be no need to keep `poll_idx` then.

Best regards,

Peter


> I'll have a though about this, but in my opinion this seems something so
> unlikely that we could live with it, for the moment at least...
> [snip]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-20 19:09           ` Peter Hilber
  (?)
@ 2022-01-20 20:39             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-20 20:39 UTC (permalink / raw)
  To: Peter Hilber
  Cc: Cristian Marussi, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> On 19.01.22 13:23, Cristian Marussi wrote:
> > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> >> On 21.12.21 15:00, Cristian Marussi wrote:
> >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> >>> VirtIO transport as pre-requisites to enable atomic operations.
> >>>
> >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> >>> and atomic mode for selected SCMI transactions while leaving it default
> >>> disabled.
> >>>
> >>
> >> Hi Cristian,
> >>
> >> thanks for the update. I have some more remarks inline below.
> >>
> > 
> > Hi Peter,
> > 
> > thanks for your review, much appreciated, please see my replies online.
> > 
> >> My impression is that the virtio core does not expose helper functions suitable
> >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> >> more_used() from virtio_ring.c could be exposed via a wrapper?
> >>
> > 
> > While I definitely agree that the virtio core support for polling is far from
> > ideal, some support is provided and my point was at first to try implement SCMI
> > virtio polling leveraging what we have now in the core and see if it was attainable
> > (indeed I tried early in this series to avoid as a whole to have to support polling
> > at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> > attempt that led nowhere good...)
> > 
> > Btw, I was planning to post a new series next week (after merge-windows) with some
> > fixes I did already, at this point I'll include also some fixes derived
> > from some of your remarks.
> > 
> >> Best regards,
> >>
> >> Peter
> >>
> [snip]>>> + *
> >>> + * Return: True once polling has successfully completed.
> >>> + */
> >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> >>> +			     struct scmi_xfer *xfer)
> >>> +{
> >>> +	bool pending, ret = false;
> >>> +	unsigned int length, any_prefetched = 0;
> >>> +	unsigned long flags;
> >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> >>> +
> >>> +	if (!msg)
> >>> +		return true;
> >>> +
> >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> >>> +	/* Processed already by other polling loop on another CPU ? */
> >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> >>> +		return true;
> >>> +	}
> >>> +
> >>> +	/* Has cmdq index moved at all ? */
> >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> >>
> >> In my understanding, the polling comparison could still be subject to the ABA
> >> problem when exactly 2**16 messages have been marked as used since
> >> msg->poll_idx was set (unlikely scenario, granted).
> >>
> >> I think this would be a lot simpler if the virtio core exported some
> >> concurrency-safe helper function for such polling (similar to more_used() from
> >> virtio_ring.c), as discussed at the top.
> > 
> > So this is the main limitation indeed of the current implementation, I
> > cannot distinguish if there was an exact full wrap and I'm reading the same
> > last_idx as before but a whoppying 2**16 messages have instead gone through...
> > 
> > The tricky part seems to me here that even introducing dedicated helpers
> > for polling in order to account for such wrapping (similar to more_used())
> > those would be based by current VirtIO spec on a single bit wrap counter,
> > so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> > 
> > Maybe I'm missing something though...
> > 
> 
> In my understanding, there is no need to keep track of the old state. We
> actually only want to check whether the device has marked any buffers as `used'
> which we did not retrieve yet via virtqueue_get_buf_ctx().
> 
> This is what more_used() checks in my understanding. One would just need to
> translate the external `struct virtqueue' param to the virtio_ring.c internal
> representation `struct vring_virtqueue' and then call `more_used()'.
> 
> There would be no need to keep `poll_idx` then.
> 
> Best regards,
> 
> Peter

Not really, I don't think so.

There's no magic in more_used. No synchronization happens.
more_used is exactly like virtqueue_poll except
you get to maintain your own index.

As it is, it is quite possible to read the cached index,
then another thread makes 2^16 bufs available, then device
uses them all, and presto you get a false positive.

I guess we can play with memory barriers such that cache
read happens after the index read - but it seems that
will just lead to the same wrap around problem
in reverse. So IIUC it's quite a bit more involved than
just translating structures.

And yes, a more_used like API would remove the need to pass
the index around, but it will also obscure the fact that
there's internal state here and that it's inherently racy
wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
seems to have made it clear enough that people get it.


It's not hard to handle wrap around in the driver if you like though:
just have a 32 bit atomic counter and increment it each time you are
going to make 2^16 buffers available. That gets you to 2^48 with an
overhead of an atomic read and that should be enough short term. Make
sure the cache line where you put the counter is not needed elsewhere -
checking it in a tight loop with an atomic will force it to the local
CPU. And if you are doing that virtqueue_poll will be enough.



> 
> > I'll have a though about this, but in my opinion this seems something so
> > unlikely that we could live with it, for the moment at least...
> > [snip]


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-20 20:39             ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-20 20:39 UTC (permalink / raw)
  To: Peter Hilber
  Cc: f.fainelli, vincent.guittot, igor.skalkin, sudeep.holla,
	linux-kernel, virtualization, Cristian Marussi, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> On 19.01.22 13:23, Cristian Marussi wrote:
> > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> >> On 21.12.21 15:00, Cristian Marussi wrote:
> >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> >>> VirtIO transport as pre-requisites to enable atomic operations.
> >>>
> >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> >>> and atomic mode for selected SCMI transactions while leaving it default
> >>> disabled.
> >>>
> >>
> >> Hi Cristian,
> >>
> >> thanks for the update. I have some more remarks inline below.
> >>
> > 
> > Hi Peter,
> > 
> > thanks for your review, much appreciated, please see my replies online.
> > 
> >> My impression is that the virtio core does not expose helper functions suitable
> >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> >> more_used() from virtio_ring.c could be exposed via a wrapper?
> >>
> > 
> > While I definitely agree that the virtio core support for polling is far from
> > ideal, some support is provided and my point was at first to try implement SCMI
> > virtio polling leveraging what we have now in the core and see if it was attainable
> > (indeed I tried early in this series to avoid as a whole to have to support polling
> > at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> > attempt that led nowhere good...)
> > 
> > Btw, I was planning to post a new series next week (after merge-windows) with some
> > fixes I did already, at this point I'll include also some fixes derived
> > from some of your remarks.
> > 
> >> Best regards,
> >>
> >> Peter
> >>
> [snip]>>> + *
> >>> + * Return: True once polling has successfully completed.
> >>> + */
> >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> >>> +			     struct scmi_xfer *xfer)
> >>> +{
> >>> +	bool pending, ret = false;
> >>> +	unsigned int length, any_prefetched = 0;
> >>> +	unsigned long flags;
> >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> >>> +
> >>> +	if (!msg)
> >>> +		return true;
> >>> +
> >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> >>> +	/* Processed already by other polling loop on another CPU ? */
> >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> >>> +		return true;
> >>> +	}
> >>> +
> >>> +	/* Has cmdq index moved at all ? */
> >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> >>
> >> In my understanding, the polling comparison could still be subject to the ABA
> >> problem when exactly 2**16 messages have been marked as used since
> >> msg->poll_idx was set (unlikely scenario, granted).
> >>
> >> I think this would be a lot simpler if the virtio core exported some
> >> concurrency-safe helper function for such polling (similar to more_used() from
> >> virtio_ring.c), as discussed at the top.
> > 
> > So this is the main limitation indeed of the current implementation, I
> > cannot distinguish if there was an exact full wrap and I'm reading the same
> > last_idx as before but a whoppying 2**16 messages have instead gone through...
> > 
> > The tricky part seems to me here that even introducing dedicated helpers
> > for polling in order to account for such wrapping (similar to more_used())
> > those would be based by current VirtIO spec on a single bit wrap counter,
> > so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> > 
> > Maybe I'm missing something though...
> > 
> 
> In my understanding, there is no need to keep track of the old state. We
> actually only want to check whether the device has marked any buffers as `used'
> which we did not retrieve yet via virtqueue_get_buf_ctx().
> 
> This is what more_used() checks in my understanding. One would just need to
> translate the external `struct virtqueue' param to the virtio_ring.c internal
> representation `struct vring_virtqueue' and then call `more_used()'.
> 
> There would be no need to keep `poll_idx` then.
> 
> Best regards,
> 
> Peter

Not really, I don't think so.

There's no magic in more_used. No synchronization happens.
more_used is exactly like virtqueue_poll except
you get to maintain your own index.

As it is, it is quite possible to read the cached index,
then another thread makes 2^16 bufs available, then device
uses them all, and presto you get a false positive.

I guess we can play with memory barriers such that cache
read happens after the index read - but it seems that
will just lead to the same wrap around problem
in reverse. So IIUC it's quite a bit more involved than
just translating structures.

And yes, a more_used like API would remove the need to pass
the index around, but it will also obscure the fact that
there's internal state here and that it's inherently racy
wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
seems to have made it clear enough that people get it.


It's not hard to handle wrap around in the driver if you like though:
just have a 32 bit atomic counter and increment it each time you are
going to make 2^16 buffers available. That gets you to 2^48 with an
overhead of an atomic read and that should be enough short term. Make
sure the cache line where you put the counter is not needed elsewhere -
checking it in a tight loop with an atomic will force it to the local
CPU. And if you are doing that virtqueue_poll will be enough.



> 
> > I'll have a though about this, but in my opinion this seems something so
> > unlikely that we could live with it, for the moment at least...
> > [snip]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-20 20:39             ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-20 20:39 UTC (permalink / raw)
  To: Peter Hilber
  Cc: Cristian Marussi, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> On 19.01.22 13:23, Cristian Marussi wrote:
> > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> >> On 21.12.21 15:00, Cristian Marussi wrote:
> >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> >>> VirtIO transport as pre-requisites to enable atomic operations.
> >>>
> >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> >>> and atomic mode for selected SCMI transactions while leaving it default
> >>> disabled.
> >>>
> >>
> >> Hi Cristian,
> >>
> >> thanks for the update. I have some more remarks inline below.
> >>
> > 
> > Hi Peter,
> > 
> > thanks for your review, much appreciated, please see my replies online.
> > 
> >> My impression is that the virtio core does not expose helper functions suitable
> >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> >> more_used() from virtio_ring.c could be exposed via a wrapper?
> >>
> > 
> > While I definitely agree that the virtio core support for polling is far from
> > ideal, some support is provided and my point was at first to try implement SCMI
> > virtio polling leveraging what we have now in the core and see if it was attainable
> > (indeed I tried early in this series to avoid as a whole to have to support polling
> > at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> > attempt that led nowhere good...)
> > 
> > Btw, I was planning to post a new series next week (after merge-windows) with some
> > fixes I did already, at this point I'll include also some fixes derived
> > from some of your remarks.
> > 
> >> Best regards,
> >>
> >> Peter
> >>
> [snip]>>> + *
> >>> + * Return: True once polling has successfully completed.
> >>> + */
> >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> >>> +			     struct scmi_xfer *xfer)
> >>> +{
> >>> +	bool pending, ret = false;
> >>> +	unsigned int length, any_prefetched = 0;
> >>> +	unsigned long flags;
> >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> >>> +
> >>> +	if (!msg)
> >>> +		return true;
> >>> +
> >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> >>> +	/* Processed already by other polling loop on another CPU ? */
> >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> >>> +		return true;
> >>> +	}
> >>> +
> >>> +	/* Has cmdq index moved at all ? */
> >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> >>
> >> In my understanding, the polling comparison could still be subject to the ABA
> >> problem when exactly 2**16 messages have been marked as used since
> >> msg->poll_idx was set (unlikely scenario, granted).
> >>
> >> I think this would be a lot simpler if the virtio core exported some
> >> concurrency-safe helper function for such polling (similar to more_used() from
> >> virtio_ring.c), as discussed at the top.
> > 
> > So this is the main limitation indeed of the current implementation, I
> > cannot distinguish if there was an exact full wrap and I'm reading the same
> > last_idx as before but a whoppying 2**16 messages have instead gone through...
> > 
> > The tricky part seems to me here that even introducing dedicated helpers
> > for polling in order to account for such wrapping (similar to more_used())
> > those would be based by current VirtIO spec on a single bit wrap counter,
> > so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> > 
> > Maybe I'm missing something though...
> > 
> 
> In my understanding, there is no need to keep track of the old state. We
> actually only want to check whether the device has marked any buffers as `used'
> which we did not retrieve yet via virtqueue_get_buf_ctx().
> 
> This is what more_used() checks in my understanding. One would just need to
> translate the external `struct virtqueue' param to the virtio_ring.c internal
> representation `struct vring_virtqueue' and then call `more_used()'.
> 
> There would be no need to keep `poll_idx` then.
> 
> Best regards,
> 
> Peter

Not really, I don't think so.

There's no magic in more_used. No synchronization happens.
more_used is exactly like virtqueue_poll except
you get to maintain your own index.

As it is, it is quite possible to read the cached index,
then another thread makes 2^16 bufs available, then device
uses them all, and presto you get a false positive.

I guess we can play with memory barriers such that cache
read happens after the index read - but it seems that
will just lead to the same wrap around problem
in reverse. So IIUC it's quite a bit more involved than
just translating structures.

And yes, a more_used like API would remove the need to pass
the index around, but it will also obscure the fact that
there's internal state here and that it's inherently racy
wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
seems to have made it clear enough that people get it.


It's not hard to handle wrap around in the driver if you like though:
just have a 32 bit atomic counter and increment it each time you are
going to make 2^16 buffers available. That gets you to 2^48 with an
overhead of an atomic read and that should be enough short term. Make
sure the cache line where you put the counter is not needed elsewhere -
checking it in a tight loop with an atomic will force it to the local
CPU. And if you are doing that virtqueue_poll will be enough.



> 
> > I'll have a though about this, but in my opinion this seems something so
> > unlikely that we could live with it, for the moment at least...
> > [snip]


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-20 20:39             ` Michael S. Tsirkin
@ 2022-01-23 20:02               ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-23 20:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Thu, Jan 20, 2022 at 03:39:12PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> > On 19.01.22 13:23, Cristian Marussi wrote:
> > > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> > >> On 21.12.21 15:00, Cristian Marussi wrote:
> > >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> > >>> VirtIO transport as pre-requisites to enable atomic operations.
> > >>>
> > >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > >>> and atomic mode for selected SCMI transactions while leaving it default
> > >>> disabled.
> > >>>
> > >>
> > >> Hi Cristian,
> > >>
> > >> thanks for the update. I have some more remarks inline below.
> > >>
> > > 
> > > Hi Peter,
> > > 
> > > thanks for your review, much appreciated, please see my replies online.
> > > 
> > >> My impression is that the virtio core does not expose helper functions suitable
> > >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> > >> more_used() from virtio_ring.c could be exposed via a wrapper?
> > >>
> > > 
> > > While I definitely agree that the virtio core support for polling is far from
> > > ideal, some support is provided and my point was at first to try implement SCMI
> > > virtio polling leveraging what we have now in the core and see if it was attainable
> > > (indeed I tried early in this series to avoid as a whole to have to support polling
> > > at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> > > attempt that led nowhere good...)
> > > 
> > > Btw, I was planning to post a new series next week (after merge-windows) with some
> > > fixes I did already, at this point I'll include also some fixes derived
> > > from some of your remarks.
> > > 
> > >> Best regards,
> > >>
> > >> Peter
> > >>
> > [snip]>>> + *

Hi Michael, Peter

I'm gonna post another iteration of this (and some other dependent
patches) with a few fixes as pointed out by Peter but still without addressing
the ABA style race/issue we are talkig about.

A few comments iand a proposal on this ABA race below, though.

> > >>> + * Return: True once polling has successfully completed.
> > >>> + */
> > >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > >>> +			     struct scmi_xfer *xfer)
> > >>> +{
> > >>> +	bool pending, ret = false;
> > >>> +	unsigned int length, any_prefetched = 0;
> > >>> +	unsigned long flags;
> > >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > >>> +
> > >>> +	if (!msg)
> > >>> +		return true;
> > >>> +
> > >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> > >>> +	/* Processed already by other polling loop on another CPU ? */
> > >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > >>> +		return true;
> > >>> +	}
> > >>> +
> > >>> +	/* Has cmdq index moved at all ? */
> > >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > >>
> > >> In my understanding, the polling comparison could still be subject to the ABA
> > >> problem when exactly 2**16 messages have been marked as used since
> > >> msg->poll_idx was set (unlikely scenario, granted).
> > >>
> > >> I think this would be a lot simpler if the virtio core exported some
> > >> concurrency-safe helper function for such polling (similar to more_used() from
> > >> virtio_ring.c), as discussed at the top.
> > > 
> > > So this is the main limitation indeed of the current implementation, I
> > > cannot distinguish if there was an exact full wrap and I'm reading the same
> > > last_idx as before but a whoppying 2**16 messages have instead gone through...
> > > 
> > > The tricky part seems to me here that even introducing dedicated helpers
> > > for polling in order to account for such wrapping (similar to more_used())
> > > those would be based by current VirtIO spec on a single bit wrap counter,
> > > so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> > > 
> > > Maybe I'm missing something though...
> > > 
> > 
> > In my understanding, there is no need to keep track of the old state. We
> > actually only want to check whether the device has marked any buffers as `used'
> > which we did not retrieve yet via virtqueue_get_buf_ctx().
> > 
> > This is what more_used() checks in my understanding. One would just need to
> > translate the external `struct virtqueue' param to the virtio_ring.c internal
> > representation `struct vring_virtqueue' and then call `more_used()'.
> > 
> > There would be no need to keep `poll_idx` then.
> > 
> > Best regards,
> > 
> > Peter
> 
> Not really, I don't think so.
> 
> There's no magic in more_used. No synchronization happens.
> more_used is exactly like virtqueue_poll except
> you get to maintain your own index.
> 
> As it is, it is quite possible to read the cached index,
> then another thread makes 2^16 bufs available, then device
> uses them all, and presto you get a false positive.
> 
> I guess we can play with memory barriers such that cache
> read happens after the index read - but it seems that
> will just lead to the same wrap around problem
> in reverse. So IIUC it's quite a bit more involved than
> just translating structures.
> 
> And yes, a more_used like API would remove the need to pass
> the index around, but it will also obscure the fact that
> there's internal state here and that it's inherently racy
> wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
> seems to have made it clear enough that people get it.
> 
> 
> It's not hard to handle wrap around in the driver if you like though:
> just have a 32 bit atomic counter and increment it each time you are
> going to make 2^16 buffers available. That gets you to 2^48 with an
> overhead of an atomic read and that should be enough short term. Make
> sure the cache line where you put the counter is not needed elsewhere -
> checking it in a tight loop with an atomic will force it to the local
> CPU. And if you are doing that virtqueue_poll will be enough.
> 
>

I was thinking...keeping the current virtqueue_poll interface, since our
possible issue arises from the used_index wrapping around exactly on top
of the same polled index and given that currently the API returns an
unsigned "opaque" value really carrying just the 16-bit index (and possibly
the wrap bit as bit15 for packed vq) that is supposed to be fed back as
it is to the virtqueue_poll() function....

...why don't we just keep an internal full fledged per-virtqueue wrap-counter
and return that as the MSB 16-bit of the opaque value returned by
virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
opaque is fed back ? (filtering it out from the internal helpers machinery)

As in the example below the scissors.

I mean if the internal wrap count is at that point different from the
one provided to virtqueue_poll() via the opaque poll_idx value previously
provided, certainly there is something new to fetch without even looking
at the indexes: at the same time, exposing an opaque index built as
(wraps << 16 | idx) implicitly 'binds' each index to a specific
wrap-iteration, so they can be distiguished (..ok until the wrap-count
upper 16bit wraps too....but...)

I am not really extremely familiar with the internals of virtio so I
could be missing something obvious...feel free to insult me :P

(..and I have not made any perf measurements or consideration at this
point....nor considered the redundancy of the existent packed
used_wrap_counter bit...)

Thanks,
Cristian

----

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 00f64f2f8b72..bda6af121cd7 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -117,6 +117,8 @@ struct vring_virtqueue {
        /* Last used index we've seen. */
        u16 last_used_idx;
 
+       u16 wraps;
+
        /* Hint for event idx: already triggered no need to disable. */
        bool event_triggered;
 
@@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
        ret = vq->split.desc_state[i].data;
        detach_buf_split(vq, i, ctx);
        vq->last_used_idx++;
+       if (unlikely(!vq->last_used_idx))
+               vq->wraps++;
        /* If we expect an interrupt for the next entry, tell host
         * by writing event index and flush out the write before
         * the read in the next get_buf call. */
@@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
        if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
                vq->last_used_idx -= vq->packed.vring.num;
                vq->packed.used_wrap_counter ^= 1;
+               vq->wraps++;
        }
 
        /*
@@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
        vq->weak_barriers = weak_barriers;
        vq->broken = false;
        vq->last_used_idx = 0;
+       vq->wraps = 0;
        vq->event_triggered = false;
        vq->num_added = 0;
        vq->packed_ring = true;
@@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
  */
 unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
 {
+       unsigned last_used_idx;
        struct vring_virtqueue *vq = to_vvq(_vq);
 
        if (vq->event_triggered)
                vq->event_triggered = false;
 
-       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
-                                virtqueue_enable_cb_prepare_split(_vq);
+       last_used_idx = vq->packed_ring ?
+                       virtqueue_enable_cb_prepare_packed(_vq) :
+                       virtqueue_enable_cb_prepare_split(_vq);
+
+       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
 
@@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
        if (unlikely(vq->broken))
                return false;
 
+       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
+               return true;
+
        virtio_mb(vq->weak_barriers);
-       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
-                                virtqueue_poll_split(_vq, last_used_idx);
+       return vq->packed_ring ?
+               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
+                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
 }
 EXPORT_SYMBOL_GPL(virtqueue_poll);
 
@@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
        vq->weak_barriers = weak_barriers;
        vq->broken = false;
        vq->last_used_idx = 0;
+       vq->wraps = 0;
        vq->event_triggered = false;
        vq->num_added = 0;
        vq->use_dma_api = vring_use_dma_api(vdev);
diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index 476d3e5c0fe7..e6b03017ebd7 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -77,6 +77,17 @@
  */
 #define VRING_PACKED_EVENT_F_WRAP_CTR  15
 
+#define VRING_IDX_MASK                                 GENMASK(15, 0)
+#define VRING_GET_IDX(opaque)                          \
+       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
+
+#define VRING_WRAPS_MASK                               GENMASK(31, 16)
+#define VRING_GET_WRAPS(opaque)                                \
+       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
+
+#define VRING_BUILD_OPAQUE(idx, wraps)                 \
+       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
+
 /* We support indirect buffer descriptors */
 #define VIRTIO_RING_F_INDIRECT_DESC    28


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-23 20:02               ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-23 20:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Thu, Jan 20, 2022 at 03:39:12PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> > On 19.01.22 13:23, Cristian Marussi wrote:
> > > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> > >> On 21.12.21 15:00, Cristian Marussi wrote:
> > >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> > >>> VirtIO transport as pre-requisites to enable atomic operations.
> > >>>
> > >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > >>> and atomic mode for selected SCMI transactions while leaving it default
> > >>> disabled.
> > >>>
> > >>
> > >> Hi Cristian,
> > >>
> > >> thanks for the update. I have some more remarks inline below.
> > >>
> > > 
> > > Hi Peter,
> > > 
> > > thanks for your review, much appreciated, please see my replies online.
> > > 
> > >> My impression is that the virtio core does not expose helper functions suitable
> > >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> > >> more_used() from virtio_ring.c could be exposed via a wrapper?
> > >>
> > > 
> > > While I definitely agree that the virtio core support for polling is far from
> > > ideal, some support is provided and my point was at first to try implement SCMI
> > > virtio polling leveraging what we have now in the core and see if it was attainable
> > > (indeed I tried early in this series to avoid as a whole to have to support polling
> > > at the SCMI transport layer to attain SCMI cmds atomicity..but that was an ill
> > > attempt that led nowhere good...)
> > > 
> > > Btw, I was planning to post a new series next week (after merge-windows) with some
> > > fixes I did already, at this point I'll include also some fixes derived
> > > from some of your remarks.
> > > 
> > >> Best regards,
> > >>
> > >> Peter
> > >>
> > [snip]>>> + *

Hi Michael, Peter

I'm gonna post another iteration of this (and some other dependent
patches) with a few fixes as pointed out by Peter but still without addressing
the ABA style race/issue we are talkig about.

A few comments iand a proposal on this ABA race below, though.

> > >>> + * Return: True once polling has successfully completed.
> > >>> + */
> > >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > >>> +			     struct scmi_xfer *xfer)
> > >>> +{
> > >>> +	bool pending, ret = false;
> > >>> +	unsigned int length, any_prefetched = 0;
> > >>> +	unsigned long flags;
> > >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> > >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> > >>> +
> > >>> +	if (!msg)
> > >>> +		return true;
> > >>> +
> > >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> > >>> +	/* Processed already by other polling loop on another CPU ? */
> > >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> > >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> > >>> +		return true;
> > >>> +	}
> > >>> +
> > >>> +	/* Has cmdq index moved at all ? */
> > >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> > >>
> > >> In my understanding, the polling comparison could still be subject to the ABA
> > >> problem when exactly 2**16 messages have been marked as used since
> > >> msg->poll_idx was set (unlikely scenario, granted).
> > >>
> > >> I think this would be a lot simpler if the virtio core exported some
> > >> concurrency-safe helper function for such polling (similar to more_used() from
> > >> virtio_ring.c), as discussed at the top.
> > > 
> > > So this is the main limitation indeed of the current implementation, I
> > > cannot distinguish if there was an exact full wrap and I'm reading the same
> > > last_idx as before but a whoppying 2**16 messages have instead gone through...
> > > 
> > > The tricky part seems to me here that even introducing dedicated helpers
> > > for polling in order to account for such wrapping (similar to more_used())
> > > those would be based by current VirtIO spec on a single bit wrap counter,
> > > so how do you discern if 2 whole wraps have happened (even more unlikely..) ?
> > > 
> > > Maybe I'm missing something though...
> > > 
> > 
> > In my understanding, there is no need to keep track of the old state. We
> > actually only want to check whether the device has marked any buffers as `used'
> > which we did not retrieve yet via virtqueue_get_buf_ctx().
> > 
> > This is what more_used() checks in my understanding. One would just need to
> > translate the external `struct virtqueue' param to the virtio_ring.c internal
> > representation `struct vring_virtqueue' and then call `more_used()'.
> > 
> > There would be no need to keep `poll_idx` then.
> > 
> > Best regards,
> > 
> > Peter
> 
> Not really, I don't think so.
> 
> There's no magic in more_used. No synchronization happens.
> more_used is exactly like virtqueue_poll except
> you get to maintain your own index.
> 
> As it is, it is quite possible to read the cached index,
> then another thread makes 2^16 bufs available, then device
> uses them all, and presto you get a false positive.
> 
> I guess we can play with memory barriers such that cache
> read happens after the index read - but it seems that
> will just lead to the same wrap around problem
> in reverse. So IIUC it's quite a bit more involved than
> just translating structures.
> 
> And yes, a more_used like API would remove the need to pass
> the index around, but it will also obscure the fact that
> there's internal state here and that it's inherently racy
> wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
> seems to have made it clear enough that people get it.
> 
> 
> It's not hard to handle wrap around in the driver if you like though:
> just have a 32 bit atomic counter and increment it each time you are
> going to make 2^16 buffers available. That gets you to 2^48 with an
> overhead of an atomic read and that should be enough short term. Make
> sure the cache line where you put the counter is not needed elsewhere -
> checking it in a tight loop with an atomic will force it to the local
> CPU. And if you are doing that virtqueue_poll will be enough.
> 
>

I was thinking...keeping the current virtqueue_poll interface, since our
possible issue arises from the used_index wrapping around exactly on top
of the same polled index and given that currently the API returns an
unsigned "opaque" value really carrying just the 16-bit index (and possibly
the wrap bit as bit15 for packed vq) that is supposed to be fed back as
it is to the virtqueue_poll() function....

...why don't we just keep an internal full fledged per-virtqueue wrap-counter
and return that as the MSB 16-bit of the opaque value returned by
virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
opaque is fed back ? (filtering it out from the internal helpers machinery)

As in the example below the scissors.

I mean if the internal wrap count is at that point different from the
one provided to virtqueue_poll() via the opaque poll_idx value previously
provided, certainly there is something new to fetch without even looking
at the indexes: at the same time, exposing an opaque index built as
(wraps << 16 | idx) implicitly 'binds' each index to a specific
wrap-iteration, so they can be distiguished (..ok until the wrap-count
upper 16bit wraps too....but...)

I am not really extremely familiar with the internals of virtio so I
could be missing something obvious...feel free to insult me :P

(..and I have not made any perf measurements or consideration at this
point....nor considered the redundancy of the existent packed
used_wrap_counter bit...)

Thanks,
Cristian

----

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 00f64f2f8b72..bda6af121cd7 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -117,6 +117,8 @@ struct vring_virtqueue {
        /* Last used index we've seen. */
        u16 last_used_idx;
 
+       u16 wraps;
+
        /* Hint for event idx: already triggered no need to disable. */
        bool event_triggered;
 
@@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
        ret = vq->split.desc_state[i].data;
        detach_buf_split(vq, i, ctx);
        vq->last_used_idx++;
+       if (unlikely(!vq->last_used_idx))
+               vq->wraps++;
        /* If we expect an interrupt for the next entry, tell host
         * by writing event index and flush out the write before
         * the read in the next get_buf call. */
@@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
        if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
                vq->last_used_idx -= vq->packed.vring.num;
                vq->packed.used_wrap_counter ^= 1;
+               vq->wraps++;
        }
 
        /*
@@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
        vq->weak_barriers = weak_barriers;
        vq->broken = false;
        vq->last_used_idx = 0;
+       vq->wraps = 0;
        vq->event_triggered = false;
        vq->num_added = 0;
        vq->packed_ring = true;
@@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
  */
 unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
 {
+       unsigned last_used_idx;
        struct vring_virtqueue *vq = to_vvq(_vq);
 
        if (vq->event_triggered)
                vq->event_triggered = false;
 
-       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
-                                virtqueue_enable_cb_prepare_split(_vq);
+       last_used_idx = vq->packed_ring ?
+                       virtqueue_enable_cb_prepare_packed(_vq) :
+                       virtqueue_enable_cb_prepare_split(_vq);
+
+       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
 
@@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
        if (unlikely(vq->broken))
                return false;
 
+       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
+               return true;
+
        virtio_mb(vq->weak_barriers);
-       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
-                                virtqueue_poll_split(_vq, last_used_idx);
+       return vq->packed_ring ?
+               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
+                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
 }
 EXPORT_SYMBOL_GPL(virtqueue_poll);
 
@@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
        vq->weak_barriers = weak_barriers;
        vq->broken = false;
        vq->last_used_idx = 0;
+       vq->wraps = 0;
        vq->event_triggered = false;
        vq->num_added = 0;
        vq->use_dma_api = vring_use_dma_api(vdev);
diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index 476d3e5c0fe7..e6b03017ebd7 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -77,6 +77,17 @@
  */
 #define VRING_PACKED_EVENT_F_WRAP_CTR  15
 
+#define VRING_IDX_MASK                                 GENMASK(15, 0)
+#define VRING_GET_IDX(opaque)                          \
+       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
+
+#define VRING_WRAPS_MASK                               GENMASK(31, 16)
+#define VRING_GET_WRAPS(opaque)                                \
+       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
+
+#define VRING_BUILD_OPAQUE(idx, wraps)                 \
+       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
+
 /* We support indirect buffer descriptors */
 #define VIRTIO_RING_F_INDIRECT_DESC    28


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-23 20:02               ` Cristian Marussi
  (?)
@ 2022-01-23 22:40                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-23 22:40 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> I was thinking...keeping the current virtqueue_poll interface, since our
> possible issue arises from the used_index wrapping around exactly on top
> of the same polled index and given that currently the API returns an
> unsigned "opaque" value really carrying just the 16-bit index (and possibly
> the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> it is to the virtqueue_poll() function....
> 
> ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> and return that as the MSB 16-bit of the opaque value returned by
> virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
> opaque is fed back ? (filtering it out from the internal helpers machinery)
> 
> As in the example below the scissors.
> 
> I mean if the internal wrap count is at that point different from the
> one provided to virtqueue_poll() via the opaque poll_idx value previously
> provided, certainly there is something new to fetch without even looking
> at the indexes: at the same time, exposing an opaque index built as
> (wraps << 16 | idx) implicitly 'binds' each index to a specific
> wrap-iteration, so they can be distiguished (..ok until the wrap-count
> upper 16bit wraps too....but...)
> 
> I am not really extremely familiar with the internals of virtio so I
> could be missing something obvious...feel free to insult me :P
> 
> (..and I have not made any perf measurements or consideration at this
> point....nor considered the redundancy of the existent packed
> used_wrap_counter bit...)
> 
> Thanks,
> Cristian
> 
> ----
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 00f64f2f8b72..bda6af121cd7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -117,6 +117,8 @@ struct vring_virtqueue {
>         /* Last used index we've seen. */
>         u16 last_used_idx;
>  
> +       u16 wraps;
> +
>         /* Hint for event idx: already triggered no need to disable. */
>         bool event_triggered;
>  
> @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>         ret = vq->split.desc_state[i].data;
>         detach_buf_split(vq, i, ctx);
>         vq->last_used_idx++;
> +       if (unlikely(!vq->last_used_idx))
> +               vq->wraps++;

I wonder whether
               vq->wraps += !vq->last_used_idx;
is faster or slower. No branch but OTOH a dependency.


>         /* If we expect an interrupt for the next entry, tell host
>          * by writing event index and flush out the write before
>          * the read in the next get_buf call. */
> @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
>         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
>                 vq->last_used_idx -= vq->packed.vring.num;
>                 vq->packed.used_wrap_counter ^= 1;
> +               vq->wraps++;
>         }
>  
>         /*
> @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->packed_ring = true;
> @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>   */
>  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
> +       unsigned last_used_idx;
>         struct vring_virtqueue *vq = to_vvq(_vq);
>  
>         if (vq->event_triggered)
>                 vq->event_triggered = false;
>  
> -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> -                                virtqueue_enable_cb_prepare_split(_vq);
> +       last_used_idx = vq->packed_ring ?
> +                       virtqueue_enable_cb_prepare_packed(_vq) :
> +                       virtqueue_enable_cb_prepare_split(_vq);
> +
> +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>  
> @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
>         if (unlikely(vq->broken))
>                 return false;
>  
> +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> +               return true;
> +
>         virtio_mb(vq->weak_barriers);
> -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> -                                virtqueue_poll_split(_vq, last_used_idx);
> +       return vq->packed_ring ?
> +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->use_dma_api = vring_use_dma_api(vdev);
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 476d3e5c0fe7..e6b03017ebd7 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -77,6 +77,17 @@
>   */
>  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
>  
> +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> +#define VRING_GET_IDX(opaque)                          \
> +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> +
> +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> +#define VRING_GET_WRAPS(opaque)                                \
> +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> +
> +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> +
>  /* We support indirect buffer descriptors */
>  #define VIRTIO_RING_F_INDIRECT_DESC    28

Yea I think this patch increases the time it takes to wrap around from
2^16 to 2^32 which seems good enough.
Need some comments to explain the logic.
Would be interesting to see perf data.

-- 
MST


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-23 22:40                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-23 22:40 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: f.fainelli, vincent.guittot, igor.skalkin, sudeep.holla,
	linux-kernel, virtualization, Peter Hilber, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> I was thinking...keeping the current virtqueue_poll interface, since our
> possible issue arises from the used_index wrapping around exactly on top
> of the same polled index and given that currently the API returns an
> unsigned "opaque" value really carrying just the 16-bit index (and possibly
> the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> it is to the virtqueue_poll() function....
> 
> ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> and return that as the MSB 16-bit of the opaque value returned by
> virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
> opaque is fed back ? (filtering it out from the internal helpers machinery)
> 
> As in the example below the scissors.
> 
> I mean if the internal wrap count is at that point different from the
> one provided to virtqueue_poll() via the opaque poll_idx value previously
> provided, certainly there is something new to fetch without even looking
> at the indexes: at the same time, exposing an opaque index built as
> (wraps << 16 | idx) implicitly 'binds' each index to a specific
> wrap-iteration, so they can be distiguished (..ok until the wrap-count
> upper 16bit wraps too....but...)
> 
> I am not really extremely familiar with the internals of virtio so I
> could be missing something obvious...feel free to insult me :P
> 
> (..and I have not made any perf measurements or consideration at this
> point....nor considered the redundancy of the existent packed
> used_wrap_counter bit...)
> 
> Thanks,
> Cristian
> 
> ----
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 00f64f2f8b72..bda6af121cd7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -117,6 +117,8 @@ struct vring_virtqueue {
>         /* Last used index we've seen. */
>         u16 last_used_idx;
>  
> +       u16 wraps;
> +
>         /* Hint for event idx: already triggered no need to disable. */
>         bool event_triggered;
>  
> @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>         ret = vq->split.desc_state[i].data;
>         detach_buf_split(vq, i, ctx);
>         vq->last_used_idx++;
> +       if (unlikely(!vq->last_used_idx))
> +               vq->wraps++;

I wonder whether
               vq->wraps += !vq->last_used_idx;
is faster or slower. No branch but OTOH a dependency.


>         /* If we expect an interrupt for the next entry, tell host
>          * by writing event index and flush out the write before
>          * the read in the next get_buf call. */
> @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
>         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
>                 vq->last_used_idx -= vq->packed.vring.num;
>                 vq->packed.used_wrap_counter ^= 1;
> +               vq->wraps++;
>         }
>  
>         /*
> @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->packed_ring = true;
> @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>   */
>  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
> +       unsigned last_used_idx;
>         struct vring_virtqueue *vq = to_vvq(_vq);
>  
>         if (vq->event_triggered)
>                 vq->event_triggered = false;
>  
> -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> -                                virtqueue_enable_cb_prepare_split(_vq);
> +       last_used_idx = vq->packed_ring ?
> +                       virtqueue_enable_cb_prepare_packed(_vq) :
> +                       virtqueue_enable_cb_prepare_split(_vq);
> +
> +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>  
> @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
>         if (unlikely(vq->broken))
>                 return false;
>  
> +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> +               return true;
> +
>         virtio_mb(vq->weak_barriers);
> -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> -                                virtqueue_poll_split(_vq, last_used_idx);
> +       return vq->packed_ring ?
> +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->use_dma_api = vring_use_dma_api(vdev);
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 476d3e5c0fe7..e6b03017ebd7 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -77,6 +77,17 @@
>   */
>  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
>  
> +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> +#define VRING_GET_IDX(opaque)                          \
> +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> +
> +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> +#define VRING_GET_WRAPS(opaque)                                \
> +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> +
> +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> +
>  /* We support indirect buffer descriptors */
>  #define VIRTIO_RING_F_INDIRECT_DESC    28

Yea I think this patch increases the time it takes to wrap around from
2^16 to 2^32 which seems good enough.
Need some comments to explain the logic.
Would be interesting to see perf data.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-23 22:40                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 61+ messages in thread
From: Michael S. Tsirkin @ 2022-01-23 22:40 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> I was thinking...keeping the current virtqueue_poll interface, since our
> possible issue arises from the used_index wrapping around exactly on top
> of the same polled index and given that currently the API returns an
> unsigned "opaque" value really carrying just the 16-bit index (and possibly
> the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> it is to the virtqueue_poll() function....
> 
> ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> and return that as the MSB 16-bit of the opaque value returned by
> virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
> opaque is fed back ? (filtering it out from the internal helpers machinery)
> 
> As in the example below the scissors.
> 
> I mean if the internal wrap count is at that point different from the
> one provided to virtqueue_poll() via the opaque poll_idx value previously
> provided, certainly there is something new to fetch without even looking
> at the indexes: at the same time, exposing an opaque index built as
> (wraps << 16 | idx) implicitly 'binds' each index to a specific
> wrap-iteration, so they can be distiguished (..ok until the wrap-count
> upper 16bit wraps too....but...)
> 
> I am not really extremely familiar with the internals of virtio so I
> could be missing something obvious...feel free to insult me :P
> 
> (..and I have not made any perf measurements or consideration at this
> point....nor considered the redundancy of the existent packed
> used_wrap_counter bit...)
> 
> Thanks,
> Cristian
> 
> ----
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 00f64f2f8b72..bda6af121cd7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -117,6 +117,8 @@ struct vring_virtqueue {
>         /* Last used index we've seen. */
>         u16 last_used_idx;
>  
> +       u16 wraps;
> +
>         /* Hint for event idx: already triggered no need to disable. */
>         bool event_triggered;
>  
> @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>         ret = vq->split.desc_state[i].data;
>         detach_buf_split(vq, i, ctx);
>         vq->last_used_idx++;
> +       if (unlikely(!vq->last_used_idx))
> +               vq->wraps++;

I wonder whether
               vq->wraps += !vq->last_used_idx;
is faster or slower. No branch but OTOH a dependency.


>         /* If we expect an interrupt for the next entry, tell host
>          * by writing event index and flush out the write before
>          * the read in the next get_buf call. */
> @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
>         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
>                 vq->last_used_idx -= vq->packed.vring.num;
>                 vq->packed.used_wrap_counter ^= 1;
> +               vq->wraps++;
>         }
>  
>         /*
> @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->packed_ring = true;
> @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>   */
>  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
> +       unsigned last_used_idx;
>         struct vring_virtqueue *vq = to_vvq(_vq);
>  
>         if (vq->event_triggered)
>                 vq->event_triggered = false;
>  
> -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> -                                virtqueue_enable_cb_prepare_split(_vq);
> +       last_used_idx = vq->packed_ring ?
> +                       virtqueue_enable_cb_prepare_packed(_vq) :
> +                       virtqueue_enable_cb_prepare_split(_vq);
> +
> +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>  
> @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
>         if (unlikely(vq->broken))
>                 return false;
>  
> +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> +               return true;
> +
>         virtio_mb(vq->weak_barriers);
> -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> -                                virtqueue_poll_split(_vq, last_used_idx);
> +       return vq->packed_ring ?
> +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->use_dma_api = vring_use_dma_api(vdev);
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 476d3e5c0fe7..e6b03017ebd7 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -77,6 +77,17 @@
>   */
>  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
>  
> +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> +#define VRING_GET_IDX(opaque)                          \
> +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> +
> +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> +#define VRING_GET_WRAPS(opaque)                                \
> +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> +
> +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> +
>  /* We support indirect buffer descriptors */
>  #define VIRTIO_RING_F_INDIRECT_DESC    28

Yea I think this patch increases the time it takes to wrap around from
2^16 to 2^32 which seems good enough.
Need some comments to explain the logic.
Would be interesting to see perf data.

-- 
MST


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
  2022-01-23 22:40                 ` Michael S. Tsirkin
@ 2022-01-23 22:45                   ` Cristian Marussi
  -1 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-23 22:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Sun, Jan 23, 2022 at 05:40:08PM -0500, Michael S. Tsirkin wrote:
> On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> > I was thinking...keeping the current virtqueue_poll interface, since our
> > possible issue arises from the used_index wrapping around exactly on top
> > of the same polled index and given that currently the API returns an
> > unsigned "opaque" value really carrying just the 16-bit index (and possibly
> > the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> > it is to the virtqueue_poll() function....
> > 
> > ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> > and return that as the MSB 16-bit of the opaque value returned by
> > virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
> > opaque is fed back ? (filtering it out from the internal helpers machinery)
> > 
> > As in the example below the scissors.
> > 
> > I mean if the internal wrap count is at that point different from the
> > one provided to virtqueue_poll() via the opaque poll_idx value previously
> > provided, certainly there is something new to fetch without even looking
> > at the indexes: at the same time, exposing an opaque index built as
> > (wraps << 16 | idx) implicitly 'binds' each index to a specific
> > wrap-iteration, so they can be distiguished (..ok until the wrap-count
> > upper 16bit wraps too....but...)
> > 
> > I am not really extremely familiar with the internals of virtio so I
> > could be missing something obvious...feel free to insult me :P
> > 
> > (..and I have not made any perf measurements or consideration at this
> > point....nor considered the redundancy of the existent packed
> > used_wrap_counter bit...)
> > 
> > Thanks,
> > Cristian
> > 
> > ----
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 00f64f2f8b72..bda6af121cd7 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -117,6 +117,8 @@ struct vring_virtqueue {
> >         /* Last used index we've seen. */
> >         u16 last_used_idx;
> >  
> > +       u16 wraps;
> > +
> >         /* Hint for event idx: already triggered no need to disable. */
> >         bool event_triggered;
> >  
> > @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> >         ret = vq->split.desc_state[i].data;
> >         detach_buf_split(vq, i, ctx);
> >         vq->last_used_idx++;
> > +       if (unlikely(!vq->last_used_idx))
> > +               vq->wraps++;
> 
> I wonder whether
>                vq->wraps += !vq->last_used_idx;
> is faster or slower. No branch but OTOH a dependency.
> 
> 
> >         /* If we expect an interrupt for the next entry, tell host
> >          * by writing event index and flush out the write before
> >          * the read in the next get_buf call. */
> > @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> >         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> >                 vq->last_used_idx -= vq->packed.vring.num;
> >                 vq->packed.used_wrap_counter ^= 1;
> > +               vq->wraps++;
> >         }
> >  
> >         /*
> > @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->packed_ring = true;
> > @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> >   */
> >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> >  {
> > +       unsigned last_used_idx;
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >  
> >         if (vq->event_triggered)
> >                 vq->event_triggered = false;
> >  
> > -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > -                                virtqueue_enable_cb_prepare_split(_vq);
> > +       last_used_idx = vq->packed_ring ?
> > +                       virtqueue_enable_cb_prepare_packed(_vq) :
> > +                       virtqueue_enable_cb_prepare_split(_vq);
> > +
> > +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >  
> > @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> >         if (unlikely(vq->broken))
> >                 return false;
> >  
> > +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> > +               return true;
> > +
> >         virtio_mb(vq->weak_barriers);
> > -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> > -                                virtqueue_poll_split(_vq, last_used_idx);
> > +       return vq->packed_ring ?
> > +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> > +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_poll);
> >  
> > @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->use_dma_api = vring_use_dma_api(vdev);
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 476d3e5c0fe7..e6b03017ebd7 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -77,6 +77,17 @@
> >   */
> >  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
> >  
> > +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> > +#define VRING_GET_IDX(opaque)                          \
> > +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > +
> > +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> > +#define VRING_GET_WRAPS(opaque)                                \
> > +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > +
> > +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> > +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > +
> >  /* We support indirect buffer descriptors */
> >  #define VIRTIO_RING_F_INDIRECT_DESC    28
> 
> Yea I think this patch increases the time it takes to wrap around from
> 2^16 to 2^32 which seems good enough.
> Need some comments to explain the logic.
> Would be interesting to see perf data.
> 

Thanks for your feedback !

I'll try to gather some perf data around it next days.
(and eventually cleanup and adding comments if it is god enough...)

Thanks,
Cristian


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
@ 2022-01-23 22:45                   ` Cristian Marussi
  0 siblings, 0 replies; 61+ messages in thread
From: Cristian Marussi @ 2022-01-23 22:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Hilber, linux-kernel, linux-arm-kernel, sudeep.holla,
	james.quinlan, Jonathan.Cameron, f.fainelli, etienne.carriere,
	vincent.guittot, souvik.chakravarty, igor.skalkin,
	virtualization

On Sun, Jan 23, 2022 at 05:40:08PM -0500, Michael S. Tsirkin wrote:
> On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> > I was thinking...keeping the current virtqueue_poll interface, since our
> > possible issue arises from the used_index wrapping around exactly on top
> > of the same polled index and given that currently the API returns an
> > unsigned "opaque" value really carrying just the 16-bit index (and possibly
> > the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> > it is to the virtqueue_poll() function....
> > 
> > ...why don't we just keep an internal full fledged per-virtqueue wrap-counter
> > and return that as the MSB 16-bit of the opaque value returned by
> > virtqueue_prepare_enable_cb and then check it back in virtqueue_poll when the
> > opaque is fed back ? (filtering it out from the internal helpers machinery)
> > 
> > As in the example below the scissors.
> > 
> > I mean if the internal wrap count is at that point different from the
> > one provided to virtqueue_poll() via the opaque poll_idx value previously
> > provided, certainly there is something new to fetch without even looking
> > at the indexes: at the same time, exposing an opaque index built as
> > (wraps << 16 | idx) implicitly 'binds' each index to a specific
> > wrap-iteration, so they can be distiguished (..ok until the wrap-count
> > upper 16bit wraps too....but...)
> > 
> > I am not really extremely familiar with the internals of virtio so I
> > could be missing something obvious...feel free to insult me :P
> > 
> > (..and I have not made any perf measurements or consideration at this
> > point....nor considered the redundancy of the existent packed
> > used_wrap_counter bit...)
> > 
> > Thanks,
> > Cristian
> > 
> > ----
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 00f64f2f8b72..bda6af121cd7 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -117,6 +117,8 @@ struct vring_virtqueue {
> >         /* Last used index we've seen. */
> >         u16 last_used_idx;
> >  
> > +       u16 wraps;
> > +
> >         /* Hint for event idx: already triggered no need to disable. */
> >         bool event_triggered;
> >  
> > @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> >         ret = vq->split.desc_state[i].data;
> >         detach_buf_split(vq, i, ctx);
> >         vq->last_used_idx++;
> > +       if (unlikely(!vq->last_used_idx))
> > +               vq->wraps++;
> 
> I wonder whether
>                vq->wraps += !vq->last_used_idx;
> is faster or slower. No branch but OTOH a dependency.
> 
> 
> >         /* If we expect an interrupt for the next entry, tell host
> >          * by writing event index and flush out the write before
> >          * the read in the next get_buf call. */
> > @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> >         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
> >                 vq->last_used_idx -= vq->packed.vring.num;
> >                 vq->packed.used_wrap_counter ^= 1;
> > +               vq->wraps++;
> >         }
> >  
> >         /*
> > @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->packed_ring = true;
> > @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> >   */
> >  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> >  {
> > +       unsigned last_used_idx;
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >  
> >         if (vq->event_triggered)
> >                 vq->event_triggered = false;
> >  
> > -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> > -                                virtqueue_enable_cb_prepare_split(_vq);
> > +       last_used_idx = vq->packed_ring ?
> > +                       virtqueue_enable_cb_prepare_packed(_vq) :
> > +                       virtqueue_enable_cb_prepare_split(_vq);
> > +
> > +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> >  
> > @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> >         if (unlikely(vq->broken))
> >                 return false;
> >  
> > +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> > +               return true;
> > +
> >         virtio_mb(vq->weak_barriers);
> > -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> > -                                virtqueue_poll_split(_vq, last_used_idx);
> > +       return vq->packed_ring ?
> > +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> > +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_poll);
> >  
> > @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >         vq->weak_barriers = weak_barriers;
> >         vq->broken = false;
> >         vq->last_used_idx = 0;
> > +       vq->wraps = 0;
> >         vq->event_triggered = false;
> >         vq->num_added = 0;
> >         vq->use_dma_api = vring_use_dma_api(vdev);
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 476d3e5c0fe7..e6b03017ebd7 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -77,6 +77,17 @@
> >   */
> >  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
> >  
> > +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> > +#define VRING_GET_IDX(opaque)                          \
> > +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> > +
> > +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> > +#define VRING_GET_WRAPS(opaque)                                \
> > +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> > +
> > +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> > +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> > +
> >  /* We support indirect buffer descriptors */
> >  #define VIRTIO_RING_F_INDIRECT_DESC    28
> 
> Yea I think this patch increases the time it takes to wrap around from
> 2^16 to 2^32 which seems good enough.
> Need some comments to explain the logic.
> Would be interesting to see perf data.
> 

Thanks for your feedback !

I'll try to gather some perf data around it next days.
(and eventually cleanup and adding comments if it is god enough...)

Thanks,
Cristian


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2022-01-23 22:47 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-20 19:56 [PATCH v8 00/11] Introduce atomic support for SCMI transports Cristian Marussi
2021-12-20 19:56 ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 01/11] firmware: arm_scmi: Add configurable polling mode for transports Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-21 19:38   ` Pratyush Yadav
2021-12-21 19:38     ` Pratyush Yadav
2021-12-21 20:23     ` Sudeep Holla
2021-12-21 20:23       ` Sudeep Holla
2021-12-20 19:56 ` [PATCH v8 02/11] firmware: arm_scmi: Make smc transport use common completions Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 03/11] firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 04/11] firmware: arm_scmi: Make smc support sync_cmds_completed_on_ret Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 05/11] firmware: arm_scmi: Make optee " Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 06/11] firmware: arm_scmi: Add support for atomic transports Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 07/11] firmware: arm_scmi: Add atomic mode support to smc transport Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 08/11] firmware: arm_scmi: Add new parameter to mark_txdone Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 23:17   ` Michael S. Tsirkin
2021-12-20 23:17     ` Michael S. Tsirkin
2021-12-20 23:17     ` Michael S. Tsirkin
2021-12-21 12:09     ` Cristian Marussi
2021-12-21 12:09       ` Cristian Marussi
2021-12-21 14:00   ` [PATCH v9 " Cristian Marussi
2021-12-21 14:00     ` Cristian Marussi
2022-01-18 14:21     ` Peter Hilber
2022-01-18 14:21       ` Peter Hilber
2022-01-19 12:23       ` Cristian Marussi
2022-01-19 12:23         ` Cristian Marussi
2022-01-20 19:09         ` Peter Hilber
2022-01-20 19:09           ` Peter Hilber
2022-01-20 20:39           ` Michael S. Tsirkin
2022-01-20 20:39             ` Michael S. Tsirkin
2022-01-20 20:39             ` Michael S. Tsirkin
2022-01-23 20:02             ` Cristian Marussi
2022-01-23 20:02               ` Cristian Marussi
2022-01-23 22:40               ` Michael S. Tsirkin
2022-01-23 22:40                 ` Michael S. Tsirkin
2022-01-23 22:40                 ` Michael S. Tsirkin
2022-01-23 22:45                 ` Cristian Marussi
2022-01-23 22:45                   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 10/11] firmware: arm_scmi: Add atomic support to clock protocol Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2021-12-20 19:56 ` [PATCH v8 11/11] clk: scmi: Support atomic clock enable/disable API Cristian Marussi
2021-12-20 19:56   ` Cristian Marussi
2022-01-14 23:08   ` Stephen Boyd
2022-01-14 23:08     ` Stephen Boyd
2022-01-17 10:31     ` Sudeep Holla
2022-01-17 10:31       ` Sudeep Holla
2022-01-17 12:40       ` Cristian Marussi
2022-01-17 12:40         ` Cristian Marussi
2021-12-22 14:23 ` [PATCH v8 00/11] (subset) Introduce atomic support for SCMI transports Sudeep Holla
2021-12-22 14:23   ` Sudeep Holla
2022-01-11 18:13 ` [PATCH v8 00/11] " Cristian Marussi
2022-01-11 18:13   ` Cristian Marussi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.