All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/4] soc: imx: add scu firmware api support
@ 2018-06-17 12:49 Dong Aisheng
  2018-06-17 12:49 ` [PATCH V2 1/4] soc: imx: add mu library functions support Dong Aisheng
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

Unlike the former i.MX Architectures, the new generation i.MX8 SoCs
(e.g. MX8QXP and MX8QM) contain a system controller which runs on a
dedicated Cortex-M core to provide power, clock, Pad, and resource
management. Communication between the host processor running
an OS and the system controller happens through a SCU protocol.
This patchset adds the SCU APIs which is implemented based on MU
and will be used by different system components.

It mainly consists of below parts:
1) MU library calls
1) Implementation of the IPC functions based on MUs (client side).
2) SCU firmware services APIs implementation ased on RPC calls which
   are mostly generated by SCU firmware

Dong Aisheng (4):
  soc: imx: add mu library functions support
  dt-bindings: arm: fsl: add mu binding doc
  dt-bindings: arm: fsl: add scu binding doc
  soc: imx: add SC firmware IPC and APIs

 .../devicetree/bindings/arm/freescale/fsl,mu.txt   |  32 +
 .../devicetree/bindings/arm/freescale/fsl,scu.txt  |  38 +
 drivers/soc/imx/Kconfig                            |   7 +
 drivers/soc/imx/Makefile                           |   2 +
 drivers/soc/imx/imx_mu.c                           | 165 +++++
 drivers/soc/imx/sc/Makefile                        |   8 +
 drivers/soc/imx/sc/main/ipc.c                      | 250 +++++++
 drivers/soc/imx/sc/main/rpc.h                      |  81 +++
 drivers/soc/imx/sc/svc/irq/rpc.h                   |  23 +
 drivers/soc/imx/sc/svc/irq/rpc_clnt.c              |  58 ++
 drivers/soc/imx/sc/svc/misc/rpc.h                  |  40 ++
 drivers/soc/imx/sc/svc/misc/rpc_clnt.c             | 368 ++++++++++
 drivers/soc/imx/sc/svc/pad/rpc.h                   |  37 +
 drivers/soc/imx/sc/svc/pad/rpc_clnt.c              |  53 ++
 drivers/soc/imx/sc/svc/pm/rpc.h                    |  40 ++
 drivers/soc/imx/sc/svc/pm/rpc_clnt.c               | 393 +++++++++++
 drivers/soc/imx/sc/svc/rm/rpc.h                    |  52 ++
 drivers/soc/imx/sc/svc/rm/rpc_clnt.c               | 612 +++++++++++++++++
 drivers/soc/imx/sc/svc/timer/rpc.h                 |  34 +
 drivers/soc/imx/sc/svc/timer/rpc_clnt.c            | 295 ++++++++
 include/soc/imx/mu.h                               |  24 +
 include/soc/imx/sc/ipc.h                           |  46 ++
 include/soc/imx/sc/scfw.h                          |  24 +
 include/soc/imx/sc/sci.h                           |  35 +
 include/soc/imx/sc/svc/irq/api.h                   | 139 ++++
 include/soc/imx/sc/svc/misc/api.h                  | 395 +++++++++++
 include/soc/imx/sc/svc/pad/api.h                   |  61 ++
 include/soc/imx/sc/svc/pm/api.h                    | 559 +++++++++++++++
 include/soc/imx/sc/svc/rm/api.h                    | 726 ++++++++++++++++++++
 include/soc/imx/sc/svc/timer/api.h                 | 265 +++++++
 include/soc/imx/sc/types.h                         | 764 +++++++++++++++++++++
 31 files changed, 5626 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
 create mode 100644 drivers/soc/imx/imx_mu.c
 create mode 100644 drivers/soc/imx/sc/Makefile
 create mode 100644 drivers/soc/imx/sc/main/ipc.c
 create mode 100644 drivers/soc/imx/sc/main/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/irq/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/irq/rpc_clnt.c
 create mode 100644 drivers/soc/imx/sc/svc/misc/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/misc/rpc_clnt.c
 create mode 100644 drivers/soc/imx/sc/svc/pad/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/pad/rpc_clnt.c
 create mode 100644 drivers/soc/imx/sc/svc/pm/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/pm/rpc_clnt.c
 create mode 100644 drivers/soc/imx/sc/svc/rm/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/rm/rpc_clnt.c
 create mode 100644 drivers/soc/imx/sc/svc/timer/rpc.h
 create mode 100644 drivers/soc/imx/sc/svc/timer/rpc_clnt.c
 create mode 100644 include/soc/imx/mu.h
 create mode 100644 include/soc/imx/sc/ipc.h
 create mode 100644 include/soc/imx/sc/scfw.h
 create mode 100644 include/soc/imx/sc/sci.h
 create mode 100644 include/soc/imx/sc/svc/irq/api.h
 create mode 100644 include/soc/imx/sc/svc/misc/api.h
 create mode 100644 include/soc/imx/sc/svc/pad/api.h
 create mode 100644 include/soc/imx/sc/svc/pm/api.h
 create mode 100644 include/soc/imx/sc/svc/rm/api.h
 create mode 100644 include/soc/imx/sc/svc/timer/api.h
 create mode 100644 include/soc/imx/sc/types.h

-- 
2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 1/4] soc: imx: add mu library functions support
  2018-06-17 12:49 [PATCH V2 0/4] soc: imx: add scu firmware api support Dong Aisheng
@ 2018-06-17 12:49 ` Dong Aisheng
  2018-06-17 12:49   ` Dong Aisheng
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

This is used for i.MX multi core communication.
e.g. A core to SCU firmware(M core) on MX8.

Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
v1->v2:
 * introduce struct mu_priv to keep the private iomem info
 * add the corresponding mu_exit()
---
 drivers/soc/imx/Kconfig  |   3 +
 drivers/soc/imx/Makefile |   1 +
 drivers/soc/imx/imx_mu.c | 165 +++++++++++++++++++++++++++++++++++++++++++++++
 include/soc/imx/mu.h     |  24 +++++++
 4 files changed, 193 insertions(+)
 create mode 100644 drivers/soc/imx/imx_mu.c
 create mode 100644 include/soc/imx/mu.h

diff --git a/drivers/soc/imx/Kconfig b/drivers/soc/imx/Kconfig
index a5b86a2..4858cd7 100644
--- a/drivers/soc/imx/Kconfig
+++ b/drivers/soc/imx/Kconfig
@@ -7,4 +7,7 @@ config IMX7_PM_DOMAINS
 	select PM_GENERIC_DOMAINS
 	default y if SOC_IMX7D
 
+config HAVE_IMX_MU
+	bool
+
 endmenu
diff --git a/drivers/soc/imx/Makefile b/drivers/soc/imx/Makefile
index aab41a5c..113dc7f 100644
--- a/drivers/soc/imx/Makefile
+++ b/drivers/soc/imx/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_HAVE_IMX_GPC) += gpc.o
 obj-$(CONFIG_IMX7_PM_DOMAINS) += gpcv2.o
+obj-$(CONFIG_HAVE_IMX_MU) += imx_mu.o
diff --git a/drivers/soc/imx/imx_mu.c b/drivers/soc/imx/imx_mu.c
new file mode 100644
index 0000000..44294d1
--- /dev/null
+++ b/drivers/soc/imx/imx_mu.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.
+ * Copyright 2017~2018 NXP
+ *	Dong Aisheng <aisheng.dong@nxp.com>
+ */
+
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+#include <linux/slab.h>
+
+#define MU_ATR0			0x0
+#define MU_ARR0			0x10
+#define MU_ASR			0x20
+#define MU_ACR			0x24
+
+#define MU_CR_GIEn_MASK		(0xf << 28)
+#define MU_CR_RIEn_MASK		(0xf << 24)
+#define MU_CR_TIEn_MASK		(0xf << 20)
+#define MU_CR_GIRn_MASK		(0xf << 16)
+#define MU_CR_NMI_MASK		(1 << 3)
+#define MU_CR_Fn_MASK		0x7
+
+#define MU_SR_TE0_MASK		BIT(23)
+#define MU_SR_RF0_MASK		BIT(27)
+
+#define MU_CR_RIE0_MASK		BIT(27)
+#define MU_CR_GIE0_MASK		BIT(31)
+
+struct mu_priv {
+	struct device_node *np;
+	void __iomem *base;
+};
+
+/*
+ * This function sets the Flag n of the MU.
+ */
+int32_t mu_set_fn(struct mu_priv *priv, uint32_t fn)
+{
+	uint32_t reg;
+
+	reg = fn & (~MU_CR_Fn_MASK);
+	if (reg > 0)
+		return -EINVAL;
+
+	reg = readl_relaxed(priv->base + MU_ACR);
+	/*  Clear ABFn. */
+	reg &= ~MU_CR_Fn_MASK;
+	reg |= fn;
+	writel_relaxed(reg, priv->base + MU_ACR);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mu_set_fn);
+
+/*
+ * This function reads the status from status register.
+ */
+uint32_t mu_read_status(struct mu_priv *priv)
+{
+	return readl_relaxed(priv->base + MU_ASR);
+}
+EXPORT_SYMBOL_GPL(mu_read_status);
+
+/*
+ * This function enables specific RX full interrupt.
+ */
+void mu_enable_rx_full_int(struct mu_priv *priv, uint32_t index)
+{
+	uint32_t reg;
+
+	reg = readl_relaxed(priv->base + MU_ACR);
+	reg &= ~(MU_CR_GIRn_MASK | MU_CR_NMI_MASK);
+	reg |= MU_CR_RIE0_MASK >> index;
+	writel_relaxed(reg, priv->base + MU_ACR);
+}
+EXPORT_SYMBOL_GPL(mu_enable_rx_full_int);
+
+/*
+ * This function enables specific general purpose interrupt.
+ */
+void mu_enable_general_int(struct mu_priv *priv, uint32_t index)
+{
+	uint32_t reg;
+
+	reg = readl_relaxed(priv->base + MU_ACR);
+	reg &= ~(MU_CR_GIRn_MASK | MU_CR_NMI_MASK);
+	reg |= MU_CR_GIE0_MASK >> index;
+	writel_relaxed(reg, priv->base + MU_ACR);
+}
+EXPORT_SYMBOL_GPL(mu_enable_general_int);
+
+/*
+ * Wait and send message to the other core.
+ */
+void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
+{
+	uint32_t mask = MU_SR_TE0_MASK >> index;
+
+	/* Wait TX register to be empty. */
+	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
+		;
+	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4));
+}
+EXPORT_SYMBOL_GPL(mu_send_msg);
+
+/*
+ * Wait to receive message from the other core.
+ */
+void mu_receive_msg(struct mu_priv *priv, uint32_t index, uint32_t *msg)
+{
+	uint32_t mask = MU_SR_RF0_MASK >> index;
+
+	/* Wait RX register to be full. */
+	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
+		;
+	*msg = readl_relaxed(priv->base + MU_ARR0 + (index * 4));
+}
+EXPORT_SYMBOL_GPL(mu_receive_msg);
+
+struct device_node *mu_node(struct mu_priv *priv)
+{
+	return priv ? priv->np : NULL;
+}
+EXPORT_SYMBOL_GPL(mu_node);
+
+struct mu_priv *mu_init(struct device_node *np)
+{
+	struct mu_priv *priv;
+	uint32_t reg;
+
+	if (WARN_ON(!np))
+		return NULL;
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return ERR_PTR(-ENOMEM);
+
+	priv->np = np;
+	priv->base = of_iomap(np, 0);
+	if (!priv->base) {
+		kfree(priv);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	reg = readl_relaxed(priv->base + MU_ACR);
+	/* Clear GIEn, RIEn, TIEn, GIRn and ABFn. */
+	reg &= ~(MU_CR_GIEn_MASK | MU_CR_RIEn_MASK | MU_CR_TIEn_MASK
+		 | MU_CR_GIRn_MASK | MU_CR_NMI_MASK | MU_CR_Fn_MASK);
+	writel_relaxed(reg, priv->base + MU_ACR);
+
+	return priv;
+}
+EXPORT_SYMBOL_GPL(mu_init);
+
+void mu_exit(struct mu_priv *priv)
+{
+	if (WARN_ON(!priv))
+		return;
+
+	iounmap(priv->base);
+	kfree(priv);
+}
+EXPORT_SYMBOL_GPL(mu_exit);
diff --git a/include/soc/imx/mu.h b/include/soc/imx/mu.h
new file mode 100644
index 0000000..0802193
--- /dev/null
+++ b/include/soc/imx/mu.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.
+ * Copyright 2017-2018 NXP
+ */
+
+#ifndef IMX_MU_H
+#define IMX_MU_H
+
+#define MU_TR_COUNT		4
+#define MU_RR_COUNT		4
+
+struct mu_priv;
+
+void mu_exit(struct mu_priv *priv);
+struct mu_priv *mu_init(struct device_node *np);
+struct device_node *mu_node(struct mu_priv *priv);
+void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg);
+void mu_receive_msg(struct mu_priv *priv, uint32_t index, uint32_t *msg);
+void mu_enable_general_int(struct mu_priv *priv, uint32_t index);
+void mu_enable_rx_full_int(struct mu_priv *priv, uint32_t index);
+uint32_t mu_read_status(struct mu_priv *priv);
+int32_t mu_set_fn(struct mu_priv *priv, uint32_t Fn);
+#endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-17 12:49 [PATCH V2 0/4] soc: imx: add scu firmware api support Dong Aisheng
@ 2018-06-17 12:49   ` Dong Aisheng
  2018-06-17 12:49   ` Dong Aisheng
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Dong Aisheng, Mark Rutland, dongas86, devicetree, Rob Herring,
	linux-imx, kernel, fabio.estevam, shawnguo

The Messaging Unit module enables two processors within
the SoC to communicate and coordinate by passing messages
(e.g. data, status and control) through the MU interface.

Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
v1->v2:
 * typo fixes
 * remove status property
 * remove imx6&7 compatible string which may be added later for
   the generic mailbox binding

Note: Because MU used by SCU is not implemented as a mailbox driver,
Instead, they're provided in library calls to gain higher performance.
Futhermore, SCU MU has only one channel. But the binding doc claims
(Change to allow 0?)
So we did not follow the mailbox binding.

For the generic mailbox driver binding way, it may be added later.
---
 .../devicetree/bindings/arm/freescale/fsl,mu.txt   | 32 ++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt
new file mode 100644
index 0000000..c37aa1d
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt
@@ -0,0 +1,32 @@
+NXP i.MX Messaging Unit (MU)
+--------------------------------------------------------------------
+
+The Messaging Unit module enables two processors within the SoC to
+communicate and coordinate by passing messages (e.g. data, status
+and control) through the MU interface. The MU also provides the ability
+for one processor to signal the other processor using interrupts.
+
+Because the MU manages the messaging between processors, the MU uses
+different clocks (from each side of the different peripheral buses).
+Therefore, the MU must synchronize the accesses from one side to the
+other. The MU accomplishes synchronization using two sets of matching
+registers (Processor A-facing, Processor B-facing).
+
+Messaging Unit Device Node:
+=============================
+
+Required properties:
+-------------------
+- compatible :	should be "fsl,<chip>-mu", the supported chips include
+		imx8qxp, imx8qm.
+- reg :		Should contain the registers location and length
+- interrupts :	Interrupt number. The interrupt specifier format depends
+		on the interrupt controller parent.
+
+Examples:
+--------
+lsio_mu0: mu@5d1b0000 {
+	compatible = "fsl,imx8qxp-mu";
+	reg = <0x0 0x5d1b0000 0x0 0x10000>;
+	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+};
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-17 12:49   ` Dong Aisheng
  0 siblings, 0 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

The Messaging Unit module enables two processors within
the SoC to communicate and coordinate by passing messages
(e.g. data, status and control) through the MU interface.

Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree at vger.kernel.org
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
v1->v2:
 * typo fixes
 * remove status property
 * remove imx6&7 compatible string which may be added later for
   the generic mailbox binding

Note: Because MU used by SCU is not implemented as a mailbox driver,
Instead, they're provided in library calls to gain higher performance.
Futhermore, SCU MU has only one channel. But the binding doc claims
(Change to allow 0?)
So we did not follow the mailbox binding.

For the generic mailbox driver binding way, it may be added later.
---
 .../devicetree/bindings/arm/freescale/fsl,mu.txt   | 32 ++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt
new file mode 100644
index 0000000..c37aa1d
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt
@@ -0,0 +1,32 @@
+NXP i.MX Messaging Unit (MU)
+--------------------------------------------------------------------
+
+The Messaging Unit module enables two processors within the SoC to
+communicate and coordinate by passing messages (e.g. data, status
+and control) through the MU interface. The MU also provides the ability
+for one processor to signal the other processor using interrupts.
+
+Because the MU manages the messaging between processors, the MU uses
+different clocks (from each side of the different peripheral buses).
+Therefore, the MU must synchronize the accesses from one side to the
+other. The MU accomplishes synchronization using two sets of matching
+registers (Processor A-facing, Processor B-facing).
+
+Messaging Unit Device Node:
+=============================
+
+Required properties:
+-------------------
+- compatible :	should be "fsl,<chip>-mu", the supported chips include
+		imx8qxp, imx8qm.
+- reg :		Should contain the registers location and length
+- interrupts :	Interrupt number. The interrupt specifier format depends
+		on the interrupt controller parent.
+
+Examples:
+--------
+lsio_mu0: mu at 5d1b0000 {
+	compatible = "fsl,imx8qxp-mu";
+	reg = <0x0 0x5d1b0000 0x0 0x10000>;
+	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+};
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
  2018-06-17 12:49 [PATCH V2 0/4] soc: imx: add scu firmware api support Dong Aisheng
@ 2018-06-17 12:49   ` Dong Aisheng
  2018-06-17 12:49   ` Dong Aisheng
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Dong Aisheng, Mark Rutland, dongas86, devicetree, Rob Herring,
	linux-imx, kernel, fabio.estevam, shawnguo

The System Controller Firmware (SCFW) is a low-level system function
which runs on a dedicated Cortex-M core to provide power, clock, and
resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
(QM, QP), and i.MX8QX (QXP, DX).

Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
v1->v2:
 * remove status
 * changed to mu1
---
 .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38 ++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
new file mode 100644
index 0000000..9b7c9fe
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
@@ -0,0 +1,38 @@
+NXP i.MX System Controller Firmware (SCFW)
+--------------------------------------------------------------------
+
+The System Controller Firmware (SCFW) is a low-level system function
+which runs on a dedicated Cortex-M core to provide power, clock, and
+resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
+(QM, QP), and i.MX8QX (QXP, DX).
+
+The AP communicates with the SC using a multi-ported MU module found
+in the LSIO subsystem. The current definition of this MU module provides
+5 remote AP connections to the SC to support up to 5 execution environments
+(TZ, HV, standard Linux, etc.). The SC side of this MU module interfaces
+with the LSIO DSC IP bus. The SC firmware will communicate with this MU
+using the MSI bus.
+
+System Controller Device Node:
+=============================
+
+Required properties:
+-------------------
+- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
+- fsl,mu: a phandle to the Message Unit used by SCU. Should be
+	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
+	  to make sure not use the one which is conflict with
+	  other execution environments. e.g. ATF.
+
+Examples:
+--------
+lsio_mu1: mu@5d1c0000 {
+	compatible = "fsl,imx8qxp-mu";
+	reg = <0x0 0x5d1c0000 0x0 0x10000>;
+	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+};
+
+scu {
+	compatible = "fsl,imx8qxp-scu";
+	fsl,mu = <&lsio_mu1>;
+};
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
@ 2018-06-17 12:49   ` Dong Aisheng
  0 siblings, 0 replies; 37+ messages in thread
From: Dong Aisheng @ 2018-06-17 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

The System Controller Firmware (SCFW) is a low-level system function
which runs on a dedicated Cortex-M core to provide power, clock, and
resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
(QM, QP), and i.MX8QX (QXP, DX).

Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree at vger.kernel.org
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
v1->v2:
 * remove status
 * changed to mu1
---
 .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38 ++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
new file mode 100644
index 0000000..9b7c9fe
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
@@ -0,0 +1,38 @@
+NXP i.MX System Controller Firmware (SCFW)
+--------------------------------------------------------------------
+
+The System Controller Firmware (SCFW) is a low-level system function
+which runs on a dedicated Cortex-M core to provide power, clock, and
+resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
+(QM, QP), and i.MX8QX (QXP, DX).
+
+The AP communicates with the SC using a multi-ported MU module found
+in the LSIO subsystem. The current definition of this MU module provides
+5 remote AP connections to the SC to support up to 5 execution environments
+(TZ, HV, standard Linux, etc.). The SC side of this MU module interfaces
+with the LSIO DSC IP bus. The SC firmware will communicate with this MU
+using the MSI bus.
+
+System Controller Device Node:
+=============================
+
+Required properties:
+-------------------
+- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
+- fsl,mu: a phandle to the Message Unit used by SCU. Should be
+	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
+	  to make sure not use the one which is conflict with
+	  other execution environments. e.g. ATF.
+
+Examples:
+--------
+lsio_mu1: mu at 5d1c0000 {
+	compatible = "fsl,imx8qxp-mu";
+	reg = <0x0 0x5d1c0000 0x0 0x10000>;
+	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+};
+
+scu {
+	compatible = "fsl,imx8qxp-scu";
+	fsl,mu = <&lsio_mu1>;
+};
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH V2 4/4] soc: imx: add SC firmware IPC and APIs
       [not found] ` <1529239789-26849-5-git-send-email-aisheng.dong@nxp.com>
@ 2018-06-18  8:58   ` Leonard Crestez
  0 siblings, 0 replies; 37+ messages in thread
From: Leonard Crestez @ 2018-06-18  8:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 2018-06-17 at 20:49 +0800, Dong Aisheng wrote:
> The System Controller Firmware (SCFW) is a low-level system function
> which runs on a dedicated Cortex-M core to provide power, clock, and
> resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> (QM, QP), and i.MX8QX (QXP, DX).
> 
> This patch adds the SC firmware service APIs used by the system.
> It mainly consists of two parts:
> 1) Implementation of the IPC functions using MUs (client side).
> 2) SCU firmware services APIs implemented based on RPC calls

Most of the code in this patch is auto-generated but some of it (like
ipc.c) is not. Maybe that part should go to a separate patch?

It's otherwise difficult to tell which parts of this +5000 line patch
should be looked at by a human.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-17 12:49   ` Dong Aisheng
@ 2018-06-20 19:43     ` Rob Herring
  -1 siblings, 0 replies; 37+ messages in thread
From: Rob Herring @ 2018-06-20 19:43 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Mark Rutland, devicetree, dongas86, linux-imx, kernel,
	fabio.estevam, shawnguo, linux-arm-kernel

On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> The Messaging Unit module enables two processors within
> the SoC to communicate and coordinate by passing messages
> (e.g. data, status and control) through the MU interface.
> 
> Cc: Shawn Guo <shawnguo@kernel.org>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: devicetree@vger.kernel.org
> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> ---
> v1->v2:
>  * typo fixes
>  * remove status property
>  * remove imx6&7 compatible string which may be added later for
>    the generic mailbox binding
> 
> Note: Because MU used by SCU is not implemented as a mailbox driver,
> Instead, they're provided in library calls to gain higher performance.

Using a binding doesn't mean you have to use an OS's subsystem.

What needs higher performance? What's the performance difference? Why 
can't the mailbox framework be improved?

> Futhermore, SCU MU has only one channel. But the binding doc claims
> (Change to allow 0?)
> So we did not follow the mailbox binding.
> 
> For the generic mailbox driver binding way, it may be added later.

The h/w isn't changing later, so no.

> ---
>  .../devicetree/bindings/arm/freescale/fsl,mu.txt   | 32 ++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-20 19:43     ` Rob Herring
  0 siblings, 0 replies; 37+ messages in thread
From: Rob Herring @ 2018-06-20 19:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> The Messaging Unit module enables two processors within
> the SoC to communicate and coordinate by passing messages
> (e.g. data, status and control) through the MU interface.
> 
> Cc: Shawn Guo <shawnguo@kernel.org>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: devicetree at vger.kernel.org
> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> ---
> v1->v2:
>  * typo fixes
>  * remove status property
>  * remove imx6&7 compatible string which may be added later for
>    the generic mailbox binding
> 
> Note: Because MU used by SCU is not implemented as a mailbox driver,
> Instead, they're provided in library calls to gain higher performance.

Using a binding doesn't mean you have to use an OS's subsystem.

What needs higher performance? What's the performance difference? Why 
can't the mailbox framework be improved?

> Futhermore, SCU MU has only one channel. But the binding doc claims
> (Change to allow 0?)
> So we did not follow the mailbox binding.
> 
> For the generic mailbox driver binding way, it may be added later.

The h/w isn't changing later, so no.

> ---
>  .../devicetree/bindings/arm/freescale/fsl,mu.txt   | 32 ++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,mu.txt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
  2018-06-17 12:49   ` Dong Aisheng
@ 2018-06-20 19:44     ` Rob Herring
  -1 siblings, 0 replies; 37+ messages in thread
From: Rob Herring @ 2018-06-20 19:44 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Mark Rutland, devicetree, dongas86, linux-imx, kernel,
	fabio.estevam, shawnguo, linux-arm-kernel

On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> The System Controller Firmware (SCFW) is a low-level system function
> which runs on a dedicated Cortex-M core to provide power, clock, and
> resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> (QM, QP), and i.MX8QX (QXP, DX).
> 
> Cc: Shawn Guo <shawnguo@kernel.org>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: devicetree@vger.kernel.org
> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> ---
> v1->v2:
>  * remove status
>  * changed to mu1
> ---
>  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38 ++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> new file mode 100644
> index 0000000..9b7c9fe
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> @@ -0,0 +1,38 @@
> +NXP i.MX System Controller Firmware (SCFW)
> +--------------------------------------------------------------------
> +
> +The System Controller Firmware (SCFW) is a low-level system function
> +which runs on a dedicated Cortex-M core to provide power, clock, and
> +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> +(QM, QP), and i.MX8QX (QXP, DX).
> +
> +The AP communicates with the SC using a multi-ported MU module found
> +in the LSIO subsystem. The current definition of this MU module provides
> +5 remote AP connections to the SC to support up to 5 execution environments
> +(TZ, HV, standard Linux, etc.). The SC side of this MU module interfaces
> +with the LSIO DSC IP bus. The SC firmware will communicate with this MU
> +using the MSI bus.
> +
> +System Controller Device Node:
> +=============================
> +
> +Required properties:
> +-------------------
> +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> +	  to make sure not use the one which is conflict with
> +	  other execution environments. e.g. ATF.

Use the mailbox binding even if you don't use the mailbox subsystem.

> +
> +Examples:
> +--------
> +lsio_mu1: mu@5d1c0000 {
> +	compatible = "fsl,imx8qxp-mu";
> +	reg = <0x0 0x5d1c0000 0x0 0x10000>;
> +	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
> +};
> +
> +scu {
> +	compatible = "fsl,imx8qxp-scu";
> +	fsl,mu = <&lsio_mu1>;
> +};
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
@ 2018-06-20 19:44     ` Rob Herring
  0 siblings, 0 replies; 37+ messages in thread
From: Rob Herring @ 2018-06-20 19:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> The System Controller Firmware (SCFW) is a low-level system function
> which runs on a dedicated Cortex-M core to provide power, clock, and
> resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> (QM, QP), and i.MX8QX (QXP, DX).
> 
> Cc: Shawn Guo <shawnguo@kernel.org>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: devicetree at vger.kernel.org
> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> ---
> v1->v2:
>  * remove status
>  * changed to mu1
> ---
>  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38 ++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> new file mode 100644
> index 0000000..9b7c9fe
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> @@ -0,0 +1,38 @@
> +NXP i.MX System Controller Firmware (SCFW)
> +--------------------------------------------------------------------
> +
> +The System Controller Firmware (SCFW) is a low-level system function
> +which runs on a dedicated Cortex-M core to provide power, clock, and
> +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> +(QM, QP), and i.MX8QX (QXP, DX).
> +
> +The AP communicates with the SC using a multi-ported MU module found
> +in the LSIO subsystem. The current definition of this MU module provides
> +5 remote AP connections to the SC to support up to 5 execution environments
> +(TZ, HV, standard Linux, etc.). The SC side of this MU module interfaces
> +with the LSIO DSC IP bus. The SC firmware will communicate with this MU
> +using the MSI bus.
> +
> +System Controller Device Node:
> +=============================
> +
> +Required properties:
> +-------------------
> +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> +	  to make sure not use the one which is conflict with
> +	  other execution environments. e.g. ATF.

Use the mailbox binding even if you don't use the mailbox subsystem.

> +
> +Examples:
> +--------
> +lsio_mu1: mu at 5d1c0000 {
> +	compatible = "fsl,imx8qxp-mu";
> +	reg = <0x0 0x5d1c0000 0x0 0x10000>;
> +	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
> +};
> +
> +scu {
> +	compatible = "fsl,imx8qxp-scu";
> +	fsl,mu = <&lsio_mu1>;
> +};
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
  2018-06-20 19:44     ` Rob Herring
@ 2018-06-21  3:38       ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21  3:38 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, devicetree, dongas86, dl-linux-imx, kernel,
	Fabio Estevam, shawnguo, linux-arm-kernel

Hi Rob,

> -----Original Message-----
> From: Rob Herring [mailto:robh@kernel.org]
> Sent: Thursday, June 21, 2018 3:45 AM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: linux-arm-kernel@lists.infradead.org; dongas86@gmail.com;
> kernel@pengutronix.de; shawnguo@kernel.org; Fabio Estevam
> <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> Rutland <mark.rutland@arm.com>; devicetree@vger.kernel.org
> Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> 
> On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > The System Controller Firmware (SCFW) is a low-level system function
> > which runs on a dedicated Cortex-M core to provide power, clock, and
> > resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > (QM, QP), and i.MX8QX (QXP, DX).
> >
> > Cc: Shawn Guo <shawnguo@kernel.org>
> > Cc: Sascha Hauer <kernel@pengutronix.de>
> > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: devicetree@vger.kernel.org
> > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > ---
> > v1->v2:
> >  * remove status
> >  * changed to mu1
> > ---
> >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > ++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> >  create mode 100644
> > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> >
> > diff --git
> > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > new file mode 100644
> > index 0000000..9b7c9fe
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > @@ -0,0 +1,38 @@
> > +NXP i.MX System Controller Firmware (SCFW)
> > +--------------------------------------------------------------------
> > +
> > +The System Controller Firmware (SCFW) is a low-level system function
> > +which runs on a dedicated Cortex-M core to provide power, clock, and
> > +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > +(QM, QP), and i.MX8QX (QXP, DX).
> > +
> > +The AP communicates with the SC using a multi-ported MU module found
> > +in the LSIO subsystem. The current definition of this MU module
> > +provides
> > +5 remote AP connections to the SC to support up to 5 execution
> > +environments (TZ, HV, standard Linux, etc.). The SC side of this MU
> > +module interfaces with the LSIO DSC IP bus. The SC firmware will
> > +communicate with this MU using the MSI bus.
> > +
> > +System Controller Device Node:
> > +=============================
> > +
> > +Required properties:
> > +-------------------
> > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > +	  to make sure not use the one which is conflict with
> > +	  other execution environments. e.g. ATF.
> 
> Use the mailbox binding even if you don't use the mailbox subsystem.
> 

Looks reasonable. Will change it.

BTW as I said before, the current mailbox binding fixed #mbox-cells to be
at least 1 which is not suitable for i.MX SCU MU as it has only one physical
channel.

I will cook a patch to update it to allow #mbox-cells = <0>.

If any issue please let me know.

Regards
Dong Aisheng

> > +
> > +Examples:
> > +--------
> > +lsio_mu1: mu@5d1c0000 {
> > +	compatible = "fsl,imx8qxp-mu";
> > +	reg = <0x0 0x5d1c0000 0x0 0x10000>;
> > +	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>; };
> > +
> > +scu {
> > +	compatible = "fsl,imx8qxp-scu";
> > +	fsl,mu = <&lsio_mu1>;
> > +};
> > --
> > 2.7.4
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
@ 2018-06-21  3:38       ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21  3:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Rob,

> -----Original Message-----
> From: Rob Herring [mailto:robh at kernel.org]
> Sent: Thursday, June 21, 2018 3:45 AM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: linux-arm-kernel at lists.infradead.org; dongas86 at gmail.com;
> kernel at pengutronix.de; shawnguo at kernel.org; Fabio Estevam
> <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> Rutland <mark.rutland@arm.com>; devicetree at vger.kernel.org
> Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> 
> On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > The System Controller Firmware (SCFW) is a low-level system function
> > which runs on a dedicated Cortex-M core to provide power, clock, and
> > resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > (QM, QP), and i.MX8QX (QXP, DX).
> >
> > Cc: Shawn Guo <shawnguo@kernel.org>
> > Cc: Sascha Hauer <kernel@pengutronix.de>
> > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: devicetree at vger.kernel.org
> > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > ---
> > v1->v2:
> >  * remove status
> >  * changed to mu1
> > ---
> >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > ++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> >  create mode 100644
> > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> >
> > diff --git
> > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > new file mode 100644
> > index 0000000..9b7c9fe
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > @@ -0,0 +1,38 @@
> > +NXP i.MX System Controller Firmware (SCFW)
> > +--------------------------------------------------------------------
> > +
> > +The System Controller Firmware (SCFW) is a low-level system function
> > +which runs on a dedicated Cortex-M core to provide power, clock, and
> > +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > +(QM, QP), and i.MX8QX (QXP, DX).
> > +
> > +The AP communicates with the SC using a multi-ported MU module found
> > +in the LSIO subsystem. The current definition of this MU module
> > +provides
> > +5 remote AP connections to the SC to support up to 5 execution
> > +environments (TZ, HV, standard Linux, etc.). The SC side of this MU
> > +module interfaces with the LSIO DSC IP bus. The SC firmware will
> > +communicate with this MU using the MSI bus.
> > +
> > +System Controller Device Node:
> > +=============================
> > +
> > +Required properties:
> > +-------------------
> > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > +	  to make sure not use the one which is conflict with
> > +	  other execution environments. e.g. ATF.
> 
> Use the mailbox binding even if you don't use the mailbox subsystem.
> 

Looks reasonable. Will change it.

BTW as I said before, the current mailbox binding fixed #mbox-cells to be
at least 1 which is not suitable for i.MX SCU MU as it has only one physical
channel.

I will cook a patch to update it to allow #mbox-cells = <0>.

If any issue please let me know.

Regards
Dong Aisheng

> > +
> > +Examples:
> > +--------
> > +lsio_mu1: mu at 5d1c0000 {
> > +	compatible = "fsl,imx8qxp-mu";
> > +	reg = <0x0 0x5d1c0000 0x0 0x10000>;
> > +	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>; };
> > +
> > +scu {
> > +	compatible = "fsl,imx8qxp-scu";
> > +	fsl,mu = <&lsio_mu1>;
> > +};
> > --
> > 2.7.4
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
  2018-06-21  3:38       ` A.s. Dong
@ 2018-06-21  7:37         ` Sascha Hauer
  -1 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-21  7:37 UTC (permalink / raw)
  To: A.s. Dong
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, dl-linux-imx,
	kernel, Fabio Estevam, shawnguo, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:38:30AM +0000, A.s. Dong wrote:
> Hi Rob,
> 
> > -----Original Message-----
> > From: Rob Herring [mailto:robh@kernel.org]
> > Sent: Thursday, June 21, 2018 3:45 AM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: linux-arm-kernel@lists.infradead.org; dongas86@gmail.com;
> > kernel@pengutronix.de; shawnguo@kernel.org; Fabio Estevam
> > <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> > Rutland <mark.rutland@arm.com>; devicetree@vger.kernel.org
> > Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> > 
> > On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > > The System Controller Firmware (SCFW) is a low-level system function
> > > which runs on a dedicated Cortex-M core to provide power, clock, and
> > > resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > > (QM, QP), and i.MX8QX (QXP, DX).
> > >
> > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > Cc: Rob Herring <robh+dt@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: devicetree@vger.kernel.org
> > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > ---
> > > v1->v2:
> > >  * remove status
> > >  * changed to mu1
> > > ---
> > >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > > ++++++++++++++++++++++
> > >  1 file changed, 38 insertions(+)
> > >  create mode 100644
> > > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > >
> > > diff --git
> > > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > new file mode 100644
> > > index 0000000..9b7c9fe
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > @@ -0,0 +1,38 @@
> > > +NXP i.MX System Controller Firmware (SCFW)
> > > +--------------------------------------------------------------------
> > > +
> > > +The System Controller Firmware (SCFW) is a low-level system function
> > > +which runs on a dedicated Cortex-M core to provide power, clock, and
> > > +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > > +(QM, QP), and i.MX8QX (QXP, DX).
> > > +
> > > +The AP communicates with the SC using a multi-ported MU module found
> > > +in the LSIO subsystem. The current definition of this MU module
> > > +provides
> > > +5 remote AP connections to the SC to support up to 5 execution
> > > +environments (TZ, HV, standard Linux, etc.). The SC side of this MU
> > > +module interfaces with the LSIO DSC IP bus. The SC firmware will
> > > +communicate with this MU using the MSI bus.
> > > +
> > > +System Controller Device Node:
> > > +=============================
> > > +
> > > +Required properties:
> > > +-------------------
> > > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > > +	  to make sure not use the one which is conflict with
> > > +	  other execution environments. e.g. ATF.
> > 
> > Use the mailbox binding even if you don't use the mailbox subsystem.
> > 
> 
> Looks reasonable. Will change it.
> 
> BTW as I said before, the current mailbox binding fixed #mbox-cells to be
> at least 1 which is not suitable for i.MX SCU MU as it has only one physical
> channel.

Why is that not suitable? If we use #mbox-cells = 1 we can encode the
channel into the second cell. Furthermore, the SCU mode which uses all
channels in a funny way could be another channel id the mu driver could
use to distinguish the different channel/modes. i.e.

#define MU_CHANNEL_0	0
#define MU_CHANNEL_1    1
#define MU_CHANNEL_2    2
#define MU_CHANNEL_3    3
#define MU_CHANNEL_SCU	4


scu {
       compatible = "fsl,imx8qxp-scu";
       fsl,mu = <&lsio_mu1 MU_CHANNEL_SCU>;
};

This would also allow to the MU-SCU-mode driver to coexist with the
regular MU driver for which Oleksij posted a driver.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
@ 2018-06-21  7:37         ` Sascha Hauer
  0 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-21  7:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 21, 2018 at 03:38:30AM +0000, A.s. Dong wrote:
> Hi Rob,
> 
> > -----Original Message-----
> > From: Rob Herring [mailto:robh at kernel.org]
> > Sent: Thursday, June 21, 2018 3:45 AM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: linux-arm-kernel at lists.infradead.org; dongas86 at gmail.com;
> > kernel at pengutronix.de; shawnguo at kernel.org; Fabio Estevam
> > <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> > Rutland <mark.rutland@arm.com>; devicetree at vger.kernel.org
> > Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> > 
> > On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > > The System Controller Firmware (SCFW) is a low-level system function
> > > which runs on a dedicated Cortex-M core to provide power, clock, and
> > > resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > > (QM, QP), and i.MX8QX (QXP, DX).
> > >
> > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > Cc: Rob Herring <robh+dt@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: devicetree at vger.kernel.org
> > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > ---
> > > v1->v2:
> > >  * remove status
> > >  * changed to mu1
> > > ---
> > >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > > ++++++++++++++++++++++
> > >  1 file changed, 38 insertions(+)
> > >  create mode 100644
> > > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > >
> > > diff --git
> > > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > new file mode 100644
> > > index 0000000..9b7c9fe
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > @@ -0,0 +1,38 @@
> > > +NXP i.MX System Controller Firmware (SCFW)
> > > +--------------------------------------------------------------------
> > > +
> > > +The System Controller Firmware (SCFW) is a low-level system function
> > > +which runs on a dedicated Cortex-M core to provide power, clock, and
> > > +resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
> > > +(QM, QP), and i.MX8QX (QXP, DX).
> > > +
> > > +The AP communicates with the SC using a multi-ported MU module found
> > > +in the LSIO subsystem. The current definition of this MU module
> > > +provides
> > > +5 remote AP connections to the SC to support up to 5 execution
> > > +environments (TZ, HV, standard Linux, etc.). The SC side of this MU
> > > +module interfaces with the LSIO DSC IP bus. The SC firmware will
> > > +communicate with this MU using the MSI bus.
> > > +
> > > +System Controller Device Node:
> > > +=============================
> > > +
> > > +Required properties:
> > > +-------------------
> > > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > > +	  to make sure not use the one which is conflict with
> > > +	  other execution environments. e.g. ATF.
> > 
> > Use the mailbox binding even if you don't use the mailbox subsystem.
> > 
> 
> Looks reasonable. Will change it.
> 
> BTW as I said before, the current mailbox binding fixed #mbox-cells to be
> at least 1 which is not suitable for i.MX SCU MU as it has only one physical
> channel.

Why is that not suitable? If we use #mbox-cells = 1 we can encode the
channel into the second cell. Furthermore, the SCU mode which uses all
channels in a funny way could be another channel id the mu driver could
use to distinguish the different channel/modes. i.e.

#define MU_CHANNEL_0	0
#define MU_CHANNEL_1    1
#define MU_CHANNEL_2    2
#define MU_CHANNEL_3    3
#define MU_CHANNEL_SCU	4


scu {
       compatible = "fsl,imx8qxp-scu";
       fsl,mu = <&lsio_mu1 MU_CHANNEL_SCU>;
};

This would also allow to the MU-SCU-mode driver to coexist with the
regular MU driver for which Oleksij posted a driver.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-20 19:43     ` Rob Herring
@ 2018-06-21  7:46       ` Sascha Hauer
  -1 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-21  7:46 UTC (permalink / raw)
  To: Rob Herring
  Cc: Dong Aisheng, Mark Rutland, dongas86, devicetree, linux-imx,
	kernel, fabio.estevam, shawnguo, linux-arm-kernel

On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > The Messaging Unit module enables two processors within
> > the SoC to communicate and coordinate by passing messages
> > (e.g. data, status and control) through the MU interface.
> > 
> > Cc: Shawn Guo <shawnguo@kernel.org>
> > Cc: Sascha Hauer <kernel@pengutronix.de>
> > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: devicetree@vger.kernel.org
> > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > ---
> > v1->v2:
> >  * typo fixes
> >  * remove status property
> >  * remove imx6&7 compatible string which may be added later for
> >    the generic mailbox binding
> > 
> > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > Instead, they're provided in library calls to gain higher performance.
> 
> Using a binding doesn't mean you have to use an OS's subsystem.
> 
> What needs higher performance? What's the performance difference? Why 
> can't the mailbox framework be improved?

>From what I see the performance is improved by polling the interrupt
registers rather than using interrupts.
I see no reason though why this can't be implemented with the mailbox
framework as is.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-21  7:46       ` Sascha Hauer
  0 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-21  7:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > The Messaging Unit module enables two processors within
> > the SoC to communicate and coordinate by passing messages
> > (e.g. data, status and control) through the MU interface.
> > 
> > Cc: Shawn Guo <shawnguo@kernel.org>
> > Cc: Sascha Hauer <kernel@pengutronix.de>
> > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: devicetree at vger.kernel.org
> > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > ---
> > v1->v2:
> >  * typo fixes
> >  * remove status property
> >  * remove imx6&7 compatible string which may be added later for
> >    the generic mailbox binding
> > 
> > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > Instead, they're provided in library calls to gain higher performance.
> 
> Using a binding doesn't mean you have to use an OS's subsystem.
> 
> What needs higher performance? What's the performance difference? Why 
> can't the mailbox framework be improved?

>From what I see the performance is improved by polling the interrupt
registers rather than using interrupts.
I see no reason though why this can't be implemented with the mailbox
framework as is.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
  2018-06-21  7:37         ` Sascha Hauer
@ 2018-06-21 12:05           ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21 12:05 UTC (permalink / raw)
  To: Sascha Hauer
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, dl-linux-imx,
	kernel, Fabio Estevam, shawnguo, linux-arm-kernel

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer@pengutronix.de]
> Sent: Thursday, June 21, 2018 3:37 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Rob Herring <robh@kernel.org>; Mark Rutland
> <mark.rutland@arm.com>; devicetree@vger.kernel.org;
> dongas86@gmail.com; dl-linux-imx <linux-imx@nxp.com>;
> kernel@pengutronix.de; Fabio Estevam <fabio.estevam@nxp.com>;
> shawnguo@kernel.org; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> 
> On Thu, Jun 21, 2018 at 03:38:30AM +0000, A.s. Dong wrote:
> > Hi Rob,
> >
> > > -----Original Message-----
> > > From: Rob Herring [mailto:robh@kernel.org]
> > > Sent: Thursday, June 21, 2018 3:45 AM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: linux-arm-kernel@lists.infradead.org; dongas86@gmail.com;
> > > kernel@pengutronix.de; shawnguo@kernel.org; Fabio Estevam
> > > <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> > > Rutland <mark.rutland@arm.com>; devicetree@vger.kernel.org
> > > Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding
> > > doc
> > >
> > > On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > > > The System Controller Firmware (SCFW) is a low-level system
> > > > function which runs on a dedicated Cortex-M core to provide power,
> > > > clock, and resource management. It exists on some i.MX8
> > > > processors. e.g. i.MX8QM (QM, QP), and i.MX8QX (QXP, DX).
> > > >
> > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: devicetree@vger.kernel.org
> > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > ---
> > > > v1->v2:
> > > >  * remove status
> > > >  * changed to mu1
> > > > ---
> > > >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > > > ++++++++++++++++++++++
> > > >  1 file changed, 38 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > >
> > > > diff --git
> > > > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > new file mode 100644
> > > > index 0000000..9b7c9fe
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > @@ -0,0 +1,38 @@
> > > > +NXP i.MX System Controller Firmware (SCFW)
> > > > +-----------------------------------------------------------------
> > > > +---
> > > > +
> > > > +The System Controller Firmware (SCFW) is a low-level system
> > > > +function which runs on a dedicated Cortex-M core to provide
> > > > +power, clock, and resource management. It exists on some i.MX8
> > > > +processors. e.g. i.MX8QM (QM, QP), and i.MX8QX (QXP, DX).
> > > > +
> > > > +The AP communicates with the SC using a multi-ported MU module
> > > > +found in the LSIO subsystem. The current definition of this MU
> > > > +module provides
> > > > +5 remote AP connections to the SC to support up to 5 execution
> > > > +environments (TZ, HV, standard Linux, etc.). The SC side of this
> > > > +MU module interfaces with the LSIO DSC IP bus. The SC firmware
> > > > +will communicate with this MU using the MSI bus.
> > > > +
> > > > +System Controller Device Node:
> > > > +=============================
> > > > +
> > > > +Required properties:
> > > > +-------------------
> > > > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > > > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > > > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > > > +	  to make sure not use the one which is conflict with
> > > > +	  other execution environments. e.g. ATF.
> > >
> > > Use the mailbox binding even if you don't use the mailbox subsystem.
> > >
> >
> > Looks reasonable. Will change it.
> >
> > BTW as I said before, the current mailbox binding fixed #mbox-cells to
> > be at least 1 which is not suitable for i.MX SCU MU as it has only one
> > physical channel.
> 
> Why is that not suitable? If we use #mbox-cells = 1 we can encode the
> channel into the second cell. Furthermore, the SCU mode which uses all
> channels in a funny way could be another channel id the mu driver could use
> to distinguish the different channel/modes. i.e.
> 
> #define MU_CHANNEL_0	0
> #define MU_CHANNEL_1    1
> #define MU_CHANNEL_2    2
> #define MU_CHANNEL_3    3
> #define MU_CHANNEL_SCU	4
> 

That's a bit confusing.
The 'channel' here actually are abstract virtual channels implemented by driver,
not hardware. MU itself does not have multi channels. (For example, you can't
grep a 'channel' word in the MU chapter in RM).

Yes, it has 4 Read/Write Data register pairs, but it's not originally designed for
separate channels, they're defined to send multi words.

See:
42.3.3.2 Messaging Examples
The following are messaging examples:
* Passing short messages: Transmit register(s) can be used to pass short messages
from one to four words in length. For example, when a four-word message is desired,
only one of the registers needs to have its corresponding interrupt enable bit set at the
receiver side; the message's first three words are written to the registers whose
interrupt is masked, and the fourth word is written to the other register (which
triggers an interrupt at the receiver side).

Abstracting them into individual channels mean we lose the HW capability to send
multi words during one transfer. However, I'm not object to it if we have no
special requirent of it. But for SCU, we need use the four data read/write register
at the same time which seems Mailbox driver can't support.

And MU_CHANNEL_SCU seems a bit more confuse. It is really one
channel from HW point of view but transfer multi works at one time.

Do you really want us to do that way?

And by greping mbox-cells in device tree, there's indeed someone using 0.
arch/arm/boot/dts/bcm283x.dtsi:145:			#mbox-cells = <0>;

Regards
Dong Aisheng

> 
> scu {
>        compatible = "fsl,imx8qxp-scu";
>        fsl,mu = <&lsio_mu1 MU_CHANNEL_SCU>; };
> 
> This would also allow to the MU-SCU-mode driver to coexist with the regular
> MU driver for which Oleksij posted a driver.
> 
> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7C17a
> 33ebbb3ef437fe76b08d5d749da3c%7C686ea1d3bc2b4c6fa92cd99c5c301635%
> 7C0%7C0%7C636651634553051648&sdata=YqJD%2FOcQvywtUtG6tpzdOxlnM
> tPRZP17ftjYP3gwynU%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
@ 2018-06-21 12:05           ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21 12:05 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer at pengutronix.de]
> Sent: Thursday, June 21, 2018 3:37 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Rob Herring <robh@kernel.org>; Mark Rutland
> <mark.rutland@arm.com>; devicetree at vger.kernel.org;
> dongas86 at gmail.com; dl-linux-imx <linux-imx@nxp.com>;
> kernel at pengutronix.de; Fabio Estevam <fabio.estevam@nxp.com>;
> shawnguo at kernel.org; linux-arm-kernel at lists.infradead.org
> Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding doc
> 
> On Thu, Jun 21, 2018 at 03:38:30AM +0000, A.s. Dong wrote:
> > Hi Rob,
> >
> > > -----Original Message-----
> > > From: Rob Herring [mailto:robh at kernel.org]
> > > Sent: Thursday, June 21, 2018 3:45 AM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: linux-arm-kernel at lists.infradead.org; dongas86 at gmail.com;
> > > kernel at pengutronix.de; shawnguo at kernel.org; Fabio Estevam
> > > <fabio.estevam@nxp.com>; dl-linux-imx <linux-imx@nxp.com>; Mark
> > > Rutland <mark.rutland@arm.com>; devicetree at vger.kernel.org
> > > Subject: Re: [PATCH V2 3/4] dt-bindings: arm: fsl: add scu binding
> > > doc
> > >
> > > On Sun, Jun 17, 2018 at 08:49:48PM +0800, Dong Aisheng wrote:
> > > > The System Controller Firmware (SCFW) is a low-level system
> > > > function which runs on a dedicated Cortex-M core to provide power,
> > > > clock, and resource management. It exists on some i.MX8
> > > > processors. e.g. i.MX8QM (QM, QP), and i.MX8QX (QXP, DX).
> > > >
> > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: devicetree at vger.kernel.org
> > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > ---
> > > > v1->v2:
> > > >  * remove status
> > > >  * changed to mu1
> > > > ---
> > > >  .../devicetree/bindings/arm/freescale/fsl,scu.txt  | 38
> > > > ++++++++++++++++++++++
> > > >  1 file changed, 38 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > >
> > > > diff --git
> > > > a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > new file mode 100644
> > > > index 0000000..9b7c9fe
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> > > > @@ -0,0 +1,38 @@
> > > > +NXP i.MX System Controller Firmware (SCFW)
> > > > +-----------------------------------------------------------------
> > > > +---
> > > > +
> > > > +The System Controller Firmware (SCFW) is a low-level system
> > > > +function which runs on a dedicated Cortex-M core to provide
> > > > +power, clock, and resource management. It exists on some i.MX8
> > > > +processors. e.g. i.MX8QM (QM, QP), and i.MX8QX (QXP, DX).
> > > > +
> > > > +The AP communicates with the SC using a multi-ported MU module
> > > > +found in the LSIO subsystem. The current definition of this MU
> > > > +module provides
> > > > +5 remote AP connections to the SC to support up to 5 execution
> > > > +environments (TZ, HV, standard Linux, etc.). The SC side of this
> > > > +MU module interfaces with the LSIO DSC IP bus. The SC firmware
> > > > +will communicate with this MU using the MSI bus.
> > > > +
> > > > +System Controller Device Node:
> > > > +=============================
> > > > +
> > > > +Required properties:
> > > > +-------------------
> > > > +- compatible: should be "fsl,imx8qxp-scu" or "fsl,imx8qm-scu"
> > > > +- fsl,mu: a phandle to the Message Unit used by SCU. Should be
> > > > +	  one of LSIO MU0~M4 for imx8qxp and imx8qm. Users need
> > > > +	  to make sure not use the one which is conflict with
> > > > +	  other execution environments. e.g. ATF.
> > >
> > > Use the mailbox binding even if you don't use the mailbox subsystem.
> > >
> >
> > Looks reasonable. Will change it.
> >
> > BTW as I said before, the current mailbox binding fixed #mbox-cells to
> > be at least 1 which is not suitable for i.MX SCU MU as it has only one
> > physical channel.
> 
> Why is that not suitable? If we use #mbox-cells = 1 we can encode the
> channel into the second cell. Furthermore, the SCU mode which uses all
> channels in a funny way could be another channel id the mu driver could use
> to distinguish the different channel/modes. i.e.
> 
> #define MU_CHANNEL_0	0
> #define MU_CHANNEL_1    1
> #define MU_CHANNEL_2    2
> #define MU_CHANNEL_3    3
> #define MU_CHANNEL_SCU	4
> 

That's a bit confusing.
The 'channel' here actually are abstract virtual channels implemented by driver,
not hardware. MU itself does not have multi channels. (For example, you can't
grep a 'channel' word in the MU chapter in RM).

Yes, it has 4 Read/Write Data register pairs, but it's not originally designed for
separate channels, they're defined to send multi words.

See:
42.3.3.2 Messaging Examples
The following are messaging examples:
* Passing short messages: Transmit register(s) can be used to pass short messages
from one to four words in length. For example, when a four-word message is desired,
only one of the registers needs to have its corresponding interrupt enable bit set at the
receiver side; the message's first three words are written to the registers whose
interrupt is masked, and the fourth word is written to the other register (which
triggers an interrupt at the receiver side).

Abstracting them into individual channels mean we lose the HW capability to send
multi words during one transfer. However, I'm not object to it if we have no
special requirent of it. But for SCU, we need use the four data read/write register
at the same time which seems Mailbox driver can't support.

And MU_CHANNEL_SCU seems a bit more confuse. It is really one
channel from HW point of view but transfer multi works at one time.

Do you really want us to do that way?

And by greping mbox-cells in device tree, there's indeed someone using 0.
arch/arm/boot/dts/bcm283x.dtsi:145:			#mbox-cells = <0>;

Regards
Dong Aisheng

> 
> scu {
>        compatible = "fsl,imx8qxp-scu";
>        fsl,mu = <&lsio_mu1 MU_CHANNEL_SCU>; };
> 
> This would also allow to the MU-SCU-mode driver to coexist with the regular
> MU driver for which Oleksij posted a driver.
> 
> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7C17a
> 33ebbb3ef437fe76b08d5d749da3c%7C686ea1d3bc2b4c6fa92cd99c5c301635%
> 7C0%7C0%7C636651634553051648&sdata=YqJD%2FOcQvywtUtG6tpzdOxlnM
> tPRZP17ftjYP3gwynU%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-21  7:46       ` Sascha Hauer
@ 2018-06-21 17:11         ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21 17:11 UTC (permalink / raw)
  To: Sascha Hauer, Rob Herring
  Cc: Mark Rutland, devicetree, dongas86, dl-linux-imx, kernel,
	Fabio Estevam, shawnguo, linux-arm-kernel

Hi Sascha,

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer@pengutronix.de]
> Sent: Thursday, June 21, 2018 3:47 PM
> To: Rob Herring <robh@kernel.org>
> Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> kernel@lists.infradead.org; dongas86@gmail.com; kernel@pengutronix.de;
> shawnguo@kernel.org; Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-
> imx <linux-imx@nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> devicetree@vger.kernel.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > The Messaging Unit module enables two processors within the SoC to
> > > communicate and coordinate by passing messages (e.g. data, status
> > > and control) through the MU interface.
> > >
> > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > Cc: Rob Herring <robh+dt@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: devicetree@vger.kernel.org
> > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > ---
> > > v1->v2:
> > >  * typo fixes
> > >  * remove status property
> > >  * remove imx6&7 compatible string which may be added later for
> > >    the generic mailbox binding
> > >
> > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > Instead, they're provided in library calls to gain higher performance.
> >
> > Using a binding doesn't mean you have to use an OS's subsystem.
> >
> > What needs higher performance? What's the performance difference?
> Why
> > can't the mailbox framework be improved?
> 
> From what I see the performance is improved by polling the interrupt
> registers rather than using interrupts.
> I see no reason though why this can't be implemented with the mailbox
> framework as is.
> 

I thought you've agreed to not write generic MU(mailbox) driver for SCU.
https://www.spinics.net/lists/arm-kernel/msg650217.html
But seems it's still not quite certain...

I'd like to explain some more.

1) the interrupt mechanism is not quite suitable for SCU firmware protocol
as the transfer size would be more than 4 words and the response data size
is also undetermined (it's set by SCU firmware side during a response).
So polling mode may be the best way to handle this as MU message
handling usually is quite fast in a few microseconds.

2) It's true that Mailbox framework is well designed and powerful.
But it's not quite suitable for SCU MU as we don't need to use the most
bits of it. Mailbox seems like to be more suitable for a generic MU
mailbox driver used by various clients/servers.  But SCU MU are
quite specific to SCU protocol and can't be used by other clients
(MU 0~4 is fixed for SCU communication in MX8 HW design). 
Even we write a MU Mailbox driver for SCU MU, it's still not a general
one and can't be used by others (e.g. communication with M4).
So I'd believe the current library way is still the best approach for SCU MU
Using. But I'm also okay for another generic MU drivers for other common
communications between A core and M4 side.

3) We actually have tried the MU(Mailbox) internally, it increased a lot
complexity comparing to the current library way and got a booting time
regression issue due to extra delays introduced in handling SCU protocol
in mailbox way.

And finally a nature question to us is: 
What the benefit we can get for SCU MU using a mailbox way?

If we can't find benefits but introduce more complexities then why
we would do that way?

Regards
Dong Aisheng

> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7C8ec
> 827973f6e40d1fe1508d5d74b2471%7C686ea1d3bc2b4c6fa92cd99c5c301635%
> 7C0%7C0%7C636651640089438560&sdata=oRARdjilF4Ve2%2FnsXVhG7fmjXip
> H0HslgDMldbIGgY0%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-21 17:11         ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-21 17:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Sascha,

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer at pengutronix.de]
> Sent: Thursday, June 21, 2018 3:47 PM
> To: Rob Herring <robh@kernel.org>
> Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> kernel at lists.infradead.org; dongas86 at gmail.com; kernel at pengutronix.de;
> shawnguo at kernel.org; Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-
> imx <linux-imx@nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> devicetree at vger.kernel.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > The Messaging Unit module enables two processors within the SoC to
> > > communicate and coordinate by passing messages (e.g. data, status
> > > and control) through the MU interface.
> > >
> > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > Cc: Rob Herring <robh+dt@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: devicetree at vger.kernel.org
> > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > ---
> > > v1->v2:
> > >  * typo fixes
> > >  * remove status property
> > >  * remove imx6&7 compatible string which may be added later for
> > >    the generic mailbox binding
> > >
> > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > Instead, they're provided in library calls to gain higher performance.
> >
> > Using a binding doesn't mean you have to use an OS's subsystem.
> >
> > What needs higher performance? What's the performance difference?
> Why
> > can't the mailbox framework be improved?
> 
> From what I see the performance is improved by polling the interrupt
> registers rather than using interrupts.
> I see no reason though why this can't be implemented with the mailbox
> framework as is.
> 

I thought you've agreed to not write generic MU(mailbox) driver for SCU.
https://www.spinics.net/lists/arm-kernel/msg650217.html
But seems it's still not quite certain...

I'd like to explain some more.

1) the interrupt mechanism is not quite suitable for SCU firmware protocol
as the transfer size would be more than 4 words and the response data size
is also undetermined (it's set by SCU firmware side during a response).
So polling mode may be the best way to handle this as MU message
handling usually is quite fast in a few microseconds.

2) It's true that Mailbox framework is well designed and powerful.
But it's not quite suitable for SCU MU as we don't need to use the most
bits of it. Mailbox seems like to be more suitable for a generic MU
mailbox driver used by various clients/servers.  But SCU MU are
quite specific to SCU protocol and can't be used by other clients
(MU 0~4 is fixed for SCU communication in MX8 HW design). 
Even we write a MU Mailbox driver for SCU MU, it's still not a general
one and can't be used by others (e.g. communication with M4).
So I'd believe the current library way is still the best approach for SCU MU
Using. But I'm also okay for another generic MU drivers for other common
communications between A core and M4 side.

3) We actually have tried the MU(Mailbox) internally, it increased a lot
complexity comparing to the current library way and got a booting time
regression issue due to extra delays introduced in handling SCU protocol
in mailbox way.

And finally a nature question to us is: 
What the benefit we can get for SCU MU using a mailbox way?

If we can't find benefits but introduce more complexities then why
we would do that way?

Regards
Dong Aisheng

> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7C8ec
> 827973f6e40d1fe1508d5d74b2471%7C686ea1d3bc2b4c6fa92cd99c5c301635%
> 7C0%7C0%7C636651640089438560&sdata=oRARdjilF4Ve2%2FnsXVhG7fmjXip
> H0HslgDMldbIGgY0%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-21 17:11         ` A.s. Dong
@ 2018-06-21 18:08           ` Oleksij Rempel
  -1 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-21 18:08 UTC (permalink / raw)
  To: A.s. Dong
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 6535 bytes --]

Hi,

On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> Hi Sascha,
> > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > The Messaging Unit module enables two processors within the SoC to
> > > > communicate and coordinate by passing messages (e.g. data, status
> > > > and control) through the MU interface.
> > > >
> > > > ---
> > > > v1->v2:
> > > >  * typo fixes
> > > >  * remove status property
> > > >  * remove imx6&7 compatible string which may be added later for
> > > >    the generic mailbox binding
> > > >
> > > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > > Instead, they're provided in library calls to gain higher performance.
> > >
> > > Using a binding doesn't mean you have to use an OS's subsystem.
> > >
> > > What needs higher performance? What's the performance difference?
> > Why
> > > can't the mailbox framework be improved?
> > 
> > From what I see the performance is improved by polling the interrupt
> > registers rather than using interrupts.
> > I see no reason though why this can't be implemented with the mailbox
> > framework as is.
> > 
> 
> I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> https://www.spinics.net/lists/arm-kernel/msg650217.html
> But seems it's still not quite certain...
> 
> I'd like to explain some more.
> 
> 1) the interrupt mechanism is not quite suitable for SCU firmware protocol
> as the transfer size would be more than 4 words and the response data size
> is also undetermined (it's set by SCU firmware side during a response).
> So polling mode may be the best way to handle this as MU message
> handling usually is quite fast in a few microseconds.
> 
> 2) It's true that Mailbox framework is well designed and powerful.
> But it's not quite suitable for SCU MU as we don't need to use the most
> bits of it. Mailbox seems like to be more suitable for a generic MU
> mailbox driver used by various clients/servers.  But SCU MU are
> quite specific to SCU protocol and can't be used by other clients
> (MU 0~4 is fixed for SCU communication in MX8 HW design). 
> Even we write a MU Mailbox driver for SCU MU, it's still not a general
> one and can't be used by others (e.g. communication with M4).
> So I'd believe the current library way is still the best approach for SCU MU
> Using. But I'm also okay for another generic MU drivers for other common
> communications between A core and M4 side.
> 
> 3) We actually have tried the MU(Mailbox) internally, it increased a lot
> complexity comparing to the current library way and got a booting time
> regression issue due to extra delays introduced in handling SCU protocol
> in mailbox way.
> 
> And finally a nature question to us is: 
> What the benefit we can get for SCU MU using a mailbox way?
> 
> If we can't find benefits but introduce more complexities then why
> we would do that way?

Looks like my response to same topic within my patch set is lost, so I
repost it here:

ok.. let's take some of IMX8 SCU driver code to see if there any difference:

..this part of the code is blocking write procedure for one
channeler (register or what ever name you prefer) per write.. correct?

+void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
+{
+	uint32_t mask = MU_SR_TE0_MASK >> index;
+
+	/* Wait TX register to be empty. */
+	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
+		;
+	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4));
+}
+EXPORT_SYMBOL_GPL(mu_send_msg);

According to documentation it is recommended to use only one status bit
for the last register to use MU as one big 4words sized pipe.
But, there is no way you can write to all 4 registers without checking status
for each of this register, because your protocol has dynamic message
size. So you are forced to use your one channel as 4 separate channels.
Write it part of the message separately and allow your firmware read 1
word to understand how to behave on the rest of the message.

+static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data)
+{
+	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
+	uint8_t count = 0;
+
+	/* Check size */
+	if (msg->size > SC_RPC_MAX_MSG)
+		return;
+
+	/* Write first word */
+	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
+	count++;

.. in this loop you are writing to one channel/register per loop. If
the communicate will stall for some reason, the linux system will just
freeze here without any timeout or error message.. no idea how about the
opposite site.

+	/* Write remaining words */
+	while (count < msg->size) {
+		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
+			    msg->DATA.u32[count - 1]);
+		count++;
+	}
+}


... and here is a proof that sc_ipc_write will do in some cases 5 rounds
(5 * 4 bytes = 20 bytes single message) with probable busy waiting for
each channel

+sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
+				 uint32_t addr_dst, uint32_t len, bool fw)
+{
+	sc_rpc_msg_t msg;
+	uint8_t result;
+
+	RPC_VER(&msg) = SC_RPC_VERSION;
+	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
+	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
+	RPC_U32(&msg, 0) = addr_src;
+	RPC_U32(&msg, 4) = addr_dst;
+	RPC_U32(&msg, 8) = len;
+	RPC_U8(&msg, 12) = (uint8_t)fw;
+	RPC_SIZE(&msg) = 5;
+
+	sc_call_rpc(ipc, &msg, false);
+
+	result = RPC_R8(&msg);
+	return (sc_err_t)result;
+}
+

So, the same code with mailbox framework will be some thing like this:
1. request all 4 channels in the probe. ignore completion callback and
set proper timeout.

2. keep your old code by replacing this part.
	/* Write remaining words */
	while (count < msg->size) {
		mbox_send_message(sc_ipc->mbox_chan[count % MU_TR_COUNT],
msg->DATA.u32[count - 1]);
		count++;
	}

The advantage of this variant. If SCU firmware will stall, the linux
will be able to notify about it without blocking complete system.

Can you please provide (if possible) your old mailbox based
implementation. I'm curious to see why it is slow.

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-21 18:08           ` Oleksij Rempel
  0 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-21 18:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> Hi Sascha,
> > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > The Messaging Unit module enables two processors within the SoC to
> > > > communicate and coordinate by passing messages (e.g. data, status
> > > > and control) through the MU interface.
> > > >
> > > > ---
> > > > v1->v2:
> > > >  * typo fixes
> > > >  * remove status property
> > > >  * remove imx6&7 compatible string which may be added later for
> > > >    the generic mailbox binding
> > > >
> > > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > > Instead, they're provided in library calls to gain higher performance.
> > >
> > > Using a binding doesn't mean you have to use an OS's subsystem.
> > >
> > > What needs higher performance? What's the performance difference?
> > Why
> > > can't the mailbox framework be improved?
> > 
> > From what I see the performance is improved by polling the interrupt
> > registers rather than using interrupts.
> > I see no reason though why this can't be implemented with the mailbox
> > framework as is.
> > 
> 
> I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> https://www.spinics.net/lists/arm-kernel/msg650217.html
> But seems it's still not quite certain...
> 
> I'd like to explain some more.
> 
> 1) the interrupt mechanism is not quite suitable for SCU firmware protocol
> as the transfer size would be more than 4 words and the response data size
> is also undetermined (it's set by SCU firmware side during a response).
> So polling mode may be the best way to handle this as MU message
> handling usually is quite fast in a few microseconds.
> 
> 2) It's true that Mailbox framework is well designed and powerful.
> But it's not quite suitable for SCU MU as we don't need to use the most
> bits of it. Mailbox seems like to be more suitable for a generic MU
> mailbox driver used by various clients/servers.  But SCU MU are
> quite specific to SCU protocol and can't be used by other clients
> (MU 0~4 is fixed for SCU communication in MX8 HW design). 
> Even we write a MU Mailbox driver for SCU MU, it's still not a general
> one and can't be used by others (e.g. communication with M4).
> So I'd believe the current library way is still the best approach for SCU MU
> Using. But I'm also okay for another generic MU drivers for other common
> communications between A core and M4 side.
> 
> 3) We actually have tried the MU(Mailbox) internally, it increased a lot
> complexity comparing to the current library way and got a booting time
> regression issue due to extra delays introduced in handling SCU protocol
> in mailbox way.
> 
> And finally a nature question to us is: 
> What the benefit we can get for SCU MU using a mailbox way?
> 
> If we can't find benefits but introduce more complexities then why
> we would do that way?

Looks like my response to same topic within my patch set is lost, so I
repost it here:

ok.. let's take some of IMX8 SCU driver code to see if there any difference:

..this part of the code is blocking write procedure for one
channeler (register or what ever name you prefer) per write.. correct?

+void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
+{
+	uint32_t mask = MU_SR_TE0_MASK >> index;
+
+	/* Wait TX register to be empty. */
+	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
+		;
+	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4));
+}
+EXPORT_SYMBOL_GPL(mu_send_msg);

According to documentation it is recommended to use only one status bit
for the last register to use MU as one big 4words sized pipe.
But, there is no way you can write to all 4 registers without checking status
for each of this register, because your protocol has dynamic message
size. So you are forced to use your one channel as 4 separate channels.
Write it part of the message separately and allow your firmware read 1
word to understand how to behave on the rest of the message.

+static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data)
+{
+	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
+	uint8_t count = 0;
+
+	/* Check size */
+	if (msg->size > SC_RPC_MAX_MSG)
+		return;
+
+	/* Write first word */
+	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
+	count++;

.. in this loop you are writing to one channel/register per loop. If
the communicate will stall for some reason, the linux system will just
freeze here without any timeout or error message.. no idea how about the
opposite site.

+	/* Write remaining words */
+	while (count < msg->size) {
+		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
+			    msg->DATA.u32[count - 1]);
+		count++;
+	}
+}


... and here is a proof that sc_ipc_write will do in some cases 5 rounds
(5 * 4 bytes = 20 bytes single message) with probable busy waiting for
each channel

+sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
+				 uint32_t addr_dst, uint32_t len, bool fw)
+{
+	sc_rpc_msg_t msg;
+	uint8_t result;
+
+	RPC_VER(&msg) = SC_RPC_VERSION;
+	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
+	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
+	RPC_U32(&msg, 0) = addr_src;
+	RPC_U32(&msg, 4) = addr_dst;
+	RPC_U32(&msg, 8) = len;
+	RPC_U8(&msg, 12) = (uint8_t)fw;
+	RPC_SIZE(&msg) = 5;
+
+	sc_call_rpc(ipc, &msg, false);
+
+	result = RPC_R8(&msg);
+	return (sc_err_t)result;
+}
+

So, the same code with mailbox framework will be some thing like this:
1. request all 4 channels in the probe. ignore completion callback and
set proper timeout.

2. keep your old code by replacing this part.
	/* Write remaining words */
	while (count < msg->size) {
		mbox_send_message(sc_ipc->mbox_chan[count % MU_TR_COUNT],
msg->DATA.u32[count - 1]);
		count++;
	}

The advantage of this variant. If SCU firmware will stall, the linux
will be able to notify about it without blocking complete system.

Can you please provide (if possible) your old mailbox based
implementation. I'm curious to see why it is slow.

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180621/d33cff2c/attachment.sig>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-21 18:08           ` Oleksij Rempel
@ 2018-06-22  3:31             ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  3:31 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> Sent: Friday, June 22, 2018 2:09 AM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> kernel@lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> Hi,
> 
> On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > Hi Sascha,
> > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > The Messaging Unit module enables two processors within the SoC
> > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > status and control) through the MU interface.
> > > > >
> > > > > ---
> > > > > v1->v2:
> > > > >  * typo fixes
> > > > >  * remove status property
> > > > >  * remove imx6&7 compatible string which may be added later for
> > > > >    the generic mailbox binding
> > > > >
> > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > driver, Instead, they're provided in library calls to gain higher
> performance.
> > > >
> > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > >
> > > > What needs higher performance? What's the performance difference?
> > > Why
> > > > can't the mailbox framework be improved?
> > >
> > > From what I see the performance is improved by polling the interrupt
> > > registers rather than using interrupts.
> > > I see no reason though why this can't be implemented with the
> > > mailbox framework as is.
> > >
> >
> > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > But seems it's still not quite certain...
> >
> > I'd like to explain some more.
> >
> > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > protocol as the transfer size would be more than 4 words and the
> > response data size is also undetermined (it's set by SCU firmware side
> during a response).
> > So polling mode may be the best way to handle this as MU message
> > handling usually is quite fast in a few microseconds.
> >
> > 2) It's true that Mailbox framework is well designed and powerful.
> > But it's not quite suitable for SCU MU as we don't need to use the
> > most bits of it. Mailbox seems like to be more suitable for a generic
> > MU mailbox driver used by various clients/servers.  But SCU MU are
> > quite specific to SCU protocol and can't be used by other clients (MU
> > 0~4 is fixed for SCU communication in MX8 HW design).
> > Even we write a MU Mailbox driver for SCU MU, it's still not a general
> > one and can't be used by others (e.g. communication with M4).
> > So I'd believe the current library way is still the best approach for
> > SCU MU Using. But I'm also okay for another generic MU drivers for
> > other common communications between A core and M4 side.
> >
> > 3) We actually have tried the MU(Mailbox) internally, it increased a
> > lot complexity comparing to the current library way and got a booting
> > time regression issue due to extra delays introduced in handling SCU
> > protocol in mailbox way.
> >
> > And finally a nature question to us is:
> > What the benefit we can get for SCU MU using a mailbox way?
> >
> > If we can't find benefits but introduce more complexities then why we
> > would do that way?
> 
> Looks like my response to same topic within my patch set is lost, so I repost it
> here:
> 
> ok.. let's take some of IMX8 SCU driver code to see if there any difference:
> 
> ..this part of the code is blocking write procedure for one channeler (register
> or what ever name you prefer) per write.. correct?
> 
> +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg) {
> +	uint32_t mask = MU_SR_TE0_MASK >> index;
> +
> +	/* Wait TX register to be empty. */
> +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> +		;
> +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> +EXPORT_SYMBOL_GPL(mu_send_msg);
> 
> According to documentation it is recommended to use only one status bit for
> the last register to use MU as one big 4words sized pipe.
> But, there is no way you can write to all 4 registers without checking status
> for each of this register, because your protocol has dynamic message size. So
> you are forced to use your one channel as 4 separate channels.
> Write it part of the message separately and allow your firmware read 1 word
> to understand how to behave on the rest of the message.
> 
> +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> +	uint8_t count = 0;
> +
> +	/* Check size */
> +	if (msg->size > SC_RPC_MAX_MSG)
> +		return;
> +
> +	/* Write first word */
> +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> +	count++;
> 
> .. in this loop you are writing to one channel/register per loop. If the
> communicate will stall for some reason, the linux system will just freeze here
> without any timeout or error message.. no idea how about the opposite site.
> 
> +	/* Write remaining words */
> +	while (count < msg->size) {
> +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> +			    msg->DATA.u32[count - 1]);
> +		count++;
> +	}
> +}
> 
> 
> ... and here is a proof that sc_ipc_write will do in some cases 5 rounds
> (5 * 4 bytes = 20 bytes single message) with probable busy waiting for each
> channel
> 
> +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> +				 uint32_t addr_dst, uint32_t len, bool fw) {
> +	sc_rpc_msg_t msg;
> +	uint8_t result;
> +
> +	RPC_VER(&msg) = SC_RPC_VERSION;
> +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> +	RPC_U32(&msg, 0) = addr_src;
> +	RPC_U32(&msg, 4) = addr_dst;
> +	RPC_U32(&msg, 8) = len;
> +	RPC_U8(&msg, 12) = (uint8_t)fw;
> +	RPC_SIZE(&msg) = 5;
> +
> +	sc_call_rpc(ipc, &msg, false);
> +
> +	result = RPC_R8(&msg);
> +	return (sc_err_t)result;
> +}
> +
> 
> So, the same code with mailbox framework will be some thing like this:
> 1. request all 4 channels in the probe. ignore completion callback and set
> proper timeout.
> 

That looks a bit strange. As I said in another email, MU physical does not have multi
Channels (here you mean are virtual channels).  And one whole MU instance is
Exclusively used by SCU, why we need abstract them into 4 channels to use separately
with extra unnecessary resource hold and code path executed.

> 2. keep your old code by replacing this part.
> 	/* Write remaining words */
> 	while (count < msg->size) {
> 		mbox_send_message(sc_ipc->mbox_chan[count %
> MU_TR_COUNT],
> msg->DATA.u32[count - 1]);
> 		count++;
> 	}
> 
> The advantage of this variant. If SCU firmware will stall, the linux will be able
> to notify about it without blocking complete system.
> 

This part of code has been used for a long time and we've never met the stall
Issue which means SCU firmware guarantee it well. But I agree a timeout
mechanism Is better. However,  if only for this reason, we can simply add a
timeout mechanism in MU library function as well, but still is far from a strong
enough reason to switch to a more complexed mailbox one.

> Can you please provide (if possible) your old mailbox based implementation.
> I'm curious to see why it is slow.

In our implementation:
1) One channel per MU
2) Tx using the same way to send msg as sc_ipc_write() in polling mode
3) Rx enables the first word interrupt and polling for the rest of them in
a hrtimer.

The possible extra cost comparing to simple polling way:
1) Extra unnecessary code execution path of mailbox which is not used by SCU MU
2) Interrupt latency
3) Extra memory copy handling RX message separately.
4) Extra schedule of hrtimer polling

Some of them probably could be optimized. However, besides the slow problem,
the extra unnecessary complexity and can't be generic (specific to SCU) are also
important ones. 

And the MU general purpose interrupt feature and general purpose flags feature
may also not supported by mailbox well.

Regards
Dong Aisheng
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  3:31             ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  3:31 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> Sent: Friday, June 22, 2018 2:09 AM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> kernel at lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> Hi,
> 
> On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > Hi Sascha,
> > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > The Messaging Unit module enables two processors within the SoC
> > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > status and control) through the MU interface.
> > > > >
> > > > > ---
> > > > > v1->v2:
> > > > >  * typo fixes
> > > > >  * remove status property
> > > > >  * remove imx6&7 compatible string which may be added later for
> > > > >    the generic mailbox binding
> > > > >
> > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > driver, Instead, they're provided in library calls to gain higher
> performance.
> > > >
> > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > >
> > > > What needs higher performance? What's the performance difference?
> > > Why
> > > > can't the mailbox framework be improved?
> > >
> > > From what I see the performance is improved by polling the interrupt
> > > registers rather than using interrupts.
> > > I see no reason though why this can't be implemented with the
> > > mailbox framework as is.
> > >
> >
> > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > But seems it's still not quite certain...
> >
> > I'd like to explain some more.
> >
> > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > protocol as the transfer size would be more than 4 words and the
> > response data size is also undetermined (it's set by SCU firmware side
> during a response).
> > So polling mode may be the best way to handle this as MU message
> > handling usually is quite fast in a few microseconds.
> >
> > 2) It's true that Mailbox framework is well designed and powerful.
> > But it's not quite suitable for SCU MU as we don't need to use the
> > most bits of it. Mailbox seems like to be more suitable for a generic
> > MU mailbox driver used by various clients/servers.  But SCU MU are
> > quite specific to SCU protocol and can't be used by other clients (MU
> > 0~4 is fixed for SCU communication in MX8 HW design).
> > Even we write a MU Mailbox driver for SCU MU, it's still not a general
> > one and can't be used by others (e.g. communication with M4).
> > So I'd believe the current library way is still the best approach for
> > SCU MU Using. But I'm also okay for another generic MU drivers for
> > other common communications between A core and M4 side.
> >
> > 3) We actually have tried the MU(Mailbox) internally, it increased a
> > lot complexity comparing to the current library way and got a booting
> > time regression issue due to extra delays introduced in handling SCU
> > protocol in mailbox way.
> >
> > And finally a nature question to us is:
> > What the benefit we can get for SCU MU using a mailbox way?
> >
> > If we can't find benefits but introduce more complexities then why we
> > would do that way?
> 
> Looks like my response to same topic within my patch set is lost, so I repost it
> here:
> 
> ok.. let's take some of IMX8 SCU driver code to see if there any difference:
> 
> ..this part of the code is blocking write procedure for one channeler (register
> or what ever name you prefer) per write.. correct?
> 
> +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg) {
> +	uint32_t mask = MU_SR_TE0_MASK >> index;
> +
> +	/* Wait TX register to be empty. */
> +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> +		;
> +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> +EXPORT_SYMBOL_GPL(mu_send_msg);
> 
> According to documentation it is recommended to use only one status bit for
> the last register to use MU as one big 4words sized pipe.
> But, there is no way you can write to all 4 registers without checking status
> for each of this register, because your protocol has dynamic message size. So
> you are forced to use your one channel as 4 separate channels.
> Write it part of the message separately and allow your firmware read 1 word
> to understand how to behave on the rest of the message.
> 
> +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> +	uint8_t count = 0;
> +
> +	/* Check size */
> +	if (msg->size > SC_RPC_MAX_MSG)
> +		return;
> +
> +	/* Write first word */
> +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> +	count++;
> 
> .. in this loop you are writing to one channel/register per loop. If the
> communicate will stall for some reason, the linux system will just freeze here
> without any timeout or error message.. no idea how about the opposite site.
> 
> +	/* Write remaining words */
> +	while (count < msg->size) {
> +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> +			    msg->DATA.u32[count - 1]);
> +		count++;
> +	}
> +}
> 
> 
> ... and here is a proof that sc_ipc_write will do in some cases 5 rounds
> (5 * 4 bytes = 20 bytes single message) with probable busy waiting for each
> channel
> 
> +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> +				 uint32_t addr_dst, uint32_t len, bool fw) {
> +	sc_rpc_msg_t msg;
> +	uint8_t result;
> +
> +	RPC_VER(&msg) = SC_RPC_VERSION;
> +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> +	RPC_U32(&msg, 0) = addr_src;
> +	RPC_U32(&msg, 4) = addr_dst;
> +	RPC_U32(&msg, 8) = len;
> +	RPC_U8(&msg, 12) = (uint8_t)fw;
> +	RPC_SIZE(&msg) = 5;
> +
> +	sc_call_rpc(ipc, &msg, false);
> +
> +	result = RPC_R8(&msg);
> +	return (sc_err_t)result;
> +}
> +
> 
> So, the same code with mailbox framework will be some thing like this:
> 1. request all 4 channels in the probe. ignore completion callback and set
> proper timeout.
> 

That looks a bit strange. As I said in another email, MU physical does not have multi
Channels (here you mean are virtual channels).  And one whole MU instance is
Exclusively used by SCU, why we need abstract them into 4 channels to use separately
with extra unnecessary resource hold and code path executed.

> 2. keep your old code by replacing this part.
> 	/* Write remaining words */
> 	while (count < msg->size) {
> 		mbox_send_message(sc_ipc->mbox_chan[count %
> MU_TR_COUNT],
> msg->DATA.u32[count - 1]);
> 		count++;
> 	}
> 
> The advantage of this variant. If SCU firmware will stall, the linux will be able
> to notify about it without blocking complete system.
> 

This part of code has been used for a long time and we've never met the stall
Issue which means SCU firmware guarantee it well. But I agree a timeout
mechanism Is better. However,  if only for this reason, we can simply add a
timeout mechanism in MU library function as well, but still is far from a strong
enough reason to switch to a more complexed mailbox one.

> Can you please provide (if possible) your old mailbox based implementation.
> I'm curious to see why it is slow.

In our implementation:
1) One channel per MU
2) Tx using the same way to send msg as sc_ipc_write() in polling mode
3) Rx enables the first word interrupt and polling for the rest of them in
a hrtimer.

The possible extra cost comparing to simple polling way:
1) Extra unnecessary code execution path of mailbox which is not used by SCU MU
2) Interrupt latency
3) Extra memory copy handling RX message separately.
4) Extra schedule of hrtimer polling

Some of them probably could be optimized. However, besides the slow problem,
the extra unnecessary complexity and can't be generic (specific to SCU) are also
important ones. 

And the MU general purpose interrupt feature and general purpose flags feature
may also not supported by mailbox well.

Regards
Dong Aisheng
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-22  3:31             ` A.s. Dong
@ 2018-06-22  4:59               ` Oleksij Rempel
  -1 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-22  4:59 UTC (permalink / raw)
  To: A.s. Dong
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 10552 bytes --]

On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > -----Original Message-----
> > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > Sent: Friday, June 22, 2018 2:09 AM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> > imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > kernel@lists.infradead.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > Hi,
> > 
> > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > Hi Sascha,
> > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > The Messaging Unit module enables two processors within the SoC
> > > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > > status and control) through the MU interface.
> > > > > >
> > > > > > ---
> > > > > > v1->v2:
> > > > > >  * typo fixes
> > > > > >  * remove status property
> > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > >    the generic mailbox binding
> > > > > >
> > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > driver, Instead, they're provided in library calls to gain higher
> > performance.
> > > > >
> > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > >
> > > > > What needs higher performance? What's the performance difference?
> > > > Why
> > > > > can't the mailbox framework be improved?
> > > >
> > > > From what I see the performance is improved by polling the interrupt
> > > > registers rather than using interrupts.
> > > > I see no reason though why this can't be implemented with the
> > > > mailbox framework as is.
> > > >
> > >
> > > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > But seems it's still not quite certain...
> > >
> > > I'd like to explain some more.
> > >
> > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > protocol as the transfer size would be more than 4 words and the
> > > response data size is also undetermined (it's set by SCU firmware side
> > during a response).
> > > So polling mode may be the best way to handle this as MU message
> > > handling usually is quite fast in a few microseconds.
> > >
> > > 2) It's true that Mailbox framework is well designed and powerful.
> > > But it's not quite suitable for SCU MU as we don't need to use the
> > > most bits of it. Mailbox seems like to be more suitable for a generic
> > > MU mailbox driver used by various clients/servers.  But SCU MU are
> > > quite specific to SCU protocol and can't be used by other clients (MU
> > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > Even we write a MU Mailbox driver for SCU MU, it's still not a general
> > > one and can't be used by others (e.g. communication with M4).
> > > So I'd believe the current library way is still the best approach for
> > > SCU MU Using. But I'm also okay for another generic MU drivers for
> > > other common communications between A core and M4 side.
> > >
> > > 3) We actually have tried the MU(Mailbox) internally, it increased a
> > > lot complexity comparing to the current library way and got a booting
> > > time regression issue due to extra delays introduced in handling SCU
> > > protocol in mailbox way.
> > >
> > > And finally a nature question to us is:
> > > What the benefit we can get for SCU MU using a mailbox way?
> > >
> > > If we can't find benefits but introduce more complexities then why we
> > > would do that way?
> > 
> > Looks like my response to same topic within my patch set is lost, so I repost it
> > here:
> > 
> > ok.. let's take some of IMX8 SCU driver code to see if there any difference:
> > 
> > ..this part of the code is blocking write procedure for one channeler (register
> > or what ever name you prefer) per write.. correct?
> > 
> > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg) {
> > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > +
> > +	/* Wait TX register to be empty. */
> > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > +		;
> > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > 
> > According to documentation it is recommended to use only one status bit for
> > the last register to use MU as one big 4words sized pipe.
> > But, there is no way you can write to all 4 registers without checking status
> > for each of this register, because your protocol has dynamic message size. So
> > you are forced to use your one channel as 4 separate channels.
> > Write it part of the message separately and allow your firmware read 1 word
> > to understand how to behave on the rest of the message.
> > 
> > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > +	uint8_t count = 0;
> > +
> > +	/* Check size */
> > +	if (msg->size > SC_RPC_MAX_MSG)
> > +		return;
> > +
> > +	/* Write first word */
> > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > +	count++;
> > 
> > .. in this loop you are writing to one channel/register per loop. If the
> > communicate will stall for some reason, the linux system will just freeze here
> > without any timeout or error message.. no idea how about the opposite site.
> > 
> > +	/* Write remaining words */
> > +	while (count < msg->size) {
> > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > +			    msg->DATA.u32[count - 1]);
> > +		count++;
> > +	}
> > +}
> > 
> > 
> > ... and here is a proof that sc_ipc_write will do in some cases 5 rounds
> > (5 * 4 bytes = 20 bytes single message) with probable busy waiting for each
> > channel
> > 
> > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > +	sc_rpc_msg_t msg;
> > +	uint8_t result;
> > +
> > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > +	RPC_U32(&msg, 0) = addr_src;
> > +	RPC_U32(&msg, 4) = addr_dst;
> > +	RPC_U32(&msg, 8) = len;
> > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > +	RPC_SIZE(&msg) = 5;
> > +
> > +	sc_call_rpc(ipc, &msg, false);
> > +
> > +	result = RPC_R8(&msg);
> > +	return (sc_err_t)result;
> > +}
> > +
> > 
> > So, the same code with mailbox framework will be some thing like this:
> > 1. request all 4 channels in the probe. ignore completion callback and set
> > proper timeout.
> > 
> 
> That looks a bit strange. As I said in another email, MU physical does not have multi
> Channels (here you mean are virtual channels).  And one whole MU instance is
> Exclusively used by SCU, why we need abstract them into 4 channels to use separately
> with extra unnecessary resource hold and code path executed.

OK, so this MU should configured to use all 4 registers as one channel.
I have nothing against it to do in generic driver.

> > 2. keep your old code by replacing this part.
> > 	/* Write remaining words */
> > 	while (count < msg->size) {
> > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > MU_TR_COUNT],
> > msg->DATA.u32[count - 1]);
> > 		count++;
> > 	}
> > 
> > The advantage of this variant. If SCU firmware will stall, the linux will be able
> > to notify about it without blocking complete system.
> > 
> 
> This part of code has been used for a long time and we've never met the stall
> Issue which means SCU firmware guarantee it well. But I agree a timeout
> mechanism Is better. However,  if only for this reason, we can simply add a
> timeout mechanism in MU library function as well, but still is far from a strong
> enough reason to switch to a more complexed mailbox one.

At this point a have absolutely zero tolerance. At the end I was one on
poor devs hunting similar kind of stalls.
Believe me, what can theoretically with 1% probability happen in the lab,
will end with days/months of bug hunting work and thousands of field
returns.

> > Can you please provide (if possible) your old mailbox based implementation.
> > I'm curious to see why it is slow.
> 
> In our implementation:
> 1) One channel per MU
> 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> 3) Rx enables the first word interrupt and polling for the rest of them in
> a hrtimer.
> 
> The possible extra cost comparing to simple polling way:
> 1) Extra unnecessary code execution path of mailbox which is not used by SCU MU
> 2) Interrupt latency
> 3) Extra memory copy handling RX message separately.
> 4) Extra schedule of hrtimer polling
> 
> Some of them probably could be optimized. However, besides the slow problem,
> the extra unnecessary complexity and can't be generic (specific to SCU) are also
> important ones. 
> 
> And the MU general purpose interrupt feature and general purpose flags feature
> may also not supported by mailbox well.

Ok, I give up.
Can we at least try to provide one device tree binding documentation for that?
At the end, it is one and the same HW IP block spread in one SoC 
(or different SoCs) an be used in different way. How exactly it will
be used is the choice of the developer/customer/user...
even if it will be described withing mailbox section dos not mean you are
forced to use mailbox framework.

Lets imagine worst case scenario and all of MU on i.MX8 are used by
different drivers. In my mailbox binding I suggest to use
<vendor>,<soc>-mu-<mu side>. On i.MX7 there is only one MU, so it looks
like "fsl,imx7s-mu-a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so
it should be enumerated. Probably it will look like: "fsl,imx8x-mu0-a".
Do it make sense for you? Can we find and agreement here? :)

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  4:59               ` Oleksij Rempel
  0 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-22  4:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > -----Original Message-----
> > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > Sent: Friday, June 22, 2018 2:09 AM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> > imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > kernel at lists.infradead.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > Hi,
> > 
> > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > Hi Sascha,
> > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > The Messaging Unit module enables two processors within the SoC
> > > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > > status and control) through the MU interface.
> > > > > >
> > > > > > ---
> > > > > > v1->v2:
> > > > > >  * typo fixes
> > > > > >  * remove status property
> > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > >    the generic mailbox binding
> > > > > >
> > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > driver, Instead, they're provided in library calls to gain higher
> > performance.
> > > > >
> > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > >
> > > > > What needs higher performance? What's the performance difference?
> > > > Why
> > > > > can't the mailbox framework be improved?
> > > >
> > > > From what I see the performance is improved by polling the interrupt
> > > > registers rather than using interrupts.
> > > > I see no reason though why this can't be implemented with the
> > > > mailbox framework as is.
> > > >
> > >
> > > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > But seems it's still not quite certain...
> > >
> > > I'd like to explain some more.
> > >
> > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > protocol as the transfer size would be more than 4 words and the
> > > response data size is also undetermined (it's set by SCU firmware side
> > during a response).
> > > So polling mode may be the best way to handle this as MU message
> > > handling usually is quite fast in a few microseconds.
> > >
> > > 2) It's true that Mailbox framework is well designed and powerful.
> > > But it's not quite suitable for SCU MU as we don't need to use the
> > > most bits of it. Mailbox seems like to be more suitable for a generic
> > > MU mailbox driver used by various clients/servers.  But SCU MU are
> > > quite specific to SCU protocol and can't be used by other clients (MU
> > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > Even we write a MU Mailbox driver for SCU MU, it's still not a general
> > > one and can't be used by others (e.g. communication with M4).
> > > So I'd believe the current library way is still the best approach for
> > > SCU MU Using. But I'm also okay for another generic MU drivers for
> > > other common communications between A core and M4 side.
> > >
> > > 3) We actually have tried the MU(Mailbox) internally, it increased a
> > > lot complexity comparing to the current library way and got a booting
> > > time regression issue due to extra delays introduced in handling SCU
> > > protocol in mailbox way.
> > >
> > > And finally a nature question to us is:
> > > What the benefit we can get for SCU MU using a mailbox way?
> > >
> > > If we can't find benefits but introduce more complexities then why we
> > > would do that way?
> > 
> > Looks like my response to same topic within my patch set is lost, so I repost it
> > here:
> > 
> > ok.. let's take some of IMX8 SCU driver code to see if there any difference:
> > 
> > ..this part of the code is blocking write procedure for one channeler (register
> > or what ever name you prefer) per write.. correct?
> > 
> > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg) {
> > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > +
> > +	/* Wait TX register to be empty. */
> > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > +		;
> > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > 
> > According to documentation it is recommended to use only one status bit for
> > the last register to use MU as one big 4words sized pipe.
> > But, there is no way you can write to all 4 registers without checking status
> > for each of this register, because your protocol has dynamic message size. So
> > you are forced to use your one channel as 4 separate channels.
> > Write it part of the message separately and allow your firmware read 1 word
> > to understand how to behave on the rest of the message.
> > 
> > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > +	uint8_t count = 0;
> > +
> > +	/* Check size */
> > +	if (msg->size > SC_RPC_MAX_MSG)
> > +		return;
> > +
> > +	/* Write first word */
> > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > +	count++;
> > 
> > .. in this loop you are writing to one channel/register per loop. If the
> > communicate will stall for some reason, the linux system will just freeze here
> > without any timeout or error message.. no idea how about the opposite site.
> > 
> > +	/* Write remaining words */
> > +	while (count < msg->size) {
> > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > +			    msg->DATA.u32[count - 1]);
> > +		count++;
> > +	}
> > +}
> > 
> > 
> > ... and here is a proof that sc_ipc_write will do in some cases 5 rounds
> > (5 * 4 bytes = 20 bytes single message) with probable busy waiting for each
> > channel
> > 
> > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > +	sc_rpc_msg_t msg;
> > +	uint8_t result;
> > +
> > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > +	RPC_U32(&msg, 0) = addr_src;
> > +	RPC_U32(&msg, 4) = addr_dst;
> > +	RPC_U32(&msg, 8) = len;
> > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > +	RPC_SIZE(&msg) = 5;
> > +
> > +	sc_call_rpc(ipc, &msg, false);
> > +
> > +	result = RPC_R8(&msg);
> > +	return (sc_err_t)result;
> > +}
> > +
> > 
> > So, the same code with mailbox framework will be some thing like this:
> > 1. request all 4 channels in the probe. ignore completion callback and set
> > proper timeout.
> > 
> 
> That looks a bit strange. As I said in another email, MU physical does not have multi
> Channels (here you mean are virtual channels).  And one whole MU instance is
> Exclusively used by SCU, why we need abstract them into 4 channels to use separately
> with extra unnecessary resource hold and code path executed.

OK, so this MU should configured to use all 4 registers as one channel.
I have nothing against it to do in generic driver.

> > 2. keep your old code by replacing this part.
> > 	/* Write remaining words */
> > 	while (count < msg->size) {
> > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > MU_TR_COUNT],
> > msg->DATA.u32[count - 1]);
> > 		count++;
> > 	}
> > 
> > The advantage of this variant. If SCU firmware will stall, the linux will be able
> > to notify about it without blocking complete system.
> > 
> 
> This part of code has been used for a long time and we've never met the stall
> Issue which means SCU firmware guarantee it well. But I agree a timeout
> mechanism Is better. However,  if only for this reason, we can simply add a
> timeout mechanism in MU library function as well, but still is far from a strong
> enough reason to switch to a more complexed mailbox one.

At this point a have absolutely zero tolerance. At the end I was one on
poor devs hunting similar kind of stalls.
Believe me, what can theoretically with 1% probability happen in the lab,
will end with days/months of bug hunting work and thousands of field
returns.

> > Can you please provide (if possible) your old mailbox based implementation.
> > I'm curious to see why it is slow.
> 
> In our implementation:
> 1) One channel per MU
> 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> 3) Rx enables the first word interrupt and polling for the rest of them in
> a hrtimer.
> 
> The possible extra cost comparing to simple polling way:
> 1) Extra unnecessary code execution path of mailbox which is not used by SCU MU
> 2) Interrupt latency
> 3) Extra memory copy handling RX message separately.
> 4) Extra schedule of hrtimer polling
> 
> Some of them probably could be optimized. However, besides the slow problem,
> the extra unnecessary complexity and can't be generic (specific to SCU) are also
> important ones. 
> 
> And the MU general purpose interrupt feature and general purpose flags feature
> may also not supported by mailbox well.

Ok, I give up.
Can we at least try to provide one device tree binding documentation for that?
At the end, it is one and the same HW IP block spread in one SoC 
(or different SoCs) an be used in different way. How exactly it will
be used is the choice of the developer/customer/user...
even if it will be described withing mailbox section dos not mean you are
forced to use mailbox framework.

Lets imagine worst case scenario and all of MU on i.MX8 are used by
different drivers. In my mailbox binding I suggest to use
<vendor>,<soc>-mu-<mu side>. On i.MX7 there is only one MU, so it looks
like "fsl,imx7s-mu-a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so
it should be enumerated. Probably it will look like: "fsl,imx8x-mu0-a".
Do it make sense for you? Can we find and agreement here? :)

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180622/38c6f9b1/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-21 17:11         ` A.s. Dong
@ 2018-06-22  5:49           ` Sascha Hauer
  -1 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-22  5:49 UTC (permalink / raw)
  To: A.s. Dong
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, dl-linux-imx,
	kernel, Fabio Estevam, shawnguo, linux-arm-kernel

On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> Hi Sascha,
> 
> > -----Original Message-----
> > From: Sascha Hauer [mailto:s.hauer@pengutronix.de]
> > Sent: Thursday, June 21, 2018 3:47 PM
> > To: Rob Herring <robh@kernel.org>
> > Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> > kernel@lists.infradead.org; dongas86@gmail.com; kernel@pengutronix.de;
> > shawnguo@kernel.org; Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-
> > imx <linux-imx@nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree@vger.kernel.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > The Messaging Unit module enables two processors within the SoC to
> > > > communicate and coordinate by passing messages (e.g. data, status
> > > > and control) through the MU interface.
> > > >
> > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: devicetree@vger.kernel.org
> > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > ---
> > > > v1->v2:
> > > >  * typo fixes
> > > >  * remove status property
> > > >  * remove imx6&7 compatible string which may be added later for
> > > >    the generic mailbox binding
> > > >
> > > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > > Instead, they're provided in library calls to gain higher performance.
> > >
> > > Using a binding doesn't mean you have to use an OS's subsystem.
> > >
> > > What needs higher performance? What's the performance difference?
> > Why
> > > can't the mailbox framework be improved?
> > 
> > From what I see the performance is improved by polling the interrupt
> > registers rather than using interrupts.
> > I see no reason though why this can't be implemented with the mailbox
> > framework as is.
> > 
> 
> I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> https://www.spinics.net/lists/arm-kernel/msg650217.html
> But seems it's still not quite certain...

My suggestion was that we change the compatible of the MU unit to
something that the SCU driver matches. But since we do not do that and
instead use a fsl,imx8qxp-mu compatible for the MU we have to come up
with a MU driver that handles both the SCU case and the regular usecase.
Otherwise we'll end up with a generic driver attaching to the device you
are using under the hoods.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  5:49           ` Sascha Hauer
  0 siblings, 0 replies; 37+ messages in thread
From: Sascha Hauer @ 2018-06-22  5:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> Hi Sascha,
> 
> > -----Original Message-----
> > From: Sascha Hauer [mailto:s.hauer at pengutronix.de]
> > Sent: Thursday, June 21, 2018 3:47 PM
> > To: Rob Herring <robh@kernel.org>
> > Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> > kernel at lists.infradead.org; dongas86 at gmail.com; kernel at pengutronix.de;
> > shawnguo at kernel.org; Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-
> > imx <linux-imx@nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree at vger.kernel.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > The Messaging Unit module enables two processors within the SoC to
> > > > communicate and coordinate by passing messages (e.g. data, status
> > > > and control) through the MU interface.
> > > >
> > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: devicetree at vger.kernel.org
> > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > ---
> > > > v1->v2:
> > > >  * typo fixes
> > > >  * remove status property
> > > >  * remove imx6&7 compatible string which may be added later for
> > > >    the generic mailbox binding
> > > >
> > > > Note: Because MU used by SCU is not implemented as a mailbox driver,
> > > > Instead, they're provided in library calls to gain higher performance.
> > >
> > > Using a binding doesn't mean you have to use an OS's subsystem.
> > >
> > > What needs higher performance? What's the performance difference?
> > Why
> > > can't the mailbox framework be improved?
> > 
> > From what I see the performance is improved by polling the interrupt
> > registers rather than using interrupts.
> > I see no reason though why this can't be implemented with the mailbox
> > framework as is.
> > 
> 
> I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> https://www.spinics.net/lists/arm-kernel/msg650217.html
> But seems it's still not quite certain...

My suggestion was that we change the compatible of the MU unit to
something that the SCU driver matches. But since we do not do that and
instead use a fsl,imx8qxp-mu compatible for the MU we have to come up
with a MU driver that handles both the SCU case and the regular usecase.
Otherwise we'll end up with a generic driver attaching to the device you
are using under the hoods.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-22  4:59               ` Oleksij Rempel
@ 2018-06-22  5:59                 ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  5:59 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel

Hi Oleksij,

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> Sent: Friday, June 22, 2018 12:59 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> kernel@lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > -----Original Message-----
> > > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > > Sent: Friday, June 22, 2018 2:09 AM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> > > imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > > kernel@lists.infradead.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > Hi,
> > >
> > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > Hi Sascha,
> > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > The Messaging Unit module enables two processors within the
> > > > > > > SoC to communicate and coordinate by passing messages (e.g.
> > > > > > > data, status and control) through the MU interface.
> > > > > > >
> > > > > > > ---
> > > > > > > v1->v2:
> > > > > > >  * typo fixes
> > > > > > >  * remove status property
> > > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > > >    the generic mailbox binding
> > > > > > >
> > > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > > driver, Instead, they're provided in library calls to gain
> > > > > > > higher
> > > performance.
> > > > > >
> > > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > > >
> > > > > > What needs higher performance? What's the performance
> difference?
> > > > > Why
> > > > > > can't the mailbox framework be improved?
> > > > >
> > > > > From what I see the performance is improved by polling the
> > > > > interrupt registers rather than using interrupts.
> > > > > I see no reason though why this can't be implemented with the
> > > > > mailbox framework as is.
> > > > >
> > > >
> > > > I thought you've agreed to not write generic MU(mailbox) driver for
> SCU.
> > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > But seems it's still not quite certain...
> > > >
> > > > I'd like to explain some more.
> > > >
> > > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > > protocol as the transfer size would be more than 4 words and the
> > > > response data size is also undetermined (it's set by SCU firmware
> > > > side
> > > during a response).
> > > > So polling mode may be the best way to handle this as MU message
> > > > handling usually is quite fast in a few microseconds.
> > > >
> > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > But it's not quite suitable for SCU MU as we don't need to use the
> > > > most bits of it. Mailbox seems like to be more suitable for a
> > > > generic MU mailbox driver used by various clients/servers.  But
> > > > SCU MU are quite specific to SCU protocol and can't be used by
> > > > other clients (MU
> > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > general one and can't be used by others (e.g. communication with M4).
> > > > So I'd believe the current library way is still the best approach
> > > > for SCU MU Using. But I'm also okay for another generic MU drivers
> > > > for other common communications between A core and M4 side.
> > > >
> > > > 3) We actually have tried the MU(Mailbox) internally, it increased
> > > > a lot complexity comparing to the current library way and got a
> > > > booting time regression issue due to extra delays introduced in
> > > > handling SCU protocol in mailbox way.
> > > >
> > > > And finally a nature question to us is:
> > > > What the benefit we can get for SCU MU using a mailbox way?
> > > >
> > > > If we can't find benefits but introduce more complexities then why
> > > > we would do that way?
> > >
> > > Looks like my response to same topic within my patch set is lost, so
> > > I repost it
> > > here:
> > >
> > > ok.. let's take some of IMX8 SCU driver code to see if there any
> difference:
> > >
> > > ..this part of the code is blocking write procedure for one
> > > channeler (register or what ever name you prefer) per write.. correct?
> > >
> > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
> {
> > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > +
> > > +	/* Wait TX register to be empty. */
> > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > +		;
> > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > >
> > > According to documentation it is recommended to use only one status
> > > bit for the last register to use MU as one big 4words sized pipe.
> > > But, there is no way you can write to all 4 registers without
> > > checking status for each of this register, because your protocol has
> > > dynamic message size. So you are forced to use your one channel as 4
> separate channels.
> > > Write it part of the message separately and allow your firmware read
> > > 1 word to understand how to behave on the rest of the message.
> > >
> > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > +	uint8_t count = 0;
> > > +
> > > +	/* Check size */
> > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > +		return;
> > > +
> > > +	/* Write first word */
> > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > +	count++;
> > >
> > > .. in this loop you are writing to one channel/register per loop. If
> > > the communicate will stall for some reason, the linux system will
> > > just freeze here without any timeout or error message.. no idea how
> about the opposite site.
> > >
> > > +	/* Write remaining words */
> > > +	while (count < msg->size) {
> > > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > > +			    msg->DATA.u32[count - 1]);
> > > +		count++;
> > > +	}
> > > +}
> > >
> > >
> > > ... and here is a proof that sc_ipc_write will do in some cases 5
> > > rounds
> > > (5 * 4 bytes = 20 bytes single message) with probable busy waiting
> > > for each channel
> > >
> > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > > +	sc_rpc_msg_t msg;
> > > +	uint8_t result;
> > > +
> > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > +	RPC_U32(&msg, 0) = addr_src;
> > > +	RPC_U32(&msg, 4) = addr_dst;
> > > +	RPC_U32(&msg, 8) = len;
> > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > +	RPC_SIZE(&msg) = 5;
> > > +
> > > +	sc_call_rpc(ipc, &msg, false);
> > > +
> > > +	result = RPC_R8(&msg);
> > > +	return (sc_err_t)result;
> > > +}
> > > +
> > >
> > > So, the same code with mailbox framework will be some thing like this:
> > > 1. request all 4 channels in the probe. ignore completion callback
> > > and set proper timeout.
> > >
> >
> > That looks a bit strange. As I said in another email, MU physical does
> > not have multi Channels (here you mean are virtual channels).  And one
> > whole MU instance is Exclusively used by SCU, why we need abstract
> > them into 4 channels to use separately with extra unnecessary resource
> hold and code path executed.
> 
> OK, so this MU should configured to use all 4 registers as one channel.
> I have nothing against it to do in generic driver.
> 
> > > 2. keep your old code by replacing this part.
> > > 	/* Write remaining words */
> > > 	while (count < msg->size) {
> > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> MU_TR_COUNT],
> > > msg->DATA.u32[count - 1]);
> > > 		count++;
> > > 	}
> > >
> > > The advantage of this variant. If SCU firmware will stall, the linux
> > > will be able to notify about it without blocking complete system.
> > >
> >
> > This part of code has been used for a long time and we've never met
> > the stall Issue which means SCU firmware guarantee it well. But I
> > agree a timeout mechanism Is better. However,  if only for this
> > reason, we can simply add a timeout mechanism in MU library function
> > as well, but still is far from a strong enough reason to switch to a more
> complexed mailbox one.
> 
> At this point a have absolutely zero tolerance. At the end I was one on poor
> devs hunting similar kind of stalls.
> Believe me, what can theoretically with 1% probability happen in the lab, will
> end with days/months of bug hunting work and thousands of field returns.
> 

Yeah, so I think a timeout mechanism is worthy to add.

> > > Can you please provide (if possible) your old mailbox based
> implementation.
> > > I'm curious to see why it is slow.
> >
> > In our implementation:
> > 1) One channel per MU
> > 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> > 3) Rx enables the first word interrupt and polling for the rest of
> > them in a hrtimer.
> >
> > The possible extra cost comparing to simple polling way:
> > 1) Extra unnecessary code execution path of mailbox which is not used
> > by SCU MU
> > 2) Interrupt latency
> > 3) Extra memory copy handling RX message separately.
> > 4) Extra schedule of hrtimer polling
> >
> > Some of them probably could be optimized. However, besides the slow
> > problem, the extra unnecessary complexity and can't be generic
> > (specific to SCU) are also important ones.
> >
> > And the MU general purpose interrupt feature and general purpose flags
> > feature may also not supported by mailbox well.
> 
> Ok, I give up.
> Can we at least try to provide one device tree binding documentation for
> that?

I will update the binding doc to use mailbox way as suggested by Rob.
But I still need to change mbox-cells to allow 0 which actually already exist
In kernel. So we may have too options for MU binding for users.
For SCU MU, mbox-cells = <0>
For General MU, mbox-cells = <1>

> At the end, it is one and the same HW IP block spread in one SoC (or
> different SoCs) an be used in different way. How exactly it will be used is the
> choice of the developer/customer/user...
> even if it will be described withing mailbox section dos not mean you are
> forced to use mailbox framework.
> 
> Lets imagine worst case scenario and all of MU on i.MX8 are used by
> different drivers. In my mailbox binding I suggest to use <vendor>,<soc>-
> mu-<mu side>. On i.MX7 there is only one MU, so it looks like "fsl,imx7s-mu-
> a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so it should be enumerated.
> Probably it will look like: "fsl,imx8x-mu0-a".
> Do it make sense for you? Can we find and agreement here? :)
> 

IMHO one possible concern may be that results in creating two much compatibles
strings. For example, each SoC have two (fsl, <soc>-mu-<a or b>).
Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
mx8x, mx8mq and etc...

My question is that if it's really worthy to create a separate side b compatible
string instread of distinguish it by a property?

And for general purpose MUs to communicate with M4. Both side can be used by
either A core and M core. If we're doing this way, once users wants A core to access
B side, users need to change the compatible string, that seems more like an unusual
way compared to changing properties.

How do you think?

Regards
Dong Aisheng

> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  5:59                 ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  5:59 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Oleksij,

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> Sent: Friday, June 22, 2018 12:59 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> kernel at lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > -----Original Message-----
> > > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > > Sent: Friday, June 22, 2018 2:09 AM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> > > imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > > kernel at lists.infradead.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > Hi,
> > >
> > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > Hi Sascha,
> > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > The Messaging Unit module enables two processors within the
> > > > > > > SoC to communicate and coordinate by passing messages (e.g.
> > > > > > > data, status and control) through the MU interface.
> > > > > > >
> > > > > > > ---
> > > > > > > v1->v2:
> > > > > > >  * typo fixes
> > > > > > >  * remove status property
> > > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > > >    the generic mailbox binding
> > > > > > >
> > > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > > driver, Instead, they're provided in library calls to gain
> > > > > > > higher
> > > performance.
> > > > > >
> > > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > > >
> > > > > > What needs higher performance? What's the performance
> difference?
> > > > > Why
> > > > > > can't the mailbox framework be improved?
> > > > >
> > > > > From what I see the performance is improved by polling the
> > > > > interrupt registers rather than using interrupts.
> > > > > I see no reason though why this can't be implemented with the
> > > > > mailbox framework as is.
> > > > >
> > > >
> > > > I thought you've agreed to not write generic MU(mailbox) driver for
> SCU.
> > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > But seems it's still not quite certain...
> > > >
> > > > I'd like to explain some more.
> > > >
> > > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > > protocol as the transfer size would be more than 4 words and the
> > > > response data size is also undetermined (it's set by SCU firmware
> > > > side
> > > during a response).
> > > > So polling mode may be the best way to handle this as MU message
> > > > handling usually is quite fast in a few microseconds.
> > > >
> > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > But it's not quite suitable for SCU MU as we don't need to use the
> > > > most bits of it. Mailbox seems like to be more suitable for a
> > > > generic MU mailbox driver used by various clients/servers.  But
> > > > SCU MU are quite specific to SCU protocol and can't be used by
> > > > other clients (MU
> > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > general one and can't be used by others (e.g. communication with M4).
> > > > So I'd believe the current library way is still the best approach
> > > > for SCU MU Using. But I'm also okay for another generic MU drivers
> > > > for other common communications between A core and M4 side.
> > > >
> > > > 3) We actually have tried the MU(Mailbox) internally, it increased
> > > > a lot complexity comparing to the current library way and got a
> > > > booting time regression issue due to extra delays introduced in
> > > > handling SCU protocol in mailbox way.
> > > >
> > > > And finally a nature question to us is:
> > > > What the benefit we can get for SCU MU using a mailbox way?
> > > >
> > > > If we can't find benefits but introduce more complexities then why
> > > > we would do that way?
> > >
> > > Looks like my response to same topic within my patch set is lost, so
> > > I repost it
> > > here:
> > >
> > > ok.. let's take some of IMX8 SCU driver code to see if there any
> difference:
> > >
> > > ..this part of the code is blocking write procedure for one
> > > channeler (register or what ever name you prefer) per write.. correct?
> > >
> > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
> {
> > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > +
> > > +	/* Wait TX register to be empty. */
> > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > +		;
> > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > >
> > > According to documentation it is recommended to use only one status
> > > bit for the last register to use MU as one big 4words sized pipe.
> > > But, there is no way you can write to all 4 registers without
> > > checking status for each of this register, because your protocol has
> > > dynamic message size. So you are forced to use your one channel as 4
> separate channels.
> > > Write it part of the message separately and allow your firmware read
> > > 1 word to understand how to behave on the rest of the message.
> > >
> > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > +	uint8_t count = 0;
> > > +
> > > +	/* Check size */
> > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > +		return;
> > > +
> > > +	/* Write first word */
> > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > +	count++;
> > >
> > > .. in this loop you are writing to one channel/register per loop. If
> > > the communicate will stall for some reason, the linux system will
> > > just freeze here without any timeout or error message.. no idea how
> about the opposite site.
> > >
> > > +	/* Write remaining words */
> > > +	while (count < msg->size) {
> > > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > > +			    msg->DATA.u32[count - 1]);
> > > +		count++;
> > > +	}
> > > +}
> > >
> > >
> > > ... and here is a proof that sc_ipc_write will do in some cases 5
> > > rounds
> > > (5 * 4 bytes = 20 bytes single message) with probable busy waiting
> > > for each channel
> > >
> > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > > +	sc_rpc_msg_t msg;
> > > +	uint8_t result;
> > > +
> > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > +	RPC_U32(&msg, 0) = addr_src;
> > > +	RPC_U32(&msg, 4) = addr_dst;
> > > +	RPC_U32(&msg, 8) = len;
> > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > +	RPC_SIZE(&msg) = 5;
> > > +
> > > +	sc_call_rpc(ipc, &msg, false);
> > > +
> > > +	result = RPC_R8(&msg);
> > > +	return (sc_err_t)result;
> > > +}
> > > +
> > >
> > > So, the same code with mailbox framework will be some thing like this:
> > > 1. request all 4 channels in the probe. ignore completion callback
> > > and set proper timeout.
> > >
> >
> > That looks a bit strange. As I said in another email, MU physical does
> > not have multi Channels (here you mean are virtual channels).  And one
> > whole MU instance is Exclusively used by SCU, why we need abstract
> > them into 4 channels to use separately with extra unnecessary resource
> hold and code path executed.
> 
> OK, so this MU should configured to use all 4 registers as one channel.
> I have nothing against it to do in generic driver.
> 
> > > 2. keep your old code by replacing this part.
> > > 	/* Write remaining words */
> > > 	while (count < msg->size) {
> > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> MU_TR_COUNT],
> > > msg->DATA.u32[count - 1]);
> > > 		count++;
> > > 	}
> > >
> > > The advantage of this variant. If SCU firmware will stall, the linux
> > > will be able to notify about it without blocking complete system.
> > >
> >
> > This part of code has been used for a long time and we've never met
> > the stall Issue which means SCU firmware guarantee it well. But I
> > agree a timeout mechanism Is better. However,  if only for this
> > reason, we can simply add a timeout mechanism in MU library function
> > as well, but still is far from a strong enough reason to switch to a more
> complexed mailbox one.
> 
> At this point a have absolutely zero tolerance. At the end I was one on poor
> devs hunting similar kind of stalls.
> Believe me, what can theoretically with 1% probability happen in the lab, will
> end with days/months of bug hunting work and thousands of field returns.
> 

Yeah, so I think a timeout mechanism is worthy to add.

> > > Can you please provide (if possible) your old mailbox based
> implementation.
> > > I'm curious to see why it is slow.
> >
> > In our implementation:
> > 1) One channel per MU
> > 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> > 3) Rx enables the first word interrupt and polling for the rest of
> > them in a hrtimer.
> >
> > The possible extra cost comparing to simple polling way:
> > 1) Extra unnecessary code execution path of mailbox which is not used
> > by SCU MU
> > 2) Interrupt latency
> > 3) Extra memory copy handling RX message separately.
> > 4) Extra schedule of hrtimer polling
> >
> > Some of them probably could be optimized. However, besides the slow
> > problem, the extra unnecessary complexity and can't be generic
> > (specific to SCU) are also important ones.
> >
> > And the MU general purpose interrupt feature and general purpose flags
> > feature may also not supported by mailbox well.
> 
> Ok, I give up.
> Can we at least try to provide one device tree binding documentation for
> that?

I will update the binding doc to use mailbox way as suggested by Rob.
But I still need to change mbox-cells to allow 0 which actually already exist
In kernel. So we may have too options for MU binding for users.
For SCU MU, mbox-cells = <0>
For General MU, mbox-cells = <1>

> At the end, it is one and the same HW IP block spread in one SoC (or
> different SoCs) an be used in different way. How exactly it will be used is the
> choice of the developer/customer/user...
> even if it will be described withing mailbox section dos not mean you are
> forced to use mailbox framework.
> 
> Lets imagine worst case scenario and all of MU on i.MX8 are used by
> different drivers. In my mailbox binding I suggest to use <vendor>,<soc>-
> mu-<mu side>. On i.MX7 there is only one MU, so it looks like "fsl,imx7s-mu-
> a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so it should be enumerated.
> Probably it will look like: "fsl,imx8x-mu0-a".
> Do it make sense for you? Can we find and agreement here? :)
> 

IMHO one possible concern may be that results in creating two much compatibles
strings. For example, each SoC have two (fsl, <soc>-mu-<a or b>).
Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
mx8x, mx8mq and etc...

My question is that if it's really worthy to create a separate side b compatible
string instread of distinguish it by a property?

And for general purpose MUs to communicate with M4. Both side can be used by
either A core and M core. If we're doing this way, once users wants A core to access
B side, users need to change the compatible string, that seems more like an unusual
way compared to changing properties.

How do you think?

Regards
Dong Aisheng

> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-22  5:49           ` Sascha Hauer
@ 2018-06-22  6:04             ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  6:04 UTC (permalink / raw)
  To: Sascha Hauer
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, dl-linux-imx,
	kernel, Fabio Estevam, shawnguo, linux-arm-kernel

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer@pengutronix.de]
> Sent: Friday, June 22, 2018 1:50 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Rob Herring <robh@kernel.org>; linux-arm-kernel@lists.infradead.org;
> dongas86@gmail.com; kernel@pengutronix.de; shawnguo@kernel.org;
> Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-imx <linux-
> imx@nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> devicetree@vger.kernel.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > Hi Sascha,
> >
> > > -----Original Message-----
> > > From: Sascha Hauer [mailto:s.hauer@pengutronix.de]
> > > Sent: Thursday, June 21, 2018 3:47 PM
> > > To: Rob Herring <robh@kernel.org>
> > > Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> > > kernel@lists.infradead.org; dongas86@gmail.com;
> > > kernel@pengutronix.de; shawnguo@kernel.org; Fabio Estevam
> > > <fabio.estevam@nxp.com>; dl-linux- imx <linux-imx@nxp.com>; Mark
> > > Rutland <mark.rutland@arm.com>; devicetree@vger.kernel.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > The Messaging Unit module enables two processors within the SoC
> > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > status and control) through the MU interface.
> > > > >
> > > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > > Cc: devicetree@vger.kernel.org
> > > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > > ---
> > > > > v1->v2:
> > > > >  * typo fixes
> > > > >  * remove status property
> > > > >  * remove imx6&7 compatible string which may be added later for
> > > > >    the generic mailbox binding
> > > > >
> > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > driver, Instead, they're provided in library calls to gain higher
> performance.
> > > >
> > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > >
> > > > What needs higher performance? What's the performance difference?
> > > Why
> > > > can't the mailbox framework be improved?
> > >
> > > From what I see the performance is improved by polling the interrupt
> > > registers rather than using interrupts.
> > > I see no reason though why this can't be implemented with the
> > > mailbox framework as is.
> > >
> >
> > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fw
> ww
> > .spinics.net%2Flists%2Farm-
> kernel%2Fmsg650217.html&data=02%7C01%7Caish
> >
> eng.dong%40nxp.com%7Cfb1c9e5d6eff453207a908d5d803f139%7C686ea1d3
> bc2b4c
> >
> 6fa92cd99c5c301635%7C0%7C0%7C636652433808619368&sdata=vOFbr%2BSJ
> 1tkY5F
> > 8ha1wrxk3RMVxI4lzKkidD9cK%2Bl9A%3D&reserved=0
> > But seems it's still not quite certain...
> 
> My suggestion was that we change the compatible of the MU unit to
> something that the SCU driver matches. But since we do not do that and
> instead use a fsl,imx8qxp-mu compatible for the MU we have to come up
> with a MU driver that handles both the SCU case and the regular usecase.
> Otherwise we'll end up with a generic driver attaching to the device you are
> using under the hoods.
> 

Do you think checking mbox-cells can work?
E.g. SCU MU mbox-cells should be 0 while general purpose MU are 1.

Regards
Dong Aisheng

> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7Cfb1
> c9e5d6eff453207a908d5d803f139%7C686ea1d3bc2b4c6fa92cd99c5c301635%7
> C0%7C0%7C636652433808619368&sdata=AYkDI3ubpKuxfKvmsvM8d65suEAkf
> 3h9Tq5KZdJxoNY%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  6:04             ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  6:04 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Sascha Hauer [mailto:s.hauer at pengutronix.de]
> Sent: Friday, June 22, 2018 1:50 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Rob Herring <robh@kernel.org>; linux-arm-kernel at lists.infradead.org;
> dongas86 at gmail.com; kernel at pengutronix.de; shawnguo at kernel.org;
> Fabio Estevam <fabio.estevam@nxp.com>; dl-linux-imx <linux-
> imx at nxp.com>; Mark Rutland <mark.rutland@arm.com>;
> devicetree at vger.kernel.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > Hi Sascha,
> >
> > > -----Original Message-----
> > > From: Sascha Hauer [mailto:s.hauer at pengutronix.de]
> > > Sent: Thursday, June 21, 2018 3:47 PM
> > > To: Rob Herring <robh@kernel.org>
> > > Cc: A.s. Dong <aisheng.dong@nxp.com>; linux-arm-
> > > kernel at lists.infradead.org; dongas86 at gmail.com;
> > > kernel at pengutronix.de; shawnguo at kernel.org; Fabio Estevam
> > > <fabio.estevam@nxp.com>; dl-linux- imx <linux-imx@nxp.com>; Mark
> > > Rutland <mark.rutland@arm.com>; devicetree at vger.kernel.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > The Messaging Unit module enables two processors within the SoC
> > > > > to communicate and coordinate by passing messages (e.g. data,
> > > > > status and control) through the MU interface.
> > > > >
> > > > > Cc: Shawn Guo <shawnguo@kernel.org>
> > > > > Cc: Sascha Hauer <kernel@pengutronix.de>
> > > > > Cc: Fabio Estevam <fabio.estevam@nxp.com>
> > > > > Cc: Rob Herring <robh+dt@kernel.org>
> > > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > > Cc: devicetree at vger.kernel.org
> > > > > Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
> > > > > ---
> > > > > v1->v2:
> > > > >  * typo fixes
> > > > >  * remove status property
> > > > >  * remove imx6&7 compatible string which may be added later for
> > > > >    the generic mailbox binding
> > > > >
> > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > driver, Instead, they're provided in library calls to gain higher
> performance.
> > > >
> > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > >
> > > > What needs higher performance? What's the performance difference?
> > > Why
> > > > can't the mailbox framework be improved?
> > >
> > > From what I see the performance is improved by polling the interrupt
> > > registers rather than using interrupts.
> > > I see no reason though why this can't be implemented with the
> > > mailbox framework as is.
> > >
> >
> > I thought you've agreed to not write generic MU(mailbox) driver for SCU.
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fw
> ww
> > .spinics.net%2Flists%2Farm-
> kernel%2Fmsg650217.html&data=02%7C01%7Caish
> >
> eng.dong%40nxp.com%7Cfb1c9e5d6eff453207a908d5d803f139%7C686ea1d3
> bc2b4c
> >
> 6fa92cd99c5c301635%7C0%7C0%7C636652433808619368&sdata=vOFbr%2BSJ
> 1tkY5F
> > 8ha1wrxk3RMVxI4lzKkidD9cK%2Bl9A%3D&reserved=0
> > But seems it's still not quite certain...
> 
> My suggestion was that we change the compatible of the MU unit to
> something that the SCU driver matches. But since we do not do that and
> instead use a fsl,imx8qxp-mu compatible for the MU we have to come up
> with a MU driver that handles both the SCU case and the regular usecase.
> Otherwise we'll end up with a generic driver attaching to the device you are
> using under the hoods.
> 

Do you think checking mbox-cells can work?
E.g. SCU MU mbox-cells should be 0 while general purpose MU are 1.

Regards
Dong Aisheng

> Sascha
> 
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 |
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fww
> w.pengutronix.de%2F&data=02%7C01%7Caisheng.dong%40nxp.com%7Cfb1
> c9e5d6eff453207a908d5d803f139%7C686ea1d3bc2b4c6fa92cd99c5c301635%7
> C0%7C0%7C636652433808619368&sdata=AYkDI3ubpKuxfKvmsvM8d65suEAkf
> 3h9Tq5KZdJxoNY%3D&reserved=0  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-22  5:59                 ` A.s. Dong
@ 2018-06-22  6:48                   ` Oleksij Rempel
  -1 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-22  6:48 UTC (permalink / raw)
  To: A.s. Dong
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 13760 bytes --]

On Fri, Jun 22, 2018 at 05:59:30AM +0000, A.s. Dong wrote:
> Hi Oleksij,
> 
> > -----Original Message-----
> > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > Sent: Friday, June 22, 2018 12:59 PM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> > imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > kernel@lists.infradead.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > > -----Original Message-----
> > > > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > > > Sent: Friday, June 22, 2018 2:09 AM
> > > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> > > > imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > > > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > > > kernel@lists.infradead.org
> > > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > > doc
> > > >
> > > > Hi,
> > > >
> > > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > > Hi Sascha,
> > > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > > The Messaging Unit module enables two processors within the
> > > > > > > > SoC to communicate and coordinate by passing messages (e.g.
> > > > > > > > data, status and control) through the MU interface.
> > > > > > > >
> > > > > > > > ---
> > > > > > > > v1->v2:
> > > > > > > >  * typo fixes
> > > > > > > >  * remove status property
> > > > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > > > >    the generic mailbox binding
> > > > > > > >
> > > > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > > > driver, Instead, they're provided in library calls to gain
> > > > > > > > higher
> > > > performance.
> > > > > > >
> > > > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > > > >
> > > > > > > What needs higher performance? What's the performance
> > difference?
> > > > > > Why
> > > > > > > can't the mailbox framework be improved?
> > > > > >
> > > > > > From what I see the performance is improved by polling the
> > > > > > interrupt registers rather than using interrupts.
> > > > > > I see no reason though why this can't be implemented with the
> > > > > > mailbox framework as is.
> > > > > >
> > > > >
> > > > > I thought you've agreed to not write generic MU(mailbox) driver for
> > SCU.
> > > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > > But seems it's still not quite certain...
> > > > >
> > > > > I'd like to explain some more.
> > > > >
> > > > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > > > protocol as the transfer size would be more than 4 words and the
> > > > > response data size is also undetermined (it's set by SCU firmware
> > > > > side
> > > > during a response).
> > > > > So polling mode may be the best way to handle this as MU message
> > > > > handling usually is quite fast in a few microseconds.
> > > > >
> > > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > > But it's not quite suitable for SCU MU as we don't need to use the
> > > > > most bits of it. Mailbox seems like to be more suitable for a
> > > > > generic MU mailbox driver used by various clients/servers.  But
> > > > > SCU MU are quite specific to SCU protocol and can't be used by
> > > > > other clients (MU
> > > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > > general one and can't be used by others (e.g. communication with M4).
> > > > > So I'd believe the current library way is still the best approach
> > > > > for SCU MU Using. But I'm also okay for another generic MU drivers
> > > > > for other common communications between A core and M4 side.
> > > > >
> > > > > 3) We actually have tried the MU(Mailbox) internally, it increased
> > > > > a lot complexity comparing to the current library way and got a
> > > > > booting time regression issue due to extra delays introduced in
> > > > > handling SCU protocol in mailbox way.
> > > > >
> > > > > And finally a nature question to us is:
> > > > > What the benefit we can get for SCU MU using a mailbox way?
> > > > >
> > > > > If we can't find benefits but introduce more complexities then why
> > > > > we would do that way?
> > > >
> > > > Looks like my response to same topic within my patch set is lost, so
> > > > I repost it
> > > > here:
> > > >
> > > > ok.. let's take some of IMX8 SCU driver code to see if there any
> > difference:
> > > >
> > > > ..this part of the code is blocking write procedure for one
> > > > channeler (register or what ever name you prefer) per write.. correct?
> > > >
> > > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
> > {
> > > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > > +
> > > > +	/* Wait TX register to be empty. */
> > > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > > +		;
> > > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > > >
> > > > According to documentation it is recommended to use only one status
> > > > bit for the last register to use MU as one big 4words sized pipe.
> > > > But, there is no way you can write to all 4 registers without
> > > > checking status for each of this register, because your protocol has
> > > > dynamic message size. So you are forced to use your one channel as 4
> > separate channels.
> > > > Write it part of the message separately and allow your firmware read
> > > > 1 word to understand how to behave on the rest of the message.
> > > >
> > > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > > +	uint8_t count = 0;
> > > > +
> > > > +	/* Check size */
> > > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > > +		return;
> > > > +
> > > > +	/* Write first word */
> > > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > > +	count++;
> > > >
> > > > .. in this loop you are writing to one channel/register per loop. If
> > > > the communicate will stall for some reason, the linux system will
> > > > just freeze here without any timeout or error message.. no idea how
> > about the opposite site.
> > > >
> > > > +	/* Write remaining words */
> > > > +	while (count < msg->size) {
> > > > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > > > +			    msg->DATA.u32[count - 1]);
> > > > +		count++;
> > > > +	}
> > > > +}
> > > >
> > > >
> > > > ... and here is a proof that sc_ipc_write will do in some cases 5
> > > > rounds
> > > > (5 * 4 bytes = 20 bytes single message) with probable busy waiting
> > > > for each channel
> > > >
> > > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > > > +	sc_rpc_msg_t msg;
> > > > +	uint8_t result;
> > > > +
> > > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > > +	RPC_U32(&msg, 0) = addr_src;
> > > > +	RPC_U32(&msg, 4) = addr_dst;
> > > > +	RPC_U32(&msg, 8) = len;
> > > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > > +	RPC_SIZE(&msg) = 5;
> > > > +
> > > > +	sc_call_rpc(ipc, &msg, false);
> > > > +
> > > > +	result = RPC_R8(&msg);
> > > > +	return (sc_err_t)result;
> > > > +}
> > > > +
> > > >
> > > > So, the same code with mailbox framework will be some thing like this:
> > > > 1. request all 4 channels in the probe. ignore completion callback
> > > > and set proper timeout.
> > > >
> > >
> > > That looks a bit strange. As I said in another email, MU physical does
> > > not have multi Channels (here you mean are virtual channels).  And one
> > > whole MU instance is Exclusively used by SCU, why we need abstract
> > > them into 4 channels to use separately with extra unnecessary resource
> > hold and code path executed.
> > 
> > OK, so this MU should configured to use all 4 registers as one channel.
> > I have nothing against it to do in generic driver.
> > 
> > > > 2. keep your old code by replacing this part.
> > > > 	/* Write remaining words */
> > > > 	while (count < msg->size) {
> > > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > MU_TR_COUNT],
> > > > msg->DATA.u32[count - 1]);
> > > > 		count++;
> > > > 	}
> > > >
> > > > The advantage of this variant. If SCU firmware will stall, the linux
> > > > will be able to notify about it without blocking complete system.
> > > >
> > >
> > > This part of code has been used for a long time and we've never met
> > > the stall Issue which means SCU firmware guarantee it well. But I
> > > agree a timeout mechanism Is better. However,  if only for this
> > > reason, we can simply add a timeout mechanism in MU library function
> > > as well, but still is far from a strong enough reason to switch to a more
> > complexed mailbox one.
> > 
> > At this point a have absolutely zero tolerance. At the end I was one on poor
> > devs hunting similar kind of stalls.
> > Believe me, what can theoretically with 1% probability happen in the lab, will
> > end with days/months of bug hunting work and thousands of field returns.
> > 
> 
> Yeah, so I think a timeout mechanism is worthy to add.
> 
> > > > Can you please provide (if possible) your old mailbox based
> > implementation.
> > > > I'm curious to see why it is slow.
> > >
> > > In our implementation:
> > > 1) One channel per MU
> > > 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> > > 3) Rx enables the first word interrupt and polling for the rest of
> > > them in a hrtimer.
> > >
> > > The possible extra cost comparing to simple polling way:
> > > 1) Extra unnecessary code execution path of mailbox which is not used
> > > by SCU MU
> > > 2) Interrupt latency
> > > 3) Extra memory copy handling RX message separately.
> > > 4) Extra schedule of hrtimer polling
> > >
> > > Some of them probably could be optimized. However, besides the slow
> > > problem, the extra unnecessary complexity and can't be generic
> > > (specific to SCU) are also important ones.
> > >
> > > And the MU general purpose interrupt feature and general purpose flags
> > > feature may also not supported by mailbox well.
> > 
> > Ok, I give up.
> > Can we at least try to provide one device tree binding documentation for
> > that?
> 
> I will update the binding doc to use mailbox way as suggested by Rob.
> But I still need to change mbox-cells to allow 0 which actually already exist
> In kernel. So we may have too options for MU binding for users.
> For SCU MU, mbox-cells = <0>
> For General MU, mbox-cells = <1>

ok. cool.

> > At the end, it is one and the same HW IP block spread in one SoC (or
> > different SoCs) an be used in different way. How exactly it will be used is the
> > choice of the developer/customer/user...
> > even if it will be described withing mailbox section dos not mean you are
> > forced to use mailbox framework.
> > 
> > Lets imagine worst case scenario and all of MU on i.MX8 are used by
> > different drivers. In my mailbox binding I suggest to use <vendor>,<soc>-
> > mu-<mu side>. On i.MX7 there is only one MU, so it looks like "fsl,imx7s-mu-
> > a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so it should be enumerated.
> > Probably it will look like: "fsl,imx8x-mu0-a".
> > Do it make sense for you? Can we find and agreement here? :)
> > 
> 
> IMHO one possible concern may be that results in creating two much compatibles
> strings. For example, each SoC have two (fsl, <soc>-mu-<a or b>).
> Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
> mx8x, mx8mq and etc...
> 
> My question is that if it's really worthy to create a separate side b compatible
> string instread of distinguish it by a property?
> 
> And for general purpose MUs to communicate with M4. Both side can be used by
> either A core and M core. If we're doing this way, once users wants A core to access
> B side, users need to change the compatible string, that seems more like an unusual
> way compared to changing properties.
> 
> How do you think?

My current use case is Linux with DT running on all CPUs of one SoC. On i.MX7
there is only one MU and I use same mailbox driver on both sides. In
this case it is enough to have same compatible with extra parameter to
see the difference between A and B side. 
Just imagine, you'll use DT in your firmware on i.MX8. Will you use same
driver on both sides of MU? If yes, I'm OK to exclude A and B from
compatible.

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  6:48                   ` Oleksij Rempel
  0 siblings, 0 replies; 37+ messages in thread
From: Oleksij Rempel @ 2018-06-22  6:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 22, 2018 at 05:59:30AM +0000, A.s. Dong wrote:
> Hi Oleksij,
> 
> > -----Original Message-----
> > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > Sent: Friday, June 22, 2018 12:59 PM
> > To: A.s. Dong <aisheng.dong@nxp.com>
> > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> > imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > kernel at lists.infradead.org
> > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> > 
> > On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > > -----Original Message-----
> > > > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > > > Sent: Friday, June 22, 2018 2:09 AM
> > > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> > > > imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > > > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > > > kernel at lists.infradead.org
> > > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > > doc
> > > >
> > > > Hi,
> > > >
> > > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > > Hi Sascha,
> > > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > > The Messaging Unit module enables two processors within the
> > > > > > > > SoC to communicate and coordinate by passing messages (e.g.
> > > > > > > > data, status and control) through the MU interface.
> > > > > > > >
> > > > > > > > ---
> > > > > > > > v1->v2:
> > > > > > > >  * typo fixes
> > > > > > > >  * remove status property
> > > > > > > >  * remove imx6&7 compatible string which may be added later for
> > > > > > > >    the generic mailbox binding
> > > > > > > >
> > > > > > > > Note: Because MU used by SCU is not implemented as a mailbox
> > > > > > > > driver, Instead, they're provided in library calls to gain
> > > > > > > > higher
> > > > performance.
> > > > > > >
> > > > > > > Using a binding doesn't mean you have to use an OS's subsystem.
> > > > > > >
> > > > > > > What needs higher performance? What's the performance
> > difference?
> > > > > > Why
> > > > > > > can't the mailbox framework be improved?
> > > > > >
> > > > > > From what I see the performance is improved by polling the
> > > > > > interrupt registers rather than using interrupts.
> > > > > > I see no reason though why this can't be implemented with the
> > > > > > mailbox framework as is.
> > > > > >
> > > > >
> > > > > I thought you've agreed to not write generic MU(mailbox) driver for
> > SCU.
> > > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > > But seems it's still not quite certain...
> > > > >
> > > > > I'd like to explain some more.
> > > > >
> > > > > 1) the interrupt mechanism is not quite suitable for SCU firmware
> > > > > protocol as the transfer size would be more than 4 words and the
> > > > > response data size is also undetermined (it's set by SCU firmware
> > > > > side
> > > > during a response).
> > > > > So polling mode may be the best way to handle this as MU message
> > > > > handling usually is quite fast in a few microseconds.
> > > > >
> > > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > > But it's not quite suitable for SCU MU as we don't need to use the
> > > > > most bits of it. Mailbox seems like to be more suitable for a
> > > > > generic MU mailbox driver used by various clients/servers.  But
> > > > > SCU MU are quite specific to SCU protocol and can't be used by
> > > > > other clients (MU
> > > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > > general one and can't be used by others (e.g. communication with M4).
> > > > > So I'd believe the current library way is still the best approach
> > > > > for SCU MU Using. But I'm also okay for another generic MU drivers
> > > > > for other common communications between A core and M4 side.
> > > > >
> > > > > 3) We actually have tried the MU(Mailbox) internally, it increased
> > > > > a lot complexity comparing to the current library way and got a
> > > > > booting time regression issue due to extra delays introduced in
> > > > > handling SCU protocol in mailbox way.
> > > > >
> > > > > And finally a nature question to us is:
> > > > > What the benefit we can get for SCU MU using a mailbox way?
> > > > >
> > > > > If we can't find benefits but introduce more complexities then why
> > > > > we would do that way?
> > > >
> > > > Looks like my response to same topic within my patch set is lost, so
> > > > I repost it
> > > > here:
> > > >
> > > > ok.. let's take some of IMX8 SCU driver code to see if there any
> > difference:
> > > >
> > > > ..this part of the code is blocking write procedure for one
> > > > channeler (register or what ever name you prefer) per write.. correct?
> > > >
> > > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t msg)
> > {
> > > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > > +
> > > > +	/* Wait TX register to be empty. */
> > > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > > +		;
> > > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > > >
> > > > According to documentation it is recommended to use only one status
> > > > bit for the last register to use MU as one big 4words sized pipe.
> > > > But, there is no way you can write to all 4 registers without
> > > > checking status for each of this register, because your protocol has
> > > > dynamic message size. So you are forced to use your one channel as 4
> > separate channels.
> > > > Write it part of the message separately and allow your firmware read
> > > > 1 word to understand how to behave on the rest of the message.
> > > >
> > > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > > +	uint8_t count = 0;
> > > > +
> > > > +	/* Check size */
> > > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > > +		return;
> > > > +
> > > > +	/* Write first word */
> > > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > > +	count++;
> > > >
> > > > .. in this loop you are writing to one channel/register per loop. If
> > > > the communicate will stall for some reason, the linux system will
> > > > just freeze here without any timeout or error message.. no idea how
> > about the opposite site.
> > > >
> > > > +	/* Write remaining words */
> > > > +	while (count < msg->size) {
> > > > +		mu_send_msg(sc_ipc->mu_base, count % MU_TR_COUNT,
> > > > +			    msg->DATA.u32[count - 1]);
> > > > +		count++;
> > > > +	}
> > > > +}
> > > >
> > > >
> > > > ... and here is a proof that sc_ipc_write will do in some cases 5
> > > > rounds
> > > > (5 * 4 bytes = 20 bytes single message) with probable busy waiting
> > > > for each channel
> > > >
> > > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > > +				 uint32_t addr_dst, uint32_t len, bool fw) {
> > > > +	sc_rpc_msg_t msg;
> > > > +	uint8_t result;
> > > > +
> > > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > > +	RPC_FUNC(&msg) = (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > > +	RPC_U32(&msg, 0) = addr_src;
> > > > +	RPC_U32(&msg, 4) = addr_dst;
> > > > +	RPC_U32(&msg, 8) = len;
> > > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > > +	RPC_SIZE(&msg) = 5;
> > > > +
> > > > +	sc_call_rpc(ipc, &msg, false);
> > > > +
> > > > +	result = RPC_R8(&msg);
> > > > +	return (sc_err_t)result;
> > > > +}
> > > > +
> > > >
> > > > So, the same code with mailbox framework will be some thing like this:
> > > > 1. request all 4 channels in the probe. ignore completion callback
> > > > and set proper timeout.
> > > >
> > >
> > > That looks a bit strange. As I said in another email, MU physical does
> > > not have multi Channels (here you mean are virtual channels).  And one
> > > whole MU instance is Exclusively used by SCU, why we need abstract
> > > them into 4 channels to use separately with extra unnecessary resource
> > hold and code path executed.
> > 
> > OK, so this MU should configured to use all 4 registers as one channel.
> > I have nothing against it to do in generic driver.
> > 
> > > > 2. keep your old code by replacing this part.
> > > > 	/* Write remaining words */
> > > > 	while (count < msg->size) {
> > > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > MU_TR_COUNT],
> > > > msg->DATA.u32[count - 1]);
> > > > 		count++;
> > > > 	}
> > > >
> > > > The advantage of this variant. If SCU firmware will stall, the linux
> > > > will be able to notify about it without blocking complete system.
> > > >
> > >
> > > This part of code has been used for a long time and we've never met
> > > the stall Issue which means SCU firmware guarantee it well. But I
> > > agree a timeout mechanism Is better. However,  if only for this
> > > reason, we can simply add a timeout mechanism in MU library function
> > > as well, but still is far from a strong enough reason to switch to a more
> > complexed mailbox one.
> > 
> > At this point a have absolutely zero tolerance. At the end I was one on poor
> > devs hunting similar kind of stalls.
> > Believe me, what can theoretically with 1% probability happen in the lab, will
> > end with days/months of bug hunting work and thousands of field returns.
> > 
> 
> Yeah, so I think a timeout mechanism is worthy to add.
> 
> > > > Can you please provide (if possible) your old mailbox based
> > implementation.
> > > > I'm curious to see why it is slow.
> > >
> > > In our implementation:
> > > 1) One channel per MU
> > > 2) Tx using the same way to send msg as sc_ipc_write() in polling mode
> > > 3) Rx enables the first word interrupt and polling for the rest of
> > > them in a hrtimer.
> > >
> > > The possible extra cost comparing to simple polling way:
> > > 1) Extra unnecessary code execution path of mailbox which is not used
> > > by SCU MU
> > > 2) Interrupt latency
> > > 3) Extra memory copy handling RX message separately.
> > > 4) Extra schedule of hrtimer polling
> > >
> > > Some of them probably could be optimized. However, besides the slow
> > > problem, the extra unnecessary complexity and can't be generic
> > > (specific to SCU) are also important ones.
> > >
> > > And the MU general purpose interrupt feature and general purpose flags
> > > feature may also not supported by mailbox well.
> > 
> > Ok, I give up.
> > Can we at least try to provide one device tree binding documentation for
> > that?
> 
> I will update the binding doc to use mailbox way as suggested by Rob.
> But I still need to change mbox-cells to allow 0 which actually already exist
> In kernel. So we may have too options for MU binding for users.
> For SCU MU, mbox-cells = <0>
> For General MU, mbox-cells = <1>

ok. cool.

> > At the end, it is one and the same HW IP block spread in one SoC (or
> > different SoCs) an be used in different way. How exactly it will be used is the
> > choice of the developer/customer/user...
> > even if it will be described withing mailbox section dos not mean you are
> > forced to use mailbox framework.
> > 
> > Lets imagine worst case scenario and all of MU on i.MX8 are used by
> > different drivers. In my mailbox binding I suggest to use <vendor>,<soc>-
> > mu-<mu side>. On i.MX7 there is only one MU, so it looks like "fsl,imx7s-mu-
> > a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU so it should be enumerated.
> > Probably it will look like: "fsl,imx8x-mu0-a".
> > Do it make sense for you? Can we find and agreement here? :)
> > 
> 
> IMHO one possible concern may be that results in creating two much compatibles
> strings. For example, each SoC have two (fsl, <soc>-mu-<a or b>).
> Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
> mx8x, mx8mq and etc...
> 
> My question is that if it's really worthy to create a separate side b compatible
> string instread of distinguish it by a property?
> 
> And for general purpose MUs to communicate with M4. Both side can be used by
> either A core and M core. If we're doing this way, once users wants A core to access
> B side, users need to change the compatible string, that seems more like an unusual
> way compared to changing properties.
> 
> How do you think?

My current use case is Linux with DT running on all CPUs of one SoC. On i.MX7
there is only one MU and I use same mailbox driver on both sides. In
this case it is enough to have same compatible with extra parameter to
see the difference between A and B side. 
Just imagine, you'll use DT in your firmware on i.MX8. Will you use same
driver on both sides of MU? If yes, I'm OK to exclude A and B from
compatible.

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180622/af632618/attachment.sig>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
  2018-06-22  6:48                   ` Oleksij Rempel
@ 2018-06-22  8:16                     ` A.s. Dong
  -1 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  8:16 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Mark Rutland, Rob Herring, dongas86, devicetree, Sascha Hauer,
	dl-linux-imx, kernel, Fabio Estevam, shawnguo, linux-arm-kernel

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> Sent: Friday, June 22, 2018 2:49 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> kernel@lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Fri, Jun 22, 2018 at 05:59:30AM +0000, A.s. Dong wrote:
> > Hi Oleksij,
> >
> > > -----Original Message-----
> > > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > > Sent: Friday, June 22, 2018 12:59 PM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx <linux-
> > > imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > > kernel@lists.infradead.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > > > -----Original Message-----
> > > > > From: Oleksij Rempel [mailto:o.rempel@pengutronix.de]
> > > > > Sent: Friday, June 22, 2018 2:09 AM
> > > > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > > > devicetree@vger.kernel.org; dongas86@gmail.com; dl-linux-imx
> > > > > <linux- imx@nxp.com>; kernel@pengutronix.de; Fabio Estevam
> > > > > <fabio.estevam@nxp.com>; shawnguo@kernel.org; linux-arm-
> > > > > kernel@lists.infradead.org
> > > > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu
> > > > > binding doc
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > > > Hi Sascha,
> > > > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > > > The Messaging Unit module enables two processors within
> > > > > > > > > the SoC to communicate and coordinate by passing messages
> (e.g.
> > > > > > > > > data, status and control) through the MU interface.
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > > v1->v2:
> > > > > > > > >  * typo fixes
> > > > > > > > >  * remove status property
> > > > > > > > >  * remove imx6&7 compatible string which may be added later
> for
> > > > > > > > >    the generic mailbox binding
> > > > > > > > >
> > > > > > > > > Note: Because MU used by SCU is not implemented as a
> > > > > > > > > mailbox driver, Instead, they're provided in library
> > > > > > > > > calls to gain higher
> > > > > performance.
> > > > > > > >
> > > > > > > > Using a binding doesn't mean you have to use an OS's
> subsystem.
> > > > > > > >
> > > > > > > > What needs higher performance? What's the performance
> > > difference?
> > > > > > > Why
> > > > > > > > can't the mailbox framework be improved?
> > > > > > >
> > > > > > > From what I see the performance is improved by polling the
> > > > > > > interrupt registers rather than using interrupts.
> > > > > > > I see no reason though why this can't be implemented with
> > > > > > > the mailbox framework as is.
> > > > > > >
> > > > > >
> > > > > > I thought you've agreed to not write generic MU(mailbox)
> > > > > > driver for
> > > SCU.
> > > > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > > > But seems it's still not quite certain...
> > > > > >
> > > > > > I'd like to explain some more.
> > > > > >
> > > > > > 1) the interrupt mechanism is not quite suitable for SCU
> > > > > > firmware protocol as the transfer size would be more than 4
> > > > > > words and the response data size is also undetermined (it's
> > > > > > set by SCU firmware side
> > > > > during a response).
> > > > > > So polling mode may be the best way to handle this as MU
> > > > > > message handling usually is quite fast in a few microseconds.
> > > > > >
> > > > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > > > But it's not quite suitable for SCU MU as we don't need to use
> > > > > > the most bits of it. Mailbox seems like to be more suitable
> > > > > > for a generic MU mailbox driver used by various
> > > > > > clients/servers.  But SCU MU are quite specific to SCU
> > > > > > protocol and can't be used by other clients (MU
> > > > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > > > general one and can't be used by others (e.g. communication with
> M4).
> > > > > > So I'd believe the current library way is still the best
> > > > > > approach for SCU MU Using. But I'm also okay for another
> > > > > > generic MU drivers for other common communications between A
> core and M4 side.
> > > > > >
> > > > > > 3) We actually have tried the MU(Mailbox) internally, it
> > > > > > increased a lot complexity comparing to the current library
> > > > > > way and got a booting time regression issue due to extra
> > > > > > delays introduced in handling SCU protocol in mailbox way.
> > > > > >
> > > > > > And finally a nature question to us is:
> > > > > > What the benefit we can get for SCU MU using a mailbox way?
> > > > > >
> > > > > > If we can't find benefits but introduce more complexities then
> > > > > > why we would do that way?
> > > > >
> > > > > Looks like my response to same topic within my patch set is
> > > > > lost, so I repost it
> > > > > here:
> > > > >
> > > > > ok.. let's take some of IMX8 SCU driver code to see if there any
> > > difference:
> > > > >
> > > > > ..this part of the code is blocking write procedure for one
> > > > > channeler (register or what ever name you prefer) per write.. correct?
> > > > >
> > > > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t
> > > > > +msg)
> > > {
> > > > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > > > +
> > > > > +	/* Wait TX register to be empty. */
> > > > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > > > +		;
> > > > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > > > >
> > > > > According to documentation it is recommended to use only one
> > > > > status bit for the last register to use MU as one big 4words sized pipe.
> > > > > But, there is no way you can write to all 4 registers without
> > > > > checking status for each of this register, because your protocol
> > > > > has dynamic message size. So you are forced to use your one
> > > > > channel as 4
> > > separate channels.
> > > > > Write it part of the message separately and allow your firmware
> > > > > read
> > > > > 1 word to understand how to behave on the rest of the message.
> > > > >
> > > > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > > > +	uint8_t count = 0;
> > > > > +
> > > > > +	/* Check size */
> > > > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > > > +		return;
> > > > > +
> > > > > +	/* Write first word */
> > > > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > > > +	count++;
> > > > >
> > > > > .. in this loop you are writing to one channel/register per
> > > > > loop. If the communicate will stall for some reason, the linux
> > > > > system will just freeze here without any timeout or error
> > > > > message.. no idea how
> > > about the opposite site.
> > > > >
> > > > > +	/* Write remaining words */
> > > > > +	while (count < msg->size) {
> > > > > +		mu_send_msg(sc_ipc->mu_base, count %
> MU_TR_COUNT,
> > > > > +			    msg->DATA.u32[count - 1]);
> > > > > +		count++;
> > > > > +	}
> > > > > +}
> > > > >
> > > > >
> > > > > ... and here is a proof that sc_ipc_write will do in some cases
> > > > > 5 rounds
> > > > > (5 * 4 bytes = 20 bytes single message) with probable busy
> > > > > waiting for each channel
> > > > >
> > > > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > > > +				 uint32_t addr_dst, uint32_t len, bool
> fw) {
> > > > > +	sc_rpc_msg_t msg;
> > > > > +	uint8_t result;
> > > > > +
> > > > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > > > +	RPC_FUNC(&msg) =
> (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > > > +	RPC_U32(&msg, 0) = addr_src;
> > > > > +	RPC_U32(&msg, 4) = addr_dst;
> > > > > +	RPC_U32(&msg, 8) = len;
> > > > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > > > +	RPC_SIZE(&msg) = 5;
> > > > > +
> > > > > +	sc_call_rpc(ipc, &msg, false);
> > > > > +
> > > > > +	result = RPC_R8(&msg);
> > > > > +	return (sc_err_t)result;
> > > > > +}
> > > > > +
> > > > >
> > > > > So, the same code with mailbox framework will be some thing like this:
> > > > > 1. request all 4 channels in the probe. ignore completion
> > > > > callback and set proper timeout.
> > > > >
> > > >
> > > > That looks a bit strange. As I said in another email, MU physical
> > > > does not have multi Channels (here you mean are virtual channels).
> > > > And one whole MU instance is Exclusively used by SCU, why we need
> > > > abstract them into 4 channels to use separately with extra
> > > > unnecessary resource
> > > hold and code path executed.
> > >
> > > OK, so this MU should configured to use all 4 registers as one channel.
> > > I have nothing against it to do in generic driver.
> > >
> > > > > 2. keep your old code by replacing this part.
> > > > > 	/* Write remaining words */
> > > > > 	while (count < msg->size) {
> > > > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > > MU_TR_COUNT],
> > > > > msg->DATA.u32[count - 1]);
> > > > > 		count++;
> > > > > 	}
> > > > >
> > > > > The advantage of this variant. If SCU firmware will stall, the
> > > > > linux will be able to notify about it without blocking complete system.
> > > > >
> > > >
> > > > This part of code has been used for a long time and we've never
> > > > met the stall Issue which means SCU firmware guarantee it well.
> > > > But I agree a timeout mechanism Is better. However,  if only for
> > > > this reason, we can simply add a timeout mechanism in MU library
> > > > function as well, but still is far from a strong enough reason to
> > > > switch to a more
> > > complexed mailbox one.
> > >
> > > At this point a have absolutely zero tolerance. At the end I was one
> > > on poor devs hunting similar kind of stalls.
> > > Believe me, what can theoretically with 1% probability happen in the
> > > lab, will end with days/months of bug hunting work and thousands of
> field returns.
> > >
> >
> > Yeah, so I think a timeout mechanism is worthy to add.
> >
> > > > > Can you please provide (if possible) your old mailbox based
> > > implementation.
> > > > > I'm curious to see why it is slow.
> > > >
> > > > In our implementation:
> > > > 1) One channel per MU
> > > > 2) Tx using the same way to send msg as sc_ipc_write() in polling
> > > > mode
> > > > 3) Rx enables the first word interrupt and polling for the rest of
> > > > them in a hrtimer.
> > > >
> > > > The possible extra cost comparing to simple polling way:
> > > > 1) Extra unnecessary code execution path of mailbox which is not
> > > > used by SCU MU
> > > > 2) Interrupt latency
> > > > 3) Extra memory copy handling RX message separately.
> > > > 4) Extra schedule of hrtimer polling
> > > >
> > > > Some of them probably could be optimized. However, besides the
> > > > slow problem, the extra unnecessary complexity and can't be
> > > > generic (specific to SCU) are also important ones.
> > > >
> > > > And the MU general purpose interrupt feature and general purpose
> > > > flags feature may also not supported by mailbox well.
> > >
> > > Ok, I give up.
> > > Can we at least try to provide one device tree binding documentation
> > > for that?
> >
> > I will update the binding doc to use mailbox way as suggested by Rob.
> > But I still need to change mbox-cells to allow 0 which actually
> > already exist In kernel. So we may have too options for MU binding for
> users.
> > For SCU MU, mbox-cells = <0>
> > For General MU, mbox-cells = <1>
> 
> ok. cool.
> 
> > > At the end, it is one and the same HW IP block spread in one SoC (or
> > > different SoCs) an be used in different way. How exactly it will be
> > > used is the choice of the developer/customer/user...
> > > even if it will be described withing mailbox section dos not mean
> > > you are forced to use mailbox framework.
> > >
> > > Lets imagine worst case scenario and all of MU on i.MX8 are used by
> > > different drivers. In my mailbox binding I suggest to use
> > > <vendor>,<soc>- mu-<mu side>. On i.MX7 there is only one MU, so it
> > > looks like "fsl,imx7s-mu- a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU
> so it should be enumerated.
> > > Probably it will look like: "fsl,imx8x-mu0-a".
> > > Do it make sense for you? Can we find and agreement here? :)
> > >
> >
> > IMHO one possible concern may be that results in creating two much
> > compatibles strings. For example, each SoC have two (fsl, <soc>-mu-<a or
> b>).
> > Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
> mx8x,
> > mx8mq and etc...
> >
> > My question is that if it's really worthy to create a separate side b
> > compatible string instread of distinguish it by a property?
> >
> > And for general purpose MUs to communicate with M4. Both side can be
> > used by either A core and M core. If we're doing this way, once users
> > wants A core to access B side, users need to change the compatible
> > string, that seems more like an unusual way compared to changing
> properties.
> >
> > How do you think?
> 
> My current use case is Linux with DT running on all CPUs of one SoC. On
> i.MX7 there is only one MU and I use same mailbox driver on both sides. In
> this case it is enough to have same compatible with extra parameter to see
> the difference between A and B side.
> Just imagine, you'll use DT in your firmware on i.MX8. Will you use same
> driver on both sides of MU? If yes, I'm OK to exclude A and B from
> compatible.
> 

NXP release only provides a FreeRTOS based image running on M4 side.
Don't support Linux on M4. So we normally won't run the same driver
for both sides of MU. But for Linux community, it's good idea if Linux
can also run on M4 and using the same MU driver.

Regards
Dong Aisheng

> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
@ 2018-06-22  8:16                     ` A.s. Dong
  0 siblings, 0 replies; 37+ messages in thread
From: A.s. Dong @ 2018-06-22  8:16 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> Sent: Friday, June 22, 2018 2:49 PM
> To: A.s. Dong <aisheng.dong@nxp.com>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> kernel at lists.infradead.org
> Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc
> 
> On Fri, Jun 22, 2018 at 05:59:30AM +0000, A.s. Dong wrote:
> > Hi Oleksij,
> >
> > > -----Original Message-----
> > > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > > Sent: Friday, June 22, 2018 12:59 PM
> > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx <linux-
> > > imx at nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > > kernel at lists.infradead.org
> > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding
> > > doc
> > >
> > > On Fri, Jun 22, 2018 at 03:31:11AM +0000, A.s. Dong wrote:
> > > > > -----Original Message-----
> > > > > From: Oleksij Rempel [mailto:o.rempel at pengutronix.de]
> > > > > Sent: Friday, June 22, 2018 2:09 AM
> > > > > To: A.s. Dong <aisheng.dong@nxp.com>
> > > > > Cc: Sascha Hauer <s.hauer@pengutronix.de>; Rob Herring
> > > > > <robh@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> > > > > devicetree at vger.kernel.org; dongas86 at gmail.com; dl-linux-imx
> > > > > <linux- imx@nxp.com>; kernel at pengutronix.de; Fabio Estevam
> > > > > <fabio.estevam@nxp.com>; shawnguo at kernel.org; linux-arm-
> > > > > kernel at lists.infradead.org
> > > > > Subject: Re: [PATCH V2 2/4] dt-bindings: arm: fsl: add mu
> > > > > binding doc
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Thu, Jun 21, 2018 at 05:11:33PM +0000, A.s. Dong wrote:
> > > > > > Hi Sascha,
> > > > > > > On Wed, Jun 20, 2018 at 01:43:10PM -0600, Rob Herring wrote:
> > > > > > > > On Sun, Jun 17, 2018 at 08:49:47PM +0800, Dong Aisheng wrote:
> > > > > > > > > The Messaging Unit module enables two processors within
> > > > > > > > > the SoC to communicate and coordinate by passing messages
> (e.g.
> > > > > > > > > data, status and control) through the MU interface.
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > > v1->v2:
> > > > > > > > >  * typo fixes
> > > > > > > > >  * remove status property
> > > > > > > > >  * remove imx6&7 compatible string which may be added later
> for
> > > > > > > > >    the generic mailbox binding
> > > > > > > > >
> > > > > > > > > Note: Because MU used by SCU is not implemented as a
> > > > > > > > > mailbox driver, Instead, they're provided in library
> > > > > > > > > calls to gain higher
> > > > > performance.
> > > > > > > >
> > > > > > > > Using a binding doesn't mean you have to use an OS's
> subsystem.
> > > > > > > >
> > > > > > > > What needs higher performance? What's the performance
> > > difference?
> > > > > > > Why
> > > > > > > > can't the mailbox framework be improved?
> > > > > > >
> > > > > > > From what I see the performance is improved by polling the
> > > > > > > interrupt registers rather than using interrupts.
> > > > > > > I see no reason though why this can't be implemented with
> > > > > > > the mailbox framework as is.
> > > > > > >
> > > > > >
> > > > > > I thought you've agreed to not write generic MU(mailbox)
> > > > > > driver for
> > > SCU.
> > > > > > https://www.spinics.net/lists/arm-kernel/msg650217.html
> > > > > > But seems it's still not quite certain...
> > > > > >
> > > > > > I'd like to explain some more.
> > > > > >
> > > > > > 1) the interrupt mechanism is not quite suitable for SCU
> > > > > > firmware protocol as the transfer size would be more than 4
> > > > > > words and the response data size is also undetermined (it's
> > > > > > set by SCU firmware side
> > > > > during a response).
> > > > > > So polling mode may be the best way to handle this as MU
> > > > > > message handling usually is quite fast in a few microseconds.
> > > > > >
> > > > > > 2) It's true that Mailbox framework is well designed and powerful.
> > > > > > But it's not quite suitable for SCU MU as we don't need to use
> > > > > > the most bits of it. Mailbox seems like to be more suitable
> > > > > > for a generic MU mailbox driver used by various
> > > > > > clients/servers.  But SCU MU are quite specific to SCU
> > > > > > protocol and can't be used by other clients (MU
> > > > > > 0~4 is fixed for SCU communication in MX8 HW design).
> > > > > > Even we write a MU Mailbox driver for SCU MU, it's still not a
> > > > > > general one and can't be used by others (e.g. communication with
> M4).
> > > > > > So I'd believe the current library way is still the best
> > > > > > approach for SCU MU Using. But I'm also okay for another
> > > > > > generic MU drivers for other common communications between A
> core and M4 side.
> > > > > >
> > > > > > 3) We actually have tried the MU(Mailbox) internally, it
> > > > > > increased a lot complexity comparing to the current library
> > > > > > way and got a booting time regression issue due to extra
> > > > > > delays introduced in handling SCU protocol in mailbox way.
> > > > > >
> > > > > > And finally a nature question to us is:
> > > > > > What the benefit we can get for SCU MU using a mailbox way?
> > > > > >
> > > > > > If we can't find benefits but introduce more complexities then
> > > > > > why we would do that way?
> > > > >
> > > > > Looks like my response to same topic within my patch set is
> > > > > lost, so I repost it
> > > > > here:
> > > > >
> > > > > ok.. let's take some of IMX8 SCU driver code to see if there any
> > > difference:
> > > > >
> > > > > ..this part of the code is blocking write procedure for one
> > > > > channeler (register or what ever name you prefer) per write.. correct?
> > > > >
> > > > > +void mu_send_msg(struct mu_priv *priv, uint32_t index, uint32_t
> > > > > +msg)
> > > {
> > > > > +	uint32_t mask = MU_SR_TE0_MASK >> index;
> > > > > +
> > > > > +	/* Wait TX register to be empty. */
> > > > > +	while (!(readl_relaxed(priv->base + MU_ASR) & mask))
> > > > > +		;
> > > > > +	writel_relaxed(msg, priv->base + MU_ATR0  + (index * 4)); }
> > > > > +EXPORT_SYMBOL_GPL(mu_send_msg);
> > > > >
> > > > > According to documentation it is recommended to use only one
> > > > > status bit for the last register to use MU as one big 4words sized pipe.
> > > > > But, there is no way you can write to all 4 registers without
> > > > > checking status for each of this register, because your protocol
> > > > > has dynamic message size. So you are forced to use your one
> > > > > channel as 4
> > > separate channels.
> > > > > Write it part of the message separately and allow your firmware
> > > > > read
> > > > > 1 word to understand how to behave on the rest of the message.
> > > > >
> > > > > +static void sc_ipc_write(struct sc_ipc *sc_ipc, void *data) {
> > > > > +	sc_rpc_msg_t *msg = (sc_rpc_msg_t *) data;
> > > > > +	uint8_t count = 0;
> > > > > +
> > > > > +	/* Check size */
> > > > > +	if (msg->size > SC_RPC_MAX_MSG)
> > > > > +		return;
> > > > > +
> > > > > +	/* Write first word */
> > > > > +	mu_send_msg(sc_ipc->mu_base, 0, *((uint32_t *) msg));
> > > > > +	count++;
> > > > >
> > > > > .. in this loop you are writing to one channel/register per
> > > > > loop. If the communicate will stall for some reason, the linux
> > > > > system will just freeze here without any timeout or error
> > > > > message.. no idea how
> > > about the opposite site.
> > > > >
> > > > > +	/* Write remaining words */
> > > > > +	while (count < msg->size) {
> > > > > +		mu_send_msg(sc_ipc->mu_base, count %
> MU_TR_COUNT,
> > > > > +			    msg->DATA.u32[count - 1]);
> > > > > +		count++;
> > > > > +	}
> > > > > +}
> > > > >
> > > > >
> > > > > ... and here is a proof that sc_ipc_write will do in some cases
> > > > > 5 rounds
> > > > > (5 * 4 bytes = 20 bytes single message) with probable busy
> > > > > waiting for each channel
> > > > >
> > > > > +sc_err_t sc_misc_seco_image_load(sc_ipc_t ipc, uint32_t addr_src,
> > > > > +				 uint32_t addr_dst, uint32_t len, bool
> fw) {
> > > > > +	sc_rpc_msg_t msg;
> > > > > +	uint8_t result;
> > > > > +
> > > > > +	RPC_VER(&msg) = SC_RPC_VERSION;
> > > > > +	RPC_SVC(&msg) = (uint8_t)SC_RPC_SVC_MISC;
> > > > > +	RPC_FUNC(&msg) =
> (uint8_t)MISC_FUNC_SECO_IMAGE_LOAD;
> > > > > +	RPC_U32(&msg, 0) = addr_src;
> > > > > +	RPC_U32(&msg, 4) = addr_dst;
> > > > > +	RPC_U32(&msg, 8) = len;
> > > > > +	RPC_U8(&msg, 12) = (uint8_t)fw;
> > > > > +	RPC_SIZE(&msg) = 5;
> > > > > +
> > > > > +	sc_call_rpc(ipc, &msg, false);
> > > > > +
> > > > > +	result = RPC_R8(&msg);
> > > > > +	return (sc_err_t)result;
> > > > > +}
> > > > > +
> > > > >
> > > > > So, the same code with mailbox framework will be some thing like this:
> > > > > 1. request all 4 channels in the probe. ignore completion
> > > > > callback and set proper timeout.
> > > > >
> > > >
> > > > That looks a bit strange. As I said in another email, MU physical
> > > > does not have multi Channels (here you mean are virtual channels).
> > > > And one whole MU instance is Exclusively used by SCU, why we need
> > > > abstract them into 4 channels to use separately with extra
> > > > unnecessary resource
> > > hold and code path executed.
> > >
> > > OK, so this MU should configured to use all 4 registers as one channel.
> > > I have nothing against it to do in generic driver.
> > >
> > > > > 2. keep your old code by replacing this part.
> > > > > 	/* Write remaining words */
> > > > > 	while (count < msg->size) {
> > > > > 		mbox_send_message(sc_ipc->mbox_chan[count %
> > > MU_TR_COUNT],
> > > > > msg->DATA.u32[count - 1]);
> > > > > 		count++;
> > > > > 	}
> > > > >
> > > > > The advantage of this variant. If SCU firmware will stall, the
> > > > > linux will be able to notify about it without blocking complete system.
> > > > >
> > > >
> > > > This part of code has been used for a long time and we've never
> > > > met the stall Issue which means SCU firmware guarantee it well.
> > > > But I agree a timeout mechanism Is better. However,  if only for
> > > > this reason, we can simply add a timeout mechanism in MU library
> > > > function as well, but still is far from a strong enough reason to
> > > > switch to a more
> > > complexed mailbox one.
> > >
> > > At this point a have absolutely zero tolerance. At the end I was one
> > > on poor devs hunting similar kind of stalls.
> > > Believe me, what can theoretically with 1% probability happen in the
> > > lab, will end with days/months of bug hunting work and thousands of
> field returns.
> > >
> >
> > Yeah, so I think a timeout mechanism is worthy to add.
> >
> > > > > Can you please provide (if possible) your old mailbox based
> > > implementation.
> > > > > I'm curious to see why it is slow.
> > > >
> > > > In our implementation:
> > > > 1) One channel per MU
> > > > 2) Tx using the same way to send msg as sc_ipc_write() in polling
> > > > mode
> > > > 3) Rx enables the first word interrupt and polling for the rest of
> > > > them in a hrtimer.
> > > >
> > > > The possible extra cost comparing to simple polling way:
> > > > 1) Extra unnecessary code execution path of mailbox which is not
> > > > used by SCU MU
> > > > 2) Interrupt latency
> > > > 3) Extra memory copy handling RX message separately.
> > > > 4) Extra schedule of hrtimer polling
> > > >
> > > > Some of them probably could be optimized. However, besides the
> > > > slow problem, the extra unnecessary complexity and can't be
> > > > generic (specific to SCU) are also important ones.
> > > >
> > > > And the MU general purpose interrupt feature and general purpose
> > > > flags feature may also not supported by mailbox well.
> > >
> > > Ok, I give up.
> > > Can we at least try to provide one device tree binding documentation
> > > for that?
> >
> > I will update the binding doc to use mailbox way as suggested by Rob.
> > But I still need to change mbox-cells to allow 0 which actually
> > already exist In kernel. So we may have too options for MU binding for
> users.
> > For SCU MU, mbox-cells = <0>
> > For General MU, mbox-cells = <1>
> 
> ok. cool.
> 
> > > At the end, it is one and the same HW IP block spread in one SoC (or
> > > different SoCs) an be used in different way. How exactly it will be
> > > used is the choice of the developer/customer/user...
> > > even if it will be described withing mailbox section dos not mean
> > > you are forced to use mailbox framework.
> > >
> > > Lets imagine worst case scenario and all of MU on i.MX8 are used by
> > > different drivers. In my mailbox binding I suggest to use
> > > <vendor>,<soc>- mu-<mu side>. On i.MX7 there is only one MU, so it
> > > looks like "fsl,imx7s-mu- a" and "fsl,imx7s-mu-b". i.MX8 has multiple MU
> so it should be enumerated.
> > > Probably it will look like: "fsl,imx8x-mu0-a".
> > > Do it make sense for you? Can we find and agreement here? :)
> > >
> >
> > IMHO one possible concern may be that results in creating two much
> > compatibles strings. For example, each SoC have two (fsl, <soc>-mu-<a or
> b>).
> > Now with MU supported SoCs, we have mx6sx, mx7ulp, Mx7d, mx8qm,
> mx8x,
> > mx8mq and etc...
> >
> > My question is that if it's really worthy to create a separate side b
> > compatible string instread of distinguish it by a property?
> >
> > And for general purpose MUs to communicate with M4. Both side can be
> > used by either A core and M core. If we're doing this way, once users
> > wants A core to access B side, users need to change the compatible
> > string, that seems more like an unusual way compared to changing
> properties.
> >
> > How do you think?
> 
> My current use case is Linux with DT running on all CPUs of one SoC. On
> i.MX7 there is only one MU and I use same mailbox driver on both sides. In
> this case it is enough to have same compatible with extra parameter to see
> the difference between A and B side.
> Just imagine, you'll use DT in your firmware on i.MX8. Will you use same
> driver on both sides of MU? If yes, I'm OK to exclude A and B from
> compatible.
> 

NXP release only provides a FreeRTOS based image running on M4 side.
Don't support Linux on M4. So we normally won't run the same driver
for both sides of MU. But for Linux community, it's good idea if Linux
can also run on M4 and using the same MU driver.

Regards
Dong Aisheng

> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2018-06-22  8:16 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-17 12:49 [PATCH V2 0/4] soc: imx: add scu firmware api support Dong Aisheng
2018-06-17 12:49 ` [PATCH V2 1/4] soc: imx: add mu library functions support Dong Aisheng
2018-06-17 12:49 ` [PATCH V2 2/4] dt-bindings: arm: fsl: add mu binding doc Dong Aisheng
2018-06-17 12:49   ` Dong Aisheng
2018-06-20 19:43   ` Rob Herring
2018-06-20 19:43     ` Rob Herring
2018-06-21  7:46     ` Sascha Hauer
2018-06-21  7:46       ` Sascha Hauer
2018-06-21 17:11       ` A.s. Dong
2018-06-21 17:11         ` A.s. Dong
2018-06-21 18:08         ` Oleksij Rempel
2018-06-21 18:08           ` Oleksij Rempel
2018-06-22  3:31           ` A.s. Dong
2018-06-22  3:31             ` A.s. Dong
2018-06-22  4:59             ` Oleksij Rempel
2018-06-22  4:59               ` Oleksij Rempel
2018-06-22  5:59               ` A.s. Dong
2018-06-22  5:59                 ` A.s. Dong
2018-06-22  6:48                 ` Oleksij Rempel
2018-06-22  6:48                   ` Oleksij Rempel
2018-06-22  8:16                   ` A.s. Dong
2018-06-22  8:16                     ` A.s. Dong
2018-06-22  5:49         ` Sascha Hauer
2018-06-22  5:49           ` Sascha Hauer
2018-06-22  6:04           ` A.s. Dong
2018-06-22  6:04             ` A.s. Dong
2018-06-17 12:49 ` [PATCH V2 3/4] dt-bindings: arm: fsl: add scu " Dong Aisheng
2018-06-17 12:49   ` Dong Aisheng
2018-06-20 19:44   ` Rob Herring
2018-06-20 19:44     ` Rob Herring
2018-06-21  3:38     ` A.s. Dong
2018-06-21  3:38       ` A.s. Dong
2018-06-21  7:37       ` Sascha Hauer
2018-06-21  7:37         ` Sascha Hauer
2018-06-21 12:05         ` A.s. Dong
2018-06-21 12:05           ` A.s. Dong
     [not found] ` <1529239789-26849-5-git-send-email-aisheng.dong@nxp.com>
2018-06-18  8:58   ` [PATCH V2 4/4] soc: imx: add SC firmware IPC and APIs Leonard Crestez

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.