* [PATCH V2 00/10] arm64: tegra: add BPMP support
@ 2016-07-05  9:04 Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
                   ` (9 more replies)
  0 siblings, 10 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Hi,

This series introduces the first announced Boot and Power Management Processor
(BPMP) for the new generation of Tegra SoCs, which is designed to handle the
boot process and to offload power management tasks from the CPU.

We also add very basic initial support for the Tegra186 SoC, which currently
covers the debug console and initrd for initial bring-up. More drivers and
functions can be supported on top of this later.

Thanks,
Joseph

Changes in V2
- revise the HSP mailbox and bpmp DT binding documents
- fix the HSP mailbox driver according to the binding update
- update the dts files to reflect the binding update

Joseph Lo (10):
  Documentation: dt-bindings: mailbox: tegra: Add binding for HSP
    mailbox
  mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives)
    driver
  Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  firmware: tegra: add IVC library
  firmware: tegra: add BPMP support
  soc/tegra: Add Tegra186 support
  arm64: defconfig: Enable Tegra186 SoC
  arm64: dts: tegra: Add Tegra186 support
  arm64: dts: tegra: Add NVIDIA Tegra186 P3310 main board support
  arm64: dts: tegra: Add NVIDIA P2771 board support

 .../bindings/firmware/nvidia,tegra186-bpmp.txt     |   77 +
 .../bindings/mailbox/nvidia,tegra186-hsp.txt       |   51 +
 arch/arm64/boot/dts/nvidia/Makefile                |    1 +
 arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts |    8 +
 arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi     |   34 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi           |   77 +
 arch/arm64/configs/defconfig                       |    1 +
 drivers/firmware/Kconfig                           |    1 +
 drivers/firmware/Makefile                          |    1 +
 drivers/firmware/tegra/Kconfig                     |   25 +
 drivers/firmware/tegra/Makefile                    |    2 +
 drivers/firmware/tegra/bpmp.c                      |  713 +++++++++
 drivers/firmware/tegra/ivc.c                       |  659 ++++++++
 drivers/mailbox/Kconfig                            |    9 +
 drivers/mailbox/Makefile                           |    2 +
 drivers/mailbox/tegra-hsp.c                        |  418 +++++
 drivers/soc/tegra/Kconfig                          |   14 +
 include/dt-bindings/clock/tegra186-clock.h         |  940 ++++++++++++
 include/dt-bindings/mailbox/tegra186-hsp.h         |   23 +
 include/dt-bindings/reset/tegra186-reset.h         |  217 +++
 include/soc/tegra/bpmp.h                           |   29 +
 include/soc/tegra/bpmp_abi.h                       | 1601 ++++++++++++++++++++
 include/soc/tegra/ivc.h                            |  102 ++
 23 files changed, 5005 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
 create mode 100644 Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186.dtsi
 create mode 100644 drivers/firmware/tegra/Kconfig
 create mode 100644 drivers/firmware/tegra/Makefile
 create mode 100644 drivers/firmware/tegra/bpmp.c
 create mode 100644 drivers/firmware/tegra/ivc.c
 create mode 100644 drivers/mailbox/tegra-hsp.c
 create mode 100644 include/dt-bindings/clock/tegra186-clock.h
 create mode 100644 include/dt-bindings/mailbox/tegra186-hsp.h
 create mode 100644 include/dt-bindings/reset/tegra186-reset.h
 create mode 100644 include/soc/tegra/bpmp.h
 create mode 100644 include/soc/tegra/bpmp_abi.h
 create mode 100644 include/soc/tegra/ivc.h

-- 
2.9.0


* [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-06 17:02   ` Stephen Warren
  2016-07-07 18:13   ` Sivaram Nair
  2016-07-05  9:04 ` [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver Joseph Lo
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Add a DT binding for the Hardware Synchronization Primitives (HSP). The
HSP lets the processors share resources and communicate with each other.
It provides a set of hardware synchronization primitives that
interprocessor communication (IPC) protocols can use when operating
between two processors that are not in an SMP relationship.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- revise the compatible string, interrupt-names, interrupts, and #mbox-cells
  properties
- remove "nvidia,hsp-function" property
- fix the header file name
- the binding supports the concept of multiple HSP sub-modules on one HSP HW
  block now.
---
 .../bindings/mailbox/nvidia,tegra186-hsp.txt       | 51 ++++++++++++++++++++++
 include/dt-bindings/mailbox/tegra186-hsp.h         | 23 ++++++++++
 2 files changed, 74 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
 create mode 100644 include/dt-bindings/mailbox/tegra186-hsp.h

diff --git a/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt b/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
new file mode 100644
index 000000000000..10e53edbe1c7
--- /dev/null
+++ b/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
@@ -0,0 +1,51 @@
+NVIDIA Tegra Hardware Synchronization Primitives (HSP)
+
+The HSP modules let the processors share resources and communicate with each
+other. They provide a set of hardware synchronization primitives for
+interprocessor communication (IPC), so that IPC protocols can use these
+primitives when operating between two processors that are not in an SMP
+relationship.
+
+The HSP supports the following features: shared mailboxes, shared semaphores,
+arbitrated semaphores, and doorbells.
+
+Required properties:
+- name : Should be hsp
+- compatible
+    Array of strings.
+    one of:
+    - "nvidia,tegra186-hsp"
+- reg : Offset and length of the register set for the device.
+- interrupt-names
+    Array of strings.
+    Contains a list of names for the interrupts described by the interrupts
+    property. May contain the following entries, in any order:
+    - "doorbell"
+    Users of this binding MUST look up entries in the interrupts property
+    by name, using this interrupt-names property to do so.
+- interrupts
+    Array of interrupt specifiers.
+    Must contain one entry per entry in the interrupt-names property,
+    in a matching order.
+- #mbox-cells : Should be 1.
+
+The mbox specifier of the "mboxes" property in the client node should use
+the "HSP_MBOX_ID" macro, which combines the HSP type and the master ID.
+These values are defined in the following file:
+
+- <dt-bindings/mailbox/tegra186-hsp.h>.
+
+Example:
+
+hsp_top0: hsp@3c00000 {
+	compatible = "nvidia,tegra186-hsp";
+	reg = <0x0 0x03c00000 0x0 0xa0000>;
+	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+	interrupt-names = "doorbell";
+	#mbox-cells = <1>;
+};
+
+client {
+	...
+	mboxes = <&hsp_top0 HSP_MBOX_ID(DB, HSP_DB_MASTER_XXX)>;
+};
diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h
new file mode 100644
index 000000000000..365dbeb5d894
--- /dev/null
+++ b/include/dt-bindings/mailbox/tegra186-hsp.h
@@ -0,0 +1,23 @@
+/*
+ * This header provides constants for binding nvidia,tegra186-hsp.
+ *
+ * The number with HSP_DB_MASTER prefix indicates the bit that is
+ * associated with a master ID in the doorbell registers.
+ */
+
+
+#ifndef _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H
+#define _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H
+
+#define HSP_MBOX_TYPE_DB 0x0
+#define HSP_MBOX_TYPE_SM 0x1
+#define HSP_MBOX_TYPE_SS 0x2
+#define HSP_MBOX_TYPE_AS 0x3
+
+#define HSP_DB_MASTER_CCPLEX 17
+#define HSP_DB_MASTER_BPMP 19
+
+#define HSP_MBOX_ID(type, ID) \
+		(HSP_MBOX_TYPE_##type << 16 | ID)
+
+#endif	/* _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H */
-- 
2.9.0
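
The specifier encoding described in this binding can be tried out in user
space. The following is only a minimal sketch: the constants and the
HSP_MBOX_ID macro are copied from the header in this patch, while the helper
names hsp_mbox_id_type(), hsp_mbox_id_master(), and hsp_mbox_id_selfcheck()
are hypothetical. The shift and masks used for decoding are assumed to match
what of_hsp_mbox_xlate() does in patch 02.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from include/dt-bindings/mailbox/tegra186-hsp.h. */
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1
#define HSP_MBOX_TYPE_SS 0x2
#define HSP_MBOX_TYPE_AS 0x3

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

#define HSP_MBOX_ID(type, ID) \
		(HSP_MBOX_TYPE_##type << 16 | ID)

/* Hypothetical helper: the HSP type lives in bits 16..19 of the cell. */
uint32_t hsp_mbox_id_type(uint32_t mbox_id)
{
	return (mbox_id >> 16) & 0xf;
}

/* Hypothetical helper: the master ID lives in the low 8 bits of the cell. */
uint32_t hsp_mbox_id_master(uint32_t mbox_id)
{
	return mbox_id & 0xff;
}

/* Hypothetical self-check: a specifier survives an encode/decode round trip. */
void hsp_mbox_id_selfcheck(void)
{
	uint32_t id = HSP_MBOX_ID(DB, HSP_DB_MASTER_BPMP);

	/* Doorbell type is 0, so the cell is just the master ID. */
	assert(id == 19);
	assert(hsp_mbox_id_type(id) == HSP_MBOX_TYPE_DB);
	assert(hsp_mbox_id_master(id) == HSP_DB_MASTER_BPMP);
}
```

With this encoding, the single mbox cell used by the BPMP client in the
example above carries both the doorbell type and the BPMP master ID, which is
why #mbox-cells is 1.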


* [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-06  7:05   ` Alexandre Courbot
  2016-07-07 21:10   ` Sivaram Nair
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

The Tegra HSP mailbox driver currently implements doorbell-based signaling
for interprocessor communication (IPC) with remote processors. The HSP HW
modules support several features for this: shared mailboxes, shared
semaphores, arbitrated semaphores, and doorbells. There are also multiple
HSP HW instances on the chip, so the driver is written to be extendable to
more features for different IPC requirements.

The driver for a remote processor can use it as a mailbox client and handle
the IPC protocol to synchronize the data communication.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- Update the driver to support the binding changes in V2
- it's extendable to support multiple HSP sub-modules on the same HSP HW block
  now.
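
For reviewers who want to reason about the doorbell enable handling without
hardware, here is a rough user-space model. It is only a sketch: struct
db_model and the db_*() helpers are hypothetical names, and the model assumes
that each doorbell has an ENABLE register with one bit per master, as
hsp_db_startup(), hsp_db_shutdown(), and hsp_db_can_ring() in this patch
suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Master IDs copied from the tegra186-hsp.h binding header. */
#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical model of a single doorbell's ENABLE register. */
struct db_model {
	uint32_t enable;
};

/* Like hsp_db_startup(): let master_id ring this doorbell. */
void db_enable_master(struct db_model *db, unsigned int master_id)
{
	db->enable |= BIT(master_id);
}

/* Like hsp_db_shutdown(): stop master_id from ringing this doorbell. */
void db_disable_master(struct db_model *db, unsigned int master_id)
{
	db->enable &= ~BIT(master_id);
}

/* Like hsp_db_can_ring(): is the CCPLEX allowed to ring this doorbell? */
bool db_ccplex_can_ring(const struct db_model *db)
{
	return !!(db->enable & BIT(HSP_DB_MASTER_CCPLEX));
}
```

In this model, startup sets the BPMP bit in the CCPLEX doorbell so that the
BPMP can signal the CPU, and the CPU can only signal the BPMP if the remote
side has set the CCPLEX bit in the BPMP doorbell.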
---
 drivers/mailbox/Kconfig     |   9 +
 drivers/mailbox/Makefile    |   2 +
 drivers/mailbox/tegra-hsp.c | 418 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 429 insertions(+)
 create mode 100644 drivers/mailbox/tegra-hsp.c

diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
index 5305923752d2..fe584cb54720 100644
--- a/drivers/mailbox/Kconfig
+++ b/drivers/mailbox/Kconfig
@@ -114,6 +114,15 @@ config MAILBOX_TEST
 	  Test client to help with testing new Controller driver
 	  implementations.
 
+config TEGRA_HSP_MBOX
+	bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
+	depends on ARCH_TEGRA_186_SOC
+	help
+	  The Tegra HSP driver is used for the interprocessor communication
+	  between different remote processors and host processors on Tegra186
+	  and later SoCs. Say Y here if you want to have this support.
+	  If unsure say N.
+
 config XGENE_SLIMPRO_MBOX
 	tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
 	depends on ARCH_XGENE
diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
index 0be3e742bb7d..26d8f91c7fea 100644
--- a/drivers/mailbox/Makefile
+++ b/drivers/mailbox/Makefile
@@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
 obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
 
 obj-$(CONFIG_HI6220_MBOX)	+= hi6220-mailbox.o
+
+obj-$(CONFIG_TEGRA_HSP_MBOX)	+= tegra-hsp.o
diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
new file mode 100644
index 000000000000..93c3ef58f29f
--- /dev/null
+++ b/drivers/mailbox/tegra-hsp.c
@@ -0,0 +1,418 @@
+/*
+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/mailbox_controller.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <dt-bindings/mailbox/tegra186-hsp.h>
+
+#define HSP_INT_DIMENSIONING	0x380
+#define HSP_nSM_OFFSET		0
+#define HSP_nSS_OFFSET		4
+#define HSP_nAS_OFFSET		8
+#define HSP_nDB_OFFSET		12
+#define HSP_nSI_OFFSET		16
+#define HSP_nINT_MASK		0xf
+
+#define HSP_DB_REG_TRIGGER	0x0
+#define HSP_DB_REG_ENABLE	0x4
+#define HSP_DB_REG_RAW		0x8
+#define HSP_DB_REG_PENDING	0xc
+
+#define HSP_DB_CCPLEX		1
+#define HSP_DB_BPMP		3
+
+#define MAX_NUM_HSP_CHAN 32
+#define MAX_NUM_HSP_DB 7
+
+#define hsp_db_offset(i, d) \
+	(d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) + \
+	(i) * 0x100)
+
+struct tegra_hsp_db_chan {
+	int master_id;
+	int db_id;
+};
+
+struct tegra_hsp_mbox_chan {
+	int type;
+	union {
+		struct tegra_hsp_db_chan db_chan;
+	};
+};
+
+struct tegra_hsp_mbox {
+	struct mbox_controller *mbox;
+	void __iomem *base;
+	void __iomem *db_base[MAX_NUM_HSP_DB];
+	int db_irq;
+	int nr_sm;
+	int nr_as;
+	int nr_ss;
+	int nr_db;
+	int nr_si;
+	spinlock_t lock;
+};
+
+static inline u32 hsp_readl(void __iomem *base, int reg)
+{
+	return readl(base + reg);
+}
+
+static inline void hsp_writel(void __iomem *base, int reg, u32 val)
+{
+	writel(val, base + reg);
+	readl(base + reg);
+}
+
+static int hsp_db_can_ring(void __iomem *db_base)
+{
+	u32 reg;
+
+	reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
+
+	return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
+}
+
+static irqreturn_t hsp_db_irq(int irq, void *p)
+{
+	struct tegra_hsp_mbox *hsp_mbox = p;
+	ulong val;
+	int master_id;
+
+	val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
+			       HSP_DB_REG_PENDING);
+	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING, val);
+
+	spin_lock(&hsp_mbox->lock);
+	for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
+		struct mbox_chan *chan;
+		struct tegra_hsp_mbox_chan *mchan;
+		int i;
+
+		for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
+			chan = &hsp_mbox->mbox->chans[i];
+			mchan = chan->con_priv;
+			/*
+			 * Test con_priv inline so that an unclaimed last
+			 * channel cannot leave chan non-NULL after the loop.
+			 */
+			if (mchan && mchan->type == HSP_MBOX_TYPE_DB &&
+			    mchan->db_chan.master_id == master_id)
+				break;
+			chan = NULL;
+		}
+
+		if (chan)
+			mbox_chan_received_data(chan, NULL);
+	}
+	spin_unlock(&hsp_mbox->lock);
+
+	return IRQ_HANDLED;
+}
+
+static int hsp_db_send_data(struct mbox_chan *chan, void *data)
+{
+	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
+	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
+	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
+
+	hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
+
+	return 0;
+}
+
+static int hsp_db_startup(struct mbox_chan *chan)
+{
+	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
+	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
+	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
+	u32 val;
+	unsigned long flag;
+
+	if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
+		dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
+			db_chan->master_id);
+		return -EINVAL;
+	}
+
+	spin_lock_irqsave(&hsp_mbox->lock, flag);
+	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
+	val |= BIT(db_chan->master_id);
+	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
+	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
+
+	if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
+		return -ENODEV;
+
+	return 0;
+}
+
+static void hsp_db_shutdown(struct mbox_chan *chan)
+{
+	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
+	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
+	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
+	u32 val;
+	unsigned long flag;
+
+	spin_lock_irqsave(&hsp_mbox->lock, flag);
+	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
+	val &= ~BIT(db_chan->master_id);
+	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
+	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
+}
+
+static bool hsp_db_last_tx_done(struct mbox_chan *chan)
+{
+	return true;
+}
+
+static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
+			     struct mbox_chan *mchan, int master_id)
+{
+	struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
+	struct tegra_hsp_mbox_chan *hsp_mbox_chan;
+	int ret;
+
+	if (!hsp_mbox->db_irq) {
+		int i;
+
+		hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
+		ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
+				       hsp_db_irq, IRQF_NO_SUSPEND,
+				       dev_name(&pdev->dev), hsp_mbox);
+		if (ret)
+			return ret;
+
+		for (i = 0; i < MAX_NUM_HSP_DB; i++)
+			hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);
+	}
+
+	hsp_mbox_chan = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox_chan),
+				     GFP_KERNEL);
+	if (!hsp_mbox_chan)
+		return -ENOMEM;
+
+	hsp_mbox_chan->type = HSP_MBOX_TYPE_DB;
+	hsp_mbox_chan->db_chan.master_id = master_id;
+	switch (master_id) {
+	case HSP_DB_MASTER_BPMP:
+		hsp_mbox_chan->db_chan.db_id = HSP_DB_BPMP;
+		break;
+	default:
+		hsp_mbox_chan->db_chan.db_id = MAX_NUM_HSP_DB;
+		break;
+	}
+
+	mchan->con_priv = hsp_mbox_chan;
+
+	return 0;
+}
+
+static int hsp_send_data(struct mbox_chan *chan, void *data)
+{
+	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
+	int ret = 0;
+
+	switch (hsp_mbox_chan->type) {
+	case HSP_MBOX_TYPE_DB:
+		ret = hsp_db_send_data(chan, data);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+static int hsp_startup(struct mbox_chan *chan)
+{
+	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
+	int ret = 0;
+
+	switch (hsp_mbox_chan->type) {
+	case HSP_MBOX_TYPE_DB:
+		ret = hsp_db_startup(chan);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+static void hsp_shutdown(struct mbox_chan *chan)
+{
+	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
+
+	switch (hsp_mbox_chan->type) {
+	case HSP_MBOX_TYPE_DB:
+		hsp_db_shutdown(chan);
+		break;
+	default:
+		break;
+	}
+
+	chan->con_priv = NULL;
+}
+
+static bool hsp_last_tx_done(struct mbox_chan *chan)
+{
+	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
+	bool ret = true;
+
+	switch (hsp_mbox_chan->type) {
+	case HSP_MBOX_TYPE_DB:
+		ret = hsp_db_last_tx_done(chan);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+static const struct mbox_chan_ops tegra_hsp_ops = {
+	.send_data = hsp_send_data,
+	.startup = hsp_startup,
+	.shutdown = hsp_shutdown,
+	.last_tx_done = hsp_last_tx_done,
+};
+
+static const struct of_device_id tegra_hsp_match[] = {
+	{ .compatible = "nvidia,tegra186-hsp" },
+	{ }
+};
+
+static struct mbox_chan *
+of_hsp_mbox_xlate(struct mbox_controller *mbox,
+		  const struct of_phandle_args *sp)
+{
+	int mbox_id = sp->args[0];
+	int hsp_type = (mbox_id >> 16) & 0xf;
+	int master_id = mbox_id & 0xff;
+	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(mbox->dev);
+	struct mbox_chan *free_chan;
+	int i, ret = 0;
+
+	spin_lock(&hsp_mbox->lock);
+
+	for (i = 0; i < mbox->num_chans; i++) {
+		free_chan = &mbox->chans[i];
+		if (!free_chan->con_priv)
+			break;
+		free_chan = NULL;
+	}
+
+	if (!free_chan) {
+		spin_unlock(&hsp_mbox->lock);
+		return ERR_PTR(-EFAULT);
+	}
+
+	switch (hsp_type) {
+	case HSP_MBOX_TYPE_DB:
+		ret = tegra_hsp_db_init(hsp_mbox, free_chan, master_id);
+		break;
+	default:
+		break;
+	}
+
+	spin_unlock(&hsp_mbox->lock);
+
+	if (ret)
+		free_chan = ERR_PTR(-EFAULT);
+
+	return free_chan;
+}
+
+static int tegra_hsp_probe(struct platform_device *pdev)
+{
+	struct tegra_hsp_mbox *hsp_mbox;
+	struct resource *res;
+	int ret = 0;
+	u32 reg;
+
+	hsp_mbox = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox), GFP_KERNEL);
+	if (!hsp_mbox)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	hsp_mbox->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(hsp_mbox->base))
+		return PTR_ERR(hsp_mbox->base);
+
+	reg = hsp_readl(hsp_mbox->base, HSP_INT_DIMENSIONING);
+	hsp_mbox->nr_sm = (reg >> HSP_nSM_OFFSET) & HSP_nINT_MASK;
+	hsp_mbox->nr_ss = (reg >> HSP_nSS_OFFSET) & HSP_nINT_MASK;
+	hsp_mbox->nr_as = (reg >> HSP_nAS_OFFSET) & HSP_nINT_MASK;
+	hsp_mbox->nr_db = (reg >> HSP_nDB_OFFSET) & HSP_nINT_MASK;
+	hsp_mbox->nr_si = (reg >> HSP_nSI_OFFSET) & HSP_nINT_MASK;
+
+	hsp_mbox->mbox = devm_kzalloc(&pdev->dev,
+				      sizeof(*hsp_mbox->mbox), GFP_KERNEL);
+	if (!hsp_mbox->mbox)
+		return -ENOMEM;
+
+	hsp_mbox->mbox->chans =
+		devm_kcalloc(&pdev->dev, MAX_NUM_HSP_CHAN,
+			     sizeof(*hsp_mbox->mbox->chans), GFP_KERNEL);
+	if (!hsp_mbox->mbox->chans)
+		return -ENOMEM;
+
+	hsp_mbox->mbox->of_xlate = of_hsp_mbox_xlate;
+	hsp_mbox->mbox->num_chans = MAX_NUM_HSP_CHAN;
+	hsp_mbox->mbox->dev = &pdev->dev;
+	hsp_mbox->mbox->txdone_irq = false;
+	hsp_mbox->mbox->txdone_poll = false;
+	hsp_mbox->mbox->ops = &tegra_hsp_ops;
+	platform_set_drvdata(pdev, hsp_mbox);
+
+	spin_lock_init(&hsp_mbox->lock);
+
+	ret = mbox_controller_register(hsp_mbox->mbox);
+	if (ret) {
+		pr_err("tegra-hsp mbox: fail to register mailbox %d.\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tegra_hsp_remove(struct platform_device *pdev)
+{
+	struct tegra_hsp_mbox *hsp_mbox = platform_get_drvdata(pdev);
+
+	if (hsp_mbox->mbox)
+		mbox_controller_unregister(hsp_mbox->mbox);
+
+	return 0;
+}
+
+static struct platform_driver tegra_hsp_driver = {
+	.driver = {
+		.name = "tegra-hsp",
+		.of_match_table = tegra_hsp_match,
+	},
+	.probe = tegra_hsp_probe,
+	.remove = tegra_hsp_remove,
+};
+
+static int __init tegra_hsp_init(void)
+{
+	return platform_driver_register(&tegra_hsp_driver);
+}
+core_initcall(tegra_hsp_init);
-- 
2.9.0
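
The register layout arithmetic in tegra_hsp_probe() and hsp_db_offset() can
be checked outside the kernel. The sketch below reuses the field offsets and
mask from the driver above. struct hsp_dims, hsp_decode_dims(), and
hsp_db_byte_offset() are hypothetical names, and any register value fed to
them in a test is made up for illustration, not read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of HSP_INT_DIMENSIONING, copied from the driver. */
#define HSP_nSM_OFFSET	0
#define HSP_nSS_OFFSET	4
#define HSP_nAS_OFFSET	8
#define HSP_nDB_OFFSET	12
#define HSP_nSI_OFFSET	16
#define HSP_nINT_MASK	0xf

/* Hypothetical container for the decoded primitive counts. */
struct hsp_dims {
	unsigned int nr_sm;
	unsigned int nr_ss;
	unsigned int nr_as;
	unsigned int nr_db;
	unsigned int nr_si;
};

/* Split the dimensioning register into its five 4-bit fields. */
struct hsp_dims hsp_decode_dims(uint32_t reg)
{
	struct hsp_dims d;

	d.nr_sm = (reg >> HSP_nSM_OFFSET) & HSP_nINT_MASK;
	d.nr_ss = (reg >> HSP_nSS_OFFSET) & HSP_nINT_MASK;
	d.nr_as = (reg >> HSP_nAS_OFFSET) & HSP_nINT_MASK;
	d.nr_db = (reg >> HSP_nDB_OFFSET) & HSP_nINT_MASK;
	d.nr_si = (reg >> HSP_nSI_OFFSET) & HSP_nINT_MASK;

	return d;
}

/*
 * Byte offset of doorbell i from the HSP base, using the same formula as
 * hsp_db_offset(): the doorbells follow one common 64 KiB page, one page
 * per two shared mailboxes, and one page per shared or arbitrated
 * semaphore. Each doorbell then takes 0x100 bytes.
 */
uint32_t hsp_db_byte_offset(const struct hsp_dims *d, unsigned int i)
{
	uint32_t pages = 1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as;

	return (pages << 16) + i * 0x100;
}
```

For example, a made-up register value of 0x87048 decodes to 8 shared
mailboxes, 4 shared semaphores, no arbitrated semaphores, 7 doorbells, and 8
shared interrupts, which puts doorbell 3 (HSP_DB_BPMP) at byte offset
0x90300.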


* [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-06 11:42   ` Alexandre Courbot
                     ` (3 more replies)
  2016-07-05  9:04 ` [PATCH V2 04/10] firmware: tegra: add IVC library Joseph Lo
                   ` (6 subsequent siblings)
  9 siblings, 4 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

The BPMP is a dedicated processor in Tegra chips, designed to handle the
boot process and to offload power management, clock management, and reset
control tasks from the CPU. The binding document defines the resources used
by the BPMP firmware driver, which sets up the interprocessor communication
(IPC) between the CPU and the BPMP.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- update the message that the BPMP is clock and reset control provider
- add tegra186-clock.h and tegra186-reset.h header files
- revise the description of the required properties
---
 .../bindings/firmware/nvidia,tegra186-bpmp.txt     |  77 ++
 include/dt-bindings/clock/tegra186-clock.h         | 940 +++++++++++++++++++++
 include/dt-bindings/reset/tegra186-reset.h         | 217 +++++
 3 files changed, 1234 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
 create mode 100644 include/dt-bindings/clock/tegra186-clock.h
 create mode 100644 include/dt-bindings/reset/tegra186-reset.h

diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
new file mode 100644
index 000000000000..4d0b6eba56c5
--- /dev/null
+++ b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
@@ -0,0 +1,77 @@
+NVIDIA Tegra Boot and Power Management Processor (BPMP)
+
+The BPMP is a dedicated processor in Tegra chips. It is designed to
+handle the boot process and to offload power management, clock
+management, and reset control tasks from the CPU. This binding
+document defines the resources used by the BPMP firmware driver,
+which sets up the interprocessor communication (IPC) between the
+CPU and the BPMP.
+
+Required properties:
+- name : Should be bpmp
+- compatible
+    Array of strings
+    One of:
+    - "nvidia,tegra186-bpmp"
+- mboxes : The phandle of mailbox controller and the mailbox specifier.
+- shmem : List of phandles to the TX and RX shared memory areas that
+	  the IPC between the CPU and the BPMP is based on.
+- #clock-cells : Should be 1.
+- #reset-cells : Should be 1.
+
+This node is a mailbox consumer. See the following files for details of
+the mailbox subsystem, and the specifiers implemented by the relevant
+provider(s):
+
+- Documentation/devicetree/bindings/mailbox/mailbox.txt
+- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
+
+This node is a clock and reset provider. See the following files for
+general documentation of those features, and the specifiers implemented
+by this node:
+
+- Documentation/devicetree/bindings/clock/clock-bindings.txt
+- include/dt-bindings/clock/tegra186-clock.h
+- Documentation/devicetree/bindings/reset/reset.txt
+- include/dt-bindings/reset/tegra186-reset.h
+
+The shared memory bindings for BPMP
+-----------------------------------
+
+The shared memory areas for IPC TX and RX between the CPU and the BPMP
+are predefined and reside in sysram, which is an SRAM inside the chip.
+
+See "Documentation/devicetree/bindings/sram/sram.txt" for the bindings.
+
+Example:
+
+hsp_top0: hsp@3c00000 {
+	...
+	#mbox-cells = <1>;
+};
+
+sysram@30000000 {
+	compatible = "nvidia,tegra186-sysram", "mmio-ram";
+	reg = <0x0 0x30000000 0x0 0x50000>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+	ranges = <0 0x0 0x0 0x30000000 0x0 0x50000>;
+
+	cpu_bpmp_tx: bpmp_shmem@4e000 {
+		compatible = "nvidia,tegra186-bpmp-shmem";
+		reg = <0x0 0x4e000 0x0 0x1000>;
+	};
+
+	cpu_bpmp_rx: bpmp_shmem@4f000 {
+		compatible = "nvidia,tegra186-bpmp-shmem";
+		reg = <0x0 0x4f000 0x0 0x1000>;
+	};
+};
+
+bpmp {
+	compatible = "nvidia,tegra186-bpmp";
+	mboxes = <&hsp_top0 HSP_MBOX_ID(DB, HSP_DB_MASTER_BPMP)>;
+	shmem = <&cpu_bpmp_tx &cpu_bpmp_rx>;
+	#clock-cells = <1>;
+	#reset-cells = <1>;
+};
diff --git a/include/dt-bindings/clock/tegra186-clock.h b/include/dt-bindings/clock/tegra186-clock.h
new file mode 100644
index 000000000000..f73d32098f99
--- /dev/null
+++ b/include/dt-bindings/clock/tegra186-clock.h
@@ -0,0 +1,940 @@
+/** @file */
+
+#ifndef _MACH_T186_CLK_T186_H
+#define _MACH_T186_CLK_T186_H
+
+/**
+ * @defgroup clock_ids Clock Identifiers
+ * @{
+ *   @defgroup extern_input external input clocks
+ *   @{
+ *     @def TEGRA186_CLK_OSC
+ *     @def TEGRA186_CLK_CLK_32K
+ *     @def TEGRA186_CLK_DTV_INPUT
+ *     @def TEGRA186_CLK_SOR0_PAD_CLKOUT
+ *     @def TEGRA186_CLK_SOR1_PAD_CLKOUT
+ *     @def TEGRA186_CLK_I2S1_SYNC_INPUT
+ *     @def TEGRA186_CLK_I2S2_SYNC_INPUT
+ *     @def TEGRA186_CLK_I2S3_SYNC_INPUT
+ *     @def TEGRA186_CLK_I2S4_SYNC_INPUT
+ *     @def TEGRA186_CLK_I2S5_SYNC_INPUT
+ *     @def TEGRA186_CLK_I2S6_SYNC_INPUT
+ *     @def TEGRA186_CLK_SPDIFIN_SYNC_INPUT
+ *   @}
+ *
+ *   @defgroup extern_output external output clocks
+ *   @{
+ *     @def TEGRA186_CLK_EXTPERIPH1
+ *     @def TEGRA186_CLK_EXTPERIPH2
+ *     @def TEGRA186_CLK_EXTPERIPH3
+ *     @def TEGRA186_CLK_EXTPERIPH4
+ *   @}
+ *
+ *   @defgroup display_clks display related clocks
+ *   @{
+ *     @def TEGRA186_CLK_CEC
+ *     @def TEGRA186_CLK_DSIC
+ *     @def TEGRA186_CLK_DSIC_LP
+ *     @def TEGRA186_CLK_DSID
+ *     @def TEGRA186_CLK_DSID_LP
+ *     @def TEGRA186_CLK_DPAUX1
+ *     @def TEGRA186_CLK_DPAUX
+ *     @def TEGRA186_CLK_HDA2HDMICODEC
+ *     @def TEGRA186_CLK_NVDISPLAY_DISP
+ *     @def TEGRA186_CLK_NVDISPLAY_DSC
+ *     @def TEGRA186_CLK_NVDISPLAY_P0
+ *     @def TEGRA186_CLK_NVDISPLAY_P1
+ *     @def TEGRA186_CLK_NVDISPLAY_P2
+ *     @def TEGRA186_CLK_NVDISPLAYHUB
+ *     @def TEGRA186_CLK_SOR_SAFE
+ *     @def TEGRA186_CLK_SOR0
+ *     @def TEGRA186_CLK_SOR0_OUT
+ *     @def TEGRA186_CLK_SOR1
+ *     @def TEGRA186_CLK_SOR1_OUT
+ *     @def TEGRA186_CLK_DSI
+ *     @def TEGRA186_CLK_MIPI_CAL
+ *     @def TEGRA186_CLK_DSIA_LP
+ *     @def TEGRA186_CLK_DSIB
+ *     @def TEGRA186_CLK_DSIB_LP
+ *   @}
+ *
+ *   @defgroup camera_clks camera related clocks
+ *   @{
+ *     @def TEGRA186_CLK_NVCSI
+ *     @def TEGRA186_CLK_NVCSILP
+ *     @def TEGRA186_CLK_VI
+ *   @}
+ *
+ *   @defgroup audio_clks audio related clocks
+ *   @{
+ *     @def TEGRA186_CLK_ACLK
+ *     @def TEGRA186_CLK_ADSP
+ *     @def TEGRA186_CLK_ADSPNEON
+ *     @def TEGRA186_CLK_AHUB
+ *     @def TEGRA186_CLK_APE
+ *     @def TEGRA186_CLK_APB2APE
+ *     @def TEGRA186_CLK_AUD_MCLK
+ *     @def TEGRA186_CLK_DMIC1
+ *     @def TEGRA186_CLK_DMIC2
+ *     @def TEGRA186_CLK_DMIC3
+ *     @def TEGRA186_CLK_DMIC4
+ *     @def TEGRA186_CLK_DSPK1
+ *     @def TEGRA186_CLK_DSPK2
+ *     @def TEGRA186_CLK_HDA
+ *     @def TEGRA186_CLK_HDA2CODEC_2X
+ *     @def TEGRA186_CLK_I2S1
+ *     @def TEGRA186_CLK_I2S2
+ *     @def TEGRA186_CLK_I2S3
+ *     @def TEGRA186_CLK_I2S4
+ *     @def TEGRA186_CLK_I2S5
+ *     @def TEGRA186_CLK_I2S6
+ *     @def TEGRA186_CLK_MAUD
+ *     @def TEGRA186_CLK_PLL_A_OUT0
+ *     @def TEGRA186_CLK_SPDIF_DOUBLER
+ *     @def TEGRA186_CLK_SPDIF_IN
+ *     @def TEGRA186_CLK_SPDIF_OUT
+ *     @def TEGRA186_CLK_SYNC_DMIC1
+ *     @def TEGRA186_CLK_SYNC_DMIC2
+ *     @def TEGRA186_CLK_SYNC_DMIC3
+ *     @def TEGRA186_CLK_SYNC_DMIC4
+ *     @def TEGRA186_CLK_SYNC_DMIC5
+ *     @def TEGRA186_CLK_SYNC_DSPK1
+ *     @def TEGRA186_CLK_SYNC_DSPK2
+ *     @def TEGRA186_CLK_SYNC_I2S1
+ *     @def TEGRA186_CLK_SYNC_I2S2
+ *     @def TEGRA186_CLK_SYNC_I2S3
+ *     @def TEGRA186_CLK_SYNC_I2S4
+ *     @def TEGRA186_CLK_SYNC_I2S5
+ *     @def TEGRA186_CLK_SYNC_I2S6
+ *     @def TEGRA186_CLK_SYNC_SPDIF
+ *   @}
+ *
+ *   @defgroup uart_clks UART clocks
+ *   @{
+ *     @def TEGRA186_CLK_AON_UART_FST_MIPI_CAL
+ *     @def TEGRA186_CLK_UARTA
+ *     @def TEGRA186_CLK_UARTB
+ *     @def TEGRA186_CLK_UARTC
+ *     @def TEGRA186_CLK_UARTD
+ *     @def TEGRA186_CLK_UARTE
+ *     @def TEGRA186_CLK_UARTF
+ *     @def TEGRA186_CLK_UARTG
+ *     @def TEGRA186_CLK_UART_FST_MIPI_CAL
+ *   @}
+ *
+ *   @defgroup i2c_clks I2C clocks
+ *   @{
+ *     @def TEGRA186_CLK_AON_I2C_SLOW
+ *     @def TEGRA186_CLK_I2C1
+ *     @def TEGRA186_CLK_I2C2
+ *     @def TEGRA186_CLK_I2C3
+ *     @def TEGRA186_CLK_I2C4
+ *     @def TEGRA186_CLK_I2C5
+ *     @def TEGRA186_CLK_I2C6
+ *     @def TEGRA186_CLK_I2C8
+ *     @def TEGRA186_CLK_I2C9
+ *     @def TEGRA186_CLK_I2C1
+ *     @def TEGRA186_CLK_I2C12
+ *     @def TEGRA186_CLK_I2C13
+ *     @def TEGRA186_CLK_I2C14
+ *     @def TEGRA186_CLK_I2C_SLOW
+ *     @def TEGRA186_CLK_VI_I2C
+ *   @}
+ *
+ *   @defgroup spi_clks SPI clocks
+ *   @{
+ *     @def TEGRA186_CLK_SPI1
+ *     @def TEGRA186_CLK_SPI2
+ *     @def TEGRA186_CLK_SPI3
+ *     @def TEGRA186_CLK_SPI4
+ *   @}
+ *
+ *   @defgroup storage storage related clocks
+ *   @{
+ *     @def TEGRA186_CLK_SATA
+ *     @def TEGRA186_CLK_SATA_OOB
+ *     @def TEGRA186_CLK_SATA_IOBIST
+ *     @def TEGRA186_CLK_SDMMC_LEGACY_TM
+ *     @def TEGRA186_CLK_SDMMC1
+ *     @def TEGRA186_CLK_SDMMC2
+ *     @def TEGRA186_CLK_SDMMC3
+ *     @def TEGRA186_CLK_SDMMC4
+ *     @def TEGRA186_CLK_QSPI
+ *     @def TEGRA186_CLK_QSPI_OUT
+ *     @def TEGRA186_CLK_UFSDEV_REF
+ *     @def TEGRA186_CLK_UFSHC
+ *   @}
+ *
+ *   @defgroup pwm_clks PWM clocks
+ *   @{
+ *     @def TEGRA186_CLK_PWM1
+ *     @def TEGRA186_CLK_PWM2
+ *     @def TEGRA186_CLK_PWM3
+ *     @def TEGRA186_CLK_PWM4
+ *     @def TEGRA186_CLK_PWM5
+ *     @def TEGRA186_CLK_PWM6
+ *     @def TEGRA186_CLK_PWM7
+ *     @def TEGRA186_CLK_PWM8
+ *   @}
+ *
+ *   @defgroup plls PLLs and related clocks
+ *   @{
+ *     @def TEGRA186_CLK_PLLREFE_OUT_GATED
+ *     @def TEGRA186_CLK_PLLREFE_OUT1
+ *     @def TEGRA186_CLK_PLLD_OUT1
+ *     @def TEGRA186_CLK_PLLP_OUT0
+ *     @def TEGRA186_CLK_PLLP_OUT5
+ *     @def TEGRA186_CLK_PLLA
+ *     @def TEGRA186_CLK_PLLE_PWRSEQ
+ *     @def TEGRA186_CLK_PLL_A_OUT1
+ *     @def TEGRA186_CLK_PLLREFE_REF
+ *     @def TEGRA186_CLK_UPHY_PLL0_PWRSEQ
+ *     @def TEGRA186_CLK_UPHY_PLL1_PWRSEQ
+ *     @def TEGRA186_CLK_PLLREFE_PLLE_PASSTHROUGH
+ *     @def TEGRA186_CLK_PLLREFE_PEX
+ *     @def TEGRA186_CLK_PLLREFE_IDDQ
+ *     @def TEGRA186_CLK_PLLC_OUT_AON
+ *     @def TEGRA186_CLK_PLLC_OUT_ISP
+ *     @def TEGRA186_CLK_PLLC_OUT_VE
+ *     @def TEGRA186_CLK_PLLC4_OUT
+ *     @def TEGRA186_CLK_PLLREFE_OUT
+ *     @def TEGRA186_CLK_PLLREFE_PLL_REF
+ *     @def TEGRA186_CLK_PLLE
+ *     @def TEGRA186_CLK_PLLC
+ *     @def TEGRA186_CLK_PLLP
+ *     @def TEGRA186_CLK_PLLD
+ *     @def TEGRA186_CLK_PLLD2
+ *     @def TEGRA186_CLK_PLLREFE_VCO
+ *     @def TEGRA186_CLK_PLLC2
+ *     @def TEGRA186_CLK_PLLC3
+ *     @def TEGRA186_CLK_PLLDP
+ *     @def TEGRA186_CLK_PLLC4_VCO
+ *     @def TEGRA186_CLK_PLLA1
+ *     @def TEGRA186_CLK_PLLNVCSI
+ *     @def TEGRA186_CLK_PLLDISPHUB
+ *     @def TEGRA186_CLK_PLLD3
+ *     @def TEGRA186_CLK_PLLBPMPCAM
+ *     @def TEGRA186_CLK_PLLAON
+ *     @def TEGRA186_CLK_PLLU
+ *     @def TEGRA186_CLK_PLLC4_VCO_DIV2
+ *     @def TEGRA186_CLK_PLL_REF
+ *     @def TEGRA186_CLK_PLLREFE_OUT1_DIV5
+ *     @def TEGRA186_CLK_UTMIP_PLL_PWRSEQ
+ *     @def TEGRA186_CLK_PLL_U_48M
+ *     @def TEGRA186_CLK_PLL_U_480M
+ *     @def TEGRA186_CLK_PLLC4_OUT0
+ *     @def TEGRA186_CLK_PLLC4_OUT1
+ *     @def TEGRA186_CLK_PLLC4_OUT2
+ *     @def TEGRA186_CLK_PLLC4_OUT_MUX
+ *     @def TEGRA186_CLK_DFLLDISP_DIV
+ *     @def TEGRA186_CLK_PLLDISPHUB_DIV
+ *     @def TEGRA186_CLK_PLLP_DIV8
+ *   @}
+ *
+ *   @defgroup nafll_clks NAFLL clock sources
+ *   @{
+ *     @def TEGRA186_CLK_NAFLL_AXI_CBB
+ *     @def TEGRA186_CLK_NAFLL_BCPU
+ *     @def TEGRA186_CLK_NAFLL_BPMP
+ *     @def TEGRA186_CLK_NAFLL_DISP
+ *     @def TEGRA186_CLK_NAFLL_GPU
+ *     @def TEGRA186_CLK_NAFLL_ISP
+ *     @def TEGRA186_CLK_NAFLL_MCPU
+ *     @def TEGRA186_CLK_NAFLL_NVDEC
+ *     @def TEGRA186_CLK_NAFLL_NVENC
+ *     @def TEGRA186_CLK_NAFLL_NVJPG
+ *     @def TEGRA186_CLK_NAFLL_SCE
+ *     @def TEGRA186_CLK_NAFLL_SE
+ *     @def TEGRA186_CLK_NAFLL_TSEC
+ *     @def TEGRA186_CLK_NAFLL_TSECB
+ *     @def TEGRA186_CLK_NAFLL_VI
+ *     @def TEGRA186_CLK_NAFLL_VIC
+ *   @}
+ *
+ *   @defgroup mphy MPHY related clocks
+ *   @{
+ *     @def TEGRA186_CLK_MPHY_L0_RX_SYMB
+ *     @def TEGRA186_CLK_MPHY_L0_RX_LS_BIT
+ *     @def TEGRA186_CLK_MPHY_L0_TX_SYMB
+ *     @def TEGRA186_CLK_MPHY_L0_TX_LS_3XBIT
+ *     @def TEGRA186_CLK_MPHY_L0_RX_ANA
+ *     @def TEGRA186_CLK_MPHY_L1_RX_ANA
+ *     @def TEGRA186_CLK_MPHY_IOBIST
+ *     @def TEGRA186_CLK_MPHY_TX_1MHZ_REF
+ *     @def TEGRA186_CLK_MPHY_CORE_PLL_FIXED
+ *   @}
+ *
+ *   @defgroup eavb EAVB related clocks
+ *   @{
+ *     @def TEGRA186_CLK_EQOS_AXI
+ *     @def TEGRA186_CLK_EQOS_PTP_REF
+ *     @def TEGRA186_CLK_EQOS_RX
+ *     @def TEGRA186_CLK_EQOS_RX_INPUT
+ *     @def TEGRA186_CLK_EQOS_TX
+ *   @}
+ *
+ *   @defgroup usb USB related clocks
+ *   @{
+ *     @def TEGRA186_CLK_PEX_USB_PAD0_MGMT
+ *     @def TEGRA186_CLK_PEX_USB_PAD1_MGMT
+ *     @def TEGRA186_CLK_HSIC_TRK
+ *     @def TEGRA186_CLK_USB2_TRK
+ *     @def TEGRA186_CLK_USB2_HSIC_TRK
+ *     @def TEGRA186_CLK_XUSB_CORE_SS
+ *     @def TEGRA186_CLK_XUSB_CORE_DEV
+ *     @def TEGRA186_CLK_XUSB_FALCON
+ *     @def TEGRA186_CLK_XUSB_FS
+ *     @def TEGRA186_CLK_XUSB
+ *     @def TEGRA186_CLK_XUSB_DEV
+ *     @def TEGRA186_CLK_XUSB_HOST
+ *     @def TEGRA186_CLK_XUSB_SS
+ *   @}
+ *
+ *   @defgroup bigblock compute block related clocks
+ *   @{
+ *     @def TEGRA186_CLK_GPCCLK
+ *     @def TEGRA186_CLK_GPC2CLK
+ *     @def TEGRA186_CLK_GPU
+ *     @def TEGRA186_CLK_HOST1X
+ *     @def TEGRA186_CLK_ISP
+ *     @def TEGRA186_CLK_NVDEC
+ *     @def TEGRA186_CLK_NVENC
+ *     @def TEGRA186_CLK_NVJPG
+ *     @def TEGRA186_CLK_SE
+ *     @def TEGRA186_CLK_TSEC
+ *     @def TEGRA186_CLK_TSECB
+ *     @def TEGRA186_CLK_VIC
+ *   @}
+ *
+ *   @defgroup can CAN bus related clocks
+ *   @{
+ *     @def TEGRA186_CLK_CAN1
+ *     @def TEGRA186_CLK_CAN1_HOST
+ *     @def TEGRA186_CLK_CAN2
+ *     @def TEGRA186_CLK_CAN2_HOST
+ *   @}
+ *
+ *   @defgroup system basic system clocks
+ *   @{
+ *     @def TEGRA186_CLK_ACTMON
+ *     @def TEGRA186_CLK_AON_APB
+ *     @def TEGRA186_CLK_AON_CPU_NIC
+ *     @def TEGRA186_CLK_AON_NIC
+ *     @def TEGRA186_CLK_AXI_CBB
+ *     @def TEGRA186_CLK_BPMP_APB
+ *     @def TEGRA186_CLK_BPMP_CPU_NIC
+ *     @def TEGRA186_CLK_BPMP_NIC
+ *     @def TEGRA186_CLK_CLK_M
+ *     @def TEGRA186_CLK_EMC
+ *     @def TEGRA186_CLK_MSS_ENCRYPT
+ *     @def TEGRA186_CLK_SCE_APB
+ *     @def TEGRA186_CLK_SCE_CPU_NIC
+ *     @def TEGRA186_CLK_SCE_NIC
+ *     @def TEGRA186_CLK_TSC
+ *   @}
+ *
+ *   @defgroup pcie_clks PCIe related clocks
+ *   @{
+ *     @def TEGRA186_CLK_AFI
+ *     @def TEGRA186_CLK_PCIE
+ *     @def TEGRA186_CLK_PCIE2_IOBIST
+ *     @def TEGRA186_CLK_PCIERX0
+ *     @def TEGRA186_CLK_PCIERX1
+ *     @def TEGRA186_CLK_PCIERX2
+ *     @def TEGRA186_CLK_PCIERX3
+ *     @def TEGRA186_CLK_PCIERX4
+ *   @}
+ */
+
+/** @brief output of gate CLK_ENB_FUSE */
+#define TEGRA186_CLK_FUSE 0
+/**
+ * @brief GPU pwrclk gate output, not the graphics engine clock
+ * @details output of gate CLK_ENB_GPU. This output connects to the GPU
+ * pwrclk. @warning: This is almost certainly not the clock you think
+ * it is. If you're looking for the clock of the graphics engine, see
+ * TEGRA186_CLK_GPCCLK.
+ */
+#define TEGRA186_CLK_GPU 1
+/** @brief output of gate CLK_ENB_PCIE */
+#define TEGRA186_CLK_PCIE 3
+/** @brief output of the divider IPFS_CLK_DIVISOR */
+#define TEGRA186_CLK_AFI 4
+/** @brief output of gate CLK_ENB_PCIE2_IOBIST */
+#define TEGRA186_CLK_PCIE2_IOBIST 5
+/** @brief output of gate CLK_ENB_PCIERX0 */
+#define TEGRA186_CLK_PCIERX0 6
+/** @brief output of gate CLK_ENB_PCIERX1 */
+#define TEGRA186_CLK_PCIERX1 7
+/** @brief output of gate CLK_ENB_PCIERX2 */
+#define TEGRA186_CLK_PCIERX2 8
+/** @brief output of gate CLK_ENB_PCIERX3 */
+#define TEGRA186_CLK_PCIERX3 9
+/** @brief output of gate CLK_ENB_PCIERX4 */
+#define TEGRA186_CLK_PCIERX4 10
+/** @brief output branch of PLL_C for ISP, controlled by gate CLK_ENB_PLLC_OUT_ISP */
+#define TEGRA186_CLK_PLLC_OUT_ISP 11
+/** @brief output branch of PLL_C for VI, controlled by gate CLK_ENB_PLLC_OUT_VE */
+#define TEGRA186_CLK_PLLC_OUT_VE 12
+/** @brief output branch of PLL_C for AON domain, controlled by gate CLK_ENB_PLLC_OUT_AON */
+#define TEGRA186_CLK_PLLC_OUT_AON 13
+/** @brief output of gate CLK_ENB_SOR_SAFE */
+#define TEGRA186_CLK_SOR_SAFE 39
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S2 */
+#define TEGRA186_CLK_I2S2 42
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S3 */
+#define TEGRA186_CLK_I2S3 43
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SPDIF_IN */
+#define TEGRA186_CLK_SPDIF_IN 44
+/** @brief output of gate CLK_ENB_SPDIF_DOUBLER */
+#define TEGRA186_CLK_SPDIF_DOUBLER 45
+/** @clkdesc{spi_clks, out, mux, CLK_RST_CONTROLLER_CLK_SOURCE_SPI3} */
+#define TEGRA186_CLK_SPI3 46
+/** @clkdesc{i2c_clks, out, mux, CLK_RST_CONTROLLER_CLK_SOURCE_I2C1} */
+#define TEGRA186_CLK_I2C1 47
+/** @clkdesc{i2c_clks, out, mux, CLK_RST_CONTROLLER_CLK_SOURCE_I2C5} */
+#define TEGRA186_CLK_I2C5 48
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SPI1 */
+#define TEGRA186_CLK_SPI1 49
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_ISP */
+#define TEGRA186_CLK_ISP 50
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_VI */
+#define TEGRA186_CLK_VI 51
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC1 */
+#define TEGRA186_CLK_SDMMC1 52
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC2 */
+#define TEGRA186_CLK_SDMMC2 53
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC4 */
+#define TEGRA186_CLK_SDMMC4 54
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTA */
+#define TEGRA186_CLK_UARTA 55
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTB */
+#define TEGRA186_CLK_UARTB 56
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_HOST1X */
+#define TEGRA186_CLK_HOST1X 57
+/**
+ * @brief controls the EMC clock frequency.
+ * @details Doing a clk_set_rate on this clock will select the
+ * appropriate clock source, program the source rate and execute a
+ * specific sequence to switch to the new clock source for both memory
+ * controllers. This can be used to control the balance between memory
+ * throughput and memory controller power.
+ */
+#define TEGRA186_CLK_EMC 58
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_EXTPERIPH4 */
+#define TEGRA186_CLK_EXTPERIPH4 73
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SPI4 */
+#define TEGRA186_CLK_SPI4 74
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C3 */
+#define TEGRA186_CLK_I2C3 75
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC3 */
+#define TEGRA186_CLK_SDMMC3 76
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTD */
+#define TEGRA186_CLK_UARTD 77
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S1 */
+#define TEGRA186_CLK_I2S1 79
+/** @brief output of gate CLK_ENB_DTV */
+#define TEGRA186_CLK_DTV 80
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_TSEC */
+#define TEGRA186_CLK_TSEC 81
+/** @brief output of gate CLK_ENB_DP2 */
+#define TEGRA186_CLK_DP2 82
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S4 */
+#define TEGRA186_CLK_I2S4 84
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S5 */
+#define TEGRA186_CLK_I2S5 85
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C4 */
+#define TEGRA186_CLK_I2C4 86
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AHUB */
+#define TEGRA186_CLK_AHUB 87
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_HDA2CODEC_2X */
+#define TEGRA186_CLK_HDA2CODEC_2X 88
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_EXTPERIPH1 */
+#define TEGRA186_CLK_EXTPERIPH1 89
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_EXTPERIPH2 */
+#define TEGRA186_CLK_EXTPERIPH2 90
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_EXTPERIPH3 */
+#define TEGRA186_CLK_EXTPERIPH3 91
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C_SLOW */
+#define TEGRA186_CLK_I2C_SLOW 92
+/** @brief output of the SOR1_CLK_SRC mux in CLK_RST_CONTROLLER_CLK_SOURCE_SOR1 */
+#define TEGRA186_CLK_SOR1 93
+/** @brief output of gate CLK_ENB_CEC */
+#define TEGRA186_CLK_CEC 94
+/** @brief output of gate CLK_ENB_DPAUX1 */
+#define TEGRA186_CLK_DPAUX1 95
+/** @brief output of gate CLK_ENB_DPAUX */
+#define TEGRA186_CLK_DPAUX 96
+/** @brief output of the SOR0_CLK_SRC mux in CLK_RST_CONTROLLER_CLK_SOURCE_SOR0 */
+#define TEGRA186_CLK_SOR0 97
+/** @brief output of gate CLK_ENB_HDA2HDMICODEC */
+#define TEGRA186_CLK_HDA2HDMICODEC 98
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SATA */
+#define TEGRA186_CLK_SATA 99
+/** @brief output of gate CLK_ENB_SATA_OOB */
+#define TEGRA186_CLK_SATA_OOB 100
+/** @brief output of gate CLK_ENB_SATA_IOBIST */
+#define TEGRA186_CLK_SATA_IOBIST 101
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_HDA */
+#define TEGRA186_CLK_HDA 102
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SE */
+#define TEGRA186_CLK_SE 103
+/** @brief output of gate CLK_ENB_APB2APE */
+#define TEGRA186_CLK_APB2APE 104
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_APE */
+#define TEGRA186_CLK_APE 105
+/** @brief output of gate CLK_ENB_IQC1 */
+#define TEGRA186_CLK_IQC1 106
+/** @brief output of gate CLK_ENB_IQC2 */
+#define TEGRA186_CLK_IQC2 107
+/** @brief divide-by-2 version of TEGRA186_CLK_PLLREFE_VCO */
+#define TEGRA186_CLK_PLLREFE_OUT 108
+/** @brief output of gate CLK_ENB_PLLREFE_PLL_REF */
+#define TEGRA186_CLK_PLLREFE_PLL_REF 109
+/** @brief output of gate CLK_ENB_PLLC4_OUT */
+#define TEGRA186_CLK_PLLC4_OUT 110
+/** @brief output of mux xusb_core_clk_switch on page 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB 111
+/** @brief controls xusb_dev_ce signal on pages 66 and 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_DEV 112
+/** @brief controls xusb_host_ce signal on page 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_HOST 113
+/** @brief controls xusb_ss_ce signal on page 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_SS 114
+/** @brief output of gate CLK_ENB_DSI */
+#define TEGRA186_CLK_DSI 115
+/** @brief output of gate CLK_ENB_MIPI_CAL */
+#define TEGRA186_CLK_MIPI_CAL 116
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSIA_LP */
+#define TEGRA186_CLK_DSIA_LP 117
+/** @brief output of gate CLK_ENB_DSIB */
+#define TEGRA186_CLK_DSIB 118
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSIB_LP */
+#define TEGRA186_CLK_DSIB_LP 119
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DMIC1 */
+#define TEGRA186_CLK_DMIC1 122
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DMIC2 */
+#define TEGRA186_CLK_DMIC2 123
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AUD_MCLK */
+#define TEGRA186_CLK_AUD_MCLK 124
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C6 */
+#define TEGRA186_CLK_I2C6 125
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UART_FST_MIPI_CAL */
+#define TEGRA186_CLK_UART_FST_MIPI_CAL 126
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_VIC */
+#define TEGRA186_CLK_VIC 127
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC_LEGACY_TM */
+#define TEGRA186_CLK_SDMMC_LEGACY_TM 128
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVDEC */
+#define TEGRA186_CLK_NVDEC 129
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVJPG */
+#define TEGRA186_CLK_NVJPG 130
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVENC */
+#define TEGRA186_CLK_NVENC 131
+/** @brief output of the QSPI_CLK_SRC mux in CLK_RST_CONTROLLER_CLK_SOURCE_QSPI */
+#define TEGRA186_CLK_QSPI 132
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_VI_I2C */
+#define TEGRA186_CLK_VI_I2C 133
+/** @brief output of gate CLK_ENB_HSIC_TRK */
+#define TEGRA186_CLK_HSIC_TRK 134
+/** @brief output of gate CLK_ENB_USB2_TRK */
+#define TEGRA186_CLK_USB2_TRK 135
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_MAUD */
+#define TEGRA186_CLK_MAUD 136
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_TSECB */
+#define TEGRA186_CLK_TSECB 137
+/** @brief output of gate CLK_ENB_ADSP */
+#define TEGRA186_CLK_ADSP 138
+/** @brief output of gate CLK_ENB_ADSPNEON */
+#define TEGRA186_CLK_ADSPNEON 139
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_MPHY_L0_RX_LS_SYMB */
+#define TEGRA186_CLK_MPHY_L0_RX_SYMB 140
+/** @brief output of gate CLK_ENB_MPHY_L0_RX_LS_BIT */
+#define TEGRA186_CLK_MPHY_L0_RX_LS_BIT 141
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_MPHY_L0_TX_LS_SYMB */
+#define TEGRA186_CLK_MPHY_L0_TX_SYMB 142
+/** @brief output of gate CLK_ENB_MPHY_L0_TX_LS_3XBIT */
+#define TEGRA186_CLK_MPHY_L0_TX_LS_3XBIT 143
+/** @brief output of gate CLK_ENB_MPHY_L0_RX_ANA */
+#define TEGRA186_CLK_MPHY_L0_RX_ANA 144
+/** @brief output of gate CLK_ENB_MPHY_L1_RX_ANA */
+#define TEGRA186_CLK_MPHY_L1_RX_ANA 145
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_MPHY_IOBIST */
+#define TEGRA186_CLK_MPHY_IOBIST 146
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_MPHY_TX_1MHZ_REF */
+#define TEGRA186_CLK_MPHY_TX_1MHZ_REF 147
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_MPHY_CORE_PLL_FIXED */
+#define TEGRA186_CLK_MPHY_CORE_PLL_FIXED 148
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AXI_CBB */
+#define TEGRA186_CLK_AXI_CBB 149
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DMIC3 */
+#define TEGRA186_CLK_DMIC3 150
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DMIC4 */
+#define TEGRA186_CLK_DMIC4 151
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSPK1 */
+#define TEGRA186_CLK_DSPK1 152
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSPK2 */
+#define TEGRA186_CLK_DSPK2 153
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2S6 */
+#define TEGRA186_CLK_I2S6 154
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_P0 */
+#define TEGRA186_CLK_NVDISPLAY_P0 155
+/** @brief output of the NVDISPLAY_DISP_CLK_SRC mux in CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_DISP */
+#define TEGRA186_CLK_NVDISPLAY_DISP 156
+/** @brief output of gate CLK_ENB_NVDISPLAY_DSC */
+#define TEGRA186_CLK_NVDISPLAY_DSC 157
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAYHUB */
+#define TEGRA186_CLK_NVDISPLAYHUB 158
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_P1 */
+#define TEGRA186_CLK_NVDISPLAY_P1 159
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_P2 */
+#define TEGRA186_CLK_NVDISPLAY_P2 160
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_TACH */
+#define TEGRA186_CLK_TACH 166
+/** @brief output of gate CLK_ENB_EQOS */
+#define TEGRA186_CLK_EQOS_AXI 167
+/** @brief output of gate CLK_ENB_EQOS_RX */
+#define TEGRA186_CLK_EQOS_RX 168
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UFSHC_CG_SYS */
+#define TEGRA186_CLK_UFSHC 178
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UFSDEV_REF */
+#define TEGRA186_CLK_UFSDEV_REF 179
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVCSI */
+#define TEGRA186_CLK_NVCSI 180
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_NVCSILP */
+#define TEGRA186_CLK_NVCSILP 181
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C7 */
+#define TEGRA186_CLK_I2C7 182
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C9 */
+#define TEGRA186_CLK_I2C9 183
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C12 */
+#define TEGRA186_CLK_I2C12 184
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C13 */
+#define TEGRA186_CLK_I2C13 185
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C14 */
+#define TEGRA186_CLK_I2C14 186
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM1 */
+#define TEGRA186_CLK_PWM1 187
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM2 */
+#define TEGRA186_CLK_PWM2 188
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM3 */
+#define TEGRA186_CLK_PWM3 189
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM5 */
+#define TEGRA186_CLK_PWM5 190
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM6 */
+#define TEGRA186_CLK_PWM6 191
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM7 */
+#define TEGRA186_CLK_PWM7 192
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM8 */
+#define TEGRA186_CLK_PWM8 193
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTE */
+#define TEGRA186_CLK_UARTE 194
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTF */
+#define TEGRA186_CLK_UARTF 195
+/** @deprecated */
+#define TEGRA186_CLK_DBGAPB 196
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_BPMP_CPU_NIC */
+#define TEGRA186_CLK_BPMP_CPU_NIC 197
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_BPMP_APB */
+#define TEGRA186_CLK_BPMP_APB 199
+/** @brief output of mux controlled by TEGRA186_CLK_SOC_ACTMON */
+#define TEGRA186_CLK_ACTMON 201
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AON_CPU_NIC */
+#define TEGRA186_CLK_AON_CPU_NIC 208
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_CAN1 */
+#define TEGRA186_CLK_CAN1 210
+/** @brief output of gate CLK_ENB_CAN1_HOST */
+#define TEGRA186_CLK_CAN1_HOST 211
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_CAN2 */
+#define TEGRA186_CLK_CAN2 212
+/** @brief output of gate CLK_ENB_CAN2_HOST */
+#define TEGRA186_CLK_CAN2_HOST 213
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AON_APB */
+#define TEGRA186_CLK_AON_APB 214
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTC */
+#define TEGRA186_CLK_UARTC 215
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_UARTG */
+#define TEGRA186_CLK_UARTG 216
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AON_UART_FST_MIPI_CAL */
+#define TEGRA186_CLK_AON_UART_FST_MIPI_CAL 217
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C2 */
+#define TEGRA186_CLK_I2C2 218
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C8 */
+#define TEGRA186_CLK_I2C8 219
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_I2C10 */
+#define TEGRA186_CLK_I2C10 220
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AON_I2C_SLOW */
+#define TEGRA186_CLK_AON_I2C_SLOW 221
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SPI2 */
+#define TEGRA186_CLK_SPI2 222
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DMIC5 */
+#define TEGRA186_CLK_DMIC5 223
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_AON_TOUCH */
+#define TEGRA186_CLK_AON_TOUCH 224
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_PWM4 */
+#define TEGRA186_CLK_PWM4 225
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_TSC. This clock object is read only and is used for all timers in the system. */
+#define TEGRA186_CLK_TSC 226
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_MSS_ENCRYPT */
+#define TEGRA186_CLK_MSS_ENCRYPT 227
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SCE_CPU_NIC */
+#define TEGRA186_CLK_SCE_CPU_NIC 228
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SCE_APB */
+#define TEGRA186_CLK_SCE_APB 230
+/** @brief output of gate CLK_ENB_DSIC */
+#define TEGRA186_CLK_DSIC 231
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSIC_LP */
+#define TEGRA186_CLK_DSIC_LP 232
+/** @brief output of gate CLK_ENB_DSID */
+#define TEGRA186_CLK_DSID 233
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_DSID_LP */
+#define TEGRA186_CLK_DSID_LP 234
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_PEX_SATA_USB_RX_BYP */
+#define TEGRA186_CLK_PEX_SATA_USB_RX_BYP 236
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_CLK_SOURCE_SPDIF_OUT */
+#define TEGRA186_CLK_SPDIF_OUT 238
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_EQOS_PTP_REF_CLK_0 */
+#define TEGRA186_CLK_EQOS_PTP_REF 239
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_EQOS_TX_CLK */
+#define TEGRA186_CLK_EQOS_TX 240
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_USB2_HSIC_TRK */
+#define TEGRA186_CLK_USB2_HSIC_TRK 241
+/** @brief output of mux xusb_ss_clk_switch on page 66 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_CORE_SS 242
+/** @brief output of mux xusb_core_dev_clk_switch on page 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_CORE_DEV 243
+/** @brief output of mux xusb_core_falcon_clk_switch on page 67 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_FALCON 244
+/** @brief output of mux xusb_fs_clk_switch on page 66 of T186_Clocks_IAS.doc */
+#define TEGRA186_CLK_XUSB_FS 245
+/** @brief output of the divider CLK_RST_CONTROLLER_PLLA_OUT */
+#define TEGRA186_CLK_PLL_A_OUT0 246
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S1 */
+#define TEGRA186_CLK_SYNC_I2S1 247
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S2 */
+#define TEGRA186_CLK_SYNC_I2S2 248
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S3 */
+#define TEGRA186_CLK_SYNC_I2S3 249
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S4 */
+#define TEGRA186_CLK_SYNC_I2S4 250
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S5 */
+#define TEGRA186_CLK_SYNC_I2S5 251
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_I2S6 */
+#define TEGRA186_CLK_SYNC_I2S6 252
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DSPK1 */
+#define TEGRA186_CLK_SYNC_DSPK1 253
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DSPK2 */
+#define TEGRA186_CLK_SYNC_DSPK2 254
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DMIC1 */
+#define TEGRA186_CLK_SYNC_DMIC1 255
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DMIC2 */
+#define TEGRA186_CLK_SYNC_DMIC2 256
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DMIC3 */
+#define TEGRA186_CLK_SYNC_DMIC3 257
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_DMIC4 */
+#define TEGRA186_CLK_SYNC_DMIC4 259
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_AUDIO_SYNC_CLK_SPDIF */
+#define TEGRA186_CLK_SYNC_SPDIF 260
+/** @brief output of gate CLK_ENB_PLLREFE_OUT */
+#define TEGRA186_CLK_PLLREFE_OUT_GATED 261
+/** @brief output of the divider PLLREFE_DIVP in CLK_RST_CONTROLLER_PLLREFE_BASE. PLLREFE has 2 outputs:
+ *      * VCO/pdiv defined by this clock object
+ *      * VCO/2 defined by TEGRA186_CLK_PLLREFE_OUT
+ */
+#define TEGRA186_CLK_PLLREFE_OUT1 262
+#define TEGRA186_CLK_PLLD_OUT1 267
+/** @brief output of the divider PLLP_DIVP in CLK_RST_CONTROLLER_PLLP_BASE */
+#define TEGRA186_CLK_PLLP_OUT0 269
+/** @brief output of the divider CLK_RST_CONTROLLER_PLLP_OUTC */
+#define TEGRA186_CLK_PLLP_OUT5 270
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLA_BASE for use by audio clocks */
+#define TEGRA186_CLK_PLLA 271
+/** @brief output of mux controlled by CLK_RST_CONTROLLER_ACLK_BURST_POLICY divided by the divider controlled by ACLK_CLK_DIVISOR in CLK_RST_CONTROLLER_SUPER_ACLK_DIVIDER */
+#define TEGRA186_CLK_ACLK 273
+/** @brief fixed 48MHz clock divided down from TEGRA186_CLK_PLL_U */
+#define TEGRA186_CLK_PLL_U_48M 274
+/** @brief fixed 480MHz clock divided down from TEGRA186_CLK_PLL_U */
+#define TEGRA186_CLK_PLL_U_480M 275
+/** @brief output of the divider PLLC4_DIVP in CLK_RST_CONTROLLER_PLLC4_BASE. Output frequency is TEGRA186_CLK_PLLC4_VCO/PLLC4_DIVP */
+#define TEGRA186_CLK_PLLC4_OUT0 276
+/** @brief fixed /3 divider. Output frequency of this clock is TEGRA186_CLK_PLLC4_VCO/3 */
+#define TEGRA186_CLK_PLLC4_OUT1 277
+/** @brief fixed /5 divider. Output frequency of this clock is TEGRA186_CLK_PLLC4_VCO/5 */
+#define TEGRA186_CLK_PLLC4_OUT2 278
+/** @brief output of mux controlled by PLLC4_CLK_SEL in CLK_RST_CONTROLLER_PLLC4_MISC1 */
+#define TEGRA186_CLK_PLLC4_OUT_MUX 279
+/** @brief output of divider NVDISPLAY_DISP_CLK_DIVISOR in CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_DISP when DFLLDISP_DIV is selected in NVDISPLAY_DISP_CLK_SRC */
+#define TEGRA186_CLK_DFLLDISP_DIV 284
+/** @brief output of divider NVDISPLAY_DISP_CLK_DIVISOR in CLK_RST_CONTROLLER_CLK_SOURCE_NVDISPLAY_DISP when PLLDISPHUB_DIV is selected in NVDISPLAY_DISP_CLK_SRC */
+#define TEGRA186_CLK_PLLDISPHUB_DIV 285
+/** @brief fixed /8 divider which is used as the input for TEGRA186_CLK_SOR_SAFE */
+#define TEGRA186_CLK_PLLP_DIV8 286
+/** @brief output of divider CLK_RST_CONTROLLER_BPMP_NIC_RATE */
+#define TEGRA186_CLK_BPMP_NIC 287
+/** @brief output of the divider CLK_RST_CONTROLLER_PLLA1_OUT1 */
+#define TEGRA186_CLK_PLL_A_OUT1 288
+/** @deprecated */
+#define TEGRA186_CLK_GPC2CLK 289
+/** @brief A fake clock which must be enabled during KFUSE read operations to ensure adequate VDD_CORE voltage. */
+#define TEGRA186_CLK_KFUSE 293
+/**
+ * @brief controls the PLLE hardware sequencer.
+ * @details This clock only has enable and disable methods. When the
+ * PLLE hw sequencer is enabled, PLLE will be enabled or disabled by
+ * hw based on the control signals from the PCIe, SATA and XUSB
+ * clocks. When the PLLE hw sequencer is disabled, the state of PLLE
+ * is controlled by sw using clk_enable/clk_disable on
+ * TEGRA186_CLK_PLLE.
+ */
+#define TEGRA186_CLK_PLLE_PWRSEQ 294
+/** @brief fixed 60MHz clock divided down from TEGRA186_CLK_PLL_U */
+#define TEGRA186_CLK_PLLREFE_REF 295
+/** @brief output of mux controlled by SOR0_CLK_SEL0 and SOR0_CLK_SEL1 in CLK_RST_CONTROLLER_CLK_SOURCE_SOR0 */
+#define TEGRA186_CLK_SOR0_OUT 296
+/** @brief output of mux controlled by SOR1_CLK_SEL0 and SOR1_CLK_SEL1 in CLK_RST_CONTROLLER_CLK_SOURCE_SOR1 */
+#define TEGRA186_CLK_SOR1_OUT 297
+/** @brief fixed /5 divider.  Output frequency of this clock is TEGRA186_CLK_PLLREFE_OUT1/5. Used as input for TEGRA186_CLK_EQOS_AXI */
+#define TEGRA186_CLK_PLLREFE_OUT1_DIV5 298
+/** @brief controls the UTMIP_PLL (aka PLLU) hardware sequencer */
+#define TEGRA186_CLK_UTMIP_PLL_PWRSEQ 301
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_PEX_USB_PAD_PLL0_MGMT */
+#define TEGRA186_CLK_PEX_USB_PAD0_MGMT 302
+/** @brief output of the divider CLK_RST_CONTROLLER_CLK_SOURCE_PEX_USB_PAD_PLL1_MGMT */
+#define TEGRA186_CLK_PEX_USB_PAD1_MGMT 303
+/** @brief controls the UPHY_PLL0 hardware sequencer */
+#define TEGRA186_CLK_UPHY_PLL0_PWRSEQ 304
+/** @brief controls the UPHY_PLL1 hardware sequencer */
+#define TEGRA186_CLK_UPHY_PLL1_PWRSEQ 305
+/** @brief control for PLLREFE_IDDQ in CLK_RST_CONTROLLER_PLLREFE_MISC so the bypass output can be used even when the PLL is disabled */
+#define TEGRA186_CLK_PLLREFE_PLLE_PASSTHROUGH 306
+/** @brief output of the mux controlled by PLLREFE_SEL_CLKIN_PEX in CLK_RST_CONTROLLER_PLLREFE_MISC */
+#define TEGRA186_CLK_PLLREFE_PEX 307
+/** @brief control for PLLREFE_IDDQ in CLK_RST_CONTROLLER_PLLREFE_MISC to turn on the PLL when enabled */
+#define TEGRA186_CLK_PLLREFE_IDDQ 308
+/** @brief output of the divider QSPI_CLK_DIV2_SEL in CLK_RST_CONTROLLER_CLK_SOURCE_QSPI */
+#define TEGRA186_CLK_QSPI_OUT 309
+/**
+ * @brief GPC2CLK-div-2
+ * @details fixed /2 divider. Output frequency is
+ * TEGRA186_CLK_GPC2CLK/2. The frequency of this clock is the
+ * frequency at which the GPU graphics engine runs. */
+#define TEGRA186_CLK_GPCCLK 310
+/** @brief output of divider CLK_RST_CONTROLLER_AON_NIC_RATE */
+#define TEGRA186_CLK_AON_NIC 450
+/** @brief output of divider CLK_RST_CONTROLLER_SCE_NIC_RATE */
+#define TEGRA186_CLK_SCE_NIC 451
+/** Fixed 100MHz PLL for PCIe, SATA and superspeed USB */
+#define TEGRA186_CLK_PLLE 512
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLC_BASE */
+#define TEGRA186_CLK_PLLC 513
+/** Fixed 408MHz PLL for use by peripheral clocks */
+#define TEGRA186_CLK_PLLP 516
+/** @deprecated */
+#define TEGRA186_CLK_PLL_P TEGRA186_CLK_PLLP
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLD_BASE for use by DSI */
+#define TEGRA186_CLK_PLLD 518
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLD2_BASE for use by HDMI or DP */
+#define TEGRA186_CLK_PLLD2 519
+/**
+ * @brief PLL controlled by CLK_RST_CONTROLLER_PLLREFE_BASE.
+ * @details Note that this clock only controls the VCO output, before
+ * the post-divider. See TEGRA186_CLK_PLLREFE_OUT1 for more
+ * information.
+ */
+#define TEGRA186_CLK_PLLREFE_VCO 520
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLC2_BASE */
+#define TEGRA186_CLK_PLLC2 521
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLC3_BASE */
+#define TEGRA186_CLK_PLLC3 522
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLDP_BASE for use as the DP link clock */
+#define TEGRA186_CLK_PLLDP 523
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLC4_BASE */
+#define TEGRA186_CLK_PLLC4_VCO 524
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLA1_BASE for use by audio clocks */
+#define TEGRA186_CLK_PLLA1 525
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLNVCSI_BASE */
+#define TEGRA186_CLK_PLLNVCSI 526
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLDISPHUB_BASE */
+#define TEGRA186_CLK_PLLDISPHUB 527
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLD3_BASE for use by HDMI or DP */
+#define TEGRA186_CLK_PLLD3 528
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLBPMPCAM_BASE */
+#define TEGRA186_CLK_PLLBPMPCAM 531
+/** @brief PLL controlled by CLK_RST_CONTROLLER_PLLAON_BASE for use by IP blocks in the AON domain */
+#define TEGRA186_CLK_PLLAON 532
+/** Fixed frequency 960MHz PLL for USB and EAVB */
+#define TEGRA186_CLK_PLLU 533
+/** fixed /2 divider. Output frequency is TEGRA186_CLK_PLLC4_VCO/2 */
+#define TEGRA186_CLK_PLLC4_VCO_DIV2 535
+/** @brief NAFLL clock source for AXI_CBB */
+#define TEGRA186_CLK_NAFLL_AXI_CBB 564
+/** @brief NAFLL clock source for BPMP */
+#define TEGRA186_CLK_NAFLL_BPMP 565
+/** @brief NAFLL clock source for ISP */
+#define TEGRA186_CLK_NAFLL_ISP 566
+/** @brief NAFLL clock source for NVDEC */
+#define TEGRA186_CLK_NAFLL_NVDEC 567
+/** @brief NAFLL clock source for NVENC */
+#define TEGRA186_CLK_NAFLL_NVENC 568
+/** @brief NAFLL clock source for NVJPG */
+#define TEGRA186_CLK_NAFLL_NVJPG 569
+/** @brief NAFLL clock source for SCE */
+#define TEGRA186_CLK_NAFLL_SCE 570
+/** @brief NAFLL clock source for SE */
+#define TEGRA186_CLK_NAFLL_SE 571
+/** @brief NAFLL clock source for TSEC */
+#define TEGRA186_CLK_NAFLL_TSEC 572
+/** @brief NAFLL clock source for TSECB */
+#define TEGRA186_CLK_NAFLL_TSECB 573
+/** @brief NAFLL clock source for VI */
+#define TEGRA186_CLK_NAFLL_VI 574
+/** @brief NAFLL clock source for VIC */
+#define TEGRA186_CLK_NAFLL_VIC 575
+/** @brief NAFLL clock source for DISP */
+#define TEGRA186_CLK_NAFLL_DISP 576
+/** @brief NAFLL clock source for GPU */
+#define TEGRA186_CLK_NAFLL_GPU 577
+/** @brief NAFLL clock source for M-CPU cluster */
+#define TEGRA186_CLK_NAFLL_MCPU 578
+/** @brief NAFLL clock source for B-CPU cluster */
+#define TEGRA186_CLK_NAFLL_BCPU 579
+/** @brief input from Tegra's CLK_32K_IN pad */
+#define TEGRA186_CLK_CLK_32K 608
+/** @brief output of divider CLK_RST_CONTROLLER_CLK_M_DIVIDE */
+#define TEGRA186_CLK_CLK_M 609
+/** @brief output of divider PLL_REF_DIV in CLK_RST_CONTROLLER_OSC_CTRL */
+#define TEGRA186_CLK_PLL_REF 610
+/** @brief input from Tegra's XTAL_IN */
+#define TEGRA186_CLK_OSC 612
+/** @brief clock recovered from EAVB input */
+#define TEGRA186_CLK_EQOS_RX_INPUT 613
+/** @brief clock recovered from DTV input */
+#define TEGRA186_CLK_DTV_INPUT 614
+/** @brief SOR0 brick output which feeds into SOR0_CLK_SEL mux in CLK_RST_CONTROLLER_CLK_SOURCE_SOR0*/
+#define TEGRA186_CLK_SOR0_PAD_CLKOUT 615
+/** @brief SOR1 brick output which feeds into SOR1_CLK_SEL mux in CLK_RST_CONTROLLER_CLK_SOURCE_SOR1*/
+#define TEGRA186_CLK_SOR1_PAD_CLKOUT 616
+/** @brief clock recovered from I2S1 input */
+#define TEGRA186_CLK_I2S1_SYNC_INPUT 617
+/** @brief clock recovered from I2S2 input */
+#define TEGRA186_CLK_I2S2_SYNC_INPUT 618
+/** @brief clock recovered from I2S3 input */
+#define TEGRA186_CLK_I2S3_SYNC_INPUT 619
+/** @brief clock recovered from I2S4 input */
+#define TEGRA186_CLK_I2S4_SYNC_INPUT 620
+/** @brief clock recovered from I2S5 input */
+#define TEGRA186_CLK_I2S5_SYNC_INPUT 621
+/** @brief clock recovered from I2S6 input */
+#define TEGRA186_CLK_I2S6_SYNC_INPUT 622
+/** @brief clock recovered from SPDIFIN input */
+#define TEGRA186_CLK_SPDIFIN_SYNC_INPUT 623
+
+/**
+ * @brief subject to change
+ * @details maximum clock identifier value plus one.
+ */
+#define TEGRA186_CLK_CLK_MAX 624
+
+/** @} */
+
+#endif
diff --git a/include/dt-bindings/reset/tegra186-reset.h b/include/dt-bindings/reset/tegra186-reset.h
new file mode 100644
index 000000000000..8a184e357955
--- /dev/null
+++ b/include/dt-bindings/reset/tegra186-reset.h
@@ -0,0 +1,217 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _ABI_MACH_T186_RESET_T186_H_
+#define _ABI_MACH_T186_RESET_T186_H_
+
+
+#define TEGRA186_RESET_ACTMON			0
+#define TEGRA186_RESET_AFI			1
+#define TEGRA186_RESET_CEC			2
+#define TEGRA186_RESET_CSITE			3
+#define TEGRA186_RESET_DP2			4
+#define TEGRA186_RESET_DPAUX			5
+#define TEGRA186_RESET_DSI			6
+#define TEGRA186_RESET_DSIB			7
+#define TEGRA186_RESET_DTV			8
+#define TEGRA186_RESET_DVFS			9
+#define TEGRA186_RESET_ENTROPY			10
+#define TEGRA186_RESET_EXTPERIPH1		11
+#define TEGRA186_RESET_EXTPERIPH2		12
+#define TEGRA186_RESET_EXTPERIPH3		13
+#define TEGRA186_RESET_GPU			14
+#define TEGRA186_RESET_HDA			15
+#define TEGRA186_RESET_HDA2CODEC_2X		16
+#define TEGRA186_RESET_HDA2HDMICODEC		17
+#define TEGRA186_RESET_HOST1X			18
+#define TEGRA186_RESET_I2C1			19
+#define TEGRA186_RESET_I2C2			20
+#define TEGRA186_RESET_I2C3			21
+#define TEGRA186_RESET_I2C4			22
+#define TEGRA186_RESET_I2C5			23
+#define TEGRA186_RESET_I2C6			24
+#define TEGRA186_RESET_ISP			25
+#define TEGRA186_RESET_KFUSE			26
+#define TEGRA186_RESET_LA			27
+#define TEGRA186_RESET_MIPI_CAL			28
+#define TEGRA186_RESET_PCIE			29
+#define TEGRA186_RESET_PCIEXCLK			30
+#define TEGRA186_RESET_SATA			31
+#define TEGRA186_RESET_SATACOLD			32
+#define TEGRA186_RESET_SDMMC1			33
+#define TEGRA186_RESET_SDMMC2			34
+#define TEGRA186_RESET_SDMMC3			35
+#define TEGRA186_RESET_SDMMC4			36
+#define TEGRA186_RESET_SE			37
+#define TEGRA186_RESET_SOC_THERM		38
+#define TEGRA186_RESET_SOR0			39
+#define TEGRA186_RESET_SPI1			40
+#define TEGRA186_RESET_SPI2			41
+#define TEGRA186_RESET_SPI3			42
+#define TEGRA186_RESET_SPI4			43
+#define TEGRA186_RESET_TMR			44
+#define TEGRA186_RESET_TRIG_SYS			45
+#define TEGRA186_RESET_TSEC			46
+#define TEGRA186_RESET_UARTA			47
+#define TEGRA186_RESET_UARTB			48
+#define TEGRA186_RESET_UARTC			49
+#define TEGRA186_RESET_UARTD			50
+#define TEGRA186_RESET_VI			51
+#define TEGRA186_RESET_VIC			52
+#define TEGRA186_RESET_XUSB_DEV			53
+#define TEGRA186_RESET_XUSB_HOST		54
+#define TEGRA186_RESET_XUSB_PADCTL		55
+#define TEGRA186_RESET_XUSB_SS			56
+#define TEGRA186_RESET_AON_APB			57
+#define TEGRA186_RESET_AXI_CBB			58
+#define TEGRA186_RESET_BPMP_APB			59
+#define TEGRA186_RESET_CAN1			60
+#define TEGRA186_RESET_CAN2			61
+#define TEGRA186_RESET_DMIC5			62
+#define TEGRA186_RESET_DSIC			63
+#define TEGRA186_RESET_DSID			64
+#define TEGRA186_RESET_EMC_EMC			65
+#define TEGRA186_RESET_EMC_MEM			66
+#define TEGRA186_RESET_EMCSB_EMC		67
+#define TEGRA186_RESET_EMCSB_MEM		68
+#define TEGRA186_RESET_EQOS			69
+#define TEGRA186_RESET_GPCDMA			70
+#define TEGRA186_RESET_GPIO_CTL0		71
+#define TEGRA186_RESET_GPIO_CTL1		72
+#define TEGRA186_RESET_GPIO_CTL2		73
+#define TEGRA186_RESET_GPIO_CTL3		74
+#define TEGRA186_RESET_GPIO_CTL4		75
+#define TEGRA186_RESET_GPIO_CTL5		76
+#define TEGRA186_RESET_I2C10			77
+#define TEGRA186_RESET_I2C12			78
+#define TEGRA186_RESET_I2C13			79
+#define TEGRA186_RESET_I2C14			80
+#define TEGRA186_RESET_I2C7			81
+#define TEGRA186_RESET_I2C8			82
+#define TEGRA186_RESET_I2C9			83
+#define TEGRA186_RESET_JTAG2AXI			84
+#define TEGRA186_RESET_MPHY_IOBIST		85
+#define TEGRA186_RESET_MPHY_L0_RX		86
+#define TEGRA186_RESET_MPHY_L0_TX		87
+#define TEGRA186_RESET_NVCSI			88
+#define TEGRA186_RESET_NVDISPLAY0_HEAD0		89
+#define TEGRA186_RESET_NVDISPLAY0_HEAD1		90
+#define TEGRA186_RESET_NVDISPLAY0_HEAD2		91
+#define TEGRA186_RESET_NVDISPLAY0_MISC		92
+#define TEGRA186_RESET_NVDISPLAY0_WGRP0		93
+#define TEGRA186_RESET_NVDISPLAY0_WGRP1		94
+#define TEGRA186_RESET_NVDISPLAY0_WGRP2		95
+#define TEGRA186_RESET_NVDISPLAY0_WGRP3		96
+#define TEGRA186_RESET_NVDISPLAY0_WGRP4		97
+#define TEGRA186_RESET_NVDISPLAY0_WGRP5		98
+#define TEGRA186_RESET_PWM1			99
+#define TEGRA186_RESET_PWM2			100
+#define TEGRA186_RESET_PWM3			101
+#define TEGRA186_RESET_PWM4			102
+#define TEGRA186_RESET_PWM5			103
+#define TEGRA186_RESET_PWM6			104
+#define TEGRA186_RESET_PWM7			105
+#define TEGRA186_RESET_PWM8			106
+#define TEGRA186_RESET_SCE_APB			107
+#define TEGRA186_RESET_SOR1			108
+#define TEGRA186_RESET_TACH			109
+#define TEGRA186_RESET_TSC			110
+#define TEGRA186_RESET_UARTF			111
+#define TEGRA186_RESET_UARTG			112
+#define TEGRA186_RESET_UFSHC			113
+#define TEGRA186_RESET_UFSHC_AXI_M		114
+#define TEGRA186_RESET_UPHY			115
+#define TEGRA186_RESET_ADSP			116
+#define TEGRA186_RESET_ADSPDBG			117
+#define TEGRA186_RESET_ADSPINTF			118
+#define TEGRA186_RESET_ADSPNEON			119
+#define TEGRA186_RESET_ADSPPERIPH		120
+#define TEGRA186_RESET_ADSPSCU			121
+#define TEGRA186_RESET_ADSPWDT			122
+#define TEGRA186_RESET_APE			123
+#define TEGRA186_RESET_DPAUX1			124
+#define TEGRA186_RESET_NVDEC			125
+#define TEGRA186_RESET_NVENC			126
+#define TEGRA186_RESET_NVJPG			127
+#define TEGRA186_RESET_PEX_USB_UPHY		128
+#define TEGRA186_RESET_QSPI			129
+#define TEGRA186_RESET_TSECB			130
+#define TEGRA186_RESET_VI_I2C			131
+#define TEGRA186_RESET_UARTE			132
+#define TEGRA186_RESET_TOP_GTE			133
+#define TEGRA186_RESET_SHSP			134
+#define TEGRA186_RESET_PEX_USB_UPHY_L5		135
+#define TEGRA186_RESET_PEX_USB_UPHY_L4		136
+#define TEGRA186_RESET_PEX_USB_UPHY_L3		137
+#define TEGRA186_RESET_PEX_USB_UPHY_L2		138
+#define TEGRA186_RESET_PEX_USB_UPHY_L1		139
+#define TEGRA186_RESET_PEX_USB_UPHY_L0		140
+#define TEGRA186_RESET_PEX_USB_UPHY_PLL1	141
+#define TEGRA186_RESET_PEX_USB_UPHY_PLL0	142
+#define TEGRA186_RESET_TSCTNVI			143
+#define TEGRA186_RESET_EXTPERIPH4		144
+#define TEGRA186_RESET_DSIPADCTL		145
+#define TEGRA186_RESET_AUD_MCLK			146
+#define TEGRA186_RESET_MPHY_CLK_CTL		147
+#define TEGRA186_RESET_MPHY_L1_RX		148
+#define TEGRA186_RESET_MPHY_L1_TX		149
+#define TEGRA186_RESET_UFSHC_LP			150
+#define TEGRA186_RESET_BPMP_NIC			151
+#define TEGRA186_RESET_BPMP_NSYSPORESET		152
+#define TEGRA186_RESET_BPMP_NRESET		153
+#define TEGRA186_RESET_BPMP_DBGRESETN		154
+#define TEGRA186_RESET_BPMP_PRESETDBGN		155
+#define TEGRA186_RESET_BPMP_PM			156
+#define TEGRA186_RESET_BPMP_CVC			157
+#define TEGRA186_RESET_BPMP_DMA			158
+#define TEGRA186_RESET_BPMP_HSP			159
+#define TEGRA186_RESET_TSCTNBPMP		160
+#define TEGRA186_RESET_BPMP_TKE			161
+#define TEGRA186_RESET_BPMP_GTE			162
+#define TEGRA186_RESET_BPMP_PM_ACTMON		163
+#define TEGRA186_RESET_AON_NIC			164
+#define TEGRA186_RESET_AON_NSYSPORESET		165
+#define TEGRA186_RESET_AON_NRESET		166
+#define TEGRA186_RESET_AON_DBGRESETN		167
+#define TEGRA186_RESET_AON_PRESETDBGN		168
+#define TEGRA186_RESET_AON_ACTMON		169
+#define TEGRA186_RESET_AOPM			170
+#define TEGRA186_RESET_AOVC			171
+#define TEGRA186_RESET_AON_DMA			172
+#define TEGRA186_RESET_AON_GPIO			173
+#define TEGRA186_RESET_AON_HSP			174
+#define TEGRA186_RESET_TSCTNAON			175
+#define TEGRA186_RESET_AON_TKE			176
+#define TEGRA186_RESET_AON_GTE			177
+#define TEGRA186_RESET_SCE_NIC			178
+#define TEGRA186_RESET_SCE_NSYSPORESET		179
+#define TEGRA186_RESET_SCE_NRESET		180
+#define TEGRA186_RESET_SCE_DBGRESETN		181
+#define TEGRA186_RESET_SCE_PRESETDBGN		182
+#define TEGRA186_RESET_SCE_ACTMON		183
+#define TEGRA186_RESET_SCE_PM			184
+#define TEGRA186_RESET_SCE_DMA			185
+#define TEGRA186_RESET_SCE_HSP			186
+#define TEGRA186_RESET_TSCTNSCE			187
+#define TEGRA186_RESET_SCE_TKE			188
+#define TEGRA186_RESET_SCE_GTE			189
+#define TEGRA186_RESET_SCE_CFG			190
+#define TEGRA186_RESET_ADSP_ALL			191
+/** @brief controls the power up/down sequence of UFSHC PSW partition. Controls LP_PWR_READY, LP_ISOL_EN, and LP_RESET_N signals */
+#define TEGRA186_RESET_UFSHC_LP_SEQ		192
+#define TEGRA186_RESET_SIZE			193
+
+#endif
-- 
2.9.0


* [PATCH V2 04/10] firmware: tegra: add IVC library
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (2 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-07 11:16   ` Alexandre Courbot
  2016-07-09 23:45   ` Paul Gortmaker
  2016-07-05  9:04 ` [PATCH V2 05/10] firmware: tegra: add BPMP support Joseph Lo
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Inter-VM Communication (IVC) is a protocol designed for interprocessor
communication (IPC), or for communication between a hypervisor and a
virtual machine running a guest OS. The name can therefore be read as
inter-virtual memory or inter-virtual machine communication. The message
channels live in DRAM or SRAM, so data coherency must be taken into
account; otherwise the data could be corrupted or out of date when the
remote client reads it.

IVC maintains memory-based descriptors for the TX/RX channels and
handles the coherency of the counters and payloads, so clients can use
it to send messages to and receive messages from remote clients.
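
As an illustration of the counter handling, here is a minimal user-space
sketch of the empty/full checks that ivc.c performs on w_count and
r_count. The names model_header, model_empty and model_full are
hypothetical and exist only for this sketch; the kernel code also uses
ACCESS_ONCE() and cache maintenance, which are left out here.

```c
#include <stdint.h>

/* Hypothetical model of one queue header; not the kernel structure. */
struct model_header {
	uint32_t w_count;	/* advanced only by the transmitting end */
	uint32_t r_count;	/* advanced only by the receiving end */
};

/*
 * The counters are free-running, so the number of frames in flight is
 * the unsigned difference, which stays correct across 32-bit wraparound.
 * An over-full queue (difference > nframes) is reported as empty, the
 * same way ivc_channel_empty() treats a misbehaving peer.
 */
static int model_empty(const struct model_header *h, uint32_t nframes)
{
	uint32_t used = h->w_count - h->r_count;

	if (used > nframes)
		return 1;

	return used == 0;
}

/* Over-capacity counters also appear full, as in ivc_channel_full(). */
static int model_full(const struct model_header *h, uint32_t nframes)
{
	uint32_t used = h->w_count - h->r_count;

	return used >= nframes;
}
```

With nframes = 4 and both counters starting at UINT32_MAX, four writes
wrap w_count around to 3 while the difference is still 4, so the queue
reads as full rather than empty.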

We introduce it as a library for the firmware drivers, which can use it
for IPC.

Based-on-the-work-by:
Peter Newman <pnewman@nvidia.com>

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- None
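
Note for reviewers: the reset handshake documented in the "IVC State
Transition Table" comment in ivc.c can be modeled as a pure function of
the local and remote states. The sketch below is only a user-space
model under that assumption; model_state, model_step and model_notified
are hypothetical names, and the barriers and cache flushes done by
tegra_ivc_channel_notified() are omitted.

```c
/* Hypothetical mirror of enum ivc_state; established is zero there too. */
enum model_state { MODEL_EST = 0, MODEL_SYNC, MODEL_ACK };

/* Outcome of one notification: next local state plus side effects. */
struct model_step {
	enum model_state next;
	int reset_counters;
	int notify;
};

static struct model_step model_notified(enum model_state local,
					enum model_state remote)
{
	struct model_step s = { local, 0, 0 };

	if (remote == MODEL_SYNC) {
		/* Peer asked for a reset: clear counters and acknowledge. */
		s.next = MODEL_ACK;
		s.reset_counters = 1;
		s.notify = 1;
	} else if (local == MODEL_SYNC && remote == MODEL_ACK) {
		/* Peer acknowledged our reset request. */
		s.next = MODEL_EST;
		s.reset_counters = 1;
		s.notify = 1;
	} else if (local == MODEL_ACK) {
		/* Peer is in ACK or EST, so both sides have cleared. */
		s.next = MODEL_EST;
		s.notify = 1;
	}
	/* Otherwise: nothing to do, the state is left unchanged. */

	return s;
}
```

Starting from a local reset (SYNC) against an established peer, the
model walks through peer ACK, local EST, peer EST, which matches the
rows of the table.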
---
 drivers/firmware/Kconfig        |   1 +
 drivers/firmware/Makefile       |   1 +
 drivers/firmware/tegra/Kconfig  |  13 +
 drivers/firmware/tegra/Makefile |   1 +
 drivers/firmware/tegra/ivc.c    | 659 ++++++++++++++++++++++++++++++++++++++++
 include/soc/tegra/ivc.h         | 102 +++++++
 6 files changed, 777 insertions(+)
 create mode 100644 drivers/firmware/tegra/Kconfig
 create mode 100644 drivers/firmware/tegra/Makefile
 create mode 100644 drivers/firmware/tegra/ivc.c
 create mode 100644 include/soc/tegra/ivc.h

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 5e618058defe..bbd64ae8c4c6 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -200,5 +200,6 @@ config HAVE_ARM_SMCCC
 source "drivers/firmware/broadcom/Kconfig"
 source "drivers/firmware/google/Kconfig"
 source "drivers/firmware/efi/Kconfig"
+source "drivers/firmware/tegra/Kconfig"
 
 endmenu
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index 474bada56fcd..9a4df8171cc4 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -24,3 +24,4 @@ obj-y				+= broadcom/
 obj-$(CONFIG_GOOGLE_FIRMWARE)	+= google/
 obj-$(CONFIG_EFI)		+= efi/
 obj-$(CONFIG_UEFI_CPER)		+= efi/
+obj-y				+= tegra/
diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
new file mode 100644
index 000000000000..1fa3e4e136a5
--- /dev/null
+++ b/drivers/firmware/tegra/Kconfig
@@ -0,0 +1,13 @@
+menu "Tegra firmware driver"
+
+config TEGRA_IVC
+	bool "Tegra IVC protocol"
+	depends on ARCH_TEGRA
+	help
+	  IVC (Inter-VM Communication) protocol is part of the IPC
+	  (Inter Processor Communication) framework on Tegra. It maintains the
+	  data and the different communication channels in SysRAM or RAM and
+	  keeps the content in synchronization between the host CPU and remote
+	  processors.
+
+endmenu
diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
new file mode 100644
index 000000000000..92e2153e8173
--- /dev/null
+++ b/drivers/firmware/tegra/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_TEGRA_IVC)		+= ivc.o
diff --git a/drivers/firmware/tegra/ivc.c b/drivers/firmware/tegra/ivc.c
new file mode 100644
index 000000000000..3e736bb9915a
--- /dev/null
+++ b/drivers/firmware/tegra/ivc.c
@@ -0,0 +1,659 @@
+/*
+ * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/module.h>
+
+#include <soc/tegra/ivc.h>
+
+#define IVC_ALIGN 64
+
+#ifdef CONFIG_SMP
+
+static inline void ivc_rmb(void)
+{
+	smp_rmb();
+}
+
+static inline void ivc_wmb(void)
+{
+	smp_wmb();
+}
+
+static inline void ivc_mb(void)
+{
+	smp_mb();
+}
+
+#else
+
+static inline void ivc_rmb(void)
+{
+	rmb();
+}
+
+static inline void ivc_wmb(void)
+{
+	wmb();
+}
+
+static inline void ivc_mb(void)
+{
+	mb();
+}
+
+#endif
+
+/*
+ * IVC channel reset protocol.
+ *
+ * Each end uses its tx_channel.state to indicate its synchronization state.
+ */
+enum ivc_state {
+	/*
+	 * This value is zero for backwards compatibility with services that
+	 * assume channels to be initially zeroed. Such channels are in an
+	 * initially valid state, but cannot be asynchronously reset, and must
+	 * maintain a valid state at all times.
+	 *
+	 * The transmitting end can enter the established state from the sync or
+	 * ack state when it observes the receiving endpoint in the ack or
+	 * established state, indicating that it has cleared the counters in our
+	 * rx_channel.
+	 */
+	ivc_state_established = 0,
+
+	/*
+	 * If an endpoint is observed in the sync state, the remote endpoint is
+	 * allowed to clear the counters it owns asynchronously with respect to
+	 * the current endpoint. Therefore, the current endpoint is no longer
+	 * allowed to communicate.
+	 */
+	ivc_state_sync,
+
+	/*
+	 * When the transmitting end observes the receiving end in the sync
+	 * state, it can clear the w_count and r_count and transition to the ack
+	 * state. If the remote endpoint observes us in the ack state, it can
+	 * return to the established state once it has cleared its counters.
+	 */
+	ivc_state_ack
+};
+
+/*
+ * This structure is divided into two cache-aligned parts: the first is only
+ * written through the tx_channel pointer, while the second is only written
+ * through the rx_channel pointer. This delineates ownership of the cache lines,
+ * which is critical to performance and necessary in non-cache coherent
+ * implementations.
+ */
+struct ivc_channel_header {
+	union {
+		struct {
+			/* fields owned by the transmitting end */
+			uint32_t w_count;
+			uint32_t state;
+		};
+		uint8_t w_align[IVC_ALIGN];
+	};
+	union {
+		/* fields owned by the receiving end */
+		uint32_t r_count;
+		uint8_t r_align[IVC_ALIGN];
+	};
+};
+
+static inline void ivc_invalidate_counter(struct ivc *ivc,
+		dma_addr_t handle)
+{
+	if (!ivc->peer_device)
+		return;
+	dma_sync_single_for_cpu(ivc->peer_device, handle, IVC_ALIGN,
+			DMA_FROM_DEVICE);
+}
+
+static inline void ivc_flush_counter(struct ivc *ivc, dma_addr_t handle)
+{
+	if (!ivc->peer_device)
+		return;
+	dma_sync_single_for_device(ivc->peer_device, handle, IVC_ALIGN,
+			DMA_TO_DEVICE);
+}
+
+static inline int ivc_channel_empty(struct ivc *ivc,
+		struct ivc_channel_header *ch)
+{
+	/*
+	 * This function performs multiple checks on the same values with
+	 * security implications, so create snapshots with ACCESS_ONCE() to
+	 * ensure that these checks use the same values.
+	 */
+	uint32_t w_count = ACCESS_ONCE(ch->w_count);
+	uint32_t r_count = ACCESS_ONCE(ch->r_count);
+
+	/*
+	 * Perform an over-full check to prevent denial of service attacks where
+	 * a server could be easily fooled into believing that there's an
+	 * extremely large number of frames ready, since receivers are not
+	 * expected to check for full or over-full conditions.
+	 *
+	 * Although the channel isn't empty, this is an invalid case caused by
+	 * a potentially malicious peer, so returning empty is safer, because it
+	 * gives the impression that the channel has gone silent.
+	 */
+	if (w_count - r_count > ivc->nframes)
+		return 1;
+
+	return w_count == r_count;
+}
+
+static inline int ivc_channel_full(struct ivc *ivc,
+		struct ivc_channel_header *ch)
+{
+	/*
+	 * Invalid cases where the counters indicate that the queue is over
+	 * capacity also appear full.
+	 */
+	return ACCESS_ONCE(ch->w_count) - ACCESS_ONCE(ch->r_count)
+		>= ivc->nframes;
+}
+
+static inline uint32_t ivc_channel_avail_count(struct ivc *ivc,
+		struct ivc_channel_header *ch)
+{
+	/*
+	 * This function isn't expected to be used in scenarios where an
+	 * over-full situation can lead to denial of service attacks. See the
+	 * comment in ivc_channel_empty() for an explanation about special
+	 * over-full considerations.
+	 */
+	return ACCESS_ONCE(ch->w_count) - ACCESS_ONCE(ch->r_count);
+}
+
+static inline void ivc_advance_tx(struct ivc *ivc)
+{
+	ACCESS_ONCE(ivc->tx_channel->w_count) =
+		ACCESS_ONCE(ivc->tx_channel->w_count) + 1;
+
+	if (ivc->w_pos == ivc->nframes - 1)
+		ivc->w_pos = 0;
+	else
+		ivc->w_pos++;
+}
+
+static inline void ivc_advance_rx(struct ivc *ivc)
+{
+	ACCESS_ONCE(ivc->rx_channel->r_count) =
+		ACCESS_ONCE(ivc->rx_channel->r_count) + 1;
+
+	if (ivc->r_pos == ivc->nframes - 1)
+		ivc->r_pos = 0;
+	else
+		ivc->r_pos++;
+}
+
+static inline int ivc_check_read(struct ivc *ivc)
+{
+	/*
+	 * tx_channel->state is set locally, so it is not synchronized with
+	 * state from the remote peer. The remote peer cannot reset its
+	 * transmit counters until we've acknowledged its synchronization
+	 * request, so no additional synchronization is required because an
+	 * asynchronous transition of rx_channel->state to ivc_state_ack is not
+	 * allowed.
+	 */
+	if (ivc->tx_channel->state != ivc_state_established)
+		return -ECONNRESET;
+
+	/*
+	 * Avoid unnecessary invalidations when performing repeated accesses to
+	 * an IVC channel by checking the old queue pointers first.
+	 * Synchronization is only necessary when these pointers indicate empty
+	 * or full.
+	 */
+	if (!ivc_channel_empty(ivc, ivc->rx_channel))
+		return 0;
+
+	ivc_invalidate_counter(ivc, ivc->rx_handle +
+			offsetof(struct ivc_channel_header, w_count));
+	return ivc_channel_empty(ivc, ivc->rx_channel) ? -ENOMEM : 0;
+}
+
+static inline int ivc_check_write(struct ivc *ivc)
+{
+	if (ivc->tx_channel->state != ivc_state_established)
+		return -ECONNRESET;
+
+	if (!ivc_channel_full(ivc, ivc->tx_channel))
+		return 0;
+
+	ivc_invalidate_counter(ivc, ivc->tx_handle +
+			offsetof(struct ivc_channel_header, r_count));
+	return ivc_channel_full(ivc, ivc->tx_channel) ? -ENOMEM : 0;
+}
+
+static void *ivc_frame_pointer(struct ivc *ivc, struct ivc_channel_header *ch,
+		uint32_t frame)
+{
+	BUG_ON(frame >= ivc->nframes);
+	return (void *)((uintptr_t)(ch + 1) + ivc->frame_size * frame);
+}
+
+static inline dma_addr_t ivc_frame_handle(struct ivc *ivc,
+		dma_addr_t channel_handle, uint32_t frame)
+{
+	BUG_ON(!ivc->peer_device);
+	BUG_ON(frame >= ivc->nframes);
+	return channel_handle + sizeof(struct ivc_channel_header) +
+		ivc->frame_size * frame;
+}
+
+static inline void ivc_invalidate_frame(struct ivc *ivc,
+		dma_addr_t channel_handle, unsigned frame, int offset, int len)
+{
+	if (!ivc->peer_device)
+		return;
+	dma_sync_single_for_cpu(ivc->peer_device,
+			ivc_frame_handle(ivc, channel_handle, frame) + offset,
+			len, DMA_FROM_DEVICE);
+}
+
+static inline void ivc_flush_frame(struct ivc *ivc, dma_addr_t channel_handle,
+		unsigned frame, int offset, int len)
+{
+	if (!ivc->peer_device)
+		return;
+	dma_sync_single_for_device(ivc->peer_device,
+			ivc_frame_handle(ivc, channel_handle, frame) + offset,
+			len, DMA_TO_DEVICE);
+}
+
+/* directly peek at the next frame rx'ed */
+void *tegra_ivc_read_get_next_frame(struct ivc *ivc)
+{
+	int result = ivc_check_read(ivc);
+	if (result)
+		return ERR_PTR(result);
+
+	/*
+	 * Order observation of w_pos potentially indicating new data before
+	 * data read.
+	 */
+	ivc_rmb();
+
+	ivc_invalidate_frame(ivc, ivc->rx_handle, ivc->r_pos, 0,
+			ivc->frame_size);
+	return ivc_frame_pointer(ivc, ivc->rx_channel, ivc->r_pos);
+}
+EXPORT_SYMBOL(tegra_ivc_read_get_next_frame);
+
+int tegra_ivc_read_advance(struct ivc *ivc)
+{
+	/*
+	 * No read barriers or synchronization here: the caller is expected to
+	 * have already observed the channel non-empty. This check is just to
+	 * catch programming errors.
+	 */
+	int result = ivc_check_read(ivc);
+	if (result)
+		return result;
+
+	ivc_advance_rx(ivc);
+	ivc_flush_counter(ivc, ivc->rx_handle +
+			offsetof(struct ivc_channel_header, r_count));
+
+	/*
+	 * Ensure our write to r_pos occurs before our read from w_pos.
+	 */
+	ivc_mb();
+
+	/*
+	 * Notify only upon transition from full to non-full.
+	 * The available count can only asynchronously increase, so the
+	 * worst possible side-effect will be a spurious notification.
+	 */
+	ivc_invalidate_counter(ivc, ivc->rx_handle +
+		offsetof(struct ivc_channel_header, w_count));
+
+	if (ivc_channel_avail_count(ivc, ivc->rx_channel) == ivc->nframes - 1)
+		ivc->notify(ivc);
+
+	return 0;
+}
+EXPORT_SYMBOL(tegra_ivc_read_advance);
+
+/* directly poke at the next frame to be tx'ed */
+void *tegra_ivc_write_get_next_frame(struct ivc *ivc)
+{
+	int result = ivc_check_write(ivc);
+	if (result)
+		return ERR_PTR(result);
+
+	return ivc_frame_pointer(ivc, ivc->tx_channel, ivc->w_pos);
+}
+EXPORT_SYMBOL(tegra_ivc_write_get_next_frame);
+
+/* advance the tx buffer */
+int tegra_ivc_write_advance(struct ivc *ivc)
+{
+	int result = ivc_check_write(ivc);
+	if (result)
+		return result;
+
+	ivc_flush_frame(ivc, ivc->tx_handle, ivc->w_pos, 0, ivc->frame_size);
+
+	/*
+	 * Order any possible stores to the frame before update of w_pos.
+	 */
+	ivc_wmb();
+
+	ivc_advance_tx(ivc);
+	ivc_flush_counter(ivc, ivc->tx_handle +
+			offsetof(struct ivc_channel_header, w_count));
+
+	/*
+	 * Ensure our write to w_pos occurs before our read from r_pos.
+	 */
+	ivc_mb();
+
+	/*
+	 * Notify only upon transition from empty to non-empty.
+	 * The available count can only asynchronously decrease, so the
+	 * worst possible side-effect will be a spurious notification.
+	 */
+	ivc_invalidate_counter(ivc, ivc->tx_handle +
+		offsetof(struct ivc_channel_header, r_count));
+
+	if (ivc_channel_avail_count(ivc, ivc->tx_channel) == 1)
+		ivc->notify(ivc);
+
+	return 0;
+}
+EXPORT_SYMBOL(tegra_ivc_write_advance);
+
+void tegra_ivc_channel_reset(struct ivc *ivc)
+{
+	ivc->tx_channel->state = ivc_state_sync;
+	ivc_flush_counter(ivc, ivc->tx_handle +
+			offsetof(struct ivc_channel_header, w_count));
+	ivc->notify(ivc);
+}
+EXPORT_SYMBOL(tegra_ivc_channel_reset);
+
+/*
+ * ===============================================================
+ *  IVC State Transition Table - see tegra_ivc_channel_notified()
+ * ===============================================================
+ *
+ *	local	remote	action
+ *	-----	------	-----------------------------------
+ *	SYNC	EST	<none>
+ *	SYNC	ACK	reset counters; move to EST; notify
+ *	SYNC	SYNC	reset counters; move to ACK; notify
+ *	ACK	EST	move to EST; notify
+ *	ACK	ACK	move to EST; notify
+ *	ACK	SYNC	reset counters; move to ACK; notify
+ *	EST	EST	<none>
+ *	EST	ACK	<none>
+ *	EST	SYNC	reset counters; move to ACK; notify
+ *
+ * ===============================================================
+ */
+
+int tegra_ivc_channel_notified(struct ivc *ivc)
+{
+	enum ivc_state peer_state;
+
+	/* Copy the receiver's state out of shared memory. */
+	ivc_invalidate_counter(ivc, ivc->rx_handle +
+			offsetof(struct ivc_channel_header, w_count));
+	peer_state = ACCESS_ONCE(ivc->rx_channel->state);
+
+	if (peer_state == ivc_state_sync) {
+		/*
+		 * Order observation of ivc_state_sync before stores clearing
+		 * tx_channel.
+		 */
+		ivc_rmb();
+
+		/*
+		 * Reset tx_channel counters. The remote end is in the SYNC
+		 * state and won't make progress until we change our state,
+		 * so the counters are not in use at this time.
+		 */
+		ivc->tx_channel->w_count = 0;
+		ivc->rx_channel->r_count = 0;
+
+		ivc->w_pos = 0;
+		ivc->r_pos = 0;
+
+		/*
+		 * Ensure that counters appear cleared before new state can be
+		 * observed.
+		 */
+		ivc_wmb();
+
+		/*
+		 * Move to ACK state. We have just cleared our counters, so it
+		 * is now safe for the remote end to start using these values.
+		 */
+		ivc->tx_channel->state = ivc_state_ack;
+		ivc_flush_counter(ivc, ivc->tx_handle +
+				offsetof(struct ivc_channel_header, w_count));
+
+		/*
+		 * Notify remote end to observe state transition.
+		 */
+		ivc->notify(ivc);
+
+	} else if (ivc->tx_channel->state == ivc_state_sync &&
+			peer_state == ivc_state_ack) {
+		/*
+		 * Order observation of ivc_state_ack before stores clearing
+		 * tx_channel.
+		 */
+		ivc_rmb();
+
+		/*
+		 * Reset tx_channel counters. The remote end is in the ACK
+		 * state and won't make progress until we change our state,
+		 * so the counters are not in use at this time.
+		 */
+		ivc->tx_channel->w_count = 0;
+		ivc->rx_channel->r_count = 0;
+
+		ivc->w_pos = 0;
+		ivc->r_pos = 0;
+
+		/*
+		 * Ensure that counters appear cleared before new state can be
+		 * observed.
+		 */
+		ivc_wmb();
+
+		/*
+		 * Move to ESTABLISHED state. We know that the remote end has
+		 * already cleared its counters, so it is safe to start
+		 * writing/reading on this channel.
+		 */
+		ivc->tx_channel->state = ivc_state_established;
+		ivc_flush_counter(ivc, ivc->tx_handle +
+				offsetof(struct ivc_channel_header, w_count));
+
+		/*
+		 * Notify remote end to observe state transition.
+		 */
+		ivc->notify(ivc);
+
+	} else if (ivc->tx_channel->state == ivc_state_ack) {
+		/*
+		 * At this point, we have observed the peer to be in either
+		 * the ACK or ESTABLISHED state. Next, order observation of
+		 * peer state before storing to tx_channel.
+		 */
+		ivc_rmb();
+
+		/*
+		 * Move to ESTABLISHED state. We know that we have previously
+		 * cleared our counters, and we know that the remote end has
+		 * cleared its counters, so it is safe to start writing/reading
+		 * on this channel.
+		 */
+		ivc->tx_channel->state = ivc_state_established;
+		ivc_flush_counter(ivc, ivc->tx_handle +
+				offsetof(struct ivc_channel_header, w_count));
+
+		/*
+		 * Notify remote end to observe state transition.
+		 */
+		ivc->notify(ivc);
+
+	} else {
+		/*
+		 * There is no need to handle any further action. Either the
+		 * channel is already fully established, or we are waiting for
+		 * the remote end to catch up with our current state. Refer
+		 * to the diagram in "IVC State Transition Table" above.
+		 */
+	}
+
+	return ivc->tx_channel->state == ivc_state_established ? 0 : -EAGAIN;
+}
+EXPORT_SYMBOL(tegra_ivc_channel_notified);
+
+size_t tegra_ivc_align(size_t size)
+{
+	return (size + (IVC_ALIGN - 1)) & ~(IVC_ALIGN - 1);
+}
+EXPORT_SYMBOL(tegra_ivc_align);
+
+unsigned tegra_ivc_total_queue_size(unsigned queue_size)
+{
+	if (queue_size & (IVC_ALIGN - 1)) {
+		pr_err("%s: queue_size (%u) must be %u-byte aligned\n",
+				__func__, queue_size, IVC_ALIGN);
+		return 0;
+	}
+	return queue_size + sizeof(struct ivc_channel_header);
+}
+EXPORT_SYMBOL(tegra_ivc_total_queue_size);
+
+static int check_ivc_params(uintptr_t queue_base1, uintptr_t queue_base2,
+		unsigned nframes, unsigned frame_size)
+{
+	BUG_ON(offsetof(struct ivc_channel_header, w_count) & (IVC_ALIGN - 1));
+	BUG_ON(offsetof(struct ivc_channel_header, r_count) & (IVC_ALIGN - 1));
+	BUG_ON(sizeof(struct ivc_channel_header) & (IVC_ALIGN - 1));
+
+	if ((uint64_t)nframes * (uint64_t)frame_size >= 0x100000000) {
+		pr_err("nframes * frame_size overflows\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * The headers must at least be aligned enough for counters
+	 * to be accessed atomically.
+	 */
+	if (queue_base1 & (IVC_ALIGN - 1)) {
+		pr_err("ivc channel start not aligned: %lx\n", queue_base1);
+		return -EINVAL;
+	}
+	if (queue_base2 & (IVC_ALIGN - 1)) {
+		pr_err("ivc channel start not aligned: %lx\n", queue_base2);
+		return -EINVAL;
+	}
+
+	if (frame_size & (IVC_ALIGN - 1)) {
+		pr_err("frame size not adequately aligned: %u\n", frame_size);
+		return -EINVAL;
+	}
+
+	if (queue_base1 < queue_base2) {
+		if (queue_base1 + frame_size * nframes > queue_base2) {
+			pr_err("queue regions overlap: %lx + %x, %x\n",
+					queue_base1, frame_size,
+					frame_size * nframes);
+			return -EINVAL;
+		}
+	} else {
+		if (queue_base2 + frame_size * nframes > queue_base1) {
+			pr_err("queue regions overlap: %lx + %x, %x\n",
+					queue_base2, frame_size,
+					frame_size * nframes);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+int tegra_ivc_init(struct ivc *ivc, uintptr_t rx_base, dma_addr_t rx_handle,
+		   uintptr_t tx_base, dma_addr_t tx_handle, unsigned nframes,
+		   unsigned frame_size, struct device *peer_device,
+		   void (*notify)(struct ivc *))
+{
+	size_t queue_size;
+
+	int result = check_ivc_params(rx_base, tx_base, nframes, frame_size);
+	if (result)
+		return result;
+
+	BUG_ON(!ivc);
+	BUG_ON(!notify);
+
+	queue_size = tegra_ivc_total_queue_size(nframes * frame_size);
+
+	/*
+	 * All sizes that can be returned by communication functions should
+	 * fit in an int.
+	 */
+	if (frame_size > INT_MAX)
+		return -E2BIG;
+
+	ivc->rx_channel = (struct ivc_channel_header *)rx_base;
+	ivc->tx_channel = (struct ivc_channel_header *)tx_base;
+
+	if (peer_device) {
+		if (rx_handle != DMA_ERROR_CODE) {
+			ivc->rx_handle = rx_handle;
+			ivc->tx_handle = tx_handle;
+		} else {
+			ivc->rx_handle = dma_map_single(peer_device,
+				ivc->rx_channel, queue_size, DMA_BIDIRECTIONAL);
+			if (ivc->rx_handle == DMA_ERROR_CODE)
+				return -ENOMEM;
+
+			ivc->tx_handle = dma_map_single(peer_device,
+				ivc->tx_channel, queue_size, DMA_BIDIRECTIONAL);
+			if (ivc->tx_handle == DMA_ERROR_CODE) {
+				dma_unmap_single(peer_device, ivc->rx_handle,
+					queue_size, DMA_BIDIRECTIONAL);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	ivc->notify = notify;
+	ivc->frame_size = frame_size;
+	ivc->nframes = nframes;
+	ivc->peer_device = peer_device;
+
+	/*
+	 * These values aren't necessarily correct until the channel has been
+	 * reset.
+	 */
+	ivc->w_pos = 0;
+	ivc->r_pos = 0;
+
+	return 0;
+}
+EXPORT_SYMBOL(tegra_ivc_init);
diff --git a/include/soc/tegra/ivc.h b/include/soc/tegra/ivc.h
new file mode 100644
index 000000000000..1762fbee3fa2
--- /dev/null
+++ b/include/soc/tegra/ivc.h
@@ -0,0 +1,102 @@
+/*
+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __TEGRA_IVC_H
+#define __TEGRA_IVC_H
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/types.h>
+
+struct ivc_channel_header;
+
+struct ivc {
+	struct ivc_channel_header *rx_channel, *tx_channel;
+	uint32_t w_pos, r_pos;
+
+	void (*notify)(struct ivc *);
+	uint32_t nframes, frame_size;
+
+	struct device *peer_device;
+	dma_addr_t rx_handle, tx_handle;
+};
+
+/**
+ * tegra_ivc_read_get_next_frame - Peek at the next frame to receive
+ * @ivc		pointer of the IVC channel
+ *
+ * Peek at the next frame to be received, without removing it from
+ * the queue.
+ *
+ * Returns a pointer to the frame, or an error encoded pointer.
+ */
+void *tegra_ivc_read_get_next_frame(struct ivc *ivc);
+
+/**
+ * tegra_ivc_read_advance - Advance the read queue
+ * @ivc		pointer of the IVC channel
+ *
+ * Advance the read queue
+ *
+ * Returns 0, or a negative error value if failed.
+ */
+int tegra_ivc_read_advance(struct ivc *ivc);
+
+/**
+ * tegra_ivc_write_get_next_frame - Poke at the next frame to transmit
+ * @ivc		pointer of the IVC channel
+ *
+ * Get access to the next frame.
+ *
+ * Returns a pointer to the frame, or an error encoded pointer.
+ */
+void *tegra_ivc_write_get_next_frame(struct ivc *ivc);
+
+/**
+ * tegra_ivc_write_advance - Advance the write queue
+ * @ivc		pointer of the IVC channel
+ *
+ * Advance the write queue
+ *
+ * Returns 0, or a negative error value if failed.
+ */
+int tegra_ivc_write_advance(struct ivc *ivc);
+
+/**
+ * tegra_ivc_channel_notified - handle internal messages
+ * @ivc		pointer of the IVC channel
+ *
+ * This function must be called following every notification.
+ *
+ * Returns 0 if the channel is ready for communication, or -EAGAIN if a channel
+ * reset is in progress.
+ */
+int tegra_ivc_channel_notified(struct ivc *ivc);
+
+/**
+ * tegra_ivc_channel_reset - initiates a reset of the shared memory state
+ * @ivc		pointer of the IVC channel
+ *
+ * This function must be called after a channel is reserved and before it is
+ * used for communication. The channel is ready for use once a subsequent call
+ * to tegra_ivc_channel_notified() returns 0.
+ */
+void tegra_ivc_channel_reset(struct ivc *ivc);
+
+size_t tegra_ivc_align(size_t size);
+unsigned tegra_ivc_total_queue_size(unsigned queue_size);
+int tegra_ivc_init(struct ivc *ivc, uintptr_t rx_base, dma_addr_t rx_handle,
+		   uintptr_t tx_base, dma_addr_t tx_handle, unsigned nframes,
+		   unsigned frame_size, struct device *peer_device,
+		   void (*notify)(struct ivc *));
+
+#endif /* __TEGRA_IVC_H */
-- 
2.9.0

* [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (3 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 04/10] firmware: tegra: add IVC library Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-06 11:39   ` Alexandre Courbot
  2016-07-08 17:55   ` Sivaram Nair
  2016-07-05  9:04 ` [PATCH V2 06/10] soc/tegra: Add Tegra186 support Joseph Lo
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

The Tegra BPMP (Boot and Power Management Processor) is designed to
handle the boot process and to offload power management tasks and some
system control services from the CPU. These services include clock, DVFS,
thermal/EDP and power gating operations, as well as system suspend/resume
handling. The CPU and the drivers for these modules can use the services
provided by the BPMP firmware driver to signal an event for a specific PM
action to the BPMP and to receive status updates from the BPMP.

Compared to ARM SCPI, the service provided by the BPMP is message-based
rather than method-based. The BPMP firmware driver provides a
send/receive service for users that care about response time. A user
that needs to receive events or updates from the firmware can request an
MRQ service as well. The user is responsible for the message format,
which we call the BPMP ABI.
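
To illustrate the MRQ service described above, here is a minimal
user-space sketch of registering a handler for an MRQ code and
dispatching an incoming message to it. It is not the code from this
patch: the fixed-size table, MAX_HANDLERS, request_mrq(), handle_mrq()
and count_handler() are hypothetical stand-ins, and only MRQ_PING and
the bpmp_mrq_handler signature are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MRQ_PING	0	/* value taken from bpmp_abi.h in this patch */
#define MAX_HANDLERS	8	/* hypothetical table size for this sketch */

/* Same signature as bpmp_mrq_handler in include/soc/tegra/bpmp.h. */
typedef void (*bpmp_mrq_handler)(int mrq_code, void *data, int ch);

struct mrq {
	int used;
	int mrq_code;
	bpmp_mrq_handler handler;
	void *data;
};

static struct mrq mrq_table[MAX_HANDLERS];

/* Register a handler for one MRQ code (models ops->request_mrq). */
static int request_mrq(int mrq_code, bpmp_mrq_handler handler, void *data)
{
	size_t i;

	if (!handler)
		return -EINVAL;

	for (i = 0; i < MAX_HANDLERS; i++) {
		if (!mrq_table[i].used) {
			mrq_table[i].used = 1;
			mrq_table[i].mrq_code = mrq_code;
			mrq_table[i].handler = handler;
			mrq_table[i].data = data;
			return 0;
		}
	}

	return -ENOMEM;
}

/* Dispatch an incoming message; -EINVAL if no handler is registered. */
static int handle_mrq(int mrq_code, int ch)
{
	size_t i;

	for (i = 0; i < MAX_HANDLERS; i++) {
		if (mrq_table[i].used && mrq_table[i].mrq_code == mrq_code) {
			mrq_table[i].handler(mrq_code, mrq_table[i].data, ch);
			return 0;
		}
	}

	return -EINVAL;
}

/* Example handler: counts invocations through its private data pointer. */
static void count_handler(int mrq_code, void *data, int ch)
{
	(void)mrq_code;
	(void)ch;
	(*(int *)data)++;
}
```

In the driver itself the handler is also expected to answer the
firmware through the mrq_return op, and an MRQ code with no handler is
answered with -EINVAL. The sketch leaves the reply path out.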

The BPMP ABI defines the message format for different modules or usages.
For example, the clock operation needs an MRQ service code called
MRQ_CLK with specific message format which includes different sub
commands for various clock operations. This is the message format that
BPMP can recognize.
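
As a rough illustration of the frame layout, the sketch below mirrors
struct mb_data from bpmp.c in this patch: a code word, a flags word and
120 bytes of payload, 128 bytes in total. fill_frame() is a hypothetical
helper written for this example only, and the code is plain user-space
C, so it uses memcpy() where the driver uses memcpy_toio().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define BPMP_MSG_SZ		128
#define BPMP_MSG_DATA_SZ	120

#define MRQ_PING		0		/* from bpmp_abi.h */
#define DO_ACK			(1 << 0)	/* BIT(0) in bpmp.c */
#define RING_DOORBELL		(1 << 1)	/* BIT(1) in bpmp.c */

/* Same field layout as struct mb_data in bpmp.c. */
struct mb_data {
	int32_t code;
	int32_t flags;
	uint8_t data[BPMP_MSG_DATA_SZ];
} __attribute__((packed));

/* Header (8 bytes) plus payload (120 bytes) must fill one 128-byte frame. */
_Static_assert(sizeof(struct mb_data) == BPMP_MSG_SZ,
	       "mb_data must be exactly one BPMP message");

/* Hypothetical helper: build one outbound frame for the given MRQ code. */
static int fill_frame(struct mb_data *frame, int32_t mrq_code, int32_t flags,
		      const void *payload, size_t len)
{
	if (len > BPMP_MSG_DATA_SZ)
		return -EINVAL;

	memset(frame, 0, sizeof(*frame));
	frame->code = mrq_code;
	frame->flags = flags;
	if (payload && len)
		memcpy(frame->data, payload, len);

	return 0;
}
```

The layout of the payload itself depends on the MRQ code and is defined
by the structures in bpmp_abi.h.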

So the user needs two things to initiate IPC with the BPMP: get the
services from the bpmp_ops structure, and format the messages as the
BPMP ABI defines.
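
The two steps above can be sketched from a client's point of view as
follows. This is a user-space mock, not driver code: mock_send_receive()
and mock_ops stand in for the real transport, the reply value it
produces is arbitrary, and example_ping() is a hypothetical caller. Only
the send_receive signature, tegra_bpmp_get_ops() and MRQ_PING come from
this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MRQ_PING	0	/* from bpmp_abi.h */

/* Reduced copy of struct tegra_bpmp_ops; only the op used here. */
struct tegra_bpmp_ops {
	int (*send_receive)(int mrq_code, void *ob_data, int ob_sz,
			    void *ib_data, int ib_sz);
};

/* Mock transport: pretends to be the firmware and answers MRQ_PING. */
static int mock_send_receive(int mrq_code, void *ob_data, int ob_sz,
			     void *ib_data, int ib_sz)
{
	int challenge, reply;

	if (mrq_code != MRQ_PING || ob_sz != (int)sizeof(challenge) ||
	    ib_sz != (int)sizeof(reply))
		return -EINVAL;

	memcpy(&challenge, ob_data, sizeof(challenge));
	reply = challenge << 1;		/* arbitrary mock reply */
	memcpy(ib_data, &reply, sizeof(reply));

	return 0;
}

static struct tegra_bpmp_ops mock_ops = {
	.send_receive = mock_send_receive,
};

/* Mock of tegra_bpmp_get_ops(); the real one returns NULL before init. */
static struct tegra_bpmp_ops *tegra_bpmp_get_ops(void)
{
	return &mock_ops;
}

/* Step 1: get the ops. Step 2: send a message in the ABI's format. */
static int example_ping(int challenge, int *reply)
{
	struct tegra_bpmp_ops *ops = tegra_bpmp_get_ops();

	if (!ops)
		return -ENODEV;

	return ops->send_receive(MRQ_PING, &challenge, sizeof(challenge),
				 reply, sizeof(*reply));
}
```

A real caller would pick send_receive or send_receive_atomic depending
on whether interrupts are enabled, since the driver rejects the wrong
variant with -EPERM.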

Based-on-the-work-by: Sivaram Nair <sivaramn@nvidia.com>

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- None
---
 drivers/firmware/tegra/Kconfig  |   12 +
 drivers/firmware/tegra/Makefile |    1 +
 drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
 include/soc/tegra/bpmp.h        |   29 +
 include/soc/tegra/bpmp_abi.h    | 1601 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 2356 insertions(+)
 create mode 100644 drivers/firmware/tegra/bpmp.c
 create mode 100644 include/soc/tegra/bpmp.h
 create mode 100644 include/soc/tegra/bpmp_abi.h

diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
index 1fa3e4e136a5..ff2730d5c468 100644
--- a/drivers/firmware/tegra/Kconfig
+++ b/drivers/firmware/tegra/Kconfig
@@ -10,4 +10,16 @@ config TEGRA_IVC
 	  keeps the content is synchronization between host CPU and remote
 	  processors.
 
+config TEGRA_BPMP
+	bool "Tegra BPMP driver"
+	depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
+	help
+	  BPMP (Boot and Power Management Processor) is designed to offload
+	  PM functions, including clock/DVFS/thermal/power, from the CPU.
+	  It needs HSP as the HW synchronization and notification module and
+	  the IVC module as the message communication protocol.
+
+	  This driver manages the IPC interface between host CPU and the
+	  firmware running on BPMP.
+
 endmenu
diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
index 92e2153e8173..e34a2f79e1ad 100644
--- a/drivers/firmware/tegra/Makefile
+++ b/drivers/firmware/tegra/Makefile
@@ -1 +1,2 @@
+obj-$(CONFIG_TEGRA_BPMP)	+= bpmp.o
 obj-$(CONFIG_TEGRA_IVC)		+= ivc.o
diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
new file mode 100644
index 000000000000..24fda626610e
--- /dev/null
+++ b/drivers/firmware/tegra/bpmp.c
@@ -0,0 +1,713 @@
+/*
+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/mailbox_client.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/semaphore.h>
+
+#include <soc/tegra/bpmp.h>
+#include <soc/tegra/bpmp_abi.h>
+#include <soc/tegra/ivc.h>
+
+#define BPMP_MSG_SZ		128
+#define BPMP_MSG_DATA_SZ	120
+
+#define __MRQ_ATTRS		0xff000000
+#define __MRQ_INDEX(id)		((id) & ~__MRQ_ATTRS)
+
+#define DO_ACK			BIT(0)
+#define RING_DOORBELL		BIT(1)
+
+struct tegra_bpmp_soc_data {
+	u32 ch_index;		/* channel index */
+	u32 thread_ch_index;	/* thread channel index */
+	u32 cpu_rx_ch_index;	/* CPU Rx channel index */
+	u32 nr_ch;		/* number of total channels */
+	u32 nr_thread_ch;	/* number of thread channels */
+	u32 ch_timeout;		/* channel timeout */
+	u32 thread_ch_timeout;	/* thread channel timeout */
+};
+
+struct channel_info {
+	u32 tch_free;
+	u32 tch_to_complete;
+	struct semaphore tch_sem;
+};
+
+struct mb_data {
+	s32 code;
+	s32 flags;
+	u8 data[BPMP_MSG_DATA_SZ];
+} __packed;
+
+struct channel_data {
+	struct mb_data *ib;
+	struct mb_data *ob;
+};
+
+struct mrq {
+	struct list_head list;
+	u32 mrq_code;
+	bpmp_mrq_handler handler;
+	void *data;
+};
+
+struct tegra_bpmp {
+	struct device *dev;
+	const struct tegra_bpmp_soc_data *soc_data;
+	void __iomem *tx_base;
+	void __iomem *rx_base;
+	struct mbox_client cl;
+	struct mbox_chan *chan;
+	struct ivc *ivc_channels;
+	struct channel_data *ch_area;
+	struct channel_info ch_info;
+	struct completion *ch_completion;
+	struct list_head mrq_list;
+	struct tegra_bpmp_ops *ops;
+	spinlock_t lock;
+	bool init_done;
+};
+
+static struct tegra_bpmp *bpmp;
+
+static int bpmp_get_thread_ch(int idx)
+{
+	return bpmp->soc_data->thread_ch_index + idx;
+}
+
+static int bpmp_get_thread_ch_index(int ch)
+{
+	if (ch < bpmp->soc_data->thread_ch_index ||
+	    ch >= bpmp->soc_data->cpu_rx_ch_index)
+		return -1;
+	return ch - bpmp->soc_data->thread_ch_index;
+}
+
+static int bpmp_get_ob_channel(void)
+{
+	return smp_processor_id() + bpmp->soc_data->ch_index;
+}
+
+static struct completion *bpmp_get_completion_obj(int ch)
+{
+	int i = bpmp_get_thread_ch_index(ch);
+
+	return i < 0 ? NULL : bpmp->ch_completion + i;
+}
+
+static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int ib_sz)
+{
+	return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
+	       ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
+	       (!ob_sz || ob_data) && (!ib_sz || ib_data);
+}
+
+static bool bpmp_master_acked(int ch)
+{
+	struct ivc *ivc_chan;
+	void *frame;
+	bool ready;
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	frame = tegra_ivc_read_get_next_frame(ivc_chan);
+	ready = !IS_ERR_OR_NULL(frame);
+	bpmp->ch_area[ch].ib = ready ? frame : NULL;
+
+	return ready;
+}
+
+static int bpmp_wait_ack(int ch)
+{
+	ktime_t t;
+
+	t = ns_to_ktime(local_clock());
+
+	do {
+		if (bpmp_master_acked(ch))
+			return 0;
+	} while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
+		 bpmp->soc_data->ch_timeout);
+
+	return -ETIMEDOUT;
+}
+
+static bool bpmp_master_free(int ch)
+{
+	struct ivc *ivc_chan;
+	void *frame;
+	bool ready;
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	frame = tegra_ivc_write_get_next_frame(ivc_chan);
+	ready = !IS_ERR_OR_NULL(frame);
+	bpmp->ch_area[ch].ob = ready ? frame : NULL;
+
+	return ready;
+}
+
+static int bpmp_wait_master_free(int ch)
+{
+	ktime_t t;
+
+	t = ns_to_ktime(local_clock());
+
+	do {
+		if (bpmp_master_free(ch))
+			return 0;
+	} while (ktime_us_delta(ns_to_ktime(local_clock()), t)
+		 < bpmp->soc_data->ch_timeout);
+
+	return -ETIMEDOUT;
+}
+
+static int __read_ch(int ch, void *data, int sz)
+{
+	struct ivc *ivc_chan;
+	struct mb_data *p;
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	p = bpmp->ch_area[ch].ib;
+	if (data)
+		memcpy_fromio(data, p->data, sz);
+
+	return tegra_ivc_read_advance(ivc_chan);
+}
+
+static int bpmp_read_ch(int ch, void *data, int sz)
+{
+	unsigned long flags;
+	int i, ret;
+
+	i = bpmp_get_thread_ch_index(ch);
+
+	spin_lock_irqsave(&bpmp->lock, flags);
+	ret = __read_ch(ch, data, sz);
+	bpmp->ch_info.tch_free |= (1 << i);
+	spin_unlock_irqrestore(&bpmp->lock, flags);
+
+	up(&bpmp->ch_info.tch_sem);
+
+	return ret;
+}
+
+static int __write_ch(int ch, int mrq_code, int flags, void *data, int sz)
+{
+	struct ivc *ivc_chan;
+	struct mb_data *p;
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	p = bpmp->ch_area[ch].ob;
+
+	p->code = mrq_code;
+	p->flags = flags;
+	if (data)
+		memcpy_toio(p->data, data, sz);
+
+	return tegra_ivc_write_advance(ivc_chan);
+}
+
+static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int sz)
+{
+	unsigned long flags;
+	int ret, i;
+
+	ret = down_timeout(&bpmp->ch_info.tch_sem,
+			   usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&bpmp->lock, flags);
+
+	i = __ffs(bpmp->ch_info.tch_free);
+	*ch = bpmp_get_thread_ch(i);
+	ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
+	if (!ret) {
+		bpmp->ch_info.tch_free &= ~(1 << i);
+		__write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data, sz);
+		bpmp->ch_info.tch_to_complete |= (1 << *ch);
+	}
+
+	spin_unlock_irqrestore(&bpmp->lock, flags);
+
+	return ret;
+}
+
+static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data, int sz)
+{
+	int ret;
+
+	ret = bpmp_wait_master_free(ch);
+	if (ret)
+		return ret;
+
+	return __write_ch(ch, mrq_code, flags, data, sz);
+}
+
+static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int ob_sz,
+				    void *ib_data, int ib_sz)
+{
+	int ch, ret;
+
+	if (WARN_ON(!irqs_disabled()))
+		return -EPERM;
+
+	if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
+		return -EINVAL;
+
+	if (!bpmp->init_done)
+		return -ENODEV;
+
+	ch = bpmp_get_ob_channel();
+	ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
+	if (ret)
+		return ret;
+
+	ret = mbox_send_message(bpmp->chan, NULL);
+	if (ret < 0)
+		return ret;
+	mbox_client_txdone(bpmp->chan, 0);
+
+	ret = bpmp_wait_ack(ch);
+	if (ret)
+		return ret;
+
+	return __read_ch(ch, ib_data, ib_sz);
+}
+
+static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
+			     void *ib_data, int ib_sz)
+{
+	struct completion *comp_obj;
+	unsigned long timeout;
+	int ch, ret;
+
+	if (WARN_ON(irqs_disabled()))
+		return -EPERM;
+
+	if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
+		return -EINVAL;
+
+	if (!bpmp->init_done)
+		return -ENODEV;
+
+	ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
+	if (ret)
+		return ret;
+
+	ret = mbox_send_message(bpmp->chan, NULL);
+	if (ret < 0)
+		return ret;
+	mbox_client_txdone(bpmp->chan, 0);
+
+	comp_obj = bpmp_get_completion_obj(ch);
+	timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
+	if (!wait_for_completion_timeout(comp_obj, timeout))
+		return -ETIMEDOUT;
+
+	return bpmp_read_ch(ch, ib_data, ib_sz);
+}
+
+static struct mrq *bpmp_find_mrq(u32 mrq_code)
+{
+	struct mrq *mrq;
+
+	list_for_each_entry(mrq, &bpmp->mrq_list, list) {
+		if (mrq_code == mrq->mrq_code)
+			return mrq;
+	}
+
+	return NULL;
+}
+
+static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
+{
+	int flags = bpmp->ch_area[ch].ib->flags;
+	struct ivc *ivc_chan;
+	struct mb_data *frame;
+	int ret;
+
+	if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
+		return;
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	ret = tegra_ivc_read_advance(ivc_chan);
+	WARN_ON(ret);
+
+	if (!(flags & DO_ACK))
+		return;
+
+	frame = tegra_ivc_write_get_next_frame(ivc_chan);
+	if (IS_ERR_OR_NULL(frame)) {
+		WARN_ON(1);
+		return;
+	}
+
+	frame->code = code;
+	if (data != NULL)
+		memcpy_toio(frame->data, data, sz);
+	ret = tegra_ivc_write_advance(ivc_chan);
+	WARN_ON(ret);
+
+	if (flags & RING_DOORBELL) {
+		ret = mbox_send_message(bpmp->chan, NULL);
+		if (ret < 0) {
+			WARN_ON(1);
+			return;
+		}
+		mbox_client_txdone(bpmp->chan, 0);
+	}
+}
+
+static void bpmp_mail_return(int ch, int ret_code, int val)
+{
+	bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
+}
+
+static void bpmp_handle_mrq(int mrq_code, int ch)
+{
+	struct mrq *mrq;
+
+	spin_lock(&bpmp->lock);
+
+	mrq = bpmp_find_mrq(mrq_code);
+	if (!mrq) {
+		spin_unlock(&bpmp->lock);
+		bpmp_mail_return(ch, -EINVAL, 0);
+		return;
+	}
+
+	mrq->handler(mrq_code, mrq->data, ch);
+
+	spin_unlock(&bpmp->lock);
+}
+
+static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void *data)
+{
+	struct mrq *mrq;
+	unsigned long flags;
+
+	if (!handler)
+		return -EINVAL;
+
+	mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
+	if (!mrq)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&bpmp->lock, flags);
+
+	mrq->mrq_code = __MRQ_INDEX(mrq_code);
+	mrq->handler = handler;
+	mrq->data = data;
+	list_add(&mrq->list, &bpmp->mrq_list);
+
+	spin_unlock_irqrestore(&bpmp->lock, flags);
+
+	return 0;
+}
+
+static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
+{
+	int challenge;
+	int reply;
+
+	challenge = *(int *)bpmp->ch_area[ch].ib->data;
+	reply = challenge << (smp_processor_id() + 1);
+	bpmp_mail_return(ch, 0, reply);
+}
+
+static int bpmp_mailman_init(void)
+{
+	return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
+}
+
+static int bpmp_ping(void)
+{
+	unsigned long flags;
+	ktime_t t;
+	int challenge = 1;
+	int reply = 0;
+	int ret;
+
+	t = ktime_get();
+	local_irq_save(flags);
+	ret = bpmp_send_receive_atomic(MRQ_PING, &challenge, sizeof(challenge),
+				       &reply, sizeof(reply));
+	local_irq_restore(flags);
+	t = ktime_sub(ktime_get(), t);
+
+	if (!ret)
+		dev_info(bpmp->dev,
+			 "ping ok: challenge: %d, reply: %d, time: %lld\n",
+			 challenge, reply, ktime_to_us(t));
+
+	return ret;
+}
+
+static int bpmp_get_fwtag(void)
+{
+	unsigned long flags;
+	void *vaddr;
+	dma_addr_t paddr;
+	u32 addr;
+	int ret;
+
+	vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
+				   GFP_KERNEL);
+	if (!vaddr)
+		return -ENOMEM;
+	addr = paddr;
+
+	local_irq_save(flags);
+	ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr, sizeof(addr),
+				       NULL, 0);
+	local_irq_restore(flags);
+
+	if (!ret)
+		dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
+
+	dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
+
+	return ret;
+}
+
+static void bpmp_signal_thread(int ch)
+{
+	int flags = bpmp->ch_area[ch].ob->flags;
+	struct completion *comp_obj;
+
+	if (!(flags & RING_DOORBELL))
+		return;
+
+	comp_obj = bpmp_get_completion_obj(ch);
+	if (!comp_obj) {
+		WARN_ON(1);
+		return;
+	}
+
+	complete(comp_obj);
+}
+
+static void bpmp_handle_rx(struct mbox_client *cl, void *data)
+{
+	int i, rx_ch;
+
+	rx_ch = bpmp->soc_data->cpu_rx_ch_index;
+
+	if (bpmp_master_acked(rx_ch))
+		bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
+
+	spin_lock(&bpmp->lock);
+
+	for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
+			bpmp->ch_info.tch_to_complete; i++) {
+		int ch = bpmp_get_thread_ch(i);
+
+		if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
+		    bpmp_master_acked(ch)) {
+			bpmp->ch_info.tch_to_complete &= ~(1 << ch);
+			bpmp_signal_thread(ch);
+		}
+	}
+
+	spin_unlock(&bpmp->lock);
+}
+
+static void bpmp_ivc_notify(struct ivc *ivc)
+{
+	int ret;
+
+	ret = mbox_send_message(bpmp->chan, NULL);
+	if (ret < 0)
+		return;
+
+	mbox_client_txdone(bpmp->chan, 0);
+}
+
+static int bpmp_msg_chan_init(int ch)
+{
+	struct ivc *ivc_chan;
+	u32 hdr_sz, msg_sz, que_sz;
+	uintptr_t rx_base, tx_base;
+	int ret;
+
+	msg_sz = tegra_ivc_align(BPMP_MSG_SZ);
+	hdr_sz = tegra_ivc_total_queue_size(0);
+	que_sz = tegra_ivc_total_queue_size(msg_sz);
+
+	rx_base =  (uintptr_t)(bpmp->rx_base + que_sz * ch);
+	tx_base =  (uintptr_t)(bpmp->tx_base + que_sz * ch);
+
+	ivc_chan = bpmp->ivc_channels + ch;
+	ret = tegra_ivc_init(ivc_chan, rx_base, DMA_ERROR_CODE, tx_base,
+			     DMA_ERROR_CODE, 1, msg_sz, bpmp->dev,
+			     bpmp_ivc_notify);
+	if (ret) {
+		dev_err(bpmp->dev, "%s fail: ch %d returned %d\n",
+			__func__, ch, ret);
+		return ret;
+	}
+
+	/* reset the channel state */
+	tegra_ivc_channel_reset(ivc_chan);
+
+	/* sync the channel state with BPMP */
+	while (tegra_ivc_channel_notified(ivc_chan))
+		;
+
+	return 0;
+}
+
+struct tegra_bpmp_ops *tegra_bpmp_get_ops(void)
+{
+	if (bpmp->init_done && bpmp->ops)
+		return bpmp->ops;
+	return NULL;
+}
+EXPORT_SYMBOL(tegra_bpmp_get_ops);
+
+static struct tegra_bpmp_ops bpmp_ops = {
+	.send_receive = bpmp_send_receive,
+	.send_receive_atomic = bpmp_send_receive_atomic,
+	.request_mrq = bpmp_request_mrq,
+	.mrq_return = bpmp_mail_return,
+};
+
+static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
+	.ch_index = 0,
+	.thread_ch_index = 6,
+	.cpu_rx_ch_index = 13,
+	.nr_ch = 14,
+	.nr_thread_ch = 7,
+	.ch_timeout = 60 * USEC_PER_SEC,
+	.thread_ch_timeout = 600 * USEC_PER_SEC,
+};
+
+static const struct of_device_id tegra_bpmp_match[] = {
+	{ .compatible = "nvidia,tegra186-bpmp", .data = &soc_data_tegra186 },
+	{ }
+};
+
+static int tegra_bpmp_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *match;
+	struct resource shmem_res;
+	struct device_node *shmem_np;
+	int i, ret;
+
+	bpmp = devm_kzalloc(&pdev->dev, sizeof(*bpmp), GFP_KERNEL);
+	if (!bpmp)
+		return -ENOMEM;
+	bpmp->dev = &pdev->dev;
+
+	match = of_match_device(tegra_bpmp_match, &pdev->dev);
+	if (!match)
+		return -EINVAL;
+	bpmp->soc_data = match->data;
+
+	shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 0);
+	of_address_to_resource(shmem_np, 0, &shmem_res);
+	bpmp->tx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
+	if (IS_ERR(bpmp->tx_base))
+		return PTR_ERR(bpmp->tx_base);
+
+	shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 1);
+	of_address_to_resource(shmem_np, 0, &shmem_res);
+	bpmp->rx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
+	if (IS_ERR(bpmp->rx_base))
+		return PTR_ERR(bpmp->rx_base);
+
+	bpmp->ivc_channels = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
+					  sizeof(*bpmp->ivc_channels),
+					  GFP_KERNEL);
+	if (!bpmp->ivc_channels)
+		return -ENOMEM;
+
+	bpmp->ch_area = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
+				     sizeof(*bpmp->ch_area), GFP_KERNEL);
+	if (!bpmp->ch_area)
+		return -ENOMEM;
+
+	bpmp->ch_completion = devm_kcalloc(&pdev->dev,
+					   bpmp->soc_data->nr_thread_ch,
+					   sizeof(*bpmp->ch_completion),
+					   GFP_KERNEL);
+	if (!bpmp->ch_completion)
+		return -ENOMEM;
+
+	/* mbox registration */
+	bpmp->cl.dev = &pdev->dev;
+	bpmp->cl.rx_callback = bpmp_handle_rx;
+	bpmp->cl.tx_block = false;
+	bpmp->cl.knows_txdone = false;
+	bpmp->chan = mbox_request_channel(&bpmp->cl, 0);
+	if (IS_ERR(bpmp->chan)) {
+		if (PTR_ERR(bpmp->chan) != -EPROBE_DEFER)
+			dev_err(&pdev->dev,
+				"fail to get HSP mailbox, bpmp init fail.\n");
+		return PTR_ERR(bpmp->chan);
+	}
+
+	/* message channel initialization */
+	for (i = 0; i < bpmp->soc_data->nr_ch; i++) {
+		struct completion *comp_obj;
+
+		ret = bpmp_msg_chan_init(i);
+		if (ret)
+			return ret;
+
+		comp_obj = bpmp_get_completion_obj(i);
+		if (comp_obj)
+			init_completion(comp_obj);
+	}
+
+	bpmp->ch_info.tch_free = (1 << bpmp->soc_data->nr_thread_ch) - 1;
+	sema_init(&bpmp->ch_info.tch_sem, bpmp->soc_data->nr_thread_ch);
+
+	spin_lock_init(&bpmp->lock);
+	INIT_LIST_HEAD(&bpmp->mrq_list);
+	if (bpmp_mailman_init())
+		return -ENODEV;
+
+	bpmp->init_done = true;
+
+	ret = bpmp_ping();
+	if (ret)
+		dev_err(&pdev->dev, "ping failed: %d\n", ret);
+
+	ret = bpmp_get_fwtag();
+	if (ret)
+		dev_err(&pdev->dev, "get fwtag failed: %d\n", ret);
+
+	/* BPMP is ready now. */
+	bpmp->ops = &bpmp_ops;
+
+	return 0;
+}
+
+static struct platform_driver tegra_bpmp_driver = {
+	.driver = {
+		.name = "tegra-bpmp",
+		.of_match_table = tegra_bpmp_match,
+	},
+	.probe = tegra_bpmp_probe,
+};
+
+static int __init tegra_bpmp_init(void)
+{
+	return platform_driver_register(&tegra_bpmp_driver);
+}
+core_initcall(tegra_bpmp_init);
diff --git a/include/soc/tegra/bpmp.h b/include/soc/tegra/bpmp.h
new file mode 100644
index 000000000000..aaa0ef34ad7b
--- /dev/null
+++ b/include/soc/tegra/bpmp.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __TEGRA_BPMP_H
+#define __TEGRA_BPMP_H
+typedef void (*bpmp_mrq_handler)(int mrq_code, void *data, int ch);
+
+struct tegra_bpmp_ops {
+	int (*send_receive)(int mrq_code, void *ob_data, int ob_sz,
+			    void *ib_data, int ib_sz);
+	int (*send_receive_atomic)(int mrq_code, void *ob_data, int ob_sz,
+			    void *ib_data, int ib_sz);
+	int (*request_mrq)(int mrq_code, bpmp_mrq_handler handler, void *data);
+	void (*mrq_return)(int ch, int ret_code, int val);
+};
+
+struct tegra_bpmp_ops *tegra_bpmp_get_ops(void);
+
+#endif /* __TEGRA_BPMP_H */
diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
new file mode 100644
index 000000000000..0aaef5960e29
--- /dev/null
+++ b/include/soc/tegra/bpmp_abi.h
@@ -0,0 +1,1601 @@
+/*
+ * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _ABI_BPMP_ABI_H_
+#define _ABI_BPMP_ABI_H_
+
+#ifdef LK
+#include <stdint.h>
+#endif
+
+#ifndef __ABI_PACKED
+#define __ABI_PACKED __attribute__((packed))
+#endif
+
+#ifdef NO_GCC_EXTENSIONS
+#define EMPTY char empty;
+#define EMPTY_ARRAY 1
+#else
+#define EMPTY
+#define EMPTY_ARRAY 0
+#endif
+
+#ifndef __UNION_ANON
+#define __UNION_ANON
+#endif
+/**
+ * @file
+ */
+
+
+/**
+ * @defgroup MRQ MRQ Messages
+ * @brief Messages sent to/from BPMP via IPC
+ * @{
+ *   @defgroup MRQ_Format Message Format
+ *   @defgroup MRQ_Codes Message Request (MRQ) Codes
+ *   @defgroup MRQ_Payloads Message Payloads
+ *   @defgroup Error_Codes Error Codes
+ * @}
+ */
+
+/**
+ * @addtogroup MRQ_Format Message Format
+ * @{
+ * The CPU requests the BPMP to perform a particular service by
+ * sending it an IVC frame containing a single MRQ message. An MRQ
+ * message consists of a @ref mrq_request followed by a payload whose
+ * format depends on mrq_request::mrq.
+ *
+ * The BPMP processes the data and replies with an IVC frame (on the
+ * same IVC channel) containing an MRQ response. An MRQ response
+ * consists of a @ref mrq_response followed by a payload whose format
+ * depends on the associated mrq_request::mrq.
+ *
+ * A well-defined subset of the MRQ messages that the CPU sends to the
+ * BPMP can lead to BPMP eventually sending an MRQ message to the
+ * CPU. For example, when the CPU uses an #MRQ_THERMAL message to set
+ * a thermal trip point, the BPMP may eventually send a single
+ * #MRQ_THERMAL message of its own to the CPU indicating that the trip
+ * point has been crossed.
+ * @}
+ */
+
+/**
+ * @ingroup MRQ_Format
+ * @brief header for an MRQ message
+ *
+ * Provides the MRQ number for the MRQ message: #mrq. The remainder of
+ * the MRQ message is a payload (immediately following the
+ * mrq_request) whose format depends on mrq.
+ *
+ * @todo document the flags
+ */
+struct mrq_request {
+	/** @brief MRQ number of the request */
+	uint32_t mrq;
+	/** @brief flags for the request */
+	uint32_t flags;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Format
+ * @brief header for an MRQ response
+ *
+ *  Provides an error code for the associated MRQ message. The
+ *  remainder of the MRQ response is a payload (immediately following
+ *  the mrq_response) whose format depends on the associated
+ *  mrq_request::mrq.
+ *
+ * @todo document the flags
+ */
+struct mrq_response {
+	/** @brief error code for the MRQ request itself */
+	int32_t err;
+	/** @brief flags for the response */
+	uint32_t flags;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Format
+ * Minimum needed size for an IPC message buffer
+ */
+#define MSG_MIN_SZ	128
+/**
+ * @ingroup MRQ_Format
+ *  Minimum size guaranteed for data in an IPC message buffer
+ */
+#define MSG_DATA_MIN_SZ	120
+
+/**
+ * @ingroup MRQ_Codes
+ * @name Legal MRQ codes
+ * These are the legal values for mrq_request::mrq
+ * @{
+ */
+
+#define MRQ_PING		0
+#define MRQ_QUERY_TAG		1
+#define MRQ_MODULE_LOAD		4
+#define MRQ_MODULE_UNLOAD	5
+#define MRQ_TRACE_MODIFY	7
+#define MRQ_WRITE_TRACE		8
+#define MRQ_THREADED_PING	9
+#define MRQ_MODULE_MAIL		11
+#define MRQ_DEBUGFS		19
+#define MRQ_RESET		20
+#define MRQ_I2C			21
+#define MRQ_CLK			22
+#define MRQ_QUERY_ABI		23
+#define MRQ_PG_READ_STATE	25
+#define MRQ_PG_UPDATE_STATE	26
+#define MRQ_THERMAL		27
+#define MRQ_CPU_VHINT		28
+#define MRQ_ABI_RATCHET		29
+#define MRQ_EMC_DVFS_LATENCY	31
+#define MRQ_TRACE_ITER		64
+
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @brief Maximum MRQ code to be sent by CPU software to
+ * BPMP. Subject to change in future
+ */
+#define MAX_CPU_MRQ_ID		64
+
+/**
+ * @addtogroup MRQ_Payloads Message Payloads
+ * @{
+ *   @defgroup Ping
+ *   @defgroup Query_Tag Query Tag
+ *   @defgroup Module Loadable Modules
+ *   @defgroup Trace
+ *   @defgroup Debugfs
+ *   @defgroup Reset
+ *   @defgroup I2C
+ *   @defgroup Clocks
+ *   @defgroup ABI_info ABI Info
+ *   @defgroup MC_Flush MC Flush
+ *   @defgroup Powergating
+ *   @defgroup Thermal
+ *   @defgroup Vhint CPU Voltage hint
+ *   @defgroup MRQ_Deprecated Deprecated MRQ messages
+ *   @defgroup EMC
+ * @}
+ */
+
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_PING
+ * @brief A simple ping
+ *
+ * * Platforms: All
+ * * Initiators: Any
+ * * Targets: Any
+ * * Request Payload: @ref mrq_ping_request
+ * * Response Payload: @ref mrq_ping_response
+ *
+ * @ingroup MRQ_Codes
+ * @def MRQ_THREADED_PING
+ * @brief A deeper ping
+ *
+ * * Platforms: All
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_ping_request
+ * * Response Payload: @ref mrq_ping_response
+ *
+ * Behavior is equivalent to a simple #MRQ_PING except that BPMP
+ * responds from a thread context (providing a slightly more robust
+ * sign of life).
+ *
+ */
+
+/**
+ * @ingroup Ping
+ * @brief request with #MRQ_PING
+ *
+ * Used by the sender of an #MRQ_PING message to request a pong from
+ * the recipient. The response from the recipient is computed based on
+ * #challenge.
+ */
+struct mrq_ping_request {
+/** @brief arbitrarily chosen value */
+	uint32_t challenge;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Ping
+ * @brief response to #MRQ_PING
+ *
+ * Sent in response to an #MRQ_PING message. #reply should be the
+ * mrq_ping_request challenge left shifted by 1 with the carry-bit
+ * dropped.
+ *
+ */
+struct mrq_ping_response {
+	/** @brief response to the MRQ_PING challenge */
+	uint32_t reply;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_QUERY_TAG
+ * @brief Query BPMP firmware's tag (i.e. version information)
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_query_tag_request
+ * * Response Payload: N/A
+ *
+ */
+
+/**
+ * @ingroup Query_Tag
+ * @brief request with #MRQ_QUERY_TAG
+ *
+ * Used by #MRQ_QUERY_TAG call to ask BPMP to fill in the memory
+ * pointed to by #addr with the BPMP firmware header.
+ *
+ * The sender is responsible for ensuring that #addr is mapped into
+ * the recipient's address map.
+ */
+struct mrq_query_tag_request {
+  /** @brief base address to store the firmware header */
+	uint32_t addr;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_MODULE_LOAD
+ * @brief dynamically load a BPMP code module
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_module_load_request
+ * * Response Payload: @ref mrq_module_load_response
+ *
+ * @note This MRQ is disabled on production systems
+ *
+ */
+
+/**
+ * @ingroup Module
+ * @brief request with #MRQ_MODULE_LOAD
+ *
+ * Used by #MRQ_MODULE_LOAD calls to ask the recipient to dynamically
+ * load the code located at #phys_addr and having size #size
+ * bytes. #phys_addr is treated as a void pointer.
+ *
+ * The recipient copies the code from #phys_addr to locally allocated
+ * memory prior to responding to this message.
+ *
+ * @todo document the module header format
+ *
+ * The sender is responsible for ensuring that the code is mapped in
+ * the recipient's address map.
+ *
+ */
+struct mrq_module_load_request {
+	/** @brief base address of the code to load. Treated as (void *) */
+	uint32_t phys_addr; /* (void *) */
+	/** @brief size in bytes of code to load */
+	uint32_t size;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Module
+ * @brief response to #MRQ_MODULE_LOAD
+ *
+ * @todo document mrq_response::err
+ */
+struct mrq_module_load_response {
+	/** @brief handle to the loaded module */
+	uint32_t base;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_MODULE_UNLOAD
+ * @brief unload a previously loaded code module
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_module_unload_request
+ * * Response Payload: N/A
+ *
+ * @note This MRQ is disabled on production systems
+ */
+
+/**
+ * @ingroup Module
+ * @brief request with #MRQ_MODULE_UNLOAD
+ *
+ * Used by #MRQ_MODULE_UNLOAD calls to request that a previously loaded
+ * module be unloaded.
+ */
+struct mrq_module_unload_request {
+	/** @brief handle of the module to unload */
+	uint32_t base;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_TRACE_MODIFY
+ * @brief modify the set of enabled trace events
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_trace_modify_request
+ * * Response Payload: @ref mrq_trace_modify_response
+ *
+ * @note This MRQ is disabled on production systems
+ */
+
+/**
+ * @ingroup Trace
+ * @brief request with #MRQ_TRACE_MODIFY
+ *
+ * Used by %MRQ_TRACE_MODIFY calls to enable or disable specific trace
+ * events.  #set takes precedence for any bit set in both #set and
+ * #clr.
+ */
+struct mrq_trace_modify_request {
+	/** @brief bit mask of trace events to disable */
+	uint32_t clr;
+	/** @brief bit mask of trace events to enable */
+	uint32_t set;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Trace
+ * @brief response to #MRQ_TRACE_MODIFY
+ *
+ * Sent in response to an #MRQ_TRACE_MODIFY message. #mask reflects the
+ * state of which events are enabled after the recipient acted on the
+ * message.
+ *
+ */
+struct mrq_trace_modify_response {
+	/** @brief bit mask of trace event enable states */
+	uint32_t mask;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_WRITE_TRACE
+ * @brief Write trace data to a buffer
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_write_trace_request
+ * * Response Payload: @ref mrq_write_trace_response
+ *
+ * mrq_response::err depends on the @ref mrq_write_trace_request field
+ * values. err is -#BPMP_EINVAL if size is zero or area is NULL or
+ * area is in an illegal range. A positive value for err indicates the
+ * number of bytes written to area.
+ *
+ * @note This MRQ is disabled on production systems
+ */
+
+/**
+ * @ingroup Trace
+ * @brief request with #MRQ_WRITE_TRACE
+ *
+ * Used by MRQ_WRITE_TRACE calls to ask the recipient to copy trace
+ * data from the recipient's local buffer to the output buffer. #area
+ * is treated as a byte-aligned pointer in the recipient's address
+ * space.
+ *
+ * The sender is responsible for ensuring that the output
+ * buffer is mapped in the recipient's address map. The recipient is
+ * responsible for protecting its own code and data from accidental
+ * overwrites.
+ */
+struct mrq_write_trace_request {
+	/** @brief base address of output buffer */
+	uint32_t area;
+	/** @brief size in bytes of the output buffer */
+	uint32_t size;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Trace
+ * @brief response to #MRQ_WRITE_TRACE
+ *
+ * Once this response is sent, the respondent will not access the
+ * output buffer further.
+ */
+struct mrq_write_trace_response {
+	/**
+	 * @brief flag indicating whether the local buffer has been drained
+	 *
+	 * Value is 1 if the entire local trace buffer has been
+	 * drained to the output buffer. Value is 0 otherwise.
+	 */
+	uint32_t eof;
+} __ABI_PACKED;
+
+/** @private */
+struct mrq_threaded_ping_request {
+	uint32_t challenge;
+} __ABI_PACKED;
+
+/** @private */
+struct mrq_threaded_ping_response {
+	uint32_t reply;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_MODULE_MAIL
+ * @brief send a message to a loadable module
+ *
+ * * Platforms: All
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_module_mail_request
+ * * Response Payload: @ref mrq_module_mail_response
+ *
+ * @note This MRQ is disabled on production systems
+ */
+
+/**
+ * @ingroup Module
+ * @brief request with #MRQ_MODULE_MAIL
+ */
+struct mrq_module_mail_request {
+	/** @brief handle to the previously loaded module */
+	uint32_t base;
+	/** @brief module-specific mail payload
+	 *
+	 * The length of data[ ] is unknown to the BPMP core firmware
+	 * but it is limited to the size of an IPC message.
+	 */
+	uint8_t data[EMPTY_ARRAY];
+} __ABI_PACKED;
+
+/**
+ * @ingroup Module
+ * @brief response to #MRQ_MODULE_MAIL
+ */
+struct mrq_module_mail_response {
+	/** @brief module-specific mail payload
+	 *
+	 * The length of data[ ] is unknown to the BPMP core firmware
+	 * but it is limited to the size of an IPC message.
+	 */
+	uint8_t data[EMPTY_ARRAY];
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_DEBUGFS
+ * @brief Interact with BPMP's debugfs file nodes
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_debugfs_request
+ * * Response Payload: @ref mrq_debugfs_response
+ */
+
+/**
+ * @addtogroup Debugfs
+ * @{
+ *
+ * The BPMP firmware implements a pseudo-filesystem called
+ * debugfs. Any driver within the firmware may register with debugfs
+ * to expose an arbitrary set of "files" in the filesystem. When
+ * software on the CPU writes to a debugfs file, debugfs passes the
+ * written data to a callback provided by the driver. When software on
+ * the CPU reads a debugfs file, debugfs queries the driver for the
+ * data to return to the CPU. The intention of the debugfs filesystem
+ * is to provide information useful for debugging the system at
+ * runtime.
+ *
+ * @note The files exposed via debugfs are not part of the
+ * BPMP firmware's ABI. debugfs files may be added or removed in any
+ * given version of the firmware. Typically the semantics of a debugfs
+ * file are consistent from version to version but even that is not
+ * guaranteed.
+ *
+ * @}
+ */
+/** @ingroup Debugfs */
+enum mrq_debugfs_commands {
+	CMD_DEBUGFS_READ = 1,
+	CMD_DEBUGFS_WRITE = 2,
+	CMD_DEBUGFS_DUMPDIR = 3,
+	CMD_DEBUGFS_MAX
+};
+
+/**
+ * @ingroup Debugfs
+ * @brief parameters for CMD_DEBUGFS_READ/WRITE command
+ */
+struct cmd_debugfs_fileop_request {
+	/** @brief physical address pointing at filename */
+	uint32_t fnameaddr;
+	/** @brief length in bytes of filename buffer */
+	uint32_t fnamelen;
+	/** @brief physical address pointing to data buffer */
+	uint32_t dataaddr;
+	/** @brief length in bytes of data buffer */
+	uint32_t datalen;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Debugfs
+ * @brief parameters for CMD_DEBUGFS_DUMPDIR command
+ */
+struct cmd_debugfs_dumpdir_request {
+	/** @brief physical address pointing to data buffer */
+	uint32_t dataaddr;
+	/** @brief length in bytes of data buffer */
+	uint32_t datalen;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Debugfs
+ * @brief response data for CMD_DEBUGFS_READ/WRITE command
+ */
+struct cmd_debugfs_fileop_response {
+	/** @brief always 0 */
+	uint32_t reserved;
+	/** @brief number of bytes read from or written to data buffer */
+	uint32_t nbytes;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Debugfs
+ * @brief response data for CMD_DEBUGFS_DUMPDIR command
+ */
+struct cmd_debugfs_dumpdir_response {
+	/** @brief always 0 */
+	uint32_t reserved;
+	/** @brief number of bytes read from or written to data buffer */
+	uint32_t nbytes;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Debugfs
+ * @brief request with #MRQ_DEBUGFS.
+ *
+ * The sender of an MRQ_DEBUGFS message uses #cmd to specify a debugfs
+ * command to execute. Legal commands are the values of @ref
+ * mrq_debugfs_commands. Each command requires a specific additional
+ * payload of data.
+ *
+ * |command            |payload|
+ * |-------------------|-------|
+ * |CMD_DEBUGFS_READ   |fop    |
+ * |CMD_DEBUGFS_WRITE  |fop    |
+ * |CMD_DEBUGFS_DUMPDIR|dumpdir|
+ */
+struct mrq_debugfs_request {
+	uint32_t cmd;
+	union {
+		struct cmd_debugfs_fileop_request fop;
+		struct cmd_debugfs_dumpdir_request dumpdir;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Debugfs
+ */
+struct mrq_debugfs_response {
+	/** @brief always 0 */
+	int32_t reserved;
+	union {
+		/** @brief response data for CMD_DEBUGFS_READ OR
+		 * CMD_DEBUGFS_WRITE command
+		 */
+		struct cmd_debugfs_fileop_response fop;
+		/** @brief response data for CMD_DEBUGFS_DUMPDIR command */
+		struct cmd_debugfs_dumpdir_response dumpdir;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/**
+ * @addtogroup Debugfs
+ * @{
+ */
+#define DEBUGFS_S_ISDIR	(1 << 9)
+#define DEBUGFS_S_IRUSR	(1 << 8)
+#define DEBUGFS_S_IWUSR	(1 << 7)
+/** @} */
+
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_RESET
+ * @brief reset an IP block
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_reset_request
+ * * Response Payload: N/A
+ */
+
+/**
+ * @ingroup Reset
+ */
+enum mrq_reset_commands {
+	CMD_RESET_ASSERT = 1,
+	CMD_RESET_DEASSERT = 2,
+	CMD_RESET_MODULE = 3,
+	CMD_RESET_MAX, /* not part of ABI and subject to change */
+};
+
+/**
+ * @ingroup Reset
+ * @brief request with MRQ_RESET
+ *
+ * Used by the sender of an #MRQ_RESET message to request BPMP to
+ * assert or deassert a given reset line.
+ */
+struct mrq_reset_request {
+	/** @brief reset action to perform (@ref mrq_reset_commands) */
+	uint32_t cmd;
+	/** @brief id of the reset to be affected */
+	uint32_t reset_id;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_I2C
+ * @brief issue an i2c transaction
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_i2c_request
+ * * Response Payload: @ref mrq_i2c_response
+ */
+
+/**
+ * @addtogroup I2C
+ * @{
+ */
+#define TEGRA_I2C_IPC_MAX_IN_BUF_SIZE	(MSG_DATA_MIN_SZ - 12)
+#define TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE	(MSG_DATA_MIN_SZ - 4)
+/** @} */
+
+/**
+ * @ingroup I2C
+ * @name Serial I2C flags
+ * Use these flags with serial_i2c_request::flags
+ * @{
+ */
+#define SERIALI2C_TEN           0x0010
+#define SERIALI2C_RD            0x0001
+#define SERIALI2C_STOP          0x8000
+#define SERIALI2C_NOSTART       0x4000
+#define SERIALI2C_REV_DIR_ADDR  0x2000
+#define SERIALI2C_IGNORE_NAK    0x1000
+#define SERIALI2C_NO_RD_ACK     0x0800
+#define SERIALI2C_RECV_LEN      0x0400
+/** @} */
+/** @ingroup I2C */
+enum {
+	CMD_I2C_XFER = 1
+};
+
+/**
+ * @ingroup I2C
+ * @brief serializable i2c request
+ *
+ * Instances of this structure are packed (little-endian) into
+ * cmd_i2c_xfer_request::data_buf. Each instance represents a single
+ * transaction (or a portion of a transaction with repeated starts) on
+ * an i2c bus.
+ *
+ * Because these structures are packed, some instances are likely to
+ * be misaligned. Additionally because #data is variable length, it is
+ * not possible to iterate through a serialized list of these
+ * structures without inspecting #len in each instance.  It may be
+ * easier to serialize or deserialize cmd_i2c_xfer_request::data_buf
+ * manually rather than using this structure definition.
+ */
+struct serial_i2c_request {
+	/** @brief I2C slave address */
+	uint16_t addr;
+	/** @brief bitmask of SERIALI2C_ flags */
+	uint16_t flags;
+	/** @brief length of I2C transaction in bytes */
+	uint16_t len;
+	/** @brief for write transactions only, #len bytes of data */
+	uint8_t data[];
+} __ABI_PACKED;
+
+/**
+ * @ingroup I2C
+ * @brief trigger one or more i2c transactions
+ */
+struct cmd_i2c_xfer_request {
+	/** @brief valid bus number from mach-t186/i2c-t186.h */
+	uint32_t bus_id;
+
+	/** @brief count of valid bytes in #data_buf */
+	uint32_t data_size;
+
+	/** @brief serialized packed instances of @ref serial_i2c_request */
+	uint8_t data_buf[TEGRA_I2C_IPC_MAX_IN_BUF_SIZE];
+} __ABI_PACKED;
+
+/**
+ * @ingroup I2C
+ * @brief container for data read from the i2c bus
+ *
+ * Processing a cmd_i2c_xfer_request::data_buf causes BPMP to execute
+ * zero or more I2C reads. The data read from the bus is serialized
+ * into #data_buf.
+ */
+struct cmd_i2c_xfer_response {
+	/** @brief count of valid bytes in #data_buf*/
+	uint32_t data_size;
+	/** @brief i2c read data */
+	uint8_t data_buf[TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE];
+} __ABI_PACKED;
+
+/**
+ * @ingroup I2C
+ * @brief request with #MRQ_I2C
+ */
+struct mrq_i2c_request {
+	/** @brief always CMD_I2C_XFER (i.e. 1) */
+	uint32_t cmd;
+	/** @brief parameters of the transfer request */
+	struct cmd_i2c_xfer_request xfer;
+} __ABI_PACKED;
+
+/**
+ * @ingroup I2C
+ * @brief response to #MRQ_I2C
+ */
+struct mrq_i2c_response {
+	struct cmd_i2c_xfer_response xfer;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_CLK
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_clk_request
+ * * Response Payload: @ref mrq_clk_response
+ * @addtogroup Clocks
+ * @{
+ */
+
+/**
+ * @name MRQ_CLK sub-commands
+ * @{
+ */
+enum {
+	CMD_CLK_GET_RATE = 1,
+	CMD_CLK_SET_RATE = 2,
+	CMD_CLK_ROUND_RATE = 3,
+	CMD_CLK_GET_PARENT = 4,
+	CMD_CLK_SET_PARENT = 5,
+	CMD_CLK_IS_ENABLED = 6,
+	CMD_CLK_ENABLE = 7,
+	CMD_CLK_DISABLE = 8,
+	CMD_CLK_GET_ALL_INFO = 14,
+	CMD_CLK_GET_MAX_CLK_ID = 15,
+	CMD_CLK_MAX,
+};
+/** @} */
+
+#define MRQ_CLK_NAME_MAXLEN	40
+#define MRQ_CLK_MAX_PARENTS	16
+
+/** @private */
+struct cmd_clk_get_rate_request {
+	EMPTY
+} __ABI_PACKED;
+
+struct cmd_clk_get_rate_response {
+	int64_t rate;
+} __ABI_PACKED;
+
+struct cmd_clk_set_rate_request {
+	int32_t unused;
+	int64_t rate;
+} __ABI_PACKED;
+
+struct cmd_clk_set_rate_response {
+	int64_t rate;
+} __ABI_PACKED;
+
+struct cmd_clk_round_rate_request {
+	int32_t unused;
+	int64_t rate;
+} __ABI_PACKED;
+
+struct cmd_clk_round_rate_response {
+	int64_t rate;
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_get_parent_request {
+	EMPTY
+} __ABI_PACKED;
+
+struct cmd_clk_get_parent_response {
+	uint32_t parent_id;
+} __ABI_PACKED;
+
+struct cmd_clk_set_parent_request {
+	uint32_t parent_id;
+} __ABI_PACKED;
+
+struct cmd_clk_set_parent_response {
+	uint32_t parent_id;
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_is_enabled_request {
+	EMPTY
+} __ABI_PACKED;
+
+struct cmd_clk_is_enabled_response {
+	int32_t state;
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_enable_request {
+	EMPTY
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_enable_response {
+	EMPTY
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_disable_request {
+	EMPTY
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_disable_response {
+	EMPTY
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_get_all_info_request {
+	EMPTY
+} __ABI_PACKED;
+
+struct cmd_clk_get_all_info_response {
+	uint32_t flags;
+	uint32_t parent;
+	uint32_t parents[MRQ_CLK_MAX_PARENTS];
+	uint8_t num_parents;
+	uint8_t name[MRQ_CLK_NAME_MAXLEN];
+} __ABI_PACKED;
+
+/** @private */
+struct cmd_clk_get_max_clk_id_request {
+	EMPTY
+} __ABI_PACKED;
+
+struct cmd_clk_get_max_clk_id_response {
+	uint32_t max_id;
+} __ABI_PACKED;
+/** @} */
+
+/**
+ * @ingroup Clocks
+ * @brief request with #MRQ_CLK
+ *
+ * Used by the sender of an #MRQ_CLK message to control clocks. The
+ * clk_request is split into several sub-commands. Some sub-commands
+ * require no additional data. Others have a sub-command specific
+ * payload.
+ *
+ * |sub-command                 |payload                |
+ * |----------------------------|-----------------------|
+ * |CMD_CLK_GET_RATE            |-                      |
+ * |CMD_CLK_SET_RATE            |clk_set_rate           |
+ * |CMD_CLK_ROUND_RATE          |clk_round_rate         |
+ * |CMD_CLK_GET_PARENT          |-                      |
+ * |CMD_CLK_SET_PARENT          |clk_set_parent         |
+ * |CMD_CLK_IS_ENABLED          |-                      |
+ * |CMD_CLK_ENABLE              |-                      |
+ * |CMD_CLK_DISABLE             |-                      |
+ * |CMD_CLK_GET_ALL_INFO        |-                      |
+ * |CMD_CLK_GET_MAX_CLK_ID      |-                      |
+ *
+ */
+
+struct mrq_clk_request {
+	/** @brief sub-command and clock id concatenated to 32-bit word.
+	 * - bits[31..24] is the sub-cmd.
+	 * - bits[23..0] is the clock id
+	 */
+	uint32_t cmd_and_id;
+
+	union {
+		/** @private */
+		struct cmd_clk_get_rate_request clk_get_rate;
+		struct cmd_clk_set_rate_request clk_set_rate;
+		struct cmd_clk_round_rate_request clk_round_rate;
+		/** @private */
+		struct cmd_clk_get_parent_request clk_get_parent;
+		struct cmd_clk_set_parent_request clk_set_parent;
+		/** @private */
+		struct cmd_clk_enable_request clk_enable;
+		/** @private */
+		struct cmd_clk_disable_request clk_disable;
+		/** @private */
+		struct cmd_clk_is_enabled_request clk_is_enabled;
+		/** @private */
+		struct cmd_clk_get_all_info_request clk_get_all_info;
+		/** @private */
+		struct cmd_clk_get_max_clk_id_request clk_get_max_clk_id;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/**
+ * @ingroup Clocks
+ * @brief response to MRQ_CLK
+ *
+ * Each sub-command supported by @ref mrq_clk_request may return
+ * sub-command-specific data. Some do and some do not as indicated in
+ * the following table.
+ *
+ * |sub-command                 |payload                 |
+ * |----------------------------|------------------------|
+ * |CMD_CLK_GET_RATE            |clk_get_rate            |
+ * |CMD_CLK_SET_RATE            |clk_set_rate            |
+ * |CMD_CLK_ROUND_RATE          |clk_round_rate          |
+ * |CMD_CLK_GET_PARENT          |clk_get_parent          |
+ * |CMD_CLK_SET_PARENT          |clk_set_parent          |
+ * |CMD_CLK_IS_ENABLED          |clk_is_enabled          |
+ * |CMD_CLK_ENABLE              |-                       |
+ * |CMD_CLK_DISABLE             |-                       |
+ * |CMD_CLK_GET_ALL_INFO        |clk_get_all_info        |
+ * |CMD_CLK_GET_MAX_CLK_ID      |clk_get_max_clk_id      |
+ *
+ */
+
+struct mrq_clk_response {
+	union {
+		struct cmd_clk_get_rate_response clk_get_rate;
+		struct cmd_clk_set_rate_response clk_set_rate;
+		struct cmd_clk_round_rate_response clk_round_rate;
+		struct cmd_clk_get_parent_response clk_get_parent;
+		struct cmd_clk_set_parent_response clk_set_parent;
+		/** @private */
+		struct cmd_clk_enable_response clk_enable;
+		/** @private */
+		struct cmd_clk_disable_response clk_disable;
+		struct cmd_clk_is_enabled_response clk_is_enabled;
+		struct cmd_clk_get_all_info_response clk_get_all_info;
+		struct cmd_clk_get_max_clk_id_response clk_get_max_clk_id;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_QUERY_ABI
+ * @brief check if an MRQ is implemented
+ *
+ * * Platforms: All
+ * * Initiators: Any
+ * * Targets: Any
+ * * Request Payload: @ref mrq_query_abi_request
+ * * Response Payload: @ref mrq_query_abi_response
+ */
+
+/**
+ * @ingroup ABI_info
+ * @brief request with MRQ_QUERY_ABI
+ *
+ * Used by #MRQ_QUERY_ABI call to check if MRQ code #mrq is supported
+ * by the recipient.
+ */
+struct mrq_query_abi_request {
+	/** @brief MRQ code to query */
+	uint32_t mrq;
+} __ABI_PACKED;
+
+/**
+ * @ingroup ABI_info
+ * @brief response to MRQ_QUERY_ABI
+ */
+struct mrq_query_abi_response {
+	/** @brief 0 if queried MRQ is supported. Else, -#BPMP_ENODEV */
+	int32_t status;
+} __ABI_PACKED;
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_PG_READ_STATE
+ * @brief read the power-gating state of a partition
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_pg_read_state_request
+ * * Response Payload: @ref mrq_pg_read_state_response
+ * @addtogroup Powergating
+ * @{
+ */
+
+/**
+ * @brief request with #MRQ_PG_READ_STATE
+ *
+ * Used by MRQ_PG_READ_STATE call to read the current state of a
+ * partition.
+ */
+struct mrq_pg_read_state_request {
+	/** @brief ID of partition */
+	uint32_t partition_id;
+} __ABI_PACKED;
+
+/**
+ * @brief response to MRQ_PG_READ_STATE
+ * @todo define possible errors.
+ */
+struct mrq_pg_read_state_response {
+	/** @brief read as don't care */
+	uint32_t sram_state;
+	/** @brief state of power partition
+	 * * 0 : off
+	 * * 1 : on
+	 */
+	uint32_t logic_state;
+} __ABI_PACKED;
+
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_PG_UPDATE_STATE
+ * @brief modify the power-gating state of a partition
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_pg_update_state_request
+ * * Response Payload: N/A
+ * @addtogroup Powergating
+ * @{
+ */
+
+/**
+ * @brief request with #MRQ_PG_UPDATE_STATE
+ *
+ * Used by #MRQ_PG_UPDATE_STATE call to request BPMP to change the
+ * state of a power partition #partition_id.
+ */
+struct mrq_pg_update_state_request {
+	/** @brief ID of partition */
+	uint32_t partition_id;
+	/** @brief secondary control of power partition
+	 *  @details Ignored by many versions of the BPMP
+	 *  firmware. For maximum compatibility, set the value
+	 *  according to @ref logic_state
+	 * *  0x1: power ON partition (@ref logic_state == 0x3)
+	 * *  0x3: power OFF partition (@ref logic_state == 0x1)
+	 */
+	uint32_t sram_state;
+	/** @brief controls state of power partition, legal values are
+	 * *  0x1 : power OFF partition
+	 * *  0x3 : power ON partition
+	 */
+	uint32_t logic_state;
+	/** @brief change state of clocks of the power partition, legal values
+	 * *  0x0 : do not change clock state
+	 * *  0x1 : disable partition clocks (only applicable when
+	 *          @ref logic_state == 0x1)
+	 * *  0x3 : enable partition clocks (only applicable when
+	 *          @ref logic_state == 0x3)
+	 */
+	uint32_t clock_state;
+} __ABI_PACKED;
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_THERMAL
+ * @brief interact with BPMP thermal framework
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: Any
+ * * Request Payload: TODO
+ * * Response Payload: TODO
+ *
+ * @addtogroup Thermal
+ *
+ * The BPMP firmware includes a thermal framework. Drivers within the
+ * bpmp firmware register with the framework to provide thermal
+ * zones. Each thermal zone corresponds to an entity whose temperature
+ * can be measured. The framework also has a notion of trip points. A
+ * trip point consists of a thermal zone id, a temperature, and a
+ * callback routine. The framework invokes the callback when the zone
+ * hits the indicated temperature. The BPMP firmware uses this thermal
+ * framework internally to implement various temperature-dependent
+ * functions.
+ *
+ * Software on the CPU can use #MRQ_THERMAL (with payload @ref
+ * mrq_thermal_host_to_bpmp_request) to interact with the BPMP thermal
+ * framework. The CPU can query the number of supported zones,
+ * query zone temperatures, and set trip points.
+ *
+ * When a trip point set by the CPU gets crossed, BPMP firmware issues
+ * an IPC to the CPU having mrq_request::mrq = #MRQ_THERMAL and a
+ * payload of @ref mrq_thermal_bpmp_to_host_request.
+ * @{
+ */
+enum mrq_thermal_host_to_bpmp_cmd {
+	/**
+	 * @brief Check whether the BPMP driver supports the specified
+	 * request type.
+	 *
+	 * Host needs to supply request parameters.
+	 *
+	 * mrq_response::err is 0 if the specified request is
+	 * supported and -#BPMP_ENODEV otherwise.
+	 */
+	CMD_THERMAL_QUERY_ABI = 0,
+
+	/**
+	 * @brief Get the current temperature of the specified zone.
+	 *
+	 * Host needs to supply request parameters.
+	 *
+	 * mrq_response::err is
+	 * *  0: Temperature query succeeded.
+	 * *  -#BPMP_EINVAL: Invalid request parameters.
+	 * *  -#BPMP_ENOENT: No driver registered for thermal zone.
+	 * *  -#BPMP_EFAULT: Problem reading temperature measurement.
+	 */
+	CMD_THERMAL_GET_TEMP = 1,
+
+	/**
+	 * @brief Enable or disable and set the lower and upper
+	 *   thermal limits for a thermal trip point. Each zone has
+	 *   one trip point.
+	 *
+	 * Host needs to supply request parameters. Once the
+	 * temperature hits a trip point, the BPMP will send a message
+	 * to the CPU having MRQ=MRQ_THERMAL and
+	 * type=CMD_THERMAL_HOST_TRIP_REACHED
+	 *
+	 * mrq_response::err is
+	 * *  0: Trip successfully set.
+	 * *  -#BPMP_EINVAL: Invalid request parameters.
+	 * *  -#BPMP_ENOENT: No driver registered for thermal zone.
+	 * *  -#BPMP_EFAULT: Problem setting trip point.
+	 */
+	CMD_THERMAL_SET_TRIP = 2,
+
+	/**
+	 * @brief Get the number of supported thermal zones.
+	 *
+	 * No request parameters required.
+	 *
+	 * mrq_response::err is always 0, indicating success.
+	 */
+	CMD_THERMAL_GET_NUM_ZONES = 3,
+
+	/** @brief: number of supported host-to-bpmp commands. May
+	 * increase in future
+	 */
+	CMD_THERMAL_HOST_TO_BPMP_NUM
+};
+
+enum mrq_thermal_bpmp_to_host_cmd {
+	/**
+	 * @brief Indication that the temperature for a zone has
+	 *   exceeded the range indicated in the thermal trip point
+	 *   for the zone.
+	 *
+	 * BPMP needs to supply request parameters. Host only needs to
+	 * acknowledge.
+	 */
+	CMD_THERMAL_HOST_TRIP_REACHED = 100,
+
+	/** @brief: number of supported bpmp-to-host commands. May
+	 * increase in future
+	 */
+	CMD_THERMAL_BPMP_TO_HOST_NUM
+};
+
+/*
+ * Host->BPMP request data for request type CMD_THERMAL_QUERY_ABI
+ *
+ * type: Request type for which to check existence.
+ */
+struct cmd_thermal_query_abi_request {
+	uint32_t type;
+} __ABI_PACKED;
+
+/*
+ * Host->BPMP request data for request type CMD_THERMAL_GET_TEMP
+ *
+ * zone: Number of thermal zone.
+ */
+struct cmd_thermal_get_temp_request {
+	uint32_t zone;
+} __ABI_PACKED;
+
+/*
+ * BPMP->Host reply data for request CMD_THERMAL_GET_TEMP
+ *
+ * error: 0 if request succeeded.
+ *	-BPMP_EINVAL if request parameters were invalid.
+ *      -BPMP_ENOENT if no driver was registered for the specified thermal zone.
+ *      -BPMP_EFAULT for other thermal zone driver errors.
+ * temp: Current temperature in millicelsius.
+ */
+struct cmd_thermal_get_temp_response {
+	int32_t temp;
+} __ABI_PACKED;
+
+/*
+ * Host->BPMP request data for request type CMD_THERMAL_SET_TRIP
+ *
+ * zone: Number of thermal zone.
+ * low: Temperature of lower trip point in millicelsius
+ * high: Temperature of upper trip point in millicelsius
+ * enabled: 1 to enable trip point, 0 to disable trip point
+ */
+struct cmd_thermal_set_trip_request {
+	uint32_t zone;
+	int32_t low;
+	int32_t high;
+	uint32_t enabled;
+} __ABI_PACKED;
+
+/*
+ * BPMP->Host request data for request type CMD_THERMAL_HOST_TRIP_REACHED
+ *
+ * zone: Number of thermal zone where trip point was reached.
+ */
+struct cmd_thermal_host_trip_reached_request {
+	uint32_t zone;
+} __ABI_PACKED;
+
+/*
+ * BPMP->Host reply data for request type CMD_THERMAL_GET_NUM_ZONES
+ *
+ * num: Number of supported thermal zones. The thermal zones are indexed
+ *      starting from zero.
+ */
+struct cmd_thermal_get_num_zones_response {
+	uint32_t num;
+} __ABI_PACKED;
+
+/*
+ * Host->BPMP request data.
+ *
+ * Reply type is union mrq_thermal_bpmp_to_host_response.
+ *
+ * type: Type of request. Values listed in enum mrq_thermal_type.
+ * data: Request type specific parameters.
+ */
+struct mrq_thermal_host_to_bpmp_request {
+	uint32_t type;
+	union {
+		struct cmd_thermal_query_abi_request query_abi;
+		struct cmd_thermal_get_temp_request get_temp;
+		struct cmd_thermal_set_trip_request set_trip;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/*
+ * BPMP->Host request data.
+ *
+ * type: Type of request. Values listed in enum mrq_thermal_type.
+ * data: Request type specific parameters.
+ */
+struct mrq_thermal_bpmp_to_host_request {
+	uint32_t type;
+	union {
+		struct cmd_thermal_host_trip_reached_request host_trip_reached;
+	} __UNION_ANON;
+} __ABI_PACKED;
+
+/*
+ * Data in reply to a Host->BPMP request.
+ */
+union mrq_thermal_bpmp_to_host_response {
+	struct cmd_thermal_get_temp_response get_temp;
+	struct cmd_thermal_get_num_zones_response get_num_zones;
+} __ABI_PACKED;
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_CPU_VHINT
+ * @brief Query CPU voltage hint data
+ *
+ * * Platforms: T186
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_cpu_vhint_request
+ * * Response Payload: N/A
+ *
+ * @addtogroup Vhint CPU Voltage hint
+ * @{
+ */
+
+/**
+ * @brief request with #MRQ_CPU_VHINT
+ *
+ * Used by #MRQ_CPU_VHINT call by CCPLEX to retrieve voltage hint data
+ * from BPMP to memory space pointed by #addr. CCPLEX is responsible
+ * to allocate sizeof(cpu_vhint_data) sized block of memory and
+ * appropriately map it for BPMP before sending the request.
+ */
+struct mrq_cpu_vhint_request {
+	/** @brief IOVA address for the #cpu_vhint_data */
+	uint32_t addr; /* struct cpu_vhint_data * */
+	/** @brief ID of the cluster whose data is requested */
+	uint32_t cluster_id; /* enum cluster_id */
+} __ABI_PACKED;
+
+/**
+ * @brief description of the CPU v/f relation
+ *
+ * Used by #MRQ_CPU_VHINT call to carry data pointed by #addr of
+ * struct mrq_cpu_vhint_request
+ */
+struct cpu_vhint_data {
+	uint32_t ref_clk_hz; /**< reference frequency in Hz */
+	uint16_t pdiv; /**< post divider value */
+	uint16_t mdiv; /**< input divider value */
+	uint16_t ndiv_max; /**< fMAX expressed with max NDIV value */
+	/** table of ndiv values as a function of vINDEX (voltage index) */
+	uint16_t ndiv[80];
+	/** minimum allowed NDIV value */
+	uint16_t ndiv_min;
+	/** minimum allowed voltage hint value (as in vINDEX) */
+	uint16_t vfloor;
+	/** maximum allowed voltage hint value (as in vINDEX) */
+	uint16_t vceil;
+	/** post-multiplier for vindex value */
+	uint16_t vindex_mult;
+	/** post-divider for vindex value */
+	uint16_t vindex_div;
+	/** reserved for future use */
+	uint16_t reserved[328];
+} __ABI_PACKED;
+
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_ABI_RATCHET
+ * @brief ABI ratchet value query
+ *
+ * * Platforms: T186
+ * * Initiators: Any
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_abi_ratchet_request
+ * * Response Payload: @ref mrq_abi_ratchet_response
+ * @addtogroup ABI_info
+ * @{
+ */
+
+/**
+ * @brief an ABI compatibility mechanism
+ *
+ * BPMP_ABI_RATCHET_VALUE may increase for various reasons in a future
+ * revision of this header file.
+ * 1. That future revision deprecates some MRQ
+ * 2. That future revision introduces a breaking change to an existing
+ *    MRQ or
+ * 3. A bug is discovered in an existing implementation of the BPMP-FW
+ *    (or possibly one of its clients) which warrants deprecating that
+ *    implementation.
+ */
+#define BPMP_ABI_RATCHET_VALUE 3
+
+/**
+ * @brief request with #MRQ_ABI_RATCHET.
+ *
+ * #ratchet should be #BPMP_ABI_RATCHET_VALUE from the ABI header
+ * against which the requester was compiled.
+ *
+ * If ratchet is less than BPMP's #BPMP_ABI_RATCHET_VALUE, BPMP may
+ * reply with mrq_response::err = -#BPMP_ERANGE to indicate that
+ * BPMP-FW cannot interoperate correctly with the requester. Requester
+ * should cease further communication with BPMP.
+ *
+ * Otherwise, err shall be 0.
+ */
+struct mrq_abi_ratchet_request {
+	/** @brief requester's ratchet value */
+	uint16_t ratchet;
+};
+
+/**
+ * @brief response to #MRQ_ABI_RATCHET
+ *
+ * #ratchet shall be #BPMP_ABI_RATCHET_VALUE from the ABI header
+ * against which BPMP firmware was compiled.
+ *
+ * If #ratchet is less than the requester's #BPMP_ABI_RATCHET_VALUE,
+ * the requester must either interoperate with BPMP according to an ABI
+ * header version with BPMP_ABI_RATCHET_VALUE = ratchet or cease
+ * communication with BPMP.
+ *
+ * If mrq_response::err is 0 and ratchet is greater than or equal to the
+ * requester's BPMP_ABI_RATCHET_VALUE, the requester should continue
+ * normal operation.
+ */
+struct mrq_abi_ratchet_response {
+	/** @brief BPMP's ratchet value */
+	uint16_t ratchet;
+};
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_EMC_DVFS_LATENCY
+ * @brief query frequency dependent EMC DVFS latency
+ *
+ * * Platforms: T186
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: N/A
+ * * Response Payload: @ref mrq_emc_dvfs_latency_response
+ * @addtogroup EMC
+ * @{
+ */
+
+/**
+ * @brief used by @ref mrq_emc_dvfs_latency_response
+ */
+struct emc_dvfs_latency {
+	/** @brief EMC frequency in kHz */
+	uint32_t freq;
+	/** @brief EMC DVFS latency in nanoseconds */
+	uint32_t latency;
+} __ABI_PACKED;
+
+#define EMC_DVFS_LATENCY_MAX_SIZE	14
+/**
+ * @brief response to #MRQ_EMC_DVFS_LATENCY
+ */
+struct mrq_emc_dvfs_latency_response {
+	/** @brief the number of valid entries in #pairs */
+	uint32_t num_pairs;
+	/** @brief EMC <frequency, latency> information */
+	struct emc_dvfs_latency pairs[EMC_DVFS_LATENCY_MAX_SIZE];
+} __ABI_PACKED;
+
+/** @} */
+
+/**
+ * @ingroup MRQ_Codes
+ * @def MRQ_TRACE_ITER
+ * @brief manage the trace iterator
+ *
+ * * Platforms: All
+ * * Initiators: CCPLEX
+ * * Targets: BPMP
+ * * Request Payload: @ref mrq_trace_iter_request
+ * * Response Payload: N/A
+ * @addtogroup Trace
+ * @{
+ */
+enum {
+	/** @brief (re)start the tracing now. Ignore older events */
+	TRACE_ITER_INIT = 0,
+	/** @brief clobber all events in the trace buffer */
+	TRACE_ITER_CLEAN = 1
+};
+
+/**
+ * @brief request with #MRQ_TRACE_ITER
+ */
+struct mrq_trace_iter_request {
+	/** @brief TRACE_ITER_INIT or TRACE_ITER_CLEAN */
+	uint32_t cmd;
+} __ABI_PACKED;
+
+/** @} */
+
+/*
+ *  4. Enumerations
+ */
+
+/*
+ *   4.1 CPU enumerations
+ *
+ * See <mach-t186/system-t186.h>
+ *
+ *   4.2 CPU Cluster enumerations
+ *
+ * See <mach-t186/system-t186.h>
+ *
+ *   4.3 System low power state enumerations
+ *
+ * See <mach-t186/system-t186.h>
+ */
+
+/*
+ *   4.4 Clock enumerations
+ *
+ * For clock enumerations, see <mach-t186/clk-t186.h>
+ */
+
+/*
+ *   4.5 Reset enumerations
+ *
+ * For reset enumerations, see <mach-t186/reset-t186.h>
+ */
+
+/*
+ *   4.6 Thermal sensor enumerations
+ *
+ * For thermal sensor enumerations, see <mach-t186/thermal-t186.h>
+ */
+
+/**
+ * @defgroup Error_Codes
+ * Negative values for mrq_response::err generally indicate some
+ * error. The ABI defines the following error codes. Negating these
+ * defines is an exercise left to the user.
+ * @{
+ */
+/** @brief No such file or directory */
+#define BPMP_ENOENT	2
+/** @brief No MRQ handler */
+#define BPMP_ENOHANDLER	3
+/** @brief I/O error */
+#define BPMP_EIO	5
+/** @brief Bad sub-MRQ command */
+#define BPMP_EBADCMD	6
+/** @brief Not enough memory */
+#define BPMP_ENOMEM	12
+/** @brief Permission denied */
+#define BPMP_EACCES	13
+/** @brief Bad address */
+#define BPMP_EFAULT	14
+/** @brief No such device */
+#define BPMP_ENODEV	19
+/** @brief Argument is a directory */
+#define BPMP_EISDIR	21
+/** @brief Invalid argument */
+#define BPMP_EINVAL	22
+/** @brief Timeout during operation */
+#define BPMP_ETIMEDOUT  23
+/** @brief Out of range */
+#define BPMP_ERANGE	34
+/** @} */
+/** @} */
+#endif
-- 
2.9.0
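
Note on the MRQ_ABI_RATCHET comments in the header above: the decision a
requester has to make after the call can be sketched roughly as below.
This is only an illustration under stated assumptions. The names
ratchet_verdict and check_ratchet() are hypothetical and not part of the
ABI; the two defines are copied from the header; treating every non-zero
error as fatal is an assumption, since the header only spells out the
-BPMP_ERANGE case.

```c
#include <stdint.h>

/* Values copied from the ABI header in this patch. */
#define BPMP_ERANGE            34
#define BPMP_ABI_RATCHET_VALUE 3

/* Hypothetical result type, not part of the ABI. */
enum ratchet_verdict {
	RATCHET_OK,       /* continue normal operation */
	RATCHET_FALLBACK, /* use the older ABI that BPMP reports, or stop */
	RATCHET_STOP      /* cease communication with BPMP */
};

/*
 * err:          mrq_response::err returned for MRQ_ABI_RATCHET
 * bpmp_ratchet: mrq_abi_ratchet_response::ratchet
 * my_ratchet:   the value the requester sent in its request
 */
enum ratchet_verdict check_ratchet(int32_t err, uint16_t bpmp_ratchet,
				   uint16_t my_ratchet)
{
	/*
	 * The error defines are positive; the value on the wire is
	 * negated, e.g. -BPMP_ERANGE when BPMP rejects the requester.
	 * Assumption: any other non-zero error is also treated as fatal.
	 */
	if (err != 0)
		return RATCHET_STOP;

	/* BPMP firmware is older than the header we were built against. */
	if (bpmp_ratchet < my_ratchet)
		return RATCHET_FALLBACK;

	return RATCHET_OK;
}
```

A client built against this header would pass BPMP_ABI_RATCHET_VALUE as
my_ratchet and run the check once before issuing any other MRQ.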

* [PATCH V2 06/10] soc/tegra: Add Tegra186 support
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (4 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 05/10] firmware: tegra: add BPMP support Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 07/10] arm64: defconfig: Enable Tegra186 SoC Joseph Lo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

The Tegra186 combines Denver and Cortex-A57 CPU cores with a
Pascal-architecture GPU. It also features an ADSP with a Cortex-A9 CPU
for audio processing, a multi-format hardware video encoder/decoder, an
ISP for image capture processing, and the BPMP for power management.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- None
---
 drivers/soc/tegra/Kconfig | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/soc/tegra/Kconfig b/drivers/soc/tegra/Kconfig
index 03089ad2fc65..88a71dfd466c 100644
--- a/drivers/soc/tegra/Kconfig
+++ b/drivers/soc/tegra/Kconfig
@@ -61,6 +61,20 @@ config ARCH_TEGRA_132_SOC
 	  but contains an NVIDIA Denver CPU complex in place of
 	  Tegra124's "4+1" Cortex-A15 CPU complex.
 
+config ARCH_TEGRA_186_SOC
+	bool "NVIDIA Tegra186 SoC"
+	select MAILBOX
+	select TEGRA_BPMP
+	select TEGRA_HSP_MBOX
+	select TEGRA_IVC
+	help
+	  Enable support for the NVIDIA Tegra186 SoC. The Tegra186
+	  combines Denver and Cortex-A57 CPU cores with a
+	  Pascal-architecture GPU. It also features an ADSP with a
+	  Cortex-A9 CPU for audio processing, a multi-format hardware
+	  video encoder/decoder, an ISP for image capture processing,
+	  and the BPMP for power management.
+
 config ARCH_TEGRA_210_SOC
 	bool "NVIDIA Tegra210 SoC"
 	select PINCTRL_TEGRA210
-- 
2.9.0

* [PATCH V2 07/10] arm64: defconfig: Enable Tegra186 SoC
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (5 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 06/10] soc/tegra: Add Tegra186 support Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 08/10] arm64: dts: tegra: Add Tegra186 support Joseph Lo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Enable Tegra186 SoC.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- None
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index e69051098435..64d767ec142c 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -326,6 +326,7 @@ CONFIG_QCOM_SMEM=y
 CONFIG_QCOM_SMD=y
 CONFIG_QCOM_SMD_RPM=y
 CONFIG_ARCH_TEGRA_132_SOC=y
+CONFIG_ARCH_TEGRA_186_SOC=y
 CONFIG_ARCH_TEGRA_210_SOC=y
 CONFIG_EXTCON_USB_GPIO=y
 CONFIG_PWM=y
-- 
2.9.0

* [PATCH V2 08/10] arm64: dts: tegra: Add Tegra186 support
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (6 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 07/10] arm64: defconfig: Enable Tegra186 SoC Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 09/10] arm64: dts: tegra: Add NVIDIA Tegra186 P3310 main board support Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 10/10] arm64: dts: tegra: Add NVIDIA P2771 " Joseph Lo
  9 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

This adds initial support for the Tegra186 SoC, enough to bring up the
debug console and an initrd for further development.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- update the file according to the HSP and BPMP binding fix in V2
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 77 ++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186.dtsi

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
new file mode 100644
index 000000000000..57badd5de9b4
--- /dev/null
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -0,0 +1,77 @@
+#include <dt-bindings/interrupt-controller/arm-gic.h>
+#include <dt-bindings/mailbox/tegra186-hsp.h>
+
+/ {
+	compatible = "nvidia,tegra186";
+	interrupt-parent = <&gic>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	uarta: serial@03100000 {
+		compatible = "nvidia,tegra186-uart", "nvidia,tegra20-uart";
+		reg = <0x0 0x03100000 0x0 0x40>;
+		reg-shift = <2>;
+		interrupts = <GIC_SPI 112 IRQ_TYPE_LEVEL_HIGH>;
+		status = "disabled";
+	};
+
+	gic: interrupt-controller@03881000 {
+		compatible = "arm,gic-400";
+		#interrupt-cells = <3>;
+		interrupt-controller;
+		reg = <0x0 0x03881000 0x0 0x1000>,
+		      <0x0 0x03882000 0x0 0x2000>;
+		interrupts = <GIC_PPI 9
+			(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
+		interrupt-parent = <&gic>;
+	};
+
+	hsp_top0: hsp@03c00000 {
+		compatible = "nvidia,tegra186-hsp";
+		reg = <0x0 0x03c00000 0x0 0xa0000>;
+		interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
+		interrupt-names = "doorbell";
+		#mbox-cells = <1>;
+		status = "disabled";
+	};
+
+	sysram@30000000 {
+		compatible = "nvidia,tegra186-sysram", "mmio-ram";
+		reg = <0x0 0x30000000 0x0 0x50000>;
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges = <0 0x0 0x0 0x30000000 0x0 0x50000>;
+
+		cpu_bpmp_tx: bpmp_shmem@4e000 {
+			compatible = "nvidia,tegra186-bpmp-shmem";
+			reg = <0x0 0x4e000 0x0 0x1000>;
+		};
+
+		cpu_bpmp_rx: bpmp_shmem@4f000 {
+			compatible = "nvidia,tegra186-bpmp-shmem";
+			reg = <0x0 0x4f000 0x0 0x1000>;
+		};
+	};
+
+	bpmp {
+		compatible = "nvidia,tegra186-bpmp";
+		mboxes = <&hsp_top0 HSP_MBOX_ID(DB, HSP_DB_MASTER_BPMP)>;
+		shmem = <&cpu_bpmp_tx &cpu_bpmp_rx>;
+		#clock-cells = <1>;
+		#reset-cells = <1>;
+		status = "disabled";
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <GIC_PPI 13
+				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 14
+				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 11
+				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 10
+				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
+		interrupt-parent = <&gic>;
+	};
+};
-- 
2.9.0

* [PATCH V2 09/10] arm64: dts: tegra: Add NVIDIA Tegra186 P3310 main board support
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (7 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 08/10] arm64: dts: tegra: Add Tegra186 support Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  2016-07-05  9:04 ` [PATCH V2 10/10] arm64: dts: tegra: Add NVIDIA P2771 " Joseph Lo
  9 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Add support for the NVIDIA Tegra186 P3310 main board, a chip module with
DRAM, non-volatile storage, WiFi, Ethernet and PMIC chips on it. It has
to be attached to an I/O board to form a complete application
platform.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- update according to the binding fix in V2
---
 arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi | 34 ++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi

diff --git a/arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi b/arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi
new file mode 100644
index 000000000000..f5238866d321
--- /dev/null
+++ b/arch/arm64/boot/dts/nvidia/tegra186-p3310.dtsi
@@ -0,0 +1,34 @@
+#include "tegra186.dtsi"
+
+/ {
+	model = "NVIDIA Tegra186 P3310 main Board";
+	compatible = "nvidia,p3310", "nvidia,tegra186";
+
+	aliases {
+		serial0 = &uarta;
+	};
+
+	chosen {
+		bootargs = "earlycon console=ttyS0,115200n8";
+		stdout-path = "serial0:115200n8";
+	};
+
+	memory {
+		device_type = "memory";
+		reg = <0x0 0x80000000 0x2 0x00000000>;
+	};
+
+	serial@03100000 {
+		// HACK: before clk driver ready
+		clock-frequency = <408000000>;
+		status = "okay";
+	};
+
+	hsp@03c00000 {
+		status = "okay";
+	};
+
+	bpmp {
+		status = "okay";
+	};
+};
-- 
2.9.0

* [PATCH V2 10/10] arm64: dts: tegra: Add NVIDIA P2771 board support
  2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
                   ` (8 preceding siblings ...)
  2016-07-05  9:04 ` [PATCH V2 09/10] arm64: dts: tegra: Add NVIDIA Tegra186 P3310 main board support Joseph Lo
@ 2016-07-05  9:04 ` Joseph Lo
  9 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-05  9:04 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon, Joseph Lo

Add support for the NVIDIA Tegra186 P2771 board, a reference development
board that combines the P2597 I/O board with the P3310 chip module.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
Changes in V2:
- None
---
 arch/arm64/boot/dts/nvidia/Makefile                | 1 +
 arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts | 8 ++++++++
 2 files changed, 9 insertions(+)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts

diff --git a/arch/arm64/boot/dts/nvidia/Makefile b/arch/arm64/boot/dts/nvidia/Makefile
index 0f7cdf3e05c1..67234f3dc795 100644
--- a/arch/arm64/boot/dts/nvidia/Makefile
+++ b/arch/arm64/boot/dts/nvidia/Makefile
@@ -1,4 +1,5 @@
 dtb-$(CONFIG_ARCH_TEGRA_132_SOC) += tegra132-norrin.dtb
+dtb-$(CONFIG_ARCH_TEGRA_186_SOC) += tegra186-p2771-0000.dtb
 dtb-$(CONFIG_ARCH_TEGRA_210_SOC) += tegra210-p2371-0000.dtb
 dtb-$(CONFIG_ARCH_TEGRA_210_SOC) += tegra210-p2371-2180.dtb
 dtb-$(CONFIG_ARCH_TEGRA_210_SOC) += tegra210-p2571.dtb
diff --git a/arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts b/arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts
new file mode 100644
index 000000000000..66b936389fa7
--- /dev/null
+++ b/arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts
@@ -0,0 +1,8 @@
+/dts-v1/;
+
+#include "tegra186-p3310.dtsi"
+
+/ {
+	model = "NVIDIA Tegra186 P2771-0000 Board";
+	compatible = "nvidia,p2771-0000", "nvidia,tegra186";
+};
-- 
2.9.0

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-05  9:04 ` [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver Joseph Lo
@ 2016-07-06  7:05   ` Alexandre Courbot
  2016-07-06  9:06     ` Joseph Lo
  2016-07-07 21:10   ` Sivaram Nair
  1 sibling, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-06  7:05 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> The Tegra HSP mailbox driver implements the signaling doorbell-based
> interprocessor communication (IPC) for remote processors currently. The
> HSP HW modules support some different features for that, which are
> shared mailboxes, shared semaphores, arbitrated semaphores, and
> doorbells. And there are multiple HSP HW instances on the chip. So the
> driver is extendable to support more features for different IPC
> requirement.
>
> The driver of remote processor can use it as a mailbox client and deal
> with the IPC protocol to synchronize the data communications.
>
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - Update the driver to support the binding changes in V2
> - it's extendable to support multiple HSP sub-modules on the same HSP HW block
>   now.
> ---
>  drivers/mailbox/Kconfig     |   9 +
>  drivers/mailbox/Makefile    |   2 +
>  drivers/mailbox/tegra-hsp.c | 418 ++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 429 insertions(+)
>  create mode 100644 drivers/mailbox/tegra-hsp.c
>
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index 5305923752d2..fe584cb54720 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>           Test client to help with testing new Controller driver
>           implementations.
>
> +config TEGRA_HSP_MBOX
> +       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"

Space missing before the opening parenthesis (same in the patch title btw).

> +       depends on ARCH_TEGRA_186_SOC
> +       help
> +         The Tegra HSP driver is used for the interprocessor communication
> +         between different remote processors and host processors on Tegra186
> +         and later SoCs. Say Y here if you want to have this support.
> +         If unsure say N.

Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
should probably drop the last 2 sentences.

> +
>  config XGENE_SLIMPRO_MBOX
>         tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>         depends on ARCH_XGENE
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index 0be3e742bb7d..26d8f91c7fea 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>  obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>
>  obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
> +
> +obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
> new file mode 100644
> index 000000000000..93c3ef58f29f
> --- /dev/null
> +++ b/drivers/mailbox/tegra-hsp.c
> @@ -0,0 +1,418 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <dt-bindings/mailbox/tegra186-hsp.h>
> +
> +#define HSP_INT_DIMENSIONING   0x380
> +#define HSP_nSM_OFFSET         0
> +#define HSP_nSS_OFFSET         4
> +#define HSP_nAS_OFFSET         8
> +#define HSP_nDB_OFFSET         12
> +#define HSP_nSI_OFFSET         16

Would be nice to have comments to understand what SM, SS, AS, etc.
stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
but you need to look at the patch description to understand that). A
top-of-file comment explaining the necessary concepts to read this code
would do the trick.
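
As a rough illustration of the kind of annotation meant here, the
dimensioning fields could be documented and decoded as in the sketch
below. The SM/SS/AS/DB expansions come from the patch description; the
expansion of SI as "shared interrupts", the struct hsp_dimensions type
and the hsp_decode_dimensioning() helper are assumptions made for this
example, as is the idea that each count is a 4-bit field at the quoted
offset.

```c
#include <stdint.h>

/* Field offsets and mask as defined in the quoted driver. */
#define HSP_nSM_OFFSET	0	/* number of shared mailboxes */
#define HSP_nSS_OFFSET	4	/* number of shared semaphores */
#define HSP_nAS_OFFSET	8	/* number of arbitrated semaphores */
#define HSP_nDB_OFFSET	12	/* number of doorbells */
#define HSP_nSI_OFFSET	16	/* assumed: number of shared interrupts */
#define HSP_nINT_MASK	0xf

/* Hypothetical container for the decoded counts. */
struct hsp_dimensions {
	unsigned int nr_sm;
	unsigned int nr_ss;
	unsigned int nr_as;
	unsigned int nr_db;
	unsigned int nr_si;
};

/* Split a raw HSP_INT_DIMENSIONING value into its 4-bit counts. */
struct hsp_dimensions hsp_decode_dimensioning(uint32_t val)
{
	struct hsp_dimensions d = {
		.nr_sm = (val >> HSP_nSM_OFFSET) & HSP_nINT_MASK,
		.nr_ss = (val >> HSP_nSS_OFFSET) & HSP_nINT_MASK,
		.nr_as = (val >> HSP_nAS_OFFSET) & HSP_nINT_MASK,
		.nr_db = (val >> HSP_nDB_OFFSET) & HSP_nINT_MASK,
		.nr_si = (val >> HSP_nSI_OFFSET) & HSP_nINT_MASK,
	};

	return d;
}
```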

> +#define HSP_nINT_MASK          0xf
> +
> +#define HSP_DB_REG_TRIGGER     0x0
> +#define HSP_DB_REG_ENABLE      0x4
> +#define HSP_DB_REG_RAW         0x8
> +#define HSP_DB_REG_PENDING     0xc
> +
> +#define HSP_DB_CCPLEX          1
> +#define HSP_DB_BPMP            3

Maybe turn this into enum and use that type for
tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
related to these values?
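
A minimal sketch of how that could look, keeping the values from the
quoted defines. The enum name tegra_hsp_db_id and the
hsp_db_id_valid() helper are made up for illustration only.

```c
/* Doorbell indices from the quoted defines, gathered into one type. */
enum tegra_hsp_db_id {
	HSP_DB_CCPLEX = 1,
	HSP_DB_BPMP = 3,
	/* Array bound; the driver also stores it for "no doorbell". */
	MAX_NUM_HSP_DB = 7,
};

struct tegra_hsp_db_chan {
	int master_id;
	enum tegra_hsp_db_id db_id;
};

/* Hypothetical helper: non-zero if id can index db_base[]. */
int hsp_db_id_valid(enum tegra_hsp_db_id id)
{
	return (unsigned int)id < MAX_NUM_HSP_DB;
}
```

With the bound living next to the IDs, a check like this could guard the
db_base[db_chan->db_id] accesses for masters that have no doorbell.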

> +
> +#define MAX_NUM_HSP_CHAN 32
> +#define MAX_NUM_HSP_DB 7
> +
> +#define hsp_db_offset(i, d) \
> +       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) + \
> +       (i) * 0x100)
> +
> +struct tegra_hsp_db_chan {
> +       int master_id;
> +       int db_id;
> +};
> +
> +struct tegra_hsp_mbox_chan {
> +       int type;
> +       union {
> +               struct tegra_hsp_db_chan db_chan;
> +       };
> +};
> +
> +struct tegra_hsp_mbox {
> +       struct mbox_controller *mbox;
> +       void __iomem *base;
> +       void __iomem *db_base[MAX_NUM_HSP_DB];
> +       int db_irq;
> +       int nr_sm;
> +       int nr_as;
> +       int nr_ss;
> +       int nr_db;
> +       int nr_si;
> +       spinlock_t lock;
> +};
> +
> +static inline u32 hsp_readl(void __iomem *base, int reg)
> +{
> +       return readl(base + reg);
> +}
> +
> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
> +{
> +       writel(val, base + reg);
> +       readl(base + reg);
> +}
> +
> +static int hsp_db_can_ring(void __iomem *db_base)
> +{
> +       u32 reg;
> +
> +       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
> +
> +       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
> +}
> +
> +static irqreturn_t hsp_db_irq(int irq, void *p)
> +{
> +       struct tegra_hsp_mbox *hsp_mbox = p;
> +       ulong val;
> +       int master_id;
> +
> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
> +                              HSP_DB_REG_PENDING);
> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING, val);
> +
> +       spin_lock(&hsp_mbox->lock);
> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
> +               struct mbox_chan *chan;
> +               struct tegra_hsp_mbox_chan *mchan;
> +               int i;
> +
> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {

I wonder if this could not be optimized. You are doing a double loop
on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
like the same master_id cannot be used twice (considering that the
inner loop only processes the first match), couldn't you just select
the free channel in of_hsp_mbox_xlate() by doing
&mbox->chans[master_id] (and returning an error if it is already
used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
instead of having the inner loop below? That would remove the need for
the second loop.

If having two channels use the same master_id is a valid scenario,
then all matches on master_id should probably be processed, not just
the first one.
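
To make the first option concrete, here is a user-space sketch of the
direct-indexing idea. It is not kernel code: struct mock_chan stands in
for struct mbox_chan, rx_count stands in for the call to
mbox_chan_received_data(), and mock_xlate() / mock_dispatch() are
hypothetical names. It assumes each master_id maps to at most one
channel.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM_HSP_CHAN 32

/* Stand-in for struct mbox_chan; only what the sketch needs. */
struct mock_chan {
	void *con_priv;		/* non-NULL once the channel is claimed */
	unsigned int rx_count;	/* stands in for mbox_chan_received_data() */
};

static struct mock_chan chans[MAX_NUM_HSP_CHAN];

/* xlate step: claim the slot named by master_id; fail if invalid or taken. */
struct mock_chan *mock_xlate(unsigned int master_id, void *priv)
{
	if (master_id >= MAX_NUM_HSP_CHAN || chans[master_id].con_priv)
		return NULL;

	chans[master_id].con_priv = priv;
	return &chans[master_id];
}

/* IRQ step: a single pass over the pending bits, no inner search loop. */
void mock_dispatch(uint32_t pending)
{
	unsigned int master_id;

	for (master_id = 0; master_id < MAX_NUM_HSP_CHAN; master_id++) {
		if (!(pending & (UINT32_C(1) << master_id)))
			continue;
		/* The channel for this master, if any, is at a fixed index. */
		if (chans[master_id].con_priv)
			chans[master_id].rx_count++;
	}
}
```

In the driver, the failure path of the xlate step would return an
ERR_PTR() value rather than NULL, and the bit walk would remain the
existing for_each_set_bit() loop.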

> +                       chan = &hsp_mbox->mbox->chans[i];
> +
> +                       if (!chan->con_priv)
> +                               continue;
> +
> +                       mchan = chan->con_priv;
> +                       if (mchan->type == HSP_MBOX_TYPE_DB &&
> +                           mchan->db_chan.master_id == master_id)
> +                               break;
> +                       chan = NULL;
> +               }
> +
> +               if (chan)
> +                       mbox_chan_received_data(chan, NULL);
> +       }
> +       spin_unlock(&hsp_mbox->lock);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
> +{
> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +
> +       hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
> +
> +       return 0;
> +}
> +
> +static int hsp_db_startup(struct mbox_chan *chan)
> +{
> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +       u32 val;
> +       unsigned long flag;
> +
> +       if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
> +               dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
> +                       db_chan->master_id);
> +               return -EINVAL;
> +       }
> +
> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> +       val |= BIT(db_chan->master_id);
> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> +
> +       if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
> +               return -ENODEV;
> +
> +       return 0;
> +}
> +
> +static void hsp_db_shutdown(struct mbox_chan *chan)
> +{
> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +       u32 val;
> +       unsigned long flag;
> +
> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> +       val &= ~BIT(db_chan->master_id);
> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> +}
> +
> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
> +{
> +       return true;
> +}
> +
> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
> +                            struct mbox_chan *mchan, int master_id)
> +{
> +       struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
> +       int ret;
> +
> +       if (!hsp_mbox->db_irq) {
> +               int i;
> +
> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");

Getting the IRQ sounds more like a job for probe() - I don't see the
benefit of lazy-doing it?

> +               ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
> +                                      hsp_db_irq, IRQF_NO_SUSPEND,
> +                                      dev_name(&pdev->dev), hsp_mbox);
> +               if (ret)
> +                       return ret;
> +
> +               for (i = 0; i < MAX_NUM_HSP_DB; i++)
> +                       hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);

Same here, cannot this be moved into probe()?
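
For reference, the offsets only depend on the counts read from the
dimensioning register, so they can be computed once at probe time. The
sketch below restates the quoted hsp_db_offset() macro as a plain
function; hsp_db_byte_offset() is a hypothetical name, and the counts
used in any example are not taken from real Tegra186 hardware.

```c
#include <stdint.h>

/*
 * Byte offset of doorbell i from the HSP base, following the arithmetic
 * of the quoted hsp_db_offset() macro: the doorbell block starts
 * (1 + nr_sm / 2 + nr_ss + nr_as) 64 KiB pages into the HSP region, and
 * each doorbell occupies 0x100 bytes.
 */
uint32_t hsp_db_byte_offset(unsigned int nr_sm, unsigned int nr_ss,
			    unsigned int nr_as, unsigned int i)
{
	return ((1u + (nr_sm >> 1) + nr_ss + nr_as) << 16) + i * 0x100u;
}
```

For example, with 8 shared mailboxes, 4 shared semaphores and no
arbitrated semaphores, doorbell 3 would sit at offset 0x90300.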

> +       }
> +
> +       hsp_mbox_chan = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox_chan),
> +                                    GFP_KERNEL);
> +       if (!hsp_mbox_chan)
> +               return -ENOMEM;
> +
> +       hsp_mbox_chan->type = HSP_MBOX_TYPE_DB;
> +       hsp_mbox_chan->db_chan.master_id = master_id;
> +       switch (master_id) {
> +       case HSP_DB_MASTER_BPMP:
> +               hsp_mbox_chan->db_chan.db_id = HSP_DB_BPMP;
> +               break;
> +       default:
> +               hsp_mbox_chan->db_chan.db_id = MAX_NUM_HSP_DB;
> +               break;
> +       }
> +
> +       mchan->con_priv = hsp_mbox_chan;
> +
> +       return 0;
> +}
> +
> +static int hsp_send_data(struct mbox_chan *chan, void *data)
> +{
> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +       int ret = 0;
> +
> +       switch (hsp_mbox_chan->type) {
> +       case HSP_MBOX_TYPE_DB:
> +               ret = hsp_db_send_data(chan, data);
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       return ret;
> +}
> +
> +static int hsp_startup(struct mbox_chan *chan)
> +{
> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +       int ret = 0;
> +
> +       switch (hsp_mbox_chan->type) {
> +       case HSP_MBOX_TYPE_DB:
> +               ret = hsp_db_startup(chan);
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       return ret;
> +}
> +
> +static void hsp_shutdown(struct mbox_chan *chan)
> +{
> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +
> +       switch (hsp_mbox_chan->type) {
> +       case HSP_MBOX_TYPE_DB:
> +               hsp_db_shutdown(chan);
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       chan->con_priv = NULL;
> +}
> +
> +static bool hsp_last_tx_done(struct mbox_chan *chan)
> +{
> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +       bool ret = true;
> +
> +       switch (hsp_mbox_chan->type) {
> +       case HSP_MBOX_TYPE_DB:
> +               ret = hsp_db_last_tx_done(chan);
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       return ret;
> +}
> +
> +static const struct mbox_chan_ops tegra_hsp_ops = {
> +       .send_data = hsp_send_data,
> +       .startup = hsp_startup,
> +       .shutdown = hsp_shutdown,
> +       .last_tx_done = hsp_last_tx_done,
> +};
> +
> +static const struct of_device_id tegra_hsp_match[] = {
> +       { .compatible = "nvidia,tegra186-hsp" },
> +       { }
> +};
> +
> +static struct mbox_chan *
> +of_hsp_mbox_xlate(struct mbox_controller *mbox,
> +                 const struct of_phandle_args *sp)
> +{
> +       int mbox_id = sp->args[0];
> +       int hsp_type = (mbox_id >> 16) & 0xf;
> +       int master_id = mbox_id & 0xff;
> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(mbox->dev);
> +       struct mbox_chan *free_chan;
> +       int i, ret = 0;
> +
> +       spin_lock(&hsp_mbox->lock);
> +
> +       for (i = 0; i < mbox->num_chans; i++) {
> +               free_chan = &mbox->chans[i];
> +               if (!free_chan->con_priv)
> +                       break;
> +               free_chan = NULL;
> +       }
> +
> +       if (!free_chan) {
> +               spin_unlock(&hsp_mbox->lock);
> +               return ERR_PTR(-EFAULT);
> +       }
> +
> +       switch (hsp_type) {
> +       case HSP_MBOX_TYPE_DB:
> +               ret = tegra_hsp_db_init(hsp_mbox, free_chan, master_id);

Maybe move tegra_hsp_db_init's definition closer to this function,
since it is only used here? Having related functions close to one
another makes it easier to understand the code.

> +               break;
> +       default:

This looks like an error condition - it should probably be reported,
and maybe even an error returned. If you do it here you probably don't
need to do the same check in hsp_send_data() and following functions.

> +               break;
> +       }
> +
> +       spin_unlock(&hsp_mbox->lock);
> +
> +       if (ret)
> +               free_chan = ERR_PTR(-EFAULT);
> +
> +       return free_chan;
> +}
> +
> +static int tegra_hsp_probe(struct platform_device *pdev)
> +{
> +       struct tegra_hsp_mbox *hsp_mbox;
> +       struct resource *res;
> +       int ret = 0;
> +       u32 reg;
> +
> +       hsp_mbox = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox), GFP_KERNEL);
> +       if (!hsp_mbox)
> +               return -ENOMEM;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       hsp_mbox->base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(hsp_mbox->base))
> +               return PTR_ERR(hsp_mbox->base);
> +
> +       reg = hsp_readl(hsp_mbox->base, HSP_INT_DIMENSIONING);
> +       hsp_mbox->nr_sm = (reg >> HSP_nSM_OFFSET) & HSP_nINT_MASK;
> +       hsp_mbox->nr_ss = (reg >> HSP_nSS_OFFSET) & HSP_nINT_MASK;
> +       hsp_mbox->nr_as = (reg >> HSP_nAS_OFFSET) & HSP_nINT_MASK;
> +       hsp_mbox->nr_db = (reg >> HSP_nDB_OFFSET) & HSP_nINT_MASK;
> +       hsp_mbox->nr_si = (reg >> HSP_nSI_OFFSET) & HSP_nINT_MASK;

Maybe have a HSP_NR(type, reg) macro that expands to
(((reg) >> HSP_n ## type ## _OFFSET) & HSP_nINT_MASK) to simplify this?

> +
> +       hsp_mbox->mbox = devm_kzalloc(&pdev->dev,
> +                                     sizeof(*hsp_mbox->mbox), GFP_KERNEL);
> +       if (!hsp_mbox->mbox)
> +               return -ENOMEM;
> +
> +       hsp_mbox->mbox->chans =
> +               devm_kcalloc(&pdev->dev, MAX_NUM_HSP_CHAN,
> +                            sizeof(*hsp_mbox->mbox->chans), GFP_KERNEL);
> +       if (!hsp_mbox->mbox->chans)
> +               return -ENOMEM;
> +
> +       hsp_mbox->mbox->of_xlate = of_hsp_mbox_xlate;
> +       hsp_mbox->mbox->num_chans = MAX_NUM_HSP_CHAN;
> +       hsp_mbox->mbox->dev = &pdev->dev;
> +       hsp_mbox->mbox->txdone_irq = false;
> +       hsp_mbox->mbox->txdone_poll = false;
> +       hsp_mbox->mbox->ops = &tegra_hsp_ops;
> +       platform_set_drvdata(pdev, hsp_mbox);
> +
> +       ret = mbox_controller_register(hsp_mbox->mbox);
> +       if (ret) {
> +               pr_err("tegra-hsp mbox: fail to register mailbox %d.\n", ret);
> +               return ret;
> +       }
> +
> +       spin_lock_init(&hsp_mbox->lock);
> +
> +       return 0;
> +}
> +
> +static int tegra_hsp_remove(struct platform_device *pdev)
> +{
> +       struct tegra_hsp_mbox *hsp_mbox = platform_get_drvdata(pdev);
> +
> +       if (hsp_mbox->mbox)
> +               mbox_controller_unregister(hsp_mbox->mbox);
> +
> +       return 0;
> +}
> +
> +static struct platform_driver tegra_hsp_driver = {
> +       .driver = {
> +               .name = "tegra-hsp",
> +               .of_match_table = tegra_hsp_match,
> +       },
> +       .probe = tegra_hsp_probe,
> +       .remove = tegra_hsp_remove,
> +};
> +
> +static int __init tegra_hsp_init(void)
> +{
> +       return platform_driver_register(&tegra_hsp_driver);
> +}
> +core_initcall(tegra_hsp_init);
> --
> 2.9.0
>

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-06  7:05   ` Alexandre Courbot
@ 2016-07-06  9:06     ` Joseph Lo
  2016-07-06 12:23       ` Alexandre Courbot
  2016-07-06 16:50       ` Stephen Warren
  0 siblings, 2 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-06  9:06 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>> interprocessor communication (IPC) for remote processors currently. The
>> HSP HW modules support some different features for that, which are
>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>> doorbells. And there are multiple HSP HW instances on the chip. So the
>> driver is extendable to support more features for different IPC
>> requirement.
>>
>> The driver of remote processor can use it as a mailbox client and deal
>> with the IPC protocol to synchronize the data communications.
>>
>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>> ---
>> Changes in V2:
>> - Update the driver to support the binding changes in V2
>> - it's extendable to support multiple HSP sub-modules on the same HSP HW block
>>    now.
>> ---
>>   drivers/mailbox/Kconfig     |   9 +
>>   drivers/mailbox/Makefile    |   2 +
>>   drivers/mailbox/tegra-hsp.c | 418 ++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 429 insertions(+)
>>   create mode 100644 drivers/mailbox/tegra-hsp.c
>>
>> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
>> index 5305923752d2..fe584cb54720 100644
>> --- a/drivers/mailbox/Kconfig
>> +++ b/drivers/mailbox/Kconfig
>> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>>            Test client to help with testing new Controller driver
>>            implementations.
>>
>> +config TEGRA_HSP_MBOX
>> +       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
>
> Space missing before the opening parenthesis (same in the patch title btw).
Okay.
>
>> +       depends on ARCH_TEGRA_186_SOC
>> +       help
>> +         The Tegra HSP driver is used for the interprocessor communication
>> +         between different remote processors and host processors on Tegra186
>> +         and later SoCs. Say Y here if you want to have this support.
>> +         If unsure say N.
>
> Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
> should probably drop the last 2 sentences.
Okay.
>
>> +
>>   config XGENE_SLIMPRO_MBOX
>>          tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>>          depends on ARCH_XGENE
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index 0be3e742bb7d..26d8f91c7fea 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>>   obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>>
>>   obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
>> +
>> +obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>> new file mode 100644
>> index 000000000000..93c3ef58f29f
>> --- /dev/null
>> +++ b/drivers/mailbox/tegra-hsp.c
>> @@ -0,0 +1,418 @@
>> +/*
>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <dt-bindings/mailbox/tegra186-hsp.h>
>> +
>> +#define HSP_INT_DIMENSIONING   0x380
>> +#define HSP_nSM_OFFSET         0
>> +#define HSP_nSS_OFFSET         4
>> +#define HSP_nAS_OFFSET         8
>> +#define HSP_nDB_OFFSET         12
>> +#define HSP_nSI_OFFSET         16
>
> Would be nice to have comments to understand what SM, SS, AS, etc.
> stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
> but you need to look at the patch description to understand that). A
> top-of-file comment explaining the necessary concepts to read this code
> would do the trick.
Yes, will fix that.
>
>> +#define HSP_nINT_MASK          0xf
>> +
>> +#define HSP_DB_REG_TRIGGER     0x0
>> +#define HSP_DB_REG_ENABLE      0x4
>> +#define HSP_DB_REG_RAW         0x8
>> +#define HSP_DB_REG_PENDING     0xc
>> +
>> +#define HSP_DB_CCPLEX          1
>> +#define HSP_DB_BPMP            3
>
> Maybe turn this into enum and use that type for
> tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
> related to these values?
Okay.
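
For illustration only, the enum could look roughly like the sketch below. The values are the ones from the patch; the type name tegra_hsp_db_id and the helper hsp_db_id_valid() are made up here and not part of the posted driver.

```c
#include <stdbool.h>

/* Doorbell IDs; values copied from the HSP_DB_* defines in the patch. */
enum tegra_hsp_db_id {
	HSP_DB_CCPLEX = 1,
	HSP_DB_BPMP = 3,
	MAX_NUM_HSP_DB = 7,	/* array bound, not a usable doorbell */
};

struct tegra_hsp_db_chan {
	int master_id;
	enum tegra_hsp_db_id db_id;
};

/* Hypothetical helper: true if id can index db_base[MAX_NUM_HSP_DB]. */
static bool hsp_db_id_valid(enum tegra_hsp_db_id id)
{
	return (int)id >= 0 && (int)id < MAX_NUM_HSP_DB;
}
```

Keeping MAX_NUM_HSP_DB next to the IDs makes it obvious that it is only a bound, so a check like the one above can reject it before it is used as an index.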
>
>> +
>> +#define MAX_NUM_HSP_CHAN 32
>> +#define MAX_NUM_HSP_DB 7
>> +
>> +#define hsp_db_offset(i, d) \
>> +       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) + \
>> +       (i) * 0x100)
>> +
>> +struct tegra_hsp_db_chan {
>> +       int master_id;
>> +       int db_id;
>> +};
>> +
>> +struct tegra_hsp_mbox_chan {
>> +       int type;
>> +       union {
>> +               struct tegra_hsp_db_chan db_chan;
>> +       };
>> +};
>> +
>> +struct tegra_hsp_mbox {
>> +       struct mbox_controller *mbox;
>> +       void __iomem *base;
>> +       void __iomem *db_base[MAX_NUM_HSP_DB];
>> +       int db_irq;
>> +       int nr_sm;
>> +       int nr_as;
>> +       int nr_ss;
>> +       int nr_db;
>> +       int nr_si;
>> +       spinlock_t lock;
>> +};
>> +
>> +static inline u32 hsp_readl(void __iomem *base, int reg)
>> +{
>> +       return readl(base + reg);
>> +}
>> +
>> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
>> +{
>> +       writel(val, base + reg);
>> +       readl(base + reg);
>> +}
>> +
>> +static int hsp_db_can_ring(void __iomem *db_base)
>> +{
>> +       u32 reg;
>> +
>> +       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
>> +
>> +       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
>> +}
>> +
>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>> +{
>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>> +       ulong val;
>> +       int master_id;
>> +
>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>> +                              HSP_DB_REG_PENDING);
>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING, val);
>> +
>> +       spin_lock(&hsp_mbox->lock);
>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>> +               struct mbox_chan *chan;
>> +               struct tegra_hsp_mbox_chan *mchan;
>> +               int i;
>> +
>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>
> I wonder if this could not be optimized. You are doing a double loop
> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
> like the same master_id cannot be used twice (considering that the
> inner loop only processes the first match), couldn't you just select
> the free channel in of_hsp_mbox_xlate() by doing
> &mbox->chans[master_id] (and returning an error if it is already
> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
> instead of having the inner loop below? That would remove the need for
> the second loop.

That was exactly what I did in V1, which only supported one HSP
sub-module per HSP HW block, so the master_id could be used directly as
the mbox channel ID.

V2 is meant to support multiple HSP sub-modules running on the same HSP
HW block, and the IDs of different sub-modules could conflict. So I
dropped the mechanism that used the master_id as the mbox channel ID.

Instead, the channel is allocated when the client is bound to one of
the HSP sub-modules, and we store the ID information in the private
mbox channel data, which lets us figure out which mbox channel should
respond to the interrupt.

In the doorbell case, all the DB clients share the same DB IRQ on the
CPU side. So in the ISR we need to figure out the IRQ source, which is
the master_id that the IRQ came from. This is the outer loop. In the
inner loop, we figure out which channel should respond by checking the
type and ID.

I think it should be pretty quick, because we only check the bits set
in the pending register and then look for the matching channel.
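
To make the two loops concrete, here is a small userspace model of the lookup. It is a sketch, not the driver code: struct model_chan, MODEL_TYPE_DB, find_db_chan() and dispatch_pending() are names invented for this example, and the in_use flag stands in for a non-NULL chan->con_priv.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM_HSP_CHAN	32
#define MODEL_TYPE_DB		0	/* assumed stand-in for HSP_MBOX_TYPE_DB */

/* Simplified stand-in for the per-channel private data. */
struct model_chan {
	int in_use;	/* models chan->con_priv != NULL */
	int type;
	int master_id;
};

/* Inner loop: find the bound doorbell channel for one master ID. */
static struct model_chan *find_db_chan(struct model_chan *chans, int master_id)
{
	for (size_t i = 0; i < MAX_NUM_HSP_CHAN; i++) {
		struct model_chan *c = &chans[i];

		if (c->in_use && c->type == MODEL_TYPE_DB &&
		    c->master_id == master_id)
			return c;
	}

	return NULL;
}

/*
 * Outer loop: walk the bits set in the pending value and count how
 * many of them map to a bound channel.
 */
static int dispatch_pending(struct model_chan *chans, uint32_t pending)
{
	int delivered = 0;

	for (int master_id = 0; master_id < MAX_NUM_HSP_CHAN; master_id++) {
		if (!(pending & (UINT32_C(1) << master_id)))
			continue;
		/* The driver would call mbox_chan_received_data() here. */
		if (find_db_chan(chans, master_id))
			delivered++;
	}

	return delivered;
}
```

Pending bits with no bound channel are simply skipped, which matches what the ISR does when chan ends up NULL.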

>
> If having two channels use the same master_id is a valid scenario,
> then all matches on master_id should probably be processed, not just
> the first one.
Each DB channel should have a different master_id.

>
>> +                       chan = &hsp_mbox->mbox->chans[i];
>> +
>> +                       if (!chan->con_priv)
>> +                               continue;
>> +
>> +                       mchan = chan->con_priv;
>> +                       if (mchan->type == HSP_MBOX_TYPE_DB &&
>> +                           mchan->db_chan.master_id == master_id)
>> +                               break;
>> +                       chan = NULL;
>> +               }
>> +
>> +               if (chan)
>> +                       mbox_chan_received_data(chan, NULL);
>> +       }
>> +       spin_unlock(&hsp_mbox->lock);
>> +
>> +       return IRQ_HANDLED;
>> +}
>> +
>> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +
>> +       hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
>> +
>> +       return 0;
>> +}
>> +
>> +static int hsp_db_startup(struct mbox_chan *chan)
>> +{
>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +       u32 val;
>> +       unsigned long flag;
>> +
>> +       if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
>> +               dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
>> +                       db_chan->master_id);
>> +               return -EINVAL;
>> +       }
>> +
>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
>> +       val |= BIT(db_chan->master_id);
>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>> +
>> +       if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
>> +               return -ENODEV;
>> +
>> +       return 0;
>> +}
>> +
>> +static void hsp_db_shutdown(struct mbox_chan *chan)
>> +{
>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +       u32 val;
>> +       unsigned long flag;
>> +
>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
>> +       val &= ~BIT(db_chan->master_id);
>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>> +}
>> +
>> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
>> +{
>> +       return true;
>> +}
>> +
>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>> +                            struct mbox_chan *mchan, int master_id)
>> +{
>> +       struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>> +       int ret;
>> +
>> +       if (!hsp_mbox->db_irq) {
>> +               int i;
>> +
>> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
>
> Getting the IRQ sounds more like a job for probe() - I don't see the
> benefit of lazy-doing it?

We only need the IRQ when a client requests the DB service. The other
HSP sub-modules use different IRQs, so I didn't do that at probe time.

>
>> +               ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
>> +                                      hsp_db_irq, IRQF_NO_SUSPEND,
>> +                                      dev_name(&pdev->dev), hsp_mbox);
>> +               if (ret)
>> +                       return ret;
>> +
>> +               for (i = 0; i < MAX_NUM_HSP_DB; i++)
>> +                       hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);
>
> Same here, cannot this be moved into probe()?
Same as above, only needed when the client requests it.

>
>> +       }
>> +
>> +       hsp_mbox_chan = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox_chan),
>> +                                    GFP_KERNEL);
>> +       if (!hsp_mbox_chan)
>> +               return -ENOMEM;
>> +
>> +       hsp_mbox_chan->type = HSP_MBOX_TYPE_DB;
>> +       hsp_mbox_chan->db_chan.master_id = master_id;
>> +       switch (master_id) {
>> +       case HSP_DB_MASTER_BPMP:
>> +               hsp_mbox_chan->db_chan.db_id = HSP_DB_BPMP;
>> +               break;
>> +       default:
>> +               hsp_mbox_chan->db_chan.db_id = MAX_NUM_HSP_DB;
>> +               break;
>> +       }
>> +
>> +       mchan->con_priv = hsp_mbox_chan;
>> +
>> +       return 0;
>> +}
>> +
>> +static int hsp_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +       int ret = 0;
>> +
>> +       switch (hsp_mbox_chan->type) {
>> +       case HSP_MBOX_TYPE_DB:
>> +               ret = hsp_db_send_data(chan, data);
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +static int hsp_startup(struct mbox_chan *chan)
>> +{
>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +       int ret = 0;
>> +
>> +       switch (hsp_mbox_chan->type) {
>> +       case HSP_MBOX_TYPE_DB:
>> +               ret = hsp_db_startup(chan);
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +static void hsp_shutdown(struct mbox_chan *chan)
>> +{
>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +
>> +       switch (hsp_mbox_chan->type) {
>> +       case HSP_MBOX_TYPE_DB:
>> +               hsp_db_shutdown(chan);
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +
>> +       chan->con_priv = NULL;
>> +}
>> +
>> +static bool hsp_last_tx_done(struct mbox_chan *chan)
>> +{
>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +       bool ret = true;
>> +
>> +       switch (hsp_mbox_chan->type) {
>> +       case HSP_MBOX_TYPE_DB:
>> +               ret = hsp_db_last_tx_done(chan);
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +static const struct mbox_chan_ops tegra_hsp_ops = {
>> +       .send_data = hsp_send_data,
>> +       .startup = hsp_startup,
>> +       .shutdown = hsp_shutdown,
>> +       .last_tx_done = hsp_last_tx_done,
>> +};
>> +
>> +static const struct of_device_id tegra_hsp_match[] = {
>> +       { .compatible = "nvidia,tegra186-hsp" },
>> +       { }
>> +};
>> +
>> +static struct mbox_chan *
>> +of_hsp_mbox_xlate(struct mbox_controller *mbox,
>> +                 const struct of_phandle_args *sp)
>> +{
>> +       int mbox_id = sp->args[0];
>> +       int hsp_type = (mbox_id >> 16) & 0xf;
>> +       int master_id = mbox_id & 0xff;
>> +       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(mbox->dev);
>> +       struct mbox_chan *free_chan;
>> +       int i, ret = 0;
>> +
>> +       spin_lock(&hsp_mbox->lock);
>> +
>> +       for (i = 0; i < mbox->num_chans; i++) {
>> +               free_chan = &mbox->chans[i];
>> +               if (!free_chan->con_priv)
>> +                       break;
>> +               free_chan = NULL;
>> +       }
>> +
>> +       if (!free_chan) {
>> +               spin_unlock(&hsp_mbox->lock);
>> +               return ERR_PTR(-EFAULT);
>> +       }
>> +
>> +       switch (hsp_type) {
>> +       case HSP_MBOX_TYPE_DB:
>> +               ret = tegra_hsp_db_init(hsp_mbox, free_chan, master_id);
>
> Maybe move tegra_hsp_db_init's definition closer to this function,
> since it is only used here? Having related functions close to one
> another makes it easier to understand the code.
Okay.

>
>> +               break;
>> +       default:
>
> This looks like an error condition - it should probably be reported,
> and maybe even an error returned. If you do it here you probably don't
> need to do the same check in hsp_send_data() and following functions.
Yes, will fix.
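
One possible shape for that fix, sketched as a standalone function: decode the specifier cell once and reject unknown types up front. The field layout (type in bits 19:16, master ID in bits 7:0) is taken from of_hsp_mbox_xlate() in the patch; struct hsp_spec, hsp_decode_spec() and MODEL_TYPE_DB are hypothetical names used only for this example.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_TYPE_DB	0	/* assumed stand-in for HSP_MBOX_TYPE_DB */

/* Decoded form of the single mbox specifier cell. */
struct hsp_spec {
	int type;
	int master_id;
};

/* Returns 0 on success, -EINVAL for an unsupported HSP type. */
static int hsp_decode_spec(uint32_t mbox_id, struct hsp_spec *out)
{
	int type = (mbox_id >> 16) & 0xf;

	switch (type) {
	case MODEL_TYPE_DB:
		out->type = type;
		out->master_id = mbox_id & 0xff;
		return 0;
	default:
		/* Unknown sub-module type: fail instead of silently continuing. */
		return -EINVAL;
	}
}
```

With the type validated at xlate time, the default branches in hsp_send_data() and friends would no longer be reachable for a bound channel.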

>
>> +               break;
>> +       }
>> +
>> +       spin_unlock(&hsp_mbox->lock);
>> +
>> +       if (ret)
>> +               free_chan = ERR_PTR(-EFAULT);
>> +
>> +       return free_chan;
>> +}
>> +
>> +static int tegra_hsp_probe(struct platform_device *pdev)
>> +{
>> +       struct tegra_hsp_mbox *hsp_mbox;
>> +       struct resource *res;
>> +       int ret = 0;
>> +       u32 reg;
>> +
>> +       hsp_mbox = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox), GFP_KERNEL);
>> +       if (!hsp_mbox)
>> +               return -ENOMEM;
>> +
>> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +       hsp_mbox->base = devm_ioremap_resource(&pdev->dev, res);
>> +       if (IS_ERR(hsp_mbox->base))
>> +               return PTR_ERR(hsp_mbox->base);
>> +
>> +       reg = hsp_readl(hsp_mbox->base, HSP_INT_DIMENSIONING);
>> +       hsp_mbox->nr_sm = (reg >> HSP_nSM_OFFSET) & HSP_nINT_MASK;
>> +       hsp_mbox->nr_ss = (reg >> HSP_nSS_OFFSET) & HSP_nINT_MASK;
>> +       hsp_mbox->nr_as = (reg >> HSP_nAS_OFFSET) & HSP_nINT_MASK;
>> +       hsp_mbox->nr_db = (reg >> HSP_nDB_OFFSET) & HSP_nINT_MASK;
>> +       hsp_mbox->nr_si = (reg >> HSP_nSI_OFFSET) & HSP_nINT_MASK;
>
> Maybe have a HSP_NR(type, reg) macro that expands to
> (((reg) >> HSP_n ## type ## _OFFSET) & HSP_nINT_MASK) to simplify this?
Good suggestion, will fix.
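
As a rough sketch, assuming the offsets and mask stay as defined in the patch, the macro could be written like this (HSP_NR is the name proposed in the review, not something that exists in the posted code):

```c
#include <stdint.h>

#define HSP_nSM_OFFSET	0
#define HSP_nSS_OFFSET	4
#define HSP_nAS_OFFSET	8
#define HSP_nDB_OFFSET	12
#define HSP_nSI_OFFSET	16
#define HSP_nINT_MASK	0xf

/*
 * Extract the 4-bit count for one primitive type (SM, SS, AS, DB, SI)
 * from the HSP_INT_DIMENSIONING register value. The type token is
 * pasted into the matching HSP_n<type>_OFFSET name.
 */
#define HSP_NR(type, reg) \
	(((reg) >> HSP_n ## type ## _OFFSET) & HSP_nINT_MASK)
```

The five assignments in probe() would then read hsp_mbox->nr_sm = HSP_NR(SM, reg); and so on.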

Thanks,
-Joseph

>
>> +
>> +       hsp_mbox->mbox = devm_kzalloc(&pdev->dev,
>> +                                     sizeof(*hsp_mbox->mbox), GFP_KERNEL);
>> +       if (!hsp_mbox->mbox)
>> +               return -ENOMEM;
>> +
>> +       hsp_mbox->mbox->chans =
>> +               devm_kcalloc(&pdev->dev, MAX_NUM_HSP_CHAN,
>> +                            sizeof(*hsp_mbox->mbox->chans), GFP_KERNEL);
>> +       if (!hsp_mbox->mbox->chans)
>> +               return -ENOMEM;
>> +
>> +       hsp_mbox->mbox->of_xlate = of_hsp_mbox_xlate;
>> +       hsp_mbox->mbox->num_chans = MAX_NUM_HSP_CHAN;
>> +       hsp_mbox->mbox->dev = &pdev->dev;
>> +       hsp_mbox->mbox->txdone_irq = false;
>> +       hsp_mbox->mbox->txdone_poll = false;
>> +       hsp_mbox->mbox->ops = &tegra_hsp_ops;
>> +       platform_set_drvdata(pdev, hsp_mbox);
>> +
>> +       ret = mbox_controller_register(hsp_mbox->mbox);
>> +       if (ret) {
>> +               pr_err("tegra-hsp mbox: fail to register mailbox %d.\n", ret);
>> +               return ret;
>> +       }
>> +
>> +       spin_lock_init(&hsp_mbox->lock);
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_hsp_remove(struct platform_device *pdev)
>> +{
>> +       struct tegra_hsp_mbox *hsp_mbox = platform_get_drvdata(pdev);
>> +
>> +       if (hsp_mbox->mbox)
>> +               mbox_controller_unregister(hsp_mbox->mbox);
>> +
>> +       return 0;
>> +}
>> +
>> +static struct platform_driver tegra_hsp_driver = {
>> +       .driver = {
>> +               .name = "tegra-hsp",
>> +               .of_match_table = tegra_hsp_match,
>> +       },
>> +       .probe = tegra_hsp_probe,
>> +       .remove = tegra_hsp_remove,
>> +};
>> +
>> +static int __init tegra_hsp_init(void)
>> +{
>> +       return platform_driver_register(&tegra_hsp_driver);
>> +}
>> +core_initcall(tegra_hsp_init);
>> --
>> 2.9.0
>>

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-05  9:04 ` [PATCH V2 05/10] firmware: tegra: add BPMP support Joseph Lo
@ 2016-07-06 11:39   ` Alexandre Courbot
  2016-07-06 16:39     ` Stephen Warren
                       ` (2 more replies)
  2016-07-08 17:55   ` Sivaram Nair
  1 sibling, 3 replies; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-06 11:39 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

Sorry, I will probably need to do several passes on this one to
understand everything, but here is what I can say after a first look:

On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> The Tegra BPMP (Boot and Power Management Processor) is designed for the
> booting process handling, offloading the power management tasks and
> some system control services from the CPU. It can be clock, DVFS,
> thermal/EDP, power gating operation and system suspend/resume handling.
> So the CPU and the drivers of these modules can base on the service that
> the BPMP firmware driver provided to signal the event for the specific PM
> action to BPMP and receive the status update from BPMP.
>
> Comparing to the ARM SCPI, the service provided by BPMP is message-based
> communication but not method-based. The BPMP firmware driver provides the
> send/receive service for the users, when the user concerns the response
> time. If the user needs to get the event or update from the firmware, it
> can request the MRQ service as well. The user needs to take care of the
> message format, which we call BPMP ABI.
>
> The BPMP ABI defines the message format for different modules or usages.
> For example, the clock operation needs an MRQ service code called
> MRQ_CLK with specific message format which includes different sub
> commands for various clock operations. This is the message format that
> BPMP can recognize.
>
> So the user needs two things to initiate IPC between BPMP. Get the
> service from the bpmp_ops structure and maintain the message format as
> the BPMP ABI defined.
>
> Based-on-the-work-by:
> Sivaram Nair <sivaramn@nvidia.com>
>
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - None
> ---
>  drivers/firmware/tegra/Kconfig  |   12 +
>  drivers/firmware/tegra/Makefile |    1 +
>  drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
>  include/soc/tegra/bpmp.h        |   29 +
>  include/soc/tegra/bpmp_abi.h    | 1601 +++++++++++++++++++++++++++++++++++++++
>  5 files changed, 2356 insertions(+)
>  create mode 100644 drivers/firmware/tegra/bpmp.c
>  create mode 100644 include/soc/tegra/bpmp.h
>  create mode 100644 include/soc/tegra/bpmp_abi.h
>
> diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
> index 1fa3e4e136a5..ff2730d5c468 100644
> --- a/drivers/firmware/tegra/Kconfig
> +++ b/drivers/firmware/tegra/Kconfig
> @@ -10,4 +10,16 @@ config TEGRA_IVC
>           keeps the content is synchronization between host CPU and remote
>           processors.
>
> +config TEGRA_BPMP
> +       bool "Tegra BPMP driver"
> +       depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
> +       help
> +         BPMP (Boot and Power Management Processor) is designed to off-loading

s/off-loading/off-load

> +         the PM functions which include clock/DVFS/thermal/power from the CPU.
> +         It needs HSP as the HW synchronization and notification module and
> +         IVC module as the message communication protocol.
> +
> +         This driver manages the IPC interface between host CPU and the
> +         firmware running on BPMP.
> +
>  endmenu
> diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
> index 92e2153e8173..e34a2f79e1ad 100644
> --- a/drivers/firmware/tegra/Makefile
> +++ b/drivers/firmware/tegra/Makefile
> @@ -1 +1,2 @@
> +obj-$(CONFIG_TEGRA_BPMP)       += bpmp.o
>  obj-$(CONFIG_TEGRA_IVC)                += ivc.o
> diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
> new file mode 100644
> index 000000000000..24fda626610e
> --- /dev/null
> +++ b/drivers/firmware/tegra/bpmp.c
> @@ -0,0 +1,713 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/mailbox_client.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/semaphore.h>
> +
> +#include <soc/tegra/bpmp.h>
> +#include <soc/tegra/bpmp_abi.h>
> +#include <soc/tegra/ivc.h>
> +
> +#define BPMP_MSG_SZ            128
> +#define BPMP_MSG_DATA_SZ       120
> +
> +#define __MRQ_ATTRS            0xff000000
> +#define __MRQ_INDEX(id)                ((id) & ~__MRQ_ATTRS)
> +
> +#define DO_ACK                 BIT(0)
> +#define RING_DOORBELL          BIT(1)
> +
> +struct tegra_bpmp_soc_data {
> +       u32 ch_index;           /* channel index */
> +       u32 thread_ch_index;    /* thread channel index */
> +       u32 cpu_rx_ch_index;    /* CPU Rx channel index */
> +       u32 nr_ch;              /* number of total channels */
> +       u32 nr_thread_ch;       /* number of thread channels */
> +       u32 ch_timeout;         /* channel timeout */
> +       u32 thread_ch_timeout;  /* thread channel timeout */
> +};

With just these comments it is not clear what everything in this
structure does. Maybe a file-level comment explaining how BPMP
basically works and what the different channels are allocated to would
help in understanding the code.

> +
> +struct channel_info {
> +       u32 tch_free;
> +       u32 tch_to_complete;
> +       struct semaphore tch_sem;
> +};
> +
> +struct mb_data {
> +       s32 code;
> +       s32 flags;
> +       u8 data[BPMP_MSG_DATA_SZ];
> +} __packed;
> +
> +struct channel_data {
> +       struct mb_data *ib;
> +       struct mb_data *ob;
> +};
> +
> +struct mrq {
> +       struct list_head list;
> +       u32 mrq_code;
> +       bpmp_mrq_handler handler;
> +       void *data;
> +};
> +
> +struct tegra_bpmp {
> +       struct device *dev;
> +       const struct tegra_bpmp_soc_data *soc_data;
> +       void __iomem *tx_base;
> +       void __iomem *rx_base;
> +       struct mbox_client cl;
> +       struct mbox_chan *chan;
> +       struct ivc *ivc_channels;
> +       struct channel_data *ch_area;
> +       struct channel_info ch_info;
> +       struct completion *ch_completion;
> +       struct list_head mrq_list;
> +       struct tegra_bpmp_ops *ops;
> +       spinlock_t lock;
> +       bool init_done;
> +};
> +
> +static struct tegra_bpmp *bpmp;

static? Ok, we only need one... for now. How about adding a private
member to your ivc structure that lets you retrieve the bpmp, and
allocating the bpmp dynamically? This will require an extra argument in
many functions, but it is a cleaner design IMHO.
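
To illustrate what I mean, here is a rough userspace sketch, not a drop-in
patch. All names in it (struct bpmp_instance, struct ivc_sketch, the priv
member, ivc_to_bpmp()) are made up for the example and do not exist in the
series; the real struct ivc would simply grow a similar back-pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct bpmp_instance;

struct ivc_sketch {
	struct bpmp_instance *priv;	/* back-pointer to the owning BPMP */
};

struct bpmp_instance {
	unsigned int thread_ch_index;
	unsigned int nr_ch;
	struct ivc_sketch channels[14];
};

/* Wire every channel back to its owner once, at init time. */
static void bpmp_instance_init(struct bpmp_instance *b,
			       unsigned int thread_ch_index)
{
	unsigned int i;

	b->thread_ch_index = thread_ch_index;
	b->nr_ch = sizeof(b->channels) / sizeof(b->channels[0]);
	for (i = 0; i < b->nr_ch; i++)
		b->channels[i].priv = b;
}

/* Helpers take the instance explicitly instead of reading a global. */
static unsigned int bpmp_thread_ch(const struct bpmp_instance *b,
				   unsigned int idx)
{
	return b->thread_ch_index + idx;
}

/* A callback that only receives the channel can still find its owner. */
static struct bpmp_instance *ivc_to_bpmp(const struct ivc_sketch *ivc)
{
	return ivc->priv;
}
```

With something like this, bpmp_ivc_notify() could recover its bpmp from the
ivc argument it already receives, and two instances could coexist.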

> +
> +static int bpmp_get_thread_ch(int idx)
> +{
> +       return bpmp->soc_data->thread_ch_index + idx;
> +}
> +
> +static int bpmp_get_thread_ch_index(int ch)
> +{
> +       if (ch < bpmp->soc_data->thread_ch_index ||
> +           ch >= bpmp->soc_data->cpu_rx_ch_index)

Shouldn't that be ch >= bpmp->soc_data->thread_ch_index +
bpmp->soc_data->nr_thread_ch?

Either rx_ch_index indicates the upper bound of the threaded channels,
and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or
it can be anywhere else and you should use the correct member.
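
A minimal sketch of the check I have in mind, assuming the thread channels
occupy the range [thread_ch_index, thread_ch_index + nr_thread_ch). The
struct and function names are invented for the example; only the three
fields come from tegra_bpmp_soc_data. Note that with the Tegra186 values in
this patch (6, 7, 13) both bounds happen to coincide, so the difference only
shows up if the Rx channel is placed elsewhere.

```c
/* Simplified copy of the fields involved in the range check. */
struct soc_data_sketch {
	int thread_ch_index;	/* first thread channel */
	int nr_thread_ch;	/* number of thread channels */
	int cpu_rx_ch_index;	/* CPU Rx channel, not used for the bound */
};

/* Map an absolute channel number to a thread-channel index, or -1. */
static int thread_ch_to_index(const struct soc_data_sketch *soc, int ch)
{
	if (ch < soc->thread_ch_index ||
	    ch >= soc->thread_ch_index + soc->nr_thread_ch)
		return -1;

	return ch - soc->thread_ch_index;
}
```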

> +               return -1;
> +       return ch - bpmp->soc_data->thread_ch_index;
> +}
> +
> +static int bpmp_get_ob_channel(void)
> +{
> +       return smp_processor_id() + bpmp->soc_data->ch_index;
> +}
> +
> +static struct completion *bpmp_get_completion_obj(int ch)
> +{
> +       int i = bpmp_get_thread_ch_index(ch);
> +
> +       return i < 0 ? NULL : bpmp->ch_completion + i;
> +}
> +
> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int ib_sz)
> +{
> +       return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
> +              ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
> +              (!ob_sz || ob_data) && (!ib_sz || ib_data);
> +}
> +
> +static bool bpmp_master_acked(int ch)
> +{
> +       struct ivc *ivc_chan;
> +       void *frame;
> +       bool ready;
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       frame = tegra_ivc_read_get_next_frame(ivc_chan);
> +       ready = !IS_ERR_OR_NULL(frame);
> +       bpmp->ch_area[ch].ib = ready ? frame : NULL;
> +
> +       return ready;
> +}
> +
> +static int bpmp_wait_ack(int ch)
> +{
> +       ktime_t t;
> +
> +       t = ns_to_ktime(local_clock());
> +
> +       do {
> +               if (bpmp_master_acked(ch))
> +                       return 0;
> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
> +                bpmp->soc_data->ch_timeout);
> +
> +       return -ETIMEDOUT;
> +}
> +
> +static bool bpmp_master_free(int ch)
> +{
> +       struct ivc *ivc_chan;
> +       void *frame;
> +       bool ready;
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
> +       ready = !IS_ERR_OR_NULL(frame);
> +       bpmp->ch_area[ch].ob = ready ? frame : NULL;
> +
> +       return ready;
> +}
> +
> +static int bpmp_wait_master_free(int ch)
> +{
> +       ktime_t t;
> +
> +       t = ns_to_ktime(local_clock());
> +
> +       do {
> +               if (bpmp_master_free(ch))
> +                       return 0;
> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t)
> +                < bpmp->soc_data->ch_timeout);
> +
> +       return -ETIMEDOUT;
> +}
> +
> +static int __read_ch(int ch, void *data, int sz)
> +{
> +       struct ivc *ivc_chan;
> +       struct mb_data *p;
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       p = bpmp->ch_area[ch].ib;
> +       if (data)
> +               memcpy_fromio(data, p->data, sz);
> +
> +       return tegra_ivc_read_advance(ivc_chan);
> +}
> +
> +static int bpmp_read_ch(int ch, void *data, int sz)
> +{
> +       unsigned long flags;
> +       int i, ret;
> +
> +       i = bpmp_get_thread_ch_index(ch);

i is not a very good name for this variable.
Also note that bpmp_get_thread_ch_index() can return -1, and that case
is not handled here.
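
Shifting by a negative count is undefined behaviour, so the -1 result needs
to be rejected before it reaches the bitmask update. Something along these
lines would do, as a userspace sketch; tch_mark_free() is a hypothetical
helper and the choice of -EINVAL is mine, not something the patch defines.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Mark thread channel 'tch_idx' as free in the bitmask.
 * Rejects the -1 "not a thread channel" result instead of shifting by it.
 * Assumes nr_thread_ch is at most 32, matching the u32 bitmask.
 */
static int tch_mark_free(uint32_t *tch_free, int tch_idx, int nr_thread_ch)
{
	if (tch_idx < 0 || tch_idx >= nr_thread_ch)
		return -EINVAL;

	*tch_free |= UINT32_C(1) << tch_idx;
	return 0;
}
```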

> +
> +       spin_lock_irqsave(&bpmp->lock, flags);
> +       ret = __read_ch(ch, data, sz);
> +       bpmp->ch_info.tch_free |= (1 << i);
> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +       up(&bpmp->ch_info.tch_sem);
> +
> +       return ret;
> +}
> +
> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int sz)
> +{
> +       struct ivc *ivc_chan;
> +       struct mb_data *p;
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       p = bpmp->ch_area[ch].ob;
> +
> +       p->code = mrq_code;
> +       p->flags = flags;
> +       if (data)
> +               memcpy_toio(p->data, data, sz);
> +
> +       return tegra_ivc_write_advance(ivc_chan);
> +}
> +
> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int sz)
> +{
> +       unsigned long flags;
> +       int ret, i;
> +
> +       ret = down_timeout(&bpmp->ch_info.tch_sem,
> +                          usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
> +       if (ret)
> +               return ret;
> +
> +       spin_lock_irqsave(&bpmp->lock, flags);
> +
> +       i = __ffs(bpmp->ch_info.tch_free);
> +       *ch = bpmp_get_thread_ch(i);
> +       ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
> +       if (!ret) {
> +               bpmp->ch_info.tch_free &= ~(1 << i);
> +               __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data, sz);
> +               bpmp->ch_info.tch_to_complete |= (1 << *ch);
> +       }
> +
> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +       return ret;
> +}
> +
> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data, int sz)
> +{
> +       int ret;
> +
> +       ret = bpmp_wait_master_free(ch);
> +       if (ret)
> +               return ret;
> +
> +       return __write_ch(ch, mrq_code, flags, data, sz);
> +}
> +
> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int ob_sz,
> +                                   void *ib_data, int ib_sz)
> +{
> +       int ch, ret;
> +
> +       if (WARN_ON(!irqs_disabled()))
> +               return -EPERM;
> +
> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> +               return -EINVAL;
> +
> +       if (!bpmp->init_done)
> +               return -ENODEV;
> +
> +       ch = bpmp_get_ob_channel();
> +       ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
> +       if (ret)
> +               return ret;
> +
> +       ret = mbox_send_message(bpmp->chan, NULL);
> +       if (ret < 0)
> +               return ret;
> +       mbox_client_txdone(bpmp->chan, 0);
> +
> +       ret = bpmp_wait_ack(ch);
> +       if (ret)
> +               return ret;
> +
> +       return __read_ch(ch, ib_data, ib_sz);
> +}
> +
> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
> +                            void *ib_data, int ib_sz)
> +{
> +       struct completion *comp_obj;
> +       unsigned long timeout;
> +       int ch, ret;
> +
> +       if (WARN_ON(irqs_disabled()))
> +               return -EPERM;
> +
> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> +               return -EINVAL;
> +
> +       if (!bpmp->init_done)
> +               return -ENODEV;
> +
> +       ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
> +       if (ret)
> +               return ret;
> +
> +       ret = mbox_send_message(bpmp->chan, NULL);
> +       if (ret < 0)
> +               return ret;
> +       mbox_client_txdone(bpmp->chan, 0);
> +
> +       comp_obj = bpmp_get_completion_obj(ch);
> +       timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
> +       if (!wait_for_completion_timeout(comp_obj, timeout))
> +               return -ETIMEDOUT;
> +
> +       return bpmp_read_ch(ch, ib_data, ib_sz);
> +}
> +
> +static struct mrq *bpmp_find_mrq(u32 mrq_code)
> +{
> +       struct mrq *mrq;
> +
> +       list_for_each_entry(mrq, &bpmp->mrq_list, list) {
> +               if (mrq_code == mrq->mrq_code)
> +                       return mrq;
> +       }
> +
> +       return NULL;
> +}
> +
> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
> +{
> +       int flags = bpmp->ch_area[ch].ib->flags;
> +       struct ivc *ivc_chan;
> +       struct mb_data *frame;
> +       int ret;
> +
> +       if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
> +               return;
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       ret = tegra_ivc_read_advance(ivc_chan);
> +       WARN_ON(ret);
> +
> +       if (!(flags & DO_ACK))
> +               return;
> +
> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
> +       if (IS_ERR_OR_NULL(frame)) {
> +               WARN_ON(1);
> +               return;
> +       }
> +
> +       frame->code = code;
> +       if (data != NULL)
> +               memcpy_toio(frame->data, data, sz);
> +       ret = tegra_ivc_write_advance(ivc_chan);
> +       WARN_ON(ret);
> +
> +       if (flags & RING_DOORBELL) {
> +               ret = mbox_send_message(bpmp->chan, NULL);
> +               if (ret < 0) {
> +                       WARN_ON(1);
> +                       return;
> +               }
> +               mbox_client_txdone(bpmp->chan, 0);
> +       }
> +}
> +
> +static void bpmp_mail_return(int ch, int ret_code, int val)
> +{
> +       bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
> +}
> +
> +static void bpmp_handle_mrq(int mrq_code, int ch)
> +{
> +       struct mrq *mrq;
> +
> +       spin_lock(&bpmp->lock);
> +
> +       mrq = bpmp_find_mrq(mrq_code);
> +       if (!mrq) {
> +               spin_unlock(&bpmp->lock);
> +               bpmp_mail_return(ch, -EINVAL, 0);
> +               return;
> +       }
> +
> +       mrq->handler(mrq_code, mrq->data, ch);
> +
> +       spin_unlock(&bpmp->lock);
> +}
> +
> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void *data)
> +{
> +       struct mrq *mrq;
> +       unsigned long flags;
> +
> +       if (!handler)
> +               return -EINVAL;
> +
> +       mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
> +       if (!mrq)
> +               return -ENOMEM;
> +
> +       spin_lock_irqsave(&bpmp->lock, flags);
> +
> +       mrq->mrq_code = __MRQ_INDEX(mrq_code);
> +       mrq->handler = handler;
> +       mrq->data = data;
> +       list_add(&mrq->list, &bpmp->mrq_list);
> +
> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +       return 0;
> +}
> +
> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
> +{
> +       int challenge;
> +       int reply;
> +
> +       challenge = *(int *)bpmp->ch_area[ch].ib->data;
> +       reply = challenge << (smp_processor_id() + 1);
> +       bpmp_mail_return(ch, 0, reply);
> +}
> +
> +static int bpmp_mailman_init(void)
> +{
> +       return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
> +}
> +
> +static int bpmp_ping(void)
> +{
> +       unsigned long flags;
> +       ktime_t t;
> +       int challenge = 1;

Mmmm, shouldn't we use a mrq_ping_request instead of a parameter whose
size may vary depending on the architecture? On a 64-bit big endian
architecture, your messages would be corrupted.

> +       int reply = 0;

And this should probably be a mrq_ping_response. These remarks may
also apply to bpmp_mrq_handle_ping().
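
Roughly what I am suggesting, as a standalone sketch. The two structs below
are stand-ins whose single 32-bit field is an assumption on my part; the
real definitions should come from bpmp_abi.h. The point is only that the
payload size is pinned by the type rather than by whatever int happens to
be.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout for the example: one 32-bit field each. */
struct ping_request_sketch {
	uint32_t challenge;
} __attribute__((packed));

struct ping_response_sketch {
	uint32_t reply;
} __attribute__((packed));

_Static_assert(sizeof(struct ping_request_sketch) == 4,
	       "request must be 4 bytes");
_Static_assert(sizeof(struct ping_response_sketch) == 4,
	       "response must be 4 bytes");

/* Build the reply the way bpmp_mrq_handle_ping() does: challenge << (cpu + 1). */
static struct ping_response_sketch
ping_make_reply(const uint8_t *payload, unsigned int cpu)
{
	struct ping_request_sketch req;
	struct ping_response_sketch resp;

	/* Copy out of the message buffer rather than casting it to int *. */
	memcpy(&req, payload, sizeof(req));
	resp.reply = req.challenge << (cpu + 1);
	return resp;
}
```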

> +       int ret;
> +
> +       t = ktime_get();
> +       local_irq_save(flags);
> +       ret = bpmp_send_receive_atomic(MRQ_PING, &challenge, sizeof(challenge),
> +                                      &reply, sizeof(reply));
> +       local_irq_restore(flags);
> +       t = ktime_sub(ktime_get(), t);
> +
> +       if (!ret)
> +               dev_info(bpmp->dev,
> +                        "ping ok: challenge: %d, reply: %d, time: %lld\n",
> +                        challenge, reply, ktime_to_us(t));
> +
> +       return ret;
> +}
> +
> +static int bpmp_get_fwtag(void)
> +{
> +       unsigned long flags;
> +       void *vaddr;
> +       dma_addr_t paddr;
> +       u32 addr;

Here also we should use a mrq_query_tag_request.

> +       int ret;
> +
> +       vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
> +                                  GFP_KERNEL);

dma_addr_t may be 64 bits wide here, and you may get an address higher
than the 32 bits allowed by mrq_query_tag_request! I guess you want to
add GFP_DMA32 as a flag to your call to dma_alloc_coherent.
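
However the allocation ends up being constrained, I would also make the
narrowing explicit so that a too-high address fails loudly instead of being
silently truncated by "addr = paddr". A userspace sketch of the check,
where dma_addr_sketch_t stands in for a 64-bit dma_addr_t and
dma_addr_to_u32() is a hypothetical helper:

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for dma_addr_t on a 64-bit platform (assumption for this sketch). */
typedef uint64_t dma_addr_sketch_t;

/* Narrow a DMA address to the 32-bit field, failing instead of truncating. */
static int dma_addr_to_u32(dma_addr_sketch_t paddr, uint32_t *out)
{
	if (paddr > UINT32_MAX)
		return -ERANGE;

	*out = (uint32_t)paddr;
	return 0;
}
```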

> +       if (!vaddr)
> +               return -ENOMEM;
> +       addr = paddr;
> +
> +       local_irq_save(flags);
> +       ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr, sizeof(addr),
> +                                      NULL, 0);
> +       local_irq_restore(flags);
> +
> +       if (!ret)
> +               dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
> +
> +       dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
> +
> +       return ret;
> +}
> +
> +static void bpmp_signal_thread(int ch)
> +{
> +       int flags = bpmp->ch_area[ch].ob->flags;
> +       struct completion *comp_obj;
> +
> +       if (!(flags & RING_DOORBELL))
> +               return;
> +
> +       comp_obj = bpmp_get_completion_obj(ch);
> +       if (!comp_obj) {
> +               WARN_ON(1);
> +               return;
> +       }
> +
> +       complete(comp_obj);
> +}
> +
> +static void bpmp_handle_rx(struct mbox_client *cl, void *data)
> +{
> +       int i, rx_ch;
> +
> +       rx_ch = bpmp->soc_data->cpu_rx_ch_index;
> +
> +       if (bpmp_master_acked(rx_ch))
> +               bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
> +
> +       spin_lock(&bpmp->lock);
> +
> +       for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
> +                       bpmp->ch_info.tch_to_complete; i++) {
> +               int ch = bpmp_get_thread_ch(i);
> +
> +               if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
> +                   bpmp_master_acked(ch)) {
> +                       bpmp->ch_info.tch_to_complete &= ~(1 << ch);
> +                       bpmp_signal_thread(ch);
> +               }
> +       }
> +
> +       spin_unlock(&bpmp->lock);
> +}
> +
> +static void bpmp_ivc_notify(struct ivc *ivc)
> +{
> +       int ret;
> +
> +       ret = mbox_send_message(bpmp->chan, NULL);
> +       if (ret < 0)
> +               return;
> +
> +       mbox_send_message(bpmp->chan, NULL);

Why the second call to mbox_send_message? It may be useful to add a
comment explaining it.

> +}
> +
> +static int bpmp_msg_chan_init(int ch)
> +{
> +       struct ivc *ivc_chan;
> +       u32 hdr_sz, msg_sz, que_sz;
> +       uintptr_t rx_base, tx_base;
> +       int ret;
> +
> +       msg_sz = tegra_ivc_align(BPMP_MSG_SZ);
> +       hdr_sz = tegra_ivc_total_queue_size(0);

I believe hdr_sz is never used?

> +       que_sz = tegra_ivc_total_queue_size(msg_sz);
> +
> +       rx_base =  (uintptr_t)(bpmp->rx_base + que_sz * ch);
> +       tx_base =  (uintptr_t)(bpmp->tx_base + que_sz * ch);
> +
> +       ivc_chan = bpmp->ivc_channels + ch;
> +       ret = tegra_ivc_init(ivc_chan, rx_base, DMA_ERROR_CODE, tx_base,
> +                            DMA_ERROR_CODE, 1, msg_sz, bpmp->dev,
> +                            bpmp_ivc_notify);
> +       if (ret) {
> +               dev_err(bpmp->dev, "%s fail: ch %d returned %d\n",
> +                       __func__, ch, ret);
> +               return ret;
> +       }
> +
> +       /* reset the channel state */
> +       tegra_ivc_channel_reset(ivc_chan);
> +
> +       /* sync the channel state with BPMP */
> +       while (tegra_ivc_channel_notified(ivc_chan))
> +               ;
> +
> +       return 0;
> +}
> +
> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void)
> +{
> +       if (bpmp->init_done && bpmp->ops)
> +               return bpmp->ops;
> +       return NULL;
> +}
> +EXPORT_SYMBOL(tegra_bpmp_get_ops);
> +
> +static struct tegra_bpmp_ops bpmp_ops = {
> +       .send_receive = bpmp_send_receive,
> +       .send_receive_atomic = bpmp_send_receive_atomic,
> +       .request_mrq = bpmp_request_mrq,
> +       .mrq_return = bpmp_mail_return,
> +};
> +
> +static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
> +       .ch_index = 0,
> +       .thread_ch_index = 6,
> +       .cpu_rx_ch_index = 13,
> +       .nr_ch = 14,
> +       .nr_thread_ch = 7,
> +       .ch_timeout = 60 * USEC_PER_SEC,
> +       .thread_ch_timeout = 600 * USEC_PER_SEC,
> +};
> +
> +static const struct of_device_id tegra_bpmp_match[] = {
> +       { .compatible = "nvidia,tegra186-bpmp", .data = &soc_data_tegra186 },
> +       { }
> +};
> +
> +static int tegra_bpmp_probe(struct platform_device *pdev)
> +{
> +       const struct of_device_id *match;
> +       struct resource shmem_res;
> +       struct device_node *shmem_np;
> +       int i, ret;
> +
> +       bpmp = devm_kzalloc(&pdev->dev, sizeof(*bpmp), GFP_KERNEL);
> +       if (!bpmp)
> +               return -ENOMEM;
> +       bpmp->dev = &pdev->dev;
> +
> +       match = of_match_device(tegra_bpmp_match, &pdev->dev);
> +       if (!match)
> +               return -EINVAL;
> +       bpmp->soc_data = match->data;
> +
> +       shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 0);
> +       of_address_to_resource(shmem_np, 0, &shmem_res);

Maybe check return value for these two calls?

> +       bpmp->tx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
> +       if (IS_ERR(bpmp->tx_base))
> +               return PTR_ERR(bpmp->tx_base);
> +
> +       shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 1);
> +       of_address_to_resource(shmem_np, 0, &shmem_res);

And here too.

> +       bpmp->rx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
> +       if (IS_ERR(bpmp->rx_base))
> +               return PTR_ERR(bpmp->rx_base);
> +
> +       bpmp->ivc_channels = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
> +                                         sizeof(*bpmp->ivc_channels),
> +                                         GFP_KERNEL);
> +       if (!bpmp->ivc_channels)
> +               return -ENOMEM;
> +
> +       bpmp->ch_area = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
> +                                    sizeof(*bpmp->ch_area), GFP_KERNEL);
> +       if (!bpmp->ch_area)
> +               return -ENOMEM;
> +
> +       bpmp->ch_completion = devm_kcalloc(&pdev->dev,
> +                                          bpmp->soc_data->nr_thread_ch,
> +                                          sizeof(*bpmp->ch_completion),
> +                                          GFP_KERNEL);
> +       if (!bpmp->ch_completion)
> +               return -ENOMEM;
> +
> +       /* mbox registration */
> +       bpmp->cl.dev = &pdev->dev;
> +       bpmp->cl.rx_callback = bpmp_handle_rx;
> +       bpmp->cl.tx_block = false;
> +       bpmp->cl.knows_txdone = false;
> +       bpmp->chan = mbox_request_channel(&bpmp->cl, 0);
> +       if (IS_ERR(bpmp->chan)) {
> +               if (PTR_ERR(bpmp->chan) != -EPROBE_DEFER)
> +                       dev_err(&pdev->dev,
> +                               "fail to get HSP mailbox, bpmp init fail.\n");
> +               return PTR_ERR(bpmp->chan);
> +       }
> +
> +       /* message channel initialization */
> +       for (i = 0; i < bpmp->soc_data->nr_ch; i++) {
> +               struct completion *comp_obj;
> +
> +               ret = bpmp_msg_chan_init(i);
> +               if (ret)
> +                       return ret;
> +
> +               comp_obj = bpmp_get_completion_obj(i);
> +               if (comp_obj)
> +                       init_completion(comp_obj);
> +       }
> +
> +       bpmp->ch_info.tch_free = (1 << bpmp->soc_data->nr_thread_ch) - 1;
> +       sema_init(&bpmp->ch_info.tch_sem, bpmp->soc_data->nr_thread_ch);
> +
> +       spin_lock_init(&bpmp->lock);
> +       INIT_LIST_HEAD(&bpmp->mrq_list);
> +       if (bpmp_mailman_init())
> +               return -ENODEV;
> +
> +       bpmp->init_done = true;
> +
> +       ret = bpmp_ping();
> +       if (ret)
> +               dev_err(&pdev->dev, "ping failed: %d\n", ret);
> +
> +       ret = bpmp_get_fwtag();
> +       if (ret)
> +               dev_err(&pdev->dev, "get fwtag failed: %d\n", ret);
> +
> +       /* BPMP is ready now. */
> +       bpmp->ops = &bpmp_ops;
> +
> +       return 0;
> +}
> +
> +static struct platform_driver tegra_bpmp_driver = {
> +       .driver = {
> +               .name = "tegra-bpmp",
> +               .of_match_table = tegra_bpmp_match,
> +       },
> +       .probe = tegra_bpmp_probe,
> +};
> +
> +static int __init tegra_bpmp_init(void)
> +{
> +       return platform_driver_register(&tegra_bpmp_driver);
> +}
> +core_initcall(tegra_bpmp_init);
> diff --git a/include/soc/tegra/bpmp.h b/include/soc/tegra/bpmp.h
> new file mode 100644
> index 000000000000..aaa0ef34ad7b
> --- /dev/null
> +++ b/include/soc/tegra/bpmp.h
> @@ -0,0 +1,29 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#ifndef __TEGRA_BPMP_H
> +
> +typedef void (*bpmp_mrq_handler)(int mrq_code, void *data, int ch);
> +
> +struct tegra_bpmp_ops {
> +       int (*send_receive)(int mrq_code, void *ob_data, int ob_sz,
> +                           void *ib_data, int ib_sz);
> +       int (*send_receive_atomic)(int mrq_code, void *ob_data, int ob_sz,
> +                           void *ib_data, int ib_sz);
> +       int (*request_mrq)(int mrq_code, bpmp_mrq_handler handler, void *data);
> +       void (*mrq_return)(int ch, int ret_code, int val);
> +};
> +
> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void);
> +
> +#endif /* __TEGRA_BPMP_H */
> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
> new file mode 100644
> index 000000000000..0aaef5960e29
> --- /dev/null
> +++ b/include/soc/tegra/bpmp_abi.h
> @@ -0,0 +1,1601 @@
> +/*
> + * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _ABI_BPMP_ABI_H_
> +#define _ABI_BPMP_ABI_H_
> +
> +#ifdef LK
> +#include <stdint.h>
> +#endif
> +
> +#ifndef __ABI_PACKED
> +#define __ABI_PACKED __attribute__((packed))
> +#endif
> +
> +#ifdef NO_GCC_EXTENSIONS
> +#define EMPTY char empty;
> +#define EMPTY_ARRAY 1
> +#else
> +#define EMPTY
> +#define EMPTY_ARRAY 0
> +#endif
> +
> +#ifndef __UNION_ANON
> +#define __UNION_ANON
> +#endif
> +/**
> + * @file
> + */
> +
> +
> +/**
> + * @defgroup MRQ MRQ Messages
> + * @brief Messages sent to/from BPMP via IPC
> + * @{
> + *   @defgroup MRQ_Format Message Format
> + *   @defgroup MRQ_Codes Message Request (MRQ) Codes
> + *   @defgroup MRQ_Payloads Message Payloads
> + *   @defgroup Error_Codes Error Codes
> + * @}
> + */

...

There is a lot of stuff in this file, most of which we are not using
now - this is ok, but unless this is a file synced from an outside
resource maybe we should trim the structures we don't need and add
them as we make use of them? It helps divide the work into bite-size
chunks.

Regarding the documentation format of this file, is this valid kernel
documentation since the adoption of Sphinx? Or is it whatever the
origin is using?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
@ 2016-07-06 11:42   ` Alexandre Courbot
  2016-07-07  6:25     ` Joseph Lo
  2016-07-06 17:03   ` Stephen Warren
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-06 11:42 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> The BPMP is a specific processor in Tegra chip, which is designed for
> booting process handling and offloading the power management, clock
> management, and reset control tasks from the CPU. The binding document
> defines the resources that would be used by the BPMP firmware driver,
> which can create the interprocessor communication (IPC) between the CPU
> and BPMP.
>
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - update the message that the BPMP is clock and reset control provider
> - add tegra186-clock.h and tegra186-reset.h header files
> - revise the description of the required properties
> ---
>  .../bindings/firmware/nvidia,tegra186-bpmp.txt     |  77 ++
>  include/dt-bindings/clock/tegra186-clock.h         | 940 +++++++++++++++++++++
>  include/dt-bindings/reset/tegra186-reset.h         | 217 +++++
>  3 files changed, 1234 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>  create mode 100644 include/dt-bindings/clock/tegra186-clock.h
>  create mode 100644 include/dt-bindings/reset/tegra186-reset.h
>
> diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
> new file mode 100644
> index 000000000000..4d0b6eba56c5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
> @@ -0,0 +1,77 @@
> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
> +
> +The BPMP is a specific processor in Tegra chip, which is designed for
> +booting process handling and offloading the power management, clock
> +management, and reset control tasks from the CPU. The binding document
> +defines the resources that would be used by the BPMP firmware driver,
> +which can create the interprocessor communication (IPC) between the CPU
> +and BPMP.
> +
> +Required properties:
> +- name : Should be bpmp
> +- compatible
> +    Array of strings
> +    One of:
> +    - "nvidia,tegra186-bpmp"
> +- mboxes : The phandle of mailbox controller and the mailbox specifier.
> +- shmem : List of the phandle of the TX and RX shared memory area that
> +         the IPC between CPU and BPMP is based on.
> +- #clock-cells : Should be 1.
> +- #reset-cells : Should be 1.
> +
> +This node is a mailbox consumer. See the following files for details of
> +the mailbox subsystem, and the specifiers implemented by the relevant
> +provider(s):
> +
> +- Documentation/devicetree/bindings/mailbox/mailbox.txt
> +- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
> +
> +This node is a clock and reset provider. See the following files for
> +general documentation of those features, and the specifiers implemented
> +by this node:
> +
> +- Documentation/devicetree/bindings/clock/clock-bindings.txt
> +- include/dt-bindings/clock/tegra186-clock.h
> +- Documentation/devicetree/bindings/reset/reset.txt
> +- include/dt-bindings/reset/tegra186-reset.h
> +
> +The shared memory bindings for BPMP
> +-----------------------------------
> +
> +The shared memory area for the IPC TX and RX between CPU and BPMP are
> +predefined and work on top of sysram, which is an SRAM inside the chip.
> +
> +See "Documentation/devicetree/bindings/sram/sram.txt" for the bindings.
> +
> +Example:
> +
> +hsp_top0: hsp@03c00000 {
> +       ...
> +       #mbox-cells = <1>;
> +};
> +
> +sysram@30000000 {
> +       compatible = "nvidia,tegra186-sysram", "mmio-ram";

Shouldn't the second compatible be "mmio-sram"?

If so, then you have the same typo in tegra186.dtsi as well.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-06  9:06     ` Joseph Lo
@ 2016-07-06 12:23       ` Alexandre Courbot
  2016-07-07  6:37         ` Joseph Lo
  2016-07-06 16:50       ` Stephen Warren
  1 sibling, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-06 12:23 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Wed, Jul 6, 2016 at 6:06 PM, Joseph Lo <josephl@nvidia.com> wrote:
> On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
>>
>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>
>>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>>> interprocessor communication (IPC) for remote processors currently. The
>>> HSP HW modules support some different features for that, which are
>>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>>> doorbells. And there are multiple HSP HW instances on the chip. So the
>>> driver is extendable to support more features for different IPC
>>> requirement.
>>>
>>> The driver of remote processor can use it as a mailbox client and deal
>>> with the IPC protocol to synchronize the data communications.
>>>
>>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>>> ---
>>> Changes in V2:
>>> - Update the driver to support the binding changes in V2
>>> - it's extendable to support multiple HSP sub-modules on the same HSP HW
>>> block
>>>    now.
>>> ---
>>>   drivers/mailbox/Kconfig     |   9 +
>>>   drivers/mailbox/Makefile    |   2 +
>>>   drivers/mailbox/tegra-hsp.c | 418
>>> ++++++++++++++++++++++++++++++++++++++++++++
>>>   3 files changed, 429 insertions(+)
>>>   create mode 100644 drivers/mailbox/tegra-hsp.c
>>>
>>> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
>>> index 5305923752d2..fe584cb54720 100644
>>> --- a/drivers/mailbox/Kconfig
>>> +++ b/drivers/mailbox/Kconfig
>>> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>>>            Test client to help with testing new Controller driver
>>>            implementations.
>>>
>>> +config TEGRA_HSP_MBOX
>>> +       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
>>
>>
>> Space missing before the opening parenthesis (same in the patch title
>> btw).
>
> Okay.
>>
>>
>>> +       depends on ARCH_TEGRA_186_SOC
>>> +       help
>>> +         The Tegra HSP driver is used for the interprocessor
>>> communication
>>> +         between different remote processors and host processors on
>>> Tegra186
>>> +         and later SoCs. Say Y here if you want to have this support.
>>> +         If unsure say N.
>>
>>
>> Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
>> should probably drop the last 2 sentences.
>
> Okay.
>
>>
>>> +
>>>   config XGENE_SLIMPRO_MBOX
>>>          tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>>>          depends on ARCH_XGENE
>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>> index 0be3e742bb7d..26d8f91c7fea 100644
>>> --- a/drivers/mailbox/Makefile
>>> +++ b/drivers/mailbox/Makefile
>>> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>>>   obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>>>
>>>   obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
>>> +
>>> +obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
>>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>>> new file mode 100644
>>> index 000000000000..93c3ef58f29f
>>> --- /dev/null
>>> +++ b/drivers/mailbox/tegra-hsp.c
>>> @@ -0,0 +1,418 @@
>>> +/*
>>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> it
>>> + * under the terms and conditions of the GNU General Public License,
>>> + * version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope it will be useful, but
>>> WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> + * more details.
>>> + */
>>> +
>>> +#include <linux/interrupt.h>
>>> +#include <linux/io.h>
>>> +#include <linux/mailbox_controller.h>
>>> +#include <linux/of.h>
>>> +#include <linux/of_device.h>
>>> +#include <linux/platform_device.h>
>>> +#include <dt-bindings/mailbox/tegra186-hsp.h>
>>> +
>>> +#define HSP_INT_DIMENSIONING   0x380
>>> +#define HSP_nSM_OFFSET         0
>>> +#define HSP_nSS_OFFSET         4
>>> +#define HSP_nAS_OFFSET         8
>>> +#define HSP_nDB_OFFSET         12
>>> +#define HSP_nSI_OFFSET         16
>>
>>
>> Would be nice to have comments to understand what SM, SS, AS, etc.
>> stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
>> but you need to look at the patch description to understand that). A
>> top-of-file comment explaining the necessary concepts to read this code
>> would do the trick.
>
> Yes, will fix that.
>>
>>
>>> +#define HSP_nINT_MASK          0xf
>>> +
>>> +#define HSP_DB_REG_TRIGGER     0x0
>>> +#define HSP_DB_REG_ENABLE      0x4
>>> +#define HSP_DB_REG_RAW         0x8
>>> +#define HSP_DB_REG_PENDING     0xc
>>> +
>>> +#define HSP_DB_CCPLEX          1
>>> +#define HSP_DB_BPMP            3
>>
>>
>> Maybe turn this into enum and use that type for
>> tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
>> related to these values?
>
> Okay.
>
>>
>>> +
>>> +#define MAX_NUM_HSP_CHAN 32
>>> +#define MAX_NUM_HSP_DB 7
>>> +
>>> +#define hsp_db_offset(i, d) \
>>> +       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) +
>>> \
>>> +       (i) * 0x100)
>>> +
>>> +struct tegra_hsp_db_chan {
>>> +       int master_id;
>>> +       int db_id;
>>> +};
>>> +
>>> +struct tegra_hsp_mbox_chan {
>>> +       int type;
>>> +       union {
>>> +               struct tegra_hsp_db_chan db_chan;
>>> +       };
>>> +};
>>> +
>>> +struct tegra_hsp_mbox {
>>> +       struct mbox_controller *mbox;
>>> +       void __iomem *base;
>>> +       void __iomem *db_base[MAX_NUM_HSP_DB];
>>> +       int db_irq;
>>> +       int nr_sm;
>>> +       int nr_as;
>>> +       int nr_ss;
>>> +       int nr_db;
>>> +       int nr_si;
>>> +       spinlock_t lock;
>>> +};
>>> +
>>> +static inline u32 hsp_readl(void __iomem *base, int reg)
>>> +{
>>> +       return readl(base + reg);
>>> +}
>>> +
>>> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
>>> +{
>>> +       writel(val, base + reg);
>>> +       readl(base + reg);
>>> +}
>>> +
>>> +static int hsp_db_can_ring(void __iomem *db_base)
>>> +{
>>> +       u32 reg;
>>> +
>>> +       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
>>> +
>>> +       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
>>> +}
>>> +
>>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>>> +{
>>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>>> +       ulong val;
>>> +       int master_id;
>>> +
>>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>> +                              HSP_DB_REG_PENDING);
>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING,
>>> val);
>>> +
>>> +       spin_lock(&hsp_mbox->lock);
>>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>>> +               struct mbox_chan *chan;
>>> +               struct tegra_hsp_mbox_chan *mchan;
>>> +               int i;
>>> +
>>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>>
>>
>> I wonder if this could not be optimized. You are doing a double loop
>> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
>> like the same master_id cannot be used twice (considering that the
>> inner loop only processes the first match), couldn't you just select
>> the free channel in of_hsp_mbox_xlate() by doing
>> &mbox->chans[master_id] (and returning an error if it is already
>> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
>> instead of having the inner loop below? That would remove the need for
>> the second loop.
>
>
> That was exactly what I did in the V1, which only supported one HSP
> sub-module per HSP HW block. So we can just use the master_id as the mbox
> channel ID.
>
> Meanwhile, V2 is intended to support multiple HSP sub-modules running
> on the same HSP HW block. The "ID"s of different modules could
> conflict, so I dropped the mechanism that used the master_id as the mbox
> channel ID.
>
> Instead, the channel is allocated when the client is bound to
> one of the HSP sub-modules. We store the "ID" information in the
> private mbox channel data, which helps us figure out which mbox
> channel should respond to the interrupt.
>
> In the doorbell case, all the DB clients share the same DB IRQ
> on the CPU side. So in the ISR, we need to figure out the IRQ source, which
> is the master_id that the IRQ came from. This is the outer loop. In the inner
> loop, we figure out which channel should respond by checking the type
> and ID.
>
> And I think it should be pretty quick, because we only check the set bits
> in the pending register and then find the matching channel.

Yeah, I am not worried about the CPU time (although in interrupt
context, we always should), but rather about whether the code could be
simplified.

Ah, I think I get it. You want to be able to receive interrupts from
the same master, but not necessarily for the doorbell function.
Because of this you cannot use master_id as the index for the channel.
Am I understanding correctly?
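
To make sure we are talking about the same thing, here is a minimal
userspace sketch of the lookup as I understand it. The names (hsp_type,
hsp_chan, hsp_find_chan, NUM_CHANS) are hypothetical stand-ins, not the
driver's real structures; the only point is that a channel is identified
by the (type, master_id) pair rather than by master_id alone.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types, for illustration only. */
enum hsp_type { HSP_TYPE_NONE = 0, HSP_TYPE_DB, HSP_TYPE_SM };

struct hsp_chan {
	enum hsp_type type;	/* HSP_TYPE_NONE marks an unbound slot */
	int master_id;
};

#define NUM_CHANS 32

/*
 * Return the first bound channel whose type and master ID both match,
 * or NULL if there is none. Two channels may share a master_id as long
 * as their types differ, which is why master_id cannot be the index.
 */
static struct hsp_chan *hsp_find_chan(struct hsp_chan *chans,
				      enum hsp_type type, int master_id)
{
	size_t i;

	for (i = 0; i < NUM_CHANS; i++) {
		if (chans[i].type == HSP_TYPE_NONE)
			continue;
		if (chans[i].type == type && chans[i].master_id == master_id)
			return &chans[i];
	}

	return NULL;
}
```

If that matches your intent, the inner loop is needed only because the
index and the master ID are decoupled.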

>
>>
>> If having two channels use the same master_id is a valid scenario,
>> then all matches on master_id should probably be processed, not just
>> the first one.
>
> Each DB channel should have different master_id.
>
>
>>
>>> +                       chan = &hsp_mbox->mbox->chans[i];
>>> +
>>> +                       if (!chan->con_priv)
>>> +                               continue;
>>> +
>>> +                       mchan = chan->con_priv;
>>> +                       if (mchan->type == HSP_MBOX_TYPE_DB &&
>>> +                           mchan->db_chan.master_id == master_id)
>>> +                               break;
>>> +                       chan = NULL;
>>> +               }
>>> +
>>> +               if (chan)
>>> +                       mbox_chan_received_data(chan, NULL);
>>> +       }
>>> +       spin_unlock(&hsp_mbox->lock);
>>> +
>>> +       return IRQ_HANDLED;
>>> +}
>>> +
>>> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
>>> +{
>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>> dev_get_drvdata(chan->mbox->dev);
>>> +
>>> +       hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER,
>>> 1);
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int hsp_db_startup(struct mbox_chan *chan)
>>> +{
>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>> dev_get_drvdata(chan->mbox->dev);
>>> +       u32 val;
>>> +       unsigned long flag;
>>> +
>>> +       if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
>>> +               dev_err(chan->mbox->dev, "invalid HSP chan: master ID:
>>> %d\n",
>>> +                       db_chan->master_id);
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>> HSP_DB_REG_ENABLE);
>>> +       val |= BIT(db_chan->master_id);
>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE,
>>> val);
>>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>>> +
>>> +       if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
>>> +               return -ENODEV;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static void hsp_db_shutdown(struct mbox_chan *chan)
>>> +{
>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>> dev_get_drvdata(chan->mbox->dev);
>>> +       u32 val;
>>> +       unsigned long flag;
>>> +
>>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>> HSP_DB_REG_ENABLE);
>>> +       val &= ~BIT(db_chan->master_id);
>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE,
>>> val);
>>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>>> +}
>>> +
>>> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
>>> +{
>>> +       return true;
>>> +}
>>> +
>>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>>> +                            struct mbox_chan *mchan, int master_id)
>>> +{
>>> +       struct platform_device *pdev =
>>> to_platform_device(hsp_mbox->mbox->dev);
>>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>>> +       int ret;
>>> +
>>> +       if (!hsp_mbox->db_irq) {
>>> +               int i;
>>> +
>>> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev,
>>> "doorbell");
>>
>>
>> Getting the IRQ sounds more like a job for probe() - I don't see the
>> benefit of lazy-doing it?
>
>
> We only need the IRQ when a client requests the DB service. The other
> HSP sub-modules use different IRQs, so I didn't do that at probe
> time.

Ok, but probe() is where resources should be acquired... and at the
very least DT properties be looked up. In this case there is no hard
requirement for doing it elsewhere.

Is this interrupt absolutely required? Or can we tolerate not using
the doorbell service? In the first case, the driver should fail during
probe(), not sometime later. In the second case, you should still get
all the interrupts in probe(), then disable them if they are not
needed, and check in this function whether db_irq is a valid interrupt
number to decide whether or not we can use doorbell.

>
>>
>>> +               ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
>>> +                                      hsp_db_irq, IRQF_NO_SUSPEND,
>>> +                                      dev_name(&pdev->dev), hsp_mbox);
>>> +               if (ret)
>>> +                       return ret;
>>> +
>>> +               for (i = 0; i < MAX_NUM_HSP_DB; i++)
>>> +                       hsp_mbox->db_base[i] = hsp_db_offset(i,
>>> hsp_mbox);
>>
>>
>> Same here, cannot this be moved into probe()?
>
> Same as above, only needed when the client requests it.

But you don't waste any resources by doing it preemptively in probe().
So let's keep related code in the same place.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-06 11:39   ` Alexandre Courbot
@ 2016-07-06 16:39     ` Stephen Warren
  2016-07-06 16:47     ` Matt Longnecker
  2016-07-07  8:17     ` Joseph Lo
  2 siblings, 0 replies; 51+ messages in thread
From: Stephen Warren @ 2016-07-06 16:39 UTC (permalink / raw)
  To: Alexandre Courbot, Joseph Lo
  Cc: Thierry Reding, linux-tegra, linux-arm-kernel, Rob Herring,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon

On 07/06/2016 05:39 AM, Alexandre Courbot wrote:
> Sorry, I will probably need to do several passes on this one to
> understand everything, but here is what I can say after a first look:
>
> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> The Tegra BPMP (Boot and Power Management Processor) is designed for the
>> booting process handling, offloading the power management tasks and
>> some system control services from the CPU. It can be clock, DVFS,
>> thermal/EDP, power gating operation and system suspend/resume handling.
>> So the CPU and the drivers of these modules can base on the service that
>> the BPMP firmware driver provided to signal the event for the specific PM
>> action to BPMP and receive the status update from BPMP.
>>
>> Comparing to the ARM SCPI, the service provided by BPMP is message-based
>> communication but not method-based. The BPMP firmware driver provides the
>> send/receive service for the users, when the user concerns the response
>> time. If the user needs to get the event or update from the firmware, it
>> can request the MRQ service as well. The user needs to take care of the
>> message format, which we call BPMP ABI.
>>
>> The BPMP ABI defines the message format for different modules or usages.
>> For example, the clock operation needs an MRQ service code called
>> MRQ_CLK with specific message format which includes different sub
>> commands for various clock operations. This is the message format that
>> BPMP can recognize.
>>
>> So the user needs two things to initiate IPC between BPMP. Get the
>> service from the bpmp_ops structure and maintain the message format as
>> the BPMP ABI defined.

>> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h

>> +#ifndef _ABI_BPMP_ABI_H_
>> +#define _ABI_BPMP_ABI_H_
>> +
>> +#ifdef LK
>> +#include <stdint.h>
>> +#endif
>> +
>> +#ifndef __ABI_PACKED
>> +#define __ABI_PACKED __attribute__((packed))
>> +#endif
>> +
>> +#ifdef NO_GCC_EXTENSIONS
>> +#define EMPTY char empty;
>> +#define EMPTY_ARRAY 1
>> +#else
>> +#define EMPTY
>> +#define EMPTY_ARRAY 0
>> +#endif
>> +
>> +#ifndef __UNION_ANON
>> +#define __UNION_ANON
>> +#endif
>> +/**
>> + * @file
>> + */
>> +
>> +
>> +/**
>> + * @defgroup MRQ MRQ Messages
>> + * @brief Messages sent to/from BPMP via IPC
>> + * @{
>> + *   @defgroup MRQ_Format Message Format
>> + *   @defgroup MRQ_Codes Message Request (MRQ) Codes
>> + *   @defgroup MRQ_Payloads Message Payloads
>> + *   @defgroup Error_Codes Error Codes
>> + * @}
>> + */
>
> ...
>
> There is a lot of stuff in this file, most of which we are not using
> now - this is ok, but unless this is a file synced from an outside
> resource maybe we should trim the structures we don't need and add
> them as we make use of them? It helps dividing the work in bite-size
> chunks.
>
> Regarding the documentation format of this file, is this valid kernel
> documentation since the adoption of Sphinx? Or is it whatever the
> origin is using?

This file is an ABI document published by the BPMP FW team. I believe it 
would be best to use it unmodified.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-06 11:39   ` Alexandre Courbot
  2016-07-06 16:39     ` Stephen Warren
@ 2016-07-06 16:47     ` Matt Longnecker
  2016-07-07  2:24       ` Alexandre Courbot
  2016-07-07  8:17     ` Joseph Lo
  2 siblings, 1 reply; 51+ messages in thread
From: Matt Longnecker @ 2016-07-06 16:47 UTC (permalink / raw)
  To: Alexandre Courbot, Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver, devicetree,
	Jassi Brar, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon

Alex,

On 07/06/2016 04:39 AM, Alexandre Courbot wrote:
>> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
>> >new file mode 100644
>> >index 000000000000..0aaef5960e29
>> >--- /dev/null
>> >+++ b/include/soc/tegra/bpmp_abi.h
>> >@@ -0,0 +1,1601 @@
>> >+/*
>> >+ * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
>> >+ *
>> >+ * This program is free software; you can redistribute it and/or modify it
>> >+ * under the terms and conditions of the GNU General Public License,
>> >+ * version 2, as published by the Free Software Foundation.
>> >+ *
>> >+ * This program is distributed in the hope it will be useful, but WITHOUT
>> >+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> >+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> >+ * more details.
>> >+ *
>> >+ * You should have received a copy of the GNU General Public License
>> >+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> >+ */
>> >+
>> >+#ifndef _ABI_BPMP_ABI_H_
>> >+#define _ABI_BPMP_ABI_H_
>> >+
>>
> ...
>
> There is a lot of stuff in this file, most of which we are not using
> now - this is ok, but unless this is a file synced from an outside
> resource maybe we should trim the structures we don't need and add
> them as we make use of them? It helps dividing the work in bite-size
> chunks.
>
> Regarding the documentation format of this file, is this valid kernel
> documentation since the adoption of Sphinx? Or is it whatever the
> origin is using?

bpmp_abi.h is meant to be delivered as-is from an NVIDIA internal repo
to a variety of OSes. Each of them has a different documentation
standard and coding standard.

I'd like to avoid trimming parts from the file (or even worse modifying 
parts of the file) so that future deliveries are trivial.

Thanks,
Matt Longnecker

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-06  9:06     ` Joseph Lo
  2016-07-06 12:23       ` Alexandre Courbot
@ 2016-07-06 16:50       ` Stephen Warren
  2016-07-07  6:49         ` Joseph Lo
  1 sibling, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-06 16:50 UTC (permalink / raw)
  To: Joseph Lo, Alexandre Courbot
  Cc: Thierry Reding, linux-tegra, linux-arm-kernel, Rob Herring,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon

On 07/06/2016 03:06 AM, Joseph Lo wrote:
> On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>>> interprocessor communication (IPC) for remote processors currently. The
>>> HSP HW modules support some different features for that, which are
>>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>>> doorbells. And there are multiple HSP HW instances on the chip. So the
>>> driver is extendable to support more features for different IPC
>>> requirement.
>>>
>>> The driver of remote processor can use it as a mailbox client and deal
>>> with the IPC protocol to synchronize the data communications.

>>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c

>>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>>> +{
>>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>>> +       ulong val;
>>> +       int master_id;
>>> +
>>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>> +                              HSP_DB_REG_PENDING);
>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>> HSP_DB_REG_PENDING, val);
>>> +
>>> +       spin_lock(&hsp_mbox->lock);
>>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>>> +               struct mbox_chan *chan;
>>> +               struct tegra_hsp_mbox_chan *mchan;
>>> +               int i;
>>> +
>>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>>
>> I wonder if this could not be optimized. You are doing a double loop
>> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
>> like the same master_id cannot be used twice (considering that the
>> inner loop only processes the first match), couldn't you just select
>> the free channel in of_hsp_mbox_xlate() by doing
>> &mbox->chans[master_id] (and returning an error if it is already
>> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
>> instead of having the inner loop below? That would remove the need for
>> the second loop.
>
> That was exactly what I did in the V1, which only supported one HSP
> sub-module per HSP HW block. So we can just use the master_id as the
> mbox channel ID.
>
> Meanwhile, V2 is intended to support multiple HSP sub-modules
> running on the same HSP HW block. The "ID"s of different modules
> could conflict, so I dropped the mechanism that used the master_id as
> the mbox channel ID.

I haven't looked at the code in this patch since I'm mainly concerned 
about the DT bindings. However, I will say that nothing in the change to 
the mailbox specifier in DT should have required /any/ changes to the 
code, except to add a single check to validate that the "mailbox type" 
encoded into the top 16 bits of the mailbox ID was 0, and hence
represented a doorbell rather than anything else. Any enhancements to 
support other mailbox types could have happened later, and I doubt would 
require anything dynamic even then.
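
As a rough illustration of the single check I mean, here is a userspace
sketch, not driver code. The helper name hsp_db_index_from_spec and the
mask/shift macros are hypothetical, and treating the low 16 bits as the
doorbell index is my assumption for the example; only "top 16 bits, 0
means doorbell" comes from the text above.

```c
#include <stdint.h>

#define HSP_MBOX_TYPE_SHIFT	16	/* type is in the top 16 bits */
#define HSP_MBOX_ID_MASK	0xffffu	/* assumed: index is in the low 16 bits */
#define HSP_MBOX_TYPE_DB	0u	/* 0 denotes a doorbell */

/*
 * Decode a mailbox specifier cell. Returns the doorbell index, or -1 if
 * the encoded mailbox type is anything other than a doorbell.
 */
static int hsp_db_index_from_spec(uint32_t spec)
{
	uint32_t type = spec >> HSP_MBOX_TYPE_SHIFT;

	if (type != HSP_MBOX_TYPE_DB)
		return -1;

	return (int)(spec & HSP_MBOX_ID_MASK);
}
```

With a check like this in the xlate path, the decoded index could still
be used directly as the channel number, with no dynamic allocation.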

>>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>>> +                            struct mbox_chan *mchan, int master_id)
>>> +{
>>> +       struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
>>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>>> +       int ret;
>>> +
>>> +       if (!hsp_mbox->db_irq) {
>>> +               int i;
>>> +
>>> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
>>
>> Getting the IRQ sounds more like a job for probe() - I don't see the
>> benefit of lazy-doing it?
>
> We only need the IRQ when a client requests the DB service. The
> other HSP sub-modules use different IRQs, so I didn't do that
> at probe time.

All resources provided by other devices/drivers must be acquired at 
probe time, since that's the only time it's possible to defer probe if 
the provider of the resource is not available.

If you don't follow that rule, what happens is:

1) This driver probes.

2) Some other driver calls tegra_hsp_db_init(), and it fails since the 
provider of the IRQ is not yet available. This likely ends up returning 
something other than -EPROBE_DEFER since the HSP driver was found 
successfully (thus there is no deferred probe situation as far as the 
mailbox core is concerned), it's just that the mailbox channel 
lookup/init/... failed.

3) The other driver's probe() fails due to this, but since the error 
wasn't a probe deferral, the other driver's probe() is never retried.
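
The sequence above can be modelled with a small userspace sketch. This
is a toy, not kernel code: irq_provider_ready, the two consumer_probe_*
functions and driver_core_probe are all hypothetical names, and the
"driver core" here simply retries a probe only when it sees the deferral
code. The numeric value 517 is the kernel's EPROBE_DEFER; everything
else is an assumption made for the illustration.

```c
#include <stdbool.h>

/* Prefixed so they cannot clash with <errno.h>; values mirror the kernel's. */
enum { MODEL_ENODEV = 19, MODEL_EPROBE_DEFER = 517 };

/* Hypothetical flag: becomes true once the IRQ provider has probed. */
static bool irq_provider_ready;

/* Lazy acquisition: the missing IRQ surfaces later as a hard error. */
static int consumer_probe_lazy(void)
{
	return irq_provider_ready ? 0 : -MODEL_ENODEV;
}

/* Probe-time acquisition: the missing IRQ is reported as a deferral. */
static int consumer_probe_deferred(void)
{
	return irq_provider_ready ? 0 : -MODEL_EPROBE_DEFER;
}

/*
 * Toy driver core: a probe is retried only if it asked for deferral.
 * Any other error is final, even if the provider shows up afterwards.
 */
static int driver_core_probe(int (*probe)(void), int max_rounds)
{
	int ret = -MODEL_ENODEV;
	int round;

	for (round = 0; round < max_rounds; round++) {
		ret = probe();
		if (ret != -MODEL_EPROBE_DEFER)
			return ret;
		/* The provider appears before the next retry. */
		irq_provider_ready = true;
	}

	return ret;
}
```

In this model the lazy variant fails permanently on the first attempt,
while the probe-time variant succeeds on the retry, which is the
difference steps 2) and 3) describe.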

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
@ 2016-07-06 17:02   ` Stephen Warren
  2016-07-07  6:24     ` Joseph Lo
  2016-07-07 18:13   ` Sivaram Nair
  1 sibling, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-06 17:02 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/05/2016 03:04 AM, Joseph Lo wrote:
> Add DT binding for the Hardware Synchronization Primitives (HSP). The
> HSP is designed for the processors to share resources and communicate
> together. It provides a set of hardware synchronization primitives for
> interprocessor communication. So the interprocessor communication (IPC)
> protocols can use hardware synchronization primitive, when operating
> between two processors not in an SMP relationship.

Acked-by: Stephen Warren <swarren@nvidia.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
  2016-07-06 11:42   ` Alexandre Courbot
@ 2016-07-06 17:03   ` Stephen Warren
  2016-07-07  6:26     ` Joseph Lo
  2016-07-11 14:22   ` Rob Herring
  2016-07-13 19:41   ` Stephen Warren
  3 siblings, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-06 17:03 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/05/2016 03:04 AM, Joseph Lo wrote:
> The BPMP is a specific processor in Tegra chip, which is designed for
> booting process handling and offloading the power management, clock
> management, and reset control tasks from the CPU. The binding document
> defines the resources that would be used by the BPMP firmware driver,
> which can create the interprocessor communication (IPC) between the CPU
> and BPMP.

Acked-by: Stephen Warren <swarren@nvidia.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-06 16:47     ` Matt Longnecker
@ 2016-07-07  2:24       ` Alexandre Courbot
  0 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-07  2:24 UTC (permalink / raw)
  To: Matt Longnecker
  Cc: Joseph Lo, Stephen Warren, Thierry Reding, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	devicetree, Jassi Brar, Linux Kernel Mailing List,
	Catalin Marinas, Will Deacon

On Thu, Jul 7, 2016 at 1:47 AM, Matt Longnecker <mlongnecker@nvidia.com> wrote:
> Alex,
>
>
> On 07/06/2016 04:39 AM, Alexandre Courbot wrote:
>>>
>>> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
>>> >new file mode 100644
>>> >index 000000000000..0aaef5960e29
>>> >--- /dev/null
>>> >+++ b/include/soc/tegra/bpmp_abi.h
>>> >@@ -0,0 +1,1601 @@
>>> >+/*
>>> >+ * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
>>> >+ *
>>> >+ * This program is free software; you can redistribute it and/or modify
>>> > it
>>> >+ * under the terms and conditions of the GNU General Public License,
>>> >+ * version 2, as published by the Free Software Foundation.
>>> >+ *
>>> >+ * This program is distributed in the hope it will be useful, but
>>> > WITHOUT
>>> >+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>>> > or
>>> >+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>>> > License for
>>> >+ * more details.
>>> >+ *
>>> >+ * You should have received a copy of the GNU General Public License
>>> >+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>>> >+ */
>>> >+
>>> >+#ifndef _ABI_BPMP_ABI_H_
>>> >+#define _ABI_BPMP_ABI_H_
>>> >+
>>>
>> ...
>>
>> There is a lot of stuff in this file, most of which we are not using
>> now - this is ok, but unless this is a file synced from an outside
>> resource maybe we should trim the structures we don't need and add
>> them as we make use of them? It helps dividing the work in bite-size
>> chunks.
>>
>> Regarding the documentation format of this file, is this valid kernel
>> documentation since the adoption of Sphinx? Or is it whatever the
>> origin is using?
>
> bpmp_abi.h is meant to be delivered as is from an NVIDIA internal repo to a
> variety of OS'es. Each of them has a different documentation standard and
> coding standard.
>
> I'd like to avoid trimming parts from the file (or even worse modifying
> parts of the file) so that future deliveries are trivial.

Makes sense, thanks to you and Stephen for the clarification.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-06 17:02   ` Stephen Warren
@ 2016-07-07  6:24     ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  6:24 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/07/2016 01:02 AM, Stephen Warren wrote:
> On 07/05/2016 03:04 AM, Joseph Lo wrote:
>> Add DT binding for the Hardware Synchronization Primitives (HSP). The
>> HSP is designed for the processors to share resources and communicate
>> together. It provides a set of hardware synchronization primitives for
>> interprocessor communication. So the interprocessor communication (IPC)
>> protocols can use hardware synchronization primitive, when operating
>> between two processors not in an SMP relationship.
>
> Acked-by: Stephen Warren <swarren@nvidia.com>

Thanks,
-Joseph


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-06 11:42   ` Alexandre Courbot
@ 2016-07-07  6:25     ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  6:25 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On 07/06/2016 07:42 PM, Alexandre Courbot wrote:
> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> The BPMP is a specific processor in Tegra chip, which is designed for
>> booting process handling and offloading the power management, clock
>> management, and reset control tasks from the CPU. The binding document
>> defines the resources that would be used by the BPMP firmware driver,
>> which can create the interprocessor communication (IPC) between the CPU
>> and BPMP.
>>
>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>> ---
>> Changes in V2:
>> - update the message that the BPMP is clock and reset control provider
>> - add tegra186-clock.h and tegra186-reset.h header files
>> - revise the description of the required properties
>> ---
>>   .../bindings/firmware/nvidia,tegra186-bpmp.txt     |  77 ++
>>   include/dt-bindings/clock/tegra186-clock.h         | 940 +++++++++++++++++++++
>>   include/dt-bindings/reset/tegra186-reset.h         | 217 +++++
>>   3 files changed, 1234 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>>   create mode 100644 include/dt-bindings/clock/tegra186-clock.h
>>   create mode 100644 include/dt-bindings/reset/tegra186-reset.h
>>
>> diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>> new file mode 100644
>> index 000000000000..4d0b6eba56c5
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>> @@ -0,0 +1,77 @@
>> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
>> +
>> +The BPMP is a specific processor in Tegra chip, which is designed for
>> +booting process handling and offloading the power management, clock
>> +management, and reset control tasks from the CPU. The binding document
>> +defines the resources that would be used by the BPMP firmware driver,
>> +which can create the interprocessor communication (IPC) between the CPU
>> +and BPMP.
>> +
>> +Required properties:
>> +- name : Should be bpmp
>> +- compatible
>> +    Array of strings
>> +    One of:
>> +    - "nvidia,tegra186-bpmp"
>> +- mboxes : The phandle of mailbox controller and the mailbox specifier.
>> +- shmem : List of the phandle of the TX and RX shared memory area that
>> +         the IPC between CPU and BPMP is based on.
>> +- #clock-cells : Should be 1.
>> +- #reset-cells : Should be 1.
>> +
>> +This node is a mailbox consumer. See the following files for details of
>> +the mailbox subsystem, and the specifiers implemented by the relevant
>> +provider(s):
>> +
>> +- Documentation/devicetree/bindings/mailbox/mailbox.txt
>> +- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
>> +
>> +This node is a clock and reset provider. See the following files for
>> +general documentation of those features, and the specifiers implemented
>> +by this node:
>> +
>> +- Documentation/devicetree/bindings/clock/clock-bindings.txt
>> +- include/dt-bindings/clock/tegra186-clock.h
>> +- Documentation/devicetree/bindings/reset/reset.txt
>> +- include/dt-bindings/reset/tegra186-reset.h
>> +
>> +The shared memory bindings for BPMP
>> +-----------------------------------
>> +
>> +The shared memory area for the IPC TX and RX between CPU and BPMP are
>> +predefined and work on top of sysram, which is an SRAM inside the chip.
>> +
>> +See "Documentation/devicetree/bindings/sram/sram.txt" for the bindings.
>> +
>> +Example:
>> +
>> +hsp_top0: hsp@03c00000 {
>> +       ...
>> +       #mbox-cells = <1>;
>> +};
>> +
>> +sysram@30000000 {
>> +       compatible = "nvidia,tegra186-sysram", "mmio-ram";
>
> Shouldn't the second compatible be "mmio-sram"?
>
> If so, then you have the same typo in tegra186.dtsi as well.
>

Good catch, will fix.

Thanks,
-Joseph

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-06 17:03   ` Stephen Warren
@ 2016-07-07  6:26     ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  6:26 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/07/2016 01:03 AM, Stephen Warren wrote:
> On 07/05/2016 03:04 AM, Joseph Lo wrote:
>> The BPMP is a specific processor in Tegra chip, which is designed for
>> booting process handling and offloading the power management, clock
>> management, and reset control tasks from the CPU. The binding document
>> defines the resources that would be used by the BPMP firmware driver,
>> which can create the interprocessor communication (IPC) between the CPU
>> and BPMP.
>
> Acked-by: Stephen Warren <swarren@nvidia.com>

Thanks,
-Joseph

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-06 12:23       ` Alexandre Courbot
@ 2016-07-07  6:37         ` Joseph Lo
  2016-07-07 21:33           ` Sivaram Nair
  0 siblings, 1 reply; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  6:37 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On 07/06/2016 08:23 PM, Alexandre Courbot wrote:
> On Wed, Jul 6, 2016 at 6:06 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
>>>
>>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>>
>>>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>>>> interprocessor communication (IPC) for remote processors currently. The
>>>> HSP HW modules support some different features for that, which are
>>>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>>>> doorbells. And there are multiple HSP HW instances on the chip. So the
>>>> driver is extendable to support more features for different IPC
>>>> requirement.
>>>>
>>>> The driver of remote processor can use it as a mailbox client and deal
>>>> with the IPC protocol to synchronize the data communications.
>>>>
>>>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>>>> ---
>>>> Changes in V2:
>>>> - Update the driver to support the binding changes in V2
>>>> - it's extendable to support multiple HSP sub-modules on the same HSP HW
>>>> block
>>>>     now.
>>>> ---
>>>>    drivers/mailbox/Kconfig     |   9 +
>>>>    drivers/mailbox/Makefile    |   2 +
>>>>    drivers/mailbox/tegra-hsp.c | 418
>>>> ++++++++++++++++++++++++++++++++++++++++++++
>>>>    3 files changed, 429 insertions(+)
>>>>    create mode 100644 drivers/mailbox/tegra-hsp.c
>>>>
>>>> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
>>>> index 5305923752d2..fe584cb54720 100644
>>>> --- a/drivers/mailbox/Kconfig
>>>> +++ b/drivers/mailbox/Kconfig
>>>> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>>>>             Test client to help with testing new Controller driver
>>>>             implementations.
>>>>
>>>> +config TEGRA_HSP_MBOX
>>>> +       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
>>>
>>>
>>> Space missing before the opening parenthesis (same in the patch title
>>> btw).
>>
>> Okay.
>>>
>>>
>>>> +       depends on ARCH_TEGRA_186_SOC
>>>> +       help
>>>> +         The Tegra HSP driver is used for the interprocessor
>>>> communication
>>>> +         between different remote processors and host processors on
>>>> Tegra186
>>>> +         and later SoCs. Say Y here if you want to have this support.
>>>> +         If unsure say N.
>>>
>>>
>>> Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
>>> should probably drop the last 2 sentences.
>>
>> Okay.
>>
>>>
>>>> +
>>>>    config XGENE_SLIMPRO_MBOX
>>>>           tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>>>>           depends on ARCH_XGENE
>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>> index 0be3e742bb7d..26d8f91c7fea 100644
>>>> --- a/drivers/mailbox/Makefile
>>>> +++ b/drivers/mailbox/Makefile
>>>> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>>>>    obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>>>>
>>>>    obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
>>>> +
>>>> +obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
>>>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>>>> new file mode 100644
>>>> index 000000000000..93c3ef58f29f
>>>> --- /dev/null
>>>> +++ b/drivers/mailbox/tegra-hsp.c
>>>> @@ -0,0 +1,418 @@
>>>> +/*
>>>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> it
>>>> + * under the terms and conditions of the GNU General Public License,
>>>> + * version 2, as published by the Free Software Foundation.
>>>> + *
>>>> + * This program is distributed in the hope it will be useful, but
>>>> WITHOUT
>>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> for
>>>> + * more details.
>>>> + */
>>>> +
>>>> +#include <linux/interrupt.h>
>>>> +#include <linux/io.h>
>>>> +#include <linux/mailbox_controller.h>
>>>> +#include <linux/of.h>
>>>> +#include <linux/of_device.h>
>>>> +#include <linux/platform_device.h>
>>>> +#include <dt-bindings/mailbox/tegra186-hsp.h>
>>>> +
>>>> +#define HSP_INT_DIMENSIONING   0x380
>>>> +#define HSP_nSM_OFFSET         0
>>>> +#define HSP_nSS_OFFSET         4
>>>> +#define HSP_nAS_OFFSET         8
>>>> +#define HSP_nDB_OFFSET         12
>>>> +#define HSP_nSI_OFFSET         16
>>>
>>>
>>> Would be nice to have comments to understand what SM, SS, AS, etc.
>>> stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
>>> but you need to look at the patch description to understand that). A
>>> top-of-file comment explaining the necessary concepts to read this code
>>> would do the trick.
>>
>> Yes, will fix that.
>>>
>>>
>>>> +#define HSP_nINT_MASK          0xf
>>>> +
>>>> +#define HSP_DB_REG_TRIGGER     0x0
>>>> +#define HSP_DB_REG_ENABLE      0x4
>>>> +#define HSP_DB_REG_RAW         0x8
>>>> +#define HSP_DB_REG_PENDING     0xc
>>>> +
>>>> +#define HSP_DB_CCPLEX          1
>>>> +#define HSP_DB_BPMP            3
>>>
>>>
>>> Maybe turn this into enum and use that type for
>>> tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
>>> related to these values?
>>
>> Okay.
>>
>>>
>>>> +
>>>> +#define MAX_NUM_HSP_CHAN 32
>>>> +#define MAX_NUM_HSP_DB 7
>>>> +
>>>> +#define hsp_db_offset(i, d) \
>>>> +       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) +
>>>> \
>>>> +       (i) * 0x100)
>>>> +
>>>> +struct tegra_hsp_db_chan {
>>>> +       int master_id;
>>>> +       int db_id;
>>>> +};
>>>> +
>>>> +struct tegra_hsp_mbox_chan {
>>>> +       int type;
>>>> +       union {
>>>> +               struct tegra_hsp_db_chan db_chan;
>>>> +       };
>>>> +};
>>>> +
>>>> +struct tegra_hsp_mbox {
>>>> +       struct mbox_controller *mbox;
>>>> +       void __iomem *base;
>>>> +       void __iomem *db_base[MAX_NUM_HSP_DB];
>>>> +       int db_irq;
>>>> +       int nr_sm;
>>>> +       int nr_as;
>>>> +       int nr_ss;
>>>> +       int nr_db;
>>>> +       int nr_si;
>>>> +       spinlock_t lock;
>>>> +};
>>>> +
>>>> +static inline u32 hsp_readl(void __iomem *base, int reg)
>>>> +{
>>>> +       return readl(base + reg);
>>>> +}
>>>> +
>>>> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
>>>> +{
>>>> +       writel(val, base + reg);
>>>> +       readl(base + reg);
>>>> +}
>>>> +
>>>> +static int hsp_db_can_ring(void __iomem *db_base)
>>>> +{
>>>> +       u32 reg;
>>>> +
>>>> +       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
>>>> +
>>>> +       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
>>>> +}
>>>> +
>>>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>>>> +{
>>>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>>>> +       ulong val;
>>>> +       int master_id;
>>>> +
>>>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>> +                              HSP_DB_REG_PENDING);
>>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING,
>>>> val);
>>>> +
>>>> +       spin_lock(&hsp_mbox->lock);
>>>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>>>> +               struct mbox_chan *chan;
>>>> +               struct tegra_hsp_mbox_chan *mchan;
>>>> +               int i;
>>>> +
>>>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>>>
>>>
>>> I wonder if this could not be optimized. You are doing a double loop
>>> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
>>> like the same master_id cannot be used twice (considering that the
>>> inner loop only processes the first match), couldn't you just select
>>> the free channel in of_hsp_mbox_xlate() by doing
>>> &mbox->chans[master_id] (and returning an error if it is already
>>> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
>>> instead of having the inner loop below? That would remove the need for
>>> the second loop.
>>
>>
>> That was exactly what I did in the V1, which only supported one HSP
>> sub-module per HSP HW block. So we can just use the master_id as the mbox
>> channel ID.
>>
>> Meanwhile, the V2 is purposed to support multiple HSP sub-modules to be
>> running on the same HSP HW block. The "ID" between different modules could
>> be conflict. So I dropped the mechanism that used the master_id as the mbox
>> channel ID.
>>
>> Instead, the channel is allocated at the time, when the client is bound to
>> one of the HSP sub-modules. And we store the "ID" information into the
>> private mbox channel data, which can help us to figure out which mbox
>> channel should response to the interrupt.
>>
>> In the doorbell case, because all the DB clients are shared the same DB IRQ
>> at the CPU side. So in the ISR, we need to figure out the IRQ source, which
>> is the master_id that the IRQ came from. This is the outer loop. The inner
>> loop, we figure out which channel should response to by checking the type
>> and ID.
>>
>> And I think it should be pretty quick, because we only check the set bit
>> from the pending register. And finding the matching channel.
>
> Yeah, I am not worried about the CPU time (although in interrupt
> context, we always should), but rather about whether the code could be
> simplified.
>
> Ah, I think I get it. You want to be able to receive interrupts from
> the same master, but not necessarily for the doorbell function.
> Because of this you cannot use master_id as the index for the channel.
> Am I understanding correctly?

Yes, the DB clients all trigger the IRQ through the same doorbell
(HSP_DB_CCPLEX), each with its own master_id. We (the CPU) can check
the ID to know who is requesting the HSP mbox service. Each ID is
unique within the DB module.

But the IDs could conflict when the HSP mbox driver is working with
multiple HSP sub-functions under the same HSP HW block, so we can't
just use the ID as the HSP mbox channel ID.
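
To make the lookup concrete, here is a rough userspace model of matching
a channel on the (type, ID) pair instead of on the ID alone. It is only a
sketch: hsp_chan, hsp_find_chan, NR_CHAN and the HSP_TYPE_* values are
made-up names for illustration, not the driver's.

```c
#include <stddef.h>

/* Hypothetical model; none of these names come from the driver. */
enum hsp_type { HSP_TYPE_DB, HSP_TYPE_SM };

struct hsp_chan {
	int in_use;		/* set once a client is bound */
	enum hsp_type type;	/* HSP sub-module that owns the channel */
	int id;			/* unique only within that sub-module */
};

#define NR_CHAN 32

/* Return the bound channel matching (type, id), or NULL if there is none. */
static struct hsp_chan *hsp_find_chan(struct hsp_chan *chans,
				      enum hsp_type type, int id)
{
	size_t i;

	for (i = 0; i < NR_CHAN; i++) {
		if (chans[i].in_use && chans[i].type == type &&
		    chans[i].id == id)
			return &chans[i];
	}

	return NULL;
}
```

With this, two channels may carry the same numeric ID as long as they
belong to different sub-modules, which is why the ID alone can't be the
index into the channel array.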

>
>>
>>>
>>> If having two channels use the same master_id is a valid scenario,
>>> then all matches on master_id should probably be processed, not just
>>> the first one.
>>
>> Each DB channel should have different master_id.
>>
>>
>>>
>>>> +                       chan = &hsp_mbox->mbox->chans[i];
>>>> +
>>>> +                       if (!chan->con_priv)
>>>> +                               continue;
>>>> +
>>>> +                       mchan = chan->con_priv;
>>>> +                       if (mchan->type == HSP_MBOX_TYPE_DB &&
>>>> +                           mchan->db_chan.master_id == master_id)
>>>> +                               break;
>>>> +                       chan = NULL;
>>>> +               }
>>>> +
>>>> +               if (chan)
>>>> +                       mbox_chan_received_data(chan, NULL);
>>>> +       }
>>>> +       spin_unlock(&hsp_mbox->lock);
>>>> +
>>>> +       return IRQ_HANDLED;
>>>> +}
>>>> +
>>>> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
>>>> +{
>>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>>> dev_get_drvdata(chan->mbox->dev);
>>>> +
>>>> +       hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER,
>>>> 1);
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static int hsp_db_startup(struct mbox_chan *chan)
>>>> +{
>>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>>> dev_get_drvdata(chan->mbox->dev);
>>>> +       u32 val;
>>>> +       unsigned long flag;
>>>> +
>>>> +       if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
>>>> +               dev_err(chan->mbox->dev, "invalid HSP chan: master ID:
>>>> %d\n",
>>>> +                       db_chan->master_id);
>>>> +               return -EINVAL;
>>>> +       }
>>>> +
>>>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>>>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>> HSP_DB_REG_ENABLE);
>>>> +       val |= BIT(db_chan->master_id);
>>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE,
>>>> val);
>>>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>>>> +
>>>> +       if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
>>>> +               return -ENODEV;
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static void hsp_db_shutdown(struct mbox_chan *chan)
>>>> +{
>>>> +       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>>>> +       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>>>> +       struct tegra_hsp_mbox *hsp_mbox =
>>>> dev_get_drvdata(chan->mbox->dev);
>>>> +       u32 val;
>>>> +       unsigned long flag;
>>>> +
>>>> +       spin_lock_irqsave(&hsp_mbox->lock, flag);
>>>> +       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>> HSP_DB_REG_ENABLE);
>>>> +       val &= ~BIT(db_chan->master_id);
>>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE,
>>>> val);
>>>> +       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>>>> +}
>>>> +
>>>> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
>>>> +{
>>>> +       return true;
>>>> +}
>>>> +
>>>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>>>> +                            struct mbox_chan *mchan, int master_id)
>>>> +{
>>>> +       struct platform_device *pdev =
>>>> to_platform_device(hsp_mbox->mbox->dev);
>>>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>>>> +       int ret;
>>>> +
>>>> +       if (!hsp_mbox->db_irq) {
>>>> +               int i;
>>>> +
>>>> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev,
>>>> "doorbell");
>>>
>>>
>>> Getting the IRQ sounds more like a job for probe() - I don't see the
>>> benefit of lazy-doing it?
>>
>>
>> We only need the IRQ when the client is requesting the DB service. For other
>> HSP sub-modules, they are using different IRQ. So I didn't do that at probe
>> time.
>
> Ok, but probe() is where resources should be acquired... and at the
> very least DT properties be looked up. In this case there is no hard
> requirement for doing it elsewhere.
>
> Is this interrupt absolutely required? Or can we tolerate to not use
> the doorbell service? In the first case, the driver should fail during
> probe(), not sometime later. In the second case, you should still get
> all the interrupts in probe(), then disable them if they are not
> needed, and check in this function whether db_irq is a valid interrupt
> number to decide whether or not we can use doorbell.

Ah, I understand your concern now. It should be ok to move to probe(). 
Will fix that.

>
>>
>>>
>>>> +               ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
>>>> +                                      hsp_db_irq, IRQF_NO_SUSPEND,
>>>> +                                      dev_name(&pdev->dev), hsp_mbox);
>>>> +               if (ret)
>>>> +                       return ret;
>>>> +
>>>> +               for (i = 0; i < MAX_NUM_HSP_DB; i++)
>>>> +                       hsp_mbox->db_base[i] = hsp_db_offset(i,
>>>> hsp_mbox);
>>>
>>>
>>> Same here, cannot this be moved into probe()?
>>
>> Same as above, only needed when the client requests it.
>
> But you don't waste any resources by doing it preemptively in probe().
> So let's keep related code in the same place.

Okay.

Thanks,
-Joseph

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-06 16:50       ` Stephen Warren
@ 2016-07-07  6:49         ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  6:49 UTC (permalink / raw)
  To: Stephen Warren, Alexandre Courbot
  Cc: Thierry Reding, linux-tegra, linux-arm-kernel, Rob Herring,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon

On 07/07/2016 12:50 AM, Stephen Warren wrote:
> On 07/06/2016 03:06 AM, Joseph Lo wrote:
>> On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
>>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>>>> interprocessor communication (IPC) for remote processors currently. The
>>>> HSP HW modules support some different features for that, which are
>>>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>>>> doorbells. And there are multiple HSP HW instances on the chip. So the
>>>> driver is extendable to support more features for different IPC
>>>> requirement.
>>>>
>>>> The driver of remote processor can use it as a mailbox client and deal
>>>> with the IPC protocol to synchronize the data communications.
>
>>>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>
>>>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>>>> +{
>>>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>>>> +       ulong val;
>>>> +       int master_id;
>>>> +
>>>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>> +                              HSP_DB_REG_PENDING);
>>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>> HSP_DB_REG_PENDING, val);
>>>> +
>>>> +       spin_lock(&hsp_mbox->lock);
>>>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>>>> +               struct mbox_chan *chan;
>>>> +               struct tegra_hsp_mbox_chan *mchan;
>>>> +               int i;
>>>> +
>>>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>>>
>>> I wonder if this could not be optimized. You are doing a double loop
>>> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
>>> like the same master_id cannot be used twice (considering that the
>>> inner loop only processes the first match), couldn't you just select
>>> the free channel in of_hsp_mbox_xlate() by doing
>>> &mbox->chans[master_id] (and returning an error if it is already
>>> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
>>> instead of having the inner loop below? That would remove the need for
>>> the second loop.
>>
>> That was exactly what I did in the V1, which only supported one HSP
>> sub-module per HSP HW block. So we can just use the master_id as the
>> mbox channel ID.
>>
>> Meanwhile, the V2 is purposed to support multiple HSP sub-modules to be
>> running on the same HSP HW block. The "ID" between different modules
>> could be conflict. So I dropped the mechanism that used the master_id as
>> the mbox channel ID.
>
> I haven't looked at the code in this patch since I'm mainly concerned
> about the DT bindings. However, I will say that nothing in the change to
> the mailbox specifier in DT should have required /any/ changes to the
> code, except to add a single check to validate that the "mailbox type"
> encoded into the top 16 bits of the mailbox ID were 0, and hence
> represented a doorbell rather than anything else. Any enhancements to
> support other mailbox types could have happened later, and I doubt would
> require anything dynamic even then.

Yes, I only added the code for that change, plus maybe some glue code
for extensibility, so that more HSP modules can be supported in the
future.
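
For reference, the check described above could look roughly like the
sketch below. The only facts taken from this thread are that the type
lives in the top 16 bits of the specifier and that 0 means doorbell; the
macro and function names are hypothetical, and the return convention is
just one possible choice.

```c
#include <stdint.h>

/* Hypothetical names; only the bit layout comes from the discussion. */
#define HSP_SPEC_TYPE_SHIFT	16
#define HSP_SPEC_ID_MASK	0xffffu
#define HSP_SPEC_TYPE_DB	0u

/* Mailbox type: the top 16 bits of the specifier cell. */
static inline uint32_t hsp_spec_type(uint32_t spec)
{
	return spec >> HSP_SPEC_TYPE_SHIFT;
}

/* ID within that type: the low 16 bits. */
static inline uint32_t hsp_spec_id(uint32_t spec)
{
	return spec & HSP_SPEC_ID_MASK;
}

/* Return the ID if the specifier names a doorbell, -1 otherwise. */
static inline int hsp_spec_to_db_id(uint32_t spec)
{
	if (hsp_spec_type(spec) != HSP_SPEC_TYPE_DB)
		return -1;

	return (int)hsp_spec_id(spec);
}
```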

>
>>>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>>>> +                            struct mbox_chan *mchan, int master_id)
>>>> +{
>>>> +       struct platform_device *pdev =
>>>> to_platform_device(hsp_mbox->mbox->dev);
>>>> +       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>>>> +       int ret;
>>>> +
>>>> +       if (!hsp_mbox->db_irq) {
>>>> +               int i;
>>>> +
>>>> +               hsp_mbox->db_irq = platform_get_irq_byname(pdev,
>>>> "doorbell");
>>>
>>> Getting the IRQ sounds more like a job for probe() - I don't see the
>>> benefit of lazy-doing it?
>>
>> We only need the IRQ when the client is requesting the DB service. For
>> other HSP sub-modules, they are using different IRQ. So I didn't do that
>> at probe time.
>
> All resources provided by other devices/drivers must be acquired at
> probe time, since that's the only time it's possible to defer probe if
> the provider of the resource is not available.
>
> If you don't follow that rule, what happens is:
>
> 1) This driver probes.
>
> 2) Some other driver calls tegra_hsp_db_init(), and it fails since the
> provider of the IRQ is not yet available. This likely ends up returning
> something other than -EPROBE_DEFER since the HSP driver was found
> successfully (thus there is no deferred probe situation as far as the
> mailbox core is concerned), it's just that the mailbox channel
> lookup/init/... failed.
>
> 3) The other driver's probe() fails due to this, but since the error
> wasn't a probe deferral, the other driver's probe() is never retried.

Agreed, will fix this.
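
Just to restate the failure sequence for myself, here is a small
userspace model of it. This is not kernel code: get_irq(), the
probe/init helpers and the -EINVAL returned on the lazy path are
assumptions for illustration. Only the EPROBE_DEFER value (517) matches
what the kernel defines.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* same value as in the kernel */
#endif

/* Models whether the IRQ provider's driver has probed yet. */
static bool irq_provider_ready;

/* Hypothetical stand-in for looking up the doorbell IRQ. */
static int get_irq(void)
{
	return irq_provider_ready ? 42 : -EPROBE_DEFER;
}

/* Eager: look up the IRQ in probe() and propagate the deferral. */
static int hsp_probe_eager(void)
{
	int irq = get_irq();

	return irq < 0 ? irq : 0;
}

/* Lazy: probe() always succeeds; the IRQ is looked up at channel init. */
static int hsp_probe_lazy(void)
{
	return 0;
}

static int hsp_chan_init_lazy(void)
{
	/* The deferral is lost here; the caller only sees a generic error. */
	return get_irq() < 0 ? -EINVAL : 0;
}

/* A client defers only while the HSP driver itself is not bound. */
static int client_probe(bool hsp_bound, bool lazy)
{
	if (!hsp_bound)
		return -EPROBE_DEFER;

	return lazy ? hsp_chan_init_lazy() : 0;
}
```

In the eager case the HSP probe itself is deferred, so the client also
sees -EPROBE_DEFER and gets retried. In the lazy case the client gets a
hard error and is never retried.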

Thanks,
-Joseph

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-06 11:39   ` Alexandre Courbot
  2016-07-06 16:39     ` Stephen Warren
  2016-07-06 16:47     ` Matt Longnecker
@ 2016-07-07  8:17     ` Joseph Lo
  2016-07-07 10:18       ` Alexandre Courbot
  2 siblings, 1 reply; 51+ messages in thread
From: Joseph Lo @ 2016-07-07  8:17 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On 07/06/2016 07:39 PM, Alexandre Courbot wrote:
> Sorry, I will probably need to do several passes on this one to
> understand everything, but here is what I can say after a first look:
>
> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> The Tegra BPMP (Boot and Power Management Processor) is designed for the
>> booting process handling, offloading the power management tasks and
>> some system control services from the CPU. It can be clock, DVFS,
>> thermal/EDP, power gating operation and system suspend/resume handling.
>> So the CPU and the drivers of these modules can base on the service that
>> the BPMP firmware driver provided to signal the event for the specific PM
>> action to BPMP and receive the status update from BPMP.
>>
>> Comparing to the ARM SCPI, the service provided by BPMP is message-based
>> communication but not method-based. The BPMP firmware driver provides the
>> send/receive service for the users, when the user concerns the response
>> time. If the user needs to get the event or update from the firmware, it
>> can request the MRQ service as well. The user needs to take care of the
>> message format, which we call BPMP ABI.
>>
>> The BPMP ABI defines the message format for different modules or usages.
>> For example, the clock operation needs an MRQ service code called
>> MRQ_CLK with specific message format which includes different sub
>> commands for various clock operations. This is the message format that
>> BPMP can recognize.
>>
>> So the user needs two things to initiate IPC between BPMP. Get the
>> service from the bpmp_ops structure and maintain the message format as
>> the BPMP ABI defined.
>>
>> Based-on-the-work-by:
>> Sivaram Nair <sivaramn@nvidia.com>
>>
>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>> ---
>> Changes in V2:
>> - None
>> ---
>>   drivers/firmware/tegra/Kconfig  |   12 +
>>   drivers/firmware/tegra/Makefile |    1 +
>>   drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
>>   include/soc/tegra/bpmp.h        |   29 +
>>   include/soc/tegra/bpmp_abi.h    | 1601 +++++++++++++++++++++++++++++++++++++++
>>   5 files changed, 2356 insertions(+)
>>   create mode 100644 drivers/firmware/tegra/bpmp.c
>>   create mode 100644 include/soc/tegra/bpmp.h
>>   create mode 100644 include/soc/tegra/bpmp_abi.h
>>
>> diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
>> index 1fa3e4e136a5..ff2730d5c468 100644
>> --- a/drivers/firmware/tegra/Kconfig
>> +++ b/drivers/firmware/tegra/Kconfig
>> @@ -10,4 +10,16 @@ config TEGRA_IVC
>>            keeps the content is synchronization between host CPU and remote
>>            processors.
>>
>> +config TEGRA_BPMP
>> +       bool "Tegra BPMP driver"
>> +       depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
>> +       help
>> +         BPMP (Boot and Power Management Processor) is designed to off-loading
>
> s/off-loading/off-load
>
>> +         the PM functions which include clock/DVFS/thermal/power from the CPU.
>> +         It needs HSP as the HW synchronization and notification module and
>> +         IVC module as the message communication protocol.
>> +
>> +         This driver manages the IPC interface between host CPU and the
>> +         firmware running on BPMP.
>> +
>>   endmenu
>> diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
>> index 92e2153e8173..e34a2f79e1ad 100644
>> --- a/drivers/firmware/tegra/Makefile
>> +++ b/drivers/firmware/tegra/Makefile
>> @@ -1 +1,2 @@
>> +obj-$(CONFIG_TEGRA_BPMP)       += bpmp.o
>>   obj-$(CONFIG_TEGRA_IVC)                += ivc.o
>> diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
>> new file mode 100644
>> index 000000000000..24fda626610e
>> --- /dev/null
>> +++ b/drivers/firmware/tegra/bpmp.c
>> @@ -0,0 +1,713 @@
>> +/*
>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#include <linux/mailbox_client.h>
>> +#include <linux/of.h>
>> +#include <linux/of_address.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/semaphore.h>
>> +
>> +#include <soc/tegra/bpmp.h>
>> +#include <soc/tegra/bpmp_abi.h>
>> +#include <soc/tegra/ivc.h>
>> +
>> +#define BPMP_MSG_SZ            128
>> +#define BPMP_MSG_DATA_SZ       120
>> +
>> +#define __MRQ_ATTRS            0xff000000
>> +#define __MRQ_INDEX(id)                ((id) & ~__MRQ_ATTRS)
>> +
>> +#define DO_ACK                 BIT(0)
>> +#define RING_DOORBELL          BIT(1)
>> +
>> +struct tegra_bpmp_soc_data {
>> +       u32 ch_index;           /* channel index */
>> +       u32 thread_ch_index;    /* thread channel index */
>> +       u32 cpu_rx_ch_index;    /* CPU Rx channel index */
>> +       u32 nr_ch;              /* number of total channels */
>> +       u32 nr_thread_ch;       /* number of thread channels */
>> +       u32 ch_timeout;         /* channel timeout */
>> +       u32 thread_ch_timeout;  /* thread channel timeout */
>> +};
>
> With just these comments it is not clear what everything in this
> structure does. Maybe a file-level comment explaining how BPMP
> basically works and what the different channels are allocated to would
> help understanding the code.

We have two kinds of TX channels (channel & thread channel above) for
the BPMP clients (clock, thermal, reset, power mgmt control, etc.) to use.

The channel means an atomic channel, which is used when the client
needs the response immediately, e.g. setting a clock rate or
re-parenting a clock source. Each CPU has its own atomic channel. The
client can acquire one of them, and ch_index is the first channel they
are able to use in the channel array.

The response on a thread channel can be postponed. The client gets the
response after the BPMP has finished the service and signalled the
response by IRQ. Likewise, thread_ch_index is the first thread channel
that the clients are able to use.

The CPU RX channel is there for clients to register specific services
(we call them MRQs in bpmp_abi) and listen for updates from the BPMP
firmware.

Because different SoCs might have different numbers of these channels,
we use this structure as the bpmp_soc_data to carry the per-SoC
configuration.
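
As a rough illustration of the layout described above, here is a small
user-space sketch. The struct, enum and helper names are made up for this
example and are not part of the patch; only the Tegra186 numbers are taken
from soc_data_tegra186, and it assumes the three ranges are contiguous.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copy of the layout fields from the patch's soc data. */
struct layout {
	uint32_t ch_index;        /* first atomic channel */
	uint32_t thread_ch_index; /* first thread channel */
	uint32_t cpu_rx_ch_index; /* first CPU RX channel */
	uint32_t nr_ch;           /* total number of channels */
};

enum ch_kind { CH_ATOMIC, CH_THREAD, CH_CPU_RX, CH_INVALID };

/* Hypothetical helper: classify a channel by the range it falls in. */
static enum ch_kind ch_classify(const struct layout *l, uint32_t ch)
{
	/* Assumption of this sketch: the ranges are contiguous and ordered. */
	assert(l->ch_index <= l->thread_ch_index &&
	       l->thread_ch_index <= l->cpu_rx_ch_index &&
	       l->cpu_rx_ch_index <= l->nr_ch);

	if (ch < l->ch_index || ch >= l->nr_ch)
		return CH_INVALID;
	if (ch < l->thread_ch_index)
		return CH_ATOMIC;
	if (ch < l->cpu_rx_ch_index)
		return CH_THREAD;
	return CH_CPU_RX;
}

/* Values taken from soc_data_tegra186 in the patch. */
static const struct layout tegra186_layout = {
	.ch_index = 0,
	.thread_ch_index = 6,
	.cpu_rx_ch_index = 13,
	.nr_ch = 14,
};
```

With these values, channels 0 to 5 classify as atomic, 6 to 12 as thread
and 13 as CPU RX.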

>
>> +
>> +struct channel_info {
>> +       u32 tch_free;
>> +       u32 tch_to_complete;
>> +       struct semaphore tch_sem;
>> +};
>> +
>> +struct mb_data {
>> +       s32 code;
>> +       s32 flags;
>> +       u8 data[BPMP_MSG_DATA_SZ];
>> +} __packed;
>> +
>> +struct channel_data {
>> +       struct mb_data *ib;
>> +       struct mb_data *ob;
>> +};
>> +
>> +struct mrq {
>> +       struct list_head list;
>> +       u32 mrq_code;
>> +       bpmp_mrq_handler handler;
>> +       void *data;
>> +};
>> +
>> +struct tegra_bpmp {
>> +       struct device *dev;
>> +       const struct tegra_bpmp_soc_data *soc_data;
>> +       void __iomem *tx_base;
>> +       void __iomem *rx_base;
>> +       struct mbox_client cl;
>> +       struct mbox_chan *chan;
>> +       struct ivc *ivc_channels;
>> +       struct channel_data *ch_area;
>> +       struct channel_info ch_info;
>> +       struct completion *ch_completion;
>> +       struct list_head mrq_list;
>> +       struct tegra_bpmp_ops *ops;
>> +       spinlock_t lock;
>> +       bool init_done;
>> +};
>> +
>> +static struct tegra_bpmp *bpmp;
>
> static? Ok, we only need one... for now. How about a private member in
> your ivc structure that allows you to retrieve the bpmp and going
> dynamic? This will require an extra argument in many functions, but is
> cleaner design IMHO.

IVC is designed as a generic IPC protocol among different modules (we
have not introduced the other users of IVC yet). It is probably better
not to churn BPMP-specific stuff into IVC.

>
>> +
>> +static int bpmp_get_thread_ch(int idx)
>> +{
>> +       return bpmp->soc_data->thread_ch_index + idx;
>> +}
>> +
>> +static int bpmp_get_thread_ch_index(int ch)
>> +{
>> +       if (ch < bpmp->soc_data->thread_ch_index ||
>> +           ch >= bpmp->soc_data->cpu_rx_ch_index)
>
> Shouldn't that be ch >= bpmp->soc_data->thread_ch_index +
> bpmp->soc_data->nr_thread_ch?
>
> Either rx_ch_index indicates the upper bound of the threaded channels,
> and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or
> it can be anywhere else and you should use the correct member.

According to the table below, we have 14 channels.
atomic ch: 0 ~ 5, 6 channels
thread ch: 6 ~ 12, 7 channels
CPU RX ch: 13, 1 channel

+static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
+	.ch_index = 0,
+	.thread_ch_index = 6,
+	.cpu_rx_ch_index = 13,
+	.nr_ch = 14,
+	.nr_thread_ch = 7,
+	.ch_timeout = 60 * USEC_PER_SEC,
+	.thread_ch_timeout = 600 * USEC_PER_SEC,
+};

We use the index to check for channel violations, and nr_thread_ch for
other purposes, to avoid redundant channel number calculations elsewhere.
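
To make the relation between the two bounds explicit, here is a minimal
user-space sketch. It assumes the thread channels sit directly below the
CPU RX channel, as in the Tegra186 table; the struct and function names are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative subset of the soc data fields discussed above. */
struct soc_layout {
	uint32_t thread_ch_index; /* first thread channel */
	uint32_t cpu_rx_ch_index; /* first CPU RX channel */
	uint32_t nr_thread_ch;    /* number of thread channels */
};

/*
 * Hypothetical probe-time check: start + count and the CPU RX index must
 * describe the same upper bound, otherwise the table is inconsistent.
 */
static bool layout_consistent(const struct soc_layout *s)
{
	return s->thread_ch_index + s->nr_thread_ch == s->cpu_rx_ch_index;
}

/* Map an absolute channel number to a thread slot, or -1 if out of range. */
static int thread_slot(const struct soc_layout *s, uint32_t ch)
{
	/* This sketch only handles tables that passed the check above. */
	assert(layout_consistent(s));

	if (ch < s->thread_ch_index ||
	    ch >= s->thread_ch_index + s->nr_thread_ch)
		return -1;
	return (int)(ch - s->thread_ch_index);
}
```

Checking the table once means either form of the upper bound can be used
afterwards without the two drifting apart.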

>
>> +               return -1;
>> +       return ch - bpmp->soc_data->thread_ch_index;
>> +}
>> +
>> +static int bpmp_get_ob_channel(void)
>> +{
>> +       return smp_processor_id() + bpmp->soc_data->ch_index;
>> +}
>> +
>> +static struct completion *bpmp_get_completion_obj(int ch)
>> +{
>> +       int i = bpmp_get_thread_ch_index(ch);
>> +
>> +       return i < 0 ? NULL : bpmp->ch_completion + i;
>> +}
>> +
>> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int ib_sz)
>> +{
>> +       return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
>> +              ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
>> +              (!ob_sz || ob_data) && (!ib_sz || ib_data);
>> +}
>> +
>> +static bool bpmp_master_acked(int ch)
>> +{
>> +       struct ivc *ivc_chan;
>> +       void *frame;
>> +       bool ready;
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       frame = tegra_ivc_read_get_next_frame(ivc_chan);
>> +       ready = !IS_ERR_OR_NULL(frame);
>> +       bpmp->ch_area[ch].ib = ready ? frame : NULL;
>> +
>> +       return ready;
>> +}
>> +
>> +static int bpmp_wait_ack(int ch)
>> +{
>> +       ktime_t t;
>> +
>> +       t = ns_to_ktime(local_clock());
>> +
>> +       do {
>> +               if (bpmp_master_acked(ch))
>> +                       return 0;
>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
>> +                bpmp->soc_data->ch_timeout);
>> +
>> +       return -ETIMEDOUT;
>> +}
>> +
>> +static bool bpmp_master_free(int ch)
>> +{
>> +       struct ivc *ivc_chan;
>> +       void *frame;
>> +       bool ready;
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
>> +       ready = !IS_ERR_OR_NULL(frame);
>> +       bpmp->ch_area[ch].ob = ready ? frame : NULL;
>> +
>> +       return ready;
>> +}
>> +
>> +static int bpmp_wait_master_free(int ch)
>> +{
>> +       ktime_t t;
>> +
>> +       t = ns_to_ktime(local_clock());
>> +
>> +       do {
>> +               if (bpmp_master_free(ch))
>> +                       return 0;
>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t)
>> +                < bpmp->soc_data->ch_timeout);
>> +
>> +       return -ETIMEDOUT;
>> +}
>> +
>> +static int __read_ch(int ch, void *data, int sz)
>> +{
>> +       struct ivc *ivc_chan;
>> +       struct mb_data *p;
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       p = bpmp->ch_area[ch].ib;
>> +       if (data)
>> +               memcpy_fromio(data, p->data, sz);
>> +
>> +       return tegra_ivc_read_advance(ivc_chan);
>> +}
>> +
>> +static int bpmp_read_ch(int ch, void *data, int sz)
>> +{
>> +       unsigned long flags;
>> +       int i, ret;
>> +
>> +       i = bpmp_get_thread_ch_index(ch);
>
> i is not a very good name for this variable.
> Also note that bpmp_get_thread_ch_index() can return -1, this case is
> not handled.
Okay, will fix this.
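
One possible shape for the fix, as a user-space sketch only: look up the
slot first and bail out before it is ever used as a shift count. The
constants mirror the Tegra186 values in the patch; thread_slot_of() and
release_thread_ch() are names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define THREAD_CH_FIRST 6 /* thread_ch_index for Tegra186 in the patch */
#define THREAD_CH_COUNT 7 /* nr_thread_ch for Tegra186 in the patch */

/* Map an absolute channel number to a thread slot, or -1 if out of range. */
static int thread_slot_of(int ch)
{
	if (ch < THREAD_CH_FIRST || ch >= THREAD_CH_FIRST + THREAD_CH_COUNT)
		return -1;
	return ch - THREAD_CH_FIRST;
}

/* Hypothetical helper: mark a thread channel as free again in the bitmap. */
static int release_thread_ch(uint32_t *tch_free, int ch)
{
	int slot = thread_slot_of(ch);

	assert(tch_free);
	if (slot < 0)
		return -EINVAL; /* never shift by a negative count */

	*tch_free |= UINT32_C(1) << slot;
	return 0;
}
```

A non-thread channel then leaves the bitmap untouched and the caller sees
an error instead of undefined behaviour from the shift.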

>
>> +
>> +       spin_lock_irqsave(&bpmp->lock, flags);
>> +       ret = __read_ch(ch, data, sz);
>> +       bpmp->ch_info.tch_free |= (1 << i);
>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>> +
>> +       up(&bpmp->ch_info.tch_sem);
>> +
>> +       return ret;
>> +}
>> +
>> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int sz)
>> +{
>> +       struct ivc *ivc_chan;
>> +       struct mb_data *p;
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       p = bpmp->ch_area[ch].ob;
>> +
>> +       p->code = mrq_code;
>> +       p->flags = flags;
>> +       if (data)
>> +               memcpy_toio(p->data, data, sz);
>> +
>> +       return tegra_ivc_write_advance(ivc_chan);
>> +}
>> +
>> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int sz)
>> +{
>> +       unsigned long flags;
>> +       int ret, i;
>> +
>> +       ret = down_timeout(&bpmp->ch_info.tch_sem,
>> +                          usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
>> +       if (ret)
>> +               return ret;
>> +
>> +       spin_lock_irqsave(&bpmp->lock, flags);
>> +
>> +       i = __ffs(bpmp->ch_info.tch_free);
>> +       *ch = bpmp_get_thread_ch(i);
>> +       ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
>> +       if (!ret) {
>> +               bpmp->ch_info.tch_free &= ~(1 << i);
>> +               __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data, sz);
>> +               bpmp->ch_info.tch_to_complete |= (1 << *ch);
>> +       }
>> +
>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>> +
>> +       return ret;
>> +}
>> +
>> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data, int sz)
>> +{
>> +       int ret;
>> +
>> +       ret = bpmp_wait_master_free(ch);
>> +       if (ret)
>> +               return ret;
>> +
>> +       return __write_ch(ch, mrq_code, flags, data, sz);
>> +}
>> +
>> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int ob_sz,
>> +                                   void *ib_data, int ib_sz)
>> +{
>> +       int ch, ret;
>> +
>> +       if (WARN_ON(!irqs_disabled()))
>> +               return -EPERM;
>> +
>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
>> +               return -EINVAL;
>> +
>> +       if (!bpmp->init_done)
>> +               return -ENODEV;
>> +
>> +       ch = bpmp_get_ob_channel();
>> +       ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
>> +       if (ret)
>> +               return ret;
>> +
>> +       ret = mbox_send_message(bpmp->chan, NULL);
>> +       if (ret < 0)
>> +               return ret;
>> +       mbox_client_txdone(bpmp->chan, 0);
>> +
>> +       ret = bpmp_wait_ack(ch);
>> +       if (ret)
>> +               return ret;
>> +
>> +       return __read_ch(ch, ib_data, ib_sz);
>> +}
>> +
>> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
>> +                            void *ib_data, int ib_sz)
>> +{
>> +       struct completion *comp_obj;
>> +       unsigned long timeout;
>> +       int ch, ret;
>> +
>> +       if (WARN_ON(irqs_disabled()))
>> +               return -EPERM;
>> +
>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
>> +               return -EINVAL;
>> +
>> +       if (!bpmp->init_done)
>> +               return -ENODEV;
>> +
>> +       ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
>> +       if (ret)
>> +               return ret;
>> +
>> +       ret = mbox_send_message(bpmp->chan, NULL);
>> +       if (ret < 0)
>> +               return ret;
>> +       mbox_client_txdone(bpmp->chan, 0);
>> +
>> +       comp_obj = bpmp_get_completion_obj(ch);
>> +       timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
>> +       if (!wait_for_completion_timeout(comp_obj, timeout))
>> +               return -ETIMEDOUT;
>> +
>> +       return bpmp_read_ch(ch, ib_data, ib_sz);
>> +}
>> +
>> +static struct mrq *bpmp_find_mrq(u32 mrq_code)
>> +{
>> +       struct mrq *mrq;
>> +
>> +       list_for_each_entry(mrq, &bpmp->mrq_list, list) {
>> +               if (mrq_code == mrq->mrq_code)
>> +                       return mrq;
>> +       }
>> +
>> +       return NULL;
>> +}
>> +
>> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
>> +{
>> +       int flags = bpmp->ch_area[ch].ib->flags;
>> +       struct ivc *ivc_chan;
>> +       struct mb_data *frame;
>> +       int ret;
>> +
>> +       if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
>> +               return;
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       ret = tegra_ivc_read_advance(ivc_chan);
>> +       WARN_ON(ret);
>> +
>> +       if (!(flags & DO_ACK))
>> +               return;
>> +
>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
>> +       if (IS_ERR_OR_NULL(frame)) {
>> +               WARN_ON(1);
>> +               return;
>> +       }
>> +
>> +       frame->code = code;
>> +       if (data != NULL)
>> +               memcpy_toio(frame->data, data, sz);
>> +       ret = tegra_ivc_write_advance(ivc_chan);
>> +       WARN_ON(ret);
>> +
>> +       if (flags & RING_DOORBELL) {
>> +               ret = mbox_send_message(bpmp->chan, NULL);
>> +               if (ret < 0) {
>> +                       WARN_ON(1);
>> +                       return;
>> +               }
>> +               mbox_client_txdone(bpmp->chan, 0);
>> +       }
>> +}
>> +
>> +static void bpmp_mail_return(int ch, int ret_code, int val)
>> +{
>> +       bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
>> +}
>> +
>> +static void bpmp_handle_mrq(int mrq_code, int ch)
>> +{
>> +       struct mrq *mrq;
>> +
>> +       spin_lock(&bpmp->lock);
>> +
>> +       mrq = bpmp_find_mrq(mrq_code);
>> +       if (!mrq) {
>> +               spin_unlock(&bpmp->lock);
>> +               bpmp_mail_return(ch, -EINVAL, 0);
>> +               return;
>> +       }
>> +
>> +       mrq->handler(mrq_code, mrq->data, ch);
>> +
>> +       spin_unlock(&bpmp->lock);
>> +}
>> +
>> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void *data)
>> +{
>> +       struct mrq *mrq;
>> +       unsigned long flags;
>> +
>> +       if (!handler)
>> +               return -EINVAL;
>> +
>> +       mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
>> +       if (!mrq)
>> +               return -ENOMEM;
>> +
>> +       spin_lock_irqsave(&bpmp->lock, flags);
>> +
>> +       mrq->mrq_code = __MRQ_INDEX(mrq_code);
>> +       mrq->handler = handler;
>> +       mrq->data = data;
>> +       list_add(&mrq->list, &bpmp->mrq_list);
>> +
>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>> +
>> +       return 0;
>> +}
>> +
>> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
>> +{
>> +       int challenge;
>> +       int reply;
>> +
>> +       challenge = *(int *)bpmp->ch_area[ch].ib->data;
>> +       reply = challenge << (smp_processor_id() + 1);
>> +       bpmp_mail_return(ch, 0, reply);
>> +}
>> +
>> +static int bpmp_mailman_init(void)
>> +{
>> +       return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
>> +}
>> +
>> +static int bpmp_ping(void)
>> +{
>> +       unsigned long flags;
>> +       ktime_t t;
>> +       int challenge = 1;
>
> Mmmm, shouldn't we use a mrq_ping_request instead of a parameter whose
> size may vary depending on the architecture? On a 64-bit big endian
> architecture, your messages would be corrupted.

Let me clarify one thing first. The mrq_ping_request and mrq_handle_ping
above are used for the ping from BPMP to CPU. Like I said above, that
goes through the CPU RX channel to get some information from the BPMP
firmware.

Here is the ping request from CPU to BPMP, to make sure we can do IPC
with BPMP during the probe stage.

About the endian issue, we don't consider that in the message format
right now. So we only support little endian for the IPC messages for
now.
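
For reference, a fixed-width, explicitly little-endian payload could look
roughly like the user-space sketch below. The helper names are invented for
this example; in the kernel the existing cpu_to_le32()/le32_to_cpu() helpers
would be the natural tools. MSG_DATA_SZ mirrors BPMP_MSG_DATA_SZ from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_DATA_SZ 120 /* BPMP_MSG_DATA_SZ in the patch */

/* Store a 32-bit value little-endian, whatever the host byte order is. */
static void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value, whatever the host byte order is. */
static uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/* Hypothetical helper: pack a ping challenge, return the bytes used. */
static size_t pack_ping(uint8_t payload[MSG_DATA_SZ], uint32_t challenge)
{
	assert(payload);
	put_le32(payload, challenge);
	return sizeof(uint32_t);
}
```

The payload size and byte order then no longer depend on what "int" means
on the CPU side.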

>
>> +       int reply = 0;
>
> And this should probably be a mrq_ping_response. These remarks may
> also apply to bpmp_mrq_handle_ping().
That is for receiving the ping request from BPMP.

>
>> +       int ret;
>> +
>> +       t = ktime_get();
>> +       local_irq_save(flags);
>> +       ret = bpmp_send_receive_atomic(MRQ_PING, &challenge, sizeof(challenge),
>> +                                      &reply, sizeof(reply));
>> +       local_irq_restore(flags);
>> +       t = ktime_sub(ktime_get(), t);
>> +
>> +       if (!ret)
>> +               dev_info(bpmp->dev,
>> +                        "ping ok: challenge: %d, reply: %d, time: %lld\n",
>> +                        challenge, reply, ktime_to_us(t));
>> +
>> +       return ret;
>> +}
>> +
>> +static int bpmp_get_fwtag(void)
>> +{
>> +       unsigned long flags;
>> +       void *vaddr;
>> +       dma_addr_t paddr;
>> +       u32 addr;
>
> Here also we should use a mrq_query_tag_request.
This is a one-way request from CPU to BPMP, so we don't request an MRQ
for that.

>
>> +       int ret;
>> +
>> +       vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
>> +                                  GFP_KERNEL);
>
> dma_addr_t may be 64 bit here, and you may get an address higher than
> the 32 bits allowed by mrq_query_tag_request! I guess you want to add
> GFP_DMA32 as flag to your call to dma_alloc_coherent.
BPMP should be able to handle addresses above 32 bits, but I am not sure
whether it is configured to support that.

Will fix this.
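
Whatever the final fix looks like, the silent narrowing in "addr = paddr"
is the part to avoid. A user-space sketch of a checked narrowing follows;
bus_addr_t and addr_to_u32() are stand-ins invented for this example, and
in the kernel constraining the device's DMA mask would likely be the usual
way to guarantee a 32-bit address in the first place.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t bus_addr_t; /* stand-in for a 64-bit dma_addr_t */

/*
 * Hypothetical helper: narrow a bus address to the 32-bit field of the
 * message, refusing addresses that do not fit instead of truncating them.
 */
static int addr_to_u32(bus_addr_t paddr, uint32_t *out)
{
	assert(out);
	if (paddr > UINT32_MAX)
		return -ERANGE;

	*out = (uint32_t)paddr;
	return 0;
}
```

An address above 4 GiB is then reported as an error rather than being sent
to the firmware with its upper bits dropped.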

>
>> +       if (!vaddr)
>> +               return -ENOMEM;
>> +       addr = paddr;
>> +
>> +       local_irq_save(flags);
>> +       ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr, sizeof(addr),
>> +                                      NULL, 0);
>> +       local_irq_restore(flags);
>> +
>> +       if (!ret)
>> +               dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
>> +
>> +       dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
>> +
>> +       return ret;
>> +}
>> +
>> +static void bpmp_signal_thread(int ch)
>> +{
>> +       int flags = bpmp->ch_area[ch].ob->flags;
>> +       struct completion *comp_obj;
>> +
>> +       if (!(flags & RING_DOORBELL))
>> +               return;
>> +
>> +       comp_obj = bpmp_get_completion_obj(ch);
>> +       if (!comp_obj) {
>> +               WARN_ON(1);
>> +               return;
>> +       }
>> +
>> +       complete(comp_obj);
>> +}
>> +
>> +static void bpmp_handle_rx(struct mbox_client *cl, void *data)
>> +{
>> +       int i, rx_ch;
>> +
>> +       rx_ch = bpmp->soc_data->cpu_rx_ch_index;
>> +
>> +       if (bpmp_master_acked(rx_ch))
>> +               bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
>> +
>> +       spin_lock(&bpmp->lock);
>> +
>> +       for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
>> +                       bpmp->ch_info.tch_to_complete; i++) {
>> +               int ch = bpmp_get_thread_ch(i);
>> +
>> +               if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
>> +                   bpmp_master_acked(ch)) {
>> +                       bpmp->ch_info.tch_to_complete &= ~(1 << ch);
>> +                       bpmp_signal_thread(ch);
>> +               }
>> +       }
>> +
>> +       spin_unlock(&bpmp->lock);
>> +}
>> +
>> +static void bpmp_ivc_notify(struct ivc *ivc)
>> +{
>> +       int ret;
>> +
>> +       ret = mbox_send_message(bpmp->chan, NULL);
>> +       if (ret < 0)
>> +               return;
>> +
>> +       mbox_send_message(bpmp->chan, NULL);
>
> Why the second call to mbox_send_message? It may be useful to add a
> comment explaining it.
Ah!! It should be mbox_client_txdone(). Good catch.

>
>> +}
>> +
>> +static int bpmp_msg_chan_init(int ch)
>> +{
>> +       struct ivc *ivc_chan;
>> +       u32 hdr_sz, msg_sz, que_sz;
>> +       uintptr_t rx_base, tx_base;
>> +       int ret;
>> +
>> +       msg_sz = tegra_ivc_align(BPMP_MSG_SZ);
>> +       hdr_sz = tegra_ivc_total_queue_size(0);
>
> I believe hdr_sz is never used?
You are right. Should fix this.

>
>> +       que_sz = tegra_ivc_total_queue_size(msg_sz);
>> +
>> +       rx_base =  (uintptr_t)(bpmp->rx_base + que_sz * ch);
>> +       tx_base =  (uintptr_t)(bpmp->tx_base + que_sz * ch);
>> +
>> +       ivc_chan = bpmp->ivc_channels + ch;
>> +       ret = tegra_ivc_init(ivc_chan, rx_base, DMA_ERROR_CODE, tx_base,
>> +                            DMA_ERROR_CODE, 1, msg_sz, bpmp->dev,
>> +                            bpmp_ivc_notify);
>> +       if (ret) {
>> +               dev_err(bpmp->dev, "%s fail: ch %d returned %d\n",
>> +                       __func__, ch, ret);
>> +               return ret;
>> +       }
>> +
>> +       /* reset the channel state */
>> +       tegra_ivc_channel_reset(ivc_chan);
>> +
>> +       /* sync the channel state with BPMP */
>> +       while (tegra_ivc_channel_notified(ivc_chan))
>> +               ;
>> +
>> +       return 0;
>> +}
>> +
>> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void)
>> +{
>> +       if (bpmp->init_done && bpmp->ops)
>> +               return bpmp->ops;
>> +       return NULL;
>> +}
>> +EXPORT_SYMBOL(tegra_bpmp_get_ops);
>> +
>> +static struct tegra_bpmp_ops bpmp_ops = {
>> +       .send_receive = bpmp_send_receive,
>> +       .send_receive_atomic = bpmp_send_receive_atomic,
>> +       .request_mrq = bpmp_request_mrq,
>> +       .mrq_return = bpmp_mail_return,
>> +};
>> +
>> +static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
>> +       .ch_index = 0,
>> +       .thread_ch_index = 6,
>> +       .cpu_rx_ch_index = 13,
>> +       .nr_ch = 14,
>> +       .nr_thread_ch = 7,
>> +       .ch_timeout = 60 * USEC_PER_SEC,
>> +       .thread_ch_timeout = 600 * USEC_PER_SEC,
>> +};
>> +
>> +static const struct of_device_id tegra_bpmp_match[] = {
>> +       { .compatible = "nvidia,tegra186-bpmp", .data = &soc_data_tegra186 },
>> +       { }
>> +};
>> +
>> +static int tegra_bpmp_probe(struct platform_device *pdev)
>> +{
>> +       const struct of_device_id *match;
>> +       struct resource shmem_res;
>> +       struct device_node *shmem_np;
>> +       int i, ret;
>> +
>> +       bpmp = devm_kzalloc(&pdev->dev, sizeof(*bpmp), GFP_KERNEL);
>> +       if (!bpmp)
>> +               return -ENOMEM;
>> +       bpmp->dev = &pdev->dev;
>> +
>> +       match = of_match_device(tegra_bpmp_match, &pdev->dev);
>> +       if (!match)
>> +               return -EINVAL;
>> +       bpmp->soc_data = match->data;
>> +
>> +       shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 0);
>> +       of_address_to_resource(shmem_np, 0, &shmem_res);
>
> Maybe check return value for these two calls?
>
>> +       bpmp->tx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
>> +       if (IS_ERR(bpmp->tx_base))
>> +               return PTR_ERR(bpmp->tx_base);
>> +
>> +       shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 1);
>> +       of_address_to_resource(shmem_np, 0, &shmem_res);
>
> And here too.
Okay, will fix.

Thanks,
-Joseph

>
>> +       bpmp->rx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
>> +       if (IS_ERR(bpmp->rx_base))
>> +               return PTR_ERR(bpmp->rx_base);
>> +
>> +       bpmp->ivc_channels = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
>> +                                         sizeof(*bpmp->ivc_channels),
>> +                                         GFP_KERNEL);
>> +       if (!bpmp->ivc_channels)
>> +               return -ENOMEM;
>> +
>> +       bpmp->ch_area = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
>> +                                    sizeof(*bpmp->ch_area), GFP_KERNEL);
>> +       if (!bpmp->ch_area)
>> +               return -ENOMEM;
>> +
>> +       bpmp->ch_completion = devm_kcalloc(&pdev->dev,
>> +                                          bpmp->soc_data->nr_thread_ch,
>> +                                          sizeof(*bpmp->ch_completion),
>> +                                          GFP_KERNEL);
>> +       if (!bpmp->ch_completion)
>> +               return -ENOMEM;
>> +
>> +       /* mbox registration */
>> +       bpmp->cl.dev = &pdev->dev;
>> +       bpmp->cl.rx_callback = bpmp_handle_rx;
>> +       bpmp->cl.tx_block = false;
>> +       bpmp->cl.knows_txdone = false;
>> +       bpmp->chan = mbox_request_channel(&bpmp->cl, 0);
>> +       if (IS_ERR(bpmp->chan)) {
>> +               if (PTR_ERR(bpmp->chan) != -EPROBE_DEFER)
>> +                       dev_err(&pdev->dev,
>> +                               "fail to get HSP mailbox, bpmp init fail.\n");
>> +               return PTR_ERR(bpmp->chan);
>> +       }
>> +
>> +       /* message channel initialization */
>> +       for (i = 0; i < bpmp->soc_data->nr_ch; i++) {
>> +               struct completion *comp_obj;
>> +
>> +               ret = bpmp_msg_chan_init(i);
>> +               if (ret)
>> +                       return ret;
>> +
>> +               comp_obj = bpmp_get_completion_obj(i);
>> +               if (comp_obj)
>> +                       init_completion(comp_obj);
>> +       }
>> +
>> +       bpmp->ch_info.tch_free = (1 << bpmp->soc_data->nr_thread_ch) - 1;
>> +       sema_init(&bpmp->ch_info.tch_sem, bpmp->soc_data->nr_thread_ch);
>> +
>> +       spin_lock_init(&bpmp->lock);
>> +       INIT_LIST_HEAD(&bpmp->mrq_list);
>> +       if (bpmp_mailman_init())
>> +               return -ENODEV;
>> +
>> +       bpmp->init_done = true;
>> +
>> +       ret = bpmp_ping();
>> +       if (ret)
>> +               dev_err(&pdev->dev, "ping failed: %d\n", ret);
>> +
>> +       ret = bpmp_get_fwtag();
>> +       if (ret)
>> +               dev_err(&pdev->dev, "get fwtag failed: %d\n", ret);
>> +
>> +       /* BPMP is ready now. */
>> +       bpmp->ops = &bpmp_ops;
>> +
>> +       return 0;
>> +}
>> +
>> +static struct platform_driver tegra_bpmp_driver = {
>> +       .driver = {
>> +               .name = "tegra-bpmp",
>> +               .of_match_table = tegra_bpmp_match,
>> +       },
>> +       .probe = tegra_bpmp_probe,
>> +};
>> +
>> +static int __init tegra_bpmp_init(void)
>> +{
>> +       return platform_driver_register(&tegra_bpmp_driver);
>> +}
>> +core_initcall(tegra_bpmp_init);
>> diff --git a/include/soc/tegra/bpmp.h b/include/soc/tegra/bpmp.h
>> new file mode 100644
>> index 000000000000..aaa0ef34ad7b
>> --- /dev/null
>> +++ b/include/soc/tegra/bpmp.h
>> @@ -0,0 +1,29 @@
>> +/*
>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#ifndef __TEGRA_BPMP_H
>> +
>> +typedef void (*bpmp_mrq_handler)(int mrq_code, void *data, int ch);
>> +
>> +struct tegra_bpmp_ops {
>> +       int (*send_receive)(int mrq_code, void *ob_data, int ob_sz,
>> +                           void *ib_data, int ib_sz);
>> +       int (*send_receive_atomic)(int mrq_code, void *ob_data, int ob_sz,
>> +                           void *ib_data, int ib_sz);
>> +       int (*request_mrq)(int mrq_code, bpmp_mrq_handler handler, void *data);
>> +       void (*mrq_return)(int ch, int ret_code, int val);
>> +};
>> +
>> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void);
>> +
>> +#endif /* __TEGRA_BPMP_H */
>> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
>> new file mode 100644
>> index 000000000000..0aaef5960e29
>> --- /dev/null
>> +++ b/include/soc/tegra/bpmp_abi.h
>> @@ -0,0 +1,1601 @@
>> +/*
>> + * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef _ABI_BPMP_ABI_H_
>> +#define _ABI_BPMP_ABI_H_
>> +
>> +#ifdef LK
>> +#include <stdint.h>
>> +#endif
>> +
>> +#ifndef __ABI_PACKED
>> +#define __ABI_PACKED __attribute__((packed))
>> +#endif
>> +
>> +#ifdef NO_GCC_EXTENSIONS
>> +#define EMPTY char empty;
>> +#define EMPTY_ARRAY 1
>> +#else
>> +#define EMPTY
>> +#define EMPTY_ARRAY 0
>> +#endif
>> +
>> +#ifndef __UNION_ANON
>> +#define __UNION_ANON
>> +#endif
>> +/**
>> + * @file
>> + */
>> +
>> +
>> +/**
>> + * @defgroup MRQ MRQ Messages
>> + * @brief Messages sent to/from BPMP via IPC
>> + * @{
>> + *   @defgroup MRQ_Format Message Format
>> + *   @defgroup MRQ_Codes Message Request (MRQ) Codes
>> + *   @defgroup MRQ_Payloads Message Payloads
>> + *   @defgroup Error_Codes Error Codes
>> + * @}
>> + */
>
> ...
>
> There is a lot of stuff in this file, most of which we are not using
> now - this is ok, but unless this is a file synced from an outside
> resource maybe we should trim the structures we don't need and add
> them as we make use of them? It helps dividing the work in bite-size
> chunks.
>
> Regarding the documentation format of this file, is this valid kernel
> documentation since the adoption of Sphinx? Or is it whatever the
> origin is using?
>

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-07  8:17     ` Joseph Lo
@ 2016-07-07 10:18       ` Alexandre Courbot
  2016-07-07 19:55         ` Stephen Warren
  2016-07-08 20:19         ` Sivaram Nair
  0 siblings, 2 replies; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-07 10:18 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Thu, Jul 7, 2016 at 5:17 PM, Joseph Lo <josephl@nvidia.com> wrote:
> On 07/06/2016 07:39 PM, Alexandre Courbot wrote:
>>
>> Sorry, I will probably need to do several passes on this one to
>> understand everything, but here is what I can say after a first look:
>>
>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>
>>> The Tegra BPMP (Boot and Power Management Processor) is designed for the
>>> booting process handling, offloading the power management tasks and
>>> some system control services from the CPU. It can be clock, DVFS,
>>> thermal/EDP, power gating operation and system suspend/resume handling.
>>> So the CPU and the drivers of these modules can base on the service that
>>> the BPMP firmware driver provided to signal the event for the specific PM
>>> action to BPMP and receive the status update from BPMP.
>>>
>>> Comparing to the ARM SCPI, the service provided by BPMP is message-based
>>> communication but not method-based. The BPMP firmware driver provides the
>>> send/receive service for the users, when the user concerns the response
>>> time. If the user needs to get the event or update from the firmware, it
>>> can request the MRQ service as well. The user needs to take care of the
>>> message format, which we call BPMP ABI.
>>>
>>> The BPMP ABI defines the message format for different modules or usages.
>>> For example, the clock operation needs an MRQ service code called
>>> MRQ_CLK with specific message format which includes different sub
>>> commands for various clock operations. This is the message format that
>>> BPMP can recognize.
>>>
>>> So the user needs two things to initiate IPC between BPMP. Get the
>>> service from the bpmp_ops structure and maintain the message format as
>>> the BPMP ABI defined.
>>>
>>> Based-on-the-work-by:
>>> Sivaram Nair <sivaramn@nvidia.com>
>>>
>>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>>> ---
>>> Changes in V2:
>>> - None
>>> ---
>>>   drivers/firmware/tegra/Kconfig  |   12 +
>>>   drivers/firmware/tegra/Makefile |    1 +
>>>   drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
>>>   include/soc/tegra/bpmp.h        |   29 +
>>>   include/soc/tegra/bpmp_abi.h    | 1601
>>> +++++++++++++++++++++++++++++++++++++++
>>>   5 files changed, 2356 insertions(+)
>>>   create mode 100644 drivers/firmware/tegra/bpmp.c
>>>   create mode 100644 include/soc/tegra/bpmp.h
>>>   create mode 100644 include/soc/tegra/bpmp_abi.h
>>>
>>> diff --git a/drivers/firmware/tegra/Kconfig
>>> b/drivers/firmware/tegra/Kconfig
>>> index 1fa3e4e136a5..ff2730d5c468 100644
>>> --- a/drivers/firmware/tegra/Kconfig
>>> +++ b/drivers/firmware/tegra/Kconfig
>>> @@ -10,4 +10,16 @@ config TEGRA_IVC
>>>            keeps the content is synchronization between host CPU and
>>> remote
>>>            processors.
>>>
>>> +config TEGRA_BPMP
>>> +       bool "Tegra BPMP driver"
>>> +       depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
>>> +       help
>>> +         BPMP (Boot and Power Management Processor) is designed to
>>> off-loading
>>
>>
>> s/off-loading/off-load
>>
>>> +         the PM functions which include clock/DVFS/thermal/power from
>>> the CPU.
>>> +         It needs HSP as the HW synchronization and notification module
>>> and
>>> +         IVC module as the message communication protocol.
>>> +
>>> +         This driver manages the IPC interface between host CPU and the
>>> +         firmware running on BPMP.
>>> +
>>>   endmenu
>>> diff --git a/drivers/firmware/tegra/Makefile
>>> b/drivers/firmware/tegra/Makefile
>>> index 92e2153e8173..e34a2f79e1ad 100644
>>> --- a/drivers/firmware/tegra/Makefile
>>> +++ b/drivers/firmware/tegra/Makefile
>>> @@ -1 +1,2 @@
>>> +obj-$(CONFIG_TEGRA_BPMP)       += bpmp.o
>>>   obj-$(CONFIG_TEGRA_IVC)                += ivc.o
>>> diff --git a/drivers/firmware/tegra/bpmp.c
>>> b/drivers/firmware/tegra/bpmp.c
>>> new file mode 100644
>>> index 000000000000..24fda626610e
>>> --- /dev/null
>>> +++ b/drivers/firmware/tegra/bpmp.c
>>> @@ -0,0 +1,713 @@
>>> +/*
>>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> it
>>> + * under the terms and conditions of the GNU General Public License,
>>> + * version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope it will be useful, but
>>> WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> + * more details.
>>> + */
>>> +
>>> +#include <linux/mailbox_client.h>
>>> +#include <linux/of.h>
>>> +#include <linux/of_address.h>
>>> +#include <linux/of_device.h>
>>> +#include <linux/platform_device.h>
>>> +#include <linux/semaphore.h>
>>> +
>>> +#include <soc/tegra/bpmp.h>
>>> +#include <soc/tegra/bpmp_abi.h>
>>> +#include <soc/tegra/ivc.h>
>>> +
>>> +#define BPMP_MSG_SZ            128
>>> +#define BPMP_MSG_DATA_SZ       120
>>> +
>>> +#define __MRQ_ATTRS            0xff000000
>>> +#define __MRQ_INDEX(id)                ((id) & ~__MRQ_ATTRS)
>>> +
>>> +#define DO_ACK                 BIT(0)
>>> +#define RING_DOORBELL          BIT(1)
>>> +
>>> +struct tegra_bpmp_soc_data {
>>> +       u32 ch_index;           /* channel index */
>>> +       u32 thread_ch_index;    /* thread channel index */
>>> +       u32 cpu_rx_ch_index;    /* CPU Rx channel index */
>>> +       u32 nr_ch;              /* number of total channels */
>>> +       u32 nr_thread_ch;       /* number of thread channels */
>>> +       u32 ch_timeout;         /* channel timeout */
>>> +       u32 thread_ch_timeout;  /* thread channel timeout */
>>> +};
>>
>>
>> With just these comments it is not clear what everything in this
>> structure does. Maybe a file-level comment explaining how BPMP
>> basically works and what the different channels are allocated to would
>> help understanding the code.
>
>
> We have two kinds of TX channels (channel & thread channel above) for the
> BPMP clients (clock, thermal, reset, power mgmt control, etc.) to use.
>
> The channel means an atomic channel that can be used when the client needs
> the response immediately, e.g. setting a clock rate or re-parenting a clock
> source. Each CPU has its own atomic channel. The client can acquire
> one of them, and ch_index is the first channel they are able to use
> in the channel array.
>
> The response on a thread channel can be postponed. The client gets the
> response after the BPMP has finished the service and responded by IRQ.
> Likewise, thread_ch_index is the first thread channel that the clients
> are able to use.
>
> The CPU RX channel is designed for the client to register some specific
> services (which we call MRQs in bpmp_abi) that listen for updates from the
> BPMP firmware.
>
> Because we might have different numbers of these channels, we use this
> structure as the bpmp_soc_data to get a different configuration for each
> SoC.

Thanks, that clarifies things. This explanation deserves to be in the C
file as well IMHO.

So IIUC the first 13 channels (6 bound to a specific CPU core and 7
threaded, allocated dynamically) are all used to initiate a
communication to the BPMP, while the cpu_rx channel is used as a sort
of IRQ (hence the name MRQ). Is this correct? This would be valuable
to state too. Maybe cpu_rx_ch_index can even be renamed to something
like mrq_ch_index to stress that fact.
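
To make sure I read the layout right, here is a rough userspace sketch of
how a channel number would map to its role. The enum, the struct and the
function name are hypothetical; the values mirror soc_data_tegra186 from
the patch. It assumes the atomic channels sit below thread_ch_index and
that there is a single CPU RX channel:

```c
#include <stdint.h>

/* Hypothetical names; only the numbers come from soc_data_tegra186. */
enum bpmp_ch_kind {
	BPMP_CH_ATOMIC,   /* per-CPU, polled for the reply */
	BPMP_CH_THREAD,   /* shared pool, reply signalled by IRQ */
	BPMP_CH_RX,       /* BPMP -> CPU requests (MRQ) */
	BPMP_CH_INVALID,
};

struct bpmp_layout {
	uint32_t ch_index;
	uint32_t thread_ch_index;
	uint32_t cpu_rx_ch_index;
	uint32_t nr_ch;
	uint32_t nr_thread_ch;
};

static const struct bpmp_layout tegra186_layout = {
	.ch_index = 0,
	.thread_ch_index = 6,
	.cpu_rx_ch_index = 13,
	.nr_ch = 14,
	.nr_thread_ch = 7,
};

static enum bpmp_ch_kind bpmp_ch_kind(const struct bpmp_layout *l, uint32_t ch)
{
	if (ch >= l->nr_ch)
		return BPMP_CH_INVALID;
	if (ch == l->cpu_rx_ch_index)
		return BPMP_CH_RX;
	if (ch >= l->thread_ch_index &&
	    ch < l->thread_ch_index + l->nr_thread_ch)
		return BPMP_CH_THREAD;
	/* Assumption: atomic channels occupy [ch_index, thread_ch_index). */
	if (ch >= l->ch_index && ch < l->thread_ch_index)
		return BPMP_CH_ATOMIC;
	return BPMP_CH_INVALID;
}
```

With the Tegra186 numbers this gives channels 0-5 as atomic, 6-12 as
threaded and 13 as the RX channel.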

A few additional comments follow below as I did a second pass on the code.

>
>
>>
>>> +
>>> +struct channel_info {
>>> +       u32 tch_free;
>>> +       u32 tch_to_complete;
>>> +       struct semaphore tch_sem;
>>> +};
>>> +
>>> +struct mb_data {
>>> +       s32 code;
>>> +       s32 flags;
>>> +       u8 data[BPMP_MSG_DATA_SZ];
>>> +} __packed;
>>> +
>>> +struct channel_data {
>>> +       struct mb_data *ib;
>>> +       struct mb_data *ob;
>>> +};
>>> +
>>> +struct mrq {
>>> +       struct list_head list;
>>> +       u32 mrq_code;
>>> +       bpmp_mrq_handler handler;
>>> +       void *data;
>>> +};
>>> +
>>> +struct tegra_bpmp {
>>> +       struct device *dev;
>>> +       const struct tegra_bpmp_soc_data *soc_data;
>>> +       void __iomem *tx_base;
>>> +       void __iomem *rx_base;
>>> +       struct mbox_client cl;
>>> +       struct mbox_chan *chan;
>>> +       struct ivc *ivc_channels;
>>> +       struct channel_data *ch_area;
>>> +       struct channel_info ch_info;
>>> +       struct completion *ch_completion;
>>> +       struct list_head mrq_list;
>>> +       struct tegra_bpmp_ops *ops;
>>> +       spinlock_t lock;
>>> +       bool init_done;
>>> +};
>>> +
>>> +static struct tegra_bpmp *bpmp;
>>
>>
>> static? Ok, we only need one... for now. How about a private member in
>> your ivc structure that allows you to retrieve the bpmp and going
>> dynamic? This will require an extra argument in many functions, but is
>> cleaner design IMHO.
>
>
> IVC is designed as a generic IPC protocol among different modules (We have
> not introduced some other usages of the IVC right now.). Maybe don't churn
> some other stuff into IVC is better.

Anything is fine if you can get rid of that static.

>
>>
>>> +
>>> +static int bpmp_get_thread_ch(int idx)
>>> +{
>>> +       return bpmp->soc_data->thread_ch_index + idx;
>>> +}
>>> +
>>> +static int bpmp_get_thread_ch_index(int ch)
>>> +{
>>> +       if (ch < bpmp->soc_data->thread_ch_index ||
>>> +           ch >= bpmp->soc_data->cpu_rx_ch_index)
>>
>>
>> Shouldn't that be ch >= bpmp->soc_data->cpu_rx_ch_index +
>> bpmp->soc_data->nr_thread_ch?
>>
>> Either rx_ch_index indicates the upper bound of the threaded channels,
>> and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or
>> it can be anywhere else and you should use the correct member.
>
>
> According to the table below, we have 14 channels.
> atomic ch: 0 ~ 5, 6 channels
> thread ch: 6 ~ 12, 7 channels
> CPU RX ch: 13, 1 channel
>
> +static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
> +       .ch_index = 0,
> +       .thread_ch_index = 6,
> +       .cpu_rx_ch_index = 13,
> +       .nr_ch = 14,
> +       .nr_thread_ch = 7,
> +       .ch_timeout = 60 * USEC_PER_SEC,
> +       .thread_ch_timeout = 600 * USEC_PER_SEC,
> +};
>
> We use the index to check channel violation and nr_thread_ch for other usage
> to avoid redundant channel number calculation elsewhere.

Sorry, my comment had a mistake. I meant that

          ch >= bpmp->soc_data->cpu_rx_ch_index

Should maybe be

          ch >= bpmp->soc_data->thread_ch_index + bpmp->soc_data->nr_thread_ch

According to the description you gave of these fields, there is no
guarantee that cpu_rx_ch_index will always be the first channel after
the threaded channels.
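
As an illustration only, a range check whose upper bound comes from
thread_ch_index and nr_thread_ch does not depend on where the CPU RX
channel sits. The struct and function names below are made up for this
userspace sketch; the two values are the Tegra186 ones from the patch:

```c
#include <stdint.h>

/* Hypothetical cut-down copy of the two fields that matter here. */
struct thread_ch_range {
	uint32_t thread_ch_index;   /* first threaded channel */
	uint32_t nr_thread_ch;      /* number of threaded channels */
};

static const struct thread_ch_range tegra186_range = {
	.thread_ch_index = 6,
	.nr_thread_ch = 7,
};

/* Returns the zero-based slot of a threaded channel, or -1 if out of range. */
static int thread_ch_slot(const struct thread_ch_range *r, uint32_t ch)
{
	if (ch < r->thread_ch_index ||
	    ch >= r->thread_ch_index + r->nr_thread_ch)
		return -1;

	return (int)(ch - r->thread_ch_index);
}
```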

>
>
>>
>>> +               return -1;
>>> +       return ch - bpmp->soc_data->thread_ch_index;
>>> +}
>>> +
>>> +static int bpmp_get_ob_channel(void)
>>> +{
>>> +       return smp_processor_id() + bpmp->soc_data->ch_index;
>>> +}
>>> +
>>> +static struct completion *bpmp_get_completion_obj(int ch)
>>> +{
>>> +       int i = bpmp_get_thread_ch_index(ch);
>>> +
>>> +       return i < 0 ? NULL : bpmp->ch_completion + i;
>>> +}
>>> +
>>> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int
>>> ib_sz)
>>> +{
>>> +       return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
>>> +              ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
>>> +              (!ob_sz || ob_data) && (!ib_sz || ib_data);
>>> +}
>>> +
>>> +static bool bpmp_master_acked(int ch)
>>> +{
>>> +       struct ivc *ivc_chan;
>>> +       void *frame;
>>> +       bool ready;
>>> +
>>> +       ivc_chan = bpmp->ivc_channels + ch;
>>> +       frame = tegra_ivc_read_get_next_frame(ivc_chan);
>>> +       ready = !IS_ERR_OR_NULL(frame);
>>> +       bpmp->ch_area[ch].ib = ready ? frame : NULL;
>>> +
>>> +       return ready;
>>> +}
>>> +
>>> +static int bpmp_wait_ack(int ch)

Shouldn't this be bpmp_wait_master_ack ? Looking at the two next
functions makes me think it should (or bpmp_wait_master_free should be
renamed to bpmp_wait_free).

>>> +{
>>> +       ktime_t t;
>>> +
>>> +       t = ns_to_ktime(local_clock());
>>> +
>>> +       do {
>>> +               if (bpmp_master_acked(ch))
>>> +                       return 0;
>>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
>>> +                bpmp->soc_data->ch_timeout);
>>> +
>>> +       return -ETIMEDOUT;
>>> +}
>>> +
>>> +static bool bpmp_master_free(int ch)
>>> +{
>>> +       struct ivc *ivc_chan;
>>> +       void *frame;
>>> +       bool ready;
>>> +
>>> +       ivc_chan = bpmp->ivc_channels + ch;
>>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
>>> +       ready = !IS_ERR_OR_NULL(frame);
>>> +       bpmp->ch_area[ch].ob = ready ? frame : NULL;
>>> +
>>> +       return ready;
>>> +}
>>> +
>>> +static int bpmp_wait_master_free(int ch)
>>> +{
>>> +       ktime_t t;
>>> +
>>> +       t = ns_to_ktime(local_clock());
>>> +
>>> +       do {
>>> +               if (bpmp_master_free(ch))
>>> +                       return 0;
>>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t)
>>> +                < bpmp->soc_data->ch_timeout);
>>> +
>>> +       return -ETIMEDOUT;
>>> +}
>>> +
>>> +static int __read_ch(int ch, void *data, int sz)
>>> +{
>>> +       struct ivc *ivc_chan;
>>> +       struct mb_data *p;
>>> +
>>> +       ivc_chan = bpmp->ivc_channels + ch;
>>> +       p = bpmp->ch_area[ch].ib;
>>> +       if (data)
>>> +               memcpy_fromio(data, p->data, sz);
>>> +
>>> +       return tegra_ivc_read_advance(ivc_chan);
>>> +}
>>> +
>>> +static int bpmp_read_ch(int ch, void *data, int sz)

bpmp_read_threaded_ch maybe? We have bpmp_write_threaded_ch below, and
this function is clearly dealing with threaded channels only.

>>> +{
>>> +       unsigned long flags;
>>> +       int i, ret;
>>> +
>>> +       i = bpmp_get_thread_ch_index(ch);
>>
>>
>> i is not a very good name for this variable.
>> Also note that bpmp_get_thread_ch_index() can return -1, this case is
>> not handled.
>
> Okay, will fix this.
>
>
>>
>>> +
>>> +       spin_lock_irqsave(&bpmp->lock, flags);
>>> +       ret = __read_ch(ch, data, sz);
>>> +       bpmp->ch_info.tch_free |= (1 << i);
>>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>>> +
>>> +       up(&bpmp->ch_info.tch_sem);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int
>>> sz)
>>> +{
>>> +       struct ivc *ivc_chan;
>>> +       struct mb_data *p;
>>> +
>>> +       ivc_chan = bpmp->ivc_channels + ch;
>>> +       p = bpmp->ch_area[ch].ob;
>>> +
>>> +       p->code = mrq_code;
>>> +       p->flags = flags;
>>> +       if (data)
>>> +               memcpy_toio(p->data, data, sz);
>>> +
>>> +       return tegra_ivc_write_advance(ivc_chan);
>>> +}
>>> +
>>> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int
>>> sz)
>>> +{
>>> +       unsigned long flags;
>>> +       int ret, i;
>>> +
>>> +       ret = down_timeout(&bpmp->ch_info.tch_sem,
>>> +
>>> usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       spin_lock_irqsave(&bpmp->lock, flags);
>>> +
>>> +       i = __ffs(bpmp->ch_info.tch_free);
>>> +       *ch = bpmp_get_thread_ch(i);
>>> +       ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
>>> +       if (!ret) {

Style nit: I prefer to make the error case the exception, and normal
runtime the norm. This is where a goto statement can actually make
your code easier to follow. Have an err: label before the spin_unlock,
and jump to it if ret != 0. Then you can have the next three lines at
the lower indentation level, so that they do not look as if they were
the error path themselves.

Or if you really don't like the goto, check for ret != 0 and do the
spin_unlock and return in that block.
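
Roughly what I have in mind, as a userspace sketch rather than a drop-in
replacement. The lock helpers, the struct and the slot_ready argument are
stand-ins I made up for the spinlock, ch_info and bpmp_master_free(), and
both bitmasks are indexed by slot here for simplicity:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for bpmp->lock and bpmp->ch_info. */
struct tch_state {
	uint32_t tch_free;          /* bit i set: threaded slot i is free */
	uint32_t tch_to_complete;   /* bit i set: slot i awaits a reply */
	int locked;                 /* models the spinlock */
};

static void tch_lock(struct tch_state *st)   { st->locked = 1; }
static void tch_unlock(struct tch_state *st) { st->locked = 0; }

static int tch_claim(struct tch_state *st, int slot_ready, int *slot)
{
	int ret;
	int i = 0;

	tch_lock(st);

	if (!st->tch_free) {
		ret = -EBUSY;
		goto err;
	}

	/* Lowest free slot, like __ffs() in the patch. */
	while (!(st->tch_free & (1u << i)))
		i++;

	ret = slot_ready ? 0 : -EFAULT;
	if (ret)
		goto err;

	/* Normal path stays at the base indentation level. */
	st->tch_free &= ~(1u << i);
	st->tch_to_complete |= 1u << i;
	*slot = i;

err:
	tch_unlock(st);
	return ret;
}
```

The unlock is then shared by the success and failure paths, and the
bookkeeping reads as the main flow of the function.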

>>> +               bpmp->ch_info.tch_free &= ~(1 << i);
>>> +               __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data,
>>> sz);
>>> +               bpmp->ch_info.tch_to_complete |= (1 << *ch);
>>> +       }
>>> +
>>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data,
>>> int sz)
>>> +{
>>> +       int ret;
>>> +
>>> +       ret = bpmp_wait_master_free(ch);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       return __write_ch(ch, mrq_code, flags, data, sz);
>>> +}
>>> +
>>> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int
>>> ob_sz,
>>> +                                   void *ib_data, int ib_sz)
>>> +{
>>> +       int ch, ret;
>>> +
>>> +       if (WARN_ON(!irqs_disabled()))
>>> +               return -EPERM;
>>> +
>>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
>>> +               return -EINVAL;
>>> +
>>> +       if (!bpmp->init_done)
>>> +               return -ENODEV;
>>> +
>>> +       ch = bpmp_get_ob_channel();
>>> +       ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = mbox_send_message(bpmp->chan, NULL);
>>> +       if (ret < 0)
>>> +               return ret;
>>> +       mbox_client_txdone(bpmp->chan, 0);
>>> +
>>> +       ret = bpmp_wait_ack(ch);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       return __read_ch(ch, ib_data, ib_sz);
>>> +}
>>> +
>>> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
>>> +                            void *ib_data, int ib_sz)
>>> +{
>>> +       struct completion *comp_obj;
>>> +       unsigned long timeout;
>>> +       int ch, ret;
>>> +
>>> +       if (WARN_ON(irqs_disabled()))
>>> +               return -EPERM;
>>> +
>>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
>>> +               return -EINVAL;
>>> +
>>> +       if (!bpmp->init_done)
>>> +               return -ENODEV;
>>> +
>>> +       ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = mbox_send_message(bpmp->chan, NULL);
>>> +       if (ret < 0)
>>> +               return ret;
>>> +       mbox_client_txdone(bpmp->chan, 0);
>>> +
>>> +       comp_obj = bpmp_get_completion_obj(ch);
>>> +       timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
>>> +       if (!wait_for_completion_timeout(comp_obj, timeout))
>>> +               return -ETIMEDOUT;
>>> +
>>> +       return bpmp_read_ch(ch, ib_data, ib_sz);
>>> +}
>>> +
>>> +static struct mrq *bpmp_find_mrq(u32 mrq_code)
>>> +{
>>> +       struct mrq *mrq;
>>> +
>>> +       list_for_each_entry(mrq, &bpmp->mrq_list, list) {
>>> +               if (mrq_code == mrq->mrq_code)
>>> +                       return mrq;
>>> +       }
>>> +
>>> +       return NULL;
>>> +}
>>> +
>>> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
>>> +{
>>> +       int flags = bpmp->ch_area[ch].ib->flags;
>>> +       struct ivc *ivc_chan;
>>> +       struct mb_data *frame;
>>> +       int ret;
>>> +
>>> +       if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
>>> +               return;
>>> +
>>> +       ivc_chan = bpmp->ivc_channels + ch;
>>> +       ret = tegra_ivc_read_advance(ivc_chan);
>>> +       WARN_ON(ret);
>>> +
>>> +       if (!(flags & DO_ACK))
>>> +               return;
>>> +
>>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
>>> +       if (IS_ERR_OR_NULL(frame)) {
>>> +               WARN_ON(1);
>>> +               return;
>>> +       }
>>> +
>>> +       frame->code = code;
>>> +       if (data != NULL)
>>> +               memcpy_toio(frame->data, data, sz);
>>> +       ret = tegra_ivc_write_advance(ivc_chan);
>>> +       WARN_ON(ret);
>>> +
>>> +       if (flags & RING_DOORBELL) {
>>> +               ret = mbox_send_message(bpmp->chan, NULL);
>>> +               if (ret < 0) {
>>> +                       WARN_ON(1);
>>> +                       return;
>>> +               }
>>> +               mbox_client_txdone(bpmp->chan, 0);
>>> +       }
>>> +}
>>> +
>>> +static void bpmp_mail_return(int ch, int ret_code, int val)
>>> +{
>>> +       bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
>>> +}
>>> +
>>> +static void bpmp_handle_mrq(int mrq_code, int ch)
>>> +{
>>> +       struct mrq *mrq;
>>> +
>>> +       spin_lock(&bpmp->lock);
>>> +
>>> +       mrq = bpmp_find_mrq(mrq_code);
>>> +       if (!mrq) {
>>> +               spin_unlock(&bpmp->lock);
>>> +               bpmp_mail_return(ch, -EINVAL, 0);
>>> +               return;
>>> +       }
>>> +
>>> +       mrq->handler(mrq_code, mrq->data, ch);
>>> +
>>> +       spin_unlock(&bpmp->lock);
>>> +}
>>> +
>>> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void
>>> *data)
>>> +{
>>> +       struct mrq *mrq;
>>> +       unsigned long flags;
>>> +
>>> +       if (!handler)
>>> +               return -EINVAL;
>>> +
>>> +       mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
>>> +       if (!mrq)
>>> +               return -ENOMEM;
>>> +
>>> +       spin_lock_irqsave(&bpmp->lock, flags);
>>> +
>>> +       mrq->mrq_code = __MRQ_INDEX(mrq_code);
>>> +       mrq->handler = handler;
>>> +       mrq->data = data;
>>> +       list_add(&mrq->list, &bpmp->mrq_list);
>>> +
>>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
>>> +{
>>> +       int challenge;
>>> +       int reply;
>>> +
>>> +       challenge = *(int *)bpmp->ch_area[ch].ib->data;
>>> +       reply = challenge << (smp_processor_id() + 1);
>>> +       bpmp_mail_return(ch, 0, reply);
>>> +}
>>> +
>>> +static int bpmp_mailman_init(void)
>>> +{
>>> +       return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
>>> +}
>>> +
>>> +static int bpmp_ping(void)
>>> +{
>>> +       unsigned long flags;
>>> +       ktime_t t;
>>> +       int challenge = 1;
>>
>>
>> Mmmm, shouldn't this use a mrq_ping_request instead of a parameter whose
>> size may vary depending on the architecture? On a 64-bit big endian
>> architecture, your messages would be corrupted.
>
>
> Let me clarify one thing first. The mrq_ping_request and mrq_handle_ping
> above are used for the ping from BPMP to CPU. Like I said above, it is one
> of the CPU RX channel services that get some information from the BPMP
> firmware.

Ok, so mrq_handle_ping *should* use these data structures at the very least.

>
> Here is the ping request from CPU to BPMP to make sure we can IPC with BPMP
> during the probe stage.
>
> About the endian issue, I think we don't consider that in the message format
> right now. So I think we only support little endian for the IPC messages
> right now.

Any code in the kernel should function correctly regardless of
endianness. And the problem is not so much with endianness as it is
with the operand size - is the BPMP expecting a 64-bit challenge here?
Considering that the equivalent MRQ uses a 32-bit integer, I'd bet
not. So please use u32/u64 as needed as well as cpu_to_leXX (and
leXX_to_cpu for the opposite) to make your code solid.
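
To illustrate the point about operand size and byte order, here is a small
userspace sketch. put_le32()/get_le32() are portable stand-ins for what
cpu_to_le32()/le32_to_cpu() give you in the kernel, and the one-field ping
layout is my assumption, not something taken from bpmp_abi.h:

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian order, whatever the host is. */
static void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value, whatever the host is. */
static uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/* Assumed wire layout for the ping: a single 32-bit challenge. */
#define PING_REQ_SZ 4

static size_t ping_encode(uint8_t *buf, uint32_t challenge)
{
	put_le32(buf, challenge);
	return PING_REQ_SZ;
}
```

The payload size is then fixed at 4 bytes and the byte order is explicit,
instead of both depending on what "int" happens to be on the host.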

I understand that you don't want to use the MRQ structures because we
are not handling a MRQ here, but if they are relevant I think this
would still be safer than constructing messages from scalar data. That
or we should introduce a proper structure for these messages, but here
using the MRQ structure looks acceptable to me. Maybe they should not
be named MRQ at all, but that's not for us to decide.

>
>>
>>> +       int reply = 0;
>>
>>
>> And this should probably be a mrq_ping_response. These remarks may
>> also apply to bpmp_mrq_handle_ping().
>
> That is for receiving the ping request from BPMP.
>
>>
>>> +       int ret;
>>> +
>>> +       t = ktime_get();
>>> +       local_irq_save(flags);
>>> +       ret = bpmp_send_receive_atomic(MRQ_PING, &challenge,
>>> sizeof(challenge),
>>> +                                      &reply, sizeof(reply));
>>> +       local_irq_restore(flags);
>>> +       t = ktime_sub(ktime_get(), t);
>>> +
>>> +       if (!ret)
>>> +               dev_info(bpmp->dev,
>>> +                        "ping ok: challenge: %d, reply: %d, time:
>>> %lld\n",
>>> +                        challenge, reply, ktime_to_us(t));
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static int bpmp_get_fwtag(void)
>>> +{
>>> +       unsigned long flags;
>>> +       void *vaddr;
>>> +       dma_addr_t paddr;
>>> +       u32 addr;
>>
>>
>> Here also we should use a mrq_query_tag_request.
>
> This is a one-way request from CPU to BPMP, so we don't request an MRQ for
> that.
>
>>
>>> +       int ret;
>>> +
>>> +       vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
>>> +                                  GFP_KERNEL);
>>
>>
>> dma_addr_t may be 64 bit here, and you may get an address higher than
>> the 32 bits allowed by mrq_query_tag_request! I guess you want to add
>> GFP_DMA32 as flag to your call to dma_alloc_coherent.
>
> BPMP should be able to handle addresses above 32 bits, but I am not sure
> whether it is configured to support that.

If the message you pass only contains a 32-bit address, then I'm
afraid the protocol is the limiting factor here until it is updated.
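
Whatever allocation flags end up being used, the narrowing from dma_addr_t
to u32 could at least be checked instead of silently truncating. A minimal
userspace sketch, with a helper name I made up and uint64_t standing in
for a 64-bit dma_addr_t:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow a 64-bit bus address to the 32-bit field
 * carried by the message, and fail rather than truncate.
 */
static int bpmp_addr_to_u32(uint64_t paddr, uint32_t *out)
{
	if (paddr > UINT32_MAX)
		return -ERANGE;

	*out = (uint32_t)paddr;
	return 0;
}
```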

Can't wait for the day when we will have to manage several versions of
this protocol! >_<

>
> Will fix this.
>
>
>>
>>> +       if (!vaddr)
>>> +               return -ENOMEM;
>>> +       addr = paddr;
>>> +
>>> +       local_irq_save(flags);
>>> +       ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr,
>>> sizeof(addr),
>>> +                                      NULL, 0);
>>> +       local_irq_restore(flags);
>>> +
>>> +       if (!ret)
>>> +               dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
>>> +
>>> +       dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static void bpmp_signal_thread(int ch)
>>> +{
>>> +       int flags = bpmp->ch_area[ch].ob->flags;
>>> +       struct completion *comp_obj;
>>> +
>>> +       if (!(flags & RING_DOORBELL))
>>> +               return;
>>> +
>>> +       comp_obj = bpmp_get_completion_obj(ch);
>>> +       if (!comp_obj) {
>>> +               WARN_ON(1);
>>> +               return;
>>> +       }
>>> +
>>> +       complete(comp_obj);
>>> +}
>>> +
>>> +static void bpmp_handle_rx(struct mbox_client *cl, void *data)
>>> +{
>>> +       int i, rx_ch;
>>> +
>>> +       rx_ch = bpmp->soc_data->cpu_rx_ch_index;
>>> +
>>> +       if (bpmp_master_acked(rx_ch))
>>> +               bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
>>> +
>>> +       spin_lock(&bpmp->lock);
>>> +
>>> +       for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
>>> +                       bpmp->ch_info.tch_to_complete; i++) {

for_each_set_bit(i, &bpmp->ch_info.tch_to_complete,
bpmp->soc_data->nr_thread_ch) ?

This will reduce the number of iterations and you won't have to do the
bpmp->ch_info.tch_to_complete & (1 << ch) check below.
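
The idea, sketched in plain userspace C without the kernel bitops helpers.
The function name and the output array are hypothetical; the point is only
that the loop visits set bits and stops once none remain:

```c
#include <stdint.h>

/*
 * Collect the indices of the bits set in mask, looking only at the low
 * nbits bits. Returns how many indices were written to out.
 */
static unsigned int collect_set_bits(uint32_t mask, unsigned int nbits,
				     unsigned int *out)
{
	unsigned int bit, count = 0;

	/* The "&& mask" test ends the loop early once no set bits remain. */
	for (bit = 0; bit < nbits && mask; bit++, mask >>= 1) {
		if (mask & 1u)
			out[count++] = bit;
	}

	return count;
}
```

In the handler, each collected index would be a channel to test with
bpmp_master_acked(), with no separate mask check inside the loop body.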

>>> +               int ch = bpmp_get_thread_ch(i);
>>> +
>>> +               if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
>>> +                   bpmp_master_acked(ch)) {
>>> +                       bpmp->ch_info.tch_to_complete &= ~(1 << ch);
>>> +                       bpmp_signal_thread(ch);
>>> +               }
>>> +       }
>>> +
>>> +       spin_unlock(&bpmp->lock);
>>> +}
>>> +
>>> +static void bpmp_ivc_notify(struct ivc *ivc)
>>> +{
>>> +       int ret;
>>> +
>>> +       ret = mbox_send_message(bpmp->chan, NULL);
>>> +       if (ret < 0)
>>> +               return;
>>> +
>>> +       mbox_send_message(bpmp->chan, NULL);
>>
>>
>> Why the second call to mbox_send_message? It may be useful to add a
>> comment explaining it.
>
> Ah!! It should be mbox_client_txdone(). Good catch.

That makes more sense. :) But did this code work even with that typo?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 04/10] firmware: tegra: add IVC library
  2016-07-05  9:04 ` [PATCH V2 04/10] firmware: tegra: add IVC library Joseph Lo
@ 2016-07-07 11:16   ` Alexandre Courbot
  2016-07-09 23:45   ` Paul Gortmaker
  1 sibling, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2016-07-07 11:16 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> The Inter-VM communication (IVC) is a communication protocol, which is
> designed for interprocessor communication (IPC) or the communication
> between the hypervisor and the virtual machine with a guest OS on it. So
> it can be translated as inter-virtual memory or inter-virtual machine
> communication. The message channels are maintained in DRAM or SRAM,
> so data coherency must be considered. Otherwise the data could be
> corrupted or out of date when the remote client checks it.
>
> The IVC maintains memory-based descriptors for the TX/RX channels and
> handles the coherency of the counters and payloads. So the
> clients can use it to send/receive messages to/from remote ones.
>
> We introduce it as a library for the firmware drivers, which can use it
> for IPC.
>
> Based-on-the-work-by:
> Peter Newman <pnewman@nvidia.com>
>
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - None
> ---
>  drivers/firmware/Kconfig        |   1 +
>  drivers/firmware/Makefile       |   1 +
>  drivers/firmware/tegra/Kconfig  |  13 +
>  drivers/firmware/tegra/Makefile |   1 +
>  drivers/firmware/tegra/ivc.c    | 659 ++++++++++++++++++++++++++++++++++++++++
>  include/soc/tegra/ivc.h         | 102 +++++++
>  6 files changed, 777 insertions(+)
>  create mode 100644 drivers/firmware/tegra/Kconfig
>  create mode 100644 drivers/firmware/tegra/Makefile
>  create mode 100644 drivers/firmware/tegra/ivc.c
>  create mode 100644 include/soc/tegra/ivc.h
>
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 5e618058defe..bbd64ae8c4c6 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -200,5 +200,6 @@ config HAVE_ARM_SMCCC
>  source "drivers/firmware/broadcom/Kconfig"
>  source "drivers/firmware/google/Kconfig"
>  source "drivers/firmware/efi/Kconfig"
> +source "drivers/firmware/tegra/Kconfig"
>
>  endmenu
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 474bada56fcd..9a4df8171cc4 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -24,3 +24,4 @@ obj-y                         += broadcom/
>  obj-$(CONFIG_GOOGLE_FIRMWARE)  += google/
>  obj-$(CONFIG_EFI)              += efi/
>  obj-$(CONFIG_UEFI_CPER)                += efi/
> +obj-y                          += tegra/
> diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
> new file mode 100644
> index 000000000000..1fa3e4e136a5
> --- /dev/null
> +++ b/drivers/firmware/tegra/Kconfig
> @@ -0,0 +1,13 @@
> +menu "Tegra firmware driver"
> +
> +config TEGRA_IVC
> +       bool "Tegra IVC protocol"
> +       depends on ARCH_TEGRA
> +       help
> +         IVC (Inter-VM Communication) protocol is part of the IPC
> +         (Inter Processor Communication) framework on Tegra. It maintains the
> +         data and the different commuication channels in SysRAM or RAM and
> +         keeps the content is synchronization between host CPU and remote
> +         processors.
> +
> +endmenu
> diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
> new file mode 100644
> index 000000000000..92e2153e8173
> --- /dev/null
> +++ b/drivers/firmware/tegra/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_TEGRA_IVC)                += ivc.o
> diff --git a/drivers/firmware/tegra/ivc.c b/drivers/firmware/tegra/ivc.c
> new file mode 100644
> index 000000000000..3e736bb9915a
> --- /dev/null
> +++ b/drivers/firmware/tegra/ivc.c
> @@ -0,0 +1,659 @@
> +/*
> + * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/module.h>
> +
> +#include <soc/tegra/ivc.h>
> +
> +#define IVC_ALIGN 64
> +
> +#ifdef CONFIG_SMP
> +
> +static inline void ivc_rmb(void)
> +{
> +       smp_rmb();
> +}
> +
> +static inline void ivc_wmb(void)
> +{
> +       smp_wmb();
> +}
> +
> +static inline void ivc_mb(void)
> +{
> +       smp_mb();
> +}
> +
> +#else
> +
> +static inline void ivc_rmb(void)
> +{
> +       rmb();
> +}
> +
> +static inline void ivc_wmb(void)
> +{
> +       wmb();
> +}
> +
> +static inline void ivc_mb(void)
> +{
> +       mb();
> +}
> +
> +#endif
> +
> +/*
> + * IVC channel reset protocol.
> + *
> + * Each end uses its tx_channel.state to indicate its synchronization state.
> + */
> +enum ivc_state {
> +       /*
> +        * This value is zero for backwards compatibility with services that
> +        * assume channels to be initially zeroed. Such channels are in an
> +        * initially valid state, but cannot be asynchronously reset, and must
> +        * maintain a valid state at all times.
> +        *
> +        * The transmitting end can enter the established state from the sync or
> +        * ack state when it observes the receiving endpoint in the ack or
> +        * established state, indicating that has cleared the counters in our
> +        * rx_channel.
> +        */
> +       ivc_state_established = 0,
> +
> +       /*
> +        * If an endpoint is observed in the sync state, the remote endpoint is
> +        * allowed to clear the counters it owns asynchronously with respect to
> +        * the current endpoint. Therefore, the current endpoint is no longer
> +        * allowed to communicate.
> +        */
> +       ivc_state_sync,
> +
> +       /*
> +        * When the transmitting end observes the receiving end in the sync
> +        * state, it can clear the w_count and r_count and transition to the ack
> +        * state. If the remote endpoint observes us in the ack state, it can
> +        * return to the established state once it has cleared its counters.
> +        */
> +       ivc_state_ack
> +};
> +
> +/*
> + * This structure is divided into two-cache aligned parts, the first is only

Should read "two cache-aligned" maybe?

> + * written through the tx_channel pointer, while the second is only written
> + * through the rx_channel pointer. This delineates ownership of the cache lines,
> + * which is critical to performance and necessary in non-cache coherent
> + * implementations.
> + */
> +struct ivc_channel_header {
> +       union {
> +               struct {
> +                       /* fields owned by the transmitting end */

Fields? According to the context I would say "frames"?

> +                       uint32_t w_count;
> +                       uint32_t state;
> +               };
> +               uint8_t w_align[IVC_ALIGN];
> +       };
> +       union {
> +               /* fields owned by the receiving end */

Same here.

> +               uint32_t r_count;
> +               uint8_t r_align[IVC_ALIGN];
> +       };
> +};
> +
> +static inline void ivc_invalidate_counter(struct ivc *ivc,
> +               dma_addr_t handle)
> +{
> +       if (!ivc->peer_device)
> +               return;
> +       dma_sync_single_for_cpu(ivc->peer_device, handle, IVC_ALIGN,
> +                       DMA_FROM_DEVICE);
> +}
> +
> +static inline void ivc_flush_counter(struct ivc *ivc, dma_addr_t handle)
> +{
> +       if (!ivc->peer_device)
> +               return;
> +       dma_sync_single_for_device(ivc->peer_device, handle, IVC_ALIGN,
> +                       DMA_TO_DEVICE);
> +}
> +
> +static inline int ivc_channel_empty(struct ivc *ivc,
> +               struct ivc_channel_header *ch)

This function should probably return bool.

> +{
> +       /*
> +        * This function performs multiple checks on the same values with
> +        * security implications, so create snapshots with ACCESS_ONCE() to
> +        * ensure that these checks use the same values.
> +        */
> +       uint32_t w_count = ACCESS_ONCE(ch->w_count);
> +       uint32_t r_count = ACCESS_ONCE(ch->r_count);
> +
> +       /*
> +        * Perform an over-full check to prevent denial of service attacks where
> +        * a server could be easily fooled into believing that there's an
> +        * extremely large number of frames ready, since receivers are not
> +        * expected to check for full or over-full conditions.
> +        *
> +        * Although the channel isn't empty, this is an invalid case caused by
> +        * a potentially malicious peer, so returning empty is safer, because it
> +        * gives the impression that the channel has gone silent.
> +        */
> +       if (w_count - r_count > ivc->nframes)
> +               return 1;
> +
> +       return w_count == r_count;
> +}
> +
> +static inline int ivc_channel_full(struct ivc *ivc,
> +               struct ivc_channel_header *ch)

And this one too.
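
To illustrate both suggestions, here is a rough userspace sketch of
bool-returning variants. The sketch_* names and the volatile-load helper
are stand-ins I made up for ACCESS_ONCE() and the patch's structures, so
treat this as an assumption-laden outline rather than a drop-in change:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the patch's structures. */
struct sketch_header {
	uint32_t w_count;
	uint32_t r_count;
};

struct sketch_ivc {
	uint32_t nframes;
};

/* Stand-in for ACCESS_ONCE(): force exactly one load of the counter. */
static inline uint32_t sketch_load_once(const uint32_t *p)
{
	return *(const volatile uint32_t *)p;
}

static inline bool sketch_channel_empty(const struct sketch_ivc *ivc,
					const struct sketch_header *ch)
{
	/* Snapshot both counters so every check sees the same values. */
	uint32_t w_count = sketch_load_once(&ch->w_count);
	uint32_t r_count = sketch_load_once(&ch->r_count);

	/* Over-full counters are reported as empty, as in the patch. */
	if (w_count - r_count > ivc->nframes)
		return true;

	return w_count == r_count;
}

static inline bool sketch_channel_full(const struct sketch_ivc *ivc,
				       const struct sketch_header *ch)
{
	/* Over-capacity counters also appear full, as in the patch. */
	return sketch_load_once(&ch->w_count) -
	       sketch_load_once(&ch->r_count) >= ivc->nframes;
}
```

Callers such as ivc_check_read() already use the result only in a
boolean context, so the change should be mechanical.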

> +{
> +       /*
> +        * Invalid cases where the counters indicate that the queue is over
> +        * capacity also appear full.
> +        */
> +       return ACCESS_ONCE(ch->w_count) - ACCESS_ONCE(ch->r_count)
> +               >= ivc->nframes;
> +}
> +
> +static inline uint32_t ivc_channel_avail_count(struct ivc *ivc,
> +               struct ivc_channel_header *ch)
> +{
> +       /*
> +        * This function isn't expected to be used in scenarios where an
> +        * over-full situation can lead to denial of service attacks. See the
> +        * comment in ivc_channel_empty() for an explanation about special
> +        * over-full considerations.
> +        */
> +       return ACCESS_ONCE(ch->w_count) - ACCESS_ONCE(ch->r_count);
> +}
> +
> +static inline void ivc_advance_tx(struct ivc *ivc)
> +{
> +       ACCESS_ONCE(ivc->tx_channel->w_count) =
> +               ACCESS_ONCE(ivc->tx_channel->w_count) + 1;
> +
> +       if (ivc->w_pos == ivc->nframes - 1)
> +               ivc->w_pos = 0;
> +       else
> +               ivc->w_pos++;
> +}
> +
> +static inline void ivc_advance_rx(struct ivc *ivc)
> +{
> +       ACCESS_ONCE(ivc->rx_channel->r_count) =
> +               ACCESS_ONCE(ivc->rx_channel->r_count) + 1;
> +
> +       if (ivc->r_pos == ivc->nframes - 1)
> +               ivc->r_pos = 0;
> +       else
> +               ivc->r_pos++;
> +}
> +
> +static inline int ivc_check_read(struct ivc *ivc)
> +{
> +       /*
> +        * tx_channel->state is set locally, so it is not synchronized with
> +        * state from the remote peer. The remote peer cannot reset its
> +        * transmit counters until we've acknowledged its synchronization
> +        * request, so no additional synchronization is required because an
> +        * asynchronous transition of rx_channel->state to ivc_state_ack is not
> +        * allowed.
> +        */
> +       if (ivc->tx_channel->state != ivc_state_established)
> +               return -ECONNRESET;
> +
> +       /*
> +        * Avoid unnecessary invalidations when performing repeated accesses to
> +        * an IVC channel by checking the old queue pointers first.
> +        * Synchronization is only necessary when these pointers indicate empty
> +        * or full.
> +        */
> +       if (!ivc_channel_empty(ivc, ivc->rx_channel))
> +               return 0;
> +
> +       ivc_invalidate_counter(ivc, ivc->rx_handle +
> +                       offsetof(struct ivc_channel_header, w_count));
> +       return ivc_channel_empty(ivc, ivc->rx_channel) ? -ENOMEM : 0;
> +}
> +
> +static inline int ivc_check_write(struct ivc *ivc)
> +{
> +       if (ivc->tx_channel->state != ivc_state_established)
> +               return -ECONNRESET;
> +
> +       if (!ivc_channel_full(ivc, ivc->tx_channel))
> +               return 0;
> +
> +       ivc_invalidate_counter(ivc, ivc->tx_handle +
> +                       offsetof(struct ivc_channel_header, r_count));
> +       return ivc_channel_full(ivc, ivc->tx_channel) ? -ENOMEM : 0;
> +}
> +
> +static void *ivc_frame_pointer(struct ivc *ivc, struct ivc_channel_header *ch,
> +               uint32_t frame)
> +{
> +       BUG_ON(frame >= ivc->nframes);
> +       return (void *)((uintptr_t)(ch + 1) + ivc->frame_size * frame);
> +}
> +
> +static inline dma_addr_t ivc_frame_handle(struct ivc *ivc,
> +               dma_addr_t channel_handle, uint32_t frame)
> +{
> +       BUG_ON(!ivc->peer_device);
> +       BUG_ON(frame >= ivc->nframes);
> +       return channel_handle + sizeof(struct ivc_channel_header) +
> +               ivc->frame_size * frame;
> +}
> +
> +static inline void ivc_invalidate_frame(struct ivc *ivc,
> +               dma_addr_t channel_handle, unsigned frame, int offset, int len)
> +{
> +       if (!ivc->peer_device)
> +               return;
> +       dma_sync_single_for_cpu(ivc->peer_device,
> +                       ivc_frame_handle(ivc, channel_handle, frame) + offset,
> +                       len, DMA_FROM_DEVICE);
> +}
> +
> +static inline void ivc_flush_frame(struct ivc *ivc, dma_addr_t channel_handle,
> +               unsigned frame, int offset, int len)
> +{
> +       if (!ivc->peer_device)
> +               return;
> +       dma_sync_single_for_device(ivc->peer_device,
> +                       ivc_frame_handle(ivc, channel_handle, frame) + offset,
> +                       len, DMA_TO_DEVICE);
> +}
> +
> +/* directly peek at the next frame rx'ed */
> +void *tegra_ivc_read_get_next_frame(struct ivc *ivc)
> +{
> +       int result = ivc_check_read(ivc);
> +       if (result)
> +               return ERR_PTR(result);
> +
> +       /*
> +        * Order observation of w_pos potentially indicating new data before
> +        * data read.
> +        */
> +       ivc_rmb();
> +
> +       ivc_invalidate_frame(ivc, ivc->rx_handle, ivc->r_pos, 0,
> +                       ivc->frame_size);
> +       return ivc_frame_pointer(ivc, ivc->rx_channel, ivc->r_pos);
> +}
> +EXPORT_SYMBOL(tegra_ivc_read_get_next_frame);
> +
> +int tegra_ivc_read_advance(struct ivc *ivc)
> +{
> +       /*
> +        * No read barriers or synchronization here: the caller is expected to
> +        * have already observed the channel non-empty. This check is just to
> +        * catch programming errors.
> +        */
> +       int result = ivc_check_read(ivc);
> +       if (result)
> +               return result;
> +
> +       ivc_advance_rx(ivc);
> +       ivc_flush_counter(ivc, ivc->rx_handle +
> +                       offsetof(struct ivc_channel_header, r_count));

This function is called quite a few times, and every time you have
this cumbersome offsetof. In practice you can only flush one of two
things: the write counter (first 64 bits of the header) or the read
counter (second part). Maybe you can specify which one you want to
flush through an extra argument to ivc_flush_counter (say, enum {
COUNTER_RD, COUNTER_WR }), and perform the offsetof there according to
the value of the argument? Same for ivc_invalidate_counter.
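
Something along these lines, as a userspace sketch only. The enum, the
sketch_* names and the value 64 for the alignment are my assumptions
(IVC_ALIGN is defined outside the part quoted here), and the real helper
would pass the computed handle on to dma_sync_single_for_device():

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IVC_ALIGN 64	/* assumed value of IVC_ALIGN */

/* Same layout as the quoted ivc_channel_header, under a sketch name. */
struct sketch_channel_header {
	union {
		struct {
			uint32_t w_count;
			uint32_t state;
		};
		uint8_t w_align[SKETCH_IVC_ALIGN];
	};
	union {
		uint32_t r_count;
		uint8_t r_align[SKETCH_IVC_ALIGN];
	};
};

/* Hypothetical selector for the half of the header to sync. */
enum sketch_counter {
	SKETCH_COUNTER_WR,
	SKETCH_COUNTER_RD,
};

/* The offsetof() now lives in one place instead of at every call site. */
static inline size_t sketch_counter_offset(enum sketch_counter which)
{
	return which == SKETCH_COUNTER_WR ?
		offsetof(struct sketch_channel_header, w_count) :
		offsetof(struct sketch_channel_header, r_count);
}

/* Address that the flush/invalidate helper would hand to the DMA API. */
static inline uint64_t sketch_counter_handle(uint64_t channel_handle,
					     enum sketch_counter which)
{
	return channel_handle + sketch_counter_offset(which);
}
```

A call site would then shrink to something like
ivc_flush_counter(ivc, ivc->rx_handle, COUNTER_RD).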

> +
> +       /*
> +        * Ensure our write to r_pos occurs before our read from w_pos.
> +        */
> +       ivc_mb();
> +
> +       /*
> +        * Notify only upon transition from full to non-full.
> +        * The available count can only asynchronously increase, so the
> +        * worst possible side-effect will be a spurious notification.
> +        */
> +       ivc_invalidate_counter(ivc, ivc->rx_handle +
> +               offsetof(struct ivc_channel_header, w_count));
> +
> +       if (ivc_channel_avail_count(ivc, ivc->rx_channel) == ivc->nframes - 1)
> +               ivc->notify(ivc);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(tegra_ivc_read_advance);
> +
> +/* directly poke at the next frame to be tx'ed */
> +void *tegra_ivc_write_get_next_frame(struct ivc *ivc)
> +{
> +       int result = ivc_check_write(ivc);
> +       if (result)
> +               return ERR_PTR(result);
> +
> +       return ivc_frame_pointer(ivc, ivc->tx_channel, ivc->w_pos);
> +}
> +EXPORT_SYMBOL(tegra_ivc_write_get_next_frame);
> +
> +/* advance the tx buffer */
> +int tegra_ivc_write_advance(struct ivc *ivc)
> +{
> +       int result = ivc_check_write(ivc);
> +       if (result)
> +               return result;
> +
> +       ivc_flush_frame(ivc, ivc->tx_handle, ivc->w_pos, 0, ivc->frame_size);
> +
> +       /*
> +        * Order any possible stores to the frame before update of w_pos.
> +        */
> +       ivc_wmb();
> +
> +       ivc_advance_tx(ivc);
> +       ivc_flush_counter(ivc, ivc->tx_handle +
> +                       offsetof(struct ivc_channel_header, w_count));
> +
> +       /*
> +        * Ensure our write to w_pos occurs before our read from r_pos.
> +        */
> +       ivc_mb();
> +
> +       /*
> +        * Notify only upon transition from empty to non-empty.
> +        * The available count can only asynchronously decrease, so the
> +        * worst possible side-effect will be a spurious notification.
> +        */
> +       ivc_invalidate_counter(ivc, ivc->tx_handle +
> +               offsetof(struct ivc_channel_header, r_count));
> +
> +       if (ivc_channel_avail_count(ivc, ivc->tx_channel) == 1)
> +               ivc->notify(ivc);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(tegra_ivc_write_advance);
> +
> +void tegra_ivc_channel_reset(struct ivc *ivc)
> +{
> +       ivc->tx_channel->state = ivc_state_sync;
> +       ivc_flush_counter(ivc, ivc->tx_handle +
> +                       offsetof(struct ivc_channel_header, w_count));
> +       ivc->notify(ivc);
> +}
> +EXPORT_SYMBOL(tegra_ivc_channel_reset);
> +
> +/*
> + * ===============================================================
> + *  IVC State Transition Table - see tegra_ivc_channel_notified()
> + * ===============================================================
> + *
> + *     local   remote  action
> + *     -----   ------  -----------------------------------
> + *     SYNC    EST     <none>
> + *     SYNC    ACK     reset counters; move to EST; notify
> + *     SYNC    SYNC    reset counters; move to ACK; notify
> + *     ACK     EST     move to EST; notify
> + *     ACK     ACK     move to EST; notify
> + *     ACK     SYNC    reset counters; move to ACK; notify
> + *     EST     EST     <none>
> + *     EST     ACK     <none>
> + *     EST     SYNC    reset counters; move to ACK; notify
> + *
> + * ===============================================================
> + */
> +
> +int tegra_ivc_channel_notified(struct ivc *ivc)
> +{
> +       enum ivc_state peer_state;
> +
> +       /* Copy the receiver's state out of shared memory. */
> +       ivc_invalidate_counter(ivc, ivc->rx_handle +
> +                       offsetof(struct ivc_channel_header, w_count));
> +       peer_state = ACCESS_ONCE(ivc->rx_channel->state);
> +
> +       if (peer_state == ivc_state_sync) {
> +               /*
> +                * Order observation of ivc_state_sync before stores clearing
> +                * tx_channel.
> +                */
> +               ivc_rmb();
> +
> +               /*
> +                * Reset tx_channel counters. The remote end is in the SYNC
> +                * state and won't make progress until we change our state,
> +                * so the counters are not in use at this time.
> +                */
> +               ivc->tx_channel->w_count = 0;
> +               ivc->rx_channel->r_count = 0;
> +
> +               ivc->w_pos = 0;
> +               ivc->r_pos = 0;
> +
> +               /*
> +                * Ensure that counters appear cleared before new state can be
> +                * observed.
> +                */
> +               ivc_wmb();
> +
> +               /*
> +                * Move to ACK state. We have just cleared our counters, so it
> +                * is now safe for the remote end to start using these values.
> +                */
> +               ivc->tx_channel->state = ivc_state_ack;
> +               ivc_flush_counter(ivc, ivc->tx_handle +
> +                               offsetof(struct ivc_channel_header, w_count));
> +
> +               /*
> +                * Notify remote end to observe state transition.
> +                */
> +               ivc->notify(ivc);
> +
> +       } else if (ivc->tx_channel->state == ivc_state_sync &&
> +                       peer_state == ivc_state_ack) {
> +               /*
> +                * Order observation of ivc_state_sync before stores clearing
> +                * tx_channel.
> +                */
> +               ivc_rmb();
> +
> +               /*
> +                * Reset tx_channel counters. The remote end is in the ACK
> +                * state and won't make progress until we change our state,
> +                * so the counters are not in use at this time.
> +                */
> +               ivc->tx_channel->w_count = 0;
> +               ivc->rx_channel->r_count = 0;
> +
> +               ivc->w_pos = 0;
> +               ivc->r_pos = 0;
> +
> +               /*
> +                * Ensure that counters appear cleared before new state can be
> +                * observed.
> +                */
> +               ivc_wmb();
> +
> +               /*
> +                * Move to ESTABLISHED state. We know that the remote end has
> +                * already cleared its counters, so it is safe to start
> +                * writing/reading on this channel.
> +                */
> +               ivc->tx_channel->state = ivc_state_established;
> +               ivc_flush_counter(ivc, ivc->tx_handle +
> +                               offsetof(struct ivc_channel_header, w_count));
> +
> +               /*
> +                * Notify remote end to observe state transition.
> +                */
> +               ivc->notify(ivc);
> +
> +       } else if (ivc->tx_channel->state == ivc_state_ack) {
> +               /*
> +                * At this point, we have observed the peer to be in either
> +                * the ACK or ESTABLISHED state. Next, order observation of
> +                * peer state before storing to tx_channel.
> +                */
> +               ivc_rmb();
> +
> +               /*
> +                * Move to ESTABLISHED state. We know that we have previously
> +                * cleared our counters, and we know that the remote end has
> +                * cleared its counters, so it is safe to start writing/reading
> +                * on this channel.
> +                */
> +               ivc->tx_channel->state = ivc_state_established;
> +               ivc_flush_counter(ivc, ivc->tx_handle +
> +                               offsetof(struct ivc_channel_header, w_count));
> +
> +               /*
> +                * Notify remote end to observe state transition.
> +                */
> +               ivc->notify(ivc);
> +
> +       } else {
> +               /*
> +                * There is no need to handle any further action. Either the
> +                * channel is already fully established, or we are waiting for
> +                * the remote end to catch up with our current state. Refer
> +                * to the diagram in "IVC State Transition Table" above.
> +                */
> +       }
> +
> +       return ivc->tx_channel->state == ivc_state_established ? 0 : -EAGAIN;
> +}
> +EXPORT_SYMBOL(tegra_ivc_channel_notified);
> +
> +size_t tegra_ivc_align(size_t size)
> +{
> +       return (size + (IVC_ALIGN - 1)) & ~(IVC_ALIGN - 1);

return ALIGN(size, IVC_ALIGN)?
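
For reference, a small userspace sketch showing that the open-coded
expression and an ALIGN()-style macro round up the same way.
SKETCH_ALIGN is a local stand-in for the kernel macro, and 64 is an
assumed value for IVC_ALIGN:

```c
#include <stddef.h>

#define SKETCH_IVC_ALIGN 64	/* assumed value of IVC_ALIGN */

/*
 * Stand-in for the kernel's ALIGN(): round x up to the next multiple
 * of a, where a must be a power of two.
 */
#define SKETCH_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* What the function would reduce to. */
static inline size_t sketch_ivc_align(size_t size)
{
	return SKETCH_ALIGN(size, SKETCH_IVC_ALIGN);
}

/* The open-coded form from the patch, kept for comparison. */
static inline size_t sketch_ivc_align_open_coded(size_t size)
{
	return (size + (SKETCH_IVC_ALIGN - 1)) & ~(size_t)(SKETCH_IVC_ALIGN - 1);
}
```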

> +}
> +EXPORT_SYMBOL(tegra_ivc_align);
> +
> +unsigned tegra_ivc_total_queue_size(unsigned queue_size)
> +{
> +       if (queue_size & (IVC_ALIGN - 1)) {
> +               pr_err("%s: queue_size (%u) must be %u-byte aligned\n",
> +                               __func__, queue_size, IVC_ALIGN);
> +               return 0;
> +       }
> +       return queue_size + sizeof(struct ivc_channel_header);
> +}
> +EXPORT_SYMBOL(tegra_ivc_total_queue_size);
> +
> +static int check_ivc_params(uintptr_t queue_base1, uintptr_t queue_base2,
> +               unsigned nframes, unsigned frame_size)
> +{
> +       BUG_ON(offsetof(struct ivc_channel_header, w_count) & (IVC_ALIGN - 1));
> +       BUG_ON(offsetof(struct ivc_channel_header, r_count) & (IVC_ALIGN - 1));
> +       BUG_ON(sizeof(struct ivc_channel_header) & (IVC_ALIGN - 1));

These checks are done on purely static data, so we don't need to repeat
them here for each channel. Is there a way to have them performed at
compilation time instead? Since these constraints must be enforced,
they should also be documented in a comment on struct
ivc_channel_header to avoid unwanted modifications.

Mmm, or thinking twice, since the condition can be evaluated at
compilation time, maybe the compiler will optimize these out entirely?
In that case, a better place to do this would be tegra_ivc_init() - we
want the failure to be reported as early as possible.
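
As a sketch of the compile-time option: in the kernel this would
presumably be BUILD_BUG_ON(); the userspace outline below uses C11
static_assert instead, with a sketch_* copy of the header and an assumed
alignment of 64:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IVC_ALIGN 64	/* assumed value of IVC_ALIGN */

/*
 * Layout constraints (checked below at build time):
 * - w_count and r_count each start on an IVC_ALIGN boundary
 * - the header size is a multiple of IVC_ALIGN
 */
struct sketch_channel_header {
	union {
		struct {
			uint32_t w_count;
			uint32_t state;
		};
		uint8_t w_align[SKETCH_IVC_ALIGN];
	};
	union {
		uint32_t r_count;
		uint8_t r_align[SKETCH_IVC_ALIGN];
	};
};

/* A violated constraint now breaks the build instead of the boot. */
static_assert(offsetof(struct sketch_channel_header, w_count) %
	      SKETCH_IVC_ALIGN == 0, "w_count must be IVC_ALIGN aligned");
static_assert(offsetof(struct sketch_channel_header, r_count) %
	      SKETCH_IVC_ALIGN == 0, "r_count must be IVC_ALIGN aligned");
static_assert(sizeof(struct sketch_channel_header) %
	      SKETCH_IVC_ALIGN == 0, "header must be a multiple of IVC_ALIGN");
```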

> +
> +       if ((uint64_t)nframes * (uint64_t)frame_size >= 0x100000000) {
> +               pr_err("nframes * frame_size overflows\n");
> +               return -EINVAL;
> +       }
> +
> +       /*
> +        * The headers must at least be aligned enough for counters
> +        * to be accessed atomically.
> +        */
> +       if (queue_base1 & (IVC_ALIGN - 1)) {
> +               pr_err("ivc channel start not aligned: %lx\n", queue_base1);
> +               return -EINVAL;
> +       }
> +       if (queue_base2 & (IVC_ALIGN - 1)) {
> +               pr_err("ivc channel start not aligned: %lx\n", queue_base2);
> +               return -EINVAL;
> +       }
> +
> +       if (frame_size & (IVC_ALIGN - 1)) {
> +               pr_err("frame size not adequately aligned: %u\n", frame_size);
> +               return -EINVAL;
> +       }
> +
> +       if (queue_base1 < queue_base2) {
> +               if (queue_base1 + frame_size * nframes > queue_base2) {
> +                       pr_err("queue regions overlap: %lx + %x, %x\n",
> +                                       queue_base1, frame_size,
> +                                       frame_size * nframes);
> +                       return -EINVAL;
> +               }
> +       } else {
> +               if (queue_base2 + frame_size * nframes > queue_base1) {
> +                       pr_err("queue regions overlap: %lx + %x, %x\n",
> +                                       queue_base2, frame_size,
> +                                       frame_size * nframes);
> +                       return -EINVAL;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +int tegra_ivc_init(struct ivc *ivc, uintptr_t rx_base, dma_addr_t rx_handle,
> +                  uintptr_t tx_base, dma_addr_t tx_handle, unsigned nframes,
> +                  unsigned frame_size, struct device *peer_device,
> +                  void (*notify)(struct ivc *))
> +{
> +       size_t queue_size;
> +
> +       int result = check_ivc_params(rx_base, tx_base, nframes, frame_size);
> +       if (result)
> +               return result;
> +
> +       BUG_ON(!ivc);
> +       BUG_ON(!notify);

Why BUG_ON and not return -EINVAL? Is this really unrecoverable?
Doesn't seem so, and later in this function you return error
conditions for other errors...
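
I mean something like the following, shown as a stripped-down userspace
sketch. The sketch_* types are placeholders, and the real function has
many more parameters:

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder for struct ivc, reduced to the field used here. */
struct sketch_ivc {
	void (*notify)(struct sketch_ivc *);
};

/* Trivial callback, only here so the sketch can be exercised. */
static void sketch_noop_notify(struct sketch_ivc *ivc)
{
	(void)ivc;
}

static int sketch_ivc_init(struct sketch_ivc *ivc,
			   void (*notify)(struct sketch_ivc *))
{
	/* Reject bad arguments like any other parameter error. */
	if (!ivc || !notify)
		return -EINVAL;

	ivc->notify = notify;
	return 0;
}
```

That way a buggy caller gets an error code it can handle rather than
taking the whole system down.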

> +
> +       queue_size = tegra_ivc_total_queue_size(nframes * frame_size);
> +
> +       /*
> +        * All sizes that can be returned by communication functions should
> +        * fit in an int.
> +        */
> +       if (frame_size > INT_MAX)
> +               return -E2BIG;
> +
> +       ivc->rx_channel = (struct ivc_channel_header *)rx_base;
> +       ivc->tx_channel = (struct ivc_channel_header *)tx_base;
> +
> +       if (peer_device) {
> +               if (rx_handle != DMA_ERROR_CODE) {

This looks more complicated than it needs to be - in practice,
tx_handle and rx_handle are always DMA_ERROR_CODE when you call this
function from the BPMP code, and this block will never get used.

Furthermore, ivc_flush_*() and ivc_invalidate_*() only rely on
peer_device to decide whether they need to call dma_sync. Calling
these functions on an unmapped page is an invalid use of the DMA API,
so the case where rx_handle != DMA_ERROR_CODE is invalid anyway.

Isn't it possible to remove these rx_handle/tx_handle arguments
altogether and thus simplify this function?

> +                       ivc->rx_handle = rx_handle;
> +                       ivc->tx_handle = tx_handle;
> +               } else {
> +                       ivc->rx_handle = dma_map_single(peer_device,
> +                               ivc->rx_channel, queue_size, DMA_BIDIRECTIONAL);
> +                       if (ivc->rx_handle == DMA_ERROR_CODE)
> +                               return -ENOMEM;
> +
> +                       ivc->tx_handle = dma_map_single(peer_device,
> +                               ivc->tx_channel, queue_size, DMA_BIDIRECTIONAL);
> +                       if (ivc->tx_handle == DMA_ERROR_CODE) {
> +                               dma_unmap_single(peer_device, ivc->rx_handle,
> +                                       queue_size, DMA_BIDIRECTIONAL);
> +                               return -ENOMEM;
> +                       }

When do we unmap these pages btw? I know the BPMP driver is probably
never going to be unloaded, but we should probably have
tegra_ivc_cleanup somewhere since other potential users may make use
of it.

> +               }
> +       }
> +
> +       ivc->notify = notify;
> +       ivc->frame_size = frame_size;
> +       ivc->nframes = nframes;
> +       ivc->peer_device = peer_device;
> +
> +       /*
> +        * These values aren't necessarily correct until the channel has been
> +        * reset.
> +        */
> +       ivc->w_pos = 0;
> +       ivc->r_pos = 0;
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(tegra_ivc_init);
> diff --git a/include/soc/tegra/ivc.h b/include/soc/tegra/ivc.h
> new file mode 100644
> index 000000000000..1762fbee3fa2
> --- /dev/null
> +++ b/include/soc/tegra/ivc.h
> @@ -0,0 +1,102 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#ifndef __TEGRA_IVC_H
> +
> +#include <linux/device.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/types.h>
> +
> +struct ivc_channel_header;
> +
> +struct ivc {

I'd rename this to tegra_ivc to avoid potential name collisions (and
since all using functions are prefixed with tegra_ivc).

> +       struct ivc_channel_header *rx_channel, *tx_channel;
> +       uint32_t w_pos, r_pos;
> +
> +       void (*notify)(struct ivc *);
> +       uint32_t nframes, frame_size;
> +
> +       struct device *peer_device;
> +       dma_addr_t rx_handle, tx_handle;
> +};
> +
> +/**
> + * tegra_ivc_read_get_next_frame - Peek at the next frame to receive
> + * @ivc                pointer of the IVC channel
> + *
> + * Peek at the next frame to be received, without removing it from
> + * the queue.
> + *
> + * Returns a pointer to the frame, or an error encoded pointer.
> + */
> +void *tegra_ivc_read_get_next_frame(struct ivc *ivc);
> +
> +/**
> + * tegra_ivc_read_advance - Advance the read queue
> + * @ivc                pointer of the IVC channel
> + *
> + * Advance the read queue
> + *
> + * Returns 0, or a negative error value if failed.
> + */
> +int tegra_ivc_read_advance(struct ivc *ivc);
> +
> +/**
> + * tegra_ivc_write_get_next_frame - Poke at the next frame to transmit
> + * @ivc                pointer of the IVC channel
> + *
> + * Get access to the next frame.
> + *
> + * Returns a pointer to the frame, or an error encoded pointer.
> + */
> +void *tegra_ivc_write_get_next_frame(struct ivc *ivc);
> +
> +/**
> + * tegra_ivc_write_advance - Advance the write queue
> + * @ivc                pointer of the IVC channel
> + *
> + * Advance the write queue
> + *
> + * Returns 0, or a negative error value if failed.
> + */
> +int tegra_ivc_write_advance(struct ivc *ivc);
> +
> +/**
> + * tegra_ivc_channel_notified - handle internal messages
> + * @ivc                pointer of the IVC channel
> + *
> + * This function must be called following every notification.
> + *
> + * Returns 0 if the channel is ready for communication, or -EAGAIN if a channel
> + * reset is in progress.
> + */
> +int tegra_ivc_channel_notified(struct ivc *ivc);
> +
> +/**
> + * tegra_ivc_channel_reset - initiates a reset of the shared memory state
> + * @ivc                pointer of the IVC channel
> + *
> + * This function must be called after a channel is reserved before it is used
> + * for communication. The channel will be ready for use when a subsequent call
> + * to notify the remote of the channel reset.
> + */
> +void tegra_ivc_channel_reset(struct ivc *ivc);
> +
> +size_t tegra_ivc_align(size_t size);
> +unsigned tegra_ivc_total_queue_size(unsigned queue_size);
> +int tegra_ivc_init(struct ivc *ivc, uintptr_t rx_base, dma_addr_t rx_handle,
> +                  uintptr_t tx_base, dma_addr_t tx_handle, unsigned nframes,
> +                  unsigned frame_size, struct device *peer_device,
> +                  void (*notify)(struct ivc *));
> +
> +#endif /* __TEGRA_IVC_H */
> --
> 2.9.0
>

* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
  2016-07-06 17:02   ` Stephen Warren
@ 2016-07-07 18:13   ` Sivaram Nair
  2016-07-07 18:35     ` Stephen Warren
  1 sibling, 1 reply; 51+ messages in thread
From: Sivaram Nair @ 2016-07-07 18:13 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
> Add DT binding for the Hardware Synchronization Primitives (HSP). The
> HSP is designed for the processors to share resources and communicate
> together. It provides a set of hardware synchronization primitives for
> interprocessor communication. So the interprocessor communication (IPC)
> protocols can use hardware synchronization primitive, when operating
> between two processors not in an SMP relationship.
> 
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - revise the compatible string, interrupt-names, interrupts, and #mbox-cells
>   properties
> - remove "nvidia,hsp-function" property
> - fix the header file name
> - the binding supports the concept of multiple HSP sub-modules on one HSP HW
>   block now.
> ---
>  .../bindings/mailbox/nvidia,tegra186-hsp.txt       | 51 ++++++++++++++++++++++
>  include/dt-bindings/mailbox/tegra186-hsp.h         | 23 ++++++++++
>  2 files changed, 74 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
>  create mode 100644 include/dt-bindings/mailbox/tegra186-hsp.h
> 
> diff --git a/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt b/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
> new file mode 100644
> index 000000000000..10e53edbe1c7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
> @@ -0,0 +1,51 @@
> +NVIDIA Tegra Hardware Synchronization Primitives (HSP)
> +
> +The HSP modules are used for the processors to share resources and communicate
> +together. It provides a set of hardware synchronization primitives for
> +interprocessor communication. So the interprocessor communication (IPC)
> +protocols can use hardware synchronization primitives, when operating between
> +two processors not in an SMP relationship.
> +
> +The features that HSP supported are shared mailboxes, shared semaphores,
> +arbitrated semaphores and doorbells.
> +
> +Required properties:
> +- name : Should be hsp
> +- compatible
> +    Array of strings.
> +    one of:
> +    - "nvidia,tegra186-hsp"
> +- reg : Offset and length of the register set for the device.
> +- interrupt-names
> +    Array of strings.
> +    Contains a list of names for the interrupts described by the interrupt
> +    property. May contain the following entries, in any order:
> +    - "doorbell"
> +    Users of this binding MUST look up entries in the interrupt property
> +    by name, using this interrupt-names property to do so.
> +- interrupts
> +    Array of interrupt specifiers.
> +    Must contain one entry per entry in the interrupt-names property,
> +    in a matching order.
> +- #mbox-cells : Should be 1.
> +
> +The mbox specifier of the "mboxes" property in the client node should use
> +the "HSP_MBOX_ID" macro which integrates the HSP type and master ID data.
> +Those information can be found in the following file.
> +
> +- <dt-bindings/mailbox/tegra186-hsp.h>.
> +
> +Example:
> +
> +hsp_top0: hsp@3c00000 {
> +	compatible = "nvidia,tegra186-hsp";
> +	reg = <0x0 0x03c00000 0x0 0xa0000>;
> +	interrupts = <GIC_SPI 176 IRQ_TYPE_LEVEL_HIGH>;
> +	interrupt-names = "doorbell";
> +	#mbox-cells = <1>;
> +};
> +
> +client {
> +	...
> +	mboxes = <&hsp_top0 HSP_MBOX_ID(DB, HSP_DB_MASTER_XXX)>;
> +};
> diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h
> new file mode 100644
> index 000000000000..365dbeb5d894
> --- /dev/null
> +++ b/include/dt-bindings/mailbox/tegra186-hsp.h
> @@ -0,0 +1,23 @@
> +/*
> + * This header provides constants for binding nvidia,tegra186-hsp.
> + *
> + * The number with HSP_DB_MASTER prefix indicates the bit that is
> + * associated with a master ID in the doorbell registers.
> + */
> +
> +
> +#ifndef _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H
> +#define _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H
> +
> +#define HSP_MBOX_TYPE_DB 0x0
> +#define HSP_MBOX_TYPE_SM 0x1
> +#define HSP_MBOX_TYPE_SS 0x2
> +#define HSP_MBOX_TYPE_AS 0x3
> +
> +#define HSP_DB_MASTER_CCPLEX 17
> +#define HSP_DB_MASTER_BPMP 19
> +
> +#define HSP_MBOX_ID(type, ID) \
> +		(HSP_MBOX_TYPE_##type << 16 | ID)

It would be nicer if you avoided the macro glue magic '##' for 'type'. I
would also suggest putting parentheses around 'type' and 'ID'.
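
Roughly what I have in mind, as a sketch. SKETCH_HSP_MBOX_ID is a
made-up name so it does not clash with the macro in the patch, and the
caller is assumed to pass the full HSP_MBOX_TYPE_* constant:

```c
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1
#define HSP_MBOX_TYPE_SS 0x2
#define HSP_MBOX_TYPE_AS 0x3

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

/*
 * Hypothetical variant without token pasting: both arguments and the
 * shift are parenthesized, so expressions can be passed safely.
 */
#define SKETCH_HSP_MBOX_ID(type, id) \
		(((type) << 16) | (id))
```

The client node would then spell out the type, for example
SKETCH_HSP_MBOX_ID(HSP_MBOX_TYPE_DB, HSP_DB_MASTER_BPMP).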

> +
> +#endif	/* _DT_BINDINGS_MAILBOX_TEGRA186_HSP_H */
> -- 
> 2.9.0
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-07 18:13   ` Sivaram Nair
@ 2016-07-07 18:35     ` Stephen Warren
  2016-07-07 18:44       ` Sivaram Nair
  2016-07-11 14:14       ` Rob Herring
  0 siblings, 2 replies; 51+ messages in thread
From: Stephen Warren @ 2016-07-07 18:35 UTC (permalink / raw)
  To: Sivaram Nair, Joseph Lo
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/07/2016 12:13 PM, Sivaram Nair wrote:
> On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
>> Add DT binding for the Hardware Synchronization Primitives (HSP). The
>> HSP is designed for the processors to share resources and communicate
>> together. It provides a set of hardware synchronization primitives for
>> interprocessor communication. So the interprocessor communication (IPC)
>> protocols can use hardware synchronization primitive, when operating
>> between two processors not in an SMP relationship.

>> diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h

>> +#define HSP_MBOX_TYPE_DB 0x0
>> +#define HSP_MBOX_TYPE_SM 0x1
>> +#define HSP_MBOX_TYPE_SS 0x2
>> +#define HSP_MBOX_TYPE_AS 0x3
>> +
>> +#define HSP_DB_MASTER_CCPLEX 17
>> +#define HSP_DB_MASTER_BPMP 19
>> +
>> +#define HSP_MBOX_ID(type, ID) \
>> +		(HSP_MBOX_TYPE_##type << 16 | ID)
>
> It would be nicer if you avoided the macro glue magic '##' for 'type'. I
> would also suggest putting parentheses around 'type' and 'ID'.

This technique has been used without issue in quite a few other places,
and has the benefit of simplifying the text wherever the macro is used.
What issue do you foresee?

BTW, if this patch does need reposting, I'd suggest s/ID/id/ since macro 
parameters are usually lower-case.
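
For reference, a minimal userspace sketch of the two points raised above (parentheses and a lower-case parameter), keeping the '##' glue. The names HSP_MBOX_ID_POSTED, HSP_MBOX_ID_PAREN, hsp_id_posted() and hsp_id_paren() are made up for this illustration and are not part of the patch. With plain constants both forms agree; they only differ when the ID argument is itself an expression such as a conditional.

```c
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

/* As posted in V2: the arguments are not parenthesised. */
#define HSP_MBOX_ID_POSTED(type, ID) \
		(HSP_MBOX_TYPE_##type << 16 | ID)

/* Hypothetical rework: same '##' glue, lower-case parameter, parentheses. */
#define HSP_MBOX_ID_PAREN(type, id) \
		((HSP_MBOX_TYPE_##type << 16) | (id))

/*
 * Pass a conditional expression as the ID. In the posted form the '|'
 * binds tighter than '?:', so the shifted type becomes part of the
 * condition and the result is just one of the two master IDs.
 */
static unsigned int hsp_id_posted(int to_bpmp)
{
	return HSP_MBOX_ID_POSTED(SM, to_bpmp ? HSP_DB_MASTER_BPMP
					      : HSP_DB_MASTER_CCPLEX);
}

/* Same call through the parenthesised form: type and master ID survive. */
static unsigned int hsp_id_paren(int to_bpmp)
{
	return HSP_MBOX_ID_PAREN(SM, to_bpmp ? HSP_DB_MASTER_BPMP
					     : HSP_DB_MASTER_CCPLEX);
}
```

In the dts files the macro is only ever given plain constants, so this is a robustness point rather than an observed bug.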

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-07 18:35     ` Stephen Warren
@ 2016-07-07 18:44       ` Sivaram Nair
  2016-07-11 14:14       ` Rob Herring
  1 sibling, 0 replies; 51+ messages in thread
From: Sivaram Nair @ 2016-07-07 18:44 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Joseph Lo, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Thu, Jul 07, 2016 at 12:35:02PM -0600, Stephen Warren wrote:
> On 07/07/2016 12:13 PM, Sivaram Nair wrote:
> >On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
> >>Add DT binding for the Hardware Synchronization Primitives (HSP). The
> >>HSP is designed for the processors to share resources and communicate
> >>together. It provides a set of hardware synchronization primitives for
> >>interprocessor communication. So the interprocessor communication (IPC)
> >>protocols can use hardware synchronization primitive, when operating
> >>between two processors not in an SMP relationship.
> 
> >>diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h
> 
> >>+#define HSP_MBOX_TYPE_DB 0x0
> >>+#define HSP_MBOX_TYPE_SM 0x1
> >>+#define HSP_MBOX_TYPE_SS 0x2
> >>+#define HSP_MBOX_TYPE_AS 0x3
> >>+
> >>+#define HSP_DB_MASTER_CCPLEX 17
> >>+#define HSP_DB_MASTER_BPMP 19
> >>+
> >>+#define HSP_MBOX_ID(type, ID) \
> >>+		(HSP_MBOX_TYPE_##type << 16 | ID)
> >
> >It would be nicer if you avoided the macro glue magic '##' for 'type'. I
> >would also suggest putting parentheses around 'type' and 'ID'.
> 
> This technique has been used without issue in quite a few other
> places, and has the benefit of simplifying the text wherever
> the macro is used. What issue do you foresee?

Avoiding the '##' glue improves readability where HSP_MBOX_ID is used (in
the tegra186.dtsi file in this case), but consider this a cosmetic comment.

> 
> BTW, if this patch does need reposting, I'd suggest s/ID/id/ since
> macro parameters are usually lower-case.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-07 10:18       ` Alexandre Courbot
@ 2016-07-07 19:55         ` Stephen Warren
  2016-07-08 20:19         ` Sivaram Nair
  1 sibling, 0 replies; 51+ messages in thread
From: Stephen Warren @ 2016-07-07 19:55 UTC (permalink / raw)
  To: Alexandre Courbot, Joseph Lo
  Cc: Thierry Reding, linux-tegra, linux-arm-kernel, Rob Herring,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon

On 07/07/2016 04:18 AM, Alexandre Courbot wrote:
> On Thu, Jul 7, 2016 at 5:17 PM, Joseph Lo <josephl@nvidia.com> wrote:
>> On 07/06/2016 07:39 PM, Alexandre Courbot wrote:
>>>
>>> Sorry, I will probably need to do several passes on this one to
>>> understand everything, but here is what I can say after a first look:
>>>
>>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>>
>>>> The Tegra BPMP (Boot and Power Management Processor) is designed for the
>>>> booting process handling, offloading the power management tasks and
>>>> some system control services from the CPU. It can be clock, DVFS,
>>>> thermal/EDP, power gating operation and system suspend/resume handling.
>>>> So the CPU and the drivers of these modules can base on the service that
>>>> the BPMP firmware driver provided to signal the event for the specific PM
>>>> action to BPMP and receive the status update from BPMP.
>>>>
>>>> Comparing to the ARM SCPI, the service provided by BPMP is message-based
>>>> communication but not method-based. The BPMP firmware driver provides the
>>>> send/receive service for the users, when the user concerns the response
>>>> time. If the user needs to get the event or update from the firmware, it
>>>> can request the MRQ service as well. The user needs to take care of the
>>>> message format, which we call BPMP ABI.
>>>>
>>>> The BPMP ABI defines the message format for different modules or usages.
>>>> For example, the clock operation needs an MRQ service code called
>>>> MRQ_CLK with specific message format which includes different sub
>>>> commands for various clock operations. This is the message format that
>>>> BPMP can recognize.
>>>>
>>>> So the user needs two things to initiate IPC between BPMP. Get the
>>>> service from the bpmp_ops structure and maintain the message format as
>>>> the BPMP ABI defined.

>>>> +static struct tegra_bpmp *bpmp;
>>>
>>> static? Ok, we only need one... for now. How about a private member in
>>> your ivc structure that allows you to retrieve the bpmp and going
>>> dynamic? This will require an extra argument in many functions, but is
>>> cleaner design IMHO.
>>
>> IVC is designed as a generic IPC protocol among different modules (We have
>> not introduced some other usages of the IVC right now.). Maybe don't churn
>> some other stuff into IVC is better.
>
> Anything is fine if you can get rid of that static.

Typically the way this is handled is to store the "struct ivc" inside 
some other struct, and use the container_of macro to "move" from a 
"struct ivc *" to a "struct XXX *" where "struct XXX" contains a "struct 
ivc" field within it. That way, the IVC code's "struct ivc" knows 
nothing about any IVC client's code/structures/..., doesn't have to 
store any "client data", and yet the IVC client (the BPMP driver) can 
acquire a "struct bpmp" pointer from any "struct ivc" pointer that it 
"owns".

struct bpmp_mbox {
     struct bpmp *bpmp;
     struct ivc ivc;
};

void bpmp_call_ivc(struct bpmp_mbox *mb) {
     ivc_something(&mb->ivc);
}

void bpmp_callback_from_ivc(struct ivc *ivc) {
     struct bpmp_mbox *mb = container_of(ivc, struct bpmp_mbox, ivc);
     struct bpmp *bpmp = mb->bpmp;
}

(I didn't check if I got the parameters to container_of correct above)
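
For anyone who wants to try the pattern outside the kernel, here is a small userspace sketch. It assumes a simplified stand-in for the kernel's container_of() built on offsetof(), and the struct contents and the helper name bpmp_from_ivc() are invented for the example; only the embedding pattern itself is taken from the description above.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel macro: (pointer, type, member). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder contents; the real structures would carry driver state. */
struct bpmp {
	int id;
};

struct ivc {
	int dummy;
};

/* The IVC client embeds "struct ivc" next to its own back-pointer. */
struct bpmp_mbox {
	struct bpmp *bpmp;
	struct ivc ivc;
};

/* Recover the owning bpmp from a "struct ivc" pointer handed to a callback. */
static struct bpmp *bpmp_from_ivc(struct ivc *ivc)
{
	struct bpmp_mbox *mb = container_of(ivc, struct bpmp_mbox, ivc);

	return mb->bpmp;
}
```

The IVC side only ever sees "struct ivc *", so it needs no client data field, and the global "static struct tegra_bpmp *bpmp" is no longer required for the lookup.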

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-05  9:04 ` [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver Joseph Lo
  2016-07-06  7:05   ` Alexandre Courbot
@ 2016-07-07 21:10   ` Sivaram Nair
  2016-07-18  8:51     ` Joseph Lo
  1 sibling, 1 reply; 51+ messages in thread
From: Sivaram Nair @ 2016-07-07 21:10 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Tue, Jul 05, 2016 at 05:04:23PM +0800, Joseph Lo wrote:
> The Tegra HSP mailbox driver implements the signaling doorbell-based
> interprocessor communication (IPC) for remote processors currently. The
> HSP HW modules support some different features for that, which are
> shared mailboxes, shared semaphores, arbitrated semaphores, and
> doorbells. And there are multiple HSP HW instances on the chip. So the
> driver is extendable to support more features for different IPC
> requirement.
> 
> The driver of remote processor can use it as a mailbox client and deal
> with the IPC protocol to synchronize the data communications.
> 
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - Update the driver to support the binding changes in V2
> - it's extendable to support multiple HSP sub-modules on the same HSP HW block
>   now.
> ---
>  drivers/mailbox/Kconfig     |   9 +
>  drivers/mailbox/Makefile    |   2 +
>  drivers/mailbox/tegra-hsp.c | 418 ++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 429 insertions(+)
>  create mode 100644 drivers/mailbox/tegra-hsp.c
> 
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index 5305923752d2..fe584cb54720 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>  	  Test client to help with testing new Controller driver
>  	  implementations.
>  
> +config TEGRA_HSP_MBOX
> +	bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
> +	depends on ARCH_TEGRA_186_SOC
> +	help
> +	  The Tegra HSP driver is used for the interprocessor communication
> +	  between different remote processors and host processors on Tegra186
> +	  and later SoCs. Say Y here if you want to have this support.
> +	  If unsure say N.
> +
>  config XGENE_SLIMPRO_MBOX
>  	tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>  	depends on ARCH_XGENE
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index 0be3e742bb7d..26d8f91c7fea 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>  obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>  
>  obj-$(CONFIG_HI6220_MBOX)	+= hi6220-mailbox.o
> +
> +obj-${CONFIG_TEGRA_HSP_MBOX}	+= tegra-hsp.o
> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
> new file mode 100644
> index 000000000000..93c3ef58f29f
> --- /dev/null
> +++ b/drivers/mailbox/tegra-hsp.c
> @@ -0,0 +1,418 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <dt-bindings/mailbox/tegra186-hsp.h>
> +
> +#define HSP_INT_DIMENSIONING	0x380
> +#define HSP_nSM_OFFSET		0
> +#define HSP_nSS_OFFSET		4
> +#define HSP_nAS_OFFSET		8
> +#define HSP_nDB_OFFSET		12
> +#define HSP_nSI_OFFSET		16
> +#define HSP_nINT_MASK		0xf
> +
> +#define HSP_DB_REG_TRIGGER	0x0
> +#define HSP_DB_REG_ENABLE	0x4
> +#define HSP_DB_REG_RAW		0x8
> +#define HSP_DB_REG_PENDING	0xc
> +
> +#define HSP_DB_CCPLEX		1
> +#define HSP_DB_BPMP		3
> +
> +#define MAX_NUM_HSP_CHAN 32

Is this an arbitrarily chosen number?

> +#define MAX_NUM_HSP_DB 7
> +
> +#define hsp_db_offset(i, d) \
> +	(d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) + \
> +	(i) * 0x100)
> +
> +struct tegra_hsp_db_chan {
> +	int master_id;
> +	int db_id;

These should be unsigned? 

> +};
> +
> +struct tegra_hsp_mbox_chan {
> +	int type;

This too...

> +	union {
> +		struct tegra_hsp_db_chan db_chan;
> +	};
> +};

Why do we need to use a union?

> +
> +struct tegra_hsp_mbox {
> +	struct mbox_controller *mbox;
> +	void __iomem *base;
> +	void __iomem *db_base[MAX_NUM_HSP_DB];
> +	int db_irq;
> +	int nr_sm;
> +	int nr_as;
> +	int nr_ss;
> +	int nr_db;
> +	int nr_si;
> +	spinlock_t lock;

You might need to change this to a mutex - see below.

> +};
> +
> +static inline u32 hsp_readl(void __iomem *base, int reg)
> +{
> +	return readl(base + reg);
> +}
> +
> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
> +{
> +	writel(val, base + reg);
> +	readl(base + reg);
> +}
> +
> +static int hsp_db_can_ring(void __iomem *db_base)
> +{
> +	u32 reg;
> +
> +	reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
> +
> +	return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
> +}
> +
> +static irqreturn_t hsp_db_irq(int irq, void *p)
> +{
> +	struct tegra_hsp_mbox *hsp_mbox = p;
> +	ulong val;

This should be u32 and...

> +	int master_id;
> +
> +	val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
> +			       HSP_DB_REG_PENDING);

the cast should/can be removed (hsp_readl and hsp_writel both use u32)?

> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING, val);
> +
> +	spin_lock(&hsp_mbox->lock);
> +	for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
> +		struct mbox_chan *chan;
> +		struct tegra_hsp_mbox_chan *mchan;
> +		int i;
> +
> +		for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
> +			chan = &hsp_mbox->mbox->chans[i];
> +
> +			if (!chan->con_priv)
> +				continue;
> +
> +			mchan = chan->con_priv;
> +			if (mchan->type == HSP_MBOX_TYPE_DB &&
> +			    mchan->db_chan.master_id == master_id)
> +				break;
> +			chan = NULL;
> +		}

Like Alexandre, I don't like this use of an inner loop either, but I will
add my comment to the other thread.

> +
> +		if (chan)
> +			mbox_chan_received_data(chan, NULL);
> +	}
> +	spin_unlock(&hsp_mbox->lock);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
> +{
> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +
> +	hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
> +
> +	return 0;
> +}
> +
> +static int hsp_db_startup(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +	u32 val;
> +	unsigned long flag;
> +
> +	if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {

Is this a valid check? IIUC, MAX_NUM_HSP_CHAN is independent of the
number of masters.

> +		dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
> +			db_chan->master_id);
> +		return -EINVAL;
> +	}
> +
> +	spin_lock_irqsave(&hsp_mbox->lock, flag);
> +	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> +	val |= BIT(db_chan->master_id); 
> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> +	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> +
> +	if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
> +		return -ENODEV;
> +
> +	return 0;
> +}
> +
> +static void hsp_db_shutdown(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> +	u32 val;
> +	unsigned long flag;
> +
> +	spin_lock_irqsave(&hsp_mbox->lock, flag);
> +	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> +	val &= ~BIT(db_chan->master_id);
> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> +	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> +}
> +
> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
> +{
> +	return true;
> +}
> +
> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
> +			     struct mbox_chan *mchan, int master_id)
> +{
> +	struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan;
> +	int ret;
> +
> +	if (!hsp_mbox->db_irq) {
> +		int i;
> +
> +		hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
> +		ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
> +				       hsp_db_irq, IRQF_NO_SUSPEND,
> +				       dev_name(&pdev->dev), hsp_mbox);
> +		if (ret)
> +			return ret;
> +
> +		for (i = 0; i < MAX_NUM_HSP_DB; i++)
> +			hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);
> +	}
> +
> +	hsp_mbox_chan = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox_chan),
> +				     GFP_KERNEL);
> +	if (!hsp_mbox_chan)
> +		return -ENOMEM;
> +
> +	hsp_mbox_chan->type = HSP_MBOX_TYPE_DB;
> +	hsp_mbox_chan->db_chan.master_id = master_id;
> +	switch (master_id) {
> +	case HSP_DB_MASTER_BPMP:
> +		hsp_mbox_chan->db_chan.db_id = HSP_DB_BPMP;
> +		break;
> +	default:
> +		hsp_mbox_chan->db_chan.db_id = MAX_NUM_HSP_DB;
> +		break;
> +	}
> +
> +	mchan->con_priv = hsp_mbox_chan;
> +
> +	return 0;
> +}
> +
> +static int hsp_send_data(struct mbox_chan *chan, void *data)
> +{
> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +	int ret = 0;
> +
> +	switch (hsp_mbox_chan->type) {
> +	case HSP_MBOX_TYPE_DB:
> +		ret = hsp_db_send_data(chan, data);
> +		break;
> +	default:

Should you return an error here?

> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +static int hsp_startup(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +	int ret = 0;
> +
> +	switch (hsp_mbox_chan->type) {
> +	case HSP_MBOX_TYPE_DB:
> +		ret = hsp_db_startup(chan);
> +		break;
> +	default:

And here too...?

> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +static void hsp_shutdown(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +
> +	switch (hsp_mbox_chan->type) {
> +	case HSP_MBOX_TYPE_DB:
> +		hsp_db_shutdown(chan);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	chan->con_priv = NULL;
> +}
> +
> +static bool hsp_last_tx_done(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
> +	bool ret = true;
> +
> +	switch (hsp_mbox_chan->type) {
> +	case HSP_MBOX_TYPE_DB:
> +		ret = hsp_db_last_tx_done(chan);

hsp_db_last_tx_done() always returns true, so we might as well make this
parent function return true and remove hsp_db_last_tx_done()?

> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +static const struct mbox_chan_ops tegra_hsp_ops = {
> +	.send_data = hsp_send_data,
> +	.startup = hsp_startup,
> +	.shutdown = hsp_shutdown,
> +	.last_tx_done = hsp_last_tx_done,
> +};
> +
> +static const struct of_device_id tegra_hsp_match[] = {
> +	{ .compatible = "nvidia,tegra186-hsp" },
> +	{ }
> +};
> +
> +static struct mbox_chan *
> +of_hsp_mbox_xlate(struct mbox_controller *mbox,
> +		  const struct of_phandle_args *sp)
> +{
> +	int mbox_id = sp->args[0];
> +	int hsp_type = (mbox_id >> 16) & 0xf;

Wouldn't it be nicer if the shift and mask constants were made defines in
the DT bindings header (tegra186-hsp.h)?
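
Something along these lines is what I have in mind; treat it as a sketch only. HSP_MBOX_TYPE_SHIFT, HSP_MBOX_TYPE_MASK, HSP_MBOX_MASTER_MASK and the two decode helpers are hypothetical names, with the values copied from the literals used in of_hsp_mbox_xlate(). The defines would sit in the bindings header, while the inline helpers would stay in the driver, since the header is also included from dts files.

```c
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1
#define HSP_MBOX_TYPE_SS 0x2
#define HSP_MBOX_TYPE_AS 0x3

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

/* Hypothetical names for the constants shared by encoder and decoder. */
#define HSP_MBOX_TYPE_SHIFT	16
#define HSP_MBOX_TYPE_MASK	0xf
#define HSP_MBOX_MASTER_MASK	0xff

#define HSP_MBOX_ID(type, id) \
	((HSP_MBOX_TYPE_##type << HSP_MBOX_TYPE_SHIFT) | (id))

/* Driver side: extract the HSP type from an mbox specifier cell. */
static inline unsigned int hsp_mbox_id_to_type(unsigned int mbox_id)
{
	return (mbox_id >> HSP_MBOX_TYPE_SHIFT) & HSP_MBOX_TYPE_MASK;
}

/* Driver side: extract the master ID from an mbox specifier cell. */
static inline unsigned int hsp_mbox_id_to_master(unsigned int mbox_id)
{
	return mbox_id & HSP_MBOX_MASTER_MASK;
}
```

That way the encoding used in the dts and the decoding used in the driver cannot drift apart.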

> +	int master_id = mbox_id & 0xff;
> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(mbox->dev);
> +	struct mbox_chan *free_chan;
> +	int i, ret = 0;
> +
> +	spin_lock(&hsp_mbox->lock);

If you must use spin locks, you will have to use the irqsave/restore
variants in this function (called from thread context).

> +
> +	for (i = 0; i < mbox->num_chans; i++) {
> +		free_chan = &mbox->chans[i];
> +		if (!free_chan->con_priv)
> +			break;
> +		free_chan = NULL;
> +	}
> +
> +	if (!free_chan) {
> +		spin_unlock(&hsp_mbox->lock);
> +		return ERR_PTR(-EFAULT);
> +	}

IMO, it will be cleaner & simpler if you move the above code (doing the
lookup) into a separate function that returns free_chan - and you can
reuse that in hsp_db_irq()
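
A rough sketch of what such helpers could look like, written against simplified stand-in types so it builds in userspace. struct chan, struct hsp_chan_priv, hsp_find_free_chan() and hsp_find_db_chan() are all hypothetical; the real code would operate on struct mbox_chan and struct tegra_hsp_mbox_chan. Since the two call sites search on different conditions, the sketch uses two small helpers rather than one.

```c
#include <stddef.h>

#define HSP_MBOX_TYPE_DB 0x0

/* Simplified stand-ins for the per-channel private data and mbox_chan. */
struct hsp_chan_priv {
	int type;
	int master_id;
};

struct chan {
	struct hsp_chan_priv *con_priv;
};

/* Return the first channel with no private data attached, or NULL. */
static struct chan *hsp_find_free_chan(struct chan *chans, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++)
		if (!chans[i].con_priv)
			return &chans[i];

	return NULL;
}

/* Return the doorbell channel bound to master_id, or NULL if none is. */
static struct chan *hsp_find_db_chan(struct chan *chans, size_t num,
				     int master_id)
{
	size_t i;

	for (i = 0; i < num; i++) {
		struct hsp_chan_priv *priv = chans[i].con_priv;

		if (priv && priv->type == HSP_MBOX_TYPE_DB &&
		    priv->master_id == master_id)
			return &chans[i];
	}

	return NULL;
}
```

With these, both the xlate path and the ISR reduce to a single call plus a NULL check, and the "chan = NULL" bookkeeping inside the loops goes away.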

> +
> +	switch (hsp_type) {
> +	case HSP_MBOX_TYPE_DB:
> +		ret = tegra_hsp_db_init(hsp_mbox, free_chan, master_id);

tegra_hsp_db_init() uses devm_kzalloc() with GFP_KERNEL, which may sleep,
and you are calling it while holding a spinlock.
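
One common way out is to do the allocation before taking the lock and only publish the result under it. Below is a userspace analogue of that ordering, not kernel code: a C11 atomic_flag stands in for the spinlock_t, malloc() for devm_kzalloc(), and every demo_* name is invented for the illustration. It does not model interrupt masking at all.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NUM_CHANS 2

struct demo_chan {
	int *con_priv;
};

struct demo_ctrl {
	atomic_flag lock;	/* stand-in for the driver's spinlock_t */
	struct demo_chan chans[DEMO_NUM_CHANS];
};

static void demo_lock(struct demo_ctrl *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void demo_unlock(struct demo_ctrl *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Claim a free channel and attach private data recording its type. */
static struct demo_chan *demo_claim_chan(struct demo_ctrl *c, int type)
{
	struct demo_chan *found = NULL;
	int *priv;
	size_t i;

	/* Allocate first, so nothing that may block runs under the lock. */
	priv = malloc(sizeof(*priv));
	if (!priv)
		return NULL;
	*priv = type;

	demo_lock(c);
	for (i = 0; i < DEMO_NUM_CHANS; i++) {
		if (!c->chans[i].con_priv) {
			found = &c->chans[i];
			found->con_priv = priv;
			break;
		}
	}
	demo_unlock(c);

	/* No free channel: release the allocation so nothing leaks. */
	if (!found)
		free(priv);

	return found;
}
```

The critical section then contains only the search and one pointer store, and the failure path hands back everything it took.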

> +		break;
> +	default:

Not returning an error here will also cause a resource leak (free_chan).

> +		break;
> +	}
> +
> +	spin_unlock(&hsp_mbox->lock);
> +
> +	if (ret)
> +		free_chan = ERR_PTR(-EFAULT);
> +
> +	return free_chan;
> +}
> +
> +static int tegra_hsp_probe(struct platform_device *pdev)
> +{
> +	struct tegra_hsp_mbox *hsp_mbox;
> +	struct resource *res;
> +	int ret = 0;
> +	u32 reg;
> +
> +	hsp_mbox = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox), GFP_KERNEL);
> +	if (!hsp_mbox)
> +		return -ENOMEM;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	hsp_mbox->base = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(hsp_mbox->base))
> +		return PTR_ERR(hsp_mbox->base);
> +
> +	reg = hsp_readl(hsp_mbox->base, HSP_INT_DIMENSIONING);
> +	hsp_mbox->nr_sm = (reg >> HSP_nSM_OFFSET) & HSP_nINT_MASK;
> +	hsp_mbox->nr_ss = (reg >> HSP_nSS_OFFSET) & HSP_nINT_MASK;
> +	hsp_mbox->nr_as = (reg >> HSP_nAS_OFFSET) & HSP_nINT_MASK;
> +	hsp_mbox->nr_db = (reg >> HSP_nDB_OFFSET) & HSP_nINT_MASK;
> +	hsp_mbox->nr_si = (reg >> HSP_nSI_OFFSET) & HSP_nINT_MASK;
> +
> +	hsp_mbox->mbox = devm_kzalloc(&pdev->dev,
> +				      sizeof(*hsp_mbox->mbox), GFP_KERNEL);
> +	if (!hsp_mbox->mbox)
> +		return -ENOMEM;
> +
> +	hsp_mbox->mbox->chans =
> +		devm_kcalloc(&pdev->dev, MAX_NUM_HSP_CHAN,
> +			     sizeof(*hsp_mbox->mbox->chans), GFP_KERNEL);
> +	if (!hsp_mbox->mbox->chans)
> +		return -ENOMEM;
> +
> +	hsp_mbox->mbox->of_xlate = of_hsp_mbox_xlate;
> +	hsp_mbox->mbox->num_chans = MAX_NUM_HSP_CHAN;
> +	hsp_mbox->mbox->dev = &pdev->dev;
> +	hsp_mbox->mbox->txdone_irq = false;
> +	hsp_mbox->mbox->txdone_poll = false;
> +	hsp_mbox->mbox->ops = &tegra_hsp_ops;
> +	platform_set_drvdata(pdev, hsp_mbox);
> +
> +	ret = mbox_controller_register(hsp_mbox->mbox);
> +	if (ret) {
> +		pr_err("tegra-hsp mbox: fail to register mailbox %d.\n", ret);
> +		return ret;
> +	}
> +
> +	spin_lock_init(&hsp_mbox->lock);
> +
> +	return 0;
> +}
> +
> +static int tegra_hsp_remove(struct platform_device *pdev)
> +{
> +	struct tegra_hsp_mbox *hsp_mbox = platform_get_drvdata(pdev);
> +
> +	if (hsp_mbox->mbox)
> +		mbox_controller_unregister(hsp_mbox->mbox);
> +
> +	return 0;
> +}
> +
> +static struct platform_driver tegra_hsp_driver = {
> +	.driver = {
> +		.name = "tegra-hsp",
> +		.of_match_table = tegra_hsp_match,
> +	},
> +	.probe = tegra_hsp_probe,
> +	.remove = tegra_hsp_remove,
> +};
> +
> +static int __init tegra_hsp_init(void)
> +{
> +	return platform_driver_register(&tegra_hsp_driver);
> +}
> +core_initcall(tegra_hsp_init);
> -- 
> 2.9.0
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-07  6:37         ` Joseph Lo
@ 2016-07-07 21:33           ` Sivaram Nair
  2016-07-18  8:58             ` Joseph Lo
  0 siblings, 1 reply; 51+ messages in thread
From: Sivaram Nair @ 2016-07-07 21:33 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Alexandre Courbot, Stephen Warren, Thierry Reding, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Thu, Jul 07, 2016 at 02:37:27PM +0800, Joseph Lo wrote:
> On 07/06/2016 08:23 PM, Alexandre Courbot wrote:
> >On Wed, Jul 6, 2016 at 6:06 PM, Joseph Lo <josephl@nvidia.com> wrote:
> >>On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
> >>>
> >>>On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> >>>>
> >>>>The Tegra HSP mailbox driver implements the signaling doorbell-based
> >>>>interprocessor communication (IPC) for remote processors currently. The
> >>>>HSP HW modules support some different features for that, which are
> >>>>shared mailboxes, shared semaphores, arbitrated semaphores, and
> >>>>doorbells. And there are multiple HSP HW instances on the chip. So the
> >>>>driver is extendable to support more features for different IPC
> >>>>requirement.
> >>>>
> >>>>The driver of remote processor can use it as a mailbox client and deal
> >>>>with the IPC protocol to synchronize the data communications.
> >>>>
> >>>>Signed-off-by: Joseph Lo <josephl@nvidia.com>
> >>>>---
> >>>>Changes in V2:
> >>>>- Update the driver to support the binding changes in V2
> >>>>- it's extendable to support multiple HSP sub-modules on the same HSP HW
> >>>>block
> >>>>    now.
> >>>>---
> >>>>   drivers/mailbox/Kconfig     |   9 +
> >>>>   drivers/mailbox/Makefile    |   2 +
> >>>>   drivers/mailbox/tegra-hsp.c | 418
> >>>>++++++++++++++++++++++++++++++++++++++++++++
> >>>>   3 files changed, 429 insertions(+)
> >>>>   create mode 100644 drivers/mailbox/tegra-hsp.c
> >>>>
> >>>>diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> >>>>index 5305923752d2..fe584cb54720 100644
> >>>>--- a/drivers/mailbox/Kconfig
> >>>>+++ b/drivers/mailbox/Kconfig
> >>>>@@ -114,6 +114,15 @@ config MAILBOX_TEST
> >>>>            Test client to help with testing new Controller driver
> >>>>            implementations.
> >>>>
> >>>>+config TEGRA_HSP_MBOX
> >>>>+       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
> >>>
> >>>
> >>>Space missing before the opening parenthesis (same in the patch title
> >>>btw).
> >>
> >>Okay.
> >>>
> >>>
> >>>>+       depends on ARCH_TEGRA_186_SOC
> >>>>+       help
> >>>>+         The Tegra HSP driver is used for the interprocessor
> >>>>communication
> >>>>+         between different remote processors and host processors on
> >>>>Tegra186
> >>>>+         and later SoCs. Say Y here if you want to have this support.
> >>>>+         If unsure say N.
> >>>
> >>>
> >>>Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
> >>>should probably drop the last 2 sentences.
> >>
> >>Okay.
> >>
> >>>
> >>>>+
> >>>>   config XGENE_SLIMPRO_MBOX
> >>>>          tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
> >>>>          depends on ARCH_XGENE
> >>>>diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> >>>>index 0be3e742bb7d..26d8f91c7fea 100644
> >>>>--- a/drivers/mailbox/Makefile
> >>>>+++ b/drivers/mailbox/Makefile
> >>>>@@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
> >>>>   obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
> >>>>
> >>>>   obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
> >>>>+
> >>>>+obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
> >>>>diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
> >>>>new file mode 100644
> >>>>index 000000000000..93c3ef58f29f
> >>>>--- /dev/null
> >>>>+++ b/drivers/mailbox/tegra-hsp.c
> >>>>@@ -0,0 +1,418 @@
> >>>>+/*
> >>>>+ * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> >>>>+ *
> >>>>+ * This program is free software; you can redistribute it and/or modify
> >>>>it
> >>>>+ * under the terms and conditions of the GNU General Public License,
> >>>>+ * version 2, as published by the Free Software Foundation.
> >>>>+ *
> >>>>+ * This program is distributed in the hope it will be useful, but
> >>>>WITHOUT
> >>>>+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >>>>+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> >>>>for
> >>>>+ * more details.
> >>>>+ */
> >>>>+
> >>>>+#include <linux/interrupt.h>
> >>>>+#include <linux/io.h>
> >>>>+#include <linux/mailbox_controller.h>
> >>>>+#include <linux/of.h>
> >>>>+#include <linux/of_device.h>
> >>>>+#include <linux/platform_device.h>
> >>>>+#include <dt-bindings/mailbox/tegra186-hsp.h>
> >>>>+
> >>>>+#define HSP_INT_DIMENSIONING   0x380
> >>>>+#define HSP_nSM_OFFSET         0
> >>>>+#define HSP_nSS_OFFSET         4
> >>>>+#define HSP_nAS_OFFSET         8
> >>>>+#define HSP_nDB_OFFSET         12
> >>>>+#define HSP_nSI_OFFSET         16
> >>>
> >>>
> >>>Would be nice to have comments to understand what SM, SS, AS, etc.
> >>>stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
> >>>but you need to look at the patch description to understand that). A
> >>>top-of-file comment explaining the necessary concepts to read this code
> >>>would do the trick.
> >>
> >>Yes, will fix that.
> >>>
> >>>
> >>>>+#define HSP_nINT_MASK          0xf
> >>>>+
> >>>>+#define HSP_DB_REG_TRIGGER     0x0
> >>>>+#define HSP_DB_REG_ENABLE      0x4
> >>>>+#define HSP_DB_REG_RAW         0x8
> >>>>+#define HSP_DB_REG_PENDING     0xc
> >>>>+
> >>>>+#define HSP_DB_CCPLEX          1
> >>>>+#define HSP_DB_BPMP            3
> >>>
> >>>
> >>>Maybe turn this into enum and use that type for
> >>>tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
> >>>related to these values?
> >>
> >>Okay.
> >>
> >>>
> >>>>+
> >>>>+#define MAX_NUM_HSP_CHAN 32
> >>>>+#define MAX_NUM_HSP_DB 7
> >>>>+
> >>>>+#define hsp_db_offset(i, d) \
> >>>>+       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) +
> >>>>\
> >>>>+       (i) * 0x100)
> >>>>+
> >>>>+struct tegra_hsp_db_chan {
> >>>>+       int master_id;
> >>>>+       int db_id;
> >>>>+};
> >>>>+
> >>>>+struct tegra_hsp_mbox_chan {
> >>>>+       int type;
> >>>>+       union {
> >>>>+               struct tegra_hsp_db_chan db_chan;
> >>>>+       };
> >>>>+};
> >>>>+
> >>>>+struct tegra_hsp_mbox {
> >>>>+       struct mbox_controller *mbox;
> >>>>+       void __iomem *base;
> >>>>+       void __iomem *db_base[MAX_NUM_HSP_DB];
> >>>>+       int db_irq;
> >>>>+       int nr_sm;
> >>>>+       int nr_as;
> >>>>+       int nr_ss;
> >>>>+       int nr_db;
> >>>>+       int nr_si;
> >>>>+       spinlock_t lock;
> >>>>+};
> >>>>+
> >>>>+static inline u32 hsp_readl(void __iomem *base, int reg)
> >>>>+{
> >>>>+       return readl(base + reg);
> >>>>+}
> >>>>+
> >>>>+static inline void hsp_writel(void __iomem *base, int reg, u32 val)
> >>>>+{
> >>>>+       writel(val, base + reg);
> >>>>+       readl(base + reg);
> >>>>+}
> >>>>+
> >>>>+static int hsp_db_can_ring(void __iomem *db_base)
> >>>>+{
> >>>>+       u32 reg;
> >>>>+
> >>>>+       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
> >>>>+
> >>>>+       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
> >>>>+}
> >>>>+
> >>>>+static irqreturn_t hsp_db_irq(int irq, void *p)
> >>>>+{
> >>>>+       struct tegra_hsp_mbox *hsp_mbox = p;
> >>>>+       ulong val;
> >>>>+       int master_id;
> >>>>+
> >>>>+       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
> >>>>+                              HSP_DB_REG_PENDING);
> >>>>+       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING,
> >>>>val);
> >>>>+
> >>>>+       spin_lock(&hsp_mbox->lock);
> >>>>+       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
> >>>>+               struct mbox_chan *chan;
> >>>>+               struct tegra_hsp_mbox_chan *mchan;
> >>>>+               int i;
> >>>>+
> >>>>+               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
> >>>
> >>>
> >>>I wonder if this could not be optimized. You are doing a double loop
> >>>on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
> >>>like the same master_id cannot be used twice (considering that the
> >>>inner loop only processes the first match), couldn't you just select
> >>>the free channel in of_hsp_mbox_xlate() by doing
> >>>&mbox->chans[master_id] (and returning an error if it is already
> >>>used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
> >>>instead of having the inner loop below? That would remove the need for
> >>>the second loop.
> >>
> >>
> >>That was exactly what I did in V1, which only supported one HSP
> >>sub-module per HSP HW block, so we could just use the master_id as the
> >>mbox channel ID.
> >>
> >>V2, however, is meant to support multiple HSP sub-modules running on
> >>the same HSP HW block. The "ID" could conflict between different
> >>modules, so I dropped the mechanism that used the master_id as the mbox
> >>channel ID.
> >>
> >>Instead, the channel is allocated when the client is bound to one of
> >>the HSP sub-modules. We store the "ID" information in the private mbox
> >>channel data, which helps us figure out which mbox channel should
> >>respond to the interrupt.
> >>
> >>In the doorbell case, all the DB clients share the same DB IRQ on the
> >>CPU side. So in the ISR we need to figure out the IRQ source, which is
> >>the master_id that the IRQ came from. This is the outer loop. In the
> >>inner loop, we figure out which channel should respond by checking the
> >>type and ID.
> >>
> >>And I think it should be pretty quick, because we only check the set
> >>bits from the pending register and then find the matching channel.
> >
> >Yeah, I am not worried about the CPU time (although in interrupt
> >context, we always should), but rather about whether the code could be
> >simplified.
> >
> >Ah, I think I get it. You want to be able to receive interrupts from
> >the same master, but not necessarily for the doorbell function.
> >Because of this you cannot use master_id as the index for the channel.
> >Am I understanding correctly?
> 
> Yes, the DB clients trigger the IRQ through the same master
> (HSP_DB_CCPLEX), each with its own master_id. We (the CPU) can check
> the ID to know who is requesting the HSP mbox service. Each ID is
> unique within the DB module.
> 
> But the IDs could conflict when the HSP mbox driver is working with
> multiple HSP sub-functions under the same HSP HW block. So we can't
> just map the ID to the HSP mbox channel ID.

Joseph, can you think of any other sub-function that uses the same
master IDs (and does not have its own IRQ)? I wonder if we are
over-engineering this. I think hsp_db_startup() and hsp_db_shutdown()
do not support sharing masters: _startup() by one client followed by
_shutdown() from another will mask the interrupt. If there are in fact
other potential sub-functions, I would imagine they will translate to
values of tegra_hsp_mbox_chan.type other than HSP_MBOX_TYPE_DB? If so,
you should be able to remove the need for this inner loop by having
per-sub-function mboxes, or by combining 'type' and 'master_id' into a
single index value?
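
To make the second option concrete, here is a minimal user-space sketch of
folding 'type' and 'master_id' into one table index so the ISR can do a
direct lookup. Only MAX_NUM_HSP_CHAN (32) is taken from the patch; the type
count, the enum names and the helper names are hypothetical, and the real
driver would index mbox->chans rather than a static array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUM_HSP_CHAN 32	/* from the patch */
#define HSP_NR_TYPES 2		/* hypothetical: doorbell plus one more type */

/* Hypothetical type values standing in for tegra_hsp_mbox_chan.type. */
enum hsp_type { HSP_TYPE_DB = 0, HSP_TYPE_OTHER = 1 };

struct hsp_chan {
	int in_use;
};

static struct hsp_chan chans[HSP_NR_TYPES * MAX_NUM_HSP_CHAN];

/* Fold (type, master_id) into one table index; -1 if out of range. */
static int hsp_chan_index(int type, int master_id)
{
	if (type < 0 || type >= HSP_NR_TYPES)
		return -1;
	if (master_id < 0 || master_id >= MAX_NUM_HSP_CHAN)
		return -1;
	return type * MAX_NUM_HSP_CHAN + master_id;
}

/* What the xlate step would do: claim the slot, refuse a second claim. */
static struct hsp_chan *hsp_chan_claim(int type, int master_id)
{
	int idx = hsp_chan_index(type, master_id);

	if (idx < 0 || chans[idx].in_use)
		return NULL;
	chans[idx].in_use = 1;
	return &chans[idx];
}

/* What the ISR would do: direct lookup, no inner loop. */
static struct hsp_chan *hsp_chan_lookup(int type, int master_id)
{
	int idx = hsp_chan_index(type, master_id);

	if (idx < 0 || !chans[idx].in_use)
		return NULL;
	return &chans[idx];
}
```

The cost is a larger channel table (one slot per type and master), which
may or may not be acceptable depending on how many sub-functions end up
being supported.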

> 
> >
> >>
> >>>
> >>>If having two channels use the same master_id is a valid scenario,
> >>>then all matches on master_id should probably be processed, not just
> >>>the first one.
> >>
> >>Each DB channel should have different master_id.
> >>
> >>
> >>>
> >>>>+                       chan = &hsp_mbox->mbox->chans[i];
> >>>>+
> >>>>+                       if (!chan->con_priv)
> >>>>+                               continue;
> >>>>+
> >>>>+                       mchan = chan->con_priv;
> >>>>+                       if (mchan->type == HSP_MBOX_TYPE_DB &&
> >>>>+                           mchan->db_chan.master_id == master_id)
> >>>>+                               break;
> >>>>+                       chan = NULL;
> >>>>+               }
> >>>>+
> >>>>+               if (chan)
> >>>>+                       mbox_chan_received_data(chan, NULL);
> >>>>+       }
> >>>>+       spin_unlock(&hsp_mbox->lock);
> >>>>+
> >>>>+       return IRQ_HANDLED;
> >>>>+}
> >>>>+
> >>>>+static int hsp_db_send_data(struct mbox_chan *chan, void *data)
> >>>>+{
> >>>>+       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> >>>>+       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> >>>>+       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> >>>>+
> >>>>+       hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
> >>>>+
> >>>>+       return 0;
> >>>>+}
> >>>>+
> >>>>+static int hsp_db_startup(struct mbox_chan *chan)
> >>>>+{
> >>>>+       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> >>>>+       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> >>>>+       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> >>>>+       u32 val;
> >>>>+       unsigned long flag;
> >>>>+
> >>>>+       if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
> >>>>+               dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
> >>>>+                       db_chan->master_id);
> >>>>+               return -EINVAL;
> >>>>+       }
> >>>>+
> >>>>+       spin_lock_irqsave(&hsp_mbox->lock, flag);
> >>>>+       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> >>>>+       val |= BIT(db_chan->master_id);
> >>>>+       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> >>>>+       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> >>>>+
> >>>>+       if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
> >>>>+               return -ENODEV;
> >>>>+
> >>>>+       return 0;
> >>>>+}
> >>>>+
> >>>>+static void hsp_db_shutdown(struct mbox_chan *chan)
> >>>>+{
> >>>>+       struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
> >>>>+       struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
> >>>>+       struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
> >>>>+       u32 val;
> >>>>+       unsigned long flag;
> >>>>+
> >>>>+       spin_lock_irqsave(&hsp_mbox->lock, flag);
> >>>>+       val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
> >>>>+       val &= ~BIT(db_chan->master_id);
> >>>>+       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
> >>>>+       spin_unlock_irqrestore(&hsp_mbox->lock, flag);
> >>>>+}
> >>>>+
> >>>>+static bool hsp_db_last_tx_done(struct mbox_chan *chan)
> >>>>+{
> >>>>+       return true;
> >>>>+}
> >>>>+
> >>>>+static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
> >>>>+                            struct mbox_chan *mchan, int master_id)
> >>>>+{
> >>>>+       struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
> >>>>+       struct tegra_hsp_mbox_chan *hsp_mbox_chan;
> >>>>+       int ret;
> >>>>+
> >>>>+       if (!hsp_mbox->db_irq) {
> >>>>+               int i;
> >>>>+
> >>>>+               hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
> >>>
> >>>
> >>>Getting the IRQ sounds more like a job for probe() - I don't see the
> >>>benefit of lazy-doing it?
> >>
> >>
> >>We only need the IRQ when the client is requesting the DB service. For other
> >>HSP sub-modules, they are using different IRQ. So I didn't do that at probe
> >>time.
> >
> >Ok, but probe() is where resources should be acquired... and at the
> >very least DT properties be looked up. In this case there is no hard
> >requirement for doing it elsewhere.
> >
> >Is this interrupt absolutely required? Or can we tolerate to not use
> >the doorbell service? In the first case, the driver should fail during
> >probe(), not sometime later. In the second case, you should still get
> >all the interrupts in probe(), then disable them if they are not
> >needed, and check in this function whether db_irq is a valid interrupt
> >number to decide whether or not we can use doorbell.
> 
> Ah, I understand your concern now. It should be ok to move to
> probe(). Will fix that.
> 
> >
> >>
> >>>
> >>>>+               ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
> >>>>+                                      hsp_db_irq, IRQF_NO_SUSPEND,
> >>>>+                                      dev_name(&pdev->dev), hsp_mbox);
> >>>>+               if (ret)
> >>>>+                       return ret;
> >>>>+
> >>>>+               for (i = 0; i < MAX_NUM_HSP_DB; i++)
> >>>>+                       hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);
> >>>
> >>>
> >>>Same here, cannot this be moved into probe()?
> >>
> >>Same as above, only needed when the client requests it.
> >
> >But you don't waste any resources by doing it preemptively in probe().
> >So let's keep related code in the same place.
> 
> Okay.
> 
> Thanks,
> -Joseph


* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-05  9:04 ` [PATCH V2 05/10] firmware: tegra: add BPMP support Joseph Lo
  2016-07-06 11:39   ` Alexandre Courbot
@ 2016-07-08 17:55   ` Sivaram Nair
  1 sibling, 0 replies; 51+ messages in thread
From: Sivaram Nair @ 2016-07-08 17:55 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Tue, Jul 05, 2016 at 05:04:26PM +0800, Joseph Lo wrote:
> The Tegra BPMP (Boot and Power Management Processor) is designed to
> handle the boot process and to offload power management tasks and some
> system control services from the CPU. These include clock, DVFS,
> thermal/EDP and power gating operations, as well as system
> suspend/resume handling. The CPU and the drivers of these modules can
> use the services that the BPMP firmware driver provides to signal events
> for specific PM actions to the BPMP and to receive status updates from
> the BPMP.
> 
> Compared to the ARM SCPI, the service provided by the BPMP is
> message-based rather than method-based. The BPMP firmware driver provides
> the send/receive service for users that care about response time. If the
> user needs to get events or updates from the firmware, it can request
> the MRQ service as well. The user needs to take care of the message
> format, which we call the BPMP ABI.
> 
> The BPMP ABI defines the message format for different modules or usages.
> For example, clock operations need an MRQ service code called MRQ_CLK
> with a specific message format, which includes different sub-commands
> for various clock operations. This is the message format that the BPMP
> can recognize.
> 
> So the user needs two things to initiate IPC with the BPMP: get the
> service from the bpmp_ops structure and keep to the message format that
> the BPMP ABI defines.
> 
> Based-on-the-work-by:
> Sivaram Nair <sivaramn@nvidia.com>
> 
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - None
> ---
>  drivers/firmware/tegra/Kconfig  |   12 +
>  drivers/firmware/tegra/Makefile |    1 +
>  drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
>  include/soc/tegra/bpmp.h        |   29 +
>  include/soc/tegra/bpmp_abi.h    | 1601 +++++++++++++++++++++++++++++++++++++++
>  5 files changed, 2356 insertions(+)
>  create mode 100644 drivers/firmware/tegra/bpmp.c
>  create mode 100644 include/soc/tegra/bpmp.h
>  create mode 100644 include/soc/tegra/bpmp_abi.h
> 
> diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
> index 1fa3e4e136a5..ff2730d5c468 100644
> --- a/drivers/firmware/tegra/Kconfig
> +++ b/drivers/firmware/tegra/Kconfig
> @@ -10,4 +10,16 @@ config TEGRA_IVC
>  	  keeps the content is synchronization between host CPU and remote
>  	  processors.
>  
> +config TEGRA_BPMP
> +	bool "Tegra BPMP driver"
> +	depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
> +	help
> +	  BPMP (Boot and Power Management Processor) is designed to off-loading
> +	  the PM functions which include clock/DVFS/thermal/power from the CPU.
> +	  It needs HSP as the HW synchronization and notification module and
> +	  IVC module as the message communication protocol.
> +
> +	  This driver manages the IPC interface between host CPU and the
> +	  firmware running on BPMP.
> +
>  endmenu
> diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
> index 92e2153e8173..e34a2f79e1ad 100644
> --- a/drivers/firmware/tegra/Makefile
> +++ b/drivers/firmware/tegra/Makefile
> @@ -1 +1,2 @@
> +obj-$(CONFIG_TEGRA_BPMP)	+= bpmp.o
>  obj-$(CONFIG_TEGRA_IVC)		+= ivc.o
> diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
> new file mode 100644
> index 000000000000..24fda626610e
> --- /dev/null
> +++ b/drivers/firmware/tegra/bpmp.c
> @@ -0,0 +1,713 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/mailbox_client.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/semaphore.h>
> +
> +#include <soc/tegra/bpmp.h>
> +#include <soc/tegra/bpmp_abi.h>
> +#include <soc/tegra/ivc.h>
> +
> +#define BPMP_MSG_SZ		128
> +#define BPMP_MSG_DATA_SZ	120

These should come from bpmp_abi.h (MSG_MIN_SZ & MSG_MIN_DATA_SZ).

> +
> +#define __MRQ_ATTRS		0xff000000
> +#define __MRQ_INDEX(id)		((id) & ~__MRQ_ATTRS)
> +
> +#define DO_ACK			BIT(0)
> +#define RING_DOORBELL		BIT(1)
> +
> +struct tegra_bpmp_soc_data {
> +	u32 ch_index;		/* channel index */
> +	u32 thread_ch_index;	/* thread channel index */
> +	u32 cpu_rx_ch_index;	/* CPU Rx channel index */
> +	u32 nr_ch;		/* number of total channels */
> +	u32 nr_thread_ch;	/* number of thread channels */
> +	u32 ch_timeout;		/* channel timeout */
> +	u32 thread_ch_timeout;	/* thread channel timeout */
> +};
> +
> +struct channel_info {
> +	u32 tch_free;
> +	u32 tch_to_complete;
> +	struct semaphore tch_sem;
> +};
> +
> +struct mb_data {
> +	s32 code;
> +	s32 flags;

These should be u32?
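
For illustration, a user-space sketch of the frame layout with unsigned
header fields, plus compile-time checks that the header and payload add up
to the 128-byte message size. The two size macros are copied from the
patch; the fixed-width stdint types stand in for the kernel's u32/u8, and
the checks themselves are an addition, not something the patch contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPMP_MSG_SZ		128	/* from the patch */
#define BPMP_MSG_DATA_SZ	120	/* from the patch */

/* Same shape as the patch's struct mb_data, with unsigned header fields. */
struct mb_data {
	uint32_t code;
	uint32_t flags;
	uint8_t data[BPMP_MSG_DATA_SZ];
} __attribute__((packed));

/* Two 4-byte header words plus the payload must fill one frame exactly. */
_Static_assert(sizeof(struct mb_data) == BPMP_MSG_SZ,
	       "mb_data must be exactly one message frame");
_Static_assert(offsetof(struct mb_data, data) == 8,
	       "payload must start right after code and flags");
```

One thing to keep in mind if the fields become unsigned: the MRQ return
path stores negative error codes in 'code', so readers of that field would
need to convert back to a signed value.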

> +	u8 data[BPMP_MSG_DATA_SZ];
> +} __packed;
> +
> +struct channel_data {
> +	struct mb_data *ib;
> +	struct mb_data *ob;
> +};
> +
> +struct mrq {
> +	struct list_head list;
> +	u32 mrq_code;
> +	bpmp_mrq_handler handler;
> +	void *data;
> +};
> +
> +struct tegra_bpmp {
> +	struct device *dev;
> +	const struct tegra_bpmp_soc_data *soc_data;
> +	void __iomem *tx_base;
> +	void __iomem *rx_base;
> +	struct mbox_client cl;
> +	struct mbox_chan *chan;
> +	struct ivc *ivc_channels;
> +	struct channel_data *ch_area;
> +	struct channel_info ch_info;
> +	struct completion *ch_completion;
> +	struct list_head mrq_list;
> +	struct tegra_bpmp_ops *ops;
> +	spinlock_t lock;
> +	bool init_done;
> +};
> +
> +static struct tegra_bpmp *bpmp;
> +
> +static int bpmp_get_thread_ch(int idx)
> +{
> +	return bpmp->soc_data->thread_ch_index + idx;
> +}
> +
> +static int bpmp_get_thread_ch_index(int ch)
> +{
> +	if (ch < bpmp->soc_data->thread_ch_index ||
> +	    ch >= bpmp->soc_data->cpu_rx_ch_index)
> +		return -1;
> +	return ch - bpmp->soc_data->thread_ch_index;
> +}
> +
> +static int bpmp_get_ob_channel(void)
> +{
> +	return smp_processor_id() + bpmp->soc_data->ch_index;
> +}
> +
> +static struct completion *bpmp_get_completion_obj(int ch)
> +{
> +	int i = bpmp_get_thread_ch_index(ch);
> +
> +	return i < 0 ? NULL : bpmp->ch_completion + i;
> +}
> +
> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int ib_sz)
> +{
> +	return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
> +	       ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
> +	       (!ob_sz || ob_data) && (!ib_sz || ib_data);
> +}
> +
> +static bool bpmp_master_acked(int ch)
> +{
> +	struct ivc *ivc_chan;
> +	void *frame;
> +	bool ready;
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	frame = tegra_ivc_read_get_next_frame(ivc_chan);
> +	ready = !IS_ERR_OR_NULL(frame);
> +	bpmp->ch_area[ch].ib = ready ? frame : NULL;
> +
> +	return ready;
> +}
> +
> +static int bpmp_wait_ack(int ch)
> +{
> +	ktime_t t;

You could consider adding a bpmp_master_acked() check here. It will be
an extra call, but you might avoid executing the next line many times.
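
Roughly what I have in mind, as a user-space sketch: test the condition
once before sampling the clock, so the common already-acked case never
reads the time at all. The names wait_ready(), now_us() and flag_is_set()
are made up for this example, and clock_gettime() stands in for
local_clock(); the counter exists only to make the effect visible.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

static int clock_reads;	/* how often the clock was sampled */

static int64_t now_us(void)
{
	struct timespec ts;

	clock_reads++;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Example predicate: ready once the flag behind arg is set. */
static bool flag_is_set(void *arg)
{
	return *(bool *)arg;
}

static int wait_ready(bool (*ready)(void *), void *arg, int64_t timeout_us)
{
	int64_t start;

	/* Fast path: already ready, so the clock is never read. */
	if (ready(arg))
		return 0;

	start = now_us();
	do {
		if (ready(arg))
			return 0;
	} while (now_us() - start < timeout_us);

	return -ETIMEDOUT;
}
```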

> +
> +	t = ns_to_ktime(local_clock());
> +
> +	do {
> +		if (bpmp_master_acked(ch))
> +			return 0;
> +	} while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
> +		 bpmp->soc_data->ch_timeout);
> +
> +	return -ETIMEDOUT;
> +}
> +
> +static bool bpmp_master_free(int ch)
> +{
> +	struct ivc *ivc_chan;
> +	void *frame;
> +	bool ready;
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	frame = tegra_ivc_write_get_next_frame(ivc_chan);
> +	ready = !IS_ERR_OR_NULL(frame);
> +	bpmp->ch_area[ch].ob = ready ? frame : NULL;
> +
> +	return ready;
> +}
> +
> +static int bpmp_wait_master_free(int ch)
> +{
> +	ktime_t t;

Similarly, an extra bpmp_master_free() check here is worth
considering.

> +
> +	t = ns_to_ktime(local_clock());
> +
> +	do {
> +		if (bpmp_master_free(ch))
> +			return 0;
> +	} while (ktime_us_delta(ns_to_ktime(local_clock()), t)
> +		 < bpmp->soc_data->ch_timeout);
> +
> +	return -ETIMEDOUT;
> +}
> +
> +static int __read_ch(int ch, void *data, int sz)
> +{
> +	struct ivc *ivc_chan;
> +	struct mb_data *p;
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	p = bpmp->ch_area[ch].ib;
> +	if (data)
> +		memcpy_fromio(data, p->data, sz);
> +
> +	return tegra_ivc_read_advance(ivc_chan);
> +}
> +
> +static int bpmp_read_ch(int ch, void *data, int sz)
> +{
> +	unsigned long flags;
> +	int i, ret;
> +
> +	i = bpmp_get_thread_ch_index(ch);
> +
> +	spin_lock_irqsave(&bpmp->lock, flags);
> +	ret = __read_ch(ch, data, sz);
> +	bpmp->ch_info.tch_free |= (1 << i);
> +	spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +	up(&bpmp->ch_info.tch_sem);
> +
> +	return ret;
> +}
> +
> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int sz)
> +{
> +	struct ivc *ivc_chan;
> +	struct mb_data *p;
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	p = bpmp->ch_area[ch].ob;
> +
> +	p->code = mrq_code;
> +	p->flags = flags;
> +	if (data)
> +		memcpy_toio(p->data, data, sz);
> +
> +	return tegra_ivc_write_advance(ivc_chan);
> +}
> +
> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int sz)
> +{
> +	unsigned long flags;
> +	int ret, i;
> +
> +	ret = down_timeout(&bpmp->ch_info.tch_sem,
> +			   usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
> +	if (ret)
> +		return ret;
> +
> +	spin_lock_irqsave(&bpmp->lock, flags);
> +
> +	i = __ffs(bpmp->ch_info.tch_free);
> +	*ch = bpmp_get_thread_ch(i);
> +	ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
> +	if (!ret) {
> +		bpmp->ch_info.tch_free &= ~(1 << i);
> +		__write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data, sz);
> +		bpmp->ch_info.tch_to_complete |= (1 << *ch);
> +	}
> +
> +	spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +	return ret;
> +}
> +
> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data, int sz)
> +{
> +	int ret;
> +
> +	ret = bpmp_wait_master_free(ch);
> +	if (ret)
> +		return ret;
> +
> +	return __write_ch(ch, mrq_code, flags, data, sz);
> +}
> +
> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int ob_sz,
> +				    void *ib_data, int ib_sz)
> +{
> +	int ch, ret;
> +
> +	if (WARN_ON(!irqs_disabled()))
> +		return -EPERM;
> +
> +	if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> +		return -EINVAL;
> +
> +	if (!bpmp->init_done)
> +		return -ENODEV;
> +
> +	ch = bpmp_get_ob_channel();
> +	ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
> +	if (ret)
> +		return ret;
> +
> +	ret = mbox_send_message(bpmp->chan, NULL);
> +	if (ret < 0)
> +		return ret;
> +	mbox_client_txdone(bpmp->chan, 0);

mbox_send_message() & mbox_client_txdone() are always used as a pair, so
why don't we move these two into a helper function?
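
Something along these lines, shown here as a user-space sketch. The two
mbox functions below are stand-ins that only mirror the kernel mailbox
client signatures so the example compiles on its own, and the helper name
bpmp_ring_doorbell() is just a suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct mbox_chan, with counters for testing. */
struct mbox_chan {
	int fail_send;	/* test knob: make the send fail */
	int sent;	/* number of successful sends */
	int txdone;	/* number of txdone acknowledgements */
};

/* Stand-in with the same shape as the kernel's mbox_send_message(). */
static int mbox_send_message(struct mbox_chan *chan, void *mssg)
{
	(void)mssg;
	if (chan->fail_send)
		return -1;
	chan->sent++;
	return 0;
}

/* Stand-in with the same shape as the kernel's mbox_client_txdone(). */
static void mbox_client_txdone(struct mbox_chan *chan, int r)
{
	(void)r;
	chan->txdone++;
}

/* Suggested helper: ring the doorbell and complete the TX in one step. */
static int bpmp_ring_doorbell(struct mbox_chan *chan)
{
	int ret = mbox_send_message(chan, NULL);

	if (ret < 0)
		return ret;
	mbox_client_txdone(chan, 0);
	return 0;
}
```

Callers then only have one return value to check, and the txdone call
cannot be forgotten at any call site.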

> +
> +	ret = bpmp_wait_ack(ch);
> +	if (ret)
> +		return ret;
> +
> +	return __read_ch(ch, ib_data, ib_sz);
> +}
> +
> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
> +			     void *ib_data, int ib_sz)
> +{
> +	struct completion *comp_obj;
> +	unsigned long timeout;
> +	int ch, ret;
> +
> +	if (WARN_ON(irqs_disabled()))
> +		return -EPERM;
> +
> +	if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> +		return -EINVAL;
> +
> +	if (!bpmp->init_done)
> +		return -ENODEV;
> +
> +	ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
> +	if (ret)
> +		return ret;
> +
> +	ret = mbox_send_message(bpmp->chan, NULL);
> +	if (ret < 0)
> +		return ret;
> +	mbox_client_txdone(bpmp->chan, 0);
> +
> +	comp_obj = bpmp_get_completion_obj(ch);
> +	timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
> +	if (!wait_for_completion_timeout(comp_obj, timeout))
> +		return -ETIMEDOUT;
> +
> +	return bpmp_read_ch(ch, ib_data, ib_sz);
> +}
> +
> +static struct mrq *bpmp_find_mrq(u32 mrq_code)
> +{
> +	struct mrq *mrq;
> +
> +	list_for_each_entry(mrq, &bpmp->mrq_list, list) {
> +		if (mrq_code == mrq->mrq_code)
> +			return mrq;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
> +{
> +	int flags = bpmp->ch_area[ch].ib->flags;
> +	struct ivc *ivc_chan;
> +	struct mb_data *frame;
> +	int ret;
> +
> +	if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
> +		return;
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	ret = tegra_ivc_read_advance(ivc_chan);
> +	WARN_ON(ret);
> +
> +	if (!(flags & DO_ACK))
> +		return;
> +
> +	frame = tegra_ivc_write_get_next_frame(ivc_chan);
> +	if (IS_ERR_OR_NULL(frame)) {
> +		WARN_ON(1);
> +		return;
> +	}
> +
> +	frame->code = code;
> +	if (data != NULL)
> +		memcpy_toio(frame->data, data, sz);
> +	ret = tegra_ivc_write_advance(ivc_chan);
> +	WARN_ON(ret);
> +
> +	if (flags & RING_DOORBELL) {
> +		ret = mbox_send_message(bpmp->chan, NULL);
> +		if (ret < 0) {
> +			WARN_ON(1);
> +			return;
> +		}
> +		mbox_client_txdone(bpmp->chan, 0);
> +	}
> +}
> +
> +static void bpmp_mail_return(int ch, int ret_code, int val)
> +{
> +	bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
> +}
> +
> +static void bpmp_handle_mrq(int mrq_code, int ch)
> +{
> +	struct mrq *mrq;
> +
> +	spin_lock(&bpmp->lock);
> +
> +	mrq = bpmp_find_mrq(mrq_code);
> +	if (!mrq) {
> +		spin_unlock(&bpmp->lock);
> +		bpmp_mail_return(ch, -EINVAL, 0);
> +		return;
> +	}
> +
> +	mrq->handler(mrq_code, mrq->data, ch);
> +
> +	spin_unlock(&bpmp->lock);
> +}
> +
> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void *data)
> +{
> +	struct mrq *mrq;
> +	unsigned long flags;
> +
> +	if (!handler)
> +		return -EINVAL;
> +
> +	mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
> +	if (!mrq)
> +		return -ENOMEM;
> +
> +	spin_lock_irqsave(&bpmp->lock, flags);
> +
> +	mrq->mrq_code = __MRQ_INDEX(mrq_code);
> +	mrq->handler = handler;
> +	mrq->data = data;

The last three lines can be outside the lock area.
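
That is, the new node is private until it is linked into the list, so only
the list insertion needs the lock. A user-space sketch of that shape, with
a C11 atomic_flag spinlock and a plain singly linked list standing in for
the kernel spinlock and list_head (all names here are illustrative):

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*mrq_handler)(int mrq_code, void *data, int ch);

struct mrq {
	struct mrq *next;
	unsigned int mrq_code;
	mrq_handler handler;
	void *data;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;
static struct mrq *mrq_list;

static void list_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&list_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void list_lock_release(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Example handler that does nothing. */
static void noop_handler(int mrq_code, void *data, int ch)
{
	(void)mrq_code;
	(void)data;
	(void)ch;
}

static int request_mrq(unsigned int mrq_code, mrq_handler handler, void *data)
{
	struct mrq *mrq;

	if (!handler)
		return -EINVAL;

	mrq = calloc(1, sizeof(*mrq));
	if (!mrq)
		return -ENOMEM;

	/* Nobody else can see the node yet, so no lock is needed here. */
	mrq->mrq_code = mrq_code;
	mrq->handler = handler;
	mrq->data = data;

	/* Only publishing the node touches shared state. */
	list_lock_acquire();
	mrq->next = mrq_list;
	mrq_list = mrq;
	list_lock_release();

	return 0;
}
```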

> +	list_add(&mrq->list, &bpmp->mrq_list);
> +
> +	spin_unlock_irqrestore(&bpmp->lock, flags);
> +
> +	return 0;
> +}
> +
> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
> +{
> +	int challenge;
> +	int reply;
> +
> +	challenge = *(int *)bpmp->ch_area[ch].ib->data;
> +	reply = challenge << (smp_processor_id() + 1);
> +	bpmp_mail_return(ch, 0, reply);

We should use struct mrq_ping_response here...

> +}
> +
> +static int bpmp_mailman_init(void)
> +{
> +	return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
> +}
> +
> +static int bpmp_ping(void)
> +{
> +	unsigned long flags;
> +	ktime_t t;
> +	int challenge = 1;
> +	int reply = 0;
> +	int ret;
> +
> +	t = ktime_get();
> +	local_irq_save(flags);
> +	ret = bpmp_send_receive_atomic(MRQ_PING, &challenge, sizeof(challenge),
> +				       &reply, sizeof(reply));

...and struct mrq_ping_request here.
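
As a sketch of what the typed version could look like: the struct names
are the ones mentioned above, but the field names and layout below are
assumptions to be checked against bpmp_abi.h, and ping_reply() is a
made-up helper. The shift mirrors what bpmp_mrq_handle_ping() in this
patch computes, with the CPU number passed in explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout; verify against the definitions in bpmp_abi.h. */
struct mrq_ping_request {
	uint32_t challenge;
} __attribute__((packed));

struct mrq_ping_response {
	uint32_t reply;
} __attribute__((packed));

/* Same arithmetic as the patch: reply = challenge << (cpu + 1). */
static struct mrq_ping_response
ping_reply(const struct mrq_ping_request *req, unsigned int cpu)
{
	struct mrq_ping_response resp;

	resp.reply = req->challenge << (cpu + 1);
	return resp;
}
```

With typed request and response structs, the sizeof() arguments at the
call sites follow the ABI structs instead of a bare int.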

> +	local_irq_restore(flags);
> +	t = ktime_sub(ktime_get(), t);
> +
> +	if (!ret)
> +		dev_info(bpmp->dev,
> +			 "ping ok: challenge: %d, reply: %d, time: %lld\n",
> +			 challenge, reply, ktime_to_us(t));
> +
> +	return ret;
> +}
> +
> +static int bpmp_get_fwtag(void)
> +{
> +	unsigned long flags;
> +	void *vaddr;
> +	dma_addr_t paddr;
> +	u32 addr;
> +	int ret;
> +
> +	vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
> +				   GFP_KERNEL);
> +	if (!vaddr)
> +		return -ENOMEM;
> +	addr = paddr;
> +
> +	local_irq_save(flags);
> +	ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr, sizeof(addr),
> +				       NULL, 0);
> +	local_irq_restore(flags);
> +
> +	if (!ret)
> +		dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
> +
> +	dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
> +
> +	return ret;
> +}
> +
> +static void bpmp_signal_thread(int ch)
> +{
> +	int flags = bpmp->ch_area[ch].ob->flags;
> +	struct completion *comp_obj;
> +
> +	if (!(flags & RING_DOORBELL))
> +		return;
> +
> +	comp_obj = bpmp_get_completion_obj(ch);
> +	if (!comp_obj) {
> +		WARN_ON(1);
> +		return;
> +	}
> +
> +	complete(comp_obj);
> +}
> +
> +static void bpmp_handle_rx(struct mbox_client *cl, void *data)
> +{
> +	int i, rx_ch;
> +
> +	rx_ch = bpmp->soc_data->cpu_rx_ch_index;
> +
> +	if (bpmp_master_acked(rx_ch))
> +		bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
> +
> +	spin_lock(&bpmp->lock);
> +
> +	for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
> +			bpmp->ch_info.tch_to_complete; i++) {
> +		int ch = bpmp_get_thread_ch(i);
> +
> +		if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
> +		    bpmp_master_acked(ch)) {
> +			bpmp->ch_info.tch_to_complete &= ~(1 << ch);
> +			bpmp_signal_thread(ch);
> +		}
> +	}
> +
> +	spin_unlock(&bpmp->lock);
> +}
> +
> +static void bpmp_ivc_notify(struct ivc *ivc)
> +{
> +	int ret;
> +
> +	ret = mbox_send_message(bpmp->chan, NULL);
> +	if (ret < 0)
> +		return;
> +
> +	mbox_send_message(bpmp->chan, NULL);
> +}

This will be called by the ivc framework at different times, including
after reading a frame, causing multiple redundant interrupts to be
asserted to the BPMP, since you are already doing what is required from
bpmp_send_receive_atomic(), bpmp_send_receive() & bpmp_mrq_return_data().
I think you should stub this out, and call mbox_send_message() and
mbox_client_txdone() after each tegra_ivc_channel_notified() in
bpmp_msg_chan_init().

> +
> +static int bpmp_msg_chan_init(int ch)
> +{
> +	struct ivc *ivc_chan;
> +	u32 hdr_sz, msg_sz, que_sz;
> +	uintptr_t rx_base, tx_base;
> +	int ret;
> +
> +	msg_sz = tegra_ivc_align(BPMP_MSG_SZ);
> +	hdr_sz = tegra_ivc_total_queue_size(0);
> +	que_sz = tegra_ivc_total_queue_size(msg_sz);
> +
> +	rx_base =  (uintptr_t)(bpmp->rx_base + que_sz * ch);
> +	tx_base =  (uintptr_t)(bpmp->tx_base + que_sz * ch);
> +
> +	ivc_chan = bpmp->ivc_channels + ch;
> +	ret = tegra_ivc_init(ivc_chan, rx_base, DMA_ERROR_CODE, tx_base,
> +			     DMA_ERROR_CODE, 1, msg_sz, bpmp->dev,
> +			     bpmp_ivc_notify);
> +	if (ret) {
> +		dev_err(bpmp->dev, "%s fail: ch %d returned %d\n",
> +			__func__, ch, ret);
> +		return ret;
> +	}
> +
> +	/* reset the channel state */
> +	tegra_ivc_channel_reset(ivc_chan);

A minor optimization would be to ring the doorbell here...

> +
> +	/* sync the channel state with BPMP */
> +	while (tegra_ivc_channel_notified(ivc_chan))
> +		;

... and keep this while loop outside this per-channel function (after
all the message channels are initialized in probe). That way, we don't
have to sync each channel before starting to initialize the next one.
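
A user-space sketch of the two-phase shape, to show the control flow only.
Everything here is a stand-in: struct chan, chan_reset(), chan_notified()
and ring_doorbell() are hypothetical substitutes for the IVC channel, the
IVC reset and notified calls and the doorbell, and the poll counter merely
simulates the remote side taking some time.

```c
#include <assert.h>

/* Stand-in for an IVC channel; the real handshake lives in the IVC code. */
struct chan {
	int reset_requested;
	int polls_until_synced;	/* test knob: how long the remote takes */
};

static int doorbell_rings;

static void chan_reset(struct chan *c)
{
	c->reset_requested = 1;
}

static void ring_doorbell(void)
{
	doorbell_rings++;
}

/* Returns nonzero while the handshake with the remote is incomplete. */
static int chan_notified(struct chan *c)
{
	if (c->polls_until_synced > 0) {
		c->polls_until_synced--;
		return -1;
	}
	return 0;
}

static void init_all_channels(struct chan *chans, int n)
{
	int i;

	/* Phase 1: start every reset so the remote can work on all of them. */
	for (i = 0; i < n; i++) {
		chan_reset(&chans[i]);
		ring_doorbell();
	}

	/* Phase 2: only now wait for each channel to finish syncing. */
	for (i = 0; i < n; i++)
		while (chan_notified(&chans[i]))
			;
}
```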

> +
> +	return 0;
> +}
> +
> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void)
> +{
> +	if (bpmp->init_done && bpmp->ops)
> +		return bpmp->ops;
> +	return NULL;
> +}
> +EXPORT_SYMBOL(tegra_bpmp_get_ops);
> +
> +static struct tegra_bpmp_ops bpmp_ops = {
> +	.send_receive = bpmp_send_receive,
> +	.send_receive_atomic = bpmp_send_receive_atomic,
> +	.request_mrq = bpmp_request_mrq,
> +	.mrq_return = bpmp_mail_return,
> +};
> +
> +static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
> +	.ch_index = 0,
> +	.thread_ch_index = 6,
> +	.cpu_rx_ch_index = 13,
> +	.nr_ch = 14,
> +	.nr_thread_ch = 7,
> +	.ch_timeout = 60 * USEC_PER_SEC,
> +	.thread_ch_timeout = 600 * USEC_PER_SEC,

This is too large - you can bring these down to 1 sec. I also wonder if
these should be moved to DT?

> +};
> +
> +static const struct of_device_id tegra_bpmp_match[] = {
> +	{ .compatible = "nvidia,tegra186-bpmp", .data = &soc_data_tegra186 },
> +	{ }
> +};
> +
> +static int tegra_bpmp_probe(struct platform_device *pdev)
> +{
> +	const struct of_device_id *match;
> +	struct resource shmem_res;
> +	struct device_node *shmem_np;
> +	int i, ret;
> +
> +	bpmp = devm_kzalloc(&pdev->dev, sizeof(*bpmp), GFP_KERNEL);
> +	if (!bpmp)
> +		return -ENOMEM;
> +	bpmp->dev = &pdev->dev;
> +
> +	match = of_match_device(tegra_bpmp_match, &pdev->dev);
> +	if (!match)
> +		return -EINVAL;

I am curious: is this check really required (tegra_bpmp_match is part of
tegra_bpmp_driver)?

> +	bpmp->soc_data = match->data;
> +
> +	shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 0);
> +	of_address_to_resource(shmem_np, 0, &shmem_res);
> +	bpmp->tx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
> +	if (IS_ERR(bpmp->tx_base))
> +		return PTR_ERR(bpmp->tx_base);
> +
> +	shmem_np = of_parse_phandle(pdev->dev.of_node, "shmem", 1);
> +	of_address_to_resource(shmem_np, 0, &shmem_res);
> +	bpmp->rx_base = devm_ioremap_resource(&pdev->dev, &shmem_res);
> +	if (IS_ERR(bpmp->rx_base))
> +		return PTR_ERR(bpmp->rx_base);
> +
> +	bpmp->ivc_channels = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
> +					  sizeof(*bpmp->ivc_channels),
> +					  GFP_KERNEL);
> +	if (!bpmp->ivc_channels)
> +		return -ENOMEM;
> +
> +	bpmp->ch_area = devm_kcalloc(&pdev->dev, bpmp->soc_data->nr_ch,
> +				     sizeof(*bpmp->ch_area), GFP_KERNEL);
> +	if (!bpmp->ch_area)
> +		return -ENOMEM;
> +
> +	bpmp->ch_completion = devm_kcalloc(&pdev->dev,
> +					   bpmp->soc_data->nr_thread_ch,
> +					   sizeof(*bpmp->ch_completion),
> +					   GFP_KERNEL);
> +	if (!bpmp->ch_completion)
> +		return -ENOMEM;
> +
> +	/* mbox registration */
> +	bpmp->cl.dev = &pdev->dev;
> +	bpmp->cl.rx_callback = bpmp_handle_rx;
> +	bpmp->cl.tx_block = false;
> +	bpmp->cl.knows_txdone = false;
> +	bpmp->chan = mbox_request_channel(&bpmp->cl, 0);
> +	if (IS_ERR(bpmp->chan)) {
> +		if (PTR_ERR(bpmp->chan) != -EPROBE_DEFER)
> +			dev_err(&pdev->dev,
> +				"fail to get HSP mailbox, bpmp init fail.\n");
> +		return PTR_ERR(bpmp->chan);
> +	}
> +
> +	/* message channel initialization */
> +	for (i = 0; i < bpmp->soc_data->nr_ch; i++) {
> +		struct completion *comp_obj;
> +
> +		ret = bpmp_msg_chan_init(i);
> +		if (ret)
> +			return ret;
> +
> +		comp_obj = bpmp_get_completion_obj(i);
> +		if (comp_obj)
> +			init_completion(comp_obj);
> +	}
> +
> +	bpmp->ch_info.tch_free = (1 << bpmp->soc_data->nr_thread_ch) - 1;
> +	sema_init(&bpmp->ch_info.tch_sem, bpmp->soc_data->nr_thread_ch);
> +
> +	spin_lock_init(&bpmp->lock);
> +	INIT_LIST_HEAD(&bpmp->mrq_list);
> +	if (bpmp_mailman_init())
> +		return -ENODEV;
> +
> +	bpmp->init_done = true;
> +
> +	ret = bpmp_ping();
> +	if (ret)
> +		dev_err(&pdev->dev, "ping failed: %d\n", ret);
> +
> +	ret = bpmp_get_fwtag();
> +	if (ret)
> +		dev_err(&pdev->dev, "get fwtag failed: %d\n", ret);
> +
> +	/* BPMP is ready now. */
> +	bpmp->ops = &bpmp_ops;
> +
> +	return 0;
> +}
> +
> +static struct platform_driver tegra_bpmp_driver = {
> +	.driver = {
> +		.name = "tegra-bpmp",
> +		.of_match_table = tegra_bpmp_match,
> +	},
> +	.probe = tegra_bpmp_probe,
> +};
> +
> +static int __init tegra_bpmp_init(void)
> +{
> +	return platform_driver_register(&tegra_bpmp_driver);
> +}
> +core_initcall(tegra_bpmp_init);
> diff --git a/include/soc/tegra/bpmp.h b/include/soc/tegra/bpmp.h
> new file mode 100644
> index 000000000000..aaa0ef34ad7b
> --- /dev/null
> +++ b/include/soc/tegra/bpmp.h
> @@ -0,0 +1,29 @@
> +/*
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#ifndef __TEGRA_BPMP_H

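Missing #define __TEGRA_BPMP_H after the #ifndef. Without it the include
guard has no effect and a second inclusion redefines everything below.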

> +
> +typedef void (*bpmp_mrq_handler)(int mrq_code, void *data, int ch);
> +
> +struct tegra_bpmp_ops {
> +	int (*send_receive)(int mrq_code, void *ob_data, int ob_sz,
> +			    void *ib_data, int ib_sz);
> +	int (*send_receive_atomic)(int mrq_code, void *ob_data, int ob_sz,
> +			    void *ib_data, int ib_sz);
> +	int (*request_mrq)(int mrq_code, bpmp_mrq_handler handler, void *data);
> +	void (*mrq_return)(int ch, int ret_code, int val);
> +};
> +
> +struct tegra_bpmp_ops *tegra_bpmp_get_ops(void);
> +
> +#endif /* __TEGRA_BPMP_H */
> diff --git a/include/soc/tegra/bpmp_abi.h b/include/soc/tegra/bpmp_abi.h
> new file mode 100644
> index 000000000000..0aaef5960e29
> --- /dev/null
> +++ b/include/soc/tegra/bpmp_abi.h
> @@ -0,0 +1,1601 @@
> +/*
> + * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _ABI_BPMP_ABI_H_
> +#define _ABI_BPMP_ABI_H_
> +
> +#ifdef LK
> +#include <stdint.h>
> +#endif
> +
> +#ifndef __ABI_PACKED
> +#define __ABI_PACKED __attribute__((packed))
> +#endif
> +
> +#ifdef NO_GCC_EXTENSIONS
> +#define EMPTY char empty;
> +#define EMPTY_ARRAY 1
> +#else
> +#define EMPTY
> +#define EMPTY_ARRAY 0
> +#endif
> +
> +#ifndef __UNION_ANON
> +#define __UNION_ANON
> +#endif
> +/**
> + * @file
> + */
> +
> +
> +/**
> + * @defgroup MRQ MRQ Messages
> + * @brief Messages sent to/from BPMP via IPC
> + * @{
> + *   @defgroup MRQ_Format Message Format
> + *   @defgroup MRQ_Codes Message Request (MRQ) Codes
> + *   @defgroup MRQ_Payloads Message Payloads
> + *   @defgroup Error_Codes Error Codes
> + * @}
> + */
> +
> +/**
> + * @addtogroup MRQ_Format Message Format
> + * @{
> + * The CPU requests the BPMP to perform a particular service by
> + * sending it an IVC frame containing a single MRQ message. An MRQ
> + * message consists of a @ref mrq_request followed by a payload whose
> + * format depends on mrq_request::mrq.
> + *
> + * The BPMP processes the data and replies with an IVC frame (on the
> + * same IVC channel) containing an MRQ response. An MRQ response
> + * consists of a @ref mrq_response followed by a payload whose format
> + * depends on the associated mrq_request::mrq.
> + *
> + * A well-defined subset of the MRQ messages that the CPU sends to the
> + * BPMP can lead to BPMP eventually sending an MRQ message to the
> + * CPU. For example, when the CPU uses an #MRQ_THERMAL message to set
> + * a thermal trip point, the BPMP may eventually send a single
> + * #MRQ_THERMAL message of its own to the CPU indicating that the trip
> + * point has been crossed.
> + * @}
> + */
> +
> +/**
> + * @ingroup MRQ_Format
> + * @brief header for an MRQ message
> + *
> + * Provides the MRQ number for the MRQ message: #mrq. The remainder of
> + * the MRQ message is a payload (immediately following the
> + * mrq_request) whose format depends on mrq.
> + *
> + * @todo document the flags
> + */
> +struct mrq_request {
> +	/** @brief MRQ number of the request */
> +	uint32_t mrq;
> +	/** @brief flags for the request */
> +	uint32_t flags;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Format
> + * @brief header for an MRQ response
> + *
> + *  Provides an error code for the associated MRQ message. The
> + *  remainder of the MRQ response is a payload (immediately following
> + *  the mrq_response) whose format depends on the associated
> + *  mrq_request::mrq
> + *
> + * @todo document the flags
> + */
> +struct mrq_response {
> +	/** @brief error code for the MRQ request itself */
> +	int32_t err;
> +	/** @brief flags for the response */
> +	uint32_t flags;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Format
> + * Minimum needed size for an IPC message buffer
> + */
> +#define MSG_MIN_SZ	128
> +/**
> + * @ingroup MRQ_Format
> + *  Minimum size guaranteed for data in an IPC message buffer
> + */
> +#define MSG_DATA_MIN_SZ	120
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @name Legal MRQ codes
> + * These are the legal values for mrq_request::mrq
> + * @{
> + */
> +
> +#define MRQ_PING		0
> +#define MRQ_QUERY_TAG		1
> +#define MRQ_MODULE_LOAD		4
> +#define MRQ_MODULE_UNLOAD	5
> +#define MRQ_TRACE_MODIFY	7
> +#define MRQ_WRITE_TRACE		8
> +#define MRQ_THREADED_PING	9
> +#define MRQ_MODULE_MAIL		11
> +#define MRQ_DEBUGFS		19
> +#define MRQ_RESET		20
> +#define MRQ_I2C			21
> +#define MRQ_CLK			22
> +#define MRQ_QUERY_ABI		23
> +#define MRQ_PG_READ_STATE	25
> +#define MRQ_PG_UPDATE_STATE	26
> +#define MRQ_THERMAL		27
> +#define MRQ_CPU_VHINT		28
> +#define MRQ_ABI_RATCHET		29
> +#define MRQ_EMC_DVFS_LATENCY	31
> +#define MRQ_TRACE_ITER		64
> +
> +/** @} */
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @brief Maximum MRQ code to be sent by CPU software to
> + * BPMP. Subject to change in future
> + */
> +#define MAX_CPU_MRQ_ID		64
> +
> +/**
> + * @addtogroup MRQ_Payloads Message Payloads
> + * @{
> + *   @defgroup Ping
> + *   @defgroup Query_Tag Query Tag
> + *   @defgroup Module Loadable Modules
> + *   @defgroup Trace
> + *   @defgroup Debugfs
> + *   @defgroup Reset
> + *   @defgroup I2C
> + *   @defgroup Clocks
> + *   @defgroup ABI_info ABI Info
> + *   @defgroup MC_Flush MC Flush
> + *   @defgroup Powergating
> + *   @defgroup Thermal
> + *   @defgroup Vhint CPU Voltage hint
> + *   @defgroup MRQ_Deprecated Deprecated MRQ messages
> + *   @defgroup EMC
> + * @}
> + */
> +
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_PING
> + * @brief A simple ping
> + *
> + * * Platforms: All
> + * * Initiators: Any
> + * * Targets: Any
> + * * Request Payload: @ref mrq_ping_request
> + * * Response Payload: @ref mrq_ping_response
> + *
> + * @ingroup MRQ_Codes
> + * @def MRQ_THREADED_PING
> + * @brief A deeper ping
> + *
> + * * Platforms: All
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_ping_request
> + * * Response Payload: @ref mrq_ping_response
> + *
> + * Behavior is equivalent to a simple #MRQ_PING except that BPMP
> + * responds from a thread context (providing a slightly more robust
> + * sign of life).
> + *
> + */
> +
> +/**
> + * @ingroup Ping
> + * @brief request with #MRQ_PING
> + *
> + * Used by the sender of an #MRQ_PING message to request a pong from
> + * recipient. The response from the recipient is computed based on
> + * #challenge.
> + */
> +struct mrq_ping_request {
> +/** @brief arbitrarily chosen value */
> +	uint32_t challenge;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Ping
> + * @brief response to #MRQ_PING
> + *
> + * Sent in response to an #MRQ_PING message. #reply should be the
> + * mrq_ping_request challenge left shifted by 1 with the carry-bit
> + * dropped.
> + *
> + */
> +struct mrq_ping_response {
> +	/** @brief response to the MRQ_PING challenge */
> +	uint32_t reply;
> +} __ABI_PACKED;
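
For reference, my reading of "left shifted by 1 with the carry-bit
dropped" is the sketch below. bpmp_ping_expected_reply() is a made-up
name for illustration, not something in this patch, and I am assuming the
comment describes the firmware's behavior accurately.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): reply expected for a challenge. */
static inline uint32_t bpmp_ping_expected_reply(uint32_t challenge)
{
	/* Shift left by one; the bit shifted out of bit 31 is discarded. */
	return (uint32_t)(challenge << 1);
}
```

A caller could compare mrq_ping_response::reply against this value to
decide whether the pong is valid.
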
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_QUERY_TAG
> + * @brief Query BPMP firmware's tag (i.e. version information)
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_query_tag_request
> + * * Response Payload: N/A
> + *
> + */
> +
> +/**
> + * @ingroup Query_Tag
> + * @brief request with #MRQ_QUERY_TAG
> + *
> + * Used by #MRQ_QUERY_TAG call to ask BPMP to fill in the memory
> + * pointed to by #addr with BPMP firmware header.
> + *
> + * The sender is responsible for ensuring that #addr is mapped into
> + * the recipient's address map.
> + */
> +struct mrq_query_tag_request {
> +  /** @brief base address to store the firmware header */
> +	uint32_t addr;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_MODULE_LOAD
> + * @brief dynamically load a BPMP code module
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_module_load_request
> + * * Response Payload: @ref mrq_module_load_response
> + *
> + * @note This MRQ is disabled on production systems
> + *
> + */
> +
> +/**
> + * @ingroup Module
> + * @brief request with #MRQ_MODULE_LOAD
> + *
> + * Used by #MRQ_MODULE_LOAD calls to ask the recipient to dynamically
> + * load the code located at #phys_addr and having size #size
> + * bytes. #phys_addr is treated as a void pointer.
> + *
> + * The recipient copies the code from #phys_addr to locally allocated
> + * memory prior to responding to this message.
> + *
> + * @todo document the module header format
> + *
> + * The sender is responsible for ensuring that the code is mapped in
> + * the recipient's address map.
> + *
> + */
> +struct mrq_module_load_request {
> +	/** @brief base address of the code to load. Treated as (void *) */
> +	uint32_t phys_addr; /* (void *) */
> +	/** @brief size in bytes of code to load */
> +	uint32_t size;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Module
> + * @brief response to #MRQ_MODULE_LOAD
> + *
> + * @todo document mrq_response::err
> + */
> +struct mrq_module_load_response {
> +	/** @brief handle to the loaded module */
> +	uint32_t base;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_MODULE_UNLOAD
> + * @brief unload a previously loaded code module
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_module_unload_request
> + * * Response Payload: N/A
> + *
> + * @note This MRQ is disabled on production systems
> + */
> +
> +/**
> + * @ingroup Module
> + * @brief request with #MRQ_MODULE_UNLOAD
> + *
> + * Used by #MRQ_MODULE_UNLOAD calls to request that a previously loaded
> + * module be unloaded.
> + */
> +struct mrq_module_unload_request {
> +	/** @brief handle of the module to unload */
> +	uint32_t base;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_TRACE_MODIFY
> + * @brief modify the set of enabled trace events
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_trace_modify_request
> + * * Response Payload: @ref mrq_trace_modify_response
> + *
> + * @note This MRQ is disabled on production systems
> + */
> +
> +/**
> + * @ingroup Trace
> + * @brief request with #MRQ_TRACE_MODIFY
> + *
> + * Used by %MRQ_TRACE_MODIFY calls to enable or disable specific trace
> + * events.  #set takes precedence for any bit set in both #set and
> + * #clr.
> + */
> +struct mrq_trace_modify_request {
> +	/** @brief bit mask of trace events to disable */
> +	uint32_t clr;
> +	/** @brief bit mask of trace events to enable */
> +	uint32_t set;
> +} __ABI_PACKED;
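
If I read the precedence rule correctly, the resulting mask can be
modeled as below. This is only a sketch of the documented semantics on
the CPU side; trace_mask_apply() is a hypothetical name and the firmware
may well compute it differently.

```c
#include <stdint.h>

/* Hypothetical model of MRQ_TRACE_MODIFY: returns the new enable mask. */
static inline uint32_t trace_mask_apply(uint32_t mask, uint32_t clr,
					uint32_t set)
{
	/* Clear first, then set, so a bit present in both ends up enabled. */
	return (mask & ~clr) | set;
}
```
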
> +
> +/**
> + * @ingroup Trace
> + * @brief response to #MRQ_TRACE_MODIFY
> + *
> + * Sent in response to an #MRQ_TRACE_MODIFY message. #mask reflects the
> + * state of which events are enabled after the recipient acted on the
> + * message.
> + *
> + */
> +struct mrq_trace_modify_response {
> +	/** @brief bit mask of trace event enable states */
> +	uint32_t mask;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_WRITE_TRACE
> + * @brief Write trace data to a buffer
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_write_trace_request
> + * * Response Payload: @ref mrq_write_trace_response
> + *
> + * mrq_response::err depends on the @ref mrq_write_trace_request field
> + * values. err is -#BPMP_EINVAL if size is zero or area is NULL or
> + * area is in an illegal range. A positive value for err indicates the
> + * number of bytes written to area.
> + *
> + * @note This MRQ is disabled on production systems
> + */
> +
> +/**
> + * @ingroup Trace
> + * @brief request with #MRQ_WRITE_TRACE
> + *
> + * Used by MRQ_WRITE_TRACE calls to ask the recipient to copy trace
> + * data from the recipient's local buffer to the output buffer. #area
> + * is treated as a byte-aligned pointer in the recipient's address
> + * space.
> + *
> + * The sender is responsible for ensuring that the output
> + * buffer is mapped in the recipient's address map. The recipient is
> + * responsible for protecting its own code and data from accidental
> + * overwrites.
> + */
> +struct mrq_write_trace_request {
> +	/** @brief base address of output buffer */
> +	uint32_t area;
> +	/** @brief size in bytes of the output buffer */
> +	uint32_t size;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Trace
> + * @brief response to #MRQ_WRITE_TRACE
> + *
> + * Once this response is sent, the respondent will not access the
> + * output buffer further.
> + */
> +struct mrq_write_trace_response {
> +	/**
> +	 * @brief flag whether the local buffer has been fully drained
> +	 *
> +	 * Value is 1 if the entire local trace buffer has been
> +	 * drained to the output buffer. Value is 0 otherwise.
> +	 */
> +	uint32_t eof;
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct mrq_threaded_ping_request {
> +	uint32_t challenge;
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct mrq_threaded_ping_response {
> +	uint32_t reply;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_MODULE_MAIL
> + * @brief send a message to a loadable module
> + *
> + * * Platforms: All
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_module_mail_request
> + * * Response Payload: @ref mrq_module_mail_response
> + *
> + * @note This MRQ is disabled on production systems
> + */
> +
> +/**
> + * @ingroup Module
> + * @brief request with #MRQ_MODULE_MAIL
> + */
> +struct mrq_module_mail_request {
> +	/** @brief handle to the previously loaded module */
> +	uint32_t base;
> +	/** @brief module-specific mail payload
> +	 *
> +	 * The length of data[ ] is unknown to the BPMP core firmware
> +	 * but it is limited to the size of an IPC message.
> +	 */
> +	uint8_t data[EMPTY_ARRAY];
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Module
> + * @brief response to #MRQ_MODULE_MAIL
> + */
> +struct mrq_module_mail_response {
> +	/** @brief module-specific mail payload
> +	 *
> +	 * The length of data[ ] is unknown to the BPMP core firmware
> +	 * but it is limited to the size of an IPC message.
> +	 */
> +	uint8_t data[EMPTY_ARRAY];
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_DEBUGFS
> + * @brief Interact with BPMP's debugfs file nodes
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_debugfs_request
> + * * Response Payload: @ref mrq_debugfs_response
> + */
> +
> +/**
> + * @addtogroup Debugfs
> + * @{
> + *
> + * The BPMP firmware implements a pseudo-filesystem called
> + * debugfs. Any driver within the firmware may register with debugfs
> + * to expose an arbitrary set of "files" in the filesystem. When
> + * software on the CPU writes to a debugfs file, debugfs passes the
> + * written data to a callback provided by the driver. When software on
> + * the CPU reads a debugfs file, debugfs queries the driver for the
> + * data to return to the CPU. The intention of the debugfs filesystem
> + * is to provide information useful for debugging the system at
> + * runtime.
> + *
> + * @note The files exposed via debugfs are not part of the
> + * BPMP firmware's ABI. debugfs files may be added or removed in any
> + * given version of the firmware. Typically the semantics of a debugfs
> + * file are consistent from version to version but even that is not
> + * guaranteed.
> + *
> + * @}
> + */
> +/** @ingroup Debugfs */
> +enum mrq_debugfs_commands {
> +	CMD_DEBUGFS_READ = 1,
> +	CMD_DEBUGFS_WRITE = 2,
> +	CMD_DEBUGFS_DUMPDIR = 3,
> +	CMD_DEBUGFS_MAX
> +};
> +
> +/**
> + * @ingroup Debugfs
> + * @brief parameters for CMD_DEBUGFS_READ/WRITE command
> + */
> +struct cmd_debugfs_fileop_request {
> +	/** @brief physical address pointing at filename */
> +	uint32_t fnameaddr;
> +	/** @brief length in bytes of filename buffer */
> +	uint32_t fnamelen;
> +	/** @brief physical address pointing to data buffer */
> +	uint32_t dataaddr;
> +	/** @brief length in bytes of data buffer */
> +	uint32_t datalen;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Debugfs
> + * @brief parameters for CMD_DEBUGFS_DUMPDIR command
> + */
> +struct cmd_debugfs_dumpdir_request {
> +	/** @brief physical address pointing to data buffer */
> +	uint32_t dataaddr;
> +	/** @brief length in bytes of data buffer */
> +	uint32_t datalen;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Debugfs
> + * @brief response data for CMD_DEBUGFS_READ/WRITE command
> + */
> +struct cmd_debugfs_fileop_response {
> +	/** @brief always 0 */
> +	uint32_t reserved;
> +	/** @brief number of bytes read from or written to data buffer */
> +	uint32_t nbytes;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Debugfs
> + * @brief response data for CMD_DEBUGFS_DUMPDIR command
> + */
> +struct cmd_debugfs_dumpdir_response {
> +	/** @brief always 0 */
> +	uint32_t reserved;
> +	/** @brief number of bytes read from or written to data buffer */
> +	uint32_t nbytes;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Debugfs
> + * @brief request with #MRQ_DEBUGFS.
> + *
> + * The sender of an MRQ_DEBUGFS message uses #cmd to specify a debugfs
> + * command to execute. Legal commands are the values of @ref
> + * mrq_debugfs_commands. Each command requires a specific additional
> + * payload of data.
> + *
> + * |command            |payload|
> + * |-------------------|-------|
> + * |CMD_DEBUGFS_READ   |fop    |
> + * |CMD_DEBUGFS_WRITE  |fop    |
> + * |CMD_DEBUGFS_DUMPDIR|dumpdir|
> + */
> +struct mrq_debugfs_request {
> +	uint32_t cmd;
> +	union {
> +		struct cmd_debugfs_fileop_request fop;
> +		struct cmd_debugfs_dumpdir_request dumpdir;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup Debugfs
> + */
> +struct mrq_debugfs_response {
> +	/** @brief always 0 */
> +	int32_t reserved;
> +	union {
> +		/** @brief response data for CMD_DEBUGFS_READ OR
> +		 * CMD_DEBUGFS_WRITE command
> +		 */
> +		struct cmd_debugfs_fileop_response fop;
> +		/** @brief response data for CMD_DEBUGFS_DUMPDIR command */
> +		struct cmd_debugfs_dumpdir_response dumpdir;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
> +
> +/**
> + * @addtogroup Debugfs
> + * @{
> + */
> +#define DEBUGFS_S_ISDIR	(1 << 9)
> +#define DEBUGFS_S_IRUSR	(1 << 8)
> +#define DEBUGFS_S_IWUSR	(1 << 7)
> +/** @} */
> +
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_RESET
> + * @brief reset an IP block
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_reset_request
> + * * Response Payload: N/A
> + */
> +
> +/**
> + * @ingroup Reset
> + */
> +enum mrq_reset_commands {
> +	CMD_RESET_ASSERT = 1,
> +	CMD_RESET_DEASSERT = 2,
> +	CMD_RESET_MODULE = 3,
> +	CMD_RESET_MAX, /* not part of ABI and subject to change */
> +};
> +
> +/**
> + * @ingroup Reset
> + * @brief request with MRQ_RESET
> + *
> + * Used by the sender of an #MRQ_RESET message to request BPMP to
> + * assert or deassert a given reset line.
> + */
> +struct mrq_reset_request {
> +	/** @brief reset action to perform (@enum mrq_reset_commands) */
> +	uint32_t cmd;
> +	/** @brief id of the reset to be affected */
> +	uint32_t reset_id;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_I2C
> + * @brief issue an i2c transaction
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_i2c_request
> + * * Response Payload: @ref mrq_i2c_response
> + */
> +
> +/**
> + * @addtogroup I2C
> + * @{
> + */
> +#define TEGRA_I2C_IPC_MAX_IN_BUF_SIZE	(MSG_DATA_MIN_SZ - 12)
> +#define TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE	(MSG_DATA_MIN_SZ - 4)
> +/** @} */
> +
> +/**
> + * @ingroup I2C
> + * @name Serial I2C flags
> + * Use these flags with serial_i2c_request::flags
> + * @{
> + */
> +#define SERIALI2C_TEN           0x0010
> +#define SERIALI2C_RD            0x0001
> +#define SERIALI2C_STOP          0x8000
> +#define SERIALI2C_NOSTART       0x4000
> +#define SERIALI2C_REV_DIR_ADDR  0x2000
> +#define SERIALI2C_IGNORE_NAK    0x1000
> +#define SERIALI2C_NO_RD_ACK     0x0800
> +#define SERIALI2C_RECV_LEN      0x0400
> +/** @} */
> +/** @ingroup I2C */
> +enum {
> +	CMD_I2C_XFER = 1
> +};
> +
> +/**
> + * @ingroup I2C
> + * @brief serializable i2c request
> + *
> + * Instances of this structure are packed (little-endian) into
> + * cmd_i2c_xfer_request::data_buf. Each instance represents a single
> + * transaction (or a portion of a transaction with repeated starts) on
> + * an i2c bus.
> + *
> + * Because these structures are packed, some instances are likely to
> + * be misaligned. Additionally because #data is variable length, it is
> + * not possible to iterate through a serialized list of these
> + * structures without inspecting #len in each instance.  It may be
> + * easier to serialize or deserialize cmd_i2c_xfer_request::data_buf
> + * manually rather than using this structure definition.
> +*/
> +struct serial_i2c_request {
> +	/** @brief I2C slave address */
> +	uint16_t addr;
> +	/** @brief bitmask of SERIALI2C_ flags */
> +	uint16_t flags;
> +	/** @brief length of I2C transaction in bytes */
> +	uint16_t len;
> +	/** @brief for write transactions only, #len bytes of data */
> +	uint8_t data[];
> +} __ABI_PACKED;
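
Since the comment itself suggests serializing data_buf by hand, here is
roughly what I would expect that to look like. It is a sketch only:
i2c_req_serialize() and put_le16() are hypothetical names, and I am
assuming from the field comments that read requests (SERIALI2C_RD set)
carry no payload bytes after the 6-byte header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SERIALI2C_RD	0x0001	/* same value as in the patch */

/* Store a 16-bit value in little-endian byte order. */
static inline void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/*
 * Hypothetical helper: write one serialized request at buf.
 * Returns the number of bytes written, or 0 if it does not fit in room.
 */
static inline size_t i2c_req_serialize(uint8_t *buf, size_t room,
				       uint16_t addr, uint16_t flags,
				       uint16_t len, const uint8_t *data)
{
	/* Assumption: only write transactions carry payload bytes. */
	size_t payload = (flags & SERIALI2C_RD) ? 0 : len;
	size_t need = 6 + payload;

	if (need > room)
		return 0;

	put_le16(buf, addr);
	put_le16(buf + 2, flags);
	put_le16(buf + 4, len);
	if (payload)
		memcpy(buf + 6, data, payload);

	return need;
}
```

Writing the fields byte by byte avoids both the misaligned accesses and
any dependence on host endianness.
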
> +
> +/**
> + * @ingroup I2C
> + * @brief trigger one or more i2c transactions
> + */
> +struct cmd_i2c_xfer_request {
> +	/** @brief valid bus number from mach-t186/i2c-t186.h*/
> +	uint32_t bus_id;
> +
> +	/** @brief count of valid bytes in #data_buf*/
> +	uint32_t data_size;
> +
> +	/** @brief serialized packed instances of @ref serial_i2c_request*/
> +	uint8_t data_buf[TEGRA_I2C_IPC_MAX_IN_BUF_SIZE];
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup I2C
> + * @brief container for data read from the i2c bus
> + *
> + * Processing a cmd_i2c_xfer_request::data_buf causes BPMP to execute
> + * zero or more I2C reads. The data read from the bus is serialized
> + * into #data_buf.
> + */
> +struct cmd_i2c_xfer_response {
> +	/** @brief count of valid bytes in #data_buf*/
> +	uint32_t data_size;
> +	/** @brief i2c read data */
> +	uint8_t data_buf[TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE];
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup I2C
> + * @brief request with #MRQ_I2C
> + */
> +struct mrq_i2c_request {
> +	/** @brief always CMD_I2C_XFER (i.e. 1) */
> +	uint32_t cmd;
> +	/** @brief parameters of the transfer request */
> +	struct cmd_i2c_xfer_request xfer;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup I2C
> + * @brief response to #MRQ_I2C
> + */
> +struct mrq_i2c_response {
> +	struct cmd_i2c_xfer_response xfer;
> +} __ABI_PACKED;
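
The "- 12" and "- 4" in the buffer size macros look like they are chosen
so that both I2C payloads fill MSG_DATA_MIN_SZ exactly. That is my
inference from the numbers, not something the header states. If it is
intended, a compile-time check along these lines might be worth having
(definitions copied from the patch, with __ABI_PACKED expanded):

```c
#include <stdint.h>

#define MSG_DATA_MIN_SZ			120
#define TEGRA_I2C_IPC_MAX_IN_BUF_SIZE	(MSG_DATA_MIN_SZ - 12)
#define TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE	(MSG_DATA_MIN_SZ - 4)

struct cmd_i2c_xfer_request {
	uint32_t bus_id;
	uint32_t data_size;
	uint8_t data_buf[TEGRA_I2C_IPC_MAX_IN_BUF_SIZE];
} __attribute__((packed));

struct cmd_i2c_xfer_response {
	uint32_t data_size;
	uint8_t data_buf[TEGRA_I2C_IPC_MAX_OUT_BUF_SIZE];
} __attribute__((packed));

struct mrq_i2c_request {
	uint32_t cmd;
	struct cmd_i2c_xfer_request xfer;
} __attribute__((packed));

struct mrq_i2c_response {
	struct cmd_i2c_xfer_response xfer;
} __attribute__((packed));

/* 4 (cmd) + 4 (bus_id) + 4 (data_size) + 108 (data_buf) = 120 */
_Static_assert(sizeof(struct mrq_i2c_request) == MSG_DATA_MIN_SZ,
	       "MRQ_I2C request must fit the guaranteed payload size");
/* 4 (data_size) + 116 (data_buf) = 120 */
_Static_assert(sizeof(struct mrq_i2c_response) == MSG_DATA_MIN_SZ,
	       "MRQ_I2C response must fit the guaranteed payload size");
```

In the kernel proper this would presumably be a BUILD_BUG_ON() in the
driver rather than _Static_assert in the ABI header.
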
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_CLK
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_clk_request
> + * * Response Payload: @ref mrq_clk_response
> + * @addtogroup Clocks
> + * @{
> + */
> +
> +/**
> + * @name MRQ_CLK sub-commands
> + * @{
> + */
> +enum {
> +	CMD_CLK_GET_RATE = 1,
> +	CMD_CLK_SET_RATE = 2,
> +	CMD_CLK_ROUND_RATE = 3,
> +	CMD_CLK_GET_PARENT = 4,
> +	CMD_CLK_SET_PARENT = 5,
> +	CMD_CLK_IS_ENABLED = 6,
> +	CMD_CLK_ENABLE = 7,
> +	CMD_CLK_DISABLE = 8,
> +	CMD_CLK_GET_ALL_INFO = 14,
> +	CMD_CLK_GET_MAX_CLK_ID = 15,
> +	CMD_CLK_MAX,
> +};
> +/** @} */
> +
> +#define MRQ_CLK_NAME_MAXLEN	40
> +#define MRQ_CLK_MAX_PARENTS	16
> +
> +/** @private */
> +struct cmd_clk_get_rate_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +struct cmd_clk_get_rate_response {
> +	int64_t rate;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_set_rate_request {
> +	int32_t unused;
> +	int64_t rate;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_set_rate_response {
> +	int64_t rate;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_round_rate_request {
> +	int32_t unused;
> +	int64_t rate;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_round_rate_response {
> +	int64_t rate;
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_get_parent_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +struct cmd_clk_get_parent_response {
> +	uint32_t parent_id;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_set_parent_request {
> +	uint32_t parent_id;
> +} __ABI_PACKED;
> +
> +struct cmd_clk_set_parent_response {
> +	uint32_t parent_id;
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_is_enabled_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +struct cmd_clk_is_enabled_response {
> +	int32_t state;
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_enable_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_enable_response {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_disable_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_disable_response {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_get_all_info_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +struct cmd_clk_get_all_info_response {
> +	uint32_t flags;
> +	uint32_t parent;
> +	uint32_t parents[MRQ_CLK_MAX_PARENTS];
> +	uint8_t num_parents;
> +	uint8_t name[MRQ_CLK_NAME_MAXLEN];
> +} __ABI_PACKED;
> +
> +/** @private */
> +struct cmd_clk_get_max_clk_id_request {
> +	EMPTY
> +} __ABI_PACKED;
> +
> +struct cmd_clk_get_max_clk_id_response {
> +	uint32_t max_id;
> +} __ABI_PACKED;
> +/** @} */
> +
> +/**
> + * @ingroup Clocks
> + * @brief request with #MRQ_CLK
> + *
> + * Used by the sender of an #MRQ_CLK message to control clocks. The
> + * clk_request is split into several sub-commands. Some sub-commands
> + * require no additional data. Others have a sub-command specific
> + * payload
> + *
> + * |sub-command                 |payload                |
> + * |----------------------------|-----------------------|
> + * |CMD_CLK_GET_RATE            |-                      |
> + * |CMD_CLK_SET_RATE            |clk_set_rate           |
> + * |CMD_CLK_ROUND_RATE          |clk_round_rate         |
> + * |CMD_CLK_GET_PARENT          |-                      |
> + * |CMD_CLK_SET_PARENT          |clk_set_parent         |
> + * |CMD_CLK_IS_ENABLED          |-                      |
> + * |CMD_CLK_ENABLE              |-                      |
> + * |CMD_CLK_DISABLE             |-                      |
> + * |CMD_CLK_GET_ALL_INFO        |-                      |
> + * |CMD_CLK_GET_MAX_CLK_ID      |-                      |
> + *
> + */
> +
> +struct mrq_clk_request {
> +	/** @brief sub-command and clock id concatenated to 32-bit word.
> +	 * - bits[31..24] is the sub-cmd.
> +	 * - bits[23..0] is the clock id
> +	 */
> +	uint32_t cmd_and_id;
> +
> +	union {
> +		/** @private */
> +		struct cmd_clk_get_rate_request clk_get_rate;
> +		struct cmd_clk_set_rate_request clk_set_rate;
> +		struct cmd_clk_round_rate_request clk_round_rate;
> +		/** @private */
> +		struct cmd_clk_get_parent_request clk_get_parent;
> +		struct cmd_clk_set_parent_request clk_set_parent;
> +		/** @private */
> +		struct cmd_clk_enable_request clk_enable;
> +		/** @private */
> +		struct cmd_clk_disable_request clk_disable;
> +		/** @private */
> +		struct cmd_clk_is_enabled_request clk_is_enabled;
> +		/** @private */
> +		struct cmd_clk_get_all_info_request clk_get_all_info;
> +		/** @private */
> +		struct cmd_clk_get_max_clk_id_request clk_get_max_clk_id;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
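
The header documents the cmd_and_id layout but gives callers nothing to
build it with. Something like the following might help. The names
(CLK_ID_BITS, clk_cmd_and_id() and friends) are hypothetical; only the
bit positions come from the comment above.

```c
#include <stdint.h>

#define CLK_ID_BITS	24
#define CLK_ID_MASK	((UINT32_C(1) << CLK_ID_BITS) - 1)

/* Pack sub-command into bits[31..24] and clock id into bits[23..0]. */
static inline uint32_t clk_cmd_and_id(uint32_t cmd, uint32_t id)
{
	return ((cmd & 0xff) << CLK_ID_BITS) | (id & CLK_ID_MASK);
}

/* Extract the sub-command from a packed word. */
static inline uint32_t clk_cmd(uint32_t cmd_and_id)
{
	return cmd_and_id >> CLK_ID_BITS;
}

/* Extract the clock id from a packed word. */
static inline uint32_t clk_id(uint32_t cmd_and_id)
{
	return cmd_and_id & CLK_ID_MASK;
}
```

For example, CMD_CLK_SET_RATE (2) on clock 0x123 would be sent as
0x02000123.
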
> +
> +/**
> + * @ingroup Clocks
> + * @brief response to MRQ_CLK
> + *
> + * Each sub-command supported by @ref mrq_clk_request may return
> + * sub-command-specific data. Some do and some do not as indicated in
> + * the following table
> + *
> + * |sub-command                 |payload                 |
> + * |----------------------------|------------------------|
> + * |CMD_CLK_GET_RATE            |clk_get_rate            |
> + * |CMD_CLK_SET_RATE            |clk_set_rate            |
> + * |CMD_CLK_ROUND_RATE          |clk_round_rate          |
> + * |CMD_CLK_GET_PARENT          |clk_get_parent          |
> + * |CMD_CLK_SET_PARENT          |clk_set_parent          |
> + * |CMD_CLK_IS_ENABLED          |clk_is_enabled          |
> + * |CMD_CLK_ENABLE              |-                       |
> + * |CMD_CLK_DISABLE             |-                       |
> + * |CMD_CLK_GET_ALL_INFO        |clk_get_all_info        |
> + * |CMD_CLK_GET_MAX_CLK_ID      |clk_get_max_clk_id      |
> + *
> + */
> +
> +struct mrq_clk_response {
> +	union {
> +		struct cmd_clk_get_rate_response clk_get_rate;
> +		struct cmd_clk_set_rate_response clk_set_rate;
> +		struct cmd_clk_round_rate_response clk_round_rate;
> +		struct cmd_clk_get_parent_response clk_get_parent;
> +		struct cmd_clk_set_parent_response clk_set_parent;
> +		/** @private */
> +		struct cmd_clk_enable_response clk_enable;
> +		/** @private */
> +		struct cmd_clk_disable_response clk_disable;
> +		struct cmd_clk_is_enabled_response clk_is_enabled;
> +		struct cmd_clk_get_all_info_response clk_get_all_info;
> +		struct cmd_clk_get_max_clk_id_response clk_get_max_clk_id;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_QUERY_ABI
> + * @brief check if an MRQ is implemented
> + *
> + * * Platforms: All
> + * * Initiators: Any
> + * * Targets: Any
> + * * Request Payload: @ref mrq_query_abi_request
> + * * Response Payload: @ref mrq_query_abi_response
> + */
> +
> +/**
> + * @ingroup ABI_info
> + * @brief request with MRQ_QUERY_ABI
> + *
> + * Used by #MRQ_QUERY_ABI call to check if MRQ code #mrq is supported
> + * by the recipient.
> + */
> +struct mrq_query_abi_request {
> +	/** @brief MRQ code to query */
> +	uint32_t mrq;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup ABI_info
> + * @brief response to MRQ_QUERY_ABI
> + */
> +struct mrq_query_abi_response {
> +	/** @brief 0 if queried MRQ is supported. Else, -#BPMP_ENODEV */
> +	int32_t status;
> +} __ABI_PACKED;
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_PG_READ_STATE
> + * @brief read the power-gating state of a partition
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_pg_read_state_request
> + * * Response Payload: @ref mrq_pg_read_state_response
> + * @addtogroup Powergating
> + * @{
> + */
> +
> +/**
> + * @brief request with #MRQ_PG_READ_STATE
> + *
> + * Used by MRQ_PG_READ_STATE call to read the current state of a
> + * partition.
> + */
> +struct mrq_pg_read_state_request {
> +	/** @brief ID of partition */
> +	uint32_t partition_id;
> +} __ABI_PACKED;
> +
> +/**
> + * @brief response to MRQ_PG_READ_STATE
> + * @todo define possible errors.
> + */
> +struct mrq_pg_read_state_response {
> +	/** @brief read as don't care */
> +	uint32_t sram_state;
> +	/** @brief state of power partition
> +	 * * 0 : off
> +	 * * 1 : on
> +	 */
> +	uint32_t logic_state;
> +} __ABI_PACKED;
> +
> +/** @} */
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_PG_UPDATE_STATE
> + * @brief modify the power-gating state of a partition
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_pg_update_state_request
> + * * Response Payload: N/A
> + * @addtogroup Powergating
> + * @{
> + */
> +
> +/**
> + * @brief request with #MRQ_PG_UPDATE_STATE
> + *
> + * Used by #MRQ_PG_UPDATE_STATE call to request BPMP to change the
> + * state of a power partition #partition_id.
> + */
> +struct mrq_pg_update_state_request {
> +	/** @brief ID of partition */
> +	uint32_t partition_id;
> +	/** @brief secondary control of power partition
> +	 *  @details Ignored by many versions of the BPMP
> +	 *  firmware. For maximum compatibility, set the value
> +	 *  according to @ref logic_state
> +	 * *  0x1: power ON partition (@ref logic_state == 0x3)
> +	 * *  0x3: power OFF partition (@ref logic_state == 0x1)
> +	 */
> +	uint32_t sram_state;
> +	/** @brief controls state of power partition, legal values are
> +	 * *  0x1 : power OFF partition
> +	 * *  0x3 : power ON partition
> +	 */
> +	uint32_t logic_state;
> +	/** @brief change state of clocks of the power partition, legal values
> +	 * *  0x0 : do not change clock state
> +	 * *  0x1 : disable partition clocks (only applicable when
> +	 *          @ref logic_state == 0x1)
> +	 * *  0x3 : enable partition clocks (only applicable when
> +	 *          @ref logic_state == 0x3)
> +	 */
> +	uint32_t clock_state;
> +} __ABI_PACKED;
> +/** @} */
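The state encodings documented above can be collected into a small helper. This is a minimal sketch, not part of the patch: the struct mirrors mrq_pg_update_state_request with plain fixed-width fields (the real header adds __ABI_PACKED), and the names pg_fill_update_request, PG_LOGIC_*, PG_SRAM_* and PG_CLK_* are hypothetical. It follows the header's advice to set sram_state according to logic_state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors struct mrq_pg_update_state_request from the patch (packing omitted). */
struct pg_update_state_request {
	uint32_t partition_id;
	uint32_t sram_state;
	uint32_t logic_state;
	uint32_t clock_state;
};

/* Hypothetical names for the raw values documented in the header. */
enum { PG_LOGIC_OFF = 0x1, PG_LOGIC_ON = 0x3 };
enum { PG_SRAM_ON = 0x1, PG_SRAM_OFF = 0x3 };
enum { PG_CLK_KEEP = 0x0, PG_CLK_DISABLE = 0x1, PG_CLK_ENABLE = 0x3 };

/*
 * Build a request whose sram_state and clock_state are consistent with
 * logic_state. If touch_clocks is false, the clock state is left unchanged.
 */
static struct pg_update_state_request
pg_fill_update_request(uint32_t partition_id, bool on, bool touch_clocks)
{
	struct pg_update_state_request req;

	req.partition_id = partition_id;
	req.logic_state = on ? PG_LOGIC_ON : PG_LOGIC_OFF;
	req.sram_state = on ? PG_SRAM_ON : PG_SRAM_OFF;
	if (!touch_clocks)
		req.clock_state = PG_CLK_KEEP;
	else
		req.clock_state = on ? PG_CLK_ENABLE : PG_CLK_DISABLE;

	return req;
}
```

Keeping the three fields in one place avoids the easy mistake of pairing a clock enable (0x3) with a power-off logic_state (0x1), which the header says is not applicable.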
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_THERMAL
> + * @brief interact with BPMP thermal framework
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: Any
> + * * Request Payload: TODO
> + * * Response Payload: TODO
> + *
> + * @addtogroup Thermal
> + *
> + * The BPMP firmware includes a thermal framework. Drivers within the
> + * bpmp firmware register with the framework to provide thermal
> + * zones. Each thermal zone corresponds to an entity whose temperature
> + * can be measured. The framework also has a notion of trip points. A
> + * trip point consists of a thermal zone id, a temperature, and a
> + * callback routine. The framework invokes the callback when the zone
> + * hits the indicated temperature. The BPMP firmware uses this thermal
> + * framework internally to implement various temperature-dependent
> + * functions.
> + *
> + * Software on the CPU can use #MRQ_THERMAL (with payload @ref
> + * mrq_thermal_host_to_bpmp_request) to interact with the BPMP thermal
> + * framework. It can query the number of supported zones,
> + * query zone temperatures, and set trip points.
> + *
> + * When a trip point set by the CPU gets crossed, BPMP firmware issues
> + * an IPC to the CPU having mrq_request::mrq = #MRQ_THERMAL and a
> + * payload of @ref mrq_thermal_bpmp_to_host_request.
> + * @{
> + */
> +enum mrq_thermal_host_to_bpmp_cmd {
> +	/**
> +	 * @brief Check whether the BPMP driver supports the specified
> +	 * request type.
> +	 *
> +	 * Host needs to supply request parameters.
> +	 *
> +	 * mrq_response::err is 0 if the specified request is
> +	 * supported and -#BPMP_ENODEV otherwise.
> +	 */
> +	CMD_THERMAL_QUERY_ABI = 0,
> +
> +	/**
> +	 * @brief Get the current temperature of the specified zone.
> +	 *
> +	 * Host needs to supply request parameters.
> +	 *
> +	 * mrq_response::err is
> +	 * *  0: Temperature query succeeded.
> +	 * *  -#BPMP_EINVAL: Invalid request parameters.
> +	 * *  -#BPMP_ENOENT: No driver registered for thermal zone.
> +	 * *  -#BPMP_EFAULT: Problem reading temperature measurement.
> +	 */
> +	CMD_THERMAL_GET_TEMP = 1,
> +
> +	/**
> +	 * @brief Enable or disable and set the lower and upper
> +	 *   thermal limits for a thermal trip point. Each zone has
> +	 *   one trip point.
> +	 *
> +	 * Host needs to supply request parameters. Once the
> +	 * temperature hits a trip point, the BPMP will send a message
> +	 * to the CPU having MRQ=MRQ_THERMAL and
> +	 * type=CMD_THERMAL_HOST_TRIP_REACHED
> +	 *
> +	 * mrq_response::err is
> +	 * *  0: Trip successfully set.
> +	 * *  -#BPMP_EINVAL: Invalid request parameters.
> +	 * *  -#BPMP_ENOENT: No driver registered for thermal zone.
> +	 * *  -#BPMP_EFAULT: Problem setting trip point.
> +	 */
> +	CMD_THERMAL_SET_TRIP = 2,
> +
> +	/**
> +	 * @brief Get the number of supported thermal zones.
> +	 *
> +	 * No request parameters required.
> +	 *
> +	 * mrq_response::err is always 0, indicating success.
> +	 */
> +	CMD_THERMAL_GET_NUM_ZONES = 3,
> +
> +	/** @brief: number of supported host-to-bpmp commands. May
> +	 * increase in future
> +	 */
> +	CMD_THERMAL_HOST_TO_BPMP_NUM
> +};
> +
> +enum mrq_thermal_bpmp_to_host_cmd {
> +	/**
> +	 * @brief Indication that the temperature for a zone has
> +	 *   exceeded the range indicated in the thermal trip point
> +	 *   for the zone.
> +	 *
> +	 * BPMP needs to supply request parameters. Host only needs to
> +	 * acknowledge.
> +	 */
> +	CMD_THERMAL_HOST_TRIP_REACHED = 100,
> +
> +	/** @brief: number of supported bpmp-to-host commands. May
> +	 * increase in future
> +	 */
> +	CMD_THERMAL_BPMP_TO_HOST_NUM
> +};
> +
> +/*
> + * Host->BPMP request data for request type CMD_THERMAL_QUERY_ABI
> + *
> + * type: Request type for which to check existence.
> + */
> +struct cmd_thermal_query_abi_request {
> +	uint32_t type;
> +} __ABI_PACKED;
> +
> +/*
> + * Host->BPMP request data for request type CMD_THERMAL_GET_TEMP
> + *
> + * zone: Number of thermal zone.
> + */
> +struct cmd_thermal_get_temp_request {
> +	uint32_t zone;
> +} __ABI_PACKED;
> +
> +/*
> + * BPMP->Host reply data for request CMD_THERMAL_GET_TEMP
> + *
> + * error: 0 if request succeeded.
> + *	-BPMP_EINVAL if request parameters were invalid.
> + *      -BPMP_ENOENT if no driver was registered for the specified thermal zone.
> + *      -BPMP_EFAULT for other thermal zone driver errors.
> + * temp: Current temperature in millicelsius.
> + */
> +struct cmd_thermal_get_temp_response {
> +	int32_t temp;
> +} __ABI_PACKED;
> +
> +/*
> + * Host->BPMP request data for request type CMD_THERMAL_SET_TRIP
> + *
> + * zone: Number of thermal zone.
> + * low: Temperature of lower trip point in millicelsius
> + * high: Temperature of upper trip point in millicelsius
> + * enabled: 1 to enable trip point, 0 to disable trip point
> + */
> +struct cmd_thermal_set_trip_request {
> +	uint32_t zone;
> +	int32_t low;
> +	int32_t high;
> +	uint32_t enabled;
> +} __ABI_PACKED;
> +
> +/*
> + * BPMP->Host request data for request type CMD_THERMAL_HOST_TRIP_REACHED
> + *
> + * zone: Number of thermal zone where trip point was reached.
> + */
> +struct cmd_thermal_host_trip_reached_request {
> +	uint32_t zone;
> +} __ABI_PACKED;
> +
> +/*
> + * BPMP->Host reply data for request type CMD_THERMAL_GET_NUM_ZONES
> + *
> + * num: Number of supported thermal zones. The thermal zones are indexed
> + *      starting from zero.
> + */
> +struct cmd_thermal_get_num_zones_response {
> +	uint32_t num;
> +} __ABI_PACKED;
> +
> +/*
> + * Host->BPMP request data.
> + *
> + * Reply type is union mrq_thermal_bpmp_to_host_response.
> + *
> + * type: Type of request. Values listed in enum mrq_thermal_host_to_bpmp_cmd.
> + * data: Request type specific parameters.
> + */
> +struct mrq_thermal_host_to_bpmp_request {
> +	uint32_t type;
> +	union {
> +		struct cmd_thermal_query_abi_request query_abi;
> +		struct cmd_thermal_get_temp_request get_temp;
> +		struct cmd_thermal_set_trip_request set_trip;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
> +
> +/*
> + * BPMP->Host request data.
> + *
> + * type: Type of request. Values listed in enum mrq_thermal_bpmp_to_host_cmd.
> + * data: Request type specific parameters.
> + */
> +struct mrq_thermal_bpmp_to_host_request {
> +	uint32_t type;
> +	union {
> +		struct cmd_thermal_host_trip_reached_request host_trip_reached;
> +	} __UNION_ANON;
> +} __ABI_PACKED;
> +
> +/*
> + * Data in reply to a Host->BPMP request.
> + */
> +union mrq_thermal_bpmp_to_host_response {
> +	struct cmd_thermal_get_temp_response get_temp;
> +	struct cmd_thermal_get_num_zones_response get_num_zones;
> +} __ABI_PACKED;
> +/** @} */
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_CPU_VHINT
> + * @brief Query CPU voltage hint data
> + *
> + * * Platforms: T186
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_cpu_vhint_request
> + * * Response Payload: N/A
> + *
> + * @addtogroup Vhint CPU Voltage hint
> + * @{
> + */
> +
> +/**
> + * @brief request with #MRQ_CPU_VHINT
> + *
> + * Used by #MRQ_CPU_VHINT call by CCPLEX to retrieve voltage hint data
> + * from BPMP to memory space pointed by #addr. CCPLEX is responsible
> + * to allocate sizeof(cpu_vhint_data) sized block of memory and
> + * appropriately map it for BPMP before sending the request.
> + */
> +struct mrq_cpu_vhint_request {
> +	/** @brief IOVA address for the #cpu_vhint_data */
> +	uint32_t addr; /* struct cpu_vhint_data * */
> +	/** @brief ID of the cluster whose data is requested */
> +	uint32_t cluster_id; /* enum cluster_id */
> +} __ABI_PACKED;
> +
> +/**
> + * @brief description of the CPU v/f relation
> + *
> + * Used by #MRQ_CPU_VHINT call to carry data pointed by #addr of
> + * struct mrq_cpu_vhint_request
> + */
> +struct cpu_vhint_data {
> +	uint32_t ref_clk_hz; /**< reference frequency in Hz */
> +	uint16_t pdiv; /**< post divider value */
> +	uint16_t mdiv; /**< input divider value */
> +	uint16_t ndiv_max; /**< fMAX expressed with max NDIV value */
> +	/** table of ndiv values as a function of vINDEX (voltage index) */
> +	uint16_t ndiv[80];
> +	/** minimum allowed NDIV value */
> +	uint16_t ndiv_min;
> +	/** minimum allowed voltage hint value (as in vINDEX) */
> +	uint16_t vfloor;
> +	/** maximum allowed voltage hint value (as in vINDEX) */
> +	uint16_t vceil;
> +	/** post-multiplier for vindex value */
> +	uint16_t vindex_mult;
> +	/** post-divider for vindex value */
> +	uint16_t vindex_div;
> +	/** reserved for future use */
> +	uint16_t reserved[328];
> +} __ABI_PACKED;
> +
> +/** @} */
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_ABI_RATCHET
> + * @brief ABI ratchet value query
> + *
> + * * Platforms: T186
> + * * Initiators: Any
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_abi_ratchet_request
> + * * Response Payload: @ref mrq_abi_ratchet_response
> + * @addtogroup ABI_info
> + * @{
> + */
> +
> +/**
> + * @brief an ABI compatibility mechanism
> + *
> + * BPMP_ABI_RATCHET_VALUE may increase for various reasons in a future
> + * revision of this header file.
> + * 1. That future revision deprecates some MRQ
> + * 2. That future revision introduces a breaking change to an existing
> + *    MRQ or
> + * 3. A bug is discovered in an existing implementation of the BPMP-FW
> + *    (or possibly one of its clients) which warrants deprecating that
> + *    implementation.
> + */
> +#define BPMP_ABI_RATCHET_VALUE 3
> +
> +/**
> + * @brief request with #MRQ_ABI_RATCHET.
> + *
> + * #ratchet should be #BPMP_ABI_RATCHET_VALUE from the ABI header
> + * against which the requester was compiled.
> + *
> + * If ratchet is less than BPMP's #BPMP_ABI_RATCHET_VALUE, BPMP may
> + * reply with mrq_response::err = -#BPMP_ERANGE to indicate that
> + * BPMP-FW cannot interoperate correctly with the requester. Requester
> + * should cease further communication with BPMP.
> + *
> + * Otherwise, err shall be 0.
> + */
> +struct mrq_abi_ratchet_request {
> +	/** @brief requester's ratchet value */
> +	uint16_t ratchet;
> +};
> +
> +/**
> + * @brief response to #MRQ_ABI_RATCHET
> + *
> + * #ratchet shall be #BPMP_ABI_RATCHET_VALUE from the ABI header
> + * against which BPMP firmware was compiled.
> + *
> + * If #ratchet is less than the requester's #BPMP_ABI_RATCHET_VALUE,
> + * the requester must either interoperate with BPMP according to an ABI
> + * header version with BPMP_ABI_RATCHET_VALUE = ratchet or cease
> + * communication with BPMP.
> + *
> + * If mrq_response::err is 0 and ratchet is greater than or equal to the
> + * requester's BPMP_ABI_RATCHET_VALUE, the requester should continue
> + * normal operation.
> + */
> +struct mrq_abi_ratchet_response {
> +	/** @brief BPMP's ratchet value */
> +	uint16_t ratchet;
> +};
> +/** @} */
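The requester-side rules above can be written out as a small decision function. This is a minimal sketch under stated assumptions: ratchet_check and enum ratchet_verdict are hypothetical names, the two defines are copied from the patch, and treating any other non-zero err as "stop" is a conservative choice that the header does not spell out.

```c
#include <stdint.h>

/* Values copied from the patch. */
#define BPMP_ERANGE		34
#define BPMP_ABI_RATCHET_VALUE	3

/* Hypothetical verdicts for the requester. */
enum ratchet_verdict {
	RATCHET_OK,			/* continue normal operation */
	RATCHET_DOWNGRADE_OR_STOP,	/* speak BPMP's older ABI, or cease */
	RATCHET_STOP			/* cease communication with BPMP */
};

/*
 * err is mrq_response::err, bpmp_ratchet is mrq_abi_ratchet_response::ratchet.
 */
static enum ratchet_verdict ratchet_check(int32_t err, uint16_t bpmp_ratchet)
{
	/* BPMP says it cannot interoperate with this requester. */
	if (err == -BPMP_ERANGE)
		return RATCHET_STOP;

	/* Assumption: any other failure is also treated as fatal. */
	if (err != 0)
		return RATCHET_STOP;

	/* BPMP was built against an older ABI header than the requester. */
	if (bpmp_ratchet < BPMP_ABI_RATCHET_VALUE)
		return RATCHET_DOWNGRADE_OR_STOP;

	return RATCHET_OK;
}
```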
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_EMC_DVFS_LATENCY
> + * @brief query frequency dependent EMC DVFS latency
> + *
> + * * Platforms: T186
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: N/A
> + * * Response Payload: @ref mrq_emc_dvfs_latency_response
> + * @addtogroup EMC
> + * @{
> + */
> +
> +/**
> + * @brief used by @ref mrq_emc_dvfs_latency_response
> + */
> +struct emc_dvfs_latency {
> +	/** @brief EMC frequency in kHz */
> +	uint32_t freq;
> +	/** @brief EMC DVFS latency in nanoseconds */
> +	uint32_t latency;
> +} __ABI_PACKED;
> +
> +#define EMC_DVFS_LATENCY_MAX_SIZE	14
> +/**
> + * @brief response to #MRQ_EMC_DVFS_LATENCY
> + */
> +struct mrq_emc_dvfs_latency_response {
> +	/** @brief the number of valid entries in #pairs */
> +	uint32_t num_pairs;
> +	/** @brief EMC <frequency, latency> information */
> +	struct emc_dvfs_latency pairs[EMC_DVFS_LATENCY_MAX_SIZE];
> +} __ABI_PACKED;
> +
> +/** @} */
> +
> +/**
> + * @ingroup MRQ_Codes
> + * @def MRQ_TRACE_ITER
> + * @brief manage the trace iterator
> + *
> + * * Platforms: All
> + * * Initiators: CCPLEX
> + * * Targets: BPMP
> + * * Request Payload: @ref mrq_trace_iter_request
> + * * Response Payload: N/A
> + * @addtogroup Trace
> + * @{
> + */
> +enum {
> +	/** @brief (re)start the tracing now. Ignore older events */
> +	TRACE_ITER_INIT = 0,
> +	/** @brief clobber all events in the trace buffer */
> +	TRACE_ITER_CLEAN = 1
> +};
> +
> +/**
> + * @brief request with #MRQ_TRACE_ITER
> + */
> +struct mrq_trace_iter_request {
> +	/** @brief TRACE_ITER_INIT or TRACE_ITER_CLEAN */
> +	uint32_t cmd;
> +} __ABI_PACKED;
> +
> +/** @} */
> +
> +/*
> + *  4. Enumerations
> + */
> +
> +/*
> + *   4.1 CPU enumerations
> + *
> + * See <mach-t186/system-t186.h>
> + *
> + *   4.2 CPU Cluster enumerations
> + *
> + * See <mach-t186/system-t186.h>
> + *
> + *   4.3 System low power state enumerations
> + *
> + * See <mach-t186/system-t186.h>
> + */
> +
> +/*
> + *   4.4 Clock enumerations
> + *
> + * For clock enumerations, see <mach-t186/clk-t186.h>
> + */
> +
> +/*
> + *   4.5 Reset enumerations
> + *
> + * For reset enumerations, see <mach-t186/reset-t186.h>
> + */
> +
> +/*
> + *   4.6 Thermal sensor enumerations
> + *
> + * For thermal sensor enumerations, see <mach-t186/thermal-t186.h>
> + */
> +
> +/**
> + * @defgroup Error_Codes
> + * Negative values for mrq_response::err generally indicate some
> + * error. The ABI defines the following error codes. Negating these
> + * defines is an exercise left to the user.
> + * @{
> + */
> +/** @brief No such file or directory */
> +#define BPMP_ENOENT	2
> +/** @brief No MRQ handler */
> +#define BPMP_ENOHANDLER	3
> +/** @brief I/O error */
> +#define BPMP_EIO	5
> +/** @brief Bad sub-MRQ command */
> +#define BPMP_EBADCMD	6
> +/** @brief Not enough memory */
> +#define BPMP_ENOMEM	12
> +/** @brief Permission denied */
> +#define BPMP_EACCES	13
> +/** @brief Bad address */
> +#define BPMP_EFAULT	14
> +/** @brief No such device */
> +#define BPMP_ENODEV	19
> +/** @brief Argument is a directory */
> +#define BPMP_EISDIR	21
> +/** @brief Invalid argument */
> +#define BPMP_EINVAL	22
> +/** @brief Timeout during operation */
> +#define BPMP_ETIMEDOUT  23
> +/** @brief Out of range */
> +#define BPMP_ERANGE	34
> +/** @} */
> +/** @} */
> +#endif
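Since the header leaves the negation "to the user", a client will probably want one place that translates mrq_response::err into host error numbers. The following is a sketch only: bpmp_err_to_errno is a hypothetical helper, and the choices for BPMP_ENOHANDLER, BPMP_EBADCMD and unknown codes are assumptions. The translation is needed because some values do not line up with Linux errno numbering (3, 6 and 23 are ESRCH, ENXIO and ENFILE on Linux), so passing err through unchanged would mislabel them.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from the Error_Codes group in the patch. */
#define BPMP_ENOENT	2
#define BPMP_ENOHANDLER	3
#define BPMP_EIO	5
#define BPMP_EBADCMD	6
#define BPMP_ENOMEM	12
#define BPMP_EACCES	13
#define BPMP_EFAULT	14
#define BPMP_ENODEV	19
#define BPMP_EISDIR	21
#define BPMP_EINVAL	22
#define BPMP_ETIMEDOUT	23
#define BPMP_ERANGE	34

/* Translate a (negative) mrq_response::err into a negative host errno. */
static int bpmp_err_to_errno(int32_t err)
{
	if (err >= 0)
		return err;	/* success or positive payload, passed through */

	switch (err) {
	case -BPMP_ENOENT:	return -ENOENT;
	case -BPMP_ENOHANDLER:	return -ENOSYS;	/* assumption */
	case -BPMP_EIO:		return -EIO;
	case -BPMP_EBADCMD:	return -EINVAL;	/* assumption */
	case -BPMP_ENOMEM:	return -ENOMEM;
	case -BPMP_EACCES:	return -EACCES;
	case -BPMP_EFAULT:	return -EFAULT;
	case -BPMP_ENODEV:	return -ENODEV;
	case -BPMP_EISDIR:	return -EISDIR;
	case -BPMP_EINVAL:	return -EINVAL;
	case -BPMP_ETIMEDOUT:	return -ETIMEDOUT;
	case -BPMP_ERANGE:	return -ERANGE;
	default:		return -EIO;	/* assumption: unknown code */
	}
}
```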
> -- 
> 2.9.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH V2 05/10] firmware: tegra: add BPMP support
  2016-07-07 10:18       ` Alexandre Courbot
  2016-07-07 19:55         ` Stephen Warren
@ 2016-07-08 20:19         ` Sivaram Nair
  1 sibling, 0 replies; 51+ messages in thread
From: Sivaram Nair @ 2016-07-08 20:19 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Joseph Lo, Stephen Warren, Thierry Reding, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On Thu, Jul 07, 2016 at 07:18:34PM +0900, Alexandre Courbot wrote:
> On Thu, Jul 7, 2016 at 5:17 PM, Joseph Lo <josephl@nvidia.com> wrote:
> > On 07/06/2016 07:39 PM, Alexandre Courbot wrote:
> >>
> >> Sorry, I will probably need to do several passes on this one to
> >> understand everything, but here is what I can say after a first look:
> >>
> >> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
> >>>
> >>> The Tegra BPMP (Boot and Power Management Processor) is designed to handle
> >>> the boot process and to offload power management tasks and some system
> >>> control services from the CPU, such as clock, DVFS, thermal/EDP, power
> >>> gating and system suspend/resume handling. The CPU and the drivers of
> >>> these modules can use the services provided by the BPMP firmware driver
> >>> to signal events for a specific PM action to BPMP and to receive status
> >>> updates from BPMP.
> >>>
> >>> Compared to the ARM SCPI, the service provided by BPMP is message-based
> >>> rather than method-based. The BPMP firmware driver provides a send/receive
> >>> service for users that care about response time. A user that needs events
> >>> or updates from the firmware can request the MRQ service as well. The user
> >>> is responsible for the message format, which we call the BPMP ABI.
> >>>
> >>> The BPMP ABI defines the message format for different modules or usages.
> >>> For example, the clock operation needs an MRQ service code called
> >>> MRQ_CLK with specific message format which includes different sub
> >>> commands for various clock operations. This is the message format that
> >>> BPMP can recognize.
> >>>
> >>> So the user needs two things to initiate IPC with BPMP: get the service
> >>> from the bpmp_ops structure and follow the message format that the BPMP
> >>> ABI defines.
> >>>
> >>> Based-on-the-work-by:
> >>> Sivaram Nair <sivaramn@nvidia.com>
> >>>
> >>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> >>> ---
> >>> Changes in V2:
> >>> - None
> >>> ---
> >>>   drivers/firmware/tegra/Kconfig  |   12 +
> >>>   drivers/firmware/tegra/Makefile |    1 +
> >>>   drivers/firmware/tegra/bpmp.c   |  713 +++++++++++++++++
> >>>   include/soc/tegra/bpmp.h        |   29 +
> >>>   include/soc/tegra/bpmp_abi.h    | 1601
> >>> +++++++++++++++++++++++++++++++++++++++
> >>>   5 files changed, 2356 insertions(+)
> >>>   create mode 100644 drivers/firmware/tegra/bpmp.c
> >>>   create mode 100644 include/soc/tegra/bpmp.h
> >>>   create mode 100644 include/soc/tegra/bpmp_abi.h
> >>>
> >>> diff --git a/drivers/firmware/tegra/Kconfig
> >>> b/drivers/firmware/tegra/Kconfig
> >>> index 1fa3e4e136a5..ff2730d5c468 100644
> >>> --- a/drivers/firmware/tegra/Kconfig
> >>> +++ b/drivers/firmware/tegra/Kconfig
> >>> @@ -10,4 +10,16 @@ config TEGRA_IVC
> >>>            keeps the content is synchronization between host CPU and
> >>> remote
> >>>            processors.
> >>>
> >>> +config TEGRA_BPMP
> >>> +       bool "Tegra BPMP driver"
> >>> +       depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
> >>> +       help
> >>> +         BPMP (Boot and Power Management Processor) is designed to
> >>> off-loading
> >>
> >>
> >> s/off-loading/off-load
> >>
> >>> +         the PM functions which include clock/DVFS/thermal/power from
> >>> the CPU.
> >>> +         It needs HSP as the HW synchronization and notification module
> >>> and
> >>> +         IVC module as the message communication protocol.
> >>> +
> >>> +         This driver manages the IPC interface between host CPU and the
> >>> +         firmware running on BPMP.
> >>> +
> >>>   endmenu
> >>> diff --git a/drivers/firmware/tegra/Makefile
> >>> b/drivers/firmware/tegra/Makefile
> >>> index 92e2153e8173..e34a2f79e1ad 100644
> >>> --- a/drivers/firmware/tegra/Makefile
> >>> +++ b/drivers/firmware/tegra/Makefile
> >>> @@ -1 +1,2 @@
> >>> +obj-$(CONFIG_TEGRA_BPMP)       += bpmp.o
> >>>   obj-$(CONFIG_TEGRA_IVC)                += ivc.o
> >>> diff --git a/drivers/firmware/tegra/bpmp.c
> >>> b/drivers/firmware/tegra/bpmp.c
> >>> new file mode 100644
> >>> index 000000000000..24fda626610e
> >>> --- /dev/null
> >>> +++ b/drivers/firmware/tegra/bpmp.c
> >>> @@ -0,0 +1,713 @@
> >>> +/*
> >>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> >>> + *
> >>> + * This program is free software; you can redistribute it and/or modify
> >>> it
> >>> + * under the terms and conditions of the GNU General Public License,
> >>> + * version 2, as published by the Free Software Foundation.
> >>> + *
> >>> + * This program is distributed in the hope it will be useful, but
> >>> WITHOUT
> >>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> >>> for
> >>> + * more details.
> >>> + */
> >>> +
> >>> +#include <linux/mailbox_client.h>
> >>> +#include <linux/of.h>
> >>> +#include <linux/of_address.h>
> >>> +#include <linux/of_device.h>
> >>> +#include <linux/platform_device.h>
> >>> +#include <linux/semaphore.h>
> >>> +
> >>> +#include <soc/tegra/bpmp.h>
> >>> +#include <soc/tegra/bpmp_abi.h>
> >>> +#include <soc/tegra/ivc.h>
> >>> +
> >>> +#define BPMP_MSG_SZ            128
> >>> +#define BPMP_MSG_DATA_SZ       120
> >>> +
> >>> +#define __MRQ_ATTRS            0xff000000
> >>> +#define __MRQ_INDEX(id)                ((id) & ~__MRQ_ATTRS)
> >>> +
> >>> +#define DO_ACK                 BIT(0)
> >>> +#define RING_DOORBELL          BIT(1)
> >>> +
> >>> +struct tegra_bpmp_soc_data {
> >>> +       u32 ch_index;           /* channel index */
> >>> +       u32 thread_ch_index;    /* thread channel index */
> >>> +       u32 cpu_rx_ch_index;    /* CPU Rx channel index */
> >>> +       u32 nr_ch;              /* number of total channels */
> >>> +       u32 nr_thread_ch;       /* number of thread channels */
> >>> +       u32 ch_timeout;         /* channel timeout */
> >>> +       u32 thread_ch_timeout;  /* thread channel timeout */
> >>> +};
> >>
> >>
> >> With just these comments it is not clear what everything in this
> >> structure does. Maybe a file-level comment explaining how BPMP
> >> basically works and what the different channels are allocated to would
> >> help understanding the code.
> >
> >
> > We have two kinds of TX channels (channel & thread channel above) for the
> > BPMP clients (clock, thermal, reset, power mgmt control, etc.) to use.
> >
> > "Channel" here means an atomic channel, used when the client needs the
> > response immediately, e.g. when setting a clock rate or re-parenting a
> > clock source. Each CPU has its own atomic channel. A client can acquire
> > one of them, and ch_index is the index of the first atomic channel in
> > the channel array.
> >
> > The response on a thread channel can be deferred: the client receives it
> > once BPMP has finished the service and signalled completion by IRQ.
> > Likewise, thread_ch_index is the index of the first thread channel that
> > clients can use.
> >
> > The CPU RX channel lets a client register for specific services (called
> > MRQs in bpmp_abi) in order to listen for updates from the BPMP
> > firmware.
> >
> > Because the number of these channels may differ between SoCs, this
> > structure serves as the bpmp_soc_data that holds the per-SoC
> > configuration.
> 
> Thanks, that clarifies things. This explanation deserves to be in the C
> file as well IMHO.
> 
> So IIUC the first 13 channels (6 bound to a specific CPU core and 7
> threaded, allocated dynamically) are all used to initiate a
> communication to the BPMP, while the cpu_rx channel is used as a sort
> of IRQ (hence the name MRQ). Is this correct? This would be valuable
> to state too. Maybe cpu_rx_ch_index can even be renamed to something
> like mrq_ch_index to stress that fact.
> 
> A few additional comments follow below as I did a second pass on the code.
> 
> >
> >
> >>
> >>> +
> >>> +struct channel_info {
> >>> +       u32 tch_free;
> >>> +       u32 tch_to_complete;
> >>> +       struct semaphore tch_sem;
> >>> +};
> >>> +
> >>> +struct mb_data {
> >>> +       s32 code;
> >>> +       s32 flags;
> >>> +       u8 data[BPMP_MSG_DATA_SZ];
> >>> +} __packed;
> >>> +
> >>> +struct channel_data {
> >>> +       struct mb_data *ib;
> >>> +       struct mb_data *ob;
> >>> +};
> >>> +
> >>> +struct mrq {
> >>> +       struct list_head list;
> >>> +       u32 mrq_code;
> >>> +       bpmp_mrq_handler handler;
> >>> +       void *data;
> >>> +};
> >>> +
> >>> +struct tegra_bpmp {
> >>> +       struct device *dev;
> >>> +       const struct tegra_bpmp_soc_data *soc_data;
> >>> +       void __iomem *tx_base;
> >>> +       void __iomem *rx_base;
> >>> +       struct mbox_client cl;
> >>> +       struct mbox_chan *chan;
> >>> +       struct ivc *ivc_channels;
> >>> +       struct channel_data *ch_area;
> >>> +       struct channel_info ch_info;
> >>> +       struct completion *ch_completion;
> >>> +       struct list_head mrq_list;
> >>> +       struct tegra_bpmp_ops *ops;
> >>> +       spinlock_t lock;
> >>> +       bool init_done;
> >>> +};
> >>> +
> >>> +static struct tegra_bpmp *bpmp;
> >>
> >>
> >> static? Ok, we only need one... for now. How about a private member in
> >> your ivc structure that allows you to retrieve the bpmp and going
> >> dynamic? This will require an extra argument in many functions, but is
> >> cleaner design IMHO.
> >
> >
> > IVC is designed as a generic IPC protocol among different modules (we have
> > not introduced other users of IVC yet), so it is probably better not to
> > churn other stuff into it.
> 
> Anything is fine if you can get rid of that static.
> 
> >
> >>
> >>> +
> >>> +static int bpmp_get_thread_ch(int idx)
> >>> +{
> >>> +       return bpmp->soc_data->thread_ch_index + idx;
> >>> +}
> >>> +
> >>> +static int bpmp_get_thread_ch_index(int ch)
> >>> +{
> >>> +       if (ch < bpmp->soc_data->thread_ch_index ||
> >>> +           ch >= bpmp->soc_data->cpu_rx_ch_index)
> >>
> >>
> >> Shouldn't that be ch >= bpmp->soc_data->cpu_rx_ch_index +
> >> bpmp->soc_data->nr_thread_ch?
> >>
> >> Either rx_ch_index indicates the upper bound of the threaded channels,
> >> and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or
> >> it can be anywhere else and you should use the correct member.
> >
> >
> > According to the table below, we have 14 channels.
> > atomic ch: 0 ~ 5, 6 chanls
> > thread ch: 6 ~ 17, 7 chanls
> > CPU RX ch: 13 ~ 14, 2 chanls

Or, did you mean

thread ch: 6 -> 12 
cpu rx ch: 13 (1 channel)

> >
> > +static const struct tegra_bpmp_soc_data soc_data_tegra186 = {
> > +       .ch_index = 0,
> > +       .thread_ch_index = 6,
> > +       .cpu_rx_ch_index = 13,
> > +       .nr_ch = 14,
> > +       .nr_thread_ch = 7,
> > +       .ch_timeout = 60 * USEC_PER_SEC,
> > +       .thread_ch_timeout = 600 * USEC_PER_SEC,
> > +};
> >
> > We use the index to check channel violation and nr_thread_ch for other usage
> > to avoid redundant channel number calculation elsewhere.
> 
> Sorry, my comment had a mistake. I meant that
> 
>           ch >= bpmp->soc_data->cpu_rx_ch_index
> 
> Should maybe be
> 
>           ch >= bpmp->soc_data->cpu_rx_ch_index + bpmp->soc_data->nr_thread_ch

Or did you mean
	ch >= bpmp->soc_data->thread_ch_index + bpmp->soc_data->nr_thread_ch ?

> 
> According to the description you gave of these fields, there is no
> guarantee that cpu_rx_ch_index will always be the first channel after
> the threaded channels.

I second Alex's concerns. It would be better not to depend on the
adjacency of the channels. Also I think this data should come from the
device tree.
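
To make the point concrete, here is a minimal sketch of a range check that uses only thread_ch_index and nr_thread_ch, so it stays correct even if the CPU RX channel does not directly follow the threaded channels. It is an illustration, not the driver code: struct soc_layout and thread_ch_to_index are hypothetical stand-ins for tegra_bpmp_soc_data and bpmp_get_thread_ch_index(), and the layout is passed in explicitly instead of being read from a static bpmp pointer.

```c
#include <stdint.h>

/* Hypothetical subset of tegra_bpmp_soc_data. */
struct soc_layout {
	uint32_t thread_ch_index;	/* first threaded channel */
	uint32_t nr_thread_ch;		/* number of threaded channels */
	uint32_t cpu_rx_ch_index;	/* not used by the check below */
};

/*
 * Return the zero-based threaded-channel index for ch, or -1 if ch is not
 * a threaded channel. The upper bound is thread_ch_index + nr_thread_ch,
 * not cpu_rx_ch_index.
 */
static int thread_ch_to_index(const struct soc_layout *soc, int ch)
{
	uint32_t first = soc->thread_ch_index;
	uint32_t end = first + soc->nr_thread_ch;

	if (ch < 0 || (uint32_t)ch < first || (uint32_t)ch >= end)
		return -1;

	return ch - (int)first;
}
```

With the Tegra186 values (thread_ch_index = 6, nr_thread_ch = 7) this accepts channels 6 to 12 and rejects 13, the same as the current code. With a layout that has a gap before the RX channel, only this version rejects the channels in the gap.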

> 
> >
> >
> >>
> >>> +               return -1;
> >>> +       return ch - bpmp->soc_data->thread_ch_index;
> >>> +}
> >>> +
> >>> +static int bpmp_get_ob_channel(void)
> >>> +{
> >>> +       return smp_processor_id() + bpmp->soc_data->ch_index;
> >>> +}
> >>> +
> >>> +static struct completion *bpmp_get_completion_obj(int ch)
> >>> +{
> >>> +       int i = bpmp_get_thread_ch_index(ch);
> >>> +
> >>> +       return i < 0 ? NULL : bpmp->ch_completion + i;
> >>> +}
> >>> +
> >>> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int
> >>> ib_sz)
> >>> +{
> >>> +       return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ &&
> >>> +              ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ &&
> >>> +              (!ob_sz || ob_data) && (!ib_sz || ib_data);
> >>> +}
> >>> +
> >>> +static bool bpmp_master_acked(int ch)
> >>> +{
> >>> +       struct ivc *ivc_chan;
> >>> +       void *frame;
> >>> +       bool ready;
> >>> +
> >>> +       ivc_chan = bpmp->ivc_channels + ch;
> >>> +       frame = tegra_ivc_read_get_next_frame(ivc_chan);
> >>> +       ready = !IS_ERR_OR_NULL(frame);
> >>> +       bpmp->ch_area[ch].ib = ready ? frame : NULL;
> >>> +
> >>> +       return ready;
> >>> +}
> >>> +
> >>> +static int bpmp_wait_ack(int ch)
> 
> Shouldn't this be bpmp_wait_master_ack ? Looking at the two next
> functions makes me think it should (or bpmp_wait_master_free should be
> renamed to bpmp_wait_free).
> 
> >>> +{
> >>> +       ktime_t t;
> >>> +
> >>> +       t = ns_to_ktime(local_clock());
> >>> +
> >>> +       do {
> >>> +               if (bpmp_master_acked(ch))
> >>> +                       return 0;
> >>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t) <
> >>> +                bpmp->soc_data->ch_timeout);
> >>> +
> >>> +       return -ETIMEDOUT;
> >>> +}
> >>> +
> >>> +static bool bpmp_master_free(int ch)
> >>> +{
> >>> +       struct ivc *ivc_chan;
> >>> +       void *frame;
> >>> +       bool ready;
> >>> +
> >>> +       ivc_chan = bpmp->ivc_channels + ch;
> >>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
> >>> +       ready = !IS_ERR_OR_NULL(frame);
> >>> +       bpmp->ch_area[ch].ob = ready ? frame : NULL;
> >>> +
> >>> +       return ready;
> >>> +}
> >>> +
> >>> +static int bpmp_wait_master_free(int ch)
> >>> +{
> >>> +       ktime_t t;
> >>> +
> >>> +       t = ns_to_ktime(local_clock());
> >>> +
> >>> +       do {
> >>> +               if (bpmp_master_free(ch))
> >>> +                       return 0;
> >>> +       } while (ktime_us_delta(ns_to_ktime(local_clock()), t)
> >>> +                < bpmp->soc_data->ch_timeout);
> >>> +
> >>> +       return -ETIMEDOUT;
> >>> +}
> >>> +
> >>> +static int __read_ch(int ch, void *data, int sz)
> >>> +{
> >>> +       struct ivc *ivc_chan;
> >>> +       struct mb_data *p;
> >>> +
> >>> +       ivc_chan = bpmp->ivc_channels + ch;
> >>> +       p = bpmp->ch_area[ch].ib;
> >>> +       if (data)
> >>> +               memcpy_fromio(data, p->data, sz);
> >>> +
> >>> +       return tegra_ivc_read_advance(ivc_chan);
> >>> +}
> >>> +
> >>> +static int bpmp_read_ch(int ch, void *data, int sz)
> 
> bpmp_read_threaded_ch maybe? we have bpmp_write_threaded_ch below, as
> this function is clearly dealing with threaded channels only.
> 
> >>> +{
> >>> +       unsigned long flags;
> >>> +       int i, ret;
> >>> +
> >>> +       i = bpmp_get_thread_ch_index(ch);
> >>
> >>
> >> i is not a very good name for this variable.
> >> Also note that bpmp_get_thread_ch_index() can return -1, this case is
> >> not handled.
> >
> > Okay, will fix this.
> >
> >
> >>
> >>> +
> >>> +       spin_lock_irqsave(&bpmp->lock, flags);
> >>> +       ret = __read_ch(ch, data, sz);
> >>> +       bpmp->ch_info.tch_free |= (1 << i);
> >>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> >>> +
> >>> +       up(&bpmp->ch_info.tch_sem);
> >>> +
> >>> +       return ret;
> >>> +}
> >>> +
> >>> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int
> >>> sz)
> >>> +{
> >>> +       struct ivc *ivc_chan;
> >>> +       struct mb_data *p;
> >>> +
> >>> +       ivc_chan = bpmp->ivc_channels + ch;
> >>> +       p = bpmp->ch_area[ch].ob;
> >>> +
> >>> +       p->code = mrq_code;
> >>> +       p->flags = flags;
> >>> +       if (data)
> >>> +               memcpy_toio(p->data, data, sz);
> >>> +
> >>> +       return tegra_ivc_write_advance(ivc_chan);
> >>> +}
> >>> +
> >>> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int
> >>> sz)
> >>> +{
> >>> +       unsigned long flags;
> >>> +       int ret, i;
> >>> +
> >>> +       ret = down_timeout(&bpmp->ch_info.tch_sem,
> >>> +
> >>> usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout));
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       spin_lock_irqsave(&bpmp->lock, flags);
> >>> +
> >>> +       i = __ffs(bpmp->ch_info.tch_free);
> >>> +       *ch = bpmp_get_thread_ch(i);
> >>> +       ret = bpmp_master_free(*ch) ? 0 : -EFAULT;
> >>> +       if (!ret) {
> 
> Style nit: I prefer to make the error case the exception, and normal
> runtime the norm. This is where a goto statement can actually make
> your code easier to follow. Have an err: label before the spin_unlock,
> and jump to it if ret != 0. Then you can have the next three lines at
> the lower indentation level, so they do not look as if they were an
> error path themselves.
> 
> Or if you really don't like the goto, check for ret != 0 and do the
> spin_unlock and return in that block.
> 
> >>> +               bpmp->ch_info.tch_free &= ~(1 << i);
> >>> +               __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data,
> >>> sz);
> >>> +               bpmp->ch_info.tch_to_complete |= (1 << *ch);
> >>> +       }
> >>> +
> >>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> >>> +
> >>> +       return ret;
> >>> +}
> >>> +
> >>> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data,
> >>> int sz)
> >>> +{
> >>> +       int ret;
> >>> +
> >>> +       ret = bpmp_wait_master_free(ch);
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       return __write_ch(ch, mrq_code, flags, data, sz);
> >>> +}
> >>> +
> >>> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int
> >>> ob_sz,
> >>> +                                   void *ib_data, int ib_sz)
> >>> +{
> >>> +       int ch, ret;
> >>> +
> >>> +       if (WARN_ON(!irqs_disabled()))
> >>> +               return -EPERM;
> >>> +
> >>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> >>> +               return -EINVAL;
> >>> +
> >>> +       if (!bpmp->init_done)
> >>> +               return -ENODEV;
> >>> +
> >>> +       ch = bpmp_get_ob_channel();
> >>> +       ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz);
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       ret = mbox_send_message(bpmp->chan, NULL);
> >>> +       if (ret < 0)
> >>> +               return ret;
> >>> +       mbox_client_txdone(bpmp->chan, 0);
> >>> +
> >>> +       ret = bpmp_wait_ack(ch);
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       return __read_ch(ch, ib_data, ib_sz);
> >>> +}
> >>> +
> >>> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz,
> >>> +                            void *ib_data, int ib_sz)
> >>> +{
> >>> +       struct completion *comp_obj;
> >>> +       unsigned long timeout;
> >>> +       int ch, ret;
> >>> +
> >>> +       if (WARN_ON(irqs_disabled()))
> >>> +               return -EPERM;
> >>> +
> >>> +       if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz))
> >>> +               return -EINVAL;
> >>> +
> >>> +       if (!bpmp->init_done)
> >>> +               return -ENODEV;
> >>> +
> >>> +       ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz);
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       ret = mbox_send_message(bpmp->chan, NULL);
> >>> +       if (ret < 0)
> >>> +               return ret;
> >>> +       mbox_client_txdone(bpmp->chan, 0);
> >>> +
> >>> +       comp_obj = bpmp_get_completion_obj(ch);
> >>> +       timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout);
> >>> +       if (!wait_for_completion_timeout(comp_obj, timeout))
> >>> +               return -ETIMEDOUT;
> >>> +
> >>> +       return bpmp_read_ch(ch, ib_data, ib_sz);
> >>> +}
> >>> +
> >>> +static struct mrq *bpmp_find_mrq(u32 mrq_code)
> >>> +{
> >>> +       struct mrq *mrq;
> >>> +
> >>> +       list_for_each_entry(mrq, &bpmp->mrq_list, list) {
> >>> +               if (mrq_code == mrq->mrq_code)
> >>> +                       return mrq;
> >>> +       }
> >>> +
> >>> +       return NULL;
> >>> +}
> >>> +
> >>> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz)
> >>> +{
> >>> +       int flags = bpmp->ch_area[ch].ib->flags;
> >>> +       struct ivc *ivc_chan;
> >>> +       struct mb_data *frame;
> >>> +       int ret;
> >>> +
> >>> +       if (WARN_ON(sz > BPMP_MSG_DATA_SZ))
> >>> +               return;
> >>> +
> >>> +       ivc_chan = bpmp->ivc_channels + ch;
> >>> +       ret = tegra_ivc_read_advance(ivc_chan);
> >>> +       WARN_ON(ret);
> >>> +
> >>> +       if (!(flags & DO_ACK))
> >>> +               return;
> >>> +
> >>> +       frame = tegra_ivc_write_get_next_frame(ivc_chan);
> >>> +       if (IS_ERR_OR_NULL(frame)) {
> >>> +               WARN_ON(1);
> >>> +               return;
> >>> +       }
> >>> +
> >>> +       frame->code = code;
> >>> +       if (data != NULL)
> >>> +               memcpy_toio(frame->data, data, sz);
> >>> +       ret = tegra_ivc_write_advance(ivc_chan);
> >>> +       WARN_ON(ret);
> >>> +
> >>> +       if (flags & RING_DOORBELL) {
> >>> +               ret = mbox_send_message(bpmp->chan, NULL);
> >>> +               if (ret < 0) {
> >>> +                       WARN_ON(1);
> >>> +                       return;
> >>> +               }
> >>> +               mbox_client_txdone(bpmp->chan, 0);
> >>> +       }
> >>> +}
> >>> +
> >>> +static void bpmp_mail_return(int ch, int ret_code, int val)
> >>> +{
> >>> +       bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val));
> >>> +}
> >>> +
> >>> +static void bpmp_handle_mrq(int mrq_code, int ch)
> >>> +{
> >>> +       struct mrq *mrq;
> >>> +
> >>> +       spin_lock(&bpmp->lock);
> >>> +
> >>> +       mrq = bpmp_find_mrq(mrq_code);
> >>> +       if (!mrq) {
> >>> +               spin_unlock(&bpmp->lock);
> >>> +               bpmp_mail_return(ch, -EINVAL, 0);
> >>> +               return;
> >>> +       }
> >>> +
> >>> +       mrq->handler(mrq_code, mrq->data, ch);
> >>> +
> >>> +       spin_unlock(&bpmp->lock);
> >>> +}
> >>> +
> >>> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void
> >>> *data)
> >>> +{
> >>> +       struct mrq *mrq;
> >>> +       unsigned long flags;
> >>> +
> >>> +       if (!handler)
> >>> +               return -EINVAL;
> >>> +
> >>> +       mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL);
> >>> +       if (!mrq)
> >>> +               return -ENOMEM;
> >>> +
> >>> +       spin_lock_irqsave(&bpmp->lock, flags);
> >>> +
> >>> +       mrq->mrq_code = __MRQ_INDEX(mrq_code);
> >>> +       mrq->handler = handler;
> >>> +       mrq->data = data;
> >>> +       list_add(&mrq->list, &bpmp->mrq_list);
> >>> +
> >>> +       spin_unlock_irqrestore(&bpmp->lock, flags);
> >>> +
> >>> +       return 0;
> >>> +}
> >>> +
> >>> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch)
> >>> +{
> >>> +       int challenge;
> >>> +       int reply;
> >>> +
> >>> +       challenge = *(int *)bpmp->ch_area[ch].ib->data;
> >>> +       reply = challenge << (smp_processor_id() + 1);
> >>> +       bpmp_mail_return(ch, 0, reply);
> >>> +}
> >>> +
> >>> +static int bpmp_mailman_init(void)
> >>> +{
> >>> +       return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL);
> >>> +}
> >>> +
> >>> +static int bpmp_ping(void)
> >>> +{
> >>> +       unsigned long flags;
> >>> +       ktime_t t;
> >>> +       int challenge = 1;
> >>
> >>
> >> Mmmm, shouldn't we use a mrq_ping_request instead of a parameter whose
> >> size may vary depending on the architecture? On a 64-bit big endian
> >> architecture, your messages would be corrupted.
> >
> >
> > Clarify one thing first. The mrq_ping_request and mrq_handle_ping above are
> > used for the ping from BPMP to CPU. Like I said above, it goes through the
> > CPU RX channel to get some information from BPMP firmware.
> 
> Ok, so mrq_handle_ping *should* use these data structures at the very least.
> 
> >
> > Here is the ping request from CPU to BPMP to make sure we can IPC with BPMP
> > during the probe stage.
> >
> > About the endian issue, I think we don't consider that in the message format
> > right now. So I think we only support little endian for the IPC messages
> > right now.
> 
> Any code in the kernel should function correctly regardless of
> endianness. And the problem is not so much with endianness as it is
> with the operand size - is the BPMP expecting a 64-bit challenge here?
> Considering that the equivalent MRQ uses a 32-bit integer, I'd bet
> not. So please use u32/u64 as needed as well as cpu_to_leXX (and
> leXX_to_cpu for the opposite) to make your code solid.

I second this.
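
As a rough userspace illustration of what a fixed-width, explicitly
little-endian payload means (in the driver this would be a __le32 field
filled with cpu_to_le32() and read with le32_to_cpu()), something like the
sketch below. struct ping_wire, put_le32() and get_le32() are hypothetical
names for this example only, and the 4-byte layout is an assumption based
on the 32-bit challenge mentioned above.

```c
#include <stdint.h>

/* Hypothetical wire layout: one 32-bit little-endian challenge word. */
struct ping_wire {
	uint8_t challenge[4];
};

/* Store v byte by byte, least significant byte first. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Reassemble the value; the result does not depend on host endianness. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}
```

With this the message is always 4 bytes in the same order, whether the CPU
is 32-bit or 64-bit, little or big endian.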

> 
> I understand that you don't want to use the MRQ structures because we
> are not handling a MRQ here, but if they are relevant I think this
> would still be safer than constructing messages from scalar data. That
> or we should introduce a proper structure for these messages, but here
> using the MRQ structure looks acceptable to me. Maybe they should not
> be named MRQ at all, but that's not for us to decide.

We should be using the mrq request structures from the ABI header.

> 
> >
> >>
> >>> +       int reply = 0;
> >>
> >>
> >> And this should probably be a mrq_ping_response. These remarks may
> >> also apply to bpmp_mrq_handle_ping().
> >
> > That is for receiving the ping request from BPMP.
> >
> >>
> >>> +       int ret;
> >>> +
> >>> +       t = ktime_get();
> >>> +       local_irq_save(flags);
> >>> +       ret = bpmp_send_receive_atomic(MRQ_PING, &challenge,
> >>> sizeof(challenge),
> >>> +                                      &reply, sizeof(reply));
> >>> +       local_irq_restore(flags);
> >>> +       t = ktime_sub(ktime_get(), t);
> >>> +
> >>> +       if (!ret)
> >>> +               dev_info(bpmp->dev,
> >>> +                        "ping ok: challenge: %d, reply: %d, time:
> >>> %lld\n",
> >>> +                        challenge, reply, ktime_to_us(t));
> >>> +
> >>> +       return ret;
> >>> +}
> >>> +
> >>> +static int bpmp_get_fwtag(void)
> >>> +{
> >>> +       unsigned long flags;
> >>> +       void *vaddr;
> >>> +       dma_addr_t paddr;
> >>> +       u32 addr;
> >>
> >>
> >> Here also we should use a mrq_query_tag_request.
> >
> > This is a one-way request from CPU to BPMP. So we don't request an MRQ for
> > that.

It is not clear to me what you mean by 'one-way request' here. We are
sending a request to get the tag and we are getting the tag back via the
same message's response. Anyway, we should be using the 'struct
mrq_query_tag_request' here to be consistent.

> >
> >>
> >>> +       int ret;
> >>> +
> >>> +       vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr,
> >>> +                                  GFP_KERNEL);
> >>
> >>
> >> dma_addr_t may be 64 bit here, and you may get an address higher than
> >> the 32 bits allowed by mrq_query_tag_request! I guess you want to add
> >> GFP_DMA32 as flag to your call to dma_alloc_coherent.
> >
> > BPMP should be able to handle addresses above 32 bits, but I am not sure
> > whether it is configured to support that.

Either way, since this specific MRQ takes in only a 32-bit address, I
think we should follow Alex's recommendation to use the GFP_DMA32 flag.
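
Independent of how the allocation is constrained, the plain 'addr = paddr'
assignment silently drops the upper bits if the address ever ends up above
4 GiB. A small range check before narrowing would at least turn that into an
error. The sketch below is plain userspace C with a hypothetical helper name
(addr_to_u32), and uint64_t stands in for a 64-bit dma_addr_t.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Narrow a 64-bit bus address to the 32-bit field carried in the message.
 * Returns 0 and stores the value in *out, or -ERANGE if it does not fit.
 */
static int addr_to_u32(uint64_t addr, uint32_t *out)
{
	if (addr > UINT32_MAX)
		return -ERANGE;

	*out = (uint32_t)addr;
	return 0;
}
```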

> 
> If the message you pass only contains a 32-bit address, then I'm
> afraid the protocol is the limiting factor here until it is updated.
> 
> Can't wait for the day when we will have to manage several versions of
> this protocol! >_<

If we need to pass a larger-than-32-bit address for this MRQ (or for
anything that takes in a 32-bit address now), the agreed upon process is
to define a new MRQ (i.e. one with a different integer id) that takes in
the new address type (and to deprecate the 32-bit MRQ version).

> 
> >
> > Will fix this.
> >
> >
> >>
> >>> +       if (!vaddr)
> >>> +               return -ENOMEM;
> >>> +       addr = paddr;
> >>> +
> >>> +       local_irq_save(flags);
> >>> +       ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr,
> >>> sizeof(addr),
> >>> +                                      NULL, 0);
> >>> +       local_irq_restore(flags);
> >>> +
> >>> +       if (!ret)
> >>> +               dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr);
> >>> +
> >>> +       dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr);
> >>> +
> >>> +       return ret;
> >>> +}
> >>> +
> >>> +static void bpmp_signal_thread(int ch)
> >>> +{
> >>> +       int flags = bpmp->ch_area[ch].ob->flags;
> >>> +       struct completion *comp_obj;
> >>> +
> >>> +       if (!(flags & RING_DOORBELL))
> >>> +               return;
> >>> +
> >>> +       comp_obj = bpmp_get_completion_obj(ch);
> >>> +       if (!comp_obj) {
> >>> +               WARN_ON(1);
> >>> +               return;
> >>> +       }
> >>> +
> >>> +       complete(comp_obj);
> >>> +}
> >>> +
> >>> +static void bpmp_handle_rx(struct mbox_client *cl, void *data)
> >>> +{
> >>> +       int i, rx_ch;
> >>> +
> >>> +       rx_ch = bpmp->soc_data->cpu_rx_ch_index;
> >>> +
> >>> +       if (bpmp_master_acked(rx_ch))
> >>> +               bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch);
> >>> +
> >>> +       spin_lock(&bpmp->lock);
> >>> +
> >>> +       for (i = 0; i < bpmp->soc_data->nr_thread_ch &&
> >>> +                       bpmp->ch_info.tch_to_complete; i++) {
> 
> for_each_set_bit(i, &bpmp->ch_info.tch_to_complete,
>                  bpmp->soc_data->nr_thread_ch) ?
> 
> This will reduce the number of iterations and you won't have to do the
> bpmp->ch_info.tch_to_complete & (1 << ch) check below.
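
For reference, the idea behind iterating only over set bits can be sketched
in userspace as below. This is not the kernel's for_each_set_bit(), which
works on unsigned long bitmaps via find_next_bit(); collect_set_bits() is a
hypothetical helper that just shows the loop visiting each set bit once and
never testing the cleared ones in the loop body.

```c
#include <stdint.h>

/*
 * Write the indices of the set bits of mask that are below nbits into out,
 * lowest first, and return how many were found.
 */
static unsigned int collect_set_bits(uint32_t mask, unsigned int nbits,
				     unsigned int out[32])
{
	unsigned int n = 0;

	/* Ignore bits at or above nbits. */
	if (nbits < 32)
		mask &= (UINT32_C(1) << nbits) - 1;

	while (mask) {
		unsigned int bit = 0;

		/* Find the lowest set bit; mask is non-zero here. */
		while (!((mask >> bit) & 1))
			bit++;

		out[n++] = bit;
		mask &= mask - 1; /* clear the lowest set bit */
	}

	return n;
}
```

The outer loop runs once per pending channel rather than once per possible
channel, which is what removes the need for the explicit mask test.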
> 
> >>> +               int ch = bpmp_get_thread_ch(i);
> >>> +
> >>> +               if ((bpmp->ch_info.tch_to_complete & (1 << ch)) &&
> >>> +                   bpmp_master_acked(ch)) {
> >>> +                       bpmp->ch_info.tch_to_complete &= ~(1 << ch);
> >>> +                       bpmp_signal_thread(ch);
> >>> +               }
> >>> +       }
> >>> +
> >>> +       spin_unlock(&bpmp->lock);
> >>> +}
> >>> +
> >>> +static void bpmp_ivc_notify(struct ivc *ivc)
> >>> +{
> >>> +       int ret;
> >>> +
> >>> +       ret = mbox_send_message(bpmp->chan, NULL);
> >>> +       if (ret < 0)
> >>> +               return;
> >>> +
> >>> +       mbox_send_message(bpmp->chan, NULL);
> >>
> >>
> >> Why the second call to mbox_send_message? It may be useful to add a
> >> comment explaining it.
> >
> > Ah!! It should be mbox_client_txdone(). Good catch.
> 
> That makes more sense. :) But did this code work even with that typo?

It should have --- mbox_client_txdone() essentially does nothing now.


* Re: [PATCH V2 04/10] firmware: tegra: add IVC library
  2016-07-05  9:04 ` [PATCH V2 04/10] firmware: tegra: add IVC library Joseph Lo
  2016-07-07 11:16   ` Alexandre Courbot
@ 2016-07-09 23:45   ` Paul Gortmaker
  1 sibling, 0 replies; 51+ messages in thread
From: Paul Gortmaker @ 2016-07-09 23:45 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, LKML,
	Catalin Marinas, Will Deacon

On Tue, Jul 5, 2016 at 5:04 AM, Joseph Lo <josephl@nvidia.com> wrote:
> The Inter-VM communication (IVC) is a communication protocol, which is
> designed for interprocessor communication (IPC) or the communication
> between the hypervisor and the virtual machine with a guest OS on it. So
> it can be translated as inter-virtual memory or inter-virtual machine
> communication. The message channels are maintained on the DRAM or SRAM
> and the data coherency should be considered. Otherwise the data could be
> corrupted or out of date when the remote client checks it.
>
> Inside the IVC, it maintains memory-based descriptors for the TX/RX
> channels and the coherency issue of the counter and payloads. So the
> clients can use it to send/receive messages to/from remote ones.
>
> We introduce it as a library for the firmware drivers, which can use it
> for IPC.
>
> Based-on-the-work-by:
> Peter Newman <pnewman@nvidia.com>
>
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - None
> ---
>  drivers/firmware/Kconfig        |   1 +
>  drivers/firmware/Makefile       |   1 +
>  drivers/firmware/tegra/Kconfig  |  13 +
>  drivers/firmware/tegra/Makefile |   1 +
>  drivers/firmware/tegra/ivc.c    | 659 ++++++++++++++++++++++++++++++++++++++++
>  include/soc/tegra/ivc.h         | 102 +++++++
>  6 files changed, 777 insertions(+)
>  create mode 100644 drivers/firmware/tegra/Kconfig
>  create mode 100644 drivers/firmware/tegra/Makefile
>  create mode 100644 drivers/firmware/tegra/ivc.c
>  create mode 100644 include/soc/tegra/ivc.h
>
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 5e618058defe..bbd64ae8c4c6 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -200,5 +200,6 @@ config HAVE_ARM_SMCCC
>  source "drivers/firmware/broadcom/Kconfig"
>  source "drivers/firmware/google/Kconfig"
>  source "drivers/firmware/efi/Kconfig"
> +source "drivers/firmware/tegra/Kconfig"
>
>  endmenu
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 474bada56fcd..9a4df8171cc4 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -24,3 +24,4 @@ obj-y                         += broadcom/
>  obj-$(CONFIG_GOOGLE_FIRMWARE)  += google/
>  obj-$(CONFIG_EFI)              += efi/
>  obj-$(CONFIG_UEFI_CPER)                += efi/
> +obj-y                          += tegra/
> diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
> new file mode 100644
> index 000000000000..1fa3e4e136a5
> --- /dev/null
> +++ b/drivers/firmware/tegra/Kconfig
> @@ -0,0 +1,13 @@
> +menu "Tegra firmware driver"
> +
> +config TEGRA_IVC
> +       bool "Tegra IVC protocol"

If this driver is not tristate, then why does the driver include the
module.h header below?

> +       depends on ARCH_TEGRA
> +       help
> +         IVC (Inter-VM Communication) protocol is part of the IPC
> +         (Inter Processor Communication) framework on Tegra. It maintains the
> +         data and the different communication channels in SysRAM or RAM and
> +         keeps the content in synchronization between host CPU and remote
> +         processors.
> +
> +endmenu
> diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
> new file mode 100644
> index 000000000000..92e2153e8173
> --- /dev/null
> +++ b/drivers/firmware/tegra/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_TEGRA_IVC)                += ivc.o
> diff --git a/drivers/firmware/tegra/ivc.c b/drivers/firmware/tegra/ivc.c
> new file mode 100644
> index 000000000000..3e736bb9915a
> --- /dev/null
> +++ b/drivers/firmware/tegra/ivc.c
> @@ -0,0 +1,659 @@
> +/*
> + * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/module.h>
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'm sure it "works" since module.h includes nearly everything else,
but that is less than ideal for exactly the same reason.

Thanks,
Paul.
--

> +
> +#include <soc/tegra/ivc.h>
> +
> +#define IVC_ALIGN 64
> +


* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-07 18:35     ` Stephen Warren
  2016-07-07 18:44       ` Sivaram Nair
@ 2016-07-11 14:14       ` Rob Herring
  2016-07-11 16:08         ` Stephen Warren
  1 sibling, 1 reply; 51+ messages in thread
From: Rob Herring @ 2016-07-11 14:14 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Sivaram Nair, Joseph Lo, Thierry Reding, Alexandre Courbot,
	linux-tegra, linux-arm-kernel, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Thu, Jul 07, 2016 at 12:35:02PM -0600, Stephen Warren wrote:
> On 07/07/2016 12:13 PM, Sivaram Nair wrote:
> >On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
> >>Add DT binding for the Hardware Synchronization Primitives (HSP). The
> >>HSP is designed for the processors to share resources and communicate
> >>together. It provides a set of hardware synchronization primitives for
> >>interprocessor communication. So the interprocessor communication (IPC)
> >>protocols can use hardware synchronization primitive, when operating
> >>between two processors not in an SMP relationship.
> 
> >>diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h
> 
> >>+#define HSP_MBOX_TYPE_DB 0x0
> >>+#define HSP_MBOX_TYPE_SM 0x1
> >>+#define HSP_MBOX_TYPE_SS 0x2
> >>+#define HSP_MBOX_TYPE_AS 0x3
> >>+
> >>+#define HSP_DB_MASTER_CCPLEX 17
> >>+#define HSP_DB_MASTER_BPMP 19
> >>+
> >>+#define HSP_MBOX_ID(type, ID) \
> >>+		(HSP_MBOX_TYPE_##type << 16 | ID)
> >
> >It will be nicer if you avoid the macro glue magic '##' for 'type'. I
> >would also suggest to use braces around 'type' and 'ID'.
> 
> This technique has been used in quite a few other places without issue,
> and has the benefit of simplifying the text wherever the macro is
> used. What issue do you foresee?

I'm not a fan of using the macros to begin with and less so anything 
more complex than a single constant value. I'd rather see 2 cells here 
with the first being the id and the 2nd being the type. 

An issue with token pasting is grepping for DB, SM, etc. in kernel tree 
is probably noisy. Not such a big deal here, but a major PIA when you 
have more complex sets of includes.
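
On the parenthesization point raised earlier in the thread, a small
standalone C sketch shows the kind of expression that goes wrong. The
HSP_MBOX_TYPE_* values are taken from the patch; HSP_MBOX_ID_POSTED is the
macro as posted under a different name, while HSP_MBOX_ID_PLAIN and the two
wrapper functions are hypothetical and exist only for this illustration.

```c
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1

/* As posted: token pasting on 'type', arguments not parenthesized. */
#define HSP_MBOX_ID_POSTED(type, ID) \
	(HSP_MBOX_TYPE_##type << 16 | ID)

/* Hypothetical variant: no token pasting, every argument parenthesized. */
#define HSP_MBOX_ID_PLAIN(type, id) \
	(((type) << 16) | (id))

/*
 * '|' binds tighter than '?:', so the posted macro expands to
 * ((0x1 << 16 | use_ccplex) ? 17 : 19) and the type bits are lost.
 */
static unsigned int posted_with_ternary(int use_ccplex)
{
	return HSP_MBOX_ID_POSTED(SM, use_ccplex ? 17 : 19);
}

/* Here the ternary is evaluated first and then OR-ed with the type. */
static unsigned int plain_with_ternary(int use_ccplex)
{
	return HSP_MBOX_ID_PLAIN(HSP_MBOX_TYPE_SM, use_ccplex ? 17 : 19);
}
```

A compiler may warn about the first wrapper, which is rather the point. With
plain constants as arguments, as in the binding example, both forms give the
same value, and the full HSP_MBOX_TYPE_SM spelling is also what makes the
second form easy to grep for.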

Rob


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
  2016-07-06 11:42   ` Alexandre Courbot
  2016-07-06 17:03   ` Stephen Warren
@ 2016-07-11 14:22   ` Rob Herring
  2016-07-11 16:05     ` Stephen Warren
  2016-07-13 19:41   ` Stephen Warren
  3 siblings, 1 reply; 51+ messages in thread
From: Rob Herring @ 2016-07-11 14:22 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On Tue, Jul 05, 2016 at 05:04:24PM +0800, Joseph Lo wrote:
> The BPMP is a specific processor in Tegra chip, which is designed for
> booting process handling and offloading the power management, clock
> management, and reset control tasks from the CPU. The binding document
> defines the resources that would be used by the BPMP firmware driver,
> which can create the interprocessor communication (IPC) between the CPU
> and BPMP.
> 
> Signed-off-by: Joseph Lo <josephl@nvidia.com>
> ---
> Changes in V2:
> - update the message that the BPMP is clock and reset control provider
> - add tegra186-clock.h and tegra186-reset.h header files
> - revise the description of the required properties
> ---
>  .../bindings/firmware/nvidia,tegra186-bpmp.txt     |  77 ++
>  include/dt-bindings/clock/tegra186-clock.h         | 940 +++++++++++++++++++++
>  include/dt-bindings/reset/tegra186-reset.h         | 217 +++++
>  3 files changed, 1234 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>  create mode 100644 include/dt-bindings/clock/tegra186-clock.h
>  create mode 100644 include/dt-bindings/reset/tegra186-reset.h
> 
> diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
> new file mode 100644
> index 000000000000..4d0b6eba56c5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
> @@ -0,0 +1,77 @@
> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
> +
> +The BPMP is a specific processor in Tegra chip, which is designed for
> +booting process handling and offloading the power management, clock
> +management, and reset control tasks from the CPU. The binding document
> +defines the resources that would be used by the BPMP firmware driver,
> +which can create the interprocessor communication (IPC) between the CPU
> +and BPMP.
> +
> +Required properties:
> +- name : Should be bpmp
> +- compatible
> +    Array of strings
> +    One of:
> +    - "nvidia,tegra186-bpmp"
> +- mboxes : The phandle of mailbox controller and the mailbox specifier.
> +- shmem : List of the phandle of the TX and RX shared memory area that
> +	  the IPC between CPU and BPMP is based on.

I think you can use memory-region here.

> +- #clock-cells : Should be 1.
> +- #reset-cells : Should be 1.
> +
> +This node is a mailbox consumer. See the following files for details of
> +the mailbox subsystem, and the specifiers implemented by the relevant
> +provider(s):
> +
> +- Documentation/devicetree/bindings/mailbox/mailbox.txt
> +- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
> +
> +This node is a clock and reset provider. See the following files for
> +general documentation of those features, and the specifiers implemented
> +by this node:
> +
> +- Documentation/devicetree/bindings/clock/clock-bindings.txt
> +- include/dt-bindings/clock/tegra186-clock.h
> +- Documentation/devicetree/bindings/reset/reset.txt
> +- include/dt-bindings/reset/tegra186-reset.h
> +
> +The shared memory bindings for BPMP
> +-----------------------------------
> +
> +The shared memory area for the IPC TX and RX between CPU and BPMP are
> +predefined and work on top of sysram, which is an SRAM inside the chip.
> +
> +See "Documentation/devicetree/bindings/sram/sram.txt" for the bindings.
> +
> +Example:
> +
> +hsp_top0: hsp@03c00000 {
> +	...
> +	#mbox-cells = <1>;
> +};
> +
> +sysram@30000000 {
> +	compatible = "nvidia,tegra186-sysram", "mmio-ram";
> +	reg = <0x0 0x30000000 0x0 0x50000>;
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +	ranges = <0 0x0 0x0 0x30000000 0x0 0x50000>;
> +
> +	cpu_bpmp_tx: bpmp_shmem@4e000 {
> +		compatible = "nvidia,tegra186-bpmp-shmem";
> +		reg = <0x0 0x4e000 0x0 0x1000>;
> +	};
> +
> +	cpu_bpmp_rx: bpmp_shmem@4f000 {
> +		compatible = "nvidia,tegra186-bpmp-shmem";
> +		reg = <0x0 0x4f000 0x0 0x1000>;
> +	};
> +};
> +
> +bpmp {
> +	compatible = "nvidia,tegra186-bpmp";
> +	mboxes = <&hsp_top0 HSP_MBOX_ID(DB, HSP_DB_MASTER_BPMP)>;
> +	shmem = <&cpu_bpmp_tx &cpu_bpmp_rx>;
> +	#clock-cells = <1>;
> +	#reset-cells = <1>;
> +};
> diff --git a/include/dt-bindings/clock/tegra186-clock.h b/include/dt-bindings/clock/tegra186-clock.h
> new file mode 100644
> index 000000000000..f73d32098f99
> --- /dev/null
> +++ b/include/dt-bindings/clock/tegra186-clock.h
> @@ -0,0 +1,940 @@
> +/** @file */
> +
> +#ifndef _MACH_T186_CLK_T186_H
> +#define _MACH_T186_CLK_T186_H
> +
> +/**
> + * @defgroup clock_ids Clock Identifiers

Aren't these doxygen markup? Does that work with docbook? If not, 
remove.


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-11 14:22   ` Rob Herring
@ 2016-07-11 16:05     ` Stephen Warren
  2016-07-18  7:44       ` Joseph Lo
  0 siblings, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-11 16:05 UTC (permalink / raw)
  To: Rob Herring, Joseph Lo
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, linux-kernel, Catalin Marinas, Will Deacon

On 07/11/2016 08:22 AM, Rob Herring wrote:
> On Tue, Jul 05, 2016 at 05:04:24PM +0800, Joseph Lo wrote:
>> The BPMP is a specific processor in Tegra chip, which is designed for
>> booting process handling and offloading the power management, clock
>> management, and reset control tasks from the CPU. The binding document
>> defines the resources that would be used by the BPMP firmware driver,
>> which can create the interprocessor communication (IPC) between the CPU
>> and BPMP.

>> diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt

>> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
>> +
>> +The BPMP is a specific processor in Tegra chip, which is designed for
>> +booting process handling and offloading the power management, clock
>> +management, and reset control tasks from the CPU. The binding document
>> +defines the resources that would be used by the BPMP firmware driver,
>> +which can create the interprocessor communication (IPC) between the CPU
>> +and BPMP.
>> +
>> +Required properties:
>> +- name : Should be bpmp
>> +- compatible
>> +    Array of strings
>> +    One of:
>> +    - "nvidia,tegra186-bpmp"
>> +- mboxes : The phandle of mailbox controller and the mailbox specifier.
>> +- shmem : List of the phandle of the TX and RX shared memory area that
>> +	  the IPC between CPU and BPMP is based on.
>
> I think you can use memory-region here.

Isn't memory-region intended for references into the /reserved-memory
node? If so, that isn't appropriate in this case since this property
typically points at on-chip SRAM that isn't included in the OS's view of 
"system RAM".

Or, should /reserved-memory be used even for (e.g. non-DRAM) memory 
regions that aren't represented by the /memory/reg property?

>> diff --git a/include/dt-bindings/clock/tegra186-clock.h b/include/dt-bindings/clock/tegra186-clock.h

>> +/** @file */
>> +
>> +#ifndef _MACH_T186_CLK_T186_H
>> +#define _MACH_T186_CLK_T186_H
>> +
>> +/**
>> + * @defgroup clock_ids Clock Identifiers
>
> Aren't these doxygen markup? Does that work with docbook? If not,
> remove.

These headers are part of the BPMP FW release. It's preferable not to 
edit them when incorporating them into the Linux kernel (or any other SW 
stack) to simplify integration of any updated versions of the header, by 
removing the need to edit the file when doing so. Given that, do you 
still object?


* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-11 14:14       ` Rob Herring
@ 2016-07-11 16:08         ` Stephen Warren
  2016-07-18 23:13           ` Stephen Warren
  0 siblings, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-11 16:08 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sivaram Nair, Joseph Lo, Thierry Reding, Alexandre Courbot,
	linux-tegra, linux-arm-kernel, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/11/2016 08:14 AM, Rob Herring wrote:
> On Thu, Jul 07, 2016 at 12:35:02PM -0600, Stephen Warren wrote:
>> On 07/07/2016 12:13 PM, Sivaram Nair wrote:
>>> On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
>>>> Add DT binding for the Hardware Synchronization Primitives (HSP). The
>>>> HSP is designed for the processors to share resources and communicate
>>>> together. It provides a set of hardware synchronization primitives for
>>>> interprocessor communication. So the interprocessor communication (IPC)
>>>> protocols can use hardware synchronization primitive, when operating
>>>> between two processors not in an SMP relationship.
>>
>>>> diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h b/include/dt-bindings/mailbox/tegra186-hsp.h
>>
>>>> +#define HSP_MBOX_TYPE_DB 0x0
>>>> +#define HSP_MBOX_TYPE_SM 0x1
>>>> +#define HSP_MBOX_TYPE_SS 0x2
>>>> +#define HSP_MBOX_TYPE_AS 0x3
>>>> +
>>>> +#define HSP_DB_MASTER_CCPLEX 17
>>>> +#define HSP_DB_MASTER_BPMP 19
>>>> +
>>>> +#define HSP_MBOX_ID(type, ID) \
>>>> +		(HSP_MBOX_TYPE_##type << 16 | ID)
>>>
>>> It will be nicer if you avoid the macro glue magic '##' for 'type'. I
>>> would also suggest to use braces around 'type' and 'ID'.
>>
>> This technique been used without issue in quite a few other places without
>> issue, and has the benefit of simplifying the text wherever the macro is
>> used. What issue do you foresee?
>
> I'm not a fan of using the macros to begin with and less so anything
> more complex than a single constant value. I'd rather see 2 cells here
> with the first being the id and the 2nd being the type.
>
> An issue with token pasting is grepping for DB, SM, etc. in kernel tree
> is probably noisy. Not such a big deal here, but a major PIA when you
> have more complex sets of includes.

Is that a NAK or simply a suggestion? Having a single cell makes DT 
parsing a bit simpler, since pretty much every SW stack provides a 
default "one-cell" of_xlate implementation, whereas >1 cell means custom 
code for of_xlate.
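
For illustration only, the single-cell scheme discussed above could be packed and unpacked roughly as follows. This is a minimal sketch, not the code from the patch: the TYPE and DB_MASTER values are copied from the proposed tegra186-hsp.h, the 0xf and 0xff masks mirror the ones of_hsp_mbox_xlate() uses in patch 02, and the names HSP_MBOX_TYPE_SHIFT, HSP_MBOX_TYPE_MASK, HSP_MBOX_ID_MASK, hsp_mbox_id_type() and hsp_mbox_id_index() are hypothetical.

```c
#include <stdint.h>

/* Values copied from the proposed include/dt-bindings/mailbox/tegra186-hsp.h. */
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1
#define HSP_MBOX_TYPE_SS 0x2
#define HSP_MBOX_TYPE_AS 0x3

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP 19

/* Hypothetical names for the shift and masks (not in the posted header). */
#define HSP_MBOX_TYPE_SHIFT 16
#define HSP_MBOX_TYPE_MASK 0xf
#define HSP_MBOX_ID_MASK 0xff

/* Pack type and ID into one cell; both halves are parenthesised. */
#define HSP_MBOX_ID(type, id) \
	((HSP_MBOX_TYPE_##type << HSP_MBOX_TYPE_SHIFT) | (id))

/* Extract the HSP sub-module type from a single specifier cell. */
static inline uint32_t hsp_mbox_id_type(uint32_t cell)
{
	return (cell >> HSP_MBOX_TYPE_SHIFT) & HSP_MBOX_TYPE_MASK;
}

/* Extract the per-type ID (e.g. the doorbell master ID) from the cell. */
static inline uint32_t hsp_mbox_id_index(uint32_t cell)
{
	return cell & HSP_MBOX_ID_MASK;
}
```

With this layout a DT author would still write HSP_MBOX_ID(DB, HSP_DB_MASTER_BPMP) in a one-cell specifier, and the controller decodes it with the two helpers. A two-cell specifier would carry the same two values without any packing, at the cost of a custom of_xlate.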


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
                     ` (2 preceding siblings ...)
  2016-07-11 14:22   ` Rob Herring
@ 2016-07-13 19:41   ` Stephen Warren
  2016-07-18  6:42     ` Joseph Lo
  3 siblings, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-13 19:41 UTC (permalink / raw)
  To: Joseph Lo, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon

On 07/05/2016 03:04 AM, Joseph Lo wrote:
> The BPMP is a specific processor in Tegra chip, which is designed for
> booting process handling and offloading the power management, clock
> management, and reset control tasks from the CPU. The binding document
> defines the resources that would be used by the BPMP firmware driver,
> which can create the interprocessor communication (IPC) between the CPU
> and BPMP.

> diff --git a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt

> +- Documentation/devicetree/bindings/mailbox/mailbox.txt
> +- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt

> +- Documentation/devicetree/bindings/clock/clock-bindings.txt
> +- include/dt-bindings/clock/tegra186-clock.h
> +- Documentation/devicetree/bindings/reset/reset.txt
> +- include/dt-bindings/reset/tegra186-reset.h

If you end up needing to repost this, it would be nice to make all those 
file references more generic. In particular, some SW projects store 
binding docs somewhere other than Documentation/devicetree/bindings/ 
(e.g. U-Boot uses doc/device-tree-bindings/), and it's possible that the 
header files aren't stored in include/ but somewhere else. To make these 
file references valid everywhere, I'd suggest using relative paths for 
the binding docs, and #include style paths for the headers, e.g.:

../clock/clock-bindings.txt

<dt-bindings/clock/tegra186-clock.h>


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-13 19:41   ` Stephen Warren
@ 2016-07-18  6:42     ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-18  6:42 UTC (permalink / raw)
  To: Stephen Warren, Thierry Reding, Alexandre Courbot
  Cc: linux-tegra, linux-arm-kernel, Rob Herring, Mark Rutland,
	Peter De Schrijver, Matthew Longnecker, devicetree, Jassi Brar,
	linux-kernel, Catalin Marinas, Will Deacon

On 07/14/2016 03:41 AM, Stephen Warren wrote:
> On 07/05/2016 03:04 AM, Joseph Lo wrote:
>> The BPMP is a specific processor in Tegra chip, which is designed for
>> booting process handling and offloading the power management, clock
>> management, and reset control tasks from the CPU. The binding document
>> defines the resources that would be used by the BPMP firmware driver,
>> which can create the interprocessor communication (IPC) between the CPU
>> and BPMP.
>
>> diff --git
>> a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>> b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>
>> +- Documentation/devicetree/bindings/mailbox/mailbox.txt
>> +- Documentation/devicetree/bindings/mailbox/nvidia,tegra186-hsp.txt
>
>> +- Documentation/devicetree/bindings/clock/clock-bindings.txt
>> +- include/dt-bindings/clock/tegra186-clock.h
>> +- Documentation/devicetree/bindings/reset/reset.txt
>> +- include/dt-bindings/reset/tegra186-reset.h
>
> If you end up needing to repost this, it would be nice to make all those
> file references more generic. In particular, some SW projects store
> binding docs somewhere other than Documentation/devicetree/bindings/
> (e.g. U-Boot uses doc/device-tree-bindings/), and it's possible that the
> header files aren't stored in include/ but somewhere else. To make these
> file references valid everywhere, I'd suggest using relative paths for
> the binding docs, and #include style paths for the headers, e.g.:
>
> ../clock/clock-bindings.txt
>
> <dt-bindings/clock/tegra186-clock.h>
>
OK. Will fix this.

Thanks,
-Joseph


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-11 16:05     ` Stephen Warren
@ 2016-07-18  7:44       ` Joseph Lo
  2016-07-18 16:18         ` Stephen Warren
  0 siblings, 1 reply; 51+ messages in thread
From: Joseph Lo @ 2016-07-18  7:44 UTC (permalink / raw)
  To: Stephen Warren, Rob Herring
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, linux-kernel, Catalin Marinas, Will Deacon

Hi Rob,

Thanks for your review.

On 07/12/2016 12:05 AM, Stephen Warren wrote:
> On 07/11/2016 08:22 AM, Rob Herring wrote:
>> On Tue, Jul 05, 2016 at 05:04:24PM +0800, Joseph Lo wrote:
>>> The BPMP is a specific processor in Tegra chip, which is designed for
>>> booting process handling and offloading the power management, clock
>>> management, and reset control tasks from the CPU. The binding document
>>> defines the resources that would be used by the BPMP firmware driver,
>>> which can create the interprocessor communication (IPC) between the CPU
>>> and BPMP.
>
>>> diff --git
>>> a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>>> b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>
>>> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
>>> +
>>> +The BPMP is a specific processor in Tegra chip, which is designed for
>>> +booting process handling and offloading the power management, clock
>>> +management, and reset control tasks from the CPU. The binding document
>>> +defines the resources that would be used by the BPMP firmware driver,
>>> +which can create the interprocessor communication (IPC) between the CPU
>>> +and BPMP.
>>> +
>>> +Required properties:
>>> +- name : Should be bpmp
>>> +- compatible
>>> +    Array of strings
>>> +    One of:
>>> +    - "nvidia,tegra186-bpmp"
>>> +- mboxes : The phandle of mailbox controller and the mailbox specifier.
>>> +- shmem : List of the phandle of the TX and RX shared memory area that
>>> +      the IPC between CPU and BPMP is based on.
>>
>> I think you can use memory-region here.
>
> Isn't memory-region intended for references into the /reserved-memory
> node. If so, that isn't appropriate in this case since this property
> typically points at on-chip SRAM that isn't included in the OS's view of
> "system RAM".
Agree with that.

>
> Or, should /reserved-memory be used even for (e.g. non-DRAM) memory
> regions that aren't represented by the /memory/reg property?
>

For shmem, I followed the same concept as the arm,scpi binding
(.../arm/arm,scpi.txt), which is currently used in mainline. Do you think
that is more appropriate here?


>>> diff --git a/include/dt-bindings/clock/tegra186-clock.h
>>> b/include/dt-bindings/clock/tegra186-clock.h
>
>>> +/** @file */
>>> +
>>> +#ifndef _MACH_T186_CLK_T186_H
>>> +#define _MACH_T186_CLK_T186_H
>>> +
>>> +/**
>>> + * @defgroup clock_ids Clock Identifiers
>>
>> Aren't these doxygen markup? Does that work with docbook? If not,
>> remove.
>
> These headers are part of the BPMP FW release. It's preferable not to
> edit them when incorporating them into the Linux kernel (or any other SW
> stack) to simplify integration of any updated versions of the header, by
> removing the need to edit the file when doing so. Given that, do you
> still object?

What do you think of this, Rob?

Thanks,
-Joseph


* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-07 21:10   ` Sivaram Nair
@ 2016-07-18  8:51     ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-18  8:51 UTC (permalink / raw)
  To: Sivaram Nair
  Cc: Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/08/2016 05:10 AM, Sivaram Nair wrote:
> On Tue, Jul 05, 2016 at 05:04:23PM +0800, Joseph Lo wrote:
>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>> interprocessor communication (IPC) for remote processors currently. The
>> HSP HW modules support some different features for that, which are
>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>> doorbells. And there are multiple HSP HW instances on the chip. So the
>> driver is extendable to support more features for different IPC
>> requirement.
>>
>> The driver of remote processor can use it as a mailbox client and deal
>> with the IPC protocol to synchronize the data communications.
>>
>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>> ---
>> Changes in V2:
>> - Update the driver to support the binding changes in V2
>> - it's extendable to support multiple HSP sub-modules on the same HSP HW block
>>    now.
>> ---
>>   drivers/mailbox/Kconfig     |   9 +
>>   drivers/mailbox/Makefile    |   2 +
>>   drivers/mailbox/tegra-hsp.c | 418 ++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 429 insertions(+)
>>   create mode 100644 drivers/mailbox/tegra-hsp.c
>>
>> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
>> index 5305923752d2..fe584cb54720 100644
>> --- a/drivers/mailbox/Kconfig
>> +++ b/drivers/mailbox/Kconfig
>> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>>   	  Test client to help with testing new Controller driver
>>   	  implementations.
>>
>> +config TEGRA_HSP_MBOX
>> +	bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
>> +	depends on ARCH_TEGRA_186_SOC
>> +	help
>> +	  The Tegra HSP driver is used for the interprocessor communication
>> +	  between different remote processors and host processors on Tegra186
>> +	  and later SoCs. Say Y here if you want to have this support.
>> +	  If unsure say N.
>> +
>>   config XGENE_SLIMPRO_MBOX
>>   	tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>>   	depends on ARCH_XGENE
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index 0be3e742bb7d..26d8f91c7fea 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>>   obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>>
>>   obj-$(CONFIG_HI6220_MBOX)	+= hi6220-mailbox.o
>> +
>> +obj-${CONFIG_TEGRA_HSP_MBOX}	+= tegra-hsp.o
>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>> new file mode 100644
>> index 000000000000..93c3ef58f29f
>> --- /dev/null
>> +++ b/drivers/mailbox/tegra-hsp.c
>> @@ -0,0 +1,418 @@
>> +/*
>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <dt-bindings/mailbox/tegra186-hsp.h>
>> +
>> +#define HSP_INT_DIMENSIONING	0x380
>> +#define HSP_nSM_OFFSET		0
>> +#define HSP_nSS_OFFSET		4
>> +#define HSP_nAS_OFFSET		8
>> +#define HSP_nDB_OFFSET		12
>> +#define HSP_nSI_OFFSET		16
>> +#define HSP_nINT_MASK		0xf
>> +
>> +#define HSP_DB_REG_TRIGGER	0x0
>> +#define HSP_DB_REG_ENABLE	0x4
>> +#define HSP_DB_REG_RAW		0x8
>> +#define HSP_DB_REG_PENDING	0xc
>> +
>> +#define HSP_DB_CCPLEX		1
>> +#define HSP_DB_BPMP		3
>> +
>> +#define MAX_NUM_HSP_CHAN 32
>
> Is this an arbitrarily chosen number?

Ah, this should be MAX_NUM_HSP_DB_CHAN now.

But the mbox driver still needs a maximum channel count. I will check how
to handle that properly now that the same driver supports multiple HSP modules.

Maybe there could be 4 channels, for SM, AS, SS and DB, with sub-channels
for the different functions under each of them. That may also fix the
double-loop issue in the hsp_db_irq function.
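
One possible shape for that idea, as a rough userspace sketch rather than driver code: keep one table per HSP type and index it directly by the sub-channel ID, so that handling a pending bit is a single array access instead of a scan over every channel. All names here (hsp_chan_table, hsp_subchan, hsp_claim, hsp_lookup, hsp_dispatch, HSP_NUM_TYPES, HSP_MAX_SUBCHAN) are hypothetical, and the locking and the call into the mailbox core are left out.

```c
#include <errno.h>
#include <stddef.h>

#define HSP_NUM_TYPES   4	/* DB, SM, SS, AS */
#define HSP_MAX_SUBCHAN 32	/* assumed upper bound on IDs per type */

struct hsp_subchan {
	int in_use;
	void *client;		/* stand-in for the per-channel private data */
};

struct hsp_chan_table {
	struct hsp_subchan sub[HSP_NUM_TYPES][HSP_MAX_SUBCHAN];
};

/* Claim the slot for (type, id); fails if it is out of range or taken. */
static int hsp_claim(struct hsp_chan_table *t, unsigned int type,
		     unsigned int id, void *client)
{
	struct hsp_subchan *s;

	if (type >= HSP_NUM_TYPES || id >= HSP_MAX_SUBCHAN)
		return -EINVAL;

	s = &t->sub[type][id];
	if (s->in_use)
		return -EBUSY;

	s->in_use = 1;
	s->client = client;
	return 0;
}

/* O(1) lookup: returns the claimed slot, or NULL if nothing is bound. */
static struct hsp_subchan *hsp_lookup(struct hsp_chan_table *t,
				      unsigned int type, unsigned int id)
{
	if (type >= HSP_NUM_TYPES || id >= HSP_MAX_SUBCHAN)
		return NULL;

	return t->sub[type][id].in_use ? &t->sub[type][id] : NULL;
}

/* Walk the pending bits once; returns how many had a bound channel. */
static unsigned int hsp_dispatch(struct hsp_chan_table *t, unsigned int type,
				 unsigned long pending)
{
	unsigned int id, handled = 0;

	for (id = 0; id < HSP_MAX_SUBCHAN; id++) {
		if (!(pending & (1UL << id)))
			continue;
		if (hsp_lookup(t, type, id))
			handled++;	/* a driver would notify the client here */
	}

	return handled;
}
```

Because each type has its own table, the same ID can be used by two different HSP sub-modules without conflict, which was the reason for dropping the master_id-as-channel-index scheme after V1.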

>
>> +#define MAX_NUM_HSP_DB 7
>> +
>> +#define hsp_db_offset(i, d) \
>> +	(d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) + \
>> +	(i) * 0x100)
>> +
>> +struct tegra_hsp_db_chan {
>> +	int master_id;
>> +	int db_id;
>
> These should be unsigned?
Yes, will fix them.

>
>> +};
>> +
>> +struct tegra_hsp_mbox_chan {
>> +	int type;
>
> This too...
>
>> +	union {
>> +		struct tegra_hsp_db_chan db_chan;
>> +	};
>> +};
>
> Why do we need to use a union?

Because we only support DB right now, there is only one member in the 
union. We can add something like sm_chan here when we need to support 
that later.
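
As a sketch of how the union might grow (illustrative only: sm_chan, its sm_id field, the enum names and hsp_chan_hw_id() are hypothetical and not part of this series), the type field selects which member is valid, and an unknown type is reported as an error rather than silently ignored:

```c
#include <errno.h>

/* Hypothetical type tags; the driver itself uses HSP_MBOX_TYPE_*. */
enum hsp_chan_type {
	HSP_CHAN_TYPE_DB,
	HSP_CHAN_TYPE_SM,
};

struct hsp_db_chan {
	unsigned int master_id;
	unsigned int db_id;
};

/* Hypothetical shared-mailbox channel data, for illustration only. */
struct hsp_sm_chan {
	unsigned int sm_id;
};

struct hsp_mbox_chan {
	unsigned int type;	/* selects the valid union member */
	union {
		struct hsp_db_chan db_chan;
		struct hsp_sm_chan sm_chan;
	};
};

/* Return the hardware ID for the channel, or -EINVAL for an unknown type. */
static int hsp_chan_hw_id(const struct hsp_mbox_chan *chan)
{
	switch (chan->type) {
	case HSP_CHAN_TYPE_DB:
		return (int)chan->db_chan.db_id;
	case HSP_CHAN_TYPE_SM:
		return (int)chan->sm_chan.sm_id;
	default:
		return -EINVAL;
	}
}
```

The anonymous union keeps the per-channel allocation at the size of the largest member while letting callers write chan->db_chan directly.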

>
>> +
>> +struct tegra_hsp_mbox {
>> +	struct mbox_controller *mbox;
>> +	void __iomem *base;
>> +	void __iomem *db_base[MAX_NUM_HSP_DB];
>> +	int db_irq;
>> +	int nr_sm;
>> +	int nr_as;
>> +	int nr_ss;
>> +	int nr_db;
>> +	int nr_si;
>> +	spinlock_t lock;
>
> You might need to change this to a mutex - see below.

OK, will fix this.

>
>> +};
>> +
>> +static inline u32 hsp_readl(void __iomem *base, int reg)
>> +{
>> +	return readl(base + reg);
>> +}
>> +
>> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
>> +{
>> +	writel(val, base + reg);
>> +	readl(base + reg);
>> +}
>> +
>> +static int hsp_db_can_ring(void __iomem *db_base)
>> +{
>> +	u32 reg;
>> +
>> +	reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
>> +
>> +	return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
>> +}
>> +
>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>> +{
>> +	struct tegra_hsp_mbox *hsp_mbox = p;
>> +	ulong val;
>
> This should be u32 and...
>
>> +	int master_id;
>> +
>> +	val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>> +			       HSP_DB_REG_PENDING);
>
> the cast should/can be removed (hsp_readl and hsp_writel both use u32)?
Yes.

>
>> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING, val);
>> +
>> +	spin_lock(&hsp_mbox->lock);
>> +	for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>> +		struct mbox_chan *chan;
>> +		struct tegra_hsp_mbox_chan *mchan;
>> +		int i;
>> +
>> +		for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>> +			chan = &hsp_mbox->mbox->chans[i];
>> +
>> +			if (!chan->con_priv)
>> +				continue;
>> +
>> +			mchan = chan->con_priv;
>> +			if (mchan->type == HSP_MBOX_TYPE_DB &&
>> +			    mchan->db_chan.master_id == master_id)
>> +				break;
>> +			chan = NULL;
>> +		}
>
> Like Alexandre, I didn't like this use of inner loop as well. But I will
> add my comment to the other thread.
>
>> +
>> +		if (chan)
>> +			mbox_chan_received_data(chan, NULL);
>> +	}
>> +	spin_unlock(&hsp_mbox->lock);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static int hsp_db_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +
>> +	hsp_writel(hsp_mbox->db_base[db_chan->db_id], HSP_DB_REG_TRIGGER, 1);
>> +
>> +	return 0;
>> +}
>> +
>> +static int hsp_db_startup(struct mbox_chan *chan)
>> +{
>> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +	u32 val;
>> +	unsigned long flag;
>> +
>> +	if (db_chan->master_id >= MAX_NUM_HSP_CHAN) {
>
> Is this a valid check? IIUC, MAX_NUM_HSP_CHAN is independent of the
> number of masters.

This should be MAX_NUM_HSP_DB_CHAN.

>
>> +		dev_err(chan->mbox->dev, "invalid HSP chan: master ID: %d\n",
>> +			db_chan->master_id);
>> +		return -EINVAL;
>> +	}
>> +
>> +	spin_lock_irqsave(&hsp_mbox->lock, flag);
>> +	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
>> +	val |= BIT(db_chan->master_id);
>> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
>> +	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>> +
>> +	if (!hsp_db_can_ring(hsp_mbox->db_base[db_chan->db_id]))
>> +		return -ENODEV;
>> +
>> +	return 0;
>> +}
>> +
>> +static void hsp_db_shutdown(struct mbox_chan *chan)
>> +{
>> +	struct tegra_hsp_mbox_chan *mchan = chan->con_priv;
>> +	struct tegra_hsp_db_chan *db_chan = &mchan->db_chan;
>> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(chan->mbox->dev);
>> +	u32 val;
>> +	unsigned long flag;
>> +
>> +	spin_lock_irqsave(&hsp_mbox->lock, flag);
>> +	val = hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE);
>> +	val &= ~BIT(db_chan->master_id);
>> +	hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_ENABLE, val);
>> +	spin_unlock_irqrestore(&hsp_mbox->lock, flag);
>> +}
>> +
>> +static bool hsp_db_last_tx_done(struct mbox_chan *chan)
>> +{
>> +	return true;
>> +}
>> +
>> +static int tegra_hsp_db_init(struct tegra_hsp_mbox *hsp_mbox,
>> +			     struct mbox_chan *mchan, int master_id)
>> +{
>> +	struct platform_device *pdev = to_platform_device(hsp_mbox->mbox->dev);
>> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan;
>> +	int ret;
>> +
>> +	if (!hsp_mbox->db_irq) {
>> +		int i;
>> +
>> +		hsp_mbox->db_irq = platform_get_irq_byname(pdev, "doorbell");
>> +		ret = devm_request_irq(&pdev->dev, hsp_mbox->db_irq,
>> +				       hsp_db_irq, IRQF_NO_SUSPEND,
>> +				       dev_name(&pdev->dev), hsp_mbox);
>> +		if (ret)
>> +			return ret;
>> +
>> +		for (i = 0; i < MAX_NUM_HSP_DB; i++)
>> +			hsp_mbox->db_base[i] = hsp_db_offset(i, hsp_mbox);
>> +	}
>> +
>> +	hsp_mbox_chan = devm_kzalloc(&pdev->dev, sizeof(*hsp_mbox_chan),
>> +				     GFP_KERNEL);
>> +	if (!hsp_mbox_chan)
>> +		return -ENOMEM;
>> +
>> +	hsp_mbox_chan->type = HSP_MBOX_TYPE_DB;
>> +	hsp_mbox_chan->db_chan.master_id = master_id;
>> +	switch (master_id) {
>> +	case HSP_DB_MASTER_BPMP:
>> +		hsp_mbox_chan->db_chan.db_id = HSP_DB_BPMP;
>> +		break;
>> +	default:
>> +		hsp_mbox_chan->db_chan.db_id = MAX_NUM_HSP_DB;
>> +		break;
>> +	}
>> +
>> +	mchan->con_priv = hsp_mbox_chan;
>> +
>> +	return 0;
>> +}
>> +
>> +static int hsp_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +	int ret = 0;
>> +
>> +	switch (hsp_mbox_chan->type) {
>> +	case HSP_MBOX_TYPE_DB:
>> +		ret = hsp_db_send_data(chan, data);
>> +		break;
>> +	default:
>
> Should you return an error here?
>
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static int hsp_startup(struct mbox_chan *chan)
>> +{
>> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +	int ret = 0;
>> +
>> +	switch (hsp_mbox_chan->type) {
>> +	case HSP_MBOX_TYPE_DB:
>> +		ret = hsp_db_startup(chan);
>> +		break;
>> +	default:
>
> And here too...?
OK, will fix both.

>
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void hsp_shutdown(struct mbox_chan *chan)
>> +{
>> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +
>> +	switch (hsp_mbox_chan->type) {
>> +	case HSP_MBOX_TYPE_DB:
>> +		hsp_db_shutdown(chan);
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	chan->con_priv = NULL;
>> +}
>> +
>> +static bool hsp_last_tx_done(struct mbox_chan *chan)
>> +{
>> +	struct tegra_hsp_mbox_chan *hsp_mbox_chan = chan->con_priv;
>> +	bool ret = true;
>> +
>> +	switch (hsp_mbox_chan->type) {
>> +	case HSP_MBOX_TYPE_DB:
>> +		ret = hsp_db_last_tx_done(chan);
>
> hsp_db_last_tx_done() return true - so we might as well make this parent
> function to return true and remove hsp_db_last_tx_done()?
Yes, true.

>
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct mbox_chan_ops tegra_hsp_ops = {
>> +	.send_data = hsp_send_data,
>> +	.startup = hsp_startup,
>> +	.shutdown = hsp_shutdown,
>> +	.last_tx_done = hsp_last_tx_done,
>> +};
>> +
>> +static const struct of_device_id tegra_hsp_match[] = {
>> +	{ .compatible = "nvidia,tegra186-hsp" },
>> +	{ }
>> +};
>> +
>> +static struct mbox_chan *
>> +of_hsp_mbox_xlate(struct mbox_controller *mbox,
>> +		  const struct of_phandle_args *sp)
>> +{
>> +	int mbox_id = sp->args[0];
>> +	int hsp_type = (mbox_id >> 16) & 0xf;
>
> Wouldn't it be nicer if the shift and mask constants are made defines in
> the DT bindings header (tegra186-hsp.h)?
Should be OK.

>
>> +	int master_id = mbox_id & 0xff;
>> +	struct tegra_hsp_mbox *hsp_mbox = dev_get_drvdata(mbox->dev);
>> +	struct mbox_chan *free_chan;
>> +	int i, ret = 0;
>> +
>> +	spin_lock(&hsp_mbox->lock);
>
> If you must use spin locks, you will have to use the irqsave/restore
> veriants in this function (called from thread context).
>
>> +
>> +	for (i = 0; i < mbox->num_chans; i++) {
>> +		free_chan = &mbox->chans[i];
>> +		if (!free_chan->con_priv)
>> +			break;
>> +		free_chan = NULL;
>> +	}
>> +
>> +	if (!free_chan) {
>> +		spin_unlock(&hsp_mbox->lock);
>> +		return ERR_PTR(-EFAULT);
>> +	}
>
> IMO, it will be cleaner & simpler if you move the above code (doing the
> lookup) into a separate function that returns free_chan - and you can
> reuse that in hsp_db_irq()
?
I think it's a different usage.
>
>> +
>> +	switch (hsp_type) {
>> +	case HSP_MBOX_TYPE_DB:
>> +		ret = tegra_hsp_db_init(hsp_mbox, free_chan, master_id);
>
> tegra_hsp_db_init() uses devm_kzalloc and you are doing this holding a
> spinlock.
>
>> +		break;
>> +	default:
>
> Not returning error here will also cause resource leak (free_chan).
>
>> +		break;
>> +	}

Thanks,
-Joseph


* Re: [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver
  2016-07-07 21:33           ` Sivaram Nair
@ 2016-07-18  8:58             ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-18  8:58 UTC (permalink / raw)
  To: Sivaram Nair
  Cc: Alexandre Courbot, Stephen Warren, Thierry Reding, linux-tegra,
	linux-arm-kernel, Rob Herring, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar,
	Linux Kernel Mailing List, Catalin Marinas, Will Deacon

On 07/08/2016 05:33 AM, Sivaram Nair wrote:
> On Thu, Jul 07, 2016 at 02:37:27PM +0800, Joseph Lo wrote:
>> On 07/06/2016 08:23 PM, Alexandre Courbot wrote:
>>> On Wed, Jul 6, 2016 at 6:06 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>> On 07/06/2016 03:05 PM, Alexandre Courbot wrote:
>>>>>
>>>>> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@nvidia.com> wrote:
>>>>>>
>>>>>> The Tegra HSP mailbox driver implements the signaling doorbell-based
>>>>>> interprocessor communication (IPC) for remote processors currently. The
>>>>>> HSP HW modules support some different features for that, which are
>>>>>> shared mailboxes, shared semaphores, arbitrated semaphores, and
>>>>>> doorbells. And there are multiple HSP HW instances on the chip. So the
>>>>>> driver is extendable to support more features for different IPC
>>>>>> requirement.
>>>>>>
>>>>>> The driver of remote processor can use it as a mailbox client and deal
>>>>>> with the IPC protocol to synchronize the data communications.
>>>>>>
>>>>>> Signed-off-by: Joseph Lo <josephl@nvidia.com>
>>>>>> ---
>>>>>> Changes in V2:
>>>>>> - Update the driver to support the binding changes in V2
>>>>>> - it's extendable to support multiple HSP sub-modules on the same HSP HW
>>>>>> block
>>>>>>     now.
>>>>>> ---
>>>>>>    drivers/mailbox/Kconfig     |   9 +
>>>>>>    drivers/mailbox/Makefile    |   2 +
>>>>>>    drivers/mailbox/tegra-hsp.c | 418
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++
>>>>>>    3 files changed, 429 insertions(+)
>>>>>>    create mode 100644 drivers/mailbox/tegra-hsp.c
>>>>>>
>>>>>> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
>>>>>> index 5305923752d2..fe584cb54720 100644
>>>>>> --- a/drivers/mailbox/Kconfig
>>>>>> +++ b/drivers/mailbox/Kconfig
>>>>>> @@ -114,6 +114,15 @@ config MAILBOX_TEST
>>>>>>             Test client to help with testing new Controller driver
>>>>>>             implementations.
>>>>>>
>>>>>> +config TEGRA_HSP_MBOX
>>>>>> +       bool "Tegra HSP(Hardware Synchronization Primitives) Driver"
>>>>>
>>>>>
>>>>> Space missing before the opening parenthesis (same in the patch title
>>>>> btw).
>>>>
>>>> Okay.
>>>>>
>>>>>
>>>>>> +       depends on ARCH_TEGRA_186_SOC
>>>>>> +       help
>>>>>> +         The Tegra HSP driver is used for the interprocessor
>>>>>> communication
>>>>>> +         between different remote processors and host processors on
>>>>>> Tegra186
>>>>>> +         and later SoCs. Say Y here if you want to have this support.
>>>>>> +         If unsure say N.
>>>>>
>>>>>
>>>>> Since this option is selected automatically by ARCH_TEGRA_186_SOC, you
>>>>> should probably drop the last 2 sentences.
>>>>
>>>> Okay.
>>>>
>>>>>
>>>>>> +
>>>>>>    config XGENE_SLIMPRO_MBOX
>>>>>>           tristate "APM SoC X-Gene SLIMpro Mailbox Controller"
>>>>>>           depends on ARCH_XGENE
>>>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>>>> index 0be3e742bb7d..26d8f91c7fea 100644
>>>>>> --- a/drivers/mailbox/Makefile
>>>>>> +++ b/drivers/mailbox/Makefile
>>>>>> @@ -25,3 +25,5 @@ obj-$(CONFIG_TI_MESSAGE_MANAGER) += ti-msgmgr.o
>>>>>>    obj-$(CONFIG_XGENE_SLIMPRO_MBOX) += mailbox-xgene-slimpro.o
>>>>>>
>>>>>>    obj-$(CONFIG_HI6220_MBOX)      += hi6220-mailbox.o
>>>>>> +
>>>>>> +obj-${CONFIG_TEGRA_HSP_MBOX}   += tegra-hsp.o
>>>>>> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..93c3ef58f29f
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/mailbox/tegra-hsp.c
>>>>>> @@ -0,0 +1,418 @@
>>>>>> +/*
>>>>>> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
>>>>>> + *
>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>> it
>>>>>> + * under the terms and conditions of the GNU General Public License,
>>>>>> + * version 2, as published by the Free Software Foundation.
>>>>>> + *
>>>>>> + * This program is distributed in the hope it will be useful, but
>>>>>> WITHOUT
>>>>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>>>> for
>>>>>> + * more details.
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/interrupt.h>
>>>>>> +#include <linux/io.h>
>>>>>> +#include <linux/mailbox_controller.h>
>>>>>> +#include <linux/of.h>
>>>>>> +#include <linux/of_device.h>
>>>>>> +#include <linux/platform_device.h>
>>>>>> +#include <dt-bindings/mailbox/tegra186-hsp.h>
>>>>>> +
>>>>>> +#define HSP_INT_DIMENSIONING   0x380
>>>>>> +#define HSP_nSM_OFFSET         0
>>>>>> +#define HSP_nSS_OFFSET         4
>>>>>> +#define HSP_nAS_OFFSET         8
>>>>>> +#define HSP_nDB_OFFSET         12
>>>>>> +#define HSP_nSI_OFFSET         16
>>>>>
>>>>>
>>>>> Would be nice to have comments to understand what SM, SS, AS, etc.
>>>>> stand for (Shared Mailboxes, Shared Semaphores, Arbitrated Semaphores
>>>>> but you need to look at the patch description to understand that). A
>>>>> top-of-file comment explaning the necessary concepts to read this code
>>>>> would do the trick.
>>>>
>>>> Yes, will fix that.
>>>>>
>>>>>
>>>>>> +#define HSP_nINT_MASK          0xf
>>>>>> +
>>>>>> +#define HSP_DB_REG_TRIGGER     0x0
>>>>>> +#define HSP_DB_REG_ENABLE      0x4
>>>>>> +#define HSP_DB_REG_RAW         0x8
>>>>>> +#define HSP_DB_REG_PENDING     0xc
>>>>>> +
>>>>>> +#define HSP_DB_CCPLEX          1
>>>>>> +#define HSP_DB_BPMP            3
>>>>>
>>>>>
>>>>> Maybe turn this into enum and use that type for
>>>>> tegra_hsp_db_chan::db_id? Also have MAX_NUM_HSP_DB here, since it is
>>>>> related to these values?
>>>>
>>>> Okay.
>>>>
>>>>>
>>>>>> +
>>>>>> +#define MAX_NUM_HSP_CHAN 32
>>>>>> +#define MAX_NUM_HSP_DB 7
>>>>>> +
>>>>>> +#define hsp_db_offset(i, d) \
>>>>>> +       (d->base + ((1 + (d->nr_sm >> 1) + d->nr_ss + d->nr_as) << 16) +
>>>>>> \
>>>>>> +       (i) * 0x100)
>>>>>> +
>>>>>> +struct tegra_hsp_db_chan {
>>>>>> +       int master_id;
>>>>>> +       int db_id;
>>>>>> +};
>>>>>> +
>>>>>> +struct tegra_hsp_mbox_chan {
>>>>>> +       int type;
>>>>>> +       union {
>>>>>> +               struct tegra_hsp_db_chan db_chan;
>>>>>> +       };
>>>>>> +};
>>>>>> +
>>>>>> +struct tegra_hsp_mbox {
>>>>>> +       struct mbox_controller *mbox;
>>>>>> +       void __iomem *base;
>>>>>> +       void __iomem *db_base[MAX_NUM_HSP_DB];
>>>>>> +       int db_irq;
>>>>>> +       int nr_sm;
>>>>>> +       int nr_as;
>>>>>> +       int nr_ss;
>>>>>> +       int nr_db;
>>>>>> +       int nr_si;
>>>>>> +       spinlock_t lock;
>>>>>> +};
>>>>>> +
>>>>>> +static inline u32 hsp_readl(void __iomem *base, int reg)
>>>>>> +{
>>>>>> +       return readl(base + reg);
>>>>>> +}
>>>>>> +
>>>>>> +static inline void hsp_writel(void __iomem *base, int reg, u32 val)
>>>>>> +{
>>>>>> +       writel(val, base + reg);
>>>>>> +       readl(base + reg);
>>>>>> +}
>>>>>> +
>>>>>> +static int hsp_db_can_ring(void __iomem *db_base)
>>>>>> +{
>>>>>> +       u32 reg;
>>>>>> +
>>>>>> +       reg = hsp_readl(db_base, HSP_DB_REG_ENABLE);
>>>>>> +
>>>>>> +       return !!(reg & BIT(HSP_DB_MASTER_CCPLEX));
>>>>>> +}
>>>>>> +
>>>>>> +static irqreturn_t hsp_db_irq(int irq, void *p)
>>>>>> +{
>>>>>> +       struct tegra_hsp_mbox *hsp_mbox = p;
>>>>>> +       ulong val;
>>>>>> +       int master_id;
>>>>>> +
>>>>>> +       val = (ulong)hsp_readl(hsp_mbox->db_base[HSP_DB_CCPLEX],
>>>>>> +                              HSP_DB_REG_PENDING);
>>>>>> +       hsp_writel(hsp_mbox->db_base[HSP_DB_CCPLEX], HSP_DB_REG_PENDING,
>>>>>> val);
>>>>>> +
>>>>>> +       spin_lock(&hsp_mbox->lock);
>>>>>> +       for_each_set_bit(master_id, &val, MAX_NUM_HSP_CHAN) {
>>>>>> +               struct mbox_chan *chan;
>>>>>> +               struct tegra_hsp_mbox_chan *mchan;
>>>>>> +               int i;
>>>>>> +
>>>>>> +               for (i = 0; i < MAX_NUM_HSP_CHAN; i++) {
>>>>>
>>>>>
>>>>> I wonder if this could not be optimized. You are doing a double loop
>>>>> on MAX_NUM_HSP_CHAN to look for an identical master_id. Since it seems
>>>>> like the same master_id cannot be used twice (considering that the
>>>>> inner loop only processes the first match), couldn't you just select
>>>>> the free channel in of_hsp_mbox_xlate() by doing
>>>>> &mbox->chans[master_id] (and returning an error if it is already
>>>>> used), then simply getting chan as &hsp_mbox->mbox->chans[master_id]
>>>>> instead of having the inner loop below? That would remove the need for
>>>>> the second loop.
>>>>
>>>>
>>>> That was exactly what I did in the V1, which only supported one HSP
>>>> sub-module per HSP HW block. So we can just use the master_id as the mbox
>>>> channel ID.
>>>>
>>>> Meanwhile, V2 is intended to support multiple HSP sub-modules running on
>>>> the same HSP HW block. The "ID" values of different modules could conflict,
>>>> so I dropped the mechanism that used the master_id as the mbox channel ID.
>>>>
>>>> Instead, the channel is allocated when the client is bound to one of the
>>>> HSP sub-modules. We store the "ID" information in the private mbox channel
>>>> data, which helps us figure out which mbox channel should respond to the
>>>> interrupt.
>>>>
>>>> In the doorbell case, all the DB clients share the same DB IRQ on the CPU
>>>> side. So in the ISR, we need to figure out the IRQ source, which is the
>>>> master_id that the IRQ came from. This is the outer loop. In the inner
>>>> loop, we figure out which channel should respond by checking the type
>>>> and ID.
>>>>
>>>> And I think it should be pretty quick, because we only check the set bits
>>>> from the pending register and then find the matching channel.
>>>
>>> Yeah, I am not worried about the CPU time (although in interrupt
>>> context, we always should), but rather about whether the code could be
>>> simplified.
>>>
>>> Ah, I think I get it. You want to be able to receive interrupts from
>>> the same master, but not necessarily for the doorbell function.
>>> Because of this you cannot use master_id as the index for the channel.
>>> Am I understanding correctly?
>>
>> Yes, the DB clients trigger the IRQ through the same master
>> (HSP_DB_CCPLEX) with its master_id. We (CPU) can check the ID to
>> know who is requesting the HSP mbox service. Each ID is unique under
>> the DB module.
>>
>> But the IDs could conflict when the HSP mbox driver is working
>> with multiple HSP sub-functions under the same HSP HW block. So we
>> can't just match the ID to the HSP mbox channel ID.
>
> Joseph, can you think of any other sub-function that uses the same
> master ids (and that does not have its own irq)? I wonder if we
> are over-engineering this. I think hsp_db_startup() and
> hsp_db_shutdown() do not support sharing masters: _startup() by one
> followed by _shutdown() from another will mask the interrupt. If there
> are in fact other potential sub-functions, I would imagine this will
> translate to other values of tegra_hsp_mbox_chan.type than
> HSP_MBOX_TYPE_DB? If yes, then you should be able to remove the need for
> this inner loop by having per-sub-function mboxes or by combining 'type'
> and 'master_id' to make a single index value?
>

I will try to refactor the driver to fix the inner loop issue by
separating the mbox channels for the different HSP modules, and by
hooking up different sub-channels for different masters.

Thanks,
-Joseph
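
The single-index idea raised in the review above (combine 'type' and
'master_id' so the ISR needs no inner search loop) could look roughly like the
following. This is only a minimal user-space sketch, not the driver's code:
the names hsp_chan_index(), hsp_chan_claim() and hsp_db_dispatch(), the table
sizes, and the enum are all hypothetical, chosen to mirror the four mailbox
types in the binding header and the 32-bit pending register.

```c
#include <stdint.h>

/* Hypothetical sizes: four sub-function types (DB, SM, SS, AS) and one
 * master per bit of the 32-bit pending register. */
#define HSP_NUM_TYPES   4
#define HSP_NUM_MASTERS 32

enum hsp_type { HSP_TYPE_DB, HSP_TYPE_SM, HSP_TYPE_SS, HSP_TYPE_AS };

/* Stand-in for the per-channel state a real driver would keep. */
struct hsp_chan {
	int in_use;
};

static struct hsp_chan chans[HSP_NUM_TYPES * HSP_NUM_MASTERS];

/* Fold (type, master_id) into one table index; -1 if out of range. */
static int hsp_chan_index(enum hsp_type type, unsigned int master_id)
{
	if ((unsigned int)type >= HSP_NUM_TYPES || master_id >= HSP_NUM_MASTERS)
		return -1;
	return (int)type * HSP_NUM_MASTERS + (int)master_id;
}

/* What an xlate step could do: claim the slot, or fail if it is taken. */
static int hsp_chan_claim(enum hsp_type type, unsigned int master_id)
{
	int idx = hsp_chan_index(type, master_id);

	if (idx < 0 || chans[idx].in_use)
		return -1;
	chans[idx].in_use = 1;
	return 0;
}

/* ISR-style dispatch: one pass over the pending bits, direct lookup per bit.
 * Returns how many pending masters had a claimed doorbell channel. */
static int hsp_db_dispatch(uint32_t pending)
{
	unsigned int master_id;
	int handled = 0;

	for (master_id = 0; master_id < HSP_NUM_MASTERS; master_id++) {
		int idx;

		if (!(pending & (UINT32_C(1) << master_id)))
			continue;
		idx = hsp_chan_index(HSP_TYPE_DB, master_id);
		if (idx >= 0 && chans[idx].in_use)
			handled++;
	}
	return handled;
}
```

With this layout the same master_id can be claimed once per type without
clashing, which is the conflict the V2 design was trying to avoid, and the
per-interrupt cost is a single pass over the pending bits.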


* Re: [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP
  2016-07-18  7:44       ` Joseph Lo
@ 2016-07-18 16:18         ` Stephen Warren
  0 siblings, 0 replies; 51+ messages in thread
From: Stephen Warren @ 2016-07-18 16:18 UTC (permalink / raw)
  To: Joseph Lo, Rob Herring
  Cc: Thierry Reding, Alexandre Courbot, linux-tegra, linux-arm-kernel,
	Mark Rutland, Peter De Schrijver, Matthew Longnecker, devicetree,
	Jassi Brar, linux-kernel, Catalin Marinas, Will Deacon

On 07/18/2016 01:44 AM, Joseph Lo wrote:
> Hi Rob,
>
> Thanks for your review.
>
> On 07/12/2016 12:05 AM, Stephen Warren wrote:
>> On 07/11/2016 08:22 AM, Rob Herring wrote:
>>> On Tue, Jul 05, 2016 at 05:04:24PM +0800, Joseph Lo wrote:
>>>> The BPMP is a specific processor in Tegra chip, which is designed for
>>>> booting process handling and offloading the power management, clock
>>>> management, and reset control tasks from the CPU. The binding document
>>>> defines the resources that would be used by the BPMP firmware driver,
>>>> which can create the interprocessor communication (IPC) between the CPU
>>>> and BPMP.
>>
>>>> diff --git
>>>> a/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>>>> b/Documentation/devicetree/bindings/firmware/nvidia,tegra186-bpmp.txt
>>
>>>> +NVIDIA Tegra Boot and Power Management Processor (BPMP)
>>>> +
>>>> +The BPMP is a specific processor in Tegra chip, which is designed for
>>>> +booting process handling and offloading the power management, clock
>>>> +management, and reset control tasks from the CPU. The binding document
>>>> +defines the resources that would be used by the BPMP firmware driver,
>>>> +which can create the interprocessor communication (IPC) between the
>>>> CPU
>>>> +and BPMP.
>>>> +
>>>> +Required properties:
>>>> +- name : Should be bpmp
>>>> +- compatible
>>>> +    Array of strings
>>>> +    One of:
>>>> +    - "nvidia,tegra186-bpmp"
>>>> +- mboxes : The phandle of mailbox controller and the mailbox
>>>> specifier.
>>>> +- shmem : List of the phandle of the TX and RX shared memory area that
>>>> +      the IPC between CPU and BPMP is based on.
>>>
>>> I think you can use memory-region here.
>>
>> Isn't memory-region intended for references into the /reserved-memory
>> node. If so, that isn't appropriate in this case since this property
>> typically points at on-chip SRAM that isn't included in the OS's view of
>> "system RAM".
> Agree with that.
>
>>
>> Or, should /reserved-memory be used even for (e.g. non-DRAM) memory
>> regions that aren't represented by the /memory/reg property?
>>
>
> For shmem, I follow the same concept as the binding for arm,scpi
> (.../arm/arm,scpi.txt) that is currently used in mainline. Do you think
> that is more appropriate here?

Personally I think the shmem property name used by the current patch is 
fine. Still, if Rob feels strongly about changing it, that's fine too.


* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-11 16:08         ` Stephen Warren
@ 2016-07-18 23:13           ` Stephen Warren
  2016-07-19  7:09             ` Joseph Lo
  0 siblings, 1 reply; 51+ messages in thread
From: Stephen Warren @ 2016-07-18 23:13 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sivaram Nair, Joseph Lo, Thierry Reding, Alexandre Courbot,
	linux-tegra, linux-arm-kernel, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/11/2016 10:08 AM, Stephen Warren wrote:
> On 07/11/2016 08:14 AM, Rob Herring wrote:
>> On Thu, Jul 07, 2016 at 12:35:02PM -0600, Stephen Warren wrote:
>>> On 07/07/2016 12:13 PM, Sivaram Nair wrote:
>>>> On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
>>>>> Add DT binding for the Hardware Synchronization Primitives (HSP). The
>>>>> HSP is designed for the processors to share resources and communicate
>>>>> together. It provides a set of hardware synchronization primitives for
>>>>> interprocessor communication. So the interprocessor communication
>>>>> (IPC)
>>>>> protocols can use hardware synchronization primitive, when operating
>>>>> between two processors not in an SMP relationship.
>>>
>>>>> diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h
>>>>> b/include/dt-bindings/mailbox/tegra186-hsp.h
>>>
>>>>> +#define HSP_MBOX_TYPE_DB 0x0
>>>>> +#define HSP_MBOX_TYPE_SM 0x1
>>>>> +#define HSP_MBOX_TYPE_SS 0x2
>>>>> +#define HSP_MBOX_TYPE_AS 0x3
>>>>> +
>>>>> +#define HSP_DB_MASTER_CCPLEX 17
>>>>> +#define HSP_DB_MASTER_BPMP 19
>>>>> +
>>>>> +#define HSP_MBOX_ID(type, ID) \
>>>>> +        (HSP_MBOX_TYPE_##type << 16 | ID)
>>>>
>>>> It will be nicer if you avoid the macro glue magic '##' for 'type'. I
>>>> would also suggest to use braces around 'type' and 'ID'.
>>>
>>> This technique has been used without issue in quite a few other places,
>>> and has the benefit of simplifying the text wherever the macro is
>>> used. What issue do you foresee?
>>
>> I'm not a fan of using the macros to begin with and less so anything
>> more complex than a single constant value. I'd rather see 2 cells here
>> with the first being the id and the 2nd being the type.
>>
>> An issue with token pasting is grepping for DB, SM, etc. in kernel tree
>> is probably noisy. Not such a big deal here, but a major PIA when you
>> have more complex sets of includes.
>
> Is that a NAK or simply a suggestion? Having a single cell makes DT
> parsing a bit simpler, since pretty much every SW stack provides a
> default "one-cell" of_xlate implementation, whereas >1 cell means custom
> code for of_xlate.

I didn't see a response to this. Joseph, let's just use two cells 
instead. I'm rather desperately waiting for this binding to be complete 
so I can finalize the U-Boot code that uses it, and it sounds like 
changing to two cells will get an ack faster. Can you post an updated 
version of this series today/ASAP to get things moving? Thanks.


* Re: [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox
  2016-07-18 23:13           ` Stephen Warren
@ 2016-07-19  7:09             ` Joseph Lo
  0 siblings, 0 replies; 51+ messages in thread
From: Joseph Lo @ 2016-07-19  7:09 UTC (permalink / raw)
  To: Stephen Warren, Rob Herring
  Cc: Sivaram Nair, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-arm-kernel, Mark Rutland, Peter De Schrijver,
	Matthew Longnecker, devicetree, Jassi Brar, linux-kernel,
	Catalin Marinas, Will Deacon

On 07/19/2016 07:13 AM, Stephen Warren wrote:
> On 07/11/2016 10:08 AM, Stephen Warren wrote:
>> On 07/11/2016 08:14 AM, Rob Herring wrote:
>>> On Thu, Jul 07, 2016 at 12:35:02PM -0600, Stephen Warren wrote:
>>>> On 07/07/2016 12:13 PM, Sivaram Nair wrote:
>>>>> On Tue, Jul 05, 2016 at 05:04:22PM +0800, Joseph Lo wrote:
>>>>>> Add DT binding for the Hardware Synchronization Primitives (HSP). The
>>>>>> HSP is designed for the processors to share resources and communicate
>>>>>> together. It provides a set of hardware synchronization primitives
>>>>>> for
>>>>>> interprocessor communication. So the interprocessor communication
>>>>>> (IPC)
>>>>>> protocols can use hardware synchronization primitive, when operating
>>>>>> between two processors not in an SMP relationship.
>>>>
>>>>>> diff --git a/include/dt-bindings/mailbox/tegra186-hsp.h
>>>>>> b/include/dt-bindings/mailbox/tegra186-hsp.h
>>>>
>>>>>> +#define HSP_MBOX_TYPE_DB 0x0
>>>>>> +#define HSP_MBOX_TYPE_SM 0x1
>>>>>> +#define HSP_MBOX_TYPE_SS 0x2
>>>>>> +#define HSP_MBOX_TYPE_AS 0x3
>>>>>> +
>>>>>> +#define HSP_DB_MASTER_CCPLEX 17
>>>>>> +#define HSP_DB_MASTER_BPMP 19
>>>>>> +
>>>>>> +#define HSP_MBOX_ID(type, ID) \
>>>>>> +        (HSP_MBOX_TYPE_##type << 16 | ID)
>>>>>
>>>>> It will be nicer if you avoid the macro glue magic '##' for 'type'. I
>>>>> would also suggest to use braces around 'type' and 'ID'.
>>>>
>>>> This technique has been used without issue in quite a few other places,
>>>> and has the benefit of simplifying the text wherever the macro is
>>>> used. What issue do you foresee?
>>>
>>> I'm not a fan of using the macros to begin with and less so anything
>>> more complex than a single constant value. I'd rather see 2 cells here
>>> with the first being the id and the 2nd being the type.
>>>
>>> An issue with token pasting is grepping for DB, SM, etc. in kernel tree
>>> is probably noisy. Not such a big deal here, but a major PIA when you
>>> have more complex sets of includes.
>>
>> Is that a NAK or simply a suggestion? Having a single cell makes DT
>> parsing a bit simpler, since pretty much every SW stack provides a
>> default "one-cell" of_xlate implementation, whereas >1 cell means custom
>> code for of_xlate.
>
> I didn't see a response to this. Joseph, let's just use two cells
> instead. I'm rather desperately waiting for this binding to be complete
> so I can finalize the U-Boot code that uses it, and it sounds like
> changing to two cells will get an ack faster. Can you post an updated
> version of this series today/ASAP to get things moving? Thanks.
>
Okay, will use two cells instead.
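
To make the trade-off discussed above concrete, here is a rough sketch of how
a consumer might decode the two specifier styles. It is not the binding's
final definition: struct hsp_spec and both decode helpers are hypothetical
names, the HSP_MBOX_ID() variant below is a parenthesized rewrite without
token pasting rather than the macro from the patch, and the two-cell order
(id first, type second) simply follows Rob's suggestion in this thread.

```c
#include <stdint.h>

/* Values as posted in the V2 dt-bindings header. */
#define HSP_MBOX_TYPE_DB 0x0
#define HSP_MBOX_TYPE_SM 0x1
#define HSP_MBOX_TYPE_SS 0x2
#define HSP_MBOX_TYPE_AS 0x3

#define HSP_DB_MASTER_CCPLEX 17
#define HSP_DB_MASTER_BPMP   19

/* One-cell packing, written without '##' and with every argument
 * parenthesized (illustrative variant, not the macro from the patch). */
#define HSP_MBOX_ID(type, id) \
	(((uint32_t)(type) << 16) | ((uint32_t)(id) & 0xffff))

/* Hypothetical decoded form of a mailbox specifier. */
struct hsp_spec {
	uint32_t type;
	uint32_t id;
};

/* One cell: the consumer has to unpack the type from the upper 16 bits. */
static struct hsp_spec hsp_decode_one_cell(uint32_t cell)
{
	struct hsp_spec spec;

	spec.type = cell >> 16;
	spec.id = cell & 0xffff;
	return spec;
}

/* Two cells: no packing at all, but the cell count must be checked.
 * Assumed order: cells[0] is the id, cells[1] is the type. */
static int hsp_decode_two_cells(const uint32_t *cells, unsigned int count,
				struct hsp_spec *out)
{
	if (count != 2)
		return -1;
	out->id = cells[0];
	out->type = cells[1];
	return 0;
}
```

Both forms carry the same information. The two-cell form keeps plain constants
in the device tree at the cost of a custom translation step, which is the
point Stephen raised about default one-cell of_xlate implementations.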


end of thread, other threads:[~2016-07-19  7:09 UTC | newest]

Thread overview: 51+ messages
2016-07-05  9:04 [PATCH V2 00/10] arm64: tegra: add BPMP support Joseph Lo
2016-07-05  9:04 ` [PATCH V2 01/10] Documentation: dt-bindings: mailbox: tegra: Add binding for HSP mailbox Joseph Lo
2016-07-06 17:02   ` Stephen Warren
2016-07-07  6:24     ` Joseph Lo
2016-07-07 18:13   ` Sivaram Nair
2016-07-07 18:35     ` Stephen Warren
2016-07-07 18:44       ` Sivaram Nair
2016-07-11 14:14       ` Rob Herring
2016-07-11 16:08         ` Stephen Warren
2016-07-18 23:13           ` Stephen Warren
2016-07-19  7:09             ` Joseph Lo
2016-07-05  9:04 ` [PATCH V2 02/10] mailbox: tegra-hsp: Add HSP(Hardware Synchronization Primitives) driver Joseph Lo
2016-07-06  7:05   ` Alexandre Courbot
2016-07-06  9:06     ` Joseph Lo
2016-07-06 12:23       ` Alexandre Courbot
2016-07-07  6:37         ` Joseph Lo
2016-07-07 21:33           ` Sivaram Nair
2016-07-18  8:58             ` Joseph Lo
2016-07-06 16:50       ` Stephen Warren
2016-07-07  6:49         ` Joseph Lo
2016-07-07 21:10   ` Sivaram Nair
2016-07-18  8:51     ` Joseph Lo
2016-07-05  9:04 ` [PATCH V2 03/10] Documentation: dt-bindings: firmware: tegra: add bindings of the BPMP Joseph Lo
2016-07-06 11:42   ` Alexandre Courbot
2016-07-07  6:25     ` Joseph Lo
2016-07-06 17:03   ` Stephen Warren
2016-07-07  6:26     ` Joseph Lo
2016-07-11 14:22   ` Rob Herring
2016-07-11 16:05     ` Stephen Warren
2016-07-18  7:44       ` Joseph Lo
2016-07-18 16:18         ` Stephen Warren
2016-07-13 19:41   ` Stephen Warren
2016-07-18  6:42     ` Joseph Lo
2016-07-05  9:04 ` [PATCH V2 04/10] firmware: tegra: add IVC library Joseph Lo
2016-07-07 11:16   ` Alexandre Courbot
2016-07-09 23:45   ` Paul Gortmaker
2016-07-05  9:04 ` [PATCH V2 05/10] firmware: tegra: add BPMP support Joseph Lo
2016-07-06 11:39   ` Alexandre Courbot
2016-07-06 16:39     ` Stephen Warren
2016-07-06 16:47     ` Matt Longnecker
2016-07-07  2:24       ` Alexandre Courbot
2016-07-07  8:17     ` Joseph Lo
2016-07-07 10:18       ` Alexandre Courbot
2016-07-07 19:55         ` Stephen Warren
2016-07-08 20:19         ` Sivaram Nair
2016-07-08 17:55   ` Sivaram Nair
2016-07-05  9:04 ` [PATCH V2 06/10] soc/tegra: Add Tegra186 support Joseph Lo
2016-07-05  9:04 ` [PATCH V2 07/10] arm64: defconfig: Enable Tegra186 SoC Joseph Lo
2016-07-05  9:04 ` [PATCH V2 08/10] arm64: dts: tegra: Add Tegra186 support Joseph Lo
2016-07-05  9:04 ` [PATCH V2 09/10] arm64: dts: tegra: Add NVIDIA Tegra186 P3310 main board support Joseph Lo
2016-07-05  9:04 ` [PATCH V2 10/10] arm64: dts: tegra: Add NVIDIA P2771 " Joseph Lo
