linux-remoteproc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] TI K3 DSP remoteproc driver for C66x DSPs
@ 2020-03-25 20:18 Suman Anna
  2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Suman Anna @ 2020-03-25 20:18 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Herring, Mathieu Poirier
  Cc: Lokesh Vutla, linux-remoteproc, devicetree, linux-arm-kernel,
	linux-kernel, Suman Anna

Hi All,

The following series adds a new K3 DSP remoteproc driver supporting the
C66x DSPs on the TI K3 J721E SoCs. The current series mainly adds the support
for booting the DSPs from the Linux kernel. This series forms the foundation
for adding support for a new 64-bit DSP (C71x DSP) to be posted in a separate
series. Support for attaching to pre-booted DSPs (from bootloader) will be
done in a future series.

The C66x DSPs can boot either using firmware segments loaded into either DDR
and/or internal DSP RAMs. IPC is through the virtio-rpmsg transport. There is 
no Error Recovery or Power Management support at present. The driver also does
not support loading into on-chip SRAMs at present.

The patches are based on the current rproc-next branch, and does use couple
of patches posted earlier from the OMAP remoteproc series [1] and TI K3 R5F
series [2]. It also leverages the fixed memory carveout fixes series [3].

Following is the patch summary:
 - Patch 1 adds the bindings in the YAML format.
 - Patch 2 adds the basic remoteproc driver for the C66x DSPs
 - Patch 3 is an enhancement to support loading into the DSP's internal
   RAMs directly.

regards
Suman

[1] https://patchwork.kernel.org/patch/11455135/
[2] https://patchwork.kernel.org/patch/11456383/ 
[3] https://patchwork.kernel.org/cover/11447649/

Suman Anna (3):
  dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs
  remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs
  remoteproc/k3-dsp: Add support for L2RAM loading on C66x DSPs

 .../bindings/remoteproc/ti,k3-dsp-rproc.yaml  | 180 ++++
 drivers/remoteproc/Kconfig                    |  16 +
 drivers/remoteproc/Makefile                   |   1 +
 drivers/remoteproc/ti_k3_dsp_remoteproc.c     | 818 ++++++++++++++++++
 4 files changed, 1015 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
 create mode 100644 drivers/remoteproc/ti_k3_dsp_remoteproc.c

-- 
2.23.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs
  2020-03-25 20:18 [PATCH 0/3] TI K3 DSP remoteproc driver for C66x DSPs Suman Anna
@ 2020-03-25 20:18 ` Suman Anna
  2020-03-26 16:54   ` Rob Herring
  2020-04-27 19:49   ` Mathieu Poirier
  2020-03-25 20:18 ` [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs Suman Anna
  2020-03-25 20:18 ` [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on " Suman Anna
  2 siblings, 2 replies; 13+ messages in thread
From: Suman Anna @ 2020-03-25 20:18 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Herring, Mathieu Poirier
  Cc: Lokesh Vutla, linux-remoteproc, devicetree, linux-arm-kernel,
	linux-kernel, Suman Anna

Some Texas Instruments K3 family of SoCs have one of more Digital Signal
Processor (DSP) subsystems that are comprised of either a TMS320C66x
CorePac and/or a next-generation TMS320C71x CorePac processor subsystem.
Add the device tree bindings document for the C66x DSP devices on these
SoCs. The added example illustrates the DT nodes for the first C66x DSP
device present on the K3 J721E family of SoCs.

Signed-off-by: Suman Anna <s-anna@ti.com>
---
 .../bindings/remoteproc/ti,k3-dsp-rproc.yaml  | 180 ++++++++++++++++++
 1 file changed, 180 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml

diff --git a/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
new file mode 100644
index 000000000000..416e3abe7937
--- /dev/null
+++ b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
@@ -0,0 +1,180 @@
+# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/remoteproc/ti,k3-dsp-rproc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: TI K3 DSP devices
+
+maintainers:
+  - Suman Anna <s-anna@ti.com>
+
+description: |
+  The TI K3 family of SoCs usually have one or more TI DSP Core sub-systems
+  that are used to offload some of the processor-intensive tasks or algorithms,
+  for achieving various system level goals.
+
+  These processor sub-systems usually contain additional sub-modules like
+  L1 and/or L2 caches/SRAMs, an Interrupt Controller, an external memory
+  controller, a dedicated local power/sleep controller etc. The DSP processor
+  cores in the K3 SoCs are usually either a TMS320C66x CorePac processor or a
+  TMS320C71x CorePac processor.
+
+  Each DSP Core sub-system is represented as a single DT node. Each node has a
+  number of required or optional properties that enable the OS running on the
+  host processor (Arm CorePac) to perform the device management of the remote
+  processor and to communicate with the remote processor.
+
+properties:
+  compatible:
+    const: ti,j721e-c66-dsp
+    description:
+      Use "ti,j721e-c66-dsp" for C66x DSPs on K3 J721E SoCs
+
+  reg:
+    description: |
+      Should contain an entry for each value in 'reg-names'.
+      Each entry should have the memory region's start address
+      and the size of the region, the representation matching
+      the parent node's '#address-cells' and '#size-cells' values.
+    minItems: 3
+    maxItems: 3
+
+  reg-names:
+    description: |
+      Should contain strings with the names of the specific internal
+      internal memory regions, and should be defined in this order
+    maxItems: 3
+    items:
+      - const: l2sram
+      - const: l1pram
+      - const: l1dram
+
+  ti,sci:
+    $ref: /schemas/types.yaml#/definitions/phandle
+    description:
+      Should be a phandle to the TI-SCI System Controller node
+
+  ti,sci-dev-id:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description: |
+      Should contain the TI-SCI device id corresponding to the DSP core.
+      Please refer to the corresponding System Controller documentation
+      for valid values for the DSP cores.
+
+  ti,sci-proc-ids:
+    description: Should contain a single tuple of <proc_id host_id>.
+    allOf:
+      - $ref: /schemas/types.yaml#/definitions/uint32-matrix
+      - maxItems: 1
+        items:
+          items:
+            - description: TI-SCI processor id for the DSP core device
+            - description: TI-SCI host id to which processor control
+                           ownership should be transferred to
+
+  resets:
+    description: |
+      Should contain the phandle to the reset controller node
+      managing the resets for this device, and a reset
+      specifier. Please refer to the following reset bindings
+      for the reset argument specifier,
+      Documentation/devicetree/bindings/reset/ti,sci-reset.txt
+
+  firmware-name:
+    description: |
+      Should contain the name of the default firmware image
+      file located on the firmware search path
+
+  mboxes:
+    description: |
+      OMAP Mailbox specifier denoting the sub-mailbox, to be used for
+      communication with the remote processor. This property should match
+      with the sub-mailbox node used in the firmware image. The specifier
+      format is as per the bindings,
+      Documentation/devicetree/bindings/mailbox/omap-mailbox.txt
+
+  memory-region:
+    minItems: 2
+    description: |
+      phandle to the reserved memory nodes to be associated with the remoteproc
+      device. There should be atleast two reserved memory nodes defined - the
+      first one would be used for dynamic DMA allocations like vrings and vring
+      buffers, and the remaining ones used for the firmware image sections. The
+      reserved memory nodes should be carveout nodes, and should be defined as
+      per the bindings in
+      Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+
+# Optional properties:
+# --------------------
+
+  sram:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    minItems: 1
+    description: |
+      pHandles to one or more reserved on-chip SRAM regions. The regions
+      should be defined as child nodes of the respective SRAM node, and
+      should be defined as per the generic bindings in,
+      Documentation/devicetree/bindings/sram/sram.yaml
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - ti,sci
+ - ti,sci-dev-id
+ - ti,sci-proc-ids
+ - resets
+ - firmware-name
+ - mboxes
+ - memory-region
+
+additionalProperties: false
+
+examples:
+  - |
+
+    //Example: J721E SoC
+    /* DSP Carveout reserved memory nodes */
+    reserved-memory {
+        #address-cells = <2>;
+        #size-cells = <2>;
+        ranges;
+
+        c66_0_dma_memory_region: c66-dma-memory@a6000000 {
+            compatible = "shared-dma-pool";
+            reg = <0x00 0xa6000000 0x00 0x100000>;
+            no-map;
+        };
+
+        c66_0_memory_region: c66-memory@a6100000 {
+            compatible = "shared-dma-pool";
+            reg = <0x00 0xa6100000 0x00 0xf00000>;
+            no-map;
+        };
+    };
+
+    cbass_main: interconnect@100000 {
+        compatible = "simple-bus";
+        #address-cells = <2>;
+        #size-cells = <2>;
+        ranges = <0x4d 0x80800000 0x4d 0x80800000 0x00 0x00800000>, /* C66_0 */
+                 <0x4d 0x81800000 0x4d 0x81800000 0x00 0x00800000>; /* C66_1 */
+
+        /* J721E C66_0 DSP node */
+        c66_0: dsp@4d80800000 {
+            compatible = "ti,j721e-c66-dsp";
+            reg = <0x4d 0x80800000 0x00 0x00048000>,
+                  <0x4d 0x80e00000 0x00 0x00008000>,
+                  <0x4d 0x80f00000 0x00 0x00008000>;
+            reg-names = "l2sram", "l1pram", "l1dram";
+            ti,sci = <&dmsc>;
+            ti,sci-dev-id = <142>;
+            ti,sci-proc-ids = <0x03 0xFF>;
+            resets = <&k3_reset 142 1>;
+            firmware-name = "j7-c66_0-fw";
+            memory-region = <&c66_0_dma_memory_region>,
+                            <&c66_0_memory_region>;
+            mboxes = <&mailbox0_cluster3 &mbox_c66_0>;
+        };
+    };
-- 
2.23.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs
  2020-03-25 20:18 [PATCH 0/3] TI K3 DSP remoteproc driver for C66x DSPs Suman Anna
  2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
@ 2020-03-25 20:18 ` Suman Anna
  2020-04-27 22:57   ` Mathieu Poirier
  2020-03-25 20:18 ` [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on " Suman Anna
  2 siblings, 1 reply; 13+ messages in thread
From: Suman Anna @ 2020-03-25 20:18 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Herring, Mathieu Poirier
  Cc: Lokesh Vutla, linux-remoteproc, devicetree, linux-arm-kernel,
	linux-kernel, Suman Anna

The Texas Instrument's K3 J721E SoCs have two C66x DSP Subsystems in MAIN
voltage domain that are based on the TI's standard TMS320C66x DSP CorePac
module. Each subsystem has a Fixed/Floating-Point DSP CPU, with 32 KB each
of L1P & L1D SRAMs that can be configured and partitioned as either RAM
and/or Cache, and 288 KB of L2 SRAM with 256 KB of memory configurable as
either RAM and/or Cache. The CorePac also includes an Internal DMA (IDMA),
External Memory Controller (EMC), Extended Memory Controller (XMC) with a
Region Address Translator (RAT) unit for 32-bit to 48-bit address
extension/translations, an Interrupt Controller (INTC) and a Powerdown
Controller (PDC).

A new remoteproc module is added to perform the device management of
these DSP devices. The support is limited to images using only external
DDR memory at the moment, the loading support to internal memories and
any on-chip RAM memories will be added in a subsequent patch. RAT support
is also left for a future patch, and as such the reserved memory carveout
regions are all expected to be using memory regions within the first 2 GB.
Error Recovery and Power Management features are not currently supported.

The C66x remote processors do not have an MMU, and so require fixed memory
carveout regions matching the firmware image addresses. Support for this
is provided by mandating multiple memory regions to be attached to the
remoteproc device. The first memory region will be used to serve as the
DMA pool for all dynamic allocations like the vrings and vring buffers.
The remaining memory regions are mapped into the kernel at device probe
time, and are used to provide address translations for firmware image
segments without the need for any RSC_CARVEOUT entries. Any firmware
image using memory outside of the supplied reserved memory carveout
regions will be errored out.

The driver uses various TI-SCI interfaces to talk to the System Controller
(DMSC) for managing configuration, power and reset management of these
cores. IPC between the A72 cores and the DSP cores is supported through
the virtio rpmsg stack using shared memory and OMAP Mailboxes.

Signed-off-by: Suman Anna <s-anna@ti.com>
---
 drivers/remoteproc/Kconfig                |  16 +
 drivers/remoteproc/Makefile               |   1 +
 drivers/remoteproc/ti_k3_dsp_remoteproc.c | 736 ++++++++++++++++++++++
 3 files changed, 753 insertions(+)
 create mode 100644 drivers/remoteproc/ti_k3_dsp_remoteproc.c

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 073048b4c0fb..66a76acb15b6 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -240,6 +240,22 @@ config TI_K3_R5_REMOTEPROC
 	  It's safe to say N here if you're not interested in utilizing
 	  a slave processor
 
+config TI_K3_DSP_REMOTEPROC
+	tristate "TI K3 DSP remoteproc support"
+	depends on ARCH_K3
+	select MAILBOX
+	select OMAP2PLUS_MBOX
+	help
+	  Say y here to support TI's C66x and C71x DSP remote processor
+	  subsystems on various TI K3 family of SoCs through the remote
+	  processor framework.
+
+	  You want to say m here in order to offload some processing
+	  tasks to these processors.
+
+	  It's safe to say N here if you're not interested in utilizing
+	  the DSP slave processors.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 00ba826818af..eb51cc09e47b 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_ST_REMOTEPROC)		+= st_remoteproc.o
 obj-$(CONFIG_ST_SLIM_REMOTEPROC)	+= st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)		+= stm32_rproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)	+= ti_k3_r5_remoteproc.o
+obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
new file mode 100644
index 000000000000..fd0d84f46f90
--- /dev/null
+++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
@@ -0,0 +1,736 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * TI K3 DSP Remote Processor(s) driver
+ *
+ * Copyright (C) 2018-2020 Texas Instruments Incorporated - http://www.ti.com/
+ *	Suman Anna <s-anna@ti.com>
+ */
+
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/remoteproc.h>
+#include <linux/mailbox_client.h>
+#include <linux/omap-mailbox.h>
+#include <linux/reset.h>
+#include <linux/soc/ti/ti_sci_protocol.h>
+
+#include "omap_remoteproc.h"
+#include "remoteproc_internal.h"
+#include "ti_sci_proc.h"
+
+#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK	(SZ_16M - 1)
+
+/**
+ * struct k3_dsp_rproc_mem - internal memory structure
+ * @cpu_addr: MPU virtual address of the memory region
+ * @bus_addr: Bus address used to access the memory region
+ * @dev_addr: Device address of the memory region from DSP view
+ * @size: Size of the memory region
+ */
+struct k3_dsp_rproc_mem {
+	void __iomem *cpu_addr;
+	phys_addr_t bus_addr;
+	u32 dev_addr;
+	size_t size;
+};
+
+/**
+ * struct k3_dsp_mem_data - memory definitions for a DSP
+ * @name: name for this memory entry
+ * @dev_addr: device address for the memory entry
+ */
+struct k3_dsp_mem_data {
+	const char *name;
+	const u32 dev_addr;
+};
+
+/**
+ * struct k3_dsp_dev_data - device data structure for a DSP
+ * @mems: pointer to memory definitions for a DSP
+ * @num_mems: number of memory regions in @mems
+ * @boot_align_addr: boot vector address alignment granularity
+ * @uses_lreset: flag to denote the need for local reset management
+ */
+struct k3_dsp_dev_data {
+	const struct k3_dsp_mem_data *mems;
+	u32 num_mems;
+	u32 boot_align_addr;
+	bool uses_lreset;
+};
+
+/**
+ * struct k3_dsp_rproc - k3 DSP remote processor driver structure
+ * @dev: cached device pointer
+ * @rproc: remoteproc device handle
+ * @mem: internal memory regions data
+ * @num_mems: number of internal memory regions
+ * @rmem: reserved memory regions data
+ * @num_rmems: number of reserved memory regions
+ * @reset: reset control handle
+ * @data: pointer to DSP-specific device data
+ * @tsp: TI-SCI processor control handle
+ * @ti_sci: TI-SCI handle
+ * @ti_sci_id: TI-SCI device identifier
+ * @mbox: mailbox channel handle
+ * @client: mailbox client to request the mailbox channel
+ */
+struct k3_dsp_rproc {
+	struct device *dev;
+	struct rproc *rproc;
+	struct k3_dsp_rproc_mem *mem;
+	int num_mems;
+	struct k3_dsp_rproc_mem *rmem;
+	int num_rmems;
+	struct reset_control *reset;
+	const struct k3_dsp_dev_data *data;
+	struct ti_sci_proc *tsp;
+	const struct ti_sci_handle *ti_sci;
+	u32 ti_sci_id;
+	struct mbox_chan *mbox;
+	struct mbox_client client;
+};
+
+/**
+ * k3_dsp_rproc_mbox_callback() - inbound mailbox message handler
+ * @client: mailbox client pointer used for requesting the mailbox channel
+ * @data: mailbox payload
+ *
+ * This handler is invoked by the OMAP mailbox driver whenever a mailbox
+ * message is received. Usually, the mailbox payload simply contains
+ * the index of the virtqueue that is kicked by the remote processor,
+ * and we let remoteproc core handle it.
+ *
+ * In addition to virtqueue indices, we also have some out-of-band values
+ * that indicate different events. Those values are deliberately very
+ * large so they don't coincide with virtqueue indices.
+ */
+static void k3_dsp_rproc_mbox_callback(struct mbox_client *client, void *data)
+{
+	struct k3_dsp_rproc *kproc = container_of(client, struct k3_dsp_rproc,
+						client);
+	struct device *dev = kproc->rproc->dev.parent;
+	const char *name = kproc->rproc->name;
+	u32 msg = omap_mbox_message(data);
+
+	dev_dbg(dev, "mbox msg: 0x%x\n", msg);
+
+	switch (msg) {
+	case RP_MBOX_CRASH:
+		/*
+		 * remoteproc detected an exception, but error recovery is not
+		 * supported. So, just log this for now
+		 */
+		dev_err(dev, "K3 DSP rproc %s crashed\n", name);
+		break;
+	case RP_MBOX_ECHO_REPLY:
+		dev_info(dev, "received echo reply from %s\n", name);
+		break;
+	default:
+		/* silently handle all other valid messages */
+		if (msg >= RP_MBOX_READY && msg < RP_MBOX_END_MSG)
+			return;
+		if (msg > kproc->rproc->max_notifyid) {
+			dev_dbg(dev, "dropping unknown message 0x%x", msg);
+			return;
+		}
+		/* msg contains the index of the triggered vring */
+		if (rproc_vq_interrupt(kproc->rproc, msg) == IRQ_NONE)
+			dev_dbg(dev, "no message was found in vqid %d\n", msg);
+	}
+}
+
+/*
+ * Kick the remote processor to notify about pending unprocessed messages.
+ * The vqid usage is not used and is inconsequential, as the kick is performed
+ * through a simulated GPIO (a bit in an IPC interrupt-triggering register),
+ * the remote processor is expected to process both its Tx and Rx virtqueues.
+ */
+static void k3_dsp_rproc_kick(struct rproc *rproc, int vqid)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+	struct device *dev = rproc->dev.parent;
+	mbox_msg_t msg = (mbox_msg_t)vqid;
+	int ret;
+
+	/* send the index of the triggered virtqueue in the mailbox payload */
+	ret = mbox_send_message(kproc->mbox, (void *)msg);
+	if (ret < 0)
+		dev_err(dev, "failed to send mailbox message, status = %d\n",
+			ret);
+}
+
+/* Put the DSP processor into reset */
+static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
+{
+	struct device *dev = kproc->dev;
+	int ret;
+
+	ret = reset_control_assert(kproc->reset);
+	if (ret) {
+		dev_err(dev, "local-reset assert failed, ret = %d\n", ret);
+		return ret;
+	}
+
+	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
+						    kproc->ti_sci_id);
+	if (ret) {
+		dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
+		if (reset_control_deassert(kproc->reset))
+			dev_warn(dev, "local-reset deassert back failed\n");
+	}
+
+	return ret;
+}
+
+/* Release the DSP processor from reset */
+static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
+{
+	struct device *dev = kproc->dev;
+	int ret;
+
+	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
+						   kproc->ti_sci_id);
+	if (ret) {
+		dev_err(dev, "module-reset deassert failed, ret = %d\n", ret);
+		return ret;
+	}
+
+	ret = reset_control_deassert(kproc->reset);
+	if (ret) {
+		dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
+		if (kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
+							  kproc->ti_sci_id))
+			dev_warn(dev, "module-reset assert back failed\n");
+	}
+
+	return ret;
+}
+
+/*
+ * Power up the DSP remote processor.
+ *
+ * This function will be invoked only after the firmware for this rproc
+ * was loaded, parsed successfully, and all of its resource requirements
+ * were met.
+ */
+static int k3_dsp_rproc_start(struct rproc *rproc)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+	struct mbox_client *client = &kproc->client;
+	struct device *dev = kproc->dev;
+	u32 boot_addr;
+	int ret;
+
+	client->dev = dev;
+	client->tx_done = NULL;
+	client->rx_callback = k3_dsp_rproc_mbox_callback;
+	client->tx_block = false;
+	client->knows_txdone = false;
+
+	kproc->mbox = mbox_request_channel(client, 0);
+	if (IS_ERR(kproc->mbox)) {
+		ret = -EBUSY;
+		dev_err(dev, "mbox_request_channel failed: %ld\n",
+			PTR_ERR(kproc->mbox));
+		return ret;
+	}
+
+	/*
+	 * Ping the remote processor, this is only for sanity-sake for now;
+	 * there is no functional effect whatsoever.
+	 *
+	 * Note that the reply will _not_ arrive immediately: this message
+	 * will wait in the mailbox fifo until the remote processor is booted.
+	 */
+	ret = mbox_send_message(kproc->mbox, (void *)RP_MBOX_ECHO_REQUEST);
+	if (ret < 0) {
+		dev_err(dev, "mbox_send_message failed: %d\n", ret);
+		goto put_mbox;
+	}
+
+	boot_addr = rproc->bootaddr;
+	if (boot_addr & (kproc->data->boot_align_addr - 1)) {
+		dev_err(dev, "invalid boot address 0x%x, must be aligned on a 0x%x boundary\n",
+			boot_addr, kproc->data->boot_align_addr);
+		ret = -EINVAL;
+		goto put_mbox;
+	}
+
+	dev_err(dev, "booting DSP core using boot addr = 0x%x\n", boot_addr);
+	ret = ti_sci_proc_set_config(kproc->tsp, boot_addr, 0, 0);
+	if (ret)
+		goto put_mbox;
+
+	ret = k3_dsp_rproc_release(kproc);
+	if (ret)
+		goto put_mbox;
+
+	return 0;
+
+put_mbox:
+	mbox_free_channel(kproc->mbox);
+	return ret;
+}
+
+/*
+ * Stop the DSP remote processor.
+ *
+ * This function puts the DSP processor into reset, and finishes processing
+ * of any pending messages.
+ */
+static int k3_dsp_rproc_stop(struct rproc *rproc)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+
+	mbox_free_channel(kproc->mbox);
+
+	k3_dsp_rproc_reset(kproc);
+
+	return 0;
+}
+
+/*
+ * Custom function to translate a DSP device address (internal RAMs only) to a
+ * kernel virtual address.  The DSPs can access their RAMs at either an internal
+ * address visible only from a DSP, or at the SoC-level bus address. Both these
+ * addresses need to be looked through for translation. The translated addresses
+ * can be used either by the remoteproc core for loading (when using kernel
+ * remoteproc loader), or by any rpmsg bus drivers.
+ */
+static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+	void __iomem *va = NULL;
+	phys_addr_t bus_addr;
+	u32 dev_addr, offset;
+	size_t size;
+	int i;
+
+	if (len == 0)
+		return NULL;
+
+	for (i = 0; i < kproc->num_mems; i++) {
+		bus_addr = kproc->mem[i].bus_addr;
+		dev_addr = kproc->mem[i].dev_addr;
+		size = kproc->mem[i].size;
+
+		if (da < KEYSTONE_RPROC_LOCAL_ADDRESS_MASK) {
+			/* handle DSP-view addresses */
+			if (da >= dev_addr &&
+			    ((da + len) <= (dev_addr + size))) {
+				offset = da - dev_addr;
+				va = kproc->mem[i].cpu_addr + offset;
+				return (__force void *)va;
+			}
+		} else {
+			/* handle SoC-view addresses */
+			if (da >= bus_addr &&
+			    (da + len) <= (bus_addr + size)) {
+				offset = da - bus_addr;
+				va = kproc->mem[i].cpu_addr + offset;
+				return (__force void *)va;
+			}
+		}
+	}
+
+	/* handle static DDR reserved memory regions */
+	for (i = 0; i < kproc->num_rmems; i++) {
+		dev_addr = kproc->rmem[i].dev_addr;
+		size = kproc->rmem[i].size;
+
+		if (da >= dev_addr && ((da + len) <= (dev_addr + size))) {
+			offset = da - dev_addr;
+			va = kproc->rmem[i].cpu_addr + offset;
+			return (__force void *)va;
+		}
+	}
+
+	return NULL;
+}
+
+static const struct rproc_ops k3_dsp_rproc_ops = {
+	.start		= k3_dsp_rproc_start,
+	.stop		= k3_dsp_rproc_stop,
+	.kick		= k3_dsp_rproc_kick,
+	.da_to_va	= k3_dsp_rproc_da_to_va,
+};
+
+static const char *k3_dsp_rproc_get_firmware(struct device *dev)
+{
+	const char *fw_name;
+	int ret;
+
+	ret = of_property_read_string(dev->of_node, "firmware-name",
+				      &fw_name);
+	if (ret) {
+		dev_err(dev, "failed to parse firmware-name property, ret = %d\n",
+			ret);
+		return ERR_PTR(ret);
+	}
+
+	return fw_name;
+}
+
+static int k3_dsp_rproc_of_get_memories(struct platform_device *pdev,
+					struct k3_dsp_rproc *kproc)
+{
+	const struct k3_dsp_dev_data *data = kproc->data;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int num_mems = 0;
+	int i;
+
+	num_mems = kproc->data->num_mems;
+	kproc->mem = devm_kcalloc(kproc->dev, num_mems,
+				  sizeof(*kproc->mem), GFP_KERNEL);
+	if (!kproc->mem)
+		return -ENOMEM;
+
+	for (i = 0; i < num_mems; i++) {
+		res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+						   data->mems[i].name);
+		if (!res) {
+			dev_err(dev, "found no memory resource for %s\n",
+				data->mems[i].name);
+			return -EINVAL;
+		}
+		if (!devm_request_mem_region(dev, res->start,
+					     resource_size(res),
+					     dev_name(dev))) {
+			dev_err(dev, "could not request %s region for resource\n",
+				data->mems[i].name);
+			return -EBUSY;
+		}
+
+		kproc->mem[i].cpu_addr = devm_ioremap_wc(dev, res->start,
+							 resource_size(res));
+		if (IS_ERR(kproc->mem[i].cpu_addr)) {
+			dev_err(dev, "failed to map %s memory\n",
+				data->mems[i].name);
+			return PTR_ERR(kproc->mem[i].cpu_addr);
+		}
+		kproc->mem[i].bus_addr = res->start;
+		kproc->mem[i].dev_addr = data->mems[i].dev_addr;
+		kproc->mem[i].size = resource_size(res);
+
+		dev_dbg(dev, "memory %8s: bus addr %pa size 0x%zx va %pK da 0x%x\n",
+			data->mems[i].name, &kproc->mem[i].bus_addr,
+			kproc->mem[i].size, kproc->mem[i].cpu_addr,
+			kproc->mem[i].dev_addr);
+
+		/* zero out memories to start in a pristine state */
+		/*
+		 * FIXME: comment out until kernel crash is fixed, possible
+		 * issue with local resets.
+		 * memset((__force void *)kproc->mem[i].cpu_addr, 0,
+		 *      kproc->mem[i].size);
+		 */
+	}
+	kproc->num_mems = num_mems;
+
+	return 0;
+}
+
+static int k3_dsp_reserved_mem_init(struct k3_dsp_rproc *kproc)
+{
+	struct device *dev = kproc->dev;
+	struct device_node *np = dev->of_node;
+	struct device_node *rmem_np;
+	struct reserved_mem *rmem;
+	int num_rmems;
+	int ret, i;
+
+	num_rmems = of_property_count_elems_of_size(np, "memory-region",
+						    sizeof(phandle));
+	if (num_rmems <= 0) {
+		dev_err(dev, "device does not reserved memory regions, ret = %d\n",
+			num_rmems);
+		return -EINVAL;
+	}
+	if (num_rmems < 2) {
+		dev_err(dev, "device needs atleast two memory regions to be defined, num = %d\n",
+			num_rmems);
+		return -EINVAL;
+	}
+
+	/* use reserved memory region 0 for vring DMA allocations */
+	ret = of_reserved_mem_device_init_by_idx(dev, np, 0);
+	if (ret) {
+		dev_err(dev, "device cannot initialize DMA pool, ret = %d\n",
+			ret);
+		return ret;
+	}
+
+	num_rmems--;
+	kproc->rmem = kcalloc(num_rmems, sizeof(*kproc->rmem), GFP_KERNEL);
+	if (!kproc->rmem) {
+		ret = -ENOMEM;
+		goto release_rmem;
+	}
+
+	/* use remaining reserved memory regions for static carveouts */
+	for (i = 0; i < num_rmems; i++) {
+		rmem_np = of_parse_phandle(np, "memory-region", i + 1);
+		if (!rmem_np) {
+			ret = -EINVAL;
+			goto unmap_rmem;
+		}
+
+		rmem = of_reserved_mem_lookup(rmem_np);
+		if (!rmem) {
+			of_node_put(rmem_np);
+			ret = -EINVAL;
+			goto unmap_rmem;
+		}
+		of_node_put(rmem_np);
+
+		kproc->rmem[i].bus_addr = rmem->base;
+		/* 64-bit address regions currently not supported */
+		kproc->rmem[i].dev_addr = (u32)rmem->base;
+		kproc->rmem[i].size = rmem->size;
+		kproc->rmem[i].cpu_addr = ioremap_wc(rmem->base, rmem->size);
+		if (!kproc->rmem[i].cpu_addr) {
+			dev_err(dev, "failed to map reserved memory#%d at %pa of size %pa\n",
+				i + 1, &rmem->base, &rmem->size);
+			ret = -ENOMEM;
+			goto unmap_rmem;
+		}
+
+		dev_dbg(dev, "reserved memory%d: bus addr %pa size 0x%zx va %pK da 0x%x\n",
+			i + 1, &kproc->rmem[i].bus_addr,
+			kproc->rmem[i].size, kproc->rmem[i].cpu_addr,
+			kproc->rmem[i].dev_addr);
+	}
+	kproc->num_rmems = num_rmems;
+
+	return 0;
+
+unmap_rmem:
+	for (i--; i >= 0; i--) {
+		if (kproc->rmem[i].cpu_addr)
+			iounmap(kproc->rmem[i].cpu_addr);
+	}
+	kfree(kproc->rmem);
+release_rmem:
+	of_reserved_mem_device_release(kproc->dev);
+	return ret;
+}
+
+static void k3_dsp_reserved_mem_exit(struct k3_dsp_rproc *kproc)
+{
+	int i;
+
+	for (i = 0; i < kproc->num_rmems; i++)
+		iounmap(kproc->rmem[i].cpu_addr);
+	kfree(kproc->rmem);
+
+	of_reserved_mem_device_release(kproc->dev);
+}
+
+static
+struct ti_sci_proc *k3_dsp_rproc_of_get_tsp(struct device *dev,
+					    const struct ti_sci_handle *sci)
+{
+	struct ti_sci_proc *tsp;
+	u32 temp[2];
+	int ret;
+
+	ret = of_property_read_u32_array(dev->of_node, "ti,sci-proc-ids",
+					 temp, 2);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	tsp = kzalloc(sizeof(*tsp), GFP_KERNEL);
+	if (!tsp)
+		return ERR_PTR(-ENOMEM);
+
+	tsp->dev = dev;
+	tsp->sci = sci;
+	tsp->ops = &sci->ops.proc_ops;
+	tsp->proc_id = temp[0];
+	tsp->host_id = temp[1];
+
+	return tsp;
+}
+
+static int k3_dsp_rproc_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device_node *np = dev->of_node;
+	const struct k3_dsp_dev_data *data;
+	struct k3_dsp_rproc *kproc;
+	struct rproc *rproc;
+	const char *fw_name;
+	int ret = 0;
+	int ret1;
+
+	data = of_device_get_match_data(dev);
+	if (!data)
+		return -ENODEV;
+
+	fw_name = k3_dsp_rproc_get_firmware(dev);
+	if (IS_ERR(fw_name))
+		return PTR_ERR(fw_name);
+
+	rproc = rproc_alloc(dev, dev_name(dev), &k3_dsp_rproc_ops, fw_name,
+			    sizeof(*kproc));
+	if (!rproc)
+		return -ENOMEM;
+
+	rproc->has_iommu = false;
+	rproc->recovery_disabled = true;
+	kproc = rproc->priv;
+	kproc->rproc = rproc;
+	kproc->dev = dev;
+	kproc->data = data;
+
+	kproc->ti_sci = ti_sci_get_by_phandle(np, "ti,sci");
+	if (IS_ERR(kproc->ti_sci)) {
+		ret = PTR_ERR(kproc->ti_sci);
+		if (ret != -EPROBE_DEFER) {
+			dev_err(dev, "failed to get ti-sci handle, ret = %d\n",
+				ret);
+		}
+		kproc->ti_sci = NULL;
+		goto free_rproc;
+	}
+
+	ret = of_property_read_u32(np, "ti,sci-dev-id", &kproc->ti_sci_id);
+	if (ret) {
+		dev_err(dev, "missing 'ti,sci-dev-id' property\n");
+		goto put_sci;
+	}
+
+	kproc->reset = devm_reset_control_get_exclusive(dev, NULL);
+	if (IS_ERR(kproc->reset)) {
+		ret = PTR_ERR(kproc->reset);
+		dev_err(dev, "failed to get reset, status = %d\n", ret);
+		goto put_sci;
+	}
+
+	kproc->tsp = k3_dsp_rproc_of_get_tsp(dev, kproc->ti_sci);
+	if (IS_ERR(kproc->tsp)) {
+		dev_err(dev, "failed to construct ti-sci proc control, ret = %d\n",
+			ret);
+		ret = PTR_ERR(kproc->tsp);
+		goto put_sci;
+	}
+
+	ret = ti_sci_proc_request(kproc->tsp);
+	if (ret < 0) {
+		dev_err(dev, "ti_sci_proc_request failed, ret = %d\n", ret);
+		goto free_tsp;
+	}
+
+	pm_runtime_enable(dev);
+	ret = pm_runtime_get_sync(dev);
+	if (ret < 0) {
+		dev_err(dev, "failed to enable clock, status = %d\n", ret);
+		pm_runtime_put_noidle(dev);
+		goto disable_rpm;
+	}
+
+	ret = k3_dsp_rproc_of_get_memories(pdev, kproc);
+	if (ret)
+		goto disable_clk;
+
+	ret = k3_dsp_reserved_mem_init(kproc);
+	if (ret) {
+		dev_err(dev, "reserved memory init failed, ret = %d\n", ret);
+		goto disable_clk;
+	}
+
+	ret = rproc_add(rproc);
+	if (ret) {
+		dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
+			ret);
+		goto release_mem;
+	}
+
+	platform_set_drvdata(pdev, kproc);
+
+	return 0;
+
+release_mem:
+	k3_dsp_reserved_mem_exit(kproc);
+disable_clk:
+	pm_runtime_put_sync(dev);
+disable_rpm:
+	pm_runtime_disable(dev);
+	ret1 = ti_sci_proc_release(kproc->tsp);
+	if (ret1)
+		dev_err(dev, "failed to release proc, ret = %d\n", ret1);
+free_tsp:
+	kfree(kproc->tsp);
+put_sci:
+	ret1 = ti_sci_put_handle(kproc->ti_sci);
+	if (ret1)
+		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret1);
+free_rproc:
+	rproc_free(rproc);
+	return ret;
+}
+
+static int k3_dsp_rproc_remove(struct platform_device *pdev)
+{
+	struct k3_dsp_rproc *kproc = platform_get_drvdata(pdev);
+	struct device *dev = &pdev->dev;
+	int ret;
+
+	rproc_del(kproc->rproc);
+	pm_runtime_put_sync(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
+
+	ret = ti_sci_proc_release(kproc->tsp);
+	if (ret)
+		dev_err(dev, "failed to release proc, ret = %d\n", ret);
+
+	kfree(kproc->tsp);
+
+	ret = ti_sci_put_handle(kproc->ti_sci);
+	if (ret)
+		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret);
+
+	k3_dsp_reserved_mem_exit(kproc);
+	rproc_free(kproc->rproc);
+
+	return 0;
+}
+
+static const struct k3_dsp_mem_data c66_mems[] = {
+	{ .name = "l2sram", .dev_addr = 0x800000 },
+	{ .name = "l1pram", .dev_addr = 0xe00000 },
+	{ .name = "l1dram", .dev_addr = 0xf00000 },
+};
+
+static const struct k3_dsp_dev_data c66_data = {
+	.mems = c66_mems,
+	.num_mems = ARRAY_SIZE(c66_mems),
+	.boot_align_addr = SZ_1K,
+	.uses_lreset = true,
+};
+
+static const struct of_device_id k3_dsp_of_match[] = {
+	{ .compatible = "ti,j721e-c66-dsp", .data = &c66_data, },
+	{ /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, k3_dsp_of_match);
+
+static struct platform_driver k3_dsp_rproc_driver = {
+	.probe	= k3_dsp_rproc_probe,
+	.remove	= k3_dsp_rproc_remove,
+	.driver	= {
+		.name = "k3-dsp-rproc",
+		.of_match_table = k3_dsp_of_match,
+	},
+};
+
+module_platform_driver(k3_dsp_rproc_driver);
+
+MODULE_AUTHOR("Suman Anna <s-anna@ti.com>");
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("TI K3 DSP Remoteproc driver");
-- 
2.23.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on C66x DSPs
  2020-03-25 20:18 [PATCH 0/3] TI K3 DSP remoteproc driver for C66x DSPs Suman Anna
  2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
  2020-03-25 20:18 ` [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs Suman Anna
@ 2020-03-25 20:18 ` Suman Anna
  2020-04-28 19:58   ` Mathieu Poirier
  2 siblings, 1 reply; 13+ messages in thread
From: Suman Anna @ 2020-03-25 20:18 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Herring, Mathieu Poirier
  Cc: Lokesh Vutla, linux-remoteproc, devicetree, linux-arm-kernel,
	linux-kernel, Suman Anna

The resets for the DSP processors on K3 SoCs are managed through the
Power and Sleep Controller (PSC) module. Each DSP typically has two
resets - a global module reset for powering on the device, and a local
reset that affects only the CPU while allowing access to the other
sub-modules within the DSP processor sub-systems.

The C66x DSPs have two levels of internal RAMs that can be used to
boot from, and the firmware loading into these RAMs require the
local reset to be asserted with the device powered on/enabled using
the module reset. Enhance the K3 DSP remoteproc driver to add support
for loading into the internal RAMs. The local reset is deasserted on
SoC power-on-reset, so logic has to be added in probe in remoteproc
mode to balance the remoteproc state-machine.

Note that the local resets are a no-op on C71x cores, and the hardware
does not supporting loading into its internal RAMs.

Signed-off-by: Suman Anna <s-anna@ti.com>
---
 drivers/remoteproc/ti_k3_dsp_remoteproc.c | 82 +++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
index fd0d84f46f90..7b712ef74611 100644
--- a/drivers/remoteproc/ti_k3_dsp_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
@@ -175,6 +175,9 @@ static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
 		return ret;
 	}
 
+	if (kproc->data->uses_lreset)
+		return ret;
+
 	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
 						    kproc->ti_sci_id);
 	if (ret) {
@@ -192,6 +195,9 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
 	struct device *dev = kproc->dev;
 	int ret;
 
+	if (kproc->data->uses_lreset)
+		goto lreset;
+
 	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
 						   kproc->ti_sci_id);
 	if (ret) {
@@ -199,6 +205,7 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
 		return ret;
 	}
 
+lreset:
 	ret = reset_control_deassert(kproc->reset);
 	if (ret) {
 		dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
@@ -210,6 +217,63 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
 	return ret;
 }
 
+/*
+ * The C66x DSP cores have a local reset that affects only the CPU, and a
+ * generic module reset that powers on the device and allows the DSP internal
+ * memories to be accessed while the local reset is asserted. This function is
+ * used to release the global reset on C66x DSPs to allow loading into the DSP
+ * internal RAMs. The .prepare() ops is invoked by remoteproc core before any
+ * firmware loading, and is followed by the .start() ops after loading to
+ * actually let the C66x DSP cores run. The local reset on C71x cores is a
+ * no-op and the global reset cannot be released on C71x cores until after
+ * the firmware images are loaded, so this function does nothing for C71x cores.
+ */
+static int k3_dsp_rproc_prepare(struct rproc *rproc)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+	struct device *dev = kproc->dev;
+	int ret;
+
+	/* local reset is no-op on C71x processors */
+	if (!kproc->data->uses_lreset)
+		return 0;
+
+	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
+						    kproc->ti_sci_id);
+	if (ret)
+		dev_err(dev, "module-reset deassert failed, cannot enable internal RAM loading, ret = %d\n",
+			ret);
+
+	return ret;
+}
+
+/*
+ * This function implements the .unprepare() ops and performs the complimentary
+ * operations to that of the .prepare() ops. The function is used to assert the
+ * global reset on applicable C66x cores. This completes the second portion of
+ * powering down the C66x DSP cores. The cores themselves are only halted in the
+ * .stop() callback through the local reset, and the .unprepare() ops is invoked
+ * by the remoteproc core after the remoteproc is stopped to balance the global
+ * reset.
+ */
+static int k3_dsp_rproc_unprepare(struct rproc *rproc)
+{
+	struct k3_dsp_rproc *kproc = rproc->priv;
+	struct device *dev = kproc->dev;
+	int ret;
+
+	/* local reset is no-op on C71x processors */
+	if (!kproc->data->uses_lreset)
+		return 0;
+
+	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
+						    kproc->ti_sci_id);
+	if (ret)
+		dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
+
+	return ret;
+}
+
 /*
  * Power up the DSP remote processor.
  *
@@ -353,6 +417,8 @@ static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
 }
 
 static const struct rproc_ops k3_dsp_rproc_ops = {
+	.prepare	= k3_dsp_rproc_prepare,
+	.unprepare	= k3_dsp_rproc_unprepare,
 	.start		= k3_dsp_rproc_start,
 	.stop		= k3_dsp_rproc_stop,
 	.kick		= k3_dsp_rproc_kick,
@@ -644,6 +710,22 @@ static int k3_dsp_rproc_probe(struct platform_device *pdev)
 		goto disable_clk;
 	}
 
+	/*
+	 * ensure the DSP local reset is asserted to ensure the DSP doesn't
+	 * execute bogus code in .prepare() when the module reset is released.
+	 */
+	if (data->uses_lreset) {
+		ret = reset_control_status(kproc->reset);
+		if (ret < 0) {
+			dev_err(dev, "failed to get reset status, status = %d\n",
+				ret);
+			goto release_mem;
+		} else if (ret == 0) {
+			dev_warn(dev, "local reset is deasserted for device\n");
+			k3_dsp_rproc_reset(kproc);
+		}
+	}
+
 	ret = rproc_add(rproc);
 	if (ret) {
 		dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
-- 
2.23.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs
  2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
@ 2020-03-26 16:54   ` Rob Herring
  2020-04-27 19:49   ` Mathieu Poirier
  1 sibling, 0 replies; 13+ messages in thread
From: Rob Herring @ 2020-03-26 16:54 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Mathieu Poirier, Lokesh Vutla,
	linux-remoteproc, devicetree, linux-arm-kernel, linux-kernel

On Wed, 25 Mar 2020 15:18:37 -0500, Suman Anna wrote:
> Some Texas Instruments K3 family of SoCs have one of more Digital Signal
> Processor (DSP) subsystems that are comprised of either a TMS320C66x
> CorePac and/or a next-generation TMS320C71x CorePac processor subsystem.
> Add the device tree bindings document for the C66x DSP devices on these
> SoCs. The added example illustrates the DT nodes for the first C66x DSP
> device present on the K3 J721E family of SoCs.
> 
> Signed-off-by: Suman Anna <s-anna@ti.com>
> ---
>  .../bindings/remoteproc/ti,k3-dsp-rproc.yaml  | 180 ++++++++++++++++++
>  1 file changed, 180 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
> 

My bot found errors running 'make dt_binding_check' on your patch:

Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.example.dts:23.13-20: Warning (ranges_format): /example-0/reserved-memory:ranges: empty "ranges" property but its #address-cells (2) differs from /example-0 (1)
Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.example.dts:23.13-20: Warning (ranges_format): /example-0/reserved-memory:ranges: empty "ranges" property but its #size-cells (2) differs from /example-0 (1)
Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.example.dts:42.13-43.72: Warning (ranges_format): /example-0/interconnect@100000:ranges: "ranges" property has invalid length (48 bytes) (parent #address-cells == 1, child #address-cells == 2, #size-cells == 2)
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.example.dt.yaml: interconnect@100000: $nodename:0: 'interconnect@100000' does not match '^(bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$'

See https://patchwork.ozlabs.org/patch/1261640

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure dt-schema is up to date:

pip3 install git+https://github.com/devicetree-org/dt-schema.git@master --upgrade

Please check and re-submit.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs
  2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
  2020-03-26 16:54   ` Rob Herring
@ 2020-04-27 19:49   ` Mathieu Poirier
  2020-05-13 17:20     ` Suman Anna
  1 sibling, 1 reply; 13+ messages in thread
From: Mathieu Poirier @ 2020-04-27 19:49 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, linux-kernel

Hi Suman,

I have started to review this set - comments will come over the next few days.

On Wed, Mar 25, 2020 at 03:18:37PM -0500, Suman Anna wrote:
> Some Texas Instruments K3 family of SoCs have one of more Digital Signal
> Processor (DSP) subsystems that are comprised of either a TMS320C66x
> CorePac and/or a next-generation TMS320C71x CorePac processor subsystem.
> Add the device tree bindings document for the C66x DSP devices on these
> SoCs. The added example illustrates the DT nodes for the first C66x DSP
> device present on the K3 J721E family of SoCs.
> 
> Signed-off-by: Suman Anna <s-anna@ti.com>
> ---
>  .../bindings/remoteproc/ti,k3-dsp-rproc.yaml  | 180 ++++++++++++++++++
>  1 file changed, 180 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
> new file mode 100644
> index 000000000000..416e3abe7937
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
> @@ -0,0 +1,180 @@
> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/remoteproc/ti,k3-dsp-rproc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: TI K3 DSP devices
> +
> +maintainers:
> +  - Suman Anna <s-anna@ti.com>
> +
> +description: |
> +  The TI K3 family of SoCs usually have one or more TI DSP Core sub-systems
> +  that are used to offload some of the processor-intensive tasks or algorithms,
> +  for achieving various system level goals.
> +
> +  These processor sub-systems usually contain additional sub-modules like
> +  L1 and/or L2 caches/SRAMs, an Interrupt Controller, an external memory
> +  controller, a dedicated local power/sleep controller etc. The DSP processor
> +  cores in the K3 SoCs are usually either a TMS320C66x CorePac processor or a
> +  TMS320C71x CorePac processor.
> +
> +  Each DSP Core sub-system is represented as a single DT node. Each node has a
> +  number of required or optional properties that enable the OS running on the
> +  host processor (Arm CorePac) to perform the device management of the remote
> +  processor and to communicate with the remote processor.
> +
> +properties:
> +  compatible:
> +    const: ti,j721e-c66-dsp
> +    description:
> +      Use "ti,j721e-c66-dsp" for C66x DSPs on K3 J721E SoCs
> +
> +  reg:
> +    description: |
> +      Should contain an entry for each value in 'reg-names'.
> +      Each entry should have the memory region's start address
> +      and the size of the region, the representation matching
> +      the parent node's '#address-cells' and '#size-cells' values.
> +    minItems: 3
> +    maxItems: 3
> +
> +  reg-names:
> +    description: |
> +      Should contain strings with the names of the specific internal
> +      internal memory regions, and should be defined in this order

The word "internal" is found twice in a row.

> +    maxItems: 3
> +    items:
> +      - const: l2sram
> +      - const: l1pram
> +      - const: l1dram
> +
> +  ti,sci:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description:
> +      Should be a phandle to the TI-SCI System Controller node
> +
> +  ti,sci-dev-id:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    description: |
> +      Should contain the TI-SCI device id corresponding to the DSP core.
> +      Please refer to the corresponding System Controller documentation
> +      for valid values for the DSP cores.
> +
> +  ti,sci-proc-ids:
> +    description: Should contain a single tuple of <proc_id host_id>.
> +    allOf:
> +      - $ref: /schemas/types.yaml#/definitions/uint32-matrix
> +      - maxItems: 1
> +        items:
> +          items:
> +            - description: TI-SCI processor id for the DSP core device
> +            - description: TI-SCI host id to which processor control
> +                           ownership should be transferred to
> +
> +  resets:
> +    description: |
> +      Should contain the phandle to the reset controller node
> +      managing the resets for this device, and a reset
> +      specifier. Please refer to the following reset bindings
> +      for the reset argument specifier,
> +      Documentation/devicetree/bindings/reset/ti,sci-reset.txt
> +
> +  firmware-name:
> +    description: |
> +      Should contain the name of the default firmware image
> +      file located on the firmware search path
> +
> +  mboxes:
> +    description: |
> +      OMAP Mailbox specifier denoting the sub-mailbox, to be used for
> +      communication with the remote processor. This property should match
> +      with the sub-mailbox node used in the firmware image. The specifier
> +      format is as per the bindings,
> +      Documentation/devicetree/bindings/mailbox/omap-mailbox.txt
> +
> +  memory-region:
> +    minItems: 2
> +    description: |
> +      phandle to the reserved memory nodes to be associated with the remoteproc
> +      device. There should be atleast two reserved memory nodes defined - the
> +      first one would be used for dynamic DMA allocations like vrings and vring
> +      buffers, and the remaining ones used for the firmware image sections. The
> +      reserved memory nodes should be carveout nodes, and should be defined as
> +      per the bindings in
> +      Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +
> +# Optional properties:
> +# --------------------
> +
> +  sram:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    description: |
> +      pHandles to one or more reserved on-chip SRAM regions. The regions

s/pHandle/phandle

Thanks,
Mathieu

> +      should be defined as child nodes of the respective SRAM node, and
> +      should be defined as per the generic bindings in,
> +      Documentation/devicetree/bindings/sram/sram.yaml
> +
> +required:
> + - compatible
> + - reg
> + - reg-names
> + - ti,sci
> + - ti,sci-dev-id
> + - ti,sci-proc-ids
> + - resets
> + - firmware-name
> + - mboxes
> + - memory-region
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +
> +    //Example: J721E SoC
> +    /* DSP Carveout reserved memory nodes */
> +    reserved-memory {
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +        ranges;
> +
> +        c66_0_dma_memory_region: c66-dma-memory@a6000000 {
> +            compatible = "shared-dma-pool";
> +            reg = <0x00 0xa6000000 0x00 0x100000>;
> +            no-map;
> +        };
> +
> +        c66_0_memory_region: c66-memory@a6100000 {
> +            compatible = "shared-dma-pool";
> +            reg = <0x00 0xa6100000 0x00 0xf00000>;
> +            no-map;
> +        };
> +    };
> +
> +    cbass_main: interconnect@100000 {
> +        compatible = "simple-bus";
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +        ranges = <0x4d 0x80800000 0x4d 0x80800000 0x00 0x00800000>, /* C66_0 */
> +                 <0x4d 0x81800000 0x4d 0x81800000 0x00 0x00800000>; /* C66_1 */
> +
> +        /* J721E C66_0 DSP node */
> +        c66_0: dsp@4d80800000 {
> +            compatible = "ti,j721e-c66-dsp";
> +            reg = <0x4d 0x80800000 0x00 0x00048000>,
> +                  <0x4d 0x80e00000 0x00 0x00008000>,
> +                  <0x4d 0x80f00000 0x00 0x00008000>;
> +            reg-names = "l2sram", "l1pram", "l1dram";
> +            ti,sci = <&dmsc>;
> +            ti,sci-dev-id = <142>;
> +            ti,sci-proc-ids = <0x03 0xFF>;
> +            resets = <&k3_reset 142 1>;
> +            firmware-name = "j7-c66_0-fw";
> +            memory-region = <&c66_0_dma_memory_region>,
> +                            <&c66_0_memory_region>;
> +            mboxes = <&mailbox0_cluster3 &mbox_c66_0>;
> +        };
> +    };
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs
  2020-03-25 20:18 ` [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs Suman Anna
@ 2020-04-27 22:57   ` Mathieu Poirier
  2020-05-13 18:14     ` Suman Anna
  0 siblings, 1 reply; 13+ messages in thread
From: Mathieu Poirier @ 2020-04-27 22:57 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, linux-kernel

On Wed, Mar 25, 2020 at 03:18:38PM -0500, Suman Anna wrote:
> The Texas Instrument's K3 J721E SoCs have two C66x DSP Subsystems in MAIN
> voltage domain that are based on the TI's standard TMS320C66x DSP CorePac
> module. Each subsystem has a Fixed/Floating-Point DSP CPU, with 32 KB each
> of L1P & L1D SRAMs that can be configured and partitioned as either RAM
> and/or Cache, and 288 KB of L2 SRAM with 256 KB of memory configurable as
> either RAM and/or Cache. The CorePac also includes an Internal DMA (IDMA),
> External Memory Controller (EMC), Extended Memory Controller (XMC) with a
> Region Address Translator (RAT) unit for 32-bit to 48-bit address
> extension/translations, an Interrupt Controller (INTC) and a Powerdown
> Controller (PDC).
> 
> A new remoteproc module is added to perform the device management of
> these DSP devices. The support is limited to images using only external
> DDR memory at the moment, the loading support to internal memories and
> any on-chip RAM memories will be added in a subsequent patch. RAT support
> is also left for a future patch, and as such the reserved memory carveout
> regions are all expected to be using memory regions within the first 2 GB.
> Error Recovery and Power Management features are not currently supported.
> 
> The C66x remote processors do not have an MMU, and so require fixed memory
> carveout regions matching the firmware image addresses. Support for this
> is provided by mandating multiple memory regions to be attached to the
> remoteproc device. The first memory region will be used to serve as the
> DMA pool for all dynamic allocations like the vrings and vring buffers.
> The remaining memory regions are mapped into the kernel at device probe
> time, and are used to provide address translations for firmware image
> segments without the need for any RSC_CARVEOUT entries. Any firmware
> image using memory outside of the supplied reserved memory carveout
> regions will be errored out.
> 
> The driver uses various TI-SCI interfaces to talk to the System Controller
> (DMSC) for managing configuration, power and reset management of these
> cores. IPC between the A72 cores and the DSP cores is supported through
> the virtio rpmsg stack using shared memory and OMAP Mailboxes.
> 
> Signed-off-by: Suman Anna <s-anna@ti.com>
> ---
>  drivers/remoteproc/Kconfig                |  16 +
>  drivers/remoteproc/Makefile               |   1 +
>  drivers/remoteproc/ti_k3_dsp_remoteproc.c | 736 ++++++++++++++++++++++
>  3 files changed, 753 insertions(+)
>  create mode 100644 drivers/remoteproc/ti_k3_dsp_remoteproc.c
> 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 073048b4c0fb..66a76acb15b6 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -240,6 +240,22 @@ config TI_K3_R5_REMOTEPROC
>  	  It's safe to say N here if you're not interested in utilizing
>  	  a slave processor
>  
> +config TI_K3_DSP_REMOTEPROC
> +	tristate "TI K3 DSP remoteproc support"
> +	depends on ARCH_K3
> +	select MAILBOX
> +	select OMAP2PLUS_MBOX
> +	help
> +	  Say y here to support TI's C66x and C71x DSP remote processor
> +	  subsystems on various TI K3 family of SoCs through the remote
> +	  processor framework.
> +
> +	  You want to say m here in order to offload some processing
> +	  tasks to these processors.

Building this driver has a module, i.e 'm', has nothing to do with what the
remote processor does.  I would simply remove the above 2 lines.

> +
> +	  It's safe to say N here if you're not interested in utilizing
> +	  the DSP slave processors.
> +
>  endif # REMOTEPROC
>  
>  endmenu
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index 00ba826818af..eb51cc09e47b 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -29,3 +29,4 @@ obj-$(CONFIG_ST_REMOTEPROC)		+= st_remoteproc.o
>  obj-$(CONFIG_ST_SLIM_REMOTEPROC)	+= st_slim_rproc.o
>  obj-$(CONFIG_STM32_RPROC)		+= stm32_rproc.o
>  obj-$(CONFIG_TI_K3_R5_REMOTEPROC)	+= ti_k3_r5_remoteproc.o
> +obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
> diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> new file mode 100644
> index 000000000000..fd0d84f46f90
> --- /dev/null
> +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> @@ -0,0 +1,736 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * TI K3 DSP Remote Processor(s) driver
> + *
> + * Copyright (C) 2018-2020 Texas Instruments Incorporated - http://www.ti.com/
> + *	Suman Anna <s-anna@ti.com>
> + */
> +
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/of_reserved_mem.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/remoteproc.h>
> +#include <linux/mailbox_client.h>
> +#include <linux/omap-mailbox.h>

Please move these two up.

> +#include <linux/reset.h>
> +#include <linux/soc/ti/ti_sci_protocol.h>
> +
> +#include "omap_remoteproc.h"
> +#include "remoteproc_internal.h"
> +#include "ti_sci_proc.h"
> +
> +#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK	(SZ_16M - 1)
> +
> +/**
> + * struct k3_dsp_rproc_mem - internal memory structure
> + * @cpu_addr: MPU virtual address of the memory region
> + * @bus_addr: Bus address used to access the memory region
> + * @dev_addr: Device address of the memory region from DSP view
> + * @size: Size of the memory region
> + */
> +struct k3_dsp_rproc_mem {

I would rename this 'k3_dsp_mem' to be consistent with k3_r5_mem.

> +	void __iomem *cpu_addr;
> +	phys_addr_t bus_addr;
> +	u32 dev_addr;
> +	size_t size;
> +};
> +
> +/**
> + * struct k3_dsp_mem_data - memory definitions for a DSP
> + * @name: name for this memory entry
> + * @dev_addr: device address for the memory entry
> + */
> +struct k3_dsp_mem_data {
> +	const char *name;
> +	const u32 dev_addr;
> +};
> +
> +/**
> + * struct k3_dsp_dev_data - device data structure for a DSP
> + * @mems: pointer to memory definitions for a DSP
> + * @num_mems: number of memory regions in @mems
> + * @boot_align_addr: boot vector address alignment granularity
> + * @uses_lreset: flag to denote the need for local reset management
> + */
> +struct k3_dsp_dev_data {
> +	const struct k3_dsp_mem_data *mems;
> +	u32 num_mems;
> +	u32 boot_align_addr;
> +	bool uses_lreset;
> +};
> +
> +/**
> + * struct k3_dsp_rproc - k3 DSP remote processor driver structure
> + * @dev: cached device pointer
> + * @rproc: remoteproc device handle
> + * @mem: internal memory regions data
> + * @num_mems: number of internal memory regions
> + * @rmem: reserved memory regions data
> + * @num_rmems: number of reserved memory regions
> + * @reset: reset control handle
> + * @data: pointer to DSP-specific device data
> + * @tsp: TI-SCI processor control handle
> + * @ti_sci: TI-SCI handle
> + * @ti_sci_id: TI-SCI device identifier
> + * @mbox: mailbox channel handle
> + * @client: mailbox client to request the mailbox channel
> + */
> +struct k3_dsp_rproc {
> +	struct device *dev;
> +	struct rproc *rproc;
> +	struct k3_dsp_rproc_mem *mem;
> +	int num_mems;
> +	struct k3_dsp_rproc_mem *rmem;
> +	int num_rmems;
> +	struct reset_control *reset;
> +	const struct k3_dsp_dev_data *data;
> +	struct ti_sci_proc *tsp;
> +	const struct ti_sci_handle *ti_sci;
> +	u32 ti_sci_id;
> +	struct mbox_chan *mbox;
> +	struct mbox_client client;
> +};
> +
> +/**
> + * k3_dsp_rproc_mbox_callback() - inbound mailbox message handler
> + * @client: mailbox client pointer used for requesting the mailbox channel
> + * @data: mailbox payload
> + *
> + * This handler is invoked by the OMAP mailbox driver whenever a mailbox
> + * message is received. Usually, the mailbox payload simply contains
> + * the index of the virtqueue that is kicked by the remote processor,
> + * and we let remoteproc core handle it.
> + *
> + * In addition to virtqueue indices, we also have some out-of-band values
> + * that indicate different events. Those values are deliberately very
> + * large so they don't coincide with virtqueue indices.
> + */
> +static void k3_dsp_rproc_mbox_callback(struct mbox_client *client, void *data)
> +{
> +	struct k3_dsp_rproc *kproc = container_of(client, struct k3_dsp_rproc,
> +						client);

Indentation problem.

> +	struct device *dev = kproc->rproc->dev.parent;
> +	const char *name = kproc->rproc->name;
> +	u32 msg = omap_mbox_message(data);
> +
> +	dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> +
> +	switch (msg) {
> +	case RP_MBOX_CRASH:
> +		/*
> +		 * remoteproc detected an exception, but error recovery is not
> +		 * supported. So, just log this for now
> +		 */
> +		dev_err(dev, "K3 DSP rproc %s crashed\n", name);
> +		break;
> +	case RP_MBOX_ECHO_REPLY:
> +		dev_info(dev, "received echo reply from %s\n", name);
> +		break;
> +	default:
> +		/* silently handle all other valid messages */
> +		if (msg >= RP_MBOX_READY && msg < RP_MBOX_END_MSG)
> +			return;
> +		if (msg > kproc->rproc->max_notifyid) {
> +			dev_dbg(dev, "dropping unknown message 0x%x", msg);
> +			return;
> +		}
> +		/* msg contains the index of the triggered vring */
> +		if (rproc_vq_interrupt(kproc->rproc, msg) == IRQ_NONE)
> +			dev_dbg(dev, "no message was found in vqid %d\n", msg);
> +	}
> +}
> +
> +/*
> + * Kick the remote processor to notify about pending unprocessed messages.
> + * The vqid usage is not used and is inconsequential, as the kick is performed
> + * through a simulated GPIO (a bit in an IPC interrupt-triggering register),
> + * the remote processor is expected to process both its Tx and Rx virtqueues.
> + */
> +static void k3_dsp_rproc_kick(struct rproc *rproc, int vqid)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +	struct device *dev = rproc->dev.parent;
> +	mbox_msg_t msg = (mbox_msg_t)vqid;
> +	int ret;
> +
> +	/* send the index of the triggered virtqueue in the mailbox payload */
> +	ret = mbox_send_message(kproc->mbox, (void *)msg);
> +	if (ret < 0)
> +		dev_err(dev, "failed to send mailbox message, status = %d\n",
> +			ret);
> +}
> +
> +/* Put the DSP processor into reset */
> +static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
> +{
> +	struct device *dev = kproc->dev;
> +	int ret;
> +
> +	ret = reset_control_assert(kproc->reset);
> +	if (ret) {
> +		dev_err(dev, "local-reset assert failed, ret = %d\n", ret);
> +		return ret;
> +	}
> +
> +	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> +						    kproc->ti_sci_id);
> +	if (ret) {
> +		dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
> +		if (reset_control_deassert(kproc->reset))
> +			dev_warn(dev, "local-reset deassert back failed\n");
> +	}
> +
> +	return ret;
> +}
> +
> +/* Release the DSP processor from reset */
> +static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
> +{
> +	struct device *dev = kproc->dev;
> +	int ret;
> +
> +	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
> +						   kproc->ti_sci_id);

Indentation problem.

> +	if (ret) {
> +		dev_err(dev, "module-reset deassert failed, ret = %d\n", ret);
> +		return ret;
> +	}
> +
> +	ret = reset_control_deassert(kproc->reset);
> +	if (ret) {
> +		dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
> +		if (kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> +							  kproc->ti_sci_id))
> +			dev_warn(dev, "module-reset assert back failed\n");
> +	}
> +
> +	return ret;
> +}
> +
> +/*
> + * Power up the DSP remote processor.
> + *
> + * This function will be invoked only after the firmware for this rproc
> + * was loaded, parsed successfully, and all of its resource requirements
> + * were met.
> + */
> +static int k3_dsp_rproc_start(struct rproc *rproc)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +	struct mbox_client *client = &kproc->client;
> +	struct device *dev = kproc->dev;
> +	u32 boot_addr;
> +	int ret;
> +
> +	client->dev = dev;
> +	client->tx_done = NULL;
> +	client->rx_callback = k3_dsp_rproc_mbox_callback;
> +	client->tx_block = false;
> +	client->knows_txdone = false;
> +
> +	kproc->mbox = mbox_request_channel(client, 0);
> +	if (IS_ERR(kproc->mbox)) {
> +		ret = -EBUSY;
> +		dev_err(dev, "mbox_request_channel failed: %ld\n",
> +			PTR_ERR(kproc->mbox));
> +		return ret;
> +	}
> +
> +	/*
> +	 * Ping the remote processor, this is only for sanity-sake for now;
> +	 * there is no functional effect whatsoever.
> +	 *
> +	 * Note that the reply will _not_ arrive immediately: this message
> +	 * will wait in the mailbox fifo until the remote processor is booted.
> +	 */
> +	ret = mbox_send_message(kproc->mbox, (void *)RP_MBOX_ECHO_REQUEST);
> +	if (ret < 0) {
> +		dev_err(dev, "mbox_send_message failed: %d\n", ret);
> +		goto put_mbox;
> +	}
> +
> +	boot_addr = rproc->bootaddr;
> +	if (boot_addr & (kproc->data->boot_align_addr - 1)) {
> +		dev_err(dev, "invalid boot address 0x%x, must be aligned on a 0x%x boundary\n",
> +			boot_addr, kproc->data->boot_align_addr);
> +		ret = -EINVAL;
> +		goto put_mbox;
> +	}
> +
> +	dev_err(dev, "booting DSP core using boot addr = 0x%x\n", boot_addr);
> +	ret = ti_sci_proc_set_config(kproc->tsp, boot_addr, 0, 0);
> +	if (ret)
> +		goto put_mbox;
> +
> +	ret = k3_dsp_rproc_release(kproc);
> +	if (ret)
> +		goto put_mbox;
> +
> +	return 0;
> +
> +put_mbox:
> +	mbox_free_channel(kproc->mbox);
> +	return ret;
> +}
> +
> +/*
> + * Stop the DSP remote processor.
> + *
> + * This function puts the DSP processor into reset, and finishes processing
> + * of any pending messages.
> + */
> +static int k3_dsp_rproc_stop(struct rproc *rproc)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +
> +	mbox_free_channel(kproc->mbox);
> +
> +	k3_dsp_rproc_reset(kproc);
> +
> +	return 0;
> +}
> +
> +/*
> + * Custom function to translate a DSP device address (internal RAMs only) to a
> + * kernel virtual address.  The DSPs can access their RAMs at either an internal
> + * address visible only from a DSP, or at the SoC-level bus address. Both these
> + * addresses need to be looked through for translation. The translated addresses
> + * can be used either by the remoteproc core for loading (when using kernel
> + * remoteproc loader), or by any rpmsg bus drivers.
> + */
> +static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +	void __iomem *va = NULL;
> +	phys_addr_t bus_addr;
> +	u32 dev_addr, offset;
> +	size_t size;
> +	int i;
> +
> +	if (len == 0)
> +		return NULL;
> +
> +	for (i = 0; i < kproc->num_mems; i++) {
> +		bus_addr = kproc->mem[i].bus_addr;
> +		dev_addr = kproc->mem[i].dev_addr;
> +		size = kproc->mem[i].size;
> +
> +		if (da < KEYSTONE_RPROC_LOCAL_ADDRESS_MASK) {
> +			/* handle DSP-view addresses */
> +			if (da >= dev_addr &&
> +			    ((da + len) <= (dev_addr + size))) {
> +				offset = da - dev_addr;
> +				va = kproc->mem[i].cpu_addr + offset;
> +				return (__force void *)va;
> +			}
> +		} else {
> +			/* handle SoC-view addresses */
> +			if (da >= bus_addr &&
> +			    (da + len) <= (bus_addr + size)) {
> +				offset = da - bus_addr;
> +				va = kproc->mem[i].cpu_addr + offset;
> +				return (__force void *)va;
> +			}
> +		}
> +	}
> +
> +	/* handle static DDR reserved memory regions */
> +	for (i = 0; i < kproc->num_rmems; i++) {
> +		dev_addr = kproc->rmem[i].dev_addr;
> +		size = kproc->rmem[i].size;
> +
> +		if (da >= dev_addr && ((da + len) <= (dev_addr + size))) {
> +			offset = da - dev_addr;
> +			va = kproc->rmem[i].cpu_addr + offset;
> +			return (__force void *)va;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static const struct rproc_ops k3_dsp_rproc_ops = {
> +	.start		= k3_dsp_rproc_start,
> +	.stop		= k3_dsp_rproc_stop,
> +	.kick		= k3_dsp_rproc_kick,
> +	.da_to_va	= k3_dsp_rproc_da_to_va,
> +};
> +
> +static const char *k3_dsp_rproc_get_firmware(struct device *dev)
> +{
> +	const char *fw_name;
> +	int ret;
> +
> +	ret = of_property_read_string(dev->of_node, "firmware-name",
> +				      &fw_name);
> +	if (ret) {
> +		dev_err(dev, "failed to parse firmware-name property, ret = %d\n",
> +			ret);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return fw_name;
> +}

The above is a carbon copy of k3_r5_rproc_get_firmware().  Please reuse the same
function.

> +
> +static int k3_dsp_rproc_of_get_memories(struct platform_device *pdev,
> +					struct k3_dsp_rproc *kproc)
> +{
> +	const struct k3_dsp_dev_data *data = kproc->data;
> +	struct device *dev = &pdev->dev;
> +	struct resource *res;
> +	int num_mems = 0;
> +	int i;
> +
> +	num_mems = kproc->data->num_mems;
> +	kproc->mem = devm_kcalloc(kproc->dev, num_mems,
> +				  sizeof(*kproc->mem), GFP_KERNEL);
> +	if (!kproc->mem)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < num_mems; i++) {
> +		res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> +						   data->mems[i].name);
> +		if (!res) {
> +			dev_err(dev, "found no memory resource for %s\n",
> +				data->mems[i].name);
> +			return -EINVAL;
> +		}
> +		if (!devm_request_mem_region(dev, res->start,
> +					     resource_size(res),
> +					     dev_name(dev))) {
> +			dev_err(dev, "could not request %s region for resource\n",
> +				data->mems[i].name);
> +			return -EBUSY;
> +		}
> +
> +		kproc->mem[i].cpu_addr = devm_ioremap_wc(dev, res->start,
> +							 resource_size(res));
> +		if (IS_ERR(kproc->mem[i].cpu_addr)) {
> +			dev_err(dev, "failed to map %s memory\n",
> +				data->mems[i].name);
> +			return PTR_ERR(kproc->mem[i].cpu_addr);
> +		}
> +		kproc->mem[i].bus_addr = res->start;
> +		kproc->mem[i].dev_addr = data->mems[i].dev_addr;
> +		kproc->mem[i].size = resource_size(res);
> +
> +		dev_dbg(dev, "memory %8s: bus addr %pa size 0x%zx va %pK da 0x%x\n",
> +			data->mems[i].name, &kproc->mem[i].bus_addr,
> +			kproc->mem[i].size, kproc->mem[i].cpu_addr,
> +			kproc->mem[i].dev_addr);
> +
> +		/* zero out memories to start in a pristine state */
> +		/*
> +		 * FIXME: comment out until kernel crash is fixed, possible
> +		 * issue with local resets.
> +		 * memset((__force void *)kproc->mem[i].cpu_addr, 0,
> +		 *      kproc->mem[i].size);
> +		 */

Things still work without zero'ing out the memory?  As such is it mandatory to
do so? Function k3_r5_core_of_get_internal_memories does not do a memset().  And
didn't Peng also had this problem?

> +	}
> +	kproc->num_mems = num_mems;
> +
> +	return 0;
> +}
> +
> +static int k3_dsp_reserved_mem_init(struct k3_dsp_rproc *kproc)
> +{
> +	struct device *dev = kproc->dev;
> +	struct device_node *np = dev->of_node;
> +	struct device_node *rmem_np;
> +	struct reserved_mem *rmem;
> +	int num_rmems;
> +	int ret, i;
> +
> +	num_rmems = of_property_count_elems_of_size(np, "memory-region",
> +						    sizeof(phandle));
> +	if (num_rmems <= 0) {
> +		dev_err(dev, "device does not reserved memory regions, ret = %d\n",
> +			num_rmems);
> +		return -EINVAL;
> +	}
> +	if (num_rmems < 2) {
> +		dev_err(dev, "device needs atleast two memory regions to be defined, num = %d\n",
> +			num_rmems);
> +		return -EINVAL;
> +	}
> +
> +	/* use reserved memory region 0 for vring DMA allocations */
> +	ret = of_reserved_mem_device_init_by_idx(dev, np, 0);
> +	if (ret) {
> +		dev_err(dev, "device cannot initialize DMA pool, ret = %d\n",
> +			ret);
> +		return ret;
> +	}
> +
> +	num_rmems--;
> +	kproc->rmem = kcalloc(num_rmems, sizeof(*kproc->rmem), GFP_KERNEL);
> +	if (!kproc->rmem) {
> +		ret = -ENOMEM;
> +		goto release_rmem;
> +	}
> +
> +	/* use remaining reserved memory regions for static carveouts */
> +	for (i = 0; i < num_rmems; i++) {
> +		rmem_np = of_parse_phandle(np, "memory-region", i + 1);
> +		if (!rmem_np) {
> +			ret = -EINVAL;
> +			goto unmap_rmem;
> +		}
> +
> +		rmem = of_reserved_mem_lookup(rmem_np);
> +		if (!rmem) {
> +			of_node_put(rmem_np);
> +			ret = -EINVAL;
> +			goto unmap_rmem;
> +		}
> +		of_node_put(rmem_np);
> +
> +		kproc->rmem[i].bus_addr = rmem->base;
> +		/* 64-bit address regions currently not supported */
> +		kproc->rmem[i].dev_addr = (u32)rmem->base;
> +		kproc->rmem[i].size = rmem->size;
> +		kproc->rmem[i].cpu_addr = ioremap_wc(rmem->base, rmem->size);
> +		if (!kproc->rmem[i].cpu_addr) {
> +			dev_err(dev, "failed to map reserved memory#%d at %pa of size %pa\n",
> +				i + 1, &rmem->base, &rmem->size);
> +			ret = -ENOMEM;
> +			goto unmap_rmem;
> +		}
> +
> +		dev_dbg(dev, "reserved memory%d: bus addr %pa size 0x%zx va %pK da 0x%x\n",
> +			i + 1, &kproc->rmem[i].bus_addr,
> +			kproc->rmem[i].size, kproc->rmem[i].cpu_addr,
> +			kproc->rmem[i].dev_addr);
> +	}
> +	kproc->num_rmems = num_rmems;
> +
> +	return 0;
> +
> +unmap_rmem:
> +	for (i--; i >= 0; i--) {
> +		if (kproc->rmem[i].cpu_addr)
> +			iounmap(kproc->rmem[i].cpu_addr);
> +	}
> +	kfree(kproc->rmem);
> +release_rmem:
> +	of_reserved_mem_device_release(kproc->dev);
> +	return ret;
> +}

Other than the type of structure passed to the function, this is an exact
replica of k3_r5_reserved_mem_init().  Do you foresee either of them changing
to a point where reusing code would be counter productive?  I think we are right
on the edge where duplication is better than using the same function.

> +
> +static void k3_dsp_reserved_mem_exit(struct k3_dsp_rproc *kproc)
> +{
> +	int i;
> +
> +	for (i = 0; i < kproc->num_rmems; i++)
> +		iounmap(kproc->rmem[i].cpu_addr);
> +	kfree(kproc->rmem);
> +
> +	of_reserved_mem_device_release(kproc->dev);
> +}
> +
> +static
> +struct ti_sci_proc *k3_dsp_rproc_of_get_tsp(struct device *dev,
> +					    const struct ti_sci_handle *sci)
> +{
> +	struct ti_sci_proc *tsp;
> +	u32 temp[2];
> +	int ret;
> +
> +	ret = of_property_read_u32_array(dev->of_node, "ti,sci-proc-ids",
> +					 temp, 2);
> +	if (ret < 0)
> +		return ERR_PTR(ret);
> +
> +	tsp = kzalloc(sizeof(*tsp), GFP_KERNEL);
> +	if (!tsp)
> +		return ERR_PTR(-ENOMEM);
> +
> +	tsp->dev = dev;
> +	tsp->sci = sci;
> +	tsp->ops = &sci->ops.proc_ops;
> +	tsp->proc_id = temp[0];
> +	tsp->host_id = temp[1];
> +
> +	return tsp;
> +}

Contrary to k3_dsp_reserved_mem_init(), this one can definitely be reused for
both c66 and r5.

> +
> +static int k3_dsp_rproc_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct device_node *np = dev->of_node;
> +	const struct k3_dsp_dev_data *data;
> +	struct k3_dsp_rproc *kproc;
> +	struct rproc *rproc;
> +	const char *fw_name;
> +	int ret = 0;
> +	int ret1;
> +
> +	data = of_device_get_match_data(dev);
> +	if (!data)
> +		return -ENODEV;
> +
> +	fw_name = k3_dsp_rproc_get_firmware(dev);
> +	if (IS_ERR(fw_name))
> +		return PTR_ERR(fw_name);
> +
> +	rproc = rproc_alloc(dev, dev_name(dev), &k3_dsp_rproc_ops, fw_name,
> +			    sizeof(*kproc));
> +	if (!rproc)
> +		return -ENOMEM;
> +
> +	rproc->has_iommu = false;
> +	rproc->recovery_disabled = true;
> +	kproc = rproc->priv;
> +	kproc->rproc = rproc;
> +	kproc->dev = dev;
> +	kproc->data = data;
> +
> +	kproc->ti_sci = ti_sci_get_by_phandle(np, "ti,sci");
> +	if (IS_ERR(kproc->ti_sci)) {
> +		ret = PTR_ERR(kproc->ti_sci);
> +		if (ret != -EPROBE_DEFER) {
> +			dev_err(dev, "failed to get ti-sci handle, ret = %d\n",
> +				ret);
> +		}
> +		kproc->ti_sci = NULL;
> +		goto free_rproc;
> +	}
> +
> +	ret = of_property_read_u32(np, "ti,sci-dev-id", &kproc->ti_sci_id);
> +	if (ret) {
> +		dev_err(dev, "missing 'ti,sci-dev-id' property\n");
> +		goto put_sci;
> +	}
> +
> +	kproc->reset = devm_reset_control_get_exclusive(dev, NULL);
> +	if (IS_ERR(kproc->reset)) {
> +		ret = PTR_ERR(kproc->reset);
> +		dev_err(dev, "failed to get reset, status = %d\n", ret);
> +		goto put_sci;
> +	}
> +
> +	kproc->tsp = k3_dsp_rproc_of_get_tsp(dev, kproc->ti_sci);
> +	if (IS_ERR(kproc->tsp)) {
> +		dev_err(dev, "failed to construct ti-sci proc control, ret = %d\n",
> +			ret);
> +		ret = PTR_ERR(kproc->tsp);
> +		goto put_sci;
> +	}
> +
> +	ret = ti_sci_proc_request(kproc->tsp);
> +	if (ret < 0) {
> +		dev_err(dev, "ti_sci_proc_request failed, ret = %d\n", ret);
> +		goto free_tsp;
> +	}
> +
> +	pm_runtime_enable(dev);
> +	ret = pm_runtime_get_sync(dev);

What do these give you since the dev_pm_ops is not set for the
k3_dsp_rproc_driver platform diver and there is no clock specified in the DT?

Thanks,
Mathieu

> +	if (ret < 0) {
> +		dev_err(dev, "failed to enable clock, status = %d\n", ret);
> +		pm_runtime_put_noidle(dev);
> +		goto disable_rpm;
> +	}
> +
> +	ret = k3_dsp_rproc_of_get_memories(pdev, kproc);
> +	if (ret)
> +		goto disable_clk;
> +
> +	ret = k3_dsp_reserved_mem_init(kproc);
> +	if (ret) {
> +		dev_err(dev, "reserved memory init failed, ret = %d\n", ret);
> +		goto disable_clk;
> +	}
> +
> +	ret = rproc_add(rproc);
> +	if (ret) {
> +		dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
> +			ret);
> +		goto release_mem;
> +	}
> +
> +	platform_set_drvdata(pdev, kproc);
> +
> +	return 0;
> +
> +release_mem:
> +	k3_dsp_reserved_mem_exit(kproc);
> +disable_clk:
> +	pm_runtime_put_sync(dev);
> +disable_rpm:
> +	pm_runtime_disable(dev);
> +	ret1 = ti_sci_proc_release(kproc->tsp);
> +	if (ret1)
> +		dev_err(dev, "failed to release proc, ret = %d\n", ret1);
> +free_tsp:
> +	kfree(kproc->tsp);
> +put_sci:
> +	ret1 = ti_sci_put_handle(kproc->ti_sci);
> +	if (ret1)
> +		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret1);
> +free_rproc:
> +	rproc_free(rproc);
> +	return ret;
> +}
> +
> +static int k3_dsp_rproc_remove(struct platform_device *pdev)
> +{
> +	struct k3_dsp_rproc *kproc = platform_get_drvdata(pdev);
> +	struct device *dev = &pdev->dev;
> +	int ret;
> +
> +	rproc_del(kproc->rproc);
> +	pm_runtime_put_sync(&pdev->dev);
> +	pm_runtime_disable(&pdev->dev);
> +
> +	ret = ti_sci_proc_release(kproc->tsp);
> +	if (ret)
> +		dev_err(dev, "failed to release proc, ret = %d\n", ret);
> +
> +	kfree(kproc->tsp);
> +
> +	ret = ti_sci_put_handle(kproc->ti_sci);
> +	if (ret)
> +		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret);
> +
> +	k3_dsp_reserved_mem_exit(kproc);
> +	rproc_free(kproc->rproc);
> +
> +	return 0;
> +}
> +
> +static const struct k3_dsp_mem_data c66_mems[] = {
> +	{ .name = "l2sram", .dev_addr = 0x800000 },
> +	{ .name = "l1pram", .dev_addr = 0xe00000 },
> +	{ .name = "l1dram", .dev_addr = 0xf00000 },
> +};
> +
> +static const struct k3_dsp_dev_data c66_data = {
> +	.mems = c66_mems,
> +	.num_mems = ARRAY_SIZE(c66_mems),
> +	.boot_align_addr = SZ_1K,
> +	.uses_lreset = true,
> +};
> +
> +static const struct of_device_id k3_dsp_of_match[] = {
> +	{ .compatible = "ti,j721e-c66-dsp", .data = &c66_data, },
> +	{ /* sentinel */ },
> +};
> +MODULE_DEVICE_TABLE(of, k3_dsp_of_match);
> +
> +static struct platform_driver k3_dsp_rproc_driver = {
> +	.probe	= k3_dsp_rproc_probe,
> +	.remove	= k3_dsp_rproc_remove,
> +	.driver	= {
> +		.name = "k3-dsp-rproc",
> +		.of_match_table = k3_dsp_of_match,
> +	},
> +};
> +
> +module_platform_driver(k3_dsp_rproc_driver);
> +
> +MODULE_AUTHOR("Suman Anna <s-anna@ti.com>");
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("TI K3 DSP Remoteproc driver");
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on C66x DSPs
  2020-03-25 20:18 ` [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on " Suman Anna
@ 2020-04-28 19:58   ` Mathieu Poirier
  2020-04-28 20:09     ` Mathieu Poirier
  0 siblings, 1 reply; 13+ messages in thread
From: Mathieu Poirier @ 2020-04-28 19:58 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, linux-kernel

On Wed, Mar 25, 2020 at 03:18:39PM -0500, Suman Anna wrote:
> The resets for the DSP processors on K3 SoCs are managed through the
> Power and Sleep Controller (PSC) module. Each DSP typically has two
> resets - a global module reset for powering on the device, and a local
> reset that affects only the CPU while allowing access to the other
> sub-modules within the DSP processor sub-systems.
> 
> The C66x DSPs have two levels of internal RAMs that can be used to
> boot from, and the firmware loading into these RAMs require the
> local reset to be asserted with the device powered on/enabled using
> the module reset. Enhance the K3 DSP remoteproc driver to add support
> for loading into the internal RAMs. The local reset is deasserted on
> SoC power-on-reset, so logic has to be added in probe in remoteproc
> mode to balance the remoteproc state-machine.
> 
> Note that the local resets are a no-op on C71x cores, and the hardware
> does not supporting loading into its internal RAMs.
> 
> Signed-off-by: Suman Anna <s-anna@ti.com>
> ---
>  drivers/remoteproc/ti_k3_dsp_remoteproc.c | 82 +++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> index fd0d84f46f90..7b712ef74611 100644
> --- a/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> @@ -175,6 +175,9 @@ static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
>  		return ret;
>  	}
>  
> +	if (kproc->data->uses_lreset)
> +		return ret;
> +
>  	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
>  						    kproc->ti_sci_id);
>  	if (ret) {
> @@ -192,6 +195,9 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>  	struct device *dev = kproc->dev;
>  	int ret;
>  
> +	if (kproc->data->uses_lreset)
> +		goto lreset;
> +
>  	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
>  						   kproc->ti_sci_id);
>  	if (ret) {
> @@ -199,6 +205,7 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>  		return ret;
>  	}
>  
> +lreset:
>  	ret = reset_control_deassert(kproc->reset);
>  	if (ret) {
>  		dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
> @@ -210,6 +217,63 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>  	return ret;
>  }
>  
> +/*
> + * The C66x DSP cores have a local reset that affects only the CPU, and a
> + * generic module reset that powers on the device and allows the DSP internal
> + * memories to be accessed while the local reset is asserted. This function is
> + * used to release the global reset on C66x DSPs to allow loading into the DSP
> + * internal RAMs. The .prepare() ops is invoked by remoteproc core before any
> + * firmware loading, and is followed by the .start() ops after loading to
> + * actually let the C66x DSP cores run. The local reset on C71x cores is a
> + * no-op and the global reset cannot be released on C71x cores until after
> + * the firmware images are loaded, so this function does nothing for C71x cores.
> + */
> +static int k3_dsp_rproc_prepare(struct rproc *rproc)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +	struct device *dev = kproc->dev;
> +	int ret;
> +
> +	/* local reset is no-op on C71x processors */
> +	if (!kproc->data->uses_lreset)
> +		return 0;

In k3_dsp_rproc_release() the condition is "if (kproc->data->uses_lreset)" and
here it is the opposite, which did a good job at getting me confused.

Taking a step back, I assume c71 DSPs will have their own k3_dsp_dev_data where
the users_lreset flag will be false.  In that case I think it would make the
code easier to understand if the k3_dsp_rproc_ops was declared without the
.prepare and .unprepare.  In probe(), if data->uses_lreset is true then
k3_dsp_rproc_prepare() and k3_dsp_rproc_unprepare() are set.

I am done reviewing this set.

Thanks,
Mathieu

> +
> +	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
> +						    kproc->ti_sci_id);
> +	if (ret)
> +		dev_err(dev, "module-reset deassert failed, cannot enable internal RAM loading, ret = %d\n",
> +			ret);
> +
> +	return ret;
> +}
> +
> +/*
> + * This function implements the .unprepare() ops and performs the complimentary
> + * operations to that of the .prepare() ops. The function is used to assert the
> + * global reset on applicable C66x cores. This completes the second portion of
> + * powering down the C66x DSP cores. The cores themselves are only halted in the
> + * .stop() callback through the local reset, and the .unprepare() ops is invoked
> + * by the remoteproc core after the remoteproc is stopped to balance the global
> + * reset.
> + */
> +static int k3_dsp_rproc_unprepare(struct rproc *rproc)
> +{
> +	struct k3_dsp_rproc *kproc = rproc->priv;
> +	struct device *dev = kproc->dev;
> +	int ret;
> +
> +	/* local reset is no-op on C71x processors */
> +	if (!kproc->data->uses_lreset)
> +		return 0;
> +
> +	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> +						    kproc->ti_sci_id);
> +	if (ret)
> +		dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
> +
> +	return ret;
> +}
> +
>  /*
>   * Power up the DSP remote processor.
>   *
> @@ -353,6 +417,8 @@ static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
>  }
>  
>  static const struct rproc_ops k3_dsp_rproc_ops = {
> +	.prepare	= k3_dsp_rproc_prepare,
> +	.unprepare	= k3_dsp_rproc_unprepare,
>  	.start		= k3_dsp_rproc_start,
>  	.stop		= k3_dsp_rproc_stop,
>  	.kick		= k3_dsp_rproc_kick,
> @@ -644,6 +710,22 @@ static int k3_dsp_rproc_probe(struct platform_device *pdev)
>  		goto disable_clk;
>  	}
>  
> +	/*
> +	 * ensure the DSP local reset is asserted to ensure the DSP doesn't
> +	 * execute bogus code in .prepare() when the module reset is released.
> +	 */
> +	if (data->uses_lreset) {
> +		ret = reset_control_status(kproc->reset);
> +		if (ret < 0) {
> +			dev_err(dev, "failed to get reset status, status = %d\n",
> +				ret);
> +			goto release_mem;
> +		} else if (ret == 0) {
> +			dev_warn(dev, "local reset is deasserted for device\n");
> +			k3_dsp_rproc_reset(kproc);
> +		}
> +	}
> +
>  	ret = rproc_add(rproc);
>  	if (ret) {
>  		dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on C66x DSPs
  2020-04-28 19:58   ` Mathieu Poirier
@ 2020-04-28 20:09     ` Mathieu Poirier
  2020-05-13 22:31       ` Suman Anna
  0 siblings, 1 reply; 13+ messages in thread
From: Mathieu Poirier @ 2020-04-28 20:09 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, Linux Kernel Mailing List

On Tue, 28 Apr 2020 at 13:58, Mathieu Poirier
<mathieu.poirier@linaro.org> wrote:
>
> On Wed, Mar 25, 2020 at 03:18:39PM -0500, Suman Anna wrote:
> > The resets for the DSP processors on K3 SoCs are managed through the
> > Power and Sleep Controller (PSC) module. Each DSP typically has two
> > resets - a global module reset for powering on the device, and a local
> > reset that affects only the CPU while allowing access to the other
> > sub-modules within the DSP processor sub-systems.
> >
> > The C66x DSPs have two levels of internal RAMs that can be used to
> > boot from, and the firmware loading into these RAMs require the
> > local reset to be asserted with the device powered on/enabled using
> > the module reset. Enhance the K3 DSP remoteproc driver to add support
> > for loading into the internal RAMs. The local reset is deasserted on
> > SoC power-on-reset, so logic has to be added in probe in remoteproc
> > mode to balance the remoteproc state-machine.
> >
> > Note that the local resets are a no-op on C71x cores, and the hardware
> > does not supporting loading into its internal RAMs.
> >
> > Signed-off-by: Suman Anna <s-anna@ti.com>
> > ---
> >  drivers/remoteproc/ti_k3_dsp_remoteproc.c | 82 +++++++++++++++++++++++
> >  1 file changed, 82 insertions(+)
> >
> > diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> > index fd0d84f46f90..7b712ef74611 100644
> > --- a/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> > +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> > @@ -175,6 +175,9 @@ static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
> >               return ret;
> >       }
> >
> > +     if (kproc->data->uses_lreset)
> > +             return ret;
> > +
> >       ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> >                                                   kproc->ti_sci_id);
> >       if (ret) {
> > @@ -192,6 +195,9 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
> >       struct device *dev = kproc->dev;
> >       int ret;
> >
> > +     if (kproc->data->uses_lreset)
> > +             goto lreset;
> > +
> >       ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
> >                                                  kproc->ti_sci_id);
> >       if (ret) {
> > @@ -199,6 +205,7 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
> >               return ret;
> >       }
> >
> > +lreset:
> >       ret = reset_control_deassert(kproc->reset);
> >       if (ret) {
> >               dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
> > @@ -210,6 +217,63 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
> >       return ret;
> >  }
> >
> > +/*
> > + * The C66x DSP cores have a local reset that affects only the CPU, and a
> > + * generic module reset that powers on the device and allows the DSP internal
> > + * memories to be accessed while the local reset is asserted. This function is
> > + * used to release the global reset on C66x DSPs to allow loading into the DSP
> > + * internal RAMs. The .prepare() ops is invoked by remoteproc core before any
> > + * firmware loading, and is followed by the .start() ops after loading to
> > + * actually let the C66x DSP cores run. The local reset on C71x cores is a
> > + * no-op and the global reset cannot be released on C71x cores until after
> > + * the firmware images are loaded, so this function does nothing for C71x cores.
> > + */
> > +static int k3_dsp_rproc_prepare(struct rproc *rproc)
> > +{
> > +     struct k3_dsp_rproc *kproc = rproc->priv;
> > +     struct device *dev = kproc->dev;
> > +     int ret;
> > +
> > +     /* local reset is no-op on C71x processors */
> > +     if (!kproc->data->uses_lreset)
> > +             return 0;
>
> In k3_dsp_rproc_release() the condition is "if (kproc->data->uses_lreset)" and
> here it is the opposite, which did a good job at getting me confused.
>
> Taking a step back, I assume c71 DSPs will have their own k3_dsp_dev_data where
> the users_lreset flag will be false.  In that case I think it would make the
> code easier to understand if the k3_dsp_rproc_ops was declared without the
> .prepare and .unprepare.  In probe(), if data->uses_lreset is true then
> k3_dsp_rproc_prepare() and k3_dsp_rproc_unprepare() are set.
>

I forgot... Since this is a C71 related change, was there a reason to
lump it with the C66 set?  If not I would simply move that to the C71
work.

> I am done reviewing this set.
>
> Thanks,
> Mathieu
>
> > +
> > +     ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
> > +                                                 kproc->ti_sci_id);
> > +     if (ret)
> > +             dev_err(dev, "module-reset deassert failed, cannot enable internal RAM loading, ret = %d\n",
> > +                     ret);
> > +
> > +     return ret;
> > +}
> > +
> > +/*
> > + * This function implements the .unprepare() ops and performs the complimentary
> > + * operations to that of the .prepare() ops. The function is used to assert the
> > + * global reset on applicable C66x cores. This completes the second portion of
> > + * powering down the C66x DSP cores. The cores themselves are only halted in the
> > + * .stop() callback through the local reset, and the .unprepare() ops is invoked
> > + * by the remoteproc core after the remoteproc is stopped to balance the global
> > + * reset.
> > + */
> > +static int k3_dsp_rproc_unprepare(struct rproc *rproc)
> > +{
> > +     struct k3_dsp_rproc *kproc = rproc->priv;
> > +     struct device *dev = kproc->dev;
> > +     int ret;
> > +
> > +     /* local reset is no-op on C71x processors */
> > +     if (!kproc->data->uses_lreset)
> > +             return 0;
> > +
> > +     ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> > +                                                 kproc->ti_sci_id);
> > +     if (ret)
> > +             dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
> > +
> > +     return ret;
> > +}
> > +
> >  /*
> >   * Power up the DSP remote processor.
> >   *
> > @@ -353,6 +417,8 @@ static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
> >  }
> >
> >  static const struct rproc_ops k3_dsp_rproc_ops = {
> > +     .prepare        = k3_dsp_rproc_prepare,
> > +     .unprepare      = k3_dsp_rproc_unprepare,
> >       .start          = k3_dsp_rproc_start,
> >       .stop           = k3_dsp_rproc_stop,
> >       .kick           = k3_dsp_rproc_kick,
> > @@ -644,6 +710,22 @@ static int k3_dsp_rproc_probe(struct platform_device *pdev)
> >               goto disable_clk;
> >       }
> >
> > +     /*
> > +      * ensure the DSP local reset is asserted to ensure the DSP doesn't
> > +      * execute bogus code in .prepare() when the module reset is released.
> > +      */
> > +     if (data->uses_lreset) {
> > +             ret = reset_control_status(kproc->reset);
> > +             if (ret < 0) {
> > +                     dev_err(dev, "failed to get reset status, status = %d\n",
> > +                             ret);
> > +                     goto release_mem;
> > +             } else if (ret == 0) {
> > +                     dev_warn(dev, "local reset is deasserted for device\n");
> > +                     k3_dsp_rproc_reset(kproc);
> > +             }
> > +     }
> > +
> >       ret = rproc_add(rproc);
> >       if (ret) {
> >               dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
> > --
> > 2.23.0
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs
  2020-04-27 19:49   ` Mathieu Poirier
@ 2020-05-13 17:20     ` Suman Anna
  0 siblings, 0 replies; 13+ messages in thread
From: Suman Anna @ 2020-05-13 17:20 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, linux-kernel

On 4/27/20 2:49 PM, Mathieu Poirier wrote:
> Hi Suman,
> 
> I have started to review this set - comments will come over the next few days.
> 
> On Wed, Mar 25, 2020 at 03:18:37PM -0500, Suman Anna wrote:
>> Some Texas Instruments K3 family of SoCs have one of more Digital Signal
>> Processor (DSP) subsystems that are comprised of either a TMS320C66x
>> CorePac and/or a next-generation TMS320C71x CorePac processor subsystem.
>> Add the device tree bindings document for the C66x DSP devices on these
>> SoCs. The added example illustrates the DT nodes for the first C66x DSP
>> device present on the K3 J721E family of SoCs.
>>
>> Signed-off-by: Suman Anna <s-anna@ti.com>
>> ---
>>   .../bindings/remoteproc/ti,k3-dsp-rproc.yaml  | 180 ++++++++++++++++++
>>   1 file changed, 180 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
>> new file mode 100644
>> index 000000000000..416e3abe7937
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml
>> @@ -0,0 +1,180 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/remoteproc/ti,k3-dsp-rproc.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: TI K3 DSP devices
>> +
>> +maintainers:
>> +  - Suman Anna <s-anna@ti.com>
>> +
>> +description: |
>> +  The TI K3 family of SoCs usually have one or more TI DSP Core sub-systems
>> +  that are used to offload some of the processor-intensive tasks or algorithms,
>> +  for achieving various system level goals.
>> +
>> +  These processor sub-systems usually contain additional sub-modules like
>> +  L1 and/or L2 caches/SRAMs, an Interrupt Controller, an external memory
>> +  controller, a dedicated local power/sleep controller etc. The DSP processor
>> +  cores in the K3 SoCs are usually either a TMS320C66x CorePac processor or a
>> +  TMS320C71x CorePac processor.
>> +
>> +  Each DSP Core sub-system is represented as a single DT node. Each node has a
>> +  number of required or optional properties that enable the OS running on the
>> +  host processor (Arm CorePac) to perform the device management of the remote
>> +  processor and to communicate with the remote processor.
>> +
>> +properties:
>> +  compatible:
>> +    const: ti,j721e-c66-dsp
>> +    description:
>> +      Use "ti,j721e-c66-dsp" for C66x DSPs on K3 J721E SoCs
>> +
>> +  reg:
>> +    description: |
>> +      Should contain an entry for each value in 'reg-names'.
>> +      Each entry should have the memory region's start address
>> +      and the size of the region, the representation matching
>> +      the parent node's '#address-cells' and '#size-cells' values.
>> +    minItems: 3
>> +    maxItems: 3
>> +
>> +  reg-names:
>> +    description: |
>> +      Should contain strings with the names of the specific internal
>> +      internal memory regions, and should be defined in this order
> 
> The word "internal" is found twice in a row.
> 
>> +    maxItems: 3
>> +    items:
>> +      - const: l2sram
>> +      - const: l1pram
>> +      - const: l1dram
>> +
>> +  ti,sci:
>> +    $ref: /schemas/types.yaml#/definitions/phandle
>> +    description:
>> +      Should be a phandle to the TI-SCI System Controller node
>> +
>> +  ti,sci-dev-id:
>> +    $ref: /schemas/types.yaml#/definitions/uint32
>> +    description: |
>> +      Should contain the TI-SCI device id corresponding to the DSP core.
>> +      Please refer to the corresponding System Controller documentation
>> +      for valid values for the DSP cores.
>> +
>> +  ti,sci-proc-ids:
>> +    description: Should contain a single tuple of <proc_id host_id>.
>> +    allOf:
>> +      - $ref: /schemas/types.yaml#/definitions/uint32-matrix
>> +      - maxItems: 1
>> +        items:
>> +          items:
>> +            - description: TI-SCI processor id for the DSP core device
>> +            - description: TI-SCI host id to which processor control
>> +                           ownership should be transferred to
>> +
>> +  resets:
>> +    description: |
>> +      Should contain the phandle to the reset controller node
>> +      managing the resets for this device, and a reset
>> +      specifier. Please refer to the following reset bindings
>> +      for the reset argument specifier,
>> +      Documentation/devicetree/bindings/reset/ti,sci-reset.txt
>> +
>> +  firmware-name:
>> +    description: |
>> +      Should contain the name of the default firmware image
>> +      file located on the firmware search path
>> +
>> +  mboxes:
>> +    description: |
>> +      OMAP Mailbox specifier denoting the sub-mailbox, to be used for
>> +      communication with the remote processor. This property should match
>> +      with the sub-mailbox node used in the firmware image. The specifier
>> +      format is as per the bindings,
>> +      Documentation/devicetree/bindings/mailbox/omap-mailbox.txt
>> +
>> +  memory-region:
>> +    minItems: 2
>> +    description: |
>> +      phandle to the reserved memory nodes to be associated with the remoteproc
>> +      device. There should be atleast two reserved memory nodes defined - the
>> +      first one would be used for dynamic DMA allocations like vrings and vring
>> +      buffers, and the remaining ones used for the firmware image sections. The
>> +      reserved memory nodes should be carveout nodes, and should be defined as
>> +      per the bindings in
>> +      Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>> +
>> +# Optional properties:
>> +# --------------------
>> +
>> +  sram:
>> +    $ref: /schemas/types.yaml#/definitions/phandle-array
>> +    minItems: 1
>> +    description: |
>> +      pHandles to one or more reserved on-chip SRAM regions. The regions
> 
> s/pHandle/phandle

Thanks Mathieu, will fix both of these in the next version.

regards
Suman

> 
> Thanks,
> Mathieu
> 
>> +      should be defined as child nodes of the respective SRAM node, and
>> +      should be defined as per the generic bindings in,
>> +      Documentation/devicetree/bindings/sram/sram.yaml
>> +
>> +required:
>> + - compatible
>> + - reg
>> + - reg-names
>> + - ti,sci
>> + - ti,sci-dev-id
>> + - ti,sci-proc-ids
>> + - resets
>> + - firmware-name
>> + - mboxes
>> + - memory-region
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> +  - |
>> +
>> +    //Example: J721E SoC
>> +    /* DSP Carveout reserved memory nodes */
>> +    reserved-memory {
>> +        #address-cells = <2>;
>> +        #size-cells = <2>;
>> +        ranges;
>> +
>> +        c66_0_dma_memory_region: c66-dma-memory@a6000000 {
>> +            compatible = "shared-dma-pool";
>> +            reg = <0x00 0xa6000000 0x00 0x100000>;
>> +            no-map;
>> +        };
>> +
>> +        c66_0_memory_region: c66-memory@a6100000 {
>> +            compatible = "shared-dma-pool";
>> +            reg = <0x00 0xa6100000 0x00 0xf00000>;
>> +            no-map;
>> +        };
>> +    };
>> +
>> +    cbass_main: interconnect@100000 {
>> +        compatible = "simple-bus";
>> +        #address-cells = <2>;
>> +        #size-cells = <2>;
>> +        ranges = <0x4d 0x80800000 0x4d 0x80800000 0x00 0x00800000>, /* C66_0 */
>> +                 <0x4d 0x81800000 0x4d 0x81800000 0x00 0x00800000>; /* C66_1 */
>> +
>> +        /* J721E C66_0 DSP node */
>> +        c66_0: dsp@4d80800000 {
>> +            compatible = "ti,j721e-c66-dsp";
>> +            reg = <0x4d 0x80800000 0x00 0x00048000>,
>> +                  <0x4d 0x80e00000 0x00 0x00008000>,
>> +                  <0x4d 0x80f00000 0x00 0x00008000>;
>> +            reg-names = "l2sram", "l1pram", "l1dram";
>> +            ti,sci = <&dmsc>;
>> +            ti,sci-dev-id = <142>;
>> +            ti,sci-proc-ids = <0x03 0xFF>;
>> +            resets = <&k3_reset 142 1>;
>> +            firmware-name = "j7-c66_0-fw";
>> +            memory-region = <&c66_0_dma_memory_region>,
>> +                            <&c66_0_memory_region>;
>> +            mboxes = <&mailbox0_cluster3 &mbox_c66_0>;
>> +        };
>> +    };
>> -- 
>> 2.23.0
>>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs
  2020-04-27 22:57   ` Mathieu Poirier
@ 2020-05-13 18:14     ` Suman Anna
  2020-05-13 19:40       ` Mathieu Poirier
  0 siblings, 1 reply; 13+ messages in thread
From: Suman Anna @ 2020-05-13 18:14 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, linux-kernel

Hi Mathieu,

On 4/27/20 5:57 PM, Mathieu Poirier wrote:
> On Wed, Mar 25, 2020 at 03:18:38PM -0500, Suman Anna wrote:
>> The Texas Instrument's K3 J721E SoCs have two C66x DSP Subsystems in MAIN
>> voltage domain that are based on the TI's standard TMS320C66x DSP CorePac
>> module. Each subsystem has a Fixed/Floating-Point DSP CPU, with 32 KB each
>> of L1P & L1D SRAMs that can be configured and partitioned as either RAM
>> and/or Cache, and 288 KB of L2 SRAM with 256 KB of memory configurable as
>> either RAM and/or Cache. The CorePac also includes an Internal DMA (IDMA),
>> External Memory Controller (EMC), Extended Memory Controller (XMC) with a
>> Region Address Translator (RAT) unit for 32-bit to 48-bit address
>> extension/translations, an Interrupt Controller (INTC) and a Powerdown
>> Controller (PDC).
>>
>> A new remoteproc module is added to perform the device management of
>> these DSP devices. The support is limited to images using only external
>> DDR memory at the moment, the loading support to internal memories and
>> any on-chip RAM memories will be added in a subsequent patch. RAT support
>> is also left for a future patch, and as such the reserved memory carveout
>> regions are all expected to be using memory regions within the first 2 GB.
>> Error Recovery and Power Management features are not currently supported.
>>
>> The C66x remote processors do not have an MMU, and so require fixed memory
>> carveout regions matching the firmware image addresses. Support for this
>> is provided by mandating multiple memory regions to be attached to the
>> remoteproc device. The first memory region will be used to serve as the
>> DMA pool for all dynamic allocations like the vrings and vring buffers.
>> The remaining memory regions are mapped into the kernel at device probe
>> time, and are used to provide address translations for firmware image
>> segments without the need for any RSC_CARVEOUT entries. Any firmware
>> image using memory outside of the supplied reserved memory carveout
>> regions will be errored out.
>>
>> The driver uses various TI-SCI interfaces to talk to the System Controller
>> (DMSC) for managing configuration, power and reset management of these
>> cores. IPC between the A72 cores and the DSP cores is supported through
>> the virtio rpmsg stack using shared memory and OMAP Mailboxes.
>>
>> Signed-off-by: Suman Anna <s-anna@ti.com>
>> ---
>>   drivers/remoteproc/Kconfig                |  16 +
>>   drivers/remoteproc/Makefile               |   1 +
>>   drivers/remoteproc/ti_k3_dsp_remoteproc.c | 736 ++++++++++++++++++++++
>>   3 files changed, 753 insertions(+)
>>   create mode 100644 drivers/remoteproc/ti_k3_dsp_remoteproc.c
>>
>> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
>> index 073048b4c0fb..66a76acb15b6 100644
>> --- a/drivers/remoteproc/Kconfig
>> +++ b/drivers/remoteproc/Kconfig
>> @@ -240,6 +240,22 @@ config TI_K3_R5_REMOTEPROC
>>   	  It's safe to say N here if you're not interested in utilizing
>>   	  a slave processor
>>   
>> +config TI_K3_DSP_REMOTEPROC
>> +	tristate "TI K3 DSP remoteproc support"
>> +	depends on ARCH_K3
>> +	select MAILBOX
>> +	select OMAP2PLUS_MBOX
>> +	help
>> +	  Say y here to support TI's C66x and C71x DSP remote processor
>> +	  subsystems on various TI K3 family of SoCs through the remote
>> +	  processor framework.
>> +
>> +	  You want to say m here in order to offload some processing
>> +	  tasks to these processors.
> 
> Building this driver has a module, i.e 'm', has nothing to do with what the
> remote processor does.  I would simply remove the above 2 lines.

Yes, can drop. I will switch the "Say y" to "Say m" - that would be the 
preferred option. Having the driver built-in means the firmware has to 
be part of initramfs.

> 
>> +
>> +	  It's safe to say N here if you're not interested in utilizing
>> +	  the DSP slave processors.
>> +
>>   endif # REMOTEPROC
>>   
>>   endmenu
>> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
>> index 00ba826818af..eb51cc09e47b 100644
>> --- a/drivers/remoteproc/Makefile
>> +++ b/drivers/remoteproc/Makefile
>> @@ -29,3 +29,4 @@ obj-$(CONFIG_ST_REMOTEPROC)		+= st_remoteproc.o
>>   obj-$(CONFIG_ST_SLIM_REMOTEPROC)	+= st_slim_rproc.o
>>   obj-$(CONFIG_STM32_RPROC)		+= stm32_rproc.o
>>   obj-$(CONFIG_TI_K3_R5_REMOTEPROC)	+= ti_k3_r5_remoteproc.o
>> +obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
>> diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
>> new file mode 100644
>> index 000000000000..fd0d84f46f90
>> --- /dev/null
>> +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
>> @@ -0,0 +1,736 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * TI K3 DSP Remote Processor(s) driver
>> + *
>> + * Copyright (C) 2018-2020 Texas Instruments Incorporated - http://www.ti.com/
>> + *	Suman Anna <s-anna@ti.com>
>> + */
>> +
>> +#include <linux/io.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +#include <linux/of_reserved_mem.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm_runtime.h>
>> +#include <linux/remoteproc.h>
>> +#include <linux/mailbox_client.h>
>> +#include <linux/omap-mailbox.h>
> 
> Please move these two up.

OK.

> 
>> +#include <linux/reset.h>
>> +#include <linux/soc/ti/ti_sci_protocol.h>
>> +
>> +#include "omap_remoteproc.h"
>> +#include "remoteproc_internal.h"
>> +#include "ti_sci_proc.h"
>> +
>> +#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK	(SZ_16M - 1)
>> +
>> +/**
>> + * struct k3_dsp_rproc_mem - internal memory structure
>> + * @cpu_addr: MPU virtual address of the memory region
>> + * @bus_addr: Bus address used to access the memory region
>> + * @dev_addr: Device address of the memory region from DSP view
>> + * @size: Size of the memory region
>> + */
>> +struct k3_dsp_rproc_mem {
> 
> I would rename this 'k3_dsp_mem' to be consistent with k3_r5_mem.

Yeah, will rename.

> 
>> +	void __iomem *cpu_addr;
>> +	phys_addr_t bus_addr;
>> +	u32 dev_addr;
>> +	size_t size;
>> +};
>> +
>> +/**
>> + * struct k3_dsp_mem_data - memory definitions for a DSP
>> + * @name: name for this memory entry
>> + * @dev_addr: device address for the memory entry
>> + */
>> +struct k3_dsp_mem_data {
>> +	const char *name;
>> +	const u32 dev_addr;
>> +};
>> +
>> +/**
>> + * struct k3_dsp_dev_data - device data structure for a DSP
>> + * @mems: pointer to memory definitions for a DSP
>> + * @num_mems: number of memory regions in @mems
>> + * @boot_align_addr: boot vector address alignment granularity
>> + * @uses_lreset: flag to denote the need for local reset management
>> + */
>> +struct k3_dsp_dev_data {
>> +	const struct k3_dsp_mem_data *mems;
>> +	u32 num_mems;
>> +	u32 boot_align_addr;
>> +	bool uses_lreset;
>> +};
>> +
>> +/**
>> + * struct k3_dsp_rproc - k3 DSP remote processor driver structure
>> + * @dev: cached device pointer
>> + * @rproc: remoteproc device handle
>> + * @mem: internal memory regions data
>> + * @num_mems: number of internal memory regions
>> + * @rmem: reserved memory regions data
>> + * @num_rmems: number of reserved memory regions
>> + * @reset: reset control handle
>> + * @data: pointer to DSP-specific device data
>> + * @tsp: TI-SCI processor control handle
>> + * @ti_sci: TI-SCI handle
>> + * @ti_sci_id: TI-SCI device identifier
>> + * @mbox: mailbox channel handle
>> + * @client: mailbox client to request the mailbox channel
>> + */
>> +struct k3_dsp_rproc {
>> +	struct device *dev;
>> +	struct rproc *rproc;
>> +	struct k3_dsp_rproc_mem *mem;
>> +	int num_mems;
>> +	struct k3_dsp_rproc_mem *rmem;
>> +	int num_rmems;
>> +	struct reset_control *reset;
>> +	const struct k3_dsp_dev_data *data;
>> +	struct ti_sci_proc *tsp;
>> +	const struct ti_sci_handle *ti_sci;
>> +	u32 ti_sci_id;
>> +	struct mbox_chan *mbox;
>> +	struct mbox_client client;
>> +};
>> +
>> +/**
>> + * k3_dsp_rproc_mbox_callback() - inbound mailbox message handler
>> + * @client: mailbox client pointer used for requesting the mailbox channel
>> + * @data: mailbox payload
>> + *
>> + * This handler is invoked by the OMAP mailbox driver whenever a mailbox
>> + * message is received. Usually, the mailbox payload simply contains
>> + * the index of the virtqueue that is kicked by the remote processor,
>> + * and we let remoteproc core handle it.
>> + *
>> + * In addition to virtqueue indices, we also have some out-of-band values
>> + * that indicate different events. Those values are deliberately very
>> + * large so they don't coincide with virtqueue indices.
>> + */
>> +static void k3_dsp_rproc_mbox_callback(struct mbox_client *client, void *data)
>> +{
>> +	struct k3_dsp_rproc *kproc = container_of(client, struct k3_dsp_rproc,
>> +						client);
> 
> Indentation problem.

Thanks. Hmm, checkpatch didn't catch this.

> 
>> +	struct device *dev = kproc->rproc->dev.parent;
>> +	const char *name = kproc->rproc->name;
>> +	u32 msg = omap_mbox_message(data);
>> +
>> +	dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>> +
>> +	switch (msg) {
>> +	case RP_MBOX_CRASH:
>> +		/*
>> +		 * remoteproc detected an exception, but error recovery is not
>> +		 * supported. So, just log this for now
>> +		 */
>> +		dev_err(dev, "K3 DSP rproc %s crashed\n", name);
>> +		break;
>> +	case RP_MBOX_ECHO_REPLY:
>> +		dev_info(dev, "received echo reply from %s\n", name);
>> +		break;
>> +	default:
>> +		/* silently handle all other valid messages */
>> +		if (msg >= RP_MBOX_READY && msg < RP_MBOX_END_MSG)
>> +			return;
>> +		if (msg > kproc->rproc->max_notifyid) {
>> +			dev_dbg(dev, "dropping unknown message 0x%x", msg);
>> +			return;
>> +		}
>> +		/* msg contains the index of the triggered vring */
>> +		if (rproc_vq_interrupt(kproc->rproc, msg) == IRQ_NONE)
>> +			dev_dbg(dev, "no message was found in vqid %d\n", msg);
>> +	}
>> +}
>> +
>> +/*
>> + * Kick the remote processor to notify about pending unprocessed messages.
>> + * The vqid usage is not used and is inconsequential, as the kick is performed
>> + * through a simulated GPIO (a bit in an IPC interrupt-triggering register),
>> + * the remote processor is expected to process both its Tx and Rx virtqueues.
>> + */
>> +static void k3_dsp_rproc_kick(struct rproc *rproc, int vqid)
>> +{
>> +	struct k3_dsp_rproc *kproc = rproc->priv;
>> +	struct device *dev = rproc->dev.parent;
>> +	mbox_msg_t msg = (mbox_msg_t)vqid;
>> +	int ret;
>> +
>> +	/* send the index of the triggered virtqueue in the mailbox payload */
>> +	ret = mbox_send_message(kproc->mbox, (void *)msg);
>> +	if (ret < 0)
>> +		dev_err(dev, "failed to send mailbox message, status = %d\n",
>> +			ret);
>> +}
>> +
>> +/* Put the DSP processor into reset */
>> +static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
>> +{
>> +	struct device *dev = kproc->dev;
>> +	int ret;
>> +
>> +	ret = reset_control_assert(kproc->reset);
>> +	if (ret) {
>> +		dev_err(dev, "local-reset assert failed, ret = %d\n", ret);
>> +		return ret;
>> +	}
>> +
>> +	ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
>> +						    kproc->ti_sci_id);
>> +	if (ret) {
>> +		dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
>> +		if (reset_control_deassert(kproc->reset))
>> +			dev_warn(dev, "local-reset deassert back failed\n");
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/* Release the DSP processor from reset */
>> +static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>> +{
>> +	struct device *dev = kproc->dev;
>> +	int ret;
>> +
>> +	ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
>> +						   kproc->ti_sci_id);
> 
> Indentation problem.

Thanks for catching, will fix.

> 
>> +	if (ret) {
>> +		dev_err(dev, "module-reset deassert failed, ret = %d\n", ret);
>> +		return ret;
>> +	}
>> +
>> +	ret = reset_control_deassert(kproc->reset);
>> +	if (ret) {
>> +		dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
>> +		if (kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
>> +							  kproc->ti_sci_id))
>> +			dev_warn(dev, "module-reset assert back failed\n");
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/*
>> + * Power up the DSP remote processor.
>> + *
>> + * This function will be invoked only after the firmware for this rproc
>> + * was loaded, parsed successfully, and all of its resource requirements
>> + * were met.
>> + */
>> +static int k3_dsp_rproc_start(struct rproc *rproc)
>> +{
>> +	struct k3_dsp_rproc *kproc = rproc->priv;
>> +	struct mbox_client *client = &kproc->client;
>> +	struct device *dev = kproc->dev;
>> +	u32 boot_addr;
>> +	int ret;
>> +
>> +	client->dev = dev;
>> +	client->tx_done = NULL;
>> +	client->rx_callback = k3_dsp_rproc_mbox_callback;
>> +	client->tx_block = false;
>> +	client->knows_txdone = false;
>> +
>> +	kproc->mbox = mbox_request_channel(client, 0);
>> +	if (IS_ERR(kproc->mbox)) {
>> +		ret = -EBUSY;
>> +		dev_err(dev, "mbox_request_channel failed: %ld\n",
>> +			PTR_ERR(kproc->mbox));
>> +		return ret;
>> +	}
>> +
>> +	/*
>> +	 * Ping the remote processor, this is only for sanity-sake for now;
>> +	 * there is no functional effect whatsoever.
>> +	 *
>> +	 * Note that the reply will _not_ arrive immediately: this message
>> +	 * will wait in the mailbox fifo until the remote processor is booted.
>> +	 */
>> +	ret = mbox_send_message(kproc->mbox, (void *)RP_MBOX_ECHO_REQUEST);
>> +	if (ret < 0) {
>> +		dev_err(dev, "mbox_send_message failed: %d\n", ret);
>> +		goto put_mbox;
>> +	}
>> +
>> +	boot_addr = rproc->bootaddr;
>> +	if (boot_addr & (kproc->data->boot_align_addr - 1)) {
>> +		dev_err(dev, "invalid boot address 0x%x, must be aligned on a 0x%x boundary\n",
>> +			boot_addr, kproc->data->boot_align_addr);
>> +		ret = -EINVAL;
>> +		goto put_mbox;
>> +	}
>> +
>> +	dev_err(dev, "booting DSP core using boot addr = 0x%x\n", boot_addr);
>> +	ret = ti_sci_proc_set_config(kproc->tsp, boot_addr, 0, 0);
>> +	if (ret)
>> +		goto put_mbox;
>> +
>> +	ret = k3_dsp_rproc_release(kproc);
>> +	if (ret)
>> +		goto put_mbox;
>> +
>> +	return 0;
>> +
>> +put_mbox:
>> +	mbox_free_channel(kproc->mbox);
>> +	return ret;
>> +}
>> +
>> +/*
>> + * Stop the DSP remote processor.
>> + *
>> + * This function puts the DSP processor into reset, and finishes processing
>> + * of any pending messages.
>> + */
>> +static int k3_dsp_rproc_stop(struct rproc *rproc)
>> +{
>> +	struct k3_dsp_rproc *kproc = rproc->priv;
>> +
>> +	mbox_free_channel(kproc->mbox);
>> +
>> +	k3_dsp_rproc_reset(kproc);
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Custom function to translate a DSP device address (internal RAMs only) to a
>> + * kernel virtual address.  The DSPs can access their RAMs at either an internal
>> + * address visible only from a DSP, or at the SoC-level bus address. Both these
>> + * addresses need to be looked through for translation. The translated addresses
>> + * can be used either by the remoteproc core for loading (when using kernel
>> + * remoteproc loader), or by any rpmsg bus drivers.
>> + */
>> +static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
>> +{
>> +	struct k3_dsp_rproc *kproc = rproc->priv;
>> +	void __iomem *va = NULL;
>> +	phys_addr_t bus_addr;
>> +	u32 dev_addr, offset;
>> +	size_t size;
>> +	int i;
>> +
>> +	if (len == 0)
>> +		return NULL;
>> +
>> +	for (i = 0; i < kproc->num_mems; i++) {
>> +		bus_addr = kproc->mem[i].bus_addr;
>> +		dev_addr = kproc->mem[i].dev_addr;
>> +		size = kproc->mem[i].size;
>> +
>> +		if (da < KEYSTONE_RPROC_LOCAL_ADDRESS_MASK) {
>> +			/* handle DSP-view addresses */
>> +			if (da >= dev_addr &&
>> +			    ((da + len) <= (dev_addr + size))) {
>> +				offset = da - dev_addr;
>> +				va = kproc->mem[i].cpu_addr + offset;
>> +				return (__force void *)va;
>> +			}
>> +		} else {
>> +			/* handle SoC-view addresses */
>> +			if (da >= bus_addr &&
>> +			    (da + len) <= (bus_addr + size)) {
>> +				offset = da - bus_addr;
>> +				va = kproc->mem[i].cpu_addr + offset;
>> +				return (__force void *)va;
>> +			}
>> +		}
>> +	}
>> +
>> +	/* handle static DDR reserved memory regions */
>> +	for (i = 0; i < kproc->num_rmems; i++) {
>> +		dev_addr = kproc->rmem[i].dev_addr;
>> +		size = kproc->rmem[i].size;
>> +
>> +		if (da >= dev_addr && ((da + len) <= (dev_addr + size))) {
>> +			offset = da - dev_addr;
>> +			va = kproc->rmem[i].cpu_addr + offset;
>> +			return (__force void *)va;
>> +		}
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static const struct rproc_ops k3_dsp_rproc_ops = {
>> +	.start		= k3_dsp_rproc_start,
>> +	.stop		= k3_dsp_rproc_stop,
>> +	.kick		= k3_dsp_rproc_kick,
>> +	.da_to_va	= k3_dsp_rproc_da_to_va,
>> +};
>> +
>> +static const char *k3_dsp_rproc_get_firmware(struct device *dev)
>> +{
>> +	const char *fw_name;
>> +	int ret;
>> +
>> +	ret = of_property_read_string(dev->of_node, "firmware-name",
>> +				      &fw_name);
>> +	if (ret) {
>> +		dev_err(dev, "failed to parse firmware-name property, ret = %d\n",
>> +			ret);
>> +		return ERR_PTR(ret);
>> +	}
>> +
>> +	return fw_name;
>> +}
> 
> The above is a carbon copy of k3_r5_rproc_get_firmware().  Please reuse the same
> function.

Yeah, I can add this as a common helper to rproc core, would be useful 
beyond just the TI rproc drivers.

> 
>> +
>> +static int k3_dsp_rproc_of_get_memories(struct platform_device *pdev,
>> +					struct k3_dsp_rproc *kproc)
>> +{
>> +	const struct k3_dsp_dev_data *data = kproc->data;
>> +	struct device *dev = &pdev->dev;
>> +	struct resource *res;
>> +	int num_mems = 0;
>> +	int i;
>> +
>> +	num_mems = kproc->data->num_mems;
>> +	kproc->mem = devm_kcalloc(kproc->dev, num_mems,
>> +				  sizeof(*kproc->mem), GFP_KERNEL);
>> +	if (!kproc->mem)
>> +		return -ENOMEM;
>> +
>> +	for (i = 0; i < num_mems; i++) {
>> +		res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>> +						   data->mems[i].name);
>> +		if (!res) {
>> +			dev_err(dev, "found no memory resource for %s\n",
>> +				data->mems[i].name);
>> +			return -EINVAL;
>> +		}
>> +		if (!devm_request_mem_region(dev, res->start,
>> +					     resource_size(res),
>> +					     dev_name(dev))) {
>> +			dev_err(dev, "could not request %s region for resource\n",
>> +				data->mems[i].name);
>> +			return -EBUSY;
>> +		}
>> +
>> +		kproc->mem[i].cpu_addr = devm_ioremap_wc(dev, res->start,
>> +							 resource_size(res));
>> +		if (IS_ERR(kproc->mem[i].cpu_addr)) {
>> +			dev_err(dev, "failed to map %s memory\n",
>> +				data->mems[i].name);
>> +			return PTR_ERR(kproc->mem[i].cpu_addr);
>> +		}
>> +		kproc->mem[i].bus_addr = res->start;
>> +		kproc->mem[i].dev_addr = data->mems[i].dev_addr;
>> +		kproc->mem[i].size = resource_size(res);
>> +
>> +		dev_dbg(dev, "memory %8s: bus addr %pa size 0x%zx va %pK da 0x%x\n",
>> +			data->mems[i].name, &kproc->mem[i].bus_addr,
>> +			kproc->mem[i].size, kproc->mem[i].cpu_addr,
>> +			kproc->mem[i].dev_addr);
>> +
>> +		/* zero out memories to start in a pristine state */
>> +		/*
>> +		 * FIXME: comment out until kernel crash is fixed, possible
>> +		 * issue with local resets.
>> +		 * memset((__force void *)kproc->mem[i].cpu_addr, 0,
>> +		 *      kproc->mem[i].size);
>> +		 */
> 
> Things still work without zero'ing out the memory?  As such is it mandatory to
> do so? Function k3_r5_core_of_get_internal_memories does not do a memset().  And
> didn't Peng also had this problem?

This is a stale comment, I will clean this up. The zeroing out is not 
strictly needed, it is only to ensure that the DSPs are started in a 
pristine condition. The issue is unrelated to what Peng reported, it is 
not the ARM memset issue (which won't be an issue since I am already 
using the ioremap_wc variant), but rather related to device being 
powered-on to be able to access the DSP internal memories from ARM. This 
won't be powered on at the time this function is invoked anyway. The R5F 
does needs to memzero it for ECC reasons, and does so in the 
k3_r5_rproc_prepare().

> 
>> +	}
>> +	kproc->num_mems = num_mems;
>> +
>> +	return 0;
>> +}
>> +
>> +static int k3_dsp_reserved_mem_init(struct k3_dsp_rproc *kproc)
>> +{
>> +	struct device *dev = kproc->dev;
>> +	struct device_node *np = dev->of_node;
>> +	struct device_node *rmem_np;
>> +	struct reserved_mem *rmem;
>> +	int num_rmems;
>> +	int ret, i;
>> +
>> +	num_rmems = of_property_count_elems_of_size(np, "memory-region",
>> +						    sizeof(phandle));
>> +	if (num_rmems <= 0) {
>> +		dev_err(dev, "device does not reserved memory regions, ret = %d\n",
>> +			num_rmems);
>> +		return -EINVAL;
>> +	}
>> +	if (num_rmems < 2) {
>> +		dev_err(dev, "device needs atleast two memory regions to be defined, num = %d\n",
>> +			num_rmems);
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* use reserved memory region 0 for vring DMA allocations */
>> +	ret = of_reserved_mem_device_init_by_idx(dev, np, 0);
>> +	if (ret) {
>> +		dev_err(dev, "device cannot initialize DMA pool, ret = %d\n",
>> +			ret);
>> +		return ret;
>> +	}
>> +
>> +	num_rmems--;
>> +	kproc->rmem = kcalloc(num_rmems, sizeof(*kproc->rmem), GFP_KERNEL);
>> +	if (!kproc->rmem) {
>> +		ret = -ENOMEM;
>> +		goto release_rmem;
>> +	}
>> +
>> +	/* use remaining reserved memory regions for static carveouts */
>> +	for (i = 0; i < num_rmems; i++) {
>> +		rmem_np = of_parse_phandle(np, "memory-region", i + 1);
>> +		if (!rmem_np) {
>> +			ret = -EINVAL;
>> +			goto unmap_rmem;
>> +		}
>> +
>> +		rmem = of_reserved_mem_lookup(rmem_np);
>> +		if (!rmem) {
>> +			of_node_put(rmem_np);
>> +			ret = -EINVAL;
>> +			goto unmap_rmem;
>> +		}
>> +		of_node_put(rmem_np);
>> +
>> +		kproc->rmem[i].bus_addr = rmem->base;
>> +		/* 64-bit address regions currently not supported */
>> +		kproc->rmem[i].dev_addr = (u32)rmem->base;
>> +		kproc->rmem[i].size = rmem->size;
>> +		kproc->rmem[i].cpu_addr = ioremap_wc(rmem->base, rmem->size);
>> +		if (!kproc->rmem[i].cpu_addr) {
>> +			dev_err(dev, "failed to map reserved memory#%d at %pa of size %pa\n",
>> +				i + 1, &rmem->base, &rmem->size);
>> +			ret = -ENOMEM;
>> +			goto unmap_rmem;
>> +		}
>> +
>> +		dev_dbg(dev, "reserved memory%d: bus addr %pa size 0x%zx va %pK da 0x%x\n",
>> +			i + 1, &kproc->rmem[i].bus_addr,
>> +			kproc->rmem[i].size, kproc->rmem[i].cpu_addr,
>> +			kproc->rmem[i].dev_addr);
>> +	}
>> +	kproc->num_rmems = num_rmems;
>> +
>> +	return 0;
>> +
>> +unmap_rmem:
>> +	for (i--; i >= 0; i--) {
>> +		if (kproc->rmem[i].cpu_addr)
>> +			iounmap(kproc->rmem[i].cpu_addr);
>> +	}
>> +	kfree(kproc->rmem);
>> +release_rmem:
>> +	of_reserved_mem_device_release(kproc->dev);
>> +	return ret;
>> +}
> 
> Other than the type of structure passed to the function, this is an exact
> replica of k3_r5_reserved_mem_init().  Do you foresee either of them changing
> to a point where reusing code would be counter productive?  I think we are right
> on the edge where duplication is better than using the same function.

Yeah, nothing at the moment. The number of regions can change, I have 
not enabled the support for addresses beyond 32-bit atm, so that is 
another factor.

> 
>> +
>> +static void k3_dsp_reserved_mem_exit(struct k3_dsp_rproc *kproc)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < kproc->num_rmems; i++)
>> +		iounmap(kproc->rmem[i].cpu_addr);
>> +	kfree(kproc->rmem);
>> +
>> +	of_reserved_mem_device_release(kproc->dev);
>> +}
>> +
>> +static
>> +struct ti_sci_proc *k3_dsp_rproc_of_get_tsp(struct device *dev,
>> +					    const struct ti_sci_handle *sci)
>> +{
>> +	struct ti_sci_proc *tsp;
>> +	u32 temp[2];
>> +	int ret;
>> +
>> +	ret = of_property_read_u32_array(dev->of_node, "ti,sci-proc-ids",
>> +					 temp, 2);
>> +	if (ret < 0)
>> +		return ERR_PTR(ret);
>> +
>> +	tsp = kzalloc(sizeof(*tsp), GFP_KERNEL);
>> +	if (!tsp)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	tsp->dev = dev;
>> +	tsp->sci = sci;
>> +	tsp->ops = &sci->ops.proc_ops;
>> +	tsp->proc_id = temp[0];
>> +	tsp->host_id = temp[1];
>> +
>> +	return tsp;
>> +}
> 
> Contrary to k3_dsp_reserved_mem_init(), this one can definitely be reused for
> both c66 and r5.

Yeah, but is it worth it introduce a common module for one function? 
Little bit large to define this as an inline function like I have done 
with most of the ti_sci_proc helpers.

> 
>> +
>> +static int k3_dsp_rproc_probe(struct platform_device *pdev)
>> +{
>> +	struct device *dev = &pdev->dev;
>> +	struct device_node *np = dev->of_node;
>> +	const struct k3_dsp_dev_data *data;
>> +	struct k3_dsp_rproc *kproc;
>> +	struct rproc *rproc;
>> +	const char *fw_name;
>> +	int ret = 0;
>> +	int ret1;
>> +
>> +	data = of_device_get_match_data(dev);
>> +	if (!data)
>> +		return -ENODEV;
>> +
>> +	fw_name = k3_dsp_rproc_get_firmware(dev);
>> +	if (IS_ERR(fw_name))
>> +		return PTR_ERR(fw_name);
>> +
>> +	rproc = rproc_alloc(dev, dev_name(dev), &k3_dsp_rproc_ops, fw_name,
>> +			    sizeof(*kproc));
>> +	if (!rproc)
>> +		return -ENOMEM;
>> +
>> +	rproc->has_iommu = false;
>> +	rproc->recovery_disabled = true;
>> +	kproc = rproc->priv;
>> +	kproc->rproc = rproc;
>> +	kproc->dev = dev;
>> +	kproc->data = data;
>> +
>> +	kproc->ti_sci = ti_sci_get_by_phandle(np, "ti,sci");
>> +	if (IS_ERR(kproc->ti_sci)) {
>> +		ret = PTR_ERR(kproc->ti_sci);
>> +		if (ret != -EPROBE_DEFER) {
>> +			dev_err(dev, "failed to get ti-sci handle, ret = %d\n",
>> +				ret);
>> +		}
>> +		kproc->ti_sci = NULL;
>> +		goto free_rproc;
>> +	}
>> +
>> +	ret = of_property_read_u32(np, "ti,sci-dev-id", &kproc->ti_sci_id);
>> +	if (ret) {
>> +		dev_err(dev, "missing 'ti,sci-dev-id' property\n");
>> +		goto put_sci;
>> +	}
>> +
>> +	kproc->reset = devm_reset_control_get_exclusive(dev, NULL);
>> +	if (IS_ERR(kproc->reset)) {
>> +		ret = PTR_ERR(kproc->reset);
>> +		dev_err(dev, "failed to get reset, status = %d\n", ret);
>> +		goto put_sci;
>> +	}
>> +
>> +	kproc->tsp = k3_dsp_rproc_of_get_tsp(dev, kproc->ti_sci);
>> +	if (IS_ERR(kproc->tsp)) {
>> +		dev_err(dev, "failed to construct ti-sci proc control, ret = %d\n",
>> +			ret);
>> +		ret = PTR_ERR(kproc->tsp);
>> +		goto put_sci;
>> +	}
>> +
>> +	ret = ti_sci_proc_request(kproc->tsp);
>> +	if (ret < 0) {
>> +		dev_err(dev, "ti_sci_proc_request failed, ret = %d\n", ret);
>> +		goto free_tsp;
>> +	}
>> +
>> +	pm_runtime_enable(dev);
>> +	ret = pm_runtime_get_sync(dev);
> 
> What do these give you since the dev_pm_ops is not set for the
> k3_dsp_rproc_driver platform diver and there is no clock specified in the DT?

Yeah, I can drop this. Adding a clock in DT would not have made any 
difference here, but a power-domains property would have. And I don't 
use the power-domains property because of the genpd handling in driver 
core that messes with the device state.

regards
Sumahn

> 
> Thanks,
> Mathieu
> 
>> +	if (ret < 0) {
>> +		dev_err(dev, "failed to enable clock, status = %d\n", ret);
>> +		pm_runtime_put_noidle(dev);
>> +		goto disable_rpm;
>> +	}
>> +
>> +	ret = k3_dsp_rproc_of_get_memories(pdev, kproc);
>> +	if (ret)
>> +		goto disable_clk;
>> +
>> +	ret = k3_dsp_reserved_mem_init(kproc);
>> +	if (ret) {
>> +		dev_err(dev, "reserved memory init failed, ret = %d\n", ret);
>> +		goto disable_clk;
>> +	}
>> +
>> +	ret = rproc_add(rproc);
>> +	if (ret) {
>> +		dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
>> +			ret);
>> +		goto release_mem;
>> +	}
>> +
>> +	platform_set_drvdata(pdev, kproc);
>> +
>> +	return 0;
>> +
>> +release_mem:
>> +	k3_dsp_reserved_mem_exit(kproc);
>> +disable_clk:
>> +	pm_runtime_put_sync(dev);
>> +disable_rpm:
>> +	pm_runtime_disable(dev);
>> +	ret1 = ti_sci_proc_release(kproc->tsp);
>> +	if (ret1)
>> +		dev_err(dev, "failed to release proc, ret = %d\n", ret1);
>> +free_tsp:
>> +	kfree(kproc->tsp);
>> +put_sci:
>> +	ret1 = ti_sci_put_handle(kproc->ti_sci);
>> +	if (ret1)
>> +		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret1);
>> +free_rproc:
>> +	rproc_free(rproc);
>> +	return ret;
>> +}
>> +
>> +static int k3_dsp_rproc_remove(struct platform_device *pdev)
>> +{
>> +	struct k3_dsp_rproc *kproc = platform_get_drvdata(pdev);
>> +	struct device *dev = &pdev->dev;
>> +	int ret;
>> +
>> +	rproc_del(kproc->rproc);
>> +	pm_runtime_put_sync(&pdev->dev);
>> +	pm_runtime_disable(&pdev->dev);
>> +
>> +	ret = ti_sci_proc_release(kproc->tsp);
>> +	if (ret)
>> +		dev_err(dev, "failed to release proc, ret = %d\n", ret);
>> +
>> +	kfree(kproc->tsp);
>> +
>> +	ret = ti_sci_put_handle(kproc->ti_sci);
>> +	if (ret)
>> +		dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret);
>> +
>> +	k3_dsp_reserved_mem_exit(kproc);
>> +	rproc_free(kproc->rproc);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct k3_dsp_mem_data c66_mems[] = {
>> +	{ .name = "l2sram", .dev_addr = 0x800000 },
>> +	{ .name = "l1pram", .dev_addr = 0xe00000 },
>> +	{ .name = "l1dram", .dev_addr = 0xf00000 },
>> +};
>> +
>> +static const struct k3_dsp_dev_data c66_data = {
>> +	.mems = c66_mems,
>> +	.num_mems = ARRAY_SIZE(c66_mems),
>> +	.boot_align_addr = SZ_1K,
>> +	.uses_lreset = true,
>> +};
>> +
>> +static const struct of_device_id k3_dsp_of_match[] = {
>> +	{ .compatible = "ti,j721e-c66-dsp", .data = &c66_data, },
>> +	{ /* sentinel */ },
>> +};
>> +MODULE_DEVICE_TABLE(of, k3_dsp_of_match);
>> +
>> +static struct platform_driver k3_dsp_rproc_driver = {
>> +	.probe	= k3_dsp_rproc_probe,
>> +	.remove	= k3_dsp_rproc_remove,
>> +	.driver	= {
>> +		.name = "k3-dsp-rproc",
>> +		.of_match_table = k3_dsp_of_match,
>> +	},
>> +};
>> +
>> +module_platform_driver(k3_dsp_rproc_driver);
>> +
>> +MODULE_AUTHOR("Suman Anna <s-anna@ti.com>");
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_DESCRIPTION("TI K3 DSP Remoteproc driver");
>> -- 
>> 2.23.0
>>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs
  2020-05-13 18:14     ` Suman Anna
@ 2020-05-13 19:40       ` Mathieu Poirier
  0 siblings, 0 replies; 13+ messages in thread
From: Mathieu Poirier @ 2020-05-13 19:40 UTC (permalink / raw)
  To: Suman Anna
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, Linux Kernel Mailing List

On Wed, 13 May 2020 at 12:14, Suman Anna <s-anna@ti.com> wrote:
>
> Hi Mathieu,
>
> On 4/27/20 5:57 PM, Mathieu Poirier wrote:
> > On Wed, Mar 25, 2020 at 03:18:38PM -0500, Suman Anna wrote:
> >> The Texas Instrument's K3 J721E SoCs have two C66x DSP Subsystems in MAIN
> >> voltage domain that are based on the TI's standard TMS320C66x DSP CorePac
> >> module. Each subsystem has a Fixed/Floating-Point DSP CPU, with 32 KB each
> >> of L1P & L1D SRAMs that can be configured and partitioned as either RAM
> >> and/or Cache, and 288 KB of L2 SRAM with 256 KB of memory configurable as
> >> either RAM and/or Cache. The CorePac also includes an Internal DMA (IDMA),
> >> External Memory Controller (EMC), Extended Memory Controller (XMC) with a
> >> Region Address Translator (RAT) unit for 32-bit to 48-bit address
> >> extension/translations, an Interrupt Controller (INTC) and a Powerdown
> >> Controller (PDC).
> >>
> >> A new remoteproc module is added to perform the device management of
> >> these DSP devices. The support is limited to images using only external
> >> DDR memory at the moment, the loading support to internal memories and
> >> any on-chip RAM memories will be added in a subsequent patch. RAT support
> >> is also left for a future patch, and as such the reserved memory carveout
> >> regions are all expected to be using memory regions within the first 2 GB.
> >> Error Recovery and Power Management features are not currently supported.
> >>
> >> The C66x remote processors do not have an MMU, and so require fixed memory
> >> carveout regions matching the firmware image addresses. Support for this
> >> is provided by mandating multiple memory regions to be attached to the
> >> remoteproc device. The first memory region will be used to serve as the
> >> DMA pool for all dynamic allocations like the vrings and vring buffers.
> >> The remaining memory regions are mapped into the kernel at device probe
> >> time, and are used to provide address translations for firmware image
> >> segments without the need for any RSC_CARVEOUT entries. Any firmware
> >> image using memory outside of the supplied reserved memory carveout
> >> regions will be errored out.
> >>
> >> The driver uses various TI-SCI interfaces to talk to the System Controller
> >> (DMSC) for managing configuration, power and reset management of these
> >> cores. IPC between the A72 cores and the DSP cores is supported through
> >> the virtio rpmsg stack using shared memory and OMAP Mailboxes.
> >>
> >> Signed-off-by: Suman Anna <s-anna@ti.com>
> >> ---
> >>   drivers/remoteproc/Kconfig                |  16 +
> >>   drivers/remoteproc/Makefile               |   1 +
> >>   drivers/remoteproc/ti_k3_dsp_remoteproc.c | 736 ++++++++++++++++++++++
> >>   3 files changed, 753 insertions(+)
> >>   create mode 100644 drivers/remoteproc/ti_k3_dsp_remoteproc.c
> >>
> >> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> >> index 073048b4c0fb..66a76acb15b6 100644
> >> --- a/drivers/remoteproc/Kconfig
> >> +++ b/drivers/remoteproc/Kconfig
> >> @@ -240,6 +240,22 @@ config TI_K3_R5_REMOTEPROC
> >>        It's safe to say N here if you're not interested in utilizing
> >>        a slave processor
> >>
> >> +config TI_K3_DSP_REMOTEPROC
> >> +    tristate "TI K3 DSP remoteproc support"
> >> +    depends on ARCH_K3
> >> +    select MAILBOX
> >> +    select OMAP2PLUS_MBOX
> >> +    help
> >> +      Say y here to support TI's C66x and C71x DSP remote processor
> >> +      subsystems on various TI K3 family of SoCs through the remote
> >> +      processor framework.
> >> +
> >> +      You want to say m here in order to offload some processing
> >> +      tasks to these processors.
> >
> > Building this driver has a module, i.e 'm', has nothing to do with what the
> > remote processor does.  I would simply remove the above 2 lines.
>
> Yes, can drop. I will switch the "Say y" to "Say m" - that would be the
> preferred option. Having the driver built-in means the firmware has to
> be part of initramfs.
>
> >
> >> +
> >> +      It's safe to say N here if you're not interested in utilizing
> >> +      the DSP slave processors.
> >> +
> >>   endif # REMOTEPROC
> >>
> >>   endmenu
> >> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> >> index 00ba826818af..eb51cc09e47b 100644
> >> --- a/drivers/remoteproc/Makefile
> >> +++ b/drivers/remoteproc/Makefile
> >> @@ -29,3 +29,4 @@ obj-$(CONFIG_ST_REMOTEPROC)                += st_remoteproc.o
> >>   obj-$(CONFIG_ST_SLIM_REMOTEPROC)   += st_slim_rproc.o
> >>   obj-$(CONFIG_STM32_RPROC)          += stm32_rproc.o
> >>   obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
> >> +obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)  += ti_k3_dsp_remoteproc.o
> >> diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> >> new file mode 100644
> >> index 000000000000..fd0d84f46f90
> >> --- /dev/null
> >> +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
> >> @@ -0,0 +1,736 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +/*
> >> + * TI K3 DSP Remote Processor(s) driver
> >> + *
> >> + * Copyright (C) 2018-2020 Texas Instruments Incorporated - http://www.ti.com/
> >> + *  Suman Anna <s-anna@ti.com>
> >> + */
> >> +
> >> +#include <linux/io.h>
> >> +#include <linux/module.h>
> >> +#include <linux/of_device.h>
> >> +#include <linux/of_reserved_mem.h>
> >> +#include <linux/platform_device.h>
> >> +#include <linux/pm_runtime.h>
> >> +#include <linux/remoteproc.h>
> >> +#include <linux/mailbox_client.h>
> >> +#include <linux/omap-mailbox.h>
> >
> > Please move these two up.
>
> OK.
>
> >
> >> +#include <linux/reset.h>
> >> +#include <linux/soc/ti/ti_sci_protocol.h>
> >> +
> >> +#include "omap_remoteproc.h"
> >> +#include "remoteproc_internal.h"
> >> +#include "ti_sci_proc.h"
> >> +
> >> +#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK   (SZ_16M - 1)
> >> +
> >> +/**
> >> + * struct k3_dsp_rproc_mem - internal memory structure
> >> + * @cpu_addr: MPU virtual address of the memory region
> >> + * @bus_addr: Bus address used to access the memory region
> >> + * @dev_addr: Device address of the memory region from DSP view
> >> + * @size: Size of the memory region
> >> + */
> >> +struct k3_dsp_rproc_mem {
> >
> > I would rename this 'k3_dsp_mem' to be consistent with k3_r5_mem.
>
> Yeah, will rename.
>
> >
> >> +    void __iomem *cpu_addr;
> >> +    phys_addr_t bus_addr;
> >> +    u32 dev_addr;
> >> +    size_t size;
> >> +};
> >> +
> >> +/**
> >> + * struct k3_dsp_mem_data - memory definitions for a DSP
> >> + * @name: name for this memory entry
> >> + * @dev_addr: device address for the memory entry
> >> + */
> >> +struct k3_dsp_mem_data {
> >> +    const char *name;
> >> +    const u32 dev_addr;
> >> +};
> >> +
> >> +/**
> >> + * struct k3_dsp_dev_data - device data structure for a DSP
> >> + * @mems: pointer to memory definitions for a DSP
> >> + * @num_mems: number of memory regions in @mems
> >> + * @boot_align_addr: boot vector address alignment granularity
> >> + * @uses_lreset: flag to denote the need for local reset management
> >> + */
> >> +struct k3_dsp_dev_data {
> >> +    const struct k3_dsp_mem_data *mems;
> >> +    u32 num_mems;
> >> +    u32 boot_align_addr;
> >> +    bool uses_lreset;
> >> +};
> >> +
> >> +/**
> >> + * struct k3_dsp_rproc - k3 DSP remote processor driver structure
> >> + * @dev: cached device pointer
> >> + * @rproc: remoteproc device handle
> >> + * @mem: internal memory regions data
> >> + * @num_mems: number of internal memory regions
> >> + * @rmem: reserved memory regions data
> >> + * @num_rmems: number of reserved memory regions
> >> + * @reset: reset control handle
> >> + * @data: pointer to DSP-specific device data
> >> + * @tsp: TI-SCI processor control handle
> >> + * @ti_sci: TI-SCI handle
> >> + * @ti_sci_id: TI-SCI device identifier
> >> + * @mbox: mailbox channel handle
> >> + * @client: mailbox client to request the mailbox channel
> >> + */
> >> +struct k3_dsp_rproc {
> >> +    struct device *dev;
> >> +    struct rproc *rproc;
> >> +    struct k3_dsp_rproc_mem *mem;
> >> +    int num_mems;
> >> +    struct k3_dsp_rproc_mem *rmem;
> >> +    int num_rmems;
> >> +    struct reset_control *reset;
> >> +    const struct k3_dsp_dev_data *data;
> >> +    struct ti_sci_proc *tsp;
> >> +    const struct ti_sci_handle *ti_sci;
> >> +    u32 ti_sci_id;
> >> +    struct mbox_chan *mbox;
> >> +    struct mbox_client client;
> >> +};
> >> +
> >> +/**
> >> + * k3_dsp_rproc_mbox_callback() - inbound mailbox message handler
> >> + * @client: mailbox client pointer used for requesting the mailbox channel
> >> + * @data: mailbox payload
> >> + *
> >> + * This handler is invoked by the OMAP mailbox driver whenever a mailbox
> >> + * message is received. Usually, the mailbox payload simply contains
> >> + * the index of the virtqueue that is kicked by the remote processor,
> >> + * and we let remoteproc core handle it.
> >> + *
> >> + * In addition to virtqueue indices, we also have some out-of-band values
> >> + * that indicate different events. Those values are deliberately very
> >> + * large so they don't coincide with virtqueue indices.
> >> + */
> >> +static void k3_dsp_rproc_mbox_callback(struct mbox_client *client, void *data)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = container_of(client, struct k3_dsp_rproc,
> >> +                                            client);
> >
> > Indentation problem.
>
> Thanks. Hmm, checkpatch didn't catch this.
>
> >
> >> +    struct device *dev = kproc->rproc->dev.parent;
> >> +    const char *name = kproc->rproc->name;
> >> +    u32 msg = omap_mbox_message(data);
> >> +
> >> +    dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> >> +
> >> +    switch (msg) {
> >> +    case RP_MBOX_CRASH:
> >> +            /*
> >> +             * remoteproc detected an exception, but error recovery is not
> >> +             * supported. So, just log this for now
> >> +             */
> >> +            dev_err(dev, "K3 DSP rproc %s crashed\n", name);
> >> +            break;
> >> +    case RP_MBOX_ECHO_REPLY:
> >> +            dev_info(dev, "received echo reply from %s\n", name);
> >> +            break;
> >> +    default:
> >> +            /* silently handle all other valid messages */
> >> +            if (msg >= RP_MBOX_READY && msg < RP_MBOX_END_MSG)
> >> +                    return;
> >> +            if (msg > kproc->rproc->max_notifyid) {
> >> +                    dev_dbg(dev, "dropping unknown message 0x%x", msg);
> >> +                    return;
> >> +            }
> >> +            /* msg contains the index of the triggered vring */
> >> +            if (rproc_vq_interrupt(kproc->rproc, msg) == IRQ_NONE)
> >> +                    dev_dbg(dev, "no message was found in vqid %d\n", msg);
> >> +    }
> >> +}
> >> +
> >> +/*
> >> + * Kick the remote processor to notify about pending unprocessed messages.
> >> + * The vqid usage is not used and is inconsequential, as the kick is performed
> >> + * through a simulated GPIO (a bit in an IPC interrupt-triggering register),
> >> + * the remote processor is expected to process both its Tx and Rx virtqueues.
> >> + */
> >> +static void k3_dsp_rproc_kick(struct rproc *rproc, int vqid)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = rproc->priv;
> >> +    struct device *dev = rproc->dev.parent;
> >> +    mbox_msg_t msg = (mbox_msg_t)vqid;
> >> +    int ret;
> >> +
> >> +    /* send the index of the triggered virtqueue in the mailbox payload */
> >> +    ret = mbox_send_message(kproc->mbox, (void *)msg);
> >> +    if (ret < 0)
> >> +            dev_err(dev, "failed to send mailbox message, status = %d\n",
> >> +                    ret);
> >> +}
> >> +
> >> +/* Put the DSP processor into reset */
> >> +static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
> >> +{
> >> +    struct device *dev = kproc->dev;
> >> +    int ret;
> >> +
> >> +    ret = reset_control_assert(kproc->reset);
> >> +    if (ret) {
> >> +            dev_err(dev, "local-reset assert failed, ret = %d\n", ret);
> >> +            return ret;
> >> +    }
> >> +
> >> +    ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> >> +                                                kproc->ti_sci_id);
> >> +    if (ret) {
> >> +            dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
> >> +            if (reset_control_deassert(kproc->reset))
> >> +                    dev_warn(dev, "local-reset deassert back failed\n");
> >> +    }
> >> +
> >> +    return ret;
> >> +}
> >> +
> >> +/* Release the DSP processor from reset */
> >> +static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
> >> +{
> >> +    struct device *dev = kproc->dev;
> >> +    int ret;
> >> +
> >> +    ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
> >> +                                               kproc->ti_sci_id);
> >
> > Indentation problem.
>
> Thanks for catching, will fix.
>
> >
> >> +    if (ret) {
> >> +            dev_err(dev, "module-reset deassert failed, ret = %d\n", ret);
> >> +            return ret;
> >> +    }
> >> +
> >> +    ret = reset_control_deassert(kproc->reset);
> >> +    if (ret) {
> >> +            dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
> >> +            if (kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
> >> +                                                      kproc->ti_sci_id))
> >> +                    dev_warn(dev, "module-reset assert back failed\n");
> >> +    }
> >> +
> >> +    return ret;
> >> +}
> >> +
> >> +/*
> >> + * Power up the DSP remote processor.
> >> + *
> >> + * This function will be invoked only after the firmware for this rproc
> >> + * was loaded, parsed successfully, and all of its resource requirements
> >> + * were met.
> >> + */
> >> +static int k3_dsp_rproc_start(struct rproc *rproc)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = rproc->priv;
> >> +    struct mbox_client *client = &kproc->client;
> >> +    struct device *dev = kproc->dev;
> >> +    u32 boot_addr;
> >> +    int ret;
> >> +
> >> +    client->dev = dev;
> >> +    client->tx_done = NULL;
> >> +    client->rx_callback = k3_dsp_rproc_mbox_callback;
> >> +    client->tx_block = false;
> >> +    client->knows_txdone = false;
> >> +
> >> +    kproc->mbox = mbox_request_channel(client, 0);
> >> +    if (IS_ERR(kproc->mbox)) {
> >> +            ret = -EBUSY;
> >> +            dev_err(dev, "mbox_request_channel failed: %ld\n",
> >> +                    PTR_ERR(kproc->mbox));
> >> +            return ret;
> >> +    }
> >> +
> >> +    /*
> >> +     * Ping the remote processor, this is only for sanity-sake for now;
> >> +     * there is no functional effect whatsoever.
> >> +     *
> >> +     * Note that the reply will _not_ arrive immediately: this message
> >> +     * will wait in the mailbox fifo until the remote processor is booted.
> >> +     */
> >> +    ret = mbox_send_message(kproc->mbox, (void *)RP_MBOX_ECHO_REQUEST);
> >> +    if (ret < 0) {
> >> +            dev_err(dev, "mbox_send_message failed: %d\n", ret);
> >> +            goto put_mbox;
> >> +    }
> >> +
> >> +    boot_addr = rproc->bootaddr;
> >> +    if (boot_addr & (kproc->data->boot_align_addr - 1)) {
> >> +            dev_err(dev, "invalid boot address 0x%x, must be aligned on a 0x%x boundary\n",
> >> +                    boot_addr, kproc->data->boot_align_addr);
> >> +            ret = -EINVAL;
> >> +            goto put_mbox;
> >> +    }
> >> +
> >> +    dev_err(dev, "booting DSP core using boot addr = 0x%x\n", boot_addr);
> >> +    ret = ti_sci_proc_set_config(kproc->tsp, boot_addr, 0, 0);
> >> +    if (ret)
> >> +            goto put_mbox;
> >> +
> >> +    ret = k3_dsp_rproc_release(kproc);
> >> +    if (ret)
> >> +            goto put_mbox;
> >> +
> >> +    return 0;
> >> +
> >> +put_mbox:
> >> +    mbox_free_channel(kproc->mbox);
> >> +    return ret;
> >> +}
> >> +
> >> +/*
> >> + * Stop the DSP remote processor.
> >> + *
> >> + * This function puts the DSP processor into reset, and finishes processing
> >> + * of any pending messages.
> >> + */
> >> +static int k3_dsp_rproc_stop(struct rproc *rproc)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = rproc->priv;
> >> +
> >> +    mbox_free_channel(kproc->mbox);
> >> +
> >> +    k3_dsp_rproc_reset(kproc);
> >> +
> >> +    return 0;
> >> +}
> >> +
> >> +/*
> >> + * Custom function to translate a DSP device address (internal RAMs only) to a
> >> + * kernel virtual address.  The DSPs can access their RAMs at either an internal
> >> + * address visible only from a DSP, or at the SoC-level bus address. Both these
> >> + * addresses need to be looked through for translation. The translated addresses
> >> + * can be used either by the remoteproc core for loading (when using kernel
> >> + * remoteproc loader), or by any rpmsg bus drivers.
> >> + */
> >> +static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = rproc->priv;
> >> +    void __iomem *va = NULL;
> >> +    phys_addr_t bus_addr;
> >> +    u32 dev_addr, offset;
> >> +    size_t size;
> >> +    int i;
> >> +
> >> +    if (len == 0)
> >> +            return NULL;
> >> +
> >> +    for (i = 0; i < kproc->num_mems; i++) {
> >> +            bus_addr = kproc->mem[i].bus_addr;
> >> +            dev_addr = kproc->mem[i].dev_addr;
> >> +            size = kproc->mem[i].size;
> >> +
> >> +            if (da < KEYSTONE_RPROC_LOCAL_ADDRESS_MASK) {
> >> +                    /* handle DSP-view addresses */
> >> +                    if (da >= dev_addr &&
> >> +                        ((da + len) <= (dev_addr + size))) {
> >> +                            offset = da - dev_addr;
> >> +                            va = kproc->mem[i].cpu_addr + offset;
> >> +                            return (__force void *)va;
> >> +                    }
> >> +            } else {
> >> +                    /* handle SoC-view addresses */
> >> +                    if (da >= bus_addr &&
> >> +                        (da + len) <= (bus_addr + size)) {
> >> +                            offset = da - bus_addr;
> >> +                            va = kproc->mem[i].cpu_addr + offset;
> >> +                            return (__force void *)va;
> >> +                    }
> >> +            }
> >> +    }
> >> +
> >> +    /* handle static DDR reserved memory regions */
> >> +    for (i = 0; i < kproc->num_rmems; i++) {
> >> +            dev_addr = kproc->rmem[i].dev_addr;
> >> +            size = kproc->rmem[i].size;
> >> +
> >> +            if (da >= dev_addr && ((da + len) <= (dev_addr + size))) {
> >> +                    offset = da - dev_addr;
> >> +                    va = kproc->rmem[i].cpu_addr + offset;
> >> +                    return (__force void *)va;
> >> +            }
> >> +    }
> >> +
> >> +    return NULL;
> >> +}
> >> +
> >> +static const struct rproc_ops k3_dsp_rproc_ops = {
> >> +    .start          = k3_dsp_rproc_start,
> >> +    .stop           = k3_dsp_rproc_stop,
> >> +    .kick           = k3_dsp_rproc_kick,
> >> +    .da_to_va       = k3_dsp_rproc_da_to_va,
> >> +};
> >> +
> >> +static const char *k3_dsp_rproc_get_firmware(struct device *dev)
> >> +{
> >> +    const char *fw_name;
> >> +    int ret;
> >> +
> >> +    ret = of_property_read_string(dev->of_node, "firmware-name",
> >> +                                  &fw_name);
> >> +    if (ret) {
> >> +            dev_err(dev, "failed to parse firmware-name property, ret = %d\n",
> >> +                    ret);
> >> +            return ERR_PTR(ret);
> >> +    }
> >> +
> >> +    return fw_name;
> >> +}
> >
> > The above is a carbon copy of k3_r5_rproc_get_firmware().  Please reuse the same
> > function.
>
> Yeah, I can add this as a common helper to rproc core, would be useful
> beyond just the TI rproc drivers.
>
> >
> >> +
> >> +static int k3_dsp_rproc_of_get_memories(struct platform_device *pdev,
> >> +                                    struct k3_dsp_rproc *kproc)
> >> +{
> >> +    const struct k3_dsp_dev_data *data = kproc->data;
> >> +    struct device *dev = &pdev->dev;
> >> +    struct resource *res;
> >> +    int num_mems = 0;
> >> +    int i;
> >> +
> >> +    num_mems = kproc->data->num_mems;
> >> +    kproc->mem = devm_kcalloc(kproc->dev, num_mems,
> >> +                              sizeof(*kproc->mem), GFP_KERNEL);
> >> +    if (!kproc->mem)
> >> +            return -ENOMEM;
> >> +
> >> +    for (i = 0; i < num_mems; i++) {
> >> +            res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> >> +                                               data->mems[i].name);
> >> +            if (!res) {
> >> +                    dev_err(dev, "found no memory resource for %s\n",
> >> +                            data->mems[i].name);
> >> +                    return -EINVAL;
> >> +            }
> >> +            if (!devm_request_mem_region(dev, res->start,
> >> +                                         resource_size(res),
> >> +                                         dev_name(dev))) {
> >> +                    dev_err(dev, "could not request %s region for resource\n",
> >> +                            data->mems[i].name);
> >> +                    return -EBUSY;
> >> +            }
> >> +
> >> +            kproc->mem[i].cpu_addr = devm_ioremap_wc(dev, res->start,
> >> +                                                     resource_size(res));
> >> +            if (IS_ERR(kproc->mem[i].cpu_addr)) {
> >> +                    dev_err(dev, "failed to map %s memory\n",
> >> +                            data->mems[i].name);
> >> +                    return PTR_ERR(kproc->mem[i].cpu_addr);
> >> +            }
> >> +            kproc->mem[i].bus_addr = res->start;
> >> +            kproc->mem[i].dev_addr = data->mems[i].dev_addr;
> >> +            kproc->mem[i].size = resource_size(res);
> >> +
> >> +            dev_dbg(dev, "memory %8s: bus addr %pa size 0x%zx va %pK da 0x%x\n",
> >> +                    data->mems[i].name, &kproc->mem[i].bus_addr,
> >> +                    kproc->mem[i].size, kproc->mem[i].cpu_addr,
> >> +                    kproc->mem[i].dev_addr);
> >> +
> >> +            /* zero out memories to start in a pristine state */
> >> +            /*
> >> +             * FIXME: comment out until kernel crash is fixed, possible
> >> +             * issue with local resets.
> >> +             * memset((__force void *)kproc->mem[i].cpu_addr, 0,
> >> +             *      kproc->mem[i].size);
> >> +             */
> >
> > Things still work without zero'ing out the memory?  As such is it mandatory to
> > do so? Function k3_r5_core_of_get_internal_memories does not do a memset().  And
> > didn't Peng also had this problem?
>
> This is a stale comment, I will clean this up. The zeroing out is not
> strictly needed, it is only to ensure that the DSPs are started in a
> pristine condition. The issue is unrelated to what Peng reported, it is
> not the ARM memset issue (which won't be an issue since I am already
> using the ioremap_wc variant), but rather related to device being
> powered-on to be able to access the DSP internal memories from ARM. This
> won't be powered on at the time this function is invoked anyway. The R5F
> does needs to memzero it for ECC reasons, and does so in the
> k3_r5_rproc_prepare().
>
> >
> >> +    }
> >> +    kproc->num_mems = num_mems;
> >> +
> >> +    return 0;
> >> +}
> >> +
> >> +static int k3_dsp_reserved_mem_init(struct k3_dsp_rproc *kproc)
> >> +{
> >> +    struct device *dev = kproc->dev;
> >> +    struct device_node *np = dev->of_node;
> >> +    struct device_node *rmem_np;
> >> +    struct reserved_mem *rmem;
> >> +    int num_rmems;
> >> +    int ret, i;
> >> +
> >> +    num_rmems = of_property_count_elems_of_size(np, "memory-region",
> >> +                                                sizeof(phandle));
> >> +    if (num_rmems <= 0) {
> >> +            dev_err(dev, "device does not reserved memory regions, ret = %d\n",
> >> +                    num_rmems);
> >> +            return -EINVAL;
> >> +    }
> >> +    if (num_rmems < 2) {
> >> +            dev_err(dev, "device needs atleast two memory regions to be defined, num = %d\n",
> >> +                    num_rmems);
> >> +            return -EINVAL;
> >> +    }
> >> +
> >> +    /* use reserved memory region 0 for vring DMA allocations */
> >> +    ret = of_reserved_mem_device_init_by_idx(dev, np, 0);
> >> +    if (ret) {
> >> +            dev_err(dev, "device cannot initialize DMA pool, ret = %d\n",
> >> +                    ret);
> >> +            return ret;
> >> +    }
> >> +
> >> +    num_rmems--;
> >> +    kproc->rmem = kcalloc(num_rmems, sizeof(*kproc->rmem), GFP_KERNEL);
> >> +    if (!kproc->rmem) {
> >> +            ret = -ENOMEM;
> >> +            goto release_rmem;
> >> +    }
> >> +
> >> +    /* use remaining reserved memory regions for static carveouts */
> >> +    for (i = 0; i < num_rmems; i++) {
> >> +            rmem_np = of_parse_phandle(np, "memory-region", i + 1);
> >> +            if (!rmem_np) {
> >> +                    ret = -EINVAL;
> >> +                    goto unmap_rmem;
> >> +            }
> >> +
> >> +            rmem = of_reserved_mem_lookup(rmem_np);
> >> +            if (!rmem) {
> >> +                    of_node_put(rmem_np);
> >> +                    ret = -EINVAL;
> >> +                    goto unmap_rmem;
> >> +            }
> >> +            of_node_put(rmem_np);
> >> +
> >> +            kproc->rmem[i].bus_addr = rmem->base;
> >> +            /* 64-bit address regions currently not supported */
> >> +            kproc->rmem[i].dev_addr = (u32)rmem->base;
> >> +            kproc->rmem[i].size = rmem->size;
> >> +            kproc->rmem[i].cpu_addr = ioremap_wc(rmem->base, rmem->size);
> >> +            if (!kproc->rmem[i].cpu_addr) {
> >> +                    dev_err(dev, "failed to map reserved memory#%d at %pa of size %pa\n",
> >> +                            i + 1, &rmem->base, &rmem->size);
> >> +                    ret = -ENOMEM;
> >> +                    goto unmap_rmem;
> >> +            }
> >> +
> >> +            dev_dbg(dev, "reserved memory%d: bus addr %pa size 0x%zx va %pK da 0x%x\n",
> >> +                    i + 1, &kproc->rmem[i].bus_addr,
> >> +                    kproc->rmem[i].size, kproc->rmem[i].cpu_addr,
> >> +                    kproc->rmem[i].dev_addr);
> >> +    }
> >> +    kproc->num_rmems = num_rmems;
> >> +
> >> +    return 0;
> >> +
> >> +unmap_rmem:
> >> +    for (i--; i >= 0; i--) {
> >> +            if (kproc->rmem[i].cpu_addr)
> >> +                    iounmap(kproc->rmem[i].cpu_addr);
> >> +    }
> >> +    kfree(kproc->rmem);
> >> +release_rmem:
> >> +    of_reserved_mem_device_release(kproc->dev);
> >> +    return ret;
> >> +}
> >
> > Other than the type of structure passed to the function, this is an exact
> > replica of k3_r5_reserved_mem_init().  Do you foresee either of them changing
> > to a point where reusing code would be counter productive?  I think we are right
> > on the edge where duplication is better than using the same function.
>
> Yeah, nothing at the moment. The number of regions can change, I have
> not enabled the support for addresses beyond 32-bit atm, so that is
> another factor.
>

Right, it is entirely up to you to make the call.  Lease as is or
reused based on what you think is best.

> >
> >> +
> >> +static void k3_dsp_reserved_mem_exit(struct k3_dsp_rproc *kproc)
> >> +{
> >> +    int i;
> >> +
> >> +    for (i = 0; i < kproc->num_rmems; i++)
> >> +            iounmap(kproc->rmem[i].cpu_addr);
> >> +    kfree(kproc->rmem);
> >> +
> >> +    of_reserved_mem_device_release(kproc->dev);
> >> +}
> >> +
> >> +static
> >> +struct ti_sci_proc *k3_dsp_rproc_of_get_tsp(struct device *dev,
> >> +                                        const struct ti_sci_handle *sci)
> >> +{
> >> +    struct ti_sci_proc *tsp;
> >> +    u32 temp[2];
> >> +    int ret;
> >> +
> >> +    ret = of_property_read_u32_array(dev->of_node, "ti,sci-proc-ids",
> >> +                                     temp, 2);
> >> +    if (ret < 0)
> >> +            return ERR_PTR(ret);
> >> +
> >> +    tsp = kzalloc(sizeof(*tsp), GFP_KERNEL);
> >> +    if (!tsp)
> >> +            return ERR_PTR(-ENOMEM);
> >> +
> >> +    tsp->dev = dev;
> >> +    tsp->sci = sci;
> >> +    tsp->ops = &sci->ops.proc_ops;
> >> +    tsp->proc_id = temp[0];
> >> +    tsp->host_id = temp[1];
> >> +
> >> +    return tsp;
> >> +}
> >
> > Contrary to k3_dsp_reserved_mem_init(), this one can definitely be reused for
> > both c66 and r5.
>
> Yeah, but is it worth it introduce a common module for one function?
> Little bit large to define this as an inline function like I have done
> with most of the ti_sci_proc helpers.
>

I see your point.

> >
> >> +
> >> +static int k3_dsp_rproc_probe(struct platform_device *pdev)
> >> +{
> >> +    struct device *dev = &pdev->dev;
> >> +    struct device_node *np = dev->of_node;
> >> +    const struct k3_dsp_dev_data *data;
> >> +    struct k3_dsp_rproc *kproc;
> >> +    struct rproc *rproc;
> >> +    const char *fw_name;
> >> +    int ret = 0;
> >> +    int ret1;
> >> +
> >> +    data = of_device_get_match_data(dev);
> >> +    if (!data)
> >> +            return -ENODEV;
> >> +
> >> +    fw_name = k3_dsp_rproc_get_firmware(dev);
> >> +    if (IS_ERR(fw_name))
> >> +            return PTR_ERR(fw_name);
> >> +
> >> +    rproc = rproc_alloc(dev, dev_name(dev), &k3_dsp_rproc_ops, fw_name,
> >> +                        sizeof(*kproc));
> >> +    if (!rproc)
> >> +            return -ENOMEM;
> >> +
> >> +    rproc->has_iommu = false;
> >> +    rproc->recovery_disabled = true;
> >> +    kproc = rproc->priv;
> >> +    kproc->rproc = rproc;
> >> +    kproc->dev = dev;
> >> +    kproc->data = data;
> >> +
> >> +    kproc->ti_sci = ti_sci_get_by_phandle(np, "ti,sci");
> >> +    if (IS_ERR(kproc->ti_sci)) {
> >> +            ret = PTR_ERR(kproc->ti_sci);
> >> +            if (ret != -EPROBE_DEFER) {
> >> +                    dev_err(dev, "failed to get ti-sci handle, ret = %d\n",
> >> +                            ret);
> >> +            }
> >> +            kproc->ti_sci = NULL;
> >> +            goto free_rproc;
> >> +    }
> >> +
> >> +    ret = of_property_read_u32(np, "ti,sci-dev-id", &kproc->ti_sci_id);
> >> +    if (ret) {
> >> +            dev_err(dev, "missing 'ti,sci-dev-id' property\n");
> >> +            goto put_sci;
> >> +    }
> >> +
> >> +    kproc->reset = devm_reset_control_get_exclusive(dev, NULL);
> >> +    if (IS_ERR(kproc->reset)) {
> >> +            ret = PTR_ERR(kproc->reset);
> >> +            dev_err(dev, "failed to get reset, status = %d\n", ret);
> >> +            goto put_sci;
> >> +    }
> >> +
> >> +    kproc->tsp = k3_dsp_rproc_of_get_tsp(dev, kproc->ti_sci);
> >> +    if (IS_ERR(kproc->tsp)) {
> >> +            dev_err(dev, "failed to construct ti-sci proc control, ret = %d\n",
> >> +                    ret);
> >> +            ret = PTR_ERR(kproc->tsp);
> >> +            goto put_sci;
> >> +    }
> >> +
> >> +    ret = ti_sci_proc_request(kproc->tsp);
> >> +    if (ret < 0) {
> >> +            dev_err(dev, "ti_sci_proc_request failed, ret = %d\n", ret);
> >> +            goto free_tsp;
> >> +    }
> >> +
> >> +    pm_runtime_enable(dev);
> >> +    ret = pm_runtime_get_sync(dev);
> >
> > What do these give you since the dev_pm_ops is not set for the
> > k3_dsp_rproc_driver platform diver and there is no clock specified in the DT?
>
> Yeah, I can drop this. Adding a clock in DT would not have made any
> difference here, but a power-domains property would have. And I don't
> use the power-domains property because of the genpd handling in driver
> core that messes with the device state.
>
> regards
> Sumahn
>
> >
> > Thanks,
> > Mathieu
> >
> >> +    if (ret < 0) {
> >> +            dev_err(dev, "failed to enable clock, status = %d\n", ret);
> >> +            pm_runtime_put_noidle(dev);
> >> +            goto disable_rpm;
> >> +    }
> >> +
> >> +    ret = k3_dsp_rproc_of_get_memories(pdev, kproc);
> >> +    if (ret)
> >> +            goto disable_clk;
> >> +
> >> +    ret = k3_dsp_reserved_mem_init(kproc);
> >> +    if (ret) {
> >> +            dev_err(dev, "reserved memory init failed, ret = %d\n", ret);
> >> +            goto disable_clk;
> >> +    }
> >> +
> >> +    ret = rproc_add(rproc);
> >> +    if (ret) {
> >> +            dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
> >> +                    ret);
> >> +            goto release_mem;
> >> +    }
> >> +
> >> +    platform_set_drvdata(pdev, kproc);
> >> +
> >> +    return 0;
> >> +
> >> +release_mem:
> >> +    k3_dsp_reserved_mem_exit(kproc);
> >> +disable_clk:
> >> +    pm_runtime_put_sync(dev);
> >> +disable_rpm:
> >> +    pm_runtime_disable(dev);
> >> +    ret1 = ti_sci_proc_release(kproc->tsp);
> >> +    if (ret1)
> >> +            dev_err(dev, "failed to release proc, ret = %d\n", ret1);
> >> +free_tsp:
> >> +    kfree(kproc->tsp);
> >> +put_sci:
> >> +    ret1 = ti_sci_put_handle(kproc->ti_sci);
> >> +    if (ret1)
> >> +            dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret1);
> >> +free_rproc:
> >> +    rproc_free(rproc);
> >> +    return ret;
> >> +}
> >> +
> >> +static int k3_dsp_rproc_remove(struct platform_device *pdev)
> >> +{
> >> +    struct k3_dsp_rproc *kproc = platform_get_drvdata(pdev);
> >> +    struct device *dev = &pdev->dev;
> >> +    int ret;
> >> +
> >> +    rproc_del(kproc->rproc);
> >> +    pm_runtime_put_sync(&pdev->dev);
> >> +    pm_runtime_disable(&pdev->dev);
> >> +
> >> +    ret = ti_sci_proc_release(kproc->tsp);
> >> +    if (ret)
> >> +            dev_err(dev, "failed to release proc, ret = %d\n", ret);
> >> +
> >> +    kfree(kproc->tsp);
> >> +
> >> +    ret = ti_sci_put_handle(kproc->ti_sci);
> >> +    if (ret)
> >> +            dev_err(dev, "failed to put ti_sci handle, ret = %d\n", ret);
> >> +
> >> +    k3_dsp_reserved_mem_exit(kproc);
> >> +    rproc_free(kproc->rproc);
> >> +
> >> +    return 0;
> >> +}
> >> +
> >> +static const struct k3_dsp_mem_data c66_mems[] = {
> >> +    { .name = "l2sram", .dev_addr = 0x800000 },
> >> +    { .name = "l1pram", .dev_addr = 0xe00000 },
> >> +    { .name = "l1dram", .dev_addr = 0xf00000 },
> >> +};
> >> +
> >> +static const struct k3_dsp_dev_data c66_data = {
> >> +    .mems = c66_mems,
> >> +    .num_mems = ARRAY_SIZE(c66_mems),
> >> +    .boot_align_addr = SZ_1K,
> >> +    .uses_lreset = true,
> >> +};
> >> +
> >> +static const struct of_device_id k3_dsp_of_match[] = {
> >> +    { .compatible = "ti,j721e-c66-dsp", .data = &c66_data, },
> >> +    { /* sentinel */ },
> >> +};
> >> +MODULE_DEVICE_TABLE(of, k3_dsp_of_match);
> >> +
> >> +static struct platform_driver k3_dsp_rproc_driver = {
> >> +    .probe  = k3_dsp_rproc_probe,
> >> +    .remove = k3_dsp_rproc_remove,
> >> +    .driver = {
> >> +            .name = "k3-dsp-rproc",
> >> +            .of_match_table = k3_dsp_of_match,
> >> +    },
> >> +};
> >> +
> >> +module_platform_driver(k3_dsp_rproc_driver);
> >> +
> >> +MODULE_AUTHOR("Suman Anna <s-anna@ti.com>");
> >> +MODULE_LICENSE("GPL v2");
> >> +MODULE_DESCRIPTION("TI K3 DSP Remoteproc driver");
> >> --
> >> 2.23.0
> >>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on C66x DSPs
  2020-04-28 20:09     ` Mathieu Poirier
@ 2020-05-13 22:31       ` Suman Anna
  0 siblings, 0 replies; 13+ messages in thread
From: Suman Anna @ 2020-05-13 22:31 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Bjorn Andersson, Rob Herring, Lokesh Vutla, linux-remoteproc,
	devicetree, linux-arm-kernel, Linux Kernel Mailing List

Hi Mathieu,

On 4/28/20 3:09 PM, Mathieu Poirier wrote:
> On Tue, 28 Apr 2020 at 13:58, Mathieu Poirier
> <mathieu.poirier@linaro.org> wrote:
>>
>> On Wed, Mar 25, 2020 at 03:18:39PM -0500, Suman Anna wrote:
>>> The resets for the DSP processors on K3 SoCs are managed through the
>>> Power and Sleep Controller (PSC) module. Each DSP typically has two
>>> resets - a global module reset for powering on the device, and a local
>>> reset that affects only the CPU while allowing access to the other
>>> sub-modules within the DSP processor sub-systems.
>>>
>>> The C66x DSPs have two levels of internal RAMs that can be used to
>>> boot from, and the firmware loading into these RAMs require the
>>> local reset to be asserted with the device powered on/enabled using
>>> the module reset. Enhance the K3 DSP remoteproc driver to add support
>>> for loading into the internal RAMs. The local reset is deasserted on
>>> SoC power-on-reset, so logic has to be added in probe in remoteproc
>>> mode to balance the remoteproc state-machine.
>>>
>>> Note that the local resets are a no-op on C71x cores, and the hardware
>>> does not supporting loading into its internal RAMs.
>>>
>>> Signed-off-by: Suman Anna <s-anna@ti.com>
>>> ---
>>>   drivers/remoteproc/ti_k3_dsp_remoteproc.c | 82 +++++++++++++++++++++++
>>>   1 file changed, 82 insertions(+)
>>>
>>> diff --git a/drivers/remoteproc/ti_k3_dsp_remoteproc.c b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
>>> index fd0d84f46f90..7b712ef74611 100644
>>> --- a/drivers/remoteproc/ti_k3_dsp_remoteproc.c
>>> +++ b/drivers/remoteproc/ti_k3_dsp_remoteproc.c
>>> @@ -175,6 +175,9 @@ static int k3_dsp_rproc_reset(struct k3_dsp_rproc *kproc)
>>>                return ret;
>>>        }
>>>
>>> +     if (kproc->data->uses_lreset)
>>> +             return ret;
>>> +
>>>        ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
>>>                                                    kproc->ti_sci_id);
>>>        if (ret) {
>>> @@ -192,6 +195,9 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>>>        struct device *dev = kproc->dev;
>>>        int ret;
>>>
>>> +     if (kproc->data->uses_lreset)
>>> +             goto lreset;
>>> +
>>>        ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
>>>                                                   kproc->ti_sci_id);
>>>        if (ret) {
>>> @@ -199,6 +205,7 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>>>                return ret;
>>>        }
>>>
>>> +lreset:
>>>        ret = reset_control_deassert(kproc->reset);
>>>        if (ret) {
>>>                dev_err(dev, "local-reset deassert failed, ret = %d\n", ret);
>>> @@ -210,6 +217,63 @@ static int k3_dsp_rproc_release(struct k3_dsp_rproc *kproc)
>>>        return ret;
>>>   }
>>>
>>> +/*
>>> + * The C66x DSP cores have a local reset that affects only the CPU, and a
>>> + * generic module reset that powers on the device and allows the DSP internal
>>> + * memories to be accessed while the local reset is asserted. This function is
>>> + * used to release the global reset on C66x DSPs to allow loading into the DSP
>>> + * internal RAMs. The .prepare() ops is invoked by remoteproc core before any
>>> + * firmware loading, and is followed by the .start() ops after loading to
>>> + * actually let the C66x DSP cores run. The local reset on C71x cores is a
>>> + * no-op and the global reset cannot be released on C71x cores until after
>>> + * the firmware images are loaded, so this function does nothing for C71x cores.
>>> + */
>>> +static int k3_dsp_rproc_prepare(struct rproc *rproc)
>>> +{
>>> +     struct k3_dsp_rproc *kproc = rproc->priv;
>>> +     struct device *dev = kproc->dev;
>>> +     int ret;
>>> +
>>> +     /* local reset is no-op on C71x processors */
>>> +     if (!kproc->data->uses_lreset)
>>> +             return 0;
>>
>> In k3_dsp_rproc_release() the condition is "if (kproc->data->uses_lreset)" and
>> here it is the opposite, which did a good job at getting me confused.

Do you prefer I add a comment there? It needs to bail out there since 
the get_device portion would be executed here.

>>
>> Taking a step back, I assume c71 DSPs will have their own k3_dsp_dev_data where
>> the users_lreset flag will be false.  

Yes.

In that case I think it would make the
>> code easier to understand if the k3_dsp_rproc_ops was declared without the
>> .prepare and .unprepare.  In probe(), if data->uses_lreset is true then
>> k3_dsp_rproc_prepare() and k3_dsp_rproc_unprepare() are set.

Yeah, ok, that will avoid the confusion and limit the 
prepare()/unprepare() only for C66 DSPs.

>>
> 
> I forgot... Since this is a C71 related change, was there a reason to
> lump it with the C66 set?  If not I would simply move that to the C71
> work.

OK, I can remove this logic here, and add the prepare()/unprepare() 
conditionally for C66x in the C71 patch.

> 
>> I am done reviewing this set.

Thanks for all the review comments.

regards
Suman

>>
>> Thanks,
>> Mathieu
>>
>>> +
>>> +     ret = kproc->ti_sci->ops.dev_ops.get_device(kproc->ti_sci,
>>> +                                                 kproc->ti_sci_id);
>>> +     if (ret)
>>> +             dev_err(dev, "module-reset deassert failed, cannot enable internal RAM loading, ret = %d\n",
>>> +                     ret);
>>> +
>>> +     return ret;
>>> +}
>>> +
>>> +/*
>>> + * This function implements the .unprepare() ops and performs the complimentary
>>> + * operations to that of the .prepare() ops. The function is used to assert the
>>> + * global reset on applicable C66x cores. This completes the second portion of
>>> + * powering down the C66x DSP cores. The cores themselves are only halted in the
>>> + * .stop() callback through the local reset, and the .unprepare() ops is invoked
>>> + * by the remoteproc core after the remoteproc is stopped to balance the global
>>> + * reset.
>>> + */
>>> +static int k3_dsp_rproc_unprepare(struct rproc *rproc)
>>> +{
>>> +     struct k3_dsp_rproc *kproc = rproc->priv;
>>> +     struct device *dev = kproc->dev;
>>> +     int ret;
>>> +
>>> +     /* local reset is no-op on C71x processors */
>>> +     if (!kproc->data->uses_lreset)
>>> +             return 0;
>>> +
>>> +     ret = kproc->ti_sci->ops.dev_ops.put_device(kproc->ti_sci,
>>> +                                                 kproc->ti_sci_id);
>>> +     if (ret)
>>> +             dev_err(dev, "module-reset assert failed, ret = %d\n", ret);
>>> +
>>> +     return ret;
>>> +}
>>> +
>>>   /*
>>>    * Power up the DSP remote processor.
>>>    *
>>> @@ -353,6 +417,8 @@ static void *k3_dsp_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
>>>   }
>>>
>>>   static const struct rproc_ops k3_dsp_rproc_ops = {
>>> +     .prepare        = k3_dsp_rproc_prepare,
>>> +     .unprepare      = k3_dsp_rproc_unprepare,
>>>        .start          = k3_dsp_rproc_start,
>>>        .stop           = k3_dsp_rproc_stop,
>>>        .kick           = k3_dsp_rproc_kick,
>>> @@ -644,6 +710,22 @@ static int k3_dsp_rproc_probe(struct platform_device *pdev)
>>>                goto disable_clk;
>>>        }
>>>
>>> +     /*
>>> +      * ensure the DSP local reset is asserted to ensure the DSP doesn't
>>> +      * execute bogus code in .prepare() when the module reset is released.
>>> +      */
>>> +     if (data->uses_lreset) {
>>> +             ret = reset_control_status(kproc->reset);
>>> +             if (ret < 0) {
>>> +                     dev_err(dev, "failed to get reset status, status = %d\n",
>>> +                             ret);
>>> +                     goto release_mem;
>>> +             } else if (ret == 0) {
>>> +                     dev_warn(dev, "local reset is deasserted for device\n");
>>> +                     k3_dsp_rproc_reset(kproc);
>>> +             }
>>> +     }
>>> +
>>>        ret = rproc_add(rproc);
>>>        if (ret) {
>>>                dev_err(dev, "failed to add register device with remoteproc core, status = %d\n",
>>> --
>>> 2.23.0
>>>


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-05-13 22:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-25 20:18 [PATCH 0/3] TI K3 DSP remoteproc driver for C66x DSPs Suman Anna
2020-03-25 20:18 ` [PATCH 1/3] dt-bindings: remoteproc: Add bindings for C66x DSPs on TI K3 SoCs Suman Anna
2020-03-26 16:54   ` Rob Herring
2020-04-27 19:49   ` Mathieu Poirier
2020-05-13 17:20     ` Suman Anna
2020-03-25 20:18 ` [PATCH 2/3] remoteproc/k3-dsp: Add a remoteproc driver of K3 C66x DSPs Suman Anna
2020-04-27 22:57   ` Mathieu Poirier
2020-05-13 18:14     ` Suman Anna
2020-05-13 19:40       ` Mathieu Poirier
2020-03-25 20:18 ` [PATCH 3/3] remoteproc/k3-dsp: Add support for L2RAM loading on " Suman Anna
2020-04-28 19:58   ` Mathieu Poirier
2020-04-28 20:09     ` Mathieu Poirier
2020-05-13 22:31       ` Suman Anna

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).