linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers
@ 2019-12-05 17:15 Eddie James
  2019-12-05 17:15 ` [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James
                   ` (11 more replies)
  0 siblings, 12 replies; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

This series first adds a driver to control the interrupt controller provided by
the System Control Unit (SCU) on the AST2500 and AST2600 SOCs. The interrupts
made available are necessary for the control of the XDMA engine embedded in the
same Aspeed SOCs.
This series then adds a driver to control the XDMA engine. This driver was
previously sent to the list without support for the AST2600, and has been
refactored significantly to enable that support. The XDMA engine performs
automatic DMA operations between the Aspeed SOC (acting as a BMC) and a host
processor.

Changes since v1:
 - See individual patches
 - In summary, first the irqchip driver switched to use the parent SCU regmap
   rather than iomapping it's register. Secondly, the XDMA initialization
   switched to use properties from the device tree rather than dynamically
   calculate memory spaces, and system config.

Eddie James (12):
  dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller
  irqchip: Add Aspeed SCU interrupt controller
  ARM: dts: aspeed: ast2500: Add SCU interrupt controller
  ARM: dts: aspeed: ast2600: Add SCU interrupt controllers
  dt-bindings: soc: Add Aspeed XDMA Engine
  drivers/soc: Add Aspeed XDMA Engine Driver
  drivers/soc: xdma: Add user interface
  ARM: dts: aspeed: ast2500: Add XDMA Engine
  ARM: dts: aspeed: ast2600: Add XDMA Engine
  ARM: dts: aspeed: witherspoon: Enable XDMA Engine
  ARM: dts: aspeed: rainier: Enable XDMA engine
  ARM: dts: aspeed: tacoma: Enable XDMA engine

 .../interrupt-controller/aspeed,ast2xxx-scu-ic.txt |   23 +
 .../devicetree/bindings/soc/aspeed/xdma.txt        |   43 +
 MAINTAINERS                                        |   16 +
 arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts       |    5 +
 arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts        |    5 +
 arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts   |    5 +
 arch/arm/boot/dts/aspeed-g5.dtsi                   |   25 +-
 arch/arm/boot/dts/aspeed-g6.dtsi                   |   33 +
 drivers/irqchip/Makefile                           |    2 +-
 drivers/irqchip/irq-aspeed-scu-ic.c                |  239 +++++
 drivers/soc/aspeed/Kconfig                         |    8 +
 drivers/soc/aspeed/Makefile                        |    1 +
 drivers/soc/aspeed/aspeed-xdma.c                   | 1007 ++++++++++++++++++++
 .../interrupt-controller/aspeed-scu-ic.h           |   23 +
 include/uapi/linux/aspeed-xdma.h                   |   41 +
 15 files changed, 1473 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
 create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt
 create mode 100644 drivers/irqchip/irq-aspeed-scu-ic.c
 create mode 100644 drivers/soc/aspeed/aspeed-xdma.c
 create mode 100644 include/dt-bindings/interrupt-controller/aspeed-scu-ic.h
 create mode 100644 include/uapi/linux/aspeed-xdma.h

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-10 23:32   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 02/12] irqchip: " Eddie James
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Document the Aspeed SCU interrupt controller and add an include file
for the interrupts it provides.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
Changes since v1:
 - Remove 'reg' required property.

 .../interrupt-controller/aspeed,ast2xxx-scu-ic.txt | 23 ++++++++++++++++++++++
 MAINTAINERS                                        |  7 +++++++
 .../interrupt-controller/aspeed-scu-ic.h           | 23 ++++++++++++++++++++++
 3 files changed, 53 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
 create mode 100644 include/dt-bindings/interrupt-controller/aspeed-scu-ic.h

diff --git a/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
new file mode 100644
index 0000000..251ed44
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
@@ -0,0 +1,23 @@
+Aspeed AST25XX and AST26XX SCU Interrupt Controller
+
+Required Properties:
+ - #interrupt-cells		: must be 1
+ - compatible			: must be "aspeed,ast2500-scu-ic",
+				  "aspeed,ast2600-scu-ic0" or
+				  "aspeed,ast2600-scu-ic1"
+ - interrupts			: interrupt from the parent controller
+ - interrupt-controller		: indicates that the controller receives and
+				  fires new interrupts for child busses
+
+Example:
+
+    syscon@1e6e2000 {
+        ranges = <0 0x1e6e2000 0x1a8>;
+
+        scu_ic: interrupt-controller@18 {
+            #interrupt-cells = <1>;
+            compatible = "aspeed,ast2500-scu-ic";
+            interrupts = <21>;
+            interrupt-controller;
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index 18c1137..6bbb194 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2691,6 +2691,13 @@ S:	Maintained
 F:	drivers/pinctrl/aspeed/
 F:	Documentation/devicetree/bindings/pinctrl/aspeed,*
 
+ASPEED SCU INTERRUPT CONTROLLER DRIVER
+M:	Eddie James <eajames@linux.ibm.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+S:	Maintained
+F:	Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
+F:	include/dt-bindings/interrupt-controller/aspeed-scu-ic.h
+
 ASPEED VIDEO ENGINE DRIVER
 M:	Eddie James <eajames@linux.ibm.com>
 L:	linux-media@vger.kernel.org
diff --git a/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h b/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h
new file mode 100644
index 0000000..f315d5a
--- /dev/null
+++ b/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_
+#define _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_
+
+#define ASPEED_SCU_IC_VGA_CURSOR_CHANGE			0
+#define ASPEED_SCU_IC_VGA_SCRATCH_REG_CHANGE		1
+
+#define ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI	2
+#define ASPEED_AST2500_SCU_IC_PCIE_RESET_HI_TO_LO	3
+#define ASPEED_AST2500_SCU_IC_LPC_RESET_LO_TO_HI	4
+#define ASPEED_AST2500_SCU_IC_LPC_RESET_HI_TO_LO	5
+#define ASPEED_AST2500_SCU_IC_ISSUE_MSI			6
+
+#define ASPEED_AST2600_SCU_IC0_PCIE_PERST_LO_TO_HI	2
+#define ASPEED_AST2600_SCU_IC0_PCIE_PERST_HI_TO_LO	3
+#define ASPEED_AST2600_SCU_IC0_PCIE_RCRST_LO_TO_HI	4
+#define ASPEED_AST2600_SCU_IC0_PCIE_RCRST_HI_TO_LO	5
+
+#define ASPEED_AST2600_SCU_IC1_LPC_RESET_LO_TO_HI	0
+#define ASPEED_AST2600_SCU_IC1_LPC_RESET_HI_TO_LO	1
+
+#endif /* _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 02/12] irqchip: Add Aspeed SCU interrupt controller
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
  2019-12-05 17:15 ` [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  0:32   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

The Aspeed SOCs provide some interrupts through the System Control
Unit registers. Add an interrupt controller that provides these
interrupts to the system.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
Changes since v1:
 - Use the parent SCU regmap rather than iomap its register range.
 - Use handle_level_irq rather than handle_simple_irq.

 MAINTAINERS                         |   1 +
 drivers/irqchip/Makefile            |   2 +-
 drivers/irqchip/irq-aspeed-scu-ic.c | 239 ++++++++++++++++++++++++++++++++++++
 3 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100644 drivers/irqchip/irq-aspeed-scu-ic.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 6bbb194..398861a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2696,6 +2696,7 @@ M:	Eddie James <eajames@linux.ibm.com>
 L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
 S:	Maintained
 F:	Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt
+F:	drivers/irqchip/irq-aspeed-scu-ic.c
 F:	include/dt-bindings/interrupt-controller/aspeed-scu-ic.h
 
 ASPEED VIDEO ENGINE DRIVER
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index e806dda..6c9262c 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -87,7 +87,7 @@ obj-$(CONFIG_MVEBU_SEI)			+= irq-mvebu-sei.o
 obj-$(CONFIG_LS_EXTIRQ)			+= irq-ls-extirq.o
 obj-$(CONFIG_LS_SCFG_MSI)		+= irq-ls-scfg-msi.o
 obj-$(CONFIG_EZNPS_GIC)			+= irq-eznps.o
-obj-$(CONFIG_ARCH_ASPEED)		+= irq-aspeed-vic.o irq-aspeed-i2c-ic.o
+obj-$(CONFIG_ARCH_ASPEED)		+= irq-aspeed-vic.o irq-aspeed-i2c-ic.o irq-aspeed-scu-ic.o
 obj-$(CONFIG_STM32_EXTI) 		+= irq-stm32-exti.o
 obj-$(CONFIG_QCOM_IRQ_COMBINER)		+= qcom-irq-combiner.o
 obj-$(CONFIG_IRQ_UNIPHIER_AIDET)	+= irq-uniphier-aidet.o
diff --git a/drivers/irqchip/irq-aspeed-scu-ic.c b/drivers/irqchip/irq-aspeed-scu-ic.c
new file mode 100644
index 0000000..c90a334
--- /dev/null
+++ b/drivers/irqchip/irq-aspeed-scu-ic.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Aspeed AST24XX, AST25XX, and AST26XX SCU Interrupt Controller
+ * Copyright 2019 IBM Corporation
+ *
+ * Eddie James <eajames@linux.ibm.com>
+ */
+
+#include <linux/bitops.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/mfd/syscon.h>
+#include <linux/of_irq.h>
+#include <linux/regmap.h>
+
+#define ASPEED_SCU_IC_REG		0x018
+#define ASPEED_SCU_IC_SHIFT		0
+#define ASPEED_SCU_IC_ENABLE		GENMASK(6, ASPEED_SCU_IC_SHIFT)
+#define ASPEED_SCU_IC_NUM_IRQS		7
+#define ASPEED_SCU_IC_STATUS_SHIFT	16
+
+#define ASPEED_AST2600_SCU_IC0_REG	0x560
+#define ASPEED_AST2600_SCU_IC0_SHIFT	0
+#define ASPEED_AST2600_SCU_IC0_ENABLE	\
+	GENMASK(5, ASPEED_AST2600_SCU_IC0_SHIFT)
+#define ASPEED_AST2600_SCU_IC0_NUM_IRQS	6
+
+#define ASPEED_AST2600_SCU_IC1_REG	0x570
+#define ASPEED_AST2600_SCU_IC1_SHIFT	4
+#define ASPEED_AST2600_SCU_IC1_ENABLE	\
+	GENMASK(5, ASPEED_AST2600_SCU_IC1_SHIFT)
+#define ASPEED_AST2600_SCU_IC1_NUM_IRQS	2
+
+struct aspeed_scu_ic {
+	unsigned long irq_enable;
+	unsigned long irq_shift;
+	unsigned int num_irqs;
+	unsigned int reg;
+	struct regmap *scu;
+	struct irq_domain *irq_domain;
+};
+
+static void aspeed_scu_ic_irq_handler(struct irq_desc *desc)
+{
+	unsigned int irq;
+	unsigned int sts;
+	unsigned long bit;
+	unsigned long enabled;
+	unsigned long max;
+	unsigned long status;
+	struct aspeed_scu_ic *scu_ic = irq_desc_get_handler_data(desc);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	unsigned int mask = scu_ic->irq_enable << ASPEED_SCU_IC_STATUS_SHIFT;
+
+	chained_irq_enter(chip, desc);
+
+	/*
+	 * The SCU IC has just one register to control its operation and read
+	 * status. The interrupt enable bits occupy the lower 16 bits of the
+	 * register, while the interrupt status bits occupy the upper 16 bits.
+	 * The status bit for a given interrupt is always 16 bits shifted from
+	 * the enable bit for the same interrupt.
+	 * Therefore, perform the IRQ operations in the enable bit space by
+	 * shifting the status down to get the mapping and then back up to
+	 * clear the bit.
+	 */
+	regmap_read(scu_ic->scu, scu_ic->reg, &sts);
+	enabled = sts & scu_ic->irq_enable;
+	status = (sts >> ASPEED_SCU_IC_STATUS_SHIFT) & enabled;
+
+	bit = scu_ic->irq_shift;
+	max = scu_ic->num_irqs + bit;
+
+	for_each_set_bit_from(bit, &status, max) {
+		irq = irq_find_mapping(scu_ic->irq_domain,
+				       bit - scu_ic->irq_shift);
+		generic_handle_irq(irq);
+
+		regmap_update_bits(scu_ic->scu, scu_ic->reg, mask,
+				   BIT(bit + ASPEED_SCU_IC_STATUS_SHIFT));
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static void aspeed_scu_ic_irq_mask(struct irq_data *data)
+{
+	struct aspeed_scu_ic *scu_ic = irq_data_get_irq_chip_data(data);
+	unsigned int mask = BIT(data->hwirq + scu_ic->irq_shift) |
+		(scu_ic->irq_enable << ASPEED_SCU_IC_STATUS_SHIFT);
+
+	/*
+	 * Status bits are cleared by writing 1. In order to prevent the mask
+	 * operation from clearing the status bits, they should be under the
+	 * mask and written with 0.
+	 */
+	regmap_update_bits(scu_ic->scu, scu_ic->reg, mask, 0);
+}
+
+static void aspeed_scu_ic_irq_unmask(struct irq_data *data)
+{
+	struct aspeed_scu_ic *scu_ic = irq_data_get_irq_chip_data(data);
+	unsigned int bit = BIT(data->hwirq + scu_ic->irq_shift);
+	unsigned int mask = bit |
+		(scu_ic->irq_enable << ASPEED_SCU_IC_STATUS_SHIFT);
+
+	/*
+	 * Status bits are cleared by writing 1. In order to prevent the unmask
+	 * operation from clearing the status bits, they should be under the
+	 * mask and written with 0.
+	 */
+	regmap_update_bits(scu_ic->scu, scu_ic->reg, mask, bit);
+}
+
+static int aspeed_scu_ic_irq_set_affinity(struct irq_data *data,
+					  const struct cpumask *dest,
+					  bool force)
+{
+	return -EINVAL;
+}
+
+static struct irq_chip aspeed_scu_ic_chip = {
+	.name			= "aspeed-scu-ic",
+	.irq_mask		= aspeed_scu_ic_irq_mask,
+	.irq_unmask		= aspeed_scu_ic_irq_unmask,
+	.irq_set_affinity	= aspeed_scu_ic_irq_set_affinity,
+};
+
+static int aspeed_scu_ic_map(struct irq_domain *domain, unsigned int irq,
+			     irq_hw_number_t hwirq)
+{
+	irq_set_chip_and_handler(irq, &aspeed_scu_ic_chip, handle_level_irq);
+	irq_set_chip_data(irq, domain->host_data);
+
+	return 0;
+}
+
+static const struct irq_domain_ops aspeed_scu_ic_domain_ops = {
+	.map = aspeed_scu_ic_map,
+};
+
+static int aspeed_scu_ic_of_init_common(struct aspeed_scu_ic *scu_ic,
+					struct device_node *node)
+{
+	int irq;
+	int rc = 0;
+
+	if (!node->parent) {
+		rc = -ENODEV;
+		goto err;
+	}
+
+	scu_ic->scu = syscon_node_to_regmap(node->parent);
+	if (IS_ERR(scu_ic->scu)) {
+		rc = PTR_ERR(scu_ic->scu);
+		goto err;
+	}
+
+	irq = irq_of_parse_and_map(node, 0);
+	if (irq < 0) {
+		rc = irq;
+		goto err;
+	}
+
+	scu_ic->irq_domain = irq_domain_add_linear(node, scu_ic->num_irqs,
+						   &aspeed_scu_ic_domain_ops,
+						   scu_ic);
+	if (!scu_ic->irq_domain) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	irq_set_chained_handler_and_data(irq, aspeed_scu_ic_irq_handler,
+					 scu_ic);
+
+	return 0;
+
+err:
+	kfree(scu_ic);
+
+	return rc;
+}
+
+static int __init aspeed_scu_ic_of_init(struct device_node *node,
+					struct device_node *parent)
+{
+	struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL);
+
+	if (!scu_ic)
+		return -ENOMEM;
+
+	scu_ic->irq_enable = ASPEED_SCU_IC_ENABLE;
+	scu_ic->irq_shift = ASPEED_SCU_IC_SHIFT;
+	scu_ic->num_irqs = ASPEED_SCU_IC_NUM_IRQS;
+	scu_ic->reg = ASPEED_SCU_IC_REG;
+
+	return aspeed_scu_ic_of_init_common(scu_ic, node);
+}
+
+static int __init aspeed_ast2600_scu_ic0_of_init(struct device_node *node,
+						 struct device_node *parent)
+{
+	struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL);
+
+	if (!scu_ic)
+		return -ENOMEM;
+
+	scu_ic->irq_enable = ASPEED_AST2600_SCU_IC0_ENABLE;
+	scu_ic->irq_shift = ASPEED_AST2600_SCU_IC0_SHIFT;
+	scu_ic->num_irqs = ASPEED_AST2600_SCU_IC0_NUM_IRQS;
+	scu_ic->reg = ASPEED_AST2600_SCU_IC0_REG;
+
+	return aspeed_scu_ic_of_init_common(scu_ic, node);
+}
+
+static int __init aspeed_ast2600_scu_ic1_of_init(struct device_node *node,
+						 struct device_node *parent)
+{
+	struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL);
+
+	if (!scu_ic)
+		return -ENOMEM;
+
+	scu_ic->irq_enable = ASPEED_AST2600_SCU_IC1_ENABLE;
+	scu_ic->irq_shift = ASPEED_AST2600_SCU_IC1_SHIFT;
+	scu_ic->num_irqs = ASPEED_AST2600_SCU_IC1_NUM_IRQS;
+	scu_ic->reg = ASPEED_AST2600_SCU_IC1_REG;
+
+	return aspeed_scu_ic_of_init_common(scu_ic, node);
+}
+
+IRQCHIP_DECLARE(ast2400_scu_ic, "aspeed,ast2400-scu-ic", aspeed_scu_ic_of_init);
+IRQCHIP_DECLARE(ast2500_scu_ic, "aspeed,ast2500-scu-ic", aspeed_scu_ic_of_init);
+IRQCHIP_DECLARE(ast2600_scu_ic0, "aspeed,ast2600-scu-ic0",
+		aspeed_ast2600_scu_ic0_of_init);
+IRQCHIP_DECLARE(ast2600_scu_ic1, "aspeed,ast2600-scu-ic1",
+		aspeed_ast2600_scu_ic1_of_init);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add SCU interrupt controller
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
  2019-12-05 17:15 ` [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James
  2019-12-05 17:15 ` [PATCH v2 02/12] irqchip: " Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  0:36   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Add a node for the interrupt controller provided by the SCU.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 arch/arm/boot/dts/aspeed-g5.dtsi | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index a259c63..df44022 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -216,8 +216,9 @@
 			syscon: syscon@1e6e2000 {
 				compatible = "aspeed,ast2500-scu", "syscon", "simple-mfd";
 				reg = <0x1e6e2000 0x1a8>;
+				ranges = <0 0x1e6e2000 0x1a8>;
 				#address-cells = <1>;
-				#size-cells = <0>;
+				#size-cells = <1>;
 				#clock-cells = <1>;
 				#reset-cells = <1>;
 
@@ -231,6 +232,13 @@
 					compatible = "aspeed,ast2500-p2a-ctrl";
 					status = "disabled";
 				};
+
+				scu_ic: interrupt-controller@18 {
+					#interrupt-cells = <1>;
+					compatible = "aspeed,ast2500-scu-ic";
+					interrupts = <21>;
+					interrupt-controller;
+				};
 			};
 
 			rng: hwrng@1e6e2078 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (2 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  0:42   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Add nodes for the interrupt controllers provided by the SCU.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 arch/arm/boot/dts/aspeed-g6.dtsi | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
index 5f6142d..ead336e 100644
--- a/arch/arm/boot/dts/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed-g6.dtsi
@@ -288,6 +288,20 @@
 					compatible = "aspeed,ast2600-smpmem";
 					reg = <0x180 0x40>;
 				};
+
+				scu_ic0: interrupt-controller@0 {
+					#interrupt-cells = <1>;
+					compatible = "aspeed,ast2600-scu-ic0";
+					interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+					interrupt-controller;
+				};
+
+				scu_ic1: interrupt-controller@1 {
+					#interrupt-cells = <1>;
+					compatible = "aspeed,ast2600-scu-ic1";
+					interrupts = <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>;
+					interrupt-controller;
+				};
 			};
 
 			rng: hwrng@1e6e2524 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (3 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  0:40   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Document the bindings for the Aspeed AST25XX and AST26XX XDMA engine.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
Changes since v1:
 - Add 'clocks', 'scu', 'sdmc', 'vga-mem', and 'pcie-device' property
   documentation

 .../devicetree/bindings/soc/aspeed/xdma.txt        | 43 ++++++++++++++++++++++
 MAINTAINERS                                        |  6 +++
 2 files changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt

diff --git a/Documentation/devicetree/bindings/soc/aspeed/xdma.txt b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt
new file mode 100644
index 0000000..942dc07
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt
@@ -0,0 +1,43 @@
+Aspeed AST25XX and AST26XX XDMA Engine
+
+The XDMA Engine embedded in the AST2500 and AST2600 SOCs can perform automatic
+DMA operations over PCI between the SOC (acting as a BMC) and a host processor.
+
+Required properties:
+ - compatible		: must be "aspeed,ast2500-xdma" or
+			  "aspeed,ast2600-xdma"
+ - reg			: contains the address and size of the memory region
+			  associated with the XDMA engine registers
+ - clocks		: clock specifier for the clock associated with the
+			  XDMA engine
+ - resets		: reset specifier for the syscon reset associated with
+			  the XDMA engine
+ - interrupts-extended	: two interrupt nodes; the first specifies the global
+			  interrupt for the XDMA engine and the second
+			  specifies the PCI-E reset or PERST interrupt.
+ - scu			: a phandle to the syscon node for the system control
+			  unit of the SOC
+ - sdmc			: a phandle to the syscon node for the SDRAM memory
+			  controller of the SOC
+ - vga-mem		: contains the address and size of the VGA memory space
+			  to be used by the XDMA engine
+
+Optional properties:
+ - pcie-device		: should be either "bmc" or "vga", corresponding to
+			  which device should be used by the XDMA engine for
+			  DMA operations. If this property is not set, the XDMA
+			  engine will use the BMC PCI-E device.
+
+Example:
+
+    xdma@1e6e7000 {
+        compatible = "aspeed,ast2500-xdma";
+        reg = <0x1e6e7000 0x100>;
+        clocks = <&syscon ASPEED_CLK_GATE_BCLK>;
+        resets = <&syscon ASPEED_RESET_XDMA>;
+        interrupts-extended = <&vic 6>, <&scu_ic ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI>;
+        scu = <&syscon>;
+        sdmc = <&edac>;
+        pcie-device = "bmc";
+        vga-mem = <0x9f000000 0x01000000>;
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index 398861a..528a142 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2707,6 +2707,12 @@ S:	Maintained
 F:	drivers/media/platform/aspeed-video.c
 F:	Documentation/devicetree/bindings/media/aspeed-video.txt
 
+ASPEED XDMA ENGINE DRIVER
+M:	Eddie James <eajames@linux.ibm.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+S:	Maintained
+F:	Documentation/devicetree/bindings/soc/aspeed/xdma.txt
+
 ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
 M:	Corentin Chary <corentin.chary@gmail.com>
 L:	acpi4asus-user@lists.sourceforge.net
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (4 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  3:47   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 07/12] drivers/soc: xdma: Add user interface Eddie James
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI
DMA operations between the SOC (acting as a BMC) and a host processor
in a server.

This commit adds a driver to control the XDMA engine and adds functions
to initialize the hardware and memory and start DMA operations.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
Changes since v1:
 - Add lock comments
 - Don't split reset into start/finish
 - Drop client_lock as the new reset method means client data is safe
 - Switch to a chip structure to control versions rather than enum/switch
 - Drop in_progress bool in favor or current_client being NULL or not
 - Get SDRAM controller from dts phandle
 - Configure PCI-E based on dts property
 - Get VGA memory space from dts property
 - Drop bmc_addr from aspeed_xdma_op as mmap can do all that

 MAINTAINERS                      |   2 +
 drivers/soc/aspeed/Kconfig       |   8 +
 drivers/soc/aspeed/Makefile      |   1 +
 drivers/soc/aspeed/aspeed-xdma.c | 783 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/aspeed-xdma.h |  41 ++
 5 files changed, 835 insertions(+)
 create mode 100644 drivers/soc/aspeed/aspeed-xdma.c
 create mode 100644 include/uapi/linux/aspeed-xdma.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 528a142..617c03d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2712,6 +2712,8 @@ M:	Eddie James <eajames@linux.ibm.com>
 L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
 S:	Maintained
 F:	Documentation/devicetree/bindings/soc/aspeed/xdma.txt
+F:	drivers/soc/aspeed/aspeed-xdma.c
+F:	include/uapi/linux/aspeed-xdma.h
 
 ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
 M:	Corentin Chary <corentin.chary@gmail.com>
diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig
index 323e177..2a6c16f 100644
--- a/drivers/soc/aspeed/Kconfig
+++ b/drivers/soc/aspeed/Kconfig
@@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL
 	  ioctl()s, the driver also provides an interface for userspace mappings to
 	  a pre-defined region.
 
+config ASPEED_XDMA
+	tristate "Aspeed XDMA Engine Driver"
+	depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA
+	help
+	  Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX
+	  SOCs. The XDMA engine can perform automatic PCI DMA operations
+	  between the AST2XXX (acting as a BMC) and a host processor.
+
 endmenu
diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile
index b64be47..977b046 100644
--- a/drivers/soc/aspeed/Makefile
+++ b/drivers/soc/aspeed/Makefile
@@ -2,3 +2,4 @@
 obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
 obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
 obj-$(CONFIG_ASPEED_P2A_CTRL)	+= aspeed-p2a-ctrl.o
+obj-$(CONFIG_ASPEED_XDMA)	+= aspeed-xdma.o
diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
new file mode 100644
index 0000000..a9b3eeb
--- /dev/null
+++ b/drivers/soc/aspeed/aspeed-xdma.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+// Copyright IBM Corp 2019
+
+#include <linux/aspeed-xdma.h>
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/fs.h>
+#include <linux/genalloc.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/jiffies.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/poll.h>
+#include <linux/regmap.h>
+#include <linux/reset.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#define DEVICE_NAME				"aspeed-xdma"
+
+#define SCU_AST2500_PCIE_CONF			0x180
+#define SCU_AST2600_PCIE_CONF			0xc20
+#define  SCU_PCIE_CONF_VGA_EN			 BIT(0)
+#define  SCU_PCIE_CONF_VGA_EN_MMIO		 BIT(1)
+#define  SCU_PCIE_CONF_VGA_EN_LPC		 BIT(2)
+#define  SCU_PCIE_CONF_VGA_EN_MSI		 BIT(3)
+#define  SCU_PCIE_CONF_VGA_EN_MCTP		 BIT(4)
+#define  SCU_PCIE_CONF_VGA_EN_IRQ		 BIT(5)
+#define  SCU_PCIE_CONF_VGA_EN_DMA		 BIT(6)
+#define  SCU_PCIE_CONF_BMC_EN			 BIT(8)
+#define  SCU_PCIE_CONF_BMC_EN_MMIO		 BIT(9)
+#define  SCU_PCIE_CONF_BMC_EN_MSI		 BIT(11)
+#define  SCU_PCIE_CONF_BMC_EN_MCTP		 BIT(12)
+#define  SCU_PCIE_CONF_BMC_EN_IRQ		 BIT(13)
+#define  SCU_PCIE_CONF_BMC_EN_DMA		 BIT(14)
+
+#define SCU_AST2500_BMC_CLASS_REV		0x19c
+#define SCU_AST2600_BMC_CLASS_REV		0xc4c
+#define  SCU_BMC_CLASS_REV_XDMA			 0xff000001
+
+#define SDMC_REMAP				0x008
+#define  SDMC_AST2500_REMAP_PCIE		 BIT(16)
+#define  SDMC_AST2500_REMAP_XDMA		 BIT(17)
+#define  SDMC_AST2600_REMAP_XDMA		 BIT(18)
+
+#define XDMA_CMDQ_SIZE				PAGE_SIZE
+#define XDMA_NUM_CMDS				\
+	(XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd))
+
+/* Aspeed specification requires 10ms after switching the reset line */
+#define XDMA_RESET_TIME_MS			10
+
+#define XDMA_CMD_AST2500_PITCH_SHIFT		3
+#define XDMA_CMD_AST2500_PITCH_BMC		GENMASK_ULL(62, 51)
+#define XDMA_CMD_AST2500_PITCH_HOST		GENMASK_ULL(46, 35)
+#define XDMA_CMD_AST2500_PITCH_UPSTREAM		BIT_ULL(31)
+#define XDMA_CMD_AST2500_PITCH_ADDR		GENMASK_ULL(29, 4)
+#define XDMA_CMD_AST2500_PITCH_ID		BIT_ULL(0)
+#define XDMA_CMD_AST2500_CMD_IRQ_EN		BIT_ULL(31)
+#define XDMA_CMD_AST2500_CMD_LINE_NO		GENMASK_ULL(27, 16)
+#define XDMA_CMD_AST2500_CMD_IRQ_BMC		BIT_ULL(15)
+#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT	4
+#define XDMA_CMD_AST2500_CMD_LINE_SIZE		\
+	GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT)
+#define XDMA_CMD_AST2500_CMD_ID			BIT_ULL(1)
+
+#define XDMA_CMD_AST2600_PITCH_BMC		GENMASK_ULL(62, 48)
+#define XDMA_CMD_AST2600_PITCH_HOST		GENMASK_ULL(46, 32)
+#define XDMA_CMD_AST2600_PITCH_ADDR		GENMASK_ULL(30, 0)
+#define XDMA_CMD_AST2600_CMD_64_EN		BIT_ULL(40)
+#define XDMA_CMD_AST2600_CMD_IRQ_BMC		BIT_ULL(37)
+#define XDMA_CMD_AST2600_CMD_IRQ_HOST		BIT_ULL(36)
+#define XDMA_CMD_AST2600_CMD_UPSTREAM		BIT_ULL(32)
+#define XDMA_CMD_AST2600_CMD_LINE_NO		GENMASK_ULL(27, 16)
+#define XDMA_CMD_AST2600_CMD_LINE_SIZE		GENMASK_ULL(14, 0)
+#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE	GENMASK_ULL(14, 12)
+
+#define XDMA_AST2500_QUEUE_ENTRY_SIZE		4
+#define XDMA_AST2500_HOST_CMDQ_ADDR0		0x00
+#define XDMA_AST2500_HOST_CMDQ_ENDP		0x04
+#define XDMA_AST2500_HOST_CMDQ_WRITEP		0x08
+#define XDMA_AST2500_HOST_CMDQ_READP		0x0c
+#define XDMA_AST2500_BMC_CMDQ_ADDR		0x10
+#define XDMA_AST2500_BMC_CMDQ_ENDP		0x14
+#define XDMA_AST2500_BMC_CMDQ_WRITEP		0x18
+#define XDMA_AST2500_BMC_CMDQ_READP		0x1c
+#define  XDMA_BMC_CMDQ_READP_RESET		 0xee882266
+#define XDMA_AST2500_CTRL			0x20
+#define  XDMA_AST2500_CTRL_US_COMP		 BIT(4)
+#define  XDMA_AST2500_CTRL_DS_COMP		 BIT(5)
+#define  XDMA_AST2500_CTRL_DS_DIRTY		 BIT(6)
+#define  XDMA_AST2500_CTRL_DS_SIZE_256		 BIT(17)
+#define  XDMA_AST2500_CTRL_DS_TIMEOUT		 BIT(28)
+#define  XDMA_AST2500_CTRL_DS_CHECK_ID		 BIT(29)
+#define XDMA_AST2500_STATUS			0x24
+#define  XDMA_AST2500_STATUS_US_COMP		 BIT(4)
+#define  XDMA_AST2500_STATUS_DS_COMP		 BIT(5)
+#define  XDMA_AST2500_STATUS_DS_DIRTY		 BIT(6)
+#define XDMA_AST2500_INPRG_DS_CMD1		0x38
+#define XDMA_AST2500_INPRG_DS_CMD2		0x3c
+#define XDMA_AST2500_INPRG_US_CMD00		0x40
+#define XDMA_AST2500_INPRG_US_CMD01		0x44
+#define XDMA_AST2500_INPRG_US_CMD10		0x48
+#define XDMA_AST2500_INPRG_US_CMD11		0x4c
+#define XDMA_AST2500_INPRG_US_CMD20		0x50
+#define XDMA_AST2500_INPRG_US_CMD21		0x54
+#define XDMA_AST2500_HOST_CMDQ_ADDR1		0x60
+#define XDMA_AST2500_VGA_CMDQ_ADDR0		0x64
+#define XDMA_AST2500_VGA_CMDQ_ENDP		0x68
+#define XDMA_AST2500_VGA_CMDQ_WRITEP		0x6c
+#define XDMA_AST2500_VGA_CMDQ_READP		0x70
+#define XDMA_AST2500_VGA_CMD_STATUS		0x74
+#define XDMA_AST2500_VGA_CMDQ_ADDR1		0x78
+
+#define XDMA_AST2600_QUEUE_ENTRY_SIZE		2
+#define XDMA_AST2600_HOST_CMDQ_ADDR0		0x00
+#define XDMA_AST2600_HOST_CMDQ_ADDR1		0x04
+#define XDMA_AST2600_HOST_CMDQ_ENDP		0x08
+#define XDMA_AST2600_HOST_CMDQ_WRITEP		0x0c
+#define XDMA_AST2600_HOST_CMDQ_READP		0x10
+#define XDMA_AST2600_BMC_CMDQ_ADDR		0x14
+#define XDMA_AST2600_BMC_CMDQ_ENDP		0x18
+#define XDMA_AST2600_BMC_CMDQ_WRITEP		0x1c
+#define XDMA_AST2600_BMC_CMDQ_READP		0x20
+#define XDMA_AST2600_VGA_CMDQ_ADDR0		0x24
+#define XDMA_AST2600_VGA_CMDQ_ADDR1		0x28
+#define XDMA_AST2600_VGA_CMDQ_ENDP		0x2c
+#define XDMA_AST2600_VGA_CMDQ_WRITEP		0x30
+#define XDMA_AST2600_VGA_CMDQ_READP		0x34
+#define XDMA_AST2600_CTRL			0x38
+#define  XDMA_AST2600_CTRL_US_COMP		 BIT(16)
+#define  XDMA_AST2600_CTRL_DS_COMP		 BIT(17)
+#define  XDMA_AST2600_CTRL_DS_DIRTY		 BIT(18)
+#define  XDMA_AST2600_CTRL_DS_SIZE_256		 BIT(20)
+#define XDMA_AST2600_STATUS			0x3c
+#define  XDMA_AST2600_STATUS_US_COMP		 BIT(16)
+#define  XDMA_AST2600_STATUS_DS_COMP		 BIT(17)
+#define  XDMA_AST2600_STATUS_DS_DIRTY		 BIT(18)
+#define XDMA_AST2600_INPRG_DS_CMD00		0x40
+#define XDMA_AST2600_INPRG_DS_CMD01		0x44
+#define XDMA_AST2600_INPRG_DS_CMD10		0x48
+#define XDMA_AST2600_INPRG_DS_CMD11		0x4c
+#define XDMA_AST2600_INPRG_DS_CMD20		0x50
+#define XDMA_AST2600_INPRG_DS_CMD21		0x54
+#define XDMA_AST2600_INPRG_US_CMD00		0x60
+#define XDMA_AST2600_INPRG_US_CMD01		0x64
+#define XDMA_AST2600_INPRG_US_CMD10		0x68
+#define XDMA_AST2600_INPRG_US_CMD11		0x6c
+#define XDMA_AST2600_INPRG_US_CMD20		0x70
+#define XDMA_AST2600_INPRG_US_CMD21		0x74
+
+struct aspeed_xdma_cmd {
+	u64 host_addr;
+	u64 pitch;
+	u64 cmd;
+	u64 reserved;
+};
+
+struct aspeed_xdma_regs {
+	u8 bmc_cmdq_addr;
+	u8 bmc_cmdq_endp;
+	u8 bmc_cmdq_writep;
+	u8 bmc_cmdq_readp;
+	u8 control;
+	u8 status;
+};
+
+struct aspeed_xdma_status_bits {
+	u32 us_comp;
+	u32 ds_comp;
+	u32 ds_dirty;
+};
+
+struct aspeed_xdma;
+
+struct aspeed_xdma_chip {
+	u32 control;
+	u32 scu_bmc_class;
+	u32 scu_pcie_conf;
+	u32 sdmc_remap;
+	unsigned int queue_entry_size;
+	struct aspeed_xdma_regs regs;
+	struct aspeed_xdma_status_bits status_bits;
+	unsigned int (*set_cmd)(struct aspeed_xdma *ctx,
+				struct aspeed_xdma_op *op, u32 bmc_addr);
+};
+
+struct aspeed_xdma_client;
+
+struct aspeed_xdma {
+	const struct aspeed_xdma_chip *chip;
+
+	struct device *dev;
+	void __iomem *base;
+	struct clk *clock;
+	struct reset_control *reset;
+
+	struct aspeed_xdma_client *current_client;
+
+	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
+	struct mutex start_lock;
+	void *cmdq;
+	bool in_reset;
+	bool upstream;
+	unsigned int cmd_idx;
+
+	/* reset_lock protects in_reset and the reset state of the engine */
+	spinlock_t reset_lock;
+
+	wait_queue_head_t wait;
+	struct work_struct reset_work;
+
+	u32 vga_phys;
+	u32 vga_size;
+	void __iomem *vga_virt;
+	dma_addr_t cmdq_vga_phys;
+	void *cmdq_vga_virt;
+	struct gen_pool *vga_pool;
+};
+
+struct aspeed_xdma_client {
+	struct aspeed_xdma *ctx;
+
+	bool error;
+	bool in_progress;
+	void *virt;
+	dma_addr_t phys;
+	u32 size;
+};
+
+static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg)
+{
+	u32 v = readl(ctx->base + reg);
+
+	dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v);
+	return v;
+}
+
+static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val)
+{
+	writel(val, ctx->base + reg);
+	dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, val);
+}
+
+static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx)
+{
+	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_endp,
+			   ctx->chip->queue_entry_size * XDMA_NUM_CMDS);
+	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_readp,
+			   XDMA_BMC_CMDQ_READP_RESET);
+	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep, 0);
+	aspeed_xdma_writel(ctx, ctx->chip->regs.control, ctx->chip->control);
+	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_addr,
+			   ctx->cmdq_vga_phys);
+
+	ctx->cmd_idx = 0;
+	ctx->current_client = NULL;
+}
+
+static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx,
+						struct aspeed_xdma_op *op,
+						u32 bmc_addr)
+{
+	u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC |
+		XDMA_CMD_AST2500_CMD_ID;
+	u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) |
+		XDMA_CMD_AST2500_PITCH_ID;
+	unsigned int line_size;
+	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
+	unsigned int line_no = 1;
+	unsigned int pitch = 1;
+	struct aspeed_xdma_cmd *ncmd =
+		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
+
+	dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n",
+		op->direction ? "upstream" : "downstream", bmc_addr, op->len,
+		(u32)op->host_addr);
+
+	if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) {
+		unsigned int rem;
+		unsigned int total;
+
+		line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE;
+		total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no;
+		rem = (op->len - total) >>
+			XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
+		line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE;
+		pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT;
+		line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
+
+		if (rem) {
+			u32 rbmc = bmc_addr + total;
+			struct aspeed_xdma_cmd *rcmd =
+				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
+
+			rcmd->host_addr = op->host_addr + (u64)total;
+			rcmd->pitch = cmd_pitch |
+				((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) |
+				FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) |
+				FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1);
+			rcmd->cmd = cmd |
+				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) |
+				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE,
+					   rem);
+
+			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
+					     16, 1, rcmd, sizeof(*rcmd), true);
+
+			cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN |
+				 XDMA_CMD_AST2500_CMD_IRQ_BMC);
+
+			nidx = (nidx + 1) % XDMA_NUM_CMDS;
+		}
+	} else {
+		line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
+	}
+
+	ncmd->host_addr = op->host_addr;
+	ncmd->pitch = cmd_pitch |
+		((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) |
+		FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) |
+		FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch);
+	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) |
+		FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size);
+
+	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
+			     sizeof(*ncmd), true);
+
+	return nidx;
+}
+
+static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx,
+						struct aspeed_xdma_op *op,
+						u32 bmc_addr)
+{
+	u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC |
+		(op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0);
+	unsigned int line_size;
+	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
+	unsigned int line_no = 1;
+	unsigned int pitch = 1;
+	struct aspeed_xdma_cmd *ncmd =
+		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
+
+	if ((op->host_addr + op->len) & 0xffffffff00000000ULL)
+		cmd |= XDMA_CMD_AST2600_CMD_64_EN;
+
+	dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] "
+		"host[%016llx]\n", op->direction ? "upstream" : "downstream",
+		bmc_addr, op->len, op->host_addr);
+
+	if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) {
+		unsigned int rem;
+		unsigned int total;
+
+		line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
+		total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no;
+		rem = op->len - total;
+		line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
+		pitch = line_size;
+
+		if (rem) {
+			u32 rbmc = bmc_addr + total;
+			struct aspeed_xdma_cmd *rcmd =
+				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
+
+			rcmd->host_addr = op->host_addr + (u64)total;
+			rcmd->pitch =
+				((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) |
+				FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) |
+				FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1);
+			rcmd->cmd = cmd |
+				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) |
+				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE,
+					   rem);
+
+			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
+					     16, 1, rcmd, sizeof(*rcmd), true);
+
+			cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC;
+
+			nidx = (nidx + 1) % XDMA_NUM_CMDS;
+		}
+	} else {
+		line_size = op->len;
+	}
+
+	ncmd->host_addr = op->host_addr;
+	ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) |
+		FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) |
+		FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch);
+	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) |
+		FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size);
+
+	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
+			     sizeof(*ncmd), true);
+
+	return nidx;
+}
+
+static void aspeed_xdma_start(struct aspeed_xdma *ctx,
+			      struct aspeed_xdma_op *op, u32 bmc_addr,
+			      struct aspeed_xdma_client *client)
+{
+	unsigned int nidx;
+
+	mutex_lock(&ctx->start_lock);
+
+	nidx = ctx->chip->set_cmd(ctx, op, bmc_addr);
+	memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE);
+
+	client->in_progress = true;
+	ctx->current_client = client;
+	ctx->upstream = op->direction ? true : false;
+
+	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep,
+			   nidx * ctx->chip->queue_entry_size);
+
+	ctx->cmd_idx = nidx;
+
+	mutex_unlock(&ctx->start_lock);
+}
+
+static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error)
+{
+	if (ctx->current_client) {
+		ctx->current_client->error = error;
+		ctx->current_client->in_progress = false;
+		ctx->current_client = NULL;
+	}
+
+	wake_up_interruptible_all(&ctx->wait);
+}
+
+static irqreturn_t aspeed_xdma_irq(int irq, void *arg)
+{
+	struct aspeed_xdma *ctx = arg;
+	u32 status = aspeed_xdma_readl(ctx, ctx->chip->regs.status);
+
+	if (status & ctx->chip->status_bits.ds_dirty) {
+		aspeed_xdma_done(ctx, true);
+	} else {
+		if (status & ctx->chip->status_bits.us_comp) {
+			if (ctx->upstream)
+				aspeed_xdma_done(ctx, false);
+		}
+
+		if (status & ctx->chip->status_bits.ds_comp) {
+			if (!ctx->upstream)
+				aspeed_xdma_done(ctx, false);
+		}
+	}
+
+	aspeed_xdma_writel(ctx, ctx->chip->regs.status, status);
+
+	return IRQ_HANDLED;
+}
+
+static void aspeed_xdma_reset(struct aspeed_xdma *ctx)
+{
+	reset_control_assert(ctx->reset);
+	msleep(XDMA_RESET_TIME_MS);
+
+	reset_control_deassert(ctx->reset);
+	msleep(XDMA_RESET_TIME_MS);
+
+	aspeed_xdma_init_eng(ctx);
+
+	ctx->in_reset = false;
+	aspeed_xdma_done(ctx, true);
+}
+
+static void aspeed_xdma_reset_work(struct work_struct *work)
+{
+	struct aspeed_xdma *ctx = container_of(work, struct aspeed_xdma,
+					       reset_work);
+
+	/*
+	 * Lock to make sure operations aren't started while the engine is
+	 * in reset.
+	 */
+	mutex_lock(&ctx->start_lock);
+
+	aspeed_xdma_reset(ctx);
+
+	mutex_unlock(&ctx->start_lock);
+}
+
+static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg)
+{
+	unsigned long flags;
+	struct aspeed_xdma *ctx = arg;
+
+	dev_dbg(ctx->dev, "pcie reset\n");
+
+	spin_lock_irqsave(&ctx->reset_lock, flags);
+	if (ctx->in_reset) {
+		spin_unlock_irqrestore(&ctx->reset_lock, flags);
+		return IRQ_HANDLED;
+	}
+
+	ctx->in_reset = true;
+	spin_unlock_irqrestore(&ctx->reset_lock, flags);
+
+	schedule_work(&ctx->reset_work);
+	return IRQ_HANDLED;
+}
+
+static int aspeed_xdma_probe(struct platform_device *pdev)
+{
+	int irq;
+	int pcie_irq;
+	int rc;
+	u32 vgamem[2];
+	struct regmap *scu;
+	struct regmap *sdmc;
+	struct aspeed_xdma *ctx;
+	struct device *dev = &pdev->dev;
+	const void *md = of_device_get_match_data(dev);
+
+	if (!md)
+		return -ENODEV;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->chip = md;
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+	mutex_init(&ctx->start_lock);
+	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
+	spin_lock_init(&ctx->reset_lock);
+	init_waitqueue_head(&ctx->wait);
+
+	ctx->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(ctx->base)) {
+		dev_err(dev, "Failed to map registers.\n");
+		return PTR_ERR(ctx->base);
+	}
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "Unable to find IRQ.\n");
+		return irq;
+	}
+
+	rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED,
+			      DEVICE_NAME, ctx);
+	if (rc < 0) {
+		dev_err(dev, "Failed to request IRQ %d.\n", irq);
+		return rc;
+	}
+
+	ctx->clock = devm_clk_get(dev, NULL);
+	if (IS_ERR(ctx->clock)) {
+		dev_err(dev, "Failed to request clock.\n");
+		return PTR_ERR(ctx->clock);
+	}
+
+	ctx->reset = devm_reset_control_get_exclusive(dev, NULL);
+	if (IS_ERR(ctx->reset)) {
+		dev_err(dev, "Failed to request reset control.\n");
+		return PTR_ERR(ctx->reset);
+	}
+
+	ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL);
+	if (!ctx->cmdq) {
+		dev_err(dev, "Failed to allocate command queue.\n");
+		return -ENOMEM;
+	}
+
+	ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL);
+	if (!ctx->vga_pool) {
+		dev_err(dev, "Failed to setup genalloc pool.\n");
+		return -ENOMEM;
+	}
+
+	rc = of_property_read_u32_array(dev->of_node, "vga-mem", vgamem, 2);
+	if (rc) {
+		dev_err(dev, "Unable to get VGA memory space.\n");
+		return rc;
+	}
+
+	ctx->vga_phys = vgamem[0];
+	ctx->vga_size = vgamem[1];
+
+	ctx->vga_virt = devm_ioremap(dev, ctx->vga_phys, ctx->vga_size);
+	if (IS_ERR(ctx->vga_virt)) {
+		dev_err(dev, "Failed to map VGA memory space.\n");
+		return PTR_ERR(ctx->vga_virt);
+	}
+
+	rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt,
+			       ctx->vga_phys, ctx->vga_size, -1);
+	if (rc) {
+		dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n");
+		return rc;
+	}
+
+	sdmc = syscon_regmap_lookup_by_phandle(dev->of_node, "sdmc");
+	if (IS_ERR(sdmc)) {
+		dev_err(dev, "Unable to configure memory controller.\n");
+		return PTR_ERR(sdmc);
+	}
+
+	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
+			   ctx->chip->sdmc_remap);
+
+	scu = syscon_regmap_lookup_by_phandle(dev->of_node, "scu");
+	if (!IS_ERR(scu)) {
+		u32 selection;
+		bool pcie_device_bmc = true;
+		const u32 bmc = SCU_PCIE_CONF_BMC_EN |
+			SCU_PCIE_CONF_BMC_EN_MSI | SCU_PCIE_CONF_BMC_EN_IRQ |
+			SCU_PCIE_CONF_BMC_EN_DMA;
+		const u32 vga = SCU_PCIE_CONF_VGA_EN |
+			SCU_PCIE_CONF_VGA_EN_MSI | SCU_PCIE_CONF_VGA_EN_IRQ |
+			SCU_PCIE_CONF_VGA_EN_DMA;
+		const char *pcie = NULL;
+
+		if (!of_property_read_string(dev->of_node, "pcie-device",
+					     &pcie)) {
+			if (!strcmp(pcie, "vga"))
+				pcie_device_bmc = false;
+		}
+
+		if (pcie_device_bmc) {
+			selection = bmc;
+			regmap_write(scu, ctx->chip->scu_bmc_class,
+				     SCU_BMC_CLASS_REV_XDMA);
+		} else {
+			selection = vga;
+		}
+
+		regmap_update_bits(scu, ctx->chip->scu_pcie_conf, bmc | vga,
+				   selection);
+	} else {
+		dev_warn(dev, "Unable to configure PCIe; continuing.\n");
+	}
+
+	rc = clk_prepare_enable(ctx->clock);
+	if (rc) {
+		dev_err(dev, "Failed to enable the clock.\n");
+		return rc;
+	}
+	msleep(XDMA_RESET_TIME_MS);
+
+	rc = reset_control_deassert(ctx->reset);
+	if (rc) {
+		clk_disable_unprepare(ctx->clock);
+
+		dev_err(dev, "Failed to clear the reset.\n");
+		return rc;
+	}
+	msleep(XDMA_RESET_TIME_MS);
+
+	ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE,
+						&ctx->cmdq_vga_phys);
+	if (!ctx->cmdq_vga_virt) {
+		dev_err(ctx->dev, "Failed to genalloc cmdq.\n");
+
+		reset_control_assert(ctx->reset);
+		clk_disable_unprepare(ctx->clock);
+		return -ENOMEM;
+	}
+
+	aspeed_xdma_init_eng(ctx);
+
+	/*
+	 * This interrupt could fire immediately so only request it once the
+	 * engine and driver are initialized.
+	 */
+	pcie_irq = platform_get_irq(pdev, 1);
+	if (pcie_irq < 0) {
+		dev_warn(dev, "Unable to find PCI-E IRQ.\n");
+	} else {
+		rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq,
+				      IRQF_SHARED, DEVICE_NAME, ctx);
+		if (rc < 0)
+			dev_warn(dev, "Failed to request PCI-E IRQ %d.\n", rc);
+	}
+
+	return 0;
+}
+
+static int aspeed_xdma_remove(struct platform_device *pdev)
+{
+	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
+
+	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
+		      XDMA_CMDQ_SIZE);
+
+	reset_control_assert(ctx->reset);
+	clk_disable_unprepare(ctx->clock);
+
+	return 0;
+}
+
+static const struct aspeed_xdma_chip aspeed_ast2500_xdma_chip = {
+	.control = XDMA_AST2500_CTRL_US_COMP | XDMA_AST2500_CTRL_DS_COMP |
+		XDMA_AST2500_CTRL_DS_DIRTY | XDMA_AST2500_CTRL_DS_SIZE_256 |
+		XDMA_AST2500_CTRL_DS_TIMEOUT | XDMA_AST2500_CTRL_DS_CHECK_ID,
+	.scu_bmc_class = SCU_AST2500_BMC_CLASS_REV,
+	.scu_pcie_conf = SCU_AST2500_PCIE_CONF,
+	.sdmc_remap = SDMC_AST2500_REMAP_PCIE | SDMC_AST2500_REMAP_XDMA,
+	.queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE,
+	.regs = {
+		.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR,
+		.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP,
+		.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP,
+		.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP,
+		.control = XDMA_AST2500_CTRL,
+		.status = XDMA_AST2500_STATUS,
+	},
+	.status_bits = {
+		.us_comp = XDMA_AST2500_STATUS_US_COMP,
+		.ds_comp = XDMA_AST2500_STATUS_DS_COMP,
+		.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY,
+	},
+	.set_cmd = aspeed_xdma_ast2500_set_cmd,
+};
+
+static const struct aspeed_xdma_chip aspeed_ast2600_xdma_chip = {
+	.control = XDMA_AST2600_CTRL_US_COMP | XDMA_AST2600_CTRL_DS_COMP |
+		XDMA_AST2600_CTRL_DS_DIRTY | XDMA_AST2600_CTRL_DS_SIZE_256,
+	.scu_bmc_class = SCU_AST2600_BMC_CLASS_REV,
+	.scu_pcie_conf = SCU_AST2600_PCIE_CONF,
+	.sdmc_remap = SDMC_AST2600_REMAP_XDMA,
+	.queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE,
+	.regs = {
+		.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR,
+		.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP,
+		.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP,
+		.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP,
+		.control = XDMA_AST2600_CTRL,
+		.status = XDMA_AST2600_STATUS,
+	},
+	.status_bits = {
+		.us_comp = XDMA_AST2600_STATUS_US_COMP,
+		.ds_comp = XDMA_AST2600_STATUS_DS_COMP,
+		.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY,
+	},
+	.set_cmd = aspeed_xdma_ast2600_set_cmd,
+};
+
+static const struct of_device_id aspeed_xdma_match[] = {
+	{
+		.compatible = "aspeed,ast2500-xdma",
+		.data = &aspeed_ast2500_xdma_chip,
+	},
+	{
+		.compatible = "aspeed,ast2600-xdma",
+		.data = &aspeed_ast2600_xdma_chip,
+	},
+	{ },
+};
+
+static struct platform_driver aspeed_xdma_driver = {
+	.probe = aspeed_xdma_probe,
+	.remove = aspeed_xdma_remove,
+	.driver = {
+		.name = DEVICE_NAME,
+		.of_match_table = aspeed_xdma_match,
+	},
+};
+
+module_platform_driver(aspeed_xdma_driver);
+
+MODULE_AUTHOR("Eddie James");
+MODULE_DESCRIPTION("Aspeed XDMA Engine Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h
new file mode 100644
index 0000000..f39f38e
--- /dev/null
+++ b/include/uapi/linux/aspeed-xdma.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/* Copyright IBM Corp 2019 */
+
+#ifndef _UAPI_LINUX_ASPEED_XDMA_H_
+#define _UAPI_LINUX_ASPEED_XDMA_H_
+
+#include <linux/types.h>
+
+/*
+ * aspeed_xdma_direction
+ *
+ * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC
+ *
+ * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host
+ *
+ * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine
+ */
+enum aspeed_xdma_direction {
+	ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0,
+	ASPEED_XDMA_DIRECTION_UPSTREAM,
+	ASPEED_XDMA_DIRECTION_RESET,
+};
+
+/*
+ * aspeed_xdma_op
+ *
+ * host_addr: the DMA address on the host side, typically configured by PCI
+ *            subsystem
+ *
+ * len: the size of the transfer in bytes
+ *
+ * direction: an enumerator indicating the direction of the DMA operation; see
+ *            enum aspeed_xdma_direction
+ */
+struct aspeed_xdma_op {
+	__u64 host_addr;
+	__u32 len;
+	__u32 direction;
+};
+
+#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 07/12] drivers/soc: xdma: Add user interface
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (5 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  3:48   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

This commits adds a miscdevice to provide a user interface to the XDMA
engine. The interface provides the write operation to start DMA
operations. The DMA parameters are passed as the data to the write call.
The actual data to transfer is NOT passed through write. Note that both
directions of DMA operation are accomplished through the write command;
BMC to host and host to BMC.

The XDMA engine is restricted to only accessing the reserved memory
space on the AST2500, typically used by the VGA. For this reason, the
VGA memory space is pooled and allocated with genalloc. Users calling
mmap allocate pages from this pool for their usage. The space allocated
by a client will be the space used in the DMA operation. For an
"upstream" (BMC to host) operation, the data in the client's area will
be transferred to the host. For a "downstream" (host to BMC) operation,
the host data will be placed in the client's memory area.

Poll is also provided in order to determine when the DMA operation is
complete for non-blocking IO.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
Changes since v1:
 - Add file_lock comment
 - Bring user reset up to date with new reset method

 drivers/soc/aspeed/aspeed-xdma.c | 224 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 224 insertions(+)

diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
index a9b3eeb..d4b96a7 100644
--- a/drivers/soc/aspeed/aspeed-xdma.c
+++ b/drivers/soc/aspeed/aspeed-xdma.c
@@ -13,6 +13,7 @@
 #include <linux/io.h>
 #include <linux/jiffies.h>
 #include <linux/mfd/syscon.h>
+#include <linux/miscdevice.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/of_device.h>
@@ -206,6 +207,8 @@ struct aspeed_xdma {
 	struct clk *clock;
 	struct reset_control *reset;
 
+	/* file_lock serializes reads of current_client */
+	struct mutex file_lock;
 	struct aspeed_xdma_client *current_client;
 
 	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
@@ -227,6 +230,8 @@ struct aspeed_xdma {
 	dma_addr_t cmdq_vga_phys;
 	void *cmdq_vga_virt;
 	struct gen_pool *vga_pool;
+
+	struct miscdevice misc;
 };
 
 struct aspeed_xdma_client {
@@ -517,6 +522,207 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg)
 	return IRQ_HANDLED;
 }
 
+static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf,
+				 size_t len, loff_t *offset)
+{
+	int rc;
+	struct aspeed_xdma_op op;
+	struct aspeed_xdma_client *client = file->private_data;
+	struct aspeed_xdma *ctx = client->ctx;
+	u32 offs = client->phys ? (client->phys - ctx->vga_phys) :
+		XDMA_CMDQ_SIZE;
+
+	if (len != sizeof(op))
+		return -EINVAL;
+
+	rc = copy_from_user(&op, buf, len);
+	if (rc)
+		return rc;
+
+	if (op.direction == ASPEED_XDMA_DIRECTION_RESET) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&ctx->reset_lock, flags);
+		if (ctx->in_reset) {
+			spin_unlock_irqrestore(&ctx->reset_lock, flags);
+			return len;
+		}
+
+		ctx->in_reset = true;
+		spin_unlock_irqrestore(&ctx->reset_lock, flags);
+
+		mutex_lock(&ctx->start_lock);
+
+		aspeed_xdma_reset(ctx);
+
+		mutex_unlock(&ctx->start_lock);
+
+		return len;
+	} else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) {
+		return -EINVAL;
+	}
+
+	if (op.len > ctx->vga_size - offs)
+		return -EINVAL;
+
+	if (file->f_flags & O_NONBLOCK) {
+		if (!mutex_trylock(&ctx->file_lock))
+			return -EAGAIN;
+
+		if (ctx->current_client) {
+			mutex_unlock(&ctx->file_lock);
+			return -EAGAIN;
+		}
+	} else {
+		mutex_lock(&ctx->file_lock);
+
+		rc = wait_event_interruptible(ctx->wait, !ctx->current_client);
+		if (rc) {
+			mutex_unlock(&ctx->file_lock);
+			return -EINTR;
+		}
+	}
+
+	aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client);
+
+	mutex_unlock(&ctx->file_lock);
+
+	if (!(file->f_flags & O_NONBLOCK)) {
+		rc = wait_event_interruptible(ctx->wait, !client->in_progress);
+		if (rc)
+			return -EINTR;
+
+		if (client->error)
+			return -EIO;
+	}
+
+	return len;
+}
+
+static __poll_t aspeed_xdma_poll(struct file *file,
+				 struct poll_table_struct *wait)
+{
+	__poll_t mask = 0;
+	__poll_t req = poll_requested_events(wait);
+	struct aspeed_xdma_client *client = file->private_data;
+	struct aspeed_xdma *ctx = client->ctx;
+
+	if (req & (EPOLLIN | EPOLLRDNORM)) {
+		if (client->in_progress)
+			poll_wait(file, &ctx->wait, wait);
+
+		if (!client->in_progress) {
+			if (client->error)
+				mask |= EPOLLERR;
+			else
+				mask |= EPOLLIN | EPOLLRDNORM;
+		}
+	}
+
+	if (req & (EPOLLOUT | EPOLLWRNORM)) {
+		if (ctx->current_client)
+			poll_wait(file, &ctx->wait, wait);
+
+		if (!ctx->current_client)
+			mask |= EPOLLOUT | EPOLLWRNORM;
+	}
+
+	return mask;
+}
+
+static void aspeed_xdma_vma_close(struct vm_area_struct *vma)
+{
+	struct aspeed_xdma_client *client = vma->vm_private_data;
+
+	gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt,
+		      client->size);
+
+	client->virt = NULL;
+	client->phys = 0;
+	client->size = 0;
+}
+
+static const struct vm_operations_struct aspeed_xdma_vm_ops = {
+	.close =	aspeed_xdma_vma_close,
+};
+
+static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	int rc;
+	struct aspeed_xdma_client *client = file->private_data;
+	struct aspeed_xdma *ctx = client->ctx;
+
+	/* restrict file to one mapping */
+	if (client->size)
+		return -ENOMEM;
+
+	client->size = vma->vm_end - vma->vm_start;
+	client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size,
+					  &client->phys);
+	if (!client->virt) {
+		client->phys = 0;
+		client->size = 0;
+		return -ENOMEM;
+	}
+
+	vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT;
+	vma->vm_ops = &aspeed_xdma_vm_ops;
+	vma->vm_private_data = client;
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+	rc = io_remap_pfn_range(vma, vma->vm_start, client->phys >> PAGE_SHIFT,
+				client->size, vma->vm_page_prot);
+	if (rc) {
+		gen_pool_free(ctx->vga_pool, (unsigned long)client->virt,
+			      client->size);
+
+		client->virt = NULL;
+		client->phys = 0;
+		client->size = 0;
+		return rc;
+	}
+
+	dev_dbg(ctx->dev, "mmap: v[%08lx] to p[%08x], s[%08x]\n",
+		vma->vm_start, (u32)client->phys, client->size);
+
+	return 0;
+}
+
+static int aspeed_xdma_open(struct inode *inode, struct file *file)
+{
+	struct miscdevice *misc = file->private_data;
+	struct aspeed_xdma *ctx = container_of(misc, struct aspeed_xdma, misc);
+	struct aspeed_xdma_client *client = kzalloc(sizeof(*client),
+						    GFP_KERNEL);
+
+	if (!client)
+		return -ENOMEM;
+
+	client->ctx = ctx;
+	file->private_data = client;
+	return 0;
+}
+
+static int aspeed_xdma_release(struct inode *inode, struct file *file)
+{
+	struct aspeed_xdma_client *client = file->private_data;
+
+	if (client->ctx->current_client == client)
+		client->ctx->current_client = NULL;
+
+	kfree(client);
+	return 0;
+}
+
+static const struct file_operations aspeed_xdma_fops = {
+	.owner			= THIS_MODULE,
+	.write			= aspeed_xdma_write,
+	.poll			= aspeed_xdma_poll,
+	.mmap			= aspeed_xdma_mmap,
+	.open			= aspeed_xdma_open,
+	.release		= aspeed_xdma_release,
+};
+
 static int aspeed_xdma_probe(struct platform_device *pdev)
 {
 	int irq;
@@ -539,6 +745,7 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
 	ctx->chip = md;
 	ctx->dev = dev;
 	platform_set_drvdata(pdev, ctx);
+	mutex_init(&ctx->file_lock);
 	mutex_init(&ctx->start_lock);
 	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
 	spin_lock_init(&ctx->reset_lock);
@@ -678,6 +885,22 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
 
 	aspeed_xdma_init_eng(ctx);
 
+	ctx->misc.minor = MISC_DYNAMIC_MINOR;
+	ctx->misc.fops = &aspeed_xdma_fops;
+	ctx->misc.name = "aspeed-xdma";
+	ctx->misc.parent = dev;
+	rc = misc_register(&ctx->misc);
+	if (rc) {
+		dev_err(dev, "Failed to register xdma miscdevice.\n");
+
+		gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
+			      XDMA_CMDQ_SIZE);
+
+		reset_control_assert(ctx->reset);
+		clk_disable_unprepare(ctx->clock);
+		return rc;
+	}
+
 	/*
 	 * This interrupt could fire immediately so only request it once the
 	 * engine and driver are initialized.
@@ -699,6 +922,7 @@ static int aspeed_xdma_remove(struct platform_device *pdev)
 {
 	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
 
+	misc_deregister(&ctx->misc);
 	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
 		      XDMA_CMDQ_SIZE);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (6 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 07/12] drivers/soc: xdma: Add user interface Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-05 17:15 ` [PATCH v2 09/12] ARM: dts: aspeed: ast2600: " Eddie James
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Add a node for the XDMA engine with all the necessary information. Also
make the EDAC node compatible with syscon.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
Changes since v1:
 - Add syscon compatible to the SDRAM controller so we can grab it in the
   XDMA driver
 - Add the new properties to the XDMA node

 arch/arm/boot/dts/aspeed-g5.dtsi | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index df44022..6a83cd2 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 #include <dt-bindings/clock/aspeed-clock.h>
+#include <dt-bindings/interrupt-controller/aspeed-scu-ic.h>
 
 / {
 	model = "Aspeed BMC";
@@ -48,7 +49,7 @@
 	};
 
 	edac: sdram@1e6e0000 {
-		compatible = "aspeed,ast2500-sdram-edac";
+		compatible = "aspeed,ast2500-sdram-edac", "syscon";
 		reg = <0x1e6e0000 0x174>;
 		interrupts = <0>;
 		status = "disabled";
@@ -258,6 +259,18 @@
 				interrupts = <0x19>;
 			};
 
+			xdma: xdma@1e6e7000 {
+				compatible = "aspeed,ast2500-xdma";
+				reg = <0x1e6e7000 0x100>;
+				clocks = <&syscon ASPEED_CLK_GATE_BCLK>;
+				resets = <&syscon ASPEED_RESET_XDMA>;
+				interrupts-extended = <&vic 6>, <&scu_ic ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI>;
+				pcie-device = "bmc";
+				scu = <&syscon>;
+				sdmc = <&edac>;
+				status = "disabled";
+			};
+
 			adc: adc@1e6e9000 {
 				compatible = "aspeed,ast2500-adc";
 				reg = <0x1e6e9000 0xb0>;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 09/12] ARM: dts: aspeed: ast2600: Add XDMA Engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (7 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-11  3:48   ` Andrew Jeffery
  2019-12-05 17:15 ` [PATCH v2 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Add a node for the XDMA engine with all the necessary information. Also
add a simple syscon node for the SDRAM memory controller.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
Changes since v1:
 - Add a syscon SDRAM controller
 - Add various properties to XDMA node

 arch/arm/boot/dts/aspeed-g6.dtsi | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
index ead336e..514d685 100644
--- a/arch/arm/boot/dts/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed-g6.dtsi
@@ -3,6 +3,7 @@
 
 #include <dt-bindings/interrupt-controller/arm-gic.h>
 #include <dt-bindings/clock/ast2600-clock.h>
+#include <dt-bindings/interrupt-controller/aspeed-scu-ic.h>
 
 / {
 	model = "Aspeed BMC";
@@ -265,6 +266,11 @@
 			status = "disabled";
 		};
 
+		sdmc: sdram@1e6e0000 {
+			compatible = "syscon";
+			reg = <0x1e6e0000 0xb8>;
+		};
+
 		apb {
 			compatible = "simple-bus";
 			#address-cells = <1>;
@@ -311,6 +317,19 @@
 				quality = <100>;
 			};
 
+			xdma: xdma@1e6e7000 {
+				compatible = "aspeed,ast2600-xdma";
+				reg = <0x1e6e7000 0x100>;
+				clocks = <&syscon ASPEED_CLK_GATE_BCLK>;
+				resets = <&syscon ASPEED_RESET_DEV_XDMA>;
+				interrupts-extended = <&gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+						      <&scu_ic0 ASPEED_AST2600_SCU_IC0_PCIE_PERST_LO_TO_HI>;
+				pcie-device = "bmc";
+				scu = <&syscon>;
+				sdmc = <&sdmc>;
+				status = "disabled";
+			};
+
 			gpio0: gpio@1e780000 {
 				#gpio-cells = <2>;
 				gpio-controller;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 10/12] ARM: dts: aspeed: witherspoon: Enable XDMA Engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (8 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 09/12] ARM: dts: aspeed: ast2600: " Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-05 17:15 ` [PATCH v2 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James
  2019-12-05 17:15 ` [PATCH v2 12/12] ARM: dts: aspeed: tacoma: " Eddie James
  11 siblings, 0 replies; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Enable the XDMA engine node.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
index 569dad9..2676a28 100644
--- a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
@@ -658,4 +658,9 @@
 	memory-region = <&video_engine_memory>;
 };
 
+&xdma {
+	status = "okay";
+	vga-mem = <0x9f000000 0x01000000>;
+};
+
 #include "ibm-power9-dual.dtsi"
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (9 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James
@ 2019-12-05 17:15 ` Eddie James
  2019-12-05 17:15 ` [PATCH v2 12/12] ARM: dts: aspeed: tacoma: " Eddie James
  11 siblings, 0 replies; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Enable the XDMA engine node.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts b/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts
index c1c9cd3..7c1a155 100644
--- a/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts
@@ -970,3 +970,8 @@
 		spi-max-frequency = <100000000>;
 	};
 };
+
+&xdma {
+	status = "okay";
+	vga-mem = <0xbf800000 0x00800000>;
+};
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 12/12] ARM: dts: aspeed: tacoma: Enable XDMA engine
  2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
                   ` (10 preceding siblings ...)
  2019-12-05 17:15 ` [PATCH v2 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James
@ 2019-12-05 17:15 ` Eddie James
  11 siblings, 0 replies; 27+ messages in thread
From: Eddie James @ 2019-12-05 17:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: devicetree, jason, linux-aspeed, maz, robh+dt, tglx,
	mark.rutland, joel, andrew

Enable the XDMA engine node.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts b/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts
index f02de4a..cf54708 100644
--- a/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts
@@ -1193,3 +1193,8 @@
 	pinctrl-0 = <&pinctrl_lpc_default>,
 		    <&pinctrl_lsirq_default>;
 };
+
+&xdma {
+	status = "okay";
+	vga-mem = <0xbf800000 0x00800000>;
+};
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller
  2019-12-05 17:15 ` [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James
@ 2019-12-10 23:32   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-10 23:32 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> Document the Aspeed SCU interrupt controller and add an include file
> for the interrupts it provides.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> ---
> Changes since v1:
>  - Remove 'reg' required property.

Hmm, I have a series that rearranges the SCU bindings to fix up some
issues we have with dtc warnings. I'm happy for this to go in now as it's
consistent with what we have as my patches are not yet merged,  but
we should circle back later.

Acked-by: Andrew Jeffery <andrew@aj.id.au>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 02/12] irqchip: Add Aspeed SCU interrupt controller
  2019-12-05 17:15 ` [PATCH v2 02/12] irqchip: " Eddie James
@ 2019-12-11  0:32   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  0:32 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> The Aspeed SOCs provide some interrupts through the System Control
> Unit registers. Add an interrupt controller that provides these
> interrupts to the system.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>

Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add SCU interrupt controller
  2019-12-05 17:15 ` [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James
@ 2019-12-11  0:36   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  0:36 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> Add a node for the interrupt controller provided by the SCU.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> ---
>  arch/arm/boot/dts/aspeed-g5.dtsi | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
> index a259c63..df44022 100644
> --- a/arch/arm/boot/dts/aspeed-g5.dtsi
> +++ b/arch/arm/boot/dts/aspeed-g5.dtsi
> @@ -216,8 +216,9 @@
>  			syscon: syscon@1e6e2000 {
>  				compatible = "aspeed,ast2500-scu", "syscon", "simple-mfd";
>  				reg = <0x1e6e2000 0x1a8>;
> +				ranges = <0 0x1e6e2000 0x1a8>;
>  				#address-cells = <1>;
> -				#size-cells = <0>;
> +				#size-cells = <1>;

You're no-longer adding a reg property to the interrupt controller node below so the
hunk above is unnecessary.

Andrew

>  				#clock-cells = <1>;
>  				#reset-cells = <1>;
>  
> @@ -231,6 +232,13 @@
>  					compatible = "aspeed,ast2500-p2a-ctrl";
>  					status = "disabled";
>  				};
> +
> +				scu_ic: interrupt-controller@18 {
> +					#interrupt-cells = <1>;
> +					compatible = "aspeed,ast2500-scu-ic";
> +					interrupts = <21>;
> +					interrupt-controller;
> +				};
>  			};
>  
>  			rng: hwrng@1e6e2078 {
> -- 
> 1.8.3.1
> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine
  2019-12-05 17:15 ` [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James
@ 2019-12-11  0:40   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  0:40 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> Document the bindings for the Aspeed AST25XX and AST26XX XDMA engine.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> ---
> Changes since v1:
>  - Add 'clocks', 'scu', 'sdmc', 'vga-mem', and 'pcie-device' property
>    documentation
> 
>  .../devicetree/bindings/soc/aspeed/xdma.txt        | 43 ++++++++++++++++++++++
>  MAINTAINERS                                        |  6 +++
>  2 files changed, 49 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt
> 
> diff --git a/Documentation/devicetree/bindings/soc/aspeed/xdma.txt 
> b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt
> new file mode 100644
> index 0000000..942dc07
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt
> @@ -0,0 +1,43 @@
> +Aspeed AST25XX and AST26XX XDMA Engine
> +
> +The XDMA Engine embedded in the AST2500 and AST2600 SOCs can perform 
> automatic
> +DMA operations over PCI between the SOC (acting as a BMC) and a host 
> processor.
> +
> +Required properties:
> + - compatible		: must be "aspeed,ast2500-xdma" or
> +			  "aspeed,ast2600-xdma"
> + - reg			: contains the address and size of the memory region
> +			  associated with the XDMA engine registers
> + - clocks		: clock specifier for the clock associated with the
> +			  XDMA engine
> + - resets		: reset specifier for the syscon reset associated with
> +			  the XDMA engine
> + - interrupts-extended	: two interrupt nodes; the first specifies the 

s/nodes/cells/?

> global
> +			  interrupt for the XDMA engine and the second
> +			  specifies the PCI-E reset or PERST interrupt.
> + - scu			: a phandle to the syscon node for the system control
> +			  unit of the SOC
> + - sdmc			: a phandle to the syscon node for the SDRAM memory
> +			  controller of the SOC

This is a driver and is unrelated to the XDMA hardware. I think we can avoid
touching the SDMC in the XDMA driver anyway (it should be up to the
platform configuration to make sure the remapping is set correctly, not a
behaviour of the XDMA driver).

> + - vga-mem		: contains the address and size of the VGA memory space
> +			  to be used by the XDMA engine

Shouldn't we generalise this to just a 'memory' property? It doesn't have to
be the VGA memory, just we tend to use the VGA memory to uphold security
properties of our platforms.

> +
> +Optional properties:
> + - pcie-device		: should be either "bmc" or "vga", corresponding to
> +			  which device should be used by the XDMA engine for
> +			  DMA operations. If this property is not set, the XDMA
> +			  engine will use the BMC PCI-E device.
> +
> +Example:
> +
> +    xdma@1e6e7000 {
> +        compatible = "aspeed,ast2500-xdma";
> +        reg = <0x1e6e7000 0x100>;
> +        clocks = <&syscon ASPEED_CLK_GATE_BCLK>;
> +        resets = <&syscon ASPEED_RESET_XDMA>;
> +        interrupts-extended = <&vic 6>, <&scu_ic 
> ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI>;
> +        scu = <&syscon>;
> +        sdmc = <&edac>;
> +        pcie-device = "bmc";
> +        vga-mem = <0x9f000000 0x01000000>;
> +    };
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 398861a..528a142 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2707,6 +2707,12 @@ S:	Maintained
>  F:	drivers/media/platform/aspeed-video.c
>  F:	Documentation/devicetree/bindings/media/aspeed-video.txt
>  
> +ASPEED XDMA ENGINE DRIVER
> +M:	Eddie James <eajames@linux.ibm.com>
> +L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	Documentation/devicetree/bindings/soc/aspeed/xdma.txt
> +
>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>  M:	Corentin Chary <corentin.chary@gmail.com>
>  L:	acpi4asus-user@lists.sourceforge.net
> -- 
> 1.8.3.1
> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers
  2019-12-05 17:15 ` [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James
@ 2019-12-11  0:42   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  0:42 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> Add nodes for the interrupt controllers provided by the SCU.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>

Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

You should split out the DT changes and send them as their own
series to Joel.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-05 17:15 ` [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James
@ 2019-12-11  3:47   ` Andrew Jeffery
  2019-12-11 20:39     ` Eddie James
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  3:47 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI
> DMA operations between the SOC (acting as a BMC) and a host processor
> in a server.
> 
> This commit adds a driver to control the XDMA engine and adds functions
> to initialize the hardware and memory and start DMA operations.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> ---
> Changes since v1:
>  - Add lock comments
>  - Don't split reset into start/finish
>  - Drop client_lock as the new reset method means client data is safe
>  - Switch to a chip structure to control versions rather than enum/switch
>  - Drop in_progress bool in favor or current_client being NULL or not
>  - Get SDRAM controller from dts phandle
>  - Configure PCI-E based on dts property
>  - Get VGA memory space from dts property
>  - Drop bmc_addr from aspeed_xdma_op as mmap can do all that
> 
>  MAINTAINERS                      |   2 +
>  drivers/soc/aspeed/Kconfig       |   8 +
>  drivers/soc/aspeed/Makefile      |   1 +
>  drivers/soc/aspeed/aspeed-xdma.c | 783 +++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/aspeed-xdma.h |  41 ++
>  5 files changed, 835 insertions(+)
>  create mode 100644 drivers/soc/aspeed/aspeed-xdma.c
>  create mode 100644 include/uapi/linux/aspeed-xdma.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 528a142..617c03d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2712,6 +2712,8 @@ M:	Eddie James <eajames@linux.ibm.com>
>  L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
>  S:	Maintained
>  F:	Documentation/devicetree/bindings/soc/aspeed/xdma.txt
> +F:	drivers/soc/aspeed/aspeed-xdma.c
> +F:	include/uapi/linux/aspeed-xdma.h
>  
>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>  M:	Corentin Chary <corentin.chary@gmail.com>
> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig
> index 323e177..2a6c16f 100644
> --- a/drivers/soc/aspeed/Kconfig
> +++ b/drivers/soc/aspeed/Kconfig
> @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL
>  	  ioctl()s, the driver also provides an interface for userspace mappings to
>  	  a pre-defined region.
>  
> +config ASPEED_XDMA
> +	tristate "Aspeed XDMA Engine Driver"
> +	depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA
> +	help
> +	  Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX
> +	  SOCs. The XDMA engine can perform automatic PCI DMA operations
> +	  between the AST2XXX (acting as a BMC) and a host processor.
> +
>  endmenu
> diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile
> index b64be47..977b046 100644
> --- a/drivers/soc/aspeed/Makefile
> +++ b/drivers/soc/aspeed/Makefile
> @@ -2,3 +2,4 @@
>  obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
>  obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
>  obj-$(CONFIG_ASPEED_P2A_CTRL)	+= aspeed-p2a-ctrl.o
> +obj-$(CONFIG_ASPEED_XDMA)	+= aspeed-xdma.o
> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
> new file mode 100644
> index 0000000..a9b3eeb
> --- /dev/null
> +++ b/drivers/soc/aspeed/aspeed-xdma.c
> @@ -0,0 +1,783 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +// Copyright IBM Corp 2019
> +
> +#include <linux/aspeed-xdma.h>
> +#include <linux/bitfield.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/fs.h>
> +#include <linux/genalloc.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/jiffies.h>
> +#include <linux/mfd/syscon.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/poll.h>
> +#include <linux/regmap.h>
> +#include <linux/reset.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/string.h>
> +#include <linux/uaccess.h>
> +#include <linux/wait.h>
> +#include <linux/workqueue.h>
> +
> +#define DEVICE_NAME				"aspeed-xdma"
> +
> +#define SCU_AST2500_PCIE_CONF			0x180
> +#define SCU_AST2600_PCIE_CONF			0xc20
> +#define  SCU_PCIE_CONF_VGA_EN			 BIT(0)
> +#define  SCU_PCIE_CONF_VGA_EN_MMIO		 BIT(1)
> +#define  SCU_PCIE_CONF_VGA_EN_LPC		 BIT(2)
> +#define  SCU_PCIE_CONF_VGA_EN_MSI		 BIT(3)
> +#define  SCU_PCIE_CONF_VGA_EN_MCTP		 BIT(4)
> +#define  SCU_PCIE_CONF_VGA_EN_IRQ		 BIT(5)
> +#define  SCU_PCIE_CONF_VGA_EN_DMA		 BIT(6)
> +#define  SCU_PCIE_CONF_BMC_EN			 BIT(8)
> +#define  SCU_PCIE_CONF_BMC_EN_MMIO		 BIT(9)
> +#define  SCU_PCIE_CONF_BMC_EN_MSI		 BIT(11)
> +#define  SCU_PCIE_CONF_BMC_EN_MCTP		 BIT(12)
> +#define  SCU_PCIE_CONF_BMC_EN_IRQ		 BIT(13)
> +#define  SCU_PCIE_CONF_BMC_EN_DMA		 BIT(14)
> +
> +#define SCU_AST2500_BMC_CLASS_REV		0x19c
> +#define SCU_AST2600_BMC_CLASS_REV		0xc4c
> +#define  SCU_BMC_CLASS_REV_XDMA			 0xff000001
> +
> +#define SDMC_REMAP				0x008
> +#define  SDMC_AST2500_REMAP_PCIE		 BIT(16)
> +#define  SDMC_AST2500_REMAP_XDMA		 BIT(17)
> +#define  SDMC_AST2600_REMAP_XDMA		 BIT(18)
> +
> +#define XDMA_CMDQ_SIZE				PAGE_SIZE
> +#define XDMA_NUM_CMDS				\
> +	(XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd))
> +
> +/* Aspeed specification requires 10ms after switching the reset line */
> +#define XDMA_RESET_TIME_MS			10
> +
> +#define XDMA_CMD_AST2500_PITCH_SHIFT		3
> +#define XDMA_CMD_AST2500_PITCH_BMC		GENMASK_ULL(62, 51)
> +#define XDMA_CMD_AST2500_PITCH_HOST		GENMASK_ULL(46, 35)
> +#define XDMA_CMD_AST2500_PITCH_UPSTREAM		BIT_ULL(31)
> +#define XDMA_CMD_AST2500_PITCH_ADDR		GENMASK_ULL(29, 4)
> +#define XDMA_CMD_AST2500_PITCH_ID		BIT_ULL(0)
> +#define XDMA_CMD_AST2500_CMD_IRQ_EN		BIT_ULL(31)
> +#define XDMA_CMD_AST2500_CMD_LINE_NO		GENMASK_ULL(27, 16)
> +#define XDMA_CMD_AST2500_CMD_IRQ_BMC		BIT_ULL(15)
> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT	4
> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE		\
> +	GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT)
> +#define XDMA_CMD_AST2500_CMD_ID			BIT_ULL(1)
> +
> +#define XDMA_CMD_AST2600_PITCH_BMC		GENMASK_ULL(62, 48)
> +#define XDMA_CMD_AST2600_PITCH_HOST		GENMASK_ULL(46, 32)
> +#define XDMA_CMD_AST2600_PITCH_ADDR		GENMASK_ULL(30, 0)
> +#define XDMA_CMD_AST2600_CMD_64_EN		BIT_ULL(40)
> +#define XDMA_CMD_AST2600_CMD_IRQ_BMC		BIT_ULL(37)
> +#define XDMA_CMD_AST2600_CMD_IRQ_HOST		BIT_ULL(36)
> +#define XDMA_CMD_AST2600_CMD_UPSTREAM		BIT_ULL(32)
> +#define XDMA_CMD_AST2600_CMD_LINE_NO		GENMASK_ULL(27, 16)
> +#define XDMA_CMD_AST2600_CMD_LINE_SIZE		GENMASK_ULL(14, 0)
> +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE	GENMASK_ULL(14, 12)
> +
> +#define XDMA_AST2500_QUEUE_ENTRY_SIZE		4
> +#define XDMA_AST2500_HOST_CMDQ_ADDR0		0x00
> +#define XDMA_AST2500_HOST_CMDQ_ENDP		0x04
> +#define XDMA_AST2500_HOST_CMDQ_WRITEP		0x08
> +#define XDMA_AST2500_HOST_CMDQ_READP		0x0c
> +#define XDMA_AST2500_BMC_CMDQ_ADDR		0x10
> +#define XDMA_AST2500_BMC_CMDQ_ENDP		0x14
> +#define XDMA_AST2500_BMC_CMDQ_WRITEP		0x18
> +#define XDMA_AST2500_BMC_CMDQ_READP		0x1c
> +#define  XDMA_BMC_CMDQ_READP_RESET		 0xee882266
> +#define XDMA_AST2500_CTRL			0x20
> +#define  XDMA_AST2500_CTRL_US_COMP		 BIT(4)
> +#define  XDMA_AST2500_CTRL_DS_COMP		 BIT(5)
> +#define  XDMA_AST2500_CTRL_DS_DIRTY		 BIT(6)
> +#define  XDMA_AST2500_CTRL_DS_SIZE_256		 BIT(17)
> +#define  XDMA_AST2500_CTRL_DS_TIMEOUT		 BIT(28)
> +#define  XDMA_AST2500_CTRL_DS_CHECK_ID		 BIT(29)
> +#define XDMA_AST2500_STATUS			0x24
> +#define  XDMA_AST2500_STATUS_US_COMP		 BIT(4)
> +#define  XDMA_AST2500_STATUS_DS_COMP		 BIT(5)
> +#define  XDMA_AST2500_STATUS_DS_DIRTY		 BIT(6)
> +#define XDMA_AST2500_INPRG_DS_CMD1		0x38
> +#define XDMA_AST2500_INPRG_DS_CMD2		0x3c
> +#define XDMA_AST2500_INPRG_US_CMD00		0x40
> +#define XDMA_AST2500_INPRG_US_CMD01		0x44
> +#define XDMA_AST2500_INPRG_US_CMD10		0x48
> +#define XDMA_AST2500_INPRG_US_CMD11		0x4c
> +#define XDMA_AST2500_INPRG_US_CMD20		0x50
> +#define XDMA_AST2500_INPRG_US_CMD21		0x54
> +#define XDMA_AST2500_HOST_CMDQ_ADDR1		0x60
> +#define XDMA_AST2500_VGA_CMDQ_ADDR0		0x64
> +#define XDMA_AST2500_VGA_CMDQ_ENDP		0x68
> +#define XDMA_AST2500_VGA_CMDQ_WRITEP		0x6c
> +#define XDMA_AST2500_VGA_CMDQ_READP		0x70
> +#define XDMA_AST2500_VGA_CMD_STATUS		0x74
> +#define XDMA_AST2500_VGA_CMDQ_ADDR1		0x78
> +
> +#define XDMA_AST2600_QUEUE_ENTRY_SIZE		2
> +#define XDMA_AST2600_HOST_CMDQ_ADDR0		0x00
> +#define XDMA_AST2600_HOST_CMDQ_ADDR1		0x04
> +#define XDMA_AST2600_HOST_CMDQ_ENDP		0x08
> +#define XDMA_AST2600_HOST_CMDQ_WRITEP		0x0c
> +#define XDMA_AST2600_HOST_CMDQ_READP		0x10
> +#define XDMA_AST2600_BMC_CMDQ_ADDR		0x14
> +#define XDMA_AST2600_BMC_CMDQ_ENDP		0x18
> +#define XDMA_AST2600_BMC_CMDQ_WRITEP		0x1c
> +#define XDMA_AST2600_BMC_CMDQ_READP		0x20
> +#define XDMA_AST2600_VGA_CMDQ_ADDR0		0x24
> +#define XDMA_AST2600_VGA_CMDQ_ADDR1		0x28
> +#define XDMA_AST2600_VGA_CMDQ_ENDP		0x2c
> +#define XDMA_AST2600_VGA_CMDQ_WRITEP		0x30
> +#define XDMA_AST2600_VGA_CMDQ_READP		0x34
> +#define XDMA_AST2600_CTRL			0x38
> +#define  XDMA_AST2600_CTRL_US_COMP		 BIT(16)
> +#define  XDMA_AST2600_CTRL_DS_COMP		 BIT(17)
> +#define  XDMA_AST2600_CTRL_DS_DIRTY		 BIT(18)
> +#define  XDMA_AST2600_CTRL_DS_SIZE_256		 BIT(20)
> +#define XDMA_AST2600_STATUS			0x3c
> +#define  XDMA_AST2600_STATUS_US_COMP		 BIT(16)
> +#define  XDMA_AST2600_STATUS_DS_COMP		 BIT(17)
> +#define  XDMA_AST2600_STATUS_DS_DIRTY		 BIT(18)
> +#define XDMA_AST2600_INPRG_DS_CMD00		0x40
> +#define XDMA_AST2600_INPRG_DS_CMD01		0x44
> +#define XDMA_AST2600_INPRG_DS_CMD10		0x48
> +#define XDMA_AST2600_INPRG_DS_CMD11		0x4c
> +#define XDMA_AST2600_INPRG_DS_CMD20		0x50
> +#define XDMA_AST2600_INPRG_DS_CMD21		0x54
> +#define XDMA_AST2600_INPRG_US_CMD00		0x60
> +#define XDMA_AST2600_INPRG_US_CMD01		0x64
> +#define XDMA_AST2600_INPRG_US_CMD10		0x68
> +#define XDMA_AST2600_INPRG_US_CMD11		0x6c
> +#define XDMA_AST2600_INPRG_US_CMD20		0x70
> +#define XDMA_AST2600_INPRG_US_CMD21		0x74
> +
> +struct aspeed_xdma_cmd {
> +	u64 host_addr;
> +	u64 pitch;
> +	u64 cmd;
> +	u64 reserved;
> +};
> +
> +struct aspeed_xdma_regs {
> +	u8 bmc_cmdq_addr;
> +	u8 bmc_cmdq_endp;
> +	u8 bmc_cmdq_writep;
> +	u8 bmc_cmdq_readp;
> +	u8 control;
> +	u8 status;
> +};
> +
> +struct aspeed_xdma_status_bits {
> +	u32 us_comp;
> +	u32 ds_comp;
> +	u32 ds_dirty;
> +};
> +
> +struct aspeed_xdma;
> +
> +struct aspeed_xdma_chip {
> +	u32 control;
> +	u32 scu_bmc_class;
> +	u32 scu_pcie_conf;
> +	u32 sdmc_remap;
> +	unsigned int queue_entry_size;
> +	struct aspeed_xdma_regs regs;
> +	struct aspeed_xdma_status_bits status_bits;
> +	unsigned int (*set_cmd)(struct aspeed_xdma *ctx,
> +				struct aspeed_xdma_op *op, u32 bmc_addr);
> +};
> +
> +struct aspeed_xdma_client;
> +
> +struct aspeed_xdma {
> +	const struct aspeed_xdma_chip *chip;
> +
> +	struct device *dev;
> +	void __iomem *base;
> +	struct clk *clock;
> +	struct reset_control *reset;
> +
> +	struct aspeed_xdma_client *current_client;
> +
> +	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
> +	struct mutex start_lock;
> +	void *cmdq;

Can this not be typed as `struct aspeed_xdma_cmd *cmdq`?

> +	bool in_reset;

Bit of a nit, but can we move in_reset under reset_lock below?

> +	bool upstream;
> +	unsigned int cmd_idx;
> +
> +	/* reset_lock protects in_reset and the reset state of the engine */
> +	spinlock_t reset_lock;
> +
> +	wait_queue_head_t wait;
> +	struct work_struct reset_work;
> +
> +	u32 vga_phys;
> +	u32 vga_size;
> +	void __iomem *vga_virt;
> +	dma_addr_t cmdq_vga_phys;
> +	void *cmdq_vga_virt;
> +	struct gen_pool *vga_pool;

This shouldn't have anything to do with VGA specifically, we could
theoretically use any other piece of reserved memory (in the right
security context).

> +};
> +
> +struct aspeed_xdma_client {
> +	struct aspeed_xdma *ctx;
> +
> +	bool error;
> +	bool in_progress;
> +	void *virt;
> +	dma_addr_t phys;
> +	u32 size;
> +};
> +
> +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg)
> +{
> +	u32 v = readl(ctx->base + reg);
> +
> +	dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v);
> +	return v;
> +}
> +
> +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val)
> +{
> +	writel(val, ctx->base + reg);
> +	dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, val);
> +}
> +
> +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx)
> +{
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_endp,
> +			   ctx->chip->queue_entry_size * XDMA_NUM_CMDS);
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_readp,
> +			   XDMA_BMC_CMDQ_READP_RESET);
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep, 0);
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.control, ctx->chip->control);
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_addr,
> +			   ctx->cmdq_vga_phys);
> +
> +	ctx->cmd_idx = 0;
> +	ctx->current_client = NULL;
> +}
> +
> +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx,
> +						struct aspeed_xdma_op *op,
> +						u32 bmc_addr)
> +{
> +	u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC |
> +		XDMA_CMD_AST2500_CMD_ID;
> +	u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) |
> +		XDMA_CMD_AST2500_PITCH_ID;
> +	unsigned int line_size;
> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
> +	unsigned int line_no = 1;
> +	unsigned int pitch = 1;
> +	struct aspeed_xdma_cmd *ncmd =
> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);

Changing ctx->cmdq away from a `void *` would help out here.

> +
> +	dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n",
> +		op->direction ? "upstream" : "downstream", bmc_addr, op->len,
> +		(u32)op->host_addr);
> +
> +	if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) {
> +		unsigned int rem;
> +		unsigned int total;
> +
> +		line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE;
> +		total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no;
> +		rem = (op->len - total) >>
> +			XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
> +		line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE;
> +		pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT;
> +		line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
> +
> +		if (rem) {
> +			u32 rbmc = bmc_addr + total;
> +			struct aspeed_xdma_cmd *rcmd =
> +				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
> +
> +			rcmd->host_addr = op->host_addr + (u64)total;
> +			rcmd->pitch = cmd_pitch |
> +				((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) |
> +				FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) |
> +				FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1);
> +			rcmd->cmd = cmd |
> +				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) |
> +				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE,
> +					   rem);
> +
> +			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
> +					     16, 1, rcmd, sizeof(*rcmd), true);
> +
> +			cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN |
> +				 XDMA_CMD_AST2500_CMD_IRQ_BMC);
> +
> +			nidx = (nidx + 1) % XDMA_NUM_CMDS;
> +		}
> +	} else {
> +		line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
> +	}
> +
> +	ncmd->host_addr = op->host_addr;
> +	ncmd->pitch = cmd_pitch |
> +		((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) |
> +		FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) |
> +		FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch);
> +	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) |
> +		FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size);
> +
> +	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
> +			     sizeof(*ncmd), true);
> +
> +	return nidx;
> +}
> +
> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx,
> +						struct aspeed_xdma_op *op,
> +						u32 bmc_addr)
> +{
> +	u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC |
> +		(op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0);
> +	unsigned int line_size;
> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
> +	unsigned int line_no = 1;
> +	unsigned int pitch = 1;
> +	struct aspeed_xdma_cmd *ncmd =
> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
> +
> +	if ((op->host_addr + op->len) & 0xffffffff00000000ULL)

Do we know that this won't wrap?

> +		cmd |= XDMA_CMD_AST2600_CMD_64_EN;
> +
> +	dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] "
> +		"host[%016llx]\n", op->direction ? "upstream" : "downstream",
> +		bmc_addr, op->len, op->host_addr);
> +
> +	if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) {
> +		unsigned int rem;
> +		unsigned int total;
> +
> +		line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
> +		total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no;
> +		rem = op->len - total;
> +		line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
> +		pitch = line_size;
> +
> +		if (rem) {
> +			u32 rbmc = bmc_addr + total;
> +			struct aspeed_xdma_cmd *rcmd =
> +				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
> +
> +			rcmd->host_addr = op->host_addr + (u64)total;
> +			rcmd->pitch =
> +				((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) |
> +				FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) |
> +				FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1);
> +			rcmd->cmd = cmd |
> +				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) |
> +				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE,
> +					   rem);
> +
> +			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
> +					     16, 1, rcmd, sizeof(*rcmd), true);
> +
> +			cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC;
> +
> +			nidx = (nidx + 1) % XDMA_NUM_CMDS;
> +		}
> +	} else {
> +		line_size = op->len;
> +	}
> +
> +	ncmd->host_addr = op->host_addr;
> +	ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) |
> +		FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) |
> +		FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch);
> +	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) |
> +		FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size);
> +
> +	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
> +			     sizeof(*ncmd), true);
> +
> +	return nidx;
> +}
> +
> +static void aspeed_xdma_start(struct aspeed_xdma *ctx,
> +			      struct aspeed_xdma_op *op, u32 bmc_addr,
> +			      struct aspeed_xdma_client *client)
> +{
> +	unsigned int nidx;
> +
> +	mutex_lock(&ctx->start_lock);
> +
> +	nidx = ctx->chip->set_cmd(ctx, op, bmc_addr);
> +	memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE);
> +
> +	client->in_progress = true;
> +	ctx->current_client = client;
> +	ctx->upstream = op->direction ? true : false;

Could get away without the branch, just assign directly or use the !! idiom
(you're already turning op->direction into a boolean by using it as the
condition).

> +
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep,
> +			   nidx * ctx->chip->queue_entry_size);
> +
> +	ctx->cmd_idx = nidx;
> +
> +	mutex_unlock(&ctx->start_lock);
> +}
> +
> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error)
> +{
> +	if (ctx->current_client) {
> +		ctx->current_client->error = error;
> +		ctx->current_client->in_progress = false;
> +		ctx->current_client = NULL;

You need to take start_lock before writing these members to ensure the
writes are not reordered across acquisition of start_lock in
aspeed_xdma_start() above, unless there's some other guarantee of that?

I see that aspeed_xdma_done() is called from aspeed_xdma_irq() below,
so maybe we need a different locking strategy or punt aspeed_xdma_done()
to a workqueue, as start_lock is a mutex.

> +	}
> +
> +	wake_up_interruptible_all(&ctx->wait);
> +}
> +
> +static irqreturn_t aspeed_xdma_irq(int irq, void *arg)
> +{
> +	struct aspeed_xdma *ctx = arg;
> +	u32 status = aspeed_xdma_readl(ctx, ctx->chip->regs.status);
> +
> +	if (status & ctx->chip->status_bits.ds_dirty) {
> +		aspeed_xdma_done(ctx, true);
> +	} else {
> +		if (status & ctx->chip->status_bits.us_comp) {
> +			if (ctx->upstream)
> +				aspeed_xdma_done(ctx, false);
> +		}
> +
> +		if (status & ctx->chip->status_bits.ds_comp) {
> +			if (!ctx->upstream)
> +				aspeed_xdma_done(ctx, false);
> +		}
> +	}
> +
> +	aspeed_xdma_writel(ctx, ctx->chip->regs.status, status);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static void aspeed_xdma_reset(struct aspeed_xdma *ctx)
> +{
> +	reset_control_assert(ctx->reset);
> +	msleep(XDMA_RESET_TIME_MS);
> +
> +	reset_control_deassert(ctx->reset);
> +	msleep(XDMA_RESET_TIME_MS);
> +
> +	aspeed_xdma_init_eng(ctx);
> +
> +	ctx->in_reset = false;
> +	aspeed_xdma_done(ctx, true);
> +}
> +
> +static void aspeed_xdma_reset_work(struct work_struct *work)
> +{
> +	struct aspeed_xdma *ctx = container_of(work, struct aspeed_xdma,
> +					       reset_work);
> +
> +	/*
> +	 * Lock to make sure operations aren't started while the engine is
> +	 * in reset.
> +	 */
> +	mutex_lock(&ctx->start_lock);
> +
> +	aspeed_xdma_reset(ctx);
> +
> +	mutex_unlock(&ctx->start_lock);
> +}
> +
> +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg)
> +{
> +	unsigned long flags;
> +	struct aspeed_xdma *ctx = arg;
> +
> +	dev_dbg(ctx->dev, "pcie reset\n");
> +
> +	spin_lock_irqsave(&ctx->reset_lock, flags);
> +	if (ctx->in_reset) {
> +		spin_unlock_irqrestore(&ctx->reset_lock, flags);
> +		return IRQ_HANDLED;
> +	}
> +
> +	ctx->in_reset = true;
> +	spin_unlock_irqrestore(&ctx->reset_lock, flags);
> +
> +	schedule_work(&ctx->reset_work);
> +	return IRQ_HANDLED;
> +}
> +
> +static int aspeed_xdma_probe(struct platform_device *pdev)
> +{
> +	int irq;
> +	int pcie_irq;
> +	int rc;
> +	u32 vgamem[2];
> +	struct regmap *scu;
> +	struct regmap *sdmc;
> +	struct aspeed_xdma *ctx;
> +	struct device *dev = &pdev->dev;
> +	const void *md = of_device_get_match_data(dev);

So close to reverse christmas tree :)

> +
> +	if (!md)
> +		return -ENODEV;
> +
> +	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return -ENOMEM;
> +
> +	ctx->chip = md;
> +	ctx->dev = dev;
> +	platform_set_drvdata(pdev, ctx);
> +	mutex_init(&ctx->start_lock);
> +	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
> +	spin_lock_init(&ctx->reset_lock);
> +	init_waitqueue_head(&ctx->wait);
> +
> +	ctx->base = devm_platform_ioremap_resource(pdev, 0);
> +	if (IS_ERR(ctx->base)) {
> +		dev_err(dev, "Failed to map registers.\n");
> +		return PTR_ERR(ctx->base);
> +	}
> +
> +	irq = platform_get_irq(pdev, 0);
> +	if (irq < 0) {
> +		dev_err(dev, "Unable to find IRQ.\n");
> +		return irq;
> +	}
> +
> +	rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED,
> +			      DEVICE_NAME, ctx);
> +	if (rc < 0) {
> +		dev_err(dev, "Failed to request IRQ %d.\n", irq);
> +		return rc;
> +	}
> +
> +	ctx->clock = devm_clk_get(dev, NULL);
> +	if (IS_ERR(ctx->clock)) {
> +		dev_err(dev, "Failed to request clock.\n");
> +		return PTR_ERR(ctx->clock);
> +	}
> +
> +	ctx->reset = devm_reset_control_get_exclusive(dev, NULL);
> +	if (IS_ERR(ctx->reset)) {
> +		dev_err(dev, "Failed to request reset control.\n");
> +		return PTR_ERR(ctx->reset);
> +	}
> +
> +	ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL);
> +	if (!ctx->cmdq) {
> +		dev_err(dev, "Failed to allocate command queue.\n");
> +		return -ENOMEM;
> +	}
> +
> +	ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL);
> +	if (!ctx->vga_pool) {
> +		dev_err(dev, "Failed to setup genalloc pool.\n");
> +		return -ENOMEM;
> +	}
> +
> +	rc = of_property_read_u32_array(dev->of_node, "vga-mem", vgamem, 2);

As mentioned, this could be any reserved memory range. Also can't we get it as
a resource rather than parsing a u32 array? Not sure if there's an advantage
but it feels like a better representation.

> +	if (rc) {
> +		dev_err(dev, "Unable to get VGA memory space.\n");
> +		return rc;
> +	}
> +
> +	ctx->vga_phys = vgamem[0];
> +	ctx->vga_size = vgamem[1];
> +
> +	ctx->vga_virt = devm_ioremap(dev, ctx->vga_phys, ctx->vga_size);
> +	if (IS_ERR(ctx->vga_virt)) {
> +		dev_err(dev, "Failed to map VGA memory space.\n");
> +		return PTR_ERR(ctx->vga_virt);
> +	}
> +
> +	rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt,
> +			       ctx->vga_phys, ctx->vga_size, -1);
> +	if (rc) {
> +		dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n");
> +		return rc;
> +	}
> +
> +	sdmc = syscon_regmap_lookup_by_phandle(dev->of_node, "sdmc");
> +	if (IS_ERR(sdmc)) {
> +		dev_err(dev, "Unable to configure memory controller.\n");
> +		return PTR_ERR(sdmc);
> +	}
> +
> +	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
> +			   ctx->chip->sdmc_remap);

I disagree with doing this. As mentioned on the bindings it should be up to
the platform integrator to ensure that this is configured appropriately.

> +
> +	scu = syscon_regmap_lookup_by_phandle(dev->of_node, "scu");
> +	if (!IS_ERR(scu)) {
> +		u32 selection;
> +		bool pcie_device_bmc = true;
> +		const u32 bmc = SCU_PCIE_CONF_BMC_EN |
> +			SCU_PCIE_CONF_BMC_EN_MSI | SCU_PCIE_CONF_BMC_EN_IRQ |
> +			SCU_PCIE_CONF_BMC_EN_DMA;
> +		const u32 vga = SCU_PCIE_CONF_VGA_EN |
> +			SCU_PCIE_CONF_VGA_EN_MSI | SCU_PCIE_CONF_VGA_EN_IRQ |
> +			SCU_PCIE_CONF_VGA_EN_DMA;
> +		const char *pcie = NULL;
> +
> +		if (!of_property_read_string(dev->of_node, "pcie-device",
> +					     &pcie)) {
> +			if (!strcmp(pcie, "vga"))
> +				pcie_device_bmc = false;

You need to return an error here if the string is not one of "vga" or "bmc". This
just assumes that anything that isn't "vga" is "bmc".

> +		}
> +
> +		if (pcie_device_bmc) {
> +			selection = bmc;
> +			regmap_write(scu, ctx->chip->scu_bmc_class,
> +				     SCU_BMC_CLASS_REV_XDMA);
> +		} else {
> +			selection = vga;
> +		}
> +
> +		regmap_update_bits(scu, ctx->chip->scu_pcie_conf, bmc | vga,
> +				   selection);
> +	} else {
> +		dev_warn(dev, "Unable to configure PCIe; continuing.\n");
> +	}
> +
> +	rc = clk_prepare_enable(ctx->clock);
> +	if (rc) {
> +		dev_err(dev, "Failed to enable the clock.\n");
> +		return rc;
> +	}
> +	msleep(XDMA_RESET_TIME_MS);
> +
> +	rc = reset_control_deassert(ctx->reset);
> +	if (rc) {
> +		clk_disable_unprepare(ctx->clock);
> +
> +		dev_err(dev, "Failed to clear the reset.\n");
> +		return rc;
> +	}
> +	msleep(XDMA_RESET_TIME_MS);
> +
> +	ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE,
> +						&ctx->cmdq_vga_phys);
> +	if (!ctx->cmdq_vga_virt) {
> +		dev_err(ctx->dev, "Failed to genalloc cmdq.\n");
> +
> +		reset_control_assert(ctx->reset);
> +		clk_disable_unprepare(ctx->clock);
> +		return -ENOMEM;
> +	}
> +
> +	aspeed_xdma_init_eng(ctx);
> +
> +	/*
> +	 * This interrupt could fire immediately so only request it once the
> +	 * engine and driver are initialized.
> +	 */
> +	pcie_irq = platform_get_irq(pdev, 1);
> +	if (pcie_irq < 0) {
> +		dev_warn(dev, "Unable to find PCI-E IRQ.\n");
> +	} else {
> +		rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq,
> +				      IRQF_SHARED, DEVICE_NAME, ctx);
> +		if (rc < 0)
> +			dev_warn(dev, "Failed to request PCI-E IRQ %d.\n", rc);
> +	}
> +
> +	return 0;
> +}
> +
> +static int aspeed_xdma_remove(struct platform_device *pdev)
> +{
> +	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
> +
> +	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
> +		      XDMA_CMDQ_SIZE);
> +
> +	reset_control_assert(ctx->reset);
> +	clk_disable_unprepare(ctx->clock);
> +
> +	return 0;
> +}
> +
> +static const struct aspeed_xdma_chip aspeed_ast2500_xdma_chip = {
> +	.control = XDMA_AST2500_CTRL_US_COMP | XDMA_AST2500_CTRL_DS_COMP |
> +		XDMA_AST2500_CTRL_DS_DIRTY | XDMA_AST2500_CTRL_DS_SIZE_256 |
> +		XDMA_AST2500_CTRL_DS_TIMEOUT | XDMA_AST2500_CTRL_DS_CHECK_ID,
> +	.scu_bmc_class = SCU_AST2500_BMC_CLASS_REV,
> +	.scu_pcie_conf = SCU_AST2500_PCIE_CONF,
> +	.sdmc_remap = SDMC_AST2500_REMAP_PCIE | SDMC_AST2500_REMAP_XDMA,
> +	.queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE,
> +	.regs = {
> +		.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR,
> +		.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP,
> +		.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP,
> +		.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP,
> +		.control = XDMA_AST2500_CTRL,
> +		.status = XDMA_AST2500_STATUS,
> +	},
> +	.status_bits = {
> +		.us_comp = XDMA_AST2500_STATUS_US_COMP,
> +		.ds_comp = XDMA_AST2500_STATUS_DS_COMP,
> +		.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY,
> +	},
> +	.set_cmd = aspeed_xdma_ast2500_set_cmd,
> +};
> +
> +static const struct aspeed_xdma_chip aspeed_ast2600_xdma_chip = {
> +	.control = XDMA_AST2600_CTRL_US_COMP | XDMA_AST2600_CTRL_DS_COMP |
> +		XDMA_AST2600_CTRL_DS_DIRTY | XDMA_AST2600_CTRL_DS_SIZE_256,
> +	.scu_bmc_class = SCU_AST2600_BMC_CLASS_REV,
> +	.scu_pcie_conf = SCU_AST2600_PCIE_CONF,
> +	.sdmc_remap = SDMC_AST2600_REMAP_XDMA,
> +	.queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE,
> +	.regs = {
> +		.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR,
> +		.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP,
> +		.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP,
> +		.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP,
> +		.control = XDMA_AST2600_CTRL,
> +		.status = XDMA_AST2600_STATUS,
> +	},
> +	.status_bits = {
> +		.us_comp = XDMA_AST2600_STATUS_US_COMP,
> +		.ds_comp = XDMA_AST2600_STATUS_DS_COMP,
> +		.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY,
> +	},
> +	.set_cmd = aspeed_xdma_ast2600_set_cmd,
> +};
> +
> +static const struct of_device_id aspeed_xdma_match[] = {
> +	{
> +		.compatible = "aspeed,ast2500-xdma",
> +		.data = &aspeed_ast2500_xdma_chip,
> +	},
> +	{
> +		.compatible = "aspeed,ast2600-xdma",
> +		.data = &aspeed_ast2600_xdma_chip,
> +	},
> +	{ },
> +};

Thanks, I think this representation is a lot cleaner.

> +
> +static struct platform_driver aspeed_xdma_driver = {
> +	.probe = aspeed_xdma_probe,
> +	.remove = aspeed_xdma_remove,
> +	.driver = {
> +		.name = DEVICE_NAME,
> +		.of_match_table = aspeed_xdma_match,
> +	},
> +};
> +
> +module_platform_driver(aspeed_xdma_driver);
> +
> +MODULE_AUTHOR("Eddie James");
> +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h
> new file mode 100644
> index 0000000..f39f38e
> --- /dev/null
> +++ b/include/uapi/linux/aspeed-xdma.h
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/* Copyright IBM Corp 2019 */
> +
> +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_
> +#define _UAPI_LINUX_ASPEED_XDMA_H_
> +
> +#include <linux/types.h>
> +
> +/*
> + * aspeed_xdma_direction
> + *
> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC
> + *
> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host
> + *
> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine
> + */
> +enum aspeed_xdma_direction {
> +	ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0,
> +	ASPEED_XDMA_DIRECTION_UPSTREAM,
> +	ASPEED_XDMA_DIRECTION_RESET,

I still think having a reset action as part of the direction is a bit funky. Can you maybe
put that in a separate patch so we can debate it later?

> +};
> +
> +/*
> + * aspeed_xdma_op
> + *
> + * host_addr: the DMA address on the host side, typically configured by PCI
> + *            subsystem
> + *
> + * len: the size of the transfer in bytes
> + *
> + * direction: an enumerator indicating the direction of the DMA operation; see
> + *            enum aspeed_xdma_direction
> + */
> +struct aspeed_xdma_op {
> +	__u64 host_addr;
> +	__u32 len;
> +	__u32 direction;
> +};
> +
> +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */
> -- 
> 1.8.3.1
> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 07/12] drivers/soc: xdma: Add user interface
  2019-12-05 17:15 ` [PATCH v2 07/12] drivers/soc: xdma: Add user interface Eddie James
@ 2019-12-11  3:48   ` Andrew Jeffery
  2019-12-11 20:43     ` Eddie James
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  3:48 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> This commits adds a miscdevice to provide a user interface to the XDMA
> engine. The interface provides the write operation to start DMA
> operations. The DMA parameters are passed as the data to the write call.
> The actual data to transfer is NOT passed through write. Note that both
> directions of DMA operation are accomplished through the write command;
> BMC to host and host to BMC.
> 
> The XDMA engine is restricted to only accessing the reserved memory
> space on the AST2500, typically used by the VGA. For this reason, the
> VGA memory space is pooled and allocated with genalloc. Users calling
> mmap allocate pages from this pool for their usage. The space allocated
> by a client will be the space used in the DMA operation. For an
> "upstream" (BMC to host) operation, the data in the client's area will
> be transferred to the host. For a "downstream" (host to BMC) operation,
> the host data will be placed in the client's memory area.

Given the comments on earlier patches we should reconsider descriptions
of the VGA area in this paragraph.

> 
> Poll is also provided in order to determine when the DMA operation is
> complete for non-blocking IO.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> ---
> Changes since v1:
>  - Add file_lock comment
>  - Bring user reset up to date with new reset method
> 
>  drivers/soc/aspeed/aspeed-xdma.c | 224 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 224 insertions(+)
> 
> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
> index a9b3eeb..d4b96a7 100644
> --- a/drivers/soc/aspeed/aspeed-xdma.c
> +++ b/drivers/soc/aspeed/aspeed-xdma.c
> @@ -13,6 +13,7 @@
>  #include <linux/io.h>
>  #include <linux/jiffies.h>
>  #include <linux/mfd/syscon.h>
> +#include <linux/miscdevice.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
>  #include <linux/of_device.h>
> @@ -206,6 +207,8 @@ struct aspeed_xdma {
>  	struct clk *clock;
>  	struct reset_control *reset;
>  
> +	/* file_lock serializes reads of current_client */
> +	struct mutex file_lock;
>  	struct aspeed_xdma_client *current_client;
>  
>  	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
> @@ -227,6 +230,8 @@ struct aspeed_xdma {
>  	dma_addr_t cmdq_vga_phys;
>  	void *cmdq_vga_virt;
>  	struct gen_pool *vga_pool;
> +
> +	struct miscdevice misc;
>  };
>  
>  struct aspeed_xdma_client {
> @@ -517,6 +522,207 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, 
> void *arg)
>  	return IRQ_HANDLED;
>  }
>  
> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf,
> +				 size_t len, loff_t *offset)
> +{
> +	int rc;
> +	struct aspeed_xdma_op op;
> +	struct aspeed_xdma_client *client = file->private_data;
> +	struct aspeed_xdma *ctx = client->ctx;
> +	u32 offs = client->phys ? (client->phys - ctx->vga_phys) :
> +		XDMA_CMDQ_SIZE;
> +
> +	if (len != sizeof(op))
> +		return -EINVAL;
> +
> +	rc = copy_from_user(&op, buf, len);
> +	if (rc)
> +		return rc;
> +
> +	if (op.direction == ASPEED_XDMA_DIRECTION_RESET) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&ctx->reset_lock, flags);
> +		if (ctx->in_reset) {
> +			spin_unlock_irqrestore(&ctx->reset_lock, flags);
> +			return len;
> +		}
> +
> +		ctx->in_reset = true;
> +		spin_unlock_irqrestore(&ctx->reset_lock, flags);
> +
> +		mutex_lock(&ctx->start_lock);
> +
> +		aspeed_xdma_reset(ctx);
> +
> +		mutex_unlock(&ctx->start_lock);
> +
> +		return len;
> +	} else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) {
> +		return -EINVAL;
> +	}
> +
> +	if (op.len > ctx->vga_size - offs)
> +		return -EINVAL;

I'm wondering if we can rearrange the code to move the sanity checks to the
top of the function, so this and the `op.direction >
ASPEED_XDMA_DIRECTION_RESET` case.

The check above should fail for the reset case as well, I expect op.len should
be set to zero in that case. But I still think that jamming the reset command
into a "direction" concept feels broken, so as mentioned on an earlier patch
I'd prefer we move that distraction out to a separate patch.

> +
> +	if (file->f_flags & O_NONBLOCK) {
> +		if (!mutex_trylock(&ctx->file_lock))
> +			return -EAGAIN;
> +
> +		if (ctx->current_client) {
> +			mutex_unlock(&ctx->file_lock);
> +			return -EAGAIN;

I think EBUSY is better here.

> +		}
> +	} else {
> +		mutex_lock(&ctx->file_lock);
> +
> +		rc = wait_event_interruptible(ctx->wait, !ctx->current_client);
> +		if (rc) {
> +			mutex_unlock(&ctx->file_lock);
> +			return -EINTR;
> +		}
> +	}
> +
> +	aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client);
> +
> +	mutex_unlock(&ctx->file_lock);

You've used file_lock here to protect aspeed_xdma_start() but start_lock
above to protect aspeed_xdma_reset(), so it seems one client can disrupt
another by resetting the engine while a DMA is in progress?

> +
> +	if (!(file->f_flags & O_NONBLOCK)) {
> +		rc = wait_event_interruptible(ctx->wait, !client->in_progress);
> +		if (rc)
> +			return -EINTR;
> +
> +		if (client->error)
> +			return -EIO;
> +	}
> +
> +	return len;
> +}
> +
> +static __poll_t aspeed_xdma_poll(struct file *file,
> +				 struct poll_table_struct *wait)
> +{
> +	__poll_t mask = 0;
> +	__poll_t req = poll_requested_events(wait);
> +	struct aspeed_xdma_client *client = file->private_data;
> +	struct aspeed_xdma *ctx = client->ctx;
> +
> +	if (req & (EPOLLIN | EPOLLRDNORM)) {
> +		if (client->in_progress)
> +			poll_wait(file, &ctx->wait, wait);
> +
> +		if (!client->in_progress) {
> +			if (client->error)
> +				mask |= EPOLLERR;
> +			else
> +				mask |= EPOLLIN | EPOLLRDNORM;
> +		}
> +	}
> +
> +	if (req & (EPOLLOUT | EPOLLWRNORM)) {
> +		if (ctx->current_client)
> +			poll_wait(file, &ctx->wait, wait);
> +
> +		if (!ctx->current_client)
> +			mask |= EPOLLOUT | EPOLLWRNORM;
> +	}
> +
> +	return mask;
> +}
> +
> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma)
> +{
> +	struct aspeed_xdma_client *client = vma->vm_private_data;
> +
> +	gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt,
> +		      client->size);

What assurance do we have that a DMA isn't in progress? With non-blocking
IO we could easily start one then close the file descriptor, which would cause
havoc if the physical range is reused by a subsequent mapping.

> +
> +	client->virt = NULL;
> +	client->phys = 0;
> +	client->size = 0;
> +}
> +
> +static const struct vm_operations_struct aspeed_xdma_vm_ops = {
> +	.close =	aspeed_xdma_vma_close,
> +};
> +
> +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	int rc;
> +	struct aspeed_xdma_client *client = file->private_data;
> +	struct aspeed_xdma *ctx = client->ctx;
> +
> +	/* restrict file to one mapping */
> +	if (client->size)
> +		return -ENOMEM;
> +
> +	client->size = vma->vm_end - vma->vm_start;
> +	client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size,
> +					  &client->phys);
> +	if (!client->virt) {
> +		client->phys = 0;
> +		client->size = 0;
> +		return -ENOMEM;
> +	}
> +
> +	vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT;
> +	vma->vm_ops = &aspeed_xdma_vm_ops;
> +	vma->vm_private_data = client;
> +	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +
> +	rc = io_remap_pfn_range(vma, vma->vm_start, client->phys >> PAGE_SHIFT,
> +				client->size, vma->vm_page_prot);
> +	if (rc) {
> +		gen_pool_free(ctx->vga_pool, (unsigned long)client->virt,
> +			      client->size);
> +
> +		client->virt = NULL;
> +		client->phys = 0;
> +		client->size = 0;
> +		return rc;
> +	}
> +
> +	dev_dbg(ctx->dev, "mmap: v[%08lx] to p[%08x], s[%08x]\n",
> +		vma->vm_start, (u32)client->phys, client->size);
> +
> +	return 0;
> +}
> +
> +static int aspeed_xdma_open(struct inode *inode, struct file *file)
> +{
> +	struct miscdevice *misc = file->private_data;
> +	struct aspeed_xdma *ctx = container_of(misc, struct aspeed_xdma, misc);
> +	struct aspeed_xdma_client *client = kzalloc(sizeof(*client),
> +						    GFP_KERNEL);
> +
> +	if (!client)
> +		return -ENOMEM;
> +
> +	client->ctx = ctx;
> +	file->private_data = client;
> +	return 0;
> +}
> +
> +static int aspeed_xdma_release(struct inode *inode, struct file *file)
> +{
> +	struct aspeed_xdma_client *client = file->private_data;
> +
> +	if (client->ctx->current_client == client)
> +		client->ctx->current_client = NULL;

Shouldn't we also cancel the DMA op? This seems like a DoS risk: set up
a non-blocking, large downstream transfer then close the client. Also risks
scribbling on memory we no-longer own given we don't cancel/wait for
completion in vm close callback?

> +
> +	kfree(client);
> +	return 0;
> +}
> +
> +static const struct file_operations aspeed_xdma_fops = {
> +	.owner			= THIS_MODULE,
> +	.write			= aspeed_xdma_write,
> +	.poll			= aspeed_xdma_poll,
> +	.mmap			= aspeed_xdma_mmap,
> +	.open			= aspeed_xdma_open,
> +	.release		= aspeed_xdma_release,
> +};
> +
>  static int aspeed_xdma_probe(struct platform_device *pdev)
>  {
>  	int irq;
> @@ -539,6 +745,7 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
>  	ctx->chip = md;
>  	ctx->dev = dev;
>  	platform_set_drvdata(pdev, ctx);
> +	mutex_init(&ctx->file_lock);
>  	mutex_init(&ctx->start_lock);
>  	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
>  	spin_lock_init(&ctx->reset_lock);
> @@ -678,6 +885,22 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
>  
>  	aspeed_xdma_init_eng(ctx);
>  
> +	ctx->misc.minor = MISC_DYNAMIC_MINOR;
> +	ctx->misc.fops = &aspeed_xdma_fops;
> +	ctx->misc.name = "aspeed-xdma";
> +	ctx->misc.parent = dev;
> +	rc = misc_register(&ctx->misc);
> +	if (rc) {
> +		dev_err(dev, "Failed to register xdma miscdevice.\n");
> +
> +		gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
> +			      XDMA_CMDQ_SIZE);
> +
> +		reset_control_assert(ctx->reset);
> +		clk_disable_unprepare(ctx->clock);
> +		return rc;
> +	}
> +
>  	/*
>  	 * This interrupt could fire immediately so only request it once the
>  	 * engine and driver are initialized.
> @@ -699,6 +922,7 @@ static int aspeed_xdma_remove(struct platform_device *pdev)
>  {
>  	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
>  
> +	misc_deregister(&ctx->misc);
>  	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
>  		      XDMA_CMDQ_SIZE);
>  
> -- 
> 1.8.3.1
> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 09/12] ARM: dts: aspeed: ast2600: Add XDMA Engine
  2019-12-05 17:15 ` [PATCH v2 09/12] ARM: dts: aspeed: ast2600: " Eddie James
@ 2019-12-11  3:48   ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-11  3:48 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> Add a node for the XDMA engine with all the necessary information. Also
> add a simple syscon node for the SDRAM memory controller.
> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> ---
> Changes since v1:
>  - Add a syscon SDRAM controller
>  - Add various properties to XDMA node
> 
>  arch/arm/boot/dts/aspeed-g6.dtsi | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
> index ead336e..514d685 100644
> --- a/arch/arm/boot/dts/aspeed-g6.dtsi
> +++ b/arch/arm/boot/dts/aspeed-g6.dtsi
> @@ -3,6 +3,7 @@
>  
>  #include <dt-bindings/interrupt-controller/arm-gic.h>
>  #include <dt-bindings/clock/ast2600-clock.h>
> +#include <dt-bindings/interrupt-controller/aspeed-scu-ic.h>
>  
>  / {
>  	model = "Aspeed BMC";
> @@ -265,6 +266,11 @@
>  			status = "disabled";
>  		};
>  
> +		sdmc: sdram@1e6e0000 {
> +			compatible = "syscon";
> +			reg = <0x1e6e0000 0xb8>;
> +		};
> +

Hopefully we can drop this. We also need to figure out how whatever the solution is interacts
with the EDAC driver.

>  		apb {
>  			compatible = "simple-bus";
>  			#address-cells = <1>;
> @@ -311,6 +317,19 @@
>  				quality = <100>;
>  			};
>  
> +			xdma: xdma@1e6e7000 {
> +				compatible = "aspeed,ast2600-xdma";
> +				reg = <0x1e6e7000 0x100>;
> +				clocks = <&syscon ASPEED_CLK_GATE_BCLK>;
> +				resets = <&syscon ASPEED_RESET_DEV_XDMA>;
> +				interrupts-extended = <&gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
> +						      <&scu_ic0 ASPEED_AST2600_SCU_IC0_PCIE_PERST_LO_TO_HI>;
> +				pcie-device = "bmc";
> +				scu = <&syscon>;
> +				sdmc = <&sdmc>;

sdmc property should go away also.

> +				status = "disabled";
> +			};
> +
>  			gpio0: gpio@1e780000 {
>  				#gpio-cells = <2>;
>  				gpio-controller;
> -- 
> 1.8.3.1
> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-11  3:47   ` Andrew Jeffery
@ 2019-12-11 20:39     ` Eddie James
  2019-12-12  4:52       ` Andrew Jeffery
  0 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-11 20:39 UTC (permalink / raw)
  To: Andrew Jeffery, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley


On 12/10/19 9:47 PM, Andrew Jeffery wrote:
>
> On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
>> The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI
>> DMA operations between the SOC (acting as a BMC) and a host processor
>> in a server.
>>
>> This commit adds a driver to control the XDMA engine and adds functions
>> to initialize the hardware and memory and start DMA operations.
>>
>> Signed-off-by: Eddie James <eajames@linux.ibm.com>
>> ---
>> Changes since v1:
>>   - Add lock comments
>>   - Don't split reset into start/finish
>>   - Drop client_lock as the new reset method means client data is safe
>>   - Switch to a chip structure to control versions rather than enum/switch
>>   - Drop in_progress bool in favor or current_client being NULL or not
>>   - Get SDRAM controller from dts phandle
>>   - Configure PCI-E based on dts property
>>   - Get VGA memory space from dts property
>>   - Drop bmc_addr from aspeed_xdma_op as mmap can do all that
>>
>>   MAINTAINERS                      |   2 +
>>   drivers/soc/aspeed/Kconfig       |   8 +
>>   drivers/soc/aspeed/Makefile      |   1 +
>>   drivers/soc/aspeed/aspeed-xdma.c | 783 +++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/aspeed-xdma.h |  41 ++
>>   5 files changed, 835 insertions(+)
>>   create mode 100644 drivers/soc/aspeed/aspeed-xdma.c
>>   create mode 100644 include/uapi/linux/aspeed-xdma.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 528a142..617c03d 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2712,6 +2712,8 @@ M:	Eddie James <eajames@linux.ibm.com>
>>   L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
>>   S:	Maintained
>>   F:	Documentation/devicetree/bindings/soc/aspeed/xdma.txt
>> +F:	drivers/soc/aspeed/aspeed-xdma.c
>> +F:	include/uapi/linux/aspeed-xdma.h
>>   
>>   ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>>   M:	Corentin Chary <corentin.chary@gmail.com>
>> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig
>> index 323e177..2a6c16f 100644
>> --- a/drivers/soc/aspeed/Kconfig
>> +++ b/drivers/soc/aspeed/Kconfig
>> @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL
>>   	  ioctl()s, the driver also provides an interface for userspace mappings to
>>   	  a pre-defined region.
>>   
>> +config ASPEED_XDMA
>> +	tristate "Aspeed XDMA Engine Driver"
>> +	depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA
>> +	help
>> +	  Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX
>> +	  SOCs. The XDMA engine can perform automatic PCI DMA operations
>> +	  between the AST2XXX (acting as a BMC) and a host processor.
>> +
>>   endmenu
>> diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile
>> index b64be47..977b046 100644
>> --- a/drivers/soc/aspeed/Makefile
>> +++ b/drivers/soc/aspeed/Makefile
>> @@ -2,3 +2,4 @@
>>   obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
>>   obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
>>   obj-$(CONFIG_ASPEED_P2A_CTRL)	+= aspeed-p2a-ctrl.o
>> +obj-$(CONFIG_ASPEED_XDMA)	+= aspeed-xdma.o
>> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
>> new file mode 100644
>> index 0000000..a9b3eeb
>> --- /dev/null
>> +++ b/drivers/soc/aspeed/aspeed-xdma.c
>> @@ -0,0 +1,783 @@
>> +// SPDX-License-Identifier: GPL-2.0-or-later
>> +// Copyright IBM Corp 2019
>> +
>> +#include <linux/aspeed-xdma.h>
>> +#include <linux/bitfield.h>
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/device.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/fs.h>
>> +#include <linux/genalloc.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/jiffies.h>
>> +#include <linux/mfd/syscon.h>
>> +#include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/poll.h>
>> +#include <linux/regmap.h>
>> +#include <linux/reset.h>
>> +#include <linux/slab.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/string.h>
>> +#include <linux/uaccess.h>
>> +#include <linux/wait.h>
>> +#include <linux/workqueue.h>
>> +
>> +#define DEVICE_NAME				"aspeed-xdma"
>> +
>> +#define SCU_AST2500_PCIE_CONF			0x180
>> +#define SCU_AST2600_PCIE_CONF			0xc20
>> +#define  SCU_PCIE_CONF_VGA_EN			 BIT(0)
>> +#define  SCU_PCIE_CONF_VGA_EN_MMIO		 BIT(1)
>> +#define  SCU_PCIE_CONF_VGA_EN_LPC		 BIT(2)
>> +#define  SCU_PCIE_CONF_VGA_EN_MSI		 BIT(3)
>> +#define  SCU_PCIE_CONF_VGA_EN_MCTP		 BIT(4)
>> +#define  SCU_PCIE_CONF_VGA_EN_IRQ		 BIT(5)
>> +#define  SCU_PCIE_CONF_VGA_EN_DMA		 BIT(6)
>> +#define  SCU_PCIE_CONF_BMC_EN			 BIT(8)
>> +#define  SCU_PCIE_CONF_BMC_EN_MMIO		 BIT(9)
>> +#define  SCU_PCIE_CONF_BMC_EN_MSI		 BIT(11)
>> +#define  SCU_PCIE_CONF_BMC_EN_MCTP		 BIT(12)
>> +#define  SCU_PCIE_CONF_BMC_EN_IRQ		 BIT(13)
>> +#define  SCU_PCIE_CONF_BMC_EN_DMA		 BIT(14)
>> +
>> +#define SCU_AST2500_BMC_CLASS_REV		0x19c
>> +#define SCU_AST2600_BMC_CLASS_REV		0xc4c
>> +#define  SCU_BMC_CLASS_REV_XDMA			 0xff000001
>> +
>> +#define SDMC_REMAP				0x008
>> +#define  SDMC_AST2500_REMAP_PCIE		 BIT(16)
>> +#define  SDMC_AST2500_REMAP_XDMA		 BIT(17)
>> +#define  SDMC_AST2600_REMAP_XDMA		 BIT(18)
>> +
>> +#define XDMA_CMDQ_SIZE				PAGE_SIZE
>> +#define XDMA_NUM_CMDS				\
>> +	(XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd))
>> +
>> +/* Aspeed specification requires 10ms after switching the reset line */
>> +#define XDMA_RESET_TIME_MS			10
>> +
>> +#define XDMA_CMD_AST2500_PITCH_SHIFT		3
>> +#define XDMA_CMD_AST2500_PITCH_BMC		GENMASK_ULL(62, 51)
>> +#define XDMA_CMD_AST2500_PITCH_HOST		GENMASK_ULL(46, 35)
>> +#define XDMA_CMD_AST2500_PITCH_UPSTREAM		BIT_ULL(31)
>> +#define XDMA_CMD_AST2500_PITCH_ADDR		GENMASK_ULL(29, 4)
>> +#define XDMA_CMD_AST2500_PITCH_ID		BIT_ULL(0)
>> +#define XDMA_CMD_AST2500_CMD_IRQ_EN		BIT_ULL(31)
>> +#define XDMA_CMD_AST2500_CMD_LINE_NO		GENMASK_ULL(27, 16)
>> +#define XDMA_CMD_AST2500_CMD_IRQ_BMC		BIT_ULL(15)
>> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT	4
>> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE		\
>> +	GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT)
>> +#define XDMA_CMD_AST2500_CMD_ID			BIT_ULL(1)
>> +
>> +#define XDMA_CMD_AST2600_PITCH_BMC		GENMASK_ULL(62, 48)
>> +#define XDMA_CMD_AST2600_PITCH_HOST		GENMASK_ULL(46, 32)
>> +#define XDMA_CMD_AST2600_PITCH_ADDR		GENMASK_ULL(30, 0)
>> +#define XDMA_CMD_AST2600_CMD_64_EN		BIT_ULL(40)
>> +#define XDMA_CMD_AST2600_CMD_IRQ_BMC		BIT_ULL(37)
>> +#define XDMA_CMD_AST2600_CMD_IRQ_HOST		BIT_ULL(36)
>> +#define XDMA_CMD_AST2600_CMD_UPSTREAM		BIT_ULL(32)
>> +#define XDMA_CMD_AST2600_CMD_LINE_NO		GENMASK_ULL(27, 16)
>> +#define XDMA_CMD_AST2600_CMD_LINE_SIZE		GENMASK_ULL(14, 0)
>> +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE	GENMASK_ULL(14, 12)
>> +
>> +#define XDMA_AST2500_QUEUE_ENTRY_SIZE		4
>> +#define XDMA_AST2500_HOST_CMDQ_ADDR0		0x00
>> +#define XDMA_AST2500_HOST_CMDQ_ENDP		0x04
>> +#define XDMA_AST2500_HOST_CMDQ_WRITEP		0x08
>> +#define XDMA_AST2500_HOST_CMDQ_READP		0x0c
>> +#define XDMA_AST2500_BMC_CMDQ_ADDR		0x10
>> +#define XDMA_AST2500_BMC_CMDQ_ENDP		0x14
>> +#define XDMA_AST2500_BMC_CMDQ_WRITEP		0x18
>> +#define XDMA_AST2500_BMC_CMDQ_READP		0x1c
>> +#define  XDMA_BMC_CMDQ_READP_RESET		 0xee882266
>> +#define XDMA_AST2500_CTRL			0x20
>> +#define  XDMA_AST2500_CTRL_US_COMP		 BIT(4)
>> +#define  XDMA_AST2500_CTRL_DS_COMP		 BIT(5)
>> +#define  XDMA_AST2500_CTRL_DS_DIRTY		 BIT(6)
>> +#define  XDMA_AST2500_CTRL_DS_SIZE_256		 BIT(17)
>> +#define  XDMA_AST2500_CTRL_DS_TIMEOUT		 BIT(28)
>> +#define  XDMA_AST2500_CTRL_DS_CHECK_ID		 BIT(29)
>> +#define XDMA_AST2500_STATUS			0x24
>> +#define  XDMA_AST2500_STATUS_US_COMP		 BIT(4)
>> +#define  XDMA_AST2500_STATUS_DS_COMP		 BIT(5)
>> +#define  XDMA_AST2500_STATUS_DS_DIRTY		 BIT(6)
>> +#define XDMA_AST2500_INPRG_DS_CMD1		0x38
>> +#define XDMA_AST2500_INPRG_DS_CMD2		0x3c
>> +#define XDMA_AST2500_INPRG_US_CMD00		0x40
>> +#define XDMA_AST2500_INPRG_US_CMD01		0x44
>> +#define XDMA_AST2500_INPRG_US_CMD10		0x48
>> +#define XDMA_AST2500_INPRG_US_CMD11		0x4c
>> +#define XDMA_AST2500_INPRG_US_CMD20		0x50
>> +#define XDMA_AST2500_INPRG_US_CMD21		0x54
>> +#define XDMA_AST2500_HOST_CMDQ_ADDR1		0x60
>> +#define XDMA_AST2500_VGA_CMDQ_ADDR0		0x64
>> +#define XDMA_AST2500_VGA_CMDQ_ENDP		0x68
>> +#define XDMA_AST2500_VGA_CMDQ_WRITEP		0x6c
>> +#define XDMA_AST2500_VGA_CMDQ_READP		0x70
>> +#define XDMA_AST2500_VGA_CMD_STATUS		0x74
>> +#define XDMA_AST2500_VGA_CMDQ_ADDR1		0x78
>> +
>> +#define XDMA_AST2600_QUEUE_ENTRY_SIZE		2
>> +#define XDMA_AST2600_HOST_CMDQ_ADDR0		0x00
>> +#define XDMA_AST2600_HOST_CMDQ_ADDR1		0x04
>> +#define XDMA_AST2600_HOST_CMDQ_ENDP		0x08
>> +#define XDMA_AST2600_HOST_CMDQ_WRITEP		0x0c
>> +#define XDMA_AST2600_HOST_CMDQ_READP		0x10
>> +#define XDMA_AST2600_BMC_CMDQ_ADDR		0x14
>> +#define XDMA_AST2600_BMC_CMDQ_ENDP		0x18
>> +#define XDMA_AST2600_BMC_CMDQ_WRITEP		0x1c
>> +#define XDMA_AST2600_BMC_CMDQ_READP		0x20
>> +#define XDMA_AST2600_VGA_CMDQ_ADDR0		0x24
>> +#define XDMA_AST2600_VGA_CMDQ_ADDR1		0x28
>> +#define XDMA_AST2600_VGA_CMDQ_ENDP		0x2c
>> +#define XDMA_AST2600_VGA_CMDQ_WRITEP		0x30
>> +#define XDMA_AST2600_VGA_CMDQ_READP		0x34
>> +#define XDMA_AST2600_CTRL			0x38
>> +#define  XDMA_AST2600_CTRL_US_COMP		 BIT(16)
>> +#define  XDMA_AST2600_CTRL_DS_COMP		 BIT(17)
>> +#define  XDMA_AST2600_CTRL_DS_DIRTY		 BIT(18)
>> +#define  XDMA_AST2600_CTRL_DS_SIZE_256		 BIT(20)
>> +#define XDMA_AST2600_STATUS			0x3c
>> +#define  XDMA_AST2600_STATUS_US_COMP		 BIT(16)
>> +#define  XDMA_AST2600_STATUS_DS_COMP		 BIT(17)
>> +#define  XDMA_AST2600_STATUS_DS_DIRTY		 BIT(18)
>> +#define XDMA_AST2600_INPRG_DS_CMD00		0x40
>> +#define XDMA_AST2600_INPRG_DS_CMD01		0x44
>> +#define XDMA_AST2600_INPRG_DS_CMD10		0x48
>> +#define XDMA_AST2600_INPRG_DS_CMD11		0x4c
>> +#define XDMA_AST2600_INPRG_DS_CMD20		0x50
>> +#define XDMA_AST2600_INPRG_DS_CMD21		0x54
>> +#define XDMA_AST2600_INPRG_US_CMD00		0x60
>> +#define XDMA_AST2600_INPRG_US_CMD01		0x64
>> +#define XDMA_AST2600_INPRG_US_CMD10		0x68
>> +#define XDMA_AST2600_INPRG_US_CMD11		0x6c
>> +#define XDMA_AST2600_INPRG_US_CMD20		0x70
>> +#define XDMA_AST2600_INPRG_US_CMD21		0x74
>> +
>> +struct aspeed_xdma_cmd {
>> +	u64 host_addr;
>> +	u64 pitch;
>> +	u64 cmd;
>> +	u64 reserved;
>> +};
>> +
>> +struct aspeed_xdma_regs {
>> +	u8 bmc_cmdq_addr;
>> +	u8 bmc_cmdq_endp;
>> +	u8 bmc_cmdq_writep;
>> +	u8 bmc_cmdq_readp;
>> +	u8 control;
>> +	u8 status;
>> +};
>> +
>> +struct aspeed_xdma_status_bits {
>> +	u32 us_comp;
>> +	u32 ds_comp;
>> +	u32 ds_dirty;
>> +};
>> +
>> +struct aspeed_xdma;
>> +
>> +struct aspeed_xdma_chip {
>> +	u32 control;
>> +	u32 scu_bmc_class;
>> +	u32 scu_pcie_conf;
>> +	u32 sdmc_remap;
>> +	unsigned int queue_entry_size;
>> +	struct aspeed_xdma_regs regs;
>> +	struct aspeed_xdma_status_bits status_bits;
>> +	unsigned int (*set_cmd)(struct aspeed_xdma *ctx,
>> +				struct aspeed_xdma_op *op, u32 bmc_addr);
>> +};
>> +
>> +struct aspeed_xdma_client;
>> +
>> +struct aspeed_xdma {
>> +	const struct aspeed_xdma_chip *chip;
>> +
>> +	struct device *dev;
>> +	void __iomem *base;
>> +	struct clk *clock;
>> +	struct reset_control *reset;
>> +
>> +	struct aspeed_xdma_client *current_client;
>> +
>> +	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
>> +	struct mutex start_lock;
>> +	void *cmdq;
> Can this not be typed as `struct aspeed_xdma_cmd *cmdq`?


Good idea.


>
>> +	bool in_reset;
> Bit of a nit, but can we move in_reset under reset_lock below?
>
>> +	bool upstream;
>> +	unsigned int cmd_idx;
>> +
>> +	/* reset_lock protects in_reset and the reset state of the engine */
>> +	spinlock_t reset_lock;
>> +
>> +	wait_queue_head_t wait;
>> +	struct work_struct reset_work;
>> +
>> +	u32 vga_phys;
>> +	u32 vga_size;
>> +	void __iomem *vga_virt;
>> +	dma_addr_t cmdq_vga_phys;
>> +	void *cmdq_vga_virt;
>> +	struct gen_pool *vga_pool;
> This shouldn't have anything to do with VGA specifically, we could
> theoretically use any other piece of reserved memory (in the right
> security context).
>
>> +};
>> +
>> +struct aspeed_xdma_client {
>> +	struct aspeed_xdma *ctx;
>> +
>> +	bool error;
>> +	bool in_progress;
>> +	void *virt;
>> +	dma_addr_t phys;
>> +	u32 size;
>> +};
>> +
>> +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg)
>> +{
>> +	u32 v = readl(ctx->base + reg);
>> +
>> +	dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v);
>> +	return v;
>> +}
>> +
>> +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val)
>> +{
>> +	writel(val, ctx->base + reg);
>> +	dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, val);
>> +}
>> +
>> +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx)
>> +{
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_endp,
>> +			   ctx->chip->queue_entry_size * XDMA_NUM_CMDS);
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_readp,
>> +			   XDMA_BMC_CMDQ_READP_RESET);
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep, 0);
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.control, ctx->chip->control);
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_addr,
>> +			   ctx->cmdq_vga_phys);
>> +
>> +	ctx->cmd_idx = 0;
>> +	ctx->current_client = NULL;
>> +}
>> +
>> +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx,
>> +						struct aspeed_xdma_op *op,
>> +						u32 bmc_addr)
>> +{
>> +	u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC |
>> +		XDMA_CMD_AST2500_CMD_ID;
>> +	u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) |
>> +		XDMA_CMD_AST2500_PITCH_ID;
>> +	unsigned int line_size;
>> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
>> +	unsigned int line_no = 1;
>> +	unsigned int pitch = 1;
>> +	struct aspeed_xdma_cmd *ncmd =
>> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
> Changing ctx->cmdq away from a `void *` would help out here.
>
>> +
>> +	dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n",
>> +		op->direction ? "upstream" : "downstream", bmc_addr, op->len,
>> +		(u32)op->host_addr);
>> +
>> +	if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) {
>> +		unsigned int rem;
>> +		unsigned int total;
>> +
>> +		line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE;
>> +		total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no;
>> +		rem = (op->len - total) >>
>> +			XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
>> +		line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE;
>> +		pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT;
>> +		line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
>> +
>> +		if (rem) {
>> +			u32 rbmc = bmc_addr + total;
>> +			struct aspeed_xdma_cmd *rcmd =
>> +				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
>> +
>> +			rcmd->host_addr = op->host_addr + (u64)total;
>> +			rcmd->pitch = cmd_pitch |
>> +				((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) |
>> +				FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) |
>> +				FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1);
>> +			rcmd->cmd = cmd |
>> +				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) |
>> +				FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE,
>> +					   rem);
>> +
>> +			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
>> +					     16, 1, rcmd, sizeof(*rcmd), true);
>> +
>> +			cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN |
>> +				 XDMA_CMD_AST2500_CMD_IRQ_BMC);
>> +
>> +			nidx = (nidx + 1) % XDMA_NUM_CMDS;
>> +		}
>> +	} else {
>> +		line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT;
>> +	}
>> +
>> +	ncmd->host_addr = op->host_addr;
>> +	ncmd->pitch = cmd_pitch |
>> +		((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) |
>> +		FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) |
>> +		FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch);
>> +	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) |
>> +		FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size);
>> +
>> +	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
>> +			     sizeof(*ncmd), true);
>> +
>> +	return nidx;
>> +}
>> +
>> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx,
>> +						struct aspeed_xdma_op *op,
>> +						u32 bmc_addr)
>> +{
>> +	u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC |
>> +		(op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0);
>> +	unsigned int line_size;
>> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
>> +	unsigned int line_no = 1;
>> +	unsigned int pitch = 1;
>> +	struct aspeed_xdma_cmd *ncmd =
>> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
>> +
>> +	if ((op->host_addr + op->len) & 0xffffffff00000000ULL)
> Do we know that this won't wrap?


No, but I assume it would be a bad transfer anyway at that point?


>
>> +		cmd |= XDMA_CMD_AST2600_CMD_64_EN;
>> +
>> +	dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] "
>> +		"host[%016llx]\n", op->direction ? "upstream" : "downstream",
>> +		bmc_addr, op->len, op->host_addr);
>> +
>> +	if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) {
>> +		unsigned int rem;
>> +		unsigned int total;
>> +
>> +		line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
>> +		total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no;
>> +		rem = op->len - total;
>> +		line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE;
>> +		pitch = line_size;
>> +
>> +		if (rem) {
>> +			u32 rbmc = bmc_addr + total;
>> +			struct aspeed_xdma_cmd *rcmd =
>> +				&(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]);
>> +
>> +			rcmd->host_addr = op->host_addr + (u64)total;
>> +			rcmd->pitch =
>> +				((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) |
>> +				FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) |
>> +				FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1);
>> +			rcmd->cmd = cmd |
>> +				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) |
>> +				FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE,
>> +					   rem);
>> +
>> +			print_hex_dump_debug("xdma rem ", DUMP_PREFIX_OFFSET,
>> +					     16, 1, rcmd, sizeof(*rcmd), true);
>> +
>> +			cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC;
>> +
>> +			nidx = (nidx + 1) % XDMA_NUM_CMDS;
>> +		}
>> +	} else {
>> +		line_size = op->len;
>> +	}
>> +
>> +	ncmd->host_addr = op->host_addr;
>> +	ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) |
>> +		FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) |
>> +		FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch);
>> +	ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) |
>> +		FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size);
>> +
>> +	print_hex_dump_debug("xdma cmd ", DUMP_PREFIX_OFFSET, 16, 1, ncmd,
>> +			     sizeof(*ncmd), true);
>> +
>> +	return nidx;
>> +}
>> +
>> +static void aspeed_xdma_start(struct aspeed_xdma *ctx,
>> +			      struct aspeed_xdma_op *op, u32 bmc_addr,
>> +			      struct aspeed_xdma_client *client)
>> +{
>> +	unsigned int nidx;
>> +
>> +	mutex_lock(&ctx->start_lock);
>> +
>> +	nidx = ctx->chip->set_cmd(ctx, op, bmc_addr);
>> +	memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE);
>> +
>> +	client->in_progress = true;
>> +	ctx->current_client = client;
>> +	ctx->upstream = op->direction ? true : false;
> Could get away without the branch, just assign directly or use the !! idiom
> (you're already turning op->direction into a boolean by using it as the
> condition).
>
>> +
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.bmc_cmdq_writep,
>> +			   nidx * ctx->chip->queue_entry_size);
>> +
>> +	ctx->cmd_idx = nidx;
>> +
>> +	mutex_unlock(&ctx->start_lock);
>> +}
>> +
>> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error)
>> +{
>> +	if (ctx->current_client) {
>> +		ctx->current_client->error = error;
>> +		ctx->current_client->in_progress = false;
>> +		ctx->current_client = NULL;
> You need to take start_lock before writing these members to ensure the
> writes are not reordered across acquisition of start_lock in
> aspeed_xdma_start() above, unless there's some other guarantee of that?


Unless we get spurious interrupts (as in, the xdma interrupt fires with 
no transfer started, and somehow the correct status bits are set), it's 
not possible to execute this at the same time as aspeed_xdma_start(). So 
I did not try and lock here. Do you think it's worth locking for that 
situation?


>
> I see that aspeed_xdma_done() is called from aspeed_xdma_irq() below,
> so maybe we need a different locking strategy or punt aspeed_xdma_done()
> to a workqueue, as start_lock is a mutex.
>
>> +	}
>> +
>> +	wake_up_interruptible_all(&ctx->wait);
>> +}
>> +
>> +static irqreturn_t aspeed_xdma_irq(int irq, void *arg)
>> +{
>> +	struct aspeed_xdma *ctx = arg;
>> +	u32 status = aspeed_xdma_readl(ctx, ctx->chip->regs.status);
>> +
>> +	if (status & ctx->chip->status_bits.ds_dirty) {
>> +		aspeed_xdma_done(ctx, true);
>> +	} else {
>> +		if (status & ctx->chip->status_bits.us_comp) {
>> +			if (ctx->upstream)
>> +				aspeed_xdma_done(ctx, false);
>> +		}
>> +
>> +		if (status & ctx->chip->status_bits.ds_comp) {
>> +			if (!ctx->upstream)
>> +				aspeed_xdma_done(ctx, false);
>> +		}
>> +	}
>> +
>> +	aspeed_xdma_writel(ctx, ctx->chip->regs.status, status);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static void aspeed_xdma_reset(struct aspeed_xdma *ctx)
>> +{
>> +	reset_control_assert(ctx->reset);
>> +	msleep(XDMA_RESET_TIME_MS);
>> +
>> +	reset_control_deassert(ctx->reset);
>> +	msleep(XDMA_RESET_TIME_MS);
>> +
>> +	aspeed_xdma_init_eng(ctx);
>> +
>> +	ctx->in_reset = false;
>> +	aspeed_xdma_done(ctx, true);
>> +}
>> +
>> +static void aspeed_xdma_reset_work(struct work_struct *work)
>> +{
>> +	struct aspeed_xdma *ctx = container_of(work, struct aspeed_xdma,
>> +					       reset_work);
>> +
>> +	/*
>> +	 * Lock to make sure operations aren't started while the engine is
>> +	 * in reset.
>> +	 */
>> +	mutex_lock(&ctx->start_lock);
>> +
>> +	aspeed_xdma_reset(ctx);
>> +
>> +	mutex_unlock(&ctx->start_lock);
>> +}
>> +
>> +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg)
>> +{
>> +	unsigned long flags;
>> +	struct aspeed_xdma *ctx = arg;
>> +
>> +	dev_dbg(ctx->dev, "pcie reset\n");
>> +
>> +	spin_lock_irqsave(&ctx->reset_lock, flags);
>> +	if (ctx->in_reset) {
>> +		spin_unlock_irqrestore(&ctx->reset_lock, flags);
>> +		return IRQ_HANDLED;
>> +	}
>> +
>> +	ctx->in_reset = true;
>> +	spin_unlock_irqrestore(&ctx->reset_lock, flags);
>> +
>> +	schedule_work(&ctx->reset_work);
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static int aspeed_xdma_probe(struct platform_device *pdev)
>> +{
>> +	int irq;
>> +	int pcie_irq;
>> +	int rc;
>> +	u32 vgamem[2];
>> +	struct regmap *scu;
>> +	struct regmap *sdmc;
>> +	struct aspeed_xdma *ctx;
>> +	struct device *dev = &pdev->dev;
>> +	const void *md = of_device_get_match_data(dev);
> So close to reverse christmas tree :)


I'll be sure to fix it :)


>
>> +
>> +	if (!md)
>> +		return -ENODEV;
>> +
>> +	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
>> +	if (!ctx)
>> +		return -ENOMEM;
>> +
>> +	ctx->chip = md;
>> +	ctx->dev = dev;
>> +	platform_set_drvdata(pdev, ctx);
>> +	mutex_init(&ctx->start_lock);
>> +	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
>> +	spin_lock_init(&ctx->reset_lock);
>> +	init_waitqueue_head(&ctx->wait);
>> +
>> +	ctx->base = devm_platform_ioremap_resource(pdev, 0);
>> +	if (IS_ERR(ctx->base)) {
>> +		dev_err(dev, "Failed to map registers.\n");
>> +		return PTR_ERR(ctx->base);
>> +	}
>> +
>> +	irq = platform_get_irq(pdev, 0);
>> +	if (irq < 0) {
>> +		dev_err(dev, "Unable to find IRQ.\n");
>> +		return irq;
>> +	}
>> +
>> +	rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED,
>> +			      DEVICE_NAME, ctx);
>> +	if (rc < 0) {
>> +		dev_err(dev, "Failed to request IRQ %d.\n", irq);
>> +		return rc;
>> +	}
>> +
>> +	ctx->clock = devm_clk_get(dev, NULL);
>> +	if (IS_ERR(ctx->clock)) {
>> +		dev_err(dev, "Failed to request clock.\n");
>> +		return PTR_ERR(ctx->clock);
>> +	}
>> +
>> +	ctx->reset = devm_reset_control_get_exclusive(dev, NULL);
>> +	if (IS_ERR(ctx->reset)) {
>> +		dev_err(dev, "Failed to request reset control.\n");
>> +		return PTR_ERR(ctx->reset);
>> +	}
>> +
>> +	ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL);
>> +	if (!ctx->cmdq) {
>> +		dev_err(dev, "Failed to allocate command queue.\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL);
>> +	if (!ctx->vga_pool) {
>> +		dev_err(dev, "Failed to setup genalloc pool.\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	rc = of_property_read_u32_array(dev->of_node, "vga-mem", vgamem, 2);
> As mentioned, this could be any reserved memory range. Also can't we get it as
> a resource rather than parsing a u32 array? Not sure if there's an advantage
> but it feels like a better representation.


That doesn't work unfortunately because the VGA memory is not mapped and 
the reserved memory subsystem fails to find it.


>
>> +	if (rc) {
>> +		dev_err(dev, "Unable to get VGA memory space.\n");
>> +		return rc;
>> +	}
>> +
>> +	ctx->vga_phys = vgamem[0];
>> +	ctx->vga_size = vgamem[1];
>> +
>> +	ctx->vga_virt = devm_ioremap(dev, ctx->vga_phys, ctx->vga_size);
>> +	if (IS_ERR(ctx->vga_virt)) {
>> +		dev_err(dev, "Failed to map VGA memory space.\n");
>> +		return PTR_ERR(ctx->vga_virt);
>> +	}
>> +
>> +	rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt,
>> +			       ctx->vga_phys, ctx->vga_size, -1);
>> +	if (rc) {
>> +		dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n");
>> +		return rc;
>> +	}
>> +
>> +	sdmc = syscon_regmap_lookup_by_phandle(dev->of_node, "sdmc");
>> +	if (IS_ERR(sdmc)) {
>> +		dev_err(dev, "Unable to configure memory controller.\n");
>> +		return PTR_ERR(sdmc);
>> +	}
>> +
>> +	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
>> +			   ctx->chip->sdmc_remap);
> I disagree with doing this. As mentioned on the bindings it should be up to
> the platform integrator to ensure that this is configured appropriately.


Probably so, but then how does one actually configure that elsewhere? Do 
you mean add code to the edac driver (and add support for the ast2600) 
to read some dts properties to set it?


>
>> +
>> +	scu = syscon_regmap_lookup_by_phandle(dev->of_node, "scu");
>> +	if (!IS_ERR(scu)) {
>> +		u32 selection;
>> +		bool pcie_device_bmc = true;
>> +		const u32 bmc = SCU_PCIE_CONF_BMC_EN |
>> +			SCU_PCIE_CONF_BMC_EN_MSI | SCU_PCIE_CONF_BMC_EN_IRQ |
>> +			SCU_PCIE_CONF_BMC_EN_DMA;
>> +		const u32 vga = SCU_PCIE_CONF_VGA_EN |
>> +			SCU_PCIE_CONF_VGA_EN_MSI | SCU_PCIE_CONF_VGA_EN_IRQ |
>> +			SCU_PCIE_CONF_VGA_EN_DMA;
>> +		const char *pcie = NULL;
>> +
>> +		if (!of_property_read_string(dev->of_node, "pcie-device",
>> +					     &pcie)) {
>> +			if (!strcmp(pcie, "vga"))
>> +				pcie_device_bmc = false;
> You need to return an error here if the string is not one of "vga" or "bmc". This
> just assumes that anything that isn't "vga" is "bmc".
>
>> +		}
>> +
>> +		if (pcie_device_bmc) {
>> +			selection = bmc;
>> +			regmap_write(scu, ctx->chip->scu_bmc_class,
>> +				     SCU_BMC_CLASS_REV_XDMA);
>> +		} else {
>> +			selection = vga;
>> +		}
>> +
>> +		regmap_update_bits(scu, ctx->chip->scu_pcie_conf, bmc | vga,
>> +				   selection);
>> +	} else {
>> +		dev_warn(dev, "Unable to configure PCIe; continuing.\n");
>> +	}
>> +
>> +	rc = clk_prepare_enable(ctx->clock);
>> +	if (rc) {
>> +		dev_err(dev, "Failed to enable the clock.\n");
>> +		return rc;
>> +	}
>> +	msleep(XDMA_RESET_TIME_MS);
>> +
>> +	rc = reset_control_deassert(ctx->reset);
>> +	if (rc) {
>> +		clk_disable_unprepare(ctx->clock);
>> +
>> +		dev_err(dev, "Failed to clear the reset.\n");
>> +		return rc;
>> +	}
>> +	msleep(XDMA_RESET_TIME_MS);
>> +
>> +	ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE,
>> +						&ctx->cmdq_vga_phys);
>> +	if (!ctx->cmdq_vga_virt) {
>> +		dev_err(ctx->dev, "Failed to genalloc cmdq.\n");
>> +
>> +		reset_control_assert(ctx->reset);
>> +		clk_disable_unprepare(ctx->clock);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	aspeed_xdma_init_eng(ctx);
>> +
>> +	/*
>> +	 * This interrupt could fire immediately so only request it once the
>> +	 * engine and driver are initialized.
>> +	 */
>> +	pcie_irq = platform_get_irq(pdev, 1);
>> +	if (pcie_irq < 0) {
>> +		dev_warn(dev, "Unable to find PCI-E IRQ.\n");
>> +	} else {
>> +		rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq,
>> +				      IRQF_SHARED, DEVICE_NAME, ctx);
>> +		if (rc < 0)
>> +			dev_warn(dev, "Failed to request PCI-E IRQ %d.\n", rc);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int aspeed_xdma_remove(struct platform_device *pdev)
>> +{
>> +	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
>> +
>> +	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
>> +		      XDMA_CMDQ_SIZE);
>> +
>> +	reset_control_assert(ctx->reset);
>> +	clk_disable_unprepare(ctx->clock);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct aspeed_xdma_chip aspeed_ast2500_xdma_chip = {
>> +	.control = XDMA_AST2500_CTRL_US_COMP | XDMA_AST2500_CTRL_DS_COMP |
>> +		XDMA_AST2500_CTRL_DS_DIRTY | XDMA_AST2500_CTRL_DS_SIZE_256 |
>> +		XDMA_AST2500_CTRL_DS_TIMEOUT | XDMA_AST2500_CTRL_DS_CHECK_ID,
>> +	.scu_bmc_class = SCU_AST2500_BMC_CLASS_REV,
>> +	.scu_pcie_conf = SCU_AST2500_PCIE_CONF,
>> +	.sdmc_remap = SDMC_AST2500_REMAP_PCIE | SDMC_AST2500_REMAP_XDMA,
>> +	.queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE,
>> +	.regs = {
>> +		.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR,
>> +		.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP,
>> +		.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP,
>> +		.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP,
>> +		.control = XDMA_AST2500_CTRL,
>> +		.status = XDMA_AST2500_STATUS,
>> +	},
>> +	.status_bits = {
>> +		.us_comp = XDMA_AST2500_STATUS_US_COMP,
>> +		.ds_comp = XDMA_AST2500_STATUS_DS_COMP,
>> +		.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY,
>> +	},
>> +	.set_cmd = aspeed_xdma_ast2500_set_cmd,
>> +};
>> +
>> +static const struct aspeed_xdma_chip aspeed_ast2600_xdma_chip = {
>> +	.control = XDMA_AST2600_CTRL_US_COMP | XDMA_AST2600_CTRL_DS_COMP |
>> +		XDMA_AST2600_CTRL_DS_DIRTY | XDMA_AST2600_CTRL_DS_SIZE_256,
>> +	.scu_bmc_class = SCU_AST2600_BMC_CLASS_REV,
>> +	.scu_pcie_conf = SCU_AST2600_PCIE_CONF,
>> +	.sdmc_remap = SDMC_AST2600_REMAP_XDMA,
>> +	.queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE,
>> +	.regs = {
>> +		.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR,
>> +		.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP,
>> +		.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP,
>> +		.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP,
>> +		.control = XDMA_AST2600_CTRL,
>> +		.status = XDMA_AST2600_STATUS,
>> +	},
>> +	.status_bits = {
>> +		.us_comp = XDMA_AST2600_STATUS_US_COMP,
>> +		.ds_comp = XDMA_AST2600_STATUS_DS_COMP,
>> +		.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY,
>> +	},
>> +	.set_cmd = aspeed_xdma_ast2600_set_cmd,
>> +};
>> +
>> +static const struct of_device_id aspeed_xdma_match[] = {
>> +	{
>> +		.compatible = "aspeed,ast2500-xdma",
>> +		.data = &aspeed_ast2500_xdma_chip,
>> +	},
>> +	{
>> +		.compatible = "aspeed,ast2600-xdma",
>> +		.data = &aspeed_ast2600_xdma_chip,
>> +	},
>> +	{ },
>> +};
> Thanks, I think this representation is a lot cleaner.
>
>> +
>> +static struct platform_driver aspeed_xdma_driver = {
>> +	.probe = aspeed_xdma_probe,
>> +	.remove = aspeed_xdma_remove,
>> +	.driver = {
>> +		.name = DEVICE_NAME,
>> +		.of_match_table = aspeed_xdma_match,
>> +	},
>> +};
>> +
>> +module_platform_driver(aspeed_xdma_driver);
>> +
>> +MODULE_AUTHOR("Eddie James");
>> +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h
>> new file mode 100644
>> index 0000000..f39f38e
>> --- /dev/null
>> +++ b/include/uapi/linux/aspeed-xdma.h
>> @@ -0,0 +1,41 @@
>> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
>> +/* Copyright IBM Corp 2019 */
>> +
>> +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_
>> +#define _UAPI_LINUX_ASPEED_XDMA_H_
>> +
>> +#include <linux/types.h>
>> +
>> +/*
>> + * aspeed_xdma_direction
>> + *
>> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC
>> + *
>> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host
>> + *
>> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine
>> + */
>> +enum aspeed_xdma_direction {
>> +	ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0,
>> +	ASPEED_XDMA_DIRECTION_UPSTREAM,
>> +	ASPEED_XDMA_DIRECTION_RESET,
> I still think having a reset action as part of the direction is a bit funky. Can you maybe
> put that in a separate patch so we can debate it later?


I can, but I'm fairly convinced this is the cleanest way to add the 
reset functionality.


>
>> +};
>> +
>> +/*
>> + * aspeed_xdma_op
>> + *
>> + * host_addr: the DMA address on the host side, typically configured by PCI
>> + *            subsystem
>> + *
>> + * len: the size of the transfer in bytes
>> + *
>> + * direction: an enumerator indicating the direction of the DMA operation; see
>> + *            enum aspeed_xdma_direction
>> + */
>> +struct aspeed_xdma_op {
>> +	__u64 host_addr;
>> +	__u32 len;
>> +	__u32 direction;
>> +};
>> +
>> +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 07/12] drivers/soc: xdma: Add user interface
  2019-12-11  3:48   ` Andrew Jeffery
@ 2019-12-11 20:43     ` Eddie James
  2019-12-12  5:02       ` Andrew Jeffery
  0 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-11 20:43 UTC (permalink / raw)
  To: Andrew Jeffery, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley


On 12/10/19 9:48 PM, Andrew Jeffery wrote:
>
> On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
>> This commits adds a miscdevice to provide a user interface to the XDMA
>> engine. The interface provides the write operation to start DMA
>> operations. The DMA parameters are passed as the data to the write call.
>> The actual data to transfer is NOT passed through write. Note that both
>> directions of DMA operation are accomplished through the write command;
>> BMC to host and host to BMC.
>>
>> The XDMA engine is restricted to only accessing the reserved memory
>> space on the AST2500, typically used by the VGA. For this reason, the
>> VGA memory space is pooled and allocated with genalloc. Users calling
>> mmap allocate pages from this pool for their usage. The space allocated
>> by a client will be the space used in the DMA operation. For an
>> "upstream" (BMC to host) operation, the data in the client's area will
>> be transferred to the host. For a "downstream" (host to BMC) operation,
>> the host data will be placed in the client's memory area.
> Given the comments on earlier patches we should reconsider descriptions
> of the VGA area in this paragraph.
>
>> Poll is also provided in order to determine when the DMA operation is
>> complete for non-blocking IO.
>>
>> Signed-off-by: Eddie James <eajames@linux.ibm.com>
>> ---
>> Changes since v1:
>>   - Add file_lock comment
>>   - Bring user reset up to date with new reset method
>>
>>   drivers/soc/aspeed/aspeed-xdma.c | 224 +++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 224 insertions(+)
>>
>> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c
>> index a9b3eeb..d4b96a7 100644
>> --- a/drivers/soc/aspeed/aspeed-xdma.c
>> +++ b/drivers/soc/aspeed/aspeed-xdma.c
>> @@ -13,6 +13,7 @@
>>   #include <linux/io.h>
>>   #include <linux/jiffies.h>
>>   #include <linux/mfd/syscon.h>
>> +#include <linux/miscdevice.h>
>>   #include <linux/module.h>
>>   #include <linux/mutex.h>
>>   #include <linux/of_device.h>
>> @@ -206,6 +207,8 @@ struct aspeed_xdma {
>>   	struct clk *clock;
>>   	struct reset_control *reset;
>>   
>> +	/* file_lock serializes reads of current_client */
>> +	struct mutex file_lock;
>>   	struct aspeed_xdma_client *current_client;
>>   
>>   	/* start_lock protects cmd_idx, cmdq, and the state of the engine */
>> @@ -227,6 +230,8 @@ struct aspeed_xdma {
>>   	dma_addr_t cmdq_vga_phys;
>>   	void *cmdq_vga_virt;
>>   	struct gen_pool *vga_pool;
>> +
>> +	struct miscdevice misc;
>>   };
>>   
>>   struct aspeed_xdma_client {
>> @@ -517,6 +522,207 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq,
>> void *arg)
>>   	return IRQ_HANDLED;
>>   }
>>   
>> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf,
>> +				 size_t len, loff_t *offset)
>> +{
>> +	int rc;
>> +	struct aspeed_xdma_op op;
>> +	struct aspeed_xdma_client *client = file->private_data;
>> +	struct aspeed_xdma *ctx = client->ctx;
>> +	u32 offs = client->phys ? (client->phys - ctx->vga_phys) :
>> +		XDMA_CMDQ_SIZE;
>> +
>> +	if (len != sizeof(op))
>> +		return -EINVAL;
>> +
>> +	rc = copy_from_user(&op, buf, len);
>> +	if (rc)
>> +		return rc;
>> +
>> +	if (op.direction == ASPEED_XDMA_DIRECTION_RESET) {
>> +		unsigned long flags;
>> +
>> +		spin_lock_irqsave(&ctx->reset_lock, flags);
>> +		if (ctx->in_reset) {
>> +			spin_unlock_irqrestore(&ctx->reset_lock, flags);
>> +			return len;
>> +		}
>> +
>> +		ctx->in_reset = true;
>> +		spin_unlock_irqrestore(&ctx->reset_lock, flags);
>> +
>> +		mutex_lock(&ctx->start_lock);
>> +
>> +		aspeed_xdma_reset(ctx);
>> +
>> +		mutex_unlock(&ctx->start_lock);
>> +
>> +		return len;
>> +	} else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) {
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (op.len > ctx->vga_size - offs)
>> +		return -EINVAL;
> I'm wondering if we can rearrange the code to move the sanity checks to the
> top of the function, so this and the `op.direction >
> ASPEED_XDMA_DIRECTION_RESET` case.
>
> The check above should fail for the reset case as well, I expect op.len should
> be set to zero in that case. But I still think that jamming the reset command
> into a "direction" concept feels broken, so as mentioned on an earlier patch
> I'd prefer we move that distraction out to a separate patch.
>
>> +
>> +	if (file->f_flags & O_NONBLOCK) {
>> +		if (!mutex_trylock(&ctx->file_lock))
>> +			return -EAGAIN;
>> +
>> +		if (ctx->current_client) {
>> +			mutex_unlock(&ctx->file_lock);
>> +			return -EAGAIN;
> I think EBUSY is better here.


Sure.


>
>> +		}
>> +	} else {
>> +		mutex_lock(&ctx->file_lock);
>> +
>> +		rc = wait_event_interruptible(ctx->wait, !ctx->current_client);
>> +		if (rc) {
>> +			mutex_unlock(&ctx->file_lock);
>> +			return -EINTR;
>> +		}
>> +	}
>> +
>> +	aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client);
>> +
>> +	mutex_unlock(&ctx->file_lock);
> You've used file_lock here to protect aspeed_xdma_start() but start_lock
> above to protect aspeed_xdma_reset(), so it seems one client can disrupt
> another by resetting the engine while a DMA is in progress?


That's correct, that is the intention. In case the transfer hangs, 
another client needs to be able to reset and clear up a blocking transfer.


>
>> +
>> +	if (!(file->f_flags & O_NONBLOCK)) {
>> +		rc = wait_event_interruptible(ctx->wait, !client->in_progress);
>> +		if (rc)
>> +			return -EINTR;
>> +
>> +		if (client->error)
>> +			return -EIO;
>> +	}
>> +
>> +	return len;
>> +}
>> +
>> +static __poll_t aspeed_xdma_poll(struct file *file,
>> +				 struct poll_table_struct *wait)
>> +{
>> +	__poll_t mask = 0;
>> +	__poll_t req = poll_requested_events(wait);
>> +	struct aspeed_xdma_client *client = file->private_data;
>> +	struct aspeed_xdma *ctx = client->ctx;
>> +
>> +	if (req & (EPOLLIN | EPOLLRDNORM)) {
>> +		if (client->in_progress)
>> +			poll_wait(file, &ctx->wait, wait);
>> +
>> +		if (!client->in_progress) {
>> +			if (client->error)
>> +				mask |= EPOLLERR;
>> +			else
>> +				mask |= EPOLLIN | EPOLLRDNORM;
>> +		}
>> +	}
>> +
>> +	if (req & (EPOLLOUT | EPOLLWRNORM)) {
>> +		if (ctx->current_client)
>> +			poll_wait(file, &ctx->wait, wait);
>> +
>> +		if (!ctx->current_client)
>> +			mask |= EPOLLOUT | EPOLLWRNORM;
>> +	}
>> +
>> +	return mask;
>> +}
>> +
>> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma)
>> +{
>> +	struct aspeed_xdma_client *client = vma->vm_private_data;
>> +
>> +	gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt,
>> +		      client->size);
> What assurance do we have that a DMA isn't in progress? With non-blocking
> IO we could easily start one then close the file descriptor, which would cause
> havoc if the physical range is reused by a subsequent mapping.


Good point.


>
>> +
>> +	client->virt = NULL;
>> +	client->phys = 0;
>> +	client->size = 0;
>> +}
>> +
>> +static const struct vm_operations_struct aspeed_xdma_vm_ops = {
>> +	.close =	aspeed_xdma_vma_close,
>> +};
>> +
>> +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma)
>> +{
>> +	int rc;
>> +	struct aspeed_xdma_client *client = file->private_data;
>> +	struct aspeed_xdma *ctx = client->ctx;
>> +
>> +	/* restrict file to one mapping */
>> +	if (client->size)
>> +		return -ENOMEM;
>> +
>> +	client->size = vma->vm_end - vma->vm_start;
>> +	client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size,
>> +					  &client->phys);
>> +	if (!client->virt) {
>> +		client->phys = 0;
>> +		client->size = 0;
>> +		return -ENOMEM;
>> +	}
>> +
>> +	vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT;
>> +	vma->vm_ops = &aspeed_xdma_vm_ops;
>> +	vma->vm_private_data = client;
>> +	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>> +
>> +	rc = io_remap_pfn_range(vma, vma->vm_start, client->phys >> PAGE_SHIFT,
>> +				client->size, vma->vm_page_prot);
>> +	if (rc) {
>> +		gen_pool_free(ctx->vga_pool, (unsigned long)client->virt,
>> +			      client->size);
>> +
>> +		client->virt = NULL;
>> +		client->phys = 0;
>> +		client->size = 0;
>> +		return rc;
>> +	}
>> +
>> +	dev_dbg(ctx->dev, "mmap: v[%08lx] to p[%08x], s[%08x]\n",
>> +		vma->vm_start, (u32)client->phys, client->size);
>> +
>> +	return 0;
>> +}
>> +
>> +static int aspeed_xdma_open(struct inode *inode, struct file *file)
>> +{
>> +	struct miscdevice *misc = file->private_data;
>> +	struct aspeed_xdma *ctx = container_of(misc, struct aspeed_xdma, misc);
>> +	struct aspeed_xdma_client *client = kzalloc(sizeof(*client),
>> +						    GFP_KERNEL);
>> +
>> +	if (!client)
>> +		return -ENOMEM;
>> +
>> +	client->ctx = ctx;
>> +	file->private_data = client;
>> +	return 0;
>> +}
>> +
>> +static int aspeed_xdma_release(struct inode *inode, struct file *file)
>> +{
>> +	struct aspeed_xdma_client *client = file->private_data;
>> +
>> +	if (client->ctx->current_client == client)
>> +		client->ctx->current_client = NULL;
> Shouldn't we also cancel the DMA op? This seems like a DoS risk: set up
> a non-blocking, large downstream transfer then close the client. Also risks
> scribbling on memory we no-longer own given we don't cancel/wait for
> completion in vm close callback?


Right, better wait for completion. There's no way to cancel a transfer.


>
>> +
>> +	kfree(client);
>> +	return 0;
>> +}
>> +
>> +static const struct file_operations aspeed_xdma_fops = {
>> +	.owner			= THIS_MODULE,
>> +	.write			= aspeed_xdma_write,
>> +	.poll			= aspeed_xdma_poll,
>> +	.mmap			= aspeed_xdma_mmap,
>> +	.open			= aspeed_xdma_open,
>> +	.release		= aspeed_xdma_release,
>> +};
>> +
>>   static int aspeed_xdma_probe(struct platform_device *pdev)
>>   {
>>   	int irq;
>> @@ -539,6 +745,7 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
>>   	ctx->chip = md;
>>   	ctx->dev = dev;
>>   	platform_set_drvdata(pdev, ctx);
>> +	mutex_init(&ctx->file_lock);
>>   	mutex_init(&ctx->start_lock);
>>   	INIT_WORK(&ctx->reset_work, aspeed_xdma_reset_work);
>>   	spin_lock_init(&ctx->reset_lock);
>> @@ -678,6 +885,22 @@ static int aspeed_xdma_probe(struct platform_device *pdev)
>>   
>>   	aspeed_xdma_init_eng(ctx);
>>   
>> +	ctx->misc.minor = MISC_DYNAMIC_MINOR;
>> +	ctx->misc.fops = &aspeed_xdma_fops;
>> +	ctx->misc.name = "aspeed-xdma";
>> +	ctx->misc.parent = dev;
>> +	rc = misc_register(&ctx->misc);
>> +	if (rc) {
>> +		dev_err(dev, "Failed to register xdma miscdevice.\n");
>> +
>> +		gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
>> +			      XDMA_CMDQ_SIZE);
>> +
>> +		reset_control_assert(ctx->reset);
>> +		clk_disable_unprepare(ctx->clock);
>> +		return rc;
>> +	}
>> +
>>   	/*
>>   	 * This interrupt could fire immediately so only request it once the
>>   	 * engine and driver are initialized.
>> @@ -699,6 +922,7 @@ static int aspeed_xdma_remove(struct platform_device *pdev)
>>   {
>>   	struct aspeed_xdma *ctx = platform_get_drvdata(pdev);
>>   
>> +	misc_deregister(&ctx->misc);
>>   	gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt,
>>   		      XDMA_CMDQ_SIZE);
>>   
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-11 20:39     ` Eddie James
@ 2019-12-12  4:52       ` Andrew Jeffery
  2019-12-12 19:16         ` Eddie James
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-12  4:52 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Thu, 12 Dec 2019, at 07:09, Eddie James wrote:
> 
> On 12/10/19 9:47 PM, Andrew Jeffery wrote:
> >
> > On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> >> +
> >> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx,
> >> +						struct aspeed_xdma_op *op,
> >> +						u32 bmc_addr)
> >> +{
> >> +	u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC |
> >> +		(op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0);
> >> +	unsigned int line_size;
> >> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
> >> +	unsigned int line_no = 1;
> >> +	unsigned int pitch = 1;
> >> +	struct aspeed_xdma_cmd *ncmd =
> >> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
> >> +
> >> +	if ((op->host_addr + op->len) & 0xffffffff00000000ULL)
> > Do we know that this won't wrap?
> 
> 
> No, but I assume it would be a bad transfer anyway at that point?

But what happens as a consequence? We would have a 64 bit address
but wouldn't enable 64bit addressing, so presumably the hardware
would only use the bottom 32 bits of the address?

Things could get weird yes?

Or is there some failure that would occur before we trigger the transfer?
Is that what you're depending on?

> >> +
> >> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error)
> >> +{
> >> +	if (ctx->current_client) {
> >> +		ctx->current_client->error = error;
> >> +		ctx->current_client->in_progress = false;
> >> +		ctx->current_client = NULL;
> > You need to take start_lock before writing these members to ensure the
> > writes are not reordered across acquisition of start_lock in
> > aspeed_xdma_start() above, unless there's some other guarantee of that?
> 
> 
> Unless we get spurious interrupts (as in, the xdma interrupt fires with 
> no transfer started, and somehow the correct status bits are set), it's 
> not possible to execute this at the same time as aspeed_xdma_start(). So 
> I did not try and lock here. Do you think it's worth locking for that 
> situation?
> 

Why is it worth not locking? How is it correct? To answer that way we invoke
all kinds of reasoning about multi-processing (interrupt handled on one core
while aspeed_xdma_start() is executing on another), value visibility and
instruction reordering (though as it happens the 2400, 2500 and 2600 are all
in-order). We'll trip ourselves up if there is eventually a switch to out-of-order
execution where the writes might be reordered and delayed until after
start_lock has been acquired in aspeed_xdma_start() by a subseqent transfer.
This line of reasoning is brittle exploitation of properties of the currently used
cores for no benefit. Finishing the DMA op isn't a hot path where you might
want to take some of these risks for performance, so we have almost zero
care for lock contention but we must always be concerned about correctness.

We avoid invoking all of those questions by acquiring the lock.

> >> +
> >> +	ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL);
> >> +	if (!ctx->vga_pool) {
> >> +		dev_err(dev, "Failed to setup genalloc pool.\n");
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	rc = of_property_read_u32_array(dev->of_node, "vga-mem", vgamem, 2);
> > As mentioned, this could be any reserved memory range. Also can't we get it as
> > a resource rather than parsing a u32 array? Not sure if there's an advantage
> > but it feels like a better representation.
> 
> 
> That doesn't work unfortunately because the VGA memory is not mapped and 
> the reserved memory subsystem fails to find it.

Fair enough.

> >> +
> >> +	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
> >> +			   ctx->chip->sdmc_remap);
> > I disagree with doing this. As mentioned on the bindings it should be up to
> > the platform integrator to ensure that this is configured appropriately.
> 
> 
> Probably so, but then how does one actually configure that elsewhere? Do 
> you mean add code to the edac driver (and add support for the ast2600) 
> to read some dts properties to set it?

Right. That's where I was going. I don't expect you to do that as part of this
patch series, but if you could separate this code out into separate patches
(dealing with the sdmc property in the devicetree binding as well) we can at
least concentrate on getting the core XDMA driver in and work out how to
move forward with configuring the memory controller later.

> >> +/*
> >> + * aspeed_xdma_direction
> >> + *
> >> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC
> >> + *
> >> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host
> >> + *
> >> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine
> >> + */
> >> +enum aspeed_xdma_direction {
> >> +	ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0,
> >> +	ASPEED_XDMA_DIRECTION_UPSTREAM,
> >> +	ASPEED_XDMA_DIRECTION_RESET,
> > I still think having a reset action as part of the direction is a bit funky. Can you maybe
> > put that in a separate patch so we can debate it later?
> 
> 
> I can, but I'm fairly convinced this is the cleanest way to add the 
> reset functionality.
> 

Right, but if you separate it out you'll get my reviewed-by on the core XDMA
patches much quicker :) You can convince me about it in slow-time.

Cheers,

Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 07/12] drivers/soc: xdma: Add user interface
  2019-12-11 20:43     ` Eddie James
@ 2019-12-12  5:02       ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-12  5:02 UTC (permalink / raw)
  To: Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Thu, 12 Dec 2019, at 07:13, Eddie James wrote:
> 
> On 12/10/19 9:48 PM, Andrew Jeffery wrote:
> >
> > On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> >> +		}
> >> +	} else {
> >> +		mutex_lock(&ctx->file_lock);
> >> +
> >> +		rc = wait_event_interruptible(ctx->wait, !ctx->current_client);
> >> +		if (rc) {
> >> +			mutex_unlock(&ctx->file_lock);
> >> +			return -EINTR;
> >> +		}
> >> +	}
> >> +
> >> +	aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client);
> >> +
> >> +	mutex_unlock(&ctx->file_lock);
> > You've used file_lock here to protect aspeed_xdma_start() but start_lock
> > above to protect aspeed_xdma_reset(), so it seems one client can disrupt
> > another by resetting the engine while a DMA is in progress?
> 
> 
> That's correct, that is the intention. In case the transfer hangs, 
> another client needs to be able to reset and clear up a blocking transfer.

Ah. Can we log a noisy warning about resetting the engine while a DMA is
in progress then? I'd hate to debug this otherwise. The more information
we can log about both clients the better.

We still need to make sure we're using consistent locking, even if we wind
up with nested locking.

> >> +
> >> +static int aspeed_xdma_release(struct inode *inode, struct file *file)
> >> +{
> >> +	struct aspeed_xdma_client *client = file->private_data;
> >> +
> >> +	if (client->ctx->current_client == client)
> >> +		client->ctx->current_client = NULL;
> > Shouldn't we also cancel the DMA op? This seems like a DoS risk: set up
> > a non-blocking, large downstream transfer then close the client. Also risks
> > scribbling on memory we no-longer own given we don't cancel/wait for
> > completion in vm close callback?
> 
> 
> Right, better wait for completion. There's no way to cancel a transfer.

Right, that's handy context.

Cheers,

Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-12  4:52       ` Andrew Jeffery
@ 2019-12-12 19:16         ` Eddie James
  2019-12-13  1:42           ` Andrew Jeffery
  0 siblings, 1 reply; 27+ messages in thread
From: Eddie James @ 2019-12-12 19:16 UTC (permalink / raw)
  To: Andrew Jeffery, Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley


On 12/11/19 10:52 PM, Andrew Jeffery wrote:
>
> On Thu, 12 Dec 2019, at 07:09, Eddie James wrote:
>> On 12/10/19 9:47 PM, Andrew Jeffery wrote:
>>> On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
>>>> +
>>>> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx,
>>>> +						struct aspeed_xdma_op *op,
>>>> +						u32 bmc_addr)
>>>> +{
>>>> +	u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC |
>>>> +		(op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0);
>>>> +	unsigned int line_size;
>>>> +	unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS;
>>>> +	unsigned int line_no = 1;
>>>> +	unsigned int pitch = 1;
>>>> +	struct aspeed_xdma_cmd *ncmd =
>>>> +		&(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]);
>>>> +
>>>> +	if ((op->host_addr + op->len) & 0xffffffff00000000ULL)
>>> Do we know that this won't wrap?
>>
>> No, but I assume it would be a bad transfer anyway at that point?
> But what happens as a consequence? We would have a 64 bit address
> but wouldn't enable 64bit addressing, so presumably the hardware
> would only use the bottom 32 bits of the address?
>
> Things could get weird yes?
>
> Or is there some failure that would occur before we trigger the transfer?
> Is that what you're depending on?


OK, I'll handle it then.


>
>>>> +
>>>> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error)
>>>> +{
>>>> +	if (ctx->current_client) {
>>>> +		ctx->current_client->error = error;
>>>> +		ctx->current_client->in_progress = false;
>>>> +		ctx->current_client = NULL;
>>> You need to take start_lock before writing these members to ensure the
>>> writes are not reordered across acquisition of start_lock in
>>> aspeed_xdma_start() above, unless there's some other guarantee of that?
>>
>> Unless we get spurious interrupts (as in, the xdma interrupt fires with
>> no transfer started, and somehow the correct status bits are set), it's
>> not possible to execute this at the same time as aspeed_xdma_start(). So
>> I did not try and lock here. Do you think it's worth locking for that
>> situation?
>>
> Why is it worth not locking? How is it correct? To answer that way we invoke
> all kinds of reasoning about multi-processing (interrupt handled on one core
> while aspeed_xdma_start() is executing on another), value visibility and
> instruction reordering (though as it happens the 2400, 2500 and 2600 are all
> in-order). We'll trip ourselves up if there is eventually a switch to out-of-order
> execution where the writes might be reordered and delayed until after
> start_lock has been acquired in aspeed_xdma_start() by a subseqent transfer.
> This line of reasoning is brittle exploitation of properties of the currently used
> cores for no benefit. Finishing the DMA op isn't a hot path where you might
> want to take some of these risks for performance, so we have almost zero
> care for lock contention but we must always be concerned about correctness.
>
> We avoid invoking all of those questions by acquiring the lock.


OK, I'll refactor to lock it.


>
>>>> +
>>>> +	ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL);
>>>> +	if (!ctx->vga_pool) {
>>>> +		dev_err(dev, "Failed to setup genalloc pool.\n");
>>>> +		return -ENOMEM;
>>>> +	}
>>>> +
>>>> +	rc = of_property_read_u32_array(dev->of_node, "vga-mem", vgamem, 2);
>>> As mentioned, this could be any reserved memory range. Also can't we get it as
>>> a resource rather than parsing a u32 array? Not sure if there's an advantage
>>> but it feels like a better representation.
>>
>> That doesn't work unfortunately because the VGA memory is not mapped and
>> the reserved memory subsystem fails to find it.
> Fair enough.
>
>>>> +
>>>> +	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
>>>> +			   ctx->chip->sdmc_remap);
>>> I disagree with doing this. As mentioned on the bindings it should be up to
>>> the platform integrator to ensure that this is configured appropriately.
>>
>> Probably so, but then how does one actually configure that elsewhere? Do
>> you mean add code to the edac driver (and add support for the ast2600)
>> to read some dts properties to set it?
> Right. That's where I was going. I don't expect you to do that as part of this
> patch series, but if you could separate this code out into separate patches
> (dealing with the sdmc property in the devicetree binding as well) we can at
> least concentrate on getting the core XDMA driver in and work out how to
> move forward with configuring the memory controller later.


Yea... my concern is that then we end up with a driver upstream that 
doesn't actually work. Same concern with the reset thing you mentioned 
below.


Thanks,

Eddie


>
>>>> +/*
>>>> + * aspeed_xdma_direction
>>>> + *
>>>> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC
>>>> + *
>>>> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host
>>>> + *
>>>> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine
>>>> + */
>>>> +enum aspeed_xdma_direction {
>>>> +	ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0,
>>>> +	ASPEED_XDMA_DIRECTION_UPSTREAM,
>>>> +	ASPEED_XDMA_DIRECTION_RESET,
>>> I still think having a reset action as part of the direction is a bit funky. Can you maybe
>>> put that in a separate patch so we can debate it later?
>>
>> I can, but I'm fairly convinced this is the cleanest way to add the
>> reset functionality.
>>
> Right, but if you separate it out you'll get my reviewed-by on the core XDMA
> patches much quicker :) You can convince me about it in slow-time
>
> Cheers,
>
> Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver
  2019-12-12 19:16         ` Eddie James
@ 2019-12-13  1:42           ` Andrew Jeffery
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Jeffery @ 2019-12-13  1:42 UTC (permalink / raw)
  To: Eddie James, Eddie James, linux-kernel
  Cc: devicetree, Jason Cooper, linux-aspeed, Marc Zyngier,
	Rob Herring, tglx, mark.rutland, Joel Stanley



On Fri, 13 Dec 2019, at 05:46, Eddie James wrote:
> 
> On 12/11/19 10:52 PM, Andrew Jeffery wrote:
> >
> > On Thu, 12 Dec 2019, at 07:09, Eddie James wrote:
> >> On 12/10/19 9:47 PM, Andrew Jeffery wrote:
> >>> On Fri, 6 Dec 2019, at 03:45, Eddie James wrote:
> >>>> +
> >>>> +	regmap_update_bits(sdmc, SDMC_REMAP, ctx->chip->sdmc_remap,
> >>>> +			   ctx->chip->sdmc_remap);
> >>> I disagree with doing this. As mentioned on the bindings it should be up to
> >>> the platform integrator to ensure that this is configured appropriately.
> >>
> >> Probably so, but then how does one actually configure that elsewhere? Do
> >> you mean add code to the edac driver (and add support for the ast2600)
> >> to read some dts properties to set it?
> > Right. That's where I was going. I don't expect you to do that as part of this
> > patch series, but if you could separate this code out into separate patches
> > (dealing with the sdmc property in the devicetree binding as well) we can at
> > least concentrate on getting the core XDMA driver in and work out how to
> > move forward with configuring the memory controller later.
> 
> 
> Yea... my concern is that then we end up with a driver upstream that 
> doesn't actually work. Same concern with the reset thing you mentioned 
> below.

How would it not work? It would just be up to the platform integrator to make
sure the stars align right? If they do then there should be no problem. Whacking
the memory controller here is done out of convenience.

We can still carry the separate patches adding this and the reset behaviour in
e.g. the openbmc kernel tree if necessary.

Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-12-13  1:41 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-05 17:15 [PATCH v2 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James
2019-12-05 17:15 ` [PATCH v2 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James
2019-12-10 23:32   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 02/12] irqchip: " Eddie James
2019-12-11  0:32   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James
2019-12-11  0:36   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James
2019-12-11  0:42   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James
2019-12-11  0:40   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James
2019-12-11  3:47   ` Andrew Jeffery
2019-12-11 20:39     ` Eddie James
2019-12-12  4:52       ` Andrew Jeffery
2019-12-12 19:16         ` Eddie James
2019-12-13  1:42           ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 07/12] drivers/soc: xdma: Add user interface Eddie James
2019-12-11  3:48   ` Andrew Jeffery
2019-12-11 20:43     ` Eddie James
2019-12-12  5:02       ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James
2019-12-05 17:15 ` [PATCH v2 09/12] ARM: dts: aspeed: ast2600: " Eddie James
2019-12-11  3:48   ` Andrew Jeffery
2019-12-05 17:15 ` [PATCH v2 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James
2019-12-05 17:15 ` [PATCH v2 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James
2019-12-05 17:15 ` [PATCH v2 12/12] ARM: dts: aspeed: tacoma: " Eddie James

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).