* [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers @ 2019-11-08 20:18 Eddie James 2019-11-08 20:18 ` [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James ` (11 more replies) 0 siblings, 12 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree This series first adds a driver to control the interrupt controller provided by the System Control Unit (SCU) on the AST2500 and AST2600 SOCs. The interrupts made available are necessary for the control of the XDMA engine embedded in the same Aspeed SOCs. This series then adds a driver to control the XDMA engine. This driver was previously sent to the list without support for the AST2600, and has been refactored significantly to enable that support. The XDMA engine performs automatic DMA operations between the Aspeed SOC (acting as a BMC) and a host processor. Eddie James (12): dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller irqchip: Add Aspeed SCU interrupt controller ARM: dts: aspeed: ast2500: Add SCU interrupt controller ARM: dts: aspeed: ast2600: Add SCU interrupt controllers dt-bindings: soc: Add Aspeed XDMA Engine drivers/soc: Add Aspeed XDMA Engine Driver drivers/soc: xdma: Add user interface ARM: dts: aspeed: ast2500: Add XDMA Engine ARM: dts: aspeed: ast2600: Add XDMA Engine ARM: dts: aspeed: witherspoon: Enable XDMA Engine ARM: dts: aspeed: rainier: Enable XDMA engine ARM: dts: aspeed: tacoma: Enable XDMA engine .../interrupt-controller/aspeed,ast2xxx-scu-ic.txt | 26 + .../devicetree/bindings/soc/aspeed/xdma.txt | 24 + MAINTAINERS | 16 + arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts | 4 + arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts | 4 + arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts | 4 + arch/arm/boot/dts/aspeed-g5.dtsi | 21 +- arch/arm/boot/dts/aspeed-g6.dtsi | 29 + drivers/irqchip/Makefile | 2 +- drivers/irqchip/irq-aspeed-scu-ic.c | 233 +++++ drivers/soc/aspeed/Kconfig | 8 + drivers/soc/aspeed/Makefile | 1 + drivers/soc/aspeed/aspeed-xdma.c | 1079 ++++++++++++++++++++ .../interrupt-controller/aspeed-scu-ic.h | 23 + include/uapi/linux/aspeed-xdma.h | 49 + 15 files changed, 1521 insertions(+), 2 deletions(-) create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt create mode 100644 drivers/irqchip/irq-aspeed-scu-ic.c create mode 100644 drivers/soc/aspeed/aspeed-xdma.c create mode 100644 include/dt-bindings/interrupt-controller/aspeed-scu-ic.h create mode 100644 include/uapi/linux/aspeed-xdma.h -- 1.8.3.1 ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-12 0:54 ` Rob Herring 2019-11-08 20:18 ` [PATCH 02/12] irqchip: " Eddie James ` (10 subsequent siblings) 11 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Document the Aspeed SCU interrupt controller and add an include file for the interrupts it provides. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- .../interrupt-controller/aspeed,ast2xxx-scu-ic.txt | 26 ++++++++++++++++++++++ MAINTAINERS | 7 ++++++ .../interrupt-controller/aspeed-scu-ic.h | 23 +++++++++++++++++++ 3 files changed, 56 insertions(+) create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt create mode 100644 include/dt-bindings/interrupt-controller/aspeed-scu-ic.h diff --git a/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt new file mode 100644 index 0000000..baa3780 --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt @@ -0,0 +1,26 @@ +Aspeed AST25XX and AST26XX SCU Interrupt Controller + +Required Properties: + - #interrupt-cells : must be 1 + - compatible : must be "aspeed,ast2500-scu-ic", + "aspeed,ast2600-scu-ic0" or + "aspeed,ast2600-scu-ic1" + - reg : address and size of the SCU register + associated with the controller + - interrupts : interrupt from the parent controller + - interrupt-controller : indicates that the controlle receives and + fires new interrupts for child busses + +Example: + + syscon@1e6e2000 { + ranges = <0 0x1e6e2000 0x1a8>; + + scu_ic: interrupt-controller@18 { + #interrupt-cells = <1>; + compatible = "aspeed,ast2500-scu-ic"; + reg = <0x18 0x04>; + interrupts = <21>; + interrupt-controller; + }; + }; diff --git a/MAINTAINERS b/MAINTAINERS index c4c532c..1cb976f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2675,6 +2675,13 @@ S: Maintained F: drivers/pinctrl/aspeed/ F: Documentation/devicetree/bindings/pinctrl/aspeed,* +ASPEED SCU INTERRUPT CONTROLLER DRIVER +M: Eddie James <eajames@linux.ibm.com> +L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) +S: Maintained +F: Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt +F: include/dt-bindings/interrupt-controller/aspeed-scu-ic.h + ASPEED VIDEO ENGINE DRIVER M: Eddie James <eajames@linux.ibm.com> L: linux-media@vger.kernel.org diff --git a/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h b/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h new file mode 100644 index 0000000..f315d5a --- /dev/null +++ b/include/dt-bindings/interrupt-controller/aspeed-scu-ic.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ + +#ifndef _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_ +#define _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_ + +#define ASPEED_SCU_IC_VGA_CURSOR_CHANGE 0 +#define ASPEED_SCU_IC_VGA_SCRATCH_REG_CHANGE 1 + +#define ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI 2 +#define ASPEED_AST2500_SCU_IC_PCIE_RESET_HI_TO_LO 3 +#define ASPEED_AST2500_SCU_IC_LPC_RESET_LO_TO_HI 4 +#define ASPEED_AST2500_SCU_IC_LPC_RESET_HI_TO_LO 5 +#define ASPEED_AST2500_SCU_IC_ISSUE_MSI 6 + +#define ASPEED_AST2600_SCU_IC0_PCIE_PERST_LO_TO_HI 2 +#define ASPEED_AST2600_SCU_IC0_PCIE_PERST_HI_TO_LO 3 +#define ASPEED_AST2600_SCU_IC0_PCIE_RCRST_LO_TO_HI 4 +#define ASPEED_AST2600_SCU_IC0_PCIE_RCRST_HI_TO_LO 5 + +#define ASPEED_AST2600_SCU_IC1_LPC_RESET_LO_TO_HI 0 +#define ASPEED_AST2600_SCU_IC1_LPC_RESET_HI_TO_LO 1 + +#endif /* _DT_BINDINGS_INTERRUPT_CONTROLLER_ASPEED_SCU_IC_H_ */ -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller 2019-11-08 20:18 ` [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James @ 2019-11-12 0:54 ` Rob Herring 0 siblings, 0 replies; 24+ messages in thread From: Rob Herring @ 2019-11-12 0:54 UTC (permalink / raw) To: Eddie James Cc: linux-kernel, linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree On Fri, 8 Nov 2019 14:18:22 -0600, Eddie James wrote: > Document the Aspeed SCU interrupt controller and add an include file > for the interrupts it provides. > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > .../interrupt-controller/aspeed,ast2xxx-scu-ic.txt | 26 ++++++++++++++++++++++ > MAINTAINERS | 7 ++++++ > .../interrupt-controller/aspeed-scu-ic.h | 23 +++++++++++++++++++ > 3 files changed, 56 insertions(+) > create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt > create mode 100644 include/dt-bindings/interrupt-controller/aspeed-scu-ic.h > Please add Acked-by/Reviewed-by tags when posting new versions. However, there's no need to repost patches *only* to add the tags. The upstream maintainer will do that for acks received on the version they apply. If a tag was not added on purpose, please state why and what changed. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 02/12] irqchip: Add Aspeed SCU interrupt controller 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James 2019-11-08 20:18 ` [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-10 14:53 ` Marc Zyngier 2019-11-08 20:18 ` [PATCH 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James ` (9 subsequent siblings) 11 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree The Aspeed SOCs provide some interrupts through the System Control Unit registers. Add an interrupt controller that provides these interrupts to the system. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- MAINTAINERS | 1 + drivers/irqchip/Makefile | 2 +- drivers/irqchip/irq-aspeed-scu-ic.c | 233 ++++++++++++++++++++++++++++++++++++ 3 files changed, 235 insertions(+), 1 deletion(-) create mode 100644 drivers/irqchip/irq-aspeed-scu-ic.c diff --git a/MAINTAINERS b/MAINTAINERS index 1cb976f..c46181c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2680,6 +2680,7 @@ M: Eddie James <eajames@linux.ibm.com> L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) S: Maintained F: Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2xxx-scu-ic.txt +F: drivers/irqchip/irq-aspeed-scu-ic.c F: include/dt-bindings/interrupt-controller/aspeed-scu-ic.h ASPEED VIDEO ENGINE DRIVER diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile index cc7c439..fce6b1d 100644 --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -86,7 +86,7 @@ obj-$(CONFIG_MVEBU_PIC) += irq-mvebu-pic.o obj-$(CONFIG_MVEBU_SEI) += irq-mvebu-sei.o obj-$(CONFIG_LS_SCFG_MSI) += irq-ls-scfg-msi.o obj-$(CONFIG_EZNPS_GIC) += irq-eznps.o -obj-$(CONFIG_ARCH_ASPEED) += irq-aspeed-vic.o irq-aspeed-i2c-ic.o +obj-$(CONFIG_ARCH_ASPEED) += irq-aspeed-vic.o irq-aspeed-i2c-ic.o irq-aspeed-scu-ic.o obj-$(CONFIG_STM32_EXTI) += irq-stm32-exti.o obj-$(CONFIG_QCOM_IRQ_COMBINER) += qcom-irq-combiner.o obj-$(CONFIG_IRQ_UNIPHIER_AIDET) += irq-uniphier-aidet.o diff --git a/drivers/irqchip/irq-aspeed-scu-ic.c b/drivers/irqchip/irq-aspeed-scu-ic.c new file mode 100644 index 0000000..45a83ec --- /dev/null +++ b/drivers/irqchip/irq-aspeed-scu-ic.c @@ -0,0 +1,233 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Aspeed AST24XX, AST25XX, and AST26XX SCU Interrupt Controller + * Copyright 2019 IBM Corporation + * + * Eddie James <eajames@linux.ibm.com> + */ + +#include <linux/bitops.h> +#include <linux/irq.h> +#include <linux/irqchip.h> +#include <linux/irqchip/chained_irq.h> +#include <linux/irqdomain.h> +#include <linux/of_address.h> +#include <linux/of_irq.h> +#include <linux/io.h> +#include <linux/spinlock.h> + +#define ASPEED_SCU_IC_SHIFT 0 +#define ASPEED_SCU_IC_ENABLE GENMASK(6, ASPEED_SCU_IC_SHIFT) +#define ASPEED_SCU_IC_NUM_IRQS 7 +#define ASPEED_SCU_IC_STATUS_SHIFT 16 + +#define ASPEED_AST2600_SCU_IC0_SHIFT 0 +#define ASPEED_AST2600_SCU_IC0_ENABLE \ + GENMASK(5, ASPEED_AST2600_SCU_IC0_SHIFT) +#define ASPEED_AST2600_SCU_IC0_NUM_IRQS 6 + +#define ASPEED_AST2600_SCU_IC1_SHIFT 4 +#define ASPEED_AST2600_SCU_IC1_ENABLE \ + GENMASK(5, ASPEED_AST2600_SCU_IC1_SHIFT) +#define ASPEED_AST2600_SCU_IC1_NUM_IRQS 2 + +struct aspeed_scu_ic { + unsigned long irq_enable; + unsigned long irq_shift; + unsigned int num_irqs; + void __iomem *reg; + struct irq_domain *irq_domain; + spinlock_t lock; +}; + +static void aspeed_scu_ic_irq_handler(struct irq_desc *desc) +{ + unsigned int irq; + unsigned long bit; + unsigned long enabled; + unsigned long max; + unsigned long status; + struct aspeed_scu_ic *scu_ic = irq_desc_get_handler_data(desc); + struct irq_chip *chip = irq_desc_get_chip(desc); + + chained_irq_enter(chip, desc); + + /* + * The SCU IC has just one register to control its operation and read + * status. The interrupt enable bits occupy the lower 16 bits of the + * register, while the interrupt status bits occupy the upper 16 bits. + * The status bit for a given interrupt is always 16 bits shifted from + * the enable bit for the same interrupt. + * Therefore, perform the IRQ operations in the enable bit space by + * shifting the status down to get the mapping and then back up to + * clear the bit. + */ + status = readl_relaxed(scu_ic->reg); + enabled = status & scu_ic->irq_enable; + status = (status >> ASPEED_SCU_IC_STATUS_SHIFT) & enabled; + + bit = scu_ic->irq_shift; + max = scu_ic->num_irqs + bit; + + for_each_set_bit_from(bit, &status, max) { + irq = irq_find_mapping(scu_ic->irq_domain, + bit - scu_ic->irq_shift); + generic_handle_irq(irq); + + writel_relaxed(enabled | BIT(bit + ASPEED_SCU_IC_STATUS_SHIFT), + scu_ic->reg); + } + + chained_irq_exit(chip, desc); +} + +static void aspeed_scu_ic_irq_mask(struct irq_data *data) +{ + struct aspeed_scu_ic *scu_ic = irq_data_get_irq_chip_data(data); + unsigned long bit = BIT(data->hwirq + scu_ic->irq_shift); + unsigned long flags; + unsigned long reg; + + spin_lock_irqsave(&scu_ic->lock, flags); + + reg = readl_relaxed(scu_ic->reg); + writel_relaxed((reg & ~bit) & scu_ic->irq_enable, scu_ic->reg); + + spin_unlock_irqrestore(&scu_ic->lock, flags); +} + +static void aspeed_scu_ic_irq_unmask(struct irq_data *data) +{ + struct aspeed_scu_ic *scu_ic = irq_data_get_irq_chip_data(data); + unsigned long bit = BIT(data->hwirq + scu_ic->irq_shift); + unsigned long flags; + unsigned long reg; + + spin_lock_irqsave(&scu_ic->lock, flags); + + reg = readl_relaxed(scu_ic->reg); + writel_relaxed((reg | bit) & scu_ic->irq_enable, scu_ic->reg); + + spin_unlock_irqrestore(&scu_ic->lock, flags); +} + +static int aspeed_scu_ic_irq_set_affinity(struct irq_data *data, + const struct cpumask *dest, + bool force) +{ + return -EINVAL; +} + +static struct irq_chip aspeed_scu_ic_chip = { + .name = "aspeed-scu-ic", + .irq_mask = aspeed_scu_ic_irq_mask, + .irq_unmask = aspeed_scu_ic_irq_unmask, + .irq_set_affinity = aspeed_scu_ic_irq_set_affinity, +}; + +static int aspeed_scu_ic_map(struct irq_domain *domain, unsigned int irq, + irq_hw_number_t hwirq) +{ + irq_set_chip_and_handler(irq, &aspeed_scu_ic_chip, handle_simple_irq); + irq_set_chip_data(irq, domain->host_data); + + return 0; +} + +static const struct irq_domain_ops aspeed_scu_ic_domain_ops = { + .map = aspeed_scu_ic_map, +}; + +static int aspeed_scu_ic_of_init_common(struct aspeed_scu_ic *scu_ic, + struct device_node *node) +{ + int irq; + int rc = 0; + + scu_ic->reg = of_iomap(node, 0); + if (!scu_ic->reg) { + rc = -ENOMEM; + goto err_free; + } + + irq = irq_of_parse_and_map(node, 0); + if (irq < 0) { + rc = irq; + goto err_iounmap; + } + + spin_lock_init(&scu_ic->lock); + + scu_ic->irq_domain = irq_domain_add_linear(node, scu_ic->num_irqs, + &aspeed_scu_ic_domain_ops, + scu_ic); + if (!scu_ic->irq_domain) { + rc = -ENOMEM; + goto err_iounmap; + } + + irq_set_chained_handler_and_data(irq, aspeed_scu_ic_irq_handler, + scu_ic); + + return 0; + +err_iounmap: + iounmap(scu_ic->reg); + +err_free: + kfree(scu_ic); + + return rc; +} + +static int __init aspeed_scu_ic_of_init(struct device_node *node, + struct device_node *parent) +{ + struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL); + + if (!scu_ic) + return -ENOMEM; + + scu_ic->irq_enable = ASPEED_SCU_IC_ENABLE; + scu_ic->irq_shift = ASPEED_SCU_IC_SHIFT; + scu_ic->num_irqs = ASPEED_SCU_IC_NUM_IRQS; + + return aspeed_scu_ic_of_init_common(scu_ic, node); +} + +static int __init aspeed_ast2600_scu_ic0_of_init(struct device_node *node, + struct device_node *parent) +{ + struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL); + + if (!scu_ic) + return -ENOMEM; + + scu_ic->irq_enable = ASPEED_AST2600_SCU_IC0_ENABLE; + scu_ic->irq_shift = ASPEED_AST2600_SCU_IC0_SHIFT; + scu_ic->num_irqs = ASPEED_AST2600_SCU_IC0_NUM_IRQS; + + return aspeed_scu_ic_of_init_common(scu_ic, node); +} + +static int __init aspeed_ast2600_scu_ic1_of_init(struct device_node *node, + struct device_node *parent) +{ + struct aspeed_scu_ic *scu_ic = kzalloc(sizeof(*scu_ic), GFP_KERNEL); + + if (!scu_ic) + return -ENOMEM; + + scu_ic->irq_enable = ASPEED_AST2600_SCU_IC1_ENABLE; + scu_ic->irq_shift = ASPEED_AST2600_SCU_IC1_SHIFT; + scu_ic->num_irqs = ASPEED_AST2600_SCU_IC1_NUM_IRQS; + + return aspeed_scu_ic_of_init_common(scu_ic, node); +} + +IRQCHIP_DECLARE(ast2400_scu_ic, "aspeed,ast2400-scu-ic", aspeed_scu_ic_of_init); +IRQCHIP_DECLARE(ast2500_scu_ic, "aspeed,ast2500-scu-ic", aspeed_scu_ic_of_init); +IRQCHIP_DECLARE(ast2600_scu_ic0, "aspeed,ast2600-scu-ic0", + aspeed_ast2600_scu_ic0_of_init); +IRQCHIP_DECLARE(ast2600_scu_ic1, "aspeed,ast2600-scu-ic1", + aspeed_ast2600_scu_ic1_of_init); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 02/12] irqchip: Add Aspeed SCU interrupt controller 2019-11-08 20:18 ` [PATCH 02/12] irqchip: " Eddie James @ 2019-11-10 14:53 ` Marc Zyngier 0 siblings, 0 replies; 24+ messages in thread From: Marc Zyngier @ 2019-11-10 14:53 UTC (permalink / raw) To: Eddie James Cc: linux-kernel, linux-aspeed, andrew, joel, jason, tglx, robh+dt, mark.rutland, devicetree On Fri, 8 Nov 2019 14:18:23 -0600 Eddie James <eajames@linux.ibm.com> wrote: Hi Eddie, > The Aspeed SOCs provide some interrupts through the System Control > Unit registers. Add an interrupt controller that provides these > interrupts to the system. > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > MAINTAINERS | 1 + > drivers/irqchip/Makefile | 2 +- > drivers/irqchip/irq-aspeed-scu-ic.c | 233 ++++++++++++++++++++++++++++++++++++ > 3 files changed, 235 insertions(+), 1 deletion(-) > create mode 100644 drivers/irqchip/irq-aspeed-scu-ic.c [...] > +static int aspeed_scu_ic_map(struct irq_domain *domain, unsigned int irq, > + irq_hw_number_t hwirq) > +{ > + irq_set_chip_and_handler(irq, &aspeed_scu_ic_chip, handle_simple_irq); handle_simple_irq is usually wrong, and works badly with threaded interrupts. I suggest you'd change it to handle_level_irq, which probably matches the behaviour of the controller. Otherwise, this looks good. Thanks, M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 03/12] ARM: dts: aspeed: ast2500: Add SCU interrupt controller 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James 2019-11-08 20:18 ` [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James 2019-11-08 20:18 ` [PATCH 02/12] irqchip: " Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James ` (8 subsequent siblings) 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Add a node for the interrupt controller provided by the SCU. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-g5.dtsi | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi index a259c63..4579c78 100644 --- a/arch/arm/boot/dts/aspeed-g5.dtsi +++ b/arch/arm/boot/dts/aspeed-g5.dtsi @@ -216,8 +216,9 @@ syscon: syscon@1e6e2000 { compatible = "aspeed,ast2500-scu", "syscon", "simple-mfd"; reg = <0x1e6e2000 0x1a8>; + ranges = <0 0x1e6e2000 0x1a8>; #address-cells = <1>; - #size-cells = <0>; + #size-cells = <1>; #clock-cells = <1>; #reset-cells = <1>; @@ -231,6 +232,14 @@ compatible = "aspeed,ast2500-p2a-ctrl"; status = "disabled"; }; + + scu_ic: interrupt-controller@18 { + #interrupt-cells = <1>; + compatible = "aspeed,ast2500-scu-ic"; + reg = <0x18 0x04>; + interrupts = <21>; + interrupt-controller; + }; }; rng: hwrng@1e6e2078 { -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (2 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James ` (7 subsequent siblings) 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Add nodes for the interrupt controllers provided by the SCU. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-g6.dtsi | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi index 5f6142d..65ea2b2 100644 --- a/arch/arm/boot/dts/aspeed-g6.dtsi +++ b/arch/arm/boot/dts/aspeed-g6.dtsi @@ -288,6 +288,24 @@ compatible = "aspeed,ast2600-smpmem"; reg = <0x180 0x40>; }; + + scu_ic0: interrupt-controller@0 { + #interrupt-cells = <1>; + compatible = "aspeed,ast2600-scu-ic0"; + reg = <0x560 0x4>; + interrupt-parent = <&gic>; + interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>; + interrupt-controller; + }; + + scu_ic1: interrupt-controller@1 { + #interrupt-cells = <1>; + compatible = "aspeed,ast2600-scu-ic1"; + reg = <0x570 0x4>; + interrupt-parent = <&gic>; + interrupts = <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>; + interrupt-controller; + }; }; rng: hwrng@1e6e2524 { -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (3 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-14 19:00 ` Rob Herring 2019-11-08 20:18 ` [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James ` (6 subsequent siblings) 11 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Document the bindings for the Aspeed AST25XX and AST26XX XDMA engine. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- .../devicetree/bindings/soc/aspeed/xdma.txt | 24 ++++++++++++++++++++++ MAINTAINERS | 6 ++++++ 2 files changed, 30 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt diff --git a/Documentation/devicetree/bindings/soc/aspeed/xdma.txt b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt new file mode 100644 index 0000000..b6053e0 --- /dev/null +++ b/Documentation/devicetree/bindings/soc/aspeed/xdma.txt @@ -0,0 +1,24 @@ +Aspeed AST25XX and AST26XX XDMA Engine + +The XDMA Engine embedded in the AST2500 and AST2600 SOCs can perform automatic +DMA operations over PCI between the SOC (acting as a BMC) and a host processor. + +Required properties: + - compatible : must be "aspeed,ast2500-xdma" or + "aspeed,ast2600-xdma" + - reg : contains the address and size of the memory region + associated with the XDMA engine registers + - resets : reset specifier for the syscon reset associated with + the XDMA engine + - interrupts-extended : two interrupt nodes; the first specifies the global + interrupt for the XDMA engine and the second + specifies the PCI-E reset or PERST interrupt. + +Example: + + xdma@1e6e7000 { + compatible = "aspeed,ast2500-xdma"; + reg = <0x1e6e7000 0x100>; + resets = <&syscon ASPEED_RESET_XDMA>; + interrupts-extended = <&vic 6>, <&scu_ic ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI>; + }; diff --git a/MAINTAINERS b/MAINTAINERS index c46181c..540bd45 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2691,6 +2691,12 @@ S: Maintained F: drivers/media/platform/aspeed-video.c F: Documentation/devicetree/bindings/media/aspeed-video.txt +ASPEED XDMA ENGINE DRIVER +M: Eddie James <eajames@linux.ibm.com> +L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) +S: Maintained +F: Documentation/devicetree/bindings/soc/aspeed/xdma.txt + ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS M: Corentin Chary <corentin.chary@gmail.com> L: acpi4asus-user@lists.sourceforge.net -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine 2019-11-08 20:18 ` [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James @ 2019-11-14 19:00 ` Rob Herring 0 siblings, 0 replies; 24+ messages in thread From: Rob Herring @ 2019-11-14 19:00 UTC (permalink / raw) To: Eddie James Cc: linux-kernel, linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree On Fri, 8 Nov 2019 14:18:26 -0600, Eddie James wrote: > Document the bindings for the Aspeed AST25XX and AST26XX XDMA engine. > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > .../devicetree/bindings/soc/aspeed/xdma.txt | 24 ++++++++++++++++++++++ > MAINTAINERS | 6 ++++++ > 2 files changed, 30 insertions(+) > create mode 100644 Documentation/devicetree/bindings/soc/aspeed/xdma.txt > Reviewed-by: Rob Herring <robh@kernel.org> ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (4 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-24 23:32 ` Andrew Jeffery 2019-11-08 20:18 ` [PATCH 07/12] drivers/soc: xdma: Add user interface Eddie James ` (5 subsequent siblings) 11 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI DMA operations between the SOC (acting as a BMC) and a host processor in a server. This commit adds a driver to control the XDMA engine and adds functions to initialize the hardware and memory and start DMA operations. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- MAINTAINERS | 2 + drivers/soc/aspeed/Kconfig | 8 + drivers/soc/aspeed/Makefile | 1 + drivers/soc/aspeed/aspeed-xdma.c | 856 +++++++++++++++++++++++++++++++++++++++ include/uapi/linux/aspeed-xdma.h | 49 +++ 5 files changed, 916 insertions(+) create mode 100644 drivers/soc/aspeed/aspeed-xdma.c create mode 100644 include/uapi/linux/aspeed-xdma.h diff --git a/MAINTAINERS b/MAINTAINERS index 540bd45..7eea32e4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2696,6 +2696,8 @@ M: Eddie James <eajames@linux.ibm.com> L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) S: Maintained F: Documentation/devicetree/bindings/soc/aspeed/xdma.txt +F: drivers/soc/aspeed/aspeed-xdma.c +F: include/uapi/linux/aspeed-xdma.h ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS M: Corentin Chary <corentin.chary@gmail.com> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig index 323e177..2a6c16f 100644 --- a/drivers/soc/aspeed/Kconfig +++ b/drivers/soc/aspeed/Kconfig @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL ioctl()s, the driver also provides an interface for userspace mappings to a pre-defined region. +config ASPEED_XDMA + tristate "Aspeed XDMA Engine Driver" + depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA + help + Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX + SOCs. The XDMA engine can perform automatic PCI DMA operations + between the AST2XXX (acting as a BMC) and a host processor. + endmenu diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile index b64be47..977b046 100644 --- a/drivers/soc/aspeed/Makefile +++ b/drivers/soc/aspeed/Makefile @@ -2,3 +2,4 @@ obj-$(CONFIG_ASPEED_LPC_CTRL) += aspeed-lpc-ctrl.o obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o obj-$(CONFIG_ASPEED_P2A_CTRL) += aspeed-p2a-ctrl.o +obj-$(CONFIG_ASPEED_XDMA) += aspeed-xdma.o diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c new file mode 100644 index 0000000..99041a6 --- /dev/null +++ b/drivers/soc/aspeed/aspeed-xdma.c @@ -0,0 +1,856 @@ +// SPDX-License-Identifier: GPL-2.0+ +// Copyright IBM Corp 2019 + +#include <linux/aspeed-xdma.h> +#include <linux/bitfield.h> +#include <linux/clk.h> +#include <linux/delay.h> +#include <linux/device.h> +#include <linux/dma-mapping.h> +#include <linux/fs.h> +#include <linux/genalloc.h> +#include <linux/interrupt.h> +#include <linux/jiffies.h> +#include <linux/mfd/syscon.h> +#include <linux/miscdevice.h> +#include <linux/module.h> +#include <linux/mutex.h> +#include <linux/of_device.h> +#include <linux/platform_device.h> +#include <linux/poll.h> +#include <linux/regmap.h> +#include <linux/reset.h> +#include <linux/slab.h> +#include <linux/spinlock.h> +#include <linux/string.h> +#include <linux/uaccess.h> +#include <linux/wait.h> +#include <linux/workqueue.h> + +#define DEVICE_NAME "aspeed-xdma" + +#define SCU_AST2500_STRAP 0x070 +#define SCU_AST2500_STRAP_VGA_MEM GENMASK(3, 2) +#define SCU_AST2600_STRAP 0x500 +#define SCU_AST2600_STRAP_VGA_MEM GENMASK(14, 13) + +#define SCU_AST2500_PCIE_CONF 0x180 +#define SCU_AST2600_PCIE_CONF 0xc20 +#define SCU_PCIE_CONF_VGA_EN BIT(0) +#define SCU_PCIE_CONF_VGA_EN_MMIO BIT(1) +#define SCU_PCIE_CONF_VGA_EN_LPC BIT(2) +#define SCU_PCIE_CONF_VGA_EN_MSI BIT(3) +#define SCU_PCIE_CONF_VGA_EN_MCTP BIT(4) +#define SCU_PCIE_CONF_VGA_EN_IRQ BIT(5) +#define SCU_PCIE_CONF_VGA_EN_DMA BIT(6) +#define SCU_PCIE_CONF_BMC_EN BIT(8) +#define SCU_PCIE_CONF_BMC_EN_MMIO BIT(9) +#define SCU_PCIE_CONF_BMC_EN_MSI BIT(11) +#define SCU_PCIE_CONF_BMC_EN_MCTP BIT(12) +#define SCU_PCIE_CONF_BMC_EN_IRQ BIT(13) +#define SCU_PCIE_CONF_BMC_EN_DMA BIT(14) + +#define SCU_AST2500_BMC_CLASS_REV 0x19c +#define SCU_AST2600_BMC_CLASS_REV 0xc4c +#define SCU_BMC_CLASS_REV_XDMA 0xff000001 + +#define SDMC_BASE 0x1e6e0000 +#define SDMC_CONF 0x004 +#define SDMC_CONF_MEM GENMASK(1, 0) +#define SDMC_REMAP 0x008 +#define SDMC_AST2500_REMAP_MAGIC (BIT(16) | BIT(17)) +#define SDMC_AST2600_REMAP_MAGIC BIT(18) + +#define XDMA_CMDQ_SIZE PAGE_SIZE +#define XDMA_NUM_CMDS \ + (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) + +/* Aspeed specification requires 10ms after switching the reset line */ +#define XDMA_RESET_TIME_MS 10 + +#define XDMA_DS_PCIE_REQ_SIZE_128 0 +#define XDMA_DS_PCIE_REQ_SIZE_256 1 +#define XDMA_DS_PCIE_REQ_SIZE_512 2 +#define XDMA_DS_PCIE_REQ_SIZE_1K 3 +#define XDMA_DS_PCIE_REQ_SIZE_2K 4 +#define XDMA_DS_PCIE_REQ_SIZE_4K 5 + +#define XDMA_CMD_AST2500_PITCH_SHIFT 3 +#define XDMA_CMD_AST2500_PITCH_BMC GENMASK_ULL(62, 51) +#define XDMA_CMD_AST2500_PITCH_HOST GENMASK_ULL(46, 35) +#define XDMA_CMD_AST2500_PITCH_UPSTREAM BIT_ULL(31) +#define XDMA_CMD_AST2500_PITCH_ADDR GENMASK_ULL(29, 4) +#define XDMA_CMD_AST2500_PITCH_ID BIT_ULL(0) +#define XDMA_CMD_AST2500_CMD_IRQ_EN BIT_ULL(31) +#define XDMA_CMD_AST2500_CMD_LINE_NO GENMASK_ULL(27, 16) +#define XDMA_CMD_AST2500_CMD_IRQ_BMC BIT_ULL(15) +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT 4 +#define XDMA_CMD_AST2500_CMD_LINE_SIZE \ + GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT) +#define XDMA_CMD_AST2500_CMD_ID BIT_ULL(1) + +#define XDMA_CMD_AST2600_PITCH_BMC GENMASK_ULL(62, 48) +#define XDMA_CMD_AST2600_PITCH_HOST GENMASK_ULL(46, 32) +#define XDMA_CMD_AST2600_PITCH_ADDR GENMASK_ULL(30, 0) +#define XDMA_CMD_AST2600_CMD_64_EN BIT_ULL(40) +#define XDMA_CMD_AST2600_CMD_IRQ_BMC BIT_ULL(37) +#define XDMA_CMD_AST2600_CMD_IRQ_HOST BIT_ULL(36) +#define XDMA_CMD_AST2600_CMD_UPSTREAM BIT_ULL(32) +#define XDMA_CMD_AST2600_CMD_LINE_NO GENMASK_ULL(27, 16) +#define XDMA_CMD_AST2600_CMD_LINE_SIZE GENMASK_ULL(14, 0) +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE GENMASK_ULL(14, 12) + +#define XDMA_AST2500_QUEUE_ENTRY_SIZE 4 +#define XDMA_AST2500_HOST_CMDQ_ADDR0 0x00 +#define XDMA_AST2500_HOST_CMDQ_ENDP 0x04 +#define XDMA_AST2500_HOST_CMDQ_WRITEP 0x08 +#define XDMA_AST2500_HOST_CMDQ_READP 0x0c +#define XDMA_AST2500_BMC_CMDQ_ADDR 0x10 +#define XDMA_AST2500_BMC_CMDQ_ENDP 0x14 +#define XDMA_AST2500_BMC_CMDQ_WRITEP 0x18 +#define XDMA_AST2500_BMC_CMDQ_READP 0x1c +#define XDMA_BMC_CMDQ_READP_MAGIC 0xee882266 +#define XDMA_AST2500_CTRL 0x20 +#define XDMA_AST2500_CTRL_US_COMP BIT(4) +#define XDMA_AST2500_CTRL_DS_COMP BIT(5) +#define XDMA_AST2500_CTRL_DS_DIRTY BIT(6) +#define XDMA_AST2500_CTRL_DS_SIZE GENMASK(19, 17) +#define XDMA_AST2500_CTRL_DS_TIMEOUT BIT(28) +#define XDMA_AST2500_CTRL_DS_CHECK_ID BIT(29) +#define XDMA_AST2500_STATUS 0x24 +#define XDMA_AST2500_STATUS_US_COMP BIT(4) +#define XDMA_AST2500_STATUS_DS_COMP BIT(5) +#define XDMA_AST2500_STATUS_DS_DIRTY BIT(6) +#define XDMA_AST2500_INPRG_DS_CMD1 0x38 +#define XDMA_AST2500_INPRG_DS_CMD2 0x3c +#define XDMA_AST2500_INPRG_US_CMD00 0x40 +#define XDMA_AST2500_INPRG_US_CMD01 0x44 +#define XDMA_AST2500_INPRG_US_CMD10 0x48 +#define XDMA_AST2500_INPRG_US_CMD11 0x4c +#define XDMA_AST2500_INPRG_US_CMD20 0x50 +#define XDMA_AST2500_INPRG_US_CMD21 0x54 +#define XDMA_AST2500_HOST_CMDQ_ADDR1 0x60 +#define XDMA_AST2500_VGA_CMDQ_ADDR0 0x64 +#define XDMA_AST2500_VGA_CMDQ_ENDP 0x68 +#define XDMA_AST2500_VGA_CMDQ_WRITEP 0x6c +#define XDMA_AST2500_VGA_CMDQ_READP 0x70 +#define XDMA_AST2500_VGA_CMD_STATUS 0x74 +#define XDMA_AST2500_VGA_CMDQ_ADDR1 0x78 + +#define XDMA_AST2600_QUEUE_ENTRY_SIZE 2 +#define XDMA_AST2600_HOST_CMDQ_ADDR0 0x00 +#define XDMA_AST2600_HOST_CMDQ_ADDR1 0x04 +#define XDMA_AST2600_HOST_CMDQ_ENDP 0x08 +#define XDMA_AST2600_HOST_CMDQ_WRITEP 0x0c +#define XDMA_AST2600_HOST_CMDQ_READP 0x10 +#define XDMA_AST2600_BMC_CMDQ_ADDR 0x14 +#define XDMA_AST2600_BMC_CMDQ_ENDP 0x18 +#define XDMA_AST2600_BMC_CMDQ_WRITEP 0x1c +#define XDMA_AST2600_BMC_CMDQ_READP 0x20 +#define XDMA_AST2600_VGA_CMDQ_ADDR0 0x24 +#define XDMA_AST2600_VGA_CMDQ_ADDR1 0x28 +#define XDMA_AST2600_VGA_CMDQ_ENDP 0x2c +#define XDMA_AST2600_VGA_CMDQ_WRITEP 0x30 +#define XDMA_AST2600_VGA_CMDQ_READP 0x34 +#define XDMA_AST2600_CTRL 0x38 +#define XDMA_AST2600_CTRL_US_COMP BIT(16) +#define XDMA_AST2600_CTRL_DS_COMP BIT(17) +#define XDMA_AST2600_CTRL_DS_DIRTY BIT(18) +#define XDMA_AST2600_CTRL_DS_SIZE GENMASK(22, 20) +#define XDMA_AST2600_STATUS 0x3c +#define XDMA_AST2600_STATUS_US_COMP BIT(16) +#define XDMA_AST2600_STATUS_DS_COMP BIT(17) +#define XDMA_AST2600_STATUS_DS_DIRTY BIT(18) +#define XDMA_AST2600_INPRG_DS_CMD00 0x40 +#define XDMA_AST2600_INPRG_DS_CMD01 0x44 +#define XDMA_AST2600_INPRG_DS_CMD10 0x48 +#define XDMA_AST2600_INPRG_DS_CMD11 0x4c +#define XDMA_AST2600_INPRG_DS_CMD20 0x50 +#define XDMA_AST2600_INPRG_DS_CMD21 0x54 +#define XDMA_AST2600_INPRG_US_CMD00 0x60 +#define XDMA_AST2600_INPRG_US_CMD01 0x64 +#define XDMA_AST2600_INPRG_US_CMD10 0x68 +#define XDMA_AST2600_INPRG_US_CMD11 0x6c +#define XDMA_AST2600_INPRG_US_CMD20 0x70 +#define XDMA_AST2600_INPRG_US_CMD21 0x74 + +enum versions { xdma_ast2500, xdma_ast2600 }; + +struct aspeed_xdma_cmd { + u64 host_addr; + u64 pitch; + u64 cmd; + u64 reserved; +}; + +struct aspeed_xdma_regs { + u8 bmc_cmdq_addr; + u8 bmc_cmdq_endp; + u8 bmc_cmdq_writep; + u8 bmc_cmdq_readp; + u8 control; + u8 status; +}; + +struct aspeed_xdma_status_bits { + u32 us_comp; + u32 ds_comp; + u32 ds_dirty; +}; + +struct aspeed_xdma_client; + +struct aspeed_xdma { + enum versions version; + u32 control; + unsigned int queue_entry_size; + struct aspeed_xdma_regs regs; + struct aspeed_xdma_status_bits status_bits; + + struct device *dev; + void __iomem *base; + struct clk *clock; + struct reset_control *reset; + + bool in_progress; + bool in_reset; + bool upstream; + unsigned int cmd_idx; + struct mutex start_lock; + struct delayed_work reset_work; + spinlock_t client_lock; + spinlock_t reset_lock; + wait_queue_head_t wait; + struct aspeed_xdma_client *current_client; + + u32 vga_phys; + u32 vga_size; + void *cmdq; + void __iomem *vga_virt; + dma_addr_t cmdq_vga_phys; + void *cmdq_vga_virt; + struct gen_pool *vga_pool; +}; + +struct aspeed_xdma_client { + struct aspeed_xdma *ctx; + + bool error; + bool in_progress; + void *virt; + dma_addr_t phys; + u32 size; +}; + +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg) +{ + u32 v = readl(ctx->base + reg); + + dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v); + return v; +} + +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val) +{ + writel(val, ctx->base + reg); + dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, readl(ctx->base + reg)); +} + +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx) +{ + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_endp, + ctx->queue_entry_size * XDMA_NUM_CMDS); + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_readp, + XDMA_BMC_CMDQ_READP_MAGIC); + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, 0); + aspeed_xdma_writel(ctx, ctx->regs.control, ctx->control); + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_addr, ctx->cmdq_vga_phys); + + ctx->cmd_idx = 0; + ctx->in_progress = false; +} + +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx, + struct aspeed_xdma_op *op, + u32 bmc_addr) +{ + u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC | + XDMA_CMD_AST2500_CMD_ID; + u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) | + XDMA_CMD_AST2500_PITCH_ID; + unsigned int line_size; + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; + unsigned int line_no = 1; + unsigned int pitch = 1; + struct aspeed_xdma_cmd *ncmd = + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); + + dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n", + op->direction ? "upstream" : "downstream", bmc_addr, op->len, + (u32)op->host_addr); + + if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) { + unsigned int rem; + unsigned int total; + + line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE; + total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no; + rem = (op->len - total) >> + XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; + line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE; + pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT; + line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; + + if (rem) { + u32 rbmc = bmc_addr + total; + struct aspeed_xdma_cmd *rcmd = + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); + + rcmd->host_addr = op->host_addr + (u64)total; + rcmd->pitch = cmd_pitch | + ((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) | + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) | + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1); + rcmd->cmd = cmd | + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) | + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, + rem); + + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, + 16, 1, rcmd, sizeof(*rcmd), true); + + cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN | + XDMA_CMD_AST2500_CMD_IRQ_BMC); + + nidx = (nidx + 1) % XDMA_NUM_CMDS; + } + } else { + line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; + } + + ncmd->host_addr = op->host_addr; + ncmd->pitch = cmd_pitch | + ((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) | + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) | + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch); + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) | + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size); + + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, + sizeof(*ncmd), true); + + return nidx; +} + +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx, + struct aspeed_xdma_op *op, + u32 bmc_addr) +{ + u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC | + (op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0); + unsigned int line_size; + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; + unsigned int line_no = 1; + unsigned int pitch = 1; + struct aspeed_xdma_cmd *ncmd = + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); + + if ((op->host_addr + op->len) & 0xffffffff00000000ULL) + cmd |= XDMA_CMD_AST2600_CMD_64_EN; + + dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] " + "host[%016llx]\n", op->direction ? "upstream" : "downstream", + bmc_addr, op->len, op->host_addr); + + if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) { + unsigned int rem; + unsigned int total; + + line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; + total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no; + rem = op->len - total; + line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; + pitch = line_size; + + if (rem) { + u32 rbmc = bmc_addr + total; + struct aspeed_xdma_cmd *rcmd = + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); + + rcmd->host_addr = op->host_addr + (u64)total; + rcmd->pitch = + ((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) | + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) | + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1); + rcmd->cmd = cmd | + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) | + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, + rem); + + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, + 16, 1, rcmd, sizeof(*rcmd), true); + + cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC; + + nidx = (nidx + 1) % XDMA_NUM_CMDS; + } + } else { + line_size = op->len; + } + + ncmd->host_addr = op->host_addr; + ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) | + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) | + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch); + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) | + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size); + + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, + sizeof(*ncmd), true); + + return nidx; +} + +static void aspeed_xdma_start(struct aspeed_xdma *ctx, + struct aspeed_xdma_op *op, u32 bmc_addr, + struct aspeed_xdma_client *client) +{ + unsigned int nidx; + + mutex_lock(&ctx->start_lock); + + switch (ctx->version) { + default: + case xdma_ast2500: + nidx = aspeed_xdma_ast2500_set_cmd(ctx, op, bmc_addr); + break; + case xdma_ast2600: + nidx = aspeed_xdma_ast2600_set_cmd(ctx, op, bmc_addr); + break; + } + + memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE); + + client->in_progress = true; + ctx->current_client = client; + + ctx->in_progress = true; + ctx->upstream = op->direction ? true : false; + + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, + nidx * ctx->queue_entry_size); + + ctx->cmd_idx = nidx; + + mutex_unlock(&ctx->start_lock); +} + +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error) +{ + unsigned long flags; + + /* + * Lock to make sure simultaneous reset and transfer complete don't + * leave the client with the wrong error state. + */ + spin_lock_irqsave(&ctx->client_lock, flags); + + if (ctx->current_client) { + ctx->current_client->error = error; + ctx->current_client->in_progress = false; + ctx->current_client = NULL; + } + + spin_unlock_irqrestore(&ctx->client_lock, flags); + + ctx->in_progress = false; + wake_up_interruptible_all(&ctx->wait); +} + +static irqreturn_t aspeed_xdma_irq(int irq, void *arg) +{ + struct aspeed_xdma *ctx = arg; + u32 status = aspeed_xdma_readl(ctx, ctx->regs.status); + + if (status & ctx->status_bits.ds_dirty) { + aspeed_xdma_done(ctx, true); + } else { + if (status & ctx->status_bits.us_comp) { + if (ctx->upstream) + aspeed_xdma_done(ctx, false); + } + + if (status & ctx->status_bits.ds_comp) { + if (!ctx->upstream) + aspeed_xdma_done(ctx, false); + } + } + + aspeed_xdma_writel(ctx, ctx->regs.status, status); + + return IRQ_HANDLED; +} + +static void aspeed_xdma_reset_finish(struct aspeed_xdma *ctx) +{ + unsigned long flags; + + spin_lock_irqsave(&ctx->reset_lock, flags); + + ctx->in_reset = false; + reset_control_deassert(ctx->reset); + + spin_unlock_irqrestore(&ctx->reset_lock, flags); + + msleep(XDMA_RESET_TIME_MS); + + aspeed_xdma_init_eng(ctx); + aspeed_xdma_done(ctx, true); +} + +static bool aspeed_xdma_reset_start(struct aspeed_xdma *ctx) +{ + bool rc = true; + unsigned long flags; + + spin_lock_irqsave(&ctx->reset_lock, flags); + + if (ctx->in_reset) { + rc = false; + } else { + ctx->in_reset = true; + reset_control_assert(ctx->reset); + } + + spin_unlock_irqrestore(&ctx->reset_lock, flags); + + return rc; +} + +static void aspeed_xdma_reset_work(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct aspeed_xdma *ctx = container_of(dwork, struct aspeed_xdma, + reset_work); + + /* + * Lock to make sure operations aren't started while the engine is + * in an undefined state coming out of reset and waiting to init. + */ + mutex_lock(&ctx->start_lock); + + aspeed_xdma_reset_finish(ctx); + + mutex_unlock(&ctx->start_lock); +} + +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) +{ + struct aspeed_xdma *ctx = arg; + + dev_dbg(ctx->dev, "pcie reset\n"); + + if (aspeed_xdma_reset_start(ctx)) + schedule_delayed_work(&ctx->reset_work, + msecs_to_jiffies(XDMA_RESET_TIME_MS)); + + return IRQ_HANDLED; +} + +static int aspeed_xdma_init(struct aspeed_xdma *ctx) +{ + int rc; + struct regmap *scu; + u32 conf; + u32 mem_size; + u32 remap; + u32 scu_bmc_class; + u32 scu_pcie_conf; + u32 scu_strap; + u32 sdmc_remap_magic; + u32 strap = 0; + const u32 bmc = SCU_PCIE_CONF_BMC_EN | SCU_PCIE_CONF_BMC_EN_MSI | + SCU_PCIE_CONF_BMC_EN_MCTP | SCU_PCIE_CONF_BMC_EN_IRQ | + SCU_PCIE_CONF_BMC_EN_DMA; + const u32 vga = SCU_PCIE_CONF_VGA_EN | SCU_PCIE_CONF_VGA_EN_MSI | + SCU_PCIE_CONF_VGA_EN_MCTP | SCU_PCIE_CONF_VGA_EN_IRQ | + SCU_PCIE_CONF_VGA_EN_DMA; + u32 mem_sizes[4] = { 0x8000000, 0x10000000, 0x20000000, 0x40000000 }; + const u32 vga_sizes[4] = { 0x800000, 0x1000000, 0x2000000, 0x4000000 }; + void __iomem *sdmc_base = ioremap(SDMC_BASE, 0x100); + + if (!sdmc_base) { + dev_err(ctx->dev, "Failed to ioremap mem controller regs.\n"); + return -ENOMEM; + } + + switch (ctx->version) { + default: + case xdma_ast2500: + scu_bmc_class = SCU_AST2500_BMC_CLASS_REV; + scu_pcie_conf = SCU_AST2500_PCIE_CONF; + scu_strap = SCU_AST2500_STRAP; + sdmc_remap_magic = SDMC_AST2500_REMAP_MAGIC; + + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2500-scu"); + break; + case xdma_ast2600: + scu_bmc_class = SCU_AST2600_BMC_CLASS_REV; + scu_pcie_conf = SCU_AST2600_PCIE_CONF; + scu_strap = SCU_AST2600_STRAP; + sdmc_remap_magic = SDMC_AST2600_REMAP_MAGIC; + + mem_sizes[0] *= 2; + mem_sizes[1] *= 2; + mem_sizes[2] *= 2; + mem_sizes[3] *= 2; + + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2600-scu"); + break; + }; + + if (!scu) { + dev_err(ctx->dev, "Failed to grab SCU regs.\n"); + return -ENOMEM; + } + + /* Set SOC to use the BMC PCIe device and set the device class code */ + regmap_update_bits(scu, scu_pcie_conf, bmc | vga, bmc); + regmap_write(scu, scu_bmc_class, SCU_BMC_CLASS_REV_XDMA); + + /* + * Calculate the VGA memory size and physical address from the SCU and + * memory controller registers. + */ + regmap_read(scu, scu_strap, &strap); + + switch (ctx->version) { + case xdma_ast2500: + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2500_STRAP_VGA_MEM, + strap)]; + break; + case xdma_ast2600: + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2600_STRAP_VGA_MEM, + strap)]; + break; + } + + conf = readl(sdmc_base + SDMC_CONF); + remap = readl(sdmc_base + SDMC_REMAP); + remap |= sdmc_remap_magic; + writel(remap, sdmc_base + SDMC_REMAP); + mem_size = mem_sizes[conf & SDMC_CONF_MEM]; + + iounmap(sdmc_base); + + ctx->vga_phys = (mem_size - ctx->vga_size) + 0x80000000; + + ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL); + if (!ctx->cmdq) { + dev_err(ctx->dev, "Failed to allocate command queue.\n"); + return -ENOMEM; + } + + ctx->vga_virt = ioremap(ctx->vga_phys, ctx->vga_size); + if (!ctx->vga_virt) { + dev_err(ctx->dev, "Failed to ioremap VGA memory.\n"); + return -ENOMEM; + } + + rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt, + ctx->vga_phys, ctx->vga_size, -1); + if (rc) { + dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n"); + iounmap(ctx->vga_virt); + return rc; + } + + ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE, + &ctx->cmdq_vga_phys); + if (!ctx->cmdq_vga_virt) { + dev_err(ctx->dev, "Failed to genalloc cmdq.\n"); + iounmap(ctx->vga_virt); + return -ENOMEM; + } + + dev_dbg(ctx->dev, "VGA mapped at phys[%08x], size[%08x].\n", + ctx->vga_phys, ctx->vga_size); + + return 0; +} + +static int aspeed_xdma_probe(struct platform_device *pdev) +{ + int irq; + int pcie_irq; + int rc; + enum versions vs = xdma_ast2500; + struct device *dev = &pdev->dev; + struct aspeed_xdma *ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); + const void *md = of_device_get_match_data(dev); + + if (!ctx) + return -ENOMEM; + + if (md) + vs = (enum versions)md; + + switch (vs) { + default: + case xdma_ast2500: + ctx->version = xdma_ast2500; + ctx->control = XDMA_AST2500_CTRL_US_COMP | + XDMA_AST2500_CTRL_DS_COMP | + XDMA_AST2500_CTRL_DS_DIRTY | + FIELD_PREP(XDMA_AST2500_CTRL_DS_SIZE, + XDMA_DS_PCIE_REQ_SIZE_256) | + XDMA_AST2500_CTRL_DS_TIMEOUT | + XDMA_AST2500_CTRL_DS_CHECK_ID; + ctx->queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE; + ctx->regs.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR; + ctx->regs.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP; + ctx->regs.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP; + ctx->regs.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP; + ctx->regs.control = XDMA_AST2500_CTRL; + ctx->regs.status = XDMA_AST2500_STATUS; + ctx->status_bits.us_comp = XDMA_AST2500_STATUS_US_COMP; + ctx->status_bits.ds_comp = XDMA_AST2500_STATUS_DS_COMP; + ctx->status_bits.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY; + break; + case xdma_ast2600: + ctx->version = xdma_ast2600; + ctx->control = XDMA_AST2600_CTRL_US_COMP | + XDMA_AST2600_CTRL_DS_COMP | + XDMA_AST2600_CTRL_DS_DIRTY | + FIELD_PREP(XDMA_AST2600_CTRL_DS_SIZE, + XDMA_DS_PCIE_REQ_SIZE_256); + ctx->queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE; + ctx->regs.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR; + ctx->regs.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP; + ctx->regs.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP; + ctx->regs.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP; + ctx->regs.control = XDMA_AST2600_CTRL; + ctx->regs.status = XDMA_AST2600_STATUS; + ctx->status_bits.us_comp = XDMA_AST2600_STATUS_US_COMP; + ctx->status_bits.ds_comp = XDMA_AST2600_STATUS_DS_COMP; + ctx->status_bits.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY; + break; + }; + + ctx->dev = dev; + platform_set_drvdata(pdev, ctx); + mutex_init(&ctx->start_lock); + INIT_DELAYED_WORK(&ctx->reset_work, aspeed_xdma_reset_work); + spin_lock_init(&ctx->client_lock); + spin_lock_init(&ctx->reset_lock); + init_waitqueue_head(&ctx->wait); + + ctx->base = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(ctx->base)) { + dev_err(dev, "Unable to ioremap registers.\n"); + return PTR_ERR(ctx->base); + } + + irq = platform_get_irq(pdev, 0); + if (irq < 0) { + dev_err(dev, "Unable to find IRQ.\n"); + return -ENODEV; + } + + rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED, + DEVICE_NAME, ctx); + if (rc < 0) { + dev_err(dev, "Unable to request IRQ %d.\n", irq); + return rc; + } + + ctx->clock = devm_clk_get(dev, NULL); + if (IS_ERR(ctx->clock)) { + dev_err(dev, "Unable to request clock.\n"); + return PTR_ERR(ctx->clock); + } + + ctx->reset = devm_reset_control_get_exclusive(dev, NULL); + if (IS_ERR(ctx->reset)) { + dev_err(dev, "Unable to request reset control.\n"); + return PTR_ERR(ctx->reset); + } + + ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL); + if (!ctx->vga_pool) { + dev_err(dev, "Unable to setup genalloc pool.\n"); + return -ENOMEM; + } + + clk_prepare_enable(ctx->clock); + msleep(XDMA_RESET_TIME_MS); + + reset_control_deassert(ctx->reset); + msleep(XDMA_RESET_TIME_MS); + + rc = aspeed_xdma_init(ctx); + if (rc) { + reset_control_assert(ctx->reset); + clk_disable_unprepare(ctx->clock); + return rc; + } + + aspeed_xdma_init_eng(ctx); + + /* + * This interrupt could fire immediately so only request it once the + * engine and driver are initialized. + */ + pcie_irq = platform_get_irq(pdev, 1); + if (pcie_irq < 0) { + dev_warn(dev, "Unable to find PCI-E IRQ.\n"); + } else { + rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq, + IRQF_SHARED, DEVICE_NAME, ctx); + if (rc < 0) + dev_warn(dev, "Unable to request PCI-E IRQ %d.\n", rc); + } + + return 0; +} + +static int aspeed_xdma_remove(struct platform_device *pdev) +{ + struct aspeed_xdma *ctx = platform_get_drvdata(pdev); + + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, + XDMA_CMDQ_SIZE); + iounmap(ctx->vga_virt); + + reset_control_assert(ctx->reset); + clk_disable_unprepare(ctx->clock); + + return 0; +} + +static const struct of_device_id aspeed_xdma_match[] = { + { + .compatible = "aspeed,ast2500-xdma", + .data = (void *)xdma_ast2500, + }, + { + .compatible = "aspeed,ast2600-xdma", + .data = (void *)xdma_ast2600, + }, + { }, +}; + +static struct platform_driver aspeed_xdma_driver = { + .probe = aspeed_xdma_probe, + .remove = aspeed_xdma_remove, + .driver = { + .name = DEVICE_NAME, + .of_match_table = aspeed_xdma_match, + }, +}; + +module_platform_driver(aspeed_xdma_driver); + +MODULE_AUTHOR("Eddie James"); +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver"); +MODULE_LICENSE("GPL v2"); diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h new file mode 100644 index 0000000..7f3a031 --- /dev/null +++ b/include/uapi/linux/aspeed-xdma.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* Copyright IBM Corp 2019 */ + +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_ +#define _UAPI_LINUX_ASPEED_XDMA_H_ + +#include <linux/types.h> + +/* + * aspeed_xdma_direction + * + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC + * + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host + * + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine + */ +enum aspeed_xdma_direction { + ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0, + ASPEED_XDMA_DIRECTION_UPSTREAM, + ASPEED_XDMA_DIRECTION_RESET, +}; + +/* + * aspeed_xdma_op + * + * host_addr: the DMA address on the host side, typically configured by PCI + * subsystem + * + * len: the size of the transfer in bytes + * + * direction: an enumerator indicating the direction of the DMA operation; see + * enum aspeed_xdma_direction + * + * bmc_addr: the virtual address to DMA on the BMC side; this parameter is + * unused on current platforms since the XDMA engine is restricted to + * accessing the VGA memory space + * + * reserved: for natural alignment purposes only + */ +struct aspeed_xdma_op { + __u64 host_addr; + __u32 len; + __u32 direction; + __u32 bmc_addr; + __u32 reserved; +}; + +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */ -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver 2019-11-08 20:18 ` [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James @ 2019-11-24 23:32 ` Andrew Jeffery 2019-11-25 0:01 ` Andrew Jeffery 2019-11-25 19:15 ` Eddie James 0 siblings, 2 replies; 24+ messages in thread From: Andrew Jeffery @ 2019-11-24 23:32 UTC (permalink / raw) To: Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: > The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI > DMA operations between the SOC (acting as a BMC) and a host processor > in a server. > > This commit adds a driver to control the XDMA engine and adds functions > to initialize the hardware and memory and start DMA operations. > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > MAINTAINERS | 2 + > drivers/soc/aspeed/Kconfig | 8 + > drivers/soc/aspeed/Makefile | 1 + > drivers/soc/aspeed/aspeed-xdma.c | 856 +++++++++++++++++++++++++++++++++++++++ > include/uapi/linux/aspeed-xdma.h | 49 +++ > 5 files changed, 916 insertions(+) > create mode 100644 drivers/soc/aspeed/aspeed-xdma.c > create mode 100644 include/uapi/linux/aspeed-xdma.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 540bd45..7eea32e4 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -2696,6 +2696,8 @@ M: Eddie James <eajames@linux.ibm.com> > L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) > S: Maintained > F: Documentation/devicetree/bindings/soc/aspeed/xdma.txt > +F: drivers/soc/aspeed/aspeed-xdma.c > +F: include/uapi/linux/aspeed-xdma.h > > ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS > M: Corentin Chary <corentin.chary@gmail.com> > diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig > index 323e177..2a6c16f 100644 > --- a/drivers/soc/aspeed/Kconfig > +++ b/drivers/soc/aspeed/Kconfig > @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL > ioctl()s, the driver also provides an interface for userspace mappings to > a pre-defined region. > > +config ASPEED_XDMA > + tristate "Aspeed XDMA Engine Driver" > + depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA > + help > + Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX > + SOCs. The XDMA engine can perform automatic PCI DMA operations > + between the AST2XXX (acting as a BMC) and a host processor. > + > endmenu > diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile > index b64be47..977b046 100644 > --- a/drivers/soc/aspeed/Makefile > +++ b/drivers/soc/aspeed/Makefile > @@ -2,3 +2,4 @@ > obj-$(CONFIG_ASPEED_LPC_CTRL) += aspeed-lpc-ctrl.o > obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o > obj-$(CONFIG_ASPEED_P2A_CTRL) += aspeed-p2a-ctrl.o > +obj-$(CONFIG_ASPEED_XDMA) += aspeed-xdma.o > diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c > new file mode 100644 > index 0000000..99041a6 > --- /dev/null > +++ b/drivers/soc/aspeed/aspeed-xdma.c > @@ -0,0 +1,856 @@ > +// SPDX-License-Identifier: GPL-2.0+ This should be "GPL-2.0-or-later" (https://spdx.org/licenses/) > +// Copyright IBM Corp 2019 > + > +#include <linux/aspeed-xdma.h> A device-specific header in this include path seems a little strange to me. > +#include <linux/bitfield.h> > +#include <linux/clk.h> > +#include <linux/delay.h> > +#include <linux/device.h> > +#include <linux/dma-mapping.h> > +#include <linux/fs.h> > +#include <linux/genalloc.h> > +#include <linux/interrupt.h> > +#include <linux/jiffies.h> > +#include <linux/mfd/syscon.h> > +#include <linux/miscdevice.h> > +#include <linux/module.h> > +#include <linux/mutex.h> > +#include <linux/of_device.h> > +#include <linux/platform_device.h> > +#include <linux/poll.h> > +#include <linux/regmap.h> > +#include <linux/reset.h> > +#include <linux/slab.h> > +#include <linux/spinlock.h> > +#include <linux/string.h> > +#include <linux/uaccess.h> > +#include <linux/wait.h> > +#include <linux/workqueue.h> > + > +#define DEVICE_NAME "aspeed-xdma" > + > +#define SCU_AST2500_STRAP 0x070 > +#define SCU_AST2500_STRAP_VGA_MEM GENMASK(3, 2) > +#define SCU_AST2600_STRAP 0x500 > +#define SCU_AST2600_STRAP_VGA_MEM GENMASK(14, 13) It could be easier to review if you add support for one SoC at a time. > + > +#define SCU_AST2500_PCIE_CONF 0x180 > +#define SCU_AST2600_PCIE_CONF 0xc20 > +#define SCU_PCIE_CONF_VGA_EN BIT(0) > +#define SCU_PCIE_CONF_VGA_EN_MMIO BIT(1) > +#define SCU_PCIE_CONF_VGA_EN_LPC BIT(2) > +#define SCU_PCIE_CONF_VGA_EN_MSI BIT(3) > +#define SCU_PCIE_CONF_VGA_EN_MCTP BIT(4) > +#define SCU_PCIE_CONF_VGA_EN_IRQ BIT(5) > +#define SCU_PCIE_CONF_VGA_EN_DMA BIT(6) > +#define SCU_PCIE_CONF_BMC_EN BIT(8) > +#define SCU_PCIE_CONF_BMC_EN_MMIO BIT(9) > +#define SCU_PCIE_CONF_BMC_EN_MSI BIT(11) > +#define SCU_PCIE_CONF_BMC_EN_MCTP BIT(12) > +#define SCU_PCIE_CONF_BMC_EN_IRQ BIT(13) > +#define SCU_PCIE_CONF_BMC_EN_DMA BIT(14) > + > +#define SCU_AST2500_BMC_CLASS_REV 0x19c > +#define SCU_AST2600_BMC_CLASS_REV 0xc4c > +#define SCU_BMC_CLASS_REV_XDMA 0xff000001 > + > +#define SDMC_BASE 0x1e6e0000 > +#define SDMC_CONF 0x004 > +#define SDMC_CONF_MEM GENMASK(1, 0) > +#define SDMC_REMAP 0x008 > +#define SDMC_AST2500_REMAP_MAGIC (BIT(16) | BIT(17)) > +#define SDMC_AST2600_REMAP_MAGIC BIT(18) Can we have something more descriptive than "MAGIC"? What these bits are doing is well documented and defined. > + > +#define XDMA_CMDQ_SIZE PAGE_SIZE > +#define XDMA_NUM_CMDS \ > + (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) > + > +/* Aspeed specification requires 10ms after switching the reset line */ > +#define XDMA_RESET_TIME_MS 10 > + > +#define XDMA_DS_PCIE_REQ_SIZE_128 0 > +#define XDMA_DS_PCIE_REQ_SIZE_256 1 > +#define XDMA_DS_PCIE_REQ_SIZE_512 2 > +#define XDMA_DS_PCIE_REQ_SIZE_1K 3 > +#define XDMA_DS_PCIE_REQ_SIZE_2K 4 > +#define XDMA_DS_PCIE_REQ_SIZE_4K 5 > + > +#define XDMA_CMD_AST2500_PITCH_SHIFT 3 > +#define XDMA_CMD_AST2500_PITCH_BMC GENMASK_ULL(62, 51) > +#define XDMA_CMD_AST2500_PITCH_HOST GENMASK_ULL(46, 35) > +#define XDMA_CMD_AST2500_PITCH_UPSTREAM BIT_ULL(31) > +#define XDMA_CMD_AST2500_PITCH_ADDR GENMASK_ULL(29, 4) > +#define XDMA_CMD_AST2500_PITCH_ID BIT_ULL(0) > +#define XDMA_CMD_AST2500_CMD_IRQ_EN BIT_ULL(31) > +#define XDMA_CMD_AST2500_CMD_LINE_NO GENMASK_ULL(27, 16) > +#define XDMA_CMD_AST2500_CMD_IRQ_BMC BIT_ULL(15) > +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT 4 > +#define XDMA_CMD_AST2500_CMD_LINE_SIZE \ > + GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT) > +#define XDMA_CMD_AST2500_CMD_ID BIT_ULL(1) > + > +#define XDMA_CMD_AST2600_PITCH_BMC GENMASK_ULL(62, 48) > +#define XDMA_CMD_AST2600_PITCH_HOST GENMASK_ULL(46, 32) > +#define XDMA_CMD_AST2600_PITCH_ADDR GENMASK_ULL(30, 0) > +#define XDMA_CMD_AST2600_CMD_64_EN BIT_ULL(40) > +#define XDMA_CMD_AST2600_CMD_IRQ_BMC BIT_ULL(37) > +#define XDMA_CMD_AST2600_CMD_IRQ_HOST BIT_ULL(36) > +#define XDMA_CMD_AST2600_CMD_UPSTREAM BIT_ULL(32) > +#define XDMA_CMD_AST2600_CMD_LINE_NO GENMASK_ULL(27, 16) > +#define XDMA_CMD_AST2600_CMD_LINE_SIZE GENMASK_ULL(14, 0) > +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE GENMASK_ULL(14, 12) > + > +#define XDMA_AST2500_QUEUE_ENTRY_SIZE 4 > +#define XDMA_AST2500_HOST_CMDQ_ADDR0 0x00 > +#define XDMA_AST2500_HOST_CMDQ_ENDP 0x04 > +#define XDMA_AST2500_HOST_CMDQ_WRITEP 0x08 > +#define XDMA_AST2500_HOST_CMDQ_READP 0x0c > +#define XDMA_AST2500_BMC_CMDQ_ADDR 0x10 > +#define XDMA_AST2500_BMC_CMDQ_ENDP 0x14 > +#define XDMA_AST2500_BMC_CMDQ_WRITEP 0x18 > +#define XDMA_AST2500_BMC_CMDQ_READP 0x1c > +#define XDMA_BMC_CMDQ_READP_MAGIC 0xee882266 What's this about? Using a macro to abstract a magic number and then using the word "magic" in the name isn't very helpful. I think it should be renamed or documented, or both. > +#define XDMA_AST2500_CTRL 0x20 > +#define XDMA_AST2500_CTRL_US_COMP BIT(4) > +#define XDMA_AST2500_CTRL_DS_COMP BIT(5) > +#define XDMA_AST2500_CTRL_DS_DIRTY BIT(6) > +#define XDMA_AST2500_CTRL_DS_SIZE GENMASK(19, 17) > +#define XDMA_AST2500_CTRL_DS_TIMEOUT BIT(28) > +#define XDMA_AST2500_CTRL_DS_CHECK_ID BIT(29) > +#define XDMA_AST2500_STATUS 0x24 > +#define XDMA_AST2500_STATUS_US_COMP BIT(4) > +#define XDMA_AST2500_STATUS_DS_COMP BIT(5) > +#define XDMA_AST2500_STATUS_DS_DIRTY BIT(6) > +#define XDMA_AST2500_INPRG_DS_CMD1 0x38 > +#define XDMA_AST2500_INPRG_DS_CMD2 0x3c > +#define XDMA_AST2500_INPRG_US_CMD00 0x40 > +#define XDMA_AST2500_INPRG_US_CMD01 0x44 > +#define XDMA_AST2500_INPRG_US_CMD10 0x48 > +#define XDMA_AST2500_INPRG_US_CMD11 0x4c > +#define XDMA_AST2500_INPRG_US_CMD20 0x50 > +#define XDMA_AST2500_INPRG_US_CMD21 0x54 > +#define XDMA_AST2500_HOST_CMDQ_ADDR1 0x60 > +#define XDMA_AST2500_VGA_CMDQ_ADDR0 0x64 > +#define XDMA_AST2500_VGA_CMDQ_ENDP 0x68 > +#define XDMA_AST2500_VGA_CMDQ_WRITEP 0x6c > +#define XDMA_AST2500_VGA_CMDQ_READP 0x70 > +#define XDMA_AST2500_VGA_CMD_STATUS 0x74 > +#define XDMA_AST2500_VGA_CMDQ_ADDR1 0x78 > + > +#define XDMA_AST2600_QUEUE_ENTRY_SIZE 2 > +#define XDMA_AST2600_HOST_CMDQ_ADDR0 0x00 > +#define XDMA_AST2600_HOST_CMDQ_ADDR1 0x04 > +#define XDMA_AST2600_HOST_CMDQ_ENDP 0x08 > +#define XDMA_AST2600_HOST_CMDQ_WRITEP 0x0c > +#define XDMA_AST2600_HOST_CMDQ_READP 0x10 > +#define XDMA_AST2600_BMC_CMDQ_ADDR 0x14 > +#define XDMA_AST2600_BMC_CMDQ_ENDP 0x18 > +#define XDMA_AST2600_BMC_CMDQ_WRITEP 0x1c > +#define XDMA_AST2600_BMC_CMDQ_READP 0x20 > +#define XDMA_AST2600_VGA_CMDQ_ADDR0 0x24 > +#define XDMA_AST2600_VGA_CMDQ_ADDR1 0x28 > +#define XDMA_AST2600_VGA_CMDQ_ENDP 0x2c > +#define XDMA_AST2600_VGA_CMDQ_WRITEP 0x30 > +#define XDMA_AST2600_VGA_CMDQ_READP 0x34 > +#define XDMA_AST2600_CTRL 0x38 > +#define XDMA_AST2600_CTRL_US_COMP BIT(16) > +#define XDMA_AST2600_CTRL_DS_COMP BIT(17) > +#define XDMA_AST2600_CTRL_DS_DIRTY BIT(18) > +#define XDMA_AST2600_CTRL_DS_SIZE GENMASK(22, 20) > +#define XDMA_AST2600_STATUS 0x3c > +#define XDMA_AST2600_STATUS_US_COMP BIT(16) > +#define XDMA_AST2600_STATUS_DS_COMP BIT(17) > +#define XDMA_AST2600_STATUS_DS_DIRTY BIT(18) > +#define XDMA_AST2600_INPRG_DS_CMD00 0x40 > +#define XDMA_AST2600_INPRG_DS_CMD01 0x44 > +#define XDMA_AST2600_INPRG_DS_CMD10 0x48 > +#define XDMA_AST2600_INPRG_DS_CMD11 0x4c > +#define XDMA_AST2600_INPRG_DS_CMD20 0x50 > +#define XDMA_AST2600_INPRG_DS_CMD21 0x54 > +#define XDMA_AST2600_INPRG_US_CMD00 0x60 > +#define XDMA_AST2600_INPRG_US_CMD01 0x64 > +#define XDMA_AST2600_INPRG_US_CMD10 0x68 > +#define XDMA_AST2600_INPRG_US_CMD11 0x6c > +#define XDMA_AST2600_INPRG_US_CMD20 0x70 > +#define XDMA_AST2600_INPRG_US_CMD21 0x74 > + > +enum versions { xdma_ast2500, xdma_ast2600 }; > + > +struct aspeed_xdma_cmd { > + u64 host_addr; > + u64 pitch; > + u64 cmd; > + u64 reserved; > +}; > + > +struct aspeed_xdma_regs { > + u8 bmc_cmdq_addr; > + u8 bmc_cmdq_endp; > + u8 bmc_cmdq_writep; > + u8 bmc_cmdq_readp; > + u8 control; > + u8 status; > +}; > + > +struct aspeed_xdma_status_bits { > + u32 us_comp; > + u32 ds_comp; > + u32 ds_dirty; > +}; > + > +struct aspeed_xdma_client; > + > +struct aspeed_xdma { > + enum versions version; > + u32 control; > + unsigned int queue_entry_size; > + struct aspeed_xdma_regs regs; > + struct aspeed_xdma_status_bits status_bits; > + > + struct device *dev; > + void __iomem *base; > + struct clk *clock; > + struct reset_control *reset; > + > + bool in_progress; > + bool in_reset; > + bool upstream; > + unsigned int cmd_idx; > + struct mutex start_lock; > + struct delayed_work reset_work; > + spinlock_t client_lock; > + spinlock_t reset_lock; What data are each of these locks protecting? Please add documentation about how you intend to use them. Try to group the locks with the data they protect if this is not already the case. > + wait_queue_head_t wait; > + struct aspeed_xdma_client *current_client; > + > + u32 vga_phys; > + u32 vga_size; > + void *cmdq; > + void __iomem *vga_virt; > + dma_addr_t cmdq_vga_phys; > + void *cmdq_vga_virt; > + struct gen_pool *vga_pool; > +}; > + > +struct aspeed_xdma_client { > + struct aspeed_xdma *ctx; > + > + bool error; > + bool in_progress; > + void *virt; > + dma_addr_t phys; > + u32 size; > +}; > + > +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg) > +{ > + u32 v = readl(ctx->base + reg); > + > + dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v); > + return v; > +} > + > +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val) > +{ > + writel(val, ctx->base + reg); > + dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, readl(ctx->base + reg)); That readl() seems a bit paranoid, and is probably evaluated whether or not this dev_dbg() is enabled. What drove this approach? > +} > + > +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx) > +{ > + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_endp, > + ctx->queue_entry_size * XDMA_NUM_CMDS); > + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_readp, > + XDMA_BMC_CMDQ_READP_MAGIC); > + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, 0); > + aspeed_xdma_writel(ctx, ctx->regs.control, ctx->control); > + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_addr, ctx->cmdq_vga_phys); > + > + ctx->cmd_idx = 0; > + ctx->in_progress = false; > +} > + > +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx, > + struct aspeed_xdma_op *op, > + u32 bmc_addr) > +{ > + u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC | > + XDMA_CMD_AST2500_CMD_ID; > + u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) | > + XDMA_CMD_AST2500_PITCH_ID; > + unsigned int line_size; > + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; > + unsigned int line_no = 1; > + unsigned int pitch = 1; > + struct aspeed_xdma_cmd *ncmd = > + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); > + > + dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n", > + op->direction ? "upstream" : "downstream", bmc_addr, op->len, > + (u32)op->host_addr); > + > + if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) { > + unsigned int rem; > + unsigned int total; > + > + line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE; > + total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no; > + rem = (op->len - total) >> > + XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > + line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE; > + pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT; > + line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; Can we clean up the configuration of line_size and pitch here? They are set to constants in this case. > + > + if (rem) { > + u32 rbmc = bmc_addr + total; > + struct aspeed_xdma_cmd *rcmd = > + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); > + > + rcmd->host_addr = op->host_addr + (u64)total; > + rcmd->pitch = cmd_pitch | > + ((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) | > + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) | > + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1); > + rcmd->cmd = cmd | > + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) | > + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, > + rem); > + > + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, > + 16, 1, rcmd, sizeof(*rcmd), true); > + > + cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN | > + XDMA_CMD_AST2500_CMD_IRQ_BMC); > + > + nidx = (nidx + 1) % XDMA_NUM_CMDS; > + } > + } else { > + line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > + } > + > + ncmd->host_addr = op->host_addr; > + ncmd->pitch = cmd_pitch | > + ((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) | > + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) | > + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch); > + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) | > + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size); > + > + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, > + sizeof(*ncmd), true); > + > + return nidx; > +} > + > +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx, > + struct aspeed_xdma_op *op, > + u32 bmc_addr) > +{ > + u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC | > + (op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0); > + unsigned int line_size; > + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; > + unsigned int line_no = 1; > + unsigned int pitch = 1; > + struct aspeed_xdma_cmd *ncmd = > + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); > + > + if ((op->host_addr + op->len) & 0xffffffff00000000ULL) > + cmd |= XDMA_CMD_AST2600_CMD_64_EN; > + > + dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] " > + "host[%016llx]\n", op->direction ? "upstream" : "downstream", > + bmc_addr, op->len, op->host_addr); > + > + if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) { > + unsigned int rem; > + unsigned int total; > + > + line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; > + total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no; > + rem = op->len - total; > + line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; > + pitch = line_size; > + > + if (rem) { > + u32 rbmc = bmc_addr + total; > + struct aspeed_xdma_cmd *rcmd = > + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); > + > + rcmd->host_addr = op->host_addr + (u64)total; > + rcmd->pitch = > + ((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) | > + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) | > + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1); > + rcmd->cmd = cmd | > + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) | > + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, > + rem); > + > + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, > + 16, 1, rcmd, sizeof(*rcmd), true); > + > + cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC; > + > + nidx = (nidx + 1) % XDMA_NUM_CMDS; > + } > + } else { > + line_size = op->len; > + } > + > + ncmd->host_addr = op->host_addr; > + ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) | > + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) | > + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch); > + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) | > + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size); > + > + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, > + sizeof(*ncmd), true); > + > + return nidx; > +} > + > +static void aspeed_xdma_start(struct aspeed_xdma *ctx, > + struct aspeed_xdma_op *op, u32 bmc_addr, > + struct aspeed_xdma_client *client) > +{ > + unsigned int nidx; > + > + mutex_lock(&ctx->start_lock); > + > + switch (ctx->version) { > + default: > + case xdma_ast2500: > + nidx = aspeed_xdma_ast2500_set_cmd(ctx, op, bmc_addr); > + break; > + case xdma_ast2600: > + nidx = aspeed_xdma_ast2600_set_cmd(ctx, op, bmc_addr); > + break; > + } What was the trade-off between the enum and function pointers? Is the enum approach strictly better for some reason? > + > + memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE); > + > + client->in_progress = true; > + ctx->current_client = client; > + > + ctx->in_progress = true; Do we need ctx->in_progress vs just using the NULL state of ctx->current_client? Is it possible for ctx->current_client to be set and not have a transfer in progress? > + ctx->upstream = op->direction ? true : false; > + > + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, > + nidx * ctx->queue_entry_size); > + > + ctx->cmd_idx = nidx; > + > + mutex_unlock(&ctx->start_lock); > +} > + > +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error) > +{ > + unsigned long flags; > + > + /* > + * Lock to make sure simultaneous reset and transfer complete don't > + * leave the client with the wrong error state. > + */ > + spin_lock_irqsave(&ctx->client_lock, flags); > + > + if (ctx->current_client) { You're testing ctx->current_client under ctx->client_lock here, but ctx->current_client is set under ctx->start_lock in aspeed_xdma_start(), which does not take ctx->client_lock. What data is protected by ctx->client_lock? What data is protected by ctx->client_lock? > + ctx->current_client->error = error; > + ctx->current_client->in_progress = false; > + ctx->current_client = NULL; > + } > + > + spin_unlock_irqrestore(&ctx->client_lock, flags); > + > + ctx->in_progress = false; You set ctx->in_progress under ctx->start_lock in aspeed_xdma_start() but you do not take ctx->start_lock when setting it here. What data is ctx->start_lock protecting? > + wake_up_interruptible_all(&ctx->wait); > +} > + > +static irqreturn_t aspeed_xdma_irq(int irq, void *arg) > +{ > + struct aspeed_xdma *ctx = arg; > + u32 status = aspeed_xdma_readl(ctx, ctx->regs.status); > + > + if (status & ctx->status_bits.ds_dirty) { > + aspeed_xdma_done(ctx, true); > + } else { > + if (status & ctx->status_bits.us_comp) { > + if (ctx->upstream) > + aspeed_xdma_done(ctx, false); > + } > + > + if (status & ctx->status_bits.ds_comp) { > + if (!ctx->upstream) > + aspeed_xdma_done(ctx, false); > + } > + } > + > + aspeed_xdma_writel(ctx, ctx->regs.status, status); > + > + return IRQ_HANDLED; > +} > + > +static void aspeed_xdma_reset_finish(struct aspeed_xdma *ctx) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(&ctx->reset_lock, flags); > + > + ctx->in_reset = false; > + reset_control_deassert(ctx->reset); > + > + spin_unlock_irqrestore(&ctx->reset_lock, flags); > + > + msleep(XDMA_RESET_TIME_MS); > + > + aspeed_xdma_init_eng(ctx); > + aspeed_xdma_done(ctx, true); > +} > + > +static bool aspeed_xdma_reset_start(struct aspeed_xdma *ctx) > +{ > + bool rc = true; > + unsigned long flags; > + > + spin_lock_irqsave(&ctx->reset_lock, flags); > + > + if (ctx->in_reset) { > + rc = false; > + } else { > + ctx->in_reset = true; > + reset_control_assert(ctx->reset); > + } Do start and finish need to be split in this way? > + > + spin_unlock_irqrestore(&ctx->reset_lock, flags); > + > + return rc; > +} > + > +static void aspeed_xdma_reset_work(struct work_struct *work) > +{ > + struct delayed_work *dwork = to_delayed_work(work); > + struct aspeed_xdma *ctx = container_of(dwork, struct aspeed_xdma, > + reset_work); > + > + /* > + * Lock to make sure operations aren't started while the engine is > + * in an undefined state coming out of reset and waiting to init. Shouldn't we be holding it across both aspeed_xdma_reset_start() and aspeed_xdma_reset_finish()? > + */ > + mutex_lock(&ctx->start_lock); > + > + aspeed_xdma_reset_finish(ctx); > + > + mutex_unlock(&ctx->start_lock); > +} > + > +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) > +{ > + struct aspeed_xdma *ctx = arg; > + > + dev_dbg(ctx->dev, "pcie reset\n"); > + > + if (aspeed_xdma_reset_start(ctx)) > + schedule_delayed_work(&ctx->reset_work, > + msecs_to_jiffies(XDMA_RESET_TIME_MS)); > + > + return IRQ_HANDLED; > +} > + > +static int aspeed_xdma_init(struct aspeed_xdma *ctx) > +{ > + int rc; > + struct regmap *scu; > + u32 conf; > + u32 mem_size; > + u32 remap; > + u32 scu_bmc_class; > + u32 scu_pcie_conf; > + u32 scu_strap; > + u32 sdmc_remap_magic; > + u32 strap = 0; > + const u32 bmc = SCU_PCIE_CONF_BMC_EN | SCU_PCIE_CONF_BMC_EN_MSI | > + SCU_PCIE_CONF_BMC_EN_MCTP Do we need MCTP here? | SCU_PCIE_CONF_BMC_EN_IRQ | > + SCU_PCIE_CONF_BMC_EN_DMA; > + const u32 vga = SCU_PCIE_CONF_VGA_EN | SCU_PCIE_CONF_VGA_EN_MSI | > + SCU_PCIE_CONF_VGA_EN_MCTP Do we need MCTP here? | SCU_PCIE_CONF_VGA_EN_IRQ | > + SCU_PCIE_CONF_VGA_EN_DMA; > + u32 mem_sizes[4] = { 0x8000000, 0x10000000, 0x20000000, 0x40000000 }; > + const u32 vga_sizes[4] = { 0x800000, 0x1000000, 0x2000000, 0x4000000 }; > + void __iomem *sdmc_base = ioremap(SDMC_BASE, 0x100); Surely this should be a phandle from the devicetree? And what about conflicts with a potential SDMC driver? I don't think what you have is the right approach. > + > + if (!sdmc_base) { > + dev_err(ctx->dev, "Failed to ioremap mem controller regs.\n"); > + return -ENOMEM; > + } > + > + switch (ctx->version) { > + default: > + case xdma_ast2500: > + scu_bmc_class = SCU_AST2500_BMC_CLASS_REV; > + scu_pcie_conf = SCU_AST2500_PCIE_CONF; > + scu_strap = SCU_AST2500_STRAP; > + sdmc_remap_magic = SDMC_AST2500_REMAP_MAGIC; Can't this be described statically? The values should be derived from the devicetree compatible. > + > + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2500-scu"); I think we should do this via phandle in the devicetree. > + break; > + case xdma_ast2600: > + scu_bmc_class = SCU_AST2600_BMC_CLASS_REV; > + scu_pcie_conf = SCU_AST2600_PCIE_CONF; > + scu_strap = SCU_AST2600_STRAP; > + sdmc_remap_magic = SDMC_AST2600_REMAP_MAGIC; Can't this be described statically? The values should be derived from the devicetree compatible. > + > + mem_sizes[0] *= 2; > + mem_sizes[1] *= 2; > + mem_sizes[2] *= 2; > + mem_sizes[3] *= 2; Same query as above. > + > + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2600-scu"); I think we should do this via phandle in the devicetree. > + break; > + }; > + > + if (!scu) { > + dev_err(ctx->dev, "Failed to grab SCU regs.\n"); > + return -ENOMEM; > + } > + > + /* Set SOC to use the BMC PCIe device and set the device class code */ > + regmap_update_bits(scu, scu_pcie_conf, bmc | vga, bmc); > + regmap_write(scu, scu_bmc_class, SCU_BMC_CLASS_REV_XDMA); This should be selectable, probably via the devicetree. > + > + /* > + * Calculate the VGA memory size and physical address from the SCU and > + * memory controller registers. > + */ > + regmap_read(scu, scu_strap, &strap); > + > + switch (ctx->version) { > + case xdma_ast2500: > + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2500_STRAP_VGA_MEM, > + strap)]; > + break; > + case xdma_ast2600: > + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2600_STRAP_VGA_MEM, > + strap)]; > + break; Can't this be described statically? The values should be derived from the devicetree compatible. > + } > + > + conf = readl(sdmc_base + SDMC_CONF); > + remap = readl(sdmc_base + SDMC_REMAP); > + remap |= sdmc_remap_magic; > + writel(remap, sdmc_base + SDMC_REMAP); > + mem_size = mem_sizes[conf & SDMC_CONF_MEM]; See previous objection to this. > + > + iounmap(sdmc_base); > + > + ctx->vga_phys = (mem_size - ctx->vga_size) + 0x80000000; RAM base should be extracted from the devicetree. Better yet, the VGA space should be extracted from the devicetree and this calculation avoided altogether. > + > + ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL); > + if (!ctx->cmdq) { > + dev_err(ctx->dev, "Failed to allocate command queue.\n"); > + return -ENOMEM; > + } > + > + ctx->vga_virt = ioremap(ctx->vga_phys, ctx->vga_size); Use devm_ioremap() to avoid the cleanup. > + if (!ctx->vga_virt) { > + dev_err(ctx->dev, "Failed to ioremap VGA memory.\n"); > + return -ENOMEM; > + } > + > + rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt, > + ctx->vga_phys, ctx->vga_size, -1); > + if (rc) { > + dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n"); > + iounmap(ctx->vga_virt); > + return rc; > + } > + > + ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE, > + &ctx->cmdq_vga_phys); Can you educate me on this a little? Why is a command queue being allocated in memory writable by the host? Aren't we opening ourselves up to potential corruption? > + if (!ctx->cmdq_vga_virt) { > + dev_err(ctx->dev, "Failed to genalloc cmdq.\n"); > + iounmap(ctx->vga_virt); > + return -ENOMEM; > + } > + > + dev_dbg(ctx->dev, "VGA mapped at phys[%08x], size[%08x].\n", > + ctx->vga_phys, ctx->vga_size); > + > + return 0; > +} > + > +static int aspeed_xdma_probe(struct platform_device *pdev) > +{ > + int irq; > + int pcie_irq; > + int rc; > + enum versions vs = xdma_ast2500; > + struct device *dev = &pdev->dev; > + struct aspeed_xdma *ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); > + const void *md = of_device_get_match_data(dev); > + > + if (!ctx) > + return -ENOMEM; > + > + if (md) > + vs = (enum versions)md; > + > + switch (vs) { > + default: > + case xdma_ast2500: > + ctx->version = xdma_ast2500; > + ctx->control = XDMA_AST2500_CTRL_US_COMP | > + XDMA_AST2500_CTRL_DS_COMP | > + XDMA_AST2500_CTRL_DS_DIRTY | > + FIELD_PREP(XDMA_AST2500_CTRL_DS_SIZE, > + XDMA_DS_PCIE_REQ_SIZE_256) | > + XDMA_AST2500_CTRL_DS_TIMEOUT | > + XDMA_AST2500_CTRL_DS_CHECK_ID; > + ctx->queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE; > + ctx->regs.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR; > + ctx->regs.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP; > + ctx->regs.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP; > + ctx->regs.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP; > + ctx->regs.control = XDMA_AST2500_CTRL; > + ctx->regs.status = XDMA_AST2500_STATUS; > + ctx->status_bits.us_comp = XDMA_AST2500_STATUS_US_COMP; > + ctx->status_bits.ds_comp = XDMA_AST2500_STATUS_DS_COMP; > + ctx->status_bits.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY; Why not include all this in the match data? > + break; > + case xdma_ast2600: > + ctx->version = xdma_ast2600; > + ctx->control = XDMA_AST2600_CTRL_US_COMP | > + XDMA_AST2600_CTRL_DS_COMP | > + XDMA_AST2600_CTRL_DS_DIRTY | > + FIELD_PREP(XDMA_AST2600_CTRL_DS_SIZE, > + XDMA_DS_PCIE_REQ_SIZE_256); > + ctx->queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE; > + ctx->regs.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR; > + ctx->regs.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP; > + ctx->regs.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP; > + ctx->regs.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP; > + ctx->regs.control = XDMA_AST2600_CTRL; > + ctx->regs.status = XDMA_AST2600_STATUS; > + ctx->status_bits.us_comp = XDMA_AST2600_STATUS_US_COMP; > + ctx->status_bits.ds_comp = XDMA_AST2600_STATUS_DS_COMP; > + ctx->status_bits.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY; Same query as above > + break; > + }; > + > + ctx->dev = dev; > + platform_set_drvdata(pdev, ctx); > + mutex_init(&ctx->start_lock); > + INIT_DELAYED_WORK(&ctx->reset_work, aspeed_xdma_reset_work); > + spin_lock_init(&ctx->client_lock); > + spin_lock_init(&ctx->reset_lock); > + init_waitqueue_head(&ctx->wait); > + > + ctx->base = devm_platform_ioremap_resource(pdev, 0); > + if (IS_ERR(ctx->base)) { > + dev_err(dev, "Unable to ioremap registers.\n"); > + return PTR_ERR(ctx->base); > + } > + > + irq = platform_get_irq(pdev, 0); > + if (irq < 0) { > + dev_err(dev, "Unable to find IRQ.\n"); > + return -ENODEV; > + } > + > + rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED, > + DEVICE_NAME, ctx); > + if (rc < 0) { > + dev_err(dev, "Unable to request IRQ %d.\n", irq); > + return rc; > + } > + > + ctx->clock = devm_clk_get(dev, NULL); > + if (IS_ERR(ctx->clock)) { > + dev_err(dev, "Unable to request clock.\n"); > + return PTR_ERR(ctx->clock); > + } > + > + ctx->reset = devm_reset_control_get_exclusive(dev, NULL); > + if (IS_ERR(ctx->reset)) { > + dev_err(dev, "Unable to request reset control.\n"); > + return PTR_ERR(ctx->reset); > + } > + > + ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL); > + if (!ctx->vga_pool) { > + dev_err(dev, "Unable to setup genalloc pool.\n"); > + return -ENOMEM; > + } > + > + clk_prepare_enable(ctx->clock); Check for errors > + msleep(XDMA_RESET_TIME_MS); > + > + reset_control_deassert(ctx->reset); Check for errors. > + msleep(XDMA_RESET_TIME_MS); > + > + rc = aspeed_xdma_init(ctx); > + if (rc) { > + reset_control_assert(ctx->reset); > + clk_disable_unprepare(ctx->clock); > + return rc; > + } > + > + aspeed_xdma_init_eng(ctx); > + > + /* > + * This interrupt could fire immediately so only request it once the > + * engine and driver are initialized. > + */ > + pcie_irq = platform_get_irq(pdev, 1); > + if (pcie_irq < 0) { > + dev_warn(dev, "Unable to find PCI-E IRQ.\n"); > + } else { > + rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq, > + IRQF_SHARED, DEVICE_NAME, ctx); > + if (rc < 0) > + dev_warn(dev, "Unable to request PCI-E IRQ %d.\n", rc); > + } > + > + return 0; > +} > + > +static int aspeed_xdma_remove(struct platform_device *pdev) > +{ > + struct aspeed_xdma *ctx = platform_get_drvdata(pdev); > + > + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, > + XDMA_CMDQ_SIZE); You've used devm_gen_pool_create(), so no need to explicitly free it. > + iounmap(ctx->vga_virt); This is unnecessary if we use devm_ioremap() as suggested above. > + > + reset_control_assert(ctx->reset); > + clk_disable_unprepare(ctx->clock); > + > + return 0; > +} > + > +static const struct of_device_id aspeed_xdma_match[] = { > + { > + .compatible = "aspeed,ast2500-xdma", > + .data = (void *)xdma_ast2500, I'd prefer you create a struct as discussed throughout. > + }, > + { > + .compatible = "aspeed,ast2600-xdma", > + .data = (void *)xdma_ast2600, > + }, > + { }, > +}; > + > +static struct platform_driver aspeed_xdma_driver = { > + .probe = aspeed_xdma_probe, > + .remove = aspeed_xdma_remove, > + .driver = { > + .name = DEVICE_NAME, > + .of_match_table = aspeed_xdma_match, > + }, > +}; > + > +module_platform_driver(aspeed_xdma_driver); > + > +MODULE_AUTHOR("Eddie James"); > +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver"); > +MODULE_LICENSE("GPL v2"); > diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h > new file mode 100644 > index 0000000..7f3a031 > --- /dev/null > +++ b/include/uapi/linux/aspeed-xdma.h > @@ -0,0 +1,49 @@ > +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ > +/* Copyright IBM Corp 2019 */ > + > +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_ > +#define _UAPI_LINUX_ASPEED_XDMA_H_ > + > +#include <linux/types.h> > + > +/* > + * aspeed_xdma_direction > + * > + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC > + * > + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host > + * > + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine > + */ > +enum aspeed_xdma_direction { > + ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0, > + ASPEED_XDMA_DIRECTION_UPSTREAM, > + ASPEED_XDMA_DIRECTION_RESET, > +}; > + > +/* > + * aspeed_xdma_op > + * > + * host_addr: the DMA address on the host side, typically configured by PCI > + * subsystem > + * > + * len: the size of the transfer in bytes > + * > + * direction: an enumerator indicating the direction of the DMA operation; see > + * enum aspeed_xdma_direction > + * > + * bmc_addr: the virtual address to DMA on the BMC side; this parameter is > + * unused on current platforms since the XDMA engine is restricted to > + * accessing the VGA memory space This doesn't make sense to me - if it's a virtual address then talking about the VGA space doesn't make sense to me as where the memory lives is an implementation detail. If the parameter is a physical address then it makes sense, but we'd always want it specified regardless? Andrew > + * > + * reserved: for natural alignment purposes only > + */ > +struct aspeed_xdma_op { > + __u64 host_addr; > + __u32 len; > + __u32 direction; > + __u32 bmc_addr; > + __u32 reserved; > +}; > + > +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */ > -- > 1.8.3.1 > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver 2019-11-24 23:32 ` Andrew Jeffery @ 2019-11-25 0:01 ` Andrew Jeffery 2019-11-25 19:15 ` Eddie James 1 sibling, 0 replies; 24+ messages in thread From: Andrew Jeffery @ 2019-11-25 0:01 UTC (permalink / raw) To: Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree > > +static int aspeed_xdma_remove(struct platform_device *pdev) > > +{ > > + struct aspeed_xdma *ctx = platform_get_drvdata(pdev); > > + > > + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, > > + XDMA_CMDQ_SIZE); > > You've used devm_gen_pool_create(), so no need to explicitly free it. Sorry, disregard that, brain-fart. Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver 2019-11-24 23:32 ` Andrew Jeffery 2019-11-25 0:01 ` Andrew Jeffery @ 2019-11-25 19:15 ` Eddie James 2019-11-26 0:53 ` Andrew Jeffery 1 sibling, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-25 19:15 UTC (permalink / raw) To: Andrew Jeffery, Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On 11/24/19 5:32 PM, Andrew Jeffery wrote: > > On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: >> The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI >> DMA operations between the SOC (acting as a BMC) and a host processor >> in a server. >> >> This commit adds a driver to control the XDMA engine and adds functions >> to initialize the hardware and memory and start DMA operations. >> >> Signed-off-by: Eddie James <eajames@linux.ibm.com> >> --- >> MAINTAINERS | 2 + >> drivers/soc/aspeed/Kconfig | 8 + >> drivers/soc/aspeed/Makefile | 1 + >> drivers/soc/aspeed/aspeed-xdma.c | 856 +++++++++++++++++++++++++++++++++++++++ >> include/uapi/linux/aspeed-xdma.h | 49 +++ >> 5 files changed, 916 insertions(+) >> create mode 100644 drivers/soc/aspeed/aspeed-xdma.c >> create mode 100644 include/uapi/linux/aspeed-xdma.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 540bd45..7eea32e4 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -2696,6 +2696,8 @@ M: Eddie James <eajames@linux.ibm.com> >> L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) >> S: Maintained >> F: Documentation/devicetree/bindings/soc/aspeed/xdma.txt >> +F: drivers/soc/aspeed/aspeed-xdma.c >> +F: include/uapi/linux/aspeed-xdma.h >> >> ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS >> M: Corentin Chary <corentin.chary@gmail.com> >> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig >> index 323e177..2a6c16f 100644 >> --- a/drivers/soc/aspeed/Kconfig >> +++ b/drivers/soc/aspeed/Kconfig >> @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL >> ioctl()s, the driver also provides an interface for userspace mappings to >> a pre-defined region. >> >> +config ASPEED_XDMA >> + tristate "Aspeed XDMA Engine Driver" >> + depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA >> + help >> + Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX >> + SOCs. The XDMA engine can perform automatic PCI DMA operations >> + between the AST2XXX (acting as a BMC) and a host processor. >> + >> endmenu >> diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile >> index b64be47..977b046 100644 >> --- a/drivers/soc/aspeed/Makefile >> +++ b/drivers/soc/aspeed/Makefile >> @@ -2,3 +2,4 @@ >> obj-$(CONFIG_ASPEED_LPC_CTRL) += aspeed-lpc-ctrl.o >> obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o >> obj-$(CONFIG_ASPEED_P2A_CTRL) += aspeed-p2a-ctrl.o >> +obj-$(CONFIG_ASPEED_XDMA) += aspeed-xdma.o >> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c >> new file mode 100644 >> index 0000000..99041a6 >> --- /dev/null >> +++ b/drivers/soc/aspeed/aspeed-xdma.c >> @@ -0,0 +1,856 @@ >> +// SPDX-License-Identifier: GPL-2.0+ > This should be "GPL-2.0-or-later" (https://spdx.org/licenses/) > >> +// Copyright IBM Corp 2019 >> + >> +#include <linux/aspeed-xdma.h> > A device-specific header in this include path seems a little strange to me. It's under uapi... do I need to specify a different path? > >> +#include <linux/bitfield.h> >> +#include <linux/clk.h> >> +#include <linux/delay.h> >> +#include <linux/device.h> >> +#include <linux/dma-mapping.h> >> +#include <linux/fs.h> >> +#include <linux/genalloc.h> >> +#include <linux/interrupt.h> >> +#include <linux/jiffies.h> >> +#include <linux/mfd/syscon.h> >> +#include <linux/miscdevice.h> >> +#include <linux/module.h> >> +#include <linux/mutex.h> >> +#include <linux/of_device.h> >> +#include <linux/platform_device.h> >> +#include <linux/poll.h> >> +#include <linux/regmap.h> >> +#include <linux/reset.h> >> +#include <linux/slab.h> >> +#include <linux/spinlock.h> >> +#include <linux/string.h> >> +#include <linux/uaccess.h> >> +#include <linux/wait.h> >> +#include <linux/workqueue.h> >> + >> +#define DEVICE_NAME "aspeed-xdma" >> + >> +#define SCU_AST2500_STRAP 0x070 >> +#define SCU_AST2500_STRAP_VGA_MEM GENMASK(3, 2) >> +#define SCU_AST2600_STRAP 0x500 >> +#define SCU_AST2600_STRAP_VGA_MEM GENMASK(14, 13) > It could be easier to review if you add support for one SoC at a time. > >> + >> +#define SCU_AST2500_PCIE_CONF 0x180 >> +#define SCU_AST2600_PCIE_CONF 0xc20 >> +#define SCU_PCIE_CONF_VGA_EN BIT(0) >> +#define SCU_PCIE_CONF_VGA_EN_MMIO BIT(1) >> +#define SCU_PCIE_CONF_VGA_EN_LPC BIT(2) >> +#define SCU_PCIE_CONF_VGA_EN_MSI BIT(3) >> +#define SCU_PCIE_CONF_VGA_EN_MCTP BIT(4) >> +#define SCU_PCIE_CONF_VGA_EN_IRQ BIT(5) >> +#define SCU_PCIE_CONF_VGA_EN_DMA BIT(6) >> +#define SCU_PCIE_CONF_BMC_EN BIT(8) >> +#define SCU_PCIE_CONF_BMC_EN_MMIO BIT(9) >> +#define SCU_PCIE_CONF_BMC_EN_MSI BIT(11) >> +#define SCU_PCIE_CONF_BMC_EN_MCTP BIT(12) >> +#define SCU_PCIE_CONF_BMC_EN_IRQ BIT(13) >> +#define SCU_PCIE_CONF_BMC_EN_DMA BIT(14) >> + >> +#define SCU_AST2500_BMC_CLASS_REV 0x19c >> +#define SCU_AST2600_BMC_CLASS_REV 0xc4c >> +#define SCU_BMC_CLASS_REV_XDMA 0xff000001 >> + >> +#define SDMC_BASE 0x1e6e0000 >> +#define SDMC_CONF 0x004 >> +#define SDMC_CONF_MEM GENMASK(1, 0) >> +#define SDMC_REMAP 0x008 >> +#define SDMC_AST2500_REMAP_MAGIC (BIT(16) | BIT(17)) >> +#define SDMC_AST2600_REMAP_MAGIC BIT(18) > Can we have something more descriptive than "MAGIC"? What these bits > are doing is well documented and defined. Maybe I missed it then... I wasn't able to correspond these necessary bits to what is actually happening. The doc indicates it remaps "REQs" to the highest memory space. What do the REQs correspond to, and why is it these bits for these SOCs? > >> + >> +#define XDMA_CMDQ_SIZE PAGE_SIZE >> +#define XDMA_NUM_CMDS \ >> + (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) >> + >> +/* Aspeed specification requires 10ms after switching the reset line */ >> +#define XDMA_RESET_TIME_MS 10 >> + >> +#define XDMA_DS_PCIE_REQ_SIZE_128 0 >> +#define XDMA_DS_PCIE_REQ_SIZE_256 1 >> +#define XDMA_DS_PCIE_REQ_SIZE_512 2 >> +#define XDMA_DS_PCIE_REQ_SIZE_1K 3 >> +#define XDMA_DS_PCIE_REQ_SIZE_2K 4 >> +#define XDMA_DS_PCIE_REQ_SIZE_4K 5 >> + >> +#define XDMA_CMD_AST2500_PITCH_SHIFT 3 >> +#define XDMA_CMD_AST2500_PITCH_BMC GENMASK_ULL(62, 51) >> +#define XDMA_CMD_AST2500_PITCH_HOST GENMASK_ULL(46, 35) >> +#define XDMA_CMD_AST2500_PITCH_UPSTREAM BIT_ULL(31) >> +#define XDMA_CMD_AST2500_PITCH_ADDR GENMASK_ULL(29, 4) >> +#define XDMA_CMD_AST2500_PITCH_ID BIT_ULL(0) >> +#define XDMA_CMD_AST2500_CMD_IRQ_EN BIT_ULL(31) >> +#define XDMA_CMD_AST2500_CMD_LINE_NO GENMASK_ULL(27, 16) >> +#define XDMA_CMD_AST2500_CMD_IRQ_BMC BIT_ULL(15) >> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT 4 >> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE \ >> + GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT) >> +#define XDMA_CMD_AST2500_CMD_ID BIT_ULL(1) >> + >> +#define XDMA_CMD_AST2600_PITCH_BMC GENMASK_ULL(62, 48) >> +#define XDMA_CMD_AST2600_PITCH_HOST GENMASK_ULL(46, 32) >> +#define XDMA_CMD_AST2600_PITCH_ADDR GENMASK_ULL(30, 0) >> +#define XDMA_CMD_AST2600_CMD_64_EN BIT_ULL(40) >> +#define XDMA_CMD_AST2600_CMD_IRQ_BMC BIT_ULL(37) >> +#define XDMA_CMD_AST2600_CMD_IRQ_HOST BIT_ULL(36) >> +#define XDMA_CMD_AST2600_CMD_UPSTREAM BIT_ULL(32) >> +#define XDMA_CMD_AST2600_CMD_LINE_NO GENMASK_ULL(27, 16) >> +#define XDMA_CMD_AST2600_CMD_LINE_SIZE GENMASK_ULL(14, 0) >> +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE GENMASK_ULL(14, 12) >> + >> +#define XDMA_AST2500_QUEUE_ENTRY_SIZE 4 >> +#define XDMA_AST2500_HOST_CMDQ_ADDR0 0x00 >> +#define XDMA_AST2500_HOST_CMDQ_ENDP 0x04 >> +#define XDMA_AST2500_HOST_CMDQ_WRITEP 0x08 >> +#define XDMA_AST2500_HOST_CMDQ_READP 0x0c >> +#define XDMA_AST2500_BMC_CMDQ_ADDR 0x10 >> +#define XDMA_AST2500_BMC_CMDQ_ENDP 0x14 >> +#define XDMA_AST2500_BMC_CMDQ_WRITEP 0x18 >> +#define XDMA_AST2500_BMC_CMDQ_READP 0x1c >> +#define XDMA_BMC_CMDQ_READP_MAGIC 0xee882266 > What's this about? Using a macro to abstract a magic number and then using the > word "magic" in the name isn't very helpful. I think it should be renamed or > documented, or both. Sure. It's the value to write to reset the read pointer. > >> +#define XDMA_AST2500_CTRL 0x20 >> +#define XDMA_AST2500_CTRL_US_COMP BIT(4) >> +#define XDMA_AST2500_CTRL_DS_COMP BIT(5) >> +#define XDMA_AST2500_CTRL_DS_DIRTY BIT(6) >> +#define XDMA_AST2500_CTRL_DS_SIZE GENMASK(19, 17) >> +#define XDMA_AST2500_CTRL_DS_TIMEOUT BIT(28) >> +#define XDMA_AST2500_CTRL_DS_CHECK_ID BIT(29) >> +#define XDMA_AST2500_STATUS 0x24 >> +#define XDMA_AST2500_STATUS_US_COMP BIT(4) >> +#define XDMA_AST2500_STATUS_DS_COMP BIT(5) >> +#define XDMA_AST2500_STATUS_DS_DIRTY BIT(6) >> +#define XDMA_AST2500_INPRG_DS_CMD1 0x38 >> +#define XDMA_AST2500_INPRG_DS_CMD2 0x3c >> +#define XDMA_AST2500_INPRG_US_CMD00 0x40 >> +#define XDMA_AST2500_INPRG_US_CMD01 0x44 >> +#define XDMA_AST2500_INPRG_US_CMD10 0x48 >> +#define XDMA_AST2500_INPRG_US_CMD11 0x4c >> +#define XDMA_AST2500_INPRG_US_CMD20 0x50 >> +#define XDMA_AST2500_INPRG_US_CMD21 0x54 >> +#define XDMA_AST2500_HOST_CMDQ_ADDR1 0x60 >> +#define XDMA_AST2500_VGA_CMDQ_ADDR0 0x64 >> +#define XDMA_AST2500_VGA_CMDQ_ENDP 0x68 >> +#define XDMA_AST2500_VGA_CMDQ_WRITEP 0x6c >> +#define XDMA_AST2500_VGA_CMDQ_READP 0x70 >> +#define XDMA_AST2500_VGA_CMD_STATUS 0x74 >> +#define XDMA_AST2500_VGA_CMDQ_ADDR1 0x78 >> + >> +#define XDMA_AST2600_QUEUE_ENTRY_SIZE 2 >> +#define XDMA_AST2600_HOST_CMDQ_ADDR0 0x00 >> +#define XDMA_AST2600_HOST_CMDQ_ADDR1 0x04 >> +#define XDMA_AST2600_HOST_CMDQ_ENDP 0x08 >> +#define XDMA_AST2600_HOST_CMDQ_WRITEP 0x0c >> +#define XDMA_AST2600_HOST_CMDQ_READP 0x10 >> +#define XDMA_AST2600_BMC_CMDQ_ADDR 0x14 >> +#define XDMA_AST2600_BMC_CMDQ_ENDP 0x18 >> +#define XDMA_AST2600_BMC_CMDQ_WRITEP 0x1c >> +#define XDMA_AST2600_BMC_CMDQ_READP 0x20 >> +#define XDMA_AST2600_VGA_CMDQ_ADDR0 0x24 >> +#define XDMA_AST2600_VGA_CMDQ_ADDR1 0x28 >> +#define XDMA_AST2600_VGA_CMDQ_ENDP 0x2c >> +#define XDMA_AST2600_VGA_CMDQ_WRITEP 0x30 >> +#define XDMA_AST2600_VGA_CMDQ_READP 0x34 >> +#define XDMA_AST2600_CTRL 0x38 >> +#define XDMA_AST2600_CTRL_US_COMP BIT(16) >> +#define XDMA_AST2600_CTRL_DS_COMP BIT(17) >> +#define XDMA_AST2600_CTRL_DS_DIRTY BIT(18) >> +#define XDMA_AST2600_CTRL_DS_SIZE GENMASK(22, 20) >> +#define XDMA_AST2600_STATUS 0x3c >> +#define XDMA_AST2600_STATUS_US_COMP BIT(16) >> +#define XDMA_AST2600_STATUS_DS_COMP BIT(17) >> +#define XDMA_AST2600_STATUS_DS_DIRTY BIT(18) >> +#define XDMA_AST2600_INPRG_DS_CMD00 0x40 >> +#define XDMA_AST2600_INPRG_DS_CMD01 0x44 >> +#define XDMA_AST2600_INPRG_DS_CMD10 0x48 >> +#define XDMA_AST2600_INPRG_DS_CMD11 0x4c >> +#define XDMA_AST2600_INPRG_DS_CMD20 0x50 >> +#define XDMA_AST2600_INPRG_DS_CMD21 0x54 >> +#define XDMA_AST2600_INPRG_US_CMD00 0x60 >> +#define XDMA_AST2600_INPRG_US_CMD01 0x64 >> +#define XDMA_AST2600_INPRG_US_CMD10 0x68 >> +#define XDMA_AST2600_INPRG_US_CMD11 0x6c >> +#define XDMA_AST2600_INPRG_US_CMD20 0x70 >> +#define XDMA_AST2600_INPRG_US_CMD21 0x74 >> + >> +enum versions { xdma_ast2500, xdma_ast2600 }; >> + >> +struct aspeed_xdma_cmd { >> + u64 host_addr; >> + u64 pitch; >> + u64 cmd; >> + u64 reserved; >> +}; >> + >> +struct aspeed_xdma_regs { >> + u8 bmc_cmdq_addr; >> + u8 bmc_cmdq_endp; >> + u8 bmc_cmdq_writep; >> + u8 bmc_cmdq_readp; >> + u8 control; >> + u8 status; >> +}; >> + >> +struct aspeed_xdma_status_bits { >> + u32 us_comp; >> + u32 ds_comp; >> + u32 ds_dirty; >> +}; >> + >> +struct aspeed_xdma_client; >> + >> +struct aspeed_xdma { >> + enum versions version; >> + u32 control; >> + unsigned int queue_entry_size; >> + struct aspeed_xdma_regs regs; >> + struct aspeed_xdma_status_bits status_bits; >> + >> + struct device *dev; >> + void __iomem *base; >> + struct clk *clock; >> + struct reset_control *reset; >> + >> + bool in_progress; >> + bool in_reset; >> + bool upstream; >> + unsigned int cmd_idx; >> + struct mutex start_lock; >> + struct delayed_work reset_work; >> + spinlock_t client_lock; >> + spinlock_t reset_lock; > What data are each of these locks protecting? Please add documentation > about how you intend to use them. Try to group the locks with the data they > protect if this is not already the case. Sure, though I feel the names are fairly descriptive. > >> + wait_queue_head_t wait; >> + struct aspeed_xdma_client *current_client; >> + >> + u32 vga_phys; >> + u32 vga_size; >> + void *cmdq; >> + void __iomem *vga_virt; >> + dma_addr_t cmdq_vga_phys; >> + void *cmdq_vga_virt; >> + struct gen_pool *vga_pool; >> +}; >> + >> +struct aspeed_xdma_client { >> + struct aspeed_xdma *ctx; >> + >> + bool error; >> + bool in_progress; >> + void *virt; >> + dma_addr_t phys; >> + u32 size; >> +}; >> + >> +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg) >> +{ >> + u32 v = readl(ctx->base + reg); >> + >> + dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v); >> + return v; >> +} >> + >> +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val) >> +{ >> + writel(val, ctx->base + reg); >> + dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, readl(ctx->base + reg)); > That readl() seems a bit paranoid, and is probably evaluated whether or not > this dev_dbg() is enabled. What drove this approach? There is some tricky masking of writes of many of the engine registers, so it was helpful to see what the actual result is. I could drop it. > >> +} >> + >> +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx) >> +{ >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_endp, >> + ctx->queue_entry_size * XDMA_NUM_CMDS); >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_readp, >> + XDMA_BMC_CMDQ_READP_MAGIC); >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, 0); >> + aspeed_xdma_writel(ctx, ctx->regs.control, ctx->control); >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_addr, ctx->cmdq_vga_phys); >> + >> + ctx->cmd_idx = 0; >> + ctx->in_progress = false; >> +} >> + >> +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx, >> + struct aspeed_xdma_op *op, >> + u32 bmc_addr) >> +{ >> + u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC | >> + XDMA_CMD_AST2500_CMD_ID; >> + u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) | >> + XDMA_CMD_AST2500_PITCH_ID; >> + unsigned int line_size; >> + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; >> + unsigned int line_no = 1; >> + unsigned int pitch = 1; >> + struct aspeed_xdma_cmd *ncmd = >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); >> + >> + dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n", >> + op->direction ? "upstream" : "downstream", bmc_addr, op->len, >> + (u32)op->host_addr); >> + >> + if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) { >> + unsigned int rem; >> + unsigned int total; >> + >> + line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE; >> + total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no; >> + rem = (op->len - total) >> >> + XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; >> + line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE; >> + pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT; >> + line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > Can we clean up the configuration of line_size and pitch here? They are set to > constants in this case. I think this is as clean as it gets. They're used later on so unless I have more if statements it makes sense to set them here. > >> + >> + if (rem) { >> + u32 rbmc = bmc_addr + total; >> + struct aspeed_xdma_cmd *rcmd = >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); >> + >> + rcmd->host_addr = op->host_addr + (u64)total; >> + rcmd->pitch = cmd_pitch | >> + ((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) | >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) | >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1); >> + rcmd->cmd = cmd | >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) | >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, >> + rem); >> + >> + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, >> + 16, 1, rcmd, sizeof(*rcmd), true); >> + >> + cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN | >> + XDMA_CMD_AST2500_CMD_IRQ_BMC); >> + >> + nidx = (nidx + 1) % XDMA_NUM_CMDS; >> + } >> + } else { >> + line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; >> + } >> + >> + ncmd->host_addr = op->host_addr; >> + ncmd->pitch = cmd_pitch | >> + ((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) | >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) | >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch); >> + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) | >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size); >> + >> + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, >> + sizeof(*ncmd), true); >> + >> + return nidx; >> +} >> + >> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx, >> + struct aspeed_xdma_op *op, >> + u32 bmc_addr) >> +{ >> + u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC | >> + (op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0); >> + unsigned int line_size; >> + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; >> + unsigned int line_no = 1; >> + unsigned int pitch = 1; >> + struct aspeed_xdma_cmd *ncmd = >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); >> + >> + if ((op->host_addr + op->len) & 0xffffffff00000000ULL) >> + cmd |= XDMA_CMD_AST2600_CMD_64_EN; >> + >> + dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] " >> + "host[%016llx]\n", op->direction ? "upstream" : "downstream", >> + bmc_addr, op->len, op->host_addr); >> + >> + if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) { >> + unsigned int rem; >> + unsigned int total; >> + >> + line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; >> + total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no; >> + rem = op->len - total; >> + line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; >> + pitch = line_size; >> + >> + if (rem) { >> + u32 rbmc = bmc_addr + total; >> + struct aspeed_xdma_cmd *rcmd = >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); >> + >> + rcmd->host_addr = op->host_addr + (u64)total; >> + rcmd->pitch = >> + ((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) | >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) | >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1); >> + rcmd->cmd = cmd | >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) | >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, >> + rem); >> + >> + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, >> + 16, 1, rcmd, sizeof(*rcmd), true); >> + >> + cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC; >> + >> + nidx = (nidx + 1) % XDMA_NUM_CMDS; >> + } >> + } else { >> + line_size = op->len; >> + } >> + >> + ncmd->host_addr = op->host_addr; >> + ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) | >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) | >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch); >> + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) | >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size); >> + >> + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, >> + sizeof(*ncmd), true); >> + >> + return nidx; >> +} >> + >> +static void aspeed_xdma_start(struct aspeed_xdma *ctx, >> + struct aspeed_xdma_op *op, u32 bmc_addr, >> + struct aspeed_xdma_client *client) >> +{ >> + unsigned int nidx; >> + >> + mutex_lock(&ctx->start_lock); >> + >> + switch (ctx->version) { >> + default: >> + case xdma_ast2500: >> + nidx = aspeed_xdma_ast2500_set_cmd(ctx, op, bmc_addr); >> + break; >> + case xdma_ast2600: >> + nidx = aspeed_xdma_ast2600_set_cmd(ctx, op, bmc_addr); >> + break; >> + } > What was the trade-off between the enum and function pointers? Is the > enum approach strictly better for some reason? No, I just decided to go with the switch. > >> + >> + memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE); >> + >> + client->in_progress = true; >> + ctx->current_client = client; >> + >> + ctx->in_progress = true; > Do we need ctx->in_progress vs just using the NULL state of ctx->current_client? > Is it possible for ctx->current_client to be set and not have a transfer in progress? Good point, could probably drop that. > >> + ctx->upstream = op->direction ? true : false; >> + >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, >> + nidx * ctx->queue_entry_size); >> + >> + ctx->cmd_idx = nidx; >> + >> + mutex_unlock(&ctx->start_lock); >> +} >> + >> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error) >> +{ >> + unsigned long flags; >> + >> + /* >> + * Lock to make sure simultaneous reset and transfer complete don't >> + * leave the client with the wrong error state. >> + */ >> + spin_lock_irqsave(&ctx->client_lock, flags); >> + >> + if (ctx->current_client) { > You're testing ctx->current_client under ctx->client_lock here, but ctx->current_client > is set under ctx->start_lock in aspeed_xdma_start(), which does not take > ctx->client_lock. What data is protected by ctx->client_lock? What data is protected > by ctx->client_lock? client_lock is making sure the client.error and ctx.current_client are updated atomically, as indicated by the comment. It's impossible to be starting a transfer at the same time, since ctx.in_progress would be true. > >> + ctx->current_client->error = error; >> + ctx->current_client->in_progress = false; >> + ctx->current_client = NULL; >> + } >> + >> + spin_unlock_irqrestore(&ctx->client_lock, flags); >> + >> + ctx->in_progress = false; > You set ctx->in_progress under ctx->start_lock in aspeed_xdma_start() but you > do not take ctx->start_lock when setting it here. What data is ctx->start_lock > protecting? start_lock is making sure that the hardware register state, command queue, and data associated with the current transfer are updated atomically in the start function. It shouldn't be necessary to lock just to reset in_progress > >> + wake_up_interruptible_all(&ctx->wait); >> +} >> + >> +static irqreturn_t aspeed_xdma_irq(int irq, void *arg) >> +{ >> + struct aspeed_xdma *ctx = arg; >> + u32 status = aspeed_xdma_readl(ctx, ctx->regs.status); >> + >> + if (status & ctx->status_bits.ds_dirty) { >> + aspeed_xdma_done(ctx, true); >> + } else { >> + if (status & ctx->status_bits.us_comp) { >> + if (ctx->upstream) >> + aspeed_xdma_done(ctx, false); >> + } >> + >> + if (status & ctx->status_bits.ds_comp) { >> + if (!ctx->upstream) >> + aspeed_xdma_done(ctx, false); >> + } >> + } >> + >> + aspeed_xdma_writel(ctx, ctx->regs.status, status); >> + >> + return IRQ_HANDLED; >> +} >> + >> +static void aspeed_xdma_reset_finish(struct aspeed_xdma *ctx) >> +{ >> + unsigned long flags; >> + >> + spin_lock_irqsave(&ctx->reset_lock, flags); >> + >> + ctx->in_reset = false; >> + reset_control_deassert(ctx->reset); >> + >> + spin_unlock_irqrestore(&ctx->reset_lock, flags); >> + >> + msleep(XDMA_RESET_TIME_MS); >> + >> + aspeed_xdma_init_eng(ctx); >> + aspeed_xdma_done(ctx, true); >> +} >> + >> +static bool aspeed_xdma_reset_start(struct aspeed_xdma *ctx) >> +{ >> + bool rc = true; >> + unsigned long flags; >> + >> + spin_lock_irqsave(&ctx->reset_lock, flags); >> + >> + if (ctx->in_reset) { >> + rc = false; >> + } else { >> + ctx->in_reset = true; >> + reset_control_assert(ctx->reset); >> + } > Do start and finish need to be split in this way? Yes, in order to prevent waiting for 10ms in interrupt context during the reset. > >> + >> + spin_unlock_irqrestore(&ctx->reset_lock, flags); >> + >> + return rc; >> +} >> + >> +static void aspeed_xdma_reset_work(struct work_struct *work) >> +{ >> + struct delayed_work *dwork = to_delayed_work(work); >> + struct aspeed_xdma *ctx = container_of(dwork, struct aspeed_xdma, >> + reset_work); >> + >> + /* >> + * Lock to make sure operations aren't started while the engine is >> + * in an undefined state coming out of reset and waiting to init. > Shouldn't we be holding it across both aspeed_xdma_reset_start() and > aspeed_xdma_reset_finish()? No, it isn't necessary and also start_lock is a mutex which can't be locked in interrupt context. It isn't necessary because a reset before, during, or after starting a transfer will simply cause the transfer to do nothing until the reset completes, when it will be put into an error state. > >> + */ >> + mutex_lock(&ctx->start_lock); >> + >> + aspeed_xdma_reset_finish(ctx); >> + >> + mutex_unlock(&ctx->start_lock); >> +} >> + >> +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) >> +{ >> + struct aspeed_xdma *ctx = arg; >> + >> + dev_dbg(ctx->dev, "pcie reset\n"); >> + >> + if (aspeed_xdma_reset_start(ctx)) >> + schedule_delayed_work(&ctx->reset_work, >> + msecs_to_jiffies(XDMA_RESET_TIME_MS)); >> + >> + return IRQ_HANDLED; >> +} >> + >> +static int aspeed_xdma_init(struct aspeed_xdma *ctx) >> +{ >> + int rc; >> + struct regmap *scu; >> + u32 conf; >> + u32 mem_size; >> + u32 remap; >> + u32 scu_bmc_class; >> + u32 scu_pcie_conf; >> + u32 scu_strap; >> + u32 sdmc_remap_magic; >> + u32 strap = 0; >> + const u32 bmc = SCU_PCIE_CONF_BMC_EN | SCU_PCIE_CONF_BMC_EN_MSI | >> + SCU_PCIE_CONF_BMC_EN_MCTP > Do we need MCTP here? I don't know, I used the default value which has MCTP enabled. > > | SCU_PCIE_CONF_BMC_EN_IRQ | >> + SCU_PCIE_CONF_BMC_EN_DMA; >> + const u32 vga = SCU_PCIE_CONF_VGA_EN | SCU_PCIE_CONF_VGA_EN_MSI | >> + SCU_PCIE_CONF_VGA_EN_MCTP > Do we need MCTP here? > > | SCU_PCIE_CONF_VGA_EN_IRQ | >> + SCU_PCIE_CONF_VGA_EN_DMA; >> + u32 mem_sizes[4] = { 0x8000000, 0x10000000, 0x20000000, 0x40000000 }; >> + const u32 vga_sizes[4] = { 0x800000, 0x1000000, 0x2000000, 0x4000000 }; >> + void __iomem *sdmc_base = ioremap(SDMC_BASE, 0x100); > Surely this should be a phandle from the devicetree? And what about conflicts > with a potential SDMC driver? I don't think what you have is the right approach. > >> + >> + if (!sdmc_base) { >> + dev_err(ctx->dev, "Failed to ioremap mem controller regs.\n"); >> + return -ENOMEM; >> + } >> + >> + switch (ctx->version) { >> + default: >> + case xdma_ast2500: >> + scu_bmc_class = SCU_AST2500_BMC_CLASS_REV; >> + scu_pcie_conf = SCU_AST2500_PCIE_CONF; >> + scu_strap = SCU_AST2500_STRAP; >> + sdmc_remap_magic = SDMC_AST2500_REMAP_MAGIC; > Can't this be described statically? The values should be derived from the > devicetree compatible. > >> + >> + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2500-scu"); > I think we should do this via phandle in the devicetree. > >> + break; >> + case xdma_ast2600: >> + scu_bmc_class = SCU_AST2600_BMC_CLASS_REV; >> + scu_pcie_conf = SCU_AST2600_PCIE_CONF; >> + scu_strap = SCU_AST2600_STRAP; >> + sdmc_remap_magic = SDMC_AST2600_REMAP_MAGIC; > Can't this be described statically? The values should be derived from the > devicetree compatible. > >> + >> + mem_sizes[0] *= 2; >> + mem_sizes[1] *= 2; >> + mem_sizes[2] *= 2; >> + mem_sizes[3] *= 2; > Same query as above. > >> + >> + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2600-scu"); > I think we should do this via phandle in the devicetree. > >> + break; >> + }; >> + >> + if (!scu) { >> + dev_err(ctx->dev, "Failed to grab SCU regs.\n"); >> + return -ENOMEM; >> + } >> + >> + /* Set SOC to use the BMC PCIe device and set the device class code */ >> + regmap_update_bits(scu, scu_pcie_conf, bmc | vga, bmc); >> + regmap_write(scu, scu_bmc_class, SCU_BMC_CLASS_REV_XDMA); > This should be selectable, probably via the devicetree. > >> + >> + /* >> + * Calculate the VGA memory size and physical address from the SCU and >> + * memory controller registers. >> + */ >> + regmap_read(scu, scu_strap, &strap); >> + >> + switch (ctx->version) { >> + case xdma_ast2500: >> + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2500_STRAP_VGA_MEM, >> + strap)]; >> + break; >> + case xdma_ast2600: >> + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2600_STRAP_VGA_MEM, >> + strap)]; >> + break; > Can't this be described statically? The values should be derived from the > devicetree compatible. > >> + } >> + >> + conf = readl(sdmc_base + SDMC_CONF); >> + remap = readl(sdmc_base + SDMC_REMAP); >> + remap |= sdmc_remap_magic; >> + writel(remap, sdmc_base + SDMC_REMAP); >> + mem_size = mem_sizes[conf & SDMC_CONF_MEM]; > See previous objection to this. > >> + >> + iounmap(sdmc_base); >> + >> + ctx->vga_phys = (mem_size - ctx->vga_size) + 0x80000000; > RAM base should be extracted from the devicetree. Better yet, the VGA space should > be extracted from the devicetree and this calculation avoided altogether. Ok, I'll use the devicetree for all this instead. > >> + >> + ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL); >> + if (!ctx->cmdq) { >> + dev_err(ctx->dev, "Failed to allocate command queue.\n"); >> + return -ENOMEM; >> + } >> + >> + ctx->vga_virt = ioremap(ctx->vga_phys, ctx->vga_size); > Use devm_ioremap() to avoid the cleanup. > >> + if (!ctx->vga_virt) { >> + dev_err(ctx->dev, "Failed to ioremap VGA memory.\n"); >> + return -ENOMEM; >> + } >> + >> + rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt, >> + ctx->vga_phys, ctx->vga_size, -1); >> + if (rc) { >> + dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n"); >> + iounmap(ctx->vga_virt); >> + return rc; >> + } >> + >> + ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE, >> + &ctx->cmdq_vga_phys); > Can you educate me on this a little? Why is a command queue being allocated in > memory writable by the host? Aren't we opening ourselves up to potential corruption? In it's current configuration, the engine can't access anything except the VGA memory. That includes the location for the command queue. Not sure much can be done about this. What path can the host use to access this memory though? > >> + if (!ctx->cmdq_vga_virt) { >> + dev_err(ctx->dev, "Failed to genalloc cmdq.\n"); >> + iounmap(ctx->vga_virt); >> + return -ENOMEM; >> + } >> + >> + dev_dbg(ctx->dev, "VGA mapped at phys[%08x], size[%08x].\n", >> + ctx->vga_phys, ctx->vga_size); >> + >> + return 0; >> +} >> + >> +static int aspeed_xdma_probe(struct platform_device *pdev) >> +{ >> + int irq; >> + int pcie_irq; >> + int rc; >> + enum versions vs = xdma_ast2500; >> + struct device *dev = &pdev->dev; >> + struct aspeed_xdma *ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); >> + const void *md = of_device_get_match_data(dev); >> + >> + if (!ctx) >> + return -ENOMEM; >> + >> + if (md) >> + vs = (enum versions)md; >> + >> + switch (vs) { >> + default: >> + case xdma_ast2500: >> + ctx->version = xdma_ast2500; >> + ctx->control = XDMA_AST2500_CTRL_US_COMP | >> + XDMA_AST2500_CTRL_DS_COMP | >> + XDMA_AST2500_CTRL_DS_DIRTY | >> + FIELD_PREP(XDMA_AST2500_CTRL_DS_SIZE, >> + XDMA_DS_PCIE_REQ_SIZE_256) | >> + XDMA_AST2500_CTRL_DS_TIMEOUT | >> + XDMA_AST2500_CTRL_DS_CHECK_ID; >> + ctx->queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE; >> + ctx->regs.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR; >> + ctx->regs.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP; >> + ctx->regs.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP; >> + ctx->regs.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP; >> + ctx->regs.control = XDMA_AST2500_CTRL; >> + ctx->regs.status = XDMA_AST2500_STATUS; >> + ctx->status_bits.us_comp = XDMA_AST2500_STATUS_US_COMP; >> + ctx->status_bits.ds_comp = XDMA_AST2500_STATUS_DS_COMP; >> + ctx->status_bits.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY; > Why not include all this in the match data? Sure. > >> + break; >> + case xdma_ast2600: >> + ctx->version = xdma_ast2600; >> + ctx->control = XDMA_AST2600_CTRL_US_COMP | >> + XDMA_AST2600_CTRL_DS_COMP | >> + XDMA_AST2600_CTRL_DS_DIRTY | >> + FIELD_PREP(XDMA_AST2600_CTRL_DS_SIZE, >> + XDMA_DS_PCIE_REQ_SIZE_256); >> + ctx->queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE; >> + ctx->regs.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR; >> + ctx->regs.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP; >> + ctx->regs.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP; >> + ctx->regs.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP; >> + ctx->regs.control = XDMA_AST2600_CTRL; >> + ctx->regs.status = XDMA_AST2600_STATUS; >> + ctx->status_bits.us_comp = XDMA_AST2600_STATUS_US_COMP; >> + ctx->status_bits.ds_comp = XDMA_AST2600_STATUS_DS_COMP; >> + ctx->status_bits.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY; > Same query as above > >> + break; >> + }; >> + >> + ctx->dev = dev; >> + platform_set_drvdata(pdev, ctx); >> + mutex_init(&ctx->start_lock); >> + INIT_DELAYED_WORK(&ctx->reset_work, aspeed_xdma_reset_work); >> + spin_lock_init(&ctx->client_lock); >> + spin_lock_init(&ctx->reset_lock); >> + init_waitqueue_head(&ctx->wait); >> + >> + ctx->base = devm_platform_ioremap_resource(pdev, 0); >> + if (IS_ERR(ctx->base)) { >> + dev_err(dev, "Unable to ioremap registers.\n"); >> + return PTR_ERR(ctx->base); >> + } >> + >> + irq = platform_get_irq(pdev, 0); >> + if (irq < 0) { >> + dev_err(dev, "Unable to find IRQ.\n"); >> + return -ENODEV; >> + } >> + >> + rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED, >> + DEVICE_NAME, ctx); >> + if (rc < 0) { >> + dev_err(dev, "Unable to request IRQ %d.\n", irq); >> + return rc; >> + } >> + >> + ctx->clock = devm_clk_get(dev, NULL); >> + if (IS_ERR(ctx->clock)) { >> + dev_err(dev, "Unable to request clock.\n"); >> + return PTR_ERR(ctx->clock); >> + } >> + >> + ctx->reset = devm_reset_control_get_exclusive(dev, NULL); >> + if (IS_ERR(ctx->reset)) { >> + dev_err(dev, "Unable to request reset control.\n"); >> + return PTR_ERR(ctx->reset); >> + } >> + >> + ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL); >> + if (!ctx->vga_pool) { >> + dev_err(dev, "Unable to setup genalloc pool.\n"); >> + return -ENOMEM; >> + } >> + >> + clk_prepare_enable(ctx->clock); > Check for errors > >> + msleep(XDMA_RESET_TIME_MS); >> + >> + reset_control_deassert(ctx->reset); > Check for errors. > >> + msleep(XDMA_RESET_TIME_MS); >> + >> + rc = aspeed_xdma_init(ctx); >> + if (rc) { >> + reset_control_assert(ctx->reset); >> + clk_disable_unprepare(ctx->clock); >> + return rc; >> + } >> + >> + aspeed_xdma_init_eng(ctx); >> + >> + /* >> + * This interrupt could fire immediately so only request it once the >> + * engine and driver are initialized. >> + */ >> + pcie_irq = platform_get_irq(pdev, 1); >> + if (pcie_irq < 0) { >> + dev_warn(dev, "Unable to find PCI-E IRQ.\n"); >> + } else { >> + rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq, >> + IRQF_SHARED, DEVICE_NAME, ctx); >> + if (rc < 0) >> + dev_warn(dev, "Unable to request PCI-E IRQ %d.\n", rc); >> + } >> + >> + return 0; >> +} >> + >> +static int aspeed_xdma_remove(struct platform_device *pdev) >> +{ >> + struct aspeed_xdma *ctx = platform_get_drvdata(pdev); >> + >> + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, >> + XDMA_CMDQ_SIZE); > You've used devm_gen_pool_create(), so no need to explicitly free it. > >> + iounmap(ctx->vga_virt); > This is unnecessary if we use devm_ioremap() as suggested above. > >> + >> + reset_control_assert(ctx->reset); >> + clk_disable_unprepare(ctx->clock); >> + >> + return 0; >> +} >> + >> +static const struct of_device_id aspeed_xdma_match[] = { >> + { >> + .compatible = "aspeed,ast2500-xdma", >> + .data = (void *)xdma_ast2500, > I'd prefer you create a struct as discussed throughout. > >> + }, >> + { >> + .compatible = "aspeed,ast2600-xdma", >> + .data = (void *)xdma_ast2600, >> + }, >> + { }, >> +}; >> + >> +static struct platform_driver aspeed_xdma_driver = { >> + .probe = aspeed_xdma_probe, >> + .remove = aspeed_xdma_remove, >> + .driver = { >> + .name = DEVICE_NAME, >> + .of_match_table = aspeed_xdma_match, >> + }, >> +}; >> + >> +module_platform_driver(aspeed_xdma_driver); >> + >> +MODULE_AUTHOR("Eddie James"); >> +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver"); >> +MODULE_LICENSE("GPL v2"); >> diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h >> new file mode 100644 >> index 0000000..7f3a031 >> --- /dev/null >> +++ b/include/uapi/linux/aspeed-xdma.h >> @@ -0,0 +1,49 @@ >> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ >> +/* Copyright IBM Corp 2019 */ >> + >> +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_ >> +#define _UAPI_LINUX_ASPEED_XDMA_H_ >> + >> +#include <linux/types.h> >> + >> +/* >> + * aspeed_xdma_direction >> + * >> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC >> + * >> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host >> + * >> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine >> + */ >> +enum aspeed_xdma_direction { >> + ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0, >> + ASPEED_XDMA_DIRECTION_UPSTREAM, >> + ASPEED_XDMA_DIRECTION_RESET, >> +}; >> + >> +/* >> + * aspeed_xdma_op >> + * >> + * host_addr: the DMA address on the host side, typically configured by PCI >> + * subsystem >> + * >> + * len: the size of the transfer in bytes >> + * >> + * direction: an enumerator indicating the direction of the DMA operation; see >> + * enum aspeed_xdma_direction >> + * >> + * bmc_addr: the virtual address to DMA on the BMC side; this parameter is >> + * unused on current platforms since the XDMA engine is restricted to >> + * accessing the VGA memory space > This doesn't make sense to me - if it's a virtual address then talking about the VGA > space doesn't make sense to me as where the memory lives is an implementation > detail. If the parameter is a physical address then it makes sense, but we'd always > want it specified regardless? As the comment says, it's unused. I added the parameter to future-proof the API in case the engine is configured differently, allowing arbitrary memory access by the engine. In that case, the user would pass the virtual address of their data. Thanks for the review! Eddie > > Andrew > >> + * >> + * reserved: for natural alignment purposes only >> + */ >> +struct aspeed_xdma_op { >> + __u64 host_addr; >> + __u32 len; >> + __u32 direction; >> + __u32 bmc_addr; >> + __u32 reserved; >> +}; >> + >> +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */ >> -- >> 1.8.3.1 >> >> ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver 2019-11-25 19:15 ` Eddie James @ 2019-11-26 0:53 ` Andrew Jeffery 0 siblings, 0 replies; 24+ messages in thread From: Andrew Jeffery @ 2019-11-26 0:53 UTC (permalink / raw) To: Eddie James, Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On Tue, 26 Nov 2019, at 05:45, Eddie James wrote: > > On 11/24/19 5:32 PM, Andrew Jeffery wrote: > > > > On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: > >> The XDMA engine embedded in the AST2500 and AST2600 SOCs performs PCI > >> DMA operations between the SOC (acting as a BMC) and a host processor > >> in a server. > >> > >> This commit adds a driver to control the XDMA engine and adds functions > >> to initialize the hardware and memory and start DMA operations. > >> > >> Signed-off-by: Eddie James <eajames@linux.ibm.com> > >> --- > >> MAINTAINERS | 2 + > >> drivers/soc/aspeed/Kconfig | 8 + > >> drivers/soc/aspeed/Makefile | 1 + > >> drivers/soc/aspeed/aspeed-xdma.c | 856 +++++++++++++++++++++++++++++++++++++++ > >> include/uapi/linux/aspeed-xdma.h | 49 +++ > >> 5 files changed, 916 insertions(+) > >> create mode 100644 drivers/soc/aspeed/aspeed-xdma.c > >> create mode 100644 include/uapi/linux/aspeed-xdma.h > >> > >> diff --git a/MAINTAINERS b/MAINTAINERS > >> index 540bd45..7eea32e4 100644 > >> --- a/MAINTAINERS > >> +++ b/MAINTAINERS > >> @@ -2696,6 +2696,8 @@ M: Eddie James <eajames@linux.ibm.com> > >> L: linux-aspeed@lists.ozlabs.org (moderated for non-subscribers) > >> S: Maintained > >> F: Documentation/devicetree/bindings/soc/aspeed/xdma.txt > >> +F: drivers/soc/aspeed/aspeed-xdma.c > >> +F: include/uapi/linux/aspeed-xdma.h > >> > >> ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS > >> M: Corentin Chary <corentin.chary@gmail.com> > >> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig > >> index 323e177..2a6c16f 100644 > >> --- a/drivers/soc/aspeed/Kconfig > >> +++ b/drivers/soc/aspeed/Kconfig > >> @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL > >> ioctl()s, the driver also provides an interface for userspace mappings to > >> a pre-defined region. > >> > >> +config ASPEED_XDMA > >> + tristate "Aspeed XDMA Engine Driver" > >> + depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA > >> + help > >> + Enable support for the Aspeed XDMA Engine found on the Aspeed AST2XXX > >> + SOCs. The XDMA engine can perform automatic PCI DMA operations > >> + between the AST2XXX (acting as a BMC) and a host processor. > >> + > >> endmenu > >> diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile > >> index b64be47..977b046 100644 > >> --- a/drivers/soc/aspeed/Makefile > >> +++ b/drivers/soc/aspeed/Makefile > >> @@ -2,3 +2,4 @@ > >> obj-$(CONFIG_ASPEED_LPC_CTRL) += aspeed-lpc-ctrl.o > >> obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o > >> obj-$(CONFIG_ASPEED_P2A_CTRL) += aspeed-p2a-ctrl.o > >> +obj-$(CONFIG_ASPEED_XDMA) += aspeed-xdma.o > >> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c > >> new file mode 100644 > >> index 0000000..99041a6 > >> --- /dev/null > >> +++ b/drivers/soc/aspeed/aspeed-xdma.c > >> @@ -0,0 +1,856 @@ > >> +// SPDX-License-Identifier: GPL-2.0+ > > This should be "GPL-2.0-or-later" (https://spdx.org/licenses/) > > > >> +// Copyright IBM Corp 2019 > >> + > >> +#include <linux/aspeed-xdma.h> > > A device-specific header in this include path seems a little strange to me. > > > It's under uapi... do I need to specify a different path? No, I didn't circle around to removing the comment. Should be fine for uapi. > > > > > >> +#include <linux/bitfield.h> > >> +#include <linux/clk.h> > >> +#include <linux/delay.h> > >> +#include <linux/device.h> > >> +#include <linux/dma-mapping.h> > >> +#include <linux/fs.h> > >> +#include <linux/genalloc.h> > >> +#include <linux/interrupt.h> > >> +#include <linux/jiffies.h> > >> +#include <linux/mfd/syscon.h> > >> +#include <linux/miscdevice.h> > >> +#include <linux/module.h> > >> +#include <linux/mutex.h> > >> +#include <linux/of_device.h> > >> +#include <linux/platform_device.h> > >> +#include <linux/poll.h> > >> +#include <linux/regmap.h> > >> +#include <linux/reset.h> > >> +#include <linux/slab.h> > >> +#include <linux/spinlock.h> > >> +#include <linux/string.h> > >> +#include <linux/uaccess.h> > >> +#include <linux/wait.h> > >> +#include <linux/workqueue.h> > >> + > >> +#define DEVICE_NAME "aspeed-xdma" > >> + > >> +#define SCU_AST2500_STRAP 0x070 > >> +#define SCU_AST2500_STRAP_VGA_MEM GENMASK(3, 2) > >> +#define SCU_AST2600_STRAP 0x500 > >> +#define SCU_AST2600_STRAP_VGA_MEM GENMASK(14, 13) > > It could be easier to review if you add support for one SoC at a time. > > > >> + > >> +#define SCU_AST2500_PCIE_CONF 0x180 > >> +#define SCU_AST2600_PCIE_CONF 0xc20 > >> +#define SCU_PCIE_CONF_VGA_EN BIT(0) > >> +#define SCU_PCIE_CONF_VGA_EN_MMIO BIT(1) > >> +#define SCU_PCIE_CONF_VGA_EN_LPC BIT(2) > >> +#define SCU_PCIE_CONF_VGA_EN_MSI BIT(3) > >> +#define SCU_PCIE_CONF_VGA_EN_MCTP BIT(4) > >> +#define SCU_PCIE_CONF_VGA_EN_IRQ BIT(5) > >> +#define SCU_PCIE_CONF_VGA_EN_DMA BIT(6) > >> +#define SCU_PCIE_CONF_BMC_EN BIT(8) > >> +#define SCU_PCIE_CONF_BMC_EN_MMIO BIT(9) > >> +#define SCU_PCIE_CONF_BMC_EN_MSI BIT(11) > >> +#define SCU_PCIE_CONF_BMC_EN_MCTP BIT(12) > >> +#define SCU_PCIE_CONF_BMC_EN_IRQ BIT(13) > >> +#define SCU_PCIE_CONF_BMC_EN_DMA BIT(14) > >> + > >> +#define SCU_AST2500_BMC_CLASS_REV 0x19c > >> +#define SCU_AST2600_BMC_CLASS_REV 0xc4c > >> +#define SCU_BMC_CLASS_REV_XDMA 0xff000001 > >> + > >> +#define SDMC_BASE 0x1e6e0000 > >> +#define SDMC_CONF 0x004 > >> +#define SDMC_CONF_MEM GENMASK(1, 0) > >> +#define SDMC_REMAP 0x008 > >> +#define SDMC_AST2500_REMAP_MAGIC (BIT(16) | BIT(17)) > >> +#define SDMC_AST2600_REMAP_MAGIC BIT(18) > > Can we have something more descriptive than "MAGIC"? What these bits > > are doing is well documented and defined. > > > Maybe I missed it then... I wasn't able to correspond these necessary > bits to what is actually happening. The doc indicates it remaps "REQs" > to the highest memory space. What do the REQs correspond to, and why is > it these bits for these SOCs? The REQ index corresponds to arbiter access priorities for each device. As the REQ table describes all the devices the remapping register simply reuses the REQ indexes to capture the remapping state for each device by mapping the REQ index to a bit index for the remap word. I suggest you do something like: #define SDMC_AST2500_REMAP_PCIE_ACCESS BIT(16) #define SDMC_AST2500_REMAP_XDMA_ACCESS BIT(17) #define SDMC_AST2600_REMAP_XDMA1_ACCESS BIT(18) > > > > > >> + > >> +#define XDMA_CMDQ_SIZE PAGE_SIZE > >> +#define XDMA_NUM_CMDS \ > >> + (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) > >> + > >> +/* Aspeed specification requires 10ms after switching the reset line */ > >> +#define XDMA_RESET_TIME_MS 10 > >> + > >> +#define XDMA_DS_PCIE_REQ_SIZE_128 0 > >> +#define XDMA_DS_PCIE_REQ_SIZE_256 1 > >> +#define XDMA_DS_PCIE_REQ_SIZE_512 2 > >> +#define XDMA_DS_PCIE_REQ_SIZE_1K 3 > >> +#define XDMA_DS_PCIE_REQ_SIZE_2K 4 > >> +#define XDMA_DS_PCIE_REQ_SIZE_4K 5 > >> + > >> +#define XDMA_CMD_AST2500_PITCH_SHIFT 3 > >> +#define XDMA_CMD_AST2500_PITCH_BMC GENMASK_ULL(62, 51) > >> +#define XDMA_CMD_AST2500_PITCH_HOST GENMASK_ULL(46, 35) > >> +#define XDMA_CMD_AST2500_PITCH_UPSTREAM BIT_ULL(31) > >> +#define XDMA_CMD_AST2500_PITCH_ADDR GENMASK_ULL(29, 4) > >> +#define XDMA_CMD_AST2500_PITCH_ID BIT_ULL(0) > >> +#define XDMA_CMD_AST2500_CMD_IRQ_EN BIT_ULL(31) > >> +#define XDMA_CMD_AST2500_CMD_LINE_NO GENMASK_ULL(27, 16) > >> +#define XDMA_CMD_AST2500_CMD_IRQ_BMC BIT_ULL(15) > >> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT 4 > >> +#define XDMA_CMD_AST2500_CMD_LINE_SIZE \ > >> + GENMASK_ULL(14, XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT) > >> +#define XDMA_CMD_AST2500_CMD_ID BIT_ULL(1) > >> + > >> +#define XDMA_CMD_AST2600_PITCH_BMC GENMASK_ULL(62, 48) > >> +#define XDMA_CMD_AST2600_PITCH_HOST GENMASK_ULL(46, 32) > >> +#define XDMA_CMD_AST2600_PITCH_ADDR GENMASK_ULL(30, 0) > >> +#define XDMA_CMD_AST2600_CMD_64_EN BIT_ULL(40) > >> +#define XDMA_CMD_AST2600_CMD_IRQ_BMC BIT_ULL(37) > >> +#define XDMA_CMD_AST2600_CMD_IRQ_HOST BIT_ULL(36) > >> +#define XDMA_CMD_AST2600_CMD_UPSTREAM BIT_ULL(32) > >> +#define XDMA_CMD_AST2600_CMD_LINE_NO GENMASK_ULL(27, 16) > >> +#define XDMA_CMD_AST2600_CMD_LINE_SIZE GENMASK_ULL(14, 0) > >> +#define XDMA_CMD_AST2600_CMD_MULTILINE_SIZE GENMASK_ULL(14, 12) > >> + > >> +#define XDMA_AST2500_QUEUE_ENTRY_SIZE 4 > >> +#define XDMA_AST2500_HOST_CMDQ_ADDR0 0x00 > >> +#define XDMA_AST2500_HOST_CMDQ_ENDP 0x04 > >> +#define XDMA_AST2500_HOST_CMDQ_WRITEP 0x08 > >> +#define XDMA_AST2500_HOST_CMDQ_READP 0x0c > >> +#define XDMA_AST2500_BMC_CMDQ_ADDR 0x10 > >> +#define XDMA_AST2500_BMC_CMDQ_ENDP 0x14 > >> +#define XDMA_AST2500_BMC_CMDQ_WRITEP 0x18 > >> +#define XDMA_AST2500_BMC_CMDQ_READP 0x1c > >> +#define XDMA_BMC_CMDQ_READP_MAGIC 0xee882266 > > What's this about? Using a macro to abstract a magic number and then using the > > word "magic" in the name isn't very helpful. I think it should be renamed or > > documented, or both. > > > Sure. It's the value to write to reset the read pointer. maybe s/MAGIC/RESET/ then? > > > > > >> +#define XDMA_AST2500_CTRL 0x20 > >> +#define XDMA_AST2500_CTRL_US_COMP BIT(4) > >> +#define XDMA_AST2500_CTRL_DS_COMP BIT(5) > >> +#define XDMA_AST2500_CTRL_DS_DIRTY BIT(6) > >> +#define XDMA_AST2500_CTRL_DS_SIZE GENMASK(19, 17) > >> +#define XDMA_AST2500_CTRL_DS_TIMEOUT BIT(28) > >> +#define XDMA_AST2500_CTRL_DS_CHECK_ID BIT(29) > >> +#define XDMA_AST2500_STATUS 0x24 > >> +#define XDMA_AST2500_STATUS_US_COMP BIT(4) > >> +#define XDMA_AST2500_STATUS_DS_COMP BIT(5) > >> +#define XDMA_AST2500_STATUS_DS_DIRTY BIT(6) > >> +#define XDMA_AST2500_INPRG_DS_CMD1 0x38 > >> +#define XDMA_AST2500_INPRG_DS_CMD2 0x3c > >> +#define XDMA_AST2500_INPRG_US_CMD00 0x40 > >> +#define XDMA_AST2500_INPRG_US_CMD01 0x44 > >> +#define XDMA_AST2500_INPRG_US_CMD10 0x48 > >> +#define XDMA_AST2500_INPRG_US_CMD11 0x4c > >> +#define XDMA_AST2500_INPRG_US_CMD20 0x50 > >> +#define XDMA_AST2500_INPRG_US_CMD21 0x54 > >> +#define XDMA_AST2500_HOST_CMDQ_ADDR1 0x60 > >> +#define XDMA_AST2500_VGA_CMDQ_ADDR0 0x64 > >> +#define XDMA_AST2500_VGA_CMDQ_ENDP 0x68 > >> +#define XDMA_AST2500_VGA_CMDQ_WRITEP 0x6c > >> +#define XDMA_AST2500_VGA_CMDQ_READP 0x70 > >> +#define XDMA_AST2500_VGA_CMD_STATUS 0x74 > >> +#define XDMA_AST2500_VGA_CMDQ_ADDR1 0x78 > >> + > >> +#define XDMA_AST2600_QUEUE_ENTRY_SIZE 2 > >> +#define XDMA_AST2600_HOST_CMDQ_ADDR0 0x00 > >> +#define XDMA_AST2600_HOST_CMDQ_ADDR1 0x04 > >> +#define XDMA_AST2600_HOST_CMDQ_ENDP 0x08 > >> +#define XDMA_AST2600_HOST_CMDQ_WRITEP 0x0c > >> +#define XDMA_AST2600_HOST_CMDQ_READP 0x10 > >> +#define XDMA_AST2600_BMC_CMDQ_ADDR 0x14 > >> +#define XDMA_AST2600_BMC_CMDQ_ENDP 0x18 > >> +#define XDMA_AST2600_BMC_CMDQ_WRITEP 0x1c > >> +#define XDMA_AST2600_BMC_CMDQ_READP 0x20 > >> +#define XDMA_AST2600_VGA_CMDQ_ADDR0 0x24 > >> +#define XDMA_AST2600_VGA_CMDQ_ADDR1 0x28 > >> +#define XDMA_AST2600_VGA_CMDQ_ENDP 0x2c > >> +#define XDMA_AST2600_VGA_CMDQ_WRITEP 0x30 > >> +#define XDMA_AST2600_VGA_CMDQ_READP 0x34 > >> +#define XDMA_AST2600_CTRL 0x38 > >> +#define XDMA_AST2600_CTRL_US_COMP BIT(16) > >> +#define XDMA_AST2600_CTRL_DS_COMP BIT(17) > >> +#define XDMA_AST2600_CTRL_DS_DIRTY BIT(18) > >> +#define XDMA_AST2600_CTRL_DS_SIZE GENMASK(22, 20) > >> +#define XDMA_AST2600_STATUS 0x3c > >> +#define XDMA_AST2600_STATUS_US_COMP BIT(16) > >> +#define XDMA_AST2600_STATUS_DS_COMP BIT(17) > >> +#define XDMA_AST2600_STATUS_DS_DIRTY BIT(18) > >> +#define XDMA_AST2600_INPRG_DS_CMD00 0x40 > >> +#define XDMA_AST2600_INPRG_DS_CMD01 0x44 > >> +#define XDMA_AST2600_INPRG_DS_CMD10 0x48 > >> +#define XDMA_AST2600_INPRG_DS_CMD11 0x4c > >> +#define XDMA_AST2600_INPRG_DS_CMD20 0x50 > >> +#define XDMA_AST2600_INPRG_DS_CMD21 0x54 > >> +#define XDMA_AST2600_INPRG_US_CMD00 0x60 > >> +#define XDMA_AST2600_INPRG_US_CMD01 0x64 > >> +#define XDMA_AST2600_INPRG_US_CMD10 0x68 > >> +#define XDMA_AST2600_INPRG_US_CMD11 0x6c > >> +#define XDMA_AST2600_INPRG_US_CMD20 0x70 > >> +#define XDMA_AST2600_INPRG_US_CMD21 0x74 > >> + > >> +enum versions { xdma_ast2500, xdma_ast2600 }; > >> + > >> +struct aspeed_xdma_cmd { > >> + u64 host_addr; > >> + u64 pitch; > >> + u64 cmd; > >> + u64 reserved; > >> +}; > >> + > >> +struct aspeed_xdma_regs { > >> + u8 bmc_cmdq_addr; > >> + u8 bmc_cmdq_endp; > >> + u8 bmc_cmdq_writep; > >> + u8 bmc_cmdq_readp; > >> + u8 control; > >> + u8 status; > >> +}; > >> + > >> +struct aspeed_xdma_status_bits { > >> + u32 us_comp; > >> + u32 ds_comp; > >> + u32 ds_dirty; > >> +}; > >> + > >> +struct aspeed_xdma_client; > >> + > >> +struct aspeed_xdma { > >> + enum versions version; > >> + u32 control; > >> + unsigned int queue_entry_size; > >> + struct aspeed_xdma_regs regs; > >> + struct aspeed_xdma_status_bits status_bits; > >> + > >> + struct device *dev; > >> + void __iomem *base; > >> + struct clk *clock; > >> + struct reset_control *reset; > >> + > >> + bool in_progress; > >> + bool in_reset; > >> + bool upstream; > >> + unsigned int cmd_idx; > >> + struct mutex start_lock; > >> + struct delayed_work reset_work; > >> + spinlock_t client_lock; > >> + spinlock_t reset_lock; > > What data are each of these locks protecting? Please add documentation > > about how you intend to use them. Try to group the locks with the data they > > protect if this is not already the case. > > > Sure, though I feel the names are fairly descriptive. Not descriptive enough for me. Please add documentation. > > > > > >> + wait_queue_head_t wait; > >> + struct aspeed_xdma_client *current_client; > >> + > >> + u32 vga_phys; > >> + u32 vga_size; > >> + void *cmdq; > >> + void __iomem *vga_virt; > >> + dma_addr_t cmdq_vga_phys; > >> + void *cmdq_vga_virt; > >> + struct gen_pool *vga_pool; > >> +}; > >> + > >> +struct aspeed_xdma_client { > >> + struct aspeed_xdma *ctx; > >> + > >> + bool error; > >> + bool in_progress; > >> + void *virt; > >> + dma_addr_t phys; > >> + u32 size; > >> +}; > >> + > >> +static u32 aspeed_xdma_readl(struct aspeed_xdma *ctx, u8 reg) > >> +{ > >> + u32 v = readl(ctx->base + reg); > >> + > >> + dev_dbg(ctx->dev, "read %02x[%08x]\n", reg, v); > >> + return v; > >> +} > >> + > >> +static void aspeed_xdma_writel(struct aspeed_xdma *ctx, u8 reg, u32 val) > >> +{ > >> + writel(val, ctx->base + reg); > >> + dev_dbg(ctx->dev, "write %02x[%08x]\n", reg, readl(ctx->base + reg)); > > That readl() seems a bit paranoid, and is probably evaluated whether or not > > this dev_dbg() is enabled. What drove this approach? > > > There is some tricky masking of writes of many of the engine registers, > so it was helpful to see what the actual result is. I could drop it. Wouldn't you want to record what you were attempting to write vs the value that stuck then? > > > > > >> +} > >> + > >> +static void aspeed_xdma_init_eng(struct aspeed_xdma *ctx) > >> +{ > >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_endp, > >> + ctx->queue_entry_size * XDMA_NUM_CMDS); > >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_readp, > >> + XDMA_BMC_CMDQ_READP_MAGIC); > >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, 0); > >> + aspeed_xdma_writel(ctx, ctx->regs.control, ctx->control); > >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_addr, ctx->cmdq_vga_phys); > >> + > >> + ctx->cmd_idx = 0; > >> + ctx->in_progress = false; > >> +} > >> + > >> +static unsigned int aspeed_xdma_ast2500_set_cmd(struct aspeed_xdma *ctx, > >> + struct aspeed_xdma_op *op, > >> + u32 bmc_addr) > >> +{ > >> + u64 cmd = XDMA_CMD_AST2500_CMD_IRQ_EN | XDMA_CMD_AST2500_CMD_IRQ_BMC | > >> + XDMA_CMD_AST2500_CMD_ID; > >> + u64 cmd_pitch = (op->direction ? XDMA_CMD_AST2500_PITCH_UPSTREAM : 0) | > >> + XDMA_CMD_AST2500_PITCH_ID; > >> + unsigned int line_size; > >> + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; > >> + unsigned int line_no = 1; > >> + unsigned int pitch = 1; > >> + struct aspeed_xdma_cmd *ncmd = > >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); > >> + > >> + dev_dbg(ctx->dev, "xdma %s ast2500: bmc[%08x] len[%08x] host[%08x]\n", > >> + op->direction ? "upstream" : "downstream", bmc_addr, op->len, > >> + (u32)op->host_addr); > >> + > >> + if (op->len > XDMA_CMD_AST2500_CMD_LINE_SIZE) { > >> + unsigned int rem; > >> + unsigned int total; > >> + > >> + line_no = op->len / XDMA_CMD_AST2500_CMD_LINE_SIZE; > >> + total = XDMA_CMD_AST2500_CMD_LINE_SIZE * line_no; > >> + rem = (op->len - total) >> > >> + XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > >> + line_size = XDMA_CMD_AST2500_CMD_LINE_SIZE; > >> + pitch = line_size >> XDMA_CMD_AST2500_PITCH_SHIFT; > >> + line_size >>= XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > > Can we clean up the configuration of line_size and pitch here? They are set to > > constants in this case. > > > I think this is as clean as it gets. They're used later on so unless I > have more if statements it makes sense to set them here. No worries, was just curious. > > > > > >> + > >> + if (rem) { > >> + u32 rbmc = bmc_addr + total; > >> + struct aspeed_xdma_cmd *rcmd = > >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); > >> + > >> + rcmd->host_addr = op->host_addr + (u64)total; > >> + rcmd->pitch = cmd_pitch | > >> + ((u64)rbmc & XDMA_CMD_AST2500_PITCH_ADDR) | > >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, 1) | > >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, 1); > >> + rcmd->cmd = cmd | > >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, 1) | > >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, > >> + rem); > >> + > >> + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, > >> + 16, 1, rcmd, sizeof(*rcmd), true); > >> + > >> + cmd &= ~(XDMA_CMD_AST2500_CMD_IRQ_EN | > >> + XDMA_CMD_AST2500_CMD_IRQ_BMC); > >> + > >> + nidx = (nidx + 1) % XDMA_NUM_CMDS; > >> + } > >> + } else { > >> + line_size = op->len >> XDMA_CMD_AST2500_CMD_LINE_SIZE_SHIFT; > >> + } > >> + > >> + ncmd->host_addr = op->host_addr; > >> + ncmd->pitch = cmd_pitch | > >> + ((u64)bmc_addr & XDMA_CMD_AST2500_PITCH_ADDR) | > >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_HOST, pitch) | > >> + FIELD_PREP(XDMA_CMD_AST2500_PITCH_BMC, pitch); > >> + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_NO, line_no) | > >> + FIELD_PREP(XDMA_CMD_AST2500_CMD_LINE_SIZE, line_size); > >> + > >> + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, > >> + sizeof(*ncmd), true); > >> + > >> + return nidx; > >> +} > >> + > >> +static unsigned int aspeed_xdma_ast2600_set_cmd(struct aspeed_xdma *ctx, > >> + struct aspeed_xdma_op *op, > >> + u32 bmc_addr) > >> +{ > >> + u64 cmd = XDMA_CMD_AST2600_CMD_IRQ_BMC | > >> + (op->direction ? XDMA_CMD_AST2600_CMD_UPSTREAM : 0); > >> + unsigned int line_size; > >> + unsigned int nidx = (ctx->cmd_idx + 1) % XDMA_NUM_CMDS; > >> + unsigned int line_no = 1; > >> + unsigned int pitch = 1; > >> + struct aspeed_xdma_cmd *ncmd = > >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[ctx->cmd_idx]); > >> + > >> + if ((op->host_addr + op->len) & 0xffffffff00000000ULL) > >> + cmd |= XDMA_CMD_AST2600_CMD_64_EN; > >> + > >> + dev_dbg(ctx->dev, "xdma %s ast2600: bmc[%08x] len[%08x] " > >> + "host[%016llx]\n", op->direction ? "upstream" : "downstream", > >> + bmc_addr, op->len, op->host_addr); > >> + > >> + if (op->len > XDMA_CMD_AST2600_CMD_LINE_SIZE) { > >> + unsigned int rem; > >> + unsigned int total; > >> + > >> + line_no = op->len / XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; > >> + total = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE * line_no; > >> + rem = op->len - total; > >> + line_size = XDMA_CMD_AST2600_CMD_MULTILINE_SIZE; > >> + pitch = line_size; > >> + > >> + if (rem) { > >> + u32 rbmc = bmc_addr + total; > >> + struct aspeed_xdma_cmd *rcmd = > >> + &(((struct aspeed_xdma_cmd *)ctx->cmdq)[nidx]); > >> + > >> + rcmd->host_addr = op->host_addr + (u64)total; > >> + rcmd->pitch = > >> + ((u64)rbmc & XDMA_CMD_AST2600_PITCH_ADDR) | > >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, 1) | > >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, 1); > >> + rcmd->cmd = cmd | > >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, 1) | > >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, > >> + rem); > >> + > >> + print_hex_dump_debug("xdma rem", DUMP_PREFIX_OFFSET, > >> + 16, 1, rcmd, sizeof(*rcmd), true); > >> + > >> + cmd &= ~XDMA_CMD_AST2600_CMD_IRQ_BMC; > >> + > >> + nidx = (nidx + 1) % XDMA_NUM_CMDS; > >> + } > >> + } else { > >> + line_size = op->len; > >> + } > >> + > >> + ncmd->host_addr = op->host_addr; > >> + ncmd->pitch = ((u64)bmc_addr & XDMA_CMD_AST2600_PITCH_ADDR) | > >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_HOST, pitch) | > >> + FIELD_PREP(XDMA_CMD_AST2600_PITCH_BMC, pitch); > >> + ncmd->cmd = cmd | FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_NO, line_no) | > >> + FIELD_PREP(XDMA_CMD_AST2600_CMD_LINE_SIZE, line_size); > >> + > >> + print_hex_dump_debug("xdma cmd", DUMP_PREFIX_OFFSET, 16, 1, ncmd, > >> + sizeof(*ncmd), true); > >> + > >> + return nidx; > >> +} > >> + > >> +static void aspeed_xdma_start(struct aspeed_xdma *ctx, > >> + struct aspeed_xdma_op *op, u32 bmc_addr, > >> + struct aspeed_xdma_client *client) > >> +{ > >> + unsigned int nidx; > >> + > >> + mutex_lock(&ctx->start_lock); > >> + > >> + switch (ctx->version) { > >> + default: > >> + case xdma_ast2500: > >> + nidx = aspeed_xdma_ast2500_set_cmd(ctx, op, bmc_addr); > >> + break; > >> + case xdma_ast2600: > >> + nidx = aspeed_xdma_ast2600_set_cmd(ctx, op, bmc_addr); > >> + break; > >> + } > > What was the trade-off between the enum and function pointers? Is the > > enum approach strictly better for some reason? > > > No, I just decided to go with the switch. I think this will go away with better use of the of match data. > > > > > >> + > >> + memcpy(ctx->cmdq_vga_virt, ctx->cmdq, XDMA_CMDQ_SIZE); > >> + > >> + client->in_progress = true; > >> + ctx->current_client = client; > >> + > >> + ctx->in_progress = true; > > Do we need ctx->in_progress vs just using the NULL state of ctx->current_client? > > Is it possible for ctx->current_client to be set and not have a transfer in progress? > > > Good point, could probably drop that. > > > > > >> + ctx->upstream = op->direction ? true : false; > >> + > >> + aspeed_xdma_writel(ctx, ctx->regs.bmc_cmdq_writep, > >> + nidx * ctx->queue_entry_size); > >> + > >> + ctx->cmd_idx = nidx; > >> + > >> + mutex_unlock(&ctx->start_lock); > >> +} > >> + > >> +static void aspeed_xdma_done(struct aspeed_xdma *ctx, bool error) > >> +{ > >> + unsigned long flags; > >> + > >> + /* > >> + * Lock to make sure simultaneous reset and transfer complete don't > >> + * leave the client with the wrong error state. > >> + */ > >> + spin_lock_irqsave(&ctx->client_lock, flags); > >> + > >> + if (ctx->current_client) { > > You're testing ctx->current_client under ctx->client_lock here, but ctx->current_client > > is set under ctx->start_lock in aspeed_xdma_start(), which does not take > > ctx->client_lock. What data is protected by ctx->client_lock? What data is protected > > by ctx->client_lock? > > > client_lock is making sure the client.error and ctx.current_client are > updated atomically, as indicated by the comment. But you need to use a consistent lock, otherwise it's pointless. > It's impossible to be > starting a transfer at the same time, since ctx.in_progress would be true. > > > > > >> + ctx->current_client->error = error; > >> + ctx->current_client->in_progress = false; > >> + ctx->current_client = NULL; > >> + } > >> + > >> + spin_unlock_irqrestore(&ctx->client_lock, flags); > >> + > >> + ctx->in_progress = false; > > You set ctx->in_progress under ctx->start_lock in aspeed_xdma_start() but you > > do not take ctx->start_lock when setting it here. What data is ctx->start_lock > > protecting? > > > start_lock is making sure that the hardware register state, command > queue, and data associated with the current transfer are updated > atomically in the start function. It shouldn't be necessary to lock just > to reset in_progress Then why is it set under start_lock in aspeed_xdma_start()? These questions are why I asked you to document what locks are protecting which struct members above. > > > > > >> + wake_up_interruptible_all(&ctx->wait); > >> +} > >> + > >> +static irqreturn_t aspeed_xdma_irq(int irq, void *arg) > >> +{ > >> + struct aspeed_xdma *ctx = arg; > >> + u32 status = aspeed_xdma_readl(ctx, ctx->regs.status); > >> + > >> + if (status & ctx->status_bits.ds_dirty) { > >> + aspeed_xdma_done(ctx, true); > >> + } else { > >> + if (status & ctx->status_bits.us_comp) { > >> + if (ctx->upstream) > >> + aspeed_xdma_done(ctx, false); > >> + } > >> + > >> + if (status & ctx->status_bits.ds_comp) { > >> + if (!ctx->upstream) > >> + aspeed_xdma_done(ctx, false); > >> + } > >> + } > >> + > >> + aspeed_xdma_writel(ctx, ctx->regs.status, status); > >> + > >> + return IRQ_HANDLED; > >> +} > >> + > >> +static void aspeed_xdma_reset_finish(struct aspeed_xdma *ctx) > >> +{ > >> + unsigned long flags; > >> + > >> + spin_lock_irqsave(&ctx->reset_lock, flags); > >> + > >> + ctx->in_reset = false; Looking back at this, I'd interpret `in_reset` as answering the question "is the device unusable?" In which case, it's not usable until after we've completed the sleep below, so I'd hold off on clearing the in_reset state until after the sleep completes. I'm not sure that having `in_reset` represent the literal reset state of the device (given the additional constraints) as it stands is useful. > >> + reset_control_deassert(ctx->reset); > >> + > >> + spin_unlock_irqrestore(&ctx->reset_lock, flags); > >> + > >> + msleep(XDMA_RESET_TIME_MS); > >> + > >> + aspeed_xdma_init_eng(ctx); > >> + aspeed_xdma_done(ctx, true); > >> +} > >> + > >> +static bool aspeed_xdma_reset_start(struct aspeed_xdma *ctx) > >> +{ > >> + bool rc = true; > >> + unsigned long flags; > >> + > >> + spin_lock_irqsave(&ctx->reset_lock, flags); > >> + > >> + if (ctx->in_reset) { > >> + rc = false; > >> + } else { > >> + ctx->in_reset = true; > >> + reset_control_assert(ctx->reset); > >> + } > > Do start and finish need to be split in this way? > > > Yes, in order to prevent waiting for 10ms in interrupt context during > the reset. Do we absolutely need to kick the reset off from interrupt context? Why not just queue the entire reset job from interrupt context? > > > > > >> + > >> + spin_unlock_irqrestore(&ctx->reset_lock, flags); > >> + > >> + return rc; > >> +} > >> + > >> +static void aspeed_xdma_reset_work(struct work_struct *work) > >> +{ > >> + struct delayed_work *dwork = to_delayed_work(work); > >> + struct aspeed_xdma *ctx = container_of(dwork, struct aspeed_xdma, > >> + reset_work); > >> + > >> + /* > >> + * Lock to make sure operations aren't started while the engine is > >> + * in an undefined state coming out of reset and waiting to init. > > Shouldn't we be holding it across both aspeed_xdma_reset_start() and > > aspeed_xdma_reset_finish()? > > > No, it isn't necessary and also start_lock is a mutex which can't be > locked in interrupt context. It isn't necessary because a reset before, > during, or after starting a transfer will simply cause the transfer to > do nothing until the reset completes, when it will be put into an error > state. So why do we lock it at all? > > > > > >> + */ > >> + mutex_lock(&ctx->start_lock); > >> + > >> + aspeed_xdma_reset_finish(ctx); > >> + > >> + mutex_unlock(&ctx->start_lock); > >> +} > >> + > >> +static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) > >> +{ > >> + struct aspeed_xdma *ctx = arg; > >> + > >> + dev_dbg(ctx->dev, "pcie reset\n"); > >> + > >> + if (aspeed_xdma_reset_start(ctx)) > >> + schedule_delayed_work(&ctx->reset_work, > >> + msecs_to_jiffies(XDMA_RESET_TIME_MS)); With the suggestion above, instead of having aspeed_xdma_reset_start() / aspeed_xdma_reset_finish() and the existing implementation of aspeed_xdma_reset_work(), we'd have just aspeed_xdma_reset(), and aspeed_xdma_reset_work() would look like: static void aspeed_xdma_reset_work(struct work_struct *work) { struct delayed_work *dwork = to_delayed_work(work); struct aspeed_xdma *ctx = container_of(dwork, struct aspeed_xdma, reset_work); mutex_lock(&ctx->start_lock); aspeed_xdma_reset(ctx) mutex_unlock(&ctx->start_lock); } and aspeed_xdma_pcie_irq() becomes: static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) { struct aspeed_xdma *ctx = arg; unsigned long flags; dev_dbg(ctx->dev, "pcie reset\n"); spin_lock_irqsave(&ctx->reset_lock, flags); if (ctx->in_reset) { spin_unlock_irqrestore(&ctx->reset_lock, flags); return IRQ_HANDLED; } ctx->in_reset = true; spin_unlock_irqrestore(&ctx->reset_lock, flags); schedule_delayed_work(&ctx->reset_work, msecs_to_jiffies(XDMA_RESET_TIME_MS)); return IRQ_HANDLED; } > >> + return IRQ_HANDLED; > >> +} > >> + > >> +static int aspeed_xdma_init(struct aspeed_xdma *ctx) > >> +{ > >> + int rc; > >> + struct regmap *scu; > >> + u32 conf; > >> + u32 mem_size; > >> + u32 remap; > >> + u32 scu_bmc_class; > >> + u32 scu_pcie_conf; > >> + u32 scu_strap; > >> + u32 sdmc_remap_magic; > >> + u32 strap = 0; > >> + const u32 bmc = SCU_PCIE_CONF_BMC_EN | SCU_PCIE_CONF_BMC_EN_MSI | > >> + SCU_PCIE_CONF_BMC_EN_MCTP > > Do we need MCTP here? > > > I don't know, I used the default value which has MCTP enabled. Ok. I'd tend to clear it, but it probably doesn't matter. > > > > > > | SCU_PCIE_CONF_BMC_EN_IRQ | > >> + SCU_PCIE_CONF_BMC_EN_DMA; > >> + const u32 vga = SCU_PCIE_CONF_VGA_EN | SCU_PCIE_CONF_VGA_EN_MSI | > >> + SCU_PCIE_CONF_VGA_EN_MCTP > > Do we need MCTP here? > > > > | SCU_PCIE_CONF_VGA_EN_IRQ | > >> + SCU_PCIE_CONF_VGA_EN_DMA; > >> + u32 mem_sizes[4] = { 0x8000000, 0x10000000, 0x20000000, 0x40000000 }; > >> + const u32 vga_sizes[4] = { 0x800000, 0x1000000, 0x2000000, 0x4000000 }; > >> + void __iomem *sdmc_base = ioremap(SDMC_BASE, 0x100); > > Surely this should be a phandle from the devicetree? And what about conflicts > > with a potential SDMC driver? I don't think what you have is the right approach. > > > >> + > >> + if (!sdmc_base) { > >> + dev_err(ctx->dev, "Failed to ioremap mem controller regs.\n"); > >> + return -ENOMEM; > >> + } > >> + > >> + switch (ctx->version) { > >> + default: > >> + case xdma_ast2500: > >> + scu_bmc_class = SCU_AST2500_BMC_CLASS_REV; > >> + scu_pcie_conf = SCU_AST2500_PCIE_CONF; > >> + scu_strap = SCU_AST2500_STRAP; > >> + sdmc_remap_magic = SDMC_AST2500_REMAP_MAGIC; > > Can't this be described statically? The values should be derived from the > > devicetree compatible. > > > >> + > >> + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2500-scu"); > > I think we should do this via phandle in the devicetree. > > > >> + break; > >> + case xdma_ast2600: > >> + scu_bmc_class = SCU_AST2600_BMC_CLASS_REV; > >> + scu_pcie_conf = SCU_AST2600_PCIE_CONF; > >> + scu_strap = SCU_AST2600_STRAP; > >> + sdmc_remap_magic = SDMC_AST2600_REMAP_MAGIC; > > Can't this be described statically? The values should be derived from the > > devicetree compatible. > > > >> + > >> + mem_sizes[0] *= 2; > >> + mem_sizes[1] *= 2; > >> + mem_sizes[2] *= 2; > >> + mem_sizes[3] *= 2; > > Same query as above. > > > >> + > >> + scu = syscon_regmap_lookup_by_compatible("aspeed,ast2600-scu"); > > I think we should do this via phandle in the devicetree. > > > >> + break; > >> + }; > >> + > >> + if (!scu) { > >> + dev_err(ctx->dev, "Failed to grab SCU regs.\n"); > >> + return -ENOMEM; > >> + } > >> + > >> + /* Set SOC to use the BMC PCIe device and set the device class code */ > >> + regmap_update_bits(scu, scu_pcie_conf, bmc | vga, bmc); > >> + regmap_write(scu, scu_bmc_class, SCU_BMC_CLASS_REV_XDMA); > > This should be selectable, probably via the devicetree. > > > >> + > >> + /* > >> + * Calculate the VGA memory size and physical address from the SCU and > >> + * memory controller registers. > >> + */ > >> + regmap_read(scu, scu_strap, &strap); > >> + > >> + switch (ctx->version) { > >> + case xdma_ast2500: > >> + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2500_STRAP_VGA_MEM, > >> + strap)]; > >> + break; > >> + case xdma_ast2600: > >> + ctx->vga_size = vga_sizes[FIELD_GET(SCU_AST2600_STRAP_VGA_MEM, > >> + strap)]; > >> + break; > > Can't this be described statically? The values should be derived from the > > devicetree compatible. > > > >> + } > >> + > >> + conf = readl(sdmc_base + SDMC_CONF); > >> + remap = readl(sdmc_base + SDMC_REMAP); > >> + remap |= sdmc_remap_magic; > >> + writel(remap, sdmc_base + SDMC_REMAP); > >> + mem_size = mem_sizes[conf & SDMC_CONF_MEM]; > > See previous objection to this. > > > >> + > >> + iounmap(sdmc_base); > >> + > >> + ctx->vga_phys = (mem_size - ctx->vga_size) + 0x80000000; > > RAM base should be extracted from the devicetree. Better yet, the VGA space should > > be extracted from the devicetree and this calculation avoided altogether. > > > Ok, I'll use the devicetree for all this instead. > > > > > >> + > >> + ctx->cmdq = devm_kzalloc(ctx->dev, XDMA_CMDQ_SIZE, GFP_KERNEL); > >> + if (!ctx->cmdq) { > >> + dev_err(ctx->dev, "Failed to allocate command queue.\n"); > >> + return -ENOMEM; > >> + } > >> + > >> + ctx->vga_virt = ioremap(ctx->vga_phys, ctx->vga_size); > > Use devm_ioremap() to avoid the cleanup. > > > >> + if (!ctx->vga_virt) { > >> + dev_err(ctx->dev, "Failed to ioremap VGA memory.\n"); > >> + return -ENOMEM; > >> + } > >> + > >> + rc = gen_pool_add_virt(ctx->vga_pool, (unsigned long)ctx->vga_virt, > >> + ctx->vga_phys, ctx->vga_size, -1); > >> + if (rc) { > >> + dev_err(ctx->dev, "Failed to add memory to genalloc pool.\n"); > >> + iounmap(ctx->vga_virt); > >> + return rc; > >> + } > >> + > >> + ctx->cmdq_vga_virt = gen_pool_dma_alloc(ctx->vga_pool, XDMA_CMDQ_SIZE, > >> + &ctx->cmdq_vga_phys); > > Can you educate me on this a little? Why is a command queue being allocated in > > memory writable by the host? Aren't we opening ourselves up to potential corruption? > > > In it's current configuration, the engine can't access anything except > the VGA memory. That includes the location for the command queue. Not > sure much can be done about this. Okay, yes, not much can be done about that. > What path can the host use to access this memory though? The host can drive DMAs to the VGA region as well, "independent" of the BMC. > > > > > >> + if (!ctx->cmdq_vga_virt) { > >> + dev_err(ctx->dev, "Failed to genalloc cmdq.\n"); > >> + iounmap(ctx->vga_virt); > >> + return -ENOMEM; > >> + } > >> + > >> + dev_dbg(ctx->dev, "VGA mapped at phys[%08x], size[%08x].\n", > >> + ctx->vga_phys, ctx->vga_size); > >> + > >> + return 0; > >> +} > >> + > >> +static int aspeed_xdma_probe(struct platform_device *pdev) > >> +{ > >> + int irq; > >> + int pcie_irq; > >> + int rc; > >> + enum versions vs = xdma_ast2500; > >> + struct device *dev = &pdev->dev; > >> + struct aspeed_xdma *ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); > >> + const void *md = of_device_get_match_data(dev); > >> + > >> + if (!ctx) > >> + return -ENOMEM; > >> + > >> + if (md) > >> + vs = (enum versions)md; > >> + > >> + switch (vs) { > >> + default: > >> + case xdma_ast2500: > >> + ctx->version = xdma_ast2500; > >> + ctx->control = XDMA_AST2500_CTRL_US_COMP | > >> + XDMA_AST2500_CTRL_DS_COMP | > >> + XDMA_AST2500_CTRL_DS_DIRTY | > >> + FIELD_PREP(XDMA_AST2500_CTRL_DS_SIZE, > >> + XDMA_DS_PCIE_REQ_SIZE_256) | > >> + XDMA_AST2500_CTRL_DS_TIMEOUT | > >> + XDMA_AST2500_CTRL_DS_CHECK_ID; > >> + ctx->queue_entry_size = XDMA_AST2500_QUEUE_ENTRY_SIZE; > >> + ctx->regs.bmc_cmdq_addr = XDMA_AST2500_BMC_CMDQ_ADDR; > >> + ctx->regs.bmc_cmdq_endp = XDMA_AST2500_BMC_CMDQ_ENDP; > >> + ctx->regs.bmc_cmdq_writep = XDMA_AST2500_BMC_CMDQ_WRITEP; > >> + ctx->regs.bmc_cmdq_readp = XDMA_AST2500_BMC_CMDQ_READP; > >> + ctx->regs.control = XDMA_AST2500_CTRL; > >> + ctx->regs.status = XDMA_AST2500_STATUS; > >> + ctx->status_bits.us_comp = XDMA_AST2500_STATUS_US_COMP; > >> + ctx->status_bits.ds_comp = XDMA_AST2500_STATUS_DS_COMP; > >> + ctx->status_bits.ds_dirty = XDMA_AST2500_STATUS_DS_DIRTY; > > Why not include all this in the match data? > > > Sure. > > > > > >> + break; > >> + case xdma_ast2600: > >> + ctx->version = xdma_ast2600; > >> + ctx->control = XDMA_AST2600_CTRL_US_COMP | > >> + XDMA_AST2600_CTRL_DS_COMP | > >> + XDMA_AST2600_CTRL_DS_DIRTY | > >> + FIELD_PREP(XDMA_AST2600_CTRL_DS_SIZE, > >> + XDMA_DS_PCIE_REQ_SIZE_256); > >> + ctx->queue_entry_size = XDMA_AST2600_QUEUE_ENTRY_SIZE; > >> + ctx->regs.bmc_cmdq_addr = XDMA_AST2600_BMC_CMDQ_ADDR; > >> + ctx->regs.bmc_cmdq_endp = XDMA_AST2600_BMC_CMDQ_ENDP; > >> + ctx->regs.bmc_cmdq_writep = XDMA_AST2600_BMC_CMDQ_WRITEP; > >> + ctx->regs.bmc_cmdq_readp = XDMA_AST2600_BMC_CMDQ_READP; > >> + ctx->regs.control = XDMA_AST2600_CTRL; > >> + ctx->regs.status = XDMA_AST2600_STATUS; > >> + ctx->status_bits.us_comp = XDMA_AST2600_STATUS_US_COMP; > >> + ctx->status_bits.ds_comp = XDMA_AST2600_STATUS_DS_COMP; > >> + ctx->status_bits.ds_dirty = XDMA_AST2600_STATUS_DS_DIRTY; > > Same query as above > > > >> + break; > >> + }; > >> + > >> + ctx->dev = dev; > >> + platform_set_drvdata(pdev, ctx); > >> + mutex_init(&ctx->start_lock); > >> + INIT_DELAYED_WORK(&ctx->reset_work, aspeed_xdma_reset_work); > >> + spin_lock_init(&ctx->client_lock); > >> + spin_lock_init(&ctx->reset_lock); > >> + init_waitqueue_head(&ctx->wait); > >> + > >> + ctx->base = devm_platform_ioremap_resource(pdev, 0); > >> + if (IS_ERR(ctx->base)) { > >> + dev_err(dev, "Unable to ioremap registers.\n"); > >> + return PTR_ERR(ctx->base); > >> + } > >> + > >> + irq = platform_get_irq(pdev, 0); > >> + if (irq < 0) { > >> + dev_err(dev, "Unable to find IRQ.\n"); > >> + return -ENODEV; > >> + } > >> + > >> + rc = devm_request_irq(dev, irq, aspeed_xdma_irq, IRQF_SHARED, > >> + DEVICE_NAME, ctx); > >> + if (rc < 0) { > >> + dev_err(dev, "Unable to request IRQ %d.\n", irq); > >> + return rc; > >> + } > >> + > >> + ctx->clock = devm_clk_get(dev, NULL); > >> + if (IS_ERR(ctx->clock)) { > >> + dev_err(dev, "Unable to request clock.\n"); > >> + return PTR_ERR(ctx->clock); > >> + } > >> + > >> + ctx->reset = devm_reset_control_get_exclusive(dev, NULL); > >> + if (IS_ERR(ctx->reset)) { > >> + dev_err(dev, "Unable to request reset control.\n"); > >> + return PTR_ERR(ctx->reset); > >> + } > >> + > >> + ctx->vga_pool = devm_gen_pool_create(dev, ilog2(PAGE_SIZE), -1, NULL); > >> + if (!ctx->vga_pool) { > >> + dev_err(dev, "Unable to setup genalloc pool.\n"); > >> + return -ENOMEM; > >> + } > >> + > >> + clk_prepare_enable(ctx->clock); > > Check for errors > > > >> + msleep(XDMA_RESET_TIME_MS); > >> + > >> + reset_control_deassert(ctx->reset); > > Check for errors. > > > >> + msleep(XDMA_RESET_TIME_MS); > >> + > >> + rc = aspeed_xdma_init(ctx); > >> + if (rc) { > >> + reset_control_assert(ctx->reset); > >> + clk_disable_unprepare(ctx->clock); > >> + return rc; > >> + } > >> + > >> + aspeed_xdma_init_eng(ctx); > >> + > >> + /* > >> + * This interrupt could fire immediately so only request it once the > >> + * engine and driver are initialized. > >> + */ > >> + pcie_irq = platform_get_irq(pdev, 1); > >> + if (pcie_irq < 0) { > >> + dev_warn(dev, "Unable to find PCI-E IRQ.\n"); > >> + } else { > >> + rc = devm_request_irq(dev, pcie_irq, aspeed_xdma_pcie_irq, > >> + IRQF_SHARED, DEVICE_NAME, ctx); > >> + if (rc < 0) > >> + dev_warn(dev, "Unable to request PCI-E IRQ %d.\n", rc); > >> + } > >> + > >> + return 0; > >> +} > >> + > >> +static int aspeed_xdma_remove(struct platform_device *pdev) > >> +{ > >> + struct aspeed_xdma *ctx = platform_get_drvdata(pdev); > >> + > >> + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, > >> + XDMA_CMDQ_SIZE); > > You've used devm_gen_pool_create(), so no need to explicitly free it. > > > >> + iounmap(ctx->vga_virt); > > This is unnecessary if we use devm_ioremap() as suggested above. > > > >> + > >> + reset_control_assert(ctx->reset); > >> + clk_disable_unprepare(ctx->clock); > >> + > >> + return 0; > >> +} > >> + > >> +static const struct of_device_id aspeed_xdma_match[] = { > >> + { > >> + .compatible = "aspeed,ast2500-xdma", > >> + .data = (void *)xdma_ast2500, > > I'd prefer you create a struct as discussed throughout. > > > >> + }, > >> + { > >> + .compatible = "aspeed,ast2600-xdma", > >> + .data = (void *)xdma_ast2600, > >> + }, > >> + { }, > >> +}; > >> + > >> +static struct platform_driver aspeed_xdma_driver = { > >> + .probe = aspeed_xdma_probe, > >> + .remove = aspeed_xdma_remove, > >> + .driver = { > >> + .name = DEVICE_NAME, > >> + .of_match_table = aspeed_xdma_match, > >> + }, > >> +}; > >> + > >> +module_platform_driver(aspeed_xdma_driver); > >> + > >> +MODULE_AUTHOR("Eddie James"); > >> +MODULE_DESCRIPTION("Aspeed XDMA Engine Driver"); > >> +MODULE_LICENSE("GPL v2"); > >> diff --git a/include/uapi/linux/aspeed-xdma.h b/include/uapi/linux/aspeed-xdma.h > >> new file mode 100644 > >> index 0000000..7f3a031 > >> --- /dev/null > >> +++ b/include/uapi/linux/aspeed-xdma.h > >> @@ -0,0 +1,49 @@ > >> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ > >> +/* Copyright IBM Corp 2019 */ > >> + > >> +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_ > >> +#define _UAPI_LINUX_ASPEED_XDMA_H_ > >> + > >> +#include <linux/types.h> > >> + > >> +/* > >> + * aspeed_xdma_direction > >> + * > >> + * ASPEED_XDMA_DIRECTION_DOWNSTREAM: transfers data from the host to the BMC > >> + * > >> + * ASPEED_XDMA_DIRECTION_UPSTREAM: transfers data from the BMC to the host > >> + * > >> + * ASPEED_XDMA_DIRECTION_RESET: resets the XDMA engine > >> + */ > >> +enum aspeed_xdma_direction { > >> + ASPEED_XDMA_DIRECTION_DOWNSTREAM = 0, > >> + ASPEED_XDMA_DIRECTION_UPSTREAM, > >> + ASPEED_XDMA_DIRECTION_RESET, > >> +}; > >> + > >> +/* > >> + * aspeed_xdma_op > >> + * > >> + * host_addr: the DMA address on the host side, typically configured by PCI > >> + * subsystem > >> + * > >> + * len: the size of the transfer in bytes > >> + * > >> + * direction: an enumerator indicating the direction of the DMA operation; see > >> + * enum aspeed_xdma_direction > >> + * > >> + * bmc_addr: the virtual address to DMA on the BMC side; this parameter is > >> + * unused on current platforms since the XDMA engine is restricted to > >> + * accessing the VGA memory space > > This doesn't make sense to me - if it's a virtual address then talking about the VGA > > space doesn't make sense to me as where the memory lives is an implementation > > detail. If the parameter is a physical address then it makes sense, but we'd always > > want it specified regardless? > > > As the comment says, it's unused. I added the parameter to future-proof > the API in case the engine is configured differently, allowing arbitrary > memory access by the engine. In that case, the user would pass the > virtual address of their data. Hmm. Wouldn't we still use the mmap() interface and derive the addresses that way regardless of where the backing memory is located? What do you think the model of interaction would be if we made use of this parameter? Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 07/12] drivers/soc: xdma: Add user interface 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (5 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-24 23:59 ` Andrew Jeffery 2019-11-08 20:18 ` [PATCH 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James ` (4 subsequent siblings) 11 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree This commits adds a miscdevice to provide a user interface to the XDMA engine. The interface provides the write operation to start DMA operations. The DMA parameters are passed as the data to the write call. The actual data to transfer is NOT passed through write. Note that both directions of DMA operation are accomplished through the write command; BMC to host and host to BMC. The XDMA engine is restricted to only accessing the reserved memory space on the AST2500, typically used by the VGA. For this reason, the VGA memory space is pooled and allocated with genalloc. Users calling mmap allocate pages from this pool for their usage. The space allocated by a client will be the space used in the DMA operation. For an "upstream" (BMC to host) operation, the data in the client's area will be transferred to the host. For a "downstream" (host to BMC) operation, the host data will be placed in the client's memory area. Poll is also provided in order to determine when the DMA operation is complete for non-blocking IO. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- drivers/soc/aspeed/aspeed-xdma.c | 223 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 223 insertions(+) diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c index 99041a6..3d37582 100644 --- a/drivers/soc/aspeed/aspeed-xdma.c +++ b/drivers/soc/aspeed/aspeed-xdma.c @@ -64,6 +64,9 @@ #define XDMA_CMDQ_SIZE PAGE_SIZE #define XDMA_NUM_CMDS \ (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) +#define XDMA_OP_SIZE_MAX sizeof(struct aspeed_xdma_op) +#define XDMA_OP_SIZE_MIN \ + (sizeof(struct aspeed_xdma_op) - sizeof(u64)) /* Aspeed specification requires 10ms after switching the reset line */ #define XDMA_RESET_TIME_MS 10 @@ -216,6 +219,7 @@ struct aspeed_xdma { bool in_reset; bool upstream; unsigned int cmd_idx; + struct mutex file_lock; struct mutex start_lock; struct delayed_work reset_work; spinlock_t client_lock; @@ -230,6 +234,8 @@ struct aspeed_xdma { dma_addr_t cmdq_vga_phys; void *cmdq_vga_virt; struct gen_pool *vga_pool; + + struct miscdevice misc; }; struct aspeed_xdma_client { @@ -557,6 +563,204 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, void *arg) return IRQ_HANDLED; } +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf, + size_t len, loff_t *offset) +{ + int rc; + struct aspeed_xdma_op op; + struct aspeed_xdma_client *client = file->private_data; + struct aspeed_xdma *ctx = client->ctx; + u32 offs = client->phys ? (client->phys - ctx->vga_phys) : + XDMA_CMDQ_SIZE; + + if (len < XDMA_OP_SIZE_MIN) + return -EINVAL; + + if (len > XDMA_OP_SIZE_MAX) + len = XDMA_OP_SIZE_MAX; + + rc = copy_from_user(&op, buf, len); + if (rc) + return rc; + + if (op.direction == ASPEED_XDMA_DIRECTION_RESET) { + mutex_lock(&ctx->start_lock); + + if (aspeed_xdma_reset_start(ctx)) { + msleep(XDMA_RESET_TIME_MS); + + aspeed_xdma_reset_finish(ctx); + } + + mutex_unlock(&ctx->start_lock); + + return len; + } else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) { + return -EINVAL; + } + + if (op.len > ctx->vga_size - offs) + return -EINVAL; + + if (file->f_flags & O_NONBLOCK) { + if (!mutex_trylock(&ctx->file_lock)) + return -EAGAIN; + + if (ctx->in_progress || ctx->in_reset) { + mutex_unlock(&ctx->file_lock); + return -EAGAIN; + } + } else { + mutex_lock(&ctx->file_lock); + + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress && + !ctx->in_reset); + if (rc) { + mutex_unlock(&ctx->file_lock); + return -EINTR; + } + } + + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client); + + mutex_unlock(&ctx->file_lock); + + if (!(file->f_flags & O_NONBLOCK)) { + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress); + if (rc) + return -EINTR; + + if (client->error) + return -EIO; + } + + return len; +} + +static __poll_t aspeed_xdma_poll(struct file *file, + struct poll_table_struct *wait) +{ + __poll_t mask = 0; + __poll_t req = poll_requested_events(wait); + struct aspeed_xdma_client *client = file->private_data; + struct aspeed_xdma *ctx = client->ctx; + + if (req & (EPOLLIN | EPOLLRDNORM)) { + if (client->in_progress) + poll_wait(file, &ctx->wait, wait); + + if (!client->in_progress) { + if (client->error) + mask |= EPOLLERR; + else + mask |= EPOLLIN | EPOLLRDNORM; + } + } + + if (req & (EPOLLOUT | EPOLLWRNORM)) { + if (ctx->in_progress) + poll_wait(file, &ctx->wait, wait); + + if (!ctx->in_progress) + mask |= EPOLLOUT | EPOLLWRNORM; + } + + return mask; +} + +static void aspeed_xdma_vma_close(struct vm_area_struct *vma) +{ + struct aspeed_xdma_client *client = vma->vm_private_data; + + gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt, + client->size); + + client->virt = NULL; + client->phys = 0; + client->size = 0; +} + +static const struct vm_operations_struct aspeed_xdma_vm_ops = { + .close = aspeed_xdma_vma_close, +}; + +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma) +{ + int rc; + struct aspeed_xdma_client *client = file->private_data; + struct aspeed_xdma *ctx = client->ctx; + + /* restrict file to one mapping */ + if (client->size) + return -ENOMEM; + + client->size = vma->vm_end - vma->vm_start; + client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size, + &client->phys); + if (!client->virt) { + client->phys = 0; + client->size = 0; + return -ENOMEM; + } + + vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT; + vma->vm_ops = &aspeed_xdma_vm_ops; + vma->vm_private_data = client; + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + + rc = io_remap_pfn_range(vma, vma->vm_start, client->phys >> PAGE_SHIFT, + client->size, vma->vm_page_prot); + if (rc) { + gen_pool_free(ctx->vga_pool, (unsigned long)client->virt, + client->size); + + client->virt = NULL; + client->phys = 0; + client->size = 0; + return rc; + } + + dev_dbg(ctx->dev, "mmap: v[%08lx] to p[%08x], s[%08x]\n", + vma->vm_start, (u32)client->phys, client->size); + + return 0; +} + +static int aspeed_xdma_open(struct inode *inode, struct file *file) +{ + struct miscdevice *misc = file->private_data; + struct aspeed_xdma *ctx = container_of(misc, struct aspeed_xdma, misc); + struct aspeed_xdma_client *client = kzalloc(sizeof(*client), + GFP_KERNEL); + + if (!client) + return -ENOMEM; + + client->ctx = ctx; + file->private_data = client; + return 0; +} + +static int aspeed_xdma_release(struct inode *inode, struct file *file) +{ + struct aspeed_xdma_client *client = file->private_data; + + if (client->ctx->current_client == client) + client->ctx->current_client = NULL; + + kfree(client); + return 0; +} + +static const struct file_operations aspeed_xdma_fops = { + .owner = THIS_MODULE, + .write = aspeed_xdma_write, + .poll = aspeed_xdma_poll, + .mmap = aspeed_xdma_mmap, + .open = aspeed_xdma_open, + .release = aspeed_xdma_release, +}; + static int aspeed_xdma_init(struct aspeed_xdma *ctx) { int rc; @@ -739,6 +943,7 @@ static int aspeed_xdma_probe(struct platform_device *pdev) ctx->dev = dev; platform_set_drvdata(pdev, ctx); + mutex_init(&ctx->file_lock); mutex_init(&ctx->start_lock); INIT_DELAYED_WORK(&ctx->reset_work, aspeed_xdma_reset_work); spin_lock_init(&ctx->client_lock); @@ -797,6 +1002,23 @@ static int aspeed_xdma_probe(struct platform_device *pdev) aspeed_xdma_init_eng(ctx); + ctx->misc.minor = MISC_DYNAMIC_MINOR; + ctx->misc.fops = &aspeed_xdma_fops; + ctx->misc.name = "aspeed-xdma"; + ctx->misc.parent = dev; + rc = misc_register(&ctx->misc); + if (rc) { + dev_err(dev, "Unable to register xdma miscdevice.\n"); + + gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, + XDMA_CMDQ_SIZE); + iounmap(ctx->vga_virt); + + reset_control_assert(ctx->reset); + clk_disable_unprepare(ctx->clock); + return rc; + } + /* * This interrupt could fire immediately so only request it once the * engine and driver are initialized. @@ -818,6 +1040,7 @@ static int aspeed_xdma_remove(struct platform_device *pdev) { struct aspeed_xdma *ctx = platform_get_drvdata(pdev); + misc_deregister(&ctx->misc); gen_pool_free(ctx->vga_pool, (unsigned long)ctx->cmdq_vga_virt, XDMA_CMDQ_SIZE); iounmap(ctx->vga_virt); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 07/12] drivers/soc: xdma: Add user interface 2019-11-08 20:18 ` [PATCH 07/12] drivers/soc: xdma: Add user interface Eddie James @ 2019-11-24 23:59 ` Andrew Jeffery 2019-11-25 19:44 ` Eddie James 0 siblings, 1 reply; 24+ messages in thread From: Andrew Jeffery @ 2019-11-24 23:59 UTC (permalink / raw) To: Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: > This commits adds a miscdevice to provide a user interface to the XDMA > engine. The interface provides the write operation to start DMA > operations. The DMA parameters are passed as the data to the write call. > The actual data to transfer is NOT passed through write. Note that both > directions of DMA operation are accomplished through the write command; > BMC to host and host to BMC. > > The XDMA engine is restricted to only accessing the reserved memory > space on the AST2500, typically used by the VGA. For this reason, the > VGA memory space is pooled and allocated with genalloc. Users calling > mmap allocate pages from this pool for their usage. The space allocated > by a client will be the space used in the DMA operation. For an > "upstream" (BMC to host) operation, the data in the client's area will > be transferred to the host. For a "downstream" (host to BMC) operation, > the host data will be placed in the client's memory area. > > Poll is also provided in order to determine when the DMA operation is > complete for non-blocking IO. > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > drivers/soc/aspeed/aspeed-xdma.c | 223 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 223 insertions(+) > > diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c > index 99041a6..3d37582 100644 > --- a/drivers/soc/aspeed/aspeed-xdma.c > +++ b/drivers/soc/aspeed/aspeed-xdma.c > @@ -64,6 +64,9 @@ > #define XDMA_CMDQ_SIZE PAGE_SIZE > #define XDMA_NUM_CMDS \ > (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) > +#define XDMA_OP_SIZE_MAX sizeof(struct aspeed_xdma_op) > +#define XDMA_OP_SIZE_MIN \ > + (sizeof(struct aspeed_xdma_op) - sizeof(u64)) > > /* Aspeed specification requires 10ms after switching the reset line */ > #define XDMA_RESET_TIME_MS 10 > @@ -216,6 +219,7 @@ struct aspeed_xdma { > bool in_reset; > bool upstream; > unsigned int cmd_idx; > + struct mutex file_lock; Please add documentation about what data file_lock is protecting. > struct mutex start_lock; > struct delayed_work reset_work; > spinlock_t client_lock; > @@ -230,6 +234,8 @@ struct aspeed_xdma { > dma_addr_t cmdq_vga_phys; > void *cmdq_vga_virt; > struct gen_pool *vga_pool; > + > + struct miscdevice misc; > }; > > struct aspeed_xdma_client { > @@ -557,6 +563,204 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, > void *arg) > return IRQ_HANDLED; > } > > +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf, > + size_t len, loff_t *offset) > +{ > + int rc; > + struct aspeed_xdma_op op; > + struct aspeed_xdma_client *client = file->private_data; > + struct aspeed_xdma *ctx = client->ctx; > + u32 offs = client->phys ? (client->phys - ctx->vga_phys) : > + XDMA_CMDQ_SIZE; > + > + if (len < XDMA_OP_SIZE_MIN) > + return -EINVAL; > + > + if (len > XDMA_OP_SIZE_MAX) > + len = XDMA_OP_SIZE_MAX; Isn't this an EINVAL case as well? > + > + rc = copy_from_user(&op, buf, len); > + if (rc) > + return rc; > + > + if (op.direction == ASPEED_XDMA_DIRECTION_RESET) { Seems a bit abusive to use the direction field to issue a reset. > + mutex_lock(&ctx->start_lock); > + > + if (aspeed_xdma_reset_start(ctx)) { > + msleep(XDMA_RESET_TIME_MS); > + > + aspeed_xdma_reset_finish(ctx); > + } > + > + mutex_unlock(&ctx->start_lock); > + > + return len; > + } else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) { > + return -EINVAL; > + } > + > + if (op.len > ctx->vga_size - offs) > + return -EINVAL; > + > + if (file->f_flags & O_NONBLOCK) { > + if (!mutex_trylock(&ctx->file_lock)) > + return -EAGAIN; > + > + if (ctx->in_progress || ctx->in_reset) { ctx->in_progress was protected by a lock that isn't file_lock, so this looks wrong. > + mutex_unlock(&ctx->file_lock); > + return -EAGAIN; > + } > + } else { > + mutex_lock(&ctx->file_lock); > + > + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress && > + !ctx->in_reset); As above. > + if (rc) { > + mutex_unlock(&ctx->file_lock); > + return -EINTR; > + } > + } > + > + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client); > + > + mutex_unlock(&ctx->file_lock); > + > + if (!(file->f_flags & O_NONBLOCK)) { > + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress); > + if (rc) > + return -EINTR; > + > + if (client->error) > + return -EIO; What's the client->error value? Can it be more informative? > + } > + > + return len; We've potentially truncated len above (in the len > XDMA_OP_SIZE_MAX), which leads to some ambiguity with the write() syscall given that it can potentially return less than the requested length. This is one such case, but the caller probably shouldn't attempt a follow-up write. This would go away if we make the len > XDMA_OP_SIZE_MAX an EINVAL case as suggested agove. > +} > + > +static __poll_t aspeed_xdma_poll(struct file *file, > + struct poll_table_struct *wait) > +{ > + __poll_t mask = 0; > + __poll_t req = poll_requested_events(wait); > + struct aspeed_xdma_client *client = file->private_data; > + struct aspeed_xdma *ctx = client->ctx; > + > + if (req & (EPOLLIN | EPOLLRDNORM)) { > + if (client->in_progress) > + poll_wait(file, &ctx->wait, wait); > + > + if (!client->in_progress) { > + if (client->error) > + mask |= EPOLLERR; > + else > + mask |= EPOLLIN | EPOLLRDNORM; > + } > + } > + > + if (req & (EPOLLOUT | EPOLLWRNORM)) { > + if (ctx->in_progress) > + poll_wait(file, &ctx->wait, wait); > + > + if (!ctx->in_progress) > + mask |= EPOLLOUT | EPOLLWRNORM; > + } > + > + return mask; > +} > + > +static void aspeed_xdma_vma_close(struct vm_area_struct *vma) > +{ > + struct aspeed_xdma_client *client = vma->vm_private_data; > + > + gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt, > + client->size); > + > + client->virt = NULL; > + client->phys = 0; > + client->size = 0; > +} > + > +static const struct vm_operations_struct aspeed_xdma_vm_ops = { > + .close = aspeed_xdma_vma_close, > +}; > + > +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma) > +{ > + int rc; > + struct aspeed_xdma_client *client = file->private_data; > + struct aspeed_xdma *ctx = client->ctx; > + > + /* restrict file to one mapping */ > + if (client->size) > + return -ENOMEM; Can we do better with the error code here? > + > + client->size = vma->vm_end - vma->vm_start; > + client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size, > + &client->phys); > + if (!client->virt) { > + client->phys = 0; > + client->size = 0; > + return -ENOMEM; > + } > + > + vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT; Where does client->phys get set? Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 07/12] drivers/soc: xdma: Add user interface 2019-11-24 23:59 ` Andrew Jeffery @ 2019-11-25 19:44 ` Eddie James 2019-11-26 3:30 ` Andrew Jeffery 0 siblings, 1 reply; 24+ messages in thread From: Eddie James @ 2019-11-25 19:44 UTC (permalink / raw) To: Andrew Jeffery, Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On 11/24/19 5:59 PM, Andrew Jeffery wrote: > > On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: >> This commits adds a miscdevice to provide a user interface to the XDMA >> engine. The interface provides the write operation to start DMA >> operations. The DMA parameters are passed as the data to the write call. >> The actual data to transfer is NOT passed through write. Note that both >> directions of DMA operation are accomplished through the write command; >> BMC to host and host to BMC. >> >> The XDMA engine is restricted to only accessing the reserved memory >> space on the AST2500, typically used by the VGA. For this reason, the >> VGA memory space is pooled and allocated with genalloc. Users calling >> mmap allocate pages from this pool for their usage. The space allocated >> by a client will be the space used in the DMA operation. For an >> "upstream" (BMC to host) operation, the data in the client's area will >> be transferred to the host. For a "downstream" (host to BMC) operation, >> the host data will be placed in the client's memory area. >> >> Poll is also provided in order to determine when the DMA operation is >> complete for non-blocking IO. >> >> Signed-off-by: Eddie James <eajames@linux.ibm.com> >> --- >> drivers/soc/aspeed/aspeed-xdma.c | 223 +++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 223 insertions(+) >> >> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c >> index 99041a6..3d37582 100644 >> --- a/drivers/soc/aspeed/aspeed-xdma.c >> +++ b/drivers/soc/aspeed/aspeed-xdma.c >> @@ -64,6 +64,9 @@ >> #define XDMA_CMDQ_SIZE PAGE_SIZE >> #define XDMA_NUM_CMDS \ >> (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) >> +#define XDMA_OP_SIZE_MAX sizeof(struct aspeed_xdma_op) >> +#define XDMA_OP_SIZE_MIN \ >> + (sizeof(struct aspeed_xdma_op) - sizeof(u64)) >> >> /* Aspeed specification requires 10ms after switching the reset line */ >> #define XDMA_RESET_TIME_MS 10 >> @@ -216,6 +219,7 @@ struct aspeed_xdma { >> bool in_reset; >> bool upstream; >> unsigned int cmd_idx; >> + struct mutex file_lock; > Please add documentation about what data file_lock is protecting. > >> struct mutex start_lock; >> struct delayed_work reset_work; >> spinlock_t client_lock; >> @@ -230,6 +234,8 @@ struct aspeed_xdma { >> dma_addr_t cmdq_vga_phys; >> void *cmdq_vga_virt; >> struct gen_pool *vga_pool; >> + >> + struct miscdevice misc; >> }; >> >> struct aspeed_xdma_client { >> @@ -557,6 +563,204 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, >> void *arg) >> return IRQ_HANDLED; >> } >> >> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf, >> + size_t len, loff_t *offset) >> +{ >> + int rc; >> + struct aspeed_xdma_op op; >> + struct aspeed_xdma_client *client = file->private_data; >> + struct aspeed_xdma *ctx = client->ctx; >> + u32 offs = client->phys ? (client->phys - ctx->vga_phys) : >> + XDMA_CMDQ_SIZE; >> + >> + if (len < XDMA_OP_SIZE_MIN) >> + return -EINVAL; >> + >> + if (len > XDMA_OP_SIZE_MAX) >> + len = XDMA_OP_SIZE_MAX; > Isn't this an EINVAL case as well? Perhaps so. > >> + >> + rc = copy_from_user(&op, buf, len); >> + if (rc) >> + return rc; >> + >> + if (op.direction == ASPEED_XDMA_DIRECTION_RESET) { > Seems a bit abusive to use the direction field to issue a reset. What would you recommend instead? > >> + mutex_lock(&ctx->start_lock); >> + >> + if (aspeed_xdma_reset_start(ctx)) { >> + msleep(XDMA_RESET_TIME_MS); >> + >> + aspeed_xdma_reset_finish(ctx); >> + } >> + >> + mutex_unlock(&ctx->start_lock); >> + >> + return len; >> + } else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) { >> + return -EINVAL; >> + } >> + >> + if (op.len > ctx->vga_size - offs) >> + return -EINVAL; >> + >> + if (file->f_flags & O_NONBLOCK) { >> + if (!mutex_trylock(&ctx->file_lock)) >> + return -EAGAIN; >> + >> + if (ctx->in_progress || ctx->in_reset) { > ctx->in_progress was protected by a lock that isn't file_lock, so this looks wrong. file_lock isn't protecting in_progress. It's protecting access to the whole engine while a transfer is in progress. in_progress isn't protected at all, it's just better to lock before waiting for in_progress so that multiple clients don't all see in_progress go false and have to wait for a mutex (particularly in the nonblocking case). > >> + mutex_unlock(&ctx->file_lock); >> + return -EAGAIN; >> + } >> + } else { >> + mutex_lock(&ctx->file_lock); >> + >> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress && >> + !ctx->in_reset); > As above. > >> + if (rc) { >> + mutex_unlock(&ctx->file_lock); >> + return -EINTR; >> + } >> + } >> + >> + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client); >> + >> + mutex_unlock(&ctx->file_lock); >> + >> + if (!(file->f_flags & O_NONBLOCK)) { >> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress); >> + if (rc) >> + return -EINTR; >> + >> + if (client->error) >> + return -EIO; > What's the client->error value? Can it be more informative? Not really. There isn't much error information available. Basically the only way to get an error is if the engine is reset (user or PCIE initiated) while the transfer is on-going. > >> + } >> + >> + return len; > We've potentially truncated len above (in the len > XDMA_OP_SIZE_MAX), > which leads to some ambiguity with the write() syscall given that it can > potentially return less than the requested length. This is one such case, but > the caller probably shouldn't attempt a follow-up write. > > This would go away if we make the len > XDMA_OP_SIZE_MAX an EINVAL > case as suggested agove. Sure. > >> +} >> + >> +static __poll_t aspeed_xdma_poll(struct file *file, >> + struct poll_table_struct *wait) >> +{ >> + __poll_t mask = 0; >> + __poll_t req = poll_requested_events(wait); >> + struct aspeed_xdma_client *client = file->private_data; >> + struct aspeed_xdma *ctx = client->ctx; >> + >> + if (req & (EPOLLIN | EPOLLRDNORM)) { >> + if (client->in_progress) >> + poll_wait(file, &ctx->wait, wait); >> + >> + if (!client->in_progress) { >> + if (client->error) >> + mask |= EPOLLERR; >> + else >> + mask |= EPOLLIN | EPOLLRDNORM; >> + } >> + } >> + >> + if (req & (EPOLLOUT | EPOLLWRNORM)) { >> + if (ctx->in_progress) >> + poll_wait(file, &ctx->wait, wait); >> + >> + if (!ctx->in_progress) >> + mask |= EPOLLOUT | EPOLLWRNORM; >> + } >> + >> + return mask; >> +} >> + >> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma) >> +{ >> + struct aspeed_xdma_client *client = vma->vm_private_data; >> + >> + gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt, >> + client->size); >> + >> + client->virt = NULL; >> + client->phys = 0; >> + client->size = 0; >> +} >> + >> +static const struct vm_operations_struct aspeed_xdma_vm_ops = { >> + .close = aspeed_xdma_vma_close, >> +}; >> + >> +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma) >> +{ >> + int rc; >> + struct aspeed_xdma_client *client = file->private_data; >> + struct aspeed_xdma *ctx = client->ctx; >> + >> + /* restrict file to one mapping */ >> + if (client->size) >> + return -ENOMEM; > Can we do better with the error code here? Maybe? I'm open to suggestions... > >> + >> + client->size = vma->vm_end - vma->vm_start; >> + client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size, >> + &client->phys); >> + if (!client->virt) { >> + client->phys = 0; >> + client->size = 0; >> + return -ENOMEM; >> + } >> + >> + vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT; > Where does client->phys get set? gen_pool_dma_alloc sets it. Thanks for the review! Eddie > > Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 07/12] drivers/soc: xdma: Add user interface 2019-11-25 19:44 ` Eddie James @ 2019-11-26 3:30 ` Andrew Jeffery 2019-12-04 15:26 ` Eddie James 0 siblings, 1 reply; 24+ messages in thread From: Andrew Jeffery @ 2019-11-26 3:30 UTC (permalink / raw) To: Eddie James, Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On Tue, 26 Nov 2019, at 06:14, Eddie James wrote: > > On 11/24/19 5:59 PM, Andrew Jeffery wrote: > > > > On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: > >> This commits adds a miscdevice to provide a user interface to the XDMA > >> engine. The interface provides the write operation to start DMA > >> operations. The DMA parameters are passed as the data to the write call. > >> The actual data to transfer is NOT passed through write. Note that both > >> directions of DMA operation are accomplished through the write command; > >> BMC to host and host to BMC. > >> > >> The XDMA engine is restricted to only accessing the reserved memory > >> space on the AST2500, typically used by the VGA. For this reason, the > >> VGA memory space is pooled and allocated with genalloc. Users calling > >> mmap allocate pages from this pool for their usage. The space allocated > >> by a client will be the space used in the DMA operation. For an > >> "upstream" (BMC to host) operation, the data in the client's area will > >> be transferred to the host. For a "downstream" (host to BMC) operation, > >> the host data will be placed in the client's memory area. > >> > >> Poll is also provided in order to determine when the DMA operation is > >> complete for non-blocking IO. > >> > >> Signed-off-by: Eddie James <eajames@linux.ibm.com> > >> --- > >> drivers/soc/aspeed/aspeed-xdma.c | 223 +++++++++++++++++++++++++++++++++++++++ > >> 1 file changed, 223 insertions(+) > >> > >> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c > >> index 99041a6..3d37582 100644 > >> --- a/drivers/soc/aspeed/aspeed-xdma.c > >> +++ b/drivers/soc/aspeed/aspeed-xdma.c > >> @@ -64,6 +64,9 @@ > >> #define XDMA_CMDQ_SIZE PAGE_SIZE > >> #define XDMA_NUM_CMDS \ > >> (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) > >> +#define XDMA_OP_SIZE_MAX sizeof(struct aspeed_xdma_op) > >> +#define XDMA_OP_SIZE_MIN \ > >> + (sizeof(struct aspeed_xdma_op) - sizeof(u64)) > >> > >> /* Aspeed specification requires 10ms after switching the reset line */ > >> #define XDMA_RESET_TIME_MS 10 > >> @@ -216,6 +219,7 @@ struct aspeed_xdma { > >> bool in_reset; > >> bool upstream; > >> unsigned int cmd_idx; > >> + struct mutex file_lock; > > Please add documentation about what data file_lock is protecting. > > > >> struct mutex start_lock; > >> struct delayed_work reset_work; > >> spinlock_t client_lock; > >> @@ -230,6 +234,8 @@ struct aspeed_xdma { > >> dma_addr_t cmdq_vga_phys; > >> void *cmdq_vga_virt; > >> struct gen_pool *vga_pool; > >> + > >> + struct miscdevice misc; > >> }; > >> > >> struct aspeed_xdma_client { > >> @@ -557,6 +563,204 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, > >> void *arg) > >> return IRQ_HANDLED; > >> } > >> > >> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf, > >> + size_t len, loff_t *offset) > >> +{ > >> + int rc; > >> + struct aspeed_xdma_op op; > >> + struct aspeed_xdma_client *client = file->private_data; > >> + struct aspeed_xdma *ctx = client->ctx; > >> + u32 offs = client->phys ? (client->phys - ctx->vga_phys) : > >> + XDMA_CMDQ_SIZE; > >> + > >> + if (len < XDMA_OP_SIZE_MIN) > >> + return -EINVAL; > >> + > >> + if (len > XDMA_OP_SIZE_MAX) > >> + len = XDMA_OP_SIZE_MAX; > > Isn't this an EINVAL case as well? > > > Perhaps so. > > > > > >> + > >> + rc = copy_from_user(&op, buf, len); > >> + if (rc) > >> + return rc; > >> + > >> + if (op.direction == ASPEED_XDMA_DIRECTION_RESET) { > > Seems a bit abusive to use the direction field to issue a reset. > > > What would you recommend instead? Looks like an ioctl() to me. But what need do we have to directly reset the device? Could we achieve the same by rebinding the driver if necessary? We should only need to reset it if the driver has bugs, or is there some errata that we need to deal with? Userspace shouldn't be handling that though? > > > > > >> + mutex_lock(&ctx->start_lock); > >> + > >> + if (aspeed_xdma_reset_start(ctx)) { > >> + msleep(XDMA_RESET_TIME_MS); > >> + > >> + aspeed_xdma_reset_finish(ctx); > >> + } > >> + > >> + mutex_unlock(&ctx->start_lock); > >> + > >> + return len; > >> + } else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) { > >> + return -EINVAL; > >> + } > >> + > >> + if (op.len > ctx->vga_size - offs) > >> + return -EINVAL; > >> + > >> + if (file->f_flags & O_NONBLOCK) { > >> + if (!mutex_trylock(&ctx->file_lock)) > >> + return -EAGAIN; > >> + > >> + if (ctx->in_progress || ctx->in_reset) { > > ctx->in_progress was protected by a lock that isn't file_lock, so this looks wrong. > > > file_lock isn't protecting in_progress. It's protecting access to the > whole engine while a transfer is in progress. Then when would we ever gain file_lock if in_progress was set? Shouldn't the current client hold file_lock until we'd set in_progress to false, in which case we wouldn't need to check in_progress if we now hold the lock? > in_progress isn't protected at all, Except it is, because you've acquired file_lock above before checking it (and in_reset). And why is in_progress written to under ctx->start_lock (which is not file_lock) in aspeed_xdma_start() in the earlier patch if it's not protected? > it's just better to lock There's never a case of "it's just better to lock", as if it were optional. Either the variable needs to be protected against concurrent access or it doesn't. If it does, always access it under a consistent lock. > before waiting for > in_progress so that multiple clients don't all see in_progress go false > and have to wait for a mutex (particularly in the nonblocking case). Yes, so what you're suggesting is that in_progress needs to be protected against concurrent access, so needs to be accessed under a consistent lock. Though as suggested above it might be enough to successfully acquire file_lock. As far as I can see we have three events that we need to care about: 1. Submission of a DMA request 2. Completion of a DMA request 3. PCIe-driven reset For 1, multiple concurrent submissions need to be serialised, so we need a mutex with the semantics of file_lock as you've described above. 2 should only occur if we have an event of type 1 outstanding. If 1 is outstanding then receiving 2 should cause the process that triggered 1 to wake and release the mutex. 3 can happen at any time which results in two cases that we need to care about: 3a. DMA request in progress (i.e. 1 but have not yet seen corresponding 2) 3b. DMA idle (no request in progress) Events of type 1 need to be serialised against 3. 2 won't occur after 3 until 1 has occurred, so there's no need to consider it in the serialisation for 3. In the case of 3a. we need to reset the device, then mark the current transfer failed, then wake the associated process. The woken process will release the mutex and allow any queued requests to proceed. 3b is much simpler, though we need to prevent events of type 1 concurrently accessing the device while the reset is in progress. So we need a spinlock to cover configuring the device. So that's two locks - a mutex to serialise process-context access to the device, and a spinlock to serialise interrupts with respect to process-context access. Currently your implementation contains two mutexes (start_lock and file_lock) and two spinlocks (client_lock and reset_lock), all with fairly hazy definitions. > > > > > >> + mutex_unlock(&ctx->file_lock); > >> + return -EAGAIN; > >> + } > >> + } else { > >> + mutex_lock(&ctx->file_lock); > >> + > >> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress && > >> + !ctx->in_reset); > > As above. > > > >> + if (rc) { > >> + mutex_unlock(&ctx->file_lock); > >> + return -EINTR; > >> + } > >> + } > >> + > >> + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client); > >> + > >> + mutex_unlock(&ctx->file_lock); > >> + > >> + if (!(file->f_flags & O_NONBLOCK)) { > >> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress); > >> + if (rc) > >> + return -EINTR; > >> + > >> + if (client->error) > >> + return -EIO; > > What's the client->error value? Can it be more informative? > > > Not really. There isn't much error information available. Basically the > only way to get an error is if the engine is reset (user or PCIE > initiated) while the transfer is on-going. > > > > > >> + } > >> + > >> + return len; > > We've potentially truncated len above (in the len > XDMA_OP_SIZE_MAX), > > which leads to some ambiguity with the write() syscall given that it can > > potentially return less than the requested length. This is one such case, but > > the caller probably shouldn't attempt a follow-up write. > > > > This would go away if we make the len > XDMA_OP_SIZE_MAX an EINVAL > > case as suggested agove. > > > Sure. > > > > > >> +} > >> + > >> +static __poll_t aspeed_xdma_poll(struct file *file, > >> + struct poll_table_struct *wait) > >> +{ > >> + __poll_t mask = 0; > >> + __poll_t req = poll_requested_events(wait); > >> + struct aspeed_xdma_client *client = file->private_data; > >> + struct aspeed_xdma *ctx = client->ctx; > >> + > >> + if (req & (EPOLLIN | EPOLLRDNORM)) { > >> + if (client->in_progress) > >> + poll_wait(file, &ctx->wait, wait); > >> + > >> + if (!client->in_progress) { > >> + if (client->error) > >> + mask |= EPOLLERR; > >> + else > >> + mask |= EPOLLIN | EPOLLRDNORM; > >> + } > >> + } > >> + > >> + if (req & (EPOLLOUT | EPOLLWRNORM)) { > >> + if (ctx->in_progress) > >> + poll_wait(file, &ctx->wait, wait); > >> + > >> + if (!ctx->in_progress) > >> + mask |= EPOLLOUT | EPOLLWRNORM; > >> + } > >> + > >> + return mask; > >> +} > >> + > >> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma) > >> +{ > >> + struct aspeed_xdma_client *client = vma->vm_private_data; > >> + > >> + gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt, > >> + client->size); > >> + > >> + client->virt = NULL; > >> + client->phys = 0; > >> + client->size = 0; > >> +} > >> + > >> +static const struct vm_operations_struct aspeed_xdma_vm_ops = { > >> + .close = aspeed_xdma_vma_close, > >> +}; > >> + > >> +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma) > >> +{ > >> + int rc; > >> + struct aspeed_xdma_client *client = file->private_data; > >> + struct aspeed_xdma *ctx = client->ctx; > >> + > >> + /* restrict file to one mapping */ > >> + if (client->size) > >> + return -ENOMEM; > > Can we do better with the error code here? > > > Maybe? I'm open to suggestions... How about EBUSY? > > > > > >> + > >> + client->size = vma->vm_end - vma->vm_start; > >> + client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size, > >> + &client->phys); > >> + if (!client->virt) { > >> + client->phys = 0; > >> + client->size = 0; > >> + return -ENOMEM; > >> + } > >> + > >> + vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT; > > Where does client->phys get set? > > > gen_pool_dma_alloc sets it. Ah, yes. Thanks. Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 07/12] drivers/soc: xdma: Add user interface 2019-11-26 3:30 ` Andrew Jeffery @ 2019-12-04 15:26 ` Eddie James 0 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-12-04 15:26 UTC (permalink / raw) To: Andrew Jeffery, Eddie James, linux-kernel Cc: linux-aspeed, Joel Stanley, maz, Jason Cooper, tglx, Rob Herring, mark.rutland, devicetree On 11/25/19 9:30 PM, Andrew Jeffery wrote: > On Tue, 26 Nov 2019, at 06:14, Eddie James wrote: >> On 11/24/19 5:59 PM, Andrew Jeffery wrote: >>> On Sat, 9 Nov 2019, at 06:48, Eddie James wrote: >>>> This commits adds a miscdevice to provide a user interface to the XDMA >>>> engine. The interface provides the write operation to start DMA >>>> operations. The DMA parameters are passed as the data to the write call. >>>> The actual data to transfer is NOT passed through write. Note that both >>>> directions of DMA operation are accomplished through the write command; >>>> BMC to host and host to BMC. >>>> >>>> The XDMA engine is restricted to only accessing the reserved memory >>>> space on the AST2500, typically used by the VGA. For this reason, the >>>> VGA memory space is pooled and allocated with genalloc. Users calling >>>> mmap allocate pages from this pool for their usage. The space allocated >>>> by a client will be the space used in the DMA operation. For an >>>> "upstream" (BMC to host) operation, the data in the client's area will >>>> be transferred to the host. For a "downstream" (host to BMC) operation, >>>> the host data will be placed in the client's memory area. >>>> >>>> Poll is also provided in order to determine when the DMA operation is >>>> complete for non-blocking IO. >>>> >>>> Signed-off-by: Eddie James<eajames@linux.ibm.com> >>>> --- >>>> drivers/soc/aspeed/aspeed-xdma.c | 223 +++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 223 insertions(+) >>>> >>>> diff --git a/drivers/soc/aspeed/aspeed-xdma.c b/drivers/soc/aspeed/aspeed-xdma.c >>>> index 99041a6..3d37582 100644 >>>> --- a/drivers/soc/aspeed/aspeed-xdma.c >>>> +++ b/drivers/soc/aspeed/aspeed-xdma.c >>>> @@ -64,6 +64,9 @@ >>>> #define XDMA_CMDQ_SIZE PAGE_SIZE >>>> #define XDMA_NUM_CMDS \ >>>> (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd)) >>>> +#define XDMA_OP_SIZE_MAX sizeof(struct aspeed_xdma_op) >>>> +#define XDMA_OP_SIZE_MIN \ >>>> + (sizeof(struct aspeed_xdma_op) - sizeof(u64)) >>>> >>>> /* Aspeed specification requires 10ms after switching the reset line */ >>>> #define XDMA_RESET_TIME_MS 10 >>>> @@ -216,6 +219,7 @@ struct aspeed_xdma { >>>> bool in_reset; >>>> bool upstream; >>>> unsigned int cmd_idx; >>>> + struct mutex file_lock; >>> Please add documentation about what data file_lock is protecting. >>> >>>> struct mutex start_lock; >>>> struct delayed_work reset_work; >>>> spinlock_t client_lock; >>>> @@ -230,6 +234,8 @@ struct aspeed_xdma { >>>> dma_addr_t cmdq_vga_phys; >>>> void *cmdq_vga_virt; >>>> struct gen_pool *vga_pool; >>>> + >>>> + struct miscdevice misc; >>>> }; >>>> >>>> struct aspeed_xdma_client { >>>> @@ -557,6 +563,204 @@ static irqreturn_t aspeed_xdma_pcie_irq(int irq, >>>> void *arg) >>>> return IRQ_HANDLED; >>>> } >>>> >>>> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf, >>>> + size_t len, loff_t *offset) >>>> +{ >>>> + int rc; >>>> + struct aspeed_xdma_op op; >>>> + struct aspeed_xdma_client *client = file->private_data; >>>> + struct aspeed_xdma *ctx = client->ctx; >>>> + u32 offs = client->phys ? (client->phys - ctx->vga_phys) : >>>> + XDMA_CMDQ_SIZE; >>>> + >>>> + if (len < XDMA_OP_SIZE_MIN) >>>> + return -EINVAL; >>>> + >>>> + if (len > XDMA_OP_SIZE_MAX) >>>> + len = XDMA_OP_SIZE_MAX; >>> Isn't this an EINVAL case as well? >> Perhaps so. >> >> >>>> + >>>> + rc = copy_from_user(&op, buf, len); >>>> + if (rc) >>>> + return rc; >>>> + >>>> + if (op.direction == ASPEED_XDMA_DIRECTION_RESET) { >>> Seems a bit abusive to use the direction field to issue a reset. >> What would you recommend instead? > Looks like an ioctl() to me. But what need do we have to directly reset > the device? Could we achieve the same by rebinding the driver if > necessary? We should only need to reset it if the driver has bugs, or > is there some errata that we need to deal with? Userspace shouldn't > be handling that though? Well it could be necessary to reset if userspace messes up and sends a bad host address for example, so that's why I'd like to have it available to userspace. >>>> + mutex_lock(&ctx->start_lock); >>>> + >>>> + if (aspeed_xdma_reset_start(ctx)) { >>>> + msleep(XDMA_RESET_TIME_MS); >>>> + >>>> + aspeed_xdma_reset_finish(ctx); >>>> + } >>>> + >>>> + mutex_unlock(&ctx->start_lock); >>>> + >>>> + return len; >>>> + } else if (op.direction > ASPEED_XDMA_DIRECTION_RESET) { >>>> + return -EINVAL; >>>> + } >>>> + >>>> + if (op.len > ctx->vga_size - offs) >>>> + return -EINVAL; >>>> + >>>> + if (file->f_flags & O_NONBLOCK) { >>>> + if (!mutex_trylock(&ctx->file_lock)) >>>> + return -EAGAIN; >>>> + >>>> + if (ctx->in_progress || ctx->in_reset) { >>> ctx->in_progress was protected by a lock that isn't file_lock, so this looks wrong. >> file_lock isn't protecting in_progress. It's protecting access to the >> whole engine while a transfer is in progress. > Then when would we ever gain file_lock if in_progress was set? Shouldn't the current > client hold file_lock until we'd set in_progress to false, in which case we wouldn't > need to check in_progress if we now hold the lock? That doesn't work for non-blocking io. >> in_progress isn't protected at all, > Except it is, because you've acquired file_lock above before checking it (and > in_reset). > > And why is in_progress written to under ctx->start_lock (which is not file_lock) > in aspeed_xdma_start() in the earlier patch if it's not protected? OK, I just didn't quite get what you meant by protected. Yes, multiple reads here of in_progress must be serialized so that only one thread sees it go false at once. This prevents multiple transfers from being started while one is in progress. But writing it doesn't require locking that I can see. >> it's just better to lock > There's never a case of "it's just better to lock", as if it were optional. Either the > variable needs to be protected against concurrent access or it doesn't. If it does, > always access it under a consistent lock. Not what I meant, was trying to say better to lock before waiting (rather than after). However now that I think again, that wouldn't work to prevent multiple transfers being started while one is in progress. >> before waiting for >> in_progress so that multiple clients don't all see in_progress go false >> and have to wait for a mutex (particularly in the nonblocking case). > Yes, so what you're suggesting is that in_progress needs to be protected against > concurrent access, so needs to be accessed under a consistent lock. Though as > suggested above it might be enough to successfully acquire file_lock. > As far as I can see we have three events that we need to care about: > > 1. Submission of a DMA request > 2. Completion of a DMA request > 3. PCIe-driven reset > > For 1, multiple concurrent submissions need to be serialised, so we need a > mutex with the semantics of file_lock as you've described above. > > 2 should only occur if we have an event of type 1 outstanding. If 1 is outstanding > then receiving 2 should cause the process that triggered 1 to wake and release > the mutex. This gets more complicated than that because we need non-blocking io; the mutex can't be held while the client call exits with EAGAIN. > 3 can happen at any time which results in two cases that we need to care about: > > 3a. DMA request in progress (i.e. 1 but have not yet seen corresponding 2) > 3b. DMA idle (no request in progress) > > Events of type 1 need to be serialised against 3. 2 won't occur after 3 until 1 has > occurred, so there's no need to consider it in the serialisation for 3. > > In the case of 3a. we need to reset the device, then mark the current transfer failed, > then wake the associated process. The woken process will release the mutex and > allow any queued requests to proceed. > > 3b is much simpler, though we need to prevent events of type 1 concurrently > accessing the device while the reset is in progress. So we need a spinlock to cover > configuring the device. > > So that's two locks - a mutex to serialise process-context access to the device, and > a spinlock to serialise interrupts with respect to process-context access. Currently > your implementation contains two mutexes (start_lock and file_lock) and two > spinlocks (client_lock and reset_lock), all with fairly hazy definitions. > >>>> + mutex_unlock(&ctx->file_lock); >>>> + return -EAGAIN; >>>> + } >>>> + } else { >>>> + mutex_lock(&ctx->file_lock); >>>> + >>>> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress && >>>> + !ctx->in_reset); >>> As above. >>> >>>> + if (rc) { >>>> + mutex_unlock(&ctx->file_lock); >>>> + return -EINTR; >>>> + } >>>> + } >>>> + >>>> + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs, client); >>>> + >>>> + mutex_unlock(&ctx->file_lock); >>>> + >>>> + if (!(file->f_flags & O_NONBLOCK)) { >>>> + rc = wait_event_interruptible(ctx->wait, !ctx->in_progress); >>>> + if (rc) >>>> + return -EINTR; >>>> + >>>> + if (client->error) >>>> + return -EIO; >>> What's the client->error value? Can it be more informative? >> Not really. There isn't much error information available. Basically the >> only way to get an error is if the engine is reset (user or PCIE >> initiated) while the transfer is on-going. >> >> >>>> + } >>>> + >>>> + return len; >>> We've potentially truncated len above (in the len > XDMA_OP_SIZE_MAX), >>> which leads to some ambiguity with the write() syscall given that it can >>> potentially return less than the requested length. This is one such case, but >>> the caller probably shouldn't attempt a follow-up write. >>> >>> This would go away if we make the len > XDMA_OP_SIZE_MAX an EINVAL >>> case as suggested agove. >> Sure. >> >> >>>> +} >>>> + >>>> +static __poll_t aspeed_xdma_poll(struct file *file, >>>> + struct poll_table_struct *wait) >>>> +{ >>>> + __poll_t mask = 0; >>>> + __poll_t req = poll_requested_events(wait); >>>> + struct aspeed_xdma_client *client = file->private_data; >>>> + struct aspeed_xdma *ctx = client->ctx; >>>> + >>>> + if (req & (EPOLLIN | EPOLLRDNORM)) { >>>> + if (client->in_progress) >>>> + poll_wait(file, &ctx->wait, wait); >>>> + >>>> + if (!client->in_progress) { >>>> + if (client->error) >>>> + mask |= EPOLLERR; >>>> + else >>>> + mask |= EPOLLIN | EPOLLRDNORM; >>>> + } >>>> + } >>>> + >>>> + if (req & (EPOLLOUT | EPOLLWRNORM)) { >>>> + if (ctx->in_progress) >>>> + poll_wait(file, &ctx->wait, wait); >>>> + >>>> + if (!ctx->in_progress) >>>> + mask |= EPOLLOUT | EPOLLWRNORM; >>>> + } >>>> + >>>> + return mask; >>>> +} >>>> + >>>> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma) >>>> +{ >>>> + struct aspeed_xdma_client *client = vma->vm_private_data; >>>> + >>>> + gen_pool_free(client->ctx->vga_pool, (unsigned long)client->virt, >>>> + client->size); >>>> + >>>> + client->virt = NULL; >>>> + client->phys = 0; >>>> + client->size = 0; >>>> +} >>>> + >>>> +static const struct vm_operations_struct aspeed_xdma_vm_ops = { >>>> + .close = aspeed_xdma_vma_close, >>>> +}; >>>> + >>>> +static int aspeed_xdma_mmap(struct file *file, struct vm_area_struct *vma) >>>> +{ >>>> + int rc; >>>> + struct aspeed_xdma_client *client = file->private_data; >>>> + struct aspeed_xdma *ctx = client->ctx; >>>> + >>>> + /* restrict file to one mapping */ >>>> + if (client->size) >>>> + return -ENOMEM; >>> Can we do better with the error code here? >> Maybe? I'm open to suggestions... > How about EBUSY? Sounds good. >>>> + >>>> + client->size = vma->vm_end - vma->vm_start; >>>> + client->virt = gen_pool_dma_alloc(ctx->vga_pool, client->size, >>>> + &client->phys); >>>> + if (!client->virt) { >>>> + client->phys = 0; >>>> + client->size = 0; >>>> + return -ENOMEM; >>>> + } >>>> + >>>> + vma->vm_pgoff = (client->phys - ctx->vga_phys) >> PAGE_SHIFT; >>> Where does client->phys get set? >> gen_pool_dma_alloc sets it. > Ah, yes. Thanks. > > Andrew ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (6 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 07/12] drivers/soc: xdma: Add user interface Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 09/12] ARM: dts: aspeed: ast2600: " Eddie James ` (3 subsequent siblings) 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Add a node for the XDMA engine with all the necessary information. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-g5.dtsi | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi index 4579c78..5fa2744 100644 --- a/arch/arm/boot/dts/aspeed-g5.dtsi +++ b/arch/arm/boot/dts/aspeed-g5.dtsi @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0+ #include <dt-bindings/clock/aspeed-clock.h> +#include <dt-bindings/interrupt-controller/aspeed-scu-ic.h> / { model = "Aspeed BMC"; @@ -259,6 +260,15 @@ interrupts = <0x19>; }; + xdma: xdma@1e6e7000 { + compatible = "aspeed,ast2500-xdma"; + reg = <0x1e6e7000 0x100>; + clocks = <&syscon ASPEED_CLK_GATE_BCLK>; + resets = <&syscon ASPEED_RESET_XDMA>; + interrupts-extended = <&vic 6>, <&scu_ic ASPEED_AST2500_SCU_IC_PCIE_RESET_LO_TO_HI>; + status = "disabled"; + }; + adc: adc@1e6e9000 { compatible = "aspeed,ast2500-adc"; reg = <0x1e6e9000 0xb0>; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 09/12] ARM: dts: aspeed: ast2600: Add XDMA Engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (7 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James ` (2 subsequent siblings) 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Add a node for the XDMA engine with all the necessary information. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-g6.dtsi | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi index 65ea2b2..61bd717 100644 --- a/arch/arm/boot/dts/aspeed-g6.dtsi +++ b/arch/arm/boot/dts/aspeed-g6.dtsi @@ -3,6 +3,7 @@ #include <dt-bindings/interrupt-controller/arm-gic.h> #include <dt-bindings/clock/ast2600-clock.h> +#include <dt-bindings/interrupt-controller/aspeed-scu-ic.h> / { model = "Aspeed BMC"; @@ -315,6 +316,16 @@ quality = <100>; }; + xdma: xdma@1e6e7000 { + compatible = "aspeed,ast2600-xdma"; + reg = <0x1e6e7000 0x100>; + clocks = <&syscon ASPEED_CLK_GATE_BCLK>; + resets = <&syscon ASPEED_RESET_DEV_XDMA>; + interrupts-extended = <&gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>, + <&scu_ic0 ASPEED_AST2600_SCU_IC0_PCIE_PERST_LO_TO_HI>; + status = "disabled"; + }; + gpio0: gpio@1e780000 { #gpio-cells = <2>; gpio-controller; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 10/12] ARM: dts: aspeed: witherspoon: Enable XDMA Engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (8 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 09/12] ARM: dts: aspeed: ast2600: " Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James 2019-11-08 20:18 ` [PATCH 12/12] ARM: dts: aspeed: tacoma: " Eddie James 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Enable the XDMA engine node. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts index 569dad9..0f11335 100644 --- a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts +++ b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts @@ -658,4 +658,8 @@ memory-region = <&video_engine_memory>; }; +&xdma { + status = "okay"; +}; + #include "ibm-power9-dual.dtsi" -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (9 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James @ 2019-11-08 20:18 ` Eddie James 2019-11-08 20:18 ` [PATCH 12/12] ARM: dts: aspeed: tacoma: " Eddie James 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Enable the XDMA engine node. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts b/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts index c1c9cd3..2cd53d9 100644 --- a/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts +++ b/arch/arm/boot/dts/aspeed-bmc-ibm-rainier.dts @@ -970,3 +970,7 @@ spi-max-frequency = <100000000>; }; }; + +&xdma { + status = "okay"; +}; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 12/12] ARM: dts: aspeed: tacoma: Enable XDMA engine 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James ` (10 preceding siblings ...) 2019-11-08 20:18 ` [PATCH 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James @ 2019-11-08 20:18 ` Eddie James 11 siblings, 0 replies; 24+ messages in thread From: Eddie James @ 2019-11-08 20:18 UTC (permalink / raw) To: linux-kernel Cc: linux-aspeed, andrew, joel, maz, jason, tglx, robh+dt, mark.rutland, devicetree Enable the XDMA engine node. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts b/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts index f02de4a..cbaa1bd 100644 --- a/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts +++ b/arch/arm/boot/dts/aspeed-bmc-opp-tacoma.dts @@ -1193,3 +1193,7 @@ pinctrl-0 = <&pinctrl_lpc_default>, <&pinctrl_lsirq_default>; }; + +&xdma { + status = "okay"; +}; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
end of thread, other threads:[~2019-12-04 15:27 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-08 20:18 [PATCH 00/12] Aspeed: Add SCU interrupt controller and XDMA engine drivers Eddie James 2019-11-08 20:18 ` [PATCH 01/12] dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controller Eddie James 2019-11-12 0:54 ` Rob Herring 2019-11-08 20:18 ` [PATCH 02/12] irqchip: " Eddie James 2019-11-10 14:53 ` Marc Zyngier 2019-11-08 20:18 ` [PATCH 03/12] ARM: dts: aspeed: ast2500: Add " Eddie James 2019-11-08 20:18 ` [PATCH 04/12] ARM: dts: aspeed: ast2600: Add SCU interrupt controllers Eddie James 2019-11-08 20:18 ` [PATCH 05/12] dt-bindings: soc: Add Aspeed XDMA Engine Eddie James 2019-11-14 19:00 ` Rob Herring 2019-11-08 20:18 ` [PATCH 06/12] drivers/soc: Add Aspeed XDMA Engine Driver Eddie James 2019-11-24 23:32 ` Andrew Jeffery 2019-11-25 0:01 ` Andrew Jeffery 2019-11-25 19:15 ` Eddie James 2019-11-26 0:53 ` Andrew Jeffery 2019-11-08 20:18 ` [PATCH 07/12] drivers/soc: xdma: Add user interface Eddie James 2019-11-24 23:59 ` Andrew Jeffery 2019-11-25 19:44 ` Eddie James 2019-11-26 3:30 ` Andrew Jeffery 2019-12-04 15:26 ` Eddie James 2019-11-08 20:18 ` [PATCH 08/12] ARM: dts: aspeed: ast2500: Add XDMA Engine Eddie James 2019-11-08 20:18 ` [PATCH 09/12] ARM: dts: aspeed: ast2600: " Eddie James 2019-11-08 20:18 ` [PATCH 10/12] ARM: dts: aspeed: witherspoon: Enable " Eddie James 2019-11-08 20:18 ` [PATCH 11/12] ARM: dts: aspeed: rainier: Enable XDMA engine Eddie James 2019-11-08 20:18 ` [PATCH 12/12] ARM: dts: aspeed: tacoma: " Eddie James
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).