All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver
@ 2016-01-07 17:32 kernel at martin.sperl.org
  2016-01-07 17:32 ` [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field kernel at martin.sperl.org
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

This patchset fixes several issues:
* missing residue_granularity to allow better support for I2S
* dma-channels are artificially restricted (channel 0 and 2)
* added correct support for dma-channels 11-14 by using the correct
  shared irq line
  * dma-channel 12 used the "trigger for all" irq resulting in
    a lockup when used
  * note that care was taken so that no change to the device-tree
    was necessary to avoid a break...

With this now 11 DMA channels available instead of 7 - the
additional HW-DMA-channels are:
* 0, 2 (masking of channels)
* 13, 14 (shared interrupts - see also note on channel 12)

It also adds several new features:
* slave_sg support
* dma_memcopy support

For these a consolidated method for creating standard control-block
chains was created to share common between cyclic, slave_sg and
memcopy.

Additional support for memset, memset_sg and interleave
are in the pipeline.

Testing:
* slave_sg
  * spi-bcm2835:
    * fb_st7735r framebuffer
      * tested using mplayer
* cyclic
  * bcm2835-i2s
    * Audio output with a Hifiberry I2S card.
      * tested using mplayer and aplay
* memcopy
  * dmatest

Tests using both slave_sg and cyclic concurrently were also conducted
using mplayer to play BigBuckBunny.

To test that interrupt line sharing was working propperly the same
audio/video test was run only using dma channel 11 to 14 - these
channels are sharing the same interrupt line -

Note that the bcm2835-i2s driver that is currenlty in the kernel
fails to work since:
commit 94cb7f76caa0b337 ("Switch to using the new clock driver support")
and requires modifications to use the clock-driver framework.

For testing a quick and dirty patch had to be used - a patch to use
the clock framework is in the pipeline.

Martin Sperl (8):
  dmaengine: bcm2835: set residue_granularity field
  dmaengine: bcm2835: remove unnecessary masking of dma channels
  dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  dmaengine: bcm2835: add additional defines for DMA-registers
  dmaengine: bcm2835: move cyclic member from bcm2835_chan into
    bcm2835_desc
  dmaengine: bcm2835: move controlblock chain generation into separate
    method
  dmaengine: bcm2835: add slave_sg support to bcm2835-dma
  dmaengine: bcm2835: add dma_memcopy support to bcm2835-dma

 drivers/dma/bcm2835-dma.c |  541 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 419 insertions(+), 122 deletions(-)

--
1.7.10.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
@ 2016-01-07 17:32 ` kernel at martin.sperl.org
  2016-01-28 19:53   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels kernel at martin.sperl.org
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

bcm2835-dma supports residue reporting at burst level but didn't report
this via the residue_granularity field.

See also:
https://github.com/raspberrypi/linux/commit/b015555327afa402f70ddc86e3632f59df1cd9d7
for the downstream patch.

Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Noralf Tr?nnes <noralf@tronnes.org>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 996c4b0..2d72fe8 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -625,6 +625,7 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 	od->ddev.src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
 	od->ddev.dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
 	od->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV);
+	od->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 	od->ddev.dev = &pdev->dev;
 	INIT_LIST_HEAD(&od->ddev.channels);
 	spin_lock_init(&od->lock);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
  2016-01-07 17:32 ` [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-01-30  3:08   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14 kernel at martin.sperl.org
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

The original patch contained 3 dma channels that were masked out.

These - as far as research and discussions show - are a
artefacts remaining from the downstream legacy dma-api.

Right now down-stream still includes a legacy api used only
in a single (downstream only) driver (bcm2708_fb) that requires
2D DMA for speedup (DMA-channel 0).
Formerly the sd-card support driver also was using this legacy
api (DMA-channel 2), but since has been moved over to use
dmaengine directly.

The DMA-channel 3 is already masked out in the devicetree in
the default property "brcm,dma-channel-mask = <0x7f35>;"

So we can remove the whole masking of DMA channels.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |    9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 2d72fe8..e4ca980 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -123,9 +123,6 @@ struct bcm2835_desc {
 #define BCM2835_DMA_DATA_TYPE_S32	4
 #define BCM2835_DMA_DATA_TYPE_S128	16
 
-#define BCM2835_DMA_BULK_MASK	BIT(0)
-#define BCM2835_DMA_FIQ_MASK	(BIT(2) | BIT(3))
-
 /* Valid only for channels 0 - 14, 15 has its own base address */
 #define BCM2835_DMA_CHAN(n)	((n) << 8) /* Base address */
 #define BCM2835_DMA_CHANIO(base, n) ((base) + BCM2835_DMA_CHAN(n))
@@ -641,12 +638,6 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 		goto err_no_dma;
 	}
 
-	/*
-	 * Do not use the FIQ and BULK channels,
-	 * because they are used by the GPU.
-	 */
-	chans_available &= ~(BCM2835_DMA_FIQ_MASK | BCM2835_DMA_BULK_MASK);
-
 	for (i = 0; i < pdev->num_resources; i++) {
 		irq = platform_get_irq(pdev, i);
 		if (irq < 0)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
  2016-01-07 17:32 ` [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field kernel at martin.sperl.org
  2016-01-07 17:33 ` [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-01-13 12:26   ` Vinod Koul
  2016-02-18  4:09   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers kernel at martin.sperl.org
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

The bcm2835 dma channel 11 to 14 only have a single shared irq line,
so this patch implements shared interrupts for these channels.

To avoid changes to the device tree (with regards to interrupts listed,
which reflects on pdev->num_resources) a fixed channel count had
to get used in the code instead.

With this patch applied we now have 11 dma channels available to
the ARM side of the SOC.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |   46 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index e4ca980..0c04236 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -81,7 +81,9 @@ struct bcm2835_chan {
 	struct dma_pool *cb_pool;
 
 	void __iomem *chan_base;
+
 	int irq_number;
+	unsigned long irq_flags;
 };
 
 struct bcm2835_desc {
@@ -127,6 +129,19 @@ struct bcm2835_desc {
 #define BCM2835_DMA_CHAN(n)	((n) << 8) /* Base address */
 #define BCM2835_DMA_CHANIO(base, n) ((base) + BCM2835_DMA_CHAN(n))
 
+/*
+ * number of dma channels we support
+ * we do not support DMA channel 15, as it is in a separate IO range,
+ * does not have a separate IRQ line except for the "catch all IRQ line"
+ * finally this channel is used by the firmware so is not available
+ */
+#define BCM2835_DMA_MAX_CHANNEL_NUMBER	14
+
+/* the DMA channels 11 to 14 share a common interrupt */
+#define BCM2835_DMA_IRQ_SHARED_MASK (BIT(11) | BIT(12) | BIT(13) | BIT(14))
+#define BCM2835_DMA_IRQ_SHARED		11
+#define BCM2835_DMA_IRQ_ALL		12
+
 static inline struct bcm2835_dmadev *to_bcm2835_dma_dev(struct dma_device *d)
 {
 	return container_of(d, struct bcm2835_dmadev, ddev);
@@ -215,6 +230,15 @@ static irqreturn_t bcm2835_dma_callback(int irq, void *data)
 	struct bcm2835_desc *d;
 	unsigned long flags;
 
+	/* check the shared interrupt */
+	if (c->irq_flags & IRQF_SHARED) {
+		/* check if the interrupt is enabled */
+		flags = readl(c->chan_base + BCM2835_DMA_CS);
+		/* if not set then we are not the reason for the irq */
+		if (!(flags & BCM2835_DMA_INT))
+			return IRQ_NONE;
+	}
+
 	spin_lock_irqsave(&c->vc.lock, flags);
 
 	/* Acknowledge interrupt */
@@ -250,7 +274,8 @@ static int bcm2835_dma_alloc_chan_resources(struct dma_chan *chan)
 	}
 
 	return request_irq(c->irq_number,
-			bcm2835_dma_callback, 0, "DMA IRQ", c);
+			   bcm2835_dma_callback,
+			   c->irq_flags, "DMA IRQ", c);
 }
 
 static void bcm2835_dma_free_chan_resources(struct dma_chan *chan)
@@ -526,7 +551,8 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan)
 	return 0;
 }
 
-static int bcm2835_dma_chan_init(struct bcm2835_dmadev *d, int chan_id, int irq)
+static int bcm2835_dma_chan_init(struct bcm2835_dmadev *d, int chan_id,
+				 int irq, unsigned long irq_flags)
 {
 	struct bcm2835_chan *c;
 
@@ -541,6 +567,7 @@ static int bcm2835_dma_chan_init(struct bcm2835_dmadev *d, int chan_id, int irq)
 	c->chan_base = BCM2835_DMA_CHANIO(d->base, chan_id);
 	c->ch = chan_id;
 	c->irq_number = irq;
+	c->irq_flags = irq_flags;
 
 	return 0;
 }
@@ -586,6 +613,7 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 	int rc;
 	int i;
 	int irq;
+	unsigned long irq_flags;
 	uint32_t chans_available;
 
 	if (!pdev->dev.dma_mask)
@@ -638,13 +666,21 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 		goto err_no_dma;
 	}
 
-	for (i = 0; i < pdev->num_resources; i++) {
-		irq = platform_get_irq(pdev, i);
+	for (i = 0; i <= BCM2835_DMA_MAX_CHANNEL_NUMBER; i++) {
+		if (BCM2835_DMA_IRQ_SHARED_MASK & BIT(i)) {
+			irq = platform_get_irq(pdev,
+					       BCM2835_DMA_IRQ_SHARED);
+			irq_flags = IRQF_SHARED;
+		} else {
+			irq = platform_get_irq(pdev, i);
+			irq_flags = 0;
+		}
+
 		if (irq < 0)
 			break;
 
 		if (chans_available & (1 << i)) {
-			rc = bcm2835_dma_chan_init(od, i, irq);
+			rc = bcm2835_dma_chan_init(od, i, irq, irq_flags);
 			if (rc)
 				goto err_no_dma;
 		}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
                   ` (2 preceding siblings ...)
  2016-01-07 17:33 ` [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14 kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-01-13 12:32   ` Vinod Koul
  2016-02-18  4:18   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc kernel at martin.sperl.org
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

Add additional defines describing the DMA registers
as well as adding some more documentation to those registers.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |   54 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 0c04236..7905327 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -99,26 +99,64 @@ struct bcm2835_desc {
 
 #define BCM2835_DMA_CS		0x00
 #define BCM2835_DMA_ADDR	0x04
+#define BCM2835_DMA_TI		0x08
 #define BCM2835_DMA_SOURCE_AD	0x0c
 #define BCM2835_DMA_DEST_AD	0x10
-#define BCM2835_DMA_NEXTCB	0x1C
+#define BCM2835_DMA_LEN		0x14
+#define BCM2835_DMA_STIDE	0x18
+#define BCM2835_DMA_NEXTCB	0x1c
+#define BCM2835_DMA_DEBUG	0x20
 
 /* DMA CS Control and Status bits */
-#define BCM2835_DMA_ACTIVE	BIT(0)
-#define BCM2835_DMA_INT	BIT(2)
+#define BCM2835_DMA_ACTIVE	BIT(0)  /* activate the DMA */
+#define BCM2835_DMA_END		BIT(1)  /* current CB has ended */
+#define BCM2835_DMA_INT		BIT(2)  /* interrupt status */
+#define BCM2835_DMA_DREQ	BIT(3)  /* DREQ state */
 #define BCM2835_DMA_ISPAUSED	BIT(4)  /* Pause requested or not active */
 #define BCM2835_DMA_ISHELD	BIT(5)  /* Is held by DREQ flow control */
-#define BCM2835_DMA_ERR	BIT(8)
+#define BCM2835_DMA_WAITING_FOR_WRITES BIT(6) /* waiting for last
+					       * AXI-write to ack
+					       */
+#define BCM2835_DMA_ERR		BIT(8)
+#define BCM2835_DMA_PRIORITY(x) ((x & 15) << 16) /* AXI priority */
+#define BCM2835_DMA_PANIC_PRIORITY(x) ((x & 15) << 20) /* panic priority */
+#define BCM2835_DMA_WAIT_FOR_WRITES BIT(28) /* current value of
+					     * TI.BCM2835_DMA_WAIT_RESP
+					     */
+#define BCM2835_DMA_DIS_DEBUG	BIT(29) /* disable debug pause signal */
 #define BCM2835_DMA_ABORT	BIT(30) /* Stop current CB, go to next, WO */
 #define BCM2835_DMA_RESET	BIT(31) /* WO, self clearing */
 
+/* Transfer information bits - also bcm2835_cb.info field */
 #define BCM2835_DMA_INT_EN	BIT(0)
+#define BCM2835_DMA_TDMODE	BIT(1) /* 2D-Mode */
+#define BCM2835_DMA_WAIT_RESP	BIT(3) /* wait for AXI-write to be acked */
 #define BCM2835_DMA_D_INC	BIT(4)
-#define BCM2835_DMA_D_DREQ	BIT(6)
+#define BCM2835_DMA_D_WIDTH	BIT(5) /* 128bit writes if set */
+#define BCM2835_DMA_D_DREQ	BIT(6) /* enable DREQ for destination */
+#define BCM2835_DMA_D_IGNORE	BIT(7) /* ignore destination writes */
 #define BCM2835_DMA_S_INC	BIT(8)
-#define BCM2835_DMA_S_DREQ	BIT(10)
-
-#define BCM2835_DMA_PER_MAP(x)	((x) << 16)
+#define BCM2835_DMA_S_WIDTH	BIT(9) /* 128bit writes if set */
+#define BCM2835_DMA_S_DREQ	BIT(10) /* enable SREQ for source */
+#define BCM2835_DMA_S_IGNORE	BIT(11) /* ignore source reads - read 0 */
+#define BCM2835_DMA_BURST_LENGTH(x) ((x & 15) << 12)
+#define BCM2835_DMA_PER_MAP(x)	((x & 31) << 16) /* REQ source */
+#define BCM2835_DMA_WAIT(x)	((x & 31) << 21) /* add DMA-wait cycles */
+#define BCM2835_DMA_NO_WIDE_BURSTS BIT(26) /* no 2 beat write bursts */
+
+/* debug register bits */
+#define BCM2835_DMA_DEBUG_LAST_NOT_SET_ERR	BIT(0)
+#define BCM2835_DMA_DEBUG_FIFO_ERR		BIT(1)
+#define BCM2835_DMA_DEBUG_READ_ERR		BIT(2)
+#define BCM2835_DMA_DEBUG_OUTSTANDING_WRITES_SHIFT 4
+#define BCM2835_DMA_DEBUG_OUTSTANDING_WRITES_BITS 4
+#define BCM2835_DMA_DEBUG_ID_SHIFT		16
+#define BCM2835_DMA_DEBUG_ID_BITS		9
+#define BCM2835_DMA_DEBUG_STATE_SHIFT		16
+#define BCM2835_DMA_DEBUG_STATE_BITS		9
+#define BCM2835_DMA_DEBUG_VERSION_SHIFT		25
+#define BCM2835_DMA_DEBUG_VERSION_BITS		3
+#define BCM2835_DMA_DEBUG_LITE			BIT(28)
 
 #define BCM2835_DMA_DATA_TYPE_S8	1
 #define BCM2835_DMA_DATA_TYPE_S16	2
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
                   ` (3 preceding siblings ...)
  2016-01-07 17:33 ` [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-02-18  4:19   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method kernel at martin.sperl.org
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

In preparation to consolidating code we move the cyclic member
into the bcm_2835_desc structure.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 7905327..43758ba 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -73,7 +73,6 @@ struct bcm2835_chan {
 	struct list_head node;
 
 	struct dma_slave_config	cfg;
-	bool cyclic;
 	unsigned int dreq;
 
 	int ch;
@@ -95,6 +94,8 @@ struct bcm2835_desc {
 
 	unsigned int frames;
 	size_t size;
+
+	bool cyclic;
 };
 
 #define BCM2835_DMA_CS		0x00
@@ -399,8 +400,6 @@ static void bcm2835_dma_issue_pending(struct dma_chan *chan)
 	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
 	unsigned long flags;
 
-	c->cyclic = true; /* Nothing else is implemented */
-
 	spin_lock_irqsave(&c->vc.lock, flags);
 	if (vchan_issue_pending(&c->vc) && !c->desc)
 		bcm2835_dma_start_desc(c);
@@ -454,6 +453,7 @@ static struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_cyclic(
 	d->c = c;
 	d->dir = direction;
 	d->frames = buf_len / period_len;
+	d->cyclic = true;
 
 	d->cb_list = kcalloc(d->frames, sizeof(*d->cb_list), GFP_KERNEL);
 	if (!d->cb_list) {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
                   ` (4 preceding siblings ...)
  2016-01-07 17:33 ` [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-01-13 13:23   ` Vinod Koul
  2016-02-18  3:24   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma kernel at martin.sperl.org
  2016-01-07 17:33 ` [PATCH V2 8/8] dmaengine: bcm2835: add dma_memcopy " kernel at martin.sperl.org
  7 siblings, 2 replies; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

In preparation of adding slave_sg functionality this patch moves the
generation/allocation of bcm2835_desc and the building of
the corresponding DMA-control-block chain from bcm2835_dma_prep_dma_cyclic
into the newly created method bcm2835_dma_create_cb_chain.

This new code also takes the limits of LITE channels into account
and splits the control-block transfers at the correct location.

Tested Audio output with a Hifiberry I2S card.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |  288 ++++++++++++++++++++++++++++++---------------
 1 file changed, 191 insertions(+), 97 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 43758ba..41a4f0b 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -90,12 +90,12 @@ struct bcm2835_desc {
 	struct virt_dma_desc vd;
 	enum dma_transfer_direction dir;

-	struct bcm2835_cb_entry *cb_list;
-
 	unsigned int frames;
 	size_t size;

 	bool cyclic;
+
+	struct bcm2835_cb_entry cb_list[];
 };

 #define BCM2835_DMA_CS		0x00
@@ -181,6 +181,22 @@ struct bcm2835_desc {
 #define BCM2835_DMA_IRQ_SHARED		11
 #define BCM2835_DMA_IRQ_ALL		12

+/* the max dma length for different channels */
+#define MAX_DMA_LEN SZ_1G
+#define MAX_LITE_DMA_LEN (SZ_64K - 4)
+
+static inline bool bcm2835_dma_is_lite(struct bcm2835_chan *c)
+{
+	/* dma channels >= 7 are LITE channels */
+	return (c->ch >= 7);
+}
+
+static inline size_t bcm2835_dma_max_frame_length(struct bcm2835_chan *c)
+{
+	/* lite and normal channels have different max frame length */
+	return bcm2835_dma_is_lite(c) ? MAX_LITE_DMA_LEN : MAX_DMA_LEN;
+}
+
 static inline struct bcm2835_dmadev *to_bcm2835_dma_dev(struct dma_device *d)
 {
 	return container_of(d, struct bcm2835_dmadev, ddev);
@@ -197,19 +213,140 @@ static inline struct bcm2835_desc *to_bcm2835_dma_desc(
 	return container_of(t, struct bcm2835_desc, vd.tx);
 }

-static void bcm2835_dma_desc_free(struct virt_dma_desc *vd)
+static void bcm2835_dma_free_cb_chain(struct bcm2835_desc *desc)
 {
-	struct bcm2835_desc *desc = container_of(vd, struct bcm2835_desc, vd);
-	int i;
+	size_t i;

 	for (i = 0; i < desc->frames; i++)
 		dma_pool_free(desc->c->cb_pool, desc->cb_list[i].cb,
 			      desc->cb_list[i].paddr);

-	kfree(desc->cb_list);
 	kfree(desc);
 }

+static void bcm2835_dma_desc_free(struct virt_dma_desc *vd)
+{
+	bcm2835_dma_free_cb_chain(
+		container_of(vd, struct bcm2835_desc, vd));
+}
+
+static inline size_t bcm2835_dma_frames_for_length(size_t len,
+						   size_t period_len)
+{
+	return DIV_ROUND_UP(len, period_len);
+}
+
+/**
+ * bcm2835_dma_create_cb_chain - create a control block and fills data in
+ *
+ * @chan:           the @dma_chan for which we run this
+ * @direction:      the direction in which we transfer
+ * @cyclic:         it is a cyclic transfer
+ * @info:           the default info bits to apply per controlblock
+ * @finalextrainfo: additional bits in last controlblock
+ *                  (or when period_len is reached in case of cyclic)
+ * @frames:         number of controlblocks to allocate
+ * @src:            the src address to assign (if the S_INC bit is set
+ *                  in @info, then it gets incremented)
+ * @dst:            the dst address to assign (if the D_INC bit is set
+ *                  in @info, then it gets incremented)
+ * @buf_len:        the full buffer length (may also be 0)
+ * @period_len:     the period length when to apply @finalextrainfo
+ *                  in addition to the last transfer
+ *                  this will also break some control-blocks early
+ * @gfp:            the GFP flag to use for allocation
+ */
+static struct bcm2835_desc *bcm2835_dma_create_cb_chain(
+	struct dma_chan *chan, enum dma_transfer_direction direction,
+	bool cyclic, u32 info, u32 finalextrainfo, size_t frames,
+	dma_addr_t src, dma_addr_t dst, size_t buf_len,
+	size_t period_len, gfp_t gfp)
+{
+	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
+	size_t len = buf_len, total_len;
+	size_t frame;
+	struct bcm2835_desc *d;
+	struct bcm2835_cb_entry *cb_entry;
+	struct bcm2835_dma_cb *control_block;
+	size_t max_len = bcm2835_dma_max_frame_length(c);
+
+	/* allocate and setup the descriptor. */
+	d = kzalloc(sizeof(*d) + frames * sizeof(struct bcm2835_cb_entry),
+		    gfp);
+	if (!d)
+		return NULL;
+
+	d->c = c;
+	d->dir = direction;
+	d->cyclic = cyclic;
+
+	/*
+	 * Iterate over all frames, create a control block
+	 * for each frame and link them together.
+	 */
+	for (frame = 0, total_len = 0; frame < frames; d->frames++, frame++) {
+		cb_entry = &d->cb_list[frame];
+		cb_entry->cb = dma_pool_alloc(c->cb_pool, gfp,
+					      &cb_entry->paddr);
+		if (!cb_entry->cb)
+			goto error_cb;
+
+		/* fill in the control block */
+		control_block = cb_entry->cb;
+		control_block->info = info;
+		control_block->src = src;
+		control_block->dst = dst;
+		if (buf_len) {
+			control_block->length = min(max_len, len);
+			if (period_len &&
+			    (total_len + control_block->length >=
+			     period_len)) {
+				/* set to end of period_len */
+				control_block->length =
+					period_len - total_len;
+				/* add extrainfo when cyclic */
+				if (cyclic)
+					control_block->info |=
+						finalextrainfo;
+				/* and reset total_len */
+				total_len = 0;
+			}
+		}
+		control_block->stride = 0;
+		control_block->next = 0;
+
+		/* link this the last controlblock */
+		if (frame)
+			d->cb_list[frame - 1].cb->next = cb_entry->paddr;
+
+		/* update src and dst and length */
+		if (buf_len)
+			len -= control_block->length;
+		if (src && (info & BCM2835_DMA_S_INC))
+			src += control_block->length;
+		if (dst && (info & BCM2835_DMA_D_INC))
+			dst += control_block->length;
+
+		/* Length of total transfer */
+		d->size += control_block->length;
+	}
+
+	/* the last frame requires extra flags */
+	d->cb_list[d->frames - 1].cb->info |= finalextrainfo;
+
+	/* check the size - if there is some missmatch,
+	 * then this is detected here
+	 */
+	if (buf_len && (d->size != buf_len))
+		goto error_cb;
+
+	return d;
+error_cb:
+	bcm2835_dma_free_cb_chain(d);
+
+	return NULL;
+}
+
 static int bcm2835_dma_abort(void __iomem *chan_base)
 {
 	unsigned long cs;
@@ -413,117 +550,74 @@ static struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_cyclic(
 	unsigned long flags)
 {
 	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
-	enum dma_slave_buswidth dev_width;
 	struct bcm2835_desc *d;
-	dma_addr_t dev_addr;
-	unsigned int es, sync_type;
-	unsigned int frame;
-	int i;
+	dma_addr_t src, dst;
+	u32 info = BCM2835_DMA_WAIT_RESP;
+	u32 extra = BCM2835_DMA_INT_EN;
+	size_t max_len = bcm2835_dma_max_frame_length(c);
+	size_t frames;

 	/* Grab configuration */
 	if (!is_slave_direction(direction)) {
-		dev_err(chan->device->dev, "%s: bad direction?\n", __func__);
+		dev_err(chan->device->dev,
+			"%s: bad direction?\n", __func__);
 		return NULL;
 	}

-	if (direction == DMA_DEV_TO_MEM) {
-		dev_addr = c->cfg.src_addr;
-		dev_width = c->cfg.src_addr_width;
-		sync_type = BCM2835_DMA_S_DREQ;
-	} else {
-		dev_addr = c->cfg.dst_addr;
-		dev_width = c->cfg.dst_addr_width;
-		sync_type = BCM2835_DMA_D_DREQ;
-	}
-
-	/* Bus width translates to the element size (ES) */
-	switch (dev_width) {
-	case DMA_SLAVE_BUSWIDTH_4_BYTES:
-		es = BCM2835_DMA_DATA_TYPE_S32;
-		break;
-	default:
+	if (!buf_len) {
+		dev_err(chan->device->dev,
+			"%s: bad buffer length (= 0)\n", __func__);
 		return NULL;
 	}

-	/* Now allocate and setup the descriptor. */
-	d = kzalloc(sizeof(*d), GFP_NOWAIT);
-	if (!d)
-		return NULL;
+	/* warn if buf_len is not a multiple of period_lenas this may leed
+	 * to unexpected latencies for interrupts and thus audiable clicks
+	 */
+	if (buf_len % period_len)
+		dev_warn(chan->device->dev,
+			 "%s: buffer_length (%zd) is not a multiple of period_len (%zd)\n",
+			 __func__, buf_len, period_len);

-	d->c = c;
-	d->dir = direction;
-	d->frames = buf_len / period_len;
-	d->cyclic = true;
+	/* Setup DREQ channel */
+	if (c->dreq != 0)
+		info |= BCM2835_DMA_PER_MAP(c->dreq);

-	d->cb_list = kcalloc(d->frames, sizeof(*d->cb_list), GFP_KERNEL);
-	if (!d->cb_list) {
-		kfree(d);
-		return NULL;
+	if (direction == DMA_DEV_TO_MEM) {
+		if (c->cfg.src_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
+			return NULL;
+		src = c->cfg.src_addr;
+		dst = buf_addr;
+		info |= BCM2835_DMA_S_DREQ | BCM2835_DMA_D_INC;
+	} else {
+		if (c->cfg.dst_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
+			return NULL;
+		dst = c->cfg.dst_addr;
+		src = buf_addr;
+		info |= BCM2835_DMA_D_DREQ | BCM2835_DMA_S_INC;
 	}
-	/* Allocate memory for control blocks */
-	for (i = 0; i < d->frames; i++) {
-		struct bcm2835_cb_entry *cb_entry = &d->cb_list[i];

-		cb_entry->cb = dma_pool_zalloc(c->cb_pool, GFP_ATOMIC,
-					       &cb_entry->paddr);
-		if (!cb_entry->cb)
-			goto error_cb;
-	}
+	/* calculate number of frames */
+	frames = /* number of periods */
+		 DIV_ROUND_UP(buf_len, period_len) *
+		 /* number of frames per period */
+		 bcm2835_dma_frames_for_length(period_len, max_len);

 	/*
-	 * Iterate over all frames, create a control block
-	 * for each frame and link them together.
+	 * allocate the CB chain
+	 * note that we need to use GFP_ATOMIC, as the ALSA i2s dmaengine
+	 * implementation calls prep_dma_cyclic with interrupts disabled.
 	 */
-	for (frame = 0; frame < d->frames; frame++) {
-		struct bcm2835_dma_cb *control_block = d->cb_list[frame].cb;
-
-		/* Setup adresses */
-		if (d->dir == DMA_DEV_TO_MEM) {
-			control_block->info = BCM2835_DMA_D_INC;
-			control_block->src = dev_addr;
-			control_block->dst = buf_addr + frame * period_len;
-		} else {
-			control_block->info = BCM2835_DMA_S_INC;
-			control_block->src = buf_addr + frame * period_len;
-			control_block->dst = dev_addr;
-		}
-
-		/* Enable interrupt */
-		control_block->info |= BCM2835_DMA_INT_EN;
-
-		/* Setup synchronization */
-		if (sync_type != 0)
-			control_block->info |= sync_type;
-
-		/* Setup DREQ channel */
-		if (c->dreq != 0)
-			control_block->info |=
-				BCM2835_DMA_PER_MAP(c->dreq);
-
-		/* Length of a frame */
-		control_block->length = period_len;
-		d->size += control_block->length;
+	d = bcm2835_dma_create_cb_chain(chan, direction, true,
+					info, extra,
+					frames, src, dst, buf_len,
+					period_len, GFP_ATOMIC);
+	if (!d)
+		return NULL;

-		/*
-		 * Next block is the next frame.
-		 * This DMA engine driver currently only supports cyclic DMA.
-		 * Therefore, wrap around at number of frames.
-		 */
-		control_block->next = d->cb_list[((frame + 1) % d->frames)].paddr;
-	}
+	/* wrap around into a loop */
+	d->cb_list[d->frames - 1].cb->next = d->cb_list[0].paddr;

 	return vchan_tx_prep(&c->vc, &d->vd, flags);
-error_cb:
-	i--;
-	for (; i >= 0; i--) {
-		struct bcm2835_cb_entry *cb_entry = &d->cb_list[i];
-
-		dma_pool_free(c->cb_pool, cb_entry->cb, cb_entry->paddr);
-	}
-
-	kfree(d->cb_list);
-	kfree(d);
-	return NULL;
 }

 static int bcm2835_dma_slave_config(struct dma_chan *chan,
--
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
                   ` (5 preceding siblings ...)
  2016-01-07 17:33 ` [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  2016-02-18  4:39   ` Eric Anholt
  2016-01-07 17:33 ` [PATCH V2 8/8] dmaengine: bcm2835: add dma_memcopy " kernel at martin.sperl.org
  7 siblings, 1 reply; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

Add slave_sg support to bcm2835-dma using shared allocation
code for bcm2835_desc and DMA-control blocks already used by
dma_cyclic.

Note that bcm2835_dma_callback had to get modified to support
both modes of operation (cyclic and non-cyclic).

Tested using:
* Hifiberry I2S card (using cyclic DMA)
* fb_st7735r SPI-framebuffer (using slave_sg DMA via spi-bcm2835)
playing BigBuckBunny for audio and video.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |  113 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 108 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 41a4f0b..0db8ddb 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -236,6 +236,23 @@ static inline size_t bcm2835_dma_frames_for_length(size_t len,
 	return DIV_ROUND_UP(len, period_len);
 }
 
+static inline size_t bcm2835_dma_count_frames_for_sg(
+	struct bcm2835_chan *c,
+	struct scatterlist *sgl,
+	unsigned int sg_len)
+{
+	size_t frames = 0;
+	struct scatterlist *sgent;
+	unsigned int i;
+	size_t plength = bcm2835_dma_max_frame_length(c);
+
+	for_each_sg(sgl, sgent, sg_len, i)
+		frames += bcm2835_dma_frames_for_length(
+			sg_dma_len(sgent), plength);
+
+	return frames;
+}
+
 /**
  * bcm2835_dma_create_cb_chain - create a control block and fills data in
  *
@@ -347,6 +364,32 @@ error_cb:
 	return NULL;
 }
 
+static void bcm2835_dma_fill_cb_chain_with_sg(
+	struct dma_chan *chan,
+	enum dma_transfer_direction direction,
+	struct bcm2835_cb_entry *cb,
+	struct scatterlist *sgl,
+	unsigned int sg_len)
+{
+	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
+	size_t max_len = bcm2835_dma_max_frame_length(c);
+	unsigned int i, len;
+	dma_addr_t addr;
+	struct scatterlist *sgent;
+
+	for_each_sg(sgl, sgent, sg_len, i) {
+		for (addr = sg_dma_address(sgent), len = sg_dma_len(sgent);
+		     len > 0;
+		     addr += cb->cb->length, len -= cb->cb->length, cb++) {
+			if (direction == DMA_DEV_TO_MEM)
+				cb->cb->dst = addr;
+			else
+				cb->cb->src = addr;
+			cb->cb->length = min(len, max_len);
+		}
+	}
+}
+
 static int bcm2835_dma_abort(void __iomem *chan_base)
 {
 	unsigned long cs;
@@ -423,12 +466,18 @@ static irqreturn_t bcm2835_dma_callback(int irq, void *data)
 	d = c->desc;
 
 	if (d) {
-		/* TODO Only works for cyclic DMA */
-		vchan_cyclic_callback(&d->vd);
-	}
+		if (d->cyclic) {
+			/* call the cyclic callback */
+			vchan_cyclic_callback(&d->vd);
 
-	/* Keep the DMA engine running */
-	writel(BCM2835_DMA_ACTIVE, c->chan_base + BCM2835_DMA_CS);
+			/* Keep the DMA engine running */
+			writel(BCM2835_DMA_ACTIVE,
+			       c->chan_base + BCM2835_DMA_CS);
+		} else {
+			vchan_cookie_complete(&c->desc->vd);
+			bcm2835_dma_start_desc(c);
+		}
+	}
 
 	spin_unlock_irqrestore(&c->vc.lock, flags);
 
@@ -544,6 +593,58 @@ static void bcm2835_dma_issue_pending(struct dma_chan *chan)
 	spin_unlock_irqrestore(&c->vc.lock, flags);
 }
 
+static struct dma_async_tx_descriptor *bcm2835_dma_prep_slave_sg(
+	struct dma_chan *chan,
+	struct scatterlist *sgl, unsigned int sg_len,
+	enum dma_transfer_direction direction,
+	unsigned long flags, void *context)
+{
+	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
+	struct bcm2835_desc *d;
+	dma_addr_t src = 0, dst = 0;
+	u32 info = BCM2835_DMA_WAIT_RESP;
+	u32 extra = BCM2835_DMA_INT_EN;
+	size_t frames;
+
+	if (!is_slave_direction(direction)) {
+		dev_err(chan->device->dev,
+			"%s: bad direction?\n", __func__);
+		return NULL;
+	}
+
+	if (c->dreq != 0)
+		info |= BCM2835_DMA_PER_MAP(c->dreq);
+
+	if (direction == DMA_DEV_TO_MEM) {
+		if (c->cfg.src_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
+			return NULL;
+		src = c->cfg.src_addr;
+		info |= BCM2835_DMA_S_DREQ | BCM2835_DMA_D_INC;
+	} else {
+		if (c->cfg.dst_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
+			return NULL;
+		dst = c->cfg.dst_addr;
+		info |= BCM2835_DMA_D_DREQ | BCM2835_DMA_S_INC;
+	}
+
+	/* count frames in sg list */
+	frames = bcm2835_dma_count_frames_for_sg(c, sgl, sg_len);
+
+	/* allocate the CB chain */
+	d = bcm2835_dma_create_cb_chain(chan, direction, false,
+					info, extra,
+					frames, src, dst, 0, 0,
+					GFP_KERNEL);
+	if (!d)
+		return NULL;
+
+	/* fill in frames with scatterlist pointers */
+	bcm2835_dma_fill_cb_chain_with_sg(chan, direction, d->cb_list,
+					  sgl, sg_len);
+
+	return vchan_tx_prep(&c->vc, &d->vd, flags);
+}
+
 static struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_cyclic(
 	struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 	size_t period_len, enum dma_transfer_direction direction,
@@ -772,11 +873,13 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 	dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
 	dma_cap_set(DMA_PRIVATE, od->ddev.cap_mask);
 	dma_cap_set(DMA_CYCLIC, od->ddev.cap_mask);
+	dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
 	od->ddev.device_alloc_chan_resources = bcm2835_dma_alloc_chan_resources;
 	od->ddev.device_free_chan_resources = bcm2835_dma_free_chan_resources;
 	od->ddev.device_tx_status = bcm2835_dma_tx_status;
 	od->ddev.device_issue_pending = bcm2835_dma_issue_pending;
 	od->ddev.device_prep_dma_cyclic = bcm2835_dma_prep_dma_cyclic;
+	od->ddev.device_prep_slave_sg = bcm2835_dma_prep_slave_sg;
 	od->ddev.device_config = bcm2835_dma_slave_config;
 	od->ddev.device_terminate_all = bcm2835_dma_terminate_all;
 	od->ddev.src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 8/8] dmaengine: bcm2835: add dma_memcopy support to bcm2835-dma
  2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
                   ` (6 preceding siblings ...)
  2016-01-07 17:33 ` [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma kernel at martin.sperl.org
@ 2016-01-07 17:33 ` kernel at martin.sperl.org
  7 siblings, 0 replies; 28+ messages in thread
From: kernel at martin.sperl.org @ 2016-01-07 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

From: Martin Sperl <kernel@martin.sperl.org>

Add dma_memcopy support to bcm2835-dma.

Also added check for an error condition in bcm2835_dma_create_cb_chain
that showed up during development of this patch.

Tested using dmatest for all 11 available dma channels with the
following output:
[  476.543529] dmatest: Started 1 threads using dma0chan0
[  489.011154] dmatest: dma0chan0-copy0: summary 10000 tests, 0 failures 802 iops 6452 KB/s (0)
[  489.078379] dmatest: Started 1 threads using dma0chan1
[  501.494448] dmatest: dma0chan1-copy0: summary 10000 tests, 0 failures 806 iops 6376 KB/s (0)
[  501.562567] dmatest: Started 1 threads using dma0chan10
[  514.515164] dmatest: dma0chan10-copy: summary 10000 tests, 0 failures 772 iops 6169 KB/s (0)
[  514.582969] dmatest: Started 1 threads using dma0chan2
[  527.092284] dmatest: dma0chan2-copy0: summary 10000 tests, 0 failures 800 iops 6405 KB/s (0)
[  527.160290] dmatest: Started 1 threads using dma0chan3
[  539.541064] dmatest: dma0chan3-copy0: summary 10000 tests, 0 failures 808 iops 6424 KB/s (0)
[  539.608849] dmatest: Started 1 threads using dma0chan4
[  552.402978] dmatest: dma0chan4-copy0: summary 10000 tests, 0 failures 782 iops 6286 KB/s (0)
[  552.470852] dmatest: Started 1 threads using dma0chan5
[  565.184158] dmatest: dma0chan5-copy0: summary 10000 tests, 0 failures 787 iops 6231 KB/s (0)
[  565.251846] dmatest: Started 1 threads using dma0chan6
[  577.959385] dmatest: dma0chan6-copy0: summary 10000 tests, 0 failures 787 iops 6244 KB/s (0)
[  578.027259] dmatest: Started 1 threads using dma0chan7
[  590.834078] dmatest: dma0chan7-copy0: summary 10000 tests, 0 failures 781 iops 6295 KB/s (0)
[  590.901668] dmatest: Started 1 threads using dma0chan8
[  603.640192] dmatest: dma0chan8-copy0: summary 10000 tests, 0 failures 785 iops 6265 KB/s (0)
[  603.708531] dmatest: Started 1 threads using dma0chan9
[  616.420882] dmatest: dma0chan9-copy0: summary 10000 tests, 0 failures 787 iops 6285 KB/s (0)

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/dma/bcm2835-dma.c |   36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 0db8ddb..47238ab 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -287,6 +287,9 @@ static struct bcm2835_desc *bcm2835_dma_create_cb_chain(
 	struct bcm2835_dma_cb *control_block;
 	size_t max_len = bcm2835_dma_max_frame_length(c);
 
+	if (!frames)
+		return NULL;
+
 	/* allocate and setup the descriptor. */
 	d = kzalloc(sizeof(*d) + frames * sizeof(struct bcm2835_cb_entry),
 		    gfp);
@@ -593,6 +596,34 @@ static void bcm2835_dma_issue_pending(struct dma_chan *chan)
 	spin_unlock_irqrestore(&c->vc.lock, flags);
 }
 
+struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_memcpy(
+	struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
+	size_t len, unsigned long flags)
+{
+	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
+	struct bcm2835_desc *d;
+	u32 info = BCM2835_DMA_D_INC | BCM2835_DMA_S_INC;
+	u32 extra = BCM2835_DMA_INT_EN | BCM2835_DMA_WAIT_RESP;
+	size_t max_len = bcm2835_dma_max_frame_length(c);
+	size_t frames;
+
+	/* if src, dst or len is not given return with an error */
+	if (!src || !dst || !len)
+		return NULL;
+
+	/* calculate number of frames */
+	frames = bcm2835_dma_frames_for_length(len, max_len);
+
+	/* allocate the CB chain - this also fills in the pointers */
+	d = bcm2835_dma_create_cb_chain(chan, DMA_MEM_TO_MEM, false,
+					info, extra, frames,
+					src, dst, len, 0, GFP_KERNEL);
+	if (!d)
+		return NULL;
+
+	return vchan_tx_prep(&c->vc, &d->vd, flags);
+}
+
 static struct dma_async_tx_descriptor *bcm2835_dma_prep_slave_sg(
 	struct dma_chan *chan,
 	struct scatterlist *sgl, unsigned int sg_len,
@@ -874,17 +905,20 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
 	dma_cap_set(DMA_PRIVATE, od->ddev.cap_mask);
 	dma_cap_set(DMA_CYCLIC, od->ddev.cap_mask);
 	dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
+	dma_cap_set(DMA_MEMCPY, od->ddev.cap_mask);
 	od->ddev.device_alloc_chan_resources = bcm2835_dma_alloc_chan_resources;
 	od->ddev.device_free_chan_resources = bcm2835_dma_free_chan_resources;
 	od->ddev.device_tx_status = bcm2835_dma_tx_status;
 	od->ddev.device_issue_pending = bcm2835_dma_issue_pending;
 	od->ddev.device_prep_dma_cyclic = bcm2835_dma_prep_dma_cyclic;
 	od->ddev.device_prep_slave_sg = bcm2835_dma_prep_slave_sg;
+	od->ddev.device_prep_dma_memcpy = bcm2835_dma_prep_dma_memcpy;
 	od->ddev.device_config = bcm2835_dma_slave_config;
 	od->ddev.device_terminate_all = bcm2835_dma_terminate_all;
 	od->ddev.src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
 	od->ddev.dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES);
-	od->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV);
+	od->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
+			      BIT(DMA_MEM_TO_MEM);
 	od->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 	od->ddev.dev = &pdev->dev;
 	INIT_LIST_HEAD(&od->ddev.channels);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-07 17:33 ` [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14 kernel at martin.sperl.org
@ 2016-01-13 12:26   ` Vinod Koul
  2016-01-13 13:30     ` Martin Sperl
  2016-02-18  4:09   ` Eric Anholt
  1 sibling, 1 reply; 28+ messages in thread
From: Vinod Koul @ 2016-01-13 12:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 07, 2016 at 05:33:01PM +0000, kernel at martin.sperl.org wrote:
> @@ -638,13 +666,21 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
>  		goto err_no_dma;
>  	}
>  
> -	for (i = 0; i < pdev->num_resources; i++) {
> -		irq = platform_get_irq(pdev, i);
> +	for (i = 0; i <= BCM2835_DMA_MAX_CHANNEL_NUMBER; i++) {
> +		if (BCM2835_DMA_IRQ_SHARED_MASK & BIT(i)) {

Ideally this should be done thru DT data and not hard coded in kernel. I
dont think this assumption will hold good for next gen of this device, so
better to get this from DT!

-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers
  2016-01-07 17:33 ` [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers kernel at martin.sperl.org
@ 2016-01-13 12:32   ` Vinod Koul
  2016-02-18  4:18   ` Eric Anholt
  1 sibling, 0 replies; 28+ messages in thread
From: Vinod Koul @ 2016-01-13 12:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 07, 2016 at 05:33:02PM +0000, kernel at martin.sperl.org wrote:
> +#define BCM2835_DMA_PRIORITY(x) ((x & 15) << 16) /* AXI priority */
> +#define BCM2835_DMA_PANIC_PRIORITY(x) ((x & 15) << 20) /* panic priority */
> +#define BCM2835_DMA_WAIT_FOR_WRITES BIT(28) /* current value of
> +					     * TI.BCM2835_DMA_WAIT_RESP
> +					     */

I would put the comment before this, yes is not consistent but is more
readable..


-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method
  2016-01-07 17:33 ` [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method kernel at martin.sperl.org
@ 2016-01-13 13:23   ` Vinod Koul
  2016-01-13 13:38     ` Martin Sperl
  2016-02-18  3:24   ` Eric Anholt
  1 sibling, 1 reply; 28+ messages in thread
From: Vinod Koul @ 2016-01-13 13:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 07, 2016 at 05:33:04PM +0000, kernel at martin.sperl.org wrote:

> +static inline bool bcm2835_dma_is_lite(struct bcm2835_chan *c)
> +{
> +	/* dma channels >= 7 are LITE channels */
> +	return (c->ch >= 7);

Why not DT data here as well

> +}
> +
> +static inline size_t bcm2835_dma_max_frame_length(struct bcm2835_chan *c)
> +{
> +	/* lite and normal channels have different max frame length */
> +	return bcm2835_dma_is_lite(c) ? MAX_LITE_DMA_LEN : MAX_DMA_LEN;

Or rather get length from DT..

> +static struct bcm2835_desc *bcm2835_dma_create_cb_chain(
> +	struct dma_chan *chan, enum dma_transfer_direction direction,
> +	bool cyclic, u32 info, u32 finalextrainfo, size_t frames,
> +	dma_addr_t src, dma_addr_t dst, size_t buf_len,
> +	size_t period_len, gfp_t gfp)
> +{
> +	struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
> +	size_t len = buf_len, total_len;
> +	size_t frame;
> +	struct bcm2835_desc *d;
> +	struct bcm2835_cb_entry *cb_entry;
> +	struct bcm2835_dma_cb *control_block;
> +	size_t max_len = bcm2835_dma_max_frame_length(c);
> +
> +	/* allocate and setup the descriptor. */
> +	d = kzalloc(sizeof(*d) + frames * sizeof(struct bcm2835_cb_entry),
> +		    gfp);

odd style.. btw should flag be GFP_NOWAIT ..?

> +		/* fill in the control block */
> +		control_block = cb_entry->cb;
> +		control_block->info = info;
> +		control_block->src = src;
> +		control_block->dst = dst;
> +		if (buf_len) {
> +			control_block->length = min(max_len, len);
> +			if (period_len &&
> +			    (total_len + control_block->length >=
> +			     period_len)) {
> +				/* set to end of period_len */
> +				control_block->length =
> +					period_len - total_len;
> +				/* add extrainfo when cyclic */
> +				if (cyclic)
> +					control_block->info |=
> +						finalextrainfo;
> +				/* and reset total_len */
> +				total_len = 0;
> +			}

this looks hard to read, perhpas a helper will make it look better

> +	/* the last frame requires extra flags */
> +	d->cb_list[d->frames - 1].cb->info |= finalextrainfo;
> +
> +	/* check the size - if there is some missmatch,
> +	 * then this is detected here
> +	 */

this is not kernel style for multi-line comments

>  	/* Grab configuration */
>  	if (!is_slave_direction(direction)) {
> -		dev_err(chan->device->dev, "%s: bad direction?\n", __func__);
> +		dev_err(chan->device->dev,
> +			"%s: bad direction?\n", __func__);

unrelated change

> -	/* Now allocate and setup the descriptor. */
> -	d = kzalloc(sizeof(*d), GFP_NOWAIT);
> -	if (!d)
> -		return NULL;
> +	/* warn if buf_len is not a multiple of period_lenas this may leed
> +	 * to unexpected latencies for interrupts and thus audiable clicks
> +	 */

here too

>  	/*
> -	 * Iterate over all frames, create a control block
> -	 * for each frame and link them together.
> +	 * allocate the CB chain
> +	 * note that we need to use GFP_ATOMIC, as the ALSA i2s dmaengine

dmaengine drivers use GFP_NOWAIT in these cases

-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-13 12:26   ` Vinod Koul
@ 2016-01-13 13:30     ` Martin Sperl
  2016-01-13 13:43       ` Vinod Koul
  0 siblings, 1 reply; 28+ messages in thread
From: Martin Sperl @ 2016-01-13 13:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 13.01.2016 13:26, Vinod Koul wrote:
> On Thu, Jan 07, 2016 at 05:33:01PM +0000, kernel at martin.sperl.org wrote:
>> @@ -638,13 +666,21 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
>>   		goto err_no_dma;
>>   	}
>>
>> -	for (i = 0; i < pdev->num_resources; i++) {
>> -		irq = platform_get_irq(pdev, i);
>> +	for (i = 0; i <= BCM2835_DMA_MAX_CHANNEL_NUMBER; i++) {
>> +		if (BCM2835_DMA_IRQ_SHARED_MASK & BIT(i)) {
>
> Ideally this should be done thru DT data and not hard coded in kernel. I
> dont think this assumption will hold good for next gen of this device, so
> better to get this from DT!

The ideal solution would be breaking the DT in such a way that we could
define a register range and interrupt per dma-channel looking something
like this:

diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
index 83d9787..9526b91 100644
--- a/arch/arm/boot/dts/bcm2835.dtsi
+++ b/arch/arm/boot/dts/bcm2835.dtsi
@@ -31,8 +31,28 @@

                 dma: dma at 7e007000 {
                         compatible = "brcm,bcm2835-dma";
-                       reg = <0x7e007000 0xf00>;
-                       interrupts = <1 16>,
+                       reg = <0x7e007f00 0x100>, /* status reg */
+                             /* dma channel 0-14 base addresses */
+                             <0x7e007000 0x100>,
+                             <0x7e007100 0x100>,
+                             <0x7e007200 0x100>,
+                             <0x7e007300 0x100>,
+                             <0x7e007400 0x100>,
+                             <0x7e007500 0x100>,
+                             <0x7e007600 0x100>,
+                             <0x7e007700 0x100>,
+                             <0x7e007800 0x100>,
+                             <0x7e007900 0x100>,
+                             <0x7e007a00 0x100>,
+                             <0x7e007b00 0x100>,
+                             <0x7e007c00 0x100>,
+                             <0x7e007d00 0x100>,
+                             <0x7e007e00 0x100>,
+                             /* dma channel 15 uses a different base */
+                             <0x7ee05000 0x100>;
+                       interrupts = <1 28>, /* catch all DMA-interrupts */
+                                    /* dma channel 0-10 interrupts */
+                                    <1 16>,
                                      <1 17>,
                                      <1 18>,
                                      <1 19>,
@@ -43,9 +63,30 @@
                                      <1 24>,
                                      <1 25>,
                                      <1 26>,
+                                    /* dma channel 11-14 share irq */
                                      <1 27>,
-                                    <1 28>;
-
+                                    <1 27>,
+                                    <1 27>,
+                                    <1 27>,
+                                    /* no irq support for dma channel 15 */
+                                    < 0 >;
+                       dma-names = "shared",
+                                   "dma0",
+                                   "dma1",
+                                   "dma2",
+                                   "dma3",
+                                   "dma4",
+                                   "dma5",
+                                   "dma6",
+                                   "dma7",
+                                   "dma8",
+                                   "dma9",
+                                   "dma10",
+                                   "dma11",
+                                   "dma12",
+                                   "dma13",
+                                   "dma14",
+                                   "dma15";
                         #dma-cells = <1>;
                         brcm,dma-channel-mask = <0x7f35>;
(or similar)

This actually would allow us to make "brcm,dma-channel-mask" redundant,
as we could remove those dma channels that are owned by the firmware
directly from the list.

That way we could also map other capabilities via the DT.

It would also allow a transparent addition of additional dma channels
with newer versions of the HW - mostly - by modifying the DT.

But that would be frowned upon, so I had to come up with the approach
taken, which makes the following assumptions:
* the DT maps only the interrupts that are assigned to the HW block
* the driver knows about the number of DMA channels in HW
* the driver knows about the mapping of shared interrupts
   (11-14 share irq).

It is not optimal, but at least it works with the least amount of
change to the DT - and what about all those assumptions that we
would need to hard-code to be backwards compatible to the DT without?

I guess we could replace BCM2835_DMA_MAX_CHANNEL_NUMBER with:
   /* we do not support dma channel 15 with this driver */
   #define BCM2835_DMA_MAX_CHANNEL_SUPPORTED 14
   ...
   for (i = 0;
        i <= min_t(int, flv(chans_available), 
BCM2835_DMA_MAX_CHANNEL_SUPPORTED);
        i++) {

So which way would you prefer this to go - I got another few days
before I leave on vacation.

Thanks,
	Martin

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method
  2016-01-13 13:23   ` Vinod Koul
@ 2016-01-13 13:38     ` Martin Sperl
  0 siblings, 0 replies; 28+ messages in thread
From: Martin Sperl @ 2016-01-13 13:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 13.01.2016 14:23, Vinod Koul wrote:
> On Thu, Jan 07, 2016 at 05:33:04PM +0000, kernel at martin.sperl.org wrote:
>
>> +static inline bool bcm2835_dma_is_lite(struct bcm2835_chan *c)
>> +{
>> +	/* dma channels >= 7 are LITE channels */
>> +	return (c->ch >= 7);
>
> Why not DT data here as well

See other email with comments on DT.

>
>> +}
>> +
>> +static inline size_t bcm2835_dma_max_frame_length(struct bcm2835_chan *c)
>> +{
>> +	/* lite and normal channels have different max frame length */
>> +	return bcm2835_dma_is_lite(c) ? MAX_LITE_DMA_LEN : MAX_DMA_LEN;
>
> Or rather get length from DT..
See other email with comments on DT.

>> +	/* allocate and setup the descriptor. */
>> +	d = kzalloc(sizeof(*d) + frames * sizeof(struct bcm2835_cb_entry),
>> +		    gfp);
>
> odd style.. btw should flag be GFP_NOWAIT ..?
gfp was used as a method-argument - see your comment below.

>
>> +		/* fill in the control block */
>> +		control_block = cb_entry->cb;
>> +		control_block->info = info;
>> +		control_block->src = src;
>> +		control_block->dst = dst;
>> +		if (buf_len) {
>> +			control_block->length = min(max_len, len);
>> +			if (period_len &&
>> +			    (total_len + control_block->length >=
>> +			     period_len)) {
>> +				/* set to end of period_len */
>> +				control_block->length =
>> +					period_len - total_len;
>> +				/* add extrainfo when cyclic */
>> +				if (cyclic)
>> +					control_block->info |=
>> +						finalextrainfo;
>> +				/* and reset total_len */
>> +				total_len = 0;
>> +			}
>
> this looks hard to read, perhpas a helper will make it look better
Perhaps - I can look...
>
>> +	/* the last frame requires extra flags */
>> +	d->cb_list[d->frames - 1].cb->info |= finalextrainfo;
>> +
>> +	/* check the size - if there is some missmatch,
>> +	 * then this is detected here
>> +	 */
>
> this is not kernel style for multi-line comments
I know - this came up with a different patch-set

>
>>   	/* Grab configuration */
>>   	if (!is_slave_direction(direction)) {
>> -		dev_err(chan->device->dev, "%s: bad direction?\n", __func__);
>> +		dev_err(chan->device->dev,
>> +			"%s: bad direction?\n", __func__);
>
> unrelated change
OK

>
>> -	/* Now allocate and setup the descriptor. */
>> -	d = kzalloc(sizeof(*d), GFP_NOWAIT);
>> -	if (!d)
>> -		return NULL;
>> +	/* warn if buf_len is not a multiple of period_lenas this may leed
>> +	 * to unexpected latencies for interrupts and thus audiable clicks
>> +	 */
>
> here too

as this code is new and there are a few more issues that have been seen
it was included here...

>
>>   	/*
>> -	 * Iterate over all frames, create a control block
>> -	 * for each frame and link them together.
>> +	 * allocate the CB chain
>> +	 * note that we need to use GFP_ATOMIC, as the ALSA i2s dmaengine
>
> dmaengine drivers use GFP_NOWAIT in these cases
OK we were not sure if this was "permissible" or would result in rare
issues...

I will wait on the generic DT discussion before sending any updates
incorporating both...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-13 13:30     ` Martin Sperl
@ 2016-01-13 13:43       ` Vinod Koul
  2016-01-13 14:24         ` Martin Sperl
  0 siblings, 1 reply; 28+ messages in thread
From: Vinod Koul @ 2016-01-13 13:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 13, 2016 at 02:30:32PM +0100, Martin Sperl wrote:
> On 13.01.2016 13:26, Vinod Koul wrote:
> >On Thu, Jan 07, 2016 at 05:33:01PM +0000, kernel at martin.sperl.org wrote:
> >>@@ -638,13 +666,21 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
> >>  		goto err_no_dma;
> >>  	}
> >>
> >>-	for (i = 0; i < pdev->num_resources; i++) {
> >>-		irq = platform_get_irq(pdev, i);
> >>+	for (i = 0; i <= BCM2835_DMA_MAX_CHANNEL_NUMBER; i++) {
> >>+		if (BCM2835_DMA_IRQ_SHARED_MASK & BIT(i)) {
> >
> >Ideally this should be done thru DT data and not hard coded in kernel. I
> >dont think this assumption will hold good for next gen of this device, so
> >better to get this from DT!
> 
> The ideal solution would be breaking the DT in such a way that we could
> define a register range and interrupt per dma-channel looking something
> like this:
> 
> diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
> index 83d9787..9526b91 100644
> --- a/arch/arm/boot/dts/bcm2835.dtsi
> +++ b/arch/arm/boot/dts/bcm2835.dtsi
> @@ -31,8 +31,28 @@
> 
>                 dma: dma at 7e007000 {
>                         compatible = "brcm,bcm2835-dma";
> -                       reg = <0x7e007000 0xf00>;
> -                       interrupts = <1 16>,
> +                       reg = <0x7e007f00 0x100>, /* status reg */
> +                             <0x7e007000 0x100>,
> +                             <0x7e007100 0x100>,
> +                             <0x7e007200 0x100>,
> +                             <0x7e007300 0x100>,
> +                             <0x7e007400 0x100>,
> +                             <0x7e007500 0x100>,
> +                             <0x7e007600 0x100>,
> +                             <0x7e007700 0x100>,
> +                             <0x7e007800 0x100>,
> +                             <0x7e007900 0x100>,
> +                             <0x7e007a00 0x100>,
> +                             <0x7e007b00 0x100>,
> +                             <0x7e007c00 0x100>,
> +                             <0x7e007d00 0x100>,
> +                             <0x7e007e00 0x100>,
> +                             /* dma channel 15 uses a different base */
> +                             <0x7ee05000 0x100>;
> +                       interrupts = <1 28>, /* catch all DMA-interrupts */
> +                                    /* dma channel 0-10 interrupts */
> +                                    <1 16>,
>                                      <1 17>,
>                                      <1 18>,
>                                      <1 19>,
> @@ -43,9 +63,30 @@
>                                      <1 24>,
>                                      <1 25>,
>                                      <1 26>,
> +                                    /* dma channel 11-14 share irq */
>                                      <1 27>,
> -                                    <1 28>;
> -
> +                                    <1 27>,
> +                                    <1 27>,
> +                                    <1 27>,
> +                                    /* no irq support for dma channel 15 */
> +                                    < 0 >;
> +                       dma-names = "shared",
> +                                   "dma0",
> +                                   "dma1",
> +                                   "dma2",
> +                                   "dma3",
> +                                   "dma4",
> +                                   "dma5",
> +                                   "dma6",
> +                                   "dma7",
> +                                   "dma8",
> +                                   "dma9",
> +                                   "dma10",
> +                                   "dma11",
> +                                   "dma12",
> +                                   "dma13",
> +                                   "dma14",
> +                                   "dma15";
>                         #dma-cells = <1>;
>                         brcm,dma-channel-mask = <0x7f35>;
> (or similar)
> 
> This actually would allow us to make "brcm,dma-channel-mask" redundant,
> as we could remove those dma channels that are owned by the firmware
> directly from the list.
> 
> That way we could also map other capabilities via the DT.
> 
> It would also allow a transparent addition of additional dma channels
> with newer versions of the HW - mostly - by modifying the DT.

Precisely

> 
> But that would be frowned upon, so I had to come up with the approach
> taken, which makes the following assumptions:

DT was designed to move this info and hardcoding from kernel into
DT, so why cant we do that?

> * the DT maps only the interrupts that are assigned to the HW block
> * the driver knows about the number of DMA channels in HW
> * the driver knows about the mapping of shared interrupts
>   (11-14 share irq).
> 
> It is not optimal, but at least it works with the least amount of
> change to the DT - and what about all those assumptions that we
> would need to hard-code to be backwards compatible to the DT without?
> 
> I guess we could replace BCM2835_DMA_MAX_CHANNEL_NUMBER with:
>   /* we do not support dma channel 15 with this driver */
>   #define BCM2835_DMA_MAX_CHANNEL_SUPPORTED 14
>   ...
>   for (i = 0;
>        i <= min_t(int, flv(chans_available),
> BCM2835_DMA_MAX_CHANNEL_SUPPORTED);
>        i++) {
> 
> So which way would you prefer this to go - I got another few days
> before I leave on vacation.

I still think DT is the right way to go here, unless I hear some other
convincing answer..

-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-13 13:43       ` Vinod Koul
@ 2016-01-13 14:24         ` Martin Sperl
  2016-01-14  4:07           ` Vinod Koul
  0 siblings, 1 reply; 28+ messages in thread
From: Martin Sperl @ 2016-01-13 14:24 UTC (permalink / raw)
  To: linux-arm-kernel



On 13.01.2016 14:43, Vinod Koul wrote:
> On Wed, Jan 13, 2016 at 02:30:32PM +0100, Martin Sperl wrote:
>> On 13.01.2016 13:26, Vinod Koul wrote:
>>> On Thu, Jan 07, 2016 at 05:33:01PM +0000, kernel at martin.sperl.org wrote:
>>>> @@ -638,13 +666,21 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
>>>>   		goto err_no_dma;
>>>>   	}
>>>>
>>>> -	for (i = 0; i < pdev->num_resources; i++) {
>>>> -		irq = platform_get_irq(pdev, i);
>>>> +	for (i = 0; i <= BCM2835_DMA_MAX_CHANNEL_NUMBER; i++) {
>>>> +		if (BCM2835_DMA_IRQ_SHARED_MASK & BIT(i)) {
>>>
>>> Ideally this should be done thru DT data and not hard coded in kernel. I
>>> dont think this assumption will hold good for next gen of this device, so
>>> better to get this from DT!
>>
>> The ideal solution would be breaking the DT in such a way that we could
>> define a register range and interrupt per dma-channel looking something
>> like this:
>>
>> diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
>> index 83d9787..9526b91 100644
>> --- a/arch/arm/boot/dts/bcm2835.dtsi
>> +++ b/arch/arm/boot/dts/bcm2835.dtsi
>> @@ -31,8 +31,28 @@
>>
>>                  dma: dma at 7e007000 {
>>                          compatible = "brcm,bcm2835-dma";
>> -                       reg = <0x7e007000 0xf00>;
>> -                       interrupts = <1 16>,
>> +                       reg = <0x7e007f00 0x100>, /* status reg */
>> +                             <0x7e007000 0x100>,
>> +                             <0x7e007100 0x100>,
>> +                             <0x7e007200 0x100>,
>> +                             <0x7e007300 0x100>,
>> +                             <0x7e007400 0x100>,
>> +                             <0x7e007500 0x100>,
>> +                             <0x7e007600 0x100>,
>> +                             <0x7e007700 0x100>,
>> +                             <0x7e007800 0x100>,
>> +                             <0x7e007900 0x100>,
>> +                             <0x7e007a00 0x100>,
>> +                             <0x7e007b00 0x100>,
>> +                             <0x7e007c00 0x100>,
>> +                             <0x7e007d00 0x100>,
>> +                             <0x7e007e00 0x100>,
>> +                             /* dma channel 15 uses a different base */
>> +                             <0x7ee05000 0x100>;
>> +                       interrupts = <1 28>, /* catch all DMA-interrupts */
>> +                                    /* dma channel 0-10 interrupts */
>> +                                    <1 16>,
>>                                       <1 17>,
>>                                       <1 18>,
>>                                       <1 19>,
>> @@ -43,9 +63,30 @@
>>                                       <1 24>,
>>                                       <1 25>,
>>                                       <1 26>,
>> +                                    /* dma channel 11-14 share irq */
>>                                       <1 27>,
>> -                                    <1 28>;
>> -
>> +                                    <1 27>,
>> +                                    <1 27>,
>> +                                    <1 27>,
>> +                                    /* no irq support for dma channel 15 */
>> +                                    < 0 >;
>> +                       dma-names = "shared",
>> +                                   "dma0",
>> +                                   "dma1",
>> +                                   "dma2",
>> +                                   "dma3",
>> +                                   "dma4",
>> +                                   "dma5",
>> +                                   "dma6",
>> +                                   "dma7",
>> +                                   "dma8",
>> +                                   "dma9",
>> +                                   "dma10",
>> +                                   "dma11",
>> +                                   "dma12",
>> +                                   "dma13",
>> +                                   "dma14",
>> +                                   "dma15";
>>                          #dma-cells = <1>;
>>                          brcm,dma-channel-mask = <0x7f35>;
>> (or similar)
>>
>> This actually would allow us to make "brcm,dma-channel-mask" redundant,
>> as we could remove those dma channels that are owned by the firmware
>> directly from the list.
>>
>> That way we could also map other capabilities via the DT.
>>
>> It would also allow a transparent addition of additional dma channels
>> with newer versions of the HW - mostly - by modifying the DT.
>
> Precisely
>
>>
>> But that would be frowned upon, so I had to come up with the approach
>> taken, which makes the following assumptions:
>
> DT was designed to move this info and hardcoding from kernel into
> DT, so why cant we do that?
We still need to be backwards-compatible - at least that is what
everyone tells me, so I need to hard-code fallbacks for those values.
>
>> * the DT maps only the interrupts that are assigned to the HW block
>> * the driver knows about the number of DMA channels in HW
that could be a DT property, yes.
>> * the driver knows about the mapping of shared interrupts
>>    (11-14 share irq).
OK - how would you define that "mapping" in a "sane" manner in the DT
that allows us to have multiple such mappings in the future?

>>
>> It is not optimal, but at least it works with the least amount of
>> change to the DT - and what about all those assumptions that we
>> would need to hard-code to be backwards compatible to the DT without?
>>
>> I guess we could replace BCM2835_DMA_MAX_CHANNEL_NUMBER with:
>>    /* we do not support dma channel 15 with this driver */
>>    #define BCM2835_DMA_MAX_CHANNEL_SUPPORTED 14
>>    ...
>>    for (i = 0;
>>         i <= min_t(int, flv(chans_available),
>> BCM2835_DMA_MAX_CHANNEL_SUPPORTED);
>>         i++) {
>>
>> So which way would you prefer this to go - I got another few days
>> before I leave on vacation.
>
> I still think DT is the right way to go here, unless I hear some other
> convincing answer..
>

The point is that the way the DT is right now it becomes very hard to
extend it in a sane manner - It would be bitmaps/lists here,
bitmaps/lists there...

Breaking the DT would make all of it go away.

If I think of it (reading your other comments), then here how the
"new" DT could look like:
dma: dma at 7e007f00 {
	/* new compatible name to map to new driver */
	compatible = "brcm,bcm2835-dma-v2";

	/* the status/enable registers */
	reg = <0x7e007f00 0x100>;

	/* the catch all interrupt */
	interrupts = <1 28>;

	dma0: dma-channel at 0x7e007000 {
		reg = <0x7e007000 0x100>;
		interrupts = <1 16>;
	};
	...
	dma7: dma-channel at 0x7e007700 {
		reg = <0x7e007700 0x100>;
		interrupts = <1 23>;
		dma-channel-type = <BCM2835_DMA_LITE>;
		dma-max-length = <65532>; /* 64K - 4 */
	};
	...
	dma11: dma-channel at 0x7e007b00 {
		reg = <0x7e007b00 0x100>;
		interrupts = <1 27>;
		shared-interrupt;
		dma-channel-type = <BCM2835_DMA_LITE>;
	};
	...
	dma14: dma-channel at 0x7e007e00 {
		reg = <0x7e007e00 0x100>;
		interrupts = <1 27>;
		shared-interrupt;
		dma-channel-type = <BCM2835_DMA_LITE>;
	};
	dma15: dma-channel at 0x7ee05000 {
		reg = <0x7ee05000 0x100>;
	};
};

Adding additional "features" would make it easy like:
* dma-channel priority on AXI bus
* setting warn/alert levels for DREQs
* eventually mapping of drivers to the explicit dma channel
   e.g: SPI has a limit of 65534 bytes per dma transfer in HW,
   so the use of a DMA-LITE engine would be sufficient - no need
   to waste a "normal" DMA-engine with 2D support,...
   So the spi dt node  could then look like this:
     dmas = <&dma11 BCM2835_DREQ_SPI_TX>, <&dma12 BCM2835_DREQ_SPI_RX>;
     dma-names = "tx", "rx";

Is this the approach I should try to take?

Or do you want me to continue on the "patchwork" route:
dma: dma at 7e007000 {
	interrupt-names = "dma0", ..., "dma10", "shared", "catch-all";

	bcrm,dma-lite-channel-mask = <0x7f80>;
	bcrm,dma-max-channels = 14;

	bcrm,catch-all-interrupt-name = "catch-all";
	bcrm,shared-interrupt-name = "shared";
	bcrm,shared-interrupt-channel-mapping = <0x7800>;
}

For all of the above we would need to define defaults in the driver
to make it work in a backwards compatible way.

That would not support the registers for each dma channel and not lend
itself towards extending only via the DT - say support DMA channel 15.

I have another 5 days before I leave on vacation - there is only so
much I can do...

Thanks,
	Martin

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-13 14:24         ` Martin Sperl
@ 2016-01-14  4:07           ` Vinod Koul
  2016-01-14  8:48             ` Martin Sperl
  2016-02-29 17:10             ` Martin Sperl
  0 siblings, 2 replies; 28+ messages in thread
From: Vinod Koul @ 2016-01-14  4:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 13, 2016 at 03:24:59PM +0100, Martin Sperl wrote:
> >>But that would be frowned upon, so I had to come up with the approach
> >>taken, which makes the following assumptions:
> >
> >DT was designed to move this info and hardcoding from kernel into
> >DT, so why cant we do that?
> We still need to be backwards-compatible - at least that is what
> everyone tells me, so I need to hard-code fallbacks for those values.

IMO hard code for falling back is okay as that supports old cases and new
platforms use geric DT info and new devices can be supported generically,
please check with DT folks..

> >>* the DT maps only the interrupts that are assigned to the HW block
> >>* the driver knows about the number of DMA channels in HW
> that could be a DT property, yes.
> >>* the driver knows about the mapping of shared interrupts
> >>   (11-14 share irq).
> OK - how would you define that "mapping" in a "sane" manner in the DT
> that allows us to have multiple such mappings in the future?

You are hard coding the flags for each channel, we can pass this for each
channel in the interrupt configi, a flag share/none..? Please run this thru
DT experts and I am not one of them..

> >>It is not optimal, but at least it works with the least amount of
> >>change to the DT - and what about all those assumptions that we
> >>would need to hard-code to be backwards compatible to the DT without?
> >>
> >>I guess we could replace BCM2835_DMA_MAX_CHANNEL_NUMBER with:
> >>   /* we do not support dma channel 15 with this driver */
> >>   #define BCM2835_DMA_MAX_CHANNEL_SUPPORTED 14
> >>   ...
> >>   for (i = 0;
> >>        i <= min_t(int, flv(chans_available),
> >>BCM2835_DMA_MAX_CHANNEL_SUPPORTED);
> >>        i++) {
> >>
> >>So which way would you prefer this to go - I got another few days
> >>before I leave on vacation.
> >
> >I still think DT is the right way to go here, unless I hear some other
> >convincing answer..
> >
> 
> The point is that the way the DT is right now it becomes very hard to
> extend it in a sane manner - It would be bitmaps/lists here,
> bitmaps/lists there...
> 
> Breaking the DT would make all of it go away.
> 
> If I think of it (reading your other comments), then here how the
> "new" DT could look like:
> dma: dma at 7e007f00 {
> 	/* new compatible name to map to new driver */
> 	compatible = "brcm,bcm2835-dma-v2";
> 
> 	/* the status/enable registers */
> 	reg = <0x7e007f00 0x100>;
> 
> 	/* the catch all interrupt */
> 	interrupts = <1 28>;
> 
> 	dma0: dma-channel at 0x7e007000 {
> 		reg = <0x7e007000 0x100>;
> 		interrupts = <1 16>;
> 	};
> 	...
> 	dma7: dma-channel at 0x7e007700 {
> 		reg = <0x7e007700 0x100>;
> 		interrupts = <1 23>;
> 		dma-channel-type = <BCM2835_DMA_LITE>;
> 		dma-max-length = <65532>; /* 64K - 4 */
> 	};
> 	...
> 	dma11: dma-channel at 0x7e007b00 {
> 		reg = <0x7e007b00 0x100>;
> 		interrupts = <1 27>;
> 		shared-interrupt;
> 		dma-channel-type = <BCM2835_DMA_LITE>;
> 	};
> 	...
> 	dma14: dma-channel at 0x7e007e00 {
> 		reg = <0x7e007e00 0x100>;
> 		interrupts = <1 27>;
> 		shared-interrupt;
> 		dma-channel-type = <BCM2835_DMA_LITE>;
> 	};
> 	dma15: dma-channel at 0x7ee05000 {
> 		reg = <0x7ee05000 0x100>;
> 	};
> };

This looks okay to me to start with...

> 
> Adding additional "features" would make it easy like:
> * dma-channel priority on AXI bus
> * setting warn/alert levels for DREQs
> * eventually mapping of drivers to the explicit dma channel
>   e.g: SPI has a limit of 65534 bytes per dma transfer in HW,
>   so the use of a DMA-LITE engine would be sufficient - no need
>   to waste a "normal" DMA-engine with 2D support,...
>   So the spi dt node  could then look like this:
>     dmas = <&dma11 BCM2835_DREQ_SPI_TX>, <&dma12 BCM2835_DREQ_SPI_RX>;
>     dma-names = "tx", "rx";
> 
> Is this the approach I should try to take?
> 
> Or do you want me to continue on the "patchwork" route:
> dma: dma at 7e007000 {
> 	interrupt-names = "dma0", ..., "dma10", "shared", "catch-all";
> 
> 	bcrm,dma-lite-channel-mask = <0x7f80>;
> 	bcrm,dma-max-channels = 14;
> 
> 	bcrm,catch-all-interrupt-name = "catch-all";
> 	bcrm,shared-interrupt-name = "shared";
> 	bcrm,shared-interrupt-channel-mapping = <0x7800>;
> }
> 
> For all of the above we would need to define defaults in the driver
> to make it work in a backwards compatible way.
> 
> That would not support the registers for each dma channel and not lend
> itself towards extending only via the DT - say support DMA channel 15.
> 
> I have another 5 days before I leave on vacation - there is only so
> much I can do...
> 
> Thanks,
> 	Martin

-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-14  4:07           ` Vinod Koul
@ 2016-01-14  8:48             ` Martin Sperl
  2016-02-29 17:10             ` Martin Sperl
  1 sibling, 0 replies; 28+ messages in thread
From: Martin Sperl @ 2016-01-14  8:48 UTC (permalink / raw)
  To: linux-arm-kernel


On 14.01.2016 05:07, Vinod Koul wrote:
> On Wed, Jan 13, 2016 at 03:24:59PM +0100, Martin Sperl wrote:
>>>> But that would be frowned upon, so I had to come up with the approach
>>>> taken, which makes the following assumptions:
>>>
>>> DT was designed to move this info and hardcoding from kernel into
>>> DT, so why cant we do that?
>> We still need to be backwards-compatible - at least that is what
>> everyone tells me, so I need to hard-code fallbacks for those values.
>
> IMO hard code for falling back is okay as that supports old cases and new
> platforms use geric DT info and new devices can be supported generically,
> please check with DT folks..
>
>>>> * the DT maps only the interrupts that are assigned to the HW block
>>>> * the driver knows about the number of DMA channels in HW
>> that could be a DT property, yes.
>>>> * the driver knows about the mapping of shared interrupts
>>>>    (11-14 share irq).
>> OK - how would you define that "mapping" in a "sane" manner in the DT
>> that allows us to have multiple such mappings in the future?
>
> You are hard coding the flags for each channel, we can pass this for each
> channel in the interrupt configi, a flag share/none..? Please run this thru
> DT experts and I am not one of them..
>
>>>> It is not optimal, but at least it works with the least amount of
>>>> change to the DT - and what about all those assumptions that we
>>>> would need to hard-code to be backwards compatible to the DT without?
>>>>
>>>> I guess we could replace BCM2835_DMA_MAX_CHANNEL_NUMBER with:
>>>>    /* we do not support dma channel 15 with this driver */
>>>>    #define BCM2835_DMA_MAX_CHANNEL_SUPPORTED 14
>>>>    ...
>>>>    for (i = 0;
>>>>         i <= min_t(int, flv(chans_available),
>>>> BCM2835_DMA_MAX_CHANNEL_SUPPORTED);
>>>>         i++) {
>>>>
>>>> So which way would you prefer this to go - I got another few days
>>>> before I leave on vacation.
>>>
>>> I still think DT is the right way to go here, unless I hear some other
>>> convincing answer..
>>>
>>
>> The point is that the way the DT is right now it becomes very hard to
>> extend it in a sane manner - It would be bitmaps/lists here,
>> bitmaps/lists there...
>>
>> Breaking the DT would make all of it go away.
>>
>> If I think of it (reading your other comments), then here how the
>> "new" DT could look like:
>> dma: dma at 7e007f00 {
>> 	/* new compatible name to map to new driver */
>> 	compatible = "brcm,bcm2835-dma-v2";
>>
>> 	/* the status/enable registers */
>> 	reg = <0x7e007f00 0x100>;
>>
>> 	/* the catch all interrupt */
>> 	interrupts = <1 28>;
>>
>> 	dma0: dma-channel at 0x7e007000 {
>> 		reg = <0x7e007000 0x100>;
>> 		interrupts = <1 16>;
>> 	};
>> 	...
>> 	dma7: dma-channel at 0x7e007700 {
>> 		reg = <0x7e007700 0x100>;
>> 		interrupts = <1 23>;
>> 		dma-channel-type = <BCM2835_DMA_LITE>;
>> 		dma-max-length = <65532>; /* 64K - 4 */
>> 	};
>> 	...
>> 	dma11: dma-channel at 0x7e007b00 {
>> 		reg = <0x7e007b00 0x100>;
>> 		interrupts = <1 27>;
>> 		shared-interrupt;
>> 		dma-channel-type = <BCM2835_DMA_LITE>;
>> 	};
>> 	...
>> 	dma14: dma-channel at 0x7e007e00 {
>> 		reg = <0x7e007e00 0x100>;
>> 		interrupts = <1 27>;
>> 		shared-interrupt;
>> 		dma-channel-type = <BCM2835_DMA_LITE>;
>> 	};
>> 	dma15: dma-channel at 0x7ee05000 {
>> 		reg = <0x7ee05000 0x100>;
>> 	};
>> };
>
> This looks okay to me to start with...

Can we please agree on something over the next 2 weeks, so that I can
implement it when I get back from vacation?

The point is that it would be a much more explicit device-tree
with the way I had proposed it above.

I guess we still could use the same driver (compatiblity)
with some reorganization. To detect old and new DTs we could
just check the size of regs or number of interrupts and if they
are reg.len > 256 bytes or count(interrupts) = 13  then fall back
to legacy probe...

Thanks,
	Martin

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field
  2016-01-07 17:32 ` [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field kernel at martin.sperl.org
@ 2016-01-28 19:53   ` Eric Anholt
  0 siblings, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-01-28 19:53 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> bcm2835-dma supports residue reporting at burst level but didn't report
> this via the residue_granularity field.
>
> See also:
> https://github.com/raspberrypi/linux/commit/b015555327afa402f70ddc86e3632f59df1cd9d7
> for the downstream patch.
>
> Signed-off-by: Matthias Reichl <hias@horus.com>
> Signed-off-by: Noralf Tr?nnes <noralf@tronnes.org>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160128/05eaaaa4/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels
  2016-01-07 17:33 ` [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels kernel at martin.sperl.org
@ 2016-01-30  3:08   ` Eric Anholt
  0 siblings, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-01-30  3:08 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> The original patch contained 3 dma channels that were masked out.
>
> These - as far as research and discussions show - are a
> artefacts remaining from the downstream legacy dma-api.
>
> Right now down-stream still includes a legacy api used only
> in a single (downstream only) driver (bcm2708_fb) that requires
> 2D DMA for speedup (DMA-channel 0).
> Formerly the sd-card support driver also was using this legacy
> api (DMA-channel 2), but since has been moved over to use
> dmaengine directly.
>
> The DMA-channel 3 is already masked out in the devicetree in
> the default property "brcm,dma-channel-mask = <0x7f35>;"

Delayed the rest of the review, because I went digging around in the
firmware, and missed the ifdef mess in it that produced 0x7f35.  As far
as I can tell, though, we should be using just that value to control
which channels we expose (other than the shared-irq issue).  If the
firmware wants to use a different set of channels some day, they'll have
to edit our DT.

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160129/8188ea20/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method
  2016-01-07 17:33 ` [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method kernel at martin.sperl.org
  2016-01-13 13:23   ` Vinod Koul
@ 2016-02-18  3:24   ` Eric Anholt
  2016-02-29 18:14     ` Martin Sperl
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Anholt @ 2016-02-18  3:24 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> In preparation of adding slave_sg functionality this patch moves the
> generation/allocation of bcm2835_desc and the building of
> the corresponding DMA-control-block chain from bcm2835_dma_prep_dma_cyclic
> into the newly created method bcm2835_dma_create_cb_chain.
>
> This new code also takes the limits of LITE channels into account
> and splits the control-block transfers at the correct location.
>
> Tested Audio output with a Hifiberry I2S card.
>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
> ---
>  drivers/dma/bcm2835-dma.c |  288 ++++++++++++++++++++++++++++++---------------
>  1 file changed, 191 insertions(+), 97 deletions(-)
>
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index 43758ba..41a4f0b 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -90,12 +90,12 @@ struct bcm2835_desc {
>  	struct virt_dma_desc vd;
>  	enum dma_transfer_direction dir;
>
> -	struct bcm2835_cb_entry *cb_list;
> -
>  	unsigned int frames;
>  	size_t size;
>
>  	bool cyclic;
> +
> +	struct bcm2835_cb_entry cb_list[];
>  };
>
>  #define BCM2835_DMA_CS		0x00
> @@ -181,6 +181,22 @@ struct bcm2835_desc {
>  #define BCM2835_DMA_IRQ_SHARED		11
>  #define BCM2835_DMA_IRQ_ALL		12
>
> +/* the max dma length for different channels */
> +#define MAX_DMA_LEN SZ_1G
> +#define MAX_LITE_DMA_LEN (SZ_64K - 4)
> +
> +static inline bool bcm2835_dma_is_lite(struct bcm2835_chan *c)
> +{
> +	/* dma channels >= 7 are LITE channels */
> +	return (c->ch >= 7);
> +}

You can ask a channel if it's a LITE engine by checking if the DEBUG reg
has bit 28 set.  Then you don't need to have the software/DT guessing
about it.

However, I'm disappointed to see these changes for adding LITE support
included in the same commit as refactoring chain generation into a
separate function.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160217/2763337d/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-07 17:33 ` [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14 kernel at martin.sperl.org
  2016-01-13 12:26   ` Vinod Koul
@ 2016-02-18  4:09   ` Eric Anholt
  1 sibling, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-02-18  4:09 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> The bcm2835 dma channel 11 to 14 only have a single shared irq line,
> so this patch implements shared interrupts for these channels.
>
> To avoid changes to the device tree (with regards to interrupts listed,
> which reflects on pdev->num_resources) a fixed channel count had
> to get used in the code instead.
>
> With this patch applied we now have 11 dma channels available to
> the ARM side of the SOC.
>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>

Particularly given the existing DT bindings we have, I actually think
this is a good way to go.  This is internal to the hardware module about
how its features map to interrupt lines that leave the hardware module
(which are exposed in DT appropriately).

I thought about "what if we had the last interrupt line hooked up to a
different interrupt handler that read out and decided which channels to
wake up", but this seems much simpler.

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160217/ca03aa66/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers
  2016-01-07 17:33 ` [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers kernel at martin.sperl.org
  2016-01-13 12:32   ` Vinod Koul
@ 2016-02-18  4:18   ` Eric Anholt
  1 sibling, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-02-18  4:18 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> Add additional defines describing the DMA registers
> as well as adding some more documentation to those registers.
>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
> ---
>  drivers/dma/bcm2835-dma.c |   54 ++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 46 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index 0c04236..7905327 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -99,26 +99,64 @@ struct bcm2835_desc {
>  
>  #define BCM2835_DMA_CS		0x00
>  #define BCM2835_DMA_ADDR	0x04
> +#define BCM2835_DMA_TI		0x08
>  #define BCM2835_DMA_SOURCE_AD	0x0c
>  #define BCM2835_DMA_DEST_AD	0x10
> -#define BCM2835_DMA_NEXTCB	0x1C
> +#define BCM2835_DMA_LEN		0x14
> +#define BCM2835_DMA_STIDE	0x18

s/STIDE/STRIDE/

With that fixed, and optionally Vinod's comment about reformatting the
comment changed,

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160217/6a256ca6/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc
  2016-01-07 17:33 ` [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc kernel at martin.sperl.org
@ 2016-02-18  4:19   ` Eric Anholt
  0 siblings, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-02-18  4:19 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> In preparation to consolidating code we move the cyclic member
> into the bcm_2835_desc structure.

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160217/b9f7c7f9/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma
  2016-01-07 17:33 ` [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma kernel at martin.sperl.org
@ 2016-02-18  4:39   ` Eric Anholt
  0 siblings, 0 replies; 28+ messages in thread
From: Eric Anholt @ 2016-02-18  4:39 UTC (permalink / raw)
  To: linux-arm-kernel

kernel at martin.sperl.org writes:

> From: Martin Sperl <kernel@martin.sperl.org>
>
> Add slave_sg support to bcm2835-dma using shared allocation
> code for bcm2835_desc and DMA-control blocks already used by
> dma_cyclic.
>
> Note that bcm2835_dma_callback had to get modified to support
> both modes of operation (cyclic and non-cyclic).
>
> Tested using:
> * Hifiberry I2S card (using cyclic DMA)
> * fb_st7735r SPI-framebuffer (using slave_sg DMA via spi-bcm2835)
> playing BigBuckBunny for audio and video.
>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>

I would have expected bcm2835_dma_create_cb_chain()'s changes to support
slave DMA to be in this patch rather than the refactor patch.  However,
the content of this one and 8/8 are:

Reviewed-by: Eric Anholt <eric@anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160217/9d8c8c3b/attachment.sig>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-01-14  4:07           ` Vinod Koul
  2016-01-14  8:48             ` Martin Sperl
@ 2016-02-29 17:10             ` Martin Sperl
  2016-03-03 15:45               ` Vinod Koul
  1 sibling, 1 reply; 28+ messages in thread
From: Martin Sperl @ 2016-02-29 17:10 UTC (permalink / raw)
  To: linux-arm-kernel


> On 14.01.2016, at 05:07, Vinod Koul <vinod.koul@intel.com> wrote:
> 
> On Wed, Jan 13, 2016 at 03:24:59PM +0100, Martin Sperl wrote:
>>>> But that would be frowned upon, so I had to come up with the approach
>>>> taken, which makes the following assumptions:
>>> 
>>> DT was designed to move this info and hardcoding from kernel into
>>> DT, so why cant we do that?
>> We still need to be backwards-compatible - at least that is what
>> everyone tells me, so I need to hard-code fallbacks for those values.
> 
> IMO hard code for falling back is okay as that supports old cases and new
> platforms use geric DT info and new devices can be supported generically,
> please check with DT folks..
> 
>>>> * the DT maps only the interrupts that are assigned to the HW block
>>>> * the driver knows about the number of DMA channels in HW
>> that could be a DT property, yes.
>>>> * the driver knows about the mapping of shared interrupts
>>>>  (11-14 share irq).
>> OK - how would you define that "mapping" in a "sane" manner in the DT
>> that allows us to have multiple such mappings in the future?
> 
> You are hard coding the flags for each channel, we can pass this for each
> channel in the interrupt configi, a flag share/none..? Please run this thru
> DT experts and I am not one of them..

So here a ?simple? approach with the following device tree addition (plus defaults):
	interrupts = <1 16>, /* dma 0 irq */
		     <1 17>, /* dma 1 irq */
	...
		     <1 26>, /* dma 10 irq */
		     <1 27>, /* dma 11-14 irq */
		     <1 28>; /* dma all irq */
	brcm,dma-channel-mask = <0x7f35>;
	/* new properties below - also the defaults */
	brcm,dma-channel-shared-mask = <0x7800>;
	brcm,dma-shared-irq-index = <11>;

Is this sufficient to move forward?

Thanks,
	Martin

P.s: this would also mean (for later patches):
	brcm,dma-lite-channel-mask = <0x7f00>;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method
  2016-02-18  3:24   ` Eric Anholt
@ 2016-02-29 18:14     ` Martin Sperl
  0 siblings, 0 replies; 28+ messages in thread
From: Martin Sperl @ 2016-02-29 18:14 UTC (permalink / raw)
  To: linux-arm-kernel


> On 18.02.2016, at 04:24, Eric Anholt <eric@anholt.net> wrote:
> 
> kernel at martin.sperl.org writes:
> 
>> From: Martin Sperl <kernel@martin.sperl.org>
>> 
>> In preparation of adding slave_sg functionality this patch moves the
>> generation/allocation of bcm2835_desc and the building of
>> the corresponding DMA-control-block chain from bcm2835_dma_prep_dma_cyclic
>> into the newly created method bcm2835_dma_create_cb_chain.
>> 
>> This new code also takes the limits of LITE channels into account
>> and splits the control-block transfers at the correct location.
>> 
>> Tested Audio output with a Hifiberry I2S card.
>> 
>> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
>> ---
>> drivers/dma/bcm2835-dma.c |  288 ++++++++++++++++++++++++++++++---------------
>> 1 file changed, 191 insertions(+), 97 deletions(-)
>> 
>> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
>> index 43758ba..41a4f0b 100644
>> --- a/drivers/dma/bcm2835-dma.c
>> +++ b/drivers/dma/bcm2835-dma.c
>> @@ -90,12 +90,12 @@ struct bcm2835_desc {
>> 	struct virt_dma_desc vd;
>> 	enum dma_transfer_direction dir;
>> 
>> -	struct bcm2835_cb_entry *cb_list;
>> -
>> 	unsigned int frames;
>> 	size_t size;
>> 
>> 	bool cyclic;
>> +
>> +	struct bcm2835_cb_entry cb_list[];
>> };
>> 
>> #define BCM2835_DMA_CS		0x00
>> @@ -181,6 +181,22 @@ struct bcm2835_desc {
>> #define BCM2835_DMA_IRQ_SHARED		11
>> #define BCM2835_DMA_IRQ_ALL		12
>> 
>> +/* the max dma length for different channels */
>> +#define MAX_DMA_LEN SZ_1G
>> +#define MAX_LITE_DMA_LEN (SZ_64K - 4)
>> +
>> +static inline bool bcm2835_dma_is_lite(struct bcm2835_chan *c)
>> +{
>> +	/* dma channels >= 7 are LITE channels */
>> +	return (c->ch >= 7);
>> +}
> 
> You can ask a channel if it's a LITE engine by checking if the DEBUG reg
> has bit 28 set.  Then you don't need to have the software/DT guessing
> about it.
I will implement that - it is simple.

> However, I'm disappointed to see these changes for adding LITE support
> included in the same commit as refactoring chain generation into a
> separate function.
I will separate this ?LITE-channel handling? out into a separate patch.

Martin

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14.
  2016-02-29 17:10             ` Martin Sperl
@ 2016-03-03 15:45               ` Vinod Koul
  0 siblings, 0 replies; 28+ messages in thread
From: Vinod Koul @ 2016-03-03 15:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 29, 2016 at 06:10:53PM +0100, Martin Sperl wrote:
> 
> > On 14.01.2016, at 05:07, Vinod Koul <vinod.koul@intel.com> wrote:
> > 
> > On Wed, Jan 13, 2016 at 03:24:59PM +0100, Martin Sperl wrote:
> >>>> But that would be frowned upon, so I had to come up with the approach
> >>>> taken, which makes the following assumptions:
> >>> 
> >>> DT was designed to move this info and hardcoding from kernel into
> >>> DT, so why cant we do that?
> >> We still need to be backwards-compatible - at least that is what
> >> everyone tells me, so I need to hard-code fallbacks for those values.
> > 
> > IMO hard code for falling back is okay as that supports old cases and new
> > platforms use geric DT info and new devices can be supported generically,
> > please check with DT folks..
> > 
> >>>> * the DT maps only the interrupts that are assigned to the HW block
> >>>> * the driver knows about the number of DMA channels in HW
> >> that could be a DT property, yes.
> >>>> * the driver knows about the mapping of shared interrupts
> >>>>  (11-14 share irq).
> >> OK - how would you define that "mapping" in a "sane" manner in the DT
> >> that allows us to have multiple such mappings in the future?
> > 
> > You are hard coding the flags for each channel, we can pass this for each
> > channel in the interrupt configi, a flag share/none..? Please run this thru
> > DT experts and I am not one of them..
> 
> So here a ?simple? approach with the following device tree addition (plus defaults):
> 	interrupts = <1 16>, /* dma 0 irq */
> 		     <1 17>, /* dma 1 irq */
> 	...
> 		     <1 26>, /* dma 10 irq */
> 		     <1 27>, /* dma 11-14 irq */
> 		     <1 28>; /* dma all irq */
> 	brcm,dma-channel-mask = <0x7f35>;
> 	/* new properties below - also the defaults */
> 	brcm,dma-channel-shared-mask = <0x7800>;
> 	brcm,dma-shared-irq-index = <11>;
> 
> Is this sufficient to move forward?

This looks okay to me for now, though am not a DT expert. Please send
patches and CC DT folks as relevant and we can discuss..

-- 
~Vinod

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-03-03 15:45 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-07 17:32 [PATCH V2 0/8] dmaengine: bcm2835: enhancement of driver kernel at martin.sperl.org
2016-01-07 17:32 ` [PATCH V2 1/8] dmaengine: bcm2835: set residue_granularity field kernel at martin.sperl.org
2016-01-28 19:53   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 2/8] dmaengine: bcm2835: remove unnecessary masking of dma channels kernel at martin.sperl.org
2016-01-30  3:08   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 3/8] dmaengine: bcm2835: use shared interrupt for channel 11 to 14 kernel at martin.sperl.org
2016-01-13 12:26   ` Vinod Koul
2016-01-13 13:30     ` Martin Sperl
2016-01-13 13:43       ` Vinod Koul
2016-01-13 14:24         ` Martin Sperl
2016-01-14  4:07           ` Vinod Koul
2016-01-14  8:48             ` Martin Sperl
2016-02-29 17:10             ` Martin Sperl
2016-03-03 15:45               ` Vinod Koul
2016-02-18  4:09   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 4/8] dmaengine: bcm2835: add additional defines for DMA-registers kernel at martin.sperl.org
2016-01-13 12:32   ` Vinod Koul
2016-02-18  4:18   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 5/8] dmaengine: bcm2835: move cyclic member from bcm2835_chan into bcm2835_desc kernel at martin.sperl.org
2016-02-18  4:19   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 6/8] dmaengine: bcm2835: move controlblock chain generation into separate method kernel at martin.sperl.org
2016-01-13 13:23   ` Vinod Koul
2016-01-13 13:38     ` Martin Sperl
2016-02-18  3:24   ` Eric Anholt
2016-02-29 18:14     ` Martin Sperl
2016-01-07 17:33 ` [PATCH V2 7/8] dmaengine: bcm2835: add slave_sg support to bcm2835-dma kernel at martin.sperl.org
2016-02-18  4:39   ` Eric Anholt
2016-01-07 17:33 ` [PATCH V2 8/8] dmaengine: bcm2835: add dma_memcopy " kernel at martin.sperl.org

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.