* [PATCH 0/3] dmaengine: rcar-dmac: fix resource freeing synchronization
From: Niklas Söderlund @ 2017-03-28 22:40 UTC
To: Vinod Koul, dmaengine, linux-renesas-soc
Cc: Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama, Niklas Söderlund

Hi,

This series fixes resource freeing synchronization by:

1. Patch 1/3
   Storing the IRQ number in struct rcar_dmac_chan so it can be used
   later together with synchronize_irq().

2. Patch 2/3
   Adding support for the device_synchronize() callback.

3. Patch 3/3
   Waiting for any ISR that might still be running after the channel is
   halted, prior to freeing its resources.

The wait in patch 3/3 was previously part of a patch sent out by
Yoshihiro Shimoda and authored by Hiroyuki Yokoyama, see [1]. In that
thread Lars-Peter Clausen suggested implementing the
device_synchronize() callback instead. Unfortunately this is not enough
to solve the issue. rcar_dmac_free_chan_resources() halts the channel
by calling rcar_dmac_chan_halt() and then moves directly on to freeing
resources, so a wait for any running ISR to finish is still needed
there, even though device_synchronize() has been added. This is because
of the call chain:

dma_release_channel()
    dma_chan_put()
        dmaengine_synchronize()
        rcar_dmac_free_chan_resources()
            rcar_dmac_chan_halt()

Here dmaengine_synchronize() is called prior to rcar_dmac_chan_halt(),
so an extra synchronisation point to wait for any running ISR is still
needed.

By both adding a device_synchronize() which can be used in conjunction
with device_terminate_all() and friends, and by adding an explicit
synchronize_irq() when freeing channel resources, I feel the
synchronisation for freeing channel resources is in much better shape.
It also solves the issue in the original mail thread.

The series is based on v4.11-rc1 and is tested on r8a7795 Salvator-X.

1. https://patchwork.kernel.org/patch/9557691/

Niklas Söderlund (3):
  dmaengine: rcar-dmac: store channel IRQ in struct rcar_dmac_chan
  dmaengine: rcar-dmac: implement device_synchronize()
  dmaengine: rcar-dmac: wait for ISR to finish before freeing resources

 drivers/dma/sh/rcar-dmac.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

--
2.12.0

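The ordering problem described in the cover letter can be replayed in a small userspace model. This is only a sketch under stated assumptions: the struct and every model_* function are hypothetical stand-ins, not kernel or rcar-dmac code, and the model is single-threaded, so the "interrupt" is simply injected between two steps of the release path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one DMA channel; every name here is hypothetical. */
struct model_chan {
	bool irq_enabled;  /* hardware may still raise interrupts */
	int isr_in_flight; /* handlers that started but have not returned */
	bool resources;    /* descriptor memory is still allocated */
};

/* An interrupt arrives: a handler starts only if the channel is not halted. */
static void model_raise_irq(struct model_chan *c)
{
	if (c->irq_enabled)
		c->isr_in_flight++;
}

/* Stand-in for synchronize_irq(): returns once running handlers are done. */
static void model_synchronize(struct model_chan *c)
{
	c->isr_in_flight = 0;
}

/* Stand-in for rcar_dmac_chan_halt(): no new interrupts after this. */
static void model_halt(struct model_chan *c)
{
	c->irq_enabled = false;
}

/* Free resources; returns false if a handler could still touch them. */
static bool model_free(struct model_chan *c)
{
	bool safe = (c->isr_in_flight == 0);

	c->resources = false;
	return safe;
}

/*
 * Replay the release path with an interrupt arriving between the early
 * synchronize and the halt. sync_after_halt selects whether the extra
 * wait (the idea behind patch 3/3) is part of the sequence.
 */
bool model_release(bool sync_after_halt)
{
	struct model_chan c = {
		.irq_enabled = true,
		.isr_in_flight = 0,
		.resources = true,
	};

	model_synchronize(&c);         /* dmaengine_synchronize() step */
	model_raise_irq(&c);           /* interrupt fires in the window */
	model_halt(&c);                /* rcar_dmac_chan_halt() step */
	if (sync_after_halt)
		model_synchronize(&c); /* extra wait after the halt */
	return model_free(&c);
}
```

In this model, model_release(false) frees resources while a handler is still counted as in flight, and model_release(true) does not. That mirrors the argument for keeping an explicit wait after the halt.
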
* [PATCH 1/3] dmaengine: rcar-dmac: store channel IRQ in struct rcar_dmac_chan
From: Niklas Söderlund @ 2017-03-28 22:40 UTC
To: Vinod Koul, dmaengine, linux-renesas-soc
Cc: Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama, Niklas Söderlund

The IRQ number is needed after probe to be able to add synchronisation
points in other places in the driver when freeing resources and to
implement a device_synchronize() callback. Store the IRQ number in the
struct rcar_dmac_chan so that it can be used later.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 drivers/dma/sh/rcar-dmac.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index 48b22d5c86026098..3038654f11b5c6ed 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -144,6 +144,7 @@ struct rcar_dmac_chan_map {
  * @chan: base DMA channel object
  * @iomem: channel I/O memory base
  * @index: index of this channel in the controller
+ * @irq: channel IRQ
  * @src: slave memory address and size on the source side
  * @dst: slave memory address and size on the destination side
  * @mid_rid: hardware MID/RID for the DMA client using this channel
@@ -161,6 +162,7 @@ struct rcar_dmac_chan {
 	struct dma_chan chan;
 	void __iomem *iomem;
 	unsigned int index;
+	int irq;

 	struct rcar_dmac_chan_slave src;
 	struct rcar_dmac_chan_slave dst;
@@ -1635,7 +1637,6 @@ static int rcar_dmac_chan_probe(struct rcar_dmac *dmac,
 	struct dma_chan *chan = &rchan->chan;
 	char pdev_irqname[5];
 	char *irqname;
-	int irq;
 	int ret;

 	rchan->index = index;
@@ -1652,8 +1653,8 @@ static int rcar_dmac_chan_probe(struct rcar_dmac *dmac,

 	/* Request the channel interrupt. */
 	sprintf(pdev_irqname, "ch%u", index);
-	irq = platform_get_irq_byname(pdev, pdev_irqname);
-	if (irq < 0) {
+	rchan->irq = platform_get_irq_byname(pdev, pdev_irqname);
+	if (rchan->irq < 0) {
 		dev_err(dmac->dev, "no IRQ specified for channel %u\n", index);
 		return -ENODEV;
 	}
@@ -1663,11 +1664,13 @@ static int rcar_dmac_chan_probe(struct rcar_dmac *dmac,
 	if (!irqname)
 		return -ENOMEM;

-	ret = devm_request_threaded_irq(dmac->dev, irq, rcar_dmac_isr_channel,
+	ret = devm_request_threaded_irq(dmac->dev, rchan->irq,
+					rcar_dmac_isr_channel,
 					rcar_dmac_isr_channel_thread, 0,
 					irqname, rchan);
 	if (ret) {
-		dev_err(dmac->dev, "failed to request IRQ %u (%d)\n", irq, ret);
+		dev_err(dmac->dev, "failed to request IRQ %u (%d)\n",
+			rchan->irq, ret);
 		return ret;
 	}

--
2.12.0

* [PATCH 2/3] dmaengine: rcar-dmac: implement device_synchronize()
From: Niklas Söderlund @ 2017-03-28 22:40 UTC
To: Vinod Koul, dmaengine, linux-renesas-soc
Cc: Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama, Niklas Söderlund

Implement the device_synchronize() callback, which waits until a DMA
channel has stopped, to provide a synchronization point.

This protects the driver from multiple race conditions when terminating
and freeing resources, e.g. the completion callback still running after
device_terminate_all() has completed.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 drivers/dma/sh/rcar-dmac.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index 3038654f11b5c6ed..4b90deb40d559bed 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -1353,6 +1353,13 @@ static void rcar_dmac_issue_pending(struct dma_chan *chan)
 	spin_unlock_irqrestore(&rchan->lock, flags);
 }

+static void rcar_dmac_device_synchronize(struct dma_chan *chan)
+{
+	struct rcar_dmac_chan *rchan = to_rcar_dmac_chan(chan);
+
+	synchronize_irq(rchan->irq);
+}
+
 /* -----------------------------------------------------------------------------
  * IRQ handling
  */
@@ -1834,6 +1841,7 @@ static int rcar_dmac_probe(struct platform_device *pdev)
 	engine->device_terminate_all = rcar_dmac_chan_terminate_all;
 	engine->device_tx_status = rcar_dmac_tx_status;
 	engine->device_issue_pending = rcar_dmac_issue_pending;
+	engine->device_synchronize = rcar_dmac_device_synchronize;

 	ret = dma_async_device_register(engine);
 	if (ret < 0)
--
2.12.0

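How the stored IRQ number and the new callback fit together can be sketched outside the kernel. The model below assumes, as the in-kernel dmaengine_synchronize() helper appears to do, that the core invokes device_synchronize() only when a driver provides one. The reduced structs and all model_* names are hypothetical, and the real synchronize_irq() call is replaced by recording which IRQ number would have been waited on.

```c
#include <assert.h>
#include <stddef.h>

struct model_chan;

/* Hypothetical, reduced ops table; the real one has many more callbacks. */
struct model_device {
	void (*device_synchronize)(struct model_chan *chan);
};

struct model_chan {
	struct model_device *device;
	int irq;        /* per-channel IRQ, stored at probe time (patch 1/3) */
	int synced_irq; /* last IRQ number waited on, -1 if none */
};

/* Stand-in for the driver callback: wait on this channel's own IRQ. */
static void model_rcar_synchronize(struct model_chan *chan)
{
	/* The driver would call synchronize_irq(rchan->irq) here. */
	chan->synced_irq = chan->irq;
}

/* Core-side helper: the callback is optional, so check before calling. */
static void model_dmaengine_synchronize(struct model_chan *chan)
{
	if (chan->device->device_synchronize)
		chan->device->device_synchronize(chan);
}

/* Build a channel, run the helper, report which IRQ was waited on. */
int model_sync_result(int with_callback, int irq)
{
	struct model_device dev = {
		.device_synchronize = with_callback ? model_rcar_synchronize : NULL,
	};
	struct model_chan chan = { .device = &dev, .irq = irq, .synced_irq = -1 };

	model_dmaengine_synchronize(&chan);
	return chan.synced_irq;
}
```

The point of the sketch is the dependency between the patches: the callback only receives the channel, so the IRQ number has to live in the per-channel structure for the callback to find it.
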
* [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Niklas Söderlund @ 2017-03-28 22:40 UTC
To: Vinod Koul, dmaengine, linux-renesas-soc
Cc: Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama, Niklas Söderlund

This fixes a race condition where the channel resources could be freed
before the ISR had finished running resulting in a NULL pointer
dereference from the ISR.

[ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 167.157051] pgd = ffff80003c641000
[ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
[ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
[ 167.174289] Modules linked in:
[ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
[ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
[ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
[ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
[ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400

Based on previous work by Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 drivers/dma/sh/rcar-dmac.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index 4b90deb40d559bed..0ec63600ebcc3a27 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -998,7 +998,11 @@ static void rcar_dmac_free_chan_resources(struct dma_chan *chan)
 	rcar_dmac_chan_halt(rchan);
 	spin_unlock_irq(&rchan->lock);

-	/* Now no new interrupts will occur */
+	/*
+	 * Now no new interrupts will occur, but one might already be
+	 * running. Wait for it to finish before freeing resources.
+	 */
+	synchronize_irq(rchan->irq);

 	if (rchan->mid_rid >= 0) {
 		/* The caller is holding dma_list_mutex */
--
2.12.0

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Geert Uytterhoeven @ 2017-03-29 12:31 UTC
To: Niklas Söderlund
Cc: Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Niklas,

On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
<niklas.soderlund+renesas@ragnatech.se> wrote:
> This fixes a race condition where the channel resources could be freed
> before the ISR had finished running resulting in a NULL pointer
> dereference from the ISR.
>
> [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [ 167.157051] pgd = ffff80003c641000
> [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
> [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> [ 167.174289] Modules linked in:
> [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400

Do you have a test case to trigger this?

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Niklas Söderlund @ 2017-03-29 13:30 UTC
To: Geert Uytterhoeven
Cc: Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Geert,

On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
>
> On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
> <niklas.soderlund+renesas@ragnatech.se> wrote:
> > This fixes a race condition where the channel resources could be freed
> > before the ISR had finished running resulting in a NULL pointer
> > dereference from the ISR.
> >
> > [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> > [ 167.157051] pgd = ffff80003c641000
> > [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
> > [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> > [ 167.174289] Modules linked in:
> > [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> > [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> > [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> > [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> > [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
>
> Do you have a test case to trigger this?

Yes I have a testcase, it's rather complex and involves both a kernel
module and a userspace application to stress the rcar-dmac. I'm
checking if I can share this publicly or not, please hold :-)

>
> Thanks!
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

--
Regards,
Niklas Söderlund

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Niklas Söderlund @ 2017-03-30 7:38 UTC
To: Geert Uytterhoeven
Cc: Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Geert,

On 2017-03-29 15:30:42 +0200, Niklas Söderlund wrote:
> Hi Geert,
>
> On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
> > Hi Niklas,
> >
> > On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
> > <niklas.soderlund+renesas@ragnatech.se> wrote:
> > > This fixes a race condition where the channel resources could be freed
> > > before the ISR had finished running resulting in a NULL pointer
> > > dereference from the ISR.
> > >
> > > [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> > > [ 167.157051] pgd = ffff80003c641000
> > > [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
> > > [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> > > [ 167.174289] Modules linked in:
> > > [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> > > [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> > > [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> > > [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> > > [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
> >
> > Do you have a test case to trigger this?
>
> Yes I have a testcase, it's rather complex and involves both a kernel
> module and a userspace application to stress the rcar-dmac. I'm
> checking if I can share this publicly or not, please hold :-)

I have now received feedback that I'm unfortunately not allowed to share
the test case :-(

The big picture in how to trigger this problem is that you start a DMA
transfer like this:

    struct dma_async_tx_descriptor *tx = ...;

    ...

    tx->tx_submit(tx);

And then you directly call dma_release_channel() on this channel without
making sure the completion callback ran or anything. Now if you are
unlucky the ISR has not finished running for the DMA when
dma_release_channel() starts to clean up resources. The synchronisation
point in the dma_release_channel() call path fixes this.

> >
> > Thanks!
> >
> > Gr{oetje,eeting}s,
> >
> > Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> > -- Linus Torvalds
>
> --
> Regards,
> Niklas Söderlund

--
Regards,
Niklas Söderlund

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Vinod Koul @ 2017-04-05 3:25 UTC
To: Niklas Söderlund
Cc: Geert Uytterhoeven, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

On Thu, Mar 30, 2017 at 09:38:39AM +0200, Niklas Söderlund wrote:
> Hi Geert,
>
> On 2017-03-29 15:30:42 +0200, Niklas Söderlund wrote:
> > Hi Geert,
> >
> > On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
> > > Hi Niklas,
> > >
> > > On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
> > > <niklas.soderlund+renesas@ragnatech.se> wrote:
> > > > This fixes a race condition where the channel resources could be freed
> > > > before the ISR had finished running resulting in a NULL pointer
> > > > dereference from the ISR.
> > > >
> > > > [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> > > > [ 167.157051] pgd = ffff80003c641000
> > > > [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
> > > > [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> > > > [ 167.174289] Modules linked in:
> > > > [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> > > > [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> > > > [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> > > > [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> > > > [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
> > >
> > > Do you have a test case to trigger this?
> >
> > Yes I have a testcase, it's rather complex and involves both a kernel
> > module and a userspace application to stress the rcar-dmac. I'm
> > checking if I can share this publicly or not, please hold :-)
>
> I have now received feedback that I'm unfortunately not allowed to share
> the test case :-(
>
> The big picture in how to trigger this problem is that you start a DMA
> transfer like this:
>
>     struct dma_async_tx_descriptor *tx = ...;
>
>     ...
>
>     tx->tx_submit(tx);
>
> And then you directly call dma_release_channel() on this channel without
> making sure the completion callback ran or anything. Now if you are
> unlucky the ISR has not finished running for the DMA when
> dma_release_channel() starts to clean up resources. The synchronisation
> point in the dma_release_channel() call path fixes this.

Well the API expectation would be you abort the txn before calling release.
So the expected order should be:

	dmaengine_terminate_all();
	dma_release_channel();

Terminate should then stop the channel, i.e. abort the pending descriptors.

--
~Vinod

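The client-side ordering described above can be illustrated with a small userspace model. It is a sketch only: the struct and the model_* functions are hypothetical stand-ins for a channel with pending descriptors, not the dmaengine API, and "release" here merely reports whether the channel was idle at the time it was called.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy channel; all names are hypothetical. */
struct model_chan {
	int pending;   /* submitted descriptors not yet completed or aborted */
	bool released;
};

/* Stand-in for tx->tx_submit(tx): one more descriptor is outstanding. */
static void model_submit(struct model_chan *c)
{
	c->pending++;
}

/* Stand-in for dmaengine_terminate_all(): abort everything pending. */
static void model_terminate_all(struct model_chan *c)
{
	c->pending = 0;
}

/* Stand-in for dma_release_channel(); reports whether the channel was idle. */
static bool model_release(struct model_chan *c)
{
	bool idle = (c->pending == 0);

	c->released = true;
	return idle;
}

/* Run a client that submits once and then releases the channel. */
bool model_client(bool terminate_first)
{
	struct model_chan c = { 0 };

	model_submit(&c);
	if (terminate_first)
		model_terminate_all(&c);
	return model_release(&c);
}
```

With terminate_first set, the channel is idle when it is released. Without it, the release runs while a descriptor is still outstanding, which is the misuse pattern the test case in this thread relies on.
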
* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Niklas Söderlund @ 2017-04-05 9:14 UTC
To: Vinod Koul
Cc: Geert Uytterhoeven, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Vinod,

On 2017-04-05 08:55:31 +0530, Vinod Koul wrote:
> On Thu, Mar 30, 2017 at 09:38:39AM +0200, Niklas Söderlund wrote:
> > Hi Geert,
> >
> > On 2017-03-29 15:30:42 +0200, Niklas Söderlund wrote:
> > > Hi Geert,
> > >
> > > On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
> > > > Hi Niklas,
> > > >
> > > > On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
> > > > <niklas.soderlund+renesas@ragnatech.se> wrote:
> > > > > This fixes a race condition where the channel resources could be freed
> > > > > before the ISR had finished running resulting in a NULL pointer
> > > > > dereference from the ISR.
> > > > >
> > > > > [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> > > > > [ 167.157051] pgd = ffff80003c641000
> > > > > [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
> > > > > [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> > > > > [ 167.174289] Modules linked in:
> > > > > [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> > > > > [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> > > > > [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> > > > > [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> > > > > [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
> > > >
> > > > Do you have a test case to trigger this?
> > >
> > > Yes I have a testcase, it's rather complex and involves both a kernel
> > > module and a userspace application to stress the rcar-dmac. I'm
> > > checking if I can share this publicly or not, please hold :-)
> >
> > I have now received feedback that I'm unfortunately not allowed to share
> > the test case :-(
> >
> > The big picture in how to trigger this problem is that you start a DMA
> > transfer like this:
> >
> >     struct dma_async_tx_descriptor *tx = ...;
> >
> >     ...
> >
> >     tx->tx_submit(tx);
> >
> > And then you directly call dma_release_channel() on this channel without
> > making sure the completion callback ran or anything. Now if you are
> > unlucky the ISR has not finished running for the DMA when
> > dma_release_channel() starts to clean up resources. The synchronisation
> > point in the dma_release_channel() call path fixes this.
>
> Well the API expectation would be you abort the txn before calling release.
> So the expected order should be:
>
> 	dmaengine_terminate_all();
> 	dma_release_channel();

Agree this is the correct way and in this case patch 3/3 in this series
could be dropped. Then device_synchronize() would be added to rcar-dmac,
dmaengine_terminate_all() would turn off the IRQ and
dma_release_channel() would ensure that device_synchronize() is called
prior to calling rcar-dmac device_free_chan_resources().

>
> Terminate should then stop the channel, i.e. abort the pending descriptors.
>

However for reasons unknown to me the rcar-dmac
device_free_chan_resources() implementation implements logic to turn off
IRQs before it frees the resources. And it's because of this that patch
3/3 is needed so that it can be sure no ISR is running before it frees
resources.

I don't know how to best proceed here. I agree it feels a bit odd that
device_free_chan_resources() is dealing with the IRQs as such things
should be done before it's called. But on the other hand that code has
been part of the driver since it was added upstream. I feel a bit
uncomfortable just removing that part from the
device_free_chan_resources() since the driver has been in use with it
for such a long time.

How would you prefer I try and resolve this?

--
Regards,
Niklas Söderlund

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
From: Geert Uytterhoeven @ 2017-04-05 10:40 UTC
To: Niklas Söderlund
Cc: Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama, Laurent Pinchart

Hi Niklas,

(CC Laurent)

On Wed, Apr 5, 2017 at 11:14 AM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-04-05 08:55:31 +0530, Vinod Koul wrote:
>> On Thu, Mar 30, 2017 at 09:38:39AM +0200, Niklas Söderlund wrote:
>> > On 2017-03-29 15:30:42 +0200, Niklas Söderlund wrote:
>> > > On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
>> > > > On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund
>> > > > <niklas.soderlund+renesas@ragnatech.se> wrote:
>> > > > > This fixes a race condition where the channel resources could be freed
>> > > > > before the ISR had finished running resulting in a NULL pointer
>> > > > > dereference from the ISR.
>> > > > >
>> > > > > [ 167.148934] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> > > > > [ 167.157051] pgd = ffff80003c641000
>> > > > > [ 167.160449] [00000000] *pgd=000000007c507003, *pud=000000007c4ff003, *pmd=0000000000000000
>> > > > > [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
>> > > > > [ 167.174289] Modules linked in:
>> > > > > [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted 4.11.0-rc1-00001-g8d92afddc2f6633a #73
>> > > > > [ 167.186131] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
>> > > > > [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
>> > > > > [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
>> > > > > [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
>> > > >
>> > > > Do you have a test case to trigger this?
>> > >
>> > > Yes I have a testcase, it's rather complex and involves both a kernel
>> > > module and a userspace application to stress the rcar-dmac. I'm
>> > > checking if I can share this publicly or not, please hold :-)
>> >
>> > I have now received feedback that I'm unfortunately not allowed to share
>> > the test case :-(
>> >
>> > The big picture in how to trigger this problem is that you start a DMA
>> > transfer like this:
>> >
>> >     struct dma_async_tx_descriptor *tx = ...;
>> >
>> >     ...
>> >
>> >     tx->tx_submit(tx);
>> >
>> > And then you directly call dma_release_channel() on this channel without
>> > making sure the completion callback ran or anything. Now if you are
>> > unlucky the ISR has not finished running for the DMA when
>> > dma_release_channel() starts to clean up resources. The synchronisation
>> > point in the dma_release_channel() call path fixes this.
>>
>> Well the API expectation would be you abort the txn before calling release.
>> So the expected order should be:
>>
>> 	dmaengine_terminate_all();
>> 	dma_release_channel();
>
> Agree this is the correct way and in this case patch 3/3 in this series
> could be dropped. Then device_synchronize() would be added to rcar-dmac,
> dmaengine_terminate_all() would turn off the IRQ and
> dma_release_channel() would ensure that device_synchronize() is called
> prior to calling rcar-dmac device_free_chan_resources().
>
>>
>> Terminate should then stop the channel, i.e. abort the pending descriptors.
>>
>
> However for reasons unknown to me the rcar-dmac
> device_free_chan_resources() implementation implements logic to turn off
> IRQs before it frees the resources. And it's because of this that patch
> 3/3 is needed so that it can be sure no ISR is running before it frees
> resources.
>
> I don't know how to best proceed here. I agree it feels a bit odd that
> device_free_chan_resources() is dealing with the IRQs as such things
> should be done before it's called. But on the other hand that code has
> been part of the driver since it was added upstream. I feel a bit
> uncomfortable just removing that part from the
> device_free_chan_resources() since the driver has been in use with it
> for such a long time.
>
> How would you prefer I try and resolve this?

Perhaps Laurent knows why it was implemented this way?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
  2017-04-05 10:40 ` Geert Uytterhoeven
@ 2017-04-07 11:33 ` Laurent Pinchart
  2017-05-12 12:49 ` Niklas Söderlund
From: Laurent Pinchart @ 2017-04-07 11:33 UTC
To: Geert Uytterhoeven
Cc: Niklas Söderlund, Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Geert,

On Wednesday 05 Apr 2017 12:40:11 Geert Uytterhoeven wrote:
> On Wed, Apr 5, 2017 at 11:14 AM, Niklas Söderlund wrote:
> > On 2017-04-05 08:55:31 +0530, Vinod Koul wrote:
> >> On Thu, Mar 30, 2017 at 09:38:39AM +0200, Niklas Söderlund wrote:
> >>> On 2017-03-29 15:30:42 +0200, Niklas Söderlund wrote:
> >>>> On 2017-03-29 14:31:33 +0200, Geert Uytterhoeven wrote:
> >>>>> On Wed, Mar 29, 2017 at 12:40 AM, Niklas Söderlund wrote:
> >>>>>> This fixes a race condition where the channel resources could be
> >>>>>> freed before the ISR had finished running, resulting in a NULL
> >>>>>> pointer dereference from the ISR.
> >>>>>>
> >>>>>> [ 167.148934] Unable to handle kernel NULL pointer dereference
> >>>>>> at virtual address 00000000
> >>>>>> [ 167.157051] pgd = ffff80003c641000
> >>>>>> [ 167.160449] [00000000] *pgd=000000007c507003,
> >>>>>> *pud=000000007c4ff003, *pmd=0000000000000000
> >>>>>> [ 167.168719] Internal error: Oops: 96000046 [#1] PREEMPT SMP
> >>>>>> [ 167.174289] Modules linked in:
> >>>>>> [ 167.177348] CPU: 3 PID: 10547 Comm: dma_ioctl Not tainted
> >>>>>> 4.11.0-rc1-00001-g8d92afddc2f6633a #73
> >>>>>> [ 167.186131] Hardware name: Renesas Salvator-X board based on
> >>>>>> r8a7795 (DT)
> >>>>>> [ 167.192917] task: ffff80003a411a00 task.stack: ffff80003bcd4000
> >>>>>> [ 167.198850] PC is at rcar_dmac_chan_prep_sg+0xe0/0x400
> >>>>>> [ 167.203985] LR is at rcar_dmac_chan_prep_sg+0x48/0x400
> >>>>>
> >>>>> Do you have a test case to trigger this?
> >>>>
> >>>> Yes, I have a test case. It's rather complex and involves both a kernel
> >>>> module and a userspace application to stress the rcar-dmac. I'm
> >>>> checking if I can share this publicly or not, please hold :-)
> >>>
> >>> I have now received feedback that I'm unfortunately not allowed to
> >>> share the test case :-(
> >>>
> >>> The big picture of how to trigger this problem is that you start a DMA
> >>> transfer like this:
> >>>
> >>>     struct dma_async_tx_descriptor *tx = ...;
> >>>
> >>>     ...
> >>>
> >>>     tx->tx_submit(tx);
> >>>
> >>> And then you directly call dma_release_channel() on this channel
> >>> without making sure the completion callback ran or anything. Now if you
> >>> are unlucky, the ISR has not finished running for the DMA when
> >>> dma_release_channel() starts to clean up resources. The synchronisation
> >>> point in the dma_release_channel() call path fixes this.
> >>
> >> Well, the API expectation would be that you abort the txn before calling
> >> release. So the expected order should be:
> >>
> >>     dmaengine_terminate_all();
> >>     dma_release_channel();
> >
> > Agreed, this is the correct way, and in this case patch 3/3 in this series
> > could be dropped. Then device_synchronize() would be added to rcar-dmac,
> > dmaengine_terminate_all() would turn off the IRQ, and
> > dma_release_channel() would ensure that device_synchronize() is called
> > prior to calling the rcar-dmac device_free_chan_resources().
> >
> >> Terminate should then stop the channel, i.e. abort the pending
> >> descriptors.
> >
> > However, for reasons unknown to me, the rcar-dmac
> > device_free_chan_resources() implementation contains logic to turn off
> > IRQs before it frees the resources. And it's because of this that patch
> > 3/3 is needed, so that it can be sure no ISR is running before it frees
> > resources.
> >
> > I don't know how best to proceed here. I agree it feels a bit odd that
> > device_free_chan_resources() is dealing with the IRQs, as such things
> > should be done before it's called. But on the other hand, that code has
> > been part of the driver since it was added upstream. I feel a bit
> > uncomfortable just removing that part from
> > device_free_chan_resources(), since the driver has been in use with it
> > for such a long time.
> >
> > How would you prefer I try and resolve this?
>
> Perhaps Laurent knows why it was implemented this way?

That was nearly 3 years ago, and I can hardly remember reasons related to code
I wrote 3 months ago :-)

I might just have been overcautious, guarding against conditions that should
not happen if the caller behaves correctly. The situation might have changed
since the driver was written. It might also be just a case of cargo-cult
programming, as shdma_free_chan_resources() has very similar code.

Given that freeing channel resources when the channel isn't idle can cause an
oops, I think we should guard against that. This should probably be
implemented in the dma-engine core, to make sure we catch the issue in as many
drivers as possible.

--
Regards,

Laurent Pinchart
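The ordering problem discussed in this thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name below (struct model_chan, model_halt(), model_sync_irq(), model_release() and so on) is hypothetical and stands in for a kernel concept, and none of it is the rcar-dmac code. It models the release path described in the cover letter, where the synchronize step runs before the channel is halted, and shows why a second wait after the halt is needed before the resources are freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space model; these are not kernel APIs. */
struct model_chan {
	bool hw_running;    /* hardware may still raise interrupts */
	bool isr_in_flight; /* a handler has started but not returned */
	int *desc;          /* stands in for the channel's descriptors */
};

/* Models the hardware raising an interrupt; only possible while running. */
static void model_raise_irq(struct model_chan *c)
{
	if (c->hw_running)
		c->isr_in_flight = true;
}

/* Stands in for halting the channel: stops the hardware, does not wait. */
static void model_halt(struct model_chan *c)
{
	c->hw_running = false;
}

/* Stands in for synchronize_irq(): returns once no handler is in flight. */
static void model_sync_irq(struct model_chan *c)
{
	c->isr_in_flight = false;
}

/* Frees the descriptors; returns false if a handler could still use them. */
static bool model_free(struct model_chan *c)
{
	bool safe = !c->isr_in_flight;

	free(c->desc);
	c->desc = NULL;
	return safe;
}

/* Release path; sync_after_halt selects whether a wait follows the halt. */
bool model_release(bool sync_after_halt)
{
	struct model_chan c = { .hw_running = true, .desc = malloc(sizeof(int)) };

	model_sync_irq(&c);  /* synchronize step, which runs before the halt */
	model_raise_irq(&c); /* an interrupt arrives after that step */
	model_halt(&c);
	if (sync_after_halt)
		model_sync_irq(&c); /* the extra wait argued for in patch 3/3 */
	return model_free(&c);
}
```

Calling model_release(true) reports a safe free, while model_release(false) reports that a handler was still in flight when the descriptors were freed. A client that terminates its transfers before releasing the channel, as Vinod describes, would not hit this window in the first place.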
* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
  2017-04-07 11:33 ` Laurent Pinchart
@ 2017-05-12 12:49 ` Niklas Söderlund
  2017-05-14 12:01 ` Vinod Koul
From: Niklas Söderlund @ 2017-05-12 12:49 UTC
To: Laurent Pinchart
Cc: Geert Uytterhoeven, Vinod Koul, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

On 2017-04-07 14:33:47 +0300, Laurent Pinchart wrote:
> Hi Geert,
>
> On Wednesday 05 Apr 2017 12:40:11 Geert Uytterhoeven wrote:
> > Perhaps Laurent knows why it was implemented this way?
>
> That was nearly 3 years ago, and I can hardly remember reasons related to code
> I wrote 3 months ago :-)
>
> I might just have been overcautious, guarding against conditions that should
> not happen if the caller behaves correctly. The situation might have changed
> since the driver was written. It might also be just a case of cargo-cult
> programming, as shdma_free_chan_resources() has very similar code.

Since the driver has this behavior today, would it not be best to first
make sure it functions as expected and then, as a second step, see if we
can remove it altogether?

Vinod, would you be strongly opposed to picking up this series as is?

> Given that freeing channel resources when the channel isn't idle can cause an
> oops, I think we should guard against that. This should probably be
> implemented in the dma-engine core, to make sure we catch the issue in as many
> drivers as possible.
>
> --
> Regards,
>
> Laurent Pinchart

--
Regards,
Niklas Söderlund
* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
  2017-05-12 12:49 ` Niklas Söderlund
@ 2017-05-14 12:01 ` Vinod Koul
  2017-05-15 23:12 ` Niklas Söderlund
From: Vinod Koul @ 2017-05-14 12:01 UTC
To: Niklas Söderlund
Cc: Laurent Pinchart, Geert Uytterhoeven, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

On Fri, May 12, 2017 at 02:49:38PM +0200, Niklas Söderlund wrote:
> On 2017-04-07 14:33:47 +0300, Laurent Pinchart wrote:
> > I might just have been overcautious, guarding against conditions that should
> > not happen if the caller behaves correctly. The situation might have changed
> > since the driver was written. It might also be just a case of cargo-cult
> > programming, as shdma_free_chan_resources() has very similar code.
>
> Since the driver has this behavior today, would it not be best to first
> make sure it functions as expected and then, as a second step, see if we
> can remove it altogether?
>
> Vinod, would you be strongly opposed to picking up this series as is?

If there are no objections then I don't mind picking it up. Please rebase on
-rc1 and resend.

--
~Vinod
* Re: [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources
  2017-05-14 12:01 ` Vinod Koul
@ 2017-05-15 23:12 ` Niklas Söderlund
From: Niklas Söderlund @ 2017-05-15 23:12 UTC
To: Vinod Koul
Cc: Laurent Pinchart, Geert Uytterhoeven, dmaengine, Linux-Renesas, Yoshihiro Shimoda, Lars-Peter Clausen, Hiroyuki Yokoyama

Hi Vinod,

On 2017-05-14 17:31:36 +0530, Vinod Koul wrote:
> On Fri, May 12, 2017 at 02:49:38PM +0200, Niklas Söderlund wrote:
> > Vinod, would you be strongly opposed to picking up this series as is?
>
> If there are no objections then I don't mind picking it up. Please rebase on
> -rc1 and resend.

Thanks, I have sent out a rebased v2.

--
Regards,
Niklas Söderlund
Thread overview: 14+ messages (end of thread, newest message 2017-05-15 23:12 UTC)

2017-03-28 22:40 [PATCH 0/3] dmaengine: rcar-dmac: fix resource freeing synchronization Niklas Söderlund
2017-03-28 22:40 ` [PATCH 1/3] dmaengine: rcar-dmac: store channel IRQ in struct rcar_dmac_chan Niklas Söderlund
2017-03-28 22:40 ` [PATCH 2/3] dmaengine: rcar-dmac: implement device_synchronize() Niklas Söderlund
2017-03-28 22:40 ` [PATCH 3/3] dmaengine: rcar-dmac: wait for ISR to finish before freeing resources Niklas Söderlund
2017-03-29 12:31   ` Geert Uytterhoeven
2017-03-29 13:30     ` Niklas Söderlund
2017-03-30  7:38       ` Niklas Söderlund
2017-04-05  3:25         ` Vinod Koul
2017-04-05  9:14           ` Niklas Söderlund
2017-04-05 10:40             ` Geert Uytterhoeven
2017-04-07 11:33               ` Laurent Pinchart
2017-05-12 12:49                 ` Niklas Söderlund
2017-05-14 12:01                   ` Vinod Koul
2017-05-15 23:12                     ` Niklas Söderlund