Linux-SPI Archive on lore.kernel.org
 help / color / Atom feed
* Use after free in bcm2835_spi_remove()
@ 2020-10-13 23:48 Florian Fainelli
  2020-10-14 14:09 ` Lukas Wunner
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Fainelli @ 2020-10-13 23:48 UTC (permalink / raw)
  To: Lukas Wunner, linux-kernel, linux-spi; +Cc: Mark Brown

Hi Lukas,

With KASAN now working on ARM 32-bit, I was able to get the following
trace upon reboot which invokes bcm2835_spi_shutdown() calling
bcm2835_spi_remove(), the same can be triggered by doing a driver unbind:

# pwd
/sys/devices/platform/rdb/47e204800.spi/driver
# echo 47e204800.spi > unbind

How would you go about fixing this? This was not on a Rpi 4 but in
premise the same problem exists there.

Thanks!

[  229.746516]
==================================================================
[  229.754013] BUG: KASAN: use-after-free in bcm2835_dma_release+0x2c/0x260
[  229.760820] Read of size 4 at addr e0f08358 by task reboot/157
[  229.766727]
[  229.768302] CPU: 0 PID: 157 Comm: reboot Not tainted
5.9.0-gdf4dd84a3f7d #27
[  229.775445] Hardware name: Broadcom STB (Flattened Device Tree)
[  229.781448] Backtrace:
[  229.784017] [<c02120b4>] (dump_backtrace) from [<c02123d8>]
(show_stack+0x20/0x24)
[  229.791738]  r9:ffffffff r8:00000080 r7:c298e3c0 r6:400f0093
r5:00000000 r4:c298e3c0
[  229.799655] [<c02123b8>] (show_stack) from [<c08852a0>]
(dump_stack+0xbc/0xe0)
[  229.807050] [<c08851e4>] (dump_stack) from [<c04522bc>]
(print_address_description.constprop.3+0x3c/0x4b0)
[  229.816863]  r10:c2b771c0 r9:e46d9848 r8:e46d9854 r7:00000000
r6:c0b3ea3c r5:eeea5940
[  229.824815]  r4:e0f08358 r3:00000100
[  229.828510] [<c0452280>] (print_address_description.constprop.3) from
[<c0452944>] (kasan_report+0x15c/0x178)
[  229.838575]  r8:e46d9854 r7:00000000 r6:c0b3ea3c r5:0000009d r4:e0f08358
[  229.845411] [<c04527e8>] (kasan_report) from [<c0452f24>]
(__asan_load4+0x6c/0xbc)
[  229.853109]  r7:e0f08380 r6:e0f08000 r5:e0f08358 r4:e0f08380
[  229.858898] [<c0452eb8>] (__asan_load4) from [<c0b3ea3c>]
(bcm2835_dma_release+0x2c/0x260)
[  229.867318] [<c0b3ea10>] (bcm2835_dma_release) from [<c0b3ecd8>]
(bcm2835_spi_remove+0x68/0xa4)
[  229.876166]  r9:e46d9848 r8:e46d9854 r7:e0f083c0 r6:00000000
r5:e0f08000 r4:e0f08380
[  229.884069] [<c0b3ec70>] (bcm2835_spi_remove) from [<c0b3ed30>]
(bcm2835_spi_shutdown+0x1c/0x38)
[  229.892991]  r7:c2fc7f40 r6:e46d9810 r5:c2a1d854 r4:e46d9800
[  229.898788] [<c0b3ed14>] (bcm2835_spi_shutdown) from [<c0a17010>]
(platform_drv_shutdown+0x40/0x44)
[  229.907958]  r5:c2a1d854 r4:e46d9810
[  229.911653] [<c0a16fd0>] (platform_drv_shutdown) from [<c0a0f91c>]
(device_shutdown+0x248/0x35c)
[  229.920561]  r5:e465b810 r4:e46d9814
[  229.924255] [<c0a0f6d4>] (device_shutdown) from [<c0269418>]
(kernel_restart_prepare+0x4c/0x50)
[  229.933103]  r10:01234567 r9:fee1dead r8:dfdb3f60 r7:c2835240
r6:c2806d48 r5:00000000
[  229.941045]  r4:c2806d40
[  229.943675] [<c02693cc>] (kernel_restart_prepare) from [<c0269528>]
(kernel_restart+0x1c/0x60)
[  229.952405]  r5:00000000 r4:00000000
[  229.956084] [<c026950c>] (kernel_restart) from [<c0269810>]
(__do_sys_reboot+0x148/0x260)
[  229.964380]  r5:00000000 r4:bafb67c0
[  229.968057] [<c02696c8>] (__do_sys_reboot) from [<c0269998>]
(sys_reboot+0x18/0x1c)
[  229.975852]  r10:00000058 r9:dfdb0000 r8:c0200228 r7:00000058
r6:00000000 r5:00000004
[  229.983792]  r4:00000002
[  229.986422] [<c0269980>] (sys_reboot) from [<c0200060>]
(ret_fast_syscall+0x0/0x2c)
[  229.994190] Exception stack(0xdfdb3fa8 to 0xdfdb3ff0)
[  229.999350] 3fa0:                   00000002 00000004 fee1dead
28121969 01234567 000a9864
[  230.007669] 3fc0: 00000002 00000004 00000000 00000058 00000000
00000000 aedbe000 00000000
[  230.015974] 3fe0: aecce8f0 b6a81cec 000982d4 aecce910
[  230.021095]
[  230.022636] Allocated by task 20:
[  230.026039]  kasan_save_stack+0x24/0x48
[  230.029962]  __kasan_kmalloc.constprop.1+0xb8/0xc4
[  230.034842]  kasan_kmalloc+0x10/0x14
[  230.038495]  __kmalloc+0x168/0x2f4
[  230.041976]  __spi_alloc_controller+0x30/0xc0
[  230.046421]  bcm2835_spi_probe+0x90/0x4cc
[  230.050514]  platform_drv_probe+0x70/0xc8
[  230.054612]  really_probe+0x184/0x728
[  230.058361]  driver_probe_device+0xa4/0x278
[  230.062637]  __device_attach_driver+0xe8/0x148
[  230.067169]  bus_for_each_drv+0x108/0x158
[  230.071267]  __device_attach+0x190/0x234
[  230.075279]  device_initial_probe+0x1c/0x20
[  230.079551]  bus_probe_device+0xdc/0xec
[  230.083475]  deferred_probe_work_func+0xd4/0x11c
[  230.088196]  process_one_work+0x420/0x8f0
[  230.092293]  worker_thread+0x4fc/0x91c
[  230.096127]  kthread+0x21c/0x22c
[  230.099427]  ret_from_fork+0x14/0x20
[  230.103075]  0x0
[  230.104957]
[  230.106496] Freed by task 157:
[  230.109627]  kasan_save_stack+0x24/0x48
[  230.113542]  kasan_set_track+0x30/0x38
[  230.117375]  kasan_set_free_info+0x28/0x34
[  230.121553]  __kasan_slab_free+0x110/0x144
[  230.125732]  kasan_slab_free+0x14/0x18
[  230.129556]  kfree+0xbc/0x2b8
[  230.132597]  spi_controller_release+0x18/0x1c
[  230.137037]  device_release+0x4c/0xf0
[  230.140781]  kobject_put+0x14c/0x2d8
[  230.144434]  device_unregister+0x44/0x84
[  230.148438]  spi_unregister_controller+0xcc/0x124
[  230.153233]  bcm2835_spi_remove+0x5c/0xa4
[  230.157328]  bcm2835_spi_shutdown+0x1c/0x38
[  230.161593]  platform_drv_shutdown+0x40/0x44
[  230.165949]  device_shutdown+0x248/0x35c
[  230.169953]  kernel_restart_prepare+0x4c/0x50
[  230.174391]  kernel_restart+0x1c/0x60
[  230.178131]  __do_sys_reboot+0x148/0x260
[  230.182132]  sys_reboot+0x18/0x1c
[  230.185519]  ret_fast_syscall+0x0/0x2c
[  230.189335]  0xb6a81cec
[  230.191829]
[  230.193380] The buggy address belongs to the object at e0f08000
[  230.193380]  which belongs to the cache kmalloc-2k of size 2048
[  230.205354] The buggy address is located 856 bytes inside of
[  230.205354]  2048-byte region [e0f08000, e0f08800)
[  230.215907] The buggy address belongs to the page:
[  230.220806] page:b990e388 refcount:1 mapcount:0 mapping:00000000
index:0x0 pfn:0x20f08
[  230.228841] head:b990e388 order:3 compound_mapcount:0 compound_pincount:0
[  230.235731] flags: 0x2010200(slab|head)
[  230.239688] raw: 02010200 00000000 00000100 00000122 e4401800
00000000 80080008 00000000
[  230.247895] raw: ffffffff 00000001
[  230.251358] page dumped because: kasan: bad access detected
[  230.257000]
[  230.258534] Memory state around the buggy address:
[  230.263412]  e0f08200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  230.270038]  e0f08280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  230.276662] >e0f08300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  230.283272]                                             ^
[  230.288759]  e0f08380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  230.295384]  e0f08400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  230.301992]
==================================================================
[  230.309311] Disabling lock debugging due to kernel taint
[  230.325568] reboot: Restarting system

-- 
Florian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-13 23:48 Use after free in bcm2835_spi_remove() Florian Fainelli
@ 2020-10-14 14:09 ` Lukas Wunner
  2020-10-14 19:40   ` Vladimir Oltean
  0 siblings, 1 reply; 7+ messages in thread
From: Lukas Wunner @ 2020-10-14 14:09 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: linux-kernel, linux-spi, Mark Brown

On Tue, Oct 13, 2020 at 04:48:42PM -0700, Florian Fainelli wrote:
> With KASAN now working on ARM 32-bit, I was able to get the following
> trace upon reboot which invokes bcm2835_spi_shutdown() calling
> bcm2835_spi_remove(), the same can be triggered by doing a driver unbind:

Thank you for the report.  Apparently the problem is that
spi_unregister_controller() drops the last ref on the controller,
causing it to be freed, and afterwards we access the controller's
private data, which is part of the same allocation as
struct spi_controller:

bcm2835_spi_remove()
  spi_unregister_controller()
    device_unregister()
      put_device()
        spi_controller_release()  #  spi_master_class.dev_release()
	  kfree(ctlr)
  bcm2835_dma_release(ctlr, bs)
  ...

However, when I submitted commit 9dd277ff92d0, I double-checked that
the kfree() happens after bcm2835_spi_remove() has finished and I
even wrote in the commit message:

   "Note that
    the struct spi_controller as well as the driver-private data are not
    freed until after bcm2835_spi_remove() has finished, so accessing them
    is safe."

I'm puzzled now that it doesn't work as intended.  I do not see any
recent commits which changed the behavior, so I must have made a
mistake and missed something.

The below patch should fix the issue.  Could you verify that?
Unfortunately I do not have access to a RasPi currently.

An alternative to this patch would be a devm function which acquires
a ref on the spi controller on ->probe() and automatically releases it
after ->remove() has finished.  This could be used by other SPI drivers
as well.

Thanks,

Lukas

-- >8 --

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index 41986ac..5254fda 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -1377,6 +1377,7 @@ static int bcm2835_spi_remove(struct platform_device *pdev)
 
 	bcm2835_debugfs_remove(bs);
 
+	spi_controller_get(ctlr);
 	spi_unregister_controller(ctlr);
 
 	bcm2835_dma_release(ctlr, bs);
@@ -1386,6 +1387,7 @@ static int bcm2835_spi_remove(struct platform_device *pdev)
 		   BCM2835_SPI_CS_CLEAR_RX | BCM2835_SPI_CS_CLEAR_TX);
 
 	clk_disable_unprepare(bs->clk);
+	spi_controller_put(ctlr);
 
 	return 0;
 }


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-14 14:09 ` Lukas Wunner
@ 2020-10-14 19:40   ` Vladimir Oltean
  2020-10-14 20:25     ` Mark Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir Oltean @ 2020-10-14 19:40 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Florian Fainelli, linux-kernel, linux-spi, Mark Brown

On Wed, Oct 14, 2020 at 04:09:12PM +0200, Lukas Wunner wrote:
> Apparently the problem is that spi_unregister_controller() drops the
> last ref on the controller, causing it to be freed, and afterwards we
> access the controller's private data, which is part of the same
> allocation as struct spi_controller:
>
> bcm2835_spi_remove()
>   spi_unregister_controller()
>     device_unregister()
>       put_device()
>         spi_controller_release()  #  spi_master_class.dev_release()
> 	  kfree(ctlr)
>   bcm2835_dma_release(ctlr, bs)
>   ...

Also see these threads:
https://lore.kernel.org/linux-spi/20200922112241.GO4792@sirena.org.uk/T/#t
https://lore.kernel.org/linux-spi/270b94fd1e546d0c17a735c1f55500e58522da04.camel@suse.de/T/#u
And here's how _not_ to fix it:
https://lore.kernel.org/linux-spi/160088764365.36195.16185348610086043664.b4-ty@kernel.org/T/#t
At least without some care to not break other things:
https://lore.kernel.org/linux-spi/20200928080432.GC11648@pengutronix.de/T/#t

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-14 19:40   ` Vladimir Oltean
@ 2020-10-14 20:25     ` Mark Brown
  2020-10-14 21:20       ` Florian Fainelli
  2020-10-15  5:38       ` Lukas Wunner
  0 siblings, 2 replies; 7+ messages in thread
From: Mark Brown @ 2020-10-14 20:25 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Lukas Wunner, Florian Fainelli, linux-kernel, linux-spi


[-- Attachment #1: Type: text/plain, Size: 1100 bytes --]

On Wed, Oct 14, 2020 at 10:40:35PM +0300, Vladimir Oltean wrote:
> On Wed, Oct 14, 2020 at 04:09:12PM +0200, Lukas Wunner wrote:

> > Apparently the problem is that spi_unregister_controller() drops the
> > last ref on the controller, causing it to be freed, and afterwards we
> > access the controller's private data, which is part of the same
> > allocation as struct spi_controller:

> > bcm2835_spi_remove()
> >   spi_unregister_controller()
> >     device_unregister()
> >       put_device()
> >         spi_controller_release()  #  spi_master_class.dev_release()
> > 	  kfree(ctlr)
> >   bcm2835_dma_release(ctlr, bs)

> Also see these threads:
> https://lore.kernel.org/linux-spi/20200922112241.GO4792@sirena.org.uk/T/#t
> https://lore.kernel.org/linux-spi/270b94fd1e546d0c17a735c1f55500e58522da04.camel@suse.de/T/#u

Right, the proposed patch is yet another way to fix the issue - it all
comes back to the fact that you shouldn't be using the driver data after
unregistering if it was allocated as part of allocating the controller.
This framework feature is unfortunately quite error prone.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-14 20:25     ` Mark Brown
@ 2020-10-14 21:20       ` Florian Fainelli
  2020-10-15  5:38       ` Lukas Wunner
  1 sibling, 0 replies; 7+ messages in thread
From: Florian Fainelli @ 2020-10-14 21:20 UTC (permalink / raw)
  To: Mark Brown, Vladimir Oltean; +Cc: Lukas Wunner, linux-kernel, linux-spi


[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

On 10/14/20 1:25 PM, Mark Brown wrote:
> On Wed, Oct 14, 2020 at 10:40:35PM +0300, Vladimir Oltean wrote:
>> On Wed, Oct 14, 2020 at 04:09:12PM +0200, Lukas Wunner wrote:
> 
>>> Apparently the problem is that spi_unregister_controller() drops the
>>> last ref on the controller, causing it to be freed, and afterwards we
>>> access the controller's private data, which is part of the same
>>> allocation as struct spi_controller:
> 
>>> bcm2835_spi_remove()
>>>   spi_unregister_controller()
>>>     device_unregister()
>>>       put_device()
>>>         spi_controller_release()  #  spi_master_class.dev_release()
>>> 	  kfree(ctlr)
>>>   bcm2835_dma_release(ctlr, bs)
> 
>> Also see these threads:
>> https://lore.kernel.org/linux-spi/20200922112241.GO4792@sirena.org.uk/T/#t
>> https://lore.kernel.org/linux-spi/270b94fd1e546d0c17a735c1f55500e58522da04.camel@suse.de/T/#u
> 
> Right, the proposed patch is yet another way to fix the issue - it all
> comes back to the fact that you shouldn't be using the driver data after
> unregistering if it was allocated as part of allocating the controller.
> This framework feature is unfortunately quite error prone.

Lukas, your patch works fine for me and is only two lines, so maybe
better suited for stable. How about the attached patch?
-- 
Florian

[-- Attachment #2: 0001-spi-bcm2835-Fix-use-after-free-in-bcm2835_spi_remove.patch --]
[-- Type: text/x-patch, Size: 4973 bytes --]

From a4ee9da1ef09f9ddb04060e644b9c34fd532c189 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Wed, 14 Oct 2020 14:15:28 -0700
Subject: [PATCH] spi: bcm2835: Fix use-after-free in bcm2835_spi_remove()

In bcm2835_spi_remove(), spi_controller_unregister() will free the ctlr
reference which will lead to an use after free in bcm2835_release_dma().

To avoid this use after free, allocate the bcm2835_spi structure with a
different lifecycle than the spi_controller structure such that we
unregister the SPI controller, free up all the resources and finally let
device managed allocations free the bcm2835_spi structure.

Fixes: 05897c710e8e ("spi: bcm2835: Tear down DMA before turning off SPI controller")
Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/spi/spi-bcm2835.c | 46 +++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index 41986ac0fbfb..d66358e6b5cd 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -847,42 +847,41 @@ static bool bcm2835_spi_can_dma(struct spi_controller *ctlr,
 	return true;
 }
 
-static void bcm2835_dma_release(struct spi_controller *ctlr,
-				struct bcm2835_spi *bs)
+static void bcm2835_dma_release(struct bcm2835_spi *bs,
+				struct dma_chan *dma_tx,
+				struct dma_chan *dma_rx)
 {
 	int i;
 
-	if (ctlr->dma_tx) {
-		dmaengine_terminate_sync(ctlr->dma_tx);
+	if (dma_tx) {
+		dmaengine_terminate_sync(dma_tx);
 
 		if (bs->fill_tx_desc)
 			dmaengine_desc_free(bs->fill_tx_desc);
 
 		if (bs->fill_tx_addr)
-			dma_unmap_page_attrs(ctlr->dma_tx->device->dev,
+			dma_unmap_page_attrs(dma_tx->device->dev,
 					     bs->fill_tx_addr, sizeof(u32),
 					     DMA_TO_DEVICE,
 					     DMA_ATTR_SKIP_CPU_SYNC);
 
-		dma_release_channel(ctlr->dma_tx);
-		ctlr->dma_tx = NULL;
+		dma_release_channel(dma_tx);
 	}
 
-	if (ctlr->dma_rx) {
-		dmaengine_terminate_sync(ctlr->dma_rx);
+	if (dma_rx) {
+		dmaengine_terminate_sync(dma_rx);
 
 		for (i = 0; i < BCM2835_SPI_NUM_CS; i++)
 			if (bs->clear_rx_desc[i])
 				dmaengine_desc_free(bs->clear_rx_desc[i]);
 
 		if (bs->clear_rx_addr)
-			dma_unmap_single(ctlr->dma_rx->device->dev,
+			dma_unmap_single(dma_rx->device->dev,
 					 bs->clear_rx_addr,
 					 sizeof(bs->clear_rx_cs),
 					 DMA_TO_DEVICE);
 
-		dma_release_channel(ctlr->dma_rx);
-		ctlr->dma_rx = NULL;
+		dma_release_channel(dma_rx);
 	}
 }
 
@@ -1010,7 +1009,7 @@ static int bcm2835_dma_init(struct spi_controller *ctlr, struct device *dev,
 	dev_err(dev, "issue configuring dma: %d - not using DMA mode\n",
 		ret);
 err_release:
-	bcm2835_dma_release(ctlr, bs);
+	bcm2835_dma_release(bs, ctlr->dma_tx, ctlr->dma_rx);
 err:
 	/*
 	 * Only report error for deferred probing, otherwise fall back to
@@ -1291,12 +1290,17 @@ static int bcm2835_spi_probe(struct platform_device *pdev)
 	struct bcm2835_spi *bs;
 	int err;
 
+	bs = devm_kzalloc(&pdev->dev, sizeof(*bs), GFP_KERNEL);
+	if (!bs)
+		return -ENOMEM;
+
 	ctlr = spi_alloc_master(&pdev->dev, ALIGN(sizeof(*bs),
 						  dma_get_cache_alignment()));
 	if (!ctlr)
 		return -ENOMEM;
 
-	platform_set_drvdata(pdev, ctlr);
+	spi_controller_set_devdata(ctlr, bs);
+	platform_set_drvdata(pdev, bs);
 
 	ctlr->use_gpio_descriptors = true;
 	ctlr->mode_bits = BCM2835_SPI_MODE_BITS;
@@ -1308,7 +1312,6 @@ static int bcm2835_spi_probe(struct platform_device *pdev)
 	ctlr->prepare_message = bcm2835_spi_prepare_message;
 	ctlr->dev.of_node = pdev->dev.of_node;
 
-	bs = spi_controller_get_devdata(ctlr);
 	bs->ctlr = ctlr;
 
 	bs->regs = devm_platform_ioremap_resource(pdev, 0);
@@ -1362,7 +1365,7 @@ static int bcm2835_spi_probe(struct platform_device *pdev)
 	return 0;
 
 out_dma_release:
-	bcm2835_dma_release(ctlr, bs);
+	bcm2835_dma_release(bs, ctlr->dma_tx, ctlr->dma_rx);
 out_clk_disable:
 	clk_disable_unprepare(bs->clk);
 out_controller_put:
@@ -1372,14 +1375,19 @@ static int bcm2835_spi_probe(struct platform_device *pdev)
 
 static int bcm2835_spi_remove(struct platform_device *pdev)
 {
-	struct spi_controller *ctlr = platform_get_drvdata(pdev);
-	struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
+	struct bcm2835_spi *bs = platform_get_drvdata(pdev);
+	struct spi_controller *ctlr = bs->ctlr;
+	struct dma_chan *tx_chan = ctlr->dma_tx;
+	struct dma_chan *rx_chan = ctlr->dma_rx;
 
 	bcm2835_debugfs_remove(bs);
 
 	spi_unregister_controller(ctlr);
 
-	bcm2835_dma_release(ctlr, bs);
+	/* ctlr is freed by spi_unregister_controller() use the cached dma_chan
+	 * references.
+	 */
+	bcm2835_dma_release(bs, tx_chan, rx_chan);
 
 	/* Clear FIFOs, and disable the HW block */
 	bcm2835_wr(bs, BCM2835_SPI_CS,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-14 20:25     ` Mark Brown
  2020-10-14 21:20       ` Florian Fainelli
@ 2020-10-15  5:38       ` Lukas Wunner
  2020-10-15 12:53         ` Mark Brown
  1 sibling, 1 reply; 7+ messages in thread
From: Lukas Wunner @ 2020-10-15  5:38 UTC (permalink / raw)
  To: Mark Brown
  Cc: Vladimir Oltean, Florian Fainelli, linux-kernel, linux-spi, Sascha Hauer

[cc += Sascha]

On Wed, Oct 14, 2020 at 09:25:05PM +0100, Mark Brown wrote:
> > On Wed, Oct 14, 2020 at 04:09:12PM +0200, Lukas Wunner wrote:
> > > Apparently the problem is that spi_unregister_controller() drops the
> > > last ref on the controller, causing it to be freed, and afterwards we
> > > access the controller's private data, which is part of the same
> > > allocation as struct spi_controller:
> 
> Right, the proposed patch is yet another way to fix the issue - it all
> comes back to the fact that you shouldn't be using the driver data after
> unregistering if it was allocated as part of allocating the controller.
> This framework feature is unfortunately quite error prone.

How about holding a ref on the controller as long as the SPI driver
is bound to the controller's parent device?  See below for a patch,
compile-tested only for lack of an SPI-equipped machine.

Makes sense or dumb idea?

If this approach is deemed to be a case of "midlayer fallacy",
we could alternatively do this in a library function which drivers
opt-in to.  Or, given that the majority of drivers seems to be affected,
make it the default and allow drivers to opt-out.

-- >8 --

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 0cab239..5afa275 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2399,6 +2399,11 @@ static ssize_t slave_store(struct device *dev, struct device_attribute *attr,
 extern struct class spi_slave_class;	/* dummy */
 #endif
 
+static void __spi_controller_put(void *ctlr)
+{
+	spi_controller_put(ctlr);
+}
+
 /**
  * __spi_alloc_controller - allocate an SPI master or slave controller
  * @dev: the controller, possibly using the platform_bus
@@ -2414,6 +2419,7 @@ static ssize_t slave_store(struct device *dev, struct device_attribute *attr,
  * This call is used only by SPI controller drivers, which are the
  * only ones directly touching chip registers.  It's how they allocate
  * an spi_controller structure, prior to calling spi_register_controller().
+ * The structure is accessible as long as the SPI driver is bound to @dev.
  *
  * This must be called from context that can sleep.
  *
@@ -2429,6 +2435,7 @@ struct spi_controller *__spi_alloc_controller(struct device *dev,
 {
 	struct spi_controller	*ctlr;
 	size_t ctlr_size = ALIGN(sizeof(*ctlr), dma_get_cache_alignment());
+	int ret;
 
 	if (!dev)
 		return NULL;
@@ -2449,6 +2456,13 @@ struct spi_controller *__spi_alloc_controller(struct device *dev,
 	pm_suspend_ignore_children(&ctlr->dev, true);
 	spi_controller_set_devdata(ctlr, (void *)ctlr + ctlr_size);
 
+	spi_controller_get(ctlr);
+	ret = devm_add_action(dev, __spi_controller_put, ctlr);
+	if (ret) {
+		kfree(ctlr);
+		return NULL;
+	}
+
 	return ctlr;
 }
 EXPORT_SYMBOL_GPL(__spi_alloc_controller);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Use after free in bcm2835_spi_remove()
  2020-10-15  5:38       ` Lukas Wunner
@ 2020-10-15 12:53         ` Mark Brown
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Brown @ 2020-10-15 12:53 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Vladimir Oltean, Florian Fainelli, linux-kernel, linux-spi, Sascha Hauer


[-- Attachment #1: Type: text/plain, Size: 1504 bytes --]

On Thu, Oct 15, 2020 at 07:38:29AM +0200, Lukas Wunner wrote:
> On Wed, Oct 14, 2020 at 09:25:05PM +0100, Mark Brown wrote:

> > Right, the proposed patch is yet another way to fix the issue - it all
> > comes back to the fact that you shouldn't be using the driver data after
> > unregistering if it was allocated as part of allocating the controller.
> > This framework feature is unfortunately quite error prone.

> How about holding a ref on the controller as long as the SPI driver
> is bound to the controller's parent device?  See below for a patch,
> compile-tested only for lack of an SPI-equipped machine.

> Makes sense or dumb idea?

> If this approach is deemed to be a case of "midlayer fallacy",
> we could alternatively do this in a library function which drivers
> opt-in to.  Or, given that the majority of drivers seems to be affected,
> make it the default and allow drivers to opt-out.

...

> +	spi_controller_get(ctlr);
> +	ret = devm_add_action(dev, __spi_controller_put, ctlr);
> +	if (ret) {
> +		kfree(ctlr);

This feels a bit icky - we're masking a standard use after free bug that
affects devm in general, not just this instance, and so while it will
work it doesn't feel great.  If we did do this it'd need more comments
and should probably be conditional on using the feature.  TBH I'm just
thinking it's better to just remove the feature, it's clearly error
prone and pretty redundant with devm.  I'm not sure any memory savings
it's delivering are worth the sharp edges.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-13 23:48 Use after free in bcm2835_spi_remove() Florian Fainelli
2020-10-14 14:09 ` Lukas Wunner
2020-10-14 19:40   ` Vladimir Oltean
2020-10-14 20:25     ` Mark Brown
2020-10-14 21:20       ` Florian Fainelli
2020-10-15  5:38       ` Lukas Wunner
2020-10-15 12:53         ` Mark Brown

Linux-SPI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-spi/0 linux-spi/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-spi linux-spi/ https://lore.kernel.org/linux-spi \
		linux-spi@vger.kernel.org
	public-inbox-index linux-spi

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-spi


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git