All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-04-30 10:36 Ricardo Ribalda Delgado
  2014-05-01  1:34   ` Mark Brown
  0 siblings, 1 reply; 15+ messages in thread
From: Ricardo Ribalda Delgado @ 2014-04-30 10:36 UTC (permalink / raw)
  To: linux-spi, broonie, linux-kernel; +Cc: Ricardo Ribalda Delgado

Since "786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
make kthread_create() killable" kthread_run can be killed by the user,
ie, by killing modprobe.

When this happens spi_destroy_queue() fails to cleanout the resources
ending up in a hung process that outputs a error message and avoids
the computer to be halted.

This bug is particulary critical on machines that relay on the spi
subsystem to hold a file system (MTD over SPI).

[   95.900328] spi_master spi32765: failed to create message pump task
[   95.900358] spi_master spi32765: problem initializing queue
[   95.912333] qtec_pcie b0030000.pcie_bridge: Axi-pcie bridge supervisor
[  240.463698] INFO: task udevd:129 blocked for more than 120 seconds.
[  240.463724]       Not tainted 3.13.0-qtec-standard+ #191
[  240.463733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.463743] udevd           D ffff880087815ac0     0   129    118 0x00000006
[  240.463762]  ffff880086593590 0000000000000002 ffff880087815ac0 ffff880086593fd8
[  240.463778]  0000000000012c00 0000000000012c00 ffff880087815ac0 ffff880086593520
[  240.463792]  ffffffff813debe7 ffffffff810be6df ffffffff81da539c 0000000000000000
[  240.463807] Call Trace:
[  240.463836]  [<ffffffff813debe7>] ? vt_console_print+0x57/0x3e0
[  240.463852]  [<ffffffff810be6df>] ? print_prefix+0x6f/0xb0
[  240.463867]  [<ffffffff810bf45c>] ? wake_up_klogd+0x3c/0x50
[  240.463880]  [<ffffffff810bf67b>] ? console_unlock+0x20b/0x3f0
[  240.463897]  [<ffffffff8178b799>] schedule+0x29/0x70
[  240.463910]  [<ffffffff8178a8d9>] schedule_timeout+0x189/0x240
[  240.463925]  [<ffffffff813fd510>] ? dev_vprintk_emit+0x50/0x60
[  240.463939]  [<ffffffff8178c167>] wait_for_common+0xe7/0x190
[  240.463954]  [<ffffffff810a7780>] ? wake_up_process+0x40/0x40
[  240.463968]  [<ffffffff8178c22d>] wait_for_completion+0x1d/0x20
[  240.463981]  [<ffffffff810996bc>] flush_kthread_worker+0x6c/0x80
[  240.463995]  [<ffffffff8178c2cd>] ? wait_for_completion_killable+0x1d/0x40
[  240.464007]  [<ffffffff81099162>] ? kthread_create_on_node+0xd2/0x190
[  240.464020]  [<ffffffff81099000>] ? kthread_freezable_should_stop+0x80/0x80
[  240.464036]  [<ffffffff8146c637>] spi_destroy_queue+0x27/0x60
[  240.464050]  [<ffffffff8146cc46>] spi_register_master+0x5d6/0x900
[  240.464072]  [<ffffffffa0000713>] spi_bitbang_start+0x83/0x110 [spi_bitbang]
[  240.464089]  [<ffffffffa0005960>] xilinx_spi_probe+0x230/0x3aa [spi_xilinx]
[  240.464104]  [<ffffffff81403645>] platform_drv_probe+0x45/0xb0
[  240.464120]  [<ffffffff81401052>] ? driver_sysfs_add+0x82/0xb0
[  240.464134]  [<ffffffff81401795>] driver_probe_device+0x125/0x3b0
[  240.464149]  [<ffffffff81401a20>] ? driver_probe_device+0x3b0/0x3b0
[  240.464163]  [<ffffffff81401a5b>] __device_attach+0x3b/0x40
[  240.464177]  [<ffffffff813ff7c3>] bus_for_each_drv+0x63/0xa0
[  240.464191]  [<ffffffff814015f8>] device_attach+0x88/0xa0
[  240.464205]  [<ffffffff81400a08>] bus_probe_device+0x98/0xc0
[  240.464218]  [<ffffffff813fe865>] device_add+0x495/0x600
[  240.464233]  [<ffffffff815d860f>] of_device_add+0x2f/0x40
[  240.464246]  [<ffffffff815d8dab>] of_platform_device_create_pdata+0x6b/0xb0
[  240.464259]  [<ffffffff815d8f03>] of_platform_bus_create+0xf3/0x200
[  240.464272]  [<ffffffff815d7228>] ? __of_match_node+0x48/0xe0
[  240.464286]  [<ffffffff815d8f5e>] of_platform_bus_create+0x14e/0x200
[  240.464298]  [<ffffffff815d6adc>] ? __of_find_property+0x3c/0x80
[  240.464312]  [<ffffffff815d9177>] of_platform_populate+0x47/0x90
[  240.464334]  [<ffffffffa000bcf2>] qt5023_of_populate+0x562/0x980 [qt5023]
[  240.464357]  [<ffffffffa000c363>] qt5023_of_probe+0x1d3/0x300 [qt5023]
[  240.464472]  [<ffffffffa000a123>] qt5023_pci_probe+0xd3/0x1c0 [qt5023]
[  240.464491]  [<ffffffff813682d4>] pci_device_probe+0x84/0xe0
[  240.464507]  [<ffffffff81401795>] driver_probe_device+0x125/0x3b0
[  240.464522]  [<ffffffff81401af3>] __driver_attach+0x93/0xa0
[  240.464536]  [<ffffffff81401a60>] ? __device_attach+0x40/0x40
[  240.464550]  [<ffffffff813ff703>] bus_for_each_dev+0x63/0xa0
[  240.464585]  [<ffffffff8140114e>] driver_attach+0x1e/0x20
[  240.464640]  [<ffffffff81400d30>] bus_add_driver+0x180/0x250
[  240.464687]  [<ffffffffa0010000>] ? 0xffffffffa000ffff
[  240.464738]  [<ffffffff81402164>] driver_register+0x64/0xf0
[  240.464789]  [<ffffffffa0010000>] ? 0xffffffffa000ffff
[  240.464845]  [<ffffffff81367d9b>] __pci_register_driver+0x4b/0x50
[  240.464899]  [<ffffffffa0010031>] qt5023_init+0x31/0x33 [qt5023]
[  240.464943]  [<ffffffff810002f2>] do_one_initcall+0xd2/0x180
[  240.464960]  [<ffffffff8109da88>] ? __blocking_notifier_call_chain+0x58/0x70
[  240.464976]  [<ffffffff810e07c6>] load_module+0x1ad6/0x23a0
[  240.464992]  [<ffffffff810dd5d0>] ? store_uevent+0x40/0x40
[  240.465010]  [<ffffffff810de5ea>] ? copy_module_from_fd.isra.50+0x12a/0x190
[  240.465024]  [<ffffffff810e1226>] SyS_finit_module+0x86/0xb0
[  240.465041]  [<ffffffff81797049>] tracesys+0xd0/0xd5
[  360.540806] INFO: task udevd:129 blocked for more than 120 seconds.
[  360.540832]       Not tainted 3.13.0-qtec-standard+ #191
[  360.540841] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.540851] udevd           D ffff880087815ac0     0   129    118 0x00000006
[  360.540871]  ffff880086593590 0000000000000002 ffff880087815ac0 ffff880086593fd8
[  360.540886]  0000000000012c00 0000000000012c00 ffff880087815ac0 ffff880086593520
[  360.540900]  ffffffff813debe7 ffffffff810be6df ffffffff81da539c 0000000000000000
[  360.540915] Call Trace:
[  360.540943]  [<ffffffff813debe7>] ? vt_console_print+0x57/0x3e0
[  360.540959]  [<ffffffff810be6df>] ? print_prefix+0x6f/0xb0
[  360.540974]  [<ffffffff810bf45c>] ? wake_up_klogd+0x3c/0x50
[  360.540988]  [<ffffffff810bf67b>] ? console_unlock+0x20b/0x3f0
[  360.541004]  [<ffffffff8178b799>] schedule+0x29/0x70
[  360.541017]  [<ffffffff8178a8d9>] schedule_timeout+0x189/0x240
[  360.541032]  [<ffffffff813fd510>] ? dev_vprintk_emit+0x50/0x60
[  360.541046]  [<ffffffff8178c167>] wait_for_common+0xe7/0x190
[  360.541061]  [<ffffffff810a7780>] ? wake_up_process+0x40/0x40
[  360.541075]  [<ffffffff8178c22d>] wait_for_completion+0x1d/0x20
[  360.541089]  [<ffffffff810996bc>] flush_kthread_worker+0x6c/0x80
[  360.541102]  [<ffffffff8178c2cd>] ? wait_for_completion_killable+0x1d/0x40
[  360.541115]  [<ffffffff81099162>] ? kthread_create_on_node+0xd2/0x190
[  360.541127]  [<ffffffff81099000>] ? kthread_freezable_should_stop+0x80/0x80
[  360.541144]  [<ffffffff8146c637>] spi_destroy_queue+0x27/0x60
[  360.541157]  [<ffffffff8146cc46>] spi_register_master+0x5d6/0x900
[  360.541179]  [<ffffffffa0000713>] spi_bitbang_start+0x83/0x110 [spi_bitbang]
[  360.541196]  [<ffffffffa0005960>] xilinx_spi_probe+0x230/0x3aa [spi_xilinx]
[  360.541211]  [<ffffffff81403645>] platform_drv_probe+0x45/0xb0
[  360.541226]  [<ffffffff81401052>] ? driver_sysfs_add+0x82/0xb0
[  360.541240]  [<ffffffff81401795>] driver_probe_device+0x125/0x3b0
[  360.541255]  [<ffffffff81401a20>] ? driver_probe_device+0x3b0/0x3b0
[  360.541269]  [<ffffffff81401a5b>] __device_attach+0x3b/0x40
[  360.541283]  [<ffffffff813ff7c3>] bus_for_each_drv+0x63/0xa0
[  360.541297]  [<ffffffff814015f8>] device_attach+0x88/0xa0
[  360.541311]  [<ffffffff81400a08>] bus_probe_device+0x98/0xc0
[  360.541324]  [<ffffffff813fe865>] device_add+0x495/0x600
[  360.541339]  [<ffffffff815d860f>] of_device_add+0x2f/0x40
[  360.541352]  [<ffffffff815d8dab>] of_platform_device_create_pdata+0x6b/0xb0
[  360.541366]  [<ffffffff815d8f03>] of_platform_bus_create+0xf3/0x200
[  360.541378]  [<ffffffff815d7228>] ? __of_match_node+0x48/0xe0
[  360.541392]  [<ffffffff815d8f5e>] of_platform_bus_create+0x14e/0x200
[  360.541404]  [<ffffffff815d6adc>] ? __of_find_property+0x3c/0x80
[  360.541418]  [<ffffffff815d9177>] of_platform_populate+0x47/0x90
[  360.541441]  [<ffffffffa000bcf2>] qt5023_of_populate+0x562/0x980 [qt5023]
[  360.541463]  [<ffffffffa000c363>] qt5023_of_probe+0x1d3/0x300 [qt5023]
[  360.541484]  [<ffffffffa000a123>] qt5023_pci_probe+0xd3/0x1c0 [qt5023]
[  360.541502]  [<ffffffff813682d4>] pci_device_probe+0x84/0xe0
[  360.541517]  [<ffffffff81401795>] driver_probe_device+0x125/0x3b0
[  360.541620]  [<ffffffff81401af3>] __driver_attach+0x93/0xa0
[  360.541636]  [<ffffffff81401a60>] ? __device_attach+0x40/0x40
[  360.541650]  [<ffffffff813ff703>] bus_for_each_dev+0x63/0xa0
[  360.541665]  [<ffffffff8140114e>] driver_attach+0x1e/0x20
[  360.541680]  [<ffffffff81400d30>] bus_add_driver+0x180/0x250
[  360.541748]  [<ffffffffa0010000>] ? 0xffffffffa000ffff
[  360.541793]  [<ffffffff81402164>] driver_register+0x64/0xf0
[  360.541845]  [<ffffffffa0010000>] ? 0xffffffffa000ffff
[  360.541895]  [<ffffffff81367d9b>] __pci_register_driver+0x4b/0x50
[  360.541954]  [<ffffffffa0010031>] qt5023_init+0x31/0x33 [qt5023]
[  360.542007]  [<ffffffff810002f2>] do_one_initcall+0xd2/0x180
[  360.542062]  [<ffffffff8109da88>] ? __blocking_notifier_call_chain+0x58/0x70
[  360.542085]  [<ffffffff810e07c6>] load_module+0x1ad6/0x23a0
[  360.542101]  [<ffffffff810dd5d0>] ? store_uevent+0x40/0x40
[  360.542119]  [<ffffffff810de5ea>] ? copy_module_from_fd.isra.50+0x12a/0x190
[  360.542133]  [<ffffffff810e1226>] SyS_finit_module+0x86/0xb0
[  360.542150]  [<ffffffff81797049>] tracesys+0xd0/0xd5

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---
 drivers/spi/spi.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 4eb9bf0..8331556 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1114,8 +1114,10 @@ static int spi_destroy_queue(struct spi_master *master)
 		return ret;
 	}
 
-	flush_kthread_worker(&master->kworker);
-	kthread_stop(master->kworker_task);
+	if (!IS_ERR(master->kworker_task)) {
+		flush_kthread_worker(&master->kworker);
+		kthread_stop(master->kworker_task);
+	}
 
 	return 0;
 }
-- 
1.9.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  1:34   ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01  1:34 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado; +Cc: linux-spi, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 757 bytes --]

On Wed, Apr 30, 2014 at 12:36:11PM +0200, Ricardo Ribalda Delgado wrote:

> Since "786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
> make kthread_create() killable" kthread_run can be killed by the user,
> ie, by killing modprobe.

> -	flush_kthread_worker(&master->kworker);
> -	kthread_stop(master->kworker_task);
> +	if (!IS_ERR(master->kworker_task)) {
> +		flush_kthread_worker(&master->kworker);
> +		kthread_stop(master->kworker_task);
> +	}

How does this fix avoid racing with the task being killed?  It will
improve things but it doesn't look like a full fix.

Please don't include entire backtraces in commit messages, they're
typically extremely long and don't really add anything - edited
highlights are fine but try to keep it to the point.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  1:34   ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01  1:34 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado
  Cc: linux-spi-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 757 bytes --]

On Wed, Apr 30, 2014 at 12:36:11PM +0200, Ricardo Ribalda Delgado wrote:

> Since "786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
> make kthread_create() killable" kthread_run can be killed by the user,
> ie, by killing modprobe.

> -	flush_kthread_worker(&master->kworker);
> -	kthread_stop(master->kworker_task);
> +	if (!IS_ERR(master->kworker_task)) {
> +		flush_kthread_worker(&master->kworker);
> +		kthread_stop(master->kworker_task);
> +	}

How does this fix avoid racing with the task being killed?  It will
improve things but it doesn't look like a full fix.

Please don't include entire backtraces in commit messages, they're
typically extremely long and don't really add anything - edited
highlights are fine but try to keep it to the point.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  5:53     ` Ricardo Ribalda Delgado
  0 siblings, 0 replies; 15+ messages in thread
From: Ricardo Ribalda Delgado @ 2014-05-01  5:53 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, LKML

Hello Mark

Thanks for your comments. This fix does not avoid the task being
killed (which is not an error). What it does is that IF the task is
killed or we are out of memory we will exit with all the resources
properly released and no locks helds, giving the user a chance to
reload/rebind the module instead of getting and unstable system that
cannot even reboot without a powercycle.

The mention of 786235eeba0e1e85e5cbbb9f97d1087ad03dfa2 is because the condition

if (IS_ERR(master->kworker_task)) {
dev_err(&master->dev, "failed to create message pump task\n");
return -ENOMEM;
}

was very unlikely to happen/difficult to force .

I will reword the message and clean out he backtrace and upload a v2.

Thanks!


On Thu, May 1, 2014 at 3:34 AM, Mark Brown <broonie@kernel.org> wrote:
> On Wed, Apr 30, 2014 at 12:36:11PM +0200, Ricardo Ribalda Delgado wrote:
>
>> Since "786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
>> make kthread_create() killable" kthread_run can be killed by the user,
>> ie, by killing modprobe.
>
>> -     flush_kthread_worker(&master->kworker);
>> -     kthread_stop(master->kworker_task);
>> +     if (!IS_ERR(master->kworker_task)) {
>> +             flush_kthread_worker(&master->kworker);
>> +             kthread_stop(master->kworker_task);
>> +     }
>
> How does this fix avoid racing with the task being killed?  It will
> improve things but it doesn't look like a full fix.
>
> Please don't include entire backtraces in commit messages, they're
> typically extremely long and don't really add anything - edited
> highlights are fine but try to keep it to the point.



-- 
Ricardo Ribalda

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  5:53     ` Ricardo Ribalda Delgado
  0 siblings, 0 replies; 15+ messages in thread
From: Ricardo Ribalda Delgado @ 2014-05-01  5:53 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi-u79uwXL29TY76Z2rM5mHXA, LKML

Hello Mark

Thanks for your comments. This fix does not avoid the task being
killed (which is not an error). What it does is that IF the task is
killed or we are out of memory we will exit with all the resources
properly released and no locks helds, giving the user a chance to
reload/rebind the module instead of getting and unstable system that
cannot even reboot without a powercycle.

The mention of 786235eeba0e1e85e5cbbb9f97d1087ad03dfa2 is because the condition

if (IS_ERR(master->kworker_task)) {
dev_err(&master->dev, "failed to create message pump task\n");
return -ENOMEM;
}

was very unlikely to happen/difficult to force .

I will reword the message and clean out he backtrace and upload a v2.

Thanks!


On Thu, May 1, 2014 at 3:34 AM, Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, Apr 30, 2014 at 12:36:11PM +0200, Ricardo Ribalda Delgado wrote:
>
>> Since "786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
>> make kthread_create() killable" kthread_run can be killed by the user,
>> ie, by killing modprobe.
>
>> -     flush_kthread_worker(&master->kworker);
>> -     kthread_stop(master->kworker_task);
>> +     if (!IS_ERR(master->kworker_task)) {
>> +             flush_kthread_worker(&master->kworker);
>> +             kthread_stop(master->kworker_task);
>> +     }
>
> How does this fix avoid racing with the task being killed?  It will
> improve things but it doesn't look like a full fix.
>
> Please don't include entire backtraces in commit messages, they're
> typically extremely long and don't really add anything - edited
> highlights are fine but try to keep it to the point.



-- 
Ricardo Ribalda
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 14:11       ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 14:11 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado; +Cc: linux-spi, LKML

[-- Attachment #1: Type: text/plain, Size: 572 bytes --]

On Thu, May 01, 2014 at 07:53:58AM +0200, Ricardo Ribalda Delgado wrote:

> Thanks for your comments. This fix does not avoid the task being
> killed (which is not an error). What it does is that IF the task is
> killed or we are out of memory we will exit with all the resources
> properly released and no locks helds, giving the user a chance to
> reload/rebind the module instead of getting and unstable system that
> cannot even reboot without a powercycle.

How does it ensure that?  What's to stop the task being killed after we
check to see if the task was killed.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 14:11       ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 14:11 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado; +Cc: linux-spi-u79uwXL29TY76Z2rM5mHXA, LKML

[-- Attachment #1: Type: text/plain, Size: 572 bytes --]

On Thu, May 01, 2014 at 07:53:58AM +0200, Ricardo Ribalda Delgado wrote:

> Thanks for your comments. This fix does not avoid the task being
> killed (which is not an error). What it does is that IF the task is
> killed or we are out of memory we will exit with all the resources
> properly released and no locks helds, giving the user a chance to
> reload/rebind the module instead of getting and unstable system that
> cannot even reboot without a powercycle.

How does it ensure that?  What's to stop the task being killed after we
check to see if the task was killed.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 14:33         ` Geert Uytterhoeven
  0 siblings, 0 replies; 15+ messages in thread
From: Geert Uytterhoeven @ 2014-05-01 14:33 UTC (permalink / raw)
  To: Mark Brown; +Cc: Ricardo Ribalda Delgado, linux-spi, LKML

On Thu, May 1, 2014 at 4:11 PM, Mark Brown <broonie@kernel.org> wrote:
> On Thu, May 01, 2014 at 07:53:58AM +0200, Ricardo Ribalda Delgado wrote:
>
>> Thanks for your comments. This fix does not avoid the task being
>> killed (which is not an error). What it does is that IF the task is
>> killed or we are out of memory we will exit with all the resources
>> properly released and no locks helds, giving the user a chance to
>> reload/rebind the module instead of getting and unstable system that
>> cannot even reboot without a powercycle.
>
> How does it ensure that?  What's to stop the task being killed after we
> check to see if the task was killed.

master->kworker_task is set like this:

    master->kworker_task = kthread_run(...)

so it just contains the status of the creation of the kthread, not if it was
killed, right? Hence we don't check if it was killed.

So Ricardo's patch prevents the stopping and destruction of the thread
if it failed to be _created_.

What if it is killed? I suppose the kthread API handles that internally, as it
could happen to any thread (e.g. OOM)?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 14:33         ` Geert Uytterhoeven
  0 siblings, 0 replies; 15+ messages in thread
From: Geert Uytterhoeven @ 2014-05-01 14:33 UTC (permalink / raw)
  To: Mark Brown; +Cc: Ricardo Ribalda Delgado, linux-spi, LKML

On Thu, May 1, 2014 at 4:11 PM, Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu, May 01, 2014 at 07:53:58AM +0200, Ricardo Ribalda Delgado wrote:
>
>> Thanks for your comments. This fix does not avoid the task being
>> killed (which is not an error). What it does is that IF the task is
>> killed or we are out of memory we will exit with all the resources
>> properly released and no locks helds, giving the user a chance to
>> reload/rebind the module instead of getting and unstable system that
>> cannot even reboot without a powercycle.
>
> How does it ensure that?  What's to stop the task being killed after we
> check to see if the task was killed.

master->kworker_task is set like this:

    master->kworker_task = kthread_run(...)

so it just contains the status of the creation of the kthread, not if it was
killed, right? Hence we don't check if it was killed.

So Ricardo's patch prevents the stopping and destruction of the thread
if it failed to be _created_.

What if it is killed? I suppose the kthread API handles that internally, as it
could happen to any thread (e.g. OOM)?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 16:03           ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 16:03 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Ricardo Ribalda Delgado, linux-spi, LKML

[-- Attachment #1: Type: text/plain, Size: 877 bytes --]

On Thu, May 01, 2014 at 04:33:58PM +0200, Geert Uytterhoeven wrote:
> On Thu, May 1, 2014 at 4:11 PM, Mark Brown <broonie@kernel.org> wrote:

> master->kworker_task is set like this:

>     master->kworker_task = kthread_run(...)

> so it just contains the status of the creation of the kthread, not if it was
> killed, right? Hence we don't check if it was killed.

> So Ricardo's patch prevents the stopping and destruction of the thread
> if it failed to be _created_.

OK, so that means that the description is very confusing then since it's
talking about the issue being due to kthread_run being killed.

> What if it is killed? I suppose the kthread API handles that internally, as it
> could happen to any thread (e.g. OOM)?

I'm not 100% clear on this to be honest.  Since I'm at ELC I've not
investigated fully yet and I'm mostly going on the patch descriptions
here.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 16:03           ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 16:03 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Ricardo Ribalda Delgado, linux-spi, LKML

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

On Thu, May 01, 2014 at 04:33:58PM +0200, Geert Uytterhoeven wrote:
> On Thu, May 1, 2014 at 4:11 PM, Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:

> master->kworker_task is set like this:

>     master->kworker_task = kthread_run(...)

> so it just contains the status of the creation of the kthread, not if it was
> killed, right? Hence we don't check if it was killed.

> So Ricardo's patch prevents the stopping and destruction of the thread
> if it failed to be _created_.

OK, so that means that the description is very confusing then since it's
talking about the issue being due to kthread_run being killed.

> What if it is killed? I suppose the kthread API handles that internally, as it
> could happen to any thread (e.g. OOM)?

I'm not 100% clear on this to be honest.  Since I'm at ELC I've not
investigated fully yet and I'm mostly going on the patch descriptions
here.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 16:09   ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 16:09 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado; +Cc: linux-spi, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On Thu, May 01, 2014 at 10:15:15AM +0200, Ricardo Ribalda Delgado wrote:

> If kthread_run on spi_init_queue() fails, spi_destroy_queue can lead to
> hang timeout.

...

> When this happens, spi_destroy_queue() leads to a hung process that
> outputs a error message and avoids the computer to be halted/rebooted.

Why is the fix for this not to avoid running spi_destroy_queue() in the
first place?  I would not in general expect it to be robust to call the
destructor function if the init function failed (and indeed this looks
like what's happening with the kthread code here) - even if it works now
it seems like it will be a source of bugs in the future.

> -	flush_kthread_worker(&master->kworker);
> -	kthread_stop(master->kworker_task);
> +	if (!IS_ERR(master->kworker_task)) {
> +		flush_kthread_worker(&master->kworker);
> +		kthread_stop(master->kworker_task);
> +	}

This still just looks like a race condition.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01 16:09   ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2014-05-01 16:09 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado
  Cc: linux-spi-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On Thu, May 01, 2014 at 10:15:15AM +0200, Ricardo Ribalda Delgado wrote:

> If kthread_run on spi_init_queue() fails, spi_destroy_queue can lead to
> hang timeout.

...

> When this happens, spi_destroy_queue() leads to a hung process that
> outputs a error message and avoids the computer to be halted/rebooted.

Why is the fix for this not to avoid running spi_destroy_queue() in the
first place?  I would not in general expect it to be robust to call the
destructor function if the init function failed (and indeed this looks
like what's happening with the kthread code here) - even if it works now
it seems like it will be a source of bugs in the future.

> -	flush_kthread_worker(&master->kworker);
> -	kthread_stop(master->kworker_task);
> +	if (!IS_ERR(master->kworker_task)) {
> +		flush_kthread_worker(&master->kworker);
> +		kthread_stop(master->kworker_task);
> +	}

This still just looks like a race condition.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  8:15 ` Ricardo Ribalda Delgado
  0 siblings, 0 replies; 15+ messages in thread
From: Ricardo Ribalda Delgado @ 2014-05-01  8:15 UTC (permalink / raw)
  To: linux-spi, broonie, linux-kernel; +Cc: Ricardo Ribalda Delgado

If kthread_run on spi_init_queue() fails, spi_destroy_queue can lead to
hang timeout.

Before 786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
make kthread_create() killable" it was very unlikely that kthread_run
would fail, but after that patch a kill to modprobe can lead to this
kind of error. Udev kills modprobe after 30 seconds.

When this happens, spi_destroy_queue() leads to a hung process that
outputs a error message and avoids the computer to be halted/rebooted.

This bug is particulary critical on machines that relay on the spi
subsystem to hold a file system (MTD over SPI).

[   95.900328] spi_master spi32765: failed to create message pump task
[   95.900358] spi_master spi32765: problem initializing queue
[   95.912333] qtec_pcie b0030000.pcie_bridge: Axi-pcie bridge supervisor
[  240.463698] INFO: task udevd:129 blocked for more than 120 seconds.
[  240.463733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.463807] Call Trace:
[  240.463968]  [<ffffffff8178c22d>] wait_for_completion+0x1d/0x20
[  240.464036]  [<ffffffff8146c637>] spi_destroy_queue+0x27/0x60
[  240.464050]  [<ffffffff8146cc46>] spi_register_master+0x5d6/0x900

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---

v2: Comment by Mark Brown:
-Improve commit message

 drivers/spi/spi.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 4eb9bf0..8331556 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1114,8 +1114,10 @@ static int spi_destroy_queue(struct spi_master *master)
 		return ret;
 	}
 
-	flush_kthread_worker(&master->kworker);
-	kthread_stop(master->kworker_task);
+	if (!IS_ERR(master->kworker_task)) {
+		flush_kthread_worker(&master->kworker);
+		kthread_stop(master->kworker_task);
+	}
 
 	return 0;
 }
-- 
1.9.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH] spi: Fix hung task timeout when initialization fails
@ 2014-05-01  8:15 ` Ricardo Ribalda Delgado
  0 siblings, 0 replies; 15+ messages in thread
From: Ricardo Ribalda Delgado @ 2014-05-01  8:15 UTC (permalink / raw)
  To: linux-spi-u79uwXL29TY76Z2rM5mHXA, broonie-DgEjT+Ai2ygdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Ricardo Ribalda Delgado

If kthread_run on spi_init_queue() fails, spi_destroy_queue can lead to
hang timeout.

Before 786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 kthread:
make kthread_create() killable" it was very unlikely that kthread_run
would fail, but after that patch a kill to modprobe can lead to this
kind of error. Udev kills modprobe after 30 seconds.

When this happens, spi_destroy_queue() leads to a hung process that
outputs a error message and avoids the computer to be halted/rebooted.

This bug is particulary critical on machines that relay on the spi
subsystem to hold a file system (MTD over SPI).

[   95.900328] spi_master spi32765: failed to create message pump task
[   95.900358] spi_master spi32765: problem initializing queue
[   95.912333] qtec_pcie b0030000.pcie_bridge: Axi-pcie bridge supervisor
[  240.463698] INFO: task udevd:129 blocked for more than 120 seconds.
[  240.463733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.463807] Call Trace:
[  240.463968]  [<ffffffff8178c22d>] wait_for_completion+0x1d/0x20
[  240.464036]  [<ffffffff8146c637>] spi_destroy_queue+0x27/0x60
[  240.464050]  [<ffffffff8146cc46>] spi_register_master+0x5d6/0x900

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---

v2: Comment by Mark Brown:
-Improve commit message

 drivers/spi/spi.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 4eb9bf0..8331556 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1114,8 +1114,10 @@ static int spi_destroy_queue(struct spi_master *master)
 		return ret;
 	}
 
-	flush_kthread_worker(&master->kworker);
-	kthread_stop(master->kworker_task);
+	if (!IS_ERR(master->kworker_task)) {
+		flush_kthread_worker(&master->kworker);
+		kthread_stop(master->kworker_task);
+	}
 
 	return 0;
 }
-- 
1.9.2

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-05-01 16:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-30 10:36 [PATCH] spi: Fix hung task timeout when initialization fails Ricardo Ribalda Delgado
2014-05-01  1:34 ` Mark Brown
2014-05-01  1:34   ` Mark Brown
2014-05-01  5:53   ` Ricardo Ribalda Delgado
2014-05-01  5:53     ` Ricardo Ribalda Delgado
2014-05-01 14:11     ` Mark Brown
2014-05-01 14:11       ` Mark Brown
2014-05-01 14:33       ` Geert Uytterhoeven
2014-05-01 14:33         ` Geert Uytterhoeven
2014-05-01 16:03         ` Mark Brown
2014-05-01 16:03           ` Mark Brown
2014-05-01  8:15 Ricardo Ribalda Delgado
2014-05-01  8:15 ` Ricardo Ribalda Delgado
2014-05-01 16:09 ` Mark Brown
2014-05-01 16:09   ` Mark Brown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.