On 04/12/2013 11:24 AM, Mylene Josserand wrote:
>>> I have read 15 383 frames but after 15 834, it hangs. And after that, I
>>> can not act on the CAN anymore. A "ifconfig can0 down" hangs the kernel
>>> (even with ^C) and I have to restart the board to use the CAN again !
>>
>> Ohhh, not good.
>>
>> Activate "MAGIC_SYSRQ" in the kernel via "make menuconfig" (Kernel
>> hacking -> Magic SysRq key) and bring your system to hang again. Then
>> connect via serial line to your embedded system and send a "break". (See
>> the documentation of your terminal program.) After the "break" send a
>> normal "?" to get the help. If I remember correctly, use "break" + "d"
>> to create a stackdump (see
>> http://lxr.linux.no/linux+v3.8.6/Documentation/sysrq.txt for more
>> documentation). You might want to try the magic sys request if your
>> system is still alive to test if your setup is working.
>>
>> With the stack trace you might figure out what the system is doing.
>
>
> Oouuhhhaaa ! Very useful ! I did not know the sysrq and it seems to be
> very powerful ! Thanks ! :D

You're welcome.

> About serious things, the "d" command did not work (it prints the help)
> but the "w" command ["dump tasks that are uninterruptable (blocked)
> state"] shows interesting things :

Maybe it was not 'd', but you've found the interesting information.

> ---- when not blocked (so nothing in there) :
> "
> SysRq : Show Blocked State
> task PC stack pid father
> Sched Debug Version: v0.10, 3.8.2-9-can-modules+ #1
> [...]
> "
>
> ---- when blocked :
> "
> SysRq : Show Blocked State
> task PC stack pid father
> kworker/u:0 D c03d6b20 0 6 2 0x00000000
> [] (__schedule+0x298/0x434) from []
> (schedule_timeout+0x170/0x1c0)
> [] (schedule_timeout+0x170/0x1c0) from []
> (wait_for_common+0x148/0x1a4)
> [] (wait_for_common+0x148/0x1a4) from []
> (spi_imx_transfer+0x70/0x84)
> [] (spi_imx_transfer+0x70/0x84) from []
> (bitbang_work+0x130/0x390)

The SPI driver is waiting for a transfer to finish; it hangs in
wait_for_completion():

http://lxr.linux.no/linux+v3.8.6/drivers/spi/spi-imx.c#L712

The corresponding function (i.e. complete()) is called from the
interrupt handler once a transfer is completed:

http://lxr.linux.no/linux+v3.8.6/drivers/spi/spi-imx.c#L649

> [] (bitbang_work+0x130/0x390) from []
> (process_one_work+0x28c/0x504)
> [] (process_one_work+0x28c/0x504) from []
> (worker_thread+0x1f0/0x648)
> [] (worker_thread+0x1f0/0x648) from []
> (kthread+0xa0/0xac)
> [] (kthread+0xa0/0xac) from [] (ret_from_fork+0x14/0x24)
> irq/201-mcp251x D c03d6b20 0 950 2 0x00000000
> [] (__schedule+0x298/0x434) from []
> (schedule_timeout+0x170/0x1c0)
> [] (schedule_timeout+0x170/0x1c0) from []
> (wait_for_common+0x148/0x1a4)
> [] (wait_for_common+0x148/0x1a4) from []
> (__spi_sync+0x58/0x9c)
> [] (__spi_sync+0x58/0x9c) from []
> (mcp251x_spi_trans+0xa4/0xd0 [mcp251x])
> [] (mcp251x_spi_trans+0xa4/0xd0 [mcp251x]) from []
> (mcp251x_can_ist+0x60/0x348 [mcp251x])
> [] (mcp251x_can_ist+0x60/0x348 [mcp251x]) from []
> (irq_thread_fn+0x1c/0x34)
> [] (irq_thread_fn+0x1c/0x34) from []
> (irq_thread+0xd8/0x148)
> [] (irq_thread+0xd8/0x148) from [] (kthread+0xa0/0xac)
> [] (kthread+0xa0/0xac) from [] (ret_from_fork+0x14/0x24)
> Sched Debug Version: v0.10, 3.8.2-9-can-modules+ #1
> [...]
> "
>
> We can see that the problem is during the spi transfer.
> Do you think it is a hardware problem ?

Maybe a hardware problem, a driver problem, or a problem in the hardware
triggering a bug in the driver.
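
To make the wait_for_completion() / complete() handshake seen in the
trace more concrete, here is a minimal userspace sketch in Python. It is
not kernel code and not how spi-imx.c is written: the Completion class,
the transfer() function and the timings are hypothetical names and
values chosen for illustration. A timer thread plays the role of the
interrupt, and the timeout variant corresponds loosely to the kernel's
wait_for_completion_timeout().

```python
import threading


class Completion:
    """Userspace stand-in for the kernel's struct completion (illustrative only)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        # In the kernel, the SPI interrupt handler would call complete().
        self._done.set()

    def wait(self, timeout=None):
        # timeout=None behaves like wait_for_completion(): it blocks until
        # complete() is called. A numeric timeout behaves roughly like
        # wait_for_completion_timeout() and returns False when it expires.
        return self._done.wait(timeout)


def transfer(interrupt_arrives, timeout=0.2):
    """Simulate one transfer; return True if the completion was signalled."""
    done = Completion()
    if interrupt_arrives:
        # The timer thread stands in for the hardware interrupt.
        threading.Timer(0.01, done.complete).start()
    return done.wait(timeout)


if __name__ == "__main__":
    print("interrupt delivered:", transfer(True))
    print("interrupt lost:     ", transfer(False))
```

With the interrupt delivered, the waiter wakes up; with the interrupt
lost, only the timeout gets it out of the wait. Without a timeout the
second case would block forever, which matches the "D" state of both
tasks in the trace above.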
> What is the "kworker" task ?
> How fix it ?

The "kworker" is a kernel infrastructure task (a workqueue worker
thread) in which parts of the imx SPI driver "live".

Here the scenario is:
- spi_imx_transfer() is waiting for a transfer to finish.
- The end of the transfer is signaled via the complete() <->
  wait_for_completion() pair.
- An interrupt will call complete().
- Did we get an interrupt? Did we miss the interrupt?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |