* [Xenomai] Oops while running "cat /proc/xenomai/stat"
@ 2012-10-08  9:32 Stefan Roese
  2012-10-08 17:39 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Roese @ 2012-10-08  9:32 UTC (permalink / raw)
  To: xenomai

Hi,

I'm currently developing an RTDM driver that communicates with an FPGA
located on the LPB of an MPC5200 PowerPC. The driver already seems to
work quite well. But when I run my test application, which talks to
the device driver, and then try to check the Xenomai stats, I get a
kernel crash:

root@generic-powerpc:~# cat /proc/xenomai/stat 
[   43.722984] Oops: Kernel access of bad area, sig: 11 [#1]
[   43.728503] mpc5200-simple-platform
[   43.732057] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   43.738107] NIP: c00646a8 LR: c0098b84 CTR: c0098b34
[   43.743173] REGS: c7025b40 TRAP: 0300   Tainted: G           O  (3.5.3-00253-g4699145-dirty)
[   43.751776] MSR: 00001032 <ME,IR,DR,RI>  CR: 24424488  XER: 20000000
[   43.758279] DAR: 00000000, DSISR: 22000000
[   43.762454] TASK = c7b75360[1430] 'cat' THREAD: c7024000
GPR00: 00000000 c7025bf0 c7b75360 c7b1ca0c 00000002 02000007 00000000 00000031 
GPR08: c7ab0000 c7b1cc00 00000000 00000000 c0098b34 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 c7b1cc00 00000000 00000002 c033d385 c03264ac 00000025 
GPR24: c0320000 00000001 00000002 c03b3460 00000000 00000000 c7a6ccc4 c7a6ccb4 
[   43.796697] Call Trace:
[   43.799189] [c7025bf0] [c004c540] 0xc004c540 (unreliable)
[   43.804698] [c7025c20] [c0098b84] 0xc0098b84
[   43.809051] [c7025c40] [c90e4588] 0xc90e4588
[   43.813406] [c7025c50] [c90d8114] 0xc90d8114
[   43.817758] [c7025c60] [c005e85c] 0xc005e85c
[   43.822113] [c7025ca0] [c005aef0] 0xc005aef0
[   43.826465] [c7025cc0] [c000d664] 0xc000d664
[   43.830819] [c7025cd0] [c000f710] 0xc000f710
[   43.835175] --- Exception: 501 at 0xc01b2614

This happens reproducibly whenever I run the "cat" command from the shell.

Looking at the NIP (c00646a8), the PPC is currently executing
xnsynch_flush().

Other proc files from the xenomai directory like "irq" don't cause
this crash. And accessing "stat" when this application is not
running also works fine:

root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          0          0     00500080  100.0  ROOT
  0  0      0          0          0     00000000    0.0  IRQ145: rtcan_mscan
  0  0      0          0          0     00000000    0.0  IRQ151: mpc52xx-lpbfifo
  0  0      0          0          0     00000000    0.0  IRQ194: mpc52xx-lpbfifo-rx
  0  0      0          17619      0     00000000    0.0  IRQ512: [timer]

Any idea what might go wrong here? 

Thanks,
Stefan



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-08  9:32 [Xenomai] Oops while running "cat /proc/xenomai/stat" Stefan Roese
@ 2012-10-08 17:39 ` Gilles Chanteperdrix
  2012-10-09  6:48   ` Stefan Roese
  0 siblings, 1 reply; 12+ messages in thread
From: Gilles Chanteperdrix @ 2012-10-08 17:39 UTC (permalink / raw)
  To: Stefan Roese; +Cc: xenomai

On 10/08/2012 11:32 AM, Stefan Roese wrote:

> Any idea what might go wrong here? 


Please enable CONFIG_KALLSYMS so that the backtrace contains readable
function names. Otherwise, without a disassembly of your kernel, we have
no idea which functions the backtrace is referring to.
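
As a rough illustration of what CONFIG_KALLSYMS adds to a trace: each raw
address is printed as symbol+offset/size. The sketch below does the same
arithmetic by hand in user-space C. The two start addresses are the ones
that appear in the objdump listing posted later in this thread; the table,
the struct names and the function names are made up for this example and
are not the kernel's kallsyms implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry of a hypothetical, address-sorted symbol table. */
struct sym {
    uint32_t addr;
    const char *name;
};

/* Start addresses as shown in the objdump listing in this thread. */
static const struct sym symtab[] = {
    { 0xc00668b0u, "xnsynch_flush" },
    { 0xc00669b0u, "xnsynch_sleep_on" },
};

/* Result of a lookup; name is NULL when the address was not resolved. */
struct sym_hit {
    const char *name;
    uint32_t offset;   /* distance from the start of the symbol */
    uint32_t size;     /* distance to the start of the next symbol */
};

static struct sym_hit symbolize(uint32_t addr)
{
    struct sym_hit hit = { NULL, 0, 0 };
    size_t n = sizeof(symtab) / sizeof(symtab[0]);

    /* The last entry only serves as an end marker for the one before it. */
    for (size_t i = 0; i + 1 < n; i++) {
        /* "Next start == this end" only works on a sorted table. */
        assert(symtab[i].addr < symtab[i + 1].addr);
        if (addr >= symtab[i].addr && addr < symtab[i + 1].addr) {
            hit.name = symtab[i].name;
            hit.offset = addr - symtab[i].addr;
            hit.size = symtab[i + 1].addr - symtab[i].addr;
            break;
        }
    }
    return hit;
}

/* Print an address the way a symbolized trace line would show it. */
static void print_sym(uint32_t addr)
{
    struct sym_hit hit = symbolize(addr);

    if (hit.name)
        printf("[%08x] %s+0x%x/0x%x\n", (unsigned)addr, hit.name,
               (unsigned)hit.offset, (unsigned)hit.size);
    else
        printf("[%08x] 0x%x\n", (unsigned)addr, (unsigned)addr);
}
```

With these two entries, symbolize(0xc0066914) resolves to xnsynch_flush
with offset 0x64 and size 0x100, i.e. the "xnsynch_flush+0x64/0x100" form.
Without symbols, the same lookup has to be done manually against
System.map or a disassembly.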




-- 
                                                                Gilles.



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-08 17:39 ` Gilles Chanteperdrix
@ 2012-10-09  6:48   ` Stefan Roese
  2012-10-09  9:47     ` Gilles Chanteperdrix
  2012-10-09 14:24     ` Philippe Gerum
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Roese @ 2012-10-09  6:48 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Hi Gilles,

On 10/08/2012 07:39 PM, Gilles Chanteperdrix wrote:
> On 10/08/2012 11:32 AM, Stefan Roese wrote:
> 
>> Any idea what might go wrong here? 
> 
> 
> Please enable CONFIG_KALLSYMS so that the backtrace contains readable
> function names, otherwise, without a disassembly of your kernel, we have
> no idea what functions the backtrace is referring to.

Yes, sorry, my bad. Here is the new crash log:

root@generic-powerpc:~# cat /proc/xenomai/stat 
[   65.215600] Oops: Kernel access of bad area, sig: 11 [#1]
[   65.221118] mpc5200-simple-platform
[   65.224671] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   65.230718] NIP: c0066914 LR: c009adf0 CTR: c009ada0
[   65.235784] REGS: c716bac0 TRAP: 0300   Tainted: G           O  (3.5.3-00253-g4699145-dirty)
[   65.244386] MSR: 00001032 <ME,IR,DR,RI>  CR: 24000488  XER: 20000000
[   65.250888] DAR: 00000000, DSISR: 22000000
[   65.255064] TASK = c705cba0[1400] 'cat' THREAD: c716a000
GPR00: 00000000 c716bb70 c705cba0 c7b3060c 00000002 02000006 00000000 00000030 
GPR08: c70f0000 c7b30800 00000000 00000000 c009ada0 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 100170f4 00010000 c00733d8 c75e4a60 00000000 c03ee068 
GPR24: 00000002 00000001 00000002 c03ec460 00000000 00000000 c7a6ccc4 c7a6ccb4 
[   65.289337] NIP [c0066914] xnsynch_flush+0x64/0x100
[   65.294328] LR [c009adf0] rtdm_event_signal+0x50/0xe4
[   65.299474] Call Trace:
[   65.301969] [c716bb70] [00000004] 0x4 (unreliable)
[   65.306864] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
[   65.312735] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
[   65.319934] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
[   65.328461] [c716bbe0] [c0060ac8] xnintr_irq_handler+0x70/0x234
[   65.334515] [c716bc20] [c005d15c] __ipipe_dispatch_irq+0x148/0x270
[   65.340832] [c716bc40] [c000d66c] __ipipe_grab_irq+0x38/0x6c
[   65.346615] [c716bc50] [c000f710] __ipipe_ret_from_except+0x0/0xc
[   65.352849] --- Exception: 501 at __ipipe_restore_head+0x84/0xfc
[   65.352849]     LR = __vfile_nklock_put+0x30/0x40
[   65.363775] [c716bd10] [c0069bcc] vfile_stat_next+0x1b0/0x23c (unreliable)
[   65.370795] [c716bd20] [c0072f14] __vfile_nklock_put+0x30/0x40
[   65.376751] [c716bd30] [c00738d0] vfile_snapshot_open+0x29c/0x31c
[   65.382975] [c716bd70] [c011b1c0] proc_reg_open+0x7c/0x11c
[   65.388576] [c716bd90] [c00d1a94] do_dentry_open.isra.18+0x1f4/0x29c
[   65.395064] [c716bdc0] [c00d2ad8] nameidata_to_filp+0x5c/0xa8
[   65.400932] [c716bde0] [c00e11c0] do_last.isra.49+0x260/0x78c
[   65.406796] [c716be30] [c00e17ec] path_openat+0xb4/0x3b0
[   65.412219] [c716be80] [c00e1bdc] do_filp_open+0x30/0x8c
[   65.417641] [c716bf10] [c00d2c20] do_sys_open+0xfc/0x1c8
[   65.423069] [c716bf40] [c000ee0c] ret_from_syscall+0x0/0x38
[   65.428757] --- Exception: c01 at 0xff0f47c
[   65.428757]     LR = 0xffabb94
[   65.436118] Instruction dump:
[   65.439139] 833b0918 63200001 573907fe 901b0918 853e0010 7f89f000 419e00ac 3ba00000 
[   65.447061] 48000044 419a0054 81690004 80090000 <900b0000> 81690000 80090004 900b0004 
[   65.455216] Oops: Kernel access of bad area, sig: 11 [#2]
[   65.460723] mpc5200-simple-platform
[   65.464272] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   65.470319] NIP: 00000000 LR: c0060ac8 CTR: 00000000
[   65.475385] REGS: c716b880 TRAP: 0400   Tainted: G      D    O  (3.5.3-00253-g4699145-dirty)
[   65.483985] MSR: 20001032 <ME,IR,DR,RI>  CR: 24000482  XER: 20000000
[   65.490489] TASK = c705cba0[1400] 'cat' THREAD: c716a000
GPR00: 00000000 c716b930 c705cba0 c7a6ccf0 c7a6ccf0 00000000 c0410c94 00000000 
GPR08: c03ecd78 00000000 00000002 00004000 00000000 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 100170f4 00010000 c00733d8 c75e4a60 00000000 c03ee068 
GPR24: 00000002 c7a6ccf0 00000000 938c6ae2 c041fb40 00000010 c04244b8 c04240b8 
[   65.524729] NIP [00000000]   (null)
[   65.528297] LR [c0060ac8] xnintr_irq_handler+0x70/0x234
[   65.533621] Call Trace:
[   65.536115] [c716b930] [00000001] 0x1 (unreliable)
[   65.541023] [c716b970] [c005d15c] __ipipe_dispatch_irq+0x148/0x270
[   65.547337] [c716b990] [c000d66c] __ipipe_grab_irq+0x38/0x6c
[   65.553118] [c716b9a0] [c000f710] __ipipe_ret_from_except+0x0/0xc
[   65.559348] --- Exception: 501 at ipipe_unstall_root+0x48/0x5c
[   65.559348]     LR = ipipe_unstall_root+0x58/0x5c
[   65.570085] [c716ba70] [c000aa94] die+0x218/0x240
[   65.574892] [c716baa0] [c0010360] bad_page_fault+0xb4/0xfc
[   65.580492] [c716bab0] [c000f2c4] handle_page_fault+0x7c/0x80
[   65.586382] --- Exception: 300 at xnsynch_flush+0x64/0x100
[   65.586382]     LR = rtdm_event_signal+0x50/0xe4
[   65.596674] [c716bb70] [00000004] 0x4 (unreliable)
[   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
[   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
[   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
[   65.623168] [c716bbe0] [c0060ac8] xnintr_irq_handler+0x70/0x234
[   65.629216] [c716bc20] [c005d15c] __ipipe_dispatch_irq+0x148/0x270
[   65.635526] [c716bc40] [c000d66c] __ipipe_grab_irq+0x38/0x6c
[   65.641305] [c716bc50] [c000f710] __ipipe_ret_from_except+0x0/0xc
[   65.647541] --- Exception: 501 at __ipipe_restore_head+0x84/0xfc
[   65.647541]     LR = __vfile_nklock_put+0x30/0x40
[   65.658467] [c716bd10] [c0069bcc] vfile_stat_next+0x1b0/0x23c (unreliable)
[   65.665488] [c716bd20] [c0072f14] __vfile_nklock_put+0x30/0x40
[   65.671443] [c716bd30] [c00738d0] vfile_snapshot_open+0x29c/0x31c
[   65.677667] [c716bd70] [c011b1c0] proc_reg_open+0x7c/0x11c
[   65.683272] [c716bd90] [c00d1a94] do_dentry_open.isra.18+0x1f4/0x29c
[   65.689758] [c716bdc0] [c00d2ad8] nameidata_to_filp+0x5c/0xa8
[   65.695627] [c716bde0] [c00e11c0] do_last.isra.49+0x260/0x78c
[   65.701493] [c716be30] [c00e17ec] path_openat+0xb4/0x3b0
[   65.706916] [c716be80] [c00e1bdc] do_filp_open+0x30/0x8c
[   65.712338] [c716bf10] [c00d2c20] do_sys_open+0xfc/0x1c8
[   65.717764] [c716bf40] [c000ee0c] ret_from_syscall+0x0/0x38
[   65.723452] --- Exception: c01 at 0xff0f47c
[   65.723452]     LR = 0xffabb94
[   65.730816] Instruction dump:
[   65.733838] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[   65.741753] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
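
A note on reading the first oops above: the word in angle brackets in the
"Instruction dump" (<900b0000>) is the instruction at NIP. The sketch
below decodes it by hand as a PowerPC D-form load/store. It is a
user-space helper written for this example, not kernel code; the struct
and function names are made up, and only the D-form layout (6-bit opcode,
two 5-bit register fields, 16-bit signed displacement) is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC D-form load/store instruction. */
struct dform {
    unsigned opcode;  /* primary opcode, top 6 bits (32 = lwz, 36 = stw) */
    unsigned rs;      /* source register for stores, target for loads */
    unsigned ra;      /* base address register */
    int16_t d;        /* signed 16-bit displacement */
};

static struct dform decode_dform(uint32_t insn)
{
    struct dform f;

    f.opcode = insn >> 26;
    f.rs = (insn >> 21) & 0x1f;
    f.ra = (insn >> 16) & 0x1f;
    f.d = (int16_t)(insn & 0xffff);
    return f;
}

/*
 * Effective address of a D-form access: base register plus displacement.
 * In the ISA, RA = 0 means "use literal zero" rather than r0; that case
 * is not modelled in this sketch.
 */
static uint32_t dform_ea(struct dform f, const uint32_t gpr[32])
{
    assert(f.ra != 0);
    return gpr[f.ra] + (uint32_t)(int32_t)f.d;
}
```

Decoding 0x900b0000 gives opcode 36, RS = 0, RA = 11, D = 0, that is
"stw r0,0(r11)". The GPR08 row of the same oops shows r11 = 00000000, so
the store targets address 0, which agrees with DAR: 00000000.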





* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-09  6:48   ` Stefan Roese
@ 2012-10-09  9:47     ` Gilles Chanteperdrix
  2012-10-09 10:18       ` Stefan Roese
  2012-10-09 14:24     ` Philippe Gerum
  1 sibling, 1 reply; 12+ messages in thread
From: Gilles Chanteperdrix @ 2012-10-09  9:47 UTC (permalink / raw)
  To: Stefan Roese; +Cc: xenomai

On 10/09/2012 08:48 AM, Stefan Roese wrote:
> root@generic-powerpc:~# cat /proc/xenomai/stat 
> [   65.215600] Oops: Kernel access of bad area, sig: 11 [#1]
> [   65.221118] mpc5200-simple-platform
> [   65.224671] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
> [   65.230718] NIP: c0066914 LR: c009adf0 CTR: c009ada0
> [   65.235784] REGS: c716bac0 TRAP: 0300   Tainted: G           O  (3.5.3-00253-g4699145-dirty)
> [   65.244386] MSR: 00001032 <ME,IR,DR,RI>  CR: 24000488  XER: 20000000
> [   65.250888] DAR: 00000000, DSISR: 22000000
> [   65.255064] TASK = c705cba0[1400] 'cat' THREAD: c716a000
> GPR00: 00000000 c716bb70 c705cba0 c7b3060c 00000002 02000006 00000000 00000030 
> GPR08: c70f0000 c7b30800 00000000 00000000 c009ada0 100a5a74 10017830 10006834 
> GPR16: 10006770 10006774 100170f4 00010000 c00733d8 c75e4a60 00000000 c03ee068 
> GPR24: 00000002 00000001 00000002 c03ec460 00000000 00000000 c7a6ccc4 c7a6ccb4 
> [   65.289337] NIP [c0066914] xnsynch_flush+0x64/0x100

Could you show us the disassembly of xnsynch_flush in the corresponding
kernel?

-- 
					    Gilles.



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-09  9:47     ` Gilles Chanteperdrix
@ 2012-10-09 10:18       ` Stefan Roese
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Roese @ 2012-10-09 10:18 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

On 10/09/2012 11:47 AM, Gilles Chanteperdrix wrote:
> On 10/09/2012 08:48 AM, Stefan Roese wrote:
>> root@generic-powerpc:~# cat /proc/xenomai/stat 
>> [   65.215600] Oops: Kernel access of bad area, sig: 11 [#1]
>> [   65.221118] mpc5200-simple-platform
>> [   65.224671] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
>> [   65.230718] NIP: c0066914 LR: c009adf0 CTR: c009ada0
>> [   65.235784] REGS: c716bac0 TRAP: 0300   Tainted: G           O  (3.5.3-00253-g4699145-dirty)
>> [   65.244386] MSR: 00001032 <ME,IR,DR,RI>  CR: 24000488  XER: 20000000
>> [   65.250888] DAR: 00000000, DSISR: 22000000
>> [   65.255064] TASK = c705cba0[1400] 'cat' THREAD: c716a000
>> GPR00: 00000000 c716bb70 c705cba0 c7b3060c 00000002 02000006 00000000 00000030 
>> GPR08: c70f0000 c7b30800 00000000 00000000 c009ada0 100a5a74 10017830 10006834 
>> GPR16: 10006770 10006774 100170f4 00010000 c00733d8 c75e4a60 00000000 c03ee068 
>> GPR24: 00000002 00000001 00000002 c03ec460 00000000 00000000 c7a6ccc4 c7a6ccb4 
>> [   65.289337] NIP [c0066914] xnsynch_flush+0x64/0x100
> 
> Could you show us the disassembly of xnsynch_flush in the corresponding
> kernel?

Sure. Here you go: Kernel 3.5.3 with ipipe (core-3.5 branch) from
git.denx.de:

c00668b0 <xnsynch_flush>:
 *
 * Rescheduling: never.
 */

int xnsynch_flush(struct xnsynch *synch, xnflags_t reason)
{
c00668b0:       94 21 ff d0     stwu    r1,-48(r1)
c00668b4:       7c 08 02 a6     mflr    r0
c00668b8:       bf 21 00 14     stmw    r25,20(r1)
c00668bc:       7c 7f 1b 78     mr      r31,r3
c00668c0:       7c 9c 23 78     mr      r28,r4
c00668c4:       90 01 00 34     stw     r0,52(r1)
static inline void hard_local_irq_disable_notrace(void)
{
#ifdef CONFIG_BOOKE
        __asm__ __volatile__("wrteei 0": : :"memory");
#else
        unsigned long msr = mfmsr();
c00668c8:       7c 00 00 a6     mfmsr   r0
        mtmsr(msr & ~MSR_EE);
c00668cc:       54 00 04 5e     rlwinm  r0,r0,0,17,15
c00668d0:       7c 00 01 24     mtmsr   r0
 */
static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
{
        unsigned long mask = BIT_MASK(nr);
        unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
        unsigned long old = *p;
c00668d4:       3f 60 c0 3f     lis     r27,-16321
        return qslot->elems;
}

static inline int emptyq_p(xnqueue_t *qslot)
{
        return qslot->head.next == &qslot->head;
c00668d8:       7c 7e 1b 78     mr      r30,r3
c00668dc:       3b 7b c4 60     addi    r27,r27,-15264
        xnlock_get_irqsave(&nklock, s);

        trace_mark(xn_nucleus, synch_flush, "synch %p reason %lu",
                   synch, reason);

        status = emptypq_p(&synch->pendq) ? XNSYNCH_DONE : XNSYNCH_RESCHED;
c00668e0:       3b 40 00 02     li      r26,2
c00668e4:       83 3b 09 18     lwz     r25,2328(r27)

        *p = old | mask;
c00668e8:       63 20 00 01     ori     r0,r25,1
        return (old & mask) != 0;
c00668ec:       57 39 07 fe     clrlwi  r25,r25,31
{
        unsigned long mask = BIT_MASK(nr);
        unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
        unsigned long old = *p;

        *p = old | mask;
c00668f0:       90 1b 09 18     stw     r0,2328(r27)
c00668f4:       85 3e 00 10     lwzu    r9,16(r30)
c00668f8:       7f 89 f0 00     cmpw    cr7,r9,r30
c00668fc:       41 9e 00 ac     beq-    cr7,c00669a8 <xnsynch_flush+0xf8>

        while ((holder = getpq(&synch->pendq)) != NULL) {
                struct xnthread *sleeper = link2thread(holder, plink);
                xnthread_set_info(sleeper, reason);
                sleeper->wchan = NULL;
c0066900:       3b a0 00 00     li      r29,0
c0066904:       48 00 00 44     b       c0066948 <xnsynch_flush+0x98>
}

static inline xnholder_t *getq(xnqueue_t *qslot)
{
        xnholder_t *holder = getheadq(qslot);
        if (holder)
c0066908:       41 9a 00 54     beq-    cr6,c006695c <xnsynch_flush+0xac>
        head->next = holder;
}

static inline void dth(xnholder_t *holder)
{
        holder->last->next = holder->next;
c006690c:       81 69 00 04     lwz     r11,4(r9)
c0066910:       80 09 00 00     lwz     r0,0(r9)
c0066914:       90 0b 00 00     stw     r0,0(r11)
        holder->next->last = holder->last;
c0066918:       81 69 00 00     lwz     r11,0(r9)
c006691c:       80 09 00 04     lwz     r0,4(r9)
c0066920:       90 0b 00 04     stw     r0,4(r11)
}

static inline void removeq(xnqueue_t *qslot, xnholder_t *holder)
{
        dth(holder);
        --qslot->elems;
c0066924:       81 7f 00 18     lwz     r11,24(r31)
c0066928:       38 0b ff ff     addi    r0,r11,-1
c006692c:       90 1f 00 18     stw     r0,24(r31)
c0066930:       93 a9 00 20     stw     r29,32(r9)

        status = emptypq_p(&synch->pendq) ? XNSYNCH_DONE : XNSYNCH_RESCHED;

        while ((holder = getpq(&synch->pendq)) != NULL) {
                struct xnthread *sleeper = link2thread(holder, plink);
                xnthread_set_info(sleeper, reason);
c0066934:       80 09 ff d0     lwz     r0,-48(r9)
c0066938:       7c 00 e3 78     or      r0,r0,r28
c006693c:       90 09 ff d0     stw     r0,-48(r9)
                sleeper->wchan = NULL;
                xnpod_resume_thread(sleeper, XNPEND);
c0066940:       4b ff b2 39     bl      c0061b78 <xnpod_resume_thread>
c0066944:       81 3f 00 10     lwz     r9,16(r31)
#endif /* XENO_DEBUG(QUEUES) */

static inline xnholder_t *getheadq(xnqueue_t *qslot)
{
        xnholder_t *holder = qslot->head.next;
        return holder == &qslot->head ? NULL : holder;
c0066948:       7f 9e 48 00     cmpw    cr7,r30,r9
}

static inline xnholder_t *getq(xnqueue_t *qslot)
{
        xnholder_t *holder = getheadq(qslot);
        if (holder)
c006694c:       2f 09 00 00     cmpwi   cr6,r9,0
c0066950:       38 69 fe 0c     addi    r3,r9,-500
c0066954:       38 80 00 02     li      r4,2
#endif /* XENO_DEBUG(QUEUES) */

static inline xnholder_t *getheadq(xnqueue_t *qslot)
{
        xnholder_t *holder = qslot->head.next;
        return holder == &qslot->head ? NULL : holder;
c0066958:       40 9e ff b0     bne+    cr7,c0066908 <xnsynch_flush+0x58>
        }

        if (testbits(synch->status, XNSYNCH_CLAIMED)) {
c006695c:       80 1f 00 0c     lwz     r0,12(r31)
c0066960:       70 09 00 10     andi.   r9,r0,16
c0066964:       41 82 00 14     beq-    c0066978 <xnsynch_flush+0xc8>
                xnsynch_clear_boost(synch, synch->owner);
c0066968:       80 9f 00 1c     lwz     r4,28(r31)
c006696c:       7f e3 fb 78     mr      r3,r31
                status = XNSYNCH_RESCHED;
c0066970:       3b 40 00 02     li      r26,2
                sleeper->wchan = NULL;
                xnpod_resume_thread(sleeper, XNPEND);
        }

        if (testbits(synch->status, XNSYNCH_CLAIMED)) {
                xnsynch_clear_boost(synch, synch->owner);
c0066974:       4b ff f7 05     bl      c0066078 <xnsynch_clear_boost>
 * @nr: bit number to test
 * @addr: Address to start counting from
 */
static inline int test_bit(int nr, const volatile unsigned long *addr)
{
        return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
c0066978:       80 1b 09 18     lwz     r0,2328(r27)
c006697c:       54 00 07 fe     clrlwi  r0,r0,31
void __ipipe_restore_head(unsigned long x);

static inline void ipipe_restore_head(unsigned long x)
{
        ipipe_check_irqoff();
        if ((x ^ test_bit(IPIPE_STALL_FLAG, &__ipipe_head_status)) & 1)
c0066980:       7f 80 c8 00     cmpw    cr7,r0,r25
c0066984:       41 be 00 0c     beq+    cr7,c0066990 <xnsynch_flush+0xe0>
                __ipipe_restore_head(x);
c0066988:       7f 23 cb 78     mr      r3,r25
c006698c:       4b ff 64 ad     bl      c005ce38 <__ipipe_restore_head>
        xnlock_put_irqrestore(&nklock, s);

        xnarch_post_graph_if(synch, 0, emptypq_p(&synch->pendq));

        return status;
}
c0066990:       80 01 00 34     lwz     r0,52(r1)
c0066994:       7f 43 d3 78     mr      r3,r26
c0066998:       bb 21 00 14     lmw     r25,20(r1)
c006699c:       38 21 00 30     addi    r1,r1,48
c00669a0:       7c 08 03 a6     mtlr    r0
c00669a4:       4e 80 00 20     blr
        xnlock_get_irqsave(&nklock, s);

        trace_mark(xn_nucleus, synch_flush, "synch %p reason %lu",
                   synch, reason);

        status = emptypq_p(&synch->pendq) ? XNSYNCH_DONE : XNSYNCH_RESCHED;
c00669a8:       3b 40 00 00     li      r26,0
c00669ac:       4b ff ff 54     b       c0066900 <xnsynch_flush+0x50>

c00669b0 <xnsynch_sleep_on>:
 * xnpod_init_thread), or nanoseconds otherwise.
 */

xnflags_t xnsynch_sleep_on(struct xnsynch *synch, xnticks_t timeout,
                           xntmode_t timeout_mode)
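
For reference, the faulting address c0066914 in the listing above is the
store in "holder->last->next = holder->next" inside dth(), with
holder->last loaded into r11 from offset 4 of the holder. The oops shows
r11 = 0, so the holder being unlinked had a NULL "last" pointer. The
user-space sketch below models that situation. The struct layout follows
the offsets visible in the listing, but the type and function names are
made up for the example, the checked variant is not Xenomai's code, and
zeroing the links by hand is only one assumed way such a state could come
about.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked holder: next at offset 0, last right after it. */
struct holder {
    struct holder *next;
    struct holder *last;
};

/* An empty queue is a head that points to itself. */
static void initq(struct holder *head)
{
    head->next = head;
    head->last = head;
}

/* Link h at the tail of the queue anchored at head. */
static void appendq(struct holder *head, struct holder *h)
{
    h->last = head->last;
    h->next = head;
    head->last->next = h;
    head->last = h;
}

/*
 * Unlink h, but refuse to touch a holder whose links are NULL.
 * Returns 0 on success, -1 if the holder is not properly linked.
 * An unchecked dth() would store through the NULL pointer instead.
 */
static int dth_checked(struct holder *h)
{
    assert(h != NULL);
    if (h->next == NULL || h->last == NULL)
        return -1;
    h->last->next = h->next;
    h->next->last = h->last;
    return 0;
}
```

In this model, zeroing a holder while the queue head still points to it
(for example because the owning object was re-initialized) leaves exactly
the state in which the unchecked unlink would write to address 0.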





* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-09  6:48   ` Stefan Roese
  2012-10-09  9:47     ` Gilles Chanteperdrix
@ 2012-10-09 14:24     ` Philippe Gerum
  2012-10-09 15:44       ` Stefan Roese
  1 sibling, 1 reply; 12+ messages in thread
From: Philippe Gerum @ 2012-10-09 14:24 UTC (permalink / raw)
  To: Stefan Roese; +Cc: xenomai

On 10/09/2012 08:48 AM, Stefan Roese wrote:

> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]

Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
enabled?

-- 
Philippe.



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-09 14:24     ` Philippe Gerum
@ 2012-10-09 15:44       ` Stefan Roese
  2012-10-11 11:56         ` Stefan Roese
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Roese @ 2012-10-09 15:44 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 10/09/2012 04:24 PM, Philippe Gerum wrote:
> On 10/09/2012 08:48 AM, Stefan Roese wrote:
> 
>> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
>> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
>> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
> 
> Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
> enabled?

No, I don't see any list corruption with CONFIG_XENO_OPT_DEBUG_QUEUES
enabled. But the crash log is different now: at least part of the
"stat" output is printed, and the crash now happens in
xnintr_irq_handler(). This different crash seems to result from a
change in the kernel config (I enabled/disabled some other drivers as
well in the meantime). I'm now debugging how different kernel
configurations result in different crash scenarios.

Thanks for your help so far.

Cheers,
Stefan





* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-09 15:44       ` Stefan Roese
@ 2012-10-11 11:56         ` Stefan Roese
  2012-10-11 12:40           ` Philippe Gerum
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Roese @ 2012-10-11 11:56 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 10/09/2012 05:44 PM, Stefan Roese wrote:
> On 10/09/2012 04:24 PM, Philippe Gerum wrote:
>> On 10/09/2012 08:48 AM, Stefan Roese wrote:
>>
>>> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
>>> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
>>> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
>>
>> Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
>> enabled?
> 
> No, I don't see any list corruptions with CONFIG_XENO_OPT_DEBUG_QUEUES
> enabled. But the crash log is different now. At least part of the
> "stat" is printed. And the crash happens now in xnintr_irq_handler().
> This different crash seems to result from a change of the kernel
> config (I enabled/disabled some other drivers as well in the meantime).
> I'm debugging now, how different kernel configurations result in different
> crash scenarios.

I have now stripped my device driver down to the absolute minimum.
"cat /proc/xenomai/stat" still crashes, but not every time and not
always with the same output. What is very strange is the "0x100100" in
the output below, which shows up in many of the crash reports. Does this
ring a bell?

(latest git xenomai-2.6 with latest core-3.5 ipipe on mpc5200)

root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          1458       0     00500080   99.2  ROOT
  0  1401   1          1458       0     00300186    0.1  fpga-loop
  0  0      0          1456       0     00000000    0.6  IRQ16: rt_fpga
  0  0      0          0          0     00000000    0.0  IRQ151: mpc52xx-lpbfifo
  0  0      0          0          0     00000000    0.0  IRQ194: mpc52xx-lpbfifo-rx
  0  0      0          10306      0     00000000    0.1  IRQ512: [timer]
root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          1546       0     00500080   95.9  ROOT
  0  1401   1          1546       0     00300186    0.5  fpga-loop
  0  0      0          1544       0     00000000    3.4  IRQ16: rt_fpga
  0  0      0          0          0     00000000    0.0  IRQ151: mpc52xx-lpbfifo
  0  0      0          0          0     00000000    0.0  IRQ194: mpc52xx-lpbfifo-rx
  0  0      0          10542      0     00000000    0.2  IRQ512: [timer]
root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          1655       0     00500080   95.9  ROOT
  0  1401   1          1655       0     00300186    0.5  fpga-loop
  0  0      0          1653       0     00000000    3.4  IRQ16: rt_fpga
  0  0      0          0          0     00000000    0.0  IRQ151: mpc52xx-lpbfifo
  0  0      0          0          0     00000000    0.0  IRQ194: mpc52xx-lpbfifo-rx
  0  0      0          10833      0     00000000    0.2  IRQ512: [timer]
root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
[   45.161337] Unable to handle kernel paging request for data at address 0x7ffffff7
[   45.174889] Faulting instruction address: 0xc015ea7c
[   45.180048] Oops: Kernel access of bad area, sig: 11 [#1]
[   45.186022] mpc5200-simple-platform
[   45.189595] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   45.196088] NIP: c015ea7c LR: c00b42a0 CTR: c00b428c
[   45.201171] REGS: c709ddb0 TRAP: 0300   Tainted: G           O  (3.5.3-00254-g0a88116-dirty)
[   45.210190] MSR: 00009032 <EE,ME,IR,DR,RI>  CR: 48004424  XER: 00000000
[   45.217416] DAR: 7ffffff7, DSISR: 20000000
[   45.221608] TASK = c70c1740[1409] 'cat' THREAD: c709c000
GPR00: ffffffbf c709de60 c70c1740 aa883045 aa883045 c7b1f408 00000001 00000000 
GPR08: 000048dd 00008000 c734cec0 00000001 28004424 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 100170f4 00000000 bfc331e0 100a1008 00000000 1007d050 
GPR24: bfc65f29 00000027 c7b1f408 00000000 c00b4318 7ffffff7 aa883045 c00b42a0 
[   45.257605] NIP [c015ea7c] kfree+0x40/0x150
[   45.261896] LR [c00b42a0] vfile_snapshot_free+0x14/0x24
[   45.267628] Call Trace:
[   45.270153] [c709de60] [00100100] 0x100100 (unreliable)
[   45.275506] [c709de90] [c00b42a0] vfile_snapshot_free+0x14/0x24
[   45.281965] [c709dea0] [c00b4378] vfile_snapshot_release+0x60/0x88
[   45.288294] [c709dec0] [c01aca10] proc_reg_release+0xd4/0x170
[   45.294591] [c709def0] [c0166548] fput+0xbc/0x238
[   45.299835] [c709df10] [c0162e10] filp_close+0x78/0xa4
[   45.305092] [c709df30] [c0162ed8] sys_close+0x9c/0xd8
[   45.310689] [c709df40] [c000ee0c] ret_from_syscall+0x0/0x38
[   45.316392] --- Exception: c01 at 0xff0fb20
[   45.316392]     LR = 0x1004dc44
[   45.324149] Instruction dump:
[   45.327195] 7fe802a6 7c7e1b78 90010034 409d00a8 3d20c04a 3fa34000 81294254 57bdc9f4 
[   45.335552] 7c09e82e 7fa9ea14 70098000 4082010c <801d0000> 700b0080 418200e0 3f60c048 
[   45.344089] ---[ end trace a562bf2537c91283 ]---
[   45.348807] 
Segmentation fault

Thanks,
Stefan





* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-11 11:56         ` Stefan Roese
@ 2012-10-11 12:40           ` Philippe Gerum
  2012-10-11 12:42             ` Philippe Gerum
  2012-10-11 12:53             ` Stefan Roese
  0 siblings, 2 replies; 12+ messages in thread
From: Philippe Gerum @ 2012-10-11 12:40 UTC (permalink / raw)
  To: Stefan Roese; +Cc: xenomai

On 10/11/2012 01:56 PM, Stefan Roese wrote:
> On 10/09/2012 05:44 PM, Stefan Roese wrote:
>> On 10/09/2012 04:24 PM, Philippe Gerum wrote:
>>> On 10/09/2012 08:48 AM, Stefan Roese wrote:
>>>
>>>> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
>>>> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
>>>> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
>>>
>>> Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
>>> enabled?
>>
>> No, I don't see any list corruptions with CONFIG_XENO_OPT_DEBUG_QUEUES
>> enabled. But the crash log is different now. At least part of the
>> "stat" is printed. And the crash happens now in xnintr_irq_handler().
>> This different crash seems to result from a change of the kernel
>> config (I enabled/disabled some other drivers as well in the meantime).
>> I'm debugging now, how different kernel configurations result in different
>> crash scenarios.
> 
> I now stripped down my device driver to the absolute minimum.
> "cat /proc/xenomai/stat" still does crash. But not all the time, and not
> always with the same output. Very strange is the "0x100100" in the
> output below. This is included in many of the crash reports. Does this
> ring a bell?

#define LIST_POISON1  ((void *) 0x00100100 + POISON_POINTER_DELTA)

Somebody might be doing bad things with memory it does not own anymore?

-- 
Philippe.



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-11 12:40           ` Philippe Gerum
@ 2012-10-11 12:42             ` Philippe Gerum
  2012-10-11 13:07               ` Stefan Roese
  2012-10-11 12:53             ` Stefan Roese
  1 sibling, 1 reply; 12+ messages in thread
From: Philippe Gerum @ 2012-10-11 12:42 UTC (permalink / raw)
  To: Stefan Roese; +Cc: xenomai

On 10/11/2012 02:40 PM, Philippe Gerum wrote:
> On 10/11/2012 01:56 PM, Stefan Roese wrote:
>> On 10/09/2012 05:44 PM, Stefan Roese wrote:
>>> On 10/09/2012 04:24 PM, Philippe Gerum wrote:
>>>> On 10/09/2012 08:48 AM, Stefan Roese wrote:
>>>>
>>>>> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
>>>>> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
>>>>> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
>>>>
>>>> Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
>>>> enabled?
>>>
>>> No, I don't see any list corruptions with CONFIG_XENO_OPT_DEBUG_QUEUES
>>> enabled. But the crash log is different now. At least part of the
>>> "stat" is printed. And the crash happens now in xnintr_irq_handler().
>>> This different crash seems to result from a change of the kernel
>>> config (I enabled/disabled some other drivers as well in the meantime).
>>> I'm debugging now, how different kernel configurations result in different
>>> crash scenarios.
>>
>> I now stripped down my device driver to the absolute minimum.
>> "cat /proc/xenomai/stat" still does crash. But not all the time, and not
>> always with the same output. Very strange is the "0x100100" in the
>> output below. This is included in many of the crash reports. Does this
>> ring a bell?
> 
> #define LIST_POISON1  ((void *) 0x00100100 + POISON_POINTER_DELTA)
> 
> Somebody might be doing bad things with memory it does not own anymore?
> 

Just to rule this out, could you try raising the kernel thread stack
size this way?

diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h
index 53fd59d..e8dd57c 100644
--- a/include/asm-powerpc/system.h
+++ b/include/asm-powerpc/system.h
@@ -31,7 +31,7 @@
 #ifdef CONFIG_PPC64
 #define XNARCH_THREAD_STACKSZ   8182
 #else
-#define XNARCH_THREAD_STACKSZ   4096
+#define XNARCH_THREAD_STACKSZ   8192
 #endif

 #define xnarch_stack_size(tcb)  ((tcb)->stacksize)

-- 
Philippe.



* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-11 12:40           ` Philippe Gerum
  2012-10-11 12:42             ` Philippe Gerum
@ 2012-10-11 12:53             ` Stefan Roese
  1 sibling, 0 replies; 12+ messages in thread
From: Stefan Roese @ 2012-10-11 12:53 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 10/11/2012 02:40 PM, Philippe Gerum wrote:
> On 10/11/2012 01:56 PM, Stefan Roese wrote:
>> On 10/09/2012 05:44 PM, Stefan Roese wrote:
>>> On 10/09/2012 04:24 PM, Philippe Gerum wrote:
>>>> On 10/09/2012 08:48 AM, Stefan Roese wrote:
>>>>
>>>>> [   65.601569] [c716bba0] [c009adf0] rtdm_event_signal+0x50/0xe4
>>>>> [   65.607440] [c716bbc0] [cb132588] fpga_dma_done_callback+0x18/0x28 [rt_fpga]
>>>>> [   65.614641] [c716bbd0] [cb101114] mpc52xx_lpbfifo_bcom_irq+0x114/0x1c4 [rt_mpc52xx_lpbfifo]
>>>>
>>>> Is any list corruption detected when CONFIG_XENO_OPT_DEBUG_QUEUES is
>>>> enabled?
>>>
>>> No, I don't see any list corruptions with CONFIG_XENO_OPT_DEBUG_QUEUES
>>> enabled. But the crash log is different now. At least part of the
>>> "stat" is printed. And the crash happens now in xnintr_irq_handler().
>>> This different crash seems to result from a change of the kernel
>>> config (I enabled/disabled some other drivers as well in the meantime).
>>> I'm debugging now, how different kernel configurations result in different
>>> crash scenarios.
>>
>> I now stripped down my device driver to the absolute minimum.
>> "cat /proc/xenomai/stat" still does crash. But not all the time, and not
>> always with the same output. Very strange is the "0x100100" in the
>> output below. This is included in many of the crash reports. Does this
>> ring a bell?
> 
> #define LIST_POISON1  ((void *) 0x00100100 + POISON_POINTER_DELTA)
> 
> Somebody might be doing bad things with memory it does not own anymore?

Yep, that's it. Changing it to some other value makes the new value show up
in the crash dump instead. I'm not doing anything with lists in my code
though (AFAIR).

I'll check the increased stack next and let you know.

Thanks,
Stefan





* Re: [Xenomai] Oops while running "cat /proc/xenomai/stat"
  2012-10-11 12:42             ` Philippe Gerum
@ 2012-10-11 13:07               ` Stefan Roese
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Roese @ 2012-10-11 13:07 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 10/11/2012 02:42 PM, Philippe Gerum wrote:
>>> I now stripped down my device driver to the absolute minimum.
>>> "cat /proc/xenomai/stat" still does crash. But not all the time, and not
>>> always with the same output. Very strange is the "0x100100" in the
>>> output below. This is included in many of the crash reports. Does this
>>> ring a bell?
>>
>> #define LIST_POISON1  ((void *) 0x00100100 + POISON_POINTER_DELTA)
>>
>> Somebody might be doing bad things with memory it does not own anymore?
>>
> 
> Just to rule this out, could you try raising the kernel thread stack
> size this way?
> 
> diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h
> index 53fd59d..e8dd57c 100644
> --- a/include/asm-powerpc/system.h
> +++ b/include/asm-powerpc/system.h
> @@ -31,7 +31,7 @@
>  #ifdef CONFIG_PPC64
>  #define XNARCH_THREAD_STACKSZ   8182
>  #else
> -#define XNARCH_THREAD_STACKSZ   4096
> +#define XNARCH_THREAD_STACKSZ   8192
>  #endif
> 
>  #define xnarch_stack_size(tcb)  ((tcb)->stacksize)

No, it doesn't fix the problem. But now the 0x100100 no longer shows up in
the crash dump: three crashes in a row without it. Here is one log:

root@generic-powerpc:~# cat /proc/xenomai/stat 
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          1295       0     00500080   95.8  ROOT
  0  1402   1          1295       0     00300186    0.6  fpga-loop
  0  0      0          1293       0     00000000    3.4  IRQ16: rt_fpga
  0  0      0          0          0     00000000    0.0  IRQ151: mpc52xx-lpbfifo
  0  0      0          0          0     00000000    0.0  IRQ194: mpc52xx-lpbfifo-rx
  0  0      0          12684      0     00000000    0.2  IRQ512: [timer]
root@generic-powerpc:~# cat /proc/xenomai/stat 
[   51.448640] Unable to handle kernel paging request for data at address 0x81d11f92
[   51.456435] Faulting instruction address: 0xc008ae48
[   51.462066] Oops: Kernel access of bad area, sig: 11 [#1]
[   51.467600] mpc5200-simple-platform
[   51.471543] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   51.477621] NIP: c008ae48 LR: c00b37ac CTR: c008ae2c
[   51.483104] REGS: c7081d90 TRAP: 0300   Tainted: G           O  (3.5.3-00254-g0a88116-dirty)
[   51.492161] MSR: 00009032 <EE,ME,IR,DR,RI>  CR: 88422488  XER: 00000000
[   51.498958] DAR: 81d11f92, DSISR: 20000000
[   51.503550] TASK = c724f360[1412] 'cat' THREAD: c7080000
GPR00: c00b37ac c7081e40 c724f360 c7a7cb40 81d11f52 c7081e98 00000000 00000001 
GPR08: 00000000 c0484e38 c0484dd0 00000000 81d11f52 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 81d11f52 00000000 fffff000 0000003e fffff000 00000000 
GPR24: 00000000 c726ada8 00000000 c7afae00 bfb66808 00001000 c7081f20 c7a7cb40 
[   51.539159] NIP [c008ae48] vfile_stat_show+0x1c/0x1a8
[   51.544728] LR [c00b37ac] vfile_snapshot_show+0x2c/0x60
[   51.550070] Call Trace:
[   51.552576] [c7081e40] [c008afb8] vfile_stat_show+0x18c/0x1a8 (unreliable)
[   51.560020] [c7081e80] [c00b37ac] vfile_snapshot_show+0x2c/0x60
[   51.566520] [c7081e90] [c01838e0] seq_read+0x3b0/0x5a0
[   51.571784] [c7081ee0] [c01accb8] proc_reg_read+0x4c/0x70
[   51.577718] [c7081ef0] [c0165148] vfs_read+0xa8/0x184
[   51.582888] [c7081f10] [c0165270] sys_read+0x4c/0x8c
[   51.588383] [c7081f40] [c000ee0c] ret_from_syscall+0x0/0x38
[   51.594086] --- Exception: c01 at 0xff0fb94
[   51.594086]     LR = 0x100064f4
[   51.601934] Instruction dump:
[   51.604989] 807f05b8 7c0803a6 bbc10008 38210010 4e800020 7c8c2379 9421ffc0 7c0802a6 
[   51.613342] 93e1003c 7c7f1b78 90010044 4182011c <812c0040> 39600000 808c0044 7d202379 
[   51.621877] ---[ end trace b1a0c2afe6b0e729 ]---
[   51.631010] ------------[ cut here ]------------
[   51.635846] kernel BUG at mm/slub.c:3474!
[   51.640470] Oops: Exception in kernel mode, sig: 5 [#2]
[   51.645817] mpc5200-simple-platform
[   51.649376] Modules linked in: rt_fpga(O) rt_mpc52xx_lpbfifo(O)
[   51.655441] NIP: c015eb74 LR: c00b42a0 CTR: c00b428c
[   51.660517] REGS: c7081b50 TRAP: 0700   Tainted: G      D    O  (3.5.3-00254-g0a88116-dirty)
[   51.669134] MSR: 00029032 <EE,ME,IR,DR,RI>  CR: 24424424  XER: 00000000
[   51.675921] TASK = c724f360[1412] 'cat' THREAD: c7080000
GPR00: 00000001 c7081c00 c724f360 822c6d2e 822c6d2e c7afae08 00000001 00000000 
GPR08: 00004962 00000000 c700fee0 00000000 24424424 100a5a74 10017830 10006834 
GPR16: 10006770 10006774 81d11f52 00000000 fffff000 0000003e fffff000 00000000 
GPR24: 00000000 c726ada8 c7afae08 00000000 c00b4318 c1d2b8c0 822c6d2e c00b42a0 
[   51.710239] NIP [c015eb74] kfree+0x138/0x150
[   51.714613] LR [c00b42a0] vfile_snapshot_free+0x14/0x24
[   51.719944] Call Trace:
[   51.722443] [c7081c00] [c70b9040] 0xc70b9040 (unreliable)
[   51.727966] [c7081c30] [c00b42a0] vfile_snapshot_free+0x14/0x24
[   51.734018] [c7081c40] [c00b4378] vfile_snapshot_release+0x60/0x88
[   51.740341] [c7081c60] [c01aca10] proc_reg_release+0xd4/0x170
[   51.746224] [c7081c90] [c0166548] fput+0xbc/0x238
[   51.751034] [c7081cb0] [c0162e10] filp_close+0x78/0xa4
[   51.756290] [c7081cd0] [c001e3c0] put_files_struct+0xdc/0xf8
[   51.762075] [c7081cf0] [c001e508] do_exit+0x100/0x6c0
[   51.767240] [c7081d40] [c000aa14] die+0x198/0x240
[   51.772063] [c7081d70] [c0010360] bad_page_fault+0xb4/0xfc
[   51.777674] [c7081d80] [c000f2c4] handle_page_fault+0x7c/0x80
[   51.783574] --- Exception: 300 at vfile_stat_show+0x1c/0x1a8
[   51.783574]     LR = vfile_snapshot_show+0x2c/0x60
[   51.794245] [c7081e40] [c008afb8] vfile_stat_show+0x18c/0x1a8 (unreliable)
[   51.801276] [c7081e80] [c00b37ac] vfile_snapshot_show+0x2c/0x60
[   51.807336] [c7081e90] [c01838e0] seq_read+0x3b0/0x5a0
[   51.812592] [c7081ee0] [c01accb8] proc_reg_read+0x4c/0x70
[   51.818116] [c7081ef0] [c0165148] vfs_read+0xa8/0x184
[   51.823281] [c7081f10] [c0165270] sys_read+0x4c/0x8c
[   51.828360] [c7081f40] [c000ee0c] ret_from_syscall+0x0/0x38
[   51.834057] --- Exception: c01 at 0xff0fb94
[   51.834057]     LR = 0x100064f4
[   51.841519] Instruction dump:
[   51.844547] 7f83e378 7fa4eb78 7fc5f378 7fe6fb78 bb210014 7c0803a6 38210030 4824bc70 
[   51.852483] 801d0000 7009c000 7c000026 54001ffe <0f000000> 7fa3eb78 4bfdb9e5 4bffff80 
[   51.861125] ---[ end trace b1a0c2afe6b0e72a ]---
[   51.866292] 
[   51.868206] Fixing recursive fault but reboot is needed!

Any further ideas?

Thanks,
Stefan






end of thread, other threads:[~2012-10-11 13:07 UTC | newest]

Thread overview: 12+ messages
2012-10-08  9:32 [Xenomai] Oops while running "cat /proc/xenomai/stat" Stefan Roese
2012-10-08 17:39 ` Gilles Chanteperdrix
2012-10-09  6:48   ` Stefan Roese
2012-10-09  9:47     ` Gilles Chanteperdrix
2012-10-09 10:18       ` Stefan Roese
2012-10-09 14:24     ` Philippe Gerum
2012-10-09 15:44       ` Stefan Roese
2012-10-11 11:56         ` Stefan Roese
2012-10-11 12:40           ` Philippe Gerum
2012-10-11 12:42             ` Philippe Gerum
2012-10-11 13:07               ` Stefan Roese
2012-10-11 12:53             ` Stefan Roese
