* PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
@ 2011-06-30  4:09 Qin Dehua
  2011-06-30  7:43 ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Qin Dehua @ 2011-06-30  4:09 UTC
  To: rmk+kernel, linux-kernel; +Cc: santosh.shilimkar, dan.j.williams, neilb

The 2.6.38.8 kernel makes our IOP 341 XScale processor based RAID6 box crash.

After bisecting, we found that commit
2ffe2da3e71652d4f4cae19539b5c78c2a239136 causes the problem.

That commit is only for ARMv6 and ARMv7 CPUs, so we reverted it on the
2.6.38.8 kernel, and our RAID box then runs OK.

The following are some kernel messages from when the system crashes:

* The kernel config has CONFIG_ASYNC_PQ=y CONFIG_RAID6_PQ=y

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1]
last sysfs file: /sys/block/md1/md/metadata_version
Modules linked in: e1000 mon
CPU: 0 Not tainted (2.6.32 #140)
PC is at raid5d+0x580/0x58c
LR is at raid5d+0x46c/0x58c
pc : [<80241f20>] lr : [<80241e0c>] psr: 20000093
sp : f0fbdf40 ip : f1129a4c fp : 00000000
r10: 00000000 r9 : 00000000 r8 : 00000000
r7 : f114b2c8 r6 : f0fbc000 r5 : 7fffffff r4 : f12364a0
r3 : 00000000 r2 : 60000093 r1 : 60000093 r0 : f1129a54
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 0400397f Table: 6d1b8018 DAC: 00000035
Process md1_raid6 (pid: 1339, stack limit = 0xf0fbc278)
Stack: (0xf0fbdf40 to 0xf0fbe000)
df40: 803d4780 7fffffff 7fffffff f0fbc000 f114b2c8 f0fbdf74 f1129a54 f1129a4c
df60: f0c60c00 f1129a00 802d985c 802d9264 f0fbc000 f114b2c8 f0fbdfac 00000000
df80: 00000000 f114b2c0 7fffffff f0fbc000 f114b2c8 f0fbdfac 00000000 00000000
dfa0: 00000000 8024ff4c f0fbdfd4 00000000 f0c39800 8004876c f0fbdfb8 f0fbdfb8
dfc0: f0cbfc20 f114b2c0 8024fefc 00000000 00000000 800484d8 00000000 00000000
dfe0: f0fbdfe0 f0fbdfe0 00000000 00000000 00000000 80025888 00000000 00000000
[<80241f20>] (raid5d+0x580/0x58c) from [<8024ff4c>]
(md_thread+0x50/0x11c) [<8024ff4c>] (md_thread+0x50/0x11c) from [<800484d8>] (kthread+0x7c/0x84) [<800484d8>] (kthread+0x7c/0x84) from [<80025888>] (kernel_thread_exit+0x0/0x8) Code: ebfff1bc e28dd044 e8bd8ff0 e3a03000 (e5833000) ---[ end trace 21e2ce0d28cdd11a ]--- Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = f0dd4000 [00000000] *pgd=70d2b031, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#2] last sysfs file: /sys/block/md1/md/metadata_version Modules linked in: e1000 mon CPU: 0 Tainted: G D (2.6.32 #140) PC is at __release_stripe+0x1e4/0x200 LR is at release_stripe+0x1c/0x24 pc : [<8023c188>] lr : [<8023c1c0>] psr: 80000093 sp : f0d61dd0 ip : 00000004 fp : 00000000 r10: 00042000 r9 : 00000000 r8 : eb441bb8 r7 : 00004000 r6 : f12366f8 r5 : f1129a00 r4 : f12366f0 r3 : 00000000 r2 : 20000093 r1 : 00000000 r0 : f1129a00 Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 0400397f Table: 70dd4018 DAC: 00000015 Process syslogd (pid: 623, stack limit = 0xf0d60278) Stack: (0xf0d61dd0 to 0xf0d62000) 1dc0: 8002d7a4 20000013 f1236b10 f12368c0 1de0: 00004000 8023c1c0 00001000 800a4af4 00004000 8019b350 00000001 00000000 1e00: f19b0000 00000000 00000000 eb441bb8 00000000 eb441bb8 00000000 f105c4a8 1e20: 00000000 8019b6d0 0000001c eb441bb8 eb441bb8 00000000 00000000 8019c200 1e40: 00000002 eb441bb8 f0fb9c80 00000000 00000000 8019c258 f1408920 801f2d68 1e60: 00000000 f1994310 00000000 f105c4a8 00000bb8 00000005 00046000 f0d61ea0 1e80: 00000001 804df75c 00000010 00000100 0000000a f0d60000 804df720 801a0960 1ea0: f0d61ea0 f0d61ea0 00000004 80039998 f1501690 803cfab0 0000006b 0000006b 1ec0: 803d4780 00000000 f1501690 803d0dcc 00000002 00000000 f1501690 80024044 1ee0: ffffffff f0d61f24 000001ad 80024adc f1501690 803d0dcc f1501690 00000000 1f00: 00000020 f1501690 00000020 f1501690 803d0dcc 00000002 00000000 f1501690 1f20: 00000000 f0d61f38 8007e2c4 800ac004 60000013 ffffffff f0d61e38 00020d42 1f40: 00000180 eb419c80 
00000004 00000000 00000000 f1507600 00000000 00000020 1f60: f1501690 f0d07000 00000004 eb419c80 f0d60000 00000000 7ecb9724 8007e2c4 1f80: 00000000 00000000 00000123 00000d41 7ecb9f08 00000000 00000005 80025044 1fa0: 2ac90000 80024ea0 00000d41 7ecb9f08 7ecb9f08 00020d41 00000180 00000000 1fc0: 00000d41 7ecb9f08 00000000 00000005 7ecb9f08 00000038 2ac90000 7ecb9724 1fe0: 000b1aa8 7ecb9218 2ac41a9c 2ac2949c 60000010 7ecb9f08 00000000 00000000 [<8023c188>] (__release_stripe+0x1e4/0x200) from [<8023c1c0>] (release_stripe+0x1c/0x24) [<8023c1c0>] (release_stripe+0x1c/0x24) from [<800a4af4>] (bio_endio+0x48/0x64) [<800a4af4>] (bio_endio+0x48/0x64) from [<8019b350>] (blk_update_request+0x8c/0x3f4) [<8019b350>] (blk_update_request+0x8c/0x3f4) from [<8019b6d0>] (blk_update_bidi_request+0x18/0x60) [<8019b6d0>] (blk_update_bidi_request+0x18/0x60) from [<8019c200>] (blk_end_bidi_request+0x14/0x5c) [<8019c200>] (blk_end_bidi_request+0x14/0x5c) from [<8019c258>] (blk_end_request+0x10/0x18) [<8019c258>] (blk_end_request+0x10/0x18) from [<801f2d68>] (scsi_io_completion+0x74/0x4c0) [<801f2d68>] (scsi_io_completion+0x74/0x4c0) from [<801a0960>] (blk_done_softirq+0x80/0x98) [<801a0960>] (blk_done_softirq+0x80/0x98) from [<80039998>] (__do_softirq+0x88/0x11c) [<80039998>] (__do_softirq+0x88/0x11c) from [<80024044>] (asm_do_IRQ+0x44/0x8c) [<80024044>] (asm_do_IRQ+0x44/0x8c) from [<80024adc>] (__irq_svc+0x3c/0x80) Exception stack(0xf0d61ef0 to 0xf0d61f38) 1ee0: f1501690 803d0dcc f1501690 00000000 1f00: 00000020 f1501690 00000020 f1501690 803d0dcc 00000002 00000000 f1501690 1f20: 00000000 f0d61f38 8007e2c4 800ac004 60000013 ffffffff [<80024adc>] (__irq_svc+0x3c/0x80) from [<800ac004>] (fsnotify+0x124/0x170) [<800ac004>] (fsnotify+0x124/0x170) from [<8007e2c4>] (do_sys_open+0xac/0xe4) [<8007e2c4>] (do_sys_open+0xac/0xe4) from [<80024ea0>] (ret_fast_syscall+0x0/0x38) Code: e59300f0 eb002b2f eaffffe2 e1a03001 (e5833000) ---[ end trace 21e2ce0d28cdd11b ]--- Kernel panic - not syncing: Fatal 
exception in interrupt Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 80004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] last sysfs file: /sys/block/md1/md/metadata_version Modules linked in: e1000 mon CPU: 0 Not tainted (2.6.32 #140) PC is at get_active_stripe+0x5a4/0x66c LR is at sync_request+0xe4/0xea4 pc : [<8023edac>] lr : [<80242fc4>] psr: 40000093 sp : eb2dddb0 ip : 00000000 fp : f11292dc r10: 00012bd0 r9 : 00000000 r8 : 00012bd0 r7 : 00000000 r6 : f0cee940 r5 : eb2dc000 r4 : f1129200 r3 : 00000000 r2 : f1330008 r1 : 00000000 r0 : f0cee940 Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 0400397f Table: 6ac1c018 DAC: 00000035 Process md1_resync (pid: 3698, stack limit = 0xeb2dc278) Stack: (0xeb2dddb0 to 0xeb2de000) dda0: 00000000 00000001 00000000 00000000 ddc0: 00000000 00000000 00000000 000005e8 00000000 00000000 00000001 00000001 dde0: 000000e8 00000000 00000002 f1129200 00000001 00000000 00000001 eb2dde94 de00: 00000000 00012bd0 f0c3cc00 80242fc4 00000000 00000001 00000000 00000000 de20: f10c15c0 802090f0 f19b0000 80209404 ebbfd000 80000013 00012bd0 00000000 de40: f1981000 f105c4a8 f1981000 8019bf84 f10880c0 f1088000 f1981000 f1088000 de60: 00000008 00000000 f0c3cc00 f105c4a8 00000003 f1129200 00000004 f0c3cc00 de80: 00012b00 8019c740 f1053d80 8019ce1c 8003181c 00000400 00000000 000001a8 dea0: 00000000 0000155c f0c3cc00 00012bd0 00000000 00012bd0 00000000 802508a8 dec0: eb2ddf7c 00000001 eb2dc000 f0c3cc2c 00018680 00000000 f0c1cc00 00000002 dee0: 00012b00 00000000 eb40d830 8037fa40 00000000 00000000 000060d8 00000000 df00: 0000f460 00000000 00000000 00000000 00000000 00000000 00000000 00000000 df20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 df40: 0000e689 0000e7bd 0000e8ea 0000e689 0000e689 0000e689 0000e689 0000e689 df60: 0000e689 0000e689 00000000 eb40d800 8004876c eb2ddf74 eb2ddf74 00000000 df80: edfa601c f10eb3a0 7fffffff eb2dc000 f10eb3a8 eb2ddfac 
00000000 00000000 dfa0: 00000000 8024ff4c eb2ddfd4 f0fb5ed0 f10eb3a0 8024fefc 00000000 00000000 dfc0: f0fb5ed0 f10eb3a0 8024fefc 00000000 00000000 800484d8 00000000 00000000 dfe0: eb2ddfe0 eb2ddfe0 00000000 00000000 00000000 80025888 00000000 00000000 [<8023edac>] (get_active_stripe+0x5a4/0x66c) from [<80242fc4>] (sync_request+0xe4/0xea4) [<80242fc4>] (sync_request+0xe4/0xea4) from [<802508a8>] (md_do_sync+0x890/0xd6c) [<802508a8>] (md_do_sync+0x890/0xd6c) from [<8024ff4c>] (md_thread+0x50/0x11c) [<8024ff4c>] (md_thread+0x50/0x11c) from [<800484d8>] (kthread+0x7c/0x84) [<800484d8>] (kthread+0x7c/0x84) from [<80025888>] (kernel_thread_exit+0x0/0x8) Code: e5903028 e3130c02 1affff56 e3a03000 (e5833000) ---[ end trace 21e2ce0d28cdd11a ]--- Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 80004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] last sysfs file: /sys/block/md1/md/metadata_version Modules linked in: e1000 mon CPU: 0 Not tainted (2.6.32 #140) PC is at raid5d+0x580/0x58c LR is at raid5d+0x46c/0x58c pc : [<80241f20>] lr : [<80241e0c>] psr: 20000093 sp : f0fa9f40 ip : f0c22c4c fp : 00000000 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : f1155468 r6 : f0fa8000 r5 : 7fffffff r4 : f0e864a0 r3 : 00000000 r2 : 60000093 r1 : 60000093 r0 : f0c22c54 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 0400397f Table: 6b740018 DAC: 00000035 Process md1_raid6 (pid: 1296, stack limit = 0xf0fa8278) Stack: (0xf0fa9f40 to 0xf0faa000) 9f40: 00000000 7fffffff 7fffffff f0fa8000 f1155468 f0fa9f74 f0c22c54 f0c22c4c 9f60: f0ddec00 f0c22c00 802d985c 802d9264 f1155460 7fffffff f0fa8000 f1155468 9f80: f0fa9fac f1155460 7fffffff f0fa8000 f1155468 f0fa9fac 00000000 00000000 9fa0: 00000000 8024ff4c f0fa9fd4 00000000 f0c40c00 8004876c f0fa9fb8 f0fa9fb8 9fc0: f0ccbc20 f1155460 8024fefc 00000000 00000000 800484d8 00000000 00000000 9fe0: f0fa9fe0 f0fa9fe0 00000000 00000000 00000000 80025888 00000000 00000000 [<80241f20>] 
(raid5d+0x580/0x58c) from [<8024ff4c>] (md_thread+0x50/0x11c) [<8024ff4c>] (md_thread+0x50/0x11c) from [<800484d8>] (kthread+0x7c/0x84) [<800484d8>] (kthread+0x7c/0x84) from [<80025888>] (kernel_thread_exit+0x0/0x8) Code: ebfff1bc e28dd044 e8bd8ff0 e3a03000 (e5833000) ---[ end trace aa8b689e041c4730 ]--- Regards, QinDehua ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-06-30  4:09 PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS Qin Dehua
@ 2011-06-30  7:43 ` Russell King
  2011-06-30 11:16   ` Qin Dehua
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2011-06-30  7:43 UTC
  To: Qin Dehua; +Cc: linux-kernel, santosh.shilimkar, dan.j.williams, neilb

On Thu, Jun 30, 2011 at 12:09:15PM +0800, Qin Dehua wrote:
> The 2.6.38.8 kernel makes our IOP 341 XScale processor based RAID6 box crash.
>
> After bisecting, we found that commit
> 2ffe2da3e71652d4f4cae19539b5c78c2a239136 causes the problem.
>
> That commit is only for ARMv6 and ARMv7 CPUs, so we reverted it on the
> 2.6.38.8 kernel, and our RAID box then runs OK.
>
> The following are some kernel messages from when the system crashes:
>
> * The kernel config has CONFIG_ASYNC_PQ=y CONFIG_RAID6_PQ=y

These traces are from 2.6.32... And I assume you have CONFIG_BUG unset,
because there is no verbose bug reporting (it's not reporting the
file/line, which is necessary to identify which BUG has been hit in
the raid code).

Could you reproduce with CONFIG_BUG=y please?

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-06-30  7:43 ` Russell King
@ 2011-06-30 11:16   ` Qin Dehua
  2011-06-30 11:28     ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Qin Dehua @ 2011-06-30 11:16 UTC
  To: Russell King; +Cc: linux-kernel, santosh.shilimkar, dan.j.williams, neilb

Commit 2ffe2da3e comes after v2.6.32; the messages are from a kernel
built at commit 2ffe2da3e.

The config has CONFIG_BUG=y and CONFIG_DEBUG_BUGVERBOSE=y, but the
messages are Oopses, not BUG() macro reports, so they don't have line
numbers.

Regards,
QinDehua

2011/6/30, Russell King <rmk@arm.linux.org.uk>:
> These traces are from 2.6.32... And I assume you have CONFIG_BUG unset,
> because there is no verbose bug reporting (it's not reporting the
> file/line, which is necessary to identify which BUG has been hit in
> the raid code).
>
> Could you reproduce with CONFIG_BUG=y please?
>
> --
> Russell King
>  Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
>  maintainer of:
>
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-06-30 11:16   ` Qin Dehua
@ 2011-06-30 11:28     ` Russell King
  2011-06-30 18:02       ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2011-06-30 11:28 UTC
  To: Qin Dehua; +Cc: linux-kernel, santosh.shilimkar, dan.j.williams, neilb

On Thu, Jun 30, 2011 at 07:16:24PM +0800, Qin Dehua wrote:
> Commit 2ffe2da3e comes after v2.6.32; the messages are from a kernel
> built at commit 2ffe2da3e.
>
> The config has CONFIG_BUG=y and CONFIG_DEBUG_BUGVERBOSE=y, but the
> messages are Oopses, not BUG() macro reports, so they don't have line
> numbers.

In that case, the raid5 code contains an explicit NULL pointer
dereference which isn't a BUG() - the code line disassembles to:

   0:   ebfff1bc    bl      0xffffc6f8
   4:   e28dd044    add     sp, sp, #68     ; 0x44
   8:   e8bd8ff0    pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}
   c:   e3a03000    mov     r3, #0  ; 0x0
  10:   e5833000    str     r3, [r3]        <=== faulting instruction

So, if you're saying that's not a BUG(), then I don't know what it is
and I'm afraid I can't help because the oops doesn't make any sense
to me.

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
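[Archive note] The last two words of the quoted Code: line, e3a03000 (e5833000), are the ones Russell disassembles above as "mov r3, #0" followed by "str r3, [r3]", that is, an unconditional store of 0 to address 0. As far as I know, that is what the ARM BUG() of that era compiled down to (*(int *)0 = 0), which would explain why it reads as a BUG() to him. The following is only a rough sketch for checking such a word pair by hand. The function names are hypothetical (not kernel code), and it recognises only the two A32 encodings seen in this thread, with condition AL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: is `w` an A32 "mov rD, #0" (condition AL, S=0)?
 * On a match, returns 1 and stores D in *rd. */
int is_mov_zero(uint32_t w, unsigned *rd)
{
    if ((w & 0xffff0fffu) != 0xe3a00000u)
        return 0;
    *rd = (w >> 12) & 0xfu;
    return 1;
}

/* Hypothetical helper: is `w` an A32 "str rT, [rN]" (condition AL,
 * word store, zero immediate offset, no writeback)? */
int is_str_base(uint32_t w, unsigned *rt, unsigned *rn)
{
    if ((w & 0xfff00fffu) != 0xe5800000u)
        return 0;
    *rn = (w >> 16) & 0xfu;
    *rt = (w >> 12) & 0xfu;
    return 1;
}

/* Returns 1 if the two words are "mov rX, #0; str rX, [rX]",
 * i.e. a deliberate store of 0 to address 0. */
int is_null_store_pair(uint32_t prev, uint32_t fault)
{
    unsigned rd, rt, rn;

    return is_mov_zero(prev, &rd) && is_str_base(fault, &rt, &rn)
        && rd == rt && rd == rn;
}

/* Self-check against words quoted from the Code: lines in this thread. */
void self_check(void)
{
    /* raid5d oops: "e3a03000 (e5833000)" */
    assert(is_null_store_pair(0xe3a03000u, 0xe5833000u));
    /* __release_stripe oops: "e1a03001 (e5833000)" is mov r3, r1,
     * so the strict pattern does not match even though r1 was 0. */
    assert(!is_null_store_pair(0xe1a03001u, 0xe5833000u));
}
```

The second case is a reminder that the compiler may reuse a register already holding zero, so a pattern match like this can confirm a forced NULL store but cannot rule one out.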
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-06-30 11:28     ` Russell King
@ 2011-06-30 18:02       ` Dan Williams
  2011-07-01  4:54         ` Qin Dehua
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2011-06-30 18:02 UTC
  To: Qin Dehua; +Cc: Russell King, linux-kernel, santosh.shilimkar, neilb

On Thu, Jun 30, 2011 at 4:28 AM, Russell King <rmk@arm.linux.org.uk> wrote:
> On Thu, Jun 30, 2011 at 07:16:24PM +0800, Qin Dehua wrote:
>> Commit 2ffe2da3e comes after v2.6.32; the messages are from a kernel
>> built at commit 2ffe2da3e.
>>
>> The config has CONFIG_BUG=y and CONFIG_DEBUG_BUGVERBOSE=y, but the
>> messages are Oopses, not BUG() macro reports, so they don't have line
>> numbers.
>
> In that case, the raid5 code contains an explicit NULL pointer
> dereference which isn't a BUG() - the code line disassembles to:
>
>    0:   ebfff1bc    bl      0xffffc6f8
>    4:   e28dd044    add     sp, sp, #68     ; 0x44
>    8:   e8bd8ff0    pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}
>    c:   e3a03000    mov     r3, #0  ; 0x0
>   10:   e5833000    str     r3, [r3]        <=== faulting instruction
>
> So, if you're saying that's not a BUG(), then I don't know what it is
> and I'm afraid I can't help because the oops doesn't make any sense
> to me.
>

QinDehua,

Can you rebuild with CONFIG_DEBUG_INFO=y, reproduce the crash and then
send the output of:

$ gdb drivers/md/raid5.o
(gdb) li *(raid5d+0x580)
(gdb) li *(__release_stripe+0x1e4)
etc...

...those offsets might change so just grab whatever "PC is at "
reports in the oops.

--
Dan
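[Archive note] Dan's recipe is mechanical: take the symbol+offset from the "PC is at" line of the oops and hand it to gdb's "li *(...)". A minimal sketch of that step follows, assuming the line has exactly the "PC is at symbol+0xoff/0xsize" shape seen in this thread. The function name pc_line_to_gdb is made up for illustration and is not part of any kernel tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn an oops line such as
 *   "PC is at raid5d+0x580/0x58c"
 * into the gdb command "li *(raid5d+0x580)".
 * Returns 0 on success, -1 if the line does not match or `out` is too small. */
int pc_line_to_gdb(const char *line, char *out, size_t outlen)
{
    char sym[128];
    unsigned long off, size;
    int n;

    /* symbol name runs up to '+', then hex offset, '/', hex function size */
    if (sscanf(line, "PC is at %127[^+]+0x%lx/0x%lx", sym, &off, &size) != 3)
        return -1;

    n = snprintf(out, outlen, "li *(%s+0x%lx)", sym, off);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Note that when the PC lands in __bug itself, as in the later 2.6.38.8 traces, the address worth listing is the caller from the backtrace (for example raid5d+0x590) rather than the PC.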
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-06-30 18:02       ` Dan Williams
@ 2011-07-01  4:54         ` Qin Dehua
  2011-07-07  9:39           ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Qin Dehua @ 2011-07-01  4:54 UTC
  To: Russell King; +Cc: Dan Williams, linux-kernel, santosh.shilimkar, neilb

(Sorry, I am resending this mail because the previous one was rejected by
linux-kernel@vger.kernel.org because it contained HTML.)

I recompiled the 2.6.38.8 kernel and ran the test three times. Each time,
the bug occurred when the RAID6 array started rebuilding the second disk:

~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid6 sdb[1](S) sda[0] sdd[3] sdc[2]
      2000000 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/2] [__UU]
      [===================>.] recovery = 99.1% (991760/1000000) finish=0.0min speed=9784K/sec

unused devices: <none>

~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid6 sdb[1] sda[0] sdd[3] sdc[2]
      2000000 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/3] [U_UU]
      resync=DELAYED

unused devices: <none>

The following are the messages from the three tests:

====== RUN 1 ======
kernel BUG at drivers/md/raid5.c:3978!
Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 80004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] last sysfs file: /sys/block/md1/md/mismatch_cnt Modules linked in: CPU: 0 Not tainted (2.6.38.8+ #4) PC is at __bug+0x1c/0x28 LR is at __bug+0x18/0x28 pc : [<8002b0d4>] lr : [<8002b0d0>] psr: 20000093 sp : f122df30 ip : 00000007 fp : 00000000 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : f11b7328 r6 : f122c000 r5 : 7fffffff r4 : f11ae4a0 r3 : 00000000 r2 : 60000093 r1 : 0000431f r0 : 0000002d Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 0400397f Table: 6b4f4018 DAC: 00000035 Process md1_raid6 (pid: 1643, stack limit = 0xf122c278) Stack: (0xf122df30 to 0xf122e000) df20: 80273244 80273368 00000005 7fffffff df40: 7fffffff f122c000 f11b7328 f122df6c f0dcf854 f0dcf84c f0f6c400 f0dcf800 df60: 80314dcc 80314784 f11b7320 7fffffff f122c000 f11b7328 f122dfa4 f11b7320 df80: 7fffffff f122c000 f11b7328 f122dfa4 00000000 00000000 00000000 802807e4 dfa0: f122dfcc 00000000 f0fff340 800508a8 f122dfb0 f122dfb0 f11b5c34 f11b7320 dfc0: 8028075c 00000013 00000000 800505e4 00000000 00000000 f11b7320 00000000 dfe0: f122dfe0 f122dfe0 f11b5c34 80050564 80028998 80028998 00000000 00000000 [<8002b0d4>] (__bug+0x1c/0x28) from [<80273368>] (raid5d+0x590/0x594) [<80273368>] (raid5d+0x590/0x594) from [<802807e4>] (md_thread+0x88/0x130) [<802807e4>] (md_thread+0x88/0x130) from [<800505e4>] (kthread+0x80/0x88) [<800505e4>] (kthread+0x80/0x88) from [<80028998>] (kernel_thread_exit+0x0/0x8) Code: e1a01000 e59f000c eb003cb2 e3a03000 (e5833000) ---[ end trace d52c55614a4f7fe3 ]--- (gdb) li *(raid5d+0x590) 0x8540 is in raid5d (drivers/md/raid5.c:3978). 
3973            } else
3974                    return NULL;
3975
3976            list_del_init(&sh->lru);
3977            atomic_inc(&sh->count);
3978            BUG_ON(atomic_read(&sh->count) != 1);
3979            return sh;
3980    }
3981
3982    static int make_request(mddev_t *mddev, struct bio * bi)

====== RUN 2 ======
kernel BUG at drivers/md/raid5.c:199!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = f0d8c000
[00000000] *pgd=7115b831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1]
last sysfs file: /sys/block/md1/md/metadata_version
Modules linked in:
CPU: 0 Not tainted (2.6.38.8+ #4)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<8002b0d4>] lr : [<8002b0d0>] psr: 20000093
sp : f11d7990 ip : 00000007 fp : 00000000
r10: 00048000 r9 : 00000000 r8 : ee29c948
r7 : 00038000 r6 : f0e70de8 r5 : f0d29600 r4 : f0e70de0
r3 : 00000000 r2 : 60000093 r1 : 000044c9 r0 : 0000002c
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 0400397f Table: 70d8c018 DAC: 00000015
Process syslogd (pid: 913, stack limit = 0xf11d6278)
Stack: (0xf11d7990 to 0xf11d8000)
7980: f11d79bc 8026c580 20000013 f134f5a0
79a0: f0e70eb0 8026c5ac 00001000 800b4438 00038000 801b2c94 00000013 00000000
79c0: f1857740 00000000 ee29c948 00000000 00000000 ee29c948 00000000 f1857400
79e0: 00000000 801b3010 f10be4b8 ee29c948 ee29c948 00000000 00000000 801b3dd0
7a00: 00000000 ee29c948 ef68b0a0 00000000 00000000 801b3e28 ef68b280 802106bc
7a20: 000001ad 80027b7c 00000801 f1857400 00000000 00080000 f10bee00 f11d7a60
7a40: 00000001 80532a54 00000010 00000100 0000000a f11d6000 8041e994 801b8370
7a60: f11d7a60 f11d7a60 00000004 8003e484 f1825824 804228f8 0000006b 0000006b
7a80: 804278a0 00000000 f18983fc f1898414 00000040 00000007 f1825824 8002704c
7aa0: ffffffff f11d7ae4 000001ad 80027b7c 00000121 ffffffff 000016e0 0000270f
7ac0: 00000002 00000002 000003e8 f18983fc f1898414 00000040 00000007 f1825824
7ae0: f11d7af8 f11d7af8 80242a68 801be554 20000013 ffffffff 00cfc120 00cfc120
7b00: f18983c0 00000400 f11d6000
00000080 00000093 00000000 f1936b00 80035a24 7b20: 00000000 00000000 80035e78 00000002 00000001 00000002 00000073 f1825824 7b40: 00cfc120 f18983c0 00000000 80244720 00000020 00000080 00000400 00000000 7b60: 0000005b f1898414 00000001 f11d7cc8 00232031 000000e8 00000020 00000000 7b80: 00000000 00cfc120 00000008 00000000 f18983c0 00cfc140 00000001 f18983fc 7ba0: 0000001f 00cfc120 00000020 00000002 00000000 00380000 00000000 00cfc0f4 7bc0: 00000000 f0d37960 00000002 f1074c00 f11d7cc0 8023bcd0 00cfc0f4 00000000 7be0: f11d7cd4 00000000 00000000 0097c0f4 00000000 f1112800 f1112800 8019fe5c 7c00: 0097c0f4 00000000 f11d7cd4 00000000 00000000 00000000 00000000 00000000 7c20: 00000000 0097c0f4 00000000 f1112800 f1112800 f0d37960 f14df000 0097c0f4 7c40: 00000000 801a1d9c 0097c0f4 00000000 f11d7cd4 00000000 00000000 00000002 7c60: f11d7cc0 00000000 00000000 00000000 00000000 00000000 00000000 0097c0f4 7c80: 00000000 0097c0f4 f1112800 f0d37960 f14df000 0000009f 00000003 801994cc 7ca0: 0097c0f4 00000000 f11d7cd4 00000622 f0d37240 f1149d20 00000000 00000002 7cc0: f0d37960 00000044 efb4c354 0000005b 32353a30 0000002c f11d12b8 0000005b 7ce0: 00000000 00000000 00000000 00000000 00000000 00000002 f0d37960 80199af0 7d00: 0000005b 00000003 00000000 0000005b 000002e3 00000000 efb4c354 f14df000 7d20: f1112800 0000005b 00000000 00000000 0000005b 0000005b 00003f0c efb4c354 7d40: ef68b3c0 f0d37988 f0d37960 f14df028 00000355 8133a980 000003af 0002e355 7d60: 00000000 80193fa8 0002e354 0000005b f11d7d84 0000005a f1112800 00000354 7d80: f14df0dc 00000000 0000005a 0000005a 0000005a 0002e355 00000000 f11d6000 7da0: f14df0dc 00000355 00000000 80067324 0000005a 0000005a 8133a980 00000001 7dc0: ee2edb80 00000cab 00000000 80324fa0 f11d7f40 00000001 00000000 0000005a 7de0: 00000001 8133a980 4e0d3670 f14df028 00000001 0002e355 00000000 0000005a 7e00: ee2edb80 f11d6000 00000000 800687f4 0002e355 00000000 f11d7f00 0000005a 7e20: 00000000 80088664 000200da f11d7f00 f11d7f40 f11d7ec8 f11d7ee4 f1412180 
7e40: f11d7e84 f14df0dc 0000044e f0c37370 f0c37370 00000001 00000000 f0c35514 7e60: 7065114f f0c37370 81350a20 0000005a f0d8caa8 f14df040 00000001 f11d7ec8 7e80: f11d7f40 0002e355 00000000 ee2edb80 7eb75724 80068a70 f0d8c000 2ab45000 7ea0: f11d7ec8 fffffdee f11d7f40 ee2edb80 f11d7f80 f11d6000 00000000 8008d2b4 7ec0: 0002e355 00000000 00000817 f10f8ab4 00000000 00000001 ffffffff ee2edb80 7ee0: 00000000 00000000 00000000 00000000 f1936b00 8041d068 00000000 00000000 7f00: 0002e355 00000000 7eb750d4 800272e0 0000005a 800828f4 0000005a 00000000 7f20: 00000000 0002ab45 00000000 80080c78 00000022 00000022 00000000 00000000 7f40: 2ab45000 0000005a ee2edb80 0000005a 2ab45000 f11d7f80 800280e4 8008db34 7f60: ee2edb80 00000001 ee2edb80 fffffff7 0002e355 00000000 800280e4 8008e110 7f80: 0002e355 00000000 00000001 00000001 00000000 7eb751a0 0000005a 2ab45000 7fa0: 00000004 80027f40 7eb751a0 0000005a 00000004 2ab45000 0000005a 00000000 7fc0: 7eb751a0 0000005a 2ab45000 00000004 0000005a 0000004a 2ada7000 7eb75724 7fe0: 00000000 7eb75088 2ad58a9c 2ad40ac4 60000010 00000004 00000000 00000000 [<8002b0d4>] (__bug+0x1c/0x28) from [<8026c580>] (__release_stripe+0x1dc/0x1ec) [<8026c580>] (__release_stripe+0x1dc/0x1ec) from [<8026c5ac>] (release_stripe+0x1c/0x24) [<8026c5ac>] (release_stripe+0x1c/0x24) from [<800b4438>] (bio_endio+0x48/0x64) [<800b4438>] (bio_endio+0x48/0x64) from [<801b2c94>] (blk_update_request+0x8c/0x3f0) [<801b2c94>] (blk_update_request+0x8c/0x3f0) from [<801b3010>] (blk_update_bidi_request+0x18/0x74) [<801b3010>] (blk_update_bidi_request+0x18/0x74) from [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) from [<801b3e28>] (blk_end_request+0x10/0x18) [<801b3e28>] (blk_end_request+0x10/0x18) from [<802106bc>] (scsi_io_completion+0x74/0x4c4) [<802106bc>] (scsi_io_completion+0x74/0x4c4) from [<801b8370>] (blk_done_softirq+0x80/0x98) [<801b8370>] (blk_done_softirq+0x80/0x98) from [<8003e484>] (__do_softirq+0x88/0x124) [<8003e484>] 
(__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] (__irq_svc+0x3c/0x80) Exception stack(0xf11d7ab0 to 0xf11d7af8) 7aa0: 00000121 ffffffff 000016e0 0000270f 7ac0: 00000002 00000002 000003e8 f18983fc f1898414 00000040 00000007 f1825824 7ae0: f11d7af8 f11d7af8 80242a68 801be554 20000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<801be554>] (__delay+0x4/0xc) [<801be554>] (__delay+0x4/0xc) from [<80242a68>] (inval_cache_and_wait_for_operation+0x27c/0x3b0) [<80242a68>] (inval_cache_and_wait_for_operation+0x27c/0x3b0) from [<80244720>] (cfi_intelext_writev+0xaa0/0x1184) [<80244720>] (cfi_intelext_writev+0xaa0/0x1184) from [<8023bcd0>] (part_writev+0x48/0x58) [<8023bcd0>] (part_writev+0x48/0x58) from [<8019fe5c>] (jffs2_flash_direct_writev+0x38/0x104) [<8019fe5c>] (jffs2_flash_direct_writev+0x38/0x104) from [<801a1d9c>] (jffs2_flash_writev+0x330/0x438) [<801a1d9c>] (jffs2_flash_writev+0x330/0x438) from [<801994cc>] (jffs2_write_dnode+0x1d8/0x434) [<801994cc>] (jffs2_write_dnode+0x1d8/0x434) from [<80199af0>] (jffs2_write_inode_range+0x2a8/0x430) [<80199af0>] (jffs2_write_inode_range+0x2a8/0x430) from [<80193fa8>] (jffs2_write_end+0x180/0x2e4) [<80193fa8>] (jffs2_write_end+0x180/0x2e4) from [<80067324>] (generic_file_buffered_write+0xec/0x234) [<80067324>] (generic_file_buffered_write+0xec/0x234) from [<800687f4>] (__generic_file_aio_write+0x264/0x484) [<800687f4>] (__generic_file_aio_write+0x264/0x484) from [<80068a70>] (generic_file_aio_write+0x5c/0xd0) [<80068a70>] (generic_file_aio_write+0x5c/0xd0) from [<8008d2b4>] (do_sync_write+0xa8/0xe8) [<8008d2b4>] (do_sync_write+0xa8/0xe8) from [<8008db34>] (vfs_write+0xb4/0x148) [<8008db34>] (vfs_write+0xb4/0x148) from [<8008e110>] (sys_write+0x3c/0x6c) [<8008e110>] (sys_write+0x3c/0x6c) from [<80027f40>] (ret_fast_syscall+0x0/0x3c) Code: e1a01000 e59f000c eb003cb2 e3a03000 (e5833000) ---[ end trace 85ad4ff48ad369db ]--- Kernel panic - not syncing: 
Fatal exception in interrupt [<8002ced8>] (unwind_backtrace+0x0/0xf0) from [<8003934c>] (panic+0x5c/0x1a0) [<8003934c>] (panic+0x5c/0x1a0) from [<8002b60c>] (die+0x188/0x1bc) [<8002b60c>] (die+0x188/0x1bc) from [<8002dfb4>] (__do_kernel_fault+0x68/0x88) [<8002dfb4>] (__do_kernel_fault+0x68/0x88) from [<8002e11c>] (do_page_fault+0x148/0x214) [<8002e11c>] (do_page_fault+0x148/0x214) from [<800272e0>] (do_DataAbort+0x38/0x9c) [<800272e0>] (do_DataAbort+0x38/0x9c) from [<80027b2c>] (__dabt_svc+0x4c/0x60) Exception stack(0xf11d7948 to 0xf11d7990) 7940: 0000002c 000044c9 60000093 00000000 f0e70de0 f0d29600 7960: f0e70de8 00038000 ee29c948 00000000 00048000 00000000 00000007 f11d7990 7980: 8002b0d0 8002b0d4 20000093 ffffffff [<80027b2c>] (__dabt_svc+0x4c/0x60) from [<8002b0d4>] (__bug+0x1c/0x28) [<8002b0d4>] (__bug+0x1c/0x28) from [<8026c580>] (__release_stripe+0x1dc/0x1ec) [<8026c580>] (__release_stripe+0x1dc/0x1ec) from [<8026c5ac>] (release_stripe+0x1c/0x24) [<8026c5ac>] (release_stripe+0x1c/0x24) from [<800b4438>] (bio_endio+0x48/0x64) [<800b4438>] (bio_endio+0x48/0x64) from [<801b2c94>] (blk_update_request+0x8c/0x3f0) [<801b2c94>] (blk_update_request+0x8c/0x3f0) from [<801b3010>] (blk_update_bidi_request+0x18/0x74) [<801b3010>] (blk_update_bidi_request+0x18/0x74) from [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) from [<801b3e28>] (blk_end_request+0x10/0x18) [<801b3e28>] (blk_end_request+0x10/0x18) from [<802106bc>] (scsi_io_completion+0x74/0x4c4) [<802106bc>] (scsi_io_completion+0x74/0x4c4) from [<801b8370>] (blk_done_softirq+0x80/0x98) [<801b8370>] (blk_done_softirq+0x80/0x98) from [<8003e484>] (__do_softirq+0x88/0x124) [<8003e484>] (__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] (__irq_svc+0x3c/0x80) Exception stack(0xf11d7ab0 to 0xf11d7af8) 7aa0: 00000121 ffffffff 000016e0 0000270f 7ac0: 00000002 00000002 000003e8 f18983fc f1898414 00000040 
00000007 f1825824 7ae0: f11d7af8 f11d7af8 80242a68 801be554 20000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<801be554>] (__delay+0x4/0xc) [<801be554>] (__delay+0x4/0xc) from [<80242a68>] (inval_cache_and_wait_for_operation+0x27c/0x3b0) [<80242a68>] (inval_cache_and_wait_for_operation+0x27c/0x3b0) from [<80244720>] (cfi_intelext_writev+0xaa0/0x1184) [<80244720>] (cfi_intelext_writev+0xaa0/0x1184) from [<8023bcd0>] (part_writev+0x48/0x58) [<8023bcd0>] (part_writev+0x48/0x58) from [<8019fe5c>] (jffs2_flash_direct_writev+0x38/0x104) [<8019fe5c>] (jffs2_flash_direct_writev+0x38/0x104) from [<801a1d9c>] (jffs2_flash_writev+0x330/0x438) [<801a1d9c>] (jffs2_flash_writev+0x330/0x438) from [<801994cc>] (jffs2_write_dnode+0x1d8/0x434) [<801994cc>] (jffs2_write_dnode+0x1d8/0x434) from [<80199af0>] (jffs2_write_inode_range+0x2a8/0x430) [<80199af0>] (jffs2_write_inode_range+0x2a8/0x430) from [<80193fa8>] (jffs2_write_end+0x180/0x2e4) [<80193fa8>] (jffs2_write_end+0x180/0x2e4) from [<80067324>] (generic_file_buffered_write+0xec/0x234) [<80067324>] (generic_file_buffered_write+0xec/0x234) from [<800687f4>] (__generic_file_aio_write+0x264/0x484) [<800687f4>] (__generic_file_aio_write+0x264/0x484) from [<80068a70>] (generic_file_aio_write+0x5c/0xd0) [<80068a70>] (generic_file_aio_write+0x5c/0xd0) from [<8008d2b4>] (do_sync_write+0xa8/0xe8) [<8008d2b4>] (do_sync_write+0xa8/0xe8) from [<8008db34>] (vfs_write+0xb4/0x148) [<8008db34>] (vfs_write+0xb4/0x148) from [<8008e110>] (sys_write+0x3c/0x6c) [<8008e110>] (sys_write+0x3c/0x6c) from [<80027f40>] (ret_fast_syscall+0x0/0x3c) (gdb) li *(__release_stripe+0x1dc) 0x1758 is in __release_stripe (drivers/md/raid5.c:215). 
210                             clear_bit(STRIPE_BIT_DELAY, &sh->state);
211                             list_add_tail(&sh->lru, &conf->handle_list);
212                     }
213                     md_wakeup_thread(conf->mddev->thread);
214             } else {
215                     BUG_ON(stripe_operations_active(sh));
216                     if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) {
217                             atomic_dec(&conf->preread_active_stripes);
218                             if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD)
219                                     md_wakeup_thread(conf->mddev->thread);

vi drivers/md/raid5.c:

196 static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
197 {
198     if (atomic_dec_and_test(&sh->count)) {
199             BUG_ON(!list_empty(&sh->lru));
200             BUG_ON(atomic_read(&conf->active_stripes)==0);
201             if (test_bit(STRIPE_HANDLE, &sh->state)) {
202                     if (test_bit(STRIPE_DELAYED, &sh->state)) {
203                             list_add_tail(&sh->lru, &conf->delayed_list);
204                             plugger_set_plug(&conf->plug);
205                     } else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
206                                sh->bm_seq - conf->seq_write > 0) {
207                             list_add_tail(&sh->lru, &conf->bitmap_list);
208                             plugger_set_plug(&conf->plug);
209                     } else {
210                             clear_bit(STRIPE_BIT_DELAY, &sh->state);
211                             list_add_tail(&sh->lru, &conf->handle_list);
212                     }
213                     md_wakeup_thread(conf->mddev->thread);
214             } else {
215                     BUG_ON(stripe_operations_active(sh));

====== RUN 3 ======
kernel BUG at drivers/md/raid5.c:199!
Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = eb234000 [00000000] *pgd=6b876831, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] last sysfs file: /sys/block/md1/md/metadata_version Modules linked in: CPU: 0 Not tainted (2.6.38.8+ #4) PC is at __bug+0x1c/0x28 LR is at __bug+0x18/0x28 pc : [<8002b0d4>] lr : [<8002b0d0>] psr: 20000093 sp : efc83c80 ip : 00000007 fp : 00000000 r10: 00009000 r9 : 00000000 r8 : efca8ca8 r7 : 00057000 r6 : f0e4a4a8 r5 : f0ca0600 r4 : f0e4a4a0 r3 : 00000000 r2 : 60000093 r1 : 00004544 r0 : 0000002c Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 0400397f Table: 6b234018 DAC: 00000015 Process handle_alarm.sh (pid: 5250, stack limit = 0xefc82278) Stack: (0xefc83c80 to 0xefc84000) 3c80: f18b413c 8026c580 20000013 f1206f30 f0e4a5f0 8026c5ac 00001000 800b4438 3ca0: 00057000 801b2c94 f1831430 00000000 f18b4000 00000000 efca8ca8 00000000 3cc0: 00000000 efca8ca8 00000000 f1857740 00000000 801b3010 f18b4000 efca8ca8 3ce0: efca8ca8 00000000 00000000 801b3dd0 00000002 efca8ca8 ef22daa0 00000000 3d00: 00000000 801b3e28 ef22db40 802106bc f5000000 f181f190 00000000 f1857740 3d20: f5000000 00060000 f10be400 efc83d50 00000001 80532a54 00000010 00000100 3d40: 0000000a efc82000 8041e994 801b8370 efc83d50 efc83d50 00000004 8003e484 3d60: efc83df4 804228f8 0000006b 0000006b 804278a0 00000000 00001564 0000005a 3d80: 00000000 f0c97a80 efc83df4 8002704c ffffffff efc83dd4 000001ad 80027b7c 3da0: 00000000 f10c6290 f10c6290 60000013 00000000 00000001 efc6a5d8 00001564 3dc0: 0000005a 00000000 f0c97a80 efc83df4 f10c6290 efc83de8 801c33f0 8002e958 3de0: 60000013 ffffffff efc481e0 2ac78000 81335e60 f10c6290 00000100 00000000 3e00: 00000000 f140a3b0 0000005a 0000005a 80544000 0156418f 8056ec80 efc6a5d8 3e20: efc481e0 00000000 00000200 2ac78000 00000000 8007c2e8 00000001 eb234ab0 3e40: 00000000 00000000 00000000 0000005a 2ac78000 8056ec80 eb234aa8 00000000 3e60: efc481e0 efc6a5d8 00000000 2ac78000 
eb234ab0 f0c97a80 7ee878a4 8007cef4 3e80: 0000005a 00000000 00000000 ffffffff 00000817 00000ab0 eb234000 2ac78000 3ea0: f0c97a80 eb234ab0 00000000 efc6a5d8 7ee878a4 8007d638 eb234ab0 00000000 3ec0: efc6a5d8 f0c97a80 80000005 f0c97ab4 2ac78d08 efc83fb0 f0072840 8002e144 3ee0: f10fdbb0 8003405c 8041e578 ffffffff 00000005 8041ce48 2ac78d08 efc83fb0 3f00: 00000000 2ad21000 7ee878a4 80027244 00000000 efc83f20 80314818 80027f00 3f20: efc82000 00000011 00000000 7ef43e1c f1836000 f0072840 ffffffff efc83f84 3f40: 00000000 00000000 00000000 80027b2c 2ab28048 00000000 00001482 f0072840 3f60: efc82000 00000000 00000000 00000000 00000000 00000000 00000000 efc83fac 3f80: 8041ffb4 efc83f98 80035fac 801bf97c 60000013 ffffffff ffffffff 000013e7 3fa0: 7ee87840 00000078 00000000 80027ec0 00000000 00000000 00000000 2ac78d08 3fc0: 00000000 000013e7 7ee87840 00000078 00000000 00000000 2ad21000 7ee878a4 3fe0: 2ab052b0 7ee87840 2ac9af14 2ac78d08 20000010 ffffffff 00000000 00000000 [<8002b0d4>] (__bug+0x1c/0x28) from [<8026c580>] (__release_stripe+0x1dc/0x1ec) [<8026c580>] (__release_stripe+0x1dc/0x1ec) from [<8026c5ac>] (release_stripe+0x1c/0x24) [<8026c5ac>] (release_stripe+0x1c/0x24) from [<800b4438>] (bio_endio+0x48/0x64) [<800b4438>] (bio_endio+0x48/0x64) from [<801b2c94>] (blk_update_request+0x8c/0x3f0) [<801b2c94>] (blk_update_request+0x8c/0x3f0) from [<801b3010>] (blk_update_bidi_request+0x18/0x74) [<801b3010>] (blk_update_bidi_request+0x18/0x74) from [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) from [<801b3e28>] (blk_end_request+0x10/0x18) [<801b3e28>] (blk_end_request+0x10/0x18) from [<802106bc>] (scsi_io_completion+0x74/0x4c4) [<802106bc>] (scsi_io_completion+0x74/0x4c4) from [<801b8370>] (blk_done_softirq+0x80/0x98) [<801b8370>] (blk_done_softirq+0x80/0x98) from [<8003e484>] (__do_softirq+0x88/0x124) [<8003e484>] (__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] 
(__irq_svc+0x3c/0x80) Exception stack(0xefc83da0 to 0xefc83de8) 3da0: 00000000 f10c6290 f10c6290 60000013 00000000 00000001 efc6a5d8 00001564 3dc0: 0000005a 00000000 f0c97a80 efc83df4 f10c6290 efc83de8 801c33f0 8002e958 3de0: 60000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<8002e958>] (update_mmu_cache+0x168/0x1f0) [<8002e958>] (update_mmu_cache+0x168/0x1f0) from [<8007c2e8>] (__do_fault+0x294/0x430) [<8007c2e8>] (__do_fault+0x294/0x430) from [<8007cef4>] (handle_pte_fault+0x78/0x35c) [<8007cef4>] (handle_pte_fault+0x78/0x35c) from [<8007d638>] (handle_mm_fault+0x88/0xac) [<8007d638>] (handle_mm_fault+0x88/0xac) from [<8002e144>] (do_page_fault+0x170/0x214) [<8002e144>] (do_page_fault+0x170/0x214) from [<80027244>] (do_PrefetchAbort+0x38/0x9c) [<80027244>] (do_PrefetchAbort+0x38/0x9c) from [<80027ec0>] (ret_from_exception+0x0/0x10) Exception stack(0xefc83fb0 to 0xefc83ff8) 3fa0: 00000000 00000000 00000000 2ac78d08 3fc0: 00000000 000013e7 7ee87840 00000078 00000000 00000000 2ad21000 7ee878a4 3fe0: 2ab052b0 7ee87840 2ac9af14 2ac78d08 20000010 ffffffff Code: e1a01000 e59f000c eb003cb2 e3a03000 (e5833000) ---[ end trace 01791389570b7cb9 ]--- Kernel panic - not syncing: Fatal exception in interrupt [<8002ced8>] (unwind_backtrace+0x0/0xf0) from [<8003934c>] (panic+0x5c/0x1a0) [<8003934c>] (panic+0x5c/0x1a0) from [<8002b60c>] (die+0x188/0x1bc) [<8002b60c>] (die+0x188/0x1bc) from [<8002dfb4>] (__do_kernel_fault+0x68/0x88) [<8002dfb4>] (__do_kernel_fault+0x68/0x88) from [<8002e11c>] (do_page_fault+0x148/0x214) [<8002e11c>] (do_page_fault+0x148/0x214) from [<800272e0>] (do_DataAbort+0x38/0x9c) [<800272e0>] (do_DataAbort+0x38/0x9c) from [<80027b2c>] (__dabt_svc+0x4c/0x60) Exception stack(0xefc83c38 to 0xefc83c80) 3c20: 0000002c 00004544 3c40: 60000093 00000000 f0e4a4a0 f0ca0600 f0e4a4a8 00057000 efca8ca8 00000000 3c60: 00009000 00000000 00000007 efc83c80 8002b0d0 8002b0d4 20000093 ffffffff [<80027b2c>] (__dabt_svc+0x4c/0x60) from [<8002b0d4>] (__bug+0x1c/0x28) 
[<8002b0d4>] (__bug+0x1c/0x28) from [<8026c580>] (__release_stripe+0x1dc/0x1ec) [<8026c580>] (__release_stripe+0x1dc/0x1ec) from [<8026c5ac>] (release_stripe+0x1c/0x24) [<8026c5ac>] (release_stripe+0x1c/0x24) from [<800b4438>] (bio_endio+0x48/0x64) [<800b4438>] (bio_endio+0x48/0x64) from [<801b2c94>] (blk_update_request+0x8c/0x3f0) [<801b2c94>] (blk_update_request+0x8c/0x3f0) from [<801b3010>] (blk_update_bidi_request+0x18/0x74) [<801b3010>] (blk_update_bidi_request+0x18/0x74) from [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) [<801b3dd0>] (blk_end_bidi_request+0x14/0x5c) from [<801b3e28>] (blk_end_request+0x10/0x18) [<801b3e28>] (blk_end_request+0x10/0x18) from [<802106bc>] (scsi_io_completion+0x74/0x4c4) [<802106bc>] (scsi_io_completion+0x74/0x4c4) from [<801b8370>] (blk_done_softirq+0x80/0x98) [<801b8370>] (blk_done_softirq+0x80/0x98) from [<8003e484>] (__do_softirq+0x88/0x124) [<8003e484>] (__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] (__irq_svc+0x3c/0x80) Exception stack(0xefc83da0 to 0xefc83de8) 3da0: 00000000 f10c6290 f10c6290 60000013 00000000 00000001 efc6a5d8 00001564 3dc0: 0000005a 00000000 f0c97a80 efc83df4 f10c6290 efc83de8 801c33f0 8002e958 3de0: 60000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<8002e958>] (update_mmu_cache+0x168/0x1f0) [<8002e958>] (update_mmu_cache+0x168/0x1f0) from [<8007c2e8>] (__do_fault+0x294/0x430) [<8007c2e8>] (__do_fault+0x294/0x430) from [<8007cef4>] (handle_pte_fault+0x78/0x35c) [<8007cef4>] (handle_pte_fault+0x78/0x35c) from [<8007d638>] (handle_mm_fault+0x88/0xac) [<8007d638>] (handle_mm_fault+0x88/0xac) from [<8002e144>] (do_page_fault+0x170/0x214) [<8002e144>] (do_page_fault+0x170/0x214) from [<80027244>] (do_PrefetchAbort+0x38/0x9c) [<80027244>] (do_PrefetchAbort+0x38/0x9c) from [<80027ec0>] (ret_from_exception+0x0/0x10) Exception stack(0xefc83fb0 to 0xefc83ff8) 3fa0: 00000000 00000000 00000000 2ac78d08 3fc0: 00000000 000013e7 7ee87840 
00000078 00000000 00000000 2ad21000 7ee878a4
3fe0: 2ab052b0 7ee87840 2ac9af14 2ac78d08 20000010 ffffffff

Regards,
QinDehua

2011/7/1 Dan Williams <dan.j.williams@intel.com>
>
> On Thu, Jun 30, 2011 at 4:28 AM, Russell King <rmk@arm.linux.org.uk> wrote:
> > In that case, the raid5 code contains an explicit NULL pointer
> > dereference which isn't a BUG() - the code line disassembles to:
> >
> >    0:   ebfff1bc        bl      0xffffc6f8
> >    4:   e28dd044        add     sp, sp, #68     ; 0x44
> >    8:   e8bd8ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}
> >    c:   e3a03000        mov     r3, #0  ; 0x0
> >   10:   e5833000        str     r3, [r3] <=== faulting instruction
> >
> > So, if you're saying that's not a BUG(), then I don't know what it is
> > and I'm afraid I can't help because the oops doesn't make any sense
> > to me.
> >
>
> QinDehua,
>
> Can you rebuild with CONFIG_DEBUG_INFO=y, reproduce the crash and then
> send the output of:
>
> $ gdb drivers/md/raid5.o
> (gdb) li *(raid5d+0x580)
> (gdb) li *(__release_stripe+0x1e4)
> etc...
>
> ...those offsets might change so just grab whatever "PC is at "
> reports in the oops.
>
> --
> Dan

^ permalink raw reply	[flat|nested] 11+ messages in thread
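Dan's request above amounts to copying the symbol and offset from the oops line "PC is at ..." into a gdb "li *(...)" command. As a minimal sketch of that step, assuming the oops line has the exact shape seen in this thread (symbol+0xoffset/0xsize), a small user-space C helper could build the command string. The function name oops_pc_to_gdb is hypothetical and is not part of gdb or the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Turn an oops line such as "PC is at raid5d+0x580/0x58c" into the gdb
 * command "li *(raid5d+0x580)".
 *
 * Returns 0 on success, -1 if the line does not have the expected shape
 * or the output buffer is too small. Hypothetical helper for illustration.
 */
static int oops_pc_to_gdb(const char *line, char *out, size_t outlen)
{
	char sym[128];
	unsigned long off, size;
	int n;

	assert(line != NULL && out != NULL);

	/* "%127[^+]" reads the symbol name up to the '+' separator. */
	if (sscanf(line, "PC is at %127[^+]+0x%lx/0x%lx", sym, &off, &size) != 3)
		return -1;

	/* Only the offset is needed; the function size after '/' is dropped. */
	n = snprintf(out, outlen, "li *(%s+0x%lx)", sym, off);
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	return 0;
}
```

Calling oops_pc_to_gdb("PC is at __bug+0x1c/0x28", buf, sizeof(buf)) fills buf with a command that can be pasted at the (gdb) prompt after loading the object file built with CONFIG_DEBUG_INFO=y. Lines that do not start with "PC is at" are rejected, so the LR line has to be handled separately if it is wanted.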
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-07-01  4:54       ` Qin Dehua
@ 2011-07-07  9:39         ` Russell King
  2011-07-08  4:38           ` Qin Dehua
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2011-07-07  9:39 UTC (permalink / raw)
  To: Qin Dehua; +Cc: Dan Williams, linux-kernel, santosh.shilimkar, neilb

On Fri, Jul 01, 2011 at 12:54:09PM +0800, Qin Dehua wrote:
> The followings are messages of the three tests:
> ====== RUN 1 ======
> kernel BUG at drivers/md/raid5.c:3978!

So they are BUG_ON()s after all...

Could you try commenting out:

+	if (dir != DMA_TO_DEVICE)
+		outer_inv_range(paddr, paddr + size);

in ___dma_page_dev_to_cpu and:

+	if (dir != DMA_TO_DEVICE) {
+		unsigned long paddr = __pa(kaddr);
+		outer_inv_range(paddr, paddr + size);
+	}

in ___dma_single_dev_to_cpu please - and put a BUG_ON(dir == DMA_BIDIRECTIONAL)
in their place (because that won't be handled correctly with that change.)

Thanks.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 11+ messages in thread
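The experiment Russell asks for can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the code in arch/arm/mm/dma-mapping.c: the enum values mirror the kernel's enum dma_data_direction, while struct l2_model and both function names are hypothetical stand-ins. A counter replaces outer_inv_range(), and a return value replaces BUG_ON() so the condition can be inspected without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical stand-in for the outer (L2) cache: counts invalidations. */
struct l2_model {
	unsigned long inv_calls;
};

/*
 * Behaviour quoted from the commit: on the device-to-CPU path, invalidate
 * the outer cache for every direction except DMA_TO_DEVICE.
 */
static void dev_to_cpu_original(struct l2_model *l2,
				enum dma_data_direction dir)
{
	assert(l2 != NULL);

	if (dir != DMA_TO_DEVICE)
		l2->inv_calls++;	/* models outer_inv_range(paddr, paddr + size) */
}

/*
 * Requested experiment: the invalidate is commented out and replaced by
 * BUG_ON(dir == DMA_BIDIRECTIONAL). Returns nonzero where the BUG_ON
 * would fire.
 */
static int dev_to_cpu_experiment_would_bug(enum dma_data_direction dir)
{
	return dir == DMA_BIDIRECTIONAL;
}
```

In this model a DMA_FROM_DEVICE unmap passes silently once the invalidate is removed, while a DMA_BIDIRECTIONAL unmap is flagged, which is the case Russell says would not be handled correctly with the change.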
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-07-07  9:39         ` Russell King
@ 2011-07-08  4:38           ` Qin Dehua
  2011-07-08  8:07             ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Qin Dehua @ 2011-07-08  4:38 UTC (permalink / raw)
  To: Russell King; +Cc: Dan Williams, linux-kernel, santosh.shilimkar, neilb

2011/7/7 Russell King <rmk@arm.linux.org.uk>:
> Could you try commenting out:
>
> +       if (dir != DMA_TO_DEVICE)
> +               outer_inv_range(paddr, paddr + size);
>
> in ___dma_page_dev_to_cpu and:
>
> +       if (dir != DMA_TO_DEVICE) {
> +               unsigned long paddr = __pa(kaddr);
> +               outer_inv_range(paddr, paddr + size);
> +       }
>
> in ___dma_single_dev_to_cpu please - and put a BUG_ON(dir == DMA_BIDIRECTIONAL)
> in their place (because that won't be handled correctly with that change.)

After doing the above changes, the kernel just report BUG_ON(dir ==
DMA_BIDIRECTIONAL):

kernel BUG at arch/arm/mm/dma-mapping.c:526!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1]
last sysfs file:
Modules linked in:
CPU: 0    Not tainted  (2.6.38.8+ #57)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<8002b0d4>]    lr : [<8002b0d0>]    psr: 60000013
sp : 80419e90  ip : 00000007  fp : 00001000
r10: f18fb0ec  r9 : 00000005  r8 : 71932000
r7 : 81376640  r6 : 00001000  r5 : 00000000  r4 : 00000000
r3 : 00000000  r2 : 60000013  r1 : 0000219b  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0400397f  Table: 00004018  DAC: 00000035
Process swapper (pid: 0, stack limit = 0x80418278)
Stack: (0x80419e90 to 0x8041a000)
9e80:                                     80533ba1 8002d474 80419ec4 80419ea8
9ea0: 80033e40 00000003 00000004 f18fb0c0 f18fb0c0 8028948c 80419edc 80419ec8
9ec0: 80032ea4 f18fb0c0 8041eb10 00000000 00000000 00000100 f188a984 f188a980
9ee0: 00000000 802895d8 80532b80 f188a98c 00000000 00000004 718d2040 00000001
9f00: 80043dc0 00000000 8041eb10 80532a20 00000018 00000100 0000000a
80418000 9f20: 8041e994 8003e960 00000006 00000001 80532a5c 8003e444 f18a39c0 80420630 9f40: 00000001 00000001 804278a0 00000000 8041a000 000202c0 69056819 00020258 9f60: 00000000 8002704c ffffffff 80419fac 00000005 80027b7c 8039b020 000000ce 9f80: 00000000 60000013 80418000 80431eac 8041cabc 8041a000 000202c0 69056819 9fa0: 00020258 00000000 80423734 80419fc0 8002907c 80028eb4 60000013 ffffffff 9fc0: 8041a0e8 80431e78 80431e00 800089e8 80008344 00000000 00000000 80022dc0 9fe0: 00000000 0400397d 8041a024 800231c4 8041cab0 00008034 00000000 00000000 [<8002b0d4>] (__bug+0x1c/0x28) from [<8002d474>] (___dma_page_dev_to_cpu+0x7c/0x84) [<8002d474>] (___dma_page_dev_to_cpu+0x7c/0x84) from [<8028948c>] (iop_adma_run_tx_complete_actions+0x204/0x288) [<8028948c>] (iop_adma_run_tx_complete_actions+0x204/0x288) from [<802895d8>] (__iop_adma_slot_cleanup+0xc8/0x34c) [<802895d8>] (__iop_adma_slot_cleanup+0xc8/0x34c) from [<8003e960>] (tasklet_action+0x70/0xec) [<8003e960>] (tasklet_action+0x70/0xec) from [<8003e444>] (__do_softirq+0x88/0x124) [<8003e444>] (__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] (__irq_svc+0x3c/0x80) Exception stack(0x80419f78 to 0x80419fc0) 9f60: 8039b020 000000ce 9f80: 00000000 60000013 80418000 80431eac 8041cabc 8041a000 000202c0 69056819 9fa0: 00020258 00000000 80423734 80419fc0 8002907c 80028eb4 60000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<80028eb4>] (cpu_idle+0x8c/0xa4) [<80028eb4>] (cpu_idle+0x8c/0xa4) from [<800089e8>] (start_kernel+0x224/0x2f4) [<800089e8>] (start_kernel+0x224/0x2f4) from [<00008034>] (0x8034) Code: e1a01000 e59f000c eb003ca2 e3a03000 (e5833000) ---[ end trace 6b4795e8111d3a81 ]--- Kernel panic - not syncing: Fatal exception in interrupt [<8002ced8>] (unwind_backtrace+0x0/0xf0) from [<8003930c>] (panic+0x5c/0x1a0) [<8003930c>] (panic+0x5c/0x1a0) from [<8002b60c>] (die+0x188/0x1bc) [<8002b60c>] (die+0x188/0x1bc) from [<8002df68>] 
(__do_kernel_fault+0x68/0x88) [<8002df68>] (__do_kernel_fault+0x68/0x88) from [<8002e0d0>] (do_page_fault+0x148/0x214) [<8002e0d0>] (do_page_fault+0x148/0x214) from [<800272e0>] (do_DataAbort+0x38/0x9c) [<800272e0>] (do_DataAbort+0x38/0x9c) from [<80027b2c>] (__dabt_svc+0x4c/0x60) Exception stack(0x80419e48 to 0x80419e90) 9e40: 00000033 0000219b 60000013 00000000 00000000 00000000 9e60: 00001000 81376640 71932000 00000005 f18fb0ec 00001000 00000007 80419e90 9e80: 8002b0d0 8002b0d4 60000013 ffffffff [<80027b2c>] (__dabt_svc+0x4c/0x60) from [<8002b0d4>] (__bug+0x1c/0x28) [<8002b0d4>] (__bug+0x1c/0x28) from [<8002d474>] (___dma_page_dev_to_cpu+0x7c/0x84) [<8002d474>] (___dma_page_dev_to_cpu+0x7c/0x84) from [<8028948c>] (iop_adma_run_tx_complete_actions+0x204/0x288) [<8028948c>] (iop_adma_run_tx_complete_actions+0x204/0x288) from [<802895d8>] (__iop_adma_slot_cleanup+0xc8/0x34c) [<802895d8>] (__iop_adma_slot_cleanup+0xc8/0x34c) from [<8003e960>] (tasklet_action+0x70/0xec) [<8003e960>] (tasklet_action+0x70/0xec) from [<8003e444>] (__do_softirq+0x88/0x124) [<8003e444>] (__do_softirq+0x88/0x124) from [<8002704c>] (asm_do_IRQ+0x4c/0x98) [<8002704c>] (asm_do_IRQ+0x4c/0x98) from [<80027b7c>] (__irq_svc+0x3c/0x80) Exception stack(0x80419f78 to 0x80419fc0) 9f60: 8039b020 000000ce 9f80: 00000000 60000013 80418000 80431eac 8041cabc 8041a000 000202c0 69056819 9fa0: 00020258 00000000 80423734 80419fc0 8002907c 80028eb4 60000013 ffffffff [<80027b7c>] (__irq_svc+0x3c/0x80) from [<80028eb4>] (cpu_idle+0x8c/0xa4) [<80028eb4>] (cpu_idle+0x8c/0xa4) from [<800089e8>] (start_kernel+0x224/0x2f4) [<800089e8>] (start_kernel+0x224/0x2f4) from [<00008034>] (0x8034) Regards, Qin Dehua ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-07-08  4:38           ` Qin Dehua
@ 2011-07-08  8:07             ` Russell King
  2011-07-08 17:32               ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2011-07-08  8:07 UTC (permalink / raw)
  To: Qin Dehua; +Cc: Dan Williams, linux-kernel, santosh.shilimkar, neilb

On Fri, Jul 08, 2011 at 12:38:38PM +0800, Qin Dehua wrote:
> After doing the above changes, the kernel just report BUG_ON(dir ==
> DMA_BIDIRECTIONAL):

That's really unfortunate.

The only other thing I can think which may help is to enable all the
raid5, async_tx and dmaengine debug code.  And I hope you have
DMA_API_DEBUG enabled in your .config ?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-07-08  8:07             ` Russell King
@ 2011-07-08 17:32               ` Russell King
  2011-07-08 20:13                 ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2011-07-08 17:32 UTC (permalink / raw)
  To: Qin Dehua; +Cc: Dan Williams, linux-kernel, santosh.shilimkar, neilb

On Fri, Jul 08, 2011 at 09:07:51AM +0100, Russell King wrote:
> On Fri, Jul 08, 2011 at 12:38:38PM +0800, Qin Dehua wrote:
> > After doing the above changes, the kernel just report BUG_ON(dir ==
> > DMA_BIDIRECTIONAL):
>
> That's really unfortunate.
>
> The only other thing I can think which may help is to enable all the
> raid5, async_tx and dmaengine debug code.  And I hope you have
> DMA_API_DEBUG enabled in your .config ?

I'm really grasping at straws here...

I'll add to this that I'm out of ideas at the moment (I don't know the
RAID5 nor the async offload code), and the only way I can think of
resolving this is to revert the commit.

While that sounds like a good thing to do, it means people using ARMv6
and later CPUs will be risking data corruption, which I don't think is
that desirable either - and will in itself cause a regression there.

So we really need to get to the bottom of what's going on (which I suspect
may be due to DMA API abuse by the async offload stuff - mapping the
same buffer multiple times with differing attributes.)  Why that would
impact sh->count I've no idea.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 11+ messages in thread
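The suspected "DMA API abuse" above, the same buffer mapped more than once with differing attributes, can be made concrete with a toy tracker. This is a user-space sketch only: it is not how DMA_API_DEBUG is implemented, and struct map_tracker, tracker_map and MAX_MAPPINGS are hypothetical names. It records live mappings and reports how many existing mappings overlap a new one while using a different direction.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
};

#define MAX_MAPPINGS 16	/* arbitrary table size for the sketch */

struct mapping {
	unsigned long addr;
	size_t size;
	enum dma_data_direction dir;
	int active;
};

struct map_tracker {
	struct mapping slots[MAX_MAPPINGS];
};

/*
 * Record a new mapping of [addr, addr + size).
 *
 * Returns the number of live mappings that overlap the new range and
 * were made with a different direction, or -1 if the table is full.
 */
static int tracker_map(struct map_tracker *t, unsigned long addr,
		       size_t size, enum dma_data_direction dir)
{
	int conflicts = 0;
	int free_slot = -1;
	int i;

	assert(t != NULL && size > 0);

	for (i = 0; i < MAX_MAPPINGS; i++) {
		const struct mapping *m = &t->slots[i];

		if (!m->active) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		/* Half-open interval overlap test. */
		if (addr < m->addr + m->size && m->addr < addr + size &&
		    m->dir != dir)
			conflicts++;
	}

	if (free_slot < 0)
		return -1;

	t->slots[free_slot].addr = addr;
	t->slots[free_slot].size = size;
	t->slots[free_slot].dir = dir;
	t->slots[free_slot].active = 1;
	return conflicts;
}
```

Mapping a page DMA_TO_DEVICE as a source and then mapping an overlapping range DMA_FROM_DEVICE as a destination would return a nonzero conflict count here. Whether the async offload code actually does this is exactly the open question in the thread; the sketch only shows what such a pattern looks like. Unmapping is left out to keep the example short.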
* Re: PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS
  2011-07-08 17:32               ` Russell King
@ 2011-07-08 20:13                 ` Dan Williams
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2011-07-08 20:13 UTC (permalink / raw)
  To: Russell King
  Cc: Qin Dehua, linux-kernel, santosh.shilimkar, neilb, Jiang, Dave

On 7/8/2011 10:32 AM, Russell King wrote:
> On Fri, Jul 08, 2011 at 09:07:51AM +0100, Russell King wrote:
>> On Fri, Jul 08, 2011 at 12:38:38PM +0800, Qin Dehua wrote:
>>> After doing the above changes, the kernel just report BUG_ON(dir ==
>>> DMA_BIDIRECTIONAL):
>>
>> That's really unfortunate.
>>
>> The only other thing I can think which may help is to enable all the
>> raid5, async_tx and dmaengine debug code.  And I hope you have
>> DMA_API_DEBUG enabled in your .config ?
>
> I'm really grasping at straws here...
>
> I'll add to this that I'm out of ideas at the moment (I don't know the
> RAID5 nor the async offload code), and the only way I can think of
> resolving this is to revert the commit.
>
> While that sounds like a good thing to do, it means people using ARMv6
> and later CPUs will be risking data corruption, which I don't think is
> that desirable either - and will in itself cause a regression there.

Not much of a choice but crashing is better than data corruption.
Disabling CONFIG_ASYNC_TX_DMA until the mapping violations can be
resolved is probably the better course of action.  Something like:

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 1c28816..cb254a1 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -247,6 +247,7 @@ config NET_DMA
 config ASYNC_TX_DMA
 	bool "Async_tx: Offload support for the async_tx api"
 	depends on DMA_ENGINE
+	depends on !ARM
 	help
 	  This allows the async_tx api to take advantage of offload engines for
 	  memcpy, memset, xor, and raid6 p+q operations.  If your platform has

>
> So we really need to get to the bottom of what's going on (which I suspect
> may be due to DMA API abuse by the async offload stuff - mapping the
> same buffer multiple times with differing attributes.)  Why that would
> impact sh->count I've no idea.
>

This is concerning, I'll see about dusting off my iop34x and
reproducing.  Might not be for a week or so...

--
Dan

^ permalink raw reply related	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-07-08 20:13 UTC | newest]

Thread overview: 11+ messages
2011-06-30  4:09 PROBLEM: ARM-dma-mapping-fix-for-speculative-prefetching cause OOPS Qin Dehua
2011-06-30  7:43 ` Russell King
2011-06-30 11:16   ` Qin Dehua
2011-06-30 11:28     ` Russell King
2011-06-30 18:02       ` Dan Williams
2011-07-01  4:54         ` Qin Dehua
2011-07-07  9:39           ` Russell King
2011-07-08  4:38             ` Qin Dehua
2011-07-08  8:07               ` Russell King
2011-07-08 17:32                 ` Russell King
2011-07-08 20:13                   ` Dan Williams