* [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-07 7:28 UTC
To: linux-kernel
Hi Everyone,
Please add me to CC in your reply.
When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
generates an Oops in the kernel. The Oops message and other details
are below:
# cat /proc/cmdline
ro LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=hvc0
rhgb root=UUID=eafd9874-010c-46f9-a0c6-ea8db7c61ac3
# uname -a
Linux XXXXXXXX 2.6.35-rc1 #1 SMP Mon Jun 7 18:04:53 IST 2010 ppc64 ppc64
ppc64 GNU/Linux
# cd /sys/kernel/
# ls
debug kexec_loaded profiling uevent_seqnum
kexec_crash_loaded mm security vmcoreinfo
kexec_crash_size notes uevent_helper
# cat kexec_crash_loaded
0
# cat kexec_loaded
0
# cat kexec_crash_size
1
# echo 0 > kexec_crash_size
Unable to handle kernel paging request for data at address 0x00000030
Faulting instruction address: 0xc0000000000930b4
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: /sys/kernel/kexec_crash_size
Modules linked in: sunrpc ipv6 ext3 jbd dm_mirror dm_region_hash dm_log
dm_multipath dm_mod uinput sg ehea sr_mod ibmveth cdrom ext4 jbd2
mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt [last
unloaded: scsi_wait_scan]
NIP: c0000000000930b4 LR: c0000000000930ac CTR: c0000000000b7ce0
REGS: c0000000b7803750 TRAP: 0300 Not tainted (2.6.35-rc1)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 28242482 XER: 20000000
DAR: 0000000000000030, DSISR: 0000000040000000
TASK = c0000000b9cc3dc0[1381] 'bash' THREAD: c0000000b7800000 CPU: 12
GPR00: c0000000000930ac c0000000b78039d0 c000000000e885a8
c000000000f42950
GPR04: c0000000b7803af0 0000000000000008 0000000000000002
c0000000005c6438
GPR08: 0000000000000000 000000008000000c 0000000000000000
0000000000000000
GPR12: 0000000040242448 c000000007441e00 00000000100f6210
0000000000000000
GPR16: 00000000100f4a38 00000000100cfb98 00000000100f4bdc
00000000100f4b4c
GPR20: 00000000103f4de8 0000000000000000 0000000000000000
00000000100f0000
GPR24: c0000000005c65c0 0000000000000000 c000000000da83e0
c0000000bc67f780
GPR28: c0000000bc684ae0 c000000000f42950 c000000000e1e3f8
c000000000da8408
NIP [c0000000000930b4] .release_resource+0x34/0xe0
LR [c0000000000930ac] .release_resource+0x2c/0xe0
Call Trace:
[c0000000b78039d0] [c0000000000930ac] .release_resource+0x2c/0xe0
(unreliable)
[c0000000b7803a60] [c0000000000d4fc8] .crash_shrink_memory+0x1c8/0x1f0
[c0000000b7803b30] [c0000000000b7d38] .kexec_crash_size_store+0x58/0x90
[c0000000b7803bc0] [c0000000002b0bb4] .kobj_attr_store+0x34/0x50
[c0000000b7803c30] [c000000000226d5c] .sysfs_write_file+0xec/0x1f0
[c0000000b7803ce0] [c00000000019e0bc] .vfs_write+0xec/0x1f0
[c0000000b7803d80] [c00000000019e2e8] .SyS_write+0x58/0xb0
[c0000000b7803e30] [c00000000000852c] syscall_exit+0x0/0x40
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 ebc2b228 7c7f1b78 f8010010 f821ff71 ebbe8000
7fa3eb78 484e35d9 60000000 e97f0020 <e92b0030> 2fa90000 419e002c
7fbf4800
---[ end trace afbc780462c9bf4e ]---
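For readers decoding the trace: the word in angle brackets in the instruction dump is the faulting instruction. The sketch below decodes it as a PowerPC DS-form load/store, assuming the standard 64-bit PowerPC encoding (6-bit primary opcode, two 5-bit register fields, a 14-bit displacement field and a 2-bit extended opcode). The names decode_ds and struct ds_form are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Fields of a PowerPC DS-form instruction (ld/ldu/lwa, std/stdu). */
struct ds_form {
	unsigned opcode;	/* primary opcode: 58 for loads, 62 for stores */
	unsigned rt;		/* target (or source) register number */
	unsigned ra;		/* base register number */
	int32_t disp;		/* sign-extended byte displacement */
	unsigned xo;		/* extended opcode in the low two bits */
};

/* Split a 32-bit instruction word into its DS-form fields. */
static struct ds_form decode_ds(uint32_t insn)
{
	struct ds_form d;
	int32_t v = (int32_t)(insn & 0xfffc);	/* DS field with low bits cleared */

	if (v & 0x8000)				/* sign-extend from 16 bits */
		v -= 0x10000;

	d.opcode = insn >> 26;
	d.rt = (insn >> 21) & 0x1f;
	d.ra = (insn >> 16) & 0x1f;
	d.disp = v;
	d.xo = insn & 0x3;
	return d;
}
```

Applied to 0xe92b0030 this gives opcode 58, rt 9, ra 11, displacement 0x30, i.e. ld r9,48(r11). GPR11 is zero in the register dump, which matches the reported fault address 0x30. The preceding word 0xe97f0020 decodes the same way to ld r11,32(r31).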
When crashkernel is not enabled, the crashk_res resource has not been
reserved. Hence crashk_res.parent will be NULL, and release_resource()
ends up dereferencing a NULL pointer.
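The failure mode can be modelled in user space. This is only a sketch: struct resource below is a simplified stand-in whose field order is assumed to mirror the 2.6.35 kernel structure on a 64-bit build, and model_release_resource and faulting_offset are hypothetical names. Unlike the kernel helper in the trace, the model returns -1 for a NULL parent instead of dereferencing it, so it is safe to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct resource (assumed layout). */
struct resource {
	uint64_t start;
	uint64_t end;
	const char *name;
	unsigned long flags;
	struct resource *parent, *sibling, *child;
};

/*
 * Model of unlinking a resource from its parent's child list.
 * A resource that was never inserted has parent == NULL; the model
 * reports that case instead of walking &old->parent->child.
 */
static int model_release_resource(struct resource *old)
{
	struct resource **p, *tmp;

	assert(old != NULL);
	if (old->parent == NULL)
		return -1;	/* the case that oopsed in the report */

	p = &old->parent->child;
	for (;;) {
		tmp = *p;
		if (!tmp)
			break;
		if (tmp == old) {
			*p = tmp->sibling;
			old->parent = NULL;
			return 0;
		}
		p = &tmp->sibling;
	}
	return -1;
}

/* Offset that gets read when the parent pointer is NULL. */
static size_t faulting_offset(void)
{
	return offsetof(struct resource, child);
}
```

Under the assumed layout on an LP64 target, faulting_offset() is 0x30, the same value as the fault address in the Oops. Checking crashk_res.parent before calling release_resource(), as the patch does, avoids reaching that access.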
Attaching a simple patch for this problem. The patch is tested and resolves this bug.
Thanks..
Pavan
[-- Attachment #2: fix-crash_shrink_memory.patch --]
[-- Type: text/x-patch, Size: 371 bytes --]
diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c 2010-06-07 18:45:55.050000000 +0530
+++ b/kernel/kexec.c 2010-06-07 18:50:28.070000004 +0530
@@ -1134,7 +1134,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
 
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08 7:07 UTC
To: linux-kernel; +Cc: vgoyal, hbabu, kexec
Adding CC's..
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Américo Wang @ 2010-06-08 7:59 UTC
To: Pavan Naregundi; +Cc: linux-kernel, vgoyal, hbabu, kexec
Ouch...
The patch indeed addresses the problem, looks good to me.
Please add your Signed-off-by and my:
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Another problem is that you should get 0 instead of 1 when you don't
reserve any memory.
Thanks!
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08 8:40 UTC
To: Américo Wang; +Cc: linux-kernel, vgoyal, hbabu, kexec
On Tue, 2010-06-08 at 15:59 +0800, Américo Wang wrote:
> Ouch...
>
> The patch indeed addresses the problem, looks good to me.
> Please add your Signed-off-by and my:
>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
>
> Another problem is that you should get 0 instead of 1 when you don't
> reserve any memory.
We get 1 here because crash_get_memory_size() adds 1, as below:
size = crashk_res.end - crashk_res.start + 1;
We can't remove this addition, as it is required to display the correct size
when the crash memory is reserved.
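To make the arithmetic concrete, here is a small sketch. It assumes that crashk_res.end holds the last byte of the region (an inclusive end) and that an unreserved crashk_res has start and end both zero; struct range and inclusive_size are hypothetical names, not kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of crashk_res. */
struct range {
	uint64_t start;	/* first byte of the region */
	uint64_t end;	/* last byte of the region (inclusive) */
};

/* Same arithmetic as the line quoted above. */
static uint64_t inclusive_size(struct range r)
{
	return r.end - r.start + 1;
}
```

With a 128 MiB reservation, end is start + 128 MiB - 1, so the +1 is needed to report the full 128 MiB. With start and end both zero, the same expression yields 1, which is the value read back from kexec_crash_size in the report.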
Coming back to the issue: attaching the patch again.
Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
---
[-- Attachment #2: fix-crash_shrink_memory.patch --]
[-- Type: text/x-patch, Size: 371 bytes --]
diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c 2010-06-07 18:45:55.050000000 +0530
+++ b/kernel/kexec.c 2010-06-07 18:50:28.070000004 +0530
@@ -1134,7 +1134,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
 
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Américo Wang @ 2010-06-08 8:54 UTC
To: Pavan Naregundi; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec
On Tue, Jun 08, 2010 at 02:10:05PM +0530, Pavan Naregundi wrote:
>We get 1 here because, crash_get_memory_size() adds 1 as below,
>
>size = crashk_res.end - crashk_res.start + 1;
>
>We cant remove this addition, as it is required to display correct size
>in case if we reserve the crash memory.
Yeah, but 0 is a special case, isn't it?
>
>Coming back to issue.. Attaching the patch again.
>
>Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
>Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Thanks!
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08 9:41 UTC
To: Américo Wang; +Cc: linux-kernel, vgoyal, hbabu, kexec
On Tue, 2010-06-08 at 16:54 +0800, Américo Wang wrote:
> Yeah, but 0 is a special case, isn't it?
Yes, it is a special case.
Prepared a new patch which solves both of these issues:
1. Oops in crash_shrink_memory
2. Make crash_get_memory_size return the correct size when crash
memory is not reserved.
The patch is tested.
Thank You.
Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
---
[-- Attachment #2: fix-kexec.patch --]
[-- Type: text/x-patch, Size: 683 bytes --]
diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c 2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c 2010-06-08 21:19:26.190000043 +0530
@@ -1089,9 +1089,10 @@
 
 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if (crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
 
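The behaviour of the patched getter can be checked in isolation. This is a user-space sketch without the mutex; struct range and patched_size are hypothetical names, and the "shrunk to zero" state is inferred from the crashk_res.end = end - 1 context line with start == end.

```c
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of crashk_res. */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Model of the patched getter: report 0 when start == end. */
static uint64_t patched_size(struct range r)
{
	uint64_t size = 0;

	if (r.end != r.start)
		size = r.end - r.start + 1;
	return size;
}
```

In this model an unreserved region (start == end == 0) reports 0, a normal reservation reports its full size, and a region shrunk to zero (end == start - 1) also reports 0 because end - start + 1 wraps to 0. One side effect of the check is that a one-byte region would also report 0, which seems unlikely to matter if reservations are page-sized.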
* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Simon Horman @ 2010-06-09 3:44 UTC
To: Pavan Naregundi; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec
On Tue, Jun 08, 2010 at 03:11:47PM +0530, Pavan Naregundi wrote:
> Yes, it is a special case.
>
> Prepared a new patch which solves both of this issues.
>
> 1. OOPs in crash_shrink_memory
> 2. make crash_get_memory_size to return correct size, in case of crash
> memory not reserved.
>
> Patch is tested.
>
> Thank You.
>
> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> ---
>
>
>
> diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> --- a/kernel/kexec.c 2010-06-08 21:17:21.850000033 +0530
> +++ b/kernel/kexec.c 2010-06-08 21:19:26.190000043 +0530
> @@ -1089,9 +1089,10 @@
>
> size_t crash_get_memory_size(void)
> {
> - size_t size;
> + size_t size = 0;
> mutex_lock(&kexec_mutex);
> - size = crashk_res.end - crashk_res.start + 1;
> + if(crashk_res.end != crashk_res.start)
> + size = crashk_res.end - crashk_res.start + 1;
Minor style-issue: there should be a space between if and (.
> mutex_unlock(&kexec_mutex);
> return size;
> }
> @@ -1134,7 +1135,7 @@
>
> free_reserved_phys_range(end, crashk_res.end);
>
> - if (start == end)
> + if ((start == end) && (crashk_res.parent != NULL))
> release_resource(&crashk_res);
> crashk_res.end = end - 1;
>
* Re: [PATCH] Fix Oops in crash_shrink_memory
2010-06-09 3:44 ` Simon Horman
@ 2010-06-09 6:27 ` Pavan Naregundi
2010-06-09 14:05 ` Vivek Goyal
2010-06-10 21:26 ` Andrew Morton
0 siblings, 2 replies; 11+ messages in thread
From: Pavan Naregundi @ 2010-06-09 6:27 UTC (permalink / raw)
To: Simon Horman; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec
[-- Attachment #1: Type: text/plain, Size: 5923 bytes --]
On Wed, 2010-06-09 at 12:44 +0900, Simon Horman wrote:
> On Tue, Jun 08, 2010 at 03:11:47PM +0530, Pavan Naregundi wrote:
> > On Tue, 2010-06-08 at 16:54 +0800, Américo Wang wrote:
> > > On Tue, Jun 08, 2010 at 02:10:05PM +0530, Pavan Naregundi wrote:
> > > >On Tue, 2010-06-08 at 15:59 +0800, Américo Wang wrote:
> > > >> On Tue, Jun 08, 2010 at 12:37:37PM +0530, Pavan Naregundi wrote:
> > > >> >Adding CC's..
> > > >> >
> > > >> >
> > > >> >On Mon, 2010-06-07 at 12:58 +0530, Pavan Naregundi wrote:
> > > >> >> Hi Everyone,
> > > >> >>
> > > >> >> Please add me to CC in your reply..
> > > >> >>
> > > >> >> When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
> > > >> >> will generate OOPS message in the kernel. Below is the OOPS message and
> > > >> >> other details,
> > > >> >>
> > > >> >> # cat /proc/cmdline
> > > >> >> ro LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=hvc0
> > > >> >> rhgb root=UUID=eafd9874-010c-46f9-a0c6-ea8db7c61ac3
> > > >> >> # uname -a
> > > >> >> Linux XXXXXXXX 2.6.35-rc1 #1 SMP Mon Jun 7 18:04:53 IST 2010 ppc64 ppc64
> > > >> >> ppc64 GNU/Linux
> > > >> >> # cd /sys/kernel/
> > > >> >> # ls
> > > >> >> debug kexec_loaded profiling uevent_seqnum
> > > >> >> kexec_crash_loaded mm security vmcoreinfo
> > > >> >> kexec_crash_size notes uevent_helper
> > > >> >> # cat kexec_crash_loaded
> > > >> >> 0
> > > >> >> # cat kexec_loaded
> > > >> >> 0
> > > >> >> # cat kexec_crash_size
> > > >> >> 1
> > > >> >> # echo 0 > kexec_crash_size
> > > >> >> Unable to handle kernel paging request for data at address 0x00000030
> > > >> >> Faulting instruction address: 0xc0000000000930b4
> > > >> >> Oops: Kernel access of bad area, sig: 11 [#1]
> > > >> >> SMP NR_CPUS=1024 NUMA pSeries
> > > >> >> last sysfs file: /sys/kernel/kexec_crash_size
> > > >> >> Modules linked in: sunrpc ipv6 ext3 jbd dm_mirror dm_region_hash dm_log
> > > >> >> dm_multipath dm_mod uinput sg ehea sr_mod ibmveth cdrom ext4 jbd2
> > > >> >> mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt [last
> > > >> >> unloaded: scsi_wait_scan]
> > > >> >> NIP: c0000000000930b4 LR: c0000000000930ac CTR: c0000000000b7ce0
> > > >> >> REGS: c0000000b7803750 TRAP: 0300 Not tainted (2.6.35-rc1)
> > > >> >> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 28242482 XER: 20000000
> > > >> >> DAR: 0000000000000030, DSISR: 0000000040000000
> > > >> >> TASK = c0000000b9cc3dc0[1381] 'bash' THREAD: c0000000b7800000 CPU: 12
> > > >> >> GPR00: c0000000000930ac c0000000b78039d0 c000000000e885a8
> > > >> >> c000000000f42950
> > > >> >> GPR04: c0000000b7803af0 0000000000000008 0000000000000002
> > > >> >> c0000000005c6438
> > > >> >> GPR08: 0000000000000000 000000008000000c 0000000000000000
> > > >> >> 0000000000000000
> > > >> >> GPR12: 0000000040242448 c000000007441e00 00000000100f6210
> > > >> >> 0000000000000000
> > > >> >> GPR16: 00000000100f4a38 00000000100cfb98 00000000100f4bdc
> > > >> >> 00000000100f4b4c
> > > >> >> GPR20: 00000000103f4de8 0000000000000000 0000000000000000
> > > >> >> 00000000100f0000
> > > >> >> GPR24: c0000000005c65c0 0000000000000000 c000000000da83e0
> > > >> >> c0000000bc67f780
> > > >> >> GPR28: c0000000bc684ae0 c000000000f42950 c000000000e1e3f8
> > > >> >> c000000000da8408
> > > >> >> NIP [c0000000000930b4] .release_resource+0x34/0xe0
> > > >> >> LR [c0000000000930ac] .release_resource+0x2c/0xe0
> > > >> >> Call Trace:
> > > >> >> [c0000000b78039d0] [c0000000000930ac] .release_resource+0x2c/0xe0
> > > >> >> (unreliable)
> > > >> >> [c0000000b7803a60] [c0000000000d4fc8] .crash_shrink_memory+0x1c8/0x1f0
> > > >> >> [c0000000b7803b30] [c0000000000b7d38] .kexec_crash_size_store+0x58/0x90
> > > >> >> [c0000000b7803bc0] [c0000000002b0bb4] .kobj_attr_store+0x34/0x50
> > > >> >> [c0000000b7803c30] [c000000000226d5c] .sysfs_write_file+0xec/0x1f0
> > > >> >> [c0000000b7803ce0] [c00000000019e0bc] .vfs_write+0xec/0x1f0
> > > >> >> [c0000000b7803d80] [c00000000019e2e8] .SyS_write+0x58/0xb0
> > > >> >> [c0000000b7803e30] [c00000000000852c] syscall_exit+0x0/0x40
> > > >> >> Instruction dump:
> > > >> >> fba1ffe8 fbc1fff0 fbe1fff8 ebc2b228 7c7f1b78 f8010010 f821ff71 ebbe8000
> > > >> >> 7fa3eb78 484e35d9 60000000 e97f0020 <e92b0030> 2fa90000 419e002c
> > > >> >> 7fbf4800
> > > >> >> ---[ end trace afbc780462c9bf4e ]---
> > > >> >>
> > > >> >> When crashkernel is not enabled, crashk_res resource have not been
> > > >> >> reserved. Hence crashk_res.parent will be NULL.
> > > >> >>
> > > >> >> Attaching a simple patch to this problem. Patch is tested and resolves this bug.
> > > >>
> > > >>
> > > >> Ouch...
> > > >>
> > > >> The patch indeed addresses the problem, looks good to me.
> > > >> Please add your Signed-off-by and my:
> > > >>
> > > >> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> > > >>
> > > >> Another problem is that you should get 0 instead of 1 when you don't
> > > >> reserve any memory.
> > > >
> > > >We get 1 here because, crash_get_memory_size() adds 1 as below,
> > > >
> > > >size = crashk_res.end - crashk_res.start + 1;
> > > >
> > > >We cant remove this addition, as it is required to display correct size
> > > >in case if we reserve the crash memory.
> > >
> > > Yeah, but 0 is a special case, isn't it?
> >
> > Yes, it is a special case.
> >
> > Prepared a new patch which solves both of this issues.
> >
> > 1. OOPs in crash_shrink_memory
> > 2. make crash_get_memory_size to return correct size, in case of crash
> > memory not reserved.
> >
> > Patch is tested.
> >
> > + if(crashk_res.end != crashk_res.start)
> > + size = crashk_res.end - crashk_res.start + 1;
>
> Minor style-issue: there should be a space between if and (.
>
Sorry for that.
Resending the patch with fixed style issues.
Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
--
[-- Attachment #2: fix-kexec.patch --]
[-- Type: text/x-patch, Size: 684 bytes --]
diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
@@ -1089,9 +1089,10 @@
 
 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if (crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
 
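
For readers following the size discussion, here is a minimal userspace sketch of the inclusive-range arithmetic that the first hunk changes. The `struct range` type and `range_size()` helper are hypothetical stand-ins for illustration only; they are not kernel code. The sketch assumes only what the thread shows: `end` is inclusive, and an unreserved region has `start == end`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end pair kept in crashk_res. */
struct range {
	uint64_t start;	/* first byte of the region */
	uint64_t end;	/* last byte of the region (inclusive) */
};

/*
 * Same logic as the patched crash_get_memory_size(), minus the locking:
 * a region with start == end is treated as "nothing reserved" and
 * reports 0; otherwise the inclusive end needs the +1.
 */
static size_t range_size(const struct range *r)
{
	size_t size = 0;

	if (r->end != r->start)
		size = r->end - r->start + 1;
	return size;
}
```

With this check, a region spanning 0x2000000 to 0x3ffffff reports 0x2000000 bytes, while the unreserved case reports 0 rather than 1. One consequence of testing `end != start` is that a genuine one-byte region would also report 0, which should not matter for page-sized reservations.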
* Re: [PATCH] Fix Oops in crash_shrink_memory
2010-06-09 6:27 ` Pavan Naregundi
@ 2010-06-09 14:05 ` Vivek Goyal
2010-06-10 21:26 ` Andrew Morton
1 sibling, 0 replies; 11+ messages in thread
From: Vivek Goyal @ 2010-06-09 14:05 UTC (permalink / raw)
To: Pavan Naregundi
Cc: Simon Horman, Américo Wang, linux-kernel, hbabu, kexec
On Wed, Jun 09, 2010 at 11:57:14AM +0530, Pavan Naregundi wrote:
> On Wed, 2010-06-09 at 12:44 +0900, Simon Horman wrote:
> > On Tue, Jun 08, 2010 at 03:11:47PM +0530, Pavan Naregundi wrote:
> > > On Tue, 2010-06-08 at 16:54 +0800, Américo Wang wrote:
> > > > On Tue, Jun 08, 2010 at 02:10:05PM +0530, Pavan Naregundi wrote:
> > > > >On Tue, 2010-06-08 at 15:59 +0800, Américo Wang wrote:
> > > > >> On Tue, Jun 08, 2010 at 12:37:37PM +0530, Pavan Naregundi wrote:
> > > > >> >Adding CC's..
> > > > >> >
> > > > >> >
> > > > >> >On Mon, 2010-06-07 at 12:58 +0530, Pavan Naregundi wrote:
> > > > >> >> Hi Everyone,
> > > > >> >>
> > > > >> >> Please add me to CC in your reply..
> > > > >> >>
> > > > >> >> When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
> > > > >> >> will generate OOPS message in the kernel. Below is the OOPS message and
> > > > >> >> other details,
> > > > >> >>
> > > > >> >> # cat /proc/cmdline
> > > > >> >> ro LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=hvc0
> > > > >> >> rhgb root=UUID=eafd9874-010c-46f9-a0c6-ea8db7c61ac3
> > > > >> >> # uname -a
> > > > >> >> Linux XXXXXXXX 2.6.35-rc1 #1 SMP Mon Jun 7 18:04:53 IST 2010 ppc64 ppc64
> > > > >> >> ppc64 GNU/Linux
> > > > >> >> # cd /sys/kernel/
> > > > >> >> # ls
> > > > >> >> debug kexec_loaded profiling uevent_seqnum
> > > > >> >> kexec_crash_loaded mm security vmcoreinfo
> > > > >> >> kexec_crash_size notes uevent_helper
> > > > >> >> # cat kexec_crash_loaded
> > > > >> >> 0
> > > > >> >> # cat kexec_loaded
> > > > >> >> 0
> > > > >> >> # cat kexec_crash_size
> > > > >> >> 1
> > > > >> >> # echo 0 > kexec_crash_size
> > > > >> >> Unable to handle kernel paging request for data at address 0x00000030
> > > > >> >> Faulting instruction address: 0xc0000000000930b4
> > > > >> >> Oops: Kernel access of bad area, sig: 11 [#1]
> > > > >> >> SMP NR_CPUS=1024 NUMA pSeries
> > > > >> >> last sysfs file: /sys/kernel/kexec_crash_size
> > > > >> >> Modules linked in: sunrpc ipv6 ext3 jbd dm_mirror dm_region_hash dm_log
> > > > >> >> dm_multipath dm_mod uinput sg ehea sr_mod ibmveth cdrom ext4 jbd2
> > > > >> >> mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt [last
> > > > >> >> unloaded: scsi_wait_scan]
> > > > >> >> NIP: c0000000000930b4 LR: c0000000000930ac CTR: c0000000000b7ce0
> > > > >> >> REGS: c0000000b7803750 TRAP: 0300 Not tainted (2.6.35-rc1)
> > > > >> >> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 28242482 XER: 20000000
> > > > >> >> DAR: 0000000000000030, DSISR: 0000000040000000
> > > > >> >> TASK = c0000000b9cc3dc0[1381] 'bash' THREAD: c0000000b7800000 CPU: 12
> > > > >> >> GPR00: c0000000000930ac c0000000b78039d0 c000000000e885a8
> > > > >> >> c000000000f42950
> > > > >> >> GPR04: c0000000b7803af0 0000000000000008 0000000000000002
> > > > >> >> c0000000005c6438
> > > > >> >> GPR08: 0000000000000000 000000008000000c 0000000000000000
> > > > >> >> 0000000000000000
> > > > >> >> GPR12: 0000000040242448 c000000007441e00 00000000100f6210
> > > > >> >> 0000000000000000
> > > > >> >> GPR16: 00000000100f4a38 00000000100cfb98 00000000100f4bdc
> > > > >> >> 00000000100f4b4c
> > > > >> >> GPR20: 00000000103f4de8 0000000000000000 0000000000000000
> > > > >> >> 00000000100f0000
> > > > >> >> GPR24: c0000000005c65c0 0000000000000000 c000000000da83e0
> > > > >> >> c0000000bc67f780
> > > > >> >> GPR28: c0000000bc684ae0 c000000000f42950 c000000000e1e3f8
> > > > >> >> c000000000da8408
> > > > >> >> NIP [c0000000000930b4] .release_resource+0x34/0xe0
> > > > >> >> LR [c0000000000930ac] .release_resource+0x2c/0xe0
> > > > >> >> Call Trace:
> > > > >> >> [c0000000b78039d0] [c0000000000930ac] .release_resource+0x2c/0xe0
> > > > >> >> (unreliable)
> > > > >> >> [c0000000b7803a60] [c0000000000d4fc8] .crash_shrink_memory+0x1c8/0x1f0
> > > > >> >> [c0000000b7803b30] [c0000000000b7d38] .kexec_crash_size_store+0x58/0x90
> > > > >> >> [c0000000b7803bc0] [c0000000002b0bb4] .kobj_attr_store+0x34/0x50
> > > > >> >> [c0000000b7803c30] [c000000000226d5c] .sysfs_write_file+0xec/0x1f0
> > > > >> >> [c0000000b7803ce0] [c00000000019e0bc] .vfs_write+0xec/0x1f0
> > > > >> >> [c0000000b7803d80] [c00000000019e2e8] .SyS_write+0x58/0xb0
> > > > >> >> [c0000000b7803e30] [c00000000000852c] syscall_exit+0x0/0x40
> > > > >> >> Instruction dump:
> > > > >> >> fba1ffe8 fbc1fff0 fbe1fff8 ebc2b228 7c7f1b78 f8010010 f821ff71 ebbe8000
> > > > >> >> 7fa3eb78 484e35d9 60000000 e97f0020 <e92b0030> 2fa90000 419e002c
> > > > >> >> 7fbf4800
> > > > >> >> ---[ end trace afbc780462c9bf4e ]---
> > > > >> >>
> > > > >> >> When crashkernel is not enabled, crashk_res resource have not been
> > > > >> >> reserved. Hence crashk_res.parent will be NULL.
> > > > >> >>
> > > > >> >> Attaching a simple patch to this problem. Patch is tested and resolves this bug.
> > > > >>
> > > > >>
> > > > >> Ouch...
> > > > >>
> > > > >> The patch indeed addresses the problem, looks good to me.
> > > > >> Please add your Signed-off-by and my:
> > > > >>
> > > > >> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> > > > >>
> > > > >> Another problem is that you should get 0 instead of 1 when you don't
> > > > >> reserve any memory.
> > > > >
> > > > >We get 1 here because, crash_get_memory_size() adds 1 as below,
> > > > >
> > > > >size = crashk_res.end - crashk_res.start + 1;
> > > > >
> > > > >We cant remove this addition, as it is required to display correct size
> > > > >in case if we reserve the crash memory.
> > > >
> > > > Yeah, but 0 is a special case, isn't it?
> > >
> > > Yes, it is a special case.
> > >
> > > Prepared a new patch which solves both of this issues.
> > >
> > > 1. OOPs in crash_shrink_memory
> > > 2. make crash_get_memory_size to return correct size, in case of crash
> > > memory not reserved.
> > >
> > > Patch is tested.
> > >
>
> > > + if(crashk_res.end != crashk_res.start)
> > > + size = crashk_res.end - crashk_res.start + 1;
> >
> > Minor style-issue: there should be a space between if and (.
> >
>
> Sorry for that.
>
> Resending the patch with fixed style issues.
>
Looks good to me.
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Thanks
Vivek
> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> --
>
>
* Re: [PATCH] Fix Oops in crash_shrink_memory
2010-06-09 6:27 ` Pavan Naregundi
2010-06-09 14:05 ` Vivek Goyal
@ 2010-06-10 21:26 ` Andrew Morton
2010-06-11 7:30 ` Pavan Naregundi
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2010-06-10 21:26 UTC (permalink / raw)
To: Pavan Naregundi
Cc: Simon Horman, Américo Wang, linux-kernel, vgoyal, hbabu, kexec
On Wed, 09 Jun 2010 11:57:14 +0530
Pavan Naregundi <pavan@linux.vnet.ibm.com> wrote:
> Resending the patch with fixed style issues.
>
> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> --
>
>
>
>
> [fix-kexec.patch text/x-patch (685B)]
> diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> --- a/kernel/kexec.c 2010-06-08 21:17:21.850000033 +0530
> +++ b/kernel/kexec.c 2010-06-09 18:01:37.590007921 +0530
> @@ -1089,9 +1089,10 @@
>
> size_t crash_get_memory_size(void)
> {
> - size_t size;
> + size_t size = 0;
> mutex_lock(&kexec_mutex);
> - size = crashk_res.end - crashk_res.start + 1;
> + if (crashk_res.end != crashk_res.start)
> + size = crashk_res.end - crashk_res.start + 1;
> mutex_unlock(&kexec_mutex);
> return size;
> }
> @@ -1134,7 +1135,7 @@
>
> free_reserved_phys_range(end, crashk_res.end);
>
> - if (start == end)
> + if ((start == end) && (crashk_res.parent != NULL))
> release_resource(&crashk_res);
> crashk_res.end = end - 1;
The patch doesn't have a changelog and I'd prefer not to have to crawl
through the email thread and write one myself.
Please resend, including a full description of the bug and of its fix.
* Re: [PATCH] Fix Oops in crash_shrink_memory
2010-06-10 21:26 ` Andrew Morton
@ 2010-06-11 7:30 ` Pavan Naregundi
0 siblings, 0 replies; 11+ messages in thread
From: Pavan Naregundi @ 2010-06-11 7:30 UTC (permalink / raw)
To: Andrew Morton
Cc: Simon Horman, Américo Wang, linux-kernel, vgoyal, hbabu, kexec
[-- Attachment #1: Type: text/plain, Size: 2147 bytes --]
On Thu, 2010-06-10 at 14:26 -0700, Andrew Morton wrote:
> On Wed, 09 Jun 2010 11:57:14 +0530
> Pavan Naregundi <pavan@linux.vnet.ibm.com> wrote:
>
> > Resending the patch with fixed style issues.
> >
> > Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> > Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> > --
> >
> >
> >
> >
> > [fix-kexec.patch text/x-patch (685B)]
> > diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> > --- a/kernel/kexec.c 2010-06-08 21:17:21.850000033 +0530
> > +++ b/kernel/kexec.c 2010-06-09 18:01:37.590007921 +0530
> > @@ -1089,9 +1089,10 @@
> >
> > size_t crash_get_memory_size(void)
> > {
> > - size_t size;
> > + size_t size = 0;
> > mutex_lock(&kexec_mutex);
> > - size = crashk_res.end - crashk_res.start + 1;
> > + if (crashk_res.end != crashk_res.start)
> > + size = crashk_res.end - crashk_res.start + 1;
> > mutex_unlock(&kexec_mutex);
> > return size;
> > }
> > @@ -1134,7 +1135,7 @@
> >
> > free_reserved_phys_range(end, crashk_res.end);
> >
> > - if (start == end)
> > + if ((start == end) && (crashk_res.parent != NULL))
> > release_resource(&crashk_res);
> > crashk_res.end = end - 1;
>
> The patch doesn't have a changelog and I'd prefer not to have to crawl
> through the email thread and write one myself.
>
> Please resend, including a full description of the bug and of its fix.
Subject: kexec: fix Oops in crash_shrink_memory()
From: Pavan Naregundi <pavan@linux.vnet.ibm.com>
When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
OOPSes the kernel in crash_shrink_memory(). This happens because
crash_shrink_memory() tries to release the 'crashk_res' resource, which was
never reserved, so crashk_res.parent is NULL. In addition,
"/sys/kernel/kexec_crash_size" reads as 1 when it should read as 0.

This patch fixes the OOPS in crash_shrink_memory() and makes
"/sys/kernel/kexec_crash_size" read as 0 when crash kernel memory is not
reserved.

Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
[-- Attachment #2: kexec-fix-oops-in-crash_shrink_memory.patch --]
[-- Type: text/x-patch, Size: 764 bytes --]
diff -uprN a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
@@ -1089,9 +1089,10 @@ void crash_kexec(struct pt_regs *regs)
 
 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if (crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@ int crash_shrink_memory(unsigned long ne
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
 
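
To illustrate why a NULL crashk_res.parent leads to the fault reported in this thread, here is a rough userspace sketch. `struct mock_resource` and `mock_release()` are hypothetical; the field order is an assumption modeled on the kernel's struct resource, and the unlink loop is a simplified illustration rather than the kernel's release_resource(). Under that assumed layout on a 64-bit target, `child` sits at offset 0x30, which would be consistent with the faulting data address 0x00000030 in the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock modeled on the kernel's struct resource; the layout is an assumption. */
struct mock_resource {
	uint64_t start;
	uint64_t end;
	const char *name;
	unsigned long flags;
	struct mock_resource *parent, *sibling, *child;
};

/*
 * Unlink res from its parent's child list.
 * Returns 0 on success, -1 if res is not linked anywhere.
 */
static int mock_release(struct mock_resource *res)
{
	struct mock_resource **p;

	/* Equivalent of the check the patch adds at the call site. */
	if (res->parent == NULL)
		return -1;

	/*
	 * Without the check above, p would be NULL + offsetof(child),
	 * and reading *p would fault at that small address.
	 */
	p = &res->parent->child;
	while (*p) {
		if (*p == res) {
			*p = res->sibling;
			res->parent = NULL;
			return 0;
		}
		p = &(*p)->sibling;
	}
	return -1;
}
```

The sketch puts the NULL check inside the helper for brevity; the patch itself keeps release_resource() untouched and guards the call in crash_shrink_memory() instead.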
Thread overview: 11+ messages
2010-06-07 7:28 [PATCH] Fix Oops in crash_shrink_memory Pavan Naregundi
2010-06-08 7:07 ` Pavan Naregundi
2010-06-08 7:59 ` Américo Wang
2010-06-08 8:40 ` Pavan Naregundi
2010-06-08 8:54 ` Américo Wang
2010-06-08 9:41 ` Pavan Naregundi
2010-06-09 3:44 ` Simon Horman
2010-06-09 6:27 ` Pavan Naregundi
2010-06-09 14:05 ` Vivek Goyal
2010-06-10 21:26 ` Andrew Morton
2010-06-11 7:30 ` Pavan Naregundi