* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
       [not found] <1115769558.26913.1046.camel@dyn318077bld.beaverton.ibm.com>
@ 2005-05-11  2:53 ` Vivek Goyal
  2005-05-11 15:20   ` Badari Pulavarty
  0 siblings, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2005-05-11  2:53 UTC
To: Badari Pulavarty
Cc: fastboot, linux kernel mailing list, Morton Andrew Morton

On Tue, May 10, 2005 at 04:59:18PM -0700, Badari Pulavarty wrote:
> Hi,
>
> I am using kexec+kdump on 2.6.12-rc3-mm3 and it seems to be working
> fine on my 4-way P-III 8GB RAM machine. I did touch testing with
> kexec+kdump and it worked fine. Then ran heavy IO load and forced
> a panic and I was able to collect the dump. But I am not able to
> analyze the dump to find out if I really got a valid dump or not :(
>

Copying to LKML.

Gdb can not open a file larger than 2GB. You have got 8GB RAM hence
/proc/vmcore size must be similar. For testing purposes you can boot first
kernel with mem=2G and then take dump and analyze with gdb.

But we need to work on some crash analysis tools like "crash" to be able
to debug larger files.

> BTW, what architectures kexec+kdump supported ? Does it work on
> x86-64 ?
>

Kexec has been ported to x86-64 but not kdump.

Thanks
Vivek

* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-11  2:53 ` [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3 Vivek Goyal
@ 2005-05-11 15:20   ` Badari Pulavarty
  2005-05-12  5:44     ` Vivek Goyal
  0 siblings, 1 reply; 7+ messages in thread
From: Badari Pulavarty @ 2005-05-11 15:20 UTC
To: vgoyal; +Cc: fastboot, Linux Kernel Mailing List, Morton Andrew Morton

On Tue, 2005-05-10 at 19:53, Vivek Goyal wrote:
> On Tue, May 10, 2005 at 04:59:18PM -0700, Badari Pulavarty wrote:
> > Hi,
> >
> > I am using kexec+kdump on 2.6.12-rc3-mm3 and it seems to be working
> > fine on my 4-way P-III 8GB RAM machine. I did touch testing with
> > kexec+kdump and it worked fine. Then ran heavy IO load and forced
> > a panic and I was able to collect the dump. But I am not able to
> > analyze the dump to find out if I really got a valid dump or not :(
> >
>
> Copying to LKML.
>
> Gdb can not open a file larger than 2GB. You have got 8GB RAM hence
> /proc/vmcore size must be similar. For testing purposes you can boot first
> kernel with mem=2G and then take dump and analyze with gdb.

It's better with mem=2G, but gdb is not really useful :(
I wanted to look at all the processes and their stacks..
It shows me only one stack (not quite right). So I can't
really use the dump for anything :(

(gdb) bt
#0  crash_get_current_regs (regs=0xc04ddbf8) at arch/i386/kernel/crash.c:99
#1  0xc0117fe9 in crash_save_self () at arch/i386/kernel/crash.c:107
#2  0xc0141d14 in crash_kexec () at kernel/kexec.c:1032
#3  0xc0294a44 in __handle_sysrq (key=99, pt_regs=0xc04ddd44, tty=0x0, check_mask=99) at drivers/char/sysrq.c:410
#4  0xc0294b1d in handle_sysrq (key=Variable "key" is not available.
) at drivers/char/sysrq.c:438
#5  0xc029e5ab in receive_chars (up=0xc05672a0, status=0xc04ddcd4, regs=0xc04ddd44) at serial_core.h:395
#6  0xc029e906 in serial8250_interrupt (irq=4, dev_id=0xc0566bc0, regs=0xc04ddd44) at drivers/serial/8250.c:1212
#7  0xc0142995 in handle_IRQ_event (irq=4, regs=0xc04ddd44, action=0xf7cebdc0) at kernel/irq/handle.c:87
#8  0xc0142aa5 in __do_IRQ (irq=4, regs=0xc04ddd44) at kernel/irq/handle.c:172
#9  0xc01065a7 in do_IRQ (regs=0xc04ddd44) at arch/i386/kernel/irq.c:108
#10 0xc0104aaa in common_interrupt () at atomic.h:175
#11 0xf7a7d980 in ?? ()
#12 0x00000000 in ?? ()
#13 0xf6b50760 in ?? ()
#14 0xf62ce800 in ?? ()
#15 0x00000004 in ?? ()
#16 0xc04ddda8 in init_thread_union ()
#17 0xf62ce868 in ?? ()
#18 0x0000007b in ?? ()
#19 0xf490007b in ?? ()
#20 0xffffff04 in ?? ()
#21 0xc036c052 in tcp_clean_rtx_queue (sk=0xf62ce800, seq_rtt_p=0xc04dddd4) at skbuff.h:677
#22 0xc036c9f0 in tcp_ack (sk=0xf62ce800, skb=Variable "skb" is not available.
) at net/ipv4/tcp_input.c:2938
#23 0xc036f566 in tcp_rcv_established (sk=0xf62ce800, skb=0xf7995980, th=0xf1593a34, len=32) at net/ipv4/tcp_input.c:4285
#24 0xc03783d0 in tcp_v4_do_rcv (sk=0xf62ce800, skb=0xf7995980) at net/ipv4/tcp_ipv4.c:1676
#25 0xc0378cf2 in tcp_v4_rcv (skb=0xf7995980) at net/ipv4/tcp_ipv4.c:1785
#26 0xc035c1f6 in ip_local_deliver (skb=0xf7995980) at net/ipv4/ip_input.c:242
#27 0xc035c89c in ip_rcv (skb=Variable "skb" is not available.
) at dst.h:246
#28 0xc034031a in netif_receive_skb (skb=0xf7995980) at net/core/dev.c:1713
#29 0xc03403d6 in process_backlog (backlog_dev=0xc600f52c, budget=0xc04ddf3c) at net/core/dev.c:1746
#30 0xc034058e in net_rx_action (h=Variable "h" is not available.
) at net/core/dev.c:1795
#31 0xc0127432 in __do_softirq () at kernel/softirq.c:95
#32 0xc01274e8 in do_softirq () at kernel/softirq.c:129
#33 0xc01065ac in do_IRQ (regs=0xc04ddf94) at arch/i386/kernel/irq.c:110
#34 0xc0104aaa in common_interrupt () at atomic.h:175
#35 0xffffe000 in ?? ()
#36 0xc600c2e0 in ?? ()
#37 0xc0101fb0 in enable_hlt () at arch/i386/kernel/process.c:93
#38 0xc010209c in cpu_idle () at arch/i386/kernel/process.c:199
#39 0xc04de9b5 in start_kernel () at init/main.c:527
#40 0xc010020e in is386 () at arch/i386/kernel/head.S:327
#41 0x00000000 in ?? ()
#42 0x00000000 in ?? ()
#43 0x00000000 in ?? ()
#44 0x00000000 in ?? ()
#45 0x00000000 in ?? ()
#46 0x00000000 in ?? ()
#47 0x00000000 in ?? ()
#48 0x00000000 in ?? ()
#49 0x00000000 in ?? ()
#50 0x00000000 in ?? ()
#51 0x00000000 in ?? ()
#52 0x00000000 in ?? ()
#53 0x00000000 in ?? ()
#54 0x00000000 in ?? ()
#55 0x00000000 in ?? ()
#56 0x00000000 in ?? ()
#57 0x00000000 in ?? ()
#58 0x00000000 in ?? ()
#59 0x00000000 in ?? ()
#60 0x00000000 in ?? ()
...

>
> But we need to work on some crash analysis tools like "crash" to be able
> to debug larger files.

Is someone working on this too?

> > BTW, what architectures kexec+kdump supported ? Does it work on
> > x86-64 ?
> >
>
> Kexec has been ported to x86-64 but not kdump.

:(

Thanks,
Badari

* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-11 15:20   ` Badari Pulavarty
@ 2005-05-12  5:44     ` Vivek Goyal
  2005-05-12 10:22       ` Maneesh Soni
  0 siblings, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2005-05-12  5:44 UTC
To: Badari Pulavarty
Cc: vgoyal, fastboot, Linux Kernel Mailing List, Morton Andrew Morton

On Wed, May 11, 2005 at 08:20:50AM -0700, Badari Pulavarty wrote:
> On Tue, 2005-05-10 at 19:53, Vivek Goyal wrote:
> > On Tue, May 10, 2005 at 04:59:18PM -0700, Badari Pulavarty wrote:
> > > Hi,
> > >
> > > I am using kexec+kdump on 2.6.12-rc3-mm3 and it seems to be working
> > > fine on my 4-way P-III 8GB RAM machine. I did touch testing with
> > > kexec+kdump and it worked fine. Then ran heavy IO load and forced
> > > a panic and I was able to collect the dump. But I am not able to
> > > analyze the dump to find out if I really got a valid dump or not :(
> > >
> >
> > Copying to LKML.
> >
> > Gdb can not open a file larger than 2GB. You have got 8GB RAM hence
> > /proc/vmcore size must be similar. For testing purposes you can boot first
> > kernel with mem=2G and then take dump and analyze with gdb.
>
> It's better with mem=2G, but gdb is not really useful :(
> I wanted to look at all the processes and their stacks..
> It shows me only one stack (not quite right). So I can't
> really use the dump for anything :(
>

You can run "info thread" to see how many cpus are there. Use "thread" to
switch to a different thread and then run "bt" to see the stack of
that thread. We have observed some issues with this. You will see proper
stack only if other cpus were not running swapper thread (pid 0).

For seeing the stack of all the processes, I guess macros need to be written
which traverse the task list, retrieve stack pointer and then trace back. I
have not tried it though.

> >
> > But we need to work on some crash analysis tools like "crash" to be able
> > to debug larger files.
>
> Is someone working on this too?
>

Yes, we are looking into this.

Thanks
Vivek

* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-12  5:44     ` Vivek Goyal
@ 2005-05-12 10:22       ` Maneesh Soni
  2005-05-12 11:28         ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 7+ messages in thread
From: Maneesh Soni @ 2005-05-12 10:22 UTC
To: Vivek Goyal
Cc: Badari Pulavarty, Morton Andrew Morton, fastboot, Linux Kernel Mailing List

On Thu, May 12, 2005 at 11:14:24AM +0530, Vivek Goyal wrote:
> On Wed, May 11, 2005 at 08:20:50AM -0700, Badari Pulavarty wrote:
> > On Tue, 2005-05-10 at 19:53, Vivek Goyal wrote:
> > > On Tue, May 10, 2005 at 04:59:18PM -0700, Badari Pulavarty wrote:
> > > > Hi,
> > > >
> > > > I am using kexec+kdump on 2.6.12-rc3-mm3 and it seems to be working
> > > > fine on my 4-way P-III 8GB RAM machine. I did touch testing with
> > > > kexec+kdump and it worked fine. Then ran heavy IO load and forced
> > > > a panic and I was able to collect the dump. But I am not able to
> > > > analyze the dump to find out if I really got a valid dump or not :(
> > > >
> > >
> > > Copying to LKML.
> > >
> > > Gdb can not open a file larger than 2GB. You have got 8GB RAM hence
> > > /proc/vmcore size must be similar. For testing purposes you can boot first
> > > kernel with mem=2G and then take dump and analyze with gdb.
> >
> > It's better with mem=2G, but gdb is not really useful :(
> > I wanted to look at all the processes and their stacks..
> > It shows me only one stack (not quite right). So I can't
> > really use the dump for anything :(
> >
>
> You can run "info thread" to see how many cpus are there. Use "thread" to
> switch to a different thread and then run "bt" to see the stack of
> that thread. We have observed some issues with this. You will see proper
> stack only if other cpus were not running swapper thread (pid 0).
>
> For seeing the stack of all the processes, I guess macros need to be written
> which traverse the task list, retrieve stack pointer and then trace back. I
> have not tried it though.

Following is a somewhat crude user defined command to dump stack for all the
processes in the crashdump

(gdb) define ps
Type commands for definition of "ps".
End with a line saying just "end".
>set $tasks_off=((size_t)&((struct task_struct *)0)->tasks)
>set $init_t=&init_task
>set $next_t=(((char *)($init_t->tasks).next) - $tasks_off)
>while ($next_t != $init_t)
 >set $next_t=(struct task_struct *)$next_t
 >print $next_t.comm
 >print $next_t.pid
 >x/40x $next_t.thread.esp
 >set $next_t=(char *)($next_t->tasks.next) - $tasks_off
 >end
>end

(gdb) ps
$1 = "init\000er\000\000\000\000\000\000\000\000"
$2 = 1
0xeff9fe5c:     0xeff9fea8      0x00000086      0xeff9fe74      0x00000286
0xeff9fe6c:     0xc4608e00      0xeff9febc      0xeff9fe88      0xc0126f53
0xeff9fe7c:     0x00242e9d      0xc4608420      0x00000e39      0xfff54405
0xeff9fe8c:     0x0000026a      0xc0405c00      0xeffd1a30      0xeffd1b58
0xeff9fe9c:     0x00242e9d      0xeff9febc      0x0000000b      0xeff9fee4
0xeff9feac:     0xc03a1a70      0xefdd200c      0xefd95000      0xeff9fecc
0xeff9febc:     0xc4609780      0xc4609780      0x00242e9d      0x4b87ad6e
0xeff9fecc:     0xc0127ad0      0xeffd1a30      0xc4608e00      0xee45e3c0
0xeff9fedc:     0x00000000      0x00000000      0xeff9ff60      0xc01707e6
0xeff9feec:     0x00000000      0x00000000      0x00000400      0x00000000
$3 = "migration/0\000\000\000\000"
$4 = 2
0xeffa7f5c:     0xeffa7fa8      0x00000046      0xc4608420      0xc4608420
0xeffa7f6c:     0x00000082      0xeffa7f8c      0xe69fff54      0x00000000
0xeffa7f7c:     0xe7b77a70      0xc4608420      0x0000031e      0xb806bf4f
0xeffa7f8c:     0x00000161      0xe7b77a70      0xeffd7550      0xeffd7678
0xeffa7f9c:     0xc4608d6c      0xc4608420      0xeffa6000      0xeffa7fc0
0xeffa7fac:     0xc011a632      0x00000000      0xeffa6000      0xeff9ff44
0xeffa7fbc:     0x00000000      0xeffa7fe4      0xc0132c36      0xfffffffc
0xeffa7fcc:     0xc011a5b0      0xffffffff      0xffffffff      0xc0132ba0
0xeffa7fdc:     0x00000000      0x00000000      0x00000000      0xc0101145
0xeffa7fec:     0xeff9ff3c      0x00000000      0x00000000      0xb7fc938d

Thanks
Maneesh

-- 
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-25044990

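The pointer arithmetic in the "ps" command above (subtract the offset of the embedded tasks member from a list pointer to get back to the enclosing task_struct) can be tried outside gdb. What follows is only a minimal sketch, assuming a toy three-field Task structure built with Python's ctypes; the names Task, ListHead, task_of and walk_tasks are made up for illustration and are not kernel or gdb identifiers.

```python
import ctypes


class ListHead(ctypes.Structure):
    """Stand-in for the kernel's struct list_head (circular, doubly linked)."""


ListHead._fields_ = [
    ("next", ctypes.POINTER(ListHead)),
    ("prev", ctypes.POINTER(ListHead)),
]


class Task(ctypes.Structure):
    """Toy task_struct holding only the fields the walk touches."""
    _fields_ = [
        ("pid", ctypes.c_int),
        ("comm", ctypes.c_char * 16),
        ("tasks", ListHead),
    ]


# Same role as $tasks_off in the gdb command: offset of 'tasks' inside Task.
TASKS_OFF = Task.tasks.offset


def task_of(list_ptr):
    """Map a pointer to an embedded ListHead back to its enclosing Task."""
    addr = ctypes.addressof(list_ptr.contents) - TASKS_OFF
    return Task.from_address(addr)


def walk_tasks(init_task):
    """Visit every task on the ring except init_task itself."""
    seen = []
    init_addr = ctypes.addressof(init_task)
    cur = task_of(init_task.tasks.next)
    while ctypes.addressof(cur) != init_addr:
        seen.append((cur.pid, cur.comm.decode()))
        cur = task_of(cur.tasks.next)
    return seen


# Build a small circular list: swapper -> init -> migration/0 -> swapper.
all_tasks = [Task(0, b"swapper"), Task(1, b"init"), Task(2, b"migration/0")]
for cur, nxt in zip(all_tasks, all_tasks[1:] + all_tasks[:1]):
    cur.tasks.next = ctypes.pointer(nxt.tasks)
    nxt.tasks.prev = ctypes.pointer(cur.tasks)

walked = walk_tasks(all_tasks[0])
print(walked)
```

As in the gdb output above, the walk starts at the entry after the list head, so the head task (pid 0 in this toy setup) is never printed.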
* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-12 10:22       ` Maneesh Soni
@ 2005-05-12 11:28         ` Srivatsa Vaddagiri
  2005-05-14 10:33           ` Alexander Nyberg
  0 siblings, 1 reply; 7+ messages in thread
From: Srivatsa Vaddagiri @ 2005-05-12 11:28 UTC
To: Maneesh Soni
Cc: Vivek Goyal, Badari Pulavarty, Morton Andrew Morton, fastboot, Linux Kernel Mailing List

On Thu, May 12, 2005 at 10:25:37AM +0000, Maneesh Soni wrote:
> Following is a somewhat crude user defined command to dump stack for all the
> processes in the crashdump
>
>
> (gdb) define ps
> Type commands for definition of "ps".
> End with a line saying just "end".
> >set $tasks_off=((size_t)&((struct task_struct *)0)->tasks)
> >set $init_t=&init_task
> >set $next_t=(((char *)($init_t->tasks).next) - $tasks_off)
> >while ($next_t != $init_t)
>  >set $next_t=(struct task_struct *)$next_t
>  >print $next_t.comm
>  >print $next_t.pid
>  >x/40x $next_t.thread.esp
>  >set $next_t=(char *)($next_t->tasks.next) - $tasks_off
>  >end
> >end

Probably you need another loop here for iterating through all the
threads of a task? do_each_thread/while_each_thread macros give
the details.

Basically the macros can be modified as:

set $tasks_off=((size_t)&((struct task_struct *)0)->tasks)
set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next)
set $init_t=&init_task
set $next_t=(((char *)($init_t->tasks).next) - $tasks_off)
while ($next_t != $init_t)
  set $next_t=(struct task_struct *)$next_t
  printf "\n%s:\n", $next_t.comm
  printf "PID = %d\n", $next_t.pid
  printf "Stack dump:\n"
  x/40x $next_t.thread.esp
  set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off)
  while ($next_th != $next_t)
    set $next_th=(struct task_struct *)$next_th
    printf "\n%s:\n", $next_th.comm
    printf "PID = %d\n", $next_th.pid
    printf "Stack dump:\n"
    x/40x $next_th.thread.esp
    set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off)
  end
  set $next_t=(char *)($next_t->tasks.next) - $tasks_off
end

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

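The control flow of the modified macro is an outer walk over the task ring with an inner walk over each task's own thread ring. A rough sketch of just that loop shape follows, assuming plain Python objects in place of kernel memory; next_task and next_thread are hypothetical stand-ins for tasks.next and pids[1].pid_list.next, and the pids and names are invented.

```python
class Task:
    """Toy task with two ring links; both point to itself until linked."""

    def __init__(self, pid, comm):
        self.pid = pid
        self.comm = comm
        self.next_task = self    # stand-in for tasks.next
        self.next_thread = self  # stand-in for pids[1].pid_list.next


def link_ring(nodes, attr):
    """Link nodes into a circular singly linked ring through attribute attr."""
    for cur, nxt in zip(nodes, nodes[1:] + nodes[:1]):
        setattr(cur, attr, nxt)


def each_thread(init_task):
    """Yield every task after init_task, each followed by its other threads."""
    t = init_task.next_task
    while t is not init_task:
        yield t
        th = t.next_thread
        while th is not t:       # inner loop stops when it is back at t
            yield th
            th = th.next_thread
        t = t.next_task


init_task = Task(0, "swapper")
init = Task(1, "init")
worker = Task(100, "worker")
worker_t1 = Task(101, "worker")
worker_t2 = Task(102, "worker")

link_ring([init_task, init, worker], "next_task")
link_ring([worker, worker_t1, worker_t2], "next_thread")

visited_pids = [t.pid for t in each_thread(init_task)]
print(visited_pids)
```

A task with no extra threads has a thread ring that points back to itself, so the inner loop body never runs for it; that mirrors the `while ($next_th != $next_t)` test in the macro.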
* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-12 11:28         ` Srivatsa Vaddagiri
@ 2005-05-14 10:33           ` Alexander Nyberg
  2005-05-16  8:36             ` Maneesh Soni
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Nyberg @ 2005-05-14 10:33 UTC
To: vatsa
Cc: Linux Kernel Mailing List, fastboot, Andrew Morton, Badari Pulavarty, Vivek Goyal, Maneesh Soni

> Probably you need another loop here for iterating through all the
> threads of a task? do_each_thread/while_each_thread macros give
> the details.
>
> Basically the macros can be modified as:
>
> set $tasks_off=((size_t)&((struct task_struct *)0)->tasks)
> set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next)
> set $init_t=&init_task
> set $next_t=(((char *)($init_t->tasks).next) - $tasks_off)
> while ($next_t != $init_t)
>   set $next_t=(struct task_struct *)$next_t
>   printf "\n%s:\n", $next_t.comm
>   printf "PID = %d\n", $next_t.pid
>   printf "Stack dump:\n"
>   x/40x $next_t.thread.esp
>   set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off)
>   while ($next_th != $next_t)
>     set $next_th=(struct task_struct *)$next_th
>     printf "\n%s:\n", $next_th.comm
>     printf "PID = %d\n", $next_th.pid
>     printf "Stack dump:\n"
>     x/40x $next_th.thread.esp
>     set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off)
>   end
>   set $next_t=(char *)($next_t->tasks.next) - $tasks_off
> end

When looking at this I thought of what information we want to save in
the ELF header to be examined after a crash. I'm currently working on
some patches to save all threads in the kernel down into the PT_NOTE
section. But do we really need this if we instead have a suite of gdb
scripts and other user-space analyzers that can find the requested
information?

This would simplify aspects such as not having to fiddle with the crash
ELF header in the kernel. I can't see a gain in dumping a bunch of
NT_PRSTATUS or NT_TASKSTRUCT in the notes section if we can find the
same information in gdb/user-space after a crash-dump.

We should be able to get (symbol) backtraces from each task with
some gdb scripts too, so I think most analysis can be done from
user-space. Am I missing something?

All tasks running on a cpu should be saved into NT_PRSTATUS notes of course,
as they are currently. They have a (very) useful pt_regs plus some other
information such as trap number causing the panic or the address of
where the fault occurred.

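For readers unfamiliar with the notes being discussed: each entry in a PT_NOTE segment is a small header (name size, descriptor size, type) followed by the name and the descriptor, each padded to a 4-byte boundary in 32-bit ELF. The following is a minimal sketch of walking such a segment, assuming a little-endian 32-bit layout as on i386; the descriptor payloads are placeholder bytes invented for the example, not real elf_prstatus data, and pack_note and parse_notes are illustrative helper names.

```python
import struct

# Standard ELF core note types.
NT_PRSTATUS = 1
NT_TASKSTRUCT = 4


def align4(n):
    """Round n up to the next multiple of 4 (ELF32 note padding)."""
    return (n + 3) & ~3


def pack_note(name, note_type, desc):
    """Build one note: 12-byte header, NUL-terminated name, then descriptor."""
    name_bytes = name + b"\0"
    header = struct.pack("<III", len(name_bytes), len(desc), note_type)
    return (header
            + name_bytes.ljust(align4(len(name_bytes)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))


def parse_notes(data):
    """Return (name, type, desc) for every note in a PT_NOTE byte string."""
    notes = []
    off = 0
    while off + 12 <= len(data):
        namesz, descsz, note_type = struct.unpack_from("<III", data, off)
        off += 12
        name = data[off:off + namesz].rstrip(b"\0")
        off += align4(namesz)
        desc = data[off:off + descsz]
        off += align4(descsz)
        notes.append((name, note_type, desc))
    return notes


# Two made-up per-cpu notes, as if one NT_PRSTATUS had been saved per cpu.
segment = (pack_note(b"CORE", NT_PRSTATUS, b"cpu0-regs")
           + pack_note(b"CORE", NT_PRSTATUS, b"cpu1-regs!"))
notes = parse_notes(segment)
for name, note_type, desc in notes:
    print(name, note_type, len(desc))
```

The point of the sketch is only the framing: a user-space tool can enumerate whatever notes the kernel chose to save, so the set of notes can stay small if the remaining state is recoverable from memory with debugger scripts.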
* Re: [Fastboot] kexec+kdump testing with 2.6.12-rc3-mm3
  2005-05-14 10:33           ` Alexander Nyberg
@ 2005-05-16  8:36             ` Maneesh Soni
  0 siblings, 0 replies; 7+ messages in thread
From: Maneesh Soni @ 2005-05-16  8:36 UTC
To: Alexander Nyberg
Cc: vatsa, Linux Kernel Mailing List, fastboot, Andrew Morton, Badari Pulavarty, Vivek Goyal

On Sat, May 14, 2005 at 12:33:16PM +0200, Alexander Nyberg wrote:
> > Probably you need another loop here for iterating through all the
> > threads of a task? do_each_thread/while_each_thread macros give
> > the details.
> >
> > Basically the macros can be modified as:
> >
> > set $tasks_off=((size_t)&((struct task_struct *)0)->tasks)
> > set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next)
> > set $init_t=&init_task
> > set $next_t=(((char *)($init_t->tasks).next) - $tasks_off)
> > while ($next_t != $init_t)
> >   set $next_t=(struct task_struct *)$next_t
> >   printf "\n%s:\n", $next_t.comm
> >   printf "PID = %d\n", $next_t.pid
> >   printf "Stack dump:\n"
> >   x/40x $next_t.thread.esp
> >   set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off)
> >   while ($next_th != $next_t)
> >     set $next_th=(struct task_struct *)$next_th
> >     printf "\n%s:\n", $next_th.comm
> >     printf "PID = %d\n", $next_th.pid
> >     printf "Stack dump:\n"
> >     x/40x $next_th.thread.esp
> >     set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off)
> >   end
> >   set $next_t=(char *)($next_t->tasks.next) - $tasks_off
> > end
>
> When looking at this I thought of what information we want to save in
> the ELF header to be examined after a crash. I'm currently working on
> some patches to save all threads in the kernel down into the PT_NOTE
> section. But do we really need this if we instead have a suite of gdb
> scripts and other user-space analyzers that can find the requested
> information?
>

That's right, at the time of the crash we should try to do only the minimum
that is absolutely necessary. Analysis tools like gdb or crash should
be able to extract or format useful things from the dump.

> This would simplify aspects such as not having to fiddle with the crash
> ELF header in the kernel. I can't see a gain in dumping a bunch of
> NT_PRSTATUS or NT_TASKSTRUCT in the notes section if we can find the
> same information in gdb/user-space after a crash-dump.
>
> We should be able to get (symbol) backtraces from each task with
> some gdb scripts too, so I think most analysis can be done from
> user-space. Am I missing something?
>

no...

Thanks
Maneesh

-- 
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-25044990