linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
@ 2005-09-21  6:56 Vivek Goyal
  2005-09-21 14:28 ` [Fastboot] " Eric W. Biederman
  0 siblings, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2005-09-21  6:56 UTC (permalink / raw)
  To: Morton Andrew Morton
  Cc: linux kernel mailing list, Fastboot mailing list, Anderson Dave Anderson



o This patch adds a new note type NT_KDUMPINFO. This note is added with 
  kernel core dumps generated by kdump.

o This note mainly communicates the information required by kernel dump
  analysis tool "crash" to analyze the kernel core dump. As of now it contains 
  the pointer to task struct of panicing task and page size. Page size is 
  irrelevant for i386 but is required for architectures like ia64 and ppc64. 

o gdb is not affected by this change as gdb need not to parse this note.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
---

 linux-2.6.14-rc1-mm1-vivek/arch/i386/kernel/crash.c |    9 +++++++++
 linux-2.6.14-rc1-mm1-vivek/include/linux/elf.h      |    2 +-
 linux-2.6.14-rc1-mm1-vivek/include/linux/elfcore.h  |   13 +++++++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff -puN arch/i386/kernel/crash.c~kdump-x86-add-note-type-nt-kdump arch/i386/kernel/crash.c
--- linux-2.6.14-rc1-mm1/arch/i386/kernel/crash.c~kdump-x86-add-note-type-nt-kdump	2005-09-21 12:22:14.113474136 +0530
+++ linux-2.6.14-rc1-mm1-vivek/arch/i386/kernel/crash.c	2005-09-21 12:22:14.128471856 +0530
@@ -62,6 +62,7 @@ static void final_note(u32 *buf)
 static void crash_save_this_cpu(struct pt_regs *regs, int cpu)
 {
 	struct elf_prstatus prstatus;
+	struct elf_kdumpinfo kdumpinfo;
 	u32 *buf;
 
 	if ((cpu < 0) || (cpu >= NR_CPUS))
@@ -80,6 +81,14 @@ static void crash_save_this_cpu(struct p
 	elf_core_copy_regs(&prstatus.pr_reg, regs);
 	buf = append_elf_note(buf, "CORE", NT_PRSTATUS, &prstatus,
 				sizeof(prstatus));
+
+	/* Also store NT_KDUMPINFO */
+	if (cpu == crashing_cpu) {
+		kdumpinfo.page_shift = PAGE_SHIFT;
+		kdumpinfo.panic_tsk = current;
+		buf = append_elf_note(buf, "CORE", NT_KDUMPINFO, &kdumpinfo,
+					sizeof(kdumpinfo));
+	}
 	final_note(buf);
 }
 
diff -puN include/linux/elfcore.h~kdump-x86-add-note-type-nt-kdump include/linux/elfcore.h
--- linux-2.6.14-rc1-mm1/include/linux/elfcore.h~kdump-x86-add-note-type-nt-kdump	2005-09-21 12:22:14.116473680 +0530
+++ linux-2.6.14-rc1-mm1-vivek/include/linux/elfcore.h	2005-09-21 12:22:14.129471704 +0530
@@ -79,9 +79,22 @@ struct elf_prpsinfo
 	char	pr_psargs[ELF_PRARGSZ];	/* initial part of arg list */
 };
 
+/*
+ * NT_KDUMPINFO can be used to communicate additional information required for
+ * kernel core dumps. Additional information includes kernel configuration
+ * variables like page size and any other relevant data required by
+ * debugger (crash in this case).
+*/
+struct elf_kdumpinfo
+{
+	char page_shift;		/* Page size */
+	struct task_struct *panic_tsk;	/* Pointer to panic task_struct */
+};
+
 #ifndef __KERNEL__
 typedef struct elf_prstatus prstatus_t;
 typedef struct elf_prpsinfo prpsinfo_t;
+typedef struct elf_kdumpinfo kdumpinfo_t;
 #define PRARGSZ ELF_PRARGSZ 
 #endif
 
diff -puN include/linux/elf.h~kdump-x86-add-note-type-nt-kdump include/linux/elf.h
--- linux-2.6.14-rc1-mm1/include/linux/elf.h~kdump-x86-add-note-type-nt-kdump	2005-09-21 12:22:14.120473072 +0530
+++ linux-2.6.14-rc1-mm1-vivek/include/linux/elf.h	2005-09-21 12:22:14.130471552 +0530
@@ -391,7 +391,7 @@ typedef struct elf64_shdr {
 #define NT_TASKSTRUCT	4
 #define NT_AUXV		6
 #define NT_PRXFPREG     0x46e62b7f      /* copied from gdb5.1/include/elf/common.h */
-
+#define NT_KDUMPINFO	7		/* Used for kernel core dumps */
 
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
_

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
  2005-09-21  6:56 [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps Vivek Goyal
@ 2005-09-21 14:28 ` Eric W. Biederman
  2005-09-21 15:17   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel " Dave Anderson
  2005-09-22  7:39   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel " Vivek Goyal
  0 siblings, 2 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-21 14:28 UTC (permalink / raw)
  To: vgoyal
  Cc: Morton Andrew Morton, Anderson Dave Anderson,
	Fastboot mailing list, linux kernel mailing list

Vivek Goyal <vgoyal@in.ibm.com> writes:

> o This patch adds a new note type NT_KDUMPINFO. This note is added with 
>   kernel core dumps generated by kdump.
>
> o This note mainly communicates the information required by kernel dump
>   analysis tool "crash" to analyze the kernel core dump. As of now it contains 
>   the pointer to task struct of panicing task and page size. Page size is 
>   irrelevant for i386 but is required for architectures like ia64 and ppc64. 
>
> o gdb is not affected by this change as gdb need not to parse this note.

A couple of things.
- The name of your note is terribly generic, so it seems a poor choice.

- Why do we need to capture the page size at the time of the
  crash?  Isn't the page size a compile time option?  Won't
  sys_getpagesize() get you this information before the crash?  
  Why do we need the kernels page size at all?

- Why do you avoid storing the current task on the other cpus?

- Can't we derive the current task from the existing register information
  already captured.  If not would a little extra debug information
  captured at compile time be better?

- You don't address the issue of architectural control registers.
  So you are going to need another note at some point. (Not 
  necessarily a bad thing).


> +/*
> + * NT_KDUMPINFO can be used to communicate additional information required for
> + * kernel core dumps. Additional information includes kernel configuration
> + * variables like page size and any other relevant data required by
> + * debugger (crash in this case).
> +*/
> +struct elf_kdumpinfo
> +{
> +	char page_shift;		/* Page size */
> +	struct task_struct *panic_tsk;	/* Pointer to panic task_struct */
> +};
> +

>  #define NT_TASKSTRUCT	4
>  #define NT_AUXV		6
>  #define NT_PRXFPREG 0x46e62b7f /* copied from gdb5.1/include/elf/common.h */
> -
> +#define NT_KDUMPINFO	7		/* Used for kernel core dumps */
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel  core dumps
  2005-09-21 14:28 ` [Fastboot] " Eric W. Biederman
@ 2005-09-21 15:17   ` Dave Anderson
  2005-09-22  9:01     ` Eric W. Biederman
  2005-09-22  7:39   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel " Vivek Goyal
  1 sibling, 1 reply; 21+ messages in thread
From: Dave Anderson @ 2005-09-21 15:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: vgoyal, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

"Eric W. Biederman" wrote:

> Vivek Goyal <vgoyal@in.ibm.com> writes:
>
> > o This patch adds a new note type NT_KDUMPINFO. This note is added with
> >   kernel core dumps generated by kdump.
> >
> > o This note mainly communicates the information required by kernel dump
> >   analysis tool "crash" to analyze the kernel core dump. As of now it contains
> >   the pointer to task struct of panicing task and page size. Page size is
> >   irrelevant for i386 but is required for architectures like ia64 and ppc64.
> >

It's not absolutely required for crash -- but more of a convenience.  As it is, the
set of NT_PRSTATUS sections can be scoured for the set of active
tasks, and then the process stacks (and potentially the IRQ stacks, then
mapped back to the proper process stack) can be searched for crash_kexec.
So crash works without the task_struct pointer, but it's ugly.  It's just that
netdump, diskdump and LKCD all report the panic task_struct pointer,
and it seems a reasonable thing to do.

>
> > o gdb is not affected by this change as gdb need not to parse this note.
>
> A couple of things.
> - The name of your note is terribly generic, so it seems a poor choice.
>
> - Why do we need to capture the page size at the time of the
>   crash?  Isn't the page size a compile time option?  Won't
>   sys_getpagesize() get you this information before the crash?
>   Why do we need the kernels page size at all?
>

The page size is needed for a whole host of things, primarily for
virtual-to-physical address translations, a bunch of VM-related
commands, etc.  For non-x86[_64] systems. the host system may
be using a different page size than the vmcore being examining.
That's probably a rare situation, and using getpagesize() on the
host would work in most cases, or perhaps issuing the page size
as a command line option, but again, it's another convenience item
that gets reported by netdump, diskdump and LKCD.

>
> - Why do you avoid storing the current task on the other cpus?
>

They can be found in the per-cpu runqueues data structure.

>
> - Can't we derive the current task from the existing register information
>   already captured.  If not would a little extra debug information
>   captured at compile time be better?
>
> - You don't address the issue of architectural control registers.
>   So you are going to need another note at some point. (Not
>   necessarily a bad thing).
>

I guess that's their whole point.  This NT_KDUMPINFO or whatever
you want to call it, could cumulatively collect any other data deemed
necessary that falls outside the bounds of the existing note types.

Dave Anderson

>
> > +/*
> > + * NT_KDUMPINFO can be used to communicate additional information required for
> > + * kernel core dumps. Additional information includes kernel configuration
> > + * variables like page size and any other relevant data required by
> > + * debugger (crash in this case).
> > +*/
> > +struct elf_kdumpinfo
> > +{
> > +     char page_shift;                /* Page size */
> > +     struct task_struct *panic_tsk;  /* Pointer to panic task_struct */
> > +};
> > +
>
> >  #define NT_TASKSTRUCT        4
> >  #define NT_AUXV              6
> >  #define NT_PRXFPREG 0x46e62b7f /* copied from gdb5.1/include/elf/common.h */
> > -
> > +#define NT_KDUMPINFO 7               /* Used for kernel core dumps */
> >


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
  2005-09-21 14:28 ` [Fastboot] " Eric W. Biederman
  2005-09-21 15:17   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel " Dave Anderson
@ 2005-09-22  7:39   ` Vivek Goyal
  2005-09-22  7:46     ` Andrew Morton
  2005-09-22  9:11     ` Eric W. Biederman
  1 sibling, 2 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-22  7:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Morton Andrew Morton, Anderson Dave Anderson,
	Fastboot mailing list, linux kernel mailing list

On Wed, Sep 21, 2005 at 08:28:32AM -0600, Eric W. Biederman wrote:
> Vivek Goyal <vgoyal@in.ibm.com> writes:
> 
> > o This patch adds a new note type NT_KDUMPINFO. This note is added with 
> >   kernel core dumps generated by kdump.
> >
> > o This note mainly communicates the information required by kernel dump
> >   analysis tool "crash" to analyze the kernel core dump. As of now it contains 
> >   the pointer to task struct of panicing task and page size. Page size is 
> >   irrelevant for i386 but is required for architectures like ia64 and ppc64. 
> >
> > o gdb is not affected by this change as gdb need not to parse this note.
> 
> A couple of things.
> - The name of your note is terribly generic, so it seems a poor choice.
> 

Yes it seems generic. And I think the reason being that I did not have a
significant number of things to report hence could not create logical groups
and giving very specific names to those groups. As of today, based on the
requirements, could think of only two items (page size, current). There
is still scope for few more things to appear. Hence gave a generic name and
thought more items can be grouped in this note itself until and unless group
becomes really big and need to break.

Do you have any suggestion to make this name better?


> - Why do we need to capture the page size at the time of the
>   crash?  Isn't the page size a compile time option?  Won't
>   sys_getpagesize() get you this information before the crash?  
>   Why do we need the kernels page size at all?
> 

Dave has already mentioned about the need of page size. I agree that page
size is a compile time information. Are you hinting at creating a separate
note for reporting page size in user space while loading the capture kernel 
and store it in reserved region alongside other elf headers. Not sure, if
it is good to create a separate note type just to report one configuration
variable. 

I also thought of enabling "Kernel .config support" (CONFIG_IKCONFIG=y) and
may be "crash" could write a script similar to scripts/extract-ikconfig to
retrieve any configuration variable required from vmlinux itself. But this
option is not enabled by default and will increase kernel image size and
seems too much to do for a single configuration variable. 

> - Why do you avoid storing the current task on the other cpus?
> 
> - Can't we derive the current task from the existing register information
>   already captured.  

It can be done but as Dave suggested but that requires significant amount 
of job to be done as one has to traverse through the active task stacks and
look for crash_kexec(). An easier/simpler way is that kernel itself can 
report it.  Netdump, diskdump already do it. I think for simplicity, it 
makes sense to export this information from kernel in the form of note.

Only issue I could think of is stack overflow and current might be 
corrupted after panic.


> If not would a little extra debug information
>   captured at compile time be better?
> 

Sorry, I did not get this. Can you elaborate a little more on this. What
kind of debug information can be helpful in determining the current.

> - You don't address the issue of architectural control registers.
>   So you are going to need another note at some point. (Not 
>   necessarily a bad thing).
> 

That's true. Probably a new note type shall be required to store
architectural control registers. May be something like NT_KDUMP_CRREG and
NT_KDUMPINFO can continue to capture miscellaneous info. 

Thanks
Vivek

> 
> > +/*
> > + * NT_KDUMPINFO can be used to communicate additional information required for
> > + * kernel core dumps. Additional information includes kernel configuration
> > + * variables like page size and any other relevant data required by
> > + * debugger (crash in this case).
> > +*/
> > +struct elf_kdumpinfo
> > +{
> > +	char page_shift;		/* Page size */
> > +	struct task_struct *panic_tsk;	/* Pointer to panic task_struct */
> > +};
> > +
> 
> >  #define NT_TASKSTRUCT	4
> >  #define NT_AUXV		6
> >  #define NT_PRXFPREG 0x46e62b7f /* copied from gdb5.1/include/elf/common.h */
> > -
> > +#define NT_KDUMPINFO	7		/* Used for kernel core dumps */
> >  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
  2005-09-22  7:39   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel " Vivek Goyal
@ 2005-09-22  7:46     ` Andrew Morton
  2005-09-22  8:32       ` Vivek Goyal
  2005-09-22  9:11     ` Eric W. Biederman
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2005-09-22  7:46 UTC (permalink / raw)
  To: vgoyal; +Cc: ebiederm, anderson, fastboot, linux-kernel

Vivek Goyal <vgoyal@in.ibm.com> wrote:
>
> > - Why do you avoid storing the current task on the other cpus?
>  > 
>  > - Can't we derive the current task from the existing register information
>  >   already captured.  
> 
>  It can be done but as Dave suggested but that requires significant amount 
>  of job to be done as one has to traverse through the active task stacks and
>  look for crash_kexec(). An easier/simpler way is that kernel itself can 
>  report it.  Netdump, diskdump already do it. I think for simplicity, it 
>  makes sense to export this information from kernel in the form of note.
> 
>  Only issue I could think of is stack overflow and current might be 
>  corrupted after panic.
> 

Yes, traversing the task_structs in a crashed kernel sounds like a poor
idea.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
  2005-09-22  7:46     ` Andrew Morton
@ 2005-09-22  8:32       ` Vivek Goyal
  0 siblings, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-22  8:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ebiederm, anderson, fastboot, linux-kernel

On Thu, Sep 22, 2005 at 12:46:48AM -0700, Andrew Morton wrote:
> Vivek Goyal <vgoyal@in.ibm.com> wrote:
> >
> > > - Why do you avoid storing the current task on the other cpus?
> >  > 
> >  > - Can't we derive the current task from the existing register information
> >  >   already captured.  
> > 
> >  It can be done but as Dave suggested but that requires significant amount 
> >  of job to be done as one has to traverse through the active task stacks and
> >  look for crash_kexec(). An easier/simpler way is that kernel itself can 
> >  report it.  Netdump, diskdump already do it. I think for simplicity, it 
> >  makes sense to export this information from kernel in the form of note.
> > 
> >  Only issue I could think of is stack overflow and current might be 
> >  corrupted after panic.
> > 
> 
> Yes, traversing the task_structs in a crashed kernel sounds like a poor
> idea.
> 

Andrew, traversal of task_structs will not be done in crashed kernel. It
will be done in user space by "crash" (if required, during post crash analysis).

In this case we are simply copying "current" pointer of panicking task in 
an elf note.

	kdumpinfo.panic_tsk = current;

Thanks
Vivek

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel  core dumps
  2005-09-21 15:17   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel " Dave Anderson
@ 2005-09-22  9:01     ` Eric W. Biederman
  2005-09-22 14:08       ` Vivek Goyal
  0 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-22  9:01 UTC (permalink / raw)
  To: Dave Anderson
  Cc: vgoyal, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Dave Anderson <anderson@redhat.com> writes:

> "Eric W. Biederman" wrote:
>
>> Vivek Goyal <vgoyal@in.ibm.com> writes:
>>
>> > o This patch adds a new note type NT_KDUMPINFO. This note is added with
>> >   kernel core dumps generated by kdump.
>> >
>> > o This note mainly communicates the information required by kernel dump
>> > analysis tool "crash" to analyze the kernel core dump. As of now it contains
>> >   the pointer to task struct of panicing task and page size. Page size is
>> > irrelevant for i386 but is required for architectures like ia64 and ppc64.
>> >
>
> It's not absolutely required for crash -- but more of a convenience.  As it is,
> the
> set of NT_PRSTATUS sections can be scoured for the set of active
> tasks, and then the process stacks (and potentially the IRQ stacks, then
> mapped back to the proper process stack) can be searched for crash_kexec.
> So crash works without the task_struct pointer, but it's ugly.  It's just that
> netdump, diskdump and LKCD all report the panic task_struct pointer,
> and it seems a reasonable thing to do.

Ok.  The point here is to know which task/cpu called panic, rather
than to get the task_struct.   That makes a lot of sense, and is
cheap to get.  Any note on the crashing cpu that is not captured
by another cpu will give us that information.  

My primary concern is while the concept of a task_struct is pretty
stable who is to know how the kernel will change in the future.  So
if we don't need to export a task_struct pointer and merely need
to flag the cpu that called panic we can do that much more reliably.

A practical question is how is this solved with a normal multi-threaded
core dumps?

It is not the best thing but order of the notes does and will matter,
it is likely worth Documenting so people don't change it that
PR_STATUS comes first so we can reliably find the first note
for a cpu, after everything has been serialized.

>> > o gdb is not affected by this change as gdb need not to parse this note.
>>
>> A couple of things.
>> - The name of your note is terribly generic, so it seems a poor choice.
>>
>> - Why do we need to capture the page size at the time of the
>>   crash?  Isn't the page size a compile time option?  Won't
>>   sys_getpagesize() get you this information before the crash?
>>   Why do we need the kernels page size at all?
>>
>
> The page size is needed for a whole host of things, primarily for
> virtual-to-physical address translations, a bunch of VM-related
> commands, etc.  For non-x86[_64] systems. the host system may
> be using a different page size than the vmcore being examining.
> That's probably a rare situation, and using getpagesize() on the
> host would work in most cases, or perhaps issuing the page size
> as a command line option, but again, it's another convenience item
> that gets reported by netdump, diskdump and LKCD.

What I was mostly saying is that there are other places this can
be captured.  If this is actually a compile time option on all
architectures we should simply add more debug information to vmlinux.
If this varies at runtime we have the option of capturing it from the
kernel before it crashes.  We don't need to run code to capture
it while the kernel is dying.

Having just skimmed the include/asm-* it looks like this is indeed
a compile time option so the page size needs to get added to the
information that is easy to get from the vmlinux binary.

We do need a way that we can test to see if a core dump
actually matches the vmlinux we are looking at.  Probably
this is capturing all of the information captured by linux/version.h
and linux/compile.h both at runtime and at compile time and
checking to see if they match.

>> - Why do you avoid storing the current task on the other cpus?
>>
>
> They can be found in the per-cpu runqueues data structure.

Thanks.  That confirms my impression we are just looking for a flag
that says we are the current cpu.

>> - Can't we derive the current task from the existing register information
>>   already captured.  If not would a little extra debug information
>>   captured at compile time be better?
>>
>> - You don't address the issue of architectural control registers.
>>   So you are going to need another note at some point. (Not
>>   necessarily a bad thing).
>>
>
> I guess that's their whole point.  This NT_KDUMPINFO or whatever
> you want to call it, could cumulatively collect any other data deemed
> necessary that falls outside the bounds of the existing note types.

It is clear we need one or mode notes.  Largely once we define what
is in a note we cannot change it.  The most we can do is to append
more fields to an existing node type, because we have size information
we can check on.  This is because the definition of a note is a kernel
API issue.

Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps
  2005-09-22  7:39   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel " Vivek Goyal
  2005-09-22  7:46     ` Andrew Morton
@ 2005-09-22  9:11     ` Eric W. Biederman
  1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-22  9:11 UTC (permalink / raw)
  To: vgoyal
  Cc: Morton Andrew Morton, Anderson Dave Anderson,
	Fastboot mailing list, linux kernel mailing list

Vivek Goyal <vgoyal@in.ibm.com> writes:

> On Wed, Sep 21, 2005 at 08:28:32AM -0600, Eric W. Biederman wrote:
>> Vivek Goyal <vgoyal@in.ibm.com> writes:
>> 
>> > o This patch adds a new note type NT_KDUMPINFO. This note is added with 
>> >   kernel core dumps generated by kdump.
>> >
>> > o This note mainly communicates the information required by kernel dump
>> > analysis tool "crash" to analyze the kernel core dump. As of now it contains
>> >   the pointer to task struct of panicing task and page size. Page size is 
>> > irrelevant for i386 but is required for architectures like ia64 and ppc64.
>> >
>> > o gdb is not affected by this change as gdb need not to parse this note.
>> 
>> A couple of things.
>> - The name of your note is terribly generic, so it seems a poor choice.
>> 
>
> Yes it seems generic. And I think the reason being that I did not have a
> significant number of things to report hence could not create logical groups
> and giving very specific names to those groups. As of today, based on the
> requirements, could think of only two items (page size, current). There
> is still scope for few more things to appear. Hence gave a generic name and
> thought more items can be grouped in this note itself until and unless group
> becomes really big and need to break.

I'm not certain exactly how but we do need to capture the kernel version
so we can verify the vmlinux binary and the core dump match.

> Do you have any suggestion to make this name better?

Not presently.  That still doesn't mean it isn't worth brain storming
about.

>> - Why do we need to capture the page size at the time of the
>>   crash?  Isn't the page size a compile time option?  Won't
>>   sys_getpagesize() get you this information before the crash?  
>>   Why do we need the kernels page size at all?
>> 
>
> Dave has already mentioned about the need of page size. I agree that page
> size is a compile time information. Are you hinting at creating a separate
> note for reporting page size in user space while loading the capture kernel 
> and store it in reserved region alongside other elf headers. Not sure, if
> it is good to create a separate note type just to report one configuration
> variable. 

It's in my reply to Dave but we need a way to get that kind of information
out of vmlinux.

>> - Why do you avoid storing the current task on the other cpus?
>> 
>> - Can't we derive the current task from the existing register information
>>   already captured.  

In my reply to Dave I noted that flagging the cpu is cheaper and more
reliably that returning current.

>> If not would a little extra debug information
>>   captured at compile time be better?
>> 
>
> Sorry, I did not get this. Can you elaborate a little more on this. What
> kind of debug information can be helpful in determining the current.

The dwarf2 debug information can tell you what registers a variable
is stored in.  I was thinking of something along those lines, for
reporting to the debugger where current was stored.

>> - You don't address the issue of architectural control registers.
>>   So you are going to need another note at some point. (Not 
>>   necessarily a bad thing).
>> 
>
> That's true. Probably a new note type shall be required to store
> architectural control registers. May be something like NT_KDUMP_CRREG and
> NT_KDUMPINFO can continue to capture miscellaneous info. 

Sounds right.  I am slightly less concerned so long as we restrict
ourselves to an append only discipline with regards to the information
in that note.

Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel  core dumps
  2005-09-22  9:01     ` Eric W. Biederman
@ 2005-09-22 14:08       ` Vivek Goyal
  2005-09-22 15:06         ` Dave Anderson
  2005-09-22 16:38         ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps Eric W. Biederman
  0 siblings, 2 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-22 14:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

On Thu, Sep 22, 2005 at 03:01:10AM -0600, Eric W. Biederman wrote:
> Dave Anderson <anderson@redhat.com> writes:
> 
> > "Eric W. Biederman" wrote:
> >
> >> Vivek Goyal <vgoyal@in.ibm.com> writes:
> >>
> >> > o This patch adds a new note type NT_KDUMPINFO. This note is added with
> >> >   kernel core dumps generated by kdump.
> >> >
> >> > o This note mainly communicates the information required by kernel dump
> >> > analysis tool "crash" to analyze the kernel core dump. As of now it contains
> >> >   the pointer to task struct of panicing task and page size. Page size is
> >> > irrelevant for i386 but is required for architectures like ia64 and ppc64.
> >> >
> >
> > It's not absolutely required for crash -- but more of a convenience.  As it is,
> > the
> > set of NT_PRSTATUS sections can be scoured for the set of active
> > tasks, and then the process stacks (and potentially the IRQ stacks, then
> > mapped back to the proper process stack) can be searched for crash_kexec.
> > So crash works without the task_struct pointer, but it's ugly.  It's just that
> > netdump, diskdump and LKCD all report the panic task_struct pointer,
> > and it seems a reasonable thing to do.
> 
> Ok.  The point here is to know which task/cpu called panic, rather
> than to get the task_struct.   That makes a lot of sense, and is
> cheap to get.  Any note on the crashing cpu that is not captured
> by another cpu will give us that information.  
> 

That makes sense. Sheer presence of a particular note can identify the 
crashing cpu and no need to store "current". Only crashing cpu is going
to store that note and that too after respective NT_PRSTATUS.

> My primary concern is while the concept of a task_struct is pretty
> stable who is to know how the kernel will change in the future.  So
> if we don't need to export a task_struct pointer and merely need
> to flag the cpu that called panic we can do that much more reliably.
> 
> A practical question is how is this solved with a normal multi-threaded
> core dumps?
> 
> It is not the best thing but order of the notes does and will matter,
> it is likely worth Documenting so people don't change it that
> PR_STATUS comes first so we can reliably find the first note
> for a cpu, after everything has been serialized.
> 
> >> > o gdb is not affected by this change as gdb need not to parse this note.
> >>
> >> A couple of things.
> >> - The name of your note is terribly generic, so it seems a poor choice.
> >>
> >> - Why do we need to capture the page size at the time of the
> >>   crash?  Isn't the page size a compile time option?  Won't
> >>   sys_getpagesize() get you this information before the crash?
> >>   Why do we need the kernels page size at all?
> >>
> >
> > The page size is needed for a whole host of things, primarily for
> > virtual-to-physical address translations, a bunch of VM-related
> > commands, etc.  For non-x86[_64] systems. the host system may
> > be using a different page size than the vmcore being examining.
> > That's probably a rare situation, and using getpagesize() on the
> > host would work in most cases, or perhaps issuing the page size
> > as a command line option, but again, it's another convenience item
> > that gets reported by netdump, diskdump and LKCD.
> 
> What I was mostly saying is that there are other places this can
> be captured.  If this is actually a compile time option on all
> architectures we should simply add more debug information to vmlinux.
> If this varies at runtime we have the option of capturing it from the
> kernel before it crashes.  We don't need to run code to capture
> it while the kernel is dying.
> 
> Having just skimmed the include/asm-* it looks like this is indeed
> a compile time option so the page size needs to get added to the
> information that is easy to get from the vmlinux binary.
> 
> We do need a way that we can test to see if a core dump
> actually matches the vmlinux we are looking at.  Probably
> this is capturing all of the information captured by linux/version.h
> and linux/compile.h both at runtime and at compile time and
> checking to see if they match.
> 

I quickly browsed through "crash" code and looks like it is already doing
similiar check (kernel.c, verify_version()). It seems to be retrieving 
"linux_banner" from core image and also retrieving banner string from vmlinux 
and trying to match. So if banner information can be directly read from the 
core image, probably there is no need to export it through notes. 

Thanks
Vivek

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel   core dumps
  2005-09-22 14:08       ` Vivek Goyal
@ 2005-09-22 15:06         ` Dave Anderson
  2005-09-22 16:31           ` Eric W. Biederman
  2005-09-22 16:38         ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps Eric W. Biederman
  1 sibling, 1 reply; 21+ messages in thread
From: Dave Anderson @ 2005-09-22 15:06 UTC (permalink / raw)
  To: vgoyal
  Cc: Eric W. Biederman, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Vivek Goyal wrote:

>
> > Ok.  The point here is to know which task/cpu called panic, rather
> > than to get the task_struct.   That makes a lot of sense, and is
> > cheap to get.  Any note on the crashing cpu that is not captured
> > by another cpu will give us that information.
> >
>
> That makes sense. Sheer presence of a particular note can identify the
> crashing cpu and no need to store "current". Only crashing cpu is going
> to store that note and that too after respective NT_PRSTATUS.
>
> > My primary concern is while the concept of a task_struct is pretty
> > stable who is to know how the kernel will change in the future.  So
> > if we don't need to export a task_struct pointer and merely need
> > to flag the cpu that called panic we can do that much more reliably.

Just flagging the cpu, and then mapping that to the stack pointer found in
the associated NT_PRSTATUS register set should work OK too.  It gets
a little muddy if it crashed while running on an IRQ stack, but it still can be
tracked back from there as well.  (although not if the crashing task overflowed
the IRQ stack)

The task_struct would be ideal though -- if the kernel's use of task_structs
changes in the future, well, then crash is going to need a serious re-write
anyway...  FWIW, netdump and diskdump use the NT_TASKSTRUCT note
note to store just the "current" pointer, and not the whole task_struct itself,
which would just be a waste of space in the ELF header for crash's purposes.
And looking at the gdb sources, it appears to be totally ignored.  Who
uses the NT_TASKSTRUCT note anyway?

>
> > We do need a way that we can test to see if a core dump
> > actually matches the vmlinux we are looking at.  Probably
> > this is capturing all of the information captured by linux/version.h
> > and linux/compile.h both at runtime and at compile time and
> > checking to see if they match.
> >
>
> I quickly browsed through "crash" code and looks like it is already doing
> similiar check (kernel.c, verify_version()). It seems to be retrieving
> "linux_banner" from core image and also retrieving banner string from vmlinux
> and trying to match. So if banner information can be directly read from the
> core image, probably there is no need to export it through notes.
>

That's correct.

Dave



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel   core dumps
  2005-09-22 15:06         ` Dave Anderson
@ 2005-09-22 16:31           ` Eric W. Biederman
  2005-09-22 20:33             ` Haren Myneni
                               ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-22 16:31 UTC (permalink / raw)
  To: Dave Anderson
  Cc: vgoyal, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Dave Anderson <anderson@redhat.com> writes:

> Just flagging the cpu, and then mapping that to the stack pointer found in
> the associated NT_PRSTATUS register set should work OK too.  It gets
> a little muddy if it crashed while running on an IRQ stack, but it still can be
> tracked back from there as well.  (although not if the crashing task overflowed
> the IRQ stack)

You can't track it back from the crashing cpu if the IRQ stack overflows
either.  So I would rather have crash confused when trying to find the
task_struct.  Then to have the kernel fail avoidably while attempting
to capture a core dump.  

Even if you overflow the stack wit a bit of detective work it should still
be possible to show the stack overflowed and correct for it when analyzing
the crash dump.  Doing anything like that from a crashing cpu (in a
reliable way) is very hard. 

> The task_struct would be ideal though -- if the kernel's use of task_structs
> changes in the future, well, then crash is going to need a serious re-write
> anyway...  FWIW, netdump and diskdump use the NT_TASKSTRUCT note
> note to store just the "current" pointer, and not the whole task_struct itself,
> which would just be a waste of space in the ELF header for crash's purposes.
> And looking at the gdb sources, it appears to be totally ignored.  Who
> uses the NT_TASKSTRUCT note anyway?

Good question, especially as the kernel exports whatever we have for
a task struct today in the ELF note.  No ABI compatibility is
maintained.

Given all of that I recommend an empty NT_TASKSTRUCT to flag the
crashing cpu, for now.

Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel  core dumps
  2005-09-22 14:08       ` Vivek Goyal
  2005-09-22 15:06         ` Dave Anderson
@ 2005-09-22 16:38         ` Eric W. Biederman
  2005-09-22 17:00           ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFOtokernel " Dave Anderson
  1 sibling, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-22 16:38 UTC (permalink / raw)
  To: vgoyal
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Vivek Goyal <vgoyal@in.ibm.com> writes:

> I quickly browsed through "crash" code and looks like it is already doing
> similiar check (kernel.c, verify_version()). It seems to be retrieving 
> "linux_banner" from core image and also retrieving banner string from vmlinux 
> and trying to match. So if banner information can be directly read from the 
> core image, probably there is no need to export it through notes. 

Sounds good.  We still need to define a note for the cpu control
registers.  Do any of the other crash dump solution capture that
information right now?

Eric


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFOtokernel   core dumps
  2005-09-22 16:38         ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps Eric W. Biederman
@ 2005-09-22 17:00           ` Dave Anderson
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Anderson @ 2005-09-22 17:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: vgoyal, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

"Eric W. Biederman" wrote:

> Vivek Goyal <vgoyal@in.ibm.com> writes:
>
> > I quickly browsed through "crash" code and looks like it is already doing
> > similiar check (kernel.c, verify_version()). It seems to be retrieving
> > "linux_banner" from core image and also retrieving banner string from vmlinux
> > and trying to match. So if banner information can be directly read from the
> > core image, probably there is no need to export it through notes.
>
> Sounds good.  We still need to define a note for the cpu control
> registers.  Do any of the other crash dump solution capture that
> information right now?
>
> Eric

Certainly not in netdump, diskdump or LKCD...

On the other hand, I can't say I ever really needed it, although
that's not to say it couldn't be valuable for some types of
crashes.

Dave



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps
  2005-09-22 16:31           ` Eric W. Biederman
@ 2005-09-22 20:33             ` Haren Myneni
  2005-09-23  5:09             ` Vivek Goyal
       [not found]             ` <OF0A1E6B6F.F00DC760-ON87257084.005F99D6-88257084.00634A38@us.ibm.com>
  2 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2005-09-22 20:33 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Sorry. Reposting since it did not made to LKML due to my stupid mistake; 
Also posting Dave Anderson's reponse:

Eric W. Biederman wrote:

>Dave Anderson <anderson@redhat.com> writes:
>
>  
>
>>Just flagging the cpu, and then mapping that to the stack pointer found in
>>the associated NT_PRSTATUS register set should work OK too.  It gets
>>a little muddy if it crashed while running on an IRQ stack, but it still can be
>>tracked back from there as well.  (although not if the crashing task overflowed
>>the IRQ stack)
>>    
>>
>
>You can't track it back from the crashing cpu if the IRQ stack overflows
>either.  So I would rather have crash confused when trying to find the
>task_struct.  Then to have the kernel fail avoidably while attempting
>to capture a core dump.  
>
>Even if you overflow the stack wit a bit of detective work it should still
>be possible to show the stack overflowed and correct for it when analyzing
>the crash dump.  Doing anything like that from a crashing cpu (in a
>reliable way) is very hard. 
>
>  
>
>>The task_struct would be ideal though -- if the kernel's use of task_structs
>>changes in the future, well, then crash is going to need a serious re-write
>>anyway...  FWIW, netdump and diskdump use the NT_TASKSTRUCT note
>>note to store just the "current" pointer, and not the whole task_struct itself,
>>which would just be a waste of space in the ELF header for crash's purposes.
>>And looking at the gdb sources, it appears to be totally ignored.  Who
>>uses the NT_TASKSTRUCT note anyway?
>>    
>>
>
>Good question, especially as the kernel exports whatever we have for
>a task struct today in the ELF note.  No ABI compatibility is
>maintained.
>
>Given all of that I recommend an empty NT_TASKSTRUCT to flag the
>crashing cpu, for now.
>  
>
At present /proc/kcore writes the complete task structure for 
NT_TASKSTRUCT note section. Thought it is the standard. Hence created 
separate note section. The other option is the crash tool can directly 
read "crashing_cpu variable" from the vmcore to determine the panic cpu. 
Similarly, we can define panic_task variable in the kernel.

Dave Anderson (anderson@redhat.com) reponse:

" So does elf_core_dump() as well, but to gdb it's useless AFAICT...

Hey -- I wasn't even aware of the "crashing_cpu" variable.  
That would work just fine.

Still a "panic_task", and perhaps even a "crash_page_size" variable
would be nice as well.   No additional notes required...

Dave "

Basically, we can use some global structure in the kernel and dump any 
needed information which we do not need to invoke any analysis tools 
(crash, gdb). Dumping CPU control registers can also be done this way 
without creating separate note section.

Thanks
Haren

>Eric
>  
>
>------------------------------------------------------------------------
>
>_______________________________________________
>fastboot mailing list
>fastboot@lists.osdl.org
>https://lists.osdl.org/mailman/listinfo/fastboot
>  
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel   core dumps
  2005-09-22 16:31           ` Eric W. Biederman
  2005-09-22 20:33             ` Haren Myneni
@ 2005-09-23  5:09             ` Vivek Goyal
  2005-09-23  7:22               ` Eric W. Biederman
  2005-09-23 15:17               ` Subject: [PATCH] Don't uselessly export task_struct to user space in " Eric W. Biederman
       [not found]             ` <OF0A1E6B6F.F00DC760-ON87257084.005F99D6-88257084.00634A38@us.ibm.com>
  2 siblings, 2 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-23  5:09 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

On Thu, Sep 22, 2005 at 10:31:52AM -0600, Eric W. Biederman wrote:
> Dave Anderson <anderson@redhat.com> writes:
> 
> > Just flagging the cpu, and then mapping that to the stack pointer found in
> > the associated NT_PRSTATUS register set should work OK too.  It gets
> > a little muddy if it crashed while running on an IRQ stack, but it still can be
> > tracked back from there as well.  (although not if the crashing task overflowed
> > the IRQ stack)
> 
> You can't track it back from the crashing cpu if the IRQ stack overflows
> either.  So I would rather have crash confused when trying to find the
> task_struct.  Then to have the kernel fail avoidably while attempting
> to capture a core dump.  
> 
> Even if you overflow the stack wit a bit of detective work it should still
> be possible to show the stack overflowed and correct for it when analyzing
> the crash dump.  Doing anything like that from a crashing cpu (in a
> reliable way) is very hard. 
> 
> > The task_struct would be ideal though -- if the kernel's use of task_structs
> > changes in the future, well, then crash is going to need a serious re-write
> > anyway...  FWIW, netdump and diskdump use the NT_TASKSTRUCT note
> > note to store just the "current" pointer, and not the whole task_struct itself,
> > which would just be a waste of space in the ELF header for crash's purposes.
> > And looking at the gdb sources, it appears to be totally ignored.  Who
> > uses the NT_TASKSTRUCT note anyway?
> 
> Good question, especially as the kernel exports whatever we have for
> a task struct today in the ELF note.  No ABI compatibility is
> maintained.
> 
> Given all of that I recommend an empty NT_TASKSTRUCT to flag the
> crashing cpu, for now.
>

I got a concern here. Are we not breaking the convention. NT_TASKSTRUCT note
type represents that task_sturct is stored in note data. elf_core_dump()
and /proc/kcore already do that (That's a different story that it might not 
be needed at all). Now if we store a null NT_TASKSTURCT, same note type will
carry two meanings.

IMHO, introducing a null NT_KDUMPINFO will help in that sense, at least 
there are no two interpretations of same note type. Also it provides the
scope to add more elements to it if need be.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps
       [not found]             ` <OF0A1E6B6F.F00DC760-ON87257084.005F99D6-88257084.00634A38@us.ibm.com>
@ 2005-09-23  5:19               ` Vivek Goyal
       [not found]               ` <4332FD56.2F5256F5@redhat.com>
  1 sibling, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-23  5:19 UTC (permalink / raw)
  To: Haren Myneni
  Cc: Eric W. Biederman, Morton Andrew Morton, Fastboot mailing list,
	Dave Anderson, linux kernel mailing list, fastboot-bounces

On Thu, Sep 22, 2005 at 11:04:43AM -0700, Haren Myneni wrote:
> fastboot-bounces@lists.osdl.org wrote on 09/22/2005 09:31:52 AM:
> 
> > Dave Anderson <anderson@redhat.com> writes:
> > 
> > > Just flagging the cpu, and then mapping that to the stack pointer 
> found in
> > > the associated NT_PRSTATUS register set should work OK too.  It gets
> > > a little muddy if it crashed while running on an IRQ stack, but it
> > still can be
> > > tracked back from there as well.  (although not if the crashing 
> > task overflowed
> > > the IRQ stack)
> > 
> > You can't track it back from the crashing cpu if the IRQ stack overflows
> > either.  So I would rather have crash confused when trying to find the
> > task_struct.  Then to have the kernel fail avoidably while attempting
> > to capture a core dump. 
> > 
> > Even if you overflow the stack wit a bit of detective work it should 
> still
> > be possible to show the stack overflowed and correct for it when 
> analyzing
> > the crash dump.  Doing anything like that from a crashing cpu (in a
> > reliable way) is very hard. 
> > 
> > > The task_struct would be ideal though -- if the kernel's use of 
> task_structs
> > > changes in the future, well, then crash is going to need a serious 
> re-write
> > > anyway...  FWIW, netdump and diskdump use the NT_TASKSTRUCT note
> > > note to store just the "current" pointer, and not the whole 
> > task_struct itself,
> > > which would just be a waste of space in the ELF header for 
> crash'spurposes.
> > > And looking at the gdb sources, it appears to be totally ignored.  Who
> > > uses the NT_TASKSTRUCT note anyway?
> > 
> > Good question, especially as the kernel exports whatever we have for
> > a task struct today in the ELF note.  No ABI compatibility is
> > maintained.
> > 
> > Given all of that I recommend an empty NT_TASKSTRUCT to flag the
> > crashing cpu, for now.
> 
> At present /proc/kcore writes the complete task structure for 
> NT_TASKSTRUCT note section. Thought it is the standard. Hence created 
> separate note section. The other option is the crash tool can directly 
> read "crashing_cpu variable" from the vmcore to determine the panic cpu. 
> Similarly, we can define panic_task variable in the kernel.

crashing_cpu was introduced recently to handle one of the problems caused
due to NMI. I think we should not be relying on this variable. we get the
value of crashing_cpu by making smp_processor_id() call and this value will
be corrupted in case of stack overflow.

During OLS, Eric had suggested to either find a way to disable NMI after 
panic() or may be read LAPIC id. Reading LAPIC id seems to be more reliable.
I think mkdump folks already do something similar.

So, in short, using crashing_cpu might not be a good idea. Down the line
this varibale might not be present at all. 

> 
> Basically, we can use some global structure in the kernel and dump any 
> needed information which we do not need to invoke any analysis tools 
> (crash, gdb). Dumping CPU control registers can also be done this way 
> without creating separate note section.
> 
> Thanks
> Haren
> 
> Anyway, we already have crashing_cpu variable in the kernel. 
> > 
> > Eric
> > _______________________________________________
> > fastboot mailing list
> > fastboot@lists.osdl.org
> > https://lists.osdl.org/mailman/listinfo/fastboot

> _______________________________________________
> fastboot mailing list
> fastboot@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/fastboot


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO  tokernelcore dumps
       [not found]               ` <4332FD56.2F5256F5@redhat.com>
@ 2005-09-23  7:12                 ` Eric W. Biederman
  2005-09-23 12:01                   ` Vivek Goyal
  2005-09-26  6:29                   ` Vivek Goyal
  0 siblings, 2 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-23  7:12 UTC (permalink / raw)
  To: Dave Anderson
  Cc: Haren Myneni, Morton Andrew Morton, Fastboot mailing list,
	fastboot-bounces, linux kernel mailing list

Dave Anderson <anderson@redhat.com> writes:

> So does elf_core_dump() as well, but to gdb it's useless AFAICT...

We can always post_process things when generating a core dump
if we have enough information.

> Hey -- I wasn't even aware of the "crashing_cpu" variable.  
> That would work just fine.
>
> Still a "panic_task", and perhaps even a "crash_page_size" variable
> would be nice as well.   No additional notes required...

To avoid defining an ABI that we need to maintain there is some
benefit in simply using static variables.  But the form of the
information really isn't the concern.

Where we capture the information and how reliable is that capture
is the concern.

To capture page size the easiest and most reliable way I can see
to do is to modify vmlinux.lds.S to contain something like:
>	 _page_shift = PAGE_SHIFT;
Giving you an absolute symbol _page_shift in vmlinux that
contains the value you need, without overhead in the running
kernel.

crashing_cpu makes sense to capture in some form, we definitely
need to compute something that will allow us to write to
a per cpu area on an SMP system.

The big concern at this point is that the code has not undergone
a serious stability audit.  So it is the expectation that there
is still code we can remove and modify to increase the likely hood
of getting a crash dump.

Currently we know that stack overflows sometimes happen and that
they are a source of kernel crashes.  It would be good if we could
take a crash dump despite them.  To do that requires code more
robust than we have today.  Quite likely it means that we will
not be able to reliably capture the task_struct of the crashing cpu.

Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel   core dumps
  2005-09-23  5:09             ` Vivek Goyal
@ 2005-09-23  7:22               ` Eric W. Biederman
  2005-09-23 15:17               ` Subject: [PATCH] Don't uselessly export task_struct to user space in " Eric W. Biederman
  1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-23  7:22 UTC (permalink / raw)
  To: vgoyal
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list

Vivek Goyal <vgoyal@in.ibm.com> writes:

> I got a concern here. Are we not breaking the convention. NT_TASKSTRUCT note
> type represents that task_sturct is stored in note data. elf_core_dump()
> and /proc/kcore already do that (That's a different story that it might not 
> be needed at all). Now if we store a null NT_TASKSTURCT, same note type will
> carry two meanings.

My impression is that NT_TASKSTRUCT has failed to define what actually appears
in the note, as the kernel task_struct is not designed to be exported to
user space, and can change without warning.

Hmm.  NT_TASKSTRUCT feels like an information leak although I haven't
a clue what kernel information you could leak that would be interesting
enough that you could compromise the kernel.  It looks like we should
export NT_TASKSTRUCT until we can decide what goes there.

> IMHO, introducing a null NT_KDUMPINFO will help in that sense, at least 
> there are no two interpretations of same note type. Also it provides the
> scope to add more elements to it if need be.

Agreed.  I was trying too hard to reuse what was already there.

Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernelcore dumps
  2005-09-23  7:12                 ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernelcore dumps Eric W. Biederman
@ 2005-09-23 12:01                   ` Vivek Goyal
  2005-09-26  6:29                   ` Vivek Goyal
  1 sibling, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-23 12:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Anderson, Morton Andrew Morton, Fastboot mailing list,
	linux kernel mailing list, fastboot-bounces

On Fri, Sep 23, 2005 at 01:12:23AM -0600, Eric W. Biederman wrote:
> Dave Anderson <anderson@redhat.com> writes:
> 
> > So does elf_core_dump() as well, but to gdb it's useless AFAICT...
> 
> We can always post_process things when generating a core dump
> if we have enough information.
> 
> > Hey -- I wasn't even aware of the "crashing_cpu" variable.  
> > That would work just fine.
> >
> > Still a "panic_task", and perhaps even a "crash_page_size" variable
> > would be nice as well.   No additional notes required...
> 
> To avoid defining an ABI that we need to maintain there is some
> benefit in simply using static variables.  But the form of the
> information really isn't the concern.
> 
> Where we capture the information and how reliable is that capture
> is the concern.
> 
> To capture page size the easiest and most reliable way I can see
> to do is to modify vmlinux.lds.S to contain something like:
> >	 _page_shift = PAGE_SHIFT;
> Giving you an absolute symbol _page_shift in vmlinux that
> contains the value you need, without overhead in the running
> kernel.

That's a cool idea. I just tested it. 

[vivek@vivegoya linux]$ readelf -s vmlinux | grep __page_shift
 28865: 0000000c     0 NOTYPE  GLOBAL DEFAULT  ABS __page_shift

> 
> crashing_cpu makes sense to capture in some form, we definitely
> need to compute something that will allow us to write to
> a per cpu area on an SMP system.
> 
> The big concern at this point is that the code has not undergone
> a serious stability audit.  So it is the expectation that there
> is still code we can remove and modify to increase the likely hood
> of getting a crash dump.
> 
> Currently we know that stack overflows sometimes happen and that
> they are a source of kernel crashes.  It would be good if we could
> take a crash dump despite them.  To do that requires code more
> robust than we have today.  Quite likely it means that we will
> not be able to reliably capture the task_struct of the crashing cpu.
>

Agreed.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Subject: [PATCH] Don't uselessly export task_struct to user space in core dumps.
  2005-09-23  5:09             ` Vivek Goyal
  2005-09-23  7:22               ` Eric W. Biederman
@ 2005-09-23 15:17               ` Eric W. Biederman
  1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2005-09-23 15:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: vgoyal@in.ibm.com Dave Anderson, Fastboot mailing list,
	linux kernel mailing list


task_struct is an internal structure to the kernel with
a lot of good information, that is probably interesting in
core dumps.  However there is no way for user space to know
what format that information is in making it useless.

I grepped the GDB 6.3 source code and NT_TASKSTRUCT while
defined is not used anywhere else.  So I would be surprised
if anyone notices it is missing.

In addition exporting kernel pointers to all the interesting
kernel data structures sounds like the very definition of an
information leak.  I haven't a clue what someone with evil
intentions could do with that information, but in any attack
against the kernel it looks like this is the perfect tool for
aiming that attack.

So since NT_TASKSTRUCT is useless as currently defined and
is potentially dangerous, let's just not export it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


---

 arch/mips/kernel/irixelf.c |   17 ++++++-----------
 fs/binfmt_elf.c            |    4 +---
 2 files changed, 7 insertions(+), 14 deletions(-)

06d116247caa2fef10cb73c5d6042a5ff658b677
diff --git a/arch/mips/kernel/irixelf.c b/arch/mips/kernel/irixelf.c
--- a/arch/mips/kernel/irixelf.c
+++ b/arch/mips/kernel/irixelf.c
@@ -1064,8 +1064,8 @@ static int irix_core_dump(long signr, st
 	struct elfhdr elf;
 	off_t offset = 0, dataoff;
 	int limit = current->signal->rlim[RLIMIT_CORE].rlim_cur;
-	int numnote = 4;
-	struct memelfnote notes[4];
+	int numnote = 3;
+	struct memelfnote notes[3];
 	struct elf_prstatus prstatus;	/* NT_PRSTATUS */
 	elf_fpregset_t fpu;		/* NT_PRFPREG */
 	struct elf_prpsinfo psinfo;	/* NT_PRPSINFO */
@@ -1198,20 +1198,15 @@ static int irix_core_dump(long signr, st
 	}
 	strlcpy(psinfo.pr_fname, current->comm, sizeof(psinfo.pr_fname));
 
-	notes[2].name = "CORE";
-	notes[2].type = NT_TASKSTRUCT;
-	notes[2].datasz = sizeof(*current);
-	notes[2].data = current;
-
 	/* Try to dump the FPU. */
 	prstatus.pr_fpvalid = dump_fpu (regs, &fpu);
 	if (!prstatus.pr_fpvalid) {
 		numnote--;
 	} else {
-		notes[3].name = "CORE";
-		notes[3].type = NT_PRFPREG;
-		notes[3].datasz = sizeof(fpu);
-		notes[3].data = &fpu;
+		notes[2].name = "CORE";
+		notes[2].type = NT_PRFPREG;
+		notes[2].datasz = sizeof(fpu);
+		notes[2].data = &fpu;
 	}
 
 	/* Write notes phdr entry. */
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1503,9 +1503,7 @@ static int elf_core_dump(long signr, str
 	fill_psinfo(psinfo, current->group_leader, current->mm);
 	fill_note(notes +1, "CORE", NT_PRPSINFO, sizeof(*psinfo), psinfo);
 	
-	fill_note(notes +2, "CORE", NT_TASKSTRUCT, sizeof(*current), current);
-  
-	numnote = 3;
+	numnote = 2;
 
 	auxv = (elf_addr_t *) current->mm->saved_auxv;
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernelcore dumps
  2005-09-23  7:12                 ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernelcore dumps Eric W. Biederman
  2005-09-23 12:01                   ` Vivek Goyal
@ 2005-09-26  6:29                   ` Vivek Goyal
  1 sibling, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2005-09-26  6:29 UTC (permalink / raw)
  To: Eric W. Biederman, Morton Andrew Morton
  Cc: Dave Anderson, Fastboot mailing list, linux kernel mailing list,
	fastboot-bounces

On Fri, Sep 23, 2005 at 01:12:23AM -0600, Eric W. Biederman wrote:
> Dave Anderson <anderson@redhat.com> writes:
> 
> > So does elf_core_dump() as well, but to gdb it's useless AFAICT...
> 
> We can always post_process things when generating a core dump
> if we have enough information.
> 
> > Hey -- I wasn't even aware of the "crashing_cpu" variable.  
> > That would work just fine.
> >
> > Still a "panic_task", and perhaps even a "crash_page_size" variable
> > would be nice as well.   No additional notes required...
> 
> To avoid defining an ABI that we need to maintain there is some
> benefit in simply using static variables.  But the form of the
> information really isn't the concern.
> 
> Where we capture the information and how reliable is that capture
> is the concern.
> 
> To capture page size the easiest and most reliable way I can see
> to do is to modify vmlinux.lds.S to contain something like:
> >	 _page_shift = PAGE_SHIFT;
> Giving you an absolute symbol _page_shift in vmlinux that
> contains the value you need, without overhead in the running
> kernel.
> 
> crashing_cpu makes sense to capture in some form, we definitely
> need to compute something that will allow us to write to
> a per cpu area on an SMP system.
> 
> The big concern at this point is that the code has not undergone
> a serious stability audit.  So it is the expectation that there
> is still code we can remove and modify to increase the likely hood
> of getting a crash dump.
> 
> Currently we know that stack overflows sometimes happen and that
> they are a source of kernel crashes.  It would be good if we could
> take a crash dump despite them.  To do that requires code more
> robust than we have today.  Quite likely it means that we will
> not be able to reliably capture the task_struct of the crashing cpu.
> 


Ok. After all this discussion looks like time to drop the patch. The
architectures which need page_size, they can modify vmlinux.ld.S to 
embed the information in vmlinux. 

Crash can do without pointer to crashing cpu's task_struct as well
(Though it is more work on crash's part). At the same time this information
can not possibly be captured in kernel reliably after the crash.

Identifying crashing cpu is important. Empty NT_KDUMPINFO would have done
the job but use of variables will avoid defining ABI. For the time being
crashing_cpu will do the job. Ofcourse this code is not robust enough to
handle stack overflow situations and that shall have to be taken care of. 

Andrew, Could you please drop this patch from -mm.

kdumpx86-add-note-type-nt_kdumpinfo-to-kernel-core-dumps.patch

Thanks
Vivek

> Eric

> _______________________________________________
> fastboot mailing list
> fastboot@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/fastboot


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2005-09-26  6:29 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-09-21  6:56 [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel core dumps Vivek Goyal
2005-09-21 14:28 ` [Fastboot] " Eric W. Biederman
2005-09-21 15:17   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel " Dave Anderson
2005-09-22  9:01     ` Eric W. Biederman
2005-09-22 14:08       ` Vivek Goyal
2005-09-22 15:06         ` Dave Anderson
2005-09-22 16:31           ` Eric W. Biederman
2005-09-22 20:33             ` Haren Myneni
2005-09-23  5:09             ` Vivek Goyal
2005-09-23  7:22               ` Eric W. Biederman
2005-09-23 15:17               ` Subject: [PATCH] Don't uselessly export task_struct to user space in " Eric W. Biederman
     [not found]             ` <OF0A1E6B6F.F00DC760-ON87257084.005F99D6-88257084.00634A38@us.ibm.com>
2005-09-23  5:19               ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel " Vivek Goyal
     [not found]               ` <4332FD56.2F5256F5@redhat.com>
2005-09-23  7:12                 ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernelcore dumps Eric W. Biederman
2005-09-23 12:01                   ` Vivek Goyal
2005-09-26  6:29                   ` Vivek Goyal
2005-09-22 16:38         ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps Eric W. Biederman
2005-09-22 17:00           ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFOtokernel " Dave Anderson
2005-09-22  7:39   ` [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO to kernel " Vivek Goyal
2005-09-22  7:46     ` Andrew Morton
2005-09-22  8:32       ` Vivek Goyal
2005-09-22  9:11     ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).