linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fs/binfmt_elf.c:maydump()
@ 2006-04-06 21:03 David S. Miller
  2006-04-06 22:15 ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2006-04-06 21:03 UTC (permalink / raw)
  To: linux-kernel


I sort of understand the idea behind this check in maydump():

	/* If it hasn't been written to, don't write it out */
	if (!vma->anon_vma)
		return 0;

but it causes real problems for debugging.  In fact a GDB testcase
breaks because of this check.

In the GDB testcase, the application mmap()'s a file with some
text in it.  It then calls abort() to dump core.  Then GDB loads
up the application again using that core file, and it tries to
look at the mmap()'d file, and that doesn't work.  We don't dump
the file contents because of the above check so GDB has no idea
how to reproduce the application state at the time of the core
dump.

Furthermore, it is vitally important to dump such areas to handle the
case where the file contents change after the core dump occurs.
So even if we had some way to tell GDB the full pathname of the
file which was mapped at that location, we should still dump the
contents and not try to elide them via this check in maydump().

Yes, this means we might hit the core dump limits quicker but we
shouldn't be doing anything which makes less debugging information
than necessary available.  Software development is hard enough as
it is right? :)

I also have a strange feeling that the VM_SHARED/i_nlink==0 check
might cause similar problems, but I won't touch that for now.

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 537893a..7fea878 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1167,10 +1167,6 @@ static int maydump(struct vm_area_struct
 	if (vma->vm_flags & VM_SHARED)
 		return vma->vm_file->f_dentry->d_inode->i_nlink == 0;
 
-	/* If it hasn't been written to, don't write it out */
-	if (!vma->anon_vma)
-		return 0;
-
 	return 1;
 }
 


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-06 21:03 fs/binfmt_elf.c:maydump() David S. Miller
@ 2006-04-06 22:15 ` Daniel Jacobowitz
  2006-04-06 22:35   ` fs/binfmt_elf.c:maydump() David S. Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2006-04-06 22:15 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Thu, Apr 06, 2006 at 02:03:57PM -0700, David S. Miller wrote:
> Yes, this means we might hit the core dump limits quicker but we
> shouldn't be doing anything which makes less debugging information
> than necessary available.  Software development is hard enough as
> it is right? :)

> -	/* If it hasn't been written to, don't write it out */
> -	if (!vma->anon_vma)
> -		return 0;
> -

Isn't this, um, a little more extreme than what you really want?
What goes into coredumps with this patch applied?  I bet it includes
the complete text segments of every executable and shared library
involved in the link.  You're going to need those if you want to debug,
anyway.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-06 22:15 ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
@ 2006-04-06 22:35   ` David S. Miller
  2006-04-07  5:18     ` fs/binfmt_elf.c:maydump() David S. Miller
  0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2006-04-06 22:35 UTC (permalink / raw)
  To: dan; +Cc: linux-kernel

From: Daniel Jacobowitz <dan@debian.org>
Date: Thu, 6 Apr 2006 18:15:19 -0400

> On Thu, Apr 06, 2006 at 02:03:57PM -0700, David S. Miller wrote:
> > Yes, this means we might hit the core dump limits quicker but we
> > shouldn't be doing anything which makes less debugging information
> > than necessary available.  Software development is hard enough as
> > it is right? :)
> 
> > -	/* If it hasn't been written to, don't write it out */
> > -	if (!vma->anon_vma)
> > -		return 0;
> > -
> 
> Isn't this, um, a little more extreme than what you really want?
> What goes into coredumps with this patch applied?  I bet it includes
> the complete text segments of every executable and shared library
> involved in the link.  You're going to need those if you want to debug,
> anyway.

With this patch applied, yes, it includes the complete text segments of
every executable and shared library mapped into the program which is
dumping core.

What's a good check to avoid shared libraries and executables?  And do
we really want to avoid including them?  What if a new version of one
of the shared libraries is installed on the system after the core file
is generated?  Wouldn't you want the complete original image so that
the program could be debugged accurately in the face of such changes?

Anyways a possible check would be if the object was mapped with
execute permission, so a test on VM_EXEC being set in vma->vm_flags.

But like the comment above maydump() seems to suggest, I'm of the
opinion that we should include as much as possible into the core
file image.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-06 22:35   ` fs/binfmt_elf.c:maydump() David S. Miller
@ 2006-04-07  5:18     ` David S. Miller
  2006-04-07 18:02       ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
  2007-01-03  2:02       ` Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump()) Daniel Jacobowitz
  0 siblings, 2 replies; 10+ messages in thread
From: David S. Miller @ 2006-04-07  5:18 UTC (permalink / raw)
  To: dan; +Cc: linux-kernel

From: "David S. Miller" <davem@davemloft.net>
Date: Thu, 06 Apr 2006 15:35:18 -0700 (PDT)

> With this patch applied, yes, it includes the complete text segments of
> every executable and shared library mapped into the program which is
> dumping core.

I did some more research and the 2.4.x kernel currently does
the test like this:

static inline int maydump(struct vm_area_struct *vma)
{
	/*
	 * If we may not read the contents, don't allow us to dump
	 * them either. "dump_write()" can't handle it anyway.
	 */
	if (!(vma->vm_flags & VM_READ))
		return 0;

	/* Do not dump I/O mapped devices! -DaveM */
	if (vma->vm_flags & VM_IO)
		return 0;
#if 1
	if (vma->vm_flags & (VM_WRITE|VM_GROWSUP|VM_GROWSDOWN))
		return 1;
	if (vma->vm_flags & (VM_READ|VM_EXEC|VM_EXECUTABLE|VM_SHARED))
		return 0;
#endif
	return 1;
}

Ok, if it's not readable don't put it into the core file, fine.
If it's an I/O mapping, skip it too, also makes sense.

The next check forces dumping of any wriable or stack segments.

The function always terminates at the next check, because VM_READ
is guarenteed to be set if we get here.  The first thing we
checked is if VM_READ was clear at the very top of the function.

Yikes...

Anyways, so what's exactly happening in 2.6.x right now?

static int maydump(struct vm_area_struct *vma)
{
	/* Do not dump I/O mapped devices or special mappings */
	if (vma->vm_flags & (VM_IO | VM_RESERVED))
		return 0;

	/* Dump shared memory only if mapped from an anonymous file.  */
	if (vma->vm_flags & VM_SHARED)
		return vma->vm_file->f_dentry->d_inode->i_nlink == 0;

	/* If it hasn't been written to, don't write it out */
	if (!vma->anon_vma)
		return 0;

	return 1;
}

Skip reserved or I/O mappings, ok.

Else skip shared mappings of non-anonymous files.  I'm not so sure
about this check.

Else skip any mapping not written to yet.  I still think at this
point the logic is wrong.

How about something like the following patch?  If it's executable
and not written to, skip it.  This would skip the main executable
image and all text segments of the shared libraries mapped in.

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 537893a..9ec5c2b 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1167,8 +1167,10 @@ static int maydump(struct vm_area_struct
 	if (vma->vm_flags & VM_SHARED)
 		return vma->vm_file->f_dentry->d_inode->i_nlink == 0;
 
-	/* If it hasn't been written to, don't write it out */
-	if (!vma->anon_vma)
+	/* If it is executable and hasn't been written to,
+	 * don't write it out.
+	 */
+	if ((vma->vm_flags & VM_EXEC) && !vma->anon_vma)
 		return 0;
 
 	return 1;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-07  5:18     ` fs/binfmt_elf.c:maydump() David S. Miller
@ 2006-04-07 18:02       ` Daniel Jacobowitz
  2006-04-07 20:27         ` fs/binfmt_elf.c:maydump() David S. Miller
  2007-01-03  2:02       ` Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump()) Daniel Jacobowitz
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2006-04-07 18:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Thu, Apr 06, 2006 at 10:18:07PM -0700, David S. Miller wrote:
> How about something like the following patch?  If it's executable
> and not written to, skip it.  This would skip the main executable
> image and all text segments of the shared libraries mapped in.

Will this dump text segments that have been COW'd for the purposes of
inserting a breakpoint?

It's just a question of goals, I guess.  We could dump code, but it's
rarely useful, so historically we didn't.  Similarly, we could dump
mapped data from shared memory, but it can be huge and is rarely
useful, so generally we don't.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-07 18:02       ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
@ 2006-04-07 20:27         ` David S. Miller
  2006-04-10 13:01           ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2006-04-07 20:27 UTC (permalink / raw)
  To: dan; +Cc: linux-kernel

From: Daniel Jacobowitz <dan@debian.org>
Date: Fri, 7 Apr 2006 14:02:43 -0400

> On Thu, Apr 06, 2006 at 10:18:07PM -0700, David S. Miller wrote:
> > How about something like the following patch?  If it's executable
> > and not written to, skip it.  This would skip the main executable
> > image and all text segments of the shared libraries mapped in.
> 
> Will this dump text segments that have been COW'd for the purposes of
> inserting a breakpoint?

Yes, and it would also dump text segments that get written
by the dynamic linker such as the .plt, which we definitely
do want.

It would also dump impure text segment cases as well.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: fs/binfmt_elf.c:maydump()
  2006-04-07 20:27         ` fs/binfmt_elf.c:maydump() David S. Miller
@ 2006-04-10 13:01           ` Daniel Jacobowitz
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Jacobowitz @ 2006-04-10 13:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Fri, Apr 07, 2006 at 01:27:53PM -0700, David S. Miller wrote:
> Yes, and it would also dump text segments that get written
> by the dynamic linker such as the .plt, which we definitely
> do want.
> 
> It would also dump impure text segment cases as well.

Well, I'm OK with this, upon reflection.  Might as well merge it and
see if anyone else is appalled :-)

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump())
  2006-04-07  5:18     ` fs/binfmt_elf.c:maydump() David S. Miller
  2006-04-07 18:02       ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
@ 2007-01-03  2:02       ` Daniel Jacobowitz
  2007-01-03  4:57         ` Contents of core dumps David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Jacobowitz @ 2007-01-03  2:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

[Please CC, I am not subscribed to lkml.]

On Thu, Apr 06, 2006 at 10:18:07PM -0700, David S. Miller wrote:
> How about something like the following patch?  If it's executable
> and not written to, skip it.  This would skip the main executable
> image and all text segments of the shared libraries mapped in.

I've been going through GDB test failures (... again...) and I'm down
to a respectably small number on x86_64, but this is one of the
remaining ones.  I don't suppose there's been any change since we
discussed this in April?

A refresher for those following along: there's a GDB test that mmaps a
file using MAP_PRIVATE and PROT_WRITE.  It expects the contents to end
up in the core dump.  Right now, they don't.  I can fix the test by
making sure it writes to the mapping, but before I change the test,
I want to raise the question of what _should_ be in a core dump.

I took a peek at what Solaris includes in core dumps.  They offer
(not surprisingly) a pile of configuration options.  The default is
just about everything except for file-backed shared memory and some
symbol table data - it includes text segments, rodata, anonymous shared
memory, file backed mappings, et cetera.  I guess that's another
argument in favor of dumping more.  Then you can control it globally,
per process, et cetera.

http://src.opensolaris.org/source/xref/loficc/crypto/usr/src/uts/common/sys/corectl.h

I also checked an AIX manual since there was a reference to SA_FULLDUMP
in the GDB test:

 By default, the user data, anonymously mapped regions, and vm_infox
 structures are not included in a core dump. This partial core dump
 includes the current thread stack, the thread thrdctx structures, the
 user structure, and the state of the registers at the time of the
 fault. A partial core dump contains sufficient information for a stack
 traceback. The size of a core dump can also be limited by the setrlimit
 or setrlimit64 subroutine.

 To enable a full core dump, set the SA_FULLDUMP flag in the sigaction
 subroutine for the signal that is to generate a full core dump. If this
 flag is set when the core is dumped, the user data section, vm_infox,
 and anonymously mapped region structures are included in the core dump.

Not really sure what that translates to, but it's less than what
Solaris dumps, I think.

Does Linux need knobs for this?

> 
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 537893a..9ec5c2b 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -1167,8 +1167,10 @@ static int maydump(struct vm_area_struct
>  	if (vma->vm_flags & VM_SHARED)
>  		return vma->vm_file->f_dentry->d_inode->i_nlink == 0;
>  
> -	/* If it hasn't been written to, don't write it out */
> -	if (!vma->anon_vma)
> +	/* If it is executable and hasn't been written to,
> +	 * don't write it out.
> +	 */
> +	if ((vma->vm_flags & VM_EXEC) && !vma->anon_vma)
>  		return 0;
>  
>  	return 1;
> 
> 

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Contents of core dumps
  2007-01-03  2:02       ` Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump()) Daniel Jacobowitz
@ 2007-01-03  4:57         ` David Miller
  2007-01-03 18:13           ` Daniel Jacobowitz
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2007-01-03  4:57 UTC (permalink / raw)
  To: dan; +Cc: linux-kernel

From: Daniel Jacobowitz <dan@debian.org>
Date: Tue, 2 Jan 2007 21:02:28 -0500

> On Thu, Apr 06, 2006 at 10:18:07PM -0700, David S. Miller wrote:
> > How about something like the following patch?  If it's executable
> > and not written to, skip it.  This would skip the main executable
> > image and all text segments of the shared libraries mapped in.
> 
> I've been going through GDB test failures (... again...) and I'm down
> to a respectably small number on x86_64, but this is one of the
> remaining ones.  I don't suppose there's been any change since we
> discussed this in April?

Not to my knowledge.

> Does Linux need knobs for this?

I don't think so.

The current behavior is very non-intuitive.

As a person who hacks gdb, the kernel, and the interactions between
them extensively, it took even me quite a while to track down this
problem.

Imagine some less skilled person trying to analyze a core dump
expecting the necessary information to be there and being unable
to figure out why?

So I'd say we should just put this change in, as-is.  It fixes bugs,
and in all the time that has passed since my initial posting there
has not been any serious dissent.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Contents of core dumps
  2007-01-03  4:57         ` Contents of core dumps David Miller
@ 2007-01-03 18:13           ` Daniel Jacobowitz
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Jacobowitz @ 2007-01-03 18:13 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel

On Tue, Jan 02, 2007 at 08:57:21PM -0800, David Miller wrote:
> So I'd say we should just put this change in, as-is.  It fixes bugs,
> and in all the time that has passed since my initial posting there
> has not been any serious dissent.

Fine with me.  In that case, I will wait until the kernel is fixed,
verify it, and then probably adjust the GDB test to pass on either
patched or unpatched kernels.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-01-03 18:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-04-06 21:03 fs/binfmt_elf.c:maydump() David S. Miller
2006-04-06 22:15 ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
2006-04-06 22:35   ` fs/binfmt_elf.c:maydump() David S. Miller
2006-04-07  5:18     ` fs/binfmt_elf.c:maydump() David S. Miller
2006-04-07 18:02       ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
2006-04-07 20:27         ` fs/binfmt_elf.c:maydump() David S. Miller
2006-04-10 13:01           ` fs/binfmt_elf.c:maydump() Daniel Jacobowitz
2007-01-03  2:02       ` Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump()) Daniel Jacobowitz
2007-01-03  4:57         ` Contents of core dumps David Miller
2007-01-03 18:13           ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).