All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Extend core dump note section to contain file names of mapped files
@ 2012-07-11 10:35 Denys Vlasenko
  2012-07-11 15:15 ` Oleg Nesterov
  2012-07-11 15:40 ` Jonathan M. Foote
  0 siblings, 2 replies; 12+ messages in thread
From: Denys Vlasenko @ 2012-07-11 10:35 UTC (permalink / raw)
  To: linux-kernel, Jonathan M. Foote, H. J. Lu, Ingo Molnar,
	H. Peter Anvin, Andi Kleen
  Cc: Oleg Nesterov, Denys Vlasenko, Jan Kratochvil

[-- Attachment #1: Type: text/plain, Size: 5086 bytes --]

Hi,

Resending the patch after a while.
Jonathan, developer of CERT Triage Tools, expressed the need
to have this information, CCing him.

But before looking at the attached patch, we need a ruling.

In the last review it was proposed to maybe generate
this information in the form of ASCII text, a-la /proc/PID/maps.

This actually is a good idea, but regretfully, it come a few
decades too late, the rest of core file auxiliary information
is traditionally encoded in binary structures.

Please, can someone with authority in this area decide whether
we want to be unorthodox and use ASCII encoding for the whole thing,
or not?

If the decision will be to use ASCII, I will need to rework the patch.

Otherwise, please take a look at attached patch which implements
creation of a new note in binary format and let me know what do you think of it.

Original patch and description follows

* * * * * * * * * * * * * * * * * * * *

While working with core dump analysis, it struck me how much
PITA is caused merely by the fact that names of loaded binary
and libraries are not known.

gdb retrieves loaded library names by examining dynamic loader's
data stored in the core dump's data segments. It uses intimate
knowledge how and where dynamic loader keeps the list of loaded
libraries. (Meaning that it will break if non-standard loader
is used).

And, as Jan explained to me, it depends on knowing where
the linked list of libraries starts, which requires knowing binary
which was running. IIRC there is no easy and reasonably foolproof
way to determine binary's name. (Looking at argv[0] on stack
is not reasonably foolproof).

Which is *ridiculous*. We *know* the list of mapped files
at core dump generation time.

I propose to save this information in core dump, as a new note
in note segment.

This note has the following format:

long count     // how many files are mapped
long page_size // units for file_ofs
array of [COUNT] elements of
   long start
   long end
   long file_ofs
followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...
The attached patch implements this.

Since list of mapped files can be large (/proc/`pidof firefox`/maps
on my machine right now is 38k), I allocate the space for note
via vmalloc, and also have a sanity limit of 4 megabytes.
(Maybe we should make it smaller?)
Oleg suggested using a linked list of smaller structures instead of
using a potentially large contiguous block, and I tried it,
but resulting code was significantly more ugly (for my taste).

The patch is run-tested.

For testing, I sent ABRT signal to a running /usr/bin/md5sum.

"readelf -aW core" shows the new note as:

Notes at offset 0x00000274 with length 0x00000990:
 Owner                 Data size       Description
 CORE                 0x00000090       NT_PRSTATUS (prstatus structure)
 CORE                 0x0000007c       NT_PRPSINFO (prpsinfo structure)
 CORE                 0x000000a0       NT_AUXV (auxiliary vector)
 CORE                 0x00000168       Unknown note type: (0x46494c45)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^new note^^^^^^^^^^
In hex format:
                                               05 00 00 00  |................|
00000460  68 01 00 00 45 4c 49 46  46 49 4c 45 00 00 00 00  |h...ELIFCORE....|
00000470  0b 00 00 00 00 10 00 00  00 80 17 00 00 f0 31 00  |..............1.|
00000480  00 00 00 00 00 f0 31 00  00 00 32 00 a7 01 00 00  |......1...2.....|
00000490  00 00 32 00 00 20 32 00  a7 01 00 00 00 20 32 00  |..2.. 2...... 2.|
000004a0  00 30 32 00 a9 01 00 00  00 50 69 00 00 60 6b 00  |.02......Pi..`k.|
000004b0  00 00 00 00 00 60 6b 00  00 70 6b 00 20 00 00 00  |.....`k..pk. ...|
000004c0  00 70 6b 00 00 80 6b 00  21 00 00 00 00 80 04 08  |.pk...k.!.......|
000004d0  00 00 05 08 00 00 00 00  00 00 05 08 00 10 05 08  |................|
000004e0  07 00 00 00 00 10 05 08  00 20 05 08 08 00 00 00  |......... ......|
000004f0  00 20 52 b7 00 20 72 b7  00 00 00 00 2f 6c 69 62  |. R.. r...../lib|
00000500  2f 6c 69 62 63 2d 32 2e  31 34 2e 39 30 2e 73 6f  |/libc-2.14.90.so|
00000510  00 2f 6c 69 62 2f 6c 69  62 63 2d 32 2e 31 34 2e  |./lib/libc-2.14.|
00000520  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 69 62 63 2d  |90.so./lib/libc-|
00000530  32 2e 31 34 2e 39 30 2e  73 6f 00 2f 6c 69 62 2f  |2.14.90.so./lib/|
00000540  6c 69 62 63 2d 32 2e 31  34 2e 39 30 2e 73 6f 00  |libc-2.14.90.so.|
00000550  2f 6c 69 62 2f 6c 64 2d  32 2e 31 34 2e 39 30 2e  |/lib/ld-2.14.90.|
00000560  73 6f 00 2f 6c 69 62 2f  6c 64 2d 32 2e 31 34 2e  |so./lib/ld-2.14.|
00000570  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 64 2d 32 2e  |90.so./lib/ld-2.|
00000580  31 34 2e 39 30 2e 73 6f  00 2f 75 73 72 2f 62 69  |14.90.so./usr/bi|
00000590  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005a0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005b0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 6c 69  |n/md5sum./usr/li|
000005c0  62 2f 6c 6f 63 61 6c 65  2f 6c 6f 63 61 6c 65 2d  |b/locale/locale-|
000005d0  61 72 63 68 69 76 65 00                           |archive.

-- 
vda

[-- Attachment #2: file_note.patch --]
[-- Type: application/octet-stream, Size: 3404 bytes --]

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 81878b7..b585ba1 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1358,6 +1358,73 @@ static void fill_auxv_note(struct memelfnote *note, struct mm_struct *mm)
 	fill_note(note, "CORE", NT_AUXV, i * sizeof(elf_addr_t), auxv);
 }
 
+#define MAX_FILE_NOTE_SIZE (4*1024*1024)
+
+static void fill_files_note(struct memelfnote *note)
+{
+	struct vm_area_struct *vma;
+	struct file *file;
+	unsigned count, word_count, size, remaining;
+	long *data;
+	long *start_end_ofs;
+	char *name;
+
+	count = 0;
+	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+		file = vma->vm_file;
+		if (!file)
+			continue;
+		count++;
+		if (count >= MAX_FILE_NOTE_SIZE / 64) /* paranoia check */
+			goto err;
+	}
+
+	size = count * 64;
+	word_count = 2 + 3 * count;
+ alloc:
+	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+		goto err;
+	size = (size + PAGE_SIZE - 1) & (-PAGE_SIZE);
+	data = vmalloc(size);
+	if (!data)
+		goto err;
+	start_end_ofs = data;
+	name = (void*)&start_end_ofs[word_count];
+	remaining = size - word_count * sizeof(long);
+
+	*start_end_ofs++ = count;
+	*start_end_ofs++ = PAGE_SIZE;
+	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+		const char *filename;
+
+		file = vma->vm_file;
+		if (!file)
+			continue;
+		if (remaining == 0) {
+ try_new_size:
+			vfree(data);
+			size = size * 5 / 4;
+			goto alloc;
+		}
+		filename = d_path(&file->f_path, name, remaining);
+		if (IS_ERR(filename)) {
+			if (PTR_ERR(filename) == -ENAMETOOLONG)
+				goto try_new_size;
+			/* continue; -- WRONG, we must have COUNT elements */
+			filename = "";
+		}
+		while ((remaining--, *name++ = *filename++) != '\0')
+			continue;
+		*start_end_ofs++ = vma->vm_start;
+		*start_end_ofs++ = vma->vm_end;
+		*start_end_ofs++ = vma->vm_pgoff;
+	}
+
+	size = name - (char*)data;
+	fill_note(note, "CORE", NT_FILE, size, data);
+ err: ;
+}
+
 #ifdef CORE_DUMP_USE_REGSET
 #include <linux/regset.h>
 
@@ -1372,6 +1439,7 @@ struct elf_note_info {
 	struct elf_thread_core_info *thread;
 	struct memelfnote psinfo;
 	struct memelfnote auxv;
+	struct memelfnote files;
 	size_t size;
 	int thread_notes;
 };
@@ -1532,6 +1600,9 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
 	fill_auxv_note(&info->auxv, current->mm);
 	info->size += notesize(&info->auxv);
 
+	fill_files_note(&info->files);
+	info->size += notesize(&info->files);
+
 	return 1;
 }
 
@@ -1560,6 +1631,8 @@ static int write_note_info(struct elf_note_info *info,
 			return 0;
 		if (first && !writenote(&info->auxv, file, foffset))
 			return 0;
+		if (first && !writenote(&info->files, file, foffset))
+			return 0;
 
 		for (i = 1; i < info->thread_notes; ++i)
 			if (t->notes[i].data &&
@@ -1586,6 +1659,7 @@ static void free_note_info(struct elf_note_info *info)
 		kfree(t);
 	}
 	kfree(info->psinfo.data);
+	vfree(info->files.data);
 }
 
 #else
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 999b4f5..5e6c08f 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -372,6 +372,7 @@ typedef struct elf64_shdr {
 #define NT_PRPSINFO	3
 #define NT_TASKSTRUCT	4
 #define NT_AUXV		6
+#define NT_FILE		0x46494c45
 #define NT_PRXFPREG     0x46e62b7f      /* copied from gdb5.1/include/elf/common.h */
 #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
 #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-07-11 10:35 [PATCH] Extend core dump note section to contain file names of mapped files Denys Vlasenko
@ 2012-07-11 15:15 ` Oleg Nesterov
  2012-07-12 19:41   ` Denys Vlasenko
  2012-07-11 15:40 ` Jonathan M. Foote
  1 sibling, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2012-07-11 15:15 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: linux-kernel, Jonathan M. Foote, H. J. Lu, Ingo Molnar,
	H. Peter Anvin, Andi Kleen, Denys Vlasenko, Jan Kratochvil

On 07/11, Denys Vlasenko wrote:
>
> I propose to save this information in core dump, as a new note
> in note segment.

Denys, I am in no position to discuss whether we need this change or not,
format, etc. I'll only try to comment the code.

And please do not use the attachments ;)

> +static void fill_files_note(struct memelfnote *note)
> +{
> +	struct vm_area_struct *vma;
> +	struct file *file;
> +	unsigned count, word_count, size, remaining;
> +	long *data;
> +	long *start_end_ofs;
> +	char *name;
> +
> +	count = 0;
> +	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
> +		file = vma->vm_file;
> +		if (!file)
> +			continue;
> +		count++;
> +		if (count >= MAX_FILE_NOTE_SIZE / 64) /* paranoia check */
> +			goto err;

Why this check? If count is huge, then...

> +	size = count * 64;
> +	word_count = 2 + 3 * count;
> + alloc:
> +	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
> +		goto err;

we should detect this case before the first alloc?

> +	size = (size + PAGE_SIZE - 1) & (-PAGE_SIZE);

Well, I'd suggest PAGE_MASK instead of -PAGE_SIZE. Better yet,

	size = round_up(size, PAGE_SIZE);

> +		if (remaining == 0) {
> + try_new_size:
> +			vfree(data);
> +			size = size * 5 / 4;
> +			goto alloc;
> +		}
> +		filename = d_path(&file->f_path, name, remaining);
> +		if (IS_ERR(filename)) {
> +			if (PTR_ERR(filename) == -ENAMETOOLONG)
> +				goto try_new_size;

This looks like unnecessary complication to me, or I missed something.
d_path(..., buflen) should handle the "buflen == 0" case correctly, so
afacics you can remove the "if (remaining == 0)" block and move this
free-and-goto-alloc code under the -ENAMETOOLONG check.

> +		while ((remaining--, *name++ = *filename++) != '\0')
> +			continue;

Well, perhaps this is just me... but this looks a bit too complex
to me ;) I won't insist, but

		do
			remaining--;
		while ((*name++ = *filename++));

looks more understandable, imho.

Or even

		/* d_path() fills the end of the buffer */
		remaining = name - filename;
		strcpy(name, filename);

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-07-11 10:35 [PATCH] Extend core dump note section to contain file names of mapped files Denys Vlasenko
  2012-07-11 15:15 ` Oleg Nesterov
@ 2012-07-11 15:40 ` Jonathan M. Foote
  1 sibling, 0 replies; 12+ messages in thread
From: Jonathan M. Foote @ 2012-07-11 15:40 UTC (permalink / raw)
  To: Denys Vlasenko, linux-kernel, H. J. Lu, Ingo Molnar,
	H. Peter Anvin, Andi Kleen
  Cc: Oleg Nesterov, Denys Vlasenko, Jan Kratochvil

Hello all,

For what it's worth several security teams that use the CERT Triage Tools have requested a feature (post-mortem analysis of core files) that will be supported by this functionality. This feature in the CERT Triage Tools would be more complete if siginfo was included in the core files as well, but this is a good start.

Jon

-----Original Message-----
From: Denys Vlasenko [mailto:vda.linux@gmail.com] 
Sent: Wednesday, July 11, 2012 6:36 AM
To: linux-kernel@vger.kernel.org; Jonathan M. Foote; H. J. Lu; Ingo Molnar; H. Peter Anvin; Andi Kleen
Cc: Oleg Nesterov; Denys Vlasenko; Jan Kratochvil
Subject: [PATCH] Extend core dump note section to contain file names of mapped files

Hi,

Resending the patch after a while.
Jonathan, developer of CERT Triage Tools, expressed the need to have this information, CCing him.

But before looking at the attached patch, we need a ruling.

In the last review it was proposed to maybe generate this information in the form of ASCII text, a-la /proc/PID/maps.

This actually is a good idea, but regretfully, it come a few decades too late, the rest of core file auxiliary information is traditionally encoded in binary structures.

Please, can someone with authority in this area decide whether we want to be unorthodox and use ASCII encoding for the whole thing, or not?

If the decision will be to use ASCII, I will need to rework the patch.

Otherwise, please take a look at attached patch which implements creation of a new note in binary format and let me know what do you think of it.

Original patch and description follows

* * * * * * * * * * * * * * * * * * * *

While working with core dump analysis, it struck me how much PITA is caused merely by the fact that names of loaded binary and libraries are not known.

gdb retrieves loaded library names by examining dynamic loader's data stored in the core dump's data segments. It uses intimate knowledge how and where dynamic loader keeps the list of loaded libraries. (Meaning that it will break if non-standard loader is used).

And, as Jan explained to me, it depends on knowing where the linked list of libraries starts, which requires knowing binary which was running. IIRC there is no easy and reasonably foolproof way to determine binary's name. (Looking at argv[0] on stack is not reasonably foolproof).

Which is *ridiculous*. We *know* the list of mapped files at core dump generation time.

I propose to save this information in core dump, as a new note in note segment.

This note has the following format:

long count     // how many files are mapped
long page_size // units for file_ofs
array of [COUNT] elements of
   long start
   long end
   long file_ofs
followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...
The attached patch implements this.

Since list of mapped files can be large (/proc/`pidof firefox`/maps on my machine right now is 38k), I allocate the space for note via vmalloc, and also have a sanity limit of 4 megabytes.
(Maybe we should make it smaller?)
Oleg suggested using a linked list of smaller structures instead of using a potentially large contiguous block, and I tried it, but resulting code was significantly more ugly (for my taste).

The patch is run-tested.

For testing, I sent ABRT signal to a running /usr/bin/md5sum.

"readelf -aW core" shows the new note as:

Notes at offset 0x00000274 with length 0x00000990:
 Owner                 Data size       Description
 CORE                 0x00000090       NT_PRSTATUS (prstatus structure)
 CORE                 0x0000007c       NT_PRPSINFO (prpsinfo structure)
 CORE                 0x000000a0       NT_AUXV (auxiliary vector)
 CORE                 0x00000168       Unknown note type: (0x46494c45)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^new note^^^^^^^^^^ In hex format:
                                               05 00 00 00  |................|
00000460  68 01 00 00 45 4c 49 46  46 49 4c 45 00 00 00 00  |h...ELIFCORE....|
00000470  0b 00 00 00 00 10 00 00  00 80 17 00 00 f0 31 00  |..............1.|
00000480  00 00 00 00 00 f0 31 00  00 00 32 00 a7 01 00 00  |......1...2.....|
00000490  00 00 32 00 00 20 32 00  a7 01 00 00 00 20 32 00  |..2.. 2...... 2.|
000004a0  00 30 32 00 a9 01 00 00  00 50 69 00 00 60 6b 00  |.02......Pi..`k.|
000004b0  00 00 00 00 00 60 6b 00  00 70 6b 00 20 00 00 00  |.....`k..pk. ...|
000004c0  00 70 6b 00 00 80 6b 00  21 00 00 00 00 80 04 08  |.pk...k.!.......|
000004d0  00 00 05 08 00 00 00 00  00 00 05 08 00 10 05 08  |................|
000004e0  07 00 00 00 00 10 05 08  00 20 05 08 08 00 00 00  |......... ......|
000004f0  00 20 52 b7 00 20 72 b7  00 00 00 00 2f 6c 69 62  |. R.. r...../lib|
00000500  2f 6c 69 62 63 2d 32 2e  31 34 2e 39 30 2e 73 6f  |/libc-2.14.90.so|
00000510  00 2f 6c 69 62 2f 6c 69  62 63 2d 32 2e 31 34 2e  |./lib/libc-2.14.|
00000520  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 69 62 63 2d  |90.so./lib/libc-|
00000530  32 2e 31 34 2e 39 30 2e  73 6f 00 2f 6c 69 62 2f  |2.14.90.so./lib/|
00000540  6c 69 62 63 2d 32 2e 31  34 2e 39 30 2e 73 6f 00  |libc-2.14.90.so.|
00000550  2f 6c 69 62 2f 6c 64 2d  32 2e 31 34 2e 39 30 2e  |/lib/ld-2.14.90.|
00000560  73 6f 00 2f 6c 69 62 2f  6c 64 2d 32 2e 31 34 2e  |so./lib/ld-2.14.|
00000570  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 64 2d 32 2e  |90.so./lib/ld-2.|
00000580  31 34 2e 39 30 2e 73 6f  00 2f 75 73 72 2f 62 69  |14.90.so./usr/bi|
00000590  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005a0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005b0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 6c 69  |n/md5sum./usr/li|
000005c0  62 2f 6c 6f 63 61 6c 65  2f 6c 6f 63 61 6c 65 2d  |b/locale/locale-|
000005d0  61 72 63 68 69 76 65 00                           |archive.

--
vda

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-07-11 15:15 ` Oleg Nesterov
@ 2012-07-12 19:41   ` Denys Vlasenko
  0 siblings, 0 replies; 12+ messages in thread
From: Denys Vlasenko @ 2012-07-12 19:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, linux-kernel, Jonathan M. Foote, H. J. Lu,
	Ingo Molnar, H. Peter Anvin, Andi Kleen, Denys Vlasenko,
	Jan Kratochvil

On Wednesday 11 July 2012 17:15, Oleg Nesterov wrote:
> On 07/11, Denys Vlasenko wrote:
> >
> > I propose to save this information in core dump, as a new note
> > in note segment.
> 
> Denys, I am in no position to discuss whether we need this change or not,
> format, etc. I'll only try to comment the code.
> 
> And please do not use the attachments ;)
> 
> > +static void fill_files_note(struct memelfnote *note)
> > +{
> > +	struct vm_area_struct *vma;
> > +	struct file *file;
> > +	unsigned count, word_count, size, remaining;
> > +	long *data;
> > +	long *start_end_ofs;
> > +	char *name;
> > +
> > +	count = 0;
> > +	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
> > +		file = vma->vm_file;
> > +		if (!file)
> > +			continue;
> > +		count++;
> > +		if (count >= MAX_FILE_NOTE_SIZE / 64) /* paranoia check */
> > +			goto err;
> 
> Why this check? If count is huge, then...
> 
> > +	size = count * 64;
> > +	word_count = 2 + 3 * count;
> > + alloc:
> > +	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
> > +		goto err;
> 
> we should detect this case before the first alloc?

Unless count * 64 overflows an int :)
As I said in the comment: paranoia.

Perhaps that's TOO MUCH of paranoia. Removing.

> > +	size = (size + PAGE_SIZE - 1) & (-PAGE_SIZE);
> 
> Well, I'd suggest PAGE_MASK instead of -PAGE_SIZE. Better yet,
> 
> 	size = round_up(size, PAGE_SIZE);
> 
> > +		if (remaining == 0) {
> > + try_new_size:
> > +			vfree(data);
> > +			size = size * 5 / 4;
> > +			goto alloc;
> > +		}
> > +		filename = d_path(&file->f_path, name, remaining);
> > +		if (IS_ERR(filename)) {
> > +			if (PTR_ERR(filename) == -ENAMETOOLONG)
> > +				goto try_new_size;
> 
> This looks like unnecessary complication to me, or I missed something.
> d_path(..., buflen) should handle the "buflen == 0" case correctly, so
> afacics you can remove the "if (remaining == 0)" block and move this
> free-and-goto-alloc code under the -ENAMETOOLONG check.
> 
> > +		while ((remaining--, *name++ = *filename++) != '\0')
> > +			continue;
> 
> Well, perhaps this is just me... but this looks a bit too complex
> to me ;) I won't insist, but
> 
> 		do
> 			remaining--;
> 		while ((*name++ = *filename++));
> 
> looks more understandable, imho.

Okay.


> Or even
> 
> 		/* d_path() fills the end of the buffer */
> 		remaining = name - filename;
> 		strcpy(name, filename);

This does not advance "name" pointer... oh...
it's actually clever! But it'll fail if we took

                        /* continue; -- WRONG, we must have COUNT elements */
                        filename = "";
                }

branch just above... I will use an open-coded loop for now.

Sending v2 in a moment.

-- 
vda

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-02  0:24   ` Oleg Nesterov
@ 2012-04-02 11:20     ` Pedro Alves
  0 siblings, 0 replies; 12+ messages in thread
From: Pedro Alves @ 2012-04-02 11:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, Denys Vlasenko, H. Peter Anvin, Andrew Morton,
	Jan Kratochvil, linux-kernel

On 04/02/2012 01:24 AM, Oleg Nesterov wrote:

> On 04/01, Andi Kleen wrote:
>> >
>>> > > I propose to save this information in core dump, as a new note
>>> > > in note segment.
>> >
>> > Seems like a good idea but rather than write complicated code
> I agree, I feel it can be simplified...
> 
>> > i would just reuse
>> > the /proc/*/maps code and dump it in that format?
> I must have missed something. Do you really suggest to use
> show_pid_map/etc?
> 
> If nothing else, this code depends on CONFIG_PROC_FS. But in any
> case I think this will only complicate fill_files_note().
> 
> coredump is "simple", we are the last thread which can play with
> this ->mm. We do not need locks, we do not need the "restart after
> we dropped mmap_sem" logic. We know that the task_struct can't go
> away. The only problem is rename, that is why we can't allocate the
> whole buffer beforehand.
> 
> OK, I must have missed something ;)


In my perspective, it's about having a single format to consume.
Having only one format to care for in either the core or when
debugging a live process simplifies things for all parties involved.

(And it doesn't seem a good idea to me to have a better format that
handles more things when debugging cores than what tools have available
when debugging a live process.  If /proc/*/maps doesn't handle something
we might care for, then fix that too, not just the core note.)

The kernel.  The man pages.  Userspace tools.  E.g., GDB can trivially
be made to hook a new core note with /proc/*/maps-like contents with
the "(gdb) info proc maps" command (an often requested feature).

I even wish that more (if not all) of /proc/ 's objects were
copied verbatim (as much as possible) to a core dump.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-01  3:13 ` Andi Kleen
  2012-04-01  3:20   ` H. Peter Anvin
  2012-04-01 13:33   ` Denys Vlasenko
@ 2012-04-02  0:24   ` Oleg Nesterov
  2012-04-02 11:20     ` Pedro Alves
  2 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-02  0:24 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Denys Vlasenko, H. Peter Anvin, Andrew Morton, Jan Kratochvil,
	linux-kernel

On 04/01, Andi Kleen wrote:
>
> > I propose to save this information in core dump, as a new note
> > in note segment.
>
> Seems like a good idea but rather than write complicated code

I agree, I feel it can be simplified...

> i would just reuse
> the /proc/*/maps code and dump it in that format?

I must have missed something. Do you really suggest to use
show_pid_map/etc?

If nothing else, this code depends on CONFIG_PROC_FS. But in any
case I think this will only complicate fill_files_note().

coredump is "simple", we are the last thread which can play with
this ->mm. We do not need locks, we do not need the "restart after
we dropped mmap_sem" logic. We know that the task_struct can't go
away. The only problem is rename, that is why we can't allocate the
whole buffer beforehand.

OK, I must have missed something ;)

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-01 13:33   ` Denys Vlasenko
@ 2012-04-01 16:53     ` Andi Kleen
  0 siblings, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2012-04-01 16:53 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andi Kleen, H. Peter Anvin, Andrew Morton, Oleg Nesterov,
	Jan Kratochvil, linux-kernel

> It uses seq_printf(). It will need refactoring if we would
> want to use it for purposes other than /proc pseudo-file
> generation.

seq_file just writes to a buffer. I'm sure there's a way to get at that
buffer. In theory you could also just copy using read/write and a small temp
buffer. The only problem would be to stop it taking the vm sem.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-01  3:13 ` Andi Kleen
  2012-04-01  3:20   ` H. Peter Anvin
@ 2012-04-01 13:33   ` Denys Vlasenko
  2012-04-01 16:53     ` Andi Kleen
  2012-04-02  0:24   ` Oleg Nesterov
  2 siblings, 1 reply; 12+ messages in thread
From: Denys Vlasenko @ 2012-04-01 13:33 UTC (permalink / raw)
  To: Andi Kleen
  Cc: H. Peter Anvin, Andrew Morton, Oleg Nesterov, Jan Kratochvil,
	linux-kernel

On Sunday 01 April 2012 05:13, Andi Kleen wrote:
> > I propose to save this information in core dump, as a new note
> > in note segment.
> 
> Seems like a good idea but rather than write complicated code i would just reuse
> the /proc/*/maps code and dump it in that format?

I looked at this code (it's in fs/proc/task_mmu.c).

It uses seq_printf(). It will need refactoring if we would
want to use it for purposes other than /proc pseudo-file
generation.

-- 
vda

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-01  3:20   ` H. Peter Anvin
@ 2012-04-01  9:44     ` Andi Kleen
  0 siblings, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2012-04-01  9:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andi Kleen, Denys Vlasenko, Andrew Morton, Oleg Nesterov,
	Jan Kratochvil, linux-kernel

On Sat, Mar 31, 2012 at 08:20:18PM -0700, H. Peter Anvin wrote:
> On 03/31/2012 08:13 PM, Andi Kleen wrote:
> >> I propose to save this information in core dump, as a new note
> >> in note segment.
> > 
> > Seems like a good idea but rather than write complicated code i would just reuse
> > the /proc/*/maps code and dump it in that format?
> > 
> 
> Does /proc/*/maps handle oddball characters in filenames?

It doesn't handle new lines well, but then if someone does that they 
break all programs that parse maps for other reasons (and likely
a lot of other things too) anyways.

Other than that it should be ok: the file name is just the last component
on the line.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-04-01  3:13 ` Andi Kleen
@ 2012-04-01  3:20   ` H. Peter Anvin
  2012-04-01  9:44     ` Andi Kleen
  2012-04-01 13:33   ` Denys Vlasenko
  2012-04-02  0:24   ` Oleg Nesterov
  2 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-04-01  3:20 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Denys Vlasenko, Andrew Morton, Oleg Nesterov, Jan Kratochvil,
	linux-kernel

On 03/31/2012 08:13 PM, Andi Kleen wrote:
>> I propose to save this information in core dump, as a new note
>> in note segment.
> 
> Seems like a good idea but rather than write complicated code i would just reuse
> the /proc/*/maps code and dump it in that format?
> 

Does /proc/*/maps handle oddball characters in filenames?

	-hpa



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Extend core dump note section to contain file names of mapped files
  2012-03-31 20:51 Denys Vlasenko
@ 2012-04-01  3:13 ` Andi Kleen
  2012-04-01  3:20   ` H. Peter Anvin
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Andi Kleen @ 2012-04-01  3:13 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: H. Peter Anvin, Andi Kleen, Andrew Morton, Oleg Nesterov,
	Jan Kratochvil, linux-kernel

> I propose to save this information in core dump, as a new note
> in note segment.

Seems like a good idea but rather than write complicated code i would just reuse
the /proc/*/maps code and dump it in that format?

-Andi


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Extend core dump note section to contain file names of mapped files
@ 2012-03-31 20:51 Denys Vlasenko
  2012-04-01  3:13 ` Andi Kleen
  0 siblings, 1 reply; 12+ messages in thread
From: Denys Vlasenko @ 2012-03-31 20:51 UTC (permalink / raw)
  To: H. Peter Anvin, Andi Kleen, Andrew Morton, Oleg Nesterov, Jan Kratochvil
  Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4193 bytes --]

Hi,

[forgot to CC lkml, resending]

While working with core dump analysis, it struck me how much
PITA is caused merely by the fact that names of loaded binary
and libraries are not known.

gdb retrieves loaded library names by examining dynamic loader's
data stored in the core dump's data segments. It uses intimate
knowledge how and where dynamic loader keeps the list of loaded
libraries. (Meaning that it will break if non-standard loader
is used).

And, as Jan explained to me, it depends on knowing where
the linked list of libraries starts, which requires knowing binary
which was running. IIRC there is no easy and reasonably foolproof
way to determine binary's name. (Looking at argv[0] on stack
is not reasonably foolproof).

Which is *ridiculous*. We *know* the list of mapped files
at core dump generation time.

I propose to save this information in core dump, as a new note
in note segment.

This note has the following format:

long count     // how many files are mapped
long page_size // units for file_ofs
array of [COUNT] elements of
   long start
   long end
   long file_ofs
followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...

The attached patch implements this.

Since list of mapped files can be large (/proc/`pidof firefox`/maps
on my machine right now is 38k), I allocate the space for note
via vmalloc, and also have a sanity limit of 4 megabytes.
(Maybe we should make it smaller?)
Oleg suggested using a linked list of smaller structures instead of
using a potentially large contiguous block, and I tried it,
but resulting code was significantly more ugly (for my taste).

The patch is run-tested.

For testing, I sent ABRT signal to a running /usr/bin/md5sum.

"readelf -aW core" shows the new note as:

Notes at offset 0x00000274 with length 0x00000990:
 Owner                 Data size       Description
 CORE                 0x00000090       NT_PRSTATUS (prstatus structure)
 CORE                 0x0000007c       NT_PRPSINFO (prpsinfo structure)
 CORE                 0x000000a0       NT_AUXV (auxiliary vector)
 CORE                 0x00000168       Unknown note type: (0x46494c45)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^new note^^^^^^^^^^

In hex format:
                                               05 00 00 00  |................|
00000460  68 01 00 00 45 4c 49 46  46 49 4c 45 00 00 00 00  |h...ELIFCORE....|
00000470  0b 00 00 00 00 10 00 00  00 80 17 00 00 f0 31 00  |..............1.|
00000480  00 00 00 00 00 f0 31 00  00 00 32 00 a7 01 00 00  |......1...2.....|
00000490  00 00 32 00 00 20 32 00  a7 01 00 00 00 20 32 00  |..2.. 2...... 2.|
000004a0  00 30 32 00 a9 01 00 00  00 50 69 00 00 60 6b 00  |.02......Pi..`k.|
000004b0  00 00 00 00 00 60 6b 00  00 70 6b 00 20 00 00 00  |.....`k..pk. ...|
000004c0  00 70 6b 00 00 80 6b 00  21 00 00 00 00 80 04 08  |.pk...k.!.......|
000004d0  00 00 05 08 00 00 00 00  00 00 05 08 00 10 05 08  |................|
000004e0  07 00 00 00 00 10 05 08  00 20 05 08 08 00 00 00  |......... ......|
000004f0  00 20 52 b7 00 20 72 b7  00 00 00 00 2f 6c 69 62  |. R.. r...../lib|
00000500  2f 6c 69 62 63 2d 32 2e  31 34 2e 39 30 2e 73 6f  |/libc-2.14.90.so|
00000510  00 2f 6c 69 62 2f 6c 69  62 63 2d 32 2e 31 34 2e  |./lib/libc-2.14.|
00000520  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 69 62 63 2d  |90.so./lib/libc-|
00000530  32 2e 31 34 2e 39 30 2e  73 6f 00 2f 6c 69 62 2f  |2.14.90.so./lib/|
00000540  6c 69 62 63 2d 32 2e 31  34 2e 39 30 2e 73 6f 00  |libc-2.14.90.so.|
00000550  2f 6c 69 62 2f 6c 64 2d  32 2e 31 34 2e 39 30 2e  |/lib/ld-2.14.90.|
00000560  73 6f 00 2f 6c 69 62 2f  6c 64 2d 32 2e 31 34 2e  |so./lib/ld-2.14.|
00000570  39 30 2e 73 6f 00 2f 6c  69 62 2f 6c 64 2d 32 2e  |90.so./lib/ld-2.|
00000580  31 34 2e 39 30 2e 73 6f  00 2f 75 73 72 2f 62 69  |14.90.so./usr/bi|
00000590  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005a0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 62 69  |n/md5sum./usr/bi|
000005b0  6e 2f 6d 64 35 73 75 6d  00 2f 75 73 72 2f 6c 69  |n/md5sum./usr/li|
000005c0  62 2f 6c 6f 63 61 6c 65  2f 6c 6f 63 61 6c 65 2d  |b/locale/locale-|
000005d0  61 72 63 68 69 76 65 00                           |archive.

--
vda

[-- Attachment #2: file_note.patch --]
[-- Type: application/octet-stream, Size: 3404 bytes --]

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 81878b7..b585ba1 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1358,6 +1358,73 @@ static void fill_auxv_note(struct memelfnote *note, struct mm_struct *mm)
 	fill_note(note, "CORE", NT_AUXV, i * sizeof(elf_addr_t), auxv);
 }
 
+#define MAX_FILE_NOTE_SIZE (4*1024*1024)
+
+static void fill_files_note(struct memelfnote *note)
+{
+	struct vm_area_struct *vma;
+	struct file *file;
+	unsigned count, word_count, size, remaining;
+	long *data;
+	long *start_end_ofs;
+	char *name;
+
+	count = 0;
+	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+		file = vma->vm_file;
+		if (!file)
+			continue;
+		count++;
+		if (count >= MAX_FILE_NOTE_SIZE / 64) /* paranoia check */
+			goto err;
+	}
+
+	size = count * 64;
+	word_count = 2 + 3 * count;
+ alloc:
+	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+		goto err;
+	size = (size + PAGE_SIZE - 1) & (-PAGE_SIZE);
+	data = vmalloc(size);
+	if (!data)
+		goto err;
+	start_end_ofs = data;
+	name = (void*)&start_end_ofs[word_count];
+	remaining = size - word_count * sizeof(long);
+
+	*start_end_ofs++ = count;
+	*start_end_ofs++ = PAGE_SIZE;
+	for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+		const char *filename;
+
+		file = vma->vm_file;
+		if (!file)
+			continue;
+		if (remaining == 0) {
+ try_new_size:
+			vfree(data);
+			size = size * 5 / 4;
+			goto alloc;
+		}
+		filename = d_path(&file->f_path, name, remaining);
+		if (IS_ERR(filename)) {
+			if (PTR_ERR(filename) == -ENAMETOOLONG)
+				goto try_new_size;
+			/* continue; -- WRONG, we must have COUNT elements */
+			filename = "";
+		}
+		while ((remaining--, *name++ = *filename++) != '\0')
+			continue;
+		*start_end_ofs++ = vma->vm_start;
+		*start_end_ofs++ = vma->vm_end;
+		*start_end_ofs++ = vma->vm_pgoff;
+	}
+
+	size = name - (char*)data;
+	fill_note(note, "CORE", NT_FILE, size, data);
+ err: ;
+}
+
 #ifdef CORE_DUMP_USE_REGSET
 #include <linux/regset.h>
 
@@ -1372,6 +1439,7 @@ struct elf_note_info {
 	struct elf_thread_core_info *thread;
 	struct memelfnote psinfo;
 	struct memelfnote auxv;
+	struct memelfnote files;
 	size_t size;
 	int thread_notes;
 };
@@ -1532,6 +1600,9 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
 	fill_auxv_note(&info->auxv, current->mm);
 	info->size += notesize(&info->auxv);
 
+	fill_files_note(&info->files);
+	info->size += notesize(&info->files);
+
 	return 1;
 }
 
@@ -1560,6 +1631,8 @@ static int write_note_info(struct elf_note_info *info,
 			return 0;
 		if (first && !writenote(&info->auxv, file, foffset))
 			return 0;
+		if (first && !writenote(&info->files, file, foffset))
+			return 0;
 
 		for (i = 1; i < info->thread_notes; ++i)
 			if (t->notes[i].data &&
@@ -1586,6 +1659,7 @@ static void free_note_info(struct elf_note_info *info)
 		kfree(t);
 	}
 	kfree(info->psinfo.data);
+	vfree(info->files.data);
 }
 
 #else
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 999b4f5..5e6c08f 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -372,6 +372,7 @@ typedef struct elf64_shdr {
 #define NT_PRPSINFO	3
 #define NT_TASKSTRUCT	4
 #define NT_AUXV		6
+#define NT_FILE		0x46494c45
 #define NT_PRXFPREG     0x46e62b7f      /* copied from gdb5.1/include/elf/common.h */
 #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
 #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-07-12 19:41 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-11 10:35 [PATCH] Extend core dump note section to contain file names of mapped files Denys Vlasenko
2012-07-11 15:15 ` Oleg Nesterov
2012-07-12 19:41   ` Denys Vlasenko
2012-07-11 15:40 ` Jonathan M. Foote
  -- strict thread matches above, loose matches on Subject: below --
2012-03-31 20:51 Denys Vlasenko
2012-04-01  3:13 ` Andi Kleen
2012-04-01  3:20   ` H. Peter Anvin
2012-04-01  9:44     ` Andi Kleen
2012-04-01 13:33   ` Denys Vlasenko
2012-04-01 16:53     ` Andi Kleen
2012-04-02  0:24   ` Oleg Nesterov
2012-04-02 11:20     ` Pedro Alves

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.