linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
@ 2010-01-04  1:06 Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 1/5] Unify dump_seek() implementations for each binfmt_*.c Daisuke HATAYAMA
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:06 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

I resend this serise of patches because in the previous post I didn't
get any reply.

The changes from v1 to v2:
 * fix binfmt_elf_fdpic.c and binfmt_aout.c, too
 * move dump_write() and dump_seek() into a header file
 * fix some style issues
 * correct confusing part of the patch descriptions
===
Summary
=======

The current ELF dumper can produce broken corefiles if program headers
exceed 65535. In particular, the program in 64-bit environment often
demands more than 65535 mmaps. If you google max_map_count, then you
can find many users facing this problem.

Solaris has already dealt with this issue, and other OSes have also
adopted the same method as in Solaris. Currently, Sun's document and
AMD 64 ABI include the description for the extension, where they call
the extension Extended Numbering. See Reference for further information.

I believe that linux kernel should adopt the same way as they did, so
I've written this patch.

I am also preparing for patches of GDB and binutils.

How to fix
==========

In new dumping process, there are two cases according to weather or
not the number of program headers is equal to or more than 65535.

 - if less than 65535, the produced corefile format is exactly the
   same as the ordinary one.

 - if equal to or more than 65535, then e_phnum field is set to newly
   introduced constant PN_XNUM(0xffff) and the actual number of
   program headers is set to sh_info field of the section header at
   index 0.

Compatibility Concern
=====================

 * As already mentioned in Summary, Sun and AMD64 has already adopted
   this. See Reference.

 * There are four combinations according to whether kernel and
   userland tools are respectively modified or not. The next table
   summarizes shortly for each combination.

                  ---------------------------------------------
                     Original Kernel    |   Modified Kernel   
                  ---------------------------------------------
    	            < 65535  | >= 65535 | < 65535  | >= 65535 
  -------------------------------------------------------------
   Original Tools |    OK    |  broken  |   OK     | broken (#)
  -------------------------------------------------------------
   Modified Tools |    OK    |  broken  |   OK     |    OK
  -------------------------------------------------------------

  Note that there is no case that `OK' changes to `broken'.

  (#) Although this case remains broken, O-M behaves better than
  O-O. That is, while in O-O case e_phnum field would be extremely
  small due to integer overflow, in O-M case it is guaranteed to be at
  least 65535 by being set to PN_XNUM(0xFFFF), much closer to the
  actual correct value than the O-O case.

Test Program
============

Here is a test program mkmmaps.c that is useful to produce the
corefile with many mmaps. To use this, please take the following
steps:

$ ulimit -c unlimited
$ sysctl vm.max_map_count=70000 # default 65530 is too small
$ sysctl fs.file-max=70000
$ mkmmaps 65535

Then, the program will abort and a corefile will be generated.

If failed, there are two cases according to the error message
displayed.

 * ``out of memory'' means vm.max_map_count is still smaller

 * ``too many open files'' means fs.file-max is still smaller

So, please change it to a larger value, and then retry it.

mkmmaps.c
==
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char **argv)
{ 
	int maps_num;
	if (argc < 2) {
		fprintf(stderr, "mkmmaps [number of maps to be created]\n");
		exit(1);
	}
	if (sscanf(argv[1], "%d", &maps_num) == EOF) {
		perror("sscanf");
		exit(2);
	}
	if (maps_num < 0) {
		fprintf(stderr, "%d is invalid\n", maps_num);
		exit(3);
	}
	for (; maps_num > 0; --maps_num) {
		if (MAP_FAILED == mmap((void *)NULL, (size_t) 1, PROT_READ,
					MAP_SHARED | MAP_ANONYMOUS, (int) -1,
					(off_t) NULL)) {
			perror("mmap");
			exit(4);
		}    
	}
	abort();
	{
		char buffer[128];
		sprintf(buffer, "wc -l /proc/%u/maps", getpid());
		system(buffer);
	}
	return 0;
}

Patches
=======

Tested on i386, ia64 and um/sys-i386.
Built on sh4 (which covers fs/binfmt_elf_fdpic.c)

diffstat output:

 arch/ia64/ia32/binfmt_elf32.c |    1 +
 arch/ia64/ia32/elfcore32.h    |   18 +++++
 arch/ia64/include/asm/elf.h   |   48 ------------
 arch/ia64/kernel/Makefile     |    2 +
 arch/ia64/kernel/elfcore.c    |   80 +++++++++++++++++++
 arch/um/sys-i386/Makefile     |    2 +
 arch/um/sys-i386/asm/elf.h    |   43 ----------
 arch/um/sys-i386/elfcore.c    |   83 ++++++++++++++++++++
 fs/binfmt_aout.c              |   36 +++------
 fs/binfmt_elf.c               |  141 +++++++++++++++++++++-------------
 fs/binfmt_elf_fdpic.c         |  171 ++++++++++++++++++++++++++--------------
 include/linux/coredump.h      |   41 ++++++++++
 include/linux/elf.h           |   28 +++++++-
 include/linux/elfcore.h       |   17 ++++
 kernel/Makefile               |    2 +
 kernel/elfcore.c              |   28 +++++++
 16 files changed, 512 insertions(+), 229 deletions(-)

Reference
=========

 - Sun microsystems: Linker and Libraries.
   Part No: 817-1984-17, September 2008.
   URL: http://docs.sun.com/app/docs/doc/817-1984

 - System V ABI AMD64 Architecture Processor Supplement
   Draft Version 0.99., May 11, 2009.
   URL: http://www.x86-64.org/

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RESEND][mmotm][PATCH v2, 1/5] Unify dump_seek() implementations for each binfmt_*.c
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
@ 2010-01-04  1:20 ` Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 2/5] Move dump_write() and dump_seek() into a header file Daisuke HATAYAMA
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:20 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

There are three different definitions for dump_seek() functions in
binfmt_aout.c, binfmt_elf.c and binfmt_elf_fdpic.c, respectively.  The
difference comes from the fact that a lot of patches has been applied
only for binfmt_elf.c.

My next patch will move dump_seek() into a header file in order to
share the same implementations for dump_write() and dump_seek(). As
the first step, this patch unify these three definitions for
dump_seek() by applying the past commits that have been applied only
for binfmt_elf.c.

Specifically, the modification made here is part of the following
commits:

  * d025c9db7f31fc0554ce7fb2dfc78d35a77f3487
  * 7f14daa19ea36b200d237ad3ac5826ae25360461

This patch does not change a shape of corefiles.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
---
 fs/binfmt_aout.c      |   31 ++++++++++++++++++++-----
 fs/binfmt_elf_fdpic.c |   59 +++++++++++++++++++++++++++++++-----------------
 2 files changed, 62 insertions(+), 28 deletions(-)

diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 3bd288c..10717fb 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -69,16 +69,32 @@ static int dump_write(struct file *file, const void *addr, int nr)
 	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
 }
 
+static int dump_seek(struct file *file, loff_t off)
+{
+	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
+		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
+			return 0;
+	} else {
+		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
+		if (!buf)
+			return 0;
+		while (off > 0) {
+			unsigned long n = off;
+			if (n > PAGE_SIZE)
+				n = PAGE_SIZE;
+			if (!dump_write(file, buf, n))
+				return 0;
+			off -= n;
+		}
+		free_page((unsigned long)buf);
+	}
+	return 1;
+}
+
 #define DUMP_WRITE(addr, nr)	\
 	if (!dump_write(file, (void *)(addr), (nr))) \
 		goto end_coredump;
 
-#define DUMP_SEEK(offset) \
-if (file->f_op->llseek) { \
-	if (file->f_op->llseek(file,(offset),0) != (offset)) \
- 		goto end_coredump; \
-} else file->f_pos = (offset)
-
 /*
  * Routine writes a core dump image in the current directory.
  * Currently only a stub-function.
@@ -132,7 +148,8 @@ static int aout_core_dump(struct coredump_params *cprm)
 /* struct user */
 	DUMP_WRITE(&dump,sizeof(dump));
 /* Now dump all of the user data.  Include malloced stuff as well */
-	DUMP_SEEK(PAGE_SIZE);
+	if (!dump_seek(cprm->file, PAGE_SIZE - sizeof(dump)))
+		goto end_coredump;
 /* now we start writing out the user space info */
 	set_fs(USER_DS);
 /* Dump the data area */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c25256a..ab9aa76 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1214,11 +1214,22 @@ static int dump_write(struct file *file, const void *addr, int nr)
 
 static int dump_seek(struct file *file, loff_t off)
 {
-	if (file->f_op->llseek) {
-		if (file->f_op->llseek(file, off, SEEK_SET) != off)
+	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
+		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
 			return 0;
 	} else {
-		file->f_pos = off;
+		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
+		if (!buf)
+			return 0;
+		while (off > 0) {
+			unsigned long n = off;
+			if (n > PAGE_SIZE)
+				n = PAGE_SIZE;
+			if (!dump_write(file, buf, n))
+				return 0;
+			off -= n;
+		}
+		free_page((unsigned long)buf);
 	}
 	return 1;
 }
@@ -1301,30 +1312,35 @@ static int notesize(struct memelfnote *en)
 
 /* #define DEBUG */
 
-#define DUMP_WRITE(addr, nr)	\
-	do { if (!dump_write(file, (addr), (nr))) return 0; } while(0)
-#define DUMP_SEEK(off)	\
-	do { if (!dump_seek(file, (off))) return 0; } while(0)
+#define DUMP_WRITE(addr, nr, foffset)	\
+	do { if (!dump_write(file, (addr), (nr))) return 0; *foffset += (nr); } while(0)
 
-static int writenote(struct memelfnote *men, struct file *file)
+static int alignfile(struct file *file, loff_t *foffset)
 {
-	struct elf_note en;
+	static const char buf[4] = { 0, };
+	DUMP_WRITE(buf, roundup(*foffset, 4) - *foffset, foffset);
+	return 1;
+}
 
+static int writenote(struct memelfnote *men, struct file *file,
+			loff_t *foffset)
+{
+	struct elf_note en;
 	en.n_namesz = strlen(men->name) + 1;
 	en.n_descsz = men->datasz;
 	en.n_type = men->type;
 
-	DUMP_WRITE(&en, sizeof(en));
-	DUMP_WRITE(men->name, en.n_namesz);
-	/* XXX - cast from long long to long to avoid need for libgcc.a */
-	DUMP_SEEK(roundup((unsigned long)file->f_pos, 4));	/* XXX */
-	DUMP_WRITE(men->data, men->datasz);
-	DUMP_SEEK(roundup((unsigned long)file->f_pos, 4));	/* XXX */
+	DUMP_WRITE(&en, sizeof(en), foffset);
+	DUMP_WRITE(men->name, en.n_namesz, foffset);
+	if (!alignfile(file, foffset))
+		return 0;
+	DUMP_WRITE(men->data, men->datasz, foffset);
+	if (!alignfile(file, foffset))
+		return 0;
 
 	return 1;
 }
 #undef DUMP_WRITE
-#undef DUMP_SEEK
 
 #define DUMP_WRITE(addr, nr)				\
 	if ((size += (nr)) > cprm->limit ||		\
@@ -1540,7 +1556,7 @@ static int elf_fdpic_dump_segments(struct file *file, size_t *size,
 					err = -EIO;
 				kunmap(page);
 				page_cache_release(page);
-			} else if (!dump_seek(file, file->f_pos + PAGE_SIZE))
+			} else if (!dump_seek(file, PAGE_SIZE))
 				err = -EFBIG;
 			if (err)
 				goto out;
@@ -1593,7 +1609,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	int i;
 	struct vm_area_struct *vma;
 	struct elfhdr *elf = NULL;
-	loff_t offset = 0, dataoff;
+	loff_t offset = 0, dataoff, foffset;
 	int numnote;
 	struct memelfnote *notes = NULL;
 	struct elf_prstatus *prstatus = NULL;	/* NT_PRSTATUS */
@@ -1718,6 +1734,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	DUMP_WRITE(elf, sizeof(*elf));
 	offset += sizeof(*elf);				/* Elf header */
 	offset += (segs+1) * sizeof(struct elf_phdr);	/* Program headers */
+	foffset = offset;
 
 	/* Write notes phdr entry */
 	{
@@ -1774,7 +1791,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 
  	/* write out the notes section */
 	for (i = 0; i < numnote; i++)
-		if (!writenote(notes + i, cprm->file))
+		if (!writenote(notes + i, cprm->file, &foffset))
 			goto end_coredump;
 
 	/* write out the thread status notes section */
@@ -1783,11 +1800,11 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 				list_entry(t, struct elf_thread_status, list);
 
 		for (i = 0; i < tmp->num_notes; i++)
-			if (!writenote(&tmp->notes[i], cprm->file))
+			if (!writenote(&tmp->notes[i], cprm->file, &foffset))
 				goto end_coredump;
 	}
 
-	if (!dump_seek(cprm->file, dataoff))
+	if (!dump_seek(cprm->file, dataoff - foffset))
 		goto end_coredump;
 
 	if (elf_fdpic_dump_segments(cprm->file, &size, &cprm->limit,
-- 
1.6.5.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND][mmotm][PATCH v2, 2/5] Move dump_write() and dump_seek() into a header file
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 1/5] Unify dump_seek() implementations for each binfmt_*.c Daisuke HATAYAMA
@ 2010-01-04  1:20 ` Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 3/5] elf coredump: replace ELF_CORE_EXTRA_* macros by functions Daisuke HATAYAMA
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:20 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

My next patch will replace ELF_CORE_EXTRA_* macros by functions,
putting them into other newly created *.c files. Then, each files will
contain dump_write(), where each pair of binfmt_*.c and elfcore.c
should be the same. So, this patch moves them into a header file with
dump_seek(). Also, the patch deletes confusing DUMP_WRITE macros in
each files.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
---
 fs/binfmt_aout.c         |   49 +++++++----------------------------------
 fs/binfmt_elf.c          |   52 ++++++++++++--------------------------------
 fs/binfmt_elf_fdpic.c    |   54 ++++++++++++---------------------------------
 include/linux/coredump.h |   41 ++++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 117 deletions(-)
 create mode 100644 include/linux/coredump.h

diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 10717fb..75954e8 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -24,6 +24,7 @@
 #include <linux/binfmts.h>
 #include <linux/personality.h>
 #include <linux/init.h>
+#include <linux/coredump.h>
 
 #include <asm/system.h>
 #include <asm/uaccess.h>
@@ -60,42 +61,6 @@ static int set_brk(unsigned long start, unsigned long end)
 }
 
 /*
- * These are the only things you should do on a core-file: use only these
- * macros to write out all the necessary info.
- */
-
-static int dump_write(struct file *file, const void *addr, int nr)
-{
-	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
-}
-
-static int dump_seek(struct file *file, loff_t off)
-{
-	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
-		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
-			return 0;
-	} else {
-		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
-		if (!buf)
-			return 0;
-		while (off > 0) {
-			unsigned long n = off;
-			if (n > PAGE_SIZE)
-				n = PAGE_SIZE;
-			if (!dump_write(file, buf, n))
-				return 0;
-			off -= n;
-		}
-		free_page((unsigned long)buf);
-	}
-	return 1;
-}
-
-#define DUMP_WRITE(addr, nr)	\
-	if (!dump_write(file, (void *)(addr), (nr))) \
-		goto end_coredump;
-
-/*
  * Routine writes a core dump image in the current directory.
  * Currently only a stub-function.
  *
@@ -146,7 +111,8 @@ static int aout_core_dump(struct coredump_params *cprm)
 
 	set_fs(KERNEL_DS);
 /* struct user */
-	DUMP_WRITE(&dump,sizeof(dump));
+	if (!dump_write(file, &dump, sizeof(dump)))
+		goto end_coredump;
 /* Now dump all of the user data.  Include malloced stuff as well */
 	if (!dump_seek(cprm->file, PAGE_SIZE - sizeof(dump)))
 		goto end_coredump;
@@ -156,17 +122,20 @@ static int aout_core_dump(struct coredump_params *cprm)
 	if (dump.u_dsize != 0) {
 		dump_start = START_DATA(dump);
 		dump_size = dump.u_dsize << PAGE_SHIFT;
-		DUMP_WRITE(dump_start,dump_size);
+		if (!dump_write(file, dump_start, dump_size))
+			goto end_coredump;
 	}
 /* Now prepare to dump the stack area */
 	if (dump.u_ssize != 0) {
 		dump_start = START_STACK(dump);
 		dump_size = dump.u_ssize << PAGE_SHIFT;
-		DUMP_WRITE(dump_start,dump_size);
+		if (!dump_write(file, dump_start, dump_size))
+			goto end_coredump;
 	}
 /* Finally dump the task struct.  Not be used by gdb, but could be useful */
 	set_fs(KERNEL_DS);
-	DUMP_WRITE(current,sizeof(*current));
+	if (!dump_write(file, current, sizeof(*current)))
+		goto end_coredump;
 end_coredump:
 	set_fs(fs);
 	return has_dumped;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..a0fe475 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -31,6 +31,7 @@
 #include <linux/random.h>
 #include <linux/elf.h>
 #include <linux/utsname.h>
+#include <linux/coredump.h>
 #include <asm/uaccess.h>
 #include <asm/param.h>
 #include <asm/page.h>
@@ -1108,36 +1109,6 @@ out:
  * Modelled on fs/exec.c:aout_core_dump()
  * Jeremy Fitzhardinge <jeremy@sw.oz.au>
  */
-/*
- * These are the only things you should do on a core-file: use only these
- * functions to write out all the necessary info.
- */
-static int dump_write(struct file *file, const void *addr, int nr)
-{
-	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
-}
-
-static int dump_seek(struct file *file, loff_t off)
-{
-	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
-		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
-			return 0;
-	} else {
-		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
-		if (!buf)
-			return 0;
-		while (off > 0) {
-			unsigned long n = off;
-			if (n > PAGE_SIZE)
-				n = PAGE_SIZE;
-			if (!dump_write(file, buf, n))
-				return 0;
-			off -= n;
-		}
-		free_page((unsigned long)buf);
-	}
-	return 1;
-}
 
 /*
  * Decide what to dump of a segment, part, all or none.
@@ -1272,11 +1243,6 @@ static int writenote(struct memelfnote *men, struct file *file,
 }
 #undef DUMP_WRITE
 
-#define DUMP_WRITE(addr, nr)				\
-	if ((size += (nr)) > cprm->limit ||		\
-	    !dump_write(cprm->file, (addr), (nr)))	\
-		goto end_coredump;
-
 static void fill_elf_header(struct elfhdr *elf, int segs,
 			    u16 machine, u32 flags, u8 osabi)
 {
@@ -1957,7 +1923,10 @@ static int elf_core_dump(struct coredump_params *cprm)
 	fs = get_fs();
 	set_fs(KERNEL_DS);
 
-	DUMP_WRITE(elf, sizeof(*elf));
+	size += sizeof(*elf);
+	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
+		goto end_coredump;
+
 	offset += sizeof(*elf);				/* Elf header */
 	offset += (segs + 1) * sizeof(struct elf_phdr); /* Program headers */
 	foffset = offset;
@@ -1971,7 +1940,11 @@ static int elf_core_dump(struct coredump_params *cprm)
 
 		fill_elf_note_phdr(&phdr, sz, offset);
 		offset += sz;
-		DUMP_WRITE(&phdr, sizeof(phdr));
+
+		size += sizeof(phdr);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+			goto end_coredump;
 	}
 
 	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
@@ -2002,7 +1975,10 @@ static int elf_core_dump(struct coredump_params *cprm)
 			phdr.p_flags |= PF_X;
 		phdr.p_align = ELF_EXEC_PAGESIZE;
 
-		DUMP_WRITE(&phdr, sizeof(phdr));
+		size += sizeof(phdr);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+			goto end_coredump;
 	}
 
 #ifdef ELF_CORE_WRITE_EXTRA_PHDRS
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index ab9aa76..d928e5f 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -34,6 +34,7 @@
 #include <linux/elf.h>
 #include <linux/elf-fdpic.h>
 #include <linux/elfcore.h>
+#include <linux/coredump.h>
 
 #include <asm/uaccess.h>
 #include <asm/param.h>
@@ -1204,37 +1205,6 @@ static int elf_fdpic_map_file_by_direct_mmap(struct elf_fdpic_params *params,
 #ifdef CONFIG_ELF_CORE
 
 /*
- * These are the only things you should do on a core-file: use only these
- * functions to write out all the necessary info.
- */
-static int dump_write(struct file *file, const void *addr, int nr)
-{
-	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
-}
-
-static int dump_seek(struct file *file, loff_t off)
-{
-	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
-		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
-			return 0;
-	} else {
-		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
-		if (!buf)
-			return 0;
-		while (off > 0) {
-			unsigned long n = off;
-			if (n > PAGE_SIZE)
-				n = PAGE_SIZE;
-			if (!dump_write(file, buf, n))
-				return 0;
-			off -= n;
-		}
-		free_page((unsigned long)buf);
-	}
-	return 1;
-}
-
-/*
  * Decide whether a segment is worth dumping; default is yes to be
  * sure (missing info is worse than too much; etc).
  * Personally I'd include everything, and use the coredump limit...
@@ -1342,11 +1312,6 @@ static int writenote(struct memelfnote *men, struct file *file,
 }
 #undef DUMP_WRITE
 
-#define DUMP_WRITE(addr, nr)				\
-	if ((size += (nr)) > cprm->limit ||		\
-	    !dump_write(cprm->file, (addr), (nr)))	\
-		goto end_coredump;
-
 static inline void fill_elf_fdpic_header(struct elfhdr *elf, int segs)
 {
 	memcpy(elf->e_ident, ELFMAG, SELFMAG);
@@ -1731,7 +1696,11 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	fs = get_fs();
 	set_fs(KERNEL_DS);
 
-	DUMP_WRITE(elf, sizeof(*elf));
+	size += sizeof(*elf);
+	if (size > cprm->limit
+	    || !dump_write(cprm->file, elf, sizeof(*elf)))
+		goto end_coredump;
+
 	offset += sizeof(*elf);				/* Elf header */
 	offset += (segs+1) * sizeof(struct elf_phdr);	/* Program headers */
 	foffset = offset;
@@ -1748,7 +1717,11 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 
 		fill_elf_note_phdr(&phdr, sz, offset);
 		offset += sz;
-		DUMP_WRITE(&phdr, sizeof(phdr));
+
+		size += sizeof(phdr);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+			goto end_coredump;
 	}
 
 	/* Page-align dumped data */
@@ -1782,7 +1755,10 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 			phdr.p_flags |= PF_X;
 		phdr.p_align = ELF_EXEC_PAGESIZE;
 
-		DUMP_WRITE(&phdr, sizeof(phdr));
+		size += sizeof(phdr);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+			goto end_coredump;
 	}
 
 #ifdef ELF_CORE_WRITE_EXTRA_PHDRS
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
new file mode 100644
index 0000000..b3c91d7
--- /dev/null
+++ b/include/linux/coredump.h
@@ -0,0 +1,41 @@
+#ifndef _LINUX_COREDUMP_H
+#define _LINUX_COREDUMP_H
+
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/fs.h>
+
+/*
+ * These are the only things you should do on a core-file: use only these
+ * functions to write out all the necessary info.
+ */
+static inline int dump_write(struct file *file, const void *addr, int nr)
+{
+	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
+}
+
+static inline int dump_seek(struct file *file, loff_t off)
+{
+	if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
+		if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
+			return 0;
+	} else {
+		char *buf = (char *)get_zeroed_page(GFP_KERNEL);
+
+		if (!buf)
+			return 0;
+		while (off > 0) {
+			unsigned long n = off;
+
+			if (n > PAGE_SIZE)
+				n = PAGE_SIZE;
+			if (!dump_write(file, buf, n))
+				return 0;
+			off -= n;
+		}
+		free_page((unsigned long)buf);
+	}
+	return 1;
+}
+
+#endif /* _LINUX_COREDUMP_H */
-- 
1.6.5.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND][mmotm][PATCH v2, 3/5] elf coredump: replace ELF_CORE_EXTRA_* macros by functions
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 1/5] Unify dump_seek() implementations for each binfmt_*.c Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 2/5] Move dump_write() and dump_seek() into a header file Daisuke HATAYAMA
@ 2010-01-04  1:20 ` Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 4/5] elf coredump: make offset calculation process and writing process explicit Daisuke HATAYAMA
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:20 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

elf_core_dump() and elf_fdpic_core_dump() use #ifdef and the
corresponding macro for hiding _multiline_ logics in functions. This
patch removes #ifdef and replaces ELF_CORE_EXTRA_* by corresponding
functions. For architectures not implemeonting ELF_CORE_EXTRA_*, we
use weak functions in order to reduce a range of modification.

This cleanup is for my next patches, but I think this cleanup itself
is worth doing regardless of my firnal purpose.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
---
 arch/ia64/ia32/binfmt_elf32.c |    1 +
 arch/ia64/ia32/elfcore32.h    |   17 ++++++++++
 arch/ia64/include/asm/elf.h   |   48 -----------------------------
 arch/ia64/kernel/Makefile     |    2 +
 arch/ia64/kernel/elfcore.c    |   64 +++++++++++++++++++++++++++++++++++++++
 arch/um/sys-i386/Makefile     |    2 +
 arch/um/sys-i386/asm/elf.h    |   43 --------------------------
 arch/um/sys-i386/elfcore.c    |   67 +++++++++++++++++++++++++++++++++++++++++
 fs/binfmt_elf.c               |   14 +++-----
 fs/binfmt_elf_fdpic.c         |   14 +++-----
 include/linux/elf.h           |    2 +
 include/linux/elfcore.h       |   16 ++++++++++
 kernel/Makefile               |    2 +
 kernel/elfcore.c              |   23 ++++++++++++++
 14 files changed, 206 insertions(+), 109 deletions(-)
 create mode 100644 arch/ia64/kernel/elfcore.c
 create mode 100644 arch/um/sys-i386/elfcore.c
 create mode 100644 kernel/elfcore.c

diff --git a/arch/ia64/ia32/binfmt_elf32.c b/arch/ia64/ia32/binfmt_elf32.c
index c69552b..2328f44 100644
--- a/arch/ia64/ia32/binfmt_elf32.c
+++ b/arch/ia64/ia32/binfmt_elf32.c
@@ -43,6 +43,7 @@ randomize_stack_top(unsigned long stack_top);
 #undef SET_PERSONALITY
 #define SET_PERSONALITY(ex)	elf32_set_personality()
 
+#undef elf_read_implies_exec
 #define elf_read_implies_exec(ex, have_pt_gnu_stack)	(!(have_pt_gnu_stack))
 
 /* Ugly but avoids duplication */
diff --git a/arch/ia64/ia32/elfcore32.h b/arch/ia64/ia32/elfcore32.h
index 6577257..7877601 100644
--- a/arch/ia64/ia32/elfcore32.h
+++ b/arch/ia64/ia32/elfcore32.h
@@ -8,6 +8,8 @@
 #ifndef _ELFCORE32_H_
 #define _ELFCORE32_H_
 
+#include <linux/elf.h>
+
 #include <asm/intrinsics.h>
 #include <asm/uaccess.h>
 
@@ -145,4 +147,19 @@ elf_core_copy_task_xfpregs(struct task_struct *tsk, elf_fpxregset_t *xfpu)
 	return 1;
 }
 
+/*
+ * These functions parameterize elf_core_dump in fs/binfmt_elf.c to write out
+ * extra segments containing the gate DSO contents.  Dumping its
+ * contents makes post-mortem fully interpretable later without matching up
+ * the same kernel and hardware config to see what PC values meant.
+ * Dumping its extra ELF program headers includes all the other information
+ * a debugger needs to easily find how the gate DSO was being used.
+ */
+extern Elf32_Half elf_core_extra_phdrs(void);
+extern int
+elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
+			   unsigned long limit);
+extern int
+elf_core_write_extra_data(struct file *file, size_t *size, unsigned long limit);
+
 #endif /* _ELFCORE32_H_ */
diff --git a/arch/ia64/include/asm/elf.h b/arch/ia64/include/asm/elf.h
index e14108b..60af1ef 100644
--- a/arch/ia64/include/asm/elf.h
+++ b/arch/ia64/include/asm/elf.h
@@ -217,54 +217,6 @@ do {										\
 	NEW_AUX_ENT(AT_SYSINFO_EHDR, (unsigned long) GATE_EHDR);		\
 } while (0)
 
-
-/*
- * These macros parameterize elf_core_dump in fs/binfmt_elf.c to write out
- * extra segments containing the gate DSO contents.  Dumping its
- * contents makes post-mortem fully interpretable later without matching up
- * the same kernel and hardware config to see what PC values meant.
- * Dumping its extra ELF program headers includes all the other information
- * a debugger needs to easily find how the gate DSO was being used.
- */
-#define ELF_CORE_EXTRA_PHDRS		(GATE_EHDR->e_phnum)
-#define ELF_CORE_WRITE_EXTRA_PHDRS						\
-do {										\
-	const struct elf_phdr *const gate_phdrs =			      \
-		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);   \
-	int i;									\
-	Elf64_Off ofs = 0;						      \
-	for (i = 0; i < GATE_EHDR->e_phnum; ++i) {				\
-		struct elf_phdr phdr = gate_phdrs[i];			      \
-		if (phdr.p_type == PT_LOAD) {					\
-			phdr.p_memsz = PAGE_ALIGN(phdr.p_memsz);	      \
-			phdr.p_filesz = phdr.p_memsz;			      \
-			if (ofs == 0) {					      \
-				ofs = phdr.p_offset = offset;		      \
-			offset += phdr.p_filesz;				\
-		}							      \
-		else							      \
-				phdr.p_offset = ofs;			      \
-		}							      \
-		else							      \
-			phdr.p_offset += ofs;					\
-		phdr.p_paddr = 0; /* match other core phdrs */			\
-		DUMP_WRITE(&phdr, sizeof(phdr));				\
-	}									\
-} while (0)
-#define ELF_CORE_WRITE_EXTRA_DATA					\
-do {									\
-	const struct elf_phdr *const gate_phdrs =			      \
-		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);   \
-	int i;								\
-	for (i = 0; i < GATE_EHDR->e_phnum; ++i) {			\
-		if (gate_phdrs[i].p_type == PT_LOAD) {			      \
-			DUMP_WRITE((void *) gate_phdrs[i].p_vaddr,	      \
-				   PAGE_ALIGN(gate_phdrs[i].p_memsz));	      \
-			break;						      \
-		}							      \
-	}								\
-} while (0)
-
 /*
  * format for entries in the Global Offset Table
  */
diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 2a75e93..1b3d65a 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -51,6 +51,8 @@ endif
 obj-$(CONFIG_DMAR)		+= pci-dma.o
 obj-$(CONFIG_SWIOTLB)		+= pci-swiotlb.o
 
+obj-$(CONFIG_BINFMT_ELF)	+= elfcore.o
+
 # fp_emulate() expects f2-f5,f16-f31 to contain the user-level state.
 CFLAGS_traps.o  += -mfixed-range=f2-f5,f16-f31
 
diff --git a/arch/ia64/kernel/elfcore.c b/arch/ia64/kernel/elfcore.c
new file mode 100644
index 0000000..57a2298
--- /dev/null
+++ b/arch/ia64/kernel/elfcore.c
@@ -0,0 +1,64 @@
+#include <linux/elf.h>
+#include <linux/coredump.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#include <asm/elf.h>
+
+
+Elf64_Half elf_core_extra_phdrs(void)
+{
+	return GATE_EHDR->e_phnum;
+}
+
+int elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
+			       unsigned long limit)
+{
+	const struct elf_phdr *const gate_phdrs =
+		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);
+	int i;
+	Elf64_Off ofs = 0;
+
+	for (i = 0; i < GATE_EHDR->e_phnum; ++i) {
+		struct elf_phdr phdr = gate_phdrs[i];
+
+		if (phdr.p_type == PT_LOAD) {
+			phdr.p_memsz = PAGE_ALIGN(phdr.p_memsz);
+			phdr.p_filesz = phdr.p_memsz;
+			if (ofs == 0) {
+				ofs = phdr.p_offset = offset;
+				offset += phdr.p_filesz;
+			} else {
+				phdr.p_offset = ofs;
+			}
+		} else {
+			phdr.p_offset += ofs;
+		}
+		phdr.p_paddr = 0; /* match other core phdrs */
+		*size += sizeof(phdr);
+		if (*size > limit || !dump_write(file, &phdr, sizeof(phdr)))
+			return 0;
+	}
+	return 1;
+}
+
+int elf_core_write_extra_data(struct file *file, size_t *size,
+			      unsigned long limit)
+{
+	const struct elf_phdr *const gate_phdrs =
+		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);
+	int i;
+
+	for (i = 0; i < GATE_EHDR->e_phnum; ++i) {
+		if (gate_phdrs[i].p_type == PT_LOAD) {
+			void *addr = (void *)gate_phdrs[i].p_vaddr;
+			size_t memsz = PAGE_ALIGN(gate_phdrs[i].p_memsz);
+
+			*size += memsz;
+			if (*size > limit || !dump_write(file, addr, memsz))
+				return 0;
+			break;
+		}
+	}
+	return 1;
+}
diff --git a/arch/um/sys-i386/Makefile b/arch/um/sys-i386/Makefile
index 1b549bc..804b28d 100644
--- a/arch/um/sys-i386/Makefile
+++ b/arch/um/sys-i386/Makefile
@@ -6,6 +6,8 @@ obj-y = bug.o bugs.o checksum.o delay.o fault.o ksyms.o ldt.o ptrace.o \
 	ptrace_user.o setjmp.o signal.o stub.o stub_segv.o syscalls.o sysrq.o \
 	sys_call_table.o tls.o
 
+obj-$(CONFIG_BINFMT_ELF) += elfcore.o
+
 subarch-obj-y = lib/semaphore_32.o lib/string_32.o
 subarch-obj-$(CONFIG_HIGHMEM) += mm/highmem_32.o
 subarch-obj-$(CONFIG_MODULES) += kernel/module.o
diff --git a/arch/um/sys-i386/asm/elf.h b/arch/um/sys-i386/asm/elf.h
index 7708854..e64cd41 100644
--- a/arch/um/sys-i386/asm/elf.h
+++ b/arch/um/sys-i386/asm/elf.h
@@ -116,47 +116,4 @@ do {								\
 	}							\
 } while (0)
 
-/*
- * These macros parameterize elf_core_dump in fs/binfmt_elf.c to write out
- * extra segments containing the vsyscall DSO contents.  Dumping its
- * contents makes post-mortem fully interpretable later without matching up
- * the same kernel and hardware config to see what PC values meant.
- * Dumping its extra ELF program headers includes all the other information
- * a debugger needs to easily find how the vsyscall DSO was being used.
- */
-#define ELF_CORE_EXTRA_PHDRS						      \
-	(vsyscall_ehdr ? (((struct elfhdr *)vsyscall_ehdr)->e_phnum) : 0 )
-
-#define ELF_CORE_WRITE_EXTRA_PHDRS					      \
-if ( vsyscall_ehdr ) {							      \
-	const struct elfhdr *const ehdrp = (struct elfhdr *)vsyscall_ehdr;    \
-	const struct elf_phdr *const phdrp =				      \
-		(const struct elf_phdr *) (vsyscall_ehdr + ehdrp->e_phoff);   \
-	int i;								      \
-	Elf32_Off ofs = 0;						      \
-	for (i = 0; i < ehdrp->e_phnum; ++i) {				      \
-		struct elf_phdr phdr = phdrp[i];			      \
-		if (phdr.p_type == PT_LOAD) {				      \
-			ofs = phdr.p_offset = offset;			      \
-			offset += phdr.p_filesz;			      \
-		}							      \
-		else							      \
-			phdr.p_offset += ofs;				      \
-		phdr.p_paddr = 0; /* match other core phdrs */		      \
-		DUMP_WRITE(&phdr, sizeof(phdr));			      \
-	}								      \
-}
-#define ELF_CORE_WRITE_EXTRA_DATA					      \
-if ( vsyscall_ehdr ) {							      \
-	const struct elfhdr *const ehdrp = (struct elfhdr *)vsyscall_ehdr;    \
-	const struct elf_phdr *const phdrp =				      \
-		(const struct elf_phdr *) (vsyscall_ehdr + ehdrp->e_phoff);   \
-	int i;								      \
-	for (i = 0; i < ehdrp->e_phnum; ++i) {				      \
-		if (phdrp[i].p_type == PT_LOAD)				      \
-			DUMP_WRITE((void *) phdrp[i].p_vaddr,		      \
-				   phdrp[i].p_filesz);			      \
-	}								      \
-}
-
 #endif
diff --git a/arch/um/sys-i386/elfcore.c b/arch/um/sys-i386/elfcore.c
new file mode 100644
index 0000000..30cac52
--- /dev/null
+++ b/arch/um/sys-i386/elfcore.c
@@ -0,0 +1,67 @@
+#include <linux/elf.h>
+#include <linux/coredump.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#include <asm/elf.h>
+
+
+Elf32_Half elf_core_extra_phdrs(void)
+{
+	return vsyscall_ehdr ? (((struct elfhdr *)vsyscall_ehdr)->e_phnum) : 0;
+}
+
+int elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
+			       unsigned long limit)
+{
+	if ( vsyscall_ehdr ) {
+		const struct elfhdr *const ehdrp =
+			(struct elfhdr *) vsyscall_ehdr;
+		const struct elf_phdr *const phdrp =
+			(const struct elf_phdr *) (vsyscall_ehdr + ehdrp->e_phoff);
+		int i;
+		Elf32_Off ofs = 0;
+
+		for (i = 0; i < ehdrp->e_phnum; ++i) {
+			struct elf_phdr phdr = phdrp[i];
+
+			if (phdr.p_type == PT_LOAD) {
+				ofs = phdr.p_offset = offset;
+				offset += phdr.p_filesz;
+			} else {
+				phdr.p_offset += ofs;
+			}
+			phdr.p_paddr = 0; /* match other core phdrs */
+			*size += sizeof(phdr);
+			if (*size > limit
+			    || !dump_write(file, &phdr, sizeof(phdr)))
+				return 0;
+		}
+	}
+	return 1;
+}
+
+int elf_core_write_extra_data(struct file *file, size_t *size,
+			      unsigned long limit)
+{
+	if ( vsyscall_ehdr ) {
+		const struct elfhdr *const ehdrp =
+			(struct elfhdr *) vsyscall_ehdr;
+		const struct elf_phdr *const phdrp =
+			(const struct elf_phdr *) (vsyscall_ehdr + ehdrp->e_phoff);
+		int i;
+
+		for (i = 0; i < ehdrp->e_phnum; ++i) {
+			if (phdrp[i].p_type == PT_LOAD) {
+				void *addr = (void *) phdrp[i].p_vaddr;
+				size_t filesz = phdrp[i].p_filesz;
+
+				*size += filesz;
+				if (*size > limit
+				    || !dump_write(file, addr, filesz))
+					return 0;
+			}
+		}
+	}
+	return 1;
+}
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index a0fe475..1b7e9de 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1901,9 +1901,7 @@ static int elf_core_dump(struct coredump_params *cprm)
 	 * Please check DEFAULT_MAX_MAP_COUNT definition when you modify here.
 	 */
 	segs = current->mm->map_count;
-#ifdef ELF_CORE_EXTRA_PHDRS
-	segs += ELF_CORE_EXTRA_PHDRS;
-#endif
+	segs += elf_core_extra_phdrs();
 
 	gate_vma = get_gate_vma(current);
 	if (gate_vma != NULL)
@@ -1981,9 +1979,8 @@ static int elf_core_dump(struct coredump_params *cprm)
 			goto end_coredump;
 	}
 
-#ifdef ELF_CORE_WRITE_EXTRA_PHDRS
-	ELF_CORE_WRITE_EXTRA_PHDRS;
-#endif
+	if (!elf_core_write_extra_phdrs(cprm->file, offset, &size, cprm->limit))
+		goto end_coredump;
 
  	/* write out the notes section */
 	if (!write_note_info(&info, cprm->file, &foffset))
@@ -2022,9 +2019,8 @@ static int elf_core_dump(struct coredump_params *cprm)
 		}
 	}
 
-#ifdef ELF_CORE_WRITE_EXTRA_DATA
-	ELF_CORE_WRITE_EXTRA_DATA;
-#endif
+	if (!elf_core_write_extra_data(cprm->file, &size, cprm->limit))
+		goto end_coredump;
 
 end_coredump:
 	set_fs(fs);
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index d928e5f..4c16ff6 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1652,9 +1652,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	elf_core_copy_regs(&prstatus->pr_reg, cprm->regs);
 
 	segs = current->mm->map_count;
-#ifdef ELF_CORE_EXTRA_PHDRS
-	segs += ELF_CORE_EXTRA_PHDRS;
-#endif
+	segs += elf_core_extra_phdrs();
 
 	/* Set up header */
 	fill_elf_fdpic_header(elf, segs + 1);	/* including notes section */
@@ -1761,9 +1759,8 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 			goto end_coredump;
 	}
 
-#ifdef ELF_CORE_WRITE_EXTRA_PHDRS
-	ELF_CORE_WRITE_EXTRA_PHDRS;
-#endif
+	if (!elf_core_write_extra_phdrs(cprm->file, offset, &size, cprm->limit))
+		goto end_coredump;
 
  	/* write out the notes section */
 	for (i = 0; i < numnote; i++)
@@ -1787,9 +1784,8 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 				    mm_flags) < 0)
 		goto end_coredump;
 
-#ifdef ELF_CORE_WRITE_EXTRA_DATA
-	ELF_CORE_WRITE_EXTRA_DATA;
-#endif
+	if (!elf_core_write_extra_data(cprm->file, &size, cprm->limit))
+		goto end_coredump;
 
 	if (file->f_pos != offset) {
 		/* Sanity check */
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 90a4ed0..d103127 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -386,6 +386,7 @@ extern Elf32_Dyn _DYNAMIC [];
 #define elf_phdr	elf32_phdr
 #define elf_note	elf32_note
 #define elf_addr_t	Elf32_Off
+#define Elf_Half	Elf32_Half
 
 #else
 
@@ -394,6 +395,7 @@ extern Elf64_Dyn _DYNAMIC [];
 #define elf_phdr	elf64_phdr
 #define elf_note	elf64_note
 #define elf_addr_t	Elf64_Off
+#define Elf_Half	Elf64_Half
 
 #endif
 
diff --git a/include/linux/elfcore.h b/include/linux/elfcore.h
index 00d6a68..cfda74f 100644
--- a/include/linux/elfcore.h
+++ b/include/linux/elfcore.h
@@ -8,6 +8,8 @@
 #include <linux/user.h>
 #endif
 #include <linux/ptrace.h>
+#include <linux/elf.h>
+#include <linux/fs.h>
 
 struct elf_siginfo
 {
@@ -150,5 +152,19 @@ static inline int elf_core_copy_task_xfpregs(struct task_struct *t, elf_fpxregse
 
 #endif /* __KERNEL__ */
 
+/*
+ * These functions parameterize elf_core_dump in fs/binfmt_elf.c to write out
+ * extra segments containing the gate DSO contents.  Dumping its
+ * contents makes post-mortem fully interpretable later without matching up
+ * the same kernel and hardware config to see what PC values meant.
+ * Dumping its extra ELF program headers includes all the other information
+ * a debugger needs to easily find how the gate DSO was being used.
+ */
+extern Elf_Half elf_core_extra_phdrs(void);
+extern int
+elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
+			   unsigned long limit);
+extern int
+elf_core_write_extra_data(struct file *file, size_t *size, unsigned long limit);
 
 #endif /* _LINUX_ELFCORE_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 01435e5..b6d297d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -102,6 +102,8 @@ obj-$(CONFIG_SLOW_WORK_DEBUG) += slow-work-debugfs.o
 obj-$(CONFIG_PERF_EVENTS) += perf_event.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
+obj-$(CONFIG_BINFMT_ELF) += elfcore.o
+obj-$(CONFIG_BINFMT_ELF_FDPIC) += elfcore.o
 
 ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff --git a/kernel/elfcore.c b/kernel/elfcore.c
new file mode 100644
index 0000000..5445741
--- /dev/null
+++ b/kernel/elfcore.c
@@ -0,0 +1,23 @@
+#include <linux/elf.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#include <asm/elf.h>
+
+
+Elf_Half __weak elf_core_extra_phdrs(void)
+{
+	return 0;
+}
+
+int __weak elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
+				      unsigned long limit)
+{
+	return 1;
+}
+
+int __weak elf_core_write_extra_data(struct file *file, size_t *size,
+				     unsigned long limit)
+{
+	return 1;
+}
-- 
1.6.5.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND][mmotm][PATCH v2, 4/5] elf coredump: make offset calculation process and writing process explicit
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
                   ` (2 preceding siblings ...)
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 3/5] elf coredump: replace ELF_CORE_EXTRA_* macros by functions Daisuke HATAYAMA
@ 2010-01-04  1:20 ` Daisuke HATAYAMA
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 5/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
  2010-01-08  0:29 ` [RESEND][mmotm][PATCH v2, 0/5] " Andrew Morton
  5 siblings, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:20 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

By the next patch, elf_core_dump() and elf_fdpic_core_dump() will
support extended numbering and so will produce the corefiles with
section header table in a special case.

The problem is the process of writing a file header offset of the
section header table into e_shoff field of the ELF header. ELF header
is positioned at the beginning of the corefile, while section header
at the end. So, we need to take which of the following ways:

 1. Seek backward to retry writing operation for ELF header
    after writing process for a whole part

 2. Make offset calculation process and writing process
    totally sequential

The clause 1. is not always possible: one cannot assume that file
system supports seek function. Consider the no_llseek case.

Therefore, this patch adopts the clause 2.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
---
 fs/binfmt_elf.c       |   27 ++++++++++++++++-----------
 fs/binfmt_elf_fdpic.c |   29 ++++++++++++++++-------------
 2 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 1b7e9de..b1ded32 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1879,6 +1879,7 @@ static int elf_core_dump(struct coredump_params *cprm)
 	loff_t offset = 0, dataoff, foffset;
 	unsigned long mm_flags;
 	struct elf_note_info info;
+	struct elf_phdr *phdr4note = NULL;
 
 	/*
 	 * We no longer stop all VM operations.
@@ -1921,28 +1922,22 @@ static int elf_core_dump(struct coredump_params *cprm)
 	fs = get_fs();
 	set_fs(KERNEL_DS);
 
-	size += sizeof(*elf);
-	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
-		goto end_coredump;
-
 	offset += sizeof(*elf);				/* Elf header */
 	offset += (segs + 1) * sizeof(struct elf_phdr); /* Program headers */
 	foffset = offset;
 
 	/* Write notes phdr entry */
 	{
-		struct elf_phdr phdr;
 		size_t sz = get_note_info_size(&info);
 
 		sz += elf_coredump_extra_notes_size();
 
-		fill_elf_note_phdr(&phdr, sz, offset);
-		offset += sz;
-
-		size += sizeof(phdr);
-		if (size > cprm->limit
-		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+		phdr4note = kmalloc(sizeof(*phdr4note), GFP_KERNEL);
+		if (!phdr4note)
 			goto end_coredump;
+
+		fill_elf_note_phdr(phdr4note, sz, offset);
+		offset += sz;
 	}
 
 	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
@@ -1954,6 +1949,15 @@ static int elf_core_dump(struct coredump_params *cprm)
 	 */
 	mm_flags = current->mm->flags;
 
+	size += sizeof(*elf);
+	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
+		goto end_coredump;
+
+	size += sizeof(*phdr4note);
+	if (size > cprm->limit
+	    || !dump_write(cprm->file, phdr4note, sizeof(*phdr4note)))
+		goto end_coredump;
+
 	/* Write program headers for segments dump */
 	for (vma = first_vma(current, gate_vma); vma != NULL;
 			vma = next_vma(vma, gate_vma)) {
@@ -2027,6 +2031,7 @@ end_coredump:
 
 cleanup:
 	free_note_info(&info);
+	kfree(phdr4note);
 	kfree(elf);
 out:
 	return has_dumped;
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 4c16ff6..cbfa34e 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1588,6 +1588,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	int thread_status_size = 0;
 	elf_addr_t *auxv;
 	unsigned long mm_flags;
+	struct elf_phdr *phdr4note = NULL;
 
 	/*
 	 * We no longer stop all VM operations.
@@ -1694,18 +1695,12 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	fs = get_fs();
 	set_fs(KERNEL_DS);
 
-	size += sizeof(*elf);
-	if (size > cprm->limit
-	    || !dump_write(cprm->file, elf, sizeof(*elf)))
-		goto end_coredump;
-
 	offset += sizeof(*elf);				/* Elf header */
 	offset += (segs+1) * sizeof(struct elf_phdr);	/* Program headers */
 	foffset = offset;
 
 	/* Write notes phdr entry */
 	{
-		struct elf_phdr phdr;
 		int sz = 0;
 
 		for (i = 0; i < numnote; i++)
@@ -1713,13 +1708,12 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 
 		sz += thread_status_size;
 
-		fill_elf_note_phdr(&phdr, sz, offset);
-		offset += sz;
-
-		size += sizeof(phdr);
-		if (size > cprm->limit
-		    || !dump_write(cprm->file, &phdr, sizeof(phdr)))
+		phdr4note = kmalloc(sizeof(*phdr4note), GFP_KERNEL);
+		if (!phdr4note)
 			goto end_coredump;
+
+		fill_elf_note_phdr(phdr4note, sz, offset);
+		offset += sz;
 	}
 
 	/* Page-align dumped data */
@@ -1732,6 +1726,15 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	 */
 	mm_flags = current->mm->flags;
 
+	size += sizeof(*elf);
+	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
+		goto end_coredump;
+
+	size += sizeof(*phdr4note);
+	if (size > cprm->limit
+	    || !dump_write(cprm->file, phdr4note, sizeof(*phdr4note)))
+		goto end_coredump;
+
 	/* write program headers for segments dump */
 	for (vma = current->mm->mmap; vma; vma = vma->vm_next) {
 		struct elf_phdr phdr;
@@ -1803,7 +1806,7 @@ cleanup:
 		list_del(tmp);
 		kfree(list_entry(tmp, struct elf_thread_status, list));
 	}
-
+	kfree(phdr4note);
 	kfree(elf);
 	kfree(prstatus);
 	kfree(psinfo);
-- 
1.6.5.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND][mmotm][PATCH v2, 5/5] elf coredump: Add extended numbering support
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
                   ` (3 preceding siblings ...)
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 4/5] elf coredump: make offset calculation process and writing process explicit Daisuke HATAYAMA
@ 2010-01-04  1:20 ` Daisuke HATAYAMA
  2010-01-08  0:29 ` [RESEND][mmotm][PATCH v2, 0/5] " Andrew Morton
  5 siblings, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-04  1:20 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

The current ELF dumper implementation can produce broken corefiles if
program headers exceed 65535. This number is determined by the number
of vmas which the process have. In particular, some extreme programs
may use more than 65535 vmas. (If you google max_map_count, you can
find some users facing this problem.) This kind of program never be
able to generate correct coredumps.

This patch implements ``extended numbering'' that uses sh_info field
of the first section header instead of e_phnum field in order to
represent upto 4294967295 vmas.

This is supported by
AMD64-ABI(http://www.x86-64.org/documentation.html) and
Solaris(http://docs.sun.com/app/docs/doc/817-1984/).
Of course, we are preparing patches for gdb and binutils.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
---
 arch/ia64/ia32/elfcore32.h |    1 +
 arch/ia64/kernel/elfcore.c |   16 ++++++++++
 arch/um/sys-i386/elfcore.c |   16 ++++++++++
 fs/binfmt_elf.c            |   66 ++++++++++++++++++++++++++++++++++++++++++--
 fs/binfmt_elf_fdpic.c      |   63 ++++++++++++++++++++++++++++++++++++++++-
 include/linux/elf.h        |   26 ++++++++++++++++-
 include/linux/elfcore.h    |    1 +
 kernel/elfcore.c           |    5 +++
 8 files changed, 188 insertions(+), 6 deletions(-)

diff --git a/arch/ia64/ia32/elfcore32.h b/arch/ia64/ia32/elfcore32.h
index 7877601..2c1defa 100644
--- a/arch/ia64/ia32/elfcore32.h
+++ b/arch/ia64/ia32/elfcore32.h
@@ -161,5 +161,6 @@ elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
 			   unsigned long limit);
 extern int
 elf_core_write_extra_data(struct file *file, size_t *size, unsigned long limit);
+extern size_t elf_core_extra_data_size(void);
 
 #endif /* _ELFCORE32_H_ */
diff --git a/arch/ia64/kernel/elfcore.c b/arch/ia64/kernel/elfcore.c
index 57a2298..bac1639 100644
--- a/arch/ia64/kernel/elfcore.c
+++ b/arch/ia64/kernel/elfcore.c
@@ -62,3 +62,19 @@ int elf_core_write_extra_data(struct file *file, size_t *size,
 	}
 	return 1;
 }
+
+size_t elf_core_extra_data_size(void)
+{
+	const struct elf_phdr *const gate_phdrs =
+		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);
+	int i;
+	size_t size = 0;
+
+	for (i = 0; i < GATE_EHDR->e_phnum; ++i) {
+		if (gate_phdrs[i].p_type == PT_LOAD) {
+			size += PAGE_ALIGN(gate_phdrs[i].p_memsz);
+			break;
+		}
+	}
+	return size;
+}
diff --git a/arch/um/sys-i386/elfcore.c b/arch/um/sys-i386/elfcore.c
index 30cac52..6bb49b6 100644
--- a/arch/um/sys-i386/elfcore.c
+++ b/arch/um/sys-i386/elfcore.c
@@ -65,3 +65,19 @@ int elf_core_write_extra_data(struct file *file, size_t *size,
 	}
 	return 1;
 }
+
+size_t elf_core_extra_data_size(void)
+{
+	if ( vsyscall_ehdr ) {
+		const struct elfhdr *const ehdrp =
+			(struct elfhdr *)vsyscall_ehdr;
+		const struct elf_phdr *const phdrp =
+			(const struct elf_phdr *) (vsyscall_ehdr + ehdrp->e_phoff);
+		int i;
+
+		for (i = 0; i < ehdrp->e_phnum; ++i)
+			if (phdrp[i].p_type == PT_LOAD)
+				return (size_t) phdrp[i].p_filesz;
+	}
+	return 0;
+}
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index b1ded32..43e9219 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1861,6 +1861,34 @@ static struct vm_area_struct *next_vma(struct vm_area_struct *this_vma,
 	return gate_vma;
 }
 
+static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum,
+			     elf_addr_t e_shoff, int segs)
+{
+	elf->e_shoff = e_shoff;
+	elf->e_shentsize = sizeof(*shdr4extnum);
+	elf->e_shnum = 1;
+	elf->e_shstrndx = SHN_UNDEF;
+
+	memset(shdr4extnum, 0, sizeof(*shdr4extnum));
+
+	shdr4extnum->sh_type = SHT_NULL;
+	shdr4extnum->sh_size = elf->e_shnum;
+	shdr4extnum->sh_link = elf->e_shstrndx;
+	shdr4extnum->sh_info = segs;
+}
+
+static size_t elf_core_vma_data_size(struct vm_area_struct *gate_vma,
+				     unsigned long mm_flags)
+{
+	struct vm_area_struct *vma;
+	size_t size = 0;
+
+	for (vma = first_vma(current, gate_vma); vma != NULL;
+	     vma = next_vma(vma, gate_vma))
+		size += vma_dump_size(vma, mm_flags);
+	return size;
+}
+
 /*
  * Actual dumper
  *
@@ -1880,6 +1908,9 @@ static int elf_core_dump(struct coredump_params *cprm)
 	unsigned long mm_flags;
 	struct elf_note_info info;
 	struct elf_phdr *phdr4note = NULL;
+	struct elf_shdr *shdr4extnum = NULL;
+	Elf_Half e_phnum;
+	elf_addr_t e_shoff;
 
 	/*
 	 * We no longer stop all VM operations.
@@ -1908,12 +1939,19 @@ static int elf_core_dump(struct coredump_params *cprm)
 	if (gate_vma != NULL)
 		segs++;
 
+	/* for notes section */
+	segs++;
+
+	/* If segs > PN_XNUM(0xffff), then e_phnum overflows. To avoid
+	 * this, kernel supports extended numbering. Have a look at
+	 * include/linux/elf.h for further information. */
+	e_phnum = segs > PN_XNUM ? PN_XNUM : segs;
+
 	/*
 	 * Collect all the non-memory information about the process for the
 	 * notes.  This also sets up the file header.
 	 */
-	if (!fill_note_info(elf, segs + 1, /* including notes section */
-			    &info, cprm->signr, cprm->regs))
+	if (!fill_note_info(elf, e_phnum, &info, cprm->signr, cprm->regs))
 		goto cleanup;
 
 	has_dumped = 1;
@@ -1923,7 +1961,7 @@ static int elf_core_dump(struct coredump_params *cprm)
 	set_fs(KERNEL_DS);
 
 	offset += sizeof(*elf);				/* Elf header */
-	offset += (segs + 1) * sizeof(struct elf_phdr); /* Program headers */
+	offset += segs * sizeof(struct elf_phdr);	/* Program headers */
 	foffset = offset;
 
 	/* Write notes phdr entry */
@@ -1949,6 +1987,19 @@ static int elf_core_dump(struct coredump_params *cprm)
 	 */
 	mm_flags = current->mm->flags;
 
+	offset += elf_core_vma_data_size(gate_vma, mm_flags);
+	offset += elf_core_extra_data_size();
+	e_shoff = offset;
+
+	if (e_phnum == PN_XNUM) {
+		shdr4extnum = kmalloc(sizeof(*shdr4extnum), GFP_KERNEL);
+		if (!shdr4extnum)
+			goto end_coredump;
+		fill_extnum_info(elf, shdr4extnum, e_shoff, segs);
+	}
+
+	offset = dataoff;
+
 	size += sizeof(*elf);
 	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
 		goto end_coredump;
@@ -2026,11 +2077,20 @@ static int elf_core_dump(struct coredump_params *cprm)
 	if (!elf_core_write_extra_data(cprm->file, &size, cprm->limit))
 		goto end_coredump;
 
+	if (e_phnum == PN_XNUM) {
+		size += sizeof(*shdr4extnum);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, shdr4extnum,
+				   sizeof(*shdr4extnum)))
+			goto end_coredump;
+	}
+
 end_coredump:
 	set_fs(fs);
 
 cleanup:
 	free_note_info(&info);
+	kfree(shdr4extnum);
 	kfree(phdr4note);
 	kfree(elf);
 out:
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index cbfa34e..d33a6d8 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1493,6 +1493,22 @@ static int elf_dump_thread_status(long signr, struct elf_thread_status *t)
 	return sz;
 }
 
+static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum,
+			     elf_addr_t e_shoff, int segs)
+{
+	elf->e_shoff = e_shoff;
+	elf->e_shentsize = sizeof(*shdr4extnum);
+	elf->e_shnum = 1;
+	elf->e_shstrndx = SHN_UNDEF;
+
+	memset(shdr4extnum, 0, sizeof(*shdr4extnum));
+
+	shdr4extnum->sh_type = SHT_NULL;
+	shdr4extnum->sh_size = elf->e_shnum;
+	shdr4extnum->sh_link = elf->e_shstrndx;
+	shdr4extnum->sh_info = segs;
+}
+
 /*
  * dump the segments for an MMU process
  */
@@ -1557,6 +1573,17 @@ static int elf_fdpic_dump_segments(struct file *file, size_t *size,
 }
 #endif
 
+static size_t elf_core_vma_data_size(unsigned long mm_flags)
+{
+	struct vm_area_struct *vma;
+	size_t size = 0;
+
+	for (vma = current->mm->mmap; vma; vma->vm_next)
+		if (maydump(vma, mm_flags))
+			size += vma->vm_end - vma->vm_start;
+	return size;
+}
+
 /*
  * Actual dumper
  *
@@ -1589,6 +1616,9 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	elf_addr_t *auxv;
 	unsigned long mm_flags;
 	struct elf_phdr *phdr4note = NULL;
+	struct elf_shdr *shdr4extnum = NULL;
+	Elf_Half e_phnum;
+	elf_addr_t e_shoff;
 
 	/*
 	 * We no longer stop all VM operations.
@@ -1655,8 +1685,16 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	segs = current->mm->map_count;
 	segs += elf_core_extra_phdrs();
 
+	/* for notes section */
+	segs++;
+
+	/* If segs > PN_XNUM(0xffff), then e_phnum overflows. To avoid
+	 * this, kernel supports extended numbering. Have a look at
+	 * include/linux/elf.h for further information. */
+	e_phnum = segs > PN_XNUM ? PN_XNUM : segs;
+
 	/* Set up header */
-	fill_elf_fdpic_header(elf, segs + 1);	/* including notes section */
+	fill_elf_fdpic_header(elf, e_phnum);
 
 	has_dumped = 1;
 	current->flags |= PF_DUMPCORE;
@@ -1696,7 +1734,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	set_fs(KERNEL_DS);
 
 	offset += sizeof(*elf);				/* Elf header */
-	offset += (segs+1) * sizeof(struct elf_phdr);	/* Program headers */
+	offset += segs * sizeof(struct elf_phdr);	/* Program headers */
 	foffset = offset;
 
 	/* Write notes phdr entry */
@@ -1726,6 +1764,19 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	 */
 	mm_flags = current->mm->flags;
 
+	offset += elf_core_vma_data_size(mm_flags);
+	offset += elf_core_extra_data_size();
+	e_shoff = offset;
+
+	if (e_phnum == PN_XNUM) {
+		shdr4extnum = kmalloc(sizeof(*shdr4extnum), GFP_KERNEL);
+		if (!shdr4extnum)
+			goto end_coredump;
+		fill_extnum_info(elf, shdr4extnum, e_shoff, segs);
+	}
+
+	offset = dataoff;
+
 	size += sizeof(*elf);
 	if (size > cprm->limit || !dump_write(cprm->file, elf, sizeof(*elf)))
 		goto end_coredump;
@@ -1790,6 +1841,14 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm)
 	if (!elf_core_write_extra_data(cprm->file, &size, cprm->limit))
 		goto end_coredump;
 
+	if (e_phnum == PN_XNUM) {
+		size += sizeof(*shdr4extnum);
+		if (size > cprm->limit
+		    || !dump_write(cprm->file, shdr4extnum,
+				   sizeof(*shdr4extnum)))
+			goto end_coredump;
+	}
+
 	if (file->f_pos != offset) {
 		/* Sanity check */
 		printk(KERN_WARNING
diff --git a/include/linux/elf.h b/include/linux/elf.h
index d103127..027fdfe 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -50,6 +50,28 @@ typedef __s64	Elf64_Sxword;
 
 #define PT_GNU_STACK	(PT_LOOS + 0x474e551)
 
+/*
+ * Extended Numbering
+ *
+ * If the real number of program header table entries is larger than
+ * or equal to PN_XNUM(0xffff), it is set to sh_info field of the
+ * section header at index 0, and PN_XNUM is set to e_phnum
+ * field. Otherwise, the section header at index 0 is zero
+ * initialized, if it exists.
+ *
+ * Specifications are available in:
+ *
+ * - Sun microsystems: Linker and Libraries.
+ *   Part No: 817-1984-17, September 2008.
+ *   URL: http://docs.sun.com/app/docs/doc/817-1984
+ *
+ * - System V ABI AMD64 Architecture Processor Supplement
+ *   Draft Version 0.99.,
+ *   May 11, 2009.
+ *   URL: http://www.x86-64.org/
+ */
+#define PN_XNUM 0xffff
+
 /* These constants define the different elf file types */
 #define ET_NONE   0
 #define ET_REL    1
@@ -286,7 +308,7 @@ typedef struct elf64_phdr {
 #define SHN_COMMON	0xfff2
 #define SHN_HIRESERVE	0xffff
  
-typedef struct {
+typedef struct elf32_shdr {
   Elf32_Word	sh_name;
   Elf32_Word	sh_type;
   Elf32_Word	sh_flags;
@@ -384,6 +406,7 @@ typedef struct elf64_note {
 extern Elf32_Dyn _DYNAMIC [];
 #define elfhdr		elf32_hdr
 #define elf_phdr	elf32_phdr
+#define elf_shdr	elf32_shdr
 #define elf_note	elf32_note
 #define elf_addr_t	Elf32_Off
 #define Elf_Half	Elf32_Half
@@ -393,6 +416,7 @@ extern Elf32_Dyn _DYNAMIC [];
 extern Elf64_Dyn _DYNAMIC [];
 #define elfhdr		elf64_hdr
 #define elf_phdr	elf64_phdr
+#define elf_shdr	elf64_shdr
 #define elf_note	elf64_note
 #define elf_addr_t	Elf64_Off
 #define Elf_Half	Elf64_Half
diff --git a/include/linux/elfcore.h b/include/linux/elfcore.h
index cfda74f..e687bc3 100644
--- a/include/linux/elfcore.h
+++ b/include/linux/elfcore.h
@@ -166,5 +166,6 @@ elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
 			   unsigned long limit);
 extern int
 elf_core_write_extra_data(struct file *file, size_t *size, unsigned long limit);
+extern size_t elf_core_extra_data_size(void);
 
 #endif /* _LINUX_ELFCORE_H */
diff --git a/kernel/elfcore.c b/kernel/elfcore.c
index 5445741..ff915ef 100644
--- a/kernel/elfcore.c
+++ b/kernel/elfcore.c
@@ -21,3 +21,8 @@ int __weak elf_core_write_extra_data(struct file *file, size_t *size,
 {
 	return 1;
 }
+
+size_t __weak elf_core_extra_data_size(void)
+{
+	return 0;
+}
-- 
1.6.5.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
                   ` (4 preceding siblings ...)
  2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 5/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
@ 2010-01-08  0:29 ` Andrew Morton
  2010-01-08  0:32   ` Andrew Morton
  2010-01-12  3:12   ` Daisuke HATAYAMA
  5 siblings, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2010-01-08  0:29 UTC (permalink / raw)
  To: Daisuke HATAYAMA
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:

> The current ELF dumper can produce broken corefiles if program headers
> exceed 65535. In particular, the program in 64-bit environment often
> demands more than 65535 mmaps. If you google max_map_count, then you
> can find many users facing this problem.
> 
> Solaris has already dealt with this issue, and other OSes have also
> adopted the same method as in Solaris. Currently, Sun's document and
> AMD 64 ABI include the description for the extension, where they call
> the extension Extended Numbering. See Reference for further information.
> 
> I believe that linux kernel should adopt the same way as they did, so
> I've written this patch.
> 
> I am also preparing for patches of GDB and binutils.

That's a beautifully presented patchset.  Thanks for doing all that
work - it helps.

UML maintenance appears to have ceased in recent times, so if we wish
to have these changes runtime tested (we should) then I think it would
be best if you could find someone to do that please.

And no akpm code-review would be complete without: dump_seek() is
waaaay to large to be inlined.  Is there some common .c file to where
we could move it?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-08  0:29 ` [RESEND][mmotm][PATCH v2, 0/5] " Andrew Morton
@ 2010-01-08  0:32   ` Andrew Morton
  2010-01-08  2:14     ` Masami Hiramatsu
  2010-01-12  3:12   ` Daisuke HATAYAMA
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2010-01-08  0:32 UTC (permalink / raw)
  To: Daisuke HATAYAMA, linux-mm, linux-kernel, mhiramat,
	xiyou.wangcong, andi, jdike, tony.luck

On Thu, 7 Jan 2010 16:29:28 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
> Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> 
> > The current ELF dumper can produce broken corefiles if program headers
> > exceed 65535. In particular, the program in 64-bit environment often
> > demands more than 65535 mmaps. If you google max_map_count, then you
> > can find many users facing this problem.
> > 
> > Solaris has already dealt with this issue, and other OSes have also
> > adopted the same method as in Solaris. Currently, Sun's document and
> > AMD 64 ABI include the description for the extension, where they call
> > the extension Extended Numbering. See Reference for further information.
> > 
> > I believe that linux kernel should adopt the same way as they did, so
> > I've written this patch.
> > 
> > I am also preparing for patches of GDB and binutils.
> 
> That's a beautifully presented patchset.  Thanks for doing all that
> work - it helps.
> 
> UML maintenance appears to have ceased in recent times, so if we wish
> to have these changes runtime tested (we should) then I think it would
> be best if you could find someone to do that please.
> 
> And no akpm code-review would be complete without: dump_seek() is
> waaaay to large to be inlined.  Is there some common .c file to where
> we could move it?
> 

Also, these patches made a bit of a mess of
mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch.

I consider
mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch to be
less important (although older) than this patch series so I've fixed up
mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch and have
staged it after your patch series.  If this causes problems then I'll
drop mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch,
sorry.


From: Masami Hiramatsu <mhiramat@redhat.com>

Pass mm->flags as a coredump parameter for consistency.

 ---
1787         if (mm->core_state || !get_dumpable(mm)) {  <- (1)
1788                 up_write(&mm->mmap_sem);
1789                 put_cred(cred);
1790                 goto fail;
1791         }
1792
[...]
1798         if (get_dumpable(mm) == 2) {    /* Setuid core dump mode */ <-(2)
1799                 flag = O_EXCL;          /* Stop rewrite attacks */
1800                 cred->fsuid = 0;        /* Dump root private */
1801         }
 ---

Since dumpable bits are not protected by lock, there is a chance to change
these bits between (1) and (2).

To solve this issue, this patch copies mm->flags to
coredump_params.mm_flags at the beginning of do_coredump() and uses it
instead of get_dumpable() while dumping core.

This copy is also passed to binfmt->core_dump, since elf*_core_dump() uses
dump_filter bits in mm->flags.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/binfmt_elf.c         |   12 ++----------
 fs/binfmt_elf_fdpic.c   |   12 ++----------
 fs/exec.c               |   20 ++++++++++++++++----
 include/linux/binfmts.h |    1 +
 4 files changed, 21 insertions(+), 24 deletions(-)

diff -puN fs/binfmt_elf.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency fs/binfmt_elf.c
--- a/fs/binfmt_elf.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
+++ a/fs/binfmt_elf.c
@@ -1905,7 +1905,6 @@ static int elf_core_dump(struct coredump
 	struct vm_area_struct *vma, *gate_vma;
 	struct elfhdr *elf = NULL;
 	loff_t offset = 0, dataoff, foffset;
-	unsigned long mm_flags;
 	struct elf_note_info info;
 	struct elf_phdr *phdr4note = NULL;
 	struct elf_shdr *shdr4extnum = NULL;
@@ -1980,13 +1979,6 @@ static int elf_core_dump(struct coredump
 
 	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
 
-	/*
-	 * We must use the same mm->flags while dumping core to avoid
-	 * inconsistency between the program headers and bodies, otherwise an
-	 * unusable core file can be generated.
-	 */
-	mm_flags = current->mm->flags;
-
 	offset += elf_core_vma_data_size(gate_vma, mm_flags);
 	offset += elf_core_extra_data_size();
 	e_shoff = offset;
@@ -2018,7 +2010,7 @@ static int elf_core_dump(struct coredump
 		phdr.p_offset = offset;
 		phdr.p_vaddr = vma->vm_start;
 		phdr.p_paddr = 0;
-		phdr.p_filesz = vma_dump_size(vma, mm_flags);
+		phdr.p_filesz = vma_dump_size(vma, cprm->mm_flags);
 		phdr.p_memsz = vma->vm_end - vma->vm_start;
 		offset += phdr.p_filesz;
 		phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
@@ -2053,7 +2045,7 @@ static int elf_core_dump(struct coredump
 		unsigned long addr;
 		unsigned long end;
 
-		end = vma->vm_start + vma_dump_size(vma, mm_flags);
+		end = vma->vm_start + vma_dump_size(vma, cprm->mm_flags);
 
 		for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) {
 			struct page *page;
diff -puN fs/binfmt_elf_fdpic.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency fs/binfmt_elf_fdpic.c
--- a/fs/binfmt_elf_fdpic.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
+++ a/fs/binfmt_elf_fdpic.c
@@ -1623,7 +1623,6 @@ static int elf_fdpic_core_dump(struct co
 #endif
 	int thread_status_size = 0;
 	elf_addr_t *auxv;
-	unsigned long mm_flags;
 	struct elf_phdr *phdr4note = NULL;
 	struct elf_shdr *shdr4extnum = NULL;
 	Elf_Half e_phnum;
@@ -1766,13 +1765,6 @@ static int elf_fdpic_core_dump(struct co
 	/* Page-align dumped data */
 	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
 
-	/*
-	 * We must use the same mm->flags while dumping core to avoid
-	 * inconsistency between the program headers and bodies, otherwise an
-	 * unusable core file can be generated.
-	 */
-	mm_flags = current->mm->flags;
-
 	offset += elf_core_vma_data_size(mm_flags);
 	offset += elf_core_extra_data_size();
 	e_shoff = offset;
@@ -1806,7 +1798,7 @@ static int elf_fdpic_core_dump(struct co
 		phdr.p_offset = offset;
 		phdr.p_vaddr = vma->vm_start;
 		phdr.p_paddr = 0;
-		phdr.p_filesz = maydump(vma, mm_flags) ? sz : 0;
+		phdr.p_filesz = maydump(vma, cprm->mm_flags) ? sz : 0;
 		phdr.p_memsz = sz;
 		offset += phdr.p_filesz;
 		phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
@@ -1844,7 +1836,7 @@ static int elf_fdpic_core_dump(struct co
 		goto end_coredump;
 
 	if (elf_fdpic_dump_segments(cprm->file, &size, &cprm->limit,
-				    mm_flags) < 0)
+				    cprm->mm_flags) < 0)
 		goto end_coredump;
 
 	if (!elf_core_write_extra_data(cprm->file, &size, cprm->limit))
diff -puN fs/exec.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency fs/exec.c
--- a/fs/exec.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
+++ a/fs/exec.c
@@ -1726,14 +1726,19 @@ void set_dumpable(struct mm_struct *mm, 
 	}
 }
 
-int get_dumpable(struct mm_struct *mm)
+static int __get_dumpable(unsigned long mm_flags)
 {
 	int ret;
 
-	ret = mm->flags & 0x3;
+	ret = mm_flags & MMF_DUMPABLE_MASK;
 	return (ret >= 2) ? 2 : ret;
 }
 
+int get_dumpable(struct mm_struct *mm)
+{
+	return __get_dumpable(mm->flags);
+}
+
 static void wait_for_dump_helpers(struct file *file)
 {
 	struct pipe_inode_info *pipe;
@@ -1777,6 +1782,12 @@ void do_coredump(long signr, int exit_co
 		.signr = signr,
 		.regs = regs,
 		.limit = current->signal->rlim[RLIMIT_CORE].rlim_cur,
+		/*
+		 * We must use the same mm->flags while dumping core to avoid
+		 * inconsistency of bit flags, since this flag is not protected
+		 * by any locks.
+		 */
+		.mm_flags = mm->flags,
 	};
 
 	audit_core_dumps(signr);
@@ -1795,7 +1806,7 @@ void do_coredump(long signr, int exit_co
 	/*
 	 * If another thread got here first, or we are not dumpable, bail out.
 	 */
-	if (mm->core_state || !get_dumpable(mm)) {
+	if (mm->core_state || !__get_dumpable(cprm.mm_flags)) {
 		up_write(&mm->mmap_sem);
 		put_cred(cred);
 		goto fail;
@@ -1806,7 +1817,8 @@ void do_coredump(long signr, int exit_co
 	 *	process nor do we know its entire history. We only know it
 	 *	was tainted so we dump it as root in mode 2.
 	 */
-	if (get_dumpable(mm) == 2) {	/* Setuid core dump mode */
+	if (__get_dumpable(cprm.mm_flags) == 2) {
+		/* Setuid core dump mode */
 		flag = O_EXCL;		/* Stop rewrite attacks */
 		cred->fsuid = 0;	/* Dump root private */
 	}
diff -puN include/linux/binfmts.h~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency include/linux/binfmts.h
--- a/include/linux/binfmts.h~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
+++ a/include/linux/binfmts.h
@@ -74,6 +74,7 @@ struct coredump_params {
 	struct pt_regs *regs;
 	struct file *file;
 	unsigned long limit;
+	unsigned long mm_flags;
 };
 
 /*
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-08  0:32   ` Andrew Morton
@ 2010-01-08  2:14     ` Masami Hiramatsu
  0 siblings, 0 replies; 14+ messages in thread
From: Masami Hiramatsu @ 2010-01-08  2:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daisuke HATAYAMA, linux-mm, linux-kernel, xiyou.wangcong, andi,
	jdike, tony.luck

Andrew Morton wrote:
> On Thu, 7 Jan 2010 16:29:28 -0800
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
>> On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
>> Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
>>
>>> The current ELF dumper can produce broken corefiles if program headers
>>> exceed 65535. In particular, the program in 64-bit environment often
>>> demands more than 65535 mmaps. If you google max_map_count, then you
>>> can find many users facing this problem.
>>>
>>> Solaris has already dealt with this issue, and other OSes have also
>>> adopted the same method as in Solaris. Currently, Sun's document and
>>> AMD 64 ABI include the description for the extension, where they call
>>> the extension Extended Numbering. See Reference for further information.
>>>
>>> I believe that linux kernel should adopt the same way as they did, so
>>> I've written this patch.
>>>
>>> I am also preparing for patches of GDB and binutils.
>>
>> That's a beautifully presented patchset.  Thanks for doing all that
>> work - it helps.
>>
>> UML maintenance appears to have ceased in recent times, so if we wish
>> to have these changes runtime tested (we should) then I think it would
>> be best if you could find someone to do that please.
>>
>> And no akpm code-review would be complete without: dump_seek() is
>> waaaay to large to be inlined.  Is there some common .c file to where
>> we could move it?
>>
> 
> Also, these patches made a bit of a mess of
> mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch.
> 
> I consider
> mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch to be
> less important (although older) than this patch series so I've fixed up
> mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch and have
> staged it after your patch series.  If this causes problems then I'll
> drop mm-pass-mm-flags-as-a-coredump-parameter-for-consistency.patch,
> sorry.

Sure, that's fine to me too.
It seems that those patches are not conflict each other directly.

Thank you for modifying my patch.
And I found just two misses.

> From: Masami Hiramatsu <mhiramat@redhat.com>
> 
> Pass mm->flags as a coredump parameter for consistency.
> 
>  ---
> 1787         if (mm->core_state || !get_dumpable(mm)) {  <- (1)
> 1788                 up_write(&mm->mmap_sem);
> 1789                 put_cred(cred);
> 1790                 goto fail;
> 1791         }
> 1792
> [...]
> 1798         if (get_dumpable(mm) == 2) {    /* Setuid core dump mode */ <-(2)
> 1799                 flag = O_EXCL;          /* Stop rewrite attacks */
> 1800                 cred->fsuid = 0;        /* Dump root private */
> 1801         }
>  ---
> 
> Since dumpable bits are not protected by lock, there is a chance to change
> these bits between (1) and (2).
> 
> To solve this issue, this patch copies mm->flags to
> coredump_params.mm_flags at the beginning of do_coredump() and uses it
> instead of get_dumpable() while dumping core.
> 
> This copy is also passed to binfmt->core_dump, since elf*_core_dump() uses
> dump_filter bits in mm->flags.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Acked-by: Roland McGrath <roland@redhat.com>
> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/binfmt_elf.c         |   12 ++----------
>  fs/binfmt_elf_fdpic.c   |   12 ++----------
>  fs/exec.c               |   20 ++++++++++++++++----
>  include/linux/binfmts.h |    1 +
>  4 files changed, 21 insertions(+), 24 deletions(-)
> 
> diff -puN fs/binfmt_elf.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency fs/binfmt_elf.c
> --- a/fs/binfmt_elf.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
> +++ a/fs/binfmt_elf.c
> @@ -1905,7 +1905,6 @@ static int elf_core_dump(struct coredump
>  	struct vm_area_struct *vma, *gate_vma;
>  	struct elfhdr *elf = NULL;
>  	loff_t offset = 0, dataoff, foffset;
> -	unsigned long mm_flags;
>  	struct elf_note_info info;
>  	struct elf_phdr *phdr4note = NULL;
>  	struct elf_shdr *shdr4extnum = NULL;
> @@ -1980,13 +1979,6 @@ static int elf_core_dump(struct coredump
>  
>  	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
>  
> -	/*
> -	 * We must use the same mm->flags while dumping core to avoid
> -	 * inconsistency between the program headers and bodies, otherwise an
> -	 * unusable core file can be generated.
> -	 */
> -	mm_flags = current->mm->flags;
> -
>  	offset += elf_core_vma_data_size(gate_vma, mm_flags);
						   ^^^^^^^^ cprm->mm_flags

>  	offset += elf_core_extra_data_size();
>  	e_shoff = offset;
> @@ -2018,7 +2010,7 @@ static int elf_core_dump(struct coredump
>  		phdr.p_offset = offset;
>  		phdr.p_vaddr = vma->vm_start;
>  		phdr.p_paddr = 0;
> -		phdr.p_filesz = vma_dump_size(vma, mm_flags);
> +		phdr.p_filesz = vma_dump_size(vma, cprm->mm_flags);
>  		phdr.p_memsz = vma->vm_end - vma->vm_start;
>  		offset += phdr.p_filesz;
>  		phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
> @@ -2053,7 +2045,7 @@ static int elf_core_dump(struct coredump
>  		unsigned long addr;
>  		unsigned long end;
>  
> -		end = vma->vm_start + vma_dump_size(vma, mm_flags);
> +		end = vma->vm_start + vma_dump_size(vma, cprm->mm_flags);
>  
>  		for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) {
>  			struct page *page;
> diff -puN fs/binfmt_elf_fdpic.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency fs/binfmt_elf_fdpic.c
> --- a/fs/binfmt_elf_fdpic.c~mm-pass-mm-flags-as-a-coredump-parameter-for-consistency
> +++ a/fs/binfmt_elf_fdpic.c
> @@ -1623,7 +1623,6 @@ static int elf_fdpic_core_dump(struct co
>  #endif
>  	int thread_status_size = 0;
>  	elf_addr_t *auxv;
> -	unsigned long mm_flags;
>  	struct elf_phdr *phdr4note = NULL;
>  	struct elf_shdr *shdr4extnum = NULL;
>  	Elf_Half e_phnum;
> @@ -1766,13 +1765,6 @@ static int elf_fdpic_core_dump(struct co
>  	/* Page-align dumped data */
>  	dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
>  
> -	/*
> -	 * We must use the same mm->flags while dumping core to avoid
> -	 * inconsistency between the program headers and bodies, otherwise an
> -	 * unusable core file can be generated.
> -	 */
> -	mm_flags = current->mm->flags;
> -
>  	offset += elf_core_vma_data_size(mm_flags);
                                         ^^^^^^^^ cprm->mm_flags

Thanks!

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-08  0:29 ` [RESEND][mmotm][PATCH v2, 0/5] " Andrew Morton
  2010-01-08  0:32   ` Andrew Morton
@ 2010-01-12  3:12   ` Daisuke HATAYAMA
  2010-01-12  3:24     ` Andrew Morton
  1 sibling, 1 reply; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-12  3:12 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

From: Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
Date: Thu, 7 Jan 2010 16:29:28 -0800

> On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
> Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> 
> > The current ELF dumper can produce broken corefiles if program headers
> > exceed 65535. In particular, the program in 64-bit environment often
> > demands more than 65535 mmaps. If you google max_map_count, then you
> > can find many users facing this problem.
> > 
> > Solaris has already dealt with this issue, and other OSes have also
> > adopted the same method as in Solaris. Currently, Sun's document and
> > AMD 64 ABI include the description for the extension, where they call
> > the extension Extended Numbering. See Reference for further information.
> > 
> > I believe that linux kernel should adopt the same way as they did, so
> > I've written this patch.
> > 
> > I am also preparing for patches of GDB and binutils.
> 
> That's a beautifully presented patchset.  Thanks for doing all that
> work - it helps.
> 
> UML maintenance appears to have ceased in recent times, so if we wish
> to have these changes runtime tested (we should) then I think it would
> be best if you could find someone to do that please.
> 
> And no akpm code-review would be complete without: dump_seek() is
> waaaay to large to be inlined.  Is there some common .c file to where
> we could move it?
> 

I am sorry for very late reply.

* Patch Test for UML-i386

I tested on UML-i386 for the stable release of that time, precisely
2.6.32, since even building process for UML-i386 failed for mainline
and mmotm trees, as you've expected.

I don't know internal UML implementation at all, so I need to find
someone if runtime test for mmotm tree is absolutely necessary.

* modification for dump_seek()

I couldn't find any right .c file at which dump_seek() be placed. We
need to create a new .c file into which we put auxiliary functions to
generate/manipulate coredumps.

There is another problem regarding name space. The name dump_seek() is
too short.  If we move dump_seek() to some .c file, we need to rename
it according to the corresponding object file format, such as
elf_core_dump_seek() or aout_dump_seek(); or coredump_dump_seek(), as
currently dump_seek() is shared among dumping processes in multiple
object formats.

Should I submit these suggestions as a patch set of version 3?

Thanks

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-12  3:12   ` Daisuke HATAYAMA
@ 2010-01-12  3:24     ` Andrew Morton
  2010-01-12  8:05       ` Daisuke HATAYAMA
  2010-01-13  8:57       ` Daisuke HATAYAMA
  0 siblings, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2010-01-12  3:24 UTC (permalink / raw)
  To: Daisuke HATAYAMA
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

On Tue, 12 Jan 2010 12:12:32 +0900 (JST) Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:

> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
> Date: Thu, 7 Jan 2010 16:29:28 -0800
> 
> > On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
> > Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> > 
> > > The current ELF dumper can produce broken corefiles if program headers
> > > exceed 65535. In particular, the program in 64-bit environment often
> > > demands more than 65535 mmaps. If you google max_map_count, then you
> > > can find many users facing this problem.
> > > 
> > > Solaris has already dealt with this issue, and other OSes have also
> > > adopted the same method as in Solaris. Currently, Sun's document and
> > > AMD 64 ABI include the description for the extension, where they call
> > > the extension Extended Numbering. See Reference for further information.
> > > 
> > > I believe that linux kernel should adopt the same way as they did, so
> > > I've written this patch.
> > > 
> > > I am also preparing for patches of GDB and binutils.
> > 
> > That's a beautifully presented patchset.  Thanks for doing all that
> > work - it helps.
> > 
> > UML maintenance appears to have ceased in recent times, so if we wish
> > to have these changes runtime tested (we should) then I think it would
> > be best if you could find someone to do that please.
> > 
> > And no akpm code-review would be complete without: dump_seek() is
> > waaaay to large to be inlined.  Is there some common .c file to where
> > we could move it?
> > 
> 
> I am sorry for very late reply.
> 
> * Patch Test for UML-i386
> 
> I tested on UML-i386 for the stable release of that time, precisely
> 2.6.32, since even building process for UML-i386 failed for mainline
> and mmotm trees, as you've expected.
> 
> I don't know internal UML implementation at all, so I need to find
> someone if runtime test for mmotm tree is absolutely necessary.

OK, thanks.

> * modification for dump_seek()
> 
> I couldn't find any right .c file at which dump_seek() be placed. We
> need to create a new .c file into which we put auxiliary functions to
> generate/manipulate coredumps.

Sure, that sounds appropriate.

> There is another problem regarding name space. The name dump_seek() is
> too short.  If we move dump_seek() to some .c file, we need to rename
> it according to the corresponding object file format, such as
> elf_core_dump_seek() or aout_dump_seek(); or coredump_dump_seek(), as
> currently dump_seek() is shared among dumping processes in multiple
> object formats.

I don't understand.  Your current inlined dump_seek() looks like it
will work OK for all dump formats when uninlined?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-12  3:24     ` Andrew Morton
@ 2010-01-12  8:05       ` Daisuke HATAYAMA
  2010-01-12  8:16         ` Andrew Morton
  2010-01-13  8:57       ` Daisuke HATAYAMA
  1 sibling, 1 reply; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-12  8:05 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

From: Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
Date: Mon, 11 Jan 2010 19:24:18 -0800

> On Tue, 12 Jan 2010 12:12:32 +0900 (JST) Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> 
> > From: Andrew Morton <akpm@linux-foundation.org>
> > Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
> > Date: Thu, 7 Jan 2010 16:29:28 -0800
> > 
> > > On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
> > > Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> > > 
> > > > The current ELF dumper can produce broken corefiles if program headers
> > > > exceed 65535. In particular, the program in 64-bit environment often
> > > > demands more than 65535 mmaps. If you google max_map_count, then you
> > > > can find many users facing this problem.
> > > > 
> > > > Solaris has already dealt with this issue, and other OSes have also
> > > > adopted the same method as in Solaris. Currently, Sun's document and
> > > > AMD 64 ABI include the description for the extension, where they call
> > > > the extension Extended Numbering. See Reference for further information.
> > > > 
> > > > I believe that linux kernel should adopt the same way as they did, so
> > > > I've written this patch.
> > > > 
> > > > I am also preparing for patches of GDB and binutils.
> > > 
> > > That's a beautifully presented patchset.  Thanks for doing all that
> > > work - it helps.
> > > 
> > > UML maintenance appears to have ceased in recent times, so if we wish
> > > to have these changes runtime tested (we should) then I think it would
> > > be best if you could find someone to do that please.
> > > 
> > > And no akpm code-review would be complete without: dump_seek() is
> > > waaaay to large to be inlined.  Is there some common .c file to where
> > > we could move it?
> > > 
> > 
> > I am sorry for very late reply.
> > 
> > * Patch Test for UML-i386
> > 
> > I tested on UML-i386 for the stable release of that time, precisely
> > 2.6.32, since even building process for UML-i386 failed for mainline
> > and mmotm trees, as you've expected.
> > 
> > I don't know internal UML implementation at all, so I need to find
> > someone if runtime test for mmotm tree is absolutely necessary.
> 
> OK, thanks.
> 
> > * modification for dump_seek()
> > 
> > I couldn't find any right .c file at which dump_seek() be placed. We
> > need to create a new .c file into which we put auxiliary functions to
> > generate/manipulate coredumps.
> 
> Sure, that sounds appropriate.
> 
> > There is another problem regarding name space. The name dump_seek() is
> > too short.  If we move dump_seek() to some .c file, we need to rename
> > it according to the corresponding object file format, such as
> > elf_core_dump_seek() or aout_dump_seek(); or coredump_dump_seek(), as
> > currently dump_seek() is shared among dumping processes in multiple
> > object formats.
> 
> I don't understand.  Your current inlined dump_seek() looks like it
> will work OK for all dump formats when uninlined?
> 

My concern is possibility of dump_seek()'s very short and general
naming wasting public name space and colliding other global names.  My
idea is, for example, to rename it coredump_dump_seek().

Please ignore remaining part of the previous explanation. As you
mention, current dump_seek() implementation is no problem. When it
will be needed is one of implementations of some object file format
will fork.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-12  8:05       ` Daisuke HATAYAMA
@ 2010-01-12  8:16         ` Andrew Morton
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2010-01-12  8:16 UTC (permalink / raw)
  To: Daisuke HATAYAMA
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

On Tue, 12 Jan 2010 17:05:03 +0900 (JST) Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:

> My concern is possibility of dump_seek()'s very short and general
> naming wasting public name space and colliding other global names.  My
> idea is, for example, to rename it coredump_dump_seek().

OK.  Yes, that would be a bit nicer.  I'm sure we have lots more
inappropriately named globals than that though.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
  2010-01-12  3:24     ` Andrew Morton
  2010-01-12  8:05       ` Daisuke HATAYAMA
@ 2010-01-13  8:57       ` Daisuke HATAYAMA
  1 sibling, 0 replies; 14+ messages in thread
From: Daisuke HATAYAMA @ 2010-01-13  8:57 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, mhiramat, xiyou.wangcong, andi, jdike, tony.luck

From: Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
Date: Mon, 11 Jan 2010 19:24:18 -0800

> On Tue, 12 Jan 2010 12:12:32 +0900 (JST) Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
> 
>> From: Andrew Morton <akpm@linux-foundation.org>
>> Subject: Re: [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support
>> Date: Thu, 7 Jan 2010 16:29:28 -0800
>> 
>> > On Mon, 04 Jan 2010 10:06:07 +0900 (JST)
>> > Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> wrote:
>> > 
>> > > The current ELF dumper can produce broken corefiles if program headers
>> > > exceed 65535. In particular, the program in 64-bit environment often
>> > > demands more than 65535 mmaps. If you google max_map_count, then you
>> > > can find many users facing this problem.
>> > > 
>> > > Solaris has already dealt with this issue, and other OSes have also
>> > > adopted the same method as in Solaris. Currently, Sun's document and
>> > > AMD 64 ABI include the description for the extension, where they call
>> > > the extension Extended Numbering. See Reference for further information.
>> > > 
>> > > I believe that linux kernel should adopt the same way as they did, so
>> > > I've written this patch.
>> > > 
>> > > I am also preparing for patches of GDB and binutils.
>> > 
>> > That's a beautifully presented patchset.  Thanks for doing all that
>> > work - it helps.
>> > 
>> > UML maintenance appears to have ceased in recent times, so if we wish
>> > to have these changes runtime tested (we should) then I think it would
>> > be best if you could find someone to do that please.
>> > 
>> > And no akpm code-review would be complete without: dump_seek() is
>> > waaaay to large to be inlined.  Is there some common .c file to where
>> > we could move it?
>> > 
>> 
>> * Patch Test for UML-i386
>> 
>> I tested on UML-i386 for the stable release of that time, precisely
>> 2.6.32, since even building process for UML-i386 failed for mainline
>> and mmotm trees, as you've expected.
>> 
>> I don't know internal UML implementation at all, so I need to find
>> someone if runtime test for mmotm tree is absolutely necessary.
> 
> OK, thanks.
> 

I'd like to correct the above.

UML-i386 can successfully be built and run by using default config
file for v2.6.32.11, v2.6.33-rc3 and current git mmotm tree,
respectively.

I have yet to do build test by allmodconfig.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-01-13  8:57 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-04  1:06 [RESEND][mmotm][PATCH v2, 0/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 1/5] Unify dump_seek() implementations for each binfmt_*.c Daisuke HATAYAMA
2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 2/5] Move dump_write() and dump_seek() into a header file Daisuke HATAYAMA
2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 3/5] elf coredump: replace ELF_CORE_EXTRA_* macros by functions Daisuke HATAYAMA
2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 4/5] elf coredump: make offset calculation process and writing process explicit Daisuke HATAYAMA
2010-01-04  1:20 ` [RESEND][mmotm][PATCH v2, 5/5] elf coredump: Add extended numbering support Daisuke HATAYAMA
2010-01-08  0:29 ` [RESEND][mmotm][PATCH v2, 0/5] " Andrew Morton
2010-01-08  0:32   ` Andrew Morton
2010-01-08  2:14     ` Masami Hiramatsu
2010-01-12  3:12   ` Daisuke HATAYAMA
2010-01-12  3:24     ` Andrew Morton
2010-01-12  8:05       ` Daisuke HATAYAMA
2010-01-12  8:16         ` Andrew Morton
2010-01-13  8:57       ` Daisuke HATAYAMA

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).