From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: vgoyal@redhat.com, ebiederm@xmission.com, cpw@sgi.com,
	kumagai-atsushi@mxc.nes.nec.co.jp, lisa.mitchell@hp.com,
	heiko.carstens@de.ibm.com, akpm@linux-foundation.org
Cc: kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 00/20] kdump, vmcore: support mmap() on /proc/vmcore
Date: Sat, 02 Mar 2013 17:35:48 +0900
Message-ID: <20130302083447.31252.93914.stgit@localhost6.localdomain6>

Currently, reads from /proc/vmcore are done by read_oldmem(), which
calls ioremap/iounmap for every single page. For example, if memory is
1GB, ioremap/iounmap is called (1GB / 4KB) times, that is, 262144
times. This causes a big performance degradation.
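
The arithmetic above can be reproduced with a small helper. This is only
an illustrative sketch; per_page_map_calls() is a hypothetical name, not
a function in the kernel or in this patch set.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of map/unmap round trips needed to read `bytes` of old memory
 * when every page is mapped and unmapped separately. A partial last
 * page still costs one round trip.
 * (Hypothetical helper, for illustration only.)
 */
static uint64_t per_page_map_calls(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (bytes + page_size - 1) / page_size;
}
```

For 1GB of memory and 4KB pages, per_page_map_calls(1ULL << 30, 4096)
gives the 262144 calls mentioned above.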

In particular, the main intended user of this mmap() is makedumpfile,
which not only reads memory from /proc/vmcore but also does other
processing such as filtering, compression and I/O. The page table
updates and the TLB flushes that follow them make such processing much
slower. Note that I have yet to write the makedumpfile patch and to
confirm how much it improves.

To address the issue, this patch set implements mmap() on /proc/vmcore
to improve read performance. My simple benchmark shows an improvement
from 200 [MiB/sec] to over 50.0 [GiB/sec].
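
From user space, a reader could use such a mapping roughly as follows.
This is a minimal sketch using only standard POSIX calls; the helper
name read_via_mmap() is hypothetical, and whether mmap() succeeds on
/proc/vmcore depends on this patch set being applied in the 2nd kernel.
The same code works on any regular file.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy `len` bytes starting at file offset `off` of `path` into `dst`
 * through a read-only mapping. Returns 0 on success, -1 on failure.
 * (Hypothetical helper, for illustration only.)
 */
static int read_via_mmap(const char *path, off_t off, size_t len, void *dst)
{
	long page = sysconf(_SC_PAGESIZE);
	off_t map_off;
	size_t skew;
	void *p;
	int fd;

	assert(dst != NULL);
	if (page <= 0 || len == 0 || off < 0)
		return -1;

	/* mmap() requires a page-aligned file offset. */
	map_off = off - (off % page);
	skew = (size_t)(off - map_off);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	p = mmap(NULL, skew + len, PROT_READ, MAP_PRIVATE, fd, map_off);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	memcpy(dst, (const char *)p + skew, len);
	munmap(p, skew + len);
	return 0;
}
```

A real consumer would keep a larger window mapped and walk through it,
rather than mapping and unmapping for every request, since the point is
to avoid per-page mapping overhead.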

ChangeLog
=========

v1 => v2)

- Clean up the existing code: use e_phoff, and remove the assumption
  on PT_NOTE entries.
  => See Patch 01, 02.

- Fix a potential bug where the ELF header size is not included in the
  exported vmcoreinfo size.
  => See Patch 03.

- Divide the patch modifying read_vmcore() into two: clean-up and
  primary code change.
  => See Patch 09, 10.

- Put ELF note segments on a page-size boundary in the 1st kernel
  instead of copying them into a buffer in the 2nd kernel.
  => See Patch 11, 12, 13, 14, 16.
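
The page-size boundary handling in the items above comes down to
rounding sizes and offsets up to a multiple of the page size and
accounting for the padding this creates. A minimal sketch, with
hypothetical helper names that are not taken from the patches:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round `x` up to the next multiple of `align`, which must be a power
 * of two (as page sizes are).
 * (Hypothetical helper, for illustration only.)
 */
static size_t round_up_pow2(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Bytes of padding needed after an object of `size` bytes so that the
 * next object starts on an `align` boundary.
 */
static size_t pad_to_boundary(size_t size, size_t align)
{
	return round_up_pow2(size, align) - size;
}
```

For example, with 4096-byte pages an object of 4000 bytes needs 96
bytes of padding, and an object that already ends on a page boundary
needs none.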

Benchmark
=========

No change is seen from the previous patch series. See the results
posted with the previous series here:

  https://lkml.org/lkml/2013/2/14/89

TODO
====

- fix makedumpfile to use mmap() on /proc/vmcore and benchmark it to
  confirm whether we can see enough performance improvement. The idea
  is described here:
  http://lists.infradead.org/pipermail/kexec/2013-February/007982.html

- fix the crash utility and makedumpfile to support the NT_VMCORE_PAD
  note type. Neither tool distinguishes the same note types from
  different note names, which does not conform to the ELF
  specification; currently both read the NT_VMCORE_PAD note type as
  NT_VMCORE_DEBUGINFO.
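
In the ELF specification a note is identified by its name together with
its type, so a reader should compare both before interpreting the
descriptor. The sketch below illustrates such a check. The struct, the
helper and the numeric values are hypothetical placeholders; the actual
values of NT_VMCORE_DEBUGINFO and NT_VMCORE_PAD are defined by the
patches, not here, and the "VMCOREINFO" note name is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified, already-parsed view of one ELF note (hypothetical type). */
struct note_view {
	const char *name;	/* note name (owner), NUL-terminated */
	uint32_t type;		/* note type, meaningful only per name */
};

/* Placeholder values for illustration; NOT the values from the patches. */
enum {
	SKETCH_NT_DEBUGINFO = 0,
	SKETCH_NT_PAD = 1
};

/* Return nonzero only if both the name and the type match. */
static int note_is(const struct note_view *n, const char *name, uint32_t type)
{
	assert(n != NULL && name != NULL);
	return n->type == type && strcmp(n->name, name) == 0;
}
```

With a check like this, a padding note is never mistaken for a
debuginfo note even when the two share a name, and a note with the same
type value under a different name is rejected as well.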

Test
====

This patch set is based on v3.9-rc1.

Tested on x86-64 and x86-32, both with 1GB and with over 4GB of memory.

---

HATAYAMA Daisuke (20):
      vmcore: introduce mmap_vmcore()
      vmcore: count holes generated by round-up operation for vmcore size
      vmcore: round-up offset of vmcore object in page-size boundary
      vmcore: check if vmcore objects satify mmap()'s page-size boundary requirement
      vmcore: check NT_VMCORE_PAD as a mark indicating the end of ELF note buffer
      kexec: fill note buffers by NT_VMCORE_PAD notes in page-size boundary
      elf: introduce NT_VMCORE_PAD type
      kexec, elf: introduce NT_VMCORE_DEBUGINFO note type
      kexec: allocate vmcoreinfo note buffer on page-size boundary
      vmcore: allocate per-cpu crash_notes objects on page-size boundary
      vmcore: read buffers for vmcore objects copied from old memory
      vmcore: clean up read_vmcore()
      vmcore: modify vmcore clean-up function to free buffer on 2nd kernel
      vmcore: copy non page-size aligned head and tail pages in 2nd kernel
      vmcore, procfs: introduce a flag to distinguish objects copied in 2nd kernel
      vmcore: round up buffer size of ELF headers by PAGE_SIZE
      vmcore: allocate buffer for ELF headers on page-size alignment
      vmcore, sysfs: export ELF note segment size instead of vmcoreinfo data size
      vmcore: rearrange program headers without assuming consequtive PT_NOTE entries
      vmcore: refer to e_phoff member explicitly


 arch/s390/include/asm/kexec.h |    7 
 fs/proc/vmcore.c              |  577 ++++++++++++++++++++++++++++++++---------
 include/linux/kexec.h         |   16 +
 include/linux/proc_fs.h       |    8 -
 include/uapi/linux/elf.h      |    5 
 kernel/kexec.c                |   47 ++-
 kernel/ksysfs.c               |    2 
 7 files changed, 505 insertions(+), 157 deletions(-)

-- 
Thanks.
HATAYAMA, Daisuke
