* Re: /proc/vmcore kernel patches
       [not found]   ` <20130422175504.GA26312@sgi.com>
@ 2013-04-23  0:38     ` HATAYAMA Daisuke
  2013-04-23 11:45       ` Cliff Wickman
  0 siblings, 1 reply; 3+ messages in thread
From: HATAYAMA Daisuke @ 2013-04-23  0:38 UTC (permalink / raw)
  To: Cliff Wickman; +Cc: kexec, Atsushi Kumagai

(2013/04/23 2:55), Cliff Wickman wrote:
> Hello Mr. Hatayama and Mr. Kumagai,
>
> I have been playing with the v4 patches
>       kdump, vmcore: support mmap() on /proc/vmcore
> and find the mmap interface to /proc/vmcore potentially about 80x faster than
> the read interface.
>
> But in practice (using a makedumpfile that mmap's instead of read's) I find
> it about 10x slower.
>
> It looks like makedumpfile's usage of the interface is very inefficient.
> It will mmap an area, read a page, then back up the offset to a previous
> page.  It has to munmap and mmap on virtually every read.

You can change the size of the mapped memory through the command-line
option --map-size <some KB>.
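
As an illustration only (the exact flags depend on your environment, and
the figure here is arbitrary), an invocation such as

  makedumpfile --map-size 4096 -c -d 31 /proc/vmcore dumpfile

would use a 4MB window per mapping.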

This version of makedumpfile is experimental. The design should be
changed if it turns out to be problematic.

>
> Do you have a re-worked makedumpfile that predicts a large range of
> pages and mmap's the whole range just once?
> It seems that makedumpfile should have the information available to do
> that.
>

The benchmark results have already shown that, with a large enough map
size, the current implementation performs as well as another kernel-space
implementation that maps the whole range of memory.

In addition, the current implementation of remap_pfn_range uses 4KB
pages only. This means that the total size of the PTEs amounts to 2GB
per 1TB of mapped memory. It's better to map pages little by little to
keep memory consumption small.
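
To make the arithmetic explicit, here is a tiny stand-alone check of that
figure (a sketch assuming 4KB pages and 8-byte PTEs as on x86_64; it is
only an illustration, not makedumpfile or kernel code):

===
#include <stdio.h>

int main(void)
{
        unsigned long long mapped  = 1ULL << 40;       /* 1TB of old memory  */
        unsigned long long pagesz  = 4096;             /* 4KB pages          */
        unsigned long long ptesz   = 8;                /* bytes per PTE      */
        unsigned long long n_ptes  = mapped / pagesz;  /* 256M entries       */
        unsigned long long pte_mem = n_ptes * ptesz;   /* 2GB of page tables */

        printf("PTEs: %llu, page-table memory: %llu GB\n",
               n_ptes, pte_mem >> 30);
        return 0;
}
===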

-- 
Thanks.
HATAYAMA, Daisuke



* Re: /proc/vmcore kernel patches
  2013-04-23  0:38     ` /proc/vmcore kernel patches HATAYAMA Daisuke
@ 2013-04-23 11:45       ` Cliff Wickman
  2013-04-24  0:17         ` HATAYAMA Daisuke
  0 siblings, 1 reply; 3+ messages in thread
From: Cliff Wickman @ 2013-04-23 11:45 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: kexec, Atsushi Kumagai

On Tue, Apr 23, 2013 at 09:38:57AM +0900, HATAYAMA Daisuke wrote:
> (2013/04/23 2:55), Cliff Wickman wrote:
>> Hello Mr. Hatayama and Mr. Kumagai,
>>
>> I have been playing with the v4 patches
>>       kdump, vmcore: support mmap() on /proc/vmcore
>> and find the mmap interface to /proc/vmcore potentially about 80x faster than
>> the read interface.
>>
>> But in practice (using a makedumpfile that mmap's instead of read's) I find
>> it about 10x slower.
>>
>> It looks like makedumpfile's usage of the interface is very inefficient.
>> It will mmap an area, read a page, then back up the offset to a previous
>> page.  It has to munmap and mmap on virtually every read.
>
> You can change the size of the mapped memory through the command-line
> option --map-size <some KB>.
>
> This version of makedumpfile is experimental. The design should be
> changed if it turns out to be problematic.

Yes I'm using --map-size <some KB> but the bigger I make the mapping
size the worse makedumpfile performs. The typical pattern is to map and
read page x, then map and read page x - 1.  So every read has to unmap
and remap.  The bigger the mapping, the slower it goes.
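
To illustrate the effect (a rough sketch of the pattern I am seeing, not
the actual makedumpfile code; the names are hypothetical), it behaves
roughly like this:

===
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

#define PAGE_SZ 4096UL

static char  *win;                     /* current window into /proc/vmcore */
static off_t  win_start, win_end;

/* Read one page at 'offset', remapping the window whenever the offset
 * falls outside it.  With a backwards-moving reader (page x, then
 * page x - 1), the test below fails on every call, so each 4KB read
 * pays for an munmap() plus an mmap() of map_size bytes, and
 * (presumably) the kernel sets up page tables for the whole window at
 * mmap() time, which would explain why a larger map_size only makes
 * this worse. */
static int read_page(int fd, off_t offset, void *buf, size_t map_size)
{
        if (!win || offset < win_start || offset + PAGE_SZ > win_end) {
                if (win)
                        munmap(win, win_end - win_start);
                win = mmap(NULL, map_size, PROT_READ, MAP_PRIVATE, fd, offset);
                if (win == MAP_FAILED)
                        return -1;
                win_start = offset;
                win_end   = offset + map_size;
        }
        memcpy(buf, win + (offset - win_start), PAGE_SZ);
        return 0;
}
===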

>> Do you have a re-worked makedumpfile that predicts a large range of
>> pages and mmap's the whole range just once?
>> It seems that makedumpfile should have the information available to do
>> that.
>>
>
> The benchmark results have already shown that, with a large enough map
> size, the current implementation performs as well as another kernel-space
> implementation that maps the whole range of memory.

I must be missing some part of that benchmark.  I see that the interface
is much faster, but my benchmarks show makedumpfile itself running much
slower when using mmap.
Can you point me to the makedumpfile source that you are using?

> In addition, the current implementation of remap_pfn_range uses 4KB
> pages only. This means that the total size of the PTEs amounts to 2GB
> per 1TB of mapped memory. It's better to map pages little by little to
> keep memory consumption small.

Agreed, we need a way to map with 2M pages.  And I am not suggesting that
you map all of the old kernel memory at once.  Just one region of page
structures at a time.
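
A rough sketch of what I have in mind (hypothetical names, not actual
makedumpfile code): map one region of page structures once, walk all of
its page descriptors, then unmap it, so nothing is remapped inside the
loop:

===
#include <sys/mman.h>
#include <sys/types.h>
#include <stddef.h>

/* Stand-in for the real per-page-descriptor logic. */
static void check_page_descriptor(const char *pagedesc)
{
        (void)pagedesc;
}

static int scan_mem_map_region(int fd, off_t region_off, size_t region_len,
                               size_t sizeof_page)
{
        char *region;
        size_t off;

        /* One mmap()/munmap() pair covers the whole region ... */
        region = mmap(NULL, region_len, PROT_READ, MAP_PRIVATE, fd, region_off);
        if (region == MAP_FAILED)
                return -1;

        /* ... and the walk over the page descriptors needs no remapping. */
        for (off = 0; off + sizeof_page <= region_len; off += sizeof_page)
                check_page_descriptor(region + off);

        munmap(region, region_len);
        return 0;
}
===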

-Cliff
-- 
Cliff Wickman
SGI
cpw@sgi.com
(651) 683-3824


* Re: /proc/vmcore kernel patches
  2013-04-23 11:45       ` Cliff Wickman
@ 2013-04-24  0:17         ` HATAYAMA Daisuke
  0 siblings, 0 replies; 3+ messages in thread
From: HATAYAMA Daisuke @ 2013-04-24  0:17 UTC (permalink / raw)
  To: Cliff Wickman; +Cc: kexec, Atsushi Kumagai

(2013/04/23 20:45), Cliff Wickman wrote:
> On Tue, Apr 23, 2013 at 09:38:57AM +0900, HATAYAMA Daisuke wrote:
>> (2013/04/23 2:55), Cliff Wickman wrote:
>>> Hello Mr. Hatayama and Mr. Kumagai,
>>>
>>> I have been playing with the v4 patches
>>>        kdump, vmcore: support mmap() on /proc/vmcore
>>> and find the mmap interface to /proc/vmcore potentially about 80x faster than
>>> the read interface.
>>>
>>> But in practice (using a makedumpfile that mmap's instead of read's) I find
>>> it about 10x slower.
>>>
>>> It looks like makedumpfile's usage of the interface is very inefficient.
>>> It will mmap an area, read a page, then back up the offset to a previous
>>> page.  It has to munmap and mmap on virtually every read.
>>
>> You can change the size of the mapped memory through the command-line
>> option --map-size <some KB>.
>>
>> This version of makedumpfile is experimental. The design should be
>> changed if it turns out to be problematic.
>
> Yes I'm using --map-size <some KB> but the bigger I make the mapping
> size the worse makedumpfile performs. The typical pattern is to map and
> read page x, then map and read page x - 1.  So every read has to unmap
> and remap.  The bigger the mapping, the slower it goes.
>
>>> Do you have a re-worked makedumpfile that predicts a large range of
>>> pages and mmap's the whole range just once?
>>> It seems that makedumpfile should have the information available to do
>>> that.
>>>
>>
>> The benchmark results have already shown that, with a large enough map
>> size, the current implementation performs as well as another kernel-space
>> implementation that maps the whole range of memory.
>
> I must be missing some part of that benchmark.  I see that the interface
> is much faster, but my benchmarks show makedumpfile itself running much
> slower when using mmap.
> Can you point me to the makedumpfile source that you are using?
>

I used the mmap branch at

git://git.code.sf.net/p/makedumpfile/code

with the following patch applied:

===
diff --git a/makedumpfile.c b/makedumpfile.c
index 7acbf72..9dc6aee 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -290,8 +290,10 @@ read_with_mmap(off_t offset, void *bufptr, unsigned long size) {

  next_region:

-       if (!is_mapped_with_mmap(offset))
-               update_mmap_range(offset);
+       if (!is_mapped_with_mmap(offset)) {
+               if (!update_mmap_range(offset))
+                       return FALSE;
+       }

         read_size = MIN(info->mmap_end_offset - offset, size);
===

>> In addition, the current implementation of remap_pfn_range uses 4KB
>> pages only. This means that the total size of the PTEs amounts to 2GB
>> per 1TB of mapped memory. It's better to map pages little by little to
>> keep memory consumption small.
>
> Agreed, we need a way to map with 2M pages.  And I am not suggesting that
> you map all of the old kernel memory at once.  Just one region of page
> structures at a time.

Ideally so, but the benchmark showed good performance even with the
current implementation, so I'm now thinking that modifying
remap_pfn_range is not strictly necessary.

-- 
Thanks.
HATAYAMA, Daisuke


