From: Minchan Kim <minchan@kernel.org>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Nitin Gupta <ngupta@vflare.org>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	linux-mm@kvack.org, devel@driverdev.osuosl.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] zsmalloc improvements
Date: Wed, 11 Jul 2012 16:03:00 +0900
Message-ID: <4FFD2524.2050300@kernel.org>
In-Reply-To: <1341263752-10210-1-git-send-email-sjenning@linux.vnet.ibm.com>

Hi everybody,

I realized from Seth's mention yesterday that Greg has already merged
this series. I should have replied sooner, but I had no time last
week. :(

On 07/03/2012 06:15 AM, Seth Jennings wrote:
> This patchset removes the current x86 dependency for zsmalloc
> and introduces some performance improvements in the object
> mapping paths.
>
> It was meant to be a follow-on to my previous patchset
>
> https://lkml.org/lkml/2012/6/26/540
>
> However, this patchset differed so much in light of new performance
> information that I mostly started over.
>
> In the past, I attempted to compare different mapping methods
> via the use of zcache and frontswap. However, the nature of those
> two features makes comparing mapping method efficiency difficult,
> since the mapping is a very small part of the overall code path.
>
> In an effort to get more useful statistics on the mapping speed,
> I wrote a microbenchmark module named zsmapbench, designed to
> measure mapping speed by calling straight into the zsmalloc
> paths.
>
> https://github.com/spartacus06/zsmapbench
>
> This exposed an interesting and unexpected result: in all
> cases that I tried, copying the objects that span pages, instead
> of using the page table to map them, was _always_ faster. I could
> not find a case in which the page table mapping method was faster.
>
> zsmapbench measures the copy-based mapping at ~560 cycles for a
> map/unmap operation on a spanned object, for both a KVM guest and
> bare metal, while the page table mapping was ~1500 cycles on a VM
> and ~760 cycles bare metal. The cycles for the copy method will
> vary with allocation size; however, it is still faster even for the
> largest allocation that zsmalloc supports.
>
> The result is convenient, though, as memcpy is very portable :)

Today, I tested zsmapbench on my embedded board (ARM). There, the
TLB-flush method is 30% faster than the copy-based one, so copying is
not always a win. I think it depends on CPU speed and cache size.

zram is already very popular on embedded systems, and I want to keep
using it without a 30% performance hit, so I would like to keep our
old approach, which supports a local TLB flush. Of course, in the
case of a KVM guest, copy-based would always be a big win.

So shouldn't we support both approaches? It could make the code ugly,
but I think it has enough value. Any thoughts?

> This patchset replaces the x86-only page table mapping code with
> copy-based mapping code. It also makes changes to optimize this
> new method further.
>
> There are no changes in arch/x86 required.
>
> Patchset is based on greg's staging-next.
>
> Seth Jennings (4):
>   zsmalloc: remove x86 dependency
>   zsmalloc: add single-page object fastpath in unmap
>   zsmalloc: add details to zs_map_object boiler plate
>   zsmalloc: add mapping modes
>
>  drivers/staging/zcache/zcache-main.c     |    6 +-
>  drivers/staging/zram/zram_drv.c          |    7 +-
>  drivers/staging/zsmalloc/Kconfig         |    4 -
>  drivers/staging/zsmalloc/zsmalloc-main.c |  124 ++++++++++++++++++++++--------
>  drivers/staging/zsmalloc/zsmalloc.h      |   14 +++-
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    6 +-
>  6 files changed, 114 insertions(+), 47 deletions(-)

-- 
Kind regards,
Minchan Kim