From: Minchan Kim <minchan@kernel.org>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Andrew Morton <akpm@linux-foundation.org>, Dan Magenheimer <dan.magenheimer@oracle.com>, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>, Nitin Gupta <ngupta@vflare.org>, Robert Jennings <rcj@linux.vnet.ibm.com>, linux-mm@kvack.org, devel@driverdev.osuosl.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] zsmalloc improvements
Date: Wed, 11 Jul 2012 16:03:00 +0900
Message-ID: <4FFD2524.2050300@kernel.org>
In-Reply-To: <1341263752-10210-1-git-send-email-sjenning@linux.vnet.ibm.com>

Hi everybody,

I learned from Seth's mention yesterday that Greg had already merged this
series. I should have hurried, but last week I had no time. :(

On 07/03/2012 06:15 AM, Seth Jennings wrote:
> This patchset removes the current x86 dependency for zsmalloc
> and introduces some performance improvements in the object
> mapping paths.
>
> It was meant to be a follow-on to my previous patchset:
>
> https://lkml.org/lkml/2012/6/26/540
>
> However, this patchset differed so much in light of new performance
> information that I mostly started over.
>
> In the past, I attempted to compare different mapping methods
> via the use of zcache and frontswap. However, the nature of those
> two features makes comparing mapping-method efficiency difficult,
> since the mapping is a very small part of the overall code path.
>
> In an effort to get more useful statistics on the mapping speed,
> I wrote a microbenchmark module named zsmapbench, designed to
> measure mapping speed by calling straight into the zsmalloc
> paths.
>
> https://github.com/spartacus06/zsmapbench
>
> This exposed an interesting and unexpected result: in all
> cases that I tried, copying the objects that span pages, instead
> of using the page table to map them, was _always_ faster. I could
> not find a case in which the page table mapping method was faster.
> zsmapbench measures the copy-based mapping at ~560 cycles for a
> map/unmap operation on a spanned object for both KVM guest and
> bare-metal, while the page table mapping was ~1500 cycles on a VM
> and ~760 cycles bare-metal. The cycles for the copy method will vary
> with allocation size; however, it is still faster even for the largest
> allocation that zsmalloc supports.
>
> The result is convenient though, as memcpy is very portable :)

Today, I tested zsmapbench on my embedded board (ARM). The tlb-flush
method is 30% faster there than the copy-based one, so copying does not
always win. I think it depends on CPU speed and cache size.

zram is already very popular on embedded systems, and I want to keep
using it there without a 30% penalty, so I would like to keep our old
approach of supporting a local tlb flush. Of course, in the case of a
KVM guest, copy-based would always be a big win.

So shouldn't we support both approaches? It could make the code very
ugly, but I think it has enough value. Any thoughts?

> This patchset replaces the x86-only page table mapping code with
> copy-based mapping code. It also makes changes to optimize this
> new method further.
>
> There are no changes in arch/x86 required.
>
> Patchset is based on greg's staging-next.
>
> Seth Jennings (4):
>   zsmalloc: remove x86 dependency
>   zsmalloc: add single-page object fastpath in unmap
>   zsmalloc: add details to zs_map_object boiler plate
>   zsmalloc: add mapping modes
>
>  drivers/staging/zcache/zcache-main.c     |    6 +-
>  drivers/staging/zram/zram_drv.c          |    7 +-
>  drivers/staging/zsmalloc/Kconfig         |    4 -
>  drivers/staging/zsmalloc/zsmalloc-main.c |  124 ++++++++++++++++++++++--------
>  drivers/staging/zsmalloc/zsmalloc.h      |   14 +++-
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    6 +-
>  6 files changed, 114 insertions(+), 47 deletions(-)

-- 
Kind regards,
Minchan Kim