From: Minchan Kim <minchan@kernel.org>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Nitin Gupta <ngupta@vflare.org>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	linux-mm@kvack.org, devel@driverdev.osuosl.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] zsmalloc improvements
Date: Wed, 11 Jul 2012 16:03:00 +0900
Message-ID: <4FFD2524.2050300@kernel.org>
In-Reply-To: <1341263752-10210-1-git-send-email-sjenning@linux.vnet.ibm.com>

Hi everybody,

I only realized from Seth's mention yesterday that Greg has already merged
this series. I should have hurried, but I had no time last week. :(

On 07/03/2012 06:15 AM, Seth Jennings wrote:
> This patchset removes the current x86 dependency for zsmalloc
> and introduces some performance improvements in the object
> mapping paths.
> 
> It was meant to be a follow-on to my previous patchset
> 
> https://lkml.org/lkml/2012/6/26/540
> 
> However, this patchset differed so much in light of new performance
> information that I mostly started over.
> 
> In the past, I attempted to compare different mapping methods
> via the use of zcache and frontswap.  However, the nature of those
> two features makes comparing mapping method efficiency difficult
> since the mapping is a very small part of the overall code path.
> 
> In an effort to get more useful statistics on the mapping speed,
> I wrote a microbenchmark module named zsmapbench, designed to
> measure mapping speed by calling straight into the zsmalloc
> paths.
> 
> https://github.com/spartacus06/zsmapbench
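
Side note: presumably the core of such a benchmark is just a tight timing
loop around map/unmap. A rough sketch of the idea, for anyone who wants to
reproduce it (this is not Seth's actual code; the zsmalloc signatures below
are approximate and have changed between versions of the staging driver):

/*
 * zsmapbench-style timing loop (illustrative only).  The handle type and
 * the zs_* signatures are approximate for the 2012 staging API.
 */
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/ktime.h>
#include <linux/math64.h>
#include "zsmalloc.h"		/* drivers/staging/zsmalloc/zsmalloc.h */

#define NR_ITERS 1000000

static int __init zsmapbench_sketch_init(void)
{
	struct zs_pool *pool;
	unsigned long handle;
	ktime_t start, end;
	int i;

	pool = zs_create_pool("zsmapbench", GFP_KERNEL);
	if (!pool)
		return -ENOMEM;

	/* pick a size class whose objects can span two pages */
	handle = zs_malloc(pool, 3 * PAGE_SIZE / 4);
	if (!handle) {
		zs_destroy_pool(pool);
		return -ENOMEM;
	}

	start = ktime_get();
	for (i = 0; i < NR_ITERS; i++) {
		void *obj = zs_map_object(pool, handle);

		/* touch the mapping so the loop is not optimized away */
		*(volatile char *)obj = 0;
		zs_unmap_object(pool, handle);
	}
	end = ktime_get();

	pr_info("avg map/unmap: %lld ns\n",
		div_s64(ktime_to_ns(ktime_sub(end, start)), NR_ITERS));

	zs_free(pool, handle);
	zs_destroy_pool(pool);
	return 0;
}
module_init(zsmapbench_sketch_init);

static void __exit zsmapbench_sketch_exit(void)
{
}
module_exit(zsmapbench_sketch_exit);

MODULE_LICENSE("GPL");
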
> 
> This exposed an interesting and unexpected result: in all
> cases that I tried, copying the objects that span pages, instead
> of using the page table to map them, was _always_ faster.  I could
> not find a case in which the page table mapping method was faster.
> 
> zsmapbench measures the copy-based mapping at ~560 cycles for a
> map/unmap operation on a spanned object, for both a KVM guest and
> bare metal, while the page table mapping was ~1500 cycles in a VM
> and ~760 cycles on bare metal.  The cycle count for the copy method
> varies with allocation size; however, it is still faster even for
> the largest allocation that zsmalloc supports.
> 
> The result is convenient though, as memcpy is very portable :)

Today I tested zsmapbench on my embedded (ARM) board.
The TLB-flush method is 30% faster than the copy-based one there, so
copy is not always a win. I think it depends on CPU speed and cache size.

zram is already very popular on embedded systems, and I want to keep
using it there without a 30% regression, so I would like to keep our
old approach, which uses a local TLB flush.

Of course, for a KVM guest the copy-based method would always be a big
win. So shouldn't we support both approaches? It could make the code
rather ugly, but I think it has enough value.

Any thoughts?
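
To make it concrete, something like the following is what I have in mind.
This is only a rough sketch: the Kconfig symbol and helper names are made
up, the unmap/TLB-flush side is omitted, and the page-table branch mirrors
the old x86-only code (ARM would need its own PTE and flush helpers).

#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/string.h>
#include <linux/vmalloc.h>
#include <asm/pgtable.h>

struct mapping_area {
#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
	struct vm_struct *vm;	/* per-cpu two-page kernel VA range */
	pte_t *vm_ptes[2];	/* PTEs backing that range */
#else
	char *vm_buf;		/* per-cpu two-page bounce buffer */
#endif
};

/* Map an object spanning the end of pages[0] and the start of pages[1]. */
static void *map_spanning_object(struct mapping_area *area,
				 struct page *pages[2], int off, int size)
{
#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
	/*
	 * Page-table method: make the two pages virtually contiguous in a
	 * pre-reserved per-cpu area.  The matching unmap clears the PTEs
	 * and flushes only the local CPU's TLB, which is the behaviour
	 * that wins on my ARM board.
	 */
	set_pte(area->vm_ptes[0], mk_pte(pages[0], PAGE_KERNEL));
	set_pte(area->vm_ptes[1], mk_pte(pages[1], PAGE_KERNEL));
	return area->vm->addr + off;
#else
	/* Copy method: assemble the object in a per-cpu buffer. */
	int first = PAGE_SIZE - off;
	void *src = kmap_atomic(pages[0]);

	memcpy(area->vm_buf, src + off, first);
	kunmap_atomic(src);
	src = kmap_atomic(pages[1]);
	memcpy(area->vm_buf + first, src, size - first);
	kunmap_atomic(src);
	return area->vm_buf;
#endif
}
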


> 
> This patchset replaces the x86-only page table mapping code with
> copy-based mapping code. It also makes changes to optimize this
> new method further.
> 
> There are no changes in arch/x86 required.
> 
> The patchset is based on Greg's staging-next.
> 
> Seth Jennings (4):
>   zsmalloc: remove x86 dependency
>   zsmalloc: add single-page object fastpath in unmap
>   zsmalloc: add details to zs_map_object boiler plate
>   zsmalloc: add mapping modes
> 
>  drivers/staging/zcache/zcache-main.c     |    6 +-
>  drivers/staging/zram/zram_drv.c          |    7 +-
>  drivers/staging/zsmalloc/Kconfig         |    4 -
>  drivers/staging/zsmalloc/zsmalloc-main.c |  124 ++++++++++++++++++++++--------
>  drivers/staging/zsmalloc/zsmalloc.h      |   14 +++-
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    6 +-
>  6 files changed, 114 insertions(+), 47 deletions(-)
> 


-- 
Kind regards,
Minchan Kim



Thread overview:
2012-07-02 21:15 [PATCH 0/4] zsmalloc improvements Seth Jennings
2012-07-02 21:15 ` [PATCH 1/4] zsmalloc: remove x86 dependency Seth Jennings
2012-07-10  2:21   ` Minchan Kim
2012-07-10 15:29     ` Seth Jennings
2012-07-11  7:27       ` Minchan Kim
2012-07-11 18:26   ` Nitin Gupta
2012-07-11 20:32     ` Seth Jennings
2012-07-11 22:42       ` Nitin Gupta
2012-07-12  0:23         ` Seth Jennings
2012-07-02 21:15 ` [PATCH 2/4] zsmalloc: add single-page object fastpath in unmap Seth Jennings
2012-07-10  2:25   ` Minchan Kim
2012-07-02 21:15 ` [PATCH 3/4] zsmalloc: add details to zs_map_object boiler plate Seth Jennings
2012-07-10  2:35   ` Minchan Kim
2012-07-10 15:17     ` Seth Jennings
2012-07-11  7:42       ` Minchan Kim
2012-07-11 14:15         ` Seth Jennings
2012-07-12  1:15           ` Minchan Kim
2012-07-12 19:54             ` Dan Magenheimer
2012-07-12 22:46               ` Dan Magenheimer
2012-07-02 21:15 ` [PATCH 4/4] zsmalloc: add mapping modes Seth Jennings
2012-07-04  5:33 ` [PATCH 0/4] zsmalloc improvements Minchan Kim
2012-07-04 20:43 ` Konrad Rzeszutek Wilk
2012-07-06 15:07   ` Seth Jennings
2012-07-09 13:58     ` Seth Jennings
2012-07-11 19:42       ` Konrad Rzeszutek Wilk
2012-07-11 20:48         ` Seth Jennings
2012-07-12 10:40           ` Konrad Rzeszutek Wilk
2012-07-11  7:03 ` Minchan Kim [this message]
2012-07-11 14:00   ` Seth Jennings
2012-07-12  1:01     ` Minchan Kim
2012-07-11 19:16   ` Seth Jennings
