[PATCH 0/4] zsmalloc improvements

From: Seth Jennings @ 2012-07-02 21:15 UTC
  To: Greg Kroah-Hartman
  Cc: Seth Jennings, Andrew Morton, Dan Magenheimer,
	Konrad Rzeszutek Wilk, Nitin Gupta, Minchan Kim, Robert Jennings,
	linux-mm, devel, linux-kernel

This patchset removes the current x86 dependency for zsmalloc
and introduces some performance improvements in the object
mapping paths.

It was meant to be a follow-on to my previous patchset:

https://lkml.org/lkml/2012/6/26/540

However, this patchset differed so much in light of new performance
information that I mostly started over.

In the past, I attempted to compare different mapping methods
via the use of zcache and frontswap.  However, the nature of those
two features makes comparing mapping method efficiency difficult
since the mapping is a very small part of the overall code path.

In an effort to get more useful statistics on the mapping speed,
I wrote a microbenchmark module named zsmapbench, designed to
measure mapping speed by calling straight into the zsmalloc
paths.

https://github.com/spartacus06/zsmapbench

This exposed an interesting and unexpected result: in all
cases that I tried, copying the objects that span pages instead
of using the page table to map them was _always_ faster.  I could
not find a case in which the page table mapping method was faster.

zsmapbench measures the copy-based mapping at ~560 cycles for a
map/unmap operation on a spanned object, both in a KVM guest and on
bare metal, while the page table mapping took ~1500 cycles in a VM and
~760 cycles on bare metal.  The cycle count for the copy method varies
with allocation size; however, it is still faster even for the largest
allocation that zsmalloc supports.

The result is convenient though, as memcpy is very portable :)

This patchset replaces the x86-only page table mapping code with
copy-based mapping code. It also makes changes to optimize this
new method further.
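
To make the idea concrete, here is a minimal userspace sketch of
copy-based mapping.  It is not the code from these patches: the
sketch_* names, the struct layout, and the fixed 4096-byte page size
are assumptions made purely for illustration.  An object that spans
two pages is copied into a contiguous buffer on map and copied back
on unmap, while an object that fits in a single page is handed out
in place with no copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical per-mapping state; not the zsmalloc structure. */
struct sketch_map {
	unsigned char buf[SKETCH_PAGE_SIZE];	/* contiguous bounce buffer */
	unsigned char *first;	/* page holding the start of the object */
	unsigned char *second;	/* page holding the remainder, if any */
	size_t off;		/* offset of the object in the first page */
	size_t size;		/* object size, at most one page */
};

/* Return a pointer through which the whole object is contiguous. */
static void *sketch_map_object(struct sketch_map *m, unsigned char *first,
			       unsigned char *second, size_t off, size_t size)
{
	size_t head;

	assert(off < SKETCH_PAGE_SIZE && size <= SKETCH_PAGE_SIZE);
	m->first = first;
	m->second = second;
	m->off = off;
	m->size = size;

	/* Single-page object: use it in place, no copy needed. */
	if (off + size <= SKETCH_PAGE_SIZE)
		return first + off;

	/* Spanned object: gather both pieces into the buffer. */
	head = SKETCH_PAGE_SIZE - off;
	memcpy(m->buf, first + off, head);
	memcpy(m->buf + head, second, size - head);
	return m->buf;
}

/* Write a spanned object back to its two pages. */
static void sketch_unmap_object(struct sketch_map *m)
{
	size_t head;

	/* Single-page object was modified in place; nothing to do. */
	if (m->off + m->size <= SKETCH_PAGE_SIZE)
		return;

	head = SKETCH_PAGE_SIZE - m->off;
	memcpy(m->first + m->off, m->buf, head);
	memcpy(m->second, m->buf + head, m->size - head);
}
```

A caller would pair each sketch_map_object() with one
sketch_unmap_object() on the same state.  The sketch always copies in
both directions for a spanned object; it makes no attempt to skip a
copy when the caller only reads or only writes.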

There are no changes in arch/x86 required.

This patchset is based on Greg's staging-next.

Seth Jennings (4):
  zsmalloc: remove x86 dependency
  zsmalloc: add single-page object fastpath in unmap
  zsmalloc: add details to zs_map_object boiler plate
  zsmalloc: add mapping modes

 drivers/staging/zcache/zcache-main.c     |    6 +-
 drivers/staging/zram/zram_drv.c          |    7 +-
 drivers/staging/zsmalloc/Kconfig         |    4 -
 drivers/staging/zsmalloc/zsmalloc-main.c |  124 ++++++++++++++++++++++--------
 drivers/staging/zsmalloc/zsmalloc.h      |   14 +++-
 drivers/staging/zsmalloc/zsmalloc_int.h  |    6 +-
 6 files changed, 114 insertions(+), 47 deletions(-)

-- 
1.7.9.5
