* [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Minchan Kim @ 2013-02-06  2:17 UTC
To: Greg Kroah-Hartman
Cc: linux-mm, linux-kernel, Minchan Kim, Andrew Morton, Seth Jennings,
	Nitin Gupta, Dan Magenheimer, Konrad Rzeszutek Wilk

Zsmalloc has two methods 1) copy-based and 2) pte-based to access
allocations that span two pages. You can see history why we supported
two approaches from [1].

In summary, copy-based method is 3 times faster in x86 while pte-based
is 6 times faster in ARM.

But it was bad choice that adding hard coding to select architecture
which want to use pte based method. This patch removed it and adds
new Kconfig to select the approach.

This patch is based on next-20130205.

[1] https://lkml.org/lkml/2012/7/11/58

* Changelog from v1
  * Fix CONFIG_PGTABLE_MAPPING in zsmalloc-main.c - Greg

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/Kconfig         | 12 ++++++++++++
 drivers/staging/zsmalloc/zsmalloc-main.c | 20 +++++---------------
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index 9084565..232b3b6 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -8,3 +8,15 @@ config ZSMALLOC
 	  non-standard allocator interface where a handle, not a pointer, is
 	  returned by an alloc(). This handle must be mapped in order to
 	  access the allocated space.
+
+config PGTABLE_MAPPING
+        bool "Use page table mapping to access allocations that span two pages"
+        depends on ZSMALLOC
+        default n
+        help
+          By default, zsmalloc uses a copy-based object mapping method to access
+          allocations that span two pages. However, if a particular architecture
+          performs VM mapping faster than copying, then you should select this.
+          This causes zsmalloc to use page table mapping rather than copying
+          for object mapping. You can check speed with zsmalloc benchmark[1].
+          [1] https://github.com/spartacus06/zsmalloc
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 06f73a9..2c1805c 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -207,6 +207,7 @@ struct zs_pool {
 	struct size_class size_class[ZS_SIZE_CLASSES];
 
 	gfp_t flags;	/* allocation flags used when growing pool */
+
 };
 
 /*
@@ -218,19 +219,8 @@ struct zs_pool {
 #define CLASS_IDX_MASK	((1 << CLASS_IDX_BITS) - 1)
 #define FULLNESS_MASK	((1 << FULLNESS_BITS) - 1)
 
-/*
- * By default, zsmalloc uses a copy-based object mapping method to access
- * allocations that span two pages. However, if a particular architecture
- * performs VM mapping faster than copying, then it should be added here
- * so that USE_PGTABLE_MAPPING is defined. This causes zsmalloc to use
- * page table mapping rather than copying for object mapping.
-*/
-#if defined(CONFIG_ARM)
-#define USE_PGTABLE_MAPPING
-#endif
-
 struct mapping_area {
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 	struct vm_struct *vm; /* vm area for mapping object that span pages */
 #else
 	char *vm_buf; /* copy buffer for objects that span pages */
@@ -622,7 +612,7 @@ static struct page *find_get_zspage(struct size_class *class)
 	return page;
 }
 
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
 	/*
@@ -663,7 +653,7 @@ static inline void __zs_unmap_object(struct mapping_area *area,
 	flush_tlb_kernel_range(addr, end);
 }
 
-#else /* USE_PGTABLE_MAPPING */
+#else /* CONFIG_PGTABLE_MAPPING*/
 
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
@@ -741,7 +731,7 @@ out:
 	pagefault_enable();
 }
 
-#endif /* USE_PGTABLE_MAPPING */
+#endif /* CONFIG_PGTABLE_MAPPING */
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
 				void *pcpu)
-- 
1.8.1.1
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Greg Kroah-Hartman @ 2013-02-06  2:28 UTC
To: Minchan Kim
Cc: linux-mm, linux-kernel, Andrew Morton, Seth Jennings, Nitin Gupta,
	Dan Magenheimer, Konrad Rzeszutek Wilk

On Wed, Feb 06, 2013 at 11:17:08AM +0900, Minchan Kim wrote:
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index 9084565..232b3b6 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -8,3 +8,15 @@ config ZSMALLOC
>  	  non-standard allocator interface where a handle, not a pointer, is
>  	  returned by an alloc(). This handle must be mapped in order to
>  	  access the allocated space.
> +
> +config PGTABLE_MAPPING
> +        bool "Use page table mapping to access allocations that span two pages"

No tabs?

Please also put "ZSmalloc" somewhere in the text here, otherwise it
really doesn't make much sense when seeing it in a menu.

> +        depends on ZSMALLOC
> +        default n

That's the default, so it can be dropped.

> +        help
> +          By default, zsmalloc uses a copy-based object mapping method to access
> +          allocations that span two pages. However, if a particular architecture
> +          performs VM mapping faster than copying, then you should select this.
> +          This causes zsmalloc to use page table mapping rather than copying
> +          for object mapping. You can check speed with zsmalloc benchmark[1].
> +          [1] https://github.com/spartacus06/zsmalloc

Care to specify exactly _what_ architectures this should be set for or
not? That will help the distros out a lot in determining if this should
be enabled or not.

> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 06f73a9..2c1805c 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -207,6 +207,7 @@ struct zs_pool {
>  	struct size_class size_class[ZS_SIZE_CLASSES];
>
>  	gfp_t flags;	/* allocation flags used when growing pool */
> +
>  };
>
>  /*

Why add this extra line?

thanks,

greg k-h
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Minchan Kim @ 2013-02-06  2:50 UTC
To: Greg Kroah-Hartman
Cc: linux-mm, linux-kernel, Andrew Morton, Seth Jennings, Nitin Gupta,
	Dan Magenheimer, Konrad Rzeszutek Wilk

On Tue, Feb 05, 2013 at 06:28:54PM -0800, Greg Kroah-Hartman wrote:
> On Wed, Feb 06, 2013 at 11:17:08AM +0900, Minchan Kim wrote:
> > diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> > index 9084565..232b3b6 100644
> > --- a/drivers/staging/zsmalloc/Kconfig
> > +++ b/drivers/staging/zsmalloc/Kconfig
> > @@ -8,3 +8,15 @@ config ZSMALLOC
> >  	  non-standard allocator interface where a handle, not a pointer, is
> >  	  returned by an alloc(). This handle must be mapped in order to
> >  	  access the allocated space.
> > +
> > +config PGTABLE_MAPPING
> > +        bool "Use page table mapping to access allocations that span two pages"
>
> No tabs?
>
> Please also put "ZSmalloc" somewhere in the text here, otherwise it
> really doesn't make much sense when seeing it in a menu.
>
> > +        depends on ZSMALLOC
> > +        default n
>
> That's the default, so it can be dropped.
>
> > +        help
> > +          By default, zsmalloc uses a copy-based object mapping method to access
> > +          allocations that span two pages. However, if a particular architecture
> > +          performs VM mapping faster than copying, then you should select this.
> > +          This causes zsmalloc to use page table mapping rather than copying
> > +          for object mapping. You can check speed with zsmalloc benchmark[1].
> > +          [1] https://github.com/spartacus06/zsmalloc
>
> Care to specify exactly _what_ architectures this should be set for or
> not? That will help the distros out a lot in determining if this should
> be enabled or not.
>
> > diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> > index 06f73a9..2c1805c 100644
> > --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> > +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> > @@ -207,6 +207,7 @@ struct zs_pool {
> >  	struct size_class size_class[ZS_SIZE_CLASSES];
> >
> >  	gfp_t flags;	/* allocation flags used when growing pool */
> > +
> >  };
> >
> >  /*
>
> Why add this extra line?
>
> thanks,
>
> greg k-h

Sorry for bothering you.
I fixed all you pointed out.
Thanks for the review, Greg!

Here it goes.

------------------- 8< -------------------

From 506acea72916c9a12cf80290bc5cd87f4af1914d Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Wed, 6 Feb 2013 11:10:59 +0900
Subject: [PATCH v3] zsmalloc: Add Kconfig for enabling PTE method

Zsmalloc has two methods 1) copy-based and 2) pte-based to access
allocations that span two pages. You can see history why we supported
two approaches from [1].

In summary, copy-based method is 3 times faster in x86 while pte-based
is 6 times faster in ARM.

But it was bad choice that adding hard coding to select architecture
which want to use pte based method. This patch removed it and adds
new Kconfig to select the approach.

This patch is based on next-20130205.

[1] https://lkml.org/lkml/2012/7/11/58

* Changelog from v2
  * Add tab and drop "default n" - Greg
  * Modify description - Greg
  * Drop unnecessary extra line - Greg

* Changelog from v1
  * Fix CONFIG_PGTABLE_MAPPING in zsmalloc-main.c - Greg

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/Kconfig         | 13 +++++++++++++
 drivers/staging/zsmalloc/zsmalloc-main.c | 19 ++++---------------
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index 9084565..83f9cec 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -8,3 +8,16 @@ config ZSMALLOC
 	  non-standard allocator interface where a handle, not a pointer, is
 	  returned by an alloc(). This handle must be mapped in order to
 	  access the allocated space.
+
+config PGTABLE_MAPPING
+	bool "Use page table mapping to access object in zsmalloc"
+	depends on ZSMALLOC
+	help
+	  By default, zsmalloc uses a copy-based object mapping method to
+	  access allocations that span two pages. However, if a particular
+	  architecture (ex, ARM) performs VM mapping faster than copying,
+	  then you should select this. This causes zsmalloc to use page table
+	  mapping rather than copying for object mapping.
+
+	  You can check speed with zsmalloc benchmark[1].
+	  [1] https://github.com/spartacus06/zsmalloc
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 06f73a9..aa6aac4 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -218,19 +218,8 @@ struct zs_pool {
 #define CLASS_IDX_MASK	((1 << CLASS_IDX_BITS) - 1)
 #define FULLNESS_MASK	((1 << FULLNESS_BITS) - 1)
 
-/*
- * By default, zsmalloc uses a copy-based object mapping method to access
- * allocations that span two pages. However, if a particular architecture
- * performs VM mapping faster than copying, then it should be added here
- * so that USE_PGTABLE_MAPPING is defined. This causes zsmalloc to use
- * page table mapping rather than copying for object mapping.
-*/
-#if defined(CONFIG_ARM)
-#define USE_PGTABLE_MAPPING
-#endif
-
 struct mapping_area {
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 	struct vm_struct *vm; /* vm area for mapping object that span pages */
 #else
 	char *vm_buf; /* copy buffer for objects that span pages */
@@ -622,7 +611,7 @@ static struct page *find_get_zspage(struct size_class *class)
 	return page;
 }
 
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
 	/*
@@ -663,7 +652,7 @@ static inline void __zs_unmap_object(struct mapping_area *area,
 	flush_tlb_kernel_range(addr, end);
 }
 
-#else /* USE_PGTABLE_MAPPING */
+#else /* CONFIG_PGTABLE_MAPPING*/
 
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
@@ -741,7 +730,7 @@ out:
 	pagefault_enable();
 }
 
-#endif /* USE_PGTABLE_MAPPING */
+#endif /* CONFIG_PGTABLE_MAPPING */
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
 				void *pcpu)
-- 
1.8.1.1

-- 
Kind regards,
Minchan Kim
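Editorial sketch (not part of the thread): the patch above replaces a
hard-coded per-architecture define with a Kconfig symbol, so the choice of
mapping method becomes a compile-time switch on CONFIG_PGTABLE_MAPPING. The
userspace fragment below only illustrates that mechanism. The names
sketch_mapping_area, sketch_uses_pgtable and sketch_check_layout are
hypothetical, and the void pointer is a stand-in for the kernel's
struct vm_struct pointer; none of this is the driver's actual code.

```c
#include <assert.h>

/*
 * Userspace sketch only. In a kernel build, selecting PGTABLE_MAPPING in
 * Kconfig leads to CONFIG_PGTABLE_MAPPING being defined for the compiler.
 * Here the macro is simply undefined unless you pass
 * -DCONFIG_PGTABLE_MAPPING on the command line.
 */
struct sketch_mapping_area {	/* hypothetical stand-in for struct mapping_area */
#ifdef CONFIG_PGTABLE_MAPPING
	void *vm;	/* stand-in for: struct vm_struct *vm */
#else
	char *vm_buf;	/* copy buffer for objects that span pages */
#endif
};

/* Hypothetical helper: returns 1 if this build selected the pte-based method. */
static int sketch_uses_pgtable(void)
{
#ifdef CONFIG_PGTABLE_MAPPING
	return 1;
#else
	return 0;
#endif
}

/* Either variant holds exactly one pointer, so the switch keeps the size. */
static void sketch_check_layout(void)
{
	assert(sizeof(struct sketch_mapping_area) == sizeof(void *));
}
```

Compiled without the macro, sketch_uses_pgtable() reports the copy-based
default; compiled with -DCONFIG_PGTABLE_MAPPING it reports the pte-based
method, which mirrors what the new Kconfig entry is meant to control.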
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Seth Jennings @ 2013-02-06 16:47 UTC
To: Minchan Kim
Cc: Greg Kroah-Hartman, linux-mm, linux-kernel, Andrew Morton,
	Nitin Gupta, Dan Magenheimer, Konrad Rzeszutek Wilk

On 02/05/2013 08:17 PM, Minchan Kim wrote:
> Zsmalloc has two methods 1) copy-based and 2) pte-based to access
> allocations that span two pages. You can see history why we supported
> two approaches from [1].
>
> In summary, copy-based method is 3 times faster in x86 while pte-based
> is 6 times faster in ARM.
>
> But it was bad choice that adding hard coding to select architecture
> which want to use pte based method. This patch removed it and adds
> new Kconfig to select the approach.
>
> This patch is based on next-20130205.
>
> [1] https://lkml.org/lkml/2012/7/11/58
>
> * Changelog from v1
>   * Fix CONFIG_PGTABLE_MAPPING in zsmalloc-main.c - Greg
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
> Cc: Nitin Gupta <ngupta@vflare.org>
> Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
> Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/Kconfig         | 12 ++++++++++++
>  drivers/staging/zsmalloc/zsmalloc-main.c | 20 +++++---------------
>  2 files changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index 9084565..232b3b6 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -8,3 +8,15 @@ config ZSMALLOC
>  	  non-standard allocator interface where a handle, not a pointer, is
>  	  returned by an alloc(). This handle must be mapped in order to
>  	  access the allocated space.
> +
> +config PGTABLE_MAPPING
> +        bool "Use page table mapping to access allocations that span two pages"
> +        depends on ZSMALLOC
> +        default n
> +        help
> +          By default, zsmalloc uses a copy-based object mapping method to access
> +          allocations that span two pages. However, if a particular architecture
> +          performs VM mapping faster than copying, then you should select this.
> +          This causes zsmalloc to use page table mapping rather than copying
> +          for object mapping. You can check speed with zsmalloc benchmark[1].
> +          [1] https://github.com/spartacus06/zsmalloc

Hmm, I'm not sure we want to include this link in the Kconfig. While I
don't have any plans to take that repo down, I could see it getting
stale at some point for yet-to-be-determined reasons.

Of course, without this tool (or something like it) it is hard to know
which option is better for your particular platform.

Would having this in a Documentation/ file, once one exists, be better?

Seth
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Minchan Kim @ 2013-02-06 23:16 UTC
To: Seth Jennings
Cc: Greg Kroah-Hartman, linux-mm, linux-kernel, Andrew Morton,
	Nitin Gupta, Dan Magenheimer, Konrad Rzeszutek Wilk

On Wed, Feb 06, 2013 at 10:47:16AM -0600, Seth Jennings wrote:
> On 02/05/2013 08:17 PM, Minchan Kim wrote:
> > Zsmalloc has two methods 1) copy-based and 2) pte-based to access
> > allocations that span two pages. You can see history why we supported
> > two approaches from [1].
> >
> > In summary, copy-based method is 3 times faster in x86 while pte-based
> > is 6 times faster in ARM.
> >
> > But it was bad choice that adding hard coding to select architecture
> > which want to use pte based method. This patch removed it and adds
> > new Kconfig to select the approach.
> >
> > This patch is based on next-20130205.
> >
> > [1] https://lkml.org/lkml/2012/7/11/58
> >
> > * Changelog from v1
> >   * Fix CONFIG_PGTABLE_MAPPING in zsmalloc-main.c - Greg
> >
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
> > Cc: Nitin Gupta <ngupta@vflare.org>
> > Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
> > Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  drivers/staging/zsmalloc/Kconfig         | 12 ++++++++++++
> >  drivers/staging/zsmalloc/zsmalloc-main.c | 20 +++++---------------
> >  2 files changed, 17 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> > index 9084565..232b3b6 100644
> > --- a/drivers/staging/zsmalloc/Kconfig
> > +++ b/drivers/staging/zsmalloc/Kconfig
> > @@ -8,3 +8,15 @@ config ZSMALLOC
> >  	  non-standard allocator interface where a handle, not a pointer, is
> >  	  returned by an alloc(). This handle must be mapped in order to
> >  	  access the allocated space.
> > +
> > +config PGTABLE_MAPPING
> > +        bool "Use page table mapping to access allocations that span two pages"
> > +        depends on ZSMALLOC
> > +        default n
> > +        help
> > +          By default, zsmalloc uses a copy-based object mapping method to access
> > +          allocations that span two pages. However, if a particular architecture
> > +          performs VM mapping faster than copying, then you should select this.
> > +          This causes zsmalloc to use page table mapping rather than copying
> > +          for object mapping. You can check speed with zsmalloc benchmark[1].
> > +          [1] https://github.com/spartacus06/zsmalloc
>
> Hmm, I'm not sure we want to include this link in the Kconfig. While I
> don't have any plans to take that repo down, I could see it getting
> stale at some point for yet-to-be-determined reasons.
>
> Of course, without this tool (or something like it) it is hard to know
> which option is better for your particular platform.
>
> Would having this in a Documentation/ file, once one exists, be better?

It could be better. Then let's point to that documentation in Kconfig.
Okay, let's sort it out.

>
> Seth

-- 
Kind regards,
Minchan Kim
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Ric Mason @ 2013-02-17  6:19 UTC
To: Minchan Kim
Cc: Greg Kroah-Hartman, linux-mm, linux-kernel, Andrew Morton,
	Seth Jennings, Nitin Gupta, Dan Magenheimer, Konrad Rzeszutek Wilk

On 02/06/2013 10:17 AM, Minchan Kim wrote:
> Zsmalloc has two methods 1) copy-based and 2) pte-based to access
> allocations that span two pages. You can see history why we supported
> two approaches from [1].
>
> In summary, copy-based method is 3 times faster in x86 while pte-based
> is 6 times faster in ARM.

Why in some arches copy-based method is better and in the other arches
pte-based is better? What's the root reason?

>
> But it was bad choice that adding hard coding to select architecture
> which want to use pte based method. This patch removed it and adds
> new Kconfig to select the approach.
>
> This patch is based on next-20130205.
>
> [1] https://lkml.org/lkml/2012/7/11/58
>
> * Changelog from v1
>   * Fix CONFIG_PGTABLE_MAPPING in zsmalloc-main.c - Greg
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
> Cc: Nitin Gupta <ngupta@vflare.org>
> Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
> Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/Kconfig         | 12 ++++++++++++
>  drivers/staging/zsmalloc/zsmalloc-main.c | 20 +++++---------------
>  2 files changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index 9084565..232b3b6 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -8,3 +8,15 @@ config ZSMALLOC
>  	  non-standard allocator interface where a handle, not a pointer, is
>  	  returned by an alloc(). This handle must be mapped in order to
>  	  access the allocated space.
> +
> +config PGTABLE_MAPPING
> +        bool "Use page table mapping to access allocations that span two pages"
> +        depends on ZSMALLOC
> +        default n
> +        help
> +          By default, zsmalloc uses a copy-based object mapping method to access
> +          allocations that span two pages. However, if a particular architecture
> +          performs VM mapping faster than copying, then you should select this.
> +          This causes zsmalloc to use page table mapping rather than copying
> +          for object mapping. You can check speed with zsmalloc benchmark[1].
> +          [1] https://github.com/spartacus06/zsmalloc
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 06f73a9..2c1805c 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -207,6 +207,7 @@ struct zs_pool {
>  	struct size_class size_class[ZS_SIZE_CLASSES];
>
>  	gfp_t flags;	/* allocation flags used when growing pool */
> +
>  };
>
>  /*
> @@ -218,19 +219,8 @@ struct zs_pool {
>  #define CLASS_IDX_MASK	((1 << CLASS_IDX_BITS) - 1)
>  #define FULLNESS_MASK	((1 << FULLNESS_BITS) - 1)
>
> -/*
> - * By default, zsmalloc uses a copy-based object mapping method to access
> - * allocations that span two pages. However, if a particular architecture
> - * performs VM mapping faster than copying, then it should be added here
> - * so that USE_PGTABLE_MAPPING is defined. This causes zsmalloc to use
> - * page table mapping rather than copying for object mapping.
> -*/
> -#if defined(CONFIG_ARM)
> -#define USE_PGTABLE_MAPPING
> -#endif
> -
>  struct mapping_area {
> -#ifdef USE_PGTABLE_MAPPING
> +#ifdef CONFIG_PGTABLE_MAPPING
>  	struct vm_struct *vm; /* vm area for mapping object that span pages */
>  #else
>  	char *vm_buf; /* copy buffer for objects that span pages */
> @@ -622,7 +612,7 @@ static struct page *find_get_zspage(struct size_class *class)
>  	return page;
>  }
>
> -#ifdef USE_PGTABLE_MAPPING
> +#ifdef CONFIG_PGTABLE_MAPPING
>  static inline int __zs_cpu_up(struct mapping_area *area)
>  {
>  	/*
> @@ -663,7 +653,7 @@ static inline void __zs_unmap_object(struct mapping_area *area,
>  	flush_tlb_kernel_range(addr, end);
>  }
>
> -#else /* USE_PGTABLE_MAPPING */
> +#else /* CONFIG_PGTABLE_MAPPING*/
>
>  static inline int __zs_cpu_up(struct mapping_area *area)
>  {
> @@ -741,7 +731,7 @@ out:
>  	pagefault_enable();
>  }
>
> -#endif /* USE_PGTABLE_MAPPING */
> +#endif /* CONFIG_PGTABLE_MAPPING */
>
>  static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
>  				void *pcpu)
* RE: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Dan Magenheimer @ 2013-02-18 18:07 UTC
To: Ric Mason, Minchan Kim
Cc: Greg Kroah-Hartman, linux-mm, linux-kernel, Andrew Morton,
	Seth Jennings, Nitin Gupta, Konrad Rzeszutek Wilk

> From: Ric Mason [mailto:ric.masonn@gmail.com]
> Sent: Saturday, February 16, 2013 11:19 PM
> To: Minchan Kim
> Cc: Greg Kroah-Hartman; linux-mm@kvack.org; linux-kernel@vger.kernel.org; Andrew Morton; Seth
> Jennings; Nitin Gupta; Dan Magenheimer; Konrad Rzeszutek Wilk
> Subject: Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
>
> On 02/06/2013 10:17 AM, Minchan Kim wrote:
> > Zsmalloc has two methods 1) copy-based and 2) pte-based to access
> > allocations that span two pages. You can see history why we supported
> > two approaches from [1].
> >
> > In summary, copy-based method is 3 times faster in x86 while pte-based
> > is 6 times faster in ARM.
>
> Why in some arches copy-based method is better and in the other arches
> pte-based is better? What's the root reason?

Minchan, if you post another version, I think these precise
numbers (of "times faster") should be removed. The speed is very
data dependent, because the copy-based method is copying a zpage
which may vary widely in size from ~100 bytes to nearly PAGE_SIZE
bytes, a factor of 40x or more. Please at least say "up to 3 times"
or "approximately 3x faster for an average compressed page".

Ric, the copy-based method does an extra copy of N bytes (where N
is the compressed size of a page). The pte-based method requires
extra TLB actions. The relative speed of TLB operations vs copying
is very architecture-dependent. It is also probably dependent on
the specific implementation of the architecture (i.e. x86 sandybridge
is likely very different than x86 nehalem) and, as noted above,
dependent on N which is unpredictable. So it makes sense to have
both choices, but it's not at all clear how to select which one
to use!
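Editorial sketch (not part of the thread): the "extra copy of N bytes" that
Dan describes can be pictured with a small userspace model. An object that
starts near the end of one page and finishes in another, non-adjacent page is
gathered into a contiguous buffer on map and scattered back on unmap. The
names sketch_map_copy, sketch_unmap_copy and SKETCH_PAGE_SIZE are hypothetical
and the page size is an assumed 4096 bytes; this models the idea only and is
not the zsmalloc implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

/*
 * Gather an object of `size` bytes that begins at offset `off` in `first`
 * and continues at the start of `second` into the contiguous buffer `buf`.
 */
static void sketch_map_copy(char *buf, const char *first, const char *second,
			    size_t off, size_t size)
{
	size_t head;

	/* The object must really straddle the page boundary. */
	assert(off < SKETCH_PAGE_SIZE && off + size > SKETCH_PAGE_SIZE);

	head = SKETCH_PAGE_SIZE - off;		/* bytes held by the first page */
	memcpy(buf, first + off, head);
	memcpy(buf + head, second, size - head);
}

/* Scatter the (possibly modified) buffer back into the two pages. */
static void sketch_unmap_copy(const char *buf, char *first, char *second,
			      size_t off, size_t size)
{
	size_t head;

	assert(off < SKETCH_PAGE_SIZE && off + size > SKETCH_PAGE_SIZE);

	head = SKETCH_PAGE_SIZE - off;
	memcpy(first + off, buf, head);
	memcpy(second, buf + head, size - head);
}
```

In this model every map and unmap touches all N bytes of the object, so the
cost grows with the compressed size. A pte-based scheme would instead pay a
roughly fixed price for setting up and tearing down a mapping, which is why
the better choice depends on both N and the hardware.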
* Re: [PATCH v2] zsmalloc: Add Kconfig for enabling PTE method
From: Seth Jennings @ 2013-02-18 18:08 UTC
To: Ric Mason
Cc: Minchan Kim, Greg Kroah-Hartman, linux-mm, linux-kernel,
	Andrew Morton, Nitin Gupta, Dan Magenheimer, Konrad Rzeszutek Wilk

On 02/17/2013 12:19 AM, Ric Mason wrote:
> On 02/06/2013 10:17 AM, Minchan Kim wrote:
>> Zsmalloc has two methods 1) copy-based and 2) pte-based to access
>> allocations that span two pages. You can see history why we supported
>> two approaches from [1].
>>
>> In summary, copy-based method is 3 times faster in x86 while pte-based
>> is 6 times faster in ARM.
>
> Why in some arches copy-based method is better and in the other arches
> pte-based is better? What's the root reason?

Minchan might know more about this (or Russell King) but I'll give it
a try.

MMU designs can vary pretty significantly from arch to arch. An
operation that is cheap on one MMU design can be expensive on another,
especially once SMP gets involved, possibly resulting in
inter-processor interrupts.

RAM speed is also a factor since the copy-method will use more memory
bandwidth. Embedded systems typically won't have really fast memory.

Seth