* [PATCH 0/6] zsmalloc: clean up and fix arch dependency
@ 2012-04-25  6:23 Minchan Kim
  2012-04-25  6:23 ` [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit Minchan Kim
                   ` (7 more replies)
  0 siblings, 8 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

This patchset contains some cleanup patches (1-5) and removes
set_bit and flush_tlb usage for portability in [6/6].

Minchan Kim (6):
  zsmalloc: use PageFlag macro instead of [set|test]_bit
  zsmalloc: remove unnecessary alignment
  zsmalloc: rename zspage_order with zspage_pages
  zsmalloc: add/fix function comment
  zsmalloc: remove unnecessary type casting
  zsmalloc: make zsmalloc portable

 drivers/staging/zsmalloc/Kconfig         |    4 --
 drivers/staging/zsmalloc/zsmalloc-main.c |   73 +++++++++++++++++-------------
 drivers/staging/zsmalloc/zsmalloc_int.h  |    3 +-
 3 files changed, 43 insertions(+), 37 deletions(-)

-- 
1.7.9.5



* [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 12:43   ` Nitin Gupta
  2012-04-25  6:23 ` [PATCH 2/6] zsmalloc: remove unnecessary alignment Minchan Kim
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

MM code always uses the PageXXX macros to handle page flags.
Let's keep that consistency.
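
For reference, the PageXXX helpers come from the PAGEFLAG macros in
include/linux/page-flags.h and expand to the same bit operations being
replaced. A sketch of the pattern (not the literal kernel definitions):

static inline int PagePrivate(struct page *page)
{
	return test_bit(PG_private, &page->flags);
}

static inline void SetPagePrivate(struct page *page)
{
	set_bit(PG_private, &page->flags);
}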

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/zsmalloc-main.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 917461c..504b6c2 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -45,12 +45,12 @@ static DEFINE_PER_CPU(struct mapping_area, zs_map_area);
 
 static int is_first_page(struct page *page)
 {
-	return test_bit(PG_private, &page->flags);
+	return PagePrivate(page);
 }
 
 static int is_last_page(struct page *page)
 {
-	return test_bit(PG_private_2, &page->flags);
+	return PagePrivate2(page);
 }
 
 static void get_zspage_mapping(struct page *page, unsigned int *class_idx,
@@ -377,7 +377,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
 
 		INIT_LIST_HEAD(&page->lru);
 		if (i == 0) {	/* first page */
-			set_bit(PG_private, &page->flags);
+			SetPagePrivate(page);
 			set_page_private(page, 0);
 			first_page = page;
 			first_page->inuse = 0;
@@ -389,8 +389,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
 		if (i >= 2)
 			list_add(&page->lru, &prev_page->lru);
 		if (i == class->zspage_order - 1)	/* last page */
-			set_bit(PG_private_2, &page->flags);
-
+			SetPagePrivate2(page);
 		prev_page = page;
 	}
 
-- 
1.7.9.5



* [PATCH 2/6] zsmalloc: remove unnecessary alignment
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
  2012-04-25  6:23 ` [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 12:53   ` Nitin Gupta
  2012-04-25  6:23 ` [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages Minchan Kim
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

It isn't necessary to align the pool size with PAGE_SIZE.
If I missed something, please let me know.
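
For illustration, with assumed sizes (a PAGE_SIZE of 4096 and a pool
struct of a few hundred bytes), the removed rounding padded every pool
allocation to a full page:

ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);	/* e.g. roundup(600, 4096) == 4096 */
pool = kzalloc(ovhd_size, GFP_KERNEL);		/* most of the page never used */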

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/zsmalloc-main.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 504b6c2..b99ad9e 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -489,14 +489,13 @@ fail:
 
 struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
 {
-	int i, error, ovhd_size;
+	int i, error;
 	struct zs_pool *pool;
 
 	if (!name)
 		return NULL;
 
-	ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);
-	pool = kzalloc(ovhd_size, GFP_KERNEL);
+	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
 	if (!pool)
 		return NULL;
 
-- 
1.7.9.5



* [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
  2012-04-25  6:23 ` [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit Minchan Kim
  2012-04-25  6:23 ` [PATCH 2/6] zsmalloc: remove unnecessary alignment Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 13:03   ` Nitin Gupta
  2012-04-25  6:23 ` [PATCH 4/6] zsmalloc: add/fix function comment Minchan Kim
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

zspage_order defines how many pages are needed to make a zspage,
so _order_ is rather awkward naming. It already misled Jonathan
- http://lwn.net/Articles/477067/
" For each size, the code calculates an optimum number of pages (up to 16)"

Let's change from _order_ to _pages_.
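
The confusion is understandable because "order" in MM code conventionally
means a power of two, while a zspage is just a count of linked 0-order
pages. A sketch of the contrast (names are from this series):

/* MM "order": alloc_pages(gfp, order) returns 2^order contiguous pages */
struct page *p = alloc_pages(GFP_KERNEL, 2);	/* order 2 => 4 pages */

/* zsmalloc: N discontiguous single pages linked into one zspage, where
 * N = class->zspage_pages is a plain count, not a power of two */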

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/zsmalloc-main.c |   14 +++++++-------
 drivers/staging/zsmalloc/zsmalloc_int.h  |    2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index b99ad9e..0fe4cbb 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -180,7 +180,7 @@ out:
  * link together 3 PAGE_SIZE sized pages to form a zspage
  * since then we can perfectly fit in 8 such objects.
  */
-static int get_zspage_order(int class_size)
+static int get_zspage_pages(int class_size)
 {
 	int i, max_usedpc = 0;
 	/* zspage order which gives maximum used size per KB */
@@ -368,7 +368,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
 	 * identify the last page.
 	 */
 	error = -ENOMEM;
-	for (i = 0; i < class->zspage_order; i++) {
+	for (i = 0; i < class->zspage_pages; i++) {
 		struct page *page, *prev_page;
 
 		page = alloc_page(flags);
@@ -388,7 +388,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
 			page->first_page = first_page;
 		if (i >= 2)
 			list_add(&page->lru, &prev_page->lru);
-		if (i == class->zspage_order - 1)	/* last page */
+		if (i == class->zspage_pages - 1)	/* last page */
 			SetPagePrivate2(page);
 		prev_page = page;
 	}
@@ -397,7 +397,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
 
 	first_page->freelist = obj_location_to_handle(first_page, 0);
 	/* Maximum number of objects we can store in this zspage */
-	first_page->objects = class->zspage_order * PAGE_SIZE / class->size;
+	first_page->objects = class->zspage_pages * PAGE_SIZE / class->size;
 
 	error = 0; /* Success */
 
@@ -511,7 +511,7 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
 		class->size = size;
 		class->index = i;
 		spin_lock_init(&class->lock);
-		class->zspage_order = get_zspage_order(size);
+		class->zspage_pages = get_zspage_pages(size);
 
 	}
 
@@ -602,7 +602,7 @@ void *zs_malloc(struct zs_pool *pool, size_t size)
 
 		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
 		spin_lock(&class->lock);
-		class->pages_allocated += class->zspage_order;
+		class->pages_allocated += class->zspage_pages;
 	}
 
 	obj = first_page->freelist;
@@ -657,7 +657,7 @@ void zs_free(struct zs_pool *pool, void *obj)
 	fullness = fix_fullness_group(pool, first_page);
 
 	if (fullness == ZS_EMPTY)
-		class->pages_allocated -= class->zspage_order;
+		class->pages_allocated -= class->zspage_pages;
 
 	spin_unlock(&class->lock);
 
diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h
index 92eefc6..8f9ce0c 100644
--- a/drivers/staging/zsmalloc/zsmalloc_int.h
+++ b/drivers/staging/zsmalloc/zsmalloc_int.h
@@ -124,7 +124,7 @@ struct size_class {
 	unsigned int index;
 
 	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
-	int zspage_order;
+	int zspage_pages;
 
 	spinlock_t lock;
 
-- 
1.7.9.5



* [PATCH 4/6] zsmalloc: add/fix function comment
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
                   ` (2 preceding siblings ...)
  2012-04-25  6:23 ` [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 13:27   ` Nitin Gupta
  2012-04-25  6:23 ` [PATCH 5/6] zsmalloc: remove unnecessary type casting Minchan Kim
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

Add/fix the comment.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/zsmalloc-main.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 0fe4cbb..b7d31cc 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -565,12 +565,9 @@ EXPORT_SYMBOL_GPL(zs_destroy_pool);
  * zs_malloc - Allocate block of given size from pool.
  * @pool: pool to allocate from
  * @size: size of block to allocate
- * @page: page no. that holds the object
- * @offset: location of object within page
  *
  * On success, <page, offset> identifies block allocated
- * and 0 is returned. On failure, <page, offset> is set to
- * 0 and -ENOMEM is returned.
+ * and <page, offset> is returned. On failure, NULL is returned.
  *
  * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
  */
@@ -666,6 +663,16 @@ void zs_free(struct zs_pool *pool, void *obj)
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
+/**
+ * zs_map_object - get address of allocated object from handle.
+ * @pool: object allocated pool
+ * @handle: handle returned from zs_malloc
+ *
+ * Before using object allocated from zs_malloc, object
+ * should be mapped to page table by this function.
+ * After using object,  call zs_unmap_object to unmap page
+ * table.
+ */
 void *zs_map_object(struct zs_pool *pool, void *handle)
 {
 	struct page *page;
-- 
1.7.9.5



* [PATCH 5/6] zsmalloc: remove unnecessary type casting
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
                   ` (3 preceding siblings ...)
  2012-04-25  6:23 ` [PATCH 4/6] zsmalloc: add/fix function comment Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 13:35   ` Nitin Gupta
  2012-04-25  6:23 ` [PATCH 6/6] zsmalloc: make zsmalloc portable Minchan Kim
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

Let's remove unnecessary type casting of (void *).

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/zsmalloc-main.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index b7d31cc..ff089f8 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -644,8 +644,7 @@ void zs_free(struct zs_pool *pool, void *obj)
 	spin_lock(&class->lock);
 
 	/* Insert this object in containing zspage's freelist */
-	link = (struct link_free *)((unsigned char *)kmap_atomic(f_page)
-							+ f_offset);
+	link = (struct link_free *)(kmap_atomic(f_page)	+ f_offset);
 	link->next = first_page->freelist;
 	kunmap_atomic(link);
 	first_page->freelist = obj;
-- 
1.7.9.5



* [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
                   ` (4 preceding siblings ...)
  2012-04-25  6:23 ` [PATCH 5/6] zsmalloc: remove unnecessary type casting Minchan Kim
@ 2012-04-25  6:23 ` Minchan Kim
  2012-04-25 14:32   ` Nitin Gupta
  2012-04-26  2:11   ` Minchan Kim
  2012-04-25 12:41 ` [PATCH 0/6] zsmalloc: clean up and fix arch dependency Nitin Gupta
  2012-04-25 18:14 ` Greg Kroah-Hartman
  7 siblings, 2 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-25  6:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim

zsmalloc uses __flush_tlb_one and set_pte.
These are very low-level functions that create an architecture
dependency, so zsmalloc is currently usable only on x86.
This patch replaces them with map_vm_area and unmap_kernel_range,
so it should work on all architectures.
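
Condensed, the portable pattern the patch switches to looks like this
(a sketch; page0 and page1 stand in for the two zspage pages):

struct vm_struct *vm = alloc_vm_area(2 * PAGE_SIZE, NULL);	/* PTEs preallocated */
struct page *pages[2] = { page0, page1 };
struct page **p = pages;

if (map_vm_area(vm, PAGE_KERNEL, &p))	/* cannot fail: PTEs already exist */
	BUG();
/* ... access the object through vm->addr ... */
unmap_kernel_range((unsigned long)vm->addr, 2 * PAGE_SIZE);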

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zsmalloc/Kconfig         |    4 ----
 drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
 drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index a5ab720..9084565 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -1,9 +1,5 @@
 config ZSMALLOC
 	tristate "Memory allocator for compressed pages"
-	# X86 dependency is because of the use of __flush_tlb_one and set_pte
-	# in zsmalloc-main.c.
-	# TODO: convert these to portable functions
-	depends on X86
 	default n
 	help
 	  zsmalloc is a slab-based memory allocator designed to store
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index ff089f8..cc017b1 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
 		area = &per_cpu(zs_map_area, cpu);
 		if (area->vm)
 			break;
-		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
+		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
 		if (!area->vm)
 			return notifier_from_errno(-ENOMEM);
 		break;
@@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
 	} else {
 		/* this object spans two pages */
 		struct page *nextp;
+		struct page *pages[2];
+		struct page **page_array = &pages[0];
+		int err;
 
 		nextp = get_next_page(page);
 		BUG_ON(!nextp);
 
+		page_array[0] = page;
+		page_array[1] = nextp;
 
-		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
-		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
+		/*
+		 * map_vm_area never fail because we already allocated
+		 * pages for page table in alloc_vm_area.
+		 */
+		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
+		BUG_ON(err);
 
 		/* We pre-allocated VM area so mapping can never fail */
 		area->vm_addr = area->vm->addr;
@@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
 	off = obj_idx_to_offset(page, obj_idx, class->size);
 
 	area = &__get_cpu_var(zs_map_area);
-	if (off + class->size <= PAGE_SIZE) {
+	if (off + class->size <= PAGE_SIZE)
 		kunmap_atomic(area->vm_addr);
-	} else {
-		set_pte(area->vm_ptes[0], __pte(0));
-		set_pte(area->vm_ptes[1], __pte(0));
-		__flush_tlb_one((unsigned long)area->vm_addr);
-		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
-	}
+	else
+		unmap_kernel_range((unsigned long)area->vm->addr,
+					PAGE_SIZE * 2);
+
 	put_cpu_var(zs_map_area);
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h
index 8f9ce0c..4c11c89 100644
--- a/drivers/staging/zsmalloc/zsmalloc_int.h
+++ b/drivers/staging/zsmalloc/zsmalloc_int.h
@@ -111,7 +111,6 @@ static const int fullness_threshold_frac = 4;
 
 struct mapping_area {
 	struct vm_struct *vm;
-	pte_t *vm_ptes[2];
 	char *vm_addr;
 };
 
-- 
1.7.9.5



* Re: [PATCH 0/6] zsmalloc: clean up and fix arch dependency
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
                   ` (5 preceding siblings ...)
  2012-04-25  6:23 ` [PATCH 6/6] zsmalloc: make zsmalloc portable Minchan Kim
@ 2012-04-25 12:41 ` Nitin Gupta
  2012-04-26  1:20   ` Minchan Kim
  2012-04-25 18:14 ` Greg Kroah-Hartman
  7 siblings, 1 reply; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 12:41 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

Hi Minchan,

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> This patchset contains some cleanup patches (1-5) and removes
> set_bit and flush_tlb usage for portability in [6/6].
> 
> Minchan Kim (6):
>   zsmalloc: use PageFlag macro instead of [set|test]_bit
>   zsmalloc: remove unnecessary alignment
>   zsmalloc: rename zspage_order with zspage_pages
>   zsmalloc: add/fix function comment
>   zsmalloc: remove unnecessary type casting
>   zsmalloc: make zsmalloc portable
> 
>  drivers/staging/zsmalloc/Kconfig         |    4 --
>  drivers/staging/zsmalloc/zsmalloc-main.c |   73 +++++++++++++++++-------------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    3 +-
>  3 files changed, 43 insertions(+), 37 deletions(-)
> 


Thanks for the fixes.


Your description is missing testing notes (especially since patch [6/6]
is not a cosmetic change). So, can you please add these either here in
patch 0 or as part of patch 6/6 description?

Thanks,
Nitin


* Re: [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit
  2012-04-25  6:23 ` [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit Minchan Kim
@ 2012-04-25 12:43   ` Nitin Gupta
  0 siblings, 0 replies; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 12:43 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> MM code always uses the PageXXX macros to handle page flags.
> Let's keep that consistency.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/zsmalloc-main.c |    9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>



Acked-by: Nitin Gupta <ngupta@vflare.org>

Thanks,
Nitin


* Re: [PATCH 2/6] zsmalloc: remove unnecessary alignment
  2012-04-25  6:23 ` [PATCH 2/6] zsmalloc: remove unnecessary alignment Minchan Kim
@ 2012-04-25 12:53   ` Nitin Gupta
  2012-04-26  1:42     ` Minchan Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 12:53 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> It isn't necessary to align the pool size with PAGE_SIZE.
> If I missed something, please let me know.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/zsmalloc-main.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 504b6c2..b99ad9e 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -489,14 +489,13 @@ fail:
>  
>  struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
>  {
> -	int i, error, ovhd_size;
> +	int i, error;
>  	struct zs_pool *pool;
>  
>  	if (!name)
>  		return NULL;
>  
> -	ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);
> -	pool = kzalloc(ovhd_size, GFP_KERNEL);
> +	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
>  	if (!pool)
>  		return NULL;
>  


The pool metadata is rounded up to avoid a potential false-sharing problem
(though we could just round up to cache_line_size()).

Thanks,
Nitin


* Re: [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages
  2012-04-25  6:23 ` [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages Minchan Kim
@ 2012-04-25 13:03   ` Nitin Gupta
  2012-04-26  1:46     ` Minchan Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 13:03 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> zspage_order defines how many pages are needed to make a zspage,
> so _order_ is rather awkward naming. It already misled Jonathan
> - http://lwn.net/Articles/477067/
> " For each size, the code calculates an optimum number of pages (up to 16)"
> 
> Let's change from _order_ to _pages_.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/zsmalloc-main.c |   14 +++++++-------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    2 +-
>  2 files changed, 8 insertions(+), 8 deletions(-)
>


Recently, Seth changed max_zspage_order to ZS_MAX_PAGES_PER_ZSPAGE for
the same reason. I think it would be better to rename the function in a
similar way to have some consistency. So, we could use:

1) get_pages_per_zspage() instead of get_zspage_pages()
2) class->pages_per_zspage instead of class->zspage_pages

 
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index b99ad9e..0fe4cbb 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -180,7 +180,7 @@ out:
>   * link together 3 PAGE_SIZE sized pages to form a zspage
>   * since then we can perfectly fit in 8 such objects.
>   */
> -static int get_zspage_order(int class_size)
> +static int get_zspage_pages(int class_size)
>  {
>  	int i, max_usedpc = 0;
>  	/* zspage order which gives maximum used size per KB */
> @@ -368,7 +368,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
>  	 * identify the last page.
>  	 */
>  	error = -ENOMEM;
> -	for (i = 0; i < class->zspage_order; i++) {
> +	for (i = 0; i < class->zspage_pages; i++) {
>  		struct page *page, *prev_page;
>  
>  		page = alloc_page(flags);
> @@ -388,7 +388,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
>  			page->first_page = first_page;
>  		if (i >= 2)
>  			list_add(&page->lru, &prev_page->lru);
> -		if (i == class->zspage_order - 1)	/* last page */
> +		if (i == class->zspage_pages - 1)	/* last page */
>  			SetPagePrivate2(page);
>  		prev_page = page;
>  	}
> @@ -397,7 +397,7 @@ static struct page *alloc_zspage(struct size_class *class, gfp_t flags)
>  
>  	first_page->freelist = obj_location_to_handle(first_page, 0);
>  	/* Maximum number of objects we can store in this zspage */
> -	first_page->objects = class->zspage_order * PAGE_SIZE / class->size;
> +	first_page->objects = class->zspage_pages * PAGE_SIZE / class->size;
>  
>  	error = 0; /* Success */
>  
> @@ -511,7 +511,7 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
>  		class->size = size;
>  		class->index = i;
>  		spin_lock_init(&class->lock);
> -		class->zspage_order = get_zspage_order(size);
> +		class->zspage_pages = get_zspage_pages(size);
>  
>  	}
>  
> @@ -602,7 +602,7 @@ void *zs_malloc(struct zs_pool *pool, size_t size)
>  
>  		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>  		spin_lock(&class->lock);
> -		class->pages_allocated += class->zspage_order;
> +		class->pages_allocated += class->zspage_pages;
>  	}
>  
>  	obj = first_page->freelist;
> @@ -657,7 +657,7 @@ void zs_free(struct zs_pool *pool, void *obj)
>  	fullness = fix_fullness_group(pool, first_page);
>  
>  	if (fullness == ZS_EMPTY)
> -		class->pages_allocated -= class->zspage_order;
> +		class->pages_allocated -= class->zspage_pages;
>  
>  	spin_unlock(&class->lock);
>  
> diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h
> index 92eefc6..8f9ce0c 100644
> --- a/drivers/staging/zsmalloc/zsmalloc_int.h
> +++ b/drivers/staging/zsmalloc/zsmalloc_int.h
> @@ -124,7 +124,7 @@ struct size_class {
>  	unsigned int index;
>  
>  	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
> -	int zspage_order;
> +	int zspage_pages;
>  
>  	spinlock_t lock;
>  


Thanks,
Nitin


* Re: [PATCH 4/6] zsmalloc: add/fix function comment
  2012-04-25  6:23 ` [PATCH 4/6] zsmalloc: add/fix function comment Minchan Kim
@ 2012-04-25 13:27   ` Nitin Gupta
  2012-04-26  1:51     ` Minchan Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 13:27 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> Add/fix the comment.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/zsmalloc-main.c |   15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 0fe4cbb..b7d31cc 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -565,12 +565,9 @@ EXPORT_SYMBOL_GPL(zs_destroy_pool);
>   * zs_malloc - Allocate block of given size from pool.
>   * @pool: pool to allocate from
>   * @size: size of block to allocate
> - * @page: page no. that holds the object
> - * @offset: location of object within page
>   *
>   * On success, <page, offset> identifies block allocated
> - * and 0 is returned. On failure, <page, offset> is set to
> - * 0 and -ENOMEM is returned.
> + * and <page, offset> is returned. On failure, NULL is returned.
>   *


The returned value indeed encodes <page, offset> values as a 'void *'
but this should not be part of the function documentation since it's an
internal detail.  So, it's probably better to say:

On success, handle to the allocated object is returned; NULL otherwise.

On a side note, we should also 'typedef void * zs_handle' to avoid any
confusion. Without this, users may just treat the zs_malloc return value
as a pointer and try to dereference it.
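
A minimal sketch of that suggestion (hypothetical; no such typedef
exists in the tree at this point):

typedef void *zs_handle;

zs_handle zs_malloc(struct zs_pool *pool, size_t size);
void zs_free(struct zs_pool *pool, zs_handle handle);
void *zs_map_object(struct zs_pool *pool, zs_handle handle);
void zs_unmap_object(struct zs_pool *pool, zs_handle handle);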

>   * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
>   */
> @@ -666,6 +663,16 @@ void zs_free(struct zs_pool *pool, void *obj)
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>  
> +/**
> + * zs_map_object - get address of allocated object from handle.
> + * @pool: object allocated pool


should be: @pool: pool from which the object was allocated

> + * @handle: handle returned from zs_malloc
> + *

> + * Before using object allocated from zs_malloc, object
> + * should be mapped to page table by this function.
> + * After using object,  call zs_unmap_object to unmap page
> + * table.
> + */


We are not really unmapping any page tables, so this could be written as:

Before using an object allocated from zs_malloc, it must be mapped using
this function. When done with the object, it must be unmapped using
zs_unmap_object.
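
For context, the usage pattern being documented looks roughly like this
(a sketch; pool, src and len are assumed to exist):

void *handle = zs_malloc(pool, len);

if (handle) {
	void *dst = zs_map_object(pool, handle);

	memcpy(dst, src, len);
	zs_unmap_object(pool, handle);
}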


Sorry for nitpicking.

Thanks,
Nitin


* Re: [PATCH 5/6] zsmalloc: remove unnecessary type casting
  2012-04-25  6:23 ` [PATCH 5/6] zsmalloc: remove unnecessary type casting Minchan Kim
@ 2012-04-25 13:35   ` Nitin Gupta
  2012-04-25 17:56     ` Greg Kroah-Hartman
  2012-04-26  1:58     ` Minchan Kim
  0 siblings, 2 replies; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 13:35 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> Let's remove unnecessary type casting of (void *).
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/zsmalloc-main.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index b7d31cc..ff089f8 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -644,8 +644,7 @@ void zs_free(struct zs_pool *pool, void *obj)
>  	spin_lock(&class->lock);
>  
>  	/* Insert this object in containing zspage's freelist */
> -	link = (struct link_free *)((unsigned char *)kmap_atomic(f_page)
> -							+ f_offset);
> +	link = (struct link_free *)(kmap_atomic(f_page)	+ f_offset);
>  	link->next = first_page->freelist;
>  	kunmap_atomic(link);
>  	first_page->freelist = obj;



Incrementing a void pointer looks weird and should not be allowed by C
compilers, though gcc and clang seem to allow this without any warnings.
(Fortunately, C++ forbids incrementing void pointers.)

So, we should keep this cast to unsigned char pointer to avoid relying
on a non-standard, compiler-specific behavior.
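
To make the difference concrete (a sketch, using the names from the
diff above):

void *base = kmap_atomic(f_page);
struct link_free *link;

/* GNU C extension: arithmetic on void * treats sizeof(void) as 1 */
link = (struct link_free *)(base + f_offset);

/* standard C: cast to a character pointer before the addition */
link = (struct link_free *)((unsigned char *)base + f_offset);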

Thanks,
Nitin



* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25  6:23 ` [PATCH 6/6] zsmalloc: make zsmalloc portable Minchan Kim
@ 2012-04-25 14:32   ` Nitin Gupta
  2012-04-25 15:40     ` Dan Magenheimer
  2012-04-25 16:37     ` Seth Jennings
  2012-04-26  2:11   ` Minchan Kim
  1 sibling, 2 replies; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 14:32 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 02:23 AM, Minchan Kim wrote:

> zsmalloc uses __flush_tlb_one and set_pte.
> These are very low-level functions that create an architecture
> dependency, so zsmalloc is currently usable only on x86.
> This patch replaces them with map_vm_area and unmap_kernel_range,
> so it should work on all architectures.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/Kconfig         |    4 ----
>  drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index a5ab720..9084565 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -1,9 +1,5 @@
>  config ZSMALLOC
>  	tristate "Memory allocator for compressed pages"
> -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
> -	# in zsmalloc-main.c.
> -	# TODO: convert these to portable functions
> -	depends on X86
>  	default n
>  	help
>  	  zsmalloc is a slab-based memory allocator designed to store
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index ff089f8..cc017b1 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
>  		area = &per_cpu(zs_map_area, cpu);
>  		if (area->vm)
>  			break;
> -		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
> +		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
>  		if (!area->vm)
>  			return notifier_from_errno(-ENOMEM);
>  		break;
> @@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
>  	} else {
>  		/* this object spans two pages */
>  		struct page *nextp;
> +		struct page *pages[2];
> +		struct page **page_array = &pages[0];
> +		int err;
>  
>  		nextp = get_next_page(page);
>  		BUG_ON(!nextp);
>  
> +		page_array[0] = page;
> +		page_array[1] = nextp;
>  
> -		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
> -		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
> +		/*
> +		 * map_vm_area never fail because we already allocated
> +		 * pages for page table in alloc_vm_area.
> +		 */
> +		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
> +		BUG_ON(err);
>  
>  		/* We pre-allocated VM area so mapping can never fail */
>  		area->vm_addr = area->vm->addr;
> @@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
>  	off = obj_idx_to_offset(page, obj_idx, class->size);
>  
>  	area = &__get_cpu_var(zs_map_area);
> -	if (off + class->size <= PAGE_SIZE) {
> +	if (off + class->size <= PAGE_SIZE)
>  		kunmap_atomic(area->vm_addr);
> -	} else {
> -		set_pte(area->vm_ptes[0], __pte(0));
> -		set_pte(area->vm_ptes[1], __pte(0));
> -		__flush_tlb_one((unsigned long)area->vm_addr);
> -		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
> -	}
> +	else
> +		unmap_kernel_range((unsigned long)area->vm->addr,
> +					PAGE_SIZE * 2);
> +



This would certainly work but would incur unnecessary cost. All we need
to do is flush the local TLB entries corresponding to these two pages.
However, unmap_kernel_range --> flush_tlb_kernel_range would cause a TLB
flush on all CPUs. Additionally, the implementation of this function
(flush_tlb_kernel_range) on architectures like x86 seems naive, since it
flushes the entire TLB on all the CPUs.
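
For reference, condensed versions of the paths being described, as they
look around this time (mm/vmalloc.c and the x86 tlbflush.h; simplified):

/* mm/vmalloc.c (condensed) */
void unmap_kernel_range(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;

	flush_cache_vunmap(addr, end);
	vunmap_page_range(addr, end);
	flush_tlb_kernel_range(addr, end);	/* cross-CPU flush */
}

/* arch/x86/include/asm/tlbflush.h (condensed): the range is ignored
 * and the whole TLB is flushed on every CPU */
static inline void flush_tlb_kernel_range(unsigned long start,
					  unsigned long end)
{
	flush_tlb_all();
}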

Even with all this penalty, I'm inclined to keep this change to
remove the x86-only dependency, keeping improvements as future work.

I think Seth was working on this improvement but I'm not sure about the
current status. Seth?


>  	put_cpu_var(zs_map_area);
>  }
>  EXPORT_SYMBOL_GPL(zs_unmap_object);
> diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h
> index 8f9ce0c..4c11c89 100644
> --- a/drivers/staging/zsmalloc/zsmalloc_int.h
> +++ b/drivers/staging/zsmalloc/zsmalloc_int.h
> @@ -111,7 +111,6 @@ static const int fullness_threshold_frac = 4;
>  
>  struct mapping_area {
>  	struct vm_struct *vm;
> -	pte_t *vm_ptes[2];
>  	char *vm_addr;
>  };
>  


Thanks,
Nitin


* RE: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25 14:32   ` Nitin Gupta
@ 2012-04-25 15:40     ` Dan Magenheimer
  2012-04-26  2:03       ` Minchan Kim
  2012-04-25 16:37     ` Seth Jennings
  1 sibling, 1 reply; 32+ messages in thread
From: Dan Magenheimer @ 2012-04-25 15:40 UTC (permalink / raw)
  To: Nitin Gupta, Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Andrew Morton, linux-kernel, linux-mm

> From: Nitin Gupta [mailto:ngupta@vflare.org]
> Subject: Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
> 
> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
> > zsmalloc uses __flush_tlb_one and set_pte.
> > These are very low-level functions that create an architecture
> > dependency, so zsmalloc is currently usable only on x86.
> > This patch replaces them with map_vm_area and unmap_kernel_range,
> > so it should work on all architectures.
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  drivers/staging/zsmalloc/Kconfig         |    4 ----
> >  drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
> >  drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
> >  3 files changed, 17 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> > index a5ab720..9084565 100644
> > --- a/drivers/staging/zsmalloc/Kconfig
> > +++ b/drivers/staging/zsmalloc/Kconfig
> > @@ -1,9 +1,5 @@
> >  config ZSMALLOC
> >  	tristate "Memory allocator for compressed pages"
> > -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
> > -	# in zsmalloc-main.c.
> > -	# TODO: convert these to portable functions
> > -	depends on X86
> >  	default n
> >  	help
> >  	  zsmalloc is a slab-based memory allocator designed to store
> > diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> > index ff089f8..cc017b1 100644
> > --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> > +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> > @@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
> >  		area = &per_cpu(zs_map_area, cpu);
> >  		if (area->vm)
> >  			break;
> > -		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
> > +		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
> >  		if (!area->vm)
> >  			return notifier_from_errno(-ENOMEM);
> >  		break;
> > @@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
> >  	} else {
> >  		/* this object spans two pages */
> >  		struct page *nextp;
> > +		struct page *pages[2];
> > +		struct page **page_array = &pages[0];
> > +		int err;
> >
> >  		nextp = get_next_page(page);
> >  		BUG_ON(!nextp);
> >
> > +		page_array[0] = page;
> > +		page_array[1] = nextp;
> >
> > -		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
> > -		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
> > +		/*
> > +		 * map_vm_area never fail because we already allocated
> > +		 * pages for page table in alloc_vm_area.
> > +		 */
> > +		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
> > +		BUG_ON(err);
> >
> >  		/* We pre-allocated VM area so mapping can never fail */
> >  		area->vm_addr = area->vm->addr;
> > @@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
> >  	off = obj_idx_to_offset(page, obj_idx, class->size);
> >
> >  	area = &__get_cpu_var(zs_map_area);
> > -	if (off + class->size <= PAGE_SIZE) {
> > +	if (off + class->size <= PAGE_SIZE)
> >  		kunmap_atomic(area->vm_addr);
> > -	} else {
> > -		set_pte(area->vm_ptes[0], __pte(0));
> > -		set_pte(area->vm_ptes[1], __pte(0));
> > -		__flush_tlb_one((unsigned long)area->vm_addr);
> > -		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
> > -	}
> > +	else
> > +		unmap_kernel_range((unsigned long)area->vm->addr,
> > +					PAGE_SIZE * 2);
> > +
> 
> 
> 
> This would certainly work but would incur unnecessary cost. All we need
> to do is flush the local TLB entries corresponding to these two pages.
> However, unmap_kernel_range --> flush_tlb_kernel_range would cause a TLB
> flush on all CPUs. Additionally, the implementation of this function
> (flush_tlb_kernel_range) on architectures like x86 seems naive, since it
> flushes the entire TLB on all the CPUs.
> 
> Even with all this penalty, I'm inclined to keep this change to
> remove the x86-only dependency, keeping improvements as future work.
> 
> I think Seth was working on this improvement but I'm not sure about the
> current status. Seth?

I wouldn't normally advocate an architecture-specific ifdef, but the
penalty for portability seems high enough that it could make sense
here, perhaps hidden away in zsmalloc.h?  Perhaps eventually in an mm
header file as "unmap_kernel_page_pair_local()"?
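
A sketch of what such a helper might look like (hypothetical: the name
is Dan's suggestion, the x86 branch reuses the code this series removes,
and the PTE pointers would have to come from alloc_vm_area again):

static inline void unmap_kernel_page_pair_local(unsigned long addr,
						pte_t *ptes[2])
{
#ifdef CONFIG_X86
	set_pte(ptes[0], __pte(0));
	set_pte(ptes[1], __pte(0));
	__flush_tlb_one(addr);			/* local CPU only */
	__flush_tlb_one(addr + PAGE_SIZE);
#else
	unmap_kernel_range(addr, 2 * PAGE_SIZE);	/* global flush */
#endif
}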


* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25 14:32   ` Nitin Gupta
  2012-04-25 15:40     ` Dan Magenheimer
@ 2012-04-25 16:37     ` Seth Jennings
  2012-04-26  2:04       ` Minchan Kim
  2012-05-07 15:14       ` Seth Jennings
  1 sibling, 2 replies; 32+ messages in thread
From: Seth Jennings @ 2012-04-25 16:37 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Nitin Gupta, Greg Kroah-Hartman, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

Hey Minchan,

Thanks for the patches!

On 04/25/2012 09:32 AM, Nitin Gupta wrote:
> I think Seth was working on this improvement but I'm not sure about the
> current status. Seth?

Yes, I looked at this option, and it is very clean and portable.

Unfortunately, IIRC, with our rate of mapping/unmapping,
flush_tlb_kernel_range() causes an IPI storm that effectively
stalls the machine.

I'll apply your patch and try it out.

Thanks,
Seth



* Re: [PATCH 5/6] zsmalloc: remove unnecessary type casting
  2012-04-25 13:35   ` Nitin Gupta
@ 2012-04-25 17:56     ` Greg Kroah-Hartman
  2012-04-25 18:13       ` Nitin Gupta
  2012-04-26  1:58     ` Minchan Kim
  1 sibling, 1 reply; 32+ messages in thread
From: Greg Kroah-Hartman @ 2012-04-25 17:56 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Minchan Kim, Seth Jennings, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On Wed, Apr 25, 2012 at 09:35:25AM -0400, Nitin Gupta wrote:
> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
> > Let's remove unnecessary type casting of (void *).
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  drivers/staging/zsmalloc/zsmalloc-main.c |    3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> > index b7d31cc..ff089f8 100644
> > --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> > +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> > @@ -644,8 +644,7 @@ void zs_free(struct zs_pool *pool, void *obj)
> >  	spin_lock(&class->lock);
> >  
> >  	/* Insert this object in containing zspage's freelist */
> > -	link = (struct link_free *)((unsigned char *)kmap_atomic(f_page)
> > -							+ f_offset);
> > +	link = (struct link_free *)(kmap_atomic(f_page)	+ f_offset);
> >  	link->next = first_page->freelist;
> >  	kunmap_atomic(link);
> >  	first_page->freelist = obj;
> 
> 
> 
> Incrementing a void pointer looks weird and should not be allowed by C
> compilers, though gcc and clang seem to allow this without any warnings.
> (Fortunately, C++ forbids incrementing void pointers.)

Huh?  I thought a void pointer could safely be incremented in C; do you
have a pointer to where in the reference it says it is "unspecified"?

> So, we should keep this cast to unsigned char pointer to avoid relying
> on a non-standard, compiler-specific behavior.

I do agree about this, more people are starting to build the kernel with
other compilers than gcc, so it would be nice to ensure that we get
stuff like this right.

thanks,

greg k-h


* Re: [PATCH 5/6] zsmalloc: remove unnecessary type casting
  2012-04-25 17:56     ` Greg Kroah-Hartman
@ 2012-04-25 18:13       ` Nitin Gupta
  0 siblings, 0 replies; 32+ messages in thread
From: Nitin Gupta @ 2012-04-25 18:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Minchan Kim, Seth Jennings, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On 04/25/2012 01:56 PM, Greg Kroah-Hartman wrote:

> On Wed, Apr 25, 2012 at 09:35:25AM -0400, Nitin Gupta wrote:
>> On 04/25/2012 02:23 AM, Minchan Kim wrote:
>>
>>> Let's remove unnecessary type casting of (void *).
>>>
>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>> ---
>>>  drivers/staging/zsmalloc/zsmalloc-main.c |    3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> index b7d31cc..ff089f8 100644
>>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> @@ -644,8 +644,7 @@ void zs_free(struct zs_pool *pool, void *obj)
>>>  	spin_lock(&class->lock);
>>>  
>>>  	/* Insert this object in containing zspage's freelist */
>>> -	link = (struct link_free *)((unsigned char *)kmap_atomic(f_page)
>>> -							+ f_offset);
>>> +	link = (struct link_free *)(kmap_atomic(f_page)	+ f_offset);
>>>  	link->next = first_page->freelist;
>>>  	kunmap_atomic(link);
>>>  	first_page->freelist = obj;
>>
>>
>>
>> Incrementing a void pointer looks weird and should not be allowed by C
>> compilers, though gcc and clang seem to allow this without any warnings.
>> (Fortunately, C++ forbids incrementing void pointers.)
> 
> Huh?  I thought a void pointer could safely be incremented in C; do you
> have a pointer to where in the reference it says it is "unspecified"?
> 


Arithmetic on void pointers and function pointers is listed as a GNU C
extension, so I don't think these operations are part of the C standard.
From info gcc (section 6.23):

"""
6.23 Arithmetic on `void'- and Function-Pointers
 In GNU C, addition and subtraction operations are supported on pointers
 to `void' and on pointers to functions.  This is done by treating the
 size of a `void' or of a function as 1.


 A consequence of this is that `sizeof' is also allowed on `void' and on
function types, and returns 1.
 The option `-Wpointer-arith' requests a warning if these extensions are
used.
"""


>> So, we should keep this cast to unsigned char pointer to avoid relying
>> on a non-standard, compiler-specific behavior.
> 
> I do agree about this, more people are starting to build the kernel with
> other compilers than gcc, so it would be nice to ensure that we get
> stuff like this right.
> 


As an example, MSVC does not support arithmetic on void pointers:
http://stackoverflow.com/questions/1864352/pointer-arithmetic-when-void-has-unknown-size

Thanks,
Nitin


* Re: [PATCH 0/6] zsmalloc: clean up and fix arch dependency
  2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
                   ` (6 preceding siblings ...)
  2012-04-25 12:41 ` [PATCH 0/6] zsmalloc: clean up and fix arch dependency Nitin Gupta
@ 2012-04-25 18:14 ` Greg Kroah-Hartman
  7 siblings, 0 replies; 32+ messages in thread
From: Greg Kroah-Hartman @ 2012-04-25 18:14 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Seth Jennings, Nitin Gupta, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On Wed, Apr 25, 2012 at 03:23:08PM +0900, Minchan Kim wrote:
> This patchset contains some cleanup patches (1-5) and removes
> set_bit and flush_tlb usage for portability in [6/6].
> 
> Minchan Kim (6):
>   zsmalloc: use PageFlag macro instead of [set|test]_bit

I've only applied this one patch, so feel free to drop it from your
series when you redo the others for your next round.

thanks,

greg k-h


* Re: [PATCH 0/6] zsmalloc: clean up and fix arch dependency
  2012-04-25 12:41 ` [PATCH 0/6] zsmalloc: clean up and fix arch dependency Nitin Gupta
@ 2012-04-26  1:20   ` Minchan Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  1:20 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

Hi Nitin,

On 04/25/2012 09:41 PM, Nitin Gupta wrote:

> Hi Minchan,
> 
> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
>> This patchset contains some cleanup patches (1-5) and removes
>> set_bit and flush_tlb usage for portability in [6/6].
>>
>> Minchan Kim (6):
>>   zsmalloc: use PageFlag macro instead of [set|test]_bit
>>   zsmalloc: remove unnecessary alignment
>>   zsmalloc: rename zspage_order with zspage_pages
>>   zsmalloc: add/fix function comment
>>   zsmalloc: remove unnecessary type casting
>>   zsmalloc: make zsmalloc portable
>>
>>  drivers/staging/zsmalloc/Kconfig         |    4 --
>>  drivers/staging/zsmalloc/zsmalloc-main.c |   73 +++++++++++++++++-------------
>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    3 +-
>>  3 files changed, 43 insertions(+), 37 deletions(-)
>>
> 
> 
> Thanks for the fixes.
> 
> 
> Your description is missing testing notes (especially since patch [6/6]
> is not a cosmetic change). So, can you please add these either here in
> patch 0 or as part of patch 6/6 description?


Will do in a later version.
Testing was simply done in x86 and ARM qemu environments with zram, so test
coverage isn't great, but [1-5] are just trivial while [6/6] is the significant
one. As I see from Seth's reply, he can test it thoroughly, and other
architectures should work if it works on x86 because we used generic functions.

-- 
Kind regards,
Minchan Kim


* Re: [PATCH 2/6] zsmalloc: remove unnecessary alignment
  2012-04-25 12:53   ` Nitin Gupta
@ 2012-04-26  1:42     ` Minchan Kim
  2012-05-03  5:16       ` Nitin Gupta
  0 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  1:42 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 09:53 PM, Nitin Gupta wrote:

> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
>> It isn't necessary to align the pool size with PAGE_SIZE.
>> If I missed something, please let me know.
>>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> ---
>>  drivers/staging/zsmalloc/zsmalloc-main.c |    5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>> index 504b6c2..b99ad9e 100644
>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>> @@ -489,14 +489,13 @@ fail:
>>  
>>  struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
>>  {
>> -	int i, error, ovhd_size;
>> +	int i, error;
>>  	struct zs_pool *pool;
>>  
>>  	if (!name)
>>  		return NULL;
>>  
>> -	ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);
>> -	pool = kzalloc(ovhd_size, GFP_KERNEL);
>> +	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
>>  	if (!pool)
>>  		return NULL;
>>  
> 
> 
> The pool metadata is rounded up to avoid a potential false-sharing problem
> (though we could just round up to cache_line_size()).


Have you actually been hurt by a false-sharing problem?
If so, we can change it to

kzalloc(ALIGN(sizeof(*pool), cache_line_size()), GFP_KERNEL);

But I am not sure we need this.

> 
> Thanks,
> Nitin
> 



-- 
Kind regards,
Minchan Kim


* Re: [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages
  2012-04-25 13:03   ` Nitin Gupta
@ 2012-04-26  1:46     ` Minchan Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  1:46 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 10:03 PM, Nitin Gupta wrote:

> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
>> zspage_order defines how many pages are needed to make a zspage,
>> so _order_ is rather awkward naming. It already misled Jonathan
>> - http://lwn.net/Articles/477067/
>> " For each size, the code calculates an optimum number of pages (up to 16)"
>>
>> Let's change from _order_ to _pages_.
>>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> ---
>>  drivers/staging/zsmalloc/zsmalloc-main.c |   14 +++++++-------
>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    2 +-
>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>
> 
> 
> Recently, Seth changed max_zspage_order to ZS_MAX_PAGES_PER_ZSPAGE for
> the same reason. I think it would be better to rename the function in a
> similar way to have some consistency. So, we could use:
> 
> 1) get_pages_per_zspage() instead of get_zspage_pages()
> 2) class->pages_per_zspage instead of class->zspage_pages
> 


No problem. Will do.

-- 
Kind regards,
Minchan Kim


* Re: [PATCH 4/6] zsmalloc: add/fix function comment
  2012-04-25 13:27   ` Nitin Gupta
@ 2012-04-26  1:51     ` Minchan Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  1:51 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 10:27 PM, Nitin Gupta wrote:

> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
>> Add/fix the comment.
>>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> ---
>>  drivers/staging/zsmalloc/zsmalloc-main.c |   15 +++++++++++----
>>  1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>> index 0fe4cbb..b7d31cc 100644
>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>> @@ -565,12 +565,9 @@ EXPORT_SYMBOL_GPL(zs_destroy_pool);
>>   * zs_malloc - Allocate block of given size from pool.
>>   * @pool: pool to allocate from
>>   * @size: size of block to allocate
>> - * @page: page no. that holds the object
>> - * @offset: location of object within page
>>   *
>>   * On success, <page, offset> identifies block allocated
>> - * and 0 is returned. On failure, <page, offset> is set to
>> - * 0 and -ENOMEM is returned.
>> + * and <page, offset> is returned. On failure, NULL is returned.
>>   *
> 
> 
> The returned value indeed encodes <page, offset> values as a 'void *'
> but this should not be part of the function documentation since it's an
> internal detail.  So, it's probably better to say:
> 
> On success, handle to the allocated object is returned; NULL otherwise.


Fair enough.

> 
> On a side note, we should also 'typedef void * zs_handle' to avoid any
> confusion. Without this, users may just treat the zs_malloc return value
> as a pointer and try to dereference it.


Yes. We should do it. I will make it another patch in the next spin.

> 
>>   * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
>>   */
>> @@ -666,6 +663,16 @@ void zs_free(struct zs_pool *pool, void *obj)
>>  }
>>  EXPORT_SYMBOL_GPL(zs_free);
>>  
>> +/**
>> + * zs_map_object - get address of allocated object from handle.
>> + * @pool: object allocated pool
> 
> 
> should be: @pool: pool from which the object was allocated
> 
>> + * @handle: handle returned from zs_malloc
>> + *
> 
>> + * Before using object allocated from zs_malloc, object
>> + * should be mapped to page table by this function.
>> + * After using object,  call zs_unmap_object to unmap page
>> + * table.
>> + */
> 
> 
> We are not really unmapping any page tables, so this could be written as:
> 
> Before using an object allocated from zs_malloc, it must be mapped using
> this function. When done with the object, it must be unmapped using
> zs_unmap_object.
> 
> 
> Sorry for nitpicking.


It's not nitpicking.
Confusing documentation really trips people up, so documentation is very important.

Nitin, Thanks!

> 
> Thanks,
> Nitin
> 



-- 
Kind regards,
Minchan Kim


* Re: [PATCH 5/6] zsmalloc: remove unnecessary type casting
  2012-04-25 13:35   ` Nitin Gupta
  2012-04-25 17:56     ` Greg Kroah-Hartman
@ 2012-04-26  1:58     ` Minchan Kim
  1 sibling, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  1:58 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 10:35 PM, Nitin Gupta wrote:

> On 04/25/2012 02:23 AM, Minchan Kim wrote:
> 
>> Let's remove unnecessary type casting of (void *).
>>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> ---
>>  drivers/staging/zsmalloc/zsmalloc-main.c |    3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>> index b7d31cc..ff089f8 100644
>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>> @@ -644,8 +644,7 @@ void zs_free(struct zs_pool *pool, void *obj)
>>  	spin_lock(&class->lock);
>>  
>>  	/* Insert this object in containing zspage's freelist */
>> -	link = (struct link_free *)((unsigned char *)kmap_atomic(f_page)
>> -							+ f_offset);
>> +	link = (struct link_free *)(kmap_atomic(f_page) + f_offset);
>>  	link->next = first_page->freelist;
>>  	kunmap_atomic(link);
>>  	first_page->freelist = obj;
> 
> 
> 
> Doing arithmetic on a void pointer looks weird and should not be allowed
> by C compilers, though gcc and clang seem to allow it without any warnings.
> (Fortunately, C++ forbids arithmetic on void pointers.)


It's a gcc extension and we already use it in lots of places, so I think it's no problem
even though it's non-standard.
If we compiled the kernel with -Wpointer-arith, we would find a ton of warnings here and there.

> 
> So, we should keep this cast to an unsigned char pointer to avoid relying
> on non-standard, compiler-specific behavior.



Okay. It's just a trivial change and I have no justification for it at the cost of breaking the standard.
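
To spell out the two forms being discussed (a standalone illustration,
not a quote from the patch; va is a hypothetical local):

	void *va = kmap_atomic(f_page);
	struct link_free *link;

	/* gcc extension: arithmetic on void * acts as if sizeof(void) == 1 */
	link = (struct link_free *)(va + f_offset);

	/* strictly conforming C: cast to a character pointer first */
	link = (struct link_free *)((unsigned char *)va + f_offset);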

> 
> Thanks,
> Nitin
> 



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25 15:40     ` Dan Magenheimer
@ 2012-04-26  2:03       ` Minchan Kim
  2012-04-26  5:07         ` Minchan Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  2:03 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Nitin Gupta, Greg Kroah-Hartman, Seth Jennings, Andrew Morton,
	linux-kernel, linux-mm

Hi Dan,

On 04/26/2012 12:40 AM, Dan Magenheimer wrote:

>> From: Nitin Gupta [mailto:ngupta@vflare.org]
>> Subject: Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
>>
>> On 04/25/2012 02:23 AM, Minchan Kim wrote:
>>
>>> zsmalloc uses __flush_tlb_one and set_pte.
>>> These are very low-level functions that create an architecture
>>> dependency, so currently zsmalloc can be used only on x86.
>>> This patch replaces them with map_vm_area and unmap_kernel_range,
>>> so it should work on all architectures.
>>>
>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>> ---
>>>  drivers/staging/zsmalloc/Kconfig         |    4 ----
>>>  drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
>>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
>>>  3 files changed, 17 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
>>> index a5ab720..9084565 100644
>>> --- a/drivers/staging/zsmalloc/Kconfig
>>> +++ b/drivers/staging/zsmalloc/Kconfig
>>> @@ -1,9 +1,5 @@
>>>  config ZSMALLOC
>>>  	tristate "Memory allocator for compressed pages"
>>> -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
>>> -	# in zsmalloc-main.c.
>>> -	# TODO: convert these to portable functions
>>> -	depends on X86
>>>  	default n
>>>  	help
>>>  	  zsmalloc is a slab-based memory allocator designed to store
>>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> index ff089f8..cc017b1 100644
>>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> @@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
>>>  		area = &per_cpu(zs_map_area, cpu);
>>>  		if (area->vm)
>>>  			break;
>>> -		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
>>> +		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
>>>  		if (!area->vm)
>>>  			return notifier_from_errno(-ENOMEM);
>>>  		break;
>>> @@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
>>>  	} else {
>>>  		/* this object spans two pages */
>>>  		struct page *nextp;
>>> +		struct page *pages[2];
>>> +		struct page **page_array = &pages[0];
>>> +		int err;
>>>
>>>  		nextp = get_next_page(page);
>>>  		BUG_ON(!nextp);
>>>
>>> +		page_array[0] = page;
>>> +		page_array[1] = nextp;
>>>
>>> -		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
>>> -		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
>>> +		/*
>>> +		 * map_vm_area never fails because we already allocated
>>> +		 * pages for the page tables in alloc_vm_area.
>>> +		 */
>>> +		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
>>> +		BUG_ON(err);
>>>
>>>  		/* We pre-allocated VM area so mapping can never fail */
>>>  		area->vm_addr = area->vm->addr;
>>> @@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
>>>  	off = obj_idx_to_offset(page, obj_idx, class->size);
>>>
>>>  	area = &__get_cpu_var(zs_map_area);
>>> -	if (off + class->size <= PAGE_SIZE) {
>>> +	if (off + class->size <= PAGE_SIZE)
>>>  		kunmap_atomic(area->vm_addr);
>>> -	} else {
>>> -		set_pte(area->vm_ptes[0], __pte(0));
>>> -		set_pte(area->vm_ptes[1], __pte(0));
>>> -		__flush_tlb_one((unsigned long)area->vm_addr);
>>> -		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
>>> -	}
>>> +	else
>>> +		unmap_kernel_range((unsigned long)area->vm->addr,
>>> +					PAGE_SIZE * 2);
>>> +
>>
>>
>>
>> This would certainly work but would incur unnecessary cost. All we need
>> to do is flush the local TLB entries corresponding to these two pages.
>> However, unmap_kernel_range --> flush_tlb_kernel_range would cause a
>> TLB flush on all CPUs. Additionally, the implementation of this function
>> (flush_tlb_kernel_range) on architectures like x86 seems naive since it
>> flushes the entire TLB on all the CPUs.
>>
>> Even with all this penalty, I'm inclined to keep this change to remove
>> the x86-only dependency, keeping the improvements as future work.
>>
>> I think Seth was working on this improvement but I'm not sure about
>> the current status. Seth?
> 
> I wouldn't normally advocate an architecture-specific ifdef, but the
> penalty for portability here seems high enough that it could make
> sense here, perhaps hidden away in zsmalloc.h?  Perhaps eventually
> in a mm header file as "unmap_kernel_page_pair_local()"?


Agreed.
I think that's the right way to go.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25 16:37     ` Seth Jennings
@ 2012-04-26  2:04       ` Minchan Kim
  2012-05-07 15:14       ` Seth Jennings
  1 sibling, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  2:04 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Nitin Gupta, Greg Kroah-Hartman, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On 04/26/2012 01:37 AM, Seth Jennings wrote:

> Hey Minchan,
> 
> Thanks for the patches!
> 
> On 04/25/2012 09:32 AM, Nitin Gupta wrote:
>> I think Seth was working on this improvement but I'm not sure about
>> the current status. Seth?
> 
> Yes, I looked at this option, and it is very clean and portable.
> 
> Unfortunately, IIRC, with our rate of mapping/unmapping,
> flush_tlb_kernel_range() causes an IPI storm that effectively
> stalls the machine.
> 
> I'll apply your patch and try it out.


Seth, Thanks!

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25  6:23 ` [PATCH 6/6] zsmalloc: make zsmalloc portable Minchan Kim
  2012-04-25 14:32   ` Nitin Gupta
@ 2012-04-26  2:11   ` Minchan Kim
  1 sibling, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  2:11 UTC (permalink / raw)
  Cc: Greg Kroah-Hartman, Seth Jennings, Nitin Gupta, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

On 04/25/2012 03:23 PM, Minchan Kim wrote:

> zsmalloc uses __flush_tlb_one and set_pte.
> These are very low-level functions that create an architecture
> dependency, so currently zsmalloc can be used only on x86.
> This patch replaces them with map_vm_area and unmap_kernel_range,
> so it should work on all architectures.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/staging/zsmalloc/Kconfig         |    4 ----
>  drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index a5ab720..9084565 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -1,9 +1,5 @@
>  config ZSMALLOC
>  	tristate "Memory allocator for compressed pages"
> -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
> -	# in zsmalloc-main.c.
> -	# TODO: convert these to portable functions
> -	depends on X86
>  	default n
>  	help
>  	  zsmalloc is a slab-based memory allocator designed to store
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index ff089f8..cc017b1 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
>  		area = &per_cpu(zs_map_area, cpu);
>  		if (area->vm)
>  			break;
> -		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
> +		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
>  		if (!area->vm)
>  			return notifier_from_errno(-ENOMEM);
>  		break;
> @@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
>  	} else {
>  		/* this object spans two pages */
>  		struct page *nextp;
> +		struct page *pages[2];
> +		struct page **page_array = &pages[0];
> +		int err;
>  
>  		nextp = get_next_page(page);
>  		BUG_ON(!nextp);
>  
> +		page_array[0] = page;
> +		page_array[1] = nextp;
>  
> -		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
> -		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
> +		/*
> +		 * map_vm_area never fails because we already allocated
> +		 * pages for the page tables in alloc_vm_area.
> +		 */
> +		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
> +		BUG_ON(err);
>  
>  		/* We pre-allocated VM area so mapping can never fail */
>  		area->vm_addr = area->vm->addr;
> @@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
>  	off = obj_idx_to_offset(page, obj_idx, class->size);
>  
>  	area = &__get_cpu_var(zs_map_area);
> -	if (off + class->size <= PAGE_SIZE) {
> +	if (off + class->size <= PAGE_SIZE)
>  		kunmap_atomic(area->vm_addr);
> -	} else {
> -		set_pte(area->vm_ptes[0], __pte(0));
> -		set_pte(area->vm_ptes[1], __pte(0));
> -		__flush_tlb_one((unsigned long)area->vm_addr);
> -		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
> -	}
> +	else
> +		unmap_kernel_range((unsigned long)area->vm->addr,
> +					PAGE_SIZE * 2);
> +


I should have exported unmap_kernel_range. :(
Before that, I will look into another option to reduce the IPI storm.
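
For a modular zsmalloc, the missing export would presumably be a one-line
change in mm/vmalloc.c (my assumption of what it would look like):

EXPORT_SYMBOL_GPL(unmap_kernel_range);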

Thanks.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-26  2:03       ` Minchan Kim
@ 2012-04-26  5:07         ` Minchan Kim
  2012-04-30 16:24           ` Seth Jennings
  0 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2012-04-26  5:07 UTC (permalink / raw)
  Cc: Dan Magenheimer, Nitin Gupta, Greg Kroah-Hartman, Seth Jennings,
	Andrew Morton, linux-kernel, linux-mm

On 04/26/2012 11:03 AM, Minchan Kim wrote:

> Hi Dan,
> 
> On 04/26/2012 12:40 AM, Dan Magenheimer wrote:
> 
>>> From: Nitin Gupta [mailto:ngupta@vflare.org]
>>> Subject: Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
>>>
>>> On 04/25/2012 02:23 AM, Minchan Kim wrote:
>>>
>>>> zsmalloc uses __flush_tlb_one and set_pte.
>>>> These are very low-level functions that create an architecture
>>>> dependency, so currently zsmalloc can be used only on x86.
>>>> This patch replaces them with map_vm_area and unmap_kernel_range,
>>>> so it should work on all architectures.
>>>>
>>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>>> ---
>>>>  drivers/staging/zsmalloc/Kconfig         |    4 ----
>>>>  drivers/staging/zsmalloc/zsmalloc-main.c |   27 +++++++++++++++++----------
>>>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    1 -
>>>>  3 files changed, 17 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
>>>> index a5ab720..9084565 100644
>>>> --- a/drivers/staging/zsmalloc/Kconfig
>>>> +++ b/drivers/staging/zsmalloc/Kconfig
>>>> @@ -1,9 +1,5 @@
>>>>  config ZSMALLOC
>>>>  	tristate "Memory allocator for compressed pages"
>>>> -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
>>>> -	# in zsmalloc-main.c.
>>>> -	# TODO: convert these to portable functions
>>>> -	depends on X86
>>>>  	default n
>>>>  	help
>>>>  	  zsmalloc is a slab-based memory allocator designed to store
>>>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>>>> index ff089f8..cc017b1 100644
>>>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>>>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>>>> @@ -442,7 +442,7 @@ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
>>>>  		area = &per_cpu(zs_map_area, cpu);
>>>>  		if (area->vm)
>>>>  			break;
>>>> -		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
>>>> +		area->vm = alloc_vm_area(2 * PAGE_SIZE, NULL);
>>>>  		if (!area->vm)
>>>>  			return notifier_from_errno(-ENOMEM);
>>>>  		break;
>>>> @@ -696,13 +696,22 @@ void *zs_map_object(struct zs_pool *pool, void *handle)
>>>>  	} else {
>>>>  		/* this object spans two pages */
>>>>  		struct page *nextp;
>>>> +		struct page *pages[2];
>>>> +		struct page **page_array = &pages[0];
>>>> +		int err;
>>>>
>>>>  		nextp = get_next_page(page);
>>>>  		BUG_ON(!nextp);
>>>>
>>>> +		page_array[0] = page;
>>>> +		page_array[1] = nextp;
>>>>
>>>> -		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
>>>> -		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
>>>> +		/*
>>>> +		 * map_vm_area never fails because we already allocated
>>>> +		 * pages for the page tables in alloc_vm_area.
>>>> +		 */
>>>> +		err = map_vm_area(area->vm, PAGE_KERNEL, &page_array);
>>>> +		BUG_ON(err);
>>>>
>>>>  		/* We pre-allocated VM area so mapping can never fail */
>>>>  		area->vm_addr = area->vm->addr;
>>>> @@ -730,14 +739,12 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
>>>>  	off = obj_idx_to_offset(page, obj_idx, class->size);
>>>>
>>>>  	area = &__get_cpu_var(zs_map_area);
>>>> -	if (off + class->size <= PAGE_SIZE) {
>>>> +	if (off + class->size <= PAGE_SIZE)
>>>>  		kunmap_atomic(area->vm_addr);
>>>> -	} else {
>>>> -		set_pte(area->vm_ptes[0], __pte(0));
>>>> -		set_pte(area->vm_ptes[1], __pte(0));
>>>> -		__flush_tlb_one((unsigned long)area->vm_addr);
>>>> -		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
>>>> -	}
>>>> +	else
>>>> +		unmap_kernel_range((unsigned long)area->vm->addr,
>>>> +					PAGE_SIZE * 2);
>>>> +
>>>
>>>
>>>
>>> This would certainly work but would incur unnecessary cost. All we need
>>> to do is flush the local TLB entries corresponding to these two pages.
>>> However, unmap_kernel_range --> flush_tlb_kernel_range would cause a
>>> TLB flush on all CPUs. Additionally, the implementation of this function
>>> (flush_tlb_kernel_range) on architectures like x86 seems naive since it
>>> flushes the entire TLB on all the CPUs.
>>>
>>> Even with all this penalty, I'm inclined to keep this change to remove
>>> the x86-only dependency, keeping the improvements as future work.
>>>
>>> I think Seth was working on this improvement but I'm not sure about
>>> the current status. Seth?
>>
>> I wouldn't normally advocate an architecture-specific ifdef, but the
>> penalty for portability here seems high enough that it could make
>> sense here, perhaps hidden away in zsmalloc.h?  Perhaps eventually
>> in a mm header file as "unmap_kernel_page_pair_local()"?
> 
> 
> Agreed.
> I think that's the right way to go.
> 


Quick patch - totally untested.

We can implement a new TLB flush function, "local_flush_tlb_kernel_range".
If the architecture is very smart, it can flush only the TLB entries for the given vaddr range.
If the architecture is smart, it can flush only the TLB entries of the local CPU.
If the architecture is _NOT_ smart, it may flush all entries on all CPUs.

Only a few architectures have "local_flush_tlb_kernel_range" today:
MIPS, sh, unicore32, arm, score, and x86 with this patch.
So I think it's a good candidate for the other architectures to implement.
Until then, we can add a stub for those architectures which just falls back to
a global TLB flush (see the stub sketch after the patch below).
We can expect the maintainers to respond and implement the most efficient method.
If a maintainer has no interest, zsmalloc could be very slow on that architecture
and users will blame it.

Any thoughts?

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 4ece077..118561c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -172,4 +172,17 @@ static inline void flush_tlb_kernel_range(unsigned long start,
        flush_tlb_all();
 }
 
+static inline void local_flush_tlb_kernel_range(unsigned long start,
+                                         unsigned long end)
+{
+       if (cpu_has_invlpg) {
+               while (start < end) {
+                       __flush_tlb_single(start);
+                       start += PAGE_SIZE;
+               }
+       } else {
+               local_flush_tlb();
+       }
+}
+
 #endif /* _ASM_X86_TLBFLUSH_H */
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index cc017b1..7755db0 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -742,7 +742,7 @@ void zs_unmap_object(struct zs_pool *pool, void *handle)
        if (off + class->size <= PAGE_SIZE)
                kunmap_atomic(area->vm_addr);
        else
-               unmap_kernel_range((unsigned long)area->vm->addr,
+               local_unmap_kernel_range((unsigned long)area->vm->addr,
                                        PAGE_SIZE * 2);
 
        put_cpu_var(zs_map_area);
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index dcdfc2b..bce403c 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -101,6 +101,7 @@ extern int map_kernel_range_noflush(unsigned long start, unsigned long size
                                    pgprot_t prot, struct page **pages);
 extern void unmap_kernel_range_noflush(unsigned long addr, unsigned long size);
 extern void unmap_kernel_range(unsigned long addr, unsigned long size);
+extern void local_unmap_kernel_range(unsigned long addr, unsigned long size);
 #else
 static inline int
 map_kernel_range_noflush(unsigned long start, unsigned long size,
@@ -116,6 +117,11 @@ static inline void
 unmap_kernel_range(unsigned long addr, unsigned long size)
 {
 }
+static inline void
+local_unmap_kernel_range(unsigned long addr, unsigned long size)
+{
+}
+
 #endif
 
 /* Allocate/destroy a 'vmalloc' VM area. */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 94dff88..791b142 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1258,6 +1258,17 @@ void unmap_kernel_range(unsigned long addr, unsigned long size)
        flush_tlb_kernel_range(addr, end);
 }
 
+void local_unmap_kernel_range(unsigned long addr, unsigned long size)
+{
+       unsigned long end = addr + size;
+
+       flush_cache_vunmap(addr, end);
+       vunmap_page_range(addr, end);
+       local_flush_tlb_kernel_range(addr, end);
+}
+EXPORT_SYMBOL_GPL(local_unmap_kernel_range);
+
+
 int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page ***pages)
 {
        unsigned long addr = (unsigned long)area->addr;
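
For the architectures that don't have local_flush_tlb_kernel_range yet,
the stub mentioned above could look roughly like this (a sketch only; it
assumes architectures providing their own version also make the name
visible to the preprocessor, so the guard actually works):

#ifndef local_flush_tlb_kernel_range
/* no local variant provided by the arch: fall back to the global flush */
static inline void local_flush_tlb_kernel_range(unsigned long start,
						unsigned long end)
{
	flush_tlb_kernel_range(start, end);
}
#endif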

-- 
Kind regards,
Minchan Kim

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-26  5:07         ` Minchan Kim
@ 2012-04-30 16:24           ` Seth Jennings
  0 siblings, 0 replies; 32+ messages in thread
From: Seth Jennings @ 2012-04-30 16:24 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Dan Magenheimer, Nitin Gupta, Greg Kroah-Hartman,
	Andrew Morton, linux-kernel, linux-mm

On 04/26/2012 12:07 AM, Minchan Kim wrote:

> 
> Quick patch - totally untested.
> 
> We can implement a new TLB flush function,
> "local_flush_tlb_kernel_range". If the architecture is very smart, it
> can flush only the TLB entries for the given vaddr range. If the
> architecture is smart, it can flush only the TLB entries of the local
> CPU. If the architecture is _NOT_ smart, it may flush all entries on
> all CPUs.
> 
> Only a few architectures have "local_flush_tlb_kernel_range" today:
> MIPS, sh, unicore32, arm, score, and x86 with this patch. So I think
> it's a good candidate for the other architectures to implement. Until
> then, we can add a stub for those architectures which just falls back
> to a global TLB flush. We can expect the maintainers to respond and
> implement the most efficient method. If a maintainer has no interest,
> zsmalloc could be very slow on that architecture and users will blame
> it.
> 
> Any thoughts?


I had this same idea a while back.

It is encouraging to know that someone else independently thought of
this solution too :)  Makes me think it is a good solution.

Let me build and test it on x86 to make sure there are no unforeseen consequences.

Thanks again for your work here!

Seth


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/6] zsmalloc: remove unnecessary alignment
  2012-04-26  1:42     ` Minchan Kim
@ 2012-05-03  5:16       ` Nitin Gupta
  0 siblings, 0 replies; 32+ messages in thread
From: Nitin Gupta @ 2012-05-03  5:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Seth Jennings, Dan Magenheimer,
	Andrew Morton, linux-kernel, linux-mm

Hi Minchan,

Sorry for the late reply.

On 4/25/12 9:42 PM, Minchan Kim wrote:
> On 04/25/2012 09:53 PM, Nitin Gupta wrote:
>
>> On 04/25/2012 02:23 AM, Minchan Kim wrote:
>>
>>> It isn't necessary to align the pool size to PAGE_SIZE.
>>> If I missed something, please let me know.
>>>
>>> Signed-off-by: Minchan Kim<minchan@kernel.org>
>>> ---
>>>   drivers/staging/zsmalloc/zsmalloc-main.c |    5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> index 504b6c2..b99ad9e 100644
>>> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
>>> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
>>> @@ -489,14 +489,13 @@ fail:
>>>
>>>   struct zs_pool *zs_create_pool(const char *name, gfp_t flags)
>>>   {
>>> -	int i, error, ovhd_size;
>>> +	int i, error;
>>>   	struct zs_pool *pool;
>>>
>>>   	if (!name)
>>>   		return NULL;
>>>
>>> -	ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);
>>> -	pool = kzalloc(ovhd_size, GFP_KERNEL);
>>> +	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
>>>   	if (!pool)
>>>   		return NULL;
>>>
>>
>>
>> The pool metadata is rounded up to avoid a potential false-sharing problem
>> (though we could just round up to cache_line_size()).
>
>
> Have you actually been hurt by the false-sharing problem?
> If so, we can change it to
>

I've never been hit by this false sharing in any testing, but that is
really just luck. Apart from aligning to the cache-line size, there is
no way to ensure some unfortunate read-mostly object never falls in
the same line.

> kzalloc(ALIGN(sizeof(*pool), cache_line_size()), GFP_KERNEL);
>

Yes, looks better than aligning to PAGE_SIZE.


Thanks,
Nitin

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-04-25 16:37     ` Seth Jennings
  2012-04-26  2:04       ` Minchan Kim
@ 2012-05-07 15:14       ` Seth Jennings
  2012-05-08  0:46         ` Minchan Kim
  1 sibling, 1 reply; 32+ messages in thread
From: Seth Jennings @ 2012-05-07 15:14 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Nitin Gupta, Greg Kroah-Hartman, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On 04/25/2012 11:37 AM, Seth Jennings wrote:

> I'll apply your patch and try it out.

Sorry for taking so long.

I finally got around to testing this on an x86_64 VM and it works with
the same performance as before and is much cleaner.  I like it.  Just
need to expand the patch to all the arches.

I'm also interested to see if this works for ppc64.  I'm hoping to try
it out today or tomorrow.

--
Seth


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/6] zsmalloc: make zsmalloc portable
  2012-05-07 15:14       ` Seth Jennings
@ 2012-05-08  0:46         ` Minchan Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2012-05-08  0:46 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Nitin Gupta, Greg Kroah-Hartman, Dan Magenheimer, Andrew Morton,
	linux-kernel, linux-mm

On 05/08/2012 12:14 AM, Seth Jennings wrote:

> On 04/25/2012 11:37 AM, Seth Jennings wrote:
> 
>> I'll apply your patch and try it out.
> 
> Sorry for taking so long.
> 
> I finally got around to testing this on an x86_64 VM and it works with
> the same performance as before and is much cleaner.  I like it.  Just
> need to expand the patch to all the arches.
> 
> I'm also interested to see if this works for ppc64.  I'm hoping to try
> it out today or tomorrow.


I will have time to make a patch this weekend if nothing more urgent
catches me. :)

Seth, Thanks for the testing!

> 
> --
> Seth
> 



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2012-05-08  0:47 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-25  6:23 [PATCH 0/6] zsmalloc: clean up and fix arch dependency Minchan Kim
2012-04-25  6:23 ` [PATCH 1/6] zsmalloc: use PageFlag macro instead of [set|test]_bit Minchan Kim
2012-04-25 12:43   ` Nitin Gupta
2012-04-25  6:23 ` [PATCH 2/6] zsmalloc: remove unnecessary alignment Minchan Kim
2012-04-25 12:53   ` Nitin Gupta
2012-04-26  1:42     ` Minchan Kim
2012-05-03  5:16       ` Nitin Gupta
2012-04-25  6:23 ` [PATCH 3/6] zsmalloc: rename zspage_order with zspage_pages Minchan Kim
2012-04-25 13:03   ` Nitin Gupta
2012-04-26  1:46     ` Minchan Kim
2012-04-25  6:23 ` [PATCH 4/6] zsmalloc: add/fix function comment Minchan Kim
2012-04-25 13:27   ` Nitin Gupta
2012-04-26  1:51     ` Minchan Kim
2012-04-25  6:23 ` [PATCH 5/6] zsmalloc: remove unnecessary type casting Minchan Kim
2012-04-25 13:35   ` Nitin Gupta
2012-04-25 17:56     ` Greg Kroah-Hartman
2012-04-25 18:13       ` Nitin Gupta
2012-04-26  1:58     ` Minchan Kim
2012-04-25  6:23 ` [PATCH 6/6] zsmalloc: make zsmalloc portable Minchan Kim
2012-04-25 14:32   ` Nitin Gupta
2012-04-25 15:40     ` Dan Magenheimer
2012-04-26  2:03       ` Minchan Kim
2012-04-26  5:07         ` Minchan Kim
2012-04-30 16:24           ` Seth Jennings
2012-04-25 16:37     ` Seth Jennings
2012-04-26  2:04       ` Minchan Kim
2012-05-07 15:14       ` Seth Jennings
2012-05-08  0:46         ` Minchan Kim
2012-04-26  2:11   ` Minchan Kim
2012-04-25 12:41 ` [PATCH 0/6] zsmalloc: clean up and fix arch dependency Nitin Gupta
2012-04-26  1:20   ` Minchan Kim
2012-04-25 18:14 ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).