linux-mm.kvack.org archive mirror
* [RFC][PATCH 00/10] zsmalloc auto-compaction
@ 2015-05-29 15:05 Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
                   ` (10 more replies)
  0 siblings, 11 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

Hello,

RFC

this is 4.3 material, but I wanted to publish it sooner to gather
feedback and to settle it down before the 4.3 merge window opens.

in short, this series tweaks zsmalloc's compaction and adds
auto-compaction support. auto-compaction is not aimed to replace
manual compaction; instead it's supposed to be good enough. it
surely slows down zsmalloc in some scenarios, although a simple
un-tar test didn't show any significant performance difference.


quote from commit 0007:

this test copies a 1.3G linux kernel tar to mounted zram disk,
and extracts it.

w/auto-compaction:

cat /sys/block/zram0/mm_stat
 1171456    26006    86016        0    86016    32781        0

time tar xf linux-3.10.tar.gz -C linux

real    0m16.970s
user    0m15.247s
sys     0m8.477s

du -sh linux
2.0G    linux

cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384        0 3011088384    24310      108

=====================================================================

w/o auto compaction:

cat /sys/block/zram0/mm_stat
 1171456    26000    81920        0    81920    32781        0

time tar xf linux-3.10.tar.gz -C linux

real    0m16.983s
user    0m15.267s
sys     0m8.417s

du -sh linux
2.0G    linux

cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760        0 3011317760    23928        0
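
for reference, the seven mm_stat columns correspond to the fields
printed by zram's mm_stat_show() (see patch 09 below):

 orig_data_size compr_data_size mem_used_total mem_limit mem_used_max zero_pages num_migrated

the last column is the interesting one here: with auto-compaction 108
objects were migrated during the test; without it the counter stays 0
until manual compaction is triggered.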



Sergey Senozhatsky (10):
  zsmalloc: drop unused variable `nr_to_migrate'
  zsmalloc: always keep per-class stats
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: add `num_migrated' to zs_pool
  zsmalloc: move compaction functions
  zsmalloc: introduce auto-compact support
  zsmalloc: export zs_pool `num_migrated'
  zram: remove `num_migrated' from zram_stats
  zsmalloc: lower ZS_ALMOST_FULL waterline

 drivers/block/zram/zram_drv.c |  12 +-
 drivers/block/zram/zram_drv.h |   1 -
 include/linux/zsmalloc.h      |   1 +
 mm/zsmalloc.c                 | 578 +++++++++++++++++++++---------------------
 4 files changed, 296 insertions(+), 296 deletions(-)

-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-04  2:04   ` Minchan Kim
  2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

__zs_compact() does not use `nr_to_migrate', drop it.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 33d5126..e615b31 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1701,7 +1701,6 @@ static struct page *isolate_source_page(struct size_class *class)
 static unsigned long __zs_compact(struct zs_pool *pool,
 				struct size_class *class)
 {
-	int nr_to_migrate;
 	struct zs_compact_control cc;
 	struct page *src_page;
 	struct page *dst_page = NULL;
@@ -1712,8 +1711,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 
 		BUG_ON(!is_first_page(src_page));
 
-		/* The goal is to migrate all live objects in source page */
-		nr_to_migrate = src_page->inuse;
 		cc.index = 0;
 		cc.s_page = src_page;
 
@@ -1728,7 +1725,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 
 			putback_zspage(pool, class, dst_page);
 			nr_total_migrated += cc.nr_migrated;
-			nr_to_migrate -= cc.nr_migrated;
 		}
 
 		/* Stop if we couldn't find slot */
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-04  2:18   ` Minchan Kim
  2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

always account per-class `zs_size_stat' stats. this data will
help us make better decisions during compaction. we are especially
interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
class compaction will result in any memory gain.

for instance, we know the number of allocated objects in the class,
the number of objects being used (so we also know how many objects
are not used) and the number of objects per-page. so we can estimate
how many pages compaction can free (pages that will turn into
ZS_EMPTY during compaction).
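
as a rough illustration (not part of this patch; patch 03 below
formalizes this as zs_can_compact()), the estimate can be derived from
these stats roughly like so -- the helper name here is made up:

	static unsigned long estimate_freeable_zspages(struct size_class *class)
	{
		/* objects that are allocated but currently unused */
		unsigned long unused = zs_stat_get(class, OBJ_ALLOCATED) -
				zs_stat_get(class, OBJ_USED);

		/* expressed in whole zspages that compaction could empty */
		return unused / get_maxobj_per_zspage(class->size,
				class->pages_per_zspage);
	}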

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
 1 file changed, 12 insertions(+), 37 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index e615b31..778b8db 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -169,14 +169,12 @@ enum zs_stat_type {
 	NR_ZS_STAT_TYPE,
 };
 
-#ifdef CONFIG_ZSMALLOC_STAT
-
-static struct dentry *zs_stat_root;
-
 struct zs_size_stat {
 	unsigned long objs[NR_ZS_STAT_TYPE];
 };
 
+#ifdef CONFIG_ZSMALLOC_STAT
+static struct dentry *zs_stat_root;
 #endif
 
 /*
@@ -201,25 +199,21 @@ static int zs_size_classes;
 static const int fullness_threshold_frac = 4;
 
 struct size_class {
+	spinlock_t		lock;
+	struct page		*fullness_list[_ZS_NR_FULLNESS_GROUPS];
 	/*
 	 * Size of objects stored in this class. Must be multiple
 	 * of ZS_ALIGN.
 	 */
-	int size;
-	unsigned int index;
+	int			size;
+	unsigned int		index;
 
 	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
-	int pages_per_zspage;
-	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
-	bool huge;
-
-#ifdef CONFIG_ZSMALLOC_STAT
-	struct zs_size_stat stats;
-#endif
-
-	spinlock_t lock;
+	int			pages_per_zspage;
+	struct zs_size_stat	stats;
 
-	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
+	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
+	bool			huge;
 };
 
 /*
@@ -439,8 +433,6 @@ static int get_size_class_index(int size)
 	return min(zs_size_classes - 1, idx);
 }
 
-#ifdef CONFIG_ZSMALLOC_STAT
-
 static inline void zs_stat_inc(struct size_class *class,
 				enum zs_stat_type type, unsigned long cnt)
 {
@@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
 	return class->stats.objs[type];
 }
 
+#ifdef CONFIG_ZSMALLOC_STAT
+
 static int __init zs_stat_init(void)
 {
 	if (!debugfs_initialized())
@@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
 }
 
 #else /* CONFIG_ZSMALLOC_STAT */
-
-static inline void zs_stat_inc(struct size_class *class,
-				enum zs_stat_type type, unsigned long cnt)
-{
-}
-
-static inline void zs_stat_dec(struct size_class *class,
-				enum zs_stat_type type, unsigned long cnt)
-{
-}
-
-static inline unsigned long zs_stat_get(struct size_class *class,
-				enum zs_stat_type type)
-{
-	return 0;
-}
-
 static int __init zs_stat_init(void)
 {
 	return 0;
@@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
 static inline void zs_pool_stat_destroy(struct zs_pool *pool)
 {
 }
-
 #endif
 
 
@@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
 			class->size, class->pages_per_zspage));
 		atomic_long_sub(class->pages_per_zspage,
 				&pool->pages_allocated);
-
 		free_zspage(first_page);
 	}
 }
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-04  2:55   ` Minchan Kim
  2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

this function checks if class compaction will free any pages.
in other words, it checks whether we have enough unused objects to
form at least one ZS_EMPTY zspage and free it. compaction is aborted
if it will not result in any (further) savings.

EXAMPLE (this debug output is not part of this patch set):

-- class size
-- number of allocated objects
-- number of used objects
-- number of objects per page
-- estimated number of pages that will be freed

[..]
[ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
[ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5
[ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4
[ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3
[ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2
[ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1
[ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1
[ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0
[ 3303.108998] class-2720: Compaction is useless
[ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1
[ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1
[ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0
[ 3303.109011] class-2336: Compaction is useless
[ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0
[ 3303.109014] class-1808: Compaction is useless
[ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0
[ 3303.109017] class-1744: Compaction is useless
[ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0
[ 3303.109020] class-1536: Compaction is useless
[ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0
[ 3303.109023] class-1488: Compaction is useless
[ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0
[ 3303.109026] class-1360: Compaction is useless
[ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0
[ 3303.109029] class-1248: Compaction is useless
[ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0
[ 3303.109032] class-1216: Compaction is useless
[ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2
[ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 pages-tofree:1
[ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0
[ 3303.109043] class-1104: Compaction is useless
[..]

every "Compaction is useless" indicates that we saved some CPU cycles.

for example, class-1104 has

	605	objects allocated
	599	objects used
	11	objects per page

even if we have an ALMOST_EMPTY page, we still don't have enough room
to move all of its objects elsewhere and free the page; compaction
doesn't make much sense here, so it's better to just leave the class as is.
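
plugging the numbers in (a quick sanity check):

	(allocated - used) / objs-per-page = (605 - 599) / 11 = 0   -> pages-tofree:0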

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 778b8db..9ef6f15 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class)
 	return page;
 }
 
+/*
+ * Make sure that we actually can compact this class,
+ * IOW if migration will empty at least one page.
+ *
+ * should be called under class->lock
+ */
+static bool zs_can_compact(struct size_class *class)
+{
+	/*
+	 * calculate how many unused allocated objects we
+	 * have and see if we can free any zspages. otherwise,
+	 * compaction can just move objects back and forth w/o
+	 * any memory gain.
+	 */
+	unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
+		zs_stat_get(class, OBJ_USED);
+
+	ret /= get_maxobj_per_zspage(class->size,
+			class->pages_per_zspage);
+	return ret > 0;
+}
+
 static unsigned long __zs_compact(struct zs_pool *pool,
 				struct size_class *class)
 {
@@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 
 		BUG_ON(!is_first_page(src_page));
 
+		if (!zs_can_compact(class))
+			break;
+
 		cc.index = 0;
 		cc.s_page = src_page;
 
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-04  3:14   ` Minchan Kim
  2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

change zs_object_copy() argument order to be (DST, SRC) rather
than (SRC, DST). copy/move functions usually take their arguments
in (to, from) order.

rename alloc_target_page() to isolate_target_page(). this
function doesn't allocate anything, it isolates a target page,
pretty much like isolate_source_page().

tweak __zs_compact() comment.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9ef6f15..fa72a81 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
-static void zs_object_copy(unsigned long src, unsigned long dst,
+static void zs_object_copy(unsigned long dst, unsigned long src,
 				struct size_class *class)
 {
 	struct page *s_page, *d_page;
@@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 
 		used_obj = handle_to_obj(handle);
 		free_obj = obj_malloc(d_page, class, handle);
-		zs_object_copy(used_obj, free_obj, class);
+		zs_object_copy(free_obj, used_obj, class);
 		index++;
 		record_obj(handle, free_obj);
 		unpin_tag(handle);
@@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 	return ret;
 }
 
-static struct page *alloc_target_page(struct size_class *class)
+static struct page *isolate_target_page(struct size_class *class)
 {
 	int i;
 	struct page *page;
@@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		cc.index = 0;
 		cc.s_page = src_page;
 
-		while ((dst_page = alloc_target_page(class))) {
+		while ((dst_page = isolate_target_page(class))) {
 			cc.d_page = dst_page;
 			/*
-			 * If there is no more space in dst_page, try to
-			 * allocate another zspage.
+			 * If there is no more space in dst_page, resched
+			 * and see if anyone had allocated another zspage.
 			 */
 			if (!migrate_zspage(pool, class, &cc))
 				break;
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (3 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

remove the number of migrated objects from `zs_compact_control'
and make it a `zs_pool' member. `zs_compact_control' has a limited
lifespan; we lose it when zs_compact() returns to zram. to keep
track of objects migrated during auto-compaction we need to store
this number in zs_pool.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fa72a81..54eefc3 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -237,16 +237,19 @@ struct link_free {
 };
 
 struct zs_pool {
-	char *name;
+	char 			*name;
 
-	struct size_class **size_class;
-	struct kmem_cache *handle_cachep;
+	struct size_class	**size_class;
+	struct kmem_cache	*handle_cachep;
 
-	gfp_t flags;	/* allocation flags used when growing pool */
-	atomic_long_t pages_allocated;
+	/* allocation flags used when growing pool */
+	gfp_t 			flags;
+	atomic_long_t 		pages_allocated;
+	/* how many of objects were migrated */
+	unsigned long		num_migrated;
 
 #ifdef CONFIG_ZSMALLOC_STAT
-	struct dentry *stat_dentry;
+	struct dentry		*stat_dentry;
 #endif
 };
 
@@ -1576,8 +1579,6 @@ struct zs_compact_control {
 	 /* Starting object index within @s_page which used for live object
 	  * in the subpage. */
 	int index;
-	/* how many of objects are migrated */
-	int nr_migrated;
 };
 
 static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
@@ -1588,7 +1589,6 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 	struct page *s_page = cc->s_page;
 	struct page *d_page = cc->d_page;
 	unsigned long index = cc->index;
-	int nr_migrated = 0;
 	int ret = 0;
 
 	while (1) {
@@ -1615,13 +1615,12 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 		record_obj(handle, free_obj);
 		unpin_tag(handle);
 		obj_free(pool, class, used_obj);
-		nr_migrated++;
+		pool->num_migrated++;
 	}
 
 	/* Remember last position in this iteration */
 	cc->s_page = s_page;
 	cc->index = index;
-	cc->nr_migrated = nr_migrated;
 
 	return ret;
 }
@@ -1695,13 +1694,11 @@ static bool zs_can_compact(struct size_class *class)
 	return ret > 0;
 }
 
-static unsigned long __zs_compact(struct zs_pool *pool,
-				struct size_class *class)
+static void __zs_compact(struct zs_pool *pool, struct size_class *class)
 {
 	struct zs_compact_control cc;
 	struct page *src_page;
 	struct page *dst_page = NULL;
-	unsigned long nr_total_migrated = 0;
 
 	spin_lock(&class->lock);
 	while ((src_page = isolate_source_page(class))) {
@@ -1724,7 +1721,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 				break;
 
 			putback_zspage(pool, class, dst_page);
-			nr_total_migrated += cc.nr_migrated;
 		}
 
 		/* Stop if we couldn't find slot */
@@ -1734,7 +1730,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		putback_zspage(pool, class, dst_page);
 		putback_zspage(pool, class, src_page);
 		spin_unlock(&class->lock);
-		nr_total_migrated += cc.nr_migrated;
 		cond_resched();
 		spin_lock(&class->lock);
 	}
@@ -1743,14 +1738,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		putback_zspage(pool, class, src_page);
 
 	spin_unlock(&class->lock);
-
-	return nr_total_migrated;
 }
 
 unsigned long zs_compact(struct zs_pool *pool)
 {
 	int i;
-	unsigned long nr_migrated = 0;
 	struct size_class *class;
 
 	for (i = zs_size_classes - 1; i >= 0; i--) {
@@ -1759,10 +1751,10 @@ unsigned long zs_compact(struct zs_pool *pool)
 			continue;
 		if (class->index != i)
 			continue;
-		nr_migrated += __zs_compact(pool, class);
+		__zs_compact(pool, class);
 	}
-
-	return nr_migrated;
+	/* can be a bit outdated */
+	return pool->num_migrated;
 }
 EXPORT_SYMBOL_GPL(zs_compact);
 
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 06/10] zsmalloc: move compaction functions
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (4 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

this patch simply moves the compaction functions, so we can call
`static __zs_compact()' (and friends) from zs_free().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 426 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 215 insertions(+), 211 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 54eefc3..c2a640a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -321,6 +321,7 @@ static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp,
 	*handle = zs_malloc(pool, size);
 	return *handle ? 0 : -1;
 }
+
 static void zs_zpool_free(void *pool, unsigned long handle)
 {
 	zs_free(pool, handle);
@@ -352,6 +353,7 @@ static void *zs_zpool_map(void *pool, unsigned long handle,
 
 	return zs_map_object(pool, handle, zs_mm);
 }
+
 static void zs_zpool_unmap(void *pool, unsigned long handle)
 {
 	zs_unmap_object(pool, handle);
@@ -590,7 +592,6 @@ static inline void zs_pool_stat_destroy(struct zs_pool *pool)
 }
 #endif
 
-
 /*
  * For each size class, zspages are divided into different groups
  * depending on how "full" they are. This was done so that we could
@@ -1117,7 +1118,6 @@ out:
 	/* enable page faults to match kunmap_atomic() return conditions */
 	pagefault_enable();
 }
-
 #endif /* CONFIG_PGTABLE_MAPPING */
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
@@ -1207,115 +1207,6 @@ static bool zspage_full(struct page *page)
 	return page->inuse == page->objects;
 }
 
-unsigned long zs_get_total_pages(struct zs_pool *pool)
-{
-	return atomic_long_read(&pool->pages_allocated);
-}
-EXPORT_SYMBOL_GPL(zs_get_total_pages);
-
-/**
- * zs_map_object - get address of allocated object from handle.
- * @pool: pool from which the object was allocated
- * @handle: handle returned from zs_malloc
- *
- * Before using an object allocated from zs_malloc, it must be mapped using
- * this function. When done with the object, it must be unmapped using
- * zs_unmap_object.
- *
- * Only one object can be mapped per cpu at a time. There is no protection
- * against nested mappings.
- *
- * This function returns with preemption and page faults disabled.
- */
-void *zs_map_object(struct zs_pool *pool, unsigned long handle,
-			enum zs_mapmode mm)
-{
-	struct page *page;
-	unsigned long obj, obj_idx, off;
-
-	unsigned int class_idx;
-	enum fullness_group fg;
-	struct size_class *class;
-	struct mapping_area *area;
-	struct page *pages[2];
-	void *ret;
-
-	BUG_ON(!handle);
-
-	/*
-	 * Because we use per-cpu mapping areas shared among the
-	 * pools/users, we can't allow mapping in interrupt context
-	 * because it can corrupt another users mappings.
-	 */
-	BUG_ON(in_interrupt());
-
-	/* From now on, migration cannot move the object */
-	pin_tag(handle);
-
-	obj = handle_to_obj(handle);
-	obj_to_location(obj, &page, &obj_idx);
-	get_zspage_mapping(get_first_page(page), &class_idx, &fg);
-	class = pool->size_class[class_idx];
-	off = obj_idx_to_offset(page, obj_idx, class->size);
-
-	area = &get_cpu_var(zs_map_area);
-	area->vm_mm = mm;
-	if (off + class->size <= PAGE_SIZE) {
-		/* this object is contained entirely within a page */
-		area->vm_addr = kmap_atomic(page);
-		ret = area->vm_addr + off;
-		goto out;
-	}
-
-	/* this object spans two pages */
-	pages[0] = page;
-	pages[1] = get_next_page(page);
-	BUG_ON(!pages[1]);
-
-	ret = __zs_map_object(area, pages, off, class->size);
-out:
-	if (!class->huge)
-		ret += ZS_HANDLE_SIZE;
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(zs_map_object);
-
-void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
-{
-	struct page *page;
-	unsigned long obj, obj_idx, off;
-
-	unsigned int class_idx;
-	enum fullness_group fg;
-	struct size_class *class;
-	struct mapping_area *area;
-
-	BUG_ON(!handle);
-
-	obj = handle_to_obj(handle);
-	obj_to_location(obj, &page, &obj_idx);
-	get_zspage_mapping(get_first_page(page), &class_idx, &fg);
-	class = pool->size_class[class_idx];
-	off = obj_idx_to_offset(page, obj_idx, class->size);
-
-	area = this_cpu_ptr(&zs_map_area);
-	if (off + class->size <= PAGE_SIZE)
-		kunmap_atomic(area->vm_addr);
-	else {
-		struct page *pages[2];
-
-		pages[0] = page;
-		pages[1] = get_next_page(page);
-		BUG_ON(!pages[1]);
-
-		__zs_unmap_object(area, pages, off, class->size);
-	}
-	put_cpu_var(zs_map_area);
-	unpin_tag(handle);
-}
-EXPORT_SYMBOL_GPL(zs_unmap_object);
-
 static unsigned long obj_malloc(struct page *first_page,
 		struct size_class *class, unsigned long handle)
 {
@@ -1347,63 +1238,6 @@ static unsigned long obj_malloc(struct page *first_page,
 	return obj;
 }
 
-
-/**
- * zs_malloc - Allocate block of given size from pool.
- * @pool: pool to allocate from
- * @size: size of block to allocate
- *
- * On success, handle to the allocated object is returned,
- * otherwise 0.
- * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
- */
-unsigned long zs_malloc(struct zs_pool *pool, size_t size)
-{
-	unsigned long handle, obj;
-	struct size_class *class;
-	struct page *first_page;
-
-	if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE))
-		return 0;
-
-	handle = alloc_handle(pool);
-	if (!handle)
-		return 0;
-
-	/* extra space in chunk to keep the handle */
-	size += ZS_HANDLE_SIZE;
-	class = pool->size_class[get_size_class_index(size)];
-
-	spin_lock(&class->lock);
-	first_page = find_get_zspage(class);
-
-	if (!first_page) {
-		spin_unlock(&class->lock);
-		first_page = alloc_zspage(class, pool->flags);
-		if (unlikely(!first_page)) {
-			free_handle(pool, handle);
-			return 0;
-		}
-
-		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
-		atomic_long_add(class->pages_per_zspage,
-					&pool->pages_allocated);
-
-		spin_lock(&class->lock);
-		zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
-				class->size, class->pages_per_zspage));
-	}
-
-	obj = obj_malloc(first_page, class, handle);
-	/* Now move the zspage to another fullness group, if required */
-	fix_fullness_group(class, first_page);
-	record_obj(handle, obj);
-	spin_unlock(&class->lock);
-
-	return handle;
-}
-EXPORT_SYMBOL_GPL(zs_malloc);
-
 static void obj_free(struct zs_pool *pool, struct size_class *class,
 			unsigned long obj)
 {
@@ -1436,42 +1270,6 @@ static void obj_free(struct zs_pool *pool, struct size_class *class,
 	zs_stat_dec(class, OBJ_USED, 1);
 }
 
-void zs_free(struct zs_pool *pool, unsigned long handle)
-{
-	struct page *first_page, *f_page;
-	unsigned long obj, f_objidx;
-	int class_idx;
-	struct size_class *class;
-	enum fullness_group fullness;
-
-	if (unlikely(!handle))
-		return;
-
-	pin_tag(handle);
-	obj = handle_to_obj(handle);
-	obj_to_location(obj, &f_page, &f_objidx);
-	first_page = get_first_page(f_page);
-
-	get_zspage_mapping(first_page, &class_idx, &fullness);
-	class = pool->size_class[class_idx];
-
-	spin_lock(&class->lock);
-	obj_free(pool, class, obj);
-	fullness = fix_fullness_group(class, first_page);
-	if (fullness == ZS_EMPTY) {
-		zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
-				class->size, class->pages_per_zspage));
-		atomic_long_sub(class->pages_per_zspage,
-				&pool->pages_allocated);
-		free_zspage(first_page);
-	}
-	spin_unlock(&class->lock);
-	unpin_tag(handle);
-
-	free_handle(pool, handle);
-}
-EXPORT_SYMBOL_GPL(zs_free);
-
 static void zs_object_copy(unsigned long dst, unsigned long src,
 				struct size_class *class)
 {
@@ -1572,13 +1370,17 @@ static unsigned long find_alloced_obj(struct page *page, int index,
 
 struct zs_compact_control {
 	/* Source page for migration which could be a subpage of zspage. */
-	struct page *s_page;
-	/* Destination page for migration which should be a first page
-	 * of zspage. */
-	struct page *d_page;
-	 /* Starting object index within @s_page which used for live object
-	  * in the subpage. */
-	int index;
+	struct page 	*s_page;
+	/*
+	 * Destination page for migration which should be a first page
+	 * of zspage.
+	 */
+	struct page 	*d_page;
+	 /*
+	  * Starting object index within @s_page which used for live object
+	  * in the subpage.
+	  */
+	int 		index;
 };
 
 static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
@@ -1740,6 +1542,208 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
 	spin_unlock(&class->lock);
 }
 
+
+unsigned long zs_get_total_pages(struct zs_pool *pool)
+{
+	return atomic_long_read(&pool->pages_allocated);
+}
+EXPORT_SYMBOL_GPL(zs_get_total_pages);
+
+/**
+ * zs_map_object - get address of allocated object from handle.
+ * @pool: pool from which the object was allocated
+ * @handle: handle returned from zs_malloc
+ *
+ * Before using an object allocated from zs_malloc, it must be mapped using
+ * this function. When done with the object, it must be unmapped using
+ * zs_unmap_object.
+ *
+ * Only one object can be mapped per cpu at a time. There is no protection
+ * against nested mappings.
+ *
+ * This function returns with preemption and page faults disabled.
+ */
+void *zs_map_object(struct zs_pool *pool, unsigned long handle,
+			enum zs_mapmode mm)
+{
+	struct page *page;
+	unsigned long obj, obj_idx, off;
+
+	unsigned int class_idx;
+	enum fullness_group fg;
+	struct size_class *class;
+	struct mapping_area *area;
+	struct page *pages[2];
+	void *ret;
+
+	BUG_ON(!handle);
+
+	/*
+	 * Because we use per-cpu mapping areas shared among the
+	 * pools/users, we can't allow mapping in interrupt context
+	 * because it can corrupt another users mappings.
+	 */
+	BUG_ON(in_interrupt());
+
+	/* From now on, migration cannot move the object */
+	pin_tag(handle);
+
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &page, &obj_idx);
+	get_zspage_mapping(get_first_page(page), &class_idx, &fg);
+	class = pool->size_class[class_idx];
+	off = obj_idx_to_offset(page, obj_idx, class->size);
+
+	area = &get_cpu_var(zs_map_area);
+	area->vm_mm = mm;
+	if (off + class->size <= PAGE_SIZE) {
+		/* this object is contained entirely within a page */
+		area->vm_addr = kmap_atomic(page);
+		ret = area->vm_addr + off;
+		goto out;
+	}
+
+	/* this object spans two pages */
+	pages[0] = page;
+	pages[1] = get_next_page(page);
+	BUG_ON(!pages[1]);
+
+	ret = __zs_map_object(area, pages, off, class->size);
+out:
+	if (!class->huge)
+		ret += ZS_HANDLE_SIZE;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(zs_map_object);
+
+void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
+{
+	struct page *page;
+	unsigned long obj, obj_idx, off;
+
+	unsigned int class_idx;
+	enum fullness_group fg;
+	struct size_class *class;
+	struct mapping_area *area;
+
+	BUG_ON(!handle);
+
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &page, &obj_idx);
+	get_zspage_mapping(get_first_page(page), &class_idx, &fg);
+	class = pool->size_class[class_idx];
+	off = obj_idx_to_offset(page, obj_idx, class->size);
+
+	area = this_cpu_ptr(&zs_map_area);
+	if (off + class->size <= PAGE_SIZE)
+		kunmap_atomic(area->vm_addr);
+	else {
+		struct page *pages[2];
+
+		pages[0] = page;
+		pages[1] = get_next_page(page);
+		BUG_ON(!pages[1]);
+
+		__zs_unmap_object(area, pages, off, class->size);
+	}
+	put_cpu_var(zs_map_area);
+	unpin_tag(handle);
+}
+EXPORT_SYMBOL_GPL(zs_unmap_object);
+
+/**
+ * zs_malloc - Allocate block of given size from pool.
+ * @pool: pool to allocate from
+ * @size: size of block to allocate
+ *
+ * On success, handle to the allocated object is returned,
+ * otherwise 0.
+ * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
+ */
+unsigned long zs_malloc(struct zs_pool *pool, size_t size)
+{
+	unsigned long handle, obj;
+	struct size_class *class;
+	struct page *first_page;
+
+	if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE))
+		return 0;
+
+	handle = alloc_handle(pool);
+	if (!handle)
+		return 0;
+
+	/* extra space in chunk to keep the handle */
+	size += ZS_HANDLE_SIZE;
+	class = pool->size_class[get_size_class_index(size)];
+
+	spin_lock(&class->lock);
+	first_page = find_get_zspage(class);
+
+	if (!first_page) {
+		spin_unlock(&class->lock);
+		first_page = alloc_zspage(class, pool->flags);
+		if (unlikely(!first_page)) {
+			free_handle(pool, handle);
+			return 0;
+		}
+
+		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+		atomic_long_add(class->pages_per_zspage,
+					&pool->pages_allocated);
+
+		spin_lock(&class->lock);
+		zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
+				class->size, class->pages_per_zspage));
+	}
+
+	obj = obj_malloc(first_page, class, handle);
+	/* Now move the zspage to another fullness group, if required */
+	fix_fullness_group(class, first_page);
+	record_obj(handle, obj);
+	spin_unlock(&class->lock);
+
+	return handle;
+}
+EXPORT_SYMBOL_GPL(zs_malloc);
+
+void zs_free(struct zs_pool *pool, unsigned long handle)
+{
+	struct page *first_page, *f_page;
+	unsigned long obj, f_objidx;
+	int class_idx;
+	struct size_class *class;
+	enum fullness_group fullness;
+
+	if (unlikely(!handle))
+		return;
+
+	pin_tag(handle);
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &f_page, &f_objidx);
+	first_page = get_first_page(f_page);
+
+	get_zspage_mapping(first_page, &class_idx, &fullness);
+	class = pool->size_class[class_idx];
+
+	spin_lock(&class->lock);
+	obj_free(pool, class, obj);
+	fullness = fix_fullness_group(class, first_page);
+	if (fullness == ZS_EMPTY) {
+		zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
+				class->size, class->pages_per_zspage));
+		atomic_long_sub(class->pages_per_zspage,
+				&pool->pages_allocated);
+		free_zspage(first_page);
+	}
+	spin_unlock(&class->lock);
+	unpin_tag(handle);
+
+	free_handle(pool, handle);
+}
+EXPORT_SYMBOL_GPL(zs_free);
+
 unsigned long zs_compact(struct zs_pool *pool)
 {
 	int i;
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (5 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-04  4:57   ` Minchan Kim
  2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

perform class compaction in zs_free(), if zs_free() has created
a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.

it would probably make sense for zs_can_compact() to return an
estimated number of pages that can potentially be freed, and to
trigger auto-compaction only when that estimate is above some limit
(e.g. at least 4 zspages), or to put it under a config option
(see the sketch below).

this also tweaks __zs_compact() -- we can't reschedule anymore while
waiting for new pages in the current class, so we compact as much as
we can and return immediately once compaction is no longer possible.

auto-compaction is not a replacement for manual compaction.
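
a minimal sketch of that alternative policy (illustration only; the
threshold value and zs_can_compact_pages() -- a variant of
zs_can_compact() that would return the estimated number of freeable
zspages -- are made up):

	/* hypothetical: compact from zs_free() only when the estimated
	 * gain is at least COMPACT_THRESHOLD zspages
	 */
	#define COMPACT_THRESHOLD	4

	if (fullness == ZS_ALMOST_EMPTY &&
	    zs_can_compact_pages(class) >= COMPACT_THRESHOLD)
		__zs_compact(pool, class);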

compiled linux kernel with auto-compaction:

cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624076288        0 1624076288    19961     1106

performing additional manual compaction:

echo 1 > /sys/block/zram0/compact
cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624051712        0 1624076288    19961     1114

manual compaction was able to migrate only 8 additional objects, so
auto-compaction is 'good enough'.

TEST

this test copies a 1.3G linux kernel tar to mounted zram disk,
and extracts it.

w/auto-compaction:

cat /sys/block/zram0/mm_stat
 1171456    26006    86016        0    86016    32781        0

time tar xf linux-3.10.tar.gz -C linux

real    0m16.970s
user    0m15.247s
sys     0m8.477s

du -sh linux
2.0G    linux

cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384        0 3011088384    24310      108

=====================================================================

w/o auto compaction:

cat /sys/block/zram0/mm_stat
 1171456    26000    81920        0    81920    32781        0

time tar xf linux-3.10.tar.gz -C linux

real    0m16.983s
user    0m15.267s
sys     0m8.417s

du -sh linux
2.0G    linux

cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760        0 3011317760    23928        0

=====================================================================

iozone shows that the auto-compaction code runs faster in several
tests, which is hardly conclusive. anyway, the numbers:

iozone -t 3 -R -r 16K -s 60M -I +Z

       test           base       auto-compact (compacted 66123 objs)
   Initial write   1603682.25          1645112.38
         Rewrite   2502243.31          2256570.31
            Read   7040860.00          7130575.00
         Re-read   7036490.75          7066744.25
    Reverse Read   6617115.25          6155395.50
     Stride read   6705085.50          6350030.38
     Random read   6668497.75          6350129.38
  Mixed workload   5494030.38          5091669.62
    Random write   2526834.44          2500977.81
          Pwrite   1656874.00          1663796.94
           Pread   3322818.91          3359683.44
          Fwrite   4090124.25          4099773.88
           Fread   10358916.25         10324409.75

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c2a640a..70bf481 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
 
 		while ((dst_page = isolate_target_page(class))) {
 			cc.d_page = dst_page;
-			/*
-			 * If there is no more space in dst_page, resched
-			 * and see if anyone had allocated another zspage.
-			 */
+
 			if (!migrate_zspage(pool, class, &cc))
-				break;
+				goto out;
 
 			putback_zspage(pool, class, dst_page);
 		}
 
-		/* Stop if we couldn't find slot */
-		if (dst_page == NULL)
+		if (!dst_page)
 			break;
-
 		putback_zspage(pool, class, dst_page);
 		putback_zspage(pool, class, src_page);
-		spin_unlock(&class->lock);
-		cond_resched();
-		spin_lock(&class->lock);
 	}
 
+out:
+	if (dst_page)
+		putback_zspage(pool, class, dst_page);
 	if (src_page)
 		putback_zspage(pool, class, src_page);
 
 	spin_unlock(&class->lock);
 }
 
-
 unsigned long zs_get_total_pages(struct zs_pool *pool)
 {
 	return atomic_long_read(&pool->pages_allocated);
@@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	unpin_tag(handle);
 
 	free_handle(pool, handle);
+
+	/*
+	 * actual fullness might have changed, __zs_compact() checks
+	 * if compaction makes sense
+	 */
+	if (fullness == ZS_ALMOST_EMPTY)
+		__zs_compact(pool, class);
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated'
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (6 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

introduce zs_get_num_migrated() to export zs_pool's ->num_migrated
counter.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/zsmalloc.h | 1 +
 mm/zsmalloc.c            | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 1338190..e878875 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -47,6 +47,7 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 
 unsigned long zs_get_total_pages(struct zs_pool *pool);
+unsigned long zs_get_num_migrated(struct zs_pool *pool);
 unsigned long zs_compact(struct zs_pool *pool);
 
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 70bf481..0524c4a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1543,6 +1543,13 @@ unsigned long zs_get_total_pages(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_get_total_pages);
 
+unsigned long zs_get_num_migrated(struct zs_pool *pool)
+{
+	/* can be outdated */
+	return pool->num_migrated;
+}
+EXPORT_SYMBOL_GPL(zs_get_num_migrated);
+
 /**
  * zs_map_object - get address of allocated object from handle.
  * @pool: pool from which the object was allocated
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (7 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
  2015-06-03  5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

drop zram's copy of `num_migrated' objects and use zs_pool's
zs_get_num_migrated() instead.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zram_drv.c | 12 ++++++------
 drivers/block/zram/zram_drv.h |  1 -
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 28f6e46..31e45b4 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -385,7 +385,6 @@ static ssize_t comp_algorithm_store(struct device *dev,
 static ssize_t compact_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
-	unsigned long nr_migrated;
 	struct zram *zram = dev_to_zram(dev);
 	struct zram_meta *meta;
 
@@ -396,8 +395,7 @@ static ssize_t compact_store(struct device *dev,
 	}
 
 	meta = zram->meta;
-	nr_migrated = zs_compact(meta->mem_pool);
-	atomic64_add(nr_migrated, &zram->stats.num_migrated);
+	zs_compact(meta->mem_pool);
 	up_read(&zram->init_lock);
 
 	return len;
@@ -425,13 +423,15 @@ static ssize_t mm_stat_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
 	struct zram *zram = dev_to_zram(dev);
-	u64 orig_size, mem_used = 0;
+	u64 orig_size, mem_used = 0, num_migrated = 0;
 	long max_used;
 	ssize_t ret;
 
 	down_read(&zram->init_lock);
-	if (init_done(zram))
+	if (init_done(zram)) {
 		mem_used = zs_get_total_pages(zram->meta->mem_pool);
+		num_migrated = zs_get_num_migrated(zram->meta->mem_pool);
+	}
 
 	orig_size = atomic64_read(&zram->stats.pages_stored);
 	max_used = atomic_long_read(&zram->stats.max_used_pages);
@@ -444,7 +444,7 @@ static ssize_t mm_stat_show(struct device *dev,
 			zram->limit_pages << PAGE_SHIFT,
 			max_used << PAGE_SHIFT,
 			(u64)atomic64_read(&zram->stats.zero_pages),
-			(u64)atomic64_read(&zram->stats.num_migrated));
+			num_migrated);
 	up_read(&zram->init_lock);
 
 	return ret;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 6dbe2df..8e92339 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -78,7 +78,6 @@ struct zram_stats {
 	atomic64_t compr_data_size;	/* compressed size of pages stored */
 	atomic64_t num_reads;	/* failed + successful */
 	atomic64_t num_writes;	/* --do-- */
-	atomic64_t num_migrated;	/* no. of migrated object */
 	atomic64_t failed_reads;	/* can happen when memory is too low */
 	atomic64_t failed_writes;	/* can happen when memory is too low */
 	atomic64_t invalid_io;	/* non-page-aligned I/O requests */
-- 
2.4.2.337.gfae46aa


* [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (8 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
  2015-06-03  5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky

get_fullness_group() considers pages that are up to 3/4 full to be
almost empty. that, unfortunately, marks as ALMOST_EMPTY pages that
we would probably like to keep on the ALMOST_FULL list.

ALMOST_EMPTY:
[..]
  inuse: 3 max_objexts: 4
  inuse: 5 max_objexts: 7
  inuse: 5 max_objexts: 7
  inuse: 2 max_objexts: 3
[..]

so, for the "inuse: 5 max_objexts: 7" ALMOST_EMPTY page, for example,
it takes 2 obj_malloc calls to make the page FULL but 5 obj_free calls
to make it EMPTY. compaction selects ALMOST_EMPTY pages as source
pages, which can result in extra object moves.

iow, in terms of compaction, it makes more sense to fill this
page, rather than drain it.

decrease the ALMOST_FULL waterline to 2/3 of a page's max capacity,
which is, of course, still imperfect.
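
to make the effect concrete, take the "inuse: 5 max_objexts: 7" page
from the example above (integer math, as in get_fullness_group()):

	old: inuse <= 3 * max_objects / 4  ->  5 <= 5  -> ZS_ALMOST_EMPTY
	new: inuse <= 2 * max_objects / 3  ->  5 <= 4 is false  -> ZS_ALMOST_FULL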

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 mm/zsmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0524c4a..a8a3eae 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -196,7 +196,7 @@ static int zs_size_classes;
  *
  * (see: fix_fullness_group())
  */
-static const int fullness_threshold_frac = 4;
+static const int fullness_threshold_frac = 3;
 
 struct size_class {
 	spinlock_t		lock;
@@ -612,7 +612,7 @@ static enum fullness_group get_fullness_group(struct page *page)
 		fg = ZS_EMPTY;
 	else if (inuse == max_objects)
 		fg = ZS_FULL;
-	else if (inuse <= 3 * max_objects / fullness_threshold_frac)
+	else if (inuse <= 2 * max_objects / fullness_threshold_frac)
 		fg = ZS_ALMOST_EMPTY;
 	else
 		fg = ZS_ALMOST_FULL;
-- 
2.4.2.337.gfae46aa


* Re: [RFC][PATCH 00/10] zsmalloc auto-compaction
  2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
                   ` (9 preceding siblings ...)
  2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
@ 2015-06-03  5:09 ` Sergey Senozhatsky
  10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-03  5:09 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
	Sergey Senozhatsky

On (05/30/15 00:05), Sergey Senozhatsky wrote:
> RFC
> 
> this is 4.3 material, but I wanted to publish it sooner to gather
> feedback and to settle it down before the 4.3 merge window opens.
> 
> in short, this series tweaks zsmalloc's compaction and adds
> auto-compaction support. auto-compaction is not aimed to replace
> manual compaction; instead it's supposed to be good enough. it
> surely slows down zsmalloc in some scenarios, although a simple
> un-tar test didn't show any significant performance difference.
> 
> 
> quote from commit 0007:
> 
> this test copies a 1.3G linux kernel tar to mounted zram disk,
> and extracts it.
> 

[..]


Hello,

I have a v2:
-- squashed and re-ordered some of the patches;
-- re-ran iozone with lockdep disabled.

=== quote ===

    auto-compaction should not affect read-only tests, so we are interested
    in write-only and read-write (mixed) tests, but I'll post complete test
    stats:
    
    iozone -t 3 -R -r 16K -s 60M -I +Z
    ext4, 2g zram0 device, lzo, 4 compression streams max
    
           test           base       auto-compact (compacted 67904 objs)
       Initial write   2474943.62          2490551.69
             Rewrite   3656121.38          3002796.31
                Read   12068187.50         12044105.25
             Re-read   12009777.25         11930537.50
        Reverse Read   10858884.25         10388252.50
         Stride read   10715304.75         10429308.00
         Random read   10597970.50         10502978.75
      Mixed workload   8517269.00          8701298.12
        Random write   3595597.00          3465174.38
              Pwrite   2507361.25          2553224.50
               Pread   5380608.28          5340646.03
              Fwrite   6123863.62          6130514.25
               Fread   12006438.50         11936981.25
    
    mm_stat after the test
    
    base:
    cat /sys/block/zram0/mm_stat
    378834944  5748695  7446528        0  7450624    16318        0
    
    auto-compaction:
    cat /sys/block/zram0/mm_stat
    378892288  5754987  7397376        0  7397376    16304    67904

===

	-ss

> 
> 
> Sergey Senozhatsky (10):
>   zsmalloc: drop unused variable `nr_to_migrate'
>   zsmalloc: always keep per-class stats
>   zsmalloc: introduce zs_can_compact() function
>   zsmalloc: cosmetic compaction code adjustments
>   zsmalloc: add `num_migrated' to zs_pool
>   zsmalloc: move compaction functions
>   zsmalloc: introduce auto-compact support
>   zsmalloc: export zs_pool `num_migrated'
>   zram: remove `num_migrated' from zram_stats
>   zsmalloc: lower ZS_ALMOST_FULL waterline
> 
>  drivers/block/zram/zram_drv.c |  12 +-
>  drivers/block/zram/zram_drv.h |   1 -
>  include/linux/zsmalloc.h      |   1 +
>  mm/zsmalloc.c                 | 578 +++++++++++++++++++++---------------------
>  4 files changed, 296 insertions(+), 296 deletions(-)
> 
> -- 
> 2.4.2.337.gfae46aa
> 


* Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
  2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
@ 2015-06-04  2:04   ` Minchan Kim
  2015-06-04  2:10     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  2:04 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky

On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote:
> __zs_compact() does not use `nr_to_migrate', drop it.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>

-- 
Kind regards,
Minchan Kim


* Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
  2015-06-04  2:04   ` Minchan Kim
@ 2015-06-04  2:10     ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  2:10 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
	Sergey Senozhatsky

On (06/04/15 11:04), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote:
> > __zs_compact() does not use `nr_to_migrate', drop it.
> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Acked-by: Minchan Kim <minchan@kernel.org>
> 

Hello Minchan,

I will post a slightly reworked patchset later today.
thanks.

	-ss


* Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
  2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
@ 2015-06-04  2:18   ` Minchan Kim
  2015-06-04  2:34     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  2:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky

On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote:
> always account per-class `zs_size_stat' stats. this data will
> help us make better decisions during compaction. we are especially
> interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
> class compaction will result in any memory gain.
> 
> for instance, we know the number of allocated objects in the class,
> the number of objects being used (so we also know how many objects
> are not used) and the number of objects per-page. so we can estimate
> how many pages compaction can free (pages that will turn into
> ZS_EMPTY during compaction).

Fair enough but I need to read further patches to see if we need
really this at the moment.

I hope it would be better to write down more detail in cover-letter
so when I read just [0/0] I realize your goal and approach without
looking into detail in each patch.

> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
>  1 file changed, 12 insertions(+), 37 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index e615b31..778b8db 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -169,14 +169,12 @@ enum zs_stat_type {
>  	NR_ZS_STAT_TYPE,
>  };
>  
> -#ifdef CONFIG_ZSMALLOC_STAT
> -
> -static struct dentry *zs_stat_root;
> -
>  struct zs_size_stat {
>  	unsigned long objs[NR_ZS_STAT_TYPE];
>  };
>  
> +#ifdef CONFIG_ZSMALLOC_STAT
> +static struct dentry *zs_stat_root;
>  #endif
>  
>  /*
> @@ -201,25 +199,21 @@ static int zs_size_classes;
>  static const int fullness_threshold_frac = 4;
>  
>  struct size_class {
> +	spinlock_t		lock;
> +	struct page		*fullness_list[_ZS_NR_FULLNESS_GROUPS];
>  	/*
>  	 * Size of objects stored in this class. Must be multiple
>  	 * of ZS_ALIGN.
>  	 */
> -	int size;
> -	unsigned int index;
> +	int			size;
> +	unsigned int		index;
>  
>  	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
> -	int pages_per_zspage;
> -	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> -	bool huge;
> -
> -#ifdef CONFIG_ZSMALLOC_STAT
> -	struct zs_size_stat stats;
> -#endif
> -
> -	spinlock_t lock;
> +	int			pages_per_zspage;
> +	struct zs_size_stat	stats;
>  
> -	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> +	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> +	bool			huge;
>  };
>  
>  /*
> @@ -439,8 +433,6 @@ static int get_size_class_index(int size)
>  	return min(zs_size_classes - 1, idx);
>  }
>  
> -#ifdef CONFIG_ZSMALLOC_STAT
> -
>  static inline void zs_stat_inc(struct size_class *class,
>  				enum zs_stat_type type, unsigned long cnt)
>  {
> @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
>  	return class->stats.objs[type];
>  }
>  
> +#ifdef CONFIG_ZSMALLOC_STAT
> +
>  static int __init zs_stat_init(void)
>  {
>  	if (!debugfs_initialized())
> @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
>  }
>  
>  #else /* CONFIG_ZSMALLOC_STAT */
> -
> -static inline void zs_stat_inc(struct size_class *class,
> -				enum zs_stat_type type, unsigned long cnt)
> -{
> -}
> -
> -static inline void zs_stat_dec(struct size_class *class,
> -				enum zs_stat_type type, unsigned long cnt)
> -{
> -}
> -
> -static inline unsigned long zs_stat_get(struct size_class *class,
> -				enum zs_stat_type type)
> -{
> -	return 0;
> -}
> -
>  static int __init zs_stat_init(void)
>  {
>  	return 0;
> @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
>  static inline void zs_pool_stat_destroy(struct zs_pool *pool)
>  {
>  }
> -
>  #endif
>  
>  
> @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
>  			class->size, class->pages_per_zspage));
>  		atomic_long_sub(class->pages_per_zspage,
>  				&pool->pages_allocated);
> -
>  		free_zspage(first_page);
>  	}
>  }
> -- 
> 2.4.2.337.gfae46aa
> 

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
  2015-06-04  2:18   ` Minchan Kim
@ 2015-06-04  2:34     ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  2:34 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
	Sergey Senozhatsky

On (06/04/15 11:18), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote:
> > always account per-class `zs_size_stat' stats. this data will
> > help us make better decisions during compaction. we are especially
> > interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
> > class compaction will result in any memory gain.
> > 
> > for instance, we know the number of allocated objects in the class,
> > the number of objects being used (so we also know how many objects
> > are not used) and the number of objects per-page. so we can estimate
> > how many pages compaction can free (pages that will turn into
> > ZS_EMPTY during compaction).
> 
> Fair enough but I need to read further patches to see if we need
> really this at the moment.
> 
> I hope it would be better to write down more detail in cover-letter
> so when I read just [0/0] I realize your goal and approach without
> looking into detail in each patch.
> 

sure, will do later today.
I caught a cold, so I'm a bit slow.

	-ss

> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > ---
> >  mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
> >  1 file changed, 12 insertions(+), 37 deletions(-)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index e615b31..778b8db 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -169,14 +169,12 @@ enum zs_stat_type {
> >  	NR_ZS_STAT_TYPE,
> >  };
> >  
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > -
> > -static struct dentry *zs_stat_root;
> > -
> >  struct zs_size_stat {
> >  	unsigned long objs[NR_ZS_STAT_TYPE];
> >  };
> >  
> > +#ifdef CONFIG_ZSMALLOC_STAT
> > +static struct dentry *zs_stat_root;
> >  #endif
> >  
> >  /*
> > @@ -201,25 +199,21 @@ static int zs_size_classes;
> >  static const int fullness_threshold_frac = 4;
> >  
> >  struct size_class {
> > +	spinlock_t		lock;
> > +	struct page		*fullness_list[_ZS_NR_FULLNESS_GROUPS];
> >  	/*
> >  	 * Size of objects stored in this class. Must be multiple
> >  	 * of ZS_ALIGN.
> >  	 */
> > -	int size;
> > -	unsigned int index;
> > +	int			size;
> > +	unsigned int		index;
> >  
> >  	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
> > -	int pages_per_zspage;
> > -	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> > -	bool huge;
> > -
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > -	struct zs_size_stat stats;
> > -#endif
> > -
> > -	spinlock_t lock;
> > +	int			pages_per_zspage;
> > +	struct zs_size_stat	stats;
> >  
> > -	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> > +	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> > +	bool			huge;
> >  };
> >  
> >  /*
> > @@ -439,8 +433,6 @@ static int get_size_class_index(int size)
> >  	return min(zs_size_classes - 1, idx);
> >  }
> >  
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > -
> >  static inline void zs_stat_inc(struct size_class *class,
> >  				enum zs_stat_type type, unsigned long cnt)
> >  {
> > @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
> >  	return class->stats.objs[type];
> >  }
> >  
> > +#ifdef CONFIG_ZSMALLOC_STAT
> > +
> >  static int __init zs_stat_init(void)
> >  {
> >  	if (!debugfs_initialized())
> > @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
> >  }
> >  
> >  #else /* CONFIG_ZSMALLOC_STAT */
> > -
> > -static inline void zs_stat_inc(struct size_class *class,
> > -				enum zs_stat_type type, unsigned long cnt)
> > -{
> > -}
> > -
> > -static inline void zs_stat_dec(struct size_class *class,
> > -				enum zs_stat_type type, unsigned long cnt)
> > -{
> > -}
> > -
> > -static inline unsigned long zs_stat_get(struct size_class *class,
> > -				enum zs_stat_type type)
> > -{
> > -	return 0;
> > -}
> > -
> >  static int __init zs_stat_init(void)
> >  {
> >  	return 0;
> > @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
> >  static inline void zs_pool_stat_destroy(struct zs_pool *pool)
> >  {
> >  }
> > -
> >  #endif
> >  
> >  
> > @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
> >  			class->size, class->pages_per_zspage));
> >  		atomic_long_sub(class->pages_per_zspage,
> >  				&pool->pages_allocated);
> > -
> >  		free_zspage(first_page);
> >  	}
> >  }
> > -- 
> > 2.4.2.337.gfae46aa
> > 
> 
> -- 
> Kind regards,
> Minchan Kim
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
@ 2015-06-04  2:55   ` Minchan Kim
  2015-06-04  3:15     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  2:55 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky

On Sat, May 30, 2015 at 12:05:21AM +0900, Sergey Senozhatsky wrote:
> this function checks if class compaction will free any pages.
> rephrasing, do we have enough unused objects to form at least one
> ZS_EMPTY page and free it. it aborts compaction if class compaction
> will not result into any (further) savings.
> 
> EXAMPLE (this debug output is not part of this patch set):
> 
> -- class size
> -- number of allocated objects
> -- number of used objects,
> -- estimated number of pages that will be freed
> 
> [..]
> [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6

                                                   maxobjs-per-zspage?

> [ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5
> [ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4
> [ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3
> [ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2
> [ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1
> [ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1
> [ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0
> [ 3303.108998] class-2720: Compaction is useless
> [ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1
> [ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1
> [ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0
> [ 3303.109011] class-2336: Compaction is useless
> [ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0
> [ 3303.109014] class-1808: Compaction is useless
> [ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0
> [ 3303.109017] class-1744: Compaction is useless
> [ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0
> [ 3303.109020] class-1536: Compaction is useless
> [ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0
> [ 3303.109023] class-1488: Compaction is useless
> [ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0
> [ 3303.109026] class-1360: Compaction is useless
> [ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0
> [ 3303.109029] class-1248: Compaction is useless
> [ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0
> [ 3303.109032] class-1216: Compaction is useless
> [ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2
> [ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 pages-tofree:1
> [ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0
> [ 3303.109043] class-1104: Compaction is useless
> [..]
> 
> every "Compaction is useless" indicates that we saved some CPU cycles.
> 
> for example, class-1104 has
> 
> 	605	object allocated
> 	599	objects used
> 	11	objects per-page
> 
> even if we have ALMOST_EMPTY page, we still don't have enough room to move
> all of its objects and free this page; so compaction will not make a lot of
> sense here, it's better to just leave it as is.

Fair enough.

> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  mm/zsmalloc.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 778b8db..9ef6f15 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class)
>  	return page;
>  }
>  
> +/*
> + * Make sure that we actually can compact this class,
> + * IOW if migration will empty at least one page.
> + *
> + * should be called under class->lock
> + */
> +static bool zs_can_compact(struct size_class *class)
> +{
> +	/*
> +	 * calculate how many unused allocated objects we

            c should be capital.

I hope you will fix all of the English grammar in the next spin,
because someone (like me) who is not a native speaker will learn
the wrong English. :)
           
> +	 * have and see if we can free any zspages. otherwise,
> +	 * compaction can just move objects back and forth w/o
> +	 * any memory gain.
> +	 */
> +	unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> +		zs_stat_get(class, OBJ_USED);
> +

I prefer obj_wasted to "ret".

> +	ret /= get_maxobj_per_zspage(class->size,
> +			class->pages_per_zspage);
> +	return ret > 0;
> +}
> +
>  static unsigned long __zs_compact(struct zs_pool *pool,
>  				struct size_class *class)
>  {
> @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
>  
>  		BUG_ON(!is_first_page(src_page));
>  
> +		if (!zs_can_compact(class))
> +			break;
> +
>  		cc.index = 0;
>  		cc.s_page = src_page;
>  
> -- 
> 2.4.2.337.gfae46aa
> 

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments
  2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
@ 2015-06-04  3:14   ` Minchan Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  3:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky

On Sat, May 30, 2015 at 12:05:22AM +0900, Sergey Senozhatsky wrote:
> change zs_object_copy() argument order to be (DST, SRC) rather
> than (SRC, DST). copy/move functions usually have (to, from)
> arguments order.

Yeb,

> 
> rename alloc_target_page() to isolate_target_page(). this
> function doesn't allocate anything, it isolates target page,
> pretty much like isolate_source_page().

The reason I named it as alloc_target_page is I had a plan to
alloc new page which might be helpful sometime but I cannot
think of any benefit now so I follow your your patch.

> 
> tweak __zs_compact() comment.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>

> ---
>  mm/zsmalloc.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9ef6f15..fa72a81 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>  
> -static void zs_object_copy(unsigned long src, unsigned long dst,
> +static void zs_object_copy(unsigned long dst, unsigned long src,
>  				struct size_class *class)
>  {
>  	struct page *s_page, *d_page;
> @@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>  
>  		used_obj = handle_to_obj(handle);
>  		free_obj = obj_malloc(d_page, class, handle);
> -		zs_object_copy(used_obj, free_obj, class);
> +		zs_object_copy(free_obj, used_obj, class);
>  		index++;
>  		record_obj(handle, free_obj);
>  		unpin_tag(handle);
> @@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>  	return ret;
>  }
>  
> -static struct page *alloc_target_page(struct size_class *class)
> +static struct page *isolate_target_page(struct size_class *class)
>  {
>  	int i;
>  	struct page *page;
> @@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
>  		cc.index = 0;
>  		cc.s_page = src_page;
>  
> -		while ((dst_page = alloc_target_page(class))) {
> +		while ((dst_page = isolate_target_page(class))) {
>  			cc.d_page = dst_page;
>  			/*
> -			 * If there is no more space in dst_page, try to
> -			 * allocate another zspage.
> +			 * If there is no more space in dst_page, resched
> +			 * and see if anyone had allocated another zspage.
>  			 */
>  			if (!migrate_zspage(pool, class, &cc))
>  				break;
> -- 
> 2.4.2.337.gfae46aa
> 

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  2:55   ` Minchan Kim
@ 2015-06-04  3:15     ` Sergey Senozhatsky
  2015-06-04  3:30       ` Minchan Kim
  2015-06-04  3:31       ` Sergey Senozhatsky
  0 siblings, 2 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  3:15 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
	Sergey Senozhatsky

On (06/04/15 11:55), Minchan Kim wrote:
> > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
> 
>                                                    maxobjs-per-zspage?
> 

yeah, I shortened it to be more or less "80 chars" friendly.


[..]

> > +	 * calculate how many unused allocated objects we
> 
>            c should be captital.
> 
> I hope you will fix all of english grammer in next spin
> because someone(like me) who is not a native will learn the
> wrong english. :)

sure, will fix. yeah, I'm a native broken english speaker :-)

> > +	 * have and see if we can free any zspages. otherwise,
> > +	 * compaction can just move objects back and forth w/o
> > +	 * any memory gain.
> > +	 */
> > +	unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> > +		zs_stat_get(class, OBJ_USED);
> > +
> 
> I prefer obj_wasted to "ret".

ok.

I'm still thinking how good it should be.

for automatic compaction we don't want to uselessly move objects between
pages, and I tend to think that it's better to compact less than to waste
more CPU cycles.


on the other hand, this policy will miss cases like:

-- free objects in class: 5 (free-objs class capacity)
-- page1: inuse 2
-- page2: inuse 2
-- page3: inuse 3
-- page4: inuse 2

so the total "inuse" is greater than the free-objs class capacity. but it's
surely possible to compact this class: a partial inuse sum <= free-objs class
capacity (a partial sum is the ->inuse sum of any two of the class's pages:
page1 + page2, page2 + page3, etc.).

otoh, these partial sums will badly affect performance. maybe for automatic
compaction (the one that happens w/o user interaction) we can do zs_can_compact()
and for manual compaction (the one that has been triggered by a user) we can do
the old "full-scan".

anyway, zs_can_compact() looks like something that we can optimize
independently later.

	-ss

> > +	ret /= get_maxobj_per_zspage(class->size,
> > +			class->pages_per_zspage);
> > +	return ret > 0;
> > +}
> > +
> >  static unsigned long __zs_compact(struct zs_pool *pool,
> >  				struct size_class *class)
> >  {
> > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> >  
> >  		BUG_ON(!is_first_page(src_page));
> >  
> > +		if (!zs_can_compact(class))
> > +			break;
> > +
> >  		cc.index = 0;
> >  		cc.s_page = src_page;
> >  
> > -- 
> > 2.4.2.337.gfae46aa
> > 
> 
> -- 
> Kind regards,
> Minchan Kim
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  3:15     ` Sergey Senozhatsky
@ 2015-06-04  3:30       ` Minchan Kim
  2015-06-04  3:42         ` Sergey Senozhatsky
  2015-06-04  3:31       ` Sergey Senozhatsky
  1 sibling, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  3:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel

On Thu, Jun 04, 2015 at 12:15:14PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 11:55), Minchan Kim wrote:
> > > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
> > 
> >                                                    maxobjs-per-zspage?
> > 
> 
> yeah, I shortened it to be more of less "80 chars" friendly.
> 
> 
> [..]
> 
> > > +	 * calculate how many unused allocated objects we
> > 
> >            c should be captital.
> > 
> > I hope you will fix all of english grammer in next spin
> > because someone(like me) who is not a native will learn the
> > wrong english. :)
> 
> sure, will fix. yeah, I'm a native broken english speaker :-)
> 
> > > +	 * have and see if we can free any zspages. otherwise,
> > > +	 * compaction can just move objects back and forth w/o
> > > +	 * any memory gain.
> > > +	 */
> > > +	unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> > > +		zs_stat_get(class, OBJ_USED);
> > > +
> > 
> > I prefer obj_wasted to "ret".
> 
> ok.
> 
> I'm still thinking how good it should be.
> 
> for automatic compaction we don't want to uselessly move objects between
> pages and I tend to think that it's better to compact less, than to waste
> more cpu cycless.
> 
> 
> on the other hand, this policy will miss cases like:
> 
> -- free objects in class: 5 (free-objs class capacity)
> -- page1: inuse 2
> -- page2: inuse 2
> -- page3: inuse 3
> -- page4: inuse 2

What scenario are you concerned about?
Could you describe this example more clearly?

Thanks.
> 
> so total "insuse" is greater than free-objs class capacity. but, it's
> surely possible to compact this class. partial inuse summ <= free-objs class
> capacity (a partial summ is a ->inuse summ of any two of class pages:
> page1 + page2, page2 + page3, etc.).
> 
> otoh, these partial sums will badly affect performance. may be for automatic
> compaction (the one that happens w/o user interaction) we can do zs_can_compact()
> and for manual compaction (the one that has been triggered by a user) we can
> old "full-scan".
> 
> anyway, zs_can_compact() looks like something that we can optimize
> independently later.
> 
> 	-ss
> 
> > > +	ret /= get_maxobj_per_zspage(class->size,
> > > +			class->pages_per_zspage);
> > > +	return ret > 0;
> > > +}
> > > +
> > >  static unsigned long __zs_compact(struct zs_pool *pool,
> > >  				struct size_class *class)
> > >  {
> > > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> > >  
> > >  		BUG_ON(!is_first_page(src_page));
> > >  
> > > +		if (!zs_can_compact(class))
> > > +			break;
> > > +
> > >  		cc.index = 0;
> > >  		cc.s_page = src_page;
> > >  
> > > -- 
> > > 2.4.2.337.gfae46aa
> > > 
> > 
> > -- 
> > Kind regards,
> > Minchan Kim
> > 

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  3:15     ` Sergey Senozhatsky
  2015-06-04  3:30       ` Minchan Kim
@ 2015-06-04  3:31       ` Sergey Senozhatsky
  1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  3:31 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel

On (06/04/15 12:15), Sergey Senozhatsky wrote:
> I'm still thinking how good it should be.
> 
> for automatic compaction we don't want to uselessly move objects between
> pages and I tend to think that it's better to compact less, than to waste
> more cpu cycless.
> 
> 
> on the other hand, this policy will miss cases like:
> 
> -- free objects in class: 5 (free-objs class capacity)
> -- page1: inuse 2
> -- page2: inuse 2
> -- page3: inuse 3
> -- page4: inuse 2
> 
> so total "insuse" is greater than free-objs class capacity. but, it's
> surely possible to compact this class. partial inuse summ <= free-objs class
> capacity (a partial summ is a ->inuse summ of any two of class pages:
> page1 + page2, page2 + page3, etc.).
> 
> otoh, these partial sums will badly affect performance. may be for automatic
> compaction (the one that happens w/o user interaction) we can do zs_can_compact()
> and for manual compaction (the one that has been triggered by a user) we can
> old "full-scan".
> 
> anyway, zs_can_compact() looks like something that we can optimize
> independently later.
> 

so what I'm thinking of right now is:

-- first do "if we have enough free objects to free at least one page"
check. compact if true.

  -- if false, then we can check on a per-page basis:
     "if page->inuse <= class free-objs capacity" then compact it,
     else select the next almost_empty page.

     here it would be helpful to have pages ordered by ->inuse, but
     keeping them fully sorted is far too expensive.


I have a patch, which I will post later, that introduces weak/partial
page ordering within the fullness_list (really inexpensive: just one int
compare to add a page with a higher ->inuse to the list head instead of
the list tail).
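
just to illustrate the idea -- a rough sketch of that insertion path,
not the actual patch; it assumes insert_zspage(), ->fullness_list[]
and first_page->inuse roughly as they are in mm/zsmalloc.c today:

/*
 * Sketch: keep fullness lists weakly ordered by ->inuse, so that
 * better-filled zspages sit at the list head and get picked first
 * as migration destinations. Only one extra compare.
 */
static void insert_zspage(struct page *page, struct size_class *class,
				enum fullness_group fullness)
{
	struct page **head;

	if (fullness >= _ZS_NR_FULLNESS_GROUPS)
		return;

	head = &class->fullness_list[fullness];
	if (!*head) {
		*head = page;
		return;
	}

	/* keep the circular ->lru list, only pick the new head */
	list_add_tail(&page->lru, &(*head)->lru);
	if (page->inuse >= (*head)->inuse)
		*head = page;
}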

	-ss


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  3:30       ` Minchan Kim
@ 2015-06-04  3:42         ` Sergey Senozhatsky
  2015-06-04  3:50           ` Minchan Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  3:42 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
	linux-kernel

On (06/04/15 12:30), Minchan Kim wrote:
> > -- free objects in class: 5 (free-objs class capacity)
> > -- page1: inuse 2
> > -- page2: inuse 2
> > -- page3: inuse 3
> > -- page4: inuse 2
> 
> What scenario do you have a cocern?
> Could you describe this example more clear?

you mean "how is this even possible"?

well, for example,

make -jX
make clean

can introduce significant fragmentation. no new objects, just random
objs removal, assuming that we keep some of the objects allocated during
compilation.

e.g.

...

page1
  allocate baz.so
  allocate foo.o
page2
  allocate bar.o
  allocate foo.so
...
pageN



now `make clean`

page1:
  allocated baz.so
  empty

page2
  empty
  allocated foo.so

...

pageN

in the worst case, every page can turn out to be ALMOST_EMPTY.

	-ss


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  3:42         ` Sergey Senozhatsky
@ 2015-06-04  3:50           ` Minchan Kim
  2015-06-04  4:19             ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  3:50 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel

On Thu, Jun 04, 2015 at 12:42:30PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 12:30), Minchan Kim wrote:
> > > -- free objects in class: 5 (free-objs class capacity)
> > > -- page1: inuse 2
> > > -- page2: inuse 2
> > > -- page3: inuse 3
> > > -- page4: inuse 2
> > 
> > What scenario do you have a cocern?
> > Could you describe this example more clear?
> 
> you mean "how is this even possible"?

No, that's not what I meant. I couldn't understand your terms. Sorry.

What is "free-objs class capacity"?
Is page1 a zspage?

Let's use consistent terms between us.

For example, maxobj-per-zspage is 4.
A is allocated and used. X is allocated but not used.
so we can draw a zspage below.

        AAXX

So we can draw a linked list of several zspages as below

AAXX - AXXX - AAAX

Could you describe your problem again?

Sorry.


> 
> well, for example,
> 
> make -jX
> make clean
> 
> can introduce a significant fragmentation. no new objects, just random
> objs removal. assuming that we keep some of the objects, allocated during
> compilation.
> 
> e.g.
> 
> ...
> 
> page1
>   allocate baz.so
>   allocate foo.o
> page2
>   allocate bar.o
>   allocate foo.so
> ...
> pageN
> 
> 
> 
> now `make clean`
> 
> page1:
>   allocated baz.so
>   empty
> 
> page2
>   empty
>   allocated foo.so
> 
> ...
> 
> pageN
> 
> in the worst case, every page can turn out to be ALMOST_EMPTY.
> 
> 	-ss

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
  2015-06-04  3:50           ` Minchan Kim
@ 2015-06-04  4:19             ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  4:19 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
	linux-kernel

On (06/04/15 12:50), Minchan Kim wrote:
> > On (06/04/15 12:30), Minchan Kim wrote:
> > > 
> > > What scenario do you have a cocern?
> > > Could you describe this example more clear?
> > 
> > you mean "how is this even possible"?
> 
> No I meant. I couldn't understand your terms. Sorry.
> 
> What free-objs class capacity is?
> page1 is zspage?
> 
> Let's use consistent terms between us.
> 
> For example, maxobj-per-zspage is 4.
> A is allocated and used. X is allocated but not used.
> so we can draw a zspage below.
> 
>         AAXX
> 
> So we can draw several zspages linked list as below
> 
> AAXX - AXXX - AAAX
> 
> Could you describe your problem again?
> 
> Sorry.

My apologies.

yes, so:
-- free-objs class capacity -- how many unused allocated objects
we have in this class (in total).
-- page1..pageN -- zspages.

And I think that my example is utterly wrong and incorrect. My mistake.
Sorry for the noise.

	-ss


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
@ 2015-06-04  4:57   ` Minchan Kim
  2015-06-04  5:30     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  4:57 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky

On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> perform class compaction in zs_free(), if zs_free() has created
> a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.

Finally, I got realized your intention.

Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
which means to compact automatically when compr_data_size/mem_used_total
drops below the threshold, but I didn't try it because it could be done
by a userspace tool.
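
e.g. something as simple as this (only a sketch of such a tool, nothing
that exists today; the 0.90 threshold and the "run it periodically"
policy are completely made up, it only relies on the existing mm_stat
and compact attributes):

#include <stdio.h>

#define THRESHOLD	0.90	/* made-up: compact when compr/used < 90% */

int main(void)
{
	unsigned long long orig, compr, used;
	FILE *f = fopen("/sys/block/zram0/mm_stat", "r");

	/* mm_stat: orig_data_size compr_data_size mem_used_total ... */
	if (!f || fscanf(f, "%llu %llu %llu", &orig, &compr, &used) != 3)
		return 1;
	fclose(f);

	if (used && (double)compr / used < THRESHOLD) {
		f = fopen("/sys/block/zram0/compact", "w");
		if (!f)
			return 1;
		fputs("1", f);	/* same as `echo 1 > .../compact' */
		fclose(f);
	}
	return 0;
}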

Another reason I didn't try the approach is that in some corner cases it
could scan all of the zs objects repeatedly without freeing any zspage,
which could be a big overhead we should prevent, so we might need to add
some heuristic. As an example, we could delay the next few compaction
trials when we find that the previous trials all failed.
mm/compaction.c has a simple design to prevent pointless overhead,
but historically it caused pain several times and required more
complicated logic, and it's still painful.

Another thing I found recently is that it's not always a win for zram's
zsmalloc pool to be unfragmented. The fragmented space could be used
for storing upcoming compressed objects; it is wasted space at the
moment, but if we don't leave any holes (i.e., fragmented space)
because of frequent compaction, zsmalloc has to allocate a new zspage,
which could be allocated from a movable pageblock as a fallback of a
non-movable pageblock request on a system under high memory pressure,
so it accelerates the fragmentation problem of the system memory.

So, I want to pass the policy to userspace.
If we find it's really troublesome for userspace, then we need more
thinking.

Thanks.

> 
> probably it would make zs_can_compact() to return an estimated number
> of pages that potentially will be free and trigger auto-compaction
> only when it's above some limit (e.g. at least 4 zs pages); or put it
> under config option.
> 
> this also tweaks __zs_compact() -- we can't do reschedule
> anymore, waiting for new pages in the current class. so we
> compact as much as we can and return immediately if compaction
> is not possible anymore.
> 
> auto-compaction is not a replacement of manual compaction.
> 
> compiled linux kernel with auto-compaction:
> 
> cat /sys/block/zram0/mm_stat
> 2339885056 1601034235 1624076288        0 1624076288    19961     1106
> 
> performing additional manual compaction:
> 
> echo 1 > /sys/block/zram0/compact
> cat /sys/block/zram0/mm_stat
> 2339885056 1601034235 1624051712        0 1624076288    19961     1114
> 
> manual compaction was able to migrate additional 8 objects. so
> auto-compaction is 'good enough'.
> 
> TEST
> 
> this test copies a 1.3G linux kernel tar to mounted zram disk,
> and extracts it.
> 
> w/auto-compaction:
> 
> cat /sys/block/zram0/mm_stat
>  1171456    26006    86016        0    86016    32781        0
> 
> time tar xf linux-3.10.tar.gz -C linux
> 
> real    0m16.970s
> user    0m15.247s
> sys     0m8.477s
> 
> du -sh linux
> 2.0G    linux
> 
> cat /sys/block/zram0/mm_stat
> 3547353088 2993384270 3011088384        0 3011088384    24310      108
> 
> =====================================================================
> 
> w/o auto compaction:
> 
> cat /sys/block/zram0/mm_stat
>  1171456    26000    81920        0    81920    32781        0
> 
> time tar xf linux-3.10.tar.gz -C linux
> 
> real    0m16.983s
> user    0m15.267s
> sys     0m8.417s
> 
> du -sh linux
> 2.0G    linux
> 
> cat /sys/block/zram0/mm_stat
> 3548917760 2993566924 3011317760        0 3011317760    23928        0
> 
> =====================================================================
> 
> iozone shows that auto-compacted code runs faster in several
> tests, which is hardly trustworthy. anyway.
> 
> iozone -t 3 -R -r 16K -s 60M -I +Z
> 
>        test           base       auto-compact (compacted 66123 objs)
>    Initial write   1603682.25          1645112.38
>          Rewrite   2502243.31          2256570.31
>             Read   7040860.00          7130575.00
>          Re-read   7036490.75          7066744.25
>     Reverse Read   6617115.25          6155395.50
>      Stride read   6705085.50          6350030.38
>      Random read   6668497.75          6350129.38
>   Mixed workload   5494030.38          5091669.62
>     Random write   2526834.44          2500977.81
>           Pwrite   1656874.00          1663796.94
>            Pread   3322818.91          3359683.44
>           Fwrite   4090124.25          4099773.88
>            Fread   10358916.25         10324409.75
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  mm/zsmalloc.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index c2a640a..70bf481 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
>  
>  		while ((dst_page = isolate_target_page(class))) {
>  			cc.d_page = dst_page;
> -			/*
> -			 * If there is no more space in dst_page, resched
> -			 * and see if anyone had allocated another zspage.
> -			 */
> +
>  			if (!migrate_zspage(pool, class, &cc))
> -				break;
> +				goto out;
>  
>  			putback_zspage(pool, class, dst_page);
>  		}
>  
> -		/* Stop if we couldn't find slot */
> -		if (dst_page == NULL)
> +		if (!dst_page)
>  			break;
> -
>  		putback_zspage(pool, class, dst_page);
>  		putback_zspage(pool, class, src_page);
> -		spin_unlock(&class->lock);
> -		cond_resched();
> -		spin_lock(&class->lock);
>  	}
>  
> +out:
> +	if (dst_page)
> +		putback_zspage(pool, class, dst_page);
>  	if (src_page)
>  		putback_zspage(pool, class, src_page);
>  
>  	spin_unlock(&class->lock);
>  }
>  
> -
>  unsigned long zs_get_total_pages(struct zs_pool *pool)
>  {
>  	return atomic_long_read(&pool->pages_allocated);
> @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
>  	unpin_tag(handle);
>  
>  	free_handle(pool, handle);
> +
> +	/*
> +	 * actual fullness might have changed, __zs_compact() checks
> +	 * if compaction makes sense
> +	 */
> +	if (fullness == ZS_ALMOST_EMPTY)
> +		__zs_compact(pool, class);
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>  
> -- 
> 2.4.2.337.gfae46aa
> 

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-06-04  4:57   ` Minchan Kim
@ 2015-06-04  5:30     ` Sergey Senozhatsky
  2015-06-04  6:27       ` Minchan Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  5:30 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
	Sergey Senozhatsky

On (06/04/15 13:57), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > perform class compaction in zs_free(), if zs_free() has created
> > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> 
> Finally, I got realized your intention.
> 
> Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio
> which means to compact automatically when compr_data_size/mem_used_total
> is below than the threshold but I didn't try because it could be done
> by usertool.
> 
> Another reason I didn't try the approach is that it could scan all of
> zs_objects repeatedly withtout any freeing zspage in some corner cases,
> which could be big overhead we should prevent so we might add some
> heuristic. as an example, we could delay a few compaction trial when
> we found a few previous trials as all fails.

this is why I use zs_can_compact() -- to bail out of zs_compact() as soon
as possible, so useless scans are minimized (well, that's the expectation,
at least). I'm also thinking of a threshold-based solution -- do class
auto-compaction only if we can free X pages, for example.
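
e.g. (a quick sketch, completely untested; ZS_AUTO_COMPACT_THRESHOLD is
a made-up number, zs_stat_get()/get_maxobj_per_zspage() are the helpers
from this series):

/* how many pages can compaction of this class free (estimate) */
static unsigned long zs_can_compact(struct size_class *class)
{
	unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
				   zs_stat_get(class, OBJ_USED);

	obj_wasted /= get_maxobj_per_zspage(class->size,
					class->pages_per_zspage);

	return obj_wasted * class->pages_per_zspage;
}

/* made-up threshold: don't bother for less than 4 pages */
#define ZS_AUTO_COMPACT_THRESHOLD	4

/* hypothetical: called from zs_free() when a zspage became ALMOST_EMPTY */
static void zs_maybe_compact(struct zs_pool *pool, struct size_class *class)
{
	if (zs_can_compact(class) >= ZS_AUTO_COMPACT_THRESHOLD)
		__zs_compact(pool, class);
}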

the problem of compaction is that there is no compaction until you trigger
it.

and fragmented classes are not necessarily a win. if writes don't happen
to a fragmented class-X (and we basically can't tell if they will, nor can
we estimate; it's up to I/O and data patterns, compression algorithm, etc.)
then class-X stays fragmented w/o any use.


> It's simple design of mm/compaction.c to prevent pointless overhead
> but historically it made pains several times and required more
> complicated logics but it's still painful.
> 
> Other thing I found recently is that it's not always win zsmalloc
> for zram is not fragmented. The fragmented space could be used
> for storing upcoming compressed objects although it is wasted space
> at the moment but if we don't have any hole(ie, fragment space)
> via frequent compaction, zsmalloc should allocate a new zspage
> which could be allocated on movable pageblock by fallback of
> nonmovable pageblock request on highly memory pressure system
> so it accelerates fragment problem of the system memory.

yes, but compaction almost always leaves classes fragmented. I think
it's a corner case, when the number of unused allocated objects was
exactly the same as the number of objects that we migrated and the
number of migrated objects was exactly N*maxobj_per_zspage, so we
left the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
classes have 'holes' after compaction.


> So, I want to pass the policy to userspace.
> If we found it's really trobule on userspace, then, we need more
> thinking.

well, it can be put under an "aggressive compaction" or "automatic
compaction" config option.

	-ss

> Thanks.
> 
> > 
> > probably it would make zs_can_compact() to return an estimated number
> > of pages that potentially will be free and trigger auto-compaction
> > only when it's above some limit (e.g. at least 4 zs pages); or put it
> > under config option.
> > 
> > this also tweaks __zs_compact() -- we can't do reschedule
> > anymore, waiting for new pages in the current class. so we
> > compact as much as we can and return immediately if compaction
> > is not possible anymore.
> > 
> > auto-compaction is not a replacement of manual compaction.
> > 
> > compiled linux kernel with auto-compaction:
> > 
> > cat /sys/block/zram0/mm_stat
> > 2339885056 1601034235 1624076288        0 1624076288    19961     1106
> > 
> > performing additional manual compaction:
> > 
> > echo 1 > /sys/block/zram0/compact
> > cat /sys/block/zram0/mm_stat
> > 2339885056 1601034235 1624051712        0 1624076288    19961     1114
> > 
> > manual compaction was able to migrate additional 8 objects. so
> > auto-compaction is 'good enough'.
> > 
> > TEST
> > 
> > this test copies a 1.3G linux kernel tar to mounted zram disk,
> > and extracts it.
> > 
> > w/auto-compaction:
> > 
> > cat /sys/block/zram0/mm_stat
> >  1171456    26006    86016        0    86016    32781        0
> > 
> > time tar xf linux-3.10.tar.gz -C linux
> > 
> > real    0m16.970s
> > user    0m15.247s
> > sys     0m8.477s
> > 
> > du -sh linux
> > 2.0G    linux
> > 
> > cat /sys/block/zram0/mm_stat
> > 3547353088 2993384270 3011088384        0 3011088384    24310      108
> > 
> > =====================================================================
> > 
> > w/o auto compaction:
> > 
> > cat /sys/block/zram0/mm_stat
> >  1171456    26000    81920        0    81920    32781        0
> > 
> > time tar xf linux-3.10.tar.gz -C linux
> > 
> > real    0m16.983s
> > user    0m15.267s
> > sys     0m8.417s
> > 
> > du -sh linux
> > 2.0G    linux
> > 
> > cat /sys/block/zram0/mm_stat
> > 3548917760 2993566924 3011317760        0 3011317760    23928        0
> > 
> > =====================================================================
> > 
> > iozone shows that auto-compacted code runs faster in several
> > tests, which is hardly trustworthy. anyway.
> > 
> > iozone -t 3 -R -r 16K -s 60M -I +Z
> > 
> >        test           base       auto-compact (compacted 66123 objs)
> >    Initial write   1603682.25          1645112.38
> >          Rewrite   2502243.31          2256570.31
> >             Read   7040860.00          7130575.00
> >          Re-read   7036490.75          7066744.25
> >     Reverse Read   6617115.25          6155395.50
> >      Stride read   6705085.50          6350030.38
> >      Random read   6668497.75          6350129.38
> >   Mixed workload   5494030.38          5091669.62
> >     Random write   2526834.44          2500977.81
> >           Pwrite   1656874.00          1663796.94
> >            Pread   3322818.91          3359683.44
> >           Fwrite   4090124.25          4099773.88
> >            Fread   10358916.25         10324409.75
> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > ---
> >  mm/zsmalloc.c | 25 +++++++++++++------------
> >  1 file changed, 13 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index c2a640a..70bf481 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
> >  
> >  		while ((dst_page = isolate_target_page(class))) {
> >  			cc.d_page = dst_page;
> > -			/*
> > -			 * If there is no more space in dst_page, resched
> > -			 * and see if anyone had allocated another zspage.
> > -			 */
> > +
> >  			if (!migrate_zspage(pool, class, &cc))
> > -				break;
> > +				goto out;
> >  
> >  			putback_zspage(pool, class, dst_page);
> >  		}
> >  
> > -		/* Stop if we couldn't find slot */
> > -		if (dst_page == NULL)
> > +		if (!dst_page)
> >  			break;
> > -
> >  		putback_zspage(pool, class, dst_page);
> >  		putback_zspage(pool, class, src_page);
> > -		spin_unlock(&class->lock);
> > -		cond_resched();
> > -		spin_lock(&class->lock);
> >  	}
> >  
> > +out:
> > +	if (dst_page)
> > +		putback_zspage(pool, class, dst_page);
> >  	if (src_page)
> >  		putback_zspage(pool, class, src_page);
> >  
> >  	spin_unlock(&class->lock);
> >  }
> >  
> > -
> >  unsigned long zs_get_total_pages(struct zs_pool *pool)
> >  {
> >  	return atomic_long_read(&pool->pages_allocated);
> > @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
> >  	unpin_tag(handle);
> >  
> >  	free_handle(pool, handle);
> > +
> > +	/*
> > +	 * actual fullness might have changed, __zs_compact() checks
> > +	 * if compaction makes sense
> > +	 */
> > +	if (fullness == ZS_ALMOST_EMPTY)
> > +		__zs_compact(pool, class);
> >  }
> >  EXPORT_SYMBOL_GPL(zs_free);
> >  
> > -- 
> > 2.4.2.337.gfae46aa
> > 
> 
> -- 
> Kind regards,
> Minchan Kim
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-06-04  5:30     ` Sergey Senozhatsky
@ 2015-06-04  6:27       ` Minchan Kim
  2015-06-04  7:04         ` Minchan Kim
  2015-06-04  7:28         ` Sergey Senozhatsky
  0 siblings, 2 replies; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  6:27 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel

On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 13:57), Minchan Kim wrote:
> > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > perform class compaction in zs_free(), if zs_free() has created
> > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> > 
> > Finally, I got realized your intention.
> > 
> > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio
> > which means to compact automatically when compr_data_size/mem_used_total
> > is below than the threshold but I didn't try because it could be done
> > by usertool.
> > 
> > Another reason I didn't try the approach is that it could scan all of
> > zs_objects repeatedly withtout any freeing zspage in some corner cases,
> > which could be big overhead we should prevent so we might add some
> > heuristic. as an example, we could delay a few compaction trial when
> > we found a few previous trials as all fails.
> 
> this is why I use zs_can_compact() -- to evict from zs_compact() as soon
> as possible. so useless scans are minimized (well, at least expected). I'm
> also thinking of a threshold-based solution -- do class auto-compaction
> only if we can free X pages, for example.
> 
> the problem of compaction is that there is no compaction until you trigger
> it.
> 
> and fragmented classes are not necessarily a win. if writes don't happen
> to a fragmented class-X (and we basically can't tell if they will, nor we
> can estimate; it's up to I/O and data patterns, compression algorithm, etc.)
> then class-X stays fragmented w/o any use.

The problem is that migrating objects/freeing the old zspage/allocating a
new zspage is not cheap, either.
If the system has no problem with a small amount of fragmented space,
there is no point in keeping such overhead.

So, ideally we should trigger compaction once we realize the system
is in trouble, but I don't have any good idea how to detect it.
That's why I wanted to rely on the decision from the user via
compact_threshold_ratio.

> 
> > It's simple design of mm/compaction.c to prevent pointless overhead
> > but historically it made pains several times and required more
> > complicated logics but it's still painful.
> > 
> > Other thing I found recently is that it's not always win zsmalloc
> > for zram is not fragmented. The fragmented space could be used
> > for storing upcoming compressed objects although it is wasted space
> > at the moment but if we don't have any hole(ie, fragment space)
> > via frequent compaction, zsmalloc should allocate a new zspage
> > which could be allocated on movable pageblock by fallback of
> > nonmovable pageblock request on highly memory pressure system
> > so it accelerates fragment problem of the system memory.
> 
> yes, but compaction almost always leave classes fragmented. I think
> it's a corner case, when the number of unused allocated objects was
> exactly the same as the number of objects that we migrated and the
> number of migrated objects was exactly N*maxobj_per_zspage, so we
> left the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> classes have 'holes' after compaction.
> 
> 
> > So, I want to pass the policy to userspace.
> > If we found it's really trobule on userspace, then, we need more
> > thinking.
> 
> well, it can be under config "aggressive compaction" or "automatic
> compaction" option.
> 

If you really want to do it automatically without any feedback
from userspace, we should find a better algorithm.

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-06-04  6:27       ` Minchan Kim
@ 2015-06-04  7:04         ` Minchan Kim
  2015-06-04 14:47           ` Sergey Senozhatsky
  2015-06-04  7:28         ` Sergey Senozhatsky
  1 sibling, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04  7:04 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel

On Thu, Jun 04, 2015 at 03:27:12PM +0900, Minchan Kim wrote:
> On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> > On (06/04/15 13:57), Minchan Kim wrote:
> > > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > > perform class compaction in zs_free(), if zs_free() has created
> > > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> > > 
> > > Finally, I got realized your intention.
> > > 
> > > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio
> > > which means to compact automatically when compr_data_size/mem_used_total
> > > is below than the threshold but I didn't try because it could be done
> > > by usertool.
> > > 
> > > Another reason I didn't try the approach is that it could scan all of
> > > zs_objects repeatedly withtout any freeing zspage in some corner cases,
> > > which could be big overhead we should prevent so we might add some
> > > heuristic. as an example, we could delay a few compaction trial when
> > > we found a few previous trials as all fails.
> > 
> > this is why I use zs_can_compact() -- to evict from zs_compact() as soon
> > as possible. so useless scans are minimized (well, at least expected). I'm
> > also thinking of a threshold-based solution -- do class auto-compaction
> > only if we can free X pages, for example.
> > 
> > the problem of compaction is that there is no compaction until you trigger
> > it.
> > 
> > and fragmented classes are not necessarily a win. if writes don't happen
> > to a fragmented class-X (and we basically can't tell if they will, nor we
> > can estimate; it's up to I/O and data patterns, compression algorithm, etc.)
> > then class-X stays fragmented w/o any use.
> 
> The problem is migration/freeing old zspage/allocating new zspage is
> not a cheap, either.
> If the system has no problem with small fragmented space, there is
> no point to keep such overheads.
> 
> So, ideal is we should trigger compaction once we realized system
> is trouble but I don't have any good idea to detect it.
> That's why i wanted to rely on the decision from user via
> compact_threshold_ratio.
> 
> > 
> > > It's simple design of mm/compaction.c to prevent pointless overhead
> > > but historically it made pains several times and required more
> > > complicated logics but it's still painful.
> > > 
> > > Other thing I found recently is that it's not always win zsmalloc
> > > for zram is not fragmented. The fragmented space could be used
> > > for storing upcoming compressed objects although it is wasted space
> > > at the moment but if we don't have any hole(ie, fragment space)
> > > via frequent compaction, zsmalloc should allocate a new zspage
> > > which could be allocated on movable pageblock by fallback of
> > > nonmovable pageblock request on highly memory pressure system
> > > so it accelerates fragment problem of the system memory.
> > 
> > yes, but compaction almost always leave classes fragmented. I think
> > it's a corner case, when the number of unused allocated objects was
> > exactly the same as the number of objects that we migrated and the
> > number of migrated objects was exactly N*maxobj_per_zspage, so we
> > left the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> > classes have 'holes' after compaction.
> > 
> > 
> > > So, I want to pass the policy to userspace.
> > > If we found it's really trobule on userspace, then, we need more
> > > thinking.
> > 
> > well, it can be under config "aggressive compaction" or "automatic
> > compaction" option.
> > 
> 
> If you really want to do it automatically without any feedback
> form the userspace, we should find better algorithm.

How about using a slab shrinker?
When there is memory pressure, it would be called by the VM and we would
try compaction without the user's intervention, and excessive object
scanning should be avoided by your zs_can_compact().
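
Something like the below, maybe (only a rough sketch, assuming zs_pool
grows a `struct shrinker' field and that zs_can_compact()/zs_compact()
report freeable/freed pages):

static unsigned long zs_shrinker_count(struct shrinker *shrinker,
					struct shrink_control *sc)
{
	int i;
	unsigned long pages_to_free = 0;
	struct zs_pool *pool = container_of(shrinker, struct zs_pool,
						shrinker);

	for (i = zs_size_classes - 1; i >= 0; i--) {
		struct size_class *class = pool->size_class[i];

		/* skip classes merged into a bigger one */
		if (!class || class->index != i)
			continue;

		pages_to_free += zs_can_compact(class);
	}

	return pages_to_free;
}

static unsigned long zs_shrinker_scan(struct shrinker *shrinker,
					struct shrink_control *sc)
{
	unsigned long freed;
	struct zs_pool *pool = container_of(shrinker, struct zs_pool,
						shrinker);

	/* zs_compact() walks all classes and skips non-compactable ones */
	freed = zs_compact(pool);

	return freed ? freed : SHRINK_STOP;
}

static int zs_register_shrinker(struct zs_pool *pool)
{
	pool->shrinker.scan_objects = zs_shrinker_scan;
	pool->shrinker.count_objects = zs_shrinker_count;
	pool->shrinker.batch = 0;
	pool->shrinker.seeks = DEFAULT_SEEKS;

	return register_shrinker(&pool->shrinker);
}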

The concern I had about fragmentation spreading out all over pageblocks
should be solved as a separate issue. I'm planning to make zsmalloc'ed
pages migratable. I hope we can work that out first, to prevent automatic
compaction from causing heavy memory fragmentation on the system.

> 
> -- 
> Kind regards,
> Minchan Kim

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-06-04  6:27       ` Minchan Kim
  2015-06-04  7:04         ` Minchan Kim
@ 2015-06-04  7:28         ` Sergey Senozhatsky
  1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04  7:28 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
	linux-kernel

On (06/04/15 15:27), Minchan Kim wrote:
[..]
> 
> The problem is that migrating objects, freeing the old zspage and
> allocating a new zspage are not cheap, either.
> If the system has no problem with a small amount of fragmented space,
> there is no point in paying such overhead.
>
> So, ideally we should trigger compaction once we realize the system is
> in trouble, but I don't have any good idea how to detect that.
> That's why I wanted to rely on the decision from the user via
> compact_threshold_ratio.

that'll be an extremely hard knob to understand.

well, we can do something like
-- don't let the number of "CLASS_ALMOST_EMPTY" zspages become N times
greater than the number of "CLASS_ALMOST_FULL" ones.

or

-- don't let the pages in ZS_ALMOST_EMPTY zspages contribute more than 70%
of the class memory usage. that is, if 70% of all pages allocated for this
class belong to ZS_ALMOST_EMPTY zspages, we can potentially compact it.
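
(a rough sketch of how those two checks could be evaluated per class;
CLASS_ALMOST_EMPTY/CLASS_ALMOST_FULL are used as in the text above, while
class_total_pages(), the N = 4 ratio and the 70% figure are example
assumptions, not proposed values:)

/*
 * Illustration only: decide whether a class looks fragmented enough
 * to be worth compacting.  zs_stat_get() reads the per-class counters;
 * class_total_pages() is an assumed helper returning the number of
 * pages currently allocated for the class.
 */
#define EMPTY_TO_FULL_RATIO     4       /* the "N times greater" */
#define EMPTY_PAGES_THRESHOLD   70      /* percent of class memory */

static bool class_worth_compacting(struct size_class *class)
{
        unsigned long almost_empty = zs_stat_get(class, CLASS_ALMOST_EMPTY);
        unsigned long almost_full = zs_stat_get(class, CLASS_ALMOST_FULL);
        unsigned long empty_pages, total_pages;

        /* check 1: too many almost-empty zspages per almost-full zspage */
        if (almost_empty > EMPTY_TO_FULL_RATIO * almost_full)
                return true;

        /* check 2: almost-empty zspages hold most of the class pages */
        empty_pages = almost_empty * class->pages_per_zspage;
        total_pages = class_total_pages(class);
        if (total_pages &&
            empty_pages * 100 / total_pages >= EMPTY_PAGES_THRESHOLD)
                return true;

        return false;
}
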

> > 
> > > mm/compaction.c has a simple design meant to prevent pointless
> > > overhead, but historically it has caused pain several times and
> > > required more complicated logic, and it's still painful.
> > > 
> > > Another thing I found recently is that it is not always a win for
> > > zsmalloc (as used by zram) to be unfragmented. The fragmented space,
> > > while wasted at the moment, could be used to store upcoming compressed
> > > objects; if frequent compaction leaves no holes (i.e. fragment space),
> > > zsmalloc has to allocate a new zspage, which on a system under heavy
> > > memory pressure could land on a movable pageblock as a fallback from a
> > > non-movable pageblock request, and that accelerates the fragmentation
> > > problem of system memory.
> > 
> > yes, but compaction almost always leaves classes fragmented. The only
> > corner case is when the number of unused allocated objects is exactly
> > the same as the number of objects we migrated, and the number of
> > migrated objects is exactly N*maxobj_per_zspage, so we leave the class
> > without any unused objects (OBJ_ALLOCATED == OBJ_USED). Otherwise,
> > classes still have 'holes' after compaction.
> > 
> > 
> > > So, I want to pass the policy to userspace.
> > > If we find it's really troublesome for userspace, then we need to
> > > think about it more.
> > 
> > well, it could live behind an "aggressive compaction" or "automatic
> > compaction" config option.
> > 
> 
> If you really want to do it automatically without any feedback
> from userspace, we should find a better algorithm.

ok. I'll drop the auto-compaction part for now and will resend the
general/minor zsmalloc tweaks today.

	-ss


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
  2015-06-04  7:04         ` Minchan Kim
@ 2015-06-04 14:47           ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 14:47 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
	linux-kernel

On (06/04/15 16:04), Minchan Kim wrote:
[..]
> How about using a slab shrinker?
> Under memory pressure it would be called by the VM, so we would try
> compaction without the user's intervention, and excessive object
> scanning should be avoided by your zs_can_compact().

hm, interesting.

ok, I have a patch to trigger compaction from the shrinker, but I need
to test it more.

will send the updated patchset tomorrow, I think.

	-ss

> The concern I had, about fragmentation spreading all over the pageblocks,
> should be solved as a separate issue. I'm planning to make zsmalloc'ed
> pages migratable. I hope we can work that out first, so that automatic
> compaction does not make the system's memory fragmentation worse.
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2015-06-04 14:48 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
2015-06-04  2:04   ` Minchan Kim
2015-06-04  2:10     ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
2015-06-04  2:18   ` Minchan Kim
2015-06-04  2:34     ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
2015-06-04  2:55   ` Minchan Kim
2015-06-04  3:15     ` Sergey Senozhatsky
2015-06-04  3:30       ` Minchan Kim
2015-06-04  3:42         ` Sergey Senozhatsky
2015-06-04  3:50           ` Minchan Kim
2015-06-04  4:19             ` Sergey Senozhatsky
2015-06-04  3:31       ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
2015-06-04  3:14   ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
2015-06-04  4:57   ` Minchan Kim
2015-06-04  5:30     ` Sergey Senozhatsky
2015-06-04  6:27       ` Minchan Kim
2015-06-04  7:04         ` Minchan Kim
2015-06-04 14:47           ` Sergey Senozhatsky
2015-06-04  7:28         ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
2015-06-03  5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).