* [RFC][PATCH 00/10] zsmalloc auto-compaction
@ 2015-05-29 15:05 Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
` (10 more replies)
0 siblings, 11 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
Hello,
RFC
this is 4.3 material, but I wanted to publish it sooner to gather
responses and to settle it down before the 4.3 merge window opens.

in short, this series tweaks zsmalloc's compaction and adds
auto-compaction support. auto-compaction is not aimed to replace
manual compaction; instead, it's supposed to be good enough. it
surely slows down zsmalloc in some scenarios, though a simple
un-tar test didn't show any significant performance difference.

quote from commit 0007:
this test copies a 1.3G linux kernel tar to a mounted zram disk
and extracts it.
w/auto-compaction:
cat /sys/block/zram0/mm_stat
1171456 26006 86016 0 86016 32781 0
time tar xf linux-3.10.tar.gz -C linux
real 0m16.970s
user 0m15.247s
sys 0m8.477s
du -sh linux
2.0G linux
cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384 0 3011088384 24310 108
=====================================================================
w/o auto-compaction:
cat /sys/block/zram0/mm_stat
1171456 26000 81920 0 81920 32781 0
time tar xf linux-3.10.tar.gz -C linux
real 0m16.983s
user 0m15.267s
sys 0m8.417s
du -sh linux
2.0G linux
cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760 0 3011317760 23928 0
Sergey Senozhatsky (10):
zsmalloc: drop unused variable `nr_to_migrate'
zsmalloc: always keep per-class stats
zsmalloc: introduce zs_can_compact() function
zsmalloc: cosmetic compaction code adjustments
zsmalloc: add `num_migrated' to zs_pool
zsmalloc: move compaction functions
zsmalloc: introduce auto-compact support
zsmalloc: export zs_pool `num_migrated'
zram: remove `num_migrated' from zram_stats
zsmalloc: lower ZS_ALMOST_FULL waterline
drivers/block/zram/zram_drv.c | 12 +-
drivers/block/zram/zram_drv.h | 1 -
include/linux/zsmalloc.h | 1 +
mm/zsmalloc.c | 578 +++++++++++++++++++++---------------------
4 files changed, 296 insertions(+), 296 deletions(-)
--
2.4.2.337.gfae46aa
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-04 2:04 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
` (9 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
__zs_compact() does not use `nr_to_migrate', drop it.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 33d5126..e615b31 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1701,7 +1701,6 @@ static struct page *isolate_source_page(struct size_class *class)
static unsigned long __zs_compact(struct zs_pool *pool,
struct size_class *class)
{
- int nr_to_migrate;
struct zs_compact_control cc;
struct page *src_page;
struct page *dst_page = NULL;
@@ -1712,8 +1711,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
BUG_ON(!is_first_page(src_page));
- /* The goal is to migrate all live objects in source page */
- nr_to_migrate = src_page->inuse;
cc.index = 0;
cc.s_page = src_page;
@@ -1728,7 +1725,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
putback_zspage(pool, class, dst_page);
nr_total_migrated += cc.nr_migrated;
- nr_to_migrate -= cc.nr_migrated;
}
/* Stop if we couldn't find slot */
--
2.4.2.337.gfae46aa
* [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-04 2:18 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
` (8 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
always account per-class `zs_size_stat' stats. this data will
help us make better decisions during compaction. we are especially
interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
class compaction will result in any memory gain.

for instance, we know the number of allocated objects in the class,
the number of objects being used (so we also know how many objects
are not used) and the number of objects per zspage. so we can
estimate how many pages compaction can free (pages that will turn
into ZS_EMPTY during compaction).
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
1 file changed, 12 insertions(+), 37 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index e615b31..778b8db 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -169,14 +169,12 @@ enum zs_stat_type {
NR_ZS_STAT_TYPE,
};
-#ifdef CONFIG_ZSMALLOC_STAT
-
-static struct dentry *zs_stat_root;
-
struct zs_size_stat {
unsigned long objs[NR_ZS_STAT_TYPE];
};
+#ifdef CONFIG_ZSMALLOC_STAT
+static struct dentry *zs_stat_root;
#endif
/*
@@ -201,25 +199,21 @@ static int zs_size_classes;
static const int fullness_threshold_frac = 4;
struct size_class {
+ spinlock_t lock;
+ struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
/*
* Size of objects stored in this class. Must be multiple
* of ZS_ALIGN.
*/
- int size;
- unsigned int index;
+ int size;
+ unsigned int index;
/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
- int pages_per_zspage;
- /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
- bool huge;
-
-#ifdef CONFIG_ZSMALLOC_STAT
- struct zs_size_stat stats;
-#endif
-
- spinlock_t lock;
+ int pages_per_zspage;
+ struct zs_size_stat stats;
- struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
+ /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
+ bool huge;
};
/*
@@ -439,8 +433,6 @@ static int get_size_class_index(int size)
return min(zs_size_classes - 1, idx);
}
-#ifdef CONFIG_ZSMALLOC_STAT
-
static inline void zs_stat_inc(struct size_class *class,
enum zs_stat_type type, unsigned long cnt)
{
@@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
return class->stats.objs[type];
}
+#ifdef CONFIG_ZSMALLOC_STAT
+
static int __init zs_stat_init(void)
{
if (!debugfs_initialized())
@@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
}
#else /* CONFIG_ZSMALLOC_STAT */
-
-static inline void zs_stat_inc(struct size_class *class,
- enum zs_stat_type type, unsigned long cnt)
-{
-}
-
-static inline void zs_stat_dec(struct size_class *class,
- enum zs_stat_type type, unsigned long cnt)
-{
-}
-
-static inline unsigned long zs_stat_get(struct size_class *class,
- enum zs_stat_type type)
-{
- return 0;
-}
-
static int __init zs_stat_init(void)
{
return 0;
@@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
static inline void zs_pool_stat_destroy(struct zs_pool *pool)
{
}
-
#endif
@@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
class->size, class->pages_per_zspage));
atomic_long_sub(class->pages_per_zspage,
&pool->pages_allocated);
-
free_zspage(first_page);
}
}
--
2.4.2.337.gfae46aa
* [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-04 2:55 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
` (7 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
this function checks whether class compaction will free any pages.
rephrasing: do we have enough unused objects to form at least one
ZS_EMPTY zspage and free it. compaction of a class is aborted if it
will not result in any (further) memory savings.
EXAMPLE (this debug output is not part of this patch set):
-- class size
-- number of allocated objects
-- number of used objects
-- number of objects per zspage
-- estimated number of pages that will be freed
[..]
[ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
[ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5
[ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4
[ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3
[ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2
[ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1
[ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1
[ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0
[ 3303.108998] class-2720: Compaction is useless
[ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1
[ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1
[ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0
[ 3303.109011] class-2336: Compaction is useless
[ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0
[ 3303.109014] class-1808: Compaction is useless
[ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0
[ 3303.109017] class-1744: Compaction is useless
[ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0
[ 3303.109020] class-1536: Compaction is useless
[ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0
[ 3303.109023] class-1488: Compaction is useless
[ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0
[ 3303.109026] class-1360: Compaction is useless
[ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0
[ 3303.109029] class-1248: Compaction is useless
[ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0
[ 3303.109032] class-1216: Compaction is useless
[ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2
[ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 pages-tofree:1
[ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0
[ 3303.109043] class-1104: Compaction is useless
[..]
every "Compaction is useless" line indicates that we saved some CPU
cycles.

for example, class-1104 has
605 objects allocated
599 objects used
11 objects per zspage

even if we have an ALMOST_EMPTY page, we still don't have enough room
to move all of its objects elsewhere and free the page; compaction
would not make much sense here, so it's better to just leave the class
as is.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 778b8db..9ef6f15 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class)
return page;
}
+/*
+ * Make sure that we actually can compact this class,
+ * IOW if migration will empty at least one page.
+ *
+ * should be called under class->lock
+ */
+static bool zs_can_compact(struct size_class *class)
+{
+ /*
+ * calculate how many unused allocated objects we
+ * have and see if we can free any zspages. otherwise,
+ * compaction can just move objects back and forth w/o
+ * any memory gain.
+ */
+ unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
+ zs_stat_get(class, OBJ_USED);
+
+ ret /= get_maxobj_per_zspage(class->size,
+ class->pages_per_zspage);
+ return ret > 0;
+}
+
static unsigned long __zs_compact(struct zs_pool *pool,
struct size_class *class)
{
@@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
BUG_ON(!is_first_page(src_page));
+ if (!zs_can_compact(class))
+ break;
+
cc.index = 0;
cc.s_page = src_page;
--
2.4.2.337.gfae46aa
* [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (2 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-04 3:14 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
` (6 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
change zs_object_copy() argument order to be (DST, SRC) rather
than (SRC, DST). copy/move functions usually take arguments in
(to, from) order.

rename alloc_target_page() to isolate_target_page(). this
function doesn't allocate anything; it isolates a target page,
pretty much like isolate_source_page().
tweak __zs_compact() comment.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9ef6f15..fa72a81 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
}
EXPORT_SYMBOL_GPL(zs_free);
-static void zs_object_copy(unsigned long src, unsigned long dst,
+static void zs_object_copy(unsigned long dst, unsigned long src,
struct size_class *class)
{
struct page *s_page, *d_page;
@@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
used_obj = handle_to_obj(handle);
free_obj = obj_malloc(d_page, class, handle);
- zs_object_copy(used_obj, free_obj, class);
+ zs_object_copy(free_obj, used_obj, class);
index++;
record_obj(handle, free_obj);
unpin_tag(handle);
@@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
return ret;
}
-static struct page *alloc_target_page(struct size_class *class)
+static struct page *isolate_target_page(struct size_class *class)
{
int i;
struct page *page;
@@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
cc.index = 0;
cc.s_page = src_page;
- while ((dst_page = alloc_target_page(class))) {
+ while ((dst_page = isolate_target_page(class))) {
cc.d_page = dst_page;
/*
- * If there is no more space in dst_page, try to
- * allocate another zspage.
+ * If there is no more space in dst_page, resched
+ * and see if anyone had allocated another zspage.
*/
if (!migrate_zspage(pool, class, &cc))
break;
--
2.4.2.337.gfae46aa
* [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (3 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
` (5 subsequent siblings)
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
remove the number of migrated objects from `zs_compact_control'
and make it a `zs_pool' member. `zs_compact_control' has a limited
lifespan; we lose it when zs_compact() returns to zram. to keep
track of objects migrated during auto-compaction we need to store
this number in zs_pool.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 36 ++++++++++++++----------------------
1 file changed, 14 insertions(+), 22 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fa72a81..54eefc3 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -237,16 +237,19 @@ struct link_free {
};
struct zs_pool {
- char *name;
+ char *name;
- struct size_class **size_class;
- struct kmem_cache *handle_cachep;
+ struct size_class **size_class;
+ struct kmem_cache *handle_cachep;
- gfp_t flags; /* allocation flags used when growing pool */
- atomic_long_t pages_allocated;
+ /* allocation flags used when growing pool */
+ gfp_t flags;
+ atomic_long_t pages_allocated;
+ /* how many of objects were migrated */
+ unsigned long num_migrated;
#ifdef CONFIG_ZSMALLOC_STAT
- struct dentry *stat_dentry;
+ struct dentry *stat_dentry;
#endif
};
@@ -1576,8 +1579,6 @@ struct zs_compact_control {
/* Starting object index within @s_page which used for live object
* in the subpage. */
int index;
- /* how many of objects are migrated */
- int nr_migrated;
};
static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
@@ -1588,7 +1589,6 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
struct page *s_page = cc->s_page;
struct page *d_page = cc->d_page;
unsigned long index = cc->index;
- int nr_migrated = 0;
int ret = 0;
while (1) {
@@ -1615,13 +1615,12 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
record_obj(handle, free_obj);
unpin_tag(handle);
obj_free(pool, class, used_obj);
- nr_migrated++;
+ pool->num_migrated++;
}
/* Remember last position in this iteration */
cc->s_page = s_page;
cc->index = index;
- cc->nr_migrated = nr_migrated;
return ret;
}
@@ -1695,13 +1694,11 @@ static bool zs_can_compact(struct size_class *class)
return ret > 0;
}
-static unsigned long __zs_compact(struct zs_pool *pool,
- struct size_class *class)
+static void __zs_compact(struct zs_pool *pool, struct size_class *class)
{
struct zs_compact_control cc;
struct page *src_page;
struct page *dst_page = NULL;
- unsigned long nr_total_migrated = 0;
spin_lock(&class->lock);
while ((src_page = isolate_source_page(class))) {
@@ -1724,7 +1721,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
break;
putback_zspage(pool, class, dst_page);
- nr_total_migrated += cc.nr_migrated;
}
/* Stop if we couldn't find slot */
@@ -1734,7 +1730,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
putback_zspage(pool, class, dst_page);
putback_zspage(pool, class, src_page);
spin_unlock(&class->lock);
- nr_total_migrated += cc.nr_migrated;
cond_resched();
spin_lock(&class->lock);
}
@@ -1743,14 +1738,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
putback_zspage(pool, class, src_page);
spin_unlock(&class->lock);
-
- return nr_total_migrated;
}
unsigned long zs_compact(struct zs_pool *pool)
{
int i;
- unsigned long nr_migrated = 0;
struct size_class *class;
for (i = zs_size_classes - 1; i >= 0; i--) {
@@ -1759,10 +1751,10 @@ unsigned long zs_compact(struct zs_pool *pool)
continue;
if (class->index != i)
continue;
- nr_migrated += __zs_compact(pool, class);
+ __zs_compact(pool, class);
}
-
- return nr_migrated;
+ /* can be a bit outdated */
+ return pool->num_migrated;
}
EXPORT_SYMBOL_GPL(zs_compact);
--
2.4.2.337.gfae46aa
* [RFC][PATCH 06/10] zsmalloc: move compaction functions
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (4 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
` (4 subsequent siblings)
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
this patch simply moves compaction functions, so we can call the
static `__zs_compact()' (and friends) from zs_free().
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 426 +++++++++++++++++++++++++++++-----------------------------
1 file changed, 215 insertions(+), 211 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 54eefc3..c2a640a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -321,6 +321,7 @@ static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp,
*handle = zs_malloc(pool, size);
return *handle ? 0 : -1;
}
+
static void zs_zpool_free(void *pool, unsigned long handle)
{
zs_free(pool, handle);
@@ -352,6 +353,7 @@ static void *zs_zpool_map(void *pool, unsigned long handle,
return zs_map_object(pool, handle, zs_mm);
}
+
static void zs_zpool_unmap(void *pool, unsigned long handle)
{
zs_unmap_object(pool, handle);
@@ -590,7 +592,6 @@ static inline void zs_pool_stat_destroy(struct zs_pool *pool)
}
#endif
-
/*
* For each size class, zspages are divided into different groups
* depending on how "full" they are. This was done so that we could
@@ -1117,7 +1118,6 @@ out:
/* enable page faults to match kunmap_atomic() return conditions */
pagefault_enable();
}
-
#endif /* CONFIG_PGTABLE_MAPPING */
static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
@@ -1207,115 +1207,6 @@ static bool zspage_full(struct page *page)
return page->inuse == page->objects;
}
-unsigned long zs_get_total_pages(struct zs_pool *pool)
-{
- return atomic_long_read(&pool->pages_allocated);
-}
-EXPORT_SYMBOL_GPL(zs_get_total_pages);
-
-/**
- * zs_map_object - get address of allocated object from handle.
- * @pool: pool from which the object was allocated
- * @handle: handle returned from zs_malloc
- *
- * Before using an object allocated from zs_malloc, it must be mapped using
- * this function. When done with the object, it must be unmapped using
- * zs_unmap_object.
- *
- * Only one object can be mapped per cpu at a time. There is no protection
- * against nested mappings.
- *
- * This function returns with preemption and page faults disabled.
- */
-void *zs_map_object(struct zs_pool *pool, unsigned long handle,
- enum zs_mapmode mm)
-{
- struct page *page;
- unsigned long obj, obj_idx, off;
-
- unsigned int class_idx;
- enum fullness_group fg;
- struct size_class *class;
- struct mapping_area *area;
- struct page *pages[2];
- void *ret;
-
- BUG_ON(!handle);
-
- /*
- * Because we use per-cpu mapping areas shared among the
- * pools/users, we can't allow mapping in interrupt context
- * because it can corrupt another users mappings.
- */
- BUG_ON(in_interrupt());
-
- /* From now on, migration cannot move the object */
- pin_tag(handle);
-
- obj = handle_to_obj(handle);
- obj_to_location(obj, &page, &obj_idx);
- get_zspage_mapping(get_first_page(page), &class_idx, &fg);
- class = pool->size_class[class_idx];
- off = obj_idx_to_offset(page, obj_idx, class->size);
-
- area = &get_cpu_var(zs_map_area);
- area->vm_mm = mm;
- if (off + class->size <= PAGE_SIZE) {
- /* this object is contained entirely within a page */
- area->vm_addr = kmap_atomic(page);
- ret = area->vm_addr + off;
- goto out;
- }
-
- /* this object spans two pages */
- pages[0] = page;
- pages[1] = get_next_page(page);
- BUG_ON(!pages[1]);
-
- ret = __zs_map_object(area, pages, off, class->size);
-out:
- if (!class->huge)
- ret += ZS_HANDLE_SIZE;
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(zs_map_object);
-
-void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
-{
- struct page *page;
- unsigned long obj, obj_idx, off;
-
- unsigned int class_idx;
- enum fullness_group fg;
- struct size_class *class;
- struct mapping_area *area;
-
- BUG_ON(!handle);
-
- obj = handle_to_obj(handle);
- obj_to_location(obj, &page, &obj_idx);
- get_zspage_mapping(get_first_page(page), &class_idx, &fg);
- class = pool->size_class[class_idx];
- off = obj_idx_to_offset(page, obj_idx, class->size);
-
- area = this_cpu_ptr(&zs_map_area);
- if (off + class->size <= PAGE_SIZE)
- kunmap_atomic(area->vm_addr);
- else {
- struct page *pages[2];
-
- pages[0] = page;
- pages[1] = get_next_page(page);
- BUG_ON(!pages[1]);
-
- __zs_unmap_object(area, pages, off, class->size);
- }
- put_cpu_var(zs_map_area);
- unpin_tag(handle);
-}
-EXPORT_SYMBOL_GPL(zs_unmap_object);
-
static unsigned long obj_malloc(struct page *first_page,
struct size_class *class, unsigned long handle)
{
@@ -1347,63 +1238,6 @@ static unsigned long obj_malloc(struct page *first_page,
return obj;
}
-
-/**
- * zs_malloc - Allocate block of given size from pool.
- * @pool: pool to allocate from
- * @size: size of block to allocate
- *
- * On success, handle to the allocated object is returned,
- * otherwise 0.
- * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
- */
-unsigned long zs_malloc(struct zs_pool *pool, size_t size)
-{
- unsigned long handle, obj;
- struct size_class *class;
- struct page *first_page;
-
- if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE))
- return 0;
-
- handle = alloc_handle(pool);
- if (!handle)
- return 0;
-
- /* extra space in chunk to keep the handle */
- size += ZS_HANDLE_SIZE;
- class = pool->size_class[get_size_class_index(size)];
-
- spin_lock(&class->lock);
- first_page = find_get_zspage(class);
-
- if (!first_page) {
- spin_unlock(&class->lock);
- first_page = alloc_zspage(class, pool->flags);
- if (unlikely(!first_page)) {
- free_handle(pool, handle);
- return 0;
- }
-
- set_zspage_mapping(first_page, class->index, ZS_EMPTY);
- atomic_long_add(class->pages_per_zspage,
- &pool->pages_allocated);
-
- spin_lock(&class->lock);
- zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
- class->size, class->pages_per_zspage));
- }
-
- obj = obj_malloc(first_page, class, handle);
- /* Now move the zspage to another fullness group, if required */
- fix_fullness_group(class, first_page);
- record_obj(handle, obj);
- spin_unlock(&class->lock);
-
- return handle;
-}
-EXPORT_SYMBOL_GPL(zs_malloc);
-
static void obj_free(struct zs_pool *pool, struct size_class *class,
unsigned long obj)
{
@@ -1436,42 +1270,6 @@ static void obj_free(struct zs_pool *pool, struct size_class *class,
zs_stat_dec(class, OBJ_USED, 1);
}
-void zs_free(struct zs_pool *pool, unsigned long handle)
-{
- struct page *first_page, *f_page;
- unsigned long obj, f_objidx;
- int class_idx;
- struct size_class *class;
- enum fullness_group fullness;
-
- if (unlikely(!handle))
- return;
-
- pin_tag(handle);
- obj = handle_to_obj(handle);
- obj_to_location(obj, &f_page, &f_objidx);
- first_page = get_first_page(f_page);
-
- get_zspage_mapping(first_page, &class_idx, &fullness);
- class = pool->size_class[class_idx];
-
- spin_lock(&class->lock);
- obj_free(pool, class, obj);
- fullness = fix_fullness_group(class, first_page);
- if (fullness == ZS_EMPTY) {
- zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
- class->size, class->pages_per_zspage));
- atomic_long_sub(class->pages_per_zspage,
- &pool->pages_allocated);
- free_zspage(first_page);
- }
- spin_unlock(&class->lock);
- unpin_tag(handle);
-
- free_handle(pool, handle);
-}
-EXPORT_SYMBOL_GPL(zs_free);
-
static void zs_object_copy(unsigned long dst, unsigned long src,
struct size_class *class)
{
@@ -1572,13 +1370,17 @@ static unsigned long find_alloced_obj(struct page *page, int index,
struct zs_compact_control {
/* Source page for migration which could be a subpage of zspage. */
- struct page *s_page;
- /* Destination page for migration which should be a first page
- * of zspage. */
- struct page *d_page;
- /* Starting object index within @s_page which used for live object
- * in the subpage. */
- int index;
+ struct page *s_page;
+ /*
+ * Destination page for migration which should be a first page
+ * of zspage.
+ */
+ struct page *d_page;
+ /*
+ * Starting object index within @s_page which used for live object
+ * in the subpage.
+ */
+ int index;
};
static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
@@ -1740,6 +1542,208 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
spin_unlock(&class->lock);
}
+
+unsigned long zs_get_total_pages(struct zs_pool *pool)
+{
+ return atomic_long_read(&pool->pages_allocated);
+}
+EXPORT_SYMBOL_GPL(zs_get_total_pages);
+
+/**
+ * zs_map_object - get address of allocated object from handle.
+ * @pool: pool from which the object was allocated
+ * @handle: handle returned from zs_malloc
+ *
+ * Before using an object allocated from zs_malloc, it must be mapped using
+ * this function. When done with the object, it must be unmapped using
+ * zs_unmap_object.
+ *
+ * Only one object can be mapped per cpu at a time. There is no protection
+ * against nested mappings.
+ *
+ * This function returns with preemption and page faults disabled.
+ */
+void *zs_map_object(struct zs_pool *pool, unsigned long handle,
+ enum zs_mapmode mm)
+{
+ struct page *page;
+ unsigned long obj, obj_idx, off;
+
+ unsigned int class_idx;
+ enum fullness_group fg;
+ struct size_class *class;
+ struct mapping_area *area;
+ struct page *pages[2];
+ void *ret;
+
+ BUG_ON(!handle);
+
+ /*
+ * Because we use per-cpu mapping areas shared among the
+ * pools/users, we can't allow mapping in interrupt context
+ * because it can corrupt another users mappings.
+ */
+ BUG_ON(in_interrupt());
+
+ /* From now on, migration cannot move the object */
+ pin_tag(handle);
+
+ obj = handle_to_obj(handle);
+ obj_to_location(obj, &page, &obj_idx);
+ get_zspage_mapping(get_first_page(page), &class_idx, &fg);
+ class = pool->size_class[class_idx];
+ off = obj_idx_to_offset(page, obj_idx, class->size);
+
+ area = &get_cpu_var(zs_map_area);
+ area->vm_mm = mm;
+ if (off + class->size <= PAGE_SIZE) {
+ /* this object is contained entirely within a page */
+ area->vm_addr = kmap_atomic(page);
+ ret = area->vm_addr + off;
+ goto out;
+ }
+
+ /* this object spans two pages */
+ pages[0] = page;
+ pages[1] = get_next_page(page);
+ BUG_ON(!pages[1]);
+
+ ret = __zs_map_object(area, pages, off, class->size);
+out:
+ if (!class->huge)
+ ret += ZS_HANDLE_SIZE;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zs_map_object);
+
+void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
+{
+ struct page *page;
+ unsigned long obj, obj_idx, off;
+
+ unsigned int class_idx;
+ enum fullness_group fg;
+ struct size_class *class;
+ struct mapping_area *area;
+
+ BUG_ON(!handle);
+
+ obj = handle_to_obj(handle);
+ obj_to_location(obj, &page, &obj_idx);
+ get_zspage_mapping(get_first_page(page), &class_idx, &fg);
+ class = pool->size_class[class_idx];
+ off = obj_idx_to_offset(page, obj_idx, class->size);
+
+ area = this_cpu_ptr(&zs_map_area);
+ if (off + class->size <= PAGE_SIZE)
+ kunmap_atomic(area->vm_addr);
+ else {
+ struct page *pages[2];
+
+ pages[0] = page;
+ pages[1] = get_next_page(page);
+ BUG_ON(!pages[1]);
+
+ __zs_unmap_object(area, pages, off, class->size);
+ }
+ put_cpu_var(zs_map_area);
+ unpin_tag(handle);
+}
+EXPORT_SYMBOL_GPL(zs_unmap_object);
+
+/**
+ * zs_malloc - Allocate block of given size from pool.
+ * @pool: pool to allocate from
+ * @size: size of block to allocate
+ *
+ * On success, handle to the allocated object is returned,
+ * otherwise 0.
+ * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
+ */
+unsigned long zs_malloc(struct zs_pool *pool, size_t size)
+{
+ unsigned long handle, obj;
+ struct size_class *class;
+ struct page *first_page;
+
+ if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE))
+ return 0;
+
+ handle = alloc_handle(pool);
+ if (!handle)
+ return 0;
+
+ /* extra space in chunk to keep the handle */
+ size += ZS_HANDLE_SIZE;
+ class = pool->size_class[get_size_class_index(size)];
+
+ spin_lock(&class->lock);
+ first_page = find_get_zspage(class);
+
+ if (!first_page) {
+ spin_unlock(&class->lock);
+ first_page = alloc_zspage(class, pool->flags);
+ if (unlikely(!first_page)) {
+ free_handle(pool, handle);
+ return 0;
+ }
+
+ set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+ atomic_long_add(class->pages_per_zspage,
+ &pool->pages_allocated);
+
+ spin_lock(&class->lock);
+ zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
+ class->size, class->pages_per_zspage));
+ }
+
+ obj = obj_malloc(first_page, class, handle);
+ /* Now move the zspage to another fullness group, if required */
+ fix_fullness_group(class, first_page);
+ record_obj(handle, obj);
+ spin_unlock(&class->lock);
+
+ return handle;
+}
+EXPORT_SYMBOL_GPL(zs_malloc);
+
+void zs_free(struct zs_pool *pool, unsigned long handle)
+{
+ struct page *first_page, *f_page;
+ unsigned long obj, f_objidx;
+ int class_idx;
+ struct size_class *class;
+ enum fullness_group fullness;
+
+ if (unlikely(!handle))
+ return;
+
+ pin_tag(handle);
+ obj = handle_to_obj(handle);
+ obj_to_location(obj, &f_page, &f_objidx);
+ first_page = get_first_page(f_page);
+
+ get_zspage_mapping(first_page, &class_idx, &fullness);
+ class = pool->size_class[class_idx];
+
+ spin_lock(&class->lock);
+ obj_free(pool, class, obj);
+ fullness = fix_fullness_group(class, first_page);
+ if (fullness == ZS_EMPTY) {
+ zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
+ class->size, class->pages_per_zspage));
+ atomic_long_sub(class->pages_per_zspage,
+ &pool->pages_allocated);
+ free_zspage(first_page);
+ }
+ spin_unlock(&class->lock);
+ unpin_tag(handle);
+
+ free_handle(pool, handle);
+}
+EXPORT_SYMBOL_GPL(zs_free);
+
unsigned long zs_compact(struct zs_pool *pool)
{
int i;
--
2.4.2.337.gfae46aa
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
* [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (5 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-04 4:57 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
` (3 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
perform class compaction in zs_free() if zs_free() has created
a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
it would probably make sense for zs_can_compact() to return an
estimated number of pages that can potentially be freed, and to
trigger auto-compaction only when that estimate is above some limit
(e.g. at least 4 zs pages); or to put it under a config option.
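as a hedged user-space sketch (hypothetical names, not kernel code), such an estimate boils down to converting unused object slots into whole pages; the formula below reproduces the `pages-tofree' figures in the zs_can_compact() debug output quoted later in this thread:

```c
#include <assert.h>

/*
 * Hypothetical sketch of the estimate a zs_can_compact()-style helper
 * could return: the number of allocated-but-unused object slots in a
 * class, converted to whole pages. Names are illustrative only.
 */
static unsigned long estimate_pages_to_free(unsigned long obj_allocated,
					    unsigned long obj_used,
					    unsigned long objs_per_page)
{
	unsigned long unused = obj_allocated - obj_used;

	return unused / objs_per_page;
}
```

for class-3072 with objs:24652 inuse:24628 objs-per-page:4 this yields 6, matching the quoted `pages-tofree:6'.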
this also tweaks __zs_compact() -- we can no longer reschedule
and wait for new pages in the current class, so we compact as
much as we can and return as soon as compaction is no longer
possible.
auto-compaction is not a replacement for manual compaction.
compiled linux kernel with auto-compaction:
cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624076288 0 1624076288 19961 1106
performing additional manual compaction:
echo 1 > /sys/block/zram0/compact
cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624051712 0 1624076288 19961 1114
manual compaction was able to migrate only 8 additional objects,
so auto-compaction is 'good enough'.
TEST
this test copies a 1.3G linux kernel tar to mounted zram disk,
and extracts it.
w/auto-compaction:
cat /sys/block/zram0/mm_stat
1171456 26006 86016 0 86016 32781 0
time tar xf linux-3.10.tar.gz -C linux
real 0m16.970s
user 0m15.247s
sys 0m8.477s
du -sh linux
2.0G linux
cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384 0 3011088384 24310 108
=====================================================================
w/o auto compaction:
cat /sys/block/zram0/mm_stat
1171456 26000 81920 0 81920 32781 0
time tar xf linux-3.10.tar.gz -C linux
real 0m16.983s
user 0m15.267s
sys 0m8.417s
du -sh linux
2.0G linux
cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760 0 3011317760 23928 0
=====================================================================
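the mm_stat lines above are the seven columns printed by zram's mm_stat_show() (shown in patch 09): orig_data_size, compr_data_size, mem_used_total, mem_limit, mem_used_max, zero_pages, num_migrated. a minimal user-space parser, as a sketch under that column-order assumption (struct and function names are made up):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line of /sys/block/zram0/mm_stat, assuming the seven-column
 * order printed by mm_stat_show(). Illustrative names, not a kernel API.
 */
struct mm_stat {
	unsigned long long orig_data_size;
	unsigned long long compr_data_size;
	unsigned long long mem_used_total;
	unsigned long long mem_limit;
	unsigned long long mem_used_max;
	unsigned long long zero_pages;
	unsigned long long num_migrated;
};

static int parse_mm_stat(const char *line, struct mm_stat *s)
{
	/* returns 1 if all seven columns were present */
	return sscanf(line, "%llu %llu %llu %llu %llu %llu %llu",
		      &s->orig_data_size, &s->compr_data_size,
		      &s->mem_used_total, &s->mem_limit, &s->mem_used_max,
		      &s->zero_pages, &s->num_migrated) == 7;
}
```

with the w/auto-compaction line above, num_migrated parses as 108 while the w/o line parses as 0.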
iozone shows the auto-compaction code running faster in several
tests, which is hardly trustworthy. anyway:
iozone -t 3 -R -r 16K -s 60M -I +Z
test base auto-compact (compacted 66123 objs)
Initial write 1603682.25 1645112.38
Rewrite 2502243.31 2256570.31
Read 7040860.00 7130575.00
Re-read 7036490.75 7066744.25
Reverse Read 6617115.25 6155395.50
Stride read 6705085.50 6350030.38
Random read 6668497.75 6350129.38
Mixed workload 5494030.38 5091669.62
Random write 2526834.44 2500977.81
Pwrite 1656874.00 1663796.94
Pread 3322818.91 3359683.44
Fwrite 4090124.25 4099773.88
Fread 10358916.25 10324409.75
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c2a640a..70bf481 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
while ((dst_page = isolate_target_page(class))) {
cc.d_page = dst_page;
- /*
- * If there is no more space in dst_page, resched
- * and see if anyone had allocated another zspage.
- */
+
if (!migrate_zspage(pool, class, &cc))
- break;
+ goto out;
putback_zspage(pool, class, dst_page);
}
- /* Stop if we couldn't find slot */
- if (dst_page == NULL)
+ if (!dst_page)
break;
-
putback_zspage(pool, class, dst_page);
putback_zspage(pool, class, src_page);
- spin_unlock(&class->lock);
- cond_resched();
- spin_lock(&class->lock);
}
+out:
+ if (dst_page)
+ putback_zspage(pool, class, dst_page);
if (src_page)
putback_zspage(pool, class, src_page);
spin_unlock(&class->lock);
}
-
unsigned long zs_get_total_pages(struct zs_pool *pool)
{
return atomic_long_read(&pool->pages_allocated);
@@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
unpin_tag(handle);
free_handle(pool, handle);
+
+ /*
+ * actual fullness might have changed, __zs_compact() checks
+ * if compaction makes sense
+ */
+ if (fullness == ZS_ALMOST_EMPTY)
+ __zs_compact(pool, class);
}
EXPORT_SYMBOL_GPL(zs_free);
--
2.4.2.337.gfae46aa
* [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated'
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (6 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
` (2 subsequent siblings)
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
introduce zs_get_num_migrated() to export zs_pool's ->num_migrated
counter.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
include/linux/zsmalloc.h | 1 +
mm/zsmalloc.c | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 1338190..e878875 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -47,6 +47,7 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
unsigned long zs_get_total_pages(struct zs_pool *pool);
+unsigned long zs_get_num_migrated(struct zs_pool *pool);
unsigned long zs_compact(struct zs_pool *pool);
#endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 70bf481..0524c4a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1543,6 +1543,13 @@ unsigned long zs_get_total_pages(struct zs_pool *pool)
}
EXPORT_SYMBOL_GPL(zs_get_total_pages);
+unsigned long zs_get_num_migrated(struct zs_pool *pool)
+{
+ /* can be outdated */
+ return pool->num_migrated;
+}
+EXPORT_SYMBOL_GPL(zs_get_num_migrated);
+
/**
* zs_map_object - get address of allocated object from handle.
* @pool: pool from which the object was allocated
--
2.4.2.337.gfae46aa
* [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (7 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
2015-06-03 5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
drop zram's copy of the `num_migrated' counter and use zs_pool's
zs_get_num_migrated() instead.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
drivers/block/zram/zram_drv.c | 12 ++++++------
drivers/block/zram/zram_drv.h | 1 -
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 28f6e46..31e45b4 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -385,7 +385,6 @@ static ssize_t comp_algorithm_store(struct device *dev,
static ssize_t compact_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
{
- unsigned long nr_migrated;
struct zram *zram = dev_to_zram(dev);
struct zram_meta *meta;
@@ -396,8 +395,7 @@ static ssize_t compact_store(struct device *dev,
}
meta = zram->meta;
- nr_migrated = zs_compact(meta->mem_pool);
- atomic64_add(nr_migrated, &zram->stats.num_migrated);
+ zs_compact(meta->mem_pool);
up_read(&zram->init_lock);
return len;
@@ -425,13 +423,15 @@ static ssize_t mm_stat_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct zram *zram = dev_to_zram(dev);
- u64 orig_size, mem_used = 0;
+ u64 orig_size, mem_used = 0, num_migrated = 0;
long max_used;
ssize_t ret;
down_read(&zram->init_lock);
- if (init_done(zram))
+ if (init_done(zram)) {
mem_used = zs_get_total_pages(zram->meta->mem_pool);
+ num_migrated = zs_get_num_migrated(zram->meta->mem_pool);
+ }
orig_size = atomic64_read(&zram->stats.pages_stored);
max_used = atomic_long_read(&zram->stats.max_used_pages);
@@ -444,7 +444,7 @@ static ssize_t mm_stat_show(struct device *dev,
zram->limit_pages << PAGE_SHIFT,
max_used << PAGE_SHIFT,
(u64)atomic64_read(&zram->stats.zero_pages),
- (u64)atomic64_read(&zram->stats.num_migrated));
+ num_migrated);
up_read(&zram->init_lock);
return ret;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 6dbe2df..8e92339 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -78,7 +78,6 @@ struct zram_stats {
atomic64_t compr_data_size; /* compressed size of pages stored */
atomic64_t num_reads; /* failed + successful */
atomic64_t num_writes; /* --do-- */
- atomic64_t num_migrated; /* no. of migrated object */
atomic64_t failed_reads; /* can happen when memory is too low */
atomic64_t failed_writes; /* can happen when memory is too low */
atomic64_t invalid_io; /* non-page-aligned I/O requests */
--
2.4.2.337.gfae46aa
* [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (8 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
@ 2015-05-29 15:05 ` Sergey Senozhatsky
2015-06-03 5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-05-29 15:05 UTC (permalink / raw)
To: Andrew Morton, Minchan Kim
Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky
get_fullness_group() considers pages that are up to 3/4 full as
almost empty. that, unfortunately, marks as ALMOST_EMPTY pages
that we would probably prefer to keep on the ALMOST_FULL list.
ALMOST_EMPTY:
[..]
inuse: 3 max_objects: 4
inuse: 5 max_objects: 7
inuse: 5 max_objects: 7
inuse: 2 max_objects: 3
[..]
so, for the "inuse: 5 max_objects: 7" ALMOST_EMPTY page, for example,
it takes 2 obj_malloc calls to make the page FULL, but 5 obj_free
calls to make it EMPTY. compaction selects ALMOST_EMPTY pages as
source pages, which can result in extra object moves.
in other words, in terms of compaction, it makes more sense to fill
this page rather than drain it.
decrease the ALMOST_FULL waterline to 2/3 of max capacity, which is,
of course, still imperfect.
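the classification in get_fullness_group(), with the threshold fraction made a parameter, can be sketched in user space as follows (frac = 4 gives the old 3/4 waterline, frac = 3 the new 2/3 one):

```c
#include <assert.h>

enum fullness_group { ZS_EMPTY, ZS_ALMOST_EMPTY, ZS_ALMOST_FULL, ZS_FULL };

/*
 * Sketch of get_fullness_group()'s decision: a page is ALMOST_EMPTY
 * when inuse <= (frac - 1) * max_objects / frac, i.e. 3/4 of capacity
 * for frac = 4 and 2/3 of capacity for frac = 3.
 */
static enum fullness_group fullness(int inuse, int max_objects, int frac)
{
	if (inuse == 0)
		return ZS_EMPTY;
	if (inuse == max_objects)
		return ZS_FULL;
	if (inuse <= (frac - 1) * max_objects / frac)
		return ZS_ALMOST_EMPTY;
	return ZS_ALMOST_FULL;
}
```

the `inuse: 5' page with 7 objects discussed above is ZS_ALMOST_EMPTY under the old waterline (5 <= 3*7/4 = 5) but ZS_ALMOST_FULL under the new one (5 > 2*7/3 = 4).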
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
mm/zsmalloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0524c4a..a8a3eae 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -196,7 +196,7 @@ static int zs_size_classes;
*
* (see: fix_fullness_group())
*/
-static const int fullness_threshold_frac = 4;
+static const int fullness_threshold_frac = 3;
struct size_class {
spinlock_t lock;
@@ -612,7 +612,7 @@ static enum fullness_group get_fullness_group(struct page *page)
fg = ZS_EMPTY;
else if (inuse == max_objects)
fg = ZS_FULL;
- else if (inuse <= 3 * max_objects / fullness_threshold_frac)
+ else if (inuse <= 2 * max_objects / fullness_threshold_frac)
fg = ZS_ALMOST_EMPTY;
else
fg = ZS_ALMOST_FULL;
--
2.4.2.337.gfae46aa
* Re: [RFC][PATCH 00/10] zsmalloc auto-compaction
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
` (9 preceding siblings ...)
2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
@ 2015-06-03 5:09 ` Sergey Senozhatsky
10 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-03 5:09 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky,
Sergey Senozhatsky
On (05/30/15 00:05), Sergey Senozhatsky wrote:
> RFC
>
> this is 4.3 material, but I wanted to publish it sooner to gain
> responses and to settle it down before 4.3 merge window opens.
>
> in short, this series tweaks zsmalloc's compaction and adds
> auto-compaction support. auto-compaction is not aimed to replace
> manual compaction; instead, it's supposed to be good enough. yet
> it surely slows down zsmalloc in some scenarios, whilst a simple
> un-tar test didn't show any significant performance difference.
>
>
> quote from commit 0007:
>
> this test copies a 1.3G linux kernel tar to mounted zram disk,
> and extracts it.
>
[..]
Hello,
I've a v2:
-- squashed and re-order some of the patches;
-- run iozone with lockdep disabled.
=== quote ===
auto-compaction should not affect read-only tests, so we are interested
in write-only and read-write (mixed) tests, but I'll post complete test
stats:
iozone -t 3 -R -r 16K -s 60M -I +Z
ext4, 2g zram0 device, lzo, 4 compression streams max
test base auto-compact (compacted 67904 objs)
Initial write 2474943.62 2490551.69
Rewrite 3656121.38 3002796.31
Read 12068187.50 12044105.25
Re-read 12009777.25 11930537.50
Reverse Read 10858884.25 10388252.50
Stride read 10715304.75 10429308.00
Random read 10597970.50 10502978.75
Mixed workload 8517269.00 8701298.12
Random write 3595597.00 3465174.38
Pwrite 2507361.25 2553224.50
Pread 5380608.28 5340646.03
Fwrite 6123863.62 6130514.25
Fread 12006438.50 11936981.25
mm_stat after the test
base:
cat /sys/block/zram0/mm_stat
378834944 5748695 7446528 0 7450624 16318 0
auto-compaction:
cat /sys/block/zram0/mm_stat
378892288 5754987 7397376 0 7397376 16304 67904
===
-ss
>
>
> Sergey Senozhatsky (10):
> zsmalloc: drop unused variable `nr_to_migrate'
> zsmalloc: always keep per-class stats
> zsmalloc: introduce zs_can_compact() function
> zsmalloc: cosmetic compaction code adjustments
> zsmalloc: add `num_migrated' to zs_pool
> zsmalloc: move compaction functions
> zsmalloc: introduce auto-compact support
> zsmalloc: export zs_pool `num_migrated'
> zram: remove `num_migrated' from zram_stats
> zsmalloc: lower ZS_ALMOST_FULL waterline
>
> drivers/block/zram/zram_drv.c | 12 +-
> drivers/block/zram/zram_drv.h | 1 -
> include/linux/zsmalloc.h | 1 +
> mm/zsmalloc.c | 578 +++++++++++++++++++++---------------------
> 4 files changed, 296 insertions(+), 296 deletions(-)
>
> --
> 2.4.2.337.gfae46aa
>
* Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
@ 2015-06-04 2:04 ` Minchan Kim
2015-06-04 2:10 ` Sergey Senozhatsky
0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 2:04 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky
On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote:
> __zs_compact() does not use `nr_to_migrate', drop it.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
2015-06-04 2:04 ` Minchan Kim
@ 2015-06-04 2:10 ` Sergey Senozhatsky
0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 2:10 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Sergey Senozhatsky
On (06/04/15 11:04), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote:
> > __zs_compact() does not use `nr_to_migrate', drop it.
> >
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Acked-by: Minchan Kim <minchan@kernel.org>
>
Hello Minchan,
I will post a slightly reworked patchset later today.
thanks.
-ss
* Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
@ 2015-06-04 2:18 ` Minchan Kim
2015-06-04 2:34 ` Sergey Senozhatsky
0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 2:18 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky
On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote:
> always account per-class `zs_size_stat' stats. this data will
> help us make better decisions during compaction. we are especially
> interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
> class compaction will result in any memory gain.
>
> for instance, we know the number of allocated objects in the class,
> the number of objects being used (so we also know how many objects
> are not used) and the number of objects per-page. so we can estimate
> how many pages compaction can free (pages that will turn into
> ZS_EMPTY during compaction).
Fair enough, but I need to read the further patches to see if we
really need this at the moment.
It would be better to write down more detail in the cover letter,
so that when I read just [0/0] I understand your goal and approach
without looking into the details of each patch.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
> mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
> 1 file changed, 12 insertions(+), 37 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index e615b31..778b8db 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -169,14 +169,12 @@ enum zs_stat_type {
> NR_ZS_STAT_TYPE,
> };
>
> -#ifdef CONFIG_ZSMALLOC_STAT
> -
> -static struct dentry *zs_stat_root;
> -
> struct zs_size_stat {
> unsigned long objs[NR_ZS_STAT_TYPE];
> };
>
> +#ifdef CONFIG_ZSMALLOC_STAT
> +static struct dentry *zs_stat_root;
> #endif
>
> /*
> @@ -201,25 +199,21 @@ static int zs_size_classes;
> static const int fullness_threshold_frac = 4;
>
> struct size_class {
> + spinlock_t lock;
> + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> /*
> * Size of objects stored in this class. Must be multiple
> * of ZS_ALIGN.
> */
> - int size;
> - unsigned int index;
> + int size;
> + unsigned int index;
>
> /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
> - int pages_per_zspage;
> - /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> - bool huge;
> -
> -#ifdef CONFIG_ZSMALLOC_STAT
> - struct zs_size_stat stats;
> -#endif
> -
> - spinlock_t lock;
> + int pages_per_zspage;
> + struct zs_size_stat stats;
>
> - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> + /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> + bool huge;
> };
>
> /*
> @@ -439,8 +433,6 @@ static int get_size_class_index(int size)
> return min(zs_size_classes - 1, idx);
> }
>
> -#ifdef CONFIG_ZSMALLOC_STAT
> -
> static inline void zs_stat_inc(struct size_class *class,
> enum zs_stat_type type, unsigned long cnt)
> {
> @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
> return class->stats.objs[type];
> }
>
> +#ifdef CONFIG_ZSMALLOC_STAT
> +
> static int __init zs_stat_init(void)
> {
> if (!debugfs_initialized())
> @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
> }
>
> #else /* CONFIG_ZSMALLOC_STAT */
> -
> -static inline void zs_stat_inc(struct size_class *class,
> - enum zs_stat_type type, unsigned long cnt)
> -{
> -}
> -
> -static inline void zs_stat_dec(struct size_class *class,
> - enum zs_stat_type type, unsigned long cnt)
> -{
> -}
> -
> -static inline unsigned long zs_stat_get(struct size_class *class,
> - enum zs_stat_type type)
> -{
> - return 0;
> -}
> -
> static int __init zs_stat_init(void)
> {
> return 0;
> @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
> static inline void zs_pool_stat_destroy(struct zs_pool *pool)
> {
> }
> -
> #endif
>
>
> @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
> class->size, class->pages_per_zspage));
> atomic_long_sub(class->pages_per_zspage,
> &pool->pages_allocated);
> -
> free_zspage(first_page);
> }
> }
> --
> 2.4.2.337.gfae46aa
>
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
2015-06-04 2:18 ` Minchan Kim
@ 2015-06-04 2:34 ` Sergey Senozhatsky
0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 2:34 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Sergey Senozhatsky
On (06/04/15 11:18), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote:
> > always account per-class `zs_size_stat' stats. this data will
> > help us make better decisions during compaction. we are especially
> > interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
> > class compaction will result in any memory gain.
> >
> > for instance, we know the number of allocated objects in the class,
> > the number of objects being used (so we also know how many objects
> > are not used) and the number of objects per-page. so we can estimate
> > how many pages compaction can free (pages that will turn into
> > ZS_EMPTY during compaction).
>
> Fair enough but I need to read further patches to see if we need
> really this at the moment.
>
> I hope it would be better to write down more detail in cover-letter
> so when I read just [0/0] I realize your goal and approach without
> looking into detail in each patch.
>
sure, will do later today.
I caught a cold, so I'm a bit slow.
-ss
> >
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > ---
> > mm/zsmalloc.c | 49 ++++++++++++-------------------------------------
> > 1 file changed, 12 insertions(+), 37 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index e615b31..778b8db 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -169,14 +169,12 @@ enum zs_stat_type {
> > NR_ZS_STAT_TYPE,
> > };
> >
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > -
> > -static struct dentry *zs_stat_root;
> > -
> > struct zs_size_stat {
> > unsigned long objs[NR_ZS_STAT_TYPE];
> > };
> >
> > +#ifdef CONFIG_ZSMALLOC_STAT
> > +static struct dentry *zs_stat_root;
> > #endif
> >
> > /*
> > @@ -201,25 +199,21 @@ static int zs_size_classes;
> > static const int fullness_threshold_frac = 4;
> >
> > struct size_class {
> > + spinlock_t lock;
> > + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> > /*
> > * Size of objects stored in this class. Must be multiple
> > * of ZS_ALIGN.
> > */
> > - int size;
> > - unsigned int index;
> > + int size;
> > + unsigned int index;
> >
> > /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
> > - int pages_per_zspage;
> > - /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> > - bool huge;
> > -
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > - struct zs_size_stat stats;
> > -#endif
> > -
> > - spinlock_t lock;
> > + int pages_per_zspage;
> > + struct zs_size_stat stats;
> >
> > - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> > + /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> > + bool huge;
> > };
> >
> > /*
> > @@ -439,8 +433,6 @@ static int get_size_class_index(int size)
> > return min(zs_size_classes - 1, idx);
> > }
> >
> > -#ifdef CONFIG_ZSMALLOC_STAT
> > -
> > static inline void zs_stat_inc(struct size_class *class,
> > enum zs_stat_type type, unsigned long cnt)
> > {
> > @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class,
> > return class->stats.objs[type];
> > }
> >
> > +#ifdef CONFIG_ZSMALLOC_STAT
> > +
> > static int __init zs_stat_init(void)
> > {
> > if (!debugfs_initialized())
> > @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool)
> > }
> >
> > #else /* CONFIG_ZSMALLOC_STAT */
> > -
> > -static inline void zs_stat_inc(struct size_class *class,
> > - enum zs_stat_type type, unsigned long cnt)
> > -{
> > -}
> > -
> > -static inline void zs_stat_dec(struct size_class *class,
> > - enum zs_stat_type type, unsigned long cnt)
> > -{
> > -}
> > -
> > -static inline unsigned long zs_stat_get(struct size_class *class,
> > - enum zs_stat_type type)
> > -{
> > - return 0;
> > -}
> > -
> > static int __init zs_stat_init(void)
> > {
> > return 0;
> > @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool)
> > static inline void zs_pool_stat_destroy(struct zs_pool *pool)
> > {
> > }
> > -
> > #endif
> >
> >
> > @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class,
> > class->size, class->pages_per_zspage));
> > atomic_long_sub(class->pages_per_zspage,
> > &pool->pages_allocated);
> > -
> > free_zspage(first_page);
> > }
> > }
> > --
> > 2.4.2.337.gfae46aa
> >
>
> --
> Kind regards,
> Minchan Kim
>
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
@ 2015-06-04 2:55 ` Minchan Kim
2015-06-04 3:15 ` Sergey Senozhatsky
0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 2:55 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky
On Sat, May 30, 2015 at 12:05:21AM +0900, Sergey Senozhatsky wrote:
> this function checks if class compaction will free any pages.
> rephrasing: do we have enough unused objects to form at least one
> ZS_EMPTY page and free it? it aborts compaction if class compaction
> will not result in any (further) savings.
>
> EXAMPLE (this debug output is not part of this patch set):
>
> -- class size
> -- number of allocated objects
> -- number of used objects,
> -- estimated number of pages that will be freed
>
> [..]
> [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
maxobjs-per-zspage?
> [ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5
> [ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4
> [ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3
> [ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2
> [ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1
> [ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1
> [ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0
> [ 3303.108998] class-2720: Compaction is useless
> [ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1
> [ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1
> [ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0
> [ 3303.109011] class-2336: Compaction is useless
> [ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0
> [ 3303.109014] class-1808: Compaction is useless
> [ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0
> [ 3303.109017] class-1744: Compaction is useless
> [ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0
> [ 3303.109020] class-1536: Compaction is useless
> [ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0
> [ 3303.109023] class-1488: Compaction is useless
> [ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0
> [ 3303.109026] class-1360: Compaction is useless
> [ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0
> [ 3303.109029] class-1248: Compaction is useless
> [ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0
> [ 3303.109032] class-1216: Compaction is useless
> [ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2
> [ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 pages-tofree:1
> [ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0
> [ 3303.109043] class-1104: Compaction is useless
> [..]
>
> Every "Compaction is useless" line indicates that we saved some CPU cycles.
>
> For example, class-1104 has:
>
> 605 objects allocated
> 599 objects used
> 11 objects per page
>
> Even if we have an ALMOST_EMPTY page, we still don't have enough room to move
> all of its objects and free that page; compaction would not make a lot of
> sense here, so it's better to just leave the class as is.
Fair enough.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
> mm/zsmalloc.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 778b8db..9ef6f15 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class)
> return page;
> }
>
> +/*
> + * Make sure that we actually can compact this class,
> + * IOW if migration will empty at least one page.
> + *
> + * should be called under class->lock
> + */
> +static bool zs_can_compact(struct size_class *class)
> +{
> + /*
> + * calculate how many unused allocated objects we
'c' should be capital.
I hope you will fix all of the English grammar in the next spin,
because someone (like me) who is not a native speaker will learn
the wrong English. :)
> + * have and see if we can free any zspages. otherwise,
> + * compaction can just move objects back and forth w/o
> + * any memory gain.
> + */
> + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> + zs_stat_get(class, OBJ_USED);
> +
I prefer obj_wasted to "ret".
> + ret /= get_maxobj_per_zspage(class->size,
> + class->pages_per_zspage);
> + return ret > 0;
> +}
> +
> static unsigned long __zs_compact(struct zs_pool *pool,
> struct size_class *class)
> {
> @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
>
> BUG_ON(!is_first_page(src_page));
>
> + if (!zs_can_compact(class))
> + break;
> +
> cc.index = 0;
> cc.s_page = src_page;
>
> --
> 2.4.2.337.gfae46aa
>
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments
2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
@ 2015-06-04 3:14 ` Minchan Kim
0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 3:14 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky
On Sat, May 30, 2015 at 12:05:22AM +0900, Sergey Senozhatsky wrote:
> Change zs_object_copy() argument order to be (DST, SRC) rather
> than (SRC, DST); copy/move functions usually take (to, from)
> argument order.
Yeb,
>
> Rename alloc_target_page() to isolate_target_page(). This
> function doesn't allocate anything; it isolates a target page,
> pretty much like isolate_source_page().
The reason I named it alloc_target_page is that I had a plan to
allocate a new page, which might be helpful at some point, but I cannot
think of any benefit now, so I'll follow your patch.
>
> tweak __zs_compact() comment.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
> ---
> mm/zsmalloc.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9ef6f15..fa72a81 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
> }
> EXPORT_SYMBOL_GPL(zs_free);
>
> -static void zs_object_copy(unsigned long src, unsigned long dst,
> +static void zs_object_copy(unsigned long dst, unsigned long src,
> struct size_class *class)
> {
> struct page *s_page, *d_page;
> @@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>
> used_obj = handle_to_obj(handle);
> free_obj = obj_malloc(d_page, class, handle);
> - zs_object_copy(used_obj, free_obj, class);
> + zs_object_copy(free_obj, used_obj, class);
> index++;
> record_obj(handle, free_obj);
> unpin_tag(handle);
> @@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> return ret;
> }
>
> -static struct page *alloc_target_page(struct size_class *class)
> +static struct page *isolate_target_page(struct size_class *class)
> {
> int i;
> struct page *page;
> @@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> cc.index = 0;
> cc.s_page = src_page;
>
> - while ((dst_page = alloc_target_page(class))) {
> + while ((dst_page = isolate_target_page(class))) {
> cc.d_page = dst_page;
> /*
> - * If there is no more space in dst_page, try to
> - * allocate another zspage.
> + * If there is no more space in dst_page, resched
> + * and see if anyone had allocated another zspage.
> */
> if (!migrate_zspage(pool, class, &cc))
> break;
> --
> 2.4.2.337.gfae46aa
>
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 2:55 ` Minchan Kim
@ 2015-06-04 3:15 ` Sergey Senozhatsky
2015-06-04 3:30 ` Minchan Kim
2015-06-04 3:31 ` Sergey Senozhatsky
0 siblings, 2 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 3:15 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Sergey Senozhatsky
On (06/04/15 11:55), Minchan Kim wrote:
> > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
>
> maxobjs-per-zspage?
>
yeah, I shortened it to be more or less "80 chars" friendly.
[..]
> > + * calculate how many unused allocated objects we
>
> c should be captital.
>
> I hope you will fix all of english grammer in next spin
> because someone(like me) who is not a native will learn the
> wrong english. :)
Sure, will fix. Yeah, I'm a native broken-English speaker :-)
> > + * have and see if we can free any zspages. otherwise,
> > + * compaction can just move objects back and forth w/o
> > + * any memory gain.
> > + */
> > + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> > + zs_stat_get(class, OBJ_USED);
> > +
>
> I prefer obj_wasted to "ret".
ok.
I'm still thinking about how good it should be.
For automatic compaction we don't want to uselessly move objects between
pages, and I tend to think that it's better to compact less than to waste
more CPU cycles.
on the other hand, this policy will miss cases like:
-- free objects in class: 5 (free-objs class capacity)
-- page1: inuse 2
-- page2: inuse 2
-- page3: inuse 3
-- page4: inuse 2
So the total "inuse" is greater than the free-objs class capacity, but it's
surely possible to compact this class: a partial inuse sum <= free-objs class
capacity (a partial sum is an ->inuse sum of any two of the class pages:
page1 + page2, page2 + page3, etc.).
OTOH, computing these partial sums will badly affect performance. Maybe for
automatic compaction (the one that happens w/o user interaction) we can do
zs_can_compact(), and for manual compaction (the one that has been triggered
by a user) we can do the old "full scan".
anyway, zs_can_compact() looks like something that we can optimize
independently later.
-ss
> > + ret /= get_maxobj_per_zspage(class->size,
> > + class->pages_per_zspage);
> > + return ret > 0;
> > +}
> > +
> > static unsigned long __zs_compact(struct zs_pool *pool,
> > struct size_class *class)
> > {
> > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> >
> > BUG_ON(!is_first_page(src_page));
> >
> > + if (!zs_can_compact(class))
> > + break;
> > +
> > cc.index = 0;
> > cc.s_page = src_page;
> >
> > --
> > 2.4.2.337.gfae46aa
> >
>
> --
> Kind regards,
> Minchan Kim
>
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 3:15 ` Sergey Senozhatsky
@ 2015-06-04 3:30 ` Minchan Kim
2015-06-04 3:42 ` Sergey Senozhatsky
2015-06-04 3:31 ` Sergey Senozhatsky
1 sibling, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 3:30 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel
On Thu, Jun 04, 2015 at 12:15:14PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 11:55), Minchan Kim wrote:
> > > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
> >
> > maxobjs-per-zspage?
> >
>
> yeah, I shortened it to be more of less "80 chars" friendly.
>
>
> [..]
>
> > > + * calculate how many unused allocated objects we
> >
> > 'c' should be capital.
> >
> > I hope you will fix all of the English grammar in the next spin,
> > because someone (like me) who is not a native speaker will learn
> > the wrong English. :)
>
> sure, will fix. yeah, I'm a native broken english speaker :-)
>
> > > + * have and see if we can free any zspages. otherwise,
> > > + * compaction can just move objects back and forth w/o
> > > + * any memory gain.
> > > + */
> > > + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
> > > + zs_stat_get(class, OBJ_USED);
> > > +
> >
> > I prefer obj_wasted to "ret".
>
> ok.
>
> I'm still thinking about how good it should be.
>
> For automatic compaction we don't want to uselessly move objects between
> pages, and I tend to think that it's better to compact less than to waste
> more CPU cycles.
>
>
> on the other hand, this policy will miss cases like:
>
> -- free objects in class: 5 (free-objs class capacity)
> -- page1: inuse 2
> -- page2: inuse 2
> -- page3: inuse 3
> -- page4: inuse 2
What scenario are you concerned about?
Could you describe this example more clearly?
Thanks.
>
> So the total "inuse" is greater than the free-objs class capacity, but it's
> surely possible to compact this class: a partial inuse sum <= free-objs class
> capacity (a partial sum is an ->inuse sum of any two of the class pages:
> page1 + page2, page2 + page3, etc.).
>
> OTOH, computing these partial sums will badly affect performance. Maybe for
> automatic compaction (the one that happens w/o user interaction) we can do
> zs_can_compact(), and for manual compaction (the one that has been triggered
> by a user) we can do the old "full scan".
>
> anyway, zs_can_compact() looks like something that we can optimize
> independently later.
>
> -ss
>
> > > + ret /= get_maxobj_per_zspage(class->size,
> > > + class->pages_per_zspage);
> > > + return ret > 0;
> > > +}
> > > +
> > > static unsigned long __zs_compact(struct zs_pool *pool,
> > > struct size_class *class)
> > > {
> > > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> > >
> > > BUG_ON(!is_first_page(src_page));
> > >
> > > + if (!zs_can_compact(class))
> > > + break;
> > > +
> > > cc.index = 0;
> > > cc.s_page = src_page;
> > >
> > > --
> > > 2.4.2.337.gfae46aa
> > >
> >
> > --
> > Kind regards,
> > Minchan Kim
> >
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 3:15 ` Sergey Senozhatsky
2015-06-04 3:30 ` Minchan Kim
@ 2015-06-04 3:31 ` Sergey Senozhatsky
1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 3:31 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel
On (06/04/15 12:15), Sergey Senozhatsky wrote:
> I'm still thinking about how good it should be.
>
> For automatic compaction we don't want to uselessly move objects between
> pages, and I tend to think that it's better to compact less than to waste
> more CPU cycles.
>
>
> on the other hand, this policy will miss cases like:
>
> -- free objects in class: 5 (free-objs class capacity)
> -- page1: inuse 2
> -- page2: inuse 2
> -- page3: inuse 3
> -- page4: inuse 2
>
> So the total "inuse" is greater than the free-objs class capacity, but it's
> surely possible to compact this class: a partial inuse sum <= free-objs class
> capacity (a partial sum is an ->inuse sum of any two of the class pages:
> page1 + page2, page2 + page3, etc.).
>
> OTOH, computing these partial sums will badly affect performance. Maybe for
> automatic compaction (the one that happens w/o user interaction) we can do
> zs_can_compact(), and for manual compaction (the one that has been triggered
> by a user) we can do the old "full scan".
>
> anyway, zs_can_compact() looks like something that we can optimize
> independently later.
>
So what I'm thinking of right now is:
-- first do an "if we have enough free objects to free at least one page"
check, and compact if true.
-- if false, then we can check on a per-page basis:
"if page->inuse <= class free-objs capacity" then compact it,
else select the next almost_empty page.
Here it would be helpful to have pages ordered by ->inuse, but that
is far too expensive.
I have a patch that I will post later that introduces weak/partial
page ordering within fullness_list (really inexpensive: just one int
compare to add a page with a higher ->inuse to the list head instead of
the list tail).
-ss
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 3:30 ` Minchan Kim
@ 2015-06-04 3:42 ` Sergey Senozhatsky
2015-06-04 3:50 ` Minchan Kim
0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 3:42 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
linux-kernel
On (06/04/15 12:30), Minchan Kim wrote:
> > -- free objects in class: 5 (free-objs class capacity)
> > -- page1: inuse 2
> > -- page2: inuse 2
> > -- page3: inuse 3
> > -- page4: inuse 2
>
> > What scenario are you concerned about?
> > Could you describe this example more clearly?
You mean "how is this even possible"?
Well, for example,
make -jX
make clean
can introduce significant fragmentation: no new objects, just random
object removal, assuming that we keep some of the objects allocated during
compilation.
e.g.
...
page1
allocate baz.so
allocate foo.o
page2
allocate bar.o
allocate foo.so
...
pageN
now `make clean`
page1:
allocated baz.so
empty
page2
empty
allocated foo.so
...
pageN
in the worst case, every page can turn out to be ALMOST_EMPTY.
-ss
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 3:42 ` Sergey Senozhatsky
@ 2015-06-04 3:50 ` Minchan Kim
2015-06-04 4:19 ` Sergey Senozhatsky
0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 3:50 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel
On Thu, Jun 04, 2015 at 12:42:30PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 12:30), Minchan Kim wrote:
> > > -- free objects in class: 5 (free-objs class capacity)
> > > -- page1: inuse 2
> > > -- page2: inuse 2
> > > -- page3: inuse 3
> > > -- page4: inuse 2
> >
> > What scenario are you concerned about?
> > Could you describe this example more clearly?
>
> you mean "how is this even possible"?
No, I meant that I couldn't understand your terms. Sorry.
What is "free-objs class capacity"?
Is page1 a zspage?
Let's use consistent terms between us.
For example, maxobj-per-zspage is 4.
A is allocated and used; X is allocated but not used.
So we can draw a zspage as below:
AAXX
and a linked list of several zspages as below:
AAXX - AXXX - AAAX
Could you describe your problem again?
Sorry.
>
> well, for example,
>
> make -jX
> make clean
>
> can introduce significant fragmentation: no new objects, just random
> object removal, assuming that we keep some of the objects allocated during
> compilation.
>
> e.g.
>
> ...
>
> page1
> allocate baz.so
> allocate foo.o
> page2
> allocate bar.o
> allocate foo.so
> ...
> pageN
>
>
>
> now `make clean`
>
> page1:
> allocated baz.so
> empty
>
> page2
> empty
> allocated foo.so
>
> ...
>
> pageN
>
> in the worst case, every page can turn out to be ALMOST_EMPTY.
>
> -ss
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
2015-06-04 3:50 ` Minchan Kim
@ 2015-06-04 4:19 ` Sergey Senozhatsky
0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 4:19 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
linux-kernel
On (06/04/15 12:50), Minchan Kim wrote:
> > On (06/04/15 12:30), Minchan Kim wrote:
> > >
> > > What scenario are you concerned about?
> > > Could you describe this example more clearly?
> >
> > you mean "how is this even possible"?
>
> No, I meant that I couldn't understand your terms. Sorry.
>
> What is "free-objs class capacity"?
> Is page1 a zspage?
>
> Let's use consistent terms between us.
>
> For example, maxobj-per-zspage is 4.
> A is allocated and used. X is allocated but not used.
> so we can draw a zspage below.
>
> AAXX
>
> So we can draw several zspages linked list as below
>
> AAXX - AXXX - AAAX
>
> Could you describe your problem again?
>
> Sorry.
My apologies.
Yes, so:
-- free-objs class capacity -- how many unused allocated objects
we have in this class (in total).
-- page1..pageN -- zspages.
And I think that my example is utterly wrong and incorrect. My mistake.
Sorry for the noise.
-ss
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
@ 2015-06-04 4:57 ` Minchan Kim
2015-06-04 5:30 ` Sergey Senozhatsky
0 siblings, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 4:57 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky
On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> Perform class compaction in zs_free() if zs_free() has created
> a ZS_ALMOST_EMPTY page. This is the most trivial `policy'.
Finally, I realized your intention.
Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
which means compacting automatically when compr_data_size/mem_used_total
falls below the threshold, but I didn't try it because it could be done
by a user-space tool.
Another reason I didn't try the approach is that it could scan all of
the zs objects repeatedly without freeing any zspage in some corner cases,
which could be a big overhead we should prevent, so we might need to add
a heuristic. As an example, we could delay the next few compaction trials
when we found that the previous few trials all failed.
mm/compaction.c has a simple design to prevent pointless overhead,
but historically it caused pain several times and required more
complicated logic, and it's still painful.
Another thing I found recently is that it's not always a win for zsmalloc
when zram is not fragmented. The fragmented space could be used
for storing upcoming compressed objects; although it is wasted space
at the moment, if frequent compaction leaves no holes (i.e., fragmented
space), zsmalloc must allocate a new zspage, which could be allocated
on a movable pageblock via fallback from a non-movable pageblock request
on a system under high memory pressure, so it accelerates the
fragmentation problem of system memory.
So, I want to leave the policy to userspace.
If we find it's really a problem in userspace, then we need more
thinking.
Thanks.
>
> Probably it would make sense for zs_can_compact() to return an estimated
> number of pages that will potentially be freed, and to trigger auto-compaction
> only when it's above some limit (e.g. at least 4 zspages); or to put it
> under a config option.
>
> This also tweaks __zs_compact() -- we can't reschedule
> anymore, waiting for new pages in the current class, so we
> compact as much as we can and return immediately if compaction
> is not possible anymore.
>
> Auto-compaction is not a replacement for manual compaction.
>
> compiled linux kernel with auto-compaction:
>
> cat /sys/block/zram0/mm_stat
> 2339885056 1601034235 1624076288 0 1624076288 19961 1106
>
> performing additional manual compaction:
>
> echo 1 > /sys/block/zram0/compact
> cat /sys/block/zram0/mm_stat
> 2339885056 1601034235 1624051712 0 1624076288 19961 1114
>
> Manual compaction was able to migrate an additional 8 objects, so
> auto-compaction is 'good enough'.
>
> TEST
>
> this test copies a 1.3G linux kernel tar to mounted zram disk,
> and extracts it.
>
> w/auto-compaction:
>
> cat /sys/block/zram0/mm_stat
> 1171456 26006 86016 0 86016 32781 0
>
> time tar xf linux-3.10.tar.gz -C linux
>
> real 0m16.970s
> user 0m15.247s
> sys 0m8.477s
>
> du -sh linux
> 2.0G linux
>
> cat /sys/block/zram0/mm_stat
> 3547353088 2993384270 3011088384 0 3011088384 24310 108
>
> =====================================================================
>
> w/o auto compaction:
>
> cat /sys/block/zram0/mm_stat
> 1171456 26000 81920 0 81920 32781 0
>
> time tar xf linux-3.10.tar.gz -C linux
>
> real 0m16.983s
> user 0m15.267s
> sys 0m8.417s
>
> du -sh linux
> 2.0G linux
>
> cat /sys/block/zram0/mm_stat
> 3548917760 2993566924 3011317760 0 3011317760 23928 0
>
> =====================================================================
>
> iozone shows that auto-compacted code runs faster in several
> tests, which is hardly trustworthy. anyway.
>
> iozone -t 3 -R -r 16K -s 60M -I +Z
>
> test base auto-compact (compacted 66123 objs)
> Initial write 1603682.25 1645112.38
> Rewrite 2502243.31 2256570.31
> Read 7040860.00 7130575.00
> Re-read 7036490.75 7066744.25
> Reverse Read 6617115.25 6155395.50
> Stride read 6705085.50 6350030.38
> Random read 6668497.75 6350129.38
> Mixed workload 5494030.38 5091669.62
> Random write 2526834.44 2500977.81
> Pwrite 1656874.00 1663796.94
> Pread 3322818.91 3359683.44
> Fwrite 4090124.25 4099773.88
> Fread 10358916.25 10324409.75
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
> mm/zsmalloc.c | 25 +++++++++++++------------
> 1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index c2a640a..70bf481 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
>
> while ((dst_page = isolate_target_page(class))) {
> cc.d_page = dst_page;
> - /*
> - * If there is no more space in dst_page, resched
> - * and see if anyone had allocated another zspage.
> - */
> +
> if (!migrate_zspage(pool, class, &cc))
> - break;
> + goto out;
>
> putback_zspage(pool, class, dst_page);
> }
>
> - /* Stop if we couldn't find slot */
> - if (dst_page == NULL)
> + if (!dst_page)
> break;
> -
> putback_zspage(pool, class, dst_page);
> putback_zspage(pool, class, src_page);
> - spin_unlock(&class->lock);
> - cond_resched();
> - spin_lock(&class->lock);
> }
>
> +out:
> + if (dst_page)
> + putback_zspage(pool, class, dst_page);
> if (src_page)
> putback_zspage(pool, class, src_page);
>
> spin_unlock(&class->lock);
> }
>
> -
> unsigned long zs_get_total_pages(struct zs_pool *pool)
> {
> return atomic_long_read(&pool->pages_allocated);
> @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
> unpin_tag(handle);
>
> free_handle(pool, handle);
> +
> + /*
> + * actual fullness might have changed, __zs_compact() checks
> + * if compaction makes sense
> + */
> + if (fullness == ZS_ALMOST_EMPTY)
> + __zs_compact(pool, class);
> }
> EXPORT_SYMBOL_GPL(zs_free);
>
> --
> 2.4.2.337.gfae46aa
>
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-06-04 4:57 ` Minchan Kim
@ 2015-06-04 5:30 ` Sergey Senozhatsky
2015-06-04 6:27 ` Minchan Kim
0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 5:30 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Sergey Senozhatsky
On (06/04/15 13:57), Minchan Kim wrote:
> On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > Perform class compaction in zs_free() if zs_free() has created
> > a ZS_ALMOST_EMPTY page. This is the most trivial `policy'.
>
> Finally, I got realized your intention.
>
> Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio
> which means to compact automatically when compr_data_size/mem_used_total
> is below than the threshold but I didn't try because it could be done
> by usertool.
>
> Another reason I didn't try the approach is that it could scan all of
> zs_objects repeatedly withtout any freeing zspage in some corner cases,
> which could be big overhead we should prevent so we might add some
> heuristic. as an example, we could delay a few compaction trial when
> we found a few previous trials as all fails.
This is why I use zs_can_compact() -- to bail out of zs_compact() as soon
as possible, so useless scans are minimized (well, at least that's the
expectation). I'm also thinking of a threshold-based solution -- do class
auto-compaction only if we can free X pages, for example.
The problem of compaction is that there is no compaction until you trigger
it.
And fragmented classes are not necessarily a win. If writes don't happen
to a fragmented class-X (and we basically can't tell if they will, nor can
we estimate; it's up to I/O and data patterns, compression algorithm, etc.)
then class-X stays fragmented w/o any use.
> It's simple design of mm/compaction.c to prevent pointless overhead
> but historically it made pains several times and required more
> complicated logics but it's still painful.
>
> Other thing I found recently is that it's not always win zsmalloc
> for zram is not fragmented. The fragmented space could be used
> for storing upcoming compressed objects although it is wasted space
> at the moment but if we don't have any hole(ie, fragment space)
> via frequent compaction, zsmalloc should allocate a new zspage
> which could be allocated on movable pageblock by fallback of
> nonmovable pageblock request on highly memory pressure system
> so it accelerates fragment problem of the system memory.
Yes, but compaction almost always leaves classes fragmented. It's only a
corner case, when the number of unused allocated objects is exactly the
same as the number of objects that we migrated, and the number of migrated
objects is exactly N*maxobj_per_zspage, that we leave the class w/o any
unused objects (OBJ_ALLOCATED == OBJ_USED).
Classes have 'holes' after compaction.
> So, I want to pass the policy to userspace.
> If we found it's really trobule on userspace, then, we need more
> thinking.
Well, it could go under a config option: "aggressive compaction" or
"automatic compaction".
-ss
> Thanks.
>
> >
> > Probably it would make sense for zs_can_compact() to return an estimated
> > number of pages that will potentially be freed, and to trigger auto-compaction
> > only when it's above some limit (e.g. at least 4 zspages); or to put it
> > under a config option.
> >
> > This also tweaks __zs_compact() -- we can't reschedule
> > anymore, waiting for new pages in the current class, so we
> > compact as much as we can and return immediately if compaction
> > is not possible anymore.
> >
> > Auto-compaction is not a replacement for manual compaction.
> >
> > compiled linux kernel with auto-compaction:
> >
> > cat /sys/block/zram0/mm_stat
> > 2339885056 1601034235 1624076288 0 1624076288 19961 1106
> >
> > performing additional manual compaction:
> >
> > echo 1 > /sys/block/zram0/compact
> > cat /sys/block/zram0/mm_stat
> > 2339885056 1601034235 1624051712 0 1624076288 19961 1114
> >
> > Manual compaction was able to migrate an additional 8 objects, so
> > auto-compaction is 'good enough'.
> >
> > TEST
> >
> > this test copies a 1.3G linux kernel tar to mounted zram disk,
> > and extracts it.
> >
> > w/auto-compaction:
> >
> > cat /sys/block/zram0/mm_stat
> > 1171456 26006 86016 0 86016 32781 0
> >
> > time tar xf linux-3.10.tar.gz -C linux
> >
> > real 0m16.970s
> > user 0m15.247s
> > sys 0m8.477s
> >
> > du -sh linux
> > 2.0G linux
> >
> > cat /sys/block/zram0/mm_stat
> > 3547353088 2993384270 3011088384 0 3011088384 24310 108
> >
> > =====================================================================
> >
> > w/o auto compaction:
> >
> > cat /sys/block/zram0/mm_stat
> > 1171456 26000 81920 0 81920 32781 0
> >
> > time tar xf linux-3.10.tar.gz -C linux
> >
> > real 0m16.983s
> > user 0m15.267s
> > sys 0m8.417s
> >
> > du -sh linux
> > 2.0G linux
> >
> > cat /sys/block/zram0/mm_stat
> > 3548917760 2993566924 3011317760 0 3011317760 23928 0
> >
> > =====================================================================
> >
> > iozone shows that auto-compacted code runs faster in several
> > tests, which is hardly trustworthy. anyway.
> >
> > iozone -t 3 -R -r 16K -s 60M -I +Z
> >
> > test base auto-compact (compacted 66123 objs)
> > Initial write 1603682.25 1645112.38
> > Rewrite 2502243.31 2256570.31
> > Read 7040860.00 7130575.00
> > Re-read 7036490.75 7066744.25
> > Reverse Read 6617115.25 6155395.50
> > Stride read 6705085.50 6350030.38
> > Random read 6668497.75 6350129.38
> > Mixed workload 5494030.38 5091669.62
> > Random write 2526834.44 2500977.81
> > Pwrite 1656874.00 1663796.94
> > Pread 3322818.91 3359683.44
> > Fwrite 4090124.25 4099773.88
> > Fread 10358916.25 10324409.75
> >
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > ---
> > mm/zsmalloc.c | 25 +++++++++++++------------
> > 1 file changed, 13 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index c2a640a..70bf481 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
> >
> > while ((dst_page = isolate_target_page(class))) {
> > cc.d_page = dst_page;
> > - /*
> > - * If there is no more space in dst_page, resched
> > - * and see if anyone had allocated another zspage.
> > - */
> > +
> > if (!migrate_zspage(pool, class, &cc))
> > - break;
> > + goto out;
> >
> > putback_zspage(pool, class, dst_page);
> > }
> >
> > - /* Stop if we couldn't find slot */
> > - if (dst_page == NULL)
> > + if (!dst_page)
> > break;
> > -
> > putback_zspage(pool, class, dst_page);
> > putback_zspage(pool, class, src_page);
> > - spin_unlock(&class->lock);
> > - cond_resched();
> > - spin_lock(&class->lock);
> > }
> >
> > +out:
> > + if (dst_page)
> > + putback_zspage(pool, class, dst_page);
> > if (src_page)
> > putback_zspage(pool, class, src_page);
> >
> > spin_unlock(&class->lock);
> > }
> >
> > -
> > unsigned long zs_get_total_pages(struct zs_pool *pool)
> > {
> > return atomic_long_read(&pool->pages_allocated);
> > @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
> > unpin_tag(handle);
> >
> > free_handle(pool, handle);
> > +
> > + /*
> > + * actual fullness might have changed, __zs_compact() checks
> > + * if compaction makes sense
> > + */
> > + if (fullness == ZS_ALMOST_EMPTY)
> > + __zs_compact(pool, class);
> > }
> > EXPORT_SYMBOL_GPL(zs_free);
> >
> > --
> > 2.4.2.337.gfae46aa
> >
>
> --
> Kind regards,
> Minchan Kim
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-06-04 5:30 ` Sergey Senozhatsky
@ 2015-06-04 6:27 ` Minchan Kim
2015-06-04 7:04 ` Minchan Kim
2015-06-04 7:28 ` Sergey Senozhatsky
0 siblings, 2 replies; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 6:27 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel
On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 13:57), Minchan Kim wrote:
> > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > perform class compaction in zs_free(), if zs_free() has created
> > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> >
> > Finally, I realized your intention.
> >
> > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
> > which means compacting automatically when compr_data_size/mem_used_total
> > is below the threshold, but I didn't try it because it could be done
> > by a user-space tool.
> >
> > Another reason I didn't try the approach is that it could scan all
> > of the zs objects repeatedly without freeing any zspage in some corner
> > cases, which could be a big overhead we should prevent, so we might
> > need to add some heuristic. As an example, we could delay the next few
> > compaction trials when we found that the previous few trials all failed.
>
> this is why I use zs_can_compact() -- to bail out of zs_compact() as
> soon as possible, so useless scans are minimized (or at least that's
> the expectation). I'm also thinking of a threshold-based solution --
> do class auto-compaction only if we can free X pages, for example.
>
> the problem of compaction is that there is no compaction until you trigger
> it.
>
> and fragmented classes are not necessarily a win. if writes don't happen
> to a fragmented class-X (and we basically can't tell if they will, nor we
> can estimate; it's up to I/O and data patterns, compression algorithm, etc.)
> then class-X stays fragmented w/o any use.
The problem is that migrating objects/freeing the old zspage/allocating
a new zspage is not cheap, either.
If the system has no problem with small fragmented space, there is
no point in paying such overheads.
So, ideally, we should trigger compaction once we realize the system
is in trouble, but I don't have any good idea how to detect that.
That's why I wanted to rely on the decision from the user via
compact_threshold_ratio.
>
> > It's the simple design of mm/compaction.c that prevents pointless
> > overhead, but historically it caused pain several times and required
> > more complicated logic, and it's still painful.
> >
> > The other thing I found recently is that it's not always a win when
> > zsmalloc for zram is not fragmented. The fragmented space could be
> > used for storing upcoming compressed objects, so although it is
> > wasted space at the moment, if frequent compaction leaves us without
> > any holes (i.e. fragment space), zsmalloc has to allocate a new
> > zspage, which on a system under heavy memory pressure could be
> > allocated from a movable pageblock as a fallback for a non-movable
> > pageblock request, so it accelerates the fragmentation problem of
> > system memory.
>
> yes, but compaction almost always leaves classes fragmented. I think
> it's a corner case when the number of unused allocated objects is
> exactly the same as the number of objects that we migrated, and the
> number of migrated objects is exactly N*maxobj_per_zspage, so we
> leave the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> classes have 'holes' after compaction.
>
>
> > So, I want to pass the policy to userspace.
> > If we find it's really trouble for userspace, then we need more
> > thinking.
>
> well, it can be under config "aggressive compaction" or "automatic
> compaction" option.
>
If you really want to do it automatically, without any feedback
from userspace, we should find a better algorithm.
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-06-04 6:27 ` Minchan Kim
@ 2015-06-04 7:04 ` Minchan Kim
2015-06-04 14:47 ` Sergey Senozhatsky
2015-06-04 7:28 ` Sergey Senozhatsky
1 sibling, 1 reply; 30+ messages in thread
From: Minchan Kim @ 2015-06-04 7:04 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel
On Thu, Jun 04, 2015 at 03:27:12PM +0900, Minchan Kim wrote:
> On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> > On (06/04/15 13:57), Minchan Kim wrote:
> > > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > > perform class compaction in zs_free(), if zs_free() has created
> > > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> > >
> > > Finally, I realized your intention.
> > >
> > > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
> > > which means compacting automatically when compr_data_size/mem_used_total
> > > is below the threshold, but I didn't try it because it could be done
> > > by a user-space tool.
> > >
> > > Another reason I didn't try the approach is that it could scan all
> > > of the zs objects repeatedly without freeing any zspage in some corner
> > > cases, which could be a big overhead we should prevent, so we might
> > > need to add some heuristic. As an example, we could delay the next few
> > > compaction trials when we found that the previous few trials all failed.
> >
> > this is why I use zs_can_compact() -- to bail out of zs_compact() as
> > soon as possible, so useless scans are minimized (or at least that's
> > the expectation). I'm also thinking of a threshold-based solution --
> > do class auto-compaction only if we can free X pages, for example.
> >
> > the problem of compaction is that there is no compaction until you trigger
> > it.
> >
> > and fragmented classes are not necessarily a win. if writes don't happen
> > to a fragmented class-X (and we basically can't tell if they will, nor we
> > can estimate; it's up to I/O and data patterns, compression algorithm, etc.)
> > then class-X stays fragmented w/o any use.
>
> The problem is that migrating objects/freeing the old zspage/allocating
> a new zspage is not cheap, either.
> If the system has no problem with small fragmented space, there is
> no point in paying such overheads.
>
> So, ideally, we should trigger compaction once we realize the system
> is in trouble, but I don't have any good idea how to detect that.
> That's why I wanted to rely on the decision from the user via
> compact_threshold_ratio.
>
> >
> > > It's the simple design of mm/compaction.c that prevents pointless
> > > overhead, but historically it caused pain several times and required
> > > more complicated logic, and it's still painful.
> > >
> > > The other thing I found recently is that it's not always a win when
> > > zsmalloc for zram is not fragmented. The fragmented space could be
> > > used for storing upcoming compressed objects, so although it is
> > > wasted space at the moment, if frequent compaction leaves us without
> > > any holes (i.e. fragment space), zsmalloc has to allocate a new
> > > zspage, which on a system under heavy memory pressure could be
> > > allocated from a movable pageblock as a fallback for a non-movable
> > > pageblock request, so it accelerates the fragmentation problem of
> > > system memory.
> >
> > yes, but compaction almost always leaves classes fragmented. I think
> > it's a corner case when the number of unused allocated objects is
> > exactly the same as the number of objects that we migrated, and the
> > number of migrated objects is exactly N*maxobj_per_zspage, so we
> > leave the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> > classes have 'holes' after compaction.
> >
> >
> > > So, I want to pass the policy to userspace.
> > > If we find it's really trouble for userspace, then we need more
> > > thinking.
> >
> > well, it can be under config "aggressive compaction" or "automatic
> > compaction" option.
> >
>
> If you really want to do it automatically, without any feedback
> from userspace, we should find a better algorithm.
How about using a slab shrinker?
If there is memory pressure, it will be called by the VM, so we can
try compaction without the user's intervention, and excessive object
scanning should be avoided by your zs_can_compact().
The concern I had about fragmentation spreading all over the pageblocks
should be solved as a separate issue. I'm planning to make zsmalloc'ed
pages migratable. I hope we work that out first, to prevent heavy
system memory fragmentation caused by automatic compaction.
>
> --
> Kind regards,
> Minchan Kim
--
Kind regards,
Minchan Kim
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-06-04 6:27 ` Minchan Kim
2015-06-04 7:04 ` Minchan Kim
@ 2015-06-04 7:28 ` Sergey Senozhatsky
1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 7:28 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
linux-kernel
On (06/04/15 15:27), Minchan Kim wrote:
[..]
>
> The problem is that migrating objects/freeing the old zspage/allocating
> a new zspage is not cheap, either.
> If the system has no problem with small fragmented space, there is
> no point in paying such overheads.
>
> So, ideally, we should trigger compaction once we realize the system
> is in trouble, but I don't have any good idea how to detect that.
> That's why I wanted to rely on the decision from the user via
> compact_threshold_ratio.
that'll be an extremely hard knob to understand.
well, we can do something like
-- don't let the number of ZS_ALMOST_EMPTY zspages become N times
greater than the number of ZS_ALMOST_FULL ones.
or
-- don't let ZS_ALMOST_EMPTY zspages contribute more than 70% of the
class's memory usage. that is, if 70% of all pages allocated for this
class belong to ZS_ALMOST_EMPTY zspages, we can potentially compact it.
> >
> > > It's the simple design of mm/compaction.c that prevents pointless
> > > overhead, but historically it caused pain several times and required
> > > more complicated logic, and it's still painful.
> > >
> > > The other thing I found recently is that it's not always a win when
> > > zsmalloc for zram is not fragmented. The fragmented space could be
> > > used for storing upcoming compressed objects, so although it is
> > > wasted space at the moment, if frequent compaction leaves us without
> > > any holes (i.e. fragment space), zsmalloc has to allocate a new
> > > zspage, which on a system under heavy memory pressure could be
> > > allocated from a movable pageblock as a fallback for a non-movable
> > > pageblock request, so it accelerates the fragmentation problem of
> > > system memory.
> >
> > yes, but compaction almost always leaves classes fragmented. I think
> > it's a corner case when the number of unused allocated objects is
> > exactly the same as the number of objects that we migrated, and the
> > number of migrated objects is exactly N*maxobj_per_zspage, so we
> > leave the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> > classes have 'holes' after compaction.
> >
> >
> > > So, I want to pass the policy to userspace.
> > > If we find it's really trouble for userspace, then we need more
> > > thinking.
> >
> > well, it can be under config "aggressive compaction" or "automatic
> > compaction" option.
> >
>
> If you really want to do it automatically, without any feedback
> from userspace, we should find a better algorithm.
ok. I'll drop the auto-compaction part for now and will resend the
general/minor zsmalloc tweaks today.
-ss
* Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
2015-06-04 7:04 ` Minchan Kim
@ 2015-06-04 14:47 ` Sergey Senozhatsky
0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2015-06-04 14:47 UTC (permalink / raw)
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm,
linux-kernel
On (06/04/15 16:04), Minchan Kim wrote:
[..]
> How about using a slab shrinker?
> If there is memory pressure, it will be called by the VM, so we can
> try compaction without the user's intervention, and excessive object
> scanning should be avoided by your zs_can_compact().
hm, interesting.
ok, I have a patch to trigger compaction from the shrinker, but I need
to test it more.
will send the updated patchset tomorrow, I think.
-ss
> The concern I had about fragmentation spreading all over the pageblocks
> should be solved as a separate issue. I'm planning to make zsmalloc'ed
> pages migratable. I hope we work that out first, to prevent heavy
> system memory fragmentation caused by automatic compaction.
>
end of thread, other threads:[~2015-06-04 14:48 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-29 15:05 [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Sergey Senozhatsky
2015-06-04 2:04 ` Minchan Kim
2015-06-04 2:10 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Sergey Senozhatsky
2015-06-04 2:18 ` Minchan Kim
2015-06-04 2:34 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Sergey Senozhatsky
2015-06-04 2:55 ` Minchan Kim
2015-06-04 3:15 ` Sergey Senozhatsky
2015-06-04 3:30 ` Minchan Kim
2015-06-04 3:42 ` Sergey Senozhatsky
2015-06-04 3:50 ` Minchan Kim
2015-06-04 4:19 ` Sergey Senozhatsky
2015-06-04 3:31 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Sergey Senozhatsky
2015-06-04 3:14 ` Minchan Kim
2015-05-29 15:05 ` [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 06/10] zsmalloc: move compaction functions Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Sergey Senozhatsky
2015-06-04 4:57 ` Minchan Kim
2015-06-04 5:30 ` Sergey Senozhatsky
2015-06-04 6:27 ` Minchan Kim
2015-06-04 7:04 ` Minchan Kim
2015-06-04 14:47 ` Sergey Senozhatsky
2015-06-04 7:28 ` Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated' Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats Sergey Senozhatsky
2015-05-29 15:05 ` [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline Sergey Senozhatsky
2015-06-03 5:09 ` [RFC][PATCH 00/10] zsmalloc auto-compaction Sergey Senozhatsky