From: Sergey Senozhatsky
Subject: [RFC][PATCH 00/10] zsmalloc auto-compaction
Date: Sat, 30 May 2015 00:05:18 +0900
Message-Id: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

Hello,

RFC

this is 4.3 material, but I wanted to publish it sooner to gain
responses and to settle it down before the 4.3 merge window opens.

in short, this series tweaks zsmalloc's compaction and adds
auto-compaction support. auto-compaction is not aimed to replace
manual compaction; instead, it's supposed to be good enough. yet
it surely slows down zsmalloc in some scenarios, whilst a simple
un-tar test didn't show any significant performance difference.

quote from commit 0007:

this test copies a 1.3G linux kernel tar to a mounted zram disk
and extracts it.

w/ auto-compaction:

cat /sys/block/zram0/mm_stat
1171456 26006 86016 0 86016 32781 0

time tar xf linux-3.10.tar.gz -C linux
real	0m16.970s
user	0m15.247s
sys	0m8.477s

du -sh linux
2.0G	linux

cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384 0 3011088384 24310 108

=====================================================================

w/o auto-compaction:

cat /sys/block/zram0/mm_stat
1171456 26000 81920 0 81920 32781 0

time tar xf linux-3.10.tar.gz -C linux
real	0m16.983s
user	0m15.267s
sys	0m8.417s

du -sh linux
2.0G	linux

cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760 0 3011317760 23928 0

Sergey Senozhatsky (10):
  zsmalloc: drop unused variable `nr_to_migrate'
  zsmalloc: always keep per-class stats
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: add `num_migrated' to zs_pool
  zsmalloc: move compaction functions
  zsmalloc: introduce auto-compact support
  zsmalloc: export zs_pool `num_migrated'
  zram: remove `num_migrated' from zram_stats
  zsmalloc: lower ZS_ALMOST_FULL waterline

 drivers/block/zram/zram_drv.c |  12 +-
 drivers/block/zram/zram_drv.h |   1 -
 include/linux/zsmalloc.h      |   1 +
 mm/zsmalloc.c                 | 578 +++++++++++++++++++++---------------------
 4 files changed, 296 insertions(+), 296 deletions(-)

--
2.4.2.337.gfae46aa
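A note for readers skimming the mm_stat dumps above: assuming zram's
mm_stat layout of that era (orig_data_size, compr_data_size,
mem_used_total, mem_limit, mem_used_max, zero_pages, num_migrated --
this column legend is an inference, it is not spelled out in the
posting), the interesting columns are the third one and the last one.
A tiny standalone C check of the delta between the two runs:

#include <stdio.h>

int main(void)
{
	/*
	 * assumed mm_stat columns: orig_data_size compr_data_size
	 * mem_used_total mem_limit mem_used_max zero_pages num_migrated
	 */
	unsigned long long mem_used_auto  = 3011088384ULL; /* w/ auto-compaction, 108 objs migrated */
	unsigned long long mem_used_plain = 3011317760ULL; /* w/o auto-compaction */

	/* apparent saving of this particular run: 224 KiB */
	printf("saved: %llu KiB\n", (mem_used_plain - mem_used_auto) / 1024);
	return 0;
}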
From: Sergey Senozhatsky
Subject: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate'
Date: Sat, 30 May 2015 00:05:19 +0900
Message-Id: <1432911928-14654-2-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

__zs_compact() does not use `nr_to_migrate', drop it.

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 33d5126..e615b31 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1701,7 +1701,6 @@ static struct page *isolate_source_page(struct size_class *class)
 static unsigned long __zs_compact(struct zs_pool *pool,
 				struct size_class *class)
 {
-	int nr_to_migrate;
 	struct zs_compact_control cc;
 	struct page *src_page;
 	struct page *dst_page = NULL;
@@ -1712,8 +1711,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 
 	BUG_ON(!is_first_page(src_page));
 
-	/* The goal is to migrate all live objects in source page */
-	nr_to_migrate = src_page->inuse;
 	cc.index = 0;
 	cc.s_page = src_page;
 
@@ -1728,7 +1725,6 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		putback_zspage(pool, class, dst_page);
 		nr_total_migrated += cc.nr_migrated;
-		nr_to_migrate -= cc.nr_migrated;
 	}
 
 	/* Stop if we couldn't find slot */

--
2.4.2.337.gfae46aa

From: Sergey Senozhatsky
Subject: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats
Date: Sat, 30 May 2015 00:05:20 +0900
Message-Id: <1432911928-14654-3-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

always account per-class `zs_size_stat' stats. this data will help
us make better decisions during compaction. we are especially
interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if
class compaction will result in any memory gain.
for instance, we know the number of allocated objects in the class, the number of objects being used (so we also know how many objects are not used) and the number of objects per-page. so we can estimate how many pages compaction can free (pages that will turn into ZS_EMPTY during compaction). Signed-off-by: Sergey Senozhatsky --- mm/zsmalloc.c | 49 ++++++++++++------------------------------------- 1 file changed, 12 insertions(+), 37 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index e615b31..778b8db 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -169,14 +169,12 @@ enum zs_stat_type { NR_ZS_STAT_TYPE, }; -#ifdef CONFIG_ZSMALLOC_STAT - -static struct dentry *zs_stat_root; - struct zs_size_stat { unsigned long objs[NR_ZS_STAT_TYPE]; }; +#ifdef CONFIG_ZSMALLOC_STAT +static struct dentry *zs_stat_root; #endif /* @@ -201,25 +199,21 @@ static int zs_size_classes; static const int fullness_threshold_frac = 4; struct size_class { + spinlock_t lock; + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; /* * Size of objects stored in this class. Must be multiple * of ZS_ALIGN. */ - int size; - unsigned int index; + int size; + unsigned int index; /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */ - int pages_per_zspage; - /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ - bool huge; - -#ifdef CONFIG_ZSMALLOC_STAT - struct zs_size_stat stats; -#endif - - spinlock_t lock; + int pages_per_zspage; + struct zs_size_stat stats; - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; + /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ + bool huge; }; /* @@ -439,8 +433,6 @@ static int get_size_class_index(int size) return min(zs_size_classes - 1, idx); } -#ifdef CONFIG_ZSMALLOC_STAT - static inline void zs_stat_inc(struct size_class *class, enum zs_stat_type type, unsigned long cnt) { @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class, return class->stats.objs[type]; } +#ifdef CONFIG_ZSMALLOC_STAT + static int __init zs_stat_init(void) { if (!debugfs_initialized()) @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool) } #else /* CONFIG_ZSMALLOC_STAT */ - -static inline void zs_stat_inc(struct size_class *class, - enum zs_stat_type type, unsigned long cnt) -{ -} - -static inline void zs_stat_dec(struct size_class *class, - enum zs_stat_type type, unsigned long cnt) -{ -} - -static inline unsigned long zs_stat_get(struct size_class *class, - enum zs_stat_type type) -{ - return 0; -} - static int __init zs_stat_init(void) { return 0; @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool) static inline void zs_pool_stat_destroy(struct zs_pool *pool) { } - #endif @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class, class->size, class->pages_per_zspage)); atomic_long_sub(class->pages_per_zspage, &pool->pages_allocated); - free_zspage(first_page); } } -- 2.4.2.337.gfae46aa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org
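Patch 02's point -- unconditional OBJ_ALLOCATED/OBJ_USED accounting --
boils down to two counters per size class. A minimal standalone
userspace sketch of that bookkeeping (the struct and enum names mirror
the patch; the helpers and the scenario are purely illustrative, not
kernel code):

#include <stdio.h>

/* mirrors enum zs_stat_type / struct zs_size_stat from mm/zsmalloc.c */
enum zs_stat_type { OBJ_ALLOCATED, OBJ_USED, NR_ZS_STAT_TYPE };

struct zs_size_stat {
	unsigned long objs[NR_ZS_STAT_TYPE];
};

/* every obj_malloc()/obj_free() style event updates the counters */
static void stat_inc(struct zs_size_stat *s, enum zs_stat_type t, unsigned long n)
{
	s->objs[t] += n;
}

static void stat_dec(struct zs_size_stat *s, enum zs_stat_type t, unsigned long n)
{
	s->objs[t] -= n;
}

int main(void)
{
	struct zs_size_stat stats = { { 0 } };

	/* a new zspage with 4 object slots is allocated, 3 objects get used */
	stat_inc(&stats, OBJ_ALLOCATED, 4);
	stat_inc(&stats, OBJ_USED, 3);
	/* one object is freed later */
	stat_dec(&stats, OBJ_USED, 1);

	printf("allocated=%lu used=%lu unused=%lu\n",
	       stats.objs[OBJ_ALLOCATED], stats.objs[OBJ_USED],
	       stats.objs[OBJ_ALLOCATED] - stats.objs[OBJ_USED]);
	return 0;
}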
From: Sergey Senozhatsky
Subject: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function
Date: Sat, 30 May 2015 00:05:21 +0900
Message-Id: <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

this function checks if class compaction will free any pages.
rephrasing: do we have enough unused objects to form at least one
ZS_EMPTY page and free it. it aborts compaction if class compaction
will not result in any (further) savings.

EXAMPLE (this debug output is not part of this patch set):
 -- class size
 -- number of allocated objects
 -- number of used objects
 -- estimated number of pages that will be freed

[..]
[ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6
[ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5
[ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4
[ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3
[ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2
[ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1
[ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1
[ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0
[ 3303.108998] class-2720: Compaction is useless
[ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1
[ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1
[ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0
[ 3303.109011] class-2336: Compaction is useless
[ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0
[ 3303.109014] class-1808: Compaction is useless
[ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0
[ 3303.109017] class-1744: Compaction is useless
[ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0
[ 3303.109020] class-1536: Compaction is useless
[ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0
[ 3303.109023] class-1488: Compaction is useless
[ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0
[ 3303.109026] class-1360: Compaction is useless
[ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0
[ 3303.109029] class-1248: Compaction is useless
[ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0
[ 3303.109032] class-1216: Compaction is useless
[ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2
[ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 pages-tofree:1
[ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0
[ 3303.109043] class-1104: Compaction is useless
[..]

every "Compaction is useless" indicates that we saved some CPU cycles.

for example, class-1104 has
 605 objects allocated
 599 objects used
 11 objects per-page

even if we have an ALMOST_EMPTY page, we still don't have enough room
to move all of its objects and free this page; so compaction will not
make a lot of sense here, it's better to just leave the page as is.

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 778b8db..9ef6f15 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class)
 	return page;
 }
 
+/*
+ * Make sure that we actually can compact this class,
+ * IOW if migration will empty at least one page.
+ *
+ * should be called under class->lock
+ */
+static bool zs_can_compact(struct size_class *class)
+{
+	/*
+	 * calculate how many unused allocated objects we
+	 * have and see if we can free any zspages. otherwise,
+	 * compaction can just move objects back and forth w/o
+	 * any memory gain.
+	 */
+	unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) -
+			zs_stat_get(class, OBJ_USED);
+
+	ret /= get_maxobj_per_zspage(class->size,
+			class->pages_per_zspage);
+	return ret > 0;
+}
+
 static unsigned long __zs_compact(struct zs_pool *pool,
 				struct size_class *class)
 {
@@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 
 		BUG_ON(!is_first_page(src_page));
 
+		if (!zs_can_compact(class))
+			break;
+
 		cc.index = 0;
 		cc.s_page = src_page;

--
2.4.2.337.gfae46aa
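The estimate in zs_can_compact() is plain integer arithmetic over the
two counters. A standalone sketch of the same formula, plugging in the
class-3072 and class-1104 numbers from the debug output above
(get_maxobj_per_zspage() is replaced by the objs-per-page values shown
in the log):

#include <stdio.h>

/* same formula as zs_can_compact(): unused objects / objects per zspage */
static unsigned long pages_to_free(unsigned long allocated,
				   unsigned long used,
				   unsigned long objs_per_zspage)
{
	return (allocated - used) / objs_per_zspage;
}

int main(void)
{
	/* class-3072 from the log: (24652 - 24628) / 4 = 6 pages to free */
	printf("class-3072: %lu\n", pages_to_free(24652, 24628, 4));
	/* class-1104 from the log: (605 - 599) / 11 = 0 -- compaction is useless */
	printf("class-1104: %lu\n", pages_to_free(605, 599, 11));
	return 0;
}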
From: Sergey Senozhatsky
Subject: [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments
Date: Sat, 30 May 2015 00:05:22 +0900
Message-Id: <1432911928-14654-5-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

change zs_object_copy() argument order to be (DST, SRC) rather than
(SRC, DST): copy/move functions usually have a (to, from) argument
order.

rename alloc_target_page() to isolate_target_page(): this function
doesn't allocate anything, it isolates the target page, pretty much
like isolate_source_page().

tweak the __zs_compact() comment.

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9ef6f15..fa72a81 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
-static void zs_object_copy(unsigned long src, unsigned long dst,
+static void zs_object_copy(unsigned long dst, unsigned long src,
 				struct size_class *class)
 {
 	struct page *s_page, *d_page;
@@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 
 		used_obj = handle_to_obj(handle);
 		free_obj = obj_malloc(d_page, class, handle);
-		zs_object_copy(used_obj, free_obj, class);
+		zs_object_copy(free_obj, used_obj, class);
 		index++;
 		record_obj(handle, free_obj);
 		unpin_tag(handle);
@@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 	return ret;
 }
 
-static struct page *alloc_target_page(struct size_class *class)
+static struct page *isolate_target_page(struct size_class *class)
 {
 	int i;
 	struct page *page;
@@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		cc.index = 0;
 		cc.s_page = src_page;
 
-		while ((dst_page = alloc_target_page(class))) {
+		while ((dst_page = isolate_target_page(class))) {
 			cc.d_page = dst_page;
 			/*
-			 * If there is no more space in dst_page, try to
-			 * allocate another zspage.
+			 * If there is no more space in dst_page, resched
+			 * and see if anyone had allocated another zspage.
 			 */
 			if (!migrate_zspage(pool, class, &cc))
 				break;

--
2.4.2.337.gfae46aa

From: Sergey Senozhatsky
Subject: [RFC][PATCH 05/10] zsmalloc: add `num_migrated' to zs_pool
Date: Sat, 30 May 2015 00:05:23 +0900
Message-Id: <1432911928-14654-6-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

remove the number of migrated objects from `zs_compact_control' and
make it a `zs_pool' member. `zs_compact_control' has a limited
lifespan: we lose it when zs_compact() returns to zram. to keep track
of objects migrated during auto-compaction we need to store this
number in zs_pool.
Signed-off-by: Sergey Senozhatsky --- mm/zsmalloc.c | 36 ++++++++++++++---------------------- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index fa72a81..54eefc3 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -237,16 +237,19 @@ struct link_free { }; struct zs_pool { - char *name; + char *name; - struct size_class **size_class; - struct kmem_cache *handle_cachep; + struct size_class **size_class; + struct kmem_cache *handle_cachep; - gfp_t flags; /* allocation flags used when growing pool */ - atomic_long_t pages_allocated; + /* allocation flags used when growing pool */ + gfp_t flags; + atomic_long_t pages_allocated; + /* how many of objects were migrated */ + unsigned long num_migrated; #ifdef CONFIG_ZSMALLOC_STAT - struct dentry *stat_dentry; + struct dentry *stat_dentry; #endif }; @@ -1576,8 +1579,6 @@ struct zs_compact_control { /* Starting object index within @s_page which used for live object * in the subpage. */ int index; - /* how many of objects are migrated */ - int nr_migrated; }; static int migrate_zspage(struct zs_pool *pool, struct size_class *class, @@ -1588,7 +1589,6 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class, struct page *s_page = cc->s_page; struct page *d_page = cc->d_page; unsigned long index = cc->index; - int nr_migrated = 0; int ret = 0; while (1) { @@ -1615,13 +1615,12 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class, record_obj(handle, free_obj); unpin_tag(handle); obj_free(pool, class, used_obj); - nr_migrated++; + pool->num_migrated++; } /* Remember last position in this iteration */ cc->s_page = s_page; cc->index = index; - cc->nr_migrated = nr_migrated; return ret; } @@ -1695,13 +1694,11 @@ static bool zs_can_compact(struct size_class *class) return ret > 0; } -static unsigned long __zs_compact(struct zs_pool *pool, - struct size_class *class) +static void __zs_compact(struct zs_pool *pool, struct size_class *class) { struct zs_compact_control cc; struct page *src_page; struct page *dst_page = NULL; - unsigned long nr_total_migrated = 0; spin_lock(&class->lock); while ((src_page = isolate_source_page(class))) { @@ -1724,7 +1721,6 @@ static unsigned long __zs_compact(struct zs_pool *pool, break; putback_zspage(pool, class, dst_page); - nr_total_migrated += cc.nr_migrated; } /* Stop if we couldn't find slot */ @@ -1734,7 +1730,6 @@ static unsigned long __zs_compact(struct zs_pool *pool, putback_zspage(pool, class, dst_page); putback_zspage(pool, class, src_page); spin_unlock(&class->lock); - nr_total_migrated += cc.nr_migrated; cond_resched(); spin_lock(&class->lock); } @@ -1743,14 +1738,11 @@ static unsigned long __zs_compact(struct zs_pool *pool, putback_zspage(pool, class, src_page); spin_unlock(&class->lock); - - return nr_total_migrated; } unsigned long zs_compact(struct zs_pool *pool) { int i; - unsigned long nr_migrated = 0; struct size_class *class; for (i = zs_size_classes - 1; i >= 0; i--) { @@ -1759,10 +1751,10 @@ unsigned long zs_compact(struct zs_pool *pool) continue; if (class->index != i) continue; - nr_migrated += __zs_compact(pool, class); + __zs_compact(pool, class); } - - return nr_migrated; + /* can be a bit outdated */ + return pool->num_migrated; } EXPORT_SYMBOL_GPL(zs_compact); -- 2.4.2.337.gfae46aa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by kanga.kvack.org (Postfix) with ESMTP id 8D0306B0088 for ; Fri, 29 May 2015 11:06:34 -0400 (EDT) Received: by pabru16 with SMTP id ru16so62706630pab.1 for ; Fri, 29 May 2015 08:06:34 -0700 (PDT) Received: from mail-pd0-x22d.google.com (mail-pd0-x22d.google.com. [2607:f8b0:400e:c02::22d]) by mx.google.com with ESMTPS id fj8si8819616pdb.93.2015.05.29.08.06.33 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 29 May 2015 08:06:33 -0700 (PDT) Received: by pdfh10 with SMTP id h10so55532648pdf.3 for ; Fri, 29 May 2015 08:06:33 -0700 (PDT) From: Sergey Senozhatsky Subject: [RFC][PATCH 06/10] zsmalloc: move compaction functions Date: Sat, 30 May 2015 00:05:24 +0900 Message-Id: <1432911928-14654-7-git-send-email-sergey.senozhatsky@gmail.com> In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton , Minchan Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky , Sergey Senozhatsky this patch simply moves compaction functions, so we can call `static __zs_compaction()' (and friends) from zs_free(). Signed-off-by: Sergey Senozhatsky --- mm/zsmalloc.c | 426 +++++++++++++++++++++++++++++----------------------------- 1 file changed, 215 insertions(+), 211 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 54eefc3..c2a640a 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -321,6 +321,7 @@ static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp, *handle = zs_malloc(pool, size); return *handle ? 0 : -1; } + static void zs_zpool_free(void *pool, unsigned long handle) { zs_free(pool, handle); @@ -352,6 +353,7 @@ static void *zs_zpool_map(void *pool, unsigned long handle, return zs_map_object(pool, handle, zs_mm); } + static void zs_zpool_unmap(void *pool, unsigned long handle) { zs_unmap_object(pool, handle); @@ -590,7 +592,6 @@ static inline void zs_pool_stat_destroy(struct zs_pool *pool) } #endif - /* * For each size class, zspages are divided into different groups * depending on how "full" they are. This was done so that we could @@ -1117,7 +1118,6 @@ out: /* enable page faults to match kunmap_atomic() return conditions */ pagefault_enable(); } - #endif /* CONFIG_PGTABLE_MAPPING */ static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action, @@ -1207,115 +1207,6 @@ static bool zspage_full(struct page *page) return page->inuse == page->objects; } -unsigned long zs_get_total_pages(struct zs_pool *pool) -{ - return atomic_long_read(&pool->pages_allocated); -} -EXPORT_SYMBOL_GPL(zs_get_total_pages); - -/** - * zs_map_object - get address of allocated object from handle. - * @pool: pool from which the object was allocated - * @handle: handle returned from zs_malloc - * - * Before using an object allocated from zs_malloc, it must be mapped using - * this function. When done with the object, it must be unmapped using - * zs_unmap_object. - * - * Only one object can be mapped per cpu at a time. There is no protection - * against nested mappings. - * - * This function returns with preemption and page faults disabled. 
- */ -void *zs_map_object(struct zs_pool *pool, unsigned long handle, - enum zs_mapmode mm) -{ - struct page *page; - unsigned long obj, obj_idx, off; - - unsigned int class_idx; - enum fullness_group fg; - struct size_class *class; - struct mapping_area *area; - struct page *pages[2]; - void *ret; - - BUG_ON(!handle); - - /* - * Because we use per-cpu mapping areas shared among the - * pools/users, we can't allow mapping in interrupt context - * because it can corrupt another users mappings. - */ - BUG_ON(in_interrupt()); - - /* From now on, migration cannot move the object */ - pin_tag(handle); - - obj = handle_to_obj(handle); - obj_to_location(obj, &page, &obj_idx); - get_zspage_mapping(get_first_page(page), &class_idx, &fg); - class = pool->size_class[class_idx]; - off = obj_idx_to_offset(page, obj_idx, class->size); - - area = &get_cpu_var(zs_map_area); - area->vm_mm = mm; - if (off + class->size <= PAGE_SIZE) { - /* this object is contained entirely within a page */ - area->vm_addr = kmap_atomic(page); - ret = area->vm_addr + off; - goto out; - } - - /* this object spans two pages */ - pages[0] = page; - pages[1] = get_next_page(page); - BUG_ON(!pages[1]); - - ret = __zs_map_object(area, pages, off, class->size); -out: - if (!class->huge) - ret += ZS_HANDLE_SIZE; - - return ret; -} -EXPORT_SYMBOL_GPL(zs_map_object); - -void zs_unmap_object(struct zs_pool *pool, unsigned long handle) -{ - struct page *page; - unsigned long obj, obj_idx, off; - - unsigned int class_idx; - enum fullness_group fg; - struct size_class *class; - struct mapping_area *area; - - BUG_ON(!handle); - - obj = handle_to_obj(handle); - obj_to_location(obj, &page, &obj_idx); - get_zspage_mapping(get_first_page(page), &class_idx, &fg); - class = pool->size_class[class_idx]; - off = obj_idx_to_offset(page, obj_idx, class->size); - - area = this_cpu_ptr(&zs_map_area); - if (off + class->size <= PAGE_SIZE) - kunmap_atomic(area->vm_addr); - else { - struct page *pages[2]; - - pages[0] = page; - pages[1] = get_next_page(page); - BUG_ON(!pages[1]); - - __zs_unmap_object(area, pages, off, class->size); - } - put_cpu_var(zs_map_area); - unpin_tag(handle); -} -EXPORT_SYMBOL_GPL(zs_unmap_object); - static unsigned long obj_malloc(struct page *first_page, struct size_class *class, unsigned long handle) { @@ -1347,63 +1238,6 @@ static unsigned long obj_malloc(struct page *first_page, return obj; } - -/** - * zs_malloc - Allocate block of given size from pool. - * @pool: pool to allocate from - * @size: size of block to allocate - * - * On success, handle to the allocated object is returned, - * otherwise 0. - * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail. 
- */ -unsigned long zs_malloc(struct zs_pool *pool, size_t size) -{ - unsigned long handle, obj; - struct size_class *class; - struct page *first_page; - - if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE)) - return 0; - - handle = alloc_handle(pool); - if (!handle) - return 0; - - /* extra space in chunk to keep the handle */ - size += ZS_HANDLE_SIZE; - class = pool->size_class[get_size_class_index(size)]; - - spin_lock(&class->lock); - first_page = find_get_zspage(class); - - if (!first_page) { - spin_unlock(&class->lock); - first_page = alloc_zspage(class, pool->flags); - if (unlikely(!first_page)) { - free_handle(pool, handle); - return 0; - } - - set_zspage_mapping(first_page, class->index, ZS_EMPTY); - atomic_long_add(class->pages_per_zspage, - &pool->pages_allocated); - - spin_lock(&class->lock); - zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage( - class->size, class->pages_per_zspage)); - } - - obj = obj_malloc(first_page, class, handle); - /* Now move the zspage to another fullness group, if required */ - fix_fullness_group(class, first_page); - record_obj(handle, obj); - spin_unlock(&class->lock); - - return handle; -} -EXPORT_SYMBOL_GPL(zs_malloc); - static void obj_free(struct zs_pool *pool, struct size_class *class, unsigned long obj) { @@ -1436,42 +1270,6 @@ static void obj_free(struct zs_pool *pool, struct size_class *class, zs_stat_dec(class, OBJ_USED, 1); } -void zs_free(struct zs_pool *pool, unsigned long handle) -{ - struct page *first_page, *f_page; - unsigned long obj, f_objidx; - int class_idx; - struct size_class *class; - enum fullness_group fullness; - - if (unlikely(!handle)) - return; - - pin_tag(handle); - obj = handle_to_obj(handle); - obj_to_location(obj, &f_page, &f_objidx); - first_page = get_first_page(f_page); - - get_zspage_mapping(first_page, &class_idx, &fullness); - class = pool->size_class[class_idx]; - - spin_lock(&class->lock); - obj_free(pool, class, obj); - fullness = fix_fullness_group(class, first_page); - if (fullness == ZS_EMPTY) { - zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage( - class->size, class->pages_per_zspage)); - atomic_long_sub(class->pages_per_zspage, - &pool->pages_allocated); - free_zspage(first_page); - } - spin_unlock(&class->lock); - unpin_tag(handle); - - free_handle(pool, handle); -} -EXPORT_SYMBOL_GPL(zs_free); - static void zs_object_copy(unsigned long dst, unsigned long src, struct size_class *class) { @@ -1572,13 +1370,17 @@ static unsigned long find_alloced_obj(struct page *page, int index, struct zs_compact_control { /* Source page for migration which could be a subpage of zspage. */ - struct page *s_page; - /* Destination page for migration which should be a first page - * of zspage. */ - struct page *d_page; - /* Starting object index within @s_page which used for live object - * in the subpage. */ - int index; + struct page *s_page; + /* + * Destination page for migration which should be a first page + * of zspage. + */ + struct page *d_page; + /* + * Starting object index within @s_page which used for live object + * in the subpage. + */ + int index; }; static int migrate_zspage(struct zs_pool *pool, struct size_class *class, @@ -1740,6 +1542,208 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class) spin_unlock(&class->lock); } + +unsigned long zs_get_total_pages(struct zs_pool *pool) +{ + return atomic_long_read(&pool->pages_allocated); +} +EXPORT_SYMBOL_GPL(zs_get_total_pages); + +/** + * zs_map_object - get address of allocated object from handle. 
+ * @pool: pool from which the object was allocated + * @handle: handle returned from zs_malloc + * + * Before using an object allocated from zs_malloc, it must be mapped using + * this function. When done with the object, it must be unmapped using + * zs_unmap_object. + * + * Only one object can be mapped per cpu at a time. There is no protection + * against nested mappings. + * + * This function returns with preemption and page faults disabled. + */ +void *zs_map_object(struct zs_pool *pool, unsigned long handle, + enum zs_mapmode mm) +{ + struct page *page; + unsigned long obj, obj_idx, off; + + unsigned int class_idx; + enum fullness_group fg; + struct size_class *class; + struct mapping_area *area; + struct page *pages[2]; + void *ret; + + BUG_ON(!handle); + + /* + * Because we use per-cpu mapping areas shared among the + * pools/users, we can't allow mapping in interrupt context + * because it can corrupt another users mappings. + */ + BUG_ON(in_interrupt()); + + /* From now on, migration cannot move the object */ + pin_tag(handle); + + obj = handle_to_obj(handle); + obj_to_location(obj, &page, &obj_idx); + get_zspage_mapping(get_first_page(page), &class_idx, &fg); + class = pool->size_class[class_idx]; + off = obj_idx_to_offset(page, obj_idx, class->size); + + area = &get_cpu_var(zs_map_area); + area->vm_mm = mm; + if (off + class->size <= PAGE_SIZE) { + /* this object is contained entirely within a page */ + area->vm_addr = kmap_atomic(page); + ret = area->vm_addr + off; + goto out; + } + + /* this object spans two pages */ + pages[0] = page; + pages[1] = get_next_page(page); + BUG_ON(!pages[1]); + + ret = __zs_map_object(area, pages, off, class->size); +out: + if (!class->huge) + ret += ZS_HANDLE_SIZE; + + return ret; +} +EXPORT_SYMBOL_GPL(zs_map_object); + +void zs_unmap_object(struct zs_pool *pool, unsigned long handle) +{ + struct page *page; + unsigned long obj, obj_idx, off; + + unsigned int class_idx; + enum fullness_group fg; + struct size_class *class; + struct mapping_area *area; + + BUG_ON(!handle); + + obj = handle_to_obj(handle); + obj_to_location(obj, &page, &obj_idx); + get_zspage_mapping(get_first_page(page), &class_idx, &fg); + class = pool->size_class[class_idx]; + off = obj_idx_to_offset(page, obj_idx, class->size); + + area = this_cpu_ptr(&zs_map_area); + if (off + class->size <= PAGE_SIZE) + kunmap_atomic(area->vm_addr); + else { + struct page *pages[2]; + + pages[0] = page; + pages[1] = get_next_page(page); + BUG_ON(!pages[1]); + + __zs_unmap_object(area, pages, off, class->size); + } + put_cpu_var(zs_map_area); + unpin_tag(handle); +} +EXPORT_SYMBOL_GPL(zs_unmap_object); + +/** + * zs_malloc - Allocate block of given size from pool. + * @pool: pool to allocate from + * @size: size of block to allocate + * + * On success, handle to the allocated object is returned, + * otherwise 0. + * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail. 
+ */ +unsigned long zs_malloc(struct zs_pool *pool, size_t size) +{ + unsigned long handle, obj; + struct size_class *class; + struct page *first_page; + + if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE)) + return 0; + + handle = alloc_handle(pool); + if (!handle) + return 0; + + /* extra space in chunk to keep the handle */ + size += ZS_HANDLE_SIZE; + class = pool->size_class[get_size_class_index(size)]; + + spin_lock(&class->lock); + first_page = find_get_zspage(class); + + if (!first_page) { + spin_unlock(&class->lock); + first_page = alloc_zspage(class, pool->flags); + if (unlikely(!first_page)) { + free_handle(pool, handle); + return 0; + } + + set_zspage_mapping(first_page, class->index, ZS_EMPTY); + atomic_long_add(class->pages_per_zspage, + &pool->pages_allocated); + + spin_lock(&class->lock); + zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage( + class->size, class->pages_per_zspage)); + } + + obj = obj_malloc(first_page, class, handle); + /* Now move the zspage to another fullness group, if required */ + fix_fullness_group(class, first_page); + record_obj(handle, obj); + spin_unlock(&class->lock); + + return handle; +} +EXPORT_SYMBOL_GPL(zs_malloc); + +void zs_free(struct zs_pool *pool, unsigned long handle) +{ + struct page *first_page, *f_page; + unsigned long obj, f_objidx; + int class_idx; + struct size_class *class; + enum fullness_group fullness; + + if (unlikely(!handle)) + return; + + pin_tag(handle); + obj = handle_to_obj(handle); + obj_to_location(obj, &f_page, &f_objidx); + first_page = get_first_page(f_page); + + get_zspage_mapping(first_page, &class_idx, &fullness); + class = pool->size_class[class_idx]; + + spin_lock(&class->lock); + obj_free(pool, class, obj); + fullness = fix_fullness_group(class, first_page); + if (fullness == ZS_EMPTY) { + zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage( + class->size, class->pages_per_zspage)); + atomic_long_sub(class->pages_per_zspage, + &pool->pages_allocated); + free_zspage(first_page); + } + spin_unlock(&class->lock); + unpin_tag(handle); + + free_handle(pool, handle); +} +EXPORT_SYMBOL_GPL(zs_free); + unsigned long zs_compact(struct zs_pool *pool) { int i; -- 2.4.2.337.gfae46aa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f175.google.com (mail-pd0-f175.google.com [209.85.192.175]) by kanga.kvack.org (Postfix) with ESMTP id 216936B008A for ; Fri, 29 May 2015 11:06:38 -0400 (EDT) Received: by pdea3 with SMTP id a3so55496473pde.2 for ; Fri, 29 May 2015 08:06:37 -0700 (PDT) Received: from mail-pa0-x22d.google.com (mail-pa0-x22d.google.com. 
[2607:f8b0:400e:c03::22d]) by mx.google.com; Fri, 29 May 2015 08:06:37 -0700 (PDT)

From: Sergey Senozhatsky
Subject: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
Date: Sat, 30 May 2015 00:05:25 +0900
Message-Id: <1432911928-14654-8-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

perform class compaction in zs_free(), if zs_free() has created a
ZS_ALMOST_EMPTY page. this is the most trivial `policy'. a better one
would probably make zs_can_compact() return an estimated number of
pages that can potentially be freed and trigger auto-compaction only
when that number is above some limit (e.g. at least 4 zspages); or put
it under a config option.

this also tweaks __zs_compact() -- we can't reschedule anymore,
waiting for new pages in the current class. so we compact as much as
we can and return immediately when compaction is not possible anymore.

auto-compaction is not a replacement for manual compaction.

compiled linux kernel with auto-compaction:

cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624076288 0 1624076288 19961 1106

performing additional manual compaction:

echo 1 > /sys/block/zram0/compact
cat /sys/block/zram0/mm_stat
2339885056 1601034235 1624051712 0 1624076288 19961 1114

manual compaction was able to migrate an additional 8 objects, so
auto-compaction is 'good enough'.

TEST

this test copies a 1.3G linux kernel tar to a mounted zram disk and
extracts it.

w/ auto-compaction:

cat /sys/block/zram0/mm_stat
1171456 26006 86016 0 86016 32781 0

time tar xf linux-3.10.tar.gz -C linux
real	0m16.970s
user	0m15.247s
sys	0m8.477s

du -sh linux
2.0G	linux

cat /sys/block/zram0/mm_stat
3547353088 2993384270 3011088384 0 3011088384 24310 108

=====================================================================

w/o auto-compaction:

cat /sys/block/zram0/mm_stat
1171456 26000 81920 0 81920 32781 0

time tar xf linux-3.10.tar.gz -C linux
real	0m16.983s
user	0m15.267s
sys	0m8.417s

du -sh linux
2.0G	linux

cat /sys/block/zram0/mm_stat
3548917760 2993566924 3011317760 0 3011317760 23928 0

=====================================================================

iozone shows that the auto-compacted code runs faster in several
tests, which is hardly trustworthy. anyway:
iozone -t 3 -R -r 16K -s 60M -I +Z

 test             base            auto-compact (compacted 66123 objs)
 Initial write     1603682.25      1645112.38
 Rewrite           2502243.31      2256570.31
 Read              7040860.00      7130575.00
 Re-read           7036490.75      7066744.25
 Reverse Read      6617115.25      6155395.50
 Stride read       6705085.50      6350030.38
 Random read       6668497.75      6350129.38
 Mixed workload    5494030.38      5091669.62
 Random write      2526834.44      2500977.81
 Pwrite            1656874.00      1663796.94
 Pread             3322818.91      3359683.44
 Fwrite            4090124.25      4099773.88
 Fread            10358916.25     10324409.75

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 25 +++++++++++++++++++++++++
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c2a640a..70bf481 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
 		while ((dst_page = isolate_target_page(class))) {
 			cc.d_page = dst_page;
-			/*
-			 * If there is no more space in dst_page, resched
-			 * and see if anyone had allocated another zspage.
-			 */
+
 			if (!migrate_zspage(pool, class, &cc))
-				break;
+				goto out;
 
 			putback_zspage(pool, class, dst_page);
 		}
 
-		/* Stop if we couldn't find slot */
-		if (dst_page == NULL)
+		if (!dst_page)
 			break;
 
-		putback_zspage(pool, class, dst_page);
 		putback_zspage(pool, class, src_page);
-		spin_unlock(&class->lock);
-		cond_resched();
-		spin_lock(&class->lock);
 	}
 
+out:
+	if (dst_page)
+		putback_zspage(pool, class, dst_page);
 	if (src_page)
 		putback_zspage(pool, class, src_page);
 
 	spin_unlock(&class->lock);
 }
 
-
 unsigned long zs_get_total_pages(struct zs_pool *pool)
 {
 	return atomic_long_read(&pool->pages_allocated);
@@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	unpin_tag(handle);
 
 	free_handle(pool, handle);
+
+	/*
+	 * actual fullness might have changed, __zs_compact() checks
+	 * if compaction makes sense
+	 */
+	if (fullness == ZS_ALMOST_EMPTY)
+		__zs_compact(pool, class);
 }
 EXPORT_SYMBOL_GPL(zs_free);

--
2.4.2.337.gfae46aa
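The trigger added to zs_free() is tiny. A hedged userspace mock-up of
the control flow (the fullness groups are real zsmalloc names; the
stubs, the lock elision and the scenario in main() are illustrative
only, not the kernel implementation):

#include <stdio.h>
#include <stdbool.h>

enum fullness_group { ZS_ALMOST_FULL, ZS_ALMOST_EMPTY, ZS_EMPTY, ZS_FULL };

/* stand-in for zs_can_compact(): pretend there is something to gain */
static bool zs_can_compact_stub(void) { return true; }

static void __zs_compact_stub(void)
{
	if (!zs_can_compact_stub())
		return;		/* nothing to gain, bail out early */
	printf("compacting class\n");
}

/* the shape of the zs_free() tail after this patch */
static void zs_free_tail(enum fullness_group fullness)
{
	/*
	 * actual fullness might have changed since it was computed;
	 * __zs_compact() re-checks whether compaction makes sense
	 */
	if (fullness == ZS_ALMOST_EMPTY)
		__zs_compact_stub();
}

int main(void)
{
	zs_free_tail(ZS_ALMOST_FULL);	/* no compaction */
	zs_free_tail(ZS_ALMOST_EMPTY);	/* triggers auto-compaction */
	return 0;
}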
From: Sergey Senozhatsky
Subject: [RFC][PATCH 08/10] zsmalloc: export zs_pool `num_migrated'
Date: Sat, 30 May 2015 00:05:26 +0900
Message-Id: <1432911928-14654-9-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

introduce zs_get_num_migrated() to export zs_pool's ->num_migrated
counter.

Signed-off-by: Sergey Senozhatsky
---
 include/linux/zsmalloc.h | 1 +
 mm/zsmalloc.c            | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 1338190..e878875 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -47,6 +47,7 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 
 unsigned long zs_get_total_pages(struct zs_pool *pool);
+unsigned long zs_get_num_migrated(struct zs_pool *pool);
 unsigned long zs_compact(struct zs_pool *pool);
 
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 70bf481..0524c4a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1543,6 +1543,13 @@ unsigned long zs_get_total_pages(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_get_total_pages);
 
+unsigned long zs_get_num_migrated(struct zs_pool *pool)
+{
+	/* can be outdated */
+	return pool->num_migrated;
+}
+EXPORT_SYMBOL_GPL(zs_get_num_migrated);
+
 /**
  * zs_map_object - get address of allocated object from handle.
  * @pool: pool from which the object was allocated

--
2.4.2.337.gfae46aa

From: Sergey Senozhatsky
Subject: [RFC][PATCH 09/10] zram: remove `num_migrated' from zram_stats
Date: Sat, 30 May 2015 00:05:27 +0900
Message-Id: <1432911928-14654-10-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

drop zram's copy of the `num_migrated' counter and use zs_pool's
zs_get_num_migrated() instead.
Signed-off-by: Sergey Senozhatsky --- drivers/block/zram/zram_drv.c | 12 ++++++------ drivers/block/zram/zram_drv.h | 1 - 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 28f6e46..31e45b4 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -385,7 +385,6 @@ static ssize_t comp_algorithm_store(struct device *dev, static ssize_t compact_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { - unsigned long nr_migrated; struct zram *zram = dev_to_zram(dev); struct zram_meta *meta; @@ -396,8 +395,7 @@ static ssize_t compact_store(struct device *dev, } meta = zram->meta; - nr_migrated = zs_compact(meta->mem_pool); - atomic64_add(nr_migrated, &zram->stats.num_migrated); + zs_compact(meta->mem_pool); up_read(&zram->init_lock); return len; @@ -425,13 +423,15 @@ static ssize_t mm_stat_show(struct device *dev, struct device_attribute *attr, char *buf) { struct zram *zram = dev_to_zram(dev); - u64 orig_size, mem_used = 0; + u64 orig_size, mem_used = 0, num_migrated = 0; long max_used; ssize_t ret; down_read(&zram->init_lock); - if (init_done(zram)) + if (init_done(zram)) { mem_used = zs_get_total_pages(zram->meta->mem_pool); + num_migrated = zs_get_num_migrated(zram->meta->mem_pool); + } orig_size = atomic64_read(&zram->stats.pages_stored); max_used = atomic_long_read(&zram->stats.max_used_pages); @@ -444,7 +444,7 @@ static ssize_t mm_stat_show(struct device *dev, zram->limit_pages << PAGE_SHIFT, max_used << PAGE_SHIFT, (u64)atomic64_read(&zram->stats.zero_pages), - (u64)atomic64_read(&zram->stats.num_migrated)); + num_migrated); up_read(&zram->init_lock); return ret; diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h index 6dbe2df..8e92339 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -78,7 +78,6 @@ struct zram_stats { atomic64_t compr_data_size; /* compressed size of pages stored */ atomic64_t num_reads; /* failed + successful */ atomic64_t num_writes; /* --do-- */ - atomic64_t num_migrated; /* no. of migrated object */ atomic64_t failed_reads; /* can happen when memory is too low */ atomic64_t failed_writes; /* can happen when memory is too low */ atomic64_t invalid_io; /* non-page-aligned I/O requests */ -- 2.4.2.337.gfae46aa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by kanga.kvack.org (Postfix) with ESMTP id 2582E6B0098 for ; Fri, 29 May 2015 11:06:50 -0400 (EDT) Received: by pacux9 with SMTP id ux9so20957822pac.3 for ; Fri, 29 May 2015 08:06:49 -0700 (PDT) Received: from mail-pa0-x22c.google.com (mail-pa0-x22c.google.com. 
[2607:f8b0:400e:c03::22c]) by mx.google.com; Fri, 29 May 2015 08:06:49 -0700 (PDT)

From: Sergey Senozhatsky
Subject: [RFC][PATCH 10/10] zsmalloc: lower ZS_ALMOST_FULL waterline
Date: Sat, 30 May 2015 00:05:28 +0900
Message-Id: <1432911928-14654-11-git-send-email-sergey.senozhatsky@gmail.com>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

get_fullness_group() considers pages that are up to 3/4 full as almost
empty. that, unfortunately, marks as ALMOST_EMPTY pages that we would
probably prefer to keep on the ALMOST_FULL list.

ALMOST_EMPTY:
[..]
 inuse: 3 max_objects: 4
 inuse: 5 max_objects: 7
 inuse: 5 max_objects: 7
 inuse: 2 max_objects: 3
[..]

for an "inuse: 5 max_objects: 7" ALMOST_EMPTY page, for example, it'll
take 2 obj_malloc calls to make the page FULL and 5 obj_free calls to
make it EMPTY. compaction selects ALMOST_EMPTY pages as source pages,
which can result in extra object moves. iow, in terms of compaction,
it makes more sense to fill this page rather than drain it.

decrease the ALMOST_FULL waterline to 2/3 of max capacity; which is,
of course, still imperfect.

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0524c4a..a8a3eae 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -196,7 +196,7 @@ static int zs_size_classes;
  *
  * (see: fix_fullness_group())
  */
-static const int fullness_threshold_frac = 4;
+static const int fullness_threshold_frac = 3;
 
 struct size_class {
 	spinlock_t lock;
@@ -612,7 +612,7 @@ static enum fullness_group get_fullness_group(struct page *page)
 		fg = ZS_EMPTY;
 	else if (inuse == max_objects)
 		fg = ZS_FULL;
-	else if (inuse <= 3 * max_objects / fullness_threshold_frac)
+	else if (inuse <= 2 * max_objects / fullness_threshold_frac)
 		fg = ZS_ALMOST_EMPTY;
 	else
 		fg = ZS_ALMOST_FULL;

--
2.4.2.337.gfae46aa
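A quick worked check of the old vs. new waterline, using the
"inuse: 5 max_objects: 7" page from the commit message (a standalone
sketch whose classification mirrors get_fullness_group(); the helper
itself is illustrative):

#include <stdio.h>

static const char *fullness(int inuse, int max_objects,
			    int numerator, int fullness_threshold_frac)
{
	if (inuse == 0)
		return "ZS_EMPTY";
	if (inuse == max_objects)
		return "ZS_FULL";
	if (inuse <= numerator * max_objects / fullness_threshold_frac)
		return "ZS_ALMOST_EMPTY";
	return "ZS_ALMOST_FULL";
}

int main(void)
{
	/* old waterline: inuse <= 3 * 7 / 4 = 5, so the page is ALMOST_EMPTY */
	printf("old: %s\n", fullness(5, 7, 3, 4));
	/* new waterline: inuse <= 2 * 7 / 3 = 4, so the page stays ALMOST_FULL */
	printf("new: %s\n", fullness(5, 7, 2, 3));
	return 0;
}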
From: Sergey Senozhatsky
Subject: Re: [RFC][PATCH 00/10] zsmalloc auto-compaction
Date: Wed, 3 Jun 2015 14:09:10 +0900
Message-ID: <20150603050910.GA534@swordfish>
In-Reply-To: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com>

On (05/30/15 00:05), Sergey Senozhatsky wrote:
> RFC
>
> this is 4.3 material, but I wanted to publish it sooner to gain
> responses and to settle it down before the 4.3 merge window opens.
>
> in short, this series tweaks zsmalloc's compaction and adds
> auto-compaction support. auto-compaction is not aimed to replace
> manual compaction; instead, it's supposed to be good enough. yet
> it surely slows down zsmalloc in some scenarios, whilst a simple
> un-tar test didn't show any significant performance difference.
>
> quote from commit 0007:
>
> this test copies a 1.3G linux kernel tar to a mounted zram disk
> and extracts it.
[..]

Hello,

I've a v2:
-- squashed and re-ordered some of the patches;
-- ran iozone with lockdep disabled.

=== quote ===

auto-compaction should not affect read-only tests, so we are
interested in write-only and read-write (mixed) tests, but I'll post
the complete test stats:

iozone -t 3 -R -r 16K -s 60M -I +Z

ext4, 2g zram0 device, lzo, 4 compression streams max

 test             base            auto-compact (compacted 67904 objs)
 Initial write     2474943.62      2490551.69
 Rewrite           3656121.38      3002796.31
 Read             12068187.50     12044105.25
 Re-read          12009777.25     11930537.50
 Reverse Read     10858884.25     10388252.50
 Stride read      10715304.75     10429308.00
 Random read      10597970.50     10502978.75
 Mixed workload    8517269.00      8701298.12
 Random write      3595597.00      3465174.38
 Pwrite            2507361.25      2553224.50
 Pread             5380608.28      5340646.03
 Fwrite            6123863.62      6130514.25
 Fread            12006438.50     11936981.25

mm_stat after the test

base:
cat /sys/block/zram0/mm_stat
378834944 5748695 7446528 0 7450624 16318 0

auto-compaction:
cat /sys/block/zram0/mm_stat
378892288 5754987 7397376 0 7397376 16304 67904

===

	-ss

> Sergey Senozhatsky (10):
>   zsmalloc: drop unused variable `nr_to_migrate'
>   zsmalloc: always keep per-class stats
>   zsmalloc: introduce zs_can_compact() function
>   zsmalloc: cosmetic compaction code adjustments
>   zsmalloc: add `num_migrated' to zs_pool
>   zsmalloc: move compaction functions
>   zsmalloc: introduce auto-compact support
>   zsmalloc: export zs_pool `num_migrated'
>   zram: remove `num_migrated' from zram_stats
>   zsmalloc: lower ZS_ALMOST_FULL waterline
>
>  drivers/block/zram/zram_drv.c |  12 +-
>  drivers/block/zram/zram_drv.h |   1 -
>  include/linux/zsmalloc.h      |   1 +
>  mm/zsmalloc.c                 | 578 +++++++++++++++++++++---------------------
>  4 files changed, 296 insertions(+), 296 deletions(-)
>
> --
> 2.4.2.337.gfae46aa
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) by kanga.kvack.org (Postfix) with ESMTP id 475EF900016 for ; Wed, 3 Jun 2015 22:04:08 -0400 (EDT) Received: by padj3 with SMTP id j3so18894523pad.0 for ; Wed, 03 Jun 2015 19:04:07 -0700 (PDT) Received: from mail-pa0-x231.google.com (mail-pa0-x231.google.com. [2607:f8b0:400e:c03::231]) by mx.google.com with ESMTPS id o6si3489548pdn.123.2015.06.03.19.04.07 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 19:04:07 -0700 (PDT) Received: by pabqy3 with SMTP id qy3so18793062pab.3 for ; Wed, 03 Jun 2015 19:04:07 -0700 (PDT) Date: Thu, 4 Jun 2015 11:04:01 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Message-ID: <20150604020401.GB2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-2-git-send-email-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432911928-14654-2-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote: > __zs_compact() does not use `nr_to_migrate', drop it. > > Signed-off-by: Sergey Senozhatsky Acked-by: Minchan Kim -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f49.google.com (mail-pa0-f49.google.com [209.85.220.49]) by kanga.kvack.org (Postfix) with ESMTP id 44E6E900016 for ; Wed, 3 Jun 2015 22:10:17 -0400 (EDT) Received: by pabqy3 with SMTP id qy3so18888602pab.3 for ; Wed, 03 Jun 2015 19:10:16 -0700 (PDT) Received: from mail-pd0-x22a.google.com (mail-pd0-x22a.google.com. [2607:f8b0:400e:c02::22a]) by mx.google.com with ESMTPS id uy7si3444039pbc.246.2015.06.03.19.10.16 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 19:10:16 -0700 (PDT) Received: by pdjm12 with SMTP id m12so19799483pdj.3 for ; Wed, 03 Jun 2015 19:10:16 -0700 (PDT) Date: Thu, 4 Jun 2015 11:10:41 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 01/10] zsmalloc: drop unused variable `nr_to_migrate' Message-ID: <20150604021041.GA1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-2-git-send-email-sergey.senozhatsky@gmail.com> <20150604020401.GB2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604020401.GB2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On (06/04/15 11:04), Minchan Kim wrote: > On Sat, May 30, 2015 at 12:05:19AM +0900, Sergey Senozhatsky wrote: > > __zs_compact() does not use `nr_to_migrate', drop it. > > > > Signed-off-by: Sergey Senozhatsky > Acked-by: Minchan Kim > Hello Minchan, I will post a slightly reworked patchset later today. thanks. -ss -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f174.google.com (mail-pd0-f174.google.com [209.85.192.174]) by kanga.kvack.org (Postfix) with ESMTP id B6AF6900016 for ; Wed, 3 Jun 2015 22:18:28 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so19978944pdb.1 for ; Wed, 03 Jun 2015 19:18:28 -0700 (PDT) Received: from mail-pd0-x233.google.com (mail-pd0-x233.google.com. [2607:f8b0:400e:c02::233]) by mx.google.com with ESMTPS id df1si3543529pad.84.2015.06.03.19.18.27 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 19:18:27 -0700 (PDT) Received: by pdjm12 with SMTP id m12so19930805pdj.3 for ; Wed, 03 Jun 2015 19:18:27 -0700 (PDT) Date: Thu, 4 Jun 2015 11:18:21 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Message-ID: <20150604021821.GC2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-3-git-send-email-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432911928-14654-3-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote: > always account per-class `zs_size_stat' stats. this data will > help us make better decisions during compaction. we are especially > interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if > class compaction will result in any memory gain. > > for instance, we know the number of allocated objects in the class, > the number of objects being used (so we also know how many objects > are not used) and the number of objects per-page. so we can estimate > how many pages compaction can free (pages that will turn into > ZS_EMPTY during compaction). Fair enough but I need to read further patches to see if we need really this at the moment. I hope it would be better to write down more detail in cover-letter so when I read just [0/0] I realize your goal and approach without looking into detail in each patch. > > Signed-off-by: Sergey Senozhatsky > --- > mm/zsmalloc.c | 49 ++++++++++++------------------------------------- > 1 file changed, 12 insertions(+), 37 deletions(-) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index e615b31..778b8db 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -169,14 +169,12 @@ enum zs_stat_type { > NR_ZS_STAT_TYPE, > }; > > -#ifdef CONFIG_ZSMALLOC_STAT > - > -static struct dentry *zs_stat_root; > - > struct zs_size_stat { > unsigned long objs[NR_ZS_STAT_TYPE]; > }; > > +#ifdef CONFIG_ZSMALLOC_STAT > +static struct dentry *zs_stat_root; > #endif > > /* > @@ -201,25 +199,21 @@ static int zs_size_classes; > static const int fullness_threshold_frac = 4; > > struct size_class { > + spinlock_t lock; > + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; > /* > * Size of objects stored in this class. Must be multiple > * of ZS_ALIGN. 
> */ > - int size; > - unsigned int index; > + int size; > + unsigned int index; > > /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */ > - int pages_per_zspage; > - /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ > - bool huge; > - > -#ifdef CONFIG_ZSMALLOC_STAT > - struct zs_size_stat stats; > -#endif > - > - spinlock_t lock; > + int pages_per_zspage; > + struct zs_size_stat stats; > > - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; > + /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ > + bool huge; > }; > > /* > @@ -439,8 +433,6 @@ static int get_size_class_index(int size) > return min(zs_size_classes - 1, idx); > } > > -#ifdef CONFIG_ZSMALLOC_STAT > - > static inline void zs_stat_inc(struct size_class *class, > enum zs_stat_type type, unsigned long cnt) > { > @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class, > return class->stats.objs[type]; > } > > +#ifdef CONFIG_ZSMALLOC_STAT > + > static int __init zs_stat_init(void) > { > if (!debugfs_initialized()) > @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool) > } > > #else /* CONFIG_ZSMALLOC_STAT */ > - > -static inline void zs_stat_inc(struct size_class *class, > - enum zs_stat_type type, unsigned long cnt) > -{ > -} > - > -static inline void zs_stat_dec(struct size_class *class, > - enum zs_stat_type type, unsigned long cnt) > -{ > -} > - > -static inline unsigned long zs_stat_get(struct size_class *class, > - enum zs_stat_type type) > -{ > - return 0; > -} > - > static int __init zs_stat_init(void) > { > return 0; > @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool) > static inline void zs_pool_stat_destroy(struct zs_pool *pool) > { > } > - > #endif > > > @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class, > class->size, class->pages_per_zspage)); > atomic_long_sub(class->pages_per_zspage, > &pool->pages_allocated); > - > free_zspage(first_page); > } > } > -- > 2.4.2.337.gfae46aa > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id 193F9900016 for ; Wed, 3 Jun 2015 22:34:00 -0400 (EDT) Received: by pabqy3 with SMTP id qy3so19254851pab.3 for ; Wed, 03 Jun 2015 19:33:59 -0700 (PDT) Received: from mail-pa0-x232.google.com (mail-pa0-x232.google.com. 
[2607:f8b0:400e:c03::232]) by mx.google.com with ESMTPS id ds16si3569319pdb.171.2015.06.03.19.33.59 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 19:33:59 -0700 (PDT) Received: by padjw17 with SMTP id jw17so19352141pad.2 for ; Wed, 03 Jun 2015 19:33:59 -0700 (PDT) Date: Thu, 4 Jun 2015 11:34:23 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 02/10] zsmalloc: always keep per-class stats Message-ID: <20150604023423.GC1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-3-git-send-email-sergey.senozhatsky@gmail.com> <20150604021821.GC2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604021821.GC2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On (06/04/15 11:18), Minchan Kim wrote: > On Sat, May 30, 2015 at 12:05:20AM +0900, Sergey Senozhatsky wrote: > > always account per-class `zs_size_stat' stats. this data will > > help us make better decisions during compaction. we are especially > > interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if > > class compaction will result in any memory gain. > > > > for instance, we know the number of allocated objects in the class, > > the number of objects being used (so we also know how many objects > > are not used) and the number of objects per-page. so we can estimate > > how many pages compaction can free (pages that will turn into > > ZS_EMPTY during compaction). > > Fair enough but I need to read further patches to see if we need > really this at the moment. > > I hope it would be better to write down more detail in cover-letter > so when I read just [0/0] I realize your goal and approach without > looking into detail in each patch. > sure, will do later today. I caught a cold, so I'm a bit slow. -ss > > > > Signed-off-by: Sergey Senozhatsky > > --- > > mm/zsmalloc.c | 49 ++++++++++++------------------------------------- > > 1 file changed, 12 insertions(+), 37 deletions(-) > > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > > index e615b31..778b8db 100644 > > --- a/mm/zsmalloc.c > > +++ b/mm/zsmalloc.c > > @@ -169,14 +169,12 @@ enum zs_stat_type { > > NR_ZS_STAT_TYPE, > > }; > > > > -#ifdef CONFIG_ZSMALLOC_STAT > > - > > -static struct dentry *zs_stat_root; > > - > > struct zs_size_stat { > > unsigned long objs[NR_ZS_STAT_TYPE]; > > }; > > > > +#ifdef CONFIG_ZSMALLOC_STAT > > +static struct dentry *zs_stat_root; > > #endif > > > > /* > > @@ -201,25 +199,21 @@ static int zs_size_classes; > > static const int fullness_threshold_frac = 4; > > > > struct size_class { > > + spinlock_t lock; > > + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; > > /* > > * Size of objects stored in this class. Must be multiple > > * of ZS_ALIGN. 
> > */ > > - int size; > > - unsigned int index; > > + int size; > > + unsigned int index; > > > > /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */ > > - int pages_per_zspage; > > - /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ > > - bool huge; > > - > > -#ifdef CONFIG_ZSMALLOC_STAT > > - struct zs_size_stat stats; > > -#endif > > - > > - spinlock_t lock; > > + int pages_per_zspage; > > + struct zs_size_stat stats; > > > > - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; > > + /* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */ > > + bool huge; > > }; > > > > /* > > @@ -439,8 +433,6 @@ static int get_size_class_index(int size) > > return min(zs_size_classes - 1, idx); > > } > > > > -#ifdef CONFIG_ZSMALLOC_STAT > > - > > static inline void zs_stat_inc(struct size_class *class, > > enum zs_stat_type type, unsigned long cnt) > > { > > @@ -459,6 +451,8 @@ static inline unsigned long zs_stat_get(struct size_class *class, > > return class->stats.objs[type]; > > } > > > > +#ifdef CONFIG_ZSMALLOC_STAT > > + > > static int __init zs_stat_init(void) > > { > > if (!debugfs_initialized()) > > @@ -574,23 +568,6 @@ static void zs_pool_stat_destroy(struct zs_pool *pool) > > } > > > > #else /* CONFIG_ZSMALLOC_STAT */ > > - > > -static inline void zs_stat_inc(struct size_class *class, > > - enum zs_stat_type type, unsigned long cnt) > > -{ > > -} > > - > > -static inline void zs_stat_dec(struct size_class *class, > > - enum zs_stat_type type, unsigned long cnt) > > -{ > > -} > > - > > -static inline unsigned long zs_stat_get(struct size_class *class, > > - enum zs_stat_type type) > > -{ > > - return 0; > > -} > > - > > static int __init zs_stat_init(void) > > { > > return 0; > > @@ -608,7 +585,6 @@ static inline int zs_pool_stat_create(char *name, struct zs_pool *pool) > > static inline void zs_pool_stat_destroy(struct zs_pool *pool) > > { > > } > > - > > #endif > > > > > > @@ -1682,7 +1658,6 @@ static void putback_zspage(struct zs_pool *pool, struct size_class *class, > > class->size, class->pages_per_zspage)); > > atomic_long_sub(class->pages_per_zspage, > > &pool->pages_allocated); > > - > > free_zspage(first_page); > > } > > } > > -- > > 2.4.2.337.gfae46aa > > > > -- > Kind regards, > Minchan Kim > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f169.google.com (mail-pd0-f169.google.com [209.85.192.169]) by kanga.kvack.org (Postfix) with ESMTP id DE407900016 for ; Wed, 3 Jun 2015 22:55:40 -0400 (EDT) Received: by pdbnf5 with SMTP id nf5so20612231pdb.2 for ; Wed, 03 Jun 2015 19:55:40 -0700 (PDT) Received: from mail-pd0-x234.google.com (mail-pd0-x234.google.com. 
[2607:f8b0:400e:c02::234]) by mx.google.com with ESMTPS id xq11si3617973pac.200.2015.06.03.19.55.39 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 19:55:39 -0700 (PDT) Received: by pdbki1 with SMTP id ki1so20578318pdb.1 for ; Wed, 03 Jun 2015 19:55:39 -0700 (PDT) Date: Thu, 4 Jun 2015 11:55:33 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604025533.GE2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On Sat, May 30, 2015 at 12:05:21AM +0900, Sergey Senozhatsky wrote: > this function checks if class compaction will free any pages. > rephrasing, do we have enough unused objects to form at least one > ZS_EMPTY page and free it. it aborts compaction if class compaction > will not result into any (further) savings. > > EXAMPLE (this debug output is not part of this patch set): > > -- class size > -- number of allocated objects > -- number of used objects, > -- estimated number of pages that will be freed > > [..] > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6 maxobjs-per-zspage? > [ 3303.108965] class-3072 objs:24648 inuse:24628 objs-per-page:4 pages-tofree:5 > [ 3303.108970] class-3072 objs:24644 inuse:24628 objs-per-page:4 pages-tofree:4 > [ 3303.108973] class-3072 objs:24640 inuse:24628 objs-per-page:4 pages-tofree:3 > [ 3303.108978] class-3072 objs:24636 inuse:24628 objs-per-page:4 pages-tofree:2 > [ 3303.108982] class-3072 objs:24632 inuse:24628 objs-per-page:4 pages-tofree:1 > [ 3303.108993] class-2720 objs:17970 inuse:17966 objs-per-page:3 pages-tofree:1 > [ 3303.108997] class-2720 objs:17967 inuse:17966 objs-per-page:3 pages-tofree:0 > [ 3303.108998] class-2720: Compaction is useless > [ 3303.109000] class-2448 objs:7680 inuse:7674 objs-per-page:5 pages-tofree:1 > [ 3303.109005] class-2336 objs:13510 inuse:13500 objs-per-page:7 pages-tofree:1 > [ 3303.109010] class-2336 objs:13503 inuse:13500 objs-per-page:7 pages-tofree:0 > [ 3303.109011] class-2336: Compaction is useless > [ 3303.109013] class-1808 objs:1161 inuse:1154 objs-per-page:9 pages-tofree:0 > [ 3303.109014] class-1808: Compaction is useless > [ 3303.109016] class-1744 objs:2135 inuse:2131 objs-per-page:7 pages-tofree:0 > [ 3303.109017] class-1744: Compaction is useless > [ 3303.109019] class-1536 objs:1328 inuse:1323 objs-per-page:8 pages-tofree:0 > [ 3303.109020] class-1536: Compaction is useless > [ 3303.109022] class-1488 objs:8855 inuse:8847 objs-per-page:11 pages-tofree:0 > [ 3303.109023] class-1488: Compaction is useless > [ 3303.109025] class-1360 objs:14880 inuse:14878 objs-per-page:3 pages-tofree:0 > [ 3303.109026] class-1360: Compaction is useless > [ 3303.109028] class-1248 objs:3588 inuse:3577 objs-per-page:13 pages-tofree:0 > [ 3303.109029] class-1248: Compaction is useless > [ 3303.109031] class-1216 objs:3380 inuse:3372 objs-per-page:10 pages-tofree:0 > [ 3303.109032] class-1216: Compaction is useless > [ 3303.109033] class-1168 objs:3416 inuse:3401 objs-per-page:7 pages-tofree:2 > [ 3303.109037] class-1168 objs:3409 inuse:3401 objs-per-page:7 
pages-tofree:1 > [ 3303.109042] class-1104 objs:605 inuse:599 objs-per-page:11 pages-tofree:0 > [ 3303.109043] class-1104: Compaction is useless > [..] > > every "Compaction is useless" indicates that we saved some CPU cycles. > > for example, class-1104 has > > 605 object allocated > 599 objects used > 11 objects per-page > > even if we have ALMOST_EMPTY page, we still don't have enough room to move > all of its objects and free this page; so compaction will not make a lot of > sense here, it's better to just leave it as is. Fair enough. > > Signed-off-by: Sergey Senozhatsky > --- > mm/zsmalloc.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index 778b8db..9ef6f15 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -1673,6 +1673,28 @@ static struct page *isolate_source_page(struct size_class *class) > return page; > } > > +/* > + * Make sure that we actually can compact this class, > + * IOW if migration will empty at least one page. > + * > + * should be called under class->lock > + */ > +static bool zs_can_compact(struct size_class *class) > +{ > + /* > + * calculate how many unused allocated objects we c should be captital. I hope you will fix all of english grammer in next spin because someone(like me) who is not a native will learn the wrong english. :) > + * have and see if we can free any zspages. otherwise, > + * compaction can just move objects back and forth w/o > + * any memory gain. > + */ > + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) - > + zs_stat_get(class, OBJ_USED); > + I prefer obj_wasted to "ret". > + ret /= get_maxobj_per_zspage(class->size, > + class->pages_per_zspage); > + return ret > 0; > +} > + > static unsigned long __zs_compact(struct zs_pool *pool, > struct size_class *class) > { > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool, > > BUG_ON(!is_first_page(src_page)); > > + if (!zs_can_compact(class)) > + break; > + > cc.index = 0; > cc.s_page = src_page; > > -- > 2.4.2.337.gfae46aa > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f43.google.com (mail-pa0-f43.google.com [209.85.220.43]) by kanga.kvack.org (Postfix) with ESMTP id 29A51900016 for ; Wed, 3 Jun 2015 23:14:20 -0400 (EDT) Received: by padj3 with SMTP id j3so19974827pad.0 for ; Wed, 03 Jun 2015 20:14:19 -0700 (PDT) Received: from mail-pa0-x22c.google.com (mail-pa0-x22c.google.com. 
[2607:f8b0:400e:c03::22c]) by mx.google.com with ESMTPS id ca15si3755574pdb.31.2015.06.03.20.14.19 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:14:19 -0700 (PDT) Received: by payr10 with SMTP id r10so20050922pay.1 for ; Wed, 03 Jun 2015 20:14:19 -0700 (PDT) Date: Thu, 4 Jun 2015 12:14:12 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 04/10] zsmalloc: cosmetic compaction code adjustments Message-ID: <20150604031412.GF2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-5-git-send-email-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432911928-14654-5-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On Sat, May 30, 2015 at 12:05:22AM +0900, Sergey Senozhatsky wrote: > change zs_object_copy() argument order to be (DST, SRC) rather > than (SRC, DST). copy/move functions usually have (to, from) > arguments order. Yeb, > > rename alloc_target_page() to isolate_target_page(). this > function doesn't allocate anything, it isolates target page, > pretty much like isolate_source_page(). The reason I named it alloc_target_page is that I had a plan to alloc a new page, which might be helpful sometime, but I cannot think of any benefit now, so I follow your patch. > > tweak __zs_compact() comment. > > Signed-off-by: Sergey Senozhatsky Acked-by: Minchan Kim > --- > mm/zsmalloc.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index 9ef6f15..fa72a81 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -1469,7 +1469,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle) > } > EXPORT_SYMBOL_GPL(zs_free); > > -static void zs_object_copy(unsigned long src, unsigned long dst, > +static void zs_object_copy(unsigned long dst, unsigned long src, > struct size_class *class) > { > struct page *s_page, *d_page; > @@ -1610,7 +1610,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class, > > used_obj = handle_to_obj(handle); > free_obj = obj_malloc(d_page, class, handle); > - zs_object_copy(used_obj, free_obj, class); > + zs_object_copy(free_obj, used_obj, class); > index++; > record_obj(handle, free_obj); > unpin_tag(handle); > @@ -1626,7 +1626,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class, > return ret; > } > > -static struct page *alloc_target_page(struct size_class *class) > +static struct page *isolate_target_page(struct size_class *class) > { > int i; > struct page *page; > @@ -1714,11 +1714,11 @@ static unsigned long __zs_compact(struct zs_pool *pool, > cc.index = 0; > cc.s_page = src_page; > > - while ((dst_page = alloc_target_page(class))) { > + while ((dst_page = isolate_target_page(class))) { > cc.d_page = dst_page; > /* > - * If there is no more space in dst_page, try to > - * allocate another zspage. > + * If there is no more space in dst_page, resched > + * and see if anyone had allocated another zspage. > */ > if (!migrate_zspage(pool, class, &cc)) > break; > -- > 2.4.2.337.gfae46aa > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
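As a quick sanity check of the per-class numbers quoted in the patch 03 review above, this small stand-alone user-space program (not code from the series) applies the same arithmetic zs_can_compact() uses -- unused-but-allocated objects divided by the number of objects one zspage holds -- and reproduces the quoted pages-tofree values:

#include <stdio.h>

/* the zs_can_compact() estimate: how many zspages the class can give back */
static unsigned long pages_to_free(unsigned long allocated, unsigned long used,
				   unsigned long objs_per_zspage)
{
	return (allocated - used) / objs_per_zspage;
}

int main(void)
{
	/* class-3072: (24652 - 24628) / 4 == 6 pages to free */
	printf("class-3072: %lu\n", pages_to_free(24652, 24628, 4));
	/* class-1104: (605 - 599) / 11 == 0, "Compaction is useless" */
	printf("class-1104: %lu\n", pages_to_free(605, 599, 11));
	return 0;
}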
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f169.google.com (mail-pd0-f169.google.com [209.85.192.169]) by kanga.kvack.org (Postfix) with ESMTP id E82BF900016 for ; Wed, 3 Jun 2015 23:14:51 -0400 (EDT) Received: by pdbnf5 with SMTP id nf5so20923252pdb.2 for ; Wed, 03 Jun 2015 20:14:51 -0700 (PDT) Received: from mail-pa0-x231.google.com (mail-pa0-x231.google.com. [2607:f8b0:400e:c03::231]) by mx.google.com with ESMTPS id qo6si3708221pac.151.2015.06.03.20.14.50 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:14:51 -0700 (PDT) Received: by pabqy3 with SMTP id qy3so19875016pab.3 for ; Wed, 03 Jun 2015 20:14:50 -0700 (PDT) Date: Thu, 4 Jun 2015 12:15:14 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604031514.GE1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604025533.GE2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On (06/04/15 11:55), Minchan Kim wrote: > > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6 > > maxobjs-per-zspage? > yeah, I shortened it to be more of less "80 chars" friendly. [..] > > + * calculate how many unused allocated objects we > > c should be captital. > > I hope you will fix all of english grammer in next spin > because someone(like me) who is not a native will learn the > wrong english. :) sure, will fix. yeah, I'm a native broken english speaker :-) > > + * have and see if we can free any zspages. otherwise, > > + * compaction can just move objects back and forth w/o > > + * any memory gain. > > + */ > > + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) - > > + zs_stat_get(class, OBJ_USED); > > + > > I prefer obj_wasted to "ret". ok. I'm still thinking how good it should be. for automatic compaction we don't want to uselessly move objects between pages and I tend to think that it's better to compact less, than to waste more cpu cycless. on the other hand, this policy will miss cases like: -- free objects in class: 5 (free-objs class capacity) -- page1: inuse 2 -- page2: inuse 2 -- page3: inuse 3 -- page4: inuse 2 so total "insuse" is greater than free-objs class capacity. but, it's surely possible to compact this class. partial inuse summ <= free-objs class capacity (a partial summ is a ->inuse summ of any two of class pages: page1 + page2, page2 + page3, etc.). otoh, these partial sums will badly affect performance. may be for automatic compaction (the one that happens w/o user interaction) we can do zs_can_compact() and for manual compaction (the one that has been triggered by a user) we can old "full-scan". anyway, zs_can_compact() looks like something that we can optimize independently later. 
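The per-page variant of the check discussed here can be modeled in user space as follows; this is purely an illustrative sketch with hypothetical numbers (four pages with four object slots each), not code from the series: a page can be drained only if its live objects fit into the unused slots of the remaining pages of the class.

#include <stdbool.h>
#include <stdio.h>

/* page `victim' can be drained if its live objects fit into the
 * spare (allocated but unused) slots of the other pages */
static bool can_drain(const int *inuse, const int *max_objs, int npages, int victim)
{
	int spare = 0;
	int i;

	for (i = 0; i < npages; i++)
		if (i != victim)
			spare += max_objs[i] - inuse[i];
	return inuse[victim] <= spare;
}

int main(void)
{
	int inuse[]    = { 2, 2, 3, 2 };
	int max_objs[] = { 4, 4, 4, 4 };
	int i;

	for (i = 0; i < 4; i++)
		printf("page%d: %s\n", i + 1,
		       can_drain(inuse, max_objs, 4, i) ? "drainable" : "not drainable");
	return 0;
}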
-ss > > + ret /= get_maxobj_per_zspage(class->size, > > + class->pages_per_zspage); > > + return ret > 0; > > +} > > + > > static unsigned long __zs_compact(struct zs_pool *pool, > > struct size_class *class) > > { > > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool, > > > > BUG_ON(!is_first_page(src_page)); > > > > + if (!zs_can_compact(class)) > > + break; > > + > > cc.index = 0; > > cc.s_page = src_page; > > > > -- > > 2.4.2.337.gfae46aa > > > > -- > Kind regards, > Minchan Kim > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f42.google.com (mail-pa0-f42.google.com [209.85.220.42]) by kanga.kvack.org (Postfix) with ESMTP id A80BE900016 for ; Wed, 3 Jun 2015 23:30:21 -0400 (EDT) Received: by padj3 with SMTP id j3so20206749pad.0 for ; Wed, 03 Jun 2015 20:30:21 -0700 (PDT) Received: from mail-pd0-x22f.google.com (mail-pd0-x22f.google.com. [2607:f8b0:400e:c02::22f]) by mx.google.com with ESMTPS id e3si3725539pdc.240.2015.06.03.20.30.20 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:30:20 -0700 (PDT) Received: by pdbki1 with SMTP id ki1so21124028pdb.1 for ; Wed, 03 Jun 2015 20:30:20 -0700 (PDT) Date: Thu, 4 Jun 2015 12:30:14 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604033014.GG2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> <20150604031514.GE1951@swordfish> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604031514.GE1951@swordfish> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Thu, Jun 04, 2015 at 12:15:14PM +0900, Sergey Senozhatsky wrote: > On (06/04/15 11:55), Minchan Kim wrote: > > > [ 3303.108960] class-3072 objs:24652 inuse:24628 objs-per-page:4 pages-tofree:6 > > > > maxobjs-per-zspage? > > > > yeah, I shortened it to be more of less "80 chars" friendly. > > > [..] > > > > + * calculate how many unused allocated objects we > > > > c should be captital. > > > > I hope you will fix all of english grammer in next spin > > because someone(like me) who is not a native will learn the > > wrong english. :) > > sure, will fix. yeah, I'm a native broken english speaker :-) > > > > + * have and see if we can free any zspages. otherwise, > > > + * compaction can just move objects back and forth w/o > > > + * any memory gain. > > > + */ > > > + unsigned long ret = zs_stat_get(class, OBJ_ALLOCATED) - > > > + zs_stat_get(class, OBJ_USED); > > > + > > > > I prefer obj_wasted to "ret". > > ok. > > I'm still thinking how good it should be. > > for automatic compaction we don't want to uselessly move objects between > pages and I tend to think that it's better to compact less, than to waste > more cpu cycless. > > > on the other hand, this policy will miss cases like: > > -- free objects in class: 5 (free-objs class capacity) > -- page1: inuse 2 > -- page2: inuse 2 > -- page3: inuse 3 > -- page4: inuse 2 What scenario do you have a cocern? Could you describe this example more clear? Thanks. 
> > so total "insuse" is greater than free-objs class capacity. but, it's > surely possible to compact this class. partial inuse summ <= free-objs class > capacity (a partial summ is a ->inuse summ of any two of class pages: > page1 + page2, page2 + page3, etc.). > > otoh, these partial sums will badly affect performance. may be for automatic > compaction (the one that happens w/o user interaction) we can do zs_can_compact() > and for manual compaction (the one that has been triggered by a user) we can > old "full-scan". > > anyway, zs_can_compact() looks like something that we can optimize > independently later. > > -ss > > > > + ret /= get_maxobj_per_zspage(class->size, > > > + class->pages_per_zspage); > > > + return ret > 0; > > > +} > > > + > > > static unsigned long __zs_compact(struct zs_pool *pool, > > > struct size_class *class) > > > { > > > @@ -1686,6 +1708,9 @@ static unsigned long __zs_compact(struct zs_pool *pool, > > > > > > BUG_ON(!is_first_page(src_page)); > > > > > > + if (!zs_can_compact(class)) > > > + break; > > > + > > > cc.index = 0; > > > cc.s_page = src_page; > > > > > > -- > > > 2.4.2.337.gfae46aa > > > > > > > -- > > Kind regards, > > Minchan Kim > > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by kanga.kvack.org (Postfix) with ESMTP id 4F07E900016 for ; Wed, 3 Jun 2015 23:30:56 -0400 (EDT) Received: by padj3 with SMTP id j3so20215126pad.0 for ; Wed, 03 Jun 2015 20:30:56 -0700 (PDT) Received: from mail-pa0-x232.google.com (mail-pa0-x232.google.com. [2607:f8b0:400e:c03::232]) by mx.google.com with ESMTPS id ng3si3804486pdb.52.2015.06.03.20.30.55 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:30:55 -0700 (PDT) Received: by padj3 with SMTP id j3so20214913pad.0 for ; Wed, 03 Jun 2015 20:30:55 -0700 (PDT) Date: Thu, 4 Jun 2015 12:31:18 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604033118.GG1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> <20150604031514.GE1951@swordfish> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604031514.GE1951@swordfish> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Minchan Kim , Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org On (06/04/15 12:15), Sergey Senozhatsky wrote: > I'm still thinking how good it should be. > > for automatic compaction we don't want to uselessly move objects between > pages and I tend to think that it's better to compact less, than to waste > more cpu cycless. > > > on the other hand, this policy will miss cases like: > > -- free objects in class: 5 (free-objs class capacity) > -- page1: inuse 2 > -- page2: inuse 2 > -- page3: inuse 3 > -- page4: inuse 2 > > so total "insuse" is greater than free-objs class capacity. but, it's > surely possible to compact this class. partial inuse summ <= free-objs class > capacity (a partial summ is a ->inuse summ of any two of class pages: > page1 + page2, page2 + page3, etc.). 
> > otoh, these partial sums will badly affect performance. may be for automatic > compaction (the one that happens w/o user interaction) we can do zs_can_compact() > and for manual compaction (the one that has been triggered by a user) we can > old "full-scan". > > anyway, zs_can_compact() looks like something that we can optimize > independently later. > so what I'm thinking of right now is: -- first do the "if we have enough free objects to free at least one page" check. compact if true. -- if false, then we can do a per-page check: if "page->inuse <= class free-objs capacity" then compact it, else select the next almost_empty page. it would be helpful here to have pages ordered by ->inuse, but this is far too expensive. I have a patch that I will post later that introduces weak/partial page ordering within fullness_list (really inexpensive: just one int compare to add a page with a higher ->inuse to the list head instead of the list tail). -ss -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f170.google.com (mail-pd0-f170.google.com [209.85.192.170]) by kanga.kvack.org (Postfix) with ESMTP id 3766B900016 for ; Wed, 3 Jun 2015 23:42:07 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so21305601pdb.1 for ; Wed, 03 Jun 2015 20:42:07 -0700 (PDT) Received: from mail-pa0-x22f.google.com (mail-pa0-x22f.google.com. [2607:f8b0:400e:c03::22f]) by mx.google.com with ESMTPS id k7si3805315pdn.158.2015.06.03.20.42.06 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:42:06 -0700 (PDT) Received: by payr10 with SMTP id r10so20455269pay.1 for ; Wed, 03 Jun 2015 20:42:06 -0700 (PDT) Date: Thu, 4 Jun 2015 12:42:30 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604034230.GH1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> <20150604031514.GE1951@swordfish> <20150604033014.GG2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604033014.GG2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org On (06/04/15 12:30), Minchan Kim wrote: > > -- free objects in class: 5 (free-objs class capacity) > > -- page1: inuse 2 > > -- page2: inuse 2 > > -- page3: inuse 3 > > -- page4: inuse 2 > > What scenario do you have a cocern? > Could you describe this example more clear? you mean "how is this even possible"? well, for example, make -jX make clean can introduce a significant fragmentation. no new objects, just random objs removal. assuming that we keep some of the objects, allocated during compilation. e.g. ... page1 allocate baz.so allocate foo.o page2 allocate bar.o allocate foo.so ... pageN now `make clean` page1: allocated baz.so empty page2 empty allocated foo.so ... pageN in the worst case, every page can turn out to be ALMOST_EMPTY. -ss -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
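The "one int compare" fullness_list ordering mentioned a couple of mails up can be sketched in user space like this (illustrative only -- the real fullness_list is a doubly-linked kernel list with O(1) tail insertion, and the patch itself was not posted in this thread):

#include <stdio.h>

struct page_model {
	int inuse;
	struct page_model *next;
};

/* one comparison against the current head decides head vs tail
 * insertion, so fuller pages drift toward the front of the list
 * without ever doing a full sort */
static void insert_weak_order(struct page_model **head, struct page_model *page)
{
	struct page_model *tail;

	if (!*head || page->inuse > (*head)->inuse) {
		page->next = *head;
		*head = page;
		return;
	}
	/* O(n) only because this model is singly-linked */
	for (tail = *head; tail->next; tail = tail->next)
		;
	page->next = NULL;
	tail->next = page;
}

int main(void)
{
	struct page_model pages[] = { { 2, NULL }, { 5, NULL }, { 1, NULL } };
	struct page_model *head = NULL, *p;
	int i;

	for (i = 0; i < 3; i++)
		insert_weak_order(&head, &pages[i]);
	for (p = head; p; p = p->next)
		printf("inuse=%d\n", p->inuse);	/* prints 5, 2, 1 */
	return 0;
}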
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f176.google.com (mail-pd0-f176.google.com [209.85.192.176]) by kanga.kvack.org (Postfix) with ESMTP id 0F57D900016 for ; Wed, 3 Jun 2015 23:50:33 -0400 (EDT) Received: by pdjm12 with SMTP id m12so21385768pdj.3 for ; Wed, 03 Jun 2015 20:50:32 -0700 (PDT) Received: from mail-pa0-x233.google.com (mail-pa0-x233.google.com. [2607:f8b0:400e:c03::233]) by mx.google.com with ESMTPS id s8si3786128pdp.253.2015.06.03.20.50.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 20:50:32 -0700 (PDT) Received: by pabqy3 with SMTP id qy3so20393033pab.3 for ; Wed, 03 Jun 2015 20:50:31 -0700 (PDT) Date: Thu, 4 Jun 2015 12:50:25 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604035025.GH2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> <20150604031514.GE1951@swordfish> <20150604033014.GG2241@blaptop> <20150604034230.GH1951@swordfish> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604034230.GH1951@swordfish> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Thu, Jun 04, 2015 at 12:42:30PM +0900, Sergey Senozhatsky wrote: > On (06/04/15 12:30), Minchan Kim wrote: > > > -- free objects in class: 5 (free-objs class capacity) > > > -- page1: inuse 2 > > > -- page2: inuse 2 > > > -- page3: inuse 3 > > > -- page4: inuse 2 > > > > What scenario do you have a cocern? > > Could you describe this example more clear? > > you mean "how is this even possible"? No I meant. I couldn't understand your terms. Sorry. What free-objs class capacity is? page1 is zspage? Let's use consistent terms between us. For example, maxobj-per-zspage is 4. A is allocated and used. X is allocated but not used. so we can draw a zspage below. AAXX So we can draw several zspages linked list as below AAXX - AXXX - AAAX Could you describe your problem again? Sorry. > > well, for example, > > make -jX > make clean > > can introduce a significant fragmentation. no new objects, just random > objs removal. assuming that we keep some of the objects, allocated during > compilation. > > e.g. > > ... > > page1 > allocate baz.so > allocate foo.o > page2 > allocate bar.o > allocate foo.so > ... > pageN > > > > now `make clean` > > page1: > allocated baz.so > empty > > page2 > empty > allocated foo.so > > ... > > pageN > > in the worst case, every page can turn out to be ALMOST_EMPTY. > > -ss -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f49.google.com (mail-pa0-f49.google.com [209.85.220.49]) by kanga.kvack.org (Postfix) with ESMTP id 1CE65900016 for ; Thu, 4 Jun 2015 00:18:49 -0400 (EDT) Received: by pabqy3 with SMTP id qy3so20824532pab.3 for ; Wed, 03 Jun 2015 21:18:48 -0700 (PDT) Received: from mail-pd0-x235.google.com (mail-pd0-x235.google.com. 
[2607:f8b0:400e:c02::235]) by mx.google.com with ESMTPS id gu1si3894865pbd.210.2015.06.03.21.18.47 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 21:18:48 -0700 (PDT) Received: by pdbki1 with SMTP id ki1so21876988pdb.1 for ; Wed, 03 Jun 2015 21:18:47 -0700 (PDT) Date: Thu, 4 Jun 2015 13:19:11 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 03/10] zsmalloc: introduce zs_can_compact() function Message-ID: <20150604041911.GI1951@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-4-git-send-email-sergey.senozhatsky@gmail.com> <20150604025533.GE2241@blaptop> <20150604031514.GE1951@swordfish> <20150604033014.GG2241@blaptop> <20150604034230.GH1951@swordfish> <20150604035025.GH2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604035025.GH2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org On (06/04/15 12:50), Minchan Kim wrote: > > On (06/04/15 12:30), Minchan Kim wrote: > > > > > > What scenario do you have a cocern? > > > Could you describe this example more clear? > > > > you mean "how is this even possible"? > > No I meant. I couldn't understand your terms. Sorry. > > What free-objs class capacity is? > page1 is zspage? > > Let's use consistent terms between us. > > For example, maxobj-per-zspage is 4. > A is allocated and used. X is allocated but not used. > so we can draw a zspage below. > > AAXX > > So we can draw several zspages linked list as below > > AAXX - AXXX - AAAX > > Could you describe your problem again? > > Sorry. My apologies. yes, so: -- free-objs class capacity -- how may unused allocated objects we have in this class (in total). -- page1..pageN -- zspages. And I think that my example is utterly wrong and incorrect. My mistake. Sorry for the noise. -ss -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f43.google.com (mail-pa0-f43.google.com [209.85.220.43]) by kanga.kvack.org (Postfix) with ESMTP id E9E6F900016 for ; Thu, 4 Jun 2015 00:57:32 -0400 (EDT) Received: by padj3 with SMTP id j3so21526887pad.0 for ; Wed, 03 Jun 2015 21:57:32 -0700 (PDT) Received: from mail-pa0-x229.google.com (mail-pa0-x229.google.com. 
[2607:f8b0:400e:c03::229]) by mx.google.com with ESMTPS id nz1si4094811pbb.33.2015.06.03.21.57.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 21:57:32 -0700 (PDT) Received: by payr10 with SMTP id r10so21608347pay.1 for ; Wed, 03 Jun 2015 21:57:32 -0700 (PDT) Date: Thu, 4 Jun 2015 13:57:25 +0900 From: Minchan Kim Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Message-ID: <20150604045725.GI2241@blaptop> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-8-git-send-email-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432911928-14654-8-git-send-email-sergey.senozhatsky@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sergey Senozhatsky Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote: > perform class compaction in zs_free(), if zs_free() has created > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'. Finally, I got realized your intention. Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio which means to compact automatically when compr_data_size/mem_used_total is below than the threshold but I didn't try because it could be done by usertool. Another reason I didn't try the approach is that it could scan all of zs_objects repeatedly withtout any freeing zspage in some corner cases, which could be big overhead we should prevent so we might add some heuristic. as an example, we could delay a few compaction trial when we found a few previous trials as all fails. It's simple design of mm/compaction.c to prevent pointless overhead but historically it made pains several times and required more complicated logics but it's still painful. Other thing I found recently is that it's not always win zsmalloc for zram is not fragmented. The fragmented space could be used for storing upcoming compressed objects although it is wasted space at the moment but if we don't have any hole(ie, fragment space) via frequent compaction, zsmalloc should allocate a new zspage which could be allocated on movable pageblock by fallback of nonmovable pageblock request on highly memory pressure system so it accelerates fragment problem of the system memory. So, I want to pass the policy to userspace. If we found it's really trobule on userspace, then, we need more thinking. Thanks. > > probably it would make zs_can_compact() to return an estimated number > of pages that potentially will be free and trigger auto-compaction > only when it's above some limit (e.g. at least 4 zs pages); or put it > under config option. > > this also tweaks __zs_compact() -- we can't do reschedule > anymore, waiting for new pages in the current class. so we > compact as much as we can and return immediately if compaction > is not possible anymore. > > auto-compaction is not a replacement of manual compaction. > > compiled linux kernel with auto-compaction: > > cat /sys/block/zram0/mm_stat > 2339885056 1601034235 1624076288 0 1624076288 19961 1106 > > performing additional manual compaction: > > echo 1 > /sys/block/zram0/compact > cat /sys/block/zram0/mm_stat > 2339885056 1601034235 1624051712 0 1624076288 19961 1114 > > manual compaction was able to migrate additional 8 objects. so > auto-compaction is 'good enough'. > > TEST > > this test copies a 1.3G linux kernel tar to mounted zram disk, > and extracts it. 
> > w/auto-compaction: > > cat /sys/block/zram0/mm_stat > 1171456 26006 86016 0 86016 32781 0 > > time tar xf linux-3.10.tar.gz -C linux > > real 0m16.970s > user 0m15.247s > sys 0m8.477s > > du -sh linux > 2.0G linux > > cat /sys/block/zram0/mm_stat > 3547353088 2993384270 3011088384 0 3011088384 24310 108 > > ===================================================================== > > w/o auto compaction: > > cat /sys/block/zram0/mm_stat > 1171456 26000 81920 0 81920 32781 0 > > time tar xf linux-3.10.tar.gz -C linux > > real 0m16.983s > user 0m15.267s > sys 0m8.417s > > du -sh linux > 2.0G linux > > cat /sys/block/zram0/mm_stat > 3548917760 2993566924 3011317760 0 3011317760 23928 0 > > ===================================================================== > > iozone shows that auto-compacted code runs faster in several > tests, which is hardly trustworthy. anyway. > > iozone -t 3 -R -r 16K -s 60M -I +Z > > test base auto-compact (compacted 66123 objs) > Initial write 1603682.25 1645112.38 > Rewrite 2502243.31 2256570.31 > Read 7040860.00 7130575.00 > Re-read 7036490.75 7066744.25 > Reverse Read 6617115.25 6155395.50 > Stride read 6705085.50 6350030.38 > Random read 6668497.75 6350129.38 > Mixed workload 5494030.38 5091669.62 > Random write 2526834.44 2500977.81 > Pwrite 1656874.00 1663796.94 > Pread 3322818.91 3359683.44 > Fwrite 4090124.25 4099773.88 > Fread 10358916.25 10324409.75 > > Signed-off-by: Sergey Senozhatsky > --- > mm/zsmalloc.c | 25 +++++++++++++------------ > 1 file changed, 13 insertions(+), 12 deletions(-) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index c2a640a..70bf481 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class) > > while ((dst_page = isolate_target_page(class))) { > cc.d_page = dst_page; > - /* > - * If there is no more space in dst_page, resched > - * and see if anyone had allocated another zspage. > - */ > + > if (!migrate_zspage(pool, class, &cc)) > - break; > + goto out; > > putback_zspage(pool, class, dst_page); > } > > - /* Stop if we couldn't find slot */ > - if (dst_page == NULL) > + if (!dst_page) > break; > - > putback_zspage(pool, class, dst_page); > putback_zspage(pool, class, src_page); > - spin_unlock(&class->lock); > - cond_resched(); > - spin_lock(&class->lock); > } > > +out: > + if (dst_page) > + putback_zspage(pool, class, dst_page); > if (src_page) > putback_zspage(pool, class, src_page); > > spin_unlock(&class->lock); > } > > - > unsigned long zs_get_total_pages(struct zs_pool *pool) > { > return atomic_long_read(&pool->pages_allocated); > @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle) > unpin_tag(handle); > > free_handle(pool, handle); > + > + /* > + * actual fullness might have changed, __zs_compact() checks > + * if compaction makes sense > + */ > + if (fullness == ZS_ALMOST_EMPTY) > + __zs_compact(pool, class); > } > EXPORT_SYMBOL_GPL(zs_free); > > -- > 2.4.2.337.gfae46aa > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
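The /sys/block/zram0/compact_threshold_ratio knob proposed above was never part of this series; a user-space sketch of the decision it describes (the percentage form and the numbers below are assumptions, with mm_stat's compr_data_size and mem_used_total as inputs) could look like:

#include <stdbool.h>
#include <stdio.h>

/* compact when compressed data occupies less than `threshold_pct'
 * percent of the memory zsmalloc actually holds, i.e. when the
 * pool is badly fragmented */
static bool should_auto_compact(unsigned long long compr_data_size,
				unsigned long long mem_used_total,
				unsigned int threshold_pct)
{
	if (!mem_used_total)
		return false;
	return compr_data_size * 100 <
	       (unsigned long long)threshold_pct * mem_used_total;
}

int main(void)
{
	/*
	 * mm_stat numbers from the auto-compaction tar test quoted above:
	 * the pool is ~99% utilized, so a 95% threshold would not
	 * trigger compaction (prints 0)
	 */
	printf("%d\n", should_auto_compact(2993384270ULL, 3011088384ULL, 95));
	return 0;
}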
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f179.google.com (mail-pd0-f179.google.com [209.85.192.179]) by kanga.kvack.org (Postfix) with ESMTP id F21D8900016 for ; Thu, 4 Jun 2015 01:30:34 -0400 (EDT) Received: by pdbnf5 with SMTP id nf5so23078325pdb.2 for ; Wed, 03 Jun 2015 22:30:34 -0700 (PDT) Received: from mail-pd0-x22c.google.com (mail-pd0-x22c.google.com. [2607:f8b0:400e:c02::22c]) by mx.google.com with ESMTPS id pv5si4129776pbb.244.2015.06.03.22.30.33 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Jun 2015 22:30:33 -0700 (PDT) Received: by pdbki1 with SMTP id ki1so23036592pdb.1 for ; Wed, 03 Jun 2015 22:30:33 -0700 (PDT) Date: Thu, 4 Jun 2015 14:30:56 +0900 From: Sergey Senozhatsky Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support Message-ID: <20150604053056.GA662@swordfish> References: <1432911928-14654-1-git-send-email-sergey.senozhatsky@gmail.com> <1432911928-14654-8-git-send-email-sergey.senozhatsky@gmail.com> <20150604045725.GI2241@blaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150604045725.GI2241@blaptop> Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Sergey Senozhatsky , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky On (06/04/15 13:57), Minchan Kim wrote: > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote: > > perform class compaction in zs_free(), if zs_free() has created > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'. > > Finally, I got realized your intention. > > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio > which means to compact automatically when compr_data_size/mem_used_total > is below than the threshold but I didn't try because it could be done > by usertool. > > Another reason I didn't try the approach is that it could scan all of > zs_objects repeatedly withtout any freeing zspage in some corner cases, > which could be big overhead we should prevent so we might add some > heuristic. as an example, we could delay a few compaction trial when > we found a few previous trials as all fails. this is why I use zs_can_compact() -- to evict from zs_compact() as soon as possible. so useless scans are minimized (well, at least expected). I'm also thinking of a threshold-based solution -- do class auto-compaction only if we can free X pages, for example. the problem of compaction is that there is no compaction until you trigger it. and fragmented classes are not necessarily a win. if writes don't happen to a fragmented class-X (and we basically can't tell if they will, nor we can estimate; it's up to I/O and data patterns, compression algorithm, etc.) then class-X stays fragmented w/o any use. > It's simple design of mm/compaction.c to prevent pointless overhead > but historically it made pains several times and required more > complicated logics but it's still painful. > > Other thing I found recently is that it's not always win zsmalloc > for zram is not fragmented. The fragmented space could be used > for storing upcoming compressed objects although it is wasted space > at the moment but if we don't have any hole(ie, fragment space) > via frequent compaction, zsmalloc should allocate a new zspage > which could be allocated on movable pageblock by fallback of > nonmovable pageblock request on highly memory pressure system > so it accelerates fragment problem of the system memory. 
yes, but compaction almost always leave classes fragmented. I think it's a corner case, when the number of unused allocated objects was exactly the same as the number of objects that we migrated and the number of migrated objects was exactly N*maxobj_per_zspage, so we left the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED). classes have 'holes' after compaction. > So, I want to pass the policy to userspace. > If we found it's really trobule on userspace, then, we need more > thinking. well, it can be under config "aggressive compaction" or "automatic compaction" option. -ss > Thanks. > > > > > probably it would make zs_can_compact() to return an estimated number > > of pages that potentially will be free and trigger auto-compaction > > only when it's above some limit (e.g. at least 4 zs pages); or put it > > under config option. > > > > this also tweaks __zs_compact() -- we can't do reschedule > > anymore, waiting for new pages in the current class. so we > > compact as much as we can and return immediately if compaction > > is not possible anymore. > > > > auto-compaction is not a replacement of manual compaction. > > > > compiled linux kernel with auto-compaction: > > > > cat /sys/block/zram0/mm_stat > > 2339885056 1601034235 1624076288 0 1624076288 19961 1106 > > > > performing additional manual compaction: > > > > echo 1 > /sys/block/zram0/compact > > cat /sys/block/zram0/mm_stat > > 2339885056 1601034235 1624051712 0 1624076288 19961 1114 > > > > manual compaction was able to migrate additional 8 objects. so > > auto-compaction is 'good enough'. > > > > TEST > > > > this test copies a 1.3G linux kernel tar to mounted zram disk, > > and extracts it. > > > > w/auto-compaction: > > > > cat /sys/block/zram0/mm_stat > > 1171456 26006 86016 0 86016 32781 0 > > > > time tar xf linux-3.10.tar.gz -C linux > > > > real 0m16.970s > > user 0m15.247s > > sys 0m8.477s > > > > du -sh linux > > 2.0G linux > > > > cat /sys/block/zram0/mm_stat > > 3547353088 2993384270 3011088384 0 3011088384 24310 108 > > > > ===================================================================== > > > > w/o auto compaction: > > > > cat /sys/block/zram0/mm_stat > > 1171456 26000 81920 0 81920 32781 0 > > > > time tar xf linux-3.10.tar.gz -C linux > > > > real 0m16.983s > > user 0m15.267s > > sys 0m8.417s > > > > du -sh linux > > 2.0G linux > > > > cat /sys/block/zram0/mm_stat > > 3548917760 2993566924 3011317760 0 3011317760 23928 0 > > > > ===================================================================== > > > > iozone shows that auto-compacted code runs faster in several > > tests, which is hardly trustworthy. anyway. 
> >
> > iozone -t 3 -R -r 16K -s 60M -I +Z
> >
> > test                    base             auto-compact (compacted 66123 objs)
> > Initial write        1603682.25       1645112.38
> > Rewrite              2502243.31       2256570.31
> > Read                 7040860.00       7130575.00
> > Re-read              7036490.75       7066744.25
> > Reverse Read         6617115.25       6155395.50
> > Stride read          6705085.50       6350030.38
> > Random read          6668497.75       6350129.38
> > Mixed workload       5494030.38       5091669.62
> > Random write         2526834.44       2500977.81
> > Pwrite               1656874.00       1663796.94
> > Pread                3322818.91       3359683.44
> > Fwrite               4090124.25       4099773.88
> > Fread               10358916.25      10324409.75
> >
> > Signed-off-by: Sergey Senozhatsky
> > ---
> >  mm/zsmalloc.c | 25 +++++++++++++------------
> >  1 file changed, 13 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index c2a640a..70bf481 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1515,34 +1515,28 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
> >
> >  		while ((dst_page = isolate_target_page(class))) {
> >  			cc.d_page = dst_page;
> > -			/*
> > -			 * If there is no more space in dst_page, resched
> > -			 * and see if anyone had allocated another zspage.
> > -			 */
> > +
> >  			if (!migrate_zspage(pool, class, &cc))
> > -				break;
> > +				goto out;
> >
> >  			putback_zspage(pool, class, dst_page);
> >  		}
> >
> > -		/* Stop if we couldn't find slot */
> > -		if (dst_page == NULL)
> > +		if (!dst_page)
> >  			break;
> > -
> >  		putback_zspage(pool, class, dst_page);
> >  		putback_zspage(pool, class, src_page);
> > -		spin_unlock(&class->lock);
> > -		cond_resched();
> > -		spin_lock(&class->lock);
> >  	}
> >
> > +out:
> > +	if (dst_page)
> > +		putback_zspage(pool, class, dst_page);
> >  	if (src_page)
> >  		putback_zspage(pool, class, src_page);
> >
> >  	spin_unlock(&class->lock);
> >  }
> >
> > -
> >  unsigned long zs_get_total_pages(struct zs_pool *pool)
> >  {
> >  	return atomic_long_read(&pool->pages_allocated);
> > @@ -1741,6 +1735,13 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
> >  	unpin_tag(handle);
> >
> >  	free_handle(pool, handle);
> > +
> > +	/*
> > +	 * actual fullness might have changed, __zs_compact() checks
> > +	 * if compaction makes sense
> > +	 */
> > +	if (fullness == ZS_ALMOST_EMPTY)
> > +		__zs_compact(pool, class);
> >  }
> >  EXPORT_SYMBOL_GPL(zs_free);
> >
> > --
> > 2.4.2.337.gfae46aa
>
> --
> Kind regards,
> Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Jun 2015 15:27:12 +0900
From: Minchan Kim
Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
Message-ID: <20150604062712.GJ2241@blaptop>
In-Reply-To: <20150604053056.GA662@swordfish>
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> On (06/04/15 13:57), Minchan Kim wrote:
> > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > perform class compaction in zs_free(), if zs_free() has created
> > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> >
> > Finally, I realized your intention.
> >
> > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
> > which means: compact automatically when compr_data_size/mem_used_total
> > drops below the threshold. I didn't try it because it can be done by a
> > userspace tool.
> >
> > Another reason I didn't try that approach is that it could scan all
> > zs objects repeatedly without freeing a single zspage in some corner
> > cases, which could be a big overhead we should prevent, so we might
> > need to add a heuristic. For example, we could delay the next few
> > compaction attempts when the previous ones all failed.
>
> this is why I use zs_can_compact() -- to exit from zs_compact() as soon
> as possible, so useless scans are minimized (well, at least that's the
> expectation). I'm also thinking of a threshold-based solution -- do
> class auto-compaction only if we can free X pages, for example.
>
> the problem with compaction is that there is no compaction until you
> trigger it.
>
> and fragmented classes are not necessarily a win. if writes don't happen
> to a fragmented class-X (and we basically can't tell if they will, nor
> can we estimate; it's up to I/O and data patterns, the compression
> algorithm, etc.) then class-X stays fragmented w/o any use.

The problem is that migration -- freeing the old zspage and allocating
a new zspage -- is not cheap, either. If the system has no problem with
a small amount of fragmented space, there is no point in paying such
overheads.

So, ideally, we should trigger compaction once we realize the system is
in trouble, but I don't have any good idea how to detect that. That's
why I wanted to rely on a decision from the user via
compact_threshold_ratio.

> > It's the simple design of mm/compaction.c that prevents pointless
> > overhead, but historically it caused pain several times and required
> > more complicated logic, and it's still painful.
> >
> > Another thing I found recently is that it's not always a win when
> > zsmalloc for zram is not fragmented.
> > The fragmented space can be used for storing upcoming compressed
> > objects -- it is wasted space at the moment, but if we don't keep
> > any holes (i.e. fragment space) because of frequent compaction,
> > zsmalloc has to allocate a new zspage, which, on a system under
> > high memory pressure, may be served from a movable pageblock as a
> > fallback for the nonmovable pageblock request -- and that
> > accelerates fragmentation of the system memory.
>
> yes, but compaction almost always leaves classes fragmented. I think
> it's a corner case when the number of unused allocated objects is
> exactly the same as the number of objects that we migrated, and the
> number of migrated objects is exactly N*maxobj_per_zspage, so that we
> leave the class w/o any unused objects (OBJ_ALLOCATED == OBJ_USED).
> classes have 'holes' after compaction.
>
> > So, I want to pass the policy to userspace.
> > If we find it's real trouble for userspace, then we need more
> > thinking.
>
> well, it can be under a config option -- "aggressive compaction" or
> "automatic compaction".

If you really want to do it automatically, without any feedback from
userspace, we should find a better algorithm.

--
Kind regards,
Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Jun 2015 16:04:16 +0900
From: Minchan Kim
Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
Message-ID: <20150604070416.GK2241@blaptop>
In-Reply-To: <20150604062712.GJ2241@blaptop>
To: Sergey Senozhatsky
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Thu, Jun 04, 2015 at 03:27:12PM +0900, Minchan Kim wrote:
> On Thu, Jun 04, 2015 at 02:30:56PM +0900, Sergey Senozhatsky wrote:
> > On (06/04/15 13:57), Minchan Kim wrote:
> > > On Sat, May 30, 2015 at 12:05:25AM +0900, Sergey Senozhatsky wrote:
> > > > perform class compaction in zs_free(), if zs_free() has created
> > > > a ZS_ALMOST_EMPTY page. this is the most trivial `policy'.
> > >
> > > Finally, I realized your intention.
> > >
> > > Actually, I had a plan to add /sys/block/zram0/compact_threshold_ratio,
> > > which means: compact automatically when compr_data_size/mem_used_total
> > > drops below the threshold. I didn't try it because it can be done
> > > by a userspace tool.
> > >
> > > Another reason I didn't try that approach is that it could scan
> > > all zs objects repeatedly without freeing a single zspage in some
> > > corner cases, which could be a big overhead we should prevent, so
> > > we might need to add a heuristic. For example, we could delay the
> > > next few compaction attempts when the previous ones all failed.
> >
> > this is why I use zs_can_compact() -- to exit from zs_compact() as
> > soon as possible, so useless scans are minimized (well, at least
> > that's the expectation). I'm also thinking of a threshold-based
> > solution -- do class auto-compaction only if we can free X pages,
> > for example.
> >
> > the problem with compaction is that there is no compaction until
> > you trigger it.
> >
> > and fragmented classes are not necessarily a win. if writes don't
> > happen to a fragmented class-X (and we basically can't tell if they
> > will, nor can we estimate; it's up to I/O and data patterns, the
> > compression algorithm, etc.) then class-X stays fragmented w/o any
> > use.
>
> The problem is that migration -- freeing the old zspage and
> allocating a new zspage -- is not cheap, either. If the system has no
> problem with a small amount of fragmented space, there is no point in
> paying such overheads.
>
> So, ideally, we should trigger compaction once we realize the system
> is in trouble, but I don't have any good idea how to detect that.
> That's why I wanted to rely on a decision from the user via
> compact_threshold_ratio.
>
> > > It's the simple design of mm/compaction.c that prevents pointless
> > > overhead, but historically it caused pain several times and
> > > required more complicated logic, and it's still painful.
> > >
> > > Another thing I found recently is that it's not always a win when
> > > zsmalloc for zram is not fragmented. The fragmented space can be
> > > used for storing upcoming compressed objects -- it is wasted
> > > space at the moment, but if we don't keep any holes (i.e.
> > > fragment space) because of frequent compaction, zsmalloc has to
> > > allocate a new zspage, which, on a system under high memory
> > > pressure, may be served from a movable pageblock as a fallback
> > > for the nonmovable pageblock request -- and that accelerates
> > > fragmentation of the system memory.
> >
> > yes, but compaction almost always leaves classes fragmented. I
> > think it's a corner case when the number of unused allocated
> > objects is exactly the same as the number of objects that we
> > migrated, and the number of migrated objects is exactly
> > N*maxobj_per_zspage, so that we leave the class w/o any unused
> > objects (OBJ_ALLOCATED == OBJ_USED). classes have 'holes' after
> > compaction.
> >
> > > So, I want to pass the policy to userspace.
> > > If we find it's real trouble for userspace, then we need more
> > > thinking.
> >
> > well, it can be under a config option -- "aggressive compaction" or
> > "automatic compaction".
>
> If you really want to do it automatically, without any feedback from
> userspace, we should find a better algorithm.

How about using the slab shrinker? If there is memory pressure, it will
be called by the VM, so we can try compaction without the user's
intervention, and excessive object scanning is avoided by your
zs_can_compact(); a rough sketch follows this message.

The concern I had about fragmentation spreading out all over the
pageblocks should be solved as a separate issue. I'm planning to make
zsmalloc'ed pages migratable, and I hope we work that out first, so
that automatic compaction does not cause heavy fragmentation of system
memory.

--
Kind regards,
Minchan Kim
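To make the shrinker idea concrete, here is a rough sketch of how it
might hook up -- not the patch posted in this thread. It assumes the
always-on per-class OBJ_ALLOCATED/OBJ_USED stats from patch 02, a
`shrinker' member embedded in struct zs_pool (an assumption of this
sketch), and a zs_can_compact() along the lines of patch 03;
zs_stat_get(), get_maxobj_per_zspage() and zs_size_classes are names
from the zsmalloc code under discussion:

	/* roughly patch 03: how many 0-order pages compacting this
	 * class could free right now */
	static unsigned long zs_can_compact(struct size_class *class)
	{
		unsigned long obj_wasted;

		/* allocated-but-unused object slots ("holes") */
		obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
			     zs_stat_get(class, OBJ_USED);

		/* whole zspages those holes add up to ... */
		obj_wasted /= get_maxobj_per_zspage(class->size,
				class->pages_per_zspage);

		/* ... and the 0-order pages backing them */
		return obj_wasted * class->pages_per_zspage;
	}

	static unsigned long zs_shrinker_count(struct shrinker *shrinker,
				struct shrink_control *sc)
	{
		struct zs_pool *pool = container_of(shrinker,
				struct zs_pool, shrinker);
		unsigned long pages_to_free = 0;
		int i;

		/*
		 * report the reclaim potential; returning 0 makes the
		 * VM skip the scan callback, so a well-packed pool
		 * costs almost nothing here
		 */
		for (i = 0; i < zs_size_classes; i++) {
			struct size_class *class = pool->size_class[i];

			if (!class || class->index != i) /* merged class */
				continue;

			spin_lock(&class->lock);
			pages_to_free += zs_can_compact(class);
			spin_unlock(&class->lock);
		}

		return pages_to_free;
	}

	static unsigned long zs_shrinker_scan(struct shrinker *shrinker,
				struct shrink_control *sc)
	{
		struct zs_pool *pool = container_of(shrinker,
				struct zs_pool, shrinker);

		/*
		 * zs_compact() re-checks zs_can_compact() per class,
		 * so a call that can't free anything terminates
		 * quickly; report the migrated objects as progress.
		 */
		return zs_compact(pool);
	}

	static int zs_register_shrinker(struct zs_pool *pool)
	{
		pool->shrinker.scan_objects = zs_shrinker_scan;
		pool->shrinker.count_objects = zs_shrinker_count;
		pool->shrinker.batch = 0;
		pool->shrinker.seeks = DEFAULT_SEEKS;

		return register_shrinker(&pool->shrinker);
	}

The appeal of this hook is that the compaction cost is only paid while
the VM is already reclaiming memory, which sidesteps the "when to
trigger" question debated above.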
From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Jun 2015 16:28:16 +0900
From: Sergey Senozhatsky
Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
Message-ID: <20150604072816.GB662@swordfish>
In-Reply-To: <20150604062712.GJ2241@blaptop>
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On (06/04/15 15:27), Minchan Kim wrote:
[..]
> The problem is that migration -- freeing the old zspage and
> allocating a new zspage -- is not cheap, either. If the system has no
> problem with a small amount of fragmented space, there is no point in
> paying such overheads.
>
> So, ideally, we should trigger compaction once we realize the system
> is in trouble, but I don't have any good idea how to detect that.
> That's why I wanted to rely on a decision from the user via
> compact_threshold_ratio.

that'll be an extremely hard knob to understand. well, we can do
something like -- don't let the number of CLASS_ALMOST_EMPTY zspages
become N times greater than the number of CLASS_ALMOST_FULL ones. or --
don't let ZS_ALMOST_EMPTY zspages contribute more than 70% of a class's
memory usage; that is, when 70% of all pages allocated for this class
belong to ZS_ALMOST_EMPTY zspages, we can potentially compact it.

> > > It's the simple design of mm/compaction.c that prevents pointless
> > > overhead, but historically it caused pain several times and
> > > required more complicated logic, and it's still painful.
> > >
> > > Another thing I found recently is that it's not always a win when
> > > zsmalloc for zram is not fragmented. The fragmented space can be
> > > used for storing upcoming compressed objects -- it is wasted
> > > space at the moment, but if we don't keep any holes (i.e.
> > > fragment space) because of frequent compaction, zsmalloc has to
> > > allocate a new zspage, which, on a system under high memory
> > > pressure, may be served from a movable pageblock as a fallback
> > > for the nonmovable pageblock request -- and that accelerates
> > > fragmentation of the system memory.
> >
> > yes, but compaction almost always leaves classes fragmented. I
> > think it's a corner case when the number of unused allocated
> > objects is exactly the same as the number of objects that we
> > migrated, and the number of migrated objects is exactly
> > N*maxobj_per_zspage, so that we leave the class w/o any unused
> > objects (OBJ_ALLOCATED == OBJ_USED). classes have 'holes' after
> > compaction.
> >
> > > So, I want to pass the policy to userspace.
> > > If we find it's real trouble for userspace, then we need more
> > > thinking.
> >
> > well, it can be under a config option -- "aggressive compaction" or
> > "automatic compaction".
>
> If you really want to do it automatically, without any feedback from
> userspace, we should find a better algorithm.

ok. I'll drop the auto-compaction part for now and will resend the
general/minor zsmalloc tweaks today.

	-ss

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Jun 2015 23:47:30 +0900
From: Sergey Senozhatsky
Subject: Re: [RFC][PATCH 07/10] zsmalloc: introduce auto-compact support
Message-ID: <20150604144730.GA484@swordfish>
In-Reply-To: <20150604070416.GK2241@blaptop>
To: Minchan Kim
Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On (06/04/15 16:04), Minchan Kim wrote:
[..]
> How about using the slab shrinker? If there is memory pressure, it
> will be called by the VM, so we can try compaction without the user's
> intervention, and excessive object scanning is avoided by your
> zs_can_compact().

hm, interesting. ok, I have a patch that triggers compaction from a
shrinker, but it needs more testing. I will send the updated patchset
tomorrow, I think.

	-ss

> The concern I had about fragmentation spreading out all over the
> pageblocks should be solved as a separate issue. I'm planning to make
> zsmalloc'ed pages migratable, and I hope we work that out first, so
> that automatic compaction does not cause heavy fragmentation of
> system memory.
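As a coda to the thread: the userspace route Minchan mentions at the
start (watch the compression ratio, poke the existing compact knob)
needs no new kernel code at all. A minimal sketch, assuming the
mm_stat field order shown in the test results above (orig_data_size,
compr_data_size, mem_used_total, ...) and a purely illustrative 0.75
threshold; it would be run periodically, e.g. from cron:

	#include <stdio.h>

	int main(void)
	{
		unsigned long long orig, compr, used;
		FILE *f = fopen("/sys/block/zram0/mm_stat", "r");

		if (!f || fscanf(f, "%llu %llu %llu",
				 &orig, &compr, &used) != 3) {
			perror("mm_stat");
			return 1;
		}
		fclose(f);

		/*
		 * compressed data occupies only a fraction of the
		 * memory actually allocated -> classes are fragmented,
		 * so trigger the compaction knob zram already exposes
		 * (the 0.75 cut-off is illustrative, not prescriptive)
		 */
		if (used && (double)compr / used < 0.75) {
			f = fopen("/sys/block/zram0/compact", "w");
			if (!f) {
				perror("compact");
				return 1;
			}
			fputs("1\n", f);
			fclose(f);
		}

		return 0;
	}

This is essentially compact_threshold_ratio implemented outside the
kernel, which is why the knob itself was argued to be unnecessary.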