All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] zram/zcomp: stream allocation fixes and tweaks
@ 2015-11-27  4:10 Sergey Senozhatsky
  2015-11-27  4:10 ` [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams Sergey Senozhatsky
  2015-11-27  4:10 ` [PATCH v3 2/2] zram: try vmalloc() after kmalloc() Sergey Senozhatsky
  0 siblings, 2 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-11-27  4:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Kyeongdon Kim, linux-kernel, Sergey Senozhatsky,
	Sergey Senozhatsky

Hello,
	Two patches that address possible issues with compression
stream allocations under low memory or heavy fragmentation conditions.

These patches are considered to be a -stable material, however there
is no Cc -stable on "zram: try vmalloc() after kmalloc()" as of now.
We'd like to ask Kyeongdon to re-test and re-confirm (the patch has
been modified).

I split the patch set into separate threads. Two more patches
- zram: pass gfp from zcomp frontend to backend (Minchan Kim)
- zram/zcomp: do not zero out zcomp private pages (Sergey Senozhatsky)
(not a -stable material) will be posted separately.

Kyeongdon Kim (1):
  zram: try vmalloc() after kmalloc()

Sergey Senozhatsky (1):
  zram/zcomp: use GFP_NOIO to allocate streams

 drivers/block/zram/zcomp.c     |  4 ++--
 drivers/block/zram/zcomp_lz4.c | 23 +++++++++++++++++++++--
 drivers/block/zram/zcomp_lzo.c | 23 +++++++++++++++++++++--
 3 files changed, 44 insertions(+), 6 deletions(-)

-- 
2.6.3.368.gf34be46


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams
  2015-11-27  4:10 [PATCH v3 0/2] zram/zcomp: stream allocation fixes and tweaks Sergey Senozhatsky
@ 2015-11-27  4:10 ` Sergey Senozhatsky
  2015-11-30  7:09   ` Minchan Kim
  2015-11-27  4:10 ` [PATCH v3 2/2] zram: try vmalloc() after kmalloc() Sergey Senozhatsky
  1 sibling, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-11-27  4:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Kyeongdon Kim, linux-kernel, Sergey Senozhatsky,
	Sergey Senozhatsky, stable

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>

We can end up allocating a new compression stream with GFP_KERNEL
from within the IO path, which may result is nested (recursive) IO
operations. That can introduce problems if the IO path in question
is a reclaimer, holding some locks that will deadlock nested IOs.

Allocate streams and working memory using GFP_NOIO flag, forbidding
recursive IO and FS operations.

An example:

[  747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
[  747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  747.233725]  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555
[  747.233733] {IN-RECLAIM_FS-W} state was registered at:
[  747.233735]   [<ffffffff8107b8e9>] __lock_acquire+0x8da/0x117b
[  747.233738]   [<ffffffff8107c950>] lock_acquire+0x10c/0x1a7
[  747.233740]   [<ffffffff811e323e>] start_this_handle+0x52d/0x555
[  747.233742]   [<ffffffff811e331a>] jbd2__journal_start+0xb4/0x237
[  747.233744]   [<ffffffff811cc6c7>] __ext4_journal_start_sb+0x108/0x17e
[  747.233748]   [<ffffffff811a90bf>] ext4_dirty_inode+0x32/0x61
[  747.233750]   [<ffffffff8115f37e>] __mark_inode_dirty+0x16b/0x60c
[  747.233754]   [<ffffffff81150ad6>] iput+0x11e/0x274
[  747.233757]   [<ffffffff8114bfbd>] __dentry_kill+0x148/0x1b8
[  747.233759]   [<ffffffff8114c9d9>] shrink_dentry_list+0x274/0x44a
[  747.233761]   [<ffffffff8114d38a>] prune_dcache_sb+0x4a/0x55
[  747.233763]   [<ffffffff8113b1ad>] super_cache_scan+0xfc/0x176
[  747.233767]   [<ffffffff810fa089>] shrink_slab.part.14.constprop.25+0x2a2/0x4d3
[  747.233770]   [<ffffffff810fcccb>] shrink_zone+0x74/0x140
[  747.233772]   [<ffffffff810fd924>] kswapd+0x6b7/0x930
[  747.233774]   [<ffffffff81058887>] kthread+0x107/0x10f
[  747.233778]   [<ffffffff814fadff>] ret_from_fork+0x3f/0x70
[  747.233783] irq event stamp: 138297
[  747.233784] hardirqs last  enabled at (138297): [<ffffffff8107aff3>] debug_check_no_locks_freed+0x113/0x12f
[  747.233786] hardirqs last disabled at (138296): [<ffffffff8107af13>] debug_check_no_locks_freed+0x33/0x12f
[  747.233788] softirqs last  enabled at (137818): [<ffffffff81040f89>] __do_softirq+0x2d3/0x3e9
[  747.233792] softirqs last disabled at (137813): [<ffffffff81041292>] irq_exit+0x41/0x95
[  747.233794]
               other info that might help us debug this:
[  747.233796]  Possible unsafe locking scenario:
[  747.233797]        CPU0
[  747.233798]        ----
[  747.233799]   lock(jbd2_handle);
[  747.233801]   <Interrupt>
[  747.233801]     lock(jbd2_handle);
[  747.233803]
                *** DEADLOCK ***
[  747.233805] 5 locks held by git/20158:
[  747.233806]  #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b
[  747.233811]  #1:  (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3
[  747.233817]  #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b
[  747.233822]  #3:  (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b
[  747.233827]  #4:  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555
[  747.233831]
               stack backtrace:
[  747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
[  747.233837]  ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036
[  747.233840]  ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001
[  747.233843]  000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd
[  747.233846] Call Trace:
[  747.233849]  [<ffffffff814f446d>] dump_stack+0x4c/0x6e
[  747.233852]  [<ffffffff81077036>] ? up+0x39/0x3e
[  747.233854]  [<ffffffff814f3849>] print_usage_bug.part.23+0x25b/0x26a
[  747.233857]  [<ffffffff810795dd>] ? print_shortest_lock_dependencies+0x182/0x182
[  747.233859]  [<ffffffff8107a9c9>] mark_lock+0x384/0x56d
[  747.233862]  [<ffffffff8107ac11>] mark_held_locks+0x5f/0x76
[  747.233865]  [<ffffffffa023d2f3>] ? zcomp_strm_alloc+0x25/0x73 [zram]
[  747.233867]  [<ffffffff8107d13b>] lockdep_trace_alloc+0xb2/0xb5
[  747.233870]  [<ffffffff8112bac7>] kmem_cache_alloc_trace+0x32/0x1e2
[  747.233873]  [<ffffffffa023d2f3>] zcomp_strm_alloc+0x25/0x73 [zram]
[  747.233876]  [<ffffffffa023d428>] zcomp_strm_multi_find+0xe7/0x173 [zram]
[  747.233879]  [<ffffffffa023d58b>] zcomp_strm_find+0xc/0xe [zram]
[  747.233881]  [<ffffffffa023f292>] zram_bvec_rw+0x2ca/0x7e0 [zram]
[  747.233885]  [<ffffffffa023fa8c>] zram_make_request+0x1fa/0x301 [zram]
[  747.233889]  [<ffffffff812142f8>] generic_make_request+0x9c/0xdb
[  747.233891]  [<ffffffff8121442e>] submit_bio+0xf7/0x120
[  747.233895]  [<ffffffff810f1c0c>] ? __test_set_page_writeback+0x1a0/0x1b8
[  747.233897]  [<ffffffff811a9d00>] ext4_io_submit+0x2e/0x43
[  747.233899]  [<ffffffff811a9efa>] ext4_bio_write_page+0x1b7/0x300
[  747.233902]  [<ffffffff811a2106>] mpage_submit_page+0x60/0x77
[  747.233905]  [<ffffffff811a25b0>] mpage_map_and_submit_buffers+0x10f/0x21d
[  747.233907]  [<ffffffff811a6814>] ext4_writepages+0xc8c/0xe1b
[  747.233910]  [<ffffffff810f3f77>] do_writepages+0x23/0x2c
[  747.233913]  [<ffffffff810ea5d1>] __filemap_fdatawrite_range+0x84/0x8b
[  747.233915]  [<ffffffff810ea657>] filemap_flush+0x1c/0x1e
[  747.233917]  [<ffffffff811a3851>] ext4_alloc_da_blocks+0xb8/0x117
[  747.233919]  [<ffffffff811af52a>] ext4_rename+0x132/0x6dc
[  747.233921]  [<ffffffff8107ac11>] ? mark_held_locks+0x5f/0x76
[  747.233924]  [<ffffffff811afafd>] ext4_rename2+0x29/0x2b
[  747.233926]  [<ffffffff811427ea>] vfs_rename+0x540/0x636
[  747.233928]  [<ffffffff81146a01>] SyS_renameat2+0x359/0x44d
[  747.233931]  [<ffffffff81146b26>] SyS_rename+0x1e/0x20
[  747.233933]  [<ffffffff814faa17>] entry_SYSCALL_64_fastpath+0x12/0x6f

[minchan: add stable mark]
Cc: stable@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zcomp.c     | 4 ++--
 drivers/block/zram/zcomp_lz4.c | 2 +-
 drivers/block/zram/zcomp_lzo.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 5cb13ca..c536177 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -76,7 +76,7 @@ static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
  */
 static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 {
-	struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_KERNEL);
+	struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_NOIO);
 	if (!zstrm)
 		return NULL;
 
@@ -85,7 +85,7 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
 	 * case when compressed size is larger than the original one
 	 */
-	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+	zstrm->buffer = (void *)__get_free_pages(GFP_NOIO | __GFP_ZERO, 1);
 	if (!zstrm->private || !zstrm->buffer) {
 		zcomp_strm_free(comp, zstrm);
 		zstrm = NULL;
diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
index f2afb7e..ee44b51 100644
--- a/drivers/block/zram/zcomp_lz4.c
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -15,7 +15,7 @@
 
 static void *zcomp_lz4_create(void)
 {
-	return kzalloc(LZ4_MEM_COMPRESS, GFP_KERNEL);
+	return kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO);
 }
 
 static void zcomp_lz4_destroy(void *private)
diff --git a/drivers/block/zram/zcomp_lzo.c b/drivers/block/zram/zcomp_lzo.c
index da1bc47..683ce04 100644
--- a/drivers/block/zram/zcomp_lzo.c
+++ b/drivers/block/zram/zcomp_lzo.c
@@ -15,7 +15,7 @@
 
 static void *lzo_create(void)
 {
-	return kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
+	return kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO);
 }
 
 static void lzo_destroy(void *private)
-- 
2.6.3.368.gf34be46


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-11-27  4:10 [PATCH v3 0/2] zram/zcomp: stream allocation fixes and tweaks Sergey Senozhatsky
  2015-11-27  4:10 ` [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams Sergey Senozhatsky
@ 2015-11-27  4:10 ` Sergey Senozhatsky
  2015-11-30  7:10   ` Minchan Kim
  1 sibling, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-11-27  4:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Kyeongdon Kim, linux-kernel, Sergey Senozhatsky,
	Sergey Senozhatsky

From: Kyeongdon Kim <kyeongdon.kim@lge.com>

When we're using LZ4 multi compression streams for zram swap,
we found out page allocation failure message in system running test.
That was not only once, but a few(2 - 5 times per test).
Also, some failure cases were continually occurring to try allocation
order 3.

In order to make parallel compression private data, we should call
kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
 2/3 size memory to allocate in that time, page allocation fails.
This patch makes to use vmalloc() as fallback of kmalloc(), this
prevents page alloc failure warning.

After using this, we never found warning message in running test, also
It could reduce process startup latency about 60-120ms in each case.

For reference a call trace :

Binder_1: page allocation failure: order:3, mode:0x10c0d0
CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
Call trace:
[<ffffffc0002069c8>] dump_backtrace+0x0/0x270
[<ffffffc000206c48>] show_stack+0x10/0x1c
[<ffffffc000cb51c8>] dump_stack+0x1c/0x28
[<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
[<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
[<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
[<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
[<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
[<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
[<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
[<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
[<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
[<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
[<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
[<ffffffc00040f98c>] submit_bio+0xa4/0x15c
[<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
[<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
[<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
[<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
[<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
[<ffffffc0002c8894>] shrink_zone+0x3c/0x100
[<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
[<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
[<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
[<ffffffc0003446cc>] proc_info_read+0x50/0xe4
[<ffffffc0002f5204>] vfs_read+0xa0/0x12c
[<ffffffc0002f59c8>] SyS_read+0x44/0x74
DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
     0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB

[minchan: change vmalloc gfp and adding comment about gfp]
[sergey: tweak comments and styles]
Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zcomp_lz4.c | 23 +++++++++++++++++++++--
 drivers/block/zram/zcomp_lzo.c | 23 +++++++++++++++++++++--
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
index ee44b51..f2bfced 100644
--- a/drivers/block/zram/zcomp_lz4.c
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -10,17 +10,36 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/lz4.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
 
 #include "zcomp_lz4.h"
 
 static void *zcomp_lz4_create(void)
 {
-	return kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO);
+	void *ret;
+
+	/*
+	 * This function can be called in swapout/fs write path
+	 * so we can't use GFP_FS|IO. And it assumes we already
+	 * have at least one stream in zram initialization so we
+	 * don't do best effort to allocate more stream in here.
+	 * A default stream will work well without further multiple
+	 * streams. That's why we use NORETRY | NOWARN | NOMEMALLOC.
+	 */
+	ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
+					__GFP_NOWARN | __GFP_NOMEMALLOC);
+	if (!ret)
+		ret = __vmalloc(LZ4_MEM_COMPRESS,
+				GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
+				__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
+				PAGE_KERNEL);
+	return ret;
 }
 
 static void zcomp_lz4_destroy(void *private)
 {
-	kfree(private);
+	kvfree(private);
 }
 
 static int zcomp_lz4_compress(const unsigned char *src, unsigned char *dst,
diff --git a/drivers/block/zram/zcomp_lzo.c b/drivers/block/zram/zcomp_lzo.c
index 683ce04..7fbb4a3 100644
--- a/drivers/block/zram/zcomp_lzo.c
+++ b/drivers/block/zram/zcomp_lzo.c
@@ -10,17 +10,36 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/lzo.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
 
 #include "zcomp_lzo.h"
 
 static void *lzo_create(void)
 {
-	return kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO);
+	void *ret;
+
+	/*
+	 * This function can be called in swapout/fs write path
+	 * so we can't use GFP_FS|IO. And it assumes we already
+	 * have at least one stream in zram initialization so we
+	 * don't do best effort to allocate more stream in here.
+	 * A default stream will work well without further multiple
+	 * streams. That's why we use NORETRY | NOWARN | NOMEMALLOC.
+	 */
+	ret = kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
+					__GFP_NOWARN | __GFP_NOMEMALLOC);
+	if (!ret)
+		ret = __vmalloc(LZO1X_MEM_COMPRESS,
+				GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
+				__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
+				PAGE_KERNEL);
+	return ret;
 }
 
 static void lzo_destroy(void *private)
 {
-	kfree(private);
+	kvfree(private);
 }
 
 static int lzo_compress(const unsigned char *src, unsigned char *dst,
-- 
2.6.3.368.gf34be46


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams
  2015-11-27  4:10 ` [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams Sergey Senozhatsky
@ 2015-11-30  7:09   ` Minchan Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Minchan Kim @ 2015-11-30  7:09 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Kyeongdon Kim, linux-kernel, Sergey Senozhatsky, stable

On Fri, Nov 27, 2015 at 01:10:48PM +0900, Sergey Senozhatsky wrote:
> From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
> 
> We can end up allocating a new compression stream with GFP_KERNEL
> from within the IO path, which may result is nested (recursive) IO
> operations. That can introduce problems if the IO path in question
> is a reclaimer, holding some locks that will deadlock nested IOs.
> 
> Allocate streams and working memory using GFP_NOIO flag, forbidding
> recursive IO and FS operations.
> 
> An example:
> 
> [  747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
> [  747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [  747.233725]  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555
> [  747.233733] {IN-RECLAIM_FS-W} state was registered at:
> [  747.233735]   [<ffffffff8107b8e9>] __lock_acquire+0x8da/0x117b
> [  747.233738]   [<ffffffff8107c950>] lock_acquire+0x10c/0x1a7
> [  747.233740]   [<ffffffff811e323e>] start_this_handle+0x52d/0x555
> [  747.233742]   [<ffffffff811e331a>] jbd2__journal_start+0xb4/0x237
> [  747.233744]   [<ffffffff811cc6c7>] __ext4_journal_start_sb+0x108/0x17e
> [  747.233748]   [<ffffffff811a90bf>] ext4_dirty_inode+0x32/0x61
> [  747.233750]   [<ffffffff8115f37e>] __mark_inode_dirty+0x16b/0x60c
> [  747.233754]   [<ffffffff81150ad6>] iput+0x11e/0x274
> [  747.233757]   [<ffffffff8114bfbd>] __dentry_kill+0x148/0x1b8
> [  747.233759]   [<ffffffff8114c9d9>] shrink_dentry_list+0x274/0x44a
> [  747.233761]   [<ffffffff8114d38a>] prune_dcache_sb+0x4a/0x55
> [  747.233763]   [<ffffffff8113b1ad>] super_cache_scan+0xfc/0x176
> [  747.233767]   [<ffffffff810fa089>] shrink_slab.part.14.constprop.25+0x2a2/0x4d3
> [  747.233770]   [<ffffffff810fcccb>] shrink_zone+0x74/0x140
> [  747.233772]   [<ffffffff810fd924>] kswapd+0x6b7/0x930
> [  747.233774]   [<ffffffff81058887>] kthread+0x107/0x10f
> [  747.233778]   [<ffffffff814fadff>] ret_from_fork+0x3f/0x70
> [  747.233783] irq event stamp: 138297
> [  747.233784] hardirqs last  enabled at (138297): [<ffffffff8107aff3>] debug_check_no_locks_freed+0x113/0x12f
> [  747.233786] hardirqs last disabled at (138296): [<ffffffff8107af13>] debug_check_no_locks_freed+0x33/0x12f
> [  747.233788] softirqs last  enabled at (137818): [<ffffffff81040f89>] __do_softirq+0x2d3/0x3e9
> [  747.233792] softirqs last disabled at (137813): [<ffffffff81041292>] irq_exit+0x41/0x95
> [  747.233794]
>                other info that might help us debug this:
> [  747.233796]  Possible unsafe locking scenario:
> [  747.233797]        CPU0
> [  747.233798]        ----
> [  747.233799]   lock(jbd2_handle);
> [  747.233801]   <Interrupt>
> [  747.233801]     lock(jbd2_handle);
> [  747.233803]
>                 *** DEADLOCK ***
> [  747.233805] 5 locks held by git/20158:
> [  747.233806]  #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b
> [  747.233811]  #1:  (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3
> [  747.233817]  #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b
> [  747.233822]  #3:  (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b
> [  747.233827]  #4:  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555
> [  747.233831]
>                stack backtrace:
> [  747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
> [  747.233837]  ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036
> [  747.233840]  ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001
> [  747.233843]  000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd
> [  747.233846] Call Trace:
> [  747.233849]  [<ffffffff814f446d>] dump_stack+0x4c/0x6e
> [  747.233852]  [<ffffffff81077036>] ? up+0x39/0x3e
> [  747.233854]  [<ffffffff814f3849>] print_usage_bug.part.23+0x25b/0x26a
> [  747.233857]  [<ffffffff810795dd>] ? print_shortest_lock_dependencies+0x182/0x182
> [  747.233859]  [<ffffffff8107a9c9>] mark_lock+0x384/0x56d
> [  747.233862]  [<ffffffff8107ac11>] mark_held_locks+0x5f/0x76
> [  747.233865]  [<ffffffffa023d2f3>] ? zcomp_strm_alloc+0x25/0x73 [zram]
> [  747.233867]  [<ffffffff8107d13b>] lockdep_trace_alloc+0xb2/0xb5
> [  747.233870]  [<ffffffff8112bac7>] kmem_cache_alloc_trace+0x32/0x1e2
> [  747.233873]  [<ffffffffa023d2f3>] zcomp_strm_alloc+0x25/0x73 [zram]
> [  747.233876]  [<ffffffffa023d428>] zcomp_strm_multi_find+0xe7/0x173 [zram]
> [  747.233879]  [<ffffffffa023d58b>] zcomp_strm_find+0xc/0xe [zram]
> [  747.233881]  [<ffffffffa023f292>] zram_bvec_rw+0x2ca/0x7e0 [zram]
> [  747.233885]  [<ffffffffa023fa8c>] zram_make_request+0x1fa/0x301 [zram]
> [  747.233889]  [<ffffffff812142f8>] generic_make_request+0x9c/0xdb
> [  747.233891]  [<ffffffff8121442e>] submit_bio+0xf7/0x120
> [  747.233895]  [<ffffffff810f1c0c>] ? __test_set_page_writeback+0x1a0/0x1b8
> [  747.233897]  [<ffffffff811a9d00>] ext4_io_submit+0x2e/0x43
> [  747.233899]  [<ffffffff811a9efa>] ext4_bio_write_page+0x1b7/0x300
> [  747.233902]  [<ffffffff811a2106>] mpage_submit_page+0x60/0x77
> [  747.233905]  [<ffffffff811a25b0>] mpage_map_and_submit_buffers+0x10f/0x21d
> [  747.233907]  [<ffffffff811a6814>] ext4_writepages+0xc8c/0xe1b
> [  747.233910]  [<ffffffff810f3f77>] do_writepages+0x23/0x2c
> [  747.233913]  [<ffffffff810ea5d1>] __filemap_fdatawrite_range+0x84/0x8b
> [  747.233915]  [<ffffffff810ea657>] filemap_flush+0x1c/0x1e
> [  747.233917]  [<ffffffff811a3851>] ext4_alloc_da_blocks+0xb8/0x117
> [  747.233919]  [<ffffffff811af52a>] ext4_rename+0x132/0x6dc
> [  747.233921]  [<ffffffff8107ac11>] ? mark_held_locks+0x5f/0x76
> [  747.233924]  [<ffffffff811afafd>] ext4_rename2+0x29/0x2b
> [  747.233926]  [<ffffffff811427ea>] vfs_rename+0x540/0x636
> [  747.233928]  [<ffffffff81146a01>] SyS_renameat2+0x359/0x44d
> [  747.233931]  [<ffffffff81146b26>] SyS_rename+0x1e/0x20
> [  747.233933]  [<ffffffff814faa17>] entry_SYSCALL_64_fastpath+0x12/0x6f
> 
> [minchan: add stable mark]
> Cc: stable@vger.kernel.org
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-11-27  4:10 ` [PATCH v3 2/2] zram: try vmalloc() after kmalloc() Sergey Senozhatsky
@ 2015-11-30  7:10   ` Minchan Kim
  2015-11-30 10:42     ` kyeongdon.kim
  0 siblings, 1 reply; 18+ messages in thread
From: Minchan Kim @ 2015-11-30  7:10 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Kyeongdon Kim, linux-kernel, Sergey Senozhatsky

On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
> From: Kyeongdon Kim <kyeongdon.kim@lge.com>
> 
> When we're using LZ4 multi compression streams for zram swap,
> we found out page allocation failure message in system running test.
> That was not only once, but a few(2 - 5 times per test).
> Also, some failure cases were continually occurring to try allocation
> order 3.
> 
> In order to make parallel compression private data, we should call
> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
>  2/3 size memory to allocate in that time, page allocation fails.
> This patch makes to use vmalloc() as fallback of kmalloc(), this
> prevents page alloc failure warning.
> 
> After using this, we never found warning message in running test, also
> It could reduce process startup latency about 60-120ms in each case.
> 
> For reference a call trace :
> 
> Binder_1: page allocation failure: order:3, mode:0x10c0d0
> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
> Call trace:
> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
> [<ffffffc000206c48>] show_stack+0x10/0x1c
> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>      0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
> 
> [minchan: change vmalloc gfp and adding comment about gfp]
> [sergey: tweak comments and styles]
> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Kyeongdon, Could you test this patch on your device?

Thanks.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-11-30  7:10   ` Minchan Kim
@ 2015-11-30 10:42     ` kyeongdon.kim
  2015-11-30 11:14       ` Sergey Senozhatsky
       [not found]       ` <20151130231841.GA960@bbox>
  0 siblings, 2 replies; 18+ messages in thread
From: kyeongdon.kim @ 2015-11-30 10:42 UTC (permalink / raw)
  To: Minchan Kim, Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky

On 2015-11-30 오후 4:10, Minchan Kim wrote:
> On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
>> From: Kyeongdon Kim <kyeongdon.kim@lge.com>
>>
>> When we're using LZ4 multi compression streams for zram swap,
>> we found out page allocation failure message in system running test.
>> That was not only once, but a few(2 - 5 times per test).
>> Also, some failure cases were continually occurring to try allocation
>> order 3.
>>
>> In order to make parallel compression private data, we should call
>> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
>> 2/3 size memory to allocate in that time, page allocation fails.
>> This patch makes to use vmalloc() as fallback of kmalloc(), this
>> prevents page alloc failure warning.
>>
>> After using this, we never found warning message in running test, also
>> It could reduce process startup latency about 60-120ms in each case.
>>
>> For reference a call trace :
>>
>> Binder_1: page allocation failure: order:3, mode:0x10c0d0
>> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
> #20
>> Call trace:
>> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
>> [<ffffffc000206c48>] show_stack+0x10/0x1c
>> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
>> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
>> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
>> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
>> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
>> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
>> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
>> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
>> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
>> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
>> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
>> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
>> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
>> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
>> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
>> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
>> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
>> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
>> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
>> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
>> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
>> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
>> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
>> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
>>
>> [minchan: change vmalloc gfp and adding comment about gfp]
>> [sergey: tweak comments and styles]
>> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
> 
> Kyeongdon, Could you test this patch on your device?
> 
> Thanks.

Sorry to have kept you waiting,
Obviously, I couldn't see allocation fail message with this patch.
But, there is something to make some delay(not sure yet this is normal).

static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
{
<snip>

   zstrm->private = comp->backend->create();
	            ^ // sometimes, return 'null' continually(2-5times)

As you know, if there is 'null' return, this function is called again to
get a memory in while() loop. I just checked this one with printk().

If you guys don't mind, I'll test more with trace log to check time delay.

However, If this is fully expectable status to you.
I think I don't need to do it.

Thanks,
Kyeongdon Kim

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-11-30 10:42     ` kyeongdon.kim
@ 2015-11-30 11:14       ` Sergey Senozhatsky
  2015-12-01  2:04         ` kyeongdon.kim
       [not found]       ` <20151130231841.GA960@bbox>
  1 sibling, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-11-30 11:14 UTC (permalink / raw)
  To: kyeongdon.kim
  Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (11/30/15 19:42), kyeongdon.kim wrote:
[..]
> Sorry to have kept you waiting,
> Obviously, I couldn't see allocation fail message with this patch.
> But, there is something to make some delay(not sure yet this is normal).

what delay? how significant it is? do you see it in practice or it's just
a guess?

> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
> {
> <snip>
> 
>    zstrm->private = comp->backend->create();
> 	            ^ // sometimes, return 'null' continually(2-5times)
> 
> As you know, if there is 'null' return, this function is called again to
> get a memory in while() loop. I just checked this one with printk().

well, not always.

a) current wait_event() for available stream to become idle.
b) once current awaken it attempts to get an idle stream
c) if zstrm then return
d) if there is no idle stream then goto a)
e) else try to allocate stream again, if !zstrm goto a), else return

        while (1) {
                spin_lock(&zs->strm_lock);
                if (!list_empty(&zs->idle_strm)) {
                        zstrm = list_entry(zs->idle_strm.next,
                                        struct zcomp_strm, list);
                        list_del(&zstrm->list);
                        spin_unlock(&zs->strm_lock);
                        return zstrm;
                }
                /* zstrm streams limit reached, wait for idle stream */
                if (zs->avail_strm >= zs->max_strm) {
                        spin_unlock(&zs->strm_lock);
                        wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
                        continue;
                }
                /* allocate new zstrm stream */
                zs->avail_strm++;
                spin_unlock(&zs->strm_lock);

                zstrm = zcomp_strm_alloc(comp);
                if (!zstrm) {
                        spin_lock(&zs->strm_lock);
                        zs->avail_strm--;
                        spin_unlock(&zs->strm_lock);
                        wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
                        continue;
                }
                break;
        }

so it's possible for current to zcomp_strm_alloc() several times...

do you see the same process doing N zcomp_strm_alloc() calls, or it's N processes
doing one zcomp_strm_alloc()? I think the latter one is more likely; once we failed
to zcomp_strm_alloc() quite possible that N concurrent or succeeding IOs will do
the same. That's why I proposed to decrease ->max_strm; but we basically don't know
when we shall rollback it to the original value; I'm not sure I want to do something
like: every 42nd IO try to increment ->max_strm by one, until it's less than the
original value.

so I'd probably prefer to keep it the way it is; but let's see the numbers from
you first.

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
       [not found]       ` <20151130231841.GA960@bbox>
@ 2015-12-01  0:33         ` Sergey Senozhatsky
  2015-12-01  2:31         ` Re: " kyeongdon.kim
  1 sibling, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  0:33 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kyeongdon.kim, Sergey Senozhatsky, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (12/01/15 08:18), Minchan Kim wrote:
[..]
> > As you know, if there is 'null' return, this function is called again to
> > get a memory in while() loop. I just checked this one with printk().
> > 
> > If you guys don't mind, I'll test more with trace log to check time delay.
> 
> No problem.
> 
> > 
> > However, If this is fully expectable status to you.
> > I think I don't need to do it.
> 
> It's not what I expected. Actually, I thought failure of vmalloc
> in that place should be *really really* rare. I think it's caused by
> __GFP_NOMEMALLOC so I want to see test result without the flag.

hm, agree. otherwise the whole vmalloc() fallback thing adds a
little value. additional streams are really not that important
to waste emergency memory. a stream, once allocated, stays
forever (until user decrease the ->max_strm).

> Thanks for the careful test!

yes, thank you Kyeongdon.

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-11-30 11:14       ` Sergey Senozhatsky
@ 2015-12-01  2:04         ` kyeongdon.kim
  0 siblings, 0 replies; 18+ messages in thread
From: kyeongdon.kim @ 2015-12-01  2:04 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Andrew Morton, linux-kernel, Sergey Senozhatsky

On 2015-11-30 오후 8:14, Sergey Senozhatsky wrote:
> On (11/30/15 19:42), kyeongdon.kim wrote:
> [..]
>> Sorry to have kept you waiting,
>> Obviously, I couldn't see allocation fail message with this patch.
>> But, there is something to make some delay(not sure yet this is normal).
> 
> what delay? how significant it is? do you see it in practice or it's just
> a guess?
> 
Now, I just checked with printk() log in zcomp_lz4_create()/lzo_create()
to see address 'ret'. and these 'null' values are called several times
from kzalloc. also __vmalloc. - these are making the delay.
So, not significant status I guess. but if this allocation try is many.
I doubt is is fine.

>> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>> {
>> <snip>
>>
>> zstrm->private = comp->backend->create();
>> ^ // sometimes, return 'null' continually(2-5times)
>>
>> As you know, if there is 'null' return, this function is called again to
>> get a memory in while() loop. I just checked this one with printk().
> 
> well, not always.
> 
> a) current wait_event() for available stream to become idle.
> b) once current awaken it attempts to get an idle stream
> c) if zstrm then return
> d) if there is no idle stream then goto a)
> e) else try to allocate stream again, if !zstrm goto a), else return
> 
> while (1) {
> spin_lock(&zs->strm_lock);
> if (!list_empty(&zs->idle_strm)) {
> zstrm = list_entry(zs->idle_strm.next,
> struct zcomp_strm, list);
> list_del(&zstrm->list);
> spin_unlock(&zs->strm_lock);
> return zstrm;
> }
> /* zstrm streams limit reached, wait for idle stream */
> if (zs->avail_strm >= zs->max_strm) {
> spin_unlock(&zs->strm_lock);
> wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> continue;
> }
> /* allocate new zstrm stream */
> zs->avail_strm++;
> spin_unlock(&zs->strm_lock);
> 
> zstrm = zcomp_strm_alloc(comp);
> if (!zstrm) {
> spin_lock(&zs->strm_lock);
> zs->avail_strm--;
> spin_unlock(&zs->strm_lock);
> wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> continue;
> }
> break;
> }
> 
> so it's possible for current to zcomp_strm_alloc() several times...
> 
> do you see the same process doing N zcomp_strm_alloc() calls, or it's N
> processes
> doing one zcomp_strm_alloc()? I think the latter one is more likely;
> once we failed
> to zcomp_strm_alloc() quite possible that N concurrent or succeeding IOs
> will do
> the same. That's why I proposed to decrease ->max_strm; but we basically
> don't know
> when we shall rollback it to the original value; I'm not sure I want to
> do something
> like: every 42nd IO try to increment ->max_strm by one, until it's less
> than the
> original value.
> 
> so I'd probably prefer to keep it the way it is; but let's see the
> numbers from
> you first.
> 
> -ss

I didn't check detailed yet.I'll explain after checking this.

Thanks,
Kyeongdon Kim


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
       [not found]       ` <20151130231841.GA960@bbox>
  2015-12-01  0:33         ` Sergey Senozhatsky
@ 2015-12-01  2:31         ` kyeongdon.kim
  2015-12-01  4:44           ` Minchan Kim
  1 sibling, 1 reply; 18+ messages in thread
From: kyeongdon.kim @ 2015-12-01  2:31 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel, Sergey Senozhatsky

On 2015-12-01 오전 8:18, Minchan Kim wrote:
> Hi Kyeongdon,
> 
> On Mon, Nov 30, 2015 at 07:42:02PM +0900, kyeongdon.kim wrote:
> 
>> > On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
>> >> From: Kyeongdon Kim <kyeongdon.kim@lge.com>
>> >>
>> >> When we're using LZ4 multi compression streams for zram swap,
>> >> we found out page allocation failure message in system running test.
>> >> That was not only once, but a few(2 - 5 times per test).
>> >> Also, some failure cases were continually occurring to try allocation
>> >> order 3.
>> >>
>> >> In order to make parallel compression private data, we should call
>> >> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
>> >> 2/3 size memory to allocate in that time, page allocation fails.
>> >> This patch makes to use vmalloc() as fallback of kmalloc(), this
>> >> prevents page alloc failure warning.
>> >>
>> >> After using this, we never found warning message in running test, also
>> >> It could reduce process startup latency about 60-120ms in each case.
>> >>
>> >> For reference a call trace :
>> >>
>> >> Binder_1: page allocation failure: order:3, mode:0x10c0d0
>> >> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
>> > #20
>> >> Call trace:
>> >> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
>> >> [<ffffffc000206c48>] show_stack+0x10/0x1c
>> >> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
>> >> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
>> >> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
>> >> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
>> >> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
>> >> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
>> >> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
>> >> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
>> >> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
>> >> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
>> >> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
>> >> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
>> >> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
>> >> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
>> >> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
>> >> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
>> >> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
>> >> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
>> >> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
>> >> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
>> >> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
>> >> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>> >> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
>> >>
>> >> [minchan: change vmalloc gfp and adding comment about gfp]
>> >> [sergey: tweak comments and styles]
>> >> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
>> >> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> >
>> > Kyeongdon, Could you test this patch on your device?
>> >
>> > Thanks.
>>
>> Sorry to have kept you waiting,
>> Obviously, I couldn't see allocation fail message with this patch.
>> But, there is something to make some delay(not sure yet this is normal).
> 
> You mean new changes makes start-up delay of your application sometime
> still,
> but not frequent like old?
> 
I couldn't see start-up delay during my test after this patch.
But, I checked the return value from alloc function like the below :

static void *zcomp_lz4_create(void)
<snip>
  ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
			__GFP_NOWARN | __GFP_NOMEMALLOC);
  printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 32
  if (!ret) {
    ret = __vmalloc(LZ4_MEM_COMPRESS,
		GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
		__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
		PAGE_KERNEL);
    printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 38
  }
  return ret;

log message :
[  352.226014][0] zcomp_lz4_create: 32: ret =           (null)
[  352.226035][0] zcomp_lz4_create: 38: ret =           (null)
[  352.226791][0] zcomp_lz4_create: 32: ret =           (null)
[  352.226809][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230348][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230369][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230460][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230485][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230507][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230520][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230608][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230619][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230888][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230902][0] zcomp_lz4_create: 38: ret =           (null)
[  352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
[  352.234024][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234060][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234359][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234384][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234618][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234639][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234667][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234685][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234738][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234748][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234800][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234816][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234852][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234865][0] zcomp_lz4_create: 38: ret =           (null)
[  352.235136][0] zcomp_lz4_create: 32: ret =           (null)
[  352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000

I thought this pattern from vmalloc is not normal.
>>
>> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>> {
>> <snip>
>>
>> zstrm->private = comp->backend->create();
>> ^ // sometimes, return 'null' continually(2-5times)
> 
> Hmm, I think it is caused by __GFP_NOMEMALLOC.
> Could you test it without the flag?
> 
>>
>> As you know, if there is 'null' return, this function is called again to
>> get a memory in while() loop. I just checked this one with printk().
>>
>> If you guys don't mind, I'll test more with trace log to check time
> delay.
> 
> No problem.
> 
>>
>> However, If this is fully expectable status to you.
>> I think I don't need to do it.
> 
> It's not what I expected. Actually, I thought failure of vmalloc
> in that place should be *really really* rare. I think it's caused by
> __GFP_NOMEMALLOC so I want to see test result without the flag.
> 
> Thanks for the careful test!
> 
You're welcome.

After I removed flag '__GFP_NOMEMALLOC', I couldn't find return 'null'
from vmalloc until now.

log message :
<4>[ 2288.954934][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2288.954972][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff800287e000
..<snip>..
<4>[ 2289.092411][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.092546][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028b5000
..<snip>..
<4>[ 2289.135628][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.135642][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.135729][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028be000
<4>[ 2289.135732][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028c7000

Thanks,
Kyeongdon Kim
> 
>>
>> Thanks,
>> Kyeongdon Kim


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  2:31         ` Re: " kyeongdon.kim
@ 2015-12-01  4:44           ` Minchan Kim
       [not found]             ` <20151201045535.GA5999@bbox>
  0 siblings, 1 reply; 18+ messages in thread
From: Minchan Kim @ 2015-12-01  4:44 UTC (permalink / raw)
  To: kyeongdon.kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel, Sergey Senozhatsky

On Tue, Dec 01, 2015 at 11:31:41AM +0900, kyeongdon.kim wrote:
> On 2015-12-01 오전 8:18, Minchan Kim wrote:
> > Hi Kyeongdon,
> > 
> > On Mon, Nov 30, 2015 at 07:42:02PM +0900, kyeongdon.kim wrote:
> > 
> >> > On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
> >> >> From: Kyeongdon Kim <kyeongdon.kim@lge.com>
> >> >>
> >> >> When we're using LZ4 multi compression streams for zram swap,
> >> >> we found out page allocation failure message in system running test.
> >> >> That was not only once, but a few(2 - 5 times per test).
> >> >> Also, some failure cases were continually occurring to try allocation
> >> >> order 3.
> >> >>
> >> >> In order to make parallel compression private data, we should call
> >> >> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
> >> >> 2/3 size memory to allocate in that time, page allocation fails.
> >> >> This patch makes to use vmalloc() as fallback of kmalloc(), this
> >> >> prevents page alloc failure warning.
> >> >>
> >> >> After using this, we never found warning message in running test, also
> >> >> It could reduce process startup latency about 60-120ms in each case.
> >> >>
> >> >> For reference a call trace :
> >> >>
> >> >> Binder_1: page allocation failure: order:3, mode:0x10c0d0
> >> >> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
> >> > #20
> >> >> Call trace:
> >> >> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
> >> >> [<ffffffc000206c48>] show_stack+0x10/0x1c
> >> >> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
> >> >> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
> >> >> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
> >> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
> >> >> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
> >> >> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
> >> >> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
> >> >> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
> >> >> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
> >> >> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
> >> >> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
> >> >> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
> >> >> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
> >> >> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
> >> >> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
> >> >> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
> >> >> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
> >> >> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
> >> >> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
> >> >> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
> >> >> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
> >> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
> >> >> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
> >> >> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
> >> >> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
> >> >> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
> >> >> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
> >> >>
> >> >> [minchan: change vmalloc gfp and adding comment about gfp]
> >> >> [sergey: tweak comments and styles]
> >> >> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
> >> >> Signed-off-by: Minchan Kim <minchan@kernel.org>
> >> >
> >> > Kyeongdon, Could you test this patch on your device?
> >> >
> >> > Thanks.
> >>
> >> Sorry to have kept you waiting,
> >> Obviously, I couldn't see allocation fail message with this patch.
> >> But, there is something to make some delay(not sure yet this is normal).
> > 
> > You mean new changes makes start-up delay of your application sometime
> > still,
> > but not frequent like old?
> > 
> I couldn't see start-up delay during my test after this patch.
> But, I checked the return value from alloc function like the below :
> 
> static void *zcomp_lz4_create(void)
> <snip>
>   ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
> 			__GFP_NOWARN | __GFP_NOMEMALLOC);
>   printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 32
>   if (!ret) {
>     ret = __vmalloc(LZ4_MEM_COMPRESS,
> 		GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
> 		__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
> 		PAGE_KERNEL);
>     printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 38
>   }
>   return ret;

OK, I got it. 

> 
> log message :
> [  352.226014][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.226035][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.226791][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.226809][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.230348][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.230369][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.230460][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.230485][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.230507][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.230520][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.230608][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.230619][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.230888][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.230902][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
> [  352.234024][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234060][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234359][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234384][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234618][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234639][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234667][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234685][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234738][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234748][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234800][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234816][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.234852][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.234865][0] zcomp_lz4_create: 38: ret =           (null)
> [  352.235136][0] zcomp_lz4_create: 32: ret =           (null)
> [  352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000
> 
> I thought this pattern from vmalloc is not normal.
> >>
> >> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
> >> {
> >> <snip>
> >>
> >> zstrm->private = comp->backend->create();
> >> ^ // sometimes, return 'null' continually(2-5times)
> > 
> > Hmm, I think it is caused by __GFP_NOMEMALLOC.
> > Could you test it without the flag?
> > 
> >>
> >> As you know, if there is 'null' return, this function is called again to
> >> get a memory in while() loop. I just checked this one with printk().
> >>
> >> If you guys don't mind, I'll test more with trace log to check time
> > delay.
> > 
> > No problem.
> > 
> >>
> >> However, If this is fully expectable status to you.
> >> I think I don't need to do it.
> > 
> > It's not what I expected. Actually, I thought failure of vmalloc
> > in that place should be *really really* rare. I think it's caused by
> > __GFP_NOMEMALLOC so I want to see test result without the flag.
> > 
> > Thanks for the careful test!
> > 
> You're welcome.
> 
> After I removed flag '__GFP_NOMEMALLOC', I couldn't find return 'null'
> from vmalloc until now.
 
Thanks for the test.
You said you cannot see the delay any more so it's fine as it is.

Regardless of it, I want to remove __GFP_NOMEMALLOC in the gfp_mask.
The reason we decided to add vmalloc fallback is that we expect that
it is likely to succeed a few order-0 pages to allocate (sizeof(buffer)
> PAGE_SIZE) although it was failed by kmalloc.
But what I missing is that zram-swap is working in direct reclaim
normally so it is more flexible to use emegency memory pool(
e,g, zsmalloc doesn't use __GFP_NOMEMALLOC) so there is no point
to use __GFP_NOMEMALLOC in here.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
       [not found]             ` <20151201045535.GA5999@bbox>
@ 2015-12-01  5:16               ` Sergey Senozhatsky
  2015-12-01  6:35                 ` Kyeongdon Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  5:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kyeongdon.kim, Sergey Senozhatsky, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (12/01/15 13:55), Minchan Kim wrote:
[..]
> To clear my opinion,
> 
> lzo_create(gfp_t flags)
> {
>         void * ret = kmalloc(LZO1X_MEM_COMPRESS, flags);
>         if (!ret)
>                 ret = vmalloc(LZO1X_MEM_COMPRESS, flasgs | GFP_NOMEMALLOC);
>         return ret;
> }

ah, ok, I see. I've a question.

we had
	kmalloc(f | __GFP_NOMEMALLOC)
	__vmalloc(f | __GFP_NOMEMALLOC)

which produced high failure rates for both kmalloc() and __vmalloc()

test #1

> > > log message :
[..]
> > > [  352.230608][0] zcomp_lz4_create: 32: ret =           (null)
> > > [  352.230619][0] zcomp_lz4_create: 38: ret =           (null)
> > > [  352.230888][0] zcomp_lz4_create: 32: ret =           (null)
> > > [  352.230902][0] zcomp_lz4_create: 38: ret =           (null)
> > > [  352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
> > > [  352.234024][0] zcomp_lz4_create: 32: ret =           (null)
> > > [  352.234060][0] zcomp_lz4_create: 38: ret =           (null)
> > > [  352.234359][0] zcomp_lz4_create: 32: ret =           (null)
[..]
> > > [  352.234384][0] zcomp_lz4_create: 38: ret =           (null)
> > > [  352.234618][0] zcomp_lz4_create: 32: ret =           (null)
> > > [  352.234639][0] zcomp_lz4_create: 38: ret =           (null)
> > > [  352.234667][0] zcomp_lz4_create: 32: ret =           (null)
> > > [  352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000



Kyeongdon, do I understand correctly, that for the second test you
removed '__GFP_NOMEMALLOC' from both kmalloc() and __vmalloc()?

iow:
	kmalloc(f & ~__GFP_NOMEMALLOC)
	vmalloc(f & ~__GFP_NOMEMALLOC)

test #2 : almost always failing kmalloc() and !NULL __vmalloc()

> > > log message :
> > > <4>[ 2288.954934][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
> > > <4>[ 2288.954972][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff800287e000
> > > ..<snip>..
> > > <4>[ 2289.092411][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
> > > <4>[ 2289.092546][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028b5000
> > > ..<snip>..
> > > <4>[ 2289.135628][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
> > > <4>[ 2289.135642][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
> > > <4>[ 2289.135729][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028be000
> > > <4>[ 2289.135732][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028c7000


if this is the case (__GFP_NOMEMALLOC removed from both kmalloc and __vmalloc),
then proposed

	kmalloc(f & ~__GFP_NOMEMALLOC)
	__vmalloc(f | __GFP_NOMEMALLOC)


can be very close to 'test #1 && test #2':

	kmalloc() fails (as in test #2)
	__vmalloc() fails (as in test #1)

isn't it?

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  5:16               ` Sergey Senozhatsky
@ 2015-12-01  6:35                 ` Kyeongdon Kim
  2015-12-01  7:15                   ` Sergey Senozhatsky
  2015-12-01  7:24                   ` Sergey Senozhatsky
  0 siblings, 2 replies; 18+ messages in thread
From: Kyeongdon Kim @ 2015-12-01  6:35 UTC (permalink / raw)
  To: Sergey Senozhatsky, Minchan Kim
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky


On 2015-12-01 오후 2:16, Sergey Senozhatsky wrote:
> On (12/01/15 13:55), Minchan Kim wrote:
> [..]
>> To clear my opinion,
>>
>> lzo_create(gfp_t flags)
>> {
>> void * ret = kmalloc(LZO1X_MEM_COMPRESS, flags);
>> if (!ret)
>> ret = vmalloc(LZO1X_MEM_COMPRESS, flasgs | GFP_NOMEMALLOC);
>> return ret;
>> }
> 
> ah, ok, I see. I've a question.
> 
> we had
> kmalloc(f | __GFP_NOMEMALLOC)
> __vmalloc(f | __GFP_NOMEMALLOC)
> 
> which produced high failure rates for both kmalloc() and __vmalloc()
> 
> test #1
> 
>> > > log message :
> [..]
>> > > [ 352.230608][0] zcomp_lz4_create: 32: ret = (null)
>> > > [ 352.230619][0] zcomp_lz4_create: 38: ret = (null)
>> > > [ 352.230888][0] zcomp_lz4_create: 32: ret = (null)
>> > > [ 352.230902][0] zcomp_lz4_create: 38: ret = (null)
>> > > [ 352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
>> > > [ 352.234024][0] zcomp_lz4_create: 32: ret = (null)
>> > > [ 352.234060][0] zcomp_lz4_create: 38: ret = (null)
>> > > [ 352.234359][0] zcomp_lz4_create: 32: ret = (null)
> [..]
>> > > [ 352.234384][0] zcomp_lz4_create: 38: ret = (null)
>> > > [ 352.234618][0] zcomp_lz4_create: 32: ret = (null)
>> > > [ 352.234639][0] zcomp_lz4_create: 38: ret = (null)
>> > > [ 352.234667][0] zcomp_lz4_create: 32: ret = (null)
>> > > [ 352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000
> 
> 
> 
> Kyeongdon, do I understand correctly, that for the second test you
> removed '__GFP_NOMEMALLOC' from both kmalloc() and __vmalloc()?
> 
> iow:
> kmalloc(f & ~__GFP_NOMEMALLOC)
> vmalloc(f & ~__GFP_NOMEMALLOC)
> 
> test #2 : almost always failing kmalloc() and !NULL __vmalloc()
> 
>> > > log message :
>> > > <4>[ 2288.954934][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
>> > > <4>[ 2288.954972][0] KDKIM: zcomp_lz4_create: 30: ret =
> ffffff800287e000
>> > > ..<snip>..
>> > > <4>[ 2289.092411][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
>> > > <4>[ 2289.092546][0] KDKIM: zcomp_lz4_create: 30: ret =
> ffffff80028b5000
>> > > ..<snip>..
>> > > <4>[ 2289.135628][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
>> > > <4>[ 2289.135642][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
>> > > <4>[ 2289.135729][0] KDKIM: zcomp_lz4_create: 30: ret =
> ffffff80028be000
>> > > <4>[ 2289.135732][0] KDKIM: zcomp_lz4_create: 30: ret =
> ffffff80028c7000
> 
> 
> if this is the case (__GFP_NOMEMALLOC removed from both kmalloc and
> __vmalloc),
> then proposed
> 
> kmalloc(f & ~__GFP_NOMEMALLOC)
> __vmalloc(f | __GFP_NOMEMALLOC)
> 
> 
> can be very close to 'test #1 && test #2':
> 
> kmalloc() fails (as in test #2)
> __vmalloc() fails (as in test #1)
> 
> isn't it?
> 
> -ss

Let me give you a simple code of it.

@test #1 (previous shared log)
 kmalloc(f | __GFP_NOMEMALLOC)
  __vmalloc(f | __GFP_NOMEMALLOC)
// can find failure both

@test #2 (previous shared log)
 kmalloc(f | __GFP_NOMEMALLOC)
__vmalloc(f) 	
// removed '__GFP_NOMEMALLOC' from vmalloc() only, and cannot find
failure from vmalloc()

And like you said, I made a quick check to see a failure about kmalloc()
without the flag :

@test #3
 kmalloc(f)
__vmalloc(f | __GFP_NOMEMALLOC)
// removed '__GFP_NOMEMALLOC' from zmalloc() only
// and cannot find failure from zmalloc(), but in this case, it's hard
to find failure from vmalloc() because of already allocation mostly from
zsmalloc()

log message (test #3) :
<4>[  186.763605][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002030000
<4>[  186.776652][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020f0000
<4>[  186.811423][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002108000
<4>[  186.816744][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002000000
<4>[  186.816796][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002008000

@test #4
 kmalloc(f)
__vmalloc(f)
// cannot find failure both until now

log message (test #4) :
<4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
<snip>
<4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
<snip>
<4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
<snip>
<4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000

So,is there another problem if we remove the flag from both sides?

Thanks,
Kyeongdon Kim

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  6:35                 ` Kyeongdon Kim
@ 2015-12-01  7:15                   ` Sergey Senozhatsky
  2015-12-01  7:33                     ` Sergey Senozhatsky
  2015-12-01  8:16                     ` Minchan Kim
  2015-12-01  7:24                   ` Sergey Senozhatsky
  1 sibling, 2 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  7:15 UTC (permalink / raw)
  To: Minchan Kim, Kyeongdon Kim
  Cc: Andrew Morton, Sergey Senozhatsky, linux-kernel, Sergey Senozhatsky

On (12/01/15 15:35), Kyeongdon Kim wrote:
[..]
> @test #4
>  kmalloc(f)
> __vmalloc(f)
> // cannot find failure both until now
> 
> log message (test #4) :
> <4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
> <snip>
> <4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
> <snip>
> <4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
> <snip>
> <4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000

Thanks!

> So,is there another problem if we remove the flag from both sides?
> 

Technically, '~__GFP_NOMEMALLOC' is what we've been doing for some time (well,
always); and, as Minchan noted, zsmalloc does not depend on emergency pools.

I vote for removal of __GFP_NOMEMALLOC from both kmalloc() and __vmalloc().

(user can make ->max_strm big enough to deplete emergency mem; but I tend to
ignore it).

Minchan?

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  6:35                 ` Kyeongdon Kim
  2015-12-01  7:15                   ` Sergey Senozhatsky
@ 2015-12-01  7:24                   ` Sergey Senozhatsky
  1 sibling, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  7:24 UTC (permalink / raw)
  To: Kyeongdon Kim
  Cc: Sergey Senozhatsky, Minchan Kim, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (12/01/15 15:35), Kyeongdon Kim wrote:
> Let me give you a simple code of it.
> 
> @test #1 (previous shared log)
>  kmalloc(f | __GFP_NOMEMALLOC)
>   __vmalloc(f | __GFP_NOMEMALLOC)
> // can find failure both
> 
> @test #2 (previous shared log)
>  kmalloc(f | __GFP_NOMEMALLOC)
> __vmalloc(f) 	
> // removed '__GFP_NOMEMALLOC' from vmalloc() only, and cannot find
> failure from vmalloc()
> 
> And like you said, I made a quick check to see a failure about kmalloc()
> without the flag :
> 
> @test #3
>  kmalloc(f)
> __vmalloc(f | __GFP_NOMEMALLOC)
> // removed '__GFP_NOMEMALLOC' from zmalloc() only
> // and cannot find failure from zmalloc(), but in this case, it's hard
> to find failure from vmalloc() because of already allocation mostly from
> zsmalloc()
> 

I assume, that "zsmalloc" and "zmalloc" here are meant to be "kzalloc (kmalloc)"

	-ss

> log message (test #3) :
> <4>[  186.763605][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002030000
> <4>[  186.776652][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020f0000
> <4>[  186.811423][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002108000
> <4>[  186.816744][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002000000
> <4>[  186.816796][1] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002008000
> 
> @test #4
>  kmalloc(f)
> __vmalloc(f)
> // cannot find failure both until now
> 
> log message (test #4) :
> <4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
> <snip>
> <4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
> <snip>
> <4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
> <snip>
> <4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000
> 
> So,is there another problem if we remove the flag from both sides?
> 
> Thanks,
> Kyeongdon Kim
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  7:15                   ` Sergey Senozhatsky
@ 2015-12-01  7:33                     ` Sergey Senozhatsky
  2015-12-01  8:16                     ` Minchan Kim
  1 sibling, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  7:33 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Kyeongdon Kim, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (12/01/15 16:15), Sergey Senozhatsky wrote:
> On (12/01/15 15:35), Kyeongdon Kim wrote:
> [..]
> > @test #4
> >  kmalloc(f)
> > __vmalloc(f)
> > // cannot find failure both until now
> > 
> > log message (test #4) :
> > <4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
> > <snip>
> > <4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
> > <snip>
> > <4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
> > <snip>
> > <4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000
> 
> Thanks!
> 
> > So,is there another problem if we remove the flag from both sides?
> > 
> 
> Technically, '~__GFP_NOMEMALLOC' is what we've been doing for some time (well,
> always); and, as Minchan noted, zsmalloc does not depend on emergency pools.
> 
> I vote for removal of __GFP_NOMEMALLOC from both kmalloc() and __vmalloc().
> 

um.. which is very close to
    "remove vmalloc() fallback and use kzalloc(f & ~__GFP_NOMEMALLOC) only"

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  7:15                   ` Sergey Senozhatsky
  2015-12-01  7:33                     ` Sergey Senozhatsky
@ 2015-12-01  8:16                     ` Minchan Kim
  2015-12-01  9:11                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 18+ messages in thread
From: Minchan Kim @ 2015-12-01  8:16 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Kyeongdon Kim, Andrew Morton, linux-kernel, Sergey Senozhatsky

On Tue, Dec 01, 2015 at 04:15:42PM +0900, Sergey Senozhatsky wrote:
> On (12/01/15 15:35), Kyeongdon Kim wrote:
> [..]
> > @test #4
> >  kmalloc(f)
> > __vmalloc(f)
> > // cannot find failure both until now
> > 
> > log message (test #4) :
> > <4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
> > <snip>
> > <4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
> > <snip>
> > <4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
> > <snip>
> > <4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000
> 
> Thanks!
> 
> > So,is there another problem if we remove the flag from both sides?
> > 
> 
> Technically, '~__GFP_NOMEMALLOC' is what we've been doing for some time (well,
> always); and, as Minchan noted, zsmalloc does not depend on emergency pools.
> 
> I vote for removal of __GFP_NOMEMALLOC from both kmalloc() and __vmalloc().
> 
> (user can make ->max_strm big enough to deplete emergency mem; but I tend to
> ignore it).
> 
> Minchan?

Agree. Do you mind resending patches? :)

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()
  2015-12-01  8:16                     ` Minchan Kim
@ 2015-12-01  9:11                       ` Sergey Senozhatsky
  0 siblings, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2015-12-01  9:11 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Kyeongdon Kim, Andrew Morton, linux-kernel,
	Sergey Senozhatsky

On (12/01/15 17:16), Minchan Kim wrote:
> On Tue, Dec 01, 2015 at 04:15:42PM +0900, Sergey Senozhatsky wrote:
> > On (12/01/15 15:35), Kyeongdon Kim wrote:
> > [..]
> > > @test #4
> > >  kmalloc(f)
> > > __vmalloc(f)
> > > // cannot find failure both until now
> > > 
> > > log message (test #4) :
> > > <4>[  641.440468][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002190000
> > > <snip>
> > > <4>[  922.182980][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002208000
> > > <snip>
> > > <4>[  923.197593][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc002020000
> > > <snip>
> > > <4>[  939.813499][7] KDKIM: zcomp_lz4_create: 24: ret = ffffffc0020a0000
> > 
> > Thanks!
> > 
> > > So,is there another problem if we remove the flag from both sides?
> > > 
> > 
> > Technically, '~__GFP_NOMEMALLOC' is what we've been doing for some time (well,
> > always); and, as Minchan noted, zsmalloc does not depend on emergency pools.
> > 
> > I vote for removal of __GFP_NOMEMALLOC from both kmalloc() and __vmalloc().
> > 
> > (user can make ->max_strm big enough to deplete emergency mem; but I tend to
> > ignore it).
> > 
> > Minchan?
> 
> Agree. Do you mind resending patches? :)

OK, will do later today. Thanks.

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-12-01  9:10 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-27  4:10 [PATCH v3 0/2] zram/zcomp: stream allocation fixes and tweaks Sergey Senozhatsky
2015-11-27  4:10 ` [PATCH v3 1/2] zram/zcomp: use GFP_NOIO to allocate streams Sergey Senozhatsky
2015-11-30  7:09   ` Minchan Kim
2015-11-27  4:10 ` [PATCH v3 2/2] zram: try vmalloc() after kmalloc() Sergey Senozhatsky
2015-11-30  7:10   ` Minchan Kim
2015-11-30 10:42     ` kyeongdon.kim
2015-11-30 11:14       ` Sergey Senozhatsky
2015-12-01  2:04         ` kyeongdon.kim
     [not found]       ` <20151130231841.GA960@bbox>
2015-12-01  0:33         ` Sergey Senozhatsky
2015-12-01  2:31         ` Re: " kyeongdon.kim
2015-12-01  4:44           ` Minchan Kim
     [not found]             ` <20151201045535.GA5999@bbox>
2015-12-01  5:16               ` Sergey Senozhatsky
2015-12-01  6:35                 ` Kyeongdon Kim
2015-12-01  7:15                   ` Sergey Senozhatsky
2015-12-01  7:33                     ` Sergey Senozhatsky
2015-12-01  8:16                     ` Minchan Kim
2015-12-01  9:11                       ` Sergey Senozhatsky
2015-12-01  7:24                   ` Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.