Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/3] bcache patches for Linux v5.6-rc2 
@ 2020-02-13 14:12 Coly Li
  2020-02-13 14:12 ` [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread Coly Li
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Coly Li @ 2020-02-13 14:12 UTC (permalink / raw)
  To: axboe; +Cc: linux-bcache, linux-block, Coly Li

Hi Jens,

Here are 3 minor fixes Linux v5.6-rc2. The first 2 patches fix
kthread creating failure during bcache cache set starts up. The
last patch is a code clean up.

Please take them. Thanks in advance.

Coly Li
---
Coly Li (3):
  bcache: ignore pending signals when creating gc and allocator thread
  bcache: Revert "bcache: shrink btree node cache after
    bch_btree_check()"
  bcache: remove macro nr_to_fifo_front()

 drivers/md/bcache/alloc.c   | 18 ++++++++++++++++--
 drivers/md/bcache/btree.c   | 13 +++++++++++++
 drivers/md/bcache/journal.c |  7 ++-----
 drivers/md/bcache/super.c   | 17 -----------------
 4 files changed, 31 insertions(+), 24 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread
  2020-02-13 14:12 [PATCH 0/3] bcache patches for Linux v5.6-rc2 Coly Li
@ 2020-02-13 14:12 ` Coly Li
  2020-02-19 16:32   ` Christoph Hellwig
  2020-02-13 14:12 ` [PATCH 2/3] bcache: Revert "bcache: shrink btree node cache after bch_btree_check()" Coly Li
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Coly Li @ 2020-02-13 14:12 UTC (permalink / raw)
  To: axboe; +Cc: linux-bcache, linux-block, Coly Li

When run a cache set, all the bcache btree node of this cache set will
be checked by bch_btree_check(). If the bcache btree is very large,
iterating all the btree nodes will occupy too much system memory and
the bcache registering process might be selected and killed by system
OOM killer. kthread_run() will fail if current process has pending
signal, therefore the kthread creating in run_cache_set() for gc and
allocator kernel threads are very probably failed for a very large
bcache btree.

Indeed such OOM is safe and the registering process will exit after
the registration done. Therefore this patch flushes pending signals
during the cache set start up, specificly in bch_cache_allocator_start()
and bch_gc_thread_start(), to make sure run_cache_set() won't fail for
large cahced data set.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/alloc.c | 18 ++++++++++++++++--
 drivers/md/bcache/btree.c | 13 +++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index a1df0d95151c..8bc1faf71ff2 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -67,6 +67,7 @@
 #include <linux/blkdev.h>
 #include <linux/kthread.h>
 #include <linux/random.h>
+#include <linux/sched/signal.h>
 #include <trace/events/bcache.h>
 
 #define MAX_OPEN_BUCKETS 128
@@ -733,8 +734,21 @@ int bch_open_buckets_alloc(struct cache_set *c)
 
 int bch_cache_allocator_start(struct cache *ca)
 {
-	struct task_struct *k = kthread_run(bch_allocator_thread,
-					    ca, "bcache_allocator");
+	struct task_struct *k;
+
+	/*
+	 * In case previous btree check operation occupies too many
+	 * system memory for bcache btree node cache, and the
+	 * registering process is selected by OOM killer. Here just
+	 * ignore the SIGKILL sent by OOM killer if there is, to
+	 * avoid kthread_run() being failed by pending signals. The
+	 * bcache registering process will exit after the registration
+	 * done.
+	 */
+	if (signal_pending(current))
+		flush_signals(current);
+
+	k = kthread_run(bch_allocator_thread, ca, "bcache_allocator");
 	if (IS_ERR(k))
 		return PTR_ERR(k);
 
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 1e020bd7f5ac..4009023305a3 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -34,6 +34,7 @@
 #include <linux/random.h>
 #include <linux/rcupdate.h>
 #include <linux/sched/clock.h>
+#include <linux/sched/signal.h>
 #include <linux/rculist.h>
 #include <linux/delay.h>
 #include <trace/events/bcache.h>
@@ -1913,6 +1914,18 @@ static int bch_gc_thread(void *arg)
 
 int bch_gc_thread_start(struct cache_set *c)
 {
+	/*
+	 * In case previous btree check operation occupies too many
+	 * system memory for bcache btree node cache, and the
+	 * registering process is selected by OOM killer. Here just
+	 * ignore the SIGKILL sent by OOM killer if there is, to
+	 * avoid kthread_run() being failed by pending signals. The
+	 * bcache registering process will exit after the registration
+	 * done.
+	 */
+	if (signal_pending(current))
+		flush_signals(current);
+
 	c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
 	return PTR_ERR_OR_ZERO(c->gc_thread);
 }
-- 
2.16.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/3] bcache: Revert "bcache: shrink btree node cache after bch_btree_check()"
  2020-02-13 14:12 [PATCH 0/3] bcache patches for Linux v5.6-rc2 Coly Li
  2020-02-13 14:12 ` [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread Coly Li
@ 2020-02-13 14:12 ` Coly Li
  2020-02-13 14:12 ` [PATCH 3/3] bcache: remove macro nr_to_fifo_front() Coly Li
  2020-02-13 15:54 ` [PATCH 0/3] bcache patches for Linux v5.6-rc2 Jens Axboe
  3 siblings, 0 replies; 7+ messages in thread
From: Coly Li @ 2020-02-13 14:12 UTC (permalink / raw)
  To: axboe; +Cc: linux-bcache, linux-block, Coly Li

This reverts commit 1df3877ff6a4810054237c3259d900ded4468969.

In my testing, sometimes even all the cached btree nodes are freed,
creating gc and allocator kernel threads may still fail. Finally it
turns out that kthread_run() may fail if there is pending signal for
current task. And the pending signal is sent from OOM killer which
is triggered by memory consuption in bch_btree_check().

Therefore explicitly shrinking bcache btree node here does not help,
and after the shrinker callback is improved, as well as pending signals
are ignored before creating kernel threads, now such operation is
unncessary anymore.

This patch reverts the commit 1df3877ff6a4 ("bcache: shrink btree node
cache after bch_btree_check()") because we have better improvement now.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 2749daf09724..0c3c5419c52b 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1917,23 +1917,6 @@ static int run_cache_set(struct cache_set *c)
 		if (bch_btree_check(c))
 			goto err;
 
-		/*
-		 * bch_btree_check() may occupy too much system memory which
-		 * has negative effects to user space application (e.g. data
-		 * base) performance. Shrink the mca cache memory proactively
-		 * here to avoid competing memory with user space workloads..
-		 */
-		if (!c->shrinker_disabled) {
-			struct shrink_control sc;
-
-			sc.gfp_mask = GFP_KERNEL;
-			sc.nr_to_scan = c->btree_cache_used * c->btree_pages;
-			/* first run to clear b->accessed tag */
-			c->shrink.scan_objects(&c->shrink, &sc);
-			/* second run to reap non-accessed nodes */
-			c->shrink.scan_objects(&c->shrink, &sc);
-		}
-
 		bch_journal_mark(c, &journal);
 		bch_initial_gc_finish(c);
 		pr_debug("btree_check() done");
-- 
2.16.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 3/3] bcache: remove macro nr_to_fifo_front()
  2020-02-13 14:12 [PATCH 0/3] bcache patches for Linux v5.6-rc2 Coly Li
  2020-02-13 14:12 ` [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread Coly Li
  2020-02-13 14:12 ` [PATCH 2/3] bcache: Revert "bcache: shrink btree node cache after bch_btree_check()" Coly Li
@ 2020-02-13 14:12 ` Coly Li
  2020-02-13 15:54 ` [PATCH 0/3] bcache patches for Linux v5.6-rc2 Jens Axboe
  3 siblings, 0 replies; 7+ messages in thread
From: Coly Li @ 2020-02-13 14:12 UTC (permalink / raw)
  To: axboe; +Cc: linux-bcache, linux-block, Coly Li

Macro nr_to_fifo_front() is only used once in btree_flush_write(),
it is unncessary indeed. This patch removes this macro and does
calculation directly in place.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/journal.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 6730820780b0..0e3ff9745ac7 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -417,8 +417,6 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list)
 
 /* Journalling */
 
-#define nr_to_fifo_front(p, front_p, mask)	(((p) - (front_p)) & (mask))
-
 static void btree_flush_write(struct cache_set *c)
 {
 	struct btree *b, *t, *btree_nodes[BTREE_FLUSH_NR];
@@ -510,9 +508,8 @@ static void btree_flush_write(struct cache_set *c)
 		 *   journal entry can be reclaimed). These selected nodes
 		 *   will be ignored and skipped in the folowing for-loop.
 		 */
-		if (nr_to_fifo_front(btree_current_write(b)->journal,
-				     fifo_front_p,
-				     mask) != 0) {
+		if (((btree_current_write(b)->journal - fifo_front_p) &
+		     mask) != 0) {
 			mutex_unlock(&b->write_lock);
 			continue;
 		}
-- 
2.16.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] bcache patches for Linux v5.6-rc2
  2020-02-13 14:12 [PATCH 0/3] bcache patches for Linux v5.6-rc2 Coly Li
                   ` (2 preceding siblings ...)
  2020-02-13 14:12 ` [PATCH 3/3] bcache: remove macro nr_to_fifo_front() Coly Li
@ 2020-02-13 15:54 ` Jens Axboe
  3 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2020-02-13 15:54 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache, linux-block

On 2/13/20 7:12 AM, Coly Li wrote:
> Hi Jens,
> 
> Here are 3 minor fixes Linux v5.6-rc2. The first 2 patches fix
> kthread creating failure during bcache cache set starts up. The
> last patch is a code clean up.
> 
> Please take them. Thanks in advance.

Applied, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread
  2020-02-13 14:12 ` [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread Coly Li
@ 2020-02-19 16:32   ` Christoph Hellwig
  2020-02-20 13:20     ` Coly Li
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2020-02-19 16:32 UTC (permalink / raw)
  To: Coly Li; +Cc: axboe, linux-bcache, linux-block

On Thu, Feb 13, 2020 at 10:12:05PM +0800, Coly Li wrote:
> +	/*
> +	 * In case previous btree check operation occupies too many
> +	 * system memory for bcache btree node cache, and the
> +	 * registering process is selected by OOM killer. Here just
> +	 * ignore the SIGKILL sent by OOM killer if there is, to
> +	 * avoid kthread_run() being failed by pending signals. The
> +	 * bcache registering process will exit after the registration
> +	 * done.
> +	 */
> +	if (signal_pending(current))
> +		flush_signals(current);
> +
> +	k = kthread_run(bch_allocator_thread, ca, "bcache_allocator");

This really needs to go into the kthread code itself instead of
requiring cargo culting in the callers.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread
  2020-02-19 16:32   ` Christoph Hellwig
@ 2020-02-20 13:20     ` Coly Li
  0 siblings, 0 replies; 7+ messages in thread
From: Coly Li @ 2020-02-20 13:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-bcache, linux-block

On 2020/2/20 12:32 上午, Christoph Hellwig wrote:
> On Thu, Feb 13, 2020 at 10:12:05PM +0800, Coly Li wrote:
>> +	/*
>> +	 * In case previous btree check operation occupies too many
>> +	 * system memory for bcache btree node cache, and the
>> +	 * registering process is selected by OOM killer. Here just
>> +	 * ignore the SIGKILL sent by OOM killer if there is, to
>> +	 * avoid kthread_run() being failed by pending signals. The
>> +	 * bcache registering process will exit after the registration
>> +	 * done.
>> +	 */
>> +	if (signal_pending(current))
>> +		flush_signals(current);
>> +
>> +	k = kthread_run(bch_allocator_thread, ca, "bcache_allocator");
> 
> This really needs to go into the kthread code itself instead of
> requiring cargo culting in the callers.
> 

Hi Christoph,

Correct me if I am wrong.

If the signal is set before calling kthread_run(), kthread_run() will
fail and return -EINTR as code comment of __kthread_create_on_node() says,

 315  /*
 316   * Wait for completion in killable state, for I might be chosen by
 317   * the OOM killer while kthreadd is trying to allocate memory for
 318   * new kernel thread.
 319   */
 320   if (unlikely(wait_for_completion_killable(&done))) {
 321       /*
 322       * If I was SIGKILLed before kthreadd (or new kernel thread)
 323       * calls complete(), leave the cleanup of this structure to
 324       * that thread.
 325       */
 326       if (xchg(&create->done, NULL))
 327             return ERR_PTR(-EINTR);
 328       /*
 329        * kthreadd (or new kernel thread) will call complete()
 330        * shortly.
 331        */
 332        wait_for_completion(&done);
 333   }

So the caller of kthread_run(), in this case it is
bch_cache_allocator_start() will receive -EINTR, and returns error to
its caller bch_cache_set_alloc(). Then the registration will fail and
ignore what the kthread routine does in parallel.

Therefore I need to explicitly call pending_signal() before calling
kthread_run().


-- 

Coly Li

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-13 14:12 [PATCH 0/3] bcache patches for Linux v5.6-rc2 Coly Li
2020-02-13 14:12 ` [PATCH 1/3] bcache: ignore pending signals when creating gc and allocator thread Coly Li
2020-02-19 16:32   ` Christoph Hellwig
2020-02-20 13:20     ` Coly Li
2020-02-13 14:12 ` [PATCH 2/3] bcache: Revert "bcache: shrink btree node cache after bch_btree_check()" Coly Li
2020-02-13 14:12 ` [PATCH 3/3] bcache: remove macro nr_to_fifo_front() Coly Li
2020-02-13 15:54 ` [PATCH 0/3] bcache patches for Linux v5.6-rc2 Jens Axboe

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git