All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/15] bcache fixes before Linux v5.3
@ 2019-06-04 15:16 Coly Li
  2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
                   ` (15 more replies)
  0 siblings, 16 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

Hi folks,

Here are some bcache fixes on my hand before, I post them out
for your review and comments. I am also testing them at same time,
and manage to have them into the for-v5.3 series.

Thanks in advance.

Coly Li
---

Alexandru Ardelean (1):
  bcache: use sysfs_match_string() instead of __sysfs_match_string()

Coly Li (14):
  Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
  bcache: avoid flushing btree node in cache_set_flush() if io disabled
  bcache: ignore read-ahead request failure on backing device
  bcache: add io error counting in write_bdev_super_endio()
  bcache: remove "XXX:" comment line from run_cache_set()
  bcache: remove unnecessary prefetch() in bset_search_tree()
  bcache: add return value check to bch_cached_dev_run()
  bcache: remove unncessary code in bch_btree_keys_init()
  bcache: check CACHE_SET_IO_DISABLE in allocator code
  bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
  bcache: more detailed error message to bcache_device_link()
  bcache: add more error message in bch_cached_dev_attach()
  bcache: improve error message in bch_cached_dev_run()
  bcache: shrink btree node cache after bch_btree_check()

 drivers/md/bcache/alloc.c   |   9 ++++
 drivers/md/bcache/bcache.h  |   2 +-
 drivers/md/bcache/bset.c    |  31 ++++--------
 drivers/md/bcache/btree.c   |   4 ++
 drivers/md/bcache/io.c      |  12 +++++
 drivers/md/bcache/journal.c |   4 ++
 drivers/md/bcache/super.c   | 113 +++++++++++++++++++++++++++++---------------
 drivers/md/bcache/sysfs.c   |  27 +++++------
 8 files changed, 127 insertions(+), 75 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li, stable

This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.

Although this patch helps the failed bcache device to stop faster when
too many I/O errors detected on corresponding cached device, setting
CACHE_SET_IO_DISABLE bit to cache set c->flags was not a good idea. This
operation will disable all I/Os on cache set, which means other attached
bcache devices won't work neither.

Without this patch, the failed bcache device can also be stopped
eventually if internal I/O accomplished (e.g. writeback). Therefore here
I revert it.

Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()")
Reported-by: Yong Li <mr.liyong@qq.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/super.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1b63ac876169..eaaa046fd95d 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1437,8 +1437,6 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
 
 bool bch_cached_dev_error(struct cached_dev *dc)
 {
-	struct cache_set *c;
-
 	if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags))
 		return false;
 
@@ -1449,21 +1447,6 @@ bool bch_cached_dev_error(struct cached_dev *dc)
 	pr_err("stop %s: too many IO errors on backing device %s\n",
 		dc->disk.disk->disk_name, dc->backing_dev_name);
 
-	/*
-	 * If the cached device is still attached to a cache set,
-	 * even dc->io_disable is true and no more I/O requests
-	 * accepted, cache device internal I/O (writeback scan or
-	 * garbage collection) may still prevent bcache device from
-	 * being stopped. So here CACHE_SET_IO_DISABLE should be
-	 * set to c->flags too, to make the internal I/O to cache
-	 * device rejected and stopped immediately.
-	 * If c is NULL, that means the bcache device is not attached
-	 * to any cache set, then no CACHE_SET_IO_DISABLE bit to set.
-	 */
-	c = dc->disk.c;
-	if (c && test_and_set_bit(CACHE_SET_IO_DISABLE, &c->flags))
-		pr_info("CACHE_SET_IO_DISABLE already set");
-
 	bcache_device_stop(&dc->disk);
 	return true;
 }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
  2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

When cache_set_flush() is called for too many I/O errors detected on
cache device and the cache set is retiring, inside the function it
doesn't make sense to flushing cached btree nodes from c->btree_cache
because CACHE_SET_IO_DISABLE is set on c->flags already and all I/Os
onto cache device will be rejected.

This patch checks in cache_set_flush() that whether CACHE_SET_IO_DISABLE
is set. If yes, then avoids to flush the cached btree nodes to reduce
more time and make cache set retiring more faster.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index eaaa046fd95d..da9d6a63b81a 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1553,13 +1553,17 @@ static void cache_set_flush(struct closure *cl)
 	if (!IS_ERR_OR_NULL(c->root))
 		list_add(&c->root->list, &c->btree_cache);
 
-	/* Should skip this if we're unregistering because of an error */
-	list_for_each_entry(b, &c->btree_cache, list) {
-		mutex_lock(&b->write_lock);
-		if (btree_node_dirty(b))
-			__bch_btree_node_write(b, NULL);
-		mutex_unlock(&b->write_lock);
-	}
+	/*
+	 * Avoid flushing cached nodes if cache set is retiring
+	 * due to too many I/O errors detected.
+	 */
+	if (!test_bit(CACHE_SET_IO_DISABLE, &c->flags))
+		list_for_each_entry(b, &c->btree_cache, list) {
+			mutex_lock(&b->write_lock);
+			if (btree_node_dirty(b))
+				__bch_btree_node_write(b, NULL);
+			mutex_unlock(&b->write_lock);
+		}
 
 	for_each_cache(ca, c, i)
 		if (ca->alloc_thread)
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 03/15] bcache: ignore read-ahead request failure on backing device
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
  2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
  2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li, stable

When md raid device (e.g. raid456) is used as backing device, read-ahead
requests on a degrading and recovering md raid device might be failured
immediately by md raid code, but indeed this md raid array can still be
read or write for normal I/O requests. Therefore such failed read-ahead
request are not real hardware failure. Further more, after degrading and
recovering accomplished, read-ahead requests will be handled by md raid
array again.

For such condition, I/O failures of read-ahead requests don't indicate
real health status (because normal I/O still be served), they should not
be counted into I/O error counter dc->io_errors.

Since there is no simple way to detect whether the backing divice is a
md raid device, this patch simply ignores I/O failures for read-ahead
bios on backing device, to avoid bogus backing device failure on a
degrading md raid array.

Suggested-and-tested-by: Thorsten Knabe <linux@thorsten-knabe.de>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/io.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index c25097968319..4d93f07f63e5 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -58,6 +58,18 @@ void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio)
 
 	WARN_ONCE(!dc, "NULL pointer of struct cached_dev");
 
+	/*
+	 * Read-ahead requests on a degrading and recovering md raid
+	 * (e.g. raid6) device might be failured immediately by md
+	 * raid code, which is not a real hardware media failure. So
+	 * we shouldn't count failed REQ_RAHEAD bio to dc->io_errors.
+	 */
+	if (bio->bi_opf & REQ_RAHEAD) {
+		pr_warn_ratelimited("%s: Read-ahead I/O failed on backing device, ignore",
+				    dc->backing_dev_name);
+		return;
+	}
+
 	errors = atomic_add_return(1, &dc->io_errors);
 	if (errors < dc->error_limit)
 		pr_err("%s: IO error on backing device, unrecoverable",
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (2 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

When backing device super block is written by bch_write_bdev_super(),
the bio complete callback write_bdev_super_endio() simply ignores I/O
status. Indeed such write request also contribute to backing device
health status if the request failed.

This patch checkes bio->bi_status in write_bdev_super_endio(), if there
is error, bch_count_backing_io_errors() will be called to count an I/O
error to dc->io_errors.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index da9d6a63b81a..877113b62b0f 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -197,7 +197,9 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 static void write_bdev_super_endio(struct bio *bio)
 {
 	struct cached_dev *dc = bio->bi_private;
-	/* XXX: error checking */
+
+	if (bio->bi_status)
+		bch_count_backing_io_errors(dc, bio);
 
 	closure_put(&dc->sb_write);
 }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (3 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

In previous bcache patches for Linux v5.2, the failure code path of
run_cache_set() is tested and fixed. So now the following comment
line can be removed from run_cache_set(),
	/* XXX: test this, it's broken */

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 877113b62b0f..3364b20567eb 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1946,7 +1946,7 @@ static int run_cache_set(struct cache_set *c)
 	}
 
 	closure_sync(&cl);
-	/* XXX: test this, it's broken */
+
 	bch_cache_set_error(c, "%s", err);
 
 	return -EIO;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (4 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

In function bset_search_tree(), when p >= t->size, t->tree[0] will be
prefetched by the following code piece,
 974                 unsigned int p = n << 4;
 975
 976                 p &= ((int) (p - t->size)) >> 31;
 977
 978                 prefetch(&t->tree[p]);

The purpose of the above code is to avoid a branch instruction, but
when p >= t->size, prefetch(&t->tree[0]) has no positive performance
contribution at all. This patch avoids the unncessary prefetch by only
calling prefetch() when p < t->size.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/bset.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index 8f07fa6e1739..aa2e4ab0fab9 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -960,22 +960,10 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t,
 	unsigned int inorder, j, n = 1;
 
 	do {
-		/*
-		 * A bit trick here.
-		 * If p < t->size, (int)(p - t->size) is a minus value and
-		 * the most significant bit is set, right shifting 31 bits
-		 * gets 1. If p >= t->size, the most significant bit is
-		 * not set, right shifting 31 bits gets 0.
-		 * So the following 2 lines equals to
-		 *	if (p >= t->size)
-		 *		p = 0;
-		 * but a branch instruction is avoided.
-		 */
 		unsigned int p = n << 4;
 
-		p &= ((int) (p - t->size)) >> 31;
-
-		prefetch(&t->tree[p]);
+		if (p < t->size)
+			prefetch(&t->tree[p]);
 
 		j = n;
 		f = &t->tree[j];
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (5 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Alexandru Ardelean, Coly Li

From: Alexandru Ardelean <alexandru.ardelean@analog.com>

The arrays (of strings) that are passed to __sysfs_match_string() are
static, so use sysfs_match_string() which does an implicit ARRAY_SIZE()
over these arrays.

Functionally, this doesn't change anything.
The change is more cosmetic.

It only shrinks the static arrays by 1 byte each.

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/sysfs.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 6cd44d3cf906..3a520262933d 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -21,28 +21,24 @@ static const char * const bch_cache_modes[] = {
 	"writethrough",
 	"writeback",
 	"writearound",
-	"none",
-	NULL
+	"none"
 };
 
 /* Default is 0 ("auto") */
 static const char * const bch_stop_on_failure_modes[] = {
 	"auto",
-	"always",
-	NULL
+	"always"
 };
 
 static const char * const cache_replacement_policies[] = {
 	"lru",
 	"fifo",
-	"random",
-	NULL
+	"random"
 };
 
 static const char * const error_actions[] = {
 	"unregister",
-	"panic",
-	NULL
+	"panic"
 };
 
 write_attribute(attach);
@@ -333,7 +329,7 @@ STORE(__cached_dev)
 		bch_cached_dev_run(dc);
 
 	if (attr == &sysfs_cache_mode) {
-		v = __sysfs_match_string(bch_cache_modes, -1, buf);
+		v = sysfs_match_string(bch_cache_modes, buf);
 		if (v < 0)
 			return v;
 
@@ -344,7 +340,7 @@ STORE(__cached_dev)
 	}
 
 	if (attr == &sysfs_stop_when_cache_set_failed) {
-		v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
+		v = sysfs_match_string(bch_stop_on_failure_modes, buf);
 		if (v < 0)
 			return v;
 
@@ -794,7 +790,7 @@ STORE(__bch_cache_set)
 			    0, UINT_MAX);
 
 	if (attr == &sysfs_errors) {
-		v = __sysfs_match_string(error_actions, -1, buf);
+		v = sysfs_match_string(error_actions, buf);
 		if (v < 0)
 			return v;
 
@@ -1058,7 +1054,7 @@ STORE(__bch_cache)
 	}
 
 	if (attr == &sysfs_cache_replacement_policy) {
-		v = __sysfs_match_string(cache_replacement_policies, -1, buf);
+		v = sysfs_match_string(cache_replacement_policies, buf);
 		if (v < 0)
 			return v;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 08/15] bcache: add return value check to bch_cached_dev_run()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (6 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

This patch adds return value check to bch_cached_dev_run(), now if there
is error happens inside bch_cached_dev_run(), it can be catched.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/bcache.h |  2 +-
 drivers/md/bcache/super.c  | 33 ++++++++++++++++++++++++++-------
 drivers/md/bcache/sysfs.c  |  7 +++++--
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index fdf75352e16a..73a97586a2ef 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -1006,7 +1006,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size);
 int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 			  uint8_t *set_uuid);
 void bch_cached_dev_detach(struct cached_dev *dc);
-void bch_cached_dev_run(struct cached_dev *dc);
+int bch_cached_dev_run(struct cached_dev *dc);
 void bcache_device_stop(struct bcache_device *d);
 
 void bch_cache_set_unregister(struct cache_set *c);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 3364b20567eb..74eb18b1af40 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -910,7 +910,7 @@ static int cached_dev_status_update(void *arg)
 }
 
 
-void bch_cached_dev_run(struct cached_dev *dc)
+int bch_cached_dev_run(struct cached_dev *dc)
 {
 	struct bcache_device *d = &dc->disk;
 	char *buf = kmemdup_nul(dc->sb.label, SB_LABEL_SIZE, GFP_KERNEL);
@@ -921,11 +921,14 @@ void bch_cached_dev_run(struct cached_dev *dc)
 		NULL,
 	};
 
+	if (dc->io_disable)
+		return -EIO;
+
 	if (atomic_xchg(&dc->running, 1)) {
 		kfree(env[1]);
 		kfree(env[2]);
 		kfree(buf);
-		return;
+		return -EBUSY;
 	}
 
 	if (!d->c &&
@@ -951,8 +954,11 @@ void bch_cached_dev_run(struct cached_dev *dc)
 	kfree(buf);
 
 	if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") ||
-	    sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache"))
+	    sysfs_create_link(&disk_to_dev(d->disk)->kobj,
+			      &d->kobj, "bcache")) {
 		pr_debug("error creating sysfs link");
+		return -ENOMEM;
+	}
 
 	dc->status_update_thread = kthread_run(cached_dev_status_update,
 					       dc, "bcache_status_update");
@@ -961,6 +967,8 @@ void bch_cached_dev_run(struct cached_dev *dc)
 			"continue to run without monitoring backing "
 			"device status");
 	}
+
+	return 0;
 }
 
 /*
@@ -1056,6 +1064,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 	uint32_t rtime = cpu_to_le32((u32)ktime_get_real_seconds());
 	struct uuid_entry *u;
 	struct cached_dev *exist_dc, *t;
+	int ret = 0;
 
 	if ((set_uuid && memcmp(set_uuid, c->sb.set_uuid, 16)) ||
 	    (!set_uuid && memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16)))
@@ -1165,7 +1174,12 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 
 	bch_sectors_dirty_init(&dc->disk);
 
-	bch_cached_dev_run(dc);
+	ret = bch_cached_dev_run(dc);
+	if (ret && (ret != -EBUSY)) {
+		up_write(&dc->writeback_lock);
+		return ret;
+	}
+
 	bcache_device_link(&dc->disk, c, "bdev");
 	atomic_inc(&c->attached_dev_nr);
 
@@ -1292,6 +1306,7 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page,
 {
 	const char *err = "cannot allocate memory";
 	struct cache_set *c;
+	int ret = -ENOMEM;
 
 	bdevname(bdev, dc->backing_dev_name);
 	memcpy(&dc->sb, sb, sizeof(struct cache_sb));
@@ -1321,14 +1336,18 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page,
 		bch_cached_dev_attach(dc, c, NULL);
 
 	if (BDEV_STATE(&dc->sb) == BDEV_STATE_NONE ||
-	    BDEV_STATE(&dc->sb) == BDEV_STATE_STALE)
-		bch_cached_dev_run(dc);
+	    BDEV_STATE(&dc->sb) == BDEV_STATE_STALE) {
+		err = "failed to run cached device";
+		ret = bch_cached_dev_run(dc);
+		if (ret)
+			goto err;
+	}
 
 	return 0;
 err:
 	pr_notice("error %s: %s", dc->backing_dev_name, err);
 	bcache_device_stop(&dc->disk);
-	return -EIO;
+	return ret;
 }
 
 /* Flash only volumes */
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 3a520262933d..129031663cc8 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -325,8 +325,11 @@ STORE(__cached_dev)
 		bch_cache_accounting_clear(&dc->accounting);
 
 	if (attr == &sysfs_running &&
-	    strtoul_or_return(buf))
-		bch_cached_dev_run(dc);
+	    strtoul_or_return(buf)) {
+		v = bch_cached_dev_run(dc);
+		if (v)
+			return v;
+	}
 
 	if (attr == &sysfs_cache_mode) {
 		v = sysfs_match_string(bch_cache_modes, buf);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (7 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

Function bch_btree_keys_init() initializes b->set[].size and
b->set[].data to zero. As the code comments indicates, these code indeed
is unncessary, because both struct btree_keys and struct bset_tree are
nested embedded into struct btree, when struct btree is filled with 0
bits by kzalloc() in mca_bucket_alloc(), b->set[].size and
b->set[].data are initialized to 0 (a.k.a NULL) already.

This patch removes the redundant code, and add comments in
bch_btree_keys_init() and mca_bucket_alloc() to explain why it's safe.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/bset.c  | 15 ++++++---------
 drivers/md/bcache/btree.c |  4 ++++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index aa2e4ab0fab9..eedaf0f3e3f0 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -347,22 +347,19 @@ EXPORT_SYMBOL(bch_btree_keys_alloc);
 void bch_btree_keys_init(struct btree_keys *b, const struct btree_keys_ops *ops,
 			 bool *expensive_debug_checks)
 {
-	unsigned int i;
-
 	b->ops = ops;
 	b->expensive_debug_checks = expensive_debug_checks;
 	b->nsets = 0;
 	b->last_set_unwritten = 0;
 
-	/* XXX: shouldn't be needed */
-	for (i = 0; i < MAX_BSETS; i++)
-		b->set[i].size = 0;
 	/*
-	 * Second loop starts at 1 because b->keys[0]->data is the memory we
-	 * allocated
+	 * struct btree_keys in embedded in struct btree, and struct
+	 * bset_tree is embedded into struct btree_keys. They are all
+	 * initialized as 0 by kzalloc() in mca_bucket_alloc(), and
+	 * b->set[0].data is allocated in bch_btree_keys_alloc(), so we
+	 * don't have to initiate b->set[].size and b->set[].data here
+	 * any more.
 	 */
-	for (i = 1; i < MAX_BSETS; i++)
-		b->set[i].data = NULL;
 }
 EXPORT_SYMBOL(bch_btree_keys_init);
 
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 773f5fdad25f..cf38a1b031fa 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -613,6 +613,10 @@ static void mca_data_alloc(struct btree *b, struct bkey *k, gfp_t gfp)
 static struct btree *mca_bucket_alloc(struct cache_set *c,
 				      struct bkey *k, gfp_t gfp)
 {
+	/*
+	 * kzalloc() is necessary here for initialization,
+	 * see code comments in bch_btree_keys_init().
+	 */
 	struct btree *b = kzalloc(sizeof(struct btree), gfp);
 
 	if (!b)
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (8 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O
errors, currently allocator routines can still continue allocate
space which may introduce inconsistent metadata state.

This patch checkes CACHE_SET_IO_DISABLE bit in following allocator
routines,
- bch_bucket_alloc()
- __bch_bucket_alloc_set()
Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines
may reject allocation request earlier to avoid potential inconsistent
metadata.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/alloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index f8986effcb50..6f776823b9ba 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -393,6 +393,11 @@ long bch_bucket_alloc(struct cache *ca, unsigned int reserve, bool wait)
 	struct bucket *b;
 	long r;
 
+
+	/* No allocation if CACHE_SET_IO_DISABLE bit is set */
+	if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &ca->set->flags)))
+		return -1;
+
 	/* fastpath */
 	if (fifo_pop(&ca->free[RESERVE_NONE], r) ||
 	    fifo_pop(&ca->free[reserve], r))
@@ -484,6 +489,10 @@ int __bch_bucket_alloc_set(struct cache_set *c, unsigned int reserve,
 {
 	int i;
 
+	/* No allocation if CACHE_SET_IO_DISABLE bit is set */
+	if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags)))
+		return -1;
+
 	lockdep_assert_held(&c->bucket_lock);
 	BUG_ON(!n || n > c->caches_loaded || n > MAX_CACHES_PER_SET);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (9 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE
bit is set, bch_journal() may continue to work because the journaling
bkey might be still in write set yet. The caller of bch_journal() may
believe the journal still work but the truth is in-memory journal write
set won't be written into cache device any more. This behavior may
introduce potential inconsistent metadata status.

This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(),
if the bit is set, bch_journal() returns NULL immediately to notice
caller to know journal does not work.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/journal.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 12dae9348147..d4b9817f2237 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -811,6 +811,10 @@ atomic_t *bch_journal(struct cache_set *c,
 	struct journal_write *w;
 	atomic_t *ret;
 
+	/* No journaling if CACHE_SET_IO_DISABLE set already */
+	if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags)))
+		return NULL;
+
 	if (!CACHE_SYNC(&c->sb))
 		return NULL;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 12/15] bcache: more detailed error message to bcache_device_link()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (10 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

This patch adds more accurate error message for specific
ssyfs_create_link() call, to help debugging failure during
bcache device start tup.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 74eb18b1af40..1fa3f4e26d02 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -693,6 +693,7 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c,
 {
 	unsigned int i;
 	struct cache *ca;
+	int ret;
 
 	for_each_cache(ca, d->c, i)
 		bd_link_disk_holder(ca->bdev, d->disk);
@@ -700,9 +701,13 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c,
 	snprintf(d->name, BCACHEDEVNAME_SIZE,
 		 "%s%u", name, d->id);
 
-	WARN(sysfs_create_link(&d->kobj, &c->kobj, "cache") ||
-	     sysfs_create_link(&c->kobj, &d->kobj, d->name),
-	     "Couldn't create device <-> cache set symlinks");
+	ret = sysfs_create_link(&d->kobj, &c->kobj, "cache");
+	if (ret < 0)
+		pr_err("Couldn't create device -> cache set symlink");
+
+	ret = sysfs_create_link(&c->kobj, &d->kobj, d->name);
+	if (ret < 0)
+		pr_err("Couldn't create cache set -> device symlink");
 
 	clear_bit(BCACHE_DEV_UNLINK_DONE, &d->flags);
 }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (11 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

This patch adds more error message for attaching cached device, this is
helpful to debug code failure during bache device start up.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1fa3f4e26d02..cf5673af3143 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1169,6 +1169,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 	down_write(&dc->writeback_lock);
 	if (bch_cached_dev_writeback_start(dc)) {
 		up_write(&dc->writeback_lock);
+		pr_err("Couldn't start writeback facilities for %s",
+		       dc->disk.disk->disk_name);
 		return -ENOMEM;
 	}
 
@@ -1182,6 +1184,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 	ret = bch_cached_dev_run(dc);
 	if (ret && (ret != -EBUSY)) {
 		up_write(&dc->writeback_lock);
+		pr_err("Couldn't run cached device %s",
+		       dc->backing_dev_name);
 		return ret;
 	}
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (12 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
  2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

When cache set starts, bch_btree_check() will check all bkeys on cache
device by calculating the checksum. This operation will consume a huge
number of system memory if there are a lot of data cached. Since bcache
uses its own mca cache to maintain all its read-in btree nodes, and only
releases the cache space when system memory manage code starts to shrink
caches. There is will be a delay between bch_btree_check() returns and
the bcache shrink code gets called, so following memory allocatiion
might fail after bch_btree_check() finished. The most frequent one is
failure of creating allocator kernel thread.

This patch tries to proactively call bcache mca shrinker routine to
release around 25% cache memory, to help following memory allocation
to success. 'Around 25% cache memory' means when mca shrnker tries to
release cache memory, it might have to skip some busy memory objects,
so the result might be a few less than the expected amount.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index cf5673af3143..4a6406b53de1 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1866,6 +1866,24 @@ static int run_cache_set(struct cache_set *c)
 		if (bch_btree_check(c))
 			goto err;
 
+		/*
+		 * bch_btree_check() may occupy too much system memory which
+		 * will fail memory allocation operations in the following
+		 * routines before kernel triggers memory shrinker call backs.
+		 * Shrinking 25% mca cache memory proactively here to avoid
+		 * potential memory allocation failure.
+		 */
+		if (!c->shrinker_disabled) {
+			struct shrink_control sc;
+
+			sc.gfp_mask = GFP_KERNEL;
+			sc.nr_to_scan =
+				c->shrink.count_objects(&c->shrink, &sc) / 4;
+			pr_debug("try to shrink %lu (25%%) cached btree node",
+				 sc.nr_to_scan);
+			c->shrink.scan_objects(&c->shrink, &sc);
+		}
+
 		bch_journal_mark(c, &journal);
 		bch_initial_gc_finish(c);
 		pr_debug("btree_check() done");
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 15/15] bcache: improve error message in bch_cached_dev_run()
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (13 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
@ 2019-06-04 15:16 ` Coly Li
  2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
  15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

This patch adds more error message in bch_cached_dev_run() to indicate
the exact reason why an error value is returned. Please notice when
printing out the "is running already" message, pr_info() is used here,
because in this case also -EBUSY is returned, the bcache device can
continue to attach to the cache devince and run, so it won't be an
error level message in kernel message.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4a6406b53de1..026e2df358c3 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -926,13 +926,18 @@ int bch_cached_dev_run(struct cached_dev *dc)
 		NULL,
 	};
 
-	if (dc->io_disable)
+	if (dc->io_disable) {
+		pr_err("I/O disabled on cached dev %s",
+		       dc->backing_dev_name);
 		return -EIO;
+	}
 
 	if (atomic_xchg(&dc->running, 1)) {
 		kfree(env[1]);
 		kfree(env[2]);
 		kfree(buf);
+		pr_info("cached dev %s is running already",
+		       dc->backing_dev_name);
 		return -EBUSY;
 	}
 
@@ -961,7 +966,7 @@ int bch_cached_dev_run(struct cached_dev *dc)
 	if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") ||
 	    sysfs_create_link(&disk_to_dev(d->disk)->kobj,
 			      &d->kobj, "bcache")) {
-		pr_debug("error creating sysfs link");
+		pr_err("Couldn't create bcache dev <-> disk sysfs symlinks");
 		return -ENOMEM;
 	}
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
  2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
                   ` (14 preceding siblings ...)
  2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
@ 2019-06-04 15:53 ` Coly Li
  2019-06-04 15:53   ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
  2019-06-04 15:53   ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li
  15 siblings, 2 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li, stable

When people set a writeback percent via sysfs file,
  /sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.

If there is no cache set attached to, the writebac kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().

This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).

This problem might be introduced from initial bcache commit, but
commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
changes part of the orignal code piece, so I add 'Fixes: 3fd47bfe55b0'
to indicate from which commit this patch can be applied.

Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
Reported-by: Bjørn Forsman <bjorn.forsman@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/sysfs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 129031663cc8..eb678e43ac00 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -430,8 +430,13 @@ STORE(bch_cached_dev)
 			bch_writeback_queue(dc);
 	}
 
+	/*
+	 * Only set BCACHE_DEV_WB_RUNNING when cached device attached to
+	 * a cache set, otherwise it doesn't make sense.
+	 */
 	if (attr == &sysfs_writeback_percent)
-		if (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
+		if ((dc->disk.c != NULL) &&
+		    (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)))
 			schedule_delayed_work(&dc->writeback_rate_update,
 				      dc->writeback_rate_update_seconds * HZ);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 17/18] bcache: make bset_search_tree() be more understandable
  2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
@ 2019-06-04 15:53   ` Coly Li
  2019-06-05  5:43     ` Christoph Hellwig
  2019-06-04 15:53   ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li
  1 sibling, 1 reply; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

The purpose of following code in bset_search_tree() is to avoid a branch
instruction,
 994         if (likely(f->exponent != 127))
 995                 n = j * 2 + (((unsigned int)
 996                               (f->mantissa -
 997                                bfloat_mantissa(search, f))) >> 31);
 998         else
 999                 n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
1000                         ? j * 2
1001                         : j * 2 + 1;

This piece of code is not very clear to understand, even when I tried to
add code comment for it, I made mistake. This patch removes the implict
bit operation and uses explicit branch to calculate next location in
binary tree search.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/bset.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index eedaf0f3e3f0..32e2e4d8fa6c 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -965,21 +965,10 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t,
 		j = n;
 		f = &t->tree[j];
 
-		/*
-		 * Similar bit trick, use subtract operation to avoid a branch
-		 * instruction.
-		 *
-		 * n = (f->mantissa > bfloat_mantissa())
-		 *	? j * 2
-		 *	: j * 2 + 1;
-		 *
-		 * We need to subtract 1 from f->mantissa for the sign bit trick
-		 * to work  - that's done in make_bfloat()
-		 */
 		if (likely(f->exponent != 127))
-			n = j * 2 + (((unsigned int)
-				      (f->mantissa -
-				       bfloat_mantissa(search, f))) >> 31);
+			n = (f->mantissa >= bfloat_mantissa(search, f))
+				? j * 2
+				: j * 2 + 1;
 		else
 			n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
 				? j * 2
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 18/18] bcache: add code comments for journal_read_bucket()
  2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
  2019-06-04 15:53   ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
@ 2019-06-04 15:53   ` Coly Li
  1 sibling, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, Coly Li

This patch adds more code comments in journal_read_bucket(), this is an
effort to make the code to be more understandable.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/journal.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index d4b9817f2237..d3f2331fc559 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -100,6 +100,20 @@ reread:		left = ca->sb.bucket_size - offset;
 
 			blocks = set_blocks(j, block_bytes(ca->set));
 
+			/*
+			 * Nodes in 'list' are in linear increasing order of
+			 * i->j.seq, the node on head has the smallest (oldest)
+			 * journal seq, the node on tail has the biggest
+			 * (latest) journal seq.
+			 */
+
+			/*
+			 * Check from the oldest jset for last_seq. If
+			 * i->j.seq < j->last_seq, it means the oldest jset
+			 * in list is expired and useless, remove it from
+			 * this list. Otherwise, j is a condidate jset for
+			 * further following checks.
+			 */
 			while (!list_empty(list)) {
 				i = list_first_entry(list,
 					struct journal_replay, list);
@@ -109,13 +123,22 @@ reread:		left = ca->sb.bucket_size - offset;
 				kfree(i);
 			}
 
+			/* iterate list in reverse order (from latest jset) */
 			list_for_each_entry_reverse(i, list, list) {
 				if (j->seq == i->j.seq)
 					goto next_set;
 
+				/*
+				 * if j->seq is less than any i->j.last_seq
+				 * in list, j is an expired and useless jset.
+				 */
 				if (j->seq < i->j.last_seq)
 					goto next_set;
 
+				/*
+				 * 'where' points to first jset in list which
+				 * is elder then j.
+				 */
 				if (j->seq > i->j.seq) {
 					where = &i->list;
 					goto add;
@@ -129,6 +152,7 @@ reread:		left = ca->sb.bucket_size - offset;
 			if (!i)
 				return -ENOMEM;
 			memcpy(&i->j, j, bytes);
+			/* Add to the location after 'where' points to */
 			list_add(&i->list, where);
 			ret = 1;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 17/18] bcache: make bset_search_tree() be more understandable
  2019-06-04 15:53   ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
@ 2019-06-05  5:43     ` Christoph Hellwig
  2019-06-05  5:45       ` Coly Li
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2019-06-05  5:43 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache, linux-block

> -			n = j * 2 + (((unsigned int)
> -				      (f->mantissa -
> -				       bfloat_mantissa(search, f))) >> 31);
> +			n = (f->mantissa >= bfloat_mantissa(search, f))
> +				? j * 2
> +				: j * 2 + 1;

If you really want to make it more readable a good old if else would
help a lot.

>  		else
>  			n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
>  				? j * 2

Same here.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 17/18] bcache: make bset_search_tree() be more understandable
  2019-06-05  5:43     ` Christoph Hellwig
@ 2019-06-05  5:45       ` Coly Li
  0 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-05  5:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-bcache, linux-block

On 2019/6/5 1:43 下午, Christoph Hellwig wrote:
>> -			n = j * 2 + (((unsigned int)
>> -				      (f->mantissa -
>> -				       bfloat_mantissa(search, f))) >> 31);
>> +			n = (f->mantissa >= bfloat_mantissa(search, f))
>> +				? j * 2
>> +				: j * 2 + 1;
> 
> If you really want to make it more readable a good old if else would
> help a lot.
> 
>>  		else
>>  			n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
>>  				? j * 2
> 
> Same here.
> 

Hi Christoph,

Thanks for the hint, will handle it soon.

-- 

Coly Li

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-06-05  5:45 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
2019-06-04 15:53   ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
2019-06-05  5:43     ` Christoph Hellwig
2019-06-05  5:45       ` Coly Li
2019-06-04 15:53   ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.