* [PATCH 00/15] bcache fixes before Linux v5.3
@ 2019-06-04 15:16 Coly Li
2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
` (15 more replies)
0 siblings, 16 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
Hi folks,
Here are some bcache fixes on my hand before, I post them out
for your review and comments. I am also testing them at same time,
and manage to have them into the for-v5.3 series.
Thanks in advance.
Coly Li
---
Alexandru Ardelean (1):
bcache: use sysfs_match_string() instead of __sysfs_match_string()
Coly Li (14):
Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
bcache: avoid flushing btree node in cache_set_flush() if io disabled
bcache: ignore read-ahead request failure on backing device
bcache: add io error counting in write_bdev_super_endio()
bcache: remove "XXX:" comment line from run_cache_set()
bcache: remove unnecessary prefetch() in bset_search_tree()
bcache: add return value check to bch_cached_dev_run()
bcache: remove unncessary code in bch_btree_keys_init()
bcache: check CACHE_SET_IO_DISABLE in allocator code
bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
bcache: more detailed error message to bcache_device_link()
bcache: add more error message in bch_cached_dev_attach()
bcache: improve error message in bch_cached_dev_run()
bcache: shrink btree node cache after bch_btree_check()
drivers/md/bcache/alloc.c | 9 ++++
drivers/md/bcache/bcache.h | 2 +-
drivers/md/bcache/bset.c | 31 ++++--------
drivers/md/bcache/btree.c | 4 ++
drivers/md/bcache/io.c | 12 +++++
drivers/md/bcache/journal.c | 4 ++
drivers/md/bcache/super.c | 113 +++++++++++++++++++++++++++++---------------
drivers/md/bcache/sysfs.c | 27 +++++------
8 files changed, 127 insertions(+), 75 deletions(-)
--
2.16.4
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
` (14 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li, stable
This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.
Although this patch helps the failed bcache device to stop faster when
too many I/O errors detected on corresponding cached device, setting
CACHE_SET_IO_DISABLE bit to cache set c->flags was not a good idea. This
operation will disable all I/Os on cache set, which means other attached
bcache devices won't work neither.
Without this patch, the failed bcache device can also be stopped
eventually if internal I/O accomplished (e.g. writeback). Therefore here
I revert it.
Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()")
Reported-by: Yong Li <mr.liyong@qq.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
drivers/md/bcache/super.c | 17 -----------------
1 file changed, 17 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1b63ac876169..eaaa046fd95d 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1437,8 +1437,6 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
bool bch_cached_dev_error(struct cached_dev *dc)
{
- struct cache_set *c;
-
if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags))
return false;
@@ -1449,21 +1447,6 @@ bool bch_cached_dev_error(struct cached_dev *dc)
pr_err("stop %s: too many IO errors on backing device %s\n",
dc->disk.disk->disk_name, dc->backing_dev_name);
- /*
- * If the cached device is still attached to a cache set,
- * even dc->io_disable is true and no more I/O requests
- * accepted, cache device internal I/O (writeback scan or
- * garbage collection) may still prevent bcache device from
- * being stopped. So here CACHE_SET_IO_DISABLE should be
- * set to c->flags too, to make the internal I/O to cache
- * device rejected and stopped immediately.
- * If c is NULL, that means the bcache device is not attached
- * to any cache set, then no CACHE_SET_IO_DISABLE bit to set.
- */
- c = dc->disk.c;
- if (c && test_and_set_bit(CACHE_SET_IO_DISABLE, &c->flags))
- pr_info("CACHE_SET_IO_DISABLE already set");
-
bcache_device_stop(&dc->disk);
return true;
}
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
` (13 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
When cache_set_flush() is called for too many I/O errors detected on
cache device and the cache set is retiring, inside the function it
doesn't make sense to flushing cached btree nodes from c->btree_cache
because CACHE_SET_IO_DISABLE is set on c->flags already and all I/Os
onto cache device will be rejected.
This patch checks in cache_set_flush() that whether CACHE_SET_IO_DISABLE
is set. If yes, then avoids to flush the cached btree nodes to reduce
more time and make cache set retiring more faster.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index eaaa046fd95d..da9d6a63b81a 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1553,13 +1553,17 @@ static void cache_set_flush(struct closure *cl)
if (!IS_ERR_OR_NULL(c->root))
list_add(&c->root->list, &c->btree_cache);
- /* Should skip this if we're unregistering because of an error */
- list_for_each_entry(b, &c->btree_cache, list) {
- mutex_lock(&b->write_lock);
- if (btree_node_dirty(b))
- __bch_btree_node_write(b, NULL);
- mutex_unlock(&b->write_lock);
- }
+ /*
+ * Avoid flushing cached nodes if cache set is retiring
+ * due to too many I/O errors detected.
+ */
+ if (!test_bit(CACHE_SET_IO_DISABLE, &c->flags))
+ list_for_each_entry(b, &c->btree_cache, list) {
+ mutex_lock(&b->write_lock);
+ if (btree_node_dirty(b))
+ __bch_btree_node_write(b, NULL);
+ mutex_unlock(&b->write_lock);
+ }
for_each_cache(ca, c, i)
if (ca->alloc_thread)
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 03/15] bcache: ignore read-ahead request failure on backing device
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
` (12 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li, stable
When md raid device (e.g. raid456) is used as backing device, read-ahead
requests on a degrading and recovering md raid device might be failured
immediately by md raid code, but indeed this md raid array can still be
read or write for normal I/O requests. Therefore such failed read-ahead
request are not real hardware failure. Further more, after degrading and
recovering accomplished, read-ahead requests will be handled by md raid
array again.
For such condition, I/O failures of read-ahead requests don't indicate
real health status (because normal I/O still be served), they should not
be counted into I/O error counter dc->io_errors.
Since there is no simple way to detect whether the backing divice is a
md raid device, this patch simply ignores I/O failures for read-ahead
bios on backing device, to avoid bogus backing device failure on a
degrading md raid array.
Suggested-and-tested-by: Thorsten Knabe <linux@thorsten-knabe.de>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
drivers/md/bcache/io.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index c25097968319..4d93f07f63e5 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -58,6 +58,18 @@ void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio)
WARN_ONCE(!dc, "NULL pointer of struct cached_dev");
+ /*
+ * Read-ahead requests on a degrading and recovering md raid
+ * (e.g. raid6) device might be failured immediately by md
+ * raid code, which is not a real hardware media failure. So
+ * we shouldn't count failed REQ_RAHEAD bio to dc->io_errors.
+ */
+ if (bio->bi_opf & REQ_RAHEAD) {
+ pr_warn_ratelimited("%s: Read-ahead I/O failed on backing device, ignore",
+ dc->backing_dev_name);
+ return;
+ }
+
errors = atomic_add_return(1, &dc->io_errors);
if (errors < dc->error_limit)
pr_err("%s: IO error on backing device, unrecoverable",
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (2 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
` (11 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
When backing device super block is written by bch_write_bdev_super(),
the bio complete callback write_bdev_super_endio() simply ignores I/O
status. Indeed such write request also contribute to backing device
health status if the request failed.
This patch checkes bio->bi_status in write_bdev_super_endio(), if there
is error, bch_count_backing_io_errors() will be called to count an I/O
error to dc->io_errors.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index da9d6a63b81a..877113b62b0f 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -197,7 +197,9 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
static void write_bdev_super_endio(struct bio *bio)
{
struct cached_dev *dc = bio->bi_private;
- /* XXX: error checking */
+
+ if (bio->bi_status)
+ bch_count_backing_io_errors(dc, bio);
closure_put(&dc->sb_write);
}
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (3 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
` (10 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
In previous bcache patches for Linux v5.2, the failure code path of
run_cache_set() is tested and fixed. So now the following comment
line can be removed from run_cache_set(),
/* XXX: test this, it's broken */
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 877113b62b0f..3364b20567eb 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1946,7 +1946,7 @@ static int run_cache_set(struct cache_set *c)
}
closure_sync(&cl);
- /* XXX: test this, it's broken */
+
bch_cache_set_error(c, "%s", err);
return -EIO;
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (4 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
` (9 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
In function bset_search_tree(), when p >= t->size, t->tree[0] will be
prefetched by the following code piece,
974 unsigned int p = n << 4;
975
976 p &= ((int) (p - t->size)) >> 31;
977
978 prefetch(&t->tree[p]);
The purpose of the above code is to avoid a branch instruction, but
when p >= t->size, prefetch(&t->tree[0]) has no positive performance
contribution at all. This patch avoids the unncessary prefetch by only
calling prefetch() when p < t->size.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/bset.c | 16 ++--------------
1 file changed, 2 insertions(+), 14 deletions(-)
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index 8f07fa6e1739..aa2e4ab0fab9 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -960,22 +960,10 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t,
unsigned int inorder, j, n = 1;
do {
- /*
- * A bit trick here.
- * If p < t->size, (int)(p - t->size) is a minus value and
- * the most significant bit is set, right shifting 31 bits
- * gets 1. If p >= t->size, the most significant bit is
- * not set, right shifting 31 bits gets 0.
- * So the following 2 lines equals to
- * if (p >= t->size)
- * p = 0;
- * but a branch instruction is avoided.
- */
unsigned int p = n << 4;
- p &= ((int) (p - t->size)) >> 31;
-
- prefetch(&t->tree[p]);
+ if (p < t->size)
+ prefetch(&t->tree[p]);
j = n;
f = &t->tree[j];
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (5 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
` (8 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Alexandru Ardelean, Coly Li
From: Alexandru Ardelean <alexandru.ardelean@analog.com>
The arrays (of strings) that are passed to __sysfs_match_string() are
static, so use sysfs_match_string() which does an implicit ARRAY_SIZE()
over these arrays.
Functionally, this doesn't change anything.
The change is more cosmetic.
It only shrinks the static arrays by 1 byte each.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/sysfs.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 6cd44d3cf906..3a520262933d 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -21,28 +21,24 @@ static const char * const bch_cache_modes[] = {
"writethrough",
"writeback",
"writearound",
- "none",
- NULL
+ "none"
};
/* Default is 0 ("auto") */
static const char * const bch_stop_on_failure_modes[] = {
"auto",
- "always",
- NULL
+ "always"
};
static const char * const cache_replacement_policies[] = {
"lru",
"fifo",
- "random",
- NULL
+ "random"
};
static const char * const error_actions[] = {
"unregister",
- "panic",
- NULL
+ "panic"
};
write_attribute(attach);
@@ -333,7 +329,7 @@ STORE(__cached_dev)
bch_cached_dev_run(dc);
if (attr == &sysfs_cache_mode) {
- v = __sysfs_match_string(bch_cache_modes, -1, buf);
+ v = sysfs_match_string(bch_cache_modes, buf);
if (v < 0)
return v;
@@ -344,7 +340,7 @@ STORE(__cached_dev)
}
if (attr == &sysfs_stop_when_cache_set_failed) {
- v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
+ v = sysfs_match_string(bch_stop_on_failure_modes, buf);
if (v < 0)
return v;
@@ -794,7 +790,7 @@ STORE(__bch_cache_set)
0, UINT_MAX);
if (attr == &sysfs_errors) {
- v = __sysfs_match_string(error_actions, -1, buf);
+ v = sysfs_match_string(error_actions, buf);
if (v < 0)
return v;
@@ -1058,7 +1054,7 @@ STORE(__bch_cache)
}
if (attr == &sysfs_cache_replacement_policy) {
- v = __sysfs_match_string(cache_replacement_policies, -1, buf);
+ v = sysfs_match_string(cache_replacement_policies, buf);
if (v < 0)
return v;
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 08/15] bcache: add return value check to bch_cached_dev_run()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (6 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
` (7 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
This patch adds return value check to bch_cached_dev_run(), now if there
is error happens inside bch_cached_dev_run(), it can be catched.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/bcache.h | 2 +-
drivers/md/bcache/super.c | 33 ++++++++++++++++++++++++++-------
drivers/md/bcache/sysfs.c | 7 +++++--
3 files changed, 32 insertions(+), 10 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index fdf75352e16a..73a97586a2ef 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -1006,7 +1006,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size);
int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
uint8_t *set_uuid);
void bch_cached_dev_detach(struct cached_dev *dc);
-void bch_cached_dev_run(struct cached_dev *dc);
+int bch_cached_dev_run(struct cached_dev *dc);
void bcache_device_stop(struct bcache_device *d);
void bch_cache_set_unregister(struct cache_set *c);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 3364b20567eb..74eb18b1af40 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -910,7 +910,7 @@ static int cached_dev_status_update(void *arg)
}
-void bch_cached_dev_run(struct cached_dev *dc)
+int bch_cached_dev_run(struct cached_dev *dc)
{
struct bcache_device *d = &dc->disk;
char *buf = kmemdup_nul(dc->sb.label, SB_LABEL_SIZE, GFP_KERNEL);
@@ -921,11 +921,14 @@ void bch_cached_dev_run(struct cached_dev *dc)
NULL,
};
+ if (dc->io_disable)
+ return -EIO;
+
if (atomic_xchg(&dc->running, 1)) {
kfree(env[1]);
kfree(env[2]);
kfree(buf);
- return;
+ return -EBUSY;
}
if (!d->c &&
@@ -951,8 +954,11 @@ void bch_cached_dev_run(struct cached_dev *dc)
kfree(buf);
if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") ||
- sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache"))
+ sysfs_create_link(&disk_to_dev(d->disk)->kobj,
+ &d->kobj, "bcache")) {
pr_debug("error creating sysfs link");
+ return -ENOMEM;
+ }
dc->status_update_thread = kthread_run(cached_dev_status_update,
dc, "bcache_status_update");
@@ -961,6 +967,8 @@ void bch_cached_dev_run(struct cached_dev *dc)
"continue to run without monitoring backing "
"device status");
}
+
+ return 0;
}
/*
@@ -1056,6 +1064,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
uint32_t rtime = cpu_to_le32((u32)ktime_get_real_seconds());
struct uuid_entry *u;
struct cached_dev *exist_dc, *t;
+ int ret = 0;
if ((set_uuid && memcmp(set_uuid, c->sb.set_uuid, 16)) ||
(!set_uuid && memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16)))
@@ -1165,7 +1174,12 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
bch_sectors_dirty_init(&dc->disk);
- bch_cached_dev_run(dc);
+ ret = bch_cached_dev_run(dc);
+ if (ret && (ret != -EBUSY)) {
+ up_write(&dc->writeback_lock);
+ return ret;
+ }
+
bcache_device_link(&dc->disk, c, "bdev");
atomic_inc(&c->attached_dev_nr);
@@ -1292,6 +1306,7 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page,
{
const char *err = "cannot allocate memory";
struct cache_set *c;
+ int ret = -ENOMEM;
bdevname(bdev, dc->backing_dev_name);
memcpy(&dc->sb, sb, sizeof(struct cache_sb));
@@ -1321,14 +1336,18 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page,
bch_cached_dev_attach(dc, c, NULL);
if (BDEV_STATE(&dc->sb) == BDEV_STATE_NONE ||
- BDEV_STATE(&dc->sb) == BDEV_STATE_STALE)
- bch_cached_dev_run(dc);
+ BDEV_STATE(&dc->sb) == BDEV_STATE_STALE) {
+ err = "failed to run cached device";
+ ret = bch_cached_dev_run(dc);
+ if (ret)
+ goto err;
+ }
return 0;
err:
pr_notice("error %s: %s", dc->backing_dev_name, err);
bcache_device_stop(&dc->disk);
- return -EIO;
+ return ret;
}
/* Flash only volumes */
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 3a520262933d..129031663cc8 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -325,8 +325,11 @@ STORE(__cached_dev)
bch_cache_accounting_clear(&dc->accounting);
if (attr == &sysfs_running &&
- strtoul_or_return(buf))
- bch_cached_dev_run(dc);
+ strtoul_or_return(buf)) {
+ v = bch_cached_dev_run(dc);
+ if (v)
+ return v;
+ }
if (attr == &sysfs_cache_mode) {
v = sysfs_match_string(bch_cache_modes, buf);
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (7 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
` (6 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
Function bch_btree_keys_init() initializes b->set[].size and
b->set[].data to zero. As the code comments indicates, these code indeed
is unncessary, because both struct btree_keys and struct bset_tree are
nested embedded into struct btree, when struct btree is filled with 0
bits by kzalloc() in mca_bucket_alloc(), b->set[].size and
b->set[].data are initialized to 0 (a.k.a NULL) already.
This patch removes the redundant code, and add comments in
bch_btree_keys_init() and mca_bucket_alloc() to explain why it's safe.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/bset.c | 15 ++++++---------
drivers/md/bcache/btree.c | 4 ++++
2 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index aa2e4ab0fab9..eedaf0f3e3f0 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -347,22 +347,19 @@ EXPORT_SYMBOL(bch_btree_keys_alloc);
void bch_btree_keys_init(struct btree_keys *b, const struct btree_keys_ops *ops,
bool *expensive_debug_checks)
{
- unsigned int i;
-
b->ops = ops;
b->expensive_debug_checks = expensive_debug_checks;
b->nsets = 0;
b->last_set_unwritten = 0;
- /* XXX: shouldn't be needed */
- for (i = 0; i < MAX_BSETS; i++)
- b->set[i].size = 0;
/*
- * Second loop starts at 1 because b->keys[0]->data is the memory we
- * allocated
+ * struct btree_keys in embedded in struct btree, and struct
+ * bset_tree is embedded into struct btree_keys. They are all
+ * initialized as 0 by kzalloc() in mca_bucket_alloc(), and
+ * b->set[0].data is allocated in bch_btree_keys_alloc(), so we
+ * don't have to initiate b->set[].size and b->set[].data here
+ * any more.
*/
- for (i = 1; i < MAX_BSETS; i++)
- b->set[i].data = NULL;
}
EXPORT_SYMBOL(bch_btree_keys_init);
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 773f5fdad25f..cf38a1b031fa 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -613,6 +613,10 @@ static void mca_data_alloc(struct btree *b, struct bkey *k, gfp_t gfp)
static struct btree *mca_bucket_alloc(struct cache_set *c,
struct bkey *k, gfp_t gfp)
{
+ /*
+ * kzalloc() is necessary here for initialization,
+ * see code comments in bch_btree_keys_init().
+ */
struct btree *b = kzalloc(sizeof(struct btree), gfp);
if (!b)
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (8 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
` (5 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O
errors, currently allocator routines can still continue allocate
space which may introduce inconsistent metadata state.
This patch checkes CACHE_SET_IO_DISABLE bit in following allocator
routines,
- bch_bucket_alloc()
- __bch_bucket_alloc_set()
Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines
may reject allocation request earlier to avoid potential inconsistent
metadata.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/alloc.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index f8986effcb50..6f776823b9ba 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -393,6 +393,11 @@ long bch_bucket_alloc(struct cache *ca, unsigned int reserve, bool wait)
struct bucket *b;
long r;
+
+ /* No allocation if CACHE_SET_IO_DISABLE bit is set */
+ if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &ca->set->flags)))
+ return -1;
+
/* fastpath */
if (fifo_pop(&ca->free[RESERVE_NONE], r) ||
fifo_pop(&ca->free[reserve], r))
@@ -484,6 +489,10 @@ int __bch_bucket_alloc_set(struct cache_set *c, unsigned int reserve,
{
int i;
+ /* No allocation if CACHE_SET_IO_DISABLE bit is set */
+ if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags)))
+ return -1;
+
lockdep_assert_held(&c->bucket_lock);
BUG_ON(!n || n > c->caches_loaded || n > MAX_CACHES_PER_SET);
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (9 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
` (4 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE
bit is set, bch_journal() may continue to work because the journaling
bkey might be still in write set yet. The caller of bch_journal() may
believe the journal still work but the truth is in-memory journal write
set won't be written into cache device any more. This behavior may
introduce potential inconsistent metadata status.
This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(),
if the bit is set, bch_journal() returns NULL immediately to notice
caller to know journal does not work.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/journal.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 12dae9348147..d4b9817f2237 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -811,6 +811,10 @@ atomic_t *bch_journal(struct cache_set *c,
struct journal_write *w;
atomic_t *ret;
+ /* No journaling if CACHE_SET_IO_DISABLE set already */
+ if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags)))
+ return NULL;
+
if (!CACHE_SYNC(&c->sb))
return NULL;
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 12/15] bcache: more detailed error message to bcache_device_link()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (10 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
` (3 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
This patch adds more accurate error message for specific
ssyfs_create_link() call, to help debugging failure during
bcache device start tup.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 74eb18b1af40..1fa3f4e26d02 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -693,6 +693,7 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c,
{
unsigned int i;
struct cache *ca;
+ int ret;
for_each_cache(ca, d->c, i)
bd_link_disk_holder(ca->bdev, d->disk);
@@ -700,9 +701,13 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c,
snprintf(d->name, BCACHEDEVNAME_SIZE,
"%s%u", name, d->id);
- WARN(sysfs_create_link(&d->kobj, &c->kobj, "cache") ||
- sysfs_create_link(&c->kobj, &d->kobj, d->name),
- "Couldn't create device <-> cache set symlinks");
+ ret = sysfs_create_link(&d->kobj, &c->kobj, "cache");
+ if (ret < 0)
+ pr_err("Couldn't create device -> cache set symlink");
+
+ ret = sysfs_create_link(&c->kobj, &d->kobj, d->name);
+ if (ret < 0)
+ pr_err("Couldn't create cache set -> device symlink");
clear_bit(BCACHE_DEV_UNLINK_DONE, &d->flags);
}
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (11 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
` (2 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
This patch adds more error message for attaching cached device, this is
helpful to debug code failure during bache device start up.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1fa3f4e26d02..cf5673af3143 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1169,6 +1169,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
down_write(&dc->writeback_lock);
if (bch_cached_dev_writeback_start(dc)) {
up_write(&dc->writeback_lock);
+ pr_err("Couldn't start writeback facilities for %s",
+ dc->disk.disk->disk_name);
return -ENOMEM;
}
@@ -1182,6 +1184,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
ret = bch_cached_dev_run(dc);
if (ret && (ret != -EBUSY)) {
up_write(&dc->writeback_lock);
+ pr_err("Couldn't run cached device %s",
+ dc->backing_dev_name);
return ret;
}
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (12 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
When cache set starts, bch_btree_check() will check all bkeys on cache
device by calculating the checksum. This operation will consume a huge
number of system memory if there are a lot of data cached. Since bcache
uses its own mca cache to maintain all its read-in btree nodes, and only
releases the cache space when system memory manage code starts to shrink
caches. There is will be a delay between bch_btree_check() returns and
the bcache shrink code gets called, so following memory allocatiion
might fail after bch_btree_check() finished. The most frequent one is
failure of creating allocator kernel thread.
This patch tries to proactively call bcache mca shrinker routine to
release around 25% cache memory, to help following memory allocation
to success. 'Around 25% cache memory' means when mca shrnker tries to
release cache memory, it might have to skip some busy memory objects,
so the result might be a few less than the expected amount.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index cf5673af3143..4a6406b53de1 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1866,6 +1866,24 @@ static int run_cache_set(struct cache_set *c)
if (bch_btree_check(c))
goto err;
+ /*
+ * bch_btree_check() may occupy too much system memory which
+ * will fail memory allocation operations in the following
+ * routines before kernel triggers memory shrinker call backs.
+ * Shrinking 25% mca cache memory proactively here to avoid
+ * potential memory allocation failure.
+ */
+ if (!c->shrinker_disabled) {
+ struct shrink_control sc;
+
+ sc.gfp_mask = GFP_KERNEL;
+ sc.nr_to_scan =
+ c->shrink.count_objects(&c->shrink, &sc) / 4;
+ pr_debug("try to shrink %lu (25%%) cached btree node",
+ sc.nr_to_scan);
+ c->shrink.scan_objects(&c->shrink, &sc);
+ }
+
bch_journal_mark(c, &journal);
bch_initial_gc_finish(c);
pr_debug("btree_check() done");
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 15/15] bcache: improve error message in bch_cached_dev_run()
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (13 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
@ 2019-06-04 15:16 ` Coly Li
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
15 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:16 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
This patch adds more error message in bch_cached_dev_run() to indicate
the exact reason why an error value is returned. Please notice when
printing out the "is running already" message, pr_info() is used here,
because in this case also -EBUSY is returned, the bcache device can
continue to attach to the cache devince and run, so it won't be an
error level message in kernel message.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/super.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4a6406b53de1..026e2df358c3 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -926,13 +926,18 @@ int bch_cached_dev_run(struct cached_dev *dc)
NULL,
};
- if (dc->io_disable)
+ if (dc->io_disable) {
+ pr_err("I/O disabled on cached dev %s",
+ dc->backing_dev_name);
return -EIO;
+ }
if (atomic_xchg(&dc->running, 1)) {
kfree(env[1]);
kfree(env[2]);
kfree(buf);
+ pr_info("cached dev %s is running already",
+ dc->backing_dev_name);
return -EBUSY;
}
@@ -961,7 +966,7 @@ int bch_cached_dev_run(struct cached_dev *dc)
if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") ||
sysfs_create_link(&disk_to_dev(d->disk)->kobj,
&d->kobj, "bcache")) {
- pr_debug("error creating sysfs link");
+ pr_err("Couldn't create bcache dev <-> disk sysfs symlinks");
return -ENOMEM;
}
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
` (14 preceding siblings ...)
2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
@ 2019-06-04 15:53 ` Coly Li
2019-06-04 15:53 ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
2019-06-04 15:53 ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li
15 siblings, 2 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li, stable
When people set a writeback percent via sysfs file,
/sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.
If there is no cache set attached to, the writebac kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().
This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).
This problem might be introduced from initial bcache commit, but
commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
changes part of the orignal code piece, so I add 'Fixes: 3fd47bfe55b0'
to indicate from which commit this patch can be applied.
Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
Reported-by: Bjørn Forsman <bjorn.forsman@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
drivers/md/bcache/sysfs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 129031663cc8..eb678e43ac00 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -430,8 +430,13 @@ STORE(bch_cached_dev)
bch_writeback_queue(dc);
}
+ /*
+ * Only set BCACHE_DEV_WB_RUNNING when cached device attached to
+ * a cache set, otherwise it doesn't make sense.
+ */
if (attr == &sysfs_writeback_percent)
- if (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
+ if ((dc->disk.c != NULL) &&
+ (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)))
schedule_delayed_work(&dc->writeback_rate_update,
dc->writeback_rate_update_seconds * HZ);
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 17/18] bcache: make bset_search_tree() be more understandable
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
@ 2019-06-04 15:53 ` Coly Li
2019-06-05 5:43 ` Christoph Hellwig
2019-06-04 15:53 ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li
1 sibling, 1 reply; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
The purpose of following code in bset_search_tree() is to avoid a branch
instruction,
994 if (likely(f->exponent != 127))
995 n = j * 2 + (((unsigned int)
996 (f->mantissa -
997 bfloat_mantissa(search, f))) >> 31);
998 else
999 n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
1000 ? j * 2
1001 : j * 2 + 1;
This piece of code is not very clear to understand, even when I tried to
add code comment for it, I made mistake. This patch removes the implict
bit operation and uses explicit branch to calculate next location in
binary tree search.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/bset.c | 17 +++--------------
1 file changed, 3 insertions(+), 14 deletions(-)
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index eedaf0f3e3f0..32e2e4d8fa6c 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -965,21 +965,10 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t,
j = n;
f = &t->tree[j];
- /*
- * Similar bit trick, use subtract operation to avoid a branch
- * instruction.
- *
- * n = (f->mantissa > bfloat_mantissa())
- * ? j * 2
- * : j * 2 + 1;
- *
- * We need to subtract 1 from f->mantissa for the sign bit trick
- * to work - that's done in make_bfloat()
- */
if (likely(f->exponent != 127))
- n = j * 2 + (((unsigned int)
- (f->mantissa -
- bfloat_mantissa(search, f))) >> 31);
+ n = (f->mantissa >= bfloat_mantissa(search, f))
+ ? j * 2
+ : j * 2 + 1;
else
n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
? j * 2
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 18/18] bcache: add code comments for journal_read_bucket()
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
2019-06-04 15:53 ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
@ 2019-06-04 15:53 ` Coly Li
1 sibling, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-04 15:53 UTC (permalink / raw)
To: linux-bcache; +Cc: linux-block, Coly Li
This patch adds more code comments in journal_read_bucket(), this is an
effort to make the code to be more understandable.
Signed-off-by: Coly Li <colyli@suse.de>
---
drivers/md/bcache/journal.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index d4b9817f2237..d3f2331fc559 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -100,6 +100,20 @@ reread: left = ca->sb.bucket_size - offset;
blocks = set_blocks(j, block_bytes(ca->set));
+ /*
+ * Nodes in 'list' are in linear increasing order of
+ * i->j.seq, the node on head has the smallest (oldest)
+ * journal seq, the node on tail has the biggest
+ * (latest) journal seq.
+ */
+
+ /*
+ * Check from the oldest jset for last_seq. If
+ * i->j.seq < j->last_seq, it means the oldest jset
+ * in list is expired and useless, remove it from
+ * this list. Otherwise, j is a condidate jset for
+ * further following checks.
+ */
while (!list_empty(list)) {
i = list_first_entry(list,
struct journal_replay, list);
@@ -109,13 +123,22 @@ reread: left = ca->sb.bucket_size - offset;
kfree(i);
}
+ /* iterate list in reverse order (from latest jset) */
list_for_each_entry_reverse(i, list, list) {
if (j->seq == i->j.seq)
goto next_set;
+ /*
+ * if j->seq is less than any i->j.last_seq
+ * in list, j is an expired and useless jset.
+ */
if (j->seq < i->j.last_seq)
goto next_set;
+ /*
+ * 'where' points to first jset in list which
+ * is elder then j.
+ */
if (j->seq > i->j.seq) {
where = &i->list;
goto add;
@@ -129,6 +152,7 @@ reread: left = ca->sb.bucket_size - offset;
if (!i)
return -ENOMEM;
memcpy(&i->j, j, bytes);
+ /* Add to the location after 'where' points to */
list_add(&i->list, where);
ret = 1;
--
2.16.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 17/18] bcache: make bset_search_tree() be more understandable
2019-06-04 15:53 ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
@ 2019-06-05 5:43 ` Christoph Hellwig
2019-06-05 5:45 ` Coly Li
0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2019-06-05 5:43 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache, linux-block
> - n = j * 2 + (((unsigned int)
> - (f->mantissa -
> - bfloat_mantissa(search, f))) >> 31);
> + n = (f->mantissa >= bfloat_mantissa(search, f))
> + ? j * 2
> + : j * 2 + 1;
If you really want to make it more readable a good old if else would
help a lot.
> else
> n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
> ? j * 2
Same here.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 17/18] bcache: make bset_search_tree() be more understandable
2019-06-05 5:43 ` Christoph Hellwig
@ 2019-06-05 5:45 ` Coly Li
0 siblings, 0 replies; 21+ messages in thread
From: Coly Li @ 2019-06-05 5:45 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-bcache, linux-block
On 2019/6/5 1:43 下午, Christoph Hellwig wrote:
>> - n = j * 2 + (((unsigned int)
>> - (f->mantissa -
>> - bfloat_mantissa(search, f))) >> 31);
>> + n = (f->mantissa >= bfloat_mantissa(search, f))
>> + ? j * 2
>> + : j * 2 + 1;
>
> If you really want to make it more readable a good old if else would
> help a lot.
>
>> else
>> n = (bkey_cmp(tree_to_bkey(t, j), search) > 0)
>> ? j * 2
>
> Same here.
>
Hi Christoph,
Thanks for the hint, will handle it soon.
--
Coly Li
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2019-06-05 5:45 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 15:16 [PATCH 00/15] bcache fixes before Linux v5.3 Coly Li
2019-06-04 15:16 ` [PATCH 01/15] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
2019-06-04 15:16 ` [PATCH 02/15] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
2019-06-04 15:16 ` [PATCH 03/15] bcache: ignore read-ahead request failure on backing device Coly Li
2019-06-04 15:16 ` [PATCH 04/15] bcache: add io error counting in write_bdev_super_endio() Coly Li
2019-06-04 15:16 ` [PATCH 05/15] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
2019-06-04 15:16 ` [PATCH 06/15] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
2019-06-04 15:16 ` [PATCH 07/15] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
2019-06-04 15:16 ` [PATCH 08/15] bcache: add return value check to bch_cached_dev_run() Coly Li
2019-06-04 15:16 ` [PATCH 09/15] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
2019-06-04 15:16 ` [PATCH 10/15] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
2019-06-04 15:16 ` [PATCH 11/15] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
2019-06-04 15:16 ` [PATCH 12/15] bcache: more detailed error message to bcache_device_link() Coly Li
2019-06-04 15:16 ` [PATCH 13/15] bcache: add more error message in bch_cached_dev_attach() Coly Li
2019-06-04 15:16 ` [PATCH 14/15] bcache: shrink btree node cache after bch_btree_check() Coly Li
2019-06-04 15:16 ` [PATCH 15/15] bcache: improve error message in bch_cached_dev_run() Coly Li
2019-06-04 15:53 ` [PATCH 16/18] bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached Coly Li
2019-06-04 15:53 ` [PATCH 17/18] bcache: make bset_search_tree() be more understandable Coly Li
2019-06-05 5:43 ` Christoph Hellwig
2019-06-05 5:45 ` Coly Li
2019-06-04 15:53 ` [PATCH 18/18] bcache: add code comments for journal_read_bucket() Coly Li
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.