All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/13] bcache: fixes and update for 4.14
@ 2017-09-06  6:25 Coly Li
  2017-09-06  6:25 ` [PATCH 01/12] bcache: Fix leak of bdev reference Coly Li
                   ` (12 more replies)
  0 siblings, 13 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Coly Li

Hi Jens,

Here are 12 patches for bcache fixes and updates; most of them were posted
by Eric Wheeler in the 4.13 merge window, but were delayed due to lack of
code review.

The following patches have been reviewed or acked by peer developers:
	0001-bcache-Fix-leak-of-bdev-reference.patch
	0002-bcache-fix-sequential-large-write-IO-bypass.patch
	0003-bcache-do-not-subtract-sectors_to_gc-for-bypassed-IO.patch
	0004-bcache-Don-t-reinvent-the-wheel-but-use-existing-lli.patch
	0005-bcache-gc-does-not-work-when-triggering-by-manual-co.patch
	0006-bcache-correct-cache_dirty_target-in-__update_writeb.patch
	0007-bcache-Correct-return-value-for-sysfs-attach-errors.patch
	0008-bcache-increase-the-number-of-open-buckets.patch
	0009-bcache-fix-for-gc-and-write-back-race.patch
	0010-bcache-silence-static-checker-warning.patch
	0011-bcache-Update-continue_at-documentation.patch
	0012-bcache-fix-bch_hprint-crash-and-improve-output.patch
You may consider picking them up if no other comments come up, especially
0012-bcache-fix-bch_hprint-crash-and-improve-output.patch, which fixes a
potentially serious security issue as the author declares.

I did basic testing on the whole patch set; bcache works and behaves as
expected. For further bcache bug reports, I will follow up on them.

There are some other patches which do not pass my tests currently; once
they are fixed and reviewed by peer developers I will send them to you
for the 4.14 merge window.

Thanks in advance.

Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/12] bcache: Fix leak of bdev reference
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25 ` [PATCH 02/12] bcache: fix sequential large write IO bypass Coly Li
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Jan Kara, stable

From: Jan Kara <jack@suse.cz>

If blkdev_get_by_path() in register_bcache() fails, we try to look up the
block device using lookup_bdev() to detect which situation we are in so we
can properly report the error. However we never drop the reference returned
to us from lookup_bdev(). Fix that.

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/super.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 8352fad765f6..9a2c190745b6 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1964,6 +1964,8 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 			else
 				err = "device busy";
 			mutex_unlock(&bch_register_lock);
+			if (!IS_ERR(bdev))
+				bdput(bdev);
 			if (attr == &ksysfs_register_quiet)
 				goto out;
 		}
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/12] bcache: fix sequential large write IO bypass
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
  2017-09-06  6:25 ` [PATCH 01/12] bcache: Fix leak of bdev reference Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25 ` [PATCH 03/12] bcache: do not subtract sectors_to_gc for bypassed IO Coly Li
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui, stable

From: Tang Junhui <tang.junhui@zte.com.cn>

Sequential write IOs were tested with bs=1M by fio in writeback cache
mode; these IOs were expected to be bypassed, but actually they were not.
We debugged the code and found this in check_should_bypass():
    if (!congested &&
        mode == CACHE_MODE_WRITEBACK &&
        op_is_write(bio_op(bio)) &&
        (bio->bi_opf & REQ_SYNC))
        goto rescale
That means, in writeback mode, a write IO with the REQ_SYNC flag will not
be bypassed even though it is a sequential large IO. That is not the
correct thing to do, so this patch removes this code.

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/request.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 019b3df9f1c6..958072a11347 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -400,12 +400,6 @@ static bool check_should_bypass(struct cached_dev *dc, struct bio *bio)
 	if (!congested && !dc->sequential_cutoff)
 		goto rescale;
 
-	if (!congested &&
-	    mode == CACHE_MODE_WRITEBACK &&
-	    op_is_write(bio->bi_opf) &&
-	    op_is_sync(bio->bi_opf))
-		goto rescale;
-
 	spin_lock(&dc->io_lock);
 
 	hlist_for_each_entry(i, iohash(dc, bio->bi_iter.bi_sector), hash)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/12] bcache: do not subtract sectors_to_gc for bypassed IO
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
  2017-09-06  6:25 ` [PATCH 01/12] bcache: Fix leak of bdev reference Coly Li
  2017-09-06  6:25 ` [PATCH 02/12] bcache: fix sequential large write IO bypass Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25 ` [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API Coly Li
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui, stable

From: Tang Junhui <tang.junhui@zte.com.cn>

Since bypassed IOs use no bucket, do not subtract sectors_to_gc for them
when deciding whether to trigger the gc thread.

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/request.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 958072a11347..4b413db99276 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -196,12 +196,12 @@ static void bch_data_insert_start(struct closure *cl)
 	struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
 	struct bio *bio = op->bio, *n;
 
-	if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0)
-		wake_up_gc(op->c);
-
 	if (op->bypass)
 		return bch_data_invalidate(cl);
 
+	if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0)
+		wake_up_gc(op->c);
+
 	/*
 	 * Journal writes are marked REQ_PREFLUSH; if the original write was a
 	 * flush, it'll wait on the journal write.
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (2 preceding siblings ...)
  2017-09-06  6:25 ` [PATCH 03/12] bcache: do not subtract sectors_to_gc for bypassed IO Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-26  4:38   ` Michael Lyle
  2017-09-06  6:25 ` [PATCH 05/12] bcache: gc does not work when triggering by manual command Coly Li
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Byungchul Park

From: Byungchul Park <byungchul.park@lge.com>

Although llist provides proper APIs, closure.c does not use them. Switch to
the existing llist APIs instead of open-coding the list handling.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Acked-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/closure.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
index 864e673aec39..7d5286b05036 100644
--- a/drivers/md/bcache/closure.c
+++ b/drivers/md/bcache/closure.c
@@ -70,21 +70,10 @@ void __closure_wake_up(struct closure_waitlist *wait_list)
 	list = llist_del_all(&wait_list->list);
 
 	/* We first reverse the list to preserve FIFO ordering and fairness */
-
-	while (list) {
-		struct llist_node *t = list;
-		list = llist_next(list);
-
-		t->next = reverse;
-		reverse = t;
-	}
+	reverse = llist_reverse_order(list);
 
 	/* Then do the wakeups */
-
-	while (reverse) {
-		cl = container_of(reverse, struct closure, list);
-		reverse = llist_next(reverse);
-
+	llist_for_each_entry(cl, reverse, list) {
 		closure_set_waiting(cl, 0);
 		closure_sub(cl, CLOSURE_WAITING + 1);
 	}
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/12] bcache: gc does not work when triggering by manual command
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (3 preceding siblings ...)
  2017-09-06  6:25 ` [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25 ` [PATCH 06/12] bcache: correct cache_dirty_target in __update_writeback_rate() Coly Li
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

I tried to execute the following command to trigger the gc thread:
[root@localhost internal]# echo 1 > trigger_gc
But it did not work. Debugging the code in gc_should_run() shows that gc
runs only when invalidating or when sectors_to_gc < 0. So set sectors_to_gc
to -1 to meet the condition when we trigger gc by a manual command.
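
For reference, the condition in gc_should_run() that this relies on is
roughly the following (paraphrased from drivers/md/bcache/btree.c, not
verbatim):

	if (ca->invalidate_needs_gc)
		return true;

	if (atomic_read(&c->sectors_to_gc) < 0)
		return true;

	return false;

Writing -1 to sectors_to_gc therefore satisfies the second test the next
time the condition is evaluated.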

(Code comments added by Coly Li)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/sysfs.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index f90f13616980..09295c07ea88 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -615,8 +615,21 @@ STORE(__bch_cache_set)
 		bch_cache_accounting_clear(&c->accounting);
 	}
 
-	if (attr == &sysfs_trigger_gc)
+	if (attr == &sysfs_trigger_gc) {
+		/*
+		 * Garbage collection thread only works when sectors_to_gc < 0.
+		 * When users write to the sysfs entry trigger_gc, most of the
+		 * time they want to forcibly trigger garbage collection. Here
+		 * -1 is set to c->sectors_to_gc, to make gc_should_run() give
+		 * the gc thread a chance to run. "Give a chance" means that,
+		 * before gc_should_run() is evaluated, there is still a chance
+		 * that c->sectors_to_gc gets set to another positive value, so
+		 * writing to the sysfs entry trigger_gc won't always make sure
+		 * the gc thread takes effect.
+		 */
+		atomic_set(&c->sectors_to_gc, -1);
 		wake_up_gc(c);
+	}
 
 	if (attr == &sysfs_prune_cache) {
 		struct shrink_control sc;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/12] bcache: correct cache_dirty_target in __update_writeback_rate()
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (4 preceding siblings ...)
  2017-09-06  6:25 ` [PATCH 05/12] bcache: gc does not work when triggering by manual command Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25 ` [PATCH 07/12] bcache: Correct return value for sysfs attach errors Coly Li
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui, stable

From: Tang Junhui <tang.junhui@zte.com.cn>

__update_writeback_rate() uses a proportional-derivative (PD) controller
algorithm to control the writeback rate. A dirty target number is used in
this PD controller to control the writeback rate: a larger target number
makes the writeback rate smaller and, conversely, a smaller target number
makes the writeback rate larger.

bcache uses the following steps to calculate the target number:
1) cache_sectors = all-buckets-of-cache-set * bucket-size
2) cache_dirty_target = cache_sectors * cached-device-writeback_percent
3) target = cache_dirty_target *
(sectors-of-cached-device/sectors-of-all-cached-devices-of-this-cache-set)

The calculation of cache_sectors at step 1) is incorrect: it does not
exclude dirty blocks occupied by flash only volumes.

A flash only volume can be taken as a bcache device without a cached
device. All data sectors allocated for it are persistent on the cache
device and marked dirty; they are not touched by the bcache writeback and
garbage collection code. So data blocks of flash only volumes should be
ignored when calculating cache_sectors of the cache set.

The current code does not subtract dirty sectors of flash only volumes,
which results in a larger target number from the above 3 steps. As a
consequence the cached devices' writeback rate is smaller than the correct
value, and writeback is slower on all cached devices.

This patch fixes the incorrectly slow writeback rate by subtracting dirty
sectors of flash only volumes in __update_writeback_rate().
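
A rough worked example with made-up numbers (mine, not from the patch),
assuming a single cached device holds all the cached data:

	cache_sectors (all buckets)        = 1000 GB
	dirty data on flash only volumes   =  200 GB
	writeback_percent                  =   10
	old target = 1000 GB * 10%         =  100 GB
	new target = (1000 - 200) GB * 10% =   80 GB

With the same amount of dirty data, the smaller target makes the PD
controller see a larger error term, so the writeback rate increases
instead of staying incorrectly low.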

(Commit log composed by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/writeback.c |  3 ++-
 drivers/md/bcache/writeback.h | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 42c66e76f05e..b533c2292ba5 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -21,7 +21,8 @@
 static void __update_writeback_rate(struct cached_dev *dc)
 {
 	struct cache_set *c = dc->disk.c;
-	uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size;
+	uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
+				bcache_flash_devs_sectors_dirty(c);
 	uint64_t cache_dirty_target =
 		div_u64(cache_sectors * dc->writeback_percent, 100);
 
diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index 629bd1a502fd..8789b9c8c484 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -14,6 +14,25 @@ static inline uint64_t bcache_dev_sectors_dirty(struct bcache_device *d)
 	return ret;
 }
 
+static inline uint64_t bcache_flash_devs_sectors_dirty(struct cache_set *c)
+{
+	uint64_t i, ret = 0;
+
+	mutex_lock(&bch_register_lock);
+
+	for (i = 0; i < c->nr_uuids; i++) {
+		struct bcache_device *d = c->devices[i];
+
+		if (!d || !UUID_FLASH_ONLY(&c->uuids[i]))
+			continue;
+		ret += bcache_dev_sectors_dirty(d);
+	}
+
+	mutex_unlock(&bch_register_lock);
+
+	return ret;
+}
+
 static inline unsigned offset_to_stripe(struct bcache_device *d,
 					uint64_t offset)
 {
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/12] bcache: Correct return value for sysfs attach errors
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (5 preceding siblings ...)
  2017-09-06  6:25 ` [PATCH 06/12] bcache: correct cache_dirty_target in __update_writeback_rate() Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:25   ` Coly Li
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tony Asleson, stable

From: Tony Asleson <tasleson@redhat.com>

If you encounter any error in bch_cached_dev_attach() it will return
a negative error code.  The variable 'v' which stores the result is
unsigned, so user space sees a very large value returned for bytes
written, which can cause incorrect user space behavior.  Use a single
signed variable throughout the function to preserve the ability to
return an error.
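
Illustrative only (my example, not code from the patch): returning an
errno through the old unsigned variable turns it into a huge positive
byte count, so the sysfs write looks successful to user space:

	unsigned v = -EINVAL;	/* v == 4294967274 as a 32-bit unsigned */
	return v;		/* store "succeeds" with a huge count   */
				/* instead of failing with -22          */

With a signed ssize_t, the negative value reaches user space as an error,
as intended.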

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Acked-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/sysfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 09295c07ea88..104c57cd666c 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -192,7 +192,7 @@ STORE(__cached_dev)
 {
 	struct cached_dev *dc = container_of(kobj, struct cached_dev,
 					     disk.kobj);
-	unsigned v = size;
+	ssize_t v = size;
 	struct cache_set *c;
 	struct kobj_uevent_env *env;
 
@@ -227,7 +227,7 @@ STORE(__cached_dev)
 		bch_cached_dev_run(dc);
 
 	if (attr == &sysfs_cache_mode) {
-		ssize_t v = bch_read_string_list(buf, bch_cache_modes + 1);
+		v = bch_read_string_list(buf, bch_cache_modes + 1);
 
 		if (v < 0)
 			return v;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/12] bcache: increase the number of open buckets
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
@ 2017-09-06  6:25   ` Coly Li
  2017-09-06  6:25 ` [PATCH 02/12] bcache: fix sequential large write IO bypass Coly Li
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

Currently we only allocate 6 open buckets for each cache set, but
usually we attach about 10 or so backing devices to each cache set, and
each bcache device is accessed by about 10 or so threads in the top
application layer. So 6 open buckets are too few: it leads to writes
from the same thread going to different buckets, which causes
inefficient write-back, makes bucket usage inefficient, and makes the
open buckets very easy to run out of.

I added a debug message in bch_open_buckets_alloc() to print bucket
allocation info, and tested with ten bcache devices on one cache set,
each bcache device accessed by ten threads.

From the debug messages we can see that, after the modification, one
bucket is more likely to be assigned to the same thread, and data from
the same thread is more likely to be written to the same bucket. Usually
the same thread always writes/reads the same backing device, so this is
good for write-back and also improves the usage efficiency of buckets.

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/alloc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index ca4abe1ccd8d..cacbe2dbd5c3 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -68,6 +68,8 @@
 #include <linux/random.h>
 #include <trace/events/bcache.h>
 
+#define MAX_OPEN_BUCKETS 128
+
 /* Bucket heap / gen */
 
 uint8_t bch_inc_gen(struct cache *ca, struct bucket *b)
@@ -671,7 +673,7 @@ int bch_open_buckets_alloc(struct cache_set *c)
 
 	spin_lock_init(&c->data_bucket_lock);
 
-	for (i = 0; i < 6; i++) {
+	for (i = 0; i < MAX_OPEN_BUCKETS; i++) {
 		struct open_bucket *b = kzalloc(sizeof(*b), GFP_KERNEL);
 		if (!b)
 			return -ENOMEM;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/12] bcache: fix for gc and write-back race
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (7 preceding siblings ...)
  2017-09-06  6:25   ` Coly Li
@ 2017-09-06  6:25 ` Coly Li
  2017-09-06  6:26 ` [PATCH 10/12] bcache: silence static checker warning Coly Li
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:25 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Tang Junhui, stable

From: Tang Junhui <tang.junhui@zte.com.cn>

gc and write-back race with each other (see the email "bcache get stucked"
I sent before):
gc thread                               write-back thread
|                                       |bch_writeback_thread()
|bch_gc_thread()                        |
|                                       |==>read_dirty()
|==>bch_btree_gc()                      |
|==>btree_root() //get btree root       |
|                //node write lock      |
|==>bch_btree_gc_root()                 |
|                                       |==>read_dirty_submit()
|                                       |==>write_dirty()
|                                       |==>continue_at(cl,
|                                       |               write_dirty_finish,
|                                       |               system_wq);
|                                       |==>write_dirty_finish()//execute
|                                       |               //in system_wq
|                                       |==>bch_btree_insert()
|                                       |==>bch_btree_map_leaf_nodes()
|                                       |==>__bch_btree_map_nodes()
|                                       |==>btree_root //try to get btree
|                                       |              //root node read
|                                       |              //lock
|                                       |-----stuck here
|==>bch_btree_set_root()
|==>bch_journal_meta()
|==>bch_journal()
|==>journal_try_write()
|==>journal_write_unlocked() //journal_full(&c->journal)
|                            //condition satisfied
|==>continue_at(cl, journal_write, system_wq); //try to execute
|                               //journal_write in system_wq
|                               //but work queue is executing
|                               //write_dirty_finish()
|==>closure_sync(); //wait journal_write execute
|                   //over and wake up gc,
|-------------stuck here
|==>release root node write lock

This patch allocates a separate workqueue for the write-back thread to
avoid such a race.

(Commit log re-organized by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Acked-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/bcache.h    | 1 +
 drivers/md/bcache/super.c     | 2 ++
 drivers/md/bcache/writeback.c | 9 +++++++--
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index dee542fff68e..2ed9bd231d84 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -333,6 +333,7 @@ struct cached_dev {
 	/* Limit number of writeback bios in flight */
 	struct semaphore	in_flight;
 	struct task_struct	*writeback_thread;
+	struct workqueue_struct	*writeback_write_wq;
 
 	struct keybuf		writeback_keys;
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 9a2c190745b6..253918972335 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1059,6 +1059,8 @@ static void cached_dev_free(struct closure *cl)
 	cancel_delayed_work_sync(&dc->writeback_rate_update);
 	if (!IS_ERR_OR_NULL(dc->writeback_thread))
 		kthread_stop(dc->writeback_thread);
+	if (dc->writeback_write_wq)
+		destroy_workqueue(dc->writeback_write_wq);
 
 	mutex_lock(&bch_register_lock);
 
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index b533c2292ba5..323551f7cb28 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -187,7 +187,7 @@ static void write_dirty(struct closure *cl)
 
 	closure_bio_submit(&io->bio, cl);
 
-	continue_at(cl, write_dirty_finish, system_wq);
+	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
 static void read_dirty_endio(struct bio *bio)
@@ -207,7 +207,7 @@ static void read_dirty_submit(struct closure *cl)
 
 	closure_bio_submit(&io->bio, cl);
 
-	continue_at(cl, write_dirty, system_wq);
+	continue_at(cl, write_dirty, io->dc->writeback_write_wq);
 }
 
 static void read_dirty(struct cached_dev *dc)
@@ -517,6 +517,11 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
 
 int bch_cached_dev_writeback_start(struct cached_dev *dc)
 {
+	dc->writeback_write_wq = alloc_workqueue("bcache_writeback_wq",
+						WQ_MEM_RECLAIM, 0);
+	if (!dc->writeback_write_wq)
+		return -ENOMEM;
+
 	dc->writeback_thread = kthread_create(bch_writeback_thread, dc,
 					      "bcache_writeback");
 	if (IS_ERR(dc->writeback_thread))
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/12] bcache: silence static checker warning
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (8 preceding siblings ...)
  2017-09-06  6:25 ` [PATCH 09/12] bcache: fix for gc and write-back race Coly Li
@ 2017-09-06  6:26 ` Coly Li
  2017-09-06  6:26 ` [PATCH 11/12] bcache: Update continue_at() documentation Coly Li
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:26 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Dan Carpenter

From: Dan Carpenter <dan.carpenter@oracle.com>

In olden times, closure_return() used to have a hidden return built in.
We removed the hidden return but forgot to add a new return here.  If
"c" were NULL we would oops on the next line, but fortunately "c" is
never NULL.  Let's just remove the if statement.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 253918972335..c5deec621fdd 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1376,9 +1376,6 @@ static void cache_set_flush(struct closure *cl)
 	struct btree *b;
 	unsigned i;
 
-	if (!c)
-		closure_return(cl);
-
 	bch_cache_accounting_destroy(&c->accounting);
 
 	kobject_put(&c->internal);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 11/12] bcache: Update continue_at() documentation
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (9 preceding siblings ...)
  2017-09-06  6:26 ` [PATCH 10/12] bcache: silence static checker warning Coly Li
@ 2017-09-06  6:26 ` Coly Li
  2017-09-06  6:26 ` [PATCH 12/12] bcache: fix bch_hprint crash and improve output Coly Li
  2017-09-06 14:20 ` [PATCH 00/13] bcache: fixes and update for 4.14 Jens Axboe
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:26 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Dan Carpenter

From: Dan Carpenter <dan.carpenter@oracle.com>

continue_at() doesn't have a return statement anymore.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/bcache/closure.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h
index 1ec84ca81146..295b7e43f92c 100644
--- a/drivers/md/bcache/closure.h
+++ b/drivers/md/bcache/closure.h
@@ -312,8 +312,6 @@ static inline void closure_wake_up(struct closure_waitlist *list)
  * been dropped with closure_put()), it will resume execution at @fn running out
  * of @wq (or, if @wq is NULL, @fn will be called by closure_put() directly).
  *
- * NOTE: This macro expands to a return in the calling function!
- *
  * This is because after calling continue_at() you no longer have a ref on @cl,
  * and whatever @cl owns may be freed out from under you - a running closure fn
  * has a ref on its own closure which continue_at() drops.
@@ -340,8 +338,6 @@ do {									\
  * Causes @fn to be executed out of @cl, in @wq context (or called directly if
  * @wq is NULL).
  *
- * NOTE: like continue_at(), this macro expands to a return in the caller!
- *
  * The ref the caller of continue_at_nobarrier() had on @cl is now owned by @fn,
  * thus it's not safe to touch anything protected by @cl after a
  * continue_at_nobarrier().
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 12/12] bcache: fix bch_hprint crash and improve output
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (10 preceding siblings ...)
  2017-09-06  6:26 ` [PATCH 11/12] bcache: Update continue_at() documentation Coly Li
@ 2017-09-06  6:26 ` Coly Li
  2017-09-06 14:20 ` [PATCH 00/13] bcache: fixes and update for 4.14 Jens Axboe
  12 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06  6:26 UTC (permalink / raw)
  To: linux-bcache, linux-block, axboe; +Cc: bcache, Michael Lyle, stable

From: Michael Lyle <mlyle@lyle.org>

Most importantly, solve a crash where %llu was used to format signed
numbers.  This would cause a buffer overflow when reading sysfs
writeback_rate_debug, as only 20 bytes were allocated for this and
%llu writes 20 characters plus a null.

Always use the units mechanism rather than having different output
paths for simplicity.

Also, correct problems with display output where 1.10 was a larger
number than 1.09, by multiplying by 10 and then dividing by 1024 instead
of dividing by 100.  (Remainders of >= 1000 would print as .10).

Minor changes: Always display the decimal point instead of trying to
omit it based on number of digits shown.  Decide what units to use
based on 1000 as a threshold, not 1024 (in other words, always print
at most 3 digits before the decimal point).
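
For illustration (these examples are mine, not part of the patch), the
reworked bch_hprint() below would format values like this:

	 1536     ->  "1.5k"    (1536 >> 10 == 1, remainder 512, 512 * 10 / 1024 == 5)
	-1536     ->  "-1.5k"
	 1048576  ->  "1.0M"    (shifted twice because the intermediate 1024 >= 1000)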

Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reported-by: Dmitry Yu Okunev <dyokunev@ut.mephi.ru>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/util.c | 50 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 35 insertions(+), 15 deletions(-)

diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index 8c3a938f4bf0..176d3c2ef5f5 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -74,24 +74,44 @@ STRTO_H(strtouint, unsigned int)
 STRTO_H(strtoll, long long)
 STRTO_H(strtoull, unsigned long long)
 
+/**
+ * bch_hprint() - formats @v to human readable string for sysfs.
+ *
+ * @v - signed 64 bit integer
+ * @buf - the (at least 8 byte) buffer to format the result into.
+ *
+ * Returns the number of bytes used by format.
+ */
 ssize_t bch_hprint(char *buf, int64_t v)
 {
 	static const char units[] = "?kMGTPEZY";
-	char dec[4] = "";
-	int u, t = 0;
-
-	for (u = 0; v >= 1024 || v <= -1024; u++) {
-		t = v & ~(~0 << 10);
-		v >>= 10;
-	}
-
-	if (!u)
-		return sprintf(buf, "%llu", v);
-
-	if (v < 100 && v > -100)
-		snprintf(dec, sizeof(dec), ".%i", t / 100);
-
-	return sprintf(buf, "%lli%s%c", v, dec, units[u]);
+	int u = 0, t;
+
+	uint64_t q;
+
+	if (v < 0)
+		q = -v;
+	else
+		q = v;
+
+	/* For as long as the number is more than 3 digits, but at least
+	 * once, shift right / divide by 1024.  Keep the remainder for
+	 * a digit after the decimal point.
+	 */
+	do {
+		u++;
+
+		t = q & ~(~0 << 10);
+		q >>= 10;
+	} while (q >= 1000);
+
+	if (v < 0)
+		/* '-', up to 3 digits, '.', 1 digit, 1 character, null;
+		 * yields 8 bytes.
+		 */
+		return sprintf(buf, "-%llu.%i%c", q, t * 10 / 1024, units[u]);
+	else
+		return sprintf(buf, "%llu.%i%c", q, t * 10 / 1024, units[u]);
 }
 
 ssize_t bch_snprint_string_list(char *buf, size_t size, const char * const list[],
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
                   ` (11 preceding siblings ...)
  2017-09-06  6:26 ` [PATCH 12/12] bcache: fix bch_hprint crash and improve output Coly Li
@ 2017-09-06 14:20 ` Jens Axboe
  2017-09-06 15:41   ` Coly Li
  12 siblings, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2017-09-06 14:20 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache, linux-block, bcache

On Wed, Sep 06 2017, Coly Li wrote:
> Hi Jens,
> 
> Here are 12 patchs for bcache fixes and updates, most of them were posted
> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
> review.

Next time, please send this _before_ the merge window opens. Not a huge
problem for this series, since it has been posted numerous times in the
past and already has good review coverage, but it really should have
been in linux-next for a week or two before heading upstream.

> There are some other patches which does not pass my test currently,
> once they are fixed and reviewed by peer developers I will send them
> to you for 4.14 merge window.

If they are not passing tests etc before the merge window opens, they
are not for this cycle.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06 14:20 ` [PATCH 00/13] bcache: fixes and update for 4.14 Jens Axboe
@ 2017-09-06 15:41   ` Coly Li
  2017-09-06 15:46     ` Jens Axboe
  0 siblings, 1 reply; 35+ messages in thread
From: Coly Li @ 2017-09-06 15:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-bcache, linux-block, bcache

On 2017/9/6 下午10:20, Jens Axboe wrote:
> On Wed, Sep 06 2017, Coly Li wrote:
>> Hi Jens,
>>
>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>> review.
> 
> Next time, please send this _before_ the merge window opens. Not a huge
> problem for this series, since it has been posted numerous times in the
> past and already has good review coverage, but it really should have
> been in linux-next for a week or two before heading upstream.
> 
Hi Jens,

Copied; I will send patches _before_ the merge window opens.

But could you please give me a hint: how can I find out when a specific
merge window will open? I searched LKML and saw people sending pull
requests for the 4.14 merge window, but I don't see any announcement of
the 4.14 merge window time slot.


>> There are some other patches which does not pass my test currently,
>> once they are fixed and reviewed by peer developers I will send them
>> to you for 4.14 merge window.
> 
> If they are not passing tests etc before the merge window opens, they
> are not for this cycle.

I see. There is one more patch that should be in 4.14. Its code is OK; the
original author asked me to help him compose a better commit log. I am
about to send this patch to you today or tomorrow.

Thanks for the hint.

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06 15:41   ` Coly Li
@ 2017-09-06 15:46     ` Jens Axboe
  2017-09-06 17:38       ` Coly Li
  0 siblings, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2017-09-06 15:46 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache, linux-block, bcache

On 09/06/2017 09:41 AM, Coly Li wrote:
> On 2017/9/6 下午10:20, Jens Axboe wrote:
>> On Wed, Sep 06 2017, Coly Li wrote:
>>> Hi Jens,
>>>
>>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>>> review.
>>
>> Next time, please send this _before_ the merge window opens. Not a huge
>> problem for this series, since it has been posted numerous times in the
>> past and already has good review coverage, but it really should have
>> been in linux-next for a week or two before heading upstream.
>>
> Hi Jens,
> 
> Copied, send patches _before_ merge window opens.
> 
> But could you please to give me a hint, how can I find when a specific
> merge window will open ? I search LKML, and see people send pull
> requests for 4.14 merge window, but I don't see any announcement for
> 4.14 merge window time slot.

The merge window opens when Linus releases the previous kernel. So you
have to try and keep track of when he expects to do that. That really
isn't that difficult - Linus always cuts a release on Sundays. We always
have 7 or 8 rc releases, so a good time to check is his mail on -rc6 or
-rc7 release, that usually gives a good indication of when he expects to
release the final.

To keep things simple, if you always have things ready when -rc7 is
released, then you are at least one week out from the merge window,
possibly two if we end up doing an -rc8. That means you don't have to
track anything, you know the exact date of when -rc7 is released since
Linus's schedule is usually reliable.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06 15:46     ` Jens Axboe
@ 2017-09-06 17:38       ` Coly Li
  2017-09-07 18:51         ` Eddie Chapman
  2017-09-07 19:01         ` Eddie Chapman
  0 siblings, 2 replies; 35+ messages in thread
From: Coly Li @ 2017-09-06 17:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-bcache, linux-block, bcache

On 2017/9/6 下午11:46, Jens Axboe wrote:
> On 09/06/2017 09:41 AM, Coly Li wrote:
>> On 2017/9/6 下午10:20, Jens Axboe wrote:
>>> On Wed, Sep 06 2017, Coly Li wrote:
>>>> Hi Jens,
>>>>
>>>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>>>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>>>> review.
>>>
>>> Next time, please send this _before_ the merge window opens. Not a huge
>>> problem for this series, since it has been posted numerous times in the
>>> past and already has good review coverage, but it really should have
>>> been in linux-next for a week or two before heading upstream.
>>>
>> Hi Jens,
>>
>> Copied, send patches _before_ merge window opens.
>>
>> But could you please to give me a hint, how can I find when a specific
>> merge window will open ? I search LKML, and see people send pull
>> requests for 4.14 merge window, but I don't see any announcement for
>> 4.14 merge window time slot.
> 
> The merge window opens when Linus releases the previous kernel. So you
> have to try and keep track of when he expects to do that. That really
> isn't that difficult - Linus always cuts a release on Sundays. We always
> have 7 or 8 rc releases, so a good time to check is his mail on -rc6 or
> -rc7 release, that usually gives a good indication of when he expects to
> release the final.
> 
> To keep things simple, if you always have things ready when -rc7 is
> released, then you are at least one week out from the merge window,
> possibly two if we end up doing an -rc8. That means you don't have to
> track anything, you know the exact date of when -rc7 is released since
> Linus's schedule is usually reliable.
> 

Hi Jens,

Copied, I will follow the above rule in the next cycle.

And I have just posted the last patch
(0013-bcache-initialize-stripe_sectors_dirty-correctly-for.patch) to you
for the 4.14 cycle.

This patch is an important fix for the bcache writeback rate calculation;
it is depended on by another patch I sent to you
(0006-bcache-correct-cache_dirty_target-in-__update_writeb.patch), so
please include it in 4.14.

Thanks in advance.

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06 17:38       ` Coly Li
@ 2017-09-07 18:51         ` Eddie Chapman
  2017-09-07 19:31           ` Jens Axboe
  2017-09-07 19:01         ` Eddie Chapman
  1 sibling, 1 reply; 35+ messages in thread
From: Eddie Chapman @ 2017-09-07 18:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Coly Li, linux-bcache, linux-block, bcache

On 06/09/17 18:38, Coly Li wrote:
> On 2017/9/6 下午11:46, Jens Axboe wrote:
>> On 09/06/2017 09:41 AM, Coly Li wrote:
>>> On 2017/9/6 下午10:20, Jens Axboe wrote:
>>>> On Wed, Sep 06 2017, Coly Li wrote:
>>>>> Hi Jens,
>>>>>
>>>>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>>>>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>>>>> review.
>>>>
>>>> Next time, please send this _before_ the merge window opens. Not a huge
>>>> problem for this series, since it has been posted numerous times in the
>>>> past and already has good review coverage, but it really should have
>>>> been in linux-next for a week or two before heading upstream.
>>>>
>>> Hi Jens,
>>>
>>> Copied, send patches _before_ merge window opens.
>>>
>>> But could you please to give me a hint, how can I find when a specific
>>> merge window will open ? I search LKML, and see people send pull
>>> requests for 4.14 merge window, but I don't see any announcement for
>>> 4.14 merge window time slot.
>>
>> The merge window opens when Linus releases the previous kernel. So you
>> have to try and keep track of when he expects to do that. That really
>> isn't that difficult - Linus always cuts a release on Sundays. We always
>> have 7 or 8 rc releases, so a good time to check is his mail on -rc6 or
>> -rc7 release, that usually gives a good indication of when he expects to
>> release the final.
>>
>> To keep things simple, if you always have things ready when -rc7 is
>> released, then you are at least one week out from the merge window,
>> possibly two if we end up doing an -rc8. That means you don't have to
>> track anything, you know the exact date of when -rc7 is released since
>> Linus's schedule is usually reliable.
>>
> 
> Hi Jens,
> 
> Copied, I will follow the above rule in next cycle.
> 
> And I just post the last patch
> (0013-bcache-initialize-stripe_sectors_dirty-correctly-for.patch) to you
> for 4.14 cycle.
> 
> This patch is an important fix for bcache writeback rate calculation, it
> is depended by another patch I sent to you
> (0006-bcache-correct-cache_dirty_target-in-__update_writeb.patch),
> please have it in 4.14.
> 
> Thanks in advance.

Hello Jens,

FWIW I'd like to support Coly by reporting that patches 0001, 0002, 0006 
and 0013 (the last one mentioned above) have been tested by me on 2 
production servers with a good deal of bcache activity for the last 6 
weeks without any issues having arisen.

But this may not be much use to you to know, as these are running 
vanilla kernel.org kernel 4.4.77.  Still, these 4 patches exactly as 
Coly has sent them to you apply without any changes to the code, so I 
guess it vouches for the code at least somewhat.

Eddie

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-06 17:38       ` Coly Li
  2017-09-07 18:51         ` Eddie Chapman
@ 2017-09-07 19:01         ` Eddie Chapman
  1 sibling, 0 replies; 35+ messages in thread
From: Eddie Chapman @ 2017-09-07 19:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Coly Li, linux-bcache, linux-block, bcache

On 06/09/17 18:38, Coly Li wrote:
> On 2017/9/6 下午11:46, Jens Axboe wrote:
>> On 09/06/2017 09:41 AM, Coly Li wrote:
>>> On 2017/9/6 下午10:20, Jens Axboe wrote:
>>>> On Wed, Sep 06 2017, Coly Li wrote:
>>>>> Hi Jens,
>>>>>
>>>>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>>>>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>>>>> review.
>>>>
>>>> Next time, please send this _before_ the merge window opens. Not a huge
>>>> problem for this series, since it has been posted numerous times in the
>>>> past and already has good review coverage, but it really should have
>>>> been in linux-next for a week or two before heading upstream.
>>>>
>>> Hi Jens,
>>>
>>> Copied, send patches _before_ merge window opens.
>>>
>>> But could you please to give me a hint, how can I find when a specific
>>> merge window will open ? I search LKML, and see people send pull
>>> requests for 4.14 merge window, but I don't see any announcement for
>>> 4.14 merge window time slot.
>>
>> The merge window opens when Linus releases the previous kernel. So you
>> have to try and keep track of when he expects to do that. That really
>> isn't that difficult - Linus always cuts a release on Sundays. We always
>> have 7 or 8 rc releases, so a good time to check is his mail on -rc6 or
>> -rc7 release, that usually gives a good indication of when he expects to
>> release the final.
>>
>> To keep things simple, if you always have things ready when -rc7 is
>> released, then you are at least one week out from the merge window,
>> possibly two if we end up doing an -rc8. That means you don't have to
>> track anything, you know the exact date of when -rc7 is released since
>> Linus's schedule is usually reliable.
>>
> 
> Hi Jens,
> 
> Copied, I will follow the above rule in next cycle.
> 
> And I just post the last patch
> (0013-bcache-initialize-stripe_sectors_dirty-correctly-for.patch) to you
> for 4.14 cycle.
> 
> This patch is an important fix for bcache writeback rate calculation, it
> is depended by another patch I sent to you
> (0006-bcache-correct-cache_dirty_target-in-__update_writeb.patch),
> please have it in 4.14.
> 
> Thanks in advance.

I'm sorry, correction to my last mail, the patches I have tested should 
read: 0002, 0003, 0006 and 0013.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/13] bcache: fixes and update for 4.14
  2017-09-07 18:51         ` Eddie Chapman
@ 2017-09-07 19:31           ` Jens Axboe
  0 siblings, 0 replies; 35+ messages in thread
From: Jens Axboe @ 2017-09-07 19:31 UTC (permalink / raw)
  To: Eddie Chapman; +Cc: Coly Li, linux-bcache, linux-block, bcache

On 09/07/2017 12:51 PM, Eddie Chapman wrote:
> On 06/09/17 18:38, Coly Li wrote:
>> On 2017/9/6 下午11:46, Jens Axboe wrote:
>>> On 09/06/2017 09:41 AM, Coly Li wrote:
>>>> On 2017/9/6 下午10:20, Jens Axboe wrote:
>>>>> On Wed, Sep 06 2017, Coly Li wrote:
>>>>>> Hi Jens,
>>>>>>
>>>>>> Here are 12 patchs for bcache fixes and updates, most of them were posted
>>>>>> by Eric Wheeler in 4.13 merge window, but delayed due to lack of code
>>>>>> review.
>>>>>
>>>>> Next time, please send this _before_ the merge window opens. Not a huge
>>>>> problem for this series, since it has been posted numerous times in the
>>>>> past and already has good review coverage, but it really should have
>>>>> been in linux-next for a week or two before heading upstream.
>>>>>
>>>> Hi Jens,
>>>>
>>>> Copied, send patches _before_ merge window opens.
>>>>
>>>> But could you please to give me a hint, how can I find when a specific
>>>> merge window will open ? I search LKML, and see people send pull
>>>> requests for 4.14 merge window, but I don't see any announcement for
>>>> 4.14 merge window time slot.
>>>
>>> The merge window opens when Linus releases the previous kernel. So you
>>> have to try and keep track of when he expects to do that. That really
>>> isn't that difficult - Linus always cuts a release on Sundays. We always
>>> have 7 or 8 rc releases, so a good time to check is his mail on -rc6 or
>>> -rc7 release, that usually gives a good indication of when he expects to
>>> release the final.
>>>
>>> To keep things simple, if you always have things ready when -rc7 is
>>> released, then you are at least one week out from the merge window,
>>> possibly two if we end up doing an -rc8. That means you don't have to
>>> track anything, you know the exact date of when -rc7 is released since
>>> Linus's schedule is usually reliable.
>>>
>>
>> Hi Jens,
>>
>> Copied, I will follow the above rule in next cycle.
>>
>> And I just post the last patch
>> (0013-bcache-initialize-stripe_sectors_dirty-correctly-for.patch) to you
>> for 4.14 cycle.
>>
>> This patch is an important fix for bcache writeback rate calculation, it
>> is depended by another patch I sent to you
>> (0006-bcache-correct-cache_dirty_target-in-__update_writeb.patch),
>> please have it in 4.14.
>>
>> Thanks in advance.
> 
> Hello Jens,
> 
> FWIW I'd like to support Coly by reporting that patches 0001, 0002, 0006 
> and 0013 (the last one mentioned above) have been tested by me on 2 
> production servers with a good deal of bcache activity for the last 6 
> weeks without any issues having arisen.
> 
> But this may not be much use to you to know, as these are running 
> vanilla kernel.org kernel 4.4.77.  Still, these 4 patches exactly as 
> Coly has sent them to you apply without any changes to the code, so I 
> guess it vouches for the code at least somewhat.

That's all good and fine, and it's better than no testing at all. But it
doesn't change the fact that anyone using bcache on -next would have
been using the changes for a while before release, instead of just one
or two people. Additionally, your base is drastically different than
mainline, since you're running a kernel that's a year and a half old.

Next time I'm not adding anything post merge window open, unless it's
addressing a problem that was added in the merge window.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-06  6:25 ` [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API Coly Li
@ 2017-09-26  4:38   ` Michael Lyle
  2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
                       ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Michael Lyle @ 2017-09-26  4:38 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet

I believe this introduces a critical bug.

cl->list is used to link together the llists for both things waiting,
and for things that are being woken.

If a closure that is woken decides to wait again, it will corrupt the
llist that __closure_wake_up is using.

The previous iteration structure gets the next element of the list
before waking and is therefore safe.

Mike

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
@ 2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  0 siblings, 0 replies; 35+ messages in thread
From: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com) @ 2017-09-26  6:39 UTC (permalink / raw)
  To: Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet, kernel-team

> -----Original Message-----
> From: Michael Lyle [mailto:mlyle@lyle.org]
> Sent: Tuesday, September 26, 2017 1:38 PM
> To: Coly Li
> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> llist API
> 
> I believe this introduces a critical bug.
> 
> cl->list is used to link together the llists for both things waiting,
> and for things that are being woken.
> 
> If a closure that is woken decides to wait again, it will corrupt the
> llist that __closure_wake_up is using.
> 
> The previous iteration structure gets the next element of the list
> before waking and is therefore safe.

Do you mean we have to use llist_for_each_entry_safe() instead of the non-safe version?
Is it OK if we use it instead?

> Mike

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  4:38   ` Michael Lyle
  2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
@ 2017-09-26  7:08     ` Coly Li
  2017-09-26  7:16         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  2017-09-26  7:46     ` Coly Li
  2 siblings, 1 reply; 35+ messages in thread
From: Coly Li @ 2017-09-26  7:08 UTC (permalink / raw)
  To: Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet

On 2017/9/26 下午12:38, Michael Lyle wrote:
> I believe this introduces a critical bug.
> 
> cl->list is used to link together the llists for both things waiting,
> and for things that are being woken.
> 
> If a closure that is woken decides to wait again, it will corrupt the
> llist that __closure_wake_up is using.
> 
> The previous iteration structure gets the next element of the list
> before waking and is therefore safe.
>

Hi Mike,

Good catch! I saw llist_del_all() but forgot that cl->list can be modified in
closure_wait(). Yes, there is a potential chance to mislead
llist_for_each_entry() into iterating over the wrong list.
llist_for_each_entry_safe() should be used here. I will send a fix to
Jens and hope it can still make 4.14.
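For reference, below is a minimal sketch of the waiting side, loosely based on
the closure code in drivers/md/bcache; the helper and field names are
simplified assumptions rather than the exact upstream source. The point is
that closure_wait() re-links cl->list, which is what can derail a non-safe
walk in __closure_wake_up():

/* Sketch only; the real closure_wait() also records the waiting
 * address for debugging and has more bookkeeping. */
bool closure_wait(struct closure_waitlist *waitlist, struct closure *cl)
{
	if (atomic_read(&cl->remaining) & CLOSURE_WAITING)
		return false;

	atomic_add(CLOSURE_WAITING + 1, &cl->remaining);
	/* Re-links cl->list: after this, cl->list.next belongs to
	 * 'waitlist', not to the list __closure_wake_up() is walking. */
	llist_add(&cl->list, &waitlist->list);

	return true;
}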

Thanks!
-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  (?)
@ 2017-09-26  7:09       ` Coly Li
  2017-09-26  7:15           ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  -1 siblings, 1 reply; 35+ messages in thread
From: Coly Li @ 2017-09-26  7:09 UTC (permalink / raw)
  To: 박병철/선임연구원/SW
	Platform(연)AOT팀(byungchul.park@lge.com),
	Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Kent Overstreet,
	kernel-team

On 2017/9/26 下午2:39, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Michael Lyle [mailto:mlyle@lyle.org]
>> Sent: Tuesday, September 26, 2017 1:38 PM
>> To: Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> I believe this introduces a critical bug.
>>
>> cl->list is used to link together the llists for both things waiting,
>> and for things that are being woken.
>>
>> If a closure that is woken decides to wait again, it will corrupt the
>> llist that __closure_wake_up is using.
>>
>> The previous iteration structure gets the next element of the list
>> before waking and is therefore safe.
> 
> Do you mean we have to use llist_for_each_entry_safe() instead of non-safe version?
> Is it ok if we use it instead?

Yes, we should use llist_for_each_entry_safe(); there is quite a subtle,
implicit race here.
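For illustration, here is roughly what the safe iteration looks like in
__closure_wake_up(); this is a reconstruction for the discussion, not the
exact patch that was posted (the real function also restores FIFO order with
llist_reverse_order() and clears the waiting state):

static void __closure_wake_up(struct closure_waitlist *wait_list)
{
	struct closure *cl, *t;
	struct llist_node *list = llist_del_all(&wait_list->list);

	/*
	 * llist_for_each_entry(cl, list, list) is unsafe here: closure_sub()
	 * may wake cl, cl may call closure_wait() again, and cl->list.next
	 * then points into another waitlist.  The _safe variant loads the
	 * next entry into 't' before the body runs, so re-linking cl->list
	 * cannot derail the walk.
	 */
	llist_for_each_entry_safe(cl, t, list, list)
		closure_sub(cl, CLOSURE_WAITING + 1);
}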

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  7:09       ` Coly Li
@ 2017-09-26  7:15           ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  0 siblings, 0 replies; 35+ messages in thread
From: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com) @ 2017-09-26  7:15 UTC (permalink / raw)
  To: Coly Li,
	박병철/선임연구원/SW
	Platform(연)AOT팀(byungchul.park@lge.com),
	Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Kent Overstreet,
	kernel-team

> -----Original Message-----
> From: Coly Li [mailto:i@coly.li]
> Sent: Tuesday, September 26, 2017 4:09 PM
> To: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com);
> Michael Lyle; Coly Li
> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> axboe@kernel.dk; Eric Wheeler; Kent Overstreet; kernel-team@lge.com
> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> llist API
> 
> On 2017/9/26 下午2:39, 박병철/선임연구원/SW
> Platform(연)AOT팀(byungchul.park@lge.com) wrote:
> >> -----Original Message-----
> >> From: Michael Lyle [mailto:mlyle@lyle.org]
> >> Sent: Tuesday, September 26, 2017 1:38 PM
> >> To: Coly Li
> >> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> >> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
> >> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> >> llist API
> >>
> >> I believe this introduces a critical bug.
> >>
> >> cl->list is used to link together the llists for both things waiting,
> >> and for things that are being woken.
> >>
> >> If a closure that is woken decides to wait again, it will corrupt the
> >> llist that __closure_wake_up is using.
> >>
> >> The previous iteration structure gets the next element of the list
> >> before waking and is therefore safe.
> >
> > Do you mean we have to use llist_for_each_entry_safe() instead of non-safe
> version?
> > Is it ok if we use it instead?
> 
> Yes, we should use llist_for_each_entry_safe(), there is a quite
> implicit race here.

Hi Coly,

As you know, my first patch used the safe version, but you suggested replacing
it with the non-safe one. :(

I will change it so that it does the same as the original patch did. :)

Thank you very much,
Byungchul

> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
@ 2017-09-26  7:15           ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  0 siblings, 0 replies; 35+ messages in thread
From: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com) @ 2017-09-26  7:15 UTC (permalink / raw)
  To: Coly Li,
	박병철/선임연구원/SW
	Platform(연)AOT팀(byungchul.park@lge.com),
	Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Kent Overstreet,
	kernel-team

> -----Original Message-----
> From: Coly Li [mailto:i@coly.li]
> Sent: Tuesday, September 26, 2017 4:09 PM
> To: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com);
> Michael Lyle; Coly Li
> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> axboe@kernel.dk; Eric Wheeler; Kent Overstreet; kernel-team@lge.com
> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> llist API
> 
> On 2017/9/26 下午2:39, 박병철/선임연구원/SW
> Platform(연)AOT팀(byungchul.park@lge.com) wrote:
> >> -----Original Message-----
> >> From: Michael Lyle [mailto:mlyle@lyle.org]
> >> Sent: Tuesday, September 26, 2017 1:38 PM
> >> To: Coly Li
> >> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> >> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
> >> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> >> llist API
> >>
> >> I believe this introduces a critical bug.
> >>
> >> cl->list is used to link together the llists for both things waiting,
> >> and for things that are being woken.
> >>
> >> If a closure that is woken decides to wait again, it will corrupt the
> >> llist that __closure_wake_up is using.
> >>
> >> The previous iteration structure gets the next element of the list
> >> before waking and is therefore safe.
> >
> > Do you mean we have to use llist_for_each_entry_safe() instead of non-safe
> version?
> > Is it ok if we use it instead?
> 
> Yes, we should use llist_for_each_entry_safe(), there is a quite
> implicit race here.

Hi Coly,

As you know, my first patch used the safe version, but you suggested replacing
it with the non-safe one. :(

I will change it so that it does the same as the original patch did. :)

Thank you very much,
Byungchul

> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  7:08     ` Coly Li
@ 2017-09-26  7:16         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  0 siblings, 0 replies; 35+ messages in thread
From: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com) @ 2017-09-26  7:16 UTC (permalink / raw)
  To: Coly Li, Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet, kernel-team

> -----Original Message-----
> From: Coly Li [mailto:i@coly.li]
> Sent: Tuesday, September 26, 2017 4:09 PM
> To: Michael Lyle; Coly Li
> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> llist API
> 
> On 2017/9/26 下午12:38, Michael Lyle wrote:
> > I believe this introduces a critical bug.
> >
> > cl->list is used to link together the llists for both things waiting,
> > and for things that are being woken.
> >
> > If a closure that is woken decides to wait again, it will corrupt the
> > llist that __closure_wake_up is using.
> >
> > The previous iteration structure gets the next element of the list
> > before waking and is therefore safe.
> >
> 
> Hi Mike,
> 
> Good catch! I see llist_del_all() but forget cl->list can be modified in
> closure_wait(). Yes there is potential chance to mislead
> llist_for_each_entry() to iterate wrong list.
> llist_for_each_entry_safe() should be used here. I will send a fix to
> Jens, hope to catch up 4.14 still.

I see. You have a plan to do it. Please fix it.

Thank you.

> Thanks!
> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
@ 2017-09-26  7:16         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  0 siblings, 0 replies; 35+ messages in thread
From: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com) @ 2017-09-26  7:16 UTC (permalink / raw)
  To: Coly Li, Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet, kernel-team

> -----Original Message-----
> From: Coly Li [mailto:i@coly.li]
> Sent: Tuesday, September 26, 2017 4:09 PM
> To: Michael Lyle; Coly Li
> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
> llist API
> 
> On 2017/9/26 下午12:38, Michael Lyle wrote:
> > I believe this introduces a critical bug.
> >
> > cl->list is used to link together the llists for both things waiting,
> > and for things that are being woken.
> >
> > If a closure that is woken decides to wait again, it will corrupt the
> > llist that __closure_wake_up is using.
> >
> > The previous iteration structure gets the next element of the list
> > before waking and is therefore safe.
> >
> 
> Hi Mike,
> 
> Good catch! I see llist_del_all() but forget cl->list can be modified in
> closure_wait(). Yes there is potential chance to mislead
> llist_for_each_entry() to iterate wrong list.
> llist_for_each_entry_safe() should be used here. I will send a fix to
> Jens, hope to catch up 4.14 still.

I see. You have a plan to do it. Please fix it.

Thank you.

> Thanks!
> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  7:15           ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  (?)
@ 2017-09-26  7:22           ` Coly Li
  -1 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-26  7:22 UTC (permalink / raw)
  To: 박병철/선임연구원/SW
	Platform(연)AOT팀(byungchul.park@lge.com),
	Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Kent Overstreet,
	kernel-team

On 2017/9/26 下午3:15, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Coly Li [mailto:i@coly.li]
>> Sent: Tuesday, September 26, 2017 4:09 PM
>> To: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com);
>> Michael Lyle; Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Kent Overstreet; kernel-team@lge.com
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> On 2017/9/26 下午2:39, 박병철/선임연구원/SW
>> Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>>>> -----Original Message-----
>>>> From: Michael Lyle [mailto:mlyle@lyle.org]
>>>> Sent: Tuesday, September 26, 2017 1:38 PM
>>>> To: Coly Li
>>>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>>>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>>>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>>>> llist API
>>>>
>>>> I believe this introduces a critical bug.
>>>>
>>>> cl->list is used to link together the llists for both things waiting,
>>>> and for things that are being woken.
>>>>
>>>> If a closure that is woken decides to wait again, it will corrupt the
>>>> llist that __closure_wake_up is using.
>>>>
>>>> The previous iteration structure gets the next element of the list
>>>> before waking and is therefore safe.
>>>
>>> Do you mean we have to use llist_for_each_entry_safe() instead of non-safe
>> version?
>>> Is it ok if we use it instead?
>>
>> Yes, we should use llist_for_each_entry_safe(), there is a quite
>> implicit race here.
> 
> Hi coly,
> 
> As you know, my first patch used the safe version, but you suggested to replace
> It with non-safe one. :(

No doubt, it's my fault. I didn't spot the implicit race.

> I will change it so it does the same as the original patch did. :)

Yes, please. Because the original patch is already merged into Linux upstream,
post a fix that replaces llist_for_each_entry() with
llist_for_each_entry_safe(). We still have a chance to make 4.14.

Thank you!

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  7:16         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  (?)
@ 2017-09-26  7:24         ` Coly Li
  -1 siblings, 0 replies; 35+ messages in thread
From: Coly Li @ 2017-09-26  7:24 UTC (permalink / raw)
  To: 박병철/선임연구원/SW
	Platform(연)AOT팀(byungchul.park@lge.com),
	Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Kent Overstreet,
	kernel-team

On 2017/9/26 下午3:16, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Coly Li [mailto:i@coly.li]
>> Sent: Tuesday, September 26, 2017 4:09 PM
>> To: Michael Lyle; Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> On 2017/9/26 下午12:38, Michael Lyle wrote:
>>> I believe this introduces a critical bug.
>>>
>>> cl->list is used to link together the llists for both things waiting,
>>> and for things that are being woken.
>>>
>>> If a closure that is woken decides to wait again, it will corrupt the
>>> llist that __closure_wake_up is using.
>>>
>>> The previous iteration structure gets the next element of the list
>>> before waking and is therefore safe.
>>>
>>
>> Hi Mike,
>>
>> Good catch! I see llist_del_all() but forget cl->list can be modified in
>> closure_wait(). Yes there is potential chance to mislead
>> llist_for_each_entry() to iterate wrong list.
>> llist_for_each_entry_safe() should be used here. I will send a fix to
>> Jens, hope to catch up 4.14 still.
> 
> I see. You have a plan to do it. Please fix it.

Oh, I only just saw this email while replying to your last one.
Sure, let me fix my mistake.

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  4:38   ` Michael Lyle
  2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
  2017-09-26  7:08     ` Coly Li
@ 2017-09-26  7:46     ` Coly Li
  2017-09-26 19:55         ` Michael Lyle
  2 siblings, 1 reply; 35+ messages in thread
From: Coly Li @ 2017-09-26  7:46 UTC (permalink / raw)
  To: Michael Lyle, Coly Li
  Cc: linux-bcache, linux-block, axboe, Eric Wheeler, Byungchul Park,
	Kent Overstreet

On 2017/9/26 下午12:38, Michael Lyle wrote:
> I believe this introduces a critical bug.
> 
> cl->list is used to link together the llists for both things waiting,
> and for things that are being woken.
> 
> If a closure that is woken decides to wait again, it will corrupt the
> llist that __closure_wake_up is using.
> 
> The previous iteration structure gets the next element of the list
> before waking and is therefore safe.


Hi Michael and Byungchul,

It was my fault to suggest that Byungchul change his correct patch into
the wrong one. But it's good to learn about such an implicit race behind the
bcache code. I just posted a patch explaining how this race may happen and
corrupt the reverse list iteration. Could you please review the fix?

And thanks again to Michael for catching this bug.
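The same rule applies to llist users beyond bcache. Here is a small
self-contained illustration with made-up names (not code under review in this
thread): whenever the loop body may re-link the node, the safe iterator is
required.

#include <linux/llist.h>

struct work_item {
	struct llist_node node;
	void (*fn)(struct work_item *);	/* may llist_add(&item->node, ...) */
};

static void drain(struct llist_head *head)
{
	struct work_item *item, *tmp;
	struct llist_node *first = llist_del_all(head);

	/*
	 * llist_for_each_entry(item, first, node) would read item->node.next
	 * only after fn() ran, i.e. possibly after the node was pushed onto
	 * a different list; the _safe variant samples it first.
	 */
	llist_for_each_entry_safe(item, tmp, first, node)
		item->fn(item);
}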

-- 
Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
  2017-09-26  7:46     ` Coly Li
@ 2017-09-26 19:55         ` Michael Lyle
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Lyle @ 2017-09-26 19:55 UTC (permalink / raw)
  To: Coly Li
  Cc: Coly Li, linux-bcache, linux-block, axboe, Eric Wheeler,
	Byungchul Park, Kent Overstreet

Thanks everyone--

Yes, this looks good to me and works in testing.

Mike

On Tue, Sep 26, 2017 at 12:46 AM, Coly Li <i@coly.li> wrote:
> On 2017/9/26 下午12:38, Michael Lyle wrote:
>> I believe this introduces a critical bug.
>>
>> cl->list is used to link together the llists for both things waiting,
>> and for things that are being woken.
>>
>> If a closure that is woken decides to wait again, it will corrupt the
>> llist that __closure_wake_up is using.
>>
>> The previous iteration structure gets the next element of the list
>> before waking and is therefore safe.
>
>
> Hi Michael and Byungchul,
>
> This is my fault to suggest Byungchul to change his correct patch into
> wrong one. But it's good to learn such an implicit race behind bcache
> code. I just post a patch to explain how this race may happen and
> corrupt the reverse list iteration. Could you please to review the fix ?
>
> And thanks to Michael again to catch this bug.
>
> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API
@ 2017-09-26 19:55         ` Michael Lyle
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Lyle @ 2017-09-26 19:55 UTC (permalink / raw)
  To: Coly Li
  Cc: Coly Li, linux-bcache, linux-block, axboe, Eric Wheeler,
	Byungchul Park, Kent Overstreet

Thanks everyone--

Yes, this looks good to me and works in testing.

Mike

On Tue, Sep 26, 2017 at 12:46 AM, Coly Li <i@coly.li> wrote:
> On 2017/9/26 下午12:38, Michael Lyle wrote:
>> I believe this introduces a critical bug.
>>
>> cl->list is used to link together the llists for both things waiting,
>> and for things that are being woken.
>>
>> If a closure that is woken decides to wait again, it will corrupt the
>> llist that __closure_wake_up is using.
>>
>> The previous iteration structure gets the next element of the list
>> before waking and is therefore safe.
>
>
> Hi Michael and Byungchul,
>
> This is my fault to suggest Byungchul to change his correct patch into
> wrong one. But it's good to learn such an implicit race behind bcache
> code. I just post a patch to explain how this race may happen and
> corrupt the reverse list iteration. Could you please to review the fix ?
>
> And thanks to Michael again to catch this bug.
>
> --
> Coly Li

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2017-09-26 19:55 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-06  6:25 [PATCH 00/13] bcache: fixes and update for 4.14 Coly Li
2017-09-06  6:25 ` [PATCH 01/12] bcache: Fix leak of bdev reference Coly Li
2017-09-06  6:25 ` [PATCH 02/12] bcache: fix sequential large write IO bypass Coly Li
2017-09-06  6:25 ` [PATCH 03/12] bcache: do not subtract sectors_to_gc for bypassed IO Coly Li
2017-09-06  6:25 ` [PATCH 04/12] bcache: Don't reinvent the wheel but use existing llist API Coly Li
2017-09-26  4:38   ` Michael Lyle
2017-09-26  6:39     ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  6:39       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  7:09       ` Coly Li
2017-09-26  7:15         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  7:15           ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  7:22           ` Coly Li
2017-09-26  7:08     ` Coly Li
2017-09-26  7:16       ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  7:16         ` 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com)
2017-09-26  7:24         ` Coly Li
2017-09-26  7:46     ` Coly Li
2017-09-26 19:55       ` Michael Lyle
2017-09-26 19:55         ` Michael Lyle
2017-09-06  6:25 ` [PATCH 05/12] bcache: gc does not work when triggering by manual command Coly Li
2017-09-06  6:25 ` [PATCH 06/12] bcache: correct cache_dirty_target in __update_writeback_rate() Coly Li
2017-09-06  6:25 ` [PATCH 07/12] bcache: Correct return value for sysfs attach errors Coly Li
2017-09-06  6:25 ` [PATCH 08/12] bcache: increase the number of open buckets Coly Li
2017-09-06  6:25   ` Coly Li
2017-09-06  6:25 ` [PATCH 09/12] bcache: fix for gc and write-back race Coly Li
2017-09-06  6:26 ` [PATCH 10/12] bcache: silence static checker warning Coly Li
2017-09-06  6:26 ` [PATCH 11/12] bcache: Update continue_at() documentation Coly Li
2017-09-06  6:26 ` [PATCH 12/12] bcache: fix bch_hprint crash and improve output Coly Li
2017-09-06 14:20 ` [PATCH 00/13] bcache: fixes and update for 4.14 Jens Axboe
2017-09-06 15:41   ` Coly Li
2017-09-06 15:46     ` Jens Axboe
2017-09-06 17:38       ` Coly Li
2017-09-07 18:51         ` Eddie Chapman
2017-09-07 19:31           ` Jens Axboe
2017-09-07 19:01         ` Eddie Chapman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.