All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Duplicate UUID fixes for 4.16
@ 2018-03-05 21:41 Michael Lyle
  2018-03-05 21:41 ` [PATCH 1/2] bcache: fix crashes in duplicate cache device register Michael Lyle
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Michael Lyle @ 2018-03-05 21:41 UTC (permalink / raw)
  To: linux-block, linux-bcache; +Cc: axboe

Jens--

Sorry for a couple of last-minute changes.  Marc Merlin noticed an issue
with disk imaging where if multiple devices were present with the same
bcache UUID that a kernel panic could result.  Tang Junhui fixed this.
I found a related data corruption issue with duplicate backing devices.

Both are minimal, reviewed, and tested.  As always, thanks for your help.

Mike

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] bcache: fix crashes in duplicate cache device register
  2018-03-05 21:41 [PATCH 0/2] Duplicate UUID fixes for 4.16 Michael Lyle
@ 2018-03-05 21:41 ` Michael Lyle
  2018-03-05 21:41 ` [PATCH 2/2] bcache: don't attach backing with duplicate UUID Michael Lyle
  2018-03-05 21:46 ` [PATCH 0/2] Duplicate UUID fixes for 4.16 Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Michael Lyle @ 2018-03-05 21:41 UTC (permalink / raw)
  To: linux-block, linux-bcache; +Cc: axboe, Tang Junhui, stable

From: Tang Junhui <tang.junhui@zte.com.cn>

Kernel crashed when register a duplicate cache device, the call trace is
bellow:
[  417.643790] CPU: 1 PID: 16886 Comm: bcache-register Tainted: G
   W  OE    4.15.5-amd64-preempt-sysrq-20171018 #2
[  417.643861] Hardware name: LENOVO 20ERCTO1WW/20ERCTO1WW, BIOS
N1DET41W (1.15 ) 12/31/2015
[  417.643870] RIP: 0010:bdevname+0x13/0x1e
[  417.643876] RSP: 0018:ffffa3aa9138fd38 EFLAGS: 00010282
[  417.643884] RAX: 0000000000000000 RBX: ffff8c8f2f2f8000 RCX: ffffd6701f8
c7edf
[  417.643890] RDX: ffffa3aa9138fd88 RSI: ffffa3aa9138fd88 RDI: 00000000000
00000
[  417.643895] RBP: ffffa3aa9138fde0 R08: ffffa3aa9138fae8 R09: 00000000000
1850e
[  417.643901] R10: ffff8c8eed34b271 R11: ffff8c8eed34b250 R12: 00000000000
00000
[  417.643906] R13: ffffd6701f78f940 R14: ffff8c8f38f80000 R15: ffff8c8ea7d
90000
[  417.643913] FS:  00007fde7e66f500(0000) GS:ffff8c8f61440000(0000) knlGS:
0000000000000000
[  417.643919] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  417.643925] CR2: 0000000000000314 CR3: 00000007e6fa0001 CR4: 00000000003
606e0
[  417.643931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000
00000
[  417.643938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000
00400
[  417.643946] Call Trace:
[  417.643978]  register_bcache+0x1117/0x1270 [bcache]
[  417.643994]  ? slab_pre_alloc_hook+0x15/0x3c
[  417.644001]  ? slab_post_alloc_hook.isra.44+0xa/0x1a
[  417.644013]  ? kernfs_fop_write+0xf6/0x138
[  417.644020]  kernfs_fop_write+0xf6/0x138
[  417.644031]  __vfs_write+0x31/0xcc
[  417.644043]  ? current_kernel_time64+0x10/0x36
[  417.644115]  ? __audit_syscall_entry+0xbf/0xe3
[  417.644124]  vfs_write+0xa5/0xe2
[  417.644133]  SyS_write+0x5c/0x9f
[  417.644144]  do_syscall_64+0x72/0x81
[  417.644161]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  417.644169] RIP: 0033:0x7fde7e1c1974
[  417.644175] RSP: 002b:00007fff13009a38 EFLAGS: 00000246 ORIG_RAX: 0000000
000000001
[  417.644183] RAX: ffffffffffffffda RBX: 0000000001658280 RCX: 00007fde7e1c
1974
[  417.644188] RDX: 000000000000000a RSI: 0000000001658280 RDI: 000000000000
0001
[  417.644193] RBP: 000000000000000a R08: 0000000000000003 R09: 000000000000
0077
[  417.644198] R10: 000000000000089e R11: 0000000000000246 R12: 000000000000
0001
[  417.644203] R13: 000000000000000a R14: 7fffffffffffffff R15: 000000000000
0000
[  417.644213] Code: c7 c2 83 6f ee 98 be 20 00 00 00 48 89 df e8 6c 27 3b 0
0 48 89 d8 5b c3 0f 1f 44 00 00 48 8b 47 70 48 89 f2 48 8b bf 80 00 00 00 <8
b> b0 14 03 00 00 e9 73 ff ff ff 0f 1f 44 00 00 48 8b 47 40 39
[  417.644302] RIP: bdevname+0x13/0x1e RSP: ffffa3aa9138fd38
[  417.644306] CR2: 0000000000000314

When registering duplicate cache device in register_cache(), after failure
on calling register_cache_set(), bch_cache_release() will be called, then
bdev will be freed, so bdevname(bdev, name) caused kernel crash.

Since bch_cache_release() will free bdev, so in this patch we make sure
bdev being freed if register_cache() fail, and do not free bdev again in
register_bcache() when register_cache() fail.

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reported-by: Marc MERLIN <marc@merlins.org>
Tested-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: <stable@vger.kernel.org>
---
 drivers/md/bcache/super.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 312895788036..9c141a8aaacc 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1204,7 +1204,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 
 	return;
 err:
-	pr_notice("error opening %s: %s", bdevname(bdev, name), err);
+	pr_notice("error %s: %s", bdevname(bdev, name), err);
 	bcache_device_stop(&dc->disk);
 }
 
@@ -1883,6 +1883,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	const char *err = NULL; /* must be set for any error case */
 	int ret = 0;
 
+	bdevname(bdev, name);
+
 	memcpy(&ca->sb, sb, sizeof(struct cache_sb));
 	ca->bdev = bdev;
 	ca->bdev->bd_holder = ca;
@@ -1891,11 +1893,12 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	bio_first_bvec_all(&ca->sb_bio)->bv_page = sb_page;
 	get_page(sb_page);
 
-	if (blk_queue_discard(bdev_get_queue(ca->bdev)))
+	if (blk_queue_discard(bdev_get_queue(bdev)))
 		ca->discard = CACHE_DISCARD(&ca->sb);
 
 	ret = cache_alloc(ca);
 	if (ret != 0) {
+		blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
 		if (ret == -ENOMEM)
 			err = "cache_alloc(): -ENOMEM";
 		else
@@ -1918,14 +1921,14 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 		goto out;
 	}
 
-	pr_info("registered cache device %s", bdevname(bdev, name));
+	pr_info("registered cache device %s", name);
 
 out:
 	kobject_put(&ca->kobj);
 
 err:
 	if (err)
-		pr_notice("error opening %s: %s", bdevname(bdev, name), err);
+		pr_notice("error %s: %s", name, err);
 
 	return ret;
 }
@@ -2014,6 +2017,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 	if (err)
 		goto err_close;
 
+	err = "failed to register device";
 	if (SB_IS_BDEV(sb)) {
 		struct cached_dev *dc = kzalloc(sizeof(*dc), GFP_KERNEL);
 		if (!dc)
@@ -2028,7 +2032,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 			goto err_close;
 
 		if (register_cache(sb, sb_page, bdev, ca) != 0)
-			goto err_close;
+			goto err;
 	}
 out:
 	if (sb_page)
@@ -2041,7 +2045,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 err_close:
 	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
 err:
-	pr_info("error opening %s: %s", path, err);
+	pr_info("error %s: %s", path, err);
 	ret = -EINVAL;
 	goto out;
 }
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] bcache: don't attach backing with duplicate UUID
  2018-03-05 21:41 [PATCH 0/2] Duplicate UUID fixes for 4.16 Michael Lyle
  2018-03-05 21:41 ` [PATCH 1/2] bcache: fix crashes in duplicate cache device register Michael Lyle
@ 2018-03-05 21:41 ` Michael Lyle
  2018-03-05 21:46 ` [PATCH 0/2] Duplicate UUID fixes for 4.16 Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Michael Lyle @ 2018-03-05 21:41 UTC (permalink / raw)
  To: linux-block, linux-bcache; +Cc: axboe, Michael Lyle, stable

This can happen e.g. during disk cloning.

This is an incomplete fix: it does not catch duplicate UUIDs earlier
when things are still unattached.  It does not unregister the device.
Further changes to cope better with this are planned but conflict with
Coly's ongoing improvements to handling device errors.  In the meantime,
one can manually stop the device after this has happened.

Attempts to attach a duplicate device result in:

[  136.372404] loop: module loaded
[  136.424461] bcache: register_bdev() registered backing device loop0
[  136.424464] bcache: bch_cached_dev_attach() Tried to attach loop0 but duplicate UUID already attached

My test procedure is:

  dd if=/dev/sdb1 of=imgfile bs=1024 count=262144
  losetup -f imgfile

Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Cc: <stable@vger.kernel.org>
---
 drivers/md/bcache/super.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 9c141a8aaacc..5cace6892958 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -963,6 +963,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 	uint32_t rtime = cpu_to_le32(get_seconds());
 	struct uuid_entry *u;
 	char buf[BDEVNAME_SIZE];
+	struct cached_dev *exist_dc, *t;
 
 	bdevname(dc->bdev, buf);
 
@@ -987,6 +988,16 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
 		return -EINVAL;
 	}
 
+	/* Check whether already attached */
+	list_for_each_entry_safe(exist_dc, t, &c->cached_devs, list) {
+		if (!memcmp(dc->sb.uuid, exist_dc->sb.uuid, 16)) {
+			pr_err("Tried to attach %s but duplicate UUID already attached",
+				buf);
+
+			return -EINVAL;
+		}
+	}
+
 	u = uuid_find(c, dc->sb.uuid);
 
 	if (u &&
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 0/2] Duplicate UUID fixes for 4.16
  2018-03-05 21:41 [PATCH 0/2] Duplicate UUID fixes for 4.16 Michael Lyle
  2018-03-05 21:41 ` [PATCH 1/2] bcache: fix crashes in duplicate cache device register Michael Lyle
  2018-03-05 21:41 ` [PATCH 2/2] bcache: don't attach backing with duplicate UUID Michael Lyle
@ 2018-03-05 21:46 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2018-03-05 21:46 UTC (permalink / raw)
  To: Michael Lyle, linux-block, linux-bcache

On 3/5/18 2:41 PM, Michael Lyle wrote:
> Jens--
> 
> Sorry for a couple of last-minute changes.  Marc Merlin noticed an issue
> with disk imaging where if multiple devices were present with the same
> bcache UUID that a kernel panic could result.  Tang Junhui fixed this.
> I found a related data corruption issue with duplicate backing devices.
> 
> Both are minimal, reviewed, and tested.  As always, thanks for your help.

Applied, thanks Mike.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-03-05 21:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-05 21:41 [PATCH 0/2] Duplicate UUID fixes for 4.16 Michael Lyle
2018-03-05 21:41 ` [PATCH 1/2] bcache: fix crashes in duplicate cache device register Michael Lyle
2018-03-05 21:41 ` [PATCH 2/2] bcache: don't attach backing with duplicate UUID Michael Lyle
2018-03-05 21:46 ` [PATCH 0/2] Duplicate UUID fixes for 4.16 Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.