linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -next v3 0/6] nbd: bugfix and cleanup patches
@ 2022-05-21  7:37 Yu Kuai
  2022-05-21  7:37 ` [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup() Yu Kuai
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

Changes in v3:
 - rewrap to 80 columns where possible in patch 6
Changes in v2:
 - in patch 3, instead of clear and then reset the flag if rq is not
 completed, test first and clear if rq is going to complete.

path 1-2 fix races between nbd setup and module removal.
patch 3 fix io can't be completed in some error path.
patch 4 fix io hung when disconnecting failed.
patch 5 fix sysfs warning about duplicate creation.
patch 6 use pr_err to output error message.

Previous versions:
v1: https://lore.kernel.org/all/20220426130746.885140-1-yukuai3@huawei.com/
v2: https://lore.kernel.org/all/20220518122618.1702997-1-yukuai3@huawei.com/

Yu Kuai (5):
  nbd: call genl_unregister_family() first in nbd_cleanup()
  nbd: fix race between nbd_alloc_config() and module removal
  nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
  nbd: fix io hung while disconnecting device
  nbd: use pr_err to output error message

Zhang Wensheng (1):
  nbd: fix possible overflow on 'first_minor' in nbd_dev_add()

 drivers/block/nbd.c | 114 ++++++++++++++++++++++++++------------------
 1 file changed, 68 insertions(+), 46 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup()
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:13   ` Josef Bacik
  2022-05-21  7:37 ` [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal Yu Kuai
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

Otherwise there may be race between module removal and the handling of
netlink command, which can lead to the oops as shown below:

  BUG: kernel NULL pointer dereference, address: 0000000000000098
  Oops: 0002 [#1] SMP PTI
  CPU: 1 PID: 31299 Comm: nbd-client Tainted: G            E     5.14.0-rc4
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  RIP: 0010:down_write+0x1a/0x50
  Call Trace:
   start_creating+0x89/0x130
   debugfs_create_dir+0x1b/0x130
   nbd_start_device+0x13d/0x390 [nbd]
   nbd_genl_connect+0x42f/0x748 [nbd]
   genl_family_rcv_msg_doit.isra.0+0xec/0x150
   genl_rcv_msg+0xe5/0x1e0
   netlink_rcv_skb+0x55/0x100
   genl_rcv+0x29/0x40
   netlink_unicast+0x1a8/0x250
   netlink_sendmsg+0x21b/0x430
   ____sys_sendmsg+0x2a4/0x2d0
   ___sys_sendmsg+0x81/0xc0
   __sys_sendmsg+0x62/0xb0
   __x64_sys_sendmsg+0x1f/0x30
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  Modules linked in: nbd(E-)

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index ac8b045c777c..a73e853f5833 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2528,6 +2528,12 @@ static void __exit nbd_cleanup(void)
 	struct nbd_device *nbd;
 	LIST_HEAD(del_list);
 
+	/*
+	 * Unregister netlink interface prior to waiting
+	 * for the completion of netlink commands.
+	 */
+	genl_unregister_family(&nbd_genl_family);
+
 	nbd_dbg_close();
 
 	mutex_lock(&nbd_index_mutex);
@@ -2546,7 +2552,6 @@ static void __exit nbd_cleanup(void)
 	destroy_workqueue(nbd_del_wq);
 
 	idr_destroy(&nbd_index_idr);
-	genl_unregister_family(&nbd_genl_family);
 	unregister_blkdev(NBD_MAJOR, "nbd");
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
  2022-05-21  7:37 ` [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup() Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:14   ` Josef Bacik
  2022-05-21  7:37 ` [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed Yu Kuai
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

When nbd module is being removing, nbd_alloc_config() may be
called concurrently by nbd_genl_connect(), although try_module_get()
will return false, but nbd_alloc_config() doesn't handle it.

The race may lead to the leak of nbd_config and its related
resources (e.g, recv_workq) and oops in nbd_read_stat() due
to the unload of nbd module as shown below:

  BUG: kernel NULL pointer dereference, address: 0000000000000040
  Oops: 0000 [#1] SMP PTI
  CPU: 5 PID: 13840 Comm: kworker/u17:33 Not tainted 5.14.0+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  Workqueue: knbd16-recv recv_work [nbd]
  RIP: 0010:nbd_read_stat.cold+0x130/0x1a4 [nbd]
  Call Trace:
   recv_work+0x3b/0xb0 [nbd]
   process_one_work+0x1ed/0x390
   worker_thread+0x4a/0x3d0
   kthread+0x12a/0x150
   ret_from_fork+0x22/0x30

Fixing it by checking the return value of try_module_get()
in nbd_alloc_config(). As nbd_alloc_config() may return ERR_PTR(-ENODEV),
assign nbd->config only when nbd_alloc_config() succeeds to ensure
the value of nbd->config is binary (valid or NULL).

Also adding a debug message to check the reference counter
of nbd_config during module removal.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a73e853f5833..2ee1e376d5c4 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1518,15 +1518,20 @@ static struct nbd_config *nbd_alloc_config(void)
 {
 	struct nbd_config *config;
 
+	if (!try_module_get(THIS_MODULE))
+		return ERR_PTR(-ENODEV);
+
 	config = kzalloc(sizeof(struct nbd_config), GFP_NOFS);
-	if (!config)
-		return NULL;
+	if (!config) {
+		module_put(THIS_MODULE);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	atomic_set(&config->recv_threads, 0);
 	init_waitqueue_head(&config->recv_wq);
 	init_waitqueue_head(&config->conn_wait);
 	config->blksize_bits = NBD_DEF_BLKSIZE_BITS;
 	atomic_set(&config->live_connections, 0);
-	try_module_get(THIS_MODULE);
 	return config;
 }
 
@@ -1553,12 +1558,13 @@ static int nbd_open(struct block_device *bdev, fmode_t mode)
 			mutex_unlock(&nbd->config_lock);
 			goto out;
 		}
-		config = nbd->config = nbd_alloc_config();
-		if (!config) {
-			ret = -ENOMEM;
+		config = nbd_alloc_config();
+		if (IS_ERR(config)) {
+			ret = PTR_ERR(config);
 			mutex_unlock(&nbd->config_lock);
 			goto out;
 		}
+		nbd->config = config;
 		refcount_set(&nbd->config_refs, 1);
 		refcount_inc(&nbd->refs);
 		mutex_unlock(&nbd->config_lock);
@@ -1964,13 +1970,14 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 		nbd_put(nbd);
 		return -EINVAL;
 	}
-	config = nbd->config = nbd_alloc_config();
-	if (!nbd->config) {
+	config = nbd_alloc_config();
+	if (IS_ERR(config)) {
 		mutex_unlock(&nbd->config_lock);
 		nbd_put(nbd);
 		printk(KERN_ERR "nbd: couldn't allocate config\n");
-		return -ENOMEM;
+		return PTR_ERR(config);
 	}
+	nbd->config = config;
 	refcount_set(&nbd->config_refs, 1);
 	set_bit(NBD_RT_BOUND, &config->runtime_flags);
 
@@ -2543,6 +2550,9 @@ static void __exit nbd_cleanup(void)
 	while (!list_empty(&del_list)) {
 		nbd = list_first_entry(&del_list, struct nbd_device, list);
 		list_del_init(&nbd->list);
+		if (refcount_read(&nbd->config_refs))
+			printk(KERN_ERR "nbd: possibly leaking nbd_config (ref %d)\n",
+					refcount_read(&nbd->config_refs));
 		if (refcount_read(&nbd->refs) != 1)
 			printk(KERN_ERR "nbd: possibly leaking a device\n");
 		nbd_put(nbd);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
  2022-05-21  7:37 ` [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup() Yu Kuai
  2022-05-21  7:37 ` [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:12   ` Josef Bacik
  2022-05-21  7:37 ` [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device Yu Kuai
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

Otherwise io will hung because request will only be completed if the
cmd has the flag 'NBD_CMD_INFLIGHT'.

Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 2ee1e376d5c4..a0d0910dae2a 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -403,13 +403,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;
 
-	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
 		mutex_unlock(&cmd->lock);
 		return BLK_EH_DONE;
 	}
 
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
+		__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 		mutex_unlock(&cmd->lock);
 		goto done;
 	}
@@ -478,6 +479,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
 	set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
 	cmd->status = BLK_STS_IOERR;
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	mutex_unlock(&cmd->lock);
 	sock_shutdown(nbd);
 	nbd_config_put(nbd);
@@ -745,7 +747,7 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
 	cmd = blk_mq_rq_to_pdu(req);
 
 	mutex_lock(&cmd->lock);
-	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
 		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
 			tag, cmd->status, cmd->flags);
 		ret = -ENOENT;
@@ -854,8 +856,16 @@ static void recv_work(struct work_struct *work)
 		}
 
 		rq = blk_mq_rq_from_pdu(cmd);
-		if (likely(!blk_should_fake_timeout(rq->q)))
-			blk_mq_complete_request(rq);
+		if (likely(!blk_should_fake_timeout(rq->q))) {
+			bool complete;
+
+			mutex_lock(&cmd->lock);
+			complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
+							&cmd->flags);
+			mutex_unlock(&cmd->lock);
+			if (complete)
+				blk_mq_complete_request(rq);
+		}
 		percpu_ref_put(&q->q_usage_counter);
 	}
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
                   ` (2 preceding siblings ...)
  2022-05-21  7:37 ` [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:15   ` Josef Bacik
  2022-05-21  7:37 ` [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add() Yu Kuai
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

In our tests, "qemu-nbd" triggers a io hung:

INFO: task qemu-nbd:11445 blocked for more than 368 seconds.
      Not tainted 5.18.0-rc3-next-20220422-00003-g2176915513ca #884
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:qemu-nbd        state:D stack:    0 pid:11445 ppid:     1 flags:0x00000000
Call Trace:
 <TASK>
 __schedule+0x480/0x1050
 ? _raw_spin_lock_irqsave+0x3e/0xb0
 schedule+0x9c/0x1b0
 blk_mq_freeze_queue_wait+0x9d/0xf0
 ? ipi_rseq+0x70/0x70
 blk_mq_freeze_queue+0x2b/0x40
 nbd_add_socket+0x6b/0x270 [nbd]
 nbd_ioctl+0x383/0x510 [nbd]
 blkdev_ioctl+0x18e/0x3e0
 __x64_sys_ioctl+0xac/0x120
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fd8ff706577
RSP: 002b:00007fd8fcdfebf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000040000000 RCX: 00007fd8ff706577
RDX: 000000000000000d RSI: 000000000000ab00 RDI: 000000000000000f
RBP: 000000000000000f R08: 000000000000fbe8 R09: 000055fe497c62b0
R10: 00000002aff20000 R11: 0000000000000246 R12: 000000000000006d
R13: 0000000000000000 R14: 00007ffe82dc5e70 R15: 00007fd8fcdff9c0

"qemu-ndb -d" will call ioctl 'NBD_DISCONNECT' first, however, following
message was found:

block nbd0: Send disconnect failed -32

Which indicate that something is wrong with the server. Then,
"qemu-nbd -d" will call ioctl 'NBD_CLEAR_SOCK', however ioctl can't clear
requests after commit 2516ab1543fd("nbd: only clear the queue on device
teardown"). And in the meantime, request can't complete through timeout
because nbd_xmit_timeout() will always return 'BLK_EH_RESET_TIMER', which
means such request will never be completed in this situation.

Now that the flag 'NBD_CMD_INFLIGHT' can make sure requests won't
complete multiple times, switch back to call nbd_clear_sock() in
nbd_clear_sock_ioctl(), so that inflight requests can be cleared.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a0d0910dae2a..ec736cc52134 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1429,7 +1429,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd)
 static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
 				 struct block_device *bdev)
 {
-	sock_shutdown(nbd);
+	nbd_clear_sock(nbd);
 	__invalidate_device(bdev, true);
 	nbd_bdev_reset(nbd);
 	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
                   ` (3 preceding siblings ...)
  2022-05-21  7:37 ` [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:15   ` Josef Bacik
  2022-05-21  7:37 ` [PATCH -next v3 6/6] nbd: use pr_err to output error message Yu Kuai
  2022-05-28 12:20 ` [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Jens Axboe
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

From: Zhang Wensheng <zhangwensheng5@huawei.com>

When 'index' is a big numbers, it may become negative which forced
to 'int'. then 'index << part_shift' might overflow to a positive
value that is not greater than '0xfffff', then sysfs might complains
about duplicate creation. Because of this, move the 'index' judgment
to the front will fix it and be better.

Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices")
Fixes: 940c264984fd ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index ec736cc52134..349bc3da878d 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1814,17 +1814,7 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
 	refcount_set(&nbd->refs, 0);
 	INIT_LIST_HEAD(&nbd->list);
 	disk->major = NBD_MAJOR;
-
-	/* Too big first_minor can cause duplicate creation of
-	 * sysfs files/links, since index << part_shift might overflow, or
-	 * MKDEV() expect that the max bits of first_minor is 20.
-	 */
 	disk->first_minor = index << part_shift;
-	if (disk->first_minor < index || disk->first_minor > MINORMASK) {
-		err = -EINVAL;
-		goto out_free_work;
-	}
-
 	disk->minors = 1 << part_shift;
 	disk->fops = &nbd_fops;
 	disk->private_data = nbd;
@@ -1929,8 +1919,19 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 	if (!netlink_capable(skb, CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (info->attrs[NBD_ATTR_INDEX])
+	if (info->attrs[NBD_ATTR_INDEX]) {
 		index = nla_get_u32(info->attrs[NBD_ATTR_INDEX]);
+
+		/*
+		 * Too big first_minor can cause duplicate creation of
+		 * sysfs files/links, since index << part_shift might overflow, or
+		 * MKDEV() expect that the max bits of first_minor is 20.
+		 */
+		if (index < 0 || index > MINORMASK >> part_shift) {
+			printk(KERN_ERR "nbd: illegal input index %d\n", index);
+			return -EINVAL;
+		}
+	}
 	if (!info->attrs[NBD_ATTR_SOCKETS]) {
 		printk(KERN_ERR "nbd: must specify at least one socket\n");
 		return -EINVAL;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH -next v3 6/6] nbd: use pr_err to output error message
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
                   ` (4 preceding siblings ...)
  2022-05-21  7:37 ` [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add() Yu Kuai
@ 2022-05-21  7:37 ` Yu Kuai
  2022-05-23 14:16   ` Josef Bacik
  2022-05-28 12:20 ` [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Jens Axboe
  6 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-21  7:37 UTC (permalink / raw)
  To: josef, axboe, yukuai3, ming.lei; +Cc: linux-block, nbd, linux-kernel, yi.zhang

Instead of using the long printk(KERN_ERR "nbd: ...") to
output error message, defining pr_fmt and using
the short pr_err("") to do that. The replacemen is done
by using the following command:

  sed -i 's/printk(KERN_ERR "nbd: /pr_err("/g' \
		  drivers/block/nbd.c

This patch also rewrap to 80 columns where possible.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 349bc3da878d..07f3c139a3d7 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1928,16 +1928,16 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 		 * MKDEV() expect that the max bits of first_minor is 20.
 		 */
 		if (index < 0 || index > MINORMASK >> part_shift) {
-			printk(KERN_ERR "nbd: illegal input index %d\n", index);
+			pr_err("illegal input index %d\n", index);
 			return -EINVAL;
 		}
 	}
 	if (!info->attrs[NBD_ATTR_SOCKETS]) {
-		printk(KERN_ERR "nbd: must specify at least one socket\n");
+		pr_err("must specify at least one socket\n");
 		return -EINVAL;
 	}
 	if (!info->attrs[NBD_ATTR_SIZE_BYTES]) {
-		printk(KERN_ERR "nbd: must specify a size in bytes for the device\n");
+		pr_err("must specify a size in bytes for the device\n");
 		return -EINVAL;
 	}
 again:
@@ -1973,7 +1973,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 		nbd_put(nbd);
 		if (index == -1)
 			goto again;
-		printk(KERN_ERR "nbd: nbd%d already in use\n", index);
+		pr_err("nbd%d already in use\n", index);
 		return -EBUSY;
 	}
 	if (WARN_ON(nbd->config)) {
@@ -1985,7 +1985,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 	if (IS_ERR(config)) {
 		mutex_unlock(&nbd->config_lock);
 		nbd_put(nbd);
-		printk(KERN_ERR "nbd: couldn't allocate config\n");
+		pr_err("couldn't allocate config\n");
 		return PTR_ERR(config);
 	}
 	nbd->config = config;
@@ -2041,7 +2041,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 			struct nlattr *socks[NBD_SOCK_MAX+1];
 
 			if (nla_type(attr) != NBD_SOCK_ITEM) {
-				printk(KERN_ERR "nbd: socks must be embedded in a SOCK_ITEM attr\n");
+				pr_err("socks must be embedded in a SOCK_ITEM attr\n");
 				ret = -EINVAL;
 				goto out;
 			}
@@ -2050,7 +2050,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 							  nbd_sock_policy,
 							  info->extack);
 			if (ret != 0) {
-				printk(KERN_ERR "nbd: error processing sock list\n");
+				pr_err("error processing sock list\n");
 				ret = -EINVAL;
 				goto out;
 			}
@@ -2122,7 +2122,7 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
 		return -EPERM;
 
 	if (!info->attrs[NBD_ATTR_INDEX]) {
-		printk(KERN_ERR "nbd: must specify an index to disconnect\n");
+		pr_err("must specify an index to disconnect\n");
 		return -EINVAL;
 	}
 	index = nla_get_u32(info->attrs[NBD_ATTR_INDEX]);
@@ -2130,14 +2130,12 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
 	nbd = idr_find(&nbd_index_idr, index);
 	if (!nbd) {
 		mutex_unlock(&nbd_index_mutex);
-		printk(KERN_ERR "nbd: couldn't find device at index %d\n",
-		       index);
+		pr_err("couldn't find device at index %d\n", index);
 		return -EINVAL;
 	}
 	if (!refcount_inc_not_zero(&nbd->refs)) {
 		mutex_unlock(&nbd_index_mutex);
-		printk(KERN_ERR "nbd: device at index %d is going down\n",
-		       index);
+		pr_err("device at index %d is going down\n", index);
 		return -EINVAL;
 	}
 	mutex_unlock(&nbd_index_mutex);
@@ -2162,7 +2160,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 		return -EPERM;
 
 	if (!info->attrs[NBD_ATTR_INDEX]) {
-		printk(KERN_ERR "nbd: must specify a device to reconfigure\n");
+		pr_err("must specify a device to reconfigure\n");
 		return -EINVAL;
 	}
 	index = nla_get_u32(info->attrs[NBD_ATTR_INDEX]);
@@ -2170,8 +2168,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 	nbd = idr_find(&nbd_index_idr, index);
 	if (!nbd) {
 		mutex_unlock(&nbd_index_mutex);
-		printk(KERN_ERR "nbd: couldn't find a device at index %d\n",
-		       index);
+		pr_err("couldn't find a device at index %d\n", index);
 		return -EINVAL;
 	}
 	if (nbd->backend) {
@@ -2192,8 +2189,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 	}
 	if (!refcount_inc_not_zero(&nbd->refs)) {
 		mutex_unlock(&nbd_index_mutex);
-		printk(KERN_ERR "nbd: device at index %d is going down\n",
-		       index);
+		pr_err("device at index %d is going down\n", index);
 		return -EINVAL;
 	}
 	mutex_unlock(&nbd_index_mutex);
@@ -2257,7 +2253,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 			struct nlattr *socks[NBD_SOCK_MAX+1];
 
 			if (nla_type(attr) != NBD_SOCK_ITEM) {
-				printk(KERN_ERR "nbd: socks must be embedded in a SOCK_ITEM attr\n");
+				pr_err("socks must be embedded in a SOCK_ITEM attr\n");
 				ret = -EINVAL;
 				goto out;
 			}
@@ -2266,7 +2262,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 							  nbd_sock_policy,
 							  info->extack);
 			if (ret != 0) {
-				printk(KERN_ERR "nbd: error processing sock list\n");
+				pr_err("error processing sock list\n");
 				ret = -EINVAL;
 				goto out;
 			}
@@ -2483,7 +2479,7 @@ static int __init nbd_init(void)
 	BUILD_BUG_ON(sizeof(struct nbd_request) != 28);
 
 	if (max_part < 0) {
-		printk(KERN_ERR "nbd: max_part must be >= 0\n");
+		pr_err("max_part must be >= 0\n");
 		return -EINVAL;
 	}
 
@@ -2562,10 +2558,10 @@ static void __exit nbd_cleanup(void)
 		nbd = list_first_entry(&del_list, struct nbd_device, list);
 		list_del_init(&nbd->list);
 		if (refcount_read(&nbd->config_refs))
-			printk(KERN_ERR "nbd: possibly leaking nbd_config (ref %d)\n",
+			pr_err("possibly leaking nbd_config (ref %d)\n",
 					refcount_read(&nbd->config_refs));
 		if (refcount_read(&nbd->refs) != 1)
-			printk(KERN_ERR "nbd: possibly leaking a device\n");
+			pr_err("possibly leaking a device\n");
 		nbd_put(nbd);
 	}
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
  2022-05-21  7:37 ` [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed Yu Kuai
@ 2022-05-23 14:12   ` Josef Bacik
  2022-05-24  1:07     ` Yu Kuai
  0 siblings, 1 reply; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:12 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:46PM +0800, Yu Kuai wrote:
> Otherwise io will hung because request will only be completed if the
> cmd has the flag 'NBD_CMD_INFLIGHT'.
> 
> Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 2ee1e376d5c4..a0d0910dae2a 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -403,13 +403,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>  	if (!mutex_trylock(&cmd->lock))
>  		return BLK_EH_RESET_TIMER;
>  
> -	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
> +	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>  		mutex_unlock(&cmd->lock);
>  		return BLK_EH_DONE;
>  	}
>  
>  	if (!refcount_inc_not_zero(&nbd->config_refs)) {
>  		cmd->status = BLK_STS_TIMEOUT;
> +		__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>  		mutex_unlock(&cmd->lock);
>  		goto done;
>  	}
> @@ -478,6 +479,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>  	dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
>  	set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
>  	cmd->status = BLK_STS_IOERR;
> +	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>  	mutex_unlock(&cmd->lock);
>  	sock_shutdown(nbd);
>  	nbd_config_put(nbd);
> @@ -745,7 +747,7 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
>  	cmd = blk_mq_rq_to_pdu(req);
>  
>  	mutex_lock(&cmd->lock);
> -	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
> +	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>  		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
>  			tag, cmd->status, cmd->flags);
>  		ret = -ENOENT;
> @@ -854,8 +856,16 @@ static void recv_work(struct work_struct *work)
>  		}
>  
>  		rq = blk_mq_rq_from_pdu(cmd);
> -		if (likely(!blk_should_fake_timeout(rq->q)))
> -			blk_mq_complete_request(rq);
> +		if (likely(!blk_should_fake_timeout(rq->q))) {
> +			bool complete;
> +
> +			mutex_lock(&cmd->lock);
> +			complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
> +							&cmd->flags);
> +			mutex_unlock(&cmd->lock);
> +			if (complete)
> +				blk_mq_complete_request(rq);
> +		}

I'd rather this be handled in nbd_handle_reply.  We should return with it
cleared if it's ready to be completed.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup()
  2022-05-21  7:37 ` [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup() Yu Kuai
@ 2022-05-23 14:13   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:13 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:44PM +0800, Yu Kuai wrote:
> Otherwise there may be race between module removal and the handling of
> netlink command, which can lead to the oops as shown below:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000098
>   Oops: 0002 [#1] SMP PTI
>   CPU: 1 PID: 31299 Comm: nbd-client Tainted: G            E     5.14.0-rc4
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>   RIP: 0010:down_write+0x1a/0x50
>   Call Trace:
>    start_creating+0x89/0x130
>    debugfs_create_dir+0x1b/0x130
>    nbd_start_device+0x13d/0x390 [nbd]
>    nbd_genl_connect+0x42f/0x748 [nbd]
>    genl_family_rcv_msg_doit.isra.0+0xec/0x150
>    genl_rcv_msg+0xe5/0x1e0
>    netlink_rcv_skb+0x55/0x100
>    genl_rcv+0x29/0x40
>    netlink_unicast+0x1a8/0x250
>    netlink_sendmsg+0x21b/0x430
>    ____sys_sendmsg+0x2a4/0x2d0
>    ___sys_sendmsg+0x81/0xc0
>    __sys_sendmsg+0x62/0xb0
>    __x64_sys_sendmsg+0x1f/0x30
>    do_syscall_64+0x3b/0xc0
>    entry_SYSCALL_64_after_hwframe+0x44/0xae
>   Modules linked in: nbd(E-)
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal
  2022-05-21  7:37 ` [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal Yu Kuai
@ 2022-05-23 14:14   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:14 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:45PM +0800, Yu Kuai wrote:
> When nbd module is being removing, nbd_alloc_config() may be
> called concurrently by nbd_genl_connect(), although try_module_get()
> will return false, but nbd_alloc_config() doesn't handle it.
> 
> The race may lead to the leak of nbd_config and its related
> resources (e.g, recv_workq) and oops in nbd_read_stat() due
> to the unload of nbd module as shown below:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000040
>   Oops: 0000 [#1] SMP PTI
>   CPU: 5 PID: 13840 Comm: kworker/u17:33 Not tainted 5.14.0+ #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>   Workqueue: knbd16-recv recv_work [nbd]
>   RIP: 0010:nbd_read_stat.cold+0x130/0x1a4 [nbd]
>   Call Trace:
>    recv_work+0x3b/0xb0 [nbd]
>    process_one_work+0x1ed/0x390
>    worker_thread+0x4a/0x3d0
>    kthread+0x12a/0x150
>    ret_from_fork+0x22/0x30
> 
> Fixing it by checking the return value of try_module_get()
> in nbd_alloc_config(). As nbd_alloc_config() may return ERR_PTR(-ENODEV),
> assign nbd->config only when nbd_alloc_config() succeeds to ensure
> the value of nbd->config is binary (valid or NULL).
> 
> Also adding a debug message to check the reference counter
> of nbd_config during module removal.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device
  2022-05-21  7:37 ` [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device Yu Kuai
@ 2022-05-23 14:15   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:15 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:47PM +0800, Yu Kuai wrote:
> In our tests, "qemu-nbd" triggers a io hung:
> 
> INFO: task qemu-nbd:11445 blocked for more than 368 seconds.
>       Not tainted 5.18.0-rc3-next-20220422-00003-g2176915513ca #884
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:qemu-nbd        state:D stack:    0 pid:11445 ppid:     1 flags:0x00000000
> Call Trace:
>  <TASK>
>  __schedule+0x480/0x1050
>  ? _raw_spin_lock_irqsave+0x3e/0xb0
>  schedule+0x9c/0x1b0
>  blk_mq_freeze_queue_wait+0x9d/0xf0
>  ? ipi_rseq+0x70/0x70
>  blk_mq_freeze_queue+0x2b/0x40
>  nbd_add_socket+0x6b/0x270 [nbd]
>  nbd_ioctl+0x383/0x510 [nbd]
>  blkdev_ioctl+0x18e/0x3e0
>  __x64_sys_ioctl+0xac/0x120
>  do_syscall_64+0x35/0x80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7fd8ff706577
> RSP: 002b:00007fd8fcdfebf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000040000000 RCX: 00007fd8ff706577
> RDX: 000000000000000d RSI: 000000000000ab00 RDI: 000000000000000f
> RBP: 000000000000000f R08: 000000000000fbe8 R09: 000055fe497c62b0
> R10: 00000002aff20000 R11: 0000000000000246 R12: 000000000000006d
> R13: 0000000000000000 R14: 00007ffe82dc5e70 R15: 00007fd8fcdff9c0
> 
> "qemu-ndb -d" will call ioctl 'NBD_DISCONNECT' first, however, following
> message was found:
> 
> block nbd0: Send disconnect failed -32
> 
> Which indicate that something is wrong with the server. Then,
> "qemu-nbd -d" will call ioctl 'NBD_CLEAR_SOCK', however ioctl can't clear
> requests after commit 2516ab1543fd("nbd: only clear the queue on device
> teardown"). And in the meantime, request can't complete through timeout
> because nbd_xmit_timeout() will always return 'BLK_EH_RESET_TIMER', which
> means such request will never be completed in this situation.
> 
> Now that the flag 'NBD_CMD_INFLIGHT' can make sure requests won't
> complete multiple times, switch back to call nbd_clear_sock() in
> nbd_clear_sock_ioctl(), so that inflight requests can be cleared.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
  2022-05-21  7:37 ` [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add() Yu Kuai
@ 2022-05-23 14:15   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:15 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:48PM +0800, Yu Kuai wrote:
> From: Zhang Wensheng <zhangwensheng5@huawei.com>
> 
> When 'index' is a big numbers, it may become negative which forced
> to 'int'. then 'index << part_shift' might overflow to a positive
> value that is not greater than '0xfffff', then sysfs might complains
> about duplicate creation. Because of this, move the 'index' judgment
> to the front will fix it and be better.
> 
> Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices")
> Fixes: 940c264984fd ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()")
> Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 6/6] nbd: use pr_err to output error message
  2022-05-21  7:37 ` [PATCH -next v3 6/6] nbd: use pr_err to output error message Yu Kuai
@ 2022-05-23 14:16   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2022-05-23 14:16 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

On Sat, May 21, 2022 at 03:37:49PM +0800, Yu Kuai wrote:
> Instead of using the long printk(KERN_ERR "nbd: ...") to
> output error message, defining pr_fmt and using
> the short pr_err("") to do that. The replacemen is done
> by using the following command:
> 
>   sed -i 's/printk(KERN_ERR "nbd: /pr_err("/g' \
> 		  drivers/block/nbd.c
> 
> This patch also rewrap to 80 columns where possible.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
  2022-05-23 14:12   ` Josef Bacik
@ 2022-05-24  1:07     ` Yu Kuai
  2022-05-24  1:51       ` Yu Kuai
  0 siblings, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-05-24  1:07 UTC (permalink / raw)
  To: Josef Bacik; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

在 2022/05/23 22:12, Josef Bacik 写道:
> On Sat, May 21, 2022 at 03:37:46PM +0800, Yu Kuai wrote:
>> Otherwise io will hung because request will only be completed if the
>> cmd has the flag 'NBD_CMD_INFLIGHT'.
>>
>> Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/block/nbd.c | 18 ++++++++++++++----
>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 2ee1e376d5c4..a0d0910dae2a 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -403,13 +403,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>>   	if (!mutex_trylock(&cmd->lock))
>>   		return BLK_EH_RESET_TIMER;
>>   
>> -	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>> +	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>   		mutex_unlock(&cmd->lock);
>>   		return BLK_EH_DONE;
>>   	}
>>   
>>   	if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>   		cmd->status = BLK_STS_TIMEOUT;
>> +		__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>   		mutex_unlock(&cmd->lock);
>>   		goto done;
>>   	}
>> @@ -478,6 +479,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>>   	dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
>>   	set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
>>   	cmd->status = BLK_STS_IOERR;
>> +	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>   	mutex_unlock(&cmd->lock);
>>   	sock_shutdown(nbd);
>>   	nbd_config_put(nbd);
>> @@ -745,7 +747,7 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
>>   	cmd = blk_mq_rq_to_pdu(req);
>>   
>>   	mutex_lock(&cmd->lock);
>> -	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>> +	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>   		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
>>   			tag, cmd->status, cmd->flags);
>>   		ret = -ENOENT;
>> @@ -854,8 +856,16 @@ static void recv_work(struct work_struct *work)
>>   		}
>>   
>>   		rq = blk_mq_rq_from_pdu(cmd);
>> -		if (likely(!blk_should_fake_timeout(rq->q)))
>> -			blk_mq_complete_request(rq);
>> +		if (likely(!blk_should_fake_timeout(rq->q))) {
>> +			bool complete;
>> +
>> +			mutex_lock(&cmd->lock);
>> +			complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
>> +							&cmd->flags);
>> +			mutex_unlock(&cmd->lock);
>> +			if (complete)
>> +				blk_mq_complete_request(rq);
>> +		}
> 
> I'd rather this be handled in nbd_handle_reply.  We should return with it
> cleared if it's ready to be completed.  Thanks,
Hi,

Thanks for your advice, I'll do that in next version. I'll still have to
hold the lock to set the bit again in case blk_should_fake_timeout()
pass...

Thanks,
Kuai
> 
> Josef
> .
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
  2022-05-24  1:07     ` Yu Kuai
@ 2022-05-24  1:51       ` Yu Kuai
  0 siblings, 0 replies; 16+ messages in thread
From: Yu Kuai @ 2022-05-24  1:51 UTC (permalink / raw)
  To: Josef Bacik; +Cc: axboe, ming.lei, linux-block, nbd, linux-kernel, yi.zhang

在 2022/05/24 9:07, Yu Kuai 写道:
> 在 2022/05/23 22:12, Josef Bacik 写道:
>> On Sat, May 21, 2022 at 03:37:46PM +0800, Yu Kuai wrote:
>>> Otherwise io will hung because request will only be completed if the
>>> cmd has the flag 'NBD_CMD_INFLIGHT'.
>>>
>>> Fixes: 07175cb1baf4 ("nbd: make sure request completion won't 
>>> concurrent")
>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>> ---
>>>   drivers/block/nbd.c | 18 ++++++++++++++----
>>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>> index 2ee1e376d5c4..a0d0910dae2a 100644
>>> --- a/drivers/block/nbd.c
>>> +++ b/drivers/block/nbd.c
>>> @@ -403,13 +403,14 @@ static enum blk_eh_timer_return 
>>> nbd_xmit_timeout(struct request *req,
>>>       if (!mutex_trylock(&cmd->lock))
>>>           return BLK_EH_RESET_TIMER;
>>> -    if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>> +    if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>>           mutex_unlock(&cmd->lock);
>>>           return BLK_EH_DONE;
>>>       }
>>>       if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>>           cmd->status = BLK_STS_TIMEOUT;
>>> +        __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>>           mutex_unlock(&cmd->lock);
>>>           goto done;
>>>       }
>>> @@ -478,6 +479,7 @@ static enum blk_eh_timer_return 
>>> nbd_xmit_timeout(struct request *req,
>>>       dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
>>>       set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
>>>       cmd->status = BLK_STS_IOERR;
>>> +    __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>>       mutex_unlock(&cmd->lock);
>>>       sock_shutdown(nbd);
>>>       nbd_config_put(nbd);
>>> @@ -745,7 +747,7 @@ static struct nbd_cmd *nbd_handle_reply(struct 
>>> nbd_device *nbd, int index,
>>>       cmd = blk_mq_rq_to_pdu(req);
>>>       mutex_lock(&cmd->lock);
>>> -    if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>> +    if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>>           dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d 
>>> (status %u flags %lu)",
>>>               tag, cmd->status, cmd->flags);
>>>           ret = -ENOENT;
>>> @@ -854,8 +856,16 @@ static void recv_work(struct work_struct *work)
>>>           }
>>>           rq = blk_mq_rq_from_pdu(cmd);
>>> -        if (likely(!blk_should_fake_timeout(rq->q)))
>>> -            blk_mq_complete_request(rq);
>>> +        if (likely(!blk_should_fake_timeout(rq->q))) {
>>> +            bool complete;
>>> +
>>> +            mutex_lock(&cmd->lock);
>>> +            complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
>>> +                            &cmd->flags);
>>> +            mutex_unlock(&cmd->lock);
>>> +            if (complete)
>>> +                blk_mq_complete_request(rq);
>>> +        }
>>
>> I'd rather this be handled in nbd_handle_reply.  We should return with it
>> cleared if it's ready to be completed.  Thanks,
> Hi,
> 
> Thanks for your advice, I'll do that in next version. I'll still have to
> hold the lock to set the bit again in case blk_should_fake_timeout()
> pass...

Hi, Josef

I just found out that this way is problematic:
t1:			t2:
recv_work
  nbd_handle_reply
   __clear_bit
			nbd_xmit_timeout
			 test_bit(NBD_CMD_INFLIGHT, &cmd->flags) -> fail
			 return BLK_EH_DONE -> rq can't complete
  blk_should_fake_timeout -> true
  __set_bit

__clear_bit and then __set_bit from recv_work leaves a window, and
concurrent nbd_xmit_timeout() may lead to that request can't be
completed through both timeout and recv_work().

Do you think it's ok to keep the current implementation with some
comments to explain the above scenario?

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v3 0/6] nbd: bugfix and cleanup patches
  2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
                   ` (5 preceding siblings ...)
  2022-05-21  7:37 ` [PATCH -next v3 6/6] nbd: use pr_err to output error message Yu Kuai
@ 2022-05-28 12:20 ` Jens Axboe
  6 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2022-05-28 12:20 UTC (permalink / raw)
  To: yukuai3, ming.lei, josef; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On Sat, 21 May 2022 15:37:43 +0800, Yu Kuai wrote:
> Changes in v3:
>  - rewrap to 80 columns where possible in patch 6
> Changes in v2:
>  - in patch 3, instead of clear and then reset the flag if rq is not
>  completed, test first and clear if rq is going to complete.
> 
> path 1-2 fix races between nbd setup and module removal.
> patch 3 fix io can't be completed in some error path.
> patch 4 fix io hung when disconnecting failed.
> patch 5 fix sysfs warning about duplicate creation.
> patch 6 use pr_err to output error message.
> 
> [...]

Applied, thanks!

[1/6] nbd: call genl_unregister_family() first in nbd_cleanup()
      commit: 06c4da89c24e7023ea448cadf8e9daf06a0aae6e
[2/6] nbd: fix race between nbd_alloc_config() and module removal
      commit: c55b2b983b0fa012942c3eb16384b2b722caa810
[3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
      commit: 2895f1831e911ca87d4efdf43e35eb72a0c7e66e
[4/6] nbd: fix io hung while disconnecting device
      commit: 09dadb5985023e27d4740ebd17e6fea4640110e5
[5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
      commit: 858f1bf65d3d9c00b5e2d8ca87dc79ed88267c98
[6/6] nbd: use pr_err to output error message
      commit: 1243172d5894e2d8f277ee3c278180792de5c521

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-05-28 12:20 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-21  7:37 [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Yu Kuai
2022-05-21  7:37 ` [PATCH -next v3 1/6] nbd: call genl_unregister_family() first in nbd_cleanup() Yu Kuai
2022-05-23 14:13   ` Josef Bacik
2022-05-21  7:37 ` [PATCH -next v3 2/6] nbd: fix race between nbd_alloc_config() and module removal Yu Kuai
2022-05-23 14:14   ` Josef Bacik
2022-05-21  7:37 ` [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed Yu Kuai
2022-05-23 14:12   ` Josef Bacik
2022-05-24  1:07     ` Yu Kuai
2022-05-24  1:51       ` Yu Kuai
2022-05-21  7:37 ` [PATCH -next v3 4/6] nbd: fix io hung while disconnecting device Yu Kuai
2022-05-23 14:15   ` Josef Bacik
2022-05-21  7:37 ` [PATCH -next v3 5/6] nbd: fix possible overflow on 'first_minor' in nbd_dev_add() Yu Kuai
2022-05-23 14:15   ` Josef Bacik
2022-05-21  7:37 ` [PATCH -next v3 6/6] nbd: use pr_err to output error message Yu Kuai
2022-05-23 14:16   ` Josef Bacik
2022-05-28 12:20 ` [PATCH -next v3 0/6] nbd: bugfix and cleanup patches Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).