linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch v8 0/7] handle unexpected message from server
@ 2021-09-16  9:33 Yu Kuai
  2021-09-16  9:33 ` [patch v8 1/7] nbd: don't handle response without a corresponding request message Yu Kuai
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

This patch set tries to fix that client might oops if nbd server send
unexpected message to client, for example, our syzkaller report a uaf
in nbd_read_stat():

Call trace:
 dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
 show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x144/0x1b4 lib/dump_stack.c:118
 print_address_description+0x68/0x2d0 mm/kasan/report.c:253
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x134/0x2f0 mm/kasan/report.c:409
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
 __read_once_size include/linux/compiler.h:193 [inline]
 blk_mq_rq_state block/blk-mq.h:106 [inline]
 blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
 nbd_read_stat drivers/block/nbd.c:670 [inline]
 recv_work+0x1bc/0x890 drivers/block/nbd.c:749
 process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
 worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
 kthread+0x1d8/0x1e0 kernel/kthread.c:255
 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tag->static_rq[internel_tag]

So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
to the request: sched_tags->static_rq[internal_tag]. Even if the
io is finished.

2) nbd server send a reply with random tag directly:

recv_work
 nbd_read_stat
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) access rq before clearing rq mapping
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here

4) Then, nbd continue to use the freed request in nbd_read_stat()

Changes in v8:
 - add patch 5 to this series.
 - modify some words.
Changes in v7:
 - instead of exposing blk_queue_exit(), using percpu_ref_put()
 directly.
 - drop the ref right after nbd_handle_reply().
Changes in v6:
 - don't set cmd->status to error if request is completed before
 nbd_clear_req().
 - get 'q_usage_counter' to prevent accessing freed request through
 blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
Changes in v5:
 - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
 - add some comment in patch 5
Changes in v4:
 - change the name of the patchset, since uaf is not the only problem
 if server send unexpected reply message.
 - instead of adding new interface, use blk_mq_find_and_get_req().
 - add patch 5 to this series
Changes in v3:
 - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
 - modify descriptions.
 - patch 5 is just a cleanup
Changes in v2:
 - as Bart suggested, add a new helper function for drivers to get
 request by tag.

Yu Kuai (7):
  nbd: don't handle response without a corresponding request message
  nbd: make sure request completion won't concurrent
  nbd: check sock index in nbd_read_stat()
  nbd: don't start request if nbd_queue_rq() failed
  nbd: clean up return value checking of sock_xmit()
  nbd: partition nbd_read_stat() into nbd_read_reply() and
    nbd_handle_reply()
  nbd: fix uaf in nbd_handle_reply()

 drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
 1 file changed, 96 insertions(+), 39 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 1/7] nbd: don't handle response without a corresponding request message
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16  9:33 ` [patch v8 2/7] nbd: make sure request completion won't concurrent Yu Kuai
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

While handling a response message from server, nbd_read_stat() will
try to get request by tag, and then complete the request. However,
this is problematic if nbd haven't sent a corresponding request
message:

t1                      t2
                        submit_bio
                         nbd_queue_rq
                          blk_mq_start_request
recv_work
 nbd_read_stat
  blk_mq_tag_to_rq
 blk_mq_complete_request
                          nbd_send_cmd

Thus add a new cmd flag 'NBD_CMD_INFLIGHT', it will be set in
nbd_send_cmd() and checked in nbd_read_stat().

Noted that this patch can't fix that blk_mq_tag_to_rq() might
return a freed request, and this will be fixed in following
patches.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/nbd.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..23ded569e8d3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -126,6 +126,12 @@ struct nbd_device {
 };
 
 #define NBD_CMD_REQUEUED	1
+/*
+ * This flag will be set if nbd_queue_rq() succeed, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT	2
 
 struct nbd_cmd {
 	struct nbd_device *nbd;
@@ -400,6 +406,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;
 
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -729,6 +736,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	cmd = blk_mq_rq_to_pdu(req);
 
 	mutex_lock(&cmd->lock);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+			tag, cmd->status, cmd->flags);
+		ret = -ENOENT;
+		goto out;
+	}
 	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
 		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
 			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -828,6 +841,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;
 
 	mutex_lock(&cmd->lock);
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);
 
@@ -964,7 +978,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	 * returns EAGAIN can be retried on a different socket.
 	 */
 	ret = nbd_send_cmd(nbd, cmd, index);
-	if (ret == -EAGAIN) {
+	/*
+	 * Access to this flag is protected by cmd->lock, thus it's safe to set
+	 * the flag after nbd_send_cmd() succeed to send request to server.
+	 */
+	if (!ret)
+		__set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	else if (ret == -EAGAIN) {
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Request send failed, requeueing\n");
 		nbd_mark_nsock_dead(nbd, nsock, 1);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 2/7] nbd: make sure request completion won't concurrent
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
  2021-09-16  9:33 ` [patch v8 1/7] nbd: don't handle response without a corresponding request message Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16  9:33 ` [patch v8 3/7] nbd: check sock index in nbd_read_stat() Yu Kuai
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

commit cddce0116058 ("nbd: Aovid double completion of a request")
try to fix that nbd_clear_que() and recv_work() can complete a
request concurrently. However, the problem still exists:

t1                    t2                     t3

nbd_disconnect_and_put
 flush_workqueue
                      recv_work
                       blk_mq_complete_request
                        blk_mq_complete_request_remote -> this is true
                         WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
                          blk_mq_raise_softirq
                                             blk_done_softirq
                                              blk_complete_reqs
                                               nbd_complete_rq
                                                blk_mq_end_request
                                                 blk_mq_free_request
                                                  WRITE_ONCE(rq->state, MQ_RQ_IDLE)
  nbd_clear_que
   blk_mq_tagset_busy_iter
    nbd_clear_req
                                                   __blk_mq_free_request
                                                    blk_mq_put_tag
     blk_mq_complete_request -> complete again

There are three places where request can be completed in nbd:
recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
all hold cmd->lock before completing the request, it's easy to
avoid the problem by setting and checking a cmd flag.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/nbd.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 23ded569e8d3..614c6ab2b8fe 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;
 
-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return BLK_EH_DONE;
+	}
+
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -841,7 +845,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;
 
 	mutex_lock(&cmd->lock);
-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return true;
+	}
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
  2021-09-16  9:33 ` [patch v8 1/7] nbd: don't handle response without a corresponding request message Yu Kuai
  2021-09-16  9:33 ` [patch v8 2/7] nbd: make sure request completion won't concurrent Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-19 10:34   ` yukuai (C)
  2021-09-23  0:29   ` Ming Lei
  2021-09-16  9:33 ` [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

The sock that clent send request in nbd_send_cmd() and receive reply
in nbd_read_stat() should be the same.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 614c6ab2b8fe..c724a5bd7fa4 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 		ret = -ENOENT;
 		goto out;
 	}
+	if (cmd->index != index) {
+		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
+			tag, index, cmd->index);
+	}
 	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
 		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
 			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (2 preceding siblings ...)
  2021-09-16  9:33 ` [patch v8 3/7] nbd: check sock index in nbd_read_stat() Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16 12:48   ` Ming Lei
  2021-09-16  9:33 ` [patch v8 5/7] nbd: clean up return value checking of sock_xmit() Yu Kuai
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

commit 6a468d5990ec ("nbd: don't start req until after the dead
connection logic") move blk_mq_start_request() from nbd_queue_rq()
to nbd_handle_cmd() to skip starting request if the connection is
dead. However, request is still started in other error paths.

Currently, blk_mq_end_request() will be called immediately if
nbd_queue_rq() failed, thus start request in such situation is
useless. So remove blk_mq_start_request() from error paths in
nbd_handle_cmd().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index c724a5bd7fa4..22c91d8901f6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -934,7 +934,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Socks array is empty\n");
-		blk_mq_start_request(req);
 		return -EINVAL;
 	}
 	config = nbd->config;
@@ -943,7 +942,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Attempted send on invalid socket\n");
 		nbd_config_put(nbd);
-		blk_mq_start_request(req);
 		return -EINVAL;
 	}
 	cmd->status = BLK_STS_OK;
@@ -967,7 +965,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 			 */
 			sock_shutdown(nbd);
 			nbd_config_put(nbd);
-			blk_mq_start_request(req);
 			return -EIO;
 		}
 		goto again;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 5/7] nbd: clean up return value checking of sock_xmit()
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (3 preceding siblings ...)
  2021-09-16  9:33 ` [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16 12:49   ` Ming Lei
  2021-09-16  9:33 ` [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

Check if sock_xmit() return 0 is useless because it'll never return
0, comment it and remove such checkings.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22c91d8901f6..7f9e030e41ce 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -490,7 +490,8 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 }
 
 /*
- *  Send or receive packet.
+ *  Send or receive packet. Return a positive value on success and
+ *  negtive value on failue, and never return 0.
  */
 static int sock_xmit(struct nbd_device *nbd, int index, int send,
 		     struct iov_iter *iter, int msg_flags, int *sent)
@@ -616,7 +617,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 	result = sock_xmit(nbd, index, 1, &from,
 			(type == NBD_CMD_WRITE) ? MSG_MORE : 0, &sent);
 	trace_nbd_header_sent(req, handle);
-	if (result <= 0) {
+	if (result < 0) {
 		if (was_interrupted(result)) {
 			/* If we havne't sent anything we can just return BUSY,
 			 * however if we have sent something we need to make
@@ -660,7 +661,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 				skip = 0;
 			}
 			result = sock_xmit(nbd, index, 1, &from, flags, &sent);
-			if (result <= 0) {
+			if (result < 0) {
 				if (was_interrupted(result)) {
 					/* We've already sent the header, we
 					 * have no choice but to set pending and
@@ -712,7 +713,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	reply.magic = 0;
 	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
 	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
-	if (result <= 0) {
+	if (result < 0) {
 		if (!nbd_disconnected(config))
 			dev_err(disk_to_dev(nbd->disk),
 				"Receive control failed (result %d)\n", result);
@@ -783,7 +784,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 		rq_for_each_segment(bvec, req, iter) {
 			iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
 			result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
-			if (result <= 0) {
+			if (result < 0) {
 				dev_err(disk_to_dev(nbd->disk), "Receive data failed (result %d)\n",
 					result);
 				/*
@@ -1229,7 +1230,7 @@ static void send_disconnects(struct nbd_device *nbd)
 		iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
 		mutex_lock(&nsock->tx_lock);
 		ret = sock_xmit(nbd, i, 1, &from, 0, NULL);
-		if (ret <= 0)
+		if (ret < 0)
 			dev_err(disk_to_dev(nbd->disk),
 				"Send disconnect failed %d\n", ret);
 		mutex_unlock(&nsock->tx_lock);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (4 preceding siblings ...)
  2021-09-16  9:33 ` [patch v8 5/7] nbd: clean up return value checking of sock_xmit() Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16 12:53   ` Ming Lei
  2021-09-16  9:33 ` [patch v8 7/7] nbd: fix uaf in nbd_handle_reply() Yu Kuai
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

Prepare to fix uaf in nbd_read_stat(), no functional changes.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 74 +++++++++++++++++++++++++++------------------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7f9e030e41ce..69dc5eac9ad3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -695,38 +695,45 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 	return 0;
 }
 
-/* NULL returned = something went wrong, inform userspace */
-static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
+static int nbd_read_reply(struct nbd_device *nbd, int index,
+			  struct nbd_reply *reply)
 {
-	struct nbd_config *config = nbd->config;
-	int result;
-	struct nbd_reply reply;
-	struct nbd_cmd *cmd;
-	struct request *req = NULL;
-	u64 handle;
-	u16 hwq;
-	u32 tag;
-	struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
+	struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
 	struct iov_iter to;
-	int ret = 0;
+	int result;
 
-	reply.magic = 0;
-	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
+	reply->magic = 0;
+	iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
 	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
 	if (result < 0) {
-		if (!nbd_disconnected(config))
+		if (!nbd_disconnected(nbd->config))
 			dev_err(disk_to_dev(nbd->disk),
 				"Receive control failed (result %d)\n", result);
-		return ERR_PTR(result);
+		return result;
 	}
 
-	if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
+	if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
 		dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
-				(unsigned long)ntohl(reply.magic));
-		return ERR_PTR(-EPROTO);
+				(unsigned long)ntohl(reply->magic));
+		return -EPROTO;
 	}
 
-	memcpy(&handle, reply.handle, sizeof(handle));
+	return 0;
+}
+
+/* NULL returned = something went wrong, inform userspace */
+static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+					struct nbd_reply *reply)
+{
+	int result;
+	struct nbd_cmd *cmd;
+	struct request *req = NULL;
+	u64 handle;
+	u16 hwq;
+	u32 tag;
+	int ret = 0;
+
+	memcpy(&handle, reply->handle, sizeof(handle));
 	tag = nbd_handle_to_tag(handle);
 	hwq = blk_mq_unique_tag_to_hwq(tag);
 	if (hwq < nbd->tag_set.nr_hw_queues)
@@ -769,9 +776,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 		ret = -ENOENT;
 		goto out;
 	}
-	if (ntohl(reply.error)) {
+	if (ntohl(reply->error)) {
 		dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
-			ntohl(reply.error));
+			ntohl(reply->error));
 		cmd->status = BLK_STS_IOERR;
 		goto out;
 	}
@@ -780,6 +787,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	if (rq_data_dir(req) != WRITE) {
 		struct req_iterator iter;
 		struct bio_vec bvec;
+		struct iov_iter to;
 
 		rq_for_each_segment(bvec, req, iter) {
 			iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
@@ -793,7 +801,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 				 * and let the timeout stuff handle resubmitting
 				 * this request onto another connection.
 				 */
-				if (nbd_disconnected(config)) {
+				if (nbd_disconnected(nbd->config)) {
 					cmd->status = BLK_STS_IOERR;
 					goto out;
 				}
@@ -817,24 +825,30 @@ static void recv_work(struct work_struct *work)
 						     work);
 	struct nbd_device *nbd = args->nbd;
 	struct nbd_config *config = nbd->config;
+	struct nbd_sock *nsock;
 	struct nbd_cmd *cmd;
 	struct request *rq;
 
 	while (1) {
-		cmd = nbd_read_stat(nbd, args->index);
-		if (IS_ERR(cmd)) {
-			struct nbd_sock *nsock = config->socks[args->index];
+		struct nbd_reply reply;
 
-			mutex_lock(&nsock->tx_lock);
-			nbd_mark_nsock_dead(nbd, nsock, 1);
-			mutex_unlock(&nsock->tx_lock);
+		if (nbd_read_reply(nbd, args->index, &reply))
+			break;
+
+		cmd = nbd_handle_reply(nbd, args->index, &reply);
+		if (IS_ERR(cmd))
 			break;
-		}
 
 		rq = blk_mq_rq_from_pdu(cmd);
 		if (likely(!blk_should_fake_timeout(rq->q)))
 			blk_mq_complete_request(rq);
 	}
+
+	nsock = config->socks[args->index];
+	mutex_lock(&nsock->tx_lock);
+	nbd_mark_nsock_dead(nbd, nsock, 1);
+	mutex_unlock(&nsock->tx_lock);
+
 	nbd_config_put(nbd);
 	atomic_dec(&config->recv_threads);
 	wake_up(&config->recv_wq);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (5 preceding siblings ...)
  2021-09-16  9:33 ` [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
@ 2021-09-16  9:33 ` Yu Kuai
  2021-09-16 12:58   ` Ming Lei
  2021-09-16 14:18   ` [PATCH v9] " Yu Kuai
  2021-09-23 13:33 ` [patch v8 0/7] handle unexpected message from server yukuai (C)
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16  9:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

There is a problem that nbd_handle_reply() might access freed request:

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tag->static_rq[internel_tag]

So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
to the request: sched_tags->static_rq[internal_tag]. Even if the
io is finished.

2) nbd server send a reply with random tag directly:

recv_work
 nbd_handle_reply
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) access rq before clearing rq mapping
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here

4) Then, nbd continue to use the freed request in nbd_handle_reply

Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
thus request is ensured not to be freed because 'q_usage_counter' is
not zero.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 69dc5eac9ad3..b3a47fc6237f 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
 						     work);
 	struct nbd_device *nbd = args->nbd;
 	struct nbd_config *config = nbd->config;
+	struct request_queue *q = nbd->disk->queue;
 	struct nbd_sock *nsock;
 	struct nbd_cmd *cmd;
 	struct request *rq;
@@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
 		if (nbd_read_reply(nbd, args->index, &reply))
 			break;
 
+		/*
+		 * Grab .q_usage_counter so request pool won't go away, then no
+		 * request use-after-free is possible during nbd_handle_reply().
+		 * If queue is frozen, there won't be any inflight requests, we
+		 * needn't to handle the incoming garbage message.
+		 */
+		if (!percpu_ref_tryget(&q->q_usage_counter)) {
+			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
+				__func__);
+			break;
+		}
+
 		cmd = nbd_handle_reply(nbd, args->index, &reply);
+		percpu_ref_put(&q->q_usage_counter);
 		if (IS_ERR(cmd))
 			break;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed
  2021-09-16  9:33 ` [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
@ 2021-09-16 12:48   ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-16 12:48 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:47PM +0800, Yu Kuai wrote:
> commit 6a468d5990ec ("nbd: don't start req until after the dead
> connection logic") move blk_mq_start_request() from nbd_queue_rq()
> to nbd_handle_cmd() to skip starting request if the connection is
> dead. However, request is still started in other error paths.
> 
> Currently, blk_mq_end_request() will be called immediately if
> nbd_queue_rq() failed, thus start request in such situation is
> useless. So remove blk_mq_start_request() from error paths in
> nbd_handle_cmd().
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 5/7] nbd: clean up return value checking of sock_xmit()
  2021-09-16  9:33 ` [patch v8 5/7] nbd: clean up return value checking of sock_xmit() Yu Kuai
@ 2021-09-16 12:49   ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-16 12:49 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:48PM +0800, Yu Kuai wrote:
> Check if sock_xmit() return 0 is useless because it'll never return
> 0, comment it and remove such checkings.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
  2021-09-16  9:33 ` [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
@ 2021-09-16 12:53   ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-16 12:53 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:49PM +0800, Yu Kuai wrote:
> Prepare to fix uaf in nbd_read_stat(), no functional changes.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  9:33 ` [patch v8 7/7] nbd: fix uaf in nbd_handle_reply() Yu Kuai
@ 2021-09-16 12:58   ` Ming Lei
  2021-09-16 13:10     ` yukuai (C)
  2021-09-16 14:18   ` [PATCH v9] " Yu Kuai
  1 sibling, 1 reply; 30+ messages in thread
From: Ming Lei @ 2021-09-16 12:58 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access freed request:
> 
> 1) At first, a normal io is submitted and completed with scheduler:
> 
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>  blk_mq_rq_ctx_init
>   sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
>  __blk_mq_get_driver_tag -> get tag from tags
>  tags->rq[tag] = sched_tag->static_rq[internel_tag]
> 
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
> 
> 2) nbd server send a reply with random tag directly:
> 
> recv_work
>  nbd_handle_reply
>   blk_mq_tag_to_rq(tags, tag)
>    rq = tags->rq[tag]
> 
> 3) if the sched_tags->static_rq is freed:
> 
> blk_mq_sched_free_requests
>  blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>   -> step 2) access rq before clearing rq mapping
>   blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>   __free_pages() -> rq is freed here
> 
> 4) Then, nbd continue to use the freed request in nbd_handle_reply
> 
> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
> thus request is ensured not to be freed because 'q_usage_counter' is
> not zero.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 69dc5eac9ad3..b3a47fc6237f 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
>  						     work);
>  	struct nbd_device *nbd = args->nbd;
>  	struct nbd_config *config = nbd->config;
> +	struct request_queue *q = nbd->disk->queue;
>  	struct nbd_sock *nsock;
>  	struct nbd_cmd *cmd;
>  	struct request *rq;
> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
>  		if (nbd_read_reply(nbd, args->index, &reply))
>  			break;
>  
> +		/*
> +		 * Grab .q_usage_counter so request pool won't go away, then no
> +		 * request use-after-free is possible during nbd_handle_reply().
> +		 * If queue is frozen, there won't be any inflight requests, we
> +		 * needn't to handle the incoming garbage message.
> +		 */
> +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
> +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
> +				__func__);
> +			break;
> +		}
> +
>  		cmd = nbd_handle_reply(nbd, args->index, &reply);
> +		percpu_ref_put(&q->q_usage_counter);
>  		if (IS_ERR(cmd))
>  			break;

The refcount needs to be grabbed when completing the request because
the request may be completed from other code path, then the request pool
will be freed from that code path when the request is referred.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()
  2021-09-16 12:58   ` Ming Lei
@ 2021-09-16 13:10     ` yukuai (C)
  2021-09-16 13:55       ` Ming Lei
  0 siblings, 1 reply; 30+ messages in thread
From: yukuai (C) @ 2021-09-16 13:10 UTC (permalink / raw)
  To: Ming Lei; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 20:58, Ming Lei wrote:
> On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
>> There is a problem that nbd_handle_reply() might access freed request:
>>
>> 1) At first, a normal io is submitted and completed with scheduler:
>>
>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>   blk_mq_rq_ctx_init
>>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>> ...
>> blk_mq_get_driver_tag
>>   __blk_mq_get_driver_tag -> get tag from tags
>>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>
>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>> io is finished.
>>
>> 2) nbd server send a reply with random tag directly:
>>
>> recv_work
>>   nbd_handle_reply
>>    blk_mq_tag_to_rq(tags, tag)
>>     rq = tags->rq[tag]
>>
>> 3) if the sched_tags->static_rq is freed:
>>
>> blk_mq_sched_free_requests
>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>    -> step 2) access rq before clearing rq mapping
>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>    __free_pages() -> rq is freed here
>>
>> 4) Then, nbd continue to use the freed request in nbd_handle_reply
>>
>> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
>> thus request is ensured not to be freed because 'q_usage_counter' is
>> not zero.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/block/nbd.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 69dc5eac9ad3..b3a47fc6237f 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
>>   						     work);
>>   	struct nbd_device *nbd = args->nbd;
>>   	struct nbd_config *config = nbd->config;
>> +	struct request_queue *q = nbd->disk->queue;
>>   	struct nbd_sock *nsock;
>>   	struct nbd_cmd *cmd;
>>   	struct request *rq;
>> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
>>   		if (nbd_read_reply(nbd, args->index, &reply))
>>   			break;
>>   
>> +		/*
>> +		 * Grab .q_usage_counter so request pool won't go away, then no
>> +		 * request use-after-free is possible during nbd_handle_reply().
>> +		 * If queue is frozen, there won't be any inflight requests, we
>> +		 * needn't to handle the incoming garbage message.
>> +		 */
>> +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
>> +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
>> +				__func__);
>> +			break;
>> +		}
>> +
>>   		cmd = nbd_handle_reply(nbd, args->index, &reply);
>> +		percpu_ref_put(&q->q_usage_counter);
>>   		if (IS_ERR(cmd))
>>   			break;
> 
> The refcount needs to be grabbed when completing the request because
> the request may be completed from other code path, then the request pool
> will be freed from that code path when the request is referred.

Hi,

The request can't complete concurrently, thus put ref here is safe.

There used to be a commet here that I tried to explain it... It's fine
to me to move it behind anyway.

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()
  2021-09-16 13:10     ` yukuai (C)
@ 2021-09-16 13:55       ` Ming Lei
  2021-09-16 14:05         ` yukuai (C)
  0 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2021-09-16 13:55 UTC (permalink / raw)
  To: yukuai (C); +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 09:10:37PM +0800, yukuai (C) wrote:
> On 2021/09/16 20:58, Ming Lei wrote:
> > On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
> > > There is a problem that nbd_handle_reply() might access freed request:
> > > 
> > > 1) At first, a normal io is submitted and completed with scheduler:
> > > 
> > > internel_tag = blk_mq_get_tag -> get tag from sched_tags
> > >   blk_mq_rq_ctx_init
> > >    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> > > ...
> > > blk_mq_get_driver_tag
> > >   __blk_mq_get_driver_tag -> get tag from tags
> > >   tags->rq[tag] = sched_tag->static_rq[internel_tag]
> > > 
> > > So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> > > to the request: sched_tags->static_rq[internal_tag]. Even if the
> > > io is finished.
> > > 
> > > 2) nbd server send a reply with random tag directly:
> > > 
> > > recv_work
> > >   nbd_handle_reply
> > >    blk_mq_tag_to_rq(tags, tag)
> > >     rq = tags->rq[tag]
> > > 
> > > 3) if the sched_tags->static_rq is freed:
> > > 
> > > blk_mq_sched_free_requests
> > >   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> > >    -> step 2) access rq before clearing rq mapping
> > >    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> > >    __free_pages() -> rq is freed here
> > > 
> > > 4) Then, nbd continue to use the freed request in nbd_handle_reply
> > > 
> > > Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
> > > thus request is ensured not to be freed because 'q_usage_counter' is
> > > not zero.
> > > 
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > ---
> > >   drivers/block/nbd.c | 14 ++++++++++++++
> > >   1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 69dc5eac9ad3..b3a47fc6237f 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
> > >   						     work);
> > >   	struct nbd_device *nbd = args->nbd;
> > >   	struct nbd_config *config = nbd->config;
> > > +	struct request_queue *q = nbd->disk->queue;
> > >   	struct nbd_sock *nsock;
> > >   	struct nbd_cmd *cmd;
> > >   	struct request *rq;
> > > @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
> > >   		if (nbd_read_reply(nbd, args->index, &reply))
> > >   			break;
> > > +		/*
> > > +		 * Grab .q_usage_counter so request pool won't go away, then no
> > > +		 * request use-after-free is possible during nbd_handle_reply().
> > > +		 * If queue is frozen, there won't be any inflight requests, we
> > > +		 * needn't to handle the incoming garbage message.
> > > +		 */
> > > +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
> > > +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
> > > +				__func__);
> > > +			break;
> > > +		}
> > > +
> > >   		cmd = nbd_handle_reply(nbd, args->index, &reply);
> > > +		percpu_ref_put(&q->q_usage_counter);
> > >   		if (IS_ERR(cmd))
> > >   			break;
> > 
> > The refcount needs to be grabbed when completing the request because
> > the request may be completed from other code path, then the request pool
> > will be freed from that code path when the request is referred.
> 
> Hi,
> 
> The request can't complete concurrently, thus put ref here is safe.
> 
> There used to be a commet here that I tried to explain it... It's fine
> to me to move it behind anyway.

Never see such comment. cmd->lock isn't held here, so I believe
concurrent completion is possible here.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()
  2021-09-16 13:55       ` Ming Lei
@ 2021-09-16 14:05         ` yukuai (C)
  0 siblings, 0 replies; 30+ messages in thread
From: yukuai (C) @ 2021-09-16 14:05 UTC (permalink / raw)
  To: Ming Lei; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 21:55, Ming Lei write:
> On Thu, Sep 16, 2021 at 09:10:37PM +0800, yukuai (C) wrote:
>> On 2021/09/16 20:58, Ming Lei wrote:
>>> On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
>>>> There is a problem that nbd_handle_reply() might access freed request:
>>>>
>>>> 1) At first, a normal io is submitted and completed with scheduler:
>>>>
>>>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>>>    blk_mq_rq_ctx_init
>>>>     sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>>>> ...
>>>> blk_mq_get_driver_tag
>>>>    __blk_mq_get_driver_tag -> get tag from tags
>>>>    tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>>>
>>>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>>>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>>>> io is finished.
>>>>
>>>> 2) nbd server send a reply with random tag directly:
>>>>
>>>> recv_work
>>>>    nbd_handle_reply
>>>>     blk_mq_tag_to_rq(tags, tag)
>>>>      rq = tags->rq[tag]
>>>>
>>>> 3) if the sched_tags->static_rq is freed:
>>>>
>>>> blk_mq_sched_free_requests
>>>>    blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>>     -> step 2) access rq before clearing rq mapping
>>>>     blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>>     __free_pages() -> rq is freed here
>>>>
>>>> 4) Then, nbd continue to use the freed request in nbd_handle_reply
>>>>
>>>> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
>>>> thus request is ensured not to be freed because 'q_usage_counter' is
>>>> not zero.
>>>>
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>> ---
>>>>    drivers/block/nbd.c | 14 ++++++++++++++
>>>>    1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>> index 69dc5eac9ad3..b3a47fc6237f 100644
>>>> --- a/drivers/block/nbd.c
>>>> +++ b/drivers/block/nbd.c
>>>> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
>>>>    						     work);
>>>>    	struct nbd_device *nbd = args->nbd;
>>>>    	struct nbd_config *config = nbd->config;
>>>> +	struct request_queue *q = nbd->disk->queue;
>>>>    	struct nbd_sock *nsock;
>>>>    	struct nbd_cmd *cmd;
>>>>    	struct request *rq;
>>>> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
>>>>    		if (nbd_read_reply(nbd, args->index, &reply))
>>>>    			break;
>>>> +		/*
>>>> +		 * Grab .q_usage_counter so request pool won't go away, then no
>>>> +		 * request use-after-free is possible during nbd_handle_reply().
>>>> +		 * If queue is frozen, there won't be any inflight requests, we
>>>> +		 * needn't to handle the incoming garbage message.
>>>> +		 */
>>>> +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
>>>> +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
>>>> +				__func__);
>>>> +			break;
>>>> +		}
>>>> +
>>>>    		cmd = nbd_handle_reply(nbd, args->index, &reply);
>>>> +		percpu_ref_put(&q->q_usage_counter);
>>>>    		if (IS_ERR(cmd))
>>>>    			break;
>>>
>>> The refcount needs to be grabbed when completing the request because
>>> the request may be completed from other code path, then the request pool
>>> will be freed from that code path when the request is referred.
>>
>> Hi,
>>
>> The request can't complete concurrently, thus put ref here is safe.
>>
>> There used to be a commet here that I tried to explain it... It's fine
>> to me to move it behind anyway.
> 
> Never see such comment. cmd->lock isn't held here, so I believe
> concurrent completion is possible here.
> 

After patch 2, __test_and_clear_bit(NBD_CMD_INFLIGHT) must pass
while cmd->lock is held before completing the request, thus request
completion won't concurrent...

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v9] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  9:33 ` [patch v8 7/7] nbd: fix uaf in nbd_handle_reply() Yu Kuai
  2021-09-16 12:58   ` Ming Lei
@ 2021-09-16 14:18   ` Yu Kuai
  2021-09-17  3:33     ` Ming Lei
  2021-10-17 13:09     ` Jens Axboe
  1 sibling, 2 replies; 30+ messages in thread
From: Yu Kuai @ 2021-09-16 14:18 UTC (permalink / raw)
  To: josef, axboe, ming.lei; +Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

There is a problem that nbd_handle_reply() might access freed request:

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tag->static_rq[internel_tag]

So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
to the request: sched_tags->static_rq[internal_tag]. Even if the
io is finished.

2) nbd server send a reply with random tag directly:

recv_work
 nbd_handle_reply
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) access rq before clearing rq mapping
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here

4) Then, nbd continue to use the freed request in nbd_handle_reply

Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
thus request is ensured not to be freed because 'q_usage_counter' is
not zero.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
Changes in v9:
 - move percpu_ref_put() behind.

 drivers/block/nbd.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 69dc5eac9ad3..f9d63794275e 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
 						     work);
 	struct nbd_device *nbd = args->nbd;
 	struct nbd_config *config = nbd->config;
+	struct request_queue *q = nbd->disk->queue;
 	struct nbd_sock *nsock;
 	struct nbd_cmd *cmd;
 	struct request *rq;
@@ -835,13 +836,28 @@ static void recv_work(struct work_struct *work)
 		if (nbd_read_reply(nbd, args->index, &reply))
 			break;
 
+		/*
+		 * Grab .q_usage_counter so request pool won't go away, then no
+		 * request use-after-free is possible during nbd_handle_reply().
+		 * If queue is frozen, there won't be any inflight requests, we
+		 * needn't to handle the incoming garbage message.
+		 */
+		if (!percpu_ref_tryget(&q->q_usage_counter)) {
+			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
+				__func__);
+			break;
+		}
+
 		cmd = nbd_handle_reply(nbd, args->index, &reply);
-		if (IS_ERR(cmd))
+		if (IS_ERR(cmd)) {
+			percpu_ref_put(&q->q_usage_counter);
 			break;
+		}
 
 		rq = blk_mq_rq_from_pdu(cmd);
 		if (likely(!blk_should_fake_timeout(rq->q)))
 			blk_mq_complete_request(rq);
+		percpu_ref_put(&q->q_usage_counter);
 	}
 
 	nsock = config->socks[args->index];
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9] nbd: fix uaf in nbd_handle_reply()
  2021-09-16 14:18   ` [PATCH v9] " Yu Kuai
@ 2021-09-17  3:33     ` Ming Lei
  2021-10-17 13:09     ` Jens Axboe
  1 sibling, 0 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-17  3:33 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 10:18:10PM +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access freed request:
> 
> 1) At first, a normal io is submitted and completed with scheduler:
> 
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>  blk_mq_rq_ctx_init
>   sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
>  __blk_mq_get_driver_tag -> get tag from tags
>  tags->rq[tag] = sched_tag->static_rq[internel_tag]
> 
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
> 
> 2) nbd server send a reply with random tag directly:
> 
> recv_work
>  nbd_handle_reply
>   blk_mq_tag_to_rq(tags, tag)
>    rq = tags->rq[tag]
> 
> 3) if the sched_tags->static_rq is freed:
> 
> blk_mq_sched_free_requests
>  blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>   -> step 2) access rq before clearing rq mapping
>   blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>   __free_pages() -> rq is freed here
> 
> 4) Then, nbd continue to use the freed request in nbd_handle_reply
> 
> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
> thus request is ensured not to be freed because 'q_usage_counter' is
> not zero.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
> Changes in v9:
>  - move percpu_ref_put() behind.

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-16  9:33 ` [patch v8 3/7] nbd: check sock index in nbd_read_stat() Yu Kuai
@ 2021-09-19 10:34   ` yukuai (C)
  2021-09-22  9:22     ` Ming Lei
  2021-09-23  0:29   ` Ming Lei
  1 sibling, 1 reply; 30+ messages in thread
From: yukuai (C) @ 2021-09-19 10:34 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 17:33, Yu Kuai wrote:
> The sock that clent send request in nbd_send_cmd() and receive reply
> in nbd_read_stat() should be the same.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>   drivers/block/nbd.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 614c6ab2b8fe..c724a5bd7fa4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
>   		ret = -ENOENT;
>   		goto out;
>   	}
> +	if (cmd->index != index) {
> +		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> +			tag, index, cmd->index);
> +	}
>   	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
>   		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
>   			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> 

Hi, Ming

Any suggestions about this patch?

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-19 10:34   ` yukuai (C)
@ 2021-09-22  9:22     ` Ming Lei
  2021-09-22 12:12       ` Eric Blake
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-22  9:22 UTC (permalink / raw)
  To: yukuai (C); +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
> On 2021/09/16 17:33, Yu Kuai wrote:
> > The sock that clent send request in nbd_send_cmd() and receive reply
> > in nbd_read_stat() should be the same.
> > 
> > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > ---
> >   drivers/block/nbd.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 614c6ab2b8fe..c724a5bd7fa4 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> >   		ret = -ENOENT;
> >   		goto out;
> >   	}
> > +	if (cmd->index != index) {
> > +		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> > +			tag, index, cmd->index);
> > +	}
> >   	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> >   		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> >   			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> > 
> 
> Hi, Ming
> 
> Any suggestions about this patch?

I think this one relies on nbd protocol between server and client, and
does the protocol require both request and reply xmitted via same
socket?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-22  9:22     ` Ming Lei
@ 2021-09-22 12:12       ` Eric Blake
  2021-09-22 12:21       ` yukuai (C)
  2021-09-22 15:56       ` Wouter Verhelst
  2 siblings, 0 replies; 30+ messages in thread
From: Eric Blake @ 2021-09-22 12:12 UTC (permalink / raw)
  To: Ming Lei
  Cc: yukuai (C), josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 22, 2021 at 05:22:07PM +0800, Ming Lei wrote:
> 
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?

Yes, a reply must be transmitted on the same socket as the request
came over.  This is because independent sockets are not required to
use distinct 64-bit handles, and there is no way for a server to tell
if independent clients are related to one another; sending a reply on
the wrong socket is thus not guaranteed to reach the intended client.
Thus, a compliant server will never send a reply over a different
socket than the original request, and if a client ever gets a reply
with a handle it was not expecting, then the server is buggy or
malicious.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-22  9:22     ` Ming Lei
  2021-09-22 12:12       ` Eric Blake
@ 2021-09-22 12:21       ` yukuai (C)
  2021-09-22 15:56       ` Wouter Verhelst
  2 siblings, 0 replies; 30+ messages in thread
From: yukuai (C) @ 2021-09-22 12:21 UTC (permalink / raw)
  To: Ming Lei; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/22 17:22, Ming Lei wrote:
> On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
>> On 2021/09/16 17:33, Yu Kuai wrote:
>>> The sock that clent send request in nbd_send_cmd() and receive reply
>>> in nbd_read_stat() should be the same.
>>>
>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>> ---
>>>    drivers/block/nbd.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>> index 614c6ab2b8fe..c724a5bd7fa4 100644
>>> --- a/drivers/block/nbd.c
>>> +++ b/drivers/block/nbd.c
>>> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
>>>    		ret = -ENOENT;
>>>    		goto out;
>>>    	}
>>> +	if (cmd->index != index) {
>>> +		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
>>> +			tag, index, cmd->index);
>>> +	}
>>>    	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
>>>    		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
>>>    			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
>>>
>>
>> Hi, Ming
>>
>> Any suggestions about this patch?
> 
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?
> 

I searched nbd-server source code, and found that socket_read() and
send_reply->socket_write() are always come in pares and using the same
socket.

BTW, if server reply a read request from a unexpected sock, then
nbd_read_stat() might stuck in receiving the read data. And for worse,
nbd_read_stat() can mistake the normal reply message for the read data
afterwards and corrupt client.

Thanks,
Kuai


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-22  9:22     ` Ming Lei
  2021-09-22 12:12       ` Eric Blake
  2021-09-22 12:21       ` yukuai (C)
@ 2021-09-22 15:56       ` Wouter Verhelst
  2 siblings, 0 replies; 30+ messages in thread
From: Wouter Verhelst @ 2021-09-22 15:56 UTC (permalink / raw)
  To: Ming Lei
  Cc: yukuai (C), josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 22, 2021 at 05:22:07PM +0800, Ming Lei wrote:
> On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
> > On 2021/09/16 17:33, Yu Kuai wrote:
> > > The sock that clent send request in nbd_send_cmd() and receive reply
> > > in nbd_read_stat() should be the same.
> > > 
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > ---
> > >   drivers/block/nbd.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 614c6ab2b8fe..c724a5bd7fa4 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> > >   		ret = -ENOENT;
> > >   		goto out;
> > >   	}
> > > +	if (cmd->index != index) {
> > > +		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> > > +			tag, index, cmd->index);
> > > +	}
> > >   	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> > >   		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> > >   			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> > > 
> > 
> > Hi, Ming
> > 
> > Any suggestions about this patch?
> 
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?

As Eric already answered: yes, the request and reply are specified such
that they must be transmitted over the same socket.

For more details, you can find the protocol specification at
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md

Please note that the protocol defined there has some options that are
not currently supported by the Linux nbd implementation -- specifically
the "structured reply" message format (and all that requires it) is not
implemented (yet?).

-- 
     w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()
  2021-09-16  9:33 ` [patch v8 3/7] nbd: check sock index in nbd_read_stat() Yu Kuai
  2021-09-19 10:34   ` yukuai (C)
@ 2021-09-23  0:29   ` Ming Lei
  1 sibling, 0 replies; 30+ messages in thread
From: Ming Lei @ 2021-09-23  0:29 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:46PM +0800, Yu Kuai wrote:
> The sock that clent send request in nbd_send_cmd() and receive reply
> in nbd_read_stat() should be the same.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 614c6ab2b8fe..c724a5bd7fa4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
>  		ret = -ENOENT;
>  		goto out;
>  	}
> +	if (cmd->index != index) {
> +		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> +			tag, index, cmd->index);
> +	}
>  	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
>  		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
>  			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> -- 
> 2.31.1
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (6 preceding siblings ...)
  2021-09-16  9:33 ` [patch v8 7/7] nbd: fix uaf in nbd_handle_reply() Yu Kuai
@ 2021-09-23 13:33 ` yukuai (C)
  2021-09-29 12:54   ` yukuai (C)
  2021-10-15 16:08 ` Josef Bacik
  2021-10-17 13:09 ` Jens Axboe
  9 siblings, 1 reply; 30+ messages in thread
From: yukuai (C) @ 2021-09-23 13:33 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 17:33, Yu Kuai wrote:

Hi, jens

Any interest to apply this series?

Thanks,
Kuai
> This patch set tries to fix that client might oops if nbd server send
> unexpected message to client, for example, our syzkaller report a uaf
> in nbd_read_stat():
> 
> Call trace:
>   dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>   show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>   print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>   kasan_report_error mm/kasan/report.c:351 [inline]
>   kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>   __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>   __read_once_size include/linux/compiler.h:193 [inline]
>   blk_mq_rq_state block/blk-mq.h:106 [inline]
>   blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>   nbd_read_stat drivers/block/nbd.c:670 [inline]
>   recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>   process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>   worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>   kthread+0x1d8/0x1e0 kernel/kthread.c:255
>   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
> 
> 1) At first, a normal io is submitted and completed with scheduler:
> 
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>   blk_mq_rq_ctx_init
>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
>   __blk_mq_get_driver_tag -> get tag from tags
>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
> 
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
> 
> 2) nbd server send a reply with random tag directly:
> 
> recv_work
>   nbd_read_stat
>    blk_mq_tag_to_rq(tags, tag)
>     rq = tags->rq[tag]
> 
> 3) if the sched_tags->static_rq is freed:
> 
> blk_mq_sched_free_requests
>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>    -> step 2) access rq before clearing rq mapping
>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>    __free_pages() -> rq is freed here
> 
> 4) Then, nbd continue to use the freed request in nbd_read_stat()
> 
> Changes in v8:
>   - add patch 5 to this series.
>   - modify some words.
> Changes in v7:
>   - instead of exposing blk_queue_exit(), using percpu_ref_put()
>   directly.
>   - drop the ref right after nbd_handle_reply().
> Changes in v6:
>   - don't set cmd->status to error if request is completed before
>   nbd_clear_req().
>   - get 'q_usage_counter' to prevent accessing freed request through
>   blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
> Changes in v5:
>   - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
>   - add some comment in patch 5
> Changes in v4:
>   - change the name of the patchset, since uaf is not the only problem
>   if server send unexpected reply message.
>   - instead of adding new interface, use blk_mq_find_and_get_req().
>   - add patch 5 to this series
> Changes in v3:
>   - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
>   - modify descriptions.
>   - patch 5 is just a cleanup
> Changes in v2:
>   - as Bart suggested, add a new helper function for drivers to get
>   request by tag.
> 
> Yu Kuai (7):
>    nbd: don't handle response without a corresponding request message
>    nbd: make sure request completion won't concurrent
>    nbd: check sock index in nbd_read_stat()
>    nbd: don't start request if nbd_queue_rq() failed
>    nbd: clean up return value checking of sock_xmit()
>    nbd: partition nbd_read_stat() into nbd_read_reply() and
>      nbd_handle_reply()
>    nbd: fix uaf in nbd_handle_reply()
> 
>   drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
>   1 file changed, 96 insertions(+), 39 deletions(-)
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-09-23 13:33 ` [patch v8 0/7] handle unexpected message from server yukuai (C)
@ 2021-09-29 12:54   ` yukuai (C)
  2021-10-08  7:17     ` yukuai (C)
  0 siblings, 1 reply; 30+ messages in thread
From: yukuai (C) @ 2021-09-29 12:54 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/23 21:33, yukuai (C) wrote:
> On 2021/09/16 17:33, Yu Kuai wrote:
> 
> Hi, jens
> 
> Any interest to apply this series?

friendly ping ...
> 
> Thanks,
> Kuai
>> This patch set tries to fix that client might oops if nbd server send
>> unexpected message to client, for example, our syzkaller report a uaf
>> in nbd_read_stat():
>>
>> Call trace:
>>   dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>>   show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>>   __dump_stack lib/dump_stack.c:77 [inline]
>>   dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>>   print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>>   kasan_report_error mm/kasan/report.c:351 [inline]
>>   kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>   __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>>   __read_once_size include/linux/compiler.h:193 [inline]
>>   blk_mq_rq_state block/blk-mq.h:106 [inline]
>>   blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>>   nbd_read_stat drivers/block/nbd.c:670 [inline]
>>   recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>>   process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>>   worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>>   kthread+0x1d8/0x1e0 kernel/kthread.c:255
>>   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>>
>> 1) At first, a normal io is submitted and completed with scheduler:
>>
>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>   blk_mq_rq_ctx_init
>>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>> ...
>> blk_mq_get_driver_tag
>>   __blk_mq_get_driver_tag -> get tag from tags
>>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>
>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>> io is finished.
>>
>> 2) nbd server send a reply with random tag directly:
>>
>> recv_work
>>   nbd_read_stat
>>    blk_mq_tag_to_rq(tags, tag)
>>     rq = tags->rq[tag]
>>
>> 3) if the sched_tags->static_rq is freed:
>>
>> blk_mq_sched_free_requests
>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>    -> step 2) access rq before clearing rq mapping
>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>    __free_pages() -> rq is freed here
>>
>> 4) Then, nbd continue to use the freed request in nbd_read_stat()
>>
>> Changes in v8:
>>   - add patch 5 to this series.
>>   - modify some words.
>> Changes in v7:
>>   - instead of exposing blk_queue_exit(), using percpu_ref_put()
>>   directly.
>>   - drop the ref right after nbd_handle_reply().
>> Changes in v6:
>>   - don't set cmd->status to error if request is completed before
>>   nbd_clear_req().
>>   - get 'q_usage_counter' to prevent accessing freed request through
>>   blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
>> Changes in v5:
>>   - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
>>   - add some comment in patch 5
>> Changes in v4:
>>   - change the name of the patchset, since uaf is not the only problem
>>   if server send unexpected reply message.
>>   - instead of adding new interface, use blk_mq_find_and_get_req().
>>   - add patch 5 to this series
>> Changes in v3:
>>   - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
>>   - modify descriptions.
>>   - patch 5 is just a cleanup
>> Changes in v2:
>>   - as Bart suggested, add a new helper function for drivers to get
>>   request by tag.
>>
>> Yu Kuai (7):
>>    nbd: don't handle response without a corresponding request message
>>    nbd: make sure request completion won't concurrent
>>    nbd: check sock index in nbd_read_stat()
>>    nbd: don't start request if nbd_queue_rq() failed
>>    nbd: clean up return value checking of sock_xmit()
>>    nbd: partition nbd_read_stat() into nbd_read_reply() and
>>      nbd_handle_reply()
>>    nbd: fix uaf in nbd_handle_reply()
>>
>>   drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
>>   1 file changed, 96 insertions(+), 39 deletions(-)
>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-09-29 12:54   ` yukuai (C)
@ 2021-10-08  7:17     ` yukuai (C)
  2021-10-15  8:19       ` yukuai (C)
  0 siblings, 1 reply; 30+ messages in thread
From: yukuai (C) @ 2021-10-08  7:17 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/29 20:54, yukuai (C) wrote:
> On 2021/09/23 21:33, yukuai (C) wrote:
>> On 2021/09/16 17:33, Yu Kuai wrote:
>>
>> Hi, jens
>>
>> Any interest to apply this series?
> 
> friendly ping ...

Hi, Jens

friendly ping again ...
>>
>> Thanks,
>> Kuai
>>> This patch set tries to fix that client might oops if nbd server send
>>> unexpected message to client, for example, our syzkaller report a uaf
>>> in nbd_read_stat():
>>>
>>> Call trace:
>>>   dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>>>   show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>   dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>>>   print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>>>   kasan_report_error mm/kasan/report.c:351 [inline]
>>>   kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>>>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>>   __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>>>   __read_once_size include/linux/compiler.h:193 [inline]
>>>   blk_mq_rq_state block/blk-mq.h:106 [inline]
>>>   blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>>>   nbd_read_stat drivers/block/nbd.c:670 [inline]
>>>   recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>>>   process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>>>   worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>>>   kthread+0x1d8/0x1e0 kernel/kthread.c:255
>>>   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>>>
>>> 1) At first, a normal io is submitted and completed with scheduler:
>>>
>>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>>   blk_mq_rq_ctx_init
>>>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>>> ...
>>> blk_mq_get_driver_tag
>>>   __blk_mq_get_driver_tag -> get tag from tags
>>>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>>
>>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>>> io is finished.
>>>
>>> 2) nbd server send a reply with random tag directly:
>>>
>>> recv_work
>>>   nbd_read_stat
>>>    blk_mq_tag_to_rq(tags, tag)
>>>     rq = tags->rq[tag]
>>>
>>> 3) if the sched_tags->static_rq is freed:
>>>
>>> blk_mq_sched_free_requests
>>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>    -> step 2) access rq before clearing rq mapping
>>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>    __free_pages() -> rq is freed here
>>>
>>> 4) Then, nbd continue to use the freed request in nbd_read_stat()
>>>
>>> Changes in v8:
>>>   - add patch 5 to this series.
>>>   - modify some words.
>>> Changes in v7:
>>>   - instead of exposing blk_queue_exit(), using percpu_ref_put()
>>>   directly.
>>>   - drop the ref right after nbd_handle_reply().
>>> Changes in v6:
>>>   - don't set cmd->status to error if request is completed before
>>>   nbd_clear_req().
>>>   - get 'q_usage_counter' to prevent accessing freed request through
>>>   blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
>>> Changes in v5:
>>>   - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
>>>   - add some comment in patch 5
>>> Changes in v4:
>>>   - change the name of the patchset, since uaf is not the only problem
>>>   if server send unexpected reply message.
>>>   - instead of adding new interface, use blk_mq_find_and_get_req().
>>>   - add patch 5 to this series
>>> Changes in v3:
>>>   - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
>>>   - modify descriptions.
>>>   - patch 5 is just a cleanup
>>> Changes in v2:
>>>   - as Bart suggested, add a new helper function for drivers to get
>>>   request by tag.
>>>
>>> Yu Kuai (7):
>>>    nbd: don't handle response without a corresponding request message
>>>    nbd: make sure request completion won't concurrent
>>>    nbd: check sock index in nbd_read_stat()
>>>    nbd: don't start request if nbd_queue_rq() failed
>>>    nbd: clean up return value checking of sock_xmit()
>>>    nbd: partition nbd_read_stat() into nbd_read_reply() and
>>>      nbd_handle_reply()
>>>    nbd: fix uaf in nbd_handle_reply()
>>>
>>>   drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
>>>   1 file changed, 96 insertions(+), 39 deletions(-)
>>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-10-08  7:17     ` yukuai (C)
@ 2021-10-15  8:19       ` yukuai (C)
  0 siblings, 0 replies; 30+ messages in thread
From: yukuai (C) @ 2021-10-15  8:19 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch; +Cc: linux-block, nbd, linux-kernel, yi.zhang

On 2021/10/08 15:17, yukuai (C) wrote:
> On 2021/09/29 20:54, yukuai (C) wrote:
>> On 2021/09/23 21:33, yukuai (C) wrote:
>>> On 2021/09/16 17:33, Yu Kuai wrote:
>>>
>>> Hi, jens
>>>
>>> Any interest to apply this series?
>>
>> friendly ping ...
> 
> Hi, Jens
> 
> friendly ping again ...

Hi, Jens

friendly ping again ...
>>>
>>> Thanks,
>>> Kuai
>>>> This patch set tries to fix that client might oops if nbd server send
>>>> unexpected message to client, for example, our syzkaller report a uaf
>>>> in nbd_read_stat():
>>>>
>>>> Call trace:
>>>>   dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>>>>   show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>>   dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>>>>   print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>>>>   kasan_report_error mm/kasan/report.c:351 [inline]
>>>>   kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>>>>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>>>   __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>>>>   __read_once_size include/linux/compiler.h:193 [inline]
>>>>   blk_mq_rq_state block/blk-mq.h:106 [inline]
>>>>   blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>>>>   nbd_read_stat drivers/block/nbd.c:670 [inline]
>>>>   recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>>>>   process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>>>>   worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>>>>   kthread+0x1d8/0x1e0 kernel/kthread.c:255
>>>>   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>>>>
>>>> 1) At first, a normal io is submitted and completed with scheduler:
>>>>
>>>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>>>   blk_mq_rq_ctx_init
>>>>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>>>> ...
>>>> blk_mq_get_driver_tag
>>>>   __blk_mq_get_driver_tag -> get tag from tags
>>>>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>>>
>>>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>>>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>>>> io is finished.
>>>>
>>>> 2) nbd server send a reply with random tag directly:
>>>>
>>>> recv_work
>>>>   nbd_read_stat
>>>>    blk_mq_tag_to_rq(tags, tag)
>>>>     rq = tags->rq[tag]
>>>>
>>>> 3) if the sched_tags->static_rq is freed:
>>>>
>>>> blk_mq_sched_free_requests
>>>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>>    -> step 2) access rq before clearing rq mapping
>>>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>>    __free_pages() -> rq is freed here
>>>>
>>>> 4) Then, nbd continue to use the freed request in nbd_read_stat()
>>>>
>>>> Changes in v8:
>>>>   - add patch 5 to this series.
>>>>   - modify some words.
>>>> Changes in v7:
>>>>   - instead of exposing blk_queue_exit(), using percpu_ref_put()
>>>>   directly.
>>>>   - drop the ref right after nbd_handle_reply().
>>>> Changes in v6:
>>>>   - don't set cmd->status to error if request is completed before
>>>>   nbd_clear_req().
>>>>   - get 'q_usage_counter' to prevent accessing freed request through
>>>>   blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
>>>> Changes in v5:
>>>>   - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
>>>>   - add some comment in patch 5
>>>> Changes in v4:
>>>>   - change the name of the patchset, since uaf is not the only problem
>>>>   if server send unexpected reply message.
>>>>   - instead of adding new interface, use blk_mq_find_and_get_req().
>>>>   - add patch 5 to this series
>>>> Changes in v3:
>>>>   - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
>>>>   - modify descriptions.
>>>>   - patch 5 is just a cleanup
>>>> Changes in v2:
>>>>   - as Bart suggested, add a new helper function for drivers to get
>>>>   request by tag.
>>>>
>>>> Yu Kuai (7):
>>>>    nbd: don't handle response without a corresponding request message
>>>>    nbd: make sure request completion won't concurrent
>>>>    nbd: check sock index in nbd_read_stat()
>>>>    nbd: don't start request if nbd_queue_rq() failed
>>>>    nbd: clean up return value checking of sock_xmit()
>>>>    nbd: partition nbd_read_stat() into nbd_read_reply() and
>>>>      nbd_handle_reply()
>>>>    nbd: fix uaf in nbd_handle_reply()
>>>>
>>>>   drivers/block/nbd.c | 135 
>>>> +++++++++++++++++++++++++++++++-------------
>>>>   1 file changed, 96 insertions(+), 39 deletions(-)
>>>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (7 preceding siblings ...)
  2021-09-23 13:33 ` [patch v8 0/7] handle unexpected message from server yukuai (C)
@ 2021-10-15 16:08 ` Josef Bacik
  2021-10-17 13:09 ` Jens Axboe
  9 siblings, 0 replies; 30+ messages in thread
From: Josef Bacik @ 2021-10-15 16:08 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, ming.lei, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 05:33:43PM +0800, Yu Kuai wrote:
> This patch set tries to fix that client might oops if nbd server send
> unexpected message to client, for example, our syzkaller report a uaf
> in nbd_read_stat():
> 
> Call trace:
>  dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>  show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>  print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>  __read_once_size include/linux/compiler.h:193 [inline]
>  blk_mq_rq_state block/blk-mq.h:106 [inline]
>  blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>  nbd_read_stat drivers/block/nbd.c:670 [inline]
>  recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>  process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>  worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>  kthread+0x1d8/0x1e0 kernel/kthread.c:255
>  ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
> 
> 1) At first, a normal io is submitted and completed with scheduler:
> 
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>  blk_mq_rq_ctx_init
>   sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
>  __blk_mq_get_driver_tag -> get tag from tags
>  tags->rq[tag] = sched_tag->static_rq[internel_tag]
> 
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
> 
> 2) nbd server send a reply with random tag directly:
> 
> recv_work
>  nbd_read_stat
>   blk_mq_tag_to_rq(tags, tag)
>    rq = tags->rq[tag]
> 
> 3) if the sched_tags->static_rq is freed:
> 
> blk_mq_sched_free_requests
>  blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>   -> step 2) access rq before clearing rq mapping
>   blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>   __free_pages() -> rq is freed here
> 
> 4) Then, nbd continue to use the freed request in nbd_read_stat()
> 
> Changes in v8:
>  - add patch 5 to this series.
>  - modify some words.
> Changes in v7:
>  - instead of exposing blk_queue_exit(), using percpu_ref_put()
>  directly.
>  - drop the ref right after nbd_handle_reply().
> Changes in v6:
>  - don't set cmd->status to error if request is completed before
>  nbd_clear_req().
>  - get 'q_usage_counter' to prevent accessing freed request through
>  blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
> Changes in v5:
>  - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
>  - add some comment in patch 5
> Changes in v4:
>  - change the name of the patchset, since uaf is not the only problem
>  if server send unexpected reply message.
>  - instead of adding new interface, use blk_mq_find_and_get_req().
>  - add patch 5 to this series
> Changes in v3:
>  - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
>  - modify descriptions.
>  - patch 5 is just a cleanup
> Changes in v2:
>  - as Bart suggested, add a new helper function for drivers to get
>  request by tag.
> 

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to the series.  Sorry Jens, this one slipped through the cracks.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch v8 0/7] handle unexpected message from server
  2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
                   ` (8 preceding siblings ...)
  2021-10-15 16:08 ` Josef Bacik
@ 2021-10-17 13:09 ` Jens Axboe
  9 siblings, 0 replies; 30+ messages in thread
From: Jens Axboe @ 2021-10-17 13:09 UTC (permalink / raw)
  To: hch, ming.lei, Yu Kuai, josef
  Cc: Jens Axboe, nbd, yi.zhang, linux-kernel, linux-block

On Thu, 16 Sep 2021 17:33:43 +0800, Yu Kuai wrote:
> This patch set tries to fix that client might oops if nbd server send
> unexpected message to client, for example, our syzkaller report a uaf
> in nbd_read_stat():
> 
> Call trace:
>  dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>  show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>  print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>  __read_once_size include/linux/compiler.h:193 [inline]
>  blk_mq_rq_state block/blk-mq.h:106 [inline]
>  blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>  nbd_read_stat drivers/block/nbd.c:670 [inline]
>  recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>  process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>  worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>  kthread+0x1d8/0x1e0 kernel/kthread.c:255
>  ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
> 
> [...]

Applied, thanks!

[1/7] nbd: don't handle response without a corresponding request message
      commit: b5644a3a79bf3be5f1238db1b2f241374b27b0f0
[2/7] nbd: make sure request completion won't concurrent
      commit: d14b304f558f8c8f53da3a8d0c0b671f14a9c2f4
[3/7] nbd: check sock index in nbd_read_stat()
      commit: dbd73178da676945d8bbcf6afe731623f683ce0a
[4/7] nbd: don't start request if nbd_queue_rq() failed
      commit: a83fdc85365586dc5c0f3ff91680e18e37a66f19
[5/7] nbd: clean up return value checking of sock_xmit()
      commit: 6157a8f489909db00151a4e361903b9099b03b75
[6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
      commit: 961e9f50be9bb47835b0ac7e08d55d2d0a45e493
[7/7] nbd: fix uaf in nbd_handle_reply()
      commit: 52c90e0184f67eecb00b53b79bfdf75e0274f8fd

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v9] nbd: fix uaf in nbd_handle_reply()
  2021-09-16 14:18   ` [PATCH v9] " Yu Kuai
  2021-09-17  3:33     ` Ming Lei
@ 2021-10-17 13:09     ` Jens Axboe
  1 sibling, 0 replies; 30+ messages in thread
From: Jens Axboe @ 2021-10-17 13:09 UTC (permalink / raw)
  To: Yu Kuai, ming.lei, josef
  Cc: Jens Axboe, nbd, yi.zhang, linux-kernel, linux-block

On Thu, 16 Sep 2021 22:18:10 +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access freed request:
> 
> 1) At first, a normal io is submitted and completed with scheduler:
> 
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>  blk_mq_rq_ctx_init
>   sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
>  __blk_mq_get_driver_tag -> get tag from tags
>  tags->rq[tag] = sched_tag->static_rq[internel_tag]
> 
> [...]

Applied, thanks!

[1/1] nbd: fix uaf in nbd_handle_reply()
      commit: 52c90e0184f67eecb00b53b79bfdf75e0274f8fd

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-10-17 13:09 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-16  9:33 [patch v8 0/7] handle unexpected message from server Yu Kuai
2021-09-16  9:33 ` [patch v8 1/7] nbd: don't handle response without a corresponding request message Yu Kuai
2021-09-16  9:33 ` [patch v8 2/7] nbd: make sure request completion won't concurrent Yu Kuai
2021-09-16  9:33 ` [patch v8 3/7] nbd: check sock index in nbd_read_stat() Yu Kuai
2021-09-19 10:34   ` yukuai (C)
2021-09-22  9:22     ` Ming Lei
2021-09-22 12:12       ` Eric Blake
2021-09-22 12:21       ` yukuai (C)
2021-09-22 15:56       ` Wouter Verhelst
2021-09-23  0:29   ` Ming Lei
2021-09-16  9:33 ` [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
2021-09-16 12:48   ` Ming Lei
2021-09-16  9:33 ` [patch v8 5/7] nbd: clean up return value checking of sock_xmit() Yu Kuai
2021-09-16 12:49   ` Ming Lei
2021-09-16  9:33 ` [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
2021-09-16 12:53   ` Ming Lei
2021-09-16  9:33 ` [patch v8 7/7] nbd: fix uaf in nbd_handle_reply() Yu Kuai
2021-09-16 12:58   ` Ming Lei
2021-09-16 13:10     ` yukuai (C)
2021-09-16 13:55       ` Ming Lei
2021-09-16 14:05         ` yukuai (C)
2021-09-16 14:18   ` [PATCH v9] " Yu Kuai
2021-09-17  3:33     ` Ming Lei
2021-10-17 13:09     ` Jens Axboe
2021-09-23 13:33 ` [patch v8 0/7] handle unexpected message from server yukuai (C)
2021-09-29 12:54   ` yukuai (C)
2021-10-08  7:17     ` yukuai (C)
2021-10-15  8:19       ` yukuai (C)
2021-10-15 16:08 ` Josef Bacik
2021-10-17 13:09 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).