* [PATCH v7 0/6] handle unexpected message from server
@ 2021-09-15  9:20 Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

This patch set fixes potential oopses when the nbd server sends an
unexpected message to the client. For example, our syzkaller reported
a use-after-free (UAF) in nbd_read_stat():

Call trace:
 dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
 show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x144/0x1b4 lib/dump_stack.c:118
 print_address_description+0x68/0x2d0 mm/kasan/report.c:253
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x134/0x2f0 mm/kasan/report.c:409
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
 __read_once_size include/linux/compiler.h:193 [inline]
 blk_mq_rq_state block/blk-mq.h:106 [inline]
 blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
 nbd_read_stat drivers/block/nbd.c:670 [inline]
 recv_work+0x1bc/0x890 drivers/block/nbd.c:749
 process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
 worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
 kthread+0x1d8/0x1e0 kernel/kthread.c:255
 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174

1) At first, a normal IO is submitted and completed with a scheduler:

internal_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tags->static_rq[internal_tag]

So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
same request, sched_tags->static_rq[internal_tag], even after the IO
is finished.

2) The nbd server sends a reply with a random tag directly:

recv_work
 nbd_read_stat
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]

3) If the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) accesses rq before the rq mapping is cleared
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here

4) Then, nbd continues to use the freed request in nbd_read_stat().
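
Patch 6 closes this window by holding a reference on the queue's
'q_usage_counter' while a reply is mapped back to a request. A minimal
sketch of the idea in recv_work(), simplified from the actual patch
(reply parsing and error handling omitted):

	if (!percpu_ref_tryget(&q->q_usage_counter))
		break;	/* queue is frozen/draining: nothing can be inflight */

	/* the request pool can't be freed while the ref is held */
	cmd = nbd_handle_reply(nbd, args->index, &reply);
	percpu_ref_put(&q->q_usage_counter);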

Changes in v7:
 - instead of exposing blk_queue_exit(), use percpu_ref_put()
 directly.
 - drop the ref right after nbd_handle_reply().
Changes in v6:
 - don't set cmd->status to error if request is completed before
 nbd_clear_req().
 - get 'q_usage_counter' to prevent accessing freed request through
 blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
Changes in v5:
 - move patches 1 & 2 from v4 (patches 4 & 5 in v5) later in the series
 - add some comment in patch 5
Changes in v4:
 - change the name of the patchset, since the UAF is not the only
 problem if the server sends an unexpected reply message.
 - instead of adding new interface, use blk_mq_find_and_get_req().
 - add patch 5 to this series
Changes in v3:
 - v2 can't fix the problem thoroughly; add patches 3-4 to this series.
 - modify descriptions.
 - patch 5 is just a cleanup
Changes in v2:
 - as Bart suggested, add a new helper function for drivers to get a
 request by tag.

Yu Kuai (6):
  nbd: don't handle response without a corresponding request message
  nbd: make sure request completion won't concurrent
  nbd: check sock index in nbd_read_stat()
  nbd: don't start request if nbd_queue_rq() failed
  nbd: partition nbd_read_stat() into nbd_read_reply() and
    nbd_handle_reply()
  nbd: fix uaf in nbd_handle_reply()

 drivers/block/nbd.c | 128 ++++++++++++++++++++++++++++++++------------
 1 file changed, 94 insertions(+), 34 deletions(-)

-- 
2.31.1



* [PATCH v7 1/6] nbd: don't handle response without a corresponding request message
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 2/6] nbd: make sure request completion won't concurrent Yu Kuai
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

While handling a response message from the server, nbd_read_stat()
will try to get the request by tag and then complete it. However,
this is problematic if nbd hasn't sent a corresponding request
message:

t1                      t2
                        submit_bio
                         nbd_queue_rq
                          blk_mq_start_request
recv_work
 nbd_read_stat
  blk_mq_tag_to_rq
 blk_mq_complete_request
                          nbd_send_cmd

Thus add a new cmd flag, 'NBD_CMD_INFLIGHT': it is set in
nbd_send_cmd() and checked in nbd_read_stat().

Note that this patch can't fix the case where blk_mq_tag_to_rq()
returns a freed request; that will be fixed in the following
patches.
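
Condensed, the protocol looks like this (a sketch of the diff below,
both sides running under cmd->lock):

	/* submission side, nbd_handle_cmd() */
	if (!nbd_send_cmd(nbd, cmd, index))
		__set_bit(NBD_CMD_INFLIGHT, &cmd->flags);

	/* completion side, nbd_read_stat() */
	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
		/* no request was sent for this reply: reject it */
		ret = -ENOENT;
	}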

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/nbd.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..23ded569e8d3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -126,6 +126,12 @@ struct nbd_device {
 };
 
 #define NBD_CMD_REQUEUED	1
+/*
+ * This flag will be set if nbd_queue_rq() succeeds, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT	2
 
 struct nbd_cmd {
 	struct nbd_device *nbd;
@@ -400,6 +406,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;
 
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -729,6 +736,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	cmd = blk_mq_rq_to_pdu(req);
 
 	mutex_lock(&cmd->lock);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+			tag, cmd->status, cmd->flags);
+		ret = -ENOENT;
+		goto out;
+	}
 	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
 		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
 			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -828,6 +841,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;
 
 	mutex_lock(&cmd->lock);
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);
 
@@ -964,7 +978,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	 * returns EAGAIN can be retried on a different socket.
 	 */
 	ret = nbd_send_cmd(nbd, cmd, index);
-	if (ret == -EAGAIN) {
+	/*
+	 * Access to this flag is protected by cmd->lock, thus it's safe to set
+	 * the flag after nbd_send_cmd() succeeds in sending the request.
+	 */
+	if (!ret)
+		__set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	else if (ret == -EAGAIN) {
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Request send failed, requeueing\n");
 		nbd_mark_nsock_dead(nbd, nsock, 1);
-- 
2.31.1



* [PATCH v7 2/6] nbd: make sure request completion won't concurrent
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-16  7:53   ` Ming Lei
  2021-09-15  9:20 ` [PATCH v7 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

commit cddce0116058 ("nbd: Aovid double completion of a request")
tried to fix the case where nbd_clear_que() and recv_work() can
complete a request concurrently. However, the problem still exists:

t1                    t2                     t3

nbd_disconnect_and_put
 flush_workqueue
                      recv_work
                       blk_mq_complete_request
                        blk_mq_complete_request_remote -> this is true
                         WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
                          blk_mq_raise_softirq
                                             blk_done_softirq
                                              blk_complete_reqs
                                               nbd_complete_rq
                                                blk_mq_end_request
                                                 blk_mq_free_request
                                                  WRITE_ONCE(rq->state, MQ_RQ_IDLE)
  nbd_clear_que
   blk_mq_tagset_busy_iter
    nbd_clear_req
                                                   __blk_mq_free_request
                                                    blk_mq_put_tag
     blk_mq_complete_request -> complete again

There are three places where a request can be completed in nbd:
recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
all hold cmd->lock before completing the request, it's easy to
avoid the problem by setting and checking a cmd flag.
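
All three paths can then share the same guarded pattern (a sketch; the
exact status and return value differ per call site):

	mutex_lock(&cmd->lock);
	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
		/* someone else already completed this request */
		mutex_unlock(&cmd->lock);
		return;
	}
	cmd->status = BLK_STS_IOERR;
	mutex_unlock(&cmd->lock);
	blk_mq_complete_request(blk_mq_rq_from_pdu(cmd));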

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 23ded569e8d3..614c6ab2b8fe 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;
 
-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return BLK_EH_DONE;
+	}
+
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -841,7 +845,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;
 
 	mutex_lock(&cmd->lock);
-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return true;
+	}
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);
 
-- 
2.31.1



* [PATCH v7 3/6] nbd: check sock index in nbd_read_stat()
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 2/6] nbd: make sure request completion won't concurrent Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

The socket on which the client sends a request in nbd_send_cmd() and
the one on which it receives the reply in nbd_read_stat() should be
the same.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 614c6ab2b8fe..c724a5bd7fa4 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 		ret = -ENOENT;
 		goto out;
 	}
+	if (cmd->index != index) {
+		dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
+			tag, index, cmd->index);
+	}
 	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
 		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
 			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
-- 
2.31.1



* [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
                   ` (2 preceding siblings ...)
  2021-09-15  9:20 ` [PATCH v7 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-16  7:58   ` Ming Lei
  2021-09-15  9:20 ` [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
  2021-09-15  9:20 ` [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply() Yu Kuai
  5 siblings, 1 reply; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

Currently, blk_mq_end_request() will be called if nbd_queue_rq()
fails, so starting the request in that situation is useless.
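
For reference, a simplified sketch of the dispatch path (not the exact
blk-mq core code): nbd_queue_rq() maps the negative errors above to
BLK_STS_IOERR, and a request whose ->queue_rq() returns an error status
is ended by the block layer itself, so starting it first buys nothing:

	ret = q->mq_ops->queue_rq(hctx, &bd);
	switch (ret) {
	case BLK_STS_OK:
		break;
	case BLK_STS_RESOURCE:
		break;			/* requeued and retried later */
	default:
		blk_mq_end_request(rq, ret);	/* failed request ends here */
	}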

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/nbd.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index c724a5bd7fa4..22c91d8901f6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -934,7 +934,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Socks array is empty\n");
-		blk_mq_start_request(req);
 		return -EINVAL;
 	}
 	config = nbd->config;
@@ -943,7 +942,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Attempted send on invalid socket\n");
 		nbd_config_put(nbd);
-		blk_mq_start_request(req);
 		return -EINVAL;
 	}
 	cmd->status = BLK_STS_OK;
@@ -967,7 +965,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 			 */
 			sock_shutdown(nbd);
 			nbd_config_put(nbd);
-			blk_mq_start_request(req);
 			return -EIO;
 		}
 		goto again;
-- 
2.31.1



* [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
                   ` (3 preceding siblings ...)
  2021-09-15  9:20 ` [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-16  8:00   ` Ming Lei
  2021-09-15  9:20 ` [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply() Yu Kuai
  5 siblings, 1 reply; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

Prepare to fix the UAF in nbd_read_stat(); no functional changes.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 76 +++++++++++++++++++++++++++------------------
 1 file changed, 45 insertions(+), 31 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22c91d8901f6..9a7bbf8ebe74 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -694,38 +694,45 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 	return 0;
 }
 
-/* NULL returned = something went wrong, inform userspace */
-static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
+static int nbd_read_reply(struct nbd_device *nbd, int index,
+			  struct nbd_reply *reply)
 {
-	struct nbd_config *config = nbd->config;
-	int result;
-	struct nbd_reply reply;
-	struct nbd_cmd *cmd;
-	struct request *req = NULL;
-	u64 handle;
-	u16 hwq;
-	u32 tag;
-	struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
+	struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
 	struct iov_iter to;
-	int ret = 0;
+	int result;
 
-	reply.magic = 0;
-	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
+	reply->magic = 0;
+	iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
 	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
-	if (result <= 0) {
-		if (!nbd_disconnected(config))
+	if (result < 0) {
+		if (!nbd_disconnected(nbd->config))
 			dev_err(disk_to_dev(nbd->disk),
 				"Receive control failed (result %d)\n", result);
-		return ERR_PTR(result);
+		return result;
 	}
 
-	if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
+	if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
 		dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
-				(unsigned long)ntohl(reply.magic));
-		return ERR_PTR(-EPROTO);
+				(unsigned long)ntohl(reply->magic));
+		return -EPROTO;
 	}
 
-	memcpy(&handle, reply.handle, sizeof(handle));
+	return 0;
+}
+
+/* NULL returned = something went wrong, inform userspace */
+static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+					struct nbd_reply *reply)
+{
+	int result;
+	struct nbd_cmd *cmd;
+	struct request *req = NULL;
+	u64 handle;
+	u16 hwq;
+	u32 tag;
+	int ret = 0;
+
+	memcpy(&handle, reply->handle, sizeof(handle));
 	tag = nbd_handle_to_tag(handle);
 	hwq = blk_mq_unique_tag_to_hwq(tag);
 	if (hwq < nbd->tag_set.nr_hw_queues)
@@ -768,9 +775,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 		ret = -ENOENT;
 		goto out;
 	}
-	if (ntohl(reply.error)) {
+	if (ntohl(reply->error)) {
 		dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
-			ntohl(reply.error));
+			ntohl(reply->error));
 		cmd->status = BLK_STS_IOERR;
 		goto out;
 	}
@@ -779,6 +786,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	if (rq_data_dir(req) != WRITE) {
 		struct req_iterator iter;
 		struct bio_vec bvec;
+		struct iov_iter to;
 
 		rq_for_each_segment(bvec, req, iter) {
 			iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
@@ -792,7 +800,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 				 * and let the timeout stuff handle resubmitting
 				 * this request onto another connection.
 				 */
-				if (nbd_disconnected(config)) {
+				if (nbd_disconnected(nbd->config)) {
 					cmd->status = BLK_STS_IOERR;
 					goto out;
 				}
@@ -816,24 +824,30 @@ static void recv_work(struct work_struct *work)
 						     work);
 	struct nbd_device *nbd = args->nbd;
 	struct nbd_config *config = nbd->config;
+	struct nbd_sock *nsock;
 	struct nbd_cmd *cmd;
 	struct request *rq;
 
 	while (1) {
-		cmd = nbd_read_stat(nbd, args->index);
-		if (IS_ERR(cmd)) {
-			struct nbd_sock *nsock = config->socks[args->index];
+		struct nbd_reply reply;
 
-			mutex_lock(&nsock->tx_lock);
-			nbd_mark_nsock_dead(nbd, nsock, 1);
-			mutex_unlock(&nsock->tx_lock);
+		if (nbd_read_reply(nbd, args->index, &reply))
+			break;
+
+		cmd = nbd_handle_reply(nbd, args->index, &reply);
+		if (IS_ERR(cmd))
 			break;
-		}
 
 		rq = blk_mq_rq_from_pdu(cmd);
 		if (likely(!blk_should_fake_timeout(rq->q)))
 			blk_mq_complete_request(rq);
 	}
+
+	nsock = config->socks[args->index];
+	mutex_lock(&nsock->tx_lock);
+	nbd_mark_nsock_dead(nbd, nsock, 1);
+	mutex_unlock(&nsock->tx_lock);
+
 	nbd_config_put(nbd);
 	atomic_dec(&config->recv_threads);
 	wake_up(&config->recv_wq);
-- 
2.31.1



* [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply()
  2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
                   ` (4 preceding siblings ...)
  2021-09-15  9:20 ` [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
@ 2021-09-15  9:20 ` Yu Kuai
  2021-09-16  8:04   ` Ming Lei
  5 siblings, 1 reply; 14+ messages in thread
From: Yu Kuai @ 2021-09-15  9:20 UTC (permalink / raw)
  To: josef, axboe, hch, ming.lei
  Cc: linux-block, nbd, linux-kernel, yukuai3, yi.zhang

There is a problem that nbd_handle_reply() might access a freed request:

1) At first, a normal IO is submitted and completed with a scheduler:

internal_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tags->static_rq[internal_tag]

So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
same request, sched_tags->static_rq[internal_tag], even after the IO
is finished.

2) The nbd server sends a reply with a random tag directly:

recv_work
 nbd_handle_reply
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]

3) If the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) accesses rq before the rq mapping is cleared
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here

4) Then, nbd continues to use the freed request in nbd_handle_reply().

Fix the problem by getting 'q_usage_counter' before calling
blk_mq_tag_to_rq(); this ensures the request can't be freed, because
'q_usage_counter' is not zero.
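
This works because everything that frees the request pool (an elevator
switch, queue cleanup) first freezes the queue, and freezing waits for
'q_usage_counter' to drop to zero. Roughly, simplified from
blk_freeze_queue_start()/blk_mq_freeze_queue_wait():

	/* freezing side */
	percpu_ref_kill(&q->q_usage_counter);
	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));

So while recv_work() holds a reference taken with percpu_ref_tryget(),
the wait above can't finish and the pool stays alive; once the queue is
frozen and the counter has dropped to zero, percpu_ref_tryget() fails
and the garbage reply is simply dropped.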

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 9a7bbf8ebe74..3e8b70b5d4f9 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -824,6 +824,7 @@ static void recv_work(struct work_struct *work)
 						     work);
 	struct nbd_device *nbd = args->nbd;
 	struct nbd_config *config = nbd->config;
+	struct request_queue *q = nbd->disk->queue;
 	struct nbd_sock *nsock;
 	struct nbd_cmd *cmd;
 	struct request *rq;
@@ -834,7 +835,24 @@ static void recv_work(struct work_struct *work)
 		if (nbd_read_reply(nbd, args->index, &reply))
 			break;
 
+		/*
+		 * Grab ref of q_usage_counter can prevent request being freed
+		 * during nbd_handle_reply(). If q_usage_counter is zero, then
+		 * no request is inflight, which means something is wrong since
+		 * we expect to find a request to complete here.
+		 */
+		if (!percpu_ref_tryget(&q->q_usage_counter)) {
+			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
+				__func__);
+			break;
+		}
+
 		cmd = nbd_handle_reply(nbd, args->index, &reply);
+		/*
+		 * It's safe to drop ref before request completion, inflight
+		 * request will ensure q_usage_counter won't be zero.
+		 */
+		percpu_ref_put(&q->q_usage_counter);
 		if (IS_ERR(cmd))
 			break;
 
-- 
2.31.1



* Re: [PATCH v7 2/6] nbd: make sure request completion won't concurrent
  2021-09-15  9:20 ` [PATCH v7 2/6] nbd: make sure request completion won't concurrent Yu Kuai
@ 2021-09-16  7:53   ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2021-09-16  7:53 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 15, 2021 at 05:20:06PM +0800, Yu Kuai wrote:
> commit cddce0116058 ("nbd: Aovid double completion of a request")
> tried to fix the case where nbd_clear_que() and recv_work() can
> complete a request concurrently. However, the problem still exists:
> 
> t1                    t2                     t3
> 
> nbd_disconnect_and_put
>  flush_workqueue
>                       recv_work
>                        blk_mq_complete_request
>                         blk_mq_complete_request_remote -> this is true
>                          WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
>                           blk_mq_raise_softirq
>                                              blk_done_softirq
>                                               blk_complete_reqs
>                                                nbd_complete_rq
>                                                 blk_mq_end_request
>                                                  blk_mq_free_request
>                                                   WRITE_ONCE(rq->state, MQ_RQ_IDLE)
>   nbd_clear_que
>    blk_mq_tagset_busy_iter
>     nbd_clear_req
>                                                    __blk_mq_free_request
>                                                     blk_mq_put_tag
>      blk_mq_complete_request -> complete again
> 
> There are three places where a request can be completed in nbd:
> recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
> all hold cmd->lock before completing the request, it's easy to
> avoid the problem by setting and checking a cmd flag.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed
  2021-09-15  9:20 ` [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
@ 2021-09-16  7:58   ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2021-09-16  7:58 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 15, 2021 at 05:20:08PM +0800, Yu Kuai wrote:
> Currently, blk_mq_end_request() will be called if nbd_queue_rq()
> fails, so starting the request in that situation is useless.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/block/nbd.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index c724a5bd7fa4..22c91d8901f6 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -934,7 +934,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
>  	if (!refcount_inc_not_zero(&nbd->config_refs)) {
>  		dev_err_ratelimited(disk_to_dev(nbd->disk),
>  				    "Socks array is empty\n");
> -		blk_mq_start_request(req);
>  		return -EINVAL;
>  	}
>  	config = nbd->config;
> @@ -943,7 +942,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
>  		dev_err_ratelimited(disk_to_dev(nbd->disk),
>  				    "Attempted send on invalid socket\n");
>  		nbd_config_put(nbd);
> -		blk_mq_start_request(req);
>  		return -EINVAL;
>  	}
>  	cmd->status = BLK_STS_OK;
> @@ -967,7 +965,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
>  			 */
>  			sock_shutdown(nbd);
>  			nbd_config_put(nbd);
> -			blk_mq_start_request(req);
>  			return -EIO;
>  		}

It is basically a partial revert of the following fix; care to add a
commit log note about why removing these blk_mq_start_request() calls
is safe?

commit 6a468d5990ecd1c2d07dd85f8633bbdd0ba61c40
Author: Josef Bacik <jbacik@fb.com>
Date:   Mon Nov 6 16:11:58 2017 -0500

    nbd: don't start req until after the dead connection logic



Thanks, 
Ming



* Re: [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
  2021-09-15  9:20 ` [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
@ 2021-09-16  8:00   ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2021-09-16  8:00 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 15, 2021 at 05:20:09PM +0800, Yu Kuai wrote:
> Prepare to fix the UAF in nbd_read_stat(); no functional changes.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 76 +++++++++++++++++++++++++++------------------
>  1 file changed, 45 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 22c91d8901f6..9a7bbf8ebe74 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -694,38 +694,45 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
>  	return 0;
>  }
>  
> -/* NULL returned = something went wrong, inform userspace */
> -static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> +static int nbd_read_reply(struct nbd_device *nbd, int index,
> +			  struct nbd_reply *reply)
>  {
> -	struct nbd_config *config = nbd->config;
> -	int result;
> -	struct nbd_reply reply;
> -	struct nbd_cmd *cmd;
> -	struct request *req = NULL;
> -	u64 handle;
> -	u16 hwq;
> -	u32 tag;
> -	struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
> +	struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
>  	struct iov_iter to;
> -	int ret = 0;
> +	int result;
>  
> -	reply.magic = 0;
> -	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
> +	reply->magic = 0;
> +	iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
>  	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
> -	if (result <= 0) {
> -		if (!nbd_disconnected(config))
> +	if (result < 0) {
> +		if (!nbd_disconnected(nbd->config))

The above is actually sort of a functional change; I'd suggest doing it
in a single standalone patch, because sock_xmit() won't return zero.

-- 
Ming



* Re: [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply()
  2021-09-15  9:20 ` [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply() Yu Kuai
@ 2021-09-16  8:04   ` Ming Lei
  2021-09-16  8:47     ` yukuai (C)
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2021-09-16  8:04 UTC (permalink / raw)
  To: Yu Kuai; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Wed, Sep 15, 2021 at 05:20:10PM +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access a freed request:
> 
> 1) At first, a normal IO is submitted and completed with a scheduler:
> 
> internal_tag = blk_mq_get_tag -> get tag from sched_tags
>  blk_mq_rq_ctx_init
>   sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
> ...
> blk_mq_get_driver_tag
>  __blk_mq_get_driver_tag -> get tag from tags
>  tags->rq[tag] = sched_tags->static_rq[internal_tag]
> 
> So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
> same request, sched_tags->static_rq[internal_tag], even after the IO
> is finished.
> 
> 2) The nbd server sends a reply with a random tag directly:
> 
> recv_work
>  nbd_handle_reply
>   blk_mq_tag_to_rq(tags, tag)
>    rq = tags->rq[tag]
> 
> 3) If the sched_tags->static_rq is freed:
> 
> blk_mq_sched_free_requests
>  blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>   -> step 2) accesses rq before the rq mapping is cleared
>   blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>   __free_pages() -> rq is freed here
> 
> 4) Then, nbd continues to use the freed request in nbd_handle_reply().
> 
> Fix the problem by getting 'q_usage_counter' before calling
> blk_mq_tag_to_rq(); this ensures the request can't be freed, because
> 'q_usage_counter' is not zero.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 9a7bbf8ebe74..3e8b70b5d4f9 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -824,6 +824,7 @@ static void recv_work(struct work_struct *work)
>  						     work);
>  	struct nbd_device *nbd = args->nbd;
>  	struct nbd_config *config = nbd->config;
> +	struct request_queue *q = nbd->disk->queue;
>  	struct nbd_sock *nsock;
>  	struct nbd_cmd *cmd;
>  	struct request *rq;
> @@ -834,7 +835,24 @@ static void recv_work(struct work_struct *work)
>  		if (nbd_read_reply(nbd, args->index, &reply))
>  			break;
>  
> +		/*
> +		 * Grab ref of q_usage_counter can prevent request being freed
> +		 * during nbd_handle_reply(). If q_usage_counter is zero, then
> +		 * no request is inflight, which means something is wrong since
> +		 * we expect to find a request to complete here.
> +		 */

The above comment is wrong; the purpose is simply to avoid the request
pool being freed, e.g. an elevator switch won't happen once
->q_usage_counter is grabbed. So no request UAF can be triggered
when calling into nbd_handle_reply().

> +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
> +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
> +				__func__);
> +			break;
> +		}
> +
>  		cmd = nbd_handle_reply(nbd, args->index, &reply);
> +		/*
> +		 * It's safe to drop ref before request completion, inflight
> +		 * request will ensure q_usage_counter won't be zero.
> +		 */

The above comment is useless actually.

-- 
Ming



* Re: [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  8:04   ` Ming Lei
@ 2021-09-16  8:47     ` yukuai (C)
  2021-09-16  9:06       ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: yukuai (C) @ 2021-09-16  8:47 UTC (permalink / raw)
  To: Ming Lei; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 16:04, Ming Lei wrote:
> On Wed, Sep 15, 2021 at 05:20:10PM +0800, Yu Kuai wrote:
>> There is a problem that nbd_handle_reply() might access a freed request:
>>
>> 1) At first, a normal IO is submitted and completed with a scheduler:
>>
>> internal_tag = blk_mq_get_tag -> get tag from sched_tags
>>   blk_mq_rq_ctx_init
>>    sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
>> ...
>> blk_mq_get_driver_tag
>>   __blk_mq_get_driver_tag -> get tag from tags
>>   tags->rq[tag] = sched_tags->static_rq[internal_tag]
>>
>> So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
>> same request, sched_tags->static_rq[internal_tag], even after the IO
>> is finished.
>>
>> 2) The nbd server sends a reply with a random tag directly:
>>
>> recv_work
>>   nbd_handle_reply
>>    blk_mq_tag_to_rq(tags, tag)
>>     rq = tags->rq[tag]
>>
>> 3) If the sched_tags->static_rq is freed:
>>
>> blk_mq_sched_free_requests
>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>    -> step 2) accesses rq before the rq mapping is cleared
>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>    __free_pages() -> rq is freed here
>>
>> 4) Then, nbd continues to use the freed request in nbd_handle_reply().
>>
>> Fix the problem by getting 'q_usage_counter' before calling
>> blk_mq_tag_to_rq(); this ensures the request can't be freed, because
>> 'q_usage_counter' is not zero.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/block/nbd.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 9a7bbf8ebe74..3e8b70b5d4f9 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -824,6 +824,7 @@ static void recv_work(struct work_struct *work)
>>   						     work);
>>   	struct nbd_device *nbd = args->nbd;
>>   	struct nbd_config *config = nbd->config;
>> +	struct request_queue *q = nbd->disk->queue;
>>   	struct nbd_sock *nsock;
>>   	struct nbd_cmd *cmd;
>>   	struct request *rq;
>> @@ -834,7 +835,24 @@ static void recv_work(struct work_struct *work)
>>   		if (nbd_read_reply(nbd, args->index, &reply))
>>   			break;
>>   
>> +		/*
>> +		 * Grab ref of q_usage_counter can prevent request being freed
>> +		 * during nbd_handle_reply(). If q_usage_counter is zero, then
>> +		 * no request is inflight, which means something is wrong since
>> +		 * we expect to find a request to complete here.
>> +		 */
> 
> The above comment is wrong, the purpose is simply for avoiding request
> pool freed, such as elevator switching won't happen once
> ->q_usage_counter is grabbed. So no any request UAF can be triggered
> when calling into nbd_handle_reply().

Do you mean the comment about q_usage_counter being zero is wrong?
> 
>> +		if (!percpu_ref_tryget(&q->q_usage_counter)) {
>> +			dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
>> +				__func__);
>> +			break;
>> +		}
>> +
>>   		cmd = nbd_handle_reply(nbd, args->index, &reply);
>> +		/*
>> +		 * It's safe to drop ref before request completion, inflight
>> +		 * request will ensure q_usage_counter won't be zero.
>> +		 */
> 
> The above comment is useless actually.

Will remove the comments.

Thanks,
Kuai
> 


* Re: [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  8:47     ` yukuai (C)
@ 2021-09-16  9:06       ` Ming Lei
  2021-09-16  9:14         ` yukuai (C)
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2021-09-16  9:06 UTC (permalink / raw)
  To: yukuai (C); +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On Thu, Sep 16, 2021 at 04:47:08PM +0800, yukuai (C) wrote:
> On 2021/09/16 16:04, Ming Lei wrote:
> > On Wed, Sep 15, 2021 at 05:20:10PM +0800, Yu Kuai wrote:
> > > There is a problem that nbd_handle_reply() might access a freed request:
> > > 
> > > 1) At first, a normal IO is submitted and completed with a scheduler:
> > > 
> > > internal_tag = blk_mq_get_tag -> get tag from sched_tags
> > >   blk_mq_rq_ctx_init
> > >    sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
> > > ...
> > > blk_mq_get_driver_tag
> > >   __blk_mq_get_driver_tag -> get tag from tags
> > >   tags->rq[tag] = sched_tags->static_rq[internal_tag]
> > > 
> > > So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
> > > same request, sched_tags->static_rq[internal_tag], even after the IO
> > > is finished.
> > > 
> > > 2) The nbd server sends a reply with a random tag directly:
> > > 
> > > recv_work
> > >   nbd_handle_reply
> > >    blk_mq_tag_to_rq(tags, tag)
> > >     rq = tags->rq[tag]
> > > 
> > > 3) If the sched_tags->static_rq is freed:
> > > 
> > > blk_mq_sched_free_requests
> > >   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> > >    -> step 2) accesses rq before the rq mapping is cleared
> > >    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> > >    __free_pages() -> rq is freed here
> > > 
> > > 4) Then, nbd continues to use the freed request in nbd_handle_reply().
> > > 
> > > Fix the problem by getting 'q_usage_counter' before calling
> > > blk_mq_tag_to_rq(); this ensures the request can't be freed, because
> > > 'q_usage_counter' is not zero.
> > > 
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > ---
> > >   drivers/block/nbd.c | 18 ++++++++++++++++++
> > >   1 file changed, 18 insertions(+)
> > > 
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 9a7bbf8ebe74..3e8b70b5d4f9 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -824,6 +824,7 @@ static void recv_work(struct work_struct *work)
> > >   						     work);
> > >   	struct nbd_device *nbd = args->nbd;
> > >   	struct nbd_config *config = nbd->config;
> > > +	struct request_queue *q = nbd->disk->queue;
> > >   	struct nbd_sock *nsock;
> > >   	struct nbd_cmd *cmd;
> > >   	struct request *rq;
> > > @@ -834,7 +835,24 @@ static void recv_work(struct work_struct *work)
> > >   		if (nbd_read_reply(nbd, args->index, &reply))
> > >   			break;
> > > +		/*
> > > +		 * Grab ref of q_usage_counter can prevent request being freed
> > > +		 * during nbd_handle_reply(). If q_usage_counter is zero, then
> > > +		 * no request is inflight, which means something is wrong since
> > > +		 * we expect to find a request to complete here.
> > > +		 */
> > 
> > The above comment is wrong; the purpose is simply to avoid the request
> > pool being freed, e.g. an elevator switch won't happen once
> > ->q_usage_counter is grabbed. So no request UAF can be triggered
> > when calling into nbd_handle_reply().
> 
> Do you mean the comment about q_usage_counter being zero is wrong?

How about the following words?

/*
 * Grab .q_usage_counter so the request pool won't go away; then no request
 * use-after-free is possible during nbd_handle_reply(). If the queue is
 * frozen, there won't be any inflight requests, so we needn't handle the
 * incoming garbage message.
 */

Thanks,
Ming



* Re: [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply()
  2021-09-16  9:06       ` Ming Lei
@ 2021-09-16  9:14         ` yukuai (C)
  0 siblings, 0 replies; 14+ messages in thread
From: yukuai (C) @ 2021-09-16  9:14 UTC (permalink / raw)
  To: Ming Lei; +Cc: josef, axboe, hch, linux-block, nbd, linux-kernel, yi.zhang

On 2021/09/16 17:06, Ming Lei wrote:
> On Thu, Sep 16, 2021 at 04:47:08PM +0800, yukuai (C) wrote:
>> On 2021/09/16 16:04, Ming Lei wrote:
>>> On Wed, Sep 15, 2021 at 05:20:10PM +0800, Yu Kuai wrote:
>>>> There is a problem that nbd_handle_reply() might access a freed request:
>>>>
>>>> 1) At first, a normal IO is submitted and completed with a scheduler:
>>>>
>>>> internal_tag = blk_mq_get_tag -> get tag from sched_tags
>>>>    blk_mq_rq_ctx_init
>>>>     sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
>>>> ...
>>>> blk_mq_get_driver_tag
>>>>    __blk_mq_get_driver_tag -> get tag from tags
>>>>    tags->rq[tag] = sched_tags->static_rq[internal_tag]
>>>>
>>>> So, both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
>>>> same request, sched_tags->static_rq[internal_tag], even after the IO
>>>> is finished.
>>>>
>>>> 2) The nbd server sends a reply with a random tag directly:
>>>>
>>>> recv_work
>>>>    nbd_handle_reply
>>>>     blk_mq_tag_to_rq(tags, tag)
>>>>      rq = tags->rq[tag]
>>>>
>>>> 3) If the sched_tags->static_rq is freed:
>>>>
>>>> blk_mq_sched_free_requests
>>>>    blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>>     -> step 2) accesses rq before the rq mapping is cleared
>>>>     blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>>     __free_pages() -> rq is freed here
>>>>
>>>> 4) Then, nbd continues to use the freed request in nbd_handle_reply().
>>>>
>>>> Fix the problem by getting 'q_usage_counter' before calling
>>>> blk_mq_tag_to_rq(); this ensures the request can't be freed, because
>>>> 'q_usage_counter' is not zero.
>>>>
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>> ---
>>>>    drivers/block/nbd.c | 18 ++++++++++++++++++
>>>>    1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>> index 9a7bbf8ebe74..3e8b70b5d4f9 100644
>>>> --- a/drivers/block/nbd.c
>>>> +++ b/drivers/block/nbd.c
>>>> @@ -824,6 +824,7 @@ static void recv_work(struct work_struct *work)
>>>>    						     work);
>>>>    	struct nbd_device *nbd = args->nbd;
>>>>    	struct nbd_config *config = nbd->config;
>>>> +	struct request_queue *q = nbd->disk->queue;
>>>>    	struct nbd_sock *nsock;
>>>>    	struct nbd_cmd *cmd;
>>>>    	struct request *rq;
>>>> @@ -834,7 +835,24 @@ static void recv_work(struct work_struct *work)
>>>>    		if (nbd_read_reply(nbd, args->index, &reply))
>>>>    			break;
>>>> +		/*
>>>> +		 * Grab ref of q_usage_counter can prevent request being freed
>>>> +		 * during nbd_handle_reply(). If q_usage_counter is zero, then
>>>> +		 * no request is inflight, which means something is wrong since
>>>> +		 * we expect to find a request to complete here.
>>>> +		 */
>>>
>>> The above comment is wrong; the purpose is simply to avoid the request
>>> pool being freed, e.g. an elevator switch won't happen once
>>> ->q_usage_counter is grabbed. So no request UAF can be triggered
>>> when calling into nbd_handle_reply().
>>
>> Do you mean the comment about q_usage_counter being zero is wrong?
> 
> How about the following words?
> 
> /*
>   * Grab .q_usage_counter so the request pool won't go away; then no request
>   * use-after-free is possible during nbd_handle_reply(). If the queue is
>   * frozen, there won't be any inflight requests, so we needn't handle the
>   * incoming garbage message.
>   */

Will use these words.

Thanks,
Kuai



Thread overview: 14+ messages
2021-09-15  9:20 [PATCH v7 0/6] handle unexpected message from server Yu Kuai
2021-09-15  9:20 ` [PATCH v7 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
2021-09-15  9:20 ` [PATCH v7 2/6] nbd: make sure request completion won't concurrent Yu Kuai
2021-09-16  7:53   ` Ming Lei
2021-09-15  9:20 ` [PATCH v7 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
2021-09-15  9:20 ` [PATCH v7 4/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
2021-09-16  7:58   ` Ming Lei
2021-09-15  9:20 ` [PATCH v7 5/6] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply() Yu Kuai
2021-09-16  8:00   ` Ming Lei
2021-09-15  9:20 ` [PATCH v7 6/6] nbd: fix uaf in nbd_handle_reply() Yu Kuai
2021-09-16  8:04   ` Ming Lei
2021-09-16  8:47     ` yukuai (C)
2021-09-16  9:06       ` Ming Lei
2021-09-16  9:14         ` yukuai (C)
