From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FA2CC433F5 for ; Wed, 15 Sep 2021 03:16:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E045561175 for ; Wed, 15 Sep 2021 03:16:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235413AbhIODRZ (ORCPT ); Tue, 14 Sep 2021 23:17:25 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:53769 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235015AbhIODRY (ORCPT ); Tue, 14 Sep 2021 23:17:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1631675765; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zlyxS0b1FK2wfVn9hcZ7WHgk1exfB9oaJDccxJiZ+b8=; b=f23pcD7p5AtfUWR6jtsbOkuzhr6Z4ZVURV1TeY4SysEZKz+BkE49e9aHeONGYdAc64+3CO JquJTce+CbcLd7ml8rhYEswLXRMwFxIuBTB2VJ7NtimOMMKdl/bzBaXfaTi0kmMuzgiM2V 9tjhH3M/0pU6DfRPrb2qSrGydMkr6+o= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-516-m5JDQB5rOWiq4iijiQWLVA-1; Tue, 14 Sep 2021 23:16:04 -0400 X-MC-Unique: m5JDQB5rOWiq4iijiQWLVA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2FA7E100C661; Wed, 15 Sep 2021 03:16:03 +0000 (UTC) Received: from T590 (ovpn-12-59.pek2.redhat.com [10.72.12.59]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 4D76B1972E; Wed, 15 Sep 2021 03:15:55 +0000 (UTC) Date: Wed, 15 Sep 2021 11:16:06 +0800 From: Ming Lei To: "yukuai (C)" Cc: axboe@kernel.dk, josef@toxicpanda.com, hch@infradead.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, nbd@other.debian.org, yi.zhang@huawei.com Subject: Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req() Message-ID: References: <20210909141256.2606682-1-yukuai3@huawei.com> <20210909141256.2606682-6-yukuai3@huawei.com> <374c6b37-b4b2-fe01-66be-ca2dbbc283e9@huawei.com> <8f1849a3-6bf2-6b14-7ef9-4969a9a5425b@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8f1849a3-6bf2-6b14-7ef9-4969a9a5425b@huawei.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote: > On 2021/09/14 22:37, Ming Lei wrote: > > On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote: > > > On 在 2021/09/14 15:46, Ming Lei wrote: > > > > > > > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just > > > > wondering why not take the following simpler way for avoiding the UAF? > > > > > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > > > > index 5170a630778d..dfa5cce71f66 100644 > > > > --- a/drivers/block/nbd.c > > > > +++ b/drivers/block/nbd.c > > > > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work) > > > > work); > > > > struct nbd_device *nbd = args->nbd; > > > > struct nbd_config *config = nbd->config; > > > > + struct request_queue *q = nbd->disk->queue; > > > > struct nbd_cmd *cmd; > > > > struct request *rq; > > > > + if (!percpu_ref_tryget(&q->q_usage_counter)) > > > > + return; > > > > + > > > > while (1) { > > > > cmd = nbd_read_stat(nbd, args->index); > > > > if (IS_ERR(cmd)) { > > > > @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work) > > > > if (likely(!blk_should_fake_timeout(rq->q))) > > > > blk_mq_complete_request(rq); > > > > } > > > > + blk_queue_exit(q); > > > > nbd_config_put(nbd); > > > > atomic_dec(&config->recv_threads); > > > > wake_up(&config->recv_wq); > > > > > > > > > > Hi, Ming > > > > > > This apporch is wrong. > > > > > > If blk_mq_freeze_queue() is called, and nbd is waiting for all > > > request to complete. percpu_ref_tryget() will fail here, and deadlock > > > will occur because request can't complete in recv_work(). > > > > No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when > > it is perfectly fine to do nothing in recv_work(). > > > > Hi Ming > > This apporch is a good idea, however we should not get q_usage_counter > in reccv_work(), because It will block freeze queue. > > How about get q_usage_counter in nbd_read_stat(), and put in error path > or after request completion? OK, looks I missed that nbd_read_stat() needs to wait for incoming reply first, so how about the following change by partitioning nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()? diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 5170a630778d..477fe057fc93 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -683,38 +683,47 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index) return 0; } -/* NULL returned = something went wrong, inform userspace */ -static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) +static int nbd_read_reply(struct nbd_device *nbd, int index, + struct nbd_reply *reply) { - struct nbd_config *config = nbd->config; int result; - struct nbd_reply reply; - struct nbd_cmd *cmd; - struct request *req = NULL; - u64 handle; - u16 hwq; - u32 tag; - struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)}; + struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)}; struct iov_iter to; - int ret = 0; - reply.magic = 0; + reply->magic = 0; iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply)); result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL); - if (result <= 0) { - if (!nbd_disconnected(config)) + if (result < 0) { + if (!nbd_disconnected(nbd->config)) dev_err(disk_to_dev(nbd->disk), "Receive control failed (result %d)\n", result); - return ERR_PTR(result); + return result; } - if (ntohl(reply.magic) != NBD_REPLY_MAGIC) { + if (ntohl(reply->magic) != NBD_REPLY_MAGIC) { dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n", - (unsigned long)ntohl(reply.magic)); - return ERR_PTR(-EPROTO); + (unsigned long)ntohl(reply->magic)); + return -EPROTO; } - memcpy(&handle, reply.handle, sizeof(handle)); + return 0; +} + +/* NULL returned = something went wrong, inform userspace */ +static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index, + struct nbd_reply *reply) +{ + struct nbd_config *config = nbd->config; + int result; + struct nbd_cmd *cmd; + struct request *req = NULL; + u64 handle; + u16 hwq; + u32 tag; + struct iov_iter to; + int ret = 0; + + memcpy(&handle, reply->handle, sizeof(handle)); tag = nbd_handle_to_tag(handle); hwq = blk_mq_unique_tag_to_hwq(tag); if (hwq < nbd->tag_set.nr_hw_queues) @@ -747,9 +756,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) ret = -ENOENT; goto out; } - if (ntohl(reply.error)) { + if (ntohl(reply->error)) { dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n", - ntohl(reply.error)); + ntohl(reply->error)); cmd->status = BLK_STS_IOERR; goto out; } @@ -795,24 +804,36 @@ static void recv_work(struct work_struct *work) work); struct nbd_device *nbd = args->nbd; struct nbd_config *config = nbd->config; + struct request_queue *q = nbd->disk->queue; + struct nbd_sock *nsock; struct nbd_cmd *cmd; struct request *rq; while (1) { - cmd = nbd_read_stat(nbd, args->index); - if (IS_ERR(cmd)) { - struct nbd_sock *nsock = config->socks[args->index]; + struct nbd_reply reply; - mutex_lock(&nsock->tx_lock); - nbd_mark_nsock_dead(nbd, nsock, 1); - mutex_unlock(&nsock->tx_lock); + if (nbd_read_reply(nbd, args->index, &reply)) break; - } + if (!percpu_ref_tryget(&q->q_usage_counter)) + break; + + cmd = nbd_handle_reply(nbd, args->index, &reply); + if (IS_ERR(cmd)) { + blk_queue_exit(q); + break; + } rq = blk_mq_rq_from_pdu(cmd); if (likely(!blk_should_fake_timeout(rq->q))) blk_mq_complete_request(rq); + blk_queue_exit(q); } + + nsock = config->socks[args->index]; + mutex_lock(&nsock->tx_lock); + nbd_mark_nsock_dead(nbd, nsock, 1); + mutex_unlock(&nsock->tx_lock); + nbd_config_put(nbd); atomic_dec(&config->recv_threads); wake_up(&config->recv_wq); Thanks, Ming