Subject: Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
From: "yukuai (C)"
To: Ming Lei
Date: Wed, 15 Sep 2021 11:36:47 +0800
Message-ID: <60f68f6b-5fff-6a39-b77b-4bbb86f1c87e@huawei.com>
References: <20210909141256.2606682-1-yukuai3@huawei.com>
 <20210909141256.2606682-6-yukuai3@huawei.com>
 <374c6b37-b4b2-fe01-66be-ca2dbbc283e9@huawei.com>
 <8f1849a3-6bf2-6b14-7ef9-4969a9a5425b@huawei.com>
X-Mailing-List: linux-block@vger.kernel.org

On 2021/09/15 11:16, Ming Lei wrote:
> On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote:
>> On 2021/09/14 22:37, Ming Lei wrote:
>>> On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
>>>> On 2021/09/14 15:46, Ming Lei wrote:
>>>>
>>>>> If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
>>>>> wondering why not take the following simpler way for avoiding the UAF?
>>>>>
>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>>> index 5170a630778d..dfa5cce71f66 100644
>>>>> --- a/drivers/block/nbd.c
>>>>> +++ b/drivers/block/nbd.c
>>>>> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
>>>>>  						     work);
>>>>>  	struct nbd_device *nbd = args->nbd;
>>>>>  	struct nbd_config *config = nbd->config;
>>>>> +	struct request_queue *q = nbd->disk->queue;
>>>>>  	struct nbd_cmd *cmd;
>>>>>  	struct request *rq;
>>>>>
>>>>> +	if (!percpu_ref_tryget(&q->q_usage_counter))
>>>>> +		return;
>>>>> +
>>>>>  	while (1) {
>>>>>  		cmd = nbd_read_stat(nbd, args->index);
>>>>>  		if (IS_ERR(cmd)) {
>>>>> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
>>>>>  		if (likely(!blk_should_fake_timeout(rq->q)))
>>>>>  			blk_mq_complete_request(rq);
>>>>>  	}
>>>>> +	blk_queue_exit(q);
>>>>>  	nbd_config_put(nbd);
>>>>>  	atomic_dec(&config->recv_threads);
>>>>>  	wake_up(&config->recv_wq);
>>>>>
>>>>
>>>> Hi, Ming
>>>>
>>>> This approach is wrong.
>>>>
>>>> If blk_mq_freeze_queue() is called, and nbd is waiting for all
>>>> requests to complete, percpu_ref_tryget() will fail here, and a deadlock
>>>> will occur because requests can't complete in recv_work().
>>>
>>> No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
>>> it is perfectly fine to do nothing in recv_work().
>>>
>>
>> Hi Ming
>>
>> This approach is a good idea; however, we should not take q_usage_counter
>> in recv_work(), because it will block queue freezing.
>>
>> How about getting q_usage_counter in nbd_read_stat(), and putting it in the
>> error path or after request completion?
>
> OK, it looks like I missed that nbd_read_stat() needs to wait for an incoming
> reply first, so how about the following change, partitioning nbd_read_stat()
> into nbd_read_reply() and nbd_handle_reply()?

Hi, Ming

The change looks good to me. Do you want to send a patch to fix this?

Thanks,
Kuai
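(For context on the freeze semantics argued above: percpu_ref_tryget() on
q_usage_counter keeps succeeding until the freezer kills the reference, and
the freezer then waits for every holder to drop out. Below is a minimal
sketch of that pair, condensed from the block layer's blk_mq_freeze_queue()
and blk_queue_exit() of that era; it is not the verbatim kernel code, with
freeze-depth accounting and locking elided:)

/* Sketch only: percpu_ref_kill() makes all later percpu_ref_tryget()
 * calls fail, then the freezer waits for existing holders to leave. */
void blk_mq_freeze_queue(struct request_queue *q)
{
	percpu_ref_kill(&q->q_usage_counter);	/* new tryget()s now fail */
	wait_event(q->mq_freeze_wq,
		   percpu_ref_is_zero(&q->q_usage_counter));
}

void blk_queue_exit(struct request_queue *q)
{
	percpu_ref_put(&q->q_usage_counter);	/* last put wakes the freezer */
}

This is why the reference must not be held across the blocking socket read:
a recv thread parked in sock_xmit() could keep the freezer waiting forever.
The full diff implementing the split is quoted below.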
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 5170a630778d..477fe057fc93 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -683,38 +683,47 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
>  	return 0;
>  }
>
> -/* NULL returned = something went wrong, inform userspace */
> -static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> +static int nbd_read_reply(struct nbd_device *nbd, int index,
> +			  struct nbd_reply *reply)
>  {
> -	struct nbd_config *config = nbd->config;
>  	int result;
> -	struct nbd_reply reply;
> -	struct nbd_cmd *cmd;
> -	struct request *req = NULL;
> -	u64 handle;
> -	u16 hwq;
> -	u32 tag;
> -	struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
> +	struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
>  	struct iov_iter to;
> -	int ret = 0;
>
> -	reply.magic = 0;
> +	reply->magic = 0;
>  	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
>  	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
> -	if (result <= 0) {
> -		if (!nbd_disconnected(config))
> +	if (result < 0) {
> +		if (!nbd_disconnected(nbd->config))
>  			dev_err(disk_to_dev(nbd->disk),
>  				"Receive control failed (result %d)\n", result);
> -		return ERR_PTR(result);
> +		return result;
>  	}
>
> -	if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
> +	if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
>  		dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
> -			(unsigned long)ntohl(reply.magic));
> -		return ERR_PTR(-EPROTO);
> +			(unsigned long)ntohl(reply->magic));
> +		return -EPROTO;
>  	}
>
> -	memcpy(&handle, reply.handle, sizeof(handle));
> +	return 0;
> +}
> +
> +/* NULL returned = something went wrong, inform userspace */
> +static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
> +					struct nbd_reply *reply)
> +{
> +	struct nbd_config *config = nbd->config;
> +	int result;
> +	struct nbd_cmd *cmd;
> +	struct request *req = NULL;
> +	u64 handle;
> +	u16 hwq;
> +	u32 tag;
> +	struct iov_iter to;
> +	int ret = 0;
> +
> +	memcpy(&handle, reply->handle, sizeof(handle));
>  	tag = nbd_handle_to_tag(handle);
>  	hwq = blk_mq_unique_tag_to_hwq(tag);
>  	if (hwq < nbd->tag_set.nr_hw_queues)
> @@ -747,9 +756,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
>  		ret = -ENOENT;
>  		goto out;
>  	}
> -	if (ntohl(reply.error)) {
> +	if (ntohl(reply->error)) {
>  		dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
> -			ntohl(reply.error));
> +			ntohl(reply->error));
>  		cmd->status = BLK_STS_IOERR;
>  		goto out;
>  	}
> @@ -795,24 +804,36 @@ static void recv_work(struct work_struct *work)
>  						     work);
>  	struct nbd_device *nbd = args->nbd;
>  	struct nbd_config *config = nbd->config;
> +	struct request_queue *q = nbd->disk->queue;
> +	struct nbd_sock *nsock;
>  	struct nbd_cmd *cmd;
>  	struct request *rq;
>
>  	while (1) {
> -		cmd = nbd_read_stat(nbd, args->index);
> -		if (IS_ERR(cmd)) {
> -			struct nbd_sock *nsock = config->socks[args->index];
> +		struct nbd_reply reply;
>
> -			mutex_lock(&nsock->tx_lock);
> -			nbd_mark_nsock_dead(nbd, nsock, 1);
> -			mutex_unlock(&nsock->tx_lock);
> +		if (nbd_read_reply(nbd, args->index, &reply))
>  			break;
> -		}
>
> +		if (!percpu_ref_tryget(&q->q_usage_counter))
> +			break;
> +
> +		cmd = nbd_handle_reply(nbd, args->index, &reply);
> +		if (IS_ERR(cmd)) {
> +			blk_queue_exit(q);
> +			break;
> +		}
>  		rq = blk_mq_rq_from_pdu(cmd);
>  		if (likely(!blk_should_fake_timeout(rq->q)))
>  			blk_mq_complete_request(rq);
> +		blk_queue_exit(q);
>  	}
> +
> +	nsock = config->socks[args->index];
> +	mutex_lock(&nsock->tx_lock);
> +	nbd_mark_nsock_dead(nbd, nsock, 1);
> +	mutex_unlock(&nsock->tx_lock);
> +
>  	nbd_config_put(nbd);
>  	atomic_dec(&config->recv_threads);
>  	wake_up(&config->recv_wq);
>
> Thanks,
> Ming
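(A condensed restatement of the resulting recv_work() loop, for readers
skimming the diff above; this is a paraphrase with error handling and the
socket-teardown path elided, not the final merged code:)

	while (1) {
		struct nbd_reply reply;

		/* Blocking socket read, done with no queue reference held,
		 * so a concurrent freeze never waits on an idle recv thread. */
		if (nbd_read_reply(nbd, args->index, &reply))
			break;

		/* Hold q_usage_counter only around the part that looks up the
		 * tag and completes the request, so the request and tag set
		 * cannot be freed underneath us (the original UAF). Once the
		 * queue is frozen or dying, tryget fails and the loop stops. */
		if (!percpu_ref_tryget(&q->q_usage_counter))
			break;

		cmd = nbd_handle_reply(nbd, args->index, &reply);
		if (IS_ERR(cmd)) {
			blk_queue_exit(q);
			break;
		}

		rq = blk_mq_rq_from_pdu(cmd);
		if (likely(!blk_should_fake_timeout(rq->q)))
			blk_mq_complete_request(rq);
		blk_queue_exit(q);
	}

The design point is that the UAF is closed by pinning the queue only for the
short, non-blocking completion step, while the freeze deadlock is avoided by
never pinning it across a wait on the server.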