From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.4 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2468C352A4 for ; Mon, 10 Feb 2020 03:16:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7FA8220870 for ; Mon, 10 Feb 2020 03:16:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UPWVkpeu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726961AbgBJDQ6 (ORCPT ); Sun, 9 Feb 2020 22:16:58 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:31897 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726950AbgBJDQ6 (ORCPT ); Sun, 9 Feb 2020 22:16:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1581304616; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=B2VonG/X8Vx9WnWym+dJCLHJZ7AYbg4oAY+nxppyQOk=; b=UPWVkpeuPGfYQX2ppyF/3n7B1Ts4vPXN4hwsklRlMA/rHElBl+TL4r5HlB/sOrrcNVxpNx avHX6JWwZ+D1McnQL5mok/k3o5vKhHUhRmkU7KaY2RQEiCPXJ/TzDzksIFNZ8RWHREN/xQ ezgj8D3oqOQvOQ9famXflyrtb0pNpHM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-18-DNqv5R2gOG-s4TlFFWkpDQ-1; Sun, 09 Feb 2020 22:16:51 -0500 X-MC-Unique: DNqv5R2gOG-s4TlFFWkpDQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3AAF3184AEA9; Mon, 10 Feb 2020 03:16:50 +0000 (UTC) Received: from [10.10.120.117] (ovpn-120-117.rdu2.redhat.com [10.10.120.117]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6F65360BE0; Mon, 10 Feb 2020 03:16:48 +0000 (UTC) Subject: Re: [PATCH] nbd: fix potential NULL pointer fault in connect and disconnect process To: "sunke (E)" , josef@toxicpanda.com, axboe@kernel.dk References: <20200117115005.37006-1-sunke32@huawei.com> <5E21EF96.1010204@redhat.com> Cc: linux-block@vger.kernel.org, nbd@other.debian.org, linux-kernel@vger.kernel.org, Xiubo Li From: Mike Christie Message-ID: <5E40CB1F.7070301@redhat.com> Date: Sun, 9 Feb 2020 21:16:47 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------010007000101060400020607" X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This is a multi-part message in MIME format. --------------010007000101060400020607 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 01/19/2020 01:10 AM, sunke (E) wrote: >=20 > Thanks for your detailed suggestions. >=20 > =E5=9C=A8 2020/1/18 1:32, Mike Christie =E5=86=99=E9=81=93: >> On 01/17/2020 05:50 AM, Sun Ke wrote: >>> Connect and disconnect a nbd device repeatedly, will cause >>> NULL pointer fault. >>> >>> It will appear by the steps: >>> 1. Connect the nbd device and disconnect it, but now nbd device >>> is not disconnected totally. >>> 2. Connect the same nbd device again immediately, it will fail >>> in nbd_start_device with a EBUSY return value. >>> 3. Wait a second to make sure the last config_refs is reduced >>> and run nbd_config_put to disconnect the nbd device totally. >>> 4. Start another process to open the nbd_device, config_refs >>> will increase and at the same time disconnect it. >> >> Just to make sure I understood this, for step 4 the process is doing: >> >> open(/dev/nbdX); >> ioctl(NBD_DISCONNECT, /dev/nbdX) or nbd_genl_disconnect(for /dev/nbdX) >> >> ? >> > do nbd_genl_disconnect(for /dev/nbdX)=EF=BC=9B > I tested it. Connect /dev/nbdX > through ioctl interface by nbd-client -L -N export localhost /dev/nbdX = and > through netlink interface by nbd-client localhost XXXX /dev/nbdX, > disconnect /dev/nbdX by nbd-client -d /dev/nbdX. > Both call nbd_genl_disconnect(for /dev/nbdX) and both contain the same > null pointer dereference. >> There is no successful NBD_DO_IT / nbd_genl_connect between the open a= nd >> disconnect calls at step #4, because it would normally be done at #2 a= nd >> that failed. nbd_disconnect_and_put could then reference a null >> recv_workq. If we are also racing with a close() then that could free >> the device/config from under nbd_disconnect_and_put. >> > Yes, nbd_disconnect_and_put could then reference a null recv_workq. Hey Sunke How about the attached patch. I am still testing it. The basic idea is that we need to do a flush whenever we have done a sock_shutdown and are in the disconnect/connect/clear sock path, so it just adds the flush in that function. We then do not need to keep adding these flushes everywher= e. --------------010007000101060400020607 Content-Type: text/x-patch; name="0001-nbd-fix-crash-during-nbd_genl_disconnect.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0001-nbd-fix-crash-during-nbd_genl_disconnect.patch" >From c007f70b73d31a11ea90ae53c748266eec88c0ab Mon Sep 17 00:00:00 2001 From: Mike Christie Date: Sun, 9 Feb 2020 21:06:00 -0600 Subject: [PATCH] nbd: fix crash during nbd_genl_disconnect If we open a nbd device, but have not done a nbd_genl_connect we will crash in nbd_genl_disconnect when we try to flush the workqueue since it was never allocated. This patch moves all the flush calls to sock_shutdown and adds a check for a valid workqueue. Note that we flush_workqueue will do the right thing if there are no running works so we did not need the if (i) check in nbd_start_device. This also fixes a possible bug where you could do nbd_genl_connect, then do a NBD_DISCONNECT and NBD_CLEAR_SOCK and the workqueue would not get flushed. --- drivers/block/nbd.c | 44 +++++++++++++++++--------------------------- 1 file changed, 17 insertions(+), 27 deletions(-) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 78181908f0df..1afc70ed1f0a 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -341,9 +341,12 @@ static void nbd_complete_rq(struct request *req) } /* - * Forcibly shutdown the socket causing all listeners to error + * Forcibly shutdown the socket causing all listeners to error. If called + * from a disconnect/clearing context we must make sure the recv workqueues + * have exited so they don't release the last nbd_config reference and try to + * destroy the workqueue from inside itself. */ -static void sock_shutdown(struct nbd_device *nbd) +static void sock_shutdown(struct nbd_device *nbd, bool flush_work) { struct nbd_config *config = nbd->config; int i; @@ -351,7 +354,7 @@ static void sock_shutdown(struct nbd_device *nbd) if (config->num_connections == 0) return; if (test_and_set_bit(NBD_RT_DISCONNECTED, &config->runtime_flags)) - return; + goto try_flush; for (i = 0; i < config->num_connections; i++) { struct nbd_sock *nsock = config->socks[i]; @@ -360,6 +363,10 @@ static void sock_shutdown(struct nbd_device *nbd) mutex_unlock(&nsock->tx_lock); } dev_warn(disk_to_dev(nbd->disk), "shutting down sockets\n"); + +try_flush: + if (flush_work && nbd->recv_workq) + flush_workqueue(nbd->recv_workq); } static u32 req_to_nbd_cmd_type(struct request *req) @@ -446,7 +453,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req, set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags); cmd->status = BLK_STS_IOERR; mutex_unlock(&cmd->lock); - sock_shutdown(nbd); + sock_shutdown(nbd, false); nbd_config_put(nbd); done: blk_mq_complete_request(req); @@ -910,7 +917,7 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index) * for the reconnect timer don't trigger the timer again * and instead just error out. */ - sock_shutdown(nbd); + sock_shutdown(nbd, false); nbd_config_put(nbd); blk_mq_start_request(req); return -EIO; @@ -1178,7 +1185,7 @@ static int nbd_disconnect(struct nbd_device *nbd) static void nbd_clear_sock(struct nbd_device *nbd) { - sock_shutdown(nbd); + sock_shutdown(nbd, true); nbd_clear_que(nbd); nbd->task_setup = NULL; } @@ -1264,17 +1271,7 @@ static int nbd_start_device(struct nbd_device *nbd) args = kzalloc(sizeof(*args), GFP_KERNEL); if (!args) { - sock_shutdown(nbd); - /* - * If num_connections is m (2 < m), - * and NO.1 ~ NO.n(1 < n < m) kzallocs are successful. - * But NO.(n + 1) failed. We still have n recv threads. - * So, add flush_workqueue here to prevent recv threads - * dropping the last config_refs and trying to destroy - * the workqueue from inside the workqueue. - */ - if (i) - flush_workqueue(nbd->recv_workq); + sock_shutdown(nbd, true); return -ENOMEM; } sk_set_memalloc(config->socks[i]->sock->sk); @@ -1306,9 +1303,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b mutex_unlock(&nbd->config_lock); ret = wait_event_interruptible(config->recv_wq, atomic_read(&config->recv_threads) == 0); - if (ret) - sock_shutdown(nbd); - flush_workqueue(nbd->recv_workq); + sock_shutdown(nbd, true); mutex_lock(&nbd->config_lock); nbd_bdev_reset(bdev); @@ -1323,7 +1318,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b static void nbd_clear_sock_ioctl(struct nbd_device *nbd, struct block_device *bdev) { - sock_shutdown(nbd); + sock_shutdown(nbd, true); __invalidate_device(bdev, true); nbd_bdev_reset(bdev); if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF, @@ -1986,12 +1981,7 @@ static void nbd_disconnect_and_put(struct nbd_device *nbd) nbd_disconnect(nbd); nbd_clear_sock(nbd); mutex_unlock(&nbd->config_lock); - /* - * Make sure recv thread has finished, so it does not drop the last - * config ref and try to destroy the workqueue from inside the work - * queue. - */ - flush_workqueue(nbd->recv_workq); + if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF, &nbd->config->runtime_flags)) nbd_config_put(nbd); -- 2.20.1 --------------010007000101060400020607--