Date: Tue, 19 May 2020 21:26:34 -0300
From: Jason Gunthorpe
To: Dennis Dalessandro
Cc: dledford@redhat.com, linux-rdma@vger.kernel.org, Mike Marciniszyn, stable@vger.kernel.org, Kaike Wan
Subject: Re: [PATCH for-rc or next 1/3] IB/hfi1: Do not destroy hfi1_wq when the device is shut down
Message-ID: <20200520002634.GF31189@ziepe.ca>
References: <20200512030622.189865.65024.stgit@awfm-01.aw.intel.com> <20200512031315.189865.15477.stgit@awfm-01.aw.intel.com>
In-Reply-To: <20200512031315.189865.15477.stgit@awfm-01.aw.intel.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On Mon, May 11, 2020 at 11:13:15PM -0400, Dennis Dalessandro wrote:
> From: Kaike Wan
>
> The workqueue hfi1_wq is destroyed in function shutdown_device(), which
> is called by either shutdown_one() or remove_one(). The function
> shutdown_one() is called when the kernel is rebooted while remove_one()
> is called when the hfi1 driver is unloaded. When the kernel is rebooted,
> hfi1_wq is destroyed while all qps are still active, leading to a
> kernel crash:

AFAIK the purpose of shutdown is to stop all in-progress DMAs. If
devices are wildly doing DMA during the shutdown process then all
manner of things can fail, including kexecing into another kernel.

Do you achieve that with these shutdown handlers?

It does make sense that the work queue would not be destroyed in
shutdown, but I'm surprised it doesn't flush it?

Jason