From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Guo Subject: Re: [PATCH v1] igb_uio: fix unexpected removal for hot-unplug Date: Fri, 19 Oct 2018 16:35:46 +0800 Message-ID: <33442113-1445-21a4-3c1b-6ddd5e7586d4@intel.com> References: <1534502916-31636-9-git-send-email-jia.guo@intel.com> <1539844035-11524-1-git-send-email-jia.guo@intel.com> <540b6dc6-5c58-2a6d-aeec-55b92408eef0@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: dev@dpdk.org, helin.zhang@intel.com To: Ferruh Yigit , stephen@networkplumber.org, bruce.richardson@intel.com, konstantin.ananyev@intel.com, thomas@monjalon.net, shaopeng.he@intel.com Return-path: Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id 6E6A45A44 for ; Fri, 19 Oct 2018 10:35:49 +0200 (CEST) In-Reply-To: <540b6dc6-5c58-2a6d-aeec-55b92408eef0@intel.com> Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 10/19/2018 12:06 AM, Ferruh Yigit wrote: > On 10/18/2018 7:27 AM, Jeff Guo wrote: >> When a device is hot-unplugged, pci_remove will be invoked unexpectedly >> before pci_release, it will caused kernel hung issue which will throw the >> error info of "Trying to free already-free IRQ XXX". And on the other hand, >> if pci_remove before pci_release, the interrupt will not got chance to be >> disabled. So this patch aim to fix this issue by adding pci_release call >> in pci_remove, it will gurranty that all pci clean up will be done before >> pci removal. >> >> Signed-off-by: Jeff Guo >> --- >> kernel/linux/igb_uio/igb_uio.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c >> index fede66c..3cf394b 100644 >> --- a/kernel/linux/igb_uio/igb_uio.c >> +++ b/kernel/linux/igb_uio/igb_uio.c >> @@ -570,6 +570,8 @@ igbuio_pci_remove(struct pci_dev *dev) >> { >> struct rte_uio_pci_dev *udev = pci_get_drvdata(dev); >> >> + igbuio_pci_release(&udev->info, NULL); >> + > Hi Jeff, > > This is simpler approach comparing to previous version. > > And do you know if igbuio_pci_release() won't be called after > igbuio_pci_remove() because that will also cause crash, and indeed it will cause > a crash in the uio too. > > The flow as far as I can see: > when uioN device opened by application, igbuio_pci_open() is called. > > If device removed, I expect driver remove() function called, which has a call > stack like below: > > igbuio_pci_remove() > uio_unregister_device() > uio_device_release() > kfree(struct uio_device) > > After this point udev is freed and igbuio_pci_release() shouldn't be called, so > I assume uioN device closed before this point but I couldn't find where, if not > closed, closing it later will crash. What i saw is that after igb_uio remove , if detach the device the pci release will be called, so the igbuo_pci_release should be called again. > I can't test the hotplug case, can you please confirm above patch fixing crashes > you observed for your use cases? yes, it could be fix the crashed i observed right now. > > And for regular usecase this change shouldn't cause any problem, so at worst it > may not be fixing all hotplug issues, which looks safe to get. I think it would fix this hung issue that caused of double free irq and would not have side effect anyway.