linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: poza@codeaurora.org
To: Thomas Tai <thomas.tai@oracle.com>
Cc: bhelgaas@google.com, keith.busch@intel.com,
	linux-pci@vger.kernel.org, linux-pci-owner@vger.kernel.org
Subject: Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed
Date: Wed, 15 Aug 2018 21:29:37 +0530	[thread overview]
Message-ID: <3c5fe7896f5d558fef9d37f8b7156a3b@codeaurora.org> (raw)
In-Reply-To: <a23efe6f-4341-edaa-a230-c2f1c93ff0ce@oracle.com>

On 2018-08-15 21:13, Thomas Tai wrote:
> [ ... ]>>>>>>> +        if (dev)
>>>>>>>> +            pci_info(dev, "Device recovery from fatal error 
>>>>>>>> successful\n");
>>>>>>>> +        else
>>>>>>>> +            pr_err("AER: Can not find pci_dev for 
>>>>>>>> %04x:%02x:%02x.%x\n",
>>>>>>>> +                   domain, bus,
>>>>>>>> +                   PCI_SLOT(devfn), PCI_FUNC(devfn));
>>>>>>>> +        pci_dev_put(dev);
>>>>> 
>>>>> That is, replace above with:
>>>>> 
>>>>>     pr_info("AER: Device %04x:%02x:%02x.%x recover from fatal error
>>>>> successful\n", domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
>>>>> 
>>>> is this not sufficient to print ?  which is there already.
>>>> pci_info(dev, "Device recovery from fatal error successful\n");
>>> 
>>> Unfortunately, I can't use "dev" because as you said I can't
>>> pci_get_domain_slot to search for it after rescanning.
>>> 
>>> -T
>> 
>> If I understood the patch correctly,
>> 1) it is restructuring of pci_dev_put(dev);
>> 2) now we are sending uevent only if it is bridge.
>> 3) the other logic where you store and compare bdf is not required, 
>> although we have to fix following.
>> pci_info(dev, "Device recovery from fatal error successful\n");
>> 
>> 
>> probably 1) and 2) could be a separate patch.
> 
> Yes Oza. I will separate 1) and 2) and fix the pci_info(dev, "...") to
> use domain:bus:devfn

Thanks Thomas,
although I am not sure on 3)  by doing following.

  dev = pci_get_domain_bus_and_slot(domain, bus, devfn);
      if (dev)
         pci_info(dev, "Device recovery from fatal error successful\n");

but also it may be the case that device is removed and there is no new 
device which would match BDF
so in any case recovery has happened as far as PCIe link is concerned.

surprisingly in my case followin also went fine
pci_info(dev, "Device recovery from fatal error successful\n");
I guess probably after re-enumeration, I got the sane dev back !!  
(since it enumerated the same device)

how about something like this

dev = pci_get_domain_bus_and_slot(domain, bus, devfn);
      if (dev)
         pci_info(dev, "Device recovery from fatal error successful\n");
       else
          pr_ifo ("AER: Device recovery from fatal error successful")

Regards,
Oza.

> 
>> 
>> by the way,
>> In my testing even after removing device the call pci_uevent_ers(dev, 
>> PCI_ERS_RESULT_DISCONNECT); did not create any problem.
> 
> I think that is dangerous, if we restructure pci_dev_put(dev), ie
> revert commit bd91b56cb3b27492963caeb5fccefe20a986ca8d. The dev
> structure is freed and we shouldn't really use it to send
> pci_uevent_ers(). Although pci_uevent_ers() may be smart enough to
> figure out to avoid crash, we should only use it when the dev is not
> removed.
> 

I Agree.

> Thanks,
> Thomas
> 
>>> 
>> 
>>> 
>>>> 
>>>>> 
>>>>>>>>      } else {
>>>>>>>> -        pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
>>>>>>>> -        pci_info(dev, "Device recovery from fatal error 
>>>>>>>> failed\n");
>>>>>>>> +        if (is_bridge_device)
>>>>>>> previously there was no condition.. why is this required now ?
>>>>>>> and it was doing on "dev" while you are doing it on "udev"
>>>>>> is that the reason we dont watnt o use dev after it is removed ? 
>>>>>> instead do pci_uevent_ers on bridge ?
>>>>> 
>>>>> 
>>>>> Yes, that's exactly the reason. I should have written down the 
>>>>> reason
>>>>> in the commit log. Here is the details explanation for the benefit 
>>>>> of
>>>>> others. If the failing device is a bridge device, udev is equal to
>>>>> dev. So we can use udev in the pci_uevent_ers. But for non bridge
>>>>> device the device is already removed from the bus and thus can't 
>>>>> send
>>>>> pci_uevent_ers.
>>>>> 
>>>>> Thank you,
>>>>> Thomas
>>>>> 
>>>>>>>> +            pci_uevent_ers(udev, PCI_ERS_RESULT_DISCONNECT);
>>>>>>>> +        pr_err("AER: Device %04x:%02x:%02x.%x recovery from 
>>>>>>>> fatal error failed\n",
>>>>>>>> +               domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
>>>>>>>>      }
>>>>>>>> 
>>>>>>>> -    pci_dev_put(dev);
>>>>>>>>      pci_unlock_rescan_remove();
>>>>>>>>  }
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Oza.

  reply	other threads:[~2018-08-15 18:52 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-13 16:51 [PATCH 0/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed Thomas Tai
2018-08-13 16:51 ` [PATCH 1/1] " Thomas Tai
2018-08-14  9:16   ` poza
2018-08-14  9:22     ` poza
2018-08-14 13:51       ` Thomas Tai
2018-08-15 14:57         ` poza
2018-08-15 15:02           ` Thomas Tai
2018-08-15 15:26             ` poza
2018-08-15 15:43               ` Thomas Tai
2018-08-15 15:59                 ` poza [this message]
2018-08-15 16:04                   ` Thomas Tai
2018-08-15 21:55       ` Benjamin Herrenschmidt
2018-08-15 21:56       ` Benjamin Herrenschmidt
2018-08-16  6:36         ` poza
2018-08-16  6:51           ` Benjamin Herrenschmidt
2018-08-16  6:59             ` Benjamin Herrenschmidt
2018-08-16  8:07               ` poza
2018-08-16  8:12                 ` Benjamin Herrenschmidt
2018-08-16  9:03                   ` poza
2018-08-16 10:07                     ` Benjamin Herrenschmidt
2018-08-16 14:11                       ` poza
2018-08-16 23:30                         ` Benjamin Herrenschmidt
2018-08-17 10:29                           ` poza
2018-08-17 10:44                             ` poza
2018-08-18  7:38                             ` Benjamin Herrenschmidt
2018-08-16  7:05             ` Benjamin Herrenschmidt
2018-08-16  7:15               ` Benjamin Herrenschmidt
2018-08-16  7:56                 ` poza
2018-08-16  8:10                   ` Benjamin Herrenschmidt
2018-08-16  8:05               ` poza
2018-08-16  8:15                 ` Benjamin Herrenschmidt
2018-08-16  8:22                   ` poza
2018-08-16  8:28                     ` Benjamin Herrenschmidt
2018-08-16 13:30                       ` Thomas Tai
2018-08-16 13:46                     ` Sinan Kaya
2018-08-16 23:27                       ` Benjamin Herrenschmidt
2018-08-17  6:35                         ` poza
2018-08-19  2:24                         ` Bjorn Helgaas
2018-08-20  5:09                           ` poza
2018-08-20  5:15                             ` Benjamin Herrenschmidt
2018-08-20 13:02                               ` Thomas Tai
2018-08-20 13:27                                 ` Benjamin Herrenschmidt
2018-08-19  2:19               ` Bjorn Helgaas
2018-08-19 21:41                 ` Sinan Kaya
2018-08-20  2:03                   ` Benjamin Herrenschmidt
2018-08-20  5:19                     ` poza
2018-08-20  5:33                       ` Benjamin Herrenschmidt
2018-08-20  7:56                         ` poza
2018-08-20 11:22                           ` Benjamin Herrenschmidt
2018-08-20 13:26                             ` poza
2018-08-20 21:02                               ` Benjamin Herrenschmidt
2018-08-21  5:14                                 ` poza
2018-08-21  6:06                                   ` Benjamin Herrenschmidt
2018-08-21 14:37                                     ` Keith Busch
2018-08-21 15:07                                       ` Sinan Kaya
2018-08-21 15:29                                         ` Keith Busch
2018-08-21 15:50                                           ` Sinan Kaya
2018-08-21 15:55                                             ` Sinan Kaya
2018-08-21 16:44                                             ` Keith Busch
2018-08-21 15:30                                       ` poza
2018-08-21 21:14                                       ` Benjamin Herrenschmidt
2018-08-21 22:04                                         ` Keith Busch
2018-08-21 22:33                                           ` Keith Busch
2018-08-21 23:06                                           ` Benjamin Herrenschmidt
2018-08-21 23:13                                             ` Keith Busch
2018-08-22  0:36                                               ` Benjamin Herrenschmidt
2018-08-30  0:01                                             ` Keith Busch
2018-08-30  0:10                                               ` Sinan Kaya
2018-08-30  0:46                                                 ` Keith Busch
2018-08-30  4:26                                               ` Benjamin Herrenschmidt
2018-08-20 15:53                             ` Keith Busch
2018-08-20 16:13                               ` poza
2018-08-20 16:32                                 ` Keith Busch
2018-08-20 21:05                               ` Benjamin Herrenschmidt
2018-08-20 21:21                                 ` Sinan Kaya
2018-08-20 21:35                                   ` Keith Busch
2018-08-20 21:53                                     ` Benjamin Herrenschmidt
2018-08-20 22:02                                       ` Sinan Kaya
2018-08-20 22:04                                         ` Benjamin Herrenschmidt
2018-08-20 22:13                                           ` Sinan Kaya
2018-08-20 22:19                                             ` Benjamin Herrenschmidt
2018-08-22  9:13                                             ` Lukas Wunner
2018-08-22 14:38                                               ` Keith Busch
2018-08-22 14:51                                                 ` Sinan Kaya
2018-08-20 22:13                                           ` Keith Busch
2018-08-20 22:19                                             ` Benjamin Herrenschmidt
2018-08-21  1:30                                               ` Keith Busch
2018-08-20  4:37                 ` Benjamin Herrenschmidt
2018-08-20  4:39                 ` PATCH] Partial revert of "PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices" Benjamin Herrenschmidt
2018-08-21 19:50                   ` Bjorn Helgaas
2018-08-22  4:35                     ` poza

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3c5fe7896f5d558fef9d37f8b7156a3b@codeaurora.org \
    --to=poza@codeaurora.org \
    --cc=bhelgaas@google.com \
    --cc=keith.busch@intel.com \
    --cc=linux-pci-owner@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=thomas.tai@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).