linux-pci.vger.kernel.org archive mirror
From: Khalid Aziz <khalid@gonehiking.org>
To: Dave Young <dyoung@redhat.com>
Cc: Kairui Song <kasong@redhat.com>, Baoquan He <bhe@redhat.com>,
	linux-pci@vger.kernel.org, kexec@lists.infradead.org,
	Jerry Hoemann <Jerry.Hoemann@hpe.com>,
	Randy Wright <rwright@hpe.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Deepa Dinamani <deepa.kernel@gmail.com>
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Date: Fri, 17 Jan 2020 08:44:19 -0700
Message-ID: <f7c5d044-b821-2edd-7b24-ca36056b61bb@gonehiking.org>
In-Reply-To: <20200117032413.GA16906@dhcp-128-65.nay.redhat.com>

On 1/16/20 8:24 PM, Dave Young wrote:
> On 01/15/20 at 02:17pm, Khalid Aziz wrote:
>> On 1/15/20 11:05 AM, Kairui Song wrote:
>>> On Thu, Jan 16, 2020 at 1:31 AM Khalid Aziz <khalid@gonehiking.org> wrote:
>>>>
>>>> On 1/13/20 10:07 AM, Kairui Song wrote:
>>>>> On Sun, Jan 12, 2020 at 2:33 AM Deepa Dinamani <deepa.kernel@gmail.com> wrote:
>>>>>>
>>>>>>> Hi, there is some previous work on this issue, resetting PCI devices
>>>>>>> in the kdump kernel to stop ongoing DMA:
>>>>>>>
>>>>>>> [v7,0/5] Reset PCIe devices to address DMA problem on kdump with iommu
>>>>>>> https://lore.kernel.org/patchwork/cover/343767/
>>>>>>>
>>>>>>> [v2] PCI: Reset PCIe devices to stop ongoing DMA
>>>>>>> https://lore.kernel.org/patchwork/patch/379191/
>>>>>>>
>>>>>>> Those didn't get merged. The patches tried to fix a DMAR error
>>>>>>> problem, but resetting devices is a bit too destructive, and the
>>>>>>> problem was later fixed on the IOMMU side. In most cases the DMA seems
>>>>>>> harmless, as it targets the first kernel's memory and the kdump kernel
>>>>>>> only lives in crash memory.
>>>>>>
>>>>>> I was going to ask the same. If the kdump kernel had IOMMU on, would
>>>>>> that still be a problem?
>>>>>
>>>>> It will still fail. Doing DMA is not a problem; it only goes wrong
>>>>> when a device's upstream bridge is mistakenly shut down before the
>>>>> device itself is shut down.
>>>>>
>>>>>>
>>>>>>> Also, by the time the kdump kernel is able to scan and reset devices,
>>>>>>> there is already a very large time window where things could go
>>>>>>> wrong.
>>>>>>>
>>>>>>> The problem currently observed only happens upon kdump kernel
>>>>>>> shutdown, as the upper bridge is disabled before the device is
>>>>>>> disabled, so DMA will raise an error. It's more like a problem of
>>>>>>> wrong device shutdown order.
>>>>>>
>>>>>> The way it was described earlier "During this time, the SUT sometimes
>>>>>> gets a PCI error that raises an NMI." suggests that it isn't really
>>>>>> restricted to kexec/kdump.
>>>>>> Any attached device without an active driver might attempt spurious or
>>>>>> malicious DMA and trigger the same during normal operation.
>>>>>> Do you have available some more reporting of what happens during the
>>>>>> PCIe error handling?
>>>>>
>>>>> Let me add more info about this:
>>>>>
>>>>> On the machine where I can reproduce this issue, the first kernel
>>>>> always runs fine, and the kdump kernel works fine while dumping the
>>>>> vmcore. Even if I keep the kdump kernel running for hours, nothing
>>>>> goes wrong. If DMA during normal operation were going to cause a
>>>>> problem, this should have exposed it.
>>>>>
>>>>
>>>> This is the part that is puzzling me. Error shows up only when kdump
>>>> kernel is being shut down. kdump kernel can run for hours without this
>>>> issue. What is the operation from downstream device that is resulting in
>>>> uncorrectable error - is it indeed a DMA request? Why does that
>>>> operation from downstream device not happen until shutdown?
>>>>
>>>> I just want to make sure we fix the right problem in the right way.
>>>>
>>>
>>> Actually the device could keep sending requests with no problem while
>>> the kdump kernel is running. E.g. it keeps sending DMA, and all DMA
>>> targets the first kernel's system memory, so kdump runs fine as long as
>>> nothing touches the reserved crash memory. The error is reported by the
>>> port: when it is shut down its bus master bit is cleared, and a request
>>> from downstream will then cause an error.
>>>
>>
>> The real problem is that there are active devices while the kdump
>> kernel is running. You did say earlier - "In most cases the DMA seems
>> harmless, as it targets the first kernel's memory and the kdump kernel
>> only lives in crash memory.". Even if this holds today, it is going to
>> break one of these days. There is the "reset_devices" option but that
>> does not work if the driver is not loaded by the kdump kernel. Can we
>> try to shut down devices in machine_crash_shutdown() before we start
>> the kdump kernel?
> 
> It is not a good idea :)  We do not add extra logic after a panic
> because the kernel is not stable and we want a correct vmcore.
> 

I agree any extra code in the panic path opens the door to more trouble.
For the kdump kernel, if hardware is not in good shape when it boots up,
we are still in no better place. There may be some room to do minimal and
absolutely essential hardware shutdown in the panic path to ensure the
kdump kernel can work reliably, but such code has to be approached very
carefully. This specific problem seems to stem from not loading the same
drivers in the kdump kernel that were running in the previous kernel.
The recommendation for situations like this just might be that one must
load all the same drivers and use the "reset_devices" option to ensure
stability of the kdump kernel. I can see the reluctance to load any more
drivers than absolutely needed in the kdump kernel, but not doing that
has side effects, as we are seeing in this case.
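
To make the ordering problem discussed in this thread concrete, here is
a toy model in C. It is only a sketch, not kernel code and not the patch
under discussion. The toy_dev structure and both helper functions are
hypothetical names made up for illustration. The only hardware fact
assumed is that Bus Master Enable is bit 2 (0x0004) of the PCI Command
register, and that a bridge with that bit cleared no longer forwards
requests from downstream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 2 of the PCI Command register is Bus Master Enable. */
#define CMD_BUS_MASTER 0x0004u

/* Hypothetical toy model of a PCI function; not a kernel structure. */
struct toy_dev {
    uint16_t command;        /* simulated Command register */
    struct toy_dev *parent;  /* upstream bridge, NULL at the root */
    int dma_active;          /* nonzero if the device may still issue DMA */
};

/* Clear Bus Master Enable, leaving the other command bits untouched. */
static void toy_clear_master(struct toy_dev *d)
{
    assert(d != NULL);
    d->command &= (uint16_t)~CMD_BUS_MASTER;
    d->dma_active = 0;
}

/*
 * Returns 1 if a DMA request from d would be rejected on its way up,
 * i.e. d is still active but some upstream bridge has bus mastering off.
 */
static int toy_dma_would_fault(const struct toy_dev *d)
{
    if (!d->dma_active)
        return 0;
    for (const struct toy_dev *p = d->parent; p != NULL; p = p->parent)
        if (!(p->command & CMD_BUS_MASTER))
            return 1;
    return 0;
}
```

In this model, calling toy_clear_master() on a bridge while a device
below it is still active makes toy_dma_would_fault() report a fault for
that device. Clearing the device first, then the bridge, does not.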

--
Khalid
