linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Huacai Chen <chenhuacai@gmail.com>
To: Sinan Kaya <okaya@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	huangshuai@loongson.cn, Huacai Chen <chenhuacai@loongson.cn>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	Xuefeng Li <lixuefeng@loongson.cn>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>,
	Tiezhu Yang <yangtiezhu@loongson.cn>
Subject: Re: [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
Date: Sat, 12 Jun 2021 12:31:06 +0800	[thread overview]
Message-ID: <CAAhV-H7jadY0J80oydRq73d5+GGHuqWv9Y0R7N_e_tbSBrrhiQ@mail.gmail.com> (raw)
In-Reply-To: <21b1495f-689f-2d75-d896-2a33c03b186b@kernel.org>

Hi, Bjorn and Sinan,

On Tue, Jun 8, 2021 at 4:43 AM Sinan Kaya <okaya@kernel.org> wrote:
>
> On 6/4/2021 12:24 PM, Huacai Chen wrote:
> >> So you need to explain why we need to allow DMA from those devices
> >> even after we shutdown the port.  "It makes reboot work" is not a
> >> sufficient explanation.
> > I think only the designer of LS7A can tell us why. So, Mr. Shuai
> > Huang, could you please explain this?
> >
>
> Could there be some kind of a shutdown/init problem on your graphics
> card driver?
>
> During shutdown, remove() callback of all endpoints get called. This is
> the opportunity for your graphics driver to put hardware into safe
> state.
>
> If there is a problem with the hardware/driver, it should be a quirk as
> opposed to changing the default safe behavior for all devices.
I have had an offline discussion with Mr. Shuai Huang, he told me that
CPU is still writing data to framebuffer while poweroff/reboot, and if
we clear Bus Master Bit at this time, CPU will wait ack from device,
but never return, so deadlock.

More or less, we can modify the GPU driver to avoid this, as I said in
the commit message:

"The poweroff/reboot failures could easily be reproduced on Loongson
platforms. I think this is not a Loongson-specific problem, instead, is
a problem related to some specific PCI hosts. On some x86 platforms,
radeon/amdgpu devices can cause the same problem [1][2], and commit
faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
can resolve it.

Radeon driver is more difficult than amdgpu due to its confusing symbol
names, and I have maintained an out-of-tree patch for a long time [4]."

[1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
[2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
[4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0

Modifing every GPU driver is impossible for me, and I found some RAID
controllers have problems, too.

>
> The port driver here prevents memory from getting corrupted by rogue
> hardware. There is a window during kexec where hardware can write to
> system memory addresses if IOVA addresses and system memory addresses
> overlap.
>
KEXEC has no problems, as discussed before:
http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/

static void pci_device_shutdown(struct device *dev)
{
  ...
         if (drv && drv->shutdown)
                 drv->shutdown(pci_dev);

         /*
          * If this is a kexec reboot, turn off Bus Master bit on the
          * device to tell it to not continue to do DMA. Don't touch
          * devices in D3cold or unknown states.
          * If it is not a kexec reboot, firmware will hit the PCI
          * devices with big hammer and stop their DMA any way.
          */
         if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
                 pci_clear_master(pci_dev);
}

Huacai

  reply	other threads:[~2021-06-12  4:31 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
2021-05-28 21:43   ` Bjorn Helgaas
2021-06-04  9:24     ` Huacai Chen
2021-06-07 20:43       ` Sinan Kaya
2021-06-12  4:31         ` Huacai Chen [this message]
2021-05-28  7:15 ` [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c Huacai Chen
2021-06-05 21:15   ` Bjorn Helgaas
2021-06-06  8:14     ` LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c) Jiaxun Yang
2021-06-07  0:51       ` Huacai Chen
2021-05-28  7:15 ` [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A Huacai Chen
2021-05-28 20:32   ` Bjorn Helgaas
2021-06-04  9:43     ` Huacai Chen
2021-06-05 21:34       ` Bjorn Helgaas
2021-06-12  4:16         ` Huacai Chen
2021-05-28  7:15 ` [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
2021-05-28 20:51   ` Bjorn Helgaas
2021-06-04  9:59     ` Huacai Chen
2021-06-05 21:28       ` Bjorn Helgaas
2021-06-06  8:01         ` Jiaxun Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAhV-H7jadY0J80oydRq73d5+GGHuqWv9Y0R7N_e_tbSBrrhiQ@mail.gmail.com \
    --to=chenhuacai@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=chenhuacai@loongson.cn \
    --cc=helgaas@kernel.org \
    --cc=huangshuai@loongson.cn \
    --cc=jiaxun.yang@flygoat.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=lixuefeng@loongson.cn \
    --cc=okaya@kernel.org \
    --cc=yangtiezhu@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).