linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Joerg Roedel" <jroedel@suse.de>,
	"Matt Fagnani" <matt.fagnani@bell.net>,
	"Christian König" <christian.koenig@amd.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Vasant Hegde" <vasant.hegde@amd.com>,
	"Tony Zhu" <tony.zhu@intel.com>,
	linux-pci@vger.kernel.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/1] PCI: Add translated request only flag for pci_enable_pasid()
Date: Mon, 30 Jan 2023 12:38:10 -0600	[thread overview]
Message-ID: <20230130183810.GA1692786@bhelgaas> (raw)
In-Reply-To: <647de371-fe11-15b4-5e11-8ca43a754180@linux.intel.com>

On Sun, Jan 29, 2023 at 04:42:32PM +0800, Baolu Lu wrote:
> On 2023/1/28 1:30, Bjorn Helgaas wrote:
> > On Sat, Jan 14, 2023 at 03:34:20PM +0800, Lu Baolu wrote:
> > > The PCIe fabric routes Memory Requests based on the TLP address, ignoring
> > > the PASID. In order to ensure system integrity, commit 201007ef707a ("PCI:
> > > Enable PASID only when ACS RR & UF enabled on upstream path") requires
> > > some ACS features being supported on device's upstream path when enabling
> > > PCI/PASID.
> > > 
> > > One alternative is ATS/PRI which lets the device resolve the PASID + addr
> > > pair before a memory request is made into a routeable TLB address through
> > > the translation agent.
> > 
> > This sounds like "ATS/PRI" is a solution to a problem, but we haven't
> > stated the problem yet.
> > 
> > > Those resolved addresses are then cached on the
> > > device instead of in the IOMMU TLB and the device always sets translated
> > > bit for PASID. One example of those devices are AMD graphic devices that
> > > always have ACS or ATS/PRI enabled together with PASID.
> > > 
> > > This adds a flag parameter in the pci_enable_pasid() helper, with which
> > > the device driver could opt-in the fact that device always sets the
> > > translated bit for PASID.
> > >
> > > It also applies this opt-in for AMD graphic devices. Without this change,
> > > kernel boots to black screen on a system with below AMD graphic device:
> > > 
> > > 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > >          [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] (rev ca)
> > >          (prog-if 00 [VGA controller])
> > > 	DeviceName: ATI EG BROADWAY
> > > 	Subsystem: Hewlett-Packard Company Device 8332
> > 
> > What is the underlying failure here?  "Black screen" is useful but we
> > should say *why* that happens, e.g., transactions went the wrong place
> > or whatever.

> > I guess PCI_PASID_XLATED_REQ_ONLY is something only the driver knows,
> > right?  We can't deduce from architected config space that the device
> > will produce PASID prefixes for every Memory Request, can we?
> 
> No, we can't. That's the reason why we need a flag here.

Sorry, I'm still confused.  PCI_PASID_XLATED_REQ_ONLY is a
device-specific property, and you want to opt-in AMD graphics devices.
Where's the AMD graphics-specific change?  The current patch does
this:

  pdev_pri_ats_enable
    pci_enable_pasid(pdev, 0, PCI_PASID_XLATED_REQ_ONLY)

which looks like it does it for *all* devices below an AMD IOMMU,
without any device or driver input.

PCIe r6.0, sec 6.20.1:

  A Function is not permitted to generate Requests using Translated
  Addresses and a PASID unless both PASID Enable and Translated
  Requests with PASID Enable are Set.

You want AMD graphics devices to do DMA with translated addresses and
PASID, right?  pci_enable_pasid() sets PASID Enable
(PCI_PASID_CTRL_ENABLE), but I don't see where "Translated Requests
with PASID Enable" is set.  We don't even have a #define for it.

I would think we should check "Translated Requests with PASID
Supported" before setting "Translated Requests with PASID Enable",
too?

> PCI: Add translated request only flag for pci_enable_pasid()
> 
> The PCIe fabric routes Memory Requests based on the TLP address, ignoring
> the PASID. In order to ensure system integrity, commit 201007ef707a ("PCI:
> Enable PASID only when ACS RR & UF enabled on upstream path") requires
> some ACS features being supported on device's upstream path when enabling
> PCI/PASID.
> 
> However, above change causes the Linux kernel boots to black screen on a
> system with below graphic device:

We need a PCIe concept-level description of the issue first, i.e., in
terms of DMA, PASID, ACS, etc.  Then we can mention the AMD GPU issue
as an instance.

> 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc.
>         [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] (rev ca)
>         (prog-if 00 [VGA controller])
>         DeviceName: ATI EG BROADWAY
>         Subsystem: Hewlett-Packard Company Device 8332
> 
> The kernel trace looks like below:
> 
>  Call Trace:
>   <TASK>
>   amd_iommu_attach_device+0x2e0/0x300
>   __iommu_attach_device+0x1b/0x90
>   iommu_attach_group+0x65/0xa0
>   amd_iommu_init_device+0x16b/0x250 [iommu_v2]
>   kfd_iommu_resume+0x4c/0x1a0 [amdgpu]
>   kgd2kfd_resume_iommu+0x12/0x30 [amdgpu]
>   kgd2kfd_device_init.cold+0x346/0x49a [amdgpu]
>   amdgpu_amdkfd_device_init+0x142/0x1d0 [amdgpu]
>   amdgpu_device_init.cold+0x19f5/0x1e21 [amdgpu]
>   ? _raw_spin_lock_irqsave+0x23/0x50
>   amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
>   amdgpu_pci_probe+0x161/0x370 [amdgpu]
>   local_pci_probe+0x41/0x80
>   pci_device_probe+0xb3/0x220
>   really_probe+0xde/0x380
>   ? pm_runtime_barrier+0x50/0x90
>   __driver_probe_device+0x78/0x170
>   driver_probe_device+0x1f/0x90
>   __driver_attach+0xce/0x1c0
>   ? __pfx___driver_attach+0x10/0x10
>   bus_for_each_dev+0x73/0xa0
>   bus_add_driver+0x1ae/0x200
>   driver_register+0x89/0xe0
>   ? __pfx_init_module+0x10/0x10 [amdgpu]
>   do_one_initcall+0x59/0x230
>   do_init_module+0x4a/0x200
>   __do_sys_init_module+0x157/0x180
>   do_syscall_64+0x5b/0x80
>   ? handle_mm_fault+0xff/0x2f0
>   ? do_user_addr_fault+0x1ef/0x690
>   ? exc_page_fault+0x70/0x170
>   entry_SYSCALL_64_after_hwframe+0x72/0xdc

The stack trace doesn't seem like it shows a failure, so I'm not sure
it's useful this time.  If it is, we can at least strip out the
irrelevant pieces.

> The AMD iommu driver allocates a new domain (called v2 domain) for the

"v2 domain" needs to be something greppable -- an identifier,
filename, etc.

> amdgpu device and enables its PCI PASID/ATS/PRI before attaching the
> v2 domain to it. The failure of pci_enable_pasid() due to lack of ACS
> causes the domain attaching device to fail. The amdgpu device is unable
> to DMA normally, resulting in a black screen of the system.
> 
> However, this device is special as it relies on ATS/PRI to resolve the
> PASID + addr pair before a memory request is made into a routeable TLB
> address through the translation agent. Those resolved addresses are then
> cached on the device instead of in the IOMMU TLB and the device always
> uses translated memory request for PASID.
> 
> ACS is not necessary for the devices that always use translated memory
> request for PASID. But this is device specific and only device driver
> knows this. We can't deduce this from architected config space.
> 
> Add a flag for pci_enable_pasid(), with which the device drivers could
> opt-in the fact that device always uses translated memory requests for
> PASID hence the ACS is not a necessity. Apply this opt-in for above AMD
> graphic device.
> 
> At present, it is a common practice to enable/disable PCI PASID in the
> iommu drivers. Considering that the device driver knows more about the
> specific device, it's better to move pci_enable_pasid() into the specific
> device drivers.
> [-- end --]
> 
> --
> Best regards,
> baolu

  reply	other threads:[~2023-01-30 18:39 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-14  7:34 [PATCH v3 1/1] PCI: Add translated request only flag for pci_enable_pasid() Lu Baolu
2023-01-16 15:42 ` Jason Gunthorpe
2023-01-27 11:30   ` Linux kernel regression tracking (Thorsten Leemhuis)
2023-01-27 17:30 ` Bjorn Helgaas
2023-01-28  7:52   ` Tian, Kevin
2023-01-29  8:42   ` Baolu Lu
2023-01-30 18:38     ` Bjorn Helgaas [this message]
2023-01-30 18:47       ` Jason Gunthorpe
2023-01-31 23:50         ` Bjorn Helgaas
2023-02-01  2:28           ` Jason Gunthorpe
2023-01-31 12:25       ` Baolu Lu
2023-02-01 16:58         ` Bjorn Helgaas
2023-02-02  3:08           ` Baolu Lu
2023-02-02 20:12             ` Bjorn Helgaas
2023-02-02 20:45               ` Jason Gunthorpe
2023-02-03 18:20                 ` Bjorn Helgaas
2023-02-03 18:52                   ` Jason Gunthorpe
2023-02-06  4:28                     ` Tian, Kevin
2023-01-31 12:56       ` Baolu Lu
2023-02-01  0:14         ` Bjorn Helgaas
2023-02-01  2:36           ` Jason Gunthorpe
2023-02-01 14:09             ` Jonathan Cameron
2023-02-01  5:18           ` Vasant Hegde
2023-02-01  5:51           ` Baolu Lu
2023-02-01  5:59           ` Baolu Lu
2023-02-01  6:31           ` Baolu Lu
2023-02-01 14:22             ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230130183810.GA1692786@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=christian.koenig@amd.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=jroedel@suse.de \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=matt.fagnani@bell.net \
    --cc=tony.zhu@intel.com \
    --cc=vasant.hegde@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).