iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: Vasant Hegde <vasant.hegde@amd.com>,
	Matt Fagnani <matt.fagnani@bell.net>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	Joerg Roedel <jroedel@suse.de>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	LKML <linux-kernel@vger.kernel.org>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled
Date: Mon, 9 Jan 2023 09:43:21 -0400	[thread overview]
Message-ID: <Y7wZ+SvT455HS2b6@nvidia.com> (raw)
In-Reply-To: <bb865b8f-6f8f-769a-6364-d46b45caca85@linux.intel.com>

On Sat, Jan 07, 2023 at 10:44:46AM +0800, Baolu Lu wrote:
> On 1/6/2023 10:14 PM, Jason Gunthorpe wrote:
> > On Thu, Jan 05, 2023 at 03:57:28PM +0530, Vasant Hegde wrote:
> > > Matt,
> > > 
> > > On 1/5/2023 6:39 AM, Matt Fagnani wrote:
> > > > I built 6.2-rc2 with the patch applied. The same black screen problem happened
> > > > with 6.2-rc2 with the patch. I tried to use early kdump with 6.2-rc2 with the
> > > > patch twice by panicking the kernel with sysrq+alt+c after the black screen
> > > > happened. The system rebooted after about 10-20 seconds both times, but no kdump
> > > > and dmesg files were saved in /var/crash. I'm attaching the lspci -vvv output as
> > > > requested.
> > > > 
> > > 
> > > Thanks for testing. As mentioned earlier I was not expecting this patch to fix
> > > the black screen issue. It should fix kernel warnings and IOMMU page fault
> > > related call traces. By any chance do you have the kernel boot logs?
> > > 
> > > 
> > > @Baolu,
> > >    Looking into lspci output, it doesn't list ACS feature for Graphics card. So
> > > with your fix it didn't enable PASID and hence it failed to boot.
> > 
> > The ACS checks being done are feature of the path not the end point or
> > root port.
> > 
> > If we are expecting ACS on the end port then it is just a bug in how
> > the test was written.. The test should be a NOP because there are no
> > switches in this topology.
> > 
> > Looking at it, this seems to just be because pci_enable_pasid is
> > calling pci_acs_path_enabled wrong, the only other user is here:
> > 
> > 	for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) {
> > 		if (!bus->self)
> > 			continue;
> > 
> > 		if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
> > 			break;
> > 
> > 		pdev = bus->self;
> > 
> > 		group = iommu_group_get(&pdev->dev);
> > 		if (group)
> > 			return group;
> > 	}
> > 
> > And notice it is calling it on pdev->bus not on pdev itself which
> > naturally excludes the end point from the ACS validation.
> > 
> > So try something like:
> > 
> > 	if (!pci_acs_path_enabled(pdev->bus->self, NULL, PCI_ACS_RR | PCI_ACS_UF))
> > 
> > (and probably need to check for null ?)
> 
> Yeah! This really is a misuse of pci_acs_path_enabled().
> 
> But if @pdev is an endpoint of a multiple function device, perhaps we
> still need to check acs on it?

Ah, I don't know anything about what this means from a spec
perspective.

Certainly if a function can internalize MMIO and loop it back to
another function then it surely is not OK for PASID either, nor should
those functions be in different iommu groups.

So, either this never happens for some spec reason, or the test in the
iommu code forming groups is incorrect.

Jason

  reply	other threads:[~2023-01-09 13:43 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-30  8:18 [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled Thorsten Leemhuis
2023-01-03 10:30 ` Joerg Roedel
2023-01-03 19:06 ` Matt Fagnani
     [not found] ` <5aa0e698-f715-0481-36e5-46505024ebc1@bell.net>
2023-01-04  6:54   ` Baolu Lu
2023-01-04 15:50     ` Vasant Hegde
2023-01-05  1:09       ` Matt Fagnani
2023-01-05 10:27         ` Vasant Hegde
2023-01-05 10:37           ` Baolu Lu
2023-01-05 10:46             ` Vasant Hegde
2023-01-05 14:46               ` Deucher, Alexander
2023-01-05 15:27                 ` Felix Kuehling
2023-01-06  5:48                   ` Baolu Lu
2023-02-15 15:39                     ` Bjorn Helgaas
2023-02-16  0:35                       ` Felix Kuehling
2023-02-16  0:44                         ` Jason Gunthorpe
2023-02-16  5:37                           ` Vasant Hegde
2023-02-16 14:55                             ` Felix Kuehling
2023-02-16 14:53                           ` Felix Kuehling
2023-02-16  5:25                         ` Vasant Hegde
2023-02-16 18:59                           ` Matt Fagnani
2023-02-16 19:59                             ` Felix Kuehling
2023-02-17  5:36                               ` Vasant Hegde
2023-02-17  5:23                             ` Vasant Hegde
2023-01-05 19:51           ` Matt Fagnani
2023-01-06  7:28           ` Matt Fagnani
2023-01-10 16:08             ` Vasant Hegde
2023-01-10 16:12               ` Vasant Hegde
2023-01-06 14:14           ` Jason Gunthorpe
2023-01-07  2:44             ` Baolu Lu
2023-01-09 13:43               ` Jason Gunthorpe [this message]
2023-01-10  5:28                 ` Baolu Lu
2023-01-10  5:48             ` Baolu Lu
2023-01-10  8:06               ` Matt Fagnani
     [not found]                 ` <bb3d5d1a-c222-9270-60fa-7d0b74bebd1a@linux.intel.com>
2023-01-10 22:12                   ` Matt Fagnani
2023-01-10 13:25               ` Jason Gunthorpe
2023-01-10 13:45                 ` Christian König
2023-01-10 13:51                   ` Jason Gunthorpe
2023-01-10 13:56                     ` Christian König
2023-01-10 20:51                       ` Matt Fagnani
2023-01-11  8:35                         ` Christian König
2023-01-10 15:05                   ` Felix Kuehling
2023-01-10 15:19                     ` Jason Gunthorpe
2023-01-10 15:21                       ` Felix Kuehling
2023-01-11  3:16                 ` Baolu Lu
2023-01-11 13:08                   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7wZ+SvT455HS2b6@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=matt.fagnani@bell.net \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=vasant.hegde@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).