From: Vasant Hegde <vasant.hegde@amd.com>
To: Matt Fagnani <matt.fagnani@bell.net>,
	Baolu Lu <baolu.lu@linux.intel.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Joerg Roedel <jroedel@suse.de>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	LKML <linux-kernel@vger.kernel.org>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled
Date: Tue, 10 Jan 2023 21:38:13 +0530
Message-ID: <525730fd-5982-fea7-b6d5-2da69f225f04@amd.com>
In-Reply-To: <ff26929d-9fb0-3c85-2594-dc2937c1ba9a@bell.net>

Matt,


On 1/6/2023 12:58 PM, Matt Fagnani wrote:
> I booted 6.2-rc2 + patch with rd.driver.blacklist=amdgpu on the kernel command
> line to prevent amdgpu from being started while the initramfs was in use. The
> black screen problem happened later in the boot. I pressed sysrq+alt+s,u,b to do
> an emergency sync, remount read-only, and reboot. The journal for that boot was
> shown on the next boot. The two warnings which I previously reported weren't
> shown in the journal, but the same null pointer dereference which made amdgpu
> crash happened. I'm attaching the kernel log from the journal of that boot.
> 

Thanks for your effort in getting the boot log. This is helpful.

Looking into the code further, iommu_detach_group() didn't attach the devices
back to the default_domain. So from the IOMMU's point of view the device group
was left in an inconsistent state. This resulted in the IOMMU throwing page
fault errors. The AMD IOMMU event handler code always assumes that the domain
is set up properly, and that resulted in the NULL pointer dereference below.

  Jan 06 02:07:52 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000058
  Jan 06 02:07:52 kernel: #PF: supervisor read access in kernel mode
  Jan 06 02:07:53 kernel: #PF: error_code(0x0000) - not-present page
  Jan 06 02:07:53 kernel: PGD 0 P4D 0
  Jan 06 02:07:53 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
  Jan 06 02:07:53 kernel: CPU: 2 PID: 56 Comm: irq/24-AMD-Vi Not tainted 6.2.0-rc2+ #89
  Jan 06 02:07:53 kernel: Hardware name: HP HP Laptop 15-bw0xx/8332, BIOS F.52 12/03/2019
  Jan 06 02:07:53 kernel: RIP: 0010:report_iommu_fault+0x11/0x90
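
To illustrate the failure mode, here is a rough userspace sketch of a fault
reporting path that checks for a missing domain before using it. This is not
the kernel code: toy_domain, toy_device, toy_handler and toy_report_fault are
made-up names, and the layout is only an assumption meant to show why a handler
that trusts the domain pointer can dereference NULL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, not the kernel's real structures. */
struct toy_domain {
    int (*handler)(unsigned long iova);   /* optional fault callback */
};

struct toy_device {
    struct toy_domain *domain;            /* NULL models the inconsistent state */
};

/* Example callback: pretend the fault was handled. */
int toy_handler(unsigned long iova)
{
    (void)iova;
    return 0;
}

/*
 * Report a fault for a device. Returns the callback's result, or -1 when
 * there is no domain (or no callback) to report to, instead of
 * dereferencing a NULL pointer.
 */
int toy_report_fault(struct toy_device *dev, unsigned long iova)
{
    if (dev == NULL || dev->domain == NULL) {
        fprintf(stderr, "fault at 0x%lx: device has no domain\n", iova);
        return -1;
    }
    if (dev->domain->handler == NULL)
        return -1;
    return dev->domain->handler(iova);
}
```

A check like this would only avoid the crash in the handler. It would not
repair the inconsistent state of the group.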

Ideally, if the domain attach fails (in this case because the PASID capability
check returned an error), we should put the devices back into their original
domain so that they can continue without the PASID capability.
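
The rollback idea can be sketched roughly as follows. Again this is only a
simplified userspace model under my own assumptions, not the patch and not the
IOMMU core API: sketch_domain, sketch_group, sketch_switch_domain and the
pasid_ok flag are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's real structures. */
struct sketch_domain {
    int id;
};

struct sketch_group {
    struct sketch_domain *default_domain; /* domain to fall back to */
    struct sketch_domain *domain;         /* currently attached domain */
};

/*
 * Move the group to new_domain. pasid_ok models the result of the PASID
 * capability check. On failure the group is re-attached to its default
 * domain, so it is never left without a domain. Returns 0 on success and
 * -1 on failure.
 */
int sketch_switch_domain(struct sketch_group *group,
                         struct sketch_domain *new_domain, int pasid_ok)
{
    group->domain = NULL;                 /* detach from the current domain */

    if (!pasid_ok || new_domain == NULL) {
        /* Attach failed: roll back to the default domain. */
        group->domain = group->default_domain;
        return -1;
    }

    group->domain = new_domain;
    return 0;
}
```

With this shape, a failed attach leaves the group on its default domain, and
the caller can carry on without PASID.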

I have a patch to handle these error conditions (not a fix for the original
issue). I will try to post it soon.

-Vasant
