From: Daniel Gomez <daniel@qtec.com>
To: "Deucher, Alexander" <Alexander.Deucher@amd.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: PCI/MSI: kernel NULL pointer dereference
Date: Thu, 8 Sep 2022 16:30:28 +0200
Message-ID: <CAH1Ww+TydyHo+CADnX2Rn8XfDd9coSZj03hxNfCwMg4t1qP+bw@mail.gmail.com>
In-Reply-To: <BL1PR12MB51444FEAE2566F31E4BA591BF7409@BL1PR12MB5144.namprd12.prod.outlook.com>

On Thu, 8 Sept 2022 at 15:44, Deucher, Alexander
<Alexander.Deucher@amd.com> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: Thursday, September 8, 2022 8:13 AM
> > To: Daniel Gomez <daniel@qtec.com>
> > Cc: linux-pci@vger.kernel.org; Thomas Gleixner <tglx@linutronix.de>;
> > Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian
> > <Christian.Koenig@amd.com>
> > Subject: Re: PCI/MSI: kernel NULL pointer dereference
> >
> > On Thu, Sep 08, 2022 at 12:41:00PM +0200, Daniel Gomez wrote:
> > > Hi,
> > >
> > > I have the following error whenever I remove the fglrx module from the
> > > latest 6.0-rc4.
> >
> > You bisected to 93296cd1325d; I don't see a commit with a "Fixes:"
> > that references that.  If you can reproduce this with an in-tree driver, we can
> > certainly fix it, but it's harder for an out-of-tree driver.
I understand, thanks. I tried with radeon, but it cannot be removed with
rmmod once it's loaded.

> >
> > I cc'd some AMD graphics folks in case they have a pointer for where to get
> > fglrx support.
>
> I can ask around, but I don't think we've actively worked on fglrx since we switched to the open source amdgpu driver 5-6 years ago.  What hardware is this?
It's an AMD G-T56N (bobcat) with an AMD ATI Radeon HD 6320.
>
> Alex
>
> >
> > > Logs:
> > > /mnt/raid0/krops/workspace/sources/fglrx-module/module/firegl_public.c:1674
> > > KCL_SetPageCache_Array
> > > <6>[fglrx] IRQ 37 Disabled
> > > BUG: kernel NULL pointer dereference, address: 0000000000000010
> > > #PF: supervisor write access in kernel mode
> > > #PF: error_code(0x0002) - not-present page
> > > PGD 0 P4D 0
> > > Oops: 0002 [#1] SMP NOPTI
> > > CPU: 1 PID: 254 Comm: rmmod Tainted: G        W  O      6.0.0-rc4-qtec-standard #2
> > > Hardware name: QTechnology QT5022/QT5022, BIOS PM_2.1.0.309 X64 09/27/2013
> > > RIP: 0010:mutex_lock+0x2a/0x40
> > > Code: 0f 1f 44 00 00 53 be 1b 01 00 00 48 89 fb 48 c7 c7 08 81 3d 82
> > > e8 46 2c 52  ff e8 01 d7 ff ff 31 c0 65 48 8b 14 25 00 ad 01 00 <f0>
> > > 48 0f b1 13 74 06 48 89  df 5b eb b9 5b c3 0f 1f 80 00 00 00 00
> > > RSP: 0018:ffffc90000b07dd8 EFLAGS: 00010246
> > > RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000
> > > RDX: ffff888116aabb00 RSI: 000000000000011b RDI: ffffffff823d8108
> > > RBP: ffff8881148d20d0 R08: 0000000000000000 R09: ffffffffa053537b
> > > R10: ffff888149365cc0 R11: ffffea00052c5048 R12: 0000000000000000
> > > R13: ffff88813877c000 R14: 0000000000000000 R15: 0000000000000000
> > > FS:  00007f6f90b3cb80(0000) GS:ffff88815b300000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000000000000010 CR3: 000000011e4c8000 CR4: 00000000000006e0
> > > Call Trace:
> > >  <TASK>
> > >  pci_disable_msi+0x34/0xe0
> > >  irqmgr_wrap_shutdown+0x165/0x190 [fglrx]
> > >  ? firegl_takedown+0x841/0x950 [fglrx]
> > >  ? kobject_put+0xa6/0x220
> > >  ? cleanup_device+0x299/0x2a0 [fglrx]
> > >  ? pci_unregister_driver+0x42/0xa0
> > >  ? firegl_cleanup_device_heads+0x65/0xa0 [fglrx]
> > >  ? firegl_cleanup_module+0x84/0x11c [fglrx]
> > >  ? __x64_sys_delete_module+0x11b/0x210
> > >  ? get_vtime_delta+0xe/0x40
> > >  ? vtime_user_exit+0x1c/0x60
> > >  ? __ct_user_exit+0x68/0xb0
> > >  ? do_syscall_64+0x3c/0x80
> > >  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > >  </TASK>
> > > Modules linked in: amdgpu fglrx(O-) ath9k ath9k_common mfd_core
> > > gpu_sched drm_buddy drm_ttm_helper ath9k_hw ttm drm_display_helper
> > > drm_kms_helper ath sp5100_tco syscopyarea sysfillrect sysimgblt
> > > fb_sys_fops video drm backlight ipv6
> > > CR2: 0000000000000010
> > > ---[ end trace 0000000000000000 ]---
> > > RIP: 0010:mutex_lock+0x2a/0x40
> > > Code: 0f 1f 44 00 00 53 be 1b 01 00 00 48 89 fb 48 c7 c7 08 81 3d 82
> > > e8 46 2c 52  ff e8 01 d7 ff ff 31 c0 65 48 8b 14 25 00 ad 01 00 <f0>
> > > 48 0f b1 13 74 06 48 89  df 5b eb b9 5b c3 0f 1f 80 00 00 00 00
> > > RSP: 0018:ffffc90000b07dd8 EFLAGS: 00010246
> > > RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000
> > > RDX: ffff888116aabb00 RSI: 000000000000011b RDI: ffffffff823d8108
> > > RBP: ffff8881148d20d0 R08: 0000000000000000 R09: ffffffffa053537b
> > > R10: ffff888149365cc0 R11: ffffea00052c5048 R12: 0000000000000000
> > > R13: ffff88813877c000 R14: 0000000000000000 R15: 0000000000000000
> > > FS:  00007f6f90b3cb80(0000) GS:ffff88815b300000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000000000000010 CR3: 000000011e4c8000 CR4: 00000000000006e0
> > >
> > > Steps:
> > > insmod fglrx.ko
> > > clinfo
> > > MatrixMultiplication
> > > rmmod fglrx.ko
> > >
> > > I know this is an out-of-tree driver from AMD, but we still need it
> > > for some products because it provides the OpenCL stack.
> > >
> > > Note: The open-source upstream radeon does not support OpenCL.
> > >
> > > Using git-bisect, I found that the issue is triggered by commit [1].
> > > Unfortunately, I cannot revert it for testing: if I do, the system
> > > hangs on boot because of commit [2].
> > >
> > > I understand that the driver might have some issues, but shouldn't
> > > the kernel prevent this crash in pci_disable_msi()? Is this a mutex
> > > problem provoked by the fglrx driver?
> > > Does anyone have any suggestions on how we can/should proceed with
> > > this?
> > >
> > > Thanks in advance,
> > > Daniel
> > >
> > > [1] Commit 93296cd1325d1d9afede60202d8833011c9001f2:
> > > 93296cd1325d 2021-12-15 PCI/MSI: Allocate MSI device data on first use
> > > [2] Commit ffd84485e6beb9cad3e5a133d88201b995298c33:
> > > ffd84485e6be 2021-12-10 PCI/MSI: Let the irq code handle sysfs groups
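
For readers of the oops above: a fault address of 0x10 (CR2 and RBX) with
RIP in mutex_lock is consistent with locking a mutex that is embedded in a
structure reached through a NULL pointer, so that the faulting address is
just the member's offset. The following is a minimal userspace sketch of
that arithmetic. The struct layout and all names in it are hypothetical
stand-ins chosen so that the member lands at 0x10 on a typical 64-bit
(LP64) build; they are not the kernel's actual MSI data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's definitions. */
struct fake_mutex {
	long owner;
};

struct fake_msi_data {
	unsigned long properties;	/* offset 0x00 on LP64 */
	void *platform_data;		/* offset 0x08 on LP64 */
	struct fake_mutex mutex;	/* offset 0x10 on LP64 */
};

/*
 * Address that &data->mutex would have for a given base pointer value,
 * computed arithmetically so that nothing is dereferenced.
 */
static uintptr_t member_addr(uintptr_t base)
{
	return base + offsetof(struct fake_msi_data, mutex);
}
```

With a NULL base, member_addr(0) returns the bare member offset, which is
0x10 for this layout on an LP64 system. An atomic write to that address,
as mutex_lock performs, would then fault with a not-present, supervisor
write error like the one reported.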


Thread overview: 5+ messages
2022-09-08 10:41 PCI/MSI: kernel NULL pointer dereference Daniel Gomez
2022-09-08 12:12 ` Bjorn Helgaas
2022-09-08 13:44   ` Deucher, Alexander
2022-09-08 14:30     ` Daniel Gomez [this message]
2022-09-14 16:19       ` Bjorn Helgaas
