* [Bug 214859] New: drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
@ 2021-10-28 18:23 bugzilla-daemon
  2021-10-31  9:46 ` [Bug 214859] " bugzilla-daemon
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: bugzilla-daemon @ 2021-10-28 18:23 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

            Bug ID: 214859
           Summary: drm-amdgpu-init-iommu~fd-device-init.patch introduce
                    bug
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.14.15
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: towo@siduction.org
        Regression: No

After the change in drm-amdgpu-init-iommu~fd-device-init.patch
(diff index d60096b3b2c2..cd8cc7d31b49 100644), X fails to start properly
with kernel 5.14.15 on most Ryzen notebooks.
There is a long delay before X starts, and dmesg is flooded with failure
messages like

Okt 28 10:28:08 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:28:21 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:28:34 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:28:47 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:29:01 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:29:14 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:29:27 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706

and/or

Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:128 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000000000872000 from IH client 0x1b (UTCL2)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00040D00
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CPG (0x6)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x1
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:128 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000000000872000 from IH client 0x1b (UTCL2)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00040D00
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CPG (0x6)
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x1
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
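
The "failed to write reg" messages above arrive at a steady interval, which would be consistent with the long delay before X starts. A minimal sketch for measuring that interval from a pasted log follows; the sample lines are copied from this report with the terminal colour codes removed, and the helper name failure_gaps is hypothetical. It assumes all timestamps fall on the same day.

```python
import re
from datetime import datetime

# Sample lines copied from the report (ANSI colour codes removed).
LOG = """\
Okt 28 10:28:08 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:28:21 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:28:34 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:28:47 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:29:01 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:29:14 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Okt 28 10:29:27 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Okt 28 10:29:40 kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
"""

# Only the HH:MM:SS part is parsed, so the localized month name does not matter.
PATTERN = re.compile(
    r"(\d{2}:\d{2}:\d{2}) kernel: .*failed to write reg (\w+) wait reg (\w+)"
)


def failure_gaps(log_text):
    """Return the seconds between consecutive 'failed to write reg' messages.

    Assumes every timestamp is on the same day (no wrap past midnight).
    """
    times = [
        datetime.strptime(match.group(1), "%H:%M:%S")
        for match in PATTERN.finditer(log_text)
    ]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = failure_gaps(LOG)
print(gaps)       # [13, 13, 13, 14, 13, 13, 13]
print(sum(gaps))  # 92
```

For the excerpt above, the failures repeat roughly every 13 seconds, adding up to 92 seconds between the first and last message shown.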

After reverting that commit, the kernel works normally again.
Related reports from our users are here (ignore the NVIDIA posts):
https://forum.siduction.org/index.php?topic=8439.0

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-10-31  9:46 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

--- Comment #1 from Sebastian Dalfuß (sd@sedf.de) ---
I can confirm this for a
"04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Picasso (rev c2)".


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-01 19:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

--- Comment #2 from towo@siduction.org ---
The relevant commit is 714d9e4574d54596973ee3b0624ee4a16264d700


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-01 19:44 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

towo@siduction.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.14.15                     |5.14.15, 5.15.0
         Regression|No                          |Yes


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-01 20:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

--- Comment #3 from towo@siduction.org ---
Additional info: after installing the kernel from a working system, the first
boot with that kernel works flawlessly. On rebooting with that kernel, the boot
hangs for a long time; the desktop then starts, but the system is not really
usable. None of these problems occur after reverting
714d9e4574d54596973ee3b0624ee4a16264d700.


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-02 20:30 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

Alex Deucher (alexdeucher@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #4 from Alex Deucher (alexdeucher@gmail.com) ---
I think this patch set should address the issue:
https://patchwork.freedesktop.org/series/96508/


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-03  1:46 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

James Zhu (jamesz@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamesz@amd.com

--- Comment #5 from James Zhu (jamesz@amd.com) ---
Created attachment 299413
  --> https://bugzilla.kernel.org/attachment.cgi?id=299413&action=edit
patch to fix

I suggest upgrading to 5.15-rc7, applying this patch, and then testing.


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-03 14:29 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

--- Comment #6 from James Zhu (jamesz@amd.com) ---
Created attachment 299437
  --> https://bugzilla.kernel.org/attachment.cgi?id=299437&action=edit
analysis for this issue

Linux 5.14.15 + afd1818 fixes the issue.

In Linux 5.15-rc7, re-applying "init iommu after amdkfd device init" and "move
iommu_resume before ip init/resume" overwrote afd1818, which caused the issue
again.

714d9e4 drm/amdgpu: init iommu after amdkfd device init

f02abeb drm/amdgpu: move iommu_resume before ip init/resume

afd1818 drm/amdkfd: fix boot failure when iommu is disabled in Picasso.

286826d drm/amdgpu: init iommu after amdkfd device init

9cec53c drm/amdgpu: move iommu_resume before ip init/resume
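
The commit list above contains two subjects that each appear under two different short hashes, which matches the "re-apply" described in the analysis. A small sketch that makes the pairing explicit follows; the hashes and subjects are copied from this comment, while the helper names group_by_subject and reapplied are hypothetical. It only groups by subject line and makes no claim about which branch each hash belongs to.

```python
from collections import defaultdict

# (short hash, subject) pairs copied from the comment above, in listed order.
COMMITS = [
    ("714d9e4", "drm/amdgpu: init iommu after amdkfd device init"),
    ("f02abeb", "drm/amdgpu: move iommu_resume before ip init/resume"),
    ("afd1818", "drm/amdkfd: fix boot failure when iommu is disabled in Picasso."),
    ("286826d", "drm/amdgpu: init iommu after amdkfd device init"),
    ("9cec53c", "drm/amdgpu: move iommu_resume before ip init/resume"),
]


def group_by_subject(commits):
    """Map each subject line to the short hashes listed under it."""
    groups = defaultdict(list)
    for short_hash, subject in commits:
        groups[subject].append(short_hash)
    return dict(groups)


def reapplied(commits):
    """Return only the subjects that appear under more than one hash."""
    return {
        subject: hashes
        for subject, hashes in group_by_subject(commits).items()
        if len(hashes) > 1
    }


for subject, hashes in reapplied(COMMITS).items():
    print(f"{subject}: {', '.join(hashes)}")
```

Only afd1818 has a unique subject in this list; the other two subjects each occur twice.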


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-04 16:55 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

--- Comment #7 from towo@siduction.org ---
With Linux 5.14.17-rc1 and 5.15.1-rc1 the problem is gone.
So I think the bug is resolved.


* [Bug 214859] drm-amdgpu-init-iommu~fd-device-init.patch introduce bug
From: bugzilla-daemon @ 2021-11-17 16:38 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214859

spasswolf@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spasswolf@web.de

--- Comment #8 from spasswolf@web.de ---
*** Bug 214901 has been marked as a duplicate of this bug. ***
