* [PATCH] drm/amdgpu: Fix Kernel Oops triggered by kfdtest
From: Joerg Roedel @ 2018-11-15 14:46 UTC
To: Rex Zhu, Evan Quan, Alex Deucher, christian.koenig, David1.Zhou
Cc: amd-gfx, dri-devel, linux-kernel, Joerg Roedel
From: Joerg Roedel <jroedel@suse.de>
Running kfdtest on Kaveri triggers a kernel NULL-ptr dereference:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 42c017067 P4D 42c017067 PUD 40f071067 PMD 0
Oops: 0010 [#1] SMP NOPTI
CPU: 0 PID: 13107 Comm: kfdtest Not tainted 4.20.0-rc2+ #11
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./F2A88XM-HD3, BIOS F6 05/28/2014
RIP: 0010: (null)
Code: Bad RIP value.
RSP: 0018:ffffc90001adbbf0 EFLAGS: 00010202
RAX: ffffffffa0806240 RBX: ffff88842a0fbc00 RCX: 0000000000000002
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffff888429690000
RBP: ffffc90001adbbf8 R08: 0000000000002000 R09: ffff88842e542ec0
R10: 00007feff778f008 R11: 00007feff778f010 R12: 0000000000000000
R13: ffff88840f063a20 R14: ffff88842a0fbd20 R15: 000000000f43ff60
FS: 00007feff7769740(0000) GS:ffff88842fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000040f122000 CR4: 00000000000406f0
Call Trace:
? amdgpu_amdkfd_set_compute_idle+0x29/0x30 [amdgpu]
register_process+0x140/0x150 [amdgpu]
pqm_create_queue+0x395/0x560 [amdgpu]
kfd_ioctl_create_queue+0x285/0x680 [amdgpu]
kfd_ioctl+0x27f/0x450 [amdgpu]
? kfd_ioctl_destroy_queue+0x80/0x80 [amdgpu]
do_vfs_ioctl+0x96/0x6a0
? __audit_syscall_entry+0xdd/0x130
? handle_mm_fault+0x11b/0x240
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x61/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The cause is that the pp_funcs->switch_power_profile
function pointer is not set for the Kaveri ASIC, so the
kernel ends up calling through a NULL pointer.
Add a NULL check before calling the function to avoid that.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
index f972cd156795..0ecedd30f2aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@@ -337,8 +337,9 @@ enum amdgpu_pcie_gen {
 		(adev)->powerplay.pp_handle, request))

 #define amdgpu_dpm_switch_power_profile(adev, type, en) \
-		((adev)->powerplay.pp_funcs->switch_power_profile(\
-			(adev)->powerplay.pp_handle, type, en))
+	if ((adev)->powerplay.pp_funcs->switch_power_profile != NULL) \
+		((adev)->powerplay.pp_funcs->switch_power_profile(\
+			(adev)->powerplay.pp_handle, type, en))

 #define amdgpu_dpm_set_clockgating_by_smu(adev, msg_id) \
 		((adev)->powerplay.pp_funcs->set_clockgating_by_smu(\
--
2.13.7
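
For readers following the fix, the kind of guard the patch adds can be sketched outside the kernel roughly as below. This is only an illustration under stated assumptions, not the driver code: struct pp_funcs_sketch, struct device_sketch, count_switch, demo_calls and switch_power_profile_sketch are hypothetical stand-ins for the amdgpu structures and macro. The sketch also wraps the check in do { } while (0), a common C macro idiom that the patch itself does not use, so that the macro stays a single statement even when it is followed by an else.

```c
#include <stddef.h>

/* Hypothetical stand-in for the powerplay function table. */
struct pp_funcs_sketch {
	int (*switch_power_profile)(void *handle, int type, int en);
};

/* Hypothetical stand-in for the device structure. */
struct device_sketch {
	const struct pp_funcs_sketch *pp_funcs;
	void *pp_handle;
};

/* Example callback: counts how often it was invoked via the handle. */
static int count_switch(void *handle, int type, int en)
{
	int *calls = handle;

	(void)type;
	(void)en;
	++*calls;
	return 0;
}

/*
 * Guarded call: only invoke the callback if the table entry is set.
 * The do { } while (0) wrapper is an addition of this sketch.
 */
#define switch_power_profile_sketch(dev, type, en)			\
	do {								\
		if ((dev)->pp_funcs->switch_power_profile != NULL)	\
			(dev)->pp_funcs->switch_power_profile(		\
				(dev)->pp_handle, (type), (en));	\
	} while (0)

/* Returns how many times the callback ran for one guarded call. */
static int demo_calls(int have_callback)
{
	int calls = 0;
	struct pp_funcs_sketch funcs = { have_callback ? count_switch : NULL };
	struct device_sketch dev = { &funcs, &calls };

	switch_power_profile_sketch(&dev, 0, 1);
	return calls;
}
```

With these stand-ins, demo_calls(1) runs the callback once, while demo_calls(0) skips the call instead of jumping to a NULL address.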
* RE: [PATCH] drm/amdgpu: Fix Kernel Oops triggered by kfdtest
From: Kuehling, Felix @ 2018-11-15 20:38 UTC
To: Joerg Roedel; Zhu, Rex; Quan, Evan; Deucher, Alexander;
Koenig, Christian; Zhou, David(ChunMing)
Cc: Joerg Roedel, dri-devel, amd-gfx, linux-kernel
Apologies. We already have a fix for this on our internal amd-kfd-staging branch, but it's missing from amd-staging-drm-next. I'll cherry-pick our fix to amd-staging-drm-next and nominate it for drm-fixes.
Regards,
Felix