All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister()
@ 2019-09-01  9:38 Petr Cvek
       [not found] ` <dad0e51a-0f06-e2b0-cef7-3587207c2045-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Petr Cvek @ 2019-09-01  9:38 UTC (permalink / raw)
  To: Deucher, Alexander, Koenig, Christian, David1.Zhou-5C7GfCeVMHo,
	sfr-3FnU+UHB4dNDw9hX6IcOSA, jgg-VPRAkNaXOzVWk0Htik3J/w
  Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi,

kernel: 5.3.0-rc6-next

After starting Xorg and running xrandr the Xorg crashes with (not exactly useful, it is MIPS dump):

[   28.842553] CPU 0 Unable to handle kernel paging request at virtual address 0000001c, epc == 808de6d4, ra == 804d32ec
[   28.853387] Oops[#1]:
[   28.855699] CPU: 0 PID: 692 Comm: Xorg Not tainted 5.3.0-rc6-next-20190826+ #59
[   28.863104] $ 0   : 00000000 80b60000 00000011 87f1af00
[   28.868407] $ 4   : 0000001c 00000002 00000002 ffff00fe
[   28.873705] $ 8   : 865e9fe0 0000fc00 00000004 00000000
[   28.879003] $12   : 87f1baf0 00000000 0000da9a 00000040
[   28.884301] $16   : 86434450 86434400 00000000 0000001c
[   28.889600] $20   : 865e9dbc 00000000 80912ee4 865e9dbc
[   28.894898] $24   : 80add220 27cfd6fd                  
[   28.900198] $28   : 865e8000 865e9cb8 00000009 804d32ec
[   28.905499] Hi    : 000091bb
[   28.908414] Lo    : ffff6e44
[   28.911350] epc   : 808de6d4 mutex_lock+0x8/0x44
[   28.916045] ra    : 804d32ec radeon_mn_unregister+0x3c/0xb0
[   28.921687] Status: 1100fc03 KERNEL EXL IE 
[   28.925929] Cause : 00800008 (ExcCode 02)
[   28.929987] BadVA : 0000001c
[   28.932903] PrId  : 00019655 (MIPS 24KEc)
[   28.936961] Modules linked in: usbhid hid_generic hid evdev
[   28.942635] Process Xorg (pid: 692, threadinfo=68a84c48, task=84477b53, tls=77e03da0)
[   28.950566] Stack : 00000000 804d32e4 00000001 00000000 84d7b400 84d7b400 8784a078 86434450
[   28.959043]         86632600 8663268c 803a4ed4 8041583c 00000000 803b6d94 865e9dbc 86434450
[   28.967519]         86632600 86434400 86632600 803a451c 87912980 879129ac 80ae0000 00000007
[   28.975996]         00000007 86632620 86632600 803a45d0 87ffc718 71a8f000 71a8f000 87ffc71c
[   28.984472]         71a8efff 800d3c08 865eac00 86632600 00000000 803a5bf4 71a8f000 00000000
[   28.992948]         ...
[   28.995425] Call Trace:
[   28.997905] [<808de6d4>] mutex_lock+0x8/0x44
[   29.002239] [<804d32ec>] radeon_mn_unregister+0x3c/0xb0
[   29.007550] [<8041583c>] radeon_gem_object_free+0x18/0x2c
[   29.013031] [<803a451c>] drm_gem_object_release_handle+0x74/0xac
[   29.019122] [<803a45d0>] drm_gem_handle_delete+0x7c/0x128
[   29.024599] [<803a5bf4>] drm_ioctl_kernel+0xb0/0x108
[   29.029633] [<803a5e74>] drm_ioctl+0x200/0x3a8
[   29.034154] [<803e07b4>] radeon_drm_ioctl+0x54/0xc0
[   29.039110] [<801214dc>] do_vfs_ioctl+0x4e8/0x81c
[   29.043880] [<80121864>] ksys_ioctl+0x54/0xb0
[   29.048305] [<8001100c>] syscall_common+0x34/0x58
[   29.053074] Code: 24050002  27bdfff8  8f830000 <c0850000> 14a00005  00000000  00600825  e0810000  1020fffa 

but it seems there is NULL pointer at this line:

	https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/radeon/radeon_mn.c?h=next-20190830#n237

The code is:

	struct radeon_mn *rmn = bo->mn;
	...
	mutex_lock(&rmn->lock);		//<-crash

A quick assert proves the bo->mn returns NULL. The code worked in 4.19-rc and it seems the problematic patch is 

	drm/radeon: use mmu_notifier_get/put for struct radeon_mn

as it removes the NULL check.

Forcing -ENODEV in the register funtion (and immediate return in unregister as without CONFIG_MMU_NOTIFIER) works.

Petr
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister()
       [not found] ` <dad0e51a-0f06-e2b0-cef7-3587207c2045-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-09-01 14:04   ` Jason Gunthorpe
       [not found]     ` <20190901140409.GA1251-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2019-09-01 14:04 UTC (permalink / raw)
  To: Petr Cvek
  Cc: Deucher, Alexander, David1.Zhou-5C7GfCeVMHo, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	sfr-3FnU+UHB4dNDw9hX6IcOSA

On Sun, Sep 01, 2019 at 11:38:10AM +0200, Petr Cvek wrote:

> The code is:
> 
> 	struct radeon_mn *rmn = bo->mn;
> 	...
> 	mutex_lock(&rmn->lock);		//<-crash
> 
> A quick assert proves the bo->mn returns NULL. The code worked in
> 4.19-rc and it seems the problematic patch is

Hum, the code went away because the locking protecting that variable
went away.. It means the caller is not careful to pair register and
unregister.
 
> 	drm/radeon: use mmu_notifier_get/put for struct radeon_mn
> 
> as it removes the NULL check.
> 
> Forcing -ENODEV in the register funtion (and immediate return in
> unregister as without CONFIG_MMU_NOTIFIER) works.

Is just adding a

  if (!rmn)
       retrun

To the top of radeon_mn_unregister enough to fix it?

Jason
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister()
       [not found]     ` <20190901140409.GA1251-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2019-09-01 15:48       ` Petr Cvek
       [not found]         ` <2fc7ef14-e89a-1f2d-381d-1c9b05da02d3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Petr Cvek @ 2019-09-01 15:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Deucher, Alexander, David1.Zhou-5C7GfCeVMHo, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	sfr-3FnU+UHB4dNDw9hX6IcOSA

Dne 01. 09. 19 v 16:04 Jason Gunthorpe napsal(a):
> On Sun, Sep 01, 2019 at 11:38:10AM +0200, Petr Cvek wrote:
> 
>> The code is:
>>
>> 	struct radeon_mn *rmn = bo->mn;
>> 	...
>> 	mutex_lock(&rmn->lock);		//<-crash
>>
>> A quick assert proves the bo->mn returns NULL. The code worked in
>> 4.19-rc and it seems the problematic patch is
> 
> Hum, the code went away because the locking protecting that variable
> went away.. It means the caller is not careful to pair register and
> unregister.
>  
>> 	drm/radeon: use mmu_notifier_get/put for struct radeon_mn
>>
>> as it removes the NULL check.
>>
>> Forcing -ENODEV in the register funtion (and immediate return in
>> unregister as without CONFIG_MMU_NOTIFIER) works.
> 
> Is just adding a
> 
>   if (!rmn)
>        retrun
> 
> To the top of radeon_mn_unregister enough to fix it?

Yeah it seems to work. A further test with minetest works too.

Petr 

> 
> Jason
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister()
       [not found]         ` <2fc7ef14-e89a-1f2d-381d-1c9b05da02d3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-09-02  6:19           ` Jason Gunthorpe
  0 siblings, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2019-09-02  6:19 UTC (permalink / raw)
  To: Petr Cvek
  Cc: Deucher, Alexander, David1.Zhou-5C7GfCeVMHo, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	sfr-3FnU+UHB4dNDw9hX6IcOSA

On Sun, Sep 01, 2019 at 05:48:36PM +0200, Petr Cvek wrote:
> > Is just adding a
> > 
> >   if (!rmn)
> >        retrun
> > 
> > To the top of radeon_mn_unregister enough to fix it?
> 
> Yeah it seems to work. A further test with minetest works too.

Okay, I added this patch to the hmm tree, thanks.

From 829394d77e3026e08e7879fb37f14c90de7b0fd8 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Mon, 2 Sep 2019 03:01:03 -0300
Subject: [PATCH] drm/radeon: guard against calling an unpaired
 radeon_mn_unregister()

This check was accidently deleted in the below commit. There are cases
where the driver will call unregister even though it hasn't registered
anything.

 CPU 0 Unable to handle kernel paging request at virtual address 0000001c, epc == 808de6d4, ra == 804d32ec
 Call Trace:
 [<808de6d4>] mutex_lock+0x8/0x44
 [<804d32ec>] radeon_mn_unregister+0x3c/0xb0
 [<8041583c>] radeon_gem_object_free+0x18/0x2c
 [<803a451c>] drm_gem_object_release_handle+0x74/0xac
 [<803a45d0>] drm_gem_handle_delete+0x7c/0x128
 [<803a5bf4>] drm_ioctl_kernel+0xb0/0x108
 [<803a5e74>] drm_ioctl+0x200/0x3a8
 [<803e07b4>] radeon_drm_ioctl+0x54/0xc0
 [<801214dc>] do_vfs_ioctl+0x4e8/0x81c
 [<80121864>] ksys_ioctl+0x54/0xb0
 [<8001100c>] syscall_common+0x34/0x58

Link: https://lore.kernel.org/r/2fc7ef14-e89a-1f2d-381d-1c9b05da02d3@gmail.com
Fixes: 534e5f84b7a9 ("drm/radeon: use mmu_notifier_get/put for struct radeon_mn")
Reported-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/gpu/drm/radeon/radeon_mn.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index fc8254273a800b..1ee20d528a7c24 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -234,6 +234,9 @@ void radeon_mn_unregister(struct radeon_bo *bo)
 	struct radeon_mn *rmn = bo->mn;
 	struct list_head *head;
 
+	if (!rmn)
+		return;
+
 	mutex_lock(&rmn->lock);
 	/* save the next list entry for later */
 	head = bo->mn_list.next;
-- 
2.23.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-09-02  6:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-01  9:38 [BUG, regression] Dereferencing of NULL pointer in radeon_mn_unregister() Petr Cvek
     [not found] ` <dad0e51a-0f06-e2b0-cef7-3587207c2045-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-09-01 14:04   ` Jason Gunthorpe
     [not found]     ` <20190901140409.GA1251-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2019-09-01 15:48       ` Petr Cvek
     [not found]         ` <2fc7ef14-e89a-1f2d-381d-1c9b05da02d3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-09-02  6:19           ` Jason Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.