linux-kernel.vger.kernel.org archive mirror
* Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
@ 2016-03-07  2:50 Erik Andersen
  2016-03-07 20:46 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Andersen @ 2016-03-07  2:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman

The following patch to radeon_sa_bo_new that
went into 3.10.99

  commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
  Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
  Date:   Fri Feb 5 14:35:53 2016 -0500
    drm/radeon: hold reference to fences in radeon_sa_bo_new
    commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.

is triggering an Oops for me right when xscreensaver
first began doing 3D stuff.  After reverting this
patch, xscreensaver has been happily running 3D stuff.

Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP

Mar  6 18:00:43 sage kernel: Stack:
Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
Mar  6 18:00:43 sage kernel: Call Trace:
Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6

$ lspci | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]

 -Erik

--
Erik B. Andersen
--This message was written using 73% post-consumer electrons--

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-07  2:50 Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref Erik Andersen
@ 2016-03-07 20:46 ` Greg Kroah-Hartman
  2016-03-07 21:06   ` Christian König
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2016-03-07 20:46 UTC (permalink / raw)
  To: andersen, linux-kernel, Nicolai Hähnle, Christian König; +Cc: stable

On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
> The following patch to radeon_sa_bo_new that
> went into 3.10.99
> 
>   commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>   Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
>   Date:   Fri Feb 5 14:35:53 2016 -0500
>     drm/radeon: hold reference to fences in radeon_sa_bo_new
>     commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
> 
> is triggering an Oops for me right when xscreensaver
> first began doing 3D stuff.  After reverting this
> patch, xscreensaver has been happily running 3D stuff.
> 
> Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
> Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
> Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
> 
> Mar  6 18:00:43 sage kernel: Stack:
> Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
> Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
> Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
> Mar  6 18:00:43 sage kernel: Call Trace:
> Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
> Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
> Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
> Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
> Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
> Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
> Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
> Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
> Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
> Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
> 
> $ lspci | grep VGA
> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]

Next time, please cc: the people responsible for that patch as well...

I can revert it, but maybe something else is going on here?  Do you have
this same problem on 3.14, and 4.5-rc7?

thanks,

greg k-h

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-07 20:46 ` Greg Kroah-Hartman
@ 2016-03-07 21:06   ` Christian König
  2016-03-07 22:58     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2016-03-07 21:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman, andersen, linux-kernel, Nicolai Hähnle; +Cc: stable

Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
> On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
>> The following patch to radeon_sa_bo_new that
>> went into 3.10.99
>>
>>    commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>>    Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>    Date:   Fri Feb 5 14:35:53 2016 -0500
>>      drm/radeon: hold reference to fences in radeon_sa_bo_new
>>      commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
>>
>> is triggering an Oops for me right when xscreensaver
>> first began doing 3D stuff.  After reverting this
>> patch, xscreensaver has been happily running 3D stuff.
>>
>> Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>> Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
>> Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
>> Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
>>
>> Mar  6 18:00:43 sage kernel: Stack:
>> Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
>> Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
>> Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
>> Mar  6 18:00:43 sage kernel: Call Trace:
>> Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
>> Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
>> Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
>> Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
>> Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
>> Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
>> Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
>> Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
>>
>> $ lspci | grep VGA
>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>> [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
> Next time, please cc: the people responsible for that patch as well...
>
> I can revert it, but maybe something else is going on here?  Do you have
> this same problem on 3.14, and 4.5-rc7?

Hi Greg,

Yes, that's an already known issue. Feel free to revert that one for now.

I have it on my TODO list to provide a fixed patch for older kernels, but
that can take a while.

For background: Nicolai's patch is correct, but it assumes that
radeon_fence_unref() can safely take NULL as the fence, which is not the
case for older kernels.

Regards,
Christian.

>
> thanks,
>
> greg k-h

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-07 21:06   ` Christian König
@ 2016-03-07 22:58     ` Greg Kroah-Hartman
  2016-03-09 13:56       ` Luis Henriques
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2016-03-07 22:58 UTC (permalink / raw)
  To: Christian König; +Cc: andersen, linux-kernel, Nicolai Hähnle, stable

On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
> Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
> >On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
> >>The following patch to radeon_sa_bo_new that
> >>went into 3.10.99
> >>
> >>   commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
> >>   Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
> >>   Date:   Fri Feb 5 14:35:53 2016 -0500
> >>     drm/radeon: hold reference to fences in radeon_sa_bo_new
> >>     commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
> >>
> >>is triggering an Oops for me right when xscreensaver
> >>first began doing 3D stuff.  After reverting this
> >>patch, xscreensaver has been happily running 3D stuff.
> >>
> >>Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> >>Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
> >>Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
> >>Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
> >>
> >>Mar  6 18:00:43 sage kernel: Stack:
> >>Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
> >>Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
> >>Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
> >>Mar  6 18:00:43 sage kernel: Call Trace:
> >>Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
> >>Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
> >>Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
> >>Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
> >>Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
> >>Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
> >>Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
> >>Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
> >>Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
> >>Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
> >>
> >>$ lspci | grep VGA
> >>03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >>[AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
> >Next time, please cc: the people responsible for that patch as well...
> >
> >I can revert it, but maybe something else is going on here?  Do you have
> >this same problem on 3.14, and 4.5-rc7?
> 
> Hi Greg,
> 
> Yes, that's an already known issue. Feel free to revert that one for now.
> 
> I have it on my TODO list to provide a fixed patch for older kernels, but that
> can take a while.
> 
> For background: Nicolai's patch is correct, but it assumes that
> radeon_fence_unref() can safely take NULL as the fence, which is not the case
> for older kernels.

Ok, thanks, now reverted.

greg k-h

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-07 22:58     ` Greg Kroah-Hartman
@ 2016-03-09 13:56       ` Luis Henriques
  2016-03-09 16:31         ` Nicolai Hähnle
  2016-03-14 12:33         ` Jiri Slaby
  0 siblings, 2 replies; 8+ messages in thread
From: Luis Henriques @ 2016-03-09 13:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christian König, andersen, linux-kernel,
	Nicolai Hähnle, stable, Sasha Levin, Jiri Slaby,
	Kamal Mostafa

On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
> On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
> > Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
> > >On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
> > >>The following patch to radeon_sa_bo_new that
> > >>went into 3.10.99
> > >>
> > >>   commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
> > >>   Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
> > >>   Date:   Fri Feb 5 14:35:53 2016 -0500
> > >>     drm/radeon: hold reference to fences in radeon_sa_bo_new
> > >>     commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
> > >>
> > >>is triggering an Oops for me right when xscreensaver
> > >>first began doing 3D stuff.  After reverting this
> > >>patch, xscreensaver has been happily running 3D stuff.
> > >>
> > >>Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > >>Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
> > >>Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
> > >>Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
> > >>
> > >>Mar  6 18:00:43 sage kernel: Stack:
> > >>Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
> > >>Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
> > >>Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
> > >>Mar  6 18:00:43 sage kernel: Call Trace:
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
> > >>Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
> > >>
> > >>$ lspci | grep VGA
> > >>03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > >>[AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
> > >Next time, please cc: the people responsible for that patch as well...
> > >
> > >I can revert it, but maybe something else is going on here?  Do you have
> > >this same problem on 3.14, and 4.5-rc7?
> > 
> > Hi Greg,
> > 
> > Yes, that's an already known issue. Feel free to revert that one for now.
> > 
> > I have it on my TODO list to provide a fixed patch for older kernels, but that
> > can take a while.
> > 
> > For background: Nicolai's patch is correct, but it assumes that
> > radeon_fence_unref() can safely take NULL as the fence, which is not the case
> > for older kernels.
> 
> Ok, thanks, now reverted.
> 

And looks like a few more kernels may be affected as well.  I'll
revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
the CC list.

Cheers,
--
Luís

> greg k-h

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-09 13:56       ` Luis Henriques
@ 2016-03-09 16:31         ` Nicolai Hähnle
  2016-03-09 16:38           ` Greg Kroah-Hartman
  2016-03-14 12:33         ` Jiri Slaby
  1 sibling, 1 reply; 8+ messages in thread
From: Nicolai Hähnle @ 2016-03-09 16:31 UTC (permalink / raw)
  To: Luis Henriques, Greg Kroah-Hartman
  Cc: Christian König, andersen, linux-kernel, stable,
	Sasha Levin, Jiri Slaby, Kamal Mostafa

On 09.03.2016 08:56, Luis Henriques wrote:
> On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
>> On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
>>> Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
>>>> On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
>>>>> The following patch to radeon_sa_bo_new that
>>>>> went into 3.10.99
>>>>>
>>>>>    commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>>>>>    Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>>>>    Date:   Fri Feb 5 14:35:53 2016 -0500
>>>>>      drm/radeon: hold reference to fences in radeon_sa_bo_new
>>>>>      commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
>>>>>
>>>>> is triggering an Oops for me right when xscreensaver
>>>>> first began doing 3D stuff.  After reverting this
>>>>> patch, xscreensaver has been happily running 3D stuff.
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>>>>> Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
>>>>> Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
>>>>> Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: Stack:
>>>>> Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
>>>>> Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
>>>>> Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
>>>>> Mar  6 18:00:43 sage kernel: Call Trace:
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
>>>>>
>>>>> $ lspci | grep VGA
>>>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>>>>> [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
>>>> Next time, please cc: the people responsible for that patch as well...
>>>>
>>>> I can revert it, but maybe something else is going on here?  Do you have
>>>> this same problem on 3.14, and 4.5-rc7?
>>>
>>> Hi Greg,
>>>
>>> Yes, that's an already known issue. Feel free to revert that one for now.
>>>
>>> I have it on my TODO list to provide a fixed patch for older kernels, but that
>>> can take a while.
>>>
>>> For background: Nicolai's patch is correct, but it assumes that
>>> radeon_fence_unref() can safely take NULL as the fence, which is not the case
>>> for older kernels.

Actually, the call to radeon_fence_ref() is the culprit.

>>
>> Ok, thanks, now reverted.
>>
>
> And looks like a few more kernels may be affected as well.  I'll
> revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
> the CC list.

Kernels that contain commit 954605ca "drm/radeon: use common fence 
implementation for fences, v4" are safe, older kernels require a 
NULL-pointer check around the call to radeon_fence_ref.

This means kernels 3.17 and older are affected and need the additional 
NULL pointer check that I've sent out already on a different thread (I'm 
attaching it again, hoping that Erik gets a chance to test it).

It would be nice to get a confirmation that this really does fix the 
observed bug, then I can prepare a fixed version of the patch for 3.17 
and older (i.e. squash the original bad commit with the attached patch).

Cheers,
Nicolai

>
> Cheers,
> --
> Luís
>
>> greg k-h

[-- Attachment #2: 0001-drm-radeon-guard-call-to-radeon_fence_ref-against-NU.patch --]
[-- Type: text/x-patch; name="0001-drm-radeon-guard-call-to-radeon_fence_ref-against-NU.patch", Size: 1447 bytes --]

From 85d028178d9772f2a07e4ed156820d95c4e0ad18 Mon Sep 17 00:00:00 2001
From: Nicolai Hähnle <nicolai.haehnle@amd.com>
Date: Mon, 7 Mar 2016 23:41:52 -0300
Subject: [PATCH] drm/radeon: guard call to radeon_fence_ref against NULL
 pointers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Candidate fix for a kernel oops that was introduced by the backport of
commit f6ff4f67cdf8 "drm/radeon: hold reference to fences in radeon_sa_bo_new"
to kernels where radeon does not use the common fence implementation for
fences.

Reported-by: Lutz Euler <lutz.euler@freenet.de>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 drivers/gpu/drm/radeon/radeon_sa.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 197b157..7d11901 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -349,8 +349,10 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 			/* see if we can skip over some allocations */
 		} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
 
-		for (i = 0; i < RADEON_NUM_RINGS; ++i)
-			radeon_fence_ref(fences[i]);
+		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+			if (fences[i])
+				radeon_fence_ref(fences[i]);
+		}
 
 		spin_unlock(&sa_manager->wq.lock);
 		r = radeon_fence_wait_any(rdev, fences, false);
-- 
2.5.0


* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-09 16:31         ` Nicolai Hähnle
@ 2016-03-09 16:38           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 8+ messages in thread
From: Greg Kroah-Hartman @ 2016-03-09 16:38 UTC (permalink / raw)
  To: Nicolai Hähnle
  Cc: Luis Henriques, Christian König, andersen, linux-kernel,
	stable, Sasha Levin, Jiri Slaby, Kamal Mostafa

On Wed, Mar 09, 2016 at 11:31:54AM -0500, Nicolai Hähnle wrote:
> On 09.03.2016 08:56, Luis Henriques wrote:
> >On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
> >>On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
> >>>Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
> >>>>On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
> >>>>>The following patch to radeon_sa_bo_new that
> >>>>>went into 3.10.99
> >>>>>
> >>>>>   commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
> >>>>>   Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
> >>>>>   Date:   Fri Feb 5 14:35:53 2016 -0500
> >>>>>     drm/radeon: hold reference to fences in radeon_sa_bo_new
> >>>>>     commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
> >>>>>
> >>>>>is triggering an Oops for me right when xscreensaver
> >>>>>first began doing 3D stuff.  After reverting this
> >>>>>patch, xscreensaver has been happily running 3D stuff.
> >>>>>
> >>>>>Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> >>>>>Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
> >>>>>Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
> >>>>>Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
> >>>>>
> >>>>>Mar  6 18:00:43 sage kernel: Stack:
> >>>>>Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
> >>>>>Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
> >>>>>Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
> >>>>>Mar  6 18:00:43 sage kernel: Call Trace:
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
> >>>>>Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
> >>>>>
> >>>>>$ lspci | grep VGA
> >>>>>03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >>>>>[AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
> >>>>Next time, please cc: the people responsible for that patch as well...
> >>>>
> >>>>I can revert it, but maybe something else is going on here?  Do you have
> >>>>this same problem on 3.14, and 4.5-rc7?
> >>>
> >>>Hi Greg,
> >>>
> >>>Yes, that's an already known issue. Feel free to revert that one for now.
> >>>
> >>>I have it on my TODO list to provide a fixed patch for older kernels, but that
> >>>can take a while.
> >>>
> >>>For background: Nicolai's patch is correct, but it assumes that
> >>>radeon_fence_unref() can safely take NULL as the fence, which is not the case
> >>>for older kernels.
> 
> Actually, the call to radeon_fence_ref() is the culprit.
> 
> >>
> >>Ok, thanks, now reverted.
> >>
> >
> >And looks like a few more kernels may be affected as well.  I'll
> >revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
> >the CC list.
> 
> Kernels that contain commit 954605ca "drm/radeon: use common fence
> implementation for fences, v4" are safe, older kernels require a
> NULL-pointer check around the call to radeon_fence_ref.
> 
> This means kernels 3.17 and older are affected and need the additional NULL
> pointer check that I've sent out already on a different thread (I'm
> attaching it again, hoping that Erik gets a chance to test it).
> 
> It would be nice to get a confirmation that this really does fix the
> observed bug, then I can prepare a fixed version of the patch for 3.17 and
> older (i.e. squash the original bad commit with the attached patch).

Don't "squash" anything together, just send the needed patches
backported, we want to keep things to match Linus's tree as much as
possible.

thanks,

greg k-h

* Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
  2016-03-09 13:56       ` Luis Henriques
  2016-03-09 16:31         ` Nicolai Hähnle
@ 2016-03-14 12:33         ` Jiri Slaby
  1 sibling, 0 replies; 8+ messages in thread
From: Jiri Slaby @ 2016-03-14 12:33 UTC (permalink / raw)
  To: Luis Henriques, Greg Kroah-Hartman
  Cc: Christian König, andersen, linux-kernel,
	Nicolai Hähnle, stable, Sasha Levin, Kamal Mostafa

On 03/09/2016, 02:56 PM, Luis Henriques wrote:
> On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
>> On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
>>> Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
>>>> On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
>>>>> The following patch to radeon_sa_bo_new that
>>>>> went into 3.10.99
>>>>>
>>>>>   commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>>>>>   Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>>>>   Date:   Fri Feb 5 14:35:53 2016 -0500
>>>>>     drm/radeon: hold reference to fences in radeon_sa_bo_new
>>>>>     commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
>>>>>
>>>>> is triggering an Oops for me right when xscreensaver
>>>>> first began doing 3D stuff.  After reverting this
>>>>> patch, xscreensaver has been happily running 3D stuff.
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>>>>> Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
>>>>> Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
>>>>> Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: Stack:
>>>>> Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
>>>>> Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
>>>>> Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
>>>>> Mar  6 18:00:43 sage kernel: Call Trace:
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
>>>>>
>>>>> $ lspci | grep VGA
>>>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>>>>> [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
>>>> Next time, please cc: the people responsible for that patch as well...
>>>>
>>>> I can revert it, but maybe something else is going on here?  Do you have
>>>> this same problem on 3.14, and 4.5-rc7?
>>>
>>> Hi Greg,
>>>
>>> Yes, that's an already known issue. Feel free to revert that one for now.
>>>
>>> I have it on my TODO list to provide a fixed patch for older kernels, but that
>>> can take a while.
>>>
>>> For background: Nicolai's patch is correct, but it assumes that
>>> radeon_fence_unref() can safely take NULL as the fence, which is not the case
>>> for older kernels.
>>
>> Ok, thanks, now reverted.
>>
> 
> And looks like a few more kernels may be affected as well.  I'll
> revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
> the CC list.

Reverted from 3.12. Thanks!

-- 
js
suse labs

Thread overview: 8+ messages
2016-03-07  2:50 Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref Erik Andersen
2016-03-07 20:46 ` Greg Kroah-Hartman
2016-03-07 21:06   ` Christian König
2016-03-07 22:58     ` Greg Kroah-Hartman
2016-03-09 13:56       ` Luis Henriques
2016-03-09 16:31         ` Nicolai Hähnle
2016-03-09 16:38           ` Greg Kroah-Hartman
2016-03-14 12:33         ` Jiri Slaby
