From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Kees Cook <keescook@chromium.org>, Mazin Rezk <mnrzk@protonmail.com>
Cc: linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Harry Wentland" <Harry.Wentland@amd.com>,
	"Nicholas Kazlauskas" <nicholas.kazlauskas@amd.com>,
	sunpeng.li@amd.com,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	1i5t5.duncan@cox.net, mphantomx@yahoo.com.br,
	regressions@leemhuis.info, anthony.ruhier@gmail.com
Subject: Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free
Date: Fri, 24 Jul 2020 09:45:18 +0200
Message-ID: <a86cba0b-4513-e7c3-ae75-bb331433f664@molgen.mpg.de>
In-Reply-To: <202007231524.A24720C@keescook>

Dear Kees,


On 24.07.20 at 00:32, Kees Cook wrote:
> On Thu, Jul 23, 2020 at 09:10:15PM +0000, Mazin Rezk wrote:
>> When amdgpu_dm_atomic_commit_tail is running in the workqueue,
>> drm_atomic_state_put will get called while amdgpu_dm_atomic_commit_tail is
>> running, causing a race condition where state (and then dm_state) is
>> sometimes freed while amdgpu_dm_atomic_commit_tail is running. This bug has
>> occurred since 5.7-rc1 and is well documented among polaris11 users [1].
>>
>> Prior to 5.7, this was not a noticeable issue since the freelist pointer
>> was stored at the beginning of dm_state (base), which was unused. After
>> changing the freelist pointer to be stored in the middle of the struct, the
>> freelist pointer overwrote the context, causing dc_state to become garbage
>> data and made the call to dm_enable_per_frame_crtc_master_sync dereference
>> a freelist pointer.
>>
>> This patch fixes the aforementioned issue by calling drm_atomic_state_get
>> in amdgpu_dm_atomic_commit before drm_atomic_helper_commit is called and
>> drm_atomic_state_put after amdgpu_dm_atomic_commit_tail is complete.
>>
>> According to my testing on 5.8.0-rc6, this should fix bug 207383 on
>> Bugzilla [1].
>>
>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=207383
> 
> Nice work tracking this down!
> 
>> Fixes: 3202fa62f ("slub: relocate freelist pointer to middle of object")
> 
> I do, however, object to this Fixes tag. :) The flaw appears to have
> been with amdgpu_dm's reference tracking of "state" in the nonblocking
> case. (How this reference counting is supposed to work correctly, though,
> I'm not sure.) If I look at where the drm helper was split from being
> the default callback, it looks like this was what introduced the bug:
> 
> da5c47f682ab ("drm/amd/display: Remove acrtc->stream")
> 
> ? 3202fa62f certainly exposed it much more quickly, but there was a race
> even without 3202fa62f where something could have realloced the memory
> and written over it.
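
For anyone following along, the get/put ordering described in the
quoted patch can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the amdgpu_dm code: every
demo_* name is hypothetical, the "deferred" tail is simply called after
the caller's put instead of running on a workqueue, and the object is
flagged as freed rather than actually freed so the sketch itself stays
well-defined.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted atomic state (not the DRM struct). */
struct demo_state {
	int refcount;	/* plain int: this sketch is single-threaded */
	int freed;	/* set instead of freeing, so the demo can inspect it */
	int context;	/* stands in for the data the commit tail reads */
};

static void demo_state_get(struct demo_state *s)
{
	s->refcount++;
}

static void demo_state_put(struct demo_state *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->freed = 1;	/* a real implementation would free here */
}

/* Deferred work: returns -1 if the state was released before it ran. */
static int demo_commit_tail(struct demo_state *s)
{
	if (s->freed)
		return -1;	/* in the kernel: a use-after-free */
	return s->context;
}

/*
 * Simulate a nonblocking commit in which the caller drops its reference
 * before the deferred tail runs. With hold_ref set, the commit path takes
 * its own reference first and drops it only after the tail has finished.
 */
int demo_nonblocking_commit(int hold_ref)
{
	struct demo_state s = { .refcount = 1, .freed = 0, .context = 42 };
	int result;

	if (hold_ref)
		demo_state_get(&s);	/* reference owned by the deferred work */

	demo_state_put(&s);		/* caller's put, ahead of the tail */

	result = demo_commit_tail(&s);	/* deferred work runs afterwards */

	if (hold_ref)
		demo_state_put(&s);	/* released only once the tail is done */

	return result;
}
```

In this sketch demo_nonblocking_commit(0) returns -1, because the tail
sees an already released state, while demo_nonblocking_commit(1)
returns 42, because the extra reference keeps the state alive until the
tail has finished.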

I understand the Fixes tag mainly as a help when backporting commits.

As Linux 5.8-rc7 is going to be released this Sunday, I wonder whether
commit 3202fa62f ("slub: relocate freelist pointer to middle of object")
should be reverted for now, to fix the regression for users in line with
Linux’s no-regression policy. Once the AMDGPU/DRM driver issue is
fixed, it can be reapplied. I know it’s not optimal, but as the fix will
need some testing, I’d argue that the revert is the best option for
the users.


Kind regards,

Paul

Thread overview:
2020-07-23 21:10 [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free Mazin Rezk
2020-07-23 22:16 ` Kazlauskas, Nicholas
2020-07-23 22:57   ` Mazin Rezk
2020-07-24 21:09     ` Mazin Rezk
2020-07-23 22:32 ` Kees Cook
2020-07-23 22:58   ` Mazin Rezk
2020-07-24  7:26     ` Christian König
2020-07-24  7:45   ` Paul Menzel [this message]
2020-07-24 17:33     ` Kees Cook
2020-07-24 21:19       ` Paul Menzel
2020-07-25  3:03         ` Mazin Rezk
2020-07-25  4:59           ` Duncan
2020-07-25  5:20             ` Mazin Rezk
2020-07-28  9:22               ` Paul Menzel
2020-07-28 17:07                 ` Kazlauskas, Nicholas
2020-07-28 21:58                   ` daniel
