All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kazlauskas, Nicholas" <nicholas.kazlauskas@amd.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>,
	Mazin Rezk <mnrzk@protonmail.com>, Duncan <1i5t5.duncan@cox.net>
Cc: "Kees Cook" <keescook@chromium.org>,
	linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Harry Wentland" <Harry.Wentland@amd.com>,
	sunpeng.li@amd.com,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	mphantomx@yahoo.com.br, regressions@leemhuis.info,
	anthony.ruhier@gmail.com
Subject: Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free
Date: Tue, 28 Jul 2020 13:07:13 -0400	[thread overview]
Message-ID: <0edb1498-6c43-27cc-b2fb-71ea5ca1a56c@amd.com> (raw)
In-Reply-To: <0b0fbe35-75cf-ec90-7c3d-bdcedbe217b7@molgen.mpg.de>

On 2020-07-28 5:22 a.m., Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> Am 25.07.20 um 07:20 schrieb Mazin Rezk:
>> On Saturday, July 25, 2020 12:59 AM, Duncan wrote:
>>
>>> On Sat, 25 Jul 2020 03:03:52 +0000 Mazin Rezk wrote:
>>>
>>>>> Am 24.07.20 um 19:33 schrieb Kees Cook:
>>>>>
>>>>>> There was a fix to disable the async path for this driver that
>>>>>> worked around the bug too, yes? That seems like a safer and more
>>>>>> focused change that doesn't revert the SLUB defense for all
>>>>>> users, and would actually provide a complete, I think, workaround
>>>>
>>>> That said, I haven't seen the async disabling patch. If you could
>>>> link to it, I'd be glad to test it out and perhaps we can use that
>>>> instead.
>>>
>>> I'm confused. Not to put words in Kees' mouth; /I/ am confused (which
>>> admittedly could well be just because I make no claims to be a
>>> coder and am simply reading the bug and thread, but I'd appreciate some
>>> "unconfusing" anyway).
>>>
>>> My interpretation of the "async disabling" reference was that it was to
>>> comment #30 on the bug:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=207383#c30 
>>>
>>>
>>> ... which (if I'm not confused on this point too) appears to be yours.
>>> There it was stated...
>>>
>>> I've also found that this bug exclusively occurs when commit_work is on
>>> the workqueue. After forcing drm_atomic_helper_commit to run all of the
>>> commits without adding to the workqueue and running the OS, the issue
>>> seems to have disappeared.
>>> <<<<
>>>
>>> Would not forcing all commits to run directly, without placing them on
>>> the workqueue, be "async disabling"? That's what I /thought/ he was
>>> referencing.
>>
>> Oh, I thought he was referring to a different patch. Kees, could I get
>> your confirmation on this?
>>
>> The change I made actually affected all of the DRM code, although this 
>> could
>> easily be changed to be specific to amdgpu. (By forcing blocking on
>> amdgpu_dm's non-blocking commit code)
>>
>> That said, I'd still need to test further because I only did test it 
>> for a
>> couple of hours then. Although it should work in theory.
>>
>>> OTOH your base/context swap idea sounds like a possibly "less
>>> disturbance" workaround, if it works, and given the point in the
>>> commit cycle... (But if it's out Sunday it's likely too late to test
>>> and get it in now anyway; if it's another week, tho...)
>>
>> The base/context swap idea should make the use-after-free behave how it
>> did in 5.6. Since the bug doesn't cause an issue in 5.6, it's less of a
>> "less disturbance" workaround and more of a "no disturbance" workaround.
> 
> Sorry for bothering, but is there now a solution, besides reverting the 
> commits, to avoid freezes/crashes *without* performance regressions?
> 
> 
> Kind regards,
> 
> Paul

Mazin's "drm/amd/display: Clear dm_state for fast updates" change 
accomplishes this, at least as a temporary hack.

I've started work on a more large scale fix that we could get in in after.

Regards,
Nicholas Kazlauskas

WARNING: multiple messages have this Message-ID (diff)
From: "Kazlauskas, Nicholas" <nicholas.kazlauskas@amd.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>,
	Mazin Rezk <mnrzk@protonmail.com>, Duncan <1i5t5.duncan@cox.net>
Cc: anthony.ruhier@gmail.com, "Kees Cook" <keescook@chromium.org>,
	sunpeng.li@amd.com, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, regressions@leemhuis.info,
	amd-gfx@lists.freedesktop.org,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	mphantomx@yahoo.com.br,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free
Date: Tue, 28 Jul 2020 13:07:13 -0400	[thread overview]
Message-ID: <0edb1498-6c43-27cc-b2fb-71ea5ca1a56c@amd.com> (raw)
In-Reply-To: <0b0fbe35-75cf-ec90-7c3d-bdcedbe217b7@molgen.mpg.de>

On 2020-07-28 5:22 a.m., Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> Am 25.07.20 um 07:20 schrieb Mazin Rezk:
>> On Saturday, July 25, 2020 12:59 AM, Duncan wrote:
>>
>>> On Sat, 25 Jul 2020 03:03:52 +0000 Mazin Rezk wrote:
>>>
>>>>> Am 24.07.20 um 19:33 schrieb Kees Cook:
>>>>>
>>>>>> There was a fix to disable the async path for this driver that
>>>>>> worked around the bug too, yes? That seems like a safer and more
>>>>>> focused change that doesn't revert the SLUB defense for all
>>>>>> users, and would actually provide a complete, I think, workaround
>>>>
>>>> That said, I haven't seen the async disabling patch. If you could
>>>> link to it, I'd be glad to test it out and perhaps we can use that
>>>> instead.
>>>
>>> I'm confused. Not to put words in Kees' mouth; /I/ am confused (which
>>> admittedly could well be just because I make no claims to be a
>>> coder and am simply reading the bug and thread, but I'd appreciate some
>>> "unconfusing" anyway).
>>>
>>> My interpretation of the "async disabling" reference was that it was to
>>> comment #30 on the bug:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=207383#c30 
>>>
>>>
>>> ... which (if I'm not confused on this point too) appears to be yours.
>>> There it was stated...
>>>
>>> I've also found that this bug exclusively occurs when commit_work is on
>>> the workqueue. After forcing drm_atomic_helper_commit to run all of the
>>> commits without adding to the workqueue and running the OS, the issue
>>> seems to have disappeared.
>>> <<<<
>>>
>>> Would not forcing all commits to run directly, without placing them on
>>> the workqueue, be "async disabling"? That's what I /thought/ he was
>>> referencing.
>>
>> Oh, I thought he was referring to a different patch. Kees, could I get
>> your confirmation on this?
>>
>> The change I made actually affected all of the DRM code, although this 
>> could
>> easily be changed to be specific to amdgpu. (By forcing blocking on
>> amdgpu_dm's non-blocking commit code)
>>
>> That said, I'd still need to test further because I only did test it 
>> for a
>> couple of hours then. Although it should work in theory.
>>
>>> OTOH your base/context swap idea sounds like a possibly "less
>>> disturbance" workaround, if it works, and given the point in the
>>> commit cycle... (But if it's out Sunday it's likely too late to test
>>> and get it in now anyway; if it's another week, tho...)
>>
>> The base/context swap idea should make the use-after-free behave how it
>> did in 5.6. Since the bug doesn't cause an issue in 5.6, it's less of a
>> "less disturbance" workaround and more of a "no disturbance" workaround.
> 
> Sorry for bothering, but is there now a solution, besides reverting the 
> commits, to avoid freezes/crashes *without* performance regressions?
> 
> 
> Kind regards,
> 
> Paul

Mazin's "drm/amd/display: Clear dm_state for fast updates" change 
accomplishes this, at least as a temporary hack.

I've started work on a more large scale fix that we could get in in after.

Regards,
Nicholas Kazlauskas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Kazlauskas, Nicholas" <nicholas.kazlauskas@amd.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>,
	Mazin Rezk <mnrzk@protonmail.com>, Duncan <1i5t5.duncan@cox.net>
Cc: anthony.ruhier@gmail.com, "Kees Cook" <keescook@chromium.org>,
	sunpeng.li@amd.com, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, regressions@leemhuis.info,
	amd-gfx@lists.freedesktop.org,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	mphantomx@yahoo.com.br, "Harry Wentland" <Harry.Wentland@amd.com>,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free
Date: Tue, 28 Jul 2020 13:07:13 -0400	[thread overview]
Message-ID: <0edb1498-6c43-27cc-b2fb-71ea5ca1a56c@amd.com> (raw)
In-Reply-To: <0b0fbe35-75cf-ec90-7c3d-bdcedbe217b7@molgen.mpg.de>

On 2020-07-28 5:22 a.m., Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> Am 25.07.20 um 07:20 schrieb Mazin Rezk:
>> On Saturday, July 25, 2020 12:59 AM, Duncan wrote:
>>
>>> On Sat, 25 Jul 2020 03:03:52 +0000 Mazin Rezk wrote:
>>>
>>>>> Am 24.07.20 um 19:33 schrieb Kees Cook:
>>>>>
>>>>>> There was a fix to disable the async path for this driver that
>>>>>> worked around the bug too, yes? That seems like a safer and more
>>>>>> focused change that doesn't revert the SLUB defense for all
>>>>>> users, and would actually provide a complete, I think, workaround
>>>>
>>>> That said, I haven't seen the async disabling patch. If you could
>>>> link to it, I'd be glad to test it out and perhaps we can use that
>>>> instead.
>>>
>>> I'm confused. Not to put words in Kees' mouth; /I/ am confused (which
>>> admittedly could well be just because I make no claims to be a
>>> coder and am simply reading the bug and thread, but I'd appreciate some
>>> "unconfusing" anyway).
>>>
>>> My interpretation of the "async disabling" reference was that it was to
>>> comment #30 on the bug:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=207383#c30 
>>>
>>>
>>> ... which (if I'm not confused on this point too) appears to be yours.
>>> There it was stated...
>>>
>>> I've also found that this bug exclusively occurs when commit_work is on
>>> the workqueue. After forcing drm_atomic_helper_commit to run all of the
>>> commits without adding to the workqueue and running the OS, the issue
>>> seems to have disappeared.
>>> <<<<
>>>
>>> Would not forcing all commits to run directly, without placing them on
>>> the workqueue, be "async disabling"? That's what I /thought/ he was
>>> referencing.
>>
>> Oh, I thought he was referring to a different patch. Kees, could I get
>> your confirmation on this?
>>
>> The change I made actually affected all of the DRM code, although this 
>> could
>> easily be changed to be specific to amdgpu. (By forcing blocking on
>> amdgpu_dm's non-blocking commit code)
>>
>> That said, I'd still need to test further because I only did test it 
>> for a
>> couple of hours then. Although it should work in theory.
>>
>>> OTOH your base/context swap idea sounds like a possibly "less
>>> disturbance" workaround, if it works, and given the point in the
>>> commit cycle... (But if it's out Sunday it's likely too late to test
>>> and get it in now anyway; if it's another week, tho...)
>>
>> The base/context swap idea should make the use-after-free behave how it
>> did in 5.6. Since the bug doesn't cause an issue in 5.6, it's less of a
>> "less disturbance" workaround and more of a "no disturbance" workaround.
> 
> Sorry for bothering, but is there now a solution, besides reverting the 
> commits, to avoid freezes/crashes *without* performance regressions?
> 
> 
> Kind regards,
> 
> Paul

Mazin's "drm/amd/display: Clear dm_state for fast updates" change 
accomplishes this, at least as a temporary hack.

I've started work on a more large scale fix that we could get in in after.

Regards,
Nicholas Kazlauskas
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  reply	other threads:[~2020-07-28 17:07 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-23 21:10 [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free Mazin Rezk
2020-07-23 21:10 ` Mazin Rezk
2020-07-23 21:10 ` Mazin Rezk
2020-07-23 22:16 ` Kazlauskas, Nicholas
2020-07-23 22:16   ` Kazlauskas, Nicholas
2020-07-23 22:16   ` Kazlauskas, Nicholas
2020-07-23 22:57   ` Mazin Rezk
2020-07-23 22:57     ` Mazin Rezk
2020-07-23 22:57     ` Mazin Rezk
2020-07-24 21:09     ` Mazin Rezk
2020-07-24 21:09       ` Mazin Rezk
2020-07-24 21:09       ` Mazin Rezk
2020-07-23 22:32 ` Kees Cook
2020-07-23 22:32   ` Kees Cook
2020-07-23 22:32   ` Kees Cook
2020-07-23 22:58   ` Mazin Rezk
2020-07-23 22:58     ` Mazin Rezk
2020-07-23 22:58     ` Mazin Rezk
2020-07-24  7:26     ` Christian König
2020-07-24  7:26       ` Christian König
2020-07-24  7:26       ` Christian König
2020-07-24  7:45   ` Paul Menzel
2020-07-24  7:45     ` Paul Menzel
2020-07-24  7:45     ` Paul Menzel
2020-07-24 17:33     ` Kees Cook
2020-07-24 17:33       ` Kees Cook
2020-07-24 17:33       ` Kees Cook
2020-07-24 21:19       ` Paul Menzel
2020-07-24 21:19         ` Paul Menzel
2020-07-24 21:19         ` Paul Menzel
2020-07-25  3:03         ` Mazin Rezk
2020-07-25  3:03           ` Mazin Rezk
2020-07-25  3:03           ` Mazin Rezk
2020-07-25  4:59           ` Duncan
2020-07-25  4:59             ` Duncan
2020-07-25  4:59             ` Duncan
2020-07-25  5:20             ` Mazin Rezk
2020-07-25  5:20               ` Mazin Rezk
2020-07-25  5:20               ` Mazin Rezk
2020-07-28  9:22               ` Paul Menzel
2020-07-28  9:22                 ` Paul Menzel
2020-07-28  9:22                 ` Paul Menzel
2020-07-28 17:07                 ` Kazlauskas, Nicholas [this message]
2020-07-28 17:07                   ` Kazlauskas, Nicholas
2020-07-28 17:07                   ` Kazlauskas, Nicholas
2020-07-28 21:58                   ` daniel
2020-07-28 21:58                     ` daniel
2020-07-28 21:58                     ` daniel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0edb1498-6c43-27cc-b2fb-71ea5ca1a56c@amd.com \
    --to=nicholas.kazlauskas@amd.com \
    --cc=1i5t5.duncan@cox.net \
    --cc=Alexander.Deucher@amd.com \
    --cc=Harry.Wentland@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=anthony.ruhier@gmail.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mnrzk@protonmail.com \
    --cc=mphantomx@yahoo.com.br \
    --cc=pmenzel@molgen.mpg.de \
    --cc=regressions@leemhuis.info \
    --cc=sunpeng.li@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.