On Mon, Jun 7, 2021 at 2:37 PM Harry Wentland wrote: > On 2021-06-07 2:19 p.m., Sean Paul wrote: > > On Tue, May 18, 2021 at 2:58 PM Rodrigo Siqueira > > wrote: > >> > >> On 05/14, Mark Yacoub wrote: > >>> On Fri, May 14, 2021 at 12:31 PM Mark Yacoub > wrote: > >>>> > >>>> On Fri, May 14, 2021 at 11:28 AM Harry Wentland < > harry.wentland@amd.com> wrote: > >>>>> > >>>>> On 2021-05-14 7:47 a.m., Rodrigo Siqueira wrote: > >>>>>> A few weeks ago, we saw a two cursor issue in a ChromeOS system. We > >>>>>> fixed it in the commit: > >>>>>> > >>>>>> drm/amd/display: Fix two cursor duplication when using overlay > >>>>>> (read the commit message for more details) > >>>>>> > >>>>>> After this change, we noticed that some IGT subtests related to > >>>>>> kms_plane and kms_plane_scaling started to fail. After investigating > >>>>>> this issue, we noticed that all subtests that fail have a primary > plane > >>>>>> covering the overlay plane, which is currently rejected by amdgpu > dm. > >>>>>> Fail those IGT tests highlight that our verification was too broad > and > >>>>>> compromises the overlay usage in our drive. This patch fixes this > issue > >>> nit: s/drive/driver? > >>>>>> by ensuring that we only reject commits where the primary plane is > not > >>>>>> fully covered by the overlay when the cursor hardware is enabled. > With > >>>>>> this fix, all IGT tests start to pass again, which means our overlay > >>>>>> support works as expected. > >>>>>> > >>>>>> Cc: Tianci.Yin > >>>>>> Cc: Harry Wentland > >>>>>> Cc: Nicholas Choi > >>>>>> Cc: Bhawanpreet Lakha > >>>>>> Cc: Nicholas Kazlauskas > >>>>>> Cc: Mark Yacoub > >>>>>> Cc: Daniel Wheeler > >>>>>> > >>>>>> Signed-off-by: Rodrigo Siqueira > >>>>>> --- > >>>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +++++++++- > >>>>>> 1 file changed, 9 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>>>> index ccd67003b120..9c2537a17a7b 100644 > >>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>>>> @@ -10067,7 +10067,7 @@ static int validate_overlay(struct > drm_atomic_state *state) > >>>>>> int i; > >>>>>> struct drm_plane *plane; > >>>>>> struct drm_plane_state *old_plane_state, *new_plane_state; > >>>>>> - struct drm_plane_state *primary_state, *overlay_state = NULL; > >>>>>> + struct drm_plane_state *primary_state, *cursor_state, > *overlay_state = NULL; > >>>>>> > >>>>>> /* Check if primary plane is contained inside overlay */ > >>>>>> for_each_oldnew_plane_in_state_reverse(state, plane, > old_plane_state, new_plane_state, i) { > >>>>>> @@ -10097,6 +10097,14 @@ static int validate_overlay(struct > drm_atomic_state *state) > >>>>>> if (!primary_state->crtc) > >>>>>> return 0; > >>>>>> > >>>>>> + /* check if cursor plane is enabled */ > >>>>>> + cursor_state = drm_atomic_get_plane_state(state, > overlay_state->crtc->cursor); > >>>>>> + if (IS_ERR(cursor_state)) > >>>>>> + return PTR_ERR(cursor_state); > >>>>>> + > >>>>>> + if (drm_atomic_plane_disabling(plane->state, cursor_state)) > >>>>>> + return 0; > >>>>>> + > >>>>> > >>>>> I thought this breaks Chrome's compositor since it can't handle an > atomic commit rejection > >>>>> based solely on whether a cursor is enabled or not. > >>> For context: To use overlays (the old/current async way), Chrome tests > >>> if an overlay strategy will work by doing an atomic commit with a TEST > >>> flag to check if the combination would work. If it works, it flags > >>> this planes configuration as valid. As it's valid, it performs an > >>> atomic page flip (which could also include the cursor). > >>> So to Harry's point, you would pass an atomic test but fail on an > >>> atomic page flip with the exact same configuration that's been flagged > >>> as valid. Failing a pageflip causes Chrome to crash (Reset the GPU > >>> process cause something went horribly wrong when it shouldn't). > >> > >> Hi Mark and Sean, > >> > >> I don't know if I fully comprehended the scenario which in my patch > >> might cause a ChromeOS crash, but this is what I understood: > >> > > > > Chrome compositor only does an atomic test when the layout geometry > > changes (ie: plane is added/removed/resized/moved). This does not take > > into consideration the cursor. Once the atomic test has been validated > > for a given layout geometry (set of planes/fbs along with their sizes > > and locations), Chrome assumes this will continue to be valid. > > > > So the situation I'm envisioning is if the cursor is hidden, an > > overlay-based strategy will pass in the kernel. If at any time the > > cursor becomes visible, the kernel will start failing commits which > > causes Chrome's GPU process to crash. > > > > In general I'm uneasy with using the cursor in the atomic check since > > it feels like it could be racy independent of the issue above. I > > haven't dove into the helper code enough to prove it, just a > > spidey-sense. > > > > Isn't the cursor supposed to be just another plane? If so, shouldn't > adding/removing the cursor trigger an atomic test? > > Chrome is using legacy cursor. Yes it will trigger an atomic test in the kernel, and that atomic test will fail. Unfortunately Chrome does not expect a failure so it will crash. Sean Is Chrome's compositor doing cursor update through legacy IOCTLs or > through the ATOMIC IOCTL? > > Thanks, > Harry > > > Sean > > > >> There is a chance that we pass the atomic check > >> (DRM_MODE_ATOMIC_TEST_ONLY - TEST) but fails in the atomic page flip > >> because, after use TEST, the compositor might slightly change the commit > >> config by adding the cursor. The reason behind that came from the > >> assumption that adds the cursor in the atomic commit config after the > >> test is harmless. Is my understand correct? Please, correct me if I'm > >> wrong. > >> > >> If the above reasoning is correct, I think the compositor should not > >> assume anything extra from what it got from the atomic check. > >> > >> Best Regards, > >> Siqueira > >> > >>>>> > >>>>> Harry > >>>>> > >>>>>> /* Perform the bounds check to ensure the overlay plane > covers the primary */ > >>>>>> if (primary_state->crtc_x < overlay_state->crtc_x || > >>>>>> primary_state->crtc_y < overlay_state->crtc_y || > >>>>>> > >>>>> > >> > >> -- > >> Rodrigo Siqueira > >> https://siqueira.tech/> >