All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] drm/amdgpu: fix framebuffer memory use after free
@ 2021-06-17  8:18 Lukasz Bartosik
  2021-06-17 14:18 ` Michel Dänzer
  0 siblings, 1 reply; 4+ messages in thread
From: Lukasz Bartosik @ 2021-06-17  8:18 UTC (permalink / raw)
  To: Alex Deucher, Christian Koenig; +Cc: upstream, amd-gfx, Bas Nieuwenhuizen

With option CONFIG_DEBUG_LIST enabled the kernel logs show list_add
corruption warning. The warning originates from drm_framebuffer_init()
function which adds framebuffer to a framebuffers list and is called by
amdgpu_display_gem_fb_verify_and_init().
If amdgpu_display_gem_fb_verify_and_init() encounters an error after
calling drm_framebuffer_init() then framebuffer memory is released
in amdgpu_display_user_framebuffer_create() without removing framebuffer
from the list where it was added. Reuse of that memory by any other
party cause corruption of the framebuffers linked list. This fix removes
framebuffer from the linked list and unregisters it in case of failure.

[   23.252465] ------------[ cut here ]------------
[   23.252469] list_add corruption. next->prev should be prev (ffff9921c16203a8), but was 733a656c69665f6d. (next=ffff9920debec508).
[   23.252506] WARNING: CPU: 1 PID: 1637 at lib/list_debug.c:25 __list_add_valid+0x56/0x8f
[   23.252520] Modules linked in: xt_cgroup rfcomm cmac algif_hash algif_skcipher af_alg uinput xt_MASQUERADE uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common btusb btrtl btintel btbcm bluetooth ecdh_generic ecc iio_trig_sysfs snd_hda_codec_hdmi designware_i2s snd_hda_intel snd_intel_dspcfg i2c_piix4 snd_hda_codec snd_hwdep snd_hda_core acpi_als industrialio_triggered_buffer kfifo_buf industrialio snd_soc_acp_rn_rt5682_mach snd_soc_max98357a snd_soc_adau7002 snd_soc_acp_rt5682_mach snd_soc_acp_da7219mx98357_mach acp_audio_dma snd_soc_da7219 ip6table_nat fuse ath10k_pci ath10k_core ath mac80211 cfg80211 r8152 mii joydev
[   23.252595] CPU: 1 PID: 1637 Comm: DrmThread Not tainted 5.13.0-rc6 #22
[   23.252603] Hardware name: HP Grunt/Grunt, BIOS Google_Grunt.11031.162.0 04/08/2021
[   23.252608] RIP: 0010:__list_add_valid+0x56/0x8f
[   23.252616] Code: 47 4c 39 fb 0f 95 c1 4c 39 f3 0f 95 c0 20 c8 5b 41 5e 41 5f 5d c3 48 c7 c7 f4 37 9f bc 4c 89 f6 4c 89 f9 31 c0 e8 0c c7 c5 ff <0f> 0b eb 16 48 c7 c7 c2 f6 99 bc 4c 89 fe 4c 89 f1 31 c0 e8 f4 c6
[   23.252622] RSP: 0018:fffface940c87ba0 EFLAGS: 00010246
[   23.252629] RAX: 8ae6d9bca9cb6a00 RBX: ffff9920debec100 RCX: 8ae6d9bca9cb6a00
[   23.252634] RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9921ead12e48
[   23.252638] RBP: fffface940c87bb8 R08: 0000000000000000 R09: fffface940c87938
[   23.252643] R10: 00000000ffffdfff R11: ffffffffbcc821f0 R12: 0000000000000000
[   23.252647] R13: ffff9920debec508 R14: ffff9921c16203a8 R15: ffff9920debec508
[   23.252652] FS:  00007e6190717640(0000) GS:ffff9921ead00000(0000) knlGS:0000000000000000
[   23.252658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.252662] CR2: 00002e7a09404000 CR3: 0000000120644000 CR4: 00000000001506e0
[   23.252667] Call Trace:
[   23.252676]  drm_framebuffer_init+0xfb/0x13c
[   23.252685]  amdgpu_display_gem_fb_verify_and_init+0x4b/0x10a
[   23.252693]  ? kmem_cache_alloc_trace+0x104/0x1dc
[   23.252703]  amdgpu_display_user_framebuffer_create+0xe0/0x195
[   23.252709]  drm_internal_framebuffer_create+0x2fd/0x3d4
[   23.252718]  drm_mode_addfb2+0x39/0xd6
[   23.252724]  ? drm_internal_framebuffer_create+0x3d4/0x3d4
[   23.252731]  drm_ioctl_kernel+0x99/0xfb
[   23.252739]  drm_ioctl+0x25a/0x3a4
[   23.252745]  ? drm_internal_framebuffer_create+0x3d4/0x3d4
[   23.252753]  amdgpu_drm_ioctl+0x49/0x7d
[   23.252760]  __se_sys_ioctl+0x7c/0xb8
[   23.252767]  do_syscall_64+0x5f/0xa9
[   23.252774]  ? exit_to_user_mode_prepare+0x68/0x81
[   23.252781]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   23.252790] RIP: 0033:0x7e619b2aff07
[   23.252796] Code: 3c 1c 48 f7 d8 49 39 c4 72 b2 e8 14 ff ff ff 85 c0 78 b7 48 83 c4 08 4c 89 e0 5b 41 5c 41 5d 5d c3 66 90 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 29 af 0b 00 f7 d8 64 89 01 48
[   23.252801] RSP: 002b:00007e6190715e48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   23.252808] RAX: ffffffffffffffda RBX: 4120acd8347d3800 RCX: 00007e619b2aff07
[   23.252812] RDX: 00007e6190715e80 RSI: 00000000c06864b8 RDI: 000000000000001d
[   23.252817] RBP: 00007e6190715e70 R08: 00007e61907160e8 R09: 00007e61907160f8
[   23.252821] R10: 00007e6190716108 R11: 0000000000000246 R12: 000000000000001d
[   23.252825] R13: 0000000000000168 R14: 00007e6190715e80 R15: 00000000c06864b8
[   23.252830] ---[ end trace 34051e69065d2c6d ]---

Fixes: 6eed95b00b45 ("drm/amd/display: Store tiling_flags in the framebuffer.")
Signed-off-by: Lukasz Bartosik <lb@semihalf.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index c13985fb35be..933190281b91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1085,14 +1085,17 @@ int amdgpu_display_gem_fb_verify_and_init(
 			    mode_cmd->modifier[0]);
 
 		ret = -EINVAL;
-		goto err;
+		goto err_fb_cleanup;
 	}
 
 	ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
 	if (ret)
-		goto err;
+		goto err_fb_cleanup;
 
 	return 0;
+err_fb_cleanup:
+	drm_framebuffer_unregister_private(&rfb->base);
+	drm_framebuffer_cleanup(&rfb->base);
 err:
 	drm_dbg_kms(dev, "Failed to verify and init gem fb: %d\n", ret);
 	rfb->base.obj[0] = NULL;
-- 
2.32.0.272.g935e593368-goog

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] drm/amdgpu: fix framebuffer memory use after free
  2021-06-17  8:18 [PATCH v1] drm/amdgpu: fix framebuffer memory use after free Lukasz Bartosik
@ 2021-06-17 14:18 ` Michel Dänzer
  2021-06-18  8:46   ` Łukasz Bartosik
  0 siblings, 1 reply; 4+ messages in thread
From: Michel Dänzer @ 2021-06-17 14:18 UTC (permalink / raw)
  To: Lukasz Bartosik, Alex Deucher, Christian Koenig
  Cc: upstream, amd-gfx, Bas Nieuwenhuizen

On 2021-06-17 10:18 a.m., Lukasz Bartosik wrote:
> With option CONFIG_DEBUG_LIST enabled the kernel logs show list_add
> corruption warning. The warning originates from drm_framebuffer_init()
> function which adds framebuffer to a framebuffers list and is called by
> amdgpu_display_gem_fb_verify_and_init().
> If amdgpu_display_gem_fb_verify_and_init() encounters an error after
> calling drm_framebuffer_init() then framebuffer memory is released
> in amdgpu_display_user_framebuffer_create() without removing framebuffer
> from the list where it was added. Reuse of that memory by any other
> party cause corruption of the framebuffers linked list. This fix removes
> framebuffer from the linked list and unregisters it in case of failure.
> 
> [...]
> 
> Fixes: 6eed95b00b45 ("drm/amd/display: Store tiling_flags in the framebuffer.")

I didn't realize there was already an issue before f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init.". Looking at 
the Git history again, I agree there's already at least a theoretical issue in 5.11, though I suspect it's harder to hit in practice.


> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index c13985fb35be..933190281b91 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -1085,14 +1085,17 @@ int amdgpu_display_gem_fb_verify_and_init(
>  			    mode_cmd->modifier[0]);
>  
>  		ret = -EINVAL;
> -		goto err;
> +		goto err_fb_cleanup;
>  	}
>  
>  	ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
>  	if (ret)
> -		goto err;
> +		goto err_fb_cleanup;
>  
>  	return 0;
> +err_fb_cleanup:
> +	drm_framebuffer_unregister_private(&rfb->base);
> +	drm_framebuffer_cleanup(&rfb->base);
>  err:
>  	drm_dbg_kms(dev, "Failed to verify and init gem fb: %d\n", ret);
>  	rfb->base.obj[0] = NULL;
> 

There's a similar issue in amdgpu_display_gem_fb_init. https://patchwork.freedesktop.org/patch/439542/ fixes that as well, and seems simpler (though I'm biased obviously :).


Neither patch can be trivially cherry picked for fixing the issue in 5.11/5.12 due to f258907fdd835e.


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] drm/amdgpu: fix framebuffer memory use after free
  2021-06-17 14:18 ` Michel Dänzer
@ 2021-06-18  8:46   ` Łukasz Bartosik
  2021-06-18 16:50     ` Michel Dänzer
  0 siblings, 1 reply; 4+ messages in thread
From: Łukasz Bartosik @ 2021-06-18  8:46 UTC (permalink / raw)
  To: Michel Dänzer, Alex Deucher, Christian Koenig
  Cc: upstream, amd-gfx, Bas Nieuwenhuizen

czw., 17 cze 2021 o 16:18 Michel Dänzer <michel@daenzer.net> napisał(a):
>
> On 2021-06-17 10:18 a.m., Lukasz Bartosik wrote:
> > With option CONFIG_DEBUG_LIST enabled the kernel logs show list_add
> > corruption warning. The warning originates from drm_framebuffer_init()
> > function which adds framebuffer to a framebuffers list and is called by
> > amdgpu_display_gem_fb_verify_and_init().
> > If amdgpu_display_gem_fb_verify_and_init() encounters an error after
> > calling drm_framebuffer_init() then framebuffer memory is released
> > in amdgpu_display_user_framebuffer_create() without removing framebuffer
> > from the list where it was added. Reuse of that memory by any other
> > party cause corruption of the framebuffers linked list. This fix removes
> > framebuffer from the linked list and unregisters it in case of failure.
> >
> > [...]
> >
> > Fixes: 6eed95b00b45 ("drm/amd/display: Store tiling_flags in the framebuffer.")
>
> I didn't realize there was already an issue before f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init.". Looking at
> the Git history again, I agree there's already at least a theoretical issue in 5.11, though I suspect it's harder to hit in practice.
>
>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > index c13985fb35be..933190281b91 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > @@ -1085,14 +1085,17 @@ int amdgpu_display_gem_fb_verify_and_init(
> >                           mode_cmd->modifier[0]);
> >
> >               ret = -EINVAL;
> > -             goto err;
> > +             goto err_fb_cleanup;
> >       }
> >
> >       ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
> >       if (ret)
> > -             goto err;
> > +             goto err_fb_cleanup;
> >
> >       return 0;
> > +err_fb_cleanup:
> > +     drm_framebuffer_unregister_private(&rfb->base);
> > +     drm_framebuffer_cleanup(&rfb->base);
> >  err:
> >       drm_dbg_kms(dev, "Failed to verify and init gem fb: %d\n", ret);
> >       rfb->base.obj[0] = NULL;
> >
>
> There's a similar issue in amdgpu_display_gem_fb_init. https://patchwork.freedesktop.org/patch/439542/ fixes that as well, and seems simpler (though I'm biased obviously :).

I agree your patch is simpler and covers more cases, but IMHO my
approach with explicit framebuffer cleanup has the advantage
that it will be hard to miss in case of future code reorganizations in
that area.

>
>
> Neither patch can be trivially cherry picked for fixing the issue in 5.11/5.12 due to f258907fdd835e.
>
>
> --
> Earthling Michel Dänzer               |               https://redhat.com
> Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] drm/amdgpu: fix framebuffer memory use after free
  2021-06-18  8:46   ` Łukasz Bartosik
@ 2021-06-18 16:50     ` Michel Dänzer
  0 siblings, 0 replies; 4+ messages in thread
From: Michel Dänzer @ 2021-06-18 16:50 UTC (permalink / raw)
  To: Łukasz Bartosik, Alex Deucher, Christian Koenig
  Cc: upstream, amd-gfx, Bas Nieuwenhuizen

On 2021-06-18 10:46 a.m., Łukasz Bartosik wrote:
> czw., 17 cze 2021 o 16:18 Michel Dänzer <michel@daenzer.net> napisał(a):
>>
>> On 2021-06-17 10:18 a.m., Lukasz Bartosik wrote:
>>> With option CONFIG_DEBUG_LIST enabled the kernel logs show list_add
>>> corruption warning. The warning originates from drm_framebuffer_init()
>>> function which adds framebuffer to a framebuffers list and is called by
>>> amdgpu_display_gem_fb_verify_and_init().
>>> If amdgpu_display_gem_fb_verify_and_init() encounters an error after
>>> calling drm_framebuffer_init() then framebuffer memory is released
>>> in amdgpu_display_user_framebuffer_create() without removing framebuffer
>>> from the list where it was added. Reuse of that memory by any other
>>> party cause corruption of the framebuffers linked list. This fix removes
>>> framebuffer from the linked list and unregisters it in case of failure.
>>>
>>> [...]
>>>
>>> Fixes: 6eed95b00b45 ("drm/amd/display: Store tiling_flags in the framebuffer.")
>>
>> I didn't realize there was already an issue before f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init.". Looking at
>> the Git history again, I agree there's already at least a theoretical issue in 5.11, though I suspect it's harder to hit in practice.
>>
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>>> index c13985fb35be..933190281b91 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>>> @@ -1085,14 +1085,17 @@ int amdgpu_display_gem_fb_verify_and_init(
>>>                           mode_cmd->modifier[0]);
>>>
>>>               ret = -EINVAL;
>>> -             goto err;
>>> +             goto err_fb_cleanup;
>>>       }
>>>
>>>       ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
>>>       if (ret)
>>> -             goto err;
>>> +             goto err_fb_cleanup;
>>>
>>>       return 0;
>>> +err_fb_cleanup:
>>> +     drm_framebuffer_unregister_private(&rfb->base);
>>> +     drm_framebuffer_cleanup(&rfb->base);
>>>  err:
>>>       drm_dbg_kms(dev, "Failed to verify and init gem fb: %d\n", ret);
>>>       rfb->base.obj[0] = NULL;
>>>
>>
>> There's a similar issue in amdgpu_display_gem_fb_init. https://patchwork.freedesktop.org/patch/439542/ fixes that as well, and seems simpler (though I'm biased obviously :).
> 
> I agree your patch is simpler and covers more cases, but IMHO my
> approach with explicit framebuffer cleanup has the advantage
> that it will be hard to miss in case of future code reorganizations in
> that area.

Fair enough.

FWIW, I went the "call drm_framebuffer_init last" route because it matches what all other drivers do AFAICT.


-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-06-21  7:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-17  8:18 [PATCH v1] drm/amdgpu: fix framebuffer memory use after free Lukasz Bartosik
2021-06-17 14:18 ` Michel Dänzer
2021-06-18  8:46   ` Łukasz Bartosik
2021-06-18 16:50     ` Michel Dänzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.