* Exynos vblank timeout issue
From: Martin Jücker @ 2022-05-22  0:02 UTC
  To: dri-devel
  Cc: Joonyoung Shim, David Airlie, Seung-Woo Kim, Krzysztof Kozlowski,
	Kyungmin Park, Martin Jücker

Hello,

I'm trying to get Android 12 up and running on my Galaxy Note 10.1 which
is based on Exynos 4412 with a Mali GPU. For Android 11, I had no issues
with graphics but after upgrading and building Android 12, I'm getting a
vblank wait timeout shortly after starting the device setup, which in
turn leads to my display turning black and SurfaceFlinger hanging. This
can be reliably reproduced after every reboot, so much so that it's
basically always on the exact same step of the setup.

I'm using the following setup:

* 5.10.101 Android Common Kernel with some patches to get
the Note 10.1 up and running
* Exynos FIMD + Lima GPU driver
* android-12.1.0_r5 branch
* drm-hwcomposer main branch
* mesa main branch
* minigbm from the GloDroid project

I tried several older versions of hwc, mesa and minigbm without success;
the problem is the same on all of them. What I found is that if I disable
overlay planes so that only the primary plane is used, it works without
any issues. Adding the second plane is fine as well, but the third one
triggers the problem. For some reason this didn't happen on Android 11,
even though nothing changed on the kernel side: I was using the same
kernel for it.

Unfortunately, the logs do not reveal much. In logcat I can see the
following messages:

E hwc-drm-atomic-state-manager:sync_wait(fd=33) returned: -1 (errno: 62)
E hwc-drm-atomic-state-manager: Failed to commit pset ret=-16
E hwc-drm-atomic-state-manager: Composite failed for pipeline VGA-1
E Fence: waitForever: Throttling EGL Production: fence 431 didn't
signal in 3000 ms
W OpenGLRenderer: dequeueBuffer failed,error = -110;
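
The numeric codes in the logcat lines above are easier to read as symbolic
names. The following is a minimal sketch, assuming the values are standard
Linux errno numbers (Android's negative status codes such as -110 usually
mirror them). The table is hard-coded from the generic Linux errno list so
the result does not depend on the machine running the script; the helper
name `decode` is only illustrative.

```python
# Generic Linux errno values, hard-coded so the output does not depend on
# the platform this script runs on. Only the three codes from the log are
# listed.
LINUX_ERRNO = {
    16: ("EBUSY", "Device or resource busy"),
    62: ("ETIME", "Timer expired"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def decode(code: int) -> str:
    """Return 'NAME (description)' for a positive or negative errno value."""
    name, text = LINUX_ERRNO[abs(code)]
    return f"{name} ({text})"


if __name__ == "__main__":
    # Labels are taken from the logcat excerpt; the mapping is an assumption.
    for label, code in [
        ("sync_wait errno", 62),
        ("commit ret", -16),
        ("dequeueBuffer error", -110),
    ]:
        print(f"{label}: {code} -> {decode(code)}")
```

Read this way, the fence wait timed out (ETIME), the following atomic commit
was rejected as busy (EBUSY), and the renderer then timed out waiting for a
buffer (ETIMEDOUT), which would fit a display pipeline that stopped
completing frames.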


Here is the dmesg output of it:

[   55.106235] [drm:drm_ioctl] comm="composer@2.4-se" pid=266,
dev=0xe200, auth=1, DRM_IOCTL_MODE_ATOMIC
[   55.106264] [drm:drm_atomic_state_init] Allocated atomic state
a07053f4
[   55.106299] [drm:drm_mode_object_get] OBJ ID: 73 (2)
[   55.106313] [drm:drm_atomic_get_plane_state] Added [PLANE:31:plane-0]
ff8c5726 state to a07053f4
[   55.106327] [drm:drm_mode_object_get] OBJ ID: 60 (1)
[   55.106339] [drm:drm_atomic_get_crtc_state] Added [CRTC:54:crtc-0]
33284f8c state to a07053f4
[   55.106367] [drm:drm_atomic_set_fb_for_plane] Set [FB:73] for
[PLANE:31:plane-0] state ff8c5726
[   55.106385] [drm:drm_mode_object_get] OBJ ID: 73 (5)
[   55.106395] [drm:drm_mode_object_put.part.0] OBJ ID: 73 (6)
[   55.106415] [drm:drm_mode_object_put.part.0] OBJ ID: 73 (5)
[   55.106426] [drm:drm_mode_object_put.part.0] OBJ ID: 73 (4)
[   55.106449] [drm:drm_mode_object_get] OBJ ID: 72 (2)
[   55.106462] [drm:drm_atomic_get_plane_state] Added [PLANE:34:plane-1]
b6a2ee22 state to a07053f4
[   55.106479] [drm:drm_atomic_set_fb_for_plane] Set [FB:72] for
[PLANE:34:plane-1] state b6a2ee22
[   55.106490] [drm:drm_mode_object_get] OBJ ID: 72 (5)
[   55.106500] [drm:drm_mode_object_put.part.0] OBJ ID: 72 (6)
[   55.106510] [drm:drm_mode_object_put.part.0] OBJ ID: 72 (5)
[   55.106520] [drm:drm_mode_object_put.part.0] OBJ ID: 72 (4)
[   55.106540] [drm:drm_mode_object_get] OBJ ID: 68 (2)
[   55.106552] [drm:drm_atomic_get_plane_state] Added [PLANE:39:plane-2]
aba4b1f1 state to a07053f4
[   55.106568] [drm:drm_atomic_set_fb_for_plane] Set [FB:61] for
[PLANE:39:plane-2] state aba4b1f1
[   55.106578] [drm:drm_mode_object_get] OBJ ID: 61 (3)
[   55.106588] [drm:drm_mode_object_put.part.0] OBJ ID: 68 (3)
[   55.106599] [drm:drm_mode_object_put.part.0] OBJ ID: 61 (4)
[   55.106608] [drm:drm_mode_object_put.part.0] OBJ ID: 61 (3)
[   55.106628] [drm:drm_mode_object_get] OBJ ID: 66 (2)
[   55.106639] [drm:drm_atomic_get_plane_state] Added [PLANE:44:plane-3]
08e79ee6 state to a07053f4
[   55.106656] [drm:drm_atomic_set_fb_for_plane] Set [FB:66] for
[PLANE:44:plane-3] state 08e79ee6
[   55.106666] [drm:drm_mode_object_get] OBJ ID: 66 (5)
[   55.106677] [drm:drm_mode_object_put.part.0] OBJ ID: 66 (6)
[   55.106687] [drm:drm_mode_object_put.part.0] OBJ ID: 66 (5)
[   55.106697] [drm:drm_mode_object_put.part.0] OBJ ID: 66 (4)
[   55.106739] [drm:drm_atomic_check_only] checking a07053f4
[   55.106769] exynos-drm exynos-drm: [drm:exynos_plane_atomic_check]
plane : offset_x/y(0,0), width/height(4,800)
[   55.106785] exynos-drm exynos-drm: [drm:exynos_plane_atomic_check]
plane : offset_x/y(4,0), width/height(1276,800)
[   55.106798] exynos-drm exynos-drm: [drm:exynos_plane_atomic_check]
plane : offset_x/y(0,0), width/height(1280,800)
[   55.106813] exynos-drm exynos-drm: [drm:exynos_plane_atomic_check]
plane : offset_x/y(0,776), width/height(1280,24)
[   55.106827] [drm:drm_atomic_nonblocking_commit] committing a07053f4
nonblocking
[   55.106905] exynos-drm exynos-drm:
[drm:drm_calc_timestamping_constants] crtc 54: hwmode: htotal 1350,
vtotal 823, vdisplay 800
[   55.106922] exynos-drm exynos-drm:
[drm:drm_calc_timestamping_constants] crtc 54: clock 66663 kHz framedur
16666666 linedur 20251
[   55.106949] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] start
addr = 0x22f013f0, end addr = 0x232e93f0, size = 0x3e8000
[   55.106962] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane]
ovl_width = 4, ovl_height = 800
[   55.106977] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
pos: tx = 0, ty = 0, bx = 3, by = 799
[   55.106989] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
size = 0xc80
[   55.107005] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] start
addr = 0x21300000, end addr = 0x216e8000, size = 0x3e8000
[   55.107017] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane]
ovl_width = 1276, ovl_height = 800
[   55.107031] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
pos: tx = 4, ty = 0, bx = 1279, by = 799
[   55.107043] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
size = 0xf9380
[   55.107061] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] start
addr = 0x20500000, end addr = 0x208e8000, size = 0x3e8000
[   55.107073] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane]
ovl_width = 1280, ovl_height = 800
[   55.107086] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
pos: tx = 0, ty = 0, bx = 1279, by = 799
[   55.107097] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
size = 0xfa000
[   55.107115] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] start
addr = 0x22800000, end addr = 0x2281e000, size = 0x1e000
[   55.107127] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane]
ovl_width = 1280, ovl_height = 24
[   55.107140] exynos4-fb 11c00000.fimd: [drm:fimd_update_plane] osd
pos: tx = 0, ty = 776, bx = 1279, by = 799
[   55.107168] exynos-drm exynos-drm: [drm:drm_update_vblank_count]
updating vblank count on crtc 0: current=3108, diff=0, hw=0 hw_last=0
[   55.172238] [drm:drm_ioctl] comm="RenderThread" pid=1145, dev=0xe281,
auth=1, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD
[   55.215253] ------------[ cut here ]------------
[   55.215285] WARNING: CPU: 1 PID: 115 at
../drivers/gpu/drm/drm_atomic_helper.c:1513
drm_atomic_helper_wait_for_vblanks.part.1+0x2b8/0x2bc
[   55.215294] [CRTC:54:crtc-0] vblank wait timed out
[   55.215299] Modules linked in: s5p_mfc s5p_jpeg v4l2_mem2mem
videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common
rfcomm kheaders hidp hci_uart cpufreq_userspace cpufreq_powersave
cpufreq_conservative btbcm brcmfmac brcmutil bnep bluetooth atmel_mxt_ts
[   55.215402] CPU: 1 PID: 115 Comm: kworker/u8:2 Tainted: G        W
5.10.101+ #15
[   55.215408] Hardware name: Samsung Exynos (Flattened Device Tree)
[   55.215421] Workqueue: events_unbound commit_work
[   55.215430] Backtrace: 
[   55.215449] [<c010d9c4>] (dump_backtrace) from [<c010dd18>]
(show_stack+0x20/0x24)
[   55.215460] r7:60000013 r6:c13c6d8c r5:00000000 r4:c13c6d8c
[   55.215473] [<c010dcf8>] (show_stack) from [<c0cd3e7c>]
(dump_stack_lvl+0x90/0xa4)
[   55.215485] [<c0cd3dec>] (dump_stack_lvl) from [<c0cd3ea8>]
(dump_stack+0x18/0x1c)
[   55.215496] r9:c070cab4 r8:000005e9 r7:00000009 r6:00000000
r5:c102b268 r4:c2165dec
[   55.215511] [<c0cd3e90>] (dump_stack) from [<c0138098>]
(__warn+0x110/0x114)
[   55.215523] [<c0137f88>] (__warn) from [<c0138124>]
(warn_slowpath_fmt+0x88/0xc4)
[   55.215534] r9:00000009 r8:c070cab4 r7:000005e9 r6:c102b268
r5:c102b7dc r4:c2164000
[   55.215550] [<c01380a0>] (warn_slowpath_fmt) from [<c070cab4>]
(drm_atomic_helper_wait_for_vblanks.part.1+0x2b8/0x2bc)
[   55.215561] r9:00000000 r8:00000001 r7:00000000 r6:00000000
r5:00000000 r4:c213d000
[   55.215575] [<c070c7fc>] (drm_atomic_helper_wait_for_vblanks.part.1)
from [<c070e31c>] (drm_atomic_helper_commit_tail_rpm+0x6c/0x7c)
[   55.215586] r10:c13cdc78 r9:00000000 r8:00000000 r7:d4511ee8
r6:0000000c r5:c213c800
[   55.215592] r4:d0e42b00
[   55.215603] [<c070e2b0>] (drm_atomic_helper_commit_tail_rpm) from
[<c070e724>] (commit_tail+0xb8/0x1d4)
[   55.215611] r5:00000000 r4:d0e42b00
[   55.215621] [<c070e66c>] (commit_tail) from [<c070e85c>]
(commit_work+0x1c/0x20)
[   55.215632] r10:c141cfe0 r9:00000000 r8:00000000 r7:c1c0f000
r6:c1c0a400 r5:c1f16080
[   55.215638] r4:d0e42b30
[   55.215650] [<c070e840>] (commit_work) from [<c01583b8>]
(process_one_work+0x1b0/0x594)
[   55.215661] [<c0158208>] (process_one_work) from [<c01587f8>]
(worker_thread+0x5c/0x550)
[   55.215672] r10:c1303d00 r9:00000088 r8:ffffe000 r7:c1c0a418
r6:c1f16094 r5:c1c0a400
[   55.215678] r4:c1f16080
[   55.215692] [<c015879c>] (worker_thread) from [<c0160bc0>]
(kthread+0x198/0x1b0)
[   55.215703] r10:c215be74 r9:00000000 r8:c1f16080 r7:c015879c
r6:c2164000 r5:c1f0d140
[   55.215710] r4:c211e200
[   55.215722] [<c0160a28>] (kthread) from [<c0100148>]
(ret_from_fork+0x14/0x2c)
[   55.215729] Exception stack(0xc2165fb0 to 0xc2165ff8)
[   55.215738] 5fa0:                                     00000000
00000000 00000000 00000000
[   55.215749] 5fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[   55.215758] 5fe0: 00000000 00000000 00000000 00000000 00000013
00000000
[   55.215769] r10:00000000 r9:00000000 r8:00000000 r7:00000000
r6:00000000 r5:c0160a28
[   55.215775] r4:c1f0d140
[   55.215782] ---[ end trace 3478b4dd7c26b793 ]---
[   55.215804] [drm:drm_atomic_state_default_clear] Clearing atomic
state a07053f4
[   55.215820] [drm:drm_mode_object_put.part.0] OBJ ID: 60 (2)
[   55.215833] [drm:drm_mode_object_put.part.0] OBJ ID: 73 (3)
[   55.215844] [drm:drm_mode_object_put.part.0] OBJ ID: 72 (3)
[   55.215855] [drm:drm_mode_object_put.part.0] OBJ ID: 68 (2)
[   55.215865] [drm:drm_mode_object_put.part.0] OBJ ID: 66 (3)
[   55.215876] [drm:__drm_atomic_state_free] Freeing atomic state
a07053f4
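
As a sanity check on the numbers in this log, here is a small sketch that
recomputes the frame and line durations from the hwmode values printed by
drm_calc_timestamping_constants (htotal 1350, vtotal 823, clock 66663 kHz)
and expresses the gap between the last vblank count update (55.107168) and
the warning (55.215253) in frames. The integer division is an assumption
about how the kernel rounds; it reproduces the logged values.

```python
# Values copied from the dmesg output above.
HTOTAL = 1350
VTOTAL = 823
CLOCK_KHZ = 66663

# Timestamps (seconds) of the last drm_update_vblank_count line and of the
# "vblank wait timed out" warning.
T_LAST_VBLANK_UPDATE = 55.107168
T_WARNING = 55.215253


def linedur_ns(htotal: int, clock_khz: int) -> int:
    """Duration of one scanline in nanoseconds (rounded down)."""
    return htotal * 1_000_000 // clock_khz


def framedur_ns(htotal: int, vtotal: int, clock_khz: int) -> int:
    """Duration of one full frame in nanoseconds (rounded down)."""
    return htotal * vtotal * 1_000_000 // clock_khz


def gap_in_frames(t_start: float, t_end: float, frame_ns: int) -> float:
    """Express a time gap in seconds as a number of frame durations."""
    return (t_end - t_start) * 1e9 / frame_ns


if __name__ == "__main__":
    frame = framedur_ns(HTOTAL, VTOTAL, CLOCK_KHZ)
    line = linedur_ns(HTOTAL, CLOCK_KHZ)
    print(f"framedur {frame} ns, linedur {line} ns")
    frames = gap_in_frames(T_LAST_VBLANK_UPDATE, T_WARNING, frame)
    print(f"gap before warning: {frames:.1f} frames")
```

The gap is about 108 ms, or roughly 6.5 frames at 60 Hz, so several vblank
interrupts in a row were missed rather than a single one arriving late.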


If I can provide anything else that would be helpful to analyze this
problem, I'm happy to assist.

Kind Regards
Martin

* Re: Exynos vblank timeout issue
From: Krzysztof Kozlowski @ 2022-05-22  7:45 UTC
  To: Martin Jücker, dri-devel
  Cc: Joonyoung Shim, David Airlie, Seung-Woo Kim, Kyungmin Park

On 22/05/2022 02:02, Martin Jücker wrote:
> Hello,
> 
> I'm trying to get Android 12 up and running on my Galaxy Note 10.1 which
> is based on Exynos 4412 with a Mali GPU. For Android 11, I had no issues
> with graphics but after upgrading and building Android 12, I'm getting a
> vblank wait timeout shortly after starting the device setup, which in
> turn leads to my display turning black and SurfaceFlinger hanging. This
> can be reliably reproduced after every reboot, so much so that it's
> basically always on the exact same step of the setup.
> 
> I'm using the following setup:
> 
> * 5.10.101 Android Common Kernel with some patches to get
> the Note 10.1 up and running

It's an Android kernel, so not upstream. It is perfectly fine to use
downstream kernels, but then you should also take issues to the
downstream folks. I have no clue what Android did to Exynos.

Best regards,
Krzysztof

* Re: Exynos vblank timeout issue
From: Martin Jücker @ 2022-05-22 10:06 UTC
  To: Krzysztof Kozlowski, dri-devel
  Cc: Joonyoung Shim, David Airlie, Seung-Woo Kim, Kyungmin Park,
	Martin Jücker

On Sun, May 22, 2022 at 09:45:51AM +0200, Krzysztof Kozlowski wrote:
> On 22/05/2022 02:02, Martin Jücker wrote:
> > Hello,
> > 
> > I'm trying to get Android 12 up and running on my Galaxy Note 10.1 which
> > is based on Exynos 4412 with a Mali GPU. For Android 11, I had no issues
> > with graphics but after upgrading and building Android 12, I'm getting a
> > vblank wait timeout shortly after starting the device setup, which in
> > turn leads to my display turning black and SurfaceFlinger hanging. This
> > can be reliably reproduced after every reboot, so much so that it's
> > basically always on the exact same step of the setup.
> > 
> > I'm using the following setup:
> > 
> > * 5.10.101 Android Common Kernel with some patches to get
> > the Note 10.1 up and running
> 
> It's an Android kernel, so not upstream. It is perfectly fine to use
> downstream kernels, but then you should also take issues to the
> downstream folks. I have no clue what Android did to Exynos.

Hi Krzysztof,

Indeed, that was my mistake; I should have tested on mainline first.

I rebased some patches on top of v5.17.9 and tried again, with the same
result. There are no Android patches in there, only p4note related
things. You can have a look here:

https://github.com/Viciouss/linux/commits/v5.17.9-android

The behaviour is exactly the same: as soon as I try to advance in the
setup process, the screen suddenly turns all black.

Here is the warning again, just in case there are any differences.

[   77.651495] ------------[ cut here ]------------
[   77.651527] WARNING: CPU: 2 PID: 8 at
../drivers/gpu/drm/drm_atomic_helper.c:1530
drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
[   77.651593] [CRTC:49:crtc-0] vblank wait timed out
[   77.651608] Modules linked in: s5p_mfc s5p_jpeg v4l2_mem2mem
videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common
rfcomm kheaders hidp hci_uart cpufreq_userspace cpufreq_powersave
cpufreq_conservative btbcm brcmfmac brcmutil bnep bluetooth atmel_mxt_ts
[   77.651789] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 5.17.9+ #3
[   77.651813] Hardware name: Samsung Exynos (Flattened Device Tree)
[   77.651828] Workqueue: events_unbound commit_work
[   77.651858] Backtrace: 
[   77.651874] dump_backtrace from show_stack+0x20/0x24
[   77.651915] r7:c071097c r6:00000000 r5:c10ec66c r4:600f0013
[   77.651926] show_stack from dump_stack_lvl+0x48/0x54
[   77.651958] dump_stack_lvl from dump_stack+0x18/0x1c
[   77.651986] r5:c113dcf4 r4:c1d51e04
[   77.651996] dump_stack from __warn+0x18c/0x190
[   77.652030] __warn from warn_slowpath_fmt+0x80/0xbc
[   77.652070] r9:00000009 r8:c071097c r7:000005fa r6:c113dcf4
r5:c1d8cb40 r4:c113e338
[   77.652081] warn_slowpath_fmt from
drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
[   77.652123] r9:00000001 r8:00000000 r7:00000000 r6:00000000
r5:00000000 r4:c398c800
[   77.652135] drm_atomic_helper_wait_for_vblanks.part.1 from
drm_atomic_helper_commit_tail_rpm+0x6c/0x7c
[   77.652175] r10:c14cce68 r9:c1c2a005 r8:00000000 r7:0e3f351d
r6:00000012 r5:c398c000
[   77.652188] r4:d42943c0
[   77.652197] drm_atomic_helper_commit_tail_rpm from
commit_tail+0xb8/0x1d8
[   77.652228] r5:00000000 r4:d42943c0
[   77.652238] commit_tail from commit_work+0x1c/0x20
[   77.652274] r10:c1518d20 r9:c1c2a005 r8:00000000 r7:c1c2a000
r6:c1c0a800 r5:c1c08a00
[   77.652287] r4:d42943ec
[   77.652297] commit_work from process_one_work+0x1b0/0x528
[   77.652324] process_one_work from worker_thread+0x54/0x4d8
[   77.652356] r10:c1c0a800 r9:00000088 r8:c1403d00 r7:c1c0a81c
r6:c1c08a18 r5:c1c0a800
[   77.652368] r4:c1c08a00
[   77.652378] worker_thread from kthread+0x104/0x134
[   77.652419] r10:00000000 r9:c1d43e5c r8:c1d05880 r7:c1d8cb40
r6:c1c08a00 r5:c015530c
[   77.652432] r4:c1d05700
[   77.652441] kthread from ret_from_fork+0x14/0x2c
[   77.652468] Exception stack(0xc1d51fb0 to 0xc1d51ff8)
[   77.652488] 1fa0:                                     00000000
00000000 00000000 00000000
[   77.652509] 1fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[   77.652528] 1fe0: 00000000 00000000 00000000 00000000 00000013
00000000
[   77.652550] r9:00000000 r8:00000000 r7:00000000 r6:00000000
r5:c015da78 r4:c1d05700
[   77.652561] ---[ end trace 0000000000000000 ]---

Kind Regards
Martin

* Re: Exynos vblank timeout issue
From: Martin Jücker @ 2022-05-26 23:34 UTC
  To: Krzysztof Kozlowski, dri-devel
  Cc: David Airlie, Seung-Woo Kim, Kyungmin Park, Martin Jücker

Hello again,

I tried to dig around a bit to unearth some more information. What I'm
seeing is that it works just fine in the beginning: planes are updated a
couple of times and then, after one of the plane updates, the interrupt
handler in the FIMD driver is suddenly no longer called. The screen goes
dark but the device is still operational, e.g. ADB works fine and I can
connect and execute commands.

To figure out what is called when, and to check the state of the
registers, I littered the code with print statements. It looks like
vsync is still active; no other code path disables it. All registers are
as expected, e.g. VIDINTCON0 has the interrupt bit set. I also had a look
at the interrupt combiner: it too has the corresponding lcd0-1 interrupt
enabled at all times, and there is no interrupt pending, even after FIMD
stopped receiving them.

Looking at the wiki at https://exynos.wiki.kernel.org/todo_tasks I found
issue #9. It's about trashed display or DMA freeze if planes are too
narrow and I was wondering if this could be related. So I had a look at
the drm debug output and planes are indeed getting very small. This
happens exactly when the animation that is triggering the issue is
playing, so this would match. Looking a bit closer at the position and
size of the planes, I could see that the last working vsync was right
after one of the planes was exactly 1 pixel in width and vsync only
stopped working one update later. Here are the plane updates from the
logs:

-

Planes getting smaller and smaller with each update:
plane : offset_x/y(0,0), width/height(4,800)
plane : offset_x/y(4,0), width/height(1276,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)

plane : offset_x/y(0,0), width/height(2,800)
plane : offset_x/y(2,0), width/height(1278,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)

plane : offset_x/y(0,0), width/height(1,800)
plane : offset_x/y(1,0), width/height(1279,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)

I still got a vsync between the update above and the next one. But after
the following update, it's dead:
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,0), width/height(1280,24)
plane : offset_x/y(0,740), width/height(1280,60)
plane : offset_x/y(0,0), width/height(1280,800)

-> vsync timeout comes here

-
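
To spot such updates in a longer drm debug log, the plane geometry lines can
be extracted and filtered by width. This is a minimal sketch; the threshold
`MIN_WIDTH` is a hypothetical value chosen only for illustration and is not
a documented FIMD limit.

```python
import re

# Matches the geometry part of the exynos_plane_atomic_check debug lines.
PLANE_RE = re.compile(
    r"offset_x/y\((\d+),(\d+)\), width/height\((\d+),(\d+)\)"
)

# Hypothetical threshold for "too narrow"; not taken from any datasheet.
MIN_WIDTH = 8

# The plane updates quoted in the message above.
LOG = """\
plane : offset_x/y(0,0), width/height(4,800)
plane : offset_x/y(4,0), width/height(1276,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)
plane : offset_x/y(0,0), width/height(2,800)
plane : offset_x/y(2,0), width/height(1278,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)
plane : offset_x/y(0,0), width/height(1,800)
plane : offset_x/y(1,0), width/height(1279,800)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,776), width/height(1280,24)
plane : offset_x/y(0,0), width/height(1280,800)
plane : offset_x/y(0,0), width/height(1280,24)
plane : offset_x/y(0,740), width/height(1280,60)
plane : offset_x/y(0,0), width/height(1280,800)
"""


def parse_planes(log_text):
    """Return a list of (x, y, width, height) tuples found in the log."""
    return [tuple(int(g) for g in m.groups())
            for m in PLANE_RE.finditer(log_text)]


def narrow_planes(planes, min_width=MIN_WIDTH):
    """Return the planes whose width is below min_width."""
    return [p for p in planes if p[2] < min_width]


if __name__ == "__main__":
    for x, y, w, h in narrow_planes(parse_planes(LOG)):
        print(f"narrow plane at ({x},{y}): {w}x{h}")
```

On the excerpt above this flags the 4, 2 and 1 pixel wide planes, which are
the three updates immediately preceding the one after which vsync stops.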

I have no idea how to analyze this further on the kernel side. I'll try
to write an executable that triggers this bug next. If you have any
ideas on that, I'd be very grateful.
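
As a starting point for such a reproducer, the sequence of plane rectangles
seen in the log can be generated programmatically and then fed to whatever
code performs the atomic commits. The sketch below only produces the
geometry and makes no DRM calls; the function names and the halving
schedule (4, 2, 1) are assumptions modelled on the logged updates.

```python
from typing import Iterator, List, Tuple

# (x, y, width, height)
Rect = Tuple[int, int, int, int]

SCREEN_W, SCREEN_H = 1280, 800


def split_layout(left_width: int, screen_w: int = SCREEN_W,
                 screen_h: int = SCREEN_H, bar_h: int = 24) -> List[Rect]:
    """Four-plane layout as in the log: a narrow left strip, the rest of the
    width beside it, a full-screen plane and a bar at the bottom."""
    return [
        (0, 0, left_width, screen_h),
        (left_width, 0, screen_w - left_width, screen_h),
        (0, 0, screen_w, screen_h),
        (0, screen_h - bar_h, screen_w, bar_h),
    ]


def shrinking_sequence(start_width: int = 4) -> Iterator[List[Rect]]:
    """Yield layouts whose left strip halves each step until it is 1 px."""
    width = start_width
    while width >= 1:
        yield split_layout(width)
        width //= 2


if __name__ == "__main__":
    for layout in shrinking_sequence():
        print(layout)
```

Each yielded layout corresponds to one atomic commit; following the last
layout with a commit that uses only wide planes would mirror the update
after which the timeout was observed.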

Kind Regards
Martin

* Re: Exynos vblank timeout issue
From: Inki Dae @ 2022-06-04  4:05 UTC
  To: Martin Jücker
  Cc: David Airlie, Kyungmin Park, Seung-Woo Kim, DRI mailing list,
	Krzysztof Kozlowski

Hi Martin,

What kind of panel does the Galaxy Note 10.1 use? I guess it uses an I80
panel, which needs a CPU trigger.
If so, you may need to check whether the panel device works correctly
after booting, because FIMD will incur a vsync timeout if the panel
doesn't work.
I think you could try to check whether the TE signal works in the
exynos_dsi_te_irq_handler function in exynos_drm_dsi.c.

Thanks,
Inki Dae


* Re: Exynos vblank timeout issue
  2022-06-04  4:05       ` Inki Dae
@ 2022-06-04 11:05         ` Martin Jücker
  2022-06-22  3:27           ` Inki Dae
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Jücker @ 2022-06-04 11:05 UTC (permalink / raw)
  To: Inki Dae
  Cc: David Airlie, Seung-Woo Kim, DRI mailing list, Kyungmin Park,
	Martin Jücker, Krzysztof Kozlowski

On Sat, Jun 04, 2022 at 01:05:39PM +0900, Inki Dae wrote:
> Hi Martin,
> 

Hi Inki,

> What kind of panel does Galaxy Note 10.1 use? I guess it uses I80
> panel which needs CPU-trigger.

the Note 10.1 uses a Samsung LTL101AL01 LCD which is connected via the
RGB interface.

In the meantime I tried several things but to no avail. Krzysztof
proposed in IRC to disable devfreq which had no effect. I compared the
mainline sources with the vendor kernel, but there are no notable
differences and the parts that were a bit different also didn't solve
the problem when applied to mainline.

I then tried to reproduce the issue and noticed that things go wrong as
soon as I get to planes that are smaller than 8x8 pixels. There are two
different issues I can make out:

1) The active pixels are not what is expected, e.g. the plane is 1x4
pixels but only two pixels are visible on the display
2) The screen goes dark and the vblank interrupt stops working

The vblank timeout occurs for all planes that are 1x8 or any multiple of
it, so 1x16, 1x24, as well as planes bigger than 1x279 in size. This is
for the primary plane. A width of two seems to be fine here.

For overlay planes, the situation is worse. All planes of width one will
trigger a vsync timeout. Also, planes with a width smaller than 8 seem to
be hit and miss, most of them don't work.

The first issue, the wrong number of pixels, seems to affect small
planes of less than 8x8 pixels that don't trigger the vsync issue, but
it's more difficult to find a pattern here. It looks like even dimensions
such as 4x4 or 4x6 are fine, but as soon as at least one odd dimension is
present, it goes wrong. 5x6, for example, will only display 5x5 pixels,
and 5x5 will display four rows of five pixels and one row with one pixel.
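
For anyone sweeping plane sizes in a reproducer, the observations above can
be restated as a small Python sketch. It only encodes what was seen on this
one device and is not a model of the FIMD hardware; the function names are
made up for illustration. Sizes are written width x height, matching the
plane log format.

```python
def primary_plane_timeout_reported(width: int, height: int) -> bool:
    """True if the report saw a vblank timeout on the primary plane.

    Reported pattern: width 1 with a height that is a multiple of 8
    (1x8, 1x16, 1x24, ...) or with a height above 279.
    """
    return width == 1 and (height % 8 == 0 or height > 279)


def overlay_plane_status_reported(width: int) -> str:
    """Classify an overlay plane width by what the report says about it."""
    if width == 1:
        return "timeout"        # every width-1 overlay plane timed out
    if width < 8:
        return "hit and miss"   # most of these failed, but not all
    return "not reported"       # nothing was said about wider planes


if __name__ == "__main__":
    for size in [(1, 4), (1, 8), (1, 16), (1, 280), (2, 8)]:
        print(size, primary_plane_timeout_reported(*size))
    for width in (1, 4, 8):
        print(width, overlay_plane_status_reported(width))
```

The wrong-pixel-count cases (1x4, 5x5, 5x6) are deliberately left out, since
no clear rule for them was found.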

Kind Regards
Martin



> If so, you may need to check if the panel device works correctly after
> booting because FIMD will incur vsync timeout if the panel doesn't
> work.
> I think you could try to check if te signal works or not in
> exynos_dsi_te_irq_handler function of exynos_drm_dsi.c
> 
> Thanks,
> Inki Dae
> 
> On Fri, May 27, 2022 at 8:34 AM, Martin Jücker <martin.juecker@gmail.com> wrote:
> >
> > Hello again,
> >
> > I tried to dig around a bit to unearth some more information. What I'm
> > seeing is that it works just fine in the beginning, planes are updated a
> > couple of times and suddenly, after one of the plane updates, the
> > interrupt handler in the FIMD driver is no longer called. The screen
> > goes dark but the device is still operational, e.g. ADB works fine, I
> > can connect and execute commands.
> >
> > Trying to figure out what is called when and curious about the state of
> > the registers, I littered the code with print statements and it looks
> > like vsync is still active, no other code calls into disabling it. All
> > registers are as expected, e.g. VIDINTCON0 has the interrupt bit set. I
> > also had a look at the interrupt combiner, this too has the
> > corresponding lcd0-1 interrupt enabled at all times and there is no
> > interrupt pending, even after FIMD stopped receiving them.
> >
> > Looking at the wiki at https://exynos.wiki.kernel.org/todo_tasks I found
> > issue #9. It's about trashed display or DMA freeze if planes are too
> > narrow and I was wondering if this could be related. So I had a look at
> > the drm debug output and planes are indeed getting very small. This
> > happens exactly when the animation that is triggering the issue is
> > playing, so this would match. Looking a bit closer at the position and
> > size of the planes, I could see that the last working vsync was right
> > after one of the planes was exactly 1 pixel in width and vsync only
> > stopped working one update later. Here are the plane updates from the
> > logs:
> >
> > -
> >
> > Planes getting smaller and smaller with each update:
> > plane : offset_x/y(0,0), width/height(4,800)
> > plane : offset_x/y(4,0), width/height(1276,800)
> > plane : offset_x/y(0,0), width/height(1280,800)
> > plane : offset_x/y(0,776), width/height(1280,24)
> >
> > plane : offset_x/y(0,0), width/height(2,800)
> > plane : offset_x/y(2,0), width/height(1278,800)
> > plane : offset_x/y(0,0), width/height(1280,800)
> > plane : offset_x/y(0,776), width/height(1280,24)
> >
> > plane : offset_x/y(0,0), width/height(1,800)
> > plane : offset_x/y(1,0), width/height(1279,800)
> > plane : offset_x/y(0,0), width/height(1280,800)
> > plane : offset_x/y(0,776), width/height(1280,24)
> >
> > Still got a vsync in between those two. But after the following update,
> > it's dead:
> > plane : offset_x/y(0,0), width/height(1280,800)
> > plane : offset_x/y(0,0), width/height(1280,24)
> > plane : offset_x/y(0,740), width/height(1280,60)
> > plane : offset_x/y(0,0), width/height(1280,800)
> >
> > -> vsync timeout comes here
> >
> > -
> >
> > I have no idea how to analyze this further on the kernel side. I'll try
> > to write an executable that triggers this bug next. If you have any
> > ideas on that, I'd be very grateful.
> >
> > Kind Regards
> > Martin
> >
> > On Sun, May 22, 2022 at 12:06:39PM +0200, Martin Jücker wrote:
> > > On Sun, May 22, 2022 at 09:45:51AM +0200, Krzysztof Kozlowski wrote:
> > > > On 22/05/2022 02:02, Martin Jücker wrote:
> > > > > Hello,
> > > > >
> > > > > I'm trying to get Android 12 up and running on my Galaxy Note 10.1 which
> > > > > is based on Exynos 4412 with a Mali GPU. For Android 11, I had no issues
> > > > > with graphics but after upgrading and building Android 12, I'm getting a
> > > > > vblank wait timeout shortly after starting the device setup, which in
> > > > > turn leads to my display turning black and SurfaceFlinger hanging. This
> > > > > can be reliably reproduced after every reboot, so much so that it's
> > > > > basically always on the exact same step of the setup.
> > > > >
> > > > > I'm using the following setup:
> > > > >
> > > > > * 5.10.101 Android Common Kernel with some patches to get
> > > > > the Note 10.1 up and running
> > > >
> > > > It's Android kernel, so not upstream. It is perfectly fine to use
> > > > downstream kernels, but with the issues you also go to downstream folks.
> > > > I have no clue what Android did to Exynos.
> > >
> > > Hi Krzysztof,
> > >
> > > indeed, that was my mistake. Should have done that on mainline first.
> > >
> > > I rebased some patches on top of v5.17.9 and tried again, same result.
> > > There are no Android patches in there, only p4note related things. You
> > > can have a look here:
> > >
> > > https://github.com/Viciouss/linux/commits/v5.17.9-android
> > >
> > > The behaviour is exactly the same, as soon as I try to advance in the
> > > setup process, it suddenly turns the screen all black.
> > >
> > > Here is the warning again, just in case there are any differences.
> > >
> > > [   77.651495] ------------[ cut here ]------------
> > > [   77.651527] WARNING: CPU: 2 PID: 8 at
> > > ../drivers/gpu/drm/drm_atomic_helper.c:1530
> > > drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
> > > [   77.651593] [CRTC:49:crtc-0] vblank wait timed out
> > > [   77.651608] Modules linked in: s5p_mfc s5p_jpeg v4l2_mem2mem
> > > videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common
> > > rfcomm kheaders hidp hci_uart cpufreq_userspace cpufreq_powersave
> > > cpufreq_conservative btbcm brcmfmac brcmutil bnep bluetooth atmel_mxt_ts
> > > [   77.651789] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 5.17.9+ #3
> > > [   77.651813] Hardware name: Samsung Exynos (Flattened Device Tree)
> > > [   77.651828] Workqueue: events_unbound commit_work
> > > [   77.651858] Backtrace:
> > > [   77.651874] dump_backtrace from show_stack+0x20/0x24
> > > [   77.651915] r7:c071097c r6:00000000 r5:c10ec66c r4:600f0013
> > > [   77.651926] show_stack from dump_stack_lvl+0x48/0x54
> > > [   77.651958] dump_stack_lvl from dump_stack+0x18/0x1c
> > > [   77.651986] r5:c113dcf4 r4:c1d51e04
> > > [   77.651996] dump_stack from __warn+0x18c/0x190
> > > [   77.652030] __warn from warn_slowpath_fmt+0x80/0xbc
> > > [   77.652070] r9:00000009 r8:c071097c r7:000005fa r6:c113dcf4
> > > r5:c1d8cb40 r4:c113e338
> > > [   77.652081] warn_slowpath_fmt from
> > > drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
> > > [   77.652123] r9:00000001 r8:00000000 r7:00000000 r6:00000000
> > > r5:00000000 r4:c398c800
> > > [   77.652135] drm_atomic_helper_wait_for_vblanks.part.1 from
> > > drm_atomic_helper_commit_tail_rpm+0x6c/0x7c
> > > [   77.652175] r10:c14cce68 r9:c1c2a005 r8:00000000 r7:0e3f351d
> > > r6:00000012 r5:c398c000
> > > [   77.652188] r4:d42943c0
> > > [   77.652197] drm_atomic_helper_commit_tail_rpm from
> > > commit_tail+0xb8/0x1d8
> > > [   77.652228] r5:00000000 r4:d42943c0
> > > [   77.652238] commit_tail from commit_work+0x1c/0x20
> > > [   77.652274] r10:c1518d20 r9:c1c2a005 r8:00000000 r7:c1c2a000
> > > r6:c1c0a800 r5:c1c08a00
> > > [   77.652287] r4:d42943ec
> > > [   77.652297] commit_work from process_one_work+0x1b0/0x528
> > > [   77.652324] process_one_work from worker_thread+0x54/0x4d8
> > > [   77.652356] r10:c1c0a800 r9:00000088 r8:c1403d00 r7:c1c0a81c
> > > r6:c1c08a18 r5:c1c0a800
> > > [   77.652368] r4:c1c08a00
> > > [   77.652378] worker_thread from kthread+0x104/0x134
> > > [   77.652419] r10:00000000 r9:c1d43e5c r8:c1d05880 r7:c1d8cb40
> > > r6:c1c08a00 r5:c015530c
> > > [   77.652432] r4:c1d05700
> > > [   77.652441] kthread from ret_from_fork+0x14/0x2c
> > > [   77.652468] Exception stack(0xc1d51fb0 to 0xc1d51ff8)
> > > [   77.652488] 1fa0:                                     00000000
> > > 00000000 00000000 00000000
> > > [   77.652509] 1fc0: 00000000 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000
> > > [   77.652528] 1fe0: 00000000 00000000 00000000 00000000 00000013
> > > 00000000
> > > [   77.652550] r9:00000000 r8:00000000 r7:00000000 r6:00000000
> > > r5:c015da78 r4:c1d05700
> > > [   77.652561] ---[ end trace 0000000000000000 ]---
> > >
> > > Kind Regards
> > > Martin
> > >
> > > >
> > > > Best regards,
> > > > Krzysztof


* Re: Exynos vblank timeout issue
  2022-06-04 11:05         ` Martin Jücker
@ 2022-06-22  3:27           ` Inki Dae
  0 siblings, 0 replies; 7+ messages in thread
From: Inki Dae @ 2022-06-22  3:27 UTC (permalink / raw)
  To: Martin Jücker
  Cc: David Airlie, Kyungmin Park, Seung-Woo Kim, DRI mailing list,
	Krzysztof Kozlowski

Hi Martin,

On Sat, Jun 4, 2022 at 8:05 PM, Martin Jücker <martin.juecker@gmail.com> wrote:
>
> On Sat, Jun 04, 2022 at 01:05:39PM +0900, Inki Dae wrote:
> > Hi Martin,
> >
>
> Hi Inki,
>
> > What kind of panel does Galaxy Note 10.1 use? I guess it uses I80
> > panel which needs CPU-trigger.
>
> the Note 10.1 uses a Samsung LTL101AL01 LCD which is connected via the
> RGB interface.
>
> In the meantime I tried several things but to no avail. Krzysztof
> proposed in IRC to disable devfreq which had no effect. I compared the
> mainline sources with the vendor kernel, but there are no notable
> differences and the parts that were a bit different also didn't solve
> the problem when applied to mainline.
>
> I then tried to reproduce the issue and noticed that things go wrong as
> soon as I get to planes that are smaller than 8x8 pixels. There are two
> different issues I can make out:
>
> 1) The active pixels are not what is expected, e.g. the plane is 1x4
> pixels but only two pixels are visible on the display
> 2) The screen goes dark and the vblank interrupt stops working

This looks like a malfunction of the display controller, the FIMD device.

>
> The vblank timeout occurs for all planes that are 1x8 or any multiple of
> it, so 1x16, 1x24, as well as planes bigger than 1x279 in size. This is
> for the primary plane. A width of two seems to be fine here.

The FIMD device has a limit on its DMA burst size. Please see how the
burst size is set according to the pixel format:
https://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git/tree/drivers/gpu/drm/exynos/exynos_drm_fimd.c?h=exynos-drm-fixes#n660

I think the Android platform uses XRGB8888 or ARGB8888 for the
framebuffer. In this case the burst size, i.e. the minimum amount of
frame buffer data that FIMD's DMA controller reads from memory in one
transfer, is 16 words.
So the width should be a multiple of that burst size.

Could you align the width of the frame buffer to a multiple of 16?
According to your analysis, a multiple of 8 would also be OK.
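
To make the arithmetic concrete, here is a minimal Python sketch. It
assumes that one DMA word is 4 bytes and that XRGB8888/ARGB8888 use 4 bytes
per pixel; the constants and helper names are illustrative and are not
taken from the driver.

```python
WORD_BYTES = 4        # assumption: one FIMD DMA word is 32 bits
BURST_WORDS = 16      # burst length named above for 32 bpp formats
BYTES_PER_PIXEL = 4   # XRGB8888 / ARGB8888


def burst_bytes() -> int:
    """Bytes fetched by one DMA burst under the assumptions above."""
    return BURST_WORDS * WORD_BYTES


def pixels_per_burst() -> int:
    """Number of 32 bpp pixels covered by one burst."""
    return burst_bytes() // BYTES_PER_PIXEL


def line_fills_whole_bursts(width_px: int) -> bool:
    """True if one line of the plane is an exact number of bursts."""
    return (width_px * BYTES_PER_PIXEL) % burst_bytes() == 0


def align_width(width_px: int, multiple: int = 16) -> int:
    """Round a width in pixels up to the next multiple."""
    return -(-width_px // multiple) * multiple


if __name__ == "__main__":
    for width in (1, 8, 16, 1279, 1280):
        print(width, line_fills_whole_bursts(width), align_width(width))
```

Under these assumptions one burst covers 16 pixels, so a 1 pixel wide
plane is far below a single burst, and an 8 pixel wide line is half a burst.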

Thanks,
Inki Dae

>
> For overlay planes, the situation is worse. All planes of width one will
> trigger a vsync timeout. Also, planes with a width smaller than 8 seem to
> be hit and miss, most of them don't work.
>
> The first issue, the wrong number of pixels, seems to affect small
> planes of less than 8x8 pixels that don't trigger the vsync issue, but
> it's more difficult to find a pattern here. It looks like even dimensions
> such as 4x4 or 4x6 are fine, but as soon as at least one odd dimension is
> present, it goes wrong. 5x6, for example, will only display 5x5 pixels,
> and 5x5 will display four rows of five pixels and one row with one pixel.
>
> Kind Regards
> Martin
>
>
>
> > If so, you may need to check if the panel device works correctly after
> > booting because FIMD will incur vsync timeout if the panel doesn't
> > work.
> > I think you could try to check if te signal works or not in
> > exynos_dsi_te_irq_handler function of exynos_drm_dsi.c
> >
> > Thanks,
> > Inki Dae
> >
> > On Fri, May 27, 2022 at 8:34 AM, Martin Jücker <martin.juecker@gmail.com> wrote:
> > >
> > > Hello again,
> > >
> > > I tried to dig around a bit to unearth some more information. What I'm
> > > seeing is that it works just fine in the beginning, planes are updated a
> > > couple of times and suddenly, after one of the plane updates, the
> > > interrupt handler in the FIMD driver is no longer called. The screen
> > > goes dark but the device is still operational, e.g. ADB works fine, I
> > > can connect and execute commands.
> > >
> > > Trying to figure out what is called when and curious about the state of
> > > the registers, I littered the code with print statements and it looks
> > > like vsync is still active, no other code calls into disabling it. All
> > > registers are as expected, e.g. VIDINTCON0 has the interrupt bit set. I
> > > also had a look at the interrupt combiner, this too has the
> > > corresponding lcd0-1 interrupt enabled at all times and there is no
> > > interrupt pending, even after FIMD stopped receiving them.
> > >
> > > Looking at the wiki at https://exynos.wiki.kernel.org/todo_tasks I found
> > > issue #9. It's about trashed display or DMA freeze if planes are too
> > > narrow and I was wondering if this could be related. So I had a look at
> > > the drm debug output and planes are indeed getting very small. This
> > > happens exactly when the animation that is triggering the issue is
> > > playing, so this would match. Looking a bit closer at the position and
> > > size of the planes, I could see that the last working vsync was right
> > > after one of the planes was exactly 1 pixel in width and vsync only
> > > stopped working one update later. Here are the plane updates from the
> > > logs:
> > >
> > > -
> > >
> > > Planes getting smaller and smaller with each update:
> > > plane : offset_x/y(0,0), width/height(4,800)
> > > plane : offset_x/y(4,0), width/height(1276,800)
> > > plane : offset_x/y(0,0), width/height(1280,800)
> > > plane : offset_x/y(0,776), width/height(1280,24)
> > >
> > > plane : offset_x/y(0,0), width/height(2,800)
> > > plane : offset_x/y(2,0), width/height(1278,800)
> > > plane : offset_x/y(0,0), width/height(1280,800)
> > > plane : offset_x/y(0,776), width/height(1280,24)
> > >
> > > plane : offset_x/y(0,0), width/height(1,800)
> > > plane : offset_x/y(1,0), width/height(1279,800)
> > > plane : offset_x/y(0,0), width/height(1280,800)
> > > plane : offset_x/y(0,776), width/height(1280,24)
> > >
> > > Still got a vsync in between those two. But after the following update,
> > > it's dead:
> > > plane : offset_x/y(0,0), width/height(1280,800)
> > > plane : offset_x/y(0,0), width/height(1280,24)
> > > plane : offset_x/y(0,740), width/height(1280,60)
> > > plane : offset_x/y(0,0), width/height(1280,800)
> > >
> > > -> vsync timeout comes here
> > >
> > > -
> > >
> > > I have no idea how to analyze this further on the kernel side. I'll try
> > > to write an executable that triggers this bug next. If you have any
> > > ideas on that, I'd be very grateful.
> > >
> > > Kind Regards
> > > Martin
> > >
> > > On Sun, May 22, 2022 at 12:06:39PM +0200, Martin Jücker wrote:
> > > > On Sun, May 22, 2022 at 09:45:51AM +0200, Krzysztof Kozlowski wrote:
> > > > > On 22/05/2022 02:02, Martin Jücker wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I'm trying to get Android 12 up and running on my Galaxy Note 10.1 which
> > > > > > is based on Exynos 4412 with a Mali GPU. For Android 11, I had no issues
> > > > > > with graphics but after upgrading and building Android 12, I'm getting a
> > > > > > vblank wait timeout shortly after starting the device setup, which in
> > > > > > turn leads to my display turning black and SurfaceFlinger hanging. This
> > > > > > can be reliably reproduced after every reboot, so much so that it's
> > > > > > basically always on the exact same step of the setup.
> > > > > >
> > > > > > I'm using the following setup:
> > > > > >
> > > > > > * 5.10.101 Android Common Kernel with some patches to get
> > > > > > the Note 10.1 up and running
> > > > >
> > > > > It's Android kernel, so not upstream. It is perfectly fine to use
> > > > > downstream kernels, but with the issues you also go to downstream folks.
> > > > > I have no clue what Android did to Exynos.
> > > >
> > > > Hi Krzysztof,
> > > >
> > > > indeed, that was my mistake. Should have done that on mainline first.
> > > >
> > > > I rebased some patches on top of v5.17.9 and tried again, same result.
> > > > There are no Android patches in there, only p4note related things. You
> > > > can have a look here:
> > > >
> > > > https://github.com/Viciouss/linux/commits/v5.17.9-android
> > > >
> > > > The behaviour is exactly the same, as soon as I try to advance in the
> > > > setup process, it suddenly turns the screen all black.
> > > >
> > > > Here is the warning again, just in case there are any differences.
> > > >
> > > > [   77.651495] ------------[ cut here ]------------
> > > > [   77.651527] WARNING: CPU: 2 PID: 8 at
> > > > ../drivers/gpu/drm/drm_atomic_helper.c:1530
> > > > drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
> > > > [   77.651593] [CRTC:49:crtc-0] vblank wait timed out
> > > > [   77.651608] Modules linked in: s5p_mfc s5p_jpeg v4l2_mem2mem
> > > > videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common
> > > > rfcomm kheaders hidp hci_uart cpufreq_userspace cpufreq_powersave
> > > > cpufreq_conservative btbcm brcmfmac brcmutil bnep bluetooth atmel_mxt_ts
> > > > [   77.651789] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 5.17.9+ #3
> > > > [   77.651813] Hardware name: Samsung Exynos (Flattened Device Tree)
> > > > [   77.651828] Workqueue: events_unbound commit_work
> > > > [   77.651858] Backtrace:
> > > > [   77.651874] dump_backtrace from show_stack+0x20/0x24
> > > > [   77.651915] r7:c071097c r6:00000000 r5:c10ec66c r4:600f0013
> > > > [   77.651926] show_stack from dump_stack_lvl+0x48/0x54
> > > > [   77.651958] dump_stack_lvl from dump_stack+0x18/0x1c
> > > > [   77.651986] r5:c113dcf4 r4:c1d51e04
> > > > [   77.651996] dump_stack from __warn+0x18c/0x190
> > > > [   77.652030] __warn from warn_slowpath_fmt+0x80/0xbc
> > > > [   77.652070] r9:00000009 r8:c071097c r7:000005fa r6:c113dcf4
> > > > r5:c1d8cb40 r4:c113e338
> > > > [   77.652081] warn_slowpath_fmt from
> > > > drm_atomic_helper_wait_for_vblanks.part.1+0x2b0/0x2b4
> > > > [   77.652123] r9:00000001 r8:00000000 r7:00000000 r6:00000000
> > > > r5:00000000 r4:c398c800
> > > > [   77.652135] drm_atomic_helper_wait_for_vblanks.part.1 from
> > > > drm_atomic_helper_commit_tail_rpm+0x6c/0x7c
> > > > [   77.652175] r10:c14cce68 r9:c1c2a005 r8:00000000 r7:0e3f351d
> > > > r6:00000012 r5:c398c000
> > > > [   77.652188] r4:d42943c0
> > > > [   77.652197] drm_atomic_helper_commit_tail_rpm from
> > > > commit_tail+0xb8/0x1d8
> > > > [   77.652228] r5:00000000 r4:d42943c0
> > > > [   77.652238] commit_tail from commit_work+0x1c/0x20
> > > > [   77.652274] r10:c1518d20 r9:c1c2a005 r8:00000000 r7:c1c2a000
> > > > r6:c1c0a800 r5:c1c08a00
> > > > [   77.652287] r4:d42943ec
> > > > [   77.652297] commit_work from process_one_work+0x1b0/0x528
> > > > [   77.652324] process_one_work from worker_thread+0x54/0x4d8
> > > > [   77.652356] r10:c1c0a800 r9:00000088 r8:c1403d00 r7:c1c0a81c
> > > > r6:c1c08a18 r5:c1c0a800
> > > > [   77.652368] r4:c1c08a00
> > > > [   77.652378] worker_thread from kthread+0x104/0x134
> > > > [   77.652419] r10:00000000 r9:c1d43e5c r8:c1d05880 r7:c1d8cb40
> > > > r6:c1c08a00 r5:c015530c
> > > > [   77.652432] r4:c1d05700
> > > > [   77.652441] kthread from ret_from_fork+0x14/0x2c
> > > > [   77.652468] Exception stack(0xc1d51fb0 to 0xc1d51ff8)
> > > > [   77.652488] 1fa0:                                     00000000
> > > > 00000000 00000000 00000000
> > > > [   77.652509] 1fc0: 00000000 00000000 00000000 00000000 00000000
> > > > 00000000 00000000 00000000
> > > > [   77.652528] 1fe0: 00000000 00000000 00000000 00000000 00000013
> > > > 00000000
> > > > [   77.652550] r9:00000000 r8:00000000 r7:00000000 r6:00000000
> > > > r5:c015da78 r4:c1d05700
> > > > [   77.652561] ---[ end trace 0000000000000000 ]---
> > > >
> > > > Kind Regards
> > > > Martin
> > > >
> > > > >
> > > > > Best regards,
> > > > > Krzysztof



Thread overview: 7+ messages
2022-05-22  0:02 Exynos vblank timeout issue Martin Jücker
2022-05-22  7:45 ` Krzysztof Kozlowski
2022-05-22 10:06   ` Martin Jücker
2022-05-26 23:34     ` Martin Jücker
2022-06-04  4:05       ` Inki Dae
2022-06-04 11:05         ` Martin Jücker
2022-06-22  3:27           ` Inki Dae
