* vcn regression on raven1
@ 2018-05-01 13:34 Tom St Denis
[not found] ` <86eb81c5-8ec5-8927-71ce-a2794e48721e-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Tom St Denis @ 2018-05-01 13:34 UTC (permalink / raw)
To: Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi all,
I've noticed that on the tip of drm-next, vcn playback of video is broken
(see dmesg below). I've bisected it to this commit:
[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Tue Mar 27 17:10:56 2018 -0500
drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
it provides a write and wait in a single packet which avoids a missed
ack if a world switch happens between the request and waiting for the
ack.
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
:040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
Which is odd because the commit before this is the vcn change and it
works fine (playing BBB right now).
Here's the dmesg:
[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 2925.640113] IP: (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash gpu_sched ttm ax88179_178a usbnet
[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 3803 01/22/2018
[ 2925.640146] RIP: 0010: (null)
[ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
[ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
[ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
[ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
[ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
[ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
[ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000) knlGS:0000000000000000
[ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
[ 2925.640176] Call Trace:
[ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
[ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
[ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
[ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
[ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
[ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
[ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
[ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
[ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
[ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
[ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
[ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
[ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
[ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
[ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
[ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
[ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
[ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641328] ? save_stack+0x89/0xb0
[ 2925.641332] ? wait_woken+0x110/0x110
[ 2925.641337] ? ret_from_fork+0x22/0x40
[ 2925.641343] ? __schedule+0xd30/0xd30
[ 2925.641346] ? remove_wait_queue+0x150/0x150
[ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
[ 2925.641359] ? __lock_text_start+0x8/0x8
[ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641371] ? kthread+0x19b/0x1c0
[ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 2925.641382] ? ret_from_fork+0x22/0x40
[ 2925.641387] Code: Bad RIP value.
[ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
[ 2925.641400] CR2: 0000000000000000
[ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
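An oops with "IP: (null)" and error code 0010 (instruction fetch) is consistent with an indirect call through a NULL function pointer, which is what one would expect if a ring's callback table has no entry for a newly used operation. The following is only a minimal userspace sketch of that pattern and of a guarded dispatch with a two-packet fallback. Every name in it (struct ring, struct ring_funcs, the demo_* functions, emit_write_then_wait) is a hypothetical stand-in, not the amdgpu code, and the register semantics are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ring;

/* Hypothetical per-engine callback table (names are illustrative only). */
struct ring_funcs {
    void (*emit_wreg)(struct ring *ring, uint32_t reg, uint32_t val);
    void (*emit_reg_wait)(struct ring *ring, uint32_t reg,
                          uint32_t val, uint32_t mask);
    /* Optional: NULL on engines without a combined write-and-wait packet. */
    void (*emit_reg_write_reg_wait)(struct ring *ring, uint32_t reg0,
                                    uint32_t reg1, uint32_t ref,
                                    uint32_t mask);
};

struct ring {
    const struct ring_funcs *funcs;
    unsigned int packets_emitted; /* stand-in for a real command buffer */
};

static void demo_emit_wreg(struct ring *ring, uint32_t reg, uint32_t val)
{
    (void)reg;
    (void)val;
    ring->packets_emitted++;
}

static void demo_emit_reg_wait(struct ring *ring, uint32_t reg,
                               uint32_t val, uint32_t mask)
{
    (void)reg;
    (void)val;
    (void)mask;
    ring->packets_emitted++;
}

/* An engine table that never filled in the combined callback. */
static const struct ring_funcs demo_dec_funcs = {
    .emit_wreg = demo_emit_wreg,
    .emit_reg_wait = demo_emit_reg_wait,
    .emit_reg_write_reg_wait = NULL,
};

/*
 * Emit "write reg0, then wait on reg1". Calling the combined callback
 * unconditionally would jump to address 0 when it is NULL, so check first.
 * Returns 1 if the combined packet was used, 0 if the fallback ran.
 */
static int emit_write_then_wait(struct ring *ring, uint32_t reg0,
                                uint32_t reg1, uint32_t ref, uint32_t mask)
{
    /* These two callbacks are treated as mandatory in this sketch. */
    assert(ring->funcs->emit_wreg && ring->funcs->emit_reg_wait);

    if (ring->funcs->emit_reg_write_reg_wait) {
        ring->funcs->emit_reg_write_reg_wait(ring, reg0, reg1, ref, mask);
        return 1;
    }

    /* Fallback: two separate packets, so the single-packet guarantee is lost. */
    ring->funcs->emit_wreg(ring, reg0, ref);
    ring->funcs->emit_reg_wait(ring, reg1, mask, mask);
    return 0;
}
```

With demo_dec_funcs, emit_write_then_wait() takes the fallback path and emits two packets instead of crashing. Whether a fallback or a per-engine callback is the right fix is a design choice for the driver; the sketch only shows why the missing entry matters.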
Note that regular compute/gfx workflows work fine on the tip of drm-next;
only vcn playback triggers this (haven't tried encode yet...).
Cheers,
Tom
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: vcn regression on raven1
[not found] ` <86eb81c5-8ec5-8927-71ce-a2794e48721e-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 0:43 ` Zhang, Jerry (Junwei)
[not found] ` <5AE909C0.8040704-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-02 0:43 UTC (permalink / raw)
To: Tom St Denis, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 05/01/2018 09:34 PM, Tom St Denis wrote:
> Hi all,
>
> I've noticed that on the tip of drm-next vcn playback of video is broken (see
> dmesg below). I've bisected it to this commit
This looks like a common issue that may already be fixed here:
* https://patchwork.freedesktop.org/patch/218909/
Jerry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: vcn regression on raven1
[not found] ` <5AE909C0.8040704-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 0:47 ` StDenis, Tom
[not found] ` <MWHPR1201MB0061C0CE5E63D714899BD73BF7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: StDenis, Tom @ 2018-05-02 0:47 UTC (permalink / raw)
To: Zhang, Jerry, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Jerry,
As far as I know, this wasn't included on the tip of drm-next. I hit this issue this morning in my semi-regular pull/build/test cycle.
Was this missed in a recent rebase?
Tom
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: vcn regression on raven1
[not found] ` <MWHPR1201MB0061C0CE5E63D714899BD73BF7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2018-05-02 0:52 ` Zhang, Jerry (Junwei)
[not found] ` <5AE90BD4.8080205-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-02 0:52 UTC (permalink / raw)
To: StDenis, Tom, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Tom,
It landed in the latest drm-next, for example:
* 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
Did you test with that included?
If not, please try the latest drm-next.
From the log, they look like the same issue.
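One way to catch this class of problem before any work is submitted is to validate each engine's callback table once at init time. The following is a minimal userspace sketch of that idea only. The struct layout, the demo_* tables, and count_missing_callbacks() are all hypothetical and are not taken from the amdgpu driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback table; field names are illustrative only. */
struct ring_funcs {
    const char *name;
    void (*emit_wreg)(void);
    void (*emit_reg_wait)(void);
    void (*emit_reg_write_reg_wait)(void);
};

static void demo_noop(void)
{
}

/* A table that provides every callback. */
static const struct ring_funcs demo_complete_funcs = {
    .name = "demo_complete",
    .emit_wreg = demo_noop,
    .emit_reg_wait = demo_noop,
    .emit_reg_write_reg_wait = demo_noop,
};

/* A table where emit_reg_write_reg_wait is left NULL by omission. */
static const struct ring_funcs demo_incomplete_funcs = {
    .name = "demo_incomplete",
    .emit_wreg = demo_noop,
    .emit_reg_wait = demo_noop,
};

/* Returns the number of NULL callbacks, reporting each one on stderr. */
static int count_missing_callbacks(const struct ring_funcs *funcs)
{
    int missing = 0;

    assert(funcs != NULL);

    if (!funcs->emit_wreg) {
        fprintf(stderr, "%s: missing emit_wreg\n", funcs->name);
        missing++;
    }
    if (!funcs->emit_reg_wait) {
        fprintf(stderr, "%s: missing emit_reg_wait\n", funcs->name);
        missing++;
    }
    if (!funcs->emit_reg_write_reg_wait) {
        fprintf(stderr, "%s: missing emit_reg_write_reg_wait\n", funcs->name);
        missing++;
    }
    return missing;
}
```

A check like this turns a NULL call at submission time into a clear message at init, at the cost of having to decide which callbacks are truly mandatory for every engine.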
Jerry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: vcn regression on raven1
[not found] ` <5AE90BD4.8080205-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 0:57 ` StDenis, Tom
[not found] ` <MWHPR1201MB0061A8D8AEB5744131AC416AF7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: StDenis, Tom @ 2018-05-02 0:57 UTC (permalink / raw)
To: Zhang, Jerry, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Jerry,
It's well past EOD for me, so I'll pick this up in the morning.
I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.
If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to make sure it's in there.
Based on the public copy of the tree, it's not there:
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
Cheers,
Tom
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: vcn regression on raven1
[not found] ` <MWHPR1201MB0061A8D8AEB5744131AC416AF7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2018-05-02 1:07 ` Zhang, Jerry (Junwei)
[not found] ` <5AE90F4E.8080800-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-02 1:07 UTC (permalink / raw)
To: StDenis, Tom, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Tom,
It sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop seems to lag behind in syncing code from the internal drm-next.
That gap is why the issue showed up in your test.
Hi Alex,
Is this a problem with code syncing between freedesktop and the internal drm-next,
or is the sync delay a known issue?
Jerry
>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>> [ 2925.641387] Code: Bad RIP value.
>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>> [ 2925.641400] CR2: 0000000000000000
>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>
>>>
>>> Note that regular compute/gfx workflows work fine on the tip of drm-next; only
>>> vcn playback triggers this (haven't tried encode yet...).
>>>
>>> Cheers,
>>> Tom
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <5AE90F4E.8080800-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 1:29 ` StDenis, Tom
[not found] ` <MWHPR1201MB006170E338FD24D648B90766F7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: StDenis, Tom @ 2018-05-02 1:29 UTC (permalink / raw)
To: Zhang, Jerry, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
I pull from gerrit. I'm just pointing out that it's not on drm-next upstream either.
It may have been missed in a rebase or something.
Tom
________________________________________
From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:07
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1
Hi Tom,
Sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop seems to lag in syncing code from the internal drm-next.
That gap is why the issue showed up in your test.
Hi Alex,
Is this a problem with code syncing between freedesktop and the internal drm-next?
Or is it a known delay in syncing the code?
Jerry
On 05/02/2018 08:57 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> It's well past EOD for me, so I'll pick this up in the morning.
>
> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.
>
> If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to make sure it's in there.
>
> Based on the public copy of the tree, it's not there:
>
> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>
> Cheers,
> Tom
> ________________________________________
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 20:52
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> It landed in the latest drm-next:
> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>
> Did you test with that included?
> If not, please try the latest drm-next.
> From the log, they look like the same issue.
>
> Jerry
>
> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> So far as I know this wasn't included on the tip of drm-next. I hit this this morning in my semi-regular pull/build/test cycle.
>>
>> Was this missed in a recent rebase?
>>
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 20:43
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>> Hi all,
>>>
>>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>>> dmesg below). I've bisected it to this commit
>>
>> It may be fixed here as a common issue.
>>
>> * https://patchwork.freedesktop.org/patch/218909/
>>
>> Jerry
>>
>>>
>>> [root@raven linux]# git bisect good
>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>
>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>>>
>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>>> it provides a write and wait in a single packet which avoids a missed
>>> ack if a world switch happens between the request and waiting for the
>>> ack.
>>>
>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>
>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>
>>> Which is odd because the commit before this is the vcn change and it works fine
>>> (playing BBB right now).
>>>
>>> Here's the dmesg:
>>>
>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>> 0000000000000000
>>> [ 2925.640113] IP: (null)
>>> [ 2925.640116] PGD 0 P4D 0
>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>> gpu_sched ttm ax88179_178a usbnet
>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>> [ 2925.640146] RIP: 0010: (null)
>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>> knlGS:0000000000000000
>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
>>> [ 2925.640176] Call Trace:
>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>> [ 2925.641387] Code: Bad RIP value.
>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>> [ 2925.641400] CR2: 0000000000000000
>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>
>>>
>>> Note that regular compute/gfx workflows work fine on the tip of drm-next; only
>>> vcn playback triggers this (haven't tried encode yet...).
>>>
>>> Cheers,
>>> Tom
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <MWHPR1201MB006170E338FD24D648B90766F7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2018-05-02 1:39 ` Zhang, Jerry (Junwei)
[not found] ` <5AE916E1.1040305-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-02 1:39 UTC (permalink / raw)
To: StDenis, Tom, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Tom,
Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?
I do find it.
The tip of gerrit/amd-staging-drm-next is
* bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait
to recover from ring hang. <Andrey Grodzovsky>
while the tip of freedesktop is
* a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for
VEGAM <Eric Huang>
Jerry
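One way to settle whether the fix is present in a given tree is to search the branch history by commit message rather than by hash, since a rebase changes hashes but usually keeps subjects. The sketch below is only an illustration: the helper name has_patch is hypothetical, and the remote names are assumptions (adjust them to your own setup). Note that --grep matches anywhere in the commit message, not only in the subject line.

```shell
#!/bin/sh
# has_patch REF PATTERN (hypothetical helper, not part of git):
# succeeds if any commit reachable from REF contains PATTERN in its message.
has_patch() {
    # -F treats PATTERN as a fixed string; --oneline prints one line per match.
    # A missing REF makes git log fail silently here, so the helper returns non-zero.
    git log --oneline -F --grep="$2" "$1" -- 2>/dev/null | grep -q .
}
```

For example, `has_patch gerrit/amd-staging-drm-next 'add emit_reg_write_reg_wait ring callback'` exits with status 0 if a commit with that text is reachable from the branch, and non-zero otherwise. Running the same check against the freedesktop copy of the branch would show whether the public tree is simply behind.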
On 05/02/2018 09:29 AM, StDenis, Tom wrote:
> I pull from gerrit. I'm just pointing out that it's not on drm-next upstream either.
>
> It may have been missed in a rebase or something.
>
> Tom
> ________________________________________
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 21:07
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> Sounds like you got the code from freedesktop rather than the internal drm-next.
> Unfortunately, freedesktop seems to lag in syncing code from the internal drm-next.
> That gap is why the issue showed up in your test.
>
> Hi Alex,
>
> Is this a problem with code syncing between freedesktop and the internal drm-next?
> Or is it a known delay in syncing the code?
>
> Jerry
>
> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> It's well past EOD for me, so I'll pick this up in the morning.
>>
>> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.
>>
>> If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to make sure it's in there.
>>
>> Based on the public copy of the tree, it's not there:
>>
>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 20:52
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> It landed in the latest drm-next:
>> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
>> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>
>> Did you test with that included?
>> If not, please try the latest drm-next.
>> From the log, they look like the same issue.
>>
>> Jerry
>>
>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>> Hi Jerry,
>>>
>>> So far as I know this wasn't included on the tip of drm-next. I hit this this morning in my semi-regular pull/build/test cycle.
>>>
>>> Was this missed in a recent rebase?
>>>
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 20:43
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>>>> dmesg below). I've bisected it to this commit
>>>
>>> It may be fixed here as a common issue.
>>>
>>> * https://patchwork.freedesktop.org/patch/218909/
>>>
>>> Jerry
>>>
>>>>
>>>> [root@raven linux]# git bisect good
>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>>
>>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>>>>
>>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>>>> it provides a write and wait in a single packet which avoids a missed
>>>> ack if a world switch happens between the request and waiting for the
>>>> ack.
>>>>
>>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>
>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>>
>>>> Which is odd because the commit before this is the vcn change and it works fine
>>>> (playing BBB right now).
>>>>
>>>> Here's the dmesg:
>>>>
>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000000
>>>> [ 2925.640113] IP: (null)
>>>> [ 2925.640116] PGD 0 P4D 0
>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>>> gpu_sched ttm ax88179_178a usbnet
>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>> [ 2925.640146] RIP: 0010: (null)
>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>>> knlGS:0000000000000000
>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
>>>> [ 2925.640176] Call Trace:
>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>> [ 2925.641387] Code: Bad RIP value.
>>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>>> [ 2925.641400] CR2: 0000000000000000
>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>
>>>>
>>>> Note that regular compute/gfx workflows work fine on the tip of drm-next; only
>>>> vcn playback triggers this (haven't tried encode yet...).
>>>>
>>>> Cheers,
>>>> Tom
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <5AE916E1.1040305-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 1:41 ` StDenis, Tom
[not found] ` <MWHPR1201MB006160B52BC8575261FA0347F7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: StDenis, Tom @ 2018-05-02 1:41 UTC (permalink / raw)
To: Zhang, Jerry, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Jerry,
Like I said, it's (now well) past EOD (meaning my workstation is powered off), so I'll have to check tomorrow. But I do pull from gerrit daily and build from that.
I'll take a look in the morning.
Cheers,
Tom
________________________________________
From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:39
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1
Hi Tom,
Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?
I do find it.
The tip of gerrit/amd-staging-drm-next is
* bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait
to recover from ring hang. <Andrey Grodzovsky>
while the tip of freedesktop is
* a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for
VEGAM <Eric Huang>
Jerry
On 05/02/2018 09:29 AM, StDenis, Tom wrote:
> I pull from gerrit. I'm just pointing out that it's not on drm-next upstream either.
>
> It may have been missed in a rebase or something.
>
> Tom
> ________________________________________
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 21:07
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> Sounds like you got the code from freedesktop rather than the internal drm-next.
> Unfortunately, freedesktop seems to lag in syncing code from the internal drm-next.
> That gap is why the issue showed up in your test.
>
> Hi Alex,
>
> Is this a problem with code syncing between freedesktop and the internal drm-next?
> Or is it a known delay in syncing the code?
>
> Jerry
>
> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> It's well past EOD for me, so I'll pick this up in the morning.
>>
>> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.
>>
>> If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to make sure it's in there.
>>
>> Based on the public copy of the tree, it's not there:
>>
>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 20:52
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> It landed in the latest drm-next:
>> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
>> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>
>> Did you test with that included?
>> If not, please try the latest drm-next.
>> From the log, they look like the same issue.
>>
>> Jerry
>>
>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>> Hi Jerry,
>>>
>>> So far as I know this wasn't included on the tip of drm-next. I hit this this morning in my semi-regular pull/build/test cycle.
>>>
>>> Was this missed in a recent rebase?
>>>
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 20:43
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>>>> dmesg below). I've bisected it to this commit
>>>
>>> It may be fixed here as a common issue.
>>>
>>> * https://patchwork.freedesktop.org/patch/218909/
>>>
>>> Jerry
>>>
>>>>
>>>> [root@raven linux]# git bisect good
>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>>
>>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>>>>
>>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>>>> it provides a write and wait in a single packet which avoids a missed
>>>> ack if a world switch happens between the request and waiting for the
>>>> ack.
>>>>
>>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>
>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>>
>>>> Which is odd because the commit before this is the vcn change and it works fine
>>>> (playing BBB right now).
>>>>
>>>> Here's the dmesg:
>>>>
>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000000
>>>> [ 2925.640113] IP: (null)
>>>> [ 2925.640116] PGD 0 P4D 0
>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>>> gpu_sched ttm ax88179_178a usbnet
>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>> [ 2925.640146] RIP: 0010: (null)
>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>>> knlGS:0000000000000000
>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
>>>> [ 2925.640176] Call Trace:
>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>> [ 2925.641387] Code: Bad RIP value.
>>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>>> [ 2925.641400] CR2: 0000000000000000
>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>
>>>>
>>>> Note that regular compute/gfx workflows work fine on the tip of drm-next; only
>>>> vcn playback triggers this (haven't tried encode yet...).
>>>>
>>>> Cheers,
>>>> Tom
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <MWHPR1201MB006160B52BC8575261FA0347F7800-3iK1xFAIwjq5p0Vu5p1DW2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2018-05-02 1:44 ` Zhang, Jerry (Junwei)
[not found] ` <5AE91812.7010005-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-02 1:44 UTC (permalink / raw)
To: StDenis, Tom, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Tom,
Ha, got your meaning.
Please check it with the latest drm-next from gerrit tomorrow.
Jerry
On 05/02/2018 09:41 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> Like I said, it's (now well) past EOD (meaning my workstation is powered off), so I'll have to check tomorrow. But I do pull from gerrit daily and build from that.
>
> I'll take a look in the morning.
>
> Cheers,
> Tom
> ________________________________________
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 21:39
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?
>
> I do find it.
>
> The tip of gerrit/amd-staging-drm-next is
> * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait
> to recover from ring hang. <Andrey Grodzovsky>
>
> while the tip of freedesktop is
> * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for
> VEGAM <Eric Huang>
>
> Jerry
>
> On 05/02/2018 09:29 AM, StDenis, Tom wrote:
>> I pull from gerrit. I'm just pointing out that it's not on drm-next upstream either.
>>
>> It may have been missed in a rebase or something.
>>
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 21:07
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> Sounds like you got the code from freedesktop rather than the internal drm-next.
>> Unfortunately, freedesktop seems to lag in syncing code from the internal drm-next.
>> That gap is why the issue showed up in your test.
>>
>> Hi Alex,
>>
>> Is this a problem with code syncing between freedesktop and the internal drm-next?
>> Or is it a known delay in syncing the code?
>>
>> Jerry
>>
>> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>>> Hi Jerry,
>>>
>>> It's well past EOD for me, so I'll pick this up in the morning.
>>>
>>> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.
>>>
>>> If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to make sure it gets in.
>>>
>>> Based on the public copy of the tree, it's not there:
>>>
>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>>
>>> Cheers,
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 20:52
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> Hi Tom,
>>>
>>> It landed in the latest drm-next as:
>>> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
>>> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>>
>>> Did you test with that included?
>>> If not, please try the latest drm-next.
>>> From the log, they look like the same issue.
>>>
>>> Jerry
>>>
>>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>>> Hi Jerry,
>>>>
>>>> So far as I know this wasn't included on the tip of drm-next. I hit this this morning in my semi-regular pull/build/test cycle.
>>>>
>>>> Was this missed in a recent rebase?
>>>>
>>>> Tom
>>>> ________________________________________
>>>> From: Zhang, Jerry
>>>> Sent: Tuesday, May 1, 2018 20:43
>>>> To: StDenis, Tom; Deucher, Alexander
>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>> Subject: Re: vcn regression on raven1
>>>>
>>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>>> Hi all,
>>>>>
>>>>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>>>>> dmesg below). I've bisected it to this commit
>>>>
>>>> It may be fixed here as a common issue.
>>>>
>>>> * https://patchwork.freedesktop.org/patch/218909/
>>>>
>>>> Jerry
>>>>
>>>>>
>>>>> [root@raven linux]# git bisect good
>>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>>>
>>>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>>>>>
>>>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>>>>> it provides a write and wait in a single packet which avoids a missed
>>>>> ack if a world switch happens between the request and waiting for the
>>>>> ack.
>>>>>
>>>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>>
>>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>>>
>>>>> Which is odd because the commit before this is the vcn change and it works fine
>>>>> (playing BBB right now).
>>>>>
>>>>> Here's the dmesg:
>>>>>
>>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>>>> 0000000000000000
>>>>> [ 2925.640113] IP: (null)
>>>>> [ 2925.640116] PGD 0 P4D 0
>>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>>>> gpu_sched ttm ax88179_178a usbnet
>>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>>> [ 2925.640146] RIP: 0010: (null)
>>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
>>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
>>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
>>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
>>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
>>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
>>>>> [ 2925.640176] Call Trace:
>>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>>> [ 2925.641387] Code: Bad RIP value.
>>>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>>>> [ 2925.641400] CR2: 0000000000000000
>>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>>
>>>>>
>>>>> Note that regular compute/gfx workflows work fine on the tip of drm-next; only
>>>>> vcn playback triggers this (haven't tried encode yet...).
>>>>>
>>>>> Cheers,
>>>>> Tom
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <5AE91812.7010005-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-02 10:21 ` Tom St Denis
[not found] ` <97f75a85-b805-758c-e8a4-492a865746e3-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Tom St Denis @ 2018-05-02 10:21 UTC (permalink / raw)
To: Zhang, Jerry (Junwei), Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Jerry,
Just got up and going (6am ... ugh early). I see the confusion. Yes,
there is a patch on drm-next, but the problem is that there are separate
callback tables for decode and encode. The patch that is already on
drm-next only adds the callback for encode.
My patch adds the callback for decode as well. :-)
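To illustrate the failure mode, here is a minimal sketch (not the actual amdgpu code; every struct, function, and register value below is hypothetical). Each ring type carries its own table of function pointers, and a table that omits a callback leaves that pointer NULL. A shared code path that calls through it would jump to address 0, which is consistent with the `IP: (null)` in the oops.

```c
#include <stddef.h>
#include <stdio.h>

struct ring;

/* Hypothetical, simplified per-ring callback table (not the real amdgpu struct). */
struct ring_funcs {
	const char *name;
	/* Combined "write one register, then wait on another" packet. */
	void (*emit_reg_write_reg_wait)(struct ring *ring,
					unsigned int reg0, unsigned int reg1,
					unsigned int ref, unsigned int mask);
};

struct ring {
	const struct ring_funcs *funcs;
	unsigned int packets;	/* number of packets emitted so far */
};

static void demo_emit_reg_write_reg_wait(struct ring *ring,
					 unsigned int reg0, unsigned int reg1,
					 unsigned int ref, unsigned int mask)
{
	(void)reg0; (void)reg1; (void)ref; (void)mask;
	ring->packets++;
}

/* Encode table: the callback is filled in. */
static const struct ring_funcs enc_funcs = {
	.name = "enc",
	.emit_reg_write_reg_wait = demo_emit_reg_write_reg_wait,
};

/* Decode table: the callback was never added, so the pointer is NULL. */
static const struct ring_funcs dec_funcs = {
	.name = "dec",
};

/*
 * Shared flush path used by every ring type. Without the NULL check,
 * the decode ring would call through a NULL function pointer.
 * Returns 0 on success, -1 if the ring lacks the callback.
 */
static int flush_tlb(struct ring *ring)
{
	if (!ring->funcs->emit_reg_write_reg_wait) {
		fprintf(stderr, "%s ring: emit_reg_write_reg_wait missing\n",
			ring->funcs->name);
		return -1;
	}
	ring->funcs->emit_reg_write_reg_wait(ring, 0x10, 0x11, 0x1, 0x1);
	return 0;
}
```

In this sketch, flush_tlb() succeeds on a ring using enc_funcs and returns -1 on one using dec_funcs. The check is only there to make the sketch safe to run; the fix described above corresponds to filling in the missing entry in the decode table.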
Cheers,
Tom
On 05/01/2018 09:44 PM, Zhang, Jerry (Junwei) wrote:
> Hi Tom,
>
> Ha, got your meaning.
> Please check it with the latest drm-next from gerrit tomorrow.
>
> Jerry
>
> On 05/02/2018 09:41 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> Like I said it's (now well) past EOD (meaning my workstation is
>> powered off) so I'll have to check tomorrow. But I do pull from
>> gerrit daily and build from that.
>>
>> I'll take a look in the morning.
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 21:39
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> Do you mean you cannot find the patch from
>> gerrit/amd-staging-dkms-next either?
>>
>> I do find it.
>>
>> the tip of gerrit/amd-staging-drm-next is
>> * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to
>> interruptable wait
>> to recover from ring hang. <Andrey Grodzovsky>
>>
>> while the tip of freedesktop is
>> * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune
>> table for
>> VEGAM <Eric Huang>
>>
>> Jerry
>>
>> On 05/02/2018 09:29 AM, StDenis, Tom wrote:
>>> I pull from gerrit. I'm just pointing out that it's not on drm-next
>>> upstream either.
>>>
>>> It may have been missed in a rebase or something.
>>>
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 21:07
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> Hi Tom,
>>>
>>> It sounds like you got the code from freedesktop rather than the internal
>>> drm-next.
>>> Unfortunately, freedesktop seems to lag behind in syncing the code from the
>>> internal drm-next.
>>> That gap is what caused the issue in the test.
>>>
>>> Hi Alex,
>>>
>>> Is this a problem with code syncing between freedesktop and the internal
>>> drm-next?
>>> Or is the sync delay a known issue?
>>>
>>> Jerry
>>>
>>> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>>>> Hi Jerry,
>>>>
>>>> It's well past EOD for me, so I'll pick this up in the morning.
>>>>
>>>> I'm fairly certain I wrote my patch against the tip of
>>>> amd-staging-drm-next as of my pull this morning though.
>>>>
>>>> If it's in there and I missed it somehow, I apologize; otherwise it'd
>>>> be nice to make sure it gets in.
>>>>
>>>> Based on the public copy of the tree, it's not there:
>>>>
>>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>>>
>>>>
>>>> Cheers,
>>>> Tom
>>>> ________________________________________
>>>> From: Zhang, Jerry
>>>> Sent: Tuesday, May 1, 2018 20:52
>>>> To: StDenis, Tom; Deucher, Alexander
>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>> Subject: Re: vcn regression on raven1
>>>>
>>>> Hi Tom,
>>>>
>>>> It landed in the latest drm-next as:
>>>> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
>>>> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>>>
>>>> Did you test with that included?
>>>> If not, please try the latest drm-next.
>>>> From the log, they look like the same issue.
>>>>
>>>> Jerry
>>>>
>>>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>>>> Hi Jerry,
>>>>>
>>>>> So far as I know this wasn't included on the tip of drm-next. I
>>>>> hit this this morning in my semi-regular pull/build/test cycle.
>>>>>
>>>>> Was this missed in a recent rebase?
>>>>>
>>>>> Tom
>>>>> ________________________________________
>>>>> From: Zhang, Jerry
>>>>> Sent: Tuesday, May 1, 2018 20:43
>>>>> To: StDenis, Tom; Deucher, Alexander
>>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>>> Subject: Re: vcn regression on raven1
>>>>>
>>>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've noticed that on the tip of drm-next vcn playback of video is
>>>>>> broken (see
>>>>>> dmesg below). I've bisected it to this commit
>>>>>
>>>>> It may be fixed here as a common issue.
>>>>>
>>>>> * https://patchwork.freedesktop.org/patch/218909/
>>>>>
>>>>> Jerry
>>>>>
>>>>>>
>>>>>> [root@raven linux]# git bisect good
>>>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>>>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>>>>
>>>>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait
>>>>>> in gpu tlb flush
>>>>>>
>>>>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that
>>>>>> support it,
>>>>>> it provides a write and wait in a single packet which
>>>>>> avoids a missed
>>>>>> ack if a world switch happens between the request and
>>>>>> waiting for the
>>>>>> ack.
>>>>>>
>>>>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>>>
>>>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>>>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>>>>
>>>>>> Which is odd because the commit before this is the vcn change and
>>>>>> it works fine
>>>>>> (playing BBB right now).
>>>>>>
>>>>>> Here's the dmesg:
>>>>>>
>>>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer
>>>>>> dereference at
>>>>>> 0000000000000000
>>>>>> [ 2925.640113] IP: (null)
>>>>>> [ 2925.640116] PGD 0 P4D 0
>>>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core
>>>>>> chash
>>>>>> gpu_sched ttm ax88179_178a usbnet
>>>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted
>>>>>> 4.16.0-rc7+ #20
>>>>>> [ 2925.640142] Hardware name: System manufacturer System Product
>>>>>> Name/TUF
>>>>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>>>> [ 2925.640146] RIP: 0010: (null)
>>>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX:
>>>>>> 00000000007c0080
>>>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI:
>>>>>> ffff8801d8b38420
>>>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09:
>>>>>> ffffed003aa9eef9
>>>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12:
>>>>>> ffff8801d8b3277c
>>>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15:
>>>>>> 0000000000000000
>>>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>>>>> knlGS:0000000000000000
>>>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4:
>>>>>> 00000000003406e0
>>>>>> [ 2925.640176] Call Trace:
>>>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>>>> [ 2925.641387] Code: Bad RIP value.
>>>>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>>>>> [ 2925.641400] CR2: 0000000000000000
>>>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>>>
>>>>>>
>>>>>> Note that regular compute/gfx workflows work fine on the tip of
>>>>>> drm-next; only vcn playback triggers this (haven't tried encode yet...).
>>>>>>
>>>>>> Cheers,
>>>>>> Tom
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: vcn regression on raven1
[not found] ` <97f75a85-b805-758c-e8a4-492a865746e3-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-03 0:42 ` Zhang, Jerry (Junwei)
0 siblings, 0 replies; 12+ messages in thread
From: Zhang, Jerry (Junwei) @ 2018-05-03 0:42 UTC (permalink / raw)
To: Tom St Denis, Deucher, Alexander
Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi Tom,
Thanks for your update. That's good news.
If necessary, please also send out your patch to improve the functionality.
Thanks.
Jerry
On 05/02/2018 06:21 PM, Tom St Denis wrote:
> Hi Jerry,
>
> Just got up and going (6am ... ugh early). I see the confusion. Yes, there is a
> patch on drm-next, but the problem is that there are separate callback tables for
> decode and encode. The patch that is already on drm-next only adds the callback
> for encode.
>
> My patch adds the callback for decode as well. :-)
>
> Cheers,
> Tom
>
>
>
> On 05/01/2018 09:44 PM, Zhang, Jerry (Junwei) wrote:
>> Hi Tom,
>>
>> Ha, got your meaning.
>> Please check it with the latest drm-next from gerrit tomorrow.
>>
>> Jerry
>>
>> On 05/02/2018 09:41 AM, StDenis, Tom wrote:
>>> Hi Jerry,
>>>
>>> Like I said it's (now well) past EOD (meaning my workstation is powered off)
>>> so I'll have to check tomorrow. But I do pull from gerrit daily and build
>>> from that.
>>>
>>> I'll take a look in the morning.
>>>
>>> Cheers,
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 21:39
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> Hi Tom,
>>>
>>> Do you mean you cannot find the patch from gerrit/amd-staging-dkms-next either?
>>>
>>> I do find it.
>>>
>>> the tip of gerrit/amd-staging-drm-next is
>>> * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait
>>> to recover from ring hang. <Andrey Grodzovsky>
>>>
>>> while the tip of freedesktop is
>>> * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for
>>> VEGAM <Eric Huang>
>>>
>>> Jerry
>>>
>>> On 05/02/2018 09:29 AM, StDenis, Tom wrote:
>>>> I pull from gerrit. I'm just pointing out that it's not on drm-next
>>>> upstream either.
>>>>
>>>> It may have been missed in a rebase or something.
>>>>
>>>> Tom
>>>> ________________________________________
>>>> From: Zhang, Jerry
>>>> Sent: Tuesday, May 1, 2018 21:07
>>>> To: StDenis, Tom; Deucher, Alexander
>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>> Subject: Re: vcn regression on raven1
>>>>
>>>> Hi Tom,
>>>>
>>>> It sounds like you got the code from freedesktop rather than the internal drm-next.
>>>> Unfortunately, freedesktop seems to lag behind in syncing the code from the internal drm-next.
>>>> That gap is what caused the issue in the test.
>>>>
>>>> Hi Alex,
>>>>
>>>> Is this a problem with code syncing between freedesktop and the internal drm-next?
>>>> Or is the sync delay a known issue?
>>>>
>>>> Jerry
>>>>
>>>> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>>>>> Hi Jerry,
>>>>>
>>>>> It's well past EOD for me, so I'll pick this up in the morning.
>>>>>
>>>>> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next
>>>>> as of my pull this morning though.
>>>>>
>>>>> If it's in there and I missed it somehow, I apologize; otherwise it'd be nice
>>>>> to make sure it gets in.
>>>>>
>>>>> Based on the public copy of the tree, it's not there:
>>>>>
>>>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Tom
>>>>> ________________________________________
>>>>> From: Zhang, Jerry
>>>>> Sent: Tuesday, May 1, 2018 20:52
>>>>> To: StDenis, Tom; Deucher, Alexander
>>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>>> Subject: Re: vcn regression on raven1
>>>>>
>>>>> Hi Tom,
>>>>>
>>>>> It landed in the latest drm-next as:
>>>>> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add
>>>>> emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>>>>
>>>>> Did you test with that included?
>>>>> If not, please try the latest drm-next.
>>>>> From the log, they look like the same issue.
>>>>>
>>>>> Jerry
>>>>>
>>>>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>>>>> Hi Jerry,
>>>>>>
>>>>>> So far as I know this wasn't included on the tip of drm-next. I hit this
>>>>>> this morning in my semi-regular pull/build/test cycle.
>>>>>>
>>>>>> Was this missed in a recent rebase?
>>>>>>
>>>>>> Tom
>>>>>> ________________________________________
>>>>>> From: Zhang, Jerry
>>>>>> Sent: Tuesday, May 1, 2018 20:43
>>>>>> To: StDenis, Tom; Deucher, Alexander
>>>>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>>>>> Subject: Re: vcn regression on raven1
>>>>>>
>>>>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I've noticed that on the tip of drm-next vcn playback of video is broken
>>>>>>> (see
>>>>>>> dmesg below). I've bisected it to this commit
>>>>>>
>>>>>> It may be fixed here as a common issue.
>>>>>>
>>>>>> * https://patchwork.freedesktop.org/patch/218909/
>>>>>>
>>>>>> Jerry
>>>>>>
>>>>>>>
>>>>>>> [root@raven linux]# git bisect good
>>>>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>>>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>>>>>> Date: Tue Mar 27 17:10:56 2018 -0500
>>>>>>>
>>>>>>> drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu
>>>>>>> tlb flush
>>>>>>>
>>>>>>> Use amdgpu_ring_emit_reg_write_reg_wait. On engines that
>>>>>>> support it,
>>>>>>> it provides a write and wait in a single packet which avoids a
>>>>>>> missed
>>>>>>> ack if a world switch happens between the request and waiting
>>>>>>> for the
>>>>>>> ack.
>>>>>>>
>>>>>>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>>>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>
>>>>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba
>>>>>>> ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>>>>>>>
>>>>>>> Which is odd because the commit before this is the vcn change and it
>>>>>>> works fine
>>>>>>> (playing BBB right now).
>>>>>>>
>>>>>>> Here's the dmesg:
>>>>>>>
>>>>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>>>>>> 0000000000000000
>>>>>>> [ 2925.640113] IP: (null)
>>>>>>> [ 2925.640116] PGD 0 P4D 0
>>>>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>>>>>> gpu_sched ttm ax88179_178a usbnet
>>>>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>>>>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>>>>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>>>>> [ 2925.640146] RIP: 0010: (null)
>>>>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX:
>>>>>>> 00000000007c0080
>>>>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI:
>>>>>>> ffff8801d8b38420
>>>>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09:
>>>>>>> ffffed003aa9eef9
>>>>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12:
>>>>>>> ffff8801d8b3277c
>>>>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15:
>>>>>>> 0000000000000000
>>>>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000)
>>>>>>> knlGS:0000000000000000
>>>>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4:
>>>>>>> 00000000003406e0
>>>>>>> [ 2925.640176] Call Trace:
>>>>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>>>>> [ 2925.641387] Code: Bad RIP value.
>>>>>>> [ 2925.641397] RIP: (null) RSP: ffff8801d54f7790
>>>>>>> [ 2925.641400] CR2: 0000000000000000
>>>>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>>>>
>>>>>>>
>>>>>>> Note that regular compute/gfx workflows work fine on the tip of drm-next;
>>>>>>> only vcn playback triggers this (haven't tried encode yet...).
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Tom
>>>>>>> _______________________________________________
>>>>>>> amd-gfx mailing list
>>>>>>> amd-gfx@lists.freedesktop.org
>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx