* [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
@ 2023-08-11 19:11 James Zhu
  2023-08-11 20:06 ` Felix Kuehling
  2023-08-11 21:12 ` Chen, Xiaogang
  0 siblings, 2 replies; 11+ messages in thread
From: James Zhu @ 2023-08-11 19:11 UTC (permalink / raw)
  To: amd-gfx; +Cc: Felix.kuehling, jamesz, Roger.Madrid

update_list can be long in list_for_each_entry(prange, &update_list, update_list),
and mmap_read_lock(mm) is held the whole time. Adding schedule() removes the
RCU stall on the CPU in this case.

RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
__mmu_notifier_invalidate_range_start+0x132/0x1d0
? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
migrate_vma_setup+0x6c7/0x8f0
? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
svm_range_set_attr+0xe34/0x11a0 [amdgpu]
kfd_ioctl+0x271/0x4e0 [amdgpu]
? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
__x64_sys_ioctl+0x92/0xd0

Signed-off-by: James Zhu <James.Zhu@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 113fd11aa96e..9f2d48ade7fa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
 		r = svm_range_trigger_migration(mm, prange, &migrated);
 		if (r)
 			goto out_unlock_range;
+		schedule();
 
 		if (migrated && (!p->xnack_enabled ||
 		    (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
-- 
2.34.1



* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 19:11 [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU James Zhu
@ 2023-08-11 20:06 ` Felix Kuehling
  2023-08-11 20:50   ` James Zhu
  2023-08-11 21:12 ` Chen, Xiaogang
  1 sibling, 1 reply; 11+ messages in thread
From: Felix Kuehling @ 2023-08-11 20:06 UTC (permalink / raw)
  To: James Zhu, amd-gfx; +Cc: jamesz, Roger.Madrid


On 2023-08-11 15:11, James Zhu wrote:
> update_list can be long in list_for_each_entry(prange, &update_list, update_list),
> and mmap_read_lock(mm) is held the whole time. Adding schedule() removes the
> RCU stall on the CPU in this case.
>
> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]

You're just showing the backtrace here, but not what the problem is. Can 
you include more context, e.g. the message that says something about a 
stall?


> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
> __mmu_notifier_invalidate_range_start+0x132/0x1d0
> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
> migrate_vma_setup+0x6c7/0x8f0
> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
> kfd_ioctl+0x271/0x4e0 [amdgpu]
> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
> __x64_sys_ioctl+0x92/0xd0
>
> Signed-off-by: James Zhu <James.Zhu@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 113fd11aa96e..9f2d48ade7fa 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
>   		r = svm_range_trigger_migration(mm, prange, &migrated);
>   		if (r)
>   			goto out_unlock_range;
> +		schedule();

I'm not sure that unconditionally scheduling here in every loop 
iteration is a good solution. This could lead to performance degradation 
when there are many small ranges. I think a better option is to call 
cond_resched. That would reschedule only "if necessary", though I 
haven't quite figured out the criteria for rescheduling being necessary.
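
For illustration, a minimal sketch of that alternative in the same loop
(not the submitted patch, just the shape I have in mind):

		r = svm_range_trigger_migration(mm, prange, &migrated);
		if (r)
			goto out_unlock_range;
		/* Yield only when the scheduler actually wants the CPU
		 * back; on preemptible kernels this is nearly free, so
		 * short update lists keep their performance.
		 */
		cond_resched();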

Regards,
   Felix


>   
>   		if (migrated && (!p->xnack_enabled ||
>   		    (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 20:06 ` Felix Kuehling
@ 2023-08-11 20:50   ` James Zhu
  2023-08-11 21:13     ` Felix Kuehling
  0 siblings, 1 reply; 11+ messages in thread
From: James Zhu @ 2023-08-11 20:50 UTC (permalink / raw)
  To: Felix Kuehling, James Zhu, amd-gfx; +Cc: Roger.Madrid


On 2023-08-11 16:06, Felix Kuehling wrote:
>
> On 2023-08-11 15:11, James Zhu wrote:
>> update_list can be long in list_for_each_entry(prange, &update_list,
>> update_list), and mmap_read_lock(mm) is held the whole time. Adding
>> schedule() removes the RCU stall on the CPU in this case.
>>
>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>
> You're just showing the backtrace here, but not what the problem is. 
> Can you include more context, e.g. the message that says something 
> about a stall?

[JZ] I attached more of the log here, and will update the patch later.

2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: INFO: rcu_sched 
self-detected stall on CPU
2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: #01134-....: (59947 
ticks this GP) idle=7f6/1/0x4000000000000000 softirq=1735/1735 fqs=29977
2023-07-20T14:15:39-04:00 frontier06693 kernel: #011(t=60006 jiffies 
g=3265905 q=15150)
2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: CPU 34: RCU dump 
cpu stacks:
2023-07-20T14:15:39-04:00 frontier06693 kernel: NMI backtrace for cpu 34
2023-07-20T14:15:39-04:00 frontier06693 kernel: CPU: 34 PID: 72044 Comm: 
ncsd-it-hip.exe Kdump: loaded Tainted: G           OE 
5.14.21-150400.24.46_12.0.83-cray_shasta_c #1 SLE15-SP4 (unreleased)
2023-07-20T14:15:39-04:00 frontier06693 kernel: Hardware name: HPE 
HPE_CRAY_EX235A/HPE CRAY EX235A, BIOS 1.6.2 03-22-2023
2023-07-20T14:15:39-04:00 frontier06693 kernel: Call Trace:
2023-07-20T14:15:39-04:00 frontier06693 kernel: <IRQ>
2023-07-20T14:15:39-04:00 frontier06693 kernel: dump_stack_lvl+0x44/0x5b
2023-07-20T14:15:39-04:00 frontier06693 kernel: nmi_cpu_backtrace+0xdd/0xe0
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
lapic_can_unplug_cpu+0xa0/0xa0
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
nmi_trigger_cpumask_backtrace+0xfd/0x130
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
rcu_dump_cpu_stacks+0x13b/0x180
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
rcu_sched_clock_irq+0x6cb/0x930
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
trigger_load_balance+0x158/0x390
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? scheduler_tick+0xe1/0x290
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
update_process_times+0x8c/0xb0
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
tick_sched_handle.isra.21+0x1d/0x60
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
tick_sched_handle.isra.21+0x60/0x60
2023-07-20T14:15:39-04:00 frontier06693 kernel: tick_sched_timer+0x67/0x80
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
tick_sched_handle.isra.21+0x60/0x60
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
__hrtimer_run_queues+0xa0/0x2b0
2023-07-20T14:15:39-04:00 frontier06693 kernel: hrtimer_interrupt+0xe5/0x250
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
__sysvec_apic_timer_interrupt+0x62/0x100
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
sysvec_apic_timer_interrupt+0x4b/0x90
2023-07-20T14:15:39-04:00 frontier06693 kernel: </IRQ>
2023-07-20T14:15:39-04:00 frontier06693 kernel: <TASK>
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
asm_sysvec_apic_timer_interrupt+0x12/0x20
2023-07-20T14:15:39-04:00 frontier06693 kernel: RIP: 
0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: Code: 00 00 00 bf 00 02 
00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 48 8b 14 25 00 bd 01 00 8b 
42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 42 2c e8 51 dd 2d e1 48 8b 7b 
38 e8 98 29 b7 e0 48 83 c4 30 b8
2023-07-20T14:15:39-04:00 frontier06693 kernel: RSP: 
0018:ffffc9000ffd7b10 EFLAGS: 00000206
2023-07-20T14:15:39-04:00 frontier06693 kernel: RAX: 0000000000000100 
RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
2023-07-20T14:15:39-04:00 frontier06693 kernel: RDX: ffff88e18ef1ec80 
RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
2023-07-20T14:15:39-04:00 frontier06693 kernel: RBP: 000000000003062e 
R08: 000000003042f000 R09: 000000003062efff
2023-07-20T14:15:39-04:00 frontier06693 kernel: R10: 0000000000001000 
R11: ffff88c1ad255000 R12: 000000000003042f
2023-07-20T14:15:39-04:00 frontier06693 kernel: R13: ffff88c493968c00 
R14: ffffc9000ffd7be0 R15: ffff88c493968c00
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
__mmu_notifier_invalidate_range_start+0x132/0x1d0
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
migrate_vma_setup+0x6c7/0x8f0
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
svm_range_set_attr+0xe34/0x11a0 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: kfd_ioctl+0x271/0x4e0 
[amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
2023-07-20T14:15:39-04:00 frontier06693 kernel: __x64_sys_ioctl+0x92/0xd0
2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
trace_hardirqs_on+0x2a/0xc0
2023-07-20T14:15:39-04:00 frontier06693 kernel: do_syscall_64+0x42/0xc0
2023-07-20T14:15:39-04:00 frontier06693 kernel: 
entry_SYSCALL_64_after_hwframe+0x61/0xcb

>
>
>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 
>> 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 
>> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>> migrate_vma_setup+0x6c7/0x8f0
>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>> __x64_sys_ioctl+0x92/0xd0
>>
>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 113fd11aa96e..9f2d48ade7fa 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>> struct mm_struct *mm,
>>           r = svm_range_trigger_migration(mm, prange, &migrated);
>>           if (r)
>>               goto out_unlock_range;
>> +        schedule();
>
> I'm not sure that unconditionally scheduling here in every loop 
> iteration is a good solution. This could lead to performance 
> degradation when there are many small ranges. I think a better option 
> is to call cond_resched. That would reschedule only "if 
> necessary", though I haven't quite figured out the criteria for 
> rescheduling being necessary.
[JZ] You are right, small ranges will sacrifice performance, but 
cond_resched has no guarantee of removing the RCU CPU stall completely. 
Maybe we should add our own condition check here, based on the 
accumulated pranges that have been processed.
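
Something like this rough sketch (the counter and the batch size of 32
are made-up placeholders that would need tuning):

		/* Hypothetical: yield only after a batch of ranges, so
		 * small updates are unaffected while long update_lists
		 * still give up the CPU periodically.
		 */
		if (++nr_ranges_done % 32 == 0)
			schedule();

nr_ranges_done would be a new local counter in svm_range_set_attr.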
>
> Regards,
>   Felix
>
>
>>             if (migrated && (!p->xnack_enabled ||
>>               (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 19:11 [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU James Zhu
  2023-08-11 20:06 ` Felix Kuehling
@ 2023-08-11 21:12 ` Chen, Xiaogang
  2023-08-11 21:22   ` Felix Kuehling
  2023-08-12  0:24   ` James Zhu
  1 sibling, 2 replies; 11+ messages in thread
From: Chen, Xiaogang @ 2023-08-11 21:12 UTC (permalink / raw)
  To: James Zhu, amd-gfx; +Cc: Felix.kuehling, jamesz, Roger.Madrid


I know the original jira ticket. The system got an RCU cpu stall, then 
the kernel entered panic, and then there was no response or ssh. This 
patch lets the prange-list-update task yield the cpu after each range 
update. It can prevent the task from holding the mm lock too long. The 
mm lock is an rw_semaphore, not an RCU mechanism. Can you explain how 
that can prevent the RCU cpu stall in this case?

Regards

Xiaogang

On 8/11/2023 2:11 PM, James Zhu wrote:
> update_list can be long in list_for_each_entry(prange, &update_list, update_list),
> and mmap_read_lock(mm) is held the whole time. Adding schedule() removes the
> RCU stall on the CPU in this case.
>
> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
> __mmu_notifier_invalidate_range_start+0x132/0x1d0
> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
> migrate_vma_setup+0x6c7/0x8f0
> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
> kfd_ioctl+0x271/0x4e0 [amdgpu]
> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
> __x64_sys_ioctl+0x92/0xd0
>
> Signed-off-by: James Zhu <James.Zhu@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 113fd11aa96e..9f2d48ade7fa 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
>                  r = svm_range_trigger_migration(mm, prange, &migrated);
>                  if (r)
>                          goto out_unlock_range;
> +               schedule();
>
>                  if (migrated && (!p->xnack_enabled ||
>                      (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
> --
> 2.34.1
>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 20:50   ` James Zhu
@ 2023-08-11 21:13     ` Felix Kuehling
  0 siblings, 0 replies; 11+ messages in thread
From: Felix Kuehling @ 2023-08-11 21:13 UTC (permalink / raw)
  To: James Zhu, James Zhu, amd-gfx; +Cc: Roger.Madrid

I don't understand why this loop is causing a stall. These stall 
warnings indicate that there is an RCU grace period that's not making 
progress. That means there must be an RCU read critical section that's 
being blocked. But there is no RCU read-side critical section in the 
svm_range_set_attr function. You mentioned the mmap read lock. But why 
is that causing an issue? Does it trigger any of the conditions listed 
in Documentation/RCU/stallwarn.rst?

-       A CPU looping in an RCU read-side critical section.
-       A CPU looping with interrupts disabled.
-       A CPU looping with preemption disabled.
-       A CPU looping with bottom halves disabled.

Or is there another thread that has an mmap_write_lock inside an RCU 
read critical section that's getting stalled by the mmap_read_lock?
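
Hypothetically, the pattern I mean would look like this (an
illustration only; whether any real thread does this is exactly the
question):

	rcu_read_lock();
	...
	mmap_write_lock(mm);	/* sleeps behind the long-held mmap_read_lock(mm) */
	...
	mmap_write_unlock(mm);
	rcu_read_unlock();

Sleeping on an rw_semaphore inside a plain RCU read-side critical
section would itself be a bug, so if the other backtraces show
something like this, that would be the real thing to fix.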

Regards,
   Felix


On 2023-08-11 16:50, James Zhu wrote:
>
> On 2023-08-11 16:06, Felix Kuehling wrote:
>>
>> On 2023-08-11 15:11, James Zhu wrote:
>>> update_list can be long in list_for_each_entry(prange, &update_list,
>>> update_list), and mmap_read_lock(mm) is held the whole time. Adding
>>> schedule() removes the RCU stall on the CPU in this case.
>>>
>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>
>> You're just showing the backtrace here, but not what the problem is. 
>> Can you include more context, e.g. the message that says something 
>> about a stall?
>
> [JZ] I attached more of the log here, and will update the patch later.
>
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: INFO: rcu_sched 
> self-detected stall on CPU
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: #01134-....: 
> (59947 ticks this GP) idle=7f6/1/0x4000000000000000 softirq=1735/1735 
> fqs=29977
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: #011(t=60006 jiffies 
> g=3265905 q=15150)
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: rcu: CPU 34: RCU dump 
> cpu stacks:
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: NMI backtrace for cpu 34
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: CPU: 34 PID: 72044 
> Comm: ncsd-it-hip.exe Kdump: loaded Tainted: G           OE 
> 5.14.21-150400.24.46_12.0.83-cray_shasta_c #1 SLE15-SP4 (unreleased)
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: Hardware name: HPE 
> HPE_CRAY_EX235A/HPE CRAY EX235A, BIOS 1.6.2 03-22-2023
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: Call Trace:
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: <IRQ>
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: dump_stack_lvl+0x44/0x5b
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> nmi_cpu_backtrace+0xdd/0xe0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> lapic_can_unplug_cpu+0xa0/0xa0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> nmi_trigger_cpumask_backtrace+0xfd/0x130
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> rcu_dump_cpu_stacks+0x13b/0x180
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> rcu_sched_clock_irq+0x6cb/0x930
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> trigger_load_balance+0x158/0x390
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> scheduler_tick+0xe1/0x290
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> update_process_times+0x8c/0xb0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> tick_sched_handle.isra.21+0x1d/0x60
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> tick_sched_handle.isra.21+0x60/0x60
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> tick_sched_timer+0x67/0x80
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> tick_sched_handle.isra.21+0x60/0x60
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> __hrtimer_run_queues+0xa0/0x2b0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> hrtimer_interrupt+0xe5/0x250
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> __sysvec_apic_timer_interrupt+0x62/0x100
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> sysvec_apic_timer_interrupt+0x4b/0x90
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: </IRQ>
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: <TASK>
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> asm_sysvec_apic_timer_interrupt+0x12/0x20
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: RIP: 
> 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: Code: 00 00 00 bf 00 
> 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 48 8b 14 25 00 bd 01 
> 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 42 2c e8 51 dd 2d e1 48 
> 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: RSP: 
> 0018:ffffc9000ffd7b10 EFLAGS: 00000206
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: RAX: 0000000000000100 
> RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: RDX: ffff88e18ef1ec80 
> RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: RBP: 000000000003062e 
> R08: 000000003042f000 R09: 000000003062efff
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: R10: 0000000000001000 
> R11: ffff88c1ad255000 R12: 000000000003042f
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: R13: ffff88c493968c00 
> R14: ffffc9000ffd7be0 R15: ffff88c493968c00
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> __mmu_notifier_invalidate_range_start+0x132/0x1d0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> migrate_vma_setup+0x6c7/0x8f0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: kfd_ioctl+0x271/0x4e0 
> [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: __x64_sys_ioctl+0x92/0xd0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: ? 
> trace_hardirqs_on+0x2a/0xc0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: do_syscall_64+0x42/0xc0
> 2023-07-20T14:15:39-04:00 frontier06693 kernel: 
> entry_SYSCALL_64_after_hwframe+0x61/0xcb
>
>>
>>
>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 
>>> 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 
>>> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>> migrate_vma_setup+0x6c7/0x8f0
>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>> __x64_sys_ioctl+0x92/0xd0
>>>
>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>> struct mm_struct *mm,
>>>           r = svm_range_trigger_migration(mm, prange, &migrated);
>>>           if (r)
>>>               goto out_unlock_range;
>>> +        schedule();
>>
>> I'm not sure that unconditionally scheduling here in every loop 
>> iteration is a good solution. This could lead to performance 
>> degradation when there are many small ranges. I think a better option 
>> is to call cond_resched. That would reschedule only "if 
>> necessary", though I haven't quite figured out the criteria for 
>> rescheduling being necessary.
> [JZ] You are right, small ranges will sacrifice performance, but
> cond_resched has no guarantee of removing the RCU CPU stall completely.
> Maybe we should add our own condition check here, based on the
> accumulated pranges that have been processed.
>>
>> Regards,
>>   Felix
>>
>>
>>>             if (migrated && (!p->xnack_enabled ||
>>>               (prange->flags & 
>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:12 ` Chen, Xiaogang
@ 2023-08-11 21:22   ` Felix Kuehling
  2023-08-11 21:27     ` Chen, Xiaogang
  2023-08-12  0:24   ` James Zhu
  1 sibling, 1 reply; 11+ messages in thread
From: Felix Kuehling @ 2023-08-11 21:22 UTC (permalink / raw)
  To: Chen, Xiaogang, James Zhu, amd-gfx; +Cc: jamesz, Roger.Madrid

On 2023-08-11 17:12, Chen, Xiaogang wrote:
>
> I know the original jira ticket. The system got an RCU cpu stall, then
> the kernel entered panic, and then there was no response or ssh. This
> patch lets the prange-list-update task yield the cpu after each range
> update. It can prevent the task from holding the mm lock too long.

Calling schedule does not drop the lock. If anything, it causes the lock 
to be held longer, because the function takes longer to complete.

Regards,
   Felix


> The mm lock is an rw_semaphore, not an RCU mechanism. Can you explain
> how that can prevent the RCU cpu stall in this case?
>
> Regards
>
> Xiaogang
>
> On 8/11/2023 2:11 PM, James Zhu wrote:
>> update_list can be long in list_for_each_entry(prange, &update_list,
>> update_list), and mmap_read_lock(mm) is held the whole time. Adding
>> schedule() removes the RCU stall on the CPU in this case.
>>
>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 
>> 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 
>> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>> migrate_vma_setup+0x6c7/0x8f0
>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>> __x64_sys_ioctl+0x92/0xd0
>>
>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 113fd11aa96e..9f2d48ade7fa 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>> struct mm_struct *mm,
>>                  r = svm_range_trigger_migration(mm, prange, &migrated);
>>                  if (r)
>>                          goto out_unlock_range;
>> +               schedule();
>>
>>                  if (migrated && (!p->xnack_enabled ||
>>                      (prange->flags & 
>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>> -- 
>> 2.34.1
>>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:22   ` Felix Kuehling
@ 2023-08-11 21:27     ` Chen, Xiaogang
  2023-08-11 21:31       ` Felix Kuehling
  2023-08-12  0:28       ` James Zhu
  0 siblings, 2 replies; 11+ messages in thread
From: Chen, Xiaogang @ 2023-08-11 21:27 UTC (permalink / raw)
  To: Felix Kuehling, James Zhu, amd-gfx; +Cc: jamesz, Roger.Madrid


On 8/11/2023 4:22 PM, Felix Kuehling wrote:
> On 2023-08-11 17:12, Chen, Xiaogang wrote:
>>
>> I know the original jira ticket. The system got an RCU cpu stall, then
>> the kernel entered panic, and then there was no response or ssh. This
>> patch lets the prange-list-update task yield the cpu after each range
>> update. It can prevent the task from holding the mm lock too long.
>
> Calling schedule does not drop the lock. If anything, it causes the 
> lock to be held longer, because the function takes longer to complete.
>
> Regards,
>   Felix
>
Right. I do not see either how this patch targets the root cause. It is 
on a customer system that can have many RCU operations (not necessarily 
from our code). Any read-side critical section can cause a write stall.

I think we can try some RCU parameters first to see if things change: 
like CONFIG_RCU_CPU_STALL_TIMEOUT to increase the grace period, or 
rcupdate.rcu_cpu_stall_suppress to suppress the RCU stall warning.
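
For example (the parameter names are from the RCU stall-warning
documentation; the values are just placeholders to experiment with):

	# on the kernel command line:
	rcupdate.rcu_cpu_stall_timeout=120 rcupdate.rcu_cpu_stall_suppress=1

	# or at run time:
	echo 120 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
	echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress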

Regards

Xiaogang

>> The mm lock is an rw_semaphore, not an RCU mechanism. Can you explain
>> how that can prevent the RCU cpu stall in this case?
>>
>> Regards
>>
>> Xiaogang
>>
>> On 8/11/2023 2:11 PM, James Zhu wrote:
>>> update_list can be long in list_for_each_entry(prange, &update_list,
>>> update_list), and mmap_read_lock(mm) is held the whole time. Adding
>>> schedule() removes the RCU stall on the CPU in this case.
>>>
>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 
>>> 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 
>>> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>> migrate_vma_setup+0x6c7/0x8f0
>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>> __x64_sys_ioctl+0x92/0xd0
>>>
>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>> struct mm_struct *mm,
>>>                  r = svm_range_trigger_migration(mm, prange, 
>>> &migrated);
>>>                  if (r)
>>>                          goto out_unlock_range;
>>> +               schedule();
>>>
>>>                  if (migrated && (!p->xnack_enabled ||
>>>                      (prange->flags & 
>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>> -- 
>>> 2.34.1
>>>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:27     ` Chen, Xiaogang
@ 2023-08-11 21:31       ` Felix Kuehling
  2023-08-11 22:00         ` Chen, Xiaogang
  2023-08-12  0:28       ` James Zhu
  1 sibling, 1 reply; 11+ messages in thread
From: Felix Kuehling @ 2023-08-11 21:31 UTC (permalink / raw)
  To: Chen, Xiaogang, James Zhu, amd-gfx; +Cc: jamesz, Roger.Madrid

If you have a complete kernel log, it may be worth looking at backtraces 
from other threads, to better understand the interactions. I'd expect 
that there is a thread there that's in an RCU read critical section. It 
may not be in our driver, though. If it's a customer system, it may also 
help to see the kernel config. Maybe the kernel was configured without 
preemption:

-       For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
         without invoking schedule().  If the looping in the kernel is
         really expected and desirable behavior, you might need to add
         some calls to cond_resched().

But then I would expect cond_resched() to fix the problem, according to 
this document.

Regards,
   Felix


On 2023-08-11 17:27, Chen, Xiaogang wrote:
>
> On 8/11/2023 4:22 PM, Felix Kuehling wrote:
>> On 2023-08-11 17:12, Chen, Xiaogang wrote:
>>>
>>> I know the original jira ticket. The system got an RCU cpu stall, then
>>> the kernel entered panic, and then there was no response or ssh. This
>>> patch lets the prange-list-update task yield the cpu after each range
>>> update. It can prevent the task from holding the mm lock too long.
>>
>> Calling schedule does not drop the lock. If anything, it causes the 
>> lock to be held longer, because the function takes longer to complete.
>>
>> Regards,
>>   Felix
>>
> Right. I do not see either how this patch targets the root cause. It is
> on a customer system that can have many RCU operations (not necessarily
> from our code). Any read-side critical section can cause a write stall.
>
> I think we can try some RCU parameters first to see if things change:
> like CONFIG_RCU_CPU_STALL_TIMEOUT to increase the grace period, or
> rcupdate.rcu_cpu_stall_suppress to suppress the RCU stall warning.
>
> Regards
>
> Xiaogang
>
>>> The mm lock is an rw_semaphore, not an RCU mechanism. Can you explain
>>> how that can prevent the RCU cpu stall in this case?
>>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>> On 8/11/2023 2:11 PM, James Zhu wrote:
>>>> update_list can be long in list_for_each_entry(prange,
>>>> &update_list, update_list), and mmap_read_lock(mm) is held the
>>>> whole time. Adding schedule() removes the RCU stall on the CPU in
>>>> this case.
>>>>
>>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 
>>>> 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 
>>>> <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>>> migrate_vma_setup+0x6c7/0x8f0
>>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>>> __x64_sys_ioctl+0x92/0xd0
>>>>
>>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>>   1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>>> struct mm_struct *mm,
>>>>                  r = svm_range_trigger_migration(mm, prange, 
>>>> &migrated);
>>>>                  if (r)
>>>>                          goto out_unlock_range;
>>>> +               schedule();
>>>>
>>>>                  if (migrated && (!p->xnack_enabled ||
>>>>                      (prange->flags & 
>>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>>> -- 
>>>> 2.34.1
>>>>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:31       ` Felix Kuehling
@ 2023-08-11 22:00         ` Chen, Xiaogang
  0 siblings, 0 replies; 11+ messages in thread
From: Chen, Xiaogang @ 2023-08-11 22:00 UTC (permalink / raw)
  To: Felix Kuehling, James Zhu, amd-gfx; +Cc: jamesz, Roger.Madrid


One checkpoint: I saw they use a serial port for the console in the 
kernel parameters: console=ttyS0,115200n8

The stall-warning documentation lists this as a possible cause:

    Booting Linux using a console connection that is too slow to keep up
    with the boot-time console-message rate. For example, a 115Kbaud
    serial console can be *way* too slow to keep up with boot-time message
    rates, and will frequently result in RCU CPU stall warning messages.
    Especially if you have added debug printk()s
    <https://www.kernel.org/doc/html/latest/core-api/printk-basics.html#c.printk>.


On 8/11/2023 4:31 PM, Felix Kuehling wrote:
> If you have a complete kernel log, it may be worth looking at 
> backtraces from other threads, to better understand the interactions. 
> I'd expect that there is a thread there that's in an RCU read critical 
> section. It may not be in our driver, though. If it's a customer 
> system, it may also help to see the kernel config. Maybe the kernel 
> was configured without preemption:
>
> -       For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the 
> kernel
>         without invoking schedule().  If the looping in the kernel is
>         really expected and desirable behavior, you might need to add
>         some calls to cond_resched().
>
> But then I would expect cond_resched() to fix the problem, according 
> to this document.
>
> Regards,
>   Felix
>
>
> On 2023-08-11 17:27, Chen, Xiaogang wrote:
>>
>> On 8/11/2023 4:22 PM, Felix Kuehling wrote:
>>> On 2023-08-11 17:12, Chen, Xiaogang wrote:
>>>>
>>>> I know the original jira ticket. The system got an RCU cpu stall,
>>>> then the kernel entered panic, and then there was no response or
>>>> ssh. This patch lets the prange-list-update task yield the cpu after
>>>> each range update. It can prevent the task from holding the mm lock
>>>> too long.
>>>
>>> Calling schedule does not drop the lock. If anything, it causes the 
>>> lock to be held longer, because the function takes longer to complete.
>>>
>>> Regards,
>>>   Felix
>>>
>> Right. I do not see either how this patch targets the root cause. It
>> is on a customer system that can have many RCU operations (not
>> necessarily from our code). Any read-side critical section can cause a
>> write stall.
>>
>> I think we can try some RCU parameters first to see if things change:
>> like CONFIG_RCU_CPU_STALL_TIMEOUT to increase the grace period, or
>> rcupdate.rcu_cpu_stall_suppress to suppress the RCU stall warning.
>>
>> Regards
>>
>> Xiaogang
>>
>>>> The mm lock is an rw_semaphore, not an RCU mechanism. Can you
>>>> explain how that can prevent the RCU cpu stall in this case?
>>>>
>>>> Regards
>>>>
>>>> Xiaogang
>>>>
>>>> On 8/11/2023 2:11 PM, James Zhu wrote:
>>>>> update_list can be long in list_for_each_entry(prange,
>>>>> &update_list, update_list), and mmap_read_lock(mm) is held the
>>>>> whole time. Adding schedule() removes the RCU stall on the CPU in
>>>>> this case.
>>>>>
>>>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 
>>>>> 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 
>>>>> <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>>>> migrate_vma_setup+0x6c7/0x8f0
>>>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>>>> __x64_sys_ioctl+0x92/0xd0
>>>>>
>>>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>>>   1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>>>> struct mm_struct *mm,
>>>>>                  r = svm_range_trigger_migration(mm, prange, 
>>>>> &migrated);
>>>>>                  if (r)
>>>>>                          goto out_unlock_range;
>>>>> +               schedule();
>>>>>
>>>>>                  if (migrated && (!p->xnack_enabled ||
>>>>>                      (prange->flags & 
>>>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>>>> -- 
>>>>> 2.34.1
>>>>>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:12 ` Chen, Xiaogang
  2023-08-11 21:22   ` Felix Kuehling
@ 2023-08-12  0:24   ` James Zhu
  1 sibling, 0 replies; 11+ messages in thread
From: James Zhu @ 2023-08-12  0:24 UTC (permalink / raw)
  To: Chen, Xiaogang, James Zhu, amd-gfx; +Cc: Felix.kuehling, Roger.Madrid

- Removed others; continuing the discussion internally.

On 2023-08-11 17:12, Chen, Xiaogang wrote:
>
> I know the original jira ticket. The system got an RCU cpu stall, then
> the kernel entered panic, and then there was no response or ssh. This
> patch lets the prange-list-update task yield the cpu after each range
> update. It can prevent the task from holding the mm lock too long. The
> mm lock is an rw_semaphore, not an RCU mechanism. Can you explain how
> that can prevent the RCU cpu stall in this case?
[JZ] I can't find the exact rcu_read_lock either; there are many 
different locks protecting this period. The mm lock is an rw_semaphore; 
I suspect it is implemented with an RCU mechanism somewhere.
>
> Regards
>
> Xiaogang
>
> On 8/11/2023 2:11 PM, James Zhu wrote:
>> update_list can be long in list_for_each_entry(prange, &update_list,
>> update_list), and mmap_read_lock(mm) is held the whole time. Adding
>> schedule() removes the RCU stall on the CPU in this case.
>>
>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 65 
>> 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 <89> 
>> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>> migrate_vma_setup+0x6c7/0x8f0
>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>> __x64_sys_ioctl+0x92/0xd0
>>
>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 113fd11aa96e..9f2d48ade7fa 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>> struct mm_struct *mm,
>>                  r = svm_range_trigger_migration(mm, prange, &migrated);
>>                  if (r)
>>                          goto out_unlock_range;
>> +               schedule();
>>
>>                  if (migrated && (!p->xnack_enabled ||
>>                      (prange->flags & 
>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>> -- 
>> 2.34.1
>>


* Re: [PATCH] drm/amdkfd: add schedule to remove RCU stall on CPU
  2023-08-11 21:27     ` Chen, Xiaogang
  2023-08-11 21:31       ` Felix Kuehling
@ 2023-08-12  0:28       ` James Zhu
  1 sibling, 0 replies; 11+ messages in thread
From: James Zhu @ 2023-08-12  0:28 UTC (permalink / raw)
  To: Chen, Xiaogang, Felix Kuehling, James Zhu, amd-gfx; +Cc: Roger.Madrid


On 2023-08-11 17:27, Chen, Xiaogang wrote:
>
> On 8/11/2023 4:22 PM, Felix Kuehling wrote:
>> On 2023-08-11 17:12, Chen, Xiaogang wrote:
>>>
>>> I know the original jira ticket. The system got an RCU cpu stall,
>>> then the kernel entered panic, and then there was no response or ssh.
>>> This patch lets the prange-list-update task yield the cpu after each
>>> range update. It can prevent the task from holding the mm lock too
>>> long.
>>
>> Calling schedule does not drop the lock. If anything, it causes the 
>> lock to be held longer, because the function takes longer to complete.
>>
>> Regards,
>>   Felix
>>
> Right. I do not see either how this patch targets the root cause. It is
> on a customer system that can have many RCU operations (not necessarily
> from our code). Any read-side critical section can cause a write stall.
>
> I think we can try some RCU parameters first to see if things change:
> like CONFIG_RCU_CPU_STALL_TIMEOUT to increase the grace period, or
> rcupdate.rcu_cpu_stall_suppress to suppress the RCU stall warning.
[JZ] I tried tuning the grace period before; it is easy to hang the system.
>
> Regards
>
> Xiaogang
>
>>> The mm lock is an rw_semaphore, not an RCU mechanism. Can you explain
>>> how that can prevent the RCU cpu stall in this case?
>>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>> On 8/11/2023 2:11 PM, James Zhu wrote:
>>>> update_list can be long in list_for_each_entry(prange,
>>>> &update_list, update_list), and mmap_read_lock(mm) is held the
>>>> whole time. Adding schedule() removes the RCU stall on the CPU in
>>>> this case.
>>>>
>>>> RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
>>>> Code: 00 00 00 bf 00 02 00 00 48 81 c2 90 00 00 00 e8 1f 6a b9 e0 
>>>> 65 48 8b 14 25 00 bd 01 00 8b 42 2c 48 8b 3c 24 80 e4 f7 0b 43 d8 
>>>> <89> 42 2c e8 51 dd 2d e1 48 8b 7b 38 e8 98 29 b7 e0 48 83 c4 30 b8
>>>> RSP: 0018:ffffc9000ffd7b10 EFLAGS: 00000206
>>>> RAX: 0000000000000100 RBX: ffff88c493968d80 RCX: ffff88d1a6469b18
>>>> RDX: ffff88e18ef1ec80 RSI: ffffc9000ffd7be0 RDI: ffff88c493968d38
>>>> RBP: 000000000003062e R08: 000000003042f000 R09: 000000003062efff
>>>> R10: 0000000000001000 R11: ffff88c1ad255000 R12: 000000000003042f
>>>> R13: ffff88c493968c00 R14: ffffc9000ffd7be0 R15: ffff88c493968c00
>>>> __mmu_notifier_invalidate_range_start+0x132/0x1d0
>>>> ? amdgpu_vm_bo_update+0x3fd/0x520 [amdgpu]
>>>> migrate_vma_setup+0x6c7/0x8f0
>>>> ? kfd_smi_event_migration_start+0x5f/0x80 [amdgpu]
>>>> svm_migrate_ram_to_vram+0x14e/0x580 [amdgpu]
>>>> svm_range_set_attr+0xe34/0x11a0 [amdgpu]
>>>> kfd_ioctl+0x271/0x4e0 [amdgpu]
>>>> ? kfd_ioctl_set_xnack_mode+0xd0/0xd0 [amdgpu]
>>>> __x64_sys_ioctl+0x92/0xd0
>>>>
>>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 1 +
>>>>   1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> index 113fd11aa96e..9f2d48ade7fa 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>> @@ -3573,6 +3573,7 @@ svm_range_set_attr(struct kfd_process *p, 
>>>> struct mm_struct *mm,
>>>>                  r = svm_range_trigger_migration(mm, prange, 
>>>> &migrated);
>>>>                  if (r)
>>>>                          goto out_unlock_range;
>>>> +               schedule();
>>>>
>>>>                  if (migrated && (!p->xnack_enabled ||
>>>>                      (prange->flags & 
>>>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>>> -- 
>>>> 2.34.1
>>>>

