stable-rt.vger.kernel.org archive mirror
* kernel-rt: mm/memcg: refill_obj_stock() is being called from IRQ context
@ 2023-04-28 14:42 Luis Claudio R. Goncalves
  2023-04-28 16:19 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Luis Claudio R. Goncalves @ 2023-04-28 14:42 UTC
  To: Steven Rostedt, Sebastian Andrzej Siewior, stable-rt

Sebastian et al,

I am reporting this here, at first, to confirm if there is an obvious solution
for the problem or if it is really a problem.

In short, refill_obj_stock() is being called from IRQ context and lockdep
is complaining about the local lock stock_lock probably not being enough to
protect the data.
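
To illustrate what the "inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W}" line
in the splat further down means, here is a minimal userspace sketch of the
check. It is not lockdep's implementation: the struct, the enum and
model_acquire() are hypothetical names made up for this example. The model
only records whether a lock class has been taken with hard IRQs enabled and
whether it has been taken from hard IRQ context, and reports a conflict once
both have been seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical usage bits, loosely modeled on the states lockdep prints. */
enum usage_bit {
	USED_IN_HARDIRQ = 1u << 0,	/* taken while in hard IRQ context */
	ENABLED_HARDIRQ = 1u << 1,	/* taken with hard IRQs enabled */
};

/* Hypothetical stand-in for a lock class such as stock_lock. */
struct lock_class_model {
	unsigned int usage;
};

/*
 * Record one acquisition of the lock class.
 * Returns true while the recorded usage is consistent and false once the
 * class has been seen both with hard IRQs enabled and from hard IRQ context.
 */
static bool model_acquire(struct lock_class_model *c, bool in_hardirq,
			  bool hardirqs_enabled)
{
	const unsigned int both = USED_IN_HARDIRQ | ENABLED_HARDIRQ;

	assert(c != NULL);

	if (in_hardirq)
		c->usage |= USED_IN_HARDIRQ;
	else if (hardirqs_enabled)
		c->usage |= ENABLED_HARDIRQ;

	return (c->usage & both) != both;
}
```

In terms of the trace, the first acquisition corresponds to
obj_cgroup_charge() in the mkdir syscall path (hard IRQs enabled) and the
second to refill_obj_stock() reached from inactive_task_timer() in the
hrtimer interrupt.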

The problem is 100% reproducible.

Kernel: v6.3.0-rt11

Reproducer:

    # stress-ng  --sched deadline  --sched-period 1000000000 --sched-runtime 500000000 --sched-deadline 1000000000 --cpu 1 -l 95 -t 120


[  387.365953] ================================
[  387.365954] WARNING: inconsistent lock state
[  387.365954] 6.3.0-rt11 #3 Not tainted
[  387.365956] --------------------------------
[  387.365956] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  387.365957] swapper/7/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  387.365958] ffff934e8fdb0338 ((stock_lock)){?.+.}-{2:2}, at: refill_obj_stock+0x3d/0x2d0
[  387.365965] {HARDIRQ-ON-W} state was registered at:
[  387.365965]   __lock_acquire+0x2fa/0xe70
[  387.365968]   lock_acquire+0xca/0x2f0
[  387.365969]   rt_spin_lock+0x27/0xd0
[  387.365971]   obj_cgroup_charge+0x3c/0x150
[  387.365973]   slab_pre_alloc_hook.constprop.0+0x119/0x320
[  387.365975]   kmem_cache_alloc_lru+0x4c/0x200
[  387.365976]   __d_alloc+0x29/0x250
[  387.365978]   d_alloc+0x1e/0xa0
[  387.365980]   __lookup_hash+0x53/0xa0
[  387.365982]   filename_create+0xb0/0x160
[  387.365983]   do_mkdirat+0x4b/0x160
[  387.365984]   __x64_sys_mkdir+0x48/0x70
[  387.365986]   do_syscall_64+0x59/0x90
[  387.365989]   entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  387.365990] irq event stamp: 432988
[  387.365991] hardirqs last  enabled at (432987): [<ffffffffaa92b6eb>] cpuidle_enter_state+0xcb/0x340
[  387.365993] hardirqs last disabled at (432988): [<ffffffffaa92919b>] sysvec_apic_timer_interrupt+0xb/0xd0
[  387.365996] softirqs last  enabled at (0): [<ffffffffa9d13c1b>] copy_process+0x7cb/0x19b0
[  387.365999] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  387.366001] 
               other info that might help us debug this:
[  387.366001]  Possible unsafe locking scenario:

[  387.366002]        CPU0
[  387.366002]        ----
[  387.366003]   lock((stock_lock));
[  387.366004]   <Interrupt>
[  387.366004]     lock((stock_lock));
[  387.366005] 
                *** DEADLOCK ***

[  387.366006] no locks held by swapper/7/0.
[  387.366007] 
               stack backtrace:
[  387.366007] CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.3.0-rt11 #3
[  387.366009] Hardware name: Dell Inc. PowerEdge R330/0H5N7P, BIOS 2.10.1 06/11/2020
[  387.366010] Call Trace:
[  387.366011]  <IRQ>
[  387.366012]  dump_stack_lvl+0x47/0x80
[  387.366015]  mark_lock_irq+0x3bb/0x5d0
[  387.366017]  ? stack_trace_save+0x4b/0x70
[  387.366021]  ? save_trace+0x55/0x180
[  387.366025]  mark_lock.part.0+0x1c0/0x3d0
[  387.366027]  mark_usage+0x129/0x150
[  387.366028]  __lock_acquire+0x2fa/0xe70
[  387.366032]  lock_acquire+0xca/0x2f0
[  387.366034]  ? refill_obj_stock+0x3d/0x2d0
[  387.366036]  ? lock_is_held_type+0xd7/0x130
[  387.366039]  ? __pfx_put_cred_rcu+0x10/0x10
[  387.366042]  rt_spin_lock+0x27/0xd0
[  387.366044]  ? refill_obj_stock+0x3d/0x2d0
[  387.366046]  refill_obj_stock+0x3d/0x2d0
[  387.366047]  ? __put_task_struct+0xe9/0x120
[  387.366050]  kmem_cache_free+0x150/0x3c0
[  387.366054]  __put_task_struct+0xe9/0x120
[  387.366056]  inactive_task_timer+0x1af/0x4c0
[  387.366060]  ? __pfx_inactive_task_timer+0x10/0x10
[  387.366063]  __hrtimer_run_queues+0x26b/0x3c0
[  387.366067]  hrtimer_interrupt+0x10a/0x240
[  387.366070]  __sysvec_apic_timer_interrupt+0x93/0x220
[  387.366073]  sysvec_apic_timer_interrupt+0x9d/0xd0
[  387.366075]  </IRQ>
[  387.366076]  <TASK>
[  387.366077]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[  387.366078] RIP: 0010:cpuidle_enter_state+0xcf/0x340
[  387.366080] Code: ff 8b 73 04 bf ff ff ff ff 49 89 c6 e8 2a 37 cb ff 31 ff e8 63 83 45 ff 45 84 ff 0f 85 de 01 00 00 e8 65 46 57 ff fb 45 85 ed <0f> 88 15 01 00 00 49 63 cd 4c 89 f2 48 2b 14 24 48 8d 04 49 48 8d
[  387.366081] RSP: 0018:ffffa9178025be80 EFLAGS: 00000206
[  387.366083] RAX: 0000000000069b5b RBX: ffffc9177fd873d8 RCX: 000000000000001f
[  387.366084] RDX: 0000000000000000 RSI: ffffffffab23caed RDI: ffffffffab1d1d1e
[  387.366085] RBP: 0000000000000006 R08: 0000000000000001 R09: 0000000000000001
[  387.366086] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffabd7a8e0
[  387.366087] R13: 0000000000000006 R14: 0000005a30cf27dd R15: 0000000000000000
[  387.366093]  cpuidle_enter+0x29/0x40
[  387.366096]  cpuidle_idle_call+0x109/0x180
[  387.366099]  do_idle+0x91/0x100
[  387.366100]  cpu_startup_entry+0x19/0x20
[  387.366102]  start_secondary+0x112/0x130
[  387.366104]  secondary_startup_64_no_verify+0xe5/0xeb
[  387.366110]  </TASK>



* Re: kernel-rt: mm/memcg: refill_obj_stock() is being called from IRQ context
  2023-04-28 14:42 kernel-rt: mm/memcg: refill_obj_stock() is being called from IRQ context Luis Claudio R. Goncalves
@ 2023-04-28 16:19 ` Sebastian Andrzej Siewior
  2023-04-28 18:00   ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-04-28 16:19 UTC
  To: Luis Claudio R. Goncalves; +Cc: Steven Rostedt, stable-rt

On 2023-04-28 11:42:25 [-0300], Luis Claudio R. Goncalves wrote:
> Sebastian et al,
Hi,

> I am reporting this here, at first, to confirm if there is an obvious solution
> for the problem or if it is really a problem.
> 
> In short, refill_obj_stock() is being called from IRQ context and lockdep
> is complaining about the local lock stock_lock probably not being enough to
> protect the data.

This has been reported before and a patch has been posted
	https://lore.kernel.org/all/20230425114307.36889-1-wander@redhat.com/

I don't like it but it should cure it.

Sebastian


* Re: kernel-rt: mm/memcg: refill_obj_stock() is being called from IRQ context
  2023-04-28 16:19 ` Sebastian Andrzej Siewior
@ 2023-04-28 18:00   ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 3+ messages in thread
From: Luis Claudio R. Goncalves @ 2023-04-28 18:00 UTC
  To: Sebastian Andrzej Siewior; +Cc: Steven Rostedt, stable-rt

On Fri, Apr 28, 2023 at 06:19:10PM +0200, Sebastian Andrzej Siewior wrote:
> On 2023-04-28 11:42:25 [-0300], Luis Claudio R. Goncalves wrote:
> > Sebastian et al,
> Hi,
> 
> > I am reporting this here, at first, to confirm if there is an obvious solution
> > for the problem or if it is really a problem.
> > 
> > In short, refill_obj_stock() is being called from IRQ context and lockdep
> > is complaining about the local lock stock_lock probably not being enough to
> > protect the data.
> 
> This has been reported before and a patch has been posted
> 	https://lore.kernel.org/all/20230425114307.36889-1-wander@redhat.com/
> 
> I don't like it but it should cure it.

Thank you! I both forgot about that thread and also missed that in my searches.

And yes, that's a solution for a different issue that fixes the problem I
reported as a side effect :)

I was torn between redoing the locking in that code or using something
similar to an old proposal from Waiman Long, where he had different
variables to account for work in atomic and non-atomic contexts (which might
not be applicable here).
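
For readers unfamiliar with that idea, a rough userspace sketch of "different
variables per context" could look like the following. All names here
(obj_stock_model, pcp_stock_model, pick_stock(), model_refill()) are
hypothetical and the code is only an illustration of the split, not the
proposal itself and not the memcg code. In a real kernel each side would
still need its own protection against preemption or interrupts, which this
model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached byte count for one context. */
struct obj_stock_model {
	unsigned int nr_bytes;
};

/* Hypothetical per-CPU stock split by the context that touches it. */
struct pcp_stock_model {
	struct obj_stock_model task_obj;	/* only used from task context */
	struct obj_stock_model irq_obj;		/* only used from IRQ context */
};

/* Pick the stock that belongs to the caller's context. */
static struct obj_stock_model *pick_stock(struct pcp_stock_model *s,
					  bool in_irq)
{
	assert(s != NULL);
	return in_irq ? &s->irq_obj : &s->task_obj;
}

/* Return nr_bytes to the stock of the current context. */
static void model_refill(struct pcp_stock_model *s, bool in_irq,
			 unsigned int nr_bytes)
{
	pick_stock(s, in_irq)->nr_bytes += nr_bytes;
}
```

Because a refill from IRQ context never touches task_obj, an interrupt that
arrives in the middle of a task-context update cannot corrupt it. Whether
that maps onto the RT locking rules for stock_lock is exactly the open
question above.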

> Sebastian
> 
---end quoted text---


