linux-kernel.vger.kernel.org archive mirror
* [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()
@ 2018-07-03 16:45 Sebastian Andrzej Siewior
  2018-07-03 20:24 ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-07-03 16:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Thomas Gleixner, Sebastian Andrzej Siewior

All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
either with spin_lock_irq() or spin_lock_irqsave().
cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
in IRQ context and therefore requires _irqsave() locking suffix in
cgroup_rstat_flush_locked().
Since there is no difference between spin_lock_t and raw_spin_lock_t
on !RT lockdep does not complain here. On RT lockdep complains because
the interrupts were not disabled here and a deadlock is possible.

Acquire the raw_spin_lock_t with disabled interrupts.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/cgroup/rstat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index d503d1a9007c..63fc5e472c82 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -157,8 +157,9 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
 		raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock,
 						       cpu);
 		struct cgroup *pos = NULL;
+		unsigned long flags;
 
-		raw_spin_lock(cpu_lock);
+		raw_spin_lock_irqsave(cpu_lock, flags);
 		while ((pos = cgroup_rstat_cpu_pop_updated(pos, cgrp, cpu))) {
 			struct cgroup_subsys_state *css;
 
@@ -170,7 +171,7 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
 				css->ss->css_rstat_flush(css, cpu);
 			rcu_read_unlock();
 		}
-		raw_spin_unlock(cpu_lock);
+		raw_spin_unlock_irqrestore(cpu_lock, flags);
 
 		/* if @may_sleep, play nice and yield if necessary */
 		if (may_sleep && (need_resched() ||
-- 
2.18.0



* Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()
  2018-07-03 16:45 [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked() Sebastian Andrzej Siewior
@ 2018-07-03 20:24 ` Tejun Heo
  2018-07-03 21:35   ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2018-07-03 20:24 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Thomas Gleixner, Peter Zijlstra, Ingo Molnar

(cc'ing Peter and Ingo for lockdep)

Hello, Sebastian.

On Tue, Jul 03, 2018 at 06:45:44PM +0200, Sebastian Andrzej Siewior wrote:
> All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> either with spin_lock_irq() or spin_lock_irqsave().

So, irq is always disabled in cgroup_rstat_flush_locked().

> cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
> is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
> in IRQ context and therefore requires _irqsave() locking suffix in
> cgroup_rstat_flush_locked().

Yes, the cpu locks should be irqsafe too; however, as irq is always
disabled in that function, save/restore is redundant, no?

> Since there is no difference between spin_lock_t and raw_spin_lock_t
> on !RT lockdep does not complain here. On RT lockdep complains because
> the interrupts were not disabled here and a deadlock is possible.

We at least used to do this in the kernel - manipulating irqsafe locks
with spin_lock/unlock() if the irq state is known, whether enabled or
disabled, and ISTR lockdep being smart enough to track actual irq
state to determine irq safety.  Am I misremembering or is this
different on RT kernels?

Thanks.

-- 
tejun


* Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()
  2018-07-03 20:24 ` Tejun Heo
@ 2018-07-03 21:35   ` Sebastian Andrzej Siewior
  2018-07-11 11:05     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-07-03 21:35 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Thomas Gleixner, Peter Zijlstra, Ingo Molnar

On 2018-07-03 13:24:24 [-0700], Tejun Heo wrote:
> (cc'ing Peter and Ingo for lockdep)
> 
> Hello, Sebastian.
Hi Tejun,

> On Tue, Jul 03, 2018 at 06:45:44PM +0200, Sebastian Andrzej Siewior wrote:
> > All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> > either with spin_lock_irq() or spin_lock_irqsave().
> 
> So, irq is always disabled in cgroup_rstat_flush_locked().

Only on kernels without RT enabled. On RT-enabled kernels,
spin_lock_irq.*() is turned into a sleeping spinlock which does not
disable interrupts.

> > cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
> > is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
> > in IRQ context and therefore requires _irqsave() locking suffix in
> > cgroup_rstat_flush_locked().
> 
> Yes, the cpu locks should be irqsafe too; however, as irq is always
> disabled in that function, save/restore is redundant, no?

As I pointed out above, only a raw_spin_lock_t really disables
interrupts on -RT. That is the difference between the two.

> > Since there is no difference between spin_lock_t and raw_spin_lock_t
> > on !RT lockdep does not complain here. On RT lockdep complains because
> > the interrupts were not disabled here and a deadlock is possible.
> 
> We at least used to do this in the kernel - manipulating irqsafe locks
> with spin_lock/unlock() if the irq state is known, whether enabled or
> disabled, and ISTR lockdep being smart enough to track actual irq
> state to determine irq safety.  Am I misremembering or is this
> different on RT kernels?

No, this is correct. So on !RT kernels the spin_lock_irq() disables
interrupts and the raw_spin_lock() has the interrupts already disabled,
everything is good. On RT kernels the spin_lock_irq() does not disable
interrupts and the raw_spin_lock() acquires the lock with enabled
interrupts and lockdep complains properly.
lockdep sees the hardirq path via:

 {IN-HARDIRQ-W} state was registered at:
   lock_acquire+0x9e/0x250
   _raw_spin_lock_irqsave+0x38/0x50
   cgroup_rstat_updated+0x57/0x100
   cgroup_base_stat_cputime_account_end.isra.6+0x17/0x60
   __cgroup_account_cputime_field+0x49/0x60
   account_system_index_time+0xdb/0x1f0
   account_system_time+0x3f/0x70
   account_process_tick+0x59/0x80
   update_process_times+0x1d/0x50
   tick_sched_handle+0x20/0x60
   tick_sched_timer+0x37/0x80
   __hrtimer_run_queues+0x12c/0x6d0
   hrtimer_interrupt+0xed/0x240
   smp_apic_timer_interrupt+0x89/0x3c0
   apic_timer_interrupt+0xf/0x20
   pin_current_cpu+0xa/0x120
   migrate_disable+0x9a/0x200
   rt_spin_lock+0x1d/0x60
   put_unused_fd+0x2c/0x50
   do_sys_open+0x23a/0x250
   __x64_sys_openat+0x1b/0x20
   do_syscall_64+0x50/0x190
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

> Thanks.

Sebastian
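
The RT versus !RT difference described in this message can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions: every model_* helper and the irqs_off / cpu_lock_held
variables are hypothetical names invented for illustration, not kernel
APIs, and the model tracks nothing beyond one CPU's interrupt state and
one per-CPU lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
static bool irqs_off;      /* hard-IRQ state of the modelled CPU */
static bool cpu_lock_held; /* state of the modelled per-CPU raw lock */

/*
 * Models spin_lock_irq() on a spinlock_t: it disables interrupts only
 * on !RT. On RT it is a sleeping lock and leaves the IRQ state alone.
 */
static void model_spin_lock_irq(bool rt)
{
	if (!rt)
		irqs_off = true;
}

static void model_spin_unlock_irq(bool rt)
{
	if (!rt)
		irqs_off = false;
}

/* Models raw_spin_lock(): takes the lock, IRQ state is unchanged. */
static void model_raw_spin_lock(void)
{
	assert(!cpu_lock_held);
	cpu_lock_held = true;
}

static void model_raw_spin_unlock(void)
{
	cpu_lock_held = false;
}

/* Models raw_spin_lock_irqsave(): save IRQ state, disable IRQs, lock. */
static bool model_raw_spin_lock_irqsave(void)
{
	bool flags = irqs_off;

	irqs_off = true;
	model_raw_spin_lock();
	return flags;
}

/* Models raw_spin_unlock_irqrestore(): unlock, restore saved state. */
static void model_raw_spin_unlock_irqrestore(bool flags)
{
	cpu_lock_held = false;
	irqs_off = flags;
}

/*
 * Models an interrupt whose handler takes the same per-CPU lock (the
 * cgroup_rstat_updated() path in the trace). It can only run while
 * IRQs are enabled, and it deadlocks if this CPU already holds the lock.
 */
static bool model_irq_would_deadlock(void)
{
	if (irqs_off)
		return false;
	return cpu_lock_held;
}

/* Models the flush path: outer spinlock_t, then the per-CPU raw lock. */
static bool model_flush_can_deadlock(bool rt, bool use_irqsave)
{
	bool flags = false;
	bool deadlock;

	irqs_off = false;
	cpu_lock_held = false;

	model_spin_lock_irq(rt);	/* caller takes cgroup_rstat_lock */
	if (use_irqsave)
		flags = model_raw_spin_lock_irqsave();
	else
		model_raw_spin_lock();

	deadlock = model_irq_would_deadlock();

	if (use_irqsave)
		model_raw_spin_unlock_irqrestore(flags);
	else
		model_raw_spin_unlock();
	model_spin_unlock_irq(rt);

	return deadlock;
}
```

In this model, model_flush_can_deadlock(true, false) is the only
combination that reports a possible deadlock: RT with a plain
raw_spin_lock(). On !RT the outer spin_lock_irq() has already disabled
interrupts, and with irqsave the raw lock disables them itself on both
configurations.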


* Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()
  2018-07-03 21:35   ` Sebastian Andrzej Siewior
@ 2018-07-11 11:05     ` Sebastian Andrzej Siewior
  2018-07-11 17:28       ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-07-11 11:05 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Thomas Gleixner, Peter Zijlstra, Ingo Molnar

ping.

Sebastian


* Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()
  2018-07-11 11:05     ` Sebastian Andrzej Siewior
@ 2018-07-11 17:28       ` Tejun Heo
  0 siblings, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2018-07-11 17:28 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Thomas Gleixner, Peter Zijlstra, Ingo Molnar

Hello, Sebastian.

On Wed, Jul 11, 2018 at 01:05:13PM +0200, Sebastian Andrzej Siewior wrote:
> > > We at least used to do this in the kernel - manipulating irqsafe locks
> > > with spin_lock/unlock() if the irq state is known, whether enabled or
> > > disabled, and ISTR lockdep being smart enough to track actual irq
> > > state to determine irq safety.  Am I misremembering or is this
> > > different on RT kernels?
> > 
> > No, this is correct. So on !RT kernels the spin_lock_irq() disables
> > interrupts and the raw_spin_lock() has the interrupts already disabled,
> > everything is good. On RT kernels the spin_lock_irq() does not disable
> > interrupts and the raw_spin_lock() acquires the lock with enabled
> > interrupts and lockdep complains properly.
> > lockdep sees the hardirq path via:

I feel wary about applying a patch which isn't needed in mainline,
especially without annotations or at least comments.  I suppose it may
not be too common but this can't be the only place which needs this
and using irqsave/restore spuriously in all those sites doesn't sound
like a good solution.  Is there any other way of handling this?

Thanks.

-- 
tejun

