* [PATCH 1/2] cgroup: Use irqsave in cgroup_rstat_flush_locked().
@ 2022-03-01 12:21     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-01 12:21 UTC (permalink / raw)
  To: cgroups, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Zefan Li,
	Thomas Gleixner, Sebastian Andrzej Siewior

All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
either with spin_lock_irq() or spin_lock_irqsave().
cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock, which
is a raw_spinlock_t. This lock is also acquired in cgroup_rstat_updated()
in IRQ context and therefore requires the _irqsave() locking suffix in
cgroup_rstat_flush_locked().
Since there is no difference between spinlock_t and raw_spinlock_t on
!RT, lockdep does not complain here. On RT, lockdep complains because
interrupts are not disabled here and a deadlock is possible.

Acquire the raw_spinlock_t with interrupts disabled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/cgroup/rstat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 9d331ba44870a..53b771c20ee50 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -153,8 +153,9 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
 		raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock,
 						       cpu);
 		struct cgroup *pos = NULL;
+		unsigned long flags;
 
-		raw_spin_lock(cpu_lock);
+		raw_spin_lock_irqsave(cpu_lock, flags);
 		while ((pos = cgroup_rstat_cpu_pop_updated(pos, cgrp, cpu))) {
 			struct cgroup_subsys_state *css;
 
@@ -166,7 +167,7 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
 				css->ss->css_rstat_flush(css, cpu);
 			rcu_read_unlock();
 		}
-		raw_spin_unlock(cpu_lock);
+		raw_spin_unlock_irqrestore(cpu_lock, flags);
 
 		/* if @may_sleep, play nice and yield if necessary */
 		if (may_sleep && (need_resched() ||
-- 
2.35.1
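
For context, a minimal sketch of the locking pattern described above, reconstructed
from the commit message (simplified; the exact function bodies in the tree differ):

	/* One caller (sketch): cgroup_rstat_lock is a spinlock_t, so on
	 * PREEMPT_RT spin_lock_irq() does not actually disable interrupts. */
	void cgroup_rstat_flush(struct cgroup *cgrp)
	{
		spin_lock_irq(&cgroup_rstat_lock);
		cgroup_rstat_flush_locked(cgrp, true);
		spin_unlock_irq(&cgroup_rstat_lock);
	}

	/* Pre-patch flush loop (sketch): */
	static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu);

			/* cgroup_rstat_updated() takes this same per-CPU lock
			 * from IRQ context. On PREEMPT_RT the spin_lock_irq()
			 * above left interrupts enabled, so an interrupt firing
			 * here can deadlock on cpu_lock - hence the switch to
			 * raw_spin_lock_irqsave() in the patch. */
			raw_spin_lock(cpu_lock);
			/* pop updated cgroups and flush their stats */
			raw_spin_unlock(cpu_lock);
		}
	}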





* [PATCH 2/2] mm: workingset: Replace IRQ-off check with a lockdep assert.
@ 2022-03-01 12:21     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-01 12:21 UTC (permalink / raw)
  To: cgroups, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Zefan Li,
	Thomas Gleixner, Sebastian Andrzej Siewior

Commit
  68d48e6a2df57 ("mm: workingset: add vmstat counter for shadow nodes")

introduced an IRQ-off check to ensure that a lock which also disables
interrupts is held. This does not work the same way on PREEMPT_RT
because none of the held locks disable interrupts.

Replace this check with a lockdep assert which ensures that the lock is
held.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/workingset.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index 2e4fd7c3296fe..8a3828acc0bfd 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -434,6 +434,8 @@ struct list_lru shadow_nodes;
 
 void workingset_update_node(struct xa_node *node)
 {
+	struct address_space *mapping;
+
 	/*
 	 * Track non-empty nodes that contain only shadow entries;
 	 * unlink those that contain pages or are being freed.
@@ -442,7 +444,8 @@ void workingset_update_node(struct xa_node *node)
 	 * already where they should be. The list_empty() test is safe
 	 * as node->private_list is protected by the i_pages lock.
 	 */
-	VM_WARN_ON_ONCE(!irqs_disabled());  /* For __inc_lruvec_page_state */
+	mapping = container_of(node->array, struct address_space, i_pages);
+	lockdep_assert_held(&mapping->i_pages.xa_lock);
 
 	if (node->count && node->count == node->nr_values) {
 		if (list_empty(&node->private_list)) {
-- 
2.35.1
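
As a side note, a short sketch of why the container_of() in the hunk above yields
the lock the comment refers to (based only on the fields the patch itself uses;
simplified):

	/* node->array points back at the struct xarray the node belongs to.
	 * For page-cache nodes that xarray is the i_pages member embedded in
	 * struct address_space, so container_of() recovers the mapping: */
	struct address_space *mapping;

	mapping = container_of(node->array, struct address_space, i_pages);

	/* mapping->i_pages.xa_lock is "the i_pages lock". Unlike the old
	 * irqs_disabled() check, lockdep_assert_held() tests lock ownership
	 * rather than IRQ state, so it works on both PREEMPT_RT and !RT. */
	lockdep_assert_held(&mapping->i_pages.xa_lock);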





* Re: [PATCH 1/2] cgroup: Use irqsave in cgroup_rstat_flush_locked().
@ 2022-03-02  6:38       ` Tejun Heo
  0 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2022-03-02  6:38 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: cgroups, linux-mm, Andrew Morton, Johannes Weiner, Zefan Li,
	Thomas Gleixner

Hello,

On Tue, Mar 01, 2022 at 01:21:42PM +0100, Sebastian Andrzej Siewior wrote:
> All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> either with spin_lock_irq() or spin_lock_irqsave().
> cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock, which
> is a raw_spinlock_t. This lock is also acquired in cgroup_rstat_updated()
> in IRQ context and therefore requires the _irqsave() locking suffix in
> cgroup_rstat_flush_locked().
> Since there is no difference between spinlock_t and raw_spinlock_t on
> !RT, lockdep does not complain here. On RT, lockdep complains because
> interrupts are not disabled here and a deadlock is possible.
> 
> Acquire the raw_spinlock_t with interrupts disabled.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Can you please add a comment explaining why irqsave is being used? As it
stands, it just looks spurious.

Thanks.

-- 
tejun




* [PATCH] cgroup: Add a comment to cgroup_rstat_flush_locked().
@ 2022-03-02 14:46         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-02 14:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: cgroups, linux-mm, Andrew Morton, Johannes Weiner, Zefan Li,
	Thomas Gleixner

Add a comment explaining why spin_lock_irq() -> raw_spin_lock_irqsave() is needed.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
On 2022-03-01 20:38:51 [-1000], Tejun Heo wrote:
> Hello,

Hello Tejun,

> Can you please add a comment explaining why irqsave is being used? As it
> stands, it just looks spurious.

Something like this?

 kernel/cgroup/rstat.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 53b771c20ee50..ba7a660184e41 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -155,6 +155,14 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
 		struct cgroup *pos = NULL;
 		unsigned long flags;
 
+		/*
+		 * The _irqsave() is needed because cgroup_rstat_lock is
+		 * spinlock_t which is a sleeping lock on PREEMPT_RT. Acquiring
+		 * this lock with the _irq() suffix only disables interrupts on
+		 * a non-PREEMPT_RT kernel. The raw_spinlock_t below disables
+		 * interrupts on both configurations. The _irqsave() ensures
+		 * that interrupts are always disabled and later restored.
+		 */
 		raw_spin_lock_irqsave(cpu_lock, flags);
 		while ((pos = cgroup_rstat_cpu_pop_updated(pos, cgrp, cpu))) {
 			struct cgroup_subsys_state *css;
-- 
2.35.1
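
The semantics the new comment relies on, summarized as a sketch (a simplification
of the PREEMPT_RT locking rules; not text from the patch):

	/*
	 *                               !PREEMPT_RT            PREEMPT_RT
	 * spin_lock_irq(spinlock_t)     spins, IRQs off        sleeping lock, IRQs stay on
	 * raw_spin_lock()               spins, IRQs unchanged  spins, IRQs unchanged
	 * raw_spin_lock_irqsave()       spins, IRQs off        spins, IRQs off (saved/restored)
	 */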




* Re: [PATCH] cgroup: Add a comment to cgroup_rstat_flush_locked().
@ 2022-03-02 15:47           ` Tejun Heo
  0 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2022-03-02 15:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: cgroups, linux-mm, Andrew Morton, Johannes Weiner, Zefan Li,
	Thomas Gleixner

On Wed, Mar 02, 2022 at 03:46:16PM +0100, Sebastian Andrzej Siewior wrote:
> Add a comment explaining why spin_lock_irq() -> raw_spin_lock_irqsave() is needed.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> On 2022-03-01 20:38:51 [-1000], Tejun Heo wrote:
> > Hello,
> 
> Hello Tejun,
> 
> > Can you please add a comment explaining why irqsave is being used? As it
> > stands, it just looks spurious.
> 
> Something like this?

Yeah, looks good to me.

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun




Thread overview (newest: 2022-03-02 15:47 UTC):
     [not found] <[PATCH 0/2] Correct locking assumption on PREEMPT_RT>
     [not found] ` <20220301122143.1521823-1-bigeasy@linutronix.de>
2022-03-01 12:21   ` [PATCH 1/2] cgroup: Use irqsave in cgroup_rstat_flush_locked() Sebastian Andrzej Siewior
2022-03-02  6:38     ` Tejun Heo
2022-03-02 14:46       ` [PATCH] cgroup: Add a comment to cgroup_rstat_flush_locked() Sebastian Andrzej Siewior
2022-03-02 15:47         ` Tejun Heo
2022-03-01 12:21   ` [PATCH 2/2] mm: workingset: Replace IRQ-off check with a lockdep assert. Sebastian Andrzej Siewior
