From: Mike Galbraith <bitbucket@online.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	mingo@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, mgorman@suse.com,
	akpm@linux-foundation.org, tglx@linutronix.de,
	linux-tip-commits@vger.kernel.org, Dave Jones <davej@redhat.com>
Subject: Re: [tip:sched/core] sched/numa: Move task_numa_free() to __put_task_struct()
Date: Mon, 07 Apr 2014 10:55:15 +0200
Message-ID: <1396860915.5170.5.camel@marge.simpson.net>
In-Reply-To: <20140407081644.GD11096@twins.programming.kicks-ass.net>

On Mon, 2014-04-07 at 10:16 +0200, Peter Zijlstra wrote: 
> On Mon, Apr 07, 2014 at 09:30:30AM +0200, Mike Galbraith wrote:
> > -	double_lock(&my_grp->lock, &grp->lock);
> > +	BUG_ON(irqs_disabled());
> > +	double_lock_irq(&my_grp->lock, &grp->lock);
> 
> So either make this:
> 
> 	local_irq_disable();
> 	double_lock();
> 
> or
> 
> >  
> >  	for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++) {
> >  		my_grp->faults[i] -= p->numa_faults_memory[i];
> > @@ -1692,6 +1693,7 @@ static void task_numa_group(struct task_
> >  
> >  	spin_unlock(&my_grp->lock);
> >  	spin_unlock(&grp->lock);
> > +	local_irq_enable();
> 
> use:
> 	spin_unlock()
> 	spin_unlock_irq()
> 
> or so, but this imbalance is making my itch :-)

sched, numa: fix task_numa_free() lockdep splat

Sasha reports that lockdep claims commit 156654f491dd8d52687a5fbe1637f472a52ce75b
made numa_group.lock interrupt unsafe.  While I don't see how that could be, given
the commit in question moved task_numa_free() from one irq enabled region to another,
the below does make both the gripes and the lockups upon gripe with numa=fake=4 go away.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Mike Galbraith <bitbucket@online.de>
---
 kernel/sched/fair.c  |   13 +++++++------
 kernel/sched/sched.h |    9 +++++++++
 2 files changed, 16 insertions(+), 6 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1497,7 +1497,7 @@ static void task_numa_placement(struct t
 	/* If the task is part of a group prevent parallel updates to group stats */
 	if (p->numa_group) {
 		group_lock = &p->numa_group->lock;
-		spin_lock(group_lock);
+		spin_lock_irq(group_lock);
 	}
 
 	/* Find the node with the highest number of faults */
@@ -1572,7 +1572,7 @@ static void task_numa_placement(struct t
 			}
 		}
 
-		spin_unlock(group_lock);
+		spin_unlock_irq(group_lock);
 	}
 
 	/* Preferred node as the node with the most faults */
@@ -1677,7 +1677,8 @@ static void task_numa_group(struct task_
 	if (!join)
 		return;
 
-	double_lock(&my_grp->lock, &grp->lock);
+	BUG_ON(irqs_disabled());
+	double_lock_irq(&my_grp->lock, &grp->lock);
 
 	for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++) {
 		my_grp->faults[i] -= p->numa_faults_memory[i];
@@ -1691,7 +1692,7 @@ static void task_numa_group(struct task_
 	grp->nr_tasks++;
 
 	spin_unlock(&my_grp->lock);
-	spin_unlock(&grp->lock);
+	spin_unlock_irq(&grp->lock);
 
 	rcu_assign_pointer(p->numa_group, grp);
 
@@ -1710,14 +1711,14 @@ void task_numa_free(struct task_struct *
 	void *numa_faults = p->numa_faults_memory;
 
 	if (grp) {
-		spin_lock(&grp->lock);
+		spin_lock_irq(&grp->lock);
 		for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++)
 			grp->faults[i] -= p->numa_faults_memory[i];
 		grp->total_faults -= p->total_numa_faults;
 
 		list_del(&p->numa_entry);
 		grp->nr_tasks--;
-		spin_unlock(&grp->lock);
+		spin_unlock_irq(&grp->lock);
 		rcu_assign_pointer(p->numa_group, NULL);
 		put_numa_group(grp);
 	}
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1388,6 +1388,15 @@ static inline void double_lock(spinlock_
 	spin_lock_nested(l2, SINGLE_DEPTH_NESTING);
 }
 
+static inline void double_lock_irq(spinlock_t *l1, spinlock_t *l2)
+{
+	if (l1 > l2)
+		swap(l1, l2);
+
+	spin_lock_irq(l1);
+	spin_lock_nested(l2, SINGLE_DEPTH_NESTING);
+}
+
 static inline void double_raw_lock(raw_spinlock_t *l1, raw_spinlock_t *l2)
 {
 	if (l1 > l2)
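
For readers unfamiliar with the double_lock() idiom the hunk above extends, here is a minimal userspace sketch of address-ordered lock acquisition. It is an illustration under assumptions, not the kernel code: pthread mutexes stand in for spinlock_t, interrupt disabling has no userspace equivalent and is not modeled, and struct group, move_task() and run_demo() are hypothetical names invented for this example.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct numa_group. */
struct group {
	pthread_mutex_t lock;
	long faults;
	int nr_tasks;
};

/*
 * Take both locks in ascending address order, so two threads locking
 * the same pair can never acquire them in opposite orders.
 */
static void double_lock_ordered(pthread_mutex_t *l1, pthread_mutex_t *l2)
{
	if ((uintptr_t)l1 > (uintptr_t)l2) {
		pthread_mutex_t *tmp = l1;

		l1 = l2;
		l2 = tmp;
	}
	pthread_mutex_lock(l1);
	pthread_mutex_lock(l2);
}

static void double_unlock(pthread_mutex_t *l1, pthread_mutex_t *l2)
{
	pthread_mutex_unlock(l1);
	pthread_mutex_unlock(l2);
}

/* Move one task's fault count from one group to the other. */
static void move_task(struct group *from, struct group *to, long faults)
{
	double_lock_ordered(&from->lock, &to->lock);
	from->faults -= faults;
	from->nr_tasks--;
	to->faults += faults;
	to->nr_tasks++;
	double_unlock(&from->lock, &to->lock);
}

static struct group ga = { .lock = PTHREAD_MUTEX_INITIALIZER, .faults = 1000, .nr_tasks = 10 };
static struct group gb = { .lock = PTHREAD_MUTEX_INITIALIZER, .faults = 1000, .nr_tasks = 10 };

struct job {
	struct group *from, *to;
};

static void *worker(void *arg)
{
	struct job *j = arg;

	for (int i = 0; i < 100000; i++)
		move_task(j->from, j->to, 1);
	return NULL;
}

/*
 * Two threads move tasks in opposite directions between the same two
 * groups. Returns the combined fault count, which must be conserved.
 */
static long run_demo(void)
{
	pthread_t t1, t2;
	struct job ab = { &ga, &gb }, ba = { &gb, &ga };

	pthread_create(&t1, NULL, worker, &ab);
	pthread_create(&t2, NULL, worker, &ba);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return ga.faults + gb.faults;
}
```

The two workers pass the locks in opposite argument order, which is the case that would deadlock with naive "lock first argument, then second" code. Sorting by address gives both threads the same acquisition order, which is the property the l1 > l2 swap provides in the kernel helper. Build with -pthread where the toolchain requires it.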



Thread overview: 12+ messages
2014-02-28  6:23 [patch] rt,sched,numa: Move task_numa_free() to __put_task_struct(), which -rt offloads Mike Galbraith
2014-02-28  9:00 ` Pavel Vasilyev
2014-02-28 11:32 ` Peter Zijlstra
2014-03-11 12:40 ` [tip:sched/core] sched/numa: Move task_numa_free() to __put_task_struct() tip-bot for Mike Galbraith
2014-04-06 19:17   ` Sasha Levin
2014-04-07  5:29     ` Mike Galbraith
2014-04-07  7:30       ` Mike Galbraith
2014-04-07  8:16         ` Peter Zijlstra
2014-04-07  8:40           ` Mike Galbraith
2014-04-07  8:55           ` Mike Galbraith [this message]
2014-04-13 20:53             ` Govindarajulu Varadarajan
2014-04-14  7:22             ` [tip:sched/urgent] sched/numa: Fix task_numa_free() lockdep splat tip-bot for Mike Galbraith
