From: Nicolas Pitre <nico@fluxnic.net>
To: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Chris Mason <chris.mason@fusionio.com>, Arnd Bergmann <arnd@arndb.de>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: RFC: mutex: hung tasks on SMP platforms with asm-generic/mutex-xchg.h
Date: Thu, 09 Aug 2012 01:12:15 -0400 (EDT)
Message-ID: <alpine.LFD.2.02.1208090029460.5231@xanadu.home>
In-Reply-To: <20120807115647.GA12828@mudshark.cambridge.arm.com>

On Tue, 7 Aug 2012, Will Deacon wrote:

> Hello,
>
> ARM recently moved to asm-generic/mutex-xchg.h for its mutex
> implementation after our previous implementation was found to be
> missing some crucial memory barriers. However, I'm seeing some
> problems running hackbench on SMP platforms due to the way in which
> the MUTEX_SPIN_ON_OWNER code operates.
>
> The symptoms are that a bunch of hackbench tasks are left waiting on
> an unlocked mutex and therefore never get woken up to claim it. I
> think this boils down to the following sequence:
>
>
>         Task A        Task B        Task C      Lock value
>  0                                                  1
>  1      lock()                                      0
>  2                    lock()                        0
>  3                    spin(A)                       0
>  4      unlock()                                    1
>  5                                  lock()          0
>  6                    cmpxchg(1,0)                  0
>  7                    contended()                  -1
>  8      lock()                                      0
>  9      spin(C)                                     0
> 10                                  unlock()        1
> 11      cmpxchg(1,0)                                0
> 12      unlock()                                    1
>
>
> At this point, the lock is unlocked, but Task B is in an
> uninterruptible sleep with nobody to wake it up.
> The following patch fixes the problem by ensuring we put the lock
> into the contended state if we acquire it from the spin loop on the
> slowpath but I'd like to be sure that this won't cause problems with
> other mutex implementations:
>
>
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index a307cc9..27b7887 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -170,7 +170,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		if (owner && !mutex_spin_on_owner(lock, owner))
>  			break;
>  
> -		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
> +		if (atomic_cmpxchg(&lock->count, 1, -1) == 1) {
>  			lock_acquired(&lock->dep_map, ip);
>  			mutex_set_owner(lock);
>  			preempt_enable();
>
>
> All comments welcome.

Well... after thinking about this for a while, I came to the conclusion
that the mutex_spin_on_owner code is indeed breaking the contract
between the xchg lock fast path and the slow path.  The xchg fast path
will always set the count to 0 and rely on the slow path to restore a
possible pre-existing negative count.  So the above change would be
needed for correctness in the xchg case, even if it always forces the
unlock into the slow path.

As I mentioned previously, we don't want to force the slow path in all
cases though.
The atomic decrement based fast path doesn't need that, so I'd suggest
this fix instead:

diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index 580a6d35c7..60964844c8 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -108,4 +108,6 @@ __mutex_fastpath_trylock(atomic_t *count, int (*fail_fn)(atomic_t *))
 	return prev;
 }
 
+#define __MUTEX_XCHG_FAST_PATH
+
 #endif
diff --git a/kernel/mutex.c b/kernel/mutex.c
index a307cc9c95..c6a26a4f1c 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -161,6 +161,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 	for (;;) {
 		struct task_struct *owner;
+		int locked_val;
 
 		/*
 		 * If there's an owner, wait for it to either
@@ -170,7 +171,19 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		if (owner && !mutex_spin_on_owner(lock, owner))
 			break;
 
-		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
+#ifdef __MUTEX_XCHG_FAST_PATH
+		/*
+		 * The fast path based on xchg sets a transient 0 count,
+		 * relying on the slow path to restore a possible
+		 * pre-existing contended count.  Without checking the
+		 * waiters' list we must presume possible contention here.
+		 */
+		locked_val = -1;
+#else
+		locked_val = 0;
+#endif
+
+		if (atomic_cmpxchg(&lock->count, 1, locked_val) == 1) {
 			lock_acquired(&lock->dep_map, ip);
 			mutex_set_owner(lock);
 			preempt_enable();

That would be needed for the stable tree as well.

A further cleanup could remove all definitions of
__mutex_slowpath_needs_to_unlock() given that they're all set to 1
except for the xchg fast path, in which case __MUTEX_XCHG_FAST_PATH
could be reused instead.

Of course that might tilt the balance towards using mutex-dec.h on ARM
v6 and above instead of mutex-xchg.h.  But that is an orthogonal issue,
and that doesn't remove the need for fixing the xchg based case for
correctness.


Nicolas