* [RFC PATCH v2 0/2] locking/rwsem: Fix DEBUG_RWSEM warning from thaw_sup @ 2018-05-14 19:31 Waiman Long 2018-05-14 19:31 ` [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag Waiman Long 2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long 0 siblings, 2 replies; 18+ messages in thread From: Waiman Long @ 2018-05-14 19:31 UTC (permalink / raw) To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara, Waiman Long My original patch (https://lkml.org/lkml/2018/4/4/447) to fix this issue probably won't work. This is my second attempt to fix it. I don't have the setup to reproduce the problem. Could someone try it to see if it can eliminate the warning? Waiman Long (2): locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() include/linux/percpu-rwsem.h | 6 +++--- include/linux/rwsem.h | 10 ++++++++++ kernel/locking/rwsem-xadd.c | 17 ++++++++--------- kernel/locking/rwsem.c | 16 +++++++++++++++- kernel/locking/rwsem.h | 37 ++++++++++++++++++++++++++++++------- 5 files changed, 66 insertions(+), 20 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag 2018-05-14 19:31 [RFC PATCH v2 0/2] locking/rwsem: Fix DEBUG_RWSEM warning from thaw_sup Waiman Long @ 2018-05-14 19:31 ` Waiman Long 2018-05-15 6:59 ` Amir Goldstein 2018-05-15 8:25 ` Peter Zijlstra 2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long 1 sibling, 2 replies; 18+ messages in thread From: Waiman Long @ 2018-05-14 19:31 UTC (permalink / raw) To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara, Waiman Long There are use cases where a rwsem can be acquired by one task, but released by another task. In these cases, it may not be appropriate for the lock waiters to spin on the task that acquires the lock. One example will be the filesystem freeze/thaw code. To handle such use cases, a new RWSEM_WRITER_OWNED_NOSPIN flag can now be set in the owner field of the rwsem by the new rwsem_set_writer_owned_nospin() function to indicate that the rwsem is writer owned, but optimistic spinning on the rwsem should be disabled. Later on, the new rwsem_set_writer_owned() function can be called to set the new owner, if it is known. This function should not be called without a prior rwsem_set_writer_owned_nospin() call. 
Signed-off-by: Waiman Long <longman@redhat.com> --- include/linux/rwsem.h | 10 ++++++++++ kernel/locking/rwsem-xadd.c | 17 ++++++++--------- kernel/locking/rwsem.c | 16 +++++++++++++++- kernel/locking/rwsem.h | 37 ++++++++++++++++++++++++++++++------- 4 files changed, 63 insertions(+), 17 deletions(-) diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h index 56707d5..1ddf24b 100644 --- a/include/linux/rwsem.h +++ b/include/linux/rwsem.h @@ -145,6 +145,16 @@ static inline int rwsem_is_contended(struct rw_semaphore *sem) */ extern void downgrade_write(struct rw_semaphore *sem); +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER +extern void rwsem_set_writer_owned_nospin(struct rw_semaphore *sem); +extern void rwsem_set_writer_owned(struct rw_semaphore *sem, + struct task_struct *task); +#else +static inline void rwsem_set_writer_owned_nospin(struct rw_semaphore *sem) { } +static inline void rwsem_set_writer_owned(struct rw_semaphore *sem, + struct task_struct *task) { } +#endif + #ifdef CONFIG_DEBUG_LOCK_ALLOC /* * nested locking. NOTE: rwsems are not allowed to recurse diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c index e795908..a27dbb4 100644 --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -357,11 +357,8 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem) rcu_read_lock(); owner = READ_ONCE(sem->owner); - if (!rwsem_owner_is_writer(owner)) { - /* - * Don't spin if the rwsem is readers owned. 
- */ - ret = !rwsem_owner_is_reader(owner); + if (!owner || !is_rwsem_owner_spinnable(owner)) { + ret = !owner; /* !owner is spinnable */ goto done; } @@ -382,8 +379,10 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem) { struct task_struct *owner = READ_ONCE(sem->owner); - if (!rwsem_owner_is_writer(owner)) - goto out; + if (!owner) + return true; + else if (!is_rwsem_owner_spinnable(owner)) + return false; rcu_read_lock(); while (sem->owner == owner) { @@ -408,12 +407,12 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem) cpu_relax(); } rcu_read_unlock(); -out: + /* * If there is a new owner or the owner is not set, we continue * spinning. */ - return !rwsem_owner_is_reader(READ_ONCE(sem->owner)); + return is_rwsem_owner_spinnable(READ_ONCE(sem->owner)); } static bool rwsem_optimistic_spin(struct rw_semaphore *sem) diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c index 30465a2..90e89ee 100644 --- a/kernel/locking/rwsem.c +++ b/kernel/locking/rwsem.c @@ -130,7 +130,8 @@ void up_read(struct rw_semaphore *sem) void up_write(struct rw_semaphore *sem) { rwsem_release(&sem->dep_map, 1, _RET_IP_); - DEBUG_RWSEMS_WARN_ON(sem->owner != current); + DEBUG_RWSEMS_WARN_ON((sem->owner != current) && + (sem->owner != RWSEM_WRITER_OWNED_NOSPIN)); rwsem_clear_owner(sem); __up_write(sem); @@ -222,4 +223,17 @@ void up_read_non_owner(struct rw_semaphore *sem) #endif +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER +void rwsem_set_writer_owned_nospin(struct rw_semaphore *sem) +{ + __rwsem_set_writer_owned_nospin(sem); +} +EXPORT_SYMBOL(rwsem_set_writer_owned_nospin); +void rwsem_set_writer_owned(struct rw_semaphore *sem, struct task_struct *task) +{ + DEBUG_RWSEMS_WARN_ON(sem->owner != RWSEM_WRITER_OWNED_NOSPIN); + __rwsem_set_writer_owned(sem, task); +} +EXPORT_SYMBOL(rwsem_set_writer_owned); +#endif diff --git a/kernel/locking/rwsem.h b/kernel/locking/rwsem.h index a17cba8..bbbd5a3 100644 --- a/kernel/locking/rwsem.h +++ b/kernel/locking/rwsem.h 
@@ -11,10 +11,15 @@ * 2) RWSEM_READER_OWNED * - lock is currently or previously owned by readers (lock is free * or not set by owner yet) - * 3) Other non-zero value - * - a writer owns the lock + * 3) RWSEM_WRITER_OWNED_NOSPIN + * - lock is owned by a writer whose lock ownership may be transferred to + * another task and so spinning on the lock owner should be disabled. + * 4) Other non-zero value + * - a writer owns the lock and other writers can spin on the lock owner. */ -#define RWSEM_READER_OWNED ((struct task_struct *)1UL) +#define RWSEM_READER_OWNED ((struct task_struct *)1UL) +#define RWSEM_WRITER_OWNED_NOSPIN ((struct task_struct *)2UL) +#define RWSEM_NOSPIN_MASK 3UL #ifdef CONFIG_DEBUG_RWSEMS # define DEBUG_RWSEMS_WARN_ON(c) DEBUG_LOCKS_WARN_ON(c) @@ -51,14 +56,32 @@ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem) WRITE_ONCE(sem->owner, RWSEM_READER_OWNED); } -static inline bool rwsem_owner_is_writer(struct task_struct *owner) +/* + * Mark the rwsem as writer owned, but optimistic spinning should be + * disabled. + * + * The caller must make sure that the rwsem is really writer owned + * and the lock won't be freed concurrently with this call. + */ +static inline void __rwsem_set_writer_owned_nospin(struct rw_semaphore *sem) +{ + WRITE_ONCE(sem->owner, RWSEM_WRITER_OWNED_NOSPIN); +} + +static inline void __rwsem_set_writer_owned(struct rw_semaphore *sem, + struct task_struct *task) { - return owner && owner != RWSEM_READER_OWNED; + WRITE_ONCE(sem->owner, task); } -static inline bool rwsem_owner_is_reader(struct task_struct *owner) +/* + * Return true if a rwsem waiter can spin on the rwsem's owner + * and steal the lock. + * N.B. !owner is considered spinnable. 
+ */ +static inline bool is_rwsem_owner_spinnable(struct task_struct *owner) { - return owner == RWSEM_READER_OWNED; + return !((unsigned long)owner & RWSEM_NOSPIN_MASK); } #else static inline void rwsem_set_owner(struct rw_semaphore *sem) -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
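For readers following the encoding, the four owner states listed in the comment block of kernel/locking/rwsem.h above can be tried out in userspace. What follows is only a sketch under stated assumptions, not kernel code: struct task_struct is a stand-in, owner_spinnable() and check_owner_states() are hypothetical names, and the sketch assumes that real task pointers are at least 4-byte aligned so that the two low bits are free for tagging. The three #define values are copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's task_struct; only its address matters here.
 * Assumption: the type is at least 4-byte aligned, so a real pointer
 * never has either of its two low bits set.
 */
struct task_struct {
	int pid;
};

/* Special owner values, copied from the patch. */
#define RWSEM_READER_OWNED        ((struct task_struct *)1UL)
#define RWSEM_WRITER_OWNED_NOSPIN ((struct task_struct *)2UL)
#define RWSEM_NOSPIN_MASK         3UL

/*
 * Same bit test as is_rwsem_owner_spinnable() in the patch: an owner is
 * spinnable only if neither of the two low tag bits is set.
 */
static bool owner_spinnable(const struct task_struct *owner)
{
	return !((uintptr_t)owner & RWSEM_NOSPIN_MASK);
}

/* Walks the four owner states from the patch's comment block. */
static void check_owner_states(void)
{
	struct task_struct writer = { .pid = 1 };

	assert(owner_spinnable(NULL));                       /* 1) no owner yet */
	assert(!owner_spinnable(RWSEM_READER_OWNED));        /* 2) readers */
	assert(!owner_spinnable(RWSEM_WRITER_OWNED_NOSPIN)); /* 3) hand-off writer */
	assert(owner_spinnable(&writer));                    /* 4) ordinary writer */
}
```

Because RWSEM_NOSPIN_MASK covers both tag bits, a single AND rejects the reader-owned and the nospin states at once, while NULL and ordinary task pointers pass.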
* Re: [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag 2018-05-14 19:31 ` [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag Waiman Long @ 2018-05-15 6:59 ` Amir Goldstein 2018-05-15 8:25 ` Peter Zijlstra 1 sibling, 0 replies; 18+ messages in thread From: Amir Goldstein @ 2018-05-15 6:59 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Jan Kara On Mon, May 14, 2018 at 10:31 PM, Waiman Long <longman@redhat.com> wrote: > There are use cases where a rwsem can be acquired by one task, but > released by another task. In these cases, it may not be appropriate > for the lock waiters to spin on the task that acquires the lock. > One example will be the filesystem freeze/thaw code. > > To handle such use cases, a new RWSEM_WRITER_OWNED_NOSPIN > flag can now be set in the owner field of the rwsem by the new > rwsem_set_writer_owned_nospin() function to indicate that the rwsem is > writer owned, but optimistic spinning on the rwsem should be disabled. > > Later on, the new rwsem_set_writer_owned() function can be called to > set the new owner, if it is known. This function should not be called > without a prior rwsem_set_writer_owned_nospin() call. > > Signed-off-by: Waiman Long <longman@redhat.com> Makes sense to me. One nit. > > +static inline void __rwsem_set_writer_owned(struct rw_semaphore *sem, > + struct task_struct *task) rwsem_set_owner() doesn't pass in a task argument and IMO __rwsem_set_writer_owned() shouldn't either. Thanks, Amir. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag 2018-05-14 19:31 ` [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag Waiman Long 2018-05-15 6:59 ` Amir Goldstein @ 2018-05-15 8:25 ` Peter Zijlstra 1 sibling, 0 replies; 18+ messages in thread From: Peter Zijlstra @ 2018-05-15 8:25 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara On Mon, May 14, 2018 at 03:31:06PM -0400, Waiman Long wrote: > There are use cases where a rwsem can be acquired by one task, but > released by another task. In these cases, it may not be appropriate > for the lock waiters to spin on the task that acquires the lock. > One example will be the filesystem freeze/thaw code. > > To handle such use cases, a new RWSEM_WRITER_OWNED_NOSPIN > flag can now be set in the owner field of the rwsem by the new > rwsem_set_writer_owned_nospin() function to indicate that the rwsem is > writer owned, but optimistic spinning on the rwsem should be disabled. > > Later on, the new rwsem_set_writer_owned() function can be called to > set the new owner, if it is known. This function should not be called > without a prior rwsem_set_writer_owned_nospin() call. Urgh.. no please don't do this. Aside from the horrible naming, do not expose 'set-owner' semantics. Can't we just stick to the existing _non_owner() interface without further polluting the API? ^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-14 19:31 [RFC PATCH v2 0/2] locking/rwsem: Fix DEBUG_RWSEM warning from thaw_sup Waiman Long 2018-05-14 19:31 ` [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag Waiman Long @ 2018-05-14 19:31 ` Waiman Long 2018-05-15 5:42 ` Amir Goldstein ` (2 more replies) 1 sibling, 3 replies; 18+ messages in thread From: Waiman Long @ 2018-05-14 19:31 UTC (permalink / raw) To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara, Waiman Long The percpu_rwsem_release() is called when the ownership of the embedded rwsem is to be transferred to another task. The new owner, however, may take a while to get the ownership of the lock via percpu_rwsem_acquire(). During that period, the rwsem is now marked as writer-owned with no optimistic spinning. Signed-off-by: Waiman Long <longman@redhat.com> --- include/linux/percpu-rwsem.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h index b1f37a8..dd37102 100644 --- a/include/linux/percpu-rwsem.h +++ b/include/linux/percpu-rwsem.h @@ -131,16 +131,16 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem, bool read, unsigned long ip) { lock_release(&sem->rw_sem.dep_map, 1, ip); -#ifdef CONFIG_RWSEM_SPIN_ON_OWNER if (!read) - sem->rw_sem.owner = NULL; -#endif + rwsem_set_writer_owned_nospin(&sem->rw_sem); } static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem, bool read, unsigned long ip) { lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip); + if (!read) + rwsem_set_writer_owned(&sem->rw_sem, current); } #endif -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long @ 2018-05-15 5:42 ` Amir Goldstein 2018-05-15 7:04 ` Amir Goldstein 2018-05-15 13:45 ` Waiman Long 2018-05-15 8:35 ` Peter Zijlstra 2018-05-15 8:51 ` Peter Zijlstra 2 siblings, 2 replies; 18+ messages in thread From: Amir Goldstein @ 2018-05-15 5:42 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Jan Kara On Mon, May 14, 2018 at 10:31 PM, Waiman Long <longman@redhat.com> wrote: > The percpu_rwsem_release() is called when the ownership of the embedded > rwsem is to be transferred to another task. The new owner, however, may > take a while to get the ownership of the lock via percpu_rwsem_acquire(). > During that period, the rwsem is now marked as writer-owned with no > optimistic spinning. > Waiman, Thanks for the fix. I will test it soon. For this commit message I suggest that you add parts of the reproducer found here: https://marc.info/?l=linux-fsdevel&m=152622016219975&w=2 Thanks, Amir. 
> Signed-off-by: Waiman Long <longman@redhat.com> > --- > include/linux/percpu-rwsem.h | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h > index b1f37a8..dd37102 100644 > --- a/include/linux/percpu-rwsem.h > +++ b/include/linux/percpu-rwsem.h > @@ -131,16 +131,16 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem, > bool read, unsigned long ip) > { > lock_release(&sem->rw_sem.dep_map, 1, ip); > -#ifdef CONFIG_RWSEM_SPIN_ON_OWNER > if (!read) > - sem->rw_sem.owner = NULL; > -#endif > + rwsem_set_writer_owned_nospin(&sem->rw_sem); > } > > static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem, > bool read, unsigned long ip) > { > lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip); > + if (!read) > + rwsem_set_writer_owned(&sem->rw_sem, current); > } > > #endif > -- > 1.8.3.1 > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 5:42 ` Amir Goldstein @ 2018-05-15 7:04 ` Amir Goldstein 2018-05-15 13:45 ` Waiman Long 1 sibling, 0 replies; 18+ messages in thread From: Amir Goldstein @ 2018-05-15 7:04 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Jan Kara On Tue, May 15, 2018 at 8:42 AM, Amir Goldstein <amir73il@gmail.com> wrote: > On Mon, May 14, 2018 at 10:31 PM, Waiman Long <longman@redhat.com> wrote: >> The percpu_rwsem_release() is called when the ownership of the embedded >> rwsem is to be transferred to another task. The new owner, however, may >> take a while to get the ownership of the lock via percpu_rwsem_acquire(). >> During that period, the rwsem is now marked as writer-owned with no >> optimistic spinning. >> > > Waiman, > > Thanks for the fix. I will test it soon. > > For this commit message I suggest that you add parts of the reproducer > found here: > https://marc.info/?l=linux-fsdevel&m=152622016219975&w=2 > fsfreeze is happy with these changes. You may add: Tested-by: Amir Goldstein <amir73il@gmail.com> Thanks, Amir. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 5:42 ` Amir Goldstein 2018-05-15 7:04 ` Amir Goldstein @ 2018-05-15 13:45 ` Waiman Long 1 sibling, 0 replies; 18+ messages in thread From: Waiman Long @ 2018-05-15 13:45 UTC (permalink / raw) To: Amir Goldstein Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Jan Kara On 05/15/2018 01:42 AM, Amir Goldstein wrote: > On Mon, May 14, 2018 at 10:31 PM, Waiman Long <longman@redhat.com> wrote: >> The percpu_rwsem_release() is called when the ownership of the embedded >> rwsem is to be transferred to another task. The new owner, however, may >> take a while to get the ownership of the lock via percpu_rwsem_acquire(). >> During that period, the rwsem is now marked as writer-owned with no >> optimistic spinning. >> > Waiman, > > Thanks for the fix. I will test it soon. > > For this commit message I suggest that you add parts of the reproducer > found here: > https://marc.info/?l=linux-fsdevel&m=152622016219975&w=2 > > Thanks, > Amir. Sure. I will add that to the commit log. Cheers, Longman ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long 2018-05-15 5:42 ` Amir Goldstein @ 2018-05-15 8:35 ` Peter Zijlstra 2018-05-15 9:00 ` Jan Kara 2018-05-15 8:51 ` Peter Zijlstra 2 siblings, 1 reply; 18+ messages in thread From: Peter Zijlstra @ 2018-05-15 8:35 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara On Mon, May 14, 2018 at 03:31:07PM -0400, Waiman Long wrote: > The percpu_rwsem_release() is called when the ownership of the embedded > rwsem is to be transferred to another task. The new owner, however, may > take a while to get the ownership of the lock via percpu_rwsem_acquire(). > During that period, the rwsem is now marked as writer-owned with no > optimistic spinning. This does not explain the problem sufficiently to even begin considering if the proposed solution is sensible. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 8:35 ` Peter Zijlstra @ 2018-05-15 9:00 ` Jan Kara 2018-05-15 11:33 ` Oleg Nesterov 0 siblings, 1 reply; 18+ messages in thread From: Jan Kara @ 2018-05-15 9:00 UTC (permalink / raw) To: Peter Zijlstra Cc: Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara On Tue 15-05-18 10:35:25, Peter Zijlstra wrote: > On Mon, May 14, 2018 at 03:31:07PM -0400, Waiman Long wrote: > > The percpu_rwsem_release() is called when the ownership of the embedded > > rwsem is to be transferred to another task. The new owner, however, may > > take a while to get the ownership of the lock via percpu_rwsem_acquire(). > > During that period, the rwsem is now marked as writer-owned with no > > optimistic spinning. > > This does not explain the problem sufficiently to even begin considering > if the proposed solution is sensible. So the original problem is the following: There is a percpu_rw_semaphore in super_block which is used to implement filesystem freezing (actually three of them, but that's not really relevant here). This semaphore is acquired for writing when a fs is frozen (i.e., in response to a syscall) and we return to userspace with this semaphore held. Later someone else calls another syscall to unfreeze the filesystem, which drops the semaphore. Now this behavior upsets lockdep and that's why we fool it by telling the semaphore got released before returning to userspace (through percpu_rwsem_release() helper) and similarly we tell lockdep we've got the semaphore when an unfreeze syscall is called by percpu_rwsem_acquire(). Now Amir has discovered that also rwsem debugging code gets confused by this behavior and previously also someone noticed that rwsem spinning does not make sense and can be broken by this behavior. So these patches from Waiman try to fix up all these problems... 
Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 9:00 ` Jan Kara @ 2018-05-15 11:33 ` Oleg Nesterov 0 siblings, 0 replies; 18+ messages in thread From: Oleg Nesterov @ 2018-05-15 11:33 UTC (permalink / raw) To: Jan Kara Cc: Peter Zijlstra, Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Amir Goldstein On 05/15, Jan Kara wrote: > > Now this behavior upsets lockdep and that's why we fool it by telling the > semaphore got released before returning to userspace (through > percpu_rwsem_release() helper) and similarly we tell lockdep we've got the > semaphore when an unfreeze syscall is called by percpu_rwsem_acquire(). Now > Amir has discovered that also rwsem debugging code gets confused by this > behavior Yes, plus someone else has already reported the problem a month ago, > and previously also someone noticed that rwsem spinning does not > make sense and can be broken by this behavior. Well, this doesn't really matter but again, freeze_super() checks frozen == SB_UNFROZEN under sb->s_umount and only then does sb_wait_write(), when the previous writer has already released this lock. So the new writer will never spin after lockdep_sb_freeze_release() clears ->owner. Oleg. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long 2018-05-15 5:42 ` Amir Goldstein 2018-05-15 8:35 ` Peter Zijlstra @ 2018-05-15 8:51 ` Peter Zijlstra 2018-05-15 11:06 ` Oleg Nesterov 2018-05-15 13:57 ` Waiman Long 2 siblings, 2 replies; 18+ messages in thread From: Peter Zijlstra @ 2018-05-15 8:51 UTC (permalink / raw) To: Waiman Long Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara On Mon, May 14, 2018 at 03:31:07PM -0400, Waiman Long wrote: > The percpu_rwsem_release() is called when the ownership of the embedded > rwsem is to be transferred to another task. The new owner, however, may > take a while to get the ownership of the lock via percpu_rwsem_acquire(). > During that period, the rwsem is now marked as writer-owned with no > optimistic spinning. > > Signed-off-by: Waiman Long <longman@redhat.com> > --- > include/linux/percpu-rwsem.h | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h > index b1f37a8..dd37102 100644 > --- a/include/linux/percpu-rwsem.h > +++ b/include/linux/percpu-rwsem.h > @@ -131,16 +131,16 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem, > bool read, unsigned long ip) > { > lock_release(&sem->rw_sem.dep_map, 1, ip); > -#ifdef CONFIG_RWSEM_SPIN_ON_OWNER > if (!read) > - sem->rw_sem.owner = NULL; > -#endif > + rwsem_set_writer_owned_nospin(&sem->rw_sem); > } > > static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem, > bool read, unsigned long ip) > { > lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip); > + if (!read) > + rwsem_set_writer_owned(&sem->rw_sem, current); > } So what's wrong with adding: if (!read) sem->rw_sem.owner = current; ? 
Afaict the whole .owner=NULL thing in release already stops the spinners dead, and the above 'fixes' the debug splat. And this avoids exposing that horrible interface and keeps the mucking private to rwsem/percpu_rwsem. ^ permalink raw reply [flat|nested] 18+ messages in thread
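To make the suggestion concrete, here is a small userspace model of the hand-off as far as the debug check is concerned. It is only a sketch: struct rwsem_model and the three function names are hypothetical, only the owner field is modelled, and the spinning question discussed elsewhere in the thread is deliberately left out. The warning condition is the one on the '-' line of up_write() in patch 1/2, sem->owner != current.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's task_struct; only its address matters here. */
struct task_struct {
	int pid;
};

/* Only the owner field of struct rw_semaphore is modelled. */
struct rwsem_model {
	struct task_struct *owner;
};

/* Mirrors the existing percpu_rwsem_release(): a writer hand-off clears owner. */
static void model_release(struct rwsem_model *sem, bool read)
{
	if (!read)
		sem->owner = NULL;
}

/* Mirrors percpu_rwsem_acquire() with the two suggested lines added. */
static void model_acquire(struct rwsem_model *sem, bool read,
			  struct task_struct *curr)
{
	if (!read)
		sem->owner = curr;
}

/* The condition DEBUG_RWSEMS_WARN_ON() tests in the unpatched up_write(). */
static bool up_write_would_warn(const struct rwsem_model *sem,
				const struct task_struct *curr)
{
	return sem->owner != curr;
}
```

In this model, a thawing task that calls up_write() right after the freezer's release would trip the check, because owner is NULL; once the acquire step records the thawing task as owner, the check passes.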
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 8:51 ` Peter Zijlstra @ 2018-05-15 11:06 ` Oleg Nesterov 2018-05-15 11:51 ` Peter Zijlstra 2018-05-15 13:57 ` Waiman Long 1 sibling, 1 reply; 18+ messages in thread From: Oleg Nesterov @ 2018-05-15 11:06 UTC (permalink / raw) To: Peter Zijlstra Cc: Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Amir Goldstein, Jan Kara On 05/15, Peter Zijlstra wrote: > > So what's wrong with adding: > > if (!read) > sem->rw_sem.owner = current; Agreed, I have already suggested this change twice. Except we obviously need to check CONFIG_RWSEM_SPIN_ON_OWNER (->owner doesn't exist otherwise) or even CONFIG_DEBUG_RWSEMS to make the purpose more clear. > Afaict the whole .owner=NULL thing in release already stops the spinners Not really, the new writer will spin in this case, afaics. But this is another problem and probably we do not care. The new writer is almost impossible in this particular case, another freeze_super() should notice frozen != SB_UNFROZEN and return EBUSY. > and the above 'fixes' the debug splat. Yes. Waiman, can't we trivially fix the problem first? Then we can add the helpers and think about other improvements. Oleg. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 11:06 ` Oleg Nesterov @ 2018-05-15 11:51 ` Peter Zijlstra 2018-05-15 12:45 ` Oleg Nesterov 0 siblings, 1 reply; 18+ messages in thread From: Peter Zijlstra @ 2018-05-15 11:51 UTC (permalink / raw) To: Oleg Nesterov Cc: Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Amir Goldstein, Jan Kara On Tue, May 15, 2018 at 01:06:33PM +0200, Oleg Nesterov wrote: > On 05/15, Peter Zijlstra wrote: > > > > So what's wrong with adding: > > > > if (!read) > > sem->rw_sem.owner = current; > > Agreed, I have already suggested this change twice. Except we obviously > need to check CONFIG_RWSEM_SPIN_ON_OWNER (->owner doesn't exists otherwise) > or even CONFIG_DEBUG_RWSEMS to make the purpose more clear. Right, details ;-) > > Afaict the whole .owner=NULL thing in release already stops the spinners > > Not really, the new writer will spin in this case, afaics. > > But this is another problem and probably we do not care. The new writer is > almost impossible in this particular case, another freeze_super() should > notice frozen != SB_UNFROZEN and return EBUSY. rwsem_spin_on_owner() checks rwsem_owner_is_writer(), which does owner && owner != RWSEM_READER_OWNED, which will fail for !owner. Or am I completely confused again? > > and the above 'fixes' the debug splat. > > Yes. > > Waiman, can't we trivially fix the problem first? Then we can add the helpers > and think about other improvements. It is really simple; we're not going to add public (and EXPORT'ed to boot) interfaces to rwsem for this. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 11:51 ` Peter Zijlstra @ 2018-05-15 12:45 ` Oleg Nesterov 2018-05-15 12:58 ` Peter Zijlstra 0 siblings, 1 reply; 18+ messages in thread From: Oleg Nesterov @ 2018-05-15 12:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Amir Goldstein, Jan Kara On 05/15, Peter Zijlstra wrote: > > > > Afaict the whole .owner=NULL thing in release already stops the spinners > > > > Not really, the new writer will spin in this case, afaics. > > > > But this is another problem and probably we do not care. The new writer is > > almost impossible in this particular case, another freeze_super() should > > notice frozen != SB_UNFROZEN and return EBUSY. > > rwsem_spin_on_owner() checks rwsem_owner_is_writer(), which does owner > && owner != RWSEM_READER_OWNED, which will fail for !owner. Yep. So rwsem_spin_on_owner() goes to "out:" and returns !rwsem_owner_is_reader() == T. IOW, afaics owner == NULL means "spin unconditionally", I guess this is for the case when the new writer is going to do rwsem_set_owner() or up_write() has already called rwsem_clear_owner() but didn't do up_write() yet. Probably makes sense, but the code is not very clean, > Or am I completely confused again? Or me, I am not sure. Oleg. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() 2018-05-15 12:45 ` Oleg Nesterov @ 2018-05-15 12:58 ` Peter Zijlstra 0 siblings, 0 replies; 18+ messages in thread From: Peter Zijlstra @ 2018-05-15 12:58 UTC (permalink / raw) To: Oleg Nesterov Cc: Waiman Long, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Amir Goldstein, Jan Kara On Tue, May 15, 2018 at 02:45:32PM +0200, Oleg Nesterov wrote: > On 05/15, Peter Zijlstra wrote: > > > > > > Afaict the whole .owner=NULL thing in release already stops the spinners > > > > > > Not really, the new writer will spin in this case, afaics. > > > > > > But this is another problem and probably we do not care. The new writer is > > > almost impossible in this particular case, another freeze_super() should > > > notice frozen != SB_UNFROZEN and return EBUSY. > > > > rwsem_spin_on_owner() checks rwsem_owner_is_writer(), which does owner > > && owner != RWSEM_READER_OWNED, which will fail for !owner. > > Yep. So rwsem_spin_on_owner() goes to "out:" and returns > !rwsem_owner_is_reader() == T. > > IOW, afaics owner == NULL means "spin unconditionally", I guess this is for > the case when the new writer is going to do rwsem_set_owner() or up_write() > has already called rwsem_clear_owner() but didn't do up_write() yet. > > Probably makes sense, but the code is not very clean, Arrgh, you're right... I hate this rwsem code. Some day I'll finish the atomic_long_t version, which similar to mutex, merges the owner and 'count' fields. ^ permalink raw reply [flat|nested] 18+ messages in thread
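The point settled in this exchange can be checked with a tiny userspace model of the early-exit path. This is a sketch only: the two helpers are copied from the '-' lines of patch 1/2, early_exit_keeps_spinning() is a hypothetical name, the loop that watches a running writer is omitted, and the model assumes sem->owner does not change between the initial read and the re-read at "out:".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's task_struct; only its address matters here. */
struct task_struct {
	int pid;
};

#define RWSEM_READER_OWNED ((struct task_struct *)1UL)

/* The two helpers as they appear on the '-' lines of patch 1/2. */
static bool rwsem_owner_is_writer(const struct task_struct *owner)
{
	return owner && owner != RWSEM_READER_OWNED;
}

static bool rwsem_owner_is_reader(const struct task_struct *owner)
{
	return owner == RWSEM_READER_OWNED;
}

/*
 * Models only the "goto out" path of the pre-patch rwsem_spin_on_owner().
 * Returns true when the caller would keep spinning.
 */
static bool early_exit_keeps_spinning(const struct task_struct *owner)
{
	/* Only owners that fail the writer test take the early exit. */
	assert(!rwsem_owner_is_writer(owner));
	return !rwsem_owner_is_reader(owner);
}
```

With owner == NULL the model returns true, matching Oleg's reading that a cleared owner means "keep spinning"; only RWSEM_READER_OWNED makes it return false.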
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release()
  2018-05-15 8:51 ` Peter Zijlstra
  2018-05-15 11:06 ` Oleg Nesterov
@ 2018-05-15 13:57 ` Waiman Long
  2018-05-15 14:00 ` Matthew Wilcox
  1 sibling, 1 reply; 18+ messages in thread
From: Waiman Long @ 2018-05-15 13:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara

On 05/15/2018 04:51 AM, Peter Zijlstra wrote:
> On Mon, May 14, 2018 at 03:31:07PM -0400, Waiman Long wrote:
>> The percpu_rwsem_release() is called when the ownership of the embedded
>> rwsem is to be transferred to another task. The new owner, however, may
>> take a while to get the ownership of the lock via percpu_rwsem_acquire().
>> During that period, the rwsem is now marked as writer-owned with no
>> optimistic spinning.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>  include/linux/percpu-rwsem.h | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
>> index b1f37a8..dd37102 100644
>> --- a/include/linux/percpu-rwsem.h
>> +++ b/include/linux/percpu-rwsem.h
>> @@ -131,16 +131,16 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
>>                                          bool read, unsigned long ip)
>>  {
>>          lock_release(&sem->rw_sem.dep_map, 1, ip);
>> -#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
>>          if (!read)
>> -                sem->rw_sem.owner = NULL;
>> -#endif
>> +                rwsem_set_writer_owned_nospin(&sem->rw_sem);
>>  }
>>
>>  static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem,
>>                                          bool read, unsigned long ip)
>>  {
>>          lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip);
>> +        if (!read)
>> +                rwsem_set_writer_owned(&sem->rw_sem, current);
>>  }
> So what's wrong with adding:
>
>         if (!read)
>                 sem->rw_sem.owner = current;
>
> ?

Yes, we can certainly do that within a "#ifdef" block.

>
> Afaict the whole .owner=NULL thing in release already stops the spinners
> dead, and the above 'fixes' the debug splat. And this avoids exposing
> that horrible interface and keeps the mucking private to
> rwsem/percpu_rwsem.

Actually, setting owner to NULL does not stop spinning. The code just
assumes that the lock is going to be freed and spins in the outer loop. We
need some special value to indicate that spinning should be stopped. How
about just exposing a special value for that in linux/rwsem.h? Any
suggestion for a good name?

Cheers,
Longman

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release()
  2018-05-15 13:57 ` Waiman Long
@ 2018-05-15 14:00 ` Matthew Wilcox
  0 siblings, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2018-05-15 14:00 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, linux-kernel, linux-fsdevel, Davidlohr Bueso, Theodore Y. Ts'o, Oleg Nesterov, Amir Goldstein, Jan Kara

On Tue, May 15, 2018 at 09:57:44AM -0400, Waiman Long wrote:
> > Afaict the whole .owner=NULL thing in release already stops the spinners
> > dead, and the above 'fixes' the debug splat. And this avoids exposing
> > that horrible interface and keeps the mucking private to
> > rwsem/percpu_rwsem.
>
> Actually, setting owner to NULL does not stop spinning. The code just
> assumes that the lock is going to be freed and spins in the outer loop. We
> need some special value to indicate that spinning should be stopped. How
> about just exposing a special value for that in linux/rwsem.h? Any
> suggestion for a good name?

RWSEM_NO_OWNER

^ permalink raw reply [flat|nested] 18+ messages in thread
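To make the proposal concrete, here is a hedged userspace sketch of how a dedicated "no owner, do not spin" value could behave across a release/acquire handoff. Only the name RWSEM_NO_OWNER comes from the thread; the value chosen for it, the `model_` helpers, and the shape of the spin check are hypothetical and do not reflect any merged kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a task. */
struct model_task {
	bool on_cpu;
};

/* Assumed sentinel encodings; the thread does not settle on values. */
#define MODEL_READER_OWNED ((struct model_task *)(uintptr_t)1)
#define MODEL_NO_OWNER     ((struct model_task *)(uintptr_t)2)

struct model_rwsem {
	struct model_task *owner;
};

/* Releasing side of the handoff: mark the lock as held but ownerless. */
static void model_release(struct model_rwsem *sem)
{
	sem->owner = MODEL_NO_OWNER;
}

/* Acquiring side of the handoff: the new task records itself as owner. */
static void model_acquire(struct model_rwsem *sem, struct model_task *self)
{
	sem->owner = self;
}

/* Spin decision with the extra sentinel taken into account. */
static bool model_can_spin(const struct model_rwsem *sem)
{
	const struct model_task *owner = sem->owner;

	if (owner == MODEL_NO_OWNER || owner == MODEL_READER_OWNED)
		return false;	/* nothing useful to spin on */
	if (owner == NULL)
		return true;	/* lock assumed to be freed or taken soon */
	return owner->on_cpu;	/* spin only while the writer is running */
}
```

In this model a waiter that arrives between model_release() and model_acquire() sees the sentinel and goes to sleep instead of spinning, which is the behaviour Waiman asks for during the freeze/thaw handoff.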
end of thread, other threads:[~2018-05-15 14:00 UTC | newest]

Thread overview: 18+ messages
2018-05-14 19:31 [RFC PATCH v2 0/2] locking/rwsem: Fix DEBUG_RWSEM warning from thaw_sup Waiman Long
2018-05-14 19:31 ` [RFC PATCH v2 1/2] locking/rwsem: Add a new RWSEM_WRITER_OWNED_NOSPIN flag Waiman Long
2018-05-15 6:59 ` Amir Goldstein
2018-05-15 8:25 ` Peter Zijlstra
2018-05-14 19:31 ` [RFC PATCH v2 2/2] locking/percpu-rwsem: Mark rwsem as non-spinnable in percpu_rwsem_release() Waiman Long
2018-05-15 5:42 ` Amir Goldstein
2018-05-15 7:04 ` Amir Goldstein
2018-05-15 13:45 ` Waiman Long
2018-05-15 8:35 ` Peter Zijlstra
2018-05-15 9:00 ` Jan Kara
2018-05-15 11:33 ` Oleg Nesterov
2018-05-15 8:51 ` Peter Zijlstra
2018-05-15 11:06 ` Oleg Nesterov
2018-05-15 11:51 ` Peter Zijlstra
2018-05-15 12:45 ` Oleg Nesterov
2018-05-15 12:58 ` Peter Zijlstra
2018-05-15 13:57 ` Waiman Long
2018-05-15 14:00 ` Matthew Wilcox