linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem
@ 2016-09-06  1:58 Andreas Mohr
  2016-09-06  8:23 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Mohr @ 2016-09-06  1:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: der.herr, Al Viro, mingo, torvalds, dave, Oleg Nesterov, riel,
	tj, paulmck, linux-kernel

Hi,

[no properly binding reference via In-Reply-To: available thus manually re-creating, sorry]

https://lkml.org/lkml/2016/9/5/832


Two thoughts:

***multiple locks

Don't have much insight into this
(didn't spend much thinking on this),
but of course it's unfortunate
that two lock types need to be serviced
rather than having the atomic handling exchange area/section
be sufficiently guarded by one lock only
(the question here could possibly be:
what kind of currently existing
structural disadvantage/layer distribution
prevents us from being able to
have things simply serviced
within one granular lock area only?)


***lock handling

> +	percpu_down_read(&file_rwsem);
>  	spin_lock(&ctx->flc_lock);


>  	spin_unlock(&ctx->flc_lock);
> +	percpu_up_read(&file_rwsem);


These are repeated multiple times in this commit, thus error-prone.

A possibly good way to commit-micro-manage this would be:
1. commit shoves things into a newly created encapsulation/wrapper helper
   stuff_lock(&flc_lock);  /* <---- naming surely can be improved here */
2. [this commit]
   extend encapsulation/wrapper helper
   to be servicing file_rwsem, too:
   stuff_lock(&flc_lock, &file_rwsem);

That way it is pretty much guaranteed that:
a) neither one nor the other lock type
   will get forgotten later, at a specific use site
b) lock order will be correctly maintained at all times
   (AB-BA deadlock......)


[or, IOW, you get some symmetry/consistency malus points
for having provided
a relatively large amount of LOC extensions
which put certain amounts of stress on
code implementation quality/maintainability ;)]

HTH,

Andreas Mohr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem
  2016-09-06  1:58 [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem Andreas Mohr
@ 2016-09-06  8:23 ` Peter Zijlstra
  2016-09-06  8:36   ` Andreas Mohr
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2016-09-06  8:23 UTC (permalink / raw)
  To: Andreas Mohr
  Cc: der.herr, Al Viro, mingo, torvalds, dave, Oleg Nesterov, riel,
	tj, paulmck, linux-kernel

On Tue, Sep 06, 2016 at 03:58:00AM +0200, Andreas Mohr wrote:
> Hi,
> 
> [no properly binding reference via In-Reply-To: available thus manually re-creating, sorry]
> 
> https://lkml.org/lkml/2016/9/5/832

Subscribe to lkml already.. also lkml.org is near useless these days,
please use any other archive.

> Two thoughts:
> 
> ***multiple locks
> 
> Don't have much insight into this
> (didn't spend much thinking on this),
> but of course it's unfortunate
> that two lock types need to be serviced
> rather than having the atomic handling exchange area/section
> be sufficiently guarded by one lock only
> (the question here could possibly be:
> what kind of currently existing
> structural disadvantage/layer distribution
> prevents us from being able to
> have things simply serviced
> within one granular lock area only?)

percpu rwsem is a sleeping lock, flc_flock is not. flc_flock also nests
under other non-sleeping locks, and therefore cannot (trivially) be made
a sleeping lock.

This is in the Changelog.

Furthermore, in the fast path of percpu_{down,up}_read() are exactly 0
atomic/serializing instructions.

> ***lock handling
> 
> > +	percpu_down_read(&file_rwsem);
> >  	spin_lock(&ctx->flc_lock);
> 
> 
> >  	spin_unlock(&ctx->flc_lock);
> > +	percpu_up_read(&file_rwsem);
> 
> 
> These are repeated multiple times in this commit, thus error-prone.
> 
> A possibly good way to commit-micro-manage this would be:
> 1. commit shoves things into a newly created encapsulation/wrapper helper
>    stuff_lock(&flc_lock);  /* <---- naming surely can be improved here */

No, because not all instances of flc_flock need the percpu-rwsem held,
creating such a wrapper could mistakenly create the impression it
should.

> That way it is pretty much guaranteed that:
> a) neither one nor the other lock type
>    will get forgotten later, at a specific use site

That's what we have lockdep_assert_held() for. Note how this patch also
introduces percpu_rwsem_assert_held() usage, this insures we shall never
'forget' the appropriate locking.

> b) lock order will be correctly maintained at all times
>    (AB-BA deadlock......)

lockdep is rather good at finding those, also might_sleep() debugging
would trivially catch those since percpu-rwsem is a sleeping lock while
fcl_flock is not.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem
  2016-09-06  8:23 ` Peter Zijlstra
@ 2016-09-06  8:36   ` Andreas Mohr
  2016-09-06  8:59     ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Mohr @ 2016-09-06  8:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andreas Mohr, der.herr, Al Viro, mingo, torvalds, dave,
	Oleg Nesterov, riel, tj, paulmck, linux-kernel

On Tue, Sep 06, 2016 at 10:23:28AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2016 at 03:58:00AM +0200, Andreas Mohr wrote:
> > Hi,
> > 
> > [no properly binding reference via In-Reply-To: available thus manually re-creating, sorry]
> > 
> > https://lkml.org/lkml/2016/9/5/832
> 
> Subscribe to lkml already.. also lkml.org is near useless these days,
> please use any other archive.

I cannot count the number of times that I did so
(sometimes directly failing, sometimes getting dropped after few weeks).
Hopefully with that other address things will now work out ok...


There might be further archives (other than gmane.org or some such),
however so far lkml.org seemed to be quite ok
and with IMHO better usability than some others
(although it has degraded a lot indeed in recent times,
judging from
many server/connection issues and
not listing details any more due to SPAM etc.).


> > A possibly good way to commit-micro-manage this would be:
> > 1. commit shoves things into a newly created encapsulation/wrapper helper
> >    stuff_lock(&flc_lock);  /* <---- naming surely can be improved here */
> 
> No, because not all instances of flc_flock need the percpu-rwsem held,
> creating such a wrapper could mistakenly create the impression it
> should.

OK, that aspect sounds valid.
However with a helper appropriately named to be focussing on
that use case (protecting that section communication),
it might be less of a concern.


> lockdep is rather good at finding those, also might_sleep() debugging
> would trivially catch those since percpu-rwsem is a sleeping lock while
> fcl_flock is not.

So perhaps the only thing left to argue on my side is
that this way it would be
n+1 vs. n protection mechanisms ;)
(with some of those n+1 mechanisms
known to be failing one time or another...)

Thanks,

Andreas Mohr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem
  2016-09-06  8:36   ` Andreas Mohr
@ 2016-09-06  8:59     ` Peter Zijlstra
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2016-09-06  8:59 UTC (permalink / raw)
  To: Andreas Mohr
  Cc: der.herr, Al Viro, mingo, torvalds, dave, Oleg Nesterov, riel,
	tj, paulmck, linux-kernel

On Tue, Sep 06, 2016 at 10:36:01AM +0200, Andreas Mohr wrote:
> There might be further archives (other than gmane.org or some such),
> however so far lkml.org seemed to be quite ok
> and with IMHO better usability than some others

Agreed, in that I liked the interface best too, however:

> (although it has degraded a lot indeed in recent times,
> judging from
> many server/connection issues and
> not listing details any more due to SPAM etc.).

this has gotten to the point where it often simply doesn't show messages
anymore, threads are incomplete etc..

> > > A possibly good way to commit-micro-manage this would be:
> > > 1. commit shoves things into a newly created encapsulation/wrapper helper
> > >    stuff_lock(&flc_lock);  /* <---- naming surely can be improved here */
> > 
> > No, because not all instances of flc_flock need the percpu-rwsem held,
> > creating such a wrapper could mistakenly create the impression it
> > should.
> 
> OK, that aspect sounds valid.
> However with a helper appropriately named to be focussing on
> that use case (protecting that section communication),
> it might be less of a concern.

Dunno, I'd struggling to come up with a sensible name for such a
construct. I'll leave that to others. If it really is desired we can
always do so later.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem
  2016-09-05 19:40 [PATCH 0/7] perpcu rwsem, fs/locks and killing lglocks Peter Zijlstra
@ 2016-09-05 19:40 ` Peter Zijlstra
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2016-09-05 19:40 UTC (permalink / raw)
  To: oleg, paulmck, tj, mingo, linux-kernel, der.herr, peterz, dave,
	riel, viro, torvalds, wagi

[-- Attachment #1: peter_zijlstra-fs_locks-replace_lg_global_with_a_percpu-rwsem.patch --]
[-- Type: text/plain, Size: 5292 bytes --]

Replace the global part of the lglock with a percpu-rwsem.

Since fcl_lock is a spinlock and itself nests under i_lock, which too
is a spinlock we cannot acquire sleeping locks at
locks_{insert,remove}_global_locks().

We can however wrap all fcl_lock acquisitions with percpu_down_read
such that all invocations of locks_{insert,remove}_global_locks() have
that read lock held.

This allows us to replace the lg_global part of the lglock with the
write side of the rwsem.

In the absense of writers, percpu_{down,up}_read() are free of atomic
instructions. This further avoids the very long preempt-disable
regions caused by lglock on larger machines.

Cc: der.herr@hofr.at
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: mingo@redhat.com
Cc: torvalds@linux-foundation.org
Cc: dave@stgolabs.net
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: riel@redhat.com
Cc: tj@kernel.org
Cc: paulmck@linux.vnet.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 fs/locks.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

--- a/fs/locks.c
+++ b/fs/locks.c
@@ -164,6 +164,7 @@ int lease_break_time = 45;
  */
 DEFINE_STATIC_LGLOCK(file_lock_lglock);
 static DEFINE_PER_CPU(struct hlist_head, file_lock_list);
+DEFINE_STATIC_PERCPU_RWSEM(file_rwsem);
 
 /*
  * The blocked_hash is used to find POSIX lock loops for deadlock detection.
@@ -587,6 +588,8 @@ static int posix_same_owner(struct file_
 /* Must be called with the flc_lock held! */
 static void locks_insert_global_locks(struct file_lock *fl)
 {
+	percpu_rwsem_assert_held(&file_rwsem);
+
 	lg_local_lock(&file_lock_lglock);
 	fl->fl_link_cpu = smp_processor_id();
 	hlist_add_head(&fl->fl_link, this_cpu_ptr(&file_lock_list));
@@ -596,6 +599,8 @@ static void locks_insert_global_locks(st
 /* Must be called with the flc_lock held! */
 static void locks_delete_global_locks(struct file_lock *fl)
 {
+	percpu_rwsem_assert_held(&file_rwsem);
+
 	/*
 	 * Avoid taking lock if already unhashed. This is safe since this check
 	 * is done while holding the flc_lock, and new insertions into the list
@@ -915,6 +920,7 @@ static int flock_lock_inode(struct inode
 			return -ENOMEM;
 	}
 
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	if (request->fl_flags & FL_ACCESS)
 		goto find_conflict;
@@ -955,6 +961,7 @@ static int flock_lock_inode(struct inode
 
 out:
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
 	if (new_fl)
 		locks_free_lock(new_fl);
 	locks_dispose_list(&dispose);
@@ -991,6 +998,7 @@ static int posix_lock_inode(struct inode
 		new_fl2 = locks_alloc_lock();
 	}
 
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	/*
 	 * New lock request. Walk all POSIX locks and look for conflicts. If
@@ -1162,6 +1170,7 @@ static int posix_lock_inode(struct inode
 	}
  out:
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
 	/*
 	 * Free any unused locks.
 	 */
@@ -1436,6 +1445,7 @@ int __break_lease(struct inode *inode, u
 		return error;
 	}
 
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 
 	time_out_leases(inode, &dispose);
@@ -1487,9 +1497,13 @@ int __break_lease(struct inode *inode, u
 	locks_insert_block(fl, new_fl);
 	trace_break_lease_block(inode, new_fl);
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
+
 	locks_dispose_list(&dispose);
 	error = wait_event_interruptible_timeout(new_fl->fl_wait,
 						!new_fl->fl_next, break_time);
+
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	trace_break_lease_unblock(inode, new_fl);
 	locks_delete_block(new_fl);
@@ -1506,6 +1520,7 @@ int __break_lease(struct inode *inode, u
 	}
 out:
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
 	locks_dispose_list(&dispose);
 	locks_free_lock(new_fl);
 	return error;
@@ -1660,6 +1675,7 @@ generic_add_lease(struct file *filp, lon
 		return -EINVAL;
 	}
 
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	time_out_leases(inode, &dispose);
 	error = check_conflicting_open(dentry, arg, lease->fl_flags);
@@ -1730,6 +1746,7 @@ generic_add_lease(struct file *filp, lon
 		lease->fl_lmops->lm_setup(lease, priv);
 out:
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
 	locks_dispose_list(&dispose);
 	if (is_deleg)
 		inode_unlock(inode);
@@ -1752,6 +1769,7 @@ static int generic_delete_lease(struct f
 		return error;
 	}
 
+	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
 		if (fl->fl_file == filp &&
@@ -1764,6 +1782,7 @@ static int generic_delete_lease(struct f
 	if (victim)
 		error = fl->fl_lmops->lm_change(victim, F_UNLCK, &dispose);
 	spin_unlock(&ctx->flc_lock);
+	percpu_up_read(&file_rwsem);
 	locks_dispose_list(&dispose);
 	return error;
 }
@@ -2703,6 +2722,7 @@ static void *locks_start(struct seq_file
 	struct locks_iterator *iter = f->private;
 
 	iter->li_pos = *pos + 1;
+	percpu_down_write(&file_rwsem);
 	lg_global_lock(&file_lock_lglock);
 	spin_lock(&blocked_lock_lock);
 	return seq_hlist_start_percpu(&file_lock_list, &iter->li_cpu, *pos);
@@ -2721,6 +2741,7 @@ static void locks_stop(struct seq_file *
 {
 	spin_unlock(&blocked_lock_lock);
 	lg_global_unlock(&file_lock_lglock);
+	percpu_up_write(&file_rwsem);
 }
 
 static const struct seq_operations locks_seq_operations = {

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-09-06  8:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-06  1:58 [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem Andreas Mohr
2016-09-06  8:23 ` Peter Zijlstra
2016-09-06  8:36   ` Andreas Mohr
2016-09-06  8:59     ` Peter Zijlstra
  -- strict thread matches above, loose matches on Subject: below --
2016-09-05 19:40 [PATCH 0/7] perpcu rwsem, fs/locks and killing lglocks Peter Zijlstra
2016-09-05 19:40 ` [PATCH 2/7] fs/locks: Replace lg_global with a percpu-rwsem Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).