On Thu, 17 Oct 2013 11:23:49 +0200 Peter Zijlstra wrote:

> On Thu, Oct 17, 2013 at 12:28:59PM +1100, NeilBrown wrote:
> > I always run with lockdep enabled, and I have done at least basic testing
>
> Very good!
>
> > > Stuff like:
> > >
> > > +	for (i = 0; i < NR_STRIPE_HASH_LOCKS; i++)
> > > +		spin_lock_init(conf->hash_locks + i);
> > >
> > > And:
> > >
> > > +static void __lock_all_hash_locks(struct r5conf *conf)
> > > +{
> > > +	int i;
> > > +	for (i = 0; i < NR_STRIPE_HASH_LOCKS; i++)
> > > +		spin_lock(conf->hash_locks + i);
> > > +}
> > >
> > > Tends to complain real loud.
> >
> > Why is that?
> > Because "conf->hash_locks + i" gets used as the "name" of the lockdep
> > map for each one, and when they are all locked it looks like nested
> > locking??
>
> Exactly so; they all share the same class (and name) because they have
> the same init site, so indeed the multiple acquisitions will look like
> nested locking.
>
> > Do you have a suggestion for how to make this work?
> > Would
> >     spin_lock_nested(conf->hash_locks + i, i)
> > do the trick?
>
> 	spin_lock_nest_lock(conf->hash_locks + i, &conf->device_lock);

Unfortunately this doesn't work, as the order is backwards: hash_lock
is taken first, then (when necessary) device_lock.  (hash_lock is
needed more often, so we split it up to reduce contention; device_lock
is needed less often, but is sometimes taken while hash_lock is held.)

I've currently got:

	spin_lock_init(conf->hash_locks);
	for (i = 1; i < NR_STRIPE_HASH_LOCKS; i++)
		spin_lock_init(conf->hash_locks + i);

and

	spin_lock(conf->hash_locks);
	for (i = 1; i < NR_STRIPE_HASH_LOCKS; i++)
		spin_lock_nest_lock(conf->hash_locks + i, conf->hash_locks);
	spin_lock(&conf->device_lock);

which doesn't trigger any lockdep warnings and isn't too ugly.
Does it seem OK to you?

Thanks,
NeilBrown

> Would be the better option; your suggestion might just work because
> NR_STRIPE_HASH_LOCKS is 8 and we have exactly 8 subclasses available,
> but any increase to NR_STRIPE_HASH_LOCKS will make things explode
> again.
>
> The spin_lock_nest_lock() annotation tells lockdep that the lock order
> is irrelevant because all such multiple acquisitions are serialized
> under the other lock.
>
> Also, if in future you feel the need to increase NR_STRIPE_HASH_LOCKS,
> please keep it <= 64 or so; if you have a need to go above that, please
> yell and we'll see if we can do something smarter.

I've added a comment to this effect in the code.

> This is because of:
>  - each spin_lock() increases preempt_count, and that's 8 bits; we
>    wouldn't want to overflow that
>  - each consecutive nested spin_lock() increases the total acquisition
>    wait-time for all locks.  Note that the worst-case acquisition time
>    for even a single hash lock is gated by the complete acquisition
>    time of all of them in this scenario.
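
To make the final arrangement concrete, here is a self-contained sketch
of the pattern Neil settled on, against a simplified stand-in for
struct r5conf.  The struct and the helper names (conf_sketch,
init_locks, lock_all, unlock_all) are illustrative only, not the actual
drivers/md/raid5.c code:

	/*
	 * Sketch only: a simplified stand-in for struct r5conf from
	 * drivers/md/raid5.c, showing the lockdep annotation pattern.
	 */
	#include <linux/spinlock.h>

	#define NR_STRIPE_HASH_LOCKS 8	/* keep <= 64; see above */

	struct conf_sketch {
		spinlock_t hash_locks[NR_STRIPE_HASH_LOCKS];
		spinlock_t device_lock;
	};

	static void init_locks(struct conf_sketch *conf)
	{
		int i;

		for (i = 0; i < NR_STRIPE_HASH_LOCKS; i++)
			spin_lock_init(conf->hash_locks + i);
		spin_lock_init(&conf->device_lock);
	}

	/*
	 * Take hash_locks[0] first; every whole-array acquisition is
	 * serialized under it, so spin_lock_nest_lock() can tell lockdep
	 * that the ordering of the remaining hash locks is irrelevant.
	 * device_lock comes last, matching the hash_lock -> device_lock
	 * order used everywhere else.
	 */
	static void lock_all(struct conf_sketch *conf)
	{
		int i;

		spin_lock(conf->hash_locks);
		for (i = 1; i < NR_STRIPE_HASH_LOCKS; i++)
			spin_lock_nest_lock(conf->hash_locks + i,
					    conf->hash_locks);
		spin_lock(&conf->device_lock);
	}

	static void unlock_all(struct conf_sketch *conf)
	{
		int i;

		spin_unlock(&conf->device_lock);
		for (i = NR_STRIPE_HASH_LOCKS - 1; i > 0; i--)
			spin_unlock(conf->hash_locks + i);
		spin_unlock(conf->hash_locks);
	}

The key difference from spin_lock_nested() is that this does not burn a
subclass per lock, so it is not capped at the 8 available subclasses;
serializing every whole-array acquisition under hash_locks[0] is what
makes the annotation valid for any NR_STRIPE_HASH_LOCKS.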