On Fri, Feb 23, 2018 at 12:55:20PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 22, 2018 at 03:08:51PM +0800, Boqun Feng wrote:
> > @@ -1012,6 +1013,33 @@ static inline bool bfs_error(enum bfs_result res)
> >  	return res < 0;
> >  }
> >  
> > +#define DEP_NN_BIT 0
> > +#define DEP_RN_BIT 1
> > +#define DEP_NR_BIT 2
> > +#define DEP_RR_BIT 3
> > +
> > +#define DEP_NN_MASK (1U << (DEP_NN_BIT))
> > +#define DEP_RN_MASK (1U << (DEP_RN_BIT))
> > +#define DEP_NR_MASK (1U << (DEP_NR_BIT))
> > +#define DEP_RR_MASK (1U << (DEP_RR_BIT))
> > +
> > +static inline unsigned int __calc_dep_bit(int prev, int next)
> > +{
> > +	if (prev == 2 && next != 2)
> > +		return DEP_RN_BIT;
> > +	if (prev != 2 && next == 2)
> > +		return DEP_NR_BIT;
> > +	if (prev == 2 && next == 2)
> > +		return DEP_RR_BIT;
> > +	else
> > +		return DEP_NN_BIT;
> > +}
> > +
> > +static inline unsigned int calc_dep(int prev, int next)
> > +{
> > +	return 1U << __calc_dep_bit(prev, next);
> > +}
> > +
> >  static enum bfs_result __bfs(struct lock_list *source_entry,
> >  			     void *data,
> >  			     int (*match)(struct lock_list *entry, void *data),
> > @@ -1921,6 +1949,16 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
> >  		if (entry->class == hlock_class(next)) {
> >  			if (distance == 1)
> >  				entry->distance = 1;
> > +			entry->dep |= calc_dep(prev->read, next->read);
> > +		}
> > +	}
> > +
> > +	/* Also, update the reverse dependency in @next's ->locks_before list */
> > +	list_for_each_entry(entry, &hlock_class(next)->locks_before, entry) {
> > +		if (entry->class == hlock_class(prev)) {
> > +			if (distance == 1)
> > +				entry->distance = 1;
> > +			entry->dep |= calc_dep(next->read, prev->read);
> >  			return 1;
> >  		}
> >  	}
> 
> I think it all becomes simpler if you use only 2 bits. Such that:
> 
>   bit0 is the prev R (0) or N (1) value,
>   bit1 is the next R (0) or N (1) value.
> 
> I think this should work because we don't care about the empty set
> (currently 0000) and all the complexity in patch 5 is because we can
> have R bits set when there's also N bits. The consequence of that is
> that we cannot replace ! with ~ (which is what I kept doing).
> 
> But with only 2 bits, we only track the strongest relation in the set,
> which is exactly what we appear to need.
> 

But if we have only RN and NR, both bits will be set, so we cannot tell
whether we also have NN. Consider:

	A -(RR)-> B
	B -(NR)-> C and B -(RN)-> C
	C -(RN)-> A

This is not a deadlock case, but with the "two bits" approach we cannot
distinguish it from:

	A -(RR)-> B
	B -(NN)-> C
	C -(RN)-> A

which is a deadlock.
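
To make the ambiguity concrete, here is a minimal userspace sketch of
the two-bit encoding (bit0 = prev is N, bit1 = next is N). The names
two_bit_dep() and check_two_bit_ambiguity() are made up for
illustration and take plain ints rather than struct held_lock; as in
the patch, read == 2 is taken to mean a recursive read (R).

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the two-bit calc_dep():
 * bit0 is set when prev is N, bit1 is set when next is N.
 * read == 2 means recursive read (R); any other value counts as N.
 */
static inline unsigned int two_bit_dep(int prev_read, int next_read)
{
	return (unsigned int)(prev_read != 2) |
	       ((unsigned int)(next_read != 2) << 1);
}

/* Illustrative self-check: {NR, RN} and {NN} collapse to one value. */
static inline void check_two_bit_ambiguity(void)
{
	/* B -(NR)-> C and B -(RN)-> C, accumulated with |= */
	unsigned int nr_and_rn = two_bit_dep(0, 2) | two_bit_dep(2, 0);
	/* B -(NN)-> C */
	unsigned int nn = two_bit_dep(0, 0);

	assert(nr_and_rn == nn);
}
```

Both edges end up with both bits set, so the non-deadlock and the
deadlock case look identical in ->dep.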

But maybe a "three bits" approach (NR, RN and NN bits) works: if ->dep
is 0, that indicates there is only RR, and is_rx() becomes:

	static inline bool is_rx(u8 dep)
	{
		return !(dep & (NR_MASK | NN_MASK));
	}

and is_xr() becomes:

	static inline bool is_xr(u8 dep)
	{
		return !(dep & (RN_MASK | NN_MASK));
	}

With this, I think your simplification with have_xr works. Thanks!
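
A rough userspace sketch of the three-bit idea follows. Only the mask
names and the is_rx()/is_xr() bodies come from the text above; the bit
positions, the helper calc_dep3(), the self-check function and the use
of uint8_t in place of u8 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout; only the mask names appear in the discussion. */
#define NN_MASK (1U << 0)
#define RN_MASK (1U << 1)
#define NR_MASK (1U << 2)

/* read == 2 means recursive read (R); any other value counts as N. */
static inline uint8_t calc_dep3(int prev_read, int next_read)
{
	bool prev_r = (prev_read == 2);
	bool next_r = (next_read == 2);

	if (prev_r && next_r)
		return 0;		/* RR sets no bit */
	if (prev_r)
		return RN_MASK;
	if (next_r)
		return NR_MASK;
	return NN_MASK;
}

/* True when no recorded dependency starts with N. */
static inline bool is_rx(uint8_t dep)
{
	return !(dep & (NR_MASK | NN_MASK));
}

/* True when no recorded dependency ends with N. */
static inline bool is_xr(uint8_t dep)
{
	return !(dep & (RN_MASK | NN_MASK));
}

/* Illustrative self-check: {NR, RN} no longer looks like {NN}. */
static inline void check_three_bit(void)
{
	uint8_t nr_and_rn = calc_dep3(0, 2) | calc_dep3(2, 0);
	uint8_t nn = calc_dep3(0, 0);

	assert(nr_and_rn != nn);
	assert(!(nr_and_rn & NN_MASK));
	assert(is_rx(0) && is_xr(0));	/* ->dep == 0: only RR */
}
```

With a separate NN bit, an entry that has seen NR and RN keeps the NN
bit clear, so the two cases above stay distinguishable.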

Regards,
Boqun

> 
> The above then becomes something like:
> 
> static inline u8 __calc_dep(struct held_lock *lock)
> {
> 	return lock->read != 2;
> }
> 
> static inline u8
> calc_dep(struct held_lock *prev, struct held_lock *next)
> {
> 	return (__calc_dep(prev) << 0) | (__calc_dep(next) << 1);
> }
> 
> 
> 	entry->dep |= calc_dep(prev, next);
> 
> 
> 
> Then the stuff from 5 can be:
> 
> static inline bool is_rx(u8 dep)
> {
> 	return !(dep & 1);
> }
> 
> static inline bool is_xr(u8 dep)
> {
> 	return !(dep & 2);
> }
> 
> 
> 	if (have_xr && is_rx(entry->dep))
> 		continue;
> 
> 	entry->have_xr = is_xr(entry->dep);
> 
> 
> Or did I mess that up somewhere?