From: Laurent Dufour <ldufour@linux.vnet.ibm.com>
To: Davidlohr Bueso <dave@stgolabs.net>,
	mingo@kernel.org, peterz@infradead.org,
	akpm@linux-foundation.org
Cc: jack@suse.cz, kirill.shutemov@linux.intel.com, mhocko@suse.com,
	mgorman@techsingularity.net, linux-kernel@vger.kernel.org,
	Davidlohr Bueso <dbueso@suse.de>
Subject: Re: [PATCH 2/6] locking: Introduce range reader/writer lock
Date: Tue, 18 Apr 2017 15:57:26 +0200	[thread overview]
Message-ID: <b1a53492-7fb3-a3cd-80d1-92dc4be5d8c4@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170406084620.22700-3-dave@stgolabs.net>

On 06/04/2017 10:46, Davidlohr Bueso wrote:
> This implements a sleepable range rwlock, based on an interval tree, serializing
> conflicting/intersecting/overlapping ranges within the tree. The largest range
> is given by [0, ~0 - 1] (inclusive). Unlike traditional locks, range locking
> involves dealing with the tree itself and the range to be locked, which is
> normally stack allocated and always explicitly prepared/initialized by the user
> as a sorted [a0, a1] interval, with a0 <= a1, before actually taking the lock.
> 
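(An aside from me, not part of the patch: a minimal usage sketch of the
proposed API as I read it, using only the macros and functions declared in
include/linux/range_rwlock.h below. The tree and helper names are made up
for illustration.)

#include <linux/range_rwlock.h>

static DEFINE_RANGE_RWLOCK_TREE(file_range_tree);

static void update_bytes(unsigned long start, unsigned long last)
{
	struct range_rwlock range;

	/* ranges are inclusive and must be sorted: start <= last */
	range_rwlock_init(&range, start, last);

	range_write_lock(&file_range_tree, &range);
	/* ... exclusive access to [start, last] ... */
	range_write_unlock(&file_range_tree, &range);
}

A reader taking a non-overlapping range via range_read_lock() would proceed
in parallel with the writer above, which is the point of the exercise.
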
> An interval tree of locked and to-be-locked ranges is kept. When a new range
> lock is requested, we add its interval to the tree and store the number of
> intervals intersecting it in 'blocking_ranges'. For the reader case,
> 'blocking_ranges' is only accounted for if the intersecting range is
> marked as a writer. To achieve mutual exclusion of arbitrary ranges, we
> guarantee that the task is blocked until there are no overlapping ranges in
> the tree.
> 
> When a range is unlocked, we again walk the intervals that overlap with the
> unlocked one and decrement their 'blocking_ranges'. Naturally, we wake up the
> owner of any range lock whose 'blocking_ranges' drops to 0. Wakeup order
> therefore relies on the order of the interval tree -- as opposed to a
> more traditional FIFO mechanism. There is no lock stealing either, which
> prevents starvation and guarantees fairness.
> 
> How much does it cost:
> ----------------------
> 
> The cost of lock and unlock of a range is O((1+R_int)log(R_all)), where R_all
> is the total number of ranges and R_int is the number of ranges intersecting
> the new range to be added.
> 
> Due to their shareable nature, full range locks can be compared with
> rw-semaphores; the comparison also serves from a mutex standpoint, as
> writer-only rwsems behave much like mutexes nowadays.
> 
> The first difference is the memory footprint: the lock tree is larger than an
> rwsem (80 vs 40 bytes), and each lock also requires an additional 72 bytes of
> stack for the range structure.
> 
> Secondly, because every range call is serialized by tree->lock, any lock()
> fastpath will at least have an interval_tree_insert() and a spinlock lock+unlock
> overhead, compared to a single atomic instruction in the case of rwsems. The
> same obviously applies to the unlock() case.
> 
> The torture module was used to measure one-to-one differences in lock acquisition
> with increasing core counts over a period of 10 minutes. Readers and writers are
> interleaved, with a slight advantage to writers, as theirs is the first kthread
> that is created. The following shows the avg ops/minute with various thread
> setups on boxes with small and large core counts.
> 
> ** 4-core AMD Opteron **
> (write-only)
> rwsem-2thr: 4198.5, stddev: 7.77
> range-2thr: 4199.1, stddev: 0.73
> 
> rwsem-4thr: 6036.8, stddev: 50.91
> range-4thr: 6004.9, stddev: 126.57
> 
> rwsem-8thr: 6245.6, stddev: 59.39
> range-8thr: 6229.3, stddev: 10.60
> 
> (read-only)
> rwsem-2thr: 5930.7, stddev: 21.92
> range-2thr: 5917.3, stddev: 25.45
> 
> rwsem-4thr: 9881.6, stddev: 0.70
> range-4thr: 9540.2, stddev: 98.28
> 
> rwsem-8thr: 11633.2, stddev: 7.72
> range-8thr: 11314.7, stddev: 62.22
> 
> For the read/write-only cases, there is very little difference between the range lock
> and rwsems, with up to a 3% hit, which could very well be considered within the noise.
> 
> (read-write)
> rwsem-write-1thr: 1744.8, stddev: 11.59
> rwsem-read-1thr:  1043.1, stddev: 3.97
> range-write-1thr: 1740.2, stddev: 5.99
> range-read-1thr:  1022.5, stddev: 6.41
> 
> rwsem-write-2thr: 1662.5, stddev: 0.70
> rwsem-read-2thr:  1278.0, stddev: 25.45
> range-write-2thr: 1321.5, stddev: 51.61
> range-read-2thr:  1243.5, stddev: 30.40
> 
> rwsem-write-4thr: 1761.0, stddev: 11.31
> rwsem-read-4thr:  1426.0, stddev: 7.07
> range-write-4thr: 1417.0, stddev: 29.69
> range-read-4thr:  1398.0, stddev: 56.56
> 
> While single reader and writer threads do not show much difference, increasing
> core counts shows that in reader/writer workloads, writer threads can take a hit
> in raw performance of up to ~20%, while reader throughput is quite similar
> for both locks.
> 
> ** 240-core (ht) IvyBridge **
> (write-only)
> rwsem-120thr: 6844.5, stddev: 82.73
> range-120thr: 6070.5, stddev: 85.55
> 
> rwsem-240thr: 6292.5, stddev: 146.3
> range-240thr: 6099.0, stddev: 15.55
> 
> rwsem-480thr: 6164.8, stddev: 33.94
> range-480thr: 6062.3, stddev: 19.79
> 
> (read-only)
> rwsem-120thr: 136860.4, stddev: 2539.92
> range-120thr: 138052.2, stddev: 327.39
> 
> rwsem-240thr: 235297.5, stddev: 2220.50
> range-240thr: 232099.1, stddev: 3614.72
> 
> rwsem-480thr: 272683.0, stddev: 3924.32
> range-480thr: 256539.2, stddev: 9541.69
> 
> As on the small box, the larger machine shows that range locks take only a minor
> hit (up to ~6% for 480 threads) even in completely exclusive or shared scenarios.
> 
> (read-write)
> rwsem-write-60thr: 4658.1, stddev: 1303.19
> rwsem-read-60thr:  1108.7, stddev: 718.42
> range-write-60thr: 3203.6, stddev: 139.30
> range-read-60thr:  1852.8, stddev: 147.5
> 
> rwsem-write-120thr: 3971.3, stddev: 1413.0
> rwsem-read-120thr:  1038.8, stddev: 353.51
> range-write-120thr: 2282.1, stddev: 207.18
> range-read-120thr:  1856.5, stddev: 198.69
> 
> rwsem-write-240thr: 4112.7, stddev: 2448.1
> rwsem-read-240thr:  1277.4, stddev: 430.30
> range-write-240thr: 2353.1, stddev: 502.04
> range-read-240thr:  1551.5, stddev: 361.33
> 
> When mixing readers and writers, writer throughput can take a hit of up to ~40%,
> similar to the 4-core machine; however, reader threads can increase their number of
> acquisitions by up to ~80%. In any case, the overall writer+reader throughput will
> always be higher for rwsems. A huge factor in this behavior is that range locks
> do not have the writer spin-on-owner feature.
> 
> On both machines, when actually testing threads acquiring different ranges, the
> range lock will always outperform the rwsem in throughput, due to the increased
> parallelism, which is no surprise either. As such, microbenchmarks that merely
> pound on a single lock will pretty much always suffer upon a direct lock
> conversion, but not enough to matter in the overall picture.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> Reviewed-by: Jan Kara <jack@suse.cz>
> ---
>  include/linux/range_rwlock.h  | 115 +++++++++
>  kernel/locking/Makefile       |   2 +-
>  kernel/locking/range_rwlock.c | 554 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 670 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/range_rwlock.h
>  create mode 100644 kernel/locking/range_rwlock.c
> 
> diff --git a/include/linux/range_rwlock.h b/include/linux/range_rwlock.h
> new file mode 100644
> index 000000000000..0a8dad62d097
> --- /dev/null
> +++ b/include/linux/range_rwlock.h
> @@ -0,0 +1,115 @@
> +/*
> + * Range/interval rw-locking
> + * -------------------------
> + *
> + * An interval tree of locked and to-be-locked ranges is kept. When a new range
> + * lock is requested, we add its interval to the tree and store the number of
> + * intervals intersecting it in 'blocking_ranges'. For the reader case,
> + * 'blocking_ranges' is only accounted for if the intersecting range is
> + * marked as a writer. To achieve mutual exclusion of arbitrary ranges, we
> + * guarantee that the task is blocked until there are no overlapping ranges
> + * in the tree.
> + *
> + * When a range is unlocked, we again walk the intervals that overlap with the
> + * unlocked one and decrement their 'blocking_ranges'. Naturally, we wake up the
> + * owner of any range lock whose 'blocking_ranges' drops to 0. Wakeup order
> + * therefore relies on the order of the interval tree -- as opposed to a
> + * more traditional FIFO mechanism. There is no lock stealing either, which
> + * prevents starvation and guarantees fairness.
> + *
> + * The cost of lock and unlock of a range is O((1+R_int)log(R_all)), where R_all
> + * is the total number of ranges and R_int is the number of ranges intersecting
> + * the operated range.
> + */
> +#ifndef _LINUX_RANGE_RWLOCK_H
> +#define _LINUX_RANGE_RWLOCK_H
> +
> +#include <linux/rbtree.h>
> +#include <linux/interval_tree.h>
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +
> +/*
> + * The largest range will span [0,RANGE_RWLOCK_FULL].
> + */
> +#define RANGE_RWLOCK_FULL  ~0UL
> +
> +struct range_rwlock {
> +	struct interval_tree_node node;
> +	struct task_struct *waiter;
> +	/* Number of ranges which are blocking acquisition of the lock */
> +	unsigned int blocking_ranges;
> +	u64 seqnum;
> +};
> +
> +struct range_rwlock_tree {
> +	struct rb_root root;
> +	spinlock_t lock;
> +	struct interval_tree_node *leftmost; /* compute smallest 'start' */
> +	u64 seqnum; /* track order of incoming ranges, avoid overflows */
> +};
> +
> +#define __RANGE_RWLOCK_TREE_INITIALIZER(name)		\
> +	{ .leftmost = NULL                              \
> +	, .root = RB_ROOT				\
> +	, .seqnum = 0					\
> +	, .lock = __SPIN_LOCK_UNLOCKED(name.lock) }
> +
> +#define DEFINE_RANGE_RWLOCK_TREE(name) \
> +       struct range_rwlock_tree name = __RANGE_RWLOCK_TREE_INITIALIZER(name)
> +
> +#define __RANGE_RWLOCK_INITIALIZER(__start, __last) {	\
> +		.node = {			\
> +			.start = (__start)	\
> +			,.last = (__last)	\
> +		}				\
> +		, .task = NULL			\
> +		, .blocking_ranges = 0		\
> +		, .reader = false		\
> +		, .seqnum = 0			\
> +	}
> +
> +#define DEFINE_RANGE_RWLOCK(name, start, last)				\
> +	struct range_rwlock name = __RANGE_RWLOCK_INITIALIZER((start), (last))
> +
> +#define DEFINE_RANGE_RWLOCK_FULL(name)		\
> +	struct range_rwlock name = __RANGE_RWLOCK_INITIALIZER(0, RANGE_RWLOCK_FULL)
> +
> +static inline void range_rwlock_tree_init(struct range_rwlock_tree *tree)
> +{
> +	tree->root = RB_ROOT;
> +	spin_lock_init(&tree->lock);
> +	tree->leftmost = NULL;
> +	tree->seqnum = 0;
> +}
> +
> +void range_rwlock_init(struct range_rwlock *lock,
> +		       unsigned long start, unsigned long last);
> +void range_rwlock_init_full(struct range_rwlock *lock);
> +
> +/*
> + * lock for reading
> + */
> +void range_read_lock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +int range_read_lock_interruptible(struct range_rwlock_tree *tree,
> +				  struct range_rwlock *lock);
> +int range_read_lock_killable(struct range_rwlock_tree *tree,
> +			     struct range_rwlock *lock);
> +int range_read_trylock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +void range_read_unlock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +
> +/*
> + * lock for writing
> + */
> +void range_write_lock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +int range_write_lock_interruptible(struct range_rwlock_tree *tree,
> +				   struct range_rwlock *lock);
> +int range_write_lock_killable(struct range_rwlock_tree *tree,
> +			      struct range_rwlock *lock);
> +int range_write_trylock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +void range_write_unlock(struct range_rwlock_tree *tree, struct range_rwlock *lock);
> +
> +void range_downgrade_write(struct range_rwlock_tree *tree,
> +			   struct range_rwlock *lock);
> +
> +#endif
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index 760158d9d98d..bf054d0af82b 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -2,7 +2,7 @@
>  # and is generally not a function of system call inputs.
>  KCOV_INSTRUMENT		:= n
> 
> -obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
> +obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o range_rwlock.o
> 
>  ifdef CONFIG_FUNCTION_TRACER
>  CFLAGS_REMOVE_lockdep.o = $(CC_FLAGS_FTRACE)
> diff --git a/kernel/locking/range_rwlock.c b/kernel/locking/range_rwlock.c
> new file mode 100644
> index 000000000000..550ad360d87a
> --- /dev/null
> +++ b/kernel/locking/range_rwlock.c
> @@ -0,0 +1,554 @@
> +#include <linux/rbtree.h>
> +#include <linux/spinlock.h>
> +#include <linux/range_rwlock.h>
> +#include <linux/sched/signal.h>
> +#include <linux/sched/debug.h>
> +#include <linux/sched/wake_q.h>
> +#include <linux/sched.h>
> +#include <linux/export.h>
> +
> +#define range_entry(ptr, type, member) container_of(ptr, type, member)
> +
> +#define range_interval_tree_foreach(node, root, start, last)	\
> +	for (node = interval_tree_iter_first(root, start, last); \
> +	     node; node = interval_tree_iter_next(node, start, last))
> +
> +/*
> + * Fastpath range intersection/overlap between A: [a0, a1] and B: [b0, b1]
> + * is given by:
> + *
> + *        a0 <= b1 && b0 <= a1
> + *
> + * ... where A holds the lock range and B holds the smallest 'start' and
> + * largest 'last' in the tree. For the latter, we rely on the root node,
> + * which by the augmented interval tree property holds the largest 'last'
> + * of its entire subtree. This allows mitigating some of the tree walk overhead
> + * for non-intersecting ranges, maintained and consulted in O(1).
> + */
> +static inline bool
> +__range_intersects_intree(struct range_rwlock_tree *tree, struct range_rwlock *lock)
> +{
> +	struct interval_tree_node *root;
> +
> +	if (unlikely(RB_EMPTY_ROOT(&tree->root)))
> +		return false;
> +
> +	root = range_entry(tree->root.rb_node, struct interval_tree_node, rb);
> +
> +	return lock->node.start <= root->__subtree_last &&
> +		tree->leftmost->start <= lock->node.last;
> +}
> +
> +static inline void
> +__range_tree_insert(struct range_rwlock_tree *tree, struct range_rwlock *lock)
> +{
> +	if (unlikely(RB_EMPTY_ROOT(&tree->root)) ||
> +	    lock->node.start < tree->leftmost->start)
> +		tree->leftmost = &lock->node;
> +
> +	lock->seqnum = tree->seqnum++;
> +	interval_tree_insert(&lock->node, &tree->root);
> +}
> +
> +static inline void
> +__range_tree_remove(struct range_rwlock_tree *tree, struct range_rwlock *lock)
> +{
> +	if (tree->leftmost == &lock->node) {
> +		struct rb_node *next = rb_next(&tree->leftmost->rb);
> +		tree->leftmost = range_entry(next, struct interval_tree_node, rb);
> +	}
> +
> +	interval_tree_remove(&lock->node, &tree->root);
> +}
> +
> +/*
> + * lock->waiter reader tracking.
> + */
> +#define RANGE_FLAG_READER	1UL
> +
> +static inline struct task_struct *range_lock_waiter(struct range_rwlock *lock)
> +{
> +	return (struct task_struct *)
> +		((unsigned long) lock->waiter & ~RANGE_FLAG_READER);
> +}
> +
> +static inline void range_lock_set_reader(struct range_rwlock *lock)
> +{
> +	lock->waiter = (struct task_struct *)
> +		((unsigned long)lock->waiter | RANGE_FLAG_READER);
> +}
> +
> +static inline void range_lock_clear_reader(struct range_rwlock *lock)
> +{
> +	lock->waiter = (struct task_struct *)
> +		((unsigned long)lock->waiter & ~RANGE_FLAG_READER);
> +}
> +
> +static inline bool range_lock_is_reader(struct range_rwlock *lock)
> +{
> +	return (unsigned long) lock->waiter & RANGE_FLAG_READER;
> +}
> +
> +static inline void
> +__range_rwlock_init(struct range_rwlock *lock,
> +		    unsigned long start, unsigned long last)
> +{
> +	WARN_ON(start > last);
> +
> +	lock->node.start = start;
> +	lock->node.last = last;
> +	RB_CLEAR_NODE(&lock->node.rb);
> +	lock->blocking_ranges = 0;
> +	lock->waiter = NULL;
> +	lock->seqnum = 0;
> +}
> +
> +/**
> + * range_rwlock_init - Initialize the range lock
> + * @lock: the range lock to be initialized
> + * @start: start of the interval (inclusive)
> + * @last: last location in the interval (inclusive)
> + *
> + * Initialize the range's [start, last] such that it can
> + * later be locked. User is expected to enter a sorted
> + * range, such that @start <= @last.
> + *
> + * It is not allowed to initialize an already locked range.
> + */
> +void range_rwlock_init(struct range_rwlock *lock, unsigned long start,
> +		     unsigned long last)
> +{
> +	__range_rwlock_init(lock, start, last);
> +}
> +EXPORT_SYMBOL_GPL(range_rwlock_init);
> +
> +/**
> + * range_rwlock_init_full - Initialize a full range lock
> + * @lock: the range lock to be initialized
> + *
> + * Initialize the full range.
> + *
> + * It is not allowed to initialize an already locked range.
> + */
> +void range_rwlock_init_full(struct range_rwlock *lock)
> +{
> +	__range_rwlock_init(lock, 0, RANGE_RWLOCK_FULL);
> +}
> +EXPORT_SYMBOL_GPL(range_rwlock_init_full);
> +
> +static inline void
> +range_rwlock_unblock(struct range_rwlock *lock, struct wake_q_head *wake_q)
> +{
> +	if (!--lock->blocking_ranges)
> +		wake_q_add(wake_q, range_lock_waiter(lock));
> +}
> +
> +static inline int wait_for_ranges(struct range_rwlock_tree *tree,
> +				  struct range_rwlock *lock, long state)
> +{
> +	int ret = 0;
> +
> +	while (true) {
> +		set_current_state(state);
> +
> +		/* do we need to go to sleep? */
> +		if (!lock->blocking_ranges)
> +			break;
> +
> +		if (unlikely(signal_pending_state(state, current))) {
> +			struct interval_tree_node *node;
> +			unsigned long flags;
> +			DEFINE_WAKE_Q(wake_q);
> +
> +			ret = -EINTR;
> +			/*
> +			 * We're not taking the lock after all, cleanup
> +			 * after ourselves.
> +			 */
> +			spin_lock_irqsave(&tree->lock, flags);
> +
> +			range_lock_clear_reader(lock);
> +			__range_tree_remove(tree, lock);
> +
> +			if (!__range_intersects_intree(tree, lock))
> +				goto unlock;
> +
> +			range_interval_tree_foreach(node, &tree->root,
> +						    lock->node.start,
> +						    lock->node.last) {
> +				struct range_rwlock *blked;
> +				blked = range_entry(node,
> +						    struct range_rwlock, node);
> +
> +				if (range_lock_is_reader(lock) &&
> +				    range_lock_is_reader(blked))
> +					continue;
> +
> +				/* unaccount for threads _we_ are blocking */
> +				if (lock->seqnum < blked->seqnum)
> +					range_rwlock_unblock(blked, &wake_q);
> +			}
> +
> +		unlock:
> +			spin_unlock_irqrestore(&tree->lock, flags);
> +			wake_up_q(&wake_q);
> +			break;
> +		}
> +
> +		schedule();
> +	}
> +
> +	__set_current_state(TASK_RUNNING);
> +	return ret;
> +}
> +
> +static __always_inline int __sched
> +__range_read_lock_common(struct range_rwlock_tree *tree,
> +			 struct range_rwlock *lock, long state)
> +{
> +	struct interval_tree_node *node;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&tree->lock, flags);
> +	range_lock_set_reader(lock);
> +
> +	if (!__range_intersects_intree(tree, lock))
> +		goto insert;
> +
> +	range_interval_tree_foreach(node, &tree->root,
> +				    lock->node.start, lock->node.last) {
> +		struct range_rwlock *blocked_lock;
> +		blocked_lock = range_entry(node, struct range_rwlock, node);
> +
> +		if (!range_lock_is_reader(blocked_lock))
> +			lock->blocking_ranges++;
> +	}
> +insert:
> +	__range_tree_insert(tree, lock);
> +
> +	lock->waiter = current;

Hi Davidlohr,

Setting lock->waiter after calling range_lock_set_reader() overwrites the
pointer and so clears the reader flag. Moving the call to
range_lock_set_reader() down here, after the assignment, fixes that.
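
Something along these lines (untested; just to show the ordering I have in
mind, with the early range_lock_set_reader() call near the top of
__range_read_lock_common() dropped and everything else unchanged):

	__range_tree_insert(tree, lock);

	lock->waiter = current;
	range_lock_set_reader(lock);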

> +	spin_unlock_irqrestore(&tree->lock, flags);
> +
> +	return wait_for_ranges(tree, lock, state);
> +}
> +
> +/**
> + * range_read_lock - Lock for reading
> + * @tree: interval tree
> + * @lock: the range lock to be locked
> + *
> + * Sleeps until there are no overlapping ranges and returns
> + * once the lock has been acquired.
> + */
> +void range_read_lock(struct range_rwlock_tree *tree, struct range_rwlock *lock)
> +{
> +	might_sleep();
> +	__range_read_lock_common(tree, lock, TASK_UNINTERRUPTIBLE);
> +}
> +EXPORT_SYMBOL_GPL(range_read_lock);
> +
> +/**
> + * range_read_lock_interruptible - Lock for reading (interruptible)
> + * @tree: interval tree
> + * @lock: the range lock to be locked
> + *
> + * Lock the range like range_read_lock(), sleeping until there are
> + * no overlapping ranges, and return 0 once the lock has been
> + * acquired. If a signal arrives while waiting for the lock,
> + * this function returns -EINTR.
> + */
> +int range_read_lock_interruptible(struct range_rwlock_tree *tree,
> +				  struct range_rwlock *lock)
> +{
> +	might_sleep();
> +	return __range_read_lock_common(tree, lock, TASK_INTERRUPTIBLE);
> +}
> +EXPORT_SYMBOL_GPL(range_read_lock_interruptible);
> +
> +/**
> + * range_read_lock_killable - Lock for reading (killable)
> + * @tree: interval tree
> + * @lock: the range lock to be locked
> + *
> + * Lock the range like range_read_lock(), sleeping until there are
> + * no overlapping ranges, and return 0 once the lock has been
> + * acquired. If a fatal signal arrives while waiting for the lock,
> + * this function returns -EINTR.
> + */
> +int range_read_lock_killable(struct range_rwlock_tree *tree,
> +			     struct range_rwlock *lock)
> +{
> +	might_sleep();
> +	return __range_read_lock_common(tree, lock, TASK_KILLABLE);
> +}
> +EXPORT_SYMBOL_GPL(range_read_lock_killable);
> +
> +/**
> + * range_read_trylock - Trylock for reading
> + * @tree: interval tree
> + * @lock: the range lock to be trylocked
> + *
> + * The trylock is against the range itself, not the @tree->lock.
> + *
> + * Returns 1 if successful, 0 if contention (must block to acquire).
> + */
> +int range_read_trylock(struct range_rwlock_tree *tree, struct range_rwlock *lock)
> +{
> +	int ret = true;
> +	unsigned long flags;
> +	struct interval_tree_node *node;
> +
> +	spin_lock_irqsave(&tree->lock, flags);
> +
> +	if (!__range_intersects_intree(tree, lock))
> +		goto insert;
> +
> +	/*
> +	 * We have overlapping ranges in the tree, ensure that we can
> +	 * in fact share the lock.
> +	 */
> +	range_interval_tree_foreach(node, &tree->root,
> +				    lock->node.start, lock->node.last) {
> +		struct range_rwlock *blocked_lock;
> +		blocked_lock = range_entry(node, struct range_rwlock, node);
> +
> +		if (!range_lock_is_reader(blocked_lock)) {
> +			ret = false;
> +			goto unlock;
> +		}
> +	}
> +insert:
> +	range_lock_set_reader(lock);

Here too, the lock->waiter field should be set to current before
range_lock_set_reader() is called, for the same reason as above.
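
I.e. something like this (untested), leaving the rest of the trylock insert
path as is:

insert:
	lock->waiter = current;
	range_lock_set_reader(lock);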

Cheers,
Laurent.
