From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: tglx@linutronix.de, peterz@infradead.org, tj@kernel.org,
	oleg@redhat.com, rusty@rustcorp.com.au, mingo@kernel.org,
	akpm@linux-foundation.org, namhyung@kernel.org,
	rostedt@goodmis.org, wangyun@linux.vnet.ibm.com,
	xiaoguangrong@linux.vnet.ibm.com, rjw@sisk.pl, sbw@mit.edu,
	fweisbec@gmail.com, linux@arm.linux.org.uk,
	nikunj@linux.vnet.ibm.com, linux-pm@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, netdev@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
Date: Fri, 8 Feb 2013 15:10:17 -0800	[thread overview]
Message-ID: <20130208231017.GK2666@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130122073347.13822.85876.stgit@srivatsabhat.in.ibm.com>

On Tue, Jan 22, 2013 at 01:03:53PM +0530, Srivatsa S. Bhat wrote:
> Using global rwlocks as the backend for per-CPU rwlocks helps us avoid many
> lock-ordering related problems (unlike per-cpu locks). However, global
> rwlocks lead to unnecessary cache-line bouncing even when there are no
> writers present, which can slow down the system needlessly.
> 
> Per-cpu counters can help solve the cache-line bouncing problem. So we
> actually use the best of both: per-cpu counters (no-waiting) at the reader
> side in the fast-path, and global rwlocks in the slowpath.
> 
> [ Fastpath = no writer is active; Slowpath = a writer is active ]
> 
> IOW, the readers just increment/decrement their per-cpu refcounts (disabling
> interrupts during the updates, if necessary) when no writer is active.
> When a writer becomes active, he signals all readers to switch to global
> rwlocks for the duration of his activity. The readers switch over when it
> is safe for them (i.e., when they are about to start a fresh, non-nested
> read-side critical section) and start using (holding) the global rwlock for
> read in their subsequent critical sections.
> 
> The writer waits for every existing reader to switch, and then acquires the
> global rwlock for write and enters his critical section. Later, the writer
> signals all readers that he is done, and that they can go back to using their
> per-cpu refcounts again.
> 
> Note that the lock-safety (despite the per-cpu scheme) comes from the fact
> that the readers can *choose* _when_ to switch to rwlocks upon the writer's
> signal. And the readers don't wait on anybody based on the per-cpu counters.
> The only true synchronization that involves waiting at the reader-side in this
> scheme, is the one arising from the global rwlock, which is safe from circular
> locking dependency issues.
> 
> Reader-writer locks and per-cpu counters are recursive, so they can be
> used in a nested fashion in the reader-path, which makes per-CPU rwlocks also
> recursive. Also, this design of switching the synchronization scheme ensures
> that you can safely nest and use these locks in a very flexible manner.
> 
> I'm indebted to Michael Wang and Xiao Guangrong for their numerous thoughtful
> suggestions and ideas, which inspired and influenced many of the decisions in
> this as well as previous designs. Thanks a lot Michael and Xiao!

Looks pretty close!  Some comments interspersed below.  Please either
fix the code or my confusion, as the case may be.  ;-)

							Thanx, Paul

> Cc: David Howells <dhowells@redhat.com>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
> 
>  include/linux/percpu-rwlock.h |   10 +++
>  lib/percpu-rwlock.c           |  128 ++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 136 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/percpu-rwlock.h b/include/linux/percpu-rwlock.h
> index 8dec8fe..6819bb8 100644
> --- a/include/linux/percpu-rwlock.h
> +++ b/include/linux/percpu-rwlock.h
> @@ -68,4 +68,14 @@ extern void percpu_free_rwlock(struct percpu_rwlock *);
>  	__percpu_init_rwlock(pcpu_rwlock, #pcpu_rwlock, &rwlock_key);	\
>  })
> 
> +#define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
> +		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
> +
> +#define reader_nested_percpu(pcpu_rwlock)				\
> +			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
> +
> +#define writer_active(pcpu_rwlock)					\
> +			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
> +
>  #endif
> +
> diff --git a/lib/percpu-rwlock.c b/lib/percpu-rwlock.c
> index 80dad93..992da5c 100644
> --- a/lib/percpu-rwlock.c
> +++ b/lib/percpu-rwlock.c
> @@ -64,21 +64,145 @@ void percpu_free_rwlock(struct percpu_rwlock *pcpu_rwlock)
> 
>  void percpu_read_lock(struct percpu_rwlock *pcpu_rwlock)
>  {
> -	read_lock(&pcpu_rwlock->global_rwlock);
> +	preempt_disable();
> +
> +	/* First and foremost, let the writer know that a reader is active */
> +	this_cpu_inc(*pcpu_rwlock->reader_refcnt);
> +
> +	/*
> +	 * If we are already using per-cpu refcounts, it is not safe to switch
> +	 * the synchronization scheme. So continue using the refcounts.
> +	 */
> +	if (reader_nested_percpu(pcpu_rwlock)) {
> +		goto out;
> +	} else {
> +		/*
> +		 * The write to 'reader_refcnt' must be visible before we
> +		 * read 'writer_signal'.
> +		 */
> +		smp_mb(); /* Paired with smp_rmb() in sync_reader() */
> +
> +		if (likely(!writer_active(pcpu_rwlock))) {
> +			goto out;
> +		} else {
> +			/* Writer is active, so switch to global rwlock. */
> +			read_lock(&pcpu_rwlock->global_rwlock);
> +
> +			/*
> +			 * We might have raced with a writer going inactive
> +			 * before we took the read-lock. So re-evaluate whether
> +			 * we still need to hold the rwlock or if we can switch
> +			 * back to per-cpu refcounts. (This also helps avoid
> +			 * heterogeneous nesting of readers).
> +			 */
> +			if (writer_active(pcpu_rwlock))

The above writer_active() check can be reordered with the following
this_cpu_dec(), strange though that might seem.  This is OK because
holding the rwlock is merely conservative, but it might be worth a comment.
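
Perhaps something like the following (exact wording up to you, of course):

	if (writer_active(pcpu_rwlock)) {
		/*
		 * This check can be reordered with the this_cpu_dec()
		 * below.  That is harmless: the worst case is that we
		 * conservatively stay on the global rwlock even though
		 * the writer has since gone inactive, which is safe.
		 */
		this_cpu_dec(*pcpu_rwlock->reader_refcnt);
	} else {
		read_unlock(&pcpu_rwlock->global_rwlock);
	}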

> +				this_cpu_dec(*pcpu_rwlock->reader_refcnt);
> +			else

In contrast, no reordering can happen here because read_unlock() is
required to keep the critical section underneath the lock.

> +				read_unlock(&pcpu_rwlock->global_rwlock);
> +		}
> +	}
> +
> +out:
> +	/* Prevent reordering of any subsequent reads */
> +	smp_rmb();

This should be smp_mb().  "Readers" really can do writes.  Hence the
name lglock -- "local/global" rather than "reader/writer".
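
That is, something like this:

	out:
		/*
		 * Order the lock acquisition before the critical section,
		 * which can contain writes as well as reads.
		 */
		smp_mb();
	}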

>  }
> 
>  void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
>  {
> -	read_unlock(&pcpu_rwlock->global_rwlock);

We need an smp_mb() here to keep the critical section ordered before the
this_cpu_dec() below.  Otherwise, if a writer shows up just after we
exit the fastpath, that writer is not guaranteed to see the effects of
our critical section.  Equivalently, the prior read-side critical section
just might see some of the writer's updates, which could be a bit of
a surprise to the reader.

> +	/*
> +	 * We never allow heterogeneous nesting of readers. So it is trivial
> +	 * to find out the kind of reader we are, and undo the operation
> +	 * done by our corresponding percpu_read_lock().
> +	 */
> +	if (__this_cpu_read(*pcpu_rwlock->reader_refcnt)) {
> +		this_cpu_dec(*pcpu_rwlock->reader_refcnt);
> +		smp_wmb(); /* Paired with smp_rmb() in sync_reader() */

Given an smp_mb() above, I don't understand the need for this smp_wmb().
Isn't the idea that if the writer sees ->reader_refcnt decremented to
zero, it also needs to see the effects of the corresponding reader's
critical section?

Or am I missing something subtle here?  In any case, if this smp_wmb()
really is needed, there should be some subsequent write that the writer
might observe.  From what I can see, there is no subsequent write from
this reader that the writer cares about.
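
FWIW, here is the sort of thing I have in mind (untested, and assuming
that the smp_wmb() really is unneeded):

	void percpu_read_unlock(struct percpu_rwlock *pcpu_rwlock)
	{
		/*
		 * Order the read-side critical section before the
		 * this_cpu_dec(), so that a writer who sees the
		 * decremented ->reader_refcnt also sees the critical
		 * section's effects.
		 */
		smp_mb();

		/*
		 * We never allow heterogeneous nesting of readers, so it
		 * is trivial to tell which kind of reader we are.
		 */
		if (__this_cpu_read(*pcpu_rwlock->reader_refcnt))
			this_cpu_dec(*pcpu_rwlock->reader_refcnt);
		else
			read_unlock(&pcpu_rwlock->global_rwlock);

		preempt_enable();
	}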

> +	} else {
> +		read_unlock(&pcpu_rwlock->global_rwlock);
> +	}
> +
> +	preempt_enable();
> +}
> +
> +static inline void raise_writer_signal(struct percpu_rwlock *pcpu_rwlock,
> +				       unsigned int cpu)
> +{
> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = true;
> +}
> +
> +static inline void drop_writer_signal(struct percpu_rwlock *pcpu_rwlock,
> +				      unsigned int cpu)
> +{
> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = false;
> +}
> +
> +static void announce_writer_active(struct percpu_rwlock *pcpu_rwlock)
> +{
> +	unsigned int cpu;
> +
> +	for_each_online_cpu(cpu)
> +		raise_writer_signal(pcpu_rwlock, cpu);
> +
> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
> +}
> +
> +static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
> +{
> +	unsigned int cpu;
> +
> +	drop_writer_signal(pcpu_rwlock, smp_processor_id());

Why do we drop ourselves twice?  More to the point, why is it important to
drop ourselves first?
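
In other words, unless there is something subtle that I am missing,
wouldn't the loop alone suffice?

	static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
	{
		unsigned int cpu;

		/* The loop covers this CPU as well, so no separate drop. */
		for_each_online_cpu(cpu)
			drop_writer_signal(pcpu_rwlock, cpu);

		smp_mb(); /* Paired with barrier in percpu_read_[un]lock() */
	}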

> +
> +	for_each_online_cpu(cpu)
> +		drop_writer_signal(pcpu_rwlock, cpu);
> +
> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
> +}
> +
> +/*
> + * Wait for the reader to see the writer's signal and switch from percpu
> + * refcounts to global rwlock.
> + *
> + * If the reader is still using percpu refcounts, wait for him to switch.
> + * Else, we can safely go ahead, because either the reader has already
> + * switched over, or the next reader that comes along on that CPU will
> + * notice the writer's signal and will switch over to the rwlock.
> + */
> +static inline void sync_reader(struct percpu_rwlock *pcpu_rwlock,
> +			       unsigned int cpu)
> +{
> +	smp_rmb(); /* Paired with smp_[w]mb() in percpu_read_[un]lock() */

As I understand it, the purpose of this memory barrier is to ensure
that the stores in raise_writer_signal() happen before the reads from
->reader_refcnt in reader_uses_percpu_refcnt(), thus preventing the
race between a new reader attempting to use the fastpath and this writer
acquiring the lock.  Unless I am confused, this must be smp_mb() rather
than smp_rmb().

Also, why not just have a single smp_mb() at the beginning of
sync_all_readers() instead of executing one barrier per CPU?
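
For example (again untested):

	static inline void sync_reader(struct percpu_rwlock *pcpu_rwlock,
				       unsigned int cpu)
	{
		while (reader_uses_percpu_refcnt(pcpu_rwlock, cpu))
			cpu_relax();
	}

	static void sync_all_readers(struct percpu_rwlock *pcpu_rwlock)
	{
		unsigned int cpu;

		/*
		 * Order this writer's earlier ->writer_signal stores before
		 * the ->reader_refcnt reads below, once for all CPUs.
		 */
		smp_mb();

		for_each_online_cpu(cpu)
			sync_reader(pcpu_rwlock, cpu);
	}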

> +
> +	while (reader_uses_percpu_refcnt(pcpu_rwlock, cpu))
> +		cpu_relax();
> +}
> +
> +static void sync_all_readers(struct percpu_rwlock *pcpu_rwlock)
> +{
> +	unsigned int cpu;
> +
> +	for_each_online_cpu(cpu)
> +		sync_reader(pcpu_rwlock, cpu);
>  }
> 
>  void percpu_write_lock(struct percpu_rwlock *pcpu_rwlock)
>  {
> +	/*
> +	 * Tell all readers that a writer is becoming active, so that they
> +	 * start switching over to the global rwlock.
> +	 */
> +	announce_writer_active(pcpu_rwlock);
> +	sync_all_readers(pcpu_rwlock);
>  	write_lock(&pcpu_rwlock->global_rwlock);
>  }
> 
>  void percpu_write_unlock(struct percpu_rwlock *pcpu_rwlock)
>  {
> +	/*
> +	 * Inform all readers that we are done, so that they can switch back
> +	 * to their per-cpu refcounts. (We don't need to wait for them to
> +	 * see it).
> +	 */
> +	announce_writer_inactive(pcpu_rwlock);
>  	write_unlock(&pcpu_rwlock->global_rwlock);
>  }
> 
> 

