* [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu()
@ 2014-09-23  5:55 Tejun Heo
  2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes

Hello,

Over the past several months, percpu_ref has grown use cases where it's
used as a persistent on/off switch which may be cycled multiple times
using percpu_ref_reinit().  One such use case is blk-mq's
mq_usage_counter, which tracks the number of in-flight commands and is
used to drain them.  Unfortunately, SCSI device probing involves
synchronously creating and destroying request_queues for non-existent
devices, and the sched RCU grace period involved in percpu_ref killing
adds up to a significant amount of latency.

The block layer has already run into the same issue in other areas and
works around it by starting the queue in a degraded mode which is
faster to shut down, making it fully functional only after it's known
that the queue isn't a temporary one created for probing.

This patchset implements percpu_ref mechanisms to manually switch
between atomic and percpu operation modes so that blk-mq can implement
a similar degraded operation mode.  This will also allow implementing
a debug mode for percpu_ref so that underflows can be detected sooner.

This patchset contains the following nine patches.

 0001-percpu_ref-relocate-percpu_ref_reinit.patch
 0002-percpu_ref-minor-code-and-comment-updates.patch
 0003-percpu_ref-replace-pcpu_-prefix-with-percpu_.patch
 0004-percpu_ref-rename-things-to-prepare-for-decoupling-p.patch
 0005-percpu_ref-add-PCPU_REF_DEAD.patch
 0006-percpu_ref-decouple-switching-to-atomic-mode-and-kil.patch
 0007-percpu_ref-decouple-switching-to-percpu-mode-and-rei.patch
 0008-percpu_ref-add-PERCPU_REF_INIT_-flags.patch
 0009-percpu_ref-make-INIT_ATOMIC-and-switch_to_atomic-sti.patch

0001-0005 are prep patches.

0006-0007 implement percpu_ref_switch_to_atomic/percpu().

0008 extends percpu_ref_init() so that a percpu_ref can be initialized
in different states including atomic mode.

0009 makes atomic mode sticky so that it survives through reinits.
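
As a rough sketch of the resulting interface (not taken from the
patches themselves; my_release() and my_setup() are made-up names), a
queue-like user could start a ref in atomic mode and promote it to
percpu operation only once the object is known to be long-lived:

#include <linux/percpu-refcount.h>

static void my_release(struct percpu_ref *ref)
{
	/* the last reference is gone; tear down the containing object */
}

static int my_setup(struct percpu_ref *ref)
{
	int ret;

	/* start in atomic mode so a short-lived ref is cheap to drain */
	ret = percpu_ref_init(ref, my_release, PERCPU_REF_INIT_ATOMIC,
			      GFP_KERNEL);
	if (ret)
		return ret;

	/* ...later, once the ref is known to stick around... */
	percpu_ref_switch_to_percpu(ref);
	return 0;
}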

This patchset is on top of percpu/for-3.18 and available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git review-percpu_ref-switch

diffstat follows.

 block/blk-mq.c                  |    2 
 fs/aio.c                        |    4 
 include/linux/percpu-refcount.h |  108 +++++++++-----
 kernel/cgroup.c                 |    7 
 lib/percpu-refcount.c           |  291 +++++++++++++++++++++++++++++-----------
 5 files changed, 295 insertions(+), 117 deletions(-)

Thanks.

--
tejun

* [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit()
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:01   ` Kent Overstreet
  2014-09-23 21:07   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 2/9] percpu_ref: minor code and comment updates Tejun Heo
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref is going to go through restructuring.  Move
percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
make later changes easier to follow and result in cleaner
organization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
---
 include/linux/percpu-refcount.h |  2 +-
 lib/percpu-refcount.c           | 70 ++++++++++++++++++++---------------------
 2 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 5df6784..f015f13 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -68,10 +68,10 @@ struct percpu_ref {
 
 int __must_check percpu_ref_init(struct percpu_ref *ref,
 				 percpu_ref_func_t *release, gfp_t gfp);
-void percpu_ref_reinit(struct percpu_ref *ref);
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 559ee0b..070dab5 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -63,41 +63,6 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 EXPORT_SYMBOL_GPL(percpu_ref_init);
 
 /**
- * percpu_ref_reinit - re-initialize a percpu refcount
- * @ref: perpcu_ref to re-initialize
- *
- * Re-initialize @ref so that it's in the same state as when it finished
- * percpu_ref_init().  @ref must have been initialized successfully, killed
- * and reached 0 but not exited.
- *
- * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
- * this function is in progress.
- */
-void percpu_ref_reinit(struct percpu_ref *ref)
-{
-	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
-	int cpu;
-
-	BUG_ON(!pcpu_count);
-	WARN_ON(!percpu_ref_is_zero(ref));
-
-	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
-
-	/*
-	 * Restore per-cpu operation.  smp_store_release() is paired with
-	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
-	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following PCPU_REF_DEAD clearing.
-	 */
-	for_each_possible_cpu(cpu)
-		*per_cpu_ptr(pcpu_count, cpu) = 0;
-
-	smp_store_release(&ref->pcpu_count_ptr,
-			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
-}
-EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-
-/**
  * percpu_ref_exit - undo percpu_ref_init()
  * @ref: percpu_ref to exit
  *
@@ -189,3 +154,38 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
+
+/**
+ * percpu_ref_reinit - re-initialize a percpu refcount
+ * @ref: perpcu_ref to re-initialize
+ *
+ * Re-initialize @ref so that it's in the same state as when it finished
+ * percpu_ref_init().  @ref must have been initialized successfully, killed
+ * and reached 0 but not exited.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_reinit(struct percpu_ref *ref)
+{
+	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
+	int cpu;
+
+	BUG_ON(!pcpu_count);
+	WARN_ON(!percpu_ref_is_zero(ref));
+
+	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
+
+	/*
+	 * Restore per-cpu operation.  smp_store_release() is paired with
+	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
+	 * that the zeroing is visible to all percpu accesses which can see
+	 * the following PCPU_REF_DEAD clearing.
+	 */
+	for_each_possible_cpu(cpu)
+		*per_cpu_ptr(pcpu_count, cpu) = 0;
+
+	smp_store_release(&ref->pcpu_count_ptr,
+			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


* [PATCH 2/9] percpu_ref: minor code and comment updates
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
  2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:09   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_ Tejun Heo
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

* Some comments became stale.  Updated.
* percpu_ref_tryget() unnecessarily initializes @ret.  Removed.
* A blank line removed from percpu_ref_kill_rcu().
* Explicit function name in a WARN format string replaced with __func__.
* WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
---
 include/linux/percpu-refcount.h | 25 ++++++++++++++++---------
 lib/percpu-refcount.c           | 14 ++++++--------
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index f015f13..d44b027 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -115,8 +115,10 @@ static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
  * percpu_ref_get - increment a percpu refcount
  * @ref: percpu_ref to get
  *
- * Analagous to atomic_inc().
-  */
+ * Analogous to atomic_long_inc().
+ *
+ * This function is safe to call as long as @ref is between init and exit.
+ */
 static inline void percpu_ref_get(struct percpu_ref *ref)
 {
 	unsigned long __percpu *pcpu_count;
@@ -138,12 +140,12 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
  * Increment a percpu refcount unless its count already reached zero.
  * Returns %true on success; %false on failure.
  *
- * The caller is responsible for ensuring that @ref stays accessible.
+ * This function is safe to call as long as @ref is between init and exit.
  */
 static inline bool percpu_ref_tryget(struct percpu_ref *ref)
 {
 	unsigned long __percpu *pcpu_count;
-	int ret = false;
+	int ret;
 
 	rcu_read_lock_sched();
 
@@ -166,12 +168,13 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
  * Increment a percpu refcount unless it has already been killed.  Returns
  * %true on success; %false on failure.
  *
- * Completion of percpu_ref_kill() in itself doesn't guarantee that tryget
- * will fail.  For such guarantee, percpu_ref_kill_and_confirm() should be
- * used.  After the confirm_kill callback is invoked, it's guaranteed that
- * no new reference will be given out by percpu_ref_tryget().
+ * Completion of percpu_ref_kill() in itself doesn't guarantee that this
+ * function will fail.  For such guarantee, percpu_ref_kill_and_confirm()
+ * should be used.  After the confirm_kill callback is invoked, it's
+ * guaranteed that no new reference will be given out by
+ * percpu_ref_tryget_live().
  *
- * The caller is responsible for ensuring that @ref stays accessible.
+ * This function is safe to call as long as @ref is between init and exit.
  */
 static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
 {
@@ -196,6 +199,8 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
  *
  * Decrement the refcount, and if 0, call the release function (which was passed
  * to percpu_ref_init())
+ *
+ * This function is safe to call as long as @ref is between init and exit.
  */
 static inline void percpu_ref_put(struct percpu_ref *ref)
 {
@@ -216,6 +221,8 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
  * @ref: percpu_ref to test
  *
  * Returns %true if @ref reached zero.
+ *
+ * This function is safe to call as long as @ref is between init and exit.
  */
 static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
 {
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 070dab5..8ef3f5c 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -108,7 +108,6 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 	 * reaching 0 before we add the percpu counts. But doing it at the same
 	 * time is equivalent and saves us atomic operations:
 	 */
-
 	atomic_long_add((long)count - PCPU_COUNT_BIAS, &ref->count);
 
 	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
@@ -120,8 +119,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 		ref->confirm_kill(ref);
 
 	/*
-	 * Now we're in single atomic_t mode with a consistent refcount, so it's
-	 * safe to drop our initial ref:
+	 * Now we're in single atomic_long_t mode with a consistent
+	 * refcount, so it's safe to drop our initial ref:
 	 */
 	percpu_ref_put(ref);
 }
@@ -134,8 +133,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
  * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
  * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
  * called after @ref is seen as dead from all CPUs - all further
- * invocations of percpu_ref_tryget() will fail.  See percpu_ref_tryget()
- * for more details.
+ * invocations of percpu_ref_tryget_live() will fail.  See
+ * percpu_ref_tryget_live() for more details.
  *
  * Due to the way percpu_ref is implemented, @confirm_kill will be called
  * after at least one full RCU grace period has passed but this is an
@@ -145,8 +144,7 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill)
 {
 	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
-		  "percpu_ref_kill() called more than once on %pf!",
-		  ref->release);
+		  "%s called more than once on %pf!", __func__, ref->release);
 
 	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
 	ref->confirm_kill = confirm_kill;
@@ -172,7 +170,7 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 	int cpu;
 
 	BUG_ON(!pcpu_count);
-	WARN_ON(!percpu_ref_is_zero(ref));
+	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
 
 	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
 
-- 
1.9.3


* [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
  2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
  2014-09-23  5:55 ` [PATCH 2/9] percpu_ref: minor code and comment updates Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:10   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch Tejun Heo
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref uses the pcpu_ prefix for internal stuff and percpu_ for
externally visible interfaces.  This is the same convention used in
the percpu allocator implementation.  It works fine there, but
percpu_ref doesn't have much internal-only stuff and the scattered
uses of the pcpu_ prefix are more confusing than helpful.

This patch replaces all pcpu_ prefixes with percpu_.  This is a pure
rename and there's no functional change.  Note that PCPU_REF_DEAD is
renamed to __PERCPU_REF_DEAD to signify that the flag is internal.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
---
 include/linux/percpu-refcount.h | 46 ++++++++++++++++-----------------
 lib/percpu-refcount.c           | 56 +++++++++++++++++++++--------------------
 2 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d44b027..3d463a3 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -13,7 +13,7 @@
  *
  * The refcount will have a range of 0 to ((1U << 31) - 1), i.e. one bit less
  * than an atomic_t - this is because of the way shutdown works, see
- * percpu_ref_kill()/PCPU_COUNT_BIAS.
+ * percpu_ref_kill()/PERCPU_COUNT_BIAS.
  *
  * Before you call percpu_ref_kill(), percpu_ref_put() does not check for the
  * refcount hitting 0 - it can't, if it was in percpu mode. percpu_ref_kill()
@@ -60,7 +60,7 @@ struct percpu_ref {
 	 * The low bit of the pointer indicates whether the ref is in percpu
 	 * mode; if set, then get/put will manipulate the atomic_t.
 	 */
-	unsigned long		pcpu_count_ptr;
+	unsigned long		percpu_count_ptr;
 	percpu_ref_func_t	*release;
 	percpu_ref_func_t	*confirm_kill;
 	struct rcu_head		rcu;
@@ -88,26 +88,26 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
 	return percpu_ref_kill_and_confirm(ref, NULL);
 }
 
-#define PCPU_REF_DEAD		1
+#define __PERCPU_REF_DEAD	1
 
 /*
  * Internal helper.  Don't use outside percpu-refcount proper.  The
  * function doesn't return the pointer and let the caller test it for NULL
  * because doing so forces the compiler to generate two conditional
- * branches as it can't assume that @ref->pcpu_count is not NULL.
+ * branches as it can't assume that @ref->percpu_count is not NULL.
  */
-static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
-				    unsigned long __percpu **pcpu_countp)
+static inline bool __percpu_ref_alive(struct percpu_ref *ref,
+				      unsigned long __percpu **percpu_countp)
 {
-	unsigned long pcpu_ptr = ACCESS_ONCE(ref->pcpu_count_ptr);
+	unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
 
 	/* paired with smp_store_release() in percpu_ref_reinit() */
 	smp_read_barrier_depends();
 
-	if (unlikely(pcpu_ptr & PCPU_REF_DEAD))
+	if (unlikely(percpu_ptr & __PERCPU_REF_DEAD))
 		return false;
 
-	*pcpu_countp = (unsigned long __percpu *)pcpu_ptr;
+	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
 	return true;
 }
 
@@ -121,12 +121,12 @@ static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
  */
 static inline void percpu_ref_get(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count;
+	unsigned long __percpu *percpu_count;
 
 	rcu_read_lock_sched();
 
-	if (__pcpu_ref_alive(ref, &pcpu_count))
-		this_cpu_inc(*pcpu_count);
+	if (__percpu_ref_alive(ref, &percpu_count))
+		this_cpu_inc(*percpu_count);
 	else
 		atomic_long_inc(&ref->count);
 
@@ -144,13 +144,13 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
  */
 static inline bool percpu_ref_tryget(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count;
+	unsigned long __percpu *percpu_count;
 	int ret;
 
 	rcu_read_lock_sched();
 
-	if (__pcpu_ref_alive(ref, &pcpu_count)) {
-		this_cpu_inc(*pcpu_count);
+	if (__percpu_ref_alive(ref, &percpu_count)) {
+		this_cpu_inc(*percpu_count);
 		ret = true;
 	} else {
 		ret = atomic_long_inc_not_zero(&ref->count);
@@ -178,13 +178,13 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
  */
 static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count;
+	unsigned long __percpu *percpu_count;
 	int ret = false;
 
 	rcu_read_lock_sched();
 
-	if (__pcpu_ref_alive(ref, &pcpu_count)) {
-		this_cpu_inc(*pcpu_count);
+	if (__percpu_ref_alive(ref, &percpu_count)) {
+		this_cpu_inc(*percpu_count);
 		ret = true;
 	}
 
@@ -204,12 +204,12 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
  */
 static inline void percpu_ref_put(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count;
+	unsigned long __percpu *percpu_count;
 
 	rcu_read_lock_sched();
 
-	if (__pcpu_ref_alive(ref, &pcpu_count))
-		this_cpu_dec(*pcpu_count);
+	if (__percpu_ref_alive(ref, &percpu_count))
+		this_cpu_dec(*percpu_count);
 	else if (unlikely(atomic_long_dec_and_test(&ref->count)))
 		ref->release(ref);
 
@@ -226,9 +226,9 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
  */
 static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count;
+	unsigned long __percpu *percpu_count;
 
-	if (__pcpu_ref_alive(ref, &pcpu_count))
+	if (__percpu_ref_alive(ref, &percpu_count))
 		return false;
 	return !atomic_long_read(&ref->count);
 }
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 8ef3f5c..5aea6b7 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -11,8 +11,8 @@
  * percpu counters will all sum to the correct value
  *
  * (More precisely: because moduler arithmatic is commutative the sum of all the
- * pcpu_count vars will be equal to what it would have been if all the gets and
- * puts were done to a single integer, even if some of the percpu integers
+ * percpu_count vars will be equal to what it would have been if all the gets
+ * and puts were done to a single integer, even if some of the percpu integers
  * overflow or underflow).
  *
  * The real trick to implementing percpu refcounts is shutdown. We can't detect
@@ -29,11 +29,12 @@
  * atomic_long_t can't hit 0 before we've added up all the percpu refs.
  */
 
-#define PCPU_COUNT_BIAS		(1LU << (BITS_PER_LONG - 1))
+#define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))
 
-static unsigned long __percpu *pcpu_count_ptr(struct percpu_ref *ref)
+static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 {
-	return (unsigned long __percpu *)(ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
+	return (unsigned long __percpu *)
+		(ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
 }
 
 /**
@@ -51,10 +52,11 @@ static unsigned long __percpu *pcpu_count_ptr(struct percpu_ref *ref)
 int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 		    gfp_t gfp)
 {
-	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
+	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
 
-	ref->pcpu_count_ptr = (unsigned long)alloc_percpu_gfp(unsigned long, gfp);
-	if (!ref->pcpu_count_ptr)
+	ref->percpu_count_ptr =
+		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
+	if (!ref->percpu_count_ptr)
 		return -ENOMEM;
 
 	ref->release = release;
@@ -74,11 +76,11 @@ EXPORT_SYMBOL_GPL(percpu_ref_init);
  */
 void percpu_ref_exit(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
+	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 
-	if (pcpu_count) {
-		free_percpu(pcpu_count);
-		ref->pcpu_count_ptr = PCPU_REF_DEAD;
+	if (percpu_count) {
+		free_percpu(percpu_count);
+		ref->percpu_count_ptr = __PERCPU_REF_DEAD;
 	}
 }
 EXPORT_SYMBOL_GPL(percpu_ref_exit);
@@ -86,14 +88,14 @@ EXPORT_SYMBOL_GPL(percpu_ref_exit);
 static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 {
 	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
-	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
+	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 	unsigned long count = 0;
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		count += *per_cpu_ptr(pcpu_count, cpu);
+		count += *per_cpu_ptr(percpu_count, cpu);
 
-	pr_debug("global %ld pcpu %ld",
+	pr_debug("global %ld percpu %ld",
 		 atomic_long_read(&ref->count), (long)count);
 
 	/*
@@ -108,7 +110,7 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 	 * reaching 0 before we add the percpu counts. But doing it at the same
 	 * time is equivalent and saves us atomic operations:
 	 */
-	atomic_long_add((long)count - PCPU_COUNT_BIAS, &ref->count);
+	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);
 
 	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
 		  "percpu ref (%pf) <= 0 (%ld) after killed",
@@ -143,10 +145,10 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill)
 {
-	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
+	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
 		  "%s called more than once on %pf!", __func__, ref->release);
 
-	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
+	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
 	ref->confirm_kill = confirm_kill;
 
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
@@ -166,24 +168,24 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
  */
 void percpu_ref_reinit(struct percpu_ref *ref)
 {
-	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
+	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 	int cpu;
 
-	BUG_ON(!pcpu_count);
+	BUG_ON(!percpu_count);
 	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
 
-	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
+	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
 
 	/*
 	 * Restore per-cpu operation.  smp_store_release() is paired with
-	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
-	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following PCPU_REF_DEAD clearing.
+	 * smp_read_barrier_depends() in __percpu_ref_alive() and
+	 * guarantees that the zeroing is visible to all percpu accesses
+	 * which can see the following __PERCPU_REF_DEAD clearing.
 	 */
 	for_each_possible_cpu(cpu)
-		*per_cpu_ptr(pcpu_count, cpu) = 0;
+		*per_cpu_ptr(percpu_count, cpu) = 0;
 
-	smp_store_release(&ref->pcpu_count_ptr,
-			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
+	smp_store_release(&ref->percpu_count_ptr,
+			  ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


* [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (2 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_ Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:11   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 5/9] percpu_ref: add PCPU_REF_DEAD Tejun Heo
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref will be restructured so that percpu/atomic mode switching
and reference killing are decoupled.  In preparation, do the following
renames.

* percpu_ref->confirm_kill	-> percpu_ref->confirm_switch
* __PERCPU_REF_DEAD		-> __PERCPU_REF_ATOMIC
* __percpu_ref_alive()		-> __ref_is_percpu()

This patch is a pure rename and doesn't introduce any functional
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
---
 include/linux/percpu-refcount.h | 25 ++++++++++++++-----------
 lib/percpu-refcount.c           | 22 +++++++++++-----------
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 3d463a3..910e5f7 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -54,6 +54,11 @@
 struct percpu_ref;
 typedef void (percpu_ref_func_t)(struct percpu_ref *);
 
+/* flags set in the lower bits of percpu_ref->percpu_count_ptr */
+enum {
+	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
+};
+
 struct percpu_ref {
 	atomic_long_t		count;
 	/*
@@ -62,7 +67,7 @@ struct percpu_ref {
 	 */
 	unsigned long		percpu_count_ptr;
 	percpu_ref_func_t	*release;
-	percpu_ref_func_t	*confirm_kill;
+	percpu_ref_func_t	*confirm_switch;
 	struct rcu_head		rcu;
 };
 
@@ -88,23 +93,21 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
 	return percpu_ref_kill_and_confirm(ref, NULL);
 }
 
-#define __PERCPU_REF_DEAD	1
-
 /*
  * Internal helper.  Don't use outside percpu-refcount proper.  The
  * function doesn't return the pointer and let the caller test it for NULL
  * because doing so forces the compiler to generate two conditional
  * branches as it can't assume that @ref->percpu_count is not NULL.
  */
-static inline bool __percpu_ref_alive(struct percpu_ref *ref,
-				      unsigned long __percpu **percpu_countp)
+static inline bool __ref_is_percpu(struct percpu_ref *ref,
+					  unsigned long __percpu **percpu_countp)
 {
 	unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
 
 	/* paired with smp_store_release() in percpu_ref_reinit() */
 	smp_read_barrier_depends();
 
-	if (unlikely(percpu_ptr & __PERCPU_REF_DEAD))
+	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
 		return false;
 
 	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
@@ -125,7 +128,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
 
 	rcu_read_lock_sched();
 
-	if (__percpu_ref_alive(ref, &percpu_count))
+	if (__ref_is_percpu(ref, &percpu_count))
 		this_cpu_inc(*percpu_count);
 	else
 		atomic_long_inc(&ref->count);
@@ -149,7 +152,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
 
 	rcu_read_lock_sched();
 
-	if (__percpu_ref_alive(ref, &percpu_count)) {
+	if (__ref_is_percpu(ref, &percpu_count)) {
 		this_cpu_inc(*percpu_count);
 		ret = true;
 	} else {
@@ -183,7 +186,7 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
 
 	rcu_read_lock_sched();
 
-	if (__percpu_ref_alive(ref, &percpu_count)) {
+	if (__ref_is_percpu(ref, &percpu_count)) {
 		this_cpu_inc(*percpu_count);
 		ret = true;
 	}
@@ -208,7 +211,7 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
 
 	rcu_read_lock_sched();
 
-	if (__percpu_ref_alive(ref, &percpu_count))
+	if (__ref_is_percpu(ref, &percpu_count))
 		this_cpu_dec(*percpu_count);
 	else if (unlikely(atomic_long_dec_and_test(&ref->count)))
 		ref->release(ref);
@@ -228,7 +231,7 @@ static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count;
 
-	if (__percpu_ref_alive(ref, &percpu_count))
+	if (__ref_is_percpu(ref, &percpu_count))
 		return false;
 	return !atomic_long_read(&ref->count);
 }
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 5aea6b7..7aef590 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -34,7 +34,7 @@
 static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 {
 	return (unsigned long __percpu *)
-		(ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
+		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
 }
 
 /**
@@ -80,7 +80,7 @@ void percpu_ref_exit(struct percpu_ref *ref)
 
 	if (percpu_count) {
 		free_percpu(percpu_count);
-		ref->percpu_count_ptr = __PERCPU_REF_DEAD;
+		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC;
 	}
 }
 EXPORT_SYMBOL_GPL(percpu_ref_exit);
@@ -117,8 +117,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 		  ref->release, atomic_long_read(&ref->count));
 
 	/* @ref is viewed as dead on all CPUs, send out kill confirmation */
-	if (ref->confirm_kill)
-		ref->confirm_kill(ref);
+	if (ref->confirm_switch)
+		ref->confirm_switch(ref);
 
 	/*
 	 * Now we're in single atomic_long_t mode with a consistent
@@ -145,11 +145,11 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill)
 {
-	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
+	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC,
 		  "%s called more than once on %pf!", __func__, ref->release);
 
-	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
-	ref->confirm_kill = confirm_kill;
+	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
+	ref->confirm_switch = confirm_kill;
 
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
 }
@@ -178,14 +178,14 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 
 	/*
 	 * Restore per-cpu operation.  smp_store_release() is paired with
-	 * smp_read_barrier_depends() in __percpu_ref_alive() and
-	 * guarantees that the zeroing is visible to all percpu accesses
-	 * which can see the following __PERCPU_REF_DEAD clearing.
+	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
+	 * that the zeroing is visible to all percpu accesses which can see
+	 * the following __PERCPU_REF_ATOMIC clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
 
 	smp_store_release(&ref->percpu_count_ptr,
-			  ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
+			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


* [PATCH 5/9] percpu_ref: add PCPU_REF_DEAD
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (3 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 13:48   ` [PATCH v2 " Tejun Heo
  2014-09-23  5:55 ` [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing Tejun Heo
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref will be restructured so that percpu/atomic mode switching
and reference killing are decoupled.  In preparation, add
PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD, which is the OR of ATOMIC and
DEAD.  For now, ATOMIC and DEAD are changed together and all
PCPU_REF_ATOMIC uses are converted to PCPU_REF_ATOMIC_DEAD without
causing any behavior changes.

A BUILD_BUG_ON() is added to percpu_ref_init() so that later flag
additions don't accidentally clobber the lower bits of the pointer in
percpu_ref->percpu_count_ptr.
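
To make the constraint concrete, here is a minimal sketch of the
flags-in-low-bits scheme that the BUILD_BUG_ON() guards (illustration
only, not part of the patch): the percpu pointer is at least
__alignof__(unsigned long) aligned, so its low bits are known to be
zero and are free to carry the mode flags.

#include <linux/percpu.h>

/* example flag bits living in the pointer's alignment bits */
enum {
	EXAMPLE_ATOMIC	= 1UL << 0,	/* plays the role of __PERCPU_REF_ATOMIC */
	EXAMPLE_DEAD	= 1UL << 1,	/* plays the role of __PERCPU_REF_DEAD */
};

/* recover the real percpu pointer by masking the flag bits back off */
static unsigned long __percpu *example_count_ptr(unsigned long ptr_and_flags)
{
	return (unsigned long __percpu *)
		(ptr_and_flags & ~(EXAMPLE_ATOMIC | EXAMPLE_DEAD));
}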

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
---
 include/linux/percpu-refcount.h |  4 +++-
 lib/percpu-refcount.c           | 15 +++++++++------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 910e5f7..24cf157 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -57,6 +57,8 @@ typedef void (percpu_ref_func_t)(struct percpu_ref *);
 /* flags set in the lower bits of percpu_ref->percpu_count_ptr */
 enum {
 	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
+	__PERCPU_REF_DEAD	= 1LU << 1,	/* (being) killed */
+	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,
 };
 
 struct percpu_ref {
@@ -107,7 +109,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
 	/* paired with smp_store_release() in percpu_ref_reinit() */
 	smp_read_barrier_depends();
 
-	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
+	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
 		return false;
 
 	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 7aef590..b0b8c09 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -34,7 +34,7 @@
 static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 {
 	return (unsigned long __percpu *)
-		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
 }
 
 /**
@@ -52,6 +52,9 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 		    gfp_t gfp)
 {
+	BUILD_BUG_ON(__PERCPU_REF_ATOMIC_DEAD &
+		     ~(__alignof__(unsigned long) - 1));
+
 	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
 
 	ref->percpu_count_ptr =
@@ -80,7 +83,7 @@ void percpu_ref_exit(struct percpu_ref *ref)
 
 	if (percpu_count) {
 		free_percpu(percpu_count);
-		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC;
+		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC_DEAD;
 	}
 }
 EXPORT_SYMBOL_GPL(percpu_ref_exit);
@@ -145,10 +148,10 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill)
 {
-	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC,
+	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC_DEAD,
 		  "%s called more than once on %pf!", __func__, ref->release);
 
-	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
+	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC_DEAD;
 	ref->confirm_switch = confirm_kill;
 
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
@@ -180,12 +183,12 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 	 * Restore per-cpu operation.  smp_store_release() is paired with
 	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
 	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following __PERCPU_REF_ATOMIC clearing.
+	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
 
 	smp_store_release(&ref->percpu_count_ptr,
-			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


* [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (4 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 5/9] percpu_ref: add PCPU_REF_DEAD Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:13   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 7/9] percpu_ref: decouple switching to percpu mode and reinit Tejun Heo
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref has treated the dropping of the base reference and
switching to atomic mode as an integral operation; however, there's
nothing inherent tying the two together.

The use cases for percpu_ref have been expanding continuously.  While
the current init/kill/reinit/exit model can cover a lot, the coupling
of kill/reinit with atomic/percpu mode switching is turning out to be
too restrictive for use cases where many percpu_refs are created and
destroyed back-to-back with only some of them reaching extended
operation.  The coupling also makes implementing always-atomic debug
mode difficult.

This patch separates out atomic mode switching into
percpu_ref_switch_to_atomic() and reimplements
percpu_ref_kill_and_confirm() on top of it.

* The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
  differentiated.  Among get/put operations, percpu_ref_tryget_live()
  is the only one which cares about DEAD.

* percpu_ref_switch_to_atomic() can be called multiple times on the
  same ref.  This means that multiple @confirm_switch callbacks may
  get queued up, which we can't do reliably without an extra memory
  area.  This is handled by making the later invocation synchronously
  wait for the completion of the previous one.  This isn't
  particularly desirable but such synchronous waits shouldn't happen
  in most cases (see the usage sketch below).
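
A hypothetical usage sketch (not part of the patch; my_confirm() and
my_quiesce() are made-up names): the caller requests atomic mode and,
after at least a sched RCU grace period, is told that every CPU is
guaranteed to see the switch.

#include <linux/percpu-refcount.h>

static void my_confirm(struct percpu_ref *ref)
{
	/* invoked from RCU callback context; must not block */
	pr_debug("ref %p is now in atomic mode on all CPUs\n", ref);
}

static void my_quiesce(struct percpu_ref *ref)
{
	/* gets and puts stay safe and correct throughout the switch */
	percpu_ref_switch_to_atomic(ref, my_confirm);
}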

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/percpu-refcount.h |   8 ++-
 lib/percpu-refcount.c           | 141 +++++++++++++++++++++++++++++++---------
 2 files changed, 116 insertions(+), 33 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 24cf157..03a02e9 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -76,9 +76,11 @@ struct percpu_ref {
 int __must_check percpu_ref_init(struct percpu_ref *ref,
 				 percpu_ref_func_t *release, gfp_t gfp);
 void percpu_ref_exit(struct percpu_ref *ref);
+void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
+				 percpu_ref_func_t *confirm_switch);
+void percpu_ref_reinit(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
-void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
@@ -109,7 +111,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
 	/* paired with smp_store_release() in percpu_ref_reinit() */
 	smp_read_barrier_depends();
 
-	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
+	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
 		return false;
 
 	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
@@ -191,6 +193,8 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
 	if (__ref_is_percpu(ref, &percpu_count)) {
 		this_cpu_inc(*percpu_count);
 		ret = true;
+	} else if (!(ACCESS_ONCE(ref->percpu_count_ptr) & __PERCPU_REF_DEAD)) {
+		ret = atomic_long_inc_not_zero(&ref->count);
 	}
 
 	rcu_read_unlock_sched();
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index b0b8c09..56a7c0d 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -1,6 +1,8 @@
 #define pr_fmt(fmt) "%s: " fmt "\n", __func__
 
 #include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/wait.h>
 #include <linux/percpu-refcount.h>
 
 /*
@@ -31,6 +33,8 @@
 
 #define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))
 
+static DECLARE_WAIT_QUEUE_HEAD(percpu_ref_switch_waitq);
+
 static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 {
 	return (unsigned long __percpu *)
@@ -88,7 +92,19 @@ void percpu_ref_exit(struct percpu_ref *ref)
 }
 EXPORT_SYMBOL_GPL(percpu_ref_exit);
 
-static void percpu_ref_kill_rcu(struct rcu_head *rcu)
+static void percpu_ref_call_confirm_rcu(struct rcu_head *rcu)
+{
+	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
+
+	ref->confirm_switch(ref);
+	ref->confirm_switch = NULL;
+	wake_up_all(&percpu_ref_switch_waitq);
+
+	/* drop ref from percpu_ref_switch_to_atomic() */
+	percpu_ref_put(ref);
+}
+
+static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 {
 	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
@@ -116,47 +132,79 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);
 
 	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
-		  "percpu ref (%pf) <= 0 (%ld) after killed",
+		  "percpu ref (%pf) <= 0 (%ld) after switching to atomic",
 		  ref->release, atomic_long_read(&ref->count));
 
-	/* @ref is viewed as dead on all CPUs, send out kill confirmation */
-	if (ref->confirm_switch)
-		ref->confirm_switch(ref);
+	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
+	percpu_ref_call_confirm_rcu(rcu);
+}
 
-	/*
-	 * Now we're in single atomic_long_t mode with a consistent
-	 * refcount, so it's safe to drop our initial ref:
-	 */
-	percpu_ref_put(ref);
+static void percpu_ref_noop_confirm_switch(struct percpu_ref *ref)
+{
+}
+
+static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
+					  percpu_ref_func_t *confirm_switch)
+{
+	if (!(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC)) {
+		/* switching from percpu to atomic */
+		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
+
+		/*
+		 * Non-NULL ->confirm_switch is used to indicate that
+		 * switching is in progress.  Use noop one if unspecified.
+		 */
+		WARN_ON_ONCE(ref->confirm_switch);
+		ref->confirm_switch =
+			confirm_switch ?: percpu_ref_noop_confirm_switch;
+
+		percpu_ref_get(ref);	/* put after confirmation */
+		call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu);
+	} else if (confirm_switch) {
+		/*
+		 * Somebody already set ATOMIC.  Switching may still be in
+		 * progress.  @confirm_switch must be invoked after the
+		 * switching is complete and a full sched RCU grace period
+		 * has passed.  Wait synchronously for the previous
+		 * switching and schedule @confirm_switch invocation.
+		 */
+		wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);
+		ref->confirm_switch = confirm_switch;
+
+		percpu_ref_get(ref);	/* put after confirmation */
+		call_rcu_sched(&ref->rcu, percpu_ref_call_confirm_rcu);
+	}
 }
 
 /**
- * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
- * @ref: percpu_ref to kill
- * @confirm_kill: optional confirmation callback
+ * percpu_ref_switch_to_atomic - switch a percpu_ref to atomic mode
+ * @ref: percpu_ref to switch to atomic mode
+ * @confirm_switch: optional confirmation callback
  *
- * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
- * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
- * called after @ref is seen as dead from all CPUs - all further
- * invocations of percpu_ref_tryget_live() will fail.  See
- * percpu_ref_tryget_live() for more details.
+ * There's no reason to use this function for the usual reference counting.
+ * Use percpu_ref_kill[_and_confirm]().
+ *
+ * Schedule switching of @ref to atomic mode.  All its percpu counts will
+ * be collected to the main atomic counter.  On completion, when all CPUs
+ * are guaranteed to be in atomic mode, @confirm_switch, which may not
+ * block, is invoked.  This function may be invoked concurrently with all
+ * the get/put operations and can safely be mixed with kill and reinit
+ * operations.
  *
- * Due to the way percpu_ref is implemented, @confirm_kill will be called
- * after at least one full RCU grace period has passed but this is an
- * implementation detail and callers must not depend on it.
+ * This function normally doesn't block and can be called from any context
+ * but it may block if @confirm_kill is specified and @ref is already in
+ * the process of switching to atomic mode.  In such cases, @confirm_switch
+ * will be invoked after the switching is complete.
+ *
+ * Due to the way percpu_ref is implemented, @confirm_switch will be called
+ * after at least one full sched RCU grace period has passed but this is an
+ * implementation detail and must not be depended upon.
  */
-void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
-				 percpu_ref_func_t *confirm_kill)
+void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
+				 percpu_ref_func_t *confirm_switch)
 {
-	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC_DEAD,
-		  "%s called more than once on %pf!", __func__, ref->release);
-
-	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC_DEAD;
-	ref->confirm_switch = confirm_kill;
-
-	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
+	__percpu_ref_switch_to_atomic(ref, confirm_switch);
 }
-EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
 
 /**
  * percpu_ref_reinit - re-initialize a percpu refcount
@@ -192,3 +240,34 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+
+/**
+ * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
+ * @ref: percpu_ref to kill
+ * @confirm_kill: optional confirmation callback
+ *
+ * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
+ * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
+ * called after @ref is seen as dead from all CPUs at which point all
+ * further invocations of percpu_ref_tryget_live() will fail.  See
+ * percpu_ref_tryget_live() for details.
+ *
+ * This function normally doesn't block and can be called from any context
+ * but it may block if @confirm_kill is specified and @ref is already in
+ * the process of switching to atomic mode by percpu_ref_switch_atomic().
+ *
+ * Due to the way percpu_ref is implemented, @confirm_switch will be called
+ * after at least one full sched RCU grace period has passed but this is an
+ * implementation detail and must not be depended upon.
+ */
+void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
+				 percpu_ref_func_t *confirm_kill)
+{
+	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
+		  "%s called more than once on %pf!", __func__, ref->release);
+
+	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
+	__percpu_ref_switch_to_atomic(ref, confirm_kill);
+	percpu_ref_put(ref);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
-- 
1.9.3


* [PATCH 7/9] percpu_ref: decouple switching to percpu mode and reinit
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (5 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 13:49   ` [PATCH v2 " Tejun Heo
  2014-09-23  5:55 ` [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags Tejun Heo
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

percpu_ref has treated the dropping of the base reference and
switching to atomic mode as an integral operation; however, there's
nothing inherent tying the two together.

The use cases for percpu_ref have been expanding continuously.  While
the current init/kill/reinit/exit model can cover a lot, the coupling
of kill/reinit with atomic/percpu mode switching is turning out to be
too restrictive for use cases where many percpu_refs are created and
destroyed back-to-back with only some of them reaching extended
operation.  The coupling also makes implementing always-atomic debug
mode difficult.

This patch separates out percpu mode switching into
percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
top of it.

* DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
  mode without going through reinit (see the sketch below).
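
A small sketch of the resulting rules (illustration only, not part of
the patch; the example_* functions are made-up names):

#include <linux/percpu-refcount.h>

static void example_live_switch(struct percpu_ref *ref)
{
	/* a live ref can bounce between the two modes directly */
	percpu_ref_switch_to_atomic(ref, NULL);
	percpu_ref_switch_to_percpu(ref);
}

static void example_turn_off(struct percpu_ref *ref)
{
	/* drop the base ref; the ref becomes DEAD and stays atomic */
	percpu_ref_kill(ref);
	/* percpu_ref_switch_to_percpu() is a no-op while DEAD */
}

static void example_turn_on(struct percpu_ref *ref)
{
	/* valid only once the refcount has drained to zero */
	percpu_ref_reinit(ref);		/* live again, back in percpu mode */
}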

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/percpu-refcount.h |  3 +-
 lib/percpu-refcount.c           | 73 ++++++++++++++++++++++++++++++-----------
 2 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 03a02e9..e41ca20 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -78,9 +78,10 @@ int __must_check percpu_ref_init(struct percpu_ref *ref,
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_switch);
-void percpu_ref_reinit(struct percpu_ref *ref);
+void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 56a7c0d..548b19e 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -206,40 +206,54 @@ void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 	__percpu_ref_switch_to_atomic(ref, confirm_switch);
 }
 
-/**
- * percpu_ref_reinit - re-initialize a percpu refcount
- * @ref: perpcu_ref to re-initialize
- *
- * Re-initialize @ref so that it's in the same state as when it finished
- * percpu_ref_init().  @ref must have been initialized successfully, killed
- * and reached 0 but not exited.
- *
- * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
- * this function is in progress.
- */
-void percpu_ref_reinit(struct percpu_ref *ref)
+void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 	int cpu;
 
 	BUG_ON(!percpu_count);
-	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
 
-	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
+	if (!(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC))
+		return;
+
+	wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);
+
+	atomic_long_add(PERCPU_COUNT_BIAS, &ref->count);
 
 	/*
 	 * Restore per-cpu operation.  smp_store_release() is paired with
 	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
 	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
+	 * the following __PERCPU_REF_ATOMIC clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
 
 	smp_store_release(&ref->percpu_count_ptr,
-			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
+			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+}
+
+/**
+ * percpu_ref_switch_to_percpu - switch a percpu_ref to percpu mode
+ * @ref: percpu_ref to switch to percpu mode
+ *
+ * There's no reason to use this function for the usual reference counting.
+ * To re-use an expired ref, use percpu_ref_reinit().
+ *
+ * Switch @ref to percpu mode.  This function may be invoked concurrently
+ * with all the get/put operations and can safely be mixed with kill and
+ * reinit operations.
+ *
+ * This function normally doesn't block and can be called from any context
+ * but it may block if @ref is in the process of switching to atomic mode
+ * by percpu_ref_switch_atomic().
+ */
+void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
+{
+	/* a dying or dead ref can't be switched to percpu mode w/o reinit */
+	if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD))
+		__percpu_ref_switch_to_percpu(ref);
 }
-EXPORT_SYMBOL_GPL(percpu_ref_reinit);
 
 /**
  * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
@@ -253,8 +267,8 @@ EXPORT_SYMBOL_GPL(percpu_ref_reinit);
  * percpu_ref_tryget_live() for details.
  *
  * This function normally doesn't block and can be called from any context
- * but it may block if @confirm_kill is specified and @ref is already in
- * the process of switching to atomic mode by percpu_ref_switch_atomic().
+ * but it may block if @confirm_kill is specified and @ref is in the
+ * process of switching to atomic mode by percpu_ref_switch_atomic().
  *
  * Due to the way percpu_ref is implemented, @confirm_switch will be called
  * after at least one full sched RCU grace period has passed but this is an
@@ -271,3 +285,24 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 	percpu_ref_put(ref);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
+
+/**
+ * percpu_ref_reinit - re-initialize a percpu refcount
+ * @ref: perpcu_ref to re-initialize
+ *
+ * Re-initialize @ref so that it's in the same state as when it finished
+ * percpu_ref_init().  @ref must have been initialized successfully and
+ * reached 0 but not exited.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_reinit(struct percpu_ref *ref)
+{
+	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+
+	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
+	percpu_ref_get(ref);
+	__percpu_ref_switch_to_percpu(ref);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


* [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (6 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 7/9] percpu_ref: decouple switching to percpu mode and reinit Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:16   ` Kent Overstreet
  2014-09-23  5:55 ` [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky Tejun Heo
  2014-09-24 17:32 ` [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

With the recent addition of percpu_ref_reinit(), percpu_ref can now be
used as a persistent switch which can be turned on and off repeatedly,
where turning off maps to killing the ref and waiting for it to drain;
however, there currently isn't a way to initialize a percpu_ref in its
off (killed and drained) state, which can be inconvenient for certain
persistent switch use cases.

Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
selection of the operation mode; however, a newly initialized
percpu_ref is currently always in percpu mode, making it impossible to
avoid the latency overhead of switching to atomic mode.

This patch adds @flags to percpu_ref_init() and implements the
following flags.

* PERCPU_REF_INIT_ATOMIC	: start ref in atomic mode
* PERCPU_REF_INIT_DEAD		: start ref killed and drained

These flags should be able to serve the above two use cases.
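
As an example of the off-state use case (a sketch, not taken from the
patch; my_release(), my_create() and my_enable() are made-up names):

#include <linux/percpu-refcount.h>

static void my_release(struct percpu_ref *ref)
{
	/* the refcount drained back to zero; the switch is off again */
}

static int my_create(struct percpu_ref *ref)
{
	/* start killed and drained; tryget_live() fails until reinit */
	return percpu_ref_init(ref, my_release, PERCPU_REF_INIT_DEAD,
			       GFP_KERNEL);
}

static void my_enable(struct percpu_ref *ref)
{
	/* flip the switch on: the ref is live again with a refcount of 1 */
	percpu_ref_reinit(ref);
}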

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 block/blk-mq.c                  |  2 +-
 fs/aio.c                        |  4 ++--
 include/linux/percpu-refcount.h | 18 +++++++++++++++++-
 kernel/cgroup.c                 |  7 ++++---
 lib/percpu-refcount.c           | 24 +++++++++++++++++++-----
 5 files changed, 43 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 702df07..3f6e6f5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1777,7 +1777,7 @@ struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *set)
 		goto err_hctxs;
 
 	if (percpu_ref_init(&q->mq_usage_counter, blk_mq_usage_counter_release,
-			    GFP_KERNEL))
+			    0, GFP_KERNEL))
 		goto err_map;
 
 	setup_timer(&q->timeout, blk_mq_rq_timer, (unsigned long) q);
diff --git a/fs/aio.c b/fs/aio.c
index 93fbcc0f..9b6d5d6 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -666,10 +666,10 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 
 	INIT_LIST_HEAD(&ctx->active_reqs);
 
-	if (percpu_ref_init(&ctx->users, free_ioctx_users, GFP_KERNEL))
+	if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL))
 		goto err;
 
-	if (percpu_ref_init(&ctx->reqs, free_ioctx_reqs, GFP_KERNEL))
+	if (percpu_ref_init(&ctx->reqs, free_ioctx_reqs, 0, GFP_KERNEL))
 		goto err;
 
 	ctx->cpu = alloc_percpu(struct kioctx_cpu);
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index e41ca20..5f84bf0 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -61,6 +61,21 @@ enum {
 	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,
 };
 
+/* @flags for percpu_ref_init() */
+enum {
+	/*
+	 * Start w/ ref == 1 in atomic mode.  Can be switched to percpu
+	 * operation using percpu_ref_switch_to_percpu().
+	 */
+	PERCPU_REF_INIT_ATOMIC	= 1 << 0,
+
+	/*
+	 * Start dead w/ ref == 0 in atomic mode.  Must be revived with
+	 * percpu_ref_reinit() before used.  Implies INIT_ATOMIC.
+	 */
+	PERCPU_REF_INIT_DEAD	= 1 << 1,
+};
+
 struct percpu_ref {
 	atomic_long_t		count;
 	/*
@@ -74,7 +89,8 @@ struct percpu_ref {
 };
 
 int __must_check percpu_ref_init(struct percpu_ref *ref,
-				 percpu_ref_func_t *release, gfp_t gfp);
+				 percpu_ref_func_t *release, unsigned int flags,
+				 gfp_t gfp);
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_switch);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 589b4d8..e2fbcc1 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1628,7 +1628,8 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned int ss_mask)
 		goto out;
 	root_cgrp->id = ret;
 
-	ret = percpu_ref_init(&root_cgrp->self.refcnt, css_release, GFP_KERNEL);
+	ret = percpu_ref_init(&root_cgrp->self.refcnt, css_release, 0,
+			      GFP_KERNEL);
 	if (ret)
 		goto out;
 
@@ -4487,7 +4488,7 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
 
 	init_and_link_css(css, ss, cgrp);
 
-	err = percpu_ref_init(&css->refcnt, css_release, GFP_KERNEL);
+	err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
 	if (err)
 		goto err_free_css;
 
@@ -4555,7 +4556,7 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 		goto out_unlock;
 	}
 
-	ret = percpu_ref_init(&cgrp->self.refcnt, css_release, GFP_KERNEL);
+	ret = percpu_ref_init(&cgrp->self.refcnt, css_release, 0, GFP_KERNEL);
 	if (ret)
 		goto out_free_cgrp;
 
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 548b19e..74ec33e 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -45,27 +45,41 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
  * percpu_ref_init - initialize a percpu refcount
  * @ref: percpu_ref to initialize
  * @release: function which will be called when refcount hits 0
+ * @flags: PERCPU_REF_INIT_* flags
  * @gfp: allocation mask to use
  *
- * Initializes the refcount in single atomic counter mode with a refcount of 1;
- * analagous to atomic_long_set(ref, 1).
+ * Initializes @ref.  If @flags is zero, @ref starts in percpu mode with a
+ * refcount of 1; analogous to atomic_long_set(ref, 1).  See the
+ * definitions of PERCPU_REF_INIT_* flags for flag behaviors.
  *
  * Note that @release must not sleep - it may potentially be called from RCU
  * callback context by percpu_ref_kill().
  */
 int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
-		    gfp_t gfp)
+		    unsigned int flags, gfp_t gfp)
 {
+	unsigned long start_count = 0;
+
 	BUILD_BUG_ON(__PERCPU_REF_ATOMIC_DEAD &
 		     ~(__alignof__(unsigned long) - 1));
 
-	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
-
 	ref->percpu_count_ptr =
 		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
 	if (!ref->percpu_count_ptr)
 		return -ENOMEM;
 
+	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
+		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
+	else
+		start_count += PERCPU_COUNT_BIAS;
+
+	if (flags & PERCPU_REF_INIT_DEAD)
+		ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
+	else
+		start_count++;
+
+	atomic_long_set(&ref->count, start_count);
+
 	ref->release = release;
 	return 0;
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (7 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags Tejun Heo
@ 2014-09-23  5:55 ` Tejun Heo
  2014-09-23 21:17   ` Kent Overstreet
  2014-09-24 17:32 ` [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
  9 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23  5:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes, Tejun Heo

Currently, a percpu_ref which is initialized with
PERCPU_REF_INIT_ATOMIC or switched to atomic mode via
switch_to_atomic() automatically reverts to percpu mode on the first
percpu_ref_reinit().  This makes the atomic mode difficult to use for
cases where a percpu_ref is used as a persistent on/off switch which
may be cycled multiple times.

This patch makes such atomic state sticky so that it survives through
kill/reinit cycles.  After this patch, atomic state is cleared only by
an explicit percpu_ref_switch_to_percpu() call.
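
A minimal sketch of the resulting behavior (hypothetical caller; 'ref'
and its release path are assumed to exist elsewhere):

	percpu_ref_switch_to_atomic(&ref, NULL);	/* atomic from here on */
	percpu_ref_kill(&ref);
	/* ... ref drains and reaches zero ... */
	percpu_ref_reinit(&ref);		/* stays in atomic mode */
	percpu_ref_switch_to_percpu(&ref);	/* only this clears the sticky state */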

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/percpu-refcount.h |  5 ++++-
 lib/percpu-refcount.c           | 20 +++++++++++++++-----
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 5f84bf0..8459d3a 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -65,7 +65,9 @@ enum {
 enum {
 	/*
 	 * Start w/ ref == 1 in atomic mode.  Can be switched to percpu
-	 * operation using percpu_ref_switch_to_percpu().
+	 * operation using percpu_ref_switch_to_percpu().  If initialized
+	 * with this flag, the ref will stay in atomic mode until
+	 * percpu_ref_switch_to_percpu() is invoked on it.
 	 */
 	PERCPU_REF_INIT_ATOMIC	= 1 << 0,
 
@@ -85,6 +87,7 @@ struct percpu_ref {
 	unsigned long		percpu_count_ptr;
 	percpu_ref_func_t	*release;
 	percpu_ref_func_t	*confirm_switch;
+	bool			force_atomic:1;
 	struct rcu_head		rcu;
 };
 
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 74ec33e..c47e496 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -68,6 +68,8 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 	if (!ref->percpu_count_ptr)
 		return -ENOMEM;
 
+	ref->force_atomic = flags & PERCPU_REF_INIT_ATOMIC;
+
 	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
 		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
 	else
@@ -203,7 +205,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
  * are guaraneed to be in atomic mode, @confirm_switch, which may not
  * block, is invoked.  This function may be invoked concurrently with all
  * the get/put operations and can safely be mixed with kill and reinit
- * operations.
+ * operations.  Note that @ref will stay in atomic mode across kill/reinit
+ * cycles until percpu_ref_switch_to_percpu() is called.
  *
  * This function normally doesn't block and can be called from any context
  * but it may block if @confirm_kill is specified and @ref is already in
@@ -217,6 +220,7 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_switch)
 {
+	ref->force_atomic = true;
 	__percpu_ref_switch_to_atomic(ref, confirm_switch);
 }
 
@@ -256,7 +260,10 @@ void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
  *
  * Switch @ref to percpu mode.  This function may be invoked concurrently
  * with all the get/put operations and can safely be mixed with kill and
- * reinit operations.
+ * reinit operations.  This function reverses the sticky atomic state set
+ * by PERCPU_REF_INIT_ATOMIC or percpu_ref_switch_to_atomic().  If @ref is
+ * dying or dead, the actual switching takes place on the following
+ * percpu_ref_reinit().
  *
  * This function normally doesn't block and can be called from any context
  * but it may block if @ref is in the process of switching to atomic mode
@@ -264,6 +271,8 @@ void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
  */
 void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 {
+	ref->force_atomic = false;
+
 	/* a dying or dead ref can't be switched to percpu mode w/o reinit */
 	if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD))
 		__percpu_ref_switch_to_percpu(ref);
@@ -305,8 +314,8 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
  * @ref: perpcu_ref to re-initialize
  *
  * Re-initialize @ref so that it's in the same state as when it finished
- * percpu_ref_init().  @ref must have been initialized successfully and
- * reached 0 but not exited.
+ * percpu_ref_init() ignoring %PERCPU_REF_INIT_DEAD.  @ref must have been
+ * initialized successfully and reached 0 but not exited.
  *
  * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
  * this function is in progress.
@@ -317,6 +326,7 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 
 	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
 	percpu_ref_get(ref);
-	__percpu_ref_switch_to_percpu(ref);
+	if (!ref->force_atomic)
+		__percpu_ref_switch_to_percpu(ref);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/9] percpu_ref: add PCPU_REF_DEAD
  2014-09-23  5:55 ` [PATCH 5/9] percpu_ref: add PCPU_REF_DEAD Tejun Heo
@ 2014-09-23 13:48   ` Tejun Heo
  2014-09-23 21:14     ` Kent Overstreet
  0 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23 13:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes

From 6f5bdc32c66317416c13eedb68ead2b36fb02603 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 23 Sep 2014 09:45:56 -0400

percpu_ref will be restructured so that percpu/atomic mode switching
and reference killing are decoupled.  In preparation, add
PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD, which is the OR of ATOMIC and DEAD.
For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
behavior changes.

percpu_ref_init() now specifies an explicit alignment when allocating
the percpu counters so that the pointer has enough unused low bits to
accommodate the flags.  Note that one flag was fine as the minimum
alignment for percpu memory is 2 bytes, but two flags are already too
many for the natural alignment of unsigned longs on archs like cris
and m68k.
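
Purely as a worked illustration of the above (the alignof values for
x86-64 and m68k are assumed from their usual ABIs):

	size_t align = max_t(size_t, 1 << __PERCPU_REF_FLAG_BITS,	/* 1 << 2 == 4 */
			     __alignof__(unsigned long));
	/* x86-64: max(4, 8) == 8;  m68k (alignof == 2): max(4, 2) == 4 */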

v2: The original patch had BUILD_BUG_ON() which triggers if unsigned
    long's alignment isn't enough to accommodate the flags, which
    triggered on cris and m68k.  percpu_ref_init() updated to specify
    the required alignment explicitly.  Reported by Fengguang.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: kbuild test robot <fengguang.wu@intel.com>
---
 include/linux/percpu-refcount.h |  6 +++++-
 lib/percpu-refcount.c           | 19 +++++++++++--------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 910e5f7..bd9483d 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -57,6 +57,10 @@ typedef void (percpu_ref_func_t)(struct percpu_ref *);
 /* flags set in the lower bits of percpu_ref->percpu_count_ptr */
 enum {
 	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
+	__PERCPU_REF_DEAD	= 1LU << 1,	/* (being) killed */
+	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,
+
+	__PERCPU_REF_FLAG_BITS	= 2,
 };
 
 struct percpu_ref {
@@ -107,7 +111,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
 	/* paired with smp_store_release() in percpu_ref_reinit() */
 	smp_read_barrier_depends();
 
-	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
+	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
 		return false;
 
 	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 7aef590..e2ff19f 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -34,7 +34,7 @@
 static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 {
 	return (unsigned long __percpu *)
-		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
 }
 
 /**
@@ -52,10 +52,13 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
 int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 		    gfp_t gfp)
 {
+	size_t align = max_t(size_t, 1 << __PERCPU_REF_FLAG_BITS,
+			     __alignof__(unsigned long));
+
 	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
 
-	ref->percpu_count_ptr =
-		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
+	ref->percpu_count_ptr = (unsigned long)
+		__alloc_percpu_gfp(sizeof(unsigned long), align, gfp);
 	if (!ref->percpu_count_ptr)
 		return -ENOMEM;
 
@@ -80,7 +83,7 @@ void percpu_ref_exit(struct percpu_ref *ref)
 
 	if (percpu_count) {
 		free_percpu(percpu_count);
-		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC;
+		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC_DEAD;
 	}
 }
 EXPORT_SYMBOL_GPL(percpu_ref_exit);
@@ -145,10 +148,10 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill)
 {
-	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC,
+	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC_DEAD,
 		  "%s called more than once on %pf!", __func__, ref->release);
 
-	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
+	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC_DEAD;
 	ref->confirm_switch = confirm_kill;
 
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
@@ -180,12 +183,12 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 	 * Restore per-cpu operation.  smp_store_release() is paired with
 	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
 	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following __PERCPU_REF_ATOMIC clearing.
+	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
 
 	smp_store_release(&ref->percpu_count_ptr,
-			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 7/9] percpu_ref: decouple switching to percpu mode and reinit
  2014-09-23  5:55 ` [PATCH 7/9] percpu_ref: decouple switching to percpu mode and reinit Tejun Heo
@ 2014-09-23 13:49   ` Tejun Heo
  2014-09-23 21:15     ` Kent Overstreet
  0 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2014-09-23 13:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes

From c31543be3b12bd3cb3f8c037cefb89504d294f82 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 23 Sep 2014 09:45:56 -0400

percpu_ref has treated the dropping of the base reference and
switching to atomic mode as an integral operation; however, there's
nothing inherent tying the two together.

The use cases for percpu_ref have been expanding continuously.  While
the current init/kill/reinit/exit model can cover a lot, the coupling
of kill/reinit with atomic/percpu mode switching is turning out to be
too restrictive for use cases where many percpu_refs are created and
destroyed back-to-back with only some of them reaching extended
operation.  The coupling also makes implementing always-atomic debug
mode difficult.

This patch separates out percpu mode switching into
percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
top of it.

* DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
  mode w/o going through reinit.
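
A short sketch of that rule (hypothetical caller; 'ref' is assumed live
beforehand):

	percpu_ref_kill(&ref);			/* ref is now ATOMIC | DEAD */
	percpu_ref_switch_to_percpu(&ref);	/* ignored while DEAD */
	/* ... ref drains and reaches zero ... */
	percpu_ref_reinit(&ref);		/* percpu operation resumes here */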

v2: __percpu_ref_switch_to_percpu() was missing static.  Fixed.
    Reported by Fengguang aka kbuild test robot.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
---
 include/linux/percpu-refcount.h |  3 +-
 lib/percpu-refcount.c           | 73 ++++++++++++++++++++++++++++++-----------
 2 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d1252e1..cd7e20f 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -80,9 +80,10 @@ int __must_check percpu_ref_init(struct percpu_ref *ref,
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_switch);
-void percpu_ref_reinit(struct percpu_ref *ref);
+void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 6e0d143..5a6d43b 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -206,40 +206,54 @@ void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 	__percpu_ref_switch_to_atomic(ref, confirm_switch);
 }
 
-/**
- * percpu_ref_reinit - re-initialize a percpu refcount
- * @ref: perpcu_ref to re-initialize
- *
- * Re-initialize @ref so that it's in the same state as when it finished
- * percpu_ref_init().  @ref must have been initialized successfully, killed
- * and reached 0 but not exited.
- *
- * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
- * this function is in progress.
- */
-void percpu_ref_reinit(struct percpu_ref *ref)
+static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 	int cpu;
 
 	BUG_ON(!percpu_count);
-	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
 
-	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
+	if (!(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC))
+		return;
+
+	wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);
+
+	atomic_long_add(PERCPU_COUNT_BIAS, &ref->count);
 
 	/*
 	 * Restore per-cpu operation.  smp_store_release() is paired with
 	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
 	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
+	 * the following __PERCPU_REF_ATOMIC clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
 
 	smp_store_release(&ref->percpu_count_ptr,
-			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
+			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
+}
+
+/**
+ * percpu_ref_switch_to_percpu - switch a percpu_ref to percpu mode
+ * @ref: percpu_ref to switch to percpu mode
+ *
+ * There's no reason to use this function for the usual reference counting.
+ * To re-use an expired ref, use percpu_ref_reinit().
+ *
+ * Switch @ref to percpu mode.  This function may be invoked concurrently
+ * with all the get/put operations and can safely be mixed with kill and
+ * reinit operations.
+ *
+ * This function normally doesn't block and can be called from any context
+ * but it may block if @ref is in the process of switching to atomic mode
+ * by percpu_ref_switch_atomic().
+ */
+void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
+{
+	/* a dying or dead ref can't be switched to percpu mode w/o reinit */
+	if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD))
+		__percpu_ref_switch_to_percpu(ref);
 }
-EXPORT_SYMBOL_GPL(percpu_ref_reinit);
 
 /**
  * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
@@ -253,8 +267,8 @@ EXPORT_SYMBOL_GPL(percpu_ref_reinit);
  * percpu_ref_tryget_live() for details.
  *
  * This function normally doesn't block and can be called from any context
- * but it may block if @confirm_kill is specified and @ref is already in
- * the process of switching to atomic mode by percpu_ref_switch_atomic().
+ * but it may block if @confirm_kill is specified and @ref is in the
+ * process of switching to atomic mode by percpu_ref_switch_atomic().
  *
  * Due to the way percpu_ref is implemented, @confirm_switch will be called
  * after at least one full sched RCU grace period has passed but this is an
@@ -271,3 +285,24 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 	percpu_ref_put(ref);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
+
+/**
+ * percpu_ref_reinit - re-initialize a percpu refcount
+ * @ref: percpu_ref to re-initialize
+ *
+ * Re-initialize @ref so that it's in the same state as when it finished
+ * percpu_ref_init().  @ref must have been initialized successfully and
+ * reached 0 but not exited.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_reinit(struct percpu_ref *ref)
+{
+	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+
+	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
+	percpu_ref_get(ref);
+	__percpu_ref_switch_to_percpu(ref);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_reinit);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit()
  2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
@ 2014-09-23 21:01   ` Kent Overstreet
  2014-09-23 21:07   ` Kent Overstreet
  1 sibling, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:01 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:10AM -0400, Tejun Heo wrote:
> percpu_ref is gonna go through restructuring.  Move
> percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
> make later changes easier to follow and result in cleaner
> organization.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |  2 +-
>  lib/percpu-refcount.c           | 70 ++++++++++++++++++++---------------------
>  2 files changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 5df6784..f015f13 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -68,10 +68,10 @@ struct percpu_ref {
>  
>  int __must_check percpu_ref_init(struct percpu_ref *ref,
>  				 percpu_ref_func_t *release, gfp_t gfp);
> -void percpu_ref_reinit(struct percpu_ref *ref);
>  void percpu_ref_exit(struct percpu_ref *ref);
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill);
> +void percpu_ref_reinit(struct percpu_ref *ref);
>  
>  /**
>   * percpu_ref_kill - drop the initial ref
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 559ee0b..070dab5 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -63,41 +63,6 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
>  EXPORT_SYMBOL_GPL(percpu_ref_init);
>  
>  /**
> - * percpu_ref_reinit - re-initialize a percpu refcount
> - * @ref: perpcu_ref to re-initialize
> - *
> - * Re-initialize @ref so that it's in the same state as when it finished
> - * percpu_ref_init().  @ref must have been initialized successfully, killed
> - * and reached 0 but not exited.
> - *
> - * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> - * this function is in progress.
> - */
> -void percpu_ref_reinit(struct percpu_ref *ref)
> -{
> -	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> -	int cpu;
> -
> -	BUG_ON(!pcpu_count);
> -	WARN_ON(!percpu_ref_is_zero(ref));
> -
> -	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> -
> -	/*
> -	 * Restore per-cpu operation.  smp_store_release() is paired with
> -	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
> -	 * that the zeroing is visible to all percpu accesses which can see
> -	 * the following PCPU_REF_DEAD clearing.
> -	 */
> -	for_each_possible_cpu(cpu)
> -		*per_cpu_ptr(pcpu_count, cpu) = 0;
> -
> -	smp_store_release(&ref->pcpu_count_ptr,
> -			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> -}
> -EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -
> -/**
>   * percpu_ref_exit - undo percpu_ref_init()
>   * @ref: percpu_ref to exit
>   *
> @@ -189,3 +154,38 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
> +
> +/**
> + * percpu_ref_reinit - re-initialize a percpu refcount
> + * @ref: perpcu_ref to re-initialize
> + *
> + * Re-initialize @ref so that it's in the same state as when it finished
> + * percpu_ref_init().  @ref must have been initialized successfully, killed
> + * and reached 0 but not exited.
> + *
> + * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> + * this function is in progress.
> + */
> +void percpu_ref_reinit(struct percpu_ref *ref)
> +{
> +	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> +	int cpu;
> +
> +	BUG_ON(!pcpu_count);
> +	WARN_ON(!percpu_ref_is_zero(ref));
> +
> +	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> +
> +	/*
> +	 * Restore per-cpu operation.  smp_store_release() is paired with
> +	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
> +	 * that the zeroing is visible to all percpu accesses which can see
> +	 * the following PCPU_REF_DEAD clearing.
> +	 */
> +	for_each_possible_cpu(cpu)
> +		*per_cpu_ptr(pcpu_count, cpu) = 0;
> +
> +	smp_store_release(&ref->pcpu_count_ptr,
> +			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> +}
> +EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit()
  2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
  2014-09-23 21:01   ` Kent Overstreet
@ 2014-09-23 21:07   ` Kent Overstreet
  1 sibling, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:07 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:10AM -0400, Tejun Heo wrote:
> percpu_ref is gonna go through restructuring.  Move
> percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
> make later changes easier to follow and result in cleaner
> organization.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |  2 +-
>  lib/percpu-refcount.c           | 70 ++++++++++++++++++++---------------------
>  2 files changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 5df6784..f015f13 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -68,10 +68,10 @@ struct percpu_ref {
>  
>  int __must_check percpu_ref_init(struct percpu_ref *ref,
>  				 percpu_ref_func_t *release, gfp_t gfp);
> -void percpu_ref_reinit(struct percpu_ref *ref);
>  void percpu_ref_exit(struct percpu_ref *ref);
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill);
> +void percpu_ref_reinit(struct percpu_ref *ref);
>  
>  /**
>   * percpu_ref_kill - drop the initial ref
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 559ee0b..070dab5 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -63,41 +63,6 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
>  EXPORT_SYMBOL_GPL(percpu_ref_init);
>  
>  /**
> - * percpu_ref_reinit - re-initialize a percpu refcount
> - * @ref: perpcu_ref to re-initialize
> - *
> - * Re-initialize @ref so that it's in the same state as when it finished
> - * percpu_ref_init().  @ref must have been initialized successfully, killed
> - * and reached 0 but not exited.
> - *
> - * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> - * this function is in progress.
> - */
> -void percpu_ref_reinit(struct percpu_ref *ref)
> -{
> -	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> -	int cpu;
> -
> -	BUG_ON(!pcpu_count);
> -	WARN_ON(!percpu_ref_is_zero(ref));
> -
> -	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> -
> -	/*
> -	 * Restore per-cpu operation.  smp_store_release() is paired with
> -	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
> -	 * that the zeroing is visible to all percpu accesses which can see
> -	 * the following PCPU_REF_DEAD clearing.
> -	 */
> -	for_each_possible_cpu(cpu)
> -		*per_cpu_ptr(pcpu_count, cpu) = 0;
> -
> -	smp_store_release(&ref->pcpu_count_ptr,
> -			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> -}
> -EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -
> -/**
>   * percpu_ref_exit - undo percpu_ref_init()
>   * @ref: percpu_ref to exit
>   *
> @@ -189,3 +154,38 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
> +
> +/**
> + * percpu_ref_reinit - re-initialize a percpu refcount
> + * @ref: perpcu_ref to re-initialize
> + *
> + * Re-initialize @ref so that it's in the same state as when it finished
> + * percpu_ref_init().  @ref must have been initialized successfully, killed
> + * and reached 0 but not exited.
> + *
> + * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> + * this function is in progress.
> + */
> +void percpu_ref_reinit(struct percpu_ref *ref)
> +{
> +	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> +	int cpu;
> +
> +	BUG_ON(!pcpu_count);
> +	WARN_ON(!percpu_ref_is_zero(ref));
> +
> +	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> +
> +	/*
> +	 * Restore per-cpu operation.  smp_store_release() is paired with
> +	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
> +	 * that the zeroing is visible to all percpu accesses which can see
> +	 * the following PCPU_REF_DEAD clearing.
> +	 */
> +	for_each_possible_cpu(cpu)
> +		*per_cpu_ptr(pcpu_count, cpu) = 0;
> +
> +	smp_store_release(&ref->pcpu_count_ptr,
> +			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> +}
> +EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] percpu_ref: minor code and comment updates
  2014-09-23  5:55 ` [PATCH 2/9] percpu_ref: minor code and comment updates Tejun Heo
@ 2014-09-23 21:09   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:11AM -0400, Tejun Heo wrote:
> * Some comments became stale.  Updated.
> * percpu_ref_tryget() unnecessarily initializes @ret.  Removed.
> * A blank line removed from percpu_ref_kill_rcu().
> * Explicit function name in a WARN format string replaced with __func__.
> * WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h | 25 ++++++++++++++++---------
>  lib/percpu-refcount.c           | 14 ++++++--------
>  2 files changed, 22 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index f015f13..d44b027 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -115,8 +115,10 @@ static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
>   * percpu_ref_get - increment a percpu refcount
>   * @ref: percpu_ref to get
>   *
> - * Analagous to atomic_inc().
> -  */
> + * Analagous to atomic_long_inc().
> + *
> + * This function is safe to call as long as @ref is between init and exit.
> + */
>  static inline void percpu_ref_get(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *pcpu_count;
> @@ -138,12 +140,12 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
>   * Increment a percpu refcount unless its count already reached zero.
>   * Returns %true on success; %false on failure.
>   *
> - * The caller is responsible for ensuring that @ref stays accessible.
> + * This function is safe to call as long as @ref is between init and exit.
>   */
>  static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *pcpu_count;
> -	int ret = false;
> +	int ret;
>  
>  	rcu_read_lock_sched();
>  
> @@ -166,12 +168,13 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>   * Increment a percpu refcount unless it has already been killed.  Returns
>   * %true on success; %false on failure.
>   *
> - * Completion of percpu_ref_kill() in itself doesn't guarantee that tryget
> - * will fail.  For such guarantee, percpu_ref_kill_and_confirm() should be
> - * used.  After the confirm_kill callback is invoked, it's guaranteed that
> - * no new reference will be given out by percpu_ref_tryget().
> + * Completion of percpu_ref_kill() in itself doesn't guarantee that this
> + * function will fail.  For such guarantee, percpu_ref_kill_and_confirm()
> + * should be used.  After the confirm_kill callback is invoked, it's
> + * guaranteed that no new reference will be given out by
> + * percpu_ref_tryget_live().
>   *
> - * The caller is responsible for ensuring that @ref stays accessible.
> + * This function is safe to call as long as @ref is between init and exit.
>   */
>  static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>  {
> @@ -196,6 +199,8 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>   *
>   * Decrement the refcount, and if 0, call the release function (which was passed
>   * to percpu_ref_init())
> + *
> + * This function is safe to call as long as @ref is between init and exit.
>   */
>  static inline void percpu_ref_put(struct percpu_ref *ref)
>  {
> @@ -216,6 +221,8 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
>   * @ref: percpu_ref to test
>   *
>   * Returns %true if @ref reached zero.
> + *
> + * This function is safe to call as long as @ref is between init and exit.
>   */
>  static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
>  {
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 070dab5..8ef3f5c 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -108,7 +108,6 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  	 * reaching 0 before we add the percpu counts. But doing it at the same
>  	 * time is equivalent and saves us atomic operations:
>  	 */
> -
>  	atomic_long_add((long)count - PCPU_COUNT_BIAS, &ref->count);
>  
>  	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
> @@ -120,8 +119,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  		ref->confirm_kill(ref);
>  
>  	/*
> -	 * Now we're in single atomic_t mode with a consistent refcount, so it's
> -	 * safe to drop our initial ref:
> +	 * Now we're in single atomic_long_t mode with a consistent
> +	 * refcount, so it's safe to drop our initial ref:
>  	 */
>  	percpu_ref_put(ref);
>  }
> @@ -134,8 +133,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>   * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
>   * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
>   * called after @ref is seen as dead from all CPUs - all further
> - * invocations of percpu_ref_tryget() will fail.  See percpu_ref_tryget()
> - * for more details.
> + * invocations of percpu_ref_tryget_live() will fail.  See
> + * percpu_ref_tryget_live() for more details.
>   *
>   * Due to the way percpu_ref is implemented, @confirm_kill will be called
>   * after at least one full RCU grace period has passed but this is an
> @@ -145,8 +144,7 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill)
>  {
>  	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
> -		  "percpu_ref_kill() called more than once on %pf!",
> -		  ref->release);
> +		  "%s called more than once on %pf!", __func__, ref->release);
>  
>  	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
>  	ref->confirm_kill = confirm_kill;
> @@ -172,7 +170,7 @@ void percpu_ref_reinit(struct percpu_ref *ref)
>  	int cpu;
>  
>  	BUG_ON(!pcpu_count);
> -	WARN_ON(!percpu_ref_is_zero(ref));
> +	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
>  
>  	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
>  
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_
  2014-09-23  5:55 ` [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_ Tejun Heo
@ 2014-09-23 21:10   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:12AM -0400, Tejun Heo wrote:
> percpu_ref uses pcpu_ prefix for internal stuff and percpu_ for
> externally visible ones.  This is the same convention used in the
> percpu allocator implementation.  It works fine there but percpu_ref
> doesn't have too much internal-only stuff and scattered usages of
> pcpu_ prefix are more confusing than helpful.
> 
> This patch replaces all pcpu_ prefixes with percpu_.  This is pure
> rename and there's no functional change.  Note that PCPU_REF_DEAD is
> renamed to __PERCPU_REF_DEAD to signify that the flag is internal.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h | 46 ++++++++++++++++-----------------
>  lib/percpu-refcount.c           | 56 +++++++++++++++++++++--------------------
>  2 files changed, 52 insertions(+), 50 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index d44b027..3d463a3 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -13,7 +13,7 @@
>   *
>   * The refcount will have a range of 0 to ((1U << 31) - 1), i.e. one bit less
>   * than an atomic_t - this is because of the way shutdown works, see
> - * percpu_ref_kill()/PCPU_COUNT_BIAS.
> + * percpu_ref_kill()/PERCPU_COUNT_BIAS.
>   *
>   * Before you call percpu_ref_kill(), percpu_ref_put() does not check for the
>   * refcount hitting 0 - it can't, if it was in percpu mode. percpu_ref_kill()
> @@ -60,7 +60,7 @@ struct percpu_ref {
>  	 * The low bit of the pointer indicates whether the ref is in percpu
>  	 * mode; if set, then get/put will manipulate the atomic_t.
>  	 */
> -	unsigned long		pcpu_count_ptr;
> +	unsigned long		percpu_count_ptr;
>  	percpu_ref_func_t	*release;
>  	percpu_ref_func_t	*confirm_kill;
>  	struct rcu_head		rcu;
> @@ -88,26 +88,26 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
>  	return percpu_ref_kill_and_confirm(ref, NULL);
>  }
>  
> -#define PCPU_REF_DEAD		1
> +#define __PERCPU_REF_DEAD	1
>  
>  /*
>   * Internal helper.  Don't use outside percpu-refcount proper.  The
>   * function doesn't return the pointer and let the caller test it for NULL
>   * because doing so forces the compiler to generate two conditional
> - * branches as it can't assume that @ref->pcpu_count is not NULL.
> + * branches as it can't assume that @ref->percpu_count is not NULL.
>   */
> -static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
> -				    unsigned long __percpu **pcpu_countp)
> +static inline bool __percpu_ref_alive(struct percpu_ref *ref,
> +				      unsigned long __percpu **percpu_countp)
>  {
> -	unsigned long pcpu_ptr = ACCESS_ONCE(ref->pcpu_count_ptr);
> +	unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
>  
>  	/* paired with smp_store_release() in percpu_ref_reinit() */
>  	smp_read_barrier_depends();
>  
> -	if (unlikely(pcpu_ptr & PCPU_REF_DEAD))
> +	if (unlikely(percpu_ptr & __PERCPU_REF_DEAD))
>  		return false;
>  
> -	*pcpu_countp = (unsigned long __percpu *)pcpu_ptr;
> +	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
>  	return true;
>  }
>  
> @@ -121,12 +121,12 @@ static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
>   */
>  static inline void percpu_ref_get(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count;
> +	unsigned long __percpu *percpu_count;
>  
>  	rcu_read_lock_sched();
>  
> -	if (__pcpu_ref_alive(ref, &pcpu_count))
> -		this_cpu_inc(*pcpu_count);
> +	if (__percpu_ref_alive(ref, &percpu_count))
> +		this_cpu_inc(*percpu_count);
>  	else
>  		atomic_long_inc(&ref->count);
>  
> @@ -144,13 +144,13 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
>   */
>  static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count;
> +	unsigned long __percpu *percpu_count;
>  	int ret;
>  
>  	rcu_read_lock_sched();
>  
> -	if (__pcpu_ref_alive(ref, &pcpu_count)) {
> -		this_cpu_inc(*pcpu_count);
> +	if (__percpu_ref_alive(ref, &percpu_count)) {
> +		this_cpu_inc(*percpu_count);
>  		ret = true;
>  	} else {
>  		ret = atomic_long_inc_not_zero(&ref->count);
> @@ -178,13 +178,13 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>   */
>  static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count;
> +	unsigned long __percpu *percpu_count;
>  	int ret = false;
>  
>  	rcu_read_lock_sched();
>  
> -	if (__pcpu_ref_alive(ref, &pcpu_count)) {
> -		this_cpu_inc(*pcpu_count);
> +	if (__percpu_ref_alive(ref, &percpu_count)) {
> +		this_cpu_inc(*percpu_count);
>  		ret = true;
>  	}
>  
> @@ -204,12 +204,12 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>   */
>  static inline void percpu_ref_put(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count;
> +	unsigned long __percpu *percpu_count;
>  
>  	rcu_read_lock_sched();
>  
> -	if (__pcpu_ref_alive(ref, &pcpu_count))
> -		this_cpu_dec(*pcpu_count);
> +	if (__percpu_ref_alive(ref, &percpu_count))
> +		this_cpu_dec(*percpu_count);
>  	else if (unlikely(atomic_long_dec_and_test(&ref->count)))
>  		ref->release(ref);
>  
> @@ -226,9 +226,9 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
>   */
>  static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count;
> +	unsigned long __percpu *percpu_count;
>  
> -	if (__pcpu_ref_alive(ref, &pcpu_count))
> +	if (__percpu_ref_alive(ref, &percpu_count))
>  		return false;
>  	return !atomic_long_read(&ref->count);
>  }
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 8ef3f5c..5aea6b7 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -11,8 +11,8 @@
>   * percpu counters will all sum to the correct value
>   *
>   * (More precisely: because moduler arithmatic is commutative the sum of all the
> - * pcpu_count vars will be equal to what it would have been if all the gets and
> - * puts were done to a single integer, even if some of the percpu integers
> + * percpu_count vars will be equal to what it would have been if all the gets
> + * and puts were done to a single integer, even if some of the percpu integers
>   * overflow or underflow).
>   *
>   * The real trick to implementing percpu refcounts is shutdown. We can't detect
> @@ -29,11 +29,12 @@
>   * atomic_long_t can't hit 0 before we've added up all the percpu refs.
>   */
>  
> -#define PCPU_COUNT_BIAS		(1LU << (BITS_PER_LONG - 1))
> +#define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))
>  
> -static unsigned long __percpu *pcpu_count_ptr(struct percpu_ref *ref)
> +static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>  {
> -	return (unsigned long __percpu *)(ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> +	return (unsigned long __percpu *)
> +		(ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
>  }
>  
>  /**
> @@ -51,10 +52,11 @@ static unsigned long __percpu *pcpu_count_ptr(struct percpu_ref *ref)
>  int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
>  		    gfp_t gfp)
>  {
> -	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> +	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
>  
> -	ref->pcpu_count_ptr = (unsigned long)alloc_percpu_gfp(unsigned long, gfp);
> -	if (!ref->pcpu_count_ptr)
> +	ref->percpu_count_ptr =
> +		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
> +	if (!ref->percpu_count_ptr)
>  		return -ENOMEM;
>  
>  	ref->release = release;
> @@ -74,11 +76,11 @@ EXPORT_SYMBOL_GPL(percpu_ref_init);
>   */
>  void percpu_ref_exit(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> +	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
>  
> -	if (pcpu_count) {
> -		free_percpu(pcpu_count);
> -		ref->pcpu_count_ptr = PCPU_REF_DEAD;
> +	if (percpu_count) {
> +		free_percpu(percpu_count);
> +		ref->percpu_count_ptr = __PERCPU_REF_DEAD;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_exit);
> @@ -86,14 +88,14 @@ EXPORT_SYMBOL_GPL(percpu_ref_exit);
>  static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  {
>  	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
> -	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> +	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
>  	unsigned long count = 0;
>  	int cpu;
>  
>  	for_each_possible_cpu(cpu)
> -		count += *per_cpu_ptr(pcpu_count, cpu);
> +		count += *per_cpu_ptr(percpu_count, cpu);
>  
> -	pr_debug("global %ld pcpu %ld",
> +	pr_debug("global %ld percpu %ld",
>  		 atomic_long_read(&ref->count), (long)count);
>  
>  	/*
> @@ -108,7 +110,7 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  	 * reaching 0 before we add the percpu counts. But doing it at the same
>  	 * time is equivalent and saves us atomic operations:
>  	 */
> -	atomic_long_add((long)count - PCPU_COUNT_BIAS, &ref->count);
> +	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);
>  
>  	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
>  		  "percpu ref (%pf) <= 0 (%ld) after killed",
> @@ -143,10 +145,10 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill)
>  {
> -	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
> +	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
>  		  "%s called more than once on %pf!", __func__, ref->release);
>  
> -	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
> +	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
>  	ref->confirm_kill = confirm_kill;
>  
>  	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
> @@ -166,24 +168,24 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
>   */
>  void percpu_ref_reinit(struct percpu_ref *ref)
>  {
> -	unsigned long __percpu *pcpu_count = pcpu_count_ptr(ref);
> +	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
>  	int cpu;
>  
> -	BUG_ON(!pcpu_count);
> +	BUG_ON(!percpu_count);
>  	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
>  
> -	atomic_long_set(&ref->count, 1 + PCPU_COUNT_BIAS);
> +	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
>  
>  	/*
>  	 * Restore per-cpu operation.  smp_store_release() is paired with
> -	 * smp_read_barrier_depends() in __pcpu_ref_alive() and guarantees
> -	 * that the zeroing is visible to all percpu accesses which can see
> -	 * the following PCPU_REF_DEAD clearing.
> +	 * smp_read_barrier_depends() in __percpu_ref_alive() and
> +	 * guarantees that the zeroing is visible to all percpu accesses
> +	 * which can see the following __PERCPU_REF_DEAD clearing.
>  	 */
>  	for_each_possible_cpu(cpu)
> -		*per_cpu_ptr(pcpu_count, cpu) = 0;
> +		*per_cpu_ptr(percpu_count, cpu) = 0;
>  
> -	smp_store_release(&ref->pcpu_count_ptr,
> -			  ref->pcpu_count_ptr & ~PCPU_REF_DEAD);
> +	smp_store_release(&ref->percpu_count_ptr,
> +			  ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch
  2014-09-23  5:55 ` [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch Tejun Heo
@ 2014-09-23 21:11   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:13AM -0400, Tejun Heo wrote:
> percpu_ref will be restructured so that percpu/atomic mode switching
> and reference killing are dedoupled.  In preparation, do the following
> renames.
> 
> * percpu_ref->confirm_kill	-> percpu_ref->confirm_switch
> * __PERCPU_REF_DEAD		-> __PERCPU_REF_ATOMIC
> * __percpu_ref_alive()		-> __ref_is_percpu()
> 
> This patch is pure rename and doesn't introduce any functional
> changes.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h | 25 ++++++++++++++-----------
>  lib/percpu-refcount.c           | 22 +++++++++++-----------
>  2 files changed, 25 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 3d463a3..910e5f7 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -54,6 +54,11 @@
>  struct percpu_ref;
>  typedef void (percpu_ref_func_t)(struct percpu_ref *);
>  
> +/* flags set in the lower bits of percpu_ref->percpu_count_ptr */
> +enum {
> +	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
> +};
> +
>  struct percpu_ref {
>  	atomic_long_t		count;
>  	/*
> @@ -62,7 +67,7 @@ struct percpu_ref {
>  	 */
>  	unsigned long		percpu_count_ptr;
>  	percpu_ref_func_t	*release;
> -	percpu_ref_func_t	*confirm_kill;
> +	percpu_ref_func_t	*confirm_switch;
>  	struct rcu_head		rcu;
>  };
>  
> @@ -88,23 +93,21 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
>  	return percpu_ref_kill_and_confirm(ref, NULL);
>  }
>  
> -#define __PERCPU_REF_DEAD	1
> -
>  /*
>   * Internal helper.  Don't use outside percpu-refcount proper.  The
>   * function doesn't return the pointer and let the caller test it for NULL
>   * because doing so forces the compiler to generate two conditional
>   * branches as it can't assume that @ref->percpu_count is not NULL.
>   */
> -static inline bool __percpu_ref_alive(struct percpu_ref *ref,
> -				      unsigned long __percpu **percpu_countp)
> +static inline bool __ref_is_percpu(struct percpu_ref *ref,
> +					  unsigned long __percpu **percpu_countp)
>  {
>  	unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
>  
>  	/* paired with smp_store_release() in percpu_ref_reinit() */
>  	smp_read_barrier_depends();
>  
> -	if (unlikely(percpu_ptr & __PERCPU_REF_DEAD))
> +	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
>  		return false;
>  
>  	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
> @@ -125,7 +128,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
>  
>  	rcu_read_lock_sched();
>  
> -	if (__percpu_ref_alive(ref, &percpu_count))
> +	if (__ref_is_percpu(ref, &percpu_count))
>  		this_cpu_inc(*percpu_count);
>  	else
>  		atomic_long_inc(&ref->count);
> @@ -149,7 +152,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>  
>  	rcu_read_lock_sched();
>  
> -	if (__percpu_ref_alive(ref, &percpu_count)) {
> +	if (__ref_is_percpu(ref, &percpu_count)) {
>  		this_cpu_inc(*percpu_count);
>  		ret = true;
>  	} else {
> @@ -183,7 +186,7 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>  
>  	rcu_read_lock_sched();
>  
> -	if (__percpu_ref_alive(ref, &percpu_count)) {
> +	if (__ref_is_percpu(ref, &percpu_count)) {
>  		this_cpu_inc(*percpu_count);
>  		ret = true;
>  	}
> @@ -208,7 +211,7 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
>  
>  	rcu_read_lock_sched();
>  
> -	if (__percpu_ref_alive(ref, &percpu_count))
> +	if (__ref_is_percpu(ref, &percpu_count))
>  		this_cpu_dec(*percpu_count);
>  	else if (unlikely(atomic_long_dec_and_test(&ref->count)))
>  		ref->release(ref);
> @@ -228,7 +231,7 @@ static inline bool percpu_ref_is_zero(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *percpu_count;
>  
> -	if (__percpu_ref_alive(ref, &percpu_count))
> +	if (__ref_is_percpu(ref, &percpu_count))
>  		return false;
>  	return !atomic_long_read(&ref->count);
>  }
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 5aea6b7..7aef590 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -34,7 +34,7 @@
>  static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>  {
>  	return (unsigned long __percpu *)
> -		(ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
> +		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
>  }
>  
>  /**
> @@ -80,7 +80,7 @@ void percpu_ref_exit(struct percpu_ref *ref)
>  
>  	if (percpu_count) {
>  		free_percpu(percpu_count);
> -		ref->percpu_count_ptr = __PERCPU_REF_DEAD;
> +		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_exit);
> @@ -117,8 +117,8 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  		  ref->release, atomic_long_read(&ref->count));
>  
>  	/* @ref is viewed as dead on all CPUs, send out kill confirmation */
> -	if (ref->confirm_kill)
> -		ref->confirm_kill(ref);
> +	if (ref->confirm_switch)
> +		ref->confirm_switch(ref);
>  
>  	/*
>  	 * Now we're in single atomic_long_t mode with a consistent
> @@ -145,11 +145,11 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill)
>  {
> -	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
> +	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC,
>  		  "%s called more than once on %pf!", __func__, ref->release);
>  
> -	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
> -	ref->confirm_kill = confirm_kill;
> +	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
> +	ref->confirm_switch = confirm_kill;
>  
>  	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
>  }
> @@ -178,14 +178,14 @@ void percpu_ref_reinit(struct percpu_ref *ref)
>  
>  	/*
>  	 * Restore per-cpu operation.  smp_store_release() is paired with
> -	 * smp_read_barrier_depends() in __percpu_ref_alive() and
> -	 * guarantees that the zeroing is visible to all percpu accesses
> -	 * which can see the following __PERCPU_REF_DEAD clearing.
> +	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
> +	 * that the zeroing is visible to all percpu accesses which can see
> +	 * the following __PERCPU_REF_ATOMIC clearing.
>  	 */
>  	for_each_possible_cpu(cpu)
>  		*per_cpu_ptr(percpu_count, cpu) = 0;
>  
>  	smp_store_release(&ref->percpu_count_ptr,
> -			  ref->percpu_count_ptr & ~__PERCPU_REF_DEAD);
> +			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing
  2014-09-23  5:55 ` [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing Tejun Heo
@ 2014-09-23 21:13   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:15AM -0400, Tejun Heo wrote:
> percpu_ref has treated the dropping of the base reference and
> switching to atomic mode as an integral operation; however, there's
> nothing inherent tying the two together.
> 
> The use cases for percpu_ref have been expanding continuously.  While
> the current init/kill/reinit/exit model can cover a lot, the coupling
> of kill/reinit with atomic/percpu mode switching is turning out to be
> too restrictive for use cases where many percpu_refs are created and
> destroyed back-to-back with only some of them reaching extended
> operation.  The coupling also makes implementing always-atomic debug
> mode difficult.
> 
> This patch separates out atomic mode switching into
> percpu_ref_switch_to_atomic() and reimplements
> percpu_ref_kill_and_confirm() on top of it.
> 
> * The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
>   differentiated.  Among get/put operations, percpu_ref_tryget_live()
>   is the only one which cares about DEAD.
> 
> * percpu_ref_switch_to_atomic() can be called multiple times on the
>   same ref.  This means that multiple @confirm_switch may get queued
>   up, which we can't do reliably without an extra memory area.  This is
>   handled by making the later invocation synchronously wait for the
>   completion of the previous one.  This isn't particularly desirable
>   but such synchronous waits shouldn't happen in most cases.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |   8 ++-
>  lib/percpu-refcount.c           | 141 +++++++++++++++++++++++++++++++---------
>  2 files changed, 116 insertions(+), 33 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 24cf157..03a02e9 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -76,9 +76,11 @@ struct percpu_ref {
>  int __must_check percpu_ref_init(struct percpu_ref *ref,
>  				 percpu_ref_func_t *release, gfp_t gfp);
>  void percpu_ref_exit(struct percpu_ref *ref);
> +void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
> +				 percpu_ref_func_t *confirm_switch);
> +void percpu_ref_reinit(struct percpu_ref *ref);
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill);
> -void percpu_ref_reinit(struct percpu_ref *ref);
>  
>  /**
>   * percpu_ref_kill - drop the initial ref
> @@ -109,7 +111,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
>  	/* paired with smp_store_release() in percpu_ref_reinit() */
>  	smp_read_barrier_depends();
>  
> -	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
> +	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
>  		return false;
>  
>  	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
> @@ -191,6 +193,8 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>  	if (__ref_is_percpu(ref, &percpu_count)) {
>  		this_cpu_inc(*percpu_count);
>  		ret = true;
> +	} else if (!(ACCESS_ONCE(ref->percpu_count_ptr) & __PERCPU_REF_DEAD)) {
> +		ret = atomic_long_inc_not_zero(&ref->count);
>  	}
>  
>  	rcu_read_unlock_sched();
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index b0b8c09..56a7c0d 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -1,6 +1,8 @@
>  #define pr_fmt(fmt) "%s: " fmt "\n", __func__
>  
>  #include <linux/kernel.h>
> +#include <linux/sched.h>
> +#include <linux/wait.h>
>  #include <linux/percpu-refcount.h>
>  
>  /*
> @@ -31,6 +33,8 @@
>  
>  #define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))
>  
> +static DECLARE_WAIT_QUEUE_HEAD(percpu_ref_switch_waitq);
> +
>  static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>  {
>  	return (unsigned long __percpu *)
> @@ -88,7 +92,19 @@ void percpu_ref_exit(struct percpu_ref *ref)
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_exit);
>  
> -static void percpu_ref_kill_rcu(struct rcu_head *rcu)
> +static void percpu_ref_call_confirm_rcu(struct rcu_head *rcu)
> +{
> +	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
> +
> +	ref->confirm_switch(ref);
> +	ref->confirm_switch = NULL;
> +	wake_up_all(&percpu_ref_switch_waitq);
> +
> +	/* drop ref from percpu_ref_switch_to_atomic() */
> +	percpu_ref_put(ref);
> +}
> +
> +static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
>  {
>  	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
>  	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
> @@ -116,47 +132,79 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);
>  
>  	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
> -		  "percpu ref (%pf) <= 0 (%ld) after killed",
> +		  "percpu ref (%pf) <= 0 (%ld) after switching to atomic",
>  		  ref->release, atomic_long_read(&ref->count));
>  
> -	/* @ref is viewed as dead on all CPUs, send out kill confirmation */
> -	if (ref->confirm_switch)
> -		ref->confirm_switch(ref);
> +	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
> +	percpu_ref_call_confirm_rcu(rcu);
> +}
>  
> -	/*
> -	 * Now we're in single atomic_long_t mode with a consistent
> -	 * refcount, so it's safe to drop our initial ref:
> -	 */
> -	percpu_ref_put(ref);
> +static void percpu_ref_noop_confirm_switch(struct percpu_ref *ref)
> +{
> +}
> +
> +static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
> +					  percpu_ref_func_t *confirm_switch)
> +{
> +	if (!(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC)) {
> +		/* switching from percpu to atomic */
> +		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
> +
> +		/*
> +		 * Non-NULL ->confirm_switch is used to indicate that
> +		 * switching is in progress.  Use noop one if unspecified.
> +		 */
> +		WARN_ON_ONCE(ref->confirm_switch);
> +		ref->confirm_switch =
> +			confirm_switch ?: percpu_ref_noop_confirm_switch;
> +
> +		percpu_ref_get(ref);	/* put after confirmation */
> +		call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu);
> +	} else if (confirm_switch) {
> +		/*
> +		 * Somebody already set ATOMIC.  Switching may still be in
> +		 * progress.  @confirm_switch must be invoked after the
> +		 * switching is complete and a full sched RCU grace period
> +		 * has passed.  Wait synchronously for the previous
> +		 * switching and schedule @confirm_switch invocation.
> +		 */
> +		wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);
> +		ref->confirm_switch = confirm_switch;
> +
> +		percpu_ref_get(ref);	/* put after confirmation */
> +		call_rcu_sched(&ref->rcu, percpu_ref_call_confirm_rcu);
> +	}
>  }
>  
>  /**
> - * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
> - * @ref: percpu_ref to kill
> - * @confirm_kill: optional confirmation callback
> + * percpu_ref_switch_to_atomic - switch a percpu_ref to atomic mode
> + * @ref: percpu_ref to switch to atomic mode
> + * @confirm_switch: optional confirmation callback
>   *
> - * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
> - * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
> - * called after @ref is seen as dead from all CPUs - all further
> - * invocations of percpu_ref_tryget_live() will fail.  See
> - * percpu_ref_tryget_live() for more details.
> + * There's no reason to use this function for the usual reference counting.
> + * Use percpu_ref_kill[_and_confirm]().
> + *
> + * Schedule switching of @ref to atomic mode.  All its percpu counts will
> + * be collected to the main atomic counter.  On completion, when all CPUs
> + * are guaranteed to be in atomic mode, @confirm_switch, which may not
> + * block, is invoked.  This function may be invoked concurrently with all
> + * the get/put operations and can safely be mixed with kill and reinit
> + * operations.
>   *
> - * Due to the way percpu_ref is implemented, @confirm_kill will be called
> - * after at least one full RCU grace period has passed but this is an
> - * implementation detail and callers must not depend on it.
> + * This function normally doesn't block and can be called from any context
> + * but it may block if @confirm_kill is specified and @ref is already in
> + * the process of switching to atomic mode.  In such cases, @confirm_switch
> + * will be invoked after the switching is complete.
> + *
> + * Due to the way percpu_ref is implemented, @confirm_switch will be called
> + * after at least one full sched RCU grace period has passed but this is an
> + * implementation detail and must not be depended upon.
>   */
> -void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
> -				 percpu_ref_func_t *confirm_kill)
> +void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
> +				 percpu_ref_func_t *confirm_switch)
>  {
> -	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC_DEAD,
> -		  "%s called more than once on %pf!", __func__, ref->release);
> -
> -	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC_DEAD;
> -	ref->confirm_switch = confirm_kill;
> -
> -	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
> +	__percpu_ref_switch_to_atomic(ref, confirm_switch);
>  }
> -EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
>  
>  /**
>   * percpu_ref_reinit - re-initialize a percpu refcount
> @@ -192,3 +240,34 @@ void percpu_ref_reinit(struct percpu_ref *ref)
>  			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> +
> +/**
> + * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
> + * @ref: percpu_ref to kill
> + * @confirm_kill: optional confirmation callback
> + *
> + * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
> + * @confirm_kill is not NULL.  @confirm_kill, which may not block, will be
> + * called after @ref is seen as dead from all CPUs at which point all
> + * further invocations of percpu_ref_tryget_live() will fail.  See
> + * percpu_ref_tryget_live() for details.
> + *
> + * This function normally doesn't block and can be called from any context
> + * but it may block if @confirm_kill is specified and @ref is already in
> + * the process of switching to atomic mode by percpu_ref_switch_atomic().
> + *
> + * Due to the way percpu_ref is implemented, @confirm_switch will be called
> + * after at least one full sched RCU grace period has passed but this is an
> + * implementation detail and must not be depended upon.
> + */
> +void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
> +				 percpu_ref_func_t *confirm_kill)
> +{
> +	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
> +		  "%s called more than once on %pf!", __func__, ref->release);
> +
> +	ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
> +	__percpu_ref_switch_to_atomic(ref, confirm_kill);
> +	percpu_ref_put(ref);
> +}
> +EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
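
A rough sketch of how the percpu_ref_switch_to_atomic() interface added
above might be used by a caller that needs to know when every CPU has
switched over.  The 'my_dev' structure, its helpers and the
completion-based waiting are hypothetical; only the percpu_ref calls come
from this series.

#include <linux/kernel.h>
#include <linux/completion.h>
#include <linux/percpu-refcount.h>

struct my_dev {
        struct percpu_ref       ref;
        struct completion       switch_done;
};

/* Runs from sched RCU callback context and therefore must not block. */
static void my_dev_atomic_confirmed(struct percpu_ref *ref)
{
        struct my_dev *dev = container_of(ref, struct my_dev, ref);

        complete(&dev->switch_done);
}

static void my_dev_quiesce(struct my_dev *dev)
{
        init_completion(&dev->switch_done);

        /* Fold the percpu counters into the single atomic counter ... */
        percpu_ref_switch_to_atomic(&dev->ref, my_dev_atomic_confirmed);

        /* ... and wait until all CPUs are guaranteed to use atomic ops. */
        wait_for_completion(&dev->switch_done);
}

percpu_ref_kill_and_confirm() is now essentially this pattern plus setting
__PERCPU_REF_DEAD and dropping the base reference.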

* Re: [PATCH v2 5/9] percpu_ref: add PCPU_REF_DEAD
  2014-09-23 13:48   ` [PATCH v2 " Tejun Heo
@ 2014-09-23 21:14     ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 09:48:51AM -0400, Tejun Heo wrote:
> From 6f5bdc32c66317416c13eedb68ead2b36fb02603 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Tue, 23 Sep 2014 09:45:56 -0400
> 
> percpu_ref will be restructured so that percpu/atomic mode switching
> and reference killing are decoupled.  In preparation, add
> PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD which is OR of ATOMIC and DEAD.
> For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
> uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
> behavior changes.
> 
> percpu_ref_init() now specifies an explicit alignment when allocating
> the percpu counters so that the pointer has enough unused low bits to
> accommodate the flags.  Note that one flag was fine because the minimum
> alignment of percpu memory is 2 bytes, but two flags are already too many
> for the natural alignment of unsigned long on archs like cris and m68k.
> 
> v2: The original patch had BUILD_BUG_ON() which triggers if unsigned
>     long's alignment isn't enough to accommodate the flags, which
>     triggered on cris and m68k.  percpu_ref_init() updated to specify
>     the required alignment explicitly.  Reported by Fengguang.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: kbuild test robot <fengguang.wu@intel.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |  6 +++++-
>  lib/percpu-refcount.c           | 19 +++++++++++--------
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 910e5f7..bd9483d 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -57,6 +57,10 @@ typedef void (percpu_ref_func_t)(struct percpu_ref *);
>  /* flags set in the lower bits of percpu_ref->percpu_count_ptr */
>  enum {
>  	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
> +	__PERCPU_REF_DEAD	= 1LU << 1,	/* (being) killed */
> +	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,
> +
> +	__PERCPU_REF_FLAG_BITS	= 2,
>  };
>  
>  struct percpu_ref {
> @@ -107,7 +111,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
>  	/* paired with smp_store_release() in percpu_ref_reinit() */
>  	smp_read_barrier_depends();
>  
> -	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
> +	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
>  		return false;
>  
>  	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 7aef590..e2ff19f 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -34,7 +34,7 @@
>  static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>  {
>  	return (unsigned long __percpu *)
> -		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
> +		(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
>  }
>  
>  /**
> @@ -52,10 +52,13 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>  int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
>  		    gfp_t gfp)
>  {
> +	size_t align = max_t(size_t, 1 << __PERCPU_REF_FLAG_BITS,
> +			     __alignof__(unsigned long));
> +
>  	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
>  
> -	ref->percpu_count_ptr =
> -		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
> +	ref->percpu_count_ptr = (unsigned long)
> +		__alloc_percpu_gfp(sizeof(unsigned long), align, gfp);
>  	if (!ref->percpu_count_ptr)
>  		return -ENOMEM;
>  
> @@ -80,7 +83,7 @@ void percpu_ref_exit(struct percpu_ref *ref)
>  
>  	if (percpu_count) {
>  		free_percpu(percpu_count);
> -		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC;
> +		ref->percpu_count_ptr = __PERCPU_REF_ATOMIC_DEAD;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_exit);
> @@ -145,10 +148,10 @@ static void percpu_ref_kill_rcu(struct rcu_head *rcu)
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill)
>  {
> -	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC,
> +	WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC_DEAD,
>  		  "%s called more than once on %pf!", __func__, ref->release);
>  
> -	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
> +	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC_DEAD;
>  	ref->confirm_switch = confirm_kill;
>  
>  	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
> @@ -180,12 +183,12 @@ void percpu_ref_reinit(struct percpu_ref *ref)
>  	 * Restore per-cpu operation.  smp_store_release() is paired with
>  	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
>  	 * that the zeroing is visible to all percpu accesses which can see
> -	 * the following __PERCPU_REF_ATOMIC clearing.
> +	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
>  	 */
>  	for_each_possible_cpu(cpu)
>  		*per_cpu_ptr(percpu_count, cpu) = 0;
>  
>  	smp_store_release(&ref->percpu_count_ptr,
> -			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
> +			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
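
An illustrative sketch (not part of the patch; the macro and helper names
below are made up) of why the two flag bits force an explicit allocation
alignment:

#include <linux/percpu.h>

#define FLAG_BITS       2                               /* __PERCPU_REF_FLAG_BITS */
#define FLAG_MASK       ((1UL << FLAG_BITS) - 1)        /* ATOMIC | DEAD */

/* The percpu pointer and the flags share one unsigned long ... */
static unsigned long __percpu *ptr_only(unsigned long percpu_count_ptr)
{
        /* ... so the pointer is recovered by masking the flag bits off. */
        return (unsigned long __percpu *)(percpu_count_ptr & ~FLAG_MASK);
}

/*
 * This is only safe if the allocation is aligned to at least
 * 1 << FLAG_BITS == 4 bytes.  The natural alignment of unsigned long can
 * be as small as 2 on cris and m68k, hence the explicit alignment passed
 * to __alloc_percpu_gfp() above instead of a BUILD_BUG_ON().
 */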

* Re: [PATCH v2 7/9] percpu_ref: decouple switching to percpu mode and reinit
  2014-09-23 13:49   ` [PATCH v2 " Tejun Heo
@ 2014-09-23 21:15     ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 09:49:19AM -0400, Tejun Heo wrote:
> From c31543be3b12bd3cb3f8c037cefb89504d294f82 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Tue, 23 Sep 2014 09:45:56 -0400
> 
> percpu_ref has treated the dropping of the base reference and
> switching to atomic mode as an integral operation; however, there's
> nothing inherent tying the two together.
> 
> The use cases for percpu_ref have been expanding continuously.  While
> the current init/kill/reinit/exit model can cover a lot, the coupling
> of kill/reinit with atomic/percpu mode switching is turning out to be
> too restrictive for use cases where many percpu_refs are created and
> destroyed back-to-back with only some of them reaching extended
> operation.  The coupling also makes implementing always-atomic debug
> mode difficult.
> 
> This patch separates out percpu mode switching into
> percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
> top of it.
> 
> * DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
>   mode w/o going through reinit.
> 
> v2: __percpu_ref_switch_to_percpu() was missing static.  Fixed.
>     Reported by Fengguang aka kbuild test robot.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: kbuild test robot <fengguang.wu@intel.com>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |  3 +-
>  lib/percpu-refcount.c           | 73 ++++++++++++++++++++++++++++++-----------
>  2 files changed, 56 insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index d1252e1..cd7e20f 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -80,9 +80,10 @@ int __must_check percpu_ref_init(struct percpu_ref *ref,
>  void percpu_ref_exit(struct percpu_ref *ref);
>  void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_switch);
> -void percpu_ref_reinit(struct percpu_ref *ref);
> +void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
>  void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_kill);
> +void percpu_ref_reinit(struct percpu_ref *ref);
>  
>  /**
>   * percpu_ref_kill - drop the initial ref
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 6e0d143..5a6d43b 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -206,40 +206,54 @@ void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>  	__percpu_ref_switch_to_atomic(ref, confirm_switch);
>  }
>  
> -/**
> - * percpu_ref_reinit - re-initialize a percpu refcount
> - * @ref: perpcu_ref to re-initialize
> - *
> - * Re-initialize @ref so that it's in the same state as when it finished
> - * percpu_ref_init().  @ref must have been initialized successfully, killed
> - * and reached 0 but not exited.
> - *
> - * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> - * this function is in progress.
> - */
> -void percpu_ref_reinit(struct percpu_ref *ref)
> +static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
>  	int cpu;
>  
>  	BUG_ON(!percpu_count);
> -	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
>  
> -	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
> +	if (!(ref->percpu_count_ptr & __PERCPU_REF_ATOMIC))
> +		return;
> +
> +	wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);
> +
> +	atomic_long_add(PERCPU_COUNT_BIAS, &ref->count);
>  
>  	/*
>  	 * Restore per-cpu operation.  smp_store_release() is paired with
>  	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
>  	 * that the zeroing is visible to all percpu accesses which can see
> -	 * the following __PERCPU_REF_ATOMIC_DEAD clearing.
> +	 * the following __PERCPU_REF_ATOMIC clearing.
>  	 */
>  	for_each_possible_cpu(cpu)
>  		*per_cpu_ptr(percpu_count, cpu) = 0;
>  
>  	smp_store_release(&ref->percpu_count_ptr,
> -			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
> +			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
> +}
> +
> +/**
> + * percpu_ref_switch_to_percpu - switch a percpu_ref to percpu mode
> + * @ref: percpu_ref to switch to percpu mode
> + *
> + * There's no reason to use this function for the usual reference counting.
> + * To re-use an expired ref, use percpu_ref_reinit().
> + *
> + * Switch @ref to percpu mode.  This function may be invoked concurrently
> + * with all the get/put operations and can safely be mixed with kill and
> + * reinit operations.
> + *
> + * This function normally doesn't block and can be called from any context
> + * but it may block if @ref is in the process of switching to atomic mode
> + * by percpu_ref_switch_atomic().
> + */
> +void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
> +{
> +	/* a dying or dead ref can't be switched to percpu mode w/o reinit */
> +	if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD))
> +		__percpu_ref_switch_to_percpu(ref);
>  }
> -EXPORT_SYMBOL_GPL(percpu_ref_reinit);
>  
>  /**
>   * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
> @@ -253,8 +267,8 @@ EXPORT_SYMBOL_GPL(percpu_ref_reinit);
>   * percpu_ref_tryget_live() for details.
>   *
>   * This function normally doesn't block and can be called from any context
> - * but it may block if @confirm_kill is specified and @ref is already in
> - * the process of switching to atomic mode by percpu_ref_switch_atomic().
> + * but it may block if @confirm_kill is specified and @ref is in the
> + * process of switching to atomic mode by percpu_ref_switch_atomic().
>   *
>   * Due to the way percpu_ref is implemented, @confirm_switch will be called
>   * after at least one full sched RCU grace period has passed but this is an
> @@ -271,3 +285,24 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  	percpu_ref_put(ref);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
> +
> +/**
> + * percpu_ref_reinit - re-initialize a percpu refcount
> + * @ref: perpcu_ref to re-initialize
> + *
> + * Re-initialize @ref so that it's in the same state as when it finished
> + * percpu_ref_init().  @ref must have been initialized successfully and
> + * reached 0 but not exited.
> + *
> + * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
> + * this function is in progress.
> + */
> +void percpu_ref_reinit(struct percpu_ref *ref)
> +{
> +	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
> +
> +	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
> +	percpu_ref_get(ref);
> +	__percpu_ref_switch_to_percpu(ref);
> +}
> +EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
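
A minimal sketch of the resulting pairing, assuming @ref is a live
percpu_ref; the function and the maintenance step are hypothetical, and
only the two switch calls come from this series.

#include <linux/percpu-refcount.h>

static void do_slow_maintenance(struct percpu_ref *ref)
{
        /*
         * Collapse the counters into the single atomic counter.  Pass a
         * confirm_switch callback (or wait on one, as in the sketch after
         * patch 6) if the caller must know when the switch has completed
         * on all CPUs.
         */
        percpu_ref_switch_to_atomic(ref, NULL);

        /* ... work during which percpu get/put performance doesn't matter ... */

        /* Restore cheap percpu operation; no-op while @ref is dying or dead. */
        percpu_ref_switch_to_percpu(ref);
}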

* Re: [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags
  2014-09-23  5:55 ` [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags Tejun Heo
@ 2014-09-23 21:16   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:16 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:17AM -0400, Tejun Heo wrote:
> With the recent addition of percpu_ref_reinit(), percpu_ref now can be
> used as a persistent switch which can be turned on and off repeatedly
> where turning off maps to killing the ref and waiting for it to drain;
> however, there currently isn't a way to initialize a percpu_ref in its
> off (killed and drained) state, which can be inconvenient for certain
> persistent switch use cases.
> 
> Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
> selection of operation mode; however, currently a newly initialized
> percpu_ref is always in percpu mode making it impossible to avoid the
> latency overhead of switching to atomic mode.
> 
> This patch adds @flags to percpu_ref_init() and implements the
> following flags.
> 
> * PERCPU_REF_INIT_ATOMIC	: start ref in atomic mode
> * PERCPU_REF_INIT_DEAD		: start ref killed and drained
> 
> These flags should be able to serve the above two use cases.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  block/blk-mq.c                  |  2 +-
>  fs/aio.c                        |  4 ++--
>  include/linux/percpu-refcount.h | 18 +++++++++++++++++-
>  kernel/cgroup.c                 |  7 ++++---
>  lib/percpu-refcount.c           | 24 +++++++++++++++++++-----
>  5 files changed, 43 insertions(+), 12 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 702df07..3f6e6f5 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1777,7 +1777,7 @@ struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *set)
>  		goto err_hctxs;
>  
>  	if (percpu_ref_init(&q->mq_usage_counter, blk_mq_usage_counter_release,
> -			    GFP_KERNEL))
> +			    0, GFP_KERNEL))
>  		goto err_map;
>  
>  	setup_timer(&q->timeout, blk_mq_rq_timer, (unsigned long) q);
> diff --git a/fs/aio.c b/fs/aio.c
> index 93fbcc0f..9b6d5d6 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -666,10 +666,10 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
>  
>  	INIT_LIST_HEAD(&ctx->active_reqs);
>  
> -	if (percpu_ref_init(&ctx->users, free_ioctx_users, GFP_KERNEL))
> +	if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL))
>  		goto err;
>  
> -	if (percpu_ref_init(&ctx->reqs, free_ioctx_reqs, GFP_KERNEL))
> +	if (percpu_ref_init(&ctx->reqs, free_ioctx_reqs, 0, GFP_KERNEL))
>  		goto err;
>  
>  	ctx->cpu = alloc_percpu(struct kioctx_cpu);
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index e41ca20..5f84bf0 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -61,6 +61,21 @@ enum {
>  	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,
>  };
>  
> +/* @flags for percpu_ref_init() */
> +enum {
> +	/*
> +	 * Start w/ ref == 1 in atomic mode.  Can be switched to percpu
> +	 * operation using percpu_ref_switch_to_percpu().
> +	 */
> +	PERCPU_REF_INIT_ATOMIC	= 1 << 0,
> +
> +	/*
> +	 * Start dead w/ ref == 0 in atomic mode.  Must be revived with
> +	 * percpu_ref_reinit() before used.  Implies INIT_ATOMIC.
> +	 */
> +	PERCPU_REF_INIT_DEAD	= 1 << 1,
> +};
> +
>  struct percpu_ref {
>  	atomic_long_t		count;
>  	/*
> @@ -74,7 +89,8 @@ struct percpu_ref {
>  };
>  
>  int __must_check percpu_ref_init(struct percpu_ref *ref,
> -				 percpu_ref_func_t *release, gfp_t gfp);
> +				 percpu_ref_func_t *release, unsigned int flags,
> +				 gfp_t gfp);
>  void percpu_ref_exit(struct percpu_ref *ref);
>  void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_switch);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 589b4d8..e2fbcc1 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1628,7 +1628,8 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned int ss_mask)
>  		goto out;
>  	root_cgrp->id = ret;
>  
> -	ret = percpu_ref_init(&root_cgrp->self.refcnt, css_release, GFP_KERNEL);
> +	ret = percpu_ref_init(&root_cgrp->self.refcnt, css_release, 0,
> +			      GFP_KERNEL);
>  	if (ret)
>  		goto out;
>  
> @@ -4487,7 +4488,7 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
>  
>  	init_and_link_css(css, ss, cgrp);
>  
> -	err = percpu_ref_init(&css->refcnt, css_release, GFP_KERNEL);
> +	err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
>  	if (err)
>  		goto err_free_css;
>  
> @@ -4555,7 +4556,7 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
>  		goto out_unlock;
>  	}
>  
> -	ret = percpu_ref_init(&cgrp->self.refcnt, css_release, GFP_KERNEL);
> +	ret = percpu_ref_init(&cgrp->self.refcnt, css_release, 0, GFP_KERNEL);
>  	if (ret)
>  		goto out_free_cgrp;
>  
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 548b19e..74ec33e 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -45,27 +45,41 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
>   * percpu_ref_init - initialize a percpu refcount
>   * @ref: percpu_ref to initialize
>   * @release: function which will be called when refcount hits 0
> + * @flags: PERCPU_REF_INIT_* flags
>   * @gfp: allocation mask to use
>   *
> - * Initializes the refcount in single atomic counter mode with a refcount of 1;
> - * analagous to atomic_long_set(ref, 1).
> + * Initializes @ref.  If @flags is zero, @ref starts in percpu mode with a
> + * refcount of 1; analogous to atomic_long_set(ref, 1).  See the
> + * definitions of PERCPU_REF_INIT_* flags for flag behaviors.
>   *
>   * Note that @release must not sleep - it may potentially be called from RCU
>   * callback context by percpu_ref_kill().
>   */
>  int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
> -		    gfp_t gfp)
> +		    unsigned int flags, gfp_t gfp)
>  {
> +	unsigned long start_count = 0;
> +
>  	BUILD_BUG_ON(__PERCPU_REF_ATOMIC_DEAD &
>  		     ~(__alignof__(unsigned long) - 1));
>  
> -	atomic_long_set(&ref->count, 1 + PERCPU_COUNT_BIAS);
> -
>  	ref->percpu_count_ptr =
>  		(unsigned long)alloc_percpu_gfp(unsigned long, gfp);
>  	if (!ref->percpu_count_ptr)
>  		return -ENOMEM;
>  
> +	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
> +		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
> +	else
> +		start_count += PERCPU_COUNT_BIAS;
> +
> +	if (flags & PERCPU_REF_INIT_DEAD)
> +		ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
> +	else
> +		start_count++;
> +
> +	atomic_long_set(&ref->count, start_count);
> +
>  	ref->release = release;
>  	return 0;
>  }
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
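
A sketch of how a caller that creates many short-lived refs might use the
new flags.  The 'probe_queue' structure and its helpers are hypothetical;
only percpu_ref_init() and the PERCPU_REF_INIT_* flags come from this
patch.

#include <linux/percpu-refcount.h>

struct probe_queue {
        struct percpu_ref       usage;
};

static void probe_queue_release(struct percpu_ref *ref)
{
        /* Last reference dropped; wake up whoever is draining the queue. */
}

static int probe_queue_setup(struct probe_queue *pq)
{
        /*
         * Start in atomic mode so that a short-lived ref never pays the
         * latency overhead of switching from percpu to atomic mode at
         * kill time.  PERCPU_REF_INIT_DEAD would additionally start it
         * killed and drained, to be revived later with
         * percpu_ref_reinit().
         */
        return percpu_ref_init(&pq->usage, probe_queue_release,
                               PERCPU_REF_INIT_ATOMIC, GFP_KERNEL);
}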

* Re: [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky
  2014-09-23  5:55 ` [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky Tejun Heo
@ 2014-09-23 21:17   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2014-09-23 21:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:18AM -0400, Tejun Heo wrote:
> Currently, a percpu_ref which is initialized with
> PERCPU_REF_INIT_ATOMIC or switched to atomic mode via
> switch_to_atomic() automatically reverts to percpu mode on the first
> percpu_ref_reinit().  This makes the atomic mode difficult to use for
> cases where a percpu_ref is used as a persistent on/off switch which
> may be cycled multiple times.
> 
> This patch makes such atomic state sticky so that it survives through
> kill/reinit cycles.  After this patch, atomic state is cleared only by
> an explicit percpu_ref_switch_to_percpu() call.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Kent Overstreet <kmo@daterainc.com>

> ---
>  include/linux/percpu-refcount.h |  5 ++++-
>  lib/percpu-refcount.c           | 20 +++++++++++++++-----
>  2 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 5f84bf0..8459d3a 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -65,7 +65,9 @@ enum {
>  enum {
>  	/*
>  	 * Start w/ ref == 1 in atomic mode.  Can be switched to percpu
> -	 * operation using percpu_ref_switch_to_percpu().
> +	 * operation using percpu_ref_switch_to_percpu().  If initialized
> +	 * with this flag, the ref will stay in atomic mode until
> +	 * percpu_ref_switch_to_percpu() is invoked on it.
>  	 */
>  	PERCPU_REF_INIT_ATOMIC	= 1 << 0,
>  
> @@ -85,6 +87,7 @@ struct percpu_ref {
>  	unsigned long		percpu_count_ptr;
>  	percpu_ref_func_t	*release;
>  	percpu_ref_func_t	*confirm_switch;
> +	bool			force_atomic:1;
>  	struct rcu_head		rcu;
>  };
>  
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index 74ec33e..c47e496 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -68,6 +68,8 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
>  	if (!ref->percpu_count_ptr)
>  		return -ENOMEM;
>  
> +	ref->force_atomic = flags & PERCPU_REF_INIT_ATOMIC;
> +
>  	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
>  		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
>  	else
> @@ -203,7 +205,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>   * are guaranteed to be in atomic mode, @confirm_switch, which may not
>   * block, is invoked.  This function may be invoked concurrently with all
>   * the get/put operations and can safely be mixed with kill and reinit
> - * operations.
> + * operations.  Note that @ref will stay in atomic mode across kill/reinit
> + * cycles until percpu_ref_switch_to_percpu() is called.
>   *
>   * This function normally doesn't block and can be called from any context
>   * but it may block if @confirm_kill is specified and @ref is already in
> @@ -217,6 +220,7 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>  void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
>  				 percpu_ref_func_t *confirm_switch)
>  {
> +	ref->force_atomic = true;
>  	__percpu_ref_switch_to_atomic(ref, confirm_switch);
>  }
>  
> @@ -256,7 +260,10 @@ void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
>   *
>   * Switch @ref to percpu mode.  This function may be invoked concurrently
>   * with all the get/put operations and can safely be mixed with kill and
> - * reinit operations.
> + * reinit operations.  This function reverses the sticky atomic state set
> + * by PERCPU_REF_INIT_ATOMIC or percpu_ref_switch_to_atomic().  If @ref is
> + * dying or dead, the actual switching takes place on the following
> + * percpu_ref_reinit().
>   *
>   * This function normally doesn't block and can be called from any context
>   * but it may block if @ref is in the process of switching to atomic mode
> @@ -264,6 +271,8 @@ void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
>   */
>  void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
>  {
> +	ref->force_atomic = false;
> +
>  	/* a dying or dead ref can't be switched to percpu mode w/o reinit */
>  	if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD))
>  		__percpu_ref_switch_to_percpu(ref);
> @@ -305,8 +314,8 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
>   * @ref: perpcu_ref to re-initialize
>   *
>   * Re-initialize @ref so that it's in the same state as when it finished
> - * percpu_ref_init().  @ref must have been initialized successfully and
> - * reached 0 but not exited.
> + * percpu_ref_init() ignoring %PERCPU_REF_INIT_DEAD.  @ref must have been
> + * initialized successfully and reached 0 but not exited.
>   *
>   * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
>   * this function is in progress.
> @@ -317,6 +326,7 @@ void percpu_ref_reinit(struct percpu_ref *ref)
>  
>  	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
>  	percpu_ref_get(ref);
> -	__percpu_ref_switch_to_percpu(ref);
> +	if (!ref->force_atomic)
> +		__percpu_ref_switch_to_percpu(ref);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_reinit);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
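
A sketch of the persistent on/off switch pattern that the sticky atomic
state enables.  The 'on_off_switch' wrapper and its helpers are
hypothetical; the percpu_ref calls and flags come from this series.

#include <linux/percpu-refcount.h>

struct on_off_switch {
        struct percpu_ref       ref;
};

static void sw_release(struct percpu_ref *ref)
{
        /* Count reached zero; the switch is now fully off. */
}

static int sw_setup(struct on_off_switch *sw)
{
        /* INIT_ATOMIC is now sticky: it survives kill/reinit cycles. */
        return percpu_ref_init(&sw->ref, sw_release,
                               PERCPU_REF_INIT_ATOMIC, GFP_KERNEL);
}

static void sw_turn_off(struct on_off_switch *sw)
{
        percpu_ref_kill(&sw->ref);
        /* The caller waits for sw_release() before cycling the switch. */
}

static void sw_turn_on(struct on_off_switch *sw)
{
        /* Revives the ref; after this patch it stays in atomic mode. */
        percpu_ref_reinit(&sw->ref);
}

static void sw_go_percpu(struct on_off_switch *sw)
{
        /* The only operation that clears the sticky atomic state. */
        percpu_ref_switch_to_percpu(&sw->ref);
}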

* Re: [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu()
  2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
                   ` (8 preceding siblings ...)
  2014-09-23  5:55 ` [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky Tejun Heo
@ 2014-09-24 17:32 ` Tejun Heo
  9 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2014-09-24 17:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: kmo, axboe, hch, hannes

On Tue, Sep 23, 2014 at 01:55:09AM -0400, Tejun Heo wrote:
>  0001-percpu_ref-relocate-percpu_ref_reinit.patch
>  0002-percpu_ref-minor-code-and-comment-updates.patch
>  0003-percpu_ref-replace-pcpu_-prefix-with-percpu_.patch
>  0004-percpu_ref-rename-things-to-prepare-for-decoupling-p.patch
>  0005-percpu_ref-add-PCPU_REF_DEAD.patch
>  0006-percpu_ref-decouple-switching-to-atomic-mode-and-kil.patch
>  0007-percpu_ref-decouple-switching-to-percpu-mode-and-rei.patch
>  0008-percpu_ref-add-PERCPU_REF_INIT_-flags.patch
>  0009-percpu_ref-make-INIT_ATOMIC-and-switch_to_atomic-sti.patch

Applied 1-9 to percpu/for-3.18.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-09-24 17:33 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-23  5:55 [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
2014-09-23  5:55 ` [PATCH 1/9] percpu_ref: relocate percpu_ref_reinit() Tejun Heo
2014-09-23 21:01   ` Kent Overstreet
2014-09-23 21:07   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 2/9] percpu_ref: minor code and comment updates Tejun Heo
2014-09-23 21:09   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 3/9] percpu_ref: replace pcpu_ prefix with percpu_ Tejun Heo
2014-09-23 21:10   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 4/9] percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch Tejun Heo
2014-09-23 21:11   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 5/9] percpu_ref: add PCPU_REF_DEAD Tejun Heo
2014-09-23 13:48   ` [PATCH v2 " Tejun Heo
2014-09-23 21:14     ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 6/9] percpu_ref: decouple switching to atomic mode and killing Tejun Heo
2014-09-23 21:13   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 7/9] percpu_ref: decouple switching to percpu mode and reinit Tejun Heo
2014-09-23 13:49   ` [PATCH v2 " Tejun Heo
2014-09-23 21:15     ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 8/9] percpu_ref: add PERCPU_REF_INIT_* flags Tejun Heo
2014-09-23 21:16   ` Kent Overstreet
2014-09-23  5:55 ` [PATCH 9/9] percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky Tejun Heo
2014-09-23 21:17   ` Kent Overstreet
2014-09-24 17:32 ` [PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu() Tejun Heo
