From: Rusty Russell <rusty@rustcorp.com.au>
To: Kent Overstreet <koverstreet@google.com>, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: akpm@linux-foundation.org, Kent Overstreet <koverstreet@google.com>, Zach Brown <zab@redhat.com>, Felipe Balbi <balbi@ti.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Mark Fasheh <mfasheh@suse.com>, Joel Becker <jlbec@evilplan.org>, Jens Axboe <axboe@kernel.dk>, Asai Thambi S P <asamymuthupa@micron.com>, Selvan Mani <smani@micron.com>, Sam Bradshaw <sbradshaw@micron.com>, Jeff Moyer <jmoyer@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>, Benjamin LaHaise <bcrl@kvack.org>, Tejun Heo <tj@kernel.org>, Oleg Nesterov <oleg@redhat.com>, Christoph Lameter <cl@linux-foundation.org>, Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH 04/21] Generic percpu refcounting
Date: Thu, 16 May 2013 09:56:19 +0930
Message-ID: <87y5bfzs5w.fsf@rustcorp.com.au>
In-Reply-To: <1368494338-7069-5-git-send-email-koverstreet@google.com>

Kent Overstreet <koverstreet@google.com> writes:
> This implements a refcount with similar semantics to
> atomic_get()/atomic_dec_and_test() - but percpu.

Ah! This is why I was CC'd... Now I understand. Thanks :)

Delighted to see someone chasing this. I had an implementation of
such a thing last decade, but the slowmode pattern didn't make for
trivial kref conversions, so I dropped it.

Note: I haven't read the other feedback yet, so ignore if dups.

> +int percpu_ref_init(struct percpu_ref *ref);

Why not just run in slow mode when allocation fails? Things which
can't fail make for simpler use.

> +int percpu_ref_tryget(struct percpu_ref *ref);
> +int percpu_ref_put_initial_ref(struct percpu_ref *ref);

This is part of a slightly different pattern: the owned refcount. In
fact, I think that's the most sane pattern to use (but I could be
wrong; does the AIO stuff fit?).
If so, promote this to a first-class citizen, and if necessary expose
kill as __percpu_ref_kill()? (I might suggest percpu_ref_owner_put()
as a name, in fact).

> +/**
> + * percpu_ref_get - increment a dynamic percpu refcount
> + *
> + * Analagous to atomic_inc().
> + */
> +static inline void percpu_ref_get(struct percpu_ref *ref)
> +{
> +	unsigned __percpu *pcpu_count;
> +
> +	preempt_disable();
> +
> +	pcpu_count = ACCESS_ONCE(ref->pcpu_count);
> +
> +	if (pcpu_count)
> +		__this_cpu_inc(*pcpu_count);
> +	else
> +		atomic_inc(&ref->count);
> +
> +	preempt_enable();
> +}

s/preempt_disable()/rcu_read_lock()/ ?

> +/**
> + * percpu_ref_put - decrement a dynamic percpu refcount
> + *
> + * Returns true if the result is 0, otherwise false; only checks for the ref
> + * hitting 0 after percpu_ref_kill() has been called. Analagous to
> + * atomic_dec_and_test().
> + */
> +static inline int percpu_ref_put(struct percpu_ref *ref)
> +{
> +	unsigned __percpu *pcpu_count;
> +	int ret = 0;
> +
> +	preempt_disable();
> +
> +	pcpu_count = ACCESS_ONCE(ref->pcpu_count);
> +
> +	if (pcpu_count)
> +		__this_cpu_dec(*pcpu_count);
> +	else
> +		ret = atomic_dec_and_test(&ref->count);
> +
> +	preempt_enable();
> +
> +	return ret;
> +}

Here too. And if you don't put unlikely() in this code, you lose
kernel hacker points :)

And int/true/false is for old-timers.

> +
> +unsigned percpu_ref_count(struct percpu_ref *ref);
> +int percpu_ref_kill(struct percpu_ref *ref);
> +
> +/**
> + * percpu_ref_dead - check if a dynamic percpu refcount is shutting down
> + *
> + * Returns true if percpu_ref_kill() has been called on @ref, false otherwise.
> + */
> +static inline int percpu_ref_dead(struct percpu_ref *ref)
> +{
> +	return ref->pcpu_count == NULL;
> +}

Can you unexpose these? I think percpu_ref_init(), ...get(), ...put()
and ...put_initial() are a nicer API.
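To make the unlikely() and bool remarks concrete, here is a minimal userspace sketch, not the patch itself. It assumes GCC or Clang (for __builtin_expect), models the per-CPU counters as a plain array indexed by an explicit cpu argument, and leaves out the preempt/RCU protection the real code needs. The names model_ref, model_ref_get() and model_ref_put() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely(); assumes GCC/Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for the model */

/* Hypothetical model of the refcount; not the kernel's struct percpu_ref. */
struct model_ref {
	atomic_int count;     /* shared counter, used once in slow mode */
	unsigned *pcpu_count; /* per-CPU slots; NULL means slow mode */
};

/* Take a reference: per-CPU slot on the fast path, shared atomic otherwise. */
static inline void model_ref_get(struct model_ref *ref, int cpu)
{
	unsigned *pcpu_count = ref->pcpu_count;

	if (unlikely(!pcpu_count))
		atomic_fetch_add(&ref->count, 1);
	else
		pcpu_count[cpu]++; /* not thread-safe here; model only */
}

/* Drop a reference; returns true only when the shared count reaches 0. */
static inline bool model_ref_put(struct model_ref *ref, int cpu)
{
	unsigned *pcpu_count = ref->pcpu_count;

	if (unlikely(!pcpu_count))
		return atomic_fetch_sub(&ref->count, 1) == 1;

	/* May wrap on this slot; only the sum over all slots matters. */
	pcpu_count[cpu]--;
	return false;
}
```

In this model a get on one CPU and a put on another leave the individual slots non-zero while their unsigned sum stays zero, which is why the fast path can never report that the count hit 0.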
> +int percpu_ref_kill(struct percpu_ref *ref)
> +{
> +	unsigned __percpu *pcpu_count;
> +	unsigned __percpu *old;
> +	unsigned count = 0;
> +	int cpu;
> +
> +	pcpu_count = ACCESS_ONCE(ref->pcpu_count);
> +
> +	do {
> +		if (!pcpu_count)
> +			return 0;
> +
> +		old = pcpu_count;
> +		pcpu_count = cmpxchg(&ref->pcpu_count, old, NULL);
> +	} while (pcpu_count != old);

This is more complex than it needs to be, no?

	pcpu_count = ACCESS_ONCE(ref->pcpu_count);
	if (!pcpu_count)
		return 0;
	if (cmpxchg(&ref->pcpu_count, pcpu_count, NULL) == NULL)
		return 0;

Of course, if all callers use the owner pattern, this is simply:

	pcpu_count = ACCESS_ONCE(ref->pcpu_count);
	BUG_ON(!pcpu_count);

> +	synchronize_sched();

synchronize_rcu() ?

> +	for_each_possible_cpu(cpu)
> +		count += *per_cpu_ptr(pcpu_count, cpu);
> +
> +	free_percpu(pcpu_count);
> +
> +	pr_debug("global %lli pcpu %i",
> +		 (int64_t) atomic_read(&ref->count), (int) count);
> +
> +	atomic_add((int) count - PCPU_COUNT_BIAS, &ref->count);
> +
> +	return 1;
> +}
> +
> +/**
> + * percpu_ref_put_initial_ref - safely drop the initial ref
> + *
> + * A percpu refcount needs a shutdown sequence before dropping the initial ref,
> + * to put it back into single atomic_t mode with the appropriate barriers so
> + * that percpu_ref_put() can safely check for it hitting 0 - this does so.
> + *
> + * Returns true if @ref hit 0.
> + */
> +int percpu_ref_put_initial_ref(struct percpu_ref *ref)
> +{
> +	if (percpu_ref_kill(ref)) {
> +		return percpu_ref_put(ref);
> +	} else {
> +		WARN_ON(1);
> +		return 0;
> +	}
> +}

Note that percpu_ref_restore_initial_ref() is also possible, and may
be useful for the module code... (or percpu_ref_owner_get).

Great stuff!
Rusty.
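
The owner pattern discussed above, together with an init that cannot fail, can be sketched roughly as follows. This is a single-threaded userspace model under stated assumptions, not the proposed kernel code: per-CPU counters are a calloc()ed array, assert() stands in for BUG_ON(), the grace-period wait is only a comment, and SKETCH_BIAS, sketch_ref and the sketch_ref_*() functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4           /* hypothetical CPU count */
#define SKETCH_BIAS    (1U << 31)  /* hypothetical bias; keeps count off 0 */

/* Hypothetical single-threaded model; the real thing needs atomics. */
struct sketch_ref {
	unsigned count;  /* shared count, authoritative in slow mode */
	unsigned *pcpu;  /* per-CPU slots; NULL means slow mode */
	bool dead;       /* set once the owner has dropped its ref */
};

/* Cannot fail: if the per-CPU allocation fails, start in slow mode. */
static void sketch_ref_init(struct sketch_ref *ref)
{
	ref->pcpu = calloc(SKETCH_NR_CPUS, sizeof(*ref->pcpu));
	ref->count = 1 + (ref->pcpu ? SKETCH_BIAS : 0); /* owner's ref */
	ref->dead = false;
}

static void sketch_ref_get(struct sketch_ref *ref, int cpu)
{
	if (ref->pcpu)
		ref->pcpu[cpu]++;
	else
		ref->count++;
}

/* Returns true only in slow mode, when the shared count reaches 0. */
static bool sketch_ref_put(struct sketch_ref *ref, int cpu)
{
	if (ref->pcpu) {
		ref->pcpu[cpu]--; /* may wrap; only the sum matters */
		return false;
	}
	return --ref->count == 0;
}

/* Owner drops the initial ref: fold per-CPU counts in, then put. */
static bool sketch_ref_owner_put(struct sketch_ref *ref)
{
	assert(!ref->dead); /* stands in for BUG_ON(): one owner put only */
	ref->dead = true;

	if (ref->pcpu) {
		unsigned *pcpu = ref->pcpu;
		unsigned sum = 0;
		int cpu;

		ref->pcpu = NULL;
		/* Real code would wait for a grace period here. */
		for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
			sum += pcpu[cpu];
		free(pcpu);
		/* Unsigned wraparound makes this 1 + outstanding refs. */
		ref->count += sum - SKETCH_BIAS;
	}
	return --ref->count == 0;
}
```

Because the owner holds one reference from init until sketch_ref_owner_put(), callers see the same behaviour whether or not the per-CPU allocation succeeded, which is the point of making init infallible.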